What is Multidimensional Scaling [MDS]?
• Anthony P.M. Coxon
• Emeritus Professor of Sociological Research Methods, University of Wales
• Honorary Professor, Cardiff University
• Honorary Professorial Research Fellow, University of Edinburgh
• Co-founder & Co-Director of MDS software packages:
  • MDSX [OS] (freeware) and
  • NewMDSX for Windows (not-for-profit)
• Website: www.newmdsx.com
• Course materials: http://apmc.newmdsx.com/
• (See my entry on multidimensional scaling in Lewis-Beck, M.S. et al, eds (2004) The Sage Encyclopaedia of Social Science Research Methods. London: Sage Publications.)
What is MDS? Prof APM Coxon, U Cardiff

ORIGINS / DEVELOPMENT OF MDS
• MDS (aka “Smallest Space Analysis”)
• Has its origins in psychometrics, 1920s to 1960s:
  • Scale construction and dimensionality reduction
• Underwent a major burst of development in the 1960s, due to the “non-metric revolution” (Coombs) and to computing developments that allowed iterative estimation
• Originally designed for analysis of a lower-triangle matrix (LTM) of dis/similarities data, taking a range of measures (not just product-moment (PM) correlations):
  • “anything which, by an act of faith, can be considered a similarity” (Shepard)
• Extended rapidly to deal with a wide range of other types of data:
  • Rectangular matrices; triads, pair-comparisons, free-sorting
  • “Stacks” of matrices (3-way scaling: INDSCAL)

CONSTRUCTING A MAP …
• Given a map, it’s easy to calculate the distances between the points …
• MDS operates the other way round:
  • Given the data [interpreted as quasi “distances”], it attempts to find the configuration [location of points] which generated the distances
• This is “Classic MDS”: developed in the 1930s, but imperfect, not robust, and it works only if the data are at ratio level.
• More recent MDS can work when only the ordinal information exists: “Non-metric” = ordinal MDS (Coombs / Kruskal “non-metric revolution”)
• What?? You can create an accurate map from only the rank order of the distances??? Yes! And it works!!

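The "other way round" step can be made concrete with a minimal sketch of classical (Torgerson-style) scaling. It is an illustration under stated assumptions, not the routine used by any package named in these slides: the function name `classical_mds`, the power-iteration eigen-solver and the four-point rectangle are all invented for the example, and the input is assumed to be (close to) true Euclidean distances.

```python
import math

def classical_mds(dist, n_dims=2, n_iter=500):
    """Recover coordinates from a symmetric matrix of (near-)Euclidean distances."""
    n = len(dist)
    # Double-centre the squared distances: B = -1/2 * J * D^2 * J.
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    coords = [[0.0] * n_dims for _ in range(n)]
    for k in range(n_dims):
        # Power iteration for the dominant remaining eigenpair of B.
        # The start vector is an arbitrary choice (assumption of this sketch).
        v = [float(2 ** i) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(n_iter):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:          # nothing left to extract
                break
            v = [x / norm for x in w]
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        eigval = sum(v[i] * bv[i] for i in range(n))
        scale = math.sqrt(max(eigval, 0.0))   # negative values signal non-Euclidean data
        for i in range(n):
            coords[i][k] = scale * v[i]
        # Deflate so the next pass finds the next eigenpair.
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigval * v[i] * v[j]
    return coords

# Hypothetical "map": the corners of a 3 x 4 rectangle.
points = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0), (0.0, 4.0)]
dist = [[math.dist(p, q) for q in points] for p in points]
recovered = classical_mds(dist)
max_error = max(abs(math.dist(recovered[i], recovered[j]) - dist[i][j])
                for i in range(4) for j in range(4))
print(f"largest distance error after recovery: {max_error:.2e}")
```

The recovered coordinates generally differ from the originals by a rotation, reflection or shift; only the inter-point distances are reproduced.
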
The RANK of distances can recover the Map … though not the coastline
NEWMDSX (RUNSCRIPT + SYNTAX)
  RUN NAME      Rank of Scottish distances
  COMMENT       1 = smallest; 120 = max; dissimilarity data
                F3.3, p48 The User’s Guide to MDS
  N OF STIMULI  16
  PARAMETERS    DATA TYPE(1)
  LABELS        BERWICK
                EDINBURGH
                GLASGOW
                STRANRAER
                AYR
                PERTH
                DUNDEE
                ABERDEEN
                STIRLING
                OBAN
                FORT_WM
                INVERNESS
                KYLE_LOCHALSH
                BRAEMAR
                ULLAPOOL
                THURSO
  READ MATRIX
   17
   53  11
   92  68  36
   70  30   4  11
   34   7  19  83  45
   27   8  29  93  58   1
   63  56  83 115 103  35  24
   43   4   2  58  21   3  14  66
   99  57  26  72  36  43  63  98  28
   96  60  39  89  62  41  49  79  33   6
  100  75  75 112  97  45  49  48  60  52  22
  111  89  78 107  89  72  81  94  70  26   9  23
   67  36  51 106  80  15  10  19  30  65  30  15  54
  114 105 101 119 109  85  87  86  88  68  40  16  17  55
  117 113 115 120 119 103 102  77 110 107  95  47  62  74  42
  COMPUTE
Slide annotations: 1 = Perth-Dundee (row 7, column 6); 120 = Stranraer-Thurso (row 16, column 4).

WHAT IS MULTIDIMENSIONAL SCALING?
A student’s definition:
• “If you are interested in how certain objects relate to each other … and if you would like to present these relationships in the form of a map then MDS is the technique you need” (Mr Gawels, KUB). A good start!
MDS provides …
• a useful and easily-assimilable graphic visualisation of all sorts of data
  • Tukey: “A picture is worth a thousand words”
• in a user-chosen (small) number of dimensions
• a graphical representation of the structure underlying a complex data set
• a measure of how well / badly the solution distances match the data dissimilarities (Stress)

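The slides do not say which stress formula is meant. One widely used normalisation is Kruskal's Stress-1, shown here as an illustration only. In it, d_ij are the distances in the solution and \hat{d}_{ij} are the "disparities", i.e. the data dissimilarities δ_ij after the permitted rescaling (a symbol introduced for this note, not used on the slides).

```latex
\[
\text{Stress}_1 \;=\;
\sqrt{\frac{\sum_{i<j}\bigl(d_{ij}-\hat{d}_{ij}\bigr)^{2}}
           {\sum_{i<j} d_{ij}^{2}}}
\]
```

A value of 0 means the solution distances reproduce the rescaled data exactly; larger values mean a worse match.
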
MDS is a family of models differentiated by …
• (DATA) the empirical inter-relationships between a set of “objects”/variables, which are given in a set of dis/similarity data
  • Basically, the type of input data, defined by their “Way” and “Mode” [e.g. 2W1M]. (Cf observations vs data)
• (FUNCTION) data are then optimally re-scaled (according to the permissible transformations for the data) in terms of …
  • Choice of level of measurement [e.g. ordinal]
• (MODEL) the assumptions of the model chosen to represent the data
  • Usually the (Euclidean) Distance model

VARIANTS OF MDS due to type of data
MDS can be used with a wide variety of DATA, e.g.:
SORTS OF DATA
• direct data (pair comparisons, ratings, rankings, triads, counts)
• derived data (profiles, co-occurrence matrices, textual data, aggregated data)
• measures of association etc. derived from simpler data, and
• tables of data.
TYPES OF DATA
• Described by WAY (2W = matrix; 3W = stack of matrices …)
• and MODE (number of sets of distinct objects, e.g. variables, subjects)
• E.g. 2W1M; 2W2M; 3W2M … 7W4M

VARIANTS OF MDS MODELS due to TRANSFORMATIONS
MDS can also be used with a wide variety of transformations (“levels of measurement”):
• monotonic (ordinal),
• linear/metric (interval), … but also
• splines (SPSS PROXSCAL)
• local preservation of distance
• log-interval (MRSCAL)
• power (MULTISCALE)
• “smoothness”

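For the monotonic (ordinal) case, the usual rescaling device is monotone regression. The following is a minimal sketch using the pool-adjacent-violators idea: given the solution distances listed in the rank order of the data, it returns the closest (least-squares) non-decreasing sequence. The function name and the five example values are invented for illustration; it is not claimed to match the internals of any program listed in these slides.

```python
def monotone_regression(values):
    """Least-squares non-decreasing fit to `values` (pool adjacent violators)."""
    blocks = []  # each block holds [sum of values, number of values]
    for v in values:
        blocks.append([v, 1])
        # Merge backwards while a block's mean exceeds the mean of the block after it.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted

# Hypothetical distances, listed from the smallest data dissimilarity to the largest.
distances_in_data_order = [1.0, 3.0, 2.0, 4.0, 3.5]
disparities = monotone_regression(distances_in_data_order)
print(disparities)  # [1.0, 2.5, 2.5, 3.75, 3.75]
```

Wherever the distances violate the order of the data (3.0 before 2.0, 4.0 before 3.5), the offending values are replaced by their mean, so only the rank order of the data is used.
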
VARIANTS OF MDS due to type of MODEL
• DISTANCE: “Minkowski-r”
  • Usually Euclidean (r = 2)
  • Less often “City Block” (r = 1)
  • Sometimes “Dominance” (r = ∞, approximated by r ≈ 32)
• SCALAR PRODUCTS / Factor
  • Scalar product: a · b = |a| |b| cos θ
  • E.g. covariance, PM correlation
  • As used in PCA, FA, MDPREF
• COMPOSITION
  • Most usually additive (cf ANOVA), as in Impression Formation:
  • X(i,j) = a(i) + b(j) + …
  • NB: ordinal / non-metric ANOVA
  • But also difference, product, mixed

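The Minkowski-r family can be illustrated in a few lines. This sketch uses an invented function name and two made-up points; it shows how r = 1, r = 2 and r = ∞ give the City Block, Euclidean and Dominance distances, and why a large finite r such as 32 is already very close to the r = ∞ case.

```python
import math

def minkowski(a, b, r=2.0):
    """Minkowski-r distance between two points given as equal-length sequences."""
    diffs = [abs(x - y) for x, y in zip(a, b)]
    if math.isinf(r):
        return max(diffs)                  # dominance metric: largest single difference
    return sum(d ** r for d in diffs) ** (1.0 / r)

p, q = (0.0, 0.0), (3.0, 4.0)              # hypothetical pair of points
city_block = minkowski(p, q, 1)            # 3 + 4
euclidean = minkowski(p, q, 2)             # sqrt(9 + 16)
near_dominance = minkowski(p, q, 32)       # just above max(3, 4)
dominance = minkowski(p, q, math.inf)      # max(3, 4)
print(city_block, euclidean, near_dominance, dominance)
```
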
HOW DOES MDS WORK?
• Iteratively!
• START: produce an initial “guesstimate” configuration
• (a) FIT
  • Calculate distances (d)
  • Compare with data (δ) [via ordinal regression]
  • Calculate the overall badness-of-fit measure
  • Stress (d − δ) … well, almost! Actually more complex
  • Perfect/acceptable? EXIT
• (b) IMPROVE: for each point,
  • find the direction of improvement (don’t ask: calculus! Derivatives!)
  • How far to move? Step-size (call it “heuristic”; “parachute & mist”)
• (c) MOVE configuration/points
• BACK TO (a)

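The START / FIT / IMPROVE / MOVE loop can be sketched as follows. This is a deliberately simplified illustration, not the algorithm of any particular program: it treats the data as metric (the ordinal-regression step is left out, so the distances are compared with δ directly), it uses raw stress (the sum of squared differences) as the badness-of-fit measure, and the function names, random start and test data are all invented. The fixed step size 1/(2n) is a choice made for this sketch; for this unweighted objective it coincides with the Guttman-transform update used in majorisation (SMACOF-style) algorithms, so stress should not increase from one iteration to the next.

```python
import math
import random

def distances(config):
    """All pairwise Euclidean distances of a configuration (list of points)."""
    return [[math.dist(p, q) for q in config] for p in config]

def raw_stress(d, delta):
    """Sum of squared differences between distances d and data delta."""
    n = len(d)
    return sum((d[i][j] - delta[i][j]) ** 2
               for i in range(n) for j in range(i + 1, n))

def fit_mds(delta, n_dims=2, max_iter=1000, tol=1e-12, seed=0):
    rng = random.Random(seed)
    n = len(delta)
    # START: a random initial "guesstimate" configuration.
    config = [[rng.uniform(-1.0, 1.0) for _ in range(n_dims)] for _ in range(n)]
    step = 1.0 / (2 * n)                       # step size (assumption, see lead-in)
    history = []
    for _ in range(max_iter):
        d = distances(config)                  # (a) FIT: distances and badness of fit
        history.append(raw_stress(d, delta))
        if history[-1] < tol:                  # perfect/acceptable? EXIT
            break
        if len(history) > 1 and history[-2] - history[-1] < tol:
            break                              # no further improvement
        moved = []
        for i in range(n):                     # (b) IMPROVE: gradient for each point
            grad = [0.0] * n_dims
            for j in range(n):
                if j != i and d[i][j] > 0.0:
                    coef = 2.0 * (d[i][j] - delta[i][j]) / d[i][j]
                    for k in range(n_dims):
                        grad[k] += coef * (config[i][k] - config[j][k])
            moved.append([config[i][k] - step * grad[k] for k in range(n_dims)])
        config = moved                         # (c) MOVE all points, then back to (a)
    return config, history

# Hypothetical data: distances between the corners of a 3 x 4 rectangle.
target = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0), (0.0, 4.0)]
delta = distances(target)
config, history = fit_mds(delta)
print(f"raw stress: {history[0]:.3f} -> {history[-1]:.6f} "
      f"after {len(history)} evaluations")
```

A different `seed` gives a different starting configuration, and the loop may stop in a local minimum rather than at the best solution.
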
MDS PROGRAMS
1. Usually either a “General Purpose” package (SPSS)
  • Basic model for 2W1M data: PROXSCAL; and for 3W2M: INDSCAL
  • Also contains CORRESP, HICLUS and (after SPSS 13) PREFSCAL (2W2M)
2. or a “Library”: a set of programs, each specific to data-shape, transformation and model (e.g. NewMDSX for Windows); includes
  • BASIC 2W1M SCALING: non-metric (ordinal) MINISSA, metric (linear) MRSCAL
  • Clustering (hierarchical & non-hierarchical)
  • 2W2M (“Rectangular”) SCALING: multidimensional … Preference, Triads, Unfolding, Sorting
  • 3W2M (and higher) SCALING: Individual Differences (INDSCAL), (Tucker) Points-of-View, Procrustean Individual Differences (Lingoes’ PINDIS)
3. or an “Interactive” package (PERMAP via NewMDSX)
  • primarily for the basic model
  • visually animated
  • superb diagnostic procedures

SITES & SOFTWARE
• NEWMDSX AND DOCUMENTATION: http://www.newmdsx.com
• INTERACTIVE PERMAP (Heady): presently obtained via NewMDSX
• THREE-WAY SCALING (Kroonenberg): http://www.leidenuniv.nl/fsw/three-mode/content.htm
• FORREST YOUNG’S VISTA (Visual Statistics): http://forrest.psych.unc.edu/research/index.html

WHAT IS MDS? … and now for an example!

APPENDICES
• Interpretation: Headlines
• MVA & MDS

MDS: Interpretation: Headlines
For Euclidean Distance MDS:
• “What information is stable/significant?”
  • Beware local minima [PERMAP]
• Remember: you may translate, reflect and (rigidly) rotate the configuration: do so! [e.g. NewMDSX Graphics; PERMAP]
• “CLEARING UP” the configuration: [PERMAP] Map Evaluation & Diagnostics; Points and Links; selective removal and hints of structure via Waern’s Graphic links.
• BASIC STRUCTURES:
  • Regional: which points are close to each other and distant from others? CLUSTERING [(HI)CLUS, SPSS]
  • Linear: directions in space along which some property is increasing: external properties [PRO-FIT, NewMDSX]
  • If you must ... dimensions: remember that changing the origin or the dimensional orientation has no effect on relative distance. Most MDS solutions are rotated at the end to PCA (principal axes) ... Unlike FA, dimensions may or may not have importance.
• SIMPLE STRUCTURES: dimensions, yes, but also other simple structures (“horseshoes”, radex/circumplex).

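The point that translating, reflecting and rigidly rotating a Euclidean configuration leaves the relative distances untouched can be checked numerically. The sketch below uses an invented five-point configuration and an arbitrary angle and shift; all names are illustrative.

```python
import math

def pairwise(config):
    """Euclidean distances for every pair of points, in a fixed order."""
    n = len(config)
    return [math.dist(config[i], config[j])
            for i in range(n) for j in range(i + 1, n)]

def rotate(config, angle):
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in config]

def reflect(config):
    return [(-x, y) for x, y in config]        # mirror in the vertical axis

def translate(config, dx, dy):
    return [(x + dx, y + dy) for x, y in config]

# Hypothetical 2-D solution.
config = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0), (0.0, 4.0), (1.0, 2.5)]
moved = translate(reflect(rotate(config, math.radians(37))), 10.0, -2.0)
max_change = max(abs(a - b) for a, b in zip(pairwise(config), pairwise(moved)))
print(f"largest change in any distance: {max_change:.2e}")
```

Because only the distances are fitted, the orientation of the axes in a Euclidean solution is arbitrary, which is why it is legitimate to re-orient the configuration to aid interpretation.
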
MDS & other “Dimensional” Multivariate Analysis models

SOME POSSIBLE WEAKNESSES in MDS (There ARE any??!)
• Relative ignorance of the sampling/inferential properties of stress
  • But: simulation (Spence), maximum-likelihood estimation
• Proneness to local-minima solutions
  • But less so, and multiple starts and interactive programs like PERMAP allow thousands of runs to check
• A few forms of data/models are prone to degeneracies
  • Especially MD Unfolding (but see the new PREFSCAL in SPSS 14)
• Difficulty in representing the asymmetry of causal models
  • Though external analysis is very akin to dependent-independent modelling
  • There are convergences with GLM in hybrid models such as CLASCAL (INDSCAL with parameterization of latent classes)