
NONLINEAR MAPPING: APPROACHES BASED ON OPTIMIZING AN INDEX OF CONTINUITY AND APPLYING CLASSICAL METRIC MDS TO REVISED DISTANCES. By Ulas Akkucuk & J. Douglas Carroll, Rutgers Business School – Newark and New Brunswick.


Presentation Transcript


  1. NONLINEAR MAPPING: APPROACHES BASED ON OPTIMIZING AN INDEX OF CONTINUITY AND APPLYING CLASSICAL METRIC MDS TO REVISED DISTANCES • By Ulas Akkucuk & J. Douglas Carroll, Rutgers Business School – Newark and New Brunswick

  2. Outline • Introduction • Nonlinear Mapping Algorithms • Parametric Mapping Approach • ISOMAP Approach • Other Approaches • Experimental Design and Methods • Error Levels • Evaluation of Mapping Performance • Problem of Similarity Transformations • Results • Discussion and Future Direction

  3. Introduction • Problem: To determine a smaller set of variables necessary to account for a larger number of observed variables • PCA and MDS are useful when the relationship is linear • Alternative approaches are needed when the relationship is highly nonlinear

  4. Shepard and Carroll (1966) • Locally monotone analysis of proximities: Nonmetric MDS treating large distances as missing • Worked well if the nonlinearities were not too severe (in particular if the surface is not closed such as a circle or sphere) • Optimization of an index of “continuity” or “smoothness” • Incorporated into a computer program called “PARAMAP” and tested on various sets of data

  5. 20 points on a circle

  6. 62 regularly spaced points on a sphere, and the azimuthal equidistant projection of the world

  7. 49 points regularly spaced on a torus embedded in four dimensions

  8. In all cases the local structure is preserved, except at points where the shape is “cut open” or “punctured” • Results were successful, but a severe local minimum problem existed • Adding error to the regular spacing made the local minimum problem worse • Current work is stimulated by two articles on nonlinear mapping (Tenenbaum, de Silva, & Langford, 2000; Roweis & Saul, 2000)

  9. Nonlinear Mapping Algorithms • n : number of objects • M : dimensionality of the input coordinates, in other words of the configuration for which we would like to find an underlying lower dimensional embedding • R : dimensionality of the space of the recovered configuration, where R < M • Y : n × M input matrix • X : n × R output matrix

  10. The distances between point i and point j in the input and output spaces, $\delta_{ij}$ and $d_{ij}$ respectively, are collected in the matrices $\Delta = [\delta_{ij}]$ and $D = [d_{ij}]$
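
As a minimal sketch of the two distance matrices, assuming ordinary Euclidean distances in both spaces (an assumption, since the slide's formulas are not preserved here), the following computes one matrix from Y and one from X. The function name `pairwise_distances` and the sample coordinates are hypothetical.

```python
import math

def pairwise_distances(points):
    """Return the n x n matrix of Euclidean distances between the rows of `points`.

    Assumption: distances are Euclidean; the transcript does not state the metric.
    """
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]

# Hypothetical example: n = 3 objects, M = 3 input dimensions, R = 2 output dimensions.
Y = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]  # n x M input matrix
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]                  # n x R output matrix

Delta = pairwise_distances(Y)  # distances in the input space
D = pairwise_distances(X)      # distances in the output space
```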

  11. Parametric Mapping Approach • Works via optimizing an index of “continuity” or “smoothness” • Early application in the context of time-series data (von Neumann, Kent, Bellinson, & Hart, 1941; von Neumann, 1941)

  12. A more general expression for the numerator is: • Generalizing to the multidimensional case we reach 

  13. Several modifications are needed for the minimization procedure: • $d_{ij}^2 + Ce^2$ is substituted for $d_{ij}^2$, where C is a constant equal to 2 / (n - 1) and e takes on values between 0 and 1 • e has the practical effect of accelerating the numerical process • e can be thought of as an extra “specific” dimension: as e gets closer to 0, points are made to approach the “common” part of the space

  14. In the numerator the constant z, and in the denominator $[2/(n(n-1))]^2$ • Final form of function:
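
The final function itself is not preserved in this transcript. The sketch below is therefore only an illustration under a stated assumption: that the index has the form κ = mean(δ²ij / d⁴ij) / [mean(1 / d²ij)]², taken over all pairs i < j. Only the substitution of d²ij + Ce² with C = 2 / (n - 1) comes from the slides; the function names and the averaging used for normalization are hypothetical.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two coordinate sequences."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def continuity_index(Y, X, e=0.0):
    """Assumed form of a continuity index (lower means a smoother mapping).

    Y: input configuration (n x M), X: output configuration (n x R).
    d_ij^2 + C*e^2 replaces d_ij^2, with C = 2 / (n - 1), as on the slides.
    With e = 0, two coincident rows of X cause a division by zero.
    """
    n = len(Y)
    C = 2.0 / (n - 1)
    numerator = 0.0
    denominator = 0.0
    pairs = 0
    for i in range(n):
        for j in range(i + 1, n):
            delta2 = squared_distance(Y[i], Y[j])      # input-space squared distance
            d2 = squared_distance(X[i], X[j]) + C * e * e  # modified output-space squared distance
            numerator += delta2 / (d2 * d2)
            denominator += 1.0 / d2
            pairs += 1
    return (numerator / pairs) / (denominator / pairs) ** 2

# Hypothetical check: an order-preserving map scores lower than a scrambled one.
Y = [[0.0], [1.0], [2.0], [3.0], [4.0]]
smooth = continuity_index(Y, [[0.0], [1.0], [2.0], [3.0], [4.0]])
scrambled = continuity_index(Y, [[0.0], [3.0], [1.0], [4.0], [2.0]])
```

Under this assumed form, configurations in which nearby output points correspond to distant input points are penalized heavily, because each pair is weighted by 1 / d⁴ij.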

  15. Implemented in C++ (GNU-GCC compiler) • Program takes as input e, number of repetitions, dimensionality R to be recovered, and number of random starts or starting input configuration • 200 iterations each for 100 different random configurations yields reasonable solutions • Then this resulting best solution can be further fine tuned by performing more iterations

  16. ISOMAP Approach • Tries to overcome difficulties in MDS by replacing the Euclidean metric by a new metric • Figure (Lee, Lendasse, & Verleysen, 2002)

  17. To approximate the “geodesic” distances, ISOMAP constructs a neighborhood graph that connects the closer points • This is done by connecting each point to its k closest neighbors, or to all points within a distance ε of it • A shortest path procedure is then applied to the resulting matrix of modified distances • Finally, classical metric MDS is applied to obtain the configuration in the lower dimensionality
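
The first two steps above (neighborhood graph, then shortest paths) can be sketched as follows. This is a minimal illustration rather than the reference ISOMAP code: it assumes the k-nearest-neighbor rule, Euclidean input distances, and a Floyd-Warshall shortest path pass, and it stops before the classical MDS step. The function name `geodesic_distances` is hypothetical.

```python
import math

def geodesic_distances(points, k):
    """Approximate geodesic distances via a symmetric k-nearest-neighbor graph."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Start with no edges: infinite distance everywhere except the diagonal.
    graph = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for i in range(n):
        nearest = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
        for j in nearest:
            graph[i][j] = graph[j][i] = dist[i][j]  # keep the graph symmetric

    # Floyd-Warshall: replace each entry by the shortest path length in the graph.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                via = graph[i][m] + graph[m][j]
                if via < graph[i][j]:
                    graph[i][j] = via
    return graph

# Hypothetical example: 21 points on a unit half circle.
half_circle = [(math.cos(math.pi * t / 20), math.sin(math.pi * t / 20)) for t in range(21)]
geo = geodesic_distances(half_circle, k=2)
end_to_end = geo[0][20]  # close to the arc length pi, not the chord length 2
```

Entries that remain infinite indicate a disconnected neighborhood graph, in which case a larger k (or ε) would be needed before applying MDS.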

  18. Other Approaches • Nonmetric MDS: Minimizes a cost function • Needed to implement locally monotone MDS approach of Shepard (Shepard & Carroll, 1966)

  19. Sammon’s mapping: Minimizes a mapping error function • Kruskal (1971) indicated certain options used with nonmetric MDS programs would give the same results

  20. Multidimensional scaling by iterative majorization (Webb, 1995) • Curvilinear Distance Analysis (CDA) (Lee et al., 2002), analogue of ISOMAP, omits the MDS step replacing it by a minimization step • Self organizing map (SOM) (Kohonen 1990, 1995) • Auto associative feedforward neural networks (AFN) (Baldi & Hornik, 1989; Kramer, 1991)

  21. Experimental Design and Methods • Primary focus: 62 points on a sphere, located at the intersections of 5 equally spaced parallels and 12 equally spaced meridians • Two types of error, A and B • A: 0%, 10%, 20% • B: ±0.00, ±0.01, ±0.05, ±0.10, ±0.20 • A and B control the points being irregularly spaced and being inside or outside the sphere, respectively
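
A minimal sketch of the error-free, regularly spaced configuration follows. Since 5 × 12 gives only 60 intersections, it assumes that the two poles supply the remaining 2 points; that reading, the unit radius, and the function name `sphere_grid` are assumptions. The two error types are not modeled here.

```python
import math

def sphere_grid(n_parallels=5, n_meridians=12, radius=1.0):
    """Points at parallel/meridian intersections plus the two poles (assumed)."""
    points = [(0.0, 0.0, radius)]  # north pole (assumption)
    for p in range(1, n_parallels + 1):
        # Equally spaced latitudes strictly between the poles: 60, 30, 0, -30, -60 degrees.
        lat = math.pi / 2 - p * math.pi / (n_parallels + 1)
        for m in range(n_meridians):
            lon = 2 * math.pi * m / n_meridians
            points.append((radius * math.cos(lat) * math.cos(lon),
                           radius * math.cos(lat) * math.sin(lon),
                           radius * math.sin(lat)))
    points.append((0.0, 0.0, -radius))  # south pole (assumption)
    return points

Y = sphere_grid()  # 62 x 3 input configuration
```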

  22. To evaluate mapping performance: We calculate the “rate of agreement in local structure”, abbreviated “agreement rate” or A • Similar to the Rand index used to compare partitions (Rand, 1971; Hubert & Arabie, 1985) • Let $a_i$ stand for the number of points that are in the k-nearest neighbor list for point i in both X and Y. A will be equal to $\frac{1}{kn}\sum_{i=1}^{n} a_i$

  23. Example of calculating the agreement rate: k = 2, agreement rate = 2/10 or 20%
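
The agreement rate can be sketched in a few lines. The version below assumes Euclidean distances and breaks ties among equidistant neighbors by point index, neither of which is specified in the slides; the function names are hypothetical. The denominator kn matches the slide's example, where k = 2 and the denominator is 10.

```python
import math

def knn_sets(points, k):
    """For each point, the set of indices of its k nearest neighbors."""
    n = len(points)
    result = []
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: math.dist(points[i], points[j]))
        result.append(set(others[:k]))
    return result

def agreement_rate(Y, X, k):
    """Share of k-nearest-neighbor memberships common to Y and X."""
    neighbors_y = knn_sets(Y, k)
    neighbors_x = knn_sets(X, k)
    a = [len(neighbors_y[i] & neighbors_x[i]) for i in range(len(Y))]  # a_i per point
    return sum(a) / (k * len(Y))

# Hypothetical example: 5 points on a line, recovered in scrambled order.
Y = [[0.0], [1.0], [2.0], [3.0], [4.0]]
X = [[0.0], [3.0], [1.0], [4.0], [2.0]]
A = agreement_rate(Y, X, k=2)
```

A value of 1 means every neighbor list is reproduced exactly; identical configurations always give A = 1.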

  24. Problem of similarity transformations: We use standard software to rotate the different solutions into optimal congruence with a landmark solution (Rohlf & Slice 1989) • We use the solution for the error free and regularly spaced sphere as the landmark • We also report the variance accounted for (VAF)

  25. The VAF results may not be very good • Similarity transformation step is not enough • An alternating algorithm is needed that reorders the points on each of the five parallels and then finds the optimal similarity transformation • We also provide Shepard-like diagrams

  26. Why is a similarity transformation not enough?

  27. Results • Agreement rate for the regularly spaced, error-free sphere: 82.9% (k = 5) • Over 1000 randomizations of the solution, the average and standard deviation of the agreement rate are 8.1% and 1.9% respectively; the minimum and maximum are 3.5% and 16.7%

  28. We can use Chebychev’s inequality, stated as $P(|X - \mu| \ge k\sigma) \le 1/k^2$ • 82.9 is about 40 standard deviations away from the mean, so an upper bound on the probability that this event happens by chance is $1/40^2$ or 0.000625, very low!
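
The arithmetic behind this bound can be checked directly from the reported figures (observed 82.9%, randomization mean 8.1%, standard deviation 1.9%). The helper name `chebyshev_bound` is hypothetical. Without rounding to 40, the distance is about 39.4 standard deviations and the bound is marginally larger than 0.000625.

```python
def chebyshev_bound(observed, mean, sd):
    """Return (number of standard deviations from the mean, Chebyshev upper bound)."""
    z = abs(observed - mean) / sd
    return z, 1.0 / z ** 2

# Figures reported on the slides, in percent.
z, bound = chebyshev_bound(observed=82.9, mean=8.1, sd=1.9)

# The slide rounds z up to 40 before applying the bound.
rounded_bound = 1.0 / 40 ** 2
```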

  29-32. Figure panels (a) to (o)

  33. ISOMAP: A = 48.1% • PARAMAP: A = 82.9%

  34. Shepard-like Diagrams

  35. Swiss Roll Data – 130 points • Agreement rate: ISOMAP 59.7%, PARAMAP 70.5%

  36. Discussion and Future Direction • Disadvantage of PARAMAP: Run time • Advantage of ISOMAP: Noniterative procedure, can be applied to very large data sets with ease • Disadvantage of ISOMAP: Bad performance in closed data sets like the sphere

  37. Improvements in the computational efficiency of PARAMAP should be explored: • Use of a conjugate gradient algorithm instead of a straight gradient algorithm • Use of a conjugate gradient algorithm with restarts • Possible combination of straight gradient and conjugate gradient approaches • Improvements that could benefit both ISOMAP and PARAMAP: • A wise selection of landmarks and an interpolation or extrapolation scheme to recover the rest of the data
