Isomap Algorithm http://isomap.stanford.edu/ Yuri Barseghyan Yasser Essiarab

Presentation Transcript


  1. Isomap Algorithm http://isomap.stanford.edu/ Yuri Barseghyan Yasser Essiarab

  2. Linear Methods for Dimensionality Reduction • PCA (Principal Component Analysis): rotate data so that principal axes lie in direction of maximum variance • MDS (Multi-Dimensional Scaling): find coordinates that best preserve pairwise distances
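Both linear methods on this slide can be sketched in a few lines of NumPy. This is a minimal illustration under the slide's definitions, not a reference implementation; the function names are ours:

```python
import numpy as np

def pca(X, d):
    """Project X onto its top-d principal axes (directions of maximum variance)."""
    Xc = X - X.mean(axis=0)                     # center the data
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:d].T                        # rows of Vt are the principal axes

def classical_mds(D, d):
    """Find d-dimensional coordinates whose pairwise distances best preserve D."""
    n = len(D)
    J = np.eye(n) - np.ones((n, n)) / n         # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                 # double-centered squared distances
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:d]               # keep the top-d eigenpairs
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))
```

Classical MDS reappears as the final step of Isomap later in the deck, applied to geodesic rather than Euclidean distances.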

  3. Limitations of Linear methods • What if the data does not lie within a linear subspace? • Do all convex combinations of the measurements generate plausible data? • A low-dimensional non-linear manifold embedded in a higher-dimensional space http://www.cs.unc.edu/Courses/comp290-090-s06/Lecturenotes/DimReduction1.pdf

  4. Non-linear Dimensionality Reduction • What about data that cannot be described by a linear combination of latent variables? • Examples: swiss roll, s-curve • In the end, linear methods do nothing more than “globally transform” (rotate/translate/scale) the data. Sometimes we need to “unwrap” the data first (figure: PCA applied to a swiss roll) http://www.cs.unc.edu/Courses/comp290-090-s06/Lecturenotes/DimReduction2.pdf
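The swiss roll is easy to generate directly; the construction below is a common parametrization (the constants are arbitrary choices of ours). Points on adjacent layers of the roll can be close in 3-D Euclidean distance while being far apart along the rolled-up 2-D sheet, which is exactly what defeats a linear projection:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
# a 2-D sheet parametrized by (t, h), rolled up into 3-D
t = 1.5 * np.pi * (1 + 2 * rng.random(n))   # angle along the roll
h = 21 * rng.random(n)                      # height across the roll
X = np.column_stack([t * np.cos(t), h, t * np.sin(t)])
```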

  5. Non-linear Dimensionality Reduction • Unwrapping the data = “manifold learning” • Assume data can be embedded on a lower-dimensional manifold • Given data set X = {xi}i=1…n, find representation Y = {yi}i=1…n where Y lies on lower-dimensional manifold • Instead of preserving global pairwise distances, non-linear dimensionality reduction tries to preserve only the geometric properties of local neighborhoods

  6. Isometry • From Mathworld: two Riemannian manifolds M and N are isometric if there is a diffeomorphism such that the Riemannian metric from one pulls back to the metric on the other. For a complete Riemannian manifold: d(x, y) = geodesic distance between x and y • Informally, an isometry is a smooth invertible mapping that looks locally like a rotation plus translation • Intuitively, for 2-dimensional case, isometries include whatever physical transformations one can perform on a sheet of paper without introducing tears, holes, or self-intersections
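The informal definition, "looks locally like a rotation plus translation", can be checked numerically: a rigid motion of the plane leaves every pairwise distance unchanged, just like moving a sheet of paper without tearing it (a small sketch with arbitrary numbers):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.random((5, 2))                      # points on a "sheet of paper"
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
Y = X @ R.T + np.array([3.0, -1.0])         # rotate, then translate

def pdist(Z):
    """Matrix of all pairwise Euclidean distances."""
    return np.linalg.norm(Z[:, None] - Z[None, :], axis=-1)
```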

  7. Trustworthiness [2] Trustworthiness quantifies how trustworthy a projection of a high-dimensional data set onto a low-dimensional space is. Specifically, a projection is trustworthy if the set of the t nearest neighbors of each data point in the low-dimensional space are also close by in the original space. Here r(i, j) is the rank of the data point j in the ordering according to the distance from i in the original data space, and Ut(i) denotes the set of those data points that are among the t nearest neighbors of the data point i in the low-dimensional space but not in the original space. The measure is M(t) = 1 − 2/(N t (2N − 3t − 1)) Σi Σj∈Ut(i) (r(i, j) − t). The maximal value that trustworthiness can take is one: the closer M(t) is to one, the better the low-dimensional space describes the original data.
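M(t) can be written down directly from the definitions above. A minimal NumPy sketch, assuming t < N/2 so the normalizer is valid; a perfect embedding (no neighbor set changes) gives M(t) = 1:

```python
import numpy as np

def trustworthiness(X, Y, t):
    """M(t) for original data X and low-dimensional embedding Y."""
    n = len(X)
    dx = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # squared dists, original
    dy = ((Y[:, None, :] - Y[None, :, :]) ** 2).sum(-1)   # squared dists, embedding
    order_x = np.argsort(dx, axis=1)          # self lands at position 0
    rank_x = np.empty_like(order_x)
    for i in range(n):
        rank_x[i, order_x[i]] = np.arange(n)  # rank_x[i, j] = r(i, j)
    penalty = 0.0
    for i in range(n):
        nn_x = set(order_x[i, 1:t + 1].tolist())   # t-NN in the original space
        nn_y = np.argsort(dy[i])[1:t + 1]          # t-NN in the embedding
        for j in nn_y:
            if j not in nn_x:                      # j is in U_t(i)
                penalty += rank_x[i, j] - t
    return 1.0 - 2.0 / (n * t * (2 * n - 3 * t - 1)) * penalty
```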

  8. Several methods to learn a manifold • Two to start: • Isomap [Tenenbaum 2000] • Locally Linear Embeddings (LLE) [Roweis and Saul, 2000] • Recently: • Semidefinite Embeddings (SDE) [Weinberger and Saul, 2005]

  9. An important observation • Small patches on a non-linear manifold look linear • These locally linear neighborhoods can be defined in two ways • k-nearest neighbors: find the k nearest points to a given point, under some metric. Guarantees all items are similarly represented; limits dimension to k−1 • ε-ball: find all points that lie within ε of a given point, under some metric. Best if the density of items is high and every point has a sufficient number of neighbors http://www.cs.unc.edu/Courses/comp290-090-s06/Lecturenotes/DimReduction1.pdf
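The two neighborhood definitions can be sketched as follows (a minimal brute-force version; real implementations would use a spatial index):

```python
import numpy as np

def knn_neighbors(X, k):
    """Indices of each point's k nearest neighbors (self excluded)."""
    d = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)               # a point is not its own neighbor
    return np.argsort(d, axis=1)[:, :k]

def eps_ball_neighbors(X, eps):
    """For each point, indices of all points within distance eps (self excluded)."""
    d = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    return [np.flatnonzero(row <= eps) for row in d]
```

Note the trade-off visible in the code: every k-NN list has exactly k entries, while an ε-ball can be empty for an isolated point.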

  10. Isomap • Find coordinates on a lower-dimensional manifold that preserve geodesic distances instead of Euclidean distances • Key Observation: if the goal is to discover the underlying manifold, geodesic distance makes more sense than Euclidean (figure: two points with small Euclidean distance but large geodesic distance) http://www.cs.unc.edu/Courses/comp290-090-s06/Lecturenotes/DimReduction1.pdf

  11. Calculating geodesic distance • We know how to calculate Euclidean distance • Locally linear neighborhoods mean that we can approximate geodesic distance within a neighborhood using Euclidean distance • A graph is constructed by connecting each point to its K nearest neighbours • Approximate geodesic distances are calculated by finding the length of the shortest path in the graph between points • Use Dijkstra’s algorithm to fill in the remaining distances http://www.maths.lth.se/bioinformatics/calendar/20040527/NilssonJ_KI_27maj04.pdf

  12. Dijkstra’s Algorithm • Greedy best-first algorithm to compute shortest paths from one point to all other points http://www.cs.unc.edu/Courses/comp290-090-s06/Lecturenotes/DimReduction2.pdf

  13. Isomap Algorithm • Compute fully-connected neighborhood of points for each item • Can be k nearest neighbors or ε-ball • Calculate pairwise Euclidean distances within each neighborhood • Use Dijkstra’s Algorithm to compute shortest path from each point to non-neighboring points • Run MDS on resulting distance matrix http://www.cs.unc.edu/Courses/comp290-090-s06/Lecturenotes/DimReduction2.pdf
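The steps above fit in one short function. This is a minimal sketch, not the reference implementation from isomap.stanford.edu: it uses a symmetrized k-NN graph, Floyd–Warshall for the shortest paths for brevity (the per-node Dijkstra the slide describes is faster), and assumes the neighborhood graph is connected:

```python
import numpy as np

def isomap(X, k=5, d=2):
    """k-NN graph -> geodesic (shortest-path) distances -> classical MDS."""
    n = len(X)
    dist = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    # keep only k-nearest-neighbor edges, symmetrized; inf = no edge
    G = np.full((n, n), np.inf)
    np.fill_diagonal(G, 0.0)
    nn = np.argsort(dist, axis=1)[:, 1:k + 1]
    for i in range(n):
        G[i, nn[i]] = dist[i, nn[i]]
        G[nn[i], i] = dist[i, nn[i]]
    # Floyd-Warshall: approximate geodesics as shortest paths in the graph
    for m in range(n):
        G = np.minimum(G, G[:, m:m + 1] + G[m:m + 1, :])
    # classical MDS on the geodesic distance matrix
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (G ** 2) @ J
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:d]
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))
```

On ten evenly spaced points along a line, the 1-D embedding recovers the spacing exactly, because every graph shortest path equals the true distance along the line.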

  14. Isomap Algorithm [3]

  15. Time Complexity of Algorithm http://www.cs.rutgers.edu/~elgammal/classes/cs536/lectures/NLDR.pdf

  16. Isomap Results Find a 2D embedding of the 3D S-curve http://www.cs.unc.edu/Courses/comp290-090-s06/Lecturenotes/DimReduction2.pdf

  17. Residual Fitting Error Plotting eigenvalues from MDS will tell you dimensionality of your data http://www.cs.unc.edu/Courses/comp290-090-s06/Lecturenotes/DimReduction2.pdf
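The claim can be illustrated directly: for 3-D points that actually lie on a 2-D plane, the MDS eigenvalue spectrum drops to (numerically) zero after the first two values, revealing the intrinsic dimensionality (a toy construction of ours):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.random((100, 2))            # 2-D latent coordinates
X = A @ rng.random((2, 3))          # embedded linearly into 3-D: rank-2 data
D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
n = len(X)
J = np.eye(n) - np.ones((n, n)) / n
w = np.linalg.eigvalsh(-0.5 * J @ D**2 @ J)[::-1]   # MDS eigenvalues, descending
print(w[:4])   # only the first two eigenvalues are significantly nonzero
```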

  18. Neighborhood Graph http://www.cs.unc.edu/Courses/comp290-090-s06/Lecturenotes/DimReduction2.pdf

  19. More Isomap Results http://www.cs.unc.edu/Courses/comp290-090-s06/Lecturenotes/DimReduction2.pdf

  20. Results of projecting the face data set to two dimensions (Trustworthiness−Continuity) [1]

  21. More Isomap Results http://www.cs.unc.edu/Courses/comp290-090-s06/Lecturenotes/DimReduction2.pdf

  22. Isomap Failures • Isomap has problems on closed manifolds of arbitrary topology http://www.cs.unc.edu/Courses/comp290-090-s06/Lecturenotes/DimReduction2.pdf

  23. Isomap: Advantages • Nonlinear • Globally optimal • Still produces a globally optimal low-dimensional Euclidean representation even when the input space is highly folded, twisted, or curved • Guaranteed asymptotically to recover the true dimensionality.

  24. Isomap: Disadvantages • Guaranteed asymptotically to recover the geometric structure of nonlinear manifolds • As N increases, pairwise distances provide better approximations to geodesics by “hugging the surface” more closely • Graph discreteness overestimates dM(i, j) • K must be high to avoid “linear shortcuts” near regions of high surface curvature • Mapping novel test images to the manifold space is difficult, since Isomap learns no explicit mapping function

  25. Literature
  [1] Jarkko Venna and Samuel Kaski, Nonlinear dimensionality reduction viewed as information retrieval, NIPS 2006 workshop on Novel Applications of Dimensionality Reduction, 9 Dec 2006. http://www.cis.hut.fi/projects/mi/papers/nips06_nldrws_poster.pdf
  [2] Claudio Varini, Visual Exploration of Multivariate Data in Breast Cancer by Dimensional Reduction, March 2006. http://deposit.ddb.de/cgi-bin/dokserv?idn=98073472x&dok_var=d1&dok_ext=pdf&filename=98073472x.pdf
  [3] Yiming Wu and Kap Luk Chan, An Extended Isomap Algorithm for Learning Multi-Class Manifold, Proceedings of the 2004 International Conference on Machine Learning and Cybernetics, Aug. 2004. http://ww2.cs.fsu.edu/~ywu/PDF-files/ICMLC2004.pdf
