Linear and Nonlinear Data Dimensionality Reduction David Gering • Eigenfaces • Turk, Pentland, 1991 • Locally Linear Embedding • Roweis, Saul, 2000 • Isomap • Tenenbaum, de Silva, Langford, 2000
Agenda • Introduction to the Problem • Background on PCA, MDS • Eigenfaces • LLE • Isomap • Summary
Principal Component Analysis • 3 Approaches: • Least-Squares Distance • Change of Variables • Matrix Factorization
PCA Approach 1: Least-Squares Distance [Pearson, 1901] • 0-dimensional fit: a single point, the mean m • D = 2: x_k = [x_k1 x_k2]^T • d = 0: y_k = [m]^T
PCA Approach 1: Least-Squares Distance • 1-dimensional fit: a single line through the mean m along direction u • D = 2: x_k = [x_k1 x_k2]^T • d = 1: y_k = [y_k1]^T • [Figure: data in the standard basis (I1, I2) with the best-fit line u through m]
PCA Approach 1: Least-Squares Distance • d-dimensional fit: d orthogonal lines • D = 2: x_k = [x_k1 x_k2]^T • d = 2: y_k = [y_k1 y_k2]^T • I_i = standard basis (axes), x_ij = standard components • u_i = principal axes, y_ij = principal components • [Figure: data with principal axes u1, u2 through the mean m]
PCA Approach 1: Least-Squares Distance • [Figure: the same data shown twice, once in the standard basis (I1, I2) and once rotated into the principal axes (u1, u2) centered at the mean m]
PCA Approach 2: Change of Variables [Hotelling, 1933] • The k-th principal component is the linear combination u_k^T x that maximizes Var(u_k^T x) subject to u_k^T u_k = 1 and Cov(u_k^T x, u_j^T x) = 0 for j < k
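As a concrete illustration of Approach 2 (not from the original slides), the following NumPy sketch checks that the variance-maximizing directions are exactly the eigenvectors of the sample covariance matrix; the synthetic data matrix X is an assumption made purely for the demo.

```python
import numpy as np

# Synthetic data: N = 500 correlated points in D = 2 (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2)) @ np.array([[3.0, 0.0], [1.0, 0.5]])

# The principal axes u_k are the eigenvectors of the sample covariance
# matrix, ordered by decreasing eigenvalue (= Var(u_k^T x)).
C = np.cov(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(C)        # eigh returns ascending order
order = np.argsort(eigvals)[::-1]
U = eigvecs[:, order]                       # columns are u_1, u_2, ...

# Principal components: maximal variance first, and mutually uncorrelated.
Y = (X - X.mean(axis=0)) @ U
print(np.var(Y, axis=0))                    # decreasing variances
print(np.cov(Y, rowvar=False))              # off-diagonal entries ~ 0
```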
Classical Multidimensional Scaling (MDS) • Given: an N x N matrix D_X of pairwise distances (PCA, by contrast, is given the data X itself) • Solution: double-center the squared distances, B = -(1/2) H D_X^2 H with H = I - (1/N) 1 1^T, then embed with the top d eigenvectors of B scaled by the square roots of their eigenvalues • PCA vs MDS: when D_X holds Euclidean distances, both produce the same low-dimensional embedding
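A minimal NumPy sketch of classical MDS as described above, assuming only the standard double-centering construction (the function name classical_mds is mine, not the slides'):

```python
import numpy as np

def classical_mds(D, d):
    """Embed N points in d dimensions given only pairwise distances D (N x N)."""
    N = D.shape[0]
    H = np.eye(N) - np.ones((N, N)) / N     # centering matrix H = I - (1/N) 1 1^T
    B = -0.5 * H @ (D ** 2) @ H             # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1][:d]   # top-d eigenpairs
    scale = np.sqrt(np.maximum(eigvals[order], 0.0))
    return eigvecs[:, order] * scale        # N x d embedding Y

# With Euclidean distances in D, this reproduces the PCA embedding
# up to sign flips and rotation.
```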
PCA Summary • Compression: y_k = U^T (x_k - m) • Reconstruction: x_hat_k = U y_k + m, where the columns of U are the top d principal axes
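The compression and reconstruction maps, written out as a short NumPy sketch (function names are mine; U holds the top-d principal axes as columns):

```python
import numpy as np

def pca_fit(X, d):
    """Fit PCA to data X (N x D); return the mean m and top-d axes U (D x d)."""
    m = X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(X - m, rowvar=False))
    U = eigvecs[:, np.argsort(eigvals)[::-1][:d]]
    return m, U

def compress(x, m, U):
    return U.T @ (x - m)        # y = U^T (x - m): d numbers instead of D

def reconstruct(y, m, U):
    return U @ y + m            # x_hat = U y + m: least-squares optimal
```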
Agenda • Introduction to the Problem • Background on PCA, MDS • Eigenfaces • LLE • Isomap • Summary
Eigenfaces • PCA applied to face images • Project faces onto a face space that spans the significant variations among known faces • Projected face = weighted sum of eigenfaces • Eigenfaces are the eigenvectors (principal axes) of the scatter matrix; they span face space
Eigenfaces: Experimental Results • [Figure: variance captured vs. dimensionality] • [Figure, left to right: original image, then reconstructions from 84, 40, 20, 3, 2, and 1 dimensions]
Eigenfaces: Applications (see the sketch below) • Training • Calculate the basis from the training-set images • Project the training images into FaceSpace • Compression • Project the test image into FaceSpace • Detection • Determine whether an image is a face by measuring its distance from FaceSpace • Recognition • If it is a face, compare it to the training images (using FaceSpace coordinates) • Synthesis: treat the FaceSpace weights as knobs to generate new faces ("knobification")
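A compact NumPy sketch of the training, detection, and recognition steps listed above, assuming vectorized face images; it uses the small-matrix eigendecomposition trick from Turk and Pentland, but all function names and data shapes are illustrative.

```python
import numpy as np

def train_eigenfaces(faces, d):
    """faces: N x D matrix, one vectorized training image per row."""
    m = faces.mean(axis=0)
    A = faces - m
    # Turk & Pentland trick: eigenvectors of the small N x N matrix A A^T
    # yield the eigenfaces as linear combinations of the training images.
    eigvals, V = np.linalg.eigh(A @ A.T)
    order = np.argsort(eigvals)[::-1][:d]
    U = A.T @ V[:, order]                   # D x d eigenface basis
    U /= np.linalg.norm(U, axis=0)
    W_train = A @ U                         # FaceSpace coords of training set
    return m, U, W_train

def project(x, m, U):
    return (x - m) @ U                      # compression: FaceSpace coordinates

def distance_from_facespace(x, m, U):
    x_hat = m + U @ project(x, m, U)        # reconstruction from FaceSpace
    return np.linalg.norm(x - x_hat)        # large distance => likely not a face

def recognize(x, m, U, W_train):
    w = project(x, m, U)                    # nearest training face in FaceSpace
    return int(np.argmin(np.linalg.norm(W_train - w, axis=1)))
```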
Eigenfaces: Advantages • Discovers structure of data lying near a linear subspace of the input space • Unsupervised learning • Linear nature is easily visualized • Simple implementation • No assumptions about the statistical distribution of the data • Non-iterative, globally optimal solution • Polynomial time complexity • Training: O(N^3) • Test: O(D + ND)
Eigenfaces: Disadvantages • Not capable of discovering nonlinear degrees of freedom • Optimal only when the x_i form a hyperellipsoid cloud (multi-dimensional Gaussian distribution), a consequence of the least-squares derivation • Unable to say probabilistically how well new data are fit • Registration and scaling issues • Works much better for compression than detection, and for detection than recognition (compression >> detection >> recognition) • Eigenfaces are not logical face components
Agenda • Introduction to the Problem • Background on PCA, MDS • Eigenfaces • LLE • Isomap • Summary
Locally Linear Embedding (LLE) • Main Idea: overlapping local structure, analyzed collectively, can provide information about global geometry (see the sketch below)
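To make the main idea concrete, here is a dense NumPy sketch of the three LLE steps (neighbor search, local reconstruction weights, global embedding); a real implementation exploits sparsity to reach the O(DN^2) cost cited below, and the regularizer reg is an assumption.

```python
import numpy as np

def lle(X, K, d, reg=1e-3):
    """Locally Linear Embedding of X (N x D) into d dimensions."""
    N = X.shape[0]
    # Step 1: K nearest neighbors of each point (brute force for clarity).
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    nbrs = np.argsort(D2, axis=1)[:, 1:K + 1]       # skip the point itself
    # Step 2: weights that best reconstruct each point from its neighbors.
    W = np.zeros((N, N))
    for i in range(N):
        Z = X[nbrs[i]] - X[i]                       # K x D centered neighbors
        G = Z @ Z.T                                 # local Gram matrix
        G += reg * np.trace(G) * np.eye(K)          # regularize near-singular G
        w = np.linalg.solve(G, np.ones(K))
        W[i, nbrs[i]] = w / w.sum()                 # weights sum to one
    # Step 3: embedding = bottom eigenvectors of M = (I - W)^T (I - W),
    # discarding the constant eigenvector with eigenvalue ~ 0.
    I = np.eye(N)
    M = (I - W).T @ (I - W)
    eigvals, eigvecs = np.linalg.eigh(M)
    return eigvecs[:, 1:d + 1]                      # N x d embedding Y
```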
LLE: Advantages • Ability to discover nonlinear manifolds of arbitrary dimension • Non-iterative • Global optimality • Few parameters: K, d • Captures context • O(DN^2) time and space-efficient due to sparse matrices
LLE: Disadvantages • Requires a smooth, non-closed, densely sampled manifold • Must choose parameters K and d • Quality of the manifold characterization depends on the neighborhood choice • A fixed radius would allow K to vary locally • Clustering would help with high, irregular curvature • Sensitive to outliers • Remedy: down-weight the W_i of points with poor reconstructions (in the least-squares sense)
Comparisons: LLE vs Eigenfaces • PCA: find embedding coordinate vectors that minimize the distance to all data points • LLE: find embedding coordinate vectors that best fit local neighborhood relationships • Application to recognition and detection (map a test image from input space to manifold space; see the sketch below): • Determine the novel point's K nearest neighbors • Compute the test point's reconstruction weights as a linear combination of training points • Approximate the point's manifold coordinates by applying its weights to the Y from the training stage
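A sketch of that three-step test mapping, under the same assumptions as the LLE sketch above (the weight fit mirrors the training-stage least-squares step; the function name is mine):

```python
import numpy as np

def lle_map_test_point(x, X_train, Y_train, K, reg=1e-3):
    """Map a novel point x into the embedding Y_train learned by LLE."""
    d2 = ((X_train - x) ** 2).sum(axis=1)
    nbrs = np.argsort(d2)[:K]               # step 1: K nearest training points
    Z = X_train[nbrs] - x                   # step 2: reconstruction weights
    G = Z @ Z.T
    G += reg * np.trace(G) * np.eye(K)
    w = np.linalg.solve(G, np.ones(K))
    w /= w.sum()
    return w @ Y_train[nbrs]                # step 3: apply weights to training Y
```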
Agenda • Introduction to the Problem • Background on PCA, MDS • Eigenfaces • LLE • Isomap • Summary
Isomap Main Idea: Use approximate geodesic distance instead of Euclidean distance
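A minimal sketch of the idea, assuming SciPy and scikit-learn for the graph and shortest-path steps: build a K-nearest-neighbor graph, approximate geodesics by graph shortest paths, then apply the classical MDS construction from earlier.

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path
from sklearn.neighbors import kneighbors_graph

def isomap(X, K, d):
    """Isomap: classical MDS applied to approximate geodesic distances."""
    # 1. K-nearest-neighbor graph with Euclidean edge lengths
    #    (K must be large enough to keep the graph connected).
    G = kneighbors_graph(X, K, mode='distance')
    # 2. Geodesic distances ~ shortest paths through the graph.
    DG = shortest_path(G, directed=False)
    # 3. Classical MDS on DG: double-center, then take the top-d eigenpairs.
    N = X.shape[0]
    H = np.eye(N) - np.ones((N, N)) / N
    B = -0.5 * H @ (DG ** 2) @ H
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1][:d]
    return eigvecs[:, order] * np.sqrt(np.maximum(eigvals[order], 0.0))
```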
Isomap: Comparison to Eigenfaces • Eigenfaces with MDS: D_X is derived from X in Euclidean space • Only difference with Isomap: D_G is computed in approximate geodesic space • Mapping a test image to the manifold (see the sketch below): • Compute d_x to its neighbors • Interpolate the y of its neighbors
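The slides do not specify the interpolation scheme; one simple hedged choice is inverse-distance weighting over the K nearest training points:

```python
import numpy as np

def isomap_map_test_point(x, X_train, Y_train, K):
    """Interpolate a novel point's embedding from its neighbors' y coordinates."""
    dx = np.linalg.norm(X_train - x, axis=1)    # d_x to every training point
    nbrs = np.argsort(dx)[:K]
    w = 1.0 / np.maximum(dx[nbrs], 1e-12)       # inverse-distance weights (an assumption)
    w /= w.sum()
    return w @ Y_train[nbrs]                    # weighted average of neighbor y's
```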
Isomap: Advantages • Nonlinear • Non-iterative • Globally optimal • Parameters: K, d
Isomap: Comparison to LLE • Similarities: • Begin with a preprocessing step that identifies neighbors • Preserve the intrinsic geometry of the data by computing local measures, after which the data can be discarded • Overcome limitations of attempts to extend PCA • Difference in local measure: • Geodesic distance vs. neighborhood relationships • Differences in application: • Depends on how well the local metrics characterize the manifold • Isomap is more robust to outliers • Isomap preserves distances, LLE angles • LLE avoids the complexity of pairwise distance computation
Isomap: Disadvantages • Guaranteed only asymptotically to recover the geometric structure of nonlinear manifolds • As N increases, pairwise graph distances approximate the geodesics better by "hugging" the surface more closely • Graph discreteness overestimates d_M(i,j) • Neighborhoods must be chosen carefully: too large a K creates "linear shortcuts" across regions of high surface curvature • Mapping novel test images to manifold space requires an out-of-sample extension (see the interpolation sketch above)
Agenda • Introduction to the Problem • Background on PCA, MDS • Eigenfaces • LLE • Isomap • Summary
Choosing the Algorithm for the Application • Smooth manifolds manifest as small changes between successive examples in an ordered set • Video sequences • Data that could conceptually be organized into an aesthetically pleasing video sequence • When the presence of nonlinear structure is not known a priori, run both PCA and manifold learning and inspect the results for discrepancies • Consider whether preserving geodesic distances matters for the application