
Is PCA enough?



Presentation Transcript


  1. Is PCA enough? Irena Váňová

  2. Dot product [figure: vectors A1, A2, A3 and B illustrating dot products]

  3. Perceptron algorithm • Training samples x_i with labels of classes y_i ∈ {−1, +1} • repeat: for i = 1...N: if y_i (w · x_i) ≤ 0 then w ← w + y_i x_i; until no sample is misclassified [figure: two classes of samples, * and o, separated by a line]
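
A minimal NumPy sketch of this primal perceptron loop. The label convention y_i ∈ {−1, +1}, the update rule and the stopping criterion follow the slide; the function name, zero initialization and the epoch cap are assumed details.

    import numpy as np

    def perceptron(X, y, max_epochs=1000):
        """Primal perceptron. X: (N, d) samples, y: NumPy array of labels in {-1, +1}."""
        w = np.zeros(X.shape[1])
        for _ in range(max_epochs):              # repeat ...
            mistakes = 0
            for i in range(len(X)):              # for i = 1...N
                if y[i] * np.dot(w, X[i]) <= 0:  # sample i is misclassified
                    w = w + y[i] * X[i]          # update rule
                    mistakes += 1
            if mistakes == 0:                    # ... until no sample is misclassified
                break
        return w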

  4. Rewritten algorithm – dual form • The weight vector is a linear combination of the training samples, w = Σ_j α_j y_j x_j, so finding the coefficients α_j is equivalent to finding w • Gram matrix: G_ij = x_i · x_j • repeat: for i = 1...N: if y_i Σ_j α_j y_j (x_j · x_i) ≤ 0 then α_i ← α_i + 1; until no sample is misclassified • In the dual representation, the data points only appear inside dot products • Many algorithms have a dual form
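
The same loop in its dual form, as a sketch: inside the loop the data are touched only through the Gram matrix G, which is the property the slide highlights. Variable names and defaults are my own choices.

    import numpy as np

    def dual_perceptron(X, y, max_epochs=1000):
        """Dual perceptron; w = sum_j alpha_j y_j x_j is never formed explicitly.
        X: (N, d) samples, y: NumPy array of labels in {-1, +1}."""
        N = len(X)
        G = X @ X.T                              # Gram matrix, G[i, j] = x_i . x_j
        alpha = np.zeros(N)                      # alpha_j counts updates on sample j
        for _ in range(max_epochs):
            mistakes = 0
            for i in range(N):
                # y_i * sum_j alpha_j y_j (x_j . x_i) <= 0  -> misclassified
                if y[i] * np.sum(alpha * y * G[:, i]) <= 0:
                    alpha[i] += 1
                    mistakes += 1
            if mistakes == 0:
                break
        return alpha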

  5. Mapping to higher dimensions • The perceptron works for linearly separable problems • Mapping the data to a higher-dimensional feature space x ↦ φ(x) can make them separable, but there is a computational problem (very large vectors) • Kernel trick: compute k(x, z) = φ(x) · φ(z) without ever building φ(x)
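
A small numerical illustration of the computational point (my example, not from the slides): for 2-D inputs, the explicit degree-2 feature map and the kernel (x · z)² give the same value, but only the kernel avoids constructing the feature vectors.

    import numpy as np

    def phi(x):
        """Explicit degree-2 feature map for a 2-D input (its length grows quickly with dimension)."""
        return np.array([x[0] ** 2, x[1] ** 2, np.sqrt(2) * x[0] * x[1]])

    x, z = np.array([1.0, 2.0]), np.array([3.0, -1.0])
    print(np.dot(phi(x), phi(z)))   # dot product in feature space -> 1.0
    print(np.dot(x, z) ** 2)        # kernel trick, same value     -> 1.0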

  6. Examples of kernels • Polynomial kernels: k(x, z) = (x · z + c)^d • Gaussian kernels: k(x, z) = exp(−‖x − z‖² / 2σ²), an infinite-dimensional feature space • In the feature space the classes can be separated by a hyperplane • Good kernel? Bad kernel! A kernel matrix that is almost diagonal carries little information about the data
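
Sketches of the two kernel families named on the slide; the parameter names and default values (c, d, sigma) are conventional choices, not taken from the presentation.

    import numpy as np

    def polynomial_kernel(x, z, c=1.0, d=3):
        """k(x, z) = (x . z + c)^d"""
        return (np.dot(x, z) + c) ** d

    def gaussian_kernel(x, z, sigma=1.0):
        """k(x, z) = exp(-||x - z||^2 / (2 sigma^2)); infinite-dimensional feature space."""
        return np.exp(-np.sum((x - z) ** 2) / (2.0 * sigma ** 2))

    def kernel_matrix(X, kernel):
        """K[i, j] = k(x_i, x_j) for a data set X of shape (N, d)."""
        return np.array([[kernel(xi, xj) for xj in X] for xi in X])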

  7. Kernel Perceptron • We precompute the kernel matrix K_ij = k(x_i, x_j) • repeat: for i = 1...N: if y_i Σ_j α_j y_j K_ji ≤ 0 then α_i ← α_i + 1; until no sample is misclassified • We are implicitly in higher dimensions (too high?) • Generalization problem – easy to overfit in high-dimensional spaces
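
A sketch of the kernel perceptron built on the dual form above: the precomputed kernel matrix simply replaces the Gram matrix. Function names and the epoch cap are assumptions.

    import numpy as np

    def kernel_perceptron(X, y, kernel, max_epochs=1000):
        """Dual perceptron with dot products replaced by a kernel function.
        X: (N, d) samples, y: NumPy array of labels in {-1, +1}."""
        N = len(X)
        K = np.array([[kernel(xi, xj) for xj in X] for xi in X])   # precomputed K_ij = k(x_i, x_j)
        alpha = np.zeros(N)
        for _ in range(max_epochs):
            mistakes = 0
            for i in range(N):
                if y[i] * np.sum(alpha * y * K[:, i]) <= 0:
                    alpha[i] += 1
                    mistakes += 1
            if mistakes == 0:
                break
        return alpha

    def decision(alpha, X, y, kernel, x_new):
        """Sign of sum_j alpha_j y_j k(x_j, x_new) for a new point."""
        return np.sign(sum(a * yj * kernel(xj, x_new) for a, yj, xj in zip(alpha, y, X)))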

  8. Kernel trick • Kernel function: k(x, z) = φ(x) · φ(z) • Use: replacing dot products with kernels • Implicit mapping to a feature space • Solves the computational problem • Can make it possible to use infinite dimensions • Conditions: continuous, symmetric, positive definite • Information ‘bottleneck’: the kernel matrix contains all necessary information for the learning algorithm • It fuses information about the data AND the kernel
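
A quick way to test the "symmetric, positive definite" condition on a finite sample (an illustrative helper, not part of the presentation): the kernel matrix must be symmetric with non-negative eigenvalues.

    import numpy as np

    def looks_like_valid_kernel(K, tol=1e-10):
        """Finite-sample check of the Mercer conditions: K symmetric and positive semidefinite."""
        symmetric = np.allclose(K, K.T)
        eigenvalues = np.linalg.eigvalsh(K)      # real spectrum of a symmetric matrix
        return symmetric and bool(np.all(eigenvalues >= -tol))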

  9. PCA • Orthogonal linear transformation • The greatest variance goes to the first coordinate, the second greatest to the second, … • Rotation around the mean value • Dimensionality reduction: many dimensions usually means high correlation, so a few components carry most of the variance
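
A compact PCA sketch matching these bullets (centering = rotation around the mean, largest-variance direction first); the rows-are-samples convention is my assumption.

    import numpy as np

    def pca(X, n_components):
        """PCA via the eigendecomposition of the covariance matrix. X: (n_samples, n_features)."""
        Xc = X - X.mean(axis=0)                      # rotate around the mean value
        cov = Xc.T @ Xc / len(Xc)                    # covariance matrix
        eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
        order = np.argsort(eigvals)[::-1]            # greatest variance = first coordinate
        W = eigvecs[:, order[:n_components]]         # principal directions
        return Xc @ W                                # dimensionality-reduced data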

  10. Singular value decomposition • X = W Σ V^T: an m × n matrix X factored into W (m × m), Σ (m × n) and V^T (n × n) • W, V – unitary matrices (W^T W = I, V^T V = I) • Columns of W, V? Basis vectors: the eigenvectors of X X^T, resp. X^T X
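
A numerical check of these relations, using NumPy's convention X = U diag(s) V^T (the slide's W corresponds to U here); the random test matrix is only for illustration.

    import numpy as np

    X = np.random.randn(5, 3)                          # an m x n test matrix
    U, s, Vt = np.linalg.svd(X, full_matrices=False)

    # Columns of U are eigenvectors of X X^T, columns of V of X^T X,
    # and the squared singular values are the corresponding eigenvalues.
    print(np.allclose(X @ X.T @ U, U * s ** 2))        # X X^T u_k = s_k^2 u_k
    print(np.allclose(X.T @ X @ Vt.T, Vt.T * s ** 2))  # X^T X v_k = s_k^2 v_k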

  11. PCA via SVD • Center the data (zero mean), then compute the SVD X = W Σ V^T • The covariance matrix (1/n) X X^T has the columns of W as its eigenvectors, with eigenvalues σ_k²/n
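
A short check of this claim for zero-mean data: the covariance spectrum equals the squared singular values divided by n. The convention (rows are samples) and the random data are assumptions.

    import numpy as np

    X = np.random.randn(100, 4)
    X = X - X.mean(axis=0)                          # data with zero mean
    C = X.T @ X / len(X)                            # covariance matrix with the 1/n factor
    _, s, _ = np.linalg.svd(X, full_matrices=False)
    eigvals = np.linalg.eigvalsh(C)                 # ascending eigenvalues of C
    print(np.allclose(np.sort(eigvals)[::-1], s ** 2 / len(X)))   # same spectrum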

  12. Kernel PCA • Project the data onto the few largest eigenvectors • Equation for PCA: C v = λ v • Equation for high-dimensional PCA with a kernel function: an eigenproblem of the kernel matrix K_ij = k(x_i, x_j) • We don’t know the eigenvector explicitly – only the vector of numbers α^k which identifies it, v_k = Σ_i α_i^k φ(x_i) • Projection onto the k-th eigenvector: φ(x) · v_k = Σ_i α_i^k k(x_i, x)
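
A kernel PCA sketch following these equations: eigenvectors of the centered kernel matrix give the coefficient vectors α^k, and projections use only kernel values. The feature-space centering step and the normalization are standard details I am assuming rather than reading off the slide.

    import numpy as np

    def kernel_pca(X, kernel, n_components):
        """Project the training data onto the top kernel-PCA components."""
        N = len(X)
        K = np.array([[kernel(xi, xj) for xj in X] for xi in X])   # K_ij = k(x_i, x_j)
        one_n = np.ones((N, N)) / N
        K = K - one_n @ K - K @ one_n + one_n @ K @ one_n          # center phi(x_i) in feature space
        eigvals, eigvecs = np.linalg.eigh(K)                       # ascending order
        idx = np.argsort(eigvals)[::-1][:n_components]             # largest eigenvalues first
        alphas = eigvecs[:, idx] / np.sqrt(np.maximum(eigvals[idx], 1e-12))  # so that ||v_k|| = 1
        return K @ alphas                    # projection of x_i onto v_k = sum_j alpha_j^k K_ij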

  13. KPCA example

  14. If something goes wrong: PCA is blind – it does not use the class labels

  15. LDA • Fundamental assumption: normally distributed classes • First case: all classes share the same covariance matrix Σ, which has full rank
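
A two-class sketch of this first case: with a shared, full-rank covariance matrix the discriminant direction is Σ⁻¹(μ₁ − μ₀). The pooled-covariance estimate and the function name are my choices, not from the slides.

    import numpy as np

    def lda_direction(X0, X1):
        """Two-class LDA with a shared covariance matrix: w = Sigma^{-1} (mu1 - mu0)."""
        mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
        S0, S1 = np.cov(X0, rowvar=False), np.cov(X1, rowvar=False)
        Sw = ((len(X0) - 1) * S0 + (len(X1) - 1) * S1) / (len(X0) + len(X1) - 2)  # pooled estimate
        return np.linalg.solve(Sw, mu1 - mu0)    # requires the full-rank assumption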

  16. LDA • Fundamental assumption: normal distribution • Relaxed case: only full rank is required (the covariance matrices may differ) • A kernel variant exists

  17. LDA example • Face recognition – eigenfaces

  18. LDA versus PCA
