
A Quick Practical Guide to PCA and ICA


Presentation Transcript


  1. A Quick Practical Guide to PCA and ICA Ted Brookings, UCSB Physics 11/13/06

  2. Blind Source Separation Suppose we have a data set that • Has many independent components or channels • Audio track recorded from multiple microphones • Series of brain images with multiple voxels • We believe is driven by several independent processes • Different people speaking into microphones • Different neuronal processes occurring within the brain • We have no a priori notion of what those processes look like Our goal is to figure out what the different processes are by grouping together data that is correlated

  3. Our Simple Example • Driven by two sine signals with different frequencies • 100 Sample Times • 200 Channels: • 150 are a linear combination of Signal1 and Signal2, with Poisson noise • 50 are pure Poisson noise
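A minimal Python sketch of how such a data set might be built (the sine frequencies, mixing weights, baseline rate, and random seed are assumptions chosen for illustration; the slides specify only the counts):

```python
import numpy as np

rng = np.random.default_rng(0)
n_times, n_channels, n_mixed = 100, 200, 150
t = np.linspace(0, 1, n_times)

# Two underlying sine processes with different (assumed) frequencies.
S_true = np.vstack([np.sin(2 * np.pi * 5 * t),
                    np.sin(2 * np.pi * 11 * t)])        # shape (2, 100)

# 150 channels: random linear combinations of the two signals,
# observed through Poisson counting noise (rates shifted to stay positive).
W_true = rng.uniform(0.5, 2.0, size=(n_mixed, 2))
X_mixed = rng.poisson(W_true @ S_true + 5.0)

# 50 channels: pure Poisson noise at a constant rate.
X_noise = rng.poisson(5.0, size=(n_channels - n_mixed, n_times))

X = np.vstack([X_mixed, X_noise])                       # shape (200, 100)
```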

  4. PCA (Principal Component Analysis) • Linear transform ---chooses a new basis • Perpendicular • First component explains the most variance, second component explains the most remaining variance, etc. PCA finds a weight matrix W, and a set of signals S, that approximate the data X: X = W * S. The weight matrix is made up of the eigenvectors of the correlation matrix, so the eigenvalues provide the ordering of the components. Image from: http://www.umetrics.com/default.asp/pagename/methods_MVA_how10/c/1
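As a sketch of the mechanics the slide describes, the code below does PCA by eigendecomposition: center each channel, form the channel covariance matrix, take its eigenvectors as the weight matrix W, and order them by eigenvalue. (Covariance is used here where the slide says correlation matrix; the two differ only by per-channel scaling, and the function name is this sketch's own.)

```python
import numpy as np

def pca(X, n_components):
    """PCA via eigendecomposition: approximate X (channels x times) as W @ S."""
    Xc = X - X.mean(axis=1, keepdims=True)   # center each channel
    C = np.cov(Xc)                           # channel-by-channel covariance
    evals, evecs = np.linalg.eigh(C)         # eigh, since C is symmetric
    order = np.argsort(evals)[::-1]          # largest variance first
    W = evecs[:, order[:n_components]]       # weight matrix: top eigenvectors
    S = W.T @ Xc                             # component time series
    return W, S

# Usage with the X from the previous sketch:
# W, S = pca(X, n_components=2)
```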

  5. Spelling Things Out The meaning of the basis equation: e.g. if W11 = .6 and W12 = .2, then X1 = .6 S1 + .2 S2. That is, X1 is actually being generated (at least partly) by the processes S1 and S2. X is typically a time series ---that is, X is measured at discrete intervals. However, our basis doesn't change, because the fundamental processes that are at work are presumed to be constant. Because of this, W is constant in time, and S changes with time. The end result of PCA is then S(t) and W, which tell us the activity of each component, and how to generate the original data from the components.
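A tiny worked instance of the slide's numbers, with three hypothetical sample times made up for illustration:

```python
import numpy as np

W_row1 = np.array([0.6, 0.2])                # W11 = .6, W12 = .2, as on the slide
S = np.array([[1.0, 0.5, -0.3],              # S1(t) at three sample times
              [0.2, -0.4, 0.8]])             # S2(t) at the same times

X1 = W_row1 @ S                              # X1(t) = .6*S1(t) + .2*S2(t)
print(X1)                                    # [ 0.64  0.22 -0.02]
```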

  6. PCA Results Unsurprisingly, PCA discovers two dominant components. We might expect trouble here: PCA will probably go diagonal.
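One way to see the two dominant components on data like the example above is the explained-variance spectrum, e.g. with scikit-learn (continuing with the X array from the generation sketch; scikit-learn expects samples in rows, hence the transpose):

```python
from sklearn.decomposition import PCA

# X: the (200 channels x 100 times) array from the generation sketch above.
model = PCA()
model.fit(X.T)                               # (samples, features) layout
print(model.explained_variance_ratio_[:5])   # the first two should dominate
```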

  7. PCA Results • Oops! The signals are mixed. • But… they're a lot cleaner, because PCA has removed a lot of Gaussian noise
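The cleanup comes from keeping only the top components and projecting back; a minimal sketch, again continuing with the X from the generation sketch:

```python
from sklearn.decomposition import PCA

# X: the (200 channels x 100 times) array from the generation sketch above.
model = PCA(n_components=2)
scores = model.fit_transform(X.T)            # project onto the top 2 components
X_denoised = model.inverse_transform(scores).T
# Everything outside the 2-D principal subspace, mostly noise, is discarded.
```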

  8. ICA (Independent Component Analysis) • Linear transform ---chooses a new basis • NOT Perpendicular • The basis is chosen to be maximally independent • There is no particular ordering of the basis vectors
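In practice one rarely hand-rolls ICA; scikit-learn's FastICA (the algorithm slide 13 points to) is a common choice. A minimal usage sketch, continuing with X from the generation sketch:

```python
from sklearn.decomposition import FastICA

# X: the (200 channels x 100 times) array from the generation sketch above.
ica = FastICA(n_components=2, random_state=0)
S_est = ica.fit_transform(X.T)               # estimated sources, (times, 2)
A_est = ica.mixing_                          # estimated mixing matrix, (200, 2)
```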

  9. Er… “Maximally Independent”? Technical, and the definition depends somewhat on the algorithm being used. Ultimately it boils down to cross-correlations: if two variables are independent, they are uncorrelated. (The converse does not hold in general, though it does for jointly Gaussian variables.) [Scatter plots contrasting correlated and uncorrelated variables.] Images from a web page by Aapo Hyvärinen, http://www.cis.hut.fi/aapo/papers/NCS99web
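A quick numerical illustration of the distinction, using made-up variables:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=1000)
b = rng.normal(size=1000)            # generated independently of a
c = a + 0.5 * rng.normal(size=1000)  # driven by a, hence correlated with it

print(np.corrcoef(a, b)[0, 1])       # close to 0: uncorrelated
print(np.corrcoef(a, c)[0, 1])       # close to 0.9: strongly correlated
```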

  10. Requirements • At most one Gaussian-distributed source in the data • The number of observed mixtures must be at least the number of independent components: m ≥ n. E.g. at least as many microphones as voices.

  11. ICA Results Ick! We might have expected this, because there's a ton of Gaussian noise in the system.

  12. Do ICA on the Results of PCA! • PCA cleans up the Gaussian noise (and reduces the dimension). • Most ICA packages incorporate PCA or some other preprocessing for this reason. • ICA picks the basis that is maximally independent.
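A sketch of the two-stage pipeline in scikit-learn, continuing with X from the generation sketch. (Note that FastICA also whitens its input internally, so the explicit PCA step here mainly serves to throw away the noise dimensions.)

```python
from sklearn.decomposition import PCA, FastICA

# X: the (200 channels x 100 times) array from the generation sketch above.
# Step 1: PCA denoises and reduces 200 channels to 2 dimensions.
X_reduced = PCA(n_components=2).fit_transform(X.T)       # (times, 2)

# Step 2: ICA on the reduced data unmixes the two components.
S_est = FastICA(n_components=2, random_state=0).fit_transform(X_reduced)
```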

  13. For More Info • Check out Wikipedia (seriously). The articles on PCA/ICA are actually good, and provide links to software packages for C++, Java, Matlab, etc. See especially FastICA. • Many of the external links provide good overviews as well.

  14. The Aftermath… Great! Now that we have what we've always wanted (a list of “components”), what do we do with them? Since ICA is “blind”, it doesn't tell us much about the components. We may simply be interested in data reduction, or in categorizing the mechanisms at work. Or we may be interested in components that correlate with some signal that we drove the experiment with.
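For that last case, a simple approach is to correlate each recovered component with the known driving signal. The sketch below is self-contained with stand-in data; in practice S_est would come from ICA as in the earlier sketches, and drive would be the experiment's actual stimulus:

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 100)
drive = np.sin(2 * np.pi * 5 * t)            # hypothetical known stimulus

# Stand-in for ICA output (times x components); in practice, S_est from ICA.
S_est = np.column_stack([rng.normal(size=100),
                         drive + 0.3 * rng.normal(size=100)])

# Correlate each component with the drive and pick the best match.
corrs = [abs(np.corrcoef(S_est[:, k], drive)[0, 1])
         for k in range(S_est.shape[1])]
k = int(np.argmax(corrs))
print(f"component {k} best matches the drive (|r| = {corrs[k]:.2f})")
```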
