This research project explores a novel clustering paradigm to discover various non-redundant clustering solutions in data. The proposed Orthogonal Clustering Framework uses orthogonal subspaces and feature spaces to find multiple meaningful groupings. By combining methods like linear discriminant analysis and singular value decomposition, this approach automates the identification of the number of clusters and ensures non-redundancy in the results. Experimental evaluations on synthetic and real-world datasets demonstrate the effectiveness of the framework in discovering diverse and interesting clustering solutions. The methodology allows for flexibility in applying different clustering and dimensionality reduction algorithms. Overall, this framework offers a valuable tool for data clustering with applications in various domains.
Learning Multiple Nonredundant Clusterings
Presenter: Wei-Hao Huang
Authors: Ying Cui, Xiaoli Z. Fern, Jennifer G. Dy
TKDD, 2010
Outlines • Motivation • Objectives • Methodology • Experiments • Conclusions • Comments
Motivation • Data often contain multiple groupings that are reasonable and interesting from different perspectives. • Traditional clustering is restricted to finding only a single clustering solution.
Objectives • To propose a new clustering paradigm for finding all non-redundant clustering solutions of the data.
Methodology • Orthogonal clustering (cluster space) • Clustering in orthogonal subspaces (feature space) • Automatically finding the number of clusters • Stopping criteria
Orthogonal Clustering Framework (illustrated on the Face dataset; diagram omitted from this summary)
Orthogonal clustering (Method 1) • After clustering X(t), each point is projected onto the subspace orthogonal to its cluster centroid, giving the residue space: x(t+1) = (I − μμᵀ/(μᵀμ)) x(t), where μ is the centroid of the cluster that x(t) belongs to.
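As a concrete illustration, the residue-space step above can be sketched in NumPy. The function name `residue_space` and the per-point rank-1 projection are my own framing of the slide's formula under the stated assumptions, not code from the paper:

```python
import numpy as np

def residue_space(X, labels, centroids):
    """Method 1 sketch (assumed form): project every point onto the
    subspace orthogonal to its own cluster centroid, producing the
    residue data that the next clustering run operates on."""
    X_res = np.array(X, dtype=float)
    for j, mu in enumerate(centroids):
        mask = labels == j
        P = np.outer(mu, mu) / (mu @ mu)  # rank-1 projector onto mu
        X_res[mask] = X_res[mask] - X_res[mask] @ P
    return X_res
```

After this step, every residue point is orthogonal to the centroid of the cluster it came from, so a second clustering run cannot rediscover the same structure.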
Clustering in orthogonal subspaces (Method 2) • Projection into a feature space: Y = AᵀX • The columns of A can come from linear discriminant analysis (LDA) or singular value decomposition (SVD) • LDA vs. SVD: LDA uses the cluster labels to find discriminative directions, whereas SVD captures the directions of largest variance without using labels.
Clustering in orthogonal subspaces (continued) • A(t) = eigenvectors (LDA or SVD directions) describing the clustering found at iteration t • Residue space: X(t+1) = (I − A(t)(A(t)ᵀA(t))⁻¹A(t)ᵀ) X(t).
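The subspace-removal formula can be sketched directly in NumPy. The name `subspace_residue` and the d×n (features-in-rows) data layout are my assumptions, chosen to match the Y = AᵀX convention on the slides:

```python
import numpy as np

def subspace_residue(X, A):
    """Method 2 sketch (assumed form): remove the subspace spanned by
    the columns of A (e.g. the top LDA/SVD directions of the current
    clustering) from the d x n data matrix X, yielding X(t+1)."""
    P = A @ np.linalg.inv(A.T @ A) @ A.T  # projector onto span(A)
    return (np.eye(X.shape[0]) - P) @ X
```

Every column of the result is orthogonal to span(A), so the structure already captured by the current clustering is unavailable to the next iteration.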
Comparing Method 1 and Method 2 • Taking A(t) to be the matrix of cluster centroids, if M′ = M then P1 = P2, i.e., the two residue projections coincide • Method 1 is therefore a special case of Method 2.
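A quick numerical sanity check of the special-case claim, under my assumption that for a single cluster Method 1's projector is μμᵀ/(μᵀμ) and Method 2's is A(AᵀA)⁻¹Aᵀ with A = μ:

```python
import numpy as np

# With A taken as one cluster centroid mu, the general projector
# A (A^T A)^{-1} A^T reduces to the rank-1 projector mu mu^T / (mu^T mu).
mu = np.array([[3.0], [4.0]])                  # centroid as a column vector
denom = (mu.T @ mu).item()
P1 = (mu @ mu.T) / denom                       # Method 1 rank-1 projector
P2 = mu @ np.linalg.inv(mu.T @ mu) @ mu.T      # Method 2 projector with A = mu
assert np.allclose(P1, P2)
```

The trace of either projector is 1, confirming that exactly one direction (the centroid direction) is removed.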
Experiments • PCA is used for dimensionality reduction • Clustering • K-means clustering: keep the solution with the smallest SSE • Gaussian mixture model (GMM) clustering: keep the solution with the largest likelihood • Datasets • Synthetic • Real-world: Face, WebKB text, Vowel phoneme, Digit
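The "keep the smallest-SSE run" selection rule can be sketched with a minimal k-means. Everything here (`kmeans`, `best_kmeans`, the farthest-point initialization) is my own illustrative stand-in, not the experimental code from the paper:

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Minimal k-means: random first centroid, then farthest-point
    initialization for the rest, followed by Lloyd iterations."""
    rng = np.random.default_rng(seed)
    C = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        d = np.min([((X - c) ** 2).sum(1) for c in C], axis=0)
        C.append(X[d.argmax()])
    C = np.array(C, dtype=float)
    for _ in range(iters):
        lab = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1).argmin(1)
        C = np.array([X[lab == j].mean(0) if np.any(lab == j) else C[j]
                      for j in range(k)])
    sse = ((X - C[lab]) ** 2).sum()
    return lab, C, sse

def best_kmeans(X, k, restarts=10):
    """Run several restarts and keep the smallest-SSE solution,
    mirroring the model-selection rule described on this slide."""
    return min((kmeans(X, k, seed=s) for s in range(restarts)),
               key=lambda r: r[2])
```

In practice one would use a library implementation (e.g. scikit-learn's `KMeans`, whose `n_init` parameter performs exactly this restart-and-keep-best loop).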
Experiments • Evaluation of the clusterings found on the synthetic data and on the Face, WebKB text, Vowel phoneme, and Digit datasets (result figures omitted from this summary).
Experiments • Finding the number of clusters • K-means: gap statistic (choose the k that maximizes the gap) • GMM: Bayesian information criterion (choose the k that maximizes BIC) • Stopping criteria • Stop when the residue SSE falls below 10% of the SSE at the first iteration • Stop when Kopt = 1 • If Kopt > Kmax, select Kmax
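The stopping logic above can be captured in a few lines. The function `next_step` and its return convention are my own reading of the slide, kept deliberately schematic:

```python
def next_step(k_opt, k_max, sse_t, sse_1):
    """Stopping-rule sketch (assumed reading of the slide): stop when
    the gap statistic / BIC picks a single cluster, or when the
    residue SSE drops below 10% of the first iteration's SSE; if
    k_opt exceeds k_max, continue with k_max clusters instead."""
    if k_opt == 1 or sse_t < 0.10 * sse_1:
        return "stop", k_opt
    return "continue", min(k_opt, k_max)
```

The SSE threshold reflects that once the residue carries little of the original variance, further clusterings would mostly fit noise.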
Experiments • Results of automatically finding the number of clusters on the synthetic, Face, and WebKB datasets (result figures omitted from this summary).
Conclusions • The framework discovers varied, interesting, and meaningful clustering solutions. • Method 2 can be combined with any clustering and dimensionality reduction algorithm.
Comments • Advantages • Finds multiple non-redundant clustering solutions • Applications • Data clustering