
Spectral Visual Clustering Tendency

L. Wang, X. Geng, J. C. Bezdek, C. Leckie, and K. Ramamohanarao, “SpecVAT: Enhanced visual cluster analysis,” in Proceedings of the Eighth IEEE International Conference on Data Mining, 2008. (ICDM ’08), Dec. 2008, pp. 638–647.


Presentation Transcript


  1. Spectral Visual Clustering Tendency. L. Wang, X. Geng, J. C. Bezdek, C. Leckie, and K. Ramamohanarao, “SpecVAT: Enhanced visual cluster analysis,” in Proceedings of the Eighth IEEE International Conference on Data Mining, 2008. (ICDM ’08), Dec. 2008, pp. 638–647. School of Engineering, The University of Melbourne, Vic 3010, Australia.

  2. Clustering

  3. Conventional K-means Clustering 1) Choose k initial "means" (in this case k=3). 2) Associate every observation with the nearest mean. 3) The centroid of each of the k clusters becomes the new mean. 4) Repeat steps 2 and 3 until convergence is reached. How to determine k?
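The four k-means steps on this slide can be sketched in a few lines of plain Python. This is a minimal illustration, not the presenters' code: the function name `kmeans`, the explicit `init_means` argument, and the toy points are assumptions made here so that the run is deterministic.

```python
def kmeans(points, init_means, iters=100):
    """Plain k-means on a list of equal-length tuples; returns (means, labels)."""
    means = list(init_means)               # step 1: k initial "means"
    k = len(means)
    labels = None
    for _ in range(iters):
        # step 2: associate every observation with the nearest mean
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, means[c])))
            for p in points
        ]
        if new_labels == labels:           # step 4: stop once assignments are stable
            break
        labels = new_labels
        # step 3: the centroid of each cluster becomes its new mean
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:                    # keep the old mean if a cluster is empty
                means[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return means, labels


# Hypothetical toy data: three well-separated pairs, k = 3.
points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.1, 4.9), (10.0, 0.0), (9.9, 0.2)]
means, labels = kmeans(points, init_means=[points[0], points[2], points[4]])
```

Note that k is fixed by the number of initial means, which is exactly the quantity the rest of the presentation tries to estimate.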

  4. Determining the Number of Clusters • Before clustering: cluster tendency analysis • After clustering: cluster validity measurement [Diagram: Input, Cluster Tendency Analysis, Clustering, Cluster Validity Measurement, Output]

  5. Visual Assessment of Cluster Tendency (VAT) [Figures: scatter plot of a 2D data set; unordered image I(D); reordered VAT image I(D’)] J. C. Bezdek and R. J. Hathaway. VAT: A tool for visual assessment of (cluster) tendency. In Proc. International Joint Conference on Neural Networks, pages 2225–2230, 2002.

  6. Dissimilarity Matrix: for n objects, D is the n × n matrix whose entry dij is the dissimilarity between objects oi and oj. [Figures: scatter plot of a 2D data set with objects 1 to 5; dissimilarity matrix D with entry d12 marked; dissimilarity image]

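  7. Reordered Dissimilarity Matrix [Figures: scatter plot with objects 1 to 5; dissimilarity matrix D with entry d12 marked; the same matrix after reordering its rows and columns]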

  8. Example

  9. VAT Algorithm [Figures: scatter plot with objects 1 to 5 and the maximum dissimilarity marked; dissimilarity matrix D; dissimilarity image]
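The slide's figure is lost, so here is a minimal sketch of the VAT reordering as it is commonly described for Bezdek and Hathaway's method: start from one object of the most dissimilar pair, then repeatedly append the unselected object closest to any already selected object. The function names and the 1-D toy data are assumptions for illustration only.

```python
def vat_order(D):
    """Return a VAT-style ordering of object indices for a square dissimilarity matrix D."""
    n = len(D)
    # Start from one end of the pair with maximum dissimilarity.
    i, _ = max(((p, q) for p in range(n) for q in range(n)),
               key=lambda pq: D[pq[0]][pq[1]])
    order = [i]
    remaining = set(range(n)) - {i}
    while remaining:
        # Pick the unselected object nearest to any selected object.
        j = min(remaining, key=lambda q: min(D[p][q] for p in order))
        order.append(j)
        remaining.remove(j)
    return order


def reorder(D, order):
    """Apply the same permutation to rows and columns of D."""
    return [[D[a][b] for b in order] for a in order]


# Hypothetical 1-D data with two interleaved groups: {0, 1, 2} and {10, 11}.
xs = [0.0, 10.0, 1.0, 11.0, 2.0]
D = [[abs(a - b) for b in xs] for a in xs]
order = vat_order(D)          # [0, 2, 4, 1, 3]
D_reordered = reorder(D, order)
```

Displayed as a grayscale image, `D_reordered` would show a dark 3 × 3 block followed by a dark 2 × 2 block along the diagonal, one block per group.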

  10. Problem of VAT Scatter plot Reordered VAT Image

  11. Scatter plots of 9 synthetic data sets. From left to right and from top to bottom: S-1 ∼ S-9

  12. Spectral Clustering [Figures: scatter plot of a 2D data set; k-means clustering result; spectral clustering result] U. von Luxburg. A tutorial on spectral clustering. Technical report, Max Planck Institute for Biological Cybernetics, Germany, 2006.

  13. Spectral Graph Connected Groups Similarity Graph

  14. Similarity Graph: defined by a vertex set (one vertex per object) and a weighted adjacency matrix W.

  15. Similarity Graph • ε-neighborhood graph • k-nearest neighbor graphs • Fully connected graph (Gaussian similarity function) [Figures: ε-neighborhood with radius ε; k-nearest neighbors]
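The three graph constructions named on this slide can be sketched as follows. This is an illustrative sketch only: the function names, the zero diagonal, the unit edge weights for the ε and k-nearest neighbor graphs, and the "either point lists the other" symmetrization are assumptions, and the Gaussian form exp(-||x - y||² / (2σ²)) is the usual one from von Luxburg's tutorial.

```python
import math


def gaussian_similarity(x, y, sigma=1.0):
    """Gaussian similarity exp(-||x - y||^2 / (2 sigma^2))."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * sigma ** 2))


def fully_connected_graph(points, sigma=1.0):
    """Every pair is connected, weighted by Gaussian similarity (no self-loops)."""
    n = len(points)
    return [[0.0 if i == j else gaussian_similarity(points[i], points[j], sigma)
             for j in range(n)] for i in range(n)]


def epsilon_graph(points, eps):
    """Connect two points when their Euclidean distance is below eps."""
    n = len(points)
    return [[1.0 if i != j and math.dist(points[i], points[j]) < eps else 0.0
             for j in range(n)] for i in range(n)]


def knn_graph(points, k):
    """Connect i and j when either is among the other's k nearest neighbors."""
    n = len(points)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        nearest = sorted((j for j in range(n) if j != i),
                         key=lambda j: math.dist(points[i], points[j]))[:k]
        for j in nearest:
            W[i][j] = W[j][i] = 1.0
    return W


# Hypothetical toy data on a line.
pts = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
W_full = fully_connected_graph(pts, sigma=1.0)
W_eps = epsilon_graph(pts, eps=2.0)
W_knn = knn_graph(pts, k=1)
```

On this toy data the ε-graph leaves the third point isolated, while the 1-nearest-neighbor graph attaches it to the second point, which shows how the choice of construction changes connectivity.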

  16. Spectral Graph Connected Groups Similarity Graph

  17. Graph Laplacian: L = D − W, where L is the Laplacian matrix, W the adjacency matrix, and D the degree matrix.

  18. Example [Figures: similarity graph with vertices 1 to 5; its adjacency matrix W, degree matrix D, and Laplacian matrix L]

  19. Properties of the Graph Laplacian • L is symmetric and positive semi-definite. • The smallest eigenvalue of L is 0; the corresponding eigenvector is the constant one vector 1. • L has n non-negative, real-valued eigenvalues 0 = λ1 ≤ λ2 ≤ ... ≤ λn. [Figures: Laplacian matrix L; similarity graph with vertices 1 to 5]
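These properties can be checked numerically on a small graph. The sketch below assumes the unnormalized Laplacian L = D − W and a hypothetical 5-vertex path graph with unit weights (the slide's own graph is not recoverable). It verifies symmetry, that L applied to the all-ones vector gives zero, and the identity f'Lf = ½ Σ wij (fi − fj)², which is why L is positive semi-definite.

```python
def laplacian(W):
    """Unnormalized graph Laplacian L = D - W for a square weight matrix W."""
    n = len(W)
    deg = [sum(row) for row in W]          # diagonal of the degree matrix D
    return [[(deg[i] if i == j else 0.0) - W[i][j] for j in range(n)]
            for i in range(n)]


# Hypothetical path graph 1-2-3-4-5 with unit weights.
W = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
n = len(W)
L = laplacian(W)

# L is symmetric because W is.
is_symmetric = all(L[i][j] == L[j][i] for i in range(n) for j in range(n))

# Each row of L sums to zero, so L @ 1 = 0: eigenvalue 0 with the constant vector.
row_sums = [sum(row) for row in L]

# Quadratic form for an arbitrary vector f, computed two ways.
f = [3.0, -1.0, 2.0, 0.5, -4.0]
quad_form = sum(f[i] * L[i][j] * f[j] for i in range(n) for j in range(n))
edge_sum = 0.5 * sum(W[i][j] * (f[i] - f[j]) ** 2
                     for i in range(n) for j in range(n))
```

Because `edge_sum` is a sum of non-negative terms, `quad_form` can never be negative, whatever `f` is.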

  20. Eigenvalue and Eigenvector of Graph Laplacian: Connected Component → Constant Eigenvector

  21. Example: Two Connected Components → Double Zero Eigenvalue. Eigenvectors: f1 = [1 1 1 0 0]’, f2 = [0 0 0 1 1]’ [Figures: Laplacian matrix L; similarity graph with vertices 1 to 5]
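The double zero eigenvalue can be reproduced with NumPy. The components {1, 2, 3} and {4, 5} follow from the eigenvectors f1 and f2 on the slide; the particular edges (a triangle and a single edge, unit weights) are an assumption, since the figure is lost.

```python
import numpy as np

# Hypothetical adjacency matrix: vertices 1-3 form a triangle, vertices 4-5 share an edge.
W = np.array([
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
], dtype=float)

L = np.diag(W.sum(axis=1)) - W             # unnormalized Laplacian L = D - W

eigvals = np.linalg.eigvalsh(L)            # ascending; approximately 0, 0, 2, 3, 3
n_zero = int(np.sum(np.abs(eigvals) < 1e-10))

# The component indicator vectors from the slide lie in the null space of L.
f1 = np.array([1.0, 1.0, 1.0, 0.0, 0.0])
f2 = np.array([0.0, 0.0, 0.0, 1.0, 1.0])
residual_1 = L @ f1
residual_2 = L @ f2
```

The number of (near-)zero eigenvalues, `n_zero`, equals the number of connected components, which is the link between the spectrum and the cluster count that the following slides rely on.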

  22. Example: First Two Eigenvectors. For a block-diagonal matrix, the spectrum of L is given by the union of the spectra of the blocks Li. [Figures: adjacency matrix W; similarity graph with vertices 1 to 5]

  23. Spectral Clustering: First k Eigenvectors → New Clustering Space. Use k-means clustering in the new space. [Figure: similarity graph with vertices 1 to 5]
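A compact sketch of this idea with NumPy: embed each object as a row of the first k eigenvectors of L, then run k-means on those rows. The Gaussian similarity, the unnormalized Laplacian, the farthest-point initialization, and all names and data below are illustrative assumptions, not the method of any particular paper.

```python
import numpy as np


def spectral_embedding(X, k, sigma=1.0):
    """Rows of the first k eigenvectors of L = D - W (Gaussian similarity)."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(axis=1)) - W
    _, vecs = np.linalg.eigh(L)            # eigenvalues come back in ascending order
    return vecs[:, :k]                     # one row per object in the new space


def kmeans(U, k, iters=50):
    """Small k-means with deterministic farthest-point initialization."""
    centers = [U[0]]
    for _ in range(1, k):
        d = np.min([((U - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        dist = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(dist, axis=1)
        new_centers = np.array([
            U[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Hypothetical toy data: two well-separated groups.
X = np.array([[0.0, 0.0], [0.5, 0.0], [0.0, 0.5], [8.0, 8.0], [8.5, 8.0]])
U = spectral_embedding(X, k=2)
labels = kmeans(U, k=2)
```

In the embedded space, objects from the same connected group collapse to (almost) the same point, so even a simple k-means separates them.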

  24. Spectral Clustering [Figures: scatter plot of a 2D data set; k-means clustering result; spectral clustering result]

  25. Spectral VAT (SpecVAT) Scatter plots Reordered VAT Image

  26. SpecVAT Algorithm (input: data) 1. Construct similarity matrix W 2. Construct Laplacian matrix L 3. Choose first k eigenvectors u1,…,uk 4. Construct new dissimilarity matrix D’
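The four steps can be sketched with NumPy as below. This is a sketch under stated assumptions, not the paper's exact procedure: it uses a Gaussian similarity with a single global `sigma`, the unnormalized Laplacian L = D − W from the earlier slides, and Euclidean distance between embedded rows for D’. The published method may use a different similarity scaling, Laplacian normalization, or row normalization.

```python
import numpy as np


def specvat_dissimilarity(X, k, sigma=1.0):
    """Return a new dissimilarity matrix D' computed in the spectral embedding."""
    # 1. Construct similarity matrix W (Gaussian similarity, assumed here).
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # 2. Construct Laplacian matrix L (unnormalized, assumed here).
    L = np.diag(W.sum(axis=1)) - W

    # 3. Choose the first k eigenvectors u1, ..., uk (smallest eigenvalues).
    _, vecs = np.linalg.eigh(L)
    U = vecs[:, :k]

    # 4. Construct the new dissimilarity matrix D' between embedded rows.
    diff = U[:, None, :] - U[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))


# Hypothetical toy data: two well-separated groups.
X = np.array([[0.0, 0.0], [0.5, 0.0], [0.0, 0.5], [8.0, 8.0], [8.5, 8.0]])
D_new = specvat_dissimilarity(X, k=2)
```

D’ would then be reordered and displayed with VAT in place of the original dissimilarity matrix; on this toy data the within-group entries of `D_new` are close to zero while the between-group entries are clearly larger.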

  27. SpecVAT Images Original VAT Image SpecVAT Images with Different k Desired Result

  28. SpecVAT Image Analysis VAT Images Histogram of VAT Images “Good” VAT Image “Clarity” and “Block Structure”

  29. SpecVAT Image Analysis Within-Cluster Between-Cluster Within-Cluster Variance σW Between-Cluster Variance σB Desired Distribution: Small σW and σB

  30. “Goodness” Measurement of VAT Images T Test All T=1~255 to find the smallest σB Within-Cluster Variance σW Between-Cluster Variance σB Desired Distribution: Small σW and σB
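The slide does not spell out the exact criterion, so the following is only one plausible reading: an Otsu-style exhaustive search over thresholds T that splits the pixel intensities of a VAT image into a dark (within-cluster) group and a bright (between-cluster) group and prefers the split where both groups are tight. The score (size-weighted sum of the two group variances), the function names, and the pixel values are assumptions.

```python
def variance(values):
    """Population variance of a non-empty list."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def best_threshold(pixels):
    """Try every T in 1..255 and keep the split with the smallest pooled variance."""
    best_T, best_score = None, float("inf")
    n = len(pixels)
    for T in range(1, 256):
        dark = [v for v in pixels if v < T]       # assumed within-cluster pixels
        bright = [v for v in pixels if v >= T]    # assumed between-cluster pixels
        if not dark or not bright:
            continue
        score = (len(dark) * variance(dark) + len(bright) * variance(bright)) / n
        if score < best_score:                    # strict: keeps the first best T
            best_T, best_score = T, score
    return best_T, best_score


# Hypothetical intensities from a clean two-mode VAT image.
pixels = [10, 12, 11, 9, 200, 205, 198, 202]
T, score = best_threshold(pixels)
```

A lower `score` means a histogram with two tight, well-separated modes, which matches the slide's notion of a "good" image; comparing this score across candidate images is one way such a goodness measure could be used.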

  31. Determining the Number of Clusters Test All k=1~kmax to find the smallest σB Scatter plots of S-1 data Scatter plots of S-5 data

  32. Visual Clustering Scatter plot Good Partition Bad Partition C1 C2 C3 C1 C2 C3

  33. Visual Clustering Scatter plot Good Partition Bad Partition C1 C2 C3 C1 C2 C3

  34. Visual Clustering Scatter plot Good Partition Bad Partition Dark within-region and bright between-region C1 C1 C2 C2 C3 C3

  35. Visual Clustering Scatter plot Good Partition Dark within-region and bright between-region C1 C2 C3 A genetic algorithm is applied in the paper.

  36. Result VAT Images S-1 S-2 S-3 Scatter plots Original VAT Images SpecVAT Images

  37. Result VAT Images S-4 S-5 S-6 Scatter plots Original VAT Images SpecVAT Images

  38. Result VAT Images S-4 S-5 S-6 Scatter plots Original VAT Images SpecVAT Images

  39. Results

  40. Results [27] L. Zelnik-Manor and P. Perona. Self-tuning spectral clustering. In Proc. Advances in Neural Information Processing Systems, 2004.

  41. Results

  42. Conclusions • VAT is enhanced by using spectral analysis. • Based on SpecVAT, the cluster structure can be estimated by visual inspection. • The number of clusters can be estimated automatically.
