Automatically Determining the Number of Clusters in Unlabeled Data Sets Using Dissimilarity Images

This paper presents a method for automatically estimating the number of clusters in unlabeled datasets. It works on the reordered dissimilarity image (RDI) produced by VAT and applies dark block extraction (DBE) to it. Experiments on synthetic and real datasets compare the method with an existing technique (CCE). The reported advantages are that DBE is nearly automatic and has a single, easy-to-set parameter. The presenter also notes that most methods prefer larger clusters over smaller ones, and that the approach combines cluster analysis with image processing techniques.


Presentation Transcript


  1. Automatically Determining the Number of Clusters in Unlabeled Data Sets. Presenter: Lin, Shu-Han. Authors: Liang Wang, Christopher Leckie, Kotagiri Ramamohanarao, and James Bezdek. IEEE Transactions on Knowledge and Data Engineering (TKDE), 2009

  2. Outline • Motivation • Objective • Methodology • Experiments • Conclusion • Comments

  3. Motivation: “reordered dissimilarity image” (RDI). How to automatically estimate the number of clusters in an unlabeled dataset?

  4. Objectives: extract dark blocks

  5. Methodology – VAT

  6. Methodology – VAT
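The VAT slides carry only figures, so the reordering itself is not spelled out in the transcript. The sketch below assumes the commonly described VAT ordering: start from an object involved in the largest dissimilarity, then repeatedly append the unselected object that is closest to any already selected object. The function names `vat_order` and `reorder` and the toy one-dimensional data are illustrative assumptions, not taken from the slides.

```python
def vat_order(dissim):
    """Return a VAT-style ordering of object indices.

    `dissim` is a symmetric square matrix (list of lists) of pairwise
    dissimilarities. Assumption: Prim-like ordering as commonly described
    for VAT; the slides do not give the steps.
    """
    n = len(dissim)
    # Start from a row index of the largest dissimilarity.
    start, largest = 0, -1.0
    for i in range(n):
        for j in range(n):
            if dissim[i][j] > largest:
                start, largest = i, dissim[i][j]
    order = [start]
    remaining = set(range(n)) - {start}
    while remaining:
        # Next object: the unselected one closest to any selected object.
        nxt = min(remaining, key=lambda j: min(dissim[i][j] for i in order))
        order.append(nxt)
        remaining.remove(nxt)
    return order


def reorder(dissim, order):
    """Permute rows and columns of `dissim` to get the reordered matrix."""
    return [[dissim[i][j] for j in order] for i in order]


# Toy data: two interleaved 1-D groups, near 0 and near 10.
points = [0.0, 10.0, 0.5, 10.5, 1.0, 11.0]
dissim = [[abs(a - b) for b in points] for a in points]
order = vat_order(dissim)
rdi = reorder(dissim, order)
```

After reordering, objects from the same group sit next to each other, so small dissimilarities gather in square blocks along the diagonal. These are the "dark blocks" that DBE later tries to count.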

  7. Methodology – DBE: overview of steps 1 to 4

  8. Methodology – DBE 1. Dissimilarity transformation and image segmentation: f(t); Graythresh function (Matlab): σ. Figure panels: before, after
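The slide names a transform f(t) and Matlab's Graythresh (Otsu's method) as the source of σ, but the formula is not in the transcript. The sketch below assumes f(t) = 1 − exp(−t/σ) applied to dissimilarities scaled to [0, 1], followed by a second Otsu threshold to obtain the binary image. Both the form of f and the second thresholding step are assumptions; `otsu_threshold` is a plain-Python stand-in for Graythresh, and marking dark-block pixels with 1 is a convention chosen here.

```python
import math


def otsu_threshold(values, bins=256):
    """Otsu's threshold for values in [0, 1]; returns a level in (0, 1]."""
    hist = [0] * bins
    for v in values:
        hist[min(int(v * bins), bins - 1)] += 1
    total = len(values)
    sum_all = sum(k * h for k, h in enumerate(hist))
    w0, sum0 = 0, 0
    best_k, best_var = 0, -1.0
    for k in range(bins):
        w0 += hist[k]
        sum0 += k * hist[k]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        # Between-class variance (up to a constant factor).
        var_between = w0 * w1 * (sum0 / w0 - (sum_all - sum0) / w1) ** 2
        if var_between > best_var:
            best_k, best_var = k, var_between
    return (best_k + 1) / bins


def transform_and_segment(dissim):
    """Return (sigma, binary image); 1 marks a dark (low-dissimilarity) pixel.

    Assumes the largest dissimilarity is positive.
    """
    d_max = max(max(row) for row in dissim)
    scaled = [[d / d_max for d in row] for row in dissim]
    sigma = otsu_threshold([v for row in scaled for v in row])
    # Assumed transform: f(t) = 1 - exp(-t / sigma).
    transformed = [[1.0 - math.exp(-v / sigma) for v in row] for row in scaled]
    level = otsu_threshold([v for row in transformed for v in row])
    binary = [[1 if v < level else 0 for v in row] for row in transformed]
    return sigma, binary


# Toy data already in cluster order: two tight 1-D groups.
points = [0.0, 0.1, 0.2, 10.0, 10.1, 10.2]
dissim = [[abs(a - b) for b in points] for a in points]
sigma, binary = transform_and_segment(dissim)
```

On this toy input the binary image keeps the two diagonal 3 × 3 blocks and zeroes the off-diagonal regions. Real reordered dissimilarity images are noisier, which is why a filtering step follows.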

  9. Methodology – DBE 2. Directional morphological filtering of the binary image. Figure panels: a = 2%, a = 1%. Symmetric: along horizontal and vertical directions. Linear: along the same direction
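The slide says the filter is linear and applied along the horizontal and vertical directions, with a size governed by the parameter a. A minimal sketch follows, assuming the filter is a binary morphological opening with a straight-line structuring element, applied first along rows and then along columns. The choice of opening, the order of the two passes, and the helper names are assumptions. In practice the element length might be derived as something like `max(1, round(a * n))` for an n × n image, which is also an assumption.

```python
def _erode_rows(img, length):
    """Erode each row with a horizontal line of `length` pixels (outside = 0)."""
    left = (length - 1) // 2
    right = length - 1 - left
    width = len(img[0])
    return [[1 if all(0 <= c + d < width and row[c + d]
                      for d in range(-left, right + 1)) else 0
             for c in range(width)] for row in img]


def _dilate_rows(img, length):
    """Dilate each row with the reflected horizontal line element."""
    left = (length - 1) // 2
    right = length - 1 - left
    width = len(img[0])
    return [[1 if any(0 <= c - d < width and row[c - d]
                      for d in range(-left, right + 1)) else 0
             for c in range(width)] for row in img]


def _open_rows(img, length):
    """Opening = erosion followed by dilation; removes runs shorter than length."""
    return _dilate_rows(_erode_rows(img, length), length)


def _transpose(img):
    return [list(col) for col in zip(*img)]


def directional_open(img, length):
    """Open along rows, then along columns (via transposition)."""
    horizontal = _open_rows(img, length)
    return _transpose(_open_rows(_transpose(horizontal), length))


# Two 3x3 blocks plus two isolated noise pixels in the corners.
noisy = [
    [1, 1, 1, 0, 0, 1],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [1, 0, 0, 1, 1, 1],
]
cleaned = directional_open(noisy, length=2)
```

A longer element removes larger specks but can also erase small genuine blocks, which fits the conclusion slide's remark that methods tend to prefer "larger" clusters.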

  10. Methodology – DBE 3. Distance transform and diagonal projection of filtered image (distance to the nearest non-zero pixel)
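A brute-force sketch of this step, under two assumptions. First, each dark-block pixel (marked 1 here) receives its Euclidean distance to the nearest pixel outside any block, which matches "nearest non-zero pixel" if the transform is run on the complemented image. Second, the projection sums pixel values along lines perpendicular to the main diagonal, that is, over all pixels with the same i + j. The function names are illustrative.

```python
import math


def distance_transform(binary):
    """Distance from each 1-pixel to the nearest 0-pixel (0 for 0-pixels).

    Brute force, O(n^4) for an n x n image; fine for a small illustration.
    """
    n = len(binary)
    zeros = [(i, j) for i in range(n) for j in range(n) if not binary[i][j]]
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if binary[i][j] and zeros:
                dist[i][j] = min(math.hypot(i - a, j - b) for a, b in zeros)
    return dist


def diagonal_projection(img):
    """Sum pixel values over each anti-diagonal (constant i + j)."""
    n = len(img)
    signal = [0.0] * (2 * n - 1)
    for i in range(n):
        for j in range(n):
            signal[i + j] += img[i][j]
    return signal


# Filtered binary image with two 3x3 dark blocks on the diagonal.
blocks = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]
signal = diagonal_projection(distance_transform(blocks))
```

The resulting signal has 2n − 1 samples. Each dark block contributes one hump, and the gap between blocks appears as a valley.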

  11. Methodology – DBE 4. Detection of major peaks and valleys in the projection signal: smooth (parameter: a); major “peaks/valleys” (parameter: a)
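The slide states only that the projection signal is smoothed and that major peaks and valleys are then detected, both governed by a. The sketch below is one possible reading: a centered moving average for smoothing, followed by keeping local maxima that are at least `min_gap` samples apart. The moving average, the separation rule, and the names `smooth` and `major_peaks` are assumptions. Under the reading that each major peak corresponds to one dark block, the number of peaks is the estimated number of clusters.

```python
def smooth(signal, window):
    """Centered moving average; the window shrinks near the ends."""
    half = window // 2
    out = []
    for k in range(len(signal)):
        lo, hi = max(0, k - half), min(len(signal), k + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def major_peaks(signal, min_gap):
    """Indices of local maxima, keeping the highest within any min_gap samples."""
    candidates = []
    for k in range(len(signal)):
        left = signal[k - 1] if k > 0 else float("-inf")
        right = signal[k + 1] if k + 1 < len(signal) else float("-inf")
        if signal[k] > left and signal[k] >= right:
            candidates.append(k)
    peaks = []
    # Visit candidates from highest to lowest; drop any too close to a kept peak.
    for k in sorted(candidates, key=lambda idx: -signal[idx]):
        if all(abs(k - p) >= min_gap for p in peaks):
            peaks.append(k)
    return sorted(peaks)


# Projection signal with two humps separated by a valley (illustrative values).
projection = [3.0, 4.0, 4.0, 2.0, 1.0, 0.0, 1.0, 2.0, 4.0, 4.0, 3.0]
peaks = major_peaks(smooth(projection, window=3), min_gap=3)
estimated_clusters = len(peaks)
```

Tying both the smoothing window and the minimum peak separation to the same quantity mirrors the slide's claim that a single parameter a controls the procedure, although the exact dependence on a is not given in the transcript.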

  12. Experiments – Synthetic datasets

  13. Experiments – Comparison with CCE

  14. Experiments – Comparison with CCE: synthetic datasets, real datasets

  15. Conclusions • Most methods prefer “larger” rather than “smaller” clusters • The DBE • (Nearly) automatically estimates the number of clusters • Has just one easy-to-set parameter: a

  16. Comments • Advantage • A visual assessment of cluster tendency (VAT) • Combines the cluster analysis problem with image processing techniques • Drawback • … • Application • …
