Validity index for clusters of different sizes and densities

Presenter: Jun-Yi Wu. Authors: Krista Rizman Zalik, Borut Zalik. 國立雲林科技大學 National Yunlin University of Science and Technology. 2011 PRL. Outline: Motivation, Objective, Methodology, Experiments, Conclusion, Comments.


Presentation Transcript


  1. Validity index for clusters of different sizes and densities Presenter: Jun-Yi Wu Authors: Krista Rizman Zalik, Borut Zalik 國立雲林科技大學 National Yunlin University of Science and Technology 2011 PRL

  2. Outline • Motivation • Objective • Methodology • Experiments • Conclusion • Comments

  3. Motivation • Most previous validity indices depend considerably on the number of data objects in clusters, on cluster centroids and on average values. • The most popular validity measures tend to ignore clusters with low density and are not efficient at validating partitions whose clusters differ in size and density.

  4. Objective • Two cluster validity indices are proposed for efficient validation of partitions containing clusters that differ widely in size and density. • To design a cluster validity index that is suitable for validating partitions whose clusters have different sizes and densities. A good partition is judged by: • Overlap • Compactness • Separation distance

  5. Methodology

  6. Methodology Review of several popular validity indices: • G+ Index • PC • D Index • CE • C Index • DB Index • XiE • G Index

  7. Methodology • New clustering validity indices • SV-index • Validation of index SV • Fuzzification of the SV index • The proposed index OS exploiting overlap and separation measures • Overlap measure • Separation measure and validity index OS • Validation of index OS

  8. Methodology • SV-index: a validity measure for partitions consisting of clusters that differ widely in density or size

  9. Methodology Validation of index SV

  10. Methodology • Fuzzification of the SV index A fuzzy version of the index SV is obtained by integrating the membership values in the variation measure.

  11. Methodology • The proposed index OS exploiting overlap and separation measures • Experimental results suggested that inter-cluster separation plays a more important role in cluster validation. • Existing indices are limited in their ability to compute compactness and separation in partitions that have overlapping clusters and clusters of different sizes, which leads to incorrect validation results. • Considering these results, a cluster validity index based on overlap and separation measures is suggested.

  12. Methodology Overlap measure

  13. Methodology Separation measure and validity index OS

  14. Methodology Validation of index OS

  15. Experiments • To demonstrate the effectiveness of the proposed SV and OS indices for determining the optimal number of clusters. • Artificial data set A1 • Artificial data set A2 • Artificial data set A3 • Iris data set • Wine data set • Glass data set

  16. Experiments-Artificial data set A1

  17. Experiments-Artificial data set A2

  18. Experiments-Artificial data set A3

  19. Experiments-Artificial data set A3

  20. Experiments-Iris data set

  21. Experiments-Wine data set

  22. Experiments-Wine data set

  23. Conclusion • The experimental results showed that the new indices outperform the other considered indices, especially when clusters differ widely in size or density. • A good partition is expected to have a low degree of overlap and a larger separation distance and compactness. • The maximum value of the SV index and the minimum value of the OS index indicate the optimal partition.

  24. Comments • Advantage • Drawback • …. • Application • Clustering • Validity index
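
The selection rule stated in the conclusion (slide 23) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not give the SV or OS formulas, so the sketch assumes the index values for each candidate number of clusters have already been computed elsewhere. The function name pick_optimal_k and all numbers below are hypothetical, not taken from the paper.

```python
def pick_optimal_k(index_by_k, mode):
    """Return the candidate cluster count with the best index value.

    index_by_k: dict mapping a candidate number of clusters to the
                index value computed for that partition (assumed given).
    mode: "max" for an index that is maximised (SV, per slide 23),
          "min" for an index that is minimised (OS, per slide 23).
    """
    if not index_by_k:
        raise ValueError("no candidate partitions given")
    if mode == "max":
        return max(index_by_k, key=index_by_k.get)
    if mode == "min":
        return min(index_by_k, key=index_by_k.get)
    raise ValueError("mode must be 'max' or 'min'")


# Made-up index values for illustration only; not results from the paper.
sv_values = {2: 0.8, 3: 1.9, 4: 1.2}
os_values = {2: 0.9, 3: 0.3, 4: 0.6}

best_by_sv = pick_optimal_k(sv_values, "max")  # largest SV value wins
best_by_os = pick_optimal_k(os_values, "min")  # smallest OS value wins
```

In practice the two dictionaries would be filled by running a clustering algorithm for each candidate number of clusters and evaluating the indices on each resulting partition.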
