
A genetic clustering algorithm for data with non-spherical-shape clusters



Presentation Transcript


  1. A genetic clustering algorithm for data with non-spherical-shape clusters

  2. Outline • Motivation • Objective • Introduction • The basic concept of genetic strategy • The genetic clustering algorithm • Experiments • Concluding remarks and Summary • Personal opinions • Review

  3. Motivation • Traditional clustering leaves several problems open: • How many clusters are there? • How should the threshold distance d in neighborhood clustering be chosen? • How can non-spherical-shape clusters be found?

  4. Objective • Solve these problems of traditional clustering algorithms. • Propose a genetic clustering algorithm. • Handle non-spherical-shape clusters. • Find the proper number of clusters k automatically, according to the similarities among objects.

  5. Introduction • Clustering methods can broadly be classified into two categories: • Hierarchical (agglomerative or divisive) • Non-hierarchical (for example, k-means)

  6. Introduction • Problems in most of these clustering algorithms: • How many clusters are there? • How are non-spherical-shape clusters handled? • What distance threshold should trigger a merge? • GA clustering algorithm • A GA is a search method, and clustering can be treated as a search problem.

  7. Basic concept of the classical genetic algorithm (flowchart) • Encoding schemes • Fitness evaluation • Test for the end of the algorithm: YES, halt; NO, continue • Parent selection • Crossover operators • Mutation operators

  8. The genetic clustering algorithm • The algorithm CLUSTERING consists of two stages. • First stage (nearest neighbor): the n objects O1, O2, …, On are grouped into clusters C1, C2, …, Cm. • Second stage (GA clustering): the clusters C1, C2, …, Cm are merged.

  9. First Stage • Step 1: find the nearest neighbor of each object Oi. • Step 2: compute dav, the average of the nearest-neighbor distances. • Open question: what is the meaning of u?

  10. First Stage • Step 3: compute the n×n adjacency matrix A. • Step 4: find the connected components, denoted C1, C2, …, Cm.
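The four first-stage steps can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. The function name `first_stage`, the use of Euclidean distance, and the rule that two objects are adjacent when their distance is at most u·dav are all assumptions: the transcript names u and dav but does not preserve the adjacency rule.

```python
import math
from collections import deque

def first_stage(points, u=1.5):
    """Group objects into connected components (hypothetical helper).

    ASSUMPTION: objects i and j are adjacent when dist(i, j) <= u * d_av.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Step 1: nearest-neighbor distance of each object.
    nn = [min(dist[i][j] for j in range(n) if j != i) for i in range(n)]

    # Step 2: d_av, the average of the nearest-neighbor distances.
    d_av = sum(nn) / n

    # Step 3: n x n adjacency matrix (assumed threshold u * d_av).
    threshold = u * d_av
    adj = [[1 if i != j and dist[i][j] <= threshold else 0
            for j in range(n)] for i in range(n)]

    # Step 4: connected components via breadth-first search.
    seen = [False] * n
    components = []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        queue = deque([start])
        comp = []
        while queue:
            i = queue.popleft()
            comp.append(i)
            for j in range(n):
                if adj[i][j] and not seen[j]:
                    seen[j] = True
                    queue.append(j)
        components.append(sorted(comp))
    return d_av, components
```

Under this assumed rule, a larger u links more objects and yields fewer, larger initial clusters.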

  11. Second Stage • The initialization step: population, coding, Dinter and Dintra • The three phases of the GA: reproduction phase, crossover phase, mutation phase

  12. Second Stage • Compute the m×m distance matrix D over each pair of clusters Ci and Cj.

  13. Second Stage • The initialization step • Population: 50 strings. • Each string has length m, one position per cluster in {C1, C2, …, Cm}. • For each string Ri, two sets Ui and U’i are defined. • Example: R1 = 1 1 1 0 0 gives U1={C1, C2, C3} ; U’1={C4, C5}. R2 = 1 0 1 1 0 gives U2={C1, C3, C4} ; U’2={C2, C5}.
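The coding can be illustrated with a short Python sketch that splits a binary string into the two sets. The function name `decode` is hypothetical. The reading that a 1 places cluster Ci in U and a 0 places it in U' is inferred from the slide's two examples.

```python
def decode(bits):
    """Split a binary string into (U, U') cluster-label lists.

    ASSUMPTION: bit i == 1 puts cluster C(i+1) in U, bit i == 0 in U'.
    """
    u = [f"C{i + 1}" for i, b in enumerate(bits) if b == 1]
    u_prime = [f"C{i + 1}" for i, b in enumerate(bits) if b == 0]
    return u, u_prime

# The two strings shown on the slide.
r1 = [1, 1, 1, 0, 0]
r2 = [1, 0, 1, 1, 0]
print(decode(r1))
print(decode(r2))
```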

  14. Second Stage • Intra-distance Dintra and inter-distance Dinter • Example: U1={C1, C2, C3} ; U’1={C4, C5, C7}

  15. Second Stage • Reproduction phase • Fitness function: SCORE(Ri) = Dinter(Ri)*w – Dintra(Ri), with w in [1, 3]. • Reproduction probability • Crossover phase • Crossover probability pc = 0.8. • Mutation phase • Mutation probability pm = 0.1.
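A minimal Python sketch of the fitness function and the two variation operators follows. The transcript does not preserve the definitions of Dinter and Dintra, so `score` takes them as precomputed numbers. The slide gives only the probabilities pc and pm, so one-point crossover and bit-flip mutation are assumed operator types, and all function names are hypothetical.

```python
import random

def score(d_inter, d_intra, w=2.0):
    """SCORE(Ri) = Dinter(Ri) * w - Dintra(Ri), with w chosen in [1, 3]."""
    return d_inter * w - d_intra

def crossover(a, b, pc=0.8, rng=random):
    """ASSUMPTION: one-point crossover, applied with probability pc."""
    if len(a) < 2 or rng.random() >= pc:
        return a[:], b[:]          # no crossover: children copy the parents
    cut = rng.randrange(1, len(a))  # cut point strictly inside the string
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def mutate(bits, pm=0.1, rng=random):
    """ASSUMPTION: each bit is flipped independently with probability pm."""
    return [1 - b if rng.random() < pm else b for b in bits]
```

A larger w rewards separation between the two sets more heavily relative to the compactness term.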

  16. Merge_Sets_Finding Algorithm • Step 1: Sort the strings by fitness. • Step 2: Choose Ri. • Step 3: Choose the smallest l > i such that [condition not preserved in the transcript]. IF no such l exists THEN go to Step 4 (discarded) ELSE set i = l and go to Step 2 (merge). • Step 4: End. • Example: R1={C1, C2, C3} R2={C3, C4, C6} R3={C4, C5}
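The selection loop might look like the Python sketch below. The condition in Step 3 is missing from the transcript, so this sketch assumes a set is accepted only when it shares no cluster with any set accepted earlier. That reading fits the three example sets but is not confirmed by the source. The function name and the best-first input order are also assumptions.

```python
def merge_sets_finding(ranked_sets):
    """Greedy pass over sets already sorted best-first by fitness.

    ASSUMPTION: a set is kept only if it is disjoint from every set
    kept so far; overlapping sets are discarded.
    """
    selected = []
    used = set()
    for candidate in ranked_sets:
        if used.isdisjoint(candidate):
            selected.append(candidate)
            used |= candidate
    return selected

# Example sets from the slide, assumed to be in fitness order.
ranked = [{"C1", "C2", "C3"}, {"C3", "C4", "C6"}, {"C4", "C5"}]
print(merge_sets_finding(ranked))
```

With this assumed condition, R2 is skipped because it shares C3 with R1, while R3 is kept.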

  17. Experiments - 1 • Figure: original data • Noise: distance > 2dav

  18. Experiments - 1 • Figure panels: 7 clusters; u=1.2, 8 clusters

  19. Experiments - 1 • Figure panels: u=1.5 or 2, 5 clusters; 6 clusters

  20. Experiments - 1 • Figure panels: u=1.2, w=2, 4 clusters (best); 3 clusters

  21. Experiments - 1 • Figure panels: 4 clusters (direct GA); 2 clusters

  22. Experiments - 1 • Figure: 4 clusters (k-means)

  23. Experiments - 2 • Figure panels: original; 4 clusters; 2 clusters; 3 clusters

  24. Experiments - 3 • Figure panels: original; 4 clusters

  25. Concluding remarks and Summary • The genetic clustering algorithm CLUSTERING: • Handles non-spherical-shape clusters. • Clusters automatically. • Uses binary search to find the proper interval for w.

  26. Personal Opinions • The proper number of clusters is determined by the value of w.

  27. Review • GCA is used for automatic clustering. • Split: nearest neighbor (NN). • Merge: Merge_Sets_Finding algorithm.
