Fast modified global k-means algorithm for incremental cluster construction • Adil M. Bagirov, Julien Ugon, Dean Webb • PR, 2011 • Presented by Wen-Chung Liao, 2011/01/05
Outline • Motivation • Objectives • Methodology • Experiments • Conclusions • Comments
Motivation • The global k-means algorithm and the modified global k-means algorithm are incremental clustering algorithms. • They allow one to find a global or near-global minimizer of the cluster (or error) function. • However, these algorithms are memory demanding: • they require the storage of the affinity matrix. • Alternatively, this matrix can be recomputed at each iteration, but this extends the computational time significantly.
Objectives • A new version of the modified global k-means algorithm is proposed: • An auxiliary cluster function is applied to generate a set of starting points lying in different parts of the dataset. • The best solution is selected as the starting point for the next cluster center. • Information gathered in previous iterations of the incremental algorithm is used to avoid computing the whole affinity matrix. • The triangle inequality for distances is used to avoid unnecessary computations.
Methodology • Modified global k-means algorithm [1] • Starts with the computation of one cluster center and attempts to optimally add one new cluster center at each iteration. • An auxiliary cluster function is defined • using the k-1 cluster centers from the (k-1)-th iteration • to compute the starting point for the k-th center. • The k-means algorithm is applied starting from this point to find the k-partition of the dataset. • Fast modified global k-means algorithm • The auxiliary cluster function is used to generate a set of starting points. • The best solution is selected. • Computing the whole affinity matrix is avoided.
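The incremental scheme described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names (`incremental_kmeans`, `auxiliary_function`, and so on) are hypothetical, and the starting point for each new center is chosen by simply evaluating the auxiliary cluster function at every data point, which is a simplifying assumption rather than the minimization procedure used in the paper.

```python
def sqdist(p, q):
    """Squared Euclidean distance between two points (sequences of floats)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def kmeans(data, centers, max_iter=100):
    """Plain k-means (Lloyd) iterations from the given starting centers."""
    centers = list(centers)
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for a in data:
            j = min(range(len(centers)), key=lambda c: sqdist(a, centers[c]))
            clusters[j].append(a)
        # An empty cluster keeps its previous center.
        new = [mean(c) if c else centers[j] for j, c in enumerate(clusters)]
        if new == centers:
            break
        centers = new
    return centers


def cluster_function(data, centers):
    """Average squared distance from each point to its nearest center."""
    return sum(min(sqdist(a, x) for x in centers) for a in data) / len(data)


def auxiliary_function(data, d_prev, y):
    """Cluster function value if y were added to the existing centers.

    d_prev[i] is the squared distance from data[i] to its nearest existing center.
    """
    return sum(min(d, sqdist(a, y)) for a, d in zip(data, d_prev)) / len(data)


def incremental_kmeans(data, k):
    """Build 1, 2, ..., k centers, adding one center per iteration."""
    centers = [mean(data)]  # the 1-partition solution is the centroid
    for _ in range(2, k + 1):
        d_prev = [min(sqdist(a, x) for x in centers) for a in data]
        # Assumption: candidate starting points are the data points themselves.
        y = min(data, key=lambda c: auxiliary_function(data, d_prev, c))
        centers = kmeans(data, centers + [y])
    return centers
```

Evaluating the auxiliary function at every data point costs on the order of m^2 distance computations per added center, which is the cost the fast variant tries to avoid.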
Modified global k-means algorithm • Cluster function
Modified global k-means algorithm • Given: the solution to the (k-1)-partition problem • Auxiliary cluster function • [Figure: a point y, its set S(y), and cluster centers x1, x2, x3]
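The formulas on these two slides did not survive extraction. For orientation, the cluster function and the auxiliary cluster function are usually written as follows in the incremental k-means literature. The notation is an assumption: a^1, ..., a^m are the data points of the set A, x^1, ..., x^{k-1} are the centers from the previous iteration, and d_{k-1}^i is the squared distance from a^i to its nearest existing center.

```latex
% Cluster (error) function for k centers
f_k(x^1,\dots,x^k) = \frac{1}{m}\sum_{i=1}^{m}\,\min_{j=1,\dots,k}\,\lVert x^j - a^i\rVert^2

% Squared distance from a^i to its nearest center of the (k-1)-partition
d_{k-1}^{i} = \min_{j=1,\dots,k-1}\,\lVert x^j - a^i\rVert^2

% Auxiliary cluster function: only the new center y is free
\bar f_k(y) = \frac{1}{m}\sum_{i=1}^{m}\,\min\bigl\{\, d_{k-1}^{i},\; \lVert y - a^i\rVert^2 \,\bigr\}

% Points that would be attracted by a new center placed at y
S(y) = \bigl\{\, a^i \in A \;:\; \lVert y - a^i\rVert^2 < d_{k-1}^{i} \,\bigr\}
```

Under these definitions, a point y with an empty S(y) cannot decrease the auxiliary function, so only points with a non-empty S(y) are useful starting points.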
Fast modified global k-means algorithm • [Figure: the sets S1.0(y) and S0.2(y) around a point y, for u = 1.0 and u = 0.2, with cluster centers x1, x2, x3]
Reduction of computational effort • [Figure: data points ai and aj, the set S(ai), and cluster center x1]
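One way the triangle inequality can save distance evaluations when building S(ai) is sketched below. This is an illustration under stated assumptions, not necessarily the exact scheme of the paper: `build_S`, `nearest`, and `radius` are hypothetical names, where `nearest[j]` is the index of the center closest to data point j and `radius[j]` is the (non-squared) distance from point j to that center. Since ||ai - aj|| >= ||ai - x|| - ||x - aj|| for the center x of aj, the point aj cannot belong to S(ai) whenever ||ai - x|| >= 2 * radius[j], so the distance ||ai - aj|| need not be computed.

```python
import math


def dist(p, q):
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def build_S(i, data, centers, nearest, radius):
    """Indices j with ||a_i - a_j|| < radius[j], i.e. the members of S(a_i).

    Returns the member list and the number of point-to-point distances
    that were actually evaluated.
    """
    a_i = data[i]
    # Only k-1 distances: from a_i to each existing center.
    to_center = [dist(a_i, x) for x in centers]
    members, evaluated = [], 0
    for j, a_j in enumerate(data):
        # Triangle-inequality test: a_j is provably outside S(a_i).
        if to_center[nearest[j]] >= 2.0 * radius[j]:
            continue
        evaluated += 1
        if dist(a_i, a_j) < radius[j]:
            members.append(j)
    return members, evaluated
```

The pruning test uses only distances to the existing centers, so a whole far-away cluster is skipped at the cost of one comparison per point.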
Computational complexity • The modified global k-means algorithm: • O(mk^2T + km^2 + kmt) • The fast modified global k-means algorithm: • O(p(mk^2T + km^2 + kmt)) (without complexity reduction schemes) • O(p(mk^2T + km1^2 + km1t)) (with complexity reduction schemes) • T: the number of iterations by Algorithm 2 • t: the number of iterations by Algorithm 1 • m1: the number of data points in the set P(u)∩A, with m1 << m
Numerical experiments • k: the number of clusters • fopt: the best known value of the cluster function × m • E: the error in % • α: the number of Euclidean norm evaluations • t: the CPU time
[Figures: the ratios fFMGKM/fGKM and fFMGKM/fMGKM, the number of Euclidean norm evaluations α, and the CPU time]
Cluster validity • Dunn's validity index • The Davies–Bouldin cluster validity measure • Show a similar pattern. • Generate similar cluster structures.
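For readers unfamiliar with the two validity measures, a minimal Python sketch follows. It uses common textbook definitions, which are an assumption here since the slides do not state which variants were used: the Dunn index is taken as the smallest distance between points of different clusters divided by the largest cluster diameter (higher is better), and the Davies–Bouldin measure averages, over clusters, the worst ratio of summed within-cluster scatter to centroid separation (lower is better). The function names are hypothetical.

```python
import math


def dist(p, q):
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def dunn_index(clusters):
    """Minimum inter-cluster point distance over maximum cluster diameter."""
    min_sep = min(
        dist(p, q)
        for i, ci in enumerate(clusters)
        for cj in clusters[i + 1:]
        for p in ci
        for q in cj
    )
    max_diam = max(dist(p, q) for c in clusters for p in c for q in c)
    return min_sep / max_diam


def davies_bouldin(clusters):
    """Average over clusters of the worst (s_i + s_j) / d(c_i, c_j) ratio."""
    cents = [centroid(c) for c in clusters]
    # s_i: mean distance of the points of cluster i to its centroid.
    scatter = [
        sum(dist(p, c) for p in pts) / len(pts)
        for pts, c in zip(clusters, cents)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        total += max(
            (scatter[i] + scatter[j]) / dist(cents[i], cents[j])
            for j in range(k)
            if j != i
        )
    return total / k
```

Both functions expect at least two clusters, each given as a list of points, and the Dunn index additionally needs at least one cluster with two distinct points so that the maximum diameter is non-zero.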
Conclusions • A new version of the modified global k-means algorithm was developed. • It uses the k-1 cluster centers from the previous iteration to solve the k-partition problem. • It does not rely on the affinity matrix to compute the starting point. • It uses more than one starting point to minimize the auxiliary function. • Two schemes reduce the amount of computational effort. • There is no guarantee that it will converge to the global solution.
Comments • Advantages • Schemes to reduce computational effort. • Shortcomings • Determining the set U is not easy. • Applications • Clustering