Swap-based algorithms

Swap-based algorithms. Clustering Methods: Part 2d. Pasi Fränti 31.3.2014 Speech & Image Processing Unit School of Computing University of Eastern Finland Joensuu, FINLAND. Part I: Random Swap algorithm.


Presentation Transcript


  1. Swap-based algorithms Clustering Methods: Part 2d • Pasi Fränti • 31.3.2014 • Speech & Image Processing Unit • School of Computing • University of Eastern Finland • Joensuu, FINLAND

  2. Part I: Random Swap algorithm. P. Fränti and J. Kivijärvi, "Randomised local search algorithm for the clustering problem", Pattern Analysis and Applications, 3 (4), 358-369, 2000.

  3. Pseudo code of Random Swap
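The pseudo code itself did not survive in this transcript. The following is a minimal Python sketch of the loop that the later slides describe (swap a centroid, repartition, fine-tune with K-means). All function names, the default of two K-means iterations per swap, and the rule "keep the trial only if the MSE decreases" are assumptions made for illustration, not the authors' code. For brevity it repartitions all vectors instead of using the local repartition of slide 15.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def partition(data, centroids):
    """Index of the nearest centroid for every data point."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(x, centroids[j]))
            for x in data]

def update_centroids(data, labels, centroids):
    """Move each centroid to the mean of its cluster (empty clusters stay put)."""
    new = []
    for j, c in enumerate(centroids):
        members = [x for x, l in zip(data, labels) if l == j]
        if members:
            new.append(tuple(sum(m[d] for m in members) / len(members)
                             for d in range(len(c))))
        else:
            new.append(c)
    return new

def mse(data, centroids, labels):
    """Mean squared distance from each point to its assigned centroid."""
    return sum(sq_dist(x, centroids[l]) for x, l in zip(data, labels)) / len(data)

def kmeans(data, centroids, iterations=2):
    """A few K-means iterations used as fine-tuning."""
    for _ in range(iterations):
        labels = partition(data, centroids)
        centroids = update_centroids(data, labels, centroids)
    return centroids, partition(data, centroids)

def random_swap(data, m, swaps=200, kmeans_iters=2, seed=0):
    """Random Swap sketch: trial swap, fine-tune, keep only improvements."""
    rng = random.Random(seed)
    centroids = [tuple(x) for x in rng.sample(data, m)]
    centroids, labels = kmeans(data, centroids, kmeans_iters)
    best = mse(data, centroids, labels)
    for _ in range(swaps):
        trial = list(centroids)
        # Replace a randomly chosen centroid with a randomly chosen data point.
        trial[rng.randrange(m)] = tuple(rng.choice(data))
        trial, trial_labels = kmeans(data, trial, kmeans_iters)
        err = mse(data, trial, trial_labels)
        if err < best:  # assumed acceptance rule: keep improvements only
            centroids, labels, best = trial, trial_labels, err
    return centroids, labels, best
```

Calling `random_swap(data, m)` with `data` as a list of equal-length tuples returns the final centroids, the partition, and the MSE.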

  4. Demonstration of the algorithm

  5. Centroid swap

  6. Local repartition

  7. Fine-tuning by K-means: 1st iteration

  8. Fine-tuning by K-means: 2nd iteration

  9. Fine-tuning by K-means: 3rd iteration

  10. Fine-tuning by K-means: 16th iteration

  11. Fine-tuning by K-means: 17th iteration

  12. Fine-tuning by K-means: 18th iteration

  13. Fine-tuning by K-means: 19th iteration

  14. Fine-tuning by K-means: final result after 25 iterations

  15. Implementation of the swap: 1. Random swap. 2. Re-partition vectors from the old cluster. 3. Create the new cluster.
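The formulas for the three steps are missing from the transcript. One way to read them is sketched below in Python: the removed cluster's vectors are reassigned to their nearest centroid, and the new centroid then attracts any vector that is closer to it than to its current centroid. The function name, its signature, and the returned index `j` are hypothetical, and the sketch assumes the incoming `labels` already assign every vector to its nearest centroid.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def swap_with_local_repartition(data, centroids, labels, rng):
    """One swap plus local repartition; returns new centroids, labels, swapped index."""
    m = len(centroids)
    centroids = list(centroids)
    labels = list(labels)

    # 1. Random swap: replace a random centroid with a random data vector.
    j = rng.randrange(m)
    centroids[j] = tuple(rng.choice(data))

    # 2. Re-partition vectors from the old cluster j: each one moves to
    #    its nearest centroid (which may be the new centroid j).
    for i, x in enumerate(data):
        if labels[i] == j:
            labels[i] = min(range(m), key=lambda k: sq_dist(x, centroids[k]))

    # 3. Create the new cluster: the new centroid takes every vector that
    #    is strictly closer to it than to the vector's current centroid.
    for i, x in enumerate(data):
        if sq_dist(x, centroids[j]) < sq_dist(x, centroids[labels[i]]):
            labels[i] = j

    return centroids, labels, j
```

Only the old cluster's vectors are compared against all centroids; every other vector needs a single comparison against the new centroid, which is what keeps the repartition local.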

  16. Random swap as local search: study neighbor solutions

  17. Random swap as local search: select one and move

  18. Role of K-means: fine-tune the solution by a hill-climbing technique

  19. Role of K-means: consider only local optima

  20. Role of swap: reduce the search space (effective search space)

  21. Chain reaction by K-means after swap

  22. Independence of initialization: results for T = 5000 iterations (figure labels: worst, initial, best)

  23. Part II: Efficiency of Random Swap

  24. Probability of good swap • Select a proper centroid for removal: there are M clusters in total, so $p_{\text{removal}} = 1/M$. • Select a proper new location: there are N choices, so $p_{\text{add}} = 1/N$; only M of them are significantly different, so $p_{\text{add}} = 1/M$. • In total: $M^2$ significantly different swaps; the probability of each different swap is $p_{\text{swap}} = 1/M^2$. • Open question: how many of these are good?

  25. Number of neighbors. Open question: what is the size of the neighborhood (α)? (Figure panels: Voronoi neighbors; neighbors by distance)

  26. Observed number of neighbors: data set S2

  27. Average number of neighbors

  28. Expected number of iterations • Probability of not finding good swap: • Estimated number of iterations:
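The two formulas on this slide were lost in the transcript. As a hedged sketch: if each iteration independently finds a good swap with some probability p (a symbol assumed here, not taken from the slide), the probability of no good swap in T iterations is q = (1 - p)^T, and solving for T gives T = ln q / ln(1 - p). The function names below are hypothetical.

```python
import math

def failure_probability(p, t):
    """Probability that none of t independent iterations finds a good swap."""
    return (1.0 - p) ** t

def iterations_needed(p, q):
    """Smallest integer T with (1 - p)**T <= q, from T = ln q / ln(1 - p)."""
    return math.ceil(math.log(q) / math.log(1.0 - p))
```

For instance, with the pessimistic assumption that exactly one of the $M^2$ significantly different swaps is good, `iterations_needed(1 / M**2, q)` gives the iteration count for a tolerated failure probability q.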

  29. Estimated number of iterations depending on T. Observed = number of iterations needed in practice. Estimated = estimate of the number of iterations needed for given q. Data sets: S1, S2, S3, S4.

  30. Probability of success (p) depending on T

  31. Probability of failure (q) depending on T

  32. Observed probabilities depending on dimensionality

  33. Bounds for the number of iterations Upper limit: Lower limit similarly; resulting in:

  34. Multiple swaps (w) Probability for performing less than w swaps: Expected number of iterations:

  35. Number of swaps needed: example from image quantization

  36. Efficiency of the random swap. Total time to find the correct clustering: • Time per iteration × number of iterations. Time complexity of a single step: • Swap: O(1) • Remove cluster: 2M·N/M = O(N) • Add cluster: 2N = O(N) • Centroids: 2(2N/M) + 2 + 2 = O(N/M) • (Fast) K-means iteration: 4N = O(N)* *See Fast K-means for analysis.

  37. Time complexity and the observed number of steps

  38. Time spent by K-means iterations

  39. Effect of K-means iterations

  40. Total time complexity Time complexity of a single step (t): t = O(αN) Number of iterations needed (T): Total time:
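The expressions for T and for the total time are missing from the transcript. The derivation below is a reconstruction under a stated assumption, namely that a good swap is found with probability $p = (\alpha/M)^2$ per iteration; that assumption does not appear in the transcript and was chosen because it reproduces the dependencies listed on the conclusions slide (logarithmic in q, linear in N, quadratic in M, inverse in α). It also uses $\ln(1-p) \approx -p$ for small p.

```latex
\begin{align*}
  q &= (1 - p)^{T}, \qquad p = \left(\frac{\alpha}{M}\right)^{2}
      \quad \text{(assumed)} \\
  T &= \frac{\ln q}{\ln(1 - p)} \;\approx\; -\ln q \cdot \frac{M^{2}}{\alpha^{2}} \\
  t \cdot T &= O(\alpha N) \cdot \left(-\ln q \cdot \frac{M^{2}}{\alpha^{2}}\right)
      \;=\; O\!\left(-\ln q \cdot \frac{N M^{2}}{\alpha}\right)
\end{align*}
```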

  41. Time complexity: conclusions • Logarithmic dependency on q • Linear dependency on N • Quadratic dependency on M (with a large number of clusters, the method can be too slow) • Inverse dependency on α (worst case α = 2) (the higher the dimensionality and the cluster overlap, the faster the method)

  42. Time-distortion performance

  43. Time-distortion performance

  44. Time-distortion performance

  45. Time-distortion performance

  46. Time-distortion performance

  47. Time-distortion performance

  48. References. Random swap algorithm: • P. Fränti and J. Kivijärvi, "Randomised local search algorithm for the clustering problem", Pattern Analysis and Applications, 3 (4), 358-369, 2000. • P. Fränti, J. Kivijärvi and O. Nevalainen, "Tabu search algorithm for codebook generation in VQ", Pattern Recognition, 31 (8), 1139-1148, August 1998. Pseudo code: • http://cs.joensuu.fi/sipu/soft/ Efficiency of Random swap algorithm: • P. Fränti, O. Virmajoki and V. Hautamäki, "Efficiency of random swap based clustering", IAPR Int. Conf. on Pattern Recognition (ICPR'08), Tampa, FL, Dec 2008.

  49. Part III: Example when 4 swaps needed

  50. 1st swap: MSE = $4.2 \times 10^9$, MSE = $3.4 \times 10^9$
