This presentation describes Traveller Search, an improvement to the fast exact GLA based on code vector activity detection. Centroids are classified as active or static, and data points as static, balanced, farther, or closer according to how their centroid moved. For a data point in the closer state, the search is restricted to the current centroid and the active centroids that moved a greater distance than it did.
Clustering Methods 2010
Aki Heikkinen
Traveller Search in Code Vector Activity Detection based GLA
Fast Exact GLA Based on Code Vector Activity Detection
• Centroids are classified into states [2]:
  • Active
  • Static
• Data point classifications [2]:
  • Static, when the centroid is static
  • Balanced, when the centroid is active, but the distance between the centroid and the data point didn't change
  • Farther, when the centroid is active and it moves away from the data point
  • Closer, when the centroid is active and it moves closer to the data point
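The classifications above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation from [2]: the function names, the tuple representation of points, and the tolerance `tol` are hypothetical, and a centroid is treated as active when it moved during the latest update.

```python
import math

def classify_centroids(old_centroids, new_centroids, tol=1e-12):
    """Label each centroid 'active' if it moved in the last update, else 'static'.

    Assumption: 'moved' means the Euclidean displacement exceeds tol.
    """
    return [
        "active" if math.dist(old, new) > tol else "static"
        for old, new in zip(old_centroids, new_centroids)
    ]

def classify_point(point, old_centroid, new_centroid, tol=1e-12):
    """Classify a data point by how its own centroid moved relative to it."""
    if math.dist(old_centroid, new_centroid) <= tol:
        return "static"                      # the centroid did not move
    d_old = math.dist(point, old_centroid)   # distance before the update
    d_new = math.dist(point, new_centroid)   # distance after the update
    if abs(d_new - d_old) <= tol:
        return "balanced"                    # centroid moved, distance unchanged
    return "closer" if d_new < d_old else "farther"
```

For example, with a centroid that moves from (0, 0) to (1, 0), the point (2, 0) is classified as closer and the point (-1, 0) as farther.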
Traveller Search
Algorithm improvement: When a data point is in the "closer" state (i.e. its current centroid has moved closer to the data point), instead of searching all active centroids, search only the current centroid and all the other active centroids that have moved a greater distance than the current centroid [1]. Centroids that moved a greater distance are "travellers".
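The reduced search for a data point in the closer state might look like the following minimal sketch. It assumes that the distance a centroid has moved is the Euclidean distance between its old and new positions; the function names, arguments, and `tol` are hypothetical and are not taken from [1] or [2].

```python
import math

def traveller_candidates(current, old_centroids, new_centroids, tol=1e-12):
    """Indices to search: the current centroid plus every 'traveller'.

    A traveller is an active centroid (displacement > tol) that moved a
    greater distance than the current centroid did.
    """
    moved = [math.dist(o, n) for o, n in zip(old_centroids, new_centroids)]
    travellers = [
        j for j, m in enumerate(moved)
        if j != current and m > tol and m > moved[current]
    ]
    return [current] + travellers

def traveller_search(point, current, old_centroids, new_centroids):
    """Return the index of the nearest centroid among the candidates only."""
    candidates = traveller_candidates(current, old_centroids, new_centroids)
    return min(candidates, key=lambda j: math.dist(point, new_centroids[j]))

# Illustrative data: centroid 1 moved 3.0, more than centroid 0's 1.0,
# so it is the only traveller for a point currently assigned to centroid 0.
old = [(0, 0), (10, 0), (0, 10), (10, 10)]
new = [(1, 0), (7, 0), (0, 10.5), (10, 10)]
print(traveller_candidates(0, old, new))       # [0, 1]
print(traveller_search((4.8, 0), 0, old, new)) # 1
```

Centroids 2 and 3 are never examined here, because one moved less than the current centroid and the other is static; that skipped work is where the time saving would come from.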
Traveller Search
[Figure: a data point in the closer state, with distances D1 to D6. D1 is compared against the others: D1 < D2? ok. D1 < D3? ok. D1 < D6? ok. D1 < D4? NO! D1 < D5? NO! Search the nearest centroid!]
Test Results
Specs: average of 500 runs, 100 swaps, 2 K-means iterations

S1 dataset          MSE       TIME
Default             0,89875   0,48281
Traveller search    0,90456   0,46938

S2 dataset          MSE       TIME
Default             1,33087   0,54935
Traveller search    1,32932   0,52371
Test Results
Specs: average of 100 runs, 100 swaps, 2 K-means iterations

Birch1 dataset      MSE       TIME
Default             4,74680   31,77298
Traveller search    4,74619   30,74745
Test Results
Specs: average of 100 runs, 100 swaps, 20 K-means iterations

S1 dataset          MSE       TIME
Default             0,89176   1,41334
Traveller search    0,89176   1,40127

S2 dataset          MSE       TIME
Default             1,32791   2,25996
Traveller search    1,32791   2,22452
Test Results
Specs: average of 100 runs, 100 swaps, 50 K-means iterations

S1 dataset          MSE       TIME
Default             0,89176   1,50824
Traveller search    0,89176   1,49933

S2 dataset          MSE       TIME
Default             1,32791   2,46833
Traveller search    1,32791   2,43418
Test Results
Specs: average of 500 runs, K-means algorithm

S1 dataset          MSE       TIME
Default             1,87835   0,04246
Traveller search    1,89514   0,04172

S2 dataset          MSE       TIME
Default             1,98499   0,05895
Traveller search    1,99212   0,05600
References
[1] Kuo-Liang Chung, Jhin-Sian Lin, Faster and more robust point symmetry-based K-means algorithm, Pattern Recognition, 40, 410-422, 2007.
[2] T. Kaukoranta, P. Fränti, O. Nevalainen, A fast exact GLA based on code vector activity detection, IEEE Trans. on Image Processing, 9 (8), 1337-1342, August 2000.