
Constrained Locally Weighted Clustering
Hao Cheng, Kien A. Hua, and Khanh Vu
School of Electrical Engineering and Computer Science, University of Central Florida






Presentation Transcript


  1. Constrained Locally Weighted Clustering Hao Cheng, Kien A. Hua, and Khanh Vu, School of Electrical Engineering and Computer Science, University of Central Florida

  2. Contents • Introduction • Locally Weighted Clustering • Constrained Clustering • Experiments • Conclusions

  3. Clustering • Clustering partitions a given dataset into a set of meaningful clusters, so that the data objects in each cluster share similar characteristics. • Data are generally complicated and lie in high-dimensional spaces, which makes the clustering task non-trivial.

  4. Overview • Clusters reside in subspaces. • Locally Weighted Clustering: each cluster is associated with an independent weighting vector that captures its local correlation structure. • Pairwise instance-level constraints are usually available in clustering practice. • Constrained Clustering: data points are arranged into small groups based on the given constraints, and these groups are then assigned to the closest feasible clusters.

  5. Conventional Clustering • Partitional: [K-Means] • Hierarchical: [Single-link, Complete-Link, Ward’s, Bisection K-Means] • Euclidean distance is used to measure the (dis)similarity between two objects, so all dimensions are treated as equally important throughout the whole space.

  6. Challenges • Data reside in high-dimensional spaces. • Curse of dimensionality: the space becomes sparse, and objects become (almost equally) far away from each other. • Clusters reside in subspaces. • Different subsets of data may exhibit different correlations, and in each subset the correlation may vary along different dimensions.

  7. Related Methods • Global projections: dimension reduction and manifold learning [PCA, LPP] • Adaptive dimension selection: [CLIQUE, ProClus] • Adaptive dimension weighting: [LAC]

  8. Different Correlation Structures [Figure: projections of the data onto Dim 1 & 2, Dim 1 & 3, and Dim 2 & 3]

  9. K-Means • Iteratively refine the clustering objective using Euclidean distance. • Start with initial centroids. S1: Assign points to closest centroids. S2: Update centroids. Iterate until convergence. • Result on the example data: NMI: 0.4628, Rand: 0.7373.
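A minimal NumPy sketch of the K-Means loop summarized on slide 9; the function name, random initialization, and iteration cap are illustrative assumptions rather than details from the slides.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain K-Means: assign points to closest centroids, then update the centroids."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)  # initial centroids
    labels = np.full(len(X), -1)
    for _ in range(n_iter):
        # S1: assign each point to its closest centroid (plain Euclidean distance)
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # S2: update each centroid as the mean of its assigned points
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):   # iterate until convergence
            break
        centroids = new_centroids
    return labels, centroids
```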

  10. PCA and LPP • Projection directions are defined in order to minimize data distortion. [Figure: 2-dim projections by LPP (NMI: 0.5014, Rand: 0.7507) and by PCA (NMI: 0.5294, Rand: 0.7805)]

  11. Heterogeneous Correlations • Data in a cluster can be strongly correlated in some dimensions, while in the remaining dimensions the data may vary greatly. The correlation structures differ from cluster to cluster. • A dimension is not equally important for all clusters. • Within a cluster, the dimensions are not equally important.

  12. Correlations and Weights • A weight vector is associated with each cluster. [Figure: correlation structures and the corresponding weights over Dim 1 & 2, Dim 1 & 3, and Dim 2 & 3]

  13. Local Weights • A cluster is embedded in the subspace spanned by an adaptive combination of the dimensions. • In the neighborhood of a cluster, weighted Euclidean distance is adopted.

  14. Locally Weighted Clustering • Minimize the sum of weighted distances. • Zero weights are ruled out by constraints on the weights. • The optimal centroids and weights have closed-form solutions.
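The slide's formulas did not survive the transcript. As a hedged reconstruction only, one common way to write a locally weighted objective with a zero-weight-avoiding constraint is shown below; the exact constraint and normalization used by the authors may differ.

```latex
% Hedged reconstruction, not copied from the slide.
\min_{\{c_k\},\{w_k\}} \; \sum_{k=1}^{K} \sum_{x \in C_k} \sum_{d=1}^{D} w_{kd}\,(x_d - c_{kd})^2
\qquad \text{s.t.} \quad \prod_{d=1}^{D} w_{kd} = 1, \;\; w_{kd} > 0 .
```

Under this product constraint, the optimal centroid is the cluster mean, and the optimal weight along dimension d is inversely proportional to the within-cluster dispersion V_{kd} = Σ_{x∈C_k}(x_d − c_{kd})², normalized by the geometric mean of the dispersions, i.e. w_{kd} = (∏_j V_{kj})^{1/D} / V_{kd}. Again, this is an assumption consistent with the intuition on slide 15, not a quotation of the paper.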

  15. Locally Weighted Clustering • The weights of a cluster depend only on the data points that belong to that cluster. • Smaller pairwise distances along a dimension (greater correlations) lead to larger weights; greater pairwise distances (smaller correlations) lead to smaller weights. • Start with initial centroids and weights. S1: Assign points to closest centroids. S2: Update centroids and weights. Iterate until convergence.
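A minimal Python sketch of the locally weighted clustering loop on slide 15, using the same assumed product constraint as the reconstruction above (so each cluster's weights are the geometric mean of its per-dimension dispersions divided by each dispersion); the function and parameter names are illustrative.

```python
import numpy as np

def locally_weighted_clustering(X, k, n_iter=100, eps=1e-12, seed=0):
    """Each cluster keeps its own weight vector over the dimensions."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    centroids = X[rng.choice(n, size=k, replace=False)].astype(float)
    weights = np.ones((k, d))                                   # start with uniform weights
    labels = np.full(n, -1)
    for _ in range(n_iter):
        # S1: assign points using each cluster's weighted Euclidean distance
        diffs = X[:, None, :] - centroids[None, :, :]           # shape (n, k, d)
        dists = np.einsum('nkd,kd->nk', diffs ** 2, weights)
        new_labels = dists.argmin(axis=1)
        # S2: update centroids and weights from each cluster's own points
        for j in range(k):
            members = X[new_labels == j]
            if len(members) == 0:
                continue
            centroids[j] = members.mean(axis=0)
            disp = ((members - centroids[j]) ** 2).sum(axis=0) + eps
            # smaller dispersion (stronger correlation) -> larger weight;
            # the geometric-mean normalization keeps every weight strictly positive
            weights[j] = np.exp(np.log(disp).mean()) / disp
        if np.array_equal(new_labels, labels):                  # iterate until convergence
            break
        labels = new_labels
    return labels, centroids, weights
```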

  16. LAC • LAC optimizes its own objective function under constraints on the per-cluster dimension weights. • LAC is sensitive to its tunable parameter.
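The LAC objective and constraints were lost in the transcript. For context, a hedged sketch of the usual LAC formulation (recalled from the published LAC work, not taken from this slide) is given below; h is the tunable parameter the bullet refers to.

```latex
% Hedged sketch of the LAC objective; the exact form should be checked against the LAC paper.
\min_{\{c_k\},\{w_k\}} \; \sum_{k=1}^{K} \sum_{d=1}^{D}
  \bigl( w_{kd}\, X_{kd} + h\, w_{kd} \log w_{kd} \bigr)
\qquad \text{s.t.} \quad \sum_{d=1}^{D} w_{kd} = 1 ,
```

where X_{kd} is the average squared distance along dimension d from the points of cluster k to its centroid; the entropy term controlled by h keeps the weights from collapsing onto a single dimension, which is why the result depends strongly on the choice of h.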

  17. LWC [Figure: LWC result on the example data, projected onto Dim 1 & 2, Dim 1 & 3, and Dim 2 & 3] NMI: 1, Rand: 1.

  18. Constrained Clustering • A pairwise instance-level constraint tells whether two points belong to the same cluster: • Must-link • Cannot-link • This form of partial knowledge is usually accessible and valuable in clustering practice. • Constrained clustering utilizes a given set of constraints to derive better data partitions.

  19. Related Methods • Learn a suitable distance metric [RCA, DCA] • Guide the clustering process: • Enforce the constraints [Constrained K-Means] • Penalize constraint violations [CVQE] • Unified method: [MPCK-Means]

  20. Chunklet • Chunklet: ‘a subset of points that are known to belong to the same although unknown class’. • Data objects that are inferred to be similar can be placed into the same chunklet. • A set of pairwise constraints can be represented as a chunklet graph.

  21. Chunklet Graph [Figure: example points with must-link and cannot-link constraints]

  22. Chunklet Graph [Figure: the chunklet graph built from these must-link and cannot-link constraints]

  23. Chunklet Graph [Figure: chunklet graph with must-link and cannot-link edges; each node is labeled with the number of points in its chunklet]

  24. Graph Construction • Initially, each point is a chunklet node. • Merge two nodes if their points are inferred to be similar. • Add an edge between two chunklet nodes if their points are inferred to be dissimilar. • Cluster assignment is then done chunklet by chunklet, as sketched below. [Figure: step-by-step construction of the chunklet graph]
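A minimal sketch of the graph construction steps above, merging must-linked points into chunklets with a union-find structure and recording cannot-link edges between the resulting chunklet nodes; the function name and data layout are illustrative assumptions.

```python
def build_chunklet_graph(n_points, must_links, cannot_links):
    """Return chunklets (lists of point indices) and cannot-link edges between chunklet ids."""
    parent = list(range(n_points))            # initially, each point is its own chunklet node

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]     # path compression
            i = parent[i]
        return i

    # merge two nodes if they are inferred similar (must-link)
    for a, b in must_links:
        parent[find(a)] = find(b)

    # group points by the root of their chunklet
    roots = sorted({find(i) for i in range(n_points)})
    root_to_id = {r: cid for cid, r in enumerate(roots)}
    chunklets = [[] for _ in roots]
    for i in range(n_points):
        chunklets[root_to_id[find(i)]].append(i)

    # add an edge between two chunklet nodes if they are inferred dissimilar (cannot-link)
    edges = set()
    for a, b in cannot_links:
        u, v = root_to_id[find(a)], root_to_id[find(b)]
        if u != v:
            edges.add((min(u, v), max(u, v)))
    return chunklets, edges
```

For example, build_chunklet_graph(5, must_links=[(0, 1), (1, 2)], cannot_links=[(2, 3)]) yields one chunklet {0, 1, 2} of size 3, two singleton chunklets, and a cannot-link edge between the first chunklet and the one containing point 3.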

  25. Gaussian Data • Assume: the data in each cluster follow a Gaussian distribution. • Two clusters: C1 and C2. • Notation: a chunklet, a point x belonging to the chunklet, and the size of the chunklet.

  26. Chunklet Assignment • A chunklet can be assigned to a cluster in bulk. • Two neighboring (cannot-linked) chunklets can be assigned to two different clusters.

  27. Probability of Assignment • In the case of two clusters: • A single chunklet is assigned correctly with a probability that grows with the chunklet size. • A similar analysis applies to two neighboring chunklets assigned together.
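The probability expressions on this slide were not captured. As an illustrative reconstruction only (a one-dimensional, two-cluster Gaussian setting, not necessarily the exact model used on the slide): if each cluster generates points from a Gaussian with standard deviation sigma and the two centroids are Delta apart, then

```latex
% Illustrative 1-D, two-cluster case (an assumption, not the slide's exact formula):
p_1 = \Phi\!\left(\frac{\Delta}{2\sigma}\right), \qquad
p_n = \Phi\!\left(\frac{\sqrt{n}\,\Delta}{2\sigma}\right),
```

where p_1 is the probability that a single point lands on the correct side of the midpoint between the centroids, and p_n is the probability that a chunklet of n points, assigned in bulk through its mean, does so. Since the mean of n points has standard deviation sigma/sqrt(n), p_n grows with n, which matches the conclusion on slide 31.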

  28. K-Means • K-Means is unaware of constraints and assigns points independently. • For a chunklet of n points, each point is assigned correctly to its true cluster with some probability; the events are independent, so the number of correct assignments follows a Binomial distribution, and its mean gives the average number of correct assignments. [Figure: three points of one chunklet assigned independently between C1 and C2]

  29. Constrained K-Means • Constrained K-Means enforces the constraints strictly. • Assume the 3 points belong to cluster 1; the average number of correct assignments can then be derived for this case. [Figure: the chunklet assigned as a group to either C1 or C2]

  30. Chunklet Assignment • The chunklet is assigned in bulk, so its points are either all correct or all incorrect; the average number of correct assignments follows. • Similarly, we can analyze the case of two neighboring chunklets. [Figure: the chunklet assigned between C1 and C2]

  31. Chunklet versus One-by-one • It is better to assign points in chunklets. • The bigger the chunklet, the more correct assignments. • It is better to assign two neighboring chunklets together.

  32. CLWC • Combine the local weighting scheme with chunklet assignment. • Build the chunklet graph. Start with initial centroids and weights. S1: Assign points to closest centroids via chunklet assignments. S2: Update centroids and weights. Iterate until convergence.

  33. Chunklet Assignment • Try to make the most confident assignments first. • If a node has a neighbor, assign the two together. • Assign larger chunklets first. • Chunklets are placed in the closest feasible clusters, as in the sketch below. [Figure: chunklet graph with chunklets of various sizes assigned to clusters C1, C2, and C3]
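A minimal sketch of the assignment order on slide 33: chunklets with cannot-link neighbors come first, then larger chunklets, and each chunklet goes to the closest cluster (under that cluster's weights) that does not clash with an already assigned neighbor. The slide's joint assignment of a node together with its neighbor is simplified here to a greedy per-chunklet pass, and all names are illustrative.

```python
import numpy as np

def assign_chunklets(X, chunklets, edges, centroids, weights):
    """Assign whole chunklets to clusters, most confident assignments first."""
    neighbors = {i: set() for i in range(len(chunklets))}
    for a, b in edges:                        # cannot-link edges between chunklet nodes
        neighbors[a].add(b)
        neighbors[b].add(a)

    # most confident first: chunklets with neighbors, then larger chunklets
    order = sorted(range(len(chunklets)),
                   key=lambda i: (len(neighbors[i]) > 0, len(chunklets[i])),
                   reverse=True)

    assignment = {}
    for i in order:
        pts = X[chunklets[i]]
        # total weighted distance of the chunklet's points to every cluster centroid
        dists = [np.sum(weights[k] * (pts - centroids[k]) ** 2)
                 for k in range(len(centroids))]
        forbidden = {assignment[j] for j in neighbors[i] if j in assignment}
        feasible = [k for k in np.argsort(dists) if k not in forbidden]
        # closest feasible cluster; fall back to the overall closest if none is feasible
        assignment[i] = int(feasible[0]) if feasible else int(np.argmin(dists))
    return assignment
```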

  34. Better Clustering • 4 classes of images (100 each) from the COREL DB. [Figure: ground truth versus K-Means, and results with 50 links and with 300 links]

  35. Experimental Setup • Techniques • Datasets • Evaluation metrics • Pairwise constraints
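The setup details (datasets, competing techniques, constraint generation) did not survive the transcript. For the two evaluation metrics quoted throughout the slides, NMI and the Rand index, a minimal scikit-learn sketch is shown below; the labels are made up for illustration, and rand_score requires scikit-learn 0.24 or later.

```python
from sklearn.metrics import normalized_mutual_info_score, rand_score

# hypothetical ground-truth and predicted cluster labels, for illustration only
y_true = [0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 0, 1, 1, 1, 1, 2, 2, 0]

print("NMI: ", round(normalized_mutual_info_score(y_true, y_pred), 4))
print("Rand:", round(rand_score(y_true, y_pred), 4))
```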

  36. Performances [Figure: comparison of K-Means, hierarchical clustering, dimension reduction, manifold learning, LAC, and LWC]

  37. Performances [Figure: comparison of CLWC against metric learning, direct constraint enforcement, and violation-penalty approaches]

  38. Performances

  39. Conclusions • An independent weighting vector is used to capture the local correlation structure around each cluster. The weights help define the embedding subspace of the cluster. • Data points are grouped into chunklets based on the input constraints. The points in a chunklet are treated as a whole in the assignment process, and the most confident assignments (those least likely to be incorrect) are made first.

  40. Thank you!


  42. Efficiency • The cost of each iteration is [expression not captured in the transcript]. • Local weighting generally lets the algorithm converge quickly. • The more constraints there are, the faster the algorithm converges.

  43. Constraint Violations • There is no guarantee that all constraints can be satisfied. [Figure: an example chunklet graph between clusters C1 and C2 with no feasible assignment]

  44. Constraint Violations

  45. Probability Constraints • Use a real value in the range [-1, 1] to denote the similarity between two points, i.e., the confidence that the two points are in the same cluster. • Clique: a set of points that are similar (with high similarity values) to each other. • For each point, search for a clique that includes this point. • The degree of dissimilarity between two cliques can be computed. • Assignment is then done clique by clique.

  46. Two Neighboring Chunklets • The expected number of correct assignments for two neighboring chunklets assigned together.

  47. [Figure: data distributions along Dim 1, Dim 2, and Dim 3]
