
Semi-supervised Affinity Propagation

Semi-supervised Affinity Propagation. Inmar Givoni, Brendan Frey, Delbert Dueck PSI group University of Toronto. Affinity Propagation. Clustering algorithm that works by finding a set of exemplars (prototypes) in the data and assigning other data points to the exemplars [Frey07].


Presentation Transcript


  1. Semi-supervised Affinity Propagation • Inmar Givoni, Brendan Frey, Delbert Dueck • PSI group, University of Toronto

  2. Affinity Propagation • Clustering algorithm that works by finding a set of exemplars (prototypes) in the data and assigning the other data points to those exemplars [Frey07] • Input: pairwise similarities (e.g. negative squared error) and data point preferences (larger = more likely to be an exemplar) • Approximately maximizes the sum of similarities between points and their exemplars • Mechanism: message passing in a factor graph
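
The message-passing mechanism on slide 2 can be sketched with the standard responsibility and availability updates of [Frey07]. This is a minimal pure-Python sketch, not the authors' implementation: the function names, the damping value, and the fixed iteration count are assumptions made for illustration, and the similarities are assumed to be finite.

```python
def affinity_propagation(S, damping=0.5, iterations=200):
    """Run damped affinity propagation on a square similarity matrix S.

    S[i][k] is the similarity of point i to candidate exemplar k, and the
    diagonal S[k][k] holds the preference of point k. Requires len(S) >= 2
    and finite entries. Returns (R, A), the responsibility and availability
    matrices after the given number of iterations.
    """
    n = len(S)
    R = [[0.0] * n for _ in range(n)]
    A = [[0.0] * n for _ in range(n)]
    for _ in range(iterations):
        # Responsibility: r(i,k) = s(i,k) - max_{k' != k} [a(i,k') + s(i,k')]
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            best = max(range(n), key=scores.__getitem__)
            first = scores[best]
            second = max(v for k, v in enumerate(scores) if k != best)
            for k in range(n):
                rival = second if k == best else first
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - rival)
        # Availability:
        #   a(k,k) = sum_{i' != k} max(0, r(i',k))
        #   a(i,k) = min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k)))
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            others = sum(pos) - pos[k]
            for i in range(n):
                if i == k:
                    new = others
                else:
                    new = min(0.0, R[k][k] + others - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    return R, A


def assign_exemplars(S, R, A):
    """For each point i, return the k that maximizes a(i,k) + r(i,k)."""
    n = len(S)
    return [max(range(n), key=lambda k: A[i][k] + R[i][k]) for i in range(n)]
```

A typical call builds `S` from negative squared differences, puts a shared preference on the diagonal, and then reads the clustering from `assign_exemplars(S, *affinity_propagation(S))`. Point `i` is an exemplar when its own index is returned at position `i`. A fixed iteration count is used here for brevity; a convergence check on the exemplar set would be the usual refinement.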

  3. Semi-supervised Learning • Large amounts of unlabeled training data • A limited amount of side information: partial labels, equivalence constraints

  4. Some Motivating examples

  5. AP with partial labels • All points sharing the same label should be in the same cluster. • Points with different labels should not be in the same cluster. • Imposing constraints: via the similarity matrix, or via explicit function nodes

  6. Same label constraints • Set the similarity between all pairs of identically labeled points to the maximum. • Propagate the change to other points (teleportation). • Without teleportation, local neighborhoods do not ‘move closer’. • e.g. [Klein02] [Figure: points x1, x2, y1, y2, with S(x1,x2)=0]
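
One way to read slide 6 in code: same-label pairs get similarity 0 (the maximum when similarities are non-positive), and every other pair may then "teleport" through such a pair at no extra cost. The sketch below is an assumption about how the propagation could be done, in the spirit of the shortest-path idea of [Klein02]; the function name, the single-hop rule, and the use of `None` for unlabeled points are hypothetical.

```python
def apply_same_label_constraints(S, labels):
    """Return a copy of S with same-label similarities raised and propagated.

    S is a square matrix of non-positive similarities with preferences on
    the diagonal. labels[i] is a label, or None for an unlabeled point.
    Diagonal entries are left untouched.
    """
    n = len(S)
    out = [row[:] for row in S]
    portals = [i for i in range(n) if labels[i] is not None]
    linked = [(a, b) for a in portals for b in portals
              if a != b and labels[a] == labels[b]]

    # Step 1: identically labeled pairs get the maximal similarity, 0.
    for a, b in linked:
        out[a][b] = 0.0

    def hop(i, p):
        # Similarity of going from i to p; staying in place costs nothing.
        return 0.0 if i == p else S[i][p]

    # Step 2 (teleportation, assumed single hop): a pair (i, j) may travel
    # i -> a, jump for free from a to b, then b -> j, if that is better.
    for a, b in linked:
        for i in range(n):
            for j in range(n):
                if i != j:
                    out[i][j] = max(out[i][j], hop(i, a) + hop(b, j))
    return out
```

With points at 0.0, 0.1, 10.0 and 10.1 and the first and third sharing a label, the two unlabeled neighbors end up with a similarity close to -0.02 instead of roughly -100, which is the "moving closer" effect the slide describes.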

  7. Different labels • A similar trick still works: set the similarity between every pair of differently labeled points to the minimum. • But there is no equivalent notion of anti-teleportation. [Figure: points x1, x2]
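
The similarity trick on slide 7 can be sketched as follows, taking "minimal" to mean negative infinity. That choice, the function name, and the `None` convention for unlabeled points are assumptions for illustration.

```python
NEG_INF = float("-inf")


def apply_different_label_constraints(S, labels):
    """Return a copy of S where differently labeled pairs get -inf.

    labels[i] is a label, or None for an unlabeled point. Only direct
    pairs are changed; nothing is propagated to neighboring points.
    """
    n = len(S)
    out = [row[:] for row in S]
    for i in range(n):
        for j in range(n):
            both_labeled = labels[i] is not None and labels[j] is not None
            if both_labeled and labels[i] != labels[j]:
                out[i][j] = NEG_INF
    return out
```

This only stops a labeled point from choosing a differently labeled point as its exemplar. Two differently labeled points can still share a third point as their exemplar, which is why similarity edits alone do not enforce the constraint.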

  8. Adding explicit constraints to account for side-information

  9. Adding explicit constraints to account for side-information

  10. Problems • Let’s call all the labeled points portals • They induce the ability to teleport… • At test time, determining the label of a new point means finding its closest exemplar, possibly via all pairs of portals, which is expensive. • Adding a pairwise not-in-class node for each pair of differently labeled points is also expensive. • Introducing…

  11. Meta-Portals • An alternative way of propagating neighborhood information. • Meta-portals are ‘dummy’ points, constructed using the similarities of all portals of a certain label. • We add N new entries to the similarity matrix, where N is the number of unique labels.
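
The slides do not spell out how a meta-portal's similarities are computed, so the sketch below is only one plausible construction. It assumes that a meta-portal's similarity to an unlabeled point is the best similarity any portal of that label has to the point, that portals of the same label are maximally similar (0) to their meta-portal, and that portals of other labels and other meta-portals get negative infinity. The function name and the `mtp_preference` placeholder are hypothetical.

```python
NEG_INF = float("-inf")


def add_meta_portals(S, labels, mtp_preference=0.0):
    """Append one meta-portal row and column per unique label.

    S is an n x n similarity matrix; labels[i] is a label or None.
    Returns (augmented matrix, sorted list of labels). The meta-portal
    for classes[c] sits at index n + c.
    """
    n = len(S)
    classes = sorted({lab for lab in labels if lab is not None})
    size = n + len(classes)
    # Start from -inf so that meta-portal pairs are never similar.
    out = [[NEG_INF] * size for _ in range(size)]
    for i in range(n):
        for j in range(n):
            out[i][j] = S[i][j]
    for c, lab in enumerate(classes):
        m = n + c
        members = [p for p in range(n) if labels[p] == lab]
        for j in range(n):
            if labels[j] == lab:
                sim = 0.0            # portal of this label (assumed maximal)
            elif labels[j] is not None:
                sim = NEG_INF        # portal of another label (assumed)
            else:
                sim = max(S[p][j] for p in members)  # best portal for j
            out[m][j] = sim
            out[j][m] = sim
        # Placeholder only: this sketch does not model how the meta-portal
        # picks its own exemplar.
        out[m][m] = mtp_preference
    return out, classes
```

Under these assumptions, a new point needs one similarity per label rather than a search over all pairs of portals, which is the cost the previous slide raises.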

  12. Meta-portals • MTPs can be exemplars. • Unlike regular exemplars, MTPs can be exemplars for other points yet choose a different exemplar for themselves.

  13. These function nodes force the MTPs to choose other data points as their exemplars. Similarities alone are not enough, since two MTPs can choose the same exemplar even though the similarity between them is -inf.

  14. Some toy data results

  15. Future work • Investigate interplay between modifying similarities and incorporating explicit constraints. • Possible tool for user-guided labeling
