
Distance Metric Learning from Uncertain Side Information with Application to Automated Photo Tagging

Distance Metric Learning from Uncertain Side Information with Application to Automated Photo Tagging. Lei Wu *†, Steven C.H. Hoi *, Rong Jin #, Jianke Zhu ‡, Nenghai Yu †


Presentation Transcript


  1. Distance Metric Learning from Uncertain Side Information with Application to Automated Photo Tagging Lei Wu *†, Steven C.H. Hoi *, Rong Jin #, Jianke Zhu ‡, Nenghai Yu † *Nanyang Technological University, †University of Sci. & Tech. of China, #Michigan State University, ‡ETH Zurich

  2. BACKGROUND • Annotation/tagging is essential to making images accessible to Web users • Billions of images on the Web lack proper annotation/tags • Automatic image annotation has been actively studied in the multimedia community

  3. BACKGROUND (cont’) • Social media data on social websites carry rich tagging information provided by Web users • Can we resolve the challenge of automated photo annotation by leveraging the large and growing amount of rich social media data?

  4. BACKGROUND (cont’) • Annotation by Search from Social Images • [Figure: example social images with their user tags, e.g. "Hawk, Bird, Sky, Eagle, …", "Sun, Bird, Sky, Blue, …", "Sun, Cloud, Hawk, Fly, …", "Bird, Fly, White, Cloud, …"]

  5. MOTIVATION • Annotation by Search • Find similar images in a social image DB • Annotate the image with the most frequent tags • Research Challenges • Visual feature representation • Tag data mining • Scalable search & indexing • Distance/similarity measure • Distance Metric Learning

  6. MOTIVATION (cont’) • Related Work on Automated Photo Tagging • Built a large collection of web images with ground-truth labels to support object recognition research (Russell et al. 2008) • A fast search-based approach to image annotation using an efficient hashing technique (Wang et al. 2006) • Used visual and text modalities simultaneously in clustering images (Rege et al. 2008) • Efficient image search and scene matching techniques for exploring a large-scale web image repository (Torralba et al. 2008) • Learning-based method for improving the efficiency of manual image annotation (Yan et al. 2008) • Note: these approaches adopt Hamming or Euclidean distance

  7. MOTIVATION (cont’) • Distance Measure • Hamming distance • Euclidean distance • Mahalanobis distance • Distance Metric Learning • Learning to optimize the metric M • Side Information (a.k.a. “Pairwise Constraints”) • Similar pairs S(x1, x2): x1 and x2 belong to the same category • Dissimilar pairs D(x1, x2): x1 and x2 belong to different categories
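
The role of the metric M can be illustrated with the standard Mahalanobis form d_M(x1, x2) = sqrt((x1 - x2)^T M (x1 - x2)), where M is positive semidefinite. The following is a minimal sketch, not taken from the slides; the function name and the toy vectors are illustrative assumptions.

```python
import numpy as np

def mahalanobis_distance(x1, x2, M):
    """Return sqrt((x1 - x2)^T M (x1 - x2)) for a positive semidefinite M."""
    diff = np.asarray(x1, dtype=float) - np.asarray(x2, dtype=float)
    return float(np.sqrt(diff @ M @ diff))

# Toy vectors (hypothetical, for illustration only).
x1 = np.array([1.0, 2.0])
x2 = np.array([4.0, 6.0])

# With M = I the distance is the ordinary Euclidean distance.
d_euclid = mahalanobis_distance(x1, x2, np.eye(2))

# A different M reweights the features; here the second feature is ignored.
d_scaled = mahalanobis_distance(x1, x2, np.diag([1.0, 0.0]))
```

Distance metric learning amounts to choosing M from data so that similar pairs end up close and dissimilar pairs end up far apart under this distance.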

  8. MOTIVATION (cont’) • Related Work on Distance Metric Learning • Probabilistic Global Distance Metric Learning (PGDM) (Xing et al. 2002) • Neighbourhood Components Analysis (NCA) (Goldberger et al. 2005) • Relevance Component Analysis (RCA) (Bar-Hillel et al. 2005) • Discriminative Component Analysis (DCA) (Hoi et al. 2006) • Large Margin Nearest Neighbor (LMNN) (Weinberger et al. 2006) • Regularized Distance Metric Learning (RDML) (Si et al. 2006) • Information-Theoretic Metric Learning (ITML) (Davis et al. 2007) • Note: these methods assume that clean side information is given explicitly

  9. MOTIVATION (cont’) • Annotation by Search from Social Media • NO explicit pairwise side information available • But rich information is available with social images • Ideas of our research • To discover implicit pairwise relationships between social images via a probabilistic approach • To learn effective distance metrics from uncertain side information that is discovered from social images implicitly

  10. METRIC LEARNING FRAMEWORK FOR AUTOMATED PHOTO TAGGING • Overview of Our Approaches • Discovery of probabilistic side information • A Graphical Model Approach • Learning distance metrics from probabilistic side information • A Probabilistic RCA Method • Automated photo tagging by applying the optimized metric in visual similarity search

  11. METRIC LEARNING FRAMEWORK FOR AUTOMATED PHOTO TAGGING

  12. Latent Chunklet Estimation for Probabilistic Side Info. • Problem Formulation • Latent Chunklets • i.e., the hidden topics • Assumption • Both visual images and associated textual metadata are generated from the hidden topic • Calculation • Multi-modal hidden topic analysis

  13. Graphical Model for Latent Chunklet Estimation • Visual Modality • Hidden Topic • Text Modality

  14. Graphical Model for Latent Chunklet Estimation (cont’) • Generation Process • Inference • Probabilistic Side Info., as Prior Prob. Matrix

  15. Probabilistic Distance Metric Learning • Problem Definition and Notations • Probabilistic Side Info.: • Centers/Means for the Latent Chunklets • Membership Probability • Given the estimation of latent chunklets P0, how to formulate the DML problem to find the optimal metric M? • Propose an extension of RCA with Prob. Side Info.

  16. Probabilistic Relevance Component Analysis (pRCA) • The objective function of pRCA: • Minimize the sum of squared distances of examples from their chunklets’ centers • Regularization term preventing the trivial solution • Corollary 1. When the chunklet means μ and the matrix of probability assignments P are fixed (assuming hard assignments of 0 and 1), the Probabilistic Relevance Component Analysis (pRCA) formulation reduces to regular RCA learning.
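
The slide's objective equation is not preserved in this transcript, so the following is only a sketch of the idea behind Corollary 1. It assumes the standard RCA solution, in which M is the inverse of the within-chunklet covariance, and weights each example's contribution by its membership probability. The function name `soft_rca_metric`, the ridge term `eps`, and the toy data are assumptions, not the authors' formulation.

```python
import numpy as np

def soft_rca_metric(X, P, mu, eps=1e-6):
    """Inverse of the P-weighted within-chunklet scatter matrix.

    X:  (n, d) feature vectors
    P:  (n, K) membership probabilities of examples in chunklets
    mu: (K, d) chunklet centers
    """
    n, d = X.shape
    C = np.zeros((d, d))
    for k in range(mu.shape[0]):
        diff = X - mu[k]                       # deviations from center k
        C += (P[:, k, None] * diff).T @ diff   # probability-weighted scatter
    C /= P.sum()
    C += eps * np.eye(d)   # small ridge (an assumption) keeps C invertible
    return np.linalg.inv(C)

# Toy data with hard 0/1 assignments, where this reduces to plain RCA.
X = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 1.0], [10.0, 3.0]])
P = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
mu = np.array([[1.0, 0.0], [10.0, 2.0]])
M = soft_rca_metric(X, P, mu)
```

With the hard assignments above, the weighted scatter equals the ordinary average within-chunklet covariance (0.5 times the identity here), so M is approximately twice the identity, matching what regular RCA would return.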

  17. Probabilistic Relevance Component Analysis (pRCA) • Iterative algorithm • Fixing P and μ to optimize M: • Fixing M and μ to optimize P: • Fixing P and M to find μ:
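
The three update equations are missing from this transcript. The sketch below only illustrates the alternating structure (block coordinate updates in the order listed on the slide) with substituted, generic update rules: an inverse weighted scatter for M, a prior-weighted softmax over negative squared distances for P, and probability-weighted means for μ. All three rules, the function name, and the toy data are assumptions and should not be read as the pRCA updates themselves.

```python
import numpy as np

def alternating_metric_sketch(X, P0, n_iter=10, eps=1e-6):
    """Alternate over M, P and mu, holding the other two fixed each time."""
    n, d = X.shape
    prior = P0 / P0.sum(axis=1, keepdims=True)   # prior membership matrix
    P = prior.copy()
    mu = (P.T @ X) / (P.sum(axis=0)[:, None] + eps)
    M = np.eye(d)
    for _ in range(n_iter):
        diff = X[:, None, :] - mu[None, :, :]    # (n, K, d) deviations
        # Fix P and mu, update M (assumed rule: inverse weighted scatter).
        C = np.einsum('nk,nki,nkj->ij', P, diff, diff) / P.sum()
        M = np.linalg.inv(C + eps * np.eye(d))
        # Fix M and mu, update P (assumed rule: prior-weighted softmax).
        dist2 = np.einsum('nki,ij,nkj->nk', diff, M, diff)
        logits = np.log(prior + eps) - 0.5 * dist2
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        P = np.exp(logits)
        P /= P.sum(axis=1, keepdims=True)
        # Fix P and M, update mu (assumed rule: probability-weighted means).
        mu = (P.T @ X) / (P.sum(axis=0)[:, None] + eps)
    return M, P, mu

# Hypothetical toy data: two well-separated blobs with noisy soft priors.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (20, 2)), rng.normal(6.0, 1.0, (20, 2))])
P0 = np.full((40, 2), 0.3)
P0[:20, 0] = 0.7
P0[20:, 1] = 0.7
M, P, mu = alternating_metric_sketch(X, P0)
```

The point of the sketch is the loop structure: each step solves for one block of variables while the other two stay fixed, so uncertain initial memberships can be refined together with the metric.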

  18. Probabilistic Relevance Component Analysis (pRCA) • pRCA Algorithm

  19. Automated Photo Tagging • Query image • Steps of Auto Photo Tagging via Search • Distance/Similarity Measure • To retrieve a set of visually similar social photos • Set of k-Nearest Neighbor Images • Set of images with distance less than some threshold

  20. Automated Photo Tagging (cont’) • Annotating the query photo with the relevant tags associated with the set of similar images • A tag is preferred if it has a higher frequency among the set of similar social images • A tag is preferred if its associated social images are visually more similar to the query photo • Our tagging approach combines: • Frequency of tag w among the retrieved social images • Average distance from the query photo to the tag’s associated social images
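
The scoring formula itself is not preserved in this transcript. A minimal sketch of the two stated preferences follows, assuming a score equal to the tag's frequency divided by its average distance to the query; that particular combination, the function name `tag_query`, and the toy database are assumptions made for illustration.

```python
import numpy as np
from collections import defaultdict

def tag_query(q, X, tags, M, k=30, t=10, eps=1e-9):
    """Rank tags of the k nearest database images under the metric M."""
    diff = X - q
    dist = np.sqrt(np.einsum('ni,ij,nj->n', diff, M, diff))
    neighbors = np.argsort(dist)[:k]          # k nearest social images
    freq = defaultdict(int)
    dist_sum = defaultdict(float)
    for i in neighbors:
        for w in set(tags[i]):
            freq[w] += 1
            dist_sum[w] += dist[i]
    # Assumed score: higher frequency and smaller average distance are better.
    scores = {w: freq[w] / (dist_sum[w] / freq[w] + eps) for w in freq}
    return sorted(scores, key=lambda w: (-scores[w], w))[:t]

# Hypothetical toy database of four images with two-dimensional features.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]])
tags = [["bird", "sky"], ["bird"], ["sky", "cloud"], ["car"]]
top_tags = tag_query(np.array([0.0, 0.0]), X, tags, np.eye(2), k=3, t=3)
```

In this toy case "bird" ranks first because it appears twice among the three neighbors and its images are closest to the query, while "car" is never retrieved.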

  21. EXPERIMENTS • Experimental Testbed • 205,442 photos from Flickr in total • Distance Metric Learning: 16,588 photos + tags • Knowledge Database: 186,854 photos + tags • Query Images: 2,000 random photos • Compared Schemes: • Relevance Component Analysis (RCA) • Discriminative Component Analysis (DCA) • Information-Theoretic Metric Learning (ITML) • Large Margin Nearest Neighbor (LMNN) • Neighbourhood Components Analysis (NCA) • Regularized Distance Metric Learning (RDML)

  22. EXPERIMENTS (cont’) • Settings: • 500 latent chunklets • 1,000 visual words • 10,000 tags • Learning rate γ=0.5 • Top k nearest photos, k=30 • Top t relevant tags for annotation, t=1,…,10

  23. Average Precision • Fixed the number of nearest neighbors k to 30 for all compared methods

  24. Average Recall • Fixed the number of nearest neighbors k to 30 for all compared methods

  25. Precision-Recall Curves • Fixed the number of nearest neighbors k to 30 for all compared methods
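
The slides do not spell out how precision and recall are computed. The sketch below assumes the usual definitions over the top t predicted tags of one query, which would then be averaged over all query photos; the function name and the example tags are hypothetical.

```python
def precision_recall_at_t(predicted, relevant, t):
    """Precision and recall of the top-t predicted tags for one query photo."""
    top = predicted[:t]
    hits = sum(1 for w in top if w in relevant)
    precision = hits / len(top) if top else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical example: ranked predictions versus ground-truth tags.
predicted = ["bird", "sky", "cloud", "car"]
relevant = {"bird", "cloud", "eagle"}
p_at_2, r_at_2 = precision_recall_at_t(predicted, relevant, t=2)
```

Sweeping t from 1 to 10, as in the experimental settings, would give one precision-recall point per value of t.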

  26. Empirical Observations • DML techniques are beneficial and critical to retrieval-based photo tagging tasks • The pRCA algorithm considerably outperformed the other approaches in most cases • In some cases, some DML methods performed poorly, even worse than the Euclidean baseline • Noisy (uncertain) side information issue • Robustness is important to DML

  27. Evaluation of Varied k and t • Examine the annotation performance of pRCA by varying the value of k from 10 to 50

  28. Empirical Observations • The number of nearest neighbors k can influence the annotation performance • In our case, k = 30 generally gives better performance than the other values • If k is too large, many noisy tags may be included, as there may not be many relevant images in the database • If k is too small, some relevant tags may not appear, which again may degrade the performance

  29. Time Cost for Metric Learning • Evaluate the time efficiency of the proposed DML algorithm on the same dataset • Findings • The most efficient method is the regular RCA approach • The most time-consuming one is NCA • pRCA is quite competitive: slower than RCA, DCA, and RDML, but considerably faster than ITML, LMNN, and NCA

  30. Some Good Examples

  31. Some Poor Examples

  32. CONCLUSIONS • Contributions: • Study DML from uncertain side information that exploits probabilistic side information • Propose a two-step probabilistic distance metric learning (PDML) framework • Present an effective probabilistic RCA (pRCA) algorithm • Apply the algorithm to the auto photo annotation by search task • Encouraging results showed that our technique is effective and promising

  33. Future Work • To improve visual feature representation, especially for annotating objects • To expand the scale of the database • To improve large-scale search & indexing • To filter spam and irrelevant tags • To adopt users’ feedback to improve automated tagging performance on APT

  34. Q&A • More information is available: Http://www.cais.ntu.edu.sg/~chhoi/APT/ • Online demo of Auto Photo Tagging (APT) is available: Http://msm.cais.ntu.edu.sg/APT/ • Contact: WU Lei, leiwu@live.com; Steven CH Hoi, CHHoi@ntu.edu.sg • School of Computer Engineering, Nanyang Technological University, Singapore 639798 • Email: chhoi@ntu.edu.sg • Tel: (+65) 6513-8040 • Fax: (+65) 6792-6559 • Http://www.ntu.edu.sg/home/chhoi/

  35. GRAPHICAL MODEL FOR LATENT CHUNKLET ESTIMATION • Inference • Joint probability on documents and topics • Conditional probability on tags, visual words and topics • Gibbs sampling estimation

  36. Appendix
