Distance Metric Learning from Uncertain Side Information with Application to Automated Photo Tagging. Lei Wu *†, Steven C.H. Hoi *, Rong Jin #, Jianke Zhu ‡, Nenghai Yu †
Distance Metric Learning from Uncertain Side Information with Application to Automated Photo Tagging Lei Wu*†, Steven C.H. Hoi*, Rong Jin#, Jianke Zhu‡, Nenghai Yu† *Nanyang Technological University, †University of Sci. & Tech. of China, #Michigan State University, ‡ETH Zurich
BACKGROUND • Annotation/tagging is essential to making images accessible to Web users • Billions of images on the Web lack proper annotation/tags • Automatic image annotation has been actively studied in the multimedia community
BACKGROUND (cont’) • Social media data on social websites carry rich tagging information provided by Web users • Can we address the challenge of automated photo annotation by leveraging the huge and growing amount of rich social media data?
BACKGROUND (cont’) • Annotation by Search from Social Images [Figure: example social images, each shown with its user tags: "Hawk, Bird, Sky, Eagle, …"; "Sun, Bird, Sky, Blue, …"; "Sun, Cloud, Hawk, Fly, …"; "Bird, Fly, White, Cloud, …"]
MOTIVATION • Annotation by Search • Find similar images in the social image DB • Annotate the image with the most frequent tags of those similar images • Research Challenges • Visual feature representation • Tag data mining • Scalable search & indexing • Distance/similarity measure • Distance Metric Learning
MOTIVATION (cont’) • Related Work on Automated Photo Tagging • Built a large collection of web images with ground truth labels to support object recognition research (Russell et al. 2008) • A fast search-based approach to image annotation using an efficient hashing technique (Wang et al. 2006) • Utilized visual and text modalities simultaneously in clustering images (Rege et al. 2008) • Efficient image search and scene matching techniques for exploring a large-scale web image repository (Torralba et al. 2008) • Learning-based method for improving the efficiency of manual image annotation (Yan et al. 2008) • These approaches adopt the Hamming or Euclidean distance
MOTIVATION (cont’) • Distance Measure • Hamming distance • Euclidean distance • Mahalanobis distance • Distance Metric Learning • Learning to optimize the metric M • Side Information (a.k.a. “Pairwise Constraints”) • Similar pairs S(x1, x2): x1 and x2 belong to the same category • Dissimilar pairs D(x1, x2): x1 and x2 belong to different categories
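As an illustration of the Mahalanobis distance named on this slide, here is a minimal sketch in plain Python. It uses the standard definition d_M(x1, x2) = sqrt((x1 - x2)^T M (x1 - x2)); the example vectors and the matrix `stretched` are made-up values, not taken from the talk. With M set to the identity, the distance reduces to the Euclidean distance.

```python
import math

def mahalanobis(x1, x2, M):
    """Distance sqrt((x1 - x2)^T M (x1 - x2)); M is a square matrix as nested lists."""
    diff = [a - b for a, b in zip(x1, x2)]
    # Matrix-vector product M @ diff, computed row by row.
    m_diff = [sum(m_ij * d_j for m_ij, d_j in zip(row, diff)) for row in M]
    return math.sqrt(sum(d_i * md_i for d_i, md_i in zip(diff, m_diff)))

x1, x2 = [1.0, 2.0], [4.0, 6.0]
identity = [[1.0, 0.0], [0.0, 1.0]]    # M = I gives the Euclidean distance
stretched = [[4.0, 0.0], [0.0, 1.0]]   # hypothetical learned metric (illustrative values)

d_euclid = mahalanobis(x1, x2, identity)
d_learned = mahalanobis(x1, x2, stretched)
```

Distance metric learning amounts to choosing M so that similar pairs end up close and dissimilar pairs end up far apart under this distance.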
MOTIVATION (cont’) • Related Work on Distance Metric Learning • Probabilistic Global Distance Metric Learning (PGDM) (Xing et al. 2002) • Neighbourhood Components Analysis (NCA) (Goldberger et al. 2005) • Relevance Component Analysis (RCA) (Bar-Hillel et al. 2005) • Discriminative Component Analysis (DCA) (Hoi et al. 2006) • Large Margin Nearest Neighbor (LMNN) (Weinberger et al. 2006) • Regularized Distance Metric Learning (RDML) (Si et al. 2006) • Information-Theoretic Metric Learning (ITML) (Davis et al. 2007) • All of these assume that clean side information is given explicitly
MOTIVATION (cont’) • Annotation by Search from Social Media • NO explicit pairwise side information available • But rich information is available with social images • Ideas of our research • To discover implicit pairwise relationships between social images via a probabilistic approach • To learn effective distance metrics from the uncertain side information discovered implicitly from social images
METRIC LEARNING FRAMEWORK FOR AUTOMATED PHOTO TAGGING • Overview of Our Approaches • Discovery of probabilistic side information • A Graphical Model Approach • Learning distance metrics from probabilistic side information • A Probabilistic RCA Method • Automated photo tagging by applying the optimized metric in visual similarity search
Latent Chunklet Estimation for Probabilistic Side Info. • Problem Formulation • Latent Chunklets • i.e., the hidden topics • Assumption • Both visual images and associated textual metadata are generated from the hidden topics • Calculation • Multi-modal hidden topic analysis
Graphical Model for Latent Chunklet Estimation [Figure: graphical model; labels: Visual Modality, Hidden Topic, Text Modality]
Graphical Model for Latent Chunklet Estimation (cont’) • Generation Process • Inference • Probabilistic Side Info., as Prior Prob. Matrix
Probabilistic Distance Metric Learning • Problem Definition and Notations • Probabilistic Side Info.: • Centers/Means for the Latent Chunklets • Membership Probability • Given the estimation of latent chunklets P0, how to formulate the DML problem to find the optimal metric M? • Propose an extension of RCA with Prob. Side Info.
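To make the link between membership probabilities and pairwise side information concrete, the sketch below derives a pairwise "same chunklet" probability from a membership matrix. This is an illustration only: it assumes that `P0[i][k]` is the probability that image i belongs to latent chunklet k and that memberships of different images are independent, which the slides do not state. The function name and the numbers are hypothetical.

```python
def same_chunklet_probability(P0):
    """P0[i][k]: probability that image i belongs to latent chunklet k (rows sum to 1).

    Under the independence assumption, the probability that images i and j
    fall in the same chunklet is sum_k P0[i][k] * P0[j][k].
    """
    n = len(P0)
    return [[sum(p * q for p, q in zip(P0[i], P0[j])) for j in range(n)]
            for i in range(n)]

# Illustrative memberships for three images and two chunklets.
P0 = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.9],
]
S = same_chunklet_probability(P0)
```

Images 0 and 1 get a high value (a likely "similar" pair) while images 0 and 2 get a low one, which is the kind of soft, uncertain constraint that replaces the hard similar/dissimilar pairs of standard DML.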
Probabilistic Relevance Component Analysis (pRCA) • The objective function of pRCA: Minimize [formula not preserved; its annotated terms are the sum of squared distances of examples from their chunklets' centers, and a regularization term preventing the trivial solution] • Corollary 1. When the chunklet means μ and the matrix of probability assignments P are fixed (assuming hard assignments of 0 and 1), the Probabilistic Relevance Component Analysis (pRCA) formulation reduces to regular RCA learning.
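The formula on this slide did not survive extraction. The block below is a reconstruction by analogy with RCA, not the authors' exact formulation: it only assumes the two annotated ingredients (a membership-weighted sum of squared Mahalanobis distances to the chunklet centers, and a regularizer that rules out the trivial solution M = 0). The log-determinant regularizer and the weight \lambda are assumptions.

```latex
\min_{M \succeq 0,\; \mu,\; P} \;
\underbrace{\sum_{i=1}^{n} \sum_{k=1}^{K} p_{ik}\,
  (x_i - \mu_k)^{\top} M \,(x_i - \mu_k)}_{\text{squared distances to chunklet centers}}
\;-\;
\underbrace{\lambda \log\det M}_{\text{prevents } M \to 0}
\qquad \text{s.t.} \quad p_{ik} \ge 0,\;\; \sum_{k=1}^{K} p_{ik} = 1 .
```

Under this assumed form, fixing \mu and a hard 0/1 assignment P and setting the gradient with respect to M to zero gives M = \lambda \big(\sum_{i,k} p_{ik}(x_i-\mu_k)(x_i-\mu_k)^{\top}\big)^{-1}, i.e. M is proportional to the inverse within-chunklet scatter matrix, which is the RCA solution and is consistent with Corollary 1.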
Probabilistic Relevance Component Analysis (pRCA) • Iterative algorithm • Fixing P and μ to optimize M: • Fixing M and μ to optimize P: • Fixing P and M to find μ:
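The three update formulas are not preserved in this transcript. The sketch below shows one way such an alternating scheme can look; it is not the authors' algorithm. Assumptions: the M-step uses the RCA-style closed form (inverse of the P-weighted within-chunklet covariance, with a small ridge term for invertibility), the P-step reweights the prior P0 by exp(-squared distance) and renormalizes, and the μ-step takes P-weighted means. All names and data are hypothetical.

```python
import math

def invert(A):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(A)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        p = aug[c][c]
        aug[c] = [v / p for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def sq_dist(x, mu, M):
    """Squared Mahalanobis distance (x - mu)^T M (x - mu)."""
    d = [a - b for a, b in zip(x, mu)]
    return sum(d[i] * M[i][j] * d[j] for i in range(len(d)) for j in range(len(d)))

def update_M(X, P, mu, ridge=1e-6):
    """Fix P and mu: M = inverse of the P-weighted within-chunklet covariance (assumed)."""
    n, dim = len(X), len(X[0])
    C = [[ridge * (i == j) for j in range(dim)] for i in range(dim)]
    for x, p_row in zip(X, P):
        for p, m in zip(p_row, mu):
            d = [a - b for a, b in zip(x, m)]
            for i in range(dim):
                for j in range(dim):
                    C[i][j] += p * d[i] * d[j] / n
    return invert(C)

def update_P(X, P0, mu, M):
    """Fix M and mu: reweight the prior P0 by exp(-squared distance), renormalize (assumed)."""
    P = []
    for x, prior in zip(X, P0):
        d2 = [sq_dist(x, m, M) for m in mu]
        d_min = min(d2)  # shift exponents for numerical stability
        w = [p0 * math.exp(-(d - d_min)) for p0, d in zip(prior, d2)]
        total = sum(w)
        P.append([v / total for v in w])
    return P

def update_mu(X, P):
    """Fix P and M: each center is the P-weighted mean of the examples."""
    K, dim = len(P[0]), len(X[0])
    mu = []
    for k in range(K):
        mass = sum(p_row[k] for p_row in P)
        mu.append([sum(p_row[k] * x[j] for x, p_row in zip(X, P)) / mass
                   for j in range(dim)])
    return mu

# Toy data: two groups of 2-D points with a noisy prior membership matrix P0.
X = [[0.0, 0.0], [0.2, 1.0], [0.1, -1.0], [3.0, 0.1], [3.2, 1.1], [2.9, -0.9]]
P0 = [[0.8, 0.2]] * 3 + [[0.2, 0.8]] * 3
P = [row[:] for row in P0]
mu = update_mu(X, P)
for _ in range(10):
    M = update_M(X, P, mu)
    P = update_P(X, P0, mu, M)
    mu = update_mu(X, P)
```

On this toy data the memberships sharpen toward the two visible groups. With hard 0/1 memberships and fixed means, `update_M` alone is the ordinary RCA computation.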
Probabilistic Relevance Component Analysis (pRCA) • pRCA Algorithm
Automated Photo Tagging • Query image • Steps of Auto Photo Tagging via Search • Distance/Similarity Measure • To retrieve a set of visually similar social photos • Set of k-Nearest Neighbor Images • Set of images with distance less than some threshold
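The two retrieval options on this slide (k nearest neighbors, or all images within a distance threshold) can be sketched as follows. This is a brute-force illustration with hypothetical names and toy features; the `dist` argument defaults to the Euclidean distance and is where a learned metric would be plugged in.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def k_nearest(query, database, k, dist=euclidean):
    """Return the k (distance, image_id) pairs closest to the query."""
    scored = sorted((dist(query, feat), image_id) for image_id, feat in database.items())
    return scored[:k]

def within_radius(query, database, radius, dist=euclidean):
    """Return all (distance, image_id) pairs with distance below the threshold."""
    scored = sorted((dist(query, feat), image_id) for image_id, feat in database.items())
    return [(d, image_id) for d, image_id in scored if d < radius]

# Toy database of visual feature vectors keyed by a made-up image id.
database = {"img_a": [0.0, 0.0], "img_b": [1.0, 0.0], "img_c": [5.0, 5.0]}
query = [0.2, 0.0]

knn = k_nearest(query, database, k=2)
ball = within_radius(query, database, radius=0.5)
```

A linear scan like this does not scale to hundreds of thousands of photos; it only shows the two selection rules.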
Automated Photo Tagging (cont’) • Annotating the query photo with the relevant tags associated with the set of similar images • A tag is preferred if it has a higher frequency among the set of similar social images • A tag is preferred if its associated social images are visually more similar to the query photo • Our tagging approach combines two quantities: the frequency of tag w among the retrieved social images, and the average distance from the query photo to the tag’s associated social images
EXPERIMENTS • Experimental Testbed • 205,442 photos from Flickr in total • Distance Metric Learning: 16,588 photos + tags • Knowledge Database: 186,854 photos + tags • Query Images: 2,000 random photos • Compared Schemes: • Relevance Component Analysis (RCA) • Discriminative Component Analysis (DCA) • Information-Theoretic Metric Learning (ITML) • Large Margin Nearest Neighbor (LMNN) • Neighbourhood Components Analysis (NCA) • Regularized Distance Metric Learning (RDML)
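The exact scoring formula is not preserved in this transcript. The sketch below only illustrates the two stated preferences (higher tag frequency, smaller average distance); the way they are combined here, frequency divided by average distance, is an assumption, and the function name and data are hypothetical.

```python
from collections import defaultdict

def rank_tags(neighbors, top_t, eps=1e-9):
    """neighbors: list of (distance_to_query, tags) for the retrieved social images."""
    freq = defaultdict(int)
    dist_sum = defaultdict(float)
    for distance, tags in neighbors:
        for tag in set(tags):          # count each tag once per image
            freq[tag] += 1
            dist_sum[tag] += distance
    scores = {}
    for tag in freq:
        avg_dist = dist_sum[tag] / freq[tag]
        # Assumed combination: frequent tags on visually close images score highest.
        scores[tag] = freq[tag] / (avg_dist + eps)
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_t]

neighbors = [
    (0.2, ["bird", "sky"]),
    (0.4, ["bird", "hawk"]),
    (0.9, ["sky", "cloud"]),
]
top_tags = rank_tags(neighbors, top_t=2)
```

Here "bird" and "sky" both appear twice, but "bird" ranks first because its images are closer to the query.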
EXPERIMENTS (cont’) • Settings: • 500 latent chunklets • 1,000 visual words • 10,000 tags • Learning rate γ=0.5 • Top k nearest photos, k=30 • Top t relevant tags for annotation, t=1,…,10
Average Precision • Fixed the number of nearest neighbors k to 30 for all compared methods
Average Recall • Fixed the number of nearest neighbors k to 30 for all compared methods
Precision-Recall Curves • Fixed the number of nearest neighbors k to 30 for all compared methods
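For reference, precision and recall of the top t predicted tags, averaged over query photos, can be computed as in the sketch below. It assumes the usual definitions (precision = correct tags among the top t divided by t, recall = correct tags among the top t divided by the number of ground-truth tags); the slides do not spell out the exact protocol, and the data are made up.

```python
def precision_recall_at_t(predicted, relevant, t):
    """Precision and recall of the top-t predicted tags for one query photo."""
    top = predicted[:t]
    hits = sum(1 for tag in top if tag in relevant)
    precision = hits / len(top) if top else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

def average_over_queries(results, t):
    """results: list of (ranked predicted tags, set of ground-truth tags) per query."""
    pairs = [precision_recall_at_t(pred, rel, t) for pred, rel in results]
    n = len(pairs)
    return sum(p for p, _ in pairs) / n, sum(r for _, r in pairs) / n

results = [
    (["bird", "sky", "car"], {"bird", "sky", "hawk"}),
    (["sun", "tree", "cloud"], {"cloud"}),
]
# One (average precision, average recall) point per value of t.
curve = [average_over_queries(results, t) for t in (1, 2, 3)]
```

Sweeping t (1 to 10 in the reported settings) yields the points of a precision-recall curve.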
Empirical Observations • DML techniques are beneficial and critical to retrieval-based photo tagging tasks • In general, the pRCA algorithm considerably outperformed the other approaches in most cases • In some cases, some DML methods performed poorly, even worse than the Euclidean baseline • Noisy (uncertain) side information issue • Robustness is important to DML
Evaluation of Varied k and t • Examine the annotation performance of pRCA by varying the value of k from 10 to 50
Empirical Observations • The number of nearest neighbors k can influence the annotation performance • In our case, k = 30 generally gives better performance than the other values • If k is too large, many noisy tags may be included, as there may not be many relevant images in the database • If k is too small, some relevant tags may not appear, which again may degrade the performance
Time Cost for Metric Learning • To evaluate the time efficiency of the proposed DML algorithm on the same dataset • Findings • The most efficient method is the regular RCA approach • The most time-consuming one is NCA • pRCA is quite competitive: slower than RCA, DCA, and RDML, but considerably faster than ITML, LMNN, and NCA
CONCLUSIONS • Contributions: • Study DML from uncertain side information that exploits probabilistic side information • Propose a two-step probabilistic distance metric learning (PDML) framework • Present an effective probabilistic RCA (pRCA) algorithm • Apply the algorithm to the auto photo annotation by search task • Encouraging results showed that our technique is effective and promising
Future Work • To improve visual feature representation, especially for annotating objects • To expand the scale of the database • To improve large-scale search & indexing • To filter spam and irrelevant tags • To adopt users’ feedback to improve automated tagging performance in APT
Q&A • More information is available: Http://www.cais.ntu.edu.sg/~chhoi/APT/ • Online demo of Auto Photo Tagging (APT) is available: Http://msm.cais.ntu.edu.sg/APT/ • Contact: WU Lei leiwu@live.com Steven CH Hoi CHHoi@ntu.edu.sg School of Computer Engineering, Nanyang Technological University, Singapore 639798 Email: chhoi@ntu.edu.sg Tel: (+65) 6513-8040 Fax: (+65) 6792-6559 Http://www.ntu.edu.sg/home/chhoi/
GRAPHICAL MODEL FOR LATENT CHUNKLET ESTIMATION • Inference • Joint probability on documents and topics • Conditional probability on tags, visual words and topics • Gibbs sampling estimation