PSEUDO-RELEVANCE FEEDBACK FOR MULTIMEDIA RETRIEVAL

PSEUDO-RELEVANCE FEEDBACK FOR MULTIMEDIA RETRIEVAL 2011-11709 SeoSeokJun

Abstract • Video information retrieval • Finding info. relevant to query • Approach • Pseudo-relevance feedback • Negative PRF

Questions • How does this paper approach content-based video retrieval? • What is the advantage of negative PRF? • What does this paper do to remove extreme outliers?

Introduction • Content-based access to video info. • Content-based video retrieval (CBVR) • Allows users to query and retrieve based on audio and video content • Limitations • Captures only fairly low-level physical features • Color, texture, shape, … • Difficult to determine similarity metrics • Different query scenarios -> different similarity metrics • Animals -> by shape • Sky, water -> by color

Introduction • Making the similarity metric adaptive • Adapting the similarity metric • Automatically discover the discriminating feature subspace • How? • Cast retrieval as a classification problem • Margin-based classifiers • SVMs, AdaBoost • High performance • Learn the maximal-margin hyperplane • Problem: a user’s query provides only a small set of positive examples and no explicit negative examples at all

Introduction • Thus, more training data is needed to use these classifiers • Negative examples • Random sampling • Works because the number of positive examples in a collection is very small • Risk: positive examples might be included as negatives • In standard relevance feedback • Ask the user to label results • Tedious! • Automatic retrieval is essential!

Introduction • Automatic relevance feedback • Based on a generic similarity metric, not tailored to specific queries • Negative feedback -> sample the bottom-ranked examples • Ex) car query -> bottom-ranked examples differ from the query images in “shape” • Feed back the negative data • Re-weight the features • Refine the discriminating feature subspace • A learning algorithm should do better than a universal similarity metric (used for all queries)

Introduction • Learning process • Purpose • Discover a better similarity metric • Find the most discriminating subspace between positive and negative examples • Cannot produce fully accurate classification • Training data is too small • Negative distribution -> not reliable! • Risk! -> feedback based on an incorrect estimate • Solution: combine with the generic similarity metric

Related work • Briefly discusses some features of the complete system • The Informedia Digital Video Library • Relevance and Pseudo-Relevance Feedback

Pseudo-Relevance Feedback • Similar to relevance feedback • Both originated in document retrieval • Works without any user intervention • Few studies in multimedia retrieval yet • Can no longer assume the top-ranked results are always relevant • Due to the relatively poor performance of visual retrieval

Pseudo-Relevance Feedback • Positive-example-based learning • Partially supervised learning • Begins with a small # of positive examples • No negative examples • Goal: associate all examples in the collection with one of the given categories • Our goal? • Producing a ranked list of the examples

Pseudo-Relevance Feedback • Semi-supervised learning • Two classifiers • Training set of labeled data • Working set of unlabeled data • Transductive learning • A paradigm for utilizing the info. in unlabeled data • Successful in image retrieval • But computation is too expensive • Multimedia -> large collections

Pseudo-Relevance Feedback • Query: text + audio + image/video • Retrieving a set of relevant video shots • Result: a permutation of the video shots • Sorted by their similarity to the query • Difference (between two video segments) -> similarity metric • Video features • Multiple perspectives • Speech transcript, audio, camera motion, video frame

Pseudo-Relevance Feedback • Retrieval as a classification problem • The data collection can be separated into pos/neg • Mean average precision • Precision and recall are common measures • But they do not take rank into consideration • MAP: area under an ideal recall/precision curve
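To make the rank-sensitivity of mean average precision concrete, here is a minimal sketch of non-interpolated average precision. The function names and the boolean-list input format are assumptions chosen for illustration, not something defined in the slides.

```python
def average_precision(ranked_relevance, total_relevant=None):
    """Non-interpolated average precision for one ranked result list.

    ranked_relevance: list of booleans, True where the item at that rank
    is relevant. total_relevant: number of relevant items in the whole
    collection; defaults to the number of relevant items in the list.
    """
    if total_relevant is None:
        total_relevant = sum(ranked_relevance)
    if total_relevant == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            # Precision measured at the rank of each relevant item.
            precision_sum += hits / rank
    return precision_sum / total_relevant


def mean_average_precision(runs):
    """Mean of the per-query average precision values."""
    return sum(average_precision(run) for run in runs) / len(runs)


# Same set precision and recall, different ranking quality:
good = [True, False, True, False]
bad = [False, True, False, True]
print(average_precision(good), average_precision(bad))
```

Both lists contain two relevant items out of four, so set-based precision and recall are identical, yet average precision rewards the list that places the relevant items earlier.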

Pseudo-Relevance Feedback • PRF • Users’ judgments are replaced by the output of a base similarity metric • f_b: base similarity metric • p: sampling strategy • f_l: learning algorithm • g: combination strategy
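The four components listed on this slide can be wired together as below. This is only a sketch of the general flow: `prf_rank` and all four toy components are hypothetical stand-ins on 1-D data (the learner here is a simple centroid comparison, not an SVM), chosen so the example runs without external libraries.

```python
def prf_rank(collection, query_examples, base_metric, sample, learn, combine):
    """Rank a collection with one round of pseudo-relevance feedback."""
    # Base similarity metric: generic, query-independent scoring.
    base_scores = [base_metric(x, query_examples) for x in collection]
    # Sampling strategy: pick pseudo-labelled training data from the base ranking.
    positives, negatives = sample(collection, base_scores, query_examples)
    # Learning algorithm: fit a query-specific scoring function.
    learned = learn(positives, negatives)
    learned_scores = [learned(x) for x in collection]
    # Combination strategy: merge base and learned scores.
    final_scores = combine(base_scores, learned_scores)
    return sorted(range(len(collection)),
                  key=lambda i: final_scores[i], reverse=True)


# Toy 1-D components (assumptions, for illustration only).
def base_metric(x, queries):
    return -min(abs(x - q) for q in queries)

def sample(collection, base_scores, queries, k=2):
    worst = sorted(range(len(collection)), key=lambda i: base_scores[i])[:k]
    return list(queries), [collection[i] for i in worst]

def learn(positives, negatives):
    pos_mean = sum(positives) / len(positives)
    neg_mean = sum(negatives) / len(negatives)
    # Higher score = closer to the positives than to the negatives.
    return lambda x: abs(x - neg_mean) - abs(x - pos_mean)

def combine(base_scores, learned_scores):
    return [b + l for b, l in zip(base_scores, learned_scores)]


collection = [0.9, 0.0, 1.0, 0.2, 0.1]
ranking = prf_rank(collection, [0.05], base_metric, sample, learn, combine)
print(ranking)  # [1, 4, 3, 0, 2]
```

Passing the components as callables keeps the feedback loop separate from any particular choice of metric, sampler, or learner.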

Pseudo-Relevance Feedback

Algorithm Details • Base similarity metric • Dissimilarity of x to the query examples q1,…,qn • A score is computed for each frame • But the retrieval unit is the shot (multiple frames) • Use the maximal frame score within a shot • Sampling strategies • Positive feedback taken from speech-transcript retrieval • Due to the high precision of textual retrieval
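The frame-to-shot step can be sketched as follows. The slide does not say how the dissimilarity to several query examples is aggregated, so taking the minimum Euclidean distance is an assumption here, as are the function names and the toy 2-D feature vectors.

```python
import math


def frame_dissimilarity(frame, query_images):
    """Distance from one frame's feature vector to its closest query image.

    Assumption: Euclidean distance, aggregated over queries with min().
    """
    return min(math.dist(frame, q) for q in query_images)


def shot_score(shot_frames, query_images):
    """Score a shot by its best-matching frame.

    Similarity is taken as negative dissimilarity, so the shot score is
    the maximal frame score within the shot.
    """
    return max(-frame_dissimilarity(f, query_images) for f in shot_frames)


query_images = [(0.0, 1.0)]
shot_a = [(0.0, 0.0), (5.0, 5.0)]   # one close frame, one distant frame
shot_b = [(3.0, 3.0)]               # a single moderately distant frame
print(shot_score(shot_a, query_images), shot_score(shot_b, query_images))
```

Taking the maximum means a shot is judged by its single best frame, so one close match is enough to rank the shot highly even if its other frames are dissimilar.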

Algorithm Details • Classification algorithm • SVMs • Output converted to a posterior probability • Scores are linearly normalized • Final score = g(f_b, f_l): a weighted sum of the two scores • A combination factor sets their relative weight
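A minimal sketch of the normalize-then-combine step follows. Min-max scaling of both score lists and the placement of the combination factor on the base score are assumptions; the slide only states that scores are linearly normalized and summed with a combination factor.

```python
def linear_normalize(scores):
    """Linearly rescale scores to the range [0, 1] (min-max scaling)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores identical: no ranking information to preserve.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def combine_scores(base_scores, learned_scores, weight=1.0):
    """Weighted sum of normalized base and learned scores.

    `weight` plays the role of the combination factor; applying it to
    the base score is an assumption made for this sketch.
    """
    base = linear_normalize(base_scores)
    learned = linear_normalize(learned_scores)
    return [weight * b + l for b, l in zip(base, learned)]


print(linear_normalize([2.0, 4.0, 6.0]))             # [0.0, 0.5, 1.0]
print(combine_scores([0.0, 10.0], [3.0, 1.0]))       # [1.0, 1.0]
print(combine_scores([0.0, 10.0], [3.0, 1.0], 2.0))  # [1.0, 2.0]
```

Normalizing first puts both scores on the same scale, so the combination factor alone controls how much each source contributes.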

Algorithm Details • Combination with text retrieval • Externally provided video summaries are a source of textual information • Posterior probability set to 1 if the keyword exists • Final score: sum of three weighted terms • Includes the posterior prob. of transcript retrieval and of video summary retrieval • Each term has its own weight • In the experiments • Two of the weights = 1, the third = 0.2 • Whole video as a unit -> too coarse to be accurate

Pseudo-Relevance Feedback • Positive examples • Query examples • Negative examples • Strongest negative examples • Feedback only one time • Due to computational cost • Automatically feed back training data based on the generic similarity metric • To learn an adaptive similarity metric • Generalize the discriminating subspace for various queries

Pseudo-Relevance Feedback • Why does it work? • Good generalization ability of margin-based learning algorithms • The assumption of an isotropic data distribution -> invalid • Discriminating directions vary with different queries and topics • Sky -> color • Car -> shape • In this case, PRF provides a better similarity metric than the generic one
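To illustrate how a margin-based learner can pick out the discriminating direction, the sketch below trains a linear classifier on the hinge loss with plain sub-gradient descent. It is a stand-in for an SVM, not the implementation used in the paper, and the two-feature (color, shape) toy data, the function name, and the hyperparameters are all assumptions.

```python
def train_linear_hinge(samples, labels, lr=0.1, reg=0.01, epochs=500):
    """Full-batch sub-gradient descent on the L2-regularized hinge loss."""
    dim = len(samples[0])
    w = [0.0] * dim
    b = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(samples, labels):
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            if margin < 1.0:  # only margin violators contribute
                for j in range(dim):
                    grad_w[j] -= y * x[j]
                grad_b -= y
        w = [wj - lr * (reg * wj + gj / n) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Hypothetical "car" query: features are (centered color, shape).
# Color is spread identically in both classes; only shape separates them.
positives = [(-0.3, 0.9), (0.3, 0.9), (0.0, 1.0)]
negatives = [(-0.3, 0.1), (0.3, 0.1), (0.0, 0.0)]
samples = positives + negatives
labels = [1, 1, 1, -1, -1, -1]

w, b = train_linear_hinge(samples, labels)
print("color weight:", w[0], "shape weight:", w[1])
```

On this symmetric toy data the color weight stays at zero while the shape weight grows, so the learned score ranks by shape alone, whereas a fixed isotropic distance would weight both features equally.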

Pseudo-Relevance Feedback • Two test cases • Positive data • Along the edge of the data collection • At the center of the data collection • In both cases • PRF is superior • Base similarity metric: generic metric • Cannot be modified across queries

Pseudo-Relevance Feedback

Pseudo-Relevance Feedback • The PRF metric can adapt to the global data distribution and the training data • By feeding back the negative examples • Near-optimal decision boundary • Assigns higher scores to examples farther away from the negative data • Good when positive data are near the margin • Common in high-dimensional spaces

Pseudo-Relevance Feedback • Downside • Some negative outliers are assigned a higher score than any positive data -> more false alarms • Solution • Combine the base metric and the PRF metric • Smooths out most of the outliers • Just a simple linear combination (1:1) • Reasonable trade-off between local classification behavior and global discriminating ability

Experiment • Video: TREC Video Retrieval Track • Text: NIST • 40 hours of MPEG-1 video • Audio: split from the video • Down-sampled to 16 kHz, 16-bit samples • Speech recognition system • Broadcast news transcripts • Image processing side • Low-level image features: color and texture • Queries given as XML

Experiment

Results

Results

Conclusion • Retrieval cast as a classification task • Applies machine learning theory to video retrieval • SVMs learn to weight the discriminating features • Negative PRF • Separates the means of the distributions of the neg. and pos. examples • Smoothing by combination with the base metric
