Question Answering from Errorful Multimedia Streams
ARDA AQUAINT
Finding Better Answers in Video Using Pseudo Relevance Feedback
Informedia Project, Carnegie Mellon University
Outline
• Pseudo-Relevance Feedback for Imagery
• Experimental Evaluation
• Results
• Conclusions
Motivation
• Question Answering from multimedia streams
  • Questions contain text and visual components
  • Want a good image that represents the ‘answer’
• Improve performance of images retrieved as answers
• Relevance feedback works for text retrieval!
What is Pseudo Relevance Feedback
• Relevance Feedback (human intervention)
  QUERY → SYSTEM → RESULTS → HUMAN relevance judgment → feedback to SYSTEM
• Why Pseudo? Feedback without human intervention
  QUERY → SYSTEM → RESULTS → feedback to SYSTEM
Original System Architecture
• A simple weighted linear combination of video, audio, and text retrieval scores
• Diagram: Query (Text, Image) → Retrieval Agents → Text Score, Image Score → Final Score
System Architecture with PRF
• New step:
  • Classification through Pseudo Relevance Feedback (PRF)
  • Combine with all other information agents (text, image)
• Diagram: Query (Text, Image) → Retrieval Agents → Text Score, Image Score, PRF Score → Final Score
Classification from Modified PRF
• Automatic retrieval technique
• Modification: use negative data as feedback
• Step-by-step
  • Run base retrieval algorithm on image collection
    • K-Nearest neighbor (KNN) on color and texture
  • Build classifier
    • Negative examples: least relevant images in the collection
    • Positive examples: image queries
  • Classify all data in the collection to obtain ranked results
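The first two steps above (base retrieval, then choosing negative examples) can be sketched as follows. This is a minimal illustration, not the project's implementation: it assumes each image is already reduced to a short numeric feature vector (for instance a color histogram), it scores a target by its distance to the single nearest query example rather than a full KNN vote, and the names `base_scores` and `pick_negatives` are hypothetical.

```python
import math

def base_scores(queries, targets):
    """Score each target by similarity to its closest query example.

    Assumption: similarity is the negated Euclidean distance between
    feature vectors, so a smaller distance gives a higher score.
    """
    scores = []
    for t in targets:
        nearest = min(math.dist(t, q) for q in queries)
        scores.append(-nearest)
    return scores

def pick_negatives(scores, n_neg):
    """Return the indices of the n_neg lowest-scoring (least relevant) targets."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return order[:n_neg]

# Toy 2-dimensional "feature vectors" standing in for color/texture features.
queries = [[0.9, 0.1], [0.8, 0.2]]
targets = [[0.85, 0.15], [0.1, 0.9], [0.5, 0.5], [0.0, 1.0]]

scores = base_scores(queries, targets)
negatives = pick_negatives(scores, n_neg=2)
print(negatives)
```

The query examples then serve as positives and the selected targets as negatives when the classifier is trained.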
The Basic PRF Algorithm for Image Retrieval
Input
  Query Examples $q_1 \ldots q_n$
  Target Examples $t_1 \ldots t_n$
=========================
Output
  Final score $F_i$ and final ranking for every target $t_i$
=========================
Algorithm
  Given initial score $s_i^0$ for each $t_i$ based on $f^0(t_i, q_1 \ldots q_n)$
    Using an initial similarity measure $f^0$ as a base
  Iterate $k = 1 \ldots max$
    Given score $s_i^k$, sample positive instances $p_i^k$ and negative instances $n_i^k$ using sampling strategy $S$
    Compute updated retrieval score $s_i^{k+1} = f_i^{k+1}(t_i)$, where $f_i^{k+1}$ is trained/learned using $n_i^k, p_i^k$
  Combine all scores for final score $F_i = g(s^0 \ldots s^{max})$
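A minimal sketch of this loop, under several assumptions that the slide leaves open: the base measure f0 is negated distance to the nearest query example, the sampling strategy S takes the query examples as positives and the lowest-scored targets as negatives, the learned scorer is a simple nearest-centroid rule (a stand-in for whatever classifier is actually trained), and the combination g is a plain average over rounds. All function names are hypothetical.

```python
import math

def base_score(t, queries):
    # f0 (assumed): negated distance to the nearest query example.
    return -min(math.dist(t, q) for q in queries)

def train_centroid_scorer(positives, negatives):
    """Learn a scorer from positive and negative examples.

    Assumed stand-in classifier: score = distance to the negative
    centroid minus distance to the positive centroid.
    """
    def centroid(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]
    pos_c, neg_c = centroid(positives), centroid(negatives)
    return lambda t: math.dist(t, neg_c) - math.dist(t, pos_c)

def prf_rank(queries, targets, n_neg=2, max_iter=2):
    scores = [base_score(t, queries) for t in targets]  # initial scores
    history = [scores]
    for _ in range(max_iter):
        # Sampling strategy S (assumed): positives are the queries,
        # negatives are the n_neg lowest-scored targets so far.
        order = sorted(range(len(targets)), key=lambda i: scores[i])
        negatives = [targets[i] for i in order[:n_neg]]
        scorer = train_centroid_scorer(queries, negatives)
        scores = [scorer(t) for t in targets]  # updated scores
        history.append(scores)
    # g (assumed): average the scores from all rounds.
    final = [sum(s[i] for s in history) / len(history)
             for i in range(len(targets))]
    ranking = sorted(range(len(targets)), key=lambda i: final[i], reverse=True)
    return final, ranking

queries = [[0.9, 0.1], [0.8, 0.2]]
targets = [[0.85, 0.15], [0.1, 0.9], [0.5, 0.5], [0.0, 1.0]]
final, ranking = prf_rank(queries, targets)
print(ranking)
```

Averaging raw scores from different rounds is crude because the base and learned scores are on different scales; normalizing the scores before combining them would be a natural refinement.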
Evaluation using the 2002 TREC Video Retrieval Task
• Independent collection, queries, relevant results available
• Search Collection
  • Total Length: 40.16 hours
  • MPEG-1 format
  • Collected from Internet Archive and Open Video websites, documentaries from the ’50s
  • 14,000 shots
  • 292,000 I-frames (images)
• Query
  • 25 queries
  • Text, Image (Optional), Video (Optional)
Analysis of Queries (2002)
• Specific item or person
  • Eddie Rickenbacker, James Chandler, George Washington, Golden Gate Bridge, Price Tower in Bartlesville, OK
• Specific fact
  • Arch in Washington Square Park in NYC, map of continental US
• Instances of a category
  • football players, overhead views of cities, one or more women standing in long dresses
• Instances of events/activities
  • people spending leisure time at the beach, one or more musicians with audible music, crowd walking in an urban environment, locomotive approaching the viewer
Sample Query and Target
Query: Find pictures of Harry Hertz, Director of the National Quality Program, NIST
Target
• Speech: We’re looking for people that have a broad range of expertise that have business knowledge that have knowledge on quality management on quality improvement and in particular …
• OCR: H,arry Hertz a Director aro 7 wa-,i,,ty Program,Harry Hertz a Director
Combination of Agents
• Multiple Agents
  • Text Retrieval Agent
  • Base Image Retrieval Agent
    • Nearest Neighbor on Color
    • Nearest Neighbor on Texture
  • Classification PRF Agent
• Combination of multiple agents
  • Convert scores to posterior probability
  • Linear combination of probabilities
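The two combination steps can be illustrated with a short sketch. The slide does not say how scores are mapped to posterior probabilities, so the logistic mapping below, its `slope` and `offset` parameters, the agent weights, and the toy scores are all assumptions chosen for illustration.

```python
import math

def to_probability(score, slope=1.0, offset=0.0):
    """Map a raw agent score into (0, 1) with a logistic function.

    Assumption: in practice the slope and offset would be fitted per agent.
    """
    return 1.0 / (1.0 + math.exp(-(slope * score + offset)))

def combine(agent_scores, weights):
    """Weighted linear combination of per-agent probabilities.

    agent_scores: dict mapping agent name -> list of raw scores, one per shot.
    weights: dict mapping agent name -> non-negative weight.
    """
    total = sum(weights.values())
    n_shots = len(next(iter(agent_scores.values())))
    combined = [0.0] * n_shots
    for agent, scores in agent_scores.items():
        w = weights[agent] / total  # normalize so the weights sum to 1
        for i, s in enumerate(scores):
            combined[i] += w * to_probability(s)
    return combined

# Toy raw scores for three shots from three hypothetical agents.
agent_scores = {
    "text": [2.0, -1.0, 0.0],
    "image": [0.5, 0.0, -2.0],
    "prf": [1.0, -0.5, -1.0],
}
weights = {"text": 0.5, "image": 0.2, "prf": 0.3}

combined = combine(agent_scores, weights)
best_shot = max(range(len(combined)), key=lambda i: combined[i])
print(best_shot)
```

Because every agent's output lies in (0, 1) before weighting, no single agent dominates merely because its raw scores have a larger range.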
2002 Results *Video OCR was not relevant in this collection
Discussion & Future Work
• Discussion
  • Results are sensitive to queries with small numbers of answers
  • Images alone cannot fully represent the query semantics
• Future Work
  • Incorporate more agents
  • Utilize the relationship between multiple agent information
  • Better combination scheme
  • Include web image search (e.g. Google) as query expansion
Conclusions
• Pseudo-relevance feedback works for text retrieval
• Standard PRF is not directly applicable to image retrieval from video because precision in the top-ranked answers is low
• Negative PRF was effective for finding better images