This overview surveys methods for image retrieval that integrate text and visual data. It covers bag-of-words models, visual words with spatial location, and part-based models, alongside discriminative methods for segmentation and recognition. Particular emphasis is placed on joint learning of text and images for large-scale retrieval, where noisily labelled web images are used to re-rank results from engines such as Google Image Search. Topics include image re-ranking, automatic collection of image databases from the web, and seminal work presented at ICCV and CVPR.
Agenda • Introduction • Bag-of-words models • Visual words with spatial location • Part-based models • Discriminative methods • Segmentation and recognition • Recognition-based image retrieval • Datasets & Conclusions
Retrieval domains • Internet image search • Video search for people/objects • Searching home photo collections
Learning from Internet Image Search • Joint learning of text and images • Large scale retrieval
Improving Google’s Image Search • Fergus, Fei-Fei, Perona, Zisserman, ICCV 2005 • Variant of pLSA that includes spatial information
Re-ranking result for the “motorbike” query: topics in the model, with the automatically chosen topic used for re-ranking
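The Fergus et al. model is a pLSA variant that additionally encodes where visual words fall in the image. As a rough sketch of the underlying machinery only, the snippet below runs plain pLSA with EM on a toy image-by-visual-word count matrix (the spatial term is omitted); images can then be re-ranked by how strongly their topic distribution P(z|d) loads on the automatically chosen topic. All data and names here are illustrative, not the authors' implementation.

```python
import numpy as np

def plsa(counts, n_topics, n_iters=50, seed=0):
    """Minimal pLSA via EM on a (images x visual words) count matrix.

    counts[d, w] = number of times visual word w occurs in image d.
    Returns P(w|z) and P(z|d). Plain pLSA only; the spatial variant of
    Fergus et al. additionally models word position.
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    p_w_z = rng.random((n_topics, n_words)); p_w_z /= p_w_z.sum(1, keepdims=True)
    p_z_d = rng.random((n_docs, n_topics));  p_z_d /= p_z_d.sum(1, keepdims=True)

    for _ in range(n_iters):
        # E-step: responsibilities P(z|d,w), shape (docs, topics, words)
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]
        resp = joint / (joint.sum(axis=1, keepdims=True) + 1e-12)
        # M-step: re-estimate P(w|z) and P(z|d) from expected counts
        weighted = counts[:, None, :] * resp           # n(d,w) * P(z|d,w)
        p_w_z = weighted.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + 1e-12
        p_z_d = weighted.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + 1e-12
    return p_w_z, p_z_d

# Toy usage: 6 images, 20 visual words, 2 topics; rank images by P(z*|d)
# for the chosen topic z*.
counts = np.random.default_rng(1).integers(0, 5, size=(6, 20)).astype(float)
p_w_z, p_z_d = plsa(counts, n_topics=2)
ranking = np.argsort(-p_z_d[:, 0])
```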
Animals on the Web • Berg and Forsyth, CVPR 2006 • Gather images using text search • Use LDA to discover “good” images, using features based on nearby text, shape, and color
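A minimal sketch of the text side of that idea, assuming scikit-learn and a few invented captions: LDA on words near each image yields topic proportions, and images whose text loads on the animal-like topic are favoured. The actual system also uses shape and colour features and a supervised step to select the good topics.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Hypothetical nearby text for images returned by a "monkey" query.
texts = [
    "wild monkey climbing a tree in the forest",
    "monkey eating fruit at the zoo",
    "buy cheap monkey costume halloween sale",
    "funny monkey cartoon clipart download",
]

counts = CountVectorizer(stop_words="english").fit_transform(texts)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)

# P(topic | document); documents loading on the "animal" topic are kept,
# and their images seed the visual ranking step.
doc_topics = lda.transform(counts)
print(np.round(doc_topics, 2))
```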
Bootstrapping of Image Search • Schroff, Zisserman, Criminisi, Harvesting Image Databases from the Web, ICCV 2007 • Images returned for the PENGUIN query • Removal of drawings and abstract images • Naïve Bayes ranking using noisy metadata • Train an SVM on the top-ranked images • Final ranking using the SVM
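A loose sketch of the two-stage re-ranking idea, assuming scikit-learn: a text-only naive Bayes ranker supplies noisy positives and negatives, and a visual SVM trained on them produces the final ranking. The drawing/abstract-image filter is omitted, all data are synthetic stand-ins, and the small labelled seed set here is a simplification (the original trains the text ranker without per-query hand labels).

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC

# Stand-ins: text_feats = bag-of-words counts from filenames / surrounding
# HTML, vis_feats = visual descriptors, seed_labels = small verified seed set.
rng = np.random.default_rng(0)
text_feats = rng.integers(0, 3, size=(200, 50))
vis_feats = rng.normal(size=(200, 128))
seed_labels = rng.integers(0, 2, size=200)

# Stage 1: rank all downloaded images with a text-only naive Bayes classifier.
nb = MultinomialNB().fit(text_feats, seed_labels)
text_scores = nb.predict_proba(text_feats)[:, 1]

# Stage 2: treat the top/bottom of that ranking as noisy positives/negatives
# and train a visual SVM, whose scores give the final re-ranking.
order = np.argsort(-text_scores)
pos, neg = order[:50], order[-50:]
X = np.vstack([vis_feats[pos], vis_feats[neg]])
y = np.concatenate([np.ones(50), np.zeros(50)])
svm = LinearSVC(C=1.0).fit(X, y)
final_ranking = np.argsort(-svm.decision_function(vis_feats))
```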
OPTIMOL • Li, Wang, Fei-Fei, CVPR 2007
Learning from Internet Image Search • Joint learning of text and images • Large scale retrieval
Matching Words and Pictures • Barnard, Duygulu, de Freitas, Forsyth, Blei, Jordan, JMLR 2003
Images to text • Use Blobworld or Normalized Cuts to segment images into regions • Need to deduce which of the labels attached to each image correspond to which regions
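A toy sketch of the simplest form of that correspondence idea, assuming scikit-learn: region descriptors are quantised into “blob” tokens with k-means, a co-occurrence table estimates P(word | blob), and a new image is annotated from its blobs. The data, vocabulary, and cluster counts are invented, and the paper's actual models (translation-style EM, hierarchical clustering) are considerably richer.

```python
import numpy as np
from sklearn.cluster import KMeans

# Stand-ins: each image is a set of region feature vectors (e.g. Blobworld
# colour/texture/shape descriptors) plus image-level keywords.
rng = np.random.default_rng(0)
images = [rng.normal(size=(rng.integers(3, 7), 16)) for _ in range(40)]
vocab = ["tiger", "grass", "water", "sky", "building"]
keywords = [list(rng.choice(vocab, size=2, replace=False)) for _ in images]

# Quantise regions into discrete "blob" tokens.
all_regions = np.vstack(images)
kmeans = KMeans(n_clusters=10, n_init=10, random_state=0).fit(all_regions)

# Co-occurrence table: every keyword of an image is credited to every blob
# token that image contains, then normalised into P(word | blob).
counts = np.zeros((10, len(vocab)))
for regions, words in zip(images, keywords):
    for b in kmeans.predict(regions):
        for w in words:
            counts[b, vocab.index(w)] += 1
p_word_given_blob = counts / (counts.sum(axis=1, keepdims=True) + 1e-12)

# Annotate a new image: average P(word | blob) over its blobs.
new_blobs = kmeans.predict(rng.normal(size=(5, 16)))
scores = p_word_given_blob[new_blobs].mean(axis=0)
print(sorted(zip(vocab, np.round(scores, 2)), key=lambda t: -t[1]))
```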
Names and Faces in the News • Berg, Berg, Edwards, Maire, White, Teh, Learned-Miller, Forsyth, CVPR 2004 • Collected 500,000 images and text captions from Yahoo! News • Find faces (standard face detector) and rectify them to the same pose • Perform Kernel PCA and Linear Discriminant Analysis (LDA) • Extract names from the text • Cluster faces, with each name corresponding to a cluster • Use a language model to refine the results
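A compressed sketch of the face-representation and clustering steps, assuming scikit-learn and synthetic stand-ins for the rectified faces and the noisy names extracted from captions; the face detection, rectification, name extraction, and language-model refinement are not shown.

```python
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.cluster import KMeans

# Stand-ins: rectified face images flattened to vectors, plus a (possibly
# wrong or ambiguous) name index taken from each caption.
rng = np.random.default_rng(0)
faces = rng.normal(size=(300, 400))            # 300 detected, rectified faces
caption_names = rng.integers(0, 5, size=300)   # noisy name label per face

# Kernel PCA to a compact face representation, then LDA with the noisy
# caption names as labels to obtain a name-discriminative space.
proj = KernelPCA(n_components=50, kernel="rbf").fit_transform(faces)
disc = LinearDiscriminantAnalysis(n_components=4).fit_transform(proj, caption_names)

# Cluster faces in the discriminative space; each cluster is associated with
# a name (and refined with a language model in the original paper).
clusters = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(disc)
```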
Learning from Internet Image Search • Joint learning of text and images • Large scale retrieval
Vocabulary tree • Nistér & Stewénius, CVPR 2006 • Hierarchical k-means tree in descriptor space • Inverted file for fast lookup of features • Specific object recognition, not category-level
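A toy two-level version of the idea, assuming scikit-learn: descriptors are quantised by descending a small hierarchical k-means tree into leaf “visual words”, an inverted file maps each word to the images containing it, and a query votes for database images through that file. The real system uses a much deeper/wider tree and TF-IDF weighted scoring, which this sketch omits; all data are random stand-ins for SIFT-like descriptors.

```python
import numpy as np
from collections import defaultdict
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
database = {i: rng.normal(size=(100, 128)) for i in range(20)}  # descriptors per image

# Build a two-level vocabulary tree with branch factor 4 (16 leaf words).
all_desc = np.vstack(list(database.values()))
level1 = KMeans(n_clusters=4, n_init=10, random_state=0).fit(all_desc)
level2 = {b: KMeans(n_clusters=4, n_init=10, random_state=0)
             .fit(all_desc[level1.labels_ == b]) for b in range(4)}

def quantize(desc):
    """Descend the tree: coarse branch, then leaf = visual word id."""
    branches = level1.predict(desc)
    leaves = np.array([level2[b].predict(d[None, :])[0]
                       for b, d in zip(branches, desc)])
    return branches * 4 + leaves

# Inverted file: visual word -> set of images containing it.
inverted = defaultdict(set)
for img_id, desc in database.items():
    for w in quantize(desc):
        inverted[w].add(img_id)

# Query: vote for database images that share visual words with the query.
votes = defaultdict(int)
for w in quantize(rng.normal(size=(100, 128))):
    for img_id in inverted[w]:
        votes[img_id] += 1
```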
Pyramid Match Hashing • Grauman & Darrell, CVPR 2007 • Combines the Pyramid Match Kernel (efficient computation of correspondences between two sets of vectors) with Locality Sensitive Hashing (LSH) [Indyk & Motwani 98] • Allows matching of the set of features in a query image to sets of features in other images in time that is sublinear in the number of images • Theoretical guarantees
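A minimal one-dimensional illustration of the pyramid match part: histogram intersections at successively finer bin sizes count the matches found at each resolution, with finer levels weighted more heavily. Grauman & Darrell additionally embed the pyramid into a vector so that standard LSH gives the sublinear lookup; that embedding and the hashing step are not shown, and the toy data are invented.

```python
import numpy as np

def pyramid_match(x, y, d_max=256, levels=8):
    """Minimal 1-D pyramid match score between two point sets x and y
    whose feature values lie in [0, d_max)."""
    prev_inter = 0.0
    score = 0.0
    for level in range(levels):
        bin_size = 2 ** level
        edges = np.arange(0, d_max + bin_size, bin_size)
        hx, _ = np.histogram(x, bins=edges)
        hy, _ = np.histogram(y, bins=edges)
        inter = np.minimum(hx, hy).sum()              # matches at this resolution
        score += (inter - prev_inter) / (2 ** level)  # new matches, down-weighted
        prev_inter = inter
    return score

a = np.random.default_rng(0).uniform(0, 256, size=40)
b = np.random.default_rng(1).uniform(0, 256, size=55)
print(pyramid_match(a, b))
```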
Semantic Hashing • Salakhutdinov and Hinton, SIGIR 2007 • Torralba, Fergus, Weiss, CVPR 2008 • Map images to compact binary codes • Hash codes for fast lookup
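A simplified stand-in for the lookup idea: Salakhutdinov & Hinton learn the binary codes with a deep autoencoder, whereas here random hyperplane signs produce the codes, which is enough to show the hash-table lookup over a small Hamming ball. All sizes, feature vectors, and the coding scheme are illustrative assumptions, not the papers' models.

```python
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(0)
n_bits, dim = 16, 512
planes = rng.normal(size=(dim, n_bits))   # random hyperplanes (stand-in for learned codes)

def binary_code(x):
    """Map a feature vector (e.g. a global image descriptor) to an n_bits code."""
    bits = (x @ planes > 0).astype(np.uint32)
    return int(bits @ (1 << np.arange(n_bits, dtype=np.uint32)))

# Build the hash table: code -> list of image ids.
features = rng.normal(size=(10_000, dim))
table = defaultdict(list)
for img_id, f in enumerate(features):
    table[binary_code(f)].append(img_id)

# Query: retrieve images with the exact code plus codes within Hamming distance 1.
q = binary_code(rng.normal(size=dim))
candidates = list(table[q])
for bit in range(n_bits):
    candidates.extend(table[q ^ (1 << bit)])
```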