
Similarity searching in image retrieval and annotation


Presentation Transcript


  1. Similarity searching in image retrieval and annotation • Petra Budíková

  2. Outline • Motivation • Image search applications • General image retrieval • Text-based approach • Content-based approach • Challenges and open problems • Multi-modal image retrieval • Comparative study of approaches • Our contributions • Current results, research directions • Automatic image annotation • Naive solution • Demo • Better solution (work in progress)

  3. Motivation • Explosion of digital data • Size - world’s digital data production: • 5 billion GB (2003) vs. 1 800 billion GB (2011) • Data is growing by a factor of 10 every five years. • Diversity • Availability of technologies => multimedia data • Images • Personal images • Flickr: 4 billion (2009), 6 billion (2011) • Facebook: 60 billion (2010), 140 billion (2011) • Scientific data • Surveillance data • …

  4. Motivation II • Applications of image searching • Collection browsing • Collection organization • Targeted search • Data annotation and categorization • Authentication • … (Slide figures: suggested tags for a violet flower photo such as purple, blue, plant, garden, petals; browsing a “Summer holiday 2011” collection of 1552 photos; a targeted “What is this?” query; an authentication check)

  5. General image retrieval • Basic approaches: • Attribute-based searching • Text-based searching • Content-based searching • Attribute-based searching • Size, type, category, price, location, … • Relational databases • Text-based searching • Image title, text of surrounding web page, Flickr tags, … • Mature text retrieval technologies • Basic assumption: additional metadata are available • Human participation mostly needed • Not realistic for many applications

  6. General image retrieval II • Content-based searching • Query by example • Similarity measure (distance function) • The optimal measure is unknown • Subjective, context-dependent • Should reflect semantics as well as visual features • State-of-the-art image representations • Reflect low-level visual features • Global image descriptors: MPEG-7 colors, shapes • Local image descriptors: SIFT, SURF • Semantic gap problem • In general, it is very difficult to extract semantics • Possible only in specialized applications, e.g. face search • The more sophisticated the image representation, the more costly the evaluation of distances
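The query-by-example idea can be illustrated with a minimal sketch: descriptors are assumed to be precomputed, fixed-length vectors (e.g. MPEG-7 color histograms), and plain Euclidean distance stands in for whatever similarity measure is actually chosen; the function name `knn_query` and the random toy data are illustrative only.

```python
# Minimal query-by-example sketch: rank a collection by the distance of
# global descriptors to the query descriptor (assumption: descriptors are
# precomputed fixed-length vectors; L2 stands in for the real measure).
import numpy as np

def knn_query(query_descriptor, descriptors, k=10):
    """Return indices and distances of the k images closest to the query."""
    diffs = descriptors - query_descriptor          # (N, D) broadcasted
    distances = np.linalg.norm(diffs, axis=1)       # one distance per image
    order = np.argsort(distances)[:k]
    return order, distances[order]

# Toy usage: 1000 random "images" with 64-dimensional descriptors.
rng = np.random.default_rng(0)
collection = rng.random((1000, 64))
query = rng.random(64)
top_ids, top_dists = knn_query(query, collection, k=5)
print(top_ids, top_dists)
```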

  7. General image retrieval III • Summary Observations: • Simple image descriptors -> semantic gap, not distinctive enough • Complex image descriptors -> extraction and evaluation not feasible • A single ideal descriptor does not exist • Current direction in image retrieval: multi-modal searching • Multiple similarity measures combined in an efficient way

  8. Multi-modal searching • Modalities: projections of data into search spaces • Global visual descriptors, local descriptors, text, category, … • Typical combinations • Text + local visual descriptor (Google, Bing) • Text + global visual descriptor (MUFIN) • Different global visual descriptors (MUFIN) • Visual descriptors + GPS • … • Advantages of multi-modal searching • More distinctive than a single modality • Simple text search vs. Google text search with PageRank • Allows flexible balancing of modalities • Better approximation of human understanding of similarity • Allows efficient implementation • Parallel processing of modalities • Iterative filtering of candidates
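A hedged sketch of the “flexible balancing of modalities” point: per-modality distances are first normalized so that neither modality dominates, then fused with a tunable weight. The min-max normalization and the weighted sum are generic illustrative choices, not necessarily the scheme used in MUFIN.

```python
# Sketch of balancing two modalities (illustrative, not MUFIN's actual scheme):
# normalize each modality's distances to [0, 1], then take a weighted sum.
def min_max_normalize(distances):
    lo, hi = min(distances), max(distances)
    return [(d - lo) / (hi - lo) if hi > lo else 0.0 for d in distances]

def combined_distance(d_text, d_visual, w_text=0.5):
    """Weighted sum of two already-normalized distances."""
    return w_text * d_text + (1.0 - w_text) * d_visual

# Toy usage: three candidate images, fused with weight 0.3 on the text modality.
text_d = min_max_normalize([0.1, 0.9, 0.4])
visual_d = min_max_normalize([120.0, 30.0, 80.0])
print([combined_distance(t, v, w_text=0.3) for t, v in zip(text_d, visual_d)])
```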

  9. Multi-modal searching II • Challenges • Selection of suitable modalities • Availability • Suitability for given dataset and application • Balancing of importance of individual modalities • Automatic • User-defined • Cross-modality information mining • Automatic • User-assisted • Efficient implementation of multi-modal retrieval

  10. Multi-modal searching III

  11. Multi-modal searching IV • Our focus • Let us suppose two modalities – text and global visual features • Frequently used • Available in web search applications • Only consider two-phase searching • Basic search over whole database • Postprocessing of basic search results • Categorize possible solutions • Implement & evaluate • Large-scale data processing • Analyze results

  12. Text-and-visual basic search • Single-modality basic search • Text: Lucene search engine • Visual: MESSIF content-based retrieval • Multi-modal basic search • (Diagram: query → basic search over the dataset → candidate objects → results postprocessing → result)
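The two-phase pipeline (basic search over the whole database, then postprocessing of the candidate set) might look like the following control-flow sketch; `text_search` and `visual_distance` are hypothetical callables standing in for the real Lucene and MESSIF calls.

```python
# Control-flow sketch of two-phase text-and-visual search. The callables are
# hypothetical stand-ins; only the phase structure mirrors the slides.
def two_phase_search(query_text, query_image, text_search, visual_distance,
                     candidate_size=1000, k=30):
    # Phase 1: basic search over the whole database with the text modality.
    candidates = text_search(query_text, candidate_size)
    # Phase 2: postprocess only the candidate set with the visual modality.
    return sorted(candidates,
                  key=lambda img: visual_distance(query_image, img))[:k]

# Toy usage over a fake five-image "database".
db = {f"img{i}": {"text": "flower" if i % 2 else "car", "feat": float(i)}
      for i in range(5)}
hits = two_phase_search(
    "flower", 2.0,
    text_search=lambda q, n: [i for i, o in db.items() if q in o["text"]][:n],
    visual_distance=lambda qf, img: abs(db[img]["feat"] - qf),
    k=3)
print(hits)  # images tagged "flower", ordered by closeness of the fake feature
```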

  13. Postprocessing • Types of ranking functions: • Orthogonal modality ranking • Rank by a modality other than the one(s) used for basic search • Fusion ranking • Merge multiple results of basic search • Differs from late fusion in the size of the merged sets • Pseudo-RF ranking • Some additional knowledge about the query object or similarity function is mined from the results of basic search • Interactive ranking • User provides additional information • Not considered in experiments
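As one concrete illustration of fusion ranking, the sketch below merges several basic-search result lists by giving each object reciprocal-rank credit in every list where it appears. This Borda-count-style aggregation is an assumption for illustration, not the exact fusion used in the experiments.

```python
# Fusion-ranking sketch (illustrative aggregation, not the experimental one):
# objects returned early by more basic searches accumulate more credit.
from collections import defaultdict

def fusion_rank(result_lists, k=30):
    scores = defaultdict(float)
    for results in result_lists:
        for rank, obj_id in enumerate(results):
            scores[obj_id] += 1.0 / (rank + 1)      # reciprocal-rank credit
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Toy usage: two result lists produced by two different basic searches.
print(fusion_rank([["a", "b", "c"], ["b", "d", "a"]], k=3))   # ['b', 'a', 'd']
```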

  14. Evaluation • Experiments • 6 basic search methods x 7 ranking methods x parameters • About 90 solutions for two-modal search :) • 100 query objects • Top-30 query for each method and query object • 2 datasets • Profimedia: 20M high-quality images with rich and precise annotations • Flickr: 20M images with user descriptions • Human evaluation of results relevance • Highly relevant / partially relevant / irrelevant • (Example queries: two coins, smiling face, zebra, cornfield, handwriting)
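The graded judgments above can be plugged into NDCG, the metric plotted on slide 16. A minimal sketch, assuming the standard log2 discount and grades 2 / 1 / 0 for highly / partially / not relevant:

```python
# Minimal NDCG sketch (standard log2 discount; grade scale is an assumption
# matching the three-level judgments: 2 = highly, 1 = partially, 0 = irrelevant).
import math

def dcg(relevances):
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

def ndcg_at_k(relevances, k=30):
    """NDCG of one ranked result list, given its graded relevance judgments."""
    ideal = dcg(sorted(relevances, reverse=True)[:k])
    return dcg(relevances[:k]) / ideal if ideal > 0 else 0.0

# Example: judgments for the top 7 results of one query, evaluated at k=5.
print(ndcg_at_k([2, 1, 0, 2, 1, 0, 0], k=5))
```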

  15. Preliminary results • Profimedia dataset results only • Best method: Text-based search + visual ranking • Google solution • Better than more complex fusion solutions • Collection with high-quality text • Semantics very important in queries • Ranking adds about 20% relevance • Text search vs. text search + visual ranking • Content-based search vs. content-based + text ranking

  16. Preliminary results II • Limitations of text-based approaches • Not enough relevant images with relevant keywords • Too broad semantic concept • Visual component crucial • (Plot: NDCG at varying k for the query text “bird”)

  17. Multi-modal search – future work • Text-and-visual search • Complete analysis of results • Determine conditions which influence usability of individual methods • Dataset properties • Query properties • Automatic recommendation of query processing • Multi-modal search in general • Combination of more than two modalities

  18. Annotation • Task • For a given image, retrieve relevant text information • Easier: relevant keywords • More difficult: relevant text (Wiki page, …) • Applications • Recommendation of tags in social networks • Classification • Method • Only the image is available – search by visual features is the only possibility • Exploit a dataset of images with textual information • Obtain a set of results; what can we do with these? • Simple solution: analyze keywords related to images in the similarity search result, return the most frequent ones • Advanced solution: analyze relationships between keywords
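The simple solution can be sketched directly from the description above: count the keywords attached to the retrieved similar images and return the most frequent ones. The `neighbors` data below is a made-up toy stand-in for the (image, keywords) pairs returned by the content-based search step.

```python
# Naive annotation sketch: most frequent keywords among visually similar images.
from collections import Counter

def naive_annotation(similar_images, top_n=5):
    counts = Counter()
    for _image_id, keywords in similar_images:
        counts.update(kw.lower() for kw in keywords)   # case-insensitive count
    return [kw for kw, _ in counts.most_common(top_n)]

# Toy example: three retrieved neighbors with their user tags.
neighbors = [("img1", ["flower", "purple", "garden"]),
             ("img2", ["Flower", "iris", "garden"]),
             ("img3", ["flower", "nature"])]
print(naive_annotation(neighbors))   # ['flower', 'garden', ...]
```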

  19. Annotation – simple solution • MUFIN Image Annotation plugin for Firefox • (Screenshot: suggested tags for a violet flower photo – purple, blue, color, plant, beauty, nature, garden, petals, hydrangea, weed – based on similar images such as “Cornflower”, “Iris in the botanical garden”, “Unknown violet flower”)

  20. Annotation – simple solution II • Limitations • Relevance of results found by content-based retrieval • Semantic gap • Quality of source data • Spelling mistakes, different languages, names, stopwords, … • Natural language features • Synonyms • Hypernyms, homonyms • Noun vs. verb • … • Possible solutions • Consistency checking over the results • Source text cleaning • Advanced text processing
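A possible shape of the “source text cleaning” step, offered only as an assumption: lowercase the tags and drop stopwords and non-alphabetic tokens before any counting. The stopword list here is a tiny illustrative placeholder; a real system would use full per-language lists.

```python
# Illustrative keyword cleaning for the "source text cleaning" idea.
STOPWORDS = {"the", "a", "an", "of", "in", "my", "to", "and", "from"}

def clean_keywords(keywords):
    cleaned = []
    for kw in keywords:
        kw = kw.strip().lower()
        if kw.isalpha() and kw not in STOPWORDS:   # drop stopwords, punctuation, numbers
            cleaned.append(kw)
    return cleaned

print(clean_keywords(["Photo", "from", "my", "trip", "to", "highlands", "Iris"]))
# -> ['photo', 'trip', 'highlands', 'iris']
```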

  21. Annotation – advanced solution • Employ knowledge-base to learn about semantics • WordNet: lexical database of English • Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept. • Synsets are interlinked by means of conceptual-semantic and lexical relations • Dataset preprocessing • Determine the correct synsets for keywords in the dataset • Analysis of keywords related to the same image • The correct synsets should be “near” in the WordNet relationships graph • Annotation process (work in progress) • Retrieve similar objects • Analyze relationships between synsets • Synsets found: beagle, dog, terrier -> there’s a dog in the image
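One way the synset-disambiguation step could be prototyped is with NLTK's WordNet interface (an implementation assumption; the slide does not name a library): for each keyword, pick the noun synset that is closest in the WordNet graph to the synsets of the keywords co-occurring on the same image.

```python
# Synset-disambiguation sketch using NLTK's WordNet (assumption: requires
# `pip install nltk` and a one-time `nltk.download('wordnet')`).
from nltk.corpus import wordnet as wn

def best_synset(keyword, context_keywords):
    candidates = wn.synsets(keyword, pos=wn.NOUN)
    context = [s for w in context_keywords for s in wn.synsets(w, pos=wn.NOUN)]
    if not candidates:
        return None
    if not context:
        return candidates[0]
    # Score each candidate by its best path similarity to any context synset.
    def score(synset):
        return max(synset.path_similarity(c) or 0.0 for c in context)
    return max(candidates, key=score)

# Example in the spirit of the slide: beagle + dog + terrier -> the dog sense.
print(best_synset("beagle", ["dog", "terrier"]))
```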

  22. For more information… … visit mufin.fi.muni.cz http://mufin.fi.muni.cz/profimedia collection browsing and targeted search in a 20M image collection http://mufin.fi.muni.cz/annotation info about annotation, demo, plugin download
