Searching for Logo and Trademark Images on the Web

Euripides G.M. Petrakis*, Epimenidis Voutsakis*, Evangelos Milios**. *Technical University of Crete, Chania, Greece. **Dalhousie University, Halifax, Canada.


Presentation Transcript


  1. Searching for Logo and Trademark Images on the Web • Euripides G.M. Petrakis*, Epimenidis Voutsakis*, Evangelos Milios** • *Technical University of Crete, Chania, Greece • **Dalhousie University, Halifax, Canada

  2. Retrieval of Logos & Trademarks • Important characteristic signs of corporate Web sites and of the products presented there • Comprise 32.6% of the total number of images on the Web • Retrieval of logos and trademarks is of significant commercial interest • E.g., detection of unauthorized usage http://www.intelligence.tuc.gr/intellisearch

  3. Image Retrieval on the Web • Text queries: keywords, free text • Answers: images in Web pages with similar text • Answers are not always relevant, or are relevant but not important • Important: from corporate Web sites and organizations • Less important: from individuals and small companies • Link analysis: assign higher ranking to answers from important Web sites

  4. Contributions • Enhance accuracy of retrievals • Support queries by image example • Preference to images from important Web sites • Evaluation of state-of-the-art methods • Retrieval by text • Retrieval by image content • Retrieval by importance • Combination of the above

  5. Image Content Representation • Text surrounding images in Web pages • Image filename, alternate text, page title, caption • Image features computed on intensity and energy histograms • Mean and variance of the histograms • Moment invariants on raw images • Number of distinct intensity levels

  6. Histograms • Intensity spectrum: distribution of intensity values • Energy spectrum: distribution of average energy over concentric rings of the DFT
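
The two histograms on this slide can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: the function names, the number of rings, the ring boundaries, and the naive DFT are all assumptions made for the example, and the image is assumed to be a list of rows of integer gray levels.

```python
import cmath
import math

def intensity_histogram(img, levels=256):
    """Normalized distribution of gray levels (img: rows of ints in [0, levels))."""
    hist = [0.0] * levels
    for row in img:
        for v in row:
            hist[v] += 1.0
    total = sum(hist)
    return [h / total for h in hist]

def distinct_levels(img):
    """Number of distinct intensity levels present in the image."""
    return len({v for row in img for v in row})

def mean_and_variance(hist):
    """Mean and variance of a histogram, treating it as a distribution over bins."""
    total = sum(hist)
    probs = [h / total for h in hist]
    mean = sum(i * p for i, p in enumerate(probs))
    var = sum((i - mean) ** 2 * p for i, p in enumerate(probs))
    return mean, var

def dft2(img):
    """Naive 2-D DFT, O(N^4): acceptable for tiny demo images only."""
    h, w = len(img), len(img[0])
    out = [[0j] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            s = 0j
            for y in range(h):
                for x in range(w):
                    s += img[y][x] * cmath.exp(-2j * math.pi * (u * y / h + v * x / w))
            out[u][v] = s
    return out

def energy_histogram(img, rings=4):
    """Average energy |F(u,v)|^2 over concentric rings around the zero frequency."""
    h, w = len(img), len(img[0])
    spec = dft2(img)
    sums = [0.0] * rings
    counts = [0] * rings
    max_r = math.hypot(h / 2, w / 2)
    for u in range(h):
        for v in range(w):
            # Wrap indices so that frequency (0, 0) is the centre of the rings.
            fu = u if u <= h // 2 else u - h
            fv = v if v <= w // 2 else v - w
            ring = min(int(rings * math.hypot(fu, fv) / max_r), rings - 1)
            sums[ring] += abs(spec[u][v]) ** 2
            counts[ring] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]
```

For a flat image all energy falls in the innermost ring, whereas an image with sharp edges (typical of logos, per the "rich frequency content" remark in the talk) spreads energy to the outer rings. A real system would use an FFT library instead of `dft2`.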

  7. Logo & Trademark Detection • Distinguish from images of other categories • Small images • Few intensity levels • Rich frequency content • Image features form vectors which are used to train a decision tree • Accuracy: 85% • Each image is assigned a probability of being a logo or trademark • Retrieval gives more emphasis to images with high logo-trademark probability

  8. Logo-Trademark Similarity • $S_{\text{image-similarity}}(Q,D) = S_{\text{features}} + S_{\text{text}}$ • $S_{\text{features}} = S_{\text{moment-invariants}} + S_{\text{intensity-histogram}} + S_{\text{energy-histogram}}$ • $S_{\text{text}} = S_{\text{image-caption}} + S_{\text{file-name}} + S_{\text{alt-text}} + S_{\text{page-title}}$

  9. Image Retrieval by Text • Compute text similarity between image and query text descriptions using the Vector Space Model (VSM) • Text is represented by vectors of tf.idf term weights • $Q=(q_1,q_2,\dots,q_N)$, $D=(d_1,d_2,\dots,d_N)$ • Similarity is computed between the vectors Q and D
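
A minimal sketch of VSM retrieval with tf.idf weights is given below. The similarity formula itself did not survive in the transcript, so the cosine measure used here is an assumption (it is the usual choice for VSM); the tokenized toy documents, the `log(n / df)` form of idf, and the function names are likewise illustrative.

```python
import math
from collections import Counter

def inverse_document_frequency(docs):
    """idf[t] = log(n / df[t]) over a collection of tokenized documents."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log(n / c) for t, c in df.items()}

def tfidf_vector(tokens, idf):
    """Sparse vector {term: tf * idf}; terms unseen in the collection get weight 0."""
    tf = Counter(tokens)
    return {t: tf[t] * idf.get(t, 0.0) for t in tf}

def cosine(q, d):
    """Cosine of the angle between two sparse vectors (0.0 if either is empty)."""
    dot = sum(w * d.get(t, 0.0) for t, w in q.items())
    nq = math.sqrt(sum(w * w for w in q.values()))
    nd = math.sqrt(sum(w * w for w in d.values()))
    return dot / (nq * nd) if nq and nd else 0.0

# Toy text descriptions of three images (e.g. caption plus alternate text).
docs = [
    ["nike", "logo", "shoes"],
    ["adidas", "logo", "sportswear"],
    ["nike", "running", "shoes"],
]
idf = inverse_document_frequency(docs)
query = tfidf_vector(["nike", "logo"], idf)
scores = [cosine(query, tfidf_vector(doc, idf)) for doc in docs]
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

In this toy collection the first description shares both query terms and ranks first.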

  10. Retrieval by Image Features • The similarity between histograms is computed by their intersection • The similarity between moment invariants is computed as vector similarity
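
The two feature similarities can be sketched as below. Histogram intersection is the sum of bin-wise minima; normalizing it by the smaller histogram mass is an assumption of this sketch. The slide does not say which vector similarity is applied to the moment invariants, so the distance-based form `1 / (1 + d)` is purely illustrative.

```python
import math

def histogram_intersection(h1, h2):
    """Sum of bin-wise minima, scaled to [0, 1] by the smaller histogram mass."""
    overlap = sum(min(a, b) for a, b in zip(h1, h2))
    denom = min(sum(h1), sum(h2))
    return overlap / denom if denom else 0.0

def vector_similarity(u, v):
    """Assumed similarity for moment-invariant vectors: 1 / (1 + Euclidean distance)."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    return 1.0 / (1.0 + dist)

def feature_similarity(q, d):
    """Unweighted sum of the three feature terms, following the decomposition in the talk."""
    return (vector_similarity(q["moments"], d["moments"])
            + histogram_intersection(q["intensity"], d["intensity"])
            + histogram_intersection(q["energy"], d["energy"]))
```

Each image is assumed to be a dictionary with `moments`, `intensity`, and `energy` entries; these key names are hypothetical.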

  11. Link Analysis • Assign importance to Web pages and images • Main idea: co-cited and co-contained images are likely to be related • PageRank and HITS for text retrieval • PicASHOW for Web pages with images, using links alone • WPicASHOW handles image and text content in queries and Web pages

  12. Focused Graph F • Retrieve initial set F of images • Stop images (banners, buttons) are filtered out • Non-logo/trademark images are filtered out (based on probability) • Expand F with pages pointing to images in F • Expand F with pages and images pointed to by pages in F • Repeat until F is sufficiently large
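
The expansion loop above can be sketched as follows. The data structures (`page_links`, `page_images`, `logo_prob`), the probability threshold, the size target, and the decision to filter newly added images with the same rule as the initial set are all assumptions of this sketch rather than details given in the talk.

```python
def build_focused_graph(seed_images, page_links, page_images, logo_prob,
                        stop_images=frozenset(), min_prob=0.5,
                        target_size=50, max_rounds=5):
    """Grow a focused set of pages and images from an initial answer set.

    page_links:  {page: set of pages it links to}
    page_images: {page: set of images it contains or points to}
    logo_prob:   {image: probability of being a logo or trademark}
    """
    def keep(image):
        # Drop stop images (banners, buttons) and unlikely logos.
        return image not in stop_images and logo_prob.get(image, 0.0) >= min_prob

    images = {i for i in seed_images if keep(i)}
    pages = set()
    for _ in range(max_rounds):
        before = len(pages) + len(images)
        # Add pages pointing to images already in F.
        pages |= {p for p, imgs in page_images.items() if imgs & images}
        # Add pages and images pointed to by pages in F.
        for p in list(pages):
            pages |= page_links.get(p, set())
            images |= {i for i in page_images.get(p, set()) if keep(i)}
        after = len(pages) + len(images)
        if after >= target_size or after == before:
            break  # large enough, or nothing new was found
    return pages, images
```

The loop stops either when the graph reaches the target size or when a round adds nothing, which keeps the sketch from running forever on a small link structure.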

  13. Example of Focused Graph

  14. WPicASHOW • Create the focused graph F • Weighted links: image similarity between queries and images is used to regulate the influence of links in F • Authorities: principal eigenvector of $[(W+I)M]^{T}(W+I)M$ • W: page-to-page relationships in F • M: page-to-image relationships in F • Answers: rank answers by authority (eigenvector) value
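
The authority computation can be illustrated with power iteration on the matrix $[(W+I)M]^{T}(W+I)M$. This is a sketch only: `W` and `M` are taken as given dense matrices (lists of rows), and how WPicASHOW derives the link weights from query-image similarity is not reproduced here. The function names and iteration count are assumptions.

```python
import math

def matmul(a, b):
    """Product of two dense matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def image_authorities(W, M, iters=100):
    """Principal eigenvector of [(W+I)M]^T (W+I)M, one score per image."""
    n = len(W)
    WI = [[W[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    A = matmul(WI, M)            # pages x images
    C = matmul(transpose(A), A)  # images x images (co-citation matrix)
    v = [1.0] * len(C)
    for _ in range(iters):
        v = [sum(c * x for c, x in zip(row, v)) for row in C]
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        v = [x / norm for x in v]
    return v

# Toy graph: pages 0 and 2 link to page 1; page 1 holds image 0, page 2 holds image 1.
W = [[0, 1, 0],
     [0, 0, 0],
     [0, 1, 0]]
M = [[0, 0],
     [1, 0],
     [0, 1]]
scores = image_authorities(W, M)
```

Image 0 sits on the page that other pages link to, so it receives the larger authority score and would be ranked first.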

  15. Evaluation • Database assembled locally by a crawler • 1.5M pages with images • Text queries: VSM, PicASHOW, WPicASHOW • Image queries (example image + text): VSM, WPicASHOW • Average precision/recall on the top 30 answers
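
Precision and recall on the top k answers, averaged over queries, can be computed as in the sketch below. The relevance judgments are assumed to be available as a set of relevant answers per query; how they were obtained in the actual evaluation is not stated in the transcript, and the function names are illustrative.

```python
def precision_recall_at_k(ranked, relevant, k=30):
    """Precision and recall over the first k ranked answers of one query."""
    top = ranked[:k]
    hits = sum(1 for answer in top if answer in relevant)
    precision = hits / len(top) if top else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

def average_precision_recall(runs, k=30):
    """Mean precision and recall over (ranked, relevant) pairs, one per query."""
    results = [precision_recall_at_k(ranked, relevant, k) for ranked, relevant in runs]
    n = len(results)
    if n == 0:
        return 0.0, 0.0
    return (sum(p for p, _ in results) / n, sum(r for _, r in results) / n)
```

Precision is divided by the number of answers actually returned (at most k), so a method that returns fewer than k answers is not penalized for the missing ones; dividing by k instead is an equally common convention.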

  16. Text Queries

  17. Queries by Text and Image

  18. Conclusions • VSM retrieves relevant but not always important answers • PicASHOW retrieves important but not always relevant answers • WPicASHOW: good compromise between relevance and importance • The size of the data set is a problem

  19. Web Implementation • Try the system at http://www.intelligence.tuc.gr/intellisearch • Selection of retrieval method • Link analysis methods • And more
