Document Image Retrieval using Bag of Visual Words Model

Ravi Shekhar, CVIT, IIIT Hyderabad. Advisor: Prof. C.V. Jawahar.

Presentation Transcript


  1. Document Image Retrieval using Bag of Visual Words Model • Ravi Shekhar, CVIT, IIIT Hyderabad • Advisor: Prof. C.V. Jawahar

  4. Motivation • A large number of printed books are being digitized • Digital libraries such as the Universal Digital Library (UDL), the Digital Library of India (DLI), and Google Books • Need to design an efficient and effective methodology for content-level access • (Figure: Digital Library Database)

  5. Process Overview • (Pipeline diagram; labels: Documents, Scanning, Processing, Index Database, Input Query, Matching, Retrieved Documents) • Matching can be done at two levels: “Text” and “Image”

  6. Matching Approaches • Recognition Based Approach (Text Level Matching): Optical Character Recognition (OCR) • Recognition Free Approach (Image Level Matching): Word Spotting

  7. Recognition Based Approach • Optical Character Recognition (OCR) • Binarization of the document • Segmentation using connected components at line, word, and character level • Character recognition using features such as patch and profile • Classification using ANN or SVM
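The first two OCR steps on this slide can be illustrated with a minimal sketch. It assumes a grayscale page given as a list of rows, a fixed global threshold of 128, and 4-connectivity; none of these choices come from the slides, and the function names are hypothetical.

```python
from collections import deque

def binarize(gray, threshold=128):
    """Mark pixels darker than the (assumed) threshold as ink (1), others as background (0)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

def connected_components(binary):
    """Find 4-connected ink regions; return bounding boxes (top, left, bottom, right)."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for r in range(h):
        for c in range(w):
            if binary[r][c] and not seen[r][c]:
                top, left, bottom, right = r, c, r, c
                queue = deque([(r, c)])
                seen[r][c] = True
                while queue:  # breadth-first flood fill of one component
                    y, x = queue.popleft()
                    top, bottom = min(top, y), max(bottom, y)
                    left, right = min(left, x), max(right, x)
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                boxes.append((top, left, bottom, right))
    return boxes

# Toy 4x6 "page" with two separate ink blobs.
page = [
    [255, 255, 255, 255, 255, 255],
    [255,  10,  20, 255,  30, 255],
    [255,  15, 255, 255,  25, 255],
    [255, 255, 255, 255, 255, 255],
]
boxes = connected_components(binarize(page))
```

Each bounding box is a candidate character or character fragment. A broken stroke yields extra boxes and touching characters yield a single merged box, which is why segmentation-based recognition is sensitive to degraded scans.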

  12. Limitations of Recognition Based Approach • Cuts • Merges • Variation in script • Variation in font and typesetting • Underlines and overwriting

  15. Recognition Free Approach • Word Spotting • Representation of word images using global (profile) features • Matching features using distance measures such as L1 and L2 • Comparison of word images of different sizes using Dynamic Time Warping (DTW)
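As an illustration of profile features and DTW matching, here is a small sketch. It assumes the vertical projection profile (ink pixels per column) as the profile feature and the absolute difference (L1) as the local cost; the slides do not say which profiles or costs were used, so treat both as stand-ins.

```python
def projection_profile(binary):
    """Vertical projection profile: number of ink pixels in each column."""
    return [sum(col) for col in zip(*binary)]

def dtw_distance(a, b):
    """Dynamic time warping cost between two 1-D sequences of any lengths."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # L1 cost between two profile values
            cost[i][j] = local + min(cost[i - 1][j],      # stretch b
                                     cost[i][j - 1],      # stretch a
                                     cost[i - 1][j - 1])  # step both
    return cost[n][m]

word = [[0, 1, 1, 0],
        [1, 1, 0, 0]]
profile = projection_profile(word)           # one value per column

wide = [1, 3, 3, 1]    # same shape as `narrow`, one column wider
narrow = [1, 3, 1]
stretched_cost = dtw_distance(wide, narrow)  # warping absorbs the width difference
flat_cost = dtw_distance(wide, [2, 2, 2])
```

The table has `len(a) * len(b)` cells, so every query-to-candidate comparison is quadratic in the word width. That cost is part of why word spotting with DTW is hard to scale to large collections.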

  16. Why Recognition Free Approach? • Robust OCRs are unavailable for many non-Latin languages • These languages have a rich heritage and there is a need for content-level search • Word Spotting based methods are too slow for a real-time system • Most of the existing retrieval methods are memory intensive • Scalability is an immediate challenge

  17. Word Image Retrieval using Bag of Visual Words

  18. Bag of Visual Words (BoVW) • Bag of Words (BoW) representation is the most popular representation for text retrieval • Efficient BoW-based systems such as Lucene are publicly available • Bag of Visual Words (BoVW) performs excellently for image and video retrieval • A BoVW-based system is flexible, powerful, and scalable to billions of images
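The reason BoW-style systems scale is the inverted index: a query only touches the documents that share at least one word with it. The sketch below is a toy index over integer visual-word ids with a simple TF-IDF score. The class name, the scoring formula, and the ids are illustrative assumptions; this is not Lucene's API or its scoring function.

```python
import math
from collections import Counter, defaultdict

class InvertedIndex:
    """Toy inverted index: visual word id -> {document id: occurrence count}."""

    def __init__(self):
        self.postings = defaultdict(dict)
        self.num_docs = 0

    def add(self, doc_id, words):
        self.num_docs += 1
        for word, count in Counter(words).items():
            self.postings[word][doc_id] = count

    def search(self, query_words):
        """Rank documents by an unnormalised TF-IDF dot product with the query."""
        scores = defaultdict(float)
        for word, q_count in Counter(query_words).items():
            docs = self.postings.get(word, {})
            if not docs:
                continue  # word never indexed: contributes nothing
            idf = math.log(1 + self.num_docs / len(docs))
            for doc_id, d_count in docs.items():
                scores[doc_id] += (q_count * idf) * (d_count * idf)
        # Highest score first; ties broken by document id for determinism.
        return sorted(scores, key=lambda d: (-scores[d], d))

index = InvertedIndex()
index.add("w1", [3, 7, 7, 9])
index.add("w2", [3, 4, 5])
index.add("w3", [7, 9, 9, 2])
ranking = index.search([7, 7, 9])
```

Here `w2` shares no visual word with the query, so it is never scored at all. A production system would also normalise for document length, which this sketch omits.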

  19. BoVW Representation • Word Images are represented using Histogram of Visual Words

  20. BoVW Representation • Code Book generation • A subset of the images is used • Clustering is done using Hierarchical K-Means (HKM) • HKM is faster than K-Means both in building the tree and in finding nearest neighbours
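A rough sketch of hierarchical k-means follows, using toy 2-D points in place of real descriptors. The branching factor, depth, farthest-point initialisation, and all function names are assumptions made for illustration; the slides do not give the actual settings.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    return tuple(sum(c) / len(points) for c in zip(*points))

def init_centers(points, k):
    """Deterministic farthest-point initialisation (an assumed choice)."""
    centers = [points[0]]
    while len(centers) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        if min(sq_dist(far, c) for c in centers) == 0:
            break  # fewer than k distinct points
        centers.append(far)
    return centers

def kmeans(points, k, iters=10):
    """Plain Lloyd iterations; returns the centres and the last assignment."""
    centers = init_centers(points, k)
    groups = [list(points)]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            groups[nearest].append(p)
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

def build_tree(points, branching, depth):
    """Recursively split the descriptors; every leaf becomes one visual word."""
    node = {"center": mean(points), "children": []}
    if depth > 0 and len(set(points)) > 1:
        _, groups = kmeans(points, branching)
        node["children"] = [build_tree(g, branching, depth - 1) for g in groups if g]
    return node

def assign_word_ids(node, next_id=0):
    """Number the leaves left to right; returns the number of visual words so far."""
    if not node["children"]:
        node["word"] = next_id
        return next_id + 1
    for child in node["children"]:
        next_id = assign_word_ids(child, next_id)
    return next_id

def quantize(node, descriptor):
    """Descend the tree, comparing against only `branching` centres per level."""
    while node["children"]:
        node = min(node["children"], key=lambda ch: sq_dist(descriptor, ch["center"]))
    return node["word"]

points = [(0, 0), (1, 0), (0, 10), (1, 10), (30, 0), (31, 0), (30, 10), (31, 10)]
tree = build_tree(points, branching=2, depth=2)
vocabulary_size = assign_word_ids(tree)
```

With branching factor `b` and depth `d`, the tree holds up to `b ** d` visual words, yet `quantize` compares a descriptor with only about `b * d` centres instead of all of them. That is the nearest-neighbour speed-up the slide refers to.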

  23. BoVW based Representation Histogram of Visual Words
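To make the histogram representation concrete, the sketch below quantizes each local descriptor to its nearest codebook entry and counts the occurrences. The flat three-word codebook, the 2-D descriptors, and the L1 normalisation are illustrative assumptions; real descriptors such as SIFT are 128-dimensional.

```python
def bovw_histogram(descriptors, codebook):
    """Return an L1-normalised histogram with one bin per visual word."""
    counts = [0] * len(codebook)
    for d in descriptors:
        # Index of the nearest codebook centre (squared Euclidean distance).
        word = min(range(len(codebook)),
                   key=lambda i: sum((x - y) ** 2 for x, y in zip(d, codebook[i])))
        counts[word] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(codebook)

codebook = [(0, 0), (10, 0), (0, 10)]                  # three toy visual words
short_word = [(1, 0), (9, 1)]                          # 2 descriptors
long_word = [(0, 1), (1, 1), (10, 1), (1, 9)]          # 4 descriptors

h_short = bovw_histogram(short_word, codebook)
h_long = bovw_histogram(long_word, codebook)
```

Both histograms have exactly `len(codebook)` bins no matter how many descriptors the word image produced, so word images of any width can be compared directly.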

  25. BoVW based Representation • Cuts • Histogram of Visual Words

  27. BoVW based Representation • Merges • Histogram of Visual Words

  28. Proposed Architecture

  34. Advantages of BoVW based Representation • Fixed size representation • Robust against degradation • Scalable to Billions of Images • Language independent

  39. Spatial Verification • Lost Geometry • Spatial Verification

  42. Re-ranking • SIFT based re-ranking • The higher the total score, the better the match
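The slides do not spell out how the spatial score or the total score is computed. As one possible stand-in, the sketch below orders each word image's visual words by their x position and uses the length of the longest common subsequence (LCS) as a spatial-consistency score for re-ranking. The keypoint format `(x, visual_word)`, the function names, and the LCS scoring itself are assumptions, not the method confirmed by the slides.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

def word_sequence(keypoints):
    """keypoints: list of (x, visual_word); return the words in left-to-right order."""
    return [word for _, word in sorted(keypoints)]

def rerank(query_keypoints, candidates):
    """Sort candidate names by descending spatial score (hypothetical scoring)."""
    q = word_sequence(query_keypoints)
    scored = [(lcs_length(q, word_sequence(kps)), name)
              for name, kps in candidates.items()]
    return [name for score, name in sorted(scored, key=lambda t: (-t[0], t[1]))]

query = [(0, 5), (10, 2), (20, 7)]
candidates = {
    "shuffled": [(3, 7), (12, 2), (25, 5)],     # same words, reversed order
    "same_order": [(3, 5), (12, 2), (25, 7)],   # same words, same order
}
ranking = rerank(query, candidates)
```

Both candidates contain the same visual words and therefore have identical histograms, which is the "lost geometry" problem. Only the ordering-aware score separates them.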

  43. Experiments • Books used in the experiments

  44. Quantitative Results • Performance Statistics

  47. Quantitative Results • mAP vs. Query Length • The more characters in the query, the better the results
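For reference, mean average precision (mAP) can be computed as in the sketch below. The ranked lists and relevance sets are made-up toy data, not results from the presentation.

```python
def average_precision(ranked, relevant):
    """Mean of precision@k taken at each rank k where a relevant item appears."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, 1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """runs: list of (ranked_list, relevant_set) pairs, one per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

runs = [
    (["a", "x", "b"], {"a", "b"}),   # relevant items at ranks 1 and 3
    (["x", "c"], {"c"}),             # relevant item at rank 2
]
ap_first = average_precision(*runs[0])   # (1/1 + 2/3) / 2
ap_second = average_precision(*runs[1])  # (1/2) / 1
m_ap = mean_average_precision(runs)
```

Plotting this value against the number of characters in the query word gives the kind of curve this slide describes.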

  48. Quantitative Results • Retrieval Time and Index Size

  49. Qualitative Results HI

  50. Qualitative Results
