
Beyond bags of features: Adding spatial information





  1. Beyond bags of features: Adding spatial information. Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba

  2. Adding spatial information • Forming vocabularies from pairs of nearby features – “doublets” or “bigrams” • Computing bags of features on sub-windows of the whole image • Using codebooks to vote for object position • Generative part-based models

  3. From single features to “doublets” • Run pLSA on a regular visual vocabulary • Identify a small number of top visual words for each topic • Form a “doublet” vocabulary from these top visual words • Run pLSA again on the augmented vocabulary J. Sivic, B. Russell, A. Efros, A. Zisserman, B. Freeman, Discovering Objects and their Location in Images, ICCV 2005
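The doublet-forming step above can be sketched as pairing nearby occurrences of the top visual words. This is an illustrative sketch only: the function name `build_doublets` and the fixed `radius` threshold for "nearby" are assumptions, not details from the Sivic et al. paper.

```python
import numpy as np

def build_doublets(word_ids, positions, top_words, radius=30.0):
    """Form a doublet vocabulary from pairs of nearby top-word features."""
    top = set(top_words)
    idx = [i for i, w in enumerate(word_ids) if w in top]  # features assigned a top word
    doublets = set()
    for a in idx:
        for b in idx:
            if a >= b:
                continue
            d = np.linalg.norm(np.asarray(positions[a]) - np.asarray(positions[b]))
            if d <= radius:
                # an unordered pair of word ids becomes one new "doublet" word
                doublets.add(tuple(sorted((word_ids[a], word_ids[b]))))
    return sorted(doublets)
```

The returned pairs would then be appended to the original vocabulary before re-running pLSA on the augmented word set.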

  4. From single features to “doublets” [figure: ground truth; all features; “face” features initially found by pLSA; one doublet; another doublet; “face” doublets] J. Sivic, B. Russell, A. Efros, A. Zisserman, B. Freeman, Discovering Objects and their Location in Images, ICCV 2005

  5. Spatial pyramid representation • Extension of a bag of features • Locally orderless representation at several levels of resolution [figure: level 0] Lazebnik, Schmid & Ponce (CVPR 2006)

  6. Spatial pyramid representation • Extension of a bag of features • Locally orderless representation at several levels of resolution [figure: levels 0–1] Lazebnik, Schmid & Ponce (CVPR 2006)

  7. Spatial pyramid representation • Extension of a bag of features • Locally orderless representation at several levels of resolution [figure: levels 0–2] Lazebnik, Schmid & Ponce (CVPR 2006)
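The pyramid described above can be sketched as a weighted concatenation of grid histograms. This is a minimal illustration, assuming per-feature visual-word assignments and image coordinates are already available; the exact level-weighting constants are in the spirit of Lazebnik et al. but are an assumption here, not taken from the paper.

```python
import numpy as np

def spatial_pyramid(word_ids, positions, img_w, img_h, vocab_size, levels=3):
    """Concatenate bag-of-words histograms over 1x1, 2x2, 4x4, ... grids."""
    hists = []
    for level in range(levels):
        cells = 2 ** level
        h = np.zeros((cells, cells, vocab_size))
        for w, (x, y) in zip(word_ids, positions):
            # assign each feature to the grid cell containing it
            cx = min(int(x * cells / img_w), cells - 1)
            cy = min(int(y * cells / img_h), cells - 1)
            h[cy, cx, w] += 1
        # finer levels get higher weight (illustrative weighting scheme)
        if level == 0:
            weight = 1.0 / (2 ** (levels - 1))
        else:
            weight = 1.0 / (2 ** (levels - level))
        hists.append(weight * h.reshape(-1))
    return np.concatenate(hists)
```

With `levels=3` the result has (1 + 4 + 16) × `vocab_size` entries; two pyramids can then be compared with a histogram-intersection kernel.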

  8. Scene category dataset • Multi-class classification results (100 training images per class)

  9. Caltech101 dataset Multi-class classification results (30 training images per class) http://www.vision.caltech.edu/Image_Datasets/Caltech101/Caltech101.html

  10. Implicit shape models • Visual codebook is used to index votes for object position [figure: training image with visual codeword and displacement vectors] B. Leibe, A. Leonardis, and B. Schiele, Combined Object Categorization and Segmentation with an Implicit Shape Model, ECCV Workshop on Statistical Learning in Computer Vision 2004

  11. Implicit shape models • Visual codebook is used to index votes for object position [figure: test image] B. Leibe, A. Leonardis, and B. Schiele, Combined Object Categorization and Segmentation with an Implicit Shape Model, ECCV Workshop on Statistical Learning in Computer Vision 2004

  12. Implicit shape models: Training • Build codebook of patches around extracted interest points using clustering

  13. Implicit shape models: Training • Build codebook of patches around extracted interest points using clustering • Map the patch around each interest point to the closest codebook entry

  14. Implicit shape models: Training • Build codebook of patches around extracted interest points using clustering • Map the patch around each interest point to the closest codebook entry • For each codebook entry, store all positions at which it was found, relative to the object center
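The training steps above might be sketched as follows. This is a simplifying illustration: matching by nearest Euclidean neighbor to a precomputed codebook, and the flat-array data layout, are assumptions rather than the paper's actual clustering and matching procedure.

```python
import numpy as np

def ism_train(patches, positions, object_center, codebook):
    """For each training patch, match it to the nearest codebook entry and
    store the displacement from the patch position to the object center."""
    occurrences = {i: [] for i in range(len(codebook))}
    for patch, (x, y) in zip(patches, positions):
        # nearest codebook entry by Euclidean distance (simplifying assumption)
        k = int(np.argmin(np.linalg.norm(codebook - patch, axis=1)))
        occurrences[k].append((object_center[0] - x, object_center[1] - y))
    return occurrences
```

The stored displacement lists are exactly what the test phase later replays as votes for the object center.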

  15. Implicit shape models: Testing • Given a test image, extract patches and match them to codebook entries • Cast votes for possible positions of the object center • Search for maxima in the voting space • Extract a weighted segmentation mask based on stored masks for the codebook occurrences
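The voting step can be sketched with a simple accumulator over a discretized image grid. The pixel-grid discretization and the uniform vote weight of 1/|occurrences| per codebook entry are simplifying assumptions for illustration.

```python
import numpy as np

def ism_vote(patches, positions, codebook, occurrences, grid_shape):
    """Each matched patch casts votes at its stored displacements; return the
    accumulator and the location of its maximum (a candidate object center)."""
    acc = np.zeros(grid_shape)
    for patch, (x, y) in zip(patches, positions):
        k = int(np.argmin(np.linalg.norm(codebook - patch, axis=1)))
        votes = occurrences.get(k, [])
        for dx, dy in votes:
            cx, cy = int(round(x + dx)), int(round(y + dy))
            if 0 <= cx < grid_shape[1] and 0 <= cy < grid_shape[0]:
                # split each entry's total vote mass evenly over its occurrences
                acc[cy, cx] += 1.0 / max(len(votes), 1)
    peak = np.unravel_index(np.argmax(acc), acc.shape)
    return acc, peak
```

In the full system the accumulator maxima would be refined (e.g. with mean-shift) rather than read off a hard grid, and the patches that voted for a maximum are then used to back-project the segmentation mask.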

  16. Generative part-based models R. Fergus, P. Perona and A. Zisserman, Object Class Recognition by Unsupervised Scale-Invariant Learning, CVPR 2003

  17. Probabilistic model • h: assignment of features to parts [figure: part descriptors, part locations, candidate parts]

  18. Probabilistic model • h: assignment of features to parts [figure: parts 1–3]

  19. Probabilistic model • h: assignment of features to parts [figure: parts 1–3]

  20. Probabilistic model • Distribution over patch descriptors [figure: high-dimensional appearance space]

  21. Probabilistic model • Distribution over joint part positions [figure: 2D image space]
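The distribution over joint part positions (the shape term of the part-based model) is a Gaussian over the concatenated part coordinates; its log-density can be evaluated as below. Parameterizing the shape as a flat vector of 2D coordinates is an assumption for illustration, and the full model would combine this term with the appearance and assignment terms.

```python
import numpy as np

def shape_log_likelihood(part_positions, mean, cov):
    """Log-density of concatenated 2D part positions under a joint Gaussian
    shape model: -0.5 * (d*log(2*pi) + log|cov| + (x-mu)^T cov^{-1} (x-mu))."""
    x = np.asarray(part_positions, dtype=float).ravel() - mean
    d = x.size
    _, logdet = np.linalg.slogdet(cov)          # log-determinant of covariance
    maha = x @ np.linalg.solve(cov, x)          # Mahalanobis distance squared
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)
```

During recognition, each hypothesized assignment h of features to parts would be scored by such a term, which is what makes the hypothesis search combinatorial.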

  22. Results: Faces [figure: face shape model, patch appearance model, recognition results]

  23. Results: Motorbikes and airplanes

  24. Summary: Adding spatial information
  • Doublet vocabularies – Pro: takes co-occurrences into account; some geometric invariance is preserved – Con: too many doublet probabilities to estimate
  • Spatial pyramids – Pro: simple extension of a bag of features; works very well – Con: no geometric invariance
  • Implicit shape models – Pro: can localize objects while maintaining translation and possibly scale invariance – Con: needs supervised training data (known object positions and possibly segmentation masks)
  • Generative part-based models – Pro: very nice conceptually – Con: combinatorial hypothesis search problem
