
Lost in Quantization: Improving Particular Object Retrieval in Large Scale Image Databases

Lost in Quantization: Improving Particular Object Retrieval in Large Scale Image Databases. CVPR 2008. James Philbin, Ondřej Chum, Michael Isard, Josef Sivic, Andrew Zisserman.



Presentation Transcript


  1. Lost in Quantization: Improving Particular Object Retrieval in Large Scale Image Databases CVPR 2008 James Philbin Ondřej Chum Michael Isard Josef Sivic Andrew Zisserman [7] O. Chum, J. Philbin, J. Sivic, M. Isard, and A. Zisserman. Total recall: Automatic query expansion with a generative feature model for object retrieval. In Proc. ICCV, 2007.

  2. Outline • Introduction • Methods in this paper • Experiment & Result • Conclusion

  3. Outline • Introduction • Methods in this paper • Experiment & Result • Conclusion

  4. Introduction • Goal • Specific object retrieval from an image database • Scales to large databases • Achieved by systems inspired by text retrieval (visual words)

  5. Flow • Get features • SIFT • Cluster • Approximate k-means • Feature quantization • Visual words • Soft-assignment (query) • Re-ranking • RANSAC • Query expansion • Average query expansion

  6. Outline • Introduction • Methods in this paper • Experiment & Result • Conclusion

  7. Feature • SIFT

  8. Quantization (visual word) • Point List = [(2,3), (5,4), (9,6), (4,7), (8,1), (7,2)] • Sorted List = [(2,3), (4,7), (5,4), (7,2), (8,1),(9,6)]
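The quantization step above can be sketched as a nearest-center lookup. The 2-D points are taken from the slide and treated as toy "descriptors"; the cluster centers are made up for illustration (the actual system uses 128-D SIFT descriptors and a 1M-word vocabulary from approximate k-means):

```python
import math

def nearest_word(descriptor, centers):
    """Hard-assign a descriptor to its nearest cluster center (visual word)."""
    best, best_d = None, float("inf")
    for idx, c in enumerate(centers):
        d = math.dist(descriptor, c)
        if d < best_d:
            best, best_d = idx, d
    return best

# Point list from the slide, used here as toy 2-D descriptors.
points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
centers = [(3, 4), (8, 2)]  # hypothetical cluster centers
words = [nearest_word(p, centers) for p in points]
```

Each descriptor ends up represented only by its word index, which is exactly where the descriptor information discussed later is lost.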

  9. Soft-assignment of visual words • Hard-assignment: matching two image features in a bag-of-visual-words model • Yes if assigned to the same visual word • No otherwise • Soft-assignment • A weighted combination of visual words

  10. Soft-assignment of visual words • A–E represent cluster centers (visual words); points 1–4 are features

  11. Soft-assignment of visual words • Each nearby cluster center is weighted by exp(−d²/2σ²), where d is the distance from the cluster center to the descriptor point • In practice σ is chosen so that a substantial weight is assigned to only a few cells • The essential parameters • the spatial scale σ • r, the number of nearest neighbors considered

  12. Soft-assignment of visual words • With weights assigned to the r nearest neighbors, the descriptor is represented by an r-vector, which is then L1-normalized
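Slides 11–12 can be sketched together: Gaussian weights exp(−d²/2σ²) over the r nearest centers, followed by L1 normalization so the r-vector sums to 1. The distances and σ below are made-up toy values:

```python
import math

def soft_assign(distances, sigma, r=3):
    """Weight the r nearest visual words by exp(-d^2 / (2 * sigma^2)),
    then L1-normalize so the weights sum to 1."""
    nearest = sorted(range(len(distances)), key=lambda i: distances[i])[:r]
    w = {i: math.exp(-distances[i] ** 2 / (2 * sigma ** 2)) for i in nearest}
    total = sum(w.values())
    return {i: v / total for i, v in w.items()}

# Toy distances from one descriptor to five cluster centers A..E.
weights = soft_assign([0.5, 1.0, 2.0, 3.0, 4.0], sigma=1.0, r=3)
```

With a small σ the weight decays quickly, so only the closest few cells receive a substantial share, matching the parameter choice described on slide 11.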

  13. TF–IDF weighting • Standard index architecture

  14. TF–IDF weighting • tf • a document of 100 words in which ‘a’ appears 3 times • tf = 0.03 (3/100) • idf • 1,000 documents contain ‘a’ out of 10,000,000 in total • idf ≈ 9.21 ( ln(10,000,000 / 1,000) ) • tf-idf ≈ 0.28 (0.03 × 9.21)
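The arithmetic on this slide can be checked directly:

```python
import math

tf = 3 / 100                         # 'a' appears 3 times in a 100-word document
idf = math.log(10_000_000 / 1_000)   # 1,000 of 10,000,000 documents contain 'a'
tf_idf = tf * idf                    # combined score, roughly 0.28
```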

  15. TF–IDF weighting • In this paper • For the term frequency (tf) • we simply use the normalized weight value for each visual word • For the inverse document frequency (idf) • we found that counting an occurrence of a visual word as one, no matter how small its weight, gave the best results

  16. Re-ranking • RANSAC • Affine transform Θ : Y = AX + b • Algorithm • 1. Randomly choose n points • 2. Use the n points to estimate Θ • 3. Apply Θ to the remaining N − n points • 4. Count the inliers • Repeat steps 1–4 K times • Pick the Θ with the most inliers
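The loop above can be sketched in a few lines. For brevity this toy version fits a pure translation b (one correspondence per sample) instead of the full affine model Y = AX + b, and the point correspondences are invented:

```python
import random

def ransac_translation(src, dst, iters=100, tol=1.0, seed=0):
    """RANSAC sketch: repeatedly sample a correspondence, fit a translation
    hypothesis, count inliers, and keep the hypothesis with the most inliers."""
    rng = random.Random(seed)
    best_b, best_inliers = None, -1
    for _ in range(iters):
        i = rng.randrange(len(src))                         # 1. random sample
        b = (dst[i][0] - src[i][0], dst[i][1] - src[i][1])  # 2. fit model
        inliers = sum(                                      # 3-4. count inliers
            abs(sx + b[0] - dx) <= tol and abs(sy + b[1] - dy) <= tol
            for (sx, sy), (dx, dy) in zip(src, dst)
        )
        if inliers > best_inliers:                          # pick the best
            best_b, best_inliers = b, inliers
    return best_b, best_inliers

src = [(0, 0), (1, 0), (2, 1), (5, 5)]
dst = [(2, 3), (3, 3), (4, 4), (9, 9)]  # first three shifted by (2, 3); last is an outlier
```

The outlier correspondence never dominates because hypotheses fitted from it explain only one point, while the true shift explains three.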

  17. Re-ranking • In this paper • Not only counting the number of inlier correspondences, but also using a scoring function based on cosine similarity

  18. Average query expansion • Obtain the top (m &lt; 50) verified results of the original query • Construct a new query using the average of these results: d_avg = ( d0 + Σ_{i=1..m} d_i ) / (m + 1) • where d0 is the normalized tf vector of the query region • d_i is the normalized tf vector of the i-th result • Requery once
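The averaging step above can be sketched as follows, assuming d0 and the result vectors are already normalized tf vectors; the 3-dimensional toy vectors are made up:

```python
def expand_query(d0, results, m=50):
    """Average query expansion: the new query is the mean of the original
    tf vector d0 and the tf vectors of the top m verified results."""
    top = results[:m]
    n = len(top) + 1
    return [(q + sum(d[i] for d in top)) / n for i, q in enumerate(d0)]

d0 = [1.0, 0.0, 0.0]                          # tf vector of the query region
results = [[0.5, 0.5, 0.0], [0.0, 1.0, 0.0]]  # tf vectors of verified results
d_avg = expand_query(d0, results)
```

The expanded query then replaces the original for a single re-query, pulling in visual words that the verified results share but the original query missed.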

  19. Outline • Introduction • Methods in this paper • Experiment & Result • Conclusion

  20. Dataset • Crawled from Flickr, high resolution (1024×768) • Oxford buildings • 5,062 high-resolution images • 11 landmarks used as queries • Paris • Used for building the vocabulary (quantization) • 6,300 images • Flickr1 • 145 most popular tags • 99,782 images

  21. Dataset

  22. Dataset • Query • 55 queries: 5 queries for each of 11 landmarks

  23. Baseline • Follows the architecture of previous work [15] • A visual vocabulary of 1M words is generated using approximate k-means [15] J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman. Object retrieval with large vocabularies and fast spatial matching. In Proc. CVPR, 2007.

  24. Precision Recall Evaluation • Compute Average Precision (AP) score for each of the 5 queries for a landmark • Area under the precision-recall curve • Precision = RPI / TNIR • Recall = RPI / TNPC RPI = retrieved positive images TNIR = total number of images retrieved TNPC = total number of positives in the corpus • Average these to obtain a Mean Average Precision (MAP)
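The AP computation described above can be sketched by averaging the precision at each rank where a positive image is retrieved, which approximates the area under the precision-recall curve; the ranked list and positive set below are toy values:

```python
def average_precision(retrieved, positives):
    """Average the precision (RPI / TNIR) at each rank where a positive
    image appears, divided by the total positives in the corpus (TNPC)."""
    hits, precisions = 0, []
    for rank, img in enumerate(retrieved, start=1):
        if img in positives:
            hits += 1
            precisions.append(hits / rank)  # precision at this rank
    return sum(precisions) / len(positives) if positives else 0.0

# Toy ranked list: the 2 corpus positives appear at ranks 1 and 3.
ap = average_precision(["a", "x", "b", "y"], {"a", "b"})
```

MAP is then just the mean of these AP scores over the 55 queries.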

  25. Evaluation • Dataset • Only the Oxford (D1) 5,062 images • Oxford (D1) + Flickr1 (D2) 104,844 images • Vector quantizers • Oxford or Paris

  26. Result Parameter variation Comparison with other methods [15] J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman. Object retrieval with large vocabularies and fast spatial matching. CVPR, 2007. [14] D. Nister and H. Stewenius. Scalable recognition with a vocabulary tree. CVPR, 2006. [18] T. Tuytelaars and C. Schmid. Vector quantizing feature space with a regular lattice. ICCV, 2007.

  27. Result Spatial verification Effect of vocabulary size

  28. Result Query expansion Scaling-up to 100K images

  29. Result

  30. Result ashmolean_3 goes from 0.626 AP to 0.874 AP christ_church_5 increases from 0.333 to 0.813 AP

  31. Outline • Introduction • Methods in this paper • Experiment & Result • Conclusion

  32. Conclusion • A new method of visual word assignment was introduced: • descriptor-space soft-assignment • It recovers descriptor information that is lost in the hard quantization step of previously published methods.
