
Improving the Fisher Kernel for Large-Scale Image Classification


Presentation Transcript


  1. Improving the Fisher Kernel for Large-Scale Image Classification. Florent Perronnin, Jorge Sanchez, and Thomas Mensink, ECCV 2010. VGG reading group, January 2011, presented by V. Lempitsky

  2. From generative modeling to features. The pipeline: a generative model (learned from a dataset) is fitted to the input sample, and the parameters of the fit are used as the feature vector for a discriminative classifier that produces the final model.

  3. Simplest example: a dataset of vectors. Possible generative models: codebooks, sparse or dense component analysis, deep belief networks, color GMMs, .... Concretely, fit a K-means codebook to the dataset; the feature passed to the discriminative classifier for an input vector is its closest codeword.
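A minimal sketch of this codebook pipeline, assuming scikit-learn and synthetic data in place of real descriptors (the 100-word codebook and 128-d vectors are illustrative choices):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
dataset = rng.normal(size=(10000, 128))   # stand-in for e.g. SIFT descriptors

# "Generative model fitting": learn a K-means codebook from the dataset.
codebook = KMeans(n_clusters=100, n_init=1, random_state=0).fit(dataset)

# Feature for a new input vector: the index of its closest codeword.
x = rng.normal(size=(1, 128))
closest_codeword = codebook.predict(x)[0]
print(closest_codeword)
```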

  4. Fisher vector idea. Jaakkola, T., Haussler, D.: Exploiting generative models in discriminative classifiers. NIPS’99. Passing only the parameters of the best fit from the generative model to the discriminative classifier loses information (generative models are always inaccurate!). Can we retain some of the lost information without building a better generative model? Main idea: retain information about the fitting error for the best fit. Two samples can share the same best fit but have different fitting errors!

  5. Fisher vector idea. Jaakkola, T., Haussler, D.: Exploiting generative models in discriminative classifiers. NIPS’99. Fitting the generative model with parameters λ = (λ1, λ2) to an input sample X produces a Fisher vector, and this vector — rather than the fitted parameters themselves — is passed to the discriminative classifier. Main idea: retain information about the fitting error of the best fit by recording how the fit responds to each parameter.
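For reference, the standard definitions behind these two slides, as introduced by Jaakkola and Haussler (the notation is the usual one, not copied from the slides):

```latex
\[
  G^X_\lambda = \nabla_\lambda \log u_\lambda(X)
  \quad \text{(Fisher score: sensitivity of the fit to each parameter } \lambda_i)
\]
\[
  K(X, Y) = (G^X_\lambda)^\top F_\lambda^{-1} G^Y_\lambda,
  \qquad
  F_\lambda = \mathbb{E}_{x \sim u_\lambda}\!\left[ G^x_\lambda (G^x_\lambda)^\top \right]
  \quad \text{(Fisher kernel and Fisher information matrix)}
\]
```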

  6. Fisher vector for image classification. F. Perronnin and C. Dance // CVPR 2007. Assuming independence between the T observed features, log u_λ(X) = Σ_t log u_λ(x_t); each visual feature (e.g. SIFT) extracted from the image is encoded via the gradient of this log-likelihood. Using an N-component Gaussian mixture model with diagonalized covariance matrices, u_λ(x) = Σ_i w_i u_i(x), the gradient with respect to the mixture weights has N dimensions, and the gradients with respect to the means and the variances have 128N dimensions each (for 128-d SIFT).
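A minimal sketch of this encoding, assuming scikit-learn's GaussianMixture as the generative model; sizes are illustrative (64-d descriptors as after the PCA of slide 13, 16 components), and the scores are left unwhitened until slide 8:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fisher_scores(X, gmm):
    """Average gradients of log u_lambda(X) w.r.t. GMM means and std-devs.

    X: (T, D) local descriptors; gmm: fitted diagonal-covariance mixture."""
    gamma = gmm.predict_proba(X)                         # (T, N) posteriors gamma_t(i)
    mu, sigma = gmm.means_, np.sqrt(gmm.covariances_)    # (N, D) each
    diff = (X[:, None, :] - mu[None, :, :]) / sigma[None, :, :]          # (T, N, D)
    g_mu = (gamma[:, :, None] * diff / sigma[None, :, :]).mean(axis=0)   # (N, D)
    g_sigma = (gamma[:, :, None] * (diff**2 - 1) / sigma[None, :, :]).mean(axis=0)
    return np.concatenate([g_mu.ravel(), g_sigma.ravel()])               # 2*N*D values

rng = np.random.default_rng(0)
train = rng.normal(size=(5000, 64))                      # stand-in for PCA-reduced SIFT
gmm = GaussianMixture(n_components=16, covariance_type="diag",
                      random_state=0).fit(train)
fv = fisher_scores(rng.normal(size=(300, 64)), gmm)
print(fv.shape)                                          # (2048,) = 2 * 16 * 64
```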

  7. Relation to BoW. F. Perronnin and C. Dance // CVPR 2007. The N-dimensional gradient with respect to the mixture weights is essentially a (soft) BoW histogram; the two 128N-dimensional gradients with respect to the means and variances are extra information that BoW discards.
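Written out, the weight gradient accumulates only the soft assignment counts (up to constants that depend on how the weights are parametrized), which is exactly a soft BoW histogram:

```latex
\[
  G^X_{w_i} \;\propto\; \frac{1}{T} \sum_{t=1}^{T} \big( \gamma_t(i) - w_i \big),
  \qquad
  \gamma_t(i) = \frac{w_i \, u_i(x_t)}{\sum_{j=1}^{N} w_j \, u_j(x_t)}
\]
```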

  8. Whitening the data. Fisher matrix (covariance matrix of the Fisher vectors): F_λ = E_{x∼u_λ}[∇_λ log u_λ(x) ∇_λ log u_λ(x)^T]. Whitening the data (setting the covariance to identity) means rescaling the scores by F_λ^{-1/2}. The Fisher matrix is hard to estimate, so approximations are needed: [Perronnin and Dance // CVPR07] suggest a diagonal approximation to the Fisher matrix.
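A hedged sketch of the diagonal approximation: estimate only the diagonal of F_λ empirically from a set of training scores and rescale each dimension (the empirical estimator and names here are my choices, not taken from the paper):

```python
import numpy as np

def whiten_diag(scores):
    """scores: (M, K) raw Fisher scores, one row per training image."""
    fisher_diag = (scores**2).mean(axis=0)        # diagonal of E[G G^T]
    return scores / np.sqrt(fisher_diag + 1e-12)  # approximates F^(-1/2) G
```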

  9. Classification with Fisher kernels. F. Perronnin and C. Dance // CVPR 2007 • Use whitened Fisher vectors as the input to e.g. a linear SVM • Small codebooks (e.g. 100 words) are sufficient • Encoding runs faster than BoW with large codebooks (although with approximate NN this is not so straightforward!) • Slightly better accuracy than “plain, linear BoW”
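An illustrative classification step, with scikit-learn's LinearSVC standing in for whatever linear SVM solver the authors used and synthetic data in place of real whitened Fisher vectors:

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
train_fvs = rng.normal(size=(200, 2048))   # one whitened Fisher vector per image
labels = rng.integers(0, 2, size=200)      # binary labels, just for the sketch

clf = LinearSVC(C=1.0).fit(train_fvs, labels)
print(clf.predict(train_fvs[:5]))
```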

  10. Improvements to Fisher kernels. Perronnin, Jorge Sanchez, and Thomas Mensink, ECCV 2010. Overall very similar to how people improve regular BoW classification. Idea 1: L2 normalization of Fisher vectors. Justification: the GMM u_λ is our probability model for the visual words in an image. Assume the words of a given image follow a mixture of image-specific “content” q and background content distributed as u_λ itself: p(x) = ω q(x) + (1 − ω) u_λ(x). Then the gradient contributed by the background term is ≈ 0 (the GMM was fitted to exactly that content), so the Fisher vector is ≈ ω times the gradient for the image-specific content. Observation: image non-specific “content” affects the length of the vector, but not its direction. Conclusion: L2-normalize to remove the effect of non-specific “content”. L2 normalization also ensures K(x, x) = 1 and improves BoV [Vedaldi et al. ICCV’09].

  11. Improvement 2: power normalization. Apply z ↦ sign(z)|z|^α to each dimension of the Fisher vector; α = 0.5, i.e. a square root, works well. C.f. for example [Vedaldi and Zisserman // CVPR10] or [Perronnin et al. // CVPR10] on the use of the square root and Hellinger’s kernel for BoW.
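Both improvements as one small sketch: signed power normalization with α = 0.5 followed by L2 normalization (the order used in the paper):

```python
import numpy as np

def normalize_fv(fv, alpha=0.5):
    fv = np.sign(fv) * np.abs(fv) ** alpha     # power ("square-root") normalization
    return fv / (np.linalg.norm(fv) + 1e-12)   # L2 normalization: ensures K(x, x) = 1
```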

  12. Improvement 3: spatial pyramids • Fully standard spatial pyramids [Lazebnik et al.] with sum-pooling of the per-region Fisher vectors
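A rough sketch of pyramid pooling for Fisher vectors, reusing `fisher_scores` from the slide-6 sketch. The 1x1 + 2x2 + 1x3 cell layout follows the standard grids of Lazebnik et al.; note that this sketch average-pools inside each cell, while the slide's sum pooling differs only by a per-cell descriptor-count factor:

```python
import numpy as np

def pyramid_fv(descriptors, positions, gmm, image_size):
    """descriptors: (T, D); positions: (T, 2) descriptor (x, y) coords; image_size: (W, H)."""
    W, H = image_size
    cells = [(0.0, 0.0, 1.0, 1.0)]                                           # whole image
    cells += [(i / 2, j / 2, 0.5, 0.5) for i in range(2) for j in range(2)]  # quadrants
    cells += [(0.0, k / 3, 1.0, 1 / 3) for k in range(3)]                    # horizontal stripes
    D = descriptors.shape[1]
    parts = []
    for cx, cy, cw, ch in cells:
        inside = ((positions[:, 0] >= cx * W) & (positions[:, 0] < (cx + cw) * W) &
                  (positions[:, 1] >= cy * H) & (positions[:, 1] < (cy + ch) * H))
        sel = descriptors[inside]
        # One Fisher vector per cell; an empty cell contributes zeros.
        parts.append(fisher_scores(sel, gmm) if len(sel)
                     else np.zeros(2 * gmm.n_components * D))
    return np.concatenate(parts)    # 8 cells -> 8x longer image descriptor
```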

  13. Results: PASCAL VOC 2007. Details: regular grid, multiple scales, SIFT and local RGB color layout descriptors, both reduced to 64 dimensions via PCA

  14. Results: Caltech 256

  15. PASCAL + additional training data • Flickr groups: up to 25,000 images per class • ImageNet: up to 25,000 images per class

  16. Conclusion • Fisher kernels are a good way to exploit your generative model • Fisher kernels based on GMMs in SIFT space lead to state-of-the-art results (on par with the most recent BoW with soft assignments) • The main advantage of FK over BoW is the much smaller dictionaries • ...although Fisher vectors are less sparse than BoV • Perronnin et al. trained their system on 1 CPU within a day for 20 classes and 350K images
