
Robust Place and Object Recognition using Local Appearance based Methods


Presentation Transcript


  1. Robust Place and Object Recognition using Local Appearance based Methods Gregory Dudek and Deeptiman Jugessur Center for Intelligent Machines McGill University Dudek & Jugessur

  2. Outline • Applications • PCA: shortcomings • Objectives • Approach • Background • System Overview • Results • Conclusion Dudek & Jugessur

  3. Two Applications • Object recognition: what is that thing? • Recognizing a known object from its visual appearance. • Landmarks, grasping targets, etc. • Place recognition (coarse localization): what room am I in? • Recognizing the current waypoint on a trajectory, validating the current locale for the application of a precise localization method, topological navigation. Dudek & Jugessur

  4. PCA-based recognition • Has now become a well-established method for image recognition. • PCA-based recognition: a global transform of an image with N degrees of freedom into an eigenspace with M << N degrees of freedom. • The M retained dimensions capture the “most important” characteristics of the set of images being memorized. • Avoids having to segment the image into object and background by using the whole image. Dudek & Jugessur
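
A minimal sketch of the global PCA (eigenspace) recognition the slide describes, assuming grayscale images flattened into row vectors; the function names and the use of numpy/SVD are illustrative choices, not taken from the slides.

```python
import numpy as np

def build_eigenspace(images, m):
    """Build an M-dimensional eigenspace from flattened training images.

    images : (num_images, N) array, each row a flattened grayscale image.
    m      : number of principal components to keep (M << N).
    """
    mean = images.mean(axis=0)
    centered = images - mean
    # SVD of the centered data gives the principal directions directly.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:m]                   # (m, N) top-M eigenvectors
    coords = centered @ basis.T      # (num_images, m) training coordinates
    return mean, basis, coords

def project(image, mean, basis):
    """Project a flattened image into the eigenspace."""
    return (image - mean) @ basis.T
```

Recognition then amounts to projecting a test image and finding the nearest training coordinate, which is exactly the global scheme whose shortcomings the next slides discuss.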

  5. Observations • Using the whole image implies recognizing a combination of object AND background. • Segmenting the object from the background would avoid dependence on the background, but it’s too difficult. • Using a small sub-region gives a less precise recognition (e.g. the same sub-window could come from more than one image), but it is efficient. • Many sub-windows together can “vote” for an unambiguous recognition. • If the sub-windows are suitably chosen, they may totally ignore the background. Dudek & Jugessur

  6. Problem Statement • Improving the performance of classic PCA based recognition by accounting for: • Varying backgrounds • Planar rotations • Occlusions • Also (discussed in less detail) • Changes in object pose • Non-rigid deformation Dudek & Jugessur

  7. Our key idea(s) • Use sub-windows: several together uniquely accomplish recognition. • Sub-windows are selected by an attention operator (several kinds can be used). • Each sub-window is sampled non-uniformly to weight it towards its center. • Use only the amplitude spectrum to buy rotational invariance. Dudek & Jugessur
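
A sketch of the two per-window operations on this slide, under the assumption that “foveal sampling” means a center-weighted (here Gaussian) mask; the mask shape, the sigma value, and the function names are assumptions for illustration only.

```python
import numpy as np

def foveate(window, sigma=0.3):
    """Weight a square sub-window toward its centre with a Gaussian mask
    (one plausible reading of the slide's non-uniform, centre-weighted
    sampling; sigma is a fraction of the window size)."""
    n = window.shape[0]
    y, x = np.mgrid[0:n, 0:n]
    c = (n - 1) / 2.0
    mask = np.exp(-((x - c) ** 2 + (y - c) ** 2) / (2 * (sigma * n) ** 2))
    return window * mask

def amplitude_spectrum(window):
    """Magnitude of the 2D FFT; discarding the phase removes sensitivity
    to circular shifts of the sampled signal."""
    return np.abs(np.fft.fft2(window))
```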

  8. Background • Standard Appearance Based Recognition • M. Turk and S. Pentland 1991 • S.K. Nayar, H. Murase, S.A. Nene 1994 • H. Murase, S.K. Nayar 1995 • Shortcomings (due to global approach): • Background • Scale • Rotations • Local changes of the image or object • Occlusion Dudek & Jugessur

  9. Background (part 2) • “Enhanced” local sub-window methods • D. Lowe 1999: scale invariance, simple features. • C. Schmid 1999: probabilistic approach based on sub-windows extracted using the Harris operator. • C. Schmid & R. Mohr 1997: numerous sub-windows extracted using the Harris operator for database image retrieval (a simpler problem). • K. Ohba & K. Ikeuchi 1997: K.L.T. operator used for the extraction of sub-windows for the creation of an eigenspace. Only handles occlusion. • Interest operator of choice: • D. Reisfeld, H. Wolfson, Y. Yeshurun 1995: local symmetry operator. Dudek & Jugessur

  10. Approach • 2 phases: • Training (off-line) for the entire database of recognizable images: • Run an interest operator to obtain a saliency map for each image. • Choose sub-windows around the salient points for each image. • Select most informative sub-windows and use foveal sampling. • Create the eigenspace with the processed sub-windows. • Testing (on-line) for a candidate test image: • Run the same interest operator to obtain the saliency map. • Choose the sub-windows and process the information within them. • Project the sub-windows onto the eigenspace • Perform classification based on nearest neighbor rules. Dudek & Jugessur
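
A minimal sketch of the on-line step described above: each processed test sub-window is projected into the eigenspace, matched to its nearest training sub-window, and the sub-windows vote for the final label. The function name, the use of Euclidean distance, and simple majority voting are assumptions for illustration; the slides only say “nearest neighbor rules”.

```python
import numpy as np
from collections import Counter

def classify_by_voting(test_windows, mean, basis, train_coords, train_labels):
    """Each test sub-window votes for the label of its nearest training
    sub-window in the eigenspace; the majority label wins.

    test_windows : (k, N) flattened, processed sub-windows from one test image.
    train_coords : (t, m) eigenspace coordinates of all training sub-windows.
    train_labels : length-t list of object/place labels.
    """
    votes = []
    for w in test_windows:
        coord = (w - mean) @ basis.T
        dists = np.linalg.norm(train_coords - coord, axis=1)
        votes.append(train_labels[int(np.argmin(dists))])
    return Counter(votes).most_common(1)[0][0]
```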

  11. Recognition Model • Off-line: database of recognizable images → run all images through the interest operator → extract sub-windows based on interest-operator saliency values and information content → 2D FFT → obtain amplitude spectra for the sub-windows → create a low-dimensional eigenspace for classification. • On-line: candidate test image → run the image through the interest operator → extract sub-windows → 2D FFT → project onto the eigenspace. Dudek & Jugessur

  12. Polar Sampling and 2D FFT [Figure: two rotated copies of a sub-window are polar-sampled and passed through a 2D FFT, yielding the same amplitude spectrum (in theory).] Dudek & Jugessur
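
A sketch of the polar sampling the figure illustrates, using nearest-neighbour lookup on an (r, theta) grid; the grid sizes and function name are illustrative assumptions. Once a window is on this grid, an in-plane rotation is (up to sampling and interpolation error) a circular shift along the theta axis, so the FFT amplitude spectrum is unchanged.

```python
import numpy as np

def polar_sample(window, n_r=16, n_theta=32):
    """Resample a square sub-window onto an (r, theta) grid by
    nearest-neighbour lookup. An in-plane rotation of the window
    becomes a circular shift along the theta axis of this grid."""
    n = window.shape[0]
    c = (n - 1) / 2.0
    r = np.linspace(0, c, n_r)
    theta = np.linspace(0, 2 * np.pi, n_theta, endpoint=False)
    rr, tt = np.meshgrid(r, theta, indexing="ij")
    x = np.clip(np.round(c + rr * np.cos(tt)).astype(int), 0, n - 1)
    y = np.clip(np.round(c + rr * np.sin(tt)).astype(int), 0, n - 1)
    return window[y, x]

# Because a rotation is now a circular shift, the amplitude spectrum of the
# polar-sampled window is approximately rotation invariant:
# np.abs(np.fft.fft2(polar_sample(win))) ~ np.abs(np.fft.fft2(polar_sample(rotated_win)))
```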

  13. Shift Theorem Dudek & Jugessur
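
The slide presumably shows the standard Fourier shift theorem, which underpins the previous slide. Stated for a 2D signal $f$ with transform $F$:

$$\mathcal{F}\{f(x - x_0,\, y - y_0)\}(u, v) \;=\; e^{-j 2\pi (u x_0 + v y_0)}\, F(u, v),$$

so the amplitude spectrum $|F(u, v)|$ is unchanged by (circular) shifts. Since polar sampling turns an in-plane rotation into a shift along the theta axis, the amplitude spectrum of a polar-sampled sub-window is rotation invariant in theory.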

  14. Place Recognition [Figure: training images and test images, with the best match for each test image indicated.] Dudek & Jugessur

  15. Place Recognition (2) [Figure: training images and test images, with the best match for each test image indicated.] Dudek & Jugessur

  16. Object Recognition [Figure: a training image, a test image, and the recognition result.] Dudek & Jugessur

  17. Object Recognition (2) [Figure: test image, training image, and best matches. Note the background variation and occlusion.] Dudek & Jugessur

  18. Performance metrics • On-line performance: • 15x15 pixel sub-windows: 90% recognition with 10 sub-windows (10 interest points). • 15x15 pixel sub-windows: 100% recognition using 15 or more sub-windows. • The interest operator can take 1/30 s to 10 min (depending on the operator, image size, etc.). • Classification in the eigenspace takes well under 1 sec (can be performed in real time). Dudek & Jugessur

  19. Performance vs Number of Interest Points [Plot: recognition rate vs. number of features, reaching 100%.] Note: 10 windows of size 15x15 means using only 0.7% of the total image content. Dudek & Jugessur

  20. Conclusion & Extensions • An approach to object and place recognition from single video images. Works despite planar rotation, occlusion, or other deformations. • Highly robust. • Recognition rates of up to 100% with 20 test images. • Improved robustness to background can be achieved using “masking” [Jugessur & Dudek CVPR 2000]. • Ongoing work seeks to exploit the geometry of interest points. • Could filter in the eigenspace during training to select only “useful” features. Dudek & Jugessur

  21. That’s all Dudek & Jugessur

  22. Questions you could ask • Have you considered the use of alternative interest/attention operators? Does the operator matter? • What if the background is much more interesting (to the operator) than the object? • How much does color information matter? • What is the consequence of not using geometric information (and what does that really mean)? Dudek & Jugessur

  23. Dudek & Jugessur

  24. Performance metrics • Training time: roughly 64 windows, 15x15, 17 objects, 3 views per object: 24 hours. • This is using MATLAB and highly non-optimized code. • Using similar methods on global images, other groups have reported times on the order of minutes for similar tasks. • On-line performance: • The interest operator can take 1/30 s to 10 min (depending on the operator, image size, etc.). • Classification in the eigenspace takes well under 1 sec (can be performed in real time). Dudek & Jugessur
