
Generic Object Recognition

Generic Object Recognition, a project by Yatharth Saraf: recognizing the generic class or category of a given object, as opposed to recognizing specific, individual objects.


Presentation Transcript


  1. Generic Object Recognition: a project by Yatharth Saraf

  2. Problem Definition and Background
     • Recognizing the generic class or category of a given object, as opposed to recognizing specific, individual objects
       • Humans are much better at generic recognition; machines are more competitive at specific object recognition
     • Early work by Marr led to the ‘reconstruction school’
       • Advocates 3-D reconstruction and modeling before further reasoning about a scene
     • Current work in object categorization tends to fall in the ‘recognition school’
       • Works in the 2-D domain, with 2-D image features and descriptors
       • E.g. bag-of-features approaches, and spatial 2-D geometry approaches as in the ‘constellation model’

  3. Applications
     • Image database annotation and retrieval
     • Video surveillance
     • Driver assistance, autonomous robots
     • Cognitive support for disabled people

  4. Related Work
     • Discriminative approaches
       • SVM, subspace methods
     • Bag of features
       • Representation of objects with point descriptors
     • Constellation model
       • Representations that take into account the spatial (2-D) geometry of key points

  5. Assumptions
     • Images are scale-normalized
     • Images are clean, i.e. no background clutter or occlusion
       • (-) Implies segmentation is necessary as a pre-processing step
       • (+) Avoids the problem of exponential search

  6. Outline of the Method (Training)
     • Detect salient regions in all training images using the Kadir-Brady feature detector
     • Extract X, Y coordinates, scale, and 11x11 intensity patches around the detected features
     • Reduce the dimensionality of the appearance patches from 121 to 16 using PCA
     • Estimate model parameters
       • A single full Gaussian for location; one Gaussian per part
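The PCA and parameter-estimation steps above can be sketched roughly as follows. This is a minimal illustration with NumPy, not the project's actual code: the patch data is synthetic, the count of 5 parts per image is a hypothetical choice, and reading "a single full Gaussian for location" as one full-covariance Gaussian over the concatenated (x, y) coordinates of all parts is an assumption.

```python
import numpy as np

def fit_pca(patches, n_components=16):
    """Fit PCA on flattened patches of shape (n_samples, 121)."""
    mean = patches.mean(axis=0)
    centered = patches - mean
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]

def project(patches, mean, components):
    """Map 121-D patches to their low-dimensional PCA codes."""
    return (patches - mean) @ components.T

def fit_gaussian(samples, ridge=1e-6):
    """Sample mean and full covariance; the small ridge keeps it invertible."""
    mean = samples.mean(axis=0)
    cov = np.cov(samples, rowvar=False) + ridge * np.eye(samples.shape[1])
    return mean, cov

rng = np.random.default_rng(0)

# Synthetic stand-in for flattened 11x11 intensity patches (121 values each).
patches = rng.integers(0, 256, size=(200, 121)).astype(float)
pca_mean, components = fit_pca(patches, n_components=16)
codes = project(patches, pca_mean, components)  # shape (200, 16)

# Synthetic stand-in for part locations: 47 training images, and a
# hypothetical 5 parts per image, giving a 10-D (x, y) vector per image.
locations = rng.uniform(0, 100, size=(47, 10))
loc_mean, loc_cov = fit_gaussian(locations)
```

The same `fit_gaussian` helper could be applied to the 16-D codes of each part separately to obtain one appearance Gaussian per part.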

  7. Outline of the Method (Testing)
     • Extract features of the test images in the same manner as in the training phase
     • Use the learnt model to estimate the probability of detection
     • Use Bayes’ decision rule to classify
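As a rough illustration of the decision step, Bayes' decision rule picks the class with the larger posterior, which in log form compares log-likelihood plus log-prior. The slides do not say what the alternative (background) model or the priors were, so the two-class setup, the function name, and the default prior below are assumptions.

```python
import math

def classify(log_lik_object, log_lik_background, prior_object=0.5):
    """Return True when the object class has the larger posterior.

    Both likelihoods are given as log-probabilities; the evidence term
    cancels, so only log-likelihood + log-prior needs to be compared.
    """
    log_post_object = log_lik_object + math.log(prior_object)
    log_post_background = log_lik_background + math.log(1.0 - prior_object)
    return log_post_object > log_post_background

# Hypothetical log-likelihoods for one test image.
is_object = classify(-10.0, -12.0)
```

With equal priors this reduces to a likelihood-ratio test; a small `prior_object` makes the rule more conservative about declaring a detection.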

  8. Experiments
     • Careful tweaking of detector parameters is needed
     • A single set of parameter settings may not be suitable for all categories

  9. [Figure panels: starting scale 3; starting scale 23]

  10. Experiments (contd.)
     • 47 clean motorbike images used for training the motorbike model
     • Sorting the extracted patches by X-coordinate helped (as opposed to sorting by saliency)
     • The appearance model is not doing as well
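The two orderings compared above can be sketched as follows. The ordering matters because it decides which detected feature is treated as which model part. The `Feature` record, the helper name, and the step of first keeping the most salient detections are hypothetical; the slides only state that sorting by X-coordinate worked better than sorting by saliency.

```python
from typing import NamedTuple

class Feature(NamedTuple):
    x: float
    y: float
    scale: float
    saliency: float

def order_features(features, n_parts, by="x"):
    """Keep the n_parts most salient detections, then fix their order."""
    top = sorted(features, key=lambda f: f.saliency, reverse=True)[:n_parts]
    if by == "x":
        # Left-to-right order: part k is the k-th feature along the X axis.
        return sorted(top, key=lambda f: f.x)
    # Otherwise keep the saliency order: part k is the k-th most salient.
    return top

# Hypothetical detections for one image.
detections = [
    Feature(50, 10, 5, 0.9),
    Feature(10, 20, 7, 0.5),
    Feature(30, 15, 6, 0.7),
    Feature(80, 5, 4, 0.1),
]
by_x = order_features(detections, n_parts=3, by="x")
by_saliency = order_features(detections, n_parts=3, by="saliency")
```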

  11. 9 test images used (images 1-4: motorbikes, 5-7: cars, 8-9: faces)

  12. Log-probabilities of the 9 test images from the location model [Figure panels: features sorted by X-coordinate; features sorted by saliency. Labels: Image 5, Image 9]

  13. Appearance log-probabilities of the 9 test images [Figure panels: features sorted by saliency; features sorted by X-coordinate]. Total log-probabilities of the 9 test images [Figure]

  14. Experiments (contd.)
     • Using a mixture of Gaussians for the appearances of parts didn’t make much difference
       • 3 mixture components per part (EM initialized with k-means and sample covariances)
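A compact sketch of that setup: k-means provides the initial assignment, each cluster's sample mean and sample covariance initialize one component, and EM then refines a 3-component mixture. This is an illustrative NumPy version on synthetic 16-D data, not the project's code; the iteration counts, the ridge term, and the function names are assumptions, and the sketch assumes every k-means cluster ends up with at least two points.

```python
import numpy as np

def kmeans(data, k, n_iter=20, seed=0):
    """Plain k-means; returns a cluster label for every row of data."""
    rng = np.random.default_rng(seed)
    centers = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(n_iter):
        dists = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = data[labels == j].mean(axis=0)
    return labels

def log_gaussian(data, mean, cov):
    """Log-density of a full-covariance Gaussian at every row of data."""
    d = data.shape[1]
    diff = data - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("ni,ij,nj->n", diff, np.linalg.inv(cov), diff)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)

def fit_gmm(data, k=3, n_iter=50, ridge=1e-6):
    n, d = data.shape
    eye = ridge * np.eye(d)
    # Initialization from k-means: cluster fractions, means, sample covariances.
    labels = kmeans(data, k)
    weights = np.array([(labels == j).mean() for j in range(k)])
    means = np.array([data[labels == j].mean(axis=0) for j in range(k)])
    covs = np.array([np.cov(data[labels == j], rowvar=False) + eye
                     for j in range(k)])
    for _ in range(n_iter):
        # E-step: responsibilities, computed in the log domain for stability.
        log_r = np.stack([np.log(weights[j]) + log_gaussian(data, means[j], covs[j])
                          for j in range(k)], axis=1)
        log_norm = np.logaddexp.reduce(log_r, axis=1, keepdims=True)
        resp = np.exp(log_r - log_norm)
        # M-step: responsibility-weighted weights, means, and covariances.
        nk = resp.sum(axis=0)
        weights = nk / n
        means = (resp.T @ data) / nk[:, None]
        for j in range(k):
            diff = data - means[j]
            covs[j] = (resp[:, j, None] * diff).T @ diff / nk[j] + eye
    # Log-likelihood under the parameters used in the final E-step.
    return weights, means, covs, float(log_norm.sum())

rng = np.random.default_rng(1)
# Synthetic stand-in for the 16-D PCA codes of one part: three separated blobs.
data = np.concatenate([rng.normal(loc=c, scale=1.0, size=(100, 16))
                       for c in (-8.0, 0.0, 8.0)])
weights, means, covs, loglik = fit_gmm(data, k=3)
```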

  15. Experiments (contd.)
     • Levenshtein distances on the appearance patches worked quite nicely
       • Each appearance patch is treated as a single character
       • Matching cost was computed using a straight sum of squared differences (SSD)
       • Cost of inserting a gap = matching cost of the patch with a canonical 11x11 patch having a uniform intensity of 128
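The edit distance described above can be written as the usual Levenshtein dynamic program, with the substitution cost replaced by the SSD between two patches and the gap cost by the SSD against a uniform patch of intensity 128. The sketch below is a plain-Python illustration under the assumption that patches are flattened 121-value lists of raw intensities; the function names are hypothetical.

```python
def ssd(patch_a, patch_b):
    """Sum of squared differences between two flattened patches."""
    return sum((p - q) ** 2 for p, q in zip(patch_a, patch_b))

# Canonical 11x11 patch with uniform intensity 128, used to price gaps.
GAP_PATCH = [128] * (11 * 11)

def patch_edit_distance(seq_a, seq_b):
    """Levenshtein-style distance between two sequences of patches."""
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    dist = [[0.0] * cols for _ in range(rows)]
    # Aligning a prefix against the empty sequence costs one gap per patch.
    for i in range(1, rows):
        dist[i][0] = dist[i - 1][0] + ssd(seq_a[i - 1], GAP_PATCH)
    for j in range(1, cols):
        dist[0][j] = dist[0][j - 1] + ssd(seq_b[j - 1], GAP_PATCH)
    for i in range(1, rows):
        for j in range(1, cols):
            dist[i][j] = min(
                dist[i - 1][j - 1] + ssd(seq_a[i - 1], seq_b[j - 1]),  # match
                dist[i - 1][j] + ssd(seq_a[i - 1], GAP_PATCH),         # gap in seq_b
                dist[i][j - 1] + ssd(seq_b[j - 1], GAP_PATCH),         # gap in seq_a
            )
    return dist[-1][-1]

# Hypothetical uniform patches, just to exercise the function.
dark, bright = [100] * 121, [200] * 121
distance = patch_edit_distance([dark, bright], [dark])
```

In the example, the cheapest alignment matches the two dark patches at zero cost and pays one gap for the bright patch, 121 × (200 − 128)² = 627264.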

  16. Conclusions and Future Work
     • Strong dependence on the feature detector
     • The appearance model doesn’t seem to be working too well
     • Levenshtein distances could be more promising
     • Experiments with more clean training and test data, and with multiple categories
     • Exponential search for dealing with clutter and occlusion

  17. Questions? -- Thank You
