Generic Object Recognition

A Project on Generic Object Recognition, by Yatharth Saraf

Problem Definition and Background
• Recognizing the generic class or category of a given object, as opposed to recognizing specific, individual objects
  • Humans are much better at generic recognition; machines are more competitive at specific object recognition
• Early work by Marr led to the ‘reconstruction school’
  • Advocates 3-D reconstruction and modeling before further reasoning about a scene
• Current work in object categorization tends to fall in the ‘recognition school’
  • Works in the 2-D domain, with 2-D image features and descriptors
  • E.g. bag-of-features approaches, and spatial 2-D geometry approaches as in the ‘constellation model’

Applications
• Image database annotation and retrieval
• Video surveillance
• Driver assistance, autonomous robots
• Cognitive support for disabled people

Related Work
• Discriminative approaches
  • SVM, subspace methods
• Bag of features
  • Representation of objects with point descriptors
• Constellation model
  • Representations that take into account the spatial geometry (2-D) of key points

Assumptions
• Images are scale-normalized
• Images are clean, i.e. no background clutter or occlusion
  • (-) Implies segmentation is necessary as a pre-processing step
  • (+) Avoids the problem of exponential search

Outline of the Method (Training)
• Detect salient regions in all training images using the Kadir-Brady feature detector
• Extract the X,Y coordinates, the scale, and an 11x11 intensity patch around each detected feature
• Reduce the dimensionality of the appearance patches from 121 to 16 using PCA
• Estimate the model parameters
  • A single full Gaussian for location; one Gaussian per part for appearance
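The PCA and Gaussian-estimation steps above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: it assumes NumPy, that patches arrive as flattened 121-dimensional rows, and that the Gaussians are fit with the sample mean and covariance. The function names and the small `ridge` term are hypothetical additions for this example.

```python
import numpy as np

def fit_pca(patches, n_components=16):
    """Learn a PCA basis from an (N, 121) array of flattened 11x11 patches."""
    mean = patches.mean(axis=0)
    centered = patches - mean
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]

def project(patches, mean, components):
    """Map 121-dimensional patches to n_components-dimensional codes."""
    return (patches - mean) @ components.T

def fit_gaussian(samples, ridge=1e-6):
    """Sample mean and full covariance; the ridge is an assumed stabilizer."""
    mu = samples.mean(axis=0)
    cov = np.cov(samples, rowvar=False) + ridge * np.eye(samples.shape[1])
    return mu, cov

def gaussian_logpdf(x, mu, cov):
    """Log-density of one vector x under a full-covariance Gaussian."""
    d = x - mu
    _, logdet = np.linalg.slogdet(cov)
    maha = d @ np.linalg.solve(cov, d)
    return -0.5 * (len(mu) * np.log(2 * np.pi) + logdet + maha)
```

Under these assumptions, the location model would call `fit_gaussian` once on the stacked part coordinates, and the appearance model would call it once per part on the 16-dimensional codes.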

Outline of the Method (Testing)
• Extract features from the test images in the same manner as in the training phase
• Use the learnt model to estimate the probability of detection
• Use Bayes’ Decision Rule to classify
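The final classification step can be sketched as a comparison of log-posteriors. The slides do not say which classes or priors were compared, so the class names, the priors, and the `classify` helper in this example are hypothetical.

```python
import math

def classify(log_likelihoods, priors):
    """Bayes decision rule: pick the class with the largest log-posterior.

    log_likelihoods: dict mapping class name -> log p(features | class)
    priors:          dict mapping class name -> P(class)
    """
    scores = {c: ll + math.log(priors[c]) for c, ll in log_likelihoods.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical numbers, for illustration only.
label, scores = classify(
    {"motorbike": -120.0, "background": -150.0},
    {"motorbike": 0.5, "background": 0.5},
)
```

Working in the log domain avoids numerical underflow when the per-feature probabilities are very small.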

Experiments
• Careful tweaking of the detector parameters is needed
• A single set of parameter settings may not be suitable for all categories

[Figure: detector output for two parameter settings, starting scale 3 and starting scale 23]

Experiments (contd.)
• 47 clean motorbike images were used to train the motorbike model
• Sorting the extracted patches by X-coordinate helped (as opposed to sorting by saliency)
• The appearance model did not perform as well
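The feature ordering matters because it decides which detected feature is treated as which part. The sketch below shows one way to order features by X-coordinate. The `Feature` type, the helper names, and the choice to keep the `n_parts` most salient features before sorting are all assumptions, since the slides do not describe how features were selected.

```python
from typing import NamedTuple

class Feature(NamedTuple):
    x: float
    y: float
    scale: float
    saliency: float

def order_parts_by_x(features, n_parts):
    """Keep the n_parts most salient features (assumed), then sort them left to right."""
    top = sorted(features, key=lambda f: f.saliency, reverse=True)[:n_parts]
    return sorted(top, key=lambda f: f.x)

def location_vector(ordered):
    """Stack the part coordinates as [x1, y1, x2, y2, ...] for the location Gaussian."""
    return [v for f in ordered for v in (f.x, f.y)]
```

Sorting by X gives each part index a roughly consistent position across images of the same category, which is plausibly why it helped the location model.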

9 test images used (1-4 motorbikes, 5-7 cars, 8-9 faces)

[Figure: Log-probabilities of the 9 test images from the location model. Panels: features sorted by X-coordinate; features sorted by saliency. Images 5 and 9 are labeled.]

[Figure: Appearance log-probabilities of the 9 test images. Panels: features sorted by saliency; features sorted by X-coordinate.]

[Figure: Total log-probabilities of the 9 test images]

Experiments (contd.)
• Using a mixture of Gaussians for the appearances of parts didn’t make much difference
  • 3 mixture components per part (EM initialized with k-means and sample covariances)

Experiments (contd.)
• Levenshtein distances on the appearance patches worked quite nicely
  • Each appearance patch is treated as a single character
  • The matching cost was computed using a straight SSD
  • Cost of inserting a gap = matching cost of the patch with a canonical 11x11 patch having a uniform intensity of 128
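The patch-level edit distance described above can be sketched as the standard Levenshtein dynamic program with SSD in place of unit costs. This is an illustrative reconstruction: it assumes each patch is a flat list of 121 intensities, and the function names are hypothetical.

```python
GAP_PATCH = [128] * 121  # canonical 11x11 patch with uniform intensity 128

def ssd(a, b):
    """Sum of squared differences between two flattened patches."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def patch_edit_distance(seq_a, seq_b):
    """Levenshtein distance where each patch plays the role of one character."""
    n, m = len(seq_a), len(seq_b)
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    # Aligning a prefix against nothing costs one gap per patch.
    for i in range(1, n + 1):
        dist[i][0] = dist[i - 1][0] + ssd(seq_a[i - 1], GAP_PATCH)
    for j in range(1, m + 1):
        dist[0][j] = dist[0][j - 1] + ssd(seq_b[j - 1], GAP_PATCH)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dist[i][j] = min(
                dist[i - 1][j - 1] + ssd(seq_a[i - 1], seq_b[j - 1]),  # match
                dist[i - 1][j] + ssd(seq_a[i - 1], GAP_PATCH),         # gap in seq_b
                dist[i][j - 1] + ssd(seq_b[j - 1], GAP_PATCH),         # gap in seq_a
            )
    return dist[n][m]
```

Because a gap is charged as the SSD against a mid-grey patch, skipping a low-contrast patch is cheap while skipping a distinctive one is expensive.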

Conclusions and Future Work
• Strong dependence on the feature detector
• The appearance model doesn’t seem to be working too well
• Levenshtein distances could be more promising
• Experiments with more clean training and test data, and with multiple categories
• Exponential search for dealing with clutter and occlusion

Questions? -- Thank You
