
The Layout Consistent Random Field for Recognizing and Segmenting Partially Occluded Objects




Presentation Transcript


  1. The Layout Consistent Random Field for Recognizing and Segmenting Partially Occluded Objects By John Winn & Jamie Shotton CVPR 2006 presented by Tomasz Malisiewicz for CMU’s Misc-Read April 26, 2006

  2. Talk Overview • Objective • CRF → HRF → LayoutCRF • LayoutCRF Potentials • Learning • Inference • Results • Summary

  3. LayoutCRF Objectives • To detect and segment partially occluded objects of a known category • To detect multiple object instances which possibly occlude each other • To define a part labeling which densely covers the object of interest • To model various types of occlusions (FG/BG, BG/FG, FG/FG)

  4. Conditional Random Field (Lafferty ‘01) • A random field globally conditioned on the observation X • Discriminative framework where we model P(Y|X) and do not explicitly model the marginal P(X)
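
In symbols (a standard CRF form, written here as a sketch rather than the paper's exact notation), the label field Y is conditioned on the whole observation X:

    P(Y \mid X) \;=\; \frac{1}{Z(X)} \exp\Big( \sum_i \phi_i(y_i, X) \;+\; \sum_{(i,j) \in \mathcal{E}} \psi_{ij}(y_i, y_j, X) \Big)

Only the conditional P(Y|X) is modeled; the partition function Z(X) depends on the observation, and no generative model of X is required.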

  5. Hidden Random Field (Szummer ‘05) • Extension of the CRF with a hidden layer of variables • The hidden variables represent object ‘parts’ in this work • Part labels map deterministically to class labels
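
Schematically (again a sketch, not the paper's notation), the HRF places the CRF potentials on the hidden part layer h, and the class labels y follow from h through the deterministic part-to-class mapping:

    P(h \mid X) \;=\; \frac{1}{Z(X)} \exp\Big( \sum_i \phi_i(h_i, X) \;+\; \sum_{(i,j) \in \mathcal{E}} \psi_{ij}(h_i, h_j, X) \Big), \qquad y_i = \mathrm{class}(h_i)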

  6. LayoutCRF • An HRF with asymmetric pair-wise potentials, extended with a set of discrete-valued instance transformations {T1,…,TM}, one per foreground object instance (M instances in total)

  7. LayoutCRF • *Only one non-background class is considered at a time • M+1 instance labels: yi ∈ {0, 1, …, M} • Each object instance has a separate set of H part labels, so hi ∈ {0, 1, …, H × M}

  8. LayoutCRF • Each transformation T represents the translation and left/right flip of an object instance by indexing all possible integer pixel translations for each flip orientation • Each T is linked to every hi

  9. LayoutCRF Potentials • Unary Potentials: Use local information to infer part labels (randomized decision trees) • Asymmetric Pair-wise Potentials: Measure local part compatibilities • Instance Potentials: Encourage correct long-range spatial layout of parts for each object instance
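
Putting the three terms together, the model has the overall form below (a sketch; the symbol names are not the paper's exact notation), where the three sums are the unary, asymmetric pair-wise, and instance terms respectively:

    P(h, T \mid X) \;\propto\; \exp\Big( \sum_i \phi_i(h_i, X) \;+\; \sum_{(i,j) \in \mathcal{E}} \psi_{ij}(h_i, h_j, X) \;+\; \sum_i \lambda_i(h_i, T) \Big)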

  10. LayoutCRF Potentials: Unary • A set of decision trees, each trained on a random subset of the data (improves generalization and efficiency) • Each decision tree returns a distribution over part labels; the outputs of K trees are averaged • Each non-terminal node evaluates an intensity difference, or absolute intensity difference, between a learned pair of pixels within a window of size D around pixel i, and compares it to a learned threshold
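
A minimal sketch of one such node test, assuming the pixel offsets, threshold and test type were learned during training (all names here are illustrative):

    import numpy as np

    def node_test(image, i, j, offsets, threshold, use_abs):
        """One non-terminal decision-tree node: compare an (absolute)
        intensity difference between two learned pixel positions, taken
        relative to pixel (i, j) inside the DxD window, to a threshold."""
        (r1, c1), (r2, c2) = offsets
        a = float(image[i + r1, j + c1])
        b = float(image[i + r2, j + c2])
        diff = abs(a - b) if use_abs else a - b
        return diff > threshold   # decides which child branch to follow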

  11. Layout Consistency (for pair-wise potentials) • Part labels are laid out on a regular grid over the object (the colours in the original figure represent part labels) • A label is layout-consistent with itself, and with the labels adjacent to it in that grid ordering • Neighboring pixels whose labels are not layout-consistent are not part of the same object

  12. Distinguished Transitions 1. Background: hi and hj are both BG labels 2. Consistent FG: hi and hj are layout-consistent FG labels 3. Object edge: one label is BG, the other is a part label lying on the object edge 4. Class occlusion: one label is an interior FG label, the other is a BG label 5. Instance occlusion: both are FG labels but not layout-consistent, and at least one is an object-edge label 6. Inconsistent interior FG: both labels are interior FG labels but not layout-consistent (rare)
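
A small classification sketch of the six transition types, assuming helper predicates is_bg, is_edge_part and layout_consistent that implement the definitions above (these helpers are assumptions, not the paper's code):

    def transition_type(hi, hj, is_bg, is_edge_part, layout_consistent):
        """Classify the transition between the part labels of two
        neighbouring pixels into one of the six distinguished cases."""
        if is_bg(hi) and is_bg(hj):
            return "background"
        if is_bg(hi) or is_bg(hj):
            fg = hj if is_bg(hi) else hi      # the non-background label
            return "object_edge" if is_edge_part(fg) else "class_occlusion"
        # both labels are foreground from here on
        if layout_consistent(hi, hj):
            return "consistent_fg"
        if is_edge_part(hi) or is_edge_part(hj):
            return "instance_occlusion"
        return "inconsistent_interior_fg"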

  13. LayoutCRF Potentials: Pair-wise • The value of the pair-wise potential varies according to the transition type • eij is an image-based edge cost, a contrast term estimated for each image, which encourages object edges to align with image boundaries
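
One common contrast-sensitive choice for such an edge cost (an assumption here; the paper's exact definition may differ) is

    e_{ij} \;=\; \exp\!\big( -\beta \, \| x_i - x_j \|^2 \big), \qquad \beta \;=\; \big( 2 \, \langle \| x_i - x_j \|^2 \rangle \big)^{-1}

where the average ⟨·⟩ runs over neighbouring pixel pairs in the image, which is what makes the term image-specific.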

  14. LayoutCRF Potentials: Instance • Look-up tables (histograms) • Encourage the correct spatial layout of parts for each object instance by gravitating parts towards their expected positions, given the transformation of the instance • The position of pixel i is inverse-transformed by the transformation Tm before the look-up, and a weighting parameter controls the strength of the potential
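
A minimal sketch of the look-up, assuming a learned table part_histograms indexed by instance-centric position and a transformation object with an inverse() method (both are illustrative assumptions):

    import numpy as np

    def instance_potential(h_i, pos_i, T_m, part_histograms, weight):
        """Score part label h_i at pixel position pos_i: inverse-transform
        the position into the instance's reference frame with T_m, then
        look up the learned distribution over parts at that position."""
        u, v = T_m.inverse(pos_i)                 # instance-centric coordinates
        p = part_histograms[u][v].get(h_i, 1e-6)  # small floor avoids log(0)
        return weight * np.log(p)                 # weight scales the potential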

  15. LayoutCRF: What comes next? • We just defined the LayoutCRF and its potentials • First we need to learn the parameters of the LayoutCRF from labeled training data • Then we apply the model to a new image (inference) to obtain a detection and segmentation

  16. Learning (the model parameters) • Supervised: the algorithm requires foreground/background segmentations, but not part labels

  17. Unary Potential and Part Labeling • Part labeling for the training images is initialized from a dense regular grid fitted to the object bounding box; the grid is spatially quantized so that each part covers several pixels (roughly 8×8 on average) • Unary classifiers are learned, then a new labeling is inferred • *Two iterations are sufficient
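
A sketch of that initialization, assuming the bounding box size is given in pixels and a roughly 8x8-pixel grid cell per part (sizes and names are illustrative):

    import numpy as np

    def initial_part_labels(box_height, box_width, cell=8):
        """Assign every pixel inside the object bounding box to a part on a
        dense regular grid; parts are numbered 1..H, 0 is left for background."""
        cols = int(np.ceil(box_width / cell))
        labels = np.zeros((box_height, box_width), dtype=int)
        for y in range(box_height):
            for x in range(box_width):
                labels[y, x] = 1 + (y // cell) * cols + (x // cell)
        return labels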

  18. Learning Pair-wise Potentials • Parameters are learned via cross-validation by a search over a sensible range of positive values • Gradient-based maximum-likelihood learning is too slow (future work: more efficient means of learning these parameters)

  19. Learning Instance Potentials • Deformed part labelings of all training images are aligned on their segmentation-mask centroids • A bounding box is placed around the part labelings, relative to the centroid • For each pixel within the bounding box, the distribution over part labels is learned by histogramming the deformed training-image labels, giving an empirical distribution over parts h given position w
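
A sketch of the histogramming step, assuming the deformed labelings have already been aligned on their mask centroids and cropped to a common bounding box (data layout and names are assumptions):

    from collections import Counter

    def learn_instance_histograms(aligned_labelings, box_h, box_w):
        """Histogram the part label observed at each bounding-box position
        across training images, then normalize to get an empirical
        distribution over parts h given position w."""
        counts = [[Counter() for _ in range(box_w)] for _ in range(box_h)]
        for labels in aligned_labelings:          # labels: box_h x box_w array
            for y in range(box_h):
                for x in range(box_w):
                    counts[y][x][labels[y][x]] += 1
        return [[{h: c / sum(ctr.values()) for h, c in ctr.items()}
                 for ctr in row] for row in counts]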

  20. Inference (on a novel image) • Initially, we don’t know the number of object instances or their locations • Step 1: collapse part labels across instances, merge instance labels together, and remove transformations; MAP inference is performed to obtain the part labeling image h*

  21. Inference (on a novel image) • Step 2: determine the number of layout-consistent regions in h* using connected component analysis, where pixels are connected if they are layout-consistent • This gives an estimate of M (the number of object instances) and an initial instance labeling • T is then estimated separately for each instance label
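
A sketch of step 2 as a flood fill, where two 4-neighbours are connected only if both are foreground and their part labels are layout-consistent (the layout_consistent predicate and the background label 0 are assumptions):

    import numpy as np
    from collections import deque

    def layout_consistent_components(h_star, layout_consistent, bg_label=0):
        """Label connected regions of h_star; the number of regions found
        gives the estimate of M and an initial instance labeling."""
        H, W = h_star.shape
        comp = np.zeros((H, W), dtype=int)        # 0 = background / unvisited
        m = 0
        for sy in range(H):
            for sx in range(W):
                if comp[sy, sx] or h_star[sy, sx] == bg_label:
                    continue
                m += 1
                comp[sy, sx] = m
                queue = deque([(sy, sx)])
                while queue:
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < H and 0 <= nx < W and not comp[ny, nx]
                                and h_star[ny, nx] != bg_label
                                and layout_consistent(h_star[y, x], h_star[ny, nx])):
                            comp[ny, nx] = m
                            queue.append((ny, nx))
        return comp, m        # m is the estimated number of object instances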

  22. Inference (on a novel image) • Step 3: re-run MAP inference with the full model to obtain the full labeling h, which now distinguishes between instances

  23. Approximate MAP inference via the Annealed Expansion Move Algorithm • Alternates regular-grid expansion moves at random offsets with standard alpha expansions (for changing to the BG label) • The annealing schedule weakens the pair-wise potential during early stages by raising it to a power less than one
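
A high-level sketch of the schedule only; the move operators themselves (graph-cut based grid expansion and alpha expansion to background) are assumed helpers, and the powers and sweep counts are illustrative, not the paper's values:

    import random

    def annealed_map_inference(h, grid_expansion, alpha_expansion_to_bg,
                               powers=(0.3, 0.6, 1.0), sweeps=3, cell=8):
        """Alternate regular-grid expansion moves at random offsets with
        standard alpha expansions to the BG label, while the pair-wise
        potential is weakened early on by raising it to a power < 1."""
        for power in powers:                       # annealing schedule
            for _ in range(sweeps):
                offset = (random.randrange(cell), random.randrange(cell))
                h = grid_expansion(h, offset, pairwise_power=power)
                h = alpha_expansion_to_bg(h, pairwise_power=power)
        return h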

  24. Results on Cars • *Training uses images that contain only one visible car instance • (A false positive is marked in the original result figure)

  25. Segmentation Accuracy on Cars • Segmentation accuracy evaluated on 20 randomly chosen images of cars, containing 34 car instances • Per-instance segmentation accuracy (ratio of the intersection to the union of the detected and ground-truth segmentations) = 0.67
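
For reference, the per-instance measure is the usual intersection-over-union, e.g. computed on boolean masks:

    def segmentation_accuracy(detected_mask, ground_truth_mask):
        """Ratio of the intersection to the union of the detected and
        ground-truth segmentations for one object instance."""
        intersection = (detected_mask & ground_truth_mask).sum()
        union = (detected_mask | ground_truth_mask).sum()
        return intersection / union if union else 0.0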

  26. Results on Faces

  27. Multi-class LayoutCRF (Future Work)

  28. Summary • LayoutCRF used to detect multiple instances of an object of a given class • Deformed-grid part labeling densely covers the object • Simultaneous detection and segmentation

  29. Questions?

  30. References • J. Lafferty, A. McCallum, and F. Pereira. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In International Conference on Machine Learning, 2001. • M. Szummer. Learning diagram parts with hidden random fields. In International Conference on Document Analysis and Recognition, 2005. • J. Winn and J. Shotton. The Layout Consistent Random Field for Recognizing and Segmenting Partially Occluded Objects. In CVPR, 2006.
