
Outline



Presentation Transcript


  1. Outline • Scene Understanding • Motivation for Context • Fixed order model (and constellation) • Known bag of objects WITH context • Independent TDP • Unknown bag of objects WITHOUT context • CASPER Distribution • Preliminary Experiments • Experimental Plan

  2. Scene Understanding - JL • Localize and classify all objects in an image • Not necessarily “segment” at the pixel level • A “description” of a scene is a pair of vectors l and rho where l gives the class for each instance and rho gives the location • We are trying to find P(l,rho)

  3. Why Context? • Represent “context” • Show LOOPS picture • How do you use context? • (l,rho) version

  4. Fixed Order Model • Known set of objects • Joint Gaussian over all centroids • P(l,rho) = 1{l == l_fixed_order} P(rho | l) • Problems: • We don't always know the exact set of objects • Facebook example • What if there are two instances from the same class? • Bedroom scene example
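The fixed order score above could be evaluated roughly as follows. This is a minimal sketch, assuming all centroids are flattened into one vector and the joint Gaussian is supplied as a mean and a full covariance matrix. The function names (`cholesky`, `log_mvn`, `log_fixed_order`) are hypothetical and do not come from the slides.

```python
import math


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("covariance is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low


def log_mvn(x, mean, cov):
    """Log density of a multivariate Gaussian with full covariance."""
    n = len(x)
    low = cholesky(cov)
    # Forward substitution: solve low * z = x - mean.
    z = []
    for i in range(n):
        r = x[i] - mean[i] - sum(low[i][k] * z[k] for k in range(i))
        z.append(r / low[i][i])
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(n))
    return -0.5 * (n * math.log(2.0 * math.pi) + log_det + sum(v * v for v in z))


def log_fixed_order(labels, rho_flat, fixed_labels, mean, cov):
    """log P(l, rho) = log 1{l == l_fixed_order} + log P(rho | l)."""
    if list(labels) != list(fixed_labels):
        return float("-inf")  # the indicator is zero for any other label vector
    return log_mvn(rho_flat, mean, cov)
```

Any scene whose label vector differs from the fixed one gets probability zero, which is exactly the limitation listed under "Problems".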

  5. TDP • Unknown set of objects • Gaussian over centroid for each • Centroids are independent • P(l,rho) = P(l) prod_i P(rho_i | l_i) • Problems: • This doesn't take pairwise constraints into account • We have lost context
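The independent factorization P(l,rho) = P(l) prod_i P(rho_i | l_i) can be sketched as below. Assumptions not in the slides: each class has a 2-D Gaussian with diagonal covariance, and log P(l) is computed elsewhere and passed in. The names `log_gauss_diag` and `log_independent_scene` are hypothetical.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance (one variance per axis)."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (a - m) ** 2 / v)
        for a, m, v in zip(x, mean, var)
    )


def log_independent_scene(labels, centroids, class_params, log_p_labels):
    """log P(l, rho) = log P(l) + sum_i log P(rho_i | l_i).

    class_params maps a class name to (mean, var) for its centroid Gaussian.
    """
    total = log_p_labels
    for label, rho in zip(labels, centroids):
        mean, var = class_params[label]
        total += log_gauss_diag(rho, mean, var)
    return total
```

Because each centroid term depends only on its own label, moving one object never changes the score of another, which is the loss of context noted on the slide.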

  6. CASPER • Unknown set of objects • Joint Gaussian over centroids given ANY set • P(l,rho) = P(l) P(rho | l) • Questions: • How do we represent P(l)? • How do we represent P(rho | l)? • How do we learn? • How do we infer?

  7. P(l) • Options: • Dirichlet Process • IID Multinomial • Other smart things
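As an illustration of the IID multinomial option only, here is a minimal sketch that scores an ordered label vector conditional on the number of instances. A complete P(l) would also need a distribution over that number, which the slides do not specify. The name `log_p_labels_iid` and the class probabilities used with it are hypothetical.

```python
import math


def log_p_labels_iid(labels, class_probs):
    """log P(l | n) when each of the n labels is drawn independently from class_probs."""
    total = 0.0
    for label in labels:
        p = class_probs.get(label, 0.0)
        if p <= 0.0:
            return float("-inf")  # a class with zero probability rules out the scene
        total += math.log(p)
    return total
```

The Dirichlet Process option is not shown here.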

  8. P(rho | l) - GH • Desiderata: • Correlations between rho's • Sharing of parameters between l's • ... • Options: • Independent • Learn a different Gaussian for every l • Can't share parameters, large number of l's • Gaussian Process • Correlation is not the natural space to represent these relationships • Product of Experts • Each “expert” represents a Gaussian offset between objects • This is where we have spent the most time

  9. CASPER P(rho|l) - JL • Some math and examples: • P(rho,d|l) = 1/Z prod_ij P_c(rho_i - rho_j)^(d_ijc) • P(d|l) = Multinomial • P(rho|d,l) = Gaussian • Precision space view
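One way to read the "precision space view": if each active expert is a Gaussian on the offset rho_i - rho_j, their product is Gaussian in rho, and its natural parameters (precision matrix Lambda and linear term h) can be accumulated edge by edge. The sketch below does this for a single coordinate axis, assuming scalar variances per expert. The function name and the edge format are hypothetical, and the list of edges stands in for the entries where d_ijc = 1.

```python
def experts_to_precision(n, edges):
    """Accumulate natural parameters of prod_edges N(rho_i - rho_j; mu, var).

    edges is a list of (i, j, mu, var) tuples for the active experts.
    Returns (lam, h) such that log P(rho | d, l) = -0.5 rho' lam rho + h' rho + const.
    """
    lam = [[0.0] * n for _ in range(n)]
    h = [0.0] * n
    for i, j, mu, var in edges:
        tau = 1.0 / var  # precision of this expert
        lam[i][i] += tau
        lam[j][j] += tau
        lam[i][j] -= tau
        lam[j][i] -= tau
        h[i] += tau * mu
        h[j] -= tau * mu
    return lam, h
```

Each edge only touches four entries of the precision matrix, so the structure of the graph is visible directly in its sparsity pattern. With offset experts alone the matrix is a weighted graph Laplacian: its rows sum to zero, so it is singular until something anchors the absolute position of the scene.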

  10. Learning the Experts • Training set with (l,rho) pairs per image • Gibbs over the hidden variables: • Graph for the image (d's) • Loop over edges

  11. Generative Process • Show some synthetic generated images

  12. Preliminary Experiments - GH • Bedroom and Streets Scenes from LabelMe: • Features SIFT • Features (x,w)

  13. Learning/Inference in Full Model • Three-stage Gibbs: • Features to Instances • Black box: previous algorithm • Graph for the image (d's) • Loop over edges • Instances to Classes • Training • Supervise feature-to-instance and instance-to-class assignments • Testing • Introduce new images and Gibbs away

  14. Results • Show Sucky Pictures

  15. Problems • It's hard to evaluate • If your goal is detection you lose to discriminative approaches • Context is not the main reason why TDP is failing • [If you evaluate based on discovered structure, then context is a lower order consideration]

  16. New Framework • Detectors for a set of object classes • Turn down the threshold • Each detection gets an l_i variable and has a centroid rho_i • Goal is to assign l_i's to every detection in a way that uses both the “detection strength” and the context of other detections
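A rough sketch of what such a joint assignment could look like, under several assumptions that are not in the slides: detection strength is available as a per-class log score, context is a set of pairwise Gaussian offset experts keyed by class pair, and pairs with no expert contribute nothing. The slides do not specify an inference procedure, so exhaustive search stands in here. It is exponential in the number of detections and only suitable for toy cases. All names are hypothetical.

```python
import itertools
import math


def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def context_term(label_i, rho_i, label_j, rho_j, experts):
    """Log score of one pair under a Gaussian offset expert, 0.0 if none applies.

    experts maps (class_a, class_b) to (mean_offset, var) for rho_a - rho_b.
    """
    if (label_i, label_j) in experts:
        a, b = rho_i, rho_j
        mean, var = experts[(label_i, label_j)]
    elif (label_j, label_i) in experts:
        a, b = rho_j, rho_i
        mean, var = experts[(label_j, label_i)]
    else:
        return 0.0
    return sum(log_gauss(a[k] - b[k], mean[k], var) for k in range(2))


def assignment_score(labels, detections, experts):
    """Detection strength plus pairwise context for one joint labeling."""
    total = sum(det["log_score"][l] for det, l in zip(detections, labels))
    for i, j in itertools.combinations(range(len(labels)), 2):
        total += context_term(
            labels[i], detections[i]["rho"], labels[j], detections[j]["rho"], experts
        )
    return total


def best_assignment(detections, classes, experts):
    """Exhaustive search over all labelings (toy-sized problems only)."""
    return max(
        itertools.product(classes, repeat=len(detections)),
        key=lambda labels: assignment_score(labels, detections, experts),
    )
```

In this form a weak detection can be relabeled when its position relative to a confident detection matches a learned offset, which is the behavior the slide asks for.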

  17. Possible Datasets • Bedrooms • Faces • Overhead Traffic
