Learning to Segment with Diverse Data

M. Pawan Kumar, Stanford University

Presentation Transcript


  1. Learning to Segment with Diverse Data. M. Pawan Kumar, Stanford University

  2. Semantic Segmentation sky tree car road grass

  3. Segmentation Models. A model with parameters $w$ relates an image $x$ to a labeling $y$ (sky, tree, car, road, grass). $P(x,y; w) \propto \exp(-E(x,y; w))$, so $y^* = \arg\max_y P(x,y; w) = \arg\min_y E(x,y; w)$. Learn accurate parameters.
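
The equivalence between maximizing $P$ and minimizing $E$ can be checked on a tiny example. The sketch below is only an illustration: the three-pixel "image", the two labels, the cost values and the neighbour penalty `w_pair` are all made up, and the energy is a generic unary-plus-pairwise form rather than the model from the talk.

```python
import itertools
import math

LABELS = ["sky", "road"]

def energy(y, unary, w_pair=1.0):
    """Toy energy: per-pixel costs plus a penalty for each label change
    between neighbouring pixels (hypothetical form, for illustration)."""
    e = sum(unary[p][label] for p, label in enumerate(y))
    e += w_pair * sum(1 for a, b in zip(y, y[1:]) if a != b)
    return e

# Hypothetical per-pixel costs for a three-pixel image.
unary = [
    {"sky": 0.2, "road": 1.5},
    {"sky": 0.9, "road": 1.0},
    {"sky": 2.0, "road": 0.1},
]

# Enumerate every labeling and turn energies into probabilities.
labelings = list(itertools.product(LABELS, repeat=len(unary)))
energies = {y: energy(y, unary) for y in labelings}
Z = sum(math.exp(-e) for e in energies.values())  # normalising constant
probs = {y: math.exp(-e) / Z for y, e in energies.items()}

y_min_energy = min(energies, key=energies.get)
y_max_prob = max(probs, key=probs.get)
print(y_min_energy, y_max_prob)
```

Enumeration is only feasible because the example has eight labelings; real images need the energy minimization methods discussed later in the talk.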

  4. Fully Supervised Data

  5. “Fully” Supervised Data Specific foreground classes, generic background class PASCAL VOC Segmentation Datasets

  6. “Fully” Supervised Data Specific background classes, generic foreground class Stanford Background Datasets

  7. Supervised Learning • J. Gonfaus et al. Harmony Potentials for Joint Classification and Segmentation. CVPR, 2010 • S. Gould et al. Multi-Class Segmentation with Relative Location Prior. IJCV, 2008 • S. Gould et al. Decomposing a Scene into Geometric and Semantically Consistent Regions. ICCV, 2009 • X. He et al. Multiscale Conditional Random Fields for Image Labeling. CVPR, 2004 • S. Konishi et al. Statistical Cues for Domain Specific Image Segmentation with Performance Analysis. CVPR, 2000 • L. Ladicky et al. Associative Hierarchical CRFs for Object Class Image Segmentation. ICCV, 2009 • F. Li et al. Object Recognition as Ranking Holistic Figure-Ground Hypotheses. CVPR, 2010 • J. Shotton et al. TextonBoost: Joint Appearance, Shape and Context Modeling for Multi-Class Object Recognition and Segmentation. ECCV, 2006 • J. Verbeek et al. Scene Segmentation with Conditional Random Fields Learned from Partially Labeled Images. NIPS, 2007 • Y. Yang et al. Layered Object Detection for Multi-Class Segmentation. CVPR, 2010 Generic classes, burdensome annotation

  8. Weakly Supervised Data Bounding Boxes for Objects PASCAL VOC Detection Datasets Thousands of images

  9. Weakly Supervised Data Image-Level Labels “Car” ImageNet, Caltech… Thousands of images

  10. Weakly Supervised Learning • B. Alexe et al. ClassCut for Unsupervised Class Segmentation. ECCV, 2010 • H. Arora et al. Unsupervised Segmentation of Objects Using Efficient Learning. CVPR, 2007 • L. Cao et al. Spatially Coherent Latent Topic Model for Concurrent Segmentation and Classification of Objects and Scenes. ICCV, 2007 • J. Winn et al. LOCUS: Learning Object Classes with Unsupervised Segmentation. ICCV, 2005 Binary segmentation, limited data

  11. Diverse Data “Car”

  12. Diverse Data Learning • Avoid “generic” classes • Take advantage of the cleanliness of supervised data • Take advantage of the vast availability of weakly supervised data

  13. Outline • Model • Energy Minimization • Parameter Learning • Results • Future Work

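  14. Region-Based Model (Gould, Fulton and Koller, ICCV 2009). Pixels are grouped into regions. Unary potential: $\theta_r(i) = w_i^\top \Psi_r(x)$, where $\Psi_r(x)$ are features extracted from region $r$ of image $x$. For example, $\Psi_r(x)$ = average [R G B], with $w_{\text{grass}} = [0, -10, 0]$ and $w_{\text{water}} = [0, 0, -10]$. Pairwise potential: $\theta_{rr'}(i,j) = w_{ij}^\top \Psi_{rr'}(x)$. For example, $\Psi_{rr'}(x)$ = constant > 0, with $w_{\text{“car above ground”}} \ll 0$ and $w_{\text{“ground above car”}} \gg 0$.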
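
As a rough illustration of the unary potential, the sketch below computes the average [R G B] feature of a region and scores it with the two weight vectors quoted on the slide. The pixel values are invented, colours are assumed to lie in [0, 1], and the helper names are hypothetical. Lower potential means the label is preferred.

```python
def region_feature(pixels):
    """Psi_r(x): average [R, G, B] over the pixels of one region."""
    n = len(pixels)
    return [sum(p[c] for p in pixels) / n for c in range(3)]

def dot(w, psi):
    return sum(wi * fi for wi, fi in zip(w, psi))

# Weight vectors from the slide.
WEIGHTS = {
    "grass": [0.0, -10.0, 0.0],
    "water": [0.0, 0.0, -10.0],
}

def unary_potentials(pixels):
    """theta_r(i) = w_i^T Psi_r(x) for every label i."""
    psi = region_feature(pixels)
    return {label: dot(w, psi) for label, w in WEIGHTS.items()}

# Hypothetical regions: one greenish, one bluish (RGB in [0, 1]).
green_region = [(0.1, 0.8, 0.1), (0.2, 0.9, 0.1)]
blue_region = [(0.1, 0.2, 0.9), (0.0, 0.1, 0.8)]

theta_green = unary_potentials(green_region)
theta_blue = unary_potentials(blue_region)
best_green = min(theta_green, key=theta_green.get)
best_blue = min(theta_blue, key=theta_blue.get)
print(best_green, best_blue)
```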

  15. Region-based Model. $E(x,y) \propto -\log P(x,y)$. $E(x,y)$ = Unaries + Pairwise $= w^\top \Psi(x,y)$. Best segmentation of an image? Accurate $w$?

  16. Outline • Model • Energy Minimization • Parameter Learning • Results • Future Work Kumar and Koller, CVPR 2010

  17. Move-Making: • Besag. On the Statistical Analysis of Dirty Pictures, JRSS, 1986 • Boykov et al. Fast Approximate Energy Minimization via Graph Cuts, PAMI, 2001 • Komodakis et al. Fast, Approximately Optimal Solutions for Single and Dynamic MRFs, CVPR, 2007 • Lempitsky et al. Fusion Moves for Markov Random Field Optimization, PAMI, 2010 Message-Passing: • T. Minka. Expectation Propagation for Approximate Bayesian Inference, UAI, 2001 • Murphy. Loopy Belief Propagation: An Empirical Study, UAI, 1999 • J. Winn et al. Variational Message Passing, JMLR, 2005 • J. Yedidia et al. Generalized Belief Propagation, NIPS, 2001 Convex Relaxations: • Chekuri et al. Approximation Algorithms for Metric Labeling, SODA, 2001 • M. Goemans et al. Improved Approximate Algorithms for Maximum-Cut, JACM, 1995 • M. Muramatsu et al. A New SOCP Relaxation for Max-Cut, JORJ, 2003 • Ravikumar et al. QP Relaxations for Metric Labeling, ICML, 2006 Hybrid Algorithms: • K. Alahari et al. Dynamic Hybrid Algorithms for MAP Inference, PAMI 2010 • P. Kohli et al. On Partial Optimality in Multilabel MRFs, ICML, 2008 • C. Rother et al. Optimizing Binary MRFs via Extended Roof Duality, CVPR, 2007 Which one is the best relaxation?

  18. Convex Relaxations (Kumar, Kolmogorov and Torr, NIPS, 2007). [Plot: tightness against time, with LP (1976), SOCP (2003) and QP (2006).] We expect … LP provably better than QP, SOCP. Use LP!!

  19. Energy Minimization Fixed Regions Find Regions Find Labels LP Relaxation

  20. Energy Minimization: Find Regions, Find Labels. The number of possible regions is super-exponential in the number of pixels. Can we prune regions? Good region: homogeneous appearance, texture. Bad region: inhomogeneous appearance, texture. Use low-level segmentation for candidate regions.

  21. Energy Minimization Mean-Shift Segmentation Spatial Bandwidth = 10

  22. Energy Minimization Mean-Shift Segmentation Spatial Bandwidth = 20

  23. Energy Minimization Mean-Shift Segmentation Spatial Bandwidth = 30

  24. Energy Minimization Car “Combine” Multiple Segmentations

  25. Dictionary of Regions: Select Regions, Assign Classes (Kumar and Koller, CVPR 2010). $\min \sum \theta_r(i)\, y_r(i) + \sum \theta_{rr'}(i,j)\, y_r(i)\, y_{r'}(j)$ subject to $y_r(i) \in \{0,1\}$ for $i = 0, 1, 2, \ldots, C$, where $i = 0$ marks a region that is not selected. Selected regions cover the entire image. No two selected regions overlap. Efficient DD (Komodakis and Paragios, CVPR, 2009).
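
To make the two constraints concrete, the sketch below solves a four-pixel instance by brute-force enumeration. It is not the dual decomposition approach cited on the slide: it keeps only the unary terms, and the region dictionary, class names and potentials are hypothetical. A region mapped to `None` plays the role of "not selected".

```python
import itertools

PIXELS = {0, 1, 2, 3}
# Hypothetical dictionary of overlapping candidate regions.
REGIONS = {"A": {0, 1}, "B": {2, 3}, "C": {0, 1, 2, 3}, "D": {1, 2}}
CLASSES = ["grass", "sky"]
# Hypothetical unary potentials theta_r(i); lower is better.
THETA = {
    "A": {"grass": -2.0, "sky": 1.0},
    "B": {"grass": 0.5, "sky": -3.0},
    "C": {"grass": -1.0, "sky": -1.5},
    "D": {"grass": -4.0, "sky": -4.0},
}
NOT_SELECTED = None

def is_valid(assignment):
    """True if the selected regions cover every pixel exactly once,
    i.e. they cover the image and no two of them overlap."""
    covered = [p for r, c in assignment.items()
               if c is not NOT_SELECTED for p in REGIONS[r]]
    return sorted(covered) == sorted(PIXELS)

def energy(assignment):
    """Sum of unary potentials of the selected regions (pairwise terms omitted)."""
    return sum(THETA[r][c] for r, c in assignment.items() if c is not NOT_SELECTED)

best_energy, best_assignment = None, None
options = [NOT_SELECTED] + CLASSES
for choice in itertools.product(options, repeat=len(REGIONS)):
    assignment = dict(zip(REGIONS, choice))
    if is_valid(assignment):
        e = energy(assignment)
        if best_energy is None or e < best_energy:
            best_energy, best_assignment = e, assignment

print(best_energy, best_assignment)
```

Region D has the lowest potentials but cannot be used, because no other region covers pixels 0 and 3 without overlapping it. This is the kind of interaction the constraints introduce.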

  26. Comparison. Parameters learned using Gould, Fulton and Koller, ICCV 2009. [Figure rows: Image, Gould, Our. Plots: Accuracy, Energy.] Statistically significant improvement (paired t-test).

  27. Outline • Model • Energy Minimization • Parameter Learning • Results • Future Work Kumar, Turki, Preston and Koller, In Submission

  28. Supervised Learning. $P(x,y) \propto \exp(-E(x,y)) = \exp(-w^\top \Psi(x,y))$. Well-studied problem, efficient solutions. [Figure: $P(y|x_1)$ and $P(y|x_2)$ plotted over labelings $y$, with the ground truths $y_1$ and $y_2$ marked.]

  29. Diverse Data Learning Generic Class Annotation x a h

  30. Diverse Data Learning Bounding Box Annotation x a h

  31. Diverse Data Learning Image Level Annotation x a = “Cow” h

  32. Learning with Missing Information. Expectation Maximization: computationally inefficient. • A. Dempster et al. Maximum Likelihood from Incomplete Data via the EM Algorithm. JRSS, 1977. • M. Jamshidian et al. Acceleration of the EM Algorithm by Using Quasi-Newton Methods. JRSS, 1997. • R. Neal et al. A View of the EM Algorithm that Justifies Incremental, Sparse, and Other Variants. LGM, 1999. • R. Sundberg. Maximum Likelihood Theory for Incomplete Data from an Exponential Family. SJS 1974. Hard EM, Latent Support Vector Machine: only requires an energy minimization algorithm. • P. Felzenszwalb et al. A Discriminatively Trained, Multiscale, Deformable Part Model. CVPR, 2008. • C.-N. Yu et al. Learning Structural SVMs with Latent Variables. ICML, 2009.

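  33. Latent SVM (Felzenszwalb et al., CVPR 2008; Yu et al., ICML 2009). $\min_w \sum_i \xi_i + \lambda \|w\|^2$ subject to $\min_{h_i} w^\top \Psi(x_i, a_i, h_i) + \Delta(a_i, a, h) - \xi_i \le w^\top \Psi(x_i, a, h)$ for all $a, h$. The left-hand side contains the energy of the ground truth and the right-hand side is the energy of other labelings. $\Delta$ is a user-defined loss, for example the number of disagreements. The problem is a difference of convex functions, so it can be solved with CCCP.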

  34. CCCP (Felzenszwalb et al., CVPR 2008; Yu et al., ICML 2009). Start with an initial estimate $w_0$. Update $h_i = \arg\min_h w_t^\top \Psi(x_i, a_i, h)$ by energy minimization. Update $w_{t+1}$ by solving a convex problem: $\min_w \sum_i \xi_i + \lambda \|w\|^2$ subject to $w^\top \Psi(x_i, a_i, h_i) - w^\top \Psi(x_i, a, h) \le \xi_i - \Delta(a_i, a, h)$.
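
The alternation can be sketched on a toy problem. Everything below is an assumption made for illustration: the joint feature map `psi`, the 0/1 loss, the two training examples, and the use of plain subgradient descent (keeping the best iterate) in place of an exact convex solver for the update of $w$.

```python
def psi(x, a, h, num_classes=2):
    """Hypothetical joint feature: the feature vector at latent position h,
    placed in the block belonging to class a."""
    d = len(x[h])
    f = [0.0] * (num_classes * d)
    f[a * d:(a + 1) * d] = x[h]
    return f

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def impute_latent(w, x, a):
    """Energy minimization step: h = argmin_h w^T Psi(x, a, h)."""
    return min(range(len(x)), key=lambda h: dot(w, psi(x, a, h)))

def loss_augmented(w, x, a_true, num_classes=2):
    """(a, h) minimising w^T Psi(x, a, h) - Delta, with a 0/1 loss on a."""
    cands = [(a, h) for a in range(num_classes) for h in range(len(x))]
    return min(cands, key=lambda c: dot(w, psi(x, *c)) - (c[0] != a_true))

def objective(w, data, latents, lam):
    """Sum of slacks plus lam * ||w||^2 for fixed latent variables."""
    total = lam * dot(w, w)
    for (x, a), h in zip(data, latents):
        a_hat, h_hat = loss_augmented(w, x, a)
        total += dot(w, psi(x, a, h)) - (dot(w, psi(x, a_hat, h_hat)) - (a_hat != a))
    return total

def convex_step(w, data, latents, lam, steps=200, lr=0.05):
    """Approximate convex update by subgradient descent, keeping the best iterate."""
    best_w, best_val = list(w), objective(w, data, latents, lam)
    w = list(w)
    for _ in range(steps):
        grad = [2 * lam * wi for wi in w]
        for (x, a), h in zip(data, latents):
            a_hat, h_hat = loss_augmented(w, x, a)
            gt, other = psi(x, a, h), psi(x, a_hat, h_hat)
            grad = [g + p - q for g, p, q in zip(grad, gt, other)]
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
        val = objective(w, data, latents, lam)
        if val < best_val:
            best_w, best_val = list(w), val
    return best_w

def cccp(data, w0, lam=0.1, outer_iters=5):
    w = list(w0)
    for _ in range(outer_iters):
        latents = [impute_latent(w, x, a) for x, a in data]  # update h_i
        w = convex_step(w, data, latents, lam)               # update w
    return w

def latent_objective(w, data, lam):
    return objective(w, data, [impute_latent(w, x, a) for x, a in data], lam)

# Two hypothetical examples: (list of per-position features, class annotation).
data = [([[1.0, 0.2], [0.1, 0.1]], 0), ([[0.1, 0.1], [0.2, 1.0]], 1)]
w_learned = cccp(data, [0.0] * 4)
print(latent_objective([0.0] * 4, data, 0.1), latent_objective(w_learned, data, 0.1))
```

Because the update keeps the best iterate, the latent objective never increases from one outer iteration to the next. The procedure can still stop at a poor local minimum, depending on the initial estimate.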

  35. Generic Class Annotation Generic background with specific background Generic foreground with specific foreground

  36. Bounding Box Annotation Every row “contains” the object Every column “contains” the object

  37. Image Level Annotation “Cow” The image “contains” the object
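
The bounding box and image-level constraints can be written as simple checks on a candidate segmentation. This is a minimal sketch with hypothetical helper names, a binary object mask stored as nested lists, and box coordinates taken as inclusive (top, left, bottom, right).

```python
def satisfies_bounding_box(mask, box):
    """True if every row and every column of the box contains
    at least one object pixel."""
    top, left, bottom, right = box
    rows = range(top, bottom + 1)
    cols = range(left, right + 1)
    every_row = all(any(mask[r][c] for c in cols) for r in rows)
    every_col = all(any(mask[r][c] for r in rows) for c in cols)
    return every_row and every_col

def satisfies_image_label(labels, cls):
    """True if at least one pixel of the labeling carries the class."""
    return any(cls in row for row in labels)

# Hypothetical 4x4 mask with a two-pixel diagonal object.
mask = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
tight_ok = satisfies_bounding_box(mask, (1, 1, 2, 2))
loose_ok = satisfies_bounding_box(mask, (0, 0, 2, 2))

labels = [["grass", "cow"], ["grass", "grass"]]
has_cow = satisfies_image_label(labels, "cow")
has_car = satisfies_image_label(labels, "car")
print(tight_ok, loose_ok, has_cow, has_car)
```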

  38. CCCP (Felzenszwalb et al., CVPR 2008; Yu et al., ICML 2009). Start with an initial estimate $w_0$. Update $h_i = \arg\min_h w_t^\top \Psi(x_i, a_i, h)$ by energy minimization. Update $w_{t+1}$ by solving a convex problem: $\min_w \sum_i \xi_i + \lambda \|w\|^2$ subject to $w^\top \Psi(x_i, a_i, h_i) - w^\top \Psi(x_i, a, h) \le \xi_i - \Delta(a_i, a, h)$. Bad Local Minimum!!

  39. EASY Grey road White sky Green grass

  40. EASY Blue water White sky Green grass

  41. HARD Cat? Cow? Horse?

  42. HARD Black Mountain? Red Sky? All images are not equal

  43. Math is for losers !! Real Numbers, Imaginary Numbers, $e^{i\pi} + 1 = 0$

  44. Self-Paced Learning. Euler was a genius!! Real Numbers, Imaginary Numbers, $e^{i\pi} + 1 = 0$

  45. Easy vs. Hard. Easy for human ≠ Easy for machine. Simultaneously estimate easiness and parameters.

  46. Self-Paced Learning (Kumar, Packer and Koller, NIPS 2010). Introduce $v_i \in \{0,1\}$, with $v_i = 1$ for easy examples and $v_i = 0$ for hard examples, then relax to $v_i \in [0,1]$. Start with an initial estimate $w_0$. Update $h_i = \arg\min_h w_t^\top \Psi(x_i, a_i, h)$. Update $w_{t+1}$ by solving $\min_{w,v} \sum_i v_i \xi_i - \sum_i v_i / K + \lambda \|w\|^2$ subject to $w^\top \Psi(x_i, a_i, h_i) - w^\top \Psi(x_i, a, h) \le \xi_i - \Delta(a_i, a, h)$. This is a biconvex optimization, solved by alternate convex search.

  47. Self-Paced Learning (Kumar, Packer and Koller, NIPS 2010). As Simple As CCCP!! Start with an initial estimate $w_0$. Update $h_i = \arg\min_h w_t^\top \Psi(x_i, a_i, h)$. Update $w_{t+1}$ by solving a biconvex problem: $\min_{w,v} \sum_i v_i \xi_i - \sum_i v_i / K + \lambda \|w\|^2$ subject to $w^\top \Psi(x_i, a_i, h_i) - w^\top \Psi(x_i, a, h) \le \xi_i - \Delta(a_i, a, h)$. Decrease $K \leftarrow K/\mu$.
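
For fixed parameters, the objective is linear in each $v_i$, so the best binary choice is $v_i = 1$ exactly when $\xi_i < 1/K$. The sketch below shows how shrinking $K$ admits harder examples. The slack values are invented and held fixed for clarity (in the full procedure they change as $w$ is re-estimated), and the divisor in the update of $K$ is written as `mu`, a name and value chosen here for illustration.

```python
def select_easy(slacks, K):
    """Optimal v for fixed w: v_i = 1 if xi_i < 1/K, else 0."""
    threshold = 1.0 / K
    return [1 if xi < threshold else 0 for xi in slacks]

# Hypothetical slack values xi_i for four training examples.
slacks = [0.05, 0.4, 1.2, 3.0]

K, mu = 4.0, 2.0  # assumed starting value and annealing factor
schedule = []
for _ in range(4):
    schedule.append(select_easy(slacks, K))
    K /= mu  # decrease K so that more examples count as easy

for v in schedule:
    print(v)
```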

  48. Self-Paced Learning (Kumar, Packer and Koller, NIPS 2010). Image Classification: input $x$, latent variable $h$, annotation $a$ = “Deer”. [Plot: test error.] Motif Finding: input $x$, annotation $a$ = -1 or +1, $h$ = motif position. [Plot: test error.]

  49. Learning to Segment CCCP SPL

  50. Learning to Segment Iteration 1 CCCP SPL
