
CVPR/ICCV 09 Paper Reading






Presentation Transcript


  1. CVPR/ICCV 09 Paper Reading Dan Wang Nov. 6, 2009

  2. Papers • CVPR: • Learning Color and Locality Cues for Moving Object Detection and Segmentation •  ICCV: • Texel-based Texture Segmentation

  3. Learning Color and Locality Cues for Moving Object Detection and Segmentation Feng Liu and Michael Gleicher

  4. Authors • What does the paper do? • Problem of previous methods? • How to do?

  6. Author 1 Feng Liu • Computer Sciences Department at the University of Wisconsin, Madison • Graduate student • Publications • Feng Liu, Michael Gleicher, Hailin Jin and Aseem Agarwala. Content-Preserving Warps for 3D Video Stabilization. ACM SIGGRAPH 2009 • Feng Liu, Yu-hen Hu and Michael Gleicher. Discovering Panoramas in Web Videos. ACM Multimedia 2008 • Feng Liu and Michael Gleicher. Texture-Consistent Shadow Removal. ECCV 2008 • Feng Liu and Michael Gleicher. Video Retargeting: Automating Pan and Scan.  ACM Multimedia 2006

  7. Author 2 Michael Gleicher • Computer Sciences Department at the University of Wisconsin, Madison • Professor • Positions • 2009 – present – Professor • 2004 – 2009 – Associate Professor • 1998 - 2004 – Assistant Professor • http://pages.cs.wisc.edu/~gleicher/CV.pdf

  8. Authors • What does the paper do? • Problem of previous methods? • How to do?

  9. Problem of Previous Methods • Most previous automatic methods rely on object or camera motion to detect the moving object. • Small motion of the object or camera does not provide sufficient information for these methods.

  10. Abstract • The paper presents an algorithm for automatically detecting and segmenting a moving object from a monocular video. • Existing methods rely on motion to detect the moving object; when motion is sparse and insufficient, they fail. • What does this paper do? • An unsupervised algorithm to learn object color and locality cues from the sparse motion information. • How? • Detect key frames and sub-objects • Learn color and locality cues from the sub-objects • Combine the cues in an MRF framework

  11. What is a “Moving Object”? • Some compact regions with apparent motion different from the background. • How to detect a moving object: • Estimate the global motion in the video • Calculate the discrepancy at each pixel between the object motion and the global motion

  12. Detect Moving Object • Model: • Use a homography (projective transformation) to model the global motion between two consecutive frames (details in [19]) • Feature: • Use a SIFT feature-based method to estimate the homography

  13. Detect Moving Object • With the homography, we calculate the motion cue mc at pixel (x, y) as the per-pixel discrepancy between the observed motion and the global motion:
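The slide's formula is lost in this transcript, but the quantity it describes can be sketched: under the estimated homography H, a background pixel's next-frame position is predicted by applying H, and the motion cue is the discrepancy between that prediction and the pixel's observed position. A minimal sketch; the function name and the exact distance measure are assumptions, not the paper's definition:

```python
import numpy as np

def motion_cue(H, p_t, p_t1):
    """Motion cue at a pixel: distance between the observed position in
    frame t+1 and the position predicted by the global homography H.
    p_t, p_t1 are (x, y) pixel coordinates; H is a 3x3 matrix."""
    x, y = p_t
    q = H @ np.array([x, y, 1.0])   # apply the homography in homogeneous coords
    q = q[:2] / q[2]                # back to Cartesian coordinates
    return float(np.linalg.norm(np.asarray(p_t1, float) - q))

# Under the identity homography, a pixel that stays put has zero motion cue,
# while a pixel that moves independently gets a large cue.
H = np.eye(3)
print(motion_cue(H, (10, 20), (10, 20)))  # 0.0
print(motion_cue(H, (10, 20), (13, 24)))  # 5.0
```

Pixels whose cue is large are the ones that disagree with the global (camera) motion, which is what the slide's detection step looks for.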

  14. Key Frame Extraction • Definition • The frame where a moving object or its part can be reliably inferred from motion cues. • Motion cues are likely reliable when they are strong and compact.
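The "strong and compact" criterion can be illustrated with a toy score: total cue mass of the strong pixels, weighted by how densely they fill their bounding box. This is a hypothetical heuristic for intuition, not the paper's actual measure:

```python
import numpy as np

def keyframe_score(mc, thresh=1.0):
    """Strength x compactness of a motion-cue map mc: total cue mass of the
    strong pixels, weighted by how densely they fill their bounding box."""
    strong = mc > thresh
    if not strong.any():
        return 0.0
    ys, xs = np.nonzero(strong)
    box_area = (ys.max() - ys.min() + 1) * (xs.max() - xs.min() + 1)
    compactness = strong.sum() / box_area
    return float(mc[strong].sum() * compactness)

# A tight blob of cues scores higher than the same cue mass scattered widely.
blob = np.zeros((10, 10)); blob[4:6, 4:6] = 5.0
scattered = np.zeros((10, 10)); scattered[[0, 0, 9, 9], [0, 9, 0, 9]] = 5.0
print(keyframe_score(blob) > keyframe_score(scattered))  # True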

  15. Segment Moving Sub-objects from Key Frames • Problem • Not all pixels of the moving object have significant motion cues: the motion cues are sparse!

  16. Segment Moving Sub-objects from Key Frames • Solution • Neighboring pixels are likely to have the same label. • Neighboring pixels with similar colors are even more likely to have the same label. • Solved by the graph-cut algorithm
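The two smoothness assumptions map onto the standard contrast-sensitive pairwise term used in graph-cut segmentation: neighboring pixels are linked by an edge whose weight decays with their color difference, so cutting between similar-colored neighbors (giving them different labels) is expensive. A sketch; the parameter values beta and lam are illustrative, not taken from the paper:

```python
import numpy as np

def pairwise_weight(ci, cj, beta=0.05, lam=10.0):
    """Contrast-sensitive smoothness weight between neighboring pixels with
    colors ci, cj: large when colors are similar (labels want to agree),
    small when colors differ (a label boundary is cheap here)."""
    d2 = float(np.sum((np.asarray(ci, float) - np.asarray(cj, float)) ** 2))
    return lam * np.exp(-beta * d2)

# Identical colors get the full weight; very different colors get almost none.
print(pairwise_weight([100, 100, 100], [100, 100, 100]))   # 10.0
print(pairwise_weight([0, 0, 0], [255, 255, 255]) < 1e-3)  # True
```

In a graph-cut formulation these weights become edge capacities between neighboring pixel nodes, alongside per-pixel data terms from the motion cues.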

  17. Segment Moving Sub-objects from Key Frames • MRF priors on the pixel labels model the interaction between neighboring pixels (the slide's formula sums over the pixels and, for each pixel i, over its neighbors)

  18. Segment Moving Sub-objects from Key Frames • The likelihood of image I given a labeling can be modeled as follows:

  19. Learning Color and Locality Cues • Assumption • The moving sub-objects from all the key frames form a complete sampling of the moving objects. • Procedure • Lab color space • Build GMM
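The "build GMM" step can be sketched with a minimal EM fit of a diagonal-covariance Gaussian mixture over pixel features. The paper fits its color model to Lab pixels from the key-frame sub-objects; this stand-alone version works on any feature matrix and is a simplified stand-in, not the paper's implementation:

```python
import numpy as np

def fit_gmm(X, k=2, iters=50, seed=0):
    """Minimal EM for a diagonal-covariance Gaussian mixture.
    Farthest-point initialization keeps the initial means apart."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    idx = [int(rng.integers(n))]
    for _ in range(k - 1):                       # farthest-point init of means
        dist = ((X[:, None, :] - X[idx][None, :, :]) ** 2).sum(-1).min(axis=1)
        idx.append(int(dist.argmax()))
    mu = X[idx].astype(float)
    var = np.ones((k, d)) * X.var(axis=0)        # shared initial variances
    pi = np.full(k, 1.0 / k)                     # mixing weights
    for _ in range(iters):
        # E-step: (log) responsibilities under each diagonal Gaussian
        log_p = -0.5 * (((X[:, None] - mu) ** 2 / var).sum(-1)
                        + np.log(var).sum(-1)) + np.log(pi)
        log_p -= log_p.max(axis=1, keepdims=True)
        r = np.exp(log_p)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: update weights, means, variances
        nk = r.sum(axis=0) + 1e-9
        pi = nk / n
        mu = (r.T @ X) / nk[:, None]
        var = (r.T @ X ** 2) / nk[:, None] - mu ** 2 + 1e-6
    return pi, mu, var

# Two well-separated clusters of 3-D "colors" should be recovered.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (200, 3)), rng.normal(10.0, 1.0, (200, 3))])
pi, mu, var = fit_gmm(X, k=2)
print(np.sort(mu.mean(axis=1)))  # one mean near 0, the other near 10
```

Once fitted, the mixture's density at a pixel's color serves as the color-cue likelihood in the MRF combination step.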

  20. Learning Color and Locality Cues • The spatial affinity of pixel i to the moving object gives its location likelihood (the slide's formula involves a position parameter and the sub-objects F)

  21. Experiments • Insignificant camera and object motion

  22. Significant camera motion and uneven object motion

  23. Significant camera and object motion

  24. Discussion • Extracting a moving object from a video with little object and camera motion is no easier than with significant object and camera motion • Contribution • Unsupervised? • Currently: • Off-line • Motion estimation is time-consuming • Future: • Parameters • Background modeling

  25. Texel-based Texture Segmentation Sinisa Todorovic, Narendra Ahuja Reporter: Wang Dan

  26. Authors • Sinisa Todorovic • Assistant Professor • School of EECS, Oregon State University • Publication: CVPR 2009, ICCV 2009, TPAMI 2008... • Narendra Ahuja • Donald Biggar Willet Professor • Beckman Institute, UIUC • Publication: ICCV 2009, IJCV 2008, TPAMI…

  27. Problem • Given an arbitrary image, segment all texture subimages • Texture = spatial repetition of texture elements, i.e., texels • Texels are not identical, but only statistically similar to one another. • Texel placement along the texture surface is not periodic, but only statistically uniform. • Texels in the image are not homogeneous, but regions that may contain subregions.

  28. Rationale • Texels occupy image regions • If the image contains texture, many regions will have similar properties • color, shape, layout of subregions • orientation, relative displacements • The pdf of region properties will have modes, so texture detection and segmentation reduce to detecting the modes of the pdf of region properties

  29. Method Overview

  30. Contributions • No assumptions about the pdf of texel properties • Both appearance and placement of the texels are allowed to be stochastic and correlated • New hierarchical, adaptive-bandwidth kernel to capture texel structural properties

  31. Method Description • Define a feature space of region properties • Descriptor of each region = Data point in the feature space • Partition the feature space into bins by Voronoi tessellation • Run the meanshift with the new, hierarchical kernel • Regions under a pdf mode comprise the texture subimage
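Step 4 above is ordinary mean shift except for the kernel. As a baseline, plain Gaussian-kernel mean shift looks like the sketch below; the paper substitutes its hierarchical, variable-bandwidth kernel for the flat Gaussian used here:

```python
import numpy as np

def mean_shift(X, bandwidth=1.0, iters=50):
    """Gaussian-kernel mean shift: every point repeatedly moves to the
    kernel-weighted mean of the data, climbing to a mode of the kernel
    density estimate of X."""
    Y = X.astype(float).copy()
    for _ in range(iters):
        # squared distances from current positions to all data points
        d2 = ((Y[:, None, :] - X[None, :, :]) ** 2).sum(-1)
        w = np.exp(-0.5 * d2 / bandwidth ** 2)
        Y = (w @ X) / w.sum(axis=1, keepdims=True)  # weighted mean = shift step
    return Y

# Two separated 1-D clusters collapse onto their two density modes.
X = np.array([[0.0], [0.2], [0.4], [9.6], [9.8], [10.0]])
modes = mean_shift(X, bandwidth=0.5)
print(np.round(modes.ravel(), 1))  # [0.2 0.2 0.2 9.8 9.8 9.8]
```

Points whose trajectories converge to the same mode form one cluster, which in the paper corresponds to the regions of one texture subimage.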

  32. The feature space of region properties • Image hierarchical structure • Use multiscale segmentation algorithm[1], [4]

  33. The feature space of region properties • A descriptor vector of properties xi of image region i • Average contrast across i’s boundary • Area • Standard deviation of i’s children’s areas • Displacement vector between the centroids of i and its parent region • Perimeter • Aspect ratio of intercepts of i’s principal axes with i’s boundary • Orientation: the angle between the principal axes and the x-axis • Centroid of i • PCA retaining 95% of the variance • Not scale- or rotation-invariant
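The "PCA retaining 95% of the variance" bullet presumably means projecting the descriptors onto the leading principal components whose cumulative variance ratio reaches 95%; a sketch of that step (function name is ours):

```python
import numpy as np

def pca_95(X, keep=0.95):
    """Project descriptor rows of X onto the fewest principal components
    whose cumulative variance ratio reaches `keep`."""
    Xc = X - X.mean(axis=0)                       # center the descriptors
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    ratio = s ** 2 / (s ** 2).sum()               # variance ratio per component
    k = int(np.searchsorted(np.cumsum(ratio), keep)) + 1
    return Xc @ Vt[:k].T                          # reduced descriptors

# A 3-D descriptor cloud that is essentially 1-D keeps a single component.
rng = np.random.default_rng(0)
t = rng.normal(size=(500, 1))
X = np.hstack([t, 2 * t, -t]) + rng.normal(scale=1e-3, size=(500, 3))
print(pca_95(X).shape[1])  # 1
```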

  34. Voronoi Diagram (figure: Voronoi cells of three sites A, B, C)

  35. Voronoi Diagram • Definition: • Let P = {p1, p2, ..., pn} be a set of points in the plane (or in any dimensional space), which we call sites. • Define V(pi), the Voronoi cell for pi, to be the set of points q in the plane that are closer to pi than to any other site. That is: • V(pi) = {q | dist(pi, q) < dist(pj, q), for all j ≠ i} • Anyway, it can partition the feature space… • http://www.dma.fi.upm.es/mabellanas/tfcs/fvd/voronoi.html
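Operationally, the definition above assigns each query point to the cell of its nearest site; a direct sketch of that partition of the feature space:

```python
import numpy as np

def voronoi_assign(points, sites):
    """Assign each point to its Voronoi cell, i.e. the index of the nearest
    site. This is exactly V(p_i) = {q : dist(p_i, q) < dist(p_j, q), j != i}."""
    d2 = ((points[:, None, :] - sites[None, :, :]) ** 2).sum(-1)
    return d2.argmin(axis=1)

sites = np.array([[0.0, 0.0], [10.0, 0.0]])
pts = np.array([[1.0, 1.0], [9.0, -1.0], [4.0, 0.0]])
print(voronoi_assign(pts, sites))  # [0 1 0]
```

In the paper the sites are the region descriptors themselves, so each bin of the binned mean shift is the Voronoi cell of one descriptor.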

  36. Voronoi-based Binned Meanshift • New variable-bandwidth matrix

  37. Motivation • Texels, in general, are not homogeneous-intensity regions, but may contain hierarchically embedded subregions. • Since region descriptors represent image regions, we can define hierarchical relationships between the descriptors based on the embedding of corresponding smaller regions within larger regions in the image. (Figure contrasts a Gaussian kernel with the hierarchical kernel.)

  38. Voronoi partitioning of the feature space • Suppose arbitrary points x ∈ Bi and x′ ∈ Bj, where the bins Bi and Bj are the Voronoi cells of region descriptors xi and xj

  39. x belongs to Bi. Compute the kernel value by finding the maximum subtree isomorphism between the two trees rooted at xi and xj as:

  40. Experimental Evaluation • Qualitative Evaluation • Quantitative evaluation • G: the area of true texture • D: the area of a subimage that our approach segments • Segmentation error per texture:
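The slide's error formula is lost in this transcript. As a placeholder built only from G and D, one symmetric choice is the complement of their intersection-over-union; the paper's exact definition may differ:

```python
import numpy as np

def seg_error(G, D):
    """Segmentation error between ground-truth texture mask G and detected
    mask D (boolean arrays), taken here as 1 - |G ∩ D| / |G ∪ D|. Zero for a
    perfect segmentation, one for no overlap; a placeholder metric."""
    inter = np.logical_and(G, D).sum()
    union = np.logical_or(G, D).sum()
    return 1.0 - inter / union

G = np.zeros((4, 4), bool); G[:, :2] = True   # left half is the true texture
D = np.zeros((4, 4), bool); D[:, 1:3] = True  # detection shifted one column
print(round(seg_error(G, D), 3))  # 1 - 4/12 = 0.667
```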

  41. (1) 100 collages randomly mosaicked from 111 distinct Brodatz textures, where each texture occupies at least 1/6 of the collage
