
Mid-level Representations for Images and Videos

Mid-level Representations for Images and Videos. Sylvain Paris, Adobe, with Jiawen Chen, Michael Cohen, Frédo Durand, Wojciech Matusik, Jue Wang. Making good videos is hard, much harder than taking good photos. Better tools will help.


Presentation Transcript


  1. Mid-level Representations for Images and Videos. Sylvain Paris, Adobe, with Jiawen Chen, Michael Cohen, Frédo Durand, Wojciech Matusik, Jue Wang

  2. Making good videos is hard, much harder than taking good photos. Better tools will help.

  3. Today's focus: How to represent video data to build better algorithms.

  4. Outline • Mid-level representations • Bilateral grid • Video mesh

  5. Low-level Representation R=255 G=50 B=35

  6. High-level Representation building car street

  7. Mid-level Information texture smooth edge

  8. Mid-level Representations • Data are indexed by (x,y,t,r,g,b) • (x,y,t): location. For video, it's a grid. • (r,g,b): color. That is where the information is. • Many representations exist: HSV, Lab, [Kim 09]... • We are going to look at (x,y,t,r,g,b).

  9. Outline • Mid-level representations • Bilateral grid • Video mesh

  10. Example: the Edge-aware Brush • Classical paint brush: ignores edges • Our edge-aware brush: respects edges [Figure panels: input image; stroke with classical brush; stroke with bilateral brush]

  11. Bilateral Grid – Definition • Bilateral grid = 3D array • x and y correspond to pixel position • z corresponds to pixel intensity • Euclidean distance accounts for edges: space distance (x,y) and intensity distance (z) • Grid can be coarsely sampled, e.g., 70 x 70 x 10 for an 8 megapixel image [Figure: standard 2D image (axes x, y) and its bilateral grid]
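The definition on slide 11 can be made concrete with a small sketch. This is not the authors' implementation: the function name, the parameter names `s_space` and `s_range`, and the choice to store a sum and a count per cell are assumptions made for illustration. It uses only the Python standard library and integer intensities in 0..255.

```python
from collections import defaultdict


def build_bilateral_grid(image, s_space, s_range):
    """Accumulate a grayscale image into a coarse (x, y, z) grid.

    image   : list of rows of integer intensities (0..255)
    s_space : pixels per grid cell along x and y (hypothetical name)
    s_range : intensity levels per grid cell along z (hypothetical name)
    Each occupied cell stores [sum of intensities, pixel count].
    """
    grid = defaultdict(lambda: [0, 0])
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            # Round each coordinate to the nearest coarse grid vertex.
            cell = (int(x / s_space + 0.5),
                    int(y / s_space + 0.5),
                    int(value / s_range + 0.5))
            grid[cell][0] += value
            grid[cell][1] += 1
    return dict(grid)


# One image row with a dark-to-bright edge between the last two pixels.
grid = build_bilateral_grid([[20, 20, 20, 220]], s_space=4, s_range=64)
```

In this toy input, the pixels at x=2 and x=3 fall into the same spatial cell but into different z cells, so the edge between them stays separated in the grid even though the spatial sampling is coarse.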

  12. 2D vs. Bilateral Grid Downsampling • Bilateral grid enables aggressive downsampling • Extra dimension preserves edges • Downsampling in 2D, nearest neighbor: arbitrarily bright or dark • Downsampling in 2D, bicubic: intermediate value not in original [Figure: downsampling in 2D (axes x, y)]

  13. 2D vs. Bilateral Grid Downsampling • Bilateral grid enables aggressive downsampling • Extra dimension preserves edges [Figure: downsampling in 2D next to downsampling in the bilateral grid (axes x, y)]
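The contrast on slides 12 and 13 can be illustrated on a single scanline. The sketch below is an assumption-laden toy, not the presenters' code: box averaging stands in for bicubic resampling, the function names are hypothetical, and intensities are integers in 0..255.

```python
def downsample_nearest(scanline, factor):
    """Keep the first sample of every block of `factor` pixels."""
    return scanline[::factor]


def downsample_box(scanline, factor):
    """Average every block of `factor` pixels (stand-in for bicubic)."""
    blocks = [scanline[i:i + factor] for i in range(0, len(scanline), factor)]
    return [sum(block) / len(block) for block in blocks]


def downsample_bilateral(scanline, factor, s_range):
    """Per block, keep one mean per occupied intensity bin of width s_range."""
    result = []
    for i in range(0, len(scanline), factor):
        bins = {}
        for value in scanline[i:i + factor]:
            bins.setdefault(value // s_range, []).append(value)
        result.append({z: sum(vs) / len(vs) for z, vs in bins.items()})
    return result


# A step edge that falls inside the first block of four pixels.
scanline = [20, 20, 20, 220, 220, 220, 220, 220]
nearest = downsample_nearest(scanline, 4)
box = downsample_box(scanline, 4)
bilateral = downsample_bilateral(scanline, 4, s_range=64)
```

Here nearest-neighbor sampling reports the first block as entirely dark, box averaging produces 70, a value that appears nowhere in the input, and the grid version keeps both the dark and the bright population of the first block in separate intensity bins.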

  14. Bilateral Grid Painting • When mouse is held down, paint only at intensity level of initial mouse click [Figure: input image and its bilateral grid]

  15. Bilateral Grid Painting • Edge-aware brush used to change hue
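The painting rule from slides 14 and 15 can be sketched on a scanline. Everything named here (`paint_stroke`, `s_space`, `s_range`) is hypothetical, and the sketch assumes a hard-edged brush with nearest-cell lookup; a practical brush would presumably use a smooth falloff and interpolation.

```python
def paint_stroke(scanline, stroke_xs, s_space, s_range):
    """Return a per-pixel paint mask (1.0 = painted) for one brush stroke.

    scanline  : integer intensities (0..255)
    stroke_xs : pixel positions visited while the mouse is held down;
                the first entry is the initial click
    """
    # The intensity level of the initial click is fixed for the whole stroke.
    click_z = scanline[stroke_xs[0]] // s_range
    painted_cells = {(x // s_space, click_z) for x in stroke_xs}
    # Each pixel reads the paint stored at its own (position, intensity) cell.
    return [1.0 if (x // s_space, value // s_range) in painted_cells else 0.0
            for x, value in enumerate(scanline)]


# The stroke starts on the dark side and is dragged across the edge.
mask = paint_stroke([20, 20, 20, 220, 220, 220], [1, 2, 3, 4],
                    s_space=2, s_range=64)
```

The bright pixels receive no paint even though the stroke passes over them, because paint is deposited only at the intensity level of the first click. Pixel 0 is painted although the stroke never touches it; that is a side effect of the coarse spatial cells in this simplified version.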

  16. Bilateral Filter [Tomasi 98] • Smooth image except across strong edges • Ubiquitous in computational photography [Oh 01, Durand 02, Eisemann 04, Petschnigg 04, Bennett 05, Bae 06, Fattal 07, Kopf 07, …] [Figure panels: input; filtered output]

  17. Bilateral Filter on the Bilateral Grid [Figure: image scanline and its bilateral grid; axes: space, intensity]

  18. Bilateral Filter on the Bilateral Grid • Gaussian blur grid values • Slice: query grid with input image [Figure: image scanline, bilateral grid, filtered scanline; axes: space, intensity]
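The steps on slides 17 and 18 (build the grid from the scanline, blur the grid values, slice with the input) can be sketched as follows. This is a simplified illustration, not the GPU implementation mentioned in the talk: a [1, 2, 1]/4 kernel stands in for the Gaussian, slicing uses nearest-cell lookup instead of interpolation, and all function and parameter names are assumptions.

```python
def blur(grid):
    """Separable [1, 2, 1] / 4 blur along both grid axes, zero outside."""
    nx, nz = len(grid), len(grid[0])

    def get(g, i, k):
        return g[i][k] if 0 <= i < nx and 0 <= k < nz else 0.0

    along_x = [[0.25 * get(grid, i - 1, k) + 0.5 * get(grid, i, k)
                + 0.25 * get(grid, i + 1, k) for k in range(nz)]
               for i in range(nx)]
    return [[0.25 * get(along_x, i, k - 1) + 0.5 * get(along_x, i, k)
             + 0.25 * get(along_x, i, k + 1) for k in range(nz)]
            for i in range(nx)]


def bilateral_filter_scanline(scanline, s_space, s_range):
    """Filter integer intensities (0..255) through a coarse (x, z) grid."""
    nx = (len(scanline) - 1) // s_space + 1
    nz = max(scanline) // s_range + 1
    total = [[0.0] * nz for _ in range(nx)]
    weight = [[0.0] * nz for _ in range(nx)]
    # Build the grid: accumulate each pixel at (position, intensity).
    for x, value in enumerate(scanline):
        total[x // s_space][value // s_range] += value
        weight[x // s_space][value // s_range] += 1.0
    # Blur sums and weights identically.
    total, weight = blur(total), blur(weight)
    # Slice: look the grid up at each input pixel's own coordinates.
    return [total[x // s_space][value // s_range]
            / weight[x // s_space][value // s_range]
            for x, value in enumerate(scanline)]


noisy_step = [18, 22, 20, 19, 221, 218, 222, 219]
filtered = bilateral_filter_scanline(noisy_step, s_space=2, s_range=64)
```

With these settings the dark and bright plateaus occupy intensity bins that are three cells apart, farther than the blur kernel reaches, so each side is averaged only with itself and the step between them is preserved.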

  19. More than 100 Hz at HD video resolution using GPU.

  20. Video

  21. Many Operations and Applications • Local histogram equalization • Interactive tone mapping • Video abstraction [Winnemoller 06, DeCarlo 02] • Segmentation [Paris 07] • Photographic style transfer [Bae 06] [Figure panels: model; input; output]

  22. Works great frame by frame if there is no noise. What if there is noise? Not everything can be done frame by frame, e.g., segmentation.

  23. Frame by Frame • Considers only the current frame. • Does not exploit all the available information. [Figure labels: unused; unused]

  24. Volumetric, Off-line • Good if we know the entire sequence in advance. • Time is the same as x and y. • Use a time Gaussian.

  25. Causal, On-line • Good for real time • Recursive algorithm • Formal equivalence between exponential decay and Gaussian [ECCV 08] [Figure labels: fixed; unused]
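The recursive, causal scheme on slide 25 can be sketched as an exponentially decaying running average that only ever looks at the current frame and its own previous state. This is a generic sketch, not the formulation from the cited ECCV 08 work: the name `temporal_smooth`, the blending factor `alpha`, and the representation of a frame as a flat list of values (for instance, flattened grid cells) are assumptions.

```python
def temporal_smooth(frames, alpha):
    """Yield one smoothed frame per input frame, using only past data.

    frames : iterable of equally sized lists of numbers
    alpha  : weight of the newest frame, between 0 and 1 (hypothetical name);
             older frames decay by a factor (1 - alpha) per step
    """
    state = None
    for frame in frames:
        if state is None:
            # The first frame initializes the running state.
            state = list(frame)
        else:
            state = [alpha * new + (1.0 - alpha) * old
                     for new, old in zip(frame, state)]
        yield list(state)


# A single value that jumps from 0 to 1 after the first frame.
smoothed = list(temporal_smooth([[0.0], [1.0], [1.0]], alpha=0.5))
```

Because the generator keeps only one frame of state, it can run on a live stream: memory use is constant and no future frame is ever needed.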

  26. Video

  27. Summary on Bilateral Grid • Image data as a coarse volumetric grid. • Edges are naturally represented. • Standard operations become edge-aware for free. • Fast: HD video in real time. • Grid good up to 5D/6D, e.g., (x,y,t,r,g,b). • See work by Adams et al. at Stanford for higher dimensions.

  28. Outline • Mid-level representations • Bilateral grid • Video mesh

  29. Video Meshes • Coarse data (motion, depth): represented by a triangle mesh • Finely detailed data (colors, boundaries): represented by RGBA textures

  30. Video

  31. Conclusion • Mid-level cues carry a lot of information • Can be "embedded" in image representation • Fast, easy-to-adapt algorithms • Many existing structures • Unwrap mosaics, locally rigid triangles, edge-aware wavelets à trous, video front…
