Mid-level Representations for Images and Videos

Mid-level Representations for Images and Videos. Sylvain Paris, Adobe, with Jiawen Chen, Michael Cohen, Frédo Durand, Wojciech Matusik, Jue Wang

Making good videos is hard, much harder than taking good photos. Better tools will help.

Today's focus: how to represent video data to build better algorithms.

Outline • Mid-level representations • Bilateral grid • Video mesh

Low-level Representation R=255 G=50 B=35

High-level Representation building car street

Mid-level Information texture smooth edge

Mid-level Representations • Data are indexed by (x,y,t,r,g,b) • (x,y,t): location. For video, it's a grid. • (r,g,b): color. That is where the information is. • Many representations exist: HSV, Lab, [Kim 09]... • We are going to look at (x,y,t,r,g,b).

Example: the Edge-aware Brush • Classical paint brush: ignores edges • Our edge-aware brush: respects edges [Figure: input image; stroke with classical brush; stroke with bilateral brush]

Bilateral Grid – Definition • Bilateral grid = 3D array • x and y correspond to pixel position • z corresponds to pixel intensity • Euclidean distance accounts for edges • space distance (x,y) and intensity distance (z) • Grid can be coarsely sampled • E.g., 70 x 70 x 10 for an 8 megapixel image [Figure: standard 2D image (axes x, y) next to its bilateral grid]
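The definition above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the function name `build_bilateral_grid`, the sampling rates `s_spatial` and `s_range`, the nearest-cell rounding, and the choice to store a sum and a count per cell are all hypothetical.

```python
def build_bilateral_grid(image, s_spatial, s_range):
    """Accumulate a grayscale image (rows of floats in [0, 1]) into a coarse 3D grid.

    grid[gx][gy][gz] holds [sum of intensities, pixel count].
    s_spatial and s_range are assumed sampling rates, not values from the slides.
    """
    height, width = len(image), len(image[0])
    gw = round((width - 1) / s_spatial) + 1
    gh = round((height - 1) / s_spatial) + 1
    gz = round(1.0 / s_range) + 1
    grid = [[[[0.0, 0] for _ in range(gz)] for _ in range(gh)] for _ in range(gw)]
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            # x and y index pixel position, z indexes pixel intensity.
            cell = grid[round(x / s_spatial)][round(y / s_spatial)][round(value / s_range)]
            cell[0] += value
            cell[1] += 1
    return grid


# A 4 x 4 image with a vertical edge: dark on the left, bright on the right.
image = [[0.1, 0.1, 0.9, 0.9] for _ in range(4)]
grid = build_bilateral_grid(image, s_spatial=2, s_range=0.5)
```

In this toy example the dark and bright pixels land at different z levels even where they share the same coarse (x, y) cell, which is how the extra dimension keeps the two sides of an edge apart.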

2D vs. Bilateral Grid Downsampling • Bilateral grid enables aggressive downsampling • Extra dimension preserves edges • Nearest neighbor: arbitrarily bright or dark • Bicubic: intermediate value not in original [Figure: downsampling in 2D (axes x, y)]

2D vs. Bilateral Grid Downsampling • Bilateral grid enables aggressive downsampling • Extra dimension preserves edges [Figure: downsampling in 2D compared with downsampling in the bilateral grid (axes x, y)]

Bilateral Grid Painting • While the mouse button is held down, paint only at the intensity level of the initial mouse click [Figure: input image; bilateral grid]
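A rough sketch of the painting rule on a single scanline follows. It assumes nearest-cell rounding and a binary paint value; the name `paint_stroke` and its parameters are hypothetical and not taken from the talk.

```python
def paint_stroke(scanline, stroke_xs, s_spatial, s_range):
    """Paint along stroke_xs in a 2D (space, intensity) grid, then read it back.

    Paint is deposited only at the intensity level of the first stroke sample,
    so pixels on the other side of an edge receive none.
    """
    gx = round((len(scanline) - 1) / s_spatial) + 1
    gz = round(1.0 / s_range) + 1
    paint = [[0.0] * gz for _ in range(gx)]
    z_click = round(scanline[stroke_xs[0]] / s_range)  # level of the initial click
    for x in stroke_xs:
        paint[round(x / s_spatial)][z_click] = 1.0
    # Slice: each pixel looks up the paint stored at its own position and intensity.
    return [paint[round(x / s_spatial)][round(v / s_range)]
            for x, v in enumerate(scanline)]


scanline = [0.1] * 4 + [0.9] * 4          # dark region, then bright region
mask = paint_stroke(scanline, [2, 3, 4, 5, 6], s_spatial=1, s_range=0.5)
# mask == [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]
```

The stroke crosses the edge at x = 4, but only the dark pixels under it are painted, because the bright pixels sit at a different intensity level in the grid.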

Bilateral Grid Painting • Edge-aware brush used to change hue

Bilateral Filter [Tomasi 98] • Smooth image except across strong edges • Ubiquitous in computational photography [Oh 01, Durand 02, Eisemann 04, Petschnigg 04, Bennett 05, Bae 06, Fattal 07, Kopf 07, …] [Figure: input; filtered output]
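For reference, the bilateral filter is commonly written as a normalized weighted average. The notation below is the usual one from the literature and is an assumption here, since the slide gives no formula: $I_{\mathbf{p}}$ is the intensity at pixel $\mathbf{p}$, $G_{\sigma}$ is a Gaussian, and $\sigma_s$ and $\sigma_r$ are the spatial and range (intensity) widths.

```latex
\[
BF[I]_{\mathbf{p}}
  = \frac{1}{W_{\mathbf{p}}}
    \sum_{\mathbf{q}}
      G_{\sigma_s}\bigl(\|\mathbf{p}-\mathbf{q}\|\bigr)\,
      G_{\sigma_r}\bigl(|I_{\mathbf{p}}-I_{\mathbf{q}}|\bigr)\,
      I_{\mathbf{q}},
\qquad
W_{\mathbf{p}}
  = \sum_{\mathbf{q}}
      G_{\sigma_s}\bigl(\|\mathbf{p}-\mathbf{q}\|\bigr)\,
      G_{\sigma_r}\bigl(|I_{\mathbf{p}}-I_{\mathbf{q}}|\bigr)
\]
```

The range term $G_{\sigma_r}$ is what stops the smoothing at strong edges: a neighbor with a very different intensity gets a weight close to zero.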

Bilateral Filter on the Bilateral Grid [Figure: image scanline and its bilateral grid (axes: space, intensity)]

Bilateral Filter on the Bilateral Grid • Gaussian blur grid values • Slice: query grid with input image [Figure: image scanline; bilateral grid (axes: space, intensity); filtered scanline]
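The create, blur, and slice steps can be sketched on a single scanline as follows. This is a minimal sketch under several assumptions: a [1, 2, 1] / 4 kernel stands in for the Gaussian, cells are addressed by nearest-cell rounding rather than interpolation, and the names `blur_121` and `bilateral_filter_scanline` are hypothetical.

```python
def blur_121(g):
    """Separable [1, 2, 1] / 4 blur of a 2D list along both axes (zero padding)."""
    nx, nz = len(g), len(g[0])

    def at(a, i, j):
        return a[i][j] if 0 <= i < nx and 0 <= j < nz else 0.0

    tmp = [[(at(g, i - 1, j) + 2 * at(g, i, j) + at(g, i + 1, j)) / 4
            for j in range(nz)] for i in range(nx)]
    return [[(at(tmp, i, j - 1) + 2 * at(tmp, i, j) + at(tmp, i, j + 1)) / 4
             for j in range(nz)] for i in range(nx)]


def bilateral_filter_scanline(scanline, s_spatial, s_range):
    gx = round((len(scanline) - 1) / s_spatial) + 1
    gz = round(1.0 / s_range) + 1
    values = [[0.0] * gz for _ in range(gx)]
    weights = [[0.0] * gz for _ in range(gx)]
    # Create: accumulate each pixel at (position, intensity).
    for x, v in enumerate(scanline):
        i, j = round(x / s_spatial), round(v / s_range)
        values[i][j] += v
        weights[i][j] += 1.0
    # Process: blur the grid values and the weights with the same kernel.
    values, weights = blur_121(values), blur_121(weights)
    # Slice: query the grid with the input scanline, then normalize.
    out = []
    for x, v in enumerate(scanline):
        i, j = round(x / s_spatial), round(v / s_range)
        out.append(values[i][j] / weights[i][j])
    return out


noisy_step = [0.10, 0.14, 0.08, 0.12, 0.90, 0.86, 0.92, 0.88]
filtered = bilateral_filter_scanline(noisy_step, s_spatial=1, s_range=0.25)
```

Because dark and bright samples occupy distant intensity levels, the blur mixes values within each side of the step but not across it, so the filtered scanline keeps the edge.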

More than 100 Hz at HD video resolution using GPU.

Video

Many Operations and Applications • Local histogram equalization • Interactive tone mapping • Video abstraction [Winnemoller 06, DeCarlo 02] • Segmentation [Paris 07] • Photographic style transfer [Bae 06] [Figure labels: model, input, output]

Works great frame by frame if there is no noise. What if there is noise? Not everything can be done frame by frame, e.g. segmentation.

Frame by Frame • Consider only the current frame. • Do not exploit all the available information. [Figure labels: unused, unused]

Volumetric, Off-line • Good if we know the entire sequence in advance. • Time is the same as x and y. • Use a time Gaussian.
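As a rough illustration of the off-line case, the sketch below applies a normalized Gaussian along the time axis to per-frame values. It operates on plain lists rather than on a grid, and the name `temporal_gaussian` and the parameter `sigma_t` are hypothetical.

```python
import math


def temporal_gaussian(frames, sigma_t):
    """Smooth each sample over time with a Gaussian that sees past AND future frames.

    frames is a list of equally sized lists of floats; sigma_t is in frames.
    """
    n = len(frames)
    out = []
    for t in range(n):
        w = [math.exp(-((t - u) ** 2) / (2 * sigma_t ** 2)) for u in range(n)]
        total = sum(w)
        out.append([sum(w[u] * frames[u][k] for u in range(n)) / total
                    for k in range(len(frames[t]))])
    return out


# An impulse at frame 2 spreads symmetrically to earlier and later frames.
smoothed = temporal_gaussian([[0.0], [0.0], [1.0], [0.0], [0.0]], sigma_t=1.0)
```

The output at frame t depends on frames after t, which is why this variant needs the entire sequence in advance.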

Causal, On-line • Good for real time • recursive algorithm • Formal equivalence between exp. decay and Gaussian [ECCV 08] [Figure labels: fixed, unused]
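One simple form of a recursive, exponentially decaying update is sketched below. The slides do not give the recursion used in the talk, so the update rule, the name `causal_smooth`, and the blending factor `alpha` are assumptions for illustration only.

```python
def causal_smooth(frames, alpha):
    """Recursive exponential smoothing: state = alpha * frame + (1 - alpha) * state.

    Only the current frame and one running state are needed, so the filter can
    run on-line; older frames fade with weight (1 - alpha) ** age.
    """
    state = None
    out = []
    for frame in frames:
        if state is None:
            state = list(frame)  # first frame initializes the state
        else:
            state = [alpha * f + (1 - alpha) * s for f, s in zip(frame, state)]
        out.append(list(state))
    return out


result = causal_smooth([[0.0], [1.0], [1.0]], alpha=0.5)
# result == [[0.0], [0.5], [0.75]]
```

Unlike a Gaussian over the whole sequence, this update never looks at future frames and stores a single state, which is what makes it suitable for real-time use.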

Video

Summary on Bilateral Grid • Image data as a coarse volumetric grid. • Edges are naturally represented. • Standard operations become edge-aware for free. • Fast: HD video in real time • Grid good up to 5D/6D, e.g. (x,y,t,r,g,b) • See work by Adams et al. at Stanford for higher dimensions.

Video Meshes • Coarse data: • motion, depth • represented by a triangle mesh • Finely detailed data: • colors, boundaries • represented by RGBA textures

Video

Conclusion • Mid-level cues carry a lot of information • Can be "embedded" in image representation • Fast, easy-to-adapt algorithms • Many existing structures • Unwrap mosaics, locally rigid triangles, edge-aware wavelets à trous, video front…
