Video Compression

Schedule • 3/6 - 1st Paper Presentation • 3/11 - Project Proposal • 3/13 - 2nd Paper Presentation

Video (Motion Pictures) • A video consists of a time-ordered sequence of frames, i.e., images. • Video as a 3-D signal • 2 spatial dimensions and 1 time dimension • continuous I(x, y, t) => discrete I(m, n, t_k) • Frame by frame => image sequence • Encoding digital video • Simplest way: compress each frame individually as an image • e.g., “motion-JPEG” • only spatial redundancy is exploited and reduced • How about temporal redundancy? Is differential coding good? • Pixel-by-pixel differences could still be large due to motion • Need better prediction

Characteristics of Typical Videos • Adjacent frames are similar and changes are due to object or camera motion

Derive the Difference • Simplistic way? • Subtract two video frames • Example (MATLAB): read the frames of a video and save them as JPEG images, so that they can then be subtracted

    mov = aviread('car.avi');                        % read the AVI file into a movie structure array
    for j = 1:1                                      % frames to export (only the first one here)
        V = mov(1, j);
        frame = frame2im(V);                         % convert the movie frame to an image
        filename = ['Example', num2str(j), '.jpg'];  % e.g., Example1.jpg
        imwrite(frame, filename, 'jpeg');            % write the image data, not just a file name
    end

• Differences generated by camera and/or object motion can be compensated by detecting the displacement of corresponding pixels or regions in these frames and measuring their differences • Video compression algorithms that adopt this approach are said to be based on motion compensation (MC). There are three main steps: • Motion estimation (motion vector search) • Motion-compensation-based prediction • Derivation of the prediction error, i.e., the difference
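To see why plain frame subtraction is not enough, here is a minimal Python sketch. The frame size, the block value, and the helper names `make_frame` and `frame_difference` are made up for illustration and are not from the slides. A bright 2 × 2 block moves one pixel to the right between two frames, and the pixel-by-pixel difference is large along both edges of the block even though nothing but position changed.

```python
def make_frame(height, width, top, left, size=2, value=200):
    """Build a hypothetical grayscale frame: black background, one bright square."""
    frame = [[0] * width for _ in range(height)]
    for r in range(top, top + size):
        for c in range(left, left + size):
            frame[r][c] = value
    return frame


def frame_difference(current, previous):
    """Pixel-by-pixel difference between two equally sized frames."""
    return [[c - p for c, p in zip(cur_row, prev_row)]
            for cur_row, prev_row in zip(current, previous)]


previous = make_frame(4, 6, top=1, left=1)
current = make_frame(4, 6, top=1, left=2)   # same block, shifted right by one pixel
diff = frame_difference(current, previous)

# Count the pixels whose difference is not zero.
nonzero = sum(1 for row in diff for d in row if d != 0)
print(nonzero, diff[1])
```

The overlapping column cancels, but the leading and trailing edges of the block both produce full-amplitude differences, which is what motion compensation is meant to remove.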

Motion Representations • Pixel-based representation • Specify an MV for each pixel • Widely applicable, at the expense of high computational complexity • Global motion representation • Good if camera motion is the dominant motion • A few parameters for the entire frame • Region-based representation • One set of motion parameters for each region • Need to find and specify the region segmentation • Usually it is not known which pixels have similar motion • Need iterative segmentation and estimation

Motion Representation (cont’d) • Block-based representation • Fixed partitioning into blocks, each characterized by a simple model (e.g., a translation model) • Good compromise between accuracy and complexity; has proven successful in video coding • Mesh-based representation • Partition the image into polygons and specify MVs for the nodes • Provides continuous motion everywhere; good for facial and other non-rigid motions (From R. Liu Seminar Course @ UMCP)

Motion Vector • Motion compensation is not performed at the pixel level, nor at the level of the video object. It is performed at the macroblock level • Target frame: the current image frame • Reference frame: a previous or future frame • The target macroblock is predicted from the reference macroblock(s) • The displacement from the reference macroblock to the target macroblock is called the motion vector (MV) • Forward prediction: the reference frame is a previous frame • Backward prediction: the reference frame is a future frame

Video Compression based on motion compensation • After the first frame, only motion vectors and difference macroblocks need to be coded • Work on each macroblock (MB) (16 × 16 pixels) independently for reduced complexity • Motion compensation is done at the MB level • DCT coding of the error is done at the block level (8 × 8 pixels) • Predict a new frame from a previous frame and only code the prediction error (inter prediction) • The prediction error is coded using the DCT method • Prediction errors have smaller energy than the original pixel values and can be coded with fewer bits
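A minimal Python sketch of the prediction step, under stated assumptions: the 8 × 8 frames, the 2 × 2 block size, the pixel values, and the helper names `predict_block`, `block_error`, and `energy` are all hypothetical, and the motion vector is simply given rather than searched for. It compares the energy of the prediction error with and without motion compensation.

```python
def predict_block(reference, x, y, mv, n):
    """Copy the n x n reference block displaced by mv = (u, v) from position (x, y).
    Frames are indexed as frame[row][col]; x is the column, y is the row."""
    u, v = mv
    return [[reference[y + v + l][x + u + k] for k in range(n)] for l in range(n)]


def block_error(target, predicted, x, y, n):
    """Prediction error: target block minus predicted block."""
    return [[target[y + l][x + k] - predicted[l][k] for k in range(n)] for l in range(n)]


def energy(block):
    """Sum of squared values, used here as a rough proxy for coding cost."""
    return sum(e * e for row in block for e in row)


# Hypothetical frames: a textured 2x2 patch moves from (col 2, row 3) to (col 4, row 2),
# and one of its pixels brightens slightly.
reference = [[0] * 8 for _ in range(8)]
target = [[0] * 8 for _ in range(8)]
reference[3][2], reference[3][3], reference[4][2], reference[4][3] = 10, 20, 30, 40
target[2][4], target[2][5], target[3][4], target[3][5] = 10, 20, 30, 41

x, y, n = 4, 2, 2
with_mc = block_error(target, predict_block(reference, x, y, (-2, 1), n), x, y, n)
without_mc = block_error(target, predict_block(reference, x, y, (0, 0), n), x, y, n)
print(energy(with_mc), energy(without_mc))
```

With the correct motion vector only the brightness change remains in the error block, so far fewer bits are needed than for coding the block directly.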

Search for Motion Vector • MV search is usually limited to a small immediate neighborhood: both horizontal and vertical displacements lie in the range [−p, p]. This makes a search window of size (2p + 1) × (2p + 1). • Figure: Macroblocks and Motion Vector in Video Compression.

Search for Motion Vectors • The difference between two macroblocks can then be measured by their Mean Absolute Difference (MAD): $MAD(i, j) = \frac{1}{N^2} \sum_{k=0}^{N-1} \sum_{l=0}^{N-1} \left| C(x + k, y + l) - R(x + i + k, y + j + l) \right|$, where N is the size of the macroblock, k and l are indices for pixels in the macroblock, i and j are the horizontal and vertical displacements, C(x + k, y + l) are pixels in the macroblock in the Target frame, and R(x + i + k, y + j + l) are pixels in the macroblock in the Reference frame. • The goal of the search is to find a vector (i, j) as the motion vector MV = (u, v), such that MAD(i, j) is minimum: (u, v) = [ (i, j) | MAD(i, j) is minimum, i ∈ [−p, p], j ∈ [−p, p] ]
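The MAD definition translates almost directly into code. The sketch below is one possible Python rendering; the function name `mad`, the frame[row][col] indexing convention, and the tiny test frames are assumptions made for illustration.

```python
def mad(target, reference, x, y, i, j, n):
    """Mean Absolute Difference between the n x n Target macroblock whose top-left
    corner is (x, y) and the Reference macroblock displaced by (i, j).
    x and i are horizontal (columns); y and j are vertical (rows)."""
    total = 0
    for k in range(n):          # horizontal pixel index inside the macroblock
        for l in range(n):      # vertical pixel index inside the macroblock
            c = target[y + l][x + k]
            r = reference[y + j + l][x + i + k]
            total += abs(c - r)
    return total / (n * n)


# Hypothetical 4x6 frames: a bright 2x2 block sits one pixel further right in the target.
reference = [[0] * 6 for _ in range(4)]
target = [[0] * 6 for _ in range(4)]
for row in (1, 2):
    reference[row][1] = reference[row][2] = 200
    target[row][2] = target[row][3] = 200

print(mad(target, reference, 2, 1, -1, 0, 2))  # displacement that matches the block
print(mad(target, reference, 2, 1, 0, 0, 2))   # zero displacement
```

The displacement (−1, 0) lines the two blocks up exactly, so its MAD is zero, while the zero displacement leaves half of the block mismatched.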

Sequential Search • Sequential search: sequentially search the whole (2p + 1) × (2p + 1) window in the Reference frame (also referred to as full search). • A macroblock centered at each of the positions within the window is compared to the macroblock in the Target frame pixel by pixel, and their respective MAD is then derived. • The vector (i, j) that offers the least MAD is designated as the MV (u, v) for the macroblock in the Target frame. • The sequential search method is very costly: assuming each pixel comparison requires three operations (subtraction, absolute value, addition), obtaining the motion vector for a single macroblock takes (2p + 1) · (2p + 1) · N² · 3 operations, i.e., O(p²N²).
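To get a feel for this cost, the count of (2p + 1)² candidate positions times N² pixel comparisons times three operations can be evaluated for a concrete case. The numbers below (720 × 480 frames, 30 frames per second, p = 15, N = 16) are assumed purely for illustration and do not come from the slides; the function names are hypothetical as well.

```python
def full_search_ops_per_macroblock(p, n, ops_per_pixel=3):
    """Operations to find one motion vector with a full search:
    (2p + 1)^2 candidate positions, n^2 pixels each, ops_per_pixel operations per pixel."""
    return (2 * p + 1) ** 2 * n * n * ops_per_pixel


def full_search_ops_per_second(width, height, fps, p, n):
    """Total operations per second when every macroblock of every frame is searched."""
    macroblocks_per_frame = (width // n) * (height // n)
    return full_search_ops_per_macroblock(p, n) * macroblocks_per_frame * fps


per_block = full_search_ops_per_macroblock(p=15, n=16)
per_second = full_search_ops_per_second(720, 480, 30, p=15, n=16)
print(per_block, per_second)
```

Under these assumptions a single macroblock needs about 7.4 × 10⁵ operations and real-time encoding needs roughly 3 × 10¹⁰ operations per second, which is why cheaper search strategies are attractive.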

Sequential-search
begin
    min_MAD = LARGE_NUMBER;   /* Initialization */
    for i = −p to p
        for j = −p to p
        {
            cur_MAD = MAD(i, j);
            if cur_MAD < min_MAD
            {
                min_MAD = cur_MAD;
                u = i;   /* Get the coordinates for MV. */
                v = j;
            }
        }
end
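The procedure above can be sketched in Python as follows. This is an illustrative version rather than a reference implementation: the names `mad` and `sequential_search`, the top-left block coordinates, and the rule of skipping candidates that fall outside the frame are assumptions added here.

```python
def mad(target, reference, x, y, i, j, n):
    """Mean Absolute Difference for displacement (i, j); frames are frame[row][col]."""
    total = 0
    for k in range(n):
        for l in range(n):
            total += abs(target[y + l][x + k] - reference[y + j + l][x + i + k])
    return total / (n * n)


def sequential_search(target, reference, x, y, n, p):
    """Full search over the (2p + 1) x (2p + 1) window; returns ((u, v), min_MAD)."""
    height, width = len(reference), len(reference[0])
    min_mad = float("inf")      # plays the role of LARGE_NUMBER
    u = v = 0
    for i in range(-p, p + 1):
        for j in range(-p, p + 1):
            # Assumption: candidates that leave the frame are simply skipped.
            if not (0 <= x + i <= width - n and 0 <= y + j <= height - n):
                continue
            cur_mad = mad(target, reference, x, y, i, j, n)
            if cur_mad < min_mad:
                min_mad, u, v = cur_mad, i, j
    return (u, v), min_mad


# Hypothetical 8x8 frames: a textured 2x2 patch at (col 2, row 3) in the reference
# appears at (col 4, row 2) in the target.
reference = [[0] * 8 for _ in range(8)]
target = [[0] * 8 for _ in range(8)]
reference[3][2], reference[3][3], reference[4][2], reference[4][3] = 10, 20, 30, 40
target[2][4], target[2][5], target[3][4], target[3][5] = 10, 20, 30, 40

result = sequential_search(target, reference, x=4, y=2, n=2, p=3)
print(result)
```

The returned vector points from the target macroblock back to its best match in the reference frame, here two pixels to the left and one pixel down.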

2D Logarithmic Search • Logarithmic search: a cheaper version that is suboptimal but still usually effective. • The procedure for 2D logarithmic search of motion vectors takes several iterations and is akin to a binary search: • Initially only nine locations in the search window are used as seeds for a MAD-based search; they are marked as ‘1’. • After the one that yields the minimum MAD is located, the center of the new search region is moved to it and the step size (“offset”) is reduced to half. • In the next iteration, the nine new locations are marked as ‘2’, and so on.

2D Logarithmic Search • Figure: 2D Logarithmic Search for Motion Vectors.

Motion vector: 2D-logarithmic-search
begin
    offset = ⌈p/2⌉;
    Specify nine macroblocks within the search window in the Reference frame;
        they are centered at (x0, y0) and separated by offset horizontally and/or vertically;
    while last ≠ TRUE
    {
        Find the one of the nine specified macroblocks that yields the minimum MAD;
        if offset = 1 then last = TRUE;
        offset = ⌈offset/2⌉;
        Form a search region with the new offset and the new center found;
    }
end
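A possible Python sketch of the logarithmic search follows. The function names, the top-left block coordinates, the choice to discard candidates outside the window or the frame, and the synthetic ramp images are all assumptions for illustration. Because the method is suboptimal, it is only guaranteed to find the true minimum when the MAD surface is well behaved, as it is for the smooth test images used here.

```python
import math


def mad(target, reference, x, y, i, j, n):
    """Mean Absolute Difference for displacement (i, j); frames are frame[row][col]."""
    total = 0
    for k in range(n):
        for l in range(n):
            total += abs(target[y + l][x + k] - reference[y + j + l][x + i + k])
    return total / (n * n)


def logarithmic_search(target, reference, x, y, n, p):
    """2D logarithmic search; returns ((u, v), MAD at (u, v))."""
    height, width = len(reference), len(reference[0])
    cu, cv = 0, 0                      # current center, as a displacement
    offset = math.ceil(p / 2)
    while True:
        best = None
        for di in (-offset, 0, offset):
            for dj in (-offset, 0, offset):
                i, j = cu + di, cv + dj
                if abs(i) > p or abs(j) > p:
                    continue           # outside the search window
                if not (0 <= x + i <= width - n and 0 <= y + j <= height - n):
                    continue           # outside the frame
                cur = mad(target, reference, x, y, i, j, n)
                if best is None or cur < best[0]:
                    best = (cur, i, j)
        best_mad, cu, cv = best        # move the center to the best of the nine
        if offset == 1:
            return (cu, cv), best_mad
        offset = math.ceil(offset / 2)


# Hypothetical smooth 16x16 frames: an intensity ramp, displaced by (3, -2) in the target.
reference = [[10 * r + c for c in range(16)] for r in range(16)]
target = [[10 * (r - 2) + (c + 3) for c in range(16)] for r in range(16)]

result = logarithmic_search(target, reference, x=6, y=6, n=4, p=4)
print(result)
```

With p = 4 the search evaluates at most 18 candidates in two iterations instead of the 81 positions a full search would visit.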

Hierarchical Search • The search can benefit from a hierarchical (multiresolution) approach in which an initial estimate of the motion vector is obtained from images with significantly reduced resolution. • In a three-level hierarchical search, the original image is at Level 0, the images at Levels 1 and 2 are obtained by down-sampling the previous level by a factor of 2, and the initial search is conducted at Level 2. • Since the macroblock is smaller at the reduced resolution and p can also be proportionally reduced, the number of operations required is greatly reduced.

Hierarchical Search • Problem with fast search at full resolution • A small misalignment may give a high displacement error (EDFD) • especially for texture and edge blocks • Hierarchical (multi-resolution) block matching • Match at coarse resolution to narrow down the search range • Match at high resolution to refine the motion estimate • Figure (from Wang’s Preprint, Fig. 6.19): lowest resolution, medium resolution, original resolution

Figure: A Three-level Hierarchical Search for Motion Vectors.

Hierarchical Search (Cont'd) • Given the estimated motion vector $(u^k, v^k)$ at Level k, a 3 × 3 neighborhood centered at $(2u^k, 2v^k)$ at Level k − 1 is searched for the refined motion vector. • The refinement is such that at Level k − 1 the motion vector $(u^{k-1}, v^{k-1})$ satisfies: $2u^k - 1 \le u^{k-1} \le 2u^k + 1$ and $2v^k - 1 \le v^{k-1} \le 2v^k + 1$ • Let $(x_0^k, y_0^k)$ denote the center of the macroblock at Level k in the Target frame. The procedure for hierarchical motion vector search for the macroblock centered at $(x_0^0, y_0^0)$ in the Target frame can be outlined as follows:

Hierarchical Search
begin
    // Get the macroblock center position at the lowest resolution, Level k
    $x_0^k = x_0^0 / 2^k$;  $y_0^k = y_0^0 / 2^k$;
    Use the Sequential (or 2D Logarithmic) search method to get the initial estimated MV $(u^k, v^k)$ at Level k;
    while last ≠ TRUE
    {
        Find the one of the nine macroblocks that yields the minimum MAD at Level k − 1, centered at
            ( $2(x_0^k + u^k) - 1 \le x \le 2(x_0^k + u^k) + 1$;  $2(y_0^k + v^k) - 1 \le y \le 2(y_0^k + v^k) + 1$ );
        if k = 1 then last = TRUE;
        k = k − 1;
        Assign $(x_0^k, y_0^k)$ and $(u^k, v^k)$ with the new center location and MV;
    }
end
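The outline above could be realized along the following lines in Python. Several details are assumptions, not part of the slides: top-left block coordinates are used instead of centers, down-sampling averages each 2 × 2 neighborhood, a full search is used at the coarsest level, candidates outside the frame are skipped (at least one candidate is assumed to stay inside), and the ramp-shaped test frames are synthetic.

```python
def mad(target, reference, x, y, i, j, n):
    """Mean Absolute Difference for displacement (i, j); frames are frame[row][col]."""
    total = 0
    for k in range(n):
        for l in range(n):
            total += abs(target[y + l][x + k] - reference[y + j + l][x + i + k])
    return total / (n * n)


def downsample(frame):
    """Halve the resolution by averaging each 2x2 neighborhood (one possible filter)."""
    return [[(frame[r][c] + frame[r][c + 1] + frame[r + 1][c] + frame[r + 1][c + 1]) / 4
             for c in range(0, len(frame[0]) - 1, 2)]
            for r in range(0, len(frame) - 1, 2)]


def best_mv(target, reference, x, y, n, candidates):
    """Return the candidate displacement with the smallest MAD (inside the frame)."""
    height, width = len(reference), len(reference[0])
    best = None
    for i, j in candidates:
        if 0 <= x + i <= width - n and 0 <= y + j <= height - n:
            cur = mad(target, reference, x, y, i, j, n)
            if best is None or cur < best[0]:
                best = (cur, i, j)
    return best[1], best[2]


def hierarchical_search(target, reference, x, y, n, p, levels=3):
    """Coarse-to-fine search: full search at the top level, 3x3 refinement below it."""
    targets, references = [target], [reference]
    for _ in range(levels - 1):
        targets.append(downsample(targets[-1]))
        references.append(downsample(references[-1]))

    k = levels - 1
    scale = 2 ** k
    pk = max(1, p // scale)            # search range shrinks with the resolution
    window = [(i, j) for i in range(-pk, pk + 1) for j in range(-pk, pk + 1)]
    u, v = best_mv(targets[k], references[k], x // scale, y // scale, n // scale, window)

    while k > 0:
        k -= 1
        scale = 2 ** k
        # Refine within the 3x3 neighborhood centered at (2u, 2v).
        window = [(2 * u + di, 2 * v + dj) for di in (-1, 0, 1) for dj in (-1, 0, 1)]
        u, v = best_mv(targets[k], references[k], x // scale, y // scale, n // scale, window)
    return u, v


# Hypothetical smooth 64x64 frames: an intensity ramp, displaced by (6, -4) in the target.
reference = [[100 * r + c for c in range(64)] for r in range(64)]
target = [[100 * (r - 4) + (c + 6) for c in range(64)] for r in range(64)]

result = hierarchical_search(target, reference, x=24, y=24, n=16, p=8)
print(result)
```

Here the coarsest level compares 4 × 4 blocks over a 5 × 5 window, and each finer level only checks nine candidates, so most of the work is done on small images.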

Motion Compensation • Helps to reduce the temporal redundancy of video • Figure panels: previous frame, current frame, predicted frame, prediction error frame

Homework 2 (Due on April 3) • 1. Download the picture from the following link: • http://www.faculty.umassd.edu/honggang.wang/ece591_web/ECE595/redblock.bmp • 2. Write MATLAB code to find the red block in the picture. • Output the line and row numbers (they do not need to be very accurate)

Any Questions?
