Introduction on MPEG Video Coding Standards

Yung-Ching Chang (張永清), Visual Communication Laboratory, CS, NTHU

Lossy Coding of Still Image - JPEG

Uncompressed Bitrate for Video

Motion Compensated Predictive Coding

Video Compression Standards • CCITT H.261 • ITU-T Study Group 15 • Videophone and video conferencing • 1988-1990: p × 64 kbps (p = 1 to 30) • ITU-T H.263 • PSTN and mobile networks: 10 to 24 kbps • 1994: H.263, H.263+…

Video Compression Standards (cont’d) • MPEG-1 Video (ISO/IEC 11172-2) • 1.2 to 1.5 Mbps • Video for digital storage media, CD-ROM • Sep 1990 • MPEG-2 Video (ISO/IEC 13818-2) • 2 to 30 Mbps • Digital broadcast TV, HDTV, video services over networks • Nov 1993 • MPEG-4 (ISO/IEC 14496) • An emerging coding standard • Universal access

MPEG-1 vs. H.261 (Conceptually) • H.261 • Short algorithm delay • Lower compression complexity • Lower memory requirement • Limited flexibility in bit rate control • MPEG-1 • Longer algorithm delay • Higher compression complexity • Higher memory requirement • More coding modes, giving greater bit rate flexibility

Algorithm Delay • H.261 • MPEG-1 • A B-picture can’t be coded until the next P- or I-picture has been coded

Compression Complexity • H.261 • MPEG-1

Memory Requirement • H.261 • MPEG-1

Bit Rate Flexibility • H.261 • MPEG-1 • The GOP structure and B-pictures offer more flexibility in coding bit rate

MPEG-1 vs. H.261 (Technically) • MPEG-1 • Bi-directional motion compensation (B-picture) • Group of pictures (GOP) • Half-pel motion compensation • Visually weighted quantization • No picture size or bit rate constraints • Flexible slice structure instead of GOB

MPEG-1 Coding Hierarchy • Video sequence → divided into groups of pictures (GOPs) • Example picture sequence: I B B P B B P B B P B B I B B P … • Motion estimation

MPEG-1 Coding Hierarchy (cont’d) • Picture → slices (Slice 1 to Slice 13 in the figure) • Slice → macroblocks (16 × 16) • Macroblock → 8 × 8 blocks of Y, Cb, Cr

Some Coding Schemes • GOP • Random access • Prevents error propagation • B-picture • Pros: best prediction and compression; handles object occlusion and entrance into the scene; noise averaging • Cons: encoder delay, high complexity, large encoder buffer required • Slice • Synchronization unit • Suited to localized image properties

Group of Pictures • Group of pictures (GOP) • A GOP contains at least one I-picture • Must start with an I-picture in bitstream order • Can have any number of P-pictures and B-pictures • Display order: 1I 2B 3B 4P 5B 6B 7P 8B 9B 10I 11B 12B 13P 14B 15B 16P 17B 18B 19I … • Bitstream order: 1I 4P 2B 3B 7P 5B 6B 10I 8B 9B 13P 11B 12B 16P 14B 15B 19I 17B 18B …
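The reordering from display order to bitstream order can be sketched in a few lines of Python. This is only an illustration, assuming pictures are written as strings such as "4P" (display index plus type, as on the slide); the function name `display_to_bitstream` is hypothetical and not part of any MPEG tool.

```python
def display_to_bitstream(display):
    """Reorder pictures from display order to bitstream (coding) order.

    Each picture is a string such as "4P": display index followed by type.
    """
    bitstream = []
    pending_b = []  # B-pictures waiting for the anchor that follows them
    for pic in display:
        if pic.endswith("B"):
            pending_b.append(pic)
        else:
            # An I- or P-picture is an anchor: it is sent first, followed by
            # the B-pictures that precede it in display order.
            bitstream.append(pic)
            bitstream.extend(pending_b)
            pending_b = []
    # Assumption of this sketch: trailing B-pictures without a following
    # anchor are simply flushed at the end.
    bitstream.extend(pending_b)
    return bitstream


display = "1I 2B 3B 4P 5B 6B 7P 8B 9B 10I".split()
print(" ".join(display_to_bitstream(display)))
# prints: 1I 4P 2B 3B 7P 5B 6B 10I 8B 9B
```

Each B-picture is held back until the next I- or P-picture has been emitted, which is exactly why B-pictures add encoder delay.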

Group of Pictures (cont’d) • Closed GOP • Does not reference pictures in the previous GOP • Can be easily removed while editing • Open GOP: references the previous GOP • Examples in display order • Closed GOP: I B B P B B P B B P • Open GOP: B B I B B P B B P B B P (the leading B-pictures reference the previous P or I) • Closed GOP: B B I B B P B B P B B P (the leading B-pictures reference only the next I)

System Stream Layer • An MPEG stream is segmented into packs • A pack contains information about the system clock, bit rate, and number of video and audio streams • Multiplexing of video streams and audio streams • A pack can contain multiple packets, e.g. three packets for video stream 1, video stream 2, and audio stream 1 • Packet • Each packet contains a segment of data from a video stream or an audio stream • Has a presentation time and/or a decoding time • The payloads of contiguous packets are combined to form an elementary stream

Coding MPEG Video • Rate control within a sequence • Allocate a bit budget to each picture • A reasonable ratio is I:P:B = 8:5:1 • Give I- and P-pictures the same visual quality and reduce the bits for B-pictures: B-pictures are never used as references, so their lower quality does not propagate • If there is little motion or change, give the I-picture more bits; if there is a lot of motion or change, take bits from the I-picture and give them to the P-pictures • Video stream of VCD: 1394.4 kbps at 30 pictures/s; a typical GOP is IBBPBBPBBPBBPBB or IBBBPBBBPBBBPBBBP
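As a worked example of the I:P:B = 8:5:1 ratio, the sketch below splits the bit budget of one GOP among its pictures. It assumes a constant bit rate, takes the 1394.4 kbps and 30 pictures/s figures from the slide at face value, and uses a hypothetical helper name `allocate_gop_bits`; a real encoder would adapt these targets to the content.

```python
def allocate_gop_bits(gop, bit_rate_bps, picture_rate, weights=None):
    """Return the target bits per picture for each picture type in a GOP.

    gop is a string of picture types, e.g. "IBBPBBPBBPBBPBB".
    """
    if weights is None:
        weights = {"I": 8, "P": 5, "B": 1}  # ratio from the slide
    # Bits available for the whole GOP at a constant bit rate.
    gop_bits = bit_rate_bps * len(gop) / picture_rate
    total_weight = sum(weights[t] for t in gop)
    return {t: gop_bits * weights[t] / total_weight for t in sorted(set(gop))}


targets = allocate_gop_bits("IBBPBBPBBPBBPBB", 1_394_400, 30)
for picture_type, bits in targets.items():
    print(f"{picture_type}: {bits:.0f} bits per picture")
```

With this GOP (1 I, 4 P, 10 B) the weights sum to 38, so an I-picture receives 8/38 of the GOP budget and each B-picture only 1/38.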

Rate Control within a Picture • Allocate target bits for each macroblock • If the generated bits exceed the target bits • Increase the quantizer scale • Discard the high-frequency DCT coefficients • If the generated bits fall below the target bits • Decrease the quantizer scale • Insert macroblock stuffing bits • How to allocate bits? • Smaller quantizer scale for smooth areas to avoid blocking artifacts • Larger quantizer scale for rough areas to save bits
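The feedback rule above can be sketched as a single update step. This is a deliberately simplified illustration: the function name and the fixed step of 1 are assumptions, and only the 1 to 31 range of the quantizer scale comes from the slides.

```python
def update_quantizer_scale(scale, generated_bits, target_bits, step=1):
    """Adjust the quantizer scale after comparing generated and target bits."""
    if generated_bits > target_bits:
        scale += step  # too many bits: quantize more coarsely
    elif generated_bits < target_bits:
        scale -= step  # bits to spare: quantize more finely
    # The quantizer scale is limited to the range 1..31.
    return max(1, min(31, scale))


scale = 8
for generated, target in [(1200, 1000), (1100, 1000), (700, 1000)]:
    scale = update_quantizer_scale(scale, generated, target)
    print(scale)
```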

Slice selection • Each slice header requires 40 bits • For a video (30 pictures/s) with a vertical resolution of 240, there are 15 slices per picture if each row of macroblocks is a slice • If a picture contains only one slice → 1200 bps for the slice headers • If a picture contains 15 slices → 18000 bps for the slice headers • A slice is the minimum independently decodable unit • In an error-free environment, one slice per picture may be appropriate • In a noisy environment, one slice per row of macroblocks may be more desirable • Each slice has a quantizer scale, ranging from 1 to 31
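The slice-header overhead figures follow directly from the numbers on the slide (40 bits per header, 30 pictures/s, 240 lines, 16-line macroblock rows). A short check, with hypothetical names:

```python
SLICE_HEADER_BITS = 40  # per slice, from the slide
MACROBLOCK_SIZE = 16    # a macroblock row covers 16 lines


def slice_overhead_bps(slices_per_picture, pictures_per_second=30):
    """Bits per second spent on slice headers alone."""
    return SLICE_HEADER_BITS * slices_per_picture * pictures_per_second


rows = 240 // MACROBLOCK_SIZE          # 15 macroblock rows
print(slice_overhead_bps(1))           # prints 1200
print(slice_overhead_bps(rows))        # prints 18000
```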

Motion Estimation • The motion vector range is much larger than in H.261 • 1024 for full pixel or 512 for half pixel • Full search is not practical, so a faster search algorithm is required

Coding I-Pictures • Macroblock types in I-pictures • intra-d: encoded in intra mode with the default quantization • intra-q: encoded in intra mode with an updated quantization • Each intra-q macroblock requires 5 extra bits for the quantizer scale, ranging from 1 to 31 • A macroblock is divided into four luminance blocks and two chrominance blocks; all six blocks have to be DCT coded

Coding blocks in I-Pictures • Apply the DCT to each block, as defined in H.261 • Quantize the coefficients with the uniform quantizer for I-pictures • The final quantizer scale for DC is always 8 • The final quantizer scale for each AC coefficient is the corresponding value in the quantization matrix multiplied by the quantizer scale of the macroblock

Coding blocks in I-Pictures (cont’d) • The quantized DC is DPCM + entropy coded • The quantized ACs are zig-zag scanned and then entropy coded • Example:
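A minimal sketch of the two steps named above: the zig-zag scan of a quantized block and DPCM of the DC values. The entropy coding stage is omitted, and the initial DC predictor of 0 is a simplifying assumption of this sketch rather than the value a real bitstream uses; all function names are hypothetical.

```python
def zigzag_order(n=8):
    """Return (row, col) positions of an n x n block in zig-zag order."""
    def key(rc):
        r, c = rc
        diagonal = r + c
        # Odd anti-diagonals run top to bottom, even ones bottom to top.
        return (diagonal, r if diagonal % 2 else -r)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


def zigzag_scan(block):
    """Flatten a square block of quantized coefficients in zig-zag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


def dc_dpcm(dc_values, predictor=0):
    """Replace each DC value by its difference from the previous one."""
    diffs = []
    for dc in dc_values:
        diffs.append(dc - predictor)
        predictor = dc
    return diffs


block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0], block[2][0] = 50, 3, -2, 1
print(zigzag_scan(block)[:6])   # low-frequency coefficients come first
print(dc_dpcm([50, 52, 49]))    # differences between successive DC values
```

The scan groups the nonzero low-frequency coefficients at the front and leaves a long run of zeros at the end, which is what makes the subsequent run-length and entropy coding effective.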

Coding P-Pictures • Seven macroblock types in P-pictures • -m: motion compensation, requires a motion vector • -c: coded block pattern indicating which blocks are DCT coded • -q: change of quantizer scale • skipped: no data is coded; the macroblock is copied from the reference picture with a zero motion vector

Coding P-Pictures (cont’d) • Coded block pattern (CBP) • Indicates which blocks are DCT coded • CBP = 32 * BY0 + 16 * BY1 + 8 * BY2 + 4 * BY3 + 2 * BCb + BCr • If all quantized coefficients in a block are zero, the block is not coded; if no block is coded, the macroblock is skipped • Selection of macroblock type (decision tree) • Non-intra, MC, coded, quant: Pred-mcq • Non-intra, MC, coded, not quant: Pred-mc • Non-intra, MC, not coded: Pred-m • Non-intra, no MC, coded, quant: Pred-cq • Non-intra, no MC, coded, not quant: Pred-c • Non-intra, no MC, not coded: Skipped • Intra, quant: Intra-q • Intra, not quant: Intra-d
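The CBP formula can be illustrated as follows. The sketch assumes each block is given as a flat list of quantized coefficients, in the order Y0, Y1, Y2, Y3, Cb, Cr; the function name is hypothetical.

```python
def coded_block_pattern(blocks):
    """Compute CBP = 32*BY0 + 16*BY1 + 8*BY2 + 4*BY3 + 2*BCb + BCr.

    blocks: six lists of quantized coefficients in the order
    Y0, Y1, Y2, Y3, Cb, Cr. A block counts as coded if any coefficient
    is nonzero.
    """
    cbp = 0
    for block in blocks:
        coded = any(coef != 0 for coef in block)
        cbp = (cbp << 1) | int(coded)  # first block ends up with weight 32
    return cbp


zero = [0] * 64
nonzero = [5] + [0] * 63
print(coded_block_pattern([nonzero, zero, zero, zero, zero, nonzero]))
# prints 33: Y0 (32) and Cr (1) are coded
```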

Coding blocks in P-Pictures • Intra blocks are coded as in I-pictures • Inter blocks • Apply the DCT to the residual • Quantize the coefficients with the dead-zone quantizer • The final quantizer scale for each AC coefficient is the corresponding value in the quantization matrix multiplied by the quantizer scale of the macroblock
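The difference between a uniform quantizer and a dead-zone quantizer can be sketched as rounding versus truncation toward zero. The step size here stands in for the combined matrix value and quantizer scale; the exact scaling constants of the standard are not given on the slide and are not reproduced, so this is an illustration of the idea only.

```python
def quantize_uniform(coef, step):
    """Round to the nearest multiple of step (as used for intra blocks)."""
    sign = -1 if coef < 0 else 1
    return sign * int(abs(coef) / step + 0.5)


def quantize_dead_zone(coef, step):
    """Truncate toward zero, which widens the interval mapped to level 0."""
    sign = -1 if coef < 0 else 1
    return sign * int(abs(coef) / step)


for coef in (15, -24, 40):
    print(coef, quantize_uniform(coef, 16), quantize_dead_zone(coef, 16))
```

The wider zero interval suits prediction residuals, which are mostly small values near zero.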

Coding B-Pictures • Eleven macroblock types in B-pictures • -i: interpolation, -c: coding pattern, -f: forward, -b: backward, -q: quantization

Coding B-Pictures (cont’d) • Selection of macroblock type • Because B-pictures have the lowest bit rate, try the skipped type first • Perform forward and backward motion estimation, then interpolation, and choose the best of the three • Decision tree (* stands for the selected prediction mode: f, b, or i) • Non-intra, coded, quant: Pred-*cq • Non-intra, coded, not quant: Pred-*c • Non-intra, not coded: Pred-* or skipped • Intra, quant: Intra-q • Intra, not quant: Intra-d

Decoding a Sequence for VCR Command • Decoding for fast forward • Discard the B-pictures and decode only the I- and P-pictures • Or discard the P- and B-pictures and decode only the I-pictures • Decoding for reverse play • Requires a large buffer to store the whole bitstream of a GOP, which is then decoded and displayed in reverse order • Pictures in display order: 0B 1B 2I 3B 4B 5P 6B 7B 8P 9B 10B 11P • Pictures in decoding order: 2I 0B 1B 5P 3B 4B 8P 6B 7B 11P 9B 10B • Pictures in new order: 2I 5P 8P 11P 10B 9B 7B 6B 4B 3B 1B 0B
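The "new order" for reverse play on this slide can be reproduced with a short sketch: decode all anchor pictures (I and P) of the GOP first, in forward order, then take the B-pictures in reverse display order. The picture notation ("5P" for display index 5, type P) and the function name are assumptions of this example.

```python
def reverse_play_order(display):
    """Processing order of one GOP for reverse play.

    Anchors must be decoded forward because each P-picture depends on the
    previous anchor; B-pictures can then be taken from last to first.
    """
    anchors = [pic for pic in display if not pic.endswith("B")]
    b_pictures = [pic for pic in display if pic.endswith("B")]
    return anchors + b_pictures[::-1]


display = "0B 1B 2I 3B 4B 5P 6B 7B 8P 9B 10B 11P".split()
print(" ".join(reverse_play_order(display)))
# prints: 2I 5P 8P 11P 10B 9B 7B 6B 4B 3B 1B 0B
```

Holding every decoded anchor of the GOP at once is what drives the large buffer requirement.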

Pre- and Post-Processing • Pre-processing • Apply a median filter to remove noise • Apply a low-pass filter to smooth image edges, removing high frequencies to prevent ringing • Post-processing • Blocking artifacts are more visible in low-frequency blocks • Low-pass filter at block boundaries • Wide low-pass filter across adjacent smooth blocks

Pre- and Post-Processing (cont’d) • Ringing artifacts appear along sharp edges, in other words, in high-frequency blocks • Detect the edges in a ringing block with the Sobel masks, marking a pixel as an edge if the response exceeds a threshold • Apply a simple low-pass filter to the non-edge area
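The de-ringing steps above can be sketched in plain Python. Several choices here are assumptions of this sketch rather than details from the slide: the gradient magnitude is approximated by |gx| + |gy|, the low-pass filter is a 3 × 3 mean, border pixels are left untouched, and the function names are hypothetical.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_edge_map(img, threshold):
    """Mark interior pixels whose Sobel response exceeds the threshold."""
    h, w = len(img), len(img[0])
    edges = [[False] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(SOBEL_X[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(SOBEL_Y[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            edges[y][x] = abs(gx) + abs(gy) > threshold
    return edges


def smooth_non_edges(img, edges):
    """Apply a 3x3 mean filter to interior pixels not marked as edges."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            if not edges[y][x]:
                out[y][x] = sum(img[y + j][x + i]
                                for j in (-1, 0, 1) for i in (-1, 0, 1)) / 9
    return out


# A vertical step edge with one small ripple next to it.
image = [[0, 0, 0, 100, 100, 100] for _ in range(5)]
image[2][1] = 9
edge_map = sobel_edge_map(image, threshold=100)
filtered = smooth_non_edges(image, edge_map)
print(filtered[2])
```

Pixels on the step edge keep their original values, while the ripple in the flat area is attenuated.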

MPEG-2 Compared to MPEG-1 • Frame/Field adaptive motion compensation and DCT • Dual prime motion compensation (for P-pictures when no B-pictures) • Nonlinear quantization table with increased accuracy for small values • Alternate scan for DCT coefficients • New VLC tables for DCT coefficients coding • In addition to 4:2:0, also supports 4:2:2 and 4:4:4 • Support maximum motion vector range of -2048 to +2047.5 (always half-pixel motion vectors)

Frame/field DCT • Frame DCT • Field DCT

Nonlinear Quantization Table

Additional Chrominance Format • 4:2:0 • 4:2:2 • 4:4:4 Y Cb Cr
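To make the three chrominance formats concrete, the sketch below counts the 8 × 8 blocks in one 16 × 16 macroblock for each format. The subsampling factors are the usual meaning of 4:2:0, 4:2:2, and 4:4:4; the names in the code are hypothetical.

```python
# (horizontal divisor, vertical divisor) applied to each chroma component
CHROMA_SUBSAMPLING = {"4:2:0": (2, 2), "4:2:2": (2, 1), "4:4:4": (1, 1)}


def blocks_per_macroblock(chroma_format, mb_size=16, block_size=8):
    """Number of 8x8 blocks (Y + Cb + Cr) in one macroblock."""
    h_div, v_div = CHROMA_SUBSAMPLING[chroma_format]
    luma_blocks = (mb_size // block_size) ** 2
    chroma_blocks = ((mb_size // h_div) // block_size) * ((mb_size // v_div) // block_size)
    return luma_blocks + 2 * chroma_blocks  # Cb and Cr


for fmt in CHROMA_SUBSAMPLING:
    print(fmt, blocks_per_macroblock(fmt))
# prints: 4:2:0 6, then 4:2:2 8, then 4:4:4 12
```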

Alternate Scan for DCT Coefficients

Major Components of an MPEG-4 Terminal

MPEG-4 Components • Face • 66 facial animation parameters • Primary facial expressions • 14 visemes • VO (Video Object) • Shape • Motion vectors • Texture • Texture • From VOP • Still texture (Discrete Wavelet Transform) • AO (Audio Object) • MPEG Layer 1-3 • AAC (Advanced Audio Coding) • TTS (Text-To-Speech) • 2D Mesh • Triangular patches • Motion vector

Content-based Audio-Visual Representation • Audio-Visual Object (AVO) • Video object component (video object plane, VOP) • natural or synthetic • 2D or 3D • Audio object component • mono, stereo or multichannel

Video Object Planes (VOP) • Characteristics of VOPs • May have different spatial and temporal resolutions • May be associated with different degrees of accessibility → sub-VOPs • May be separate or overlapping • VOP types • Traditional I, P, B types • S-VOP (sprite) for background

Video Object Plane Type • Figure: S-VOP, I-VOP, P-VOP, and B-VOP shown along the time axis

Content-based Object Manipulation • Object manipulation • Change of the spatial position of a VOP • Application of a spatial scaling factor to a VOP • Change of the speed with which a VOP moves • Insertion of new VOPs • Deletion of an object in the scene • Change of the scene area

Example of Bit stream Manipulation

Segmentation Process • Depending on the application, segmentation can be performed • Online (real-time) or offline (non-real-time) • Automatically or semi-automatically • Examples • Video conferencing • Real-time, automatic • Separate the foreground (communication partner) from the background • Object tracking in video • May allow offline and semi-automatic operation • Separate the moving object from the others

Compression • Improved coding efficiency • 5-64 kbps for mobile applications • Up to 20 Mbps for TV/film applications • Subjectively better quality compared to existing standards • Coding of multiple concurrent data streams • Can code multiple views of a scene efficiently, e.g. stereo video

Coding VO in MPEG-4 • Reduce temporal redundancy • Motion estimation for arbitrarily shaped VOPs • Padding and modified block (polygon) matching motion estimation • Figure: I-VOP, B-VOP, and P-VOP along the time axis

Coding Procedure of VOP • BAB (Binary Alpha Block) • Motion Vector • CAE (Context-Based Arithmetic Encoding) • Rate Control by Sub-sampling • Texture • Motion Vector • DCT • Rate Control by Quantization Step

New Coding Features • For each macroblock, motion vectors can be computed on a 16 × 16 or 8 × 8 block basis • Unrestricted motion estimation: prediction can extend beyond the image boundary • Overlapped block motion compensation • Each texture component can range from 1 to 12 bits • More robust coding
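Unrestricted motion estimation can be illustrated by clamping reference coordinates to the picture area, which is equivalent to extending the edge pixels outward. This is a sketch under stated assumptions: integer-pel motion vectors only, a reference picture stored as a list of rows, and a hypothetical function name.

```python
def predict_block(ref, top, left, size, mv_y, mv_x):
    """Fetch a size x size prediction block displaced by (mv_y, mv_x).

    Coordinates falling outside the reference picture are clamped to the
    nearest edge pixel, so the motion vector may point across the boundary.
    """
    height, width = len(ref), len(ref[0])
    block = []
    for y in range(size):
        row = []
        for x in range(size):
            ry = min(max(top + y + mv_y, 0), height - 1)
            rx = min(max(left + x + mv_x, 0), width - 1)
            row.append(ref[ry][rx])
        block.append(row)
    return block


# 4x4 reference picture whose pixel value encodes its position (10*row + col).
reference = [[10 * r + c for c in range(4)] for r in range(4)]
print(predict_block(reference, top=0, left=0, size=2, mv_y=-1, mv_x=-1))
# prints [[0, 0], [0, 0]]: only the edge-extended top-left pixel is reached
```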
