ECEC 453 Image Processing Architecture. Lecture 10, 2/17/2004 MPEG-2, Industrial Strength Video Compression and Friends Oleh Tretiak Drexel University. Lecture Outline. Basic Video Coding Features of MPEG-1 Features of H261 MPEG-2 Introduction to MPEG-4. Picture of Layers.
ECEC 453 Image Processing Architecture Lecture 10, 2/17/2004 MPEG-2, Industrial Strength Video Compression and Friends Oleh Tretiak Drexel University
Lecture Outline • Basic Video Coding • Features of MPEG-1 • Features of H.261 • MPEG-2 • Introduction to MPEG-4
Video Compression: Picture Types • Group of Pictures: Three types • I — intraframe coding only • P — predictive coding • B — bi-directional coding
Typical MPEG coding parameters • Typical sequence • IPBBPBBPBBPBBPBB (16 frames)
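The typical sequence above can be examined with a short sketch. It counts the picture types in the 16-frame pattern and derives one plausible coding (transmission) order, under the assumption that each B picture is predicted from the nearest I or P picture on either side, so that both of those pictures must reach the decoder first. The function names are illustrative and are not taken from the lecture.

```python
from collections import Counter

GOP = "IPBBPBBPBBPBBPBB"  # display order, as listed on the slide


def picture_counts(gop):
    """Count how many I, P and B pictures the pattern contains."""
    return Counter(gop)


def coding_order(gop):
    """Reorder pictures so each B follows the I/P pictures it depends on.

    Assumption: a B picture references the nearest I or P picture before
    and after it in display order.
    """
    order, held_b = [], []
    for index, ptype in enumerate(gop):
        if ptype == "B":
            held_b.append((index, ptype))   # wait for the next I or P picture
        else:
            order.append((index, ptype))    # send the I or P picture first
            order.extend(held_b)            # then the B pictures it unblocks
            held_b = []
    # Trailing B pictures would really wait for the next group's I picture.
    order.extend(held_b)
    return order


counts = picture_counts(GOP)
order = coding_order(GOP)
print(dict(counts))
print(" ".join(f"{ptype}{index}" for index, ptype in order))
```

Ten of the sixteen pictures are B pictures, which is where most of the bit savings in such a sequence would be expected to come from.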
Block Diagram of MPEG Decoder • [Figure; labels: I frame, P frame, B frame]
Macroblock Coding: I & P • I pictures (almost like JPEG) • Divided into slices and macroblocks • No motion compensation • Each macroblock can have different quantization • DC and AC coded differently, as in JPEG • Different coding tables from JPEG • P pictures • Divided into slices and macroblocks • Option: no motion compensation • Option: can code block as inter or intra (like I picture) • Can skip macroblock (replace with previous). Great compression
Coding Image Blocks • B pictures • Inter or intra? • Forward, backward, interpolational? • Code block or skip? • Quantization step? • [Figure: Statistics for an image sequence]
MPEG-1: ‘1.5’ Mbps • Sample rate reduction in spatial and temporal domains • Spatial • Block-based DCT • Huffman coding (no arithmetic coding) of motion vectors and quantized DCT coefficients • 352 x 240 pixels, 12 bits per pixel, picture rate 30 pictures per second -> 30.4 Mbps • Coded bit stream 1.15 Mbps (must leave bandwidth for audio) • Compression 26:1 • Quality better than VHS! • Temporal • Block-based motion compensation • Interframe coding (two kinds)
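The source rate and compression ratio quoted on this slide can be checked with a few lines of arithmetic. The sketch assumes a 352 x 240 picture, which is the size that reproduces the 30.4 Mbps figure; the function name is illustrative.

```python
def raw_bit_rate(width, height, bits_per_pixel, pictures_per_second):
    """Uncompressed video bit rate in bits per second."""
    return width * height * bits_per_pixel * pictures_per_second


# Values from the slide; 352 x 240 is the assumed picture size.
raw = raw_bit_rate(352, 240, 12, 30)
coded = 1.15e6  # coded video bit stream, bits per second

ratio = raw / coded

print(f"raw rate: {raw / 1e6:.1f} Mbps")
print(f"compression ratio: about {ratio:.0f}:1")
```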
Video Teleconferencing • Comprehensive Standard: H.320 • Components of H.320 • H.261: Video coding, 64 to 1920 kbits/sec • G.722, G.726, G.728: Audio coding from 16 kbits/sec to 64 kbits/sec • H.221: Multiplexing of audio and video (frame based rather than packet based) • H.230 and H.242: Handshaking and control • H.233: encryption
H.261 Features • Common Interchange Format • Interoperability between 25 fps and 30 fps countries • 352 pix/line, 288 lines, 30 fps noninterlaced • Terminal equipment converts frame and line numbers • Y Cb Cr components, color sub-sampled by a factor of 2 in both directions • Coding • DCT, 8x8, 4 Y and 2 chrominance blocks per macroblock • I and P frames only, P blocks can be skipped • Motion compensation optional, only integer compensation • (Optional) forward error correction coding
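To make the CIF numbers concrete, the sketch below works out how a CIF picture divides into macroblocks and 8x8 blocks. It assumes the standard CIF luminance size of 352 x 288 and a 16x16 luminance macroblock (four 8x8 Y blocks plus one Cb and one Cr block); the helper name is illustrative.

```python
BLOCK = 8         # DCT block size in pixels
MACROBLOCK = 16   # luminance area covered by one macroblock


def cif_layout(width=352, height=288):
    """Macroblock and block counts for a picture with 2:1 chroma
    subsampling in both directions."""
    mb_cols = width // MACROBLOCK
    mb_rows = height // MACROBLOCK
    macroblocks = mb_cols * mb_rows
    return {
        "chroma_size": (width // 2, height // 2),
        "macroblock_grid": (mb_cols, mb_rows),
        "macroblocks": macroblocks,
        # 4 Y blocks + 1 Cb block + 1 Cr block per macroblock
        "blocks": macroblocks * 6,
    }


layout = cif_layout()
for key, value in layout.items():
    print(f"{key}: {value}")
```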
H.261 vs MPEG-1 • Similarities • CIF, SIF, non-interlaced • DCT technology • Differences • H.261 uses mostly P frames, no B frames • H.261 typical bit rates much lower (down to 64 kbits/sec) • Low bit rates achieved by reducing frame rate • Simpler motion compensation • End-to-end coding delay must be low • Conclusion: Same technology, different design to meet different needs
MPEG 2^i, i = 0, 1 • History & Goals • Expanding universe of video coding • What are MPEG-2 profiles? • Features of MPEG-2
MPEG Home • Official web site • (http://www.cselt.it/mpeg/ still works) • http://mpeg.telecomitalialab.com/ • Information site • http://www.mpeg.org/MPEG/ (unchanged) • History • MPEG-1, the standard for storage and retrieval of moving pictures and audio on storage media (approved Nov. 92) • MPEG-2, the standard for digital television (approved Nov. 94) • MPEG-4 version 1, the standard for multimedia applications (approved Oct. 98), version 2 (approved Dec. 99) • Under development: • MPEG-4 versions 3 & 4 • MPEG-7, the content representation standard for multimedia information search, filtering, management and processing • Started MPEG-21, the multimedia framework.
MPEG Example • Film on DVD: 8 Gbytes • Playing time: 2 hours • Bit rate 8e9 bytes x 8 bits/byte / 7200 seconds ~ 9 Mbits/sec • Information? on the web • http://www.microsoft.com/windowsxp/moviemaker/expert/digitalvideo.asp • ‘Bit Rate Explained Bit rate describes how much information there is per second in a stream of data. You might have seen audio files described as “128–Kbps MP3” or “64–Kbps WMA.” Kbps stands for “kilobytes per second,” ....’ • Site claims that 64 Kbps WMA is as good as 128 Kbps MP3 • Ignorance about bits and bytes does not encourage credibility
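The DVD estimate, and the bits-versus-bytes point made on this slide, can be reproduced directly. This is a small sketch using the slide's figures (8 Gbytes, 2 hours); the function names are illustrative.

```python
def average_bit_rate(total_bytes, seconds):
    """Average bit rate in bits per second for a file of the given size."""
    return total_bytes * 8 / seconds


def kbps_to_kilobytes_per_second(kbps):
    """Convert kilobits per second to kilobytes per second (8 bits/byte)."""
    return kbps / 8


dvd_rate = average_bit_rate(8e9, 2 * 3600)
print(f"DVD film: about {dvd_rate / 1e6:.1f} Mbits/sec")

# A 128-Kbps MP3 stream is 128 kilobits, i.e. 16 kilobytes, per second.
mp3_kilobytes = kbps_to_kilobytes_per_second(128)
print(f"128 Kbps = {mp3_kilobytes:.0f} kilobytes/sec")
```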
MPEG-2 Goals • Compatibility with MPEG-1 • Good picture quality • Flexibility in input format • Random access capability (I pictures) • Capability for fast forward, fast reverse play, stop frame • Bit stream scalability • Low delay for 2-way communications (videoconferencing) • Resilience to bit errors
MPEG-2 Implications • No reason to restrict to CCIR 601 • High resolution can be included (HDTV) • No single standard can satisfy all requirements • Family of standards • Most applications use a small set of the features • Toolkit approach
MPEG-2 profiles • A profile is a subset of the entire MPEG-2 bit-stream syntax • Simple • Main • 4:2:2 • SNR • Spatial • High • Multiview • Each profile has several levels (resolution quality) • Low — MPEG1 • Main — CCIR 601 • High-1440 (Video Editing) • High (HDTV)
Features of MPEG-2 • Support of both non-interlaced and interlaced pictures • Color handling • Y Cb Cr color space • Several subsampling schemes are used • 4:2:0, 4:2:2, 4:4:4 • MPEG-2 sequence can be either frames or fields • Both frame prediction and field prediction are supported • There can be motion between two fields in a frame, so that frame prediction is more tricky • In frame prediction, both fields constitute one picture • In field prediction, either field in the previous frame or the previous field in this frame can be used as reference • Robustified coding of motion vectors to protect against bit errors • Special prediction modes: 16x8, dual-prime
MPEG-2: DCT and Quantization • Two quantizers: one for intra blocks and one for non-intra blocks • Support different quantization blocks for luminance and chrominance • Scalable bit streams • data partitioning, SNR scalability, temporal scalability, spatial scalability • Data partitioning: headers and motion vectors in two bit streams • SNR scalability: lower layer provides basic video, other layers provide enhancements. Basic layer sent with robust modulation • Spatial scalability: lower layer provides basic resolution (e.g., MPEG-1), upper layer provides detail • Temporal scalability: lower layer provides basic (low) frame rate
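The idea behind SNR scalability can be sketched with a plain uniform quantizer: a base layer codes each coefficient coarsely, and an enhancement layer codes what the base layer missed with a finer step. The step sizes, the sample coefficients, and the quantizer itself are assumptions chosen for illustration; this is not the quantizer defined by MPEG-2.

```python
def quantize(value, step):
    """Uniform quantizer: map a value to the nearest multiple of step."""
    return round(value / step)


def dequantize(level, step):
    """Reconstruct a value from its quantization level."""
    return level * step


BASE_STEP = 16  # hypothetical coarse step for the base layer
ENH_STEP = 4    # hypothetical fine step for the enhancement layer

coefficients = [312.0, -47.0, 21.0, -9.0, 5.0, 0.0]  # made-up DCT values

# Base layer: coarse version, decodable on its own.
base_levels = [quantize(c, BASE_STEP) for c in coefficients]
base_recon = [dequantize(q, BASE_STEP) for q in base_levels]

# Enhancement layer: codes the error left by the base layer.
enh_levels = [quantize(c - b, ENH_STEP)
              for c, b in zip(coefficients, base_recon)]
full_recon = [b + dequantize(e, ENH_STEP)
              for b, e in zip(base_recon, enh_levels)]

base_error = max(abs(c - r) for c, r in zip(coefficients, base_recon))
full_error = max(abs(c - r) for c, r in zip(coefficients, full_recon))

print("base layer only :", base_recon, "max error", base_error)
print("base + enhanced :", full_recon, "max error", full_error)
```

A decoder that receives only the base layer still gets usable video; one that also receives the enhancement layer reduces the quantization error.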
MPEG-2: Profiles • 4:2:2 profile at Main level • Two Y blocks for each pair of Cb, Cr blocks • Distribution format for video production • Robust for several compressions and decompressions • 720x608, 30 fps • 50 Mbit/sec • Luminance full raster, chrominance are at full line rate • DC precision of intra blocks can be up to 11 bits • Main (4:2:0) profile at Main level • Four Y blocks for each pair of Cb, Cr blocks • Intended for broadcast quality (actually, is better) • 15 Mbit/sec • Main profile at low level • Like MPEG-1
MPEG2 features • Schemes for ‘frame’ and field coding. • There are two fields in a frame, T (top) B (bottom) • Either can be first • Frame prediction for frame pictures • What’s there to say? • Field prediction for field pictures • Target macroblock is in one field • Prediction pixels come from one field • Can be the same or different parity as target field • Field prediction for frame pictures • Dual prime for P-pictures • 16x8 macroblock for field pictures • Motion vectors coded at half-pel resolution
MPEG2 - Alternate Scan • [Figure: zig-zag scan and alternate scan patterns]
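Since the scan figures did not survive, here is a minimal sketch that generates the conventional zig-zag order for an 8x8 block by walking the anti-diagonals in alternating directions. The alternate scan is a fixed table defined in the MPEG-2 standard and is not reproduced here. The function names and the sample block are illustrative.

```python
def zigzag_order(n=8):
    """Return (row, col) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):              # s = row + col, one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)           # even diagonals run bottom-left to top-right
        order.extend((r, s - r) for r in rows)
    return order


def scan(block, order):
    """Read a 2-D block into a 1-D list following the given scan order."""
    return [block[r][c] for r, c in order]


order = zigzag_order()

# Made-up quantized block: only a few low-frequency coefficients are non-zero.
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0], block[1][1] = 50, -3, 4, 1

scanned = scan(block, order)
print(order[:6])
print(scanned[:8], "... followed by", scanned[8:].count(0), "zeros")
```

The scan places the non-zero low-frequency coefficients at the front, leaving a long run of zeros at the end that is cheap to code.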
MPEG2 - Subsampling • Suppose picture is 720x480 • 4:4:4 • Luminance and chrominance @ 720x480 • 4:2:2 • Luminance @ 720x480, chrominance 360x480 • 4:2:0 • Luminance 720x480, chrominance 360x240 • Weird terminology
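The chrominance sizes listed above follow from a horizontal and a vertical subsampling factor for each scheme. The sketch below tabulates them for a 720x480 picture and also derives the average bits per pixel, assuming 8-bit samples; the table and function names are illustrative.

```python
# (horizontal factor, vertical factor) applied to each chrominance component
SUBSAMPLING = {
    "4:4:4": (1, 1),
    "4:2:2": (2, 1),
    "4:2:0": (2, 2),
}


def chroma_size(width, height, scheme):
    """Size of each chrominance component (Cb or Cr)."""
    h, v = SUBSAMPLING[scheme]
    return width // h, height // v


def bits_per_pixel(scheme, sample_bits=8):
    """Average bits per pixel: one Y sample plus a share of Cb and Cr."""
    h, v = SUBSAMPLING[scheme]
    return sample_bits * (1 + 2 / (h * v))


for scheme in SUBSAMPLING:
    size = chroma_size(720, 480, scheme)
    print(f"{scheme}: Y 720x480, Cb/Cr {size[0]}x{size[1]}, "
          f"{bits_per_pixel(scheme):.0f} bits/pixel")
```

The 12 bits per pixel obtained for 4:2:0 matches the figure used in the MPEG-1 rate calculation earlier in the lecture.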
Low • Y ~ 352x240 • Cb, Cr ~ 176x120 • 30 pictures per second • +/- 64 pixel displacement, half pixel resolution
Main (4:2:0) • Y ~ 720x480 • Cb,Cr ~ 360x240 • 30 frames per second • 4:3, 16:9 aspect ratio • Bitrate 15 Mbps (some applications as low as 5 Mbps) • Digital television
High • Y 1920x1152 • Cb, Cr 960x576 • 60 frames per second • 80 Mbps • HDTV
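One way to compare the three levels above is by the number of macroblocks a decoder must handle per second and, where a bit rate is quoted, by the average bit budget per macroblock. The sketch assumes 16x16 macroblocks and uses only the picture sizes, picture rates and bit rates listed on the slides; no bit rate is listed for the Low level, so none is assumed.

```python
import math

MACROBLOCK = 16  # luminance pixels per macroblock side

# level: (Y width, Y height, pictures per second, bit rate in bits/sec or None)
LEVELS = {
    "Low": (352, 240, 30, None),
    "Main": (720, 480, 30, 15e6),
    "High": (1920, 1152, 60, 80e6),
}


def macroblocks_per_second(level):
    """Macroblock throughput required at the given level."""
    width, height, rate, _ = LEVELS[level]
    per_picture = (math.ceil(width / MACROBLOCK)
                   * math.ceil(height / MACROBLOCK))
    return per_picture * rate


def bits_per_macroblock(level):
    """Average coded bits available per macroblock, if a bit rate is listed."""
    bit_rate = LEVELS[level][3]
    if bit_rate is None:
        return None
    return bit_rate / macroblocks_per_second(level)


for level in LEVELS:
    budget = bits_per_macroblock(level)
    budget_text = "n/a" if budget is None else f"{budget:.0f} bits"
    print(f"{level}: {macroblocks_per_second(level)} macroblocks/sec, "
          f"budget per macroblock {budget_text}")
```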
Low rate • Where is it needed? • How is it done?
MPEG-4 Multimedia Standard: Thumbnail Description
What Is Left for MPEG-4? • Initial goals • Coding standards for lower-than-MPEG-1 rates • Hidden agenda: Incorporate new coding methods • Wavelet, fractal • Revised agenda: Object-based coding • MPEG-4 Architecture • Input to coder consists of audio, video, and stored objects • Decoder combines encoded objects with local objects • Example: send text by sending character codes, receiver uses character generator.
MPEG-4 Ideas • Video Object Plane (VOP) • A VOP can be a natural image from a video camera or from a graphics database • A VOP can consist of several visual objects. Visual objects do not have to have a rectangular outline (arbitrary shape) • A scene consists of several VO’s and VOP’s with appropriate compositing • Different VOP’s can have their own motion • In principle, a visual scene can be decomposed into video objects by segmentation. • Color and texture can be attributes of visual objects • A viewer can manipulate VO’s.