Introduction on MPEG Video Coding Standards

Introduction on MPEG Video Coding Standards. Yung-Ching Chang ( 張永清) Visual Communication Laboratory, CS, NTHU. Lossy Coding of Still Image - JPEG. Uncompressed Bitrate for Video. Motion Compensated Predictive Coding. Video Compression Standards. CCITT H.261 ITU-T Study Group 15


Presentation Transcript


  1. Introduction on MPEG Video Coding Standards • Yung-Ching Chang (張永清) • Visual Communication Laboratory, CS, NTHU

  2. Lossy Coding of Still Image - JPEG

  3. Uncompressed Bitrate for Video

  4. Motion Compensated Predictive Coding

  5. Video Compression Standards • CCITT H.261 • ITU-T Study Group 15 • Videophone and video conferencing • 1988-1990: p × 64 kbps (p = 1…30) • ITU-T H.263 • PSTN and mobile networks: 10 to 24 kbps • 1994: H.263, H.263+…

  6. Video Compression Standards (cont’d) • MPEG-1 Video (ISO/IEC 11172-2) • 1.2 ~ 1.5 Mbps • Video for digital storage media, CD-ROM • Sep 1990 • MPEG-2 Video (ISO/IEC 13818-2) • 2 ~ 30 Mbps • Digital broadcast TV, HDTV, video services over networks • Nov 1993 • MPEG-4 (ISO/IEC 14496) • An emerging coding standard • Universal access

  7. MPEG-1 vs. H.261 (Conceptually) • H.261 • Shorter algorithm delay • Lower compression complexity • Lower memory requirement • Limited flexibility in bit rate control • MPEG-1 • Longer algorithm delay • Higher compression complexity • Higher memory requirement • More coding modes, which support greater bit rate flexibility

  8. Algorithm Delay • H.261 • MPEG-1 • A B-picture cannot be coded until the next P- or I-picture has been coded

  9. Compression Complexity • H.261 • MPEG-1

  10. Memory Requirement • H.261 • MPEG-1

  11. Bit Rate Flexibility • H.261 • MPEG-1 • The GOP structure and B-pictures offer more flexibility in the coding bit rate

  12. MPEG-1 vs. H.261 (Technically) • MPEG-1 • Bi-directional motion compensation (B-picture) • Group of pictures (GOP) • Half-pel motion compensation • Visually weighted quantization • No picture size or bit rate constraints • Flexible slice structure instead of GOB

  13. MPEG-1 Coding Hierarchy • Video sequence → divided into groups of pictures (GOPs) • GOP → pictures, e.g. I B B P B B P B B P B B I B B P … • (Figure: motion estimation between the pictures of a GOP)

  14. MPEG-1 Coding Hierarchy (cont’d) • Picture → slices (Slice 1 to Slice 13 in the figure) • Slice → macroblocks (16 × 16) • Macroblock → 8 × 8 blocks (Y, Cb, Cr)

  15. Some Coding Schemes • GOP • Random access • Prevents error propagation • B-picture • Pros: best prediction and compression; handles object occlusion and entrance into the scene; noise averaging • Cons: encoder delay, high complexity, large encoder buffer required • Slice • Synchronization unit • Suited to localized image properties

  16. Group of Pictures • Group of pictures (GOP) • A GOP contains at least one I-picture • Must start with an I-picture in bitstream order • Can have any number of P-pictures and B-pictures • Display order: 1I 2B 3B 4P 5B 6B 7P 8B 9B 10I 11B 12B 13P 14B 15B 16P 17B 18B 19I … • Bitstream order: 1I 4P 2B 3B 7P 5B 6B 10I 8B 9B 13P 11B 12B 16P 14B 15B 19I 17B 18B …
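The reordering from display order to bitstream order in slide 16 can be sketched as follows. This is a minimal illustration rather than anything taken from the slides: the function name `display_to_bitstream` and the label format (picture number followed by its type letter) are assumptions chosen to match the example above. Each B-picture is held back until the next I- or P-picture has been emitted.

```python
def display_to_bitstream(display_order):
    """Reorder picture labels such as '2B' from display order to bitstream order."""
    bitstream = []
    pending_b = []  # B-pictures waiting for their future reference picture
    for label in display_order:
        if label.endswith("B"):
            pending_b.append(label)
        else:
            # An I- or P-picture is sent first, followed by the B-pictures
            # that precede it in display order.
            bitstream.append(label)
            bitstream.extend(pending_b)
            pending_b = []
    # Trailing B-pictures with no following reference are appended as-is
    # (a simplification for this sketch).
    bitstream.extend(pending_b)
    return bitstream


display = "1I 2B 3B 4P 5B 6B 7P 8B 9B 10I".split()
print(" ".join(display_to_bitstream(display)))
# prints: 1I 4P 2B 3B 7P 5B 6B 10I 8B 9B
```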

  17. Group of Pictures (cont’d) • Closed GOP • Does not reference pictures in the previous GOP • Can be easily removed while editing • Open GOP: references the previous GOP • Examples in display order • Closed GOP: I B B P B B P B B P • Open GOP: B B I B B P B B P B B P (the leading B-pictures reference the previous P or I) • Closed GOP: B B I B B P B B P B B P (the leading B-pictures reference only the next I)

  18. System Stream Layer • An MPEG stream is segmented into packs • A pack contains information about the system clock, bit rate, and number of video and audio streams • Multiplexing of video streams and audio streams • A pack can contain multiple packets, e.g. three packets for video stream 1, video stream 2, and audio stream 1 • Packet • Each packet contains a segment of data from a video stream or an audio stream • Has a presentation time and/or decoding time • The payloads of contiguous packets are combined to form an elementary stream

  19. Coding MPEG Video • Rate control within a sequence • Allocate the bit rate among the pictures • A reasonable ratio is I:P:B = 8:5:1 • Give I- and P-pictures the same visual quality and reduce the bit rate of B-pictures to save bits; because B-pictures are not referenced, their lower quality does not propagate • If there is little motion or change, the I-picture should get more bits; if there is a lot of motion or change, reduce the bits of the I-picture and give them to the P-pictures • Video stream of VCD: 1394.4 kbps at 30 pictures per second; a typical GOP is IBBPBBPBBPBBPBB or IBBBPBBBPBBBPBBBP
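As a rough illustration of the I:P:B = 8:5:1 ratio in slide 19, the sketch below splits the bit budget of one GOP among its pictures. It assumes the ratio applies per picture and that every GOP receives the same budget; the function name `allocate_gop_bits` and its parameters are hypothetical and not part of the slides.

```python
def allocate_gop_bits(gop, bitrate_bps, pictures_per_second, weights=None):
    """Return the target bits per picture for each picture type in one GOP.

    gop is a string of picture types such as "IBBPBB". The weights are
    assumed to apply per picture (I:P:B = 8:5:1 by default).
    """
    if weights is None:
        weights = {"I": 8, "P": 5, "B": 1}
    # Bits available for the whole GOP at a constant bit rate.
    gop_bits = bitrate_bps * len(gop) / pictures_per_second
    total_weight = sum(weights[t] for t in gop)
    return {t: gop_bits * weights[t] / total_weight for t in sorted(set(gop))}


targets = allocate_gop_bits("IBBPBBPBBPBBPBB", 1_394_400, 30)
for picture_type, bits in targets.items():
    print(picture_type, round(bits))
```

A real encoder would adapt these targets to the amount of motion, as the slide notes; the fixed weights here are only a starting point.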

  20. Rate Control within a Picture • Allocate target bits to each macroblock • If the generated bits exceed the target bits • Increase the quantizer scale • Discard the high-frequency DCT coefficients • If the generated bits are below the target bits • Decrease the quantizer scale • Insert macroblock stuffing bits • How to allocate bits? • Smaller quantizer scale for smooth areas to avoid blocking effects • Larger quantizer scale for rough areas to save bits
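The feedback rule in slide 20 can be sketched as a single update step. This is an assumed, deliberately simple controller (one step up or down per macroblock); the function name `next_quantizer_scale` and the step size are illustrative choices, and the 1 to 31 range is the quantizer scale range stated in slide 21.

```python
def next_quantizer_scale(current_scale, generated_bits, target_bits, step=1):
    """Move the quantizer scale toward the bit target, clamped to 1..31."""
    if generated_bits > target_bits:
        # Too many bits produced: quantize more coarsely.
        current_scale += step
    elif generated_bits < target_bits:
        # Bits to spare: quantize more finely.
        current_scale -= step
    return max(1, min(31, current_scale))


scale = 10
for generated, target in [(1200, 1000), (1100, 1000), (800, 1000)]:
    scale = next_quantizer_scale(scale, generated, target)
    print(scale)
# prints 11, 12, 11 on separate lines
```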

  21. Slice Selection • Each slice header requires 40 bits • For a video (30 pictures/s) with a vertical resolution of 240, there are 15 slices per picture if each row of macroblocks is a slice • If a picture contains only one slice → 1200 bps for the slice headers • If a picture contains 15 slices → 18000 bps for the slice headers • A slice is the minimum independently decodable unit • In an error-free environment, one slice per picture may be appropriate • If the environment is noisy, one slice per row of macroblocks may be more desirable • A slice has a quantizer scale, ranging from 1 to 31
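The overhead figures in slide 21 follow from simple arithmetic, reproduced in the sketch below. The function names are hypothetical; the 40-bit header, 30 pictures/s, and 16-line macroblock rows come from the slides.

```python
def slices_per_picture(vertical_resolution, macroblock_height=16):
    """Number of slices when each row of macroblocks is one slice."""
    return vertical_resolution // macroblock_height


def slice_header_overhead_bps(slices, pictures_per_second=30, header_bits=40):
    """Bits per second spent on slice headers."""
    return header_bits * slices * pictures_per_second


print(slices_per_picture(240))                                  # prints 15
print(slice_header_overhead_bps(1))                             # prints 1200
print(slice_header_overhead_bps(slices_per_picture(240)))       # prints 18000
```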

  22. Motion Estimation • The search range is longer than in H.261 • 1024 for full pixel or 512 for half pixel • Full search is not suitable; a faster search algorithm is required • (Figure: picture sequence I B B P)

  23. Coding I-Pictures • Macroblock types in I-pictures • intra-d: encoded in intra mode with the default quantization • intra-q: encoded in intra mode with updated quantization • Each intra-q requires 5 extra bits for the quantizer scale, ranging from 1 to 31 • A macroblock is divided into four luminance blocks and two chrominance blocks; all six blocks have to be DCT coded

  24. Coding Blocks in I-Pictures • Apply the DCT to each block as defined in H.261 • Quantize the coefficients with the uniform quantizer for I-pictures • The final quantizer scale for DC is always 8 • The final quantizer scale for each AC coefficient is the corresponding value in the quantization matrix multiplied by the quantizer scale of this macroblock

  25. Coding blocks in I-Pictures (cont’d) • The quantized DC is DPCM + entropy coded • The quantized ACs are zig-zag scanned and then entropy coded • Example:
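A minimal sketch of the two steps named in slide 25, DPCM of the DC coefficients and zig-zag scanning of a block, is given below. The entropy coding stage is omitted. The function names and the `initial_predictor` parameter are assumptions for illustration; the standard defines its own predictor reset rules, which are not modelled here.

```python
def dpcm_encode(dc_values, initial_predictor=0):
    """Replace each DC value by its difference from the previous DC value."""
    diffs = []
    predictor = initial_predictor
    for dc in dc_values:
        diffs.append(dc - predictor)
        predictor = dc
    return diffs


def zigzag_order(n=8):
    """Return (row, col) positions of an n x n block in zig-zag order."""
    positions = [(r, c) for r in range(n) for c in range(n)]
    # Walk the anti-diagonals; odd diagonals run top-right to bottom-left,
    # even diagonals run bottom-left to top-right.
    return sorted(
        positions,
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


def zigzag_scan(block):
    """Flatten a square block into a list following the zig-zag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


block = [[r * 8 + c for c in range(8)] for r in range(8)]
print(zigzag_scan(block)[:6])      # prints [0, 1, 8, 16, 9, 2]
print(dpcm_encode([50, 52, 49]))   # prints [50, 2, -3]
```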

  26. Coding P-Pictures • Seven macroblock types in P-pictures • -m: motion compensation, requires a motion vector • -c: coded block pattern, indicates which blocks are DCT coded • -q: changes the quantizer scale • skipped: uses the motion vector of the previous macroblock

  27. Coding P-Pictures (cont’d) • Coded block pattern (CBP) • Indicates which blocks are DCT coded • CBP = 32 × BY0 + 16 × BY1 + 8 × BY2 + 4 × BY3 + 2 × BCb + BCr • If all quantized coefficients in a block are zero, the block is not coded; if no block is coded, the macroblock is skipped • Selection of macroblock type (decision tree, starting at Begin) • Non-intra, MC, coded: Pred-mcq if the quantizer changes, otherwise Pred-mc • Non-intra, MC, not coded: Pred-m • Non-intra, no MC, coded: Pred-cq if the quantizer changes, otherwise Pred-c • Non-intra, no MC, not coded: Skipped • Intra: Intra-q if the quantizer changes, otherwise Intra-d
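The CBP formula and the macroblock type labels of slide 27 can be sketched in code as below. The function names and the boolean inputs are assumptions made for illustration; how an encoder actually decides between intra and inter coding, or whether to use motion compensation, is not covered by the slides and is left to the caller.

```python
def coded_block_pattern(coded_flags):
    """Compute CBP from six flags in the order Y0, Y1, Y2, Y3, Cb, Cr."""
    weights = (32, 16, 8, 4, 2, 1)
    return sum(w for w, coded in zip(weights, coded_flags) if coded)


def select_p_macroblock_type(intra, motion_compensated, cbp, quant_changed):
    """Pick a P-picture macroblock type label from the decisions already made."""
    if intra:
        return "Intra-q" if quant_changed else "Intra-d"
    if motion_compensated:
        if cbp == 0:
            return "Pred-m"
        return "Pred-mcq" if quant_changed else "Pred-mc"
    if cbp == 0:
        return "Skipped"
    return "Pred-cq" if quant_changed else "Pred-c"


cbp = coded_block_pattern([True, False, False, True, False, True])
print(cbp)                                                 # prints 37
print(select_p_macroblock_type(False, True, cbp, False))   # prints Pred-mc
print(select_p_macroblock_type(False, False, 0, False))    # prints Skipped
```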

  28. Coding Blocks in P-Pictures • Intra blocks are coded as in I-pictures • Inter blocks • The DCT is applied to the residual • Quantize the coefficients with the dead-zone quantizer • The final quantizer scale for each AC coefficient is the corresponding value in the quantization matrix multiplied by the quantizer scale of this macroblock

  29. Coding B-Pictures • Eleven macroblock types in B-pictures • -i: interpolation, -c: coded block pattern, -f: forward, -b: backward, -q: quantization

  30. Coding B-Pictures (cont’d) • Selection of macroblock type • Because B-pictures have the lowest bit rate, try the skipped type first • Do forward and backward motion estimation, then interpolation → pick the best one • Decision tree, starting at Begin • Non-intra, coded: Pred-*cq if the quantizer changes, otherwise Pred-*c • Non-intra, not coded: Pred-* or skipped • Intra: Intra-q if the quantizer changes, otherwise Intra-d

  31. Decoding a Sequence for VCR Commands • Decoding for fast forward • Discard the B-pictures and decode only the I- and P-pictures • Discard the P- and B-pictures and decode only the I-pictures • Decoding for reverse play • Requires a large buffer to store the whole bitstream of a GOP, which is then decoded and displayed in reverse order • Pictures in display order: 0B 1B 2I 3B 4B 5P 6B 7B 8P 9B 10B 11P • Pictures in decoding order: 2I 0B 1B 5P 3B 4B 8P 6B 7B 11P 9B 10B • Pictures in new order: 2I 5P 8P 11P 10B 9B 7B 6B 4B 3B 1B 0B
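The "new order" for reverse play in slide 31 appears to decode all I- and P-pictures of the GOP first and then the B-pictures from last to first. The sketch below reproduces that ordering and the two fast-forward options under this reading; the function names are hypothetical.

```python
def fast_forward_pictures(display_types, keep="IP"):
    """Display indices of the pictures that are still decoded in fast forward."""
    return [i for i, t in enumerate(display_types) if t in keep]


def reverse_play_decode_order(display_types):
    """Decode the reference pictures first, then the B-pictures in reverse."""
    references = [i for i, t in enumerate(display_types) if t in "IP"]
    b_pictures = [i for i, t in enumerate(display_types) if t == "B"]
    return references + b_pictures[::-1]


gop = "BBIBBPBBPBBP"  # display order from the slide
print(reverse_play_decode_order(gop))
# prints [2, 5, 8, 11, 10, 9, 7, 6, 4, 3, 1, 0]
print(fast_forward_pictures(gop))             # prints [2, 5, 8, 11]
print(fast_forward_pictures(gop, keep="I"))   # prints [2]
```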

  32. Pre- and Post-Processing • Pre-processing • Apply a median filter to remove noise • Apply a low-pass filter to smooth image edges, removing high frequencies to prevent the ringing effect • Post-processing • Blocking artifacts are more visible in the low-frequency blocks • Low-pass filter at block boundaries • Wide low-pass filter at adjacent smooth blocks

  33. Pre- and Post-Processing (cont’d) • Ringing artifacts appear along sharp edges, in other words, in the high-frequency blocks • Detect the edges in a ringing block with the Sobel masks; mark a pixel as an edge if the response is over a threshold • Apply a simple low-pass filter to the non-edge area
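The deringing steps in slide 33 can be sketched in plain Python as follows. Several details are assumptions, since the slides do not give them: the edge response is taken as |gx| + |gy|, the low-pass filter is a 3 × 3 mean, border pixels are left untouched, and the function names are illustrative.

```python
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_edge_mask(block, threshold):
    """Mark interior pixels whose Sobel response exceeds the threshold."""
    h, w = len(block), len(block[0])
    mask = [[False] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(SOBEL_X[j][i] * block[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(SOBEL_Y[j][i] * block[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            mask[y][x] = abs(gx) + abs(gy) > threshold
    return mask


def smooth_non_edges(block, mask):
    """Apply a 3 x 3 mean filter to interior pixels not marked as edges."""
    h, w = len(block), len(block[0])
    out = [row[:] for row in block]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            if not mask[y][x]:
                total = sum(block[y + j][x + i]
                            for j in (-1, 0, 1) for i in (-1, 0, 1))
                out[y][x] = total // 9
    return out


# A vertical step edge: the edge pixels are detected and left unfiltered.
block = [[0, 0, 0, 100, 100, 100] for _ in range(6)]
mask = sobel_edge_mask(block, threshold=100)
filtered = smooth_non_edges(block, mask)
print(mask[2])       # prints [False, False, True, True, False, False]
print(filtered[2])   # prints [0, 0, 0, 100, 100, 100]
```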

  34. MPEG-2 Compared to MPEG-1 • Frame/field adaptive motion compensation and DCT • Dual-prime motion compensation (for P-pictures when there are no B-pictures) • Nonlinear quantization table with increased accuracy for small values • Alternate scan for DCT coefficients • New VLC tables for DCT coefficient coding • In addition to 4:2:0, also supports 4:2:2 and 4:4:4 • Supports a maximum motion vector range of -2048 to +2047.5 (always half-pixel motion vectors)

  35. Frame/field DCT • Frame DCT • Field DCT

  36. Nonlinear Quantization Table

  37. Additional Chrominance Formats • 4:2:0 • 4:2:2 • 4:4:4 • (Figure: Y, Cb, Cr sample layouts)

  38. Alternate Scan for DCT Coefficients

  39. Major Components of an MPEG-4 Terminal

  40. MPEG-4 Components • Face • 66 facial animation parameters • Primary facial expressions • 14 visemes • VO (Video Object) • Shape • Motion vectors • Texture • Texture • From VOP • Still texture (Discrete Wavelet Transform) • AO (Audio Object) • MPEG Layer 1-3 • AAC (Advanced Audio Coding) • TTS (Text-To-Speech) • 2D Mesh • Triangular patches • Motion vectors

  41. Content-based Audio-Visual Representation • Audio-Visual Object (AVO) • Video object component (video object plane, VOP) • natural or synthetic • 2D or 3D • Audio object component • mono, stereo or multichannel

  42. Video Object Planes (VOP) • Characteristics of VOPs • may have different spatial and temporal resolutions • may be associated with different degrees of accessibility → sub-VOPs • may be separated or overlapping • VOP types • Traditional I, P, B types • S-VOP (Sprite) for background

  43. Video Object Plane Type • (Figure: S-VOPs, I-VOP, P-VOPs, and B-VOPs arranged along a time axis)

  44. Content-based Object Manipulation • Object manipulation • change of the spatial position of a VOP • application of a spatial scaling factor to a VOP • change of the speed with which a VOP moves • insertion of new VOPs • deletion of an object in the scene • change of the scene area

  45. Example of Bit stream Manipulation

  46. Segmentation Process • Depending on the application, segmentation can be performed • Online (real-time) or offline (non-real-time) • Automatically or semi-automatically • Examples • Video conferencing • real-time, automatic • separate the foreground (communication partner) from the background • Object tracking in video • may allow offline and semi-automatic segmentation • separate the moving object from the others

  47. Compression • Improved coding efficiency • 5-64 kbps for mobile applications • up to 20 Mbps for TV/film applications • subjectively better quality compared to existing standards • Coding of multiple concurrent data streams • can code multiple views of a scene efficiently, e.g. stereo video

  48. Coding VO in MPEG-4 • Reduce temporal redundancy • Motion estimation for arbitrarily shaped VOPs • padding and modified block (polygon) matching motion estimation • (Figure: I-VOP, B-VOP, and P-VOP along a time axis)

  49. Coding Procedure of VOP • BAB (Binary Alpha Block) • Motion Vector • CAE (Context-Based Arithmetic Encoding) • Rate Control by Sub-sampling • Texture • Motion Vector • DCT • Rate Control by Quantization Step

  50. New Coding Features • For each macroblock, the motion vectors can be computed on a 16 × 16 or 8 × 8 block basis • Unrestricted motion estimation: prediction can extend over the image boundary • Overlapped block motion compensation • Each component of texture can range from 1 to 12 bits • More robust coding
