
Existing Video Coding Standards

elkan

Presentation Transcript


  1. Existing Video Coding Standards
     • ITU: H.120 (1984); H.261 (1990), p×64 Kbps; H.263, 8-512 Kbps; H.263+ (1998); H.264/AVC coding standard
     • ISO: MPEG-1 (1992), 1.5 Mbps, VCD; MPEG-2/H.262 (1996), 2-10 Mbps, DVD; MPEG-4 (2000), 8-1024 Kbps
     • Windows Media Player or RealPlayer
     EE569 Digital Video Processing

  2. H.261 Coding Standard
     • Background
       - Facilitate video conferencing and videophone service over ISDN
       - p×64 kbps (p=1: videophone; p>5: videoconference; p=30: VHS quality)
       - Basis of MPEG-1 and MPEG-2
     • Features
       - Maximum coding delay of 150 ms
       - Amenable to low-cost VLSI implementation

  3. Input Image Formats

  4. Video Multiplex
     • Defines a data structure so that a decoder can interpret the received bit stream without ambiguity
     • Hierarchical data structure
       - Picture layer
       - Group of blocks (GOB) layer
       - Macroblock (MB) layer
       - Block layer
     • Each layer has a distinct header

  5. Picture and GOB Layers
     • The picture layer consists of a picture header followed by the data for the GOBs
     • The picture header contains data such as the picture format (CIF or QCIF)
     • A GOB is always composed of 33 macroblocks
     • The GOB header contains an MB address and the compression mode, followed by the data for the blocks
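
The fixed GOB size above implies how many GOBs each picture format carries. The following is a minimal sketch of that arithmetic, assuming 16×16-pel macroblocks and the usual CIF (352×288) and QCIF (176×144) luminance dimensions; the function name `gob_count` is illustrative and not taken from the slides.

```python
MB_SIZE = 16        # assumed macroblock size in luminance pels
MBS_PER_GOB = 33    # from the slide: a GOB always holds 33 macroblocks

def gob_count(width, height):
    """Return (macroblocks per picture, GOBs per picture)."""
    mbs = (width // MB_SIZE) * (height // MB_SIZE)
    return mbs, mbs // MBS_PER_GOB

for name, (w, h) in {"CIF": (352, 288), "QCIF": (176, 144)}.items():
    mbs, gobs = gob_count(w, h)
    print(f"{name}: {mbs} macroblocks, {gobs} GOBs")
```

Under these assumptions a CIF picture holds 396 macroblocks in 12 GOBs and a QCIF picture holds 99 macroblocks in 3 GOBs.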

  6. Macroblock and Block Layers
     • Macroblock: the smallest unit for which the compression mode is selected
     • An MB always consists of 6 blocks (Y1-Y4, Cr, Cb)
     [Figure: macroblock layout showing blocks Y1, Y2, Y3, Y4, Cr, Cb]

  7. Compression Modes
     • Intra mode
       - Similar to JPEG coding
       - Supports two compression modes
     • Inter mode
       - The motion estimation (ME) method is not specified; motion compensation (MC) is optional
       - Usually a 16×16 block-matching algorithm (BMA), integer-pel accuracy, search range [-15, 15]
       - Supports various compression modes

  8. Selecting a Compression Mode
     • Should an MV be transmitted?
     • Should we use intra or inter compression mode?
     • Should the quantizer step size be changed?
     The optimal compression mode can be chosen based on the variance of the original MB, the MB difference (bd), the displaced MB difference (dbd), and the best MV estimate

  9. Selection Method
     • If the variance of dbd is smaller than that of bd, select Inter mode with MC
       - The MVD must be transmitted
       - Transmission of the DCT coefficients is optional
     • Otherwise, no MV is transmitted
       - If the original MB has a smaller variance, select Intra mode; otherwise select Inter mode (with a zero MV)
     • For MC blocks, the prediction errors can be modified by a 2D spatial filter (the prototype of the deblocking filter)
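
The decision rule on this slide can be sketched as a small function. This is only an illustration under stated assumptions: blocks are passed as flat lists of samples, "smaller variance" for the original MB is read as "smaller than the variance of bd", and the names `variance` and `select_mode` are hypothetical.

```python
def variance(samples):
    """Population variance of a flat list of samples."""
    n = len(samples)
    mean = sum(samples) / n
    return sum((x - mean) ** 2 for x in samples) / n

def select_mode(mb, bd, dbd):
    """Pick a mode from the original MB, the MB difference (bd),
    and the displaced MB difference (dbd)."""
    if variance(dbd) < variance(bd):
        return "inter+mc"   # motion compensation pays off: transmit the MVD
    # No MV is transmitted from here on.
    if variance(mb) < variance(bd):   # assumption: compared against bd
        return "intra"
    return "inter"          # inter mode with a zero MV

print(select_mode([10, 10, 10, 10], [1, -1, 1, -1], [2, -2, 2, -2]))
```

A real encoder would also decide whether to send DCT coefficients and whether to change the quantizer step size; those decisions are left out of this sketch.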

  10. H.261 Compression Modes

  11. Interpretation
      • MQUANT: when on, a new quantizer step size is transmitted
      • MVD: when on, the motion vector difference is transmitted
      • CBP: when on, at least one transform coefficient in the MB is transmitted
      • TCOEFF: when on, the transform coefficients are transmitted

  12. Variable Thresholding
      Motivation: to increase the number of zero coefficients
      Flowchart:
      • Initialize T = g, Tmax = g + g/2
      • For each coefficient, test Coeff < T
        - Yes: Q[Coeff] = 0; then if T < Tmax, set T = T + 1, otherwise set T = Tmax
        - No: Q[Coeff] = g + g/2; reset T = g
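
The thresholding loop can be sketched as follows. This is a hedged illustration, not the reference procedure: it compares coefficient magnitudes (the slide writes only Coeff < T), uses integer division for g/2, and passes surviving coefficients through unchanged instead of quantizing them. The name `variable_threshold` is hypothetical.

```python
def variable_threshold(coeffs, g):
    """Zero out coefficients below a threshold that grows along a run of zeros."""
    t_max = g + g // 2
    t = g
    out = []
    for c in coeffs:
        if abs(c) < t:             # assumption: compare magnitudes
            out.append(0)
            t = t + 1 if t < t_max else t_max
        else:
            out.append(c)          # left unquantized in this sketch
            t = g                  # reset after a surviving coefficient
    return out

print(variable_threshold([50, 0, 0, 0, 33, 34, 0, 40, 33, 34, 10, 32], g=32))
```

With g = 32, the coefficients 33 and 34 that follow a run of zeros are removed because the threshold has already risen to 35 and 36, which lengthens the zero run as the slide's motivation states.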

  13. Example
      [Figure: thresholding example with the cases Coef<T, Coef>T, Coef>T]

  14. Run-Length Coding
      • Coefficients are read in zigzag scan order
      • Example (run, level) pairs: (0,3) (1,2) (7,1) EOB
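
A short sketch of zigzag scanning followed by (run, level) coding may make the example pairs concrete. The 8×8 block below is made up so that it reproduces the pairs on the slide; the function names are illustrative, and the separate handling of the intra DC coefficient in real coders is ignored here.

```python
def zigzag_order(n=8):
    """Return (row, col) positions of an n-by-n block in zigzag scan order."""
    def key(pos):
        r, c = pos
        s = r + c
        # Odd anti-diagonals are walked downward, even ones upward.
        return (s, r if s % 2 else -r)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)

def run_length_encode(block):
    """Emit (run of zeros, nonzero level) pairs, then an end-of-block marker."""
    pairs, run = [], 0
    for r, c in zigzag_order(len(block)):
        level = block[r][c]
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    pairs.append("EOB")   # trailing zeros are implied by EOB
    return pairs

block = [[0] * 8 for _ in range(8)]
block[0][0] = 3   # zigzag index 0
block[1][0] = 2   # zigzag index 2
block[4][0] = 1   # zigzag index 10
print(run_length_encode(block))
```

For this block the output is `[(0, 3), (1, 2), (7, 1), 'EOB']`, matching the pairs listed on the slide.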

  15. H.261 Rate/Buffer Control
      • The coded video data rate is controlled by
        - Pre-processing
        - Quantization step size
        - Block significance criterion (CBP flag)
        - Temporal sampling ratio
      • The buffer fullness is controlled by
        - Quantization step size
        - Maximum allowable coding delay (150 ms)

  16. MPEG-1 Standard
      • Features
        - Syntax based: no specific algorithm is standardized; the parameters defining the encoded bit stream and decoder are contained in the bit stream itself
        - Random access: allows independent access points (I-frames) to the bit stream
        - Fast forward and reverse search
        - Reasonable coding/decoding delay

  17. Input Video Format
      • Progressive video (interlaced video is handled by MPEG-2)
      • Input video is first converted into the MPEG standard input format (SIF)
      • SIF format: Y 352×240, Cr/Cb 176×120, 30 frames/sec

  18. MPEG-1 Constrained Parameter Set
      - Maximum number of pixels/line: 720
      - Maximum number of lines/picture: 576
      - Maximum number of pictures/sec: 30
      - Maximum number of macroblocks/picture: 396
      - Maximum number of macroblocks/sec: 9900
      - Maximum bit rate: 1.86 Mbps
      - Maximum decoder buffer size: 376,832 bits
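
The limits above apply jointly, so a format can satisfy the per-line and per-picture bounds and still exceed the macroblock bounds. The check below is a rough sketch using the figures from this slide and assuming 16×16-pel macroblocks; the dictionary keys and the function `within_cps` are illustrative names, and the buffer-size limit is not checked.

```python
from math import ceil

CPS = {
    "max_width": 720,
    "max_height": 576,
    "max_fps": 30,
    "max_mb_per_picture": 396,
    "max_mb_per_sec": 9900,
    "max_bitrate_bps": 1_860_000,
}

def within_cps(width, height, fps, bitrate_bps):
    """Return True if the format fits every limit listed in CPS."""
    mbs = ceil(width / 16) * ceil(height / 16)   # assumed 16x16 macroblocks
    return (
        width <= CPS["max_width"]
        and height <= CPS["max_height"]
        and fps <= CPS["max_fps"]
        and mbs <= CPS["max_mb_per_picture"]
        and mbs * fps <= CPS["max_mb_per_sec"]
        and bitrate_bps <= CPS["max_bitrate_bps"]
    )

print(within_cps(352, 240, 30, 1_500_000))   # SIF at 30 frames/sec
print(within_cps(720, 576, 25, 1_500_000))   # too many macroblocks per picture
```

SIF at 30 frames/sec uses 330 macroblocks per picture and exactly 9900 per second, so it fits; a 720×576 picture needs 1620 macroblocks and does not.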

  19. Perspective Video Formats

  20. Hierarchical Data Structure (I)
      • Sequences are formed by Groups Of Pictures (GOPs)
      • GOPs are made up of pictures
      • Pictures consist of slices
      • Slices are made up of macroblocks
      • Macroblocks (MBs) consist of blocks
      • Blocks are 8×8 pixel arrays

  21. Hierarchical Data Structure (II)
      [Figure: nesting of GOP, frame, slice, MB, and block]

  22. Four Compression Modes
      • I frame: intra-frame, JPEG-like coding
      • P frame: forward prediction from previous frames
      • B frame: forward, backward, or bi-directional prediction
      • D frame: contains only the DC component of each block
      Example GOP:
        Type:  I B B B P B B B P
        Index: 0 1 2 3 4 5 6 7 8

  23. GOP Reordering
      GOP in display order:
        Type:  I B B B P B B B P
        Index: 0 1 2 3 4 5 6 7 8
      Processing order: 0, 4, 1, 2, 3, 8, 5, 6, 7
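
The processing order on this slide follows from one rule: each I or P picture must be coded before the B pictures that precede it in display order, because those B pictures are predicted from it. A minimal sketch, with the hypothetical function name `coding_order` and picture types given as a string:

```python
def coding_order(types):
    """Map display-order picture types (e.g. 'IBBBP') to coding-order indices."""
    order, pending_b = [], []
    for index, kind in enumerate(types):
        if kind == "B":
            pending_b.append(index)     # wait for the next anchor picture
        else:
            order.append(index)         # I or P picture goes first
            order.extend(pending_b)     # then the B pictures that depend on it
            pending_b = []
    order.extend(pending_b)             # trailing B pictures, if any
    return order

print(coding_order("IBBBPBBBP"))
```

For the GOP on the slide this prints `[0, 4, 1, 2, 3, 8, 5, 6, 7]`.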

  24. MB Types in MPEG-I

  25. Intra-frame Compression Mode
      • JPEG-like coder: 8×8 DCT, quantization, run-length coding
      • MB types
        - Intra: default quantization matrix Q0
        - Intra-A: spatially adaptive quantization Q, controlled by the MQUANT parameter

  26. Inter-frame Compression Mode (P)
      • MB types
        - Intra
        - Intra-A
        - Inter-D
        - Inter-DA: a new MQUANT value and the DCT of the prediction error are coded
        - Inter-F
        - Inter-FD: the MV and the DCT of the prediction error are transmitted
        - Inter-FDA: the MV, the DCT of the prediction error, and a new MQUANT are transmitted
        - Skipped: copied directly from the block at the same position in the previous frame

  27. Interframe Compression Mode (B)
      • Advantages
        - Allows efficient handling of problems associated with covered/uncovered background
        - MC averaging over two frames suppresses noise better than prediction from just one frame
        - Since B-frames are not used in predicting future frames, they can be coded with fewer bits without causing error propagation
      • Disadvantages
        - Two frame buffers are needed
        - Longer coding delay

  28. Theoretical Framework behind B-frame Coding
      • Why does it improve coding efficiency?
        - Multi-hypothesis motion compensation (MHMC)
        - A B-frame is one of the simplest forms of MHMC (two hypotheses: forward and backward)
      • Why does it facilitate scalable coding?
        - Temporal scalability
        - B-frames can be skipped without affecting the decoding of other frames

  29. MPEG-1 Encoder and Decoder
      • Encoder modules: motion estimation, selection of compression mode (MTYPE) per MB, setting of the MQUANT value, MCP, quantizer and dequantizer, DCT and IDCT, VLC, multiplexer, buffer and buffer regulator
        - The relative number of I, P, and B pictures in a GOP is application dependent. The use of B-pictures is optional. There is at least one I picture every 132 pictures.
        - Half-pixel accuracy in motion estimation
        - Motion vectors that refer to pixels outside the picture are not allowed
      • Decoder modules: demultiplexer, VLC decoder, MCP, dequantizer and IDCT

  30. Software Implementations
      • Bellcore version
        - ivy.ee.princeton.edu (not publicly accessible)
      • Berkeley version
        - toe.cs.berkeley.edu (128.32.149.117)
        - /pub/multimedia/mpeg/mpeg-2.0.tar.Z
      • Stanford version
        - ftp://havefun.stanford.edu/ (36.2.0.35)
        - /pub/mpeg/MPEGv1.2.tar.Z

  31. MPEG-I vs. H.261

  32. MPEG-2 Standard
      • Features
        - Allows for interlaced input, higher-definition inputs, and alternative subsampling of the chrominance channels
        - Offers a scalable bit stream
        - Provides improved quantization and coding options
      • Profiles
        - Simple profile, main profile, SNR scalable profile, spatially scalable profile, and high profile

  33. Chrominance Subsampling
      • 4:2:0 (same as MPEG-1)
      • 4:2:2 (chroma subsampled in the horizontal direction only)
      • 4:4:4 (no chroma subsampling)
      [Figure: luminance and chrominance sample grids for each format]
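
The three formats differ only in how much each chroma plane is decimated relative to luminance. A small sketch, assuming 4:2:0 halves both dimensions, and using the illustrative names `SUBSAMPLING` and `chroma_size`:

```python
# (horizontal divisor, vertical divisor) applied to the luminance dimensions
SUBSAMPLING = {
    "4:2:0": (2, 2),   # halved in both directions
    "4:2:2": (2, 1),   # halved horizontally only
    "4:4:4": (1, 1),   # no chroma subsampling
}

def chroma_size(luma_width, luma_height, fmt):
    """Return the (width, height) of each chroma plane (Cb or Cr)."""
    dx, dy = SUBSAMPLING[fmt]
    return luma_width // dx, luma_height // dy

for fmt in SUBSAMPLING:
    print(fmt, chroma_size(352, 240, fmt))
```

For a 352×240 luminance plane, 4:2:0 gives 176×120 chroma planes, which agrees with the SIF dimensions quoted earlier in the deck.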

  34. Interlaced Video Coding
      • Frame pictures: lines of the even and odd fields are interleaved to form composite frames
      • Field pictures: even and odd fields are treated as separate pictures
      [Figure: 8×8 blocks formed from odd-field and even-field lines]
      Q: For video containing significant motion, which format is preferred?
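
The relationship between a frame picture and its two field pictures is just line interleaving. The sketch below represents a picture as a list of lines and is an illustration only; which field holds the first line (and whether it is called odd, even, or top) is a convention, and the choice here is an assumption.

```python
def split_fields(frame):
    """Split a frame (list of lines) into two fields of alternating lines."""
    first_field = frame[0::2]    # lines 0, 2, 4, ... (assumed first/top field)
    second_field = frame[1::2]   # lines 1, 3, 5, ...
    return first_field, second_field

def interleave_fields(first_field, second_field):
    """Rebuild the composite frame by interleaving the two fields line by line."""
    frame = []
    for a, b in zip(first_field, second_field):
        frame.extend([a, b])
    return frame

frame = [f"line{i}" for i in range(8)]
top, bottom = split_fields(frame)
print(top)
print(bottom)
print(interleave_fields(top, bottom) == frame)
```

Because the two fields are captured at different instants, adjacent lines of the composite frame can differ noticeably when objects move between fields.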

  35. Frame and Field Pictures
      • A GOP can be composed of a mixture of frame and field pictures
      • Field pictures always appear in pairs (top field and bottom field)
      • If the top field is a P- or B-picture, then the bottom field must also be a P- or B-picture
      • If the top field is an I-picture, then the bottom field can be an I- or P-picture
      • A pair of field pictures is encoded in the order in which the fields should appear at the output

  36. Frame and Field DCT
      [Figure: field DCT versus frame DCT]

  37. Frame and Field Prediction
      • MC prediction modes
        - Simple field prediction
        - Simple frame prediction
      • Within a field picture, only simple field prediction is used
      • Within a frame picture, either simple field prediction or simple frame prediction can be employed on an MB-by-MB basis

  38. Frame and Field Prediction (cont'd)
      • In the presence of motion, frame prediction suffers from strong motion artifacts; in the absence of motion, field prediction does not utilize all the available information
      • 16×8 MC mode: used only in field pictures; two MVs are used, for the top and bottom fields respectively
      • Dual-prime mode: used only for P-pictures; one MV and a small differential MV are encoded

  39. Spatial, Temporal and SNR Scalability in MPEG-2
      • Spatial (resolution) scalability
        - The base layer is a low-spatial-resolution version of the video
        - Enhancement layers successively enhance the spatial resolution
      • SNR (rate, quality) scalability
        - The base layer uses a coarse quantizer for the DCT coefficients
        - The enhancement layer uses a fine quantizer for the DCT coefficients
      • Temporal scalability
        - Allows decoding at different frame rates
      Note: the scalability provided by MPEG-2 is ad hoc in the sense that it significantly sacrifices coding efficiency

  40. Other Improvements (I)
      Optional alternate scan (said to fit interlaced video better)

  41. Other Improvements (II)
      Finer quantization of the DCT coefficients

  42. Other Improvements (III)
      Finer adjustment of MQUANT

      MQUANT in MPEG-1:   1.0   2.0   3.0   4.0   5.0   6.0   7.0   8.0
      MQUANT in MPEG-2:   0.5   1.0   1.5   2.0   2.5   3.0   3.5   4.0

      MQUANT in MPEG-1:   9.0  10.0  11.0  12.0  13.0  14.0  15.0  16.0
      MQUANT in MPEG-2:   5.0   6.0   7.0   8.0   9.0  10.0  11.0  12.0

      MQUANT in MPEG-1:  17.0  18.0  19.0  20.0  21.0  22.0  23.0  24.0
      MQUANT in MPEG-2:  14.0  16.0  18.0  20.0  22.0  24.0  26.0  28.0

      MQUANT in MPEG-1:  25.0  26.0  27.0  28.0  29.0  30.0  31.0
      MQUANT in MPEG-2:  32.0  36.0  40.0  44.0  48.0  52.0  56.0
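
The MPEG-2 column on this slide follows a simple pattern: the step between consecutive values doubles every eight entries (0.5, 1, 2, then 4). The sketch below regenerates the column under that reading; treating the MPEG-1 value as an index from 1 to 31 is an assumption, and `mpeg2_mquant_table` is an illustrative name.

```python
def mpeg2_mquant_table():
    """Map index 1..31 (the MPEG-1 MQUANT value) to the MPEG-2 value on the slide."""
    table, value = {}, 0.0
    for code in range(1, 32):
        if code <= 8:
            step = 0.5
        elif code <= 16:
            step = 1.0
        elif code <= 24:
            step = 2.0
        else:
            step = 4.0
        value += step
        table[code] = value
    return table

table = mpeg2_mquant_table()
print([table[c] for c in (1, 8, 9, 16, 17, 24, 25, 31)])
```

The small steps at the low end give finer control at high quality, while the large steps at the high end extend the range to 56.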

  43. Implementation Issues (I)
      Four levels defined by MPEG-2

  44. Implementation Issues (II)
      Five profiles defined by MPEG-2

  45. Hardware Implementations
      • C-Cube
        - CL450: single-chip, MPEG-1, SIF rates
        - CL950: MPEG-2
        - CL4000: single-chip, MPEG-1/JPEG/H.261
      • SGS-Thomson
        - STi3400: single-chip, MPEG-1, SIF rates
        - STi3500: the first MPEG-2 chip on the market
      • Motorola
        - MCD2500: single-chip, MPEG-1, SIF rates

  46. H.26x Standards
      • H.261 (1983-1990)
        - Video conferencing, video email, and video telephony over the Public Switched Telephone Network (PSTN) and wireless networks
      • H.263/H.263+/H.263++ (1993-1999)
        - Based on H.261 but offers a significant improvement in coding efficiency
        - Adopted by several videophone terminal standards: H.324 (PSTN), H.320 (ISDN), H.310 (B-ISDN)
      • H.264/AVC (1999-2003)

  47. H.263 Input Image Formats
      • Picture formats
        - sub-QCIF: 128×96
        - QCIF: 176×144
        - CIF: 352×288
        - 4CIF: 704×576
        - 16CIF: 1408×1152
      • Color format: 4:2:0 YUV
      • Temporal rate: 30, 15, 10, 7.5 Hz

  48. H.263 Picture Structure
      An example at QCIF resolution:
      • Picture frame: 176 pels × 144 lines, divided into 9 GOBs (GOB1-GOB9)
      • Group of blocks (GOB): 11 macroblocks (MB1-MB11)
      • Macroblock: 4 luminance blocks (Y1-Y4) plus Cb and Cr
      • Block: 8 pels × 8 lines
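
To make the QCIF hierarchy concrete, the sketch below maps a luminance pixel position to its GOB, macroblock, and luminance block. It assumes that each GOB is one row of 16×16 macroblocks (which matches 9 GOBs of 11 MBs at 176×144) and that Y1 to Y4 are laid out top-left, top-right, bottom-left, bottom-right. The function name `locate` is hypothetical.

```python
MB_SIZE = 16      # macroblock covers 16x16 luminance pels
BLOCK_SIZE = 8    # each block is 8 pels by 8 lines

def locate(x, y, width=176, height=144):
    """Return (GOB number, MB number within the GOB, luminance block name)."""
    if not (0 <= x < width and 0 <= y < height):
        raise ValueError("pixel outside the QCIF picture")
    gob = y // MB_SIZE + 1    # one GOB per macroblock row (assumption)
    mb = x // MB_SIZE + 1     # MB1..MB11 from left to right
    right = (x % MB_SIZE) >= BLOCK_SIZE
    lower = (y % MB_SIZE) >= BLOCK_SIZE
    block = "Y" + str(1 + int(right) + 2 * int(lower))
    return gob, mb, block

print(locate(0, 0))
print(locate(100, 50))
print(locate(175, 143))
```

Under these assumptions the pixel at (100, 50) lies in GOB4, MB7, block Y1, and the last pixel of the picture lies in GOB9, MB11, block Y4.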

  49. H.263 Baseline Coding Algorithm
      • Video frame structure
        - Supports sub-QCIF, QCIF, CIF, 4CIF, and 16CIF
      • Video coding tools
        - Motion estimation and compensation: range [-16, 15.5], half-pel accuracy
        - Transform: 8×8 DCT
        - Quantization: Q factor
        - Entropy coding: 3D VLC (LAST, RUN, LEVEL)
      • Coding control
        - Intra/Inter switch

  50. Advanced Coding Modes in H.263
      • Annex D: Unrestricted motion vector mode
        - Range: [-31.5, 31.5]
        - Allows MVs to point outside the picture boundaries
      • Annex E: Syntax-based arithmetic coding mode
        - About 5% savings over VLC
      • Annex F: Advanced prediction mode
        - Overlapped Block Motion Compensation (OBMC)
      • Annex G: PB-frame mode
        [Figure: frame sequence I B P B P …]
