
Video Coding Standards

Video Coding Standards. Heejune AHN, Embedded Communications Laboratory, Seoul National Univ. of Technology, Fall 2011. Last updated 2011. 5. 13. Agenda: History and Concepts; JPEG and JPEG-2000; MPEG-1 and MPEG-2; MPEG-4; H.261 and H.263; H.264; Beyond H.264.


Presentation Transcript


  1. Video Coding Standards Heejune AHN Embedded Communications Laboratory Seoul National Univ. of Technology Fall 2011 Last updated 2011. 5. 13

  2. Agenda • History and Concepts • JPEG and JPEG-2000 • MPEG-1 and MPEG-2 • MPEG-4 • H.261 and H.263 • H.264 • Beyond H.264

  3. 1. Standards and Standards Bodies • VCEG (Video Coding Experts Group) in ITU-T (formerly CCITT) • Focus on real-time, two-way video communication • MPEG (Moving Picture Experts Group) and JPEG (Joint Photographic Experts Group) in ISO/IEC • Focus on multimedia storage and distribution for entertainment • Some standards are shared between the two bodies • ITU-T VCEG: H.261, H.262 (= MPEG-2), H.263, H.264, H.264 High Profile, H.264 SVC, H.264 MVC, HEVC (H.265) • ISO MPEG/JPEG: JPEG, MPEG-1, MPEG-2, MPEG-4, JPEG-2000, MPEG-4/AVC (= H.264), MPEG-7, MPEG-21

  4. History of Video Coding Standards [Figure: timeline of video coding standards up to 2011; labels include HP, SVC, MVC, and HEVC]

  5. ISO-MPEG/JPEG • JPEG (1992) : compression of still images (DCT) • MPEG-1 (1993) : real-time playback of VHS-quality video on Video CD (1.4 Mbps) • MPEG-2 (1995) : broadcast-quality video service (3~5 Mbps) • MPEG-4 (1998) : wide bandwidth range (from 20 kbps upward) and object-oriented coding • JPEG-2000 (2000) : better-quality still images • ITU-VCEG • H.261 (1990) : video telephony over ISDN (p x 64 kbps) • H.263 (1995) : video telephony over circuit and packet networks, from 20 kbps to high bandwidth • H.264 (2003) : multipurpose, better-quality video coding • Others • MPEG-7 (Multimedia Content Description Interface) for search and retrieval in multimedia databases • MPEG-21 (Multimedia Framework) for interoperable multimedia delivery

  6. Standards process and usage • Standards process • Scope and aim of the standard → proposals from companies and universities → performance and complexity evaluation → test model (documents and reference software) → improvement proposals → draft standard → international standard • Understanding standards • Only the bitstream syntax and the decoder are defined in a standard. • The encoder, the application, and the implementation are left open to users. • Standards provide "profiles and levels" and recommended usage to help users choose among the many technical options.

  7. 2. JPEG • ISO/IEC IS 10918 • By ISO/IEC JTC1/SC29/WG1 (1984~1992) • Widely used on the WWW and in digital photography • Motion-JPEG is just a successive stream of JPEG images

  8. Baseline JPEG Codec • RGB or YCbCr components are coded either separately or in interleaved order • Common path: input image → 8x8 blocks → level offset ([0,255] => [-128,127]) → 8x8 DCT → uniform scalar quantization (quantization tables) • DC quantization indices → differential coding → VLC (DC Huffman tables, SSSS value) → bits • AC quantization indices → zig-zag scan → run-level coding → VLC (AC Huffman tables, RRRRSSSS value) → bits
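
The baseline pipeline on this slide can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions: a single quantizer step size `step` stands in for the per-frequency quantization tables, the Huffman (VLC) stage and the DC differential coding across blocks are left out, and all function names are hypothetical.

```python
import math

N = 8  # JPEG block size

def dct_2d(block):
    """Orthonormal 8x8 DCT-II of a block given as a list of 8 rows."""
    def c(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for y in range(N):
                for x in range(N):
                    s += (block[y][x]
                          * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out

def zigzag_order():
    """(row, col) pairs of an 8x8 block in zig-zag scan order."""
    return sorted(((r, c) for r in range(N) for c in range(N)),
                  key=lambda p: (p[0] + p[1],
                                 p[0] if (p[0] + p[1]) % 2 else p[1]))

def encode_block(pixels, step=16):
    """Level offset, DCT, uniform quantization, zig-zag scan, run-level coding.

    Returns the DC index and a list of (zero_run, level) pairs for the AC
    coefficients; trailing zeros are dropped (an end-of-block is implied).
    """
    shifted = [[p - 128 for p in row] for row in pixels]   # [0,255] -> [-128,127]
    coeffs = dct_2d(shifted)
    q = [[round(v / step) for v in row] for row in coeffs]  # assumed single step size
    scan = [q[r][c] for r, c in zigzag_order()]
    dc, ac = scan[0], scan[1:]
    pairs, run = [], 0
    for level in ac:
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    return dc, pairs
```

For a flat block the AC list is empty and only the DC index survives, which is why smooth image regions compress so well.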

  9. Lossless JPEG • DPCM is used, with prediction from three neighboring pixels • Optional modes • Progressive encoding • Stores image data in the order DC only, low-frequency AC, high-frequency AC • Hierarchical encoding • Stores image data from low resolution to high resolution • Motion-JPEG • Just a sequence of JPEG still images • Low complexity, error tolerance, market awareness • Used for video conferencing and surveillance before cheap MPEG-1/2/4 solutions became widely available on the market
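
A minimal sketch of DPCM with a three-neighbor predictor follows. It assumes the predictor a + b - c (left, above, above-left), which is one of the predictors lossless JPEG offers, and it treats missing border neighbors as 0, which is a simplification of the real border rules. The function names are hypothetical.

```python
def predict(rec, y, x):
    """Predict pixel (y, x) from left (a), above (b) and above-left (c)."""
    a = rec[y][x - 1] if x > 0 else 0              # assumed border value 0
    b = rec[y - 1][x] if y > 0 else 0
    c = rec[y - 1][x - 1] if x > 0 and y > 0 else 0
    return a + b - c

def dpcm_encode(img):
    """Return the prediction residuals that would be entropy coded."""
    return [[img[y][x] - predict(img, y, x) for x in range(len(img[0]))]
            for y in range(len(img))]

def dpcm_decode(res):
    """Rebuild the image exactly by repeating the same prediction."""
    rec = [[0] * len(res[0]) for _ in res]
    for y in range(len(res)):
        for x in range(len(res[0])):
            rec[y][x] = res[y][x] + predict(rec, y, x)
    return rec
```

Because the decoder forms the same prediction from already reconstructed pixels, the round trip is exact, and residuals in smooth areas cluster around zero.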

  10. JPEG-2000 • Features • Better compression performance than JPEG • No blocking artifacts at high compression ratios • Good compression for both continuous-tone and bi-level (text) images • Both lossless and lossy compression in one framework • ROI (region of interest) support • Error resilience support (data partitioning) • Rather slow on current embedded systems due to its complexity • Encoding process: image → (tiling) → wavelet transform → quantizer → arithmetic encoder → bits
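
To make the wavelet-transform stage concrete, here is a sketch of one level of a 1-D integer lifting transform. It assumes the reversible 5/3 filter that JPEG-2000 uses for its lossless path (the slide does not say which filter is meant), an even-length input, and simple mirrored borders. The names `fwd_53` and `inv_53` are hypothetical.

```python
def fwd_53(x):
    """One level of the reversible 5/3 lifting transform on an even-length list.

    Returns (s, d): low-pass (approximation) and high-pass (detail) samples.
    """
    n = len(x) // 2
    def ext(i):                      # mirror the signal at the right border
        return x[i] if i < len(x) else x[len(x) - 2]
    # Predict step: detail = odd sample minus the average of its even neighbours
    d = [x[2 * i + 1] - (x[2 * i] + ext(2 * i + 2)) // 2 for i in range(n)]
    # Update step: smooth the even samples using neighbouring details
    s = [x[2 * i] + (d[max(i - 1, 0)] + d[i] + 2) // 4 for i in range(n)]
    return s, d

def inv_53(s, d):
    """Undo fwd_53 exactly by running the lifting steps in reverse."""
    n = len(s)
    even = [s[i] - (d[max(i - 1, 0)] + d[i] + 2) // 4 for i in range(n)]
    x = []
    for i in range(n):
        nxt = even[i + 1] if i + 1 < n else even[n - 1]
        x += [even[i], d[i] + (even[i] + nxt) // 2]
    return x
```

Since every lifting step is undone with the same integer arithmetic, the transform is lossless, which is how one framework can serve both lossless and lossy coding.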

  11. Comparison of JPEG and JPEG-2000 • Lenna, 256x256 RGB, baseline JPEG: 4572 bytes • Lenna, 256x256 RGB, JPEG-2000: 4572 bytes [Figure: the two decoded images at the same file size]

  12. MPEG-1/2 • MC-DCT Hybrid Coding [Figure: encoder block diagram with coder control, intra-frame DCT coder (DCT, Quant), embedded intra-frame decoder (DeQ), motion-compensated predictor with intra/inter switch, motion estimator, and entropy coder; outputs are control data, DCT coefficients, and motion data]

  13. MPEG-1 • MPEG-1 • Targeted VHS quality (352x240 at 30 fps or 352x288 at 25 fps, YCbCr 4:2:0) on VCD (600 MB) • 1.4 Mbps (1.2 Mbps video + 0.2 Mbps audio) VCD, 70 minutes • Three main parts: Part 1 Systems, Part 2 Video, Part 3 Audio • Technology • MC-DCT hybrid • Macroblock (16x16 pixels): motion estimation unit • Block (8x8 pixels): DCT and quantization unit • GOP structure • I, P, B pictures • Trade-off between random access and coding efficiency • Asymmetric complexity • Larger memory and more computation are required at the encoder
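
Motion estimation is the main reason for the asymmetric complexity noted above. The standard does not prescribe how the encoder searches, so the following is only one possible approach: an exhaustive block-matching search using the sum of absolute differences (SAD). The function names and the default search range are assumptions.

```python
def sad(cur, ref, cy, cx, ry, rx, size):
    """Sum of absolute differences between a current and a reference block."""
    return sum(abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
               for j in range(size) for i in range(size))

def full_search(cur, ref, cy, cx, size=16, search=7):
    """Find the integer-pel motion vector (dy, dx) that minimises SAD.

    (cy, cx) is the top-left corner of the macroblock in the current
    picture; candidates that fall outside the reference are skipped.
    Returns (best_sad, dy, dx).
    """
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ry, rx = cy + dy, cx + dx
            if 0 <= ry <= h - size and 0 <= rx <= w - size:
                cost = sad(cur, ref, cy, cx, ry, rx, size)
                if best is None or cost < best[0]:
                    best = (cost, dy, dx)
    return best
```

With a ±7 search range this evaluates up to 225 candidate positions per macroblock, while the decoder only has to fetch the one block the vector points to.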

  14. MPEG-1 Structure • Syntax Hierarchy • Sequence layer • GOP layer • Picture Layer • Slice Layer • MB Layer • Block Layer

  15. Picture Coding • I picture: no interframe prediction • P picture: interframe prediction from one causal (previous) reference picture • B picture: interframe prediction from one previous and one future picture • GOP and picture order • Display order (input at encoder) • Transmission order (encoding/decoding order) [Figure: the same GOP shown in display order and in transmission order; picture labels I1, I2, P1, P2, B1, B2, B4, B5, B6, B7]
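
The difference between display order and transmission order can be sketched as follows: each B picture must wait until the future reference it depends on has been sent. The picture labels such as "I0" and "P3" are hypothetical, and B pictures left over at the end of the sequence are simply appended, which is an assumption.

```python
def transmission_order(display):
    """Reorder pictures from display order to transmission (coding) order.

    Each picture is a label whose first letter is its type: I, P or B.
    B pictures are held back until the next I or P picture (their future
    reference) has been emitted.
    """
    out, pending_b = [], []
    for pic in display:
        if pic[0] == "B":
            pending_b.append(pic)
        else:
            out.append(pic)          # send the reference picture first
            out.extend(pending_b)    # then the B pictures that needed it
            pending_b = []
    return out + pending_b           # assumed handling of trailing B pictures
```

For the display order I0 B1 B2 P3 B4 B5 P6 this gives I0 P3 B1 B2 P6 B4 B5, so the decoder needs a reorder buffer, which adds delay.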

  16. MPEG-2 • Major target application • Digital television quality (720x576/480, 25/30 fps) at 3 ~ 4 Mbps • Interlaced video support • Frame picture vs. field picture: motion compensation unit • Frame DCT vs. field DCT within a frame picture [Figure: two field pictures interleaved into one frame picture, with frame-DCT and field-DCT block arrangements]
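
The frame-DCT versus field-DCT choice can be illustrated by splitting a macroblock into its two fields and comparing how smooth each arrangement is vertically. The decision rule below (sum of absolute row-to-row differences) is only an assumed encoder heuristic, since encoder decisions are outside the standard, and the function names are hypothetical.

```python
def split_fields(mb):
    """Split a macroblock (list of rows) into top (even) and bottom (odd) fields."""
    return mb[0::2], mb[1::2]

def vertical_activity(rows):
    """Sum of absolute differences between vertically adjacent rows."""
    return sum(abs(a - b)
               for r0, r1 in zip(rows, rows[1:])
               for a, b in zip(r0, r1))

def choose_dct_mode(mb):
    """Pick 'field' DCT when the separated fields are smoother than the frame."""
    top, bottom = split_fields(mb)
    frame_cost = vertical_activity(mb)
    field_cost = vertical_activity(top) + vertical_activity(bottom)
    return "field" if field_cost < frame_cost else "frame"
```

When an object moves between the two field capture times, adjacent frame lines differ strongly, so grouping lines by field gives the DCT smoother blocks.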

  17. Scalability Support • Spatial scalability • Low resolution at the base layer (BL) and high resolution at the enhancement layer (EL) • The BL is used for prediction of the EL • E.g. SD resolution at BL, HD resolution at EL • Temporal scalability • 30 fps at BL, 60 fps at EL • SNR scalability • Same resolution but different quality • Data partitioning • Coded data is packed into different streams [Figure: the input video is downsampled and coded by the BL encoder into the BL bit stream, which a BL decoder turns into lower-quality video; the EL encoder produces the EL bit stream for higher-quality video]

  18. Profile & Level • MPEG-2 has many options; not every implementation needs all of them • Profiles • Simple : 4:2:0 input, I and P pictures only, low complexity and low performance • Main : 4:2:0 input, I, P, B pictures, interlaced • 4:2:2 : 4:2:2 input (chroma has the same vertical resolution as luma) • SNR : SNR scalable • Spatial : spatially scalable • High : spatial scalability and 4:2:2 • Levels • Low (352x288), Main (720x576), High 1440 (1440x1152), High (1920x1152) • E.g. • MPEG-1 : Main profile & Low level • SD DTV, DVD : Main profile & Main level • HDTV : Main profile & High level (historically the target application of MPEG-3)

  19. MPEG-4 • Features • Support for low bit rates (from 20 kbps) • Support for object-based coding • Reuse of components, composition, and interactivity support • In practice, object-based coding is not widely used • Object-based coding • Video object (VO) • Shape coding: transparent/opaque region, binary or grey scale • Texture coding with arbitrary shape • DCT after zero filling in inter blocks and extrapolation in intra blocks [Figure: a scene composed of video objects VO1, VO2, and VO3]

  20. Visual data structure

  21. H.261 • ITU mostly focuses on real-time communication • H.261 • First video coding standard (1990) • N-ISDN (1990s) • p x 64 kbps (p = 1, ..., 30), typically 64 ~ 384 kbps • Circuit-switched network based: low delay, reliable • H.261 key features • YCbCr 4:2:0 CIF and QCIF input • MC-DCT • Integer-pel motion • Optional loop filter (for deblocking) • Filtering at 8x8 block boundaries • FEC used

  22. H.261 syntax structure • H.261 Bit structure

  23. H.263 • Versions • Version 1 (1995) • Improvement to H.261 • 4 optional modes • Version 2 (1998, H.263+) • 12 optional modes • Version 3 (2000, H.263++) • 19 optional modes • Key features • Targets bit rates from 20 kbps and also packet-based networks • Half-pel prediction • Redesigned 3-D VLC code
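
Half-pel prediction can be sketched as bilinear interpolation between integer-position samples. The version below uses simple round-to-nearest integer arithmetic and takes positions in half-pel units; the function name and the coordinate convention are assumptions, and border handling is left to the caller.

```python
def half_pel_sample(ref, y2, x2):
    """Sample the reference picture at (y2 / 2, x2 / 2), given in half-pel units.

    Integer positions return the stored sample; half positions are the
    rounded average of the two or four surrounding integer samples.
    """
    y, x = y2 // 2, x2 // 2
    fy, fx = y2 % 2, x2 % 2
    a = ref[y][x]
    if not fy and not fx:
        return a                                   # integer position
    if not fy:
        return (a + ref[y][x + 1] + 1) // 2        # horizontal half position
    if not fx:
        return (a + ref[y + 1][x] + 1) // 2        # vertical half position
    return (a + ref[y][x + 1] + ref[y + 1][x] + ref[y + 1][x + 1] + 2) // 4
```

A motion vector of (0.5, 0.5) pixels is then expressed as (1, 1) in half-pel units, which lets the predictor follow motion that does not land on the integer pixel grid.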

  24. H.263 Optional Modes • Annex D: Unrestricted motion vectors • Annex E: Syntax-based arithmetic coding • Annex F: Advanced Prediction • Annex G: PB Frames • Annex I : Advanced Intra Coding • Annex J: Deblocking Filter • Annex K: Slice Structured Mode • Annex L: Supplemental enhancement information • Annex M: Improved PB frames • Annex N: Reference Picture Selection • Annex O: Scalability • Annex P: reference picture resampling

  25. (continued) • Annex Q: Reduced resolution update • Annex R: Independent Segment Decoding • Annex S: Alternative inter VLC • Annex T: Modified Quantization • Annex U: Enhanced reference picture selection • Annex V: Data partition slice • Annex W: Additional supplemental enhancement information

  26. Performance

  27. H.264 • Name • ITU H.264 = ISO MPEG-4 Part 10/AVC • H.26L: long-term enhancement, not compatible with H.263 • Now adopted in DMB-T/S and IPTV, replacing many MPEG-2 solutions • Targets a 50% gain over H.263+

  28. Key features • Smaller processing units (down to 4x4 pixel blocks) • Intra prediction • Inter prediction • Macroblock-based interframe prediction selection • ¼-pixel motion vector support • Motion vector options for sub-blocks • 4x4 integer DCT • Deblocking filter • Universal VLC • CABAC (context-based adaptive binary arithmetic coding)

  29. Intra-frame Prediction • luma - 4x4: 9 modes - 16x16: 4 modes • chroma - 8x8: 4 modes - The same prediction mode is always applied to both chroma blocks [Figure: 4x4 luma prediction from the neighboring samples A-H (above), I-L (left), and M (above-left), including a mean (DC) mode over A-D and I-L; 16x16 prediction from the row above (H) and the column to the left (V), including Mean(H, V)]
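
Three of the nine 4x4 luma modes (vertical, horizontal, and DC) are easy to sketch. The mode names as strings, the function names, and the SAD-based mode choice are assumptions for illustration; a real encoder may choose modes differently, and the six directional modes are omitted.

```python
def intra4x4(above, left, mode):
    """Predict a 4x4 block from 4 samples above and 4 samples to the left."""
    if mode == "vertical":                 # copy the row above downwards
        return [list(above) for _ in range(4)]
    if mode == "horizontal":               # copy the left column rightwards
        return [[left[r]] * 4 for r in range(4)]
    if mode == "dc":                       # rounded mean of the 8 neighbours
        dc = (sum(above) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError("unsupported mode: " + mode)

def best_mode(block, above, left, modes=("vertical", "horizontal", "dc")):
    """Pick the mode whose prediction has the smallest SAD against the block."""
    def cost(mode):
        pred = intra4x4(above, left, mode)
        return sum(abs(block[r][c] - pred[r][c])
                   for r in range(4) for c in range(4))
    return min(modes, key=cost)
```

Only the residual between the block and its prediction is transformed and coded, so a well-matched mode leaves very little to send.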

  30. Inter-frame Prediction [Figure: prediction relationships among I, P, and B pictures]

  31. Transform and Quantization • Integer DCT • No encoder/decoder mismatch • Three types of transform, each followed by quantization - Type 1: for the 4x4 array of luma DC coefficients in intra MBs predicted in 16x16 mode (block -1, 16x16 intra mode only) - Type 2: for the 2x2 array of chroma DC coefficients (blocks 16-17) - Type 3: for all other 4x4 blocks (blocks 0-15 and 18-25) • Data is transmitted in the numbered order [Figure: macroblock layout with the luma DC block -1, luma 4x4 blocks 0-15, chroma DC blocks 16-17, and chroma 4x4 blocks 18-25]

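  32. Transform and Quantization • 4×4 DCT (X: input, Y: output) • 4×4 integer transform: forward and backward [Equations for the DCT, the forward and backward integer transforms, the intermediate result W, and the post-scaling factor (PF) were not captured in this transcript]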
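
Since the slide's equations did not survive, the following sketch shows the commonly described form of the H.264 4×4 forward transform: an integer core transform W = Cf · X · Cfᵀ followed by element-wise post-scaling Y = W ⊗ PF. The matrix and the scaling constants a = 1/2 and b = √(2/5) follow that standard description rather than this slide, and the function names are hypothetical. In a real codec the post-scaling is folded into the quantizer.

```python
import math

# Integer core transform matrix of the H.264 4x4 forward transform
CF = [[1, 1, 1, 1],
      [2, 1, -1, -2],
      [1, -1, -1, 1],
      [1, -2, 2, -1]]

A, B = 0.5, math.sqrt(2 / 5)   # scaling constants a and b

def matmul(p, q):
    """Multiply two 4x4 matrices given as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def core_transform(x):
    """W = Cf * X * Cf^T using integer arithmetic only."""
    cf_t = [list(col) for col in zip(*CF)]
    return matmul(matmul(CF, x), cf_t)

def post_scale(i, j):
    """Post-scaling factor PF for coefficient position (i, j)."""
    if i % 2 == 0 and j % 2 == 0:
        return A * A
    if i % 2 == 1 and j % 2 == 1:
        return B * B / 4
    return A * B / 2

def forward_transform(x):
    """Y = W (x) PF, an approximation of the orthonormal 4x4 DCT."""
    w = core_transform(x)
    return [[w[i][j] * post_scale(i, j) for j in range(4)] for i in range(4)]
```

Because the core transform uses only additions, subtractions, and shifts on integers, encoder and decoder compute bit-identical results, which is the "no mismatch" property claimed for the integer DCT.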

  33. Entropy Coding

  34. Deblocking Filters • A boundary-strength (BS) parameter is assigned to every 4×4 block edge • BS = 0: no filtering • BS = 1-3: slight filtering • BS = 4: strong filtering • Filters only when |P0-Q0| < α, |P1-P0| < β, and |Q1-Q0| < β • Thresholds α and β depend on the average quantization parameter (QP) • Deblocking filtering accounts for about 1/3 of the computational complexity of a decoder
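
The filtering decision can be written directly from the three threshold conditions. In the standard, α and β come from QP-indexed tables; here they are plain parameters, which is an assumption made to keep the sketch short, and the function name is hypothetical.

```python
def should_filter(p1, p0, q0, q1, bs, alpha, beta):
    """Decide whether the edge between samples p0 and q0 is filtered.

    p1, p0 lie on one side of the block edge and q0, q1 on the other.
    A large step across the edge is assumed to be a real image edge and
    is left untouched; a small step is assumed to be a coding artifact.
    """
    return (bs > 0
            and abs(p0 - q0) < alpha
            and abs(p1 - p0) < beta
            and abs(q1 - q0) < beta)
```

For example, with alpha = 20 and beta = 5 a step from 102 to 110 across the edge is smoothed, while a step from 50 to 200 is preserved as genuine picture content.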

  35. Network Adaptation • VCL & NAL • VCL (video coding layer) • NAL (network abstraction layer) • Error-resilience tools • Flexible macroblock ordering (FMO) • Allows MBs to be assigned to slices in an order other than scan order • Arbitrary slice ordering (ASO) • Improves end-to-end delay in real-time applications • Redundant slices (RS) • Redundant representations are coded using different coding parameters [Figure: macroblocks of a picture assigned to Slice Group #0 and Slice Group #1]
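
One way to picture FMO is a checkerboard assignment of macroblocks to two slice groups. The map below is a hypothetical example rather than a full implementation of the slice-group map types defined in the standard, and the function names are assumptions.

```python
def checkerboard_slice_groups(mb_cols, mb_rows):
    """Assign each macroblock to slice group 0 or 1 in a checkerboard pattern."""
    return [[(x + y) % 2 for x in range(mb_cols)] for y in range(mb_rows)]

def neighbour_groups(groups, y, x):
    """Slice groups of the 4-connected neighbours of macroblock (y, x)."""
    rows, cols = len(groups), len(groups[0])
    result = []
    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ny, nx = y + dy, x + dx
        if 0 <= ny < rows and 0 <= nx < cols:
            result.append(groups[ny][nx])
    return result
```

With this pattern, if the packet carrying one slice group is lost, every missing macroblock still has all of its direct neighbours from the other group, which helps error concealment.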

  36. Profile & Level • Main application • Baseline : Video telephony • Main : DTV and Storage • Extended :Streaming • Profile & tools

  37. Performance comparison

  38. Contributions of the VCL Tools

  39. Conclusion • Many video coding standards exist • Standards reflect both coding technology and implementation technology • Coding performance has improved more than fourfold since H.261 (1990) • What's next • SVC (Scalable Video Coding) in H.264 (done) • H.264ext (further improvement of H.264) • 3-D and MVC (Multi-View Coding) are ongoing • UDTV (Ultra Definition TV: 3840x2160) • And what's next?
