
Design of a 125 µW, Fully-Scalable MPEG-2 and H.264/AVC Video Decoder for Mobile Applications

Design of a 125 µW, Fully-Scalable MPEG-2 and H.264/AVC Video Decoder for Mobile Applications. Tsu-Ming Liu (1), Ching-Che Chung (1), Chen-Yi Lee (1), Ting-An Lin (2), and Sheng-Zen Wang (2). (1) National Chiao-Tung University, Hsin-Chu, Taiwan; (2) MediaTek Inc., Hsin-Chu, Taiwan. 2006/7/26.


Presentation Transcript


  1. Design of a 125 µW, Fully-Scalable MPEG-2 and H.264/AVC Video Decoder for Mobile Applications
     Tsu-Ming Liu (1), Ching-Che Chung (1), Chen-Yi Lee (1), Ting-An Lin (2), and Sheng-Zen Wang (2)
     (1) National Chiao-Tung University, Hsin-Chu, Taiwan
     (2) MediaTek Inc., Hsin-Chu, Taiwan
     2006/7/26

  2. Outline
     • Introduction
     • System Specification
     • Improved Memory Hierarchy
     • Low-Power Architectures
     • Design Flow
     • Measured Results
     • Conclusion

  3. Motivation
     • Low power demands
       • The power consumption of existing solutions is still too high for portable devices.
       • The memory system is a critical factor in the power budget.
     • High speed requirements
       • H.264/AVC requires high-speed modules to handle the extensive accesses between memory and logic.
     [Figure: H.264/AVC core power profiling. SRAM 70%, misc. 30%.]

  4. Design Contributions
     • To reduce power consumption
       • We exploit the memory hierarchy to reduce memory power consumption.
       • We develop low-power architectures to lower the working frequency with only a few additional buffers and an additional logic unit.
       • In addition to the power reduction at the architectural level, an efficient design flow can further reduce the power dissipation.

  5. Target Specification
     • Dual Standard
       • H.264/AVC Baseline Profile, Level 4
       • MPEG-2 Simple Profile, Main Level
     • High Quality Decoding (30 fps, 4:2:0)

  6. System Block Diagram
     [Block diagram: system bus, syntax parser, entropy decoder, 4x4/8x8 IDCT, intra and inter prediction, in/post-loop filter, display engine with display I/F, SDRAM I/F to an 8 MB SDRAM, slice pixel SRAM, and Line-Pixel-Lookahead unit.]

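  7. Improved Memory Hierarchy
     • Proposed three-level memory hierarchy
       • 3rd level: SDRAM, reached through the I/O interface
       • 2nd level: slice SRAM and LPL unit (the slice SRAM stores rows of pixels)
       • 1st level: pipeline registers feeding intra prediction and motion compensation
     [Figure: three-level hierarchy diagram; other labels in the figure: 24, 16, request, 32-b bypass.]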

  8. Improved Memory Hierarchy
     • Line-Pixel-Lookahead (LPL) Unit
       • We exploit an LPL unit to eliminate redundant data and thereby reduce memory space.
     [Figure: slice SRAM size without the LPL unit (153.6 kb) and with the LPL unit (19.2 kb); other labels: Horizontal, Horizontal-Up.]

  9. Improved Memory Hierarchy
     • Memory Power Consumption
     [Figure: bar chart of SRAM power and DRAM power (mW, axis ticks 20 to 60) for three configurations: without memory hierarchy, 3-level memory hierarchy, and 3-level memory hierarchy + LPL scheme. Percentage labels in the chart: 44%, 51%, 11%.]

  10. Low-Power Architectures
     • Motion Compensation (MC)
       • We exploit data reuse within the interpolation window by allocating content buffers.
     [Figure: 4x4 sub-blocks numbered 0 to 7 fetched from SDRAM into 6x9 content buffers; annotated "1% cost of MC".]
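The idea of reusing interpolation-window data can be illustrated with a rough pixel count. This is a minimal sketch, not the decoder's actual buffer scheme: it assumes the H.264 6-tap luma filter (so a 4x4 sub-block needs a 9x9 reference window), and it assumes horizontally adjacent sub-blocks share the same motion vector so their windows overlap by five columns. The function names are hypothetical.

```python
# Rough count of reference pixels fetched for a row of 4x4 sub-blocks.
# Assumptions (not from the slides): 6-tap luma interpolation, identical
# motion vectors for adjacent sub-blocks, worst-case fractional position.

TAPS = 6    # H.264 luma half-sample filter length
BLOCK = 4   # sub-block width and height in pixels


def window_side(block=BLOCK, taps=TAPS):
    """Side length of the reference window needed to interpolate one block."""
    return block + taps - 1


def pixels_without_reuse(n_blocks):
    """Every sub-block fetches its full window from external memory."""
    side = window_side()
    return n_blocks * side * side


def pixels_with_reuse(n_blocks):
    """Overlapping columns are kept in a local buffer; only new columns are fetched."""
    side = window_side()
    first = side * side                      # first block: full window
    extra = (n_blocks - 1) * BLOCK * side    # each further block: 4 new columns
    return first + extra


if __name__ == "__main__":
    for n in (2, 4):
        print(n, pixels_without_reuse(n), pixels_with_reuse(n))
```

Under these assumptions, two adjacent sub-blocks need 162 fetched pixels without reuse and 117 with reuse; the saving grows with the number of sub-blocks that share a window.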

  11. Low-Power Architectures
     • Deblocking Filter (DF)
       • We reduce the access overhead of the different filtering directions by developing novel filtering orders.
     [Figure: edge filtering orders over 4x4 sub-blocks and the resulting SRAM accesses; annotated "50% access reduction".]

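  12. Low-Power Architectures
     • A lower working frequency is sufficient to meet our design specification.
     [Figure: cycles per macroblock (MB) and working frequency fall from 920 cycles/MB at 242 MHz (preliminary) to 580 cycles/MB at 152 MHz and then to 380 cycles/MB at 100 MHz (this work); step labels in the figure: Improved MC, Improved DF, Pipelined Stage.]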
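The link between cycles per macroblock and working frequency can be sketched with a simple throughput bound. The slides do not state which frame size the frequencies were derived for, so the 1920x1080 at 30 fps case below is an assumption (it is a frame size within H.264 Level 4), and both function names are hypothetical.

```python
# Lower bound on clock frequency: cycles per macroblock times macroblocks per second.
# The 1920x1080 @ 30 fps workload is an assumption, not a figure from the slides.


def macroblocks_per_second(width, height, fps):
    """Number of 16x16 macroblocks to decode per second (dimensions rounded up)."""
    mb_cols = (width + 15) // 16
    mb_rows = (height + 15) // 16
    return mb_cols * mb_rows * fps


def required_mhz(cycles_per_mb, mb_rate):
    """Minimum clock frequency in MHz for a given per-macroblock cycle budget."""
    return cycles_per_mb * mb_rate / 1e6


if __name__ == "__main__":
    rate = macroblocks_per_second(1920, 1080, 30)
    for cycles in (920, 580, 380):
        print(cycles, round(required_mhz(cycles, rate), 1))
```

For this assumed workload the bound comes out somewhat below the 242, 152 and 100 MHz quoted on the slide, while the ratio between the three cases is the same, since frequency scales linearly with cycles per macroblock.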

  13. Design Flow
     • A design flow for this video decoder
       • Phase 1, design loop: System SPEC, C/C++ Model, Architectural Design, RTL Description
         • 73% power reduction
           1. Improved Memory Hierarchy (memory size: C)
           2. Motion Compensation (working frequency: f)
           3. Deblocking Filter (working frequency: f)
       • Phase 2, timing/SI closure loop: Synthesis (RTL Compiler), P&R (SoC Encounter)
         • Further 8.2% power reduction
           1. Physical wire-load model (timing closure)
           2. Low-power synthesis
           3. Timing-aware and SI-prevention routing
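The "C" and "f" annotations on this slide suggest the usual first-order CMOS dynamic power model, in which power is proportional to switched capacitance, supply voltage squared, and clock frequency. The sketch below only illustrates that proportionality under this assumed model; it is not the authors' power estimation method and does not reproduce the 73% figure. The function name is hypothetical.

```python
# First-order dynamic power scaling: P is proportional to C * V^2 * f.
# This textbook model is an assumption here; the slides only label the
# savings with "C" (memory size) and "f" (working frequency).


def relative_dynamic_power(c_ratio=1.0, v_ratio=1.0, f_ratio=1.0):
    """Dynamic power relative to a baseline, given ratios of C, V and f."""
    return c_ratio * v_ratio ** 2 * f_ratio


if __name__ == "__main__":
    # Lowering only the clock from 242 MHz to 100 MHz, same C and V.
    print(relative_dynamic_power(f_ratio=100 / 242))
    # Lowering only the supply from 1.8 V to 1.0 V, same C and f.
    print(relative_dynamic_power(v_ratio=1.0 / 1.8))
```

Measured savings can differ from this idealized scaling, because leakage and components whose activity does not follow the core clock are not captured by the model.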

  14. Measured Results
     [Chip micrograph: 3.9 mm x 3.9 mm.]

  15. Measured Results • Chip Summary

  16. Measured Results • Power Measurement

  17. Measured Results
     • Power Measurement
       • Measured accuracy:
       • Voltage scaling
     [Figure: maximum working frequency (MHz) and H.264 core power (µW) versus supply voltage from 1.8 V down to 1.0 V. Maximum working frequency falls from 112 MHz to 31 MHz; core power for QCIF@15fps at 1.15 MHz falls from 225 µW to 125 µW.]

  18. Conclusion
     • An MPEG-2 SP@ML and H.264/AVC BL@L4 video decoder is developed to meet dual-standard requirements.
     • A large saving in power consumption is attained through both the improved memory hierarchy and the low-power architectures, and the power can be further reduced through the EDA tool flow.
     • Sub-mW power consumption can be achieved when decoding MPEG-2 or H.264/AVC video sequences in real time for mobile applications at a 1 V operating voltage.

  19. Thanks for your attention!
