
05/06/2010

FAST MODE DECISION IN H264/AVC VIDEO CODEC. Niranjan Mulay (0393251), Chen Gao (0401840). (El6123: Project Presentation). 05/06/2010.


Presentation Transcript


  1. FAST MODE DECISION IN H264/AVC VIDEO CODEC. Niranjan Mulay (0393251), Chen Gao (0401840). (El6123: Project Presentation) 05/06/2010

  2. Outline: • Introduction to H.264/AVC coding standard • Mode decisions in H.264/AVC - Intra Block - Inter Block • RDO (rate-distortion optimization) algorithm and the need for FMD (fast mode decision) • FMD (for Intra and Inter) • Literature survey: edge-map based FMD • Study of x264 code and encoding options • Implementation: - Generation of MB mode statistics file from x264 - Visualize the modes in Matlab - Intra FMD; Inter FMD • Summary and future work

  3. Introduction to H.264/AVC Coding Standard. The key features of H.264: • Improved intra prediction: directional spatial prediction • Enhanced temporal prediction: - Motion compensation with variable block sizes from 4x4 to 16x16: reduces ‘prediction error’ - Quarter-pel accurate motion estimation - Multiple reference frames for motion estimation - Weighted prediction (for B and P frames) • DCT-like integer transform: no mismatch between encoder and decoder

  4. Introduction to H.264/AVC Coding Standard (Contd.) • Efficient entropy coding: - Uses arithmetic entropy coding, has option for VLC coding - Context-adaptive entropy coding: 2 options, CAVLC and CABAC • Variable size (primarily 4x4 along with 8x8, 16x16) transform: - Smaller size helps to represent a signal in a locally adaptive manner, which reduces ringing artifacts. - Generally high frequency => 4x4 and low frequency => 16x16 • In-loop deblocking filter: reduces blocking artifacts, improves quality. • Special error-resilience tools

  5. H.264 Intra Modes: • Intra 4x4 : useful for a MB with significant detail • Intra 16x16 : good for coding very smooth areas (Intra 8x8 chroma: similar to intra 16x16) • I_PCM: no prediction or transform

  6. ‘Intra 16x16’: • Mode 0 (vertical): extrapolation from upper samples. • Mode 1 (horizontal): extrapolation from left samples. • Mode 2 (DC): mean of upper and left-hand samples. • Mode 3 (Plane): plane prediction based on a linear spatial interpolation by using the upper and left-hand samples of the MB.
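
The first three modes on this slide can be sketched in a few lines. This is a minimal illustration rather than x264's code: the function names are hypothetical, the block size is taken from the length of the neighbour row, and plane prediction (Mode 3) is left out.

```python
def predict_block(top, left, mode):
    """Predict an NxN block from its already-decoded neighbours.

    top  : N samples from the row directly above the block
    left : N samples from the column directly left of the block
    mode : 0 = vertical, 1 = horizontal, 2 = DC
    """
    n = len(top)
    if mode == 0:   # vertical: copy the upper row downwards
        return [list(top) for _ in range(n)]
    if mode == 1:   # horizontal: copy the left column rightwards
        return [[left[r]] * n for r in range(n)]
    if mode == 2:   # DC: rounded mean of the upper and left samples
        dc = (sum(top) + sum(left) + n) // (2 * n)
        return [[dc] * n for _ in range(n)]
    raise ValueError("plane prediction (mode 3) is not sketched here")


def sad(block, pred):
    """Sum of absolute differences between a block and its prediction."""
    return sum(abs(a - b) for rb, rp in zip(block, pred) for a, b in zip(rb, rp))
```

An encoder would call `predict_block` once per candidate mode and keep the mode whose prediction gives the lowest cost, for example the lowest `sad` or the lowest RD cost.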

  7. ‘Intra 4x4’: [Figure: 4x4 luma prediction modes]

  8. Intra 4x4 (Contd.): • Mode 0: Vertical • Mode 1: Horizontal • Mode 2: DC prediction • Mode 3: Diagonal down-left • Mode 4: Diagonal down-right • Mode 5: Vertical-right • Mode 6: Horizontal-down • Mode 7: Vertical-left • Mode 8: Horizontal-up

  9. H.264 Inter Modes: • Hierarchical decision • Level-1 (Partition): compute RD cost for 16x16, 16x8, 8x16, 8x8. • Level-2 (Sub-Partition): if level-1 selects 8x8, then compute RD cost of 8x4, 4x8 and 4x4. Select the partition with the lowest RD cost. • P_Skip mode
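
The two-level decision can be sketched as below. It is a simplification under stated assumptions: `rd_cost` is a hypothetical callable that returns the RD cost of a partition name, the sub-partition choice is made once per macroblock instead of once per 8x8 block, and P_Skip is not modelled.

```python
def choose_inter_partition(rd_cost):
    """Pick an inter partition with a two-level search.

    rd_cost: hypothetical callable, partition name -> RD cost.
    """
    # Level 1: macroblock partitions.
    level1 = ["16x16", "16x8", "8x16", "8x8"]
    best = min(level1, key=rd_cost)
    if best != "8x8":
        return best
    # Level 2: only reached when 8x8 won level 1.
    level2 = ["8x8", "8x4", "4x8", "4x4"]
    return min(level2, key=rd_cost)
```

Passing `costs.get` for a dictionary `costs` of precomputed values is enough to try it out.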

  10. RDO Algorithm • Formula: RD_cost(s, c, MODE | Qp) = D + λ · R, where D is the distortion, R the bit rate and λ the Lagrange multiplier • Computational complexity of brute-force RDO: • INTRA block: Total modes = 4 (16x16) + 9 (4x4) + 1 (I_PCM) + 4 (chroma 8x8) = 18. Total number of RDO calculations = M8 × (M4 × 16 + M16), where M8, M4 and M16 are the numbers of chroma 8x8, luma 4x4 and luma 16x16 modes. Theoretical bound for a MB: 4 × (9 × 16 + 4) = 592 • INTER block: Total modes = [7 + 1 (P_SKIP)] + intra counterparts • This is a huge amount of computation and a problem for real-time applications, hence the need for FMD.
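
The cost formula and the evaluation count on this slide can be checked with a short sketch. The function names are illustrative, and the λ formula is an assumption taken from the form commonly quoted for the JM reference encoder, not something stated on the slide.

```python
def rd_cost(distortion, rate_bits, lam):
    """Lagrangian cost: distortion plus lambda times rate."""
    return distortion + lam * rate_bits


def jm_style_lambda(qp):
    # Assumption: lambda = 0.85 * 2^((QP - 12) / 3), the form commonly
    # quoted for the JM reference encoder; the slide does not give it.
    return 0.85 * 2 ** ((qp - 12) / 3)


def intra_rdo_evaluations(chroma_modes=4, luma4x4_modes=9,
                          luma16x16_modes=4, blocks4x4_per_mb=16):
    """Number of RDO evaluations per macroblock: M8 * (M4 * 16 + M16)."""
    return chroma_modes * (luma4x4_modes * blocks4x4_per_mb + luma16x16_modes)
```

With the defaults, `intra_rdo_evaluations()` reproduces the bound of 592. Restricting the luma candidates to 4 (4x4) and 2 (16x16), while leaving chroma at 4 purely for illustration, gives `intra_rdo_evaluations(luma4x4_modes=4, luma16x16_modes=2)`, i.e. 264.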

  11. FMD-Intra: Edge-Histogram approach • Main idea: use prediction in the edge direction • Generate edge map using Sobel operator • Build edge direction histogram • Fast intra mode decision

  12. Generate Edge Map • Sobel Operator (Compute Gradients):
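
The slide's figure is not in the transcript, so the following is a sketch of the standard 3x3 Sobel operator, not a copy of the original slide. The function names and the |dx| + |dy| amplitude approximation are assumptions chosen for illustration.

```python
import math


def sobel_edge_vector(img, r, c):
    """Return (dx, dy) at interior pixel (r, c) of a 2-D list of luma samples."""
    # Horizontal gradient: right column minus left column, centre row weighted 2.
    dx = (img[r - 1][c + 1] + 2 * img[r][c + 1] + img[r + 1][c + 1]
          - img[r - 1][c - 1] - 2 * img[r][c - 1] - img[r + 1][c - 1])
    # Vertical gradient: bottom row minus top row, centre column weighted 2.
    dy = (img[r + 1][c - 1] + 2 * img[r + 1][c] + img[r + 1][c + 1]
          - img[r - 1][c - 1] - 2 * img[r - 1][c] - img[r - 1][c + 1])
    return dx, dy


def edge_amplitude(dx, dy):
    """Cheap amplitude estimate that avoids a square root."""
    return abs(dx) + abs(dy)


def gradient_angle_deg(dx, dy):
    """Angle of the gradient in degrees; the edge itself runs perpendicular to it."""
    return math.degrees(math.atan2(dy, dx))
```

Running these over every interior pixel of a block yields the per-pixel amplitudes and angles that feed the edge direction histogram.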

  13. Edge Direction Histogram for Intra_4x4

  14. FMD for Intra_4x4 Contd… As per observations in Reference [5]: - The ideal 4x4 mode is either the primary mode or one of its two neighboring modes - DC mode (Mode 2) is always evaluated - Total modes = 1 (primary) + 2 (neighbors) + 1 (DC) = 4
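
The candidate selection can be sketched as follows. The histogram is assumed to be a dictionary from directional mode number to summed edge amplitude, `ANGULAR_ORDER` lists the eight directional modes by prediction angle, and treating that order as circular (so Mode 1 neighbours Modes 6 and 8) is an assumption of this sketch.

```python
# Directional 4x4 modes ordered by prediction angle; DC (mode 2) has no direction.
ANGULAR_ORDER = [1, 8, 3, 7, 0, 5, 4, 6]
DC_MODE = 2


def intra4x4_candidates(histogram):
    """Return [primary, neighbour, neighbour, DC] for one 4x4 block.

    histogram: dict mapping directional mode -> summed edge amplitude.
    """
    # Primary mode: the direction with the largest accumulated amplitude.
    primary = max(ANGULAR_ORDER, key=lambda m: histogram.get(m, 0))
    i = ANGULAR_ORDER.index(primary)
    # Assumption: the angular order wraps around.
    before = ANGULAR_ORDER[(i - 1) % len(ANGULAR_ORDER)]
    after = ANGULAR_ORDER[(i + 1) % len(ANGULAR_ORDER)]
    return [primary, before, after, DC_MODE]
```

Only these four modes would then go through the RD cost evaluation instead of all nine.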

  15. Edge Direction Histogram for Intra_16x16 Total Modes = 1(Prime) + DC = 2

  16. Fast Mode Decision-Inter • Main idea: If we can reasonably decide that a MB is temporally stationary or spatially homogeneous, we can encode the MB using a larger block size and safely skip all other modes!

  17. Stationary Region Determination • Refers to the stillness between consecutive frames in the temporal dimension • Evaluate Zero-MV Diff: • If (Diff < Threshold Ts) => “Stationary”, so choose 16x16 mode and skip other sizes! • Threshold Ts = 200 (Reference [6])
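
The stationarity test can be sketched as below. The transcript lost the formula for Diff, so this sketch assumes it is the sum of absolute differences between the current macroblock and the co-located macroblock in the previous frame (zero motion vector). The function names are hypothetical.

```python
TS = 200  # threshold quoted on the slide from Reference [6]


def zero_mv_diff(cur_mb, ref_mb):
    """Assumed definition: SAD between co-located macroblocks (zero motion vector)."""
    return sum(abs(a - b) for rc, rr in zip(cur_mb, ref_mb) for a, b in zip(rc, rr))


def is_stationary(cur_mb, ref_mb, threshold=TS):
    """True when the macroblock barely changed between consecutive frames."""
    return zero_mv_diff(cur_mb, ref_mb) < threshold
```

A macroblock flagged as stationary would be coded with the 16x16 mode only, skipping the smaller partitions.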

  18. Homogeneous Region Determination • Refers to texture similarities inside a single video frame • Edge amplitude computation is already done in fast intra mode decision • Threshold values (Reference[6]): for 16x16 block : 20000 for 8x8 block : 5000
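
A sketch of the homogeneity test, assuming that a block counts as homogeneous when the sum of its per-pixel edge amplitudes stays below the quoted threshold. The function name and the input layout (a 2-D list of amplitudes already produced by the intra edge map) are illustrative.

```python
# Thresholds quoted on the slide from Reference [6], keyed by block size.
THRESHOLDS = {16: 20000, 8: 5000}


def is_homogeneous(edge_amplitudes, block_size):
    """edge_amplitudes: 2-D list of per-pixel edge amplitudes for one block."""
    total = sum(sum(row) for row in edge_amplitudes)
    # Assumption: homogeneous means the summed amplitude is below the threshold.
    return total < THRESHOLDS[block_size]
```

Because the amplitudes are reused from the intra edge map, this check adds only a summation per block.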

  19. Flow Chart of FMD_Inter

  20. Wait... Changing the mode: Theory to Practice! Implementation & Demo

  21. H.264/AVC Profiles

  22. Q. What is x264? • ‘x264’: • Open source H.264/AVC encoder by VideoLAN • ‘C’ code library, Platform: Linux • Optimized as compared to the reference JM software • Many encoding options • We finalized the options for “benchmarking” performance of the Non-FMD vs FMD case. E.g.: Command to encode the ‘foreman_qcif.yuv’ sequence: ./x264 -o foreman_qcif.264 foreman_qcif.yuv 176x144 --profile baseline --frame 30 --verbose --keyint 15 --min-keyint 15 --no-scenecut --bframes 0 --ref 1 --slices 1 --fps 15 --qp 25 --partitions all --weightp 0 --me esa --subme 7 --no-chroma-me --no-8x8dct --trellis 0 --no-fast-pskip --visualize

  23. x264 Coding Options: • --keyint 15/--min-keyint 15: Sets GOP size to 15 • --bframes 0: Disables B-frames • --slices 1: Sets 1 slice per frame • --ref 1: Only 1 frame can be used as reference • --me esa: Selects exhaustive motion estimation • --no-chroma-me: Ignores chroma in motion estimation • --qp 25: Fixed quantization parameter (constant QP) • --partitions all: Enables all partition types • --no-scenecut: Disables adaptive I-frame decision

  24. Implementation I: ‘Generation of Mode Statistics’ • Intra MB: 3 types :: I_4x4=0 (11 Modes), I_16x16=2 (4 Modes), I_PCM=3 • Inter MB: 3 types :: P_L0=4, P_8x8=5, P_SKIP=6 • P_L0 (Level-1): can have 3 partitions: D_16x8=14, D_8x16=15, D_16x16=16 • P_8x8 (Level-2): has D_8x8 partition and can have 4 sub-partitions: D_L0_8x8=3, D_L0_4x4=0, D_L0_8x4=1, D_L0_4x8=2
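
Once the per-macroblock type codes are dumped, summarising them is straightforward. The sketch below uses the type codes listed on this slide; the input (a plain list of integer codes, one per macroblock) is a hypothetical stand-in for the statistics file, whose actual layout the slides do not describe.

```python
from collections import Counter

# Macroblock type codes as listed on the slide.
MB_TYPE_NAMES = {0: "I_4x4", 2: "I_16x16", 3: "I_PCM",
                 4: "P_L0", 5: "P_8x8", 6: "P_SKIP"}


def mode_histogram(mb_types):
    """Return the fraction of macroblocks coded with each type.

    mb_types: list of integer type codes, one per macroblock (assumed format).
    """
    if not mb_types:
        return {}
    counts = Counter(MB_TYPE_NAMES.get(t, "other") for t in mb_types)
    total = len(mb_types)
    return {name: n / total for name, n in counts.items()}
```

For a P-frame, the share of `P_SKIP` and `P_L0` in this histogram gives a quick indication of how often large blocks win.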

  25. Implementation II: ‘Visualization Utility’ • I-Frame: RED: Intra_4x4, CYAN: Intra_16x16 • P-Frame: GREEN: P_SKIP, BLUE: P_8x8 (and below), MAGENTA: P_16x16, P_16x8, P_8x16 • Motive: “Seeing is Believing!” Let’s see a Demo…
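
The original utility was written in Matlab and is not reproduced in the transcript. As a stand-in, the colour legend above can be turned into a per-macroblock colour grid like this; the function name, the RGB triples and the black fallback for unlisted types are assumptions.

```python
# RGB colours following the legend on the slide.
COLOURS = {
    "I_4x4": (255, 0, 0),      # red
    "I_16x16": (0, 255, 255),  # cyan
    "P_SKIP": (0, 255, 0),     # green
    "P_8x8": (0, 0, 255),      # blue
    "P_L0": (255, 0, 255),     # magenta: P_16x16, P_16x8, P_8x16
}


def colour_map(mb_names, mbs_per_row):
    """Arrange macroblock type names (raster order) into rows of RGB colours."""
    rows = [mb_names[i:i + mbs_per_row]
            for i in range(0, len(mb_names), mbs_per_row)]
    # Assumption: types missing from the legend (e.g. I_PCM) are drawn black.
    return [[COLOURS.get(name, (0, 0, 0)) for name in row] for row in rows]
```

For a QCIF frame (176x144) there are 11 macroblocks per row and 9 rows, so `mbs_per_row` would be 11.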

  26. Key observations: • I-Frame: • 16x16 size chosen for spatially homogeneous regions • 4x4 size chosen for a MB with many spatial details/local edges • P-Frame:

  27. Contd… Though H.264 allows variable-size MC down to 4x4… • Real-world video sequences contain a certain percentage of ‘Skipped’ blocks • Spatially homogeneous regions are best compensated with 16x16 (such blocks have similar motion and are very seldom split into smaller blocks) • Temporally stationary blocks (e.g. stationary background, even with strong edges) are best compensated with 16x16 or P_SKIP • Nonetheless, blocks containing motion boundaries or motion in smaller objects benefit from 8x8 or 4x4 MC

  28. Implementation III: FMD Intra in x264 • ~1000 lines of C code: edge map computation, prime mode computation based on histogram, modification of mode decision logic in x264 • Number of candidate modes in Intra-FMD:

  29. Results: Intra FMD (All I frames, Qp=25) Avg. Time Saving: 36.70% Avg. PSNR drop: 0.11 dB

  30. Results: Intra FMD (PSNR vs R) Sequence: Mobile, Coding: All I, Qp = 37, 33, 29, 25. Avg. PSNR drop: 0.044 dB, Avg. increase in R: ~6%, Avg. time saving: 37.51%

  31. Summary and future work: To conclude: • Learnt x264 code flow and different encoding options • Matlab ‘mode visualization script’ is ready • Intra-FMD ready, Inter-FMD (in progress) • Important: FMD framework is ready! Different FMD algorithms can be plugged in to evaluate prime mode selection… Future work: • Inter FMD • FMD enhancement: analysis of different modes with a conditional probabilistic model

  32. References • [1] URL: http://www.videolan.org/developers/x264.html • [2] Thomas Wiegand, Gary J. Sullivan, “Overview of the H.264/AVC Video Coding Standard”, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 13, No. 7, July 2003 • [3] White Paper: “An Overview of H.264 Advanced Video Coding”, URL: http://www.vcodex.com/files/H.264_overview.pdf • [4] Iain E. G. Richardson, “H.264 and MPEG-4 Video Compression”, Wiley, 2003 • [5] Feng Pan et al., “Fast Mode Decision Algorithm for Intra-prediction in H.264/AVC Video Coding”, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 15, No. 7, July 2005 • [6] D. Wu et al., “Fast Intermode Decision in H.264/AVC Video Coding”, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 15, No. 6, July 2005 • [7] Rui Su, Guizhong Liu, Tongyu Zhang, “Fast Mode Decision Algorithm for Intra Prediction in H.264/AVC”, ICASSP-2006

  33. Thank you!

  34. Questions?
