
Fast Mode Decision for H.264/AVC Based on Rate-Distortion Clustering

Fast Mode Decision for H.264/AVC Based on Rate-Distortion Clustering. IEEE Transactions on Multimedia, vol. 14, no. 3, June 2012. Yu-Huan Sung and Jia-Ching Wang, Senior Member, IEEE. Outline: Introduction, Related Works, Feature Selection, Proposed Fast Mode Decision, Experiment Results.

Presentation Transcript


  1. Fast Mode Decision for H.264/AVC Based on Rate-Distortion Clustering IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 14, NO. 3, JUNE 2012 Yu-Huan Sung and Jia-Ching Wang, Senior Member, IEEE

  2. Outline • Introduction • Related Works • Feature Selection • Proposed Fast Mode Decision • Experiment Results • Conclusion

  3. Introduction • The up-to-date video coding standard H.264/AVC • achieves twice the compression ratio of other video coding standards. • maintains nearly the same visual quality. • However, extremely high computational complexity is the price of these performance gains, which is a problem for: • Video conferencing • Live TV broadcasting • Mobile computing

  4. Introduction • H.264/AVC adopts many features that can enhance coding performance. • Variable block-size MC • Sub-pixel ME • Multiple reference pictures selection • Directional intra prediction • In-the-loop de-blocking filtering, etc. • The features incur a heavy burden during the encoding process.

  5. Introduction • Reducing the computational time has received considerable attention recently. • Reducing the encoding time involves two main parts: • Inter-mode decision • Intra-mode decision, both made according to an RD cost optimization scheme.

  6. Introduction • The proposed method presents a Multi-Phase Classification (MPC) scheme • use a nearest mean criterion. • determine inter-modes and intra-modes. • MPC is a hierarchical classification scheme that allows an MB to be classified into a category phase by phase.

  7. Introduction • The MPC presents a three-phase classification scheme. • a phase consists of several categories. • partition from current phase into next phase. • categories are the sub-sets of the upper phase. • Each category within a phase is represented as a feature point in the feature space. • assign an MB to a category with the minimum distance.

  8. Outline • Introduction • Related Works • Feature Selection • Proposed Method • Experiment Results • Conclusion

  9. Related Works • Previous works develop fast mode decision algorithms in four ways. • The first approach is SKIP-mode detection • identifies early whether an MB can be skipped. • Kannangara et al. [3] and Zhao et al. [4]. [3] C. Kannangara et al., “Low-complexity skip prediction for H.264 through Lagrangian cost estimation,” IEEE Trans. Circuits Syst. Video Technol., vol. 16, no. 2, pp. 202–208, Feb. 2006. [4] Y. Zhao, M. Bystrom, and I. E. G. Richardson, “A MAP framework for efficient skip/code mode decision in H.264,” in Proc. ICIP 2006, Atlanta, GA, Oct. 8–11, 2006.

  10. Related Works • The second approach is mode prediction • directly or indirectly predict the best mode for the current MB. • Wu et al. [5], Ri et al. [6] and Paul et al. [17]. • The third approach is mode classification • classifies the current MB into a specific category. • the corresponding candidate modes will be checked to find the best. • Kim et al. [7], Yu et al. [8], Liu et al. [9], Zeng et al. [10] and Zhao et al. [11]. [5] D. Wu, F. Pan, K. P. Lim, and S. Wu et al., “Fast intermode decision in H.264/AVC video coding,” IEEE Trans. Circuits Syst. Video Technol., vol. 15, no. 7, pp. 953–958, Jul. 2005. [6] S. H. Ri, Y. Vatis, and J. Ostermann, “Fast inter-mode decision in an H.264/AVC encoder using mode and Lagrangian cost correlation,” IEEE Trans. Circuits Syst. Video Technol., vol. 19, no. 2, pp. 302–306, Feb. 2009. [17] M. Paul, W. Lin, C. T. Lau, and B. S. Lee, “Direct inter-mode selection for H.264 video coding using phase correlation,” IEEE Trans. Image Process., vol. 20, no. 2, pp. 461–473, Feb. 2011.

  11. Related Works • The last approach redefines the optimization cost function • the number of operations needed for mode selection can be reduced. [7] C. Kim and C. C. Jay Kuo, “Feature-based intra-/inter coding mode selection for H.264/AVC,” IEEE Trans. Circuits Syst. Video Technol., vol. 17, no. 4, pp. 441–453, Apr. 2007. [8] A. C. W. Yu, G. R. Martin, and H. Park, “Fast inter-mode selection in the H.264/AVC standard using a hierarchical decision process,” IEEE Trans. Circuits Syst. Video Technol., vol. 18, no. 2, pp. 186–195, Feb. 2008. [9] Z. Liu, L. Shen, and Z. Zhang, “An efficient intermode decision algorithm based on motion homogeneity for H.264/AVC,” IEEE Trans. Circuits Syst. Video Technol., vol. 19, no. 1, pp. 128–132, Jan. 2009. [10] H. Zeng, C. Cai, and K.-K. Ma, “Fast mode decision for H.264/AVC based on macro block motion activity,” IEEE Trans. Circuits Syst. Video Technol., vol. 19, no. 4, pp. 491–499, Apr. 2009. [11] T. Zhao, H. Wang, S. Kwong, and C.-C. Jay Kuo, “Fast mode decision based on mode adaptation,” IEEE Trans. Circuits Syst. Video Technol., vol. 20, no. 5, pp. 697–705, May 2010.

  12. Outline • Introduction • Related Works • Feature Selection • Feature Vector • Feature Space and Classifier • Proposed Fast Mode Decision • Experiment Results • Conclusion

  13. Feature Vector • There is a strong correlation of RD cost between the best mode and the temporal-spatial modes. • A three-dimensional feature vector that comprises RD costs of neighboring MBs is used to discriminate between the different modes for mode decision.

  14. Feature Vector • RD costs range to various extents under different coding modes and motion contents and should not be directly used as a universal criterion. • Using a three-dimensional feature vector • ensure that an MB can be assigned to the most probable category accurately. • adapt to the variable motion contents of various video sequences properly.

  15. Feature Vector • The three components of a feature vector, fskip, fspat, and ftemp, are expressed as :

  16. Flowchart of Initialization

  17. Feature Vector • RD cost is expressed as :
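
The equation on this slide was an image and is not preserved in the transcript. For orientation only, the Lagrangian RD cost commonly used for H.264/AVC mode decision (for example in the JM reference software) has the form below; that the slide used exactly this form and notation is an assumption.

```latex
% Standard Lagrangian mode-decision cost (assumed form, not copied from the slide)
J_{\text{mode}} = D + \lambda_{\text{mode}} \cdot R
% D: distortion between the original and reconstructed MB (typically SSD)
% R: number of bits needed to code the MB in the given mode
% \lambda_{\text{mode}}: Lagrange multiplier, which grows with QP
```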

  18. Feature Space and Classifier • The 3D feature space

  19. Feature Space and Classifier • Feature Space and Voronoi Diagram (figure; axes: ftemp, fskip)

  20. Outline • Introduction • Related Works • Feature Selection • Proposed Fast Mode Decision • Experiment Results • Conclusion

  21. Fast Mode Decision • Nearest Mean Criterion • assign MBs into a specific category. • classify MBs by using Euclidean distance. • predict the best mode of an MB by finding a mean Mi (cluster center).
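
The nearest mean criterion above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the category names and mean values are hypothetical, and the feature order (fskip, fspat, ftemp) follows the Feature Vector slides.

```python
import math

def nearest_mean(feature, means):
    """Return the category whose mean (cluster center) is closest to
    `feature` in Euclidean distance."""
    return min(means, key=lambda cat: math.dist(feature, means[cat]))

# Hypothetical cluster centers in (f_skip, f_spat, f_temp) space.
# Real values would come from training statistics.
MEANS = {
    "large_middle": (1000.0, 1200.0, 1100.0),
    "middle_small": (4000.0, 4200.0, 3900.0),
}

# Feature vector of the current MB (hypothetical RD costs).
category = nearest_mean((1500.0, 1300.0, 1250.0), MEANS)
```

Only the modes belonging to the returned category would then need to be evaluated.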

  22. Fast Mode Decision • Category Organization • directly assigning the mode with minimum distance to the given MB. • unsatisfactory prediction accuracy. • grouping modes with similar characteristics into a category. • reducing the probability of a false prediction.

  23. Fast Mode Decision • Multi-Phase Classification • pass through multiple phases. • avoid assigning an MB to a category too cursorily. • Phase-I identifies • Large-Middle category (SKIP/DIRECT, 16×16, 16×8, 8×16, I16×16) • Middle-Small category (16×8, 8×16, P8×8, I4×4) • Phase-II and Phase-III then divide each motion category into much smaller categories.
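
A phase-by-phase classification of this kind can be sketched as a walk down a tree of categories, applying the nearest mean test at each level. The sketch below is an assumption-laden illustration: only the two Phase-I categories come from the slide, while the second-level split, every mean value, and the two-level depth (the method uses three phases) are hypothetical.

```python
import math

# Category tree. Phase-I names follow the slide; the sub-categories and
# all mean values are hypothetical placeholders.
TREE = {
    "large_middle": {
        "mean": (1000.0, 1200.0, 1100.0),
        "children": {
            "large": {"mean": (600.0, 800.0, 700.0),
                      "modes": ["SKIP/DIRECT", "16x16"]},
            "middle": {"mean": (1800.0, 1900.0, 1700.0),
                       "modes": ["16x8", "8x16", "I16x16"]},
        },
    },
    "middle_small": {
        "mean": (4000.0, 4200.0, 3900.0),
        "children": {
            "middle": {"mean": (3000.0, 3200.0, 3100.0),
                       "modes": ["16x8", "8x16"]},
            "small": {"mean": (5500.0, 5600.0, 5200.0),
                      "modes": ["P8x8", "I4x4"]},
        },
    },
}

def classify(feature, nodes):
    """Descend one phase at a time, picking the nearest mean at each phase,
    and return the candidate modes of the final category."""
    name = min(nodes, key=lambda n: math.dist(feature, nodes[n]["mean"]))
    node = nodes[name]
    if "children" in node:
        return classify(feature, node["children"])
    return node["modes"]

candidates = classify((700.0, 900.0, 750.0), TREE)
```

Each phase only narrows the candidate set, so a borderline MB is not committed to a single mode in one step.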

  24. Fast Mode Decision

  25. Flowchart of Phases

  26. Fast Mode Decision • The mode decision process can be further accelerated by Early Termination. • activated if fskip is below a specific threshold Tskip. • SKIP mode is then taken as the best mode. • The initial threshold is set to the average RD cost of SKIP-MBs in the training sequences, and is dynamically updated according to :
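
The early termination test can be sketched as below. The update equation from the slide is not preserved in the transcript, so the running-mean update used here is a stand-in assumption and should not be read as the paper's rule; the class and method names are hypothetical.

```python
class SkipThreshold:
    """Tracks T_skip for early termination.

    The initial value is meant to be the average RD cost of SKIP-MBs in
    the training sequences. The running-mean update is an ASSUMPTION.
    """

    def __init__(self, initial):
        self.value = initial
        self.count = 1  # the initial value counts as one observation

    def should_terminate(self, f_skip):
        """Early termination fires when f_skip is below the threshold."""
        return f_skip < self.value

    def update(self, skip_rd_cost):
        """Fold the RD cost of a newly coded SKIP-MB into the threshold
        (assumed running mean)."""
        self.count += 1
        self.value += (skip_rd_cost - self.value) / self.count

t_skip = SkipThreshold(initial=1000.0)  # hypothetical training average
if t_skip.should_terminate(800.0):
    best_mode = "SKIP"        # skip all remaining mode checks
    t_skip.update(800.0)
```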

  27. Flowchart of Early Termination

  28. Error Propagation and Performance Degradation Control • A performance control process is incorporated into the proposed method. • It avoids serious performance degradation caused by repeated use of wrongly predicted results or accidental false predictions. • The idea is to inspect the coding result that the fast mode decision algorithm produces for each MB.

  29. Error Propagation and Performance Degradation Control • An adaptive RD cost inspection is proposed; everything it needs has already been computed. • temporal RD costs • spatial RD costs • Once a fast mode decision is made and the corresponding RD cost is obtained, an inspection is performed by :
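
The inspection condition itself is not preserved in the transcript. Purely to illustrate the idea of checking a fast decision against neighboring RD costs that are already available, the sketch below accepts the decision when its RD cost stays within a tolerance of the neighbors' average; the averaging rule, the `tolerance` parameter, and the function name are all hypothetical.

```python
def passes_inspection(rd_cost, temporal_costs, spatial_costs, tolerance=1.2):
    """Return True if the RD cost of the fast decision looks plausible.

    ASSUMED rule: compare against the mean RD cost of the temporal and
    spatial neighbors, scaled by a hypothetical tolerance factor.
    """
    neighbors = list(temporal_costs) + list(spatial_costs)
    reference = sum(neighbors) / len(neighbors)
    return rd_cost <= tolerance * reference

# Hypothetical RD costs: one temporal neighbor, one spatial neighbor.
accepted = passes_inspection(1000.0, [900.0], [1100.0])
rejected = not passes_inspection(1500.0, [900.0], [1100.0])
```

An MB that fails such a check would be re-encoded with a wider mode search, so that one wrong prediction is not reused by later MBs.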

  30. Flowchart of Inspection

  31. Outline • Introduction • Related Works • Feature Selection • Proposed Fast Mode Decision • Experiment Results • Conclusion

  32. Training and Test Conditions • The means of each category and the related statistics are generated by JM17.0 [15]. • The ten video sequences are Silent, Ice, Hall, Highway, Miss-America, Carphone, Tempete, Soccer, Bus, and Table Tennis, all in QCIF format. • QP values are 20, 24, 28, 32, and 36. • Two GOP structures (IPPP and IBBP) are used for training.

  33. Training and Test Conditions • The number of frames to be encoded is set to 100. • The search range of motion estimation is 16, and the search strategy is full search. • The number of reference frames is 1, and the intra-period is set to 4.

  34. Performance of Used Mode

  35. Performance Comparisons (1/2)

  36. Performance Comparisons (2/2)

  37. Performance Comparisons with GOP size

  38. RD Curves

  39. Outline • Introduction • Related Works • Feature Selection • Proposed Fast Mode Decision • Experiment Results • Conclusion

  40. Conclusion • Experimental results indicate that the quality loss and bitrate increase are only 0.02 dB and 1.65%, respectively. • Encoding time is reduced by 67.5% on average across the 12 video sequences with different GOP structures. • The sequences encompass a wide variety of motion content and different resolutions.
