
Entropy Slices for Parallel Entropy Coding K. Misra, J. Zhao and A. Segall



  1. Entropy Slices for Parallel Entropy Coding K. Misra, J. Zhao and A. Segall

  2. Entropy Slices • Introduction: Entropy Slice • Introduces partitioning of slices into smaller “entropy” slices • An entropy slice: • Resets context models • Restricts the neighborhood definition • Is processed identically to a current slice by the entropy decoder • Key difference: reconstruction uses information from neighboring entropy slices
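The decode flow described above can be sketched as follows. This is a minimal illustration, not real codec code: the names (`ContextModels`, `decode_entropy_slice`) and the toy context-update rule are assumptions for the example; a real decoder would run arithmetic bin decoding here.

```python
class ContextModels:
    """Adaptive probability contexts used by the entropy decoder (toy model)."""
    def __init__(self):
        self.reset()

    def reset(self):
        # Entropy slices reset the contexts to their initial state, so
        # parsing never depends on symbols from another entropy slice.
        self.counts = {}

    def update(self, ctx_idx, bin_value):
        ones, total = self.counts.get(ctx_idx, (0, 0))
        self.counts[ctx_idx] = (ones + bin_value, total + 1)

def decode_entropy_slice(bins):
    """Parse one entropy slice with freshly reset context models.

    Context selection is restricted to the slice itself; reconstruction
    (not shown) may still use information from neighboring entropy slices.
    """
    ctx = ContextModels()
    symbols = []
    for ctx_idx, bin_value in bins:   # stand-in for real bin decoding
        ctx.update(ctx_idx, bin_value)
        symbols.append(bin_value)
    return symbols
```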

  3. Entropy Slices • We now introduce the major advantages of the entropy slice concept • Advantage #1 – Parallelization: • Entropy slices do not depend on information outside the entropy slice and can be decoded independently • Allows parallelization of the entire entropy decoding loop, including context adaptation and bin coding • Advantage #2 – Generalization: • Entropy slices can be used with all entropy coding engines currently under study in the TMuC and TMuC software: V2V, CABAC, PIPE and UVLC • Moreover, we have software available for PIPE and CABAC
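Because parsing carries no cross-slice dependency, slices map cleanly onto parallel workers. A hedged sketch, with `decode_entropy_slice` as a placeholder for a real bin decoder (the real one would perform context adaptation and bin coding with contexts reset at each slice start):

```python
from concurrent.futures import ThreadPoolExecutor

def decode_entropy_slice(slice_bins):
    # Placeholder for a real entropy decoder; each call is independent
    # because contexts are reset and neighborhoods are slice-local.
    return list(slice_bins)

def parse_picture(entropy_slices, workers=4):
    # No cross-slice parsing dependency, so every slice can be handed
    # to a separate worker and results collected in bitstream order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(decode_entropy_slice, entropy_slices))
```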

  4. Entropy Slices • Advantage #3 – No impact on single thread/core: • Parallelization capability does not come at the expense of single thread/core applications • A single thread/core process may • Decode all entropy slices prior to reconstruction, OR • Decode an entropy slice and then reconstruct without the neighborhood reset • This is friendly to any architecture

  5. Entropy Slices • Advantage #4 – Easy adaptation to decoder design: • The bit-stream can be partitioned into a large number of entropy slices with little overhead • For example, we will show performance of 32 entropy slices for 1080p on the next slide; this would translate to ~128 slices for 4K • The decoder can easily schedule N entropy decoders, where N is arbitrary • One example: for 32 slices, an architecture with a parallelization factor of 4 (N=4) would assign 8 slices per decoder • Another example: for 32 slices, an architecture with N=8 would assign 4 slices per decoder • Additionally, for large resolutions (4K, 8K) it is possible to scale to 100s of decoders for GPU implementations
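The scheduling examples above reduce to dividing M slices among N decoders. A minimal sketch (the function name and the round-robin policy are illustrative assumptions; any balanced assignment works):

```python
def assign_slices(num_slices, num_decoders):
    """Round-robin assignment of entropy slice indices to N decoders."""
    schedule = [[] for _ in range(num_decoders)]
    for s in range(num_slices):
        schedule[s % num_decoders].append(s)  # decoder s mod N takes slice s
    return schedule
```

For 32 slices, N=4 yields 8 slices per decoder and N=8 yields 4, matching the two examples on the slide.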

  6. Entropy Slices • Advantage #5 – Coding efficiency: • Insertion of entropy slices results in negligible impact on coding efficiency • For example, if we configure the encoder for a parallelization factor of 32, we get: [results table not captured in transcript]

  7. Entropy Slices • Advantage #6 – Specification: • Entropy slices allow simple and direct specification of parallelization at the Profile and Level stage • This is accomplished by: • Specifying the maximum number of bins in an entropy slice • Specifying the maximum number of entropy slices per picture • Allows additional specification of PIPE/V2V configurations: • Maximum number of bins per bin coder in an entropy slice • Additional advantage: straightforward to determine conformance at the encoder
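Both limits above are simple counts, which is why encoder-side conformance checking is straightforward. A sketch, with illustrative parameter names (the actual limit names would be defined in the Profile/Level text):

```python
def conforms(bins_per_slice, max_bins_per_slice, max_slices_per_picture):
    """Check a picture's entropy-slice partition against two limits.

    bins_per_slice: bin count of each entropy slice in the picture.
    """
    if len(bins_per_slice) > max_slices_per_picture:
        return False                       # too many entropy slices
    return all(n <= max_bins_per_slice for n in bins_per_slice)
```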

  8. Entropy Slices • Syntax • Slice header • Indicates that the slice is an “entropy slice” • Sends only the information necessary for entropy decoding
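As an illustration of the reduced header, the sketch below keeps only parsing-related fields and drops the prediction-related fields of a full slice header. The exact field set is a hypothetical assumption for this example; the names echo draft-style syntax elements mentioned elsewhere in this deck but this is not a syntax table.

```python
from dataclasses import dataclass

@dataclass
class EntropySliceHeader:
    """Hypothetical minimal header: only what entropy decoding needs."""
    entropy_slice_flag: bool   # marks this slice as an entropy slice
    first_lctb_in_slice: int   # address of the first LCU in the slice
    cabac_init_idc: int        # context initialization index
```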

  9. Conclusions • We have presented the concept of an “entropy slice” for the HEVC system • Advantages include: • Parallel entropy decoding (both context adaptation and bin coding) • Generalization to any entropy coding system under study • No impact on serial implementations • Easy adaptation to different parallelization factors at the decoder • Negligible impact on coding efficiency (<0.2%) • A direct path for specifying parallelization at the profile/level stage • Software is available

  10. Entropy Slices • In the last meeting, two topics were discussed • Size of entropy slice headers • Extension to potential architectures that do not decouple parsing and reconstruction • We address these in the next slides…

  11. Entropy Slices • Header size • Very small (as asserted previously) • Quantitatively: 2 bytes + NALU (1 byte) for 1080p • Scales with resolution due to first_lctb_in_slice
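A back-of-the-envelope check of the size claim above. Only `first_lctb_in_slice` genuinely scales with resolution (it must address any LCU in the picture); the 4 bits of flag overhead and the 64-sample LCU size are assumptions for illustration, not figures from the contribution.

```python
import math

def entropy_slice_header_bytes(width, height, lctb=64, flag_bits=4):
    """Estimate NALU + entropy-slice header size in bytes (sketch)."""
    num_lctbs = math.ceil(width / lctb) * math.ceil(height / lctb)
    addr_bits = max(1, math.ceil(math.log2(num_lctbs)))  # first_lctb_in_slice
    nalu_bytes = 1                                        # NAL unit header
    return nalu_bytes + math.ceil((addr_bits + flag_bits) / 8)
```

Under these assumptions, 1080p (1920x1080) gives 510 LCUs, a 9-bit address, and a 2-byte header plus the 1-byte NALU, consistent with the slide.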

  12. Entropy Slices • Extension to additional architectures • At the previous meeting there was interest in extending the method to architectures that do not buffer symbols between parsing and reconstruction • This anticipates “joint-wave-front” processing of both the parsing and reconstruction loops • We investigated this issue and concluded the following: • In the current TMuC design, we observe that it is not possible to do wavefront processing of the parsing stage • If we configure the TMuC to support wavefront parsing, the extension of entropy slices is straightforward

  13. Entropy Slices • Our approach: provide additional entry points without the neighbor restriction • [Diagram: “entropy slice” entry points, each marked “EC Init”] • Use cabac_init_idc to initialize the entropy coder

  14. Entropy Slices • [Diagram: entropy decoding + reconstruction steps]

  15. Entropy Slices • Syntax • Signal that the bin coding engine will be reset at the start of each LCU row • Allow signaling cabac_init_idc for the reset
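The per-row reset above can be sketched as a decode loop that re-initializes the bin coding engine at every LCU row boundary. `init_coder` and `decode_lcu` are stand-ins for the real bin-coding engine, assumed here for illustration:

```python
def init_coder(cabac_init_idc):
    # Stand-in for bin-coder initialization from cabac_init_idc.
    return {"init_idc": cabac_init_idc, "decoded": 0}

def decode_lcu(coder, lcu):
    coder["decoded"] += 1        # placeholder for real LCU parsing
    return lcu

def decode_rows(lcu_rows, cabac_init_idc):
    out = []
    for row in lcu_rows:
        coder = init_coder(cabac_init_idc)   # reset at each row start:
        out.append([decode_lcu(coder, lcu)   # this is the row entry point
                    for lcu in row])
    return out
```

Because each row starts from a known coder state, any row can begin decoding without waiting for the previous row's parsing to finish.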

  16. Entropy Slices • Performance • 4x parallelism: • Maintain the initial 32x parallelism • Additionally: four entry points in the entropy slice (aligned with LCU rows), resulting in a 4x speedup • RD performance: 0.3% • Max parallelism: • Maintain the initial 32x parallelization • Additionally: one entry point for every LCU row (17x for 1080p) • RD performance: 0.5–1%
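The “17x for 1080p” figure follows from counting LCU rows: with one entry point per row, the row count bounds the extra parallelism. A sketch of the arithmetic, assuming 64-sample LCUs:

```python
import math

def max_row_parallelism(height, lctb=64):
    """Upper bound on row-level parallelism: one entry point per LCU row."""
    return math.ceil(height / lctb)
```

1080 / 64 rounds up to 17 rows, matching the slide; a 2160-line (4K) picture would give 34.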

  17. Entropy Slices • Conclusion • Entropy slices are well tested and flexible • Demonstrated in multiple environments (JM, JM-KTA, TMuC) • Demonstrated with CABAC and CAV2V • Friendly to serial and parallel architectures (including both decoupled and coupled parsing/reconstruction architectures) • From the last meeting: “The basic concept of desiring enhanced high-level parallelism of the entropy coding stage to be in the HEVC design is agreed.” • We propose: • Adoption of the entropy slice technology into the TM • Evaluation of the “joint-wavefront” extension in a CE
