
Tuning SoC Platforms for Multimedia Processing: Identifying Limits and Tradeoffs

Samarjit Chakraborty. Joint work with Alexander Maxiaguine (ETH Zurich), Yongxin Zhu, and Weng-Fai Wong.


Presentation Transcript


  1. Tuning SoC Platforms for Multimedia Processing: Identifying Limits and Tradeoffs. Samarjit Chakraborty. Joint work with Alexander Maxiaguine (ETH Zurich), Yongxin Zhu, and Weng-Fai Wong

  2. Background and Motivation • SoC platforms for multimedia processing • Example: Eclipse template and Viper SoC architecture • Advantages and disadvantages • Flexibility, low design costs, time-to-market advantages … • Large disparity in performance compared to customized ASIC-based solutions • Using configurable platforms • Tuning, platform configuration, and management techniques that improve performance involve several tradeoffs

  3. Example Platform Architecture: a set-top box device [Figure labels: Network, Network Interface, PE1, PE2; video decoding: VLD, IQ, IDCT, MC, buffers Bv, B1, B2, output Vout; audio decoding: MP3, buffers B3, Ba, output Aout; tunable parameters: buffer sizes, scheduling policy] • Most design space exploration techniques rely on simulation • Artemis project: trace-driven co-simulation and symbolic execution of applications

  4. Exploring the Platform Configuration Design Space [Figure: two design flows] • Purely simulation-oriented approach: choose configuration parameters, generate design point, simulate system • Resorting to simulation only once at the beginning: simulate system & estimate parameters, choose configuration parameters, generate design point, analytically evaluate system

  5. Exploring the Platform Configuration Design Space Analytical framework to identify tradeoffs between platform configurations and management techniques Main tool: A new technique to capture the variability in multimedia workloads • Outline of the talk: • Main difficulty • Description of the technique • A case study

  6. Modeling a Platform Architecture [Figure labels: network, infinite sequence of “stream objects”, buffer, processor, processed “stream objects”, playout buffer, output device] • Task structure • A set of concurrently executing tasks that exchange information through unidirectional data streams

  7. Tuning SoC Platforms: Main Challenge • Complex and bursty on-chip traffic • Reasons: • high data-dependent variability in execution time requirements • variability in input-output rates associated with tasks (e.g. variable length decoding in MPEG-2)

  8. How do we Capture this Variability? • Reasons: • high data-dependent variability in execution time requirements • variability in input-output rates associated with tasks (e.g. variable length decoding in MPEG-2) • Model? • Statistical methods (e.g. Varatkar & Marculescu TVLSI’04) • Deterministic best/worst case characterization

  9. Best/Worst Case Characterization? [Figure labels: worst-case execution time, best-case execution time] • Classical real-time systems • If e is the best/worst case execution requirement of one stream object, then for k objects it is ke • Can we provide a better bound? • Use “variability characterization curves” (VCCs)

  10. Variability Characterization Curves (VCCs) [Figure labels: code, sequence of input stream objects] • Each execution of the code • consumes a variable number of stream objects • produces a variable number of stream objects • requires a variable number of processor cycles

  11. Variability Characterization Curves (VCCs) [Figure labels: code, sequence of input stream objects] • Arrival of stream objects at the input is bursty • The processor also might not always be available (because of other tasks or multiple streams being processed on it)

  12. Variability Characterization Curves (VCCs) • Each execution of the code • consumes a variable number of stream objects (consumption curve) • produces a variable number of stream objects (production curve) • requires a variable number of processor cycles (workload curve) • Arrival of stream objects at the input is bursty (arrival curve) • The processor also might not always be available, because of other tasks or multiple streams being processed on it (service curve)

  13. Variability Characterization Curves (VCCs) • Best/worst-case characterization of sequences • Consumption/production/workload curves • sequences of consecutive executions of the code • Arrival/service curves • sequences of consecutive time units • Slide a window over the sequence and record the max and min in this window
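
The window idea on this slide can be written compactly. The symbols below are assumptions introduced for illustration and do not appear on the slides: $e_i$ is the number of processor cycles needed by the $i$-th execution of the code, and $C(t)$ is the cumulative service offered up to time $t$. Under that notation, a sketch of the upper/lower workload and service curves is:

```latex
\gamma^{u}(k) = \max_{j \ge 1} \sum_{i=j}^{j+k-1} e_i ,
\qquad
\gamma^{l}(k) = \min_{j \ge 1} \sum_{i=j}^{j+k-1} e_i ,
\qquad
\beta^{u}(\Delta) = \max_{t \ge 0} \bigl( C(t+\Delta) - C(t) \bigr) ,
\qquad
\beta^{l}(\Delta) = \min_{t \ge 0} \bigl( C(t+\Delta) - C(t) \bigr)
```

With $e^{\max} = \max_i e_i$, the classical bound from slide 9 is $k \, e^{\max}$, and $\gamma^{u}(k) \le k \, e^{\max}$ always holds, so the curve is never looser than the per-object worst case.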

  14. Example: Workload Curve [Figure labels: execution time vs. event #; sum vs. event #, with the max difference marked; upper workload curve vs. sequence length] • max execution time for any sequence of 50 events = 200
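
A minimal sketch of how an upper and lower workload curve could be extracted from a recorded trace, assuming the sliding-window reading of the slide. The function name `workload_curves` and the trace values are hypothetical, and curves obtained from a finite trace only bound that trace, not every possible input.

```python
def workload_curves(exec_times, max_len=None):
    """Return (upper, lower): upper[k] / lower[k] is the largest / smallest
    total execution time of any k consecutive events in the trace."""
    n = len(exec_times)
    max_len = n if max_len is None else min(max_len, n)

    # prefix[i] holds the summed execution time of the first i events,
    # so any window sum is a difference of two prefix values.
    prefix = [0]
    for e in exec_times:
        prefix.append(prefix[-1] + e)

    upper, lower = [0], [0]
    for k in range(1, max_len + 1):
        window_sums = [prefix[j + k] - prefix[j] for j in range(n - k + 1)]
        upper.append(max(window_sums))
        lower.append(min(window_sums))
    return upper, lower


# Hypothetical per-event cycle counts (not taken from the talk).
trace = [2, 9, 3, 1, 8, 2, 2, 7]
upper, lower = workload_curves(trace)

# Compare against the classical k * WCET bound for windows of 3 events.
wcet = max(trace)
print(upper[3], 3 * wcet)
```

For this trace the worst window of three events needs 14 cycles, while the per-event worst case would budget 27, which is the kind of gap the VCC characterization is meant to expose.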

  15. Example: Workload Curve Same long-term behavior, but different burstiness on smaller time scales

  16. Example: Service Curve • A VCC represents a family of instances [Figure labels: computing power in time interval [0,2]; maximum/minimum computing power in any interval of length 2]

  17. Determining System Properties: Computing with Curves [Figure labels: input stream, service, PE, output stream, remaining service]
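
The slide names the "remaining service" left over after a processing element has served a stream but does not give a formula. One standard choice from the real-time calculus literature, assumed here and not confirmed by the transcript, is to take the running maximum of the gap between the lower service curve and the upper arrival curve. The sketch below uses curves sampled at integer interval lengths, both expressed in the same unit (for example processor cycles); all names and numbers are hypothetical.

```python
def remaining_service(alpha_upper, beta_lower):
    """Lower bound on the service left for other streams.

    alpha_upper[d]: most work that can arrive in any window of length d.
    beta_lower[d]:  least service offered in any window of length d.
    Both lists start at d = 0 with value 0.
    """
    remaining, best = [], 0
    for demand, supply in zip(alpha_upper, beta_lower):
        # Running maximum of (supply - demand); never drops below 0
        # because both curves are 0 at d = 0.
        best = max(best, supply - demand)
        remaining.append(best)
    return remaining


# Hypothetical curves: 4 cycles of service per time unit, and a stream
# that can demand a burst of 5 cycles followed by 1 cycle per time unit.
beta = [0, 4, 8, 12, 16, 20]
alpha = [0, 5, 6, 7, 8, 9]
left_over = remaining_service(alpha, beta)
print(left_over)
```

The resulting curve can then serve as the service curve of a lower-priority stream on the same processing element.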

  18. Determining System Properties [Figure labels: arrival, max. delay, max. memory] • Max/Min buffer fill level • Max/Min delay • Utilization • …
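
A sketch of how the "max. memory" and "max. delay" marked in the figure could be computed, assuming they are the usual vertical and horizontal distances between an upper arrival curve and a lower service curve. Both curves are sampled at integer interval lengths and use the same unit; the function names and the example curves are hypothetical.

```python
def max_backlog(alpha, beta):
    """Largest vertical gap alpha[d] - beta[d] over the sampled horizon."""
    return max(max(a - b, 0) for a, b in zip(alpha, beta))


def max_delay(alpha, beta):
    """Largest horizontal gap: for each d, the smallest shift s such that
    beta[d + s] >= alpha[d]. Returns None if beta is sampled too briefly
    to catch up with alpha."""
    worst = 0
    for d, demand in enumerate(alpha):
        s = 0
        while d + s < len(beta) and beta[d + s] < demand:
            s += 1
        if d + s >= len(beta):
            return None
        worst = max(worst, s)
    return worst


# Hypothetical curves: a burst of 2 objects plus 1 object per time unit,
# served after a latency of 3 time units at 2 objects per time unit.
alpha = [0] + [2 + d for d in range(1, 11)]
beta = [max(0, 2 * (d - 3)) for d in range(21)]

backlog = max_backlog(alpha, beta)
delay = max_delay(alpha, beta)
print(backlog, delay)
```

Sampling `beta` over a longer horizon than `alpha` gives the delay search room to find the point where the service catches up.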

  19. Determining System Properties [Figure labels: CPU1, CPU2; Rate Monotonic, Proportional Share; tasks e1, e2, e3, e4; periodic streams with P1=7 and P2=11; unknown outputs marked “?”] • Periodic stream • Execution requirement (can be represented as a workload curve)

  20. Determining System Properties [Figure labels: input streams; Rate Monotonic, Proportional Share; tasks e1, e2, e3, e4; remains periodic; becomes bursty; becomes bursty; burstiness increases]

  21. Case Study: a set-top box device [Figure labels: Network, Network Interface, PE1, PE2; video decoding: VLD, IQ, IDCT, MC, buffers Bv, B1, B2, output Vout; audio decoding: MP3, buffers B3, Ba, output Aout; tunable parameters: buffer sizes, scheduling policy] • Tradeoffs between TDMA period and buffer sizes • Large period → low overhead but larger buffers • Small period → high overhead but smaller buffers
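
The period/buffer tradeoff can be illustrated with a toy model. Everything below is an assumption made for illustration: the stream owns a fixed share of each TDMA period, every slot loses a fixed number of ticks to switching overhead, arrivals follow a burst-plus-rate curve, and all numbers are invented. The actual case-study results are in the plot on the next slide, not in this sketch.

```python
def tdma_lower_service(delta, period, slot, capacity):
    """Guaranteed work in any window of `delta` ticks when the stream owns
    `slot` ticks of every `period` ticks. Worst case: the window opens
    just after the slot has closed."""
    full, rest = divmod(delta, period)
    return capacity * (full * slot + max(0, rest - (period - slot)))


def buffer_bound(period, share=0.5, overhead=1, capacity=10, burst=5, rate=3):
    """Backlog bound in work units, or None if the usable slot cannot
    sustain the long-term arrival rate."""
    slot = int(period * share) - overhead  # usable ticks per period
    if slot <= 0 or capacity * slot < rate * period:
        return None
    worst = 0
    for delta in range(1, 4 * period + 1):
        arrived = burst + rate * delta
        served = tdma_lower_service(delta, period, slot, capacity)
        worst = max(worst, arrived - served)
    return worst


for period in (4, 8, 16, 32):
    print(period, buffer_bound(period))
```

In this toy setting the shortest period is not sustainable because overhead consumes half of the slot, while each doubling of the period beyond that lowers the relative overhead and raises the buffer bound.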

  22. Case Study [Plot; axis label: buffer space (#bits × 10^7)]

  23. The Bottomline • Communication templates and computation templates [Figure labels: RISC, DSP, SDRAM, Arbiter, image coprocessor, FPGA, CAN interface, µC] • Characterize using VCCs (instruction set simulation / cycle accurate simulation / databook) • Architecture: tune/configure [Figure labels: TDMA, EDF, proportional share, FCFS, WFQ, priority (dynamic, fixed, static)] • Discrete event simulation is NOT required • Use the proposed method!

  24. The Bottomline [Figure: design flow] • simulate system & estimate parameters • choose configuration parameters • generate design point • analytically evaluate system

  25. Questions!
