
Instruction-based System-level Power Evaluation of System-on-a-chip Peripheral Cores

Joerg Henkel, NEC C&C Research, Princeton, New Jersey. Tony Givargis, Frank Vahid*, Dept. of Computer Science & Engineering, University of California, Riverside.


Presentation Transcript


  1. Instruction-based System-level Power Evaluation of System-on-a-chip Peripheral Cores
     Joerg Henkel, NEC C&C Research, Princeton, New Jersey
     Tony Givargis, Frank Vahid*, Dept. of Computer Science & Engineering, University of California, Riverside
     *Also with the Center for Embedded Computer Systems, UC Irvine
     This work was supported by the National Science Foundation under grant # CCR-9876006, and by a Design Automation Conference graduate scholarship.

  2. System-on-a-chip (SOC)
     • Want to explore alternative cores, parameter settings, and applications
     • Gate/RT-level simulation is too slow
     [Figure: an SOC (microprocessor, cache, memory, bridge, Peripheral1, Peripheral2) running Application1 or Application2, next to a core database offering Peripheral1, Peripheral2_a, Peripheral2_b, ….]

  3. System-level Power Estimation
     • Microprocessor
       • Tiwari/Malik/Wolfe 94: instruction set simulator
       • Marculescu/Pedram 96: instruction trace reduction
     • Microprocessor plus cache, memory & bus
       • Simunic/Benini/DeMicheli 99: extended instruction simulator
       • Givargis/Vahid/Henkel 99: trace reductions
     • Still need a system-level method for peripherals
     • 3-step method
     [Figure: SOC gate-level model and SOC system-level model, each showing the application, microprocessor, cache, memory, bridge, and peripherals.]

  4. Core Provider’s Step 1: Instruction-based System-Level Model Creation
     • System simulation model already commonly used, and required in the VSIA standard
     • Executes ~1000x faster than the gate-level model
     [Figure: the UART system-level model, with instructions Reset(), Enable_tx(), Enable_rx(), Send(), and Receive(), stored in the core database (UART, JPEG decode, ….).]

  5. Core Provider’s Step 2: Low-level Per-instruction Power Evaluation
     • Measure power of the gate/layout model, per instruction
     • Use a unique testbench per instruction; this may take hours or days
     • The low-level model differentiates cores from other SOC modules, enabling accurate power estimation
     • Must account for core parameters
     Energy per UART instruction, by buffer size:
     Instruction   2 bytes   4 bytes   8 bytes   16 bytes
     Reset         13 µJ     13 µJ     14 µJ     14 µJ
     Enable_tx     23 µJ     25 µJ     24 µJ     24 µJ
     Enable_rx     18 µJ     19 µJ     19 µJ     19 µJ
     Send          76 µJ     77 µJ     89 µJ     115 µJ
     Receive       44 µJ     49 µJ     55 µJ     64 µJ
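The per-instruction table above can be treated as a lookup keyed by instruction and core parameter. The following is a minimal sketch, not the authors' tooling: the dictionary layout and the names `ENERGY_UJ`, `instruction_energy`, and `trace_energy` are hypothetical, and the values are assumed to be microjoules, as the `uJtot` counter on the back-annotation slide suggests.

```python
# Per-instruction energy, indexed by instruction name and UART buffer size in bytes.
# Numbers are copied from the slide; the unit (microjoules) is an assumption.
ENERGY_UJ = {
    "Reset":     {2: 13, 4: 13, 8: 14, 16: 14},
    "Enable_tx": {2: 23, 4: 25, 8: 24, 16: 24},
    "Enable_rx": {2: 18, 4: 19, 8: 19, 16: 19},
    "Send":      {2: 76, 4: 77, 8: 89, 16: 115},
    "Receive":   {2: 44, 4: 49, 8: 55, 16: 64},
}


def instruction_energy(instruction, buffer_bytes):
    """Return the characterized energy of one instruction for one buffer size."""
    return ENERGY_UJ[instruction][buffer_bytes]


def trace_energy(trace, buffer_bytes):
    """Sum the energy of a sequence of instruction names for one buffer size."""
    return sum(instruction_energy(name, buffer_bytes) for name in trace)


if __name__ == "__main__":
    trace = ["Reset", "Enable_tx", "Send", "Send"]
    for size in (2, 4, 8, 16):
        print(size, "bytes:", trace_energy(trace, size))
```

Keeping the buffer size as a lookup key is one way to "account for core parameters": the core provider characterizes each configuration once, and the user only selects a column.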

  6. Energy Reset 13 J Enable_tx 23 J Enable_rx 18 J Send 76 J Receive 44 J Core Provider’s Step 3: Back Annotation of System Model Core database Reset() … uJtot += 13 Enable_tx() … uJtot += 23 Enable_rx() … uJtot += 18 Send() … uJtot += 76 Rcceive() … uJtot += 44 UART UART UART JPEG decode ….

  7. Core “Power Modes” Requires Extra Effort by Core Provider
     • Unlike a microprocessor, certain peripheral core instructions can greatly modify the power consumption of other instructions
     • Must create a power mode transition function, and measure power per instruction per mode
     Mode transitions: Mode 1 (Idle) → Mode 2 (Enabled) on Enable_tx or Enable_rx; Mode 2 (Enabled) → Mode 1 (Idle) on Reset
     Mode 1: Idle
     Instruction   2 bytes   4 bytes   8 bytes   16 bytes
     Reset         11 µJ     13 µJ     14 µJ     14 µJ
     Enable_tx     27 µJ     32 µJ     31 µJ     31 µJ
     Enable_rx     17 µJ     18 µJ     19 µJ     18 µJ
     Send          17 µJ     19 µJ     19 µJ     20 µJ
     Receive       14 µJ     15 µJ     17 µJ     18 µJ
     Mode 2: Enabled
     Instruction   2 bytes   4 bytes   8 bytes   16 bytes
     Reset         13 µJ     13 µJ     14 µJ     14 µJ
     Enable_tx     23 µJ     25 µJ     24 µJ     24 µJ
     Enable_rx     18 µJ     19 µJ     19 µJ     19 µJ
     Send          76 µJ     77 µJ     89 µJ     115 µJ
     Receive       44 µJ     49 µJ     55 µJ     64 µJ
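The power-mode idea can be sketched as a small state machine: a transition function picks the next mode, and energy is looked up per instruction per mode. The sketch below uses the 2-byte-buffer column of the two tables. The names `next_mode` and `trace_energy` are hypothetical, and two points are assumptions rather than statements from the slides: that each instruction is charged at the mode in effect when it starts, and that the core begins in Idle.

```python
IDLE, ENABLED = "Idle", "Enabled"

# Energy per instruction per mode, 2-byte buffer column (unit assumed to be microjoules).
ENERGY_UJ = {
    IDLE:    {"Reset": 11, "Enable_tx": 27, "Enable_rx": 17, "Send": 17, "Receive": 14},
    ENABLED: {"Reset": 13, "Enable_tx": 23, "Enable_rx": 18, "Send": 76, "Receive": 44},
}


def next_mode(mode, instruction):
    """Power mode transition function: enabling moves to Enabled, Reset returns to Idle."""
    if instruction in ("Enable_tx", "Enable_rx"):
        return ENABLED
    if instruction == "Reset":
        return IDLE
    return mode  # Send and Receive leave the mode unchanged


def trace_energy(trace, mode=IDLE):
    """Return (total energy, final mode) for a sequence of instruction names."""
    total = 0
    for instruction in trace:
        total += ENERGY_UJ[mode][instruction]  # charged at the mode in effect before the instruction
        mode = next_mode(mode, instruction)
    return total, mode


if __name__ == "__main__":
    print(trace_energy(["Reset", "Enable_tx", "Send", "Send", "Reset", "Send"]))
```

In this sketch a Send issued while Idle costs 17 rather than 76, which is the kind of difference a single-mode table cannot express.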

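  8. User Performs System Simulation, Which Yields Power Data
     • Simulation takes only seconds or minutes
     [Figure: the application runs on the SOC system-level model (microprocessor, cache, memory, bridge, peripherals), with peripherals such as the UART drawn from the core database (UART, JPEG decode, ….); the energy of each component is summed into the total energy.]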

  9. Results: Image-decode Accelerator
     • Examined 3 peripheral cores: UART, DMA, JPEG
     • Compared our instruction-based system-level method with:
       • Gate-level simulation: slow but accurate
       • “Databook” RT-level: cycle-accurate simulation, using databook average-power values
     Energy (mJ) per core, with error relative to gate-level:
     Method                      Simulation time   UART        DMA         JPEG
     Gate-level                  40,980 sec        113         519         1573
     “Databook” RT-level         2,700 sec         155 (37%)   717 (38%)   1793 (14%)
     Instr.-based system-level   14 sec            115 (2%)    493 (5%)    1550 (1%)
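The percentages on this slide are consistent with a relative error against the gate-level energy. The sketch below checks that arithmetic. The assignment of each energy value to a method is an inference from those percentages, not something the transcript states explicitly, and the names `percent_error` and `speedup` are hypothetical.

```python
# Energy in mJ per core; which value belongs to which method is inferred from the error labels.
GATE_LEVEL = {"UART": 113, "DMA": 519, "JPEG": 1573}
DATABOOK_RT = {"UART": 155, "DMA": 717, "JPEG": 1793}
INSTR_BASED = {"UART": 115, "DMA": 493, "JPEG": 1550}

# Simulation times in seconds, as listed in the chart legend.
SIM_TIME_SEC = {"gate": 40980, "databook_rt": 2700, "instr_based": 14}


def percent_error(estimate, reference):
    """Absolute relative error of an estimate against a reference, in percent."""
    return abs(estimate - reference) / reference * 100


def speedup(slow_sec, fast_sec):
    """How many times faster the second simulation is than the first."""
    return slow_sec / fast_sec


if __name__ == "__main__":
    for core, reference in GATE_LEVEL.items():
        print(core,
              "databook: %.0f%%" % percent_error(DATABOOK_RT[core], reference),
              "instr-based: %.0f%%" % percent_error(INSTR_BASED[core], reference))
    print("speedup over gate-level: %.0fx"
          % speedup(SIM_TIME_SEC["gate"], SIM_TIME_SEC["instr_based"]))
```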

  10. Results: Importance of Power Modes
     • Proper power-mode selection is critical for peripheral cores
     • Too few modes or the wrong modes can lead to large errors
     UART example (gate-level energy: 113 mJ):
     Power modes   System-level energy (mJ)   Error
     Single-mode   86                         23.0%
     Two-modes     104                        8.6%
     Four-modes    115                        1.7%

  11. Conclusions
     • The introduced instruction-based method is:
       • Accurate (less than 5% error)
       • Fast (1000x speedup over gate-level)
       • A fit with current core-based methodology
     • The concept of power modes is necessary for accuracy
     • Future work includes:
       • Trace-simulator-based approach (10x speedup)
       • Trace-analysis-based approach (100x speedup)
