A Study of the Speedups and Competitiveness of FPGA Soft Processor Cores using Dynamic Hardware/Software Partitioning

Roman Lysecky, Frank Vahid*
Department of Computer Science and Engineering, University of California, Riverside
{rlysecky, vahid}@cs.ucr.edu


Presentation Transcript


  1. A Study of the Speedups and Competitiveness of FPGA Soft Processor Cores using Dynamic Hardware/Software Partitioning
     Roman Lysecky, Frank Vahid*
     Department of Computer Science and Engineering, University of California, Riverside
     {rlysecky, vahid}@cs.ucr.edu
     *Also with the Center for Embedded Computer Systems at UC Irvine
     This work was supported in part by the National Science Foundation and the Semiconductor Research Corporation

  2. Introduction: Warp Processors – Dynamic HW/SW Partitioning
     • Study the benefits of warp processing for FPGA soft processor cores
     [Figure: warp processor with µP, I$, D$, Profiler, FPGA, and Dynamic Part. Module (DPM), annotated with five numbered steps]
     1 Initially execute application in software only
     2 Profile application to determine critical regions
     3 Partition critical regions to hardware
     4 Program configurable logic & update software binary
     5 Partitioned application executes faster with lower energy consumption
     R. Lysecky
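The five steps on this slide can be sketched in software as follows. This is only an illustrative model, not the authors' implementation: the region names, the `trace` list, and the dict-based "binary" are hypothetical stand-ins, and picking the single most frequently executed region is an assumed selection policy.

```python
from collections import Counter

def profile(trace):
    """Step 2: count how often each region label appears in an execution trace."""
    return Counter(trace)

def pick_critical_region(counts):
    """Return the most frequently executed region (assumed selection policy)."""
    region, _ = counts.most_common(1)[0]
    return region

def partition(binary, region):
    """Steps 3-4: mark one region as hardware and return the updated 'binary'.

    'binary' is a hypothetical dict mapping region name -> "sw" or "hw".
    """
    updated = dict(binary)
    updated[region] = "hw"
    return updated

# Step 1: the application starts out entirely in software.
binary = {"init": "sw", "fir_loop": "sw", "output": "sw"}

# Hypothetical execution trace in which one loop dominates.
trace = ["init"] + ["fir_loop"] * 1000 + ["output"]

counts = profile(trace)
critical = pick_critical_region(counts)
binary = partition(binary, critical)

# Step 5 would be the partitioned application running with the hot region in hardware.
print(critical, binary)
```

In this toy model only the bookkeeping is shown; the synthesis and FPGA configuration that the DPM performs are outside its scope.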

  3. Introduction: Warp Processors – Dynamic HW/SW Partitioning
     • Dynamic HW/SW Partitioning
       • Enabler – Synthesis from Binaries [Stitt & Vahid, 2005][Stitt & Vahid, 2002]
     • Advantages
       • Does not require any special compilers
       • Completely transparent
       • Provides separation of function and architecture
       • Avoids complexities of supporting different FPGAs
       • Opens additional market segments (i.e., all software developers) that otherwise would not use FPGAs and CAD
     [Figure: tool flow with SW, Standard Compiler, Binary, Profiling, CAD Tools, Proc. and FPGA; annotation: "Traditional partitioning done here"]
     Dynamic Hardware/Software Partitioning: A First Approach, DAC’03
     A Configurable Logic Fabric for Dynamic Hardware/Software Partitioning, DATE’04
     Dynamic FPGA Routing for Just-in-Time FPGA Compilation, DAC’04

  4. Introduction: Soft Processor Cores
     • FPGA vendors currently provide soft processor cores
       • Xilinx – PicoBlaze and MicroBlaze
       • Altera – NIOS and NIOS II
     • Advantages
       • Configurability
         • Add custom instructions/coprocessors
         • Configurable instruction/data caches
       • Quickly integrate processor within any FPGA
       • Easy to build multi-processor systems
     • Disadvantages
       • Higher power consumption and decreased performance
         • Compared to hard-core embedded processor
     • How can warp processing benefit soft processor cores?
     [Figure: Proc. with I$ and D$, and Proc. on an FPGA]

  5. Introduction: MicroBlaze Soft Processor Core
     • MicroBlaze Soft Processor Core
       • 32-bit configurable processor core with three-stage pipeline
       • Execution frequency as high as 150 MHz
         • 85 MHz using Spartan3 FPGA
       • Configurable instruction and data caches
       • Configurable HW datapath components
         • Multiplier to support mul instruction
         • Divider to support idiv instruction
         • Barrel shifter to support bs and bsi instructions
     [Figure: MicroBlaze system on an FPGA: MicroBlaze, Instr. BRAM (i_lmb, lmb_cntrl), Data BRAM (d_lmb, lmb_cntrl), and Periph1/Periph2 on opb]

  6. MicroBlaze Warp Processor: Single Processor System
     [Figure: MicroBlaze system on an FPGA (MicroBlaze, Instr. BRAM, Data BRAM, i_lmb and d_lmb with lmb_cntrl, opb with Periph1 and Periph2) extended with Profiler, BRAM Interface, WCLA, and Dynamic Part. Module (DPM)]
     [Figure: DPM stages shown: Decompilation, Partitioning, RT Synthesis, JIT FPGA Compilation, Binary Updater; data shown: Binary, Std. HW Binary, HW Bitstream, Updated Binary]
     A Configurable Logic Fabric for Dynamic Hardware/Software Partitioning, DATE’04
     Dynamic FPGA Routing for Just-in-Time FPGA Compilation, DAC’04

  7. MicroBlaze Warp Processor: Multi-Processor System
     [Figure: three MicroBlaze cores on one FPGA, each with its own Instr. BRAM and Data BRAM, and one Dynamic Part. Module (DPM)]
     A Configurable Logic Fabric for Dynamic Hardware/Software Partitioning, DATE’04
     Dynamic FPGA Routing for Just-in-Time FPGA Compilation, DAC’04

  8. MicroBlaze Warp Processor: Warp Configurable Logic Architecture (WCLA)
     • Warp Configurable Logic Architecture (WCLA)
       • Data address generators (DADG) and loop control hardware (LCH)
         • Provides fast, efficient coprocessor interface
       • Fast, single-cycle 32-bit multiplier-accumulator (MAC)
     • Ideally, WCLA would use existing FPGA for configurable logic
       • JIT FPGA compilation tools currently only support our custom CAD-oriented FPGA
     [Figure: WCLA internals: DADG & LCH, Reg0, Reg1, Reg2, 32-bit MAC, and configurable logic labeled "Existing FPGA (250 MHz)"]
     [Figure: system on an FPGA: Profiler, MicroBlaze, Instr. BRAM, BRAM Intrf., Data BRAM, DPM, WCLA, opb with P1 and P2; configurable logic labeled "Custom FPGA (250 MHz)"]
     A Configurable Logic Fabric for Dynamic Hardware/Software Partitioning, DATE’04
     Dynamic FPGA Routing for Just-in-Time FPGA Compilation, DAC’04
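To give a feel for the kind of loop the WCLA components named above are suited to, here is a small software model of a dot-product kernel. It is a sketch under assumptions, not a description of the actual hardware: the function names `dadg` and `mac_loop`, the word-addressed `memory` list, and the unit stride are all hypothetical, and the 32-bit wraparound is an assumed behavior for the accumulator.

```python
MASK32 = 0xFFFFFFFF  # assumed: accumulator wraps at 32 bits

def dadg(base, stride, count):
    """Model of a data address generator: yields base, base+stride, ... (count addresses)."""
    for i in range(count):
        yield base + i * stride

def mac_loop(memory, a_base, b_base, n):
    """Dot product over n elements, one multiply-accumulate per iteration."""
    acc = 0
    # Iterating both address streams in lockstep stands in for loop control hardware.
    for a_addr, b_addr in zip(dadg(a_base, 1, n), dadg(b_base, 1, n)):
        acc = (acc + memory[a_addr] * memory[b_addr]) & MASK32
    return acc

# Hypothetical word-addressed memory holding two 4-element vectors.
memory = [1, 2, 3, 4, 10, 20, 30, 40]
result = mac_loop(memory, 0, 4, 4)
print(result)  # 1*10 + 2*20 + 3*30 + 4*40 = 300
```

The point of the model is that address generation, loop control, and the multiply-accumulate are all regular enough to be handled by dedicated components, leaving only the remaining logic for the configurable fabric.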

  9. MicroBlaze Warp Processor: Performance Speedup (Single Critical Kernel)
     [Figure: performance speedup chart]
     • Average speedup of 5.8X using warp processing
     • MicroBlaze warp processor is on average 1.3X faster than 325 MHz ARM10
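When only a single critical kernel is moved to hardware, the overall speedup depends on how much of the run time that kernel accounts for. The standard relation for this is Amdahl's law, sketched below. The presentation does not state that its figures were derived this way, and the fraction and kernel speedup used in the example are made-up illustrative values, not measurements from the slides.

```python
def overall_speedup(kernel_fraction, kernel_speedup):
    """Amdahl's law: speedup of the whole application when a kernel that takes
    'kernel_fraction' of the original run time is accelerated by 'kernel_speedup'."""
    remaining = 1.0 - kernel_fraction            # part that still runs in software
    accelerated = kernel_fraction / kernel_speedup
    return 1.0 / (remaining + accelerated)

# Illustrative values only: a kernel taking 90% of run time, made 10x faster.
example = overall_speedup(0.9, 10.0)
print(round(example, 2))
```

The relation also shows why the untouched software portion caps the achievable gain: with a 90% kernel, even an arbitrarily fast hardware implementation cannot exceed a 10X overall speedup.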

  10. MicroBlaze Warp Processor: Energy Consumption (Single Critical Kernel)
     [Figure: energy consumption chart]
     • Average energy reduction of 57% using warp processing
     • MicroBlaze warp processor requires on average 26% less energy than 325 MHz ARM10

  11. MicroBlaze Warp Processor: Conclusions & Future Work
     • Conclusions
       • Studied the benefits of warp processing for FPGA soft processor cores (MicroBlaze)
         • Average speedups of 5.8X (>10X possible for some applications)
         • Average energy reduction of 58%
       • Demonstrated MicroBlaze warp processor is competitive with hard-core embedded processors
         • Speedup of 1.3X compared to 325 MHz ARM10
         • Energy reduction of 26% compared to 325 MHz ARM10
     • Future Work
       • Prototyping our custom FPGA and warp processors
       • Supporting a wider range of applications (PDA/desktop/server)
       • Incorporating advances in on-chip configurable structures
