1 / 8

FPGA-Based Prototyping of Multi-Level Computing Architecture for Enhanced Processor Performance

This work presents an FPGA-based prototype of a Multi-Level Computing Architecture (MLCA) aimed at enhancing processor performance through advanced techniques like superscalar execution and dynamic voltage scaling. The control processor, identified as a potential bottleneck, undergoes custom microarchitecture design to address high requirements for memory and facilitate speculative execution. The prototype utilizes the Nios II Development Board and focuses on realistic cycle-accurate performance evaluations. Key contributions involve analyzing task formation and memory management, steering towards scalable processing capabilities.

ferris-ross
Télécharger la présentation

FPGA-Based Prototyping of Multi-Level Computing Architecture for Enhanced Processor Performance

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. FPGA-based Prototyping of the Multi-Level Computing Architecture presented by Davor Capalija Supervisor: Prof. Tarek S. Abdelrahman Connections 2006

  2. A modern processor • Superscalar, out-of-order and speculative execution Control Unit Instruction Queue Memory Register File XU XU XU Execution units

  3. Multi-level Computing Architecture Control Program while(…) { Allocate(out frame) Preprocess(…) Analyze(…) Output(…) } Control Processor Task Scheduler Shared Memory Universal Register File PU PU PU Task instruction Analyze() Preprocess() Allocate() Tasks

  4. Previous work in the MLCA group • Automatic task formation • Kirk Stewart • Compile-time optimizations to extract parallelism • Utku Aydonat • Task memory management • Ahmed Abdelkhalek • Power optimization using dynamic voltage scaling • Ivan Matosevic Work done using a high-level functional simulator

  5. Motivation and goal • Realistic cycle-accurate evaluation using an FPGA-based prototype • Feasibility of hardware implementation • Deliver scalable performance • The control processor is expected to be a bottleneck • Custom hardware design of the control processor • Contribution: microarchitecture of the control processor

  6. Challenges • Mapping the architecture to FPGA device resources • High requirements for on-chip memory: blocks, capacity & ports • System: shared memory, URF • PUs: caches, private and instruction memories • CP: renaming tables, task queues • Control processor microarchitecture design space • Performance vs. area trade-offs • Support for speculative execution of tasks

  7. Status • Initial FGPA-based prototype • Nios II Development Board, Stratix Pro Edition (1S40) • Based on initial implementation by David Han • PUs - Altera Nios II/f processors • Interconnect - Altera Avalon interconnect • Memory - both on-chip & off-chip • Software-based control processor • Emulated on one Nios II/f processor • Determining and removing bottlenecks • Next step: microarchitecture of the Control Processor

  8. Bonus FPGA device Shared memory Universal Register File CP’s mem Ins4 M Priv4 M Ins1 M Priv1 M Ins3 M Priv3 M Ins2 M Priv2 M I$ D$ I$ D$ I$ D$ I$ D$ I$ D$ PU4 PU3 PU1 PU2 CP Comm4 Comm1 Comm2 Comm3 CP RT TQ

More Related