
Instruction Generation for Hybrid Reconfigurable Systems

Ryan Kastner, Seda Ogrenci-Memik, Elaheh Bozorgzadeh and Majid Sarrafzadeh, {kastner,seda,elib,majid}@cs.ucla.edu. Embedded and Reconfigurable Systems Group, Computer Science Department, UCLA, Los Angeles, CA 90095.


Presentation Transcript


  1. Instruction Generation for Hybrid Reconfigurable Systems • Ryan Kastner, Seda Ogrenci-Memik, Elaheh Bozorgzadeh and Majid Sarrafzadeh • {kastner,seda,elib,majid}@cs.ucla.edu • Embedded and Reconfigurable Systems Group, Computer Science Department, UCLA, Los Angeles, CA 90095

  2. Outline • Introduction • Programmability • Hybrid Reconfigurable Systems • Strategically Programmable System • Instruction Generation • Uses in Hybrid Reconfigurable Systems • Relation to Template Generation and Matching • Algorithm for Template Generation and Matching • Experiments • Conclusion

  3. Programmability • Future systems need programmability at multiple levels of the computational hierarchy • Computational hierarchy: gate level, micro-architecture level, architecture level • Hybrid reconfigurable systems have programmability at one or more levels • [Figure: example units at each level, including control, functional units (FU), memory, registers, a register bank, ADD and MUL]

  4. Tradeoffs • Tradeoff axes: configuration time (thousands of cycles vs. hundreds of cycles) and flexibility • Hybrid reconfigurable systems should find a happy medium

     | | Gate level | Micro-architecture level | Architecture level |
     |---|---|---|---|
     | Types of programmable units | CLBs, LUTs | Datapath unit, control unit, RAM | Custom instructions, register banks |
     | Example platform | Xilinx, Altera | Chameleon Systems | Tensilica, Improv |

  5. SPS - Strategically Programmable System • Embed (hard or soft) computational units, called Versatile Programmable Blocks (VPBs), into an FPGA-like fabric • Combine programmable units from the gate, micro-architecture and architecture levels • Balance flexibility and configuration time • Need an automated method of determining the functionality of VPBs • [Figure: VPBs and memory blocks embedded in the fabric]

  6. Overview of SPS • Input: set of applications specified in high-level code (C/C++, Fortran, MOC) • SPS Compiler: compile to a low-level specification; determine VPB functionality • SPS Architecture Generation: module placement, VPB synthesis, routing architecture • Output: SPS architecture

  7. VPB Instruction Generation • Given a set of applications, what computation should be implemented on VPBs? • Want complex, commonly occurring computation patterns • Look for computational patterns at the instruction level • Basic operations are add, multiply, shift, etc. • [Figure: set of applications mapped onto a fabric of RAM and VPB blocks]

  8. Problem Definition • Determining VPB functionality requires regularity extraction • Regularity extraction: find common sub-structures (templates) in one graph or a collection of graphs • Each application can be specified by a collection of graphs (CDFGs) • Templates are implemented as VPBs • Two related sub-problems: • Template Matching • Template Generation

  9. Template Matching – Formal Def’n • Problem 1: Given a directed, labeled graph G(N, A) and a library of templates, each of which is a directed labeled graph T_i(V, E), find every subgraph of G that is isomorphic to any T_i • [Figure: directed labeled graph G with operation nodes (+, *, %, &, ||) and a template library T1 to T6]
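
Problem 1 above can be illustrated with a brute-force matcher. This is a minimal sketch, not the authors' method: the graph encoding (a label dictionary plus a set of directed edges), the function name `find_matches`, and the example graph are all assumptions made for illustration, and the check only requires that every template edge is present in G (it does not require the match to be an induced subgraph).

```python
from itertools import permutations

def find_matches(g_labels, g_edges, t_labels, t_edges):
    """Return every mapping template-node -> graph-node that preserves
    node labels and maps each template edge onto an edge of the graph."""
    t_nodes = list(t_labels)
    matches = []
    # Try every injective assignment of graph nodes to template nodes.
    for cand in permutations(g_labels, len(t_nodes)):
        mapping = dict(zip(t_nodes, cand))
        if any(g_labels[mapping[t]] != t_labels[t] for t in t_nodes):
            continue
        if all((mapping[u], mapping[v]) in g_edges for u, v in t_edges):
            matches.append(mapping)
    return matches

# Hypothetical dataflow graph: two multiplies feed an add, which feeds another add.
g_labels = {1: "*", 2: "*", 3: "+", 4: "+"}
g_edges = {(1, 3), (2, 3), (3, 4)}
# Hypothetical template: a multiply feeding an add.
t_labels = {"a": "*", "b": "+"}
t_edges = {("a", "b")}

matches = find_matches(g_labels, g_edges, t_labels, t_edges)
print(matches)
```

The enumeration is exponential in the template size, so it is only suitable for the small templates used as examples here.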

  10. Template Matching – Formal Def’n • Problem 2: Given an infinite number of each of a set of templates Ω = {T_1, …, T_k} and an overlapping set of subgraphs of the given graph G(N, A) which are isomorphic to some member of Ω, minimize k as well as Σ x_i, where x_i is the number of templates of type T_i used, such that the number of nodes left uncovered is the minimum • [Figure: graph G covered by template instances]
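
The quantities in Problem 2 can be made concrete by scoring one candidate covering. The sketch below is only an illustration under assumed names: `cover_stats`, the template names `T1` and `T2`, and the node numbering are hypothetical, and the function evaluates a given selection rather than searching for the best one.

```python
from collections import Counter

def cover_stats(nodes, chosen):
    """Score a covering.

    nodes:  iterable of all node ids in the graph.
    chosen: list of (template_name, set_of_covered_nodes); the selected
            matches must be pairwise disjoint.
    Returns k (distinct template types used), the sum of x_i (template
    instances used) and the number of nodes left uncovered.
    """
    covered = set()
    for _, match_nodes in chosen:
        if covered & match_nodes:
            raise ValueError("chosen matches overlap")
        covered |= match_nodes
    x = Counter(name for name, _ in chosen)  # x[T_i] = instances of T_i
    return {
        "k": len(x),
        "sum_x": sum(x.values()),
        "uncovered": len(set(nodes) - covered),
    }

# Hypothetical 7-node graph covered by two T1 instances and one T2 instance.
stats = cover_stats(range(1, 8), [("T1", {1, 2}), ("T1", {3, 4}), ("T2", {5, 6})])
print(stats)
```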

  11. Template Generation • Templates may not always be given as input • An automatic regularity extraction algorithm must develop its own templates • Generate a set of templates such that: • Number of templates is minimized • Covering of the graph is maximized

  12. Related Work • Useful in a wide variety of CAD applications • Data path regularity • [Chowdhary98], [Callahan99] • Scheduling [Ly95] • System partitioning [Rao93] • Low power design [Mehra96] • Soft macros – CPR [Cadambi99] for PipeRench architecture

  13. An Algorithm for Simultaneous Template Generation and Matching
      Formal definition:
      • Given a labeled digraph G(V, E)
      • # C is a set of edge types
      • C ← ∅
      • while (stop_conditions_not_met(G))
          • C ← profile_graph(G)
          • cluster_common_edges(G, C)
      Informal definition:
      • Find the most common edge type
      • Contract common edges
      • Repeat until stopping condition met
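
The loop above can be sketched in Python as follows. The function names `profile_graph` and `cluster_common_edges` are taken from the slide, but their bodies, the graph encoding, the label format for merged nodes, and the parameters `min_count` and `max_templates` are assumptions for illustration. In particular, this sketch picks edges to contract greedily, which is a simplification and not necessarily how the authors select them.

```python
from collections import Counter

def profile_graph(labels, edges):
    """Count edge types; a type is the (source label, target label) pair."""
    return Counter((labels[u], labels[v]) for u, v in sorted(edges))

def cluster_common_edges(labels, edges, edge_type):
    """Contract a greedily chosen, node-disjoint set of edges of one type."""
    used, absorbed = set(), {}
    for u, v in sorted(edges):
        if (labels[u], labels[v]) == edge_type and u not in used and v not in used:
            used.update((u, v))
            absorbed[v] = u  # node v is merged into node u
    new_labels = {n: lab for n, lab in labels.items() if n not in absorbed}
    for v, u in absorbed.items():
        new_labels[u] = f"({labels[u]}>{labels[v]})"  # label of the merged node
    rep = {n: absorbed.get(n, n) for n in labels}
    # Redirect edges to the merged nodes and drop the contracted edges.
    new_edges = {(rep[a], rep[b]) for a, b in edges if rep[a] != rep[b]}
    return new_labels, new_edges

def generate_templates(labels, edges, min_count=2, max_templates=4):
    """Repeat profile-and-contract until no edge type is frequent enough."""
    templates = []
    while edges and len(templates) < max_templates:
        edge_type, count = profile_graph(labels, edges).most_common(1)[0]
        if count < min_count:
            break
        templates.append(edge_type)
        labels, edges = cluster_common_edges(labels, edges, edge_type)
    return templates, labels, edges

# Hypothetical graph: three multiply-then-add pairs, two of which feed a final add.
labels = {1: "*", 2: "+", 3: "*", 4: "+", 5: "*", 6: "+", 7: "+"}
edges = {(1, 2), (3, 4), (5, 6), (2, 7), (4, 7)}
templates, final_labels, final_edges = generate_templates(labels, edges)
print(templates)
```

Each contracted node carries a label describing the template it represents, so later iterations can grow larger templates out of earlier ones.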

  14. Explanation of Algorithm • Profile edges: find the most common edge types • Edge contraction: merge adjacent nodes and maintain connectivity • Stopping conditions: • Reach a certain number of templates • Graph sufficiently covered • No frequently occurring edge type • [Figure: a small graph of + and * nodes with the most common edge type highlighted, before and after edge contraction]

  15. Algorithm in Action • Create the conflict graph over the candidate edges (Edge 1 to Edge 4) • Determine the MIS of the conflict graph • Contract edges 2 and 4 • Iteration 2: contract edges again • [Figure: example graph with >>, %, &, + and * nodes, its conflict graph, and the templates after each iteration]
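
The conflict-graph step can be sketched as follows, assuming that two candidate edges conflict when they share a node and that an independent set is picked with a simple minimum-degree greedy heuristic. The slides do not say how the MIS is computed, so the heuristic, the function names, and the example edges are all assumptions; the greedy result is maximal but not guaranteed to be maximum.

```python
from itertools import combinations

def conflict_graph(candidates):
    """candidates: edge id -> (u, v). Two edges conflict if they share a node."""
    adj = {e: set() for e in candidates}
    for e, f in combinations(candidates, 2):
        if set(candidates[e]) & set(candidates[f]):
            adj[e].add(f)
            adj[f].add(e)
    return adj

def greedy_independent_set(adj):
    """Repeatedly take a minimum-degree vertex and discard its neighbors."""
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}
    chosen = []
    while remaining:
        # Ties are broken by the smaller edge id to keep the result deterministic.
        v = min(remaining, key=lambda n: (len(remaining[n]), n))
        chosen.append(v)
        dropped = remaining[v] | {v}
        remaining = {n: nbrs - dropped
                     for n, nbrs in remaining.items() if n not in dropped}
    return chosen

# Hypothetical candidates: four same-type edges along a chain of five nodes.
candidates = {1: (10, 11), 2: (11, 12), 3: (12, 13), 4: (13, 14)}
adj = conflict_graph(candidates)
selected = greedy_independent_set(adj)
print(selected)
```

The edges in `selected` share no nodes, so they can all be contracted in the same iteration.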

  16. Algorithm Summary • Algorithm can be generalized and used in a variety of applications • Easily extended to hypergraphs • Input/output pin restrictions can easily be added • Performs template generation and matching simultaneously • We target the algorithm towards VPB generation in SPS

  17. Experimental Setup • Flow: set of applications specified in C → SUIF & Machine-SUIF → control flow graph → dataflow graph generation pass → control dataflow graph • [Figure: example control dataflow graph with + and * nodes]

  18. Experimental Setup • Flow: MediaBench files → compile to CDFGs → perform template generation and matching → gather statistics (graph coverage, number of templates)

  19. Experimental Setup - Benchmarks • Selected files from MediaBench

     | Benchmark | C File | Description |
     |---|---|---|
     | mpeg2 | motion.c | Motion vector decoding |
     | mpeg2 | getblk.c | DCT block decoding |
     | adpcm | adpcm.c | ADPCM to/from 16-bit PCM |
     | epic | convolve.c | 2D general image convolution |
     | jpeg | jctrans.c | Transcoding compression |
     | jpeg | jdmerge.c | Color conversion |
     | rasta | fft.c | Fast Fourier Transform |
     | rasta | noise_est.c | Noise estimation functions |
     | gsm | gsm_decode.c | GSM decoding |
     | gsm | gsm_encode.c | GSM encoding |

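  20. Similarity Across Applications

     | MediaBench file name | motion | jdmerge | getblk | gsm_dec | jctrans |
     |---|---|---|---|---|---|
     | Operation: ADD | 50.3% | 84.6% | 44.5% | 29.6% | 84.6% |
     | Operation: MUL | 36.3% | 13.8% | 24.0% | 22.4% | 13.8% |
     | Template coverage: MUL-MUL | 0.0% | 0.0% | 1.3% | 0.0% | 0.0% |
     | Template coverage: ADD-ADD | 14.5% | 9.1% | 3.2% | 3.6% | 9.1% |
     | Template coverage: ADD-MUL | 0.0% | 0.4% | 0.6% | 0.0% | 0.4% |
     | Template coverage: MUL-ADD | 36.3% | 13.0% | 21.5% | 22.4% | 13.0% |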

  21. Experimental Results • Techniques • Simple – restrict templates to two operations • No restrictions – unlimited number of operations • Stopping condition: most common edge occurs < x% (x = 5-25)
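
The stopping condition above can be written as a small predicate. This sketch assumes the percentage is taken over all edges currently in the graph, which the slide does not state; the function name `should_stop` and the example edge lists are hypothetical.

```python
from collections import Counter

def should_stop(edge_types, x_percent):
    """True when the most common edge type makes up less than x% of all edges.

    edge_types: one (source label, target label) pair per edge in the graph.
    """
    if not edge_types:
        return True
    _, count = Counter(edge_types).most_common(1)[0]
    return 100.0 * count / len(edge_types) < x_percent

# Hypothetical profile: the most common type covers 5 of 10 edges (50%).
frequent = [("+", "*")] * 5 + [("*", "+")] * 3 + [("+", "+")] * 2
# Hypothetical profile: ten edges, each of a different type (10% each).
sparse = [("op", str(i)) for i in range(10)]

print(should_stop(frequent, 25), should_stop(sparse, 25), should_stop(sparse, 5))
```

A larger x stops the algorithm earlier and so tends to yield fewer templates.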

  22. Summary • Systems need programmability at multiple levels of the computational hierarchy • Introduced SPS as a Hybrid Reconfigurable System • Developed an instruction generation algorithm to determine VPB functionality • Showed that common templates can be found across a similar set of applications • An efficient covering is possible using simple templates • Future work: create methods to uncover more complex templates
