
HIGH LEVEL SYNTHESIS WITH AREA CONSTRAINTS FOR FPGA DESIGNS: AN EVOLUTIONARY APPROACH

Politecnico di Milano. Master's thesis (Tesi di Laurea) by Christian Pilato, student ID (Matr. n.) 674373. Advisor (Relatore): Prof. Fabrizio FERRANDI. Co-advisor (Correlatore): Ing. Antonino TUMEO.


Presentation Transcript


  1. Politecnico di Milano. HIGH LEVEL SYNTHESIS WITH AREA CONSTRAINTS FOR FPGA DESIGNS: AN EVOLUTIONARY APPROACH. Master's thesis (Tesi di Laurea) by Christian Pilato, student ID (Matr. n.) 674373. Advisor (Relatore): Prof. Fabrizio FERRANDI. Co-advisor (Correlatore): Ing. Antonino TUMEO.

  2. Outline
     • High-Level Synthesis
     • Proposed methodology
     • Experimental results
     • Some further extensions
     • Conclusion and future work

  3. High-Level Synthesis: Problem description
     “High-Level Synthesis means going from an algorithmic level specification of a behaviour of a digital system to a register level structure that implements that behavior”. McFarland, et al., Proc. IEEE, February 1990.
     • Three main sub-tasks:
        • operation scheduling: when operations start their execution
        • resource allocation and binding: where operations are executed, where values are stored and how elements are interconnected
        • controller synthesis: which operations are issued
     • Inputs:
        • behavioral description (in C language)
        • library of different types of resources
        • set of constraints
     • Output: register-transfer level (RTL) design in a hardware description language (e.g. SystemC, VHDL and Verilog)
     • Goal: minimize objectives (area, latency, etc.)
     [Figure: the High-Level Synthesis tool takes the behavioral specification, the resource library, the design constraints and the objectives, performs scheduling, allocation, binding and controller synthesis, and produces the datapath and the controller.]

  4. What are the problems?
     • All the sub-tasks are NP-complete: no efficient exact algorithms are known
     • Interconnections have to be considered: they account for up to 80% of the final area
     • All the tasks are closely interdependent
     • Most of the information is available only at the end of the synthesis
     Approach: try non-deterministic approaches with feedback information, i.e. genetic algorithms.
     Multi-objective optimization: reducing the problem to a single objective (weighted average) is not efficient, hence the Non-dominated Sorting Genetic Algorithm (NSGA-II).
     Reference: K. Deb, S. Agrawal, A. Pratap, and T. Meyarivan, “A Fast and Elitist Multi-Objective Genetic Algorithm: NSGA-II,” Proceedings of the Parallel Problem Solving from Nature VI Conference, pp. 849–858, 2000.

  5. The proposed methodology: High-Level Synthesis and Design Space Exploration

  6. Experimental results
     • Development framework
        • Integrated in the PandA framework, an open-source C++ framework covering different aspects of the hardware-software design of embedded systems
        • Evolutionary computation with the Open BEAGLE framework
     • Functional validation
        • Comparison between Verilog and C simulations
     • Estimation model validation
        • Comparison between estimations and logic synthesis values
        • Average error: 4.02%; standard deviation: 2.82%; maximum error: less than 10%
        • These values can be effectively used as fitness values

  7. Experimental results
     • Design Space Exploration validation
        • Population size of 1,000 individuals, evolving up to a maximum of 200 generations
        • These settings gave the best trade-off between overall execution time and solution quality
     • Considerations:
        • It takes into account all elements in the design solution
        • It can cover a good number of trade-offs between the fastest solution and the minimal-area solution
        • It deals with area constraints better than existing tools
     Paper accepted for publication at the International Symposium on Systems, Architectures, MOdeling and Simulation (SAMOS), Samos, Greece, July 2007. Title: “An Evolutionary Approach to Area-Time Optimization of FPGA designs”

  8. Some extensions
     • Weighted clique covering: used in register allocation to reduce interconnections
        • A higher weight is assigned to a compatibility edge when the two values involve the same functional units
        • Clique covering is then performed on the weighted graph; results show a further reduction of overall area of up to 10%
     • Fitness inheritance: used to reduce overall execution time
        • A fraction of the expensive real evaluations is replaced with an estimation based on similar, already evaluated individuals
        • It reduces overall execution time by 25%
        • No substantial difference in the final Pareto-optimal solutions
     Paper submitted to the IEEE Congress on Evolutionary Computation (CEC) 2007, Singapore, September 2007. Title: “Fitness Inheritance In Evolutionary and Multi-Objective High Level Synthesis”
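The fitness inheritance idea can be illustrated with a minimal Python sketch. The slides only state that some real evaluations are replaced by an estimate based on similar evaluated individuals; the Hamming distance, the inverse-distance weighting, the neighbour count `k`, the inheritance probability and all function names below are assumptions made for illustration, not the thesis implementation.

```python
import random

def hamming(a, b):
    """Number of gene positions in which two chromosomes differ."""
    return sum(x != y for x, y in zip(a, b))

def inherited_fitness(chrom, archive, k=2):
    """Estimate the objectives of `chrom` from the k most similar
    chromosomes in `archive`, a list of (chromosome, objectives) pairs
    that were really evaluated. Closer neighbours weigh more (assumption)."""
    nearest = sorted(archive, key=lambda entry: hamming(chrom, entry[0]))[:k]
    weights = [1.0 / (1 + hamming(chrom, c)) for c, _ in nearest]
    total = sum(weights)
    n_obj = len(nearest[0][1])
    return tuple(
        sum(w * objs[i] for w, (_, objs) in zip(weights, nearest)) / total
        for i in range(n_obj)
    )

def evaluate(chrom, archive, real_eval, rng, inherit_prob=0.3):
    """Return (objectives, was_real). With probability `inherit_prob`
    (a hypothetical setting) the expensive evaluation is skipped."""
    if archive and rng.random() < inherit_prob:
        return inherited_fitness(chrom, archive), False
    objs = real_eval(chrom)
    archive.append((list(chrom), objs))  # only real evaluations are archived
    return objs, True
```

Only really evaluated individuals enter the archive in this sketch, so estimates are never built on top of other estimates.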

  9. Conclusion and future work
     • The main contributions of this thesis are:
        • A high-level synthesis flow from C specifications to HDL descriptions
        • Integration of a model for fast estimation of synthesis results
        • Design space exploration with a genetic algorithm:
           • It takes into account all elements composing the design solution
           • Estimates closely fit the real values
           • Multi-objective concurrent optimization
     • Future work:
        • Optimize the results coming from the synthesis flow
        • Further reduce the overall execution time of the proposed methodology
        • Refine the estimation model and specialize it for different targets

  10. Thank you! Christian PILATO, Matr. n. 674373

  11. The proposed High-Level Synthesis flow
     The proposed flow is organized as follows:
     • From C to intermediate representation
        • from GIMPLE to a graph representation
     • High-Level Synthesis flow
        • Partial binding and scheduling
        • Finite State Machine creation
        • Register allocation
        • Interconnection allocation
        • Performance and area estimations
     • From data structures to an intermediate representation in the form of a graph
     • From the intermediate representation to a Hardware Description Language (e.g. Verilog) ready for logic synthesis

  12. Step 1: Partial binding and scheduling
     • Partial binding: force an operation to be executed on a selected functional unit instance, e.g. β(+1) = <plus; 0>
        • A technique introduced to partially control the final area occupation
        • It can affect scheduling, register allocation and interconnection allocation
     • Scheduling: assign a starting control step to each operation to be executed
        • Many scheduling algorithms are able to support partial binding (Integer Linear Programming formulation, list-based algorithm, etc.)
        • Different solutions are obtained depending on the selected algorithm
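As a rough illustration of how a list-based scheduler can honour a partial binding, the following Python sketch schedules a small dataflow graph. It assumes unit-latency operations, an alphabetical priority order and the data layout shown in the docstring; all names are hypothetical and none of this is taken from the PandA implementation.

```python
def list_schedule(ops, deps, fu_count, binding=None):
    """Resource-constrained list scheduling with optional partial binding.

    ops:      dict operation -> functional unit type
    deps:     dict operation -> set of predecessor operations (acyclic)
    fu_count: dict functional unit type -> number of available instances
    binding:  dict operation -> (unit type, instance index); may cover
              only some operations (the partial binding)
    Returns (start, assigned): control step and unit instance per operation.
    Assumes every operation takes exactly one control step.
    """
    binding = binding or {}
    start, assigned = {}, {}
    step = 0
    while len(start) < len(ops):
        busy = set()  # unit instances already used in this control step
        for op in sorted(ops):  # assumed priority: alphabetical order
            if op in start:
                continue
            # Ready only if all predecessors ran in an earlier step.
            if any(p not in start or start[p] >= step for p in deps.get(op, ())):
                continue
            if op in binding:
                candidates = [binding[op]]  # forced instance
            else:
                candidates = [(ops[op], i) for i in range(fu_count[ops[op]])]
            free = next((c for c in candidates if c not in busy), None)
            if free is not None:
                busy.add(free)
                start[op] = step
                assigned[op] = free
        step += 1
    return start, assigned

# Two additions and a multiplication that depends on the first addition.
ops = {"a1": "plus", "a2": "plus", "m1": "mult"}
deps = {"m1": {"a1"}}
fu_count = {"plus": 2, "mult": 1}
free_start, _ = list_schedule(ops, deps, fu_count)
bound_start, _ = list_schedule(
    ops, deps, fu_count, binding={"a1": ("plus", 0), "a2": ("plus", 0)}
)
```

In this example, binding both additions to adder instance 0 delays `a2` by one control step, but the second adder is never used, which is the kind of area control the slide describes.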

  13. Steps 2-3: Finite State Machine creation and register allocation
     • Scheduling gives information about concurrent operations
        • This information is useful for controller synthesis and register allocation
     • A State Transition Graph (STG), based on the Moore FSM model, is created from the scheduled specification
        • It represents the control flow and the concurrent operations
        • Conditional operations create branches based on the evaluated condition values
     • Register allocation: allocate elements to store values across control step boundaries. A compiler-style approach has been implemented on the STG:
        • Liveness analysis based on dataflow equations
        • Interference graph based on liveness information
        • Different heuristics to minimize the number of registers
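The compiler-style register allocation outlined above can be sketched in a few lines of Python. This is a simplified illustration, assuming each STG state is described by the sets of values it uses and defines; the greedy colouring is just one possible heuristic, and the function and variable names are hypothetical.

```python
from itertools import combinations

def liveness(states, succ, use, defs):
    """Iterate the classic dataflow equations to a fixed point:
    live_out[s] = union of live_in[t] for every successor t of s
    live_in[s]  = use[s] | (live_out[s] - defs[s])"""
    live_in = {s: set() for s in states}
    live_out = {s: set() for s in states}
    changed = True
    while changed:
        changed = False
        for s in reversed(states):
            out = set().union(*(live_in[t] for t in succ.get(s, ())))
            inn = use[s] | (out - defs[s])
            if out != live_out[s] or inn != live_in[s]:
                live_out[s], live_in[s] = out, inn
                changed = True
    return live_in, live_out

def interference_graph(live_out):
    """Values live across the same state boundary cannot share a register."""
    graph = {}
    for values in live_out.values():
        for v in values:
            graph.setdefault(v, set())
        for a, b in combinations(sorted(values), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph

def greedy_coloring(graph):
    """Assign register indices, visiting the most constrained values first."""
    reg = {}
    for v in sorted(graph, key=lambda v: (-len(graph[v]), v)):
        used = {reg[n] for n in graph[v] if n in reg}
        reg[v] = next(r for r in range(len(graph)) if r not in used)
    return reg

# Linear STG: S0 defines a and b, S1 computes c from them, S2 consumes c.
states = ["S0", "S1", "S2"]
succ = {"S0": ["S1"], "S1": ["S2"]}
use = {"S0": set(), "S1": {"a", "b"}, "S2": {"c"}}
defs = {"S0": {"a", "b"}, "S1": {"c"}, "S2": set()}
_, live_out = liveness(states, succ, use, defs)
registers = greedy_coloring(interference_graph(live_out))
```

Here `a` and `b` interfere and need distinct registers, while `c` becomes live only after both are dead and can reuse one of them, so two registers suffice.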

  14. Steps 4-5: Interconnection allocation and result estimation
     • Interconnection allocation: allocate elements to interconnect the hardware components
        • Mux-based architecture: port swapping for commutative operations
        • Glue logic: a logic netlist that decodes commands and selects inputs
        • Truth tables based on signals from the controller
     • The RTL structural description is now available and it considers all elements
     • Objective values could be retrieved from logic synthesis, but this is too slow
     • Estimation model: perform fast estimations of objective values
        • Area is difficult to estimate
        • An existing area model* has been updated and used
     *: C. Brandolese, W. Fornaciari, and F. Salice. “An Area Estimation Methodology for FPGA Based Designs at SystemC-Level”, DAC '04: Proceedings of the 41st annual conference on Design automation, pp. 129–132, 2004.

  15. Design Space Exploration by Genetic Algorithm: problem-dependent elements
     • Chromosome encoding
        • Each operation in the specification has a gene that represents a feasible partial binding
        • Additional genes represent the algorithms used to perform the high-level synthesis steps: scheduling, register allocation and interconnection optimization
     • Fitness evaluation
        • The information in the chromosome about partial binding and algorithms is used to run a synthesis flow
        • Objective values are estimated using the proposed model
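A possible shape for such a chromosome is sketched below in Python. The slides do not give the actual data layout, so the gene ordering, the algorithm names and the helper functions are assumptions. Each gene draws its value from its own domain, which is why standard mutation cannot produce an infeasible chromosome in this sketch.

```python
import random

def gene_domains(ops, fu_count, algorithms):
    """One domain per gene: first one gene per operation (index of a
    compatible unit instance), then one gene per synthesis step
    (which algorithm to use)."""
    domains = [list(range(fu_count[fu])) for fu in ops.values()]
    domains += [list(choices) for choices in algorithms.values()]
    return domains

def random_chromosome(domains, rng):
    """Pick every gene uniformly from its own domain."""
    return [rng.choice(d) for d in domains]

def mutate(chrom, domains, rng, rate=0.1):
    """Redraw each gene with probability `rate` from its own domain,
    so the result is always a feasible chromosome."""
    return [rng.choice(d) if rng.random() < rate else g
            for g, d in zip(chrom, domains)]

def decode(chrom, ops, algorithms):
    """Split a chromosome into a binding and the chosen algorithms."""
    n = len(ops)
    binding = {op: (fu, inst) for (op, fu), inst in zip(ops.items(), chrom[:n])}
    chosen = dict(zip(algorithms, chrom[n:]))
    return binding, chosen

# Hypothetical specification and algorithm choices.
ops = {"a1": "plus", "a2": "plus", "m1": "mult"}
fu_count = {"plus": 2, "mult": 1}
algorithms = {
    "scheduling": ["list", "ilp"],
    "register_allocation": ["greedy", "clique"],
}
rng = random.Random(42)
domains = gene_domains(ops, fu_count, algorithms)
child = mutate(random_chromosome(domains, rng), domains, rng, rate=0.5)
binding, chosen = decode(child, ops, algorithms)
```

The decoded `binding` and `chosen` dictionaries are what a fitness evaluation would hand to the synthesis flow before estimating the objective values.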

  16. Design Space Exploration by Genetic Algorithm: problem-independent elements
     • Generic operators
        • Common operators (crossover and mutation) are used without modifications: no infeasible chromosomes can be created
        • If the gene changed by an operator is related to:
           • an operation: it yields a new binding constraint for that operation
           • an algorithm: it selects a different algorithm to solve the related synthesis step
     • Initial population
        • Created randomly, or starting from some interesting points in order to explore around them
     • Solution ranking
        • Solutions are ranked into different levels according to their fitness values
        • Ranking is accelerated using the fast non-dominated sort algorithm available in NSGA-II
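The ranking step can be illustrated with a compact Python version of the fast non-dominated sort described in the NSGA-II paper. This is a sketch for minimization objectives such as (area, latency); the sample points are invented for illustration and are not results from the thesis.

```python
def dominates(a, b):
    """a dominates b if it is no worse in every objective and strictly
    better in at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def fast_non_dominated_sort(points):
    """Return a list of fronts, each a list of indices into `points`.
    Front 0 holds the non-dominated (Pareto-optimal) solutions."""
    dominated = [[] for _ in points]  # solutions that p dominates
    counts = [0] * len(points)        # how many solutions dominate p
    fronts = [[]]
    for p, fp in enumerate(points):
        for q, fq in enumerate(points):
            if dominates(fp, fq):
                dominated[p].append(q)
            elif dominates(fq, fp):
                counts[p] += 1
        if counts[p] == 0:
            fronts[0].append(p)
    while fronts[-1]:
        next_front = []
        for p in fronts[-1]:
            for q in dominated[p]:
                counts[q] -= 1
                if counts[q] == 0:  # every dominator of q is already ranked
                    next_front.append(q)
        fronts.append(next_front)
    return fronts[:-1]  # drop the trailing empty front

# Invented (area, latency) pairs.
points = [(100, 10), (80, 12), (120, 8), (110, 11), (130, 12)]
fronts = fast_non_dominated_sort(points)
```

For these points the first front contains the three mutually non-dominated trade-offs (indices 0, 1 and 2), followed by index 3 and then index 4.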
