Computation Flow • For both run-time and compile-time • For some applications, must iterate
Computation Flow • If many reconfigurations are needed, some of the steps must be repeated according to the application's needs. • A synchronization mechanism is usually used between the processor and the reconfigurable device (RD). • Accesses to the memory shared between the two devices should be blocking.
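The blocking synchronization described above can be illustrated with a small software model. This is only a sketch under stated assumptions: the RD is modelled as a Python thread, the shared memory as two single-slot queues, and the squaring function stands in for whatever the hardware would compute. None of these names come from the slides.

```python
import queue
import threading


def reconfigurable_device(requests, results):
    """Stand-in for the RD: wait for data, compute, write the result back."""
    while True:
        value = requests.get()        # blocking read: waits until the processor writes
        if value is None:             # sentinel used here to shut the model down
            break
        results.put(value * value)    # hypothetical hardware function (squaring)


def run_on_device(values):
    """Processor side: hand each value to the RD and block until it answers."""
    requests = queue.Queue(maxsize=1)  # single-slot buffers model blocking access
    results = queue.Queue(maxsize=1)
    rd = threading.Thread(target=reconfigurable_device, args=(requests, results))
    rd.start()
    outputs = []
    for value in values:
        requests.put(value)            # blocks if the RD has not taken the last request
        outputs.append(results.get())  # blocks until the RD has produced a result
    requests.put(None)
    rd.join()
    return outputs


if __name__ == "__main__":
    print(run_on_device([1, 2, 3]))
```

Because both sides block on the shared buffers, neither device can read data that the other has not finished writing, which is the purpose of the synchronization mechanism.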
Computation Flow • Devices such as the Xilinx Virtex II/II-Pro and up and the Altera Excalibur feature one or more soft or hard-macro processors, so the complete system can be integrated in a single device. • The reconfiguration process can be: • Full: the complete device has to be reconfigured. • Partial: only part of the device is reconfigured while the rest keeps running.
Computation Flow • Full reconfiguration devices: • Functions to be downloaded at run-time are developed and stored in a database. • No geometrical constraints are required for the functions. • Partial reconfiguration capabilities: • Modules, represented as rectangular boxes, are pre-computed and stored in a database. • With relocation, the modules are assigned to a position on the device at run-time. [Figure: run-time system with task requests (task 1, task 2, ..., task N), O.S. services (scheduler, placer, module database with modules M1 to M4), and the reconfigurable device holding tasks T1, T2, ..., TN.]
RTR Challenges • Management of the reconfigurable device: • Usually part of the OS running on a processor • Scheduler: • Decides when a task must be executed • Tasks in a database: • Characterized by (bounding box, run time) • Placer: • Temporal placement: management of tasks at run time • Allocates a set of resources for the task • If it cannot find a site, the task is rejected • Challenges: • Fragmentation • Communication between new and old tasks
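A placer of the kind described above can be sketched as follows. The first-fit scan, the grid model of the device, and all function names are assumptions made for illustration; the slides do not say which placement strategy is used.

```python
def make_device(rows, cols):
    """Model the reconfigurable device as a grid of free (None) cells."""
    return [[None] * cols for _ in range(rows)]


def first_fit_place(device, width, height, task_id):
    """Place a width x height module at the first free site (row-major scan).

    Returns the (row, col) of the top-left corner, or None if the task is rejected.
    """
    rows, cols = len(device), len(device[0])
    for r in range(rows - height + 1):
        for c in range(cols - width + 1):
            site_is_free = all(
                device[r + dr][c + dc] is None
                for dr in range(height)
                for dc in range(width)
            )
            if site_is_free:
                for dr in range(height):
                    for dc in range(width):
                        device[r + dr][c + dc] = task_id
                return (r, c)
    return None  # no site found: the task is rejected


def remove_task(device, task_id):
    """Free every cell occupied by a finished task."""
    for row in device:
        for c, owner in enumerate(row):
            if owner == task_id:
                row[c] = None


if __name__ == "__main__":
    device = make_device(2, 4)
    for name in "ABCD":
        first_fit_place(device, 1, 2, name)   # four 1x2 modules fill the device
    remove_task(device, "A")
    remove_task(device, "C")
    # Four cells are free, but they are not adjacent: a 2x2 module is rejected.
    print(first_fit_place(device, 2, 2, "E"))
```

The usage part shows the fragmentation challenge: the free area is large enough for the new module, but it is split into two separate columns, so the request is rejected.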
Implementation of a reconfigurable system: a hardware/software co-design process • Software part (code segments to be executed on the processor): developed in a software language with common tools (C, C++, Java, etc.) • Hardware part (to be executed on the RD): developed in an HDL (VHDL, Verilog, Handel-C, etc.) • Interface: HDL or system-level languages [Figure: hardware/software partitioning into software, interface, and hardware.]
FPGA Architecture • FPGA architecture from the CAD tools’ point of view: • N BLEs (Basic Logic Elements) • K-LUT: k-input LUT • I inputs, N outputs • Inputs and outputs fully connected to the inputs of each LUT through MUXes
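A k-input LUT is simply a stored truth table with 2^k entries that is indexed by the input bits. The class below is a minimal sketch of that idea; the class name, the convention that the first input is the most significant index bit, and the helper `from_function` are assumptions made for this example.

```python
from itertools import product


class LUT:
    """Software model of a k-input look-up table."""

    def __init__(self, k, table):
        if len(table) != 2 ** k:
            raise ValueError("a k-input LUT needs exactly 2**k table entries")
        self.k = k
        self.table = list(table)

    @classmethod
    def from_function(cls, k, fn):
        """Fill the table by evaluating fn on every input combination."""
        table = [fn(*bits) & 1 for bits in product((0, 1), repeat=k)]
        return cls(k, table)

    def evaluate(self, *inputs):
        """Use the input bits as an address into the table."""
        if len(inputs) != self.k:
            raise ValueError("wrong number of inputs")
        index = 0
        for bit in inputs:               # first input is the most significant bit
            index = (index << 1) | (bit & 1)
        return self.table[index]


if __name__ == "__main__":
    xor_lut = LUT.from_function(2, lambda a, b: a ^ b)
    print(xor_lut.table)
    print(xor_lut.evaluate(1, 0))
```

Any Boolean function of k variables fits in one such table, which is why technology mapping for FPGAs targets LUTs rather than a fixed gate library.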
Design Flow for H/w Part • Almost the same as for any digital circuit design • Synthesis differs mainly in technology mapping: • LUT-based technology mapping • Specific to the target technology (device)
Design Flow for H/w Part • Design Entry • Schematic Netlist • HDL • Waveform • State Diagram
Textual or Schematic • Most designers today use textual languages rather than schematics • Drawbacks of schematic entry: • Poor use of screen space • Not appropriate for large designs • Hard to build tools for (parsing)
What is Synthesis? • Transformation of an abstract description into a more detailed description • The "+" operator is transformed into a gate netlist • "if (VEC_A = VEC_B) then" is transformed into a comparator that controls a multiplexer • The transformation depends on several factors: • Algorithm, constraints, library • Simple operators (such as AND, OR, comparison) are converted into specific gates, whereas more complex operators such as multiplication are first converted into macro cells specific to the tool.
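To make the "+" example concrete, the sketch below expands an n-bit addition into a gate netlist and then simulates that netlist. The ripple-carry structure, the net names, and the tuple format are assumptions chosen for illustration; a real synthesis tool picks the adder architecture from its algorithm, constraints, and library.

```python
def synthesize_adder(width):
    """Expand 'a + b' into a ripple-carry gate list: (gate, output_net, input_nets)."""
    gates = []
    carry = "c0"
    for i in range(width):
        a, b = f"a{i}", f"b{i}"
        gates += [
            ("XOR", f"p{i}", (a, b)),                  # propagate
            ("XOR", f"s{i}", (f"p{i}", carry)),        # sum bit
            ("AND", f"g{i}", (a, b)),                  # generate
            ("AND", f"t{i}", (f"p{i}", carry)),
            ("OR", f"c{i + 1}", (f"g{i}", f"t{i}")),   # carry out
        ]
        carry = f"c{i + 1}"
    return gates


GATE_FUNCTIONS = {
    "AND": lambda x, y: x & y,
    "OR": lambda x, y: x | y,
    "XOR": lambda x, y: x ^ y,
}


def simulate(gates, width, a, b):
    """Evaluate the netlist for unsigned operands a and b."""
    nets = {"c0": 0}
    for i in range(width):
        nets[f"a{i}"] = (a >> i) & 1
        nets[f"b{i}"] = (b >> i) & 1
    for gate, out, (x, y) in gates:       # gates were emitted in evaluation order
        nets[out] = GATE_FUNCTIONS[gate](nets[x], nets[y])
    total = sum(nets[f"s{i}"] << i for i in range(width))
    return total + (nets[f"c{width}"] << width)


if __name__ == "__main__":
    adder = synthesize_adder(4)
    print(len(adder), "gates")
    print(simulate(adder, 4, 9, 5))
```

One line of source text becomes five gates per bit here, which illustrates what "a more detailed description" means in practice.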
Synthesizability • Only a subset of VHDL is synthesizable • Different tools support different subsets • records? • arrays of integers? • clock edge detection? • sensitivity list? • ...
Synthesis • Compilation and optimization: • All non-synthesizable data types and operations are converted into synthesizable code • The code is translated into a set of Boolean equations • The equations are then minimized (technology-independent optimization) • Technology mapping: • Assign functional modules to library elements • On FPGAs: • Mapping control logic and datapath to LUTs and BLEs • Mapping the optimized datapath to on-chip dedicated circuit structures (e.g. on-chip multipliers, adders with dedicated carry chains, embedded memory blocks) • Technology-dependent optimization
Synthesis • Result: a netlist, i.e. a list of components and their interconnections • Netlist formats: • EDIF (Electronic Design Interchange Format) • Vendor-specific formats, e.g. XNF (Xilinx Netlist Format)
Physical Design: Place and Route • Place: assign locations to the components • In hierarchical architectures, a separate clustering step may be needed to group BLEs into logic blocks • Clustering is done prior to placement or during placement • Route: provide communication paths for the interconnections • Both are optimization problems: some cost must be minimized • Important factors: • Clock frequency • Power consumption • Routing congestion • ...
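Placement as cost minimization can be illustrated with a simple wirelength estimate. The slides do not name a cost function; the half-perimeter wirelength (HPWL) used below is a common estimate and is an assumption here, as are the component names and coordinates.

```python
def half_perimeter_wirelength(placement, nets):
    """Sum, over all nets, the half-perimeter of the net's bounding box.

    placement maps a component name to its (x, y) location;
    nets is a list of tuples of component names connected together.
    """
    total = 0
    for net in nets:
        xs = [placement[component][0] for component in net]
        ys = [placement[component][1] for component in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


if __name__ == "__main__":
    nets = [("A", "B"), ("A", "C", "D")]
    compact = {"A": (0, 0), "B": (1, 0), "C": (1, 1), "D": (0, 1)}
    spread = {"A": (0, 0), "B": (3, 3), "C": (3, 0), "D": (0, 3)}
    print(half_perimeter_wirelength(compact, nets))
    print(half_perimeter_wirelength(spread, nets))
```

A placer would search for the assignment of components to locations that minimizes such a cost, possibly combined with terms for timing, power, or congestion.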
Configuration Bitstream • Bitstream: LUT contents, multiplexer control lines, interconnections, ...
Design Flow • Debugging a design resembles the programming cycle: [Figure: two cycles side by side. Programming: compile, run, edit. Design: design entry, compile, simulation, synthesis, simulation, edit, edit.]
FPGA Design Flow – Example • Design: modulo-10 counter • Target device: FPGA with 2x2 logic blocks (LBs) • Each LB: two 2-input LUTs and two edge-triggered T flip-flops • Objectives: area, latency
FPGA Design Flow – Example • Synthesis and optimization: • Truth table: state transitions and TFF inputs • Karnaugh maps
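The truth table for the modulo-10 counter can be derived mechanically: a T flip-flop toggles when its input is 1, so each T input is the XOR of the current and next state bit. The sketch below computes that table, assuming a plain binary state encoding with four state bits; the function names are illustrative.

```python
def excitation_table(modulus=10):
    """Return (state, next_state, toggles) for every state of the counter.

    Bit i of 'toggles' is the required T input of flip-flop i.
    """
    table = []
    for state in range(modulus):
        next_state = (state + 1) % modulus
        toggles = state ^ next_state      # T = Q xor Q_next, bit by bit
        table.append((state, next_state, toggles))
    return table


def clock_tffs(state, toggles):
    """One clock edge: every flip-flop whose T input is 1 flips."""
    return state ^ toggles


def format_table(table, bits=4):
    """Render the table with binary columns: Q3..Q0, next Q3..Q0, T3..T0."""
    return [
        f"{state:0{bits}b} {next_state:0{bits}b} {toggles:0{bits}b}"
        for state, next_state, toggles in table
    ]


if __name__ == "__main__":
    for line in format_table(excitation_table()):
        print(line)
```

The unused states 10 to 15 do not appear in the table; they can be treated as don't-care entries when the T input functions are minimized with Karnaugh maps.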
References • [Bobda07] C. Bobda, “Introduction to Reconfigurable Computing: Architectures, Algorithms and Applications,” Springer, 2007.