Introduction to High-Level Synthesis: Optimizing System Architecture through HLS Flow

ECE 565 High-Level Synthesis: Introduction
Shantanu Dutt, ECE Dept., UIC

HLS Flow
• Code/Algorithm → Architecture (functional units (FUs) and memory units (MUs) interconnected via muxes, demuxes, tristate buffers, buses, and dedicated interconnects)
• Classically, the 3 stages (scheduling, binding, allocation) were performed sequentially; currently they are performed together, which leads to better optimization

HLS Flow (contd)

HLS Flow (contd)
(Binding)
Allocation: simple counting of FUs after the above 2 stages
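A minimal sketch of allocation as counting, assuming a binding is given as a map from operation name to (FU type, FU index). The names `allocate` and `binding`, and the sample operations c1, c2, c3, are illustrative and not taken from the slides.

```python
from collections import defaultdict

def allocate(binding):
    """Count the distinct FUs of each type that a binding uses.

    binding maps an operation name to (fu_type, fu_index).
    """
    used = defaultdict(set)
    for fu_type, fu_index in binding.values():
        used[fu_type].add(fu_index)
    return {fu_type: len(indices) for fu_type, indices in used.items()}

# Hypothetical binding: one multiply and two adds sharing a single adder.
binding = {"c1": ("mul", 0), "c2": ("add", 0), "c3": ("add", 0)}
print(allocate(binding))
```

Binding the two additions to different adders (index 0 and 1) would raise the adder count to 2, which is why allocation falls out of scheduling and binding.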

Simple HLS Examples

Simple HLS Examples (contd)
2) Mapping to h/w w/ constraints: use only 1 multiplier (X) and 1 adder (+)
a) Non-overlapped scheduling
[Figure: datapath with registers a, b, c, d, x, y, z (load signals lda, ldb, ldc, ldd, ldx, ldy, ldz), muxes mux1 and mux2 (inputs I0, I1), one multiplier (X), one adder (+), and demuxes]
[Figure: schedule over cc's 1-6. X row: c1(1), c1(2). + row: c2(1), c3(1), c2(2), c3(2)]
Controller FSM (entered on Reset; cc labels on the slide: cc 3i, cc 3(i+1), cc 3(i+2)):
• (c1) ldx=1 [x ← a × b]
• (c2) mux1=0, mux2=0, demux=0, ldy=1 [y ← c+d]
• (c3) lda=1, ldb=1, ldc=1, ldd=1, mux1=1, mux2=1, demux=1, ldz=1 [z ← x+y]
Note: A register is loaded at the +ve/-ve edge (in a +ve/-ve edge-triggered system) of the cc after the one in which its load signal is asserted (slide illustration: lda = 1, then reg. "a" loaded).
Note: Unspecified control signals have either an inactive value or, if no such concept exists for the control signal, the don't-care value.
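The non-overlapped controller can be sketched as a three-state loop. This is a behavioral sketch only: it assumes the inputs a, b, c, d are already in their registers when state c1 starts, and it models each register load as taking effect by the end of the state. The function name `run_non_overlapped` is illustrative.

```python
def run_non_overlapped(samples):
    """Step through states c1, c2, c3 for each (a, b, c, d) sample.

    Returns the list of z outputs and the number of cc's used.
    """
    outputs = []
    cycles = 0
    for a, b, c, d in samples:
        x = y = z = None
        for state in ("c1", "c2", "c3"):
            if state == "c1":      # ldx=1
                x = a * b
            elif state == "c2":    # mux1=0, mux2=0, demux=0, ldy=1
                y = c + d
            else:                  # mux1=1, mux2=1, demux=1, ldz=1
                z = x + y
            cycles += 1            # one cc per state
        outputs.append(z)
    return outputs, cycles

print(run_non_overlapped([(2, 3, 4, 5), (1, 1, 1, 1)]))
```

Each iteration uses the adder twice and the multiplier once, so with one adder and no overlap an iteration cannot take fewer than 3 cc's in this scheme.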

Simple HLS Examples (contd)
2) Mapping to h/w w/ constraints: use only 1 multiplier (X) and 1 adder (+)
b) Overlapped scheduling
[Figure: datapath with registers a, b, c, d, x, y, z (load signals lda, ldb, ldc, ldd, ldx, ldy, ldz), muxes mux1 and mux2 (inputs I0, I1), one multiplier (X), one adder (+), and demuxes]
[Figure: schedule over cc's 1-6. X row: c1(1), c1(2). + row: c2(1), c3(1), c2(2), c3(2)]
Controller FSM (entered on Reset; cc labels on the slide: cc 3i, cc 3(i+1)):
• (c1, c2) ldc=1, ldd=1, mux1=0, mux2=0, demux=0, ldx=1, ldy=1 [y ← c+d, x ← a × b]
• (c3) lda=1, ldb=1, mux1=1, mux2=1, demux=1, ldz=1 [z ← x+y]
• For 4 iterations, the overlapped schedule takes 9 cc's versus 12 cc's for the non-overlapped schedule
• Overlapped sched.: time for n iterations = 2n+1 cc's; throughput = n/(2n+1) ≈ 0.5 outputs/cc
• Non-overlapped sched.: time for n iterations = 3n cc's; throughput = n/3n ≈ 0.33 outputs/cc
• ⇒ ≈ 50% throughput improvement using an overlapped schedule (0.5 vs. 0.33 outputs/cc; equivalently, about 33% fewer cc's per output)
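The cycle counts and throughputs above can be checked with a few lines of arithmetic. The function names are illustrative; the formulas 3n and 2n+1 are the ones stated on the slide.

```python
def non_overlapped_cycles(n):
    """cc's for n iterations of the non-overlapped schedule."""
    return 3 * n

def overlapped_cycles(n):
    """cc's for n iterations of the overlapped schedule."""
    return 2 * n + 1

def throughput(n, cycles):
    """Outputs per cc for n iterations under the given cycle-count function."""
    return n / cycles(n)

def speedup(n):
    """Ratio of overlapped to non-overlapped throughput for n iterations."""
    return throughput(n, overlapped_cycles) / throughput(n, non_overlapped_cycles)

for n in (4, 100, 10**6):
    print(n, overlapped_cycles(n), non_overlapped_cycles(n), round(speedup(n), 4))
```

For small n the extra "+1" cc matters (the ratio is 12/9 for n = 4); the ratio approaches 3/2 only as n grows.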

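Simple HLS Examples (contd)
• Some DFG control operation nodes:
  - Selector: data inputs in1 and in2 (one for T, one for F), a Condition (T/F) input, one output out
  - Distributor: one data input in, a Condition (T/F) input, outputs out1 and out2 (one for T, one for F)
• Conditional code: if (a > b) then c ← a-b; else c ← b-a;
• Possible DFGs corresponding to the above conditional code: [Figure: DFGs not recoverable from the transcript]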
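Since the DFG figures did not survive, here is a hedged sketch of two shapes such a DFG could take, written as plain functions. The function names and the choice of these two shapes are assumptions, not a reconstruction of the slide.

```python
def selector(condition, in_true, in_false):
    """Selector node: pass in_true when the condition is T, else in_false."""
    return in_true if condition else in_false

def distributor(condition, value):
    """Distributor node: route value to the T output or the F output.

    Returns (out_true, out_false); the unused output is None.
    """
    return (value, None) if condition else (None, value)

def abs_diff_select(a, b):
    """Shape 1: compute both a-b and b-a, then select one result."""
    return selector(a > b, a - b, b - a)

def abs_diff_distribute(a, b):
    """Shape 2: distribute the operands so only one subtraction fires."""
    cond = a > b
    a_t, a_f = distributor(cond, a)
    b_t, b_f = distributor(cond, b)
    return a_t - b_t if cond else b_f - a_f

print(abs_diff_select(7, 3), abs_diff_distribute(3, 7))
```

The first shape needs both subtractions every time (two subtractors, or two cc's on one); the second performs only one subtraction per evaluation at the cost of extra steering logic.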

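Simple HLS Examples (contd)
• Iterative code: while (a > b) a ← a-b;
• [Figure: DFG with a selector (sel, condition initialized to F), a comparison (>) node, a subtraction (-) node, and a distributor (dist) with T and F outputs, one of which is the final a]
• [Figure: architecture with registers a, b, r1 (load signals lda, ldb, ldr1), a mux implementing sel, an adder (+) with inputs a and b' and cin = 1, a demux implementing dist, and a load signal for the final a]
• Subtraction on the adder: b' + 1 = 2's complement of b = -b, so a + b' + 1 = a - b
• Sign to the FSM: s xor ovfl = 1 → result is -ve; = 0 → result is +ve
• Scheduling & binding: the two operations are scheduled in cc's c1 and c2 and bound to the single adder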
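A sketch of the adder-only subtraction and the sign test, assuming an 8-bit word (`WIDTH` is an assumption, as are the function names). The explicit zero check is an addition of this sketch: s xor ovfl = 0 covers a = b as well, so a strict a > b needs a non-zero difference too.

```python
WIDTH = 8                        # assumed word width
MASK = (1 << WIDTH) - 1
SIGN = 1 << (WIDTH - 1)

def subtract(a, b):
    """Return (diff, negative) for a - b computed as a + b' + 1 (cin = 1)."""
    a &= MASK
    b_inv = ~b & MASK            # b' = bitwise complement of b
    diff = (a + b_inv + 1) & MASK
    s = bool(diff & SIGN)        # sign bit of the adder output
    # Signed overflow: both adder operands share a sign that the sum lacks.
    ovfl = ((a & SIGN) == (b_inv & SIGN)) and ((diff & SIGN) != (a & SIGN))
    return diff, s ^ ovfl        # true sign of a - b is s xor ovfl

def reduce_while_greater(a, b):
    """while (a > b) a <- a - b; assumes b > 0 so the loop terminates."""
    while True:
        diff, negative = subtract(a, b)
        if negative or diff == 0:    # a <= b: leave the loop
            return a
        a = diff

print(reduce_while_greater(17, 5))
```

The same subtraction serves both the comparison and the update, which is why one adder suffices for the loop body.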

Delay Nodes in DFGs A delay node is generally implemented as a register; a delay node thus becomes a state variable.
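As a small illustration, assume a hypothetical DFG y[n] = x[n] + y[n-1], where the y[n-1] edge carries one delay node. Mapping the delay to a register makes it a state variable that is updated once per sample. The function name and the example recurrence are not from the slides.

```python
def run_accumulator(xs, initial=0):
    """Evaluate y[n] = x[n] + y[n-1] with the delay node held in a register."""
    reg = initial                # the delay node, i.e. the state variable
    ys = []
    for x in xs:
        y = x + reg              # combinational part of the DFG
        ys.append(y)
        reg = y                  # register loads y at the clock edge
    return ys

print(run_accumulator([1, 2, 3]))
```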

Delay Nodes in DFGs (contd)
[Figure: transformation of the delay node in the DFG, and its mapping to a register in the architecture]

Detailed HLS Example

Detailed HLS Example (contd)
[Figure: the synthesized architecture]
Note: It is not clear how register allocation has been done here. It is sub-optimal.

Detailed HLS Example (contd)

Detailed HLS Example—Register Allocation

Detailed HLS Example—Register Allocation (contd)
• In the conflict graph (one per FU), there is an edge between 2 variable nodes if their lifetimes overlap (indicating that different registers need to be allocated to them)
• Graph coloring in general is NP-hard
• The above type of conflict graph is called an interval graph (each node corresponds to a 1-dimensional interval, here a variable's lifetime)
• For interval graphs, min. graph coloring can be solved optimally and efficiently (in linear time once the intervals are sorted by start point) using the left-edge algorithm that we will see later for channel routing
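A minimal sketch of left-edge register allocation, assuming each lifetime is a half-open interval [start, end) in cc's, so a variable whose lifetime ends at cc t can share a register with one that starts at t. The variable names and lifetimes are hypothetical, and this simple track-by-track form is quadratic in the worst case rather than tuned for speed.

```python
def left_edge(lifetimes):
    """Pack variable lifetimes into registers with the left-edge algorithm.

    lifetimes maps a variable name to (start, end), treated as [start, end).
    Returns a list of registers, each a list of the variables it holds.
    """
    pending = sorted(lifetimes.items(), key=lambda item: item[1])
    registers = []
    while pending:
        track, leftover, last_end = [], [], None
        for var, (start, end) in pending:
            if last_end is None or start >= last_end:
                track.append(var)            # fits after the previous lifetime
                last_end = end
            else:
                leftover.append((var, (start, end)))
        registers.append(track)
        pending = leftover                   # still sorted by start
    return registers

# Hypothetical lifetimes, not taken from the detailed example.
lifetimes = {"v1": (0, 3), "v2": (1, 2), "v3": (2, 5), "v4": (3, 4), "v5": (4, 6)}
print(left_edge(lifetimes))
```

Here at most two lifetimes overlap at any cc, and the sketch uses exactly two registers, matching the lower bound set by the maximum overlap.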

Detailed HLS Example—Register Allocation (contd)
