
High-level Synthesis: Scheduling, Allocation, Assignment


Presentation Transcript


  1. High-level Synthesis: Scheduling, Allocation, Assignment. Note: Several slides in this lecture are from Prof. Miodrag Potkonjak, UCLA CS

  2. Overview • High Level Synthesis • Scheduling, Allocation and Assignment • Estimations • Transformations

  3. Allocation, Assignment, and Scheduling • Techniques are well understood and mature

  4. Scheduling and Assignment [figure: operations mapped to control steps]

  5. ASAP Scheduling Algorithm

  6. ASAP Scheduling Example

  7. ASAP: Another Example [figure: sequence graph and its ASAP schedule]
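The ASAP algorithm and examples above were slide images; as a concrete stand-in, here is a minimal Python sketch of ASAP scheduling for unit-delay operations. The four-node graph is hypothetical, not the slides' example.

```python
# Minimal ASAP scheduler for unit-delay operations. 'pred' maps each
# operation to the operations it depends on (a hypothetical graph).
def asap(pred):
    start = {}
    while len(start) < len(pred):  # settle ops in dependency order
        for v, ps in pred.items():
            if v not in start and all(p in start for p in ps):
                # earliest step: one past the latest-finishing predecessor
                start[v] = max((start[p] + 1 for p in ps), default=1)
    return start

pred = {"m1": [], "m2": [], "a1": ["m1", "m2"], "c1": ["a1"]}
print(asap(pred))  # {'m1': 1, 'm2': 1, 'a1': 2, 'c1': 3}
```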

  8. ALAP Scheduling Algorithm

  9. ALAP Scheduling Example

  10. ALAP: Another Example [figure: sequence graph and its ALAP schedule, latency constraint = 4]
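Likewise, a minimal ALAP sketch: each operation is pushed as late as the latency constraint allows. Same hypothetical graph as in the ASAP sketch; the bound of 4 mirrors the slide's latency constraint.

```python
# Minimal ALAP scheduler: schedule each op as late as the latency bound
# allows. 'succ' is derived by inverting the dependency map 'pred'.
def alap(pred, latency):
    succ = {v: [w for w, ps in pred.items() if v in ps] for v in pred}
    start = {}
    while len(start) < len(succ):  # settle ops in reverse dependency order
        for v, ss in succ.items():
            if v not in start and all(s in start for s in ss):
                # latest step: one before the earliest-starting successor
                start[v] = min((start[s] - 1 for s in ss), default=latency)
    return start

pred = {"m1": [], "m2": [], "a1": ["m1", "m2"], "c1": ["a1"]}
print(alap(pred, latency=4))  # {'c1': 4, 'a1': 3, 'm1': 2, 'm2': 2}
```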

  11. Observation about ALAP & ASAP • No priority is given to nodes on the critical path • As a result, less critical nodes may be scheduled ahead of critical nodes • No problem if hardware is unlimited • However, if resources are limited, the less critical nodes may block the critical nodes and thus produce inferior schedules • List scheduling techniques overcome this problem by using a more global node selection criterion

  12. List Scheduling and Assignment

  13. List Scheduling Algorithm using Decreasing Criticalness Criterion
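The list-scheduling algorithm on this slide was an image. The sketch below follows the decreasing-criticalness idea it names: ready operations are ordered by mobility (ALAP minus ASAP; smaller means more critical) and scheduled until the resource bound for their type is exhausted. The graph, operation types, and bounds are hypothetical; unit delays are assumed.

```python
# List scheduling with a decreasing-criticalness priority: at each
# control step, schedule ready ops with the least mobility first,
# subject to a per-type resource bound. Unit delays assumed.
def list_schedule(pred, typ, bound, latency):
    asap, alap = {}, {}
    while len(asap) < len(pred):  # ASAP times (as in the earlier sketch)
        for v, ps in pred.items():
            if v not in asap and all(p in asap for p in ps):
                asap[v] = max((asap[p] + 1 for p in ps), default=1)
    succ = {v: [w for w, ps in pred.items() if v in ps] for v in pred}
    while len(alap) < len(succ):  # ALAP times
        for v, ss in succ.items():
            if v not in alap and all(s in alap for s in ss):
                alap[v] = min((alap[s] - 1 for s in ss), default=latency)
    sched, step = {}, 0
    while len(sched) < len(pred):
        step += 1
        ready = [v for v in pred if v not in sched
                 and all(sched.get(p, step) < step for p in pred[v])]
        ready.sort(key=lambda v: alap[v] - asap[v])  # most critical first
        used = {}
        for v in ready:  # fill resources of each type up to its bound
            if used.get(typ[v], 0) < bound[typ[v]]:
                sched[v] = step
                used[typ[v]] = used.get(typ[v], 0) + 1
    return sched

pred = {"m1": [], "m2": [], "a1": ["m1", "m2"], "c1": ["a1"]}
typ  = {"m1": "mul", "m2": "mul", "a1": "alu", "c1": "alu"}
print(list_schedule(pred, typ, bound={"mul": 1, "alu": 1}, latency=4))
```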

  14. Scheduling • NP-complete problem • Optimal techniques • Heuristics: iterative improvement • Heuristics: constructive • Various versions of the problem: • Unconstrained minimum latency • Resource-constrained minimum latency • Timing-constrained • If all resources are identical, the problem reduces to multiprocessor scheduling • The minimum-latency multiprocessor problem is intractable

  15. Scheduling - Optimal Techniques • Integer Linear Programming • Branch and Bound

  16. Integer Linear Programming • Given: integer-valued matrix Am×n and vectors B = (b1, b2, …, bm), C = (c1, c2, …, cn) • Minimize: CᵀX • Subject to: AX ≥ B, where X = (x1, x2, …, xn) is an integer-valued vector

  17. Integer Linear Programming • Problem: for a set of (dependent) computations {t1, t2, …, tn}, find the minimum number of units needed to complete the execution in k control steps. • Integer linear programming: let y0 be an integer variable. For each control step i (1 ≤ i ≤ k): define variable xij as xij = 1 if computation tj is executed in the ith control step, and xij = 0 otherwise; define variable yi = xi1 + xi2 + … + xin.

  18. Integer Linear Programming • For each computation dependency "ti has to be done before tj", introduce a constraint: k·x1i + (k−1)·x2i + … + xki ≥ k·x1j + (k−1)·x2j + … + xkj + 1 (*) • Minimize: y0 • Subject to: x1i + x2i + … + xki = 1 for all 1 ≤ i ≤ n; yi ≤ y0 for all 1 ≤ i ≤ k; and all computation-dependency constraints of type (*)

  19. An Example • 6 computations c1, c2, c3, c4, c5, c6 • 3 control steps [figure: dependency graph]

  20. An Example • Introduce variables: xij for 1 ≤ i ≤ 3, 1 ≤ j ≤ 6; yi = xi1 + xi2 + xi3 + xi4 + xi5 + xi6 for 1 ≤ i ≤ 3; and y0 • Dependency constraints: e.g., to execute c1 before c4: 3x11 + 2x21 + x31 ≥ 3x14 + 2x24 + x34 + 1 • Execution constraints: x1i + x2i + x3i = 1 for 1 ≤ i ≤ 6

  21. An Example • Minimize: y0 • Subject to: yi ≤ y0 for all 1 ≤ i ≤ 3, the dependency constraints, and the execution constraints • One solution: y0 = 2, with x11 = 1, x12 = 1, x23 = 1, x24 = 1, x35 = 1, x36 = 1, and all other xij = 0
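This model is small enough to solve directly. Below is a minimal sketch in Python using the PuLP library (an assumed dependency; any ILP solver would do). Only the c1-before-c4 dependency shown on the slide is encoded; the example's remaining edges are not given in the transcript, so the deps list is a placeholder to extend.

```python
# Sketch of the slides' ILP: minimize the number of units y0 needed to
# finish n = 6 computations in k = 3 control steps. Assumes PuLP is
# installed (pip install pulp). 'deps' holds only the c1 -> c4 edge
# shown on the slide; extend it with the rest of the dependency graph.
import pulp

n, k = 6, 3
deps = [(1, 4)]  # (a, b): computation ca must precede cb

prob = pulp.LpProblem("min_units", pulp.LpMinimize)
x = pulp.LpVariable.dicts("x", (range(1, k + 1), range(1, n + 1)), cat="Binary")
y0 = pulp.LpVariable("y0", lowBound=0, cat="Integer")
prob += y0  # objective: minimize the number of units

for j in range(1, n + 1):  # each computation runs in exactly one step
    prob += pulp.lpSum(x[i][j] for i in range(1, k + 1)) == 1
for i in range(1, k + 1):  # yi = # of ops in step i, bounded by y0
    prob += pulp.lpSum(x[i][j] for j in range(1, n + 1)) <= y0
for a, b in deps:  # weighted-sum encoding of step(ca) < step(cb), as in (*)
    prob += (pulp.lpSum((k - i + 1) * x[i][a] for i in range(1, k + 1))
             >= pulp.lpSum((k - i + 1) * x[i][b] for i in range(1, k + 1)) + 1)

prob.solve(pulp.PULP_CBC_CMD(msg=0))
print("units needed:", int(pulp.value(y0)))  # 2, matching the slide
```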

  22. ILP Model of Scheduling • Binary decision variables xil: xil = 1 iff operation i starts in control step l • i = 0, 1, …, n • l = 1, 2, …, λ+1 (λ = the latency bound) • Start time of each operation is unique: Σl xil = 1 for all i

  23. ILP Model of Scheduling (contd.) • Sequencing relationships must be satisfied • Resource bounds must be met • let upper bound on # of resources of type k be ak

  24. Minimum-latency Scheduling Under Resource-constraints • Let t be the vector whose entries are start times • Formal ILP model
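The formal ILP model on this slide was an image in the original deck. The LaTeX below restates the standard formulation it refers to (with start times t_i = Σ_l l·x_il, operation delays d_i, and a_k resources of type k); this is a reconstruction of the textbook model, not the slide itself.

```latex
\begin{align*}
\min\;\; & c^{\mathsf T} t \\
\text{s.t.}\;\; & \sum_{l} x_{il} = 1 \quad \forall i
    && \text{(unique start time; } t_i = \textstyle\sum_l l\,x_{il}\text{)} \\
& \sum_{l} l\,x_{jl} \;\ge\; \sum_{l} l\,x_{il} + d_i \quad \forall (v_i, v_j)
    && \text{(sequencing edges)} \\
& \sum_{i:\,\mathcal{T}(v_i)=k}\;\sum_{m=l-d_i+1}^{l} x_{im} \;\le\; a_k
   \quad \forall k,\, \forall l
    && \text{(resource bounds)}
\end{align*}
```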

  25. Example • Two types of resources: a multiplier, and an ALU (performing addition, subtraction, and comparison) • Both take 1 cycle of execution time

  26. Example (contd.) • Heuristic (list scheduling) gives latency = 4 steps • Use ALAP and ASAP (with no resource constraints) to get bounds on start times • ASAP matches latency of heuristic • so heuristic is optimum, but let us ignore it! • Constraints?

  27. Example (contd.) • Start time is unique

  28. Example (contd.) • Sequencing constraints • note: only non-trivial ones listed • those with more than one possible start time for at least one operation

  29. Example (contd.) • Resource constraints

  30. Example (contd.) • Consider c = [0, 0, …, 1]ᵀ • Minimum-latency schedule • since the sink has no mobility (xn,5 = 1), any feasible schedule is optimum • Consider c = [1, 1, …, 1]ᵀ • finds the earliest start times for all operations • equivalently, minimizes the sum of all start times

  31. Example Solution: Optimum Schedule Under Resource Constraint

  32. Example (contd.) • Assume multiplier costs 5 units of area, and ALU costs 1 unit of area • Same uniqueness and sequencing constraints as before • Resource constraints are in terms of unknown variables a1 and a2 • a1 = # of multipliers • a2 = # of ALUs

  33. Example (contd.) • Resource constraints

  34. Example Solution • Minimize cᵀa = 5·a1 + 1·a2 • Solution with cost 12

  35. Precedence-constrained Multiprocessor Scheduling • All operations done by the same type of resource • intractable problem • intractable even if all operations have unit delay

  36. Scheduling - Iterative Improvement • Kernighan-Lin (deterministic) • Simulated Annealing • Lottery Iterative Improvement • Neural Networks • Genetic Algorithms • Tabu Search

  37. Scheduling - Constructive Techniques • Most Constrained • Least Constraining

  38. Force Directed Scheduling • Goal is to reduce hardware by balancing concurrency • Iterative algorithm: one operation scheduled per iteration • Information (e.g., speed & area) is fed back into the scheduler

  39. The Force Directed Scheduling Algorithm

  40. - - - - * * * * * * * * * * * * + + + + < < ASAP Step 1 • Determine ASAP and ALAP schedules ALAP

  41. * * * - - + + < * * * Step 2 • Determine Time Frame of each op • Length of box ~ Possible execution cycles • Width of box ~ Probability of assignment • Uniform distribution, Area assigned = 1 C-step 1 C-step 2 C-step 3 1/2 C-step 4 1/3 Time Frames

  42. 0 0 1 1 2 2 3 3 4 4 Step 3 • Create Distribution Graphs • Sum of probabilities of each Op type • Indicates concurrency of similar OpsDG(i) =  Prob(Op, i) 1 1 2 2 3 3 4 4 DG for Multiply DG for Add, Sub, Comp

  43. Diff Eq Example: Precedence Graph Recalled

  44. Diff Eq Example: Time Frame & Probability Calculation

  45. Diff Eq Example: DG Calculation

  46. Conditional Statements • Operations in different branches are mutually exclusive • Operations of the same type can be overlapped onto the DG • The probability of the most likely operation is added to the DG [figure: fork/join branches and the resulting DG for add]

  47. Self Forces • Scheduling an operation will affect the overall concurrency • Every operation has a 'self force' for every C-step of its time frame • Analogous to the effect of a spring: f = Kx • Desirable scheduling will have negative self force • Will achieve better concurrency (lower potential energy) • Force(i) = DG(i) · x(i), where DG(i) ~ current distribution-graph value and x(i) ~ change in the operation's probability at step i • Self Force(j) = Σi [Force(i)]

  48. Example • Attempt to schedule the multiply in C-step 1 • Self Force(1) = Force(1) + Force(2) = (DG(1) · x(1)) + (DG(2) · x(2)) = [2.833 · (0.5) + 2.333 · (−0.5)] = +0.25 • This is positive: scheduling the multiply in the first C-step would be bad [figure: time frames and DG for multiply over C-steps 1-4]
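As a sanity check of the arithmetic above, the self force can be evaluated directly from the slide's numbers (DG(1) = 2.833, DG(2) = 2.333, time frame {1, 2}); a small sketch:

```python
# Self force of fixing an op with time frame {1, 2} into C-step 1.
# x(i) is the change in the op's probability at step i: 1 - 1/2 at the
# chosen step, 0 - 1/2 at the abandoned one. DG values are the slide's.
dg = {1: 2.833, 2: 2.333}  # DG for multiply
frame = [1, 2]             # the multiply's time frame
p = 1.0 / len(frame)       # uniform probability: 1/2

def self_force(chosen):
    return sum(dg[i] * ((1.0 if i == chosen else 0.0) - p) for i in frame)

print(round(self_force(1), 3))  # +0.25: positive, so C-step 1 is a bad choice
```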

  49. Diff Eq Example: Self Force for Node 4

  50. - - * * * * + + < * * Predecessor & Successor Forces • Scheduling an operation may affect the time frames of other linked operations • This may negate the benefits of the desired assignment • Predecessor/Successor Forces = Sum of Self Forces of any implicitly scheduled operations
