
ECE260B – CSE241A Winter 2005 Partitioning & Floorplanning



Presentation Transcript


  1. ECE260B – CSE241A Winter 2005 Partitioning & Floorplanning Website: http://vlsicad.ucsd.edu/courses/ece260b-w05

  2. Key Design Stages • Synthesis • Partitioning • Floorplanning • Power/ground Generation • Clock Generation • Placement • Routing

  3. Floorplanning

  4. Floorplanning Input • Design netlist (required) • Area requirements (required) • Power requirements (required) • Timing constraints (required) • Physical partitioning information (required) • Die size vs. performance vs. schedule trade-off (required) • I/O placement (optional) • Macro placement information (optional)

  5. Floorplanning Output • Die/block area • I/Os placed • Macros placed • Power grid designed • Power pre-routing • Standard cell placement areas → Design ready for standard cell placement

  6. Floorplanning Output

  7. Floorplan • Blocks inside a pad frame • Routing inside, between blocks • Different-sized blocks more difficult than standard cells to place and route • Blocks • Hard, soft, semi-soft • Rectangular, L-shaped, T-shaped, rectilinear • Can rotate, mirror, … blocks [Figure: floorplan with RAM, standard cell, and data path blocks, I/O pads, and routing channels] Courtesy K. Yang, UCLA

  8. Design Styles • Full custom • Analog / RF • CPU design • ASIC (Application-Specific IC) • Gate array / sea of gates / standard cells • Via programmable • Structured ASICs • Programmable logic • PLA • FPGA • Software implementation • Micro-code Courtesy K. Yang, UCLA

  9. Physical Design Size Estimation • Why we care: • If area is too small: P&R will not finish or meet timing, or will run too long • Schedule and die size inversely related • Performance and die size have complex relationship • Rule of thumb (must correct for power, clock, etc.): • 3LM (three metal layers): Cell utilization 65 percent // what is utilization? • 4LM: Cell utilization 70 percent • 5LM: Cell utilization 75 percent • 6LM: Cell utilization 80 percent • Floorplan metrics • Low interconnect density → Cell util (standard cell area / standard cell row area) • High interconnect density → “Net util” (number of nets / standard cell area) [Figure: plots of schedule vs. die size and performance vs. die size]
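The rule of thumb above can be turned into a quick area estimate. The sketch below assumes that cell utilization means standard cell area divided by standard cell row area, as in the floorplan metrics bullet; the function name and the sample area are illustrative only, and the result still needs the corrections for power, clock, etc. that the slide mentions.

```python
# Rule-of-thumb cell utilization by number of metal layers (values from the slide).
UTILIZATION = {3: 0.65, 4: 0.70, 5: 0.75, 6: 0.80}

def row_area(cell_area, metal_layers):
    """Estimate the standard cell row area needed for a given total cell area.

    Assumes utilization = cell area / row area, so row area = cell area / utilization.
    """
    return cell_area / UTILIZATION[metal_layers]

# Hypothetical design with 1.3 mm^2 of standard cells.
for layers in sorted(UTILIZATION):
    print(f"{layers}LM: {row_area(1.3, layers):.3f} mm^2 of row area")
```

More metal layers allow a higher utilization, so the same cell area fits in a smaller row area.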

  10. Channels • Channels end at block boundaries • Alternate channel definitions possible, depending on position of blocks [Figure: blocks A, B, C with two alternative sets of channel definitions] Courtesy K. Yang, UCLA

  11. Channel Intersection Graph • Nodes are channels, edges correspond to pairs of channels that touch • Channel graph shows paths between channels • Channel graph can be used to guide global routing [Figure: floorplan with blocks A, B, C, D, E and its channel intersection graph] Courtesy K. Yang, UCLA

  12. Channel Ordering • Wire out end of one channel creates pin on side of next channel • “Wheel” = Circular constraints that create an unroutable configuration of channels [Figure: blocks A, B, C, D with channel A, channel B, and the ordering constraint between them] Courtesy K. Yang, UCLA
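One way to picture the ordering problem is as a dependency graph: if channel X must be routed before channel Y, a feasible routing order is a topological order, and a “wheel” shows up as a cycle. The sketch below uses Python's standard `graphlib`; the channel names and the constraint sets are hypothetical, not taken from the figure.

```python
from graphlib import TopologicalSorter, CycleError

def channel_order(routed_before):
    """Return a feasible channel routing order, or None for a wheel.

    routed_before maps each channel to the set of channels that must be
    routed before it (an assumed encoding of the ordering constraints).
    """
    try:
        return list(TopologicalSorter(routed_before).static_order())
    except CycleError:
        return None  # circular constraints: no feasible order exists

# Hypothetical chain of constraints: ch1 before ch2 before ch3.
chain = {"ch2": {"ch1"}, "ch3": {"ch2"}}
# Hypothetical wheel: four channels constraining each other in a circle.
wheel = {"chA": {"chD"}, "chB": {"chA"}, "chC": {"chB"}, "chD": {"chC"}}

print(channel_order(chain))
print(channel_order(wheel))
```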

  13. Slicing Floorplan Represented by Binary Tree • A slicing floorplan can be recursively cut in two without cutting any blocks • A slicing floorplan is guaranteed to have no “wheels”, therefore guaranteed to have a feasible order of routing for the channels • A slicing floorplan can be represented as a binary tree, with internal nodes representing slices in the floorplan and leaves representing blocks. [Figure: slicing floorplan of blocks A, B, C, D, E with cuts 1 to 4, and the corresponding binary tree] Courtesy K. Yang, UCLA
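A small sketch of why the binary-tree form is convenient: the bounding box of the whole floorplan follows from one bottom-up traversal. The tuple encoding, the "H"/"V" cut labels, the block dimensions, and the tree shape below are all assumptions for illustration; they are not read off the figure.

```python
def bounding_box(node):
    """Return (width, height) of a slicing tree.

    A leaf is a (width, height) tuple for one hard block.
    An internal node is (cut, left, right), where cut is "V" for a vertical
    cut line (children side by side) or "H" for a horizontal cut line
    (children stacked).
    """
    if len(node) == 2:
        return node
    cut, left, right = node
    lw, lh = bounding_box(left)
    rw, rh = bounding_box(right)
    if cut == "V":
        return (lw + rw, max(lh, rh))
    return (max(lw, rw), lh + rh)

# Hypothetical block sizes for five blocks and a tree with four cuts.
A, B, C, D, E = (2, 3), (2, 1), (3, 2), (1, 2), (2, 2)
tree = ("V", ("H", A, B), ("H", C, ("V", D, E)))

width, height = bounding_box(tree)
print(width, height, width * height)
```

With these sizes the five blocks pack without dead space, since the bounding box area equals the sum of the block areas.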

  14. O-Tree • Partial ordering based on projection overlapping (with given physical locations) • Transforming into binary trees by pivoting, etc. • Coded in a node sequence given a tree traversal algorithm • E.g., OACBDEF for DFS • Condensed solution space [Figure: example placement of blocks A to F with root O] Courtesy K. Yang, UCLA

  15. Sequence Pair • Based on layout partitions by non-overlapping ascending/descending staircases • Coded in two node sequences • E.g., CEDFAB for descending staircases and ABCDEF for ascending staircases • Larger solution space, finer representation [Figure: example layout of blocks A to F] Courtesy K. Yang, UCLA
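A sequence pair encodes the relative position of every pair of blocks. The sketch below decodes those relations using one common convention: a block that precedes another in both sequences lies to its left, and a block that precedes in the first sequence but follows in the second lies above. Treating CEDFAB as the first sequence and ABCDEF as the second is an assumption, so the decoded relations may be mirrored relative to the slide's figure.

```python
def relation(first, second, a, b):
    """Relative position of block a with respect to block b."""
    a_first = first.index(a) < first.index(b)    # a precedes b in first sequence
    a_second = second.index(a) < second.index(b)  # a precedes b in second sequence
    if a_first and a_second:
        return "left of"
    if not a_first and not a_second:
        return "right of"
    if a_first and not a_second:
        return "above"
    return "below"

first, second = "CEDFAB", "ABCDEF"
for a, b in [("A", "B"), ("C", "A"), ("E", "D")]:
    print(a, relation(first, second, a, b), b)
```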

  16. Partitioning

  17. Outline • Introduction • Kernighan-Lin Algorithm • Fiduccia-Mattheyses Algorithm • Partitioning by Network Flow • Clustering • End-case Partitioning (and Placement)

  18. Partitioning • Decomposition of a complex system into smaller subsystems • Done hierarchically • Partitioning done until each subsystem has manageable size • Each subsystem can be designed independently • Interconnections between partitions minimized • Less hassle interfacing the subsystems • Communication between subsystems usually costly • Time-budgeting

  19. Example: Partitioning of a Circuit • Input size: 48 • Cut 1 = 4 • Cut 2 = 4 • Size 1 = 15 • Size 2 = 16 • Size 3 = 17 [Figure: circuit divided into three parts by two cuts]

  20. Hierarchical Partitioning • Levels of partitioning: • System-level partitioning: Each sub-system can be designed as a single PCB • Board-level partitioning: Circuit assigned to a PCB is partitioned into sub-circuits, each fabricated as a VLSI chip • Chip-level partitioning: Circuit assigned to the chip is divided into manageable sub-circuits. NOTE: physically not necessary

  21. Delay at Different Levels of Partitions [Figure: blocks A, B, C, D placed on PCB1 and PCB2, with connection delays labeled x, 10x, and 20x]

  22. Partitioning: Formal Definition • Input: • Graph or hypergraph • Usually with vertex weights • Usually weighted edges • Constraints • Number of partitions (K-way partitioning) • Maximum capacity of each partition OR maximum allowable difference between partitions • Objective • Assign nodes to partitions, subject to the constraints, such that the cutsize is minimized • Tractability • Is NP-complete

  23. Hypergraphs in VLSI CAD • Circuit netlist represented by hypergraph Slides Courtesy Kia Bazargan, U. Minn

  24. Hypergraph Partitioning in VLSI • Variants • directed/undirected hypergraphs • weighted/unweighted vertices, edges • constraints, objectives, … • Human-designed instances • Benchmarks • up to 4,000,000 vertices • sparse (vertex degree ≈ 4, hyperedge size ≈ 4) • small number of very large hyperedges • Efficiency, flexibility: KL-FM style preferred

  25. Some Notations • A net n is cut by a cluster C if at least one, but not all, pins of n are in C. • We use E(C) to denote the set of nets cut by a cluster C. • We use E(P) to denote the set of nets cut by at least one cluster of a partition P. • We use w(C) to denote the number of cells assigned to a cluster C.
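The notation above maps directly onto set operations. This is a minimal sketch; the function names and the dictionary-of-sets netlist encoding are illustrative choices, and the three-net example is a small hypothetical netlist on cells A to D.

```python
def cut_nets(nets, cluster):
    """E(C): names of nets with at least one, but not all, pins in cluster."""
    return {name for name, pins in nets.items()
            if 0 < len(pins & cluster) < len(pins)}

def cut_by_partition(nets, partition):
    """E(P): names of nets cut by at least one cluster of the partition."""
    return set().union(*(cut_nets(nets, c) for c in partition))

def weight(cluster):
    """w(C): number of cells assigned to the cluster."""
    return len(cluster)

# Hypothetical netlist: each net is the set of cells it connects.
nets = {"n1": {"A", "B", "C", "D"}, "n2": {"A", "B"}, "n3": {"C", "D"}}

print(cut_nets(nets, {"A", "B"}))                      # only n1 spans both sides
print(cut_nets(nets, {"A", "C"}))                      # every net is cut
print(cut_by_partition(nets, [{"A", "B"}, {"C", "D"}]))
```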

  26. Some Bipartitioning Formulations • Min-Cut Bipartitioning: • Objective: Minimize F(P2) = |E(C1)| = |E(C2)| • Min-Cut Bisection: • Objective: Minimize F(P2) = |E(C1)| = |E(C2)| • Constraint: |w(C1) - w(C2)| ≤ ε • Size-Constrained Min-Cut Bipartitioning: • Objective: Minimize F(P2) = |E(C1)| = |E(C2)| • Constraint: L ≤ w(C1), w(C2) ≤ U • Minimum Ratio Cut Bipartitioning: • Objective: Minimize F(P2) = |E(C1)| / (w(C1) w(C2))
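A short numeric sketch shows how the ratio cut objective differs from plain min-cut: it has no explicit size constraint, yet the product w(C1) w(C2) in the denominator penalizes lopsided partitions. The cut sizes and cluster weights below are made-up numbers for illustration.

```python
def ratio_cut(cut_size, w1, w2):
    """Ratio cut objective |E(C1)| / (w(C1) * w(C2))."""
    return cut_size / (w1 * w2)

# Two hypothetical bipartitions of a 48-cell circuit.
balanced = ratio_cut(cut_size=4, w1=24, w2=24)   # larger cut, even split
lopsided = ratio_cut(cut_size=2, w1=2, w2=46)    # smaller cut, 2 vs 46 cells

print(f"balanced: {balanced:.5f}")
print(f"lopsided: {lopsided:.5f}")
```

Plain min-cut would prefer the lopsided partition (cut of 2), while the ratio cut objective prefers the balanced one.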

  27. Some Multi-Way Partitioning Formulations • Size-Constrained Min-Cut k-Way Partitioning: • Objective: Minimize F(Pk) • Constraint: L ≤ w(Ci) ≤ U ∀ Ci ∈ Pk • Many other complicated formulations • k-way partitioning: Formulation • Given a netlist of n cells V = {v1, v2, …, vn}, assign the cells to k clusters Pk = {C1, C2, …, Ck} satisfying some given constraints such that an objective function F(Pk) is optimized. • Partitioning: k is small, O(1) • Clustering: k is large, O(n) • Technology Mapping: Constraints on the clusters

  28. Outline • Introduction • Kernighan-Lin Algorithm • Fiduccia-Mattheyses Algorithm • Partitioning by Network Flow • Clustering • End-case Partitioning (and Placement)

  29. Kernighan-Lin (KL) Algorithm • On non-weighted graphs • An iterative improvement technique • A two-way (bisection) partitioning algorithm • The partitions must be balanced (of equal size) • Iterate as long as the cutsize improves: • Find a pair of vertices that results in the largest decrease in cutsize if exchanged • Exchange the two vertices (potential move) • “Lock” the vertices • If no improvement is possible, and some vertices are still unlocked, then exchange the vertices that result in the smallest increase in cutsize B. W. Kernighan and S. Lin, Bell System Technical Journal, 1970.

  30. Kernighan-Lin (KL) Algorithm • Initialize • Bipartition G into V1 and V2, s.t. |V1| = |V2| ± 1 • n = |V| • Repeat • for i = 1 to n/2 • Find a pair of unlocked vertices vai ∈ V1 and vbi ∈ V2 whose exchange makes the largest decrease or smallest increase in cut-cost • Mark vai and vbi as locked • Store the gain gi • Find k s.t. Gain k = Σ i=1..k gi is maximized • If Gain k > 0 then move va1,...,vak from V1 to V2 and vb1,...,vbk from V2 to V1 • Until Gain k ≤ 0

  31. An Example • Edge weights between vertices a, b, c, d:
         a  b  c  d
      a  0  1  2  3
      b  1  0  1  4
      c  2  1  0  3
      d  3  4  3  0
  [Figure: the corresponding weighted graph] Slides courtesy F. Y. Young, U. Hong Kong

  32. An Example - Pass One • Starting from the partition {a, b} | {c, d}: • g(a,c) = -1+3-3+1 = 0 • g(a,d) = -1+2-3+4 = 2 • g(b,c) = -1+4-3+2 = 2 • g(b,d) = -1+1-3+3 = 0 • g1 = 2 • After tentatively exchanging a and d: • g(b,c) = -4+1-2+3 = -2 • g2 = -2 • ⇒ G = g1 = 2 (k = 1) [Figure: the graph before and after each tentative exchange]

  33. An Example - Pass Two • Starting from the partition {a, c} | {b, d}: • g(a,b) = -2+3-4+1 = -2 • g(a,d) = -2+1-4+3 = -2 • g(c,b) = -2+3-4+1 = -2 • g(c,d) = -2+1-4+3 = -2 • g1 = -2 • After tentatively exchanging c and d: • g(a,b) = -3+2-1+4 = 2 • g2 = 2 • ⇒ G = g1 + g2 = 0 (k = 2) • STOP! [Figure: the graph before and after each tentative exchange]
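The two passes of the example can be reproduced with a short sketch of the KL loop on the example's edge weights. This is a plain illustration, not an optimized implementation: gains are recomputed from scratch at every step, the function names are illustrative, and ties are broken by alphabetical order, so the pair picked among equal gains may differ from the slides.

```python
def cut_cost(w, part_a, part_b):
    """Total weight of edges crossing the partition."""
    return sum(w[u][v] for u in part_a for v in part_b)

def kl_pass(w, part_a, part_b):
    """One KL pass: return (best prefix gain, pairs to exchange)."""
    a, b = set(part_a), set(part_b)        # tentative partition
    free_a, free_b = set(a), set(b)        # unlocked vertices
    pairs, gains = [], []

    def d(v, own, other):
        # External minus internal connection weight of v.
        return (sum(w[v][u] for u in other)
                - sum(w[v][u] for u in own if u != v))

    while free_a and free_b:
        gain, x, y = max(
            ((d(x, a, b) + d(y, b, a) - 2 * w[x][y], x, y)
             for x in sorted(free_a) for y in sorted(free_b)),
            key=lambda t: t[0],
        )
        a.remove(x); a.add(y)              # tentative exchange
        b.remove(y); b.add(x)
        free_a.remove(x); free_b.remove(y)  # lock both vertices
        pairs.append((x, y)); gains.append(gain)

    best_k, best_gain, running = 0, 0, 0
    for k, g in enumerate(gains, start=1):
        running += g
        if running > best_gain:
            best_k, best_gain = k, running
    return best_gain, pairs[:best_k]

def kl(w, part_a, part_b):
    """Repeat passes until a pass yields no positive gain."""
    a, b = set(part_a), set(part_b)
    while True:
        gain, pairs = kl_pass(w, a, b)
        if gain <= 0:
            return a, b
        for x, y in pairs:
            a.remove(x); a.add(y)
            b.remove(y); b.add(x)

names = "abcd"
matrix = [[0, 1, 2, 3], [1, 0, 1, 4], [2, 1, 0, 3], [3, 4, 3, 0]]
w = {u: dict(zip(names, row)) for u, row in zip(names, matrix)}

final_a, final_b = kl(w, {"a", "b"}, {"c", "d"})
print(sorted(final_a), sorted(final_b), cut_cost(w, final_a, final_b))
```

The first pass exchanges one pair for a gain of 2 (cut cost 10 down to 8), and the second pass finds no positive prefix gain, matching the slides.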

  34. Cut During One Pass (Bipartitioning) [Figure: plot of cut size versus number of moves during one pass]

  35. Kernighan-Lin (KL): Analysis • Time complexity? • Inner (for) loop • Iterates n/2 times • Iteration 1: (n/2) × (n/2) • Iteration i: (n/2 – i + 1) × (n/2 – i + 1) • Passes? Usually independent of n • O(n³) • Drawbacks? • Local optimum • Balanced partitions only • No weight for the vertices • High time complexity • Only on edges, not hyper-edges
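The O(n³) bound follows from summing the pair comparisons over the n/2 iterations of one pass, assuming each candidate pair costs constant time to evaluate and the number of passes does not grow with n:

```latex
\sum_{i=1}^{n/2} \left(\frac{n}{2} - i + 1\right)^{2}
  = \sum_{j=1}^{n/2} j^{2}
  = \frac{\frac{n}{2}\left(\frac{n}{2} + 1\right)(n + 1)}{6}
  = O(n^{3})
```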

  36. Outline • Introduction • Kernighan-Lin Algorithm • Fiduccia-Mattheyses Algorithm • Partitioning by Network Flow • Clustering • End-case Partitioning (and Placement)

  37. Fiduccia-Mattheyses Algorithm: Basic Ideas • Differences from KL: • Move only one cell each time. • Cells can have different sizes. • Nets can be multi-terminal. • Maintain a balanced partition after every move.

  38. FM Algorithm • Start with a balanced partition P = {X,Y}. • Repeat • For i = 1 to n: • Choose a free cell b ∈ X ∪ Y s.t. moving b to the other side gives the highest gain, gain(b), and moving b preserves balance in P. • Move and lock b. • Let gi = gain(b). • Find k s.t. G = g1 + g2 + … + gk is maximized and shuffle the cells up to this kth step. • Until G = 0.
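The loop above can be sketched compactly on a hypergraph. This version is deliberately naive: it recomputes the cut for every candidate move instead of using the gain-bucket structure that gives FM its speed, and it expresses balance as assumed lower and upper bounds `lo` and `hi` on each side's total size. The netlist, cell sizes, and bounds are hypothetical.

```python
def cut_size(nets, side):
    """Number of nets with pins on both sides; side maps cell -> 0 or 1."""
    return sum(1 for net in nets if len({side[c] for c in net}) == 2)

def fm_pass(nets, side, size, lo, hi):
    """One FM pass: return (best prefix gain, cells to move)."""
    side = dict(side)                      # tentative assignment
    weight = [0, 0]
    for c, s in side.items():
        weight[s] += size[c]
    free, moves, gains = set(side), [], []
    while True:
        base, best = cut_size(nets, side), None
        for c in sorted(free):
            s = side[c]
            # Skip moves that would violate the balance bounds.
            if weight[s] - size[c] < lo or weight[1 - s] + size[c] > hi:
                continue
            side[c] = 1 - s
            gain = base - cut_size(nets, side)
            side[c] = s
            if best is None or gain > best[0]:
                best = (gain, c)
        if best is None:
            break
        gain, c = best
        s = side[c]
        side[c] = 1 - s                    # move and lock the cell
        weight[s] -= size[c]
        weight[1 - s] += size[c]
        free.remove(c)
        moves.append(c); gains.append(gain)
    best_k, best_gain, running = 0, 0, 0
    for k, g in enumerate(gains, start=1):
        running += g
        if running > best_gain:
            best_k, best_gain = k, running
    return best_gain, moves[:best_k]

def fm(nets, side, size, lo, hi):
    """Repeat passes until a pass yields no positive gain."""
    side = dict(side)
    while True:
        gain, moves = fm_pass(nets, side, size, lo, hi)
        if gain <= 0:
            return side
        for c in moves:
            side[c] = 1 - side[c]

# Hypothetical netlist: two triangles joined by net (c, d), unit cell sizes.
nets = [("a", "b"), ("b", "c"), ("a", "c"),
        ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
start = {"a": 0, "b": 0, "d": 0, "c": 1, "e": 1, "f": 1}
size = {c: 1 for c in start}

result = fm(nets, start, size, lo=2, hi=4)
print(cut_size(nets, start), "->", cut_size(nets, result))
```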

  39. An Example [Figure: cells a to f moved one at a time across the partition, with gains g1 to g4; moved cells are locked]

  40. An Example [Figure: the remaining moves, with gains g5 and g6] • If G = g1 + g2 + g3 + g4 is the largest partial sum, the partition after this pass is: [Figure: resulting partition of cells a to f]

  41. Balanced Partition • A partition P = (X,Y) is balanced iff: [equation not preserved in the transcript] for some constant r ≤ 1, where w(X) is the total size of the cells in X. • To preserve balance, a cell b is moved in a pass only if: [equation not preserved in the transcript] after moving b, where W = w(X ∪ Y) and Smax is the maximum cell size.

  42. KL and FM Extensions: Tie-Breaking Strategy • When picking the highest gain move, break ties by looking ahead a certain number of steps. • If ties still occur, some researchers observe that LIFO order improves solution quality.

  43. Ratio of #edges to #vertices • Solution quality of KL and FM depends on the ratio of #edges to #vertices: good if ratio > 5 and bad if ratio < 3. VLSI circuits typically have a ratio of 1.8-2.5. • Goldberg and Burstein suggested contracting edges to increase the ratio. [Figure: edge between nodes A and B contracted into a single node AB]

  44. Outline • Introduction • Kernighan-Lin Algorithm • Fiduccia-Mattheyses Algorithm • Partitioning by Network Flow • Clustering • End-case Partitioning (and Placement)

  45. Network Flow Technique • The network flow technique can find the min-cut bipartition optimally, but not necessarily balanced. • Apply the algorithm repeatedly to obtain a balanced bipartition. [Figure: flow network on nodes s, a, b, c, d, t with edge capacities, and the same network with a maximum flow; min-cut = max-flow]

  46. Network Flow Technique • The network flow technique is very useful in many different research areas. • Many sophisticated improvements have been made to the original algorithm. • Ford & Fulkerson: O(|E||f|) where |f| is the size of the total flow. Note that for unit capacity, |f| ≤ |E|, so O(|E|²) time.
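The link between max-flow and min-cut can be seen in a short sketch. It uses the breadth-first (Edmonds-Karp) variant of Ford and Fulkerson's augmenting-path method, chosen here for simplicity, and reads the source side of a min-cut off the final residual graph. The five-edge network is a hypothetical example, not the one in the slide's figure.

```python
from collections import deque

def max_flow_min_cut(cap, s, t):
    """Return (max flow value, set of nodes on the source side of a min cut).

    cap maps each node to a dict of {successor: capacity}.
    """
    # Residual capacities, with zero-capacity reverse edges added.
    res = {u: dict(vs) for u, vs in cap.items()}
    for u, vs in cap.items():
        for v in vs:
            res.setdefault(v, {}).setdefault(u, 0)

    flow = 0
    while True:
        # Breadth-first search over edges with remaining capacity.
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            # No augmenting path: nodes still reachable from s form one side.
            return flow, set(parent)
        # Walk back from t to collect the path, then push the bottleneck.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(res[u][v] for u, v in path)
        for u, v in path:
            res[u][v] -= bottleneck
            res[v][u] += bottleneck
        flow += bottleneck

# Hypothetical network.
cap = {"s": {"a": 3, "b": 2}, "a": {"t": 1, "b": 1}, "b": {"t": 3}}
value, source_side = max_flow_min_cut(cap, "s", "t")
print(value, sorted(source_side))
```

Here the edges leaving the source side {s, a} have capacities 2 + 1 + 1 = 4, equal to the flow value, but the two sides hold 2 and 2 nodes only by coincidence; nothing in the method enforces balance.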

  47. Circuit Partitioning • We can apply the network flow algorithm in partitioning circuits. • The biggest problem is that the two partitions may not be balanced. • The problem of obtaining two balanced partitions with minimum cut is NP-complete. • However we can apply some heuristics to balance the two partitions.

  48. Flow-Balanced-Bipartition (FBB) • 1. Find a min-cut C = (X,Y) in the network N • 2. If (1-ε)W/2 ≤ w(X) ≤ (1+ε)W/2, stop and return C • 3. If w(X) < (1-ε)W/2 then • Collapse all nodes in X to s • Collapse to s a node v ∈ Y incident on a net in C • Go to step 1 • 4. If w(X) > (1+ε)W/2 … (similarly) ... Why do we need this step?

  49. Circuit Representation • Another problem in applying the network flow technique in circuit partitioning is how to represent a circuit correctly by a graph. [Figure: netlist on cells A, B, C, D] How to represent this netlist by a simple graph?

  50. Hypergraph • In a hypergraph, an edge is a set of vertices. • H(V,E) where V = {A, B, C, D}, E = {n1, n2, n3}, n1 = {A, B, C, D}, n2 = {A, B}, n3 = {C, D} • Circuits can be represented by hypergraphs, but the network flow method can only be used on simple graphs. [Figure: hypergraph on vertices A, B, C, D]
