An O(bn^2) Time Algorithm for Optimal Buffer Insertion with b Buffer Types

An O(bn^2) Time Algorithm for Optimal Buffer Insertion with b Buffer Types. Authors: Zhuo Li and Weiping Shi. Presenter: Sunil Khatri. Department of Electrical Engineering, Texas A&M University, College Station, Texas 77845, USA.


Presentation Transcript


  1. An O(bn^2) Time Algorithm for Optimal Buffer Insertion with b Buffer Types • Authors: Zhuo Li and Weiping Shi • Presenter: Sunil Khatri • Department of Electrical Engineering, Texas A&M University, College Station, Texas 77845, USA

  2. Outline • Introduction • O(b^2n^2) Algorithm • New O(bn^2) Algorithm • Experimental Results • Extension • Conclusion

  3. Introduction • Buffer insertion and sizing is one of the most effective methods for reducing interconnect delay. Saxena, et al. [TCAD 2004]

  4. Introduction (cont.) • Modern libraries contain hundreds of different buffers with different characteristics. • Polarity, input capacitance, driving resistance, intrinsic delay, noise margin, power, area, etc. • Buffer library size has a quadratic effect on running time in traditional algorithms. • With such a large number of buffers and buffer types, fast algorithms for buffer insertion are crucial for timing closure.

  5. Problem Formulation • Given: a routing tree, n possible buffer positions, sink capacitances and required arrival times (RAT), a buffer library, and wire resistance and capacitance. • Delay model: Elmore delay for interconnect and a linear delay model for buffers. [Figure: routing tree with source s0, sinks s1 to s4, possible buffer positions, and a buffer library]

  6. Maximum Slack Problem • Find: where to insert buffers so that the slack at the source, Q(s0), is maximized, where $Q(s_0) = \min_{i>0} \{ RAT(s_i) - delay(s_0, s_i) \}$. [Figure: routing tree with source s0 and sinks s1 to s4; without buffers, Q(s0) = -50 ps]

  7. Maximum Slack Problem • Find: where to insert buffers so that the slack at the source, Q(s0), is maximized, where $Q(s_0) = \min_{i>0} \{ RAT(s_i) - delay(s_0, s_i) \}$. [Figure: routing tree with source s0 and sinks s1 to s4; with 2 buffers, Q(s0) = 100 ps]
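The slack definition can be checked on a small example. The sketch below is illustrative only: the function name `source_slack` and all RAT and delay values are hypothetical, and the delays are given directly rather than computed from an Elmore model.

```python
def source_slack(rat, delay):
    """Return Q(s0) = min over sinks s_i of RAT(s_i) - delay(s0, s_i).

    rat and delay are dicts keyed by sink name (hypothetical layout).
    """
    if rat.keys() != delay.keys():
        raise ValueError("rat and delay must cover the same sinks")
    return min(rat[s] - delay[s] for s in rat)


# Hypothetical numbers in picoseconds, chosen only for illustration.
rat = {"s1": 400.0, "s2": 350.0, "s3": 500.0, "s4": 450.0}
delay = {"s1": 420.0, "s2": 400.0, "s3": 380.0, "s4": 300.0}

# The slack at the source is set by the most critical sink (s2 here).
print(source_slack(rat, delay))
```

Buffer insertion changes the delay terms, so maximizing Q(s0) means choosing the buffer placement whose worst sink is as relaxed as possible.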

  8. Previous Research • Maximum Slack • van Ginneken [ISCAS 90]: O(n^2) time and space, where n is the number of buffer positions. • Lillis, Cheng and Lin [TCAS 96]: O(b^2n^2) time and space for b buffer types. • Shi and Li [DAC 03]: O(n log n) time for 2-pin nets, O(n log^2 n) time for multi-pin nets, O(n log n) space. • Minimum Buffer Cost (Area, Power, etc.) • Lillis, Cheng and Lin [TCAS 96]: pseudo-polynomial time algorithm. • Shi, Li and Alpert [ASPDAC 04]: buffer cost minimization is NP-hard if b is a variable.

  9. Outline • Introduction • O(b^2n^2) Algorithm • New O(bn^2) Algorithm • Experimental Results • Extension • Conclusion

  10. Dynamic Programming • Each candidate solution of a sub-tree is represented by a (Q, C) pair, where Q is slack and C is downstream capacitance. • For any two candidates A1 and A2 of the same sub-tree, if Q(A1) ≤ Q(A2) and C(A1) ≥ C(A2), then A1 is redundant. • O(b^2n^2) time dynamic programming algorithm (Lillis-Cheng-Lin) • For b buffer types, the number of candidates is at most bn+1 • For a wire, update the (Q, C) value of every candidate in O(bn) time • For a buffer position, add b new candidates in O(b^2n) time • For a branch point, merge two sets of candidates in O(bn1 + bn2) time, where n1 and n2 are the numbers of buffer positions in the two branches

  11. Dynamic Programming • Each candidate solution of a sub-tree is represented by a (Q, C) pair, where Q is slack and C is downstream capacitance. • For any two candidates A1 and A2 of the same sub-tree, if Q(A1) ≤ Q(A2) and C(A1) ≥ C(A2), then A1 is redundant. • O(bn^2) time dynamic programming algorithm (This paper) • For b buffer types, the number of candidates is at most bn+1 • For a wire, update the (Q, C) value of every candidate in O(bn) time • For a buffer position, add b new candidates in O(bn) time • For a branch point, merge two sets of candidates in O(bn1 + bn2) time, where n1 and n2 are the numbers of buffer positions in the two branches
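The redundancy rule and the wire update can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the wire update assumes the usual Elmore form (a wire with resistance r and capacitance c adds delay r·(c/2 + C)), and the pruning step sorts its input for simplicity instead of maintaining a sorted linked list.

```python
def add_wire(cands, r_wire, c_wire):
    """Propagate (Q, C) candidates across a wire (assumed Elmore model)."""
    return [(q - r_wire * (c_wire / 2.0 + c), c + c_wire) for q, c in cands]


def prune_redundant(cands):
    """Drop every candidate that has no better Q and no smaller C than another.

    Returns the survivors sorted by decreasing Q and decreasing C.
    """
    kept = []
    # Visit by increasing C; for equal C, the larger Q comes first.
    for q, c in sorted(cands, key=lambda qc: (qc[1], -qc[0])):
        # Keep a candidate only if it strictly improves Q over all
        # candidates with smaller or equal C seen so far.
        if not kept or q > kept[-1][0]:
            kept.append((q, c))
    kept.reverse()
    return kept


cands = [(21, 5), (10, 4.5), (19, 4), (15, 3), (7, 2), (6, 1), (5, 1)]
print(prune_redundant(cands))
print(add_wire([(21.0, 5.0)], r_wire=2.0, c_wire=2.0))
```

In the example, (10, 4.5) is redundant because (19, 4) has both better slack and less capacitance, and (5, 1) is redundant because of (6, 1).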

  12. Data Structure: Linked List • Use a linked list to store non-redundant candidates • Sorted in decreasing Q and decreasing C order • Each entry also contains the list of buffer positions [Figure: linked list (Q1, C1), (Q2, C2), (Q3, C3), with labels "Better Slack" and "Less Capacitance"]
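Keeping the candidates sorted is what allows two candidate lists to be merged at a branch point in linear time. The sketch below assumes the standard rule for combining two branches (slack is the minimum of the two slacks, capacitance is the sum) and uses Python lists in place of linked lists; the name `merge_branches` is hypothetical.

```python
def merge_branches(left, right):
    """Merge two non-redundant (Q, C) lists, each in decreasing Q and C order.

    A merged candidate has Q = min(Q_left, Q_right) and C = C_left + C_right.
    Runs in time linear in len(left) + len(right).
    """
    l, r = left[::-1], right[::-1]  # walk in increasing Q and C order
    i = j = 0
    out = []
    while i < len(l) and j < len(r):
        (ql, cl), (qr, cr) = l[i], r[j]
        out.append((min(ql, qr), cl + cr))
        # Advance the side that limits the slack; pairing it with a
        # larger-C partner could only add capacitance without gaining slack.
        if ql <= qr:
            i += 1
        if qr <= ql:
            j += 1
    return out[::-1]  # back to decreasing order


print(merge_branches([(10, 3), (5, 1)], [(8, 4), (6, 2)]))
```

Each loop iteration advances at least one pointer, so the number of merged candidates is at most the sum of the two list lengths.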

  13. Best Candidates • For each buffer Bi, R(Bi) is the buffer driver resistance, C(Bi) is the buffer input capacitance, and t(Bi) is the buffer intrinsic delay. Label buffers in non-decreasing order of resistance: R(B1) ≤ R(B2) ≤ … ≤ R(Bb). • For each buffer type Bi • Define the best candidate αi as the candidate that maximizes slack among all candidates after Bi is inserted. • The new slack is Q(αi) - R(Bi)C(αi) - t(Bi). • Define the new candidate βi as the candidate formed by αi with buffer type Bi. • The key problem addressed in this paper is how to find all best candidates quickly.

  14. Example • Three buffer types • R(B1)=1, C(B1), t(B1) • R(B2)=3, C(B2), t(B2) • R(B3)=5, C(B3), t(B3) • Candidates (Q, C): (21, 5) (19, 4) (15, 3) (7, 2) (6, 1) • Insert B1: (16 - t(B1), C(B1)) (15 - t(B1), C(B1)) (12 - t(B1), C(B1)) (5 - t(B1), C(B1)) (5 - t(B1), C(B1)) • Insert B2: (6 - t(B2), C(B2)) (7 - t(B2), C(B2)) (6 - t(B2), C(B2)) (1 - t(B2), C(B2)) (3 - t(B2), C(B2)) • Insert B3: (-4 - t(B3), C(B3)) (-1 - t(B3), C(B3)) (0 - t(B3), C(B3)) (-3 - t(B3), C(B3)) (1 - t(B3), C(B3)) • Best candidate for B1 is α1 = (21, 5), and the new candidate is β1 • Best candidate for B2 is α2 = (19, 4), and the new candidate is β2 • Best candidate for B3 is α3 = (6, 1), and the new candidate is β3
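The numbers in this example can be reproduced by brute force. The sketch below uses the candidate list and resistances from the slide; the function names are hypothetical, and the intrinsic delay t(Bi) is left out because it shifts every candidate by the same amount and so does not affect which candidate is best.

```python
CANDIDATES = [(21, 5), (19, 4), (15, 3), (7, 2), (6, 1)]  # (Q, C) from the slide
RESISTANCES = {"B1": 1, "B2": 3, "B3": 5}                 # R(Bi) from the slide


def slack_before_intrinsic_delay(cands, r):
    """Return Q - R*C for every candidate (t(Bi) not yet subtracted)."""
    return [q - r * c for q, c in cands]


def best_candidate_index(cands, r):
    """Index of the candidate that maximizes Q - R*C, found by a full scan."""
    values = slack_before_intrinsic_delay(cands, r)
    return max(range(len(values)), key=values.__getitem__)


for name, r in RESISTANCES.items():
    idx = best_candidate_index(CANDIDATES, r)
    print(name, slack_before_intrinsic_delay(CANDIDATES, r), "best:", CANDIDATES[idx])
```

A full scan per buffer type is what makes adding buffers at one position cost O(b^2n) when the list holds up to bn+1 candidates.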

  15. Outline • Introduction • O(b^2n^2) Algorithm • New O(bn^2) Algorithm • Experimental Results • Extension • Conclusion

  16. (Q, C) Plane • Non-redundant (Q, C) list is a monotonically decreasing sequence • As resistance is added, Q values change [Figure: candidates A1 (21, 5), A2 (19, 4), A3 (15, 3), A4 (7, 2), A5 (6, 1) in the (Q, C) plane]

  17. R(B1) = 1, Q=Q–R(B1)*C A1 (21-5, 5)

  18. R(B2) = 3, Q=Q–R(B2)*C A1 (21-15, 5)

  19. R(B3) = 5, Q=Q–R(B3)*C A1 (21-25, 5)

  20. As R Increases, Q Decreases

  21. Best Q for each R • Best Q values move to the left

  22. Best Candidates are in Decreasing Order of C • Lemma 1: C(α1) ≥ C(α2) ≥ … ≥ C(αb) • Not enough for an O(bn) algorithm to find all best candidates. • Need global search [Figure: best candidates α1, α2, α3]

  23. Convex Pruning • Convex pruning prunes candidates like A4 [Figure: candidates A1, A2, A3, A4, A5; A4 is pruned]

  24. Before Convex Pruning [Figure: candidate list, labeled "Non-Convex"]

  25. After Convex Pruning [Figure: pruned candidate list with best candidates α1, α2, α3]

  26. Convex Hull • After convex pruning, the remaining list is a convex hull • Lemma 3: Best candidates must be on the convex hull • A candidate is on the convex hull if and only if there exists a resistance R such that, when R is added, this candidate gives maximum Q • Lemma 4: On the convex hull, if Ai gives maximum Q among neighboring candidates, Ai gives maximum Q among all candidates • The slope (Qi - Qj)/(Ci - Cj) between candidates Ai and Aj is the added resistance value above which the candidate with smaller C has better slack than the other • On the convex hull, slopes are in sorted order • Local Optimal ⇒ Global Optimal
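The role of the slope can be verified with a short derivation. It uses only the (Q, C) definitions from the slides, writes R for the added driver resistance, and assumes that Ai is the candidate with the smaller capacitance (Cj > Ci):

```latex
\begin{align*}
Q_i - R\,C_i \;\ge\; Q_j - R\,C_j
  &\iff R\,(C_j - C_i) \;\ge\; Q_j - Q_i \\
  &\iff R \;\ge\; \frac{Q_j - Q_i}{C_j - C_i} \;=\; \frac{Q_i - Q_j}{C_i - C_j},
  \qquad \text{assuming } C_j > C_i .
\end{align*}
```

For example, the slope between (19, 4) and (21, 5) is 2, so with R = 3 the candidate (19, 4) gives slack 19 - 12 = 7, which beats 21 - 15 = 6 for (21, 5).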

  27. Local Optimal ⇒ Global Optimal • For any R(Bi), if A2 gives better slack than A1 and A3, then A2 is the best candidate for Bi. [Figure: convex hull with candidates A1, A2, A3, A5]

  28-31. Find Convex Hull: Graham's Scan • Since the points are sorted, Graham's scan can perform convex pruning in linear time [Figure, repeated across four slides: candidate points plotted on Q and C axes]
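A stack-based scan over the sorted list is one way to realize this linear-time pruning. The sketch below is an illustration under stated assumptions rather than the authors' code: the input is a non-redundant (Q, C) list in strictly decreasing C order, the name `convex_prune` is hypothetical, and a point is dropped when the slope into it is not smaller than the slope out of it.

```python
def convex_prune(cands):
    """Keep only candidates on the convex hull of a sorted (Q, C) list.

    cands: non-redundant (Q, C) pairs in strictly decreasing C (and Q) order.
    Each point is pushed once and popped at most once, so the scan is linear.
    """
    hull = []
    for q, c in cands:
        while len(hull) >= 2:
            (q1, c1), (q2, c2) = hull[-2], hull[-1]
            # Compare slope(hull[-2], hull[-1]) with slope(hull[-1], new)
            # by cross-multiplication; both denominators are positive.
            if (q1 - q2) * (c2 - c) >= (q2 - q) * (c1 - c2):
                hull.pop()  # middle point is never best for any resistance
            else:
                break
        hull.append((q, c))
    return hull


print(convex_prune([(21, 5), (19, 4), (15, 3), (7, 2), (6, 1)]))
```

On the running example the scan removes (7, 2), which matches the candidate A4 shown as pruned on the Convex Pruning slide; the remaining slopes are 2, 4 and 4.5, in sorted order.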

  32. New Subroutine for Adding Buffer • At each buffer position, given the (Q, C) list N in decreasing C order and the buffer library, where R(B1) ≤ R(B2) ≤ … ≤ R(Bb): • Generate new (Q, C) list A1, A2, …, with convex pruning • Generate new candidates β1, β2, …, with the following loop • Initialize j = 1, then for i = 1 to b do: if Aj gives better slack than Aj+1 then generate new candidate βi for buffer Bi, with Q(βi) = Q(Aj) - R(Bi)C(Aj) - t(Bi) and C(βi) = C(Bi); else j = j + 1 • Sort the βi in non-increasing C order • Insert the βi into the original list N • Step costs annotated on the slide: O(bn), O(bn), O(bn), O(b log b)
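The loop in this subroutine can be sketched as a single forward walk over the convex hull. This is one interpretation of the slide, with assumptions stated: the hull is already convex-pruned and in decreasing C order, buffers are hypothetical (R, C_in, t) tuples sorted by non-decreasing R, and the pointer j advances in an inner while loop so that every buffer type yields a new candidate. The final insertion into list N is omitted.

```python
def new_buffer_candidates(hull, buffers):
    """Generate one new (Q, C) candidate per buffer type.

    hull:    convex-pruned (Q, C) list in decreasing C order.
    buffers: (R, C_in, t) tuples sorted by non-decreasing R (assumed layout).
    The pointer j never moves backward, so the walk costs O(len(hull) + b).
    """
    j = 0
    out = []
    for r, c_in, t in buffers:
        # Move right while the next hull point gives strictly better slack.
        while (j + 1 < len(hull)
               and hull[j + 1][0] - r * hull[j + 1][1] > hull[j][0] - r * hull[j][1]):
            j += 1
        q, c = hull[j]
        out.append((q - r * c - t, c_in))
    # Sort the new candidates in non-increasing C order, as on the slide.
    out.sort(key=lambda qc: qc[1], reverse=True)
    return out


hull = [(21, 5), (19, 4), (15, 3), (6, 1)]
# R values follow the slides' example; C_in and t values are made up.
buffers = [(1, 0.5, 1), (3, 0.3, 2), (5, 0.2, 3)]
print(new_buffer_candidates(hull, buffers))
```

Because the best candidates appear in decreasing order of C as R grows, the pointer only needs to move forward, which replaces a full scan per buffer type with one shared pass.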

  33. O(bn^2) Algorithm • Dynamic programming • For b buffer types, the number of candidates is at most bn+1 • For a wire, update the (Q, C) value of every candidate in O(bn) time • For a buffer position, add b new candidates in O(bn) time • For a branch point, merge two sets of candidates in O(bn1 + bn2) time • Total complexity is O(bn^2).

  34. Outline • Introduction • O(b^2n^2) Algorithm • New O(bn^2) Algorithm • Experimental Results • Extension • Conclusion

  35. Speedup over O(b^2n^2) Algorithm [Figure: speedup results for net1 (337 sinks), net2 (1944 sinks), net3 (2676 sinks)]

  36. Speedup vs. Buffer Positions [Figure: buffer library size 64]

  37. Outline • Introduction • O(b^2n^2) Algorithm • New O(bn^2) Algorithm • Experimental Results • Extension and Conclusion

  38. Extension to Min Buffer Cost • Buffer cost is associated with area and power • Find a solution that satisfies the slack requirement and, at the same time, has minimum buffer cost • Each candidate solution is represented by a (Q, C, W) triple, where Q is slack, C is capacitance, and W is buffer cost • Worst-case NP-hard • Our algorithm can reduce the operation of adding a buffer from O(bN) to O(N), where N is the number of non-redundant candidates
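With a third coordinate, redundancy becomes a three-way comparison. The sketch below assumes the natural extension of the (Q, C) rule, which the slides do not spell out: a triple is redundant if another triple has slack at least as good, capacitance no larger, and cost no larger. The function names are hypothetical, and the quadratic filter is for illustration only.

```python
def dominates(a, b):
    """True if triple a = (Q, C, W) is at least as good as b in all three."""
    return a[0] >= b[0] and a[1] <= b[1] and a[2] <= b[2]


def prune_triples(cands):
    """Return the non-redundant (Q, C, W) triples, keeping input order."""
    unique = list(dict.fromkeys(cands))  # drop exact duplicates
    return [b for b in unique
            if not any(dominates(a, b) for a in unique if a != b)]


triples = [(10, 5, 2), (9, 5, 2), (10, 4, 3), (8, 6, 1)]
print(prune_triples(triples))
```

Here (9, 5, 2) is dropped because (10, 5, 2) has better slack at the same capacitance and cost, while the other three triples each win on at least one coordinate.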

  39. Conclusion • New O(bn^2) algorithm for optimal buffer insertion with b buffer types • Best candidates must be in decreasing order of C • Best candidates must be on the convex hull • Local optimal ⇒ global optimal • Applicable to cost minimization and inverting buffer types

  40. Thank You!
