
Arkadiy Morgenshtein, Eby G. Friedman, Ran Ginosar, Avinoam Kolodny

Timing Optimization in Logic with Interconnect. Arkadiy Morgenshtein, Eby G. Friedman, Ran Ginosar, Avinoam Kolodny. Technion – Israel Institute of Technology. SLIP (System Level Interconnect Prediction) 2008.





Presentation Transcript


  1. Timing Optimization in Logic with Interconnect. Arkadiy Morgenshtein, Eby G. Friedman, Ran Ginosar, Avinoam Kolodny. Technion – Israel Institute of Technology. SLIP (System Level Interconnect Prediction) 2008.

  2. Intro: Timing Optimization. Timing optimization of a function between points A and B. Special cases: only gates; only wires. Typically, a mixture of both.

  3. Intro: Logic with Wires. Common example: UART design [Figure: layout with elements numbered 1–5]

  4. Intro: The Interconnect Wall. Logic w/o wires: logic gate sizing (Logical Effort). Long wires: interconnect optimization (Repeater Insertion).

  5. Intro: Timing Optimization in Logic with Interconnect (path from A to B). Logic w/o wires; long wires.

  6. Existing Techniques: A (very) Short Tutorial

  7. Intro: Logical Effort (only logic). Delay model: $\text{Delay} = \tau\,(g \cdot h + p)$, where $\tau$ is the delay of a minimal inverter, $R_0 \cdot C_0$, a technology constant; $g$ is the logical effort, a gate type factor (e.g. $g_{inv} = 1$); $h$ is the electrical effort, the load driving capability; $p$ is the parasitic effort, due to output capacitance. Optimal sizing: $\text{Delay}_i = \text{Delay}_{i+1}$, i.e. $g_i h_i = g_{i+1} h_{i+1}$. I. Sutherland, B. Sproull, and D. Harris, “Logical Effort - Designing Fast CMOS Circuits,” Morgan Kaufmann, 1999.
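
The equal-stage-effort rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, assuming a single path with no wires and no side branches; the function name `logical_effort_sizing` and its arguments are hypothetical, capacitances are in units of C0, and delay is in units of the minimal-inverter delay.

```python
from math import prod

def logical_effort_sizing(g, c_in, c_load, p=None):
    """Size a chain of gates so that every stage has the same effort g_i * h_i.

    g      -- logical effort of each stage (inverter = 1)
    c_in   -- fixed input capacitance of the first gate
    c_load -- fixed load capacitance at the end of the path
    p      -- optional parasitic effort per stage (defaults to 0)
    Returns (stage_effort, input_capacitances, path_delay).
    """
    n = len(g)
    p = p if p is not None else [0.0] * n
    # Path effort F = G * H; the optimum splits it equally: f = F^(1/n).
    path_effort = prod(g) * (c_load / c_in)
    f = path_effort ** (1.0 / n)
    # Walk backwards from the load: C_i = g_i * C_(i+1) / f.
    caps = [0.0] * n
    c_next = c_load
    for i in reversed(range(n)):
        caps[i] = g[i] * c_next / f
        c_next = caps[i]
    delay = n * f + sum(p)
    return f, caps, delay

# Three inverters driving a load 64 times the input capacitance.
f, caps, delay = logical_effort_sizing([1.0, 1.0, 1.0], 1.0, 64.0)
print(f, caps, delay)
```

For the three-inverter example the stage effort comes out at about 4, so each gate is roughly four times larger than its predecessor.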

  8. Intro: Limitations of Logical Effort • No wires • No fixed side branches. For logic with wires and branches, LE breaks down.

  9. Intro: Repeater Insertion (only wires). Unbuffered wire: Delay ~ Length², D = RC = 25. Wire segmented by repeaters: Delay ~ Length, D = Σrc = 5. Optimal sizing and optimal number of repeaters depend on: wire resistance; effective resistance of minimal inverter; wire capacitance; gate capacitance of minimal inverter. H.B. Bakoglu, “Circuits, Interconnections and Packaging for VLSI,” Addison-Wesley, pp. 194‑219, 1990
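
The quadratic-versus-linear behavior on this slide can be reproduced with a small Python sketch. The values R = C = 5 and five segments are an assumption chosen to match the slide's 25 and 5, and the repeaters' own delay is ignored in that comparison. The function `bakoglu_repeaters` uses the commonly quoted closed forms for the optimal repeater count and size; the slide's own expressions are not legible here, so treat these as an assumption. All function and parameter names are hypothetical.

```python
from math import sqrt

def unbuffered_wire_delay(r_wire, c_wire):
    """Lumped RC delay of the whole wire: grows with length squared."""
    return r_wire * c_wire

def segmented_wire_delay(r_wire, c_wire, k):
    """Sum of per-segment rc delays for k equal segments (repeater delay ignored)."""
    r_seg = r_wire / k
    c_seg = c_wire / k
    return k * r_seg * c_seg

def bakoglu_repeaters(r_wire, c_wire, r0, c0):
    """Commonly quoted optimum: number of repeaters and size relative to a minimal inverter.

    r0, c0 -- effective resistance and gate capacitance of a minimal inverter
    """
    k_opt = sqrt(0.4 * r_wire * c_wire / (0.7 * r0 * c0))
    h_opt = sqrt(r0 * c_wire / (r_wire * c0))
    return k_opt, h_opt

print(unbuffered_wire_delay(5.0, 5.0))      # whole wire, D = RC
print(segmented_wire_delay(5.0, 5.0, 5))    # five unit segments, D = sum of rc
print(bakoglu_repeaters(1000.0, 1e-12, 10000.0, 1e-15))
```

Note that `h_opt` does not depend on the wire length, which matches the "single optimal size for a given process and metal layer" property listed on the next slide.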

  10. Intro: Properties of Repeater Insertion. Assumptions of basic repeater insertion (RI): • Equal size • Equal spacing • Terminal gates are similar to repeaters. Characteristics of RI: • Number and size of repeaters are independent • Single optimal size for a given process and metal layer

  11. So, What Are We Going To Do?

  12. Intro: We Are Breaking The Wall. Logic w/o wires: Logical Effort. Long wires: Repeater Insertion. WANTED – solution for the mixed case. Challenges: • Gate placements • Gate sizes • Number of gates, repeaters

  13. Our Approach to Timing Optimization • Logic Gates as Repeaters (LGR): gate placement (along the wire) • Unified Logical Effort (ULE): gate sizes • Gate-terminated Sized Repeater Insertion (GSRI): number of repeaters

  14. Logic Gates as Repeaters - LGR “Where should the gates be located (along the wire)?”

  15. LGR: The Idea • Problem – delay reduction in logic with wires • A solution – wire segmenting by repeaters • Drawback – power and area w/o logical functionality = waste • Proposed – logic gates as repeaters (LGR): distribution of logic gates over interconnect; driving the partitioned wire without adding repeaters. K. Venkat, “Generalized Delay Optimization of Resistive Interconnections through an Extension of Logical Effort,” ISCAS 1993

  16. LGR: LGR Delay Modeling. Total delay. M. Moreinis, A. Morgenshtein, I. Wagner, and A. Kolodny, “Logic Gates as Repeaters (LGR) for Area-Efficient Timing Optimization,” IEEE TVLSI, 2006

  17. LGR: Optimal Wire Segmenting • Output resistance of driving gate i below average → length of wire i is increased • Input capacitance of successor gate i+1 above average → length of wire i is decreased • All gates are equal → equal partitioning • In the case of a negative segment length, neighboring gates are merged
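
The segmenting trends listed above can be checked on the smallest possible case: one gate placed somewhere along a wire between a driver and a receiver. The sketch below assumes an Elmore-type delay for each stage and is not the authors' formulation; all names (`stage_delay`, `optimal_first_segment`, `r_w`, `c_w`, and so on) are hypothetical, with `r_w` and `c_w` the wire resistance and capacitance per unit length.

```python
def stage_delay(r_drv, length, c_next, r_w, c_w):
    """Assumed Elmore delay of a gate (output resistance r_drv) driving a wire
    segment of the given length into the next gate's input capacitance c_next."""
    c_seg = c_w * length
    r_seg = r_w * length
    return r_drv * (c_seg + c_next) + r_seg * (0.5 * c_seg + c_next)

def total_delay(l1, total_len, r1, r2, c2, c3, r_w, c_w):
    """Driver (r1) -> segment l1 -> middle gate (c2, r2) -> rest of wire -> load c3."""
    return (stage_delay(r1, l1, c2, r_w, c_w)
            + stage_delay(r2, total_len - l1, c3, r_w, c_w))

def optimal_first_segment(total_len, r1, r2, c2, c3, r_w, c_w):
    """Length of the first segment that minimizes total_delay (from d/dl1 = 0).

    A weaker second gate (r2 > r1) lengthens the first segment; a larger
    input capacitance of the middle gate (c2 > c3) shortens it.
    """
    l1 = 0.5 * total_len + 0.5 * ((r2 - r1) / r_w + (c3 - c2) / c_w)
    # Clamp to the wire: a negative length means the gate sits at the wire end.
    return min(max(l1, 0.0), total_len)

l1 = optimal_first_segment(1000.0, r1=100.0, r2=150.0, c2=10.0, c3=5.0,
                           r_w=1.0, c_w=0.2)
print(l1)
```

With identical gates and loads the expression reduces to half the wire length, which is the equal-partitioning case on the slide.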

  18. LGR: LGR Results. Critical path of an 8-256 decoder circuit • Delay reduction of up to 27% by “moving” the gates • Further delay reduction by scaling and by LGR+RI. M. Moreinis, et al., “Repeater Insertion combined with LGR Methodology for on-Chip Interconnect Timing Optimization,” ICECS, 2004.

  19. LGR: Optimal Gate Scaling • Enlargement of all gates by a uniform factor S to minimize delay • Can be performed iteratively with segmenting [Figure labels: equal segments, inverters]

  20. LGR: LGR Segmenting and Scaling. Uniform scaling performed for all gates • For intermediate wires, LGR outperforms RI by up to 55% • For long wires, RI is faster, BUT it requires 44 repeaters • Best for long wires: combined LGR and RI. M. Moreinis, et al., “Repeater Insertion combined with LGR Methodology for on-Chip Interconnect Timing Optimization,” ICECS, 2004.

  21. LGR: Summary • Logic gates serve as repeaters • No need for logically redundant repeaters • Delay reduction + lower area/power • Can be combined with RI

  22. Unified Logical Effort - ULE “What is the optimal size of the gates?”

  23. ULE: Unified Delay Model (including wires). Labeled terms: capacitive interconnect effort, resistive interconnect effort

  24. ULE: Minimal Delay Condition. Minimal delay; equal stage delays

  25. ULE: Minimal Delay for Capacitive Wires. General RC interconnect; capacitive interconnect (short wires and branches)

  26. ULE: ULE Convergence to LE and RI. Special cases: • logic without wires → Logical Effort • repeater insertion → repeater scaling

  27. ULE: Some Algebra…

  28. ULE: Intuition of ULE Optimum. At the optimal size, the delay caused by gate capacitance should be equal to the delay caused by gate resistance

  29. ULE: ULE Optimality • Size too small → high resistance • Size too big → high capacitance

  30. ULE: Optimal Gate Capacitance • Expression for size of a single gate • Gate sizes along a logic path are iteratively determined
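
The iterative sizing described on this slide can be sketched as a repeated sweep over the gates. The single-gate expression used below is derived from an assumed Elmore-type stage delay, R_i·(C_w,i + C_i+1) + R_w,i·(0.5·C_w,i + C_i+1) with R_i = R0·C0·g_i / C_i, by balancing the delay a gate adds as a load against the delay it adds as a driver. It is an illustration under that assumption, not necessarily the exact expression on the slide, and the function name `ule_size_path` and its parameters are hypothetical.

```python
from math import sqrt

def ule_size_path(g, r_wire, c_wire, c_first, c_load, r0=1.0, c0=1.0, sweeps=200):
    """Iteratively size gates on a path where gate i drives wire i into gate i+1.

    g              -- logical effort of each gate
    r_wire, c_wire -- resistance and capacitance of the wire after each gate
    c_first        -- fixed input capacitance of the first gate
    c_load         -- fixed load capacitance after the last wire
    Returns the input capacitance of every gate.
    """
    n = len(g)
    caps = [c_first] * n + [c_load]          # caps[n] holds the fixed load
    for _ in range(sweeps):
        for i in range(1, n):                # the first gate stays fixed
            r_prev = r0 * c0 * g[i - 1] / caps[i - 1]
            load_i = c_wire[i] + caps[i + 1]
            # Balance: (r_prev + r_wire[i-1]) * C_i == R_i * load_i
            caps[i] = sqrt(r0 * c0 * g[i] * load_i / (r_prev + r_wire[i - 1]))
    return caps[:n]

# Without wires the result matches Logical Effort sizing.
print(ule_size_path([1.0, 1.0, 1.0], [0.0] * 3, [0.0] * 3, 1.0, 64.0))
# With wires the gate sizes shift to account for wire resistance and capacitance.
print(ule_size_path([1.0, 1.0, 1.0], [0.5] * 3, [10.0] * 3, 1.0, 10.0))
```

Each update depends only on a gate's two neighbors, so a sweep is linear in the path length, which is in keeping with the low computational cost claimed for ULE later in the talk.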

  31. ULE: Examples (1): ULE Sizing • Equal wires • Total electrical effort H = 10 • L = 0 → size converges to LE • Longer wires → ULE is faster • Long wires → fixed sizing xopt [Plot: gate capacitance (×C0) vs. gate # (1–9), one curve per wire length from L = 0 (LE) to L = 1 mm]

  32. ULE: Examples (2): ULE Sizing • Total electrical effort H = 1 • L = 0 → converges to LE (no scaling) • All wire lengths → ULE is faster • Long wires → fixed sizing xopt [Plot: gate capacitance (×C0) vs. gate # (1–9), for L = 0 (LE), 10 µm, 50 µm, 100 µm, 0.5 mm, 1 mm]

  33. ULE: So, What is Xopt? (for long wires)

  34. ULE: Optimum Condition for Long Wires

  35. ULE: Xopt and Repeaters. Optimal sizing condition for a repeater: equal wires, INV (g=1). H.B. Bakoglu, “Circuits, Interconnections and Packaging for VLSI,” Addison-Wesley, pp. 194‑219, 1990
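
One way to see why a chain of identical gates separated by equal wires settles on a fixed size is the short derivation below. It is a sketch under an assumed Elmore-type stage delay; the symbols R_w and C_w (resistance and capacitance of one wire segment) and the size x_i in units of C0 are introduced here for illustration.

```latex
% Assumed delay of gate i (size x_i, logical effort g) driving one wire segment:
%   D_i = \frac{g R_0}{x_i}\,(C_w + x_{i+1} C_0) + R_w\,(C_w/2 + x_{i+1} C_0) + \text{const}
\begin{align}
\frac{\partial D}{\partial x_i}
  &= \left(\frac{g R_0}{x_{i-1}} + R_w\right) C_0
   - \frac{g R_0}{x_i^{2}}\left(C_w + x_{i+1} C_0\right) = 0 \\
% Uniform chain: x_{i-1} = x_i = x_{i+1} = x_{opt}; multiply through by x_{opt}
g R_0 C_0 + R_w C_0\, x_{opt}
  &= \frac{g R_0 C_w}{x_{opt}} + g R_0 C_0 \\
x_{opt} &= \sqrt{\frac{g\, R_0\, C_w}{R_w\, C_0}}
\end{align}
```

For an inverter (g = 1) this reduces to the classical repeater size, and because R_w and C_w both scale with segment length, the result is independent of that length.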

  36. ULE: Solving Design Problems with Xopt • Layout constraint – optimal size of the repeater located between two wires

  37. ULE: Solving Design Problems with Xopt • Cell size constraint – optimal wire length with a repeater of size xrep

  38. ULE: Typical Design Example • Optimal ULE sizing for three cases: similar gates, similar wires; different gates, similar wires; similar gates, different wires • Gates with higher logical effort get a bigger size • No fixed xopt in circuits with various gates and wires

  39. ULE: ULE Results. Simulation setup: • Critical path in a logic circuit (e.g. adder) • Compared to Cadence Virtuoso® Analog Optimizer (using numerical algorithms) • 65 nm CMOS

  40. ULE: Delay Optimization. Logical Effort: higher delay; ULE: minimal delay; Analog Optimizer: minimal delay (but sloooooow) • LE becomes inaccurate as the wire length grows • ULE is within 9% of the Analog Optimizer tool

  41. ULE: Run Time Comparison [Plot: run time (min)] • ULE run time is orders of magnitude shorter than the run time of Analog Optimizer • ULE run time is shorter than 1 second

  42. ULE: Power-Delay Optimization in ULE. Power is a function of gate and wire capacitances. Optimal gate size Ci

  43. ULE: Sizing for minimal P×D • Random logic path assumed with 10 stages (gates x1–x10, wires L1–L9) • Four wire length scenarios: S1: all wires L = 100µm; S2: all wires L = 80µm; S3: all wires L = 400µm; S4: L = {900,600,150,300,800,200,400,150,250} • Power-Delay optimization reduces gate sizes as compared to Delay optimization [Plot (S4): gate size (×C0), minimal Delay vs. minimal Power×Delay]

  44. ULE: Reduced Energy, Low Delay Penalty [Plots: delay (ps) and energy (pJ) for scenarios S1–S4, minimal Delay vs. minimal Power-Delay]

  45. ULE: ULE for Branches and Fanout. General ULE condition for gate sizing

  46. ULE: Sizing in Path with Branches • Four branch scenarios: S1: Lb = 400µm, Cb = 1 for all branches; S2: Lb = 400µm, Cb = 30 for all branches; S3: Lb = {400, 100, 400, 400}µm, Cb = {30,1,30,1}; S4: Lb = {100, 100, 100, 400}µm, Cb = {1,1,1,30} • Lw = 100µm for all wires at critical path • Branches cause a change in sizing as compared to ULE without branches [Plot: Gate Sizing with Branches, gate size vs. gate # (1–10), for S1–S4 and no branches]

  47. ULE: Delay Optimization with Branches • Additional delay reduction is obtained using the extended ULE condition with branches

  48. ULE: Unified Logical Effort Summary • Useful over entire range of problems: logic only, logic & wires, wires only • Computes optimal gate sizes • Low computational complexity

  49. ULE: One More Question: “When can I reduce delay by adding an inverter?”

  50. ULE: Adding an Inverter to Reduce Delay. Condition for inverter insertion
