
Timing Issues for DSM




  1. Timing Issues for DSM R. Brayton U.C. Berkeley

  2. Caveats • This talk is about a work in progress • Much of the work is roughly described with the idea of just communicating the general thrust. • Many details remain to be decided and currently several algorithms are being programmed for experimental purposes. • We are just in the middle of many studies and depending on their results, the direction of the project may change. Tau97

  3. Outline • Introduction - DSM project at Berkeley • Our timing abstraction and motivation • Timing driven placement (wireplanning) • slicing approach • programming approach • matching approach • Iterated logic decomposition • Logic rip-up and re-route • Technology aspects

  4. Overview • Two levels of approach • electrical and technology level • logic level using timing abstraction • Electrical level used to ensure realism • predict technology dimensions • place and wire transistors to create leaf cells using Cadence’s LAS tool or Cadabra • extract parasitics using SPACE or FASTCAP • simulate using SPICE with advanced BSIM model

  5. Overview • Logic level works with a timing abstraction (to be explained) • we need to be sure that the abstraction is correct (thus the electrical experiments) • Cross-talk noise effects on timing are currently ignored • Immediate goal is to build combinational logic macros that meet timing constraints • sequential circuits can be handled similarly

  6. Macro Problem Statement • Given: • rectangular area, with inputs and outputs on the perimeter • required times on outputs, arrival times on inputs • set of logic functions to be synthesized (pin locations may be somewhat flexible) • Find: a logic decomposition of the functions that can be: • placed and wired in the given area • while meeting the timing constraints [Figure: block with outputs f, g marked R (required time) and inputs a, b, c, d marked A (arrival time)]

  7. Some Facts • As dimensions shrink, gate delays decrease and wire delays increase • in the limit all delays are in the wires. • On a net, by a combination of buffer insertion and wire sizing: • the delay of the net from root to any leaf can be made linear in the Manhattan distance from root to leaf.

  8. Linear Delay • By buffer insertion • buffer spacing is determined by the resistance and capacitance of the line and of the buffers • an optimal number of optimally sized buffers makes the delay linear
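
The linearity claim can be checked with a small calculation. The sketch below assumes a plain Elmore delay model, a uniform wire with resistance `r` and capacitance `c` per unit length, and identical buffers with output resistance `r_buf` and input capacitance `c_buf`. These names and the model choice are illustrative assumptions, not taken from the talk, and wire sizing is left out.

```python
import math


def buffered_delay(length, k, r, c, r_buf, c_buf):
    """Elmore delay of a wire of the given length split into k buffered segments.

    Assumed model: each buffer drives its segment's wire capacitance plus the
    next buffer's input, and the distributed wire adds r*c*seg^2/2.
    """
    seg = length / k
    stage = r_buf * (c * seg + c_buf) + r * seg * (c * seg / 2 + c_buf)
    return k * stage


def optimal_buffer_count(length, r, c, r_buf, c_buf):
    """Continuous k that minimizes buffered_delay; it grows linearly with length."""
    return length * math.sqrt(r * c / (2 * r_buf * c_buf))


def optimal_delay(length, r, c, r_buf, c_buf):
    """Delay at the optimal buffer count; every term is proportional to length."""
    return length * (r_buf * c + r * c_buf + math.sqrt(2 * r * c * r_buf * c_buf))
```

Expanding `buffered_delay` gives `k*r_buf*c_buf + r_buf*c*L + r*c_buf*L + r*c*L**2/(2*k)`. Choosing `k` proportional to `L` turns the quadratic wire term into a linear one, which is the effect the slide describes under these assumptions.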

  9. Linear Wire Delay Model for a Net • Delay is made linear by buffer insertion and by wire and buffer sizing [Figure: net drawn against x and y axes]

  10. Timing Abstraction: Linear Delay Model (LDM) • Delay is a linear function of the Manhattan distance, independent of the logic it meets along a path. • Since f depends on b, the delay given by the Manhattan distance from b to f (symbol lost in extraction) is the minimum delay possible on any path from b to f. [Figure labels: a, f, b, c]
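
A minimal sketch of the LDM as stated on this slide, assuming points are `(x, y)` tuples and a single constant `delay_per_unit` (both are illustrative assumptions). It shows that a path through an intermediate point inside the bounding box of b and f achieves the minimum, while a detour outside it adds delay.

```python
def manhattan(p, q):
    """Manhattan distance between two (x, y) points."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def ldm_min_delay(src, dst, delay_per_unit=1.0):
    """Lower bound on the delay of any path from src to dst under the LDM."""
    return delay_per_unit * manhattan(src, dst)


def ldm_path_delay(points, delay_per_unit=1.0):
    """Delay of a path that visits the given points in order under the LDM."""
    hops = zip(points, points[1:])
    return delay_per_unit * sum(manhattan(a, b) for a, b in hops)
```

For example, with b at `(0, 0)` and f at `(4, 3)`, routing through `(2, 1)` keeps the delay at the lower bound, whereas routing through `(5, 1)` leaves the bounding box and lengthens the path.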

  11. Caveat • So far we are not considering the effect of cross-talk noise on delay • The victim can be slowed by the aggressor if their transitions are opposing [Figure: victim and aggressor wires]

  12. Common Divisors May Cause Paths to Stray • But in this example, the longest path is not increased [Figure labels: f, h, g, a, b, c]

  13. Example Where Longest Path Must Be Increased • Any divisor h(a,b) common to both f and g cannot be placed without increasing the longest path [Figure labels: f, b, h, g, a]

  14. Problem 1: Timing Driven Point Placement • Given: area, arrival and required times, pin positions, and a decomposition (netlist) • Find: a point placement that satisfies all timing constraints. • No consideration of the areas required to implement logic gates • Areas of gates can be approximated by the count of literals in factored form
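
Checking whether a given point placement satisfies the timing constraints can be sketched as a delay trace over the netlist. The example below assumes the LDM with zero intrinsic gate delay, a netlist given as a `fanins` dictionary, and positions as `(x, y)` tuples; the data layout and function names are hypothetical.

```python
def manhattan(p, q):
    """Manhattan distance between two (x, y) points."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def arrival_times(fanins, pos, input_arrival, delay_per_unit=1.0):
    """Propagate arrival times from primary inputs through an acyclic netlist.

    fanins maps each non-input node to the nodes that drive it. Under the
    assumed LDM, the delay of an edge is proportional to its Manhattan length.
    """
    arrival = dict(input_arrival)  # copy so the caller's dict is not modified

    def visit(node):
        if node not in arrival:
            arrival[node] = max(
                visit(src) + delay_per_unit * manhattan(pos[src], pos[node])
                for src in fanins[node]
            )
        return arrival[node]

    for node in fanins:
        visit(node)
    return arrival


def meets_timing(fanins, pos, input_arrival, required, delay_per_unit=1.0):
    """True if every output arrives no later than its required time."""
    arrival = arrival_times(fanins, pos, input_arrival, delay_per_unit)
    return all(arrival[out] <= req for out, req in required.items())
```

The same trace could serve as the basis of a placement cost function, for instance by summing the amounts by which outputs miss their required times.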

  15. Pure Point Placement [Figure labels: congested area; f, g, a, b, c]

  16. Problem 2: Placement with Area Constraints • Areas are flexible. Leaf cell “gates” remain to be built. Gate types remain to be determined (PLAs, domino, PTL, etc.) • Three experimental “wireplanning” approaches • slicing • programming • matching

  17. Slicing Approach • Use simulated annealing to get point placement • cost function for SA is derived by doing a delay trace through the placed points • After SA, derive slicing structure from point placement • Use flexibility of areas for final placement

  18. Slicing Approach • Hypothesis: the slicing can be made so that distances are not perturbed too much from the point placement • Distances are currently estimated as center-to-center Manhattan distance • Once we have the slicing structure, we need to build the logic in the allocated blocks • LDM implies that we can build the logic so that delay < distance across the logic sub-block

  19. Programming Approach • Get initial point placement with a force-directed method (or SA) • force points apart to provide space for areas • this gives relative point positions • Distribute slacks using zero-slack distribution • Formulate and solve LP

  20. LP Formulation • Distributed slacks give bounds on wire lengths, dij • Assume an aspect ratio is given for each “gate” • Point placement gives relative positions • All areas are scaled by a factor (symbol lost in extraction) to guarantee feasibility

  21. Matching Approach • Divide area into minimum size squares • Label each square with functions that it can contain without violating timing [Figure: grid with pins f, g, h, a, b, c and example square labels fg/abc, gh/bc, fh/ac]

  22. Matching Approach • Each logic “gate” fans out to a set of primary outputs (fg) and fans in from a set of primary inputs (abc) • Thus a gate is labeled, say, fg/abc • Each gate is given an area (number of literals in factored form) • We want to match gates to squares so that no square’s capacity is violated.
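
One way to read the labeling and matching steps is sketched below. It assumes squares are identified by their center coordinates, that a gate labeled O/I fits a square when every input-to-output path through that square meets timing under the LDM, and it uses a simple greedy assignment. The greedy rule is a stand-in chosen for brevity, not the matching method of the talk, and all names are hypothetical.

```python
def manhattan(p, q):
    """Manhattan distance between two (x, y) points."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def feasible_squares(gate, squares, pins, arrival, required, delay_per_unit=1.0):
    """Squares where the gate can sit without violating timing (assumed LDM test)."""
    return [
        s for s in squares
        if all(
            arrival[i]
            + delay_per_unit * (manhattan(pins[i], s) + manhattan(s, pins[o]))
            <= required[o]
            for i in gate["inputs"]
            for o in gate["outputs"]
        )
    ]


def greedy_match(gates, squares, capacity, pins, arrival, required,
                 delay_per_unit=1.0):
    """Assign each gate to a feasible square without exceeding square capacity.

    Gates with the fewest feasible squares go first; each takes the feasible
    square with the most remaining capacity. Returns None if the heuristic
    gets stuck, which does not prove that no valid matching exists.
    """
    remaining = {s: capacity for s in squares}
    options = {
        name: feasible_squares(g, squares, pins, arrival, required, delay_per_unit)
        for name, g in gates.items()
    }
    assignment = {}
    for name in sorted(gates, key=lambda n: len(options[n])):
        area = gates[name]["area"]
        fits = [s for s in options[name] if remaining[s] >= area]
        if not fits:
            return None
        best = max(fits, key=lambda s: remaining[s])
        remaining[best] -= area
        assignment[name] = best
    return assignment
```

Each gate here is a dictionary with `outputs`, `inputs`, and `area` (for example the literal count), mirroring the fg/abc labels on the slide.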

  23. Iterated Decomposition • Given: netlist and current placement • Select a divisor that can be placed while still satisfying timing constraints [Figure: before and after; after has smaller areas, some paths longer]

  24. Iterated Decomposition • Choose divisor that maximally decreases [quantity lost in extraction] • Algorithm: (1) get initial decomposition (say, minimum area); (2) selectively duplicate nodes and adjust outputs; (3) collapse local trees; (4) global timing driven placement; (5) do { select “best” divisor; locally adjust placement (reset global placement after k divisors) } until area constraints are met
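
The control flow of the loop on this slide can be sketched as follows. Every callable passed in (`select_divisor`, `apply_divisor`, `local_adjust`, `global_place`, `area_of`) is a hypothetical placeholder for a step the slide only names, and the initial decomposition, duplication, and collapsing steps are assumed to have been done by the caller.

```python
def iterate_decomposition(netlist, area_of, area_limit, select_divisor,
                          apply_divisor, local_adjust, global_place, k=5):
    """Extract divisors until the area constraint is met.

    Placement is adjusted locally after each divisor and redone globally
    after every k accepted divisors. Returns (netlist, placement), or None
    if no placeable divisor remains before the constraint is met.
    """
    placement = global_place(netlist)
    accepted = 0
    while area_of(netlist) > area_limit:
        divisor = select_divisor(netlist, placement)
        if divisor is None:
            return None  # no divisor can be placed within timing
        netlist = apply_divisor(netlist, divisor)
        accepted += 1
        if accepted % k == 0:
            placement = global_place(netlist)  # periodic global reset
        else:
            placement = local_adjust(netlist, placement, divisor)
    return netlist, placement
```

Passing the steps in as callables keeps the sketch independent of any particular netlist representation or placement method.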

  25. Fast Local Adjustment • With the slicing method, we can insert a new divisor into the slicing structure, get a new placement, and do a delay trace efficiently. • So we can accurately reflect area change as it affects delay • With the LP method, the LP can also be re-solved quickly. • We just need inequalities where areas may overlap

  26. Comments • After k divisors selected and placed, re-do global placement to better reflect all divisors • i.e. do total timing driven placement on new netlist • Selective duplication and collapsing can be done to improve timing during the iteration. • experimenting with how to choose this selective collapsing

  27. Rewiring • To relieve timing further, rewiring can be done • Can use SPFDs since the exact logic in a “gate” is somewhat irrelevant. • SPFDs allow one wire to replace another • This gives more flexibility than redundancy addition and removal • It exploits the fact that the logic in the blue box can be changed [Figure: circuit with a blue box around the changeable logic]

  28. Technology Studies • Guess at process dimensions for DSM • “strawman” 0.25 um process • shrink to get 0.18 um, ..., 0.05 um processes • Design and layout different complex “gates” • Use Cadence’s LAS tool or Cadabra tool • Extract parasitics using SPACE or FASTCAP • Simulate with SPICE and Hu’s advanced BSIM model • Verify LDM

  29. Strawman 0.05 um Process Interconnect • 9 metal layers • Copper wires and vias • Polyimide dielectric (k=2) • H/W = 2 for all layers except M9 • M9 kept same as 0.25 um process • Insulator thickness = 0.7 um [Figure, not to scale, with layer labels H/W = 2.5/2.0, 2.4/1.2, 1.6/0.8, 0.6/0.3, 0.14/0.07]

  30. First Six Layers of Metal [Figure: approximately to scale]

  31. Design and Extract Flow [Flow diagram. Inputs: manual wireplanning, netlist decomposition, technology file. Design styles: hand design, standard cell, domino, pass transistor logic. Files: test.blif (format?), test.blifmv, test.verilog, constraint file, test.gds. Tools: LAS or Cadabra, SPACE (3D), SPICE. Interconnect technology parameters and transistor models for 0.25, 0.18, 0.10, and 0.05 um.]

  32. Acknowledgements • Richard Newton, Alberto Sangiovanni, Ralph Otten, Wilsin Gosti, Amit Narayan, Philip Chong, Mukul Prasad, Amit Mehrotra, Sunil Khatri, Ravi Gunturi, Subarna Sinha, Hiroshi Murata • IBM, Motorola, Intel, Fujitsu, Cadence • SRC
