
ECE260B – CSE241A Winter 2005 Interconnects and Delay Calculation


Presentation Transcript


  1. ECE260B – CSE241A Winter 2005 Interconnects and Delay Calculation • Website: http://vlsicad.ucsd.edu/courses/ece260b-w05

  2. Interconnect-Centric Methodology • Conventional component-centric design methodology • Interconnect impacts are negligible • Components characterized by cell libraries • Modern interconnect-centric design methodology • Interconnects dominate VLSI system performance • Needs accurate interconnect prediction and analysis • Approaches • Hierarchical “time-budgeting” • Top-level “chip-integration” • Slide courtesy of Sylvester/Shepard

  3. SEMATECH Prototype BEOL stack, 2000 • What are some implications of reverse-scaled global interconnects? • [Figure labels: passivation; dielectric; wire; via; etch stop layer; dielectric capping layer; copper conductor with barrier/nucleation layer; pre-metal dielectric; tungsten contact plug; wiring tiers: global (up to 5), intermediate (up to 4), local (2)] • Slide courtesy of Chris Case, BOC Edwards

  4. Damascene and Dual-Damascene Process • Damascene process named after the ancient Middle Eastern technique for inlaying metal in ceramic or wood for decoration • Single Damascene: ILD deposition, oxide trench etch, metal fill, metal CMP • Dual Damascene: ILD deposition, oxide trench / via etch, metal fill, metal CMP

  5. Cu Dual-Damascene Process • CMP steps in the Cu damascene process: bulk copper removal, barrier removal, oxide over-polish • Polishing pad touches both up and down areas once the step height is reduced • Different polish rates on different materials • Dishing and erosion arise from different polish rates for copper and oxide • [Figure labels: oxide erosion, copper dishing]

  6. Area Fill & Metal Slot for Copper CMP • Dishing can thin the wire or pad, causing higher-resistance wires or lower-reliability bond pads • Erosion can also result in a sub-planar dip on the wafer surface, causing short-circuits between adjacent wires on the next layer • Oxide erosion and copper dishing can be controlled by area filling and metal slotting • [Figure labels: copper, oxide, metal slot, area fill]

  7. Resistance & Sheet Resistance • $R = \rho L / (T W)$ • Sheet resistance: $R_\square = \rho / T$, so $R = R_\square \cdot L / W$ • Resistance seen by current going from left to right is the same in each block ($R_1 = R_2$)
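
The two forms of the resistance formula on slide 7 can be checked against each other with a short Python sketch. The copper resistivity and the wire dimensions below are illustrative assumptions, not values taken from the slides.

```python
def wire_resistance(rho, length, width, thickness):
    """R = rho * L / (T * W), with rho in ohm*m and dimensions in metres."""
    return rho * length / (thickness * width)


def sheet_resistance(rho, thickness):
    """R_sq = rho / T, in ohms per square."""
    return rho / thickness


RHO_CU = 1.7e-8                   # assumed copper resistivity, ohm*m
L, W, T = 1e-3, 0.5e-6, 1.0e-6    # assumed: 1 mm long, 0.5 um wide, 1 um thick

r_direct = wire_resistance(RHO_CU, L, W, T)
r_squares = sheet_resistance(RHO_CU, T) * (L / W)   # L / W = number of squares

# Any square of the same film has the same resistance, whatever its size.
r_small_square = wire_resistance(RHO_CU, 1e-6, 1e-6, T)
r_large_square = wire_resistance(RHO_CU, 4e-6, 4e-6, T)

print(r_direct, r_squares, r_small_square, r_large_square)
```

Counting squares (L / W) and multiplying by the sheet resistance is usually the more convenient form, since the film thickness is fixed per metal layer.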

  8. Bulk Resistivity • Aluminum dominant until ~2000 • Copper has taken over in past 4-5 years • Copper as good as it gets

  9. Capacitance: Parallel Plate Model • ILD = interlevel dielectric • Bottom plate of cap can be another metal layer • $C_{int} = \varepsilon_{ox} \cdot W L / t_{ox}$ • [Figure labels: L, W, T, H, SiO2 ILD, substrate]
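
As a rough numerical illustration of the parallel-plate formula on slide 9, here is a minimal Python sketch. The relative permittivity of SiO2 (3.9) is a standard figure; the wire dimensions and dielectric thickness are assumptions chosen only for the example, and the model ignores fringing and lateral capacitance.

```python
EPS_0 = 8.854e-12       # vacuum permittivity, F/m
EPS_R_SIO2 = 3.9        # relative permittivity of SiO2


def parallel_plate_cap(width, length, t_ox, eps_r=EPS_R_SIO2):
    """C_int = eps_ox * W * L / t_ox (area capacitance only, no fringing)."""
    return eps_r * EPS_0 * width * length / t_ox


# Assumed geometry: 0.5 um wide, 1 mm long wire over 1 um of oxide.
c_int = parallel_plate_cap(width=0.5e-6, length=1e-3, t_ox=1e-6)
print(f"C_int = {c_int * 1e15:.1f} fF")
```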

  10. Line Dimensions and Fringing Capacitance • Line dimensions: W, S, T, H • Sometimes H is called T in the literature, which can be confusing • [Figure labels: lateral cap, W, S]

  11. Inductance • $V = L \, dI/dt$, $V_2 = M_{12} \, dI_1/dt$ • Faraday’s law: $V = N \, d(BA)/dt$, $B = \mu (N/l) I$, $L = \mu N^2 A / l$ • V = voltage, N = number of turns of the coil, B = magnetic flux density, A = area of magnetic field circled by the coil, l = height of the coil, t = time • At high frequencies, can be a significant portion of total impedance: $Z = R + j\omega L$ ($\omega = 2\pi f$ = angular frequency) • Slide courtesy of Ken Yang, UCLA
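
To make the impedance expression on slide 11 concrete, the Python sketch below finds the frequency at which the inductive term equals the resistive term. The wire resistance and inductance are assumed values for illustration only.

```python
import math


def wire_impedance(r, l, f):
    """Z = R + j*omega*L, with omega = 2*pi*f."""
    return complex(r, 2 * math.pi * f * l)


def crossover_frequency(r, l):
    """Frequency at which omega*L equals R."""
    return r / (2 * math.pi * l)


R_WIRE = 34.0    # assumed wire resistance, ohms
L_WIRE = 1e-9    # assumed loop inductance, henries

f_x = crossover_frequency(R_WIRE, L_WIRE)
z = wire_impedance(R_WIRE, L_WIRE, f_x)
print(f"crossover at {f_x / 1e9:.2f} GHz, |Z| = {abs(z):.1f} ohm")
```

Above the crossover frequency the inductive part dominates the impedance; below it the wire behaves as mostly resistive.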

  12. Inductance is Important… e.g. • Faster clock speeds • Frequency of interest is determined by signal rise time, not clock frequency • Copper interconnects → R is reduced • Thick, low-resistance (reverse-scaled) global lines • Chips are getting larger → long lines → large current loops • Slide courtesy of Massoud/Sylvester/Kawa, Synopsys

  13. On-Chip Inductance • Inductance is a loop quantity • Knowledge of return path is required, but hard to determine • For example, the return path depends on the frequency • [Figure labels: signal line, return path] • Slide courtesy of Massoud/Sylvester/Kawa, Synopsys

  14. Frequency-Dependent Return Path • At low frequency, $R \gg \omega L$ and current tries to • minimize impedance • minimize resistance • use as many returns as possible (parallel resistances) • At high frequency, $\omega L \gg R$ and current tries to • minimize impedance • minimize inductance • use smallest possible loop (closest return path) → L dominates, current returns “collapse” • Power and ground lines always available as low-impedance current returns • [Figure: two panels, each showing a signal line among several Gnd lines] • Slide courtesy of Massoud/Sylvester/Kawa, Synopsys

  15. Inductance vs. Capacitance • Capacitance • Locality problem is easy: electric field lines “suck up” to nearest neighbor conductors • Local calculation is hard: all the effort is in “accuracy” • Inductance • Locality problem is hard: magnetic field lines are not local; current returns can be complex • Local calculation is easy: no strong geometry dependence; analytic formulae work very well • Intuitions for design • Seesaw effect between inductance and capacitance • Minimize variations in L and C rather than absolutes • E.g., would techniques used to minimize variation in capacitive coupling also benefit inductive coupling? • Slide courtesy of Sylvester/Shepard

  16. Interconnect Modeling • Lumped load capacitance • Distributed R(L)C(K) network • π model for each uniform wire segment • Transmission line • Microwave domain • [Figure: Vin to Vout, distributed model using multiple lumps of the π model of a single wire]

  17. Characterization • Signal • Propagation delay • Transition time (slew rate) • Interconnect transfer function H(s) in Laplace domain • [Figure: Vin and Vout waveforms with 20%, 50%, and 80% levels marking delay and transition; distributed model using multiple lumps of the π model of a single wire]

  18. Transition Degradation • Transition degradation leads to increased downstream (gate and interconnect) delays • [Figure: step response of a distributed RC wire as a function of location along the wire and time] • Courtesy Prof. A. B. Kahng

  19. Elmore Delay = First Moment of Transfer Function • H(t) = step input response • h(t) = impulse response = dH(t)/dt = transfer function in time domain • T50% = median of h(t) • TED = mean of h(t) • TED = first moment of h(t)

  20. Elmore Delay = Simple Delay Metric • Upper bound on the 50% delay for RC trees • TED = T50% if h(t) is symmetric • TED > T50% for monotonic waveforms • TED → T50% with increased transition time • TED = T50% / ln2 for an RC load driven by a step input • +/- 15% error for RC interconnects with a ratio • Simple (linear time) computation • Incremental • Facilitates ECO (Engineering Change Order) • [Figure: R and C; h(t) versus t, with telm marked]
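
The relation TED = T50% / ln2 for a single RC load can be worked out directly from the definitions on slide 19. The derivation below assumes a unit step input and a single lumped resistor R driving a single capacitor C.

```latex
\begin{align*}
H(t) &= 1 - e^{-t/RC}, \qquad h(t) = \frac{dH}{dt} = \frac{1}{RC}\, e^{-t/RC} \\
T_{ED} &= \int_0^\infty t\, h(t)\, dt = RC \\
H(T_{50\%}) &= \tfrac{1}{2} \;\Rightarrow\; T_{50\%} = RC \ln 2 \\
T_{ED} &= \frac{T_{50\%}}{\ln 2} \approx 1.44\, T_{50\%}
\end{align*}
```

For this case the Elmore delay overestimates the 50% delay by about 44%, consistent with its role as an upper bound.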

  21. Elmore Delay Computation in an RC Tree Courtesy Prof. A. B. Kahng
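
A minimal Python sketch of the linear-time Elmore computation in an RC tree follows. The tree encoding (a `parent` list with node 0 as the ideal source) and the example R and C values are assumptions made for illustration; they are not taken from the slide's figure.

```python
def elmore_delays(parent, R, C):
    """Elmore delay from the source (node 0) to every node of an RC tree.

    parent[i] is the parent of node i (parent[0] is None);
    R[i] is the resistance of the branch from parent[i] to node i;
    C[i] is the capacitance to ground at node i.
    Nodes must be numbered so that parent[i] < i.
    """
    n = len(parent)
    # Pass 1, leaves to root: total capacitance at and below each node.
    c_down = list(C)
    for i in range(n - 1, 0, -1):
        c_down[parent[i]] += c_down[i]
    # Pass 2, root to leaves: each branch contributes R_i * (downstream cap).
    delay = [0.0] * n
    for i in range(1, n):
        delay[i] = delay[parent[i]] + R[i] * c_down[i]
    return delay


# Assumed example: source 0 -> node 1, which branches to nodes 2 and 3.
parent = [None, 0, 1, 1]
R = [0.0, 1.0, 2.0, 3.0]
C = [0.0, 1.0, 2.0, 3.0]
delays = elmore_delays(parent, R, C)
print(delays)
```

Each node is visited once per pass, so the cost is linear in the number of nodes. Changing one R or C only affects the sums along one root-to-leaf path and its subtree, which is what makes incremental updates cheap.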

  22. Asymptotic Waveform Evaluation (AWE), etc. • Moment matching → poles and residues → time domain • [Figure: Vin to Vout, distributed model using multiple lumps of the π model of a single wire]

  23. Interconnect Model Order Reduction • Direct matrix solver (AWE): numerical instability • Padé via Lanczos (PVL) • Block Arnoldi (PRIMA) • [Figure labels: uN, Iout, g1, g2, v1, v2, v3, c1, c2]

  24. Capacitive Coupling (Crosstalk) • Two coupled lines • Cross-section view • Interwire capacitance allows neighboring wires to interact • Charge injected across Cc results in temporary (in static logic) glitch in voltage from the supply rail at the victim

  25. Crosstalk Noise • Glitches caused by capacitive coupling between wires • An “aggressor” wire switches • A “victim” wire is charged or discharged by the coupling capacitance (cf. charge-sharing analysis) • An otherwise quiet victim may look like it has temporarily switched • This is bad if: • The victim is a clock or asynchronous reset • The victim is a signal whose value is being latched at that moment • What are some fixes? • [Figure labels: aggressor, victim] • Slide courtesy of Paul Rodman, ReShape
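
The charge-sharing view of a crosstalk glitch can be sketched in Python as a capacitive divider. This is a simplified bound, assuming the victim driver is too weak to respond while the aggressor makes a full-swing transition; the supply and capacitance values are illustrative assumptions.

```python
def glitch_peak(vdd, c_couple, c_ground):
    """Capacitive-divider estimate of the victim glitch for a full-swing aggressor.

    Assumes the victim is effectively floating during the aggressor transition,
    so the result is a pessimistic (upper) estimate.
    """
    return vdd * c_couple / (c_couple + c_ground)


VDD = 1.0       # assumed supply voltage, volts
CC = 20e-15     # assumed coupling capacitance, farads
CG = 60e-15     # assumed victim capacitance to ground, farads

peak = glitch_peak(VDD, CC, CG)
peak_spaced = glitch_peak(VDD, CC / 2, CG)   # e.g. wider spacing halves Cc
print(f"peak = {peak:.3f} V, with half the coupling = {peak_spaced:.3f} V")
```

Under this model, anything that lowers the ratio of coupling capacitance to total victim capacitance (spacing, shielding, a stronger victim driver) reduces the glitch.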

  26. Crosstalk Delay Variation: Timing Pull-In • A switching victim is aided (sped up) by coupled charge • This is bad if your path now violates hold time • Fixes include adding delay elements to your path • [Figure labels: aggressor, victim] • Slide courtesy of Paul Rodman, ReShape

  27. Crosstalk Delay Variation: Timing Push-Out • A switching victim is hindered (slowed down) by coupled charge • This is bad if your path now violates setup time • Fixes include spacing the wires, using strong drivers, … • [Figure labels: aggressor, victim] • Slide courtesy of Paul Rodman, ReShape

  28. Delay Uncertainty • Relatively greater coupling noise due to line dimension scaling • Tighter timing budgets to achieve fast circuit speed (“all paths critical”) • [Figure: delay uncertainty due to noise relative to nominal delay (%) versus technology generation (μm), from 0.35 down to 0.10; aggressor and victim lines] • Slide courtesy of Kevin Cao, Berkeley

  29. Crosstalk Delay Calculation: Levels of Accuracy • Discard coupling capacitances • De-coupling by replacing coupling caps by double ground caps • De-coupling by Miller factors • Simulating multi-input multi-output (MIMO) networks • [Figure labels: Input 1, Output 1, Input 2, Output 2]

  30. Miller Factor • $Q = C_{cv} \Delta V_v = C_c (\Delta V_v - \Delta V_A)$ • $C_{cv} = C_c (\Delta V_v - \Delta V_A) / \Delta V_v$ • Miller factor roughly between 0 and 2 • Or between –1 and 3 (for 50% delay calculation)? • Courtesy Prof. A. B. Kahng
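
The Miller-factor expression on slide 30 can be evaluated for the three textbook aggressor cases with a short Python sketch. The function names and the 1 V swing are assumptions for illustration; the formula itself is the one on the slide.

```python
def miller_factor(dv_victim, dv_aggressor):
    """(dV_v - dV_A) / dV_v: multiplier applied to the coupling capacitance."""
    return (dv_victim - dv_aggressor) / dv_victim


def decoupled_cap(c_couple, dv_victim, dv_aggressor):
    """Equivalent grounded capacitance C_cv seen by the victim."""
    return c_couple * miller_factor(dv_victim, dv_aggressor)


VDD = 1.0  # assumed full-swing transition, volts

same_direction = miller_factor(VDD, VDD)    # aggressor switches with the victim
quiet = miller_factor(VDD, 0.0)             # aggressor does not switch
opposite = miller_factor(VDD, -VDD)         # aggressor switches against the victim
print(same_direction, quiet, opposite)
```

The "double ground caps" approach on slide 29 corresponds to always taking the opposite-direction case, which is the pessimistic end of the 0 to 2 range.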

  31. Multi-Input Multi-Output Model • RLC interconnect is linear • Superposition • Each of the drivers is simulated in turn • Other Thevenin voltage sources are shorted • AWE/PRIMA model order reduction techniques • [Figure labels: Input 1, Output 1, Input 2, Output 2]

  32. Worst Case Aggressor Scenario • Stimuli vector • For RC interconnects • Aggressors take opposite transition → max delay • Aggressors take identical transition → min delay • For RLC interconnects • ? • Aggressor alignment • For (linear) interconnects • Aggressors are aligned with each other to maximize the crosstalk noise peak • Align the noise peak to maximize delay variation • For worst case gate delay • ? • [Figure labels: Aggressor 1, Aggressor 2, alignment, noise, Δ delay]

  33. Calculation Flow • Timing window overlaps enable crosstalk delay variation • Chicken-egg dilemma: delay vs. crosstalk • Iteration • Starting with the assumption that all timing windows are overlapped (pessimistic about the unknowns) • Refine calculation by reducing pessimism • [Figure: loop between timing window assumptions (aggressor/victim overlap) and crosstalk delay calculation (Δ delay), with refinement]

  34. Gate Timing Characterization • “Extract” exact transistor characteristics from layout • Transistor width, length, junction area and perimeter • Local wire length and inter-wire distance • Device modeling and simulation by BSIM or SPICE (differential-equations solver) • [Figure labels: A, B, D, F, CL] • Courtesy Prof. A. B. Kahng

  35. Static Timing Analysis • Conservatism (worst-case scenario) • True gate delay depends on input arrival time patterns • STA will assume that only 1 input is switching • Will use worst slope among several inputs • For a number of different input slews and load capacitances, simulate the circuit of the cell • Propagation time (e.g., 50% Vdd at input to 50% at output) • Output slew (e.g., 20% Vdd at output to 80% Vdd at output) • [Figure: waveforms versus time with Vdd, tpd, and tslew marked] • Courtesy Prof. A. B. Kahng

  36. Look-Up Table • $D_G = f(C_L, S_{in})$ and $S_{out} = f(C_L, S_{in})$ • Non-linear • Interpolate between table entries • Polynomial representation vs. lookup tables • [Figure: gate delay and output slew tables, each indexed by load capacitance and input slew; delay of the gate and resulting waveform]
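
Interpolating between table entries can be sketched in Python with bilinear interpolation. The table data is the `cell_rise` table from the .lib example on slide 39; treating rows as input slew and columns as load capacitance, and choosing bilinear interpolation specifically, are assumptions of this sketch rather than something the slides state.

```python
from bisect import bisect_right

# cell_rise table of the INV cell on slide 39 (rows: slew, columns: load).
SLEWS = [0.0375, 0.2329, 0.6904, 1.5008]
LOADS = [0.0010, 0.9788, 2.2820, 5.1139]
CELL_RISE = [
    [0.013211, 0.071051, 0.297500, 0.642340],
    [0.028657, 0.110849, 0.362620, 0.707070],
    [0.053289, 0.165930, 0.496550, 0.860400],
    [0.091041, 0.234440, 0.661840, 1.091700],
]


def _bracket(axis, x):
    """Index i of the interval [axis[i], axis[i+1]] used for x (clamped)."""
    i = bisect_right(axis, x) - 1
    return max(0, min(i, len(axis) - 2))


def lookup(slews, loads, table, slew, load):
    """Bilinear interpolation; points outside the table are extrapolated."""
    i, j = _bracket(slews, slew), _bracket(loads, load)
    u = (slew - slews[i]) / (slews[i + 1] - slews[i])
    v = (load - loads[j]) / (loads[j + 1] - loads[j])
    return ((1 - u) * (1 - v) * table[i][j] + (1 - u) * v * table[i][j + 1]
            + u * (1 - v) * table[i + 1][j] + u * v * table[i + 1][j + 1])


delay = lookup(SLEWS, LOADS, CELL_RISE, slew=0.1, load=0.5)
print(f"interpolated cell_rise = {delay:.4f}")
```

A lookup at an exact grid point returns the stored entry unchanged, and the result varies linearly along each axis inside a table cell.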

  37. Delay Calculation • Fall delay = 0.178ns • Rise delay = 0.261ns • Fall transition = 0.147ns • Rise transition = … • [Figure: Cell Fall, Cell Rise, and Fall Transition lookup tables, annotated with 0.147ns, 0.1ns, 0.12ns, 1.0pf and the looked-up values 0.178, 0.261, 0.147] • Courtesy Prof. A. B. Kahng

  38. Effective Capacitance • Resistive shielding effect → effective capacitance < total load capacitance • Ceff → gate delay • [Figure: Vin to Vout, distributed model using multiple lumps of the π model of a single wire; Iout versus t, with Tr marked]

  39. Timing Library Example (.lib)

      library(my_lib) {
        delay_model : table_lookup;
        library_features (report_delay_calculation);
        time_unit : "1ns";
        voltage_unit : "1V";
        current_unit : "1mA";
        leakage_power_unit : 1uW;
        capacitive_load_unit(1,pf);
        pulling_resistance_unit : "1kohm";
        nom_voltage : 1.08;
        nom_temperature : 125.0;
        nom_process : 1.0;
        slew_derate_from_library : 0.500000;
        default_operating_conditions : slow_125_1.08 ;
        lu_table_template("load") {
          variable_1 : input_net_transition;
          variable_2 : total_output_net_capacitance;
          index_1( "1, 2, 3, 4" );
          index_2( "1, 2, 3, 4" );
        }
        cell("INV") {
          pin(Z) {
            direction : output;
            function : "!A";
            max_transition : 1.500000;
            max_capacitance : 5.1139;
            timing() {
              related_pin : "A";
              cell_rise(load) {
                index_1( "0.0375, 0.2329, 0.6904, 1.5008" );
                index_2( "0.0010, 0.9788, 2.2820, 5.1139" );
                values ( \
                  "0.013211, 0.071051, 0.297500, 0.642340", \
                  "0.028657, 0.110849, 0.362620, 0.707070", \
                  "0.053289, 0.165930, 0.496550, 0.860400", \
                  "0.091041, 0.234440, 0.661840, 1.091700" );
              }

  40. PVT (Process, Voltage, Temperature) Derating • Actual cell delay = Original delay × KPVT • Courtesy Prof. A. B. Kahng

  41. PVT Derating: Example + Min/Typ/Max Triples • Proc_var (0.5:1.0:1.3) • Voltage (5.5:5.0:4.5) • Temperature (0:20:50) • KP = 0.80 : 1.00 : 1.30 • KV = 0.93 : 1.00 : 1.08 • KT = 0.80 : 1.07 : 1.35 • KPVT = 0.60 : 1.07 : 1.90 • Cell delay = 0.261ns • Derated delay = 0.157 : 0.279 : 0.496 {min : typical : max} • Courtesy Prof. A. B. Kahng
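
The arithmetic on slide 41 can be reproduced with a few lines of Python. The sketch assumes KPVT is the product KP × KV × KT rounded to two decimals before it is applied, which is the reading that matches the slide's numbers.

```python
# Derating factors from slide 41, as (min, typical, max) triples.
KP = (0.80, 1.00, 1.30)
KV = (0.93, 1.00, 1.08)
KT = (0.80, 1.07, 1.35)
CELL_DELAY_NS = 0.261   # rise delay from slide 37

# Assumption: K_PVT is the product of the three factors, rounded to 2 decimals.
k_pvt = tuple(round(p * v * t, 2) for p, v, t in zip(KP, KV, KT))

# Actual cell delay = original delay x K_PVT (slide 40).
derated_ns = tuple(round(CELL_DELAY_NS * k, 3) for k in k_pvt)

print("K_PVT   =", k_pvt)
print("derated =", derated_ns, "ns (min, typical, max)")
```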
