Closing the Gap Between ASIC and Custom: Chapters 12, 13, 14
Chapter 12 Semi-Custom Methods in a High-Performance Microprocessor Design
Custom Processor Design • IBM eServer zSeries (S/390 mainframe) • Physical design makes extensive use of hierarchy • Each functional unit is partitioned as a macro • Each macro is fully floorplanned • Global wiring is done hierarchically • Macros are characterized for timing, noise… • Timing rules are generated using static transistor-level simulation • Circuit and physical design start as soon as sufficient logic is defined
Custom Processor Design • As the design matures, emphasis shifts from functional verification to logic modification and repartitioning to achieve timing closure • Efficiency, turnaround time and flexibility are as important as cycle time • Three types of macros • Arrays • Synthesized random logic macros (RLMs) • Full custom dataflow • Done predominantly in static logic, with dynamic circuitry reserved for extremely critical functions
Custom Processor Design • Custom design is very effective when elements are identical across the bit range of the data stack • Complex numerical functions are usually far less regular across the stack • They require more effort to produce in full custom • They are often timing critical • Their circuit architecture might evolve • This makes them good candidates for semi-custom design
Semi-Custom Design • The basic building block is a set of parameterized gates • The set forms a basis capable of covering most of the design space • The gates have no directly associated layout
Circuit Tuning • Tools can be divided into dynamic and static tuning • Dynamic tuning involves simulation with explicit waveforms and measures • Static tuning formulates the optimization through static timing, optimizing slack in the presence of timing assertions • Large, non-bitslice circuits are impractical for dynamic tuning but well suited to static tuning • The tool used here is EinsTuner • Built on top of a static transistor-level timing tool (EinsTLT) • Combines a fast event-driven simulator (SPECS) with a timing tool (EinsTimer)
Cell Generation • Create layouts corresponding to the parameterized gates • The authors use their own tool, C-cell, a script-based system designed to produce optimal layout • The tool supports semi-custom design • Generates a set of layouts from cell specs • Parses a schematic • Converts between parameterized and standard (RLM library) cells • Has an integrated floorplanning aid • Layout post-processing (flattening, shape trimming)
Design example • 24-bit adder • Eight 24-bit adder modules • Architecture • CLA Ling adder
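As a rough illustration of the Ling formulation named on this slide, the sketch below evaluates the Ling pseudo-carry recurrence one bit at a time. It is a behavioural model only: a real CLA Ling adder computes the pseudo-carries with a parallel lookahead tree, and nothing here reflects the actual circuit in the chapter. The function name `ling_add` and its parameters are made up for this example.

```python
def ling_add(a: int, b: int, width: int = 24, carry_in: int = 0) -> tuple[int, int]:
    """Add two width-bit integers using Ling pseudo-carries, evaluated bit by bit.

    Returns (sum modulo 2**width, carry out).
    """
    total = 0
    carry_prev = carry_in              # c_{i-1}: true carry into bit i
    for i in range(width):
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        g = ai & bi                    # generate
        t = ai | bi                    # transmit (inclusive propagate)
        p = ai ^ bi                    # half sum
        h = g | carry_prev             # Ling pseudo-carry: H_i = g_i + c_{i-1}
        total |= (p ^ carry_prev) << i # sum bit: s_i = p_i xor c_{i-1}
        carry_prev = t & h             # true carry recovered as c_i = t_i * H_i
    return total, carry_prev


if __name__ == "__main__":
    print(ling_add(0xFFFFFF, 1))       # (0, 1): 24-bit overflow
```

The pseudo-carry H_i drops the t_i term from the usual carry equation, which is what shortens the critical lookahead logic in hardware; the true carry is recovered with a single AND.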
Conclusion • Faster method than full custom • Competitive performance compared to full custom • Sometimes better performance, if the architecture chosen for the full custom design is not optimal • Easy adaptation to global timing convergence is an advantage of semi-custom design
Chapter 13 Controlling Uncertainty in High Frequency Designs
Terminology • Actual frequency • Market frequency • Design frequency • Predicted frequency
Uncertainty Defined • Process uncertainty • For example, in-die variation • Tool uncertainty • Inaccuracy in the simulation and extraction tools • For example, if inductance is not extracted, the predicted frequency of the design will be optimistic relative to the actual frequency • Design uncertainty • Unpredictable variations in the design process between design iterations • Variations in the execution of design methods across the chip
Uncertainty Defined • Uncertainty in the manufacturing, tools and design processes causes a gap between the predicted and actual frequencies, thus reducing the cycle time available for logic functionality
Uncertainty and Frequency • When the process contains uncertainty • Time and energy are wasted on non-critical parts of the design • The result is reduced frequency or a delayed time to market
Focused methodology development • Reducing the uncertainty will minimize the number of paths in the WNS (worst negative slack) bucket and thus the effort required to address them • It has been shown that the closer the correlation between the predicted frequency and the actual frequency for the paths in the WNS bucket, the higher the actual frequency will be
Methods for removing paths from the uncertainty window • Traditional CAD algorithms are WNS-based • This results in a large number of paths in the WNS bucket • Better to use a total negative slack (TNS) algorithm • TNS is defined as the sum of the negative slacks • It tries to improve all negative paths until they reach zero slack • The algorithm's goal is to remove all paths from the negative region; the design goal is to remove all paths from the WNS bucket • The design frequency should be set so that all paths in the WNS bucket are in the negative region
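The difference between the two cost models can be sketched in a few lines. The slack values, the function names and the bucket definition (paths within a fixed window of the worst slack) are assumptions chosen for illustration, not taken from the chapter.

```python
def worst_negative_slack(slacks):
    """WNS: the single most negative slack (0 if no path is negative)."""
    return min(min(slacks), 0)


def total_negative_slack(slacks):
    """TNS: the sum of all negative slacks."""
    return sum(s for s in slacks if s < 0)


def paths_in_wns_bucket(slacks, window):
    """Count paths within `window` of the worst slack (assumed bucket definition)."""
    worst = min(slacks)
    return sum(1 for s in slacks if s <= worst + window)


if __name__ == "__main__":
    # Hypothetical path slacks in picoseconds for two versions of a design.
    design_a = [-50, 10, 20, 30]
    design_b = [-50, -45, -40, 10]
    for name, slacks in (("A", design_a), ("B", design_b)):
        print(name,
              worst_negative_slack(slacks),
              total_negative_slack(slacks),
              paths_in_wns_bucket(slacks, window=10))
```

Both designs have the same WNS, so a WNS-driven optimizer sees them as equally good, while TNS exposes that design B has three paths crowding the bucket instead of one.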
Conclusion • A gap between the actual and predicted frequency implies uncertainty in the manufacturing, tool, and design processes • Design teams will work on the wrong paths • If the uncertainty in the analysis of the design is minimized, resources can be managed better and the gain from costly local optimizations is higher
Conclusion • To control design and tool uncertainty, take the following steps • List all sources of uncertainty • Develop an uncertainty plan • Reduce the guard band as much as possible • Use a TNS-based cost model • Tune the design frequency • Toward the end of the design, treat all paths within sigma of the design as equal, and reduce uncertainty by reducing automation • Push CAD vendors into algorithm development • Finally, remember that any gap between the predicted and actual frequency lowers the actual frequency
Chapter 14 Increasing Circuit Performance through Statistical Design Techniques
Process variability • As CMOS technology keeps scaling, the magnitude of process variability will increase • Systematic variability is due to the interaction between the manufacturing process and the properties of the design • Optical proximity effects cause polysilicon features to vary depending on the local layout surroundings • Inter-layer dielectric thickness varies because CMP depends on the local wire density • The ability to improve manufacturing tolerances is limited • Mask fabrication • Overlay control
Process variability • Intra-chip variation should be taken into account • A recent study shows that in 0.13 µm CMOS, 35% of the variation in MOS channel length is due to intra-chip variation • For 0.07 µm CMOS the intra-chip share would be 60% • Intra-chip variation is caused by the emergence of a number of variation-generating mechanisms located at the interface between design and process
Identifying sources of variation • It is necessary to decide which of the multiple sources and patterns of variation deserve the most attention • For example, by their impact on path delay • The exact variability contribution of a process parameter is determined by • The sensitivity of the circuit performance to the parameter • The magnitude of the parameter's variation
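One common way to combine the two factors on this slide is a first-order estimate: each parameter contributes roughly |sensitivity| × sigma to the delay spread, and independent contributions add in quadrature. The sketch below ranks sources that way. The parameter names and numbers are hypothetical, and the independence assumption is mine, not the chapter's.

```python
import math


def rank_variation_sources(sources):
    """Rank process parameters by first-order contribution to delay variation.

    `sources` maps a parameter name to (sensitivity, sigma), where sensitivity
    is d(delay)/d(parameter) and sigma is the parameter's standard deviation.
    Returns (ranked list of (name, contribution), total sigma of delay),
    assuming the parameters vary independently.
    """
    contributions = {
        name: abs(sensitivity) * sigma
        for name, (sensitivity, sigma) in sources.items()
    }
    total_sigma = math.sqrt(sum(c * c for c in contributions.values()))
    ranked = sorted(contributions.items(), key=lambda item: item[1], reverse=True)
    return ranked, total_sigma


if __name__ == "__main__":
    # Hypothetical numbers: (ps per unit of parameter, sigma of parameter).
    example = {
        "channel_length": (2.0, 3.0),
        "threshold_voltage": (0.5, 8.0),
        "ild_thickness": (0.1, 10.0),
    }
    ranked, total = rank_variation_sources(example)
    for name, contribution in ranked:
        print(f"{name}: {contribution:.2f} ps")
    print(f"total sigma: {total:.2f} ps")
```

A parameter with a large variation can still rank low if the circuit is insensitive to it, which is why both factors are needed before deciding where to spend modeling effort.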
Increasing performance through probabilistic timing modeling • How does intra-chip variation differ from inter-chip variation? • In high-performance chips, delay is usually optimized by moving delay off the critical path onto paths with slack • The result is a chip with a large number of paths near the maximum delay • Inter-chip variation affects every path similarly • The effect of intra-chip variation depends on the surroundings and the die position
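A small Monte Carlo experiment makes the distinction concrete. In this sketch, inter-chip variation is modeled as one Gaussian offset shared by all paths on a chip, and intra-chip variation as an independent Gaussian offset per path; the chip delay is the slowest path. The Gaussian model, the independence of paths and all numbers are simplifying assumptions for illustration only.

```python
import random
import statistics


def mean_chip_delay(n_paths, sigma_inter, sigma_intra,
                    trials=2000, seed=0, nominal=1.0):
    """Average worst-path delay over many simulated chips.

    Every path has the same nominal delay (a wall of near-critical paths).
    sigma_inter: std. dev. of the offset shared by all paths on one chip.
    sigma_intra: std. dev. of the independent offset of each path.
    """
    rng = random.Random(seed)
    delays = []
    for _ in range(trials):
        shared = rng.gauss(0.0, sigma_inter)
        worst_local = max(rng.gauss(0.0, sigma_intra) for _ in range(n_paths))
        delays.append(nominal + shared + worst_local)
    return statistics.fmean(delays)


if __name__ == "__main__":
    for n in (1, 10, 100):
        inter_only = mean_chip_delay(n, sigma_inter=0.05, sigma_intra=0.0)
        intra_only = mean_chip_delay(n, sigma_inter=0.0, sigma_intra=0.05)
        print(n, round(inter_only, 3), round(intra_only, 3))
```

With only shared variation the mean chip delay stays at the nominal value regardless of the number of paths. With per-path variation it grows as more paths sit near the maximum, because the slowest of many independent paths sets the cycle time.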
Increasing performance through probabilistic timing modeling • The conservatism of traditional timing tools is more disadvantageous for ASICs • There is no full-speed testing as in custom circuits • By implementing a probabilistic timing analysis methodology, the conservatism built into standard ASIC design can be reduced • By accepting a lower yield, performance can be improved • A yield of 98% (instead of 99.99%) reduces conservatism by 17% • It has been noted that an ASIC chip produced in a foundry can run up to 40% faster than predicted by standard timing analysis • Vendors would trade yield for performance if the revenue from faster chips justifies the additional expense in lost yield and testing overhead
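The yield-for-speed trade can be sketched by treating the chip's critical delay as a random variable and choosing the clock period as the quantile that matches the target yield. The normal distribution and the mean and sigma values below are assumptions for illustration; the sketch is not intended to reproduce the 17% figure quoted on the slide.

```python
from statistics import NormalDist


def clock_period_for_yield(mean_delay, sigma_delay, target_yield):
    """Shortest clock period that `target_yield` of chips can meet.

    Assumes the critical delay of a chip is normally distributed.
    """
    return NormalDist(mean_delay, sigma_delay).inv_cdf(target_yield)


def relative_speedup(mean_delay, sigma_delay, high_yield, low_yield):
    """Fractional reduction in clock period when relaxing the yield target."""
    slow = clock_period_for_yield(mean_delay, sigma_delay, high_yield)
    fast = clock_period_for_yield(mean_delay, sigma_delay, low_yield)
    return (slow - fast) / slow


if __name__ == "__main__":
    # Hypothetical delay distribution: mean 1.0 ns, sigma 0.05 ns.
    for y in (0.9999, 0.98):
        print(y, round(clock_period_for_yield(1.0, 0.05, y), 4))
    print(round(relative_speedup(1.0, 0.05, 0.9999, 0.98), 4))
```

The size of the gain depends entirely on how wide the delay distribution is relative to its mean, which is why an accurate statistical model of variation matters before committing to a lower yield target.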
Increasing performance through design for manufacturability techniques • As mentioned, intra-chip variation is affected by layout • Most of the techniques presented are already in use in full custom design • Optical proximity correction (OPC) • Covers a wide range of reticle enhancement techniques • Geometrical structures are added to the mask • Critical dimension (L) and resolution variable • Corner rounding and line pull-back • Phase shifting mask (PSM)
Increasing performance through design for manufacturability techniques • Currently (2002), a significant effort is under way to provide cell libraries that are OPC- and PSM-compliant • This allows ASIC designers to benefit from those techniques • Inserting dummy features into regions of lower density increases uniformity • Improves the process uniformity of CMP • The downside is increased coupling capacitance, with the resulting delay and signal integrity dangers • It might be better to use a better model instead of adding metal fill, or to use both • Systematic, spatially correlated variation caused by lens aberrations would need a mask-level spatial correlation algorithm performed in conjunction with OPC
Increasing performance through design for manufacturability techniques • Is the parameter variation systematic or random? • Systematic variation can be modeled deterministically • Random variation (or variation too complex to model deterministically) is best described by statistical means
Conclusion • Intra-chip variation of process parameters is increasing • This makes the timing estimates provided by standard design methodology overly conservative • The rated speed is downgraded • New methods are needed for timing analysis • ASICs suffer more from these effects • No full-speed tests, and no trading of yield for speed