The Life of SPICE as a Transient Circuit Simulator

The Life of SPICE as a Transient Circuit Simulator Chung-Kuan Cheng CSE Department UC San Diego

Outline: SPICE (Simulation Program with Integrated Circuit Emphasis) • Motivation and Contribution • Theory and Algorithms • Efforts to Improve the Simulator • What I have learned • Conclusion and Future Work

Motivation: The Life of Spice One of the most successful academic products. The software package and its derivatives have become the bread and buffer of circuit designers. After 46 years, the core of its transient simulation algorithms remains the dominant approaches to produce the gold standard for the characterization of the circuit behavior. “Every time we thought of a new technique, we hoped for a factor of 100 to 1 in speed improvement. We would accept 10 to 1, but we would only get 3 to 1, which was not enough to beat the improved speed of the new hardware coming out of Intel.”, D. Pederson, IEEE Spectrum 1998.

History of SPICE • Cancer (Computer Analysis of Nonlinear Circuits, Excluding Radiation) • 1971, Ron Rohrer (Ernest Kuh) • SPICE1(Simulation Program with Integrated Circuit Emphasis) • 1972, Larry Nagel (Don Pederson) • SPICE2 • 1975, Larry Nagel • SPICE3 • 1989, Tom Quarles (Richard Newton) • 1980’s and Beyond • Ngspice, HSPICE, Spectre, PowerSpice, et al. • FastSPICE

It took SPICE 10 years to attain commercial viability Research Ron Rohrer Development Larry Nagel Marketing Don Pederson All students used SPICE in his design classes He pushed very hard for “ease of use” Sales Students They fanned out into industry Source: IEEE Solid-State Circuits Magazine

D. Pederson, E.S. Kuh, R. Rohrer: NEC Award, 1995

with R. Rohrer:godfather of SPICE, San Diego, 2017

with Zhou FengSan Diego, 2017

Motivation: Devices + Interconnects Interconnects from wired.com 2016/05 google-tpu-custom-chips A standard cell with three metal layers (sand-colored) and vertical pillars (typically plugs of tungsten), polysilicon gates (red), and the solid silicon bulk. wikipedia Four layers copper interconnect, polysilicon (pink), wells (greyish), and substrate (green) wikipedia • Tightly coupling capacitance & inductance (post layout) • Nonlinear devices. • Low margin designs with severe noises

Motivation: Complexity • Billions of transistors, interconnect components, over 7-12 metal layers, and strong coupled/post-layout effects • Simulation is critical to systems and modules: e.g. memories, custom digital, mixed-signal designs. • Long simulation time (days, weeks) is one of significant bottlenecks. Impossible for full-chip simulation. • Severe parasitic and noise problem. Voltage drop analysis of a large network, Cadence Design Systems Inc. From Dick Sites, “Datacenter Computers modern challenges in CPU design” Google Inc. 2015 & Intel i7

Theory and Algorithm: Formulation Linearizedformat where • : linearized capacitance/inductance • : linearized conductance/resistance • : nonlinear device off-set • : node voltage/branch current • : incident matrix • : input sources

Theory and Algorithm: Formulation TypicalRLCcircuitanditsequationsfrom IEEE TCAD PRIMA 1998

SPICE Flow: Transient Analysis • DC analysis • Device Evaluation (easy to parallelize) • Numerical Integration • Solve the dynamical (differential equation) systems • The most time consuming part, when system is large • Convergence & Error Check • Step Control Circuit netlist Device Evaluation & DC Analysis Device Evaluation Numerical Integration Time stepping Convergence & Error Check Re-evaluate Step Control finish

Theory and Algorithm: Problems Huge, stiff, nonlinear, dynamic system • Circuits of millions to billions of elements with parasitics. • Stiffness: System eigenvalues variate by a range of 107 or higher. • CMOS models: parameters change drastically between on and off states. • Frequencies: 1K-1G hertz or wider. Simulation goes through the details of Gigahertz to cover the trend of Kilohertz. Issues: Accuracy, Stability, Complexity (Computation time and power consumption).

Theory and Algorithm: Problems Issues: Accuracy, Stability, Complexity (Computation time and power consumption). • Accuracy: algorithm, step size and numerical precision • Double-precision floating-point: 16 significant decimal digits, Exponent (-1024-1023) • Stability: An algorithm is called numerically stable if an error, whatever its cause, does not grow to be much larger during the calculation.

Theory and Algorithm: SPICE Approaches • Numerical Integration • Dahlquist barrier: No linear multi-step method with order larger than 2 can be A stable. • Low order linear multi-step integration to preserve the A stability • Nonlinearity • Newton Raphson iteration • Convergent when the initial guess is close to the solution (reducing the step size h) • Matrix Solver • Direct sparse matrix solver • Robust and reasonably efficient

Theory and Algorithm: Integration A homogenous dynamical system Solution: Numerical Integration (Conventional Method) • Forward Euler • Backward Euler where =A • Trapezoidal Method

Numerical Integration Accuracy • An example: , where • EXPM: Analytical solution • FE: Forward Euler ; BE: Backward Euler; TR: Trapezoidal Method

Efforts to Improve the Simulator • Waveform Relaxation Method • Alternating Direction Implicit (ADI) Method • Integration with High Order Derivatives • Matrix Solvers • Matrix Exponential Integrator

Waveform Relaxation Method , . + Partitioning and distributed computation + Each differential equation picks its own timesteps - Gauss-Seidel and Gauss-Jacobi techniques, Convergence, Stability E. Picard, “On the application of successive approximation methods to the study of some ordinary differential equations,” Journal of pure and applied mathematics, 1893. E. Lelarasmee, A.E. Ruehli, A.L. Sangiovanni-Vincentelli,”Waveform relaxation: theory and practice,” IEEE Trans. On CAD 1982.

Alternating Direction Implicit Method Split the G matrix: Alternate two half steps of trapezoidal integration • Partitioning is the key • The number of non-zero elements is decreased • Stable D.W. Peaceman, H.H. Rachford Jr., Journal of SIAM, 1955. T. Namiki, K. Ito, IEEE Tran. Microwave Theory and Technology, 1999. 2D EM wave. Z. Zhu, R. Shi, and C.K. Cheng, E.S. Kuh, ASPDAC 2006.

Integration with High Order Derivatives Use Taylor expansion with high order derivatives to approximate the integration solution. + High order accuracy + Stable • N-fold dimensions (N the order of the formula) • High order derivatives of nonlinear models • N. Obreshkov, On the mechanical quadrature, Akad. Nauk, 1942 • E. Gad, M. Nakhla, R. Achar, Y. Zhou, A-stable and L-stable high-order integration methods for solving stiff differential equations. IEEE TCAD, 2009.

Matrix Solvers Speed up the calculation of matrix solvers via supercomputer, GPU, node ordering and graph sparsification. • Supercomputer, Trilinos: Xyce, S. Hutchinson, E. Keiter, R. Hoekstra, H. Watts, A. Waters, R. Schells, S. Wix, Parallel Computing: Advances and Current Issues (pp. 165-172), 2002. • Node Ordering, Block Triangular Form: KLU, T.A. Davis, E. Palamadai Natarajan. ACM Transactions on Mathematical Software, 2010. • GPU: L. Ren, X. Chen, Y. Wang, C. Zhang, H. Yang, H., 2012, Design Automation Conference, 2012. • Graph Sparsification: Z. Feng, Design Automation Conference 2016.

Matrix Exponential Methods • Analytical Formulation • Standard Krylov Method • Rational Krylov Method • Invert Krylov Method C. Moler, C. Van Loan, 19 dubious ways to compute the exponential of a matrix, 1978; twenty-five years later, 2003 A. Nauts, R.E. Wyatt, 1983; T.J. Park, J.C. Light, 1986; Y. Saad, 1992

Matrix Exponential Method • Analytical solution perspective • Let (C can be regularized [TCAD 2012]) • Let input be piecewise linear

MatrixExponentialComputation Transform into For simplicity, we use to represent , from now on • Forward Euler fits the first two terms as (I+hA) • Backward Euler fits the first two terms as (I-hA)-1 • Trapezoidal fits the first three terms as (I-0.5hA)-1 (I+0.5hA)

Matrix Exponential Method • Accuracy: Analytical solution • ApproximateeAh as (I+Ah)  Forward Euler • ApproximateeAh as (I-Ah)-1  Backward Euler • Scalable:Sparse matrix-vector multiplication (SpMV) • Stability: Stable for passive circuits reference solution

Krylov Subspac: Approximatev • Krylov subspace • Standard Basis Generation • Orthogonalization (Arnoldi Process): • Matrix reduction: m=10~30 • Matrix exponential operator (Assume = or =1) • time stepping, h, via scaling • Posteriori error estimate [Saad92]

Property of Standard Krylov Subspace Method • Theorem [Saad ‘92]: Standard Krylov method fits the first m terms in Taylor’s expansion. • Standard Krylov subspace tends to capture theeigenvalues of large magnitude • For transient analysis, the eigenvalues of small real magnitude are wanted to describe the dynamic behavior.

Rational Krylov Subspace • Spectral Transformation: • Shift-and-invert matrix A • Rational Krylov subspace captures slow-decay components • Use rational Krylov subspace for matrix exponential Important eigenvalue: Component that decays slowly. Not so important eigenvalue: Component that decays fast.

Rational Krylov Subspace: eAhv Rational Krylov subspace • Arnoldi orthogonalization Vm=[v1 v2 … vm] • Matrix reduction (Assume that v1=v) • Matrix exponential operator

Invert Krylov Subspace eAhv Invert Krylov space • Arnoldi orthogonalization Vm=[v1 v2 … vm] • Matrix reduction (v1=v) • Matrix exponential operator Comment: The calculation of A-1vi=G-1Cvi is much simpler when G is sparse but C is complex for post layout simulation

Absolute Error: A Sample Test Case We derive absolute errors of various methods using a sample test case with known solution (Eigenvalue/eigenvector approach). • n: circuit node number= 1600 • G: conductance range [0.01, 100] • C: capacitance range [8.5e-18, 9.9e-16] • A=-C-1G: eigenvalue range [-3.98e17, -8.49e10] • v: L2 norm = 23.3; L norm = 0.999 from rand(1600,1) by MATLAB

Matrix exponential operators vs. low order scheme • FE: Forward Euler • Std. M: standard Mexp, m=2 • Inv. M: Invert Mexp, m=2 • Rat. M: Rational Mexp, m=2 • BE: Backward Euler • TR: Trapezoidal

Matrix exponential operators vs. low order scheme • Case h< e-17 (Taylor’s expansion is valid) • Trapezoidal method: error terms start from the third order • BE, FE, and Krylov methods: error terms start from the second order • Invert and rational Krylov methods: error terms start from the first order if is not well chosen • Case e-17<h<e-10 • FE method: the result diverges • Trapezoidal and Krylov methods: the results are flat if is not well chosen • BE method and invert/rational Krylov methods: the results look better • Case e-10<h • FE method: the result diverges • Trapezoidal and BE methods: the results are flat • All Krylov methods: the absolute errors shrink with the solution.

Matrix exponential operators vs. low order scheme • For three Krylov methods, m=2 • For rational Krylovthod, is set a large constant • * TR is worse than that BE

Rational Krylov Approach: = h/2

Rational Krylov Method vs Trapezoidal Method When , rational Krylov and trapezoidal methods use the same subspace: ()-1 . However, rational Krylov method is better in terms of accuracy or allows longer time step h for the same accuracy.

Rational Krylov Approach: = 5e-11

Rational Krylov Approach: = 5e-14

Rational Krylov Space: =5e-17

Rational Krylov Space Method • For a linear system and a constant , we need only one LU decomposition for different time step h. • If constant is large enough, the longer time step h we use, the better accuracy we obtain.

Standard and Invert Krylov Methods

Standard and Invert Krylov Methods As m increases, the curve of Krylov method shifts to the right and converges at the right end the curve of invert Krylov method shifts to the left and converges at the left end. The desired time step h is around 10-12second range which is near the right end.

Three Krylov Methods Standard Krylov Method Result is accurate when time step size h is tiny C cannot be singular (regularization) Rational Krylov Method Result is superior to trapezoidal method for all h For proper choice of , the result is the best among the three Krylov methods The calculation is the most expensive (C+G)-1 Invert Krylov Method Result is good when step size h is large enough The calculation is the simplest G-1C, in particular for post layout simulation

Post Layout Analysis • Inverted Krylov Subspace Method • Backward Euler • Trapezoidal Method

Post Layout Analysis LU=C G C LU=C+hG LU=G

NumericalStabilityofSingularSystems • A1 tankRLCmodelwith a singularCmatrix • Toavoid,exponentialintegratorsshouldbeevaluatedwithlow-orderfunctions where

NumericalStabilityofSingularSystems • StepinputI_swithrisingtime10ps • Element-wiseresidueisplottedforeachvariable • Theincreasingresiduecouldbeapproximatedwithasensitivitymatrix.SinceafterT=10ps,thesolutionis

ErrorEstimationwithSensitivityMatrix • FromArnoldiprocess,wehavetherelationwith • Assumealocalerrorwithintargetfunction theerrorwithisorhigherordercomparedto where

The Life of SPICE as a Transient Circuit Simulator