Chapter 12: Reliability and Failure Analysis

Chapter 12: Reliability and Failure Analysis. S-69.4123 Postgraduate Course in Electron Physics I P Alexander Perros 02.11.2012. Outline. Introduction Failure Times and Acceleration Factors Distribution Functions Cumulative Distribution Function Probability Density Function

Presentation Transcript


  1. Chapter 12: Reliability and Failure Analysis S-69.4123 Postgraduate Course in Electron Physics I P Alexander Perros 02.11.2012

  2. Outline
     • Introduction
     • Failure Times and Acceleration Factors
     • Distribution Functions
        • Cumulative Distribution Function
        • Probability Density Function
     • Reliability Concerns
        • Electromigration (in depth)
        • Hot carriers (in depth)
        • Briefly: Gate Oxide Integrity, Negative Bias Temperature Instability, Stress-induced leakage current, Electrostatic discharge
     • Failure Analysis and Characterization Techniques
        • Quiescent Drain Current (IDDQ)
        • Emission Microscopy
        • Overview of other techniques

  3. Introduction
     • Reliability
        • Probability of operating a product for a given time under specified conditions without failure. [1]
        • Reliability is a probability, i.e. a measure of the confidence that an event will occur, based on empirical evidence.
        • In principle, if ALL conditions were known, the outcome would be determined and no probability would be needed.

  4. Introduction
     • Reliability
        • Probability of operating a product for a given time under specified conditions without failure. [1]
        • Time is arbitrary
           • It starts at 0.
           • As time → ∞, the probability that the event (‘failure’ here) occurs approaches 100 %.
           • Reliability can be specified in units other than time and can be independent of time.

  5. Introduction
     • Reliability
        • Probability of operating a product for a given time under specified conditions without failure. [1]
        • The conditions are the fixed, chosen environment the product is subjected to; they play a critical role in its failure.
           • Reliability of Formula 1 tires in different weather conditions
           • Sensor in deep space vs. normal terrestrial applications (see cosmic rays)

  6. Introduction
     • Reliability
        • Probability of operating a product for a given time under specified conditions without failure. [1]
        • Failure is defined as the product not meeting an intended function at a certain performance level, which is defined by the application and other considerations (cost etc.).
        • Failure can range from complete inoperability to full functionality with an unacceptable performance deviation.

  7. Failure Times and Acceleration Factors
     • Failure Times
        • mean time to failure (MTTF)
        • mean time between failures (MTBF)
     where n is the number of products and tn is the operating time of product n until it fails.
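The averaging formula on this slide did not survive in the transcript. A minimal sketch, assuming the usual definition MTTF = (t1 + t2 + ... + tn)/n, where ti is the operating time of product i until it fails. The function name and the sample times are illustrative only and do not come from the lecture.

```python
def mean_time_to_failure(failure_times_h):
    """Average operating time until failure over n products (hours)."""
    if not failure_times_h:
        raise ValueError("need at least one failure time")
    return sum(failure_times_h) / len(failure_times_h)


# Hypothetical failure times of five non-repairable sensors, in hours.
sample_times_h = [1200.0, 950.0, 1430.0, 1100.0, 1320.0]
mttf_h = mean_time_to_failure(sample_times_h)
print(f"MTTF = {mttf_h:.1f} h")
```

For repairable products the same average would be taken over the intervals between successive failures instead.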

  8. Failure Times and Acceleration Factors
     • Failure Times
        • MTBF is more useful for repairable products
        • MTTF is more often used for products that are replaced after failure (e.g. a sensor)
     Fig. 1 Illustration of MTTF and MTBF (timeline of operating and failure periods with the MTTF and MTBF spans marked)

  9. Failure Times and Acceleration Factors
     • Failure Times
        • Macro manufacturing defects (infant mortality)
           • Can be reduced or eliminated with rigorous testing and burn-in*
        • Constant (random) failures
        • Wear-out failures
     Fig. 2 Illustration of the reliability bathtub curve with its components broken out. [2]
     *The burn-in process eliminates or reduces infant mortality by filtering out defective products or components before they are placed in service. It effectively moves tstart further along the bathtub timeline. [3]

  10. Failure Times and Acceleration Factors
     It is not practical to test products over their estimated operational lifetimes, which can span years. This gives rise to the need to ‘accelerate time’. The acceleration is achieved by increasing a stress condition (temperature, voltage, current, etc.); the resulting ratio of the time to failure under normal conditions to that under stress is called the acceleration factor (AF). It is assumed that the results can be extrapolated to normal operating conditions.
     • This is not always the case, as failure mechanisms may differ under accelerated conditions
     • Uncertainty ∝ degree of acceleration
        • Modest AF for lower uncertainty: longer reliability stress and product life tests
        • Higher AF for faster measurement times
     • Knowledge of the degradation mechanism is required

  11. Failure Times and Acceleration Factors
     • Acceleration Factor (AF)
        • Many failure modes are activation-energy limited
        • The Arrhenius equation characterizes such modes, where the constant A and the activation energy EA are assumed to be independent of temperature
        • The acceleration factor, AFT, is then simply the ratio of the time to failure at the base (normal operating) temperature, To, to that at the elevated temperature, T1
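The Arrhenius expressions themselves are missing from the transcript. A minimal sketch, assuming the standard form t = A exp(EA / kT), which gives AFT = t(To)/t(T1) = exp[(EA/k)(1/To - 1/T1)]. The activation energy and the two temperatures below are illustrative values, not data from the lecture.

```python
import math

K_B_EV = 8.617e-5  # Boltzmann constant in eV/K


def arrhenius_af(ea_ev, t_use_k, t_stress_k):
    """Ratio t(T_use) / t(T_stress) for an Arrhenius-limited failure mode."""
    return math.exp((ea_ev / K_B_EV) * (1.0 / t_use_k - 1.0 / t_stress_k))


# Hypothetical case: EA = 0.7 eV, use at 55 C, stress at 125 C.
af = arrhenius_af(0.7, 328.15, 398.15)
print(f"AF_T = {af:.1f}")
```

Because A cancels in the ratio, only EA and the two temperatures are needed; the estimate is only as good as the assumption that the same mechanism is active at both temperatures.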

  12. Failure Times and Acceleration Factors
     • Acceleration Factor (AF)
        • When the voltage is increased above the operating value, the time to failure is expressed as a function of voltage
        • The acceleration factor is again simply the ratio t(Vo)/t(V1)
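The voltage expressions are also missing from the transcript. As a hedged reconstruction, assuming the exponential voltage model often used for this purpose, with B and γ as fitted constants (both symbols are assumptions, not taken from the slide):

```latex
t(V) = B \exp(-\gamma V), \qquad
AF_V = \frac{t(V_o)}{t(V_1)} = \exp\left[\gamma\,(V_1 - V_o)\right]
```

As in the temperature case, the prefactor B cancels in the ratio, so only γ and the two voltages enter the acceleration factor.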

  13. Distribution Functions
     • Product failures are random in nature and occur with a frequency that varies over the measured unit, such as time.
        • This gives a frequency distribution over that unit
     • Failure or hazard rate λ: the probability of failure per unit time at time t, given that the member of the original distribution has survived until time t. [4]

  14. Distribution Functions
     • Probability density function
        • The number of failures at time t.
     • Cumulative distribution function
        • Known as the failure probability: the probability that a device will fail at or before time t.

  15. Distribution Functions
     • Probability density function
        • Models the failure distribution
        • Choose the ‘right’ model
     • Cumulative distribution function
        • Is the area under the probability density function up to t. As t → ∞, F(t) → 1; in other words, 100 % of products have failed. At t = 0, F(t) = 0.
     • Mean time to failure
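The equations on slides 13 to 15 are not in the transcript. The standard relations between the quantities named there are sketched below, with f(t) the probability density function, F(t) the cumulative distribution function and λ(t) the hazard rate. The survival function R(t) is a symbol introduced here for convenience.

```latex
F(t) = \int_0^t f(t')\,dt', \qquad
R(t) = 1 - F(t), \qquad
\lambda(t) = \frac{f(t)}{R(t)}, \qquad
\mathrm{MTTF} = \int_0^\infty t\, f(t)\,dt
```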

  16. Distribution Functions
     • Exponential Distribution
        • Simplest distribution function. Characterized by a constant failure rate over the lifetime of the device.
        • Useful when early failures and wear-out mechanisms have been eliminated
        • Frequently used in semiconductor analysis
     Fig. 3 Exponential probability and cumulative distribution curves [5]
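A minimal sketch of the exponential model, assuming the usual forms f(t) = λ exp(-λt) and F(t) = 1 - exp(-λt) with a constant failure rate λ. The rate used below is a made-up value for illustration.

```python
import math


def exp_pdf(t, lam):
    """Probability density f(t) for a constant failure rate lam."""
    return lam * math.exp(-lam * t)


def exp_cdf(t, lam):
    """Failure probability F(t): fraction failed at or before time t."""
    return 1.0 - math.exp(-lam * t)


def exp_mttf(lam):
    """Mean time to failure of the exponential distribution."""
    return 1.0 / lam


lam_per_h = 2e-5  # hypothetical failure rate, failures per hour
t_h = exp_mttf(lam_per_h)
# Hazard rate f / (1 - F) is the same at every time for this model.
hazard = exp_pdf(t_h, lam_per_h) / (1.0 - exp_cdf(t_h, lam_per_h))
print(t_h, exp_cdf(t_h, lam_per_h), hazard)
```

Note that at t = MTTF about 63 % of the devices have already failed, so the MTTF is not the median life.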

  17. Distribution Functions
     • Weibull Distribution
        • In the Weibull distribution function [6], the failure rate varies as a power of the device age.
        • Also known as the extreme value distribution
     τ and β are the scale and shape parameters, respectively. For β > 1 the failure rate increases with time, for β < 1 it decreases with time, and for β = 1 the Weibull distribution becomes the exponential distribution.
     Fig. 3 Weibull probability and cumulative distribution curves [7]
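The role of β can be checked numerically. A minimal sketch, assuming the common two-parameter form F(t) = 1 - exp[-(t/τ)^β] with hazard rate λ(t) = (β/τ)(t/τ)^(β-1); the parameter values are illustrative.

```python
import math


def weibull_cdf(t, tau, beta):
    """Failure probability F(t) for scale tau and shape beta."""
    return 1.0 - math.exp(-((t / tau) ** beta))


def weibull_hazard(t, tau, beta):
    """Failure rate: grows with t for beta > 1, falls for beta < 1."""
    return (beta / tau) * (t / tau) ** (beta - 1.0)


tau_h = 1000.0  # hypothetical scale parameter in hours
for beta in (0.5, 1.0, 2.0):
    early = weibull_hazard(100.0, tau_h, beta)
    late = weibull_hazard(900.0, tau_h, beta)
    print(beta, early, late)
```

With β = 1 the hazard is the constant 1/τ, which is the exponential case from the previous slide.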

  18. Distribution Functions
     • Normal Distribution
     σ is the scale parameter and t15.87 is the time by which 15.87 % of the devices have failed. [8]
     Fig. 4 Normal probability and cumulative distribution curves [9]

  19. Distribution Functions
     • Log Normal Distribution
        • Frequently used to describe failure statistics of semiconductor devices over long times
     Fig. 4 Log normal probability and cumulative distribution curves [10]
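A minimal sketch of the log-normal failure probability, assuming ln t is normally distributed with median time t50 and standard deviation σ (of ln t). The numbers are illustrative. It also shows where the 15.87 % figure from the normal distribution slide comes from: it is the cumulative probability one σ below the median.

```python
import math


def lognormal_cdf(t, t50, sigma):
    """Fraction of devices failed by time t when ln(t) is normally distributed."""
    z = math.log(t / t50) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


t50_h = 5000.0  # hypothetical median time to failure in hours
sigma = 0.5     # hypothetical standard deviation of ln(t)

print(lognormal_cdf(t50_h, t50_h, sigma))
print(lognormal_cdf(t50_h * math.exp(-sigma), t50_h, sigma))
```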

  20. Reliability Concerns: Electromigration
     • Electromigration: gradual displacement of line material caused by momentum transfer from moving electrons to metal ions under an applied electric field, together with diffusion of the metal.
        • Momentum transfer dominates in Al lines and Cu lines.
     Fig. 8 Illustration of electromigration [11]
     Fig. 9 Illustration of electromigration on a polycrystalline line [1]

  21. Reliability Concerns: Electromigration
     • Electromigration can lead to failures due to:
        • Increase in line resistance
        • A line becoming an open circuit (voids)
        • An adjacent line becoming a short circuit (whiskers)
     • Factors
        • Intrinsic: material, grain size, grain boundary orientation, triple point density, surface conditions, passivation etc.
        • Extrinsic: current and power density (related to size and dimensions, i.e. implications for scaling!); AC vs. DC

  22. Reliability Concerns: Electromigration
     • Electromigration mechanism
        • Diffusion through grain boundaries and material interfaces dominates
        • Triple points (void formation)
        • Mass depletion at transitions from large to small grains (void formation)
        • Surface
     Single-crystalline lines do not experience electromigration degradation.
     Fig. 10 Illustration of electromigration on a polycrystalline line [1]

  23. Reliability Concerns: Electromigration
     • What is affected
        • All metallic lines, interconnects and contacts face failure, which can result in complete IC failure. The risk increases as the current density and/or power density increases
           • high-power devices such as low-voltage MOSFETs
           • scaled-down VLSI and ULSI
        • Solder joints, as used in flip-chip bonding

  24. Reliability Concerns: Electromigration
     • Electromigration mitigation
        • Triple point removal
           • Natural consequence as lines narrow (see bamboo structure) (activation energy: intergrain vs. intragrain)
        • Diffusion retardation using impurities
           • Addition of 4 % Cu to Al lines (70x lifetime increase in Al lines). Addition of 1 % Pd or 1 % Ag [11] to Cu.
        • Material selection
           • Cu is less susceptible to EM degradation than Al
           • Refractory metals (TiN, W, etc.) experience almost no EM

  25. Reliability Concerns: Electromigration
     • Electromigration mitigation contd.
        • Creating a shunt with a layered structure
           • TiN can act as a shunt for an Al line should a weak spot develop.
        • Blech length
           • A critical length for metal lines exists, below which EM is inhibited

  26. Reliability Concerns: Hot Carriers
     • Hot Carriers
        • Energetic carriers (electrons, e, or holes, h) that gain energy in an electric field and cause ionizing collisions and avalanche multiplication of e/h pairs. The energetic species can be injected into the gate oxide as trapped charge or appear as gate drift current, form interface traps, generate photons, or appear as substrate current.
        • The dominant failure mechanism for nanoscale NMOS transistors.
     Fig. 10 Illustration of hot carrier damage on NMOS [1]

  27. Reliability Concerns: Hot Carriers
     • Hot carriers manifest in NMOS as:
        • Threshold voltage change (i.e. in the minimum voltage needed to create an inversion layer below the gate)
        • Transconductance degradation (drain current change divided by the change in gate/source voltage at constant drain/source voltage)
        • Forward bias of the source-substrate junction due to substrate current
        • Snapback breakdown (the output enters a negative resistance region, which leads to more impact ionization and can result in device destruction)

  28. Reliability Concerns: Hot Carriers
     • The chief degradation mechanism for n-channel MOSFETs is interface trap generation
        • Substrate current is a good monitor of such damage.
     • For p-channel devices, the chief degradation mechanism is trapped electrons near the gate-drain interface
        • Gate current is a good monitor of such damage.
     The weakness of monitoring by means of substrate or gate currents is that the measurement is indirect.

  29. Reliability Concerns: Hot Carriers
     • The interface trap density can also be measured
        • directly by charge pumping [12][13]
        • indirectly by threshold voltage, transconductance, or drain current changes

  30. Failure Analysis and Characterization Techniques: Microprobes
     • Mechanical probes are used to contact parts of an IC during FA.
     • AFM probes for sub-micron dimensions
     • Scanning capacitance and spreading resistance microscopy for ion implant and dielectric characterization

  31. Failure Analysis and Characterization Techniques: Microprobes
     • A number of metal lines are tested at various temperatures for a given current density. The resulting data are plotted as cumulative failures as a function of test time [1]

  32. Failure Analysis and Characterization Techniques: Microprobes
     • The median times to failure are then plotted as log(t50) versus 1/T and the activation energy is extracted [1]

  33. Failure Analysis and Characterization Techniques: Microprobes
     • Measurements are then made for various current densities at a given temperature. The exponent n can be found from the fitted slope.
     • With n and the activation energy known, the MTTF can be found for other temperatures or current densities. (Eq. 12.20)
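The procedure on slides 31 to 33 can be sketched numerically. Eq. 12.20 is not reproduced in the transcript; the sketch below assumes the widely used Black's equation form, t50 = A J^(-n) exp(EA / kT), and uses synthetic data generated from made-up parameters, so every number and function name here is illustrative only.

```python
import math

K_B_EV = 8.617e-5  # Boltzmann constant in eV/K


def slope(xs, ys):
    """Least-squares slope of y versus x."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def black_t50(j, temp_k, a, n, ea_ev):
    """Median time to failure under the assumed Black's equation form."""
    return a * j ** (-n) * math.exp(ea_ev / (K_B_EV * temp_k))


def extract_ea(temps_k, t50s):
    """Activation energy (eV) from ln(t50) versus 1/T at fixed J."""
    return K_B_EV * slope([1.0 / t for t in temps_k], [math.log(t) for t in t50s])


def extract_n(js, t50s):
    """Current density exponent from ln(t50) versus ln(J) at fixed T."""
    return -slope([math.log(j) for j in js], [math.log(t) for t in t50s])


# Synthetic "measurements" from hypothetical parameters A, n, EA.
A, N, EA = 1.0e3, 2.0, 0.7
temps_k = [450.0, 475.0, 500.0]
js = [1.0e6, 2.0e6, 4.0e6]  # A/cm^2
t50_vs_t = [black_t50(1.0e6, t, A, N, EA) for t in temps_k]
t50_vs_j = [black_t50(j, 475.0, A, N, EA) for j in js]

ea_fit = extract_ea(temps_k, t50_vs_t)
n_fit = extract_n(js, t50_vs_j)
print(ea_fit, n_fit)
```

With real data the points scatter around the fitted lines, and the extrapolation to use conditions inherits the uncertainty of both fitted slopes.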

  34. Failure Analysis and Characterization Techniques
     • Emission microscopy uses the emission of light in response to an electrical stimulus to locate potential defect sites. [1]
        • A failed chip is placed in an emission microscope and illuminated.
        • An image is taken so that the defect sites can be located later.
        • The illumination is turned off and a voltage is applied, which stimulates the emission of light.
        • The images are then superimposed.
     • Top vs. bottom imaging [1]

  35. Failure Analysis and Characterization Techniques
     • What can be detected?
        • Latch-up conditions, when there is a high density of excess carriers in the device and a ‘short’ is formed between components
        • Hot carriers, e.g. in the drain space charge region of a MOSFET
        • Carriers flowing through an oxide and losing energy (reverse-biased np-junction)
        • Weak spots in the gate oxide

  36. Failure Analysis and Characterization Techniques
     • What can be detected? (continued)
        • The spectral content of the emitted light can be used to gain insight into the failure mode. The light emission correlates well with substrate current (see hot carriers and Isub) and with device degradation.
        • Picosecond imaging circuit analysis (PICA) can be used to show time-resolved switching of devices [1]

  37. Overview
     • Oxide Integrity: Oxides are commonly characterized by their charge-to-breakdown or time-to-breakdown behavior, measured with constant or ramped gate voltage or gate current techniques.
        • Pro: The constant gate current technique has the advantage of yielding the charge-to-breakdown as a simple product, QBD = JG tBD.
        • Con: The current may not be uniform, and most devices are operated under constant gate voltage rather than constant current. Thin oxides frequently do not exhibit well-defined breakdowns, partly because the gate leakage current prior to breakdown is quite high.
     • Negative bias temperature instability: Most commonly characterized by threshold voltage, transconductance, interface trap density, and drain current measurements. Occurs in p-channel MOS devices stressed with negative gate voltages at elevated temperatures.
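A minimal sketch of the charge-to-breakdown bookkeeping. The constant-current case is the product QBD = JG tBD from the slide; the sampled version, which integrates a measured JG(t) with the trapezoidal rule, is an added assumption about how a constant-voltage stress could be handled. All values are illustrative.

```python
def q_bd_constant_current(j_gate, t_bd_s):
    """Q_BD in C/cm^2 for a constant gate current density j_gate (A/cm^2)."""
    return j_gate * t_bd_s


def q_bd_sampled(times_s, j_gate_samples):
    """Q_BD from sampled J_G(t) up to breakdown, by trapezoidal integration."""
    total = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        total += 0.5 * (j_gate_samples[i] + j_gate_samples[i - 1]) * dt
    return total


# Hypothetical stress: 0.1 A/cm^2 held until breakdown at 50 s.
print(q_bd_constant_current(0.1, 50.0))
print(q_bd_sampled([0.0, 10.0, 20.0, 50.0], [0.1, 0.1, 0.1, 0.1]))
```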

  38. Overview
     • IDDQ
        • Pro: Simple to implement, as only the input current to an IC needs to be monitored; very good at detecting shorts.
        • Con: Does not localize the fault; difficult to detect opens.
     • Emission Microscopy
        • Pro: Entire die may be viewed at one time; no deprocessing except for lid removal; functional failure does not need to propagate to the output. In the form of PICA it can be used to follow the switching function of an IC and allows circuit FA.
        • Con: IC must be biased and toggled; ohmic defects do not emit light; no light detection through opaque layers; emission site may not be the defect site. For imaging from the back of the chip: sample preparation is needed; substrate thinning may impact device characteristics; Si is an infrared filter and limits the detection bandwidth of emission sites; doping atoms scatter IR photons, leading to reduced sensitivity; CCD-based systems have a low quantum efficiency in the required IR spectrum.

  39. Overview
     • Voltage Contrast
        • Pro: Contactless method to determine the spatial and temporal voltage within an IC. The electron beam is small and can contact most lines in an IC.
        • Con: Difficult when the line of interest is buried below other metallization levels.
     • Liquid Crystal
        • Pro: Low cost, easy to use, very good thermal and spatial resolution, useful for thermal and voltage contrast analysis, real-time imaging.
        • Con: Tends to “wick up” around probes and bonding wires, making identification of hot spots difficult; poor thermal resolution for measurements from the back of the wafer; the number of layers between the source of the failure and the surface where the liquid crystal resides limits spatial resolution and sensitivity; liquid crystal has a set transition temperature. Multiple hot spots can be difficult to resolve if the warmer spot creates a significant temperature gradient.
     • Fluorescent Microthermography
        • Pro: Offers high thermal and spatial resolution.
        • Con: The film must be calibrated for quantitative temperature measurements.
     • Infrared Thermography
        • Pro: A passive technique that requires no thermal excitation, with good temperature resolution, allowing imaging from front and rear surfaces.
        • Con: Calibration is necessary for quantitative information but is not easy, since the emissivity is generally not known.
     • OBIRCH
        • Pro: A sensitive technique for a variety of FA investigations, with high resolution. When OBIRCH does not work well, EMMI frequently does; the two are complementary.
        • Con: Cannot be used for multiple metal layer chips; when used from the rear surface, the wafer must be thinned to 150–200 μm.

  40. References
     • M. Ohring, Reliability and Failure of Electronic Materials and Devices, Academic Press, San Diego, 1998.
     • (public domain) http://en.wikipedia.org/w/index.php?title=File:Bathtub_curve.svg&page=1
     • http://en.wikipedia.org/wiki/Burn_in
     • F.R. Nash, Estimating Device Reliability: Assessment of Credibility, Kluwer, Boston, 1993.
     • http://en.wikipedia.org/wiki/Exponential_decay
     • W. Weibull, “A Statistical Distribution Function of Wide Applicability,” J. Appl. Mech. 18, 293–297, Sept. 1951.
     • http://en.wikipedia.org/wiki/Weibull_distribution
     • W.J. Bertram, “Yield and Reliability,” in VLSI Technology, 2nd ed. (S.M. Sze, ed.), McGraw-Hill, New York, 1988.
     • http://en.wikipedia.org/wiki/Normal_distribution
     • http://en.wikipedia.org/wiki/Log-normal_distribution
     • http://en.wikipedia.org/wiki/electromigration
     • A. Scorzoni, B. Neri, C. Caprile, and F. Fantini, “Electromigration in Thin-Film Interconnection Lines: Models, Methods and Results,” Mat. Sci. Rep. 7, 143–220, Dec. 1991.
     • J. S. Brugler and P. G. A. Jespers, “Charge Pumping in MOS Devices,” IEEE Trans. Electron Devices, vol. 16, no. 3, pp. 297-302, 1969.
     • G. Groeseneken, H. Maes, N. Beltran, and R. F. De Keersmaecker, “Reliable Approach to Charge-Pumping Measurements in MOS Transistors,” IEEE Trans. Electron Devices, vol. 31, no. 1, pp. 42-53, 1984.
     • T. Schweinböck, S. Schömann, D. Alvarez, M. Buzzo, W. Frammelsberger, P. Breitschopf, and G. Benstetter, “New Trends in the Application of Scanning Probe Techniques in Failure Analysis,” Microelectron. Reliab. 44, 1541–1546, Sept./Nov. 2004; G. Benstetter, P. Breitschopf, W. Frammelsberger, H. Ranzinger, P. Reislhuber, and T. Schweinböck, “AFM-based Scanning Capacitance Techniques for Deep-submicron Semiconductor Failure Analysis,” Microelectron. Reliab. 44, 1615–1619, Sept./Nov. 2004.

  41. Exercises
     • Choose two reliability concerns (a) not covered in the lecture and describe them (~1 paragraph each). Explain how these two reliability concerns manifest in an example device of your choosing. Use figures and references.
     • Choose four failure analysis characterization techniques (b) not covered in the lecture and describe them (~1 paragraph each). Briefly list the pros and cons of the chosen techniques and be sure to mention if any techniques complement each other (for example, quiescent drain current testing is sometimes combined with emission microscopy). Use figures and references where appropriate.
     a. (Gate Oxide Integrity, Negative Bias Temperature Instability, Stress Induced Leakage Current, Electrostatic Discharge)
     b. (Quiescent Drain Current, Fluorescent Microthermography, Infrared Thermography, Voltage Contrast, Laser Voltage Probe, Liquid Crystals, Optical Beam Induced Resistance Change, Noise)
