1 / 41

SMU CSE 8314 / NTU SE 762-N Software Measurement and Quality Engineering

SMU CSE 8314 / NTU SE 762-N Software Measurement and Quality Engineering. Module 12 Software Reliability Overview. Contents. Introduction Measuring Reliability Summary. Introduction. Reliability is the “Bottom Line” of Software Quality.

idola
Télécharger la présentation

SMU CSE 8314 / NTU SE 762-N Software Measurement and Quality Engineering

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. SMU CSE 8314 / NTU SE 762-NSoftware Measurement and Quality Engineering Module 12 Software Reliability Overview

  2. Contents • Introduction • Measuring Reliability • Summary

  3. Introduction

  4. Reliability is the “Bottom Line” of Software Quality • Reliability is the most conspicuous attribute of quality • But what do we mean by reliability? • End-User’s Perspective • It does what I want • It never fails • etc. • What can be Measured • It does what was specified • Failure rates are low • etc. Not a Perfect Match

  5. Hardware Reliability Focuses on Materials and Production • Assumption: failure results from physical changes • breakage, wearout, fatigue • Or incorrect manufacturing processes • Theory of reliability is founded on the statistical behavior of random molecules

  6. But Product Development can also be a Factor in Quality & Reliability • Bulletproof Vest • 2” thick steel vs. • lightweight synthetic material • Quality? • Reliability? • Suitability for Purpose?

  7. Poor Development Practices can Lead to Failure • What if the design puts undue strain on a part? • Was the failure due to the part or the design? • What if the product wasn't properly tested? • Car overheats in the desert (never tested that severely)

  8. Poor Software Development Practices can lead to Failure • Software Failures are almost always attributable to product development practices • Design - Testing • Coding - CM

  9. Software Reliability • Failure: when it does not do what it is supposed to do • Defect: the reason for the failure • Bad code/data/design/requirements • Bad configuration control • etc. “The extent to which software correctly performs the functions assigned to it”

  10. Improving Software ReliabilityOption 1 Design software to be fault tolerant • Redundancy • Multiple algorithms • This approach has been shown to have very little effect on overall reliability • It is a better fit to the hardware paradigm that involves fatigue of parts

  11. Improving Software ReliabilityOption 2 Develop software to befree of defects • Prevention activities • Detection activities • This is where we will concentrate • Traditionally this has been viewed as a lost cause • But using modern techniques of quality improvement, there is hope for success

  12. Measuring Reliability

  13. Definitions Failure • When the product does not do what it is expected to do for a given set of input or operating conditions. Fault (depends on author) • A condition that causes failures. Defect (depends on author): • A fault found before / after product release • Any cause of failure • Any error, regardless of whether it is caught before release • Other terms: bug, mistake, malfunction, etc.

  14. What is Reliability? • Reliability is the probabilitythat software will continue to function for a given period under given conditions • The period can be measured in natural units or time units • Natural unit – something that measures the amount of processing performed by the software, such as “runs”, pages of output, screens displayed, jobs completed, etc. • Time units are hours, minutes, days, weeks, etc. • Reliability can also be measured as failure intensity – the number of failures expected per natural unit or time unit

  15. What is Reliability? • If reliability is measured in terms of a time unit, denoted t, then • t is a random, failure free time interval. • The question we would like to answer is: how long is t? In other words, how long will it function without failure? • But since we cannot know this, we can only estimate the probability of failure for a given value of t.

  16. Reliability is not Correctness • Reliability means that it does what you want it to do often enough to be satisfactory whereas • Correctness is a binary, “yes or no” condition • Software is almost never perfectly correct • But it can be highly reliable

  17. This definition is usually expressed as a function: R(t) = probability of operation without failuresin time t (i.e., in the interval 0-t) • For example, with an exponential distribution, the classic reliability equation is: R(t) = e -t/

  18. Graph of a Reliability Function R(t) = e -t/

  19.  Measures Reliability as a Constant • For large values of , the probability of operation without failure remains high for a longer period of time • For small values of , the probability of operation without failure deteriorates quickly This works with an exponential distribution function. Otherwise, there may not be a simple constant to measure reliability of the total product.

  20. Additional Notes about Reliability • The desired value of t depends a lot on the application and the priorities • Commercial application • t is large & the priority is to have few defects over the life of the application in order to keep maintenance cost low • Real time application • e.g. an aircraft application, t is relatively short & failures in operation are critical

  21. Failure Function • Another popular approach is to look at the probability of a failure: F(t) = 1 - R(t) = probability of failure in time t • The latter is called a failure function. • It is the cumulative distribution function of the time interval 0-t. • For the exponential distribution, the failure function is: F(t) = 1 - e -t/

  22. Graph of a Failure Function F(t) = 1 - e -t/

  23. Probability Density Function • Attempts to put it in another form that means something to a user: “(approximately) how many defects are left in the software?” For the exponential distribution, the formula is: f(t) = dF(t)/dt f(t) = a-1e -t/

  24. Graph of Density Function f(t) = a-1e -t/

  25. Post Release Defects

  26. All three functions( = 2)

  27. Failure Rate ()(Hazard Function; Failure Intensity) • This is an attempt to estimate the probability of a failure at a given time, assuming no prior failures. (t) = f(t)/R(t) = -dR(t)/dt

  28. How to Determine  • For an exponential distribution,  is a constant • I.e., the higher the reliability, the lower the failure rate = 1/a This works with an exponential distribution. Otherwise, l may be a more complex function.

  29. Failure Ratevs Number of Defects • Most models assume the failure rate is directly related to the number of defects remaining in the software. Is this a reasonable assumption? (Discussion)

  30. Problems with the Assumptions for Classic Definitions • Defects are random with respect to the structure of the software • But some parts are harder to write than others • Testing is uniform with respect to software • But some parts are better tested than others

  31. More Problems with Assumptions • All defects are equally likely to occur • But it actually depends on the paths taken most often • All defects produce equally serious failures • Clearly not the case for software • Testing correctly simulates normal, stressful and unusual conditions • Generally this is very hard to do

  32. Error ProbabilityHardware vs. Software

  33. Another View Infant Mortality Wearout

  34. Software May Not FitClassical Models of Reliability • There is very little solid data for any of the models in use for software • Conditions that differ may produce very different failure rates, even though defect count remains the same

  35. How to Measure Time • The measure of time is a matter of considerable dispute • This may dramatically affect how you determine reliability

  36. Three Ways to Measure Timefor a Software Product • Real Time (Calendar Time) • Number of weeks or months since some event • Use Time • Number of hrs the software is in actual use • CPU Time • Number of hours using the CPU • Each of these produces very different results and fits very different models

  37. The Nature of The Application • Different applications can have very different notions of reliability

  38. Different Applications - Different Reliability Implications Application: Financial Transactions Problem: Floating Point Round off Errors Not a Problem: Excessive Time for Calculations Application: Space Craft Local Navigation Problem: Excessive Time for Calculations Not a Problem: Floating Point Round off Errors Application: Space Craft Flight Path Calculations (ground based) See Financial Transactions

  39. Summary • Reliability is important, but hard to measure in a way that relates to customer expectations • Software reliability is mainly determined by development practices rather than by manufacturing or materials • Reliability, failure rate, and failure density are related measures

  40. References • Lyu, Michael R., Handbook of Software Reliability Engineering, IEEE, 1996, Catalog # RS00030. ISBN 0-07-039400-8. • Musa, John, Software Reliability Engineering: More Reliable Software, Faster Development and Testing, McGraw Hill.ISBN: 0-07-913271-5. • Xie, M. Software Reliability Modeling, World Scientific, London, 1991. ISBN 981-02-0640-2.

  41. END OF MODULE 12

More Related