Software Reliability Research by Professor Pankaj Jalote

Software Reliability Research Pankaj Jalote Professor, CSE, IIT Kanpur, India

System Reliability • System – an entity that provides defined behavior at its interfaces • A system is a hierarchy of subsystems, each subsystem itself being a system • Reliability of a system – its ability to provide failure-free operation • Failure – the system behavior is incorrect or not as expected; failure is treated as a random phenomenon

Reliability Quantification • Reliability of a system is defined as the probability of failure-free operation over a time period: R(t) = probability that the system has not failed by time t • For reliability work, the distribution underlying R(t) is often specified

Reliability Quantification.. • Reliability can also be quantified by Mean Time to Failure (MTTF) • Also by failure rate (number of failures per unit time) • From R(t), MTTF or failure rate can be determined • Under some assumptions (e.g. a constant failure rate), failure rate and MTTF are inversely related
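To make the relationship between R(t), failure rate and MTTF concrete, here is a minimal sketch under one common assumption: a constant failure rate, which gives an exponential R(t). The function names and the example rate are illustrative and do not come from the slides. It also checks numerically that MTTF equals the integral of R(t) over time.

```python
import math

def reliability(t, failure_rate):
    """R(t) = exp(-failure_rate * t): probability of no failure by time t,
    assuming a constant failure rate (exponential model)."""
    return math.exp(-failure_rate * t)

def mttf_from_rate(failure_rate):
    """With a constant failure rate, MTTF is the reciprocal of the rate."""
    return 1.0 / failure_rate

def mttf_numeric(failure_rate, horizon, steps=100_000):
    """Approximate MTTF as the integral of R(t) from 0 to horizon
    using the trapezoid rule. horizon should be many times the MTTF."""
    dt = horizon / steps
    total = 0.5 * (reliability(0.0, failure_rate) + reliability(horizon, failure_rate))
    for k in range(1, steps):
        total += reliability(k * dt, failure_rate)
    return total * dt

rate = 0.01  # hypothetical value: failures per hour
print("R(100 h)      =", reliability(100, rate))
print("MTTF (1/rate) =", mttf_from_rate(rate))
print("MTTF (integral of R) =", mttf_numeric(rate, horizon=2000.0))
```

With other failure distributions the reciprocal relationship does not hold in general, which is why the slide qualifies it with "under some assumptions".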

Software Reliability • Software (un)reliability is caused not by aging but by bugs • The more bugs, the lower the reliability of the software • Failures still appear random, hence reliability theory can be applied

Software Reliability Research • Two main threads • Software reliability modeling – how to model and predict software reliability • Improving software reliability – by removing defects through program checking, verification, testing,… • Will discuss some work being done here in these two threads

Software Reliability Modeling

Software Reliability • Software systems often are one-off • Measuring reliability in lab not practical as too much failure data is needed; requires time • Failures often result in fault removal, leading to reliability improvement • Predicting future reliability from measured reliability is harder • Hence different models needed

Software Reliability Growth Models • Assume that reliability is a function of the defect level and as defects are removed, reliability improves • Model the failure-fix process of software evolution • Many models have been proposed in the last 3 decades • Model parameters determined from past data on failures and fixes
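The slides do not name a specific growth model. As one illustration only, the sketch below uses the Goel-Okumoto model, a widely cited SRGM picked here as an example rather than as the model the talk has in mind. In it, the expected cumulative number of failures by time t is a(1 − e^(−bt)). The data is synthetic, and the crude grid search stands in for a proper estimation procedure such as maximum likelihood.

```python
import math

def expected_failures(t, a, b):
    """Goel-Okumoto mean value function: expected cumulative failures by time t.
    a = expected total number of failures, b = per-fault detection rate."""
    return a * (1.0 - math.exp(-b * t))

def failure_intensity(t, a, b):
    """Derivative of the mean value function: expected failures per unit time at t."""
    return a * b * math.exp(-b * t)

def fit_by_grid_search(times, observed, a_grid, b_grid):
    """Pick the (a, b) pair on the grid that minimizes the squared error
    between observed and predicted cumulative failure counts."""
    best = None
    for a in a_grid:
        for b in b_grid:
            sse = sum((obs - expected_failures(t, a, b)) ** 2
                      for t, obs in zip(times, observed))
            if best is None or sse < best[0]:
                best = (sse, a, b)
    return best[1], best[2]

# Synthetic "past data": cumulative failures generated from a = 100, b = 0.05.
weeks = [5, 10, 15, 20, 30, 40]
observed = [expected_failures(t, 100.0, 0.05) for t in weeks]

a_grid = [float(a) for a in range(50, 151, 5)]
b_grid = [k / 1000 for k in range(10, 101, 5)]
a_hat, b_hat = fit_by_grid_search(weeks, observed, a_grid, b_grid)

print("estimated a, b:", a_hat, b_hat)
for week in (0, 10, 20, 40):
    print(week, round(failure_intensity(week, a_hat, b_hat), 3))
```

The falling failure intensity is the "reliability growth" the slide refers to: as faults are found and fixed, fewer failures are expected per unit time.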

Reliability of Software Products • For software products, a large population of units exists in the field, and faults are not removed as failures occur • According to SRGMs, the reliability should therefore remain the same • I.e. the failure rate should be constant

Average Failure Rate of an MS Product

Reasons for this Phenomenon • Users learn with time and avoid failure-causing situations • Users start by exploring more, then limit themselves to some part of the product • Most users use only a few product features • Configuration-related failures are much more frequent at the start • These failures reduce with time

A New Model for Product Rel. • For a user, there is a transient failure rate, which decays by a constant factor • With time the transient dies out, and the failure rate reaches a steady state • Steady state failure rate – represents the reliability of the product

Failure Rate of a Unit • Failure rate for one unit is λ(i) = λ0 · α^i + λf • λ0 is the initial transient rate • λf is the final steady state rate • α is the decay factor

Applying it to a Product • Considered the failure and sales data of a real MS product • Applying the model to the data and determining the parameters, we get: λ0 = 0.04 failures/month; λf = 0.008 failures/month; α = 0.4 (i.e. 40% decay each month)
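A small sketch of the per-unit model with the parameter values quoted above. It assumes that i counts whole months since the unit entered service, starting at 0; the slides do not state the indexing, so treat that as an assumption. The function name is illustrative.

```python
def unit_failure_rate(i, lam0, lamf, alpha):
    """Failure rate of one unit in month i (assumed: i = 0 is the first month
    in service): a transient term lam0 * alpha**i plus the steady state lamf."""
    return lam0 * alpha ** i + lamf

# Parameter values quoted in the slides (failures/month).
LAM0, LAMF, ALPHA = 0.04, 0.008, 0.4

for month in range(7):
    rate = unit_failure_rate(month, LAM0, LAMF, ALPHA)
    # Second column: rate in failures/month; third: multiple of the steady state rate.
    print(month, round(rate, 5), round(rate / LAMF, 2))
```

Under this reading the transient term is multiplied by α each month, so it shrinks quickly and the unit's failure rate approaches λf within a few months.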

Example… • Steady state failure rate is one-sixth of the average rate in month 2 and one-third of the average rate in month 4 • I.e. the initial MTTF could be one-sixth of the steady state MTTF • Steady state is reached quite soon – in two to three months
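The "average rate" in a given month depends on how many units of each age are in the field, which in turn depends on sales. The slides do not reproduce the sales data, so the sketch below uses made-up monthly sales and one plausible averaging scheme (a unit-weighted mean of the per-unit rates); both are assumptions, not the talk's actual data or method. It shows why the product-wide average starts high and drifts toward λf as the installed base ages.

```python
def unit_failure_rate(age, lam0=0.04, lamf=0.008, alpha=0.4):
    """Per-unit failure rate at a given age in months (parameters from the slides)."""
    return lam0 * alpha ** age + lamf

def average_failure_rate(sales, month):
    """Unit-weighted average failure rate over all units in the field in `month`.
    sales[m] is the number of units put into service in month m, so a unit
    sold in month m has age (month - m)."""
    units = sales[: month + 1]
    total_units = sum(units)
    total_rate = sum(n * unit_failure_rate(month - m) for m, n in enumerate(units))
    return total_rate / total_units

# Hypothetical sales figures: strong launch, then tapering off.
sales = [1000, 800, 600, 400, 200, 100, 50, 25]

rates = [average_failure_rate(sales, month) for month in range(len(sales))]
for month, rate in enumerate(rates):
    print(month, round(rate, 5))
```

If sales kept growing instead of tapering, new units with high transient rates would hold the average up for longer, so the shape of the curve depends on the sales profile as well as on α.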

Software Architecture Based Rel Estimation

Sw Architecture • Architecture: the components in the system and how they are connected • It is decided very early in a software project • If reliability and performance can be modeled from the architecture, the architecture can be improved • Some work is going on in architecture-based performance and reliability modeling

Program Verification

Program Verification • Basic goal – to ensure that program is free of defects (bugs) as much as possible • Good program verification leads to higher reliability

Program Verification Techniques • Testing – program is executed with test data to find bugs • Static analysis – program source code is analyzed • Dynamic analysis – program run on some data and assertions made • Model checking • Formal verification

Techniques • Most techniques work in isolation • Sometimes they are complementary in their defect detection capability • Combining techniques meaningfully can improve reliability • We are working on techniques for combining testing and static analysis

State-based Testing Automation

Testing • Testing remains the main verification activity and is the one relied on most • Consumes as much as half of the total effort in a software product • Testing involves test case design, execution, checking the results, then debugging, fixing, retesting • Each step is expensive

Test Automation • Test automation can help reduce cost and make testing more effective • Most test automation approaches focus on data collection, re-testing • Little effort in complete end-to-end automation • We are working on automating OO testing using state based models
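The slides do not describe how the state-based automation works, so the following is only a generic sketch of the idea, not the author's method: a class is described by a small state model, one test sequence is generated per transition (transition coverage), and each sequence is executed with the model acting as the oracle. The `BoundedStack` class, the state names and the helper functions are all hypothetical.

```python
from collections import deque

class BoundedStack:
    """Hypothetical class under test."""
    def __init__(self, capacity=2):
        self.capacity = capacity
        self.items = []

    def push(self, x=0):
        if len(self.items) >= self.capacity:
            raise OverflowError("stack full")
        self.items.append(x)

    def pop(self):
        if not self.items:
            raise IndexError("stack empty")
        return self.items.pop()

# State model for capacity 2: (state, event) -> next state.
MODEL = {
    ("EMPTY", "push"): "PARTIAL",
    ("PARTIAL", "push"): "FULL",
    ("PARTIAL", "pop"): "EMPTY",
    ("FULL", "pop"): "PARTIAL",
}
INITIAL = "EMPTY"

def abstract_state(stack):
    """Map the concrete object to a state of the model."""
    if not stack.items:
        return "EMPTY"
    if len(stack.items) == stack.capacity:
        return "FULL"
    return "PARTIAL"

def path_to(target):
    """Shortest event sequence from INITIAL to target, found by BFS over the model."""
    queue = deque([(INITIAL, [])])
    seen = {INITIAL}
    while queue:
        state, events = queue.popleft()
        if state == target:
            return events
        for (src, event), dst in MODEL.items():
            if src == state and dst not in seen:
                seen.add(dst)
                queue.append((dst, events + [event]))
    raise ValueError(f"state {target} is unreachable")

def generate_tests():
    """One test per transition: reach the source state, then fire the event."""
    return [(path_to(src) + [event], dst) for (src, event), dst in MODEL.items()]

def run_test(events, expected_state):
    """Execute the events on a fresh object and compare with the model's prediction."""
    stack = BoundedStack(capacity=2)
    for event in events:
        getattr(stack, event)()
    return abstract_state(stack) == expected_state

results = [(events, run_test(events, expected)) for events, expected in generate_tests()]
for events, ok in results:
    print(events, "PASS" if ok else "FAIL")
```

Here test design, execution and result checking all come from the model, which is what end-to-end automation would need. A fuller version would also generate tests for events that are illegal in a state, such as popping an empty stack.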

Summary • Software reliability is a rich and wide area • Exciting work going on across the world in modeling, analysis, program checking, testing, etc • Lots of open issues
