Analyzing Software Contribution to System Failures

PROBLEM
• Software causes many failures - significant mission risk
• Hard to quantify the effects on system risk of:
  • software defects
  • software development practices
  • software verification and validation (V&V)

SOLUTION
Link factors affecting software quality to system failure modes by:
• Predicting the number of defects in software subsystems given software development and V&V decisions
• Using a fault tree to link software defects to system failures, affecting their probabilities

TECHNOLOGY
Prototype implementation, in the Eclipse software development environment, of a tool linking:
• ODC COQUALMO - USC/Ames tool that predicts the number of software defects and the effectiveness of V&V tools
• DDP - JPL tool that calculates risks, costs and the effect of mitigation strategies from a user specification of links between system objectives, risks and mitigations
In this application, DDP represents the system fault tree, in which some leaf nodes correspond to software defects. DDP derives information from ODC COQUALMO on software defect numbers, types and V&V effectiveness. DDP calculates the system failure probability before and after the selected V&V mitigations are applied.
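The idea of a fault tree whose leaf nodes correspond to software defects can be illustrated with a minimal Python sketch. It is not the ODC COQUALMO or DDP model: the class names, the defect counts, the per-defect failure probabilities, the assumption of independent events, and the treatment of V&V as removing a fixed fraction of predicted defects are all hypothetical choices made for illustration only.

```python
from dataclasses import dataclass
from math import prod
from typing import List


@dataclass
class BasicEvent:
    """Non-software leaf event with a fixed probability (hypothetical value)."""
    name: str
    p: float

    def probability(self, vv_effectiveness: float = 0.0) -> float:
        # V&V on software does not change a hardware event in this sketch.
        return self.p


@dataclass
class SoftwareDefectLeaf:
    """Leaf event caused by residual defects in one software subsystem."""
    name: str
    predicted_defects: float      # hypothetical stand-in for a defect prediction
    p_failure_per_defect: float   # hypothetical chance one defect triggers the event

    def probability(self, vv_effectiveness: float = 0.0) -> float:
        # Assumption: V&V removes a fraction of the predicted defects.
        remaining = self.predicted_defects * (1.0 - vv_effectiveness)
        # Probability that at least one remaining defect triggers the event.
        return 1.0 - (1.0 - self.p_failure_per_defect) ** remaining


@dataclass
class Gate:
    """AND/OR gate over child events, assuming the children are independent."""
    name: str
    kind: str
    children: List[object]

    def probability(self, vv_effectiveness: float = 0.0) -> float:
        probs = [c.probability(vv_effectiveness) for c in self.children]
        if self.kind == "AND":
            return prod(probs)
        if self.kind == "OR":
            return 1.0 - prod(1.0 - p for p in probs)
        raise ValueError(f"unknown gate kind: {self.kind}")


# Hypothetical example tree: all names and numbers are invented.
nav_leaf = SoftwareDefectLeaf("Primary navigation software", 10, 0.01)
backup_leaf = SoftwareDefectLeaf("Backup navigation software", 4, 0.02)
top = Gate(
    "Loss of vehicle control",
    "OR",
    [
        BasicEvent("Thruster hardware fault", 0.01),
        Gate("Navigation loss", "AND", [nav_leaf, backup_leaf]),
    ],
)

p_before = top.probability(vv_effectiveness=0.0)
p_after = top.probability(vv_effectiveness=0.5)
print(f"System failure probability before V&V: {p_before:.5f}")
print(f"System failure probability after V&V:  {p_after:.5f}")
```

The sketch applies one V&V effectiveness value to every software leaf; a fuller model would presumably vary effectiveness by defect type and by practice, which is the kind of information the slide says DDP derives from ODC COQUALMO.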
Explanation of Accomplishment
• POC: Julian Richardson (RIACS/USRA, RSE Group, Code TI, firstname.lastname@example.org)
• Work funded by: Reliable Software Engineering (ESAS Project 6G), Software Risk Management Task.
• Background: Software plays an indispensable role in all of NASA’s modern space vehicles. This means, however, that incorrect software (software “bugs” in common parlance) can contribute to system failures. Considerable work continues to be devoted to finding better ways to prevent software bugs in the first place (e.g., by improving coding standards) and to detect their presence ahead of mission use (e.g., by improving tools and techniques for testing software). The “Software Risk Management” element, within which this accomplishment occurs, focuses on assessing the system risk that software bugs pose, taking into account the preventions and detections planned for, or already applied to, the software.
• Accomplishment: We have integrated two capabilities that are crucial to analyzing software’s contribution to system failures:
• (1) ODC COQUALMO, a tool developed by the University of Southern California and NASA Ames for predicting how many software bugs are present in a software system. Importantly, this tool includes estimates of the effectiveness of practices at preventing/detecting bugs. Furthermore, to categorize bugs (different kinds of bugs have different prevalences and different effects), it uses an IBM-developed technique, Orthogonal Defect Classification, adapted for NASA use.
• (2) DDP, a JPL-developed tool for representing bugs and the fault trees that relate those bugs to their potential contribution to system failures. Importantly, this tool can also represent the available practices to prevent/detect bugs and, provided it is populated with the appropriate data, calculate the failure probabilities resulting from the various choices among those practices.
• Our integration involved both determining how to integrate these tools and developing a prototype implementation that realizes that integration. This implementation is hosted in the Eclipse software development environment, which the Reliable Software Engineering project has adopted as the environment in which to host its developments in a unified fashion.
• Benefits: This accomplishment is a significant step forward in quantifying the impact of software, and of the practices followed in developing and testing that software, on system risk. Its utility lies in providing quantitative guidance to inform decisions among design alternatives and tradeoffs where software is involved, and in planning and managing the considerable efforts that will be expended on analysis, testing and V&V of mission-critical software.
• Future Work: We will perform extensive experimentation to calibrate the efficacy of a wide gamut of practices available for preventing/detecting software bugs, including the new and improved practices that other elements of this project are developing. We will extend our capability to continually track and manage risk during the course of software development projects.
Credits: DART spacecraft image, slide 1: “An artist conception of the autonomous DART spacecraft as it approaches the MUBLCOM satellite. … Credit: NASAexplores” from http://www.nasa.gov/missions/science/dart_into_space.html