Fundamental Techniques for Fault Tolerance in Embedded Software Design
This document outlines fundamental techniques and methods for the model-based analysis and design of fault-tolerant embedded software, focusing on critical systems. It discusses how fault tolerance isolates component faults to prevent system failures and enhance dependability. Key attributes of dependability, such as availability, reliability, safety, confidentiality, integrity, and maintainability, are covered alongside methods for fault and failure classification. The document emphasizes the importance of redundancy strategies and error recovery mechanisms, and closes with an exercise on using watchdog timers in redundancy schemes.
Presentation Transcript
Fault Tolerance Fundamentals • ITV Model-based Analysis and Design of Embedded Software • Techniques and methods for Critical Software • Anders P. Ravn, Aalborg University, August 2011
Fault Tolerance • Means to isolate component faults • Prevents system failures • May increase system dependability
Dependability - attributes • Availability • Reliability • Safety • Confidentiality • Integrity • Maintainability
Dependability - impairments • Faults • Errors • Failures
Error Classification (Fault → Error) • Effect: latent, effective • Extent: local, distributed
Failure Classification (Fault → Error → Failure) • Consequence: benign, malign (a mishap)
Fault Tolerance • Means to isolate component faults ... and mask them • Prevents system failures • May increase system dependability
FT - levels • Full tolerance • Graceful degradation • Fail safe (BW p. 107)
FT basis: Redundancy • Time: Try, Retry, ... • Space: Try, Try, Try (BW p. 109)
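The two forms of redundancy on this slide can be illustrated with a minimal Python sketch. It rests on assumptions that are not in the slides: the names `retry`, `vote` and `flaky_sensor` are hypothetical, time redundancy is read as repeating the same operation, and space redundancy is read as majority voting over independent replicas.

```python
from collections import Counter


def retry(operation, attempts=3):
    """Time redundancy: repeat the same operation until it succeeds."""
    last_error = None
    for _ in range(attempts):
        try:
            return operation()
        except Exception as error:  # a transient fault surfaced as an error
            last_error = error
    raise RuntimeError(f"all {attempts} attempts failed") from last_error


def vote(replicas):
    """Space redundancy: run every replica and return the majority result."""
    results = [replica() for replica in replicas]
    value, count = Counter(results).most_common(1)[0]
    if count <= len(results) // 2:
        raise RuntimeError("no majority among replicas")
    return value


# Hypothetical sensor that times out twice before delivering a reading.
readings = iter([None, None, 21.5])


def flaky_sensor():
    value = next(readings)
    if value is None:
        raise IOError("sensor timeout")
    return value


temperature = retry(flaky_sensor)  # succeeds on the third try
majority = vote([lambda: 42, lambda: 42, lambda: 41])  # one faulty replica is outvoted
```

Retrying only helps against transient faults, since a permanent fault fails on every attempt. Voting masks a permanent fault in a minority of replicas, at the cost of extra hardware or processes.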
The ideal FT-component • Normal mode • Exception handler • At each interface: Request/response, Interface exception, Failure exception
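One possible reading of the ideal FT-component, sketched in Python under stated assumptions: the class `SquareRootComponent`, its `primary` and `fallback` callables, and the choice of a fallback as the recovery action are all hypothetical. Only the split into normal mode and exception handler, and the two kinds of exception, come from the slide.

```python
import math


class InterfaceException(Exception):
    """The request itself was invalid; the caller is at fault."""


class FailureException(Exception):
    """The component could not deliver its service despite recovery."""


class SquareRootComponent:
    """Hypothetical component with a normal mode and an exception handler."""

    def __init__(self, primary, fallback):
        self._primary = primary
        self._fallback = fallback

    def request(self, x):
        # Interface check: reject invalid requests before doing any work.
        if not isinstance(x, (int, float)) or x < 0:
            raise InterfaceException(f"invalid argument: {x!r}")
        try:
            return self._primary(x)        # normal mode
        except Exception as error:
            return self._handle(x, error)  # exception handler

    def _handle(self, x, error):
        try:
            return self._fallback(x)       # recovery succeeded: fault is masked
        except Exception:
            # Recovery failed too: report a failure to the caller.
            raise FailureException("service unavailable") from error


def broken_sqrt(x):
    # Stands in for a lower-level component that raises a failure exception.
    raise ArithmeticError("lower-level component failed")


component = SquareRootComponent(primary=broken_sqrt, fallback=math.sqrt)
result = component.request(9.0)  # the fallback masks the fault
```

The caller sees a normal response, an `InterfaceException` for a bad request, or a `FailureException` when recovery did not help. Errors from the lower level never leak out in their raw form.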
Model Design Procedure • Model the correct component and check that it has the desired properties. • Model relevant faults and introduce them as internal transitions to error states. Check that this fault-affected. • Introduce into the model the mechanisms for fault detection, error recovery and masking and check that the desired properties are valid for this design.
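The three steps can be tried out on a toy model. The sketch below is a simplification built on assumptions: the state names (`idle`, `busy`, `error`, `failed`), the helper functions and the safety property are hypothetical, and a plain reachability search stands in for a real model checker. It also assumes that the check in the second step is expected to show the property being violated.

```python
def reachable(transitions, start):
    """Return all states reachable from start in a transition relation."""
    seen, frontier = {start}, [start]
    while frontier:
        state = frontier.pop()
        for target in transitions.get(state, ()):
            if target not in seen:
                seen.add(target)
                frontier.append(target)
    return seen


def is_safe(transitions, start="idle", bad="failed"):
    """Safety property: the failure state is never reached."""
    return bad not in reachable(transitions, start)


# Step 1: model the correct component and check the property.
correct = {"idle": ["busy"], "busy": ["idle"]}

# Step 2: a fault is an internal transition to an error state;
# an unhandled error then leads to failure.
faulty = {**correct, "busy": ["idle", "error"], "error": ["failed"]}

# Step 3: detection and recovery replace the path from error to failure.
tolerant = {**faulty, "error": ["idle"]}

checks = [is_safe(correct), is_safe(faulty), is_safe(tolerant)]
```

Here `checks` records that the property holds for the correct model, is lost once the fault is introduced, and is restored by the recovery transition.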
Exercise • What is the purpose of a watchdog timer? • How could it be used in a space-based redundancy scheme? • How could it be used in a time-based redundancy scheme?