
The Complexity of Adding Failsafe Fault-tolerance

Sandeep S. Kulkarni, Ali Ebnenasir


Presentation Transcript


  1. The Complexity of Adding Failsafe Fault-tolerance Sandeep S. Kulkarni Ali Ebnenasir

  2. Motivations • Why automatic addition of fault-tolerance? • Why begin with a fault-intolerant program? • Reuse of the fault-intolerant program • Separation of concerns (functionality vs. fault-tolerance) • Potential to preserve properties such as efficiency • One obstacle • Adding masking fault-tolerance to distributed programs is NP-hard [FTRTFT 2000]

  3. Motivation (Continued) • Approaches for dealing with complexity • Heuristics [SRDS 2001] • Weaker forms of tolerance • Failsafe: only safety is satisfied in the presence of faults • Nonmasking: safety may be temporarily violated • Restricting input • Programs • Specifications

  4. Motivation (Continued) • Why failsafe fault-tolerance? • Simplify the design of masking • Partial automation of masking fault-tolerance (using TSE’98) [Figure: Intolerant Program, Failsafe fault-tolerant, Nonmasking fault-tolerant, and Masking fault-tolerant programs, with two arrows labeled "Automate"]

  5. Outline of the Talk • Problem of adding fault-tolerance • Difficulties caused by distribution • Complexity of failsafe fault-tolerance • Class of programs and specifications for which polynomial synthesis is possible

  6. Basic Concepts: Programs and Faults • State space Sp • Program transitions δp, fault transitions δf • Invariant S, fault-span T • Specification spec: safety is specified by transitions (sj, sk) that should not be executed [Figure: regions S and T, with transitions labeled p, f, and p/f]
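To make these notions concrete, here is a minimal Python sketch. It assumes states are plain integers and that transitions are pairs of states; the function name `fault_span` and the toy program are illustrative and do not come from the talk. It computes the set of states reachable from the invariant under program and fault transitions, which is one possible choice of fault-span T.

```python
from collections import deque

def fault_span(invariant, program, faults):
    """Return the states reachable from the invariant via program or fault transitions."""
    successors = {}
    for src, dst in program | faults:
        successors.setdefault(src, set()).add(dst)
    reached = set(invariant)
    queue = deque(invariant)
    while queue:
        state = queue.popleft()
        for nxt in successors.get(state, ()):
            if nxt not in reached:
                reached.add(nxt)
                queue.append(nxt)
    return reached

# Hypothetical toy model: four states, numbered 0..3.
program = {(0, 1), (1, 0), (2, 0)}   # program transitions (delta_p)
faults = {(1, 2), (2, 3)}            # fault transitions (delta_f)
invariant = {0, 1}                   # invariant S
span = fault_span(invariant, program, faults)
print(sorted(span))  # [0, 1, 2, 3]
```

In this toy model the faults can drive the program outside S, so the computed span is strictly larger than the invariant.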

  7. Problem Statement • Inputs: program p, invariant S, faults f, specification spec • Outputs: program p’, invariant S’ • Requirements: only fault-tolerance is added; no new functional behavior is added [Figure: invariant of fault-intolerant program and invariant of fault-tolerant program, labeled "No new transition here" and "New transitions may be added here"]
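One possible reading of the "no new functional behavior" requirement can be sketched as a check in Python. The formalization is an assumption based on the figure labels: the new invariant S’ is taken to be a subset of S, and every transition of p’ that starts inside S’ is taken to be a transition of p already. The function name and the toy data are hypothetical.

```python
def adds_only_fault_tolerance(p, S, p_new, S_new):
    """Check the assumed conditions: S' is a subset of S, and p' has no new transition starting in S'."""
    if not S_new <= S:
        return False
    return all((s0, s1) in p for (s0, s1) in p_new if s0 in S_new)

# Hypothetical toy program over states 0..2.
p = {(0, 1), (1, 0)}
S = {0, 1}

# Adding a recovery transition that starts outside the invariant is allowed.
p_recovery = p | {(2, 0)}
# Adding a transition that starts inside the invariant is new functional behavior.
p_new_behavior = p | {(0, 0)}

ok = adds_only_fault_tolerance(p, S, p_recovery, {0, 1})
not_ok = adds_only_fault_tolerance(p, S, p_new_behavior, {0, 1})
```

Here `ok` is true and `not_ok` is false, matching the two regions of the figure: transitions may be added outside the new invariant but not inside it.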

  8. Difficulties with Distribution • Read/Write restrictions • Two Boolean variables a and b • Process cannot read b • Can we include the transition between states (a=0, b=0) and (a=1, b=0)? • Only if we also include the transition between states (a=0, b=1) and (a=1, b=1) • Groups of transitions (instead of individual transitions) must be chosen.
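The grouping effect can be sketched in Python. The sketch assumes that a process cannot write a variable it cannot read, so unreadable variables keep their value across a transition, and it assumes the direction a=0 to a=1 for the example transition. The names `VARIABLES` and `group_of` are illustrative.

```python
from itertools import product

VARIABLES = ("a", "b")

def group_of(src, dst, unreadable, domain=(0, 1)):
    """Return all transitions the process cannot distinguish from (src, dst)."""
    hidden = [v for v in VARIABLES if v in unreadable]
    for v in hidden:
        # Assumption: a variable that cannot be read cannot be written either.
        if src[v] != dst[v]:
            raise ValueError(f"transition changes unreadable variable {v}")
    group = set()
    for values in product(domain, repeat=len(hidden)):
        overlay = dict(zip(hidden, values))
        s = {**src, **overlay}
        d = {**dst, **overlay}
        group.add((tuple(s[v] for v in VARIABLES),
                   tuple(d[v] for v in VARIABLES)))
    return group

# The process cannot read b, so choosing one transition forces its twin.
group = group_of({"a": 0, "b": 0}, {"a": 1, "b": 0}, unreadable={"b"})
print(sorted(group))  # [((0, 0), (1, 0)), ((0, 1), (1, 1))]
```

A synthesis procedure working under such restrictions would include or exclude each group as a whole, which is where the combinatorial difficulty comes from.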

  9. Reduction from 3-SAT [Figure: states labeled a0 and an = a0; for variable x0, one transition included iff x0 is true and another included iff x0 is false; for clause cj = xj \/ ¬xk \/ xl, transitions included iff xj is false, iff xk is true, and iff xl is false]

  10. Dealing with the Complexity of Adding Failsafe Fault-tolerance • For what class of problems can failsafe fault-tolerance be added in polynomial time? • Restrictions on • Fault-tolerant programs • Specifications • Faults • Our approach for restrictions: • In the absence of faults, preserve all computations of the fault-intolerant program

  11. Restrictions on Programs and Specifications • Monotonicity requirements • Capture the notion that safe assumptions can be made about variables that cannot be read • Focus on specifications and transitions of fault-intolerant programs

  12. Monotonicity of Specifications • Definition: A specification spec is positive monotonic with respect to variable x iff: • For every s0, s1, s’0, s’1: • The values of all other variables in s0 and s’0 are the same • The values of all other variables in s1 and s’1 are the same [Figure: if the transition (s0, s1) with x = false in both states does not violate safety, then the transition (s’0, s’1) with x = true in both states does not violate safety]
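For a small finite state space, the if/then reading of this definition can be checked by brute force. The Python sketch below assumes states are tuples of Booleans, that the safety specification is given as a set of bad transitions, and that x is identified by its index in the tuple. The function name and both example specifications are hypothetical.

```python
from itertools import product

def is_positive_monotonic(bad, num_vars, x):
    """Check positive monotonicity of a safety spec with respect to the variable at index x."""
    def with_x(state, value):
        return state[:x] + (value,) + state[x + 1:]

    states = list(product((False, True), repeat=num_vars))
    for s0, s1 in product(states, repeat=2):
        if s0[x] or s1[x]:
            continue  # the premise only covers transitions with x = false
        if (s0, s1) in bad:
            continue  # the premise requires (s0, s1) to be safe
        if (with_x(s0, True), with_x(s1, True)) in bad:
            return False  # the x = true counterpart violates safety
    return True

STATES = list(product((False, True), repeat=2))  # states are (x, y)

# Setting y while x is false is unsafe: flipping x to true can only help.
bad_when_x_false = {(s0, s1) for s0 in STATES for s1 in STATES
                    if not s0[0] and not s0[1] and s1[1]}
# Setting y while x is true is unsafe: flipping x to true can hurt.
bad_when_x_true = {(s0, s1) for s0 in STATES for s1 in STATES
                   if s0[0] and not s0[1] and s1[1]}

first = is_positive_monotonic(bad_when_x_false, 2, 0)
second = is_positive_monotonic(bad_when_x_true, 2, 0)
```

The first specification passes the check and the second fails it. The enumeration is exponential in the number of variables, so this is only meant to illustrate the definition.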

  13. Monotonicity of Programs • Definition: Program p with invariant S is negative monotonic with respect to variable x iff: • For every s0, s1, s’0, s’1: • The values of all other variables in s0 and s’0 are the same • The values of all other variables in s1 and s’1 are the same [Figure: invariant S; transition (s0, s1) with x = true in both states; transition (s’0, s’1) with x = false in both states]

  14. Theorem • Adding failsafe fault-tolerance can be done in polynomial time if either: • The program is negative monotonic and the spec is positive monotonic, or • The program is positive monotonic and the spec is negative monotonic • If only the program or only the spec is monotonic, adding failsafe fault-tolerance is still NP-hard • For many problems, these requirements are easily met

  15. Example: Byzantine Agreement • Processes: General, g, and three non-generals j, k, and l • Variables • d.g : {0, 1} • d.j, d.k, d.l : {0, 1, ⊥} • b.g, b.j, b.k, b.l : {true, false} • f.g, f.j, f.k, f.l : {0, 1} • Fault-intolerant program transitions • d.j = ⊥ /\ f.j = 0 → d.j := d.g • d.j ≠ ⊥ /\ f.j = 0 → f.j := 1 • Fault transitions • ¬b.g /\ ¬b.j /\ ¬b.k /\ ¬b.l → b.j := true • b.j → d.j, f.j := 0|1, 0|1
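The two fault-intolerant transitions of non-general j listed on this slide can be sketched as a step function in Python. The sketch assumes a state is a dictionary keyed by variable name and uses `None` to stand for ⊥; the function name `step_j` is illustrative, and fault transitions are left out.

```python
BOTTOM = None  # stands for the undecided value ⊥

def step_j(state):
    """Apply the enabled fault-intolerant action of process j, or return None if none is enabled."""
    new_state = dict(state)
    if state["d.j"] is BOTTOM and state["f.j"] == 0:
        new_state["d.j"] = state["d.g"]   # copy the general's decision
    elif state["d.j"] is not BOTTOM and state["f.j"] == 0:
        new_state["f.j"] = 1              # finalize the decision
    else:
        return None
    return new_state

initial = {"d.g": 1, "d.j": BOTTOM, "f.j": 0}
after_copy = step_j(initial)
after_finalize = step_j(after_copy)
after_done = step_j(after_finalize)
```

Starting from an undecided j, the first step copies d.g, the second finalizes, and no action is enabled afterwards. The two guards are mutually exclusive, so the step function is deterministic.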

  16. Example: Byzantine Agreement (Continued) • Safety Specification: • Agreement: No two non-Byzantine non-generals can finalize with different decisions • Validity: If g is not Byzantine, no process can finalize with a decision different from that of g • Read/Write restrictions • Process j can read • b.j, d.j, f.j • d.g, d.k, d.l • Process j can write • d.j, f.j

  17. Example: Byzantine Agreement (Continued) • Observation 1: • Positive monotonicity of specification with respect to b.j • Observation 2: • Negative monotonicity of program, consisting of the transitions of j, with respect to b.k • Observation 3: • Negative monotonicity of specification with respect to f.j • Observation 4: • Positive monotonicity of program, consisting of the transitions of j, with respect to f.k

  18. Summary • Complexity analysis for failsafe fault-tolerance • Reduction from 3-SAT • Restrictions on specifications and programs for which polynomial synthesis is possible • Several problems fall in this category • Byzantine agreement, consensus, commit, … • Necessity of these restrictions

  19. Future Work • Simplifying the design of masking fault-tolerance using the two-step approach • Refining boundary between classes for which polynomial synthesis is possible and for which exponential complexity is inevitable • Using monotonicity requirements for simplifying masking fault-tolerance

  20. Thank You • Questions?

  21. Future Work • Conclusion • Specifying the boundary between • Problems for which fault-tolerance addition can be done in polynomial time • Problems for which exponential complexity is inevitable • Goal: what problems can benefit from automation? • Necessity and sufficiency of monotonicity requirements • Future Work • How can we change a non-monotonic program to a monotonic one by modifying its invariant? • How can we strengthen a non-monotonic specification to a monotonic one? • How can a nonmasking program be designed manually to satisfy monotonicity requirements?

  22. Basic Concepts: Fault-tolerant Program • Fault-tolerance in the presence of faults: • Failsafe: Satisfies its safety specification • Nonmasking: Satisfies its liveness specification (safety may be violated temporarily) • Masking: Satisfies safety and liveness specification

  23. The Complexity of Adding Failsafe Fault-tolerance • Adding (failsafe/nonmasking/masking) fault-tolerance in the high atomicity model is in P • Adding masking fault-tolerance to distributed programs is in NP • How about failsafe? • Adding failsafe fault-tolerance to distributed programs is NP-hard (proof in the paper) • Reduction of 3-SAT to the problem of failsafe fault-tolerance addition

  24. Our Approach • Stepwise towards masking fault-tolerance: • Automating the addition of failsafe fault-tolerance • How hard is adding failsafe fault-tolerance? • Polynomial time boundaries for failsafe tolerance addition?

