
The Formal Verification of SPIDER

The Formal Verification of SPIDER. Lee Pike Department of Computer Science Indiana University, Bloomington lepike@indiana.edu. Thanks to. Steven Johnson, Indiana University, Bloomington The National Institute of Aerospace


Presentation Transcript


  1. The Formal Verification of SPIDER Lee Pike Department of Computer Science Indiana University, Bloomington lepike@indiana.edu

  2. Thanks to • Steven Johnson, Indiana University, Bloomington • The National Institute of Aerospace • The NASA LaRC Formal Methods Team, especially Paul Miner

  3. Overview • SPIDER Overview • Reasoning about Faults • The Old vs. New Interactive Consistency (IC) Protocol • SPIDER Formal Verification Goals & Future Work • References

  4. SPIDER Overview: Why? • Develop a fault-tolerant architecture based on an ultra-reliable bus • Scalable • Handle a large number of possibly simultaneous faults, specifically transient faults from electromagnetic effects • Provide reintegration services • Case study for the FAA • Developed in accordance with RTCA DO-254: Design Assurance Guidance for Airborne Electronic Hardware • Provide a test bed for techniques in the specification and verification of safety-critical electronic systems. Architectures of this sort are the foundation of tomorrow's X-by-wire safety-critical systems.

  5. SPIDER Overview: What? • Scalable Processor-Independent Design for Electromagnetic Resilience

  6. SPIDER Overview: What? • Scalable Processor-Independent Design for Electromagnetic Resilience • Processor Elements (PEs) [Diagram: three PEs.]

  7. SPIDER Overview: What? • Scalable Processor-Independent Design for Electromagnetic Resilience • Processor Elements (PEs) • Reliable Optical BUS (ROBUS) • Time Division Multiple Access (TDMA) bus • Maintains synchrony between PEs • Prevents babbling idiots and PE-to-PE interference • The services of the ROBUS are the focus of the verification effort. [Diagram: three PEs connected to the ROBUS.]

  8. ROBUS Overview: Topology • n Bus Interface Units (BIUs) • m Redundancy Management Units (RMUs) • The BIUs and RMUs are called nodes. • Every BIU is directly connected to every RMU. • No two BIUs are directly connected; similarly for the RMUs. [Diagram: a ROBUS with BIU1 to BIU3, each attached to a PE, and RMU1 to RMU3.]
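The topology described on this slide amounts to a complete bipartite graph between BIUs and RMUs. A minimal sketch, assuming node names of the form BIU1..BIUn and RMU1..RMUm (the helper `robus_links` is hypothetical and not part of SPIDER):

```python
from itertools import product

def robus_links(n, m):
    """Return the set of direct links in a ROBUS with n BIUs and m RMUs.

    Hypothetical helper for illustration only.
    """
    bius = [f"BIU{i}" for i in range(1, n + 1)]
    rmus = [f"RMU{j}" for j in range(1, m + 1)]
    # Every BIU is linked to every RMU; there are no BIU-BIU
    # or RMU-RMU links, so the graph is complete bipartite.
    return set(product(bius, rmus))

links = robus_links(3, 3)
print(len(links))  # number of links in the 3-by-3 example from the diagram
```

With n BIUs and m RMUs the link count is n times m, which is what makes every BIU-to-BIU exchange pass through the RMUs.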

  9. ROBUS Overview: Services (Protocols) • Interactive Consistency. Purpose: reliably broadcast messages between PEs. • Clock Synchronization. Purpose: maintain synchrony between all nodes and PEs. • Distributed Diagnosis. Purpose: convict faulty nodes in the ROBUS. The focus of this talk is Interactive Consistency.

  10. Global Fault Classifications • Good: not faulty [Diagram: a node sends d to every receiver.]

  11. Global Fault Classifications • Good: not faulty • Benign: broadcasts only detectably faulty messages [Diagram: a node sends garbage to every receiver.]

  12. Global Fault Classifications • Good: not faulty • Benign: broadcasts only detectably faulty messages • Symmetric: broadcasts the same arbitrary message to all [Diagram: a node sends d' to every receiver.]

  13. Global Fault Classifications • Good: not faulty • Benign: broadcasts only detectably faulty messages • Symmetric: broadcasts the same arbitrary message to all • Asymmetric (Byzantine): arbitrarily sends arbitrary messages [Diagram: a node sends d, d', and d'' to different receivers.]
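The four fault classes can be read as four sending behaviors. The following is a rough sketch under stated assumptions: the names `Fault`, `GARBAGE`, and `messages_sent`, and the convention for passing the faulty values, are all hypothetical and chosen only for illustration.

```python
from enum import Enum

class Fault(Enum):
    GOOD = "good"
    BENIGN = "benign"
    SYMMETRIC = "symmetric"
    ASYMMETRIC = "asymmetric"

GARBAGE = "garbage"  # stands in for any detectably faulty message

def messages_sent(fault, intended, receivers, arbitrary=None):
    """Model what a node of the given fault class sends to each receiver.

    `arbitrary` supplies the faulty values (an assumed convention):
    one value for SYMMETRIC, a dict receiver -> value for ASYMMETRIC.
    """
    if fault is Fault.GOOD:
        return {r: intended for r in receivers}
    if fault is Fault.BENIGN:
        return {r: GARBAGE for r in receivers}
    if fault is Fault.SYMMETRIC:
        # The same (possibly wrong) value goes to everyone.
        return {r: arbitrary for r in receivers}
    # ASYMMETRIC: each receiver may get a different value.
    return {r: arbitrary[r] for r in receivers}

receivers = ["RMU1", "RMU2", "RMU3"]
print(messages_sent(Fault.ASYMMETRIC, "d", receivers,
                    arbitrary={"RMU1": "d", "RMU2": "d'", "RMU3": "d''"}))
```

Only the asymmetric case can make two receivers disagree about what was sent, which is why it gets separate treatment in the fault assumptions.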

  14. Local Fault Information: Each Node Maintains • Accusations: A node accuses other nodes based on the messages it receives as well as indirect information.

  15. Local Fault Information: Each Node Maintains • Accusations: A node accuses other nodes based on the messages it receives as well as indirect information. • Convictions: Periodically, the distributed diagnosis protocol is executed; nodes exchange accusations to produce convictions. • NOTE: While a good node knows that all good nodes have the same convictions, it does not know that all good nodes have the same accusations.

  16. Local Fault Information: Each Node Maintains • Accusations: A node accuses other nodes based on the messages it receives as well as indirect information. • Convictions: Periodically, the distributed diagnosis protocol is executed; nodes exchange accusations to produce convictions. • NOTE: While a good node knows that all good nodes have the same convictions, it does not know that all good nodes have the same accusations. • Eligible Voters (EV): For each BIU, the set of RMUs that it neither accuses nor convicts; similarly for each RMU.
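The eligible-voter set is a set difference over a node's local fault information. A small sketch, assuming accusations and convictions are kept as plain sets of node names (the function name `eligible_voters` is hypothetical):

```python
def eligible_voters(candidates, accused, convicted):
    """Return the nodes a given node still trusts.

    candidates: all nodes of the opposite kind (RMUs for a BIU, BIUs for an RMU)
    accused:    nodes this node currently accuses (local information)
    convicted:  nodes convicted by distributed diagnosis (shared by good nodes)
    """
    return set(candidates) - set(accused) - set(convicted)

rmus = {"RMU1", "RMU2", "RMU3"}
# Illustrative values: a BIU that accuses RMU2 while RMU3 has been convicted.
ev = eligible_voters(rmus, accused={"RMU2"}, convicted={"RMU3"})
print(ev)
```

Because accusations are local, two good BIUs can end up with different EVs even though their convictions agree, which is the point of the NOTE on this slide.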

  17. Interactive Consistency Protocol: External View • Purpose: Reliably communicate data between processing elements (PEs) over the ROBUS. [Diagram: three PEs connected to the ROBUS.]

  18. Interactive Consistency Protocol: External View • A PE sends its data to the ROBUS. [Diagram: the sending PE passes data in to the ROBUS.]

  19. Interactive Consistency Protocol: External View • The IC Protocol is executed in the ROBUS. [Diagram: the ROBUS runs the IC Protocol.]

  20. Interactive Consistency Protocol: External View • The ROBUS broadcasts data back out to the PEs. [Diagram: after the IC Protocol runs, the ROBUS passes data out to every PE, including the sender.]

  21. Old Interactive Consistency Protocol: Internal View [Diagram: a ROBUS with BIU1 to BIU3, each attached to a PE, and RMU1 to RMU3; the sending PE passes data in to its BIU.]

  22. Step 1: A BIU broadcasts data to the RMUs. If the BIU is good, the same value is broadcast to all RMUs. [Diagram: the sending BIU passes data to RMU1, RMU2, and RMU3.]

  23. Step 2: Each good RMU that receives data that is not detectably faulty passes that data back to each BIU; otherwise, it sends source_error. [Diagram: RMU1, which is good, sends data or source_error to BIU1, BIU2, and BIU3; similarly for RMUs 2 and 3.]

  24. Step 3: Each BIU eliminates from its EV those RMUs that sent detectably faulty messages. [Diagram: BIU1 receives d from RMU1 (good), garbage from RMU2 (benign faulty), and d from RMU3; BIUs 2 and 3 do likewise.]

  25. Step 4: Each BIU takes a majority vote over the data sent by the RMUs in its EV. [Diagram: BIU1 votes over the two remaining d values, so vote = d; BIUs 2 and 3 do likewise.]

  26. Step 5: If a majority of the RMUs sent the same data, that data is sent to the BIU's PE; otherwise, source_error is sent to the BIU's PE. [Diagram: BIU1, with vote = d, sends d to its PE; BIUs 2 and 3 similarly send data.]
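Steps 2 through 5 of the old protocol can be sketched as two small functions. This is an illustrative model, not the SPIDER implementation: the names `rmu_relay` and `biu_vote` are hypothetical, and it assumes that the majority in step 5 is a strict majority taken over the RMUs remaining in the EV after step 3.

```python
from collections import Counter

GARBAGE = "garbage"            # stands in for a detectably faulty message
SOURCE_ERROR = "source_error"

def rmu_relay(received):
    """Step 2 for a good RMU: relay the data unless it is detectably faulty."""
    return SOURCE_ERROR if received == GARBAGE else received

def biu_vote(relayed, ev):
    """Steps 3 to 5 for one BIU.

    relayed: dict RMU name -> message received from that RMU
    ev:      set of RMU names in this BIU's eligible-voter set
    """
    # Step 3: drop RMUs whose message is detectably faulty.
    ev = {r for r in ev if relayed[r] != GARBAGE}
    if not ev:
        return SOURCE_ERROR
    # Step 4: tally the messages from the remaining eligible voters.
    value, count = Counter(relayed[r] for r in ev).most_common(1)[0]
    # Step 5: forward the value only if a strict majority agrees (assumed).
    return value if 2 * count > len(ev) else SOURCE_ERROR

# Scenario from the slides: good sender, RMU2 benign faulty.
relayed = {"RMU1": rmu_relay("d"), "RMU2": GARBAGE, "RMU3": rmu_relay("d")}
print(biu_vote(relayed, {"RMU1", "RMU2", "RMU3"}))
```

Filtering before voting matters: a benign RMU is removed from the denominator, so it cannot prevent the good RMUs from forming a majority.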

  27. IC Protocol Guarantees • Validity: If the broadcasting BIU is good, not convicted, and sends data d, then the result of the vote for a good BIU is d. • Agreement: Any two good BIUs vote the same result for the broadcast value (even if the sender is asymmetric!).
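The two guarantees can be phrased as predicates over the votes of the good BIUs. A minimal sketch, assuming votes are collected in a dict from BIU name to voted value (the function names and parameters are hypothetical):

```python
def validity(sender_good, sender_convicted, sent_value, good_votes):
    """Validity: a good, unconvicted sender's value is every good BIU's vote."""
    if not sender_good or sender_convicted:
        return True  # the guarantee is silent about other senders
    return all(v == sent_value for v in good_votes.values())

def agreement(good_votes):
    """Agreement: all good BIUs vote the same result, whatever the sender did."""
    return len(set(good_votes.values())) <= 1

votes = {"BIU1": "d", "BIU3": "d"}
print(validity(True, False, "d", votes), agreement(votes))
```

Validity is conditional on the sender, while agreement is unconditional, which is why the agreement proof needs a case split on whether the sender is asymmetric.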

  28. Old Assumptions (to ensure guarantees hold) Environment Assumptions: The Maximum Fault Assumption (MFA): • There are more good BIUs than symmetric + asymmetric BIUs. • Similarly for the RMUs. • There are either no asymmetric BIUs or no asymmetric RMUs.

  29. Old Assumptions (to ensure guarantees hold) Environment Assumptions: The Maximum Fault Assumption (MFA): • There are more good BIUs than symmetric + asymmetric BIUs. • Similarly for the RMUs. • There are either no asymmetric BIUs or no asymmetric RMUs. System Assumptions: • Symmetric Agreement: If a node is not asymmetric, then all good nodes assign it the same accusation. • Good Trusting: Good nodes aren't accused by good nodes. • Conviction Agreement: All good nodes have the same convictions.
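The MFA is a global predicate over the fault classes of all nodes. A sketch, assuming fault classes are given as lists of strings (the function `mfa_holds` and the string labels are hypothetical):

```python
def mfa_holds(biu_faults, rmu_faults):
    """Check the Maximum Fault Assumption for lists of fault classes.

    Each entry is one of "good", "benign", "symmetric", "asymmetric".
    """
    def good_outnumber_bad(faults):
        good = sum(f == "good" for f in faults)
        bad = sum(f in ("symmetric", "asymmetric") for f in faults)
        return good > bad  # benign nodes do not count against the bound

    no_asym_biu = "asymmetric" not in biu_faults
    no_asym_rmu = "asymmetric" not in rmu_faults
    return (good_outnumber_bad(biu_faults)
            and good_outnumber_bad(rmu_faults)
            and (no_asym_biu or no_asym_rmu))

# One asymmetric BIU and one benign RMU: allowed.
print(mfa_holds(["good", "asymmetric", "good"], ["good", "benign", "good"]))
# An asymmetric node on both sides: not allowed.
print(mfa_holds(["good", "asymmetric", "good"], ["good", "good", "asymmetric"]))
```

Note that the predicate counts every node in the system, regardless of whether good nodes still trust it.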

  30. Validity: Proof Sketch. Assume the broadcasting BIU is good and sends data d. [Diagram: the good sender broadcasts d to RMU1, RMU2, and RMU3.]

  31. Validity: Proof Sketch. Thus, all good RMUs send d back to the BIUs. [Diagram: RMU1, which is good, sends d to BIU1, BIU2, and BIU3; similarly for RMUs 2 and 3.]

  32. Validity: Proof Sketch. Each good BIU filters out the bad messages received. By the MFA, most of the RMUs remaining in its EV are good. [Diagram: BIU1 receives d, garbage, and d from the three RMUs; similarly for BIUs 2 and 3.]

  33. Validity: Proof Sketch. Since all good RMUs sent d, the vote yields d. q.e.d. [Diagram: BIU1 votes over the two remaining d values, so vote = d.]

  34. Agreement: Proof Sketch. Either the broadcasting BIU is asymmetric or it is not. Suppose it is. [Diagram: the asymmetric sender passes d, d', and d'' to RMU1, RMU2, and RMU3.]

  35. Agreement: Proof Sketch. Then no RMU is asymmetric, by the MFA. So every RMU sends the same data to every BIU. [Diagram: BIU1 receives x, y, and z from the three RMUs; BIUs 2 and 3 receive the same values.]

  36. Agreement: Proof Sketch. Since no RMU is asymmetric, by Symmetric Agreement, the EV of each good BIU is the same. Thus, the result of the vote for each good BIU is the same. [Diagram: BIU1 receives x, y, and z from the three RMUs; BIUs 2 and 3 receive the same values.]

  37. Agreement: Proof Sketch. For the other case, suppose the sending BIU is not asymmetric. [Diagram: the sender, which is not asymmetric, passes d to RMU1, RMU2, and RMU3.]

  38. Agreement: Proof Sketch. Most of the RMUs are good, by the MFA. Since all good RMUs received the same value, they send the same value. [Diagram: RMU1 and RMU3, both good, send x to BIU1 and BIU3, both good.]

  39. Agreement: Proof Sketch. By Good Trusting, no good BIU accuses a good RMU. Since most RMUs are good, a majority of the RMUs in the EV of each good BIU are good after benign RMUs are filtered out. [Diagram: BIU1 and BIU3, both good, each receive x from RMU1 and RMU3, both good.]

  40. Agreement: Proof Sketch. Thus, the result of the vote will be the same for all good BIUs. q.e.d. [Diagram: BIU1 and BIU3, both good, each receive x from RMU1 and RMU3, both good.]

  41. New Assumptions (to reason about reintegration) Environment Assumptions: The Dynamic Maximum Fault Assumption (DMFA): • For each good BIU, its EV consists of more good RMUs than symmetric + asymmetric RMUs. • Similarly for good RMUs. • Either no asymmetric RMU is in the EV of a good BIU or no asymmetric BIU is in the EV of a good RMU.

  42. New Assumptions (to reason about reintegration) Environment Assumptions: The Dynamic Maximum Fault Assumption (DMFA): • For each good BIU, its EV consists of more good RMUs than symmetric + asymmetric RMUs. • Similarly for good RMUs. • Either no asymmetric RMU is in the EV of a good BIU or no asymmetric BIU is in the EV of a good RMU. System Assumptions: • Symmetric Agreement: If a node is not asymmetric, then all good nodes assign it the same accusation. • Good Trusting: Good nodes aren't accused by good nodes. • Conviction Agreement: All good nodes have the same convictions.
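Unlike the MFA, the DMFA is stated relative to each good node's EV. A sketch under stated assumptions: `faults` maps node names to fault classes, `ev` maps each good node to its EV, and node names start with "BIU" or "RMU" (all hypothetical conventions). The example configuration is the one used in the "Agreement Breaks!" slides that follow.

```python
BAD = ("symmetric", "asymmetric")

def good_outnumber_bad(voters, faults):
    good = sum(faults[v] == "good" for v in voters)
    bad = sum(faults[v] in BAD for v in voters)
    return good > bad

def dmfa_holds(faults, ev):
    """Check the Dynamic Maximum Fault Assumption.

    faults: dict node -> fault class
    ev:     dict good node -> set of nodes in its eligible-voter set
    """
    good_nodes = [n for n in ev if faults[n] == "good"]
    majorities = all(good_outnumber_bad(ev[n], faults) for n in good_nodes)

    def trusts_asym(prefix):
        # Does some good node of this kind have an asymmetric node in its EV?
        return any(faults[v] == "asymmetric"
                   for n in good_nodes if n.startswith(prefix)
                   for v in ev[n])

    return majorities and not (trusts_asym("BIU") and trusts_asym("RMU"))

faults = {"BIU1": "good", "BIU2": "asymmetric", "BIU3": "good",
          "RMU1": "good", "RMU2": "good", "RMU3": "asymmetric"}
ev = {"BIU1": {"RMU1", "RMU2", "RMU3"}, "BIU3": {"RMU1", "RMU2", "RMU3"},
      "RMU1": {"BIU1", "BIU3"}, "RMU2": {"BIU1", "BIU3"}}
print(dmfa_holds(faults, ev))
```

This configuration has an asymmetric BIU and an asymmetric RMU at the same time, so it would violate the MFA, yet it satisfies the DMFA because the good RMUs no longer trust the asymmetric BIU.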

  43. Agreement Breaks! Under the New Assumptions (courtesy of Wilfredo). Suppose the sender is asymmetric but is in the EV of no good RMU. Suppose there is an asymmetric RMU in the EV of both good BIUs. This satisfies the DMFA. [Diagram: BIU2, the asymmetric sender, passes d to RMU1, d' to RMU2, and d'' to RMU3. BIU1 and BIU3 are good and trust all; RMU1 and RMU2 are good and accuse BIU2; RMU3 is asymmetric.]

  44. Agreement Breaks! Under the New Assumptions. The two good RMUs relay the values received, and since RMU3 can relay arbitrary data, it sends d to BIU1 and d' to the other good BIU. [Diagram: BIU1 receives d, d', and d; BIU3 receives d, d', and d'.]

  45. Agreement Breaks! Under the New Assumptions. The votes of the two good BIUs, BIU1 and BIU3, differ. Agreement is violated! [Diagram: BIU1 receives d, d', and d, so vote = d; BIU3 receives d, d', and d', so vote = d'.]
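The counterexample can be replayed in a few lines. This sketch assumes the reading of the diagram in which BIU2 is the asymmetric sender and BIU1 and BIU3 are the good BIUs, and it assumes a strict-majority vote; the helper `majority` is hypothetical.

```python
from collections import Counter

SOURCE_ERROR = "source_error"

def majority(values):
    """Strict-majority vote over the received values (assumed voting rule)."""
    value, count = Counter(values).most_common(1)[0]
    return value if 2 * count > len(values) else SOURCE_ERROR

# The asymmetric sender gives each RMU a different value.
sent = {"RMU1": "d", "RMU2": "d'", "RMU3": "d''"}

# Old protocol: good RMUs relay what they received, even from an accused sender.
relayed_by_good = {"RMU1": sent["RMU1"], "RMU2": sent["RMU2"]}

# The asymmetric RMU3 tells each good BIU something different.
rmu3_sends = {"BIU1": "d", "BIU3": "d'"}

votes = {
    biu: majority([relayed_by_good["RMU1"], relayed_by_good["RMU2"], rmu3_sends[biu]])
    for biu in ("BIU1", "BIU3")
}
print(votes)
assert votes["BIU1"] != votes["BIU3"]  # agreement is violated
```

The asymmetric RMU acts as a tie-breaker between the two distinct values relayed by the good RMUs, and it breaks the tie differently for each BIU.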

  46. Revised IC Protocol. In the new IC Protocol, the RMUs relay source_error when • they receive bad messages, or • they accuse the sender.

  47. Revised IC Protocol. In the new IC Protocol, the RMUs relay source_error when • they receive bad messages, or • they accuse the sender. The revised IC protocol satisfies both validity and agreement (verified in PVS).
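A sketch of the revised relay rule, replayed on the counterexample configuration. It reads the two conditions as alternative triggers (either one causes source_error to be relayed) and assumes that source_error counts as an ordinary value in the BIU's strict-majority vote; `revised_relay` and `majority` are hypothetical names.

```python
from collections import Counter

GARBAGE, SOURCE_ERROR = "garbage", "source_error"

def revised_relay(received, accuses_sender):
    """Revised step 2 for a good RMU (assumed reading of the slide)."""
    if received == GARBAGE or accuses_sender:
        return SOURCE_ERROR
    return received

def majority(values):
    """Strict-majority vote over the received values (assumed voting rule)."""
    value, count = Counter(values).most_common(1)[0]
    return value if 2 * count > len(values) else SOURCE_ERROR

# Same configuration as the counterexample: both good RMUs accuse the sender.
sent = {"RMU1": "d", "RMU2": "d'"}
relayed = {r: revised_relay(v, accuses_sender=True) for r, v in sent.items()}

# The asymmetric RMU3 still sends different values to the two good BIUs.
rmu3_sends = {"BIU1": "d", "BIU3": "d'"}
votes = {
    biu: majority([relayed["RMU1"], relayed["RMU2"], rmu3_sends[biu]])
    for biu in rmu3_sends
}
print(votes)
assert votes["BIU1"] == votes["BIU3"]  # agreement holds in this scenario
```

This only checks one scenario; the general claim on the slide rests on the PVS proofs, not on this replay.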

  48. Formal Verification: Why Level 3 Verification? • A math proof is proof enough, right? • Level 3 verification can require significant time to complete. In other words...

  49. Using PVS

  50. Formal Verification: Why Level 3 Verification? • A math proof is proof enough, right? • Level 3 verification can require orders of magnitude more time to complete than level 1 or level 2 verification. But... • Proofs for fault-tolerant protocols for distributed architectures are tedious and large (there are nearly 400 lemmas and theorems in our current, unfinished set of proofs). • Proofs are not checked by a community of mathematicians, as other mathematical results are. In other words...
