Fault Tolerant CORBA (FT-CORBA) - Modeling and Analysis

István Majzik, Budapest University of Technology and Economics, Department of Measurement and Information Systems, June 2000.

Presentation Transcript


  1. Fault Tolerant CORBA (FT-CORBA) - Modeling and Analysis • István Majzik • Budapest University of Technology and Economics • Department of Measurement and Information Systems • June 2000

  2. Introduction • Basis: • FT-CORBA specification • UML-based automatic dependability modeling • Topics: • Support for constructing optimal FT-CORBA schemes • Evaluation of existing architectures • Part I: The FT-CORBA proposal • Part II: UML-based dependability analysis • Part III: Dependability modeling of FT-CORBA

  3. Part I The FT-CORBA Proposal

  4. CORBA • OMG CORBA: standard for open object-oriented systems • Provides transparent access to services of remote objects (like local method calls) • ORB: Object Request Broker; communication of requests/responses (location, activation, parameter passing, etc.) • IOR: Interoperable Object Reference • GIOP: General Inter-ORB Protocol • IIOP: Internet Inter-ORB Protocol • IDL: Interface Definition Language; consistency between client and server interfaces

  5. FT-CORBA • Goal: Fault tolerance in CORBA environment • History: • April 1998: Request for Proposal issued • October 1998: Initial submissions • December 1999: Joint revised submission by Ericsson, Inprise, Iona, Lucent, Oracle, Sun,... • April 2000: Final adopted specification

  6. FT-CORBA Concepts • Avoiding single points of failure (SPOF) in the form of single (server) objects • Fault tolerance by entity redundancy, fault detection and recovery • creation of (server) object groups • infrastructure to maintain object replicas • Basic properties: • replication transparency (access independent of number/location of replicas) • failure transparency (access independent of faulty server objects)

  7. Fault Tolerance Domains • FT domain: • Object groups of server object replicas • Single Replication Manager • Object groups: • different hosts • single object per host • Replication Manager: • Creation and management of object groups • Support of application-controlled management

  8. Fault Tolerance Domain • [Figure: domains, object groups, hosts and replicas]

  9. Architecture Overview • Set of CORBA objects to support FT • Replication Manager • Fault Detector • Fault Notifier • Fault Analyzer • ORB extensions • logging mechanism • recovery mechanism • Commercial implementations?

  10. Fault Tolerance Infrastructure

  11. Replication Management • Infrastructure controlled case: • application: create_object() method of the RM • RM: invokes local factory objects on hosts • RM manages membership, consistency • Application controlled case: • application’s responsibility to manage replicas • Parameters: • ReplicationStyle: stateless, cold / warm passive, active • MembershipStyle • ConsistencyStyle • InitialNumberReplicas, MinimumNumberReplicas
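
The parameters listed on this slide can be pictured as one property record per object group. The following is a minimal Python sketch, not the FT-CORBA IDL: the class names `ObjectGroupProperties` and `ControlStyle`, the enum member names, and the validation rule (at least one replica, and the minimum not larger than the initial count) are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class ReplicationStyle(Enum):
    # The four styles named on the slide.
    STATELESS = "stateless"
    COLD_PASSIVE = "cold passive"
    WARM_PASSIVE = "warm passive"
    ACTIVE = "active"


class ControlStyle(Enum):
    # Hypothetical enum covering the two cases on the slide.
    INFRASTRUCTURE = "infrastructure controlled"
    APPLICATION = "application controlled"


@dataclass(frozen=True)
class ObjectGroupProperties:
    """Illustrative container for the per-group parameters (not the real API)."""
    replication_style: ReplicationStyle
    membership_style: ControlStyle
    consistency_style: ControlStyle
    initial_number_replicas: int
    minimum_number_replicas: int

    def __post_init__(self):
        # Assumed sanity checks; the slides do not state them explicitly.
        if self.minimum_number_replicas < 1:
            raise ValueError("MinimumNumberReplicas must be at least 1")
        if self.initial_number_replicas < self.minimum_number_replicas:
            raise ValueError("InitialNumberReplicas is below MinimumNumberReplicas")


props = ObjectGroupProperties(
    ReplicationStyle.WARM_PASSIVE,
    ControlStyle.INFRASTRUCTURE,
    ControlStyle.INFRASTRUCTURE,
    initial_number_replicas=3,
    minimum_number_replicas=2,
)
```

A record of this kind is what the later modeling slides would read when selecting and parameterizing subnets.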

  12. Fault Detection and Notification • Fault model: • object crash (incorrect results are not tolerated) • Fault detection by polling • application objects inherit the PullMonitorable interface: is_alive() method • Fault Detector invokes it periodically • hierarchy of fault detectors • Fault notification and fault analysis • Parameters: • FaultMonitoring (Style, Granularity, IntervalAndTimeout)
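
Pull-based detection as described on this slide can be sketched in a few lines of Python. Only the names `PullMonitorable` and `is_alive()` come from the slide; the `FaultDetector` class shape, the `poll_once()` method, the `notify` callback and the use of `ConnectionError` to stand in for an unreachable object are assumptions. A real detector would run the poll on a timer with a timeout.

```python
class PullMonitorable:
    """Interface inherited by application objects (name taken from the slide)."""

    def is_alive(self) -> bool:
        raise NotImplementedError


class Server(PullMonitorable):
    def __init__(self, name):
        self.name = name
        self.crashed = False

    def is_alive(self) -> bool:
        if self.crashed:
            # A crashed object cannot answer; modeled here as an exception.
            raise ConnectionError(f"{self.name} unreachable")
        return True


class FaultDetector:
    """Hypothetical detector: polls every monitored object once per round."""

    def __init__(self, notify):
        self.monitored = []
        self.notify = notify  # stands in for the Fault Notifier

    def poll_once(self):
        for obj in list(self.monitored):
            try:
                alive = obj.is_alive()
            except Exception:
                alive = False
            if not alive:
                self.monitored.remove(obj)  # report each crash only once
                self.notify(obj)


reports = []
detector = FaultDetector(notify=lambda obj: reports.append(obj.name))
s1, s2 = Server("S1"), Server("S2")
detector.monitored.extend([s1, s2])

detector.poll_once()   # both alive, nothing reported
s2.crashed = True
detector.poll_once()   # S2 is reported to the notifier
```

Because only a missing or failed `is_alive()` answer is observed, the sketch detects crashes but not incorrect results, which matches the fault model on the slide.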

  13. Logging and Recovery • Application objects inherit: • Checkpointable interface: get_state(), set_state() • Updateable interface: get_update(), set_update() • Logging Mechanism: • storing GIOP messages • periodically storing state of the objects • Recovery Mechanism: • restore object state and retrieve stored messages • Parameters: • CheckpointInterval
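
The interplay of checkpointing and message logging can be illustrated with a toy example. In this Python sketch only `get_state()` and `set_state()` are taken from the slide; the `Counter` object, the `LoggingMechanism` class and its `record`, `take_checkpoint` and `recover` methods are hypothetical, and logged GIOP messages are simplified to (method name, arguments) pairs.

```python
class Counter:
    """Toy application object exposing the Checkpointable operations."""

    def __init__(self):
        self.value = 0

    def add(self, n):
        self.value += n

    def get_state(self):
        return self.value

    def set_state(self, state):
        self.value = state


class LoggingMechanism:
    """Hypothetical log: last checkpoint plus the requests received after it."""

    def __init__(self):
        self.checkpoint = None
        self.log = []

    def record(self, method, *args):
        self.log.append((method, args))

    def take_checkpoint(self, obj):
        self.checkpoint = obj.get_state()
        self.log.clear()  # older messages are covered by the checkpoint

    def recover(self, replica):
        # Restore the stored state, then replay the logged requests in order.
        if self.checkpoint is not None:
            replica.set_state(self.checkpoint)
        for method, args in self.log:
            getattr(replica, method)(*args)
        return replica


primary, log = Counter(), LoggingMechanism()
primary.add(5); log.record("add", 5)
log.take_checkpoint(primary)
primary.add(2); log.record("add", 2)
primary.add(3); log.record("add", 3)

# The primary crashes; a fresh replica is brought up to date.
backup = log.recover(Counter())
```

A shorter CheckpointInterval keeps the replay log short and so reduces recovery time, at the cost of more frequent state transfers.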

  14. Client Failover • Identification of object groups: • IOGR: interoperable object group reference • multiple IIOP profiles addressing object group members or gateways • Basic mechanisms of the client ORB: • retry all alternative IIOP profiles • transparent reinvocation of requests (“at most once” execution semantics at the server) • heartbeating of the server
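
The profile retry performed by the client ORB can be sketched as a simple loop. This is a simplified Python illustration, not ORB code: profiles are modeled as plain callables, and `CommFailure` and `invoke_with_failover` are hypothetical names. Duplicate suppression at the server ("at most once" semantics) and heartbeating are left out.

```python
class CommFailure(Exception):
    """Stands in for a communication failure or a missing response."""


def invoke_with_failover(profiles, request):
    """Try each profile in order; fail only after all have been tried."""
    last_error = None
    for profile in profiles:
        try:
            return profile(request)
        except CommFailure as exc:
            last_error = exc  # remember the failure and move to the next profile
    raise CommFailure("all profiles failed") from last_error


def crashed_replica(request):
    raise CommFailure("no response")


def healthy_replica(request):
    return f"done: {request}"


result = invoke_with_failover([crashed_replica, healthy_replica], "get_balance")
```

The application sees either a normal reply or a single failure after the last profile, which is what makes the failover transparent.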

  15. Part II Dependability Modeling of Object-Oriented Systems Described in UML

  16. Dependability Analysis • Approach by A. Bondavalli, I. Majzik, I. Mura • HIDE - High-level Integrated Design Environment for Dependability, ESPRIT Open LTR No. 27493 • From UML-based models (class, object, deployment diagrams) to Timed Petri Nets • standard PN evaluation tools can be used • Supports • comparison of design choices • identification of bottlenecks • System-wide, structural model

  17. Modeling Approach • 1. UML model: diagrams with extensions • stereotypes to identify roles (variant, tester, ...) • tagged values to assign parameters • 2. Intermediate model: simplified structure • elements: software, hardware, with/without states • dependencies: “uses the service of”, “is composed of” • class-based redundancy → fault tree • 3. Dependability model: Timed Petri net • sub-nets for elements and dependencies

  18. Failure/Propagation Sub-models • [Figure: UML model elements (objects O1, O2; stereotype <<SF-SW>>) and the corresponding Petri net modules]

  19. Repair Sub-model • [Figure: UML model (object O1; stereotype <<SF-HW>>) and the corresponding Petri net module]

  20. Redundancy Sub-models • [Figure: UML model (RM, V1, V2), fault tree and Petri net]

  21. Part III Dependability Modeling of FT-CORBA Architectures

  22. Approach • UML models: • identification of elements/structures • additional parameters → support of automatic modeling • Tailoring to FT-CORBA • subnets for specific mechanisms • based on the parameters • Restrictions: • non-replicated client, static structure • infrastructure controlled replication management

  23. UML Modeling • Identification of elements/structures • Fault Tolerance Domain: package • independent of deployment • Object groups: sub-package • Roles: stereotypes • FT-CORBA properties as tagged values • ReplicationStyle • MembershipStyle • ConsistencyStyle • FaultMonitoring (Style, Granularity, Interval) • (Initial, Minimum) NumberReplicas

  24. Overall Structure • [Figure labels: FT Domain Alpha, Domain1, Domain2, FTI, RM, FN, FD, FD1, C1, C2, OG1, OG2, OG3, OG4, S11, S12]

  25. Modularity • Available building blocks: • failure subnet • propagation subnet • repair subnet • fault tree • Sub-models in FT-CORBA: 1. Client failover 2. Server object failure 3. Fault management (detection and notification) 4. Recovery (replication management)

  26. 1. Client Failover • Semantics: • Primary is tried first • Failover conditions: “crash” • Communication failure • No response • No failover: erroneous response • No failure exception until all profiles have been tried

  27. Dependability Sub-model Fault tree (passive replication): • Top event: Client failure • Basic events: • Server object crash • Server object erroneous response • Composite events (OR): number n of profiles • S1 (primary) erroneous • S1 crash AND S2 erroneous • S1 crash AND S2 crash AND S3 erroneous • ... • S1 crash AND S2 crash AND ... AND Sn crash
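
The composite events above are mutually exclusive, so under simplifying assumptions the top-event probability is a plain sum. The Python sketch below assumes (beyond what the slide states) that all replicas fail independently, that each is in exactly one of the states correct, crashed or erroneous, and that the per-replica probabilities `p_crash` and `p_err` are identical and given. The function name is hypothetical.

```python
def client_failure_probability(n, p_crash, p_err):
    """P(top event) = sum_{k=1..n} p_crash^(k-1) * p_err  +  p_crash^n."""
    if n < 1:
        raise ValueError("at least one profile is required")
    if p_crash < 0 or p_err < 0 or p_crash + p_err > 1:
        raise ValueError("invalid probabilities")

    total = 0.0
    all_previous_crashed = 1.0  # probability that S1..S(k-1) have all crashed
    for _ in range(n):
        # S1..S(k-1) crashed AND Sk gives an erroneous response
        total += all_previous_crashed * p_err
        all_previous_crashed *= p_crash
    # S1 crash AND S2 crash AND ... AND Sn crash
    return total + all_previous_crashed


single = client_failure_probability(1, p_crash=0.1, p_err=0.01)
double = client_failure_probability(2, p_crash=0.1, p_err=0.01)
```

With these illustrative numbers a second replica lowers the failure probability from about 0.11 to about 0.021. The `p_err` term of the primary remains however many replicas are added, because erroneous responses do not trigger failover.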

  28. 2. Server Object Failure • Distinction of failures: • Crash • → Failover in client • → Error detected in the object group • Erroneous response (commission fault) • → Propagated to clients, application-specific error detection

  29. Dependability Sub-model • Failure process: • failure subnet • distinguished cases: crash/erroneous response • Propagation subnets • standard subnets (toward the client fault tree)

  30. 3. Fault Management • Fault detection+notification: Chain of events • Source: Fault Detector • latency = MonitoringInterval • coverage depends on MonitoringGranularity: • each member / single per host / single per host and type • Propagation: Fault Notifier(s) • communication failures • Destination: Replication Manager • Hierarchy of Fault Detectors • Infrastructure objects: Replication is possible

  31. Dependability Sub-model • Error detection delay • timed PN transition • Fault notification subsystem • fault tree (AND) • Replicated infrastructure objects • local fault trees (AND)

  32. 4. Recovery in the Object Group • Triggered by the Fault Notifier in the Replication Manager • Goal: Maintain the number of replicas • crashed object is removed • creation of new replica, restoring state • only a single replica on a given host! • Repair is possible if: • current host is fault-free • current host is faulty, but there are available hosts, i.e. number of hosts >= NumberReplicas

  33. Dependability Sub-model • Repair subnet: Explicit repair • latency: CheckpointInterval, ReplicationStyle • Recovery of the replica: • Static deployment: Standard repair subnet • Pool of identical hosts: Logic condition for repair • Free hosts (PN place) • marking increased by host repair and server object crash • marking decreased by host crash and server object repair • Guard on the transition for explicit repair
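
The "Free hosts" place and the guard on the repair transition can be mimicked by a token counter. This Python sketch is an illustration only: the class `FreeHostsPlace` and its method names are hypothetical, timing is ignored, and every transition that consumes a token is assumed to be enabled only while the marking is positive.

```python
class FreeHostsPlace:
    """Token counter standing in for the 'Free hosts' Petri net place."""

    def __init__(self, marking):
        self.marking = marking

    # Transitions that add a token to the place.
    def host_repair(self):
        self.marking += 1

    def server_object_crash(self):
        self.marking += 1

    # Transitions that remove a token; each fires only if a token exists.
    def host_crash(self):
        if self.marking == 0:
            return False
        self.marking -= 1
        return True

    def server_object_repair(self):
        # Guard on the explicit repair transition: a free host is required.
        if self.marking == 0:
            return False
        self.marking -= 1
        return True


pool = FreeHostsPlace(marking=1)
pool.host_crash()                      # the only spare host fails
blocked = pool.server_object_repair()  # guard is false, repair cannot fire
pool.host_repair()                     # the spare host comes back
repaired = pool.server_object_repair() # guard is true, replica is recreated
```

In the full model these transitions would carry rates or delays, and the guard would be evaluated by the Petri net tool rather than by application code.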

  34. Overall Structure of Subnets • [Figure: client fault tree; S1 crash and S1 err failure subnets; propagation (Prop.), notification, recovery and repair subnets; parameters NumberReplica, FaultMonitoringGranularity, FaultMonitoringInterval, ReplicationStyle, CheckpointInterval]

  35. System-wide Dependability Model • Analysis of the Petri net: • standard tools (SPNP, PANDA, ...) • Sensitivity analysis • system-wide reliability, availability → Optimal selection of FT-CORBA parameters • replication (membership, consistency) styles • number of replicas • monitoring granularity, interval
