Strategies for Developing Fault-Tolerant Replicated Distributed Programs

Replicated Distributed Programs
Eric C. Cooper, University of California, Berkeley, 1985
Presented by Roh, HyunGul (hgroh@camars.kaist.ac.kr), 4. 21. 2004

Contents
• Introduction
• Overview
• A Model of Replicated Distributed Programs
• Replicated Procedure Calls
• Performance Analysis
• Synchronization
• Transactions
• Binding
• Summary

Introduction
• Goal: highly available distributed programs
  • That keep running despite failures of some of their components
  • Fault-tolerant
  • Nonstop
• Replicated distributed programs
  • Replication: von Neumann (1956)
• How is it done in this paper?
  • Replication on a per-module basis
  • Flexible, without burdening the programmer
  • Provides transparency to the programmer
• Fundamental mechanisms
  • Troupes, or replicated modules
  • Replicated procedure calls, many-to-many

Overview
[Figure: a replicated distributed program; labels: active agent as a single thread, state information, troupe, module, procedures]
• A model of replicated distributed programs
• How can replication be added?
• How is control transferred?
• Which protocols are used?
• What problems arise?
  • The synchronization problem
  • Transactions
  • Binding distributed and replicated programs
  • Reconfiguration and recovery from partial failure

Distributed Modules & Threads
[Figure: a replicated distributed program; labels: active agent as a single thread, state information, troupe, module, procedures]
• Modules
  • Package state information and procedures
  • Separate the interface to that abstraction from its implementation
• Threads
  • “A thread of control is an abstraction intended to capture the notion of an active agent in a computation”
  • A particular thread runs in exactly one module at a given time
  • Multiple threads may run in the same module
  • Threads move among modules
• Implementation
  • Provides location transparency
  • Module: implemented by a server
  • Thread: implemented by using RPC to transfer control from server to server

Adding Replication to Distributed Programs
[Figure: a troupe with three troupe members; label: replicated procedure calls]
• Partial failures of the distributed program
  • Failures are masked by replication
  • Replication transparency (RT)
• Terminology
  • Troupe: a replicated module
  • Troupe members: the replicas
• Assumptions
  • Troupe members execute on fail-stop processors
  • If not, Byzantine agreement would be needed
• How is replication transparency guaranteed in the troupe model?
  • Deterministic troupe: a set of replicas of a deterministic module (input → unique output)
  • Troupe consistency (TC): a troupe is consistent when all its members are in the same state
  • In the absence of application-specific knowledge, TC ⇔ RT
  • All troupes are deterministic ⇒ RT is guaranteed

Replicated Procedure Calls (1/3)
[Figure: client troupe members issuing "Call P" to server troupe members holding "P: proc"]
• RPC (remote procedure call): distributed programs can be written like local programs
• Replicated procedure calls: the natural generalization of RPC when modules are replaced by troupes
• Server and client?
  • Server troupe: the troupe whose module provides the procedure
  • Client troupe: the caller
• How? The Circus paired message protocol
  • Earlier work by the author
  • Characteristics
    • Paired msgs (e.g. call and return)
    • Reliably delivered
    • Variable length
    • Call sequence numbers
  • Based on RPC
  • Uses UDP, the DARPA User Datagram Protocol
    • Connectionless, so reliability comes from retransmission
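
The paired message characteristics listed above (call and return pairs, call sequence numbers, retransmission over a connectionless transport) can be illustrated with a small simulation. This is only a sketch and not the Circus protocol itself: the names `Msg`, `Server`, and `paired_call`, the `losses` list that stands in for dropped UDP datagrams, and the cached-return duplicate filter are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Msg:
    kind: str       # "call" or "return"
    seq: int        # call sequence number pairing a return with its call
    payload: bytes  # variable-length body


class Server:
    def __init__(self, handler):
        self.handler = handler
        self.returns = {}     # seq -> cached return msg (assumed duplicate filter)
        self.executions = 0

    def receive(self, call_msg):
        # Execute each call sequence number at most once; replay the cached
        # return for retransmitted calls.
        if call_msg.seq not in self.returns:
            self.executions += 1
            result = self.handler(call_msg.payload)
            self.returns[call_msg.seq] = Msg("return", call_msg.seq, result)
        return self.returns[call_msg.seq]


def paired_call(server, seq, payload, losses=(), max_tries=5):
    """Send a call and retransmit until the paired return arrives.

    losses[i] simulates the datagram lost on attempt i: "call", "return",
    or None for a clean round trip.
    """
    for attempt in range(max_tries):
        lost = losses[attempt] if attempt < len(losses) else None
        if lost == "call":
            continue  # call never reached the server; time out and resend
        reply = server.receive(Msg("call", seq, payload))
        if lost == "return":
            continue  # server ran the call but the return was lost; resend
        if reply.kind == "return" and reply.seq == seq:
            return reply.payload
    raise TimeoutError(f"no return for call {seq} after {max_tries} tries")
```

With `server = Server(bytes.upper)`, the call `paired_call(server, 1, b"ping", losses=["call", "return"])` needs three attempts, yet the handler runs only once because the sequence number identifies the retransmitted call as a duplicate.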

Replicated Procedure Calls (2/3)
[Figures: one-to-many call; many-to-one call]
• Replicated procedure calls are implemented on top of the paired message layer
• One-to-many calls
  • Each client troupe member sends call msgs to the entire server troupe
  • The client will normally wait for all return msgs from the server troupe
• Many-to-one calls
  • Each server troupe member handles call msgs from the entire client troupe
  • Two problems:
    • Distinguishing unrelated calls
    • Knowing how many other call msgs to expect as part of the same replicated call
    • (Addressed in the author’s previous work)
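
The two many-to-one problems named above can be made concrete with a sketch. The slides do not say how the author's earlier work solves them, so everything here is an assumption: call msgs are keyed by a hypothetical (client troupe ID, call sequence number) pair to separate unrelated calls, and a hypothetical membership table `troupe_sizes` tells the server member how many call msgs belong to one replicated call.

```python
from collections import defaultdict


class ManyToOneCollector:
    """Groups call msgs arriving at one server troupe member."""

    def __init__(self, troupe_sizes):
        # troupe_sizes: client troupe ID -> number of members (assumed known)
        self.troupe_sizes = dict(troupe_sizes)
        # (troupe_id, seq) -> {member: args} for calls still being collected
        self.pending = defaultdict(dict)

    def receive(self, troupe_id, member, seq, args):
        """Record one call msg; return the full set once it is complete."""
        key = (troupe_id, seq)            # distinguishes unrelated calls
        self.pending[key][member] = args  # a retransmission just overwrites
        if len(self.pending[key]) == self.troupe_sizes[troupe_id]:
            return self.pending.pop(key)  # whole replicated call has arrived
        return None                       # still waiting for other members
```

A server member would run procedure P once per completed set rather than once per call msg.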

Replicated Procedure Calls (3/3)
[Figure: client troupe members issuing "Call P" to server troupe members holding "P: proc"]
• Many-to-many calls
  • Client: sends call msgs to the entire server troupe
  • Server: sends return msgs to the entire client troupe
• Waiting for messages to arrive
  • Since troupes are assumed to be deterministic, all msgs will be identical
  • When should computation proceed?
    • Only after the entire set has arrived
      • Allows error detection and error correction
      • Expensive in execution time
    • First come
      • Speed is determined by the fastest member of each troupe
      • The server proceeds as soon as the fastest member's call arrives and sends the return to the remaining members as their calls come in
• Crashes and partitions
  • Crash detection: probing and timeout
  • Network partition: which members receive a majority of the expected set of msgs?
• Collators
  • A function that maps a set of msgs into a single result
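
A collator, as defined above, is just a function from a set of msgs to a single result, so the waiting policies on this slide can be written as three small functions. This is a minimal sketch: the function names and the `expected` parameter are assumptions, and msgs are assumed to be hashable values such as bytes or strings.

```python
from collections import Counter


def unanimous(msgs):
    """Wait for the entire set and check that every msg is identical."""
    msgs = list(msgs)
    if not msgs:
        raise ValueError("no msgs received")
    if any(m != msgs[0] for m in msgs):
        raise ValueError("troupe members disagree")  # error detection
    return msgs[0]


def first_come(msgs):
    """Proceed with whichever msg arrives first."""
    for m in msgs:
        return m
    raise ValueError("no msgs received")


def majority(msgs, expected):
    """Accept a value sent by more than half of the expected members."""
    counts = Counter(msgs)
    if counts:
        value, n = counts.most_common(1)[0]
        if n > expected // 2:
            return value
    raise ValueError("no majority of the expected set of msgs")
```

`first_come` only consumes the first element of its input, so it can be handed a lazy stream of arriving msgs, while `unanimous` has to drain the whole set, which mirrors the execution-time cost noted on the slide.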

Performance Analysis
• Measures the cost of replicated procedure calls as a function of the degree of replication
• Testbed: six VAX-11/750s connected by a single 10 Mb/s Ethernet

The Synchronization Problem for Troupes
• Multiple threads of control
  • What if they want the same resource?
• Serializability can be achieved by any of a number of concurrency control algorithms
• When the server module is a troupe
  • Serialization is done by each server troupe member
  • All members must serialize in the same order

Replicated Transactions
• The transaction mechanism
  • Guarantees serializability and atomicity
• Conventional transactions
  • Permanence of committed updates
  • Crash recovery algorithm
• Correctness condition for conventional transactions
  • Serializability
• Troupe consistency must also be preserved
• Existing concurrency control algorithms for replicated databases
  • Require communication among replicas
  • Cannot be used in the troupe model
• One well-known multiple-copy concurrency control algorithm
  • Two-phase locking with unanimous update

A Troupe Commit Protocol
[Figure: client troupes C and C' send ready_to_commit for transactions T and T' to server troupe members S1 and S2, each holding (T, T'); steps (1) and (2); each member decides to commit or not]
• Optimistic
  • “Concurrent transactions are unlikely to conflict”
• Detects un-serialized transactions
  • Transforms such an un-serialized transaction into a deadlock
• Essential property
  • Two troupe members succeed in committing two transactions iff both troupe members attempt to commit the transactions in the same order
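
The essential property above can be checked with a toy model. This is a simplified sketch under stated assumptions, not the protocol from the paper: each server troupe member is reduced to a list giving the order in which it attempts to commit transactions, a transaction commits only when every member is ready to commit it next, and the function name `troupe_commit` is hypothetical.

```python
def troupe_commit(orders):
    """Model the commit property for one server troupe.

    orders maps each server troupe member to the order in which it
    attempts to commit transactions. Returns (committed, deadlocked).
    """
    queues = {member: list(order) for member, order in orders.items()}
    committed = []
    while any(queues.values()):
        # Each member is ready to commit only the head of its own order.
        heads = {q[0] for q in queues.values() if q}
        everyone_waiting = all(queues.values())
        if len(heads) != 1 or not everyone_waiting:
            # Members are ready for different transactions, so no
            # transaction hears from the whole troupe: a deadlock.
            return committed, True
        txn = heads.pop()
        for q in queues.values():
            q.pop(0)
        committed.append(txn)
    return committed, False
```

With `{"S1": ["T", "T'"], "S2": ["T", "T'"]}` both transactions commit, while `{"S1": ["T", "T'"], "S2": ["T'", "T"]}` commits nothing and reports a deadlock, which a real system would then have to detect and break.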

Binding Agents for Distributed Programs
• “A binding agent is a mechanism that enables programs to import and export modules by interface name”
• Lookup, registration, and deletion can be provided by a general-purpose name server
• Clients cache the results of lookups
  • The classic cache invalidation problem
• Garbage collection of obsolete registration information

Binding Agents for Replicated Programs
• Import and export troupes rather than single modules
• The binding agent must manipulate sets of module addresses rather than single addresses
• A more complicated cache invalidation problem
  • Troupe ID as a form of incarnation number
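
One way to picture a troupe ID acting as an incarnation number is an in-memory binding agent that stores a set of addresses per interface name and issues a new ID whenever the membership changes. This is a sketch only: the class and method names, the integer IDs, and the string addresses are assumptions, not the interface described in the paper.

```python
class BindingAgent:
    """Toy binding agent mapping interface names to troupes."""

    def __init__(self):
        # interface name -> (troupe_id, frozenset of member addresses)
        self._troupes = {}

    def _update(self, name, members):
        old_id = self._troupes.get(name, (0, frozenset()))[0]
        # Every membership change yields a new incarnation of the troupe.
        self._troupes[name] = (old_id + 1, frozenset(members))

    def register(self, name, address):
        _, members = self._troupes.get(name, (0, frozenset()))
        self._update(name, members | {address})

    def delete(self, name, address):
        _, members = self._troupes[name]
        self._update(name, members - {address})

    def lookup(self, name):
        """Return (troupe_id, addresses); clients may cache this pair."""
        return self._troupes[name]

    def is_current(self, name, troupe_id):
        """True if a cached troupe ID still names the current membership."""
        return self._troupes[name][0] == troupe_id
```

A client that cached `(troupe_id, addresses)` can present the ID with each call; a mismatch tells it that its cached set of addresses is stale and must be looked up again.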

Reconfiguration & Recovery from Partial Failure
• Detect a crash by timeout
• Replace the crashed troupe member
• New troupe member (add_troupe_member)
  • Its state must be consistent with that of the other members
  • It must be registered with the binding agent
  • “get_state” procedure
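
The steps above can be sketched as follows. Only the names `add_troupe_member` and `get_state` come from the slide; the `Member` class, the dictionary used as a stand-in for the binding agent, and the choice of the first surviving member as the state donor are assumptions for illustration.

```python
import copy


class Member:
    """A troupe member holding some module state."""

    def __init__(self, state=None):
        self.state = dict(state or {})

    def put(self, key, value):
        self.state[key] = value

    def get_state(self):
        # Hand out a deep copy so the new member cannot alias the donor.
        return copy.deepcopy(self.state)


def add_troupe_member(troupe, registry, name):
    """Create a member consistent with the troupe and register it.

    troupe: list of surviving members; registry: dict standing in for
    the binding agent (interface name -> list of members).
    """
    if not troupe:
        raise ValueError("no surviving member to copy state from")
    donor = troupe[0]                    # any consistent member would do
    newcomer = Member(donor.get_state())
    troupe.append(newcomer)
    registry.setdefault(name, []).append(newcomer)
    return newcomer
```

In a running system the state transfer would also have to be ordered with respect to in-progress calls so that the newcomer does not miss an update; this sketch ignores that.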

Summary
• Replication details are invisible to the programmer
  • Replication transparency
• Transfer of control: replicated procedure calls
  • Circus paired message protocol
  • Many-to-many
• Serializability for concurrency control
• Binding and reconfiguration
