
Nominal Calculi for Transactions: ZSN in JOIN II

Models and Languages for Coordination and Orchestration IMT- Institutions Markets Technologies - Alti Studi Lucca. Nominal Calculi for Transactions: ZSN in JOIN II. Roberto Bruni Dipartimento di Informatica Università di Pisa. Distributed 2PC.





Presentation Transcript


  1. Models and Languages for Coordination and Orchestration • IMT - Institutions Markets Technologies - Alti Studi Lucca • Nominal Calculi for Transactions: ZSN in JOIN II • Roberto Bruni, Dipartimento di Informatica, Università di Pisa

  2. Distributed 2PC • The distributed 2PC is a variant of the decentralized 2PC with a finite but unknown number of participants • When a participant P is ready to commit, it has only partial knowledge of the whole set of participants • Only those who directly cooperated with P • To commit, P must contact all its neighbours and possibly learn the identity of other participants from them

  3. Commit in Distributed DataBases • Data can be inherently distributed • e.g. customer accounts in different branches of the same bank • Data are distributed to achieve failure independence • e.g. replicated file systems • Partial failures can lead to inconsistent results • Commits have to be coordinated among participants to preserve data consistency

  4. Distributed DataBases • [Figure: two panels, Centralized and Distributed, showing users and a DB]

  5. Atomic Commitment Problem • Reach a globally consistent state despite failures • Each participant has two possible decision values • commit • All participants will make the transaction's updates permanent • abort • All will roll back • Individual decisions are irreversible • A commit decision requires unanimity of YES votes

  6. Atomic Commitment Properties • Consensus • All participants that decide reach the same decision • If any participant decides commit, then all participants must have voted YES • If all participants have voted YES and no failures occur, then commit is decided • Irreversibility • Each participant decides at most once
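The consensus properties on this slide can be read as checks on one finished run of a protocol. The following is a minimal sketch, assuming votes and decisions are recorded in plain dictionaries; the function name `check_atomic_commitment` and its parameters are illustrative and do not come from the slides. Irreversibility is not checked here because it concerns the history of a participant, not a single snapshot.

```python
from typing import Dict, Optional


def check_atomic_commitment(votes: Dict[str, str],
                            decisions: Dict[str, Optional[str]],
                            failures: bool) -> bool:
    """Return True if one finished run satisfies the consensus properties.

    votes:     participant -> "YES" or "NO"
    decisions: participant -> "commit", "abort", or None (never decided)
    failures:  True if any failure occurred during the run
    """
    decided = {d for d in decisions.values() if d is not None}
    # All participants that decide reach the same decision.
    if len(decided) > 1:
        return False
    all_yes = all(v == "YES" for v in votes.values())
    # A commit decision requires that every participant voted YES.
    if "commit" in decided and not all_yes:
        return False
    # All YES votes and no failures: commit must be decided.
    if all_yes and not failures and decided != {"commit"}:
        return False
    return True
```

For example, a run in which one participant voted NO and another decided commit is rejected, while a unanimous YES run that aborts because of a failure is accepted.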

  7. Commitment Protocols • Atomic commitment protocol • satisfies all atomic commitment properties • ensures that transactions terminate consistently at all participating sites of a distributed database, even in the presence of failures • Non-blocking • if it permits transaction termination to proceed at correct participants despite failures of others • is the activity of ensuring that software and hardware failures do not corrupt persistent data • can limit time intervals of resource locking

  8. Some Assumptions • One of the participants acts as unique coordinator (centralized version) • At most one (if no failures, then there is one coordinator) • A participant assumes the role of coordinator within a fixed time interval from the beginning of the transaction • The transaction begins at a single participant called the invoker (not necessarily the coordinator) • sends start messages to other participants • Only undeliverable messages are dropped • All participants can communicate (useful later)

  9. Generic ACP: Coordinator
     // Phase 1
     • send VOTE-REQ[Tid] to all participants
     • set-timeout
     • wait-for vote[Tid] from all participants
     // Phase 2
     • if (all votes are YES) then
         • broadcast (commit[Tid], participants)
     • else // at least one vote is NO
         • broadcast (abort[Tid], participants)
     • on-timeout: // escape blocking wait-for
         • broadcast (abort[Tid], participants)

  10. Generic ACP: Participants
     • set-timeout
     • wait-for VOTE-REQ[Tid] from coordinator // 1
     • send vote[Tid] to coordinator
     • if (vote==NO) then // unilateral abort
         • decide abort
     • else
         • set-timeout
         • wait-for decision from coordinator // 2
         • if (decision==abort) then decide abort
         • else decide commit
         • on-timeout: termination-protocol // escape 2
     • on-timeout: decide abort // escape 1
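The decision logic of the coordinator and of a participant in the generic ACP can be sketched as two pure functions. This is a simplified model, not the protocol itself: messages and timers are replaced by arguments, a missing vote or decision (`None`) stands for a timeout, and the names `coordinator_decision` and `participant_outcome` are hypothetical.

```python
from typing import Dict, Optional


def coordinator_decision(votes: Dict[str, Optional[str]]) -> str:
    """Phase 2 of the coordinator.

    A vote of None models a vote that did not arrive before the timeout.
    """
    if any(v is None for v in votes.values()):
        return "abort"  # on-timeout: escape the blocking wait-for
    if all(v == "YES" for v in votes.values()):
        return "commit"
    return "abort"  # at least one vote is NO


def participant_outcome(vote_req_arrived: bool,
                        vote: str,
                        decision: Optional[str]) -> str:
    """Outcome for one participant.

    decision is None when the coordinator's decision timed out.
    """
    if not vote_req_arrived:
        return "abort"  # escape 1: VOTE-REQ never arrived
    if vote == "NO":
        return "abort"  # unilateral abort
    if decision is None:
        return "termination-protocol"  # escape 2: participant is uncertain
    return decision
```

The asymmetry between the two timeouts is visible here: before voting a participant may abort on its own, but after voting YES it can only hand over to the termination protocol.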

  11. Simple Broadcast
     • broadcast(m,S)
         // Broadcaster
         • send m to all processes in S
         • deliver m
         // other processes in S
         • upon-receipt m // non-blocking
             • deliver m
     • This corresponds to the 2PC Protocol

  12. Timeout Actions • A participant waiting for VOTE_REQ from the coordinator • If this takes too long, it can just decide abort • The coordinator collecting votes • No global decision has been made yet • The coordinator can decide abort • A participant waiting for commit / abort from the coordinator • The participant has already voted YES • It is now uncertain • It must consult other participants according to the termination protocol

  13. Termination Protocol (TP) • What if a participant that voted YES times out waiting for the response from the coordinator? • It invokes a termination protocol to contact: • the coordinator • other participants (cooperative TP) • they may have already voted or not yet voted • There are failure scenarios for which no termination protocol can lead to a decision • Blocking scenario: correct participants cannot decide • e.g. the coordinator crashes during broadcast • all faulty participants deliver and crash • all correct participants do not deliver the decision • if faulty participants do not recover, any decision could contradict the decision of a participant that crashed

  14. Non-Blocking ACP I (skipped for lack of time)
     • set-timeout
     • wait-for VOTE-REQ[Tid] from coordinator // 1
     • send vote[Tid] to coordinator
     • if (vote==NO) then // unilateral abort
         • decide abort
     • else
         • set-timeout
         • wait-for decision from coordinator // 2
         • if (decision==abort) then decide abort
         • else decide commit
         • on-timeout: decide abort // escape 2
     • on-timeout: decide abort // escape 1

  15. Non-Blocking ACP II (skipped for lack of time)
     • broadcast(m,S)
         // Broadcaster as before
         // other processes in S
         • upon-first-receipt m
             • send m to all processes in S // S can be sent along VOTE_REQ
             • deliver m
     • any process receiving m relays m to all others (if any correct process receives m, all correct processes receive m, even if the broadcaster crashes)
     • m is delivered only after relaying
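The relay rule can be illustrated with a small simulation. This is a sketch under simplifying assumptions that are not in the slides: links between correct processes are reliable, crashed processes neither relay nor deliver, and the function name `relay_broadcast` and its parameters are made up for the illustration.

```python
from collections import deque
from typing import Iterable, Set


def relay_broadcast(processes: Set[str],
                    initial_receivers: Iterable[str],
                    crashed: Set[str]) -> Set[str]:
    """Return the set of correct processes that deliver m.

    initial_receivers: processes the broadcaster reached before it
                       (possibly) crashed in the middle of its send loop.
    crashed:           processes that never relay or deliver.
    """
    delivered: Set[str] = set()
    inbox = deque(p for p in initial_receivers if p not in crashed)
    while inbox:
        p = inbox.popleft()
        if p in delivered:
            continue  # only the first receipt triggers a relay
        # Relay m to all other processes first ...
        inbox.extend(q for q in processes
                     if q != p and q not in crashed and q not in delivered)
        # ... and deliver m only after relaying.
        delivered.add(p)
    return delivered
```

With processes `{"a", "b", "c", "d"}`, a broadcaster that reached only `a` before crashing, and `d` crashed, every correct process still delivers: the result is `{"a", "b", "c"}`. If the broadcaster reached no correct process, nobody delivers.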

  16. Recovery (skipped for lack of time) • Participant p is recovering from a failure • Must reach a consistent decision • Suppose p remembers its state at the time it failed • Before voting • it can unilaterally abort • After deciding abort • it can unilaterally abort • After receiving commit / abort from coordinator • it had already decided and must behave accordingly • During the uncertainty period (voted YES) • Independent recovery is not possible! • Termination protocol is needed

  17. Distributed Transaction Log (skipped for lack of time) • DTL is kept in stable storage at each site • Its content must survive failures • Coordinators and participants at that site can record information about transactions • Before/after sending VOTE_REQ, the coordinator C writes start2PC(S,Tid) • Before voting YES, a participant writes yes(C,S,Tid) • Before/after voting NO, a participant writes abort(Tid) • Before C sends commit, it writes commit(Tid) • Before/after C sends abort, it writes abort(Tid) • After receiving the decision, a participant writes commit/abort

  18. Recovery From DTL (skipped for lack of time)
     • If DTL contains start2PC (the site hosted the coordinator)
         • If it also contains commit/abort
             • The coordinator decided before failure
         • Otherwise
             • The coordinator can decide abort (and record it in DTL)
     • Otherwise
         • It contains commit/abort
             • The participant has reached a decision before the failure
         • It does not contain yes
             • Either failed before voting or voted no
             • The participant can unilaterally abort
         • Otherwise (it contains yes but not commit/abort)
             • The participant failed in its uncertainty period
             • Must use the termination protocol
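The case analysis on this slide maps directly onto a small decision function. The sketch below assumes the log for a single transaction is reduced to a list of record names (`"start2PC"`, `"yes"`, `"commit"`, `"abort"`); this representation and the function name `recover` are illustrative only.

```python
from typing import List


def recover(dtl: List[str]) -> str:
    """Decide what a recovering site does for one transaction.

    Returns "commit", "abort", or "termination-protocol".
    """
    # A recorded decision was reached before the failure.
    for decision in ("commit", "abort"):
        if decision in dtl:
            return decision
    if "start2PC" in dtl:
        # The site hosted the coordinator, which had not decided yet.
        return "abort"
    if "yes" not in dtl:
        # The participant failed before voting, or it voted NO.
        return "abort"
    # The log contains yes but no decision: uncertainty period.
    return "termination-protocol"
```

Only the last case needs help from other sites; every other case allows independent recovery.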

  19. Cooperative TP: Initiator (skipped for lack of time)
     • send DECISION_REQ[Tid] to all processes in S
     • wait-for decision[Tid] from any process
     • if (decision==commit) then
         • write commit in DTL
     • else // decision==abort
         • write abort in DTL

  20. Cooperative TP: Responder (skipped for lack of time)
     • wait-for DECISION_REQ[Tid] from any process p
     • if (abort(Tid) in DTL) then
         • send abort to p
     • else if (commit(Tid) in DTL) then
         • send commit to p
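The initiator and responder roles of the cooperative TP can be sketched together. As a simplification, message passing is replaced by direct function calls and each log is a list of record names; `respond` and `cooperative_tp` are hypothetical names. A responder with no recorded decision stays silent, as on the slide.

```python
from typing import Dict, List, Optional


def respond(dtl: List[str]) -> Optional[str]:
    """Responder: answer a DECISION_REQ from the local log, if possible."""
    if "abort" in dtl:
        return "abort"
    if "commit" in dtl:
        return "commit"
    return None  # no decision recorded: no reply is sent


def cooperative_tp(initiator_dtl: List[str],
                   peers: Dict[str, List[str]]) -> Optional[str]:
    """Initiator: ask every peer and record the first decision received.

    Returns None when no peer can answer, i.e. the initiator stays blocked.
    """
    for peer_dtl in peers.values():
        reply = respond(peer_dtl)
        if reply is not None:
            initiator_dtl.append(reply)  # write the decision in the DTL
            return reply
    return None
```

If every reachable peer is itself uncertain, `cooperative_tp` returns `None`, which is the blocking scenario described for 2PC.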

  21. Evaluation of 2PC • Criteria: Reliability vs Efficiency • Resiliency • What failures can be tolerated? • Blocking • Can processes be blocked? • Under which conditions? • Time Complexity • How long does it take to reach a decision? • Message Complexity • How many messages are exchanged to reach a decision? • What are their dimensions?

  22. Balancing • Reliability and Efficiency are conflicting goals • each can be achieved at the expense of the other • The choice of protocol depends on which goal is more important for a specific application • Whatever protocol is chosen, we should optimize for the case with no failures • Hopefully the normal operating state of the system

  23. Measuring Time Complexity • A round is the maximum time for a message to reach its destination • Timeouts are based on the assumption that such a delay is known • Note that many messages can be sent in a single round • Two messages must belong to different rounds iff one cannot be sent before the other is received • Rounds are taken as time units • We count the number of rounds needed for unblocked sites to reach a decision, in the worst case • This neglects the time needed to process messages • Reasonable: message delays usually exceed processing delays • Two other factors can be relevant: • DTL management (on stable storage) • Broadcasting preparation (to a large number of processes)

  24. Measuring Message Complexity • Total number of messages sent during the whole protocol • Reasonable measure if individual messages are not very large • Otherwise we should measure the length of messages, not merely their number • Here messages are short, so we abstract away from their lengths

  25. Reliability of 2PC • Resiliency • 2PC is resilient to • site failures • communication failures • In fact, the cause of timeouts is not important • Blocking • 2PC is subject to blocking • Probabilistic analysis can be performed depending on the probability distribution of failures

  26. Time Complexity of 2PC • In the absence of failures, 2PC requires 3 rounds • Broadcast VOTE-REQ • Collect votes • Broadcast global decision • If failures happen, the TP may need 2 additional rounds • Broadcast DECISION_REQ • Reply from a process outside its uncertainty period • Note that several TPs can be initiated separately in the same round • Up to 5 rounds, independently of the number of failures! • But processes may remain blocked for an unbounded period of time

  27. Message Complexity of 2PC • Let N+1 be the number of participants, including the coordinator • In each round of 2PC, N messages are sent • Hence, in the absence of failures 2PC uses 3N messages • Cooperative TP is invoked by all participants that voted YES but did not receive commit / abort • Let there be M such participants • M initiators, each sending N DECISION_REQ (MN messages) • At most N-M+1 processes will respond to the first request • In the worst case only one process abandons its uncertainty and will respond to another initiator: (N-M+1)+(N-M+2)+…+N

  28. Calculating the Message Complexity of 2PC • In the worst case the total number of TP messages will be: $NM + \sum_{i=1}^{M} (N-M+i) = NM + NM - M^2 + \frac{M(M+1)}{2} = 2NM - \frac{M^2}{2} + \frac{M}{2}$ messages • This quantity is maximum when $M=N$: $\frac{N(3N+1)}{2}$ messages • The 2PC together with the worst-case TP amounts to $3N + \frac{N(3N+1)}{2} = \frac{N(3N+7)}{2}$ messages
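The worst-case count can be checked numerically by summing the requests and replies directly and comparing with the closed forms. This is only a sanity check of the arithmetic on this slide; the function names `tp_messages` and `tp_closed_form` are illustrative.

```python
def tp_messages(n: int, m: int) -> int:
    """Worst-case TP messages for N = n and M = m uncertain initiators."""
    requests = n * m  # M initiators, each sending N DECISION_REQ
    # Replies: (N-M+1) + (N-M+2) + ... + N
    replies = sum(n - m + i for i in range(1, m + 1))
    return requests + replies


def tp_closed_form(n: int, m: int) -> int:
    """2NM - M^2/2 + M/2, written over a common denominator."""
    return (4 * n * m - m * m + m) // 2


if __name__ == "__main__":
    n = 4
    print(tp_messages(n, n))          # worst case, M = N
    print(3 * n + tp_messages(n, n))  # 2PC plus worst-case TP
```

For N = 4 and M = 4 the direct sum gives 16 requests plus 1+2+3+4 = 10 replies, that is 26 messages, which agrees with N(3N+1)/2. Adding the 3N = 12 messages of failure-free 2PC gives 38 = N(3N+7)/2.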

  29. Communication Topology • The communication topology of a protocol is the specification of who sends messages to whom • e.g. in 2PC without TP, the coordinator sends messages to participants and vice versa • Participants do not send messages directly to each other • The topology is described as a tree of height 1 • [Figure: tree with the Coordinator as root and the Participants as leaves]

  30. Alternative 2PCs • To reduce time and message complexity of centralized 2PC, two variations have been proposed, based on different communication topologies • Decentralized 2PC • Communication topology is a complete graph • Improve time complexity • Linear 2PC (aka Nested 2PC) • Linearly ordered processes • Reduce the number of messages

  31. Decentralized 2PC • Depending on its own vote, the coordinator sends YES or NO to all participants • Informs that it is time to vote • Tells the coordinator's vote • If the message is NO • Each participant decides abort and stops • Otherwise, each participant sends back its vote to ALL OTHER PARTICIPANTS • After receiving all votes each process can decide autonomously • If all are YES and its own vote is YES, decide commit • Otherwise it decides abort • Timeouts can be employed as in the centralized 2PC
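The failure-free behaviour of decentralized 2PC can be sketched as follows. This is a simplified model with no timeouts or message loss; the function name `decentralized_2pc` and the label `"coordinator"` are assumptions made for the illustration.

```python
from typing import Dict


def decentralized_2pc(coordinator_vote: str,
                      participant_votes: Dict[str, str]) -> Dict[str, str]:
    """Return the decision taken autonomously by every process."""
    processes = ["coordinator", *participant_votes]
    if coordinator_vote == "NO":
        # The coordinator's NO makes every participant abort and stop.
        return {p: "abort" for p in processes}
    all_votes = {"coordinator": coordinator_vote, **participant_votes}
    decisions: Dict[str, str] = {}
    for p in processes:
        # p combines its own vote with the votes received from the others.
        received = [v for q, v in all_votes.items() if q != p]
        unanimous = all_votes[p] == "YES" and all(v == "YES" for v in received)
        decisions[p] = "commit" if unanimous else "abort"
    return decisions
```

Every process evaluates the same set of votes, so all decisions coincide without a second message from the coordinator.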

  32. Evaluation of Decentralized 2PC • In the absence of failures, only 2 rounds are necessary • Coordinator voting YES / NO • Each participant voting YES / NO • More messages are needed: $N^2+N$ messages • $N$ messages in the first round • $N^2$ messages in the second round • (and this is just in absence of failures)

  33. Linear 2PC • Each participant can communicate only with its left / right neighbors • The coordinator is the leftmost process • It sends its vote YES / NO to its right neighbor • This message has a dual meaning as in decentralized 2PC • Each participant p waits for the vote from its left neighbor • If it is YES, and p votes YES, then p tells YES to its right neighbor • Otherwise, p tells NO to its right neighbor • When the rightmost participant receives the vote, it makes the final decision commit / abort • The decision is propagated from right to left • When the coordinator receives it, the protocol ends • Timeout periods are influenced by positions

  34. Evaluation of Linear 2PC • Only 2N messages needed • N votes from left to right • N decisions from right to left • (and this is just in absence of failures) • Unfortunately the number of rounds equals the number of messages: 2N rounds • No two messages are sent concurrently
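A failure-free run of linear 2PC, with its message count, can be sketched in a few lines. The model assumes the votes are given as a list ordered from the leftmost process (the coordinator) to the rightmost; the function name `linear_2pc` is hypothetical.

```python
from typing import List, Tuple


def linear_2pc(votes: List[str]) -> Tuple[str, int]:
    """Return (decision, number of messages) for a failure-free run.

    votes[0] is the coordinator's vote, votes[-1] the rightmost one.
    """
    running = "YES"
    messages = 0
    for i, vote in enumerate(votes):
        # Combine the vote received from the left with the local vote.
        running = "YES" if running == "YES" and vote == "YES" else "NO"
        if i < len(votes) - 1:
            messages += 1  # tell YES / NO to the right neighbor
    decision = "commit" if running == "YES" else "abort"
    messages += len(votes) - 1  # decision propagated from right to left
    return decision, messages
```

With N+1 = 4 processes all voting YES the result is `("commit", 6)`, matching the 2N messages on this slide. Since each message depends on the previous one, the same run also takes 2N rounds.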

  35. Comparison of 2PC Variants • Hybrid communication topologies are also possible • e.g. Linear for voting, complete for conveying decision • 2N messages, N+1 rounds • The choice of the protocol might be influenced by the available communication topology
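The failure-free costs quoted across the evaluation slides can be collected in one helper to compare the variants for a given N. The function name `failure_free_costs` and the dictionary layout are illustrative; the figures are the ones stated on the slides.

```python
from typing import Dict


def failure_free_costs(n: int) -> Dict[str, Dict[str, int]]:
    """Messages and rounds without failures, for N+1 processes."""
    return {
        "centralized": {"messages": 3 * n, "rounds": 3},
        "decentralized": {"messages": n * n + n, "rounds": 2},
        "linear": {"messages": 2 * n, "rounds": 2 * n},
        # Linear voting, then the decision sent over a complete graph.
        "hybrid": {"messages": 2 * n, "rounds": n + 1},
    }


if __name__ == "__main__":
    for name, cost in failure_free_costs(10).items():
        print(f"{name:14s} messages={cost['messages']:4d} rounds={cost['rounds']:3d}")
```

For N = 10 this gives 30 messages in 3 rounds for the centralized protocol, 110 messages in 2 rounds for the decentralized one, 20 messages in 20 rounds for the linear one, and 20 messages in 11 rounds for the hybrid, which makes the trade-off between time and message complexity concrete.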

  36. From 2PC to 3PC (skipped for lack of time) • In 2PC, if all operational participants are uncertain, they are blocked • They cannot decide abort even if aware that processes they cannot communicate with have failed, because some of them could have decided commit before failure • The 3PC is an ACP designed to rule out this situation • It guarantees that if any operational process is uncertain, then no (operational / failed) process can have decided commit • Thus, if p realizes that any operational site is uncertain, then p can decide abort • Why does 2PC violate this property? • A participant p can receive commit while q is still uncertain

  37. Sketch of 3PC: The Idea (skipped for lack of time) • After the coordinator has found that all votes were YES, it sends pre-commit messages to all participants • When a participant p receives pre-commit, it knows that all participants voted YES • p is no longer uncertain, but does not decide commit yet • p knows that it will decide commit unless it fails • p acknowledges the receipt of pre-commit • When the coordinator collects all acks it knows that no participant is uncertain • The coordinator sends commit to all participants • When a participant receives commit, it decides commit • If a participant voted NO, then 3PC behaves as 2PC

  38. Sketch of 3PC: Some Notes (skipped for lack of time) • In the absence of failures, 3PC involves 5 rounds and up to 5N messages • Participants have four possible states • Aborted, Uncertain, Committable, Committed • For p and q any two participants, only certain combinations of their states are possible • Timeouts can occur in five situations • 3 are trivially handled • 2 require a complex termination protocol • Election protocol (for a new coordinator) based on a linear ordering of participants • The new coordinator checks the states of all operational participants • Timeouts are again necessary

  39. Some References • P. Bernstein, N. Goodman, V. Hadzilacos. Concurrency Control and Recovery in Database Systems (Addison-Wesley, 1987) • J. Gray, A. Reuter. Transaction Processing: Concepts and Techniques (Morgan Kaufmann, 1993) • H. Garcia-Molina, K. Salem. Sagas (Proc. SIGMOD'87, ACM, pp. 249-259) • O. Babaoglu, S. Toueg. Non-blocking atomic commitment (Chapter 6 of Distributed Systems, Addison-Wesley, 1995)

  40. D2PC • Every participant P acts as coordinator • During the transaction P builds its own synchronization set L_P of cooperating agents • When P is ready to commit, P asks the processes in L_P whether they are ready (if L_P is empty, P was isolated and can commit) • In doing so, P sends them the set L_P • Other participants will send to P • either a successful reply with their own synchronization sets • or a failure message • (in this case, failure is then propagated) • The sets carried by successful replies are added to L_P • The protocol terminates when L_P is transitively closed
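The growth of a synchronization set until it is transitively closed can be sketched from the point of view of a single participant. This is a sequential simplification of a protocol that really runs concurrently at every participant: replies are looked up in a dictionary instead of being exchanged as messages, a failure reply is modelled as `None`, and the name `d2pc_closure` is hypothetical.

```python
from typing import Dict, Optional, Set


def d2pc_closure(start: str,
                 sync_sets: Dict[str, Optional[Set[str]]]) -> Optional[Set[str]]:
    """Return the closed synchronization set of `start`.

    sync_sets maps each participant to its own synchronization set,
    or to None if that participant answers with a failure message.
    Returns None when a failure is propagated to `start`.
    """
    known: Set[str] = set(sync_sets[start] or ())
    contacted: Set[str] = set()
    while known - contacted:
        q = min(known - contacted)  # any not-yet-contacted party will do
        contacted.add(q)
        reply = sync_sets[q]
        if reply is None:
            return None  # failure message: abort
        known |= reply - {start}  # learn the parties q cooperated with
    return known  # every known party has replied: transitively closed
```

With three participants whose sets are `{"P1": {"P2"}, "P2": {"P1", "P3"}, "P3": {"P2"}}`, starting from `P3` the set grows from `{"P2"}` to `{"P1", "P2"}` after P2 replies, and the loop stops once P1 has replied as well. An isolated participant with an empty set terminates immediately.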

  41. Example: D2PC • Synchronization sets • P1 {P2} • P2 {P1,P3} • P3 {P2}

  42. Example: P3 Enters ACP Phase • Legend: {known parties} [contacted parties] (parties who replied) • P1 {P2} • P2 {P1,P3} • P3 {P2} [] ()

  43. Example: P3 Contacts Known Parties • P3 sends <P3,{P2}>: "Hi, I am P3. I am ready to commit. I know P2." • P1 {P2} • P2 {P1,P3} • P3 {P2} [P2] ()

  44. Example: P2 Enters ACP Phase • Pending messages: <P3,{P2}> • P1 {P2} • P2 {P1,P3} [] () • P3 {P2} [P2] ()

  45. Example: P2 Contacts Known Parties • P2 sends <P2,{P1,P3}> twice: "Hi, I am P2. I am ready to commit. I know P1 and P3." • Pending messages: <P3,{P2}>, <P2,{P1,P3}>, <P2,{P1,P3}> • P1 {P2} • P2 {P1,P3} [P1,P3] () • P3 {P2} [P2] ()

  46. Example: Some Pending Messages Around • Pending messages: <P3,{P2}>, <P2,{P1,P3}>, <P2,{P1,P3}> • P1 {P2} • P2 {P1,P3} [P1,P3] () • P3 {P2} [P2] ()

  47. Example: P2 Reads a Pending Vote • Pending messages: <P2,{P1,P3}>, <P2,{P1,P3}> • P1 {P2} • P2 {P1,P3} [P1,P3] (P3) • P3 {P2} [P2] ()

  48. Example: P3 Reads a Pending Vote • Pending messages: <P2,{P1,P3}> • P1 {P2} • P2 {P1,P3} [P1,P3] (P3) • P3 {P1,P2} [P2] (P2)

  49. Example: P3 Contacts the Newly Known Party • P3 sends <P3,{P1,P2}> • Pending messages: <P2,{P1,P3}>, <P3,{P1,P2}> • P1 {P2} • P2 {P1,P3} [P1,P3] (P3) • P3 {P1,P2} [P1,P2] (P2)

  50. Example: P1 Enters ACP Phase • Pending messages: <P3,{P1,P2}>, <P2,{P1,P3}> • P1 {P2} [] () • P2 {P1,P3} [P1,P3] (P3) • P3 {P1,P2} [P1,P2] (P2)
