
CS 294-8 Consensus cs.berkeley/~yelick/294






Presentation Transcript


  1. CS 294-8 Consensus http://www.cs.berkeley.edu/~yelick/294

  2. Agenda • Overview and Administrivia • Specifications and verification • Consensus • Practical Issues in Consensus • Note: due in part to unreliable network and lack of reliability in ppt, most slides are stolen from Lamport.

  3. Administrivia • So far: readings in distributed and fault tolerant systems • Next: specifying and reasoning about these systems • Readings for next few weeks will be set by Thursday • For Thursday: • SmartBridge (for talk) • Frangipani (for discussion)

  4. Course Overview • So far: reading “systems” papers • Next few weeks: reading papers on algorithms and proofs • Why? • I know my algorithm works, but… • I found a missing case when I was implementing… • My advisor (or the PC) doesn’t believe me…

  5. Agenda • Overview and Administrivia • Specifications and verification • Consensus • Practical Issues in Consensus

  6. Highly Available Computing • High availability means either perfection or redundancy. • The system can work even when some parts are broken. • The simplest redundancy is replication: • Several copies of each part. • Each non-faulty copy does the same thing. • Every computing system works as a state machine. • So a replicated state machine can do highly available computing.

  7. Replicated State Machines • If a state machine is deterministic, then feeding two copies the same inputs will produce the same outputs and states. • We call each copy a process. • So all we need is to agree on the inputs. • Examples: • Replicated storage with Read(a) and Write(a, d) steps. • Airplane flight control system with ReadInstrument(i) and RaiseFlaps(d) steps.
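
The determinism argument on this slide can be sketched in a few lines of Python. This is a minimal illustration, not code from the lecture: the `StorageReplica` class, the tuple encoding of operations, and the `run` helper are all assumptions made for the example. It models the replicated-storage case with Read(a) and Write(a, d) steps and shows that two copies fed the same agreed inputs end in the same state with the same outputs.

```python
from dataclasses import dataclass, field


@dataclass
class StorageReplica:
    """One deterministic copy (a "process") of a storage state machine (hypothetical)."""
    store: dict = field(default_factory=dict)

    def step(self, op):
        """Apply one input to the state and return its output."""
        kind = op[0]
        if kind == "write":            # Write(a, d): store d at address a
            _, addr, data = op
            self.store[addr] = data
            return None
        if kind == "read":             # Read(a): return the value at address a
            _, addr = op
            return self.store.get(addr)
        raise ValueError(f"unknown operation: {op!r}")


def run(replica, inputs):
    """Feed a sequence of inputs to one replica and collect its outputs."""
    return [replica.step(op) for op in inputs]


# The inputs every replica has agreed on (in a real system, via consensus).
agreed_inputs = [("write", "x", 3), ("read", "x"), ("write", "x", 5), ("read", "x")]

a, b = StorageReplica(), StorageReplica()
outputs_a = run(a, agreed_inputs)
outputs_b = run(b, agreed_inputs)
```

Because `step` depends only on the current state and the input, agreement on `agreed_inputs` is the only coordination the replicas need.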

  8. State Machine Approach • A distributed system is: • A finite set of processes • A process is: • A set of states, with one initial state • A set of events or actions • An execution is a possibly infinite sequence of alternating states and actions: s0 a0 s1 a1 s2 a2 ...

  9. Properties • A stuttering transition has the form s → s (the state does not change) • A property is a set of executions closed under stuttering [Abadi, Lamport 1990] • The clock still ticks after a program terminates • Stuttering is also useful in mapping between levels of abstraction
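
Closure under stuttering can be made concrete with a small sketch. The code below is an illustration under simplifying assumptions, not material from the slides: executions are modeled as finite lists of states (actions are omitted), and the helper names `destutter`, `stutter_equivalent`, and `counts_up_by_one` are hypothetical. A predicate that only inspects the destuttered execution cannot distinguish two executions that differ by stuttering steps.

```python
def destutter(states):
    """Collapse consecutive repeated states (stuttering steps) in a finite execution."""
    result = []
    for s in states:
        if not result or result[-1] != s:
            result.append(s)
    return result


def stutter_equivalent(e1, e2):
    """Two executions are stutter-equivalent if they agree once stuttering is removed."""
    return destutter(e1) == destutter(e2)


def counts_up_by_one(states):
    """Example property: every non-stuttering step increments the counter by 1.

    It is defined on the destuttered execution, so it is closed under
    stuttering by construction.
    """
    core = destutter(states)
    return all(b == a + 1 for a, b in zip(core, core[1:]))
```

For example, `[0, 1, 2]` and `[0, 0, 1, 2, 2]` are stutter-equivalent, and `counts_up_by_one` gives the same answer for both.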

  10. Safety Properties • Informally: A safety property is one that says something bad doesn’t happen • Formally: A property P is a safety property iff: • If s is in P then any finite prefix of s is in P • Additionally, • If s is not in P then there is some finite prefix of s that is not in P • There is a point at which an illegal transition occurred • Safety properties can be finitely refuted.
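
Finite refutation can be shown with a short sketch. It assumes a restricted kind of safety property, one expressed as a check on each individual transition; the function names `first_bad_prefix` and `never_decreases` are hypothetical. When an execution violates the property, the function returns the shortest prefix that already contains the illegal transition, which serves as the finite witness.

```python
def first_bad_prefix(states, step_ok):
    """Return the shortest prefix ending in an illegal transition, or None.

    `step_ok(prev, nxt)` says whether a single transition is allowed.
    """
    for i in range(1, len(states)):
        if not step_ok(states[i - 1], states[i]):
            return states[: i + 1]   # finite witness that refutes the property
    return None


def never_decreases(prev, nxt):
    """Example transition rule: the value may stutter or grow, never shrink."""
    return nxt >= prev
```

Once such a prefix exists, no extension of the execution can repair it, which is why a monitor watching a running system can detect safety violations.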

  11. Liveness Properties • Informally: A liveness property says something good eventually happens • Formally: A property P is a liveness property iff: • Every finite behavior is a prefix of some behavior in P • Additionally, • Can always “complete” a finite behavior into one that is in P • Liveness properties cannot be finitely refuted.
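
The "completion" idea can be illustrated with a toy example. It is only a sketch: liveness is defined over infinite behaviors, while this code uses finite lists as stand-ins, and the property `eventually_done` and the helper `complete` are hypothetical names chosen for the example.

```python
def eventually_done(behavior):
    """Example liveness property: the state "done" appears at some point."""
    return "done" in behavior


def complete(prefix):
    """Extend any finite behavior into one that satisfies eventually_done."""
    return list(prefix) + ["done"]


stuck_so_far = ["init", "work", "work"]
```

Here `stuck_so_far` does not yet satisfy the property, but `complete(stuck_so_far)` does. Since every finite prefix can be completed this way, no finite observation rules the property out.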

  12. Safety and Liveness • Every property (i.e., every set of behaviors) is the conjunction of: • A safety property and • A liveness property • Due to Alpern and Schneider, based on basic results from topology
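
The standard form of this decomposition can be written out as follows. The notation is an assumption added here, not taken from the slide: $\overline{P}$ denotes the topological closure of $P$, that is, the smallest safety property containing $P$, and $\neg$ denotes complement within the set of all behaviors.

```latex
\[
  P \;=\; \underbrace{\overline{P}}_{\text{safety}}
     \;\cap\;
     \underbrace{\bigl(P \cup \neg\,\overline{P}\bigr)}_{\text{liveness}}
\]
```

The first factor is closed, so it is a safety property. The second factor is dense: any finite behavior either has an extension in $P$, or has an extension outside $\overline{P}$, so it can always be completed into a member. Intersecting the two gives back $P$ because $P \subseteq \overline{P}$.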

  13. Visible Behavior • A specification identifies a subset of its actions (or its state variables) as externally visible. • A state machine defines a set of allowable executions: • state: a set of values, usually divided into named variables. • actions: named changes in the state; internal and external. • They may be nondeterministic • In fact, Lampson encourages this in specs to allow flexibility in implementations

  14. Implements • Y implements X if • every external behavior of Y is an external behavior of X. • This expresses the idea that Y implements X if you can’t tell Y apart from X by looking only at the external actions • Examples: abstract data types, databases, distributed systems • Note: Lampson implicitly deals with finite behaviors, and therefore states the liveness property separately. (Doesn’t treat liveness in the proofs.)
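
For small finite machines, the "implements" relation can be checked by brute force on bounded executions. The sketch below is an illustration under stated assumptions, not Lampson's proof method: machines are encoded as dictionaries with hypothetical keys `initial`, `transitions`, and `external`, only finite behaviors up to a step bound are compared, and the example machines `SPEC` and `IMPL` are invented. A real proof would use an abstraction function and a simulation argument instead of enumeration.

```python
def external_traces(machine, max_steps):
    """All external-action traces of executions with at most max_steps transitions."""
    traces = set()
    frontier = {(machine["initial"], ())}
    for _ in range(max_steps + 1):
        next_frontier = set()
        for state, trace in frontier:
            traces.add(trace)
            for src, action, dst in machine["transitions"]:
                if src == state:
                    # Internal actions change the state but are not recorded.
                    visible = trace + (action,) if action in machine["external"] else trace
                    next_frontier.add((dst, visible))
        frontier = next_frontier
    return traces


def implements(y, x, y_steps, x_steps):
    """Bounded check: every external trace of y (within y_steps) is one of x (within x_steps)."""
    return external_traces(y, y_steps) <= external_traces(x, x_steps)


# Spec: nondeterministic, a request may be answered with success or an error.
SPEC = {
    "initial": "idle",
    "transitions": [("idle", "request", "busy"),
                    ("busy", "reply_ok", "idle"),
                    ("busy", "reply_err", "idle")],
    "external": {"request", "reply_ok", "reply_err"},
}

# Implementation: has an internal "compute" step and always succeeds.
IMPL = {
    "initial": "idle",
    "transitions": [("idle", "request", "working"),
                    ("working", "compute", "ready"),
                    ("ready", "reply_ok", "idle")],
    "external": {"request", "reply_ok"},
}
```

`implements(IMPL, SPEC, 6, 6)` holds because the internal `compute` step is invisible, while `implements(SPEC, IMPL, 6, 6)` fails because the spec can emit `reply_err`. The nondeterminism in `SPEC` is what leaves the implementation free to choose one outcome.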

  15. Agenda • Overview and Administrivia • Specifications and verification • Consensus • Practical Issues in Consensus

  16. Use of Consensus • Agreeing on some value is called consensus. • A replicated state machine needs to agree on a sequence of values: • Input 1 Write(x, 3) • Input 2 Read(x) • . . .

  17. Paxos Assumptions • Each legislator has • A ledger (stable storage) • An hourglass for time • Communication • Point-to-point, fully connected network • Unreliable: loss and delay allowed • Failures • Legislators may come and go (processor failure) • They are honest (no Byzantine failures)

  18. Agenda • Overview and Administrivia • Specifications and verification • Consensus • Practical Issues in Consensus

  19. Summary • How to build a highly available system using consensus. • Run a replicated deterministic state machine, and get consensus on each input. • Use leases to replace most of the consensus steps with actions by one process. • The most fault-tolerant algorithm for consensus without real-time guarantees. • Lamport’s “Paxos” algorithm, based on • How to design and understand a concurrent, fault-tolerant system. • Write a simple spec as a state machine. • Define abstract function and show simulation.
