
CS 603 Mid-Semester Review



  1. CS 603 Mid-Semester Review, March 4, 2002

  2. One or Two Day Review? • One day: Skim material and Test Overview • What to do with Wednesday? • More on replication • Start on distributed processes • Two day: Discuss material to date • Wednesday: • Finish Review • Work out sample question

  3. Basics • Why do we want distributed systems? • Scaling • Heterogeneity • Geographic Distribution • What is a distributed system? • Transparency vs. Exposing Distribution • Hardware Basics • Communication Mechanisms

  4. Basic Software Concepts • Hiding vs. Exposing • Distribution – Distributed OS • Location, but not distribution – Middleware • None – Network OS • Concurrency Primitives • Semaphores • Monitors • Distributed System Models • Client-Server • Multi-Tier • Peer to Peer

  5. Communication Mechanisms • Shared Memory • Enforcement of single-system view • Delayed consistency: δ-Common Storage • Message Passing • Reliability and its limits • Stream-oriented Communications • Remote Procedure Call • Remote Method Invocation

  6. RPC Example: DCE • Language / Platform Independent • Implementation Issues: • Data Conversion • Underlying Mechanisms • Fault Tolerance Approaches

  7. Java RMI • Supports remote invocation of Java objects • Key: Java Object Serialization • Stream objects over the wire • Language specific • Advantages • True object-orientation: Objects as arguments and values • Mobile behavior: Returned objects can execute on caller • Integrated security • Built-in concurrency (through Java threads) • Disadvantage – Java only • Implementation / Use • Registry

  8. SOAP • Goal: RPC protocol that works over wide area networks • Interoperable • Language independent • Problem: Firewalls • Solution: HTTP/XML • Client side: Ability to generate http calls and listen for response • Server: • Listen for HTTP • Bind to procedure • Respond with HTTP • SOAP message format and use mechanisms

  9. Naming Requirements • Disambiguate only • Access resource given the name • Build a name to find a resource • Do humans need to use name? • Static/Dynamic Resource • Performance Requirements

  10. Naming Approaches • Scope • Global vs. Hierarchical • Unique ID vs. Non-Unique Description • Namespaces • URN, URI, URL • Registries

  11. Registry Example: X.500 • Goal: Global “white pages” • Lookup anyone, anywhere • Developed by Telecommunications Industry • ISO standard directory for OSI networks • Idea: Distributed Directory • Application uses Directory User Agent to access a Directory Access Point

  12. Directory Information Base (X.501) • Tree structure • Root is entire directory • Levels are “groups” • Country • Organization • Individual • Entry structure • Unique name • Built from tree • Attributes: Type/value pairs • Schema enforces type rules • Alias entries

  13. X.500 • Directory Entry: • Organization level – CN=Purdue University, L=West Lafayette • Person level – CN=Chris Clifton, SN=Clifton, TITLE=Associate Professor • Directory Operations • Query, Modify • Authorization / Access control • To directory • Directory as mechanism to implement for others

  14. X.500 – Distributed Directory • Directory System Agent • Referrals • Replication • Cache vs. Shadow copy • Access control • Modifications at Master only • Consistency • Each entry must be internally consistent • DSA giving copy must identify as copy

  15. X.500 Subsets • LDAP • X.500 without OSI • Intended for use over IP • Active Directory • Microsoft’s answer to LDAP • Extensible “default” naming schema • Limited replication facilities

  16. Clock Synchronization • Definition: All nodes agree on time • What do we mean by time? • What do we mean by agree? • Lamport Definition: Events • Events partially ordered • Clock “counts” the order

  17. Event-based definition (Lamport ’78) • Define partial order of events • A → B: A “happened before” B: Smallest relation such that: • If A and B in same process and A occurs first, A → B • If A is sending a message and B is receipt of that message, A → B • If A → B and B → C, then A → C • Clock: C(x) is time x occurs: • C(x) = Ci(x) where x running on node i. • Clocks correct if ∀ a,b: a → b implies C(a) < C(b)

  18. Lamport Clock Implementation • Node i increments Ci between any two successive events • If event a is sending of a message m from i to j, • m contains timestamp Tm = Ci(a) • Upon receiving m, set Cj ≥ current Cj and > Tm • Can now define total ordering. a ⇒ b iff: • Ci(a) < Cj(b), or • Ci(a) = Cj(b) and Pi < Pj
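
The rules on this slide can be sketched in a few lines of Python. This is a minimal illustration, not code from the course: the class name `LamportClock`, its method names, and the `(timestamp, pid)` tuple encoding of the total order are all assumptions made for the example. On receipt it uses max(Cj, Tm) + 1, which is one simple way to satisfy "Cj ≥ current Cj and > Tm".

```python
from dataclasses import dataclass


@dataclass
class LamportClock:
    """Logical clock for one node (hypothetical helper, not from the slides)."""
    pid: int       # process identifier Pi, used only to break ties
    time: int = 0  # current value of Ci

    def tick(self) -> int:
        """Advance Ci for a local event and return the event's timestamp."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """Sending is an event; the returned value is the timestamp Tm."""
        return self.tick()

    def receive(self, tm: int) -> int:
        """Set Cj so that it is >= the current Cj and strictly > Tm."""
        self.time = max(self.time, tm) + 1
        return self.time


def ordered_before(a: tuple[int, int], b: tuple[int, int]) -> bool:
    """Total order on (timestamp, pid) pairs: compare clocks, then pids."""
    return a < b


if __name__ == "__main__":
    p1, p2 = LamportClock(pid=1), LamportClock(pid=2)
    tm = p1.send()          # event a on node 1
    tb = p2.receive(tm)     # event b on node 2
    print(tm, tb, ordered_before((tm, p1.pid), (tb, p2.pid)))
```

Python compares tuples lexicographically, which matches the two cases of the total ordering: a smaller clock value wins, and equal clock values fall back to the process id.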

  19. What if we want “wall clock” time? • Ci must run at correct rate: • ∃ κ << 1 such that | dCi(t)/dt – 1 | < κ • Synchronized: • ∃ small ε such that ∀ i,j: | Ci(t) – Cj(t) | < ε • Assume transmission time between μ and μ+ξ • Algorithm: Upon receiving message m, set Cj(t) = max(Cj(t), Tm+μ) • Theorem: Assume every τ seconds a message with unpredictable delay ξ is sent over every arc. Then ∀ t ≥ t0 + τd, ε ≈ d(2κτ + ξ), where d is the diameter of the network
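
A small sketch of the receive rule and the bound from the theorem, to make the quantities concrete. The function names and the sample numbers are illustrative assumptions, and d is taken to be the network diameter as in Lamport's paper.

```python
def on_receive(cj: float, tm: float, mu: float) -> float:
    """Receive rule: never move the clock backwards, and never let it
    read earlier than the send timestamp plus the minimum delay mu."""
    return max(cj, tm + mu)


def sync_bound(d: int, kappa: float, tau: float, xi: float) -> float:
    """Approximate synchronization bound: epsilon ~ d * (2*kappa*tau + xi)."""
    return d * (2 * kappa * tau + xi)


if __name__ == "__main__":
    # Assumed sample values: clock reads 10.0, message stamped 9.5,
    # minimum transmission time 1.0.
    print(on_receive(10.0, 9.5, 1.0))
    # Assumed sample values: diameter 3, drift 1e-6, resync every 10 s,
    # delay uncertainty 10 ms.
    print(sync_bound(3, 1e-6, 10.0, 0.01))
```

The bound shows the two sources of error: drift accumulated between messages (2κτ) and delay uncertainty (ξ), each multiplied by the number of hops a correction may need to travel.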

  20. Clock Synchronization: Limits • Best Possible: Delay Uncertainty • Actually ε(1 – 1/n) • Synchronization with Faults • Faulty clock • Communication Failure • Malicious processor • Worst case: Can only synchronize if < 1/3 processors faulty • Better if clocks can be authenticated

  21. Real example: NTP I doubt you need to review this...

  22. Process Synchronization • Problem: Shared Resources • Model as sequential or parallel process • Assumes global state! • Alternative: Mutual Exclusion when Needed • Coordinator approach • Token Passing • Timestamp

  23. Mutual Exclusion • Requirements • Does it guarantee mutual exclusion? • Does it prevent starvation? • Is it fair? • Does it scale? • Does it handle failures?

  24. CS 603 Mid-Semester Review, March 6, 2002

  25. Mutual Exclusion: Colored Ticket Algorithm • Goals: • Decentralized • Fair • Fault tolerant • Space Efficient • Idea: Numbered Tickets • Next number gets resource • Problem: Unbounded Space • Solution: Reissue blocks
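
The sketch below shows only the plain numbered-ticket idea from this slide ("next number gets resource"), not the colored ticket algorithm itself. It is a single-process simulation with hypothetical names, and it deliberately keeps the unbounded counters that the slide identifies as the problem the colored variant solves by reissuing blocks.

```python
class TicketDispenser:
    """Numbered tickets served in order (illustrative, not decentralized)."""

    def __init__(self) -> None:
        self.next_ticket = 0   # next number to hand out (grows without bound)
        self.now_serving = 0   # ticket currently allowed to hold the resource

    def take(self) -> int:
        """Hand out the next ticket number."""
        ticket = self.next_ticket
        self.next_ticket += 1
        return ticket

    def may_enter(self, ticket: int) -> bool:
        """Only the holder of the lowest outstanding number may enter."""
        return ticket == self.now_serving

    def release(self, ticket: int) -> None:
        """Leave the critical section and pass the turn to the next number."""
        if ticket != self.now_serving:
            raise ValueError("only the current holder may release")
        self.now_serving += 1


if __name__ == "__main__":
    d = TicketDispenser()
    a, b = d.take(), d.take()
    print(d.may_enter(a), d.may_enter(b))
    d.release(a)
    print(d.may_enter(b))
```

Serving tickets in issue order is what gives fairness here: no later arrival can overtake an earlier one.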

  26. Multi-Resource Mutual Exclusion • New Problem: Deadlock • Processes using all resources • Each needs additional resource to proceed • Dining Philosophers Problem • Coordinated vs. truly distributed solutions • Problems with deterministic solutions • Probabilistic solution – Lehmann & Rabin • Starvation / fairness properties

  27. Distributed Transactions • ACID properties • Issues: • Commit Protocols • Fault Tolerance • Why is this enough? • Failure Models and Limitations • Mechanisms: • Two-phase commit • Three-phase commit

  28. Two-Phase Commit (Lamport ’76, Gray ’79) • Central coordinator initiates protocol • Phase 1: • Coordinator asks if participants can commit • Participants respond yes/no • Phase 2: • If all votes yes, coordinator sends Commit; otherwise it sends Abort • Participants respond when done • Blocks on failure • Participants must replace coordinator • If participant and coordinator fail, wait for recovery • While blocked, transaction must remain Isolated • Prevents other transactions from completing
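
The failure-free path of the protocol can be sketched as below. This is an in-memory illustration under stated assumptions: the `Participant` class, its state names, and the `two_phase_commit` function are hypothetical, there is no network, and no timeouts or failures are modeled, so the blocking behavior described on the slide does not appear.

```python
class Participant:
    """A site taking part in the transaction (hypothetical model)."""

    def __init__(self, name: str, can_commit: bool) -> None:
        self.name = name
        self.can_commit = can_commit
        self.state = "initial"

    def vote(self) -> bool:
        """Phase 1: answer the coordinator's 'can you commit?' request."""
        self.state = "ready" if self.can_commit else "aborted"
        return self.can_commit

    def finish(self, decision: str) -> None:
        """Phase 2: apply the coordinator's decision."""
        self.state = decision


def two_phase_commit(participants: list[Participant]) -> str:
    """Run both phases as the coordinator and return the decision."""
    # Phase 1: collect a vote from every participant.
    votes = [p.vote() for p in participants]
    # Phase 2: commit only if every vote was yes.
    decision = "committed" if all(votes) else "aborted"
    for p in participants:
        p.finish(decision)
    return decision


if __name__ == "__main__":
    sites = [Participant("A", True), Participant("B", False)]
    print(two_phase_commit(sites), [p.state for p in sites])
```

A participant in the "ready" state has given up the right to abort on its own, which is why a coordinator failure at that point leaves it blocked.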

  29. Transaction Model • Transaction Model • Global Transaction State • Reachable State Graph • Local states potentially concurrent if a reachable global state contains both local states • Concurrency set C(s) is all states potentially concurrent with s • Sender set S(s) = {local states t | t sends m and s can receive m} • Failure Model • Site failure assumed when expected message not received in time • Independent Recovery

  30. Problems with 2-PC • Blocking on failure • 3-PC as solution • Theorems on recovery limits • Independent recovery: No two-site failure • Non-independent recovery • Anything short of total failure okay • Recovery protocol for total failure

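  31. 3PC assuming timeout on receipt of message (state diagram; arcs labeled received / sent) • Coordinator states: q1, w1, p1, a1, c1 • q1 → w1: xact request / start xact • w1 → a1: no / abort • w1 → p1: yes / pre-commit • p1 → c1: ack / commit • Participant states: q2, w2, p2, a2, c2 • q2 → a2: start xact / no • q2 → w2: start xact / yes • w2 → a2: abort / - • w2 → p2: pre-commit / ack • p2 → c2: commit / -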

  32. Termination Protocol • If participant times out in w2 or p2: • Elect new coordinator • If coordinator were alive, it would have committed/aborted • New coordinator requests state of all processes. Termination rules: • If any aborted, broadcast abort • If any committed, broadcast commit • If all w2, broadcast abort • If any p2, send pre-commit and enter state p1 • Complete failure protocol
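
The four termination rules can be written as a single decision function, applied in the order the slide lists them. The function name and the string encoding of states are assumptions for the example; combinations the slide does not cover (for instance a participant still in q2) are rejected rather than guessed.

```python
def termination_decision(states: list[str]) -> str:
    """Decide what the newly elected coordinator broadcasts.

    `states` holds the reported local state of each reachable participant,
    using the labels q2, w2, p2, a2, c2 from the 3PC state diagram.
    """
    if "a2" in states:
        return "abort"        # someone already aborted
    if "c2" in states:
        return "commit"       # someone already committed
    if states and all(s == "w2" for s in states):
        return "abort"        # nobody has seen pre-commit yet
    if "p2" in states:
        return "pre-commit"   # resume from p1, then commit after acks
    raise ValueError("state combination not covered by the slide's rules")


if __name__ == "__main__":
    print(termination_decision(["w2", "w2", "w2"]))
    print(termination_decision(["w2", "p2", "w2"]))
```

The rules are safe because no participant can be in c2 while another is still in w2 without some site having passed through p2, so "all w2" means nobody could have committed.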

  33. Test Basics • Mechanics: Open book/notes • No electronic aids • Two questions • Each multi-part • Will include scoring suggestions • Underlying question: Do you understand the material? • No need to regurgitate “best in literature” answer • Reasonable self-designed solution fine • Key: Do you really understand your answer • Can you build CORRECT distributed systems?

  34. Sample Question: Clock Synchronization • Develop synchronization protocol for a four processor system with fully-connected processors. • Linear envelope of real time • Bounded difference between clocks on correct processors. • Time set to 0 when the protocol begins (but not synchronized). • Assume: • Clocks don't drift • Messages take between time 0 and e • At most one faulty processor • No authentication • Discuss the correctness of your algorithm, including the types of faults handled. • Scoring: • Protocol: Up to five points • Argument for correctness: 2 points (requires believable proof sketch for full 2 points) • Faults supported / not supported: 1-3 points (3 points requires proof sketch that it handles supported faults and examples showing failure with unsupported fault types)
