1 / 22

Synchronization

Synchronization. A02 marks posted make sure you use proper attribution. Midterm - to discuss on Wednesday Project logistics P01 marks posted P02, P03 posted last week Timeline posted P02 due Thursday next week. . Summary so far …. A distributed system is:

donatella
Télécharger la présentation

Synchronization

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Synchronization

  2. A02 marks posted • make sure you use proper attribution. • Midterm - to discuss on Wednesday • Project logistics • P01 marks posted • P02, P03 posted last week • Timeline posted • P02 due Thursday next week.

  3. Summary so far … A distributed system is: • a collection of independent computers that appears to its users as a single coherent system Components need to: • Communicate • Point to point: sockets, RPC/RMI • Point to multipoint: multicast, epidemic • Cooperate • Naming to enable some resource sharing • Naming systems for flat (unstructured) namespaces: consistent hashing, DHTs • Naming systems for structured namespaces • Synchronization

  4. Synchronization to support coordination • Examples • Distributed make • Printer sharing • Monitoring of a real world system • Agreement on message ordering • Why is synchronization more complex than in a single-box system • No global views, multiple clocks, failures

  5. An example … Two shooters in a multiplayer online game. Both shoot (accurately) towards the same target. Need to decide who gets the point. Generally implemented using replicated state Object A is observed from two vantage points S1 and S2 at local times t1 and t2. Need to aggregate everything into one consistent view.

  6. Roadmap • Physical clocks • Provide actual / real time • ‘Logical clocks’ • Where only ordering of events matters • Leader election • How do I choose a coordinator? • Mutual exclusion • How does one implement critical regions

  7. Physical clocks (I) • Problem: How to achieve agreement on time in a distributed system? • A possible solution: useUniversal Coordinated Time (UTC): • Based on the number of transitions per second of the cesium 133 atom (pretty accurate). • At present, the real time is taken as the average of some 50 cesium-clocks around the world. • Introduces a leap second from time to time to compensate for days getting longer. • UTC is broadcastthrough short wave radio and satellite. • Accuracy ± 1ms (but if weather conditions considered ±10ms)

  8. Physical clocks - underlying model Suppose we have a distributed system with a UTC-receiver somewhere in it. Problem: we still have to distribute time to each machine. Internal mechanism at each node • Each machine has a timer • Timer causes an interruptH times a second • Interrupt handler adds 1 to a software clock • Software clock keeps track of the number of ticks since agreed-upon time in the past.

  9. Physical clocks – main problem: clock drift Notation: Value of clock on machine p at real time t is Cp(t) Ideally: Cp(t) == t and dCp(t) = dt Real world: clockdrift, i.e., |Cp(t) - t | > 0 Maximum drift rate (ρ)guaranteed by manufacturer 1 - ρ ≤ (dC/dt) ≤ 1 + ρ Goal:Never let clocks in any two nodes in the system differ by more than x time units Solution: synchronize at least every x/(2ρ) seconds.

  10. Building a complete system … • Option I: Every machine asks a time server for the accurate time at least once every x/(2ρ) seconds (Network Time Protocol). • Need to account for network delays, including interrupt handling and processing of messages. • Client updates time to? Tnew=CUTC+(T1-T0)/2 • Fundamental:You’ll have to take into account that setting the time back is never allowed  smooth adjustments. • Option II: Let the time server scan all machines periodically, calculate an average, and inform each machine how it should adjust its time relative to its present time. • Note: you don’t even need to propagate UTC time.

  11. Real world: Network Time Protocol (NTP) • Stratum 0 NTP servers – receive time from external sources (cesium clocks, GPS, radio broadcasts) • Stratum N+1 servers synchronize with stratum N servers and between themselves • Self-configuring network • User configured to contact local NTP server • Survey (N. Minar’99) • > 175K NTP servers • 90% of the NTP servers have <100ms offset fro synchronization peer • 99% are synchronized within 1s

  12. Uses of (synchronized) physical clocks in the real world • NTP • Global Positioning Systems • Using physical clocks to implement at-most-once semantics

  13. [Summary] Physical clocks • Clock drift: time difference between two clocks • Sources of errors (drift) • Variability in time to propagate radio signals. Variability. (±10ms) • Clocks are not perfect: Drift rates • Network latencies are not symmetric • Differences in speed to process messages • System design to limit drift • One node holds the ‘true’ time • Other nodes contact this node periodically and adjust their clocks • How often? • How exactly the adjustment is done?

  14. We’ve established that clocks can not be perfectly synchronized. • What can I do in these conditions? • Estimate out how large the drift is • Example: GPS systems • Design the system to take drift into account • Example: Server design to provide at-most-once semantics • Give up physical clocks! • Consider only event order - Logical clocks

  15. GPS – Global Positioning Systems (1) • Basic idea: Estimate signal propagation time between satellite and receiver to estimate distance to satelite • Strawman: Assume that the clocks of the satellites and receiver are accurate and synchronized: • Real world: The receiver’s clock is definitely out of synch with the satellite

  16. GPS – Global Positioning Systems (2) • Xr, Yr, Zr, are unknown coordinates of the receiver. • Ti is the timestamp on a message from satellite i • ∆Ii = (Tnow – Ti) is the measured propagation delay of the message sent by satellite i. • Distance to satellite i can be estimated in two ways • Propagation time: di = c x ∆Ii • Real distance: • 3 satellites 3 equations in 3 unknowns • So far I assumed receiver clock is synchronized! • What if it needs to be adjusted? Treal=Tnow+ ∆r • ∆I = (Tnow + ∆r ) – Ti • Collect one more measurement from one more satellite!

  17. Computing position in wired networks Observation: a node P needs at least k + 1 landmarksto compute its own position in a k-dimensional space. Consider two-dimensional case: Solution:P needs to solve three equations in two unknowns (xP,yP):

  18. We’ve established that clocks can not be perfectly synchronized. • What can I do in these conditions? • Estimate out how large the drift is • Example: GPS systems • Design the system to take drift into account • Example: Server design to provide at-most-once semantics • Give up physical clocks! • Consider only event order - Logical clocks

  19. Efficient at-most-once message delivery Server needs to maintain request identifiers to avoid replayed requests Issues • 1: How long to maintain transaction data? • 2: How to deal with server failures? (Minimize state that is persistently stored)

  20. Efficient at-most-once message delivery (II) Issue1: How long to maintain transaction data? Solution: • Client: • Sends transaction id and physical timestamp • Client (or network) may resend messages for up to MaxLifeTime • Server’s goal: Discard messages with duplicate id and messages that have been generates too far in the past Mechanism • Maintains: G = Tcurrent - MaxLifeTime - MaxClockSkew • Discards messages with timestamps older than G • Ignores (or delays) message that arrive in the future • Maintains transaction data only for the interval [G..Tnow]

  21. Efficient at-most-once message delivery (III) • Issue 2: What to persistently store across server failures? • Solution: • Current Time (CT) is written to disk every ΔT • At recovery Gfailure is recomputed after a crash from saved CT • After recovery – new messages arrive • Discard messages with timestamp older than Gfailure + ΔT • Process messages with timestamp newer than Gfailure + ΔT [Quiz-like question: the two bullets above ignore clock skew. Change the formulas to consider the clock skew]

  22. Roadmap • Clocks can not be perfectly synchronized. • What can I do in these conditions? • Figure out how large exactly the drift is • Example: GPS systems • Design the system to take drift into account • Example: Server design to provide at-most-once semantics • Next: Do not use physical clocks! • Consider only event order - Logical clocks

More Related