EEC-681/781 Distributed Computing Systems

EEC-681/781Distributed Computing Systems Lecture 10 Wenbing Zhao wenbing@ieee.org Cleveland State University

Outline • Clock Synchronization issues • Clock Synchronization Algorithms • Centralized • Distributed • Event ordering and logical clocks • Due date for project progress report • 11/20 Monday mid-night • No extension! EEC-681: Distributed Computing Systems

Motivation for Clock Synchronization • In everyday life, we are relying on clocks to coordinate our activities • For example, in EEC681, we meet every Monday and Wednesday between 6-7:50pm • In computer systems, it is also convenient to use clock as a way to coordinate different activities • For example, the “make” program relies on files’ timestamp to decide if a recompilation is necessary EEC-681: Distributed Computing Systems

Motivation for Clock Synchronization • Stock market buy and sell orders • Secure document timestamps (with cryptographic certification) • Aviation traffic control and position reporting • Radio and TV programming launch and monitoring • Intruder detection, location and reporting • Multimedia synchronization for real-time teleconferencing • Network monitoring, measurement and control • Differentiated services traffic engineering

Universal Coordinated Time • Universal Coordinated Time (UTC): • Based on the number of transitions per second of the cesium 133 atom (pretty accurate) • At present, the real time is taken as the average of some 50 cesium-clocks around the world • Introduces a leap second from time to time to compensate that days are getting longer • UTC is broadcastthrough short wave radio and satellite. • Satellites can give an accuracy of about 0.5 ms EEC-681: Distributed Computing Systems

Physical Clocks in Computer Systems • Every machine has a timer that generates an interrupt H times per second • There is a clock in machine p that ticks on each timer interrupt • Denote the value of that clock by Cp(t), where t is UTC time • Ideally, for each machine p, Cp(t)= t, or, in other words, dC/dt = 1 EEC-681: Distributed Computing Systems

Clock Time and UTC • The relation between clock time and UTC when clocks tick at different rates • In practice: 1 -  < dC/dt < 1 +  • Maximum drift rate:  Clock time, C EEC-681: Distributed Computing Systems

Terms • Clock drift rate (or clock accuracy) • The amount of deviation from UTC per unit of time (a day or a week, etc.) • Clock precision: • Resolution of the clock, e.g., 1ms, or 1ns • Clock skew • The difference in time values of two clocks is called clock skew • Maximum clock skew of a group of clocks is determined by the two clocks that have the largest clock difference EEC-681: Distributed Computing Systems

Clock Synchronization • Two clocks are said to be synchronized at a particular instance of time if the clock skew of the two clocks is less than some specified constant δ • A set of clocks are said to be synchronized if the clock skew of any two clocks in this set is less than δ EEC-681: Distributed Computing Systems

Clock Synchronization Issues • A distributed system requires: • External Synchronization • Synchronize with an external time source • Internal Synchronization • Clocks within the same network synchronize with each other • Clock Synchronization requires: • Each node can read the other nodes’ clock values • Must consider unpredicted communication delay • Time must never run backward • Smooth adjustments => must maintain the order of the events EEC-681: Distributed Computing Systems

Message Propagation Time • Estimate of message propagation time: (T1-T0-I)/2 EEC-681: Distributed Computing Systems

T2 Server T3 x q0 T1 Client T4 Message Delay Distribution EEC-681: Distributed Computing Systems

Clock Synchronization Algorithms • Centralized • Passive Time Server Centralized Algorithm • Cristian’s algorithm • Active Time Server Centralized Algorithm • Berkeley Algorithm • Distributed • Global Averaging Distributed Algorithms • Localized Averaging Distributed Algorithms EEC-681: Distributed Computing Systems

Passive Timer Server • Each node periodically sends a message (time=?) to the time server at the current local clock time, T0 • The server responds with a message (time = T), T is the current time of the server • The client receives the message at the local clock time T1, and adjusts its local clock time to T+(T1-T0-I)/2 • The time taken by the server to handle the request message is I, T+(T1-T0-I)/2 • Several measurements of T1-T0, discard the unreliable ones EEC-681: Distributed Computing Systems

Active Time Server • The time server periodically broadcasts its clock time (T) • Other nodes receive the message to correct their own clocks • Each node has the knowledge of the approximate time (Ta) required for the propagation of the message, T+Ta • Each nodes replies with the local clock time • The server • Knows the approximate propagation time from each node • Takes fault-tolerant average of clock values as current time • The server adjusts it own, and sends the amount by which each node clock requires adjustment to each node EEC-681: Distributed Computing Systems

Centralized Algorithms – Drawbacks • Single-point failure • Scalability EEC-681: Distributed Computing Systems

Global Averaging Distributed Algorithms • Each node broadcasts its local clock time periodically • Each node waits for time T • The node collects the messages broadcast by other nodes • For each message received, the node keeps the local time • At the end of T, the node estimates the skew of its clock with respect to each of the other nodes on the basis of the times at which it received • The node computes a fault-tolerant average of the estimated skews and uses it to adjust its local clock EEC-681: Distributed Computing Systems

Localized Averaging Distributed Algorithms • Each node exchanges its clock time with its neighbors • Then sets its clock time to the average of its own clock and the clock times of its neighbors EEC-681: Distributed Computing Systems

Exercise • Consider the behavior of two machines in a distributed system. Both have clocks that are supposed to tick 1000 times per millisecond. One of them actually does, but the other ticks only 990 times per millisecond. If UTC updates come in once a minute, what is the maximum clock skew that will occur? EEC-681: Distributed Computing Systems

Event Ordering • “Time, Clocks, and the Ordering of Events in a Distributed System”, by Leslie Lamport, Communications of the ACM, July 1978, Volume 21, Number 7, pp.558-565 • He showed that it is possible to synchronize all the clocks to produce a single, unambiguous time standard • He pointed out the clock synchronization need not to be absolute • What usually matters is not that all processes agree on exactly what time it is, but rather, that they agree on the order in which events occur EEC-681: Distributed Computing Systems

Happens-Before Relation • With perfectly accurate physical time • An event a happened before an event b if a happened at an earlier time than b • Without using the physical clocks • Assume that the system is composed of a collection of processes, each process consists of a sequence of events • The events of a process form a sequence, where a occurs before b in this sequence if a happens before b • Assume sending and receiving a message is an event in a process EEC-681: Distributed Computing Systems

Happens-before Relation • “Happens-before” relation, denoted by “→”, is defined as follows: • The relation “→” on the set of events of a system is the relation satisfying the following three conditions: • If a and b are events in the same process, and a comes before b, then a →b • If a is the sending of a message by one process and b is the receipt of the same message by another process, then a →b • If a →b and b →c, then a →c • Event acausally affects event b EEC-681: Distributed Computing Systems

Partial Ordering • Two distinct events a and b are said to be concurrent if a →b and b →a • Neither event can causally affect the other • This introduces a partial ordering of events in a system with concurrently operating processes EEC-681: Distributed Computing Systems

Logical Clocks • Logical clocks: Use the clock just as a way of assigning a number to an event, where the number is the time at which the event occurs • Define a clock Ci for each process Pi • Assigns a number Ci(a) to any event a in that process • The entire system of clocks is represented by the function C which assigns to any event b the number C(b), where C(b) =Cj(b) if b is an event in process Pj • The clocks Ci are logical clocks rather than physical clocks EEC-681: Distributed Computing Systems

Implementation of Logical Clocks • The logical clocks is correct if the events of the system that are related to each other by the happens-beforerelation can be properly ordered using these clocks • Clock condition: • For any event a, b, if ab then C(a) <C(b) EEC-681: Distributed Computing Systems

Implementation of Logical Clocks • According to our definition of the happens-beforerelation, the clock condition is satisfied if the following two conditions hold: • C1: if a and b are events in process Pi, and a comes before b, then Ci(a) < Ci(b) • C2: if a is the sending of a message by process Pi and b is the receipt of that message by process Pj, then Ci(a) < Cj(b) EEC-681: Distributed Computing Systems

Implementation of Logical Clock • To meet C1: • Each process Pi increments Ci between any two successive events • To meet C2: • (a) if event a is the sending of a message m by process Pi, then the message m contains a timestamp Tm = Ci(a). • (b) Upon receiving a message m, process Pj sets Cj greater than or equal to its present value and greater than Tm EEC-681: Distributed Computing Systems

Implementation of Logical Clocks by Counters • A Lamport logical clock is a monotonically increasing software counter • Each process Pi keeps its own logical clock Ci which is used to apply Lamport timestamps to events • To capture the happens-before relation→, processes update their logical clocks and transmit the values of their logical clocks in messages as follows: • Before each event at Pi: Ci := Ci+1 • When Pi sends a message m, it piggybacks t = Ci • When Pj receives (m,t): Cj := max(Cj,t) + 1 • e→e’ => C(e) < C(e’) EEC-681: Distributed Computing Systems

Implementation of Logical Clocks by Counters Question: Which two events are concurrent? EEC-681: Distributed Computing Systems

Total Ordering of Events • We can use the logical clocks satisfying the Clock Condition to place a total ordering on the set of all system events • Simply order the events by the times at which occur • To break the ties, Lamport proposed the use of any arbitrary total ordering of the processes, i.e. process id EEC-681: Distributed Computing Systems

Total Ordering of Events • Using this method, we can assign a unique timestamp to each event in a distributed system to provide a total ordering of all events • Very useful in distributed system • Solving the mutual exclusion problem • Totally ordered reliable multicast => needed to build fault tolerant systems EEC-681: Distributed Computing Systems

EEC-681/781 Distributed Computing Systems

EEC-681/781 Distributed Computing Systems

Presentation Transcript

Distributed Computing Systems

Distributed Systems

Distributed computing

ACM SIGACT News Distributed Computing Column 9

EEC-681/781 Distributed Computing Systems

Chapter 18 – Distributed software engineering

COS 497 - Cloud Computing 2. Distributed Computing

Distributed Systems Meet Economics: Pricing in Cloud Computing

Distributed Systems and Architectures

Distributed Systems and Architectures

Distributed systems

Distributed Computing and Analysis

Distributed Computing

Distributed systems II AGREEMENT (2-3 phase CoM. )

Architecture of Cloud Computing and Distributed Database Systems

DISTRIBUTED COMPUTING

Chapter 18 – Distributed software engineering

Mobile Computing – A Distributed Systems Perspective

Java Distributed Computing

Distributed systems II Closing

1. Introduction II

COMP 734 -- Distributed File Systems