
Block 3: Concurrency 2 Unit 1: IPC in Non-shared Memory (I)








  1. Block 3: Concurrency 2 Unit 1: IPC in Non-shared Memory (I) This unit introduces inter-process communication (IPC) in systems with non-shared memory and, in particular, explores some of the mechanisms used to achieve distributed IPC.

  2. IPC with no shared memory
Two approaches to sharing information in such systems are:
• Data is passed from process to process.
• The common information is managed by a process. In this case the managing process carries out operations on the data it encapsulates on request from other processes. This is called the client-server model.

  3. Synchronized byte stream (or pipe): one process sends a stream of bytes which the receiver reads. If an attempt is made to read more bytes than have yet been written to the stream, the receiver is blocked until more bytes are available. The main problem is that a byte stream is unstructured: no information is transmitted about, for example, the type(s) of data being transferred.
Message passing: transfers information in the form of typed arguments, as on a procedure call or method invocation. A message is a collection of data constructed with a header, indicating the destination of the message, and a body, containing a collection of typed arguments.

  4. Asynchronous message passing: a process sends a message with no regard for whether the recipient is in a position to accept it. The message must therefore be stored somewhere, i.e. buffered, until the recipient can read it.
Synchronous message passing: the two processes synchronize before the message is transferred (i.e. the sender sends only when the receiver is ready to receive). There is therefore no need to buffer the message; it is transferred directly from the sender to the receiver.
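The contrast between the two styles can be sketched in Java, using threads to stand in for processes. This is an illustrative sketch (the class and method names are inventions for this example): a LinkedBlockingQueue plays the role of the buffer needed for asynchronous message passing, while a SynchronousQueue has no capacity, so a send blocks until a receive is in progress, which is exactly the rendezvous of synchronous message passing.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;

public class MessagePassingDemo {

    // Asynchronous: put() returns at once; the message sits in the buffer
    // until the receiver calls take().
    static String viaBuffer(String msg) throws InterruptedException {
        BlockingQueue<String> buffer = new LinkedBlockingQueue<>();
        buffer.put(msg);          // sender continues immediately (buffered)
        return buffer.take();     // receiver reads later
    }

    // Synchronous: a SynchronousQueue has no capacity, so put() blocks until
    // another thread is simultaneously in take() -- a rendezvous, no buffering.
    static String viaRendezvous(String msg) throws Exception {
        BlockingQueue<String> channel = new SynchronousQueue<>();
        String[] received = new String[1];
        Thread receiver = new Thread(() -> {
            try {
                received[0] = channel.take();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        receiver.start();
        channel.put(msg);         // blocks until the receiver's take() is ready
        receiver.join();
        return received[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println(viaBuffer("hello"));      // prints "hello"
        System.out.println(viaRendezvous("world"));  // prints "world"
    }
}
```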

  5. • Shared-memory IPC differs from non-shared-memory IPC in that the latter relies on the passing of data (or messages) between processes, whereas the former is based on the communicating processes accessing shared data.
• Shared-memory IPC is suitable for use in simple, unprotected and multithreaded systems, whereas non-shared-memory IPC suits protected as well as distributed systems.

  6. Section 2: Distributed IPC
• The general object and client-server models
1. The general object model of a distributed system
• is one in which objects are distributed across a network of computers.
• Operations originating from any part of the system are invoked directly on objects, wherever in the system those objects are located, and are not invoked via a server.
• For this model to work, objects must be named (so that it is possible to say that an operation should be carried out on a particular object) and objects need to be located.
• This means that the underlying operating system must support the naming and location of objects.

  7. 2. The client-server model
• An object is associated with a specific server. The server offers a set of operations on the objects it manages, rather than offering the objects themselves, and the server carries out the operations on the objects.
• In this sense, the client-server model can be seen as a special case of the general object model in which data objects are managed by servers.

  8. Java's sockets and streams
• A server is a program that runs on one computer and provides useful information to another program (the client) running (usually) on another computer.
• A client-server system is normally used in connection with distributed or networked systems in which the architectural model assumes that there is a server program on one machine that can communicate with client programs on other machines.
• A port is an abstraction for a communication address within a computer and can be thought of as the place in a computer where the client and server will rendezvous.

  9. • A socket is an abstraction for each of the two sides of the connection between the client and server. It is identified by the combination of port and host name, and provides the facilities (e.g. input and output streams) to enable data to be transferred between client and server.
• In Java, a ServerSocket object listens for requests from a client at a particular port, whereas a Socket object forms part of the connection by means of which client and server communicate. Both the client and the server have a socket, and together these form the connection. Each socket provides input and output streams that enable client and server to receive data from and send data to the connection.
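The ServerSocket/Socket pairing above can be shown in a minimal sketch. For illustration only, both sides run in one program (the server on its own thread, standing in for the remote machine), the class and method names are inventions, and port 0 asks the OS for any free port:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.Socket;

public class EchoDemo {

    /** Starts a one-shot echo server, connects a client to it, returns the reply. */
    static String roundTrip(String msg) throws Exception {
        ServerSocket server = new ServerSocket(0);   // port 0: pick any free port
        int port = server.getLocalPort();

        // Server side: listen for a connection, read one line, echo it back.
        Thread serverThread = new Thread(() -> {
            try (ServerSocket s = server;
                 Socket conn = s.accept();           // one Socket per connection
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(conn.getInputStream()));
                 PrintWriter out = new PrintWriter(conn.getOutputStream(), true)) {
                out.println("echo: " + in.readLine());
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        serverThread.start();

        // Client side: a Socket identified by host name + port; its input and
        // output streams carry data to and from the connection.
        try (Socket client = new Socket("localhost", port);
             PrintWriter out = new PrintWriter(client.getOutputStream(), true);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(client.getInputStream()))) {
            out.println(msg);
            String reply = in.readLine();
            serverThread.join();
            return reply;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("hello"));  // prints "echo: hello"
    }
}
```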

  10. Unit 2: IPC in Non-shared Memory (II) RPC and Java's RMI mechanism

  11. 1. A remote procedure call (RPC)
• RPC is a call to a procedure located on a machine that is different (remote) from the machine from which the call was made.
• Message passing scheme vs. RPC: a message passing scheme may be asynchronous; an RPC is always synchronous, in the sense that the calling procedure, having invoked a procedure on a remote machine, waits for the result before continuing with its processing.

  12. A request-reply-acknowledge (RRA) protocol is assumed. In RRA:
• the caller process (client) sends a request message to the receiver process;
• the called system (server) sends a reply, which also acts as an acknowledgement that the original request was received;
• the client, on receipt of the reply, sends an acknowledgement to confirm receipt of the reply.
An alternative is request-acknowledge-reply-acknowledge (RARA), which is similar to RRA except that the called system also sends an explicit acknowledgement of the receipt of the original request.

  13. The RPC protocol with network or server congestion
• The timer is used by the RPC service of the client to detect the possibility of network congestion or of a network or receiver failure. If a reply is not received within a set time, the client assumes that a problem has arisen: either the original request failed to get through or the reply was lost. In such circumstances, the client may re-send the request.
• If the original request did get through to a server which did respond, the RPC identifier can be used by the RPC service of the server to detect that it has already responded.
• In principle, in the face of failure or congestion, the client and server RPC services can repeat the sending of messages, replies and acknowledgements to ensure that the client does receive a reply to its request. This is known as exactly-once RPC semantics.
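The server-side use of the RPC identifier can be sketched as a reply cache. This is a simplified illustration (the class, the method names and the string "reply" format are inventions, not a real RPC library): when a retransmitted request arrives with an identifier the server has already seen, it returns the cached reply instead of executing the procedure again.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of duplicate detection in a server's RPC service using RPC ids.
public class RpcServer {

    private final Map<Long, String> replyCache = new HashMap<>(); // rpcId -> reply
    private int executions = 0;  // how many times the procedure actually ran

    /** Handles a request; a retransmitted rpcId gets the cached reply back. */
    String handle(long rpcId, String request) {
        return replyCache.computeIfAbsent(rpcId, id -> {
            executions++;                       // the real procedure runs only once
            return "result of " + request;
        });
    }

    int executions() {
        return executions;
    }
}
```

A client that times out and re-sends the same request (same RPC id) therefore still observes a single execution, which is the behaviour the duplicate-detection step above is meant to guarantee.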

  14. Client failure • An orphan is a remote procedure call whose client (i.e. the node making the call) crashes after the request has been sent. That is, there has been client failure with the result that the client is not able to deal with any reply generated by the orphan.

  15. Server failure
The server may fail before the call is received or at some point during the call; in all cases the client's timeout will expire:
• after the RPC service receives the call but before the call to the remote procedure is made;
• during the remote procedure invocation, C;
• after the remote procedure invocation but before the result is sent, D.
In all cases the client might repeat the call when the server restarts. In cases C and D this could cause problems, since the server could have made permanent state changes before crashing, resulting in inconsistent state.

  16. RMI system

  17. Block 3: Concurrency 2
Unit 3: Composite actions
• Earlier units discussed concurrency issues associated with single operations on a single data abstraction. The first aim of this unit is to study how such operations can be made atomic, particularly in order to deal with crashes.
• The second aim of the unit is to study the nature of composite operations (i.e. operations that are composed of a number of single operations) and the problems raised by their concurrent execution. A particular problem is that of deadlock.

  18. A model of a crash A fail–stop model of a crash assumes that a crash happens at some instant of time rather than over a period of time. It results in the loss of volatile state (process registers, cache and main memory). Any changes that have been made to persistent state, such as disk, are assumed to be correct, but may be incomplete.

  19. Idempotent (repeatable) operations
• An idempotent operation is an operation that can be repeated without causing any errors or inconsistencies.
• That is, there is no discernible difference whether the operation is carried out once or many times.
• An example of an idempotent operation is x = 3: no matter how many times the operation is performed, x will always be 3.
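The x = 3 example, and its non-idempotent counterpart, can be written out in Java (the class and method names are illustrative). Repeating the assignment is harmless, which is why idempotent operations are safe to replay after a crash; repeating the increment changes the state each time:

```java
public class IdempotenceDemo {

    static int x = 0;

    static void assign()    { x = 3; }      // idempotent: same state however often it runs
    static void increment() { x = x + 1; }  // not idempotent: each repeat changes the state

    public static void main(String[] args) {
        assign(); assign(); assign();
        System.out.println(x);  // 3: safe to repeat, e.g. when replaying after a crash
        increment(); increment();
        System.out.println(x);  // 5: blindly repeating this after a crash corrupts the state
    }
}
```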

  20. Atomic operations on persistent objects
• An atomic operation invocation is one for which:
• if it terminates normally, all its effects are made permanent; otherwise it has no effect at all;
• if it accesses a shared data object, it does not interfere with other operation invocations on the same data object.
• A transaction is an atomic operation invocation in a Transaction Processing (TP) system (e.g. banking and airline booking systems).
• An atomic operation that completes successfully is said to commit, and its effects are guaranteed to be permanent. If it does not complete, it aborts and all of its effects must be undone.

  21. Implementation of atomic operations: logging
• Logging is the process of recording the old data value, the new data value and an identifier of the transaction, so that the persistent store can be rolled back to its previous state.
• The information in the log is used to roll back the persistent store to its state at the start of the transaction in which the crash occurred, by using the old values recorded in the log.
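The rollback-from-old-values idea can be sketched as a tiny undo log. This is an illustrative sketch, not a real TP system: the class names are inventions, a HashMap stands in for the persistent store, and each log record keeps the transaction id, the key and the old value so that rollback can reapply old values most recent first.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Sketch of undo logging for rolling back an uncommitted transaction.
public class UndoLog {

    static class Entry {                 // one log record: value before a write
        final String txId, key;
        final int oldValue;
        Entry(String txId, String key, int oldValue) {
            this.txId = txId; this.key = key; this.oldValue = oldValue;
        }
    }

    final Map<String, Integer> store = new HashMap<>();  // stands in for persistent store
    final Deque<Entry> log = new ArrayDeque<>();         // most recent entry on top

    /** Record the old value in the log, then update the store. */
    void write(String txId, String key, int newValue) {
        log.push(new Entry(txId, key, store.getOrDefault(key, 0)));
        store.put(key, newValue);
    }

    /** Roll back after a crash: reapply old values, most recent first. */
    void rollback(String txId) {
        while (!log.isEmpty() && log.peek().txId.equals(txId)) {
            Entry e = log.pop();
            store.put(e.key, e.oldValue);
        }
    }
}
```

A real log would itself be on persistent storage and written before the store is updated (write-ahead); this sketch only shows the rollback arithmetic.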

  22. Shadowing
• Shadowing is where the results of a transaction are built up in a structure that mirrors part of the persistent store, but the persistent store is not updated until the transaction commits.
• Once the transaction has committed, the shadow structure replaces the relevant part of the persistent store.

  23. Why we need concurrency control: potential problems from the interleaving of transactions without concurrency control. In the schedule below, T1's write at t5 overwrites T2's update, so T2's deposit of 100 is lost.

Time  T1                   T2                   balx
t1                         begin_transaction    100
t2    begin_transaction    read(balx)           100
t3    read(balx)           balx = balx + 100    100
t4    balx = balx - 10     write(balx)          200
t5    write(balx)          commit               90
t6    commit                                    90

  24. Deadlock, livelock and starvation
• Deadlock is when a process (along with one or more other processes) cannot proceed because it is blocked, waiting for a condition to become true that will never become true (e.g. a resource that will never become free).
• Livelock is where a process is executing a loop, continually testing for a condition to become true that will never be true (i.e. busy waiting on a condition that can never become true).

  25. Starvation is where a process is never scheduled to run by the operating system’s scheduler, perhaps because it has large resource requirements that the scheduler is never in a position to allocate, or because the priorities of other processes are such that the process is never chosen.

  26. Deadlock example: T9 and T10 each hold a write lock the other needs, so both wait forever.

Time  T9                   T10
t1    begin_transaction
t2    write_lock(balx)     begin_transaction
t3    read(balx)           write_lock(baly)
t4    balx = balx - 10     read(baly)
t5    write(balx)          baly = baly + 100
t6    write_lock(baly)     write(baly)
t7    WAIT                 write_lock(balx)
t8    WAIT                 WAIT
t9    WAIT                 WAIT
t10   WAIT                 WAIT
t11   :                    :

  27. Object allocation graphs to detect deadlock
• If a cycle exists in such a graph, and there is only one object of each of the object types involved in the cycle, then deadlock exists.
• Timeout: a simple alternative to deadlock detection is to abort transactions with timed-out lock requests. (A timeout does not necessarily mean that deadlock has occurred, just that a transaction has spent a relatively long time waiting for a lock.)
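With one object per object type, the graph check reduces to finding a cycle among the transactions: an edge means "waits for a lock held by". A minimal sketch of that cycle test (class and method names are illustrative, using depth-first search with a back-edge check):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch: deadlock detection as cycle detection in a wait-for graph.
public class WaitForGraph {

    private final Map<String, Set<String>> waitsFor = new HashMap<>();

    /** Record that "from" waits for a lock held by "to". */
    void addEdge(String from, String to) {
        waitsFor.computeIfAbsent(from, k -> new HashSet<>()).add(to);
    }

    /** True if the graph contains a cycle, i.e. a deadlock exists. */
    boolean hasCycle() {
        Set<String> visiting = new HashSet<>(), done = new HashSet<>();
        for (String node : waitsFor.keySet())
            if (dfs(node, visiting, done)) return true;
        return false;
    }

    private boolean dfs(String node, Set<String> visiting, Set<String> done) {
        if (visiting.contains(node)) return true;   // back edge: a cycle
        if (done.contains(node)) return false;      // already fully explored
        visiting.add(node);
        for (String next : waitsFor.getOrDefault(node, Set.of()))
            if (dfs(next, visiting, done)) return true;
        visiting.remove(node);
        done.add(node);
        return false;
    }
}
```

For the T9/T10 example above, adding the edge T9 -> T10 alone gives no cycle; adding T10 -> T9 as well closes the cycle and the check reports deadlock.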

  28. Deadlock in distributed systems
1. Single lock manager
• The system maintains a single lock manager that resides at a single chosen site, say Si.
• When a transaction needs to lock a data item, it sends a lock request to Si, and the lock manager determines whether the lock can be granted.

  29. 2. Distributed deadlock handling
[Figure: each site (S1, S2) maintains a local wait-for graph (WFG) for its own transactions (T1, T2); a coordinator site combines the local WFGs over the network into a global WFG, in which a cycle such as T1 -> T2 -> T1 indicates a distributed deadlock.]

  30. Unit 4

  31. Properties of transactions
Four basic (ACID) properties of a transaction are:
• Atomicity: the 'all or nothing' property.
• Consistency: must transform the database from one consistent state to another.
• Isolation: partial effects of incomplete transactions should not be visible to other transactions.
• Durability: effects of a committed transaction are permanent and must not be lost because of later failure.

  32. Two-Phase Locking (2PL)
A transaction follows the 2PL protocol if all of its locking operations precede its first unlock operation; transactions that all follow 2PL are guaranteed to interleave correctly (serializably).
• Two phases for a transaction:
• Growing phase: acquires locks but cannot release any locks.
• Shrinking phase: releases locks but cannot acquire any new locks.
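The two-phase discipline for a single transaction can be sketched as a small guard object (an illustrative sketch with invented names, not a real lock manager; it only enforces the phase rule, not mutual exclusion between transactions). The first unlock flips the transaction into its shrinking phase, after which any further lock request is a protocol violation:

```java
import java.util.HashSet;
import java.util.Set;

// Sketch: enforcing the growing/shrinking phase rule of 2PL for one transaction.
public class TwoPhaseLocking {

    private final Set<String> held = new HashSet<>();
    private boolean shrinking = false;  // set permanently by the first unlock

    void lock(String item) {
        if (shrinking)
            throw new IllegalStateException("2PL violated: lock after first unlock");
        held.add(item);     // growing phase: acquiring is allowed
    }

    void unlock(String item) {
        shrinking = true;   // shrinking phase begins; no new locks from now on
        held.remove(item);
    }
}
```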

  33. Using two-phase locking to implement concurrency control:

Time  T1                     T2                     balx
t1                           begin_transaction      100
t2    begin_transaction      write_lock(balx)       100
t3    write_lock(balx)       read(balx)             100
t4    WAIT                   balx = balx + 100      100
t5    WAIT                   write(balx)            200
t6    WAIT                   commit/unlock(balx)    200
t7    read(balx)                                    200
t8    balx = balx - 10                              200
t9    write(balx)                                   190
t10   commit/unlock(balx)                           190

  34. Strict two-phase locking
In strict 2PL, all locks are released only on commit. This is a safe procedure since it means that the effects of the transaction are not visible to other transactions before the transaction is committed (the property of isolation).
