Distributed Process Management

Distributed Process Management (Group 2)
• Team Members: Mazen Hammad, Chuck Mann, Vrushali Nidgundi, Hong Zhang
• Course: CSE 8343 Advanced Operating Systems
• Professor: Dr. Mohamed Khalil
Group 2: Hammad, Mann, Nidgundi, & Zhang, CSE 8343 Advanced Operating Systems

Distributed Process Management
A distributed system is a collection of processors that do not share memory or a clock.
Distributed process management provides mechanisms for:
• Process synchronization and communication.
• Dealing with the deadlock problem and with a variety of failures that are not encountered in a centralized system.
Overview:
• Process Migration
• Distributed Global States
• Distributed Algorithms

Process Migration
A process is not always executed at the site where it is initiated; the entire process, or parts of it, may be executed at different sites.
Motivation:
• Load Balancing: Performance can be improved if the load is balanced across nodes.
• Communications Performance: Processes that communicate intensively can be moved to the same node. If a process analyzes a file or files larger than the process itself, it may be better to move the process to the data than the other way around.

Motivation (Continued)
• Availability: Long-running processes may need to move if the machine they are running on is going down.
• Utilizing Special Capabilities: A process can be moved to a particular node to benefit from a specialized hardware or software capability.

Initiation of Migration
• Who initiates migration depends on the goal of migration.
• If the goal is load balancing, a module in the operating system that monitors load initiates the migration. The module preempts the process and signals it to migrate. It must be in contact with peer modules on other systems to decide where to migrate the process so that the load stays balanced. In this case the entire migration is transparent to the process.
• If the goal is to reach a particular resource, a process may migrate itself. In this case the process has to be aware of the distributed system.

What is Migrated
• The process must be destroyed on the source system and created on the target system.
• The process control block and any links must be moved.

Example of Process Migration (Before/After)

Migration Schemes
• Eager (All): Transfer the entire address space.
  • No trace of the process is left behind.
  • If the address space is large and the process does not need most of it, this approach may be unnecessarily expensive.
• Pre-Copy: The process continues to execute on the source node while the address space is copied.
  • Pages modified on the source during the pre-copy operation have to be copied a second time.
  • Reduces the time that a process is frozen and cannot execute during migration.

Migration Schemes (Continued)
• Eager (Dirty): Transfer only the portion of the address space that is in main memory and has been modified.
  • Any additional blocks of the virtual address space are transferred on demand.
  • The source machine is involved throughout the life of the process.

Migration Schemes (Continued)
• Copy-on-Reference: Pages are brought over only when they are referenced.
  • Variation of eager (dirty).
  • Has the lowest initial cost of process migration.
• Flushing: The pages of the process are cleared from the main memory of the source by flushing dirty pages to disk.
  • Relieves the source of holding any pages of the migrated process in main memory.
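
The pre-copy scheme can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function name `pre_copy_migrate`, the `freeze_threshold` parameter, and the idea of supplying the dirtied pages per round as input are all hypothetical and do not come from the slides.

```python
def pre_copy_migrate(pages, dirtied_per_round, freeze_threshold=1):
    """Simulate pre-copy migration.

    pages: page numbers in the address space.
    dirtied_per_round: for each copy round, the pages the still-running
        process modified while that round was being transferred.
    Returns (rounds, frozen_copy): the pages sent in each live round and
    the pages that remain to be sent while the process is frozen.
    """
    to_copy = set(pages)              # the first round sends the whole address space
    rounds = []
    for dirtied in dirtied_per_round:
        rounds.append(sorted(to_copy))   # sent while the process keeps executing
        to_copy = set(dirtied)           # modified pages must be copied a second time
        if len(to_copy) <= freeze_threshold:
            break                        # small enough: freeze and finish
    return rounds, sorted(to_copy)


rounds, frozen_copy = pre_copy_migrate([0, 1, 2, 3], [[1, 2], [2], []])
print(rounds)       # pages sent in each live round
print(frozen_copy)  # pages sent while the process is frozen
```

In this run six page transfers happen while the process is still executing and only one happens while it is frozen, which is the trade-off the scheme makes: more total copying in exchange for a shorter freeze.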

Negotiation of Migration
1. The Starter on the source system (S) decides that a process P should be migrated to a target system (D). It sends a transfer request to D's Starter.
2. If D's Starter is ready to accept the process, it sends a positive response.
3. S's Starter communicates this decision to S's kernel.
4. The kernel of S then offers to send process P to machine D. The offer includes statistics about P (age, processor and communication loads).

Negotiation of Migration (Continued)
5. If D is short of the resources described in the offer, it may reject the offer. Otherwise, the kernel on D relays the offer to its controlling Starter. The relay includes the same information received from S.
6. The Starter's decision is communicated to D.
7. D reserves the necessary resources to avoid deadlock and flow-control problems, and finally sends an acceptance to S.
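
One plausible reading of the negotiation can be traced as a sequence of messages. The sketch below is an illustration only: the function `negotiate`, its parameters, the message strings, and the two refusal paths (a negative response from D's Starter, and a rejection sent back to S's kernel) are assumptions made for the example, not details given on the slides.

```python
def negotiate(d_starter_ready, d_free, p_required, p_stats):
    """Return the ordered (sender, receiver, message) exchanges of one negotiation.

    d_free and p_required are hypothetical resource counts used to decide
    whether D is short of resources; p_stats stands for the statistics
    about P (age, processor and communication loads) carried in the offer.
    """
    trace = [("S.starter", "D.starter", "transfer request")]
    if not d_starter_ready:
        # Assumption: the slides only describe the positive case.
        trace.append(("D.starter", "S.starter", "negative response"))
        return trace
    trace.append(("D.starter", "S.starter", "positive response"))
    trace.append(("S.starter", "S.kernel", "migrate P to D"))
    trace.append(("S.kernel", "D.kernel", ("offer", p_stats)))
    if d_free < p_required:
        # D is short of resources, so it may reject the offer.
        trace.append(("D.kernel", "S.kernel", "reject"))
        return trace
    # The relay carries the same information that was received from S.
    trace.append(("D.kernel", "D.starter", ("relay", p_stats)))
    trace.append(("D.starter", "D.kernel", "starter decision"))
    # D reserves resources here, then accepts.
    trace.append(("D.kernel", "S.kernel", "acceptance"))
    return trace


stats = {"age": 12, "cpu_load": 0.4, "comm_load": 0.1}
for step in negotiate(True, d_free=8, p_required=4, p_stats=stats):
    print(step)
```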

Example of Negotiation of Process Migration

Eviction
• A system may evict a process that has been migrated to it.
• Negotiation gives the designated target machine a say in the migration decision. It may also be useful to let a machine evict a process that has already been migrated to it, so that it can maintain adequate response time.
• Sprite has this capability. In Sprite, each process appears to run on a single host throughout its lifetime; this host is known as the home node of the process.
• A process migrated to any other node becomes a foreign process there. The destination node may evict any foreign process, in which case the process is forced back to its home node.

Eviction (Continued)
The elements of the Sprite eviction mechanism are as follows:
• A monitor process at each node tracks the current load to determine when to accept a process. If the monitor detects activity, it initiates eviction of all foreign processes.
• If a process is evicted, it is sent back to its home node.
• All processes marked for eviction are suspended immediately, which frees processing power on that node at once.
• The entire address space of an evicted process is transferred to the home node.

Some Terms
• Channel: Exists between two processes if they exchange messages.
• State: Sequence of messages that have been sent and received along channels incident on the process.
• Snapshot: Records the state of a process.
• Global State: The combined state of all processes.
• Distributed Snapshot: A collection of snapshots, one for each process.

Distributed Global States
The state of a distributed system, called the global state (or global snapshot), is given by the collective state of its processes and channels.
• The operating system cannot know the current state of all processes in the distributed system.
• A process can only know the current state of the processes on its local system, through the process control blocks in memory.
• Concurrency issues such as mutual exclusion, deadlock, and starvation are also present in distributed systems.
• Remote processes only know state information that is received in messages.
• These messages represent the state at some time in the past.

Example
• A bank account is distributed over two branches.
• The total amount in the account is the sum of the amounts at each branch.
• At 3:00 PM the account balance is determined.
• Messages are sent to request the information.

Example (Continued)
• If, at the time of balance determination, the balance from branch A is in transit to branch B, the result is a false reading.
• All messages in transit must be examined at the time of observation.
• The total consists of the balance at both branches plus the amount in the message.

Example (Continued)
• Suppose the clocks at the two branches are not perfectly synchronized.
• An amount is transferred from branch A at 3:01 by A's clock.
• The amount arrives at branch B at 2:59 by B's clock.
• An observation at 3:00 then counts the amount twice.
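
The arithmetic behind the in-transit case can be written out as a short sketch. The function name `observed_total` and the sample amounts are assumptions chosen for illustration.

```python
def observed_total(balance_a, balance_b, in_transit=()):
    """Total for the account: both branch balances plus any amounts
    carried by messages that are still in transit."""
    return balance_a + balance_b + sum(in_transit)


# Hypothetical amounts: branch A has just sent its whole balance of 100
# to branch B, and the message has not arrived yet.
naive = observed_total(0, 0)             # ignores the channel: false reading
correct = observed_total(0, 0, [100])    # examines the message in transit
print(naive, correct)
```

The naive observation reports that the money has vanished, while the observation that also examines the channel recovers the true total.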

Distributed Snapshot Algorithm
• Assumes that messages are delivered in the order in which they are sent.
• Uses a control message called MARKER.
• A process (Q) starts the algorithm by recording its state and sending a MARKER on all of its outgoing channels before it sends any further messages.

Distributed Snapshot Algorithm (Continued)
• Each process (say P), upon receiving a MARKER for the first time (say from Q), performs the following:
  • P records its local state.
  • P records the state of the incoming channel from Q to P as empty.
  • P propagates the MARKER to all of its neighbors along all outgoing channels.
• If P later receives a MARKER on another incoming channel, it records the state of that channel as the messages received on it since P recorded its local state.
• The algorithm terminates once every process has received a MARKER along all of its incoming channels.
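
The marker rules can be sketched as a small single-threaded simulation that reuses the bank-account setting from the earlier example. Everything named here (`Process`, `Network`, `start_snapshot`, `deliver`, `snapshot_total`) is a hypothetical construction for illustration; FIFO channels are modelled with queues, and messages are delivered by explicit calls rather than by a real network.

```python
from collections import deque

MARKER = "MARKER"


class Process:
    def __init__(self, name, balance):
        self.name = name
        self.balance = balance     # local state: an account balance
        self.recorded = None       # local state captured by the snapshot
        self.channel_state = {}    # sender -> messages recorded as in transit
        self.recording = set()     # incoming channels still being recorded


class Network:
    def __init__(self, balances):
        self.procs = {n: Process(n, b) for n, b in balances.items()}
        # One FIFO channel for each ordered pair of distinct processes.
        self.chan = {(s, d): deque()
                     for s in self.procs for d in self.procs if s != d}

    def send(self, src, dst, amount):
        self.procs[src].balance -= amount
        self.chan[(src, dst)].append(amount)

    def _record(self, p):
        """Record p's local state and send a MARKER on every outgoing channel."""
        p.recorded = p.balance
        incoming = [s for (s, d) in self.chan if d == p.name]
        p.channel_state = {s: [] for s in incoming}
        p.recording = set(incoming)
        for (s, d), queue in self.chan.items():
            if s == p.name:
                queue.append(MARKER)

    def start_snapshot(self, name):
        self._record(self.procs[name])

    def deliver(self, src, dst):
        """Deliver the oldest message on channel src -> dst."""
        msg = self.chan[(src, dst)].popleft()
        p = self.procs[dst]
        if msg == MARKER:
            if p.recorded is None:
                self._record(p)        # first MARKER: record local state
            p.recording.discard(src)   # stop recording this channel
        else:
            p.balance += msg
            if src in p.recording:     # arrived after recording, before MARKER
                p.channel_state[src].append(msg)


def snapshot_total(net):
    """Sum of recorded balances plus amounts recorded as in transit."""
    return sum(p.recorded + sum(sum(m) for m in p.channel_state.values())
               for p in net.procs.values())


net = Network({"A": 100, "B": 50})
net.send("A", "B", 30)      # 30 is now in transit from A to B
net.start_snapshot("B")     # B records 50 and sends a MARKER to A
net.deliver("A", "B")       # B receives the 30 and records it on channel A -> B
net.deliver("B", "A")       # A sees its first MARKER: records 70, sends MARKER
net.deliver("A", "B")       # B receives A's MARKER: channel A -> B is closed
print(snapshot_total(net))
```

In this run the recorded balances (70 and 50) alone miss the transfer, and the recorded channel state supplies the remaining 30, so the snapshot total matches the real total of 150.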

Distributed Mutual Exclusion
• The problem of mutual exclusion arises in distributed systems whenever several sites access shared resources concurrently.
• Mutual exclusion must be enforced: only one process at a time is allowed in its critical section.
• A process that halts in its non-critical section must do so without interfering with other processes.
• It must not be possible for a process requiring access to a critical section to be delayed indefinitely: no deadlock or starvation.

Distributed Mutual Exclusion (Continued)
• When no process is in a critical section, any process that requests entry to its critical section must be permitted to enter without delay.
• No assumptions are made about relative process speeds or the number of processors.
• A process remains inside its critical section for a finite time only.

Centralized Algorithm for Mutual Exclusion
• One node is designated as the control node.
• This node controls access to all shared objects.
• If the control node fails, mutual exclusion breaks down.
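
A control node of this kind can be sketched as a single object that grants access to one requester at a time and queues the rest. The class `ControlNode`, its method names, and the first-come-first-served queue are assumptions made for the example; the slides do not specify a queueing policy.

```python
from collections import deque


class ControlNode:
    """Hypothetical control node guarding a single shared object."""

    def __init__(self):
        self.holder = None        # process currently in its critical section
        self.waiting = deque()    # processes waiting, in arrival order

    def request(self, pid):
        """Return True if access is granted now, False if pid must wait."""
        if self.holder is None:
            self.holder = pid
            return True
        self.waiting.append(pid)
        return False

    def release(self, pid):
        """Release access and return the next holder (or None)."""
        if self.holder != pid:
            raise ValueError(f"{pid} does not hold the resource")
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder


node = ControlNode()
print(node.request("P1"))   # granted
print(node.request("P2"))   # must wait
print(node.release("P1"))   # access passes to the next waiting process
```

Because every request and release passes through this one object, losing it loses both the holder and the waiting queue, which is the single point of failure the slide points out.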

Distributed Algorithm
• On average, all nodes have an equal amount of information.
• Each node has only a partial picture of the entire system and must base its decisions on that.
• All nodes bear equal responsibility for the final decision.
• All nodes expend equal effort in effecting a decision.
• Failure of a node does not bring down the whole system.
• There is no system-wide common clock with which to regulate the timing of events.

Time-Stamping
• Each system on the network maintains a counter that functions as a clock.
• Each site has a numeric identifier.
• When a message is received, the receiving system sets its counter to one more than the maximum of its current value and the incoming time-stamp (counter).
• If two messages have the same time-stamp, they are ordered by the numbers of their sites.
• For this method to work, each message is sent from one process to all other processes.
• Ensures that all sites have the same ordering of messages.
• For mutual exclusion and deadlock, all processes must be aware of the situation.
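
The counter rules can be sketched in a few lines. The class `Site` and its method names are hypothetical. The receive rule follows the slide; incrementing the counter before each send is the usual companion rule in Lamport-style time-stamping and is an assumption here, since the slide does not state it.

```python
class Site:
    """Hypothetical site with a numeric identifier and a counter used as a clock."""

    def __init__(self, site_id):
        self.site_id = site_id
        self.counter = 0

    def send(self):
        """Stamp an outgoing message with (time-stamp, site number)."""
        self.counter += 1      # assumption: increment before sending
        return (self.counter, self.site_id)

    def receive(self, stamp):
        """Set the counter to one more than max(own counter, incoming time-stamp)."""
        incoming, _sender = stamp
        self.counter = 1 + max(self.counter, incoming)


s1, s2 = Site(1), Site(2)
a = s1.send()       # stamped by site 1
b = s2.send()       # same time-stamp as a, but from site 2
s2.receive(a)
s1.receive(b)
c = s2.send()       # stamped after s2 has seen a
# Tuples compare by time-stamp first and site number second,
# which is exactly the tie-breaking rule above.
print(sorted([c, b, a]))
```

Sorting the stamps gives every site the same total order of messages, provided every site eventually sees every message.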

Distributed Deadlock
• Deadlock handling is more complicated in distributed systems.
• No node has accurate knowledge of the current state of the overall system.
• Message transfer between processes involves an unpredictable delay.
Two Types of Deadlock:
• Resource allocation.
• Communication of messages.

Deadlock in Resource Allocation
Deadlock exists only if all of the following conditions hold:
• Mutual exclusion.
• Hold and wait.
• No preemption.
• Circular wait.

Deadlock Prevention
• The circular-wait condition can be prevented by defining a linear ordering of resource types.
• The hold-and-wait condition can be prevented by requiring that a process request all of its required resources at one time, and blocking the process until all requests can be granted simultaneously.
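
The linear-ordering idea can be illustrated on a single machine with ordinary locks. This is a sketch only: the class `OrderedLocks`, the resource names, and the use of Python threads to stand in for processes are assumptions, and a real distributed system would need the same ordering to be agreed across sites.

```python
import threading


class OrderedLocks:
    """Hypothetical helper that always acquires resources in one fixed order."""

    def __init__(self, names):
        # The position in `names` defines the linear ordering of resource types.
        self.rank = {name: i for i, name in enumerate(names)}
        self.locks = {name: threading.Lock() for name in names}

    def acquire(self, *names):
        """Acquire the named resources in rank order and return that order."""
        ordered = sorted(set(names), key=self.rank.__getitem__)
        for name in ordered:
            self.locks[name].acquire()
        return ordered

    def release(self, held):
        """Release resources in the reverse of the order they were acquired."""
        for name in reversed(held):
            self.locks[name].release()


locks = OrderedLocks(["disk", "printer", "tape"])

# Two requesters name the same resources in opposite orders, yet both
# end up acquiring "disk" before "printer", so no circular wait can form.
first = locks.acquire("disk", "printer")
locks.release(first)
second = locks.acquire("printer", "disk")
locks.release(second)
print(first, second)
```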

Distributed Deadlock Detection
The difficulty is that each site only knows about its own resources, whereas a deadlock may involve distributed resources. The following techniques can be employed:
• Centralized Control: One site is responsible for deadlock detection. Because it has the complete picture, it can detect deadlock.
• Hierarchical Control: Sites are organized in a tree structure. At each node other than the leaf nodes, information about the resource allocation of all dependent nodes is collected. Deadlock is detected by the lowest node that is above all the nodes involved, so detection can happen at a lower level rather than at the root.
• Distributed Control: All processes cooperate in the deadlock detection function. In this case considerable information is exchanged with timestamps, so the overhead is significant.
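
Centralized control can be sketched as one site merging the wait-for information reported by every other site and searching the merged graph for a cycle. The function `find_deadlock`, the shape of the reports (a dictionary from a waiting process to the processes it waits for), and the sample data are assumptions made for illustration.

```python
def find_deadlock(site_reports):
    """Merge per-site wait-for edges and return one cycle of processes, or None."""
    graph = {}
    for report in site_reports:
        for waiter, holders in report.items():
            graph.setdefault(waiter, set()).update(holders)

    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    colour = {}

    def visit(node, path):
        colour[node] = GREY
        path.append(node)
        for nxt in graph.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:                  # back edge: a cycle exists
                return path[path.index(nxt):]
            if state == WHITE:
                cycle = visit(nxt, path)
                if cycle:
                    return cycle
        path.pop()
        colour[node] = BLACK
        return None

    for node in list(graph):
        if colour.get(node, WHITE) == WHITE:
            cycle = visit(node, [])
            if cycle:
                return cycle
    return None


# Each site sees only one edge, so no site can detect the deadlock alone.
site1 = {"P1": ["P2"]}
site2 = {"P2": ["P3"]}
site3 = {"P3": ["P1"]}
print(find_deadlock([site1]))
print(find_deadlock([site1, site2, site3]))
```

The same cycle search could run at an interior node of a hierarchy over the reports of its dependent nodes, which is how the hierarchical variant avoids sending everything to the root.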

Deadlock in Message Communication
Mutual Waiting:
• Deadlock occurs when each of a group of processes is waiting for a message from another member of the group and there are no messages in transit.
Unavailability of Message Buffers:
• Well known in packet-switching data networks: for each node, the queue to the adjacent node in one direction is full of packets destined for the next node beyond.
• Example: The buffer space at A is filled with packets destined for B, and the reverse is true at B.
• A structured buffer pool is used to prevent this deadlock.

Unavailability of Message Buffers

Structured Buffer Pool

References
• Mittal, Neeraj (2003), “Notes on Consistent Global States,” CS 6378: Advanced Operating Systems, The University of Texas at Dallas, Fall 2003. [http://www.utdallas.edu/~neerajm/cs6378f03/001/snapshot.pdf]
• Williams, Stephen and Kafura, D. (1995), “Global State Recording Algorithm: GSRA,” Online Lecture Notes, CS 5204 – Operating Systems, Virginia Tech, Fall 2003. [http://courses.cs.vt.edu/~cs5204/fall99/Summaries/GlobalState/global_state.html]
• Singhal, M. and Shivaratri, N. (1994), Advanced Concepts in Operating Systems, McGraw-Hill, pp. 112-113.
• Chandy, K. M. and Lamport, L. (1985), “Distributed Snapshots: Determining Global States of Distributed Systems,” ACM Transactions on Computer Systems, vol. 3, no. 1, pp. 63-75.
• Stallings, William (2001), Operating Systems: Internals and Design Principles, 4th Ed., Prentice-Hall, Upper Saddle River, NJ, Figs. 14.1, 14.2, 14.3, 14.17, and 14.18.

Questions?
