Slides for Chapter 6: Operating System support

Slides for Chapter 6: Operating System support From Coulouris, Dollimore and KindbergDistributed Systems: Concepts and Design Edition 3, © Addison-Wesley 2001

Outline • Introduction • The operation system layer • Protection • Processes and threads • Communication and invocation • Operating system architecture • Summary Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

6.1 Introduction • In this chapter we shall continue to focus on remote invocations without real-time guarantee • An important theme of the chapter is the role of the system kernel • The chapter aims to give the reader an understanding of the advantages and disadvantages of splitting functionality between protection domains (kernel and user-level code) • We shall examining the relation between operation system layer and middle layer, and in particular how well the requirement of middleware can be met by the operating system • Efficient and robust access to physical resources • The flexibility to implement a variety of resource-management policies Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Introduction (2) • The task of any operating system is to provide problem-oriented abstractions of the underlying physical resources (For example, sockets rather than raw network access) • the processors • Memory • Communications • storage media • System call interface takes over the physical resources on a single node and manages them to present these resource abstractions Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Introduction (3) • Network operating systems • They have a network capability built into them and so can be used to access remote resources. Access is network-transparent for some – not all – type of resource. • Multiple system images • The node running a network operating system retain autonomy in managing their own processing resources • Single system image • One could envisage an operating system in which users are never concerned with where their programs run, or the location of any resources. The operating system has control over all the nodes in the system An operating system that produces a single system image like this for all the resources in a distributed system is called a distributed operating system Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Introduction (4) --Middleware and network operating systems • In fact, there are no distributed operating systems in general use, only network operating systems • The first reason, users have much invested in their application software, which often meets their current problem-solving needs • The second reason against the adoption of distributed operating systems is that users tend to prefer to have a degree of autonomy for their machines, even is a closely knit organization The combination of middleware and network operating systems provides an acceptable balance between the requirement for autonomy Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Figure 6.1System layers Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Provide a useful service interface to their resource The operating system layer • Our goal in this chapter is to examine the impact of particular OS mechanisms on middleware’s ability to deliver distributed resource sharing to users • Kernels and server processes are the components that manage resources and present clients with an interface to the resources • Encapsulation • Protection • Concurrent processing • Communication • Scheduling Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Figure 6.2Core OS functionality Handles the creation of and operations upon process Communication between threads attached to different processes on the same computer Tread creation, synchronization and scheduling Management of physical and virtual memory Dispatching of interrupts, system call traps and other exceptions Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

6.3 Protection • We said above that resources require protection from illegitimate accesses. Note that the threat to a system’s integrity does not come only from maliciously contrived code. Benign code that contains a bug or which has unanticipated behavior may cause part of the rest of the system to behave incorrectly. • Protecting the file consists of two sub-problem • The first is to ensure that each of the file’s two operations (read and write) can be performed only by clients with right to perform it • The other type of illegitimate access, which we shall address here, is where a misbehaving client sidesteps the operations that resource exports this is a meaningless operation that would upset normal use of the file and that files would never be designed to export We can protect resource from illegitimate invocations such as setFilePointRandomly or to use a type-safe programming language (JAVA or Modula-3) Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Kernel and Protection • The kernel is a program that is distinguished by the facts that it always runs and its code is executed with complete access privileged for the physical resources on its host computer • A kernel process execute with the processor in supervisor (privileged) mode; the kernel arranges that other processes execute in user(unprivileged) mode • A kernel also sets up address spaces to protect itself and other processes from the accesses of an aberrant process, and to provide processes with their required virtual memory layout • The process can safely transfer from a user-level address space to the kernel’s address space via an exception such as an interrupt or a system call trap Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

6.4 Processes and threads • A thread is the operating system abstraction of an activity (the term derives from the phrase “thread of execution”) • An execution environment is the unit of resource management: a collection of local kernel-managed resources to which its threads have access • An execution environment primarily consists • An address space • Thread synchronization and communication resources such as semaphore and communication interfaces • High-level resources such as open file and windows Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

6.4.1 Address spaces • Region, separated by inaccessible areas of virtual memory • Region do not overlap • Each region is specified by the following properties • Its extent (lowest virtual address and size) • Read/write/execute permissions for the process’s threads • Whether it can be grown upwards or downward Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Figure 6.3Address space Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

6.4.1 Address spaces (2) • A mapped file is one that is accessed as an array of bytes in memory. The virtual memory system ensures that accesses made in memory are reflected in the underlying file storage • A shared memory region is that is backed by the same physical memory as one or more regions belonging to other address spaces • The uses of shared regions include the following • Libraries • Kernel • Data sharing and communication Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

6.4.2 Creation of a new process • The creation of a new process has been an indivisible operation provided by the operating system. For example, the UNIX forksystem call. • For a distributed system, the design of the process creation mechanism has to take account of the utilization of multiple computers • The choice of a new process can be separated into two independent aspects • The choice of a target host • The creation of an execution environment Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Choice of process host • The choice of node at which the new process will reside – the process allocation decision – is a matter of policy • Transfer policy • Determines whether to situate a new process locally or remotely. For example, on whether the local node is lightly or heavily load • Location policy • Determines which node should host a new process selected for transfer. This decision may depend on the relative loads of nodes, on their machine architectures and on any specialized resources they may process Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

One load manager component Several load manager organized in a tree structure Node exchange information with one another direct to make allocation decisions Choice of process host (2) Load manager collect information about the nodes and use it to allocate new processes to node • Process location policies may be • Static • Adaptive • Load-sharing systems may be • Centralized • Hierarchical • decentralized Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Choice of process host (3) • In sender-initiated load-sharing algorithms, the node that requires a new process to be created is responsible for initiating the transfer decision • In receiver-initiated algorithm, a node whose load is below a given threshold advertises its existence to other nodes so that relatively loaded nodes will transfer work to it • Migratory load-sharing systems can shift load at any time, not just when a new process is created. They use a mechanism called process migration Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Creation of a new execution environment • There are two approaches to defining and initializing the address space of a newly created process • Where the address space is of statically defined format • For example, it could contain just a program text region, heap region and stack region • Address space regions are initialized from an executable file or filled with zeroes as appropriate • The address space can be defined with respect to an existing execution environment • For example the newly created child process physically shares the parent’s text region, and has heap and stack regions that are copies of the parent’s in extent (as well as in initial contents) • When parent and child share a region, the page frames belonging to the parent’s region are mapped simultaneously into the corresponding child region Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Process A’s address space Process B’s address space RB copied from RA RA RB Kernel Shared frame A's page B's page table table a) Before write b) After write Figure 6.4Copy-on-write The page fault handler allocates a new frame for process B and copies the original frame’s data into byte by byte The pages are initially write-protected at the hardware level page fault Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

6.4.3 Threads • thread 是 process 的簡化型式，它包含了使用 CPU 所必須的資訊：Program Counter、register set 以及 stack space。同一個程式 (task) 的 thread 之間共享 code section、data section 以及作業系統的資源 (OS resource)。如果作業系統可以提供多個 thread 同時執行的能力，其具有 multithreading 的能力 • The next key aspect of a process to consider in more detail and server process to possess more than one thread. Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Thread 2 makes requests to server Input-output Receipt & queuing Thread 1 generates results T1 Requests N threads Client Server Figure 6.5Client and server with threads Worker pool A disadvantage of this architecture is its inflexibility Another disadvantage is the high level of switching between the I/O and worker threads as they manipulate the share queue Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Figure 6.6Alternative server threading architectures (see also Figure 6.5) Associates a thread with each connection Associates a thread with each object request Advantage: the threads do not contend for a shared queue, and throughput is potentially maximized In each of these last two architectures the server benefits from lowered thread-management overheads compared with the thread-per-request architecture. Their disadvantage is that clients may be delayed while a worker thread has several outstanding requests but another thread has no work to perform Disadvantage: the overheads of the thread creation and destruction operations Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Execution environment Thread Address space tables Saved processor registers Communication interfaces, open files Priority and execution state (such as BLOCKED ) Semaphores, other synchronization Software interrupt handling information objects List of thread identifiers Execution environment identifier Pages of address space resident in memory; hardware cache entries Figure 6.7State associated with execution environments and threads Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

A comparison of processes and threads as follows • Creating a new thread with an existing process is cheaper than creating a process. • More importantly, switching to a different thread within the same process is cheaper than switching between threads belonging to different process. • Threads within a process may share data and other resources conveniently and efficiently compared with separate processes. • But, by the same token, threads within a process are not protected from one another. Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

A comparison of processes and threads as follows (2) • The overheads associated with creating a process are in general considerably greater than those of creating a new thread. • A new execution environment must first be created, including address space table • The second performance advantage of threads concerns switching between threads – that is, running one thread instead of another at a given process Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

A context switch is the transition between contexts that takes place when switching between threads, or when a single thread makes a system call or takes another type of exception • It involves the following: • The saving of the processor’s original register state, and loading of the new state • In some cases; a transfer to a new protection domain – this is known as a domain transition Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Thread scheduling • In preemptive scheduling, a thread may be suspended at any point to make way for another thread • In non-preemptive scheduling, a thread runs until it makes a call to the threading system (for example, a system call). • The advantage of non-preemptive scheduling is that any section of code that does not contain a call to the threading system is automatically a critical section • Race conditions are thus conveniently avoided • Non-preemptively scheduled threads cannot takes advantage of multiprocessor , since they run exclusively Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Thread implementation • When no kernel support for multi-thread process is provided, a user-level threads implementation suffers from the following problems • The threads with a process cannot take advantage of a multiprocessor • A thread that takes a page fault blocksthe entire process and all threads within it • Threads within different processes cannot be scheduled according to a single scheme of relative prioritization Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Thread implementation (2) • User-level threads implementations have significant advantages over kernel-level implementations • Certain thread operations are significantly less costly For example, switching between threads belonging to the same process does not necessarily involve a system call – that is, a relatively expensive trap to the kernel • Given that the thread-scheduling module is implemented outside the kernel, it can be customized or changed to suit particular application requirements. Variations in scheduling requirements occur largely because of application-specific considerations such as the real-time nature of a multimedia processing • Many more user-level threads can be supported than could reasonably be provided by default by a kernel Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

The four type of event that kernel notified to the user-level scheduler • Virtual processor allocated • The kernel has assigned a new virtual processor to the process, and this is the first timeslice upon it; the scheduler can load the SA with the context of a READY thread, which can thus can thus recommence execution • SA blocked • An SA has blocked in the kernel, and kernel is using a fresh SA to notify the scheduler: the scheduler sets the state of the corresponding thread to BLOCKED and can allocate a READY thread to the notifying SA • SA unblocked • An SA that was blocked in the kernel has become unblocked and is ready to execute at user level again; the scheduler can now return the corresponding thread to READY list. In order to create the notifying SA, the another SA in the same process. In the latter case, it also communicates the preemption event to the scheduler, which can re-evaluate its allocation of threads to SAs. • SA preempted • The kernel has taken away the specified SA from the process (although it may do this to allocate a processor to a fresh SA in the same process); the scheduler places the preempted thread in the READY list and re-evaluates the thread allocation. Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Figure 6.10Scheduler activations Scheduler activation (SA) is a call from kernel to a process Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

6.5.1 Invocation performance • Invocation performance is a critical factor in distributed system design • Network technologies continue to improve, but invocation times have not decreased in proportion with increases in network bandwidth • This section will explain how software overheads often predominate over network overheads in invocation times Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Figure 6.11Invocations between address spaces Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Figure 6.12RPC delay against parameter size Client delay against requested data size. The delay is roughly proportional to the size until the size reaches a threshold at about network packet size Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

The following are the main components accounting for remote invocation delay, besides network transmission times • Marshalling • Data copying • Packet initialization • Thread scheduling and context switching • Waiting for acknowledgements Marshalling and unmarshalling, which involve copying and converting data, become a significant overhead as the amount of data grows • Potentially, even after marshalling, message data is copied several times in the course of an RPC • Across the user-kernel boundary, between the client or server address space and kernel buffers • Across each protocol layer (for example, RPC/UDP/IP/Ethernet) • Between the network interface and kernel buffers • Several system calls (that is, context switches) are made during an RPC, as stubs invokes the kernel’s communication operations • One or more server threads is scheduled • If the operating system employs a separate network manager process, then each Send involves a context switch to one of its threads This involves initializing protocol headers and trailers, including checksums. The cost is therefore proportional, in part, to the amount of data sent The choice of RPC protocol may influence delay, particularly when large amounts of data are sent Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

A lightweight remote procedure call • The LRPC design is based on optimizations concerning data copying and thread scheduling. • Client and server are able to pass arguments and values directly via an A stack. The same stack is used by the client and server stubs • In LRPC, arguments are copied once: when they are marshalled onto the A stack. In an equivalent RPC, they are copied four times Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Figure 6.13A lightweight remote procedure call Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

6.5.2 Asynchronous operation • A common technique to defeat high latencies is asynchronous operation, which arises in two programming models: • concurrent invocations • asynchronous invocations • An asynchronous invocation is one that is performed asynchronously with respect to the caller. That is, it is made with a non-blocking call, which returns as soon as the invocation request message has been created and is ready for dispatch Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Figure 6.14Times for serialized and concurrent invocations pipelining Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

6.6 Operating system architecture • Run only that system software at each computer that is necessary for it to carry out its particular role in the system architectures • Allow the software implementing any particular service to be changed independently of other facilities • Allow for alternatives of the same service to be provided, when this is required to suit different users or applications • Introduce new services without harming the integrity of existing ones Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Microkernel provides only the most basic abstraction. Principally address spaces, the threads and local interprocess communication Figure 6.15Monolithic kernel and microkernel Where these designs differ primarily is in the decision as to what functionality belongs in the kernel and what is to be left to sever processes that can be dynamically loaded to run on top of it Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Figure 6.16The role of the microkernel Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Comparison • The chief advantages of a microkernel-based operating system are its extensibility • A relatively small kernel is more likely to be free of bugs than one that is large and more complex • The advantage of a monolithic design is the relative efficiency with which operations can be invoked Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Slides for Chapter 6: Operating System support

Slides for Chapter 6: Operating System support

Presentation Transcript

Chapter 8 Operating System Support

Operating System Support for Virtual Machines

Slides for Chapter 6

Operating System Support for Virtual Machines

Chapter 6 Slides

Operating System support

Chapter 8 Operating System Support

Chapter 6 Slides

Chapter 8: Operating System Support

Chapter 8 Operating System Support

Operating System Support

Chapter 6 Slides

Chapter 6 Slides

Support Slides for Readings 1-6

Chapter 11: Operating System Support

Operating System Support for Virtual Machines

Operating System support for Multimedia

Operating System Support for Virtual Machines

Chapter 7 Operating System Support

Operating System Support for Virtual Machines