Processes

Processes CSE5306 Lecture Quiz due 11 June 2014 at 5 PM

Processes • From an operating system or a distributed system point of view, processes are programs in execution, whose management and scheduling are crucial. • Multi-threading boosts client-server system efficiency by overlapping communications with processing. • Virtualizing an application’s entire runtime environment makes it portable by enabling it to run concurrently with other applications, independent of their hardware. (And it isolates apps from each other’s errors and security breaches.) • Organizing wide-area distributed systems for code portability also makes them scalable, and it makes clients and servers easier to dynamically reconfigure.

Introduction to Threads • Dividing a process up into many threads makes it simpler and makes it perform better. • Operating systems and hardware can enforce barriers between concurrent threads, making their use of the same CPU and hardware resources transparent. But it is complicated, and it impacts performance. • By taking over multi-threading, process designers can keep it simple and preserve performance. (But they also take on some of the operating system’s transparency burdens.)

Thread Usage in Nondistributed Systems • Spreadsheet applications need 1) a foreground user interaction thread, 2) a lower priority thread to change all affected cells and 3) recurring disk backups in the background. • Threaded client-server programs run equally reliably on uniprocessors or cheap multiprocessor systems, but much faster on the latter. • Interprocess communications (IPC, pipes) among programs in large threaded UNIX environments reduce performance, because they require kernel context switches. • IPC should let apps pass data in shared memory. When a single user’s threads cooperate, performance improves.

Thread Implementation • Thread packages include fork and merge methods, as well as mutexes and condition objects. • Either the user handles her own threads (i.e., quickly allocating and freeing stacks, switching context), or she lets the kernel do it (i.e., slowly changing memory maps, flushing the translation lookaside buffer, creating, freeing, accounting). • But with user-level threads, a blocking I/O call blocks every thread in the process. • As a hybrid of user and kernel threading, light-weight processes (LWP) can run in the kernel’s heavy weight process context entirely without kernel intervention.
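
The mutexes and condition objects mentioned on this slide can be sketched with Python's standard `threading` module. This is only an illustration, not the thread package the lecture has in mind; the `BoundedBuffer` class and the `run_demo` helper are hypothetical names introduced here.

```python
import threading
from collections import deque


class BoundedBuffer:
    """Producer/consumer buffer guarded by one mutex and two condition variables."""

    def __init__(self, capacity):
        self.items = deque()
        self.capacity = capacity
        self.lock = threading.Lock()                    # the mutex
        self.not_empty = threading.Condition(self.lock)
        self.not_full = threading.Condition(self.lock)

    def put(self, item):
        with self.not_full:                             # acquires the shared mutex
            while len(self.items) >= self.capacity:
                self.not_full.wait()                    # releases the mutex while waiting
            self.items.append(item)
            self.not_empty.notify()

    def get(self):
        with self.not_empty:
            while not self.items:
                self.not_empty.wait()
            item = self.items.popleft()
            self.not_full.notify()
            return item


def run_demo(n=100):
    """Start ("fork") one producer and one consumer, then join both."""
    buf = BoundedBuffer(capacity=4)
    received = []

    def producer():
        for i in range(n):
            buf.put(i)

    def consumer():
        for _ in range(n):
            received.append(buf.get())

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

With a single producer and a single consumer, `run_demo(n)` returns the items in the order they were produced, because the buffer is first-in, first-out.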

Lightweight Processes • A different system-called LWP runs each user thread. • All LWPs share a table of current threads, which guarantees mutually exclusive access. And all mutexes are handled in user space, entirely without kernel intervention. • When LWPs’ scheduler discovers a runnable thread, it switches context to that thread. • When a thread blocks on a mutex or condition variable, it tells the scheduler, and execution reverts to the current LWP (not the kernel), and the user retains control. • Advantages: • No kernel intervention in forking, joining, synchronizing threads. • Blocking system calls do not suspend all user processing. • Completely unaware of the LWPs, apps see only user-level threads. • LWPs transparently port to different CPUs in multiprocessing environments. • But occasionally creating and destroying LWPs is as slow as kernel-level threads. • In scheduler activations alternative, kernel upcalls user’s thread package. (Yuk.)

R U O K ? 1. What are the most important attributes of distributed systems’ processes? • As programs in execution, their management and scheduling are crucial. • Multi-threading processes boosts client-server system efficiency by overlapping communications with processing. • Virtualizing an application’s entire runtime environment makes it portable by enabling it to run concurrently with other applications, independent of their hardware. (And it isolates apps from each other’s errors and security breaches.) • Organizing wide-area distributed systems for code portability also makes them scalable, and it makes clients and servers dynamically reconfigurable. • All of the above.

R U O K ? 2. How does multithreading at the operating system-level compare with user-level multithreading? • When process designers divide a process up into many threads, it is simpler and it performs better. • Operating systems and hardware can enforce barriers between concurrent threads, making their use of the same CPU and hardware resources transparent. But its generality makes it complicated, and that impacts performance. • By taking over multi-threading, process designers can keep it simple and preserve performance. (But they also take on some of the operating system’s transparency burdens.) • All of the above. • None of the above.

R U O K ? 3. How does multithreading improve non-distributed systems? • It provides spreadsheet applications 1) a foreground user interaction thread, 2) a lower priority thread to change all affected cells and 3) recurring disk backups in the background. • Threaded client-server programs run equally reliably on uniprocessors or cheap multiprocessor systems, but much faster on the latter. • Interprocess communications (i.e., IPC, pipes) among programs in large threaded UNIX environments reduce performance, because they require kernel context switches. • IPC should let apps pass data in shared memory, because performance improves when a single user’s threads cooperate. • All of the above.

R U O K ? 4. What should the process designer remember in the course of implementing threads? • Thread packages include fork and merge methods, as well as mutexes and condition objects. • Either the user handles her own threads (i.e., quickly allocating and freeing stacks, switching context), or she lets the kernel do it (i.e., slowly changing memory maps, flushing the translation lookaside buffer, creating, freeing, accounting). • Beware of the OS blocking all user-created threads on I/O. • As a hybrid of user and kernel threading, light-weight processes (LWP) can run in the kernel’s heavy weight process context, entirely without kernel intervention. • All of the above.

R U O K ? 5. What benefits accrue from the use of lightweight processes (LWPs)? • The kernel does not intervene in processes’ forking, joining or synchronizing threads. • Blocking system calls do not suspend all user processing. • The kernel is completely unaware of the LWPs, and the apps see only user-level threads. • LWPs transparently port to different CPUs in multiprocessing environments. • All of the above.

Threads in Distributed Systems • When distributed systems are threaded, a blocking system call does not block the entire user process, so an application can keep many logical connections open at the same time.

Multithreaded Clients • For distribution transparency, your web browser needs to hide long communication delays. • Multithreaded client displays each piece of a downloading web page upon arrival—communications I/O does not block display. • Web servers may replicate files at one URL, so that the multithreaded client can receive many streams in parallel.
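
A minimal sketch of how a multithreaded client overlaps several downloads, assuming each page element can be fetched independently. The `fetch` function is a hypothetical stand-in that only sleeps to imitate network delay; a real browser would issue HTTP requests instead.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(part, delay=0.2):
    """Hypothetical stand-in for a blocking download of one page element."""
    time.sleep(delay)              # simulated communication delay
    return f"<{part}>"


def load_page(parts, delay=0.2):
    """Fetch every element in its own thread, so the delays overlap."""
    with ThreadPoolExecutor(max_workers=len(parts)) as pool:
        return list(pool.map(lambda p: fetch(p, delay), parts))


def timed(fn, *args):
    """Return (result, elapsed seconds) for one call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start
```

Fetching four elements this way should take roughly one delay rather than four, because the threads wait on their (simulated) connections concurrently.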

Multithreaded Servers • A single-threaded server passes a client’s file request to the disk and waits. When the file delivers, the server sends it to the client and waits for another request— I/O blocking without parallelism. • A state machine server accepts client requests from several communication channels and responses from several disks and immediately acts upon each one—parallelism without I/O blocking. • Multithreaded server code is much simpler, and its parallelism easily boosts performance, especially on multiprocessor computers. Unlimited multiprocessor parallelism without I/O blocking, and independent processes are much easier to program.
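
One common way to organize a multithreaded server is a dispatcher thread that hands requests to a pool of workers. The sketch below assumes that organization; `read_file` is a hypothetical blocking disk read, and the queue-based hand-off is an illustrative choice rather than something this slide prescribes.

```python
import queue
import threading
import time


def read_file(name):
    """Hypothetical blocking disk read."""
    time.sleep(0.05)
    return f"contents of {name}"


def worker(requests, replies):
    """Serve requests until the shutdown sentinel (None) arrives."""
    while True:
        req = requests.get()
        if req is None:
            break
        client, name = req
        replies.put((client, read_file(name)))   # only this worker blocks on the read


def serve(incoming, n_workers=4):
    """Dispatcher: hand each incoming (client, file) request to the worker pool."""
    requests, replies = queue.Queue(), queue.Queue()
    workers = [threading.Thread(target=worker, args=(requests, replies))
               for _ in range(n_workers)]
    for w in workers:
        w.start()
    for req in incoming:
        requests.put(req)
    for _ in workers:
        requests.put(None)                       # one sentinel per worker
    for w in workers:
        w.join()
    out = {}
    while not replies.empty():                   # safe: all workers have finished
        client, data = replies.get()
        out[client] = data
    return out
```

Each worker is written as simple sequential code with blocking calls, which is the programming convenience the slide attributes to the multithreaded design.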

R U O K ? 6. How does multithreading improve clients’ performance? • For distribution transparency, your multithreaded web browser hides long communication delays. • It displays each piece of a downloading web page upon arrival—communications I/O does not block display. • Web servers may replicate files at one URL, so that the multithreaded client can receive many streams in parallel. • All of the above. • None of the above.

R U O K ? 7. How does a multithreaded server compare with its alternatives? • A single-threaded server passes a client’s file request to the disk and waits. When the file delivers, the server sends it to the client and waits for another request—I/O blocking without parallelism. • A state-machine server accepts client requests from several communication channels and responses from several disks, immediately acting upon each one—parallelism without I/O blocking. • Multithreaded server code is much simpler, and its parallelism easily boosts performance, especially on multiprocessor computers—unlimited multiprocessor parallelism without I/O blocking and much more easily programmed independent processes. • All of the above. • None of the above.

The Role of Virtualization in Distributed Systems • By rapidly switching between threads and processes, a single CPU creates an illusion of parallelism called “resource virtualization.” • A legacy app (A) eventually must run on new enterprise edge servers (B). So an implementation of B mimicking A (including its antique runtime environment) becomes a distributed system virtualization.

Architectures of Virtual Machines • Computer interfaces occupy four different levels: • Between hardware and software, machine instructions are available to all apps. • There also are privileged machine instructions available only to the OS. • Above the OS are system calls available to all apps. • At the top is an application programming interface (API), which hides OS calls behind library calls. • Virtualization mimics these levels.

Architectures of Virtual Machines (continued) • Virtualization can be achieved in either of 2 ways: • Process virtual machine—A Java-like instruction interpreter or emulator is attached to each process (e.g., Windows apps on UNIX platforms). • Virtual machine monitor (VMM)—A shared system layer completely shields all processes (including their multiple legacy OSes running concurrently) from each other and the host’s hardware. • The latter is best for distributed systems, because of its reliability and security.
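
To make the process virtual machine idea concrete, here is a toy interpreter for an abstract instruction set. The three instructions (`PUSH`, `ADD`, `MUL`) are invented for this sketch and do not correspond to Java bytecode or any real VM; the point is only that the same program runs wherever the interpreter runs.

```python
def run(program):
    """Interpret a list of (opcode, *operands) tuples on a small stack machine."""
    stack = []
    for op, *args in program:
        if op == "PUSH":
            stack.append(args[0])
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown instruction: {op}")
    return stack.pop()


# (2 + 3) * 4, expressed in the toy instruction set
program = [("PUSH", 2), ("PUSH", 3), ("ADD",), ("PUSH", 4), ("MUL",)]
```

Calling `run(program)` evaluates the expression without the program ever touching the host's own machine instructions directly.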

R U O K ? 8. How can distributed systems be virtualized? • Resource virtualization—By rapidly switching between threads and processes, a single CPU creates an illusion of parallelism. • Distributed system virtualization—To make legacy apps run on new enterprise edge servers, they must mimic the apps’ entire legacy runtime environment. • Both of the above. • None of the above.

R U O K ? 9. What various architectural levels of computer interfaces must a virtualization mimic? • Between hardware and software, machine instructions are available to all apps. • There also are privileged machine instructions available only to the OS. • Above the OS are system calls available to all apps. • At the top is an application programming interface (API), which hides OS calls behind library calls. • All of the above.

R U O K ? 10. How can virtualization be implemented? • Process virtual machine: A Java-like instruction interpreter or emulator is attached to each process (e.g., Windows apps on UNIX platforms). • Virtual machine monitor (VMM): For reliability and security, a shared system layer completely shields all processes (including their various legacy OSes all running concurrently) from each other and the host’s hardware. • Both of the above. • Neither of the above.

Networked User (Client) Interfaces • There are 2 modes of client-server interaction: • Fat-client PDA synchronizes schedule with cloud—an app-specific protocol handles the call (e.g., X Window system). • Thin-client terminal uses server’s apps and storage—app-independent protocol handles call.

The X Window System • X Window System controls bit-mapped terminal’s monitor, keyboard and mouse. • X kernel (server) contains terminal-specific device drivers, accessible to the (possibly remote) client apps via the Xlib interface and its X protocol. • The window manager defines the display’s look and feel, but any number of client apps can concurrently communicate with all devices.

Thin-Client Network Computing • A series of commands must be sent to X kernel to set up the display, and then bit-mapped images must be sent a pixel at a time. These could slow down cell phone images in a WAN. • NX re-engineered the X protocol: • It compresses the recurring fixed parts of messages. • On both ends of a link, it caches prior display configurations, which can be looked up and possibly edited with minimal new data. • In case of a cache miss, a whole new configuration can be sent in compressed form. • Overall bandwidth is 1000x compressed to only 9600bps. • THINC translates high-level display commands into pixels.
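
The caching idea on this slide can be sketched as follows: both ends remember messages they have already exchanged, so a repeat costs only a short key, and a cache miss is sent in compressed form. This is not NX's actual wire format; the `CachingLink` class, the one-byte `H`/`M` tags and the use of SHA-256 keys are assumptions made for illustration, and NX's differential editing of cached messages is not reproduced.

```python
import hashlib
import zlib


class CachingLink:
    """Simulates both ends of a link that caches previously sent messages."""

    def __init__(self):
        self.sender_cache = set()      # keys the sender knows the receiver holds
        self.receiver_cache = {}       # key -> full message
        self.bytes_on_wire = 0

    def send(self, message: bytes) -> bytes:
        key = hashlib.sha256(message).digest()          # 32-byte cache key
        if key in self.sender_cache:
            wire = b"H" + key                           # hit: send the key only
        else:
            self.sender_cache.add(key)
            wire = b"M" + key + zlib.compress(message)  # miss: send compressed body
        self.bytes_on_wire += len(wire)
        return self._receive(wire)

    def _receive(self, wire: bytes) -> bytes:
        tag, key, body = wire[:1], wire[1:33], wire[33:]
        if tag == b"M":
            self.receiver_cache[key] = zlib.decompress(body)
        return self.receiver_cache[key]
```

Sending the same display message twice costs a compressed copy the first time and only 33 bytes (tag plus key) the second time.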

Compound Documents • A compound document is a diverse collection of text, images, spreadsheets, which are seamlessly integrated at the user-interface level. • To drag and drop a file into the trash, the trash app must accept the file’s contents and identifiers from the file manager app. • To rotate and insert an image into a text file, a graphics app must tell the word-wrapping word processor the image’s new dimensions and position. • The compound document user interface hides the fact that different apps are working on different parts of the document.

Client-Side Software for Distribution Transparency • Automatic teller machines, cash registers, barcode readers, TV set-top boxes have user interfaces, plus considerable software for local processing and communications with replicated (access transparent) remote servers. • Various servers pretend to be the same by generating a client stub, which describes the client interface that all replicates have in common. • Bindings to a mobile client can be passed by name from one server to another transparently, without divulging the client’s actual position. • A client can request web pages from many replicated servers and quickly gather them up for its app. • Client middleware can mask failures by transparently connecting to another server, when the first’s quality-of-service becomes unacceptable. • Transaction monitors can significantly reduce an ATM’s workload.

R U O K ? 11. What are examples of the most commonly used client-server interaction modes? • A fat-client PDA synchronizes its schedule with its work group’s cloud. (An app-specific protocol handles the call; e.g., X Window system.) • A thin-client terminal uses the server’s apps and storage. (An app-independent protocol handles call.) • Both of the above. • None of the above.

R U O K ? 12. Which of the following accurately characterize the X window system? • The X Window System controls a bit-mapped terminal’s monitor, keyboard and mouse. • The X kernel (server) contains terminal-specific device drivers, accessible to the (possibly remote) client apps via the Xlib interface and its X protocol. • The window manager defines the display’s look and feel. • Any number of client apps can concurrently communicate with all devices. • All of the above.

R U O K ? 13. How did NX improve the X protocol’s performance? • It compressed the recurring fixed parts of messages. • On both ends of a link, it cached prior display configurations, which could be looked up and possibly edited with minimal new data. • In case of cache misses, a whole new configuration could be sent in compressed form. • Overall bandwidth was compressed 1000x to only 9600bps. • All of the above.

R U O K ? 14. Which of the following accurately describe compound documents? • A diverse collection of text, images, spreadsheets, which are seamlessly integrated at the user-interface level. • To drag and drop a file into the trash, the trash app must accept the file’s contents and identifiers from the file manager app. • To rotate and insert an image into a text file, a graphics app must tell the word-wrapping word processor the image’s new dimensions and position. • The compound document user interface hides the fact that different apps are working on different parts of the document. • All of the above.

R U O K ? 15. Which of the following should the process designer consider to ensure distribution transparency in client-side software? • In addition to their obvious user interfaces, automatic teller machines, cash registers, barcode readers, TV set-top boxes need considerable software for local processing and communications with replicated (access transparent) remote servers. Transaction monitors can significantly reduce an ATM’s workload. • Various servers can pretend to be the same by generating a client stub, which describes the client interface that all replicates have in common. Thus, a client can request web pages from many replicated servers and quickly gather them up for its app. • Client middleware can mask failures by transparently connecting to another server, when the first’s quality-of-service becomes unacceptable. And bindings to a mobile client can be passed by name from one server to another transparently, without divulging the client’s actual position. • All of the above. • None of the above.

General Server Design Issues • Server • A process serving one client group. • Await request; ensure request is served; repeat. • Iterative s. handles request itself. • Concurrent s. passes every request to another s. • End point (port) • Name services identify machines hosting servers. • TCP port 21 for Internet FTP requests • TCP port 80 for worldwide web HTTP servers. • A well-known port’s daemon may redirect clients.

General Server Design Issues (continued) • Superserver • Forks process to serve request, and then exits. • UNIX’s inetd daemon listens to Internet ports. • Interrupting server • Stop downloading wrong (long) file. • Exiting client app looks like client crashed! • Send out-of-band data on higher priority port. • TCP passes urgent through requesting port.

General Server Design Issues (continued) • Stateless server • Keeps no record of client’s states. • Notifies no one when it updates its files. • For example, web servers. • All client requests are logged for replication info. • Soft state server keeps state for limited time. • Stateful server • Keeps (client, file) table of file owners, updates. • Must recover its entire before-crash state. • Session state guides local server’s dialog with 3rd tier. • Cookie sent to browser can accompany next request.
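
The difference between the two designs can be sketched with a visit counter. In the stateless version the count travels in a cookie that the client returns with its next request; in the stateful version the server keeps a (client, file) table that is lost if the server crashes. Both `stateless_handle` and `StatefulServer` are hypothetical names for this sketch, and the cookie is modeled as a plain dictionary.

```python
def stateless_handle(request, cookie=None):
    """The server stores nothing: the visit count lives in the cookie."""
    visits = int(cookie["visits"]) if cookie else 0
    visits += 1
    reply = f"{request}: visit {visits}"
    return reply, {"visits": str(visits)}     # new cookie accompanies the reply


class StatefulServer:
    """Keeps a (client, file) table in memory; a restart loses it."""

    def __init__(self):
        self.table = {}                       # (client, file) -> visit count

    def handle(self, client, file):
        key = (client, file)
        self.table[key] = self.table.get(key, 0) + 1
        return f"{file}: visit {self.table[key]}"
```

Replacing the `StatefulServer` object stands in for a crash: the new instance starts counting from one again, whereas the stateless handler continues from whatever the client's cookie says.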

General Organization of Server Clusters • Many LAN-joined machines hosting servers. • 2nd-tier app processing requires high-end CPUs. • Enterprise data processing requires big storage. • 3rd-tier file and database servers require caches. • Only 2 tiers stream media well. • To balance workloads, migrate popular app code. • 1st tier contains transport-layer switch • One TCP connection accepts request. • Transport-layer switch does TCP (connection) handoff to a server. • Content-aware request distribution levels loads and boosts performance. • Server pretends to have switch’s IP address, so as to satisfy TCP.
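
A rough sketch of content-aware request distribution at the switch: the requested URL, rather than just the connection, decides which server receives the handoff. The slide does not name a policy, so hashing the URL is an assumption chosen here because it sends repeat requests for the same content to the same server, which helps that server's cache. The `Switch` class and server names are hypothetical.

```python
import hashlib


class Switch:
    """Picks a back-end server from the content of the request (its URL)."""

    def __init__(self, servers):
        self.servers = list(servers)

    def pick(self, url):
        digest = hashlib.sha256(url.encode()).digest()
        index = int.from_bytes(digest[:4], "big") % len(self.servers)
        return self.servers[index]


switch = Switch(["server-a", "server-b", "server-c"])
target = switch.pick("/videos/intro.mp4")
```

Because the mapping depends only on the URL, every request for `/videos/intro.mp4` lands on the same server for as long as the server list is unchanged.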

Distributed Servers • Domain Name Service (DNS): offers several access point addresses, in case one fails. • Distributed server: dynamically changing machines with changing access points, which appear to be one powerful machine. • Mobility support for IPv6 (MIPv6) • Mobile node’s access point is a stable home address (HoA). • While away, it has a care-of address (CoA). • Home agent (router) forwards all traffic from HoA to CoA. • Distributed server home agent accepts CoAs from all available servers. • Directs client traffic there, while pretending remote server is here. • Route optimization • Home agent and access point could become bottlenecks. • Forwarding CoA to client can form (HoA, CoA) pair for direct calls to CoA if HoA fails.
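
The forwarding and route-optimization steps can be sketched as two small classes. This models only the bookkeeping, not the MIPv6 protocol itself; `HomeAgent`, `Client` and the addresses (taken from the IPv6 documentation range) are assumptions for illustration.

```python
class HomeAgent:
    """Maps a stable home address (HoA) to the current care-of address (CoA)."""

    def __init__(self):
        self.bindings = {}                     # HoA -> CoA

    def register(self, hoa, coa):
        self.bindings[hoa] = coa

    def forward(self, hoa, packet):
        coa = self.bindings.get(hoa, hoa)      # no binding: node is at home
        return coa, packet


class Client:
    """Learns (HoA, CoA) pairs so that later packets can bypass the home agent."""

    def __init__(self, agent):
        self.agent = agent
        self.binding_cache = {}                # HoA -> CoA

    def send(self, hoa, packet):
        if hoa in self.binding_cache:
            return self.binding_cache[hoa], packet, "direct"
        coa, packet = self.agent.forward(hoa, packet)
        self.binding_cache[hoa] = coa          # route optimization
        return coa, packet, "via home agent"


agent = HomeAgent()
agent.register("2001:db8::1", "2001:db8:ffff::9")
client = Client(agent)
first = client.send("2001:db8::1", "hello")
second = client.send("2001:db8::1", "again")
```

The first packet is relayed by the home agent; the second goes straight to the care-of address, which takes load off the home agent.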

R U O K ? 16. What general questions should be considered with regard to server design? • Are all of its processes serving just one homogeneous client group? • Should it simply await a request, ensure that each request is served and repeat, all day long? • Is it an iterative server that handles every request itself? • Is it a concurrent server that passes every request to another server? • All of the above.

R U O K ? 17. What are some common ways of interrupting a server when we discover that it is downloading the wrong (very lengthy) file? • The client app simply exits, so that it appears to have crashed. • Send out-of-band data (i.e., your terminate message) on higher priority port. • Pass your urgent message through TCP’s requesting port. • All of the above. • None of the above.

R U O K ? 18. Which of the following accurately describe a stateless server? • Keeps client’s state for limited time. • Keeps a (client, file) table of file owners and updates. • Must recover its entire before-crash state. • Keeps no record of client’s states and notifies no one when it updates its files. • Sends cookies to browsers that accompany clients’ future requests.

R U O K ? 19. Which of the following accurately characterize server clusters? • 1st-tier app processing requires high-end CPUs. • 2nd-tier contains transport-layer switch. • Many LAN-joined machines hosting servers. • All of the above. • None of the above.

R U O K ? 20. What general questions should be considered with regard to distributed server design? • Domain Name Service (DNS) can offer several access point addresses, just in case one fails. • MIPv6 offers mobility support by routing messages from a mobile node’s stable home address (HoA) to a care-of address (CoA). • Forwarding the CoA to clients can form (HoA, CoA) pair to enable direct calls to CoA, if the HoA fails. • All of the above. • None of the above.

Common Approaches to Managing Server Clusters • Clients see server clusters as one machine. • Their managers don’t. • Login to monitor 1 node, install, swap components. • IBM’s Cluster Sys Mgmt administers 50 servers. • Thousands of servers’ maintenance is ad hoc. • Failures are the rule, not the exception. • Self-management will mature someday. • Now PlanetLab has the best partial solution.

PlanetLab Architecture • PlanetLab is a multi-university collaboration one-tier server cluster. • Each organization has donated one or more of PlanetLab’s ~200 nodes (see “Hardware” in the figure). • A virtual machine monitor (VMM) enforces a security/reliability shield between the hardware and every independent Vserver above it. • Each (virtual) Vserver runtime environment supports a family of similar-vintage legacy processes, which share files with each other, but not outside their Vserver. [Figure: Node 1 and Node 2, each with a node manager (Mgr) and a VMM above its hardware; a slice spans Vservers on both nodes.]

PlanetLab (continued) • Users test PlanetLab’s distribution transparency in experiments on virtual server clusters called “slices,” which gather Vservers of different nodes (i.e., running on separate hardware). • PlanetLab’s management problem discoveries: • Each organization should be able to control who uses its node(s). • Each of the organizations’ various monitoring tools assumes very specific hardware and software configurations. • Programs in different slices that share a common node must not interfere with each other.

PlanetLab (continued) • Every node has one “node manager” (i.e., one Vserver) dedicated to creating other Vservers for its node and controlling their resources; e.g., disk space, file descriptors, network bandwidth. • Resources are allocated to processes in strict time intervals by an “rspec” specification. Every resource has an “rcap” list of capabilities that the node manager can look up in a table. • Each slice belongs to an (end-user) “service provider,” who has an account on PlanetLab. To create a new slice of nodes, they issue “slice creation services” (SCS) to ask node managers to create a slice Vserver and allocate its resources.

PlanetLab Management • Only a software “slice authority” can issue the SCS to create a slice, prompted by a (web-connected, human) certified user. • Node-owners and their “management authorities” software enforce PlanetLab rules in the bottom two architectural layers. • Conclusions: • Large server clusters must be managed by intermediaries with clearly delineated authority. • End-user service providers request slices, but organizations’ resource providers manage them.

R U O K ? 21. What are some common approaches to managing server clusters? • Login to each node to monitor its operations, to install new software or to swap components. • Use IBM’s Cluster System Management tool to administer up to 50 servers. • Maintain many thousands of servers in ad hoc fashion. • All of the above. • None of the above.

R U O K ? 22. Which of the following accurately characterize the PlanetLab experimental server cluster? • It has only one tier. • Each organization donated one or more of its ~200 nodes. • A virtual machine monitor (VMM) enforces a security/reliability shield between the hardware and every independent Vserver above it. • Each (virtual) Vserver runtime environment supports a family of similar-vintage legacy processes, which share files with each other, but not outside their Vserver. • All of the above.

R U O K ? 23. What problems did PlanetLab’s managers discover? • Each organization should be able to control who uses its node(s). • Each of the organizations’ various monitoring tools assumes very specific hardware and software configurations. • Programs in different slices that share a common node must not interfere with each other. • All of the above. • None of the above.
