This text delves into the intricacies of parallel and distributed systems, focusing on virtualization, resource sharing, and job management. It outlines the benefits of gang scheduling in efficiently utilizing CPUs, contrasting it with traditional time-sharing methods. The discussion includes the principles of virtualization, the role of Type 1 hypervisors, and the importance of designing operating systems that ensure proper resource management and isolation. Key challenges in system design, including concurrency, user demands, and the necessity for portability, are also addressed.
Today • Parallel and Distributed Systems, Virtualization • Systems Take-Away • Final • No break: end early • Chapters 8 and 13
Space Sharing • A set of 32 CPUs split into four partitions, with two CPUs available • One OS per node • A parallel/distributed job has interprocessor communication • A job gets exclusive use of its K processors: no other jobs run on them, and it runs to completion, usually in batch mode
Time Sharing • Processes of different jobs are co-mingled on the same CPUs • Pros/cons for parallel jobs? • More efficient utilization, particularly if processes block … • But communicating processes/threads may not be running at the same time
Gang Scheduling • Figure 8-14. Communication between two threads belonging to process A that are running out of phase. A more capable scheduler is needed to ensure A0 and A1 run together!
Gang Scheduling The three parts of gang scheduling: • Groups of related processes/threads are scheduled as a unit, a gang. • All members of a gang run simultaneously, on different timeshared CPUs. • All gang members start and end their time slices together.
Gang Scheduling • Figure 8-15. Gang scheduling.
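The slot table of Figure 8-15 can be sketched in C. This is a toy illustration only: the table dimensions, function names, and placement strategy are assumptions, not taken from any real scheduler. The key property is that a whole gang lands in one row (one time slice), so its members always run together.

```c
/* Minimal sketch of a gang-scheduling slot table: schedule[slot][cpu]
 * holds the gang id running in that time slice on that CPU. */
#include <assert.h>

#define CPUS  4
#define SLOTS 4
#define IDLE  -1

static int schedule[SLOTS][CPUS];

void init_schedule(void) {
    for (int s = 0; s < SLOTS; s++)
        for (int c = 0; c < CPUS; c++)
            schedule[s][c] = IDLE;
}

/* Place a gang of `size` threads into the first time slot with enough
 * free CPUs; returns the slot index, or -1 if no slot fits.  Because
 * the whole gang occupies one row, its threads start and end their
 * time slices together. */
int place_gang(int gang_id, int size) {
    for (int s = 0; s < SLOTS; s++) {
        int free = 0;
        for (int c = 0; c < CPUS; c++)
            if (schedule[s][c] == IDLE) free++;
        if (free >= size) {
            int placed = 0;
            for (int c = 0; c < CPUS && placed < size; c++)
                if (schedule[s][c] == IDLE) {
                    schedule[s][c] = gang_id;
                    placed++;
                }
            return s;
        }
    }
    return -1;
}
```

First-fit placement leaves holes (a gang of 3 on 4 CPUs strands one CPU in that slice), which is exactly the utilization cost gang scheduling pays for co-scheduling.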
Virtualization • Consolidate K servers onto L physical machines (L < K) • Why? • Save money and space • Run legacy apps on new hardware • Self-contained environment/OS that is easily migrated • Reliability? • Failures are mostly due to buggy code, not hardware, so separate VMs still isolate faults
Type 1 Hypervisors • When the operating system in a virtual machine executes a kernel-only instruction, it traps to the hypervisor if virtualization technology (VT) is present. If not, what would happen? • Downsides? • Requires hardware support, and a Type 1 hypervisor is essentially a new OS written from scratch
T1 vs. T2 • VMware uses binary translation: sensitive instructions are replaced with calls that emulate them; e.g. an I/O instruction becomes a call into the hypervisor, which can issue a read syscall on the host
Paravirtualization • T1/T2 both pay emulation overhead. Xen (and VMware as well) support paravirtualization: the guest OS is modified to contain no sensitive instructions at all, just explicit calls to the hypervisor
Distributed Systems • Classic scheduling • Sender- or receiver-initiated • Client-server systems • saturate the bandwidth going into the server … • non-scalable
Peer-to-Peer (P2P) • Systems Perspective: Scalable
Grid • The “Grid” is a vision • ensemble of geographically-dispersed resources • seamless, transparent access • Analogy to Power Grids • cheap, ubiquitous, consistent • computational grids deliver computing & data - not power • Core Grid Features and Challenges • single-sign on • dynamic and shared • highly heterogeneous • multiple administrative domains • sheer scale • Systems Perspective: Wide-area OS
Cloud • Buzzwords: virtualized, pay-as-you-go, scale up • Systems perspective: centralized management, co-location
CSci 5103 Operating Systems • Operating System Design • Tanenbaum MOS Chap. 13
Goals of an OS • Define abstractions • Provide primitive operations • Ensure isolation • Manage the hardware • Multiple roles of the OS …
Why is it hard to design an OS? • Extremely large program: Unix ~1M lines of code, Windows 2000 ~29M • Concurrency • Hostile users • Users want to share, yet isolation must be provided • Long-lived: hardware will change, the OS must adapt • Designers have little idea how the system will be used • Portability is a must • Backward compatibility
Interface Design • Simplicity (KISS) • "Perfection is reached not when there is no longer anything to add, but when there is no longer anything to take away" • Completeness • "Everything should be as simple as possible, but no simpler" • Key idea: "minimum of mechanism", i.e. elegance • Efficiency • e.g. system calls should be efficient, and their cost should be evident to the programmer • Which is faster: fork or thr_create?
System Call Interface • Adding more code adds more bugs • Don't hide power: hide unnecessary complexity, but expose powerful hardware features • E.g. if the hardware provides a way to move large bitmaps around the screen (video RAM), you might want to provide a syscall for it
System Call Interface • Exploit unifying paradigms • In Unix many objects have file-like behavior (they are sources/sinks for data) • Instead of read_file, read_tty, read_socket, read_device, …, a single read call can be used
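The unifying-paradigm idea can be sketched with function pointers. This is not the actual Unix implementation; the struct and names below are illustrative, but the dispatch pattern is how "everything is a file" works in practice:

```c
/* Sketch of a unified read: each object carries its own read operation,
 * chosen at open time, and one generic read() dispatches through it. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct object;
typedef size_t (*read_op)(struct object *obj, char *buf, size_t n);

struct object {
    read_op read;       /* per-type behavior (file, tty, socket, ...) */
    const char *data;   /* backing data for this toy example */
};

/* One concrete implementation: copy from an in-memory buffer */
static size_t mem_read(struct object *obj, char *buf, size_t n) {
    size_t len = strlen(obj->data);
    if (n > len) n = len;
    memcpy(buf, obj->data, n);
    return n;
}

/* The single generic call: callers never care what kind of object it is */
size_t generic_read(struct object *obj, char *buf, size_t n) {
    return obj->read(obj, buf, n);
}
```

Adding a new source/sink type means writing one new read_op, with no change to callers, which is why the single-call interface stays small as the system grows.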
Implementation • Monolithic: Unix has little structure or modularity • Layering • Client/server/microkernel • Many OS functions are implemented as servers in user space • Extensible? • Performance?
Mechanism vs. Policy • Mechanism defines how something is implemented • Policy governs how the mechanism is used • The two should be kept separate • E.g. mechanism: priority scheduling for threads; policy: how priorities are set • ACLs • Orthogonality • Combine separate concepts/mechanisms • Process is a container for resources • Thread is a schedulable entity • Evolve each capability separately and combine as needed
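The priority-scheduling example above can be made concrete. This is a toy sketch with assumed names: the mechanism always runs the highest-priority thread, and the policy (how priorities are assigned) lives in a separate function that can be swapped without touching the mechanism.

```c
/* Mechanism vs. policy: pick_next() is the mechanism; the policy
 * function decides what the priorities are. */
#include <assert.h>

#define NTHREADS 3

static int priority[NTHREADS];

/* Mechanism: pick the ready thread with the highest priority value */
int pick_next(void) {
    int best = 0;
    for (int i = 1; i < NTHREADS; i++)
        if (priority[i] > priority[best]) best = i;
    return best;
}

/* One possible policy: boost interactive threads.  Replacing this with
 * a different policy never changes pick_next(). */
void policy_favor_interactive(const int *is_interactive) {
    for (int i = 0; i < NTHREADS; i++)
        priority[i] = is_interactive[i] ? 10 : 1;
}
```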
Naming • Human-readable names hide underlying complexity • Binding time: when is a name bound to an address? • Early vs. late binding • Example?
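Early vs. late binding can be shown with a toy name table (the table, names, and numeric "addresses" here are all invented for illustration): early binding captures the address once, so it goes stale if the object moves; late binding resolves the name on every use.

```c
/* Toy name-to-address binding: resolve() is the lookup, rebind()
 * simulates the named object moving to a new address. */
#include <assert.h>
#include <string.h>

static struct { const char *name; int addr; } table[] = {
    { "mbox", 100 },
};

int resolve(const char *name) {
    for (unsigned i = 0; i < sizeof table / sizeof table[0]; i++)
        if (strcmp(table[i].name, name) == 0) return table[i].addr;
    return -1;
}

void rebind(const char *name, int addr) {
    for (unsigned i = 0; i < sizeof table / sizeof table[0]; i++)
        if (strcmp(table[i].name, name) == 0) table[i].addr = addr;
}

/* Late binding: resolve at each access, so a moved object is found */
int late_access(const char *name) { return resolve(name); }
```

The tradeoff is the classic one: early binding is fast (no lookup per use) but inflexible; late binding pays a lookup on every access but tolerates change.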
Static vs. Dynamic Structures • Searching a static table for a pid is easy. Suppose it were a linked list of mini-tables? • Flexibility vs. complexity • Dynamic = flexibility, but with pitfalls
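The static-table side of the tradeoff can be sketched as a fixed-size process table (sizes and names are assumptions): the code is short and hard to get wrong, but the table can fill up, which is the price of a static structure.

```c
/* A fixed-size process table searched linearly by pid: simple and
 * bug-light, but its capacity is fixed at build time. */
#include <assert.h>

#define NPROC 16

static struct { int pid; int in_use; } proc_table[NPROC];

/* Claim a free slot; returns the slot index, or -1 if the table is
 * full (the static structure's hard limit) */
int proc_add(int pid) {
    for (int i = 0; i < NPROC; i++)
        if (!proc_table[i].in_use) {
            proc_table[i].pid = pid;
            proc_table[i].in_use = 1;
            return i;
        }
    return -1;
}

/* Brute-force linear search: fine at this scale */
int proc_find(int pid) {
    for (int i = 0; i < NPROC; i++)
        if (proc_table[i].in_use && proc_table[i].pid == pid) return i;
    return -1;
}
```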
Useful Implementation Techniques • Indirection • “no problem in Computer Science that cannot be solved with another level of indirection” • Examples of indirection? • Tradeoff? • Reusability • keep OS binary small • e.g. bitmap routines • Brute Force • assembly code for critical routines • linear searches are ok on small tables: sort/hash code can have bugs
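A classic OS example of indirection is the handle table (file descriptors work this way; the names below are invented for the sketch): user code holds a small integer handle, a table maps it to the real object, and the object can be replaced behind the handle without the handle changing.

```c
/* Indirection via a handle table: callers keep a stable small integer
 * while the object behind it can be swapped out. */
#include <assert.h>

#define MAXHANDLES 8

static int objects[MAXHANDLES];   /* handle -> object id; 0 means free */

/* Allocate the first free slot for an object; -1 if none left */
int handle_open(int obj) {
    for (int h = 0; h < MAXHANDLES; h++)
        if (objects[h] == 0) { objects[h] = obj; return h; }
    return -1;
}

int handle_deref(int h) { return objects[h]; }

/* Replace the object behind a handle; callers' handles stay valid */
void handle_rebind(int h, int obj) { objects[h] = obj; }
```

The tradeoff mentioned on the slide is visible here: every access pays one extra table lookup in exchange for the flexibility.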
Performance • Why does MS-DOS boot in seconds on old x86 hardware while Windows 2000 takes minutes on hardware that is hundreds of times faster? • Features, features, features – e.g. plug-and-play (on every boot, inspect the status of all hardware) • Before adding a feature • ask what the price is in code size, speed, complexity, and reliability • ask what would happen if we DIDN'T add it • Optimize useful things, not rarely used features • Good enough is good enough • Optimize the common case: example?
Performance (cont'd) • Space-Time Trade-Offs • Memory and CPU tradeoff • e.g. store small records (4 bytes) that we want to look up, sort, delete, etc. • array: all operations are linear time (except sort) • binary tree: all operations are log time, at roughly double the storage • ACL vs. ACM • Other tricks • Use macros to save function-call overhead • #define max(a, b) ((a) > (b) ? (a) : (b)) • Keep critical regions as small as possible
Performance (cont'd) • Caching: e.g. the i-node cache • To look up /usr/ast/mbox requires: • 1. read the i-node for the root dir, 2. read the root dir, 3. read the i-node for /usr, 4. read the /usr dir, 5. read the i-node for /usr/ast, 6. read the /usr/ast dir • Other examples of caching? TLB, …
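The i-node cache idea can be sketched with a toy lookup table and a counter standing in for disk reads (everything here, including the fake disk_lookup, is invented for illustration): the first lookup of a name pays a disk access, repeats hit the cache for free.

```c
/* Toy i-node cache: lookup() checks the cache first, and only goes to
 * the simulated disk on a miss. */
#include <assert.h>
#include <string.h>

#define CACHE_SIZE 8

static struct { char name[32]; int inode; int valid; } cache[CACHE_SIZE];
int disk_reads = 0;   /* counts simulated disk accesses */

/* Simulated disk: derive an i-node number from the name (a stand-in
 * for a real directory search) */
static int disk_lookup(const char *name) {
    disk_reads++;
    return (int)strlen(name);
}

int lookup(const char *name) {
    for (int i = 0; i < CACHE_SIZE; i++)
        if (cache[i].valid && strcmp(cache[i].name, name) == 0)
            return cache[i].inode;            /* cache hit: no disk read */
    int ino = disk_lookup(name);              /* miss: pay the disk read */
    for (int i = 0; i < CACHE_SIZE; i++)
        if (!cache[i].valid) {                /* remember it for next time */
            strncpy(cache[i].name, name, sizeof cache[i].name - 1);
            cache[i].inode = ino;
            cache[i].valid = 1;
            break;
        }
    return ino;
}
```

A real cache also needs invalidation (what happens when /usr/ast is renamed?) and an eviction policy when it fills, which is where most of the complexity lives.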
Systems Mantras • Be clever only at high utilization! • Simple techniques work well at low resource demand • e.g. sender-initiated scheduling with a random destination works well under low load; under high load it can lead to many hops • Bulk operations work better than a large number of small ones • Indirection, indirection, …
Final • When: Weds July 28 in the classroom, 12:20-2:20 • Final is based on material since last exam – closed everything • memory management (working set, thrashing, memory hogs, …) • I/O • file systems: design and implementation including distributed file systems • provenance • protection • distributed multiprocessor systems • Lecture notes and papers • 2 hours allocated – exam will be same length as in-class exams • As before it will be a mixture of short answer and longer questions
Hints • I will ask a working set question • I will ask an i-node type question • I will ask a question regarding DFS
That’s All Folks • Good luck on the final! • Weds July 28 in the classroom, 12:20-2:20