Introduction • Motivation • Sharing common resources • Multiprocessor architecture
Multiple Threads and Processors • True parallelism for multiprocessor architectures • Threads are multiplexed onto processors if the number of threads T exceeds the number of processors P • Ideally: if an application needs 1 unit of time in its single-threaded version, it needs only 1/n unit of time in a multithreaded version on a computer with n processors.
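The ideal case on this slide can be written down as a small calculation. The sketch below is only an illustration under strong assumptions: the work divides perfectly among the threads and there is no scheduling or synchronization overhead. The function name `ideal_time` and its parameters are hypothetical.

```python
def ideal_time(single_thread_time, threads, processors):
    """Ideal run time, assuming perfectly divisible work and zero overhead."""
    # At most min(threads, processors) threads can execute at the same instant;
    # any extra threads are multiplexed onto the available processors.
    parallel_units = min(threads, processors)
    return single_thread_time / parallel_units
```

For example, `ideal_time(1.0, 4, 4)` gives 0.25 (the 1/n case), while `ideal_time(1.0, 8, 2)` gives 0.5 because eight threads must share two processors.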
Concurrency & Parallelism • Concurrency: the maximum parallelism an application can achieve with an unlimited number of processors. • Parallelism: the actual degree of parallel execution achieved, which is limited by the number of physical processors. • User/System concurrency
Fundamental Abstraction • A process is a compound entity that can be divided into two components: a set of threads and a collection of resources. • A thread is a dynamic object that represents a control point in the process and that executes a sequence of instructions. • A thread uses the shared resources of its process plus its own private objects: program counter, stack, register context
Processes • Have a virtual address space which holds the process image • Protected access to processors, other processes, files, and I/O resources
Threads • Each thread has an execution state (running, ready, etc.) • Saves thread context when not running • Has an execution stack • Has some per-thread static storage for local variables • Has access to the memory and resources of its process, which all threads of the process share
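The split between per-thread storage and process-wide shared state can be sketched with Python's standard `threading` module. This is a minimal illustration, not a model of any particular kernel; the names `shared`, `private`, `worker`, and `run` are hypothetical.

```python
import threading

shared = {"hits": 0}            # process-wide state, visible to every thread
shared_lock = threading.Lock()  # protects updates to the shared dictionary
private = threading.local()     # each thread sees its own attributes here

def worker(name, results):
    private.name = name         # per-thread storage, invisible to other threads
    with shared_lock:
        shared["hits"] += 1     # shared memory of the process
    results[name] = private.name

def run(names):
    results = {}
    threads = [threading.Thread(target=worker, args=(n, results)) for n in names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Each worker reads back only the value it stored in `private`, while all of them update the same `shared` counter.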
Threads and Processes • One process, one thread • One process, multiple threads • Multiple processes, one thread per process • Multiple processes, multiple threads per process
Single-Threaded and Multithreaded Process Models (figure) • Single-threaded process model: process control block, user address space, user stack, kernel stack • Multithreaded process model: process control block and user address space, plus a thread control block, user stack, and kernel stack for each thread
Benefits of Threads • Takes less time to create a new thread than a process • Less time to terminate a thread than a process • Less time to switch between two threads within the same process • Since threads within the same process share memory and files, they can communicate with each other without invoking the kernel
Threads • Suspending a process involves suspending all threads of the process, since all threads share the same address space • Terminating a process terminates all threads within the process
Kernel Threads • A kernel thread is responsible for executing a specific function. • It shares the kernel text and global data, and has its own kernel stack. • Independently scheduled. • Useful to handle asynchronous I/O. • Inexpensive
Lightweight Processes • An LWP is a kernel-supported user thread. • It belongs to a user process. • Independently scheduled. • Shares the address space and other resources of the process. • LWPs must synchronize their access to shared data. • Blocking an LWP is expensive.
User Threads • All thread management is done by the application • The kernel is not aware of the existence of threads • Thread switching does not require kernel mode privileges • Scheduling is application specific
User Threads • Created by a threads library such as C-threads (Mach) or pthreads (POSIX). • A user-level library multiplexes user threads on top of LWPs and provides facilities for inter-thread scheduling, context switching, and synchronization without involving the kernel.
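To make the idea of scheduling and context switching "without involving the kernel" concrete, here is a toy round-robin scheduler in which generators stand in for user threads. It is a sketch only: real libraries such as C-threads or pthreads save register state and stacks, which this example does not attempt. The names `UserScheduler`, `task`, and `demo` are hypothetical.

```python
from collections import deque

class UserScheduler:
    """Toy user-level scheduler: round-robin over generator-based threads."""

    def __init__(self):
        self.ready = deque()          # queue of runnable user threads

    def spawn(self, thread):
        self.ready.append(thread)

    def run(self):
        while self.ready:
            thread = self.ready.popleft()
            try:
                next(thread)          # run the thread until it yields
            except StopIteration:
                continue              # thread finished; do not requeue it
            self.ready.append(thread) # still alive: back of the ready queue

def task(name, steps, log):
    for i in range(steps):
        log.append((name, i))
        yield                         # voluntary switch back to the scheduler

def demo():
    log = []
    sched = UserScheduler()
    sched.spawn(task("A", 2, log))
    sched.spawn(task("B", 3, log))
    sched.run()
    return log
```

Every switch here is an ordinary function return inside one process, which is why user-thread switching needs no kernel-mode privileges.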
Latency
               Creation time   Synchronization time using semaphores
User thread    52              66
LWP            350             390
Process        1700            200
Split scheduling • The threads library schedules user threads • The kernel schedules LWPs and processes. • Every thread is eventually scheduled. • True: each thread is bound to one LWP. • False: Multiplexed on several LWPs
Lightweight Process Design • Semantics of fork • fork creates a child process, with two possible designs: • Copy only the calling LWP into the new process. • Duplicate all the LWPs of the parent. • All LWPs share a set of file descriptors • All LWPs share a common address space and may manipulate it concurrently through system calls such as mmap and brk.
Signal Delivery and Handling • Different methods: • Send the signal to each thread • Appoint a master thread in each process to receive all signals • Send the signal to any arbitrarily chosen thread • Use heuristics to determine the thread to which the signal applies • Create a new thread to handle each signal
Stack Growth • A stack overflow causes a segmentation violation fault. • The kernel has no knowledge of user thread stacks. • It is the thread’s responsibility to extend the stack or handle the overflow; the kernel responds by sending a SIGSEGV signal to the appropriate thread.
User-Level Thread Libraries • The Programming Interface • Creating and terminating threads • Suspending and resuming threads • Assigning priorities to individual threads • Thread scheduling and context switching • Synchronizing activities through facilities such as semaphores and mutual exclusion locks • Sending messages from one thread to another • The priority of a thread is simply a process-relative priority used by the threads scheduler to select a thread to run within the process.
Implementing Threads Libraries • By LWPs: • Bind each thread to a different LWP. • Multiplex user threads on a (smaller) set of LWPs. • Allow a mixture of bound and unbound threads in the same process.
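The second option, multiplexing many user threads on a smaller set of LWPs, can be illustrated by analogy with a fixed-size worker pool. In this sketch, pool worker threads play the role of LWPs and submitted tasks play the role of user threads; that mapping is an assumption made for illustration, not how a threads library is implemented. The function name `run_multiplexed` is hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def run_multiplexed(num_tasks, num_workers):
    """Run num_tasks tasks on at most num_workers carrier threads."""
    def task(i):
        # Report which carrier thread (the stand-in for an LWP) ran this task.
        return i, threading.get_ident()

    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        results = list(pool.map(task, range(num_tasks)))

    carriers = {ident for _, ident in results}
    return [i for i, _ in results], carriers
```

Running six tasks with two workers completes all six, yet never uses more than two carriers, which is the essence of the multiplexed design.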
Multithreading in Solaris • Process includes the user’s address space, stack, and process control block • User-level threads (ULTs, threads library) • invisible to the OS • are the interface for application parallelism • Kernel threads (KLTs) • a kernel thread can be dispatched on a processor, and its structures are maintained by the kernel • Lightweight processes (LWP) • each LWP supports one or more ULTs and maps to exactly one KLT • each LWP is visible to the application
Multithreading in Solaris • Kernel Threads(resources) • Saved copy of the kernel registers • Priority and scheduling information • Pointers to put the thread on a scheduler queue or, if the thread is blocked, on a resource wait queue. • Pointer to the stack. • Pointers to the associated lwp and proc structures • Pointers to maintain a queue of all threads of a process and a queue of all threads in the system • Information about the associated LWP
Lightweight Process Impl. • lwp: (per-LWP part) • Saved values of user-level registers • System call arguments, results, and error code • Signal handling information • Resource usage and profiling data • Virtual time alarms • User time and CPU usage • Pointer to the kernel thread • Pointer to the proc structure
LWP Impl. • Synchronization: • Mutex • Condition variables • Counting semaphores • Read-write locks • All LWPs share a common set of signal handlers. • Each LWP may mask signals individually.
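Three of the listed primitives (mutex, condition variable, counting semaphore) can be seen working together in a bounded producer/consumer buffer. The sketch uses Python's `threading` module rather than the Solaris interfaces, and it omits read-write locks because the Python standard library does not provide one. The function name `bounded_buffer_demo` is hypothetical.

```python
import threading

def bounded_buffer_demo(items, capacity=2):
    """Pass items from a producer to a consumer through a bounded buffer."""
    buffer, consumed = [], []
    lock = threading.Lock()                # mutex protecting the buffer
    not_empty = threading.Condition(lock)  # condition variable on that mutex
    slots = threading.Semaphore(capacity)  # counting semaphore: free slots

    def producer():
        for item in items:
            slots.acquire()                # wait for a free slot
            with not_empty:
                buffer.append(item)
                not_empty.notify()         # wake the consumer if it is waiting

    def consumer():
        for _ in range(len(items)):
            with not_empty:
                while not buffer:          # re-check the predicate after waking
                    not_empty.wait()
                consumed.append(buffer.pop(0))
            slots.release()                # one slot is free again

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return consumed
```

With one producer and one consumer, the items arrive in the order they were produced.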
User Threads • Threads library: create, destroy, manage threads without kernel • Threads library was built on LWPs. • Most applications are written with user threads. • Two types: bound & unbound • Having many more threads than LWPs is a disadvantage. • Having bound & unbound threads in one application is very useful.
• Process 2 is equivalent to a pure ULT approach • Process 4 is equivalent to a pure KLT approach • We can specify a different degree of parallelism (processes 3 and 5)
User Thread Implementation • User thread info • Thread ID • Saved register state • User Stack • Signal mask • Priority • Thread local storage • User thread synchronization • Synchronization variables in shared memory
Solaris: versatility • We can use ULTs when logical parallelism does not need to be supported by hardware parallelism (we save mode switching) • Ex: Multiple windows but only one is active at any one time • If threads may block then we can specify two or more LWPs to avoid blocking the whole application
Solaris: user-level thread execution • Transitions among states are under the exclusive control of the application • A transition can occur only when a call is made to a function of the thread library • Only when a ULT is in the active state is it attached to an LWP (so that it will run when the kernel-level thread runs) • A thread may transfer to the sleeping state by invoking a synchronization primitive and later transfer to the runnable state when the event waited for occurs • A thread may force another thread to go to the stop state...
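The states named on this slide can be sketched as a small transition table. Only the sleep, wakeup, and stop transitions come directly from the bullets above; the event names, the `dispatch` and `continue` transitions, and the set of states from which `stop` is allowed are assumptions added to make the example complete.

```python
from enum import Enum

class State(Enum):
    RUNNABLE = "runnable"
    ACTIVE = "active"      # attached to an LWP
    SLEEPING = "sleeping"
    STOPPED = "stopped"

# (current state, event) -> next state. Event names are hypothetical.
TRANSITIONS = {
    (State.RUNNABLE, "dispatch"): State.ACTIVE,   # assumption: library picks it
    (State.ACTIVE, "sleep"): State.SLEEPING,      # synchronization primitive
    (State.SLEEPING, "wakeup"): State.RUNNABLE,   # awaited event occurs
    (State.RUNNABLE, "stop"): State.STOPPED,      # forced by another thread
    (State.ACTIVE, "stop"): State.STOPPED,
    (State.SLEEPING, "stop"): State.STOPPED,
    (State.STOPPED, "continue"): State.RUNNABLE,  # assumption: resume path
}

def step(state, event):
    """Return the next state, or raise ValueError for an illegal transition."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal transition: {state.value} on {event!r}")
```

A thread library would call something like `step` from its own functions, which matches the point that transitions happen only on library calls.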
Solaris: user-level thread states (attached to a LWP)
Decomposition of user-level Active state • When a ULT is Active, it is associated with an LWP and thus with a KLT • Transitions among the LWP states are under the exclusive control of the kernel • An LWP can be in the following states: • running: when the KLT is executing • blocked: because the KLT issued a blocking system call (but the ULT remains bound to that LWP and remains active) • runnable: waiting to be dispatched to a CPU
Solaris: Lightweight Process States LWP states are independent of ULT states (except for bound ULTs)
Interrupt Handling • Traditional kernels: synchronization using interrupt priority levels (ipl) • Solaris: synchronization using mutex locks and semaphores • Interrupts are handled by kernel threads with the highest priorities • The interrupt handling thread is preallocated and partially initialized
System Call Handling • fork(): duplicates every LWP of the parent. • fork1(): copies only the thread that invokes it • Concurrent random I/O: pread(), pwrite() • Programmers can write applications using only threads, and later optimize them by manipulating the underlying LWPs to best provide the real concurrency needed by the application.
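The value of pread() and pwrite() for concurrent random I/O is that each call carries its own offset, so threads sharing one file descriptor do not race on a shared file position. The sketch below uses Python's `os.pread` and `os.pwrite` wrappers, which are available on Unix-like systems; the function name `concurrent_pread` and the temporary file are hypothetical details of the example.

```python
import os
import tempfile
import threading

def concurrent_pread(data, chunk):
    """Write data to a temp file, then read it back in chunks from many threads."""
    fd, path = tempfile.mkstemp()
    try:
        os.pwrite(fd, data, 0)                 # write at an explicit offset
        offsets = range(0, len(data), chunk)
        parts = {}

        def reader(off):
            # Each thread supplies its own offset; the shared file position
            # of fd is neither used nor modified.
            parts[off] = os.pread(fd, chunk, off)

        threads = [threading.Thread(target=reader, args=(o,)) for o in offsets]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return b"".join(parts[o] for o in offsets)
    finally:
        os.close(fd)
        os.unlink(path)
```

With a separate lseek() followed by read(), two threads could interleave between the two calls and read from the wrong position; the combined call avoids that.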
pthread Library • POSIX-compliant • User-level • aioread(): a new thread is created to handle the I/O
Linux Task Data Structure • State • Scheduling information • Identifiers • Interprocess communication • Links • Times and timers • File system • Address space • Processor-specific context