COP 5611 Operating Systems Spring 2010

Dan C. Marinescu
Office: HEC 439 B
Office hours: M-Wd 2:00-3:00 PM

Lecture 6
• Last time: Virtualization
• Today: Thread coordination; Scheduling
• Next time: Multi-level memories; I/O bottleneck

Evolution of ideas regarding communication among threads using a bounded buffer
• Use locks -> does not address the busy waiting problem
• YIELD -> based on voluntary release of the processor by individual threads
• Use WAIT (for an event) and NOTIFY (when the event occurs) primitives
• Use AWAIT (for an event) and ADVANCE (when the event occurs)

Primitives for thread sequence coordination
• YIELD requires the thread to periodically check if a condition has occurred.
• Basic idea -> use events and construct two before-or-after actions
  • WAIT(event_name) -> issued by the thread which can continue only after the occurrence of the event event_name.
  • NOTIFY(event_name) -> search the thread_table to find a thread waiting for the occurrence of the event event_name.

This solution does not work
• The NOTIFY should always be sent after the WAIT.
• If the sender and the receiver run on two different processors there could be a race condition for the notempty event: the NOTIFY could be sent before the WAIT, in which case no thread is waiting yet, the notification is lost, and the receiver waits indefinitely.
• Tension between modularity and locks
• Several possible solutions: AWAIT/ADVANCE, semaphores, etc.
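The lost notification can be replayed deterministically without real threads. The sketch below is a toy model, not the thread manager from the lecture: the `EventTable` class and the `replay` helper are hypothetical names, and the only assumption taken from the slide is that a NOTIFY reaches only threads that are already waiting.

```python
class EventTable:
    """Toy model of WAIT/NOTIFY: a notification is delivered only to
    threads that are already waiting; otherwise it is dropped."""

    def __init__(self):
        self.waiting = {}  # event_name -> list of waiting thread names

    def wait(self, event_name, thread):
        # The thread is recorded as waiting for the event.
        self.waiting.setdefault(event_name, []).append(thread)

    def notify(self, event_name):
        # Returns the threads made runnable; an empty list means
        # nobody was waiting and the notification is lost.
        return self.waiting.pop(event_name, [])


def replay(order):
    """Replay the receiver's WAIT and the sender's NOTIFY in the given
    order and return the set of threads still stuck in WAIT."""
    table = EventTable()
    for step in order:
        if step == "WAIT":
            table.wait("notempty", "receiver")
        elif step == "NOTIFY":
            table.notify("notempty")
    return {t for threads in table.waiting.values() for t in threads}
```

With the order `["WAIT", "NOTIFY"]` nobody remains blocked; with `["NOTIFY", "WAIT"]` the receiver stays in the waiting set, which is the race described on the slide.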

AWAIT - ADVANCE solution
• A new state, WAITING, and two before-or-after actions that take a RUNNING thread into the WAITING state and back to the RUNNABLE state.
• eventcount -> variables with an integer value shared between threads and the thread manager; they are like events but have a value.
• A thread in the WAITING state waits for a particular value of the eventcount.
• AWAIT(eventcount, value)
  • If eventcount > value -> control is returned to the thread calling AWAIT and this thread continues execution.
  • If eventcount ≤ value -> the state of the thread calling AWAIT is changed to WAITING and the thread is suspended.
• ADVANCE(eventcount)
  • increments the eventcount by one, then
  • searches the thread_table for threads waiting for this eventcount;
  • if it finds a thread and the eventcount exceeds the value the thread is waiting for, then the state of the thread is changed to RUNNABLE.
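A minimal sketch of an eventcount, assuming Python's `threading.Condition` stands in for the thread manager and its thread_table. The class name `Eventcount` and the method names `await_`, `advance`, and `read` are illustrative choices, not the lecture's implementation; the blocking rule (suspend while eventcount ≤ value) follows the slide.

```python
import threading


class Eventcount:
    """Integer counter that threads can wait on (sketch only)."""

    def __init__(self):
        self._count = 0
        self._cond = threading.Condition()

    def read(self):
        with self._cond:
            return self._count

    def await_(self, value):
        # Return once the eventcount exceeds `value`; while it is
        # less than or equal to `value` the caller stays suspended.
        with self._cond:
            while self._count <= value:
                self._cond.wait()

    def advance(self):
        # Increment by one, then wake the waiters so each can
        # re-check the value it is waiting for.
        with self._cond:
            self._count += 1
            self._cond.notify_all()
```

Because the counter keeps its value, an `advance` that happens before the matching `await_` is not lost: the later `await_` sees the already incremented count and returns immediately.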

Solution for a single sender and multiple receivers

Supporting multiple senders: the sequencer
• Sequencer -> shared variable supporting thread sequence coordination; it allows threads to be ordered and is manipulated using two before-or-after actions.
• TICKET(sequencer) -> returns a non-negative value which increases by one at each call. Two concurrent threads calling TICKET on the same sequencer will receive different values based upon the timing of the call; the one calling first will receive a smaller value.
• READ(sequencer) -> returns the current value of the sequencer
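A minimal sketch of a sequencer, assuming a `threading.Lock` makes TICKET and READ before-or-after actions. The class `Sequencer` and the helper `collect_tickets` are hypothetical names; treating READ as "the value the next TICKET would return" is also an assumption, since the slide only says it returns the current value.

```python
import threading


class Sequencer:
    """Hands out 0, 1, 2, ... with no value issued twice (sketch only)."""

    def __init__(self):
        self._next = 0
        self._lock = threading.Lock()

    def ticket(self):
        # The lock makes read-and-increment a single before-or-after action.
        with self._lock:
            value = self._next
            self._next += 1
            return value

    def read(self):
        with self._lock:
            return self._next


def collect_tickets(seq, workers=8, per_worker=100):
    """Let several threads draw tickets concurrently; return one list per worker."""
    results = [[] for _ in range(workers)]

    def work(i):
        for _ in range(per_worker):
            results[i].append(seq.ticket())

    threads = [threading.Thread(target=work, args=(i,)) for i in range(workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return results
```

Each worker sees strictly increasing values, and across all workers every value from 0 up to the number of calls minus one is issued exactly once.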

Multiple sender solution; only the SEND must be modified
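The slide's figure is not reproduced here, so the following is a sketch of one way SEND could combine a sequencer with the two eventcounts, under stated assumptions: AWAIT returns when the eventcount exceeds the given value (as defined earlier in the lecture), there is exactly one receiver, and all class, method, and field names (`BoundedBuffer`, `in_`, `out`, `sender`, `run_demo`) are illustrative.

```python
import threading


class Eventcount:
    def __init__(self):
        self.count = 0
        self.cond = threading.Condition()

    def read(self):
        with self.cond:
            return self.count

    def await_(self, value):
        with self.cond:
            while self.count <= value:  # suspended while count <= value
                self.cond.wait()

    def advance(self):
        with self.cond:
            self.count += 1
            self.cond.notify_all()


class Sequencer:
    def __init__(self):
        self.next = 0
        self.lock = threading.Lock()

    def ticket(self):
        with self.lock:
            self.next += 1
            return self.next - 1


class BoundedBuffer:
    def __init__(self, n):
        self.n = n
        self.slots = [None] * n
        self.in_ = Eventcount()    # messages sent so far
        self.out = Eventcount()    # messages received so far
        self.sender = Sequencer()  # orders the senders

    def send(self, msg):
        t = self.sender.ticket()     # this sender's position in line
        self.in_.await_(t - 1)       # proceed once in >= t: our turn
        self.out.await_(t - self.n)  # proceed once t - out < n: a slot is free
        self.slots[t % self.n] = msg
        self.in_.advance()

    def receive(self):
        # Single receiver assumed, so `out` cannot change under us.
        o = self.out.read()
        self.in_.await_(o)           # proceed once in > out: buffer not empty
        msg = self.slots[o % self.n]
        self.out.advance()
        return msg


def run_demo(senders=3, per_sender=50, slots=4):
    buf = BoundedBuffer(slots)

    def produce(sid):
        for i in range(per_sender):
            buf.send((sid, i))

    threads = [threading.Thread(target=produce, args=(s,)) for s in range(senders)]
    for th in threads:
        th.start()
    received = [buf.receive() for _ in range(senders * per_sender)]
    for th in threads:
        th.join()
    return received
```

Only `send` differs from a single-sender version: the ticket replaces the sender's private read of `in`, so two senders never write the same slot, while `receive` is unchanged.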

More about thread creation and termination
• What if we want to create/terminate threads dynamically -> we have to:
  • Allow a thread to self-destroy and clean up -> EXIT_THREAD
  • Allow a thread to terminate another thread of the same application -> DESTROY_THREAD
• What if no thread is able to run ->
  • create a dummy thread for each processor called a processor_thread which is scheduled to run when no other thread is available
  • the processor_thread runs in the thread layer
  • the SCHEDULER runs in the processor layer
• The procedure followed when a kernel starts:

Procedure RUN_PROCESSORS()
    for each processor do
        allocate stack and set up processor thread    /* allocation of the stack done at processor layer */
        shutdown <- FALSE
        SCHEDULER()
        deallocate processor_thread stack    /* deallocation of the stack done at processor layer */
        halt processor

Switching threads with dynamic thread creation
• Switching from one user thread to another requires two steps:
  • Switch from the thread releasing the processor to the processor thread
  • Switch from the processor thread to the new thread which is going to have the control of the processor
• The last step requires the SCHEDULER to circle through the thread_table until a thread ready to run is found
• The boundary between user layer threads and processor layer thread is crossed twice
• Example: switch from thread 1 to thread 6 using
  • YIELD
  • ENTER_PROCESSOR_LAYER
  • EXIT_PROCESSOR_LAYER


Thread scheduling policies
• Non-preemptive scheduling -> a running thread releases the processor at its own will. Not very likely to work in a greedy environment.
• Cooperative scheduling -> a thread calls YIELD periodically
• Preemptive scheduling -> a thread is allowed to run for a time slot. It is enforced by the thread manager working in concert with the interrupt handler.
  • The interrupt handler should invoke the thread exception handler.
  • What if the interrupt handler running at the processor layer directly invokes the thread? Imagine the following sequence:
    • Thread A acquires the thread_table_lock
    • An interrupt occurs
    • The YIELD call in the interrupt handler will attempt to acquire the thread_table_lock, which thread A still holds, so the processor deadlocks
  • Solution: the processor is shared between two threads:
    • The processor thread
    • The interrupt handler thread
• Recall that threads have their individual address spaces, so the scheduler, when allocating the processor to a thread, must also load the page map table of the thread into the page map table register of the processor
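The problematic sequence can be imitated with an ordinary non-reentrant lock. This is a sketch under loud assumptions: `threading.Lock` stands in for thread_table_lock, a plain function call stands in for the interrupt, and the handler uses a non-blocking acquire so the example reports the problem instead of hanging. The function names are hypothetical.

```python
import threading

# Stand-in for the lock protecting the thread table (assumption).
thread_table_lock = threading.Lock()


def yield_from_interrupt_handler():
    """Model of YIELD invoked directly by the interrupt handler.

    Real code would wait for the lock; here we only try once and
    report whether the lock could be taken."""
    acquired = thread_table_lock.acquire(blocking=False)
    if acquired:
        thread_table_lock.release()
    return acquired


def interrupt_while_lock_held():
    """Thread A holds the lock when the 'interrupt' arrives."""
    thread_table_lock.acquire()          # Thread A acquires thread_table_lock
    try:
        # An interrupt occurs on the same processor; the handler's YIELD
        # needs the lock that the interrupted thread A cannot release.
        return yield_from_interrupt_handler()
    finally:
        thread_table_lock.release()
```

`interrupt_while_lock_held()` returns `False`: with a blocking acquire the handler would wait forever, since thread A cannot run again until the handler returns. Once the lock is free, the same YIELD succeeds.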

Polling and interrupts
• Polling -> periodically checking the status of a subsystem.
  • How often should the polling be done?
    • Too frequently -> large overhead
    • After a large time interval -> the system will appear non-responsive
• Interrupts
  • could be implemented in hardware as polling -> before executing the next instruction the processor checks an “interrupt” bit implemented as a flip-flop
  • If the bit is ON, invoke the interrupt handler instead of executing the next instruction
  • Multiple types of interrupts -> multiple “interrupt” bits checked based upon the priority of the interrupt.
  • Some architectures allow the interrupts to occur during the execution of an instruction
• The interrupt handler should be short and very carefully written. Interrupts of lower priority could be masked.
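The "interrupts as hardware polling" idea can be illustrated with a toy fetch-execute loop. This is a sketch, not a model of any real processor: the function `run`, the `raise_before` parameter (the instruction indices before which a device sets the bit), and the single interrupt bit are all assumptions made for illustration.

```python
def run(instructions, raise_before):
    """Execute `instructions` in order, checking one interrupt bit
    before each instruction. Returns the trace of what ran."""
    trace = []
    interrupt_bit = False
    for pc, instruction in enumerate(instructions):
        if pc in raise_before:
            interrupt_bit = True       # a device sets the flip-flop
        if interrupt_bit:
            trace.append("HANDLER")    # handler runs before the next instruction
            interrupt_bit = False      # handler clears the bit
        trace.append(instruction)      # then normal execution resumes
    return trace
```

For example, `run(["i0", "i1", "i2"], {1})` yields `["i0", "HANDLER", "i1", "i2"]`: the handler is slotted in at an instruction boundary, and the interrupted program continues afterwards.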

Virtual machines
• First commercial product: IBM VM/370, originally developed as CP-67
• Advantages:
  • One could run multiple guest operating systems on the same machine
  • An error in one guest operating system does not bring the machine down
  • An ideal environment for developing operating systems

Performance metrics
• Wide range, sometimes correlated, other times with contradictory goals:
  • Throughput, utilization, waiting time, fairness
  • Latency (time in system)
  • Capacity
  • Reliability as an ultimate measure of performance
• Some measures of performance reflect physical limitations: capacity, bandwidth (CPU, memory, communication channel), communication latency.
• Often measures of performance reflect system organization and policies such as scheduling priorities.
• Resource sharing is an enduring problem; recall that one of the means for virtualization is multiplexing physical resources.
• The workload can be characterized statistically
• Queuing Theory can be used for analytical performance evaluation.
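As one illustration of analytical evaluation, the sketch below computes the standard steady-state results for an M/M/1 queue. The choice of M/M/1 (Poisson arrivals, exponential service times, one server) is an assumption made here for the example; the slide does not name a specific model, and the function name `mm1_metrics` is hypothetical.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Steady-state metrics of an M/M/1 queue.

    arrival_rate: mean arrivals per unit time (lambda)
    service_rate: mean completions per unit time (mu)
    """
    if not 0 < arrival_rate < service_rate:
        raise ValueError("need 0 < arrival_rate < service_rate for a stable queue")
    rho = arrival_rate / service_rate              # server utilization
    return {
        "utilization": rho,
        "time_in_system": 1.0 / (service_rate - arrival_rate),   # latency
        "waiting_time": rho / (service_rate - arrival_rate),     # time in queue
        "number_in_system": rho / (1.0 - rho),
    }
```

With 8 arrivals and 10 completions per second, the utilization is 0.8 and the mean time in system is 0.5 s, of which 0.4 s is waiting. The metrics are correlated but pull in different directions: pushing utilization toward 1 makes latency grow without bound.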

System design for performance
• When you have a clear idea of the design, simulate the system before actually implementing it.
• Identify the bottlenecks.
• Identify those bottlenecks likely to be removed naturally by the technologies expected to be embedded in your system.
• Keep in mind that removing one bottleneck exposes the next.
• Concurrency helps a lot both in hardware and in software.
  • in hardware -> implies multiple execution units
    • Pipelining -> multiple instructions are executed concurrently
    • Multiple execution units in a processor: integer, floating point, pixels
    • Graphics Processors – geometric engines.
    • Multi-processor system
    • Multi-core processors
    • Paradigm: SIMD (Single Instruction Multiple Data), MIMD (Multiple Instructions Multiple Data).

System design for performance (cont’d)
• in software -> complicates writing and debugging programs. SPMD (Single Program Multiple Data) paradigm
• Design a well balanced system:
  • The bandwidth of individual sub-systems should be as close as possible
  • The execution time of pipeline stages should be as close as possible
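Why balance matters can be seen with a small calculation. The sketch assumes an idealized linear pipeline: every item visits every stage in order, stage times are fixed, and a stage starts its next item as soon as it is free and the item is available. The function names are hypothetical.

```python
def pipeline_throughput(stage_times):
    """Items completed per unit time in steady state: the slowest
    stage (the bottleneck) sets the pace."""
    return 1.0 / max(stage_times)


def pipeline_total_time(stage_times, items):
    """Time to push `items` through the pipeline: the first item pays
    the full latency, each later one follows one bottleneck period behind."""
    if items <= 0:
        return 0.0
    return sum(stage_times) + (items - 1) * max(stage_times)
```

Two pipelines with the same total work per item, `[1, 1, 4]` and `[2, 2, 2]`, both have a latency of 6 time units, but for 100 items the unbalanced one needs 402 units and the balanced one 204, since its bottleneck stage is half as long.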
