Chapter 5: CPU Scheduling part II


Chapter 5: CPU Scheduling • Multiple-Processor Scheduling • Real-Time Scheduling • Thread Scheduling • Operating Systems Examples • Java Thread Scheduling • Algorithm Evaluation

Multiple-Processor Scheduling • CPU scheduling more complex when multiple CPUs are available • Homogeneous processors within a multiprocessor • Makes it easy to share processes/threads • Any processor can run any process • Limitations: one processor may have unique resources (disk drive, etc.) • Load sharing • Goal is to make each processor work equally

Multiple-Processor Scheduling • Asymmetric multiprocessing – only one processor accesses the system data structures, alleviating the need for data sharing • Uses a “master” server • Simple to implement • No coherency problems • Symmetric multiprocessing (SMP) – each processor is self-scheduling • Processes may be in a single queue • Or each processor may have its own queue • Regardless, each processor runs its own scheduler • Mutual exclusion problems • SMP is supported by all modern operating systems: XP, 2000, Solaris, Linux, Mac OS X

Multiple-Processor Scheduling • Processor affinity – a process is associated with a particular processor, ideally for its entire lifetime • Reason: cache problem. If a process switches processors, the contents of one cache must be invalidated and the other cache repopulated • Soft affinity: SMP system tries to keep the process on the same processor but doesn’t guarantee it. • Hard affinity: SMP system guarantees that the process will remain on a single processor • Linux allows both. A special system call requests hard affinity.
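The hard-affinity request mentioned above can be sketched with the glibc wrappers sched_getaffinity() and sched_setaffinity(). This is a minimal, Linux-only sketch; the helper names pin_to_cpu and pin_to_first_allowed_cpu are made up for illustration and are not part of any API.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/* Restrict the calling thread to a single CPU (hard affinity).
 * Returns 0 on success, -1 on failure (errno is set). */
int pin_to_cpu(int cpu)
{
    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    cpu_set_t only;
    CPU_ZERO(&only);
    CPU_SET(cpu, &only);
    return sched_setaffinity(0, sizeof(only), &only);  /* pid 0 = caller */
}

/* Pin the caller to the first CPU in its current affinity mask.
 * Returns that CPU's index, or -1 on failure. */
int pin_to_first_allowed_cpu(void)
{
    cpu_set_t allowed;
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &allowed))
            return pin_to_cpu(cpu) == 0 ? cpu : -1;
    return -1;
}
```

Starting from the current mask rather than hard-coding CPU 0 matters in containers, where CPU 0 may not be available to the process.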

Multiple-Processor Scheduling • Load balancing • On SMP systems the workload must be kept balanced across processors • Only necessary in systems where each processor has its own queue • In most contemporary systems each processor does have its own queue

Multiple-Processor Scheduling • Two general approaches • Push migration. A specific task periodically checks the load on each processor • If it finds an imbalance, it moves (pushes) processes to idle processors • Pull migration. An idle processor pulls a waiting task from a busy processor. • Hybrid. Uses both push and pull. • Example: the Linux scheduler and the ULE scheduler (FreeBSD) implement both • Linux runs its balancing algorithm every 200 milliseconds (push) • Or whenever the run queue for a processor is empty (pull)
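The two migration styles can be sketched on a toy model in which each processor is represented only by the length of its run queue. This is an illustrative simulation, not how Linux or ULE implement balancing; the function names and the "leave the source at least one task" rule are assumptions.

```c
#include <assert.h>

/* Index of the CPU with the longest run queue. */
static int busiest_cpu(const int queue_len[], int ncpus)
{
    int busiest = 0;
    for (int i = 1; i < ncpus; i++)
        if (queue_len[i] > queue_len[busiest])
            busiest = i;
    return busiest;
}

/* Pull migration: an idle CPU takes one waiting task from the busiest CPU.
 * Returns the CPU it pulled from, or -1 if nothing could be pulled. */
int pull_migrate(int queue_len[], int ncpus, int idle_cpu)
{
    assert(idle_cpu >= 0 && idle_cpu < ncpus);
    int src = busiest_cpu(queue_len, ncpus);
    if (src == idle_cpu || queue_len[src] < 2)
        return -1;                 /* leave the source at least one task */
    queue_len[src]--;
    queue_len[idle_cpu]++;
    return src;
}

/* Push migration: a periodic balancer scans all CPUs and moves one task
 * from the busiest CPU to each CPU whose queue is empty.
 * Returns the number of tasks moved. */
int push_migrate(int queue_len[], int ncpus)
{
    int moved = 0;
    for (int cpu = 0; cpu < ncpus; cpu++) {
        if (queue_len[cpu] != 0)
            continue;
        /* same move as a pull, but initiated by the balancer */
        if (pull_migrate(queue_len, ncpus, cpu) >= 0)
            moved++;
    }
    return moved;
}
```

In this model the only difference between push and pull is who triggers the move: the periodic balancer or the idle processor itself.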

Multiple-Processor Scheduling • Problem: load balancing often counteracts the benefits of processor affinity • Push or pull migration takes a process away from its processor • This violates processor affinity • No absolute rule governing which policy is best • In some systems an idle processor always pulls a process from a non-idle processor • In other systems processes are moved only if the imbalance exceeds a threshold.
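The threshold policy can be sketched as follows: tasks move only while the gap between two run-queue lengths exceeds a chosen threshold, so small imbalances never cost a task its warm cache. The function name and the use of queue length as the load measure are assumptions made for illustration.

```c
#include <assert.h>

/* Move tasks from the busier queue to the lighter one, one at a time,
 * while the difference in queue length exceeds `threshold`.
 * Returns the number of tasks migrated. */
int balance_pair(int *busy_len, int *light_len, int threshold)
{
    assert(threshold >= 1);
    int moved = 0;
    while (*busy_len - *light_len > threshold) {
        (*busy_len)--;      /* each move shrinks the gap by two */
        (*light_len)++;
        moved++;
    }
    return moved;
}
```

A larger threshold favors affinity (fewer migrations); a smaller one favors balance.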

Symmetric Multithreading • Alternative to SMP • Provides multiple logical (not physical) processors • Also called hyperthreading technology (on Intel processors) • Idea: create multiple logical processors on the same physical processor. • Each logical processor has its own architecture state • Includes general-purpose and machine-state registers • Each logical processor is responsible for its own interrupt handling • Interrupts are delivered to and handled by logical processors rather than physical processors • Each logical processor shares the resources of its physical processor • Including cache, memory, buses • See next slide

Symmetric Multithreading Two physical processors each with two logical processors OS sees 4 processors available for work.

Symmetric Multithreading • Note that SMT is provided in hardware, not software • Hardware provides the representation of the architecture state for each logical processor • Also provides for interrupt handling • The OS does not have to recognize the difference between physical and logical processors • But performance can improve if the OS is aware of SMT • Example: with the configuration on the previous slide, it is better to keep two physical processors busy than two logical processors on a single physical processor.

Symmetric Multithreading • Why does hyperthreading work? • Superscalar architectures: many different hardware components exist • Example: multiple integer arithmetic units. • To take advantage of these units, a process must be able to execute multiple instructions in parallel • Often not possible. • Idea: if two processes run simultaneously, more of the architecture units can be kept busy. • The processor coordinates the simultaneous execution of multiple processes.

Real-Time Scheduling • Hard real-time systems – required to complete a critical task within a guaranteed amount of time • Soft real-time computing – requires that critical processes receive priority over less fortunate ones

Thread Scheduling • Scheduling is different for user-level threads and kernel-level threads • Kernel does not know about user-level threads thus does not schedule them • Thread library cannot schedule kernel level threads or processes

Thread Scheduling • Local Scheduling – on many-to-one and many-to-many systems • threads library decides which thread to put onto an available LWP • Called process-contention scope (PCS) since competition takes place among threads in the same process • The thread library schedules the thread onto a LWP • but the kernel must schedule the LWP; the thread library cannot do this.

Thread Scheduling • PCS scheduling • Done according to priority • User-level thread priorities are set by the programmer • Priorities are not adjusted by the thread library • Some thread libraries may allow the programmer to change the priority of threads dynamically • PCS typically preempts the current thread in favor of a higher-priority thread

Thread Scheduling • Global Scheduling – on one-to-one systems (XP, Solaris 9, Linux) • How the kernel decides which kernel thread to run next • Kernel uses system-contention scope (SCS) • Competition for the CPU with SCS scheduling takes place among all threads in the system.

Pthread Scheduling API • POSIX Pthread API • Allows specification of PCS or SCS during thread creation. • PTHREAD_SCOPE_PROCESS • Schedules threads using PCS scheduling • Thread library will map threads onto available LWPs • May use scheduler activations • PTHREAD_SCOPE_SYSTEM • Schedules threads using SCS scheduling • Will create and bind an LWP for each user-level thread on many-to-many systems • This creates a one-to-one mapping

Pthread Scheduling API • POSIX Pthread API • To set/get the contention scope: pthread_attr_setscope(pthread_attr_t *attr, int scope) pthread_attr_getscope(pthread_attr_t *attr, int *scope) • First parameter is a pointer to the attribute set for the thread • Second parameter for the setscope function is either • PTHREAD_SCOPE_SYSTEM or • PTHREAD_SCOPE_PROCESS • Second parameter for the getscope function is a pointer to an int • On return, it contains the integer representing the current contention scope • Both functions return non-zero values on error • On some systems only certain contention scope values are allowed • Linux and Mac OS X only allow PTHREAD_SCOPE_SYSTEM
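A minimal sketch of querying and changing the contention scope, with the error checking the bullets above call for. The helper names scope_name and try_scope are invented for this example; only the pthread_attr_* calls belong to the POSIX API.

```c
#include <pthread.h>
#include <stdio.h>

/* Human-readable name for a contention-scope constant. */
const char *scope_name(int scope)
{
    switch (scope) {
    case PTHREAD_SCOPE_PROCESS: return "PTHREAD_SCOPE_PROCESS";
    case PTHREAD_SCOPE_SYSTEM:  return "PTHREAD_SCOPE_SYSTEM";
    default:                    return "unknown";
    }
}

/* Print the default scope, then try to switch to `wanted`.
 * Returns 0 on success or the error number reported by the API. */
int try_scope(int wanted)
{
    pthread_attr_t attr;
    int scope, err;

    if ((err = pthread_attr_init(&attr)) != 0)
        return err;

    if (pthread_attr_getscope(&attr, &scope) == 0)
        printf("default scope: %s\n", scope_name(scope));

    err = pthread_attr_setscope(&attr, wanted);
    if (err != 0)
        fprintf(stderr, "cannot set %s (error %d)\n", scope_name(wanted), err);

    pthread_attr_destroy(&attr);
    return err;
}
```

On a system that allows only PTHREAD_SCOPE_SYSTEM, calling try_scope(PTHREAD_SCOPE_PROCESS) is expected to return a non-zero error number rather than silently succeed.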

Pthread Scheduling API • POSIX Pthread API example: next slide • Gets the default thread attributes • Sets the contention scope to PTHREAD_SCOPE_SYSTEM • Then creates 5 separate threads that run using the SCS policy

Pthread Scheduling API
#include <pthread.h>
#include <stdio.h>

#define NUM_THREADS 5

void *runner(void *param);   /* thread entry point, defined below */

int main(int argc, char *argv[])
{
    int i;
    pthread_t tid[NUM_THREADS];
    pthread_attr_t attr;

    /* get the default attributes */
    pthread_attr_init(&attr);

    /* set the contention scope to PROCESS or SYSTEM */
    pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);

    /* set the scheduling policy - FIFO, RR, or OTHER */
    pthread_attr_setschedpolicy(&attr, SCHED_OTHER);

    /* create the threads */
    for (i = 0; i < NUM_THREADS; i++)
        pthread_create(&tid[i], &attr, runner, NULL);

    /* now join on each thread */
    for (i = 0; i < NUM_THREADS; i++)
        pthread_join(tid[i], NULL);

    return 0;
}

/* Each thread will begin control in this function */
void *runner(void *param)
{
    printf("I am a thread\n");
    pthread_exit(0);
}

Operating System Examples • Solaris scheduling • Windows XP scheduling • Linux scheduling

Contemporary Scheduling • Involuntary CPU sharing -- timer interrupts • Time quantum determined by interval timer -- usually fixed for every process using the system • Sometimes called the time slice length • Priority-based process (job) selection • Select the highest priority process • Priority reflects policy • With preemption • Usually a variant of Multi-Level Queues using RR within a queue

Solaris Scheduling • Solaris 2 is a version of UNIX with support for threads at the kernel and user levels, symmetric multiprocessing, and real-time scheduling. • Scheduling: priority-based thread scheduling with 4 classes of priority: • Real time (highest priority) • Runs before a process in any other class • System • (only kernel processes; user processes running in kernel mode are not given this priority) • Time sharing • Interactive (lowest priority) • Within each class there are different priorities and different scheduling algorithms

Solaris Scheduling • Scheduler converts class-specific priorities into global priorities and then selects the thread with the highest global priority to run. • The selected thread runs until • It blocks • It uses its time slice • It is preempted by a higher-priority thread • If there are multiple threads with the same priority, the scheduler uses a round-robin queue.

Solaris Scheduling • Default class is time sharing • Policy for time sharing: • Uses a multilevel feedback queue • Different levels have different time slice lengths • Dynamically alters priorities • Inverse relationship between priority and time slice • The higher the priority, the smaller the time slice • The lower the priority, the larger the time slice • I/O-bound processes typically have higher priority • CPU-bound processes typically have lower priority • Gives good response time for I/O-bound processes • Gives good throughput for CPU-bound processes

Solaris Scheduling • Policy for Interactive processes: • Same policy as time-sharing • Gives windowing applications a higher priority for better performance


Solaris 2 Scheduling • Before Solaris 9, Solaris used a many-to-many model • Solaris 9 switched to a one-to-one model • Solaris 2 resource needs of thread types: • Kernel thread: small data structure and a stack; thread switching does not require changing memory access information – relatively fast. • LWP: PCB with register data, accounting and memory information; switching between LWPs is relatively slow. • User-level thread: needs only a stack and a program counter; no kernel involvement means fast switching. The kernel sees only the LWPs that support user-level threads.


Solaris 9 Scheduling • Dispatch table for scheduling interactive and time-sharing threads • See next slide • These two classes include 60 priority levels (only a few are shown) • Dispatch table fields: • Priority. The class-dependent priority. Higher number means higher priority. • Time quantum. The time quantum for the associated priority. Notice the inverse relationship. • Time quantum expired. The new priority of a thread that has used its entire time quantum without blocking. The thread is now considered CPU-bound. Priority is lowered. • Return from sleep. The new priority of a thread that is returning from sleeping. Its priority is boosted to between 50 and 59. Assures good response time for interactive processes.
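The four dispatch-table fields can be modeled as a small lookup table. The rows below are illustrative values chosen only to follow the rules described above (inverse priority/quantum relationship, lowered priority on quantum expiry, boost into 50-59 on return from sleep); they are not copied from a real Solaris table, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct dispatch_entry {
    int priority;      /* class-dependent priority (higher = more urgent) */
    int quantum_ms;    /* time quantum at this priority */
    int tq_expired;    /* new priority after using the whole quantum */
    int sleep_return;  /* new priority after returning from sleep */
};

/* ILLUSTRATIVE rows only, sorted by ascending priority. */
static const struct dispatch_entry dispatch_table[] = {
    {  0, 200,  0, 50 },
    { 20, 120, 10, 52 },
    { 40,  40, 30, 55 },
    { 59,  20, 49, 59 },
};

/* Find the row that governs `prio`: the last row whose priority <= prio. */
const struct dispatch_entry *lookup(int prio)
{
    assert(prio >= 0 && prio <= 59);
    const struct dispatch_entry *hit = &dispatch_table[0];
    size_t n = sizeof(dispatch_table) / sizeof(dispatch_table[0]);
    for (size_t i = 0; i < n; i++)
        if (dispatch_table[i].priority <= prio)
            hit = &dispatch_table[i];
    return hit;
}

/* Thread burned its full quantum: treat it as CPU-bound, lower priority. */
int after_quantum_expired(int prio) { return lookup(prio)->tq_expired; }

/* Thread woke from sleep: treat it as interactive, boost priority. */
int after_sleep(int prio) { return lookup(prio)->sleep_return; }
```

Keeping the policy in a table rather than in code means the feedback behavior can be retuned by editing data only.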

Solaris Dispatch Table

Solaris 9 Scheduling • Solaris 9 scheduling • Introduced two new scheduling classes: • Fixed priority. These have the same priority range as those in time-sharing class • Their priorities are not dynamically adjusted. • Fair share. Uses CPU shares instead of priorities to make scheduling decisions. • CPU shares are allocated to a set of processes (a project)

Windows XP Priorities

Linux Scheduling • Kernel v. 1.0.9 (very old) • Dispatcher is a kernel function, schedule( ) • Function is called from • Other system functions • After every system call • After every interrupt • Dispatcher jobs: • Performs periodic work (e.g., processes pending signals) • Inspects set of tasks in the TASK_RUNNING state (the ready list) • Chooses a task to execute • Dispatches the task to CPU

Linux Scheduling • Policy: variant of RR • Uses conventional timeslicing mechanism • Dynamic priority computed based on the value assigned to the task by nice( ) or setpriority( ) • and by the amount of time the process has been waiting • The counter field in the task descriptor is adjusted on each timer interrupt • The interrupt handler adjusts the timer fields of the task • Dispatcher selects the ready task with the maximum counter value.

/*
 * …
 * NOTE!! Task 0 is the 'idle' task, which gets called when no
 * other tasks can run. It cannot be killed, and it cannot
 * sleep. The 'state' information in task[0] is never used.
 * …
 */
asmlinkage void schedule(void)
{
    int c;
    struct task_struct * p;    // Pointer to the process descriptor currently being inspected
    struct task_struct * next;
    unsigned long ticks;

    /* check alarm, wake up any interruptible tasks that have got a signal */
    …    // This code is elided from the description

    /* this is the scheduler proper: */
#if 0
    /* give the process that go to sleep a bit higher priority … */
    /* This depends on the values for TASK_XXX */
    /* This gives smoother scheduling for some things, but */
    /* can be very unfair under some circumstances, so .. */
    if (TASK_UNINTERRUPTIBLE >= (unsigned) current->state &&
        current->counter < current->priority*2) {
        ++current->counter;
    }
#endif
    c = -1;    // Choose the task with the highest c == p->counter value
    next = p = &init_task;
    for (;;) {
        if ((p = p->next_task) == &init_task)
            goto confuse_gcc;    // this is the loop exit
        if (p->state == TASK_RUNNING && p->counter > c)
            c = p->counter, next = p;    // this task has the highest p->counter so far
                                         // but keep looking
    }
confuse_gcc:
    if (!c) {
        for_each_task(p)
            p->counter = (p->counter >> 1) + p->priority;
    }
    if (current != next)
        kstat.context_switch++;
    switch_to(next);    // this is the context switch
    …    // more code
}
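The selection loop above can be tried in user space with a simplified stand-in for the task list. This is a sketch only: struct sim_task and pick_next are hypothetical names, an array replaces the kernel's circular list, and a return value of -1 stands in for the idle task.

```c
/* Simplified stand-in for the kernel's task descriptor. */
struct sim_task {
    int runnable;   /* 1 if the task is in the ready list */
    int counter;    /* remaining ticks in the current epoch */
    int priority;   /* static priority, used when recharging */
};

/* Return the index of the runnable task with the largest counter,
 * or -1 if no task is runnable. If the best counter is 0, recharge
 * every task's counter exactly as the kernel loop does. */
int pick_next(struct sim_task tasks[], int ntasks)
{
    int best = -1;
    int c = -1;

    for (int i = 0; i < ntasks; i++) {
        if (tasks[i].runnable && tasks[i].counter > c) {
            c = tasks[i].counter;
            best = i;
        }
    }

    if (c == 0) {
        /* all runnable tasks are out of ticks: start a new epoch */
        for (int i = 0; i < ntasks; i++)
            tasks[i].counter = (tasks[i].counter >> 1) + tasks[i].priority;
    }
    return best;
}
```

Because the recharge also applies to tasks that are not runnable, a sleeping task keeps half of its unused counter and so returns with a larger counter than tasks that ran, which is how waiting time raises dynamic priority.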

Contemporary Linux Scheduling • Prior to version 2.5, the Linux kernel ran a variant of the traditional UNIX scheduling algorithm. • Poor support for SMP • Did not scale well as the number of tasks on the system grew • New kernel • Scheduling algorithm runs in constant O(1) time regardless of the number of tasks • Includes support for SMP: processor affinity, load balancing, interactive tasks, etc.

Contemporary Linux Scheduling • The Linux scheduler is a preemptive, priority-based algorithm • Two algorithms: time-sharing and real-time • Real-time priorities range from 0-99 • Time-sharing priorities range from 100-140 • These two ranges are mapped into a global priority scheme (lower numbers have higher priority) • Higher-priority tasks get longer time quanta • Lower-priority tasks get shorter time quanta

The Relationship Between Priorities and Time-slice length

Contemporary Linux Scheduling • Time-sharing • Prioritized credit-based – process with most credits is scheduled next • Credit subtracted when timer interrupt occurs • When credit = 0, another process chosen • When all processes have credit = 0, recrediting occurs • Based on factors including priority and history • Uses a task’s nice value plus or minus 5 • The interactivity of a task determines whether up to 5 is added to or subtracted from the nice value. • Interactivity is determined by how long the task has been sleeping while waiting for I/O • Tasks that are more interactive have longer sleep times, thus get adjustments closer to –5 • Scheduler favors interactive tasks • Tasks that have shorter sleep times are CPU-bound and thus get adjustments closer to +5
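The "nice value plus or minus 5" rule can be sketched as a small calculation. Two assumptions are made here and are not taken from the slides: nice 0 is placed at priority 120 (the middle of the time-sharing range), and the adjustment is a linear function of average sleep time capped at a hypothetical max_sleep_ms. The real kernel heuristic is more involved.

```c
#include <assert.h>

#define TS_PRIO_MIN 100   /* best time-sharing priority (range from the slides) */
#define TS_PRIO_MAX 140   /* worst time-sharing priority (range from the slides) */

/* ASSUMPTION: nice 0 maps to 120, so nice -20..19 maps to 100..139. */
int static_priority(int nice)
{
    assert(nice >= -20 && nice <= 19);
    return 120 + nice;
}

/* ASSUMPTION: linear map from average sleep time to an adjustment.
 * No sleep -> +5 (CPU-bound); max_sleep_ms or more -> -5 (interactive). */
int interactivity_adjustment(int avg_sleep_ms, int max_sleep_ms)
{
    assert(max_sleep_ms > 0);
    if (avg_sleep_ms < 0)
        avg_sleep_ms = 0;
    if (avg_sleep_ms > max_sleep_ms)
        avg_sleep_ms = max_sleep_ms;
    return 5 - (10 * avg_sleep_ms) / max_sleep_ms;
}

/* Dynamic priority, clamped to the time-sharing range
 * (lower number = higher priority). */
int dynamic_priority(int nice, int avg_sleep_ms, int max_sleep_ms)
{
    int prio = static_priority(nice)
             + interactivity_adjustment(avg_sleep_ms, max_sleep_ms);
    if (prio < TS_PRIO_MIN)
        prio = TS_PRIO_MIN;
    if (prio > TS_PRIO_MAX)
        prio = TS_PRIO_MAX;
    return prio;
}
```

For example, under these assumptions two tasks with nice 0 end up ten priority levels apart if one never sleeps and the other sleeps for the maximum time.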

Contemporary Linux Scheduling • Time-sharing • Kernel maintains all runnable tasks in a runqueue data structure • Each processor has its own runqueue (on SMP systems) • Each runqueue contains two priority arrays: active and expired • The active array contains all tasks with time remaining in their time slices • The expired array contains all expired tasks • Each of these arrays is a priority array: the list is indexed according to priority (see next slide) • When all tasks have exhausted their time slices (active array is empty) the two priority arrays are exchanged.
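The active/expired exchange can be sketched with per-priority counters in place of real task lists. This is a simplified model with hypothetical names; the kernel uses per-priority linked lists plus a bitmap to find the first non-empty level, whereas the scan below walks a fixed 141 slots, which is still independent of the number of tasks.

```c
#include <assert.h>

#define NUM_PRIO 141   /* priorities 0..140, as on the slides */

struct prio_array {
    int ntasks;              /* total tasks in this array */
    int count[NUM_PRIO];     /* tasks queued at each priority */
};

struct runqueue {
    struct prio_array arrays[2];
    struct prio_array *active;
    struct prio_array *expired;
};

void rq_init(struct runqueue *rq)
{
    *rq = (struct runqueue){0};
    rq->active = &rq->arrays[0];
    rq->expired = &rq->arrays[1];
}

void enqueue(struct prio_array *a, int prio)
{
    assert(prio >= 0 && prio < NUM_PRIO);
    a->count[prio]++;
    a->ntasks++;
}

/* A task that used up its time slice moves to the expired array. */
void expire(struct runqueue *rq, int prio)
{
    enqueue(rq->expired, prio);
}

/* Remove the best (lowest-numbered) task from the active array and
 * return its priority. If active is empty, exchange the two arrays
 * first. Returns -1 when the runqueue holds no tasks at all. */
int pick_next_prio(struct runqueue *rq)
{
    if (rq->active->ntasks == 0) {
        struct prio_array *tmp = rq->active;   /* pointer swap, no copying */
        rq->active = rq->expired;
        rq->expired = tmp;
    }
    for (int p = 0; p < NUM_PRIO; p++) {
        if (rq->active->count[p] > 0) {
            rq->active->count[p]--;
            rq->active->ntasks--;
            return p;
        }
    }
    return -1;
}
```

Exchanging two pointers is what makes the start of a new epoch constant-time: no task has to be touched when the arrays swap roles.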

List of Tasks Indexed According to Priorities

Contemporary Linux Scheduling • Real-time • Soft real-time • Real-time tasks have static priorities • POSIX.1b compliant – two classes • FCFS and RR • Highest-priority process always runs first

BSD 4.4 Scheduling • Involuntary CPU sharing • Preemptive algorithms • Dispatcher selects a process from the highest-priority queue: • only processes in the highest-priority, non-empty queue can run • Within a queue uses RR • 32 multi-level queues • Queues 0-7 are reserved for system functions • Queues 8-31 are for user space functions • nice influences (but does not dictate) queue level • Once per time quantum the scheduler recomputes each process’s priority • Priority is a function of nice and recent demand on the CPU (more utilization means lower priority)

Java Thread Scheduling • JVM Uses a Preemptive, Priority-Based Scheduling Algorithm. • FIFO Queue is Used if There Are Multiple Threads With the Same Priority.

Java Thread Scheduling (cont) JVM Schedules a Thread to Run When: • The Currently Running Thread Exits the Runnable State. • A Higher Priority Thread Enters the Runnable State * Note – the JVM Does Not Specify Whether Threads are Time-Sliced or Not.

Time-Slicing • Since the JVM Doesn’t Ensure Time-Slicing, the yield() Method May Be Used:
while (true) {
    // perform CPU-intensive task
    . . .
    Thread.yield();
}
This Yields Control to Another Thread of Equal Priority.
