
Multi-core Real-Time Scheduling for Generalized Parallel Task Models

Presentation Transcript


  1. Multi-core Real-Time Scheduling for Generalized Parallel Task Models
  Abusayeed Saifullah, Kunal Agrawal, Chenyang Lu, Christopher Gill

  2. Real-Time Systems on Multi-core
  • Traditional multiprocessor scheduling
    • Focuses on inter-task parallelism
    • Mostly restricted to sequential task models
  • Computation-intensive complex real-time tasks are growing
    • Video surveillance
    • Radar tracking
    • Hybrid real-time structural testing
  • Multi-core processors provide an opportunity to schedule computation-intensive tasks in real time
  • Most of these tasks exhibit intra-task parallelism
  • Real-time systems need to be developed to exploit intra-task parallelism

  3. Parallel Task Model
  • Synchronous task model
    • [Figure: a task as a sequence of segments (Segment 1, Seg 2, Seg 3, Segment 4, Segment 5); each horizontal bar is a thread of execution (a sequence of instructions); the parallel threads of a segment form that segment and synchronize at its end, e.g., the threads of Segment 1 synchronize before Segment 2 starts]
  • Lakshmanan et al. (RTSS '10) have addressed a restricted synchronous model where
    • A task is an alternating sequence of parallel and sequential segments
    • All parallel segments have an equal number of threads
    • The total number of threads in each segment ≤ the number of cores
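
To make the model concrete, here is a minimal C sketch of one way a synchronous parallel task could be represented; the types, field names, and helper functions are illustrative choices, not taken from the paper:

    /* A synchronous parallel task: a sequence of segments.  Each segment runs
     * num_threads parallel threads, each needing exec time units, and all
     * threads of a segment synchronize (barrier) at the segment's end. */
    typedef struct {
        int    num_threads;   /* threads in this segment (arbitrary, may exceed #cores) */
        double exec;          /* worst-case execution time of each thread */
    } segment_t;

    typedef struct {
        int        num_segments;
        segment_t *segments;
        double     period;    /* minimum inter-arrival time */
        double     deadline;  /* relative deadline */
    } parallel_task_t;

    /* Total work: sum over segments of (threads x per-thread execution). */
    double total_work(const parallel_task_t *t) {
        double w = 0.0;
        for (int i = 0; i < t->num_segments; i++)
            w += t->segments[i].num_threads * t->segments[i].exec;
        return w;
    }

    /* Critical-path length: one thread of each segment executed in sequence. */
    double critical_path(const parallel_task_t *t) {
        double p = 0.0;
        for (int i = 0; i < t->num_segments; i++)
            p += t->segments[i].exec;
        return p;
    }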

  4. Our Contributions
  • We address a general synchronous parallel task model
    • Different segments may have different numbers of threads
    • Each segment can have an arbitrary number of threads
  • Example: such tasks are generated by
    • Parallel for loops in OpenMP, CilkPlus
    • Barrier primitives in thread libraries
  • This model is more portable
    • The same program can execute on machines with different numbers of cores

  5. A Task Example

    void parallel_task(float *a, float *b, float *c, float *d)
    {
        int n = 7;
        int i = 0;
        parallel_for (; i < n; i++)      /* segment 1: 7 parallel threads */
            c[i] = a[i] + b[i];
        n = 4;
        i = 0;
        parallel_for (; i < n; i++)      /* segment 2: 4 parallel threads */
            d[i] = a[i] - b[i];
    }
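
For reference, a runnable sketch of the same task written with OpenMP's parallel for (one of the constructs slide 4 mentions); this is an illustration rather than code from the paper. OpenMP picks the actual number of worker threads, and the implicit barrier at the end of each loop plays the role of the segment synchronization:

    #include <omp.h>

    void parallel_task(float *a, float *b, float *c, float *d)
    {
        /* Segment 1: 7 parallel iterations, implicit barrier at loop end. */
        #pragma omp parallel for
        for (int i = 0; i < 7; i++)
            c[i] = a[i] + b[i];

        /* Segment 2: 4 parallel iterations, implicit barrier at loop end. */
        #pragma omp parallel for
        for (int i = 0; i < 4; i++)
            d[i] = a[i] - b[i];
    }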

  6. Our Contributions (contd.)
  • We propose a task decomposition for the general synchronous parallel task model
    • Decomposes each parallel task into a set of sequential subtasks
    • Subtasks are scheduled like traditional tasks
  • Why decomposition?
    • We can exploit the rich literature on multiprocessor scheduling
    • The proposed decomposition ensures that if the decomposed tasks are schedulable, the original task set is also schedulable

  7. Our Contributions (contd.)
  • We analyze schedulability in terms of a processor speed augmentation bound
    • Speed augmentation bound ν for an algorithm A: if an optimal algorithm can schedule a synchronous parallel task set on unit-speed processor cores, then A can schedule the decomposed tasks on ν-speed processor cores
  • We prove that the proposed decomposition requires a speed augmentation of at most
    • 4 for Global Earliest Deadline First (G-EDF) scheduling
    • 5 for Partitioned Deadline Monotonic (P-DM) scheduling

  8. Overview of a Task Decomposition
  • Each thread of the task becomes an individual task with
    • An intermediate subdeadline
    • A release offset to retain the precedence relations of the original task
  • Deadlines are assigned by distributing slack among segments
  • Deadline of a thread = execution requirement + assigned slack

  9. Slack Distribution
  • How much slack a segment demands depends on
    • The available slack of the task
    • The execution requirement of the segment
  • The execution requirement of a segment is the product of
    • The total number of parallel threads in the segment, and
    • The execution requirement of each thread in the segment
  • A larger execution requirement implies more demand for slack
    • In the slide's figure, Segment 1 requires more slack than Segment 2

  10. Slack Distribution (contd.)
  • We use the following principle to distribute slack
    • All segments that receive slack will achieve an equal density
  • Reasons to equalize density among segments
    • Fairness: the deadline of each segment becomes proportional to its execution requirement
    • We can bound the density of the decomposed tasks
    • We can exploit existing density-based analyses for multiprocessors

  11. Slack Distribution (contd.)
  • The slack of each segment is determined by solving the equalities
    • Sum of subdeadlines = task deadline (total assigned slack = task slack)
    • Density of Segment 1 = density of Segment 2 = ... (all equal)
  • All threads in a segment have the same deadline and offset
    • Deadline = execution requirement of the thread + segment slack
    • Release offset = sum of the deadlines of the preceding segments
  • (A small code sketch of this computation follows below.)
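
Here is a rough C sketch of the equal-density computation described on slides 9 through 11, under the simplifying assumption that every segment receives slack; the names are mine, not the paper's. Equalizing density across segments while keeping the sum of segment deadlines equal to the task deadline gives each segment a deadline proportional to its total work:

    /* Per-segment data: number of parallel threads and per-thread execution time. */
    typedef struct { int num_threads; double exec; } segment_t;

    /* Assigns each segment a deadline proportional to its work so that
     *   (1) the segment deadlines sum to the task deadline D, and
     *   (2) every segment has the same density work_i / deadline_i.
     * Also sets each segment's release offset to the sum of the deadlines
     * of the preceding segments.
     * Assumes every segment receives slack (a simplification of the paper). */
    void decompose(const segment_t *seg, int n, double D,
                   double *deadline, double *offset)
    {
        double total = 0.0;
        for (int i = 0; i < n; i++)
            total += seg[i].num_threads * seg[i].exec;

        double t = 0.0;
        for (int i = 0; i < n; i++) {
            double work = seg[i].num_threads * seg[i].exec;
            deadline[i] = D * work / total;   /* common density = total / D */
            offset[i]   = t;                  /* shared by all threads of segment i */
            t += deadline[i];
        }
    }

Feeding in the five segments of the example on the next slide, (5 threads x 4), (2 x 2), (3 x 3), (4 x 4), (1 x 3), with a task deadline of 52 (the sum of the segment deadlines shown there), reproduces the deadlines 20, 4, 9, 16, 3 and a density of 1 for every segment. Each thread of segment i then becomes a sequential subtask with execution requirement seg[i].exec, deadline deadline[i], and release offset offset[i].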

  12. An Example of Task Decomposition
  • Segment 1: deadline = 20, density = (5*4)/20 = 1
  • Segment 2: deadline = 4, density = (2*2)/4 = 1
  • Segment 3: deadline = 9, density = (3*3)/9 = 1
  • Segment 4: deadline = 16, density = (4*4)/16 = 1
  • Segment 5: deadline = 3, density = (1*3)/3 = 1
  • All segments have an equal density!

  13. Global EDF (G-EDF) Schedulability
  • A sufficient condition for G-EDF scheduling on m unit-speed cores [Baruah RTSS '07]:
    total density ≤ m − (m − 1) × max density
  • A necessary condition for any task set under any scheduler: the task set cannot be schedulable on m unit-speed cores unless, for instance, its total utilization is at most m
  • Using the density bounds for the decomposed tasks: if the original task set is schedulable at all on m unit-speed cores, the decomposed tasks are schedulable under G-EDF on 4-speed cores
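
As a small illustration, a C check of the density test stated above, assuming the form total density ≤ m − (m − 1) × max density; this is my reading of the condition the slide cites [Baruah RTSS '07], not code from the paper:

    /* Returns 1 if the density-based sufficient condition for G-EDF on
     * m unit-speed cores holds: sum(density) <= m - (m - 1) * max(density). */
    int gedf_density_test(const double *density, int n, int m)
    {
        double sum = 0.0, max = 0.0;
        for (int i = 0; i < n; i++) {
            sum += density[i];
            if (density[i] > max)
                max = density[i];
        }
        return sum <= m - (m - 1) * max;
    }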

  14. Partitioned DM (P-DM) Schedulability
  • FBB-FFD (Fisher Baruah Baker - First-Fit Decreasing) is a well-known P-DM scheduler [ECRTS '06]
  • A sufficient condition for FBB-FFD scheduling on m unit-speed cores is stated in terms of the task set's load and density
  • A necessary condition for any scheduler: the load of the task set (the maximum, over time intervals, of the cumulative execution requirement of the tasks divided by the interval length) must not exceed m
  • Using the load and density bounds for the decomposed tasks: if the original task set is schedulable at all on m unit-speed cores, the decomposed tasks are FBB-FFD schedulable on 5-speed cores
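
To make the load quantity concrete, here is a rough C sketch that evaluates it for constrained-deadline sporadic tasks using the standard demand bound function, checking only absolute-deadline points up to a finite horizon; the horizon parameter and the names are illustrative assumptions, not from the paper:

    #include <math.h>

    typedef struct { double C, D, T; } task_t;   /* WCET, relative deadline, period */

    /* Demand bound function: maximum execution demand of jobs of tsk that are
     * both released and due within any interval of length t. */
    static double dbf(const task_t *tsk, double t)
    {
        double jobs = floor((t - tsk->D) / tsk->T) + 1.0;
        return jobs > 0.0 ? jobs * tsk->C : 0.0;
    }

    /* Load = max over t > 0 of (sum of demand bound functions at t) / t,
     * approximated by checking the points D_i + k*T_i up to the horizon. */
    double load(const task_t *tasks, int n, double horizon)
    {
        double best = 0.0;
        for (int i = 0; i < n; i++) {
            for (double t = tasks[i].D; t <= horizon; t += tasks[i].T) {
                double demand = 0.0;
                for (int j = 0; j < n; j++)
                    demand += dbf(&tasks[j], t);
                if (demand / t > best)
                    best = demand / t;
            }
        }
        return best;
    }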

  15. Conclusion
  • Multi-core processors provide opportunities to schedule computation-intensive tasks in real time
  • Real-time systems need to exploit intra-task parallelism
  • We have addressed real-time scheduling for a generalized synchronous parallel task model
    • Different segments may have different numbers of threads
    • Each segment can have an arbitrary number of threads
  • We have proposed a task decomposition that achieves
    • A processor-speed augmentation bound of 4 for Global EDF
    • A processor-speed augmentation bound of 5 for Partitioned DM
