This part of the series focuses on the task decomposition capabilities of OpenMP, highlighting the differences between the `task` and `for` pragmas. You will learn how to utilize the `task` construct to solve problems with irregular or recursive algorithms, and to build independent units of work that can be executed by multiple threads. The tutorial covers practical coding examples and explains concepts like implicit barriers and the execution order of tasks. By the end of this part, you should be able to effectively implement task-based parallelism in your OpenMP applications.
OpenMP for Task Decomposition Introduction to Parallel Programming – Part 8
Review & Objectives
• Previously:
  • Defined deadlock and explained ways to prevent it
• At the end of this part you should be able to:
  • Describe how the OpenMP task pragma is different from the for pragma
  • Code a task decomposition solution with the OpenMP task construct
Pragma: single
• Denotes a block of code to be executed by only one thread
• The first thread to arrive is chosen
• Implicit barrier at the end
    #pragma omp parallel
    {
        DoManyThings();
        #pragma omp single
        {
            printf("Many things done\n");
        } // threads wait here for single
        DoManyMoreThings();
    }
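The behavior of single can be checked with a small counter. The following is a minimal sketch, not part of the original slides; the function name run_single_demo and the counter single_runs are hypothetical. The code is meant to be compiled with an OpenMP flag such as -fopenmp; without it the pragmas are ignored and the function still returns the same value.

```c
#include <stdio.h>

/* Hypothetical helper: returns how many times the single block ran. */
int run_single_demo(void)
{
    int single_runs = 0;   /* shared; written only inside the single block */

    #pragma omp parallel
    {
        /* Code placed here would be executed by every thread in the team. */

        #pragma omp single
        {
            single_runs++;                  /* executed by exactly one thread */
            printf("Many things done\n");
        }   /* implicit barrier: the other threads wait here */

        /* Past the barrier, every thread sees single_runs == 1. */
    }
    return single_runs;
}
```

Whatever the size of the thread team, the message is printed once and the function returns 1.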
New Addition to OpenMP
• Tasks: the main change for OpenMP 3.0
• Allows parallelization of irregular problems
  • unbounded loops
  • recursive algorithms
  • producer/consumer
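For contrast, the for pragma needs a counted loop whose trip count is known when the loop starts, so that iterations can be divided among the threads up front. The sketch below shows the kind of regular loop it handles well; the names square and square_all are hypothetical. A loop such as while (p) over a linked list has no such trip count, which is the gap that tasks fill.

```c
/* Hypothetical per-element work. */
static int square(int v)
{
    return v * v;
}

/* Regular problem: n is known up front, so the iterations can be
   split among the threads by the for pragma. */
void square_all(int *a, int n)
{
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        a[i] = square(a[i]);
}
```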
What are tasks?
• Tasks are independent units of work
• Threads are assigned to perform the work of each task
  • Tasks may be deferred
  • Tasks may be executed immediately
  • The runtime system decides which of the above
• Tasks are composed of:
  • code to execute
  • data environment
  • internal control variables (ICV)
[Figure: serial execution compared with parallel execution of tasks]
A Linked List Example
    node *p = head;
    while (p) {
        process(p);
        p = p->next;
    }
[Animation: a linked list of five nodes, each with a data field and a next pointer; starting at head, the pointer p advances one node per loop iteration]
Task Construct – Explicit Task View
    node *p = head;
    #pragma omp parallel
    {
        #pragma omp single
        while (p) {
            #pragma omp task firstprivate(p)
            process(p);
            p = p->next;
        }
    }
• A team of threads is forked at the omp parallel construct
• A single thread, T0, executes the while loop
• Each time T0 crosses the omp task construct it generates a new task
• Because p is declared outside the parallel region it is shared, so each task needs firstprivate(p) to capture the node it was created for
• Each task runs in a thread
• All tasks are complete at the implicit barrier at the end of the single construct
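The slide code can be turned into a complete function. The following is a minimal sketch under stated assumptions: the node layout, the body of process() (it doubles the stored value) and the helpers make_list, sum_list and free_list are hypothetical and exist only so that the result can be checked. Here p is declared inside the single block, so it would be firstprivate in the task by default; the clause is spelled out to make the intent visible.

```c
#include <stdlib.h>

typedef struct node {
    int data;
    struct node *next;
} node;

/* Hypothetical per-node work: double the stored value. */
static void process(node *p)
{
    p->data *= 2;
}

/* Free every node of the list. */
void free_list(node *head)
{
    while (head) {
        node *next = head->next;
        free(head);
        head = next;
    }
}

/* Build the list 1 -> 2 -> ... -> n; returns NULL if allocation fails. */
node *make_list(int n)
{
    node *head = NULL;
    for (int i = n; i >= 1; i--) {
        node *p = malloc(sizeof *p);
        if (!p) {
            free_list(head);
            return NULL;
        }
        p->data = i;
        p->next = head;
        head = p;
    }
    return head;
}

/* One thread walks the list and generates one task per node. */
void process_list(node *head)
{
    #pragma omp parallel
    {
        #pragma omp single
        {
            for (node *p = head; p; p = p->next) {
                #pragma omp task firstprivate(p)
                process(p);
            }
        }   /* implicit barrier: every generated task has completed */
    }
}

/* Sum the data fields; used here only to check the result. */
long sum_list(const node *head)
{
    long total = 0;
    for (; head; head = head->next)
        total += head->data;
    return total;
}
```

Since each task touches a different node, no synchronization is needed inside process(). The order in which nodes are processed is not specified, so this pattern only fits work that does not depend on list order.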
A Linked List Example
[Animation: the task version of the loop walks the five-node list; each time p advances, a new process(p) task is generated and executed by a thread in the team while the generating thread moves on to the next node]
When are tasks guaranteed to be complete?
• Tasks are guaranteed to be complete:
  • At thread or task barriers
  • At the directive: #pragma omp barrier
  • At the directive: #pragma omp taskwait (this waits only for the child tasks generated by the current task)
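The completion points above matter whenever one task reads a value that another task writes. The following is a minimal sketch, with the hypothetical function name sum_halves: two child tasks each sum half of an array, and taskwait ensures both partial sums are finished before they are combined.

```c
/* Hypothetical example: sum the two halves of an array in two child tasks. */
long sum_halves(const int *a, int n)
{
    long lo = 0, hi = 0, total = 0;

    #pragma omp parallel
    #pragma omp single
    {
        #pragma omp task shared(lo)
        for (int i = 0; i < n / 2; i++)
            lo += a[i];

        #pragma omp task shared(hi)
        for (int i = n / 2; i < n; i++)
            hi += a[i];

        /* Without this directive, lo and hi could be read too early. */
        #pragma omp taskwait

        total = lo + hi;   /* both child tasks are complete here */
    }
    return total;
}
```

The implicit barrier at the end of the single construct would also complete the tasks, but only after the line that computes total; taskwait is what makes the read inside the block safe.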
Example: Naive Fibonacci Calculation
• Recursion is typically used to calculate a Fibonacci number
• Widely used as a toy benchmark
  • Easy to code
  • Has an unbalanced task graph
    long SerialFib( long n )
    {
        if( n < 2 )
            return n;
        else
            return SerialFib(n-1) + SerialFib(n-2);
    }
Example: Naive Fibonacci Calculation
• We can envision Fibonacci computation as a task graph
[Figure: task graph of SerialFib calls, from SerialFib(4) down to SerialFib(1) and SerialFib(0)]
Fibonacci - Task Spawning Solution
    long ParallelFib(long n)
    {
        long sum;
        #pragma omp parallel
        {
            #pragma omp single
            FibTask(n,&sum);
        }
        return sum;
    }
• Write a helper function to set up the parallel region
• Call FibTask() to do the computation
• Use the sum return parameter in FibTask()
Fibonacci - Task Spawning Solution
    void FibTask(long n, long* sum)
    {
        if( n < CutOff ) {
            *sum = SerialFib(n);
        }
        else {
            long x, y;
            #pragma omp task shared(x)
            FibTask(n-1,&x);
            #pragma omp task shared(y)
            FibTask(n-2,&y);
            #pragma omp taskwait
            *sum = x+y;
        }
    }
• The thread first checks the value of n against CutOff
• If n is below the cutoff, the value is computed serially with SerialFib()
• Otherwise, the thread creates two new tasks
  • One to compute the n-1 Fib value
  • One to compute the n-2 Fib value
• The computed values for these tasks are returned through the local variables x and y, respectively; they must be declared shared on the task constructs, because a local variable is firstprivate in a task by default and the result would be written to the task's own copy
• The #pragma omp taskwait is required to make sure that the values for x and y have been computed before they are added together into sum
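Put together, the helper and the task function form a small program unit. The following is a minimal sketch assembled from the slide code; the value 20 for CutOff is an assumption, since the slides do not define it. Each recursive call reaches its cutoff and runs SerialFib() inside whichever task it landed in.

```c
/* Assumed threshold: below it the serial version is used (value not from the slides). */
static const long CutOff = 20;

long SerialFib(long n)
{
    if (n < 2)
        return n;
    return SerialFib(n - 1) + SerialFib(n - 2);
}

static void FibTask(long n, long *sum)
{
    if (n < CutOff) {
        *sum = SerialFib(n);        /* small problem: no task overhead */
    } else {
        long x, y;

        #pragma omp task shared(x)  /* shared so the child writes this x */
        FibTask(n - 1, &x);

        #pragma omp task shared(y)
        FibTask(n - 2, &y);

        #pragma omp taskwait        /* wait for both child tasks */
        *sum = x + y;
    }
}

/* Sets up the parallel region; one thread starts the recursion. */
long ParallelFib(long n)
{
    long sum = 0;

    #pragma omp parallel
    {
        #pragma omp single
        FibTask(n, &sum);
    }
    return sum;
}
```

The cutoff is a tuning choice: a lower value creates more, smaller tasks and more scheduling overhead, a higher value leaves less parallel work. A suitable value depends on the machine and the runtime.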
Fibonacci Task Solution Example
[Animation: FibTask(8) creates the tasks FibTask(7,&x) and FibTask(6,&y); FibTask(7) in turn creates FibTask(6,&x) and FibTask(5,&y), each FibTask(6) creates FibTask(5,&x) and FibTask(4,&y), and so on. As each pair of child tasks completes, its parent computes *sum = x + y, until FibTask(8) itself finishes]
References • OpenMP API Specification, www.openmp.org