This document provides a comprehensive outline of OpenMP, a portable standard for shared memory multi-processing. It covers fundamental concepts such as multi-threading, how to utilize OpenMP directives, and the varied scheduling methods for efficient thread management. The limitations of OpenMP are discussed, including scalability issues and thread binding challenges. Additionally, it explores the integration of OpenMP with MPI for larger-scale applications. By following the practical examples and instructions, users can effectively implement OpenMP in their programming workflows.
OpenMP E. Bruce Pitman October, 2002
Outline • What is OpenMP • Multi-threading • How to use OpenMP • Limitations • OpenMP + MPI • References
What is OpenMP? • A portable standard for shared memory multi-processing • A set of compiler directives, with supporting library routines and environment variables • Fine-grained (loop-level) parallelism • Supported, to varying degrees, by Sun, SGI, IBM, HP, Intel
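To make "loop-level compiler directives" concrete, here is a minimal sketch of a single parallelized loop. The program name, array size, and values are illustrative assumptions, not taken from the slides.

```fortran
program saxpy_omp
  implicit none
  integer, parameter :: n = 1000   ! illustrative problem size (assumption)
  real :: x(n), y(n), a
  integer :: i

  a = 2.0
  x = 1.0
  y = 3.0

  ! A compiler without OpenMP treats the directive lines as plain comments.
  ! With OpenMP enabled, the loop iterations are divided among the threads;
  ! the loop index i is private to each thread automatically.
  !$omp parallel do
  do i = 1, n
     y(i) = a*x(i) + y(i)
  end do
  !$omp end parallel do

  ! Every element should now equal 2.0*1.0 + 3.0 = 5.0
  print *, 'y(1) =', y(1), ' y(n) =', y(n)
end program saxpy_omp
```

Compile with your compiler's OpenMP flag (-mp in the f90 command used later in these slides) to run it in parallel, or without the flag to run it serially.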
How to use OpenMP • schedule(static[,chunk]) • Deal out blocks of iterations of size “chunk” to each thread. • schedule(dynamic[,chunk]) • Each thread grabs “chunk” iterations off a queue until all iterations have been handled. • schedule(guided[,chunk]) • Threads dynamically grab blocks of iterations. The size of the block starts large and shrinks down to size “chunk” as the calculation proceeds.
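One way to see how a schedule deals out iterations is to record which thread handled each one. The sketch below does this for schedule(static, chunk); the program name, loop length, and chunk size are illustrative assumptions.

```fortran
program schedule_demo
  !$ use omp_lib
  implicit none
  integer, parameter :: n = 16, chunk = 2   ! illustrative values (assumption)
  integer :: owner(n), i

  owner = 0

  ! Blocks of "chunk" consecutive iterations are dealt out to the threads
  ! in round-robin order.
  !$omp parallel do schedule(static, chunk)
  do i = 1, n
     ! The "!$" line is compiled only when OpenMP is enabled;
     ! in a serial build owner(i) simply stays 0.
     !$ owner(i) = omp_get_thread_num()
  end do
  !$omp end parallel do

  print '(16i3)', owner
end program schedule_demo
```

With four threads, the static schedule assigns iterations 1-2 to thread 0, 3-4 to thread 1, and so on, wrapping back to thread 0 at iteration 9. Changing the clause to schedule(dynamic, chunk) or schedule(guided, chunk) lets the assignment vary from run to run.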
How to use OpenMP • schedule(runtime) • Schedule and chunk size taken from the OMP_SCHEDULE environment variable.
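A minimal sketch of schedule(runtime), assuming a simple summation loop (the program and variable names are illustrative). The schedule is chosen when the program is launched rather than when it is compiled.

```fortran
program runtime_schedule
  implicit none
  integer, parameter :: n = 100   ! illustrative loop length (assumption)
  integer :: i, total

  total = 0

  ! The schedule kind and chunk size are read from OMP_SCHEDULE at run time.
  !$omp parallel do schedule(runtime) reduction(+:total)
  do i = 1, n
     total = total + i
  end do
  !$omp end parallel do

  ! The sum 1 + 2 + ... + 100 is 5050 whatever schedule is selected.
  print *, 'sum =', total
end program runtime_schedule
```

In csh, the shell used in the batch script on these slides, the schedule could be selected with, for example, setenv OMP_SCHEDULE "dynamic,4" before running the program. This makes it possible to compare schedules without recompiling.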
How to use OpenMP
Here's an example of PRIVATE and FIRSTPRIVATE. Variables A, B, and C all equal 1 before the parallel region:
C$OMP PARALLEL PRIVATE(B)
C$OMP& FIRSTPRIVATE(C)
Inside this parallel region:
• "A" is shared by all threads and equals 1
• "B" and "C" are local to each thread
  - B's initial value is undefined
  - C's initial value equals 1
Outside this parallel region the values of "B" and "C" are undefined.
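The rules above can be sketched as a small complete program. The variable tid and the per-thread assignments are illustrative assumptions added to make the behavior visible; they are not part of the original example.

```fortran
program private_demo
  !$ use omp_lib
  implicit none
  integer :: a, b, c, tid

  a = 1; b = 1; c = 1
  tid = 0

  !$omp parallel private(b, tid) firstprivate(c)
  tid = 0                          ! serial build: the only "thread" is 0
  !$ tid = omp_get_thread_num()
  b = tid          ! b is undefined on entry, so assign it before use
  c = c + tid      ! c starts at 1 in every thread
  !$omp critical
  print *, 'thread', tid, ' a =', a, ' b =', b, ' c =', c
  !$omp end critical
  !$omp end parallel

  ! a is shared and was never modified, so it is still 1 here.
  print *, 'after the region, a =', a
end program private_demo
```

Each thread should report a = 1, b equal to its own thread number, and c equal to 1 plus its thread number. The program does not read b or c after the region, in line with the rule that their values there are undefined.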
How to use OpenMP
Sample PBS batch script:
#!/bin/csh -f
#PBS -l ncpus=8
#PBS -V
#PBS -q medium_p
#PBS -M pitman@math.buffalo.edu
cp /CCRSGI/home/pitman/fortran/sph/sph* $PBSTMPDIR
cd $PBSTMPDIR
setenv OMP_NUM_THREADS 8
f90 -O2 -LNO -IPA -n32 -mips4 -r12000 -mp -o sph sphomp.f
time ./sph > outfile
cp outfile /CCRSGI/home/pitman/fortran/sph/
# remove scratch directory
cd /FCScratch
\rm -r $PBSTMPDIR
How to use OpenMP
      k = 1
      do while (k .le. maxit .and. error .gt. tol)
         error = 0.0
!$omp parallel
! copy the current solution into uold
!$omp do
         do j = 1,m
            do i = 1,n
               uold(i,j) = u(i,j)
            enddo
         enddo
! compute the residual, update u, and accumulate the squared error
!$omp do private(resid) reduction(+:error)
         do j = 2,m-1
            do i = 2,n-1
               resid = (ax*(uold(i-1,j) + uold(i+1,j)) &
                     + ay*(uold(i,j-1) + uold(i,j+1)) &
                     + b * uold(i,j) - f(i,j))/b
               u(i,j) = uold(i,j) - omega * resid
               error = error + resid*resid
            enddo
         enddo
!$omp end do nowait
!$omp end parallel
         k = k + 1
         error = sqrt(error)/dble(n*m)
      enddo
Limitations • Advantages: easy to port serial code to OpenMP; OpenMP code can still run in serial mode. HOWEVER • Shared memory machines only • Limited scalability: beyond ~8 processors there is not much additional speed-up • Overhead of parallel do and parallel regions
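Both the serial-mode point and the overhead point can be checked with a small timing sketch. It assumes an illustrative summation loop and uses the standard Fortran system_clock intrinsic so that the same source times itself in a serial or a parallel build; the program and variable names are not from the slides.

```fortran
program serial_or_parallel
  !$ use omp_lib
  implicit none
  integer, parameter :: n = 1000000   ! illustrative loop length (assumption)
  integer :: i, nthreads, t0, t1, rate
  double precision :: total

  nthreads = 1                         ! value kept in a serial build
  ! Lines starting with "!$" are compiled only when OpenMP is enabled.
  !$ nthreads = omp_get_max_threads()

  total = 0.0d0
  call system_clock(t0, rate)
  !$omp parallel do reduction(+:total)
  do i = 1, n
     total = total + 1.0d0/dble(i)
  end do
  !$omp end parallel do
  call system_clock(t1)

  print *, 'threads =', nthreads, ' sum =', total
  if (rate > 0) print *, 'elapsed seconds =', dble(t1 - t0)/dble(rate)
end program serial_or_parallel
```

Running it with OMP_NUM_THREADS set to 1, 2, 4, 8, and so on gives a rough picture of where the speed-up levels off on a particular machine. For a loop this small, the cost of starting the parallel region may well outweigh the gain.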
Limitations • OpenMP currently does not specify or provide constructs for controlling the binding of threads to processors. • Threads can migrate between processors, causing overhead. • This behavior is system-dependent. • System-dependent solutions may be available.
References • www.openmp.org • http://www.ccr.buffalo.edu/documents/CCR_openmp_pbs.PDF • http://www.epcc.ed.ac.uk/research/openmpbench/ • http://www.llnl.gov/computing/tutorials/workshops/workshop/openMP/MAIN.html • http://scv.bu.edu/SCV/Tutorials/OpenMP/