
Efficient Parallelization Techniques for Integer Square Sum Calculation in MPI and OpenMP

This document presents parallel programming strategies for calculating the sum of squares of the first n integers using both the Message Passing Interface (MPI) and OpenMP, with example code for each approach. The MPI implementation distributes the work across multiple processes, while the OpenMP version uses shared memory for performance on multi-core systems. Key topics include optimizing code for performance, using reduction operations for accumulation, and profiling to identify computational hotspots.



Presentation Transcript


  1. Dr. Shilei Yao (么石磊), Asia-Pacific Technical Network (APTN), stone@beijing.sgi.com

  2. Domain decomposition: 1 domain is split into N sub-domains, and the data for each sub-domain is handled by one of N threads.

  3. APO (the Auto-Parallelizing Option, enabled with the -apo compiler flag) versus manual parallelization with C$OMP OpenMP directives.

  4. Example for Parallelisation

      program sum_squares
      integer I, n
      real*8 sum
      print *, 'enter n:'
      read *, n
      sum = 0
      do I = 1, n
        sum = sum + dble(I)**2
      end do
      print *, 'sum of first', n, ' squares =', sum
      end

  5. Rewriting the code for parallelisation

  Message Passing Parallel Programming (added distributed-memory message-passing code):

      program sum_squares
      include 'mpif.h'
      integer I, n, myid, numprocs, ierr
      real*8 my_sum, sum
      call MPI_INIT(ierr)
      call MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr)
      call MPI_COMM_SIZE(MPI_COMM_WORLD, numprocs, ierr)
      if (myid .eq. 0) then
        print *, 'enter n:'
        read *, n
      endif
      call MPI_BCAST(n, 1, MPI_INTEGER, 0,
     &               MPI_COMM_WORLD, ierr)
      my_sum = 0
      do I = 1 + myid*n/numprocs, (myid+1)*n/numprocs
        my_sum = my_sum + dble(I)**2
      end do
      call MPI_REDUCE(my_sum, sum, 1, MPI_DOUBLE_PRECISION,
     &                MPI_SUM, 0, MPI_COMM_WORLD, ierr)
      if (myid .eq. 0) then
        print *, 'sum of first', n, ' squares =', sum
      endif
      call MPI_FINALIZE(ierr)
      end

  Shared Memory Parallel Programming (added shared-memory OpenMP directive):

      program sum_squares
      integer I, n
      real*8 sum
      print *, 'enter n:'
      read *, n
      sum = 0
!$OMP PARALLEL DO reduction(+:sum)
      do I = 1, n
        sum = sum + dble(I)**2
      end do
      print *, 'sum of first', n, ' squares =', sum
      end

  6. Use the condition (IF) modifier in c$doacross to parallelize a loop only when it is worthwhile, e.g. when the trip count is large enough to outweigh the parallel startup overhead.

  7. Parallelisation Strategy
  • Port the code if necessary; make sure it gives the correct answer.
  • Profile the code; work only on the loops and routines that consume a large percentage of CPU time (think about coarse-grained parallelism).
  • Use APO and be done (!?), or
  • Hand-tune for loop-level parallelization.
  • To further improve performance, look for coarse-grained parallelism opportunities.
