## CS4402 – Parallel Computing

Lecture 9 – Sorting Algorithms (2)

- Compare and Exchange Operation
- Compare and Exchange Sorting

**Compare and Exchange Operation**

Takes place between processors rank1 and rank2, with rank1 < rank2. Each processor keeps a sorted sub-array a = (a[i], i = 0, 1, …, n-1). After the operation, rank1 holds the n smallest and rank2 the n largest of the 2n elements.

    if (rank == rank1) {
        /* Lower rank: send first, then receive; keep the lower half of the merge. */
        MPI_Send(a, n, MPI_INT, rank2, tag1, MPI_COMM_WORLD);
        MPI_Recv(b, n, MPI_INT, rank2, tag2, MPI_COMM_WORLD, &status);
        c = merge(n, a, n, b);
        for (i = 0; i < n; i++) a[i] = c[i];
    }
    if (rank == rank2) {
        /* Higher rank: receive first so the two blocking sends cannot deadlock;
           the partner is rank1. Keep the upper half of the merge. */
        MPI_Recv(b, n, MPI_INT, rank1, tag1, MPI_COMM_WORLD, &status);
        MPI_Send(a, n, MPI_INT, rank1, tag2, MPI_COMM_WORLD);
        c = merge(n, a, n, b);
        for (i = 0; i < n; i++) a[i] = c[i + n];
    }

**Compare and Exchange Operation**

Complexity?

- What amount of computation is being used?
- What amount of communication takes place?

Can you find arguments to prove that this is optimal or efficient?

**Compare and Exchange Algorithms**

Step 1. The array is scattered into p sub-arrays, one per processor.

Step 2. Each processor sorts its sub-array. At any time the processors keep their sub-arrays sorted.

Step 3. While the array is not sorted (or for as many rounds as needed), compare and exchange between some pairs of processors.

Step 4. Gather the sub-arrays to restore a sorted array.

**Odd-Even Sort**

1. Scatter the array onto processors.
2. Sort each sub-array aa.
3. Repeat for step = 0, 1, 2, …, p-1:

        if (step is odd) {
            if (rank is odd)  exchange(aa, n/size, rank, rank+1);
            if (rank is even) exchange(aa, n/size, rank-1, rank);
        }
        if (step is even) {
            if (rank is even) exchange(aa, n/size, rank, rank+1);
            if (rank is odd)  exchange(aa, n/size, rank-1, rank);
        }

4. Gather the sub-arrays back to root.

**Odd-Even Sort**

Simple Remarks:

• Odd-Even Sort uses size rounds of exchange.
• Odd-Even Sort keeps all processors busy … or almost all.
• The complexity is given by:
  - Scatter and Gather of the array, n/size elements per processor
  - Sorting the local array of n/size elements
  - Compare and Exchange: size rounds, each involving n/size elements

Odd-even sort in MPI:

    if (rank == 0) {
        /* Root generates n random values in [0, m]. */
        array = (double *) calloc(n, sizeof(double));
        srand((unsigned) time(NULL) + rank);
        for (x = 0; x < n; x++) array[x] = ((double) rand() / RAND_MAX) * m;
    }

    /* Each processor receives n/size elements in a and sorts them locally. */
    MPI_Scatter(array, n/size, MPI_DOUBLE, a, n/size, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    merge_sort(n/size, a);

    /* size rounds of compare and exchange between neighbours. */
    for (i = 0; i < size; i++) {
        if ((i + rank) % 2 == 0) {
            if (rank < size - 1) exchange(n/size, a, rank, rank + 1, MPI_COMM_WORLD);
        } else {
            if (rank > 0) exchange(n/size, a, rank - 1, rank, MPI_COMM_WORLD);
        }
        MPI_Barrier(MPI_COMM_WORLD);
    }

    MPI_Gather(a, n/size, MPI_DOUBLE, array, n/size, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    if (rank == 0) {
        for (x = 0; x < n; x++) printf("Output : %f\n", array[x]);
    }

**Comments on Odd-Even**

Features of the algorithm:

- Simple and quite efficient.
- After p steps of compare and exchange the array is sorted. Why?
- The number of steps can be reduced by testing "array sorted" after each step, but it remains O(p).
- C&E operations take place only between neighbours.

Can we do C&E operations between other processors?

**Odd-Even Sort Complexity**

- Stage 0. Sort the scattered sub-arrays: O(n/p · log(n/p)) computation.
- Stage 1. Odd-Even for p levels: each level merges two sub-arrays of n/p elements, so p · O(n/p) = O(n) computation, and each level communicates n/p elements per processor.
- Scatter and Gather: O(n) communication.
- Total computation complexity: O(n/p · log(n/p) + n).

**isSorted(n, a, comm)**

The parallel routine int isSorted(int n, double *a, MPI_Comm comm)

• Tests whether the local arrays of all processors are in order.
• If rank1 < rank2, then the elements of rank1 are no larger than the elements of rank2.
• If the answer is yes, then no exchange is needed.

How to do it?

• The test is done at the root.
• The test is done collectively by all processors.

**isSorted(n,a,comm) – Strategy 1**

The test is done collectively by all processors.

• Send the last local element to the right processor.
• Receive last from the left processor.
• If last > a[0], then answer = 0; otherwise answer = 1.
• All_Reduce answer by using MIN.

**isSorted(n,a,comm) – Strategy 2**

The test is done at the root.

• Gather the first elements to the root.
• Gather the last elements to the root.
• If rank == 0, then for i = 0, 1, …, size-2 test whether last[i] > first[i+1]; if so, answer = 0.
• Broadcast the answer.

**Shell Sort**

It is based on the notion of a "shell/group" of consecutive processors.

- C&E take place between equally extreme processors of a shell (first with last, second with second-to-last, and so on).
- The shell is then divided into 2.

        (0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15)                          #(shell)=p
        (0 1 2 3 4 5 6 7) (8 9 10 11 12 13 14 15)                        #(shell)=p/2
        (0 1 2 3) (4 5 6 7) (8 9 10 11) (12 13 14 15)                    #(shell)=p/4
        (0 1) (2 3) (4 5) (6 7) (8 9) (10 11) (12 13) (14 15)            #(shell)=p/8

- There are log(p) levels of division. For level l:
  - there are pow(2,l) shells, each of size p/pow(2,l);
  - shell k contains the processors k*p/pow(2,l), …, (k+1)*p/pow(2,l) - 1.

**Shell Sort**

Shell Sort is based on two stages:

Stage 1. Divide the shells, for l = 0, 1, 2, …, log(p)-1

- exchange in parallel between extreme processors in each shell.

Stage 2. Odd-Even, for l = 0, 1, 2, …, p

- if rank and l are both even, then exchange in parallel between rank and rank+1
- if rank and l are both odd, then exchange in parallel between rank and rank+1
- test "array sorted"

**Shell Sort Complexity**

- Stage 0. Sort the scattered sub-arrays.
- Stage 1. Odd-Even for l levels. Catch: the average value of l is in this case O(log^2(p)).
- Scatter and Gather.
- Total computation complexity.

**Complexity Comparison for Parallel Sorting**

- Odd-Even Sort
- Shell Sort
- Merge Sort

**Assignment**

Description: Write an MPI program to sort an array:

• Use an MPI method to compare and exchange.
• Use an MPI method to test isSorted().
• Use the odd-even sort.
• Evaluate the performance of the program in a readme.doc.

General Points:

• It is worth 10% of the marks.
• Deadline: Monday 2/12/2013 at 5 pm.
• The following elements must be submitted by email to j.horan@4c.ucc.ie:
  - The C program, named with your name and student number, e.g. SabinTabirca_111111111.c.
  - The Makefile.
  - Readme.doc, in which you have 1) to give your student details and 2) to state the performance.
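
To see why p rounds of compare and exchange are enough for Odd-Even Sort, the scheme can be tried out without MPI. The sketch below is a serial simulation, not the lecture's code: the p "processors" are consecutive blocks of one array, the send/receive pair is replaced by direct access to the neighbouring block, and the local sort uses the standard qsort instead of merge_sort. The names cmp_int, compare_exchange and odd_even_block_sort are hypothetical, and the element type is int for brevity.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Comparison callback for qsort on ints. */
static int cmp_int(const void *x, const void *y) {
    int a = *(const int *)x, b = *(const int *)y;
    return (a > b) - (a < b);
}

/* Compare and exchange between two sorted blocks of m elements each:
   merge them, then "low" keeps the m smallest and "high" the m largest.
   (Assumption: stands in for the MPI send/receive + merge of the slides.) */
static void compare_exchange(int *low, int *high, int m) {
    int *c = malloc(2 * (size_t)m * sizeof *c);
    assert(c != NULL);
    int i = 0, j = 0, k = 0;
    while (i < m && j < m) c[k++] = (low[i] <= high[j]) ? low[i++] : high[j++];
    while (i < m) c[k++] = low[i++];
    while (j < m) c[k++] = high[j++];
    memcpy(low, c, (size_t)m * sizeof *c);
    memcpy(high, c + m, (size_t)m * sizeof *c);
    free(c);
}

/* Odd-even sort of a[0..n-1] over p simulated ranks; n must be a multiple of p. */
void odd_even_block_sort(int *a, int n, int p) {
    assert(p > 0 && n % p == 0);
    int m = n / p;
    if (m == 0) return;                      /* nothing to sort */

    /* Each simulated rank sorts its own block of m elements. */
    for (int r = 0; r < p; r++) qsort(a + r * m, (size_t)m, sizeof *a, cmp_int);

    /* p rounds: in round `step`, ranks with the same parity as `step`
       exchange with their right neighbour (if it exists). */
    for (int step = 0; step < p; step++)
        for (int r = step % 2; r + 1 < p; r += 2)
            compare_exchange(a + r * m, a + (r + 1) * m, m);
}
```

For example, calling odd_even_block_sort(a, 8, 4) on a = {7, 3, 8, 1, 6, 2, 5, 4} leaves a as 1, 2, …, 8. In a real MPI program the inner loop over r disappears, because every rank executes one iteration of it concurrently.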