This document covers the parallelization of sorting algorithms, presented as examples of parallel problems outside linear algebra. It describes two parallel versions of Quicksort: one that partitions the array sequentially on a single process and one that partitions it in parallel across all processes. It discusses the communication between processes that the partitioning and sorting phases require, along with the runtime complexities, and closes with bubble sort and its parallel odd-even variant. It is aimed at computer science students and professionals studying algorithm efficiency and parallel computing.
Non-Linear Algebra Problems
Sathish Vadhiyar
SERC, IISc
Quick Sort
Sequential

Quicksort(A, q, r) {
    /* Divide A[q..r] into A[q..s] and A[s+1..r] such that every element
       of the first subarray is less than every element of the second */
    if (q >= r) return;    /* fewer than two elements: nothing to sort */
    Choose a pivot x
    Partition A[q..r] into 2 subarrays such that A[q..s] < x and A[s+1..r] >= x
    Quicksort(A, q, s);
    Quicksort(A, s+1, r);
}
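The pseudocode above can be made runnable as the following minimal sketch. The slides do not say how the pivot is chosen or how the partition is carried out, so this version assumes a Hoare-style partition with the first element as pivot; it guarantees A[q..s] <= x and A[s+1..r] >= x, a slightly weaker condition than the strict inequality on the slide. The function names `partition` and `quicksort` are illustrative.

```python
def partition(a, q, r):
    """Hoare-style partition of a[q..r] (inclusive) around x = a[q].

    Assumption: pivot is the first element; the slides leave this open.
    Returns s with q <= s < r such that a[q..s] <= x and a[s+1..r] >= x.
    """
    x = a[q]
    i, j = q - 1, r + 1
    while True:
        i += 1
        while a[i] < x:
            i += 1
        j -= 1
        while a[j] > x:
            j -= 1
        if i >= j:
            return j
        a[i], a[j] = a[j], a[i]


def quicksort(a, q, r):
    """Sort a[q..r] in place, following the recursive structure on the slide."""
    if q >= r:
        return  # zero or one element: already sorted
    s = partition(a, q, r)
    quicksort(a, q, s)
    quicksort(a, s + 1, r)


data = [5, 4, 19, 30, 21, 6, 13, 99, 51, 55, 12, 7, 1]
quicksort(data, 0, len(data) - 1)
print(data)
```

Because `partition` always returns an index strictly less than `r`, both recursive calls work on smaller ranges and the recursion terminates.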
Quicksort Parallel (Version 1)
• At the 1st step, one of the processes partitions the array into 2 partitions and distributes them to 2 processes.
• At the 2nd step, each of the 2 processes partitions its array, giving rise to 4 partitions distributed to 4 processes.
• Continue until all the processes have partitions.
• Perform serial quicksort on each of the partitions.
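The steps above can be simulated in a single Python process, as in the rough sketch below. It is not a message-passing implementation: each list in `partitions` stands for the data held by one process, and the inner loop stands for work that separate processes would do concurrently. The pivot rule (middle element of each partition), the power-of-two process count, and the use of `sorted` in place of serial quicksort are all assumptions made for illustration.

```python
def split(values, pivot):
    """Sequentially partition one process's data around the pivot."""
    small = [v for v in values if v < pivot]
    large = [v for v in values if v >= pivot]
    return small, large


def parallel_quicksort_v1(values, num_procs):
    """Simulate Version 1; num_procs is assumed to be a power of two.

    Returns one sorted list per simulated process. Concatenating the
    lists in order gives the fully sorted array.
    """
    partitions = [list(values)]
    while len(partitions) < num_procs:
        next_level = []
        for part in partitions:  # concurrent on real processes
            if part:
                # Assumed pivot rule: middle element of the partition.
                small, large = split(part, part[len(part) // 2])
            else:
                small, large = [], []
            next_level += [small, large]
        partitions = next_level
    # Final step: every process sorts its own partition serially
    # (sorted() stands in for serial quicksort).
    return [sorted(part) for part in partitions]


data = [5, 4, 19, 30, 21, 6, 13, 99, 51, 55, 12, 7, 1]
for rank, part in enumerate(parallel_quicksort_v1(data, 8), start=1):
    print(f"P{rank}: {part}")
```

With a poor pivot some processes receive empty partitions, which shows how strongly the load balance of this scheme depends on pivot quality.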
Quicksort Parallel (Version 1)
• Example: 8 processes, 13 elements
Input: 5 4 19 30 21 6 13 99 51 55 12 7 1
After the 1st split: 5 4 6 13 12 7 1 | 19 21 30 99 51 55
[Figure: the partitions are split further at each step and distributed across processes P1 to P8]
Sorted result: 1 4 5 6 7 12 13 19 21 30 51 55 99
Quicksort Parallel (Version 1) - Problems
• The parallel runtime follows the recurrence T(n) = T(n/2) + O(n)
• The O(n) term is due to sequential partitioning
• Parallel partitioning is needed
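To see why the O(n) term is a problem, the recurrence can be unrolled. The sketch below assumes balanced splits and writes the partitioning cost as cn for a hypothetical constant c; neither assumption is stated in the slides.

```latex
\begin{aligned}
T(n) &= T(n/2) + cn \\
     &= cn + \frac{cn}{2} + \frac{cn}{4} + \dots + T(1) \\
     &\le 2cn + T(1) = O(n)
\end{aligned}
```

The parallel time therefore stays linear in n however many processes are used. Since sequential quicksort runs in O(n log n) on average, the speedup of this version is bounded by O(log n).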
Quicksort Parallel (Version 2)
• Initially, each process is assigned n/p elements.
• A pivot is chosen by one of the processes in the group and broadcast to all other processes.
• Each process forms 2 blocks, S and L, where S < pivot and L >= pivot.
• The entire array is rearranged so that all the S blocks are at the beginning of the array and all the L blocks are at the end.
Quicksort Parallel (Version 2)
• Thus 2 groups of processes are formed, one to sort S and one to sort L.
• Parallel quicksort is called recursively on each group.
• Recursion terminates when a particular sub-block is assigned to only a single process.
• Finally, each process calls serial quicksort on its local array.
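The Version 2 scheme can also be simulated in one Python process, as a sketch under stated assumptions. Each inner list in `blocks` stands for the local array of one process. The slides do not specify how the pivot is picked or how processes are divided between the S and L groups; this sketch assumes the pivot is the middle element of the first non-empty block and that group sizes are proportional to the amount of data in S and L. The helper `distribute` and the use of `sorted` for the final serial sort are likewise illustrative.

```python
def distribute(values, k):
    """Split values into k contiguous chunks of nearly equal size."""
    size, extra = divmod(len(values), k)
    chunks, start = [], 0
    for i in range(k):
        end = start + size + (1 if i < extra else 0)
        chunks.append(values[start:end])
        start = end
    return chunks


def parallel_quicksort_v2(blocks):
    """Simulate Version 2 on a list of per-process local arrays."""
    p = len(blocks)
    if p == 1:
        # Recursion ends at a single process, which sorts serially.
        return [sorted(blocks[0])]
    n = sum(len(b) for b in blocks)
    if n == 0:
        return [[] for _ in blocks]
    # One process picks the pivot and "broadcasts" it (assumed rule).
    first = next(b for b in blocks if b)
    pivot = first[len(first) // 2]
    # Local splitting: every process forms its own S and L blocks.
    small = [v for b in blocks for v in b if v < pivot]
    large = [v for b in blocks for v in b if v >= pivot]
    # Global rearrangement: S blocks go to the first group of
    # processes, L blocks to the second (sizes are an assumption).
    p_small = min(p - 1, max(1, round(p * len(small) / n)))
    p_large = p - p_small
    return (parallel_quicksort_v2(distribute(small, p_small))
            + parallel_quicksort_v2(distribute(large, p_large)))


data = [5, 4, 19, 30, 21, 6, 13, 99, 51, 55, 12, 7, 1]
for rank, local in enumerate(parallel_quicksort_v2(distribute(data, 4)), 1):
    print(f"P{rank}: {local}")
```

Every recursive call receives fewer processes than its caller, so the recursion reaches the single-process case after at most p - 1 levels; with good pivots the depth is about log p.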
Quicksort Parallel (Version 2) - Challenges
• Communication of S and L between processes
• Runtime:
  Split:
    O(log P) – broadcast of pivot
    O(n/P) – local splitting
    O(log P) – global splitting
  Thus for a single split – O(n/P) + O(log P)
  log P splits – O((n/P) log P) + O(log^2 P)
  Local sort – O((n/P) log(n/P))
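Adding the terms listed above gives the overall parallel runtime. The sketch below assumes that every split is balanced, so that each process keeps about n/P elements throughout; the symbol T_P for the parallel time is introduced here for convenience.

```latex
\begin{aligned}
T_P &= \underbrace{O\!\left(\tfrac{n}{P}\log\tfrac{n}{P}\right)}_{\text{local sort}}
     + \underbrace{O\!\left(\tfrac{n}{P}\log P\right)}_{\text{local splitting}}
     + \underbrace{O\!\left(\log^2 P\right)}_{\text{broadcast and global splitting}} \\
    &= O\!\left(\tfrac{n}{P}\log n\right) + O\!\left(\log^2 P\right)
\end{aligned}
```

The second line uses log(n/P) + log P = log n. The first term is the sequential O(n log n) work divided evenly among P processes, and the second is the communication overhead.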
Bubble Sort
5 4 19 30 21 6 13 99 51 55 12 7
4 5 19 30 21 6 13 99 51 55 12 7
4 5 19 30 21 6 13 99 51 55 12 7
4 5 19 30 21 6 13 99 51 55 12 7
4 5 19 21 30 6 13 99 51 55 12 7
4 5 19 21 6 30 13 99 51 55 12 7
4 5 19 21 6 13 30 99 51 55 12 7
4 5 19 21 6 13 30 99 51 55 12 7
4 5 19 21 6 13 30 51 99 55 12 7
4 5 19 21 6 13 30 51 55 99 12 7
4 5 19 21 6 13 30 51 55 12 99 7
4 5 19 21 6 13 30 51 55 12 7 99
Difficult to parallelize: each compare-exchange depends on the result of the previous one
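The trace above corresponds to one left-to-right pass of bubble sort, with the array recorded after every comparison. A small sketch that reproduces it (the function name `bubble_pass_trace` is illustrative):

```python
def bubble_pass_trace(values):
    """Run one bubble-sort pass and record the array after each comparison."""
    a = list(values)
    trace = [list(a)]  # first row: the input
    for i in range(len(a) - 1):
        # This comparison reads a[i], which the previous iteration may
        # have just written; that dependency chain is what makes the
        # pass hard to parallelize.
        if a[i] > a[i + 1]:
            a[i], a[i + 1] = a[i + 1], a[i]
        trace.append(list(a))
    return trace


for row in bubble_pass_trace([5, 4, 19, 30, 21, 6, 13, 99, 51, 55, 12, 7]):
    print(*row)
```

After the pass the largest element, 99, has moved to the last position; a full sort repeats the pass on the remaining prefix.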
Bubble Sort - Odd even variant
      5 4 19 30 21 6 13 99 51 55 12 7
odd   4 5 19 30 6 21 13 99 51 55 7 12
even  4 5 19 6 30 13 21 51 99 7 55 12
odd   4 5 6 19 13 30 21 51 7 99 12 55
even  4 5 6 13 19 21 30 7 51 12 99 55
odd   4 5 6 13 19 21 7 30 12 51 55 99
even  4 5 6 13 19 7 21 12 30 51 55 99
odd   4 5 6 13 7 19 12 21 30 51 55 99
even  4 5 6 7 13 12 19 21 30 51 55 99
odd   4 5 6 7 12 13 19 21 30 51 55 99
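The odd-even trace above can be reproduced with the short sketch below. It reads the slide's "odd" phase as comparing the pairs that start at positions 1, 3, 5, ... (1-based) and the "even" phase as comparing the pairs that start at positions 2, 4, 6, ...; this reading is inferred from the rows, and the function name is illustrative. The sketch always runs n phases, which is enough to sort any input of n elements, whereas the slide stops after 9 phases because the array is already sorted.

```python
def odd_even_sort_trace(values):
    """Odd-even transposition sort; returns (phase_name, array) per phase."""
    a = list(values)
    n = len(a)
    trace = []
    for phase in range(n):
        # "odd" phase: pairs (1,2), (3,4), ... in 1-based positions,
        # that is, index pairs starting at 0. "even" phase: start at 1.
        start = phase % 2
        # The pairs within one phase are disjoint, so all of these
        # compare-exchanges could run at the same time.
        for i in range(start, n - 1, 2):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
        trace.append(("odd" if start == 0 else "even", list(a)))
    return trace


for name, row in odd_even_sort_trace([5, 4, 19, 30, 21, 6, 13, 99, 51, 55, 12, 7]):
    print(f"{name:5}", *row)
```

Unlike plain bubble sort, no comparison inside a phase depends on another comparison in the same phase, which is what makes this variant suitable for parallel execution.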
Bubble sort – odd even variant (Parallelization)
• Assign n/p elements to each processor.
• Each processor first performs a local sort.
• Then p phases are performed.
• In each phase, O(n/p) comparisons and O(n/p) communications
• Total runtime – O((n/p) log(n/p)) + O(n) + O(n)
Bubble Sort - Odd even variant - Parallelization
5 4 19 30 21 6 13 99 51 55 12 7
5 4 19 30 | 21 6 13 99 | 51 55 12 7
4 5 19 30 | 6 13 21 99 | 7 12 51 55
4 5 6 13 | 19 21 30 99 | 7 12 51 55
4 5 6 13 | 7 12 19 21 | 30 51 55 99
4 5 6 7 | 12 13 19 21 | 30 51 55 99
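The block-based odd-even scheme can be sketched as a single-process simulation. The rows of the example are consistent with 3 processors holding 4 elements each; that layout is inferred from the numbers rather than stated in the slides. In each phase, neighbouring processors exchange their blocks, after which the left one keeps the smaller half and the right one keeps the larger half. The helper name `compare_split` and the use of `sorted` for both the local sort and the merge are illustrative stand-ins.

```python
def compare_split(left, right):
    """Neighbours exchange blocks; left keeps the smaller half.

    sorted() stands in for an O(n/p) merge of two sorted blocks.
    """
    merged = sorted(left + right)
    return merged[:len(left)], merged[len(left):]


def parallel_odd_even_sort(blocks):
    """Simulate block odd-even sort; returns the blocks after each step."""
    blocks = [sorted(b) for b in blocks]  # local sort on each processor
    p = len(blocks)
    history = [[list(b) for b in blocks]]
    for phase in range(p):  # p phases in total
        start = phase % 2   # alternate between the two pairings
        for i in range(start, p - 1, 2):  # disjoint pairs: concurrent
            blocks[i], blocks[i + 1] = compare_split(blocks[i], blocks[i + 1])
        history.append([list(b) for b in blocks])
    return history


start_blocks = [[5, 4, 19, 30], [21, 6, 13, 99], [51, 55, 12, 7]]
for step in parallel_odd_even_sort(start_blocks):
    print(" | ".join(" ".join(map(str, b)) for b in step))
```

Each phase moves O(n/p) elements between neighbours and performs O(n/p) comparisons, which matches the per-phase cost on the preceding slide.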