Bitonic Sorting and Its Circuit Design
Kenneth E. Batcher, Professor, Kent State University. http://www.cs.kent.edu/~batcher. “Sorting networks and their applications”, AFIPS Proc. of 1968 Spring Joint Computer Conference, Vol. 32, pp. 307-314.
Background • Sorting is fundamental • The lower bound for any sequential comparison-based sorting algorithm is Ω(n log n) • Can we improve the time complexity further? • Parallel algorithms • Circuit/network design • Parallel computing models
①Bitonic Sequence • A sequence of elements {a0, a1, …, an-1} where either • (1) there exists an index i, 0 ≤ i ≤ n-1, such that {a0, …, ai} is monotonically increasing and {ai+1, …, an-1} is monotonically decreasing, e.g. {1, 2, 4, 7, 6, 0}, or • (2) there exists a cyclic shift of indices so that (1) is satisfied, e.g. {8, 9, 2, 1, 0, 4} → {0, 4, 8, 9, 2, 1}
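The definition above can be checked mechanically. The following is a minimal sketch, assuming "monotonically increasing/decreasing" allows equal neighbours; the function names `is_up_then_down` and `is_bitonic` are illustrative and not from the slides. It simply tries every cyclic shift, so it runs in O(n²) time and is meant for illustration only.

```python
def is_up_then_down(seq):
    """Case (1): non-decreasing up to some index, then non-increasing."""
    n = len(seq)
    i = 0
    # Walk up the increasing part.
    while i + 1 < n and seq[i] <= seq[i + 1]:
        i += 1
    # Walk down the decreasing part.
    while i + 1 < n and seq[i] >= seq[i + 1]:
        i += 1
    # Bitonic in the sense of case (1) only if we reached the last element.
    return i >= n - 1


def is_bitonic(seq):
    """Case (2): some cyclic shift of seq satisfies case (1)."""
    seq = list(seq)
    if len(seq) <= 2:
        return True
    return any(is_up_then_down(seq[k:] + seq[:k]) for k in range(len(seq)))


print(is_bitonic([1, 2, 4, 7, 6, 0]))   # example for case (1)
print(is_bitonic([8, 9, 2, 1, 0, 4]))   # example for case (2)
print(is_bitonic([1, 3, 2, 4]))         # two peaks, so not bitonic
```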
①Bitonic Sequence: Examples • {3, 5, 7, 9, 8, 6, 4, 2} (increasing, then decreasing) • {8, 6, 4, 2, 3, 5, 7, 9} (decreasing, then increasing) [Figure: value of element ai plotted against index a0 … a7]
①Bitonic Sequence: Examples • {3, 5, 7, 9, 11, 13, 15, 17} • {5, 3, 1, 2, 4, 6, 8, 7} [Figure: value of element ai plotted against index a0 … a7]
Bitonic Sort: basic idea • Consider a bitonic sequence S of size n where the first half ({a0, a1, …, an/2-1}) is increasing and the second half ({an/2, an/2+1, …, an-1}) is decreasing [Figure: value of element ai plotted against index a0 … an-1]
②“Bitonic Split” • Pair-wise min-max comparison (compare and exchange ai with ai+n/2) • S1 = {min(a0, an/2), min(a1, an/2+1), …, min(an/2-1, an-1)} • S2 = {max(a0, an/2), max(a1, an/2+1), …, max(an/2-1, an-1)} [Figure: the two halves a0 … an/2-1 and an/2 … an-1 overlaid; the lower envelope is S1 and the upper envelope is S2]
• There exists an element b in S1 such that all elements before b are increasing and all elements after b are decreasing • There exists an element c in S2 such that all elements before c are decreasing and all elements after c are increasing • Both S1 and S2 are bitonic sequences • Every element of S1 ≤ every element of S2 (because b ≤ c, b is the maximum value in S1, and c is the minimum value in S2) [Figure: S1 with its peak b lying below S2 with its trough c]
Pair-wise min-max comparison (compare and exchange), e.g. {2, 4, 6, 8, 7, 5, 3, 1}: halves {2, 4, 6, 8 | 7, 5, 3, 1} => S1 = {2, 4, 3, 1}, S2 = {7, 5, 6, 8}; a bitonic sequence of size 8 => 2 bitonic sequences of size 4
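The split in the example above can be reproduced with a few lines of code. This is a minimal sketch, assuming an even-length input; the name `bitonic_split` is illustrative and not taken from the slides.

```python
def bitonic_split(seq):
    """Pair-wise min-max comparison of element i with element i + n/2.

    Assumes len(seq) is even. Returns (S1, S2), where S1 collects the
    minima and S2 collects the maxima.
    """
    half = len(seq) // 2
    s1 = [min(seq[i], seq[i + half]) for i in range(half)]
    s2 = [max(seq[i], seq[i + half]) for i in range(half)]
    return s1, s2


s1, s2 = bitonic_split([2, 4, 6, 8, 7, 5, 3, 1])
print(s1, s2)              # the S1 and S2 of the slide's example
print(max(s1) <= min(s2))  # every element of S1 is below every element of S2
```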
②Bitonic Split • The split is applicable to any bitonic sequence. • The first half need not be increasing (or decreasing) and the second half need not be decreasing (or increasing). • Bitonic(n) → Bitonic Split → 2 × Bitonic(n/2)
Sorting a bitonic sequence • Apply the bitonic split recursively. INPUT: a bitonic sequence of size n • Phase 1: 2 bitonic sequences of size n/2 • Phase 2: 4 bitonic sequences of size n/4 • … • Phase (log n): n bitonic sequences of size 1 • Concatenating the n bitonic sequences of size 1 gives a sorted sequence
③Bitonic Merge • Sort a bitonic sequence using bitonic splits [Figure: bitonic merge network on 16 inputs, numbered 1 to 16; successive splits act on lengths 16, 8, 4, 2]
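Applying the split recursively gives the bitonic merge. The sketch below is one possible sequential rendering, assuming the input is already bitonic and its length is a power of two; the name `bitonic_merge` and the `ascending` parameter are illustrative. In a sorting network the two recursive calls would run in parallel.

```python
def bitonic_merge(seq, ascending=True):
    """Sort a bitonic sequence by recursive bitonic splits.

    Assumes seq is bitonic and len(seq) is a power of two.
    """
    n = len(seq)
    if n <= 1:
        return list(seq)
    half = n // 2
    # One bitonic split: minima go to `low`, maxima go to `high`.
    low = [min(seq[i], seq[i + half]) for i in range(half)]
    high = [max(seq[i], seq[i + half]) for i in range(half)]
    # Both halves are bitonic again, and every element of low <= every element of high.
    if ascending:
        return bitonic_merge(low, True) + bitonic_merge(high, True)
    return bitonic_merge(high, False) + bitonic_merge(low, False)


print(bitonic_merge([2, 4, 6, 8, 7, 5, 3, 1]))
print(bitonic_merge([8, 6, 4, 2, 3, 5, 7, 9], ascending=False))
```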
Questions? • How can we convert an unsorted sequence into a bitonic sequence? (Then, by using the bitonic split recursively, a sorted sequence can be formed.)
Turn an unsorted sequence into a bitonic sequence: ③Bitonic Merge (BM) Operation • At every phase, sort bitonic sequences of size 2, 4, 8, 16 into monotonically increasing or decreasing sequences [Figure: network on 16 inputs, numbered 1 to 16; merge lengths 4, 8, 16]
④Bitonic Sort [Figure: bitonic sorting network on 16 inputs, numbered 1 to 16; merge lengths 4, 8, 16]
Sorting an arbitrarily ordered sequence • Use bitonic merge repeatedly • Definitions: • Increasing BM[n]: increasing bitonic merge of size n, which sorts a bitonic sequence of size n into a monotonically increasing sequence • Decreasing BM[n]: decreasing bitonic merge of size n, which sorts a bitonic sequence of size n into a monotonically decreasing sequence
Steps: • Divide the sequence into groups of 2 • Any sequence of size 2 is a bitonic sequence: either the increasing part is of size 2 and the decreasing part is of size 0, or vice versa • Use the increasing BM[2] on one group to form an increasing sequence, and the decreasing BM[2] on the adjacent group to form a decreasing sequence • Concatenate the two groups to form a bitonic sequence of size 4
Steps: • Repeat the above steps on the other groups • Repeat the above steps recursively, doubling the group size each time, until a bitonic sequence of size n is formed • Use bitonic merge once more to turn the bitonic sequence into a sorted sequence
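The steps can be sketched as a short recursive program. This is a top-down formulation that produces the same comparisons as the bottom-up description (groups of 2, then 4, and so on); it assumes the input length is a power of two, and the names `bitonic_sort` and `bitonic_merge` are illustrative rather than taken from the slides.

```python
def bitonic_merge(seq, ascending):
    """Sort a bitonic sequence (length a power of two) by recursive splits."""
    n = len(seq)
    if n <= 1:
        return list(seq)
    half = n // 2
    low = [min(seq[i], seq[i + half]) for i in range(half)]
    high = [max(seq[i], seq[i + half]) for i in range(half)]
    if ascending:
        return bitonic_merge(low, True) + bitonic_merge(high, True)
    return bitonic_merge(high, False) + bitonic_merge(low, False)


def bitonic_sort(seq, ascending=True):
    """Sort an arbitrary sequence whose length is a power of two."""
    n = len(seq)
    if n <= 1:
        return list(seq)
    half = n // 2
    # Sort one half increasing and the adjacent half decreasing ...
    first = bitonic_sort(seq[:half], True)
    second = bitonic_sort(seq[half:], False)
    # ... so that their concatenation is bitonic, then merge it.
    return bitonic_merge(first + second, ascending)


print(bitonic_sort([3, 7, 4, 8, 6, 2, 1, 5]))
```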
Bitonic Sorting Circuit: BS(18) • Increasing BM[n]: increasing bitonic merge of size n • Decreasing BM[n]: decreasing bitonic merge of size n
Sorting an arbitrarily ordered sequence • Hence: • n unsorted numbers • n/2 groups of 2-number bitonic sequences • n/4 groups of 4-number bitonic sequences • … • 1 group of an n-number bitonic sequence • a sorted sequence
⑤Complexity of Bitonic Sort • Parallel bitonic sort with n processors • The last stage of an n-element bitonic sort merges n elements and has a depth of log(n) • The earlier stages perform complete sorts of n/2 elements • Depth: d(n) = d(n/2) + log(n), with d(1) = 0 • d(n) = 1 + 2 + 3 + … + log(n) = log(n)(log(n) + 1)/2 = Θ(log² n) • Complexity: T(n) = Θ(log² n)
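The depth recurrence can be evaluated directly and compared with the closed form log(n)(log(n) + 1)/2. The sketch below assumes n is a power of two and that d(1) = 0; the function names `depth` and `closed_form` are illustrative.

```python
def depth(n):
    """Depth of a bitonic sorting network: d(n) = d(n/2) + log2(n), d(1) = 0."""
    if n <= 1:
        return 0
    log_n = n.bit_length() - 1  # exact log2 for powers of two
    return depth(n // 2) + log_n


def closed_form(n):
    """1 + 2 + ... + log2(n) = log2(n) * (log2(n) + 1) / 2."""
    k = n.bit_length() - 1
    return k * (k + 1) // 2


for n in (2, 4, 8, 16, 1024):
    print(n, depth(n), closed_form(n))
```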
⑤Complexity of Bitonic Sort • Parallel sorting with a block of elements per processor • Sort the local block of elements first (using any sorting algorithm, such as quicksort or bitonic sort) • Sort the elements among processors using parallel bitonic sort • T(n) = T(local_sort) + T(comparisons) + T(communication) • Only computation time is considered here (in practice the communication time must also be taken into account)
⑥Concluding Remarks • Bitonic Sorting: Common Sense • Regression to Computer Science • One of 10 Most Important Papers • Parallel Algorithm: Ascend/Descend • Another example: Prefix sum • Network Model:
Bitonic Sorting Network • Hypercube connections! • Try writing the bitonic sorting algorithm on a hypercube.
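One way to see the hypercube connection is the iterative form of bitonic sort, in which element i is only ever compared with element i XOR j for a power of two j, that is, with a neighbour along one hypercube dimension. The following is a sequential simulation sketch, assuming one element per node and a power-of-two number of nodes; the name `bitonic_sort_hypercube` is illustrative, and the inner loop over `i` stands in for a step that all nodes would perform in parallel.

```python
def bitonic_sort_hypercube(values):
    """Simulate bitonic sort with one element per hypercube node.

    Assumes len(values) is a power of two. Node i exchanges only with
    node i ^ j, which differs from i in exactly one bit.
    """
    a = list(values)
    n = len(a)
    k = 2  # size of the bitonic sequences being merged in this stage
    while k <= n:
        j = k // 2  # hypercube dimension used in this step
        while j >= 1:
            for i in range(n):  # conceptually: all nodes in parallel
                partner = i ^ j
                if partner > i:
                    # Blocks of size k alternate between increasing and decreasing.
                    ascending = (i & k) == 0
                    if (a[i] > a[partner]) == ascending:
                        a[i], a[partner] = a[partner], a[i]
            j //= 2
        k *= 2
    return a


print(bitonic_sort_hypercube([3, 7, 4, 8, 6, 2, 1, 5]))
```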
PRAM Model • Processors P1, P2, P3, …, Pn share one memory • Access time from any processor to any memory unit is equal • This is impossible in practice • So it is an idealized model for parallel computing
PRAM Model • Program for Sum = a(1) + a(2) + … + a(N)
for i = 1 to log N
  for j = 1 to N/2^i parallel do
    a(j) = a(j) + a(N/2^i + j)
  endpar
  endfor
endfor
• Finally a(1) is the sum
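The PRAM program can be simulated sequentially. This is a minimal sketch, assuming N is a power of two and using 0-based indexing instead of the 1-based indexing of the slide; the name `pram_sum` is illustrative. The inner loop represents one parallel step: its additions read only the upper half of the active range and write only the lower half, so they are independent of each other.

```python
def pram_sum(values):
    """Simulate the log N step parallel sum; assumes len(values) is a power of two."""
    a = list(values)
    half = len(a) // 2  # corresponds to N / 2^i on the slide
    while half >= 1:
        # One parallel step: processor j adds a[j + half] into a[j].
        for j in range(half):
            a[j] += a[j + half]
        half //= 2
    return a[0]  # corresponds to a(1) on the slide


print(pram_sum([1, 2, 3, 4, 5, 6, 7, 8]))
```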
Hypercube Model • Suppose node N(i) holds element a(i), where i is the value of the node index x1x2…xn
for i = 1 to n
  for j = i to n parallel do
    N(00…0 (xj=0) xj+1…xn) ← N(00…0 (xj=1) xj+1…xn);
    a(00…0 (xj=0) xj+1…xn) = a(00…0 (xj=0) xj+1…xn) + a(00…0 (xj=1) xj+1…xn)
  endpar
  endfor
endfor
• Finally node 00…0 holds the sum
Hypercube Model • Suppose node 000 holds element a(0) and node 111 holds element a(7) • Initially: a(0), a(1), …, a(7) • After step 1: a(0)+a(4), a(1)+a(5), a(2)+a(6), a(3)+a(7) • After step 2: a(0)+a(4)+a(2)+a(6), a(1)+a(5)+a(3)+a(7) • After step 3: a(0)+a(4)+a(2)+a(6)+a(1)+a(5)+a(3)+a(7)
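The three steps of this example can be traced with a small simulation. The sketch below assumes a power-of-two number of nodes and processes the most significant address bit first, matching the a(0)+a(4) pairing above; the name `hypercube_sum_trace` is illustrative. Using powers of two as the input values makes it easy to read off which elements each partial sum contains.

```python
def hypercube_sum_trace(values):
    """Simulate the hypercube sum and record the partial sums after each step.

    Assumes len(values) is a power of two. In each step, every active node
    receives a value from its neighbour across one dimension and adds it.
    """
    a = list(values)
    bit = len(a) // 2  # most significant address bit first
    trace = []
    while bit >= 1:
        # Active nodes have this bit and all higher bits equal to 0.
        for node in range(bit):
            a[node] += a[node ^ bit]  # neighbour differs only in `bit`
        trace.append(a[:bit])
        bit //= 2
    return trace


# a(i) = 2^i, so each partial sum encodes the set of elements it contains.
for step in hypercube_sum_trace([1, 2, 4, 8, 16, 32, 64, 128]):
    print(step)
```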