# Faster Sorting

##### Presentation Transcript

1. Faster Sorting Recursion Merge sort, Quick sort, Heap Sort

2. Outline • Recursive sorting methods • Merge sort • solving recurrence relations • Quick sort • Non-recursive sorting method with splitting • Heap sort

3. Recursive Sorting Methods • Merge sort • split list in half (as close as possible) • sort each half separately • merge results • Quick sort • small items to front • larger items to rear • sort each part (not “half”) separately

4. Merge Sort • Based on combining pre-sorted lists • compare first elements of both lists • move smaller to result & repeat • when one list is used up, just copy the rest of the other • Use recursion to get sorted sub-lists • when down to length 1, automatically sorted • Requires an auxiliary array

5. Merge Sort • Sort left half • Sort right half • sort each into the auxiliary array • Merge two halves into auxiliary array • copy result back to original • Example list: 94 81 11 32 12 24 17 29

7. Merge Sort (example) • list: 94 81 11 32 12 24 17 29 • sort 94 81 into the auxiliary array: 81 94

8. Merge Sort (example) • list after copy back: 81 94 11 32 12 24 17 29 • sort 11 32 into the auxiliary array: 11 32

9. Merge Sort (example) • list: 81 94 11 32 12 24 17 29 • merge 81 94 and 11 32 into the auxiliary array: 11 32 81 94

10. Merge Sort (example) • list after copy back: 11 32 81 94 12 24 17 29 • left half is now sorted

11. Merge Sort (example) • list: 11 32 81 94 12 24 17 29 • sort 12 24 into the auxiliary array: 12 24

12. Merge Sort (example) • list: 11 32 81 94 12 24 17 29 • sort 17 29 into the auxiliary array: 17 29

13. Merge Sort (example) • merge 12 24 and 17 29 into the auxiliary array: 12 17 24 29 • list after copy back: 11 32 81 94 12 17 24 29

14. Merge Sort (example) • merge 11 32 81 94 and 12 17 24 29 into the auxiliary array: 11 12 17 24 29 32 81 94 • list after copy back: 11 12 17 24 29 32 81 94

15. Merge Sort
    to MergeSort(List a)
        temp ← new List(Length(a));
        MergeSort(a, temp, 1, Length(a));
    to MergeSort(List a, List b, int lo, int hi)
        if (lo < hi)
            mid ← lo + (hi – lo) ÷ 2;
            MergeSort(a, b, lo, mid);
            MergeSort(a, b, mid+1, hi);
            Merge(a, b, lo, mid+1, hi);

16. Defensive Programming • Consider finding the midpoint mid = lo + (hi – lo) / 2; • could just do mid = (hi + lo) / 2; • BUT what if the array is HUGE • hi == 2,000,000,000; lo == 1,000,000,000 • (hi + lo) / 2 == -647,483,648 (int overflow) • lo + (hi – lo) / 2 == 1,500,000,000 (no overflow) • hi and lo both positive, so no underflow worry
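
The overflow on this slide can be reproduced with a small sketch. Python integers never overflow, so the helper `to_int32` below (a hypothetical name, not from the slides) wraps values to a signed 32-bit range to imitate a fixed-width `int`; `naive_mid` and `safe_mid` are likewise illustrative names for the two formulas.

```python
def to_int32(x: int) -> int:
    """Wrap x to a signed 32-bit value (two's complement), imitating int overflow."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x >= 0x80000000 else x


def div2_trunc(x: int) -> int:
    """Divide by 2, truncating toward zero as fixed-width integer division does."""
    q = abs(x) // 2
    return -q if x < 0 else q


def naive_mid(lo: int, hi: int) -> int:
    # (hi + lo) / 2: the sum is wrapped to 32 bits before the division
    return div2_trunc(to_int32(hi + lo))


def safe_mid(lo: int, hi: int) -> int:
    # lo + (hi - lo) / 2: the difference always fits when 0 <= lo <= hi
    return to_int32(lo + div2_trunc(to_int32(hi - lo)))


lo, hi = 1_000_000_000, 2_000_000_000
print(naive_mid(lo, hi))  # -647483648
print(safe_mid(lo, hi))   # 1500000000
```

The two printed values match the figures on the slide.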

17. Merging Sorted Sub-lists • Each half sorted • left start • right start • right end • left end (right start – 1) • Take smaller front item until one side ends • Take rest of items • Copy back • (Figure: the sorted halves 11 32 81 94 and 12 17 24 29 are merged into the auxiliary array as 11 12 17 24 29 32 81 94, then copied back.)

18. Merging Sorted Sub-lists
    to Merge(List a, List b, int ls, int rs, int re)
        le ← rs – 1, p ← ls, initialLS ← ls;
        while (ls ≤ le && rs ≤ re)
            b[p++] ← (a[ls] < a[rs]) ? a[ls++] : a[rs++];
        while (ls ≤ le) b[p++] ← a[ls++];
        while (rs ≤ re) b[p++] ← a[rs++];
        for i ← initialLS .. re
            a[i] ← b[i];
    ls = “left start”, rs = “right start”, re = “right end”
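
The MergeSort and Merge pseudocode can be sketched in Python roughly as follows. This is one possible transcription, not the presenter's own code: it assumes 0-based indexing (the slides count from 1), and the names `merge_sort`, `_merge_sort` and `merge` are chosen here for illustration.

```python
def merge(a, b, ls, rs, re):
    """Merge sorted runs a[ls..rs-1] and a[rs..re] (inclusive) using buffer b."""
    le = rs - 1          # left end
    p = ls               # next free slot in the buffer
    initial_ls = ls
    # Take the smaller front item until one side runs out.
    while ls <= le and rs <= re:
        if a[ls] < a[rs]:
            b[p] = a[ls]
            ls += 1
        else:
            b[p] = a[rs]
            rs += 1
        p += 1
    # Copy whatever is left on either side.
    while ls <= le:
        b[p] = a[ls]
        ls += 1
        p += 1
    while rs <= re:
        b[p] = a[rs]
        rs += 1
        p += 1
    # Copy the merged run back into the original list.
    for i in range(initial_ls, re + 1):
        a[i] = b[i]


def _merge_sort(a, b, lo, hi):
    if lo < hi:
        mid = lo + (hi - lo) // 2
        _merge_sort(a, b, lo, mid)
        _merge_sort(a, b, mid + 1, hi)
        merge(a, b, lo, mid + 1, hi)


def merge_sort(a):
    """Sort list a in place, allocating one auxiliary list of the same length."""
    temp = [None] * len(a)
    _merge_sort(a, temp, 0, len(a) - 1)


data = [94, 81, 11, 32, 12, 24, 17, 29]
merge_sort(data)
print(data)  # [11, 12, 17, 24, 29, 32, 81, 94]
```

The auxiliary list is allocated once at the top level and shared by every recursive call, which is why the recursive routine takes both lists as parameters.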

19. Exercise • Show the evolution of the following Lists under merge sort: • lst1 ← [15, 3, 21, 45, 7, 17, 4, 12]; • lst2 ← [13, 11, 20, 15, 16, 6, 9, 87];

20. Complexity of Merge Sort • Assume length is a power of 2 • makes the math easier • Time to merge sort $2^k$ items: • time to merge sort $2^{k-1}$ items • time to merge sort $2^{k-1}$ (other) items • time to merge them • $T_{\text{MergeSort}}(2^k) = 2\,T_{\text{MergeSort}}(2^{k-1}) + T_{\text{Merge}}(2^k)$ • “recurrence” relation

21. Complexity of Merge Sort • Worst case time to merge N items is O(N) • at most N comparisons, at most N assignments • for the O calculation, just take it to be N • Base case: 1 item takes 0 time to sort • T(1) = 0 • T(2) = 2T(1) + 2 = 2(0) + 2 = 2 • T(4) = 2T(2) + 4 = 2(2) + 4 = 8 • T(8) = 2T(4) + 8 = 2(8) + 8 = 24

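22. Complexity of Merge Sort • Solving by “inspection” – try $N^2$

| $N$ | $T_{\text{MergeSort}}(N)$ | $N^2$ | $T(N) / N^2$ |
|---|---|---|---|
| 1 | 0 | 1 | 0 |
| 2 | 2 | 4 | 1/2 |
| 4 | 8 | 16 | 1/2 |
| 8 | 24 | 64 | 3/8 |
| 16 | 64 | 256 | 1/4 |

• Ratio shrinking – probably better than $N^2$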

23. Complexity of Merge Sort • Solving by “inspection” – factor out an N

| $N$ | $T_{\text{MergeSort}}(N)$ | $T_{\text{MergeSort}}(N) / N$ |
|---|---|---|
| 1 | 0 | 0 |
| 2 | 2 | 1 |
| 4 | 8 | 2 |
| 8 | 24 | 3 |
| 16 | 64 | 4 |

• Hmmm – what’s that look like?

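24. Complexity of Merge Sort • Solving by “inspection” – factor out an N

| $N$ | $T_{\text{MergeSort}}(N)$ | $T_{\text{MergeSort}}(N) / N$ | $\log N$ |
|---|---|---|---|
| 1 | 0 | 0 | 0 |
| 2 | 2 | 1 | 1 |
| 4 | 8 | 2 | 2 |
| 8 | 24 | 3 | 3 |
| 16 | 64 | 4 | 4 |

• $T_{\text{MergeSort}}(N) = O(N \log N)$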

25. Reality Check • Next couple of values • T(32) = 2T(16) + 32 = 2(64) + 32 = 160 • 32 log 32 = 32(5) = 160 • T(64) = 2T(32) + 64 = 2(160) + 64 = 384 • 64 log 64 = 64(6) = 384 • looking good… • …or you could try to prove it • by induction
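
Short of a proof by induction, the reality check can be pushed further by evaluating the recurrence directly. The sketch below assumes the simplified cost model from the slides (T(1) = 0, merging N items costs exactly N) and uses a hypothetical function name `t_merge_sort`.

```python
def t_merge_sort(n: int) -> int:
    """T(1) = 0, T(n) = 2*T(n/2) + n, for n a power of 2."""
    if n == 1:
        return 0
    return 2 * t_merge_sort(n // 2) + n


# Compare T(2^k) with 2^k * k, i.e. N * log2(N), for the first few powers of 2.
for k in range(0, 11):
    n = 2 ** k
    print(n, t_merge_sort(n), n * k)
    assert t_merge_sort(n) == n * k
```

Every row agrees, which is what the induction would establish for all powers of 2.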

26. Recursive Sorting Methods • Merge sort • split list in half (as close as possible) • sort each half separately • merge results • Quick sort • small items to front • larger items to rear • sort each part (not “half”) separately

27. Quicksort Overview • Get all the small items down near the front • Get all the large items up near the back • Sort each half separately • No need to merge the results! • Example: 81 94 32 11 12 24 17 29 becomes 12 17 24 11 (all ≤ 29), 29 (“pivot”), 32 94 81 (all ≥ 29)

28. Quicksort Overview • List is partitioned into small & large parts: 12 17 24 11 29 32 94 81 • Sort part below pivot: 11 12 17 24 29 32 94 81 • Sort part above pivot: 11 12 17 24 29 32 81 94

29. Quicksort
    to Quicksort(List A, int lo, int hi)
        if (lo < hi) /* more than 1 item left */
            mid ← Partition(A, lo, hi);
            Quicksort(A, lo, mid–1);
            Quicksort(A, mid+1, hi);
    • most of the work done in Partition

30. Partitioning a List • Getting a list into the shape required… • small items all to front, large all to rear • …is called partitioning the list • Pivot might not end up right in the middle • Example: 81 94 32 11 12 24 17 29 becomes 12 17 24 11 (all ≤ 29), 29 (“pivot”), 32 94 81 (all ≥ 29)

31. Partitioning a List • Need to move big items away from the front • and small items away from the back • small ≤ the pivot; big ≥ the pivot • pivot is both small and big • “Match” small items with big ones • find first big item & last small one • if they’re out of order, swap them • Need to return where split point is

32. Partitioning Example • Pick 29 as the pivot • using the “lucky guess” method • Swap pivot to front • temporary location • Before: 81 94 32 11 12 24 17 29 • After: 29 94 32 11 12 24 17 81

33. Partitioning Example • Look for value bigger than pivot • start above pivot, go up • find 94 right away • Look for value smaller than pivot • start at end of array, back down • find 17 • Swap them • Before: 29 94 32 11 12 24 17 81 • After: 29 17 32 11 12 24 94 81

34. Partitioning Example • Repeat • find next “big” • find next “small” • Swap them • Repeat until no more swaps to do • looking for small, reach last big • (maybe was no last big = go past end of array) • Before: 29 17 32 11 12 24 94 81 • After: 29 17 24 11 12 32 94 81

35. Partitioning Example • No more exchanges left to do • next big value (32) in place • Put pivot into place • exchange with last small value • (just in front of the “next big” value, 32) • Return pivot’s (final) position • Before: 29 17 24 11 12 32 94 81 • After: 12 17 24 11 29 32 94 81

36. Partition
    to Partition(List A, int lo, int hi)
        p ← PivotPlace(A, lo, hi);
        A[p] ↔ A[lo]; /* put pivot at front */
        i ← lo + 1, j ← hi;
        while (i ≤ j) /* ≤ rather than <, so a two-item range is still examined */
            while (i ≤ j && A[i] ≤ A[lo]) {i++;}
            while (i ≤ j && A[lo] ≤ A[j]) {j--;}
            if (i < j) {A[i] ↔ A[j];}
        A[lo] ↔ A[j]; /* put pivot in place */
        return j; /* return pivot location */
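
A Python sketch of the partition step, run on the example list from the preceding slides, is given below. Two things here are assumptions rather than part of the slides: the pivot position `p` is passed in as a parameter instead of being chosen by PivotPlace, and the outer loop runs while `i <= j`, so that a range of only two items is still compared before the pivot is swapped into place.

```python
def partition(a, lo, hi, p):
    """Partition a[lo..hi] (inclusive) around the item at index p.

    Returns the pivot's final index; everything to its left is <= pivot
    and everything to its right is >= pivot.
    """
    a[p], a[lo] = a[lo], a[p]          # put pivot at front
    pivot = a[lo]
    i, j = lo + 1, hi
    while i <= j:
        while i <= j and a[i] <= pivot:   # find first "big" item
            i += 1
        while i <= j and pivot <= a[j]:   # find last "small" item
            j -= 1
        if i < j:                          # out of order: swap them
            a[i], a[j] = a[j], a[i]
    a[lo], a[j] = a[j], a[lo]          # put pivot in place
    return j


data = [81, 94, 32, 11, 12, 24, 17, 29]
split = partition(data, 0, len(data) - 1, p=7)   # pivot 29, the "lucky guess"
print(split, data)  # 4 [12, 17, 24, 11, 29, 32, 94, 81]
```

Calling it with `p=0` on the same starting list reproduces the 6 to 1 split of the first-element pivot.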

37. Picking a Pivot • Need to pick some number • Want it to split the list into two equal parts • hard to get exactly right • expensive to get exactly right • Need to go with something cheap to get • first item in list is very cheap • others a little more expensive – but worth it

38. First Element Pivot • Pick 81 as the pivot • using the “first element” method • First big = 94; last small = 29 • out of order – swap them • Done • swap pivot with last small • 6 to 1 split • not good • Start: 81 94 32 11 12 24 17 29 • After swap: 81 29 32 11 12 24 17 94 • Pivot in place: 17 29 32 11 12 24 81 94

39. Median of Three Pivot • Compare 1st, last & mid-point elements • positions 0, 3 and 7 • Pick the median • the one that’s neither biggest nor smallest • the median of 81, 11 and 29 is 29 • Less chance of getting a bad choice • this splits 4 to 3 • Before: 81 94 32 11 12 24 17 29 • After: 12 17 24 11 29 32 94 81

40. PivotPlace (Median of Three) • Small “sorting” problem • three elements • but don’t move the elements! • find small, medium & large • Example list: 81 94 32 11 12 24 17 29 (lo = 81, mid = 11, hi = 29) • Step 1: compare lo to mid • Step 2: compare medium to hi • Step 3: if medium changed, compare medium to small • Step 4: return medium

45. PivotPlace (Median of Three)
    to PivotPlace(List A, int lo, int hi)
        mid ← lo + (hi – lo) ÷ 2;
        if (A[mid] < A[lo]) { sm ← mid; medium ← lo; }
        else { sm ← lo; medium ← mid; }
        if (A[hi] < A[medium]) {
            medium ← hi; /* don’t need to keep track of lg */
            if (A[medium] < A[sm]) { medium ← sm; }
        }
        return medium;
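
Putting PivotPlace, Partition and Quicksort together gives roughly the following Python sketch. It is an illustration under assumptions, not the presenter's code: indexing is 0-based, the second comparison in `pivot_place` is made against the current medium (as the step-by-step slides describe), and the outer partition loop runs while `i <= j` so that two-item ranges are handled.

```python
def pivot_place(a, lo, hi):
    """Return the index of the median of a[lo], a[mid], a[hi] without moving anything."""
    mid = lo + (hi - lo) // 2
    if a[mid] < a[lo]:
        sm, medium = mid, lo
    else:
        sm, medium = lo, mid
    if a[hi] < a[medium]:
        medium = hi                # the old medium is the largest; no need to track it
        if a[medium] < a[sm]:
            medium = sm
    return medium


def partition(a, lo, hi):
    p = pivot_place(a, lo, hi)
    a[p], a[lo] = a[lo], a[p]      # put pivot at front
    pivot = a[lo]
    i, j = lo + 1, hi
    while i <= j:
        while i <= j and a[i] <= pivot:
            i += 1
        while i <= j and pivot <= a[j]:
            j -= 1
        if i < j:
            a[i], a[j] = a[j], a[i]
    a[lo], a[j] = a[j], a[lo]      # put pivot in place
    return j


def quicksort(a, lo=0, hi=None):
    """Sort a[lo..hi] (inclusive) in place; defaults to the whole list."""
    if hi is None:
        hi = len(a) - 1
    if lo < hi:                    # more than 1 item left
        mid = partition(a, lo, hi)
        quicksort(a, lo, mid - 1)
        quicksort(a, mid + 1, hi)


data = [81, 94, 32, 11, 12, 24, 17, 29]
print(pivot_place(data, 0, len(data) - 1))  # 7, the position of 29
quicksort(data)
print(data)  # [11, 12, 17, 24, 29, 32, 81, 94]
```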

46. Exercise • Show the evolution of the following list under quicksort. Show the result after each call to Partition (use median of 3 pivot): • [15, 3, 21, 45, 7]

47. Complexity of Partition • Always takes O(N) comparisons • start i at one end, j at the other • look at each item at most 2x (usually only 1x) • Worst case reverses the list: N/2 swaps • How long to pick the pivot? • can do it in constant time • that’s best

48. Complexity of Quick Sort • Worst case: pivot is biggest (smallest) item • each step reduces list by length 2 • $O(N + (N-2) + (N-4) + \dots + 3 + 1)$ time • $= O(N^2)$ time • Best case: pivot is median item • each step splits list in half • $T(2N+1) = 2T(N) + O(N)$, which gives $O(N \log N)$ time • Saw similar recurrence with merge sort
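
The step from the worst-case sum to $O(N^2)$ can be spelled out. Assuming $N$ is odd, as the sum on the slide implies, write $N = 2m - 1$ (the symbol $m$ is introduced here only for this derivation); the sum is then the sum of the first $m$ odd numbers:

$$
N + (N-2) + (N-4) + \dots + 3 + 1 \;=\; \sum_{i=1}^{m} (2i - 1) \;=\; m^2 \;=\; \left(\frac{N+1}{2}\right)^{2} \;=\; O(N^2).
$$

For example, $N = 7$ gives $7 + 5 + 3 + 1 = 16 = 4^2$.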

49. Average Case Complexity • Assume each possible split is equally likely • N = 7: 1-5, or 2-4, or 3-3, or 4-2, or 5-1 • $T(N) = O(N) + 2 \cdot \text{average}(T(1) \text{ to } T(N-2))$ • (Figure: seven-item example lists illustrating the possible splits.)

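50. Calculated T(N)s

| N | Formula | T(N) | Average So Far |
|---|---|---|---|
| 1 | 1 | 1 | 1 |
| 2 | 2 | 2 | 1.5 |
| 3 | 3 + 2*(1) | 5 | 2.67 |
| 4 | 4 + 2*(1.5) | 7 | 3.75 |
| 5 | 5 + 2*(2.67) | 10.33 | 5.07 |
| 6 | 6 + 2*(3.75) | 13.5 | 6.47 |
| 7 | 7 + 2*(5.07) | 17.13 | 8.00 |
| 8 | 8 + 2*(6.47) | 20.94 | 9.61 |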
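
The table can be regenerated with a few lines of Python. The sketch assumes what the table itself implies: the O(N) term is taken to be exactly N, T(1) = 1 and T(2) = 2 are the base cases, and "average so far" is the mean of T(1) through T(N). The function name `average_case_table` is made up for this example.

```python
def average_case_table(max_n):
    """Return rows (N, T(N), average of T(1..N)) for T(N) = N + 2 * avg(T(1..N-2))."""
    t = {1: 1.0, 2: 2.0}      # base cases taken from the table
    avg = {}                  # avg[n] = mean of T(1) .. T(n)
    total = 0.0
    rows = []
    for n in range(1, max_n + 1):
        if n > 2:
            t[n] = n + 2 * avg[n - 2]
        total += t[n]
        avg[n] = total / n
        rows.append((n, t[n], avg[n]))
    return rows


for n, t_n, avg_n in average_case_table(8):
    print(f"{n:2d}  {t_n:6.2f}  {avg_n:5.2f}")
```

Raising `max_n` extends the table beyond the eight rows shown on the slide.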