Presentation Transcript
Description Given a linear collection of items x1, x2, x3, ..., xn, arrange them so that they (or some key field in them) are in ascending order, x1 <= x2 <= x3 <= ... <= xn, or in descending order, x1 >= x2 >= x3 >= ... >= xn
Some Issues • internal vs. external sorting • array vs. linked list data structure • N^2 vs. N log2 N performance • worst case vs. average case big-O • additional memory requirements • stable vs. unstable sorts
N^2 sorting strategy • start with unsorted array of length N • do a series of "passes" through the array (outer loop) • during each pass do comparisons and maybe exchanges of items (inner loop) • result of a pass is to reduce size of the unsorted part of the array by 1 and increase size of the sorted part of the array by 1 • number of passes is N-1 and work/pass is O(N), so big-O is O(N^2)
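The O(N^2) bound can be checked by counting comparisons. As a sketch, assume pass k has to examine the N-k items still in the unsorted part (the exact count per pass varies by algorithm):

```latex
\sum_{k=1}^{N-1} (N-k) \;=\; \sum_{j=1}^{N-1} j \;=\; \frac{N(N-1)}{2} \;=\; \frac{N^2-N}{2} \;=\; O(N^2)
```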
N^2 sorting algorithms • Selection Sort: each pass selects the largest/smallest remaining element and moves it to its correct location • Insertion Sort: each pass inserts 1 element into an already sorted array • Exchange (Bubble) Sort: each pass does a series of exchanges so that 1 element moves to where it belongs
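Selection sort and bubble sort, as described above, might look roughly like this. This is a minimal sketch, not code from the slides; the function names and the use of std::vector<int> are assumptions made for illustration.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Selection sort: each pass finds the smallest remaining element
// and swaps it into the first unsorted position.
void selectionSort(std::vector<int>& a) {
    for (std::size_t i = 0; i + 1 < a.size(); ++i) {
        std::size_t smallest = i;
        for (std::size_t j = i + 1; j < a.size(); ++j)
            if (a[j] < a[smallest]) smallest = j;
        if (smallest != i) std::swap(a[i], a[smallest]);
    }
}

// Bubble sort: each pass exchanges adjacent out-of-order pairs, so the
// largest unsorted element ends up at the end of the unsorted part.
// The loop stops early when a pass makes no exchange (the O(N) best case).
void bubbleSort(std::vector<int>& a) {
    for (std::size_t unsorted = a.size(); unsorted > 1; --unsorted) {
        bool exchanged = false;
        for (std::size_t j = 0; j + 1 < unsorted; ++j) {
            if (a[j] > a[j + 1]) {
                std::swap(a[j], a[j + 1]);
                exchanged = true;
            }
        }
        if (!exchanged) break;
    }
}
```

Selection sort does at most one exchange per pass, which is why its exchange count is only N-1 even though it still makes (N^2-N)/2 comparisons.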
N^2 sorts compared
Algorithm            #comp        #exchanges   Big O
selection            (N^2-N)/2    N-1          N^2
bubble     worst     (N^2-N)/2    (N^2-N)/2    N^2
           average   (N^2-N)/2    (N^2-N)/4    N^2
           best      N-1          0            N
insertion  worst     (N^2-N)/2    (N^2-N)/2    N^2
           average   (N^2-N)/4    (N^2-N)/4    N^2
           best      N-1          N-1          N
insertion sort
start order:  67 33 25 21 94 49
after pass 1: 33 67 25 21 94 49
after pass 2: 25 33 67 21 94 49
after pass 3: 21 25 33 67 94 49
after pass 4: 21 25 33 67 94 49
after pass 5: 21 25 33 49 67 94
How many passes are needed to sort N items? Work (comparisons/exchanges) per pass?
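The passes traced above can be sketched in code as follows. This is an illustrative version, assuming a std::vector<int> and a hypothetical function name; pass i inserts a[i] into the sorted prefix to its left, so there are N-1 passes and pass i does at most i comparisons.

```cpp
#include <cstddef>
#include <vector>

// Insertion sort: pass i inserts a[i] into the already sorted prefix a[0..i-1].
void insertionSort(std::vector<int>& a) {
    for (std::size_t i = 1; i < a.size(); ++i) {
        int item = a[i];
        std::size_t j = i;
        // Shift larger elements one slot to the right to open a gap for item.
        while (j > 0 && a[j - 1] > item) {
            a[j] = a[j - 1];
            --j;
        }
        a[j] = item;
    }
}
```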
how to sort faster? • N^2 algorithms do N-1 passes and O(N) work/pass --> O(N^2) • reduce passes to log2 N: mergesort, quicksort • reduce work/pass to log2 N: heapsort • don't do any comparisons at all: radix sort
QuickSort • fastest general purpose sort • usually written recursively • many variations • usually is O(N log2 N) • performance is data-dependent • worst case is O(N^2)
Quicksort • Choose some element called a pivot • Perform a sequence of exchanges so that • All elements that are less than this pivot are to its left and • All elements that are greater than the pivot are to its right. • Divides the (sub)list into two smaller sub lists, • Each of which may then be sorted independently in the same way.
Quicksort
If the list has 0 or 1 elements, return. // the list is sorted
Else do:
  Pick an element (near middle) as the pivot.
  Split remaining elements into two disjoint groups:
    SmallerThanPivot = {all elements < pivot}
    LargerThanPivot = {all elements >= pivot}
  Return the list rearranged as: Quicksort(SmallerThanPivot), pivot, Quicksort(LargerThanPivot).
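QuickSort
void QuickSort (ElementType x[ ], int first, int last)
{
  if (first < last)
  {
    // Split places the pivot in its final position and returns that index
    int pivot = Split (x, first, last);
    QuickSort (x, first, pivot - 1);   // sort the elements left of the pivot
    QuickSort (x, pivot + 1, last);    // sort the elements right of the pivot
  }
}
Initial call: QuickSort (A, 0, N-1);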
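The slides call Split but never show it. One possible version is sketched below; it is an assumption, not the presenter's code. It takes the middle element as the pivot and uses a single left-to-right sweep (a Lomuto-style partition), and ElementType is assumed to be int.

```cpp
#include <utility>

typedef int ElementType;  // assumption: the slides never define ElementType

// One possible Split: partitions x[first..last] around the middle element
// and returns the pivot's final index.
int Split(ElementType x[], int first, int last) {
    int mid = first + (last - first) / 2;
    std::swap(x[mid], x[last]);          // park the pivot at the end
    ElementType pivot = x[last];
    int boundary = first;                // x[first..boundary-1] holds elements < pivot
    for (int i = first; i < last; ++i) {
        if (x[i] < pivot) {
            std::swap(x[i], x[boundary]);
            ++boundary;
        }
    }
    std::swap(x[boundary], x[last]);     // move the pivot to its final position
    return boundary;
}
```

Elements equal to the pivot end up on its right, so duplicate keys are handled.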
QuickSort strategy
initial order: 45 23 13 68 54 17 70 24
after pass 1:  17 23 13 24 45 54 70 68
after pass 2:  13 17 23 24 45 54 70 68
after pass 3:  13 17 23 24 45 54 68 70
How many passes? Work/pass?
Quicksort Performance • O(n log2 n) is the average case computing time • achieved if the pivot results in sublists of approximately the same size • O(n^2) worst case • e.g. list already ordered or in reverse order, when the first or last element is chosen as the pivot • in general, whenever Split() repeatedly results in one empty sublist
Assume Even Partitioning
level 0: N
level 1: N/2 N/2
level 2: N/4 N/4 N/4 N/4
big-O is O(N log2 N)
Worst Case: partition sizes shrink by only 1 per level: N, N-1, N-2, N-3, ... so big-O is O(N^2)
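Both bounds can be written as recurrences. The sketch below assumes that partitioning a sublist of size N costs cN for some constant c, and that T(1) is constant; neither symbol appears in the slides.

```latex
\text{Even partitioning:}\quad T(N) = 2\,T(N/2) + cN
  \;\Rightarrow\; \log_2 N \text{ levels} \times cN \text{ work per level}
  \;\Rightarrow\; T(N) = O(N \log_2 N)

\text{Worst case:}\quad T(N) = T(N-1) + cN
  \;\Rightarrow\; T(N) = c\,(N + (N-1) + \dots + 2) + T(1)
  \;\Rightarrow\; T(N) = O(N^2)
```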
Quicksort Improvements • pivot selection methods • first element • middle element • median of 3 • random choice • insertion sort for partitions of size < 20 • leave small partitions unsorted and do one insertion sort of whole array at the end • sort smaller partition first • STL sort algorithm uses improved Quicksort
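Median-of-3 pivot selection can be sketched as below. The function name is hypothetical and int elements are assumed; it returns the index of the median of the first, middle, and last elements, which a Split routine could then use as its pivot.

```cpp
// Returns the index (first, middle, or last) that holds the median of the
// three values x[first], x[mid], x[last].
int medianOfThree(const int x[], int first, int last) {
    int mid = first + (last - first) / 2;
    int a = x[first], b = x[mid], c = x[last];
    if ((a <= b && b <= c) || (c <= b && b <= a)) return mid;
    if ((b <= a && a <= c) || (c <= a && a <= b)) return first;
    return last;
}
```

On an already ordered or reversed list this picks the middle element, which avoids the worst case that a first-element pivot would hit on such input.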
MergeSort • based on algorithm to merge 2 sorted lists into 1 sorted list • merging can be done in O(N) time • N is number of items in the 2 sorted lists • usually written recursively • "work" is done during the unwinding • requires O(N) temporary memory • what do other sorting algorithms need? • average and worst case big O is N log2 N • used for sorting large external files
Merge Algorithm
1. Open File1 and File2 for input, File3 for output
2. Read first element x from File1 and first element y from File2
3. While neither eof File1 nor eof File2
     If x < y then
       a. Write x to File3
       b. Read a new x value from File1
     Otherwise
       a. Write y to File3
       b. Read a new y value from File2
   End while
4. If eof File1 encountered, copy rest of File2 into File3. If eof File2 encountered, copy rest of File1 into File3
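The same merge can be sketched in memory, with two sorted vectors standing in for File1 and File2. This is an illustrative analogue of the file-based steps, not the slides' code; the function name and the int element type are assumptions.

```cpp
#include <cstddef>
#include <vector>

// Merges two sorted sequences into one sorted sequence in O(N) time,
// where N is the total number of items.
std::vector<int> mergeSorted(const std::vector<int>& file1,
                             const std::vector<int>& file2) {
    std::vector<int> file3;
    file3.reserve(file1.size() + file2.size());
    std::size_t i = 0, j = 0;
    // Step 3: repeatedly write the smaller of the two current items.
    while (i < file1.size() && j < file2.size()) {
        if (file1[i] < file2[j]) file3.push_back(file1[i++]);
        else                     file3.push_back(file2[j++]);
    }
    // Step 4: one input is exhausted; copy the rest of the other.
    while (i < file1.size()) file3.push_back(file1[i++]);
    while (j < file2.size()) file3.push_back(file2[j++]);
    return file3;
}
```

As in the algorithm above, ties are taken from the second input first; changing the test to <= would take them from the first input instead, which is what a stable merge needs.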
MergeSort strategy
initial order: 45 23 13 68 54 17 70 24
splitting
pass 1: 45 23 13 68 | 54 17 70 24
pass 2: 45 23 | 13 68 | 54 17 | 70 24
pass 3: 45 | 23 | 13 | 68 | 54 | 17 | 70 | 24
merging (during the unwinding)
23 45 | 13 68 | 17 54 | 24 70
13 23 45 68 | 17 24 54 70
13 17 23 24 45 54 68 70
How many passes? Work/pass?
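A recursive mergesort matching this strategy might look like the sketch below. The function names and half-open index ranges are assumptions made for illustration; the temporary buffer in mergeHalves is the O(N) extra memory the slides mention, and the merging happens as the recursion unwinds.

```cpp
#include <cstddef>
#include <vector>

// Merges the sorted halves a[lo..mid) and a[mid..hi) through a temporary buffer.
void mergeHalves(std::vector<int>& a, std::size_t lo, std::size_t mid, std::size_t hi) {
    std::vector<int> temp;
    temp.reserve(hi - lo);
    std::size_t i = lo, j = mid;
    while (i < mid && j < hi) {
        if (a[j] < a[i]) temp.push_back(a[j++]);
        else             temp.push_back(a[i++]);   // ties come from the left half (stable)
    }
    while (i < mid) temp.push_back(a[i++]);
    while (j < hi)  temp.push_back(a[j++]);
    for (std::size_t k = 0; k < temp.size(); ++k) a[lo + k] = temp[k];
}

// Sorts a[lo..hi): split in half, sort each half recursively, then merge.
void mergeSort(std::vector<int>& a, std::size_t lo, std::size_t hi) {
    if (hi - lo < 2) return;                 // 0 or 1 items: already sorted
    std::size_t mid = lo + (hi - lo) / 2;
    mergeSort(a, lo, mid);
    mergeSort(a, mid, hi);
    mergeHalves(a, lo, mid, hi);
}
```

A whole vector v would be sorted with mergeSort(v, 0, v.size()).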
Shellsort (D.L.Shell)
void shellsort (int a[ ], int n)
{
  int i, j, k, h, v;
  // gap sequence, largest first; the final gap of 1 is an ordinary insertion sort
  int cols[ ] = {1391376, 463792, 198768, 86961, 33936, 13776, 4592, 1968, 861, 336, 112, 48, 21, 7, 3, 1};
  for (k=0; k<16; k++)
  {
    h=cols[k];
    // insertion sort over the elements that are h positions apart
    for (i=h; i<n; i++)
    {
      v=a[i];
      j=i;
      while (j>=h && a[j-h]>v)
      {
        a[j]=a[j-h];
        j=j-h;
      }
      a[j]=v;
    }
  }
}
Implementing a Heap -1 • Note the placement of the nodes in the array • The lower value key always has a parent node with a higher-value key. This is NOT a heap (yet)
Implementing a Heap -2 • In an array implementation with the root at myArray[1], the children of the i-th node are at myArray[2*i] and myArray[2*i+1] • The parent of the i-th node is at myArray[i/2]
heapSort • 2 phases – done sequentially • reorganize the array into a heap • takes O(N log2 N) • do N-1 passes • each pass moves the largest/smallest item to where it belongs • selection of largest item takes O(log2 N) work • takes O(N log2 N)
phase 1 of heapSort • work node by node, down the tree • leaf nodes are already heaps • for every non-leaf node, move it down a path of children until heap property is satisfied • possible because each subtree will already be a heap • step one can be done in N log2 N time
making a heap
A [6 3 5 9 2 10] (not a heap)
A [6 3 10 9 2 5]
A [6 9 10 3 2 5]
A [10 9 6 3 2 5] Now it is a heap
Heapsort Algorithm
1. Consider x as a complete binary tree, use heapify to convert this tree to a heap
2. for i = n down to 2:
   a. Interchange x[1] and x[i] (puts largest element at end)
   b. Apply percolate_down to convert binary tree corresponding to sublist in x[1] .. x[i-1]
Heapsort • Algorithm for converting a complete binary tree to a heap, called "heapify"
For r = n/2 down to 1:
  Apply percolate_down to the subtree in myArray[r], ..., myArray[n]
End for
• Puts largest element at root
Heapsort • Now swap element 1 (root of tree) with last element • This puts largest element in correct location • Use percolate down on remaining sublist • Converts from semi-heap to heap
Heapsort • Again swap root with rightmost leaf • Continue this process with shrinking sublist
Percolate Down Algorithm
Initialize: c = 2 * r   // r = root of subtree, c = its left child
While r <= n/2 do the following
  If c < n and myArray[c] < myArray[c + 1]
    Increment c by 1   // c is now the larger child
  If myArray[r] < myArray[c]
    Swap myArray[r] and myArray[c]
    r = c
    c = 2 * c
  Else
    Terminate the loop
End while
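Percolate down, heapify, and the heapsort loop fit together as in the sketch below. It keeps the slides' 1-based indexing by leaving element 0 of the vector unused; the function names and the use of std::vector<int> are assumptions.

```cpp
#include <utility>
#include <vector>

// Moves x[r] down within x[1..n] until the subtree rooted at r is a heap.
void percolateDown(std::vector<int>& x, int r, int n) {
    int c = 2 * r;                              // left child of r
    while (c <= n) {                            // same as r <= n/2
        if (c < n && x[c] < x[c + 1]) ++c;      // pick the larger child
        if (x[r] < x[c]) {
            std::swap(x[r], x[c]);
            r = c;
            c = 2 * r;
        } else {
            break;                              // heap property holds
        }
    }
}

// Converts x[1..n] into a max-heap, working from the last non-leaf up to the root.
void heapify(std::vector<int>& x, int n) {
    for (int r = n / 2; r >= 1; --r) percolateDown(x, r, n);
}

// Sorts x[1..n] in ascending order; x[0] is unused.
void heapSort(std::vector<int>& x, int n) {
    heapify(x, n);
    for (int i = n; i >= 2; --i) {
        std::swap(x[1], x[i]);                  // largest element goes to the end
        percolateDown(x, 1, i - 1);             // restore the heap on x[1..i-1]
    }
}
```

Running heapify on 6 3 5 9 2 10 (stored at indices 1 to 6) gives 10 9 6 3 2 5, the heap from the "making a heap" slide.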
Priority Queue • Implementation possibilities • As a list (array, vector, linked list) • As an ordered list • Best is to use a heap: basic operations have O(log2 n) time • STL priority queue adapter uses heap • Note operations in table of Fig. 13.2 in text, page 766
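A short usage sketch of the STL adapter follows. std::priority_queue is a max-heap by default; the helper function name is hypothetical and exists only to make the example self-contained.

```cpp
#include <queue>
#include <vector>

// Pushes every item into a std::priority_queue and returns them in the order
// they are removed, largest first.
std::vector<int> drainByPriority(const std::vector<int>& items) {
    std::priority_queue<int> pq;
    for (int item : items) pq.push(item);   // O(log n) per insertion
    std::vector<int> order;
    while (!pq.empty()) {
        order.push_back(pq.top());          // largest remaining item
        pq.pop();                           // O(log n) per removal
    }
    return order;
}
```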
Radix Sort • Based on examining digits in some base-b numeric representation of items (or keys) • Least significant digit radix sort • Processes digits from right to left • Used in early punched-card sorting machines • Create groupings of items with same value in specified digit • Collect in order and create grouping with next significant digit
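A least-significant-digit radix sort can be sketched as follows. This version assumes base 10 and non-negative int keys, and the function name is hypothetical; each pass groups the items by one digit and collects the groups in order, so no two keys are ever compared.

```cpp
#include <vector>

// LSD radix sort for non-negative integers, base 10.
void radixSort(std::vector<int>& a) {
    int largest = 0;
    for (int v : a) if (v > largest) largest = v;
    // One pass per digit, from the rightmost digit to the leftmost.
    for (long long place = 1; largest / place > 0; place *= 10) {
        std::vector<std::vector<int>> groups(10);
        for (int v : a) groups[(v / place) % 10].push_back(v);   // group by current digit
        a.clear();
        for (const auto& g : groups) a.insert(a.end(), g.begin(), g.end());  // collect in order
    }
}
```

Each grouping step preserves the order produced by the previous pass, which is what makes processing the digits from right to left correct.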