QuickSort (updated 29.3.2004)
Problem • Given n of the n+1 integers from 0 to n, find the missing one using O(n) queries of the type: “what is bit[j] of A[i]?” • Note: there are n log n bits in total, so we are not allowed to read the entire input!
Solution • Ask all n integers what their last bit is, and see whether 0 or 1 occurs less often than it should. That is the last bit of the missing integer! • How can we determine the second-to-last bit?
Solution • Recurse on the roughly n/2 numbers whose last bit matches the missing integer's last bit, analyzing the bit patterns of the numbers from 0 to n that end with this bit. • By recurring on the remaining candidate numbers, we get the answer in T(n) = T(n/2) + n = O(n), by the Master Theorem.
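The whole scheme can be sketched in Java (class and method names are illustrative; each candidate contributes one bit inspection per round, so the total number of bit queries is n + n/2 + n/4 + … = O(n)):

```java
import java.util.ArrayList;
import java.util.List;

public class MissingInteger {
    // Finds the missing value from {0, ..., n}, given the other n values.
    // Round k inspects only bit k of the surviving candidates, mirroring
    // the "what is bit[j] in A[i]?" query model from the slides.
    static int findMissing(List<Integer> numbers) {
        int missing = 0;
        int bit = 0;
        List<Integer> candidates = numbers;
        while (!candidates.isEmpty()) {
            List<Integer> zeros = new ArrayList<>();
            List<Integer> ones = new ArrayList<>();
            for (int x : candidates) {           // one bit query per candidate
                if (((x >> bit) & 1) == 0) zeros.add(x);
                else ones.add(x);
            }
            // In a full range the 0-side is at least as big as the 1-side,
            // so a deficit of zeros means the missing number has this bit = 0.
            if (zeros.size() <= ones.size()) {
                candidates = zeros;              // missing bit is 0
            } else {
                missing |= 1 << bit;             // missing bit is 1
                candidates = ones;
            }
            bit++;
        }
        return missing;
    }
}
```

For instance, given {0, 1, 2, 4} (n = 4), the scan of bit 0 narrows the candidates to {1}, the scan of bit 1 empties the candidate list, and the method returns 3.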
Why is sorting so important? • Most of the interesting concepts in the course can be taught in the context of sorting, such as: • Divide and conquer • Randomized algorithms • Lower bounds
Why is sorting so important? • One of the reasons sorting is so important is that once items are sorted, many other problems become very simple to solve.
Searching • Binary search runs on a sorted set in O(log n) time • Searching for an element in an unsorted set takes linear time • This is probably the most important application of sorting
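As a concrete sketch (assuming an int array and 0-based indexing):

```java
public class Search {
    // Classic binary search on a sorted array: O(log n) comparisons.
    // Returns the index of target, or -1 if it is not present.
    static int binarySearch(int[] sorted, int target) {
        int lo = 0, hi = sorted.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;   // avoids overflow for large indices
            if (sorted[mid] == target) return mid;
            if (sorted[mid] < target) lo = mid + 1;
            else hi = mid - 1;
        }
        return -1;
    }
}
```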
Element Uniqueness • Given a set of numbers we want to check if all numbers are unique. • Sort the elements and linearly scan all adjacent pairs.
Closest pairs • Given n numbers, find the two that are closest to each other. • After sorting, the closest pair must be adjacent, so a linear scan will do. • Related problems…
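A minimal sketch of the sort-then-scan idea (method name illustrative):

```java
import java.util.Arrays;

public class ClosestPair {
    // After sorting, the closest pair of numbers must be adjacent,
    // so one linear scan finds the minimum gap: O(n log n) overall.
    static int minGap(int[] values) {
        int[] a = values.clone();
        Arrays.sort(a);
        int best = Integer.MAX_VALUE;
        for (int i = 1; i < a.length; i++) {
            best = Math.min(best, a[i] - a[i - 1]);
        }
        return best;
    }
}
```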
Frequency distribution • Which element appears the largest number of times in a set? • After sorting, a linear scan will do.
Median and order statistics • What is the median of a set of numbers? What is the k-th largest element? • After sorting in ascending order, the k-th largest element sits at index n - k (0-based), so each such query takes constant time.
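With 0-based arrays the indexing works out as follows (a sketch; sorting costs O(n log n) up front, and then each query is O(1)):

```java
import java.util.Arrays;

public class OrderStatistics {
    // After an ascending sort (0-based indexing):
    //   the k-th smallest element is at index k - 1,
    //   the k-th largest element is at index n - k,
    //   and the lower median is at index (n - 1) / 2.
    static int kthLargest(int[] values, int k) {
        int[] a = values.clone();
        Arrays.sort(a);
        return a[a.length - k];
    }
}
```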
Convex hulls • Given n points in two dimensions, find the smallest area polygon which contains them all.
Huffman codes • To minimize the size of a text file, you want to assign codes of different lengths to different characters, giving shorter codes to characters that appear more frequently in the text.
Quicksort • Although mergesort runs in O(n log n), it is inconvenient for arrays, since it requires O(n) extra space. • In practice, quicksort is often the fastest sorting algorithm; it uses partitioning as its main idea.
Quicksort • Partitioning places all the elements less than the pivot in the left part of the array, and all elements greater than the pivot in the right part of the array. • The pivot fits in the slot between them.
Partition • Example – use 10 as a pivot • Note that the pivot element ends up in the correct place in the total order!
Partition • First we must select a pivot element • Once we have selected a pivot element, we can partition the array in one linear scan, by maintaining three sections of the array: • All elements smaller than the pivot • All elements greater than the pivot • All unexplored elements
Example: pivot element is 10

```
| 17 12  6 19 23  8  5 |      (pivot 10 at the far end)
|  5 12  6 19 23  8 | 17
 5 | 12  6 19 23  8 | 17
 5 |  8  6 19 23 | 12 17
 5  8 |  6 19 23 | 12 17
 5  8  6 | 19 23 | 12 17
 5  8  6 | 19 | 23 12 17
 5  8  6 | | 19 23 12 17
 5  8  6 10 23 12 17 19
```

(The bars separate the "smaller than pivot", unexplored, and "greater than pivot" sections; in the final step the pivot is swapped into the slot between them.)
Quicksort • Partition does at most n swaps and takes linear time. • The pivot element ends up in the position it occupies in the final sorted order. • After partitioning, no element crosses to the other side of the pivot in the final sorted order. • Thus we can sort the elements to the left of the pivot and to the right of the pivot independently, and recursively!
QuickSort

```
QuickSort(A, p, r)
    if p < r then
        q ← Partition(A, p, r)
        QuickSort(A, p, q)
        QuickSort(A, q+1, r)

QuickSort(A, 1, length[A])
```
QuickSort

```java
public void sort(Comparable[] values) {
    sort(values, 0, values.length - 1);
}

private void sort(Comparable[] values, int from, int to) {
    if (from < to) {
        int pivot = partition(values, from, to);
        sort(values, from, pivot);
        sort(values, pivot + 1, to);
    }
}
```
```java
// Hoare-style partition around pivot = values[from]: two scans move
// toward each other, swapping out-of-place pairs, and the returned
// index j satisfies values[from..j] <= pivot <= values[j+1..to].
private int partition(Comparable[] values, int from, int to) {
    Comparable pivot = values[from];
    int j = to + 1;
    int i = from - 1;
    while (true) {
        do { j--; } while (values[j].compareTo(pivot) > 0);
        do { i++; } while (values[i].compareTo(pivot) < 0);
        if (i < j) {
            Comparable temp = values[i];
            values[i] = values[j];
            values[j] = temp;
        } else {
            return j;
        }
    }
}
```
Partition • The partition method returns the index separating the array, but it also has a side effect: it rearranges the elements of the array around the pivot.
Partition – version 2

```java
// Lomuto-style partition: the pivot ends up at its final sorted index
// (leftWall), so the matching recursive calls are
// sort(from, leftWall - 1) and sort(leftWall + 1, to).
public int partition(int[] values, int from, int to) {
    int pivot = values[from];
    int leftWall = from;
    for (int i = from + 1; i <= to; i++) {
        if (values[i] < pivot) {
            leftWall++;
            int temp = values[i];
            values[i] = values[leftWall];
            values[leftWall] = temp;
        }
    }
    int temp = values[from];
    values[from] = values[leftWall];
    values[leftWall] = temp;
    return leftWall;
}
```
```
Partition(A[], left, right)
 1. pivot ← left
 2. temp ← right
 3. while temp ≠ pivot
 4.     if A[min(pivot, temp)] > A[max(pivot, temp)]
 5.         swap(A[pivot], A[temp])
 6.         swap(pivot, temp)
 7.     if temp > pivot
 8.         temp--
 9.     else
10.         temp++
11. return pivot
```
Time Analysis • The running time of quicksort depends on how evenly partition divides the array. • The chosen pivot element determines how evenly partition divides the array. • If partition produces equal-sized subarrays, quicksort can be as good as mergesort. • If partition fails to divide the array evenly, quicksort may be asymptotically as bad as insertion sort.
Best case • Since each element ultimately ends up in the correct position, the algorithm correctly sorts. But how long does it take? • The best case for divide-and-conquer algorithms comes when we split the input as evenly as possible. Thus in the best case, each subproblem is of size n/2.
Best case • The partition step on each subproblem is linear in its size, so the total partitioning effort on each level is O(n). • It takes log(n) levels of perfect partitions to get down to single elements, so the total effort is O(n log n).
Worst case • If the pivot is the biggest or smallest element in the array, the subproblems have sizes 0 and n-1. Instead of log(n) levels we end up with O(n) recursive levels, for a total sorting time of O(n^2).
Worst case • The worst-case input for quicksort depends on how we choose the pivot element. • If we choose the first or last element as the pivot, the worst case occurs when the elements are already sorted!
Worst case • Having the worst case occur on a sorted array is bad, since sorted input is an expected case in many applications. (Insertion sort handles sorted arrays in linear time.) • To eliminate this problem, pick a better pivot: • use a random element of the array as the pivot, or • take the median of three elements (first, last, middle) as the pivot. • The worst case still exists, but since it is no longer a natural ordering, it is much less likely to occur.
Randomization • Quicksort is good on average, but bad on certain worst-case instances. • Suppose you pick the pivot element at random. Then no enemy can select a worst-case input, because every input has the same probability of yielding good pivots! • By either picking a random pivot or scrambling the permutation before sorting it, we can say: “With high probability, randomized quicksort runs in O(n log n) time.”
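A sketch of the random-pivot variant, reusing the Lomuto-style partition from “version 2” above (class name illustrative):

```java
import java.util.Random;

public class RandomizedQuicksort {
    private static final Random RNG = new Random();

    // Randomized pivot choice: swap a random element into the "from"
    // slot before partitioning, so no fixed input is always worst-case.
    static void sort(int[] a, int from, int to) {
        if (from < to) {
            int r = from + RNG.nextInt(to - from + 1);
            int tmp = a[from]; a[from] = a[r]; a[r] = tmp;   // random pivot
            int p = partition(a, from, to);
            sort(a, from, p - 1);   // pivot at index p is already in place
            sort(a, p + 1, to);
        }
    }

    // Lomuto-style partition, as in "version 2" above.
    static int partition(int[] values, int from, int to) {
        int pivot = values[from];
        int leftWall = from;
        for (int i = from + 1; i <= to; i++) {
            if (values[i] < pivot) {
                leftWall++;
                int t = values[i]; values[i] = values[leftWall]; values[leftWall] = t;
            }
        }
        int t = values[from]; values[from] = values[leftWall]; values[leftWall] = t;
        return leftWall;
    }
}
```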
Time Analysis • Worst case: T(n) = T(n-1) + Θ(n), which solves to Θ(n^2) • Best case: T(n) = 2T(n/2) + Θ(n), which solves to Θ(n log n) • Average case: O(n log n) in expectation, since a random pivot is likely to split the array reasonably evenly