
Chapter 9: Algorithm Efficiency and Sorting



  1. Chapter 9: Algorithm Efficiency and Sorting Chien Chin Chen Department of Information Management National Taiwan University

  2. Measuring the Efficiency of Algorithms (1/5) • Measuring an algorithm’s efficiency is very important. • Your choice of algorithm for an application often has a great impact. • Components that contribute to the cost of a computer program: • The cost of human time. • Time of development, maintenance … • The cost of program execution. • The amount of computer time and space that the program requires to execute.

  3. Measuring the Efficiency of Algorithms (2/5) • Analysis of algorithms: • Provides tools for contrasting the efficiency of different algorithms. • Time efficiency, space efficiency • Should focus on significant differences in efficiency. • Should not consider reductions in computing costs due to clever coding tricks.

  4. Measuring the Efficiency of Algorithms (3/5) • This chapter focuses on time efficiency. • Comparison: implement different programs; check which one is faster. • Three difficulties with comparing programs instead of algorithms: • How are the algorithms coded? • What computer should you use? • What data should the programs use? • The most important difficulty.

  5. Measuring the Efficiency of Algorithms (4/5) • Algorithm analysis should be independent of: • Specific implementations. • Computers. • Data. • How? • Counting the number of significant operations in a particular solution.

  6. Measuring the Efficiency of Algorithms (5/5) • Counting an algorithm's operations is a way to assess its time efficiency. • An algorithm’s execution time is related to the number of operations it requires. • Example: Traversal of a linked list of n nodes. • n + 1 assignments, n + 1 comparisons, n writes. • Example: The Towers of Hanoi with n disks. • 2^n - 1 moves.
  Node *cur = head;               // 1 assignment
  while (cur != NULL) {           // n+1 comparisons
     cout << cur->item << endl;   // n writes
     cur = cur->next;             // n assignments
  } // end while

  7. Algorithm Growth Rates (1/3) • An algorithm’s time requirements can be measured as a function of the problem size (instance characteristic). • Number of nodes in a linked list. • Size of an array. • Number of items in a stack. • Number of disks in the Towers of Hanoi problem. • Algorithm efficiency is typically a concern for large problems only.

  8. Algorithm Growth Rates (2/3) • Algorithm A requires time proportional to n^2. • Algorithm B requires time proportional to n. • The exact time requirement varies with different computers and implementations.

  9. Algorithm Growth Rates (3/3) • An algorithm’s growth rate: • How quickly the algorithm’s time requirement grows as a function of the problem size. • Algorithm A requires time proportional to n^2. • Algorithm B requires time proportional to n. • Algorithm B is faster than algorithm A. • n^2 and n are growth-rate functions. • A mathematical function used to specify an algorithm’s order in terms of the size of the problem. • Algorithm A is O(n^2) - order n^2. • Algorithm B is O(n) - order n. • Big O notation.

  10. Order-of-Magnitude Analysis and Big O Notation (1/5) • Definition of the order of an algorithm: Algorithm A is order f(n) – denoted O(f(n)) – if constants k and n0 exist such that A requires no more than k * f(n) time units to solve a problem of size n ≥ n0.
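As a worked instance of this definition (the operation count n^2 + 3n here is illustrative, not from the slides): an algorithm that requires n^2 + 3n time units is O(n^2), since

```latex
n^2 + 3n \;\le\; n^2 + n \cdot n \;=\; 2n^2 \quad \text{whenever } n \ge 3,
```

so the definition is satisfied with k = 2 and n0 = 3.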

  11. Order-of-Magnitude Analysis and Big O Notation (2/5) [Table: values of common growth-rate functions for various problem sizes; O(1) is the constant function.]

  12. Order-of-Magnitude Analysis and Big O Notation (3/5) [Figure: graphs of common growth-rate functions.]

  13. Order-of-Magnitude Analysis and Big O Notation (4/5) • Order of growth of some common functions: • O(1) < O(log2 n) < O(n) < O(n * log2 n) < O(n^2) < O(n^3) < O(2^n). • Properties of growth-rate functions: • O(n^3 + 3n) is O(n^3): ignore low-order terms. • O(5 f(n)) = O(f(n)): ignore the multiplicative constant in the high-order term. • O(f(n)) + O(g(n)) = O(f(n) + g(n)).

  14. Order-of-Magnitude Analysis and Big O Notation (5/5) • An algorithm can require different times to solve different problems of the same size. • Average-case analysis • A determination of the average amount of time that an algorithm requires to solve problems of size n. • Best-case analysis • A determination of the minimum amount of time that an algorithm requires to solve problems of size n. • Worst-case analysis • A determination of the maximum amount of time that an algorithm requires to solve problems of size n. • Is easier to calculate and is more common.

  15. Keeping Your Perspective (1/2) • Only significant differences in efficiency are interesting. • Frequency of operations: • When choosing an ADT’s implementation, consider how frequently particular ADT operations occur in a given application. • However, some seldom-used but critical operations must be efficient. • E.g., an air traffic control system.

  16. Keeping Your Perspective (2/2) • If the problem size is always small, you can probably ignore an algorithm’s efficiency. • Order-of-magnitude analysis focuses on large problems. • Weigh the trade-offs between an algorithm’s time requirements and its memory requirements. • Compare algorithms for both style and efficiency.

  17. The Efficiency of Searching Algorithms (1/2) • Sequential search • Strategy: • Look at each item in the data collection in turn. • Stop when the desired item is found, or the end of the data is reached. • Efficiency • Worst case: O(n)

  18. The Efficiency of Searching Algorithms (2/2) • Binary search of a sorted array • Strategy: • Repeatedly divide the array in half. • Determine which half could contain the item, and discard the other half. • Efficiency • Worst case: O(log2 n) • For large arrays, the binary search has an enormous advantage over a sequential search. • At most 20 comparisons to search an array of one million items.

  19. Sorting Algorithms and Their Efficiency • Sorting: • A process that organizes a collection of data into either ascending or descending order. • The sort key. • The part of a data item that we consider when sorting a data collection. • Categories of sorting algorithms. • An internal sort: • Requires that the collection of data fit entirely in the computer’s main memory. • An external sort: • The collection of data will not fit in the computer’s main memory all at once, but must reside in secondary storage.

  20. Selection Sort (1/9) • Strategy: • Select the largest (or smallest) item and put it in its correct place. • Select the next largest (or next smallest) item and put it in its correct place. • And so on. • Until you have selected and put n-1 of the n items. • Analogous to card playing.

  21. Selection Sort (2/9) • Shaded elements are selected; boldface elements are in order. [Figure: selection sort trace: initial array, select and swap, partially sorted array, select and swap, sorted array.]

  22. Selection Sort (3/9)
  typedef int DataType;

  /** Finds the largest item in an array.
   * @pre theArray is an array of size items, size >= 1.
   * @post None.
   * @param theArray The given array.
   * @param size The number of elements in theArray.
   * @return The index of the largest item in the array. The
   *         arguments are unchanged. */
  int indexOfLargest(const DataType theArray[], int size) {
     int indexSoFar = 0;  // index of largest item found so far
     for (int currentIndex = 1; currentIndex < size; ++currentIndex) {
        if (theArray[currentIndex] > theArray[indexSoFar])
           indexSoFar = currentIndex;
     } // end for
     return indexSoFar;  // index of largest item
  } // end indexOfLargest

  23. Selection Sort (4/9)
  /** Swaps two items.
   * @pre x and y are the items to be swapped.
   * @post Contents of actual locations that x and y
   *       represent are swapped.
   * @param x Given data item.
   * @param y Given data item. */
  void swap(DataType& x, DataType& y) {
     DataType temp = x;
     x = y;
     y = temp;
  } // end swap

  24. Selection Sort (5/9)
  /** Sorts the items in an array into ascending order.
   * @pre theArray is an array of n items.
   * @post The array theArray is sorted into ascending order;
   *       n is unchanged.
   * @param theArray The array to sort.
   * @param n The size of theArray. */
  void selectionSort(DataType theArray[], int n) {
     // last = index of the last item in the subarray of
     //        items yet to be sorted,
     // largest = index of the largest item found
     for (int last = n-1; last >= 1; --last) {
        // select largest item in theArray[0..last]
        int largest = indexOfLargest(theArray, last+1);
        // swap largest item theArray[largest] with theArray[last]
        swap(theArray[largest], theArray[last]);
     } // end for
  } // end selectionSort

  25. Selection Sort (6/9) • Analysis: • Sorting in general compares, exchanges, or moves items. • We should count these operations. • Such operations are more expensive than ones that control loops or manipulate array indexes, particularly when the data to be sorted are complex. • The for loop in the function selectionSort executes n-1 times. • indexOfLargest and swap are called n-1 times.

  26. Selection Sort (7/9) • Each call to indexOfLargest causes its loop to execute last (or size - 1) times. • The calls cause the loop to execute a total of: • (n-1) + (n-2) + … + 1 = n*(n-1)/2 times. • Each execution of the loop performs one comparison: • The calls of indexOfLargest require n*(n-1)/2 comparisons.

  27. Selection Sort (8/9) • The n-1 calls to swap require: • 3 * (n-1) moves. • Together, a selection sort of n items requires: • n*(n-1)/2 + 3*(n-1) = n^2/2 + 5n/2 - 3 major operations. • Thus, selection sort is O(n^2).

  28. Selection Sort (9/9) • Does not depend on the initial arrangement of the data. • Best case = worst case = average case = O(n^2) • Only appropriate for small n!!

  29. Bubble Sort (1/4) • Strategy • Compare adjacent elements and exchange them if they are out of order. • Moves the largest (or smallest) elements to the end of the array. • Repeating this process eventually sorts the array into ascending (or descending) order.

  30. Bubble Sort (2/4) [Figure: bubble sort trace: Pass 1 and Pass 2 over the initial array, swapping adjacent items that are out of order.]

  31. Bubble Sort (3/4)
  /** Sorts the items in an array into ascending order.
   * @pre theArray is an array of n items.
   * @post theArray is sorted into ascending order; n is unchanged.
   * @param theArray The given array.
   * @param n The size of theArray. */
  void bubbleSort(DataType theArray[], int n) {
     bool sorted = false;  // false when swaps occur
     for (int pass = 1; (pass < n) && !sorted; ++pass) {
        sorted = true;  // assume sorted
        for (int index = 0; index < n-pass; ++index) {
           int nextIndex = index + 1;
           if (theArray[index] > theArray[nextIndex]) {
              // exchange items
              swap(theArray[index], theArray[nextIndex]);
              sorted = false;  // signal exchange
           } // end if
        } // end for
     } // end for
  } // end bubbleSort
  • You can terminate the process if no exchanges occur during any pass.

  32. Bubble Sort (4/4) • Analysis: • In the worst case, the bubble sort requires at most n-1 passes through the array. • Pass 1 requires n-1 comparisons and at most n-1 exchanges. • Pass 2 requires n-2 comparisons and at most n-2 exchanges. • ... • Requires a total of • (n-1) + (n-2) + … + 1 = n*(n-1)/2 comparisons. • (n-1) + (n-2) + … + 1 = n*(n-1)/2 exchanges. • Each exchange requires 3 moves. • Thus, altogether there are: 2 * n*(n-1) = O(n^2) operations. • In the best case: requires only one pass. • n-1 comparisons and no exchanges: O(n)

  33. Insertion Sort (1/4) • Strategy: • Partition the array into two regions: sorted and unsorted. • At each step, • Take the first item from the unsorted region. • Insert it into its correct order in the sorted region. [Figure: the sorted and unsorted regions of the array; the next item is inserted into the sorted region.]

  34. Insertion Sort (2/4) • You can omit the first step by considering the initial sorted region to be theArray[0] and the initial unsorted region to be theArray[1..n-1]. [Figure: insertion sort trace: the initial array, shifting larger sorted items to the right, then inserting the next item.]

  35. Insertion Sort (3/4)
  /** Sorts the items in an array into ascending order.
   * @pre theArray is an array of n items.
   * @post theArray is sorted into ascending order; n is unchanged.
   * @param theArray The given array.
   * @param n The size of theArray. */
  void insertionSort(DataType theArray[], int n) {
     for (int unsorted = 1; unsorted < n; ++unsorted) {
        DataType nextItem = theArray[unsorted];
        int loc = unsorted;
        for (; (loc > 0) && (theArray[loc-1] > nextItem); --loc)
           // shift theArray[loc-1] to the right
           theArray[loc] = theArray[loc-1];
        // insert nextItem into sorted region
        theArray[loc] = nextItem;
     } // end for
  } // end insertionSort

  36. Insertion Sort (4/4) • Analysis: • In the worst case, the outer for loop executes n-1 times. • This loop contains an inner for loop that executes at most unsorted times. • unsorted ranges from 1 to n-1. • Number of comparisons and moves: • 2 * [1+2+…+(n-1)] = n*(n-1). • The outer loop moves data items twice per iteration, or 2*(n-1) times. • Together, there are n*(n-1) + 2*(n-1) = n^2 + n - 2 major operations in the worst case, O(n^2). • Prohibitively inefficient for large arrays.

  37. Mergesort (1/9) • A recursive sorting algorithm. • Strategy: • Divide an array into halves. • Sort each half (by calling itself recursively). • Merge the sorted halves into one sorted array. • Base case: an array of one item IS SORTED.

  38. Mergesort (2/9) [Figure: mergesort trace: divide theArray in half and conquer each half, merge the sorted halves into tempArray, then copy the result back into theArray.]

  39. Mergesort (3/9)
  /** Sorts the items in an array into ascending order.
   * @pre theArray[first..last] is an array.
   * @post theArray[first..last] is sorted in ascending order.
   * @param theArray The given array.
   * @param first The first element to consider in theArray.
   * @param last The last element to consider in theArray. */
  void mergesort(DataType theArray[], int first, int last) {
     if (first < last) {
        int mid = (first + last)/2;  // index of midpoint
        mergesort(theArray, first, mid);
        mergesort(theArray, mid+1, last);
        // merge the two halves
        merge(theArray, first, mid, last);
     } // end if
  } // end mergesort

  40. Mergesort (4/9)
  const int MAX_SIZE = 10000;
  void merge(DataType theArray[], int first, int mid, int last) {
     DataType tempArray[MAX_SIZE];  // temporary array
     // initialize the local indexes to indicate the subarrays
     int first1 = first;    // beginning of first subarray
     int last1 = mid;       // end of first subarray
     int first2 = mid + 1;  // beginning of second subarray
     int last2 = last;      // end of second subarray
     // while both subarrays are not empty, copy the
     // smaller item into the temporary array
     int index = first1;    // next available location in tempArray
     for (; (first1 <= last1) && (first2 <= last2); ++index) {
        if (theArray[first1] < theArray[first2]) {
           tempArray[index] = theArray[first1];
           ++first1;
        }
        else {
           tempArray[index] = theArray[first2];
           ++first2;
        } // end if
     } // end for

  41. Mergesort (5/9)
     // finish off the nonempty subarray
     // finish off the first subarray, if necessary
     for (; first1 <= last1; ++first1, ++index)
        tempArray[index] = theArray[first1];
     // finish off the second subarray, if necessary
     for (; first2 <= last2; ++first2, ++index)
        tempArray[index] = theArray[first2];
     // copy the result back into the original array
     for (index = first; index <= last; ++index)
        theArray[index] = tempArray[index];
  } // end merge

  42. Mergesort (6/9)
  void mergesort(DataType theArray[], int first, int last) {
     if (first < last) {
        int mid = (first + last)/2;
        mergesort(theArray, first, mid);
        mergesort(theArray, mid+1, last);
        merge(theArray, first, mid, last);
     }
  }
  [Figure: recursion tree of the calls made by mergesort(theArray, 0, 5).]

  43. Mergesort (7/9) • Analysis (worst case): • The merge step of the algorithm requires the most effort. • Each merge step merges theArray[first..mid] and theArray[mid+1..last]. • If the number of items in the two array segments is m: • Merging the segments requires at most: • m-1 comparisons. • m moves from the original array to the temporary array. • m moves from the temporary array back to the original array. • Each merge requires 3*m - 1 major operations.

  44. Mergesort (8/9) • Level 0: merge n items: 3*n-1 operations, or O(n). • Level 1: merge two groups of n/2 items: 2*(3*n/2 - 1) = 3n-2 operations, or O(n). • Level 2: merge four groups of n/4 items; each level requires O(n) operations. • ... • There are log2 n levels (or 1 + log2 n rounded down). • O(n) operations per level and O(log2 n) levels give O(n * log2 n).

  45. Mergesort (9/9) • Analysis: • Worst case: O(n * log2 n). • Average case: O(n * log2 n). • Performance is independent of the initial order of the array items. • Advantage: • Mergesort is an extremely fast algorithm. • Disadvantage: • Mergesort requires a second array as large as the original array.

  46. Quicksort (1/15) • A divide-and-conquer algorithm. • Strategy: • Choose a pivot. • Partition the array about the pivot. • The pivot is now in its correct sorted position. • The items in [first..pivotIndex-1] remain in positions first through pivotIndex-1 when the array is properly sorted. • Sort the left section and sort the right section. (solve small problems)

  47. Quicksort (2/15) • Partition algorithm: • To partition an array segment theArray[first…last]. • Choose a pivot. • If the items in the array are arranged randomly, you can choose a pivot at random. • Choose theArray[first] as the pivot. • Three regions: S1, S2, and unknown.

  48. Quicksort (3/15) • Initially, all items except the pivot (theArray[first]) constitute the unknown region. • Conditions: • lastS1 = first • firstUnknown = first + 1 • At each step, you examine one item of the unknown region. • i.e., theArray[firstUnknown]. • Determine in which of S1 or S2 it belongs, and place it there. • Then decrease the size of unknown by 1 (i.e., firstUnknown++). [Figure: the array indexed by first, lastS1, firstUnknown, and last.]

  49. Quicksort (4/15) • Move theArray[firstUnknown] into S1: • Swap theArray[firstUnknown] with theArray[lastS1+1]. • Increment lastS1. • Increment firstUnknown. [Figure: S1 (items < p), S2 (items ≥ p), and the unknown region; the swap moves the examined item to the end of S1.]

  50. Quicksort (5/15) • Move theArray[firstUnknown] into S2: • Simply increment firstUnknown by 1. [Figure: S1 (items < p), S2 (items ≥ p), and the unknown region; S2 grows in place.]
