Searching

Assume that we have an array of integers:

7 2 6 9 4 3 8 10 1 5

And we wish to find a particular element in the array (e.g., 10).

    // "stdafx.h" is only needed for Visual Studio precompiled headers
    #include <iostream>
    using namespace std;

    int main()
    {
        int iarray[10] = {7,2,6,9,4,3,8,10,1,5}, index, search = 10;

        // Advance index until the value is found or the end of the array is reached
        for (index = 0; (index < 10) && (iarray[index] != search); index++);

        if (index == 10)
            cout << "The Integer is NOT on the list\n";
        else
            cout << "The Integer " << iarray[index] << " was found in position " << index
                 << " (address: " << (unsigned long int)&iarray[index] << " )\n";
        return 0;
    }
Following the program during the for loop:

    for (index = 0; index < 10 && iarray[index] != search; index++);

Variable values (search set to 10):

index   iarray[index]   Condition check: index < 10 && iarray[index] != search
0       7               TRUE
1       2               TRUE
2       6               TRUE
3       9               TRUE
4       4               TRUE
5       3               TRUE
6       8               TRUE
7       10              FALSE (Exit Loop)

    cout << "The Integer " << iarray[index] << " was found in position " << index
         << " (address: " << (unsigned long int)&iarray[index] << " )\n";

Output: The Integer 10 was found in position 7 (address: 2881536 )
Since the list of integers is not in any order, we must perform a sequential search:
• Each element in the list must be checked until:
    • The element is found, or
    • The end of the list is reached
• The procedure is adequate if each element is to be considered (e.g., in a transaction listing)
• The procedure is inadequate if specific elements are sought

In a sequential search:
• The MAXIMUM number of searches required is: n + 1 (where n = the number of elements on the list; the extra one is the final end-of-list check made when the element is not found)
• The AVERAGE number of searches required is: (n + 1)/2
The number of searches required is dependent upon the number of elements in the list:

Number of elements   Maximum Searches (n + 1)   Average Searches (n + 1)/2
10                   11                         5.5
100                  101                        50.5
1,000                1,001                      500.5
10,000               10,001                     5,000.5
100,000              100,001                    50,000.5
1,000,000            1,000,001                  500,000.5
10,000,000           10,000,001                 5,000,000.5
100,000,000          100,000,001                50,000,000.5
1,000,000,000        1,000,000,001              500,000,000.5
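The counts above can be checked with a small sketch. The function name sequentialSearch and the checks counter are assumptions made for illustration and are not part of the original slides; the counter records how many times the loop condition is evaluated, which is the quantity the n + 1 maximum refers to.

```cpp
// Hypothetical helper (not from the slides): sequential search over a[0..n-1].
// Returns the offset of key, or -1 if key is not on the list.
// checks receives the number of times the loop condition was evaluated.
int sequentialSearch(const int a[], int n, int key, int &checks)
{
    int index = 0;
    checks = 0;
    while (true) {
        ++checks;                              // one evaluation of the loop condition
        if (index >= n || a[index] == key)     // end of list reached, or element found
            break;
        ++index;
    }
    return (index < n) ? index : -1;
}
```

With the 10-element list from the example, searching for 10 stops after 8 checks at offset 7, while searching for a value that is not on the list takes 11 checks (n + 1).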
IF the list were sorted:

1 2 3 4 5 6 7 8 9 10

We could perform a Binary Search on it:
1. Determine the bottom and top of the list
2. If the bottom offset > top offset: STOP: The number is NOT in the list
3. Find the midpoint = (bottom + top)/2 (integer division)
4. If the element at the midpoint is the search number: STOP: The number has been found
5. If the element is greater than the search number: top = midpoint - 1
   Else (the element is less than the search number): bottom = midpoint + 1
   Go to step 2
Let’s consider the procedure, step by step (assume we are trying to find the integer 6 on the list):

Element: 1 2 3 4 5 6 7 8 9 10
Offset:  0 1 2 3 4 5 6 7 8 9

First probe:
1. Determine the bottom and top of the list: bottom = 0, top = 9
2. Is the bottom offset > top offset? No
3. Find the midpoint = (bottom + top)/2 = (0 + 9)/2 = 4
4. Is the element at the midpoint (5) the search element? No
5. Is the element greater than the search number? No: bottom = midpoint + 1 = 4 + 1 = 5
The new search list is offsets 5 to 9 (top unchanged).

Second probe:
bottom = 5, top = 9
2. Is the bottom offset > top offset? No
3. Find the midpoint = (bottom + top)/2 = (5 + 9)/2 = 7
4. Is the element at the midpoint (8) the search element? No
5. Is the element greater than the search number? Yes: top = midpoint - 1 = 7 - 1 = 6
The new search list is offsets 5 to 6 (bottom unchanged).
Third probe:

Element: 1 2 3 4 5 6 7 8 9 10
Offset:  0 1 2 3 4 5 6 7 8 9

bottom = 5, top = 6
2. Is the bottom offset > top offset? No
3. Find the midpoint = (bottom + top)/2 = (5 + 6)/2 = 5
4. Is the element at the midpoint (6) the search element? Yes: STOP

The search number was found.

This does NOT seem like a savings over a sequential search. In fact, it seems like much more work. In this case, because the list is short and each probe involves several steps, it probably is not a savings.
For a binary search:
• The MAXIMUM number of searches required is approximately: log2 n, rounded up to a whole number (where n = the number of elements on the list)
• The AVERAGE number of searches required is approximately: (log2 n) - 1 (for n > 30)

No. Elements     Ave. Sequential Searches   Max. Binary Searches   Ave. Binary Searches
10               5.5                        4                      2.9
100              50.5                       7                      5.8
1,000            500.5                      10                     9.0
10,000           5,000.5                    14                     12.3
100,000          50,000.5                   17                     15.6
1,000,000        500,000.5                  20                     18.9
10,000,000       5,000,000.5                24                     22.3
100,000,000      50,000,000.5               27                     25.6
1,000,000,000    500,000,000.5              30                     28.9
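The first row of the table can be reproduced with a short sketch. The function name binarySearch and the probes counter are assumptions made for illustration, not part of the original slides; the counter records how many midpoint elements are examined.

```cpp
// Hypothetical helper (not from the slides): binary search over a sorted a[0..n-1].
// Returns the offset of key, or -1 if key is not on the list.
// probes receives the number of midpoint elements examined.
int binarySearch(const int a[], int n, int key, int &probes)
{
    int bottom = 0, top = n - 1;
    probes = 0;
    while (bottom <= top) {                        // step 2: stop when bottom > top
        int midpt = bottom + (top - bottom) / 2;   // step 3: midpoint (integer division)
        ++probes;
        if (a[midpt] == key)                       // step 4: found
            return midpt;
        if (a[midpt] > key)                        // step 5: discard the upper half
            top = midpt - 1;
        else                                       // otherwise discard the lower half
            bottom = midpt + 1;
    }
    return -1;
}
```

For the sorted list 1 to 10, finding 6 takes 3 probes and finding 10 takes 4 (the maximum for 10 elements). Summing the probes for all ten values gives 29, which is the 2.9 average shown in the table.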
Is a binary search always preferred to a sequential search?

NO. It depends:
• If all elements are to be examined, a sequential search is preferred
• A binary search:
    • Is programmatically more complex
    • Requires more comparisons per step (bounds, equality, and ordering checks)
• As a general rule of thumb, a binary search is preferred if the list contains more than 30-50 elements

How does a binary search work if an element is NOT on the list?

Consider the array:

1 2 6 10 12 14 15 21 22 29

Suppose we were to search the list for the value 9 (which is NOT on the list).
Element: 1 2 6 10 12 14 15 21 22 29
Offset:  0 1 2 3  4  5  6  7  8  9

Search   bottom   midpoint   top   Element at midpoint   Result
#1       0        4          9     12                    12 > 9: top = 3
#2       0        1          3     2                     2 < 9: bottom = 2
#3       2        2          3     6                     6 < 9: bottom = 3
#4       3        3          3     10                    10 > 9: top = 2

Since the bottom offset (3) is now > the top offset (2): STOP. The value is NOT on the list.
What would the C++ code for a binary search look like?

    // "stdafx.h" is only needed for Visual Studio precompiled headers
    #include <iostream>
    using namespace std;

    int main()
    {
        int iarray[10] = {1,2,6,10,12,14,15,21,22,29},
            search, bottom = 0, top = 9, found = 0, midpt = 9/2;

        cout << "Enter the number to search for: ";
        cin >> search;

        // Use >= so that a one-element range (top == bottom) is still examined
        while ((top >= bottom) && (found == 0))
            if (iarray[midpt] == search)
                found = 1;
            else
            {
                if (search > iarray[midpt])
                    bottom = midpt + 1;        // discard the lower half
                else
                    top = midpt - 1;           // discard the upper half
                midpt = (bottom + top)/2;
            }

        if (found == 0)
            cout << "The Integer is NOT on the list\n";
        else
            cout << iarray[midpt] << " was found in position " << midpt
                 << " (address: " << (unsigned long int)&iarray[midpt] << " )\n";
        return 0;
    }
Outputs (screenshots not reproduced):
• If the value is on the list
• If the value is NOT on the list
Sorting
• Why?
    • Displaying in order
    • Faster Searching
• Categories
    • Internal
        • List elements manipulated in RAM
        • Faster
        • Limited by amount of RAM
    • External
        • External (secondary) storage areas used
        • Slower
        • Used for Larger Lists
        • Limited by secondary storage (Disk Space)
Basic Internal Sort Types
• Exchange (e.g., bubble sort)
    • Single list
    • Incorrectly ordered pairs swapped as found
• Selection
    • Two lists (generally); Selection with exchange uses one list
    • Largest/Smallest selected in each pass and moved into position
• Insertion
    • One or two lists (two more common)
    • Each item from original list inserted into the correct position in the new list
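As a point of comparison with the exchange sorts that follow, here is a minimal sketch of selection with exchange on a single list. The function name selectionSort is an assumption made for illustration and does not appear in the original slides; this version selects the smallest remaining element in each pass.

```cpp
// Hypothetical sketch (not from the slides): selection with exchange, one list.
// Each pass selects the smallest remaining element and swaps it into position.
void selectionSort(int a[], int n)
{
    for (int pos = 0; pos < n - 1; ++pos) {
        int smallest = pos;                    // offset of the smallest element seen so far
        for (int i = pos + 1; i < n; ++i)
            if (a[i] < a[smallest])
                smallest = i;
        if (smallest != pos) {                 // exchange only when needed
            int temp = a[pos];
            a[pos] = a[smallest];
            a[smallest] = temp;
        }
    }
}
```

Unlike a bubble sort, this makes at most one swap per pass, although it still compares every remaining pair of positions.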
Exchange Sorts

Bubble Sort 1: The largest element ‘bubbles’ up

Given:

7 2 6 1 3 4 8 10 9 5

• Point to the bottom element
• Compare with the element above:
    • if the element is greater, swap positions (in this case, swap)
    • if the element is smaller, move the bottom pointer up to the next element (not here)
• Continue the process until the largest element is at the end
    • This will require n-1 comparisons (9 for our example), where n = the length of the unsorted list
• At the end of the pass:
    • The largest number is in the last position
    • The length of the unsorted list has been shortened by 1
How does this work??

Starting list: 7 2 6 1 3 4 8 10 9 5

Pass #1:

Comparison   Action       List after comparison
1            Swap         2 7 6 1 3 4 8 10 9 5
2            Swap         2 6 7 1 3 4 8 10 9 5
3            Swap         2 6 1 7 3 4 8 10 9 5
4            Swap         2 6 1 3 7 4 8 10 9 5
5            Swap         2 6 1 3 4 7 8 10 9 5
6            Don’t Swap   2 6 1 3 4 7 8 10 9 5
7            Don’t Swap   2 6 1 3 4 7 8 10 9 5
8            Swap         2 6 1 3 4 7 8 9 10 5
9            Swap         2 6 1 3 4 7 8 9 5 10

The new list appears as:

2 6 1 3 4 7 8 9 5 10

Note:
• 9 (n - 1) comparisons were required
• We know that the largest element is at the end of the list
Continuing (running comparison count in parentheses):

Pass #2:

Comparison   Action       List after comparison
1 (10)       Don’t Swap   2 6 1 3 4 7 8 9 5 10
2 (11)       Swap         2 1 6 3 4 7 8 9 5 10
3 (12)       Swap         2 1 3 6 4 7 8 9 5 10
4 (13)       Swap         2 1 3 4 6 7 8 9 5 10
5 (14)       Don’t Swap   2 1 3 4 6 7 8 9 5 10
6 (15)       Don’t Swap   2 1 3 4 6 7 8 9 5 10
7 (16)       Don’t Swap   2 1 3 4 6 7 8 9 5 10
8 (17)       Swap         2 1 3 4 6 7 8 5 9 10

Pass #3:

Comparison   Action       List after comparison
1 (18)       Swap         1 2 3 4 6 7 8 5 9 10
2 (19)       Don’t Swap   1 2 3 4 6 7 8 5 9 10
3 (20)       Don’t Swap   1 2 3 4 6 7 8 5 9 10
4 (21)       Don’t Swap   1 2 3 4 6 7 8 5 9 10
5 (22)       Don’t Swap   1 2 3 4 6 7 8 5 9 10
6 (23)       Don’t Swap   1 2 3 4 6 7 8 5 9 10
7 (24)       Swap         1 2 3 4 6 7 5 8 9 10
Continuing:

Pass #4:

Comparison   Action       List after comparison
1 (25)       Don’t Swap   1 2 3 4 6 7 5 8 9 10
2 (26)       Don’t Swap   1 2 3 4 6 7 5 8 9 10
3 (27)       Don’t Swap   1 2 3 4 6 7 5 8 9 10
4 (28)       Don’t Swap   1 2 3 4 6 7 5 8 9 10
5 (29)       Don’t Swap   1 2 3 4 6 7 5 8 9 10
6 (30)       Swap         1 2 3 4 6 5 7 8 9 10

Pass #5:

Comparison   Action       List after comparison
1 (31)       Don’t Swap   1 2 3 4 6 5 7 8 9 10
2 (32)       Don’t Swap   1 2 3 4 6 5 7 8 9 10
3 (33)       Don’t Swap   1 2 3 4 6 5 7 8 9 10
4 (34)       Don’t Swap   1 2 3 4 6 5 7 8 9 10
5 (35)       Swap         1 2 3 4 5 6 7 8 9 10

And the new list is in order, so we can stop. Right???
NO. In the WORST case scenario (numbers in reverse order):

10 9 8 7 6 5 4 3 2 1

A bubble sort would yield:

After Pass   Order                   Comparisons
1            9 8 7 6 5 4 3 2 1 10    9
2            8 7 6 5 4 3 2 1 9 10    8
3            7 6 5 4 3 2 1 8 9 10    7
4            6 5 4 3 2 1 7 8 9 10    6
5            5 4 3 2 1 6 7 8 9 10    5
6            4 3 2 1 5 6 7 8 9 10    4
7            3 2 1 4 5 6 7 8 9 10    3
8            2 1 3 4 5 6 7 8 9 10    2
9            1 2 3 4 5 6 7 8 9 10    1

Maximum comparisons necessary: 45
What does this imply ???

If we want to be sure, given an array of n elements, we need a maximum of n - 1 passes to sort the array, and a total of:

(n-1) + (n-2) + ... + 1 = (n^2 - n)/2

comparisons.

No. Items    Max. Passes: (n - 1)   Max. Compares: (n^2 - n)/2
10           9                      45
100          99                     4,950
1,000        999                    499,500
10,000       9,999                  49,995,000
100,000      99,999                 4,999,950,000
1,000,000    999,999                499,999,500,000

The C code necessary?
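For reference, the closed form follows from the standard sum of the first n - 1 integers; the worked value uses the 10-element example from the worst-case table.

```latex
\[
(n-1) + (n-2) + \dots + 1 \;=\; \sum_{k=1}^{n-1} k \;=\; \frac{n(n-1)}{2} \;=\; \frac{n^{2}-n}{2},
\qquad
n = 10:\quad \frac{10^{2}-10}{2} = \frac{90}{2} = 45 .
\]
```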
    #include <stdio.h>

    int main(void)
    {
        int pass=0, compare=0, swaps=0, top=9, i, j, temp,
            iarray[10]={7,2,6,1,3,4,8,10,9,5};

        while (top > 0)                              // check end
        {
            pass++;                                  // increment ctr
            for (i = 0; i < top; i++)                // begin pass
            {
                compare++;                           // increment ctr
                if (iarray[i] > iarray[i+1])         // ?? out of order
                {
                    swaps++;                         // increment ctr
                    temp = iarray[i];                // temp. storage
                    iarray[i] = iarray[i+1];         // swap
                    iarray[i+1] = temp;
                }
                // print pass, comparison and swap counters, then the whole list
                printf("%3d %3d %3d: ", pass, compare, swaps);
                for (j = 0; j < 10; j++)
                    printf("%3d", iarray[j]);        // print element
                printf("\n");
            }
            top--;                                   // largest element is now in place
        }
        return 0;
    }
The Output (modified slightly) would appear as:

Pass  Comparison  Swap   Order
1     1           1:     2 7 6 1 3 4 8 10 9 5
1     2           2:     2 6 7 1 3 4 8 10 9 5
1     3           3:     2 6 1 7 3 4 8 10 9 5
1     4           4:     2 6 1 3 7 4 8 10 9 5
1     5           5:     2 6 1 3 4 7 8 10 9 5
1     6           5:     2 6 1 3 4 7 8 10 9 5
1     7           5:     2 6 1 3 4 7 8 10 9 5
1     8           6:     2 6 1 3 4 7 8 9 10 5
1     9           7:     2 6 1 3 4 7 8 9 5 10
2     10          7:     2 6 1 3 4 7 8 9 5 10
2     11          8:     2 1 6 3 4 7 8 9 5 10
2     12          9:     2 1 3 6 4 7 8 9 5 10
2     13          10:    2 1 3 4 6 7 8 9 5 10
2     14          10:    2 1 3 4 6 7 8 9 5 10
2     15          10:    2 1 3 4 6 7 8 9 5 10
2     16          10:    2 1 3 4 6 7 8 9 5 10
2     17          11:    2 1 3 4 6 7 8 5 9 10
3     18          12:    1 2 3 4 6 7 8 5 9 10
3     19          12:    1 2 3 4 6 7 8 5 9 10
3     20          12:    1 2 3 4 6 7 8 5 9 10
3     21          12:    1 2 3 4 6 7 8 5 9 10
3     22          12:    1 2 3 4 6 7 8 5 9 10
3     23          12:    1 2 3 4 6 7 8 5 9 10
3     24          13:    1 2 3 4 6 7 5 8 9 10
4     25          13:    1 2 3 4 6 7 5 8 9 10
4     26          13:    1 2 3 4 6 7 5 8 9 10
4     27          13:    1 2 3 4 6 7 5 8 9 10
4     28          13:    1 2 3 4 6 7 5 8 9 10
4     29          13:    1 2 3 4 6 7 5 8 9 10
4     30          14:    1 2 3 4 6 5 7 8 9 10
5     31          14:    1 2 3 4 6 5 7 8 9 10
5     32          14:    1 2 3 4 6 5 7 8 9 10
5     33          14:    1 2 3 4 6 5 7 8 9 10
5     34          14:    1 2 3 4 6 5 7 8 9 10
5     35          15:    1 2 3 4 5 6 7 8 9 10
6     36          15:    1 2 3 4 5 6 7 8 9 10
6     37          15:    1 2 3 4 5 6 7 8 9 10
6     38          15:    1 2 3 4 5 6 7 8 9 10
6     39          15:    1 2 3 4 5 6 7 8 9 10
7     40          15:    1 2 3 4 5 6 7 8 9 10
7     41          15:    1 2 3 4 5 6 7 8 9 10
7     42          15:    1 2 3 4 5 6 7 8 9 10
8     43          15:    1 2 3 4 5 6 7 8 9 10
8     44          15:    1 2 3 4 5 6 7 8 9 10
9     45          15:    1 2 3 4 5 6 7 8 9 10
Since the list IS sorted after 5 passes (35 comparisons), why can’t we stop??

We could, IF we knew the list was sorted:
• If we make a pass without swapping any elements, we know the list is sorted (one extra pass is needed)
• We need a flag which we set to 0 (zero) before each pass:
    • If we make any swaps in the pass, we set the flag to 1
    • If we exit the loop, and the flag = 0, the list is sorted

For our example, we could stop after Pass 6 (39 comparisons).

How would the C code appear?
    #include <stdio.h>

    int main(void)
    {
        int pass=0, compare=0, swaps=0, top=9, i, j, temp,
            sorted = 1,     // flag: 1 = a swap was made in the last pass (list may still be unsorted)
            iarray[10]={7,2,6,1,3,4,8,10,9,5};

        while ((top > 0) && (sorted == 1))           // check end AND if NOT sorted
        {
            pass++;                                  // increment ctr
            sorted = 0;                              // reset our flag
            for (i = 0; i < top; i++)                // begin pass
            {
                compare++;                           // increment ctr
                if (iarray[i] > iarray[i+1])         // ?? out of order
                {
                    swaps++;                         // increment ctr
                    sorted = 1;                      // set the flag
                    temp = iarray[i];                // temp. storage
                    iarray[i] = iarray[i+1];         // swap
                    iarray[i+1] = temp;
                }
                printf("%3d %3d %3d: ", pass, compare, swaps);
                for (j = 0; j < 10; j++)
                    printf("%3d", iarray[j]);        // print element
                printf("\n");
            }
            top--;
        }
        return 0;
    }
Could we refine the bubble sort??

We could ‘bubble-up’ in one pass (as we did before) AND ‘bubble-down’ in the next pass.

Consider our list after our first pass (9th comparison):

2 6 1 3 4 7 8 9 5 10

Starting at the top of the list, we now ‘bubble-down’ the smallest element (‘1’ will end up at the bottom of the list):

Pass #2:

Comparison   Action       List after comparison
1 (10)       Swap         2 6 1 3 4 7 8 5 9 10
2 (11)       Swap         2 6 1 3 4 7 5 8 9 10
3 (12)       Swap         2 6 1 3 4 5 7 8 9 10
4 (13)       Don’t Swap   2 6 1 3 4 5 7 8 9 10
5 (14)       Don’t Swap   2 6 1 3 4 5 7 8 9 10
6 (15)       Don’t Swap   2 6 1 3 4 5 7 8 9 10
7 (16)       Swap         2 1 6 3 4 5 7 8 9 10
8 (17)       Swap         1 2 6 3 4 5 7 8 9 10
Continuing (bubble-up again):

Pass #3:

Comparison   Action       List after comparison
1 (18)       Don’t Swap   1 2 6 3 4 5 7 8 9 10
2 (19)       Swap         1 2 3 6 4 5 7 8 9 10
3 (20)       Swap         1 2 3 4 6 5 7 8 9 10
4 (21)       Swap         1 2 3 4 5 6 7 8 9 10
5 (22)       Don’t Swap   1 2 3 4 5 6 7 8 9 10
6 (23)       Don’t Swap   1 2 3 4 5 6 7 8 9 10
7 (24)       Don’t Swap   1 2 3 4 5 6 7 8 9 10
Since the list is in order, can we stop??

NO: Remember, we need one pass WITHOUT a swap.

Pass #4 (bubble-down):

Comparison   Action       List after comparison
1 (25)       Don’t Swap   1 2 3 4 5 6 7 8 9 10
2 (26)       Don’t Swap   1 2 3 4 5 6 7 8 9 10
3 (27)       Don’t Swap   1 2 3 4 5 6 7 8 9 10
4 (28)       Don’t Swap   1 2 3 4 5 6 7 8 9 10
5 (29)       Don’t Swap   1 2 3 4 5 6 7 8 9 10
6 (30)       Don’t Swap   1 2 3 4 5 6 7 8 9 10

No swaps were made in this pass, so we can stop after 30 comparisons.
    #include <stdio.h>

    void swap(int *swaparray, int a, int b);

    int sorted = 1;     // flag: 1 = a swap was made in the last pass (list may still be unsorted)

    int main(void)
    {
        int bottom = 0, top = 9, i,
            iarray[10]={7,2,6,1,3,4,8,10,9,5};

        while ((top > bottom) && (sorted == 1))          // check end AND if NOT sorted
        {
            sorted = 0;                                  // reset our flag
            for (i = bottom; i < top; i++)               // begin bubble-up pass
                if (iarray[i] > iarray[i+1])             // ?? out of order
                    swap(iarray, i, i+1);                // Swap the elements
            top--;
            if ((top > bottom) && (sorted == 1))         // check end AND if NOT sorted
            {
                sorted = 0;                              // reset our flag
                for (i = top; i > bottom; i--)           // begin bubble-down pass
                    if (iarray[i] < iarray[i-1])         // ?? out of order
                        swap(iarray, i, i-1);            // Swap the elements
                bottom++;
            }
        }
        return 0;
    }

    void swap(int *swaparray, int a, int b)
    {
        int temp;
        sorted = 1;                                      // set the flag
        temp = swaparray[a];                             // temp. storage
        swaparray[a] = swaparray[b];                     // swap
        swaparray[b] = temp;
    }
Are there better sorting methods?

YES: Generally speaking, bubble sorts are very slow
• The Quicksort Method:
    • Generally the fastest internal sorting method
    • Intended for longer lists

How does a quicksort work?
• As we have seen, the shorter the list, the faster the sort
• Quicksort recursively partitions the list into smaller sublists, gradually moving the elements into their correct position
Step 1: Choose a pivot element from the list
• Optimal Pivot: Median element
• One alternative: Middle element of the list (the element at the midpoint offset)

The pivot element will divide the list in half.

Our list:

7 2 6 9 4 3 8 10 1 5

Step 2: Partition the list
• Move numbers larger than the pivot to the right, smaller numbers to the left
• Compare leftmost with rightmost until a swap is needed:
    • Elements in order: No Swap
    • Elements out of order: Swap needed

After the first pair of elements is swapped:

7 3 6 9 4 2 8 10 1 5
Continue with the remaining elements (comparisons: No Swap, No Swap, Swap, No Swap, Swap, Swap). The list after each swap:

1 3 6 9 4 2 8 10 7 5
1 3 2 9 4 6 8 10 7 5
1 3 2 4 9 6 8 10 7 5

New List:

Smaller Elements   Larger Elements
1 3 2 4            9 6 8 10 7 5

The left and right partitions are partially sorted.
Put the LEFT Partition (the smaller elements) in order, even though it is already partially sorted:

Step 1: Select Pivot: Midpoint = (bottom + top)/2 = (0 + 3)/2 = 1

Array Offset: 0 1 2 3
Element:      1 3 2 4

Repeat Step 2 with the partitioned list: No Swap, No Swap, Swap

We made 1 swap. Our new partitioned list appears as:

1 2 3 4
OK, so the list is in order. We can stop, Right???

Not really. The only way to be sure that the complete list is in order is to keep breaking the list down until no swaps are made or there is only one element on each sublist.

Looking at the left sublist:

1 2

All we know is that the elements on it are smaller than the elements on the right sublist. The order could have been:

2 1

Assume that it was the sublist above. We have to continue making sublists:

The list midpoint is (0 + 1)/2 = 0

Swap:

1 2

NOW we are done, since each sublist contains only one element.
Now put the RIGHT Partition (the larger elements) in Order:

Step 1: Select Pivot: Midpoint = (bottom + top)/2 = (4 + 9)/2 = 6

Array Offset: 4 5 6 7  8 9
Element:      9 6 8 10 7 5

Repeat Step 2 with the partitioned list:

Action    List after action
Swap      5 6 8 10 7 9
Swap      5 6 7 10 8 9
No Swap   5 6 7 10 8 9
Swap      5 6 7 8 10 9

New Partitioned List:

Smaller Elements   Larger Elements
5 6 7              8 10 9
Put the new LEFT Partition in Order (already sorted):

Step 1: Select Pivot: Midpoint = (bottom + top)/2 = (4 + 6)/2 = 5

Array Offset: 4 5 6
Element:      5 6 7

Repeat Step 2 with the partitioned list: No Swap, No Swap

Since no swaps were made, the partition is in order.
Once again, put the new RIGHT Partition in Order:

Step 1: Select Pivot: Midpoint = (bottom + top)/2 = (7 + 9)/2 = 8

Array Offset: 7 8  9
Element:      8 10 9

Repeat Step 2 with the partitioned list: Swap, No Swap

Step 1: Find new right pivot: Pivot = (8 + 9)/2 = 8 (Offsets: 8 9)

Note that since the (new) left partition contains only 1 (one) element, it MUST be in order.

Step 2: Check Order: Swap

And the new right list:

9 10

Is sorted (as is the whole list):

1 2 3 4 5 6 7 8 9 10
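The walkthrough above can be sketched in code. This is a minimal sketch, assuming a midpoint pivot as in the slides and a conventional left/right scanning partition; the function name quickSort is an assumption, and the exact order of comparisons and swaps may differ from the sequence shown on the slides.

```cpp
// Hypothetical sketch (not from the slides): recursive quicksort on a[bottom..top].
// The pivot is the element at the midpoint offset, (bottom + top)/2.
void quickSort(int a[], int bottom, int top)
{
    if (bottom >= top)                             // zero or one element: already in order
        return;

    int pivot = a[bottom + (top - bottom) / 2];    // Step 1: choose the pivot
    int left = bottom, right = top;

    while (left <= right) {                        // Step 2: partition the list
        while (a[left] < pivot)  ++left;           // skip elements already on the correct side
        while (a[right] > pivot) --right;
        if (left <= right) {                       // out-of-order pair found: swap it
            int temp = a[left];
            a[left] = a[right];
            a[right] = temp;
            ++left;
            --right;
        }
    }

    quickSort(a, bottom, right);                   // sort the smaller elements
    quickSort(a, left, top);                       // sort the larger elements
}
```

A call such as quickSort(iarray, 0, 9) on the list 7 2 6 9 4 3 8 10 1 5 leaves the array in ascending order. The recursion stops on sublists of one element, which matches the stopping rule described in the walkthrough.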