Military Technical Academy Data Structures and Algorithms. Lecturer: Dr. Nguyen Nam Hong Tel: 04 8781 437 Mob: 0912 312 816 Email: nguyennamhong2003@yahoo.com.au Lecture 8. Searching Algorithms.
Lecture 8. Searching Algorithms (1/2) References: • Data structures and Algorithms Searching.htm • Kyle Loudon Mastering Algorithms Chapter 12 Sorting and Searching • Lecture 19 Sequential and Binary Search.htm • Sedgewick Algorithms 14. Elementary Searching Methods
Lecture 8. Searching Algorithms (2/2) Contents: 8.1. Searching Concepts 8.2. Linear Search 8.3. Binary Search 8.4. Interpolation Search
8.1. Searching Concepts (1/3) • The problem of locating an element in a list (ordered or not) occurs in many contexts. • For instance, a program that checks the spelling of words searches for them in a dictionary, which is just an ordered list of words. • Problems of this kind are called searching problems.
8.1. Searching Concepts (2/3) • There are many searching algorithms. • The most natural searching method is linear search (also called sequential search or exhaustive search), which is very simple but slow on large lists.
8.1. Searching Concepts (3/3) • A binary search repeatedly halves a sorted list to locate an item, and for large lists it is much faster than linear search. • Like a binary search, an interpolation search repeatedly subdivides a sorted list to locate an item. • On evenly distributed keys, interpolation search can be much faster than binary search because it makes a reasonable guess about where the target item should lie.
8.2. Linear Search (1/8) • This is a very simple algorithm. • It uses a loop to sequentially step through an array, starting with the first element. • It compares each element with the value being searched for and stops when that value is found or the end of the array is reached.
8.2. Linear Search (2/8)
Sub LinearSearch(x: Int, a[]: Int, n: Int, loc: Int)
  'n is the number of elements; loc returns the position of x, or 0 if absent
  i := 1
  While (i <= n) And (x <> a[i])
    i := i + 1
  End While
  If i <= n Then loc := i Else loc := 0
End Sub
8.2. Linear Search (3/8) • Array numlist contains 17, 23, 5, 11, 2, 29, 3. • Searching for the value 11, linear search examines 17, 23, 5, and 11 -> Found (loc = 4) • Searching for the value 7, linear search examines 17, 23, 5, 11, 2, 29, and 3 -> Not Found (loc = 0)
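The example above can be replayed with a short Python sketch. The function name `linear_search` and the 1-based return value (0 meaning "not found") are choices made here to mirror the pseudocode's `loc`; the array holds the seven values the example examines.

```python
def linear_search(a, x):
    """Return the 1-based position of x in a, or 0 if x is absent."""
    for position, value in enumerate(a, start=1):
        if value == x:
            return position  # stop at the first match
    return 0  # reached the end without finding x


numlist = [17, 23, 5, 11, 2, 29, 3]
print(linear_search(numlist, 11))  # 4
print(linear_search(numlist, 7))   # 0
```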
8.2. Linear Search (4/8) • The advantage is its simplicity. • It is easy to understand. • It is easy to implement. • It does not require the array to be in order. • The disadvantage is its inefficiency. • If there are 20,000 items in the array and the one you are looking for is the 19,999th element, you must examine almost the entire list.
8.2. Linear Search (5/8) • Whenever the number of entries doubles, the running time roughly doubles too. • If a machine does 1 million comparisons per second, 4 billion comparisons take about 4,000 seconds, or roughly 67 minutes.
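8.2. Linear Search (7/8) Use a Sentinel to Improve the Performance
Sub LinearSearch2(x: Int, a[]: Int, n: Int, loc: Int)
  'Place x just past the last element so the loop needs no bounds test
  a[n+1] = x
  i = 1
  While (x <> a[i])
    i = i + 1
  End While
  'n is unchanged, so i = n + 1 means only the sentinel matched
  If i <= n Then loc = i Else loc = 0
End Sub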
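A minimal Python sketch of the sentinel idea follows. It assumes the list may be extended temporarily by one element; the name `linear_search_sentinel` and the step that removes the sentinel afterwards are additions made here, not part of the slide.

```python
def linear_search_sentinel(a, x):
    """Return the 1-based position of x in a, or 0 if x is absent."""
    n = len(a)
    a.append(x)  # sentinel: guarantees the loop below terminates
    i = 0
    while a[i] != x:  # a single comparison per step, no bounds test
        i += 1
    a.pop()  # remove the sentinel so the caller's list is unchanged
    return i + 1 if i < n else 0  # i == n means only the sentinel matched


numlist = [17, 23, 5, 11, 2, 29, 3]
print(linear_search_sentinel(numlist, 11))  # 4
print(linear_search_sentinel(numlist, 7))   # 0
```

The saving is one comparison per loop iteration; the number of elements examined is the same as in the plain version.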
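8.2. Linear Search (8/8) Apply Linear Search to Sorted Lists
Sub LinearSearch3(x: Int, a[]: Int, n: Int, loc: Int)
  i = 1
  'Stop as soon as an element >= x is met, or the list is exhausted
  While (i <= n) And (x > a[i])
    i = i + 1
  End While
  If (i <= n) And (a[i] = x) Then loc = i Else loc = 0
End Sub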
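The early exit on a sorted list can be sketched in Python as below. The function name and the sample list are illustrative assumptions; the bounds test in the loop guards against a target larger than every element.

```python
def linear_search_sorted(a, x):
    """Linear search in an ascending list; returns a 1-based position or 0."""
    i, n = 0, len(a)
    # Skip elements smaller than x; stop early once a[i] >= x.
    while i < n and a[i] < x:
        i += 1
    if i < n and a[i] == x:
        return i + 1
    return 0  # x would belong at index i, but it is not there


sorted_list = [2, 3, 5, 11, 17, 23, 29]
print(linear_search_sorted(sorted_list, 11))  # 4
print(linear_search_sorted(sorted_list, 7))   # 0 (stops at 11, not at the end)
print(linear_search_sorted(sorted_list, 30))  # 0 (runs off the end safely)
```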
8.3. Binary Search (1/9) Can We Search More Efficiently? • Yes, provided the list is in some kind of order, for example alphabetical order with respect to the names. • If this is the case, we use a "divide and conquer" strategy to find an item quickly. • This strategy is what one would use in a "number guessing game", for example.
8.3. Binary Search (2/9) I'm Thinking of A Number… • … between 1 and 1000. Guess it! • Is it 500? Nope, too low. • Is it 750? Nope, too high. • Is it 625? … etc… This strategy guarantees a correct guess in no more than ten guesses!
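The ten-guess bound can be checked by brute force. The sketch below plays the halving strategy against every possible secret from 1 to 1000; the function name `guesses_needed` and the choice of the lower midpoint as the guess are assumptions made for illustration (they reproduce the guesses 500, 750, 625).

```python
def guesses_needed(secret, low=1, high=1000):
    """Count the guesses the halving strategy needs to hit `secret`."""
    count = 0
    while low <= high:
        guess = (low + high) // 2  # always guess the middle of the range
        count += 1
        if guess == secret:
            return count
        if guess < secret:  # "too low": discard the lower part
            low = guess + 1
        else:               # "too high": discard the upper part
            high = guess - 1
    raise ValueError("secret is outside the range")


worst = max(guesses_needed(s) for s in range(1, 1001))
print(worst)  # 10
```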
8.3. Binary Search (3/9) Apply This Strategy to Searching • The resulting algorithm is called the "Binary Search" algorithm. • We check the middle key in our list. • If the middle key is larger than the one we are looking for (too high), we continue only in the first half of the list, the keys before the middle. • If it is smaller (too low), we continue only in the second half, the keys after the middle. • Then iterate!
8.3. Binary Search (4/9) 1. Divide a sorted array into three sections: • the middle element • the elements on one side of the middle element • the elements on the other side of the middle element 2. If the middle element is the correct value, done. Otherwise, go to step 1, using only the half of the array that may contain the correct value.
8.3. Binary Search (5/9) • Continue steps 1 and 2 until either the value is found or there are no more elements to examine.
8.3. Binary Search (6/9) Binary Search Example • Array numlist2 contains 2, 3, 5, 11, 17, 23, 29. • Searching for the value 11, binary search examines 11 and stops. Found. • Searching for the value 7, binary search examines 11, 3, 5, and stops. Not Found.
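8.3. Binary Search (7/9) Algorithm for Binary Search
Sub BinarySearch(x: Int, a[]: Int, n: Int, loc: Int)
  'a[1..n] must be sorted in ascending order, n >= 1
  i = 1: j = n
  While i < j
    m = (i + j) \ 2   'integer division
    If x > a[m] Then i = m + 1 Else j = m
  End While
  'The range has shrunk to one candidate position
  If x = a[i] Then loc = i Else loc = 0
End Sub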
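A Python sketch of the same variant is given below: the range is narrowed until one candidate remains, and only then is that candidate compared for equality. The sample array is assumed to be the sorted values 2, 3, 5, 11, 17, 23, 29, and the empty-list guard is an addition made here.

```python
def binary_search(a, x):
    """Return the 1-based position of x in ascending list a, or 0 if absent."""
    if not a:
        return 0
    i, j = 0, len(a) - 1
    while i < j:
        m = (i + j) // 2
        if x > a[m]:
            i = m + 1  # x can only be to the right of m
        else:
            j = m      # x is at m or to its left
    # Exactly one candidate is left.
    return i + 1 if a[i] == x else 0


numlist2 = [2, 3, 5, 11, 17, 23, 29]
print(binary_search(numlist2, 11))  # 4
print(binary_search(numlist2, 7))   # 0
```

Because this variant keeps narrowing to the left whenever `x <= a[m]`, it reports the first occurrence when the list contains duplicates of `x`.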
8.3. Binary Search (8/9) • The worst-case number of comparisons grows by only 1 every time the list size is doubled. • Only about 32 comparisons are needed on a list of 4 billion items using binary search. (Sequential search could need all 4 billion comparisons, which takes over an hour at 1 million comparisons per second.)
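The growth rates can be tabulated with a few lines of Python. This is a rough sketch that counts one probe of the middle element per step and takes the rate of 1 million comparisons per second as a given; the helper names are hypothetical.

```python
COMPARISONS_PER_SECOND = 1_000_000  # assumed machine speed


def linear_worst(n):
    """Worst case for linear search: every element is examined."""
    return n


def binary_worst(n):
    """Worst case for binary search: floor(log2 n) + 1 probes (n >= 1)."""
    return n.bit_length()  # integer form of floor(log2 n) + 1


for n in (1_000, 2_000, 4_000, 4_000_000_000):
    print(n, linear_worst(n), binary_worst(n))
# 1000 1000 10
# 2000 2000 11
# 4000 4000 12
# 4000000000 4000000000 32

seconds = linear_worst(4_000_000_000) / COMPARISONS_PER_SECOND
print(seconds)  # 4000.0 seconds, a little under 67 minutes
```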
8.3. Binary Search (9/9) • Benefit • Much more efficient than linear search. • For an array of N elements, it examines at most about log2 N + 1 elements. • Disadvantage • Requires that the array elements be sorted.
8.4. Interpolation Search (1/9) • Binary search is a great improvement over linear search because it eliminates large portions of the list without actually examining all the eliminated values. • If we know that the values are fairly evenly distributed, we can use interpolation to eliminate even more values at each step.
8.4. Interpolation Search (2/9) • Interpolation is the process of using known values to guess where an unknown value lies. • We use the indexes of known values in the list to guess what index the target value should have. • Interpolation search selects the dividing point by interpolation, using the following code:
m = l + (x - a[l])*(r - l)/(a[r] - a[l])
8.4. Interpolation Search (3/9) • Compare x to a[m] • If x = a[m]: Found. • If x < a[m]: set r = m - 1 • If x > a[m]: set l = m + 1 • If the search is not finished, continue with the new l and r. • Stop searching when Found or x < a[l] or x > a[r].
8.4. Interpolation Search (4/9) Example: Find the key x = 32 in the list
Index:  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
Value:  1  4  7  9  9 12 13 17 19 21 24 32 36 44 45 54 55 63 66 70
(each computed m is rounded to the nearest integer)
1: l=1, r=20 -> m=1+(32-1)*(20-1)/(70-1) = 10; a[10]=21<32=x -> l=11
2: l=11, r=20 -> m=11+(32-24)*(20-11)/(70-24) = 13; a[13]=36>32=x -> r=12
3: l=11, r=12 -> m=11+(32-24)*(12-11)/(32-24) = 12; a[12]=32=x -> Found at m = 12
8.4. Interpolation Search (5/9) Example: Find the key x = 30 in the list
Index:  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
Value:  1  4  7  9  9 12 13 17 19 21 24 32 36 44 45 54 55 63 66 70
(each computed m is rounded to the nearest integer)
1: l=1, r=20 -> m=1+(30-1)*(20-1)/(70-1) = 9; a[9]=19<30=x -> l=10
2: l=10, r=20 -> m=10+(30-21)*(20-10)/(70-21) = 12; a[12]=32>30=x -> r=11
3: l=10, r=11 -> x=30>24=a[11]=a[r]: Not Found (the formula would give m=10+(30-21)*(11-10)/(24-21) = 13 > 11 = r)
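The two searches on this 20-element list can be traced with the Python sketch below. It is one possible reading of the procedure, under two assumptions labeled in the comments: the computed position is rounded to the nearest integer, and the search stops as soon as x falls outside a[l..r]. The function name and the returned list of probed positions are illustrative.

```python
def interpolation_search(a, x):
    """Return (position, probes) for x in ascending list a.

    position is 1-based (0 if x is absent); probes lists the 1-based
    positions that were examined, in order.
    """
    probes = []
    lo, hi = 0, len(a) - 1
    # Stop rule: give up once x lies outside a[lo..hi].
    while lo <= hi and a[lo] <= x <= a[hi]:
        if a[lo] == a[hi]:
            m = lo  # all keys in range are equal; avoids dividing by zero
        else:
            num = (x - a[lo]) * (hi - lo)
            den = a[hi] - a[lo]
            # Assumption: round num/den to the nearest integer.
            m = lo + (2 * num + den) // (2 * den)
        probes.append(m + 1)
        if a[m] == x:
            return m + 1, probes
        if a[m] < x:
            lo = m + 1
        else:
            hi = m - 1
    return 0, probes


keys = [1, 4, 7, 9, 9, 12, 13, 17, 19, 21,
        24, 32, 36, 44, 45, 54, 55, 63, 66, 70]
print(interpolation_search(keys, 32))  # (12, [10, 13, 12])
print(interpolation_search(keys, 30))  # (0, [9, 12])
```

With a different rounding rule (for example truncation) the probed positions change, but the final answers do not.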
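8.4. Interpolation Search (6/9)
Private Sub Interpolation(a[]: Int, x: Int, n: Int, Found: Boolean)
  Found = False
  l = 1: r = n
  Do While (r > l)
    'Assumes a[r] <> a[l]; otherwise the division fails
    m = l + ((x - a[l]) / (a[r] - a[l])) * (r - l)
    'Verify and decide what to do next
  Loop
  'A single remaining element has not been compared yet
  If (l = r) And (a[l] = x) Then Found = True
End Sub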
8.4. Interpolation Search (7/9)
'Verify and decide what to do next
If (m < l) Or (m > r) Then
  Found = False   'x lies outside a[l..r]
  Exit Do
ElseIf (a[m] = x) Then
  Found = True
  Exit Do
ElseIf (a[m] < x) Then
  l = m + 1
Else
  r = m - 1
End If
8.4. Interpolation Search (8/9) • Binary search is very fast (O(log n)), but on evenly distributed keys interpolation search is faster still, averaging O(log log n) steps. • For n = 2^32 (about four billion items) • Binary search takes about 32 steps of verification. • Interpolation search takes only about log2(32) = 5 steps of verification.
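The dependence on evenly distributed keys can be seen in a small experiment. The sketch below counts how many elements each method examines; the two test lists (evenly spaced multiples of 3, and powers of 2 as a deliberately uneven case) and the use of truncating division are assumptions chosen for illustration, not measurements from the lecture.

```python
def binary_probes(a, x):
    """Count the elements binary search examines while looking for x."""
    lo, hi, probes = 0, len(a) - 1, 0
    while lo <= hi:
        m = (lo + hi) // 2
        probes += 1
        if a[m] == x:
            break
        if a[m] < x:
            lo = m + 1
        else:
            hi = m - 1
    return probes


def interpolation_probes(a, x):
    """Count the elements interpolation search examines while looking for x."""
    lo, hi, probes = 0, len(a) - 1, 0
    while lo <= hi and a[lo] <= x <= a[hi]:
        if a[lo] == a[hi]:
            m = lo
        else:
            # Truncating division is used here for simplicity.
            m = lo + (x - a[lo]) * (hi - lo) // (a[hi] - a[lo])
        probes += 1
        if a[m] == x:
            break
        if a[m] < x:
            lo = m + 1
        else:
            hi = m - 1
    return probes


evenly = list(range(0, 300_000, 3))   # 100,000 evenly spaced keys
sample = evenly[::997]
print(max(interpolation_probes(evenly, k) for k in sample))  # 1
print(max(binary_probes(evenly, k) for k in sample))         # at most 17

skewed = [2 ** i for i in range(40)]  # very unevenly spaced keys
print(interpolation_probes(skewed, 2 ** 20))  # 21
print(binary_probes(skewed, 2 ** 20))         # 5
```

On perfectly even data the interpolated guess lands on the key immediately, while on the skewed list each guess advances only one position, so interpolation search behaves like a linear scan there.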
8.4. Interpolation Search (9/9) • On evenly distributed keys, the running time of interpolation search grows so slowly that it is nearly constant over a large range of n. • Interpolation search is even more useful when the data is stored on a hard disk or another relatively slow device, where each access is expensive.