CSC 211 Data Structures Lecture 12

CSC 211 Data Structures Lecture 12 Dr. Iftikhar Azim Niaz ianiaz@comsats.edu.pk

Last Lecture Summary • Dynamic Representation • Allocation from Dynamic Storage • Returning unused storage back to dynamic storage • Linked List Operations • Insert • Delete

Objectives Overview • Cursor-based Implementation of List • Search operation • Concepts and Definitions • Sequential Search • Implementation of Sequential search • Complexity of Sequential Search

Comparison of Methods • If the bound to which the list can grow is not known, use the pointer implementation. • INSERT and DELETE take constant time in a linked list but take O(n) time in an array. • PREVIOUS and END take constant time in an array but O(n) time in a linked list. • An insertion or deletion that affects the element at the position denoted by some position variable, e.g. HEAD or TAIL, should be used with care.

Comparison of Methods • Array Implementation wastes space • since it uses maximum space irrespective of the number of elements in the list. • Linked list uses space proportional to the number of elements in the list, • but requires extra space to save the position pointers.

Cursor-based Implementation of List • Some languages do not support pointers, but we can simulate using cursors. • Create one array of records. • Each record consists of an element and an integer that is used as a cursor. • An integer variable LHead is used as a cursor to the header cell of the list L.

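Cursor-based Implementation of List • [Figure: an array of ten cells, indexed 1 to 10, with columns Element and next. It holds the lists L = a, b, c and M = d, e together with the list of available cells; the cursors L, M and available each give the index of the first cell of their list.]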

Cursor-based Implementation of List • [Figure: moving a cell from one list to another, using the cursors p and q and a temporary cursor temp.]
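The cursor idea can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the lecture's own code: the names space, available, cursor_init, cursor_move, cursor_push and cursor_pop are hypothetical, cursor value 0 is assumed to mean "no cell", and cells are moved only between the fronts of lists (a simplification of the general move between arbitrary positions).

```c
#define SPACE_SIZE 11   /* cells 1..10 are usable; cursor 0 means "no cell" */

struct cell {
    char element;
    int  next;          /* cursor: index of the next cell, or 0 at the end */
};

static struct cell space[SPACE_SIZE];
static int available;   /* cursor to the first free cell */

/* Chain every usable cell into the list of available cells. */
void cursor_init(void)
{
    for (int i = 1; i < SPACE_SIZE - 1; i++)
        space[i].next = i + 1;
    space[SPACE_SIZE - 1].next = 0;
    available = 1;
}

/* Move the first cell of the list headed by *from to the front of the
   list headed by *to. Returns 1 on success, 0 if *from is empty. */
int cursor_move(int *from, int *to)
{
    if (*from == 0)
        return 0;
    int temp = *from;
    *from = space[temp].next;
    space[temp].next = *to;
    *to = temp;
    return 1;
}

/* Insert x at the front of a list by taking a cell from the available list. */
int cursor_push(int *list, char x)
{
    if (!cursor_move(&available, list))
        return 0;       /* no free cell left */
    space[*list].element = x;
    return 1;
}

/* Delete the first cell of a list, returning it to the available list. */
int cursor_pop(int *list)
{
    return cursor_move(list, &available);
}
```

Allocation and deallocation are both just moves of one cell between two lists, which is why a single helper serves for both.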

Searching • A question you should always ask when selecting a search algorithm is • “How fast does the search have to be?” • The reason is that, in general, the faster the algorithm is, the more complex it is. • Bottom line: you do not always need, and should not always use, the fastest algorithm. • Let’s explore the following search algorithms, keeping speed in mind. • Sequential (linear) search • Binary search

Searching • A search algorithm is a method of locating a specific item of information in a larger collection of data. • Search Algorithms • The computer has organized data in its memory. • Now we look at various ways of searching for a specific piece of data, or for where to place a specific piece of data. • Each data item in memory has a unique identification called the key of the item.

What is Searching • Finding the location of the record with a given key value, or finding the locations of some or all records which satisfy one or more conditions. • Search algorithms start with a target value and employ some strategy to visit the elements looking for a match. • If target is found, the index of the matching element becomes the return value.

Linear Search • In computer science, linear search or sequential search is a method for finding a particular value in a list, that consists of checking every one of its elements, one at a time and in sequence, until the desired one is found • Linear search is the simplest search algorithm • Its worst case cost is proportional to the number of elements in the list; and so is its expected cost, if all list elements are equally likely to be searched for. • Therefore, if the list has more than a few elements, other methods (such as binary search or hashing) will be faster, but they also impose additional requirements.

Properties of Linear Search • It is easy to implement. • It can be applied to unsorted as well as sorted arrays. • It makes more comparisons than other search methods. • It is better suited to small inputs than to large ones.

Linear Search • very simple algorithm. • It uses a loop to sequentially step through an array, starting with the first element. • It compares each element with the value being searched for (key) and stops when that value is found or the end of the array is reached. • Can be applied to both sorted and unsorted list

Linear Search - Algorithm
set found to false
set position to –1
set index to 0
while (index < number of elements) and (found is false)
    if list[index] is equal to search value
        found = true
        position = index
    end if
    add 1 to index
end while
return position

Linear Search - Program
int LinSearch(int list[], int item, int size)
{
    int found = 0;
    int position = -1;
    int index = 0;
    while ((index < size) && (found == 0)) {
        if (list[index] == item) {
            found = 1;
            position = index;
        } // end if
        index++;
    } // end of while
    return position;
} // end of function LinSearch

Linear Search - Example • Array numlist contains: 17, 23, 5, 11, 2, 29, 3 • Searching for the value 11, linear search examines 17, 23, 5, and 11 • Searching for the value 7, linear search examines 17, 23, 5, 11, 2, 29, and 3
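The example can be traced with a small C sketch that also reports how many elements were examined. It assumes numlist holds 17, 23, 5, 11, 2, 29, 3, which is what the two traces above imply; the function name lin_search_count and its examined parameter are hypothetical additions for illustration.

```c
/* Returns the index of item in list[0..size-1], or -1 if it is absent.
   *examined receives the number of elements that were compared. */
int lin_search_count(const int list[], int size, int item, int *examined)
{
    *examined = 0;
    for (int index = 0; index < size; index++) {
        (*examined)++;
        if (list[index] == item)
            return index;
    }
    return -1;
}
```

With this array, searching for 11 stops after four elements, while searching for 7 examines all seven before reporting failure.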

Linear Search - Tradeoffs • Benefits: • Easy algorithm to understand • Array can be in any order • Disadvantages: • Inefficient (slow): • For array of N elements, examines N/2 elements on average for value in array, • N elements for value not in array

Linear Search Analysis • For a list with n items, • the best case is when the value is equal to the first element of the list, in which case only one comparison is needed. • The worst case is when the value is not in the list (or occurs only once at the end of the list), in which case n comparisons are needed.

Linear Search Analysis • If the value being sought occurs k times in the list, and all orderings of the list are equally likely, the expected number of comparisons is • n, if k = 0 • (n+1) / (k+1), if 1 <= k <= n • For example, if the value being sought occurs once in the list, and all orderings of the list are equally likely, the expected number of comparisons is (n+1) / 2 • However, if it is known that it occurs once, then at most n - 1 comparisons are needed, and the expected number of comparisons is • ((n+2)(n-1)) / (2n) • (for example, for n = 2 this is 1, corresponding to a single if-then-else construct). • Either way, asymptotically the worst-case cost and the expected cost of linear search are both O(n).
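The "known to occur once" figure can be checked by enumerating the n equally likely positions. The sketch below is illustrative only; both function names are hypothetical.

```c
/* Average number of comparisons when x is known to occur exactly once in
   a list of n items and every position is equally likely.
   A match at position i < n costs i comparisons; a match at position n
   costs n - 1, because after n - 1 failed comparisons the last item must
   be the one sought and need not be compared. */
double avg_comparisons_known_once(int n)
{
    long total = 0;
    for (int pos = 1; pos <= n; pos++)
        total += (pos < n) ? pos : n - 1;
    return (double)total / n;
}

/* Closed form quoted in the analysis: ((n+2)(n-1)) / (2n). */
double closed_form_known_once(int n)
{
    return (double)(n + 2) * (n - 1) / (2.0 * n);
}
```

The sum is (1 + 2 + ... + (n-1)) + (n-1) = (n-1)(n+2) / 2, and dividing by n gives the closed form.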

Non-Uniform Probabilities • The performance of linear search improves if the desired value is more likely to be near the beginning of the list than near its end. • Therefore, if some values are much more likely to be searched for than others, it is desirable to place them at the beginning of the list. • In particular, when the list items are arranged in order of decreasing probability, and these probabilities are geometrically distributed, the expected cost of linear search is only O(1). • In that case, if the table size n is large enough, linear search will be faster than binary search, whose cost is O(log n).
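To see why a geometric distribution gives a constant expected cost, consider one concrete case. The sketch below assumes that the item at position i (counting from 1) is the target with probability 1/2^i, and that the last position absorbs the leftover probability so the total is 1; this particular distribution and the function name are assumptions chosen for illustration.

```c
/* Expected number of comparisons for a successful linear search when
   position i (1-based) is the target with probability 1/2^i and the last
   position takes the remaining probability 1/2^(n-1). */
double expected_cost_geometric(int n)
{
    double expected = 0.0;
    double prob = 0.5;                  /* probability of position 1 */
    for (int i = 1; i < n; i++) {
        expected += i * prob;
        prob /= 2.0;
    }
    /* Here prob == 1/2^n, so the last position has probability 2 * prob. */
    expected += n * (2.0 * prob);
    return expected;
}
```

Under these assumptions the expected cost works out to 2 - 1/2^(n-1), so it stays below 2 comparisons however long the list is.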

Linear Search - Application • Linear search is usually very simple to implement, and is practical when the list has only a few elements, or when performing a single search in an unordered list. • When many values have to be searched for in the same list, it often pays to pre-process the list in order to use a faster method. • For example, one may sort the list and use binary search, or build an efficient search data structure from it. Should the content of the list change frequently, repeated re-organization may be more trouble than it is worth. • As a result, even though in theory other search algorithms (for instance binary search) may be faster than linear search, in practice on small to medium-sized arrays (around 100 items or fewer) it may not be worthwhile to use anything else. On larger arrays, it only makes sense to use other, faster search methods if the data is large enough, because the initial time to prepare (sort) the data is comparable to the time taken by many linear searches.

Linear Search - Pseudocode • Forward iteration • This pseudocode describes a typical variant of linear search, where the result of the search is supposed to be either the location of the list item where the desired value was found; or an invalid location Λ, to indicate that the desired element does not occur in the list. For each item in the list: if that item has the desired value, stop the search and return the item's location. Return Λ.

Linear Search - Pseudocode • In this pseudocode, the last line is executed only after all list items have been examined with none matching. • If the list is stored as an array data structure, the location may be the index of the item found (usually between 1 and n, or 0 and n−1). • In that case the invalid location Λ can be any index before the first element (such as 0 or −1, respectively) or after the last one (n+1 or n, respectively). • If the list is a singly linked list, then the item's location is its reference, and Λ is usually the null pointer.
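For the linked-list case, the pseudocode might be rendered in C as below. The struct node layout and the name list_search are hypothetical; NULL plays the role of the invalid location Λ.

```c
#include <stddef.h>

struct node {
    int value;
    struct node *next;
};

/* Returns a pointer to the first node holding x, or NULL (the invalid
   location) if no node in the list holds x. */
struct node *list_search(struct node *head, int x)
{
    for (struct node *p = head; p != NULL; p = p->next)
        if (p->value == x)
            return p;
    return NULL;
}
```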

Searching in Reverse Order • Linear search in an array is usually programmed by stepping up an index variable until it reaches the last index. • This normally requires two comparison instructions for each list item: • one to check whether the index has reached the end of the array, and • another one to check whether the item has the desired value. • In many computers, one can reduce the work of the first comparison by scanning the items in reverse order.

Searching in Reverse Order • Suppose an array A with elements indexed 1 to n is to be searched for a value x. The following pseudocode performs a forward search, returning n + 1 if the value is not found:
Set i to 1.
Repeat this loop:
    If i > n, then exit the loop.
    If A[i] = x, then exit the loop.
    Set i to i + 1.
Return i.

Searching in Reverse Order • The following pseudocode searches the array in the reverse order, returning 0 when the element is not found:
Set i to n.
Repeat this loop:
    If i ≤ 0, then exit the loop.
    If A[i] = x, then exit the loop.
    Set i to i − 1.
Return i.
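A C rendering of the reverse search might look like this. To keep the 1-to-n indexing of the pseudocode, the sketch assumes the caller leaves A[0] unused; the function name is hypothetical.

```c
/* Searches A[1..n] from the last element down to the first; A[0] is
   unused. Returns the index of x, or 0 if x is not found. */
int reverse_search(const int A[], int n, int x)
{
    int i = n;
    while (i > 0 && A[i] != x)
        i--;
    return i;
}
```

The end-of-array test is a comparison against the constant 0, which is the saving the slide refers to. If x occurs more than once, this version returns its last occurrence rather than its first.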

Using a Sentinel • Another way to reduce the overhead is to eliminate all checking of the loop index. • This can be done by inserting the desired item itself as a sentinel value at the far end of the list, as in this pseudocode:
Set A[n + 1] to x.
Set i to 1.
Repeat this loop:
    If A[i] = x, then exit the loop.
    Set i to i + 1.
Return i.

Using a Sentinel - Analysis • With this stratagem, it is not necessary to check the value of i against the list length n: • even if x was not in A to begin with, • the loop will terminate when i = n + 1. • However, this method is possible only if the array slot A[n + 1] exists but is not otherwise being used. • Similar arrangements could be made if the array were to be searched in reverse order, and element A[0] were available. • Although the effort avoided by these ploys is tiny, it is still a significant component of the overhead of performing each step of the search, which is small. • Only if many elements are likely to be compared will it be worthwhile considering methods that make fewer comparisons but impose other requirements.
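The sentinel variant might be sketched in C as follows. It assumes the caller allocates at least n + 2 elements, so that A[0] is unused and A[n + 1] is a spare, writable slot; the function name is hypothetical.

```c
/* Searches A[1..n]. The caller must provide a writable spare slot
   A[n + 1]. Returns the index of x, or n + 1 if x is not in A[1..n]. */
int sentinel_search(int A[], int n, int x)
{
    A[n + 1] = x;           /* the sentinel guarantees the loop terminates */
    int i = 1;
    while (A[i] != x)       /* no index check is needed inside the loop */
        i++;
    return i;
}
```

Note that the array parameter cannot be const here, since the search itself writes the sentinel.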

Linear Search on an Ordered List • For ordered lists that must be accessed sequentially, • such as linked lists or files with variable-length records lacking an index, • the average performance can be improved by giving up at the first element that is greater than the target value, rather than examining the entire list. • If the list is stored as an ordered array, then binary search is almost always more efficient than linear search (for n > 8, say), unless there is some reason to suppose that most searches will be for the small elements near the start of the sorted list.

Sequential Search on an Unordered File • Basic algorithm:
Get the search criterion (key)
Get the first record from the file
While ( (record != key) and (still more records) )
    Get the next record
End_while
• When do we know that there wasn’t a record in the file that matched the key?

Sequential Search on an Ordered File • Basic algorithm:
Get the search criterion (key)
Get the first record from the file
While ( (record < key) and (still more records) )
    Get the next record
End_while
If ( record = key )
    Then success
Else
    there is no match in the file
End_else
• When do we know that there wasn’t a record in the file that matched the key?
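The ordered-file algorithm can be sketched in C over an ascending array standing in for the file. This is an illustration under that assumption; the function name is hypothetical.

```c
/* list[0..size-1] must be in ascending order.
   Returns the index of key, or -1 if there is no match. */
int ordered_search(const int list[], int size, int key)
{
    int i = 0;
    while (i < size && list[i] < key)   /* skip records smaller than key */
        i++;
    if (i < size && list[i] == key)
        return i;                       /* success */
    return -1;                          /* passed the place where key would be */
}
```

This answers the question on the slide: in the ordered case we know there is no match as soon as we read a record greater than the key, whereas in the unordered case we only know after reading the last record.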

Sequential Search of Ordered vs. Unordered List • Let’s do a comparison. • If the order was ascending alphabetical on customers’ last names, how would the search for John Adams on the ordered list compare with the search on the unordered list? • Unordered list • if John Adams was in the list? • if John Adams was not in the list? • Ordered list • if John Adams was in the list? • if John Adams was not in the list?

Ordered Vs. Unordered (Cont…) • How about George Washington? • Unordered • if George Washington was in the list? • If George Washington was not in the list? • Ordered • if George Washington was in the list? • If George Washington was not in the list? • How about James Madison?

Ordered Vs. Unordered (Cont…) • Observation: the search is faster on an ordered list only when the item being searched for is not in the list. • Also, keep in mind that the list has to first be placed in order for the ordered search. • Conclusion: the efficiency of these algorithms is roughly the same. • So, if we need a faster search, we need a completely different algorithm. • How else could we search an ordered file?

Comparing Algorithms • Before we can compare different methods of searching (or sorting, or any algorithm), we need to think a bit about the time requirements for the algorithm to complete its task. • We could also compare algorithms by the amount of memory needed • For the code • For execution (work space)

Comparing Algorithms • An algorithm can require different times to solve different problems of the same size ( a measure of efficiency) • For example, the time it takes an algorithm to search for the integer ‘1’ in an array of 100 integers depends on the nature of the array • are they sorted already? • if so, ‘1’ may be at the start or end

Order: A Comparison Tool • Most of the time we consider the maximum amount of time that an algorithm can require • We call this worst-case analysis • Worst-case analysis states that an algorithm is O(f(n)) if it will not take any more than k * f(n) time units for all but a finite number of values of n. • Read the ‘big-O’, O(…), as ‘on the order of’ • f(n) is a function describing how the time or memory requirements increase with increasing problem size (increasing values of n).

Order • The worst-case scenario doesn’t mean the algorithm will always be slow, but that it is guaranteed never to take more time than the given bound • This is called an asymptotic bound • Remember those asymptotes from algebra (same thing) • Sometimes, the worst case happens very rarely (if at all) in practice

Average Performance • A harder to calculate metric is an algorithm’s average-case performance • Average-case analysis uses probabilities of problem sizes and problems of a given size to determine how it will act on average • We won’t worry about calculating the average-case performance at this point

Sequential Search • If the item we are looking for is the first item, the search is O(1). • This is the best-case scenario • If the target item is the last item (item n), the search takes O(n). • This is the worst-case scenario. • On average, the item will tend to be near the middle (n/2) but this can be written (½*n), and as we will see, we can ignore multiplicative coefficients. Thus, the average-case is still O(n)

Sequential Search - Analysis • To determine the average number of comparisons in the successful case of the sequential search algorithm: • Consider all possible cases. • Find the number of comparisons for each case. • Add the number of comparisons and divide by the number of cases. • If the search item, called the target, is the first element in the list, one comparison is required. • If it is the second element in the list, two comparisons are required. • If it is the nth element in the list, n comparisons are required

Sequential Search - Analysis • The following expression gives the average number of comparisons to find an item in a list of size n: (1 + 2 + ... + n) / n • It is known that: 1 + 2 + ... + n = n(n+1) / 2 • Therefore, the following expression gives the average number of comparisons made by the sequential search in the successful case: (n(n+1) / 2) / n = (n+1) / 2
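The "consider all cases, add, and divide" procedure can be carried out mechanically. The C sketch below searches for every element of a list in turn and averages the comparison counts; it assumes the elements are distinct, and both function names are hypothetical.

```c
/* Counts the comparisons a sequential search makes while looking for
   target in list[0..n-1]. */
static int count_comparisons(const int list[], int n, int target)
{
    int comparisons = 0;
    for (int i = 0; i < n; i++) {
        comparisons++;
        if (list[i] == target)
            break;
    }
    return comparisons;
}

/* Average over the n successful cases: each element is the target once.
   Assumes the elements of the list are distinct. */
double average_successful(const int list[], int n)
{
    long total = 0;
    for (int i = 0; i < n; i++)
        total += count_comparisons(list, n, list[i]);
    return (double)total / n;
}
```

For a list of 8 distinct items the total is 1 + 2 + ... + 8 = 36, so the average is 4.5, matching (n+1) / 2.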

Sequential Search • So, the time that sequential search takes is proportional to the number of items to be searched • Another way of saying the same thing using the Big-O notation is: • O(n) • A sequential search is of order n

Linear Search • Example 1: index = seqSearch(arr, 1, 8, 3); target = 3 • Array arr, indices 1 to 8: 6 4 2 9 5 3 10 7 • Match at index 6; return index 6 • Example 2: index = seqSearch(arr, 1, 8, 11); target = 11 • Array arr, indices 1 to 8: 6 4 2 9 5 3 10 7 • No match; return index 0

Linear Search Algorithm
Input: An array A with n elements and the particular element X to be found
Output: Whether element X exists or not
For i := 1 to n
    IF (A[i] = X) THEN
        Print: Item exists
        End Algorithm
End For
Print: Item does not exist in array
EXIT

Linear Search Tracing • Let’s search for the number 3. We start at the beginning and check the first element in the array. Is it 3? No, not it. Is it the next element? Not there either. The next element?

Linear Search Tracing • Not there either. Next? We found it! Now you understand the idea of linear searching: we go through each element, in order, until we either find the correct value or reach the very end without finding it.

Linear Search • Consider a membership file in which each record contains, among other data the name and telephone number of its member. Suppose we are given the name of a member and we want to find his or her telephone number. One way to do this is to linearly search through the file, that is, apply the Linear Search: • Search each record of the file, one at a time, until finding the given Name and hence the corresponding telephone number
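The membership-file lookup might be sketched in C as follows, with an in-memory array standing in for the file. The struct member layout and the function name find_phone are hypothetical, chosen only to illustrate the idea.

```c
#include <stddef.h>
#include <string.h>

struct member {
    const char *name;
    const char *phone;
};

/* Searches the n records one at a time for the given name.
   Returns the matching member's telephone number, or NULL if no
   record carries that name. */
const char *find_phone(const struct member file[], size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(file[i].name, name) == 0)
            return file[i].phone;
    return NULL;
}
```

Here the name acts as the key, and each loop iteration costs one string comparison, so the running time grows with the number of records examined.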

Linear Search Complexity • First of all, it is clear that the time required to execute the algorithm is proportional to the number of comparisons. • Also, assuming that each name in the file is equally likely to be picked, it is intuitively clear that the average number of comparisons for a file with n records is equal to n/2; • that is, the complexity of the linear search algorithm is given by O(n) for average case
