This lecture covers the fundamentals of searchable data in computer programming. Explore how different Abstract Data Types (ADTs) are used for searching, including Maps and Dictionaries, which are essential for effective data handling. We will discuss methods for adding, removing, and accessing data, and the significance of search strategies in unknown data scenarios. Additionally, learn about unique keys in Maps, the relationship between keys and values, and gain insights into the implementation of sequence-based searching.
CSC 213 – Large Scale Programming Lecture 10: Searching & Mapping
Today’s Goal • Consider the basics of searchable data • How do we search using a computer? • What are our goals while searching? • ADTs used for search & how would they work? • Most critically, where the $&*#%$# are my keys? • How do Map & Dictionary ADT work and search? • Methods to add, remove, and access data? • How Sequence is used to implement these • When & why would we use a Sequence-based approach?
Searching • Search for unknown data in most cases • Consider the reverse: why search for what you have? • Seek data related to terms used • Already have idea, want web pages containing terms • Get encoded proteins given set of amino acids • Given “borrowed” credit cards, get credit limits • Exacting, but boring, work doing these searches • Makes this work ideal for computers & students
Map-Based Bartender • “I’ll have a Manhattan.” • “No problem.” • ¾ oz sweet vermouth, 2½ oz bourbon, 1 dash bitters, 1 maraschino cherry, 1 twist orange peel
Map-Based Bartender • “I’ll have a Manhattan.” • “That’ll be $2 billion.”
Map-Based Bartender • “I’ll have a Manhattan.” • (diagram labels: key, value)
Search Terms • Key gets valuables • We already have key • Want value as a result of this • Map works similarly • Give it key, value returned • Uses Entry to do this work
Entry Interface • Need a key to get valuables • key used to search – it is what we already have • What we want is the result of search – value
interface Entry<K,V> {
    K key();
    V value();
}
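As a minimal sketch of how the Entry interface might be implemented: the interface is copied from the slide, while the class name SimpleEntry and its immutability are assumptions made for illustration.

```java
// Interface as given on the slide.
interface Entry<K, V> {
    K key();
    V value();
}

// Hypothetical implementation; the name SimpleEntry is an assumption.
final class SimpleEntry<K, V> implements Entry<K, V> {
    private final K key;     // what we search with
    private final V value;   // what we want back

    SimpleEntry(K key, V value) {
        this.key = key;
        this.value = value;
    }

    @Override
    public K key() {
        return key;
    }

    @Override
    public V value() {
        return value;
    }
}
```

An Entry simply pairs the two pieces of data, so a Map or Dictionary can store one object per key-value pair.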
Map Method Madness, Mmmm… • Describes a searchable Collection • put(K key, V value) adds data as an Entry • remove(K key) removes Entry containing key • get(K key) returns value associated with key • Several Iterable methods are also defined • Methods to use are entries(), keys(), & values() • Iterates over expected data so can use in for(-each) loops • Also defines usual Collection methods • isEmpty() & size()
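The iteration methods can be illustrated with Java's standard library, which is an assumption here since the course uses its own Map ADT: java.util.Map offers entrySet(), keySet(), and values() as rough analogues of entries(), keys(), and values(). The class name MapIterationDemo and the sample prices are made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class MapIterationDemo {
    // Builds a small map of drink names to prices (made-up sample data).
    static Map<String, Integer> sampleMenu() {
        Map<String, Integer> menu = new TreeMap<>(); // TreeMap iterates in key order
        menu.put("Manhattan", 12);
        menu.put("Martini", 11);
        menu.put("Daiquiri", 9);
        return menu;
    }

    // Walks the entries in a for-each loop, like entries() on the slide.
    static List<String> describe(Map<String, Integer> menu) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Integer> e : menu.entrySet()) {
            lines.add(e.getKey() + " -> " + e.getValue());
        }
        return lines;
    }

    // Sums the values, like values() on the slide.
    static int total(Map<String, Integer> menu) {
        int sum = 0;
        for (int price : menu.values()) {
            sum += price;
        }
        return sum;
    }
}
```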
Searching Through a Map • Map is a Collection of key-value pairs • Give it key & get value in return from ADT • Now we have ADT to work with searchable data • Many searches unsuccessful • Unsuccessful search is normal, not unusual • Expected events should NOT throw exceptions • This is normal; return null when nothing found
At Most 1 Value Per Key • Entrys have unique keys in a Map • If key exists, put(key, value) replaces existing Entry • Returns prior value for key in the Map so it’s not lost • If key not in Map before call, null returned
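The put and get behavior described above can be tried out with java.util.HashMap, whose put() and get() happen to follow the same contract. Using the standard library here is an assumption for illustration, not the course's own Map class; the account data is invented.

```java
import java.util.HashMap;
import java.util.Map;

final class PutContractDemo {
    // Returns what each call reports, in order, so the contract is easy to see.
    static Object[] run() {
        Map<String, String> accounts = new HashMap<>();
        String first = accounts.put("4111", "Alice");  // key not present before: null
        String second = accounts.put("4111", "Bob");   // key present: prior value returned
        String found = accounts.get("4111");           // the replacement value
        String missing = accounts.get("9999");         // unsuccessful search: null, no exception
        return new Object[] { first, second, found, missing, accounts.size() };
    }
}
```

The second put() replaces the value rather than adding a second entry, so the map still holds exactly one entry for the key.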
Sequence-Based Map • Sequence’s perspective of the Map that it holds • (diagram labels: Positions, elements)
Sequence-Based Map • Outside view of Map and how it is stored • (diagram labels: Positions, Entrys)
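One way a sequence-backed Map could look, as a sketch only: an ArrayList stands in for the course's Sequence, the names SequenceMap and Pair are hypothetical, and having remove() return the removed value is an assumption (the slides do not give its return type).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sequence-backed map; class and field names are assumptions.
final class SequenceMap<K, V> {
    // One entry per key, stored as an element of the underlying sequence.
    private static final class Pair<K, V> {
        final K key;
        V value;

        Pair(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    private final List<Pair<K, V>> entries = new ArrayList<>();

    // Linear scan for the entry holding key; -1 when absent.
    private int indexOf(K key) {
        for (int i = 0; i < entries.size(); i++) {
            if (entries.get(i).key.equals(key)) {
                return i;
            }
        }
        return -1;
    }

    // Returns the value for key, or null when the search is unsuccessful.
    V get(K key) {
        int i = indexOf(key);
        return (i < 0) ? null : entries.get(i).value;
    }

    // Adds or replaces; returns the prior value, or null if key was new.
    V put(K key, V value) {
        int i = indexOf(key);
        if (i < 0) {
            entries.add(new Pair<>(key, value));
            return null;
        }
        V old = entries.get(i).value;
        entries.get(i).value = value;
        return old;
    }

    // Removes the entry for key; returns its value, or null if absent.
    V remove(K key) {
        int i = indexOf(key);
        return (i < 0) ? null : entries.remove(i).value;
    }

    int size() {
        return entries.size();
    }

    boolean isEmpty() {
        return entries.isEmpty();
    }
}
```

Because the entries are unordered, every operation scans the list and takes O(n) time in the worst case.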
Using a Map • Map great when want only one value for a key • Credit card number goes to one account • One person has a given social security number • One definition per word in the dictionary
Using a Map • Could try associating multiple values per key • Map key to Sequence of values is a possible solution • But this means Map’s user must handle complexity
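To make the "user must handle complexity" point concrete, here is a sketch of the workaround using java.util.HashMap and a List in place of the course's Map and Sequence. The class name MultiValueDemo and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class MultiValueDemo {
    // Each key maps to a whole list of values instead of a single value.
    private final Map<String, List<String>> definitions = new HashMap<>();

    // The caller-side bookkeeping: create the list on first use, then append.
    void add(String word, String definition) {
        List<String> list = definitions.get(word);
        if (list == null) {
            list = new ArrayList<String>();
            definitions.put(word, list);
        }
        list.add(definition);
    }

    // Returns every definition for word; an empty list when none exist.
    List<String> lookup(String word) {
        List<String> list = definitions.get(word);
        return (list == null) ? new ArrayList<String>() : list;
    }
}
```

The null checks and list creation are exactly the extra work the Map's user takes on; a Dictionary moves that work into the ADT.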
Dictionary-based Bartender • “I’ll have a Manhattan.” • “No problem.” • (diagram labels: key, value)
Dictionary-based Bartender • “Not that Manhattan.” • “Sorry.” • (diagram labels: key, value)
Dictionary-based Bartender • “Not that Manhattan.” • “Sorry. How about…” • (diagram labels: key, another value)
Dictionary-based Bartender • “Mmmmm... Manhattan.” • “That’ll be $2 billion.” • (diagram labels: key, not a value, another value)
Dictionary ADT • Dictionary ADT very similar to Map • Hold searchable data in each of these ADTs • Both data structures are collections of Entrys • Convert key to value using either concept • Dictionary can have multiple values for one key • 1 value for key is still a legal option (diagram example: “awesome”) • Also many Entrys with same key but different value (diagram example: “cool”, “cool”)
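A sketch of the difference in code, assuming an unordered ArrayList as storage: get and getAll are named later in these slides, while the class name SequenceDictionary, the nested Pair class, and the use of put for insertion are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical list-backed dictionary; names are assumptions.
final class SequenceDictionary<K, V> {
    private static final class Pair<K, V> {
        final K key;
        final V value;

        Pair(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    private final List<Pair<K, V>> entries = new ArrayList<>();

    // Unlike a Map, never replaces: every call adds a new entry.
    void put(K key, V value) {
        entries.add(new Pair<>(key, value));
    }

    // Returns some value stored under key, or null if there is none.
    V get(K key) {
        for (Pair<K, V> e : entries) {
            if (e.key.equals(key)) {
                return e.value;
            }
        }
        return null;
    }

    // Returns every value stored under key (empty list if none).
    List<V> getAll(K key) {
        List<V> matches = new ArrayList<>();
        for (Pair<K, V> e : entries) {
            if (e.key.equals(key)) {
                matches.add(e.value);
            }
        }
        return matches;
    }

    int size() {
        return entries.size();
    }
}
```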
Map vs. Dictionary • Map ADT: Collection of Entrys; key – searched for; value – cared about; basic implementation: List w/ Entrys in increasing order of keys; key in at most 1 Entry • Dictionary ADT: Collection of Entrys; key – searched for; value – cared about; basic implementation: List w/ Entrys in increasing order of keys; Entrys can share key
Ordered List-Based Approach • Idea normally imagined w/ Map & Dictionary • Maintains ordered list of key-value pairs • Must maintain Entrys ordered by their key • Faster searching provides performance win • Q: “Mom, how do I spell _______?” A: “Look it up.” • Efficiency gains not just for get & getAll • Entrys with same key stored in any order • Only requires that keys be in order
Ordered List-Based Approach • Iterators should respect ordering of Entrys • Should not be a problem, if Entrys stored in order • If O(1) access time, search time is O(log n) • Array-based structure required to hold Entrys • To get immediate access, needs to access by index • Requires IndexList-based implementation
Binary Search • Finds key using divide-and-conquer approach • First of many times you will be seeing this approach • Algorithm solves the problem using recursion • Base case 1: No Entrys remain to find the key • Base case 2: At data’s midpoint is matching key • Recursive Step 1: If midpoint too high, use lower half • Recursive Step 2: If midpoint too low, use upper half
Binary Search • low and high params specify range to check • Would be called with 0 & size() – 1, initially • If l > h, no match possible in this data • Compare with key at midpoint of low & high • Consider steps for find(7) in 0 1 3 4 5 7 8 9 11 14 16 18 19: • Step 1: l = 0, h = 12, m = 6 (key 8, too high) • Step 2: l = 0, h = 5, m = 2 (key 3, too low) • Step 3: l = 3, h = 5, m = 4 (key 5, too low) • Step 4: l = m = h = 5 (key 7, found)
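The recursive steps can be sketched over a plain int array. The class name BinarySearchDemo and the choice to return -1 when no match exists are assumptions; the slides do not say what an unsuccessful search returns here.

```java
final class BinarySearchDemo {
    // Returns the index of key in sorted keys[low..high], or -1 if absent.
    static int binarySearch(int[] keys, int key, int low, int high) {
        if (low > high) {
            return -1;                      // base case 1: nothing left to check
        }
        int mid = low + (high - low) / 2;   // midpoint, computed without int overflow
        if (keys[mid] == key) {
            return mid;                     // base case 2: midpoint matches
        } else if (keys[mid] > key) {
            return binarySearch(keys, key, low, mid - 1);   // midpoint too high: lower half
        } else {
            return binarySearch(keys, key, mid + 1, high);  // midpoint too low: upper half
        }
    }

    // Initial call covers the whole array: 0 through length - 1.
    static int find(int[] keys, int key) {
        return binarySearch(keys, key, 0, keys.length - 1);
    }
}
```

Calling find on the slide's data with key 7 visits midpoints 6, 2, 4, and 5 before returning index 5.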
Using Ordered Sequence • get uses binary search; takes O(log n) time • Should also start with binary search for getAll() • getAll checks neighbors to find all matches • Add and remove methods could use binary search • List shifts elements in put to make hole for element • Would also need to do shift when removing from list • Each takes O(n) total time in worst case as a result
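A sketch of an ordered, array-backed Dictionary along these lines. The names OrderedDictionary and Pair are hypothetical, an ArrayList stands in for the IndexList, and the binary search is written as a loop here rather than recursively.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical ordered dictionary; keys must be Comparable.
final class OrderedDictionary<K extends Comparable<K>, V> {
    private static final class Pair<K, V> {
        final K key;
        final V value;

        Pair(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    // Entries are kept in non-decreasing key order at all times.
    private final List<Pair<K, V>> entries = new ArrayList<>();

    // Binary search: index of some entry with this key, or -(insertion point) - 1.
    private int search(K key) {
        int low = 0;
        int high = entries.size() - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2;
            int c = key.compareTo(entries.get(mid).key);
            if (c > 0) {
                low = mid + 1;
            } else if (c < 0) {
                high = mid - 1;
            } else {
                return mid;
            }
        }
        return -(low + 1);
    }

    // O(log n) to find the slot, then O(n) for the shift done by add(index, ...).
    void put(K key, V value) {
        int i = search(key);
        int slot = (i >= 0) ? i : -(i + 1);
        entries.add(slot, new Pair<>(key, value));
    }

    // O(log n): returns one value for key, or null.
    V get(K key) {
        int i = search(key);
        return (i >= 0) ? entries.get(i).value : null;
    }

    // Binary search for one match, then check neighbors on both sides.
    List<V> getAll(K key) {
        List<V> matches = new ArrayList<>();
        int i = search(key);
        if (i < 0) {
            return matches;
        }
        int first = i;
        while (first > 0 && entries.get(first - 1).key.compareTo(key) == 0) {
            first--;
        }
        for (int j = first; j < entries.size() && entries.get(j).key.compareTo(key) == 0; j++) {
            matches.add(entries.get(j).value);
        }
        return matches;
    }
}
```

Entries sharing a key sit next to each other, which is why getAll only needs to look at the neighbors of the match that binary search finds.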
Comparing Keys • For all searching, must find matching keys • Cannot rely upon equals() when ordering • Want to be lazy, write code for all types of key • Use <, >, == if keys numeric type, but this is limiting • String also has simple method: compareTo() • General way of comparing keys based upon this idea?
Comparable<E> Interface • In Java as a standard from java.lang • Defines single method used for comparison • compareTo(E obj) compares instance with obj • Returns int which is either negative, zero, or positive
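For illustration, a class can opt into this ordering by implementing Comparable. The Drink class below is hypothetical, and ordering by name is just one possible choice.

```java
// Hypothetical key type; ordering by name is an assumption for this example.
final class Drink implements Comparable<Drink> {
    final String name;
    final int priceInCents;

    Drink(String name, int priceInCents) {
        this.name = name;
        this.priceInCents = priceInCents;
    }

    // Negative if this comes first, zero if the names match, positive otherwise.
    @Override
    public int compareTo(Drink other) {
        return name.compareTo(other.name);   // reuse String's own compareTo()
    }
}
```

Only the sign of the result matters, so callers should test for < 0, == 0, or > 0 rather than for specific values such as -1 or 1.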
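Ordered Sequence Example • Easiest to require that keys be Comparable • Now reuse class anywhere by adding interface • Also use standard types like String & Integer • compareTo() in binary search makes it simple
// k is the key sought; l, m, h are the low, midpoint, and high indices
int c = k.compareTo(list.get(m).getKey());
if (c > 0) {
    // k is larger than the midpoint key: search the upper half
    return binarySearch(k, m + 1, h);
} else if (c < 0) {
    // k is smaller than the midpoint key: search the lower half
    return binarySearch(k, l, m - 1);
} else {
    // keys match: the midpoint is the answer
    return m;
}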
What is a Map/Dictionary? • At simplest level, both are collections of Entrys • Focus on transforming data (or so it appears) • Add data with key and value to which it is transformed • Accessor transforms key to value associated with key • remove() used to delete an Entry • At most one value per key using a Map • With Dictionary, multiple values per key possible
Before Next Lecture… • Week #4 assignment due Tuesday at 5PM • Continue to do reading in your textbook • Learn more about hash & what it means in CSC • How can we tell if a hash is any good? • Hash tables sound cool, but how do we make them? • Monday is when lab project phase #1 due • Will have time in lab, but then will be the weekend • Project #1 available tonight after lab • Will be due in parts to “encourage” good habits