
COMP 171 Data Structures and Algorithms


Presentation Transcript


  1. COMP 171 Data Structures and Algorithms Tutorial 10: Hash Tables

  2. Data Dictionary • A data structure that supports: • Insert • Search • Delete • Examples: • Binary Search Tree • Red-Black Tree • B-Tree • Linked List

  3. Hash Table • An effective data dictionary • Worst case: Θ(n) time • Under reasonable assumptions: O(1) average time • A generalization of an array • Size is proportional to the number of keys actually stored • The array index is computed from the key

  4. Direct-Address Tables • Universe U: the set of all possible keys • The table has |U| slots • Each key in U maps to one unique entry in the table • Insert, delete and search each take O(1) time • Works well when U is small • If U is large, it is impractical
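The scheme above can be sketched in a few lines. This is a minimal illustration, not course-provided code; the class name and the universe size 10 are assumptions for the example.

```python
# Direct-address table sketch: keys come from a small universe
# U = {0, 1, ..., 9}, and each key indexes its own slot directly.
class DirectAddressTable:
    def __init__(self, universe_size):
        self.slots = [None] * universe_size  # one slot per possible key

    def insert(self, key, value):            # O(1)
        self.slots[key] = value

    def search(self, key):                   # O(1)
        return self.slots[key]

    def delete(self, key):                   # O(1)
        self.slots[key] = None

t = DirectAddressTable(10)
t.insert(3, "three")
print(t.search(3))   # three
```

Because every possible key has a dedicated slot, the table needs |U| slots even if only a handful of keys are ever stored, which is exactly why the approach breaks down for large universes.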

  5. Hash Function • Assume the hash table has m slots • A hash function h is used to compute the slot from the key k • h maps U into the slots of the hash table • h: U → {0, 1, …, m−1} • Since |U| > m, at least two keys must share a hash value: a collision • A “good” hash function minimizes the number of collisions

  6. If the keys are not natural numbers: • Interpret them as natural numbers using a suitable radix notation • Example: interpret a character string as a radix-128 integer • Division method • h(k) = k mod m • m is usually a prime • Avoid m too close to an exact power of 2 • Ex 11.3-3: Choose m = 2^p − 1 and let k be a character string interpreted in radix 2^p. Show that if x can be derived from y by permuting its characters, then h(x) = h(y).
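Both the radix interpretation and the division method, along with the behavior described in Ex 11.3-3, can be demonstrated directly. The function names and the prime 701 are assumptions for this sketch:

```python
# Interpreting a string as a radix-128 natural number, then hashing
# with the division method h(k) = k mod m.
def string_to_int(s, radix=128):
    """Interpret a string as a natural number in the given radix."""
    k = 0
    for ch in s:
        k = k * radix + ord(ch)
    return k

def h_division(k, m):
    """Division-method hash: h(k) = k mod m (m usually a prime)."""
    return k % m

print(h_division(string_to_int("pt"), 701))

# The pitfall in Ex 11.3-3: with m = 2^p - 1 and radix 2^p, the radix
# is congruent to 1 mod m, so the hash is just the sum of character
# codes mod m; any permutation of a string collides with it.
m = 2 ** 7 - 1          # p = 7, radix 2^p = 128
print(string_to_int("stop") % m == string_to_int("pots") % m)  # True
```

The collision for permuted strings is precisely why m should not be chosen as 2^p − 1 when keys are radix-2^p strings.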

  7. Multiplication method • h(k) = ⌊m (kA mod 1)⌋, 0 < A < 1 • The value of m is not critical • Usually choose m to be a power of 2 • The method works better with some values of A • E.g. A = (√5 − 1) / 2
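A minimal sketch of the multiplication method, using the suggested value of A; the function name and sample inputs are assumptions for illustration:

```python
import math

def h_multiplication(k, m, A=(math.sqrt(5) - 1) / 2):
    """Multiplication-method hash: h(k) = floor(m * (k*A mod 1))."""
    frac = (k * A) % 1.0     # fractional part of k*A, in [0, 1)
    return int(m * frac)     # scale up to a slot index in [0, m)

m = 2 ** 14                  # m can be a power of 2 with this method
print(h_multiplication(123456, m))
```

Because only the fractional part of kA matters, the choice of m does not interact badly with the key distribution, which is why a power of 2 (convenient for bit operations) is acceptable here, unlike in the division method.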

  8. Separate Chaining • Put all the elements that hash to the same slot into a linked list • Each element is inserted at the head of the list • Worst-case insertion: O(1) • Worst-case search: O(n) • Worst-case deletion: O(1)
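The chaining scheme can be sketched as follows. This is an illustrative implementation, not course code; Python lists stand in for the linked lists, with new pairs prepended to model head insertion:

```python
class ChainedHashTable:
    """Separate chaining: each slot holds a chain of (key, value)
    pairs that hash to it; new pairs go at the head of the chain."""
    def __init__(self, m=8):
        self.m = m
        self.slots = [[] for _ in range(m)]

    def _h(self, key):
        return hash(key) % self.m

    def insert(self, key, value):             # O(1): prepend to chain
        self.slots[self._h(key)].insert(0, (key, value))

    def search(self, key):                    # O(chain length)
        for k, v in self.slots[self._h(key)]:
            if k == key:
                return v
        return None

    def delete(self, key):
        chain = self.slots[self._h(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                del chain[i]
                return
```

The O(n) worst case for search corresponds to all n elements hashing to one slot, so the single chain must be scanned end to end.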

  9. Given a hash table with m slots storing n elements, we define the load factor α = n/m • Simple uniform hashing: each element is equally likely to hash into any of the m slots, independently of where any other element has hashed • Under simple uniform hashing, the average time for a search is Θ(1 + α)

  10. Open Addressing • Each table slot contains either an element or NIL • When a collision happens, we successively examine, or probe, the hash table until we find an empty slot for the key • Deletion is done by marking the slot “Deleted”, not NIL • The hash function h now takes two arguments, the key and the probe number: h(k, i)
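A sketch of open addressing with a “Deleted” sentinel. Linear probing is used inside `_probe` just to make it concrete; the class and method names are assumptions for the example:

```python
# A distinct marker for deleted slots, different from an empty (None) slot.
DELETED = object()

class OpenAddressTable:
    def __init__(self, m=8):
        self.m = m
        self.slots = [None] * m

    def _probe(self, key, i):                 # h(k, i); linear probing
        return (hash(key) + i) % self.m

    def insert(self, key):
        for i in range(self.m):
            j = self._probe(key, i)
            if self.slots[j] is None or self.slots[j] is DELETED:
                self.slots[j] = key           # reuse deleted slots
                return j
        raise RuntimeError("hash table overflow")

    def search(self, key):
        for i in range(self.m):
            j = self._probe(key, i)
            if self.slots[j] is None:         # NIL stops the search
                return None
            if self.slots[j] == key:          # DELETED does not stop it
                return j
        return None

    def delete(self, key):
        j = self.search(key)
        if j is not None:
            self.slots[j] = DELETED           # mark, do not set to NIL
```

The sentinel matters: if delete wrote NIL instead, a later search for a key inserted past that slot would stop early and miss it.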

  11. Linear Probing • h(k, i) = ( h′(k) + i ) mod m • The initial probe determines the entire probe sequence, so there are only m distinct probe sequences • Primary clustering • Quadratic Probing • h(k, i) = ( h′(k) + c1i + c2i² ) mod m • Again there are only m distinct probe sequences • Secondary clustering
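The clustering claim is easy to see by generating the sequences: for both methods, keys with the same initial probe h′(k) share the entire probe sequence. The function names and the constants c1 = 1, c2 = 3 are assumptions for this sketch:

```python
def linear_sequence(k, m):
    """Probe sequence for h(k, i) = (h'(k) + i) mod m."""
    return [(hash(k) + i) % m for i in range(m)]

def quadratic_sequence(k, m, c1=1, c2=3):
    """Probe sequence for h(k, i) = (h'(k) + c1*i + c2*i**2) mod m."""
    return [(hash(k) + c1 * i + c2 * i * i) % m for i in range(m)]

# 3 and 11 have the same initial probe mod 8, so their whole
# sequences coincide (primary / secondary clustering):
print(linear_sequence(3, 8) == linear_sequence(11, 8))        # True
print(quadratic_sequence(3, 8) == quadratic_sequence(11, 8))  # True
```

Since the sequence is a function of h′(k) alone, there are only m possible sequences in both schemes, which is what double hashing on the next slide improves.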

  12. Double Hashing • Makes use of two different hash functions • h(k, i) = ( h1(k) + i·h2(k) ) mod m • h2(k) should be relatively prime to m so that the whole table can be probed • Usually m is taken to be a prime number • The probe sequence depends on both hash functions, so there are Θ(m²) probe sequences • Double hashing performs better than linear or quadratic probing
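A sketch of double hashing with the standard choice of auxiliary functions; the particular h1, h2 and m = 13 are assumptions for illustration:

```python
def double_hash(k, i, m, h1, h2):
    """h(k, i) = (h1(k) + i*h2(k)) mod m"""
    return (h1(k) + i * h2(k)) % m

m = 13                               # prime, so any h2(k) in 1..m-1
h1 = lambda k: k % m                 # is relatively prime to m
h2 = lambda k: 1 + (k % (m - 1))     # never 0

seq = [double_hash(31, i, m, h1, h2) for i in range(m)]
print(seq)  # visits all 13 slots, each exactly once
```

Because h2(k) is relatively prime to the prime m, the probe sequence is a full permutation of the slots, and two keys share a sequence only if they agree on both h1 and h2, giving the Θ(m²) distinct sequences noted above.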

  13. Trie • Assumption: keys are digital data over some radix • A tree structure is used • Insertion creates a path of nodes from the root to the data • Deletion removes the pointer that points to the element • Time complexity: O(L) for a key of length L • Max # of keys for a given L = 128^(L+1) − 1
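The path-building insertion and O(L) traversal can be sketched as below, assuming radix 128 (ASCII) as on the slide; the class names are invented for the example:

```python
class TrieNode:
    def __init__(self, radix=128):
        self.children = [None] * radix   # one pointer per symbol
        self.value = None                # payload, if a key ends here

class Trie:
    """Trie over ASCII (radix 128); insert and search both walk
    one node per character, i.e. O(L) for a key of length L."""
    def __init__(self):
        self.root = TrieNode()

    def insert(self, key, value):
        node = self.root
        for ch in key:                   # create a path root -> data
            c = ord(ch)
            if node.children[c] is None:
                node.children[c] = TrieNode()
            node = node.children[c]
        node.value = value

    def search(self, key):
        node = self.root
        for ch in key:
            node = node.children[ord(ch)]
            if node is None:
                return None
        return node.value
```

Note the cost depends only on the key length L, not on the number of stored keys, which is the trie's advantage over comparison-based dictionaries.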

  14. Memory Usage: • node size × number of nodes • ((N+1) × pointer size) × (L × n) • N: radix • L: maximum length of the keys • n: number of keys • Improvement 1 • Put all nodes into an array of nodes • Replace pointers by array indices • |array index| = lg(L × n) bits
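A worked instance of the formula may make the scale concrete. The parameter values (radix 128, 8-byte pointers, L = 10, n = 1000) are assumptions chosen for the example:

```python
# Memory estimate: node size * number of nodes
#   = ((N+1) * pointer size) * (L * n)
N, ptr, L, n = 128, 8, 10, 1000
node_size = (N + 1) * ptr        # 129 pointers of 8 bytes = 1032 B
num_nodes = L * n                # worst case: no shared prefixes
print(node_size * num_nodes)     # 10320000 bytes, roughly 10 MB
```

Even for only 1000 short keys the estimate runs to about 10 MB, which motivates the three improvements on this and the next slide.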

  15. Improvement 2 • Eliminate nodes with a single child • Do skipping • Label each internal node with its position • Improvement 3 • De la Briandais tree • Eliminate the null pointers in the internal nodes • Saves memory when the child arrays are sparsely populated
