
Hash Tables

Presentation Transcript


  1. Hash Tables Briana B. Morrison Adapted from William Collins

  4. Sequential Search
  • Given a vector of integers: v = {12, 15, 18, 3, 76, 9, 14, 33, 51, 44}
  • What is the best case for sequential search?
    • O(1), when the value is the first element
  • What is the worst case?
    • O(n), when the value is the last element or is not in the list
  • What is the average case?
    • About n/2 comparisons, which is O(n)
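
A minimal C++ sketch of sequential search over a vector like the one above (the function name and signature are my own illustration, not from the slides):

    #include <cstddef>
    #include <vector>

    // Scan left to right until the value is found.
    // Returns the index of target, or -1 if it is not present.
    int sequentialSearch(const std::vector<int>& v, int target) {
        for (std::size_t i = 0; i < v.size(); ++i) {
            if (v[i] == target)
                return static_cast<int>(i);   // best case: first element, O(1)
        }
        return -1;                            // worst case: not in the list, O(n)
    }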

  7. Binary Search
  • Given a sorted vector of integers: v = {3, 9, 12, 14, 15, 18, 33, 44, 51, 76}
  • What is the best case for binary search?
    • O(1), when the element is the middle element
  • What is the worst case?
    • O(log n), when the element is first, last, or not in the list
  • What is the average case?
    • O(log n)
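
A corresponding binary-search sketch (again my own illustration; it assumes the vector is sorted, as above):

    #include <vector>

    // Repeatedly halve the search range until target is found or the range is empty.
    int binarySearch(const std::vector<int>& v, int target) {
        int lo = 0, hi = static_cast<int>(v.size()) - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;        // middle element: best case, O(1)
            if (v[mid] == target) return mid;
            if (v[mid] < target)  lo = mid + 1;  // discard the left half
            else                  hi = mid - 1;  // discard the right half
        }
        return -1;                               // not found after O(log n) probes
    }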

  21. Map vs. Hashmap
  • What are the differences between a map and a hashmap?
    • Interface
    • Efficiency
    • Applications
    • Implementation
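
The slides list these differences without code; the following small C++ sketch shows the practical contrast between an ordered map and a hash map (key and value types are my own choice):

    #include <map>
    #include <string>
    #include <unordered_map>

    int main() {
        std::map<std::string, int> ordered;           // typically a balanced tree: O(log n) lookup, keys kept in sorted order
        std::unordered_map<std::string, int> hashed;  // hash table: O(1) average lookup, no ordering guarantee

        ordered["apple"] = 1;                         // near-identical interface for basic use...
        hashed["apple"] = 1;                          // ...but different efficiency and iteration order

        return 0;
    }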

  24. CONTIGUOUS: array? vector? deque? heap?
      LINKED: list? map?
      BUT NONE OF THESE WILL GIVE CONSTANT AVERAGE TIME FOR SEARCHES, INSERTIONS, AND REMOVALS.

  32. To make these values fit into the table, we need to mod by the table size; i.e., key % 1000.
      Resulting indices shown on the slide: 210, 256, 816. OOPS!

  36. Hash Codes
  • Suppose we have a table of size N
  • A hash code is:
    • A number in the range 0 to N-1
    • We compute the hash code from the key
    • You can think of this as a “default position” when inserting, or a “position hint” when looking up
  • A hash function is a way of computing a hash code
  • Desire: the set of keys should spread evenly over the N values
  • When two keys have the same hash code: a collision
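
A minimal sketch of a hash code as a “default position” (the table size and the sample keys below are hypothetical values chosen for illustration):

    #include <cstddef>

    const std::size_t N = 1000;          // table of size N

    std::size_t hashCode(unsigned long key) {
        return key % N;                  // a number in the range 0 to N-1
    }

    // hashCode(3210) == 210 and hashCode(7210) == 210: two different keys,
    // one hash code, i.e., a collision.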

  37. Hash Functions
  • A hash function should be quick and easy to compute.
  • A hash function should achieve an even distribution of the keys that actually occur across the range of indices, for both random and non-random data.
  • The calculation should involve the entire search key.

  38. Examples of Hash Functions
  • Usually involves taking the key, chopping it up, and mixing the pieces together in various ways
  • Examples (see the sketch below):
    • Truncation: ignore part of the key and use the remaining part as the index
    • Folding: partition the key into several parts and combine the parts in a convenient way (adding, etc.)
  • After calculating the index, use modular arithmetic: divide by the size of the index range and take the remainder as the result
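
Hypothetical sketches of truncation and folding for a 9-digit integer key (my own examples; the slides do not give code):

    #include <cstddef>

    const std::size_t TABLE_SIZE = 1000;            // hypothetical table size

    // Truncation: ignore part of the key, e.g., keep only its last 3 digits.
    std::size_t truncateHash(unsigned long key) {
        return key % 1000;                          // 123456789 -> 789
    }

    // Folding: partition the key into parts, add them, then take the remainder.
    std::size_t foldHash(unsigned long key) {
        std::size_t part1 = key / 1000000;          // 123
        std::size_t part2 = (key / 1000) % 1000;    // 456
        std::size_t part3 = key % 1000;             // 789
        return (part1 + part2 + part3) % TABLE_SIZE;
    }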

  39. Example Hash Function

  40. Devising Hash Functions
  • Simple functions often produce many collisions ... but complex functions may not be good either!
  • It is often an empirical process
  • Adding letter values in a string: same hash for strings with the same letters in different order
  • Better approach:
      size_t hash = 0;
      for (size_t i = 0; i < s.size(); ++i)
          hash = hash * 31 + s[i];
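
A self-contained version of the loop above (the wrapper name and the final reduction by the table size are my additions for illustration):

    #include <cstddef>
    #include <string>

    std::size_t stringHash(const std::string& s, std::size_t tableSize) {
        std::size_t hash = 0;
        for (std::size_t i = 0; i < s.size(); ++i)
            hash = hash * 31 + static_cast<unsigned char>(s[i]);  // order-sensitive mix
        return hash % tableSize;                                  // index in 0..tableSize-1
    }

    // Unlike a hash that merely adds letter values, stringHash("cat", 101)
    // and stringHash("act", 101) will generally differ.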

  41. Devising Hash Functions (2)
  • The string hash is good in that:
    • Every letter affects the value
    • The order of the letters affects the value
    • The values tend to be spread well over the integers

  42. Devising Hash Functions (3)
  • Guidelines for good hash functions:
    • Spread values evenly, as if “random”
    • Cheap to compute
    • Generally, the number of possible key values is much greater than the table size

  43. Hash Code Maps
  • Memory address: we reinterpret the memory address of the key object as an integer
    • Good in general, except for numeric and string keys
  • Integer cast: we reinterpret the bits of the key as an integer
    • Suitable for keys of length less than or equal to the number of bits of the integer type (e.g., char, short, int, and float on many machines)
  • Component sum: we partition the bits of the key into components of fixed length (e.g., 16 or 32 bits) and sum the components (ignoring overflows)
    • Suitable for numeric keys of fixed length greater than or equal to the number of bits of the integer type (e.g., long and double on many machines)
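
A component-sum sketch for a 64-bit key, along the lines described above (the function name and types are my own):

    #include <cstdint>

    std::uint32_t componentSumHash(std::uint64_t key) {
        std::uint32_t high = static_cast<std::uint32_t>(key >> 32);  // upper 32-bit component
        std::uint32_t low  = static_cast<std::uint32_t>(key);        // lower 32-bit component
        return high + low;                                           // sum, ignoring overflow (wraps)
    }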

  44. Hash Code Maps (cont.)
  • Polynomial accumulation:
    • We partition the bits of the key into a sequence of components of fixed length (e.g., 8, 16, or 32 bits): a0, a1, ..., a(n-1)
    • We evaluate the polynomial p(z) = a0 + a1*z + a2*z^2 + ... + a(n-1)*z^(n-1) at a fixed value z, ignoring overflows
    • Especially suitable for strings (e.g., the choice z = 33 gives at most 6 collisions on a set of 50,000 English words)
  • Polynomial p(z) can be evaluated in O(n) time using Horner’s rule:
    • The following polynomials are successively computed, each from the previous one in O(1) time:
      p0(z) = a(n-1)
      pi(z) = a(n-i-1) + z*p(i-1)(z)   (i = 1, 2, ..., n-1)
    • We have p(z) = p(n-1)(z)
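
A sketch of polynomial accumulation over a string key using Horner’s rule, with z = 33 as suggested above (the function name is my own; overflow simply wraps, matching “ignoring overflows”):

    #include <cstddef>
    #include <string>

    std::size_t polynomialHash(const std::string& key, std::size_t z = 33) {
        std::size_t h = 0;
        // Horner's rule: start from the last component a(n-1) and work back to a0,
        // one O(1) step per component, so the whole evaluation is O(n).
        for (std::size_t i = key.size(); i-- > 0; )
            h = h * z + static_cast<unsigned char>(key[i]);
        return h;   // reduce with "% tableSize" before using as a table index
    }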
