Algorithms

Presentation Transcript


  1. Algorithms Hashing

  2. Hash Functions • Hash functions have many uses. • The best-known use is building hash tables.

  3. Hashing versus Direct Addressing • Direct addressing is often used for tables. • A unique key for an item is used as its address (index) in an array. • If the universe of key values (U) is very large, direct addressing may be impractical or impossible. • The set K of keys actually stored may be small relative to U, so much of the allocated space may be wasted. • Hashing is used for more efficient storage in data structures called hash tables.
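
The space cost of direct addressing can be illustrated with a minimal sketch. The class name `DirectAddressTable` and the universe size of 1000 are assumptions made for this example only; they do not come from the slides.

```python
class DirectAddressTable:
    """Direct addressing: the key itself is the array index."""

    def __init__(self, universe_size):
        # One slot per possible key in U, whether or not it is ever used.
        self.slots = [None] * universe_size

    def insert(self, key, value):
        self.slots[key] = value

    def search(self, key):
        return self.slots[key]

    def delete(self, key):
        self.slots[key] = None


# Hypothetical universe of 1000 possible keys, only one of them stored.
table = DirectAddressTable(universe_size=1000)
table.insert(42, "answer")
used = sum(slot is not None for slot in table.slots)
```

Here `used` counts a single occupied slot while the array holds a slot for every key in U, which is the waste that hashing is meant to avoid.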

  4. Hash Tables • When the set of keys K stored in a dictionary is much smaller than the universe of possible keys U, a hash table requires much less storage than a direct-address table. • Storage requirements are Θ(|K|). • Searching still requires O(1) time on average.

  5. Hash Table Notation • An element with key k is stored in: • Slot k with direct addressing • Slot h(k) with hashing • The hash function h() is used to compute the slot from the key k. • The function h maps the universe U of keys into the slots of a hash table T[0..m-1].

  6. Terminology • An element with key k hashes to slot h(k). • The value h(k) is the hash value of key k.

  7. [Figure: the actual keys K = {k1, ..., k5}, a subset of the universe U, are mapped by h into slots h(k1), h(k4), h(k2) = h(k5), h(k3) of table T; keys k2 and k5 hash to the same slot.]

  8. Collisions • When two keys hash to the same slot, a collision results. • Since |U| > m, at least two keys in U must share a slot, so collisions are unavoidable. • The simplest (and often effective) collision resolution technique is chaining.

  9. [Figure: the same keys k1, ..., k5 from K, a subset of the universe U, stored in table T; the figure accompanies the chaining technique named on the previous slide.]
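
Chaining, as named on the Collisions slide, can be sketched as follows. This is an illustrative sketch rather than the presenter's implementation: the class name `ChainedHashTable`, the use of Python lists as chains, and the choice of `key % m` as the hash function for integer keys are all assumptions.

```python
class ChainedHashTable:
    """Hash table T[0..m-1] where each slot holds a chain of (key, value) pairs."""

    def __init__(self, m):
        self.m = m
        self.slots = [[] for _ in range(m)]

    def _h(self, key):
        # Assumption: integer keys hashed by remainder modulo m.
        return key % self.m

    def insert(self, key, value):
        chain = self.slots[self._h(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)  # overwrite an existing key
                return
        chain.append((key, value))

    def search(self, key):
        # Only the chain in slot h(key) has to be scanned.
        for k, v in self.slots[self._h(key)]:
            if k == key:
                return v
        return None

    def delete(self, key):
        idx = self._h(key)
        self.slots[idx] = [(k, v) for k, v in self.slots[idx] if k != key]


t = ChainedHashTable(m=7)
t.insert(3, "a")
t.insert(10, "b")  # 10 % 7 == 3, so this collides with key 3
```

Both colliding keys end up in the chain of slot 3 and remain individually searchable.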

  10. Hash Functions • A good hash function satisfies (approximately) the assumption of simple uniform hashing: • Each key is equally likely to hash to any of the m slots. • The hash value for a particular key is independent of the hash value for any other key. • It is typically not possible to check this condition because: • We do not know the probability distribution from which the keys will be drawn. • The keys may not be drawn independently.

  11. Example • If the keys are known to be random real numbers k, independently and uniformly distributed in the range 0 < k < 1, the hash function h(k) = ⌊km⌋ satisfies the condition of simple uniform hashing.
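
A small experiment can make this example concrete. The sketch below assumes the standard textbook function h(k) = ⌊km⌋ for this setting; the function name `hash_uniform_real`, the seed, the table size m = 10, and the sample count are choices made only for this illustration.

```python
import math
import random
from collections import Counter


def hash_uniform_real(k, m):
    """h(k) = floor(k * m) for a real key k between 0 and 1."""
    return math.floor(k * m)


# Draw keys uniformly at random and count how many land in each slot.
rng = random.Random(0)  # fixed seed so the run is repeatable
m = 10
counts = Counter(hash_uniform_real(rng.random(), m) for _ in range(10_000))
```

With uniformly distributed keys, each of the m slots should receive roughly the same share of the 10,000 samples.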

  12. The Division Method • This is a heuristic hash function that is often effective. • The hash value is the remainder of k divided by m: h(k) = k mod m. • Avoid choosing m as a power of 2, since h(k) would then depend only on the low-order bits of k. • A good choice for m is often a prime that is not too close to an exact power of 2.
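
The advice about the choice of m can be checked with a short sketch. The key set below is hypothetical, picked so that every key has the same three low-order bits; the function name `hash_division` is likewise an assumption.

```python
def hash_division(k, m):
    """Division method: h(k) = k mod m."""
    return k % m


# Hypothetical keys that are all multiples of 8 (identical low-order bits).
keys = [0, 8, 16, 24, 32, 40]

# With m = 8 (a power of 2), only the low-order 3 bits of each key matter.
slots_pow2 = {hash_division(k, 8) for k in keys}

# With m = 13 (a prime), the same keys spread across distinct slots.
slots_prime = {hash_division(k, 13) for k in keys}
```

With m = 8 every key lands in slot 0, while with m = 13 the six keys occupy six different slots.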

  13. The Multiplication Method • Two-step method: • Multiply the key k by a constant A in the range 0 < A < 1 and extract the fractional part of kA. • Multiply this value by m and take the floor of the result. • More precisely: h(k) = ⌊m (kA mod 1)⌋, where kA mod 1 = kA − ⌊kA⌋ is the fractional part of kA. • Some values of A work better than others (see text).
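
The two steps can be sketched directly in floating point. The slides do not name a value for A; the constant used here, (√5 − 1)/2 ≈ 0.618, is the value Knuth suggests and is an assumption of this example, as is the function name `hash_multiplication`.

```python
import math

# Assumption: Knuth's suggested constant; the slides leave A unspecified.
A = (math.sqrt(5) - 1) / 2


def hash_multiplication(k, m, a=A):
    """Multiplication method: h(k) = floor(m * (k * a mod 1))."""
    frac = (k * a) % 1.0         # step 1: fractional part of k * a
    return math.floor(m * frac)  # step 2: scale by m and take the floor


slot = hash_multiplication(123456, 16384)
```

Unlike the division method, the value of m is not critical here, so a power of 2 is a common choice. Production implementations usually replace the floating-point arithmetic with fixed-width integer multiplication and a bit shift.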

  14. Perfect Hashing • Static hashing: once the set of keys is stored in the table, the set of keys never changes. Examples: • The set of reserved words in a programming language • The set of file names on a read-only CD • Perfect hashing: the worst-case number of memory accesses required to perform a search is O(1).
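
One simple way to obtain worst-case O(1) search for a static key set is to keep drawing hash functions at random until one has no collisions. The sketch below is only an illustration under stated assumptions: it uses integer keys, a table of size n² for n keys, and hash functions of the form ((a·k + b) mod p) mod m with the prime p = 2^31 − 1. It is not the space-efficient two-level scheme usually presented under this name, and all identifiers are hypothetical.

```python
import random

P = 2_147_483_647  # prime 2**31 - 1; assumed larger than every key


def make_hash(a, b, m):
    """Return h(k) = ((a*k + b) mod P) mod m."""
    return lambda k: ((a * k + b) % P) % m


def build_collision_free(keys, rng):
    """Retry random (a, b) until no two keys share a slot."""
    m = len(keys) ** 2  # quadratic space makes a collision-free draw likely
    while True:
        a = rng.randrange(1, P)
        b = rng.randrange(0, P)
        h = make_hash(a, b, m)
        if len({h(k) for k in keys}) == len(keys):
            table = [None] * m
            for k in keys:
                table[h(k)] = k
            return h, table


# Hypothetical static key set; it never changes after the table is built.
keys = [10, 22, 37, 40, 52, 60, 70, 72, 75]
h, table = build_collision_free(keys, random.Random(1))


def contains(k):
    # Exactly one slot is probed, so search is O(1) in the worst case.
    return table[h(k)] == k
```

Because the key set is fixed, the cost of the retry loop is paid once at build time, and every later lookup touches a single slot.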
