1 / 20

Chapter 5: Hashing

Mark Allen Weiss: Data Structures and Algorithm Analysis in Java. Chapter 5: Hashing . Hash Tables. Lydia Sinapova, Simpson College. H as hing. What is Hashing? Direct Access Tables Hash Tables. Hashing - basic idea.

gasha
Télécharger la présentation

Chapter 5: Hashing

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Mark Allen Weiss: Data Structures and Algorithm Analysis in Java Chapter 5: Hashing Hash Tables Lydia Sinapova, Simpson College

  2. Hashing • What is Hashing? • Direct Access Tables • Hash Tables

  3. Hashing - basic idea • A mapping between the search keys and indices - efficient searching into an array. • Each element is found with one operation only.

  4. Hashing example Example: • 1000 students, • identification number between 0 and 999, • use an array of 1000 elements. • SSN of each student a 9-digit number. • much more elements than the number of the students, - a great waste of space.

  5. The Approach • Directly referencing records in a table usingarithmetic operations on keysto map them onto table addresses. • Hash function: function that transforms thesearch key into a table address.

  6. Direct-address tables • The most elementary form of hashing. • Assumption– direct one-to-one correspondence between the keys and numbers 0, 1, …, m-1., m – not very large. • Array A[m]. Each position (slot) in the array contains a pointer to a record, or NIL.

  7. Direct-Address Tables • The most elementary form of hashing • Assumption –direct one-to-one correspondence between the keys and numbers 0, 1, …, m-1., m – not very large. • Array A[m].Each position (slot) in the array contains either a reference to a record, or NULL. • Cost –the size of the array we need is determined the largest key. Not very useful if there are only a few keys

  8. Hash Functions Transform the keys into numbers within a predetermined interval to be used as indices in an array (table, hash table) to store the records

  9. Hash Functions – Numerical Keys Keys – numbers If M is the size of the array, then h(key) = key % M. This will map all the keys into numbers within the interval [0 .. (M-1)].

  10. Hash Functions – Character Keys Keys – strings of characters Treat the binary representation of a key as a number, and then apply the hash function

  11. How keys are treated as numbers • If each character is represented withmbits, • then the string can be treated asbase-mnumber.

  12. Example A K E Y : 00001 01011 00101 11001 = 1 . 323 + 11 . 322 + 5 . 321 + 25 . 320 = 44271 Each letter is represented by itsposition in the alphabet. E.G,Kis the 11-th letter, and its representation is01011( 11 in decimal)

  13. Long Keys If the keys are very long, an overflow may occur. A solutionto this is to apply theHorner’s method in computing the hash function.

  14. Horner’s Method anxn + an-1.xn-1 + an-2.xn-2 + … + a1x1 + a0x0 = x(x(…x(x (an.x +an-1) + an-2 ) + …. ) + a1) + a0 4x5 + 2x4 + 3x3 + x2 + 7x1 + 9x0 = x( x( x( x ( 4.x +2) + 3) + 1 ) + 7) + 9 The polynomial can be computed by alternating the multiplication and addition operations

  15. Example V E R Y L O N G K E Y 10110 00101 10010 11001 01100 01111 01110 00111 01011 00101 11001 V E R Y L O N G K E Y 22 5 18 25 12 15 14 7 11 5 25 22 . 3210 + 5 . 329 + 18 . 328 + 25 . 327+ 12 . 326 + 15 . 325 + 14 . 324 + 7 . 323 + 11 . 322 + 5 . 321 + 25 . 320

  16. Example (continued) V E R Y L O N G K E Y 22 . 3210 + 5 . 329 + 18 . 328 + 25 . 327 + 12 . 326 + 15 . 325 + 14 . 324 + 7 . 323 + 11 . 322 + 5 . 321 + 25 . 320 (((((((((22.32 + 5)32 + 18)32 + 25)32 + 12)32 + 15)32 + 14)32 + 7)32 +11)32 +5)32 + 25 compute the hash function by applying the mod operation at each step, thus avoiding overflowing. h0 = (22.32 + 5) % M h1 = (32.h0 + 18) % M h2 = (32.h1 +25) % M . . . .

  17. Code int hash32 (char[] name, int tbl_size) { key_length = name.length; int h = 0; for (int i=0; i < key_length ; i++) h = (32 * h + name[i]) % tbl_size; return h; }

  18. Hash Tables • Index - integer, generated by a hash function between 0 and M-1 • Initially - blank slots. • sentinel value, or a special field in each slot.

  19. Hash Tables • Insert- hash function to generate an address • Searchfor a key in the table - the same hash function is used.

  20. Size of the Table Table size M- different from the number of recordsN Load factor:N/M M must be primeto ensure even distribution of keys

More Related