
Efficient Hashing Techniques in C++11: Functors, Chaining, and Performance Considerations

This guide provides an in-depth exploration of hashing techniques in C++11, focusing on hash functors, their instantiation, and usage. We delve into different data structures such as hash tables, including closed addressing through chaining with linked lists, and examine their efficiency in terms of time complexity and load factors. Key concepts include the importance of immutability in hashed values, effective use of bitwise operations, and strategies for preventing collisions. This comprehensive resource caters to developers looking to optimize their hashing strategies.

nate




Presentation Transcript


  1. Complex Hashing & Chaining

  2. std::hash Functors • C++11 STL includes hash functors • Instantiate and use as a function:

  3. std::hash Functors • C++11 STL includes hash functors • Instantiate and use as a function: • One-line version:
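
The code shown on these two slides did not survive in the transcript. As a minimal sketch of what "instantiate and use as a function" and the one-line version might look like with `std::hash` from `<functional>` (the function names `hashTwoStep` and `hashOneLine` are made up for illustration):

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Two-step version: instantiate the functor, then call it like a function.
std::size_t hashTwoStep(const std::string& s) {
    std::hash<std::string> hasher;   // functor object
    return hasher(s);                // operator() computes the hash code
}

// One-line version: construct a temporary functor and call it immediately.
std::size_t hashOneLine(const std::string& s) {
    return std::hash<std::string>()(s);
}
```

Both forms return the same value for the same string. The actual numbers are implementation-defined, so they should not be relied on across compilers or standard libraries.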

  4. Other Types • How do we hash: • Point? • Employee? • BitmapImage?

  5. Other Types • Cover as many bits as possible

  6. Other Types • Cover as many bits as possible • Combine all values that vary • "John Smith" K100203 vs "John Smith" K923424

  7. Other Types • Cover as many bits as possible • Combine all values that vary • "John Smith" K100203 vs "John Smith" K923424 • Try to make the lowest bits most random • 2013/05/28: (day << 20) + (month << 10) + year vs. (year << 20) + (month << 10) + day (the second form keeps the fast-changing day in the lowest bits)
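
To make the date example concrete, here is a small sketch of the packing that puts the most variable field in the lowest bits. The helper name `hashDate` and the 10-bit field widths are assumptions taken from the shift amounts on the slide, not code from the deck. Note that in C++ `+` binds tighter than `<<`, so the parentheses are required.

```cpp
#include <cstdint>

// Hypothetical helper: packs a date so that the day, which differs most often
// between nearby dates, occupies the lowest bits of the hash code.
// Parentheses matter: without them, '+' would be evaluated before '<<'.
std::uint32_t hashDate(std::uint32_t year, std::uint32_t month, std::uint32_t day) {
    return (year << 20) + (month << 10) + day;
}
```

Because a table typically picks a bucket with `hash % bucketCount`, consecutive days such as 2013/05/28 and 2013/05/29 land in different buckets with this layout, whereas shifting the day into the high bits would leave their low bits identical.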

  8. Bitwise XOR • Bitwise XOR: ^ • Combines binary values, preserves entropy • 0101 ^ 1111 = 1010 • 0101 ^ 0000 = 0101 • 0101 ^ 1011 = 1110
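
A quick way to check XOR results like these is to run them. The sketch below uses hexadecimal literals because binary literals are not part of C++11; the function name `xorBits` is illustrative only.

```cpp
#include <cstdint>

// XOR sets a result bit exactly where the two inputs differ.
// The cast narrows the int result of '^' back to 8 bits.
std::uint8_t xorBits(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a ^ b);
}
```

For instance, `xorBits(0x5, 0xB)` is `0xE` (binary 1110), and XOR-ing the result with `0xB` again gives back `0x5`. That reversibility is one way to see why XOR does not throw information away when combining values.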

  9. Other Types • Use existing hash functions • Combine with bitwise XOR

  10. Other Types • Use bit shifts to spread out values if needed
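
The slides' code for hashing a custom type is missing from the transcript. A minimal sketch of the idea, assuming a hypothetical `Employee` with a name and an ID string (both field names are invented here), could reuse `std::hash` for each field and combine the results with XOR and a shift:

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical type for illustration; the field names are assumptions.
struct Employee {
    std::string name;
    std::string id;
};

// Functor in the style of std::hash: hashes every field that varies
// and combines the per-field hash codes.
struct EmployeeHash {
    std::size_t operator()(const Employee& e) const {
        std::size_t h1 = std::hash<std::string>()(e.name);
        std::size_t h2 = std::hash<std::string>()(e.id);
        // Shift one code before XOR-ing so that two equal field hashes
        // do not cancel out to zero.
        return h1 ^ (h2 << 1);
    }
};
```

With this functor, "John Smith" K100203 and "John Smith" K923424 share the same name hash but will normally produce different combined codes because the ID contributes too. A type like this can be passed as the hasher to `std::unordered_set<Employee, EmployeeHash>` once `Employee` also provides `operator==`.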

  11. Hashing Danger • Person p1: "John Smith" • Say hash code for John Smith is 17…

  12. Hashing Danger • Person p1: "John Smith" • Say hash code for John Smith is 17… • p1.firstName = "Bob"

  13. Hashing Danger • Person p1: "John Smith" • Say hash code for John Smith is 17… • p1.firstName = "Bob" • hash(p1) just changed: won't find p1!

  14. Hashing Danger • NEVER modify something being used as a hashed value in a hash table! • Remove, modify, reinsert, or • Use immutable values for hashing
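
The "remove, modify, reinsert" rule can be sketched with `std::unordered_set`, which already enforces part of it by exposing stored elements as const. The `Person` type, its hasher, and the helper `renameFirst` below are assumptions made for illustration rather than code from the presentation.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>

// Hypothetical type; equality and hashing both depend on the two name fields.
struct Person {
    std::string firstName;
    std::string lastName;
    bool operator==(const Person& other) const {
        return firstName == other.firstName && lastName == other.lastName;
    }
};

struct PersonHash {
    std::size_t operator()(const Person& p) const {
        return std::hash<std::string>()(p.firstName)
             ^ (std::hash<std::string>()(p.lastName) << 1);
    }
};

typedef std::unordered_set<Person, PersonHash> PersonSet;

// Remove, modify, reinsert: the updated value is hashed again on insert,
// so it ends up in the bucket that matches its new hash code.
// Returns false if oldValue was not in the set.
bool renameFirst(PersonSet& people, const Person& oldValue, const std::string& newFirst) {
    if (people.erase(oldValue) == 0) return false;
    Person updated = oldValue;
    updated.firstName = newFirst;
    people.insert(updated);
    return true;
}
```

After `renameFirst(people, p1, "Bob")`, a lookup for Bob Smith succeeds and a lookup for John Smith fails, which is the behavior a caller would expect. Changing the field in place inside a hand-written table would instead leave the entry in the bucket for its old hash code.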

  15. Probing Review • Probing issues: • Clusters • Extra work proportional to 1/(1 − λ), where λ is the load factor

  16. Chaining • Chaining (Closed Addressing): each bucket can hold multiple values

  17. Chaining • Chaining (Closed Addressing): each bucket can hold multiple values • Implementation • Linked list • Holds a few/zero items efficiently • Time efficiency not a big concern

  18. IntHashSet • Storage = array of std::list

  19. IntHashSet • Contains: • Find right linked list • Search it

  20. IntHashSet • Insert: • If not there • Find right list and add value

  21. IntHashSet • Remove: • Find right list • Look for item in list • If found, remove
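
The three operations above can be sketched as a small class. This is not the presenter's implementation, which is not in the transcript. It is a minimal version that assumes a fixed, non-zero bucket count, uses `std::vector` as the "array", and adds a `size()` member for convenience.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <list>
#include <vector>

// Minimal sketch of a chained hash set of ints.
// Assumption: bucketCount is greater than zero and never changes.
class IntHashSet {
public:
    explicit IntHashSet(std::size_t bucketCount = 16) : buckets(bucketCount) {}

    // Contains: find the right linked list, then search it.
    bool contains(int value) const {
        const std::list<int>& chain = buckets[indexFor(value)];
        return std::find(chain.begin(), chain.end(), value) != chain.end();
    }

    // Insert: if not already there, find the right list and add the value.
    void insert(int value) {
        if (!contains(value)) {
            buckets[indexFor(value)].push_back(value);
            ++count;
        }
    }

    // Remove: find the right list, look for the item, erase it if found.
    void remove(int value) {
        std::list<int>& chain = buckets[indexFor(value)];
        std::list<int>::iterator it = std::find(chain.begin(), chain.end(), value);
        if (it != chain.end()) {
            chain.erase(it);
            --count;
        }
    }

    std::size_t size() const { return count; }

private:
    // Maps a value to a bucket index via its hash code.
    std::size_t indexFor(int value) const {
        return std::hash<int>()(value) % buckets.size();
    }

    std::vector<std::list<int> > buckets;  // one chain per bucket
    std::size_t count = 0;                 // number of stored values
};
```

With four buckets, inserting 1, 5, and 9 may well put all three values in the same chain. The set still behaves correctly, and only the search within that chain gets longer.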

  22. Efficiency • Avg time proportional to load factor • O(λ) = O(n/k), for n items in k buckets

  23. Efficiency • Avg time proportional to load factor • O(λ) = O(n/k), for n items in k buckets • If k is constant, technically O(n) • Massive constant divisor • If k grows proportionally with n, O(n/k) = O(1)
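
The two cases can be written out explicitly. Here n is the number of stored items, k the number of buckets, and λ the load factor. The constant c is a symbol introduced only for this illustration.

\[
\lambda = \frac{n}{k}, \qquad
k \text{ constant} \;\Rightarrow\; \lambda = \frac{1}{k}\,n = O(n), \qquad
k = c\,n \;\Rightarrow\; \lambda = \frac{n}{c\,n} = \frac{1}{c} = O(1).
\]

For example, with k fixed at 1000 buckets, a million items give an average chain length of 1000, while a table that keeps k = n/2 holds the average chain length at 2 no matter how large n becomes.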

  24. Real World • Hash table grows when load factor too large • Cost of all ops O(1) • Insert is amortized O(1) • Cache use often the determining factor
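
Growth under a load-factor limit can be observed with `std::unordered_set`, which rehashes into more buckets when an insert would push `load_factor()` past `max_load_factor()`. The helper `countGrowths` below is a made-up name for this sketch, and how much the table grows each time is implementation-defined.

```cpp
#include <cstddef>
#include <unordered_set>

// Inserts 0..n-1 and counts how many times the bucket array was replaced.
std::size_t countGrowths(int n) {
    std::unordered_set<int> table;
    std::size_t growths = 0;
    std::size_t buckets = table.bucket_count();
    for (int i = 0; i < n; ++i) {
        table.insert(i);
        if (table.bucket_count() != buckets) {  // a rehash happened
            buckets = table.bucket_count();
            ++growths;
        }
    }
    return growths;
}
```

On typical implementations the count for ten thousand inserts is small compared with the number of inserts, because each growth enlarges the table by a multiplicative factor. That is what makes the occasional expensive insert average out to amortized O(1).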

  25. But • No natural ordering

  26. Ordered - O(1) • Space vs. time trade-offs • Hybrid/duplicative representations

  27. HashMap • Map • Key/value pairs: John Smith → 521-1234 • HashMap • Identity determined by key • Only hash key • Value stored with key in table
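
In C++11 the standard hash map is `std::unordered_map`. As a brief sketch of the key/value idea from this slide (the alias `PhoneBook` and the helper `lookup` are illustrative names, not from the deck):

```cpp
#include <string>
#include <unordered_map>

// Key = name, value = phone number. Only the key is hashed and compared.
typedef std::unordered_map<std::string, std::string> PhoneBook;

// Returns the number stored for name, or an empty string if the key is absent.
std::string lookup(const PhoneBook& book, const std::string& name) {
    PhoneBook::const_iterator it = book.find(name);
    return it == book.end() ? std::string() : it->second;
}
```

After `book["John Smith"] = "521-1234";`, calling `lookup(book, "John Smith")` returns `"521-1234"`. Assigning a new number to the same key replaces the value without adding a second entry, since identity is determined by the key alone.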
