Optimal Fast Hashing

Yossi Kanizo (Technion, Israel). Joint work with Isaac Keslassy (Technion, Israel) and David Hay (Politecnico di Torino, Italy).


Presentation Transcript


  1. Optimal Fast Hashing. Yossi Kanizo (Technion, Israel). Joint work with Isaac Keslassy (Technion, Israel) and David Hay (Politecnico di Torino, Italy).

  2. Hash Tables for Networking Devices
     • Hash tables and hash-based structures are often used in high-speed devices:
        • Heavy-hitter flow identification
        • Flow state keeping
        • Flow counter management
        • Virus signature scanning
        • IP address lookup algorithms
     • For hash tables, ideally, 1 memory access per element insertion
     • Maximize throughput & minimize power

  3. Hash Tables for Networking Devices
     • Collisions are unavoidable → wasted memory accesses
     • For load ≤ 1, let a and d be the average and worst-case time (number of memory accesses) per element insertion
     • Initially empty buckets
     • Only insertions (no deletions)
     • Objective: minimize a and d
     [Figure: elements hashed into memory buckets 1 to 9]

  4. Why We Care
     • On-chip memory: memory accesses → power consumption
     • Off-chip memory: memory accesses → lost on/off-chip pin capacity
     • Datacenters: memory accesses → network & server load
     • Parallelism does not help reduce these costs
        • d serial or parallel memory accesses have the same cost

  5. Traditional Hash Table Schemes
     • Example 1: linked lists (chaining)
     [Figure: elements 1 to 9 chained in memory buckets 1 to 9]

  6. Traditional Hash Table Schemes
     • Example 1: linked lists (chaining)
     • Example 2: linear probing (open addressing)
     • Problem: the worst-case time cannot be bounded by a constant d
     [Figure: elements placed by linear probing in memory cells 1 to 9]
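To illustrate why the worst case is unbounded under linear probing, here is a minimal simulation sketch. It assumes ideal uniform hashing (modeled with Python's `random` module) and one element per memory cell; the function name `linear_probing_probes` and its parameters are hypothetical, not from the talk.

```python
import random

def linear_probing_probes(n, m, seed=0):
    """Insert n keys into m single-slot cells with linear probing.

    Returns the number of cells probed by each insertion.
    Hashes are modeled as uniform random cell indices (an assumption).
    """
    if n > m:
        raise ValueError("cannot insert more keys than cells")
    rng = random.Random(seed)
    occupied = [False] * m
    probes = []
    for _ in range(n):
        i = rng.randrange(m)      # hashed starting cell
        count = 1                 # the first cell read counts as one access
        while occupied[i]:
            i = (i + 1) % m       # walk to the next cell, wrapping around
            count += 1
        occupied[i] = True
        probes.append(count)
    return probes

probes = linear_probing_probes(n=900, m=1000)
print("average probes:", sum(probes) / len(probes))
print("worst-case probes:", max(probes))
```

The worst-case probe count grows with the load and the table size, which is the behavior the slide flags as a problem for hardware with a fixed access budget d.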

  7. High-Speed Hardware
     • Enable overflows: if time exceeds d → overflow list
        • Can be stored in expensive CAM
        • Otherwise, overflow elements = lost elements
     • Bucket contains h elements
        • E.g.: a 128-bit memory word holds h=4 elements of 32 bits
     • Assumption: access cost (read & write word) = 1 cycle
     [Figure: memory buckets 1 to 9 of height h, with an overflow CAM]
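The hardware model above can be sketched as a small data structure. This is an illustrative sketch only: the class `BucketTable`, its method names, and the choice to count one access per bucket visit are assumptions that follow the slide's cost model, not code from the authors.

```python
class BucketTable:
    """m buckets of h slots each, plus an overflow list standing in for the CAM."""

    def __init__(self, m, h):
        self.buckets = [[] for _ in range(m)]
        self.h = h
        self.overflow = []   # elements that could not be placed within d accesses
        self.accesses = 0    # one cycle per bucket read-and-write

    def try_insert(self, index, key):
        """Access one bucket; store the key if a slot is free."""
        self.accesses += 1
        if len(self.buckets[index]) < self.h:
            self.buckets[index].append(key)
            return True
        return False

    def insert(self, key, indices, d):
        """Try at most d of the given bucket indices in order, else overflow."""
        for index in indices[:d]:
            if self.try_insert(index, key):
                return True
        self.overflow.append(key)
        return False

# A 128-bit word holding 32-bit elements gives h = 4, as on the slide.
h = 128 // 32
table = BucketTable(m=9, h=h)
table.insert("flow-A", indices=[2, 5], d=2)
```

Keeping the access counter inside the table makes it easy to report the average time a as `accesses` divided by the number of insertions.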

  8. Problem Formulation
     • Given average time a and worst-case time d, minimize the overflow rate
     [Figure: memory buckets 1 to 9 of height h, with an overflow CAM]

  9. Example: Power of d Random Choices
     • d hash functions: pick the least loaded bucket
     • Break ties u.a.r. [Azar et al.] or to the left [Vöcking]
     • Intuition: can reach a low overflow rate … but average time a = worst-case time d → wasted memory accesses
     [Figure: elements 10 to 12 each probing several buckets of a memory with an overflow CAM]
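A minimal sketch of the d-random-choices insertion described above, with ties broken uniformly at random. It assumes ideal uniform hashes (modeled with `random.Random`); the names `d_choice_insert` and `simulate_d_choice` are hypothetical. Every insertion reads all d candidate buckets, which is why a equals d in this scheme.

```python
import random

def d_choice_insert(loads, h, d, rng):
    """Probe d uniformly chosen buckets and place the element in the least loaded.

    Returns (placed, accesses); accesses is always d because all d
    candidates must be read before the least loaded one is known.
    """
    choices = [rng.randrange(len(loads)) for _ in range(d)]
    least = min(loads[i] for i in choices)
    candidates = [i for i in choices if loads[i] == least]
    target = rng.choice(candidates)   # break ties uniformly at random
    if loads[target] < h:
        loads[target] += 1
        return True, d
    return False, d                   # all d candidates are full: overflow

def simulate_d_choice(n, m, h, d, seed=0):
    """Return (overflow rate, average accesses per insertion)."""
    rng = random.Random(seed)
    loads = [0] * m
    overflow = accesses = 0
    for _ in range(n):
        placed, used = d_choice_insert(loads, h, d, rng)
        accesses += used
        overflow += 0 if placed else 1
    return overflow / n, accesses / n

rate, avg = simulate_d_choice(n=4750, m=1250, h=4, d=4)  # load 0.95
print(f"overflow rate {rate:.4f}, average accesses {avg:.1f}")
```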

  10. Main Results
     • Lower bound on overflow for any scheme
     • Optimality of three schemes on successively larger ranges:
        • SIMPLE
        • GREEDY
        • MHT (optimal when subtable sizes fall geometrically)

  11. Overflow Lower Bound
     • Objective: given any online scheme with average a and worst-case d, find a lower bound on the overflow rate
     • No scheme can achieve an overflow rate below this bound (capacity region)
     [Figure: overflow lower bound versus average time a; h=4, load=n/(mh)=0.95, fixed d]

  12. Overflow Lower Bound
     • Problem: the number of hashes of each element depends on the instantaneous memory state
     • How can we bound the overflow?
     [Figure: elements 10 to 14 hashed into buckets 1 to 9 of height h, with an overflow CAM]

  13. Overflow Lower Bound: Proof Intuition
     • Assume hashes are uniform. Then relax constraints:
        • Offline,
        • No worst-case d, and
        • Uncolor the hashes
     • (n elements) × (a hashes per element) = an uncolored hashes
     • Lower bound on the expected number of unhashed memory bins
     [Figure: uncolored hashes of elements 10 to 14 spread over buckets 1 to 9, with an overflow CAM]

  14. Overflow Lower Bound
     • Result: closed-form lower-bound formula
     • Given n elements in m buckets of height h (the formula itself is missing from this transcript)
     • Valid also for non-uniform hashes
     • Defines a capacity region for high-throughput hashing
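The closed-form expression is not shown in this transcript. The sketch below is one way to write a bound that is consistent with the proof intuition of slide 13 (throw an uncolored hashes uniformly into m buckets and count the memory slots that no hash can fill). The notation is introduced here and is an assumption: γ for the overflow rate, ρ = n/(mh) for the load, and X for the number of hashes landing in one bucket, approximated as Poisson for large m. It should not be read as the authors' exact statement.

```latex
% Assumed notation (not from the slides): \gamma = overflow rate,
% \rho = n/(mh) = load, X = number of the a n uncolored hashes in one bucket.
\begin{align*}
\text{stored elements} &\le \sum_{i=1}^{m} \min(h, X_i), \\
\gamma &\ge 1 - \frac{m}{n}\,\mathbb{E}\bigl[\min(h, X)\bigr]
        = 1 - \frac{1}{\rho}
          + \frac{1}{\rho}\,\mathbb{E}\!\left[\left(1 - \frac{X}{h}\right)^{+}\right], \\
\mathbb{E}\!\left[\left(1 - \frac{X}{h}\right)^{+}\right]
  &\approx e^{-a\rho h} \sum_{k=0}^{h} \left(1 - \frac{k}{h}\right)
     \frac{(a\rho h)^{k}}{k!}
  \qquad \text{with } X \approx \mathrm{Poisson}(a\rho h).
\end{align*}
```

The middle line uses m h / n = 1/ρ and min(h, X) = h − (h − X)⁺; the mean number of hashes per bucket is a n / m = a ρ h.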

  15. Lower-Bound Example
     • For a 3% overflow rate, throughput can be at most 1/a = 2/3 of the memory rate
     [Figure: overflow lower bound versus average time a; h=4, load=n/(mh)=0.95]
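A trade-off of this kind can be explored numerically. The sketch below evaluates an assumed Poisson-limit form of the bound, derived from the uncoloring argument (expected fraction of bucket slots left unfilled when a·n hashes are spread uniformly over m buckets of height h). The function name `overflow_lower_bound` and this exact formula are assumptions, not taken from the slides.

```python
import math

def overflow_lower_bound(a, load, h):
    """Assumed Poisson-limit lower bound on the overflow rate.

    a    : average number of memory accesses (hashes) per element
    load : n / (m * h)
    h    : bucket height
    """
    lam = a * load * h  # mean number of hashes per bucket, a * n / m
    # Expected fraction of a bucket's h slots that receive no hash.
    unused = sum(
        (1 - k / h) * math.exp(-lam) * lam**k / math.factorial(k)
        for k in range(h)
    )
    # The bound can go negative for large a; an overflow rate cannot.
    return max(0.0, 1 - 1 / load + unused / load)

for a in (1.0, 1.25, 1.5, 2.0):
    print(f"a = {a:4.2f}  bound = {overflow_lower_bound(a, load=0.95, h=4):.4f}")
```

Under these assumptions the bound at a = 1.5 comes out at roughly 2%, and it crosses 3% slightly below a = 1.5, which is in line with the 2/3 throughput figure quoted on the slide.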

  16. Overflow Lower Bound
     • Example: d-left scheme: low overflow rate, but high average memory access rate a
     [Figure: d-left scheme versus the lower bound; h=4, load=n/(mh)=0.95, m=5,000]

  17. Main Results
     • Lower bound on overflow for any scheme
     • Optimality of three schemes on successively larger ranges:
        • SIMPLE
        • GREEDY
        • MHT (optimal when subtable sizes fall geometrically)

  18. The SIMPLE Scheme
     • SIMPLE scheme: single hash function
     • Looks like a truncated linked list
     • Intuition: the final state only depends on the hashes, not on the successive states → can uncolor elements
     [Figure: elements 10 and 11 hashed once each into buckets 1 to 9 of height h, with an overflow CAM]
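A minimal sketch of the SIMPLE scheme as described: one hash per element, and any element whose bucket is already full goes straight to the overflow list. Uniform hashing via `random.Random` and the function name `simple_scheme` are assumptions made for illustration.

```python
import random

def simple_scheme(n, m, h, seed=0):
    """Insert n elements with a single hash each; return the overflow rate.

    Each insertion costs exactly one memory access, so a = d = 1.
    """
    rng = random.Random(seed)
    loads = [0] * m
    overflow = 0
    for _ in range(n):
        b = rng.randrange(m)   # the single hash, modeled as uniform
        if loads[b] < h:
            loads[b] += 1
        else:
            overflow += 1      # bucket full: element goes to the overflow list
    return overflow / n

# Parameters matching the slides' plots: h=4, load=0.95, m=5,000.
rate = simple_scheme(n=19000, m=5000, h=4)
print(f"SIMPLE overflow rate: {rate:.4f}")
```

Because the final bucket loads depend only on where the hashes land, reordering the elements does not change the overflow count, which is the uncoloring intuition on the slide.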

  19. The SIMPLE Scheme: Proof Intuition
     • When all elements have been hashed: same reasoning as the offline lower bound
     • Result: for a = 1, SIMPLE is optimal (i.e., it achieves the minimum overflow rate)
     • Formal proof relies on mean-field analysis (differential equations with continuous-time fluid limit)
     [Figure: final state of buckets 1 to 9 of height h, with elements 10 and 11 in the overflow CAM]

  20. Performance of SIMPLE Scheme
     • The lower bound can actually be achieved for a=1
     [Figure: SIMPLE versus the lower bound; h=4, load=0.95, m=5,000]

  21. The GREEDY Scheme
     • Using uniform hashes, try to insert each element greedily until it is either inserted or d buckets have been tried
     [Figure: elements 10 to 12 inserted with d=2 into buckets 1 to 9 of height h, with an overflow CAM]
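A minimal sketch of the GREEDY insertion loop, assuming that each attempt uses a fresh uniform hash and that an element is declared overflowed after d failed attempts. The name `greedy_scheme` and the simulation setup are illustrative assumptions.

```python
import random

def greedy_scheme(n, m, h, d, seed=0):
    """Return (overflow rate, average accesses a) for greedy insertion.

    Each element probes up to d uniformly chosen buckets and stops at
    the first one with a free slot.
    """
    rng = random.Random(seed)
    loads = [0] * m
    overflow = accesses = 0
    for _ in range(n):
        for _attempt in range(d):
            b = rng.randrange(m)   # next hash function, modeled as uniform
            accesses += 1
            if loads[b] < h:
                loads[b] += 1
                break              # inserted: no further accesses needed
        else:
            overflow += 1          # d attempts failed: overflow list
    return overflow / n, accesses / n

rate, a = greedy_scheme(n=19000, m=5000, h=4, d=4)
print(f"GREEDY: overflow rate {rate:.4f}, average accesses a = {a:.3f}")
```

With d=1 this reduces to the SIMPLE scheme; with larger d the average a stays well below d because most elements succeed on an early attempt.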

  22. The GREEDY Scheme: Proof Intuition
     • Un-coloring argument: the 2nd try of a collided element is equivalent to a new element with 1 hash
     • (GREEDY with x elements, i.e. x∙a(x) hashes) is equivalent to (SIMPLE with x∙a(x) elements)
     • Optimal: for any x ≤ n elements
     • Optimality holds until no more elements can be added: cut-off point a_co ≡ a(n)
     [Figure: uncolored hashes of elements 10 to 14 over buckets 1 to 9 of height h, with an overflow CAM]

  23. Performance of GREEDY Scheme
     • The GREEDY scheme is always optimal until a_co
     [Figure: GREEDY versus the lower bound; d=4, h=4, load=0.95, m=5,000]

  24. Performance of GREEDY Scheme
     • Overflow rate worse than 4-left, but better throughput (1/a)
     [Figure: GREEDY versus 4-left; d=4, h=4, load=0.95, m=5,000]

  25. The MHT Scheme
     • MHT (Multi-Level Hash Table) [Broder&Karlin]: d successive subtables with their d hash functions
     [Figure: memory split into 1st, 2nd and 3rd subtables of decreasing size, with an overflow CAM]
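A minimal sketch of an MHT insertion: each element tries the subtables in order, one hash per subtable, and overflows if all d are full at the hashed position. The geometric sizing helper, its ratio of 0.5, and the names `geometric_sizes` and `mht_scheme` are illustrative assumptions; the slides do not give the ratio.

```python
import random

def geometric_sizes(m, d, ratio):
    """Split m buckets into d subtables whose sizes shrink by `ratio`."""
    weights = [ratio**i for i in range(d)]
    total = sum(weights)
    sizes = [max(1, round(m * w / total)) for w in weights]
    sizes[0] += m - sum(sizes)   # absorb rounding error in the first subtable
    return sizes

def mht_scheme(n, sizes, h, seed=0):
    """Return (overflow rate, average accesses a) for a multi-level hash table."""
    rng = random.Random(seed)
    tables = [[0] * size for size in sizes]
    overflow = accesses = 0
    for _ in range(n):
        for loads in tables:                 # 1st subtable, then 2nd, ...
            b = rng.randrange(len(loads))    # this subtable's own hash
            accesses += 1
            if loads[b] < h:
                loads[b] += 1
                break
        else:
            overflow += 1                    # every subtable was full
    return overflow / n, accesses / n

sizes = geometric_sizes(m=5000, d=4, ratio=0.5)  # ratio is an assumed value
rate, a = mht_scheme(n=19000, sizes=sizes, h=4)
print(sizes, f"overflow rate {rate:.4f}, average accesses a = {a:.3f}")
```

Trying the subtables in a fixed order concentrates most elements in the first, largest subtable, which keeps the average number of accesses low.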

  26. Performance of MHT Scheme
     • Optimality of MHT until cut-off point a_co(MHT)
     • Proof that subtable sizes fall geometrically
        • Confirmed in simulations
     • Overflow rate close to 4-left, with much better throughput (1/a)
     [Figure: MHT versus 4-left and the lower bound; d=4, h=4, load=0.95, m=5,000]

  27. Conclusion
     • Established the “capacity region” of high-speed hashing
     • Showed that three schemes are optimal on different ranges
     • MHT is optimal when subtable sizes fall geometrically
        • Long-known rule of thumb
     • The MHT cut-off point is larger than the GREEDY one

  28. Thank you.
