
Cuckoo Hashing: Hardware Implementations




  1. Cuckoo Hashing: Hardware Implementations • Adam Kirsch • Michael Mitzenmacher

  2. Motivation • Hash tables are ubiquitous. • Highly useful in router hardware. • Measurement and monitoring tasks. • Desiderata: • Few (parallel) memory accesses. • High space utilization. • Low failure probability. • Hardware-level simplicity. • What are good hash table designs for hardware?

  3. State of the Art : Multiple Choice Hashing • Each element placed in least loaded of d locations. (If 1 element/cell, look for 1 empty cell out of d.)
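A minimal sketch of the one-element-per-cell case, assuming d subtables and a salted call to Python's built-in `hash` as a stand-in for d independent hardware hash functions. The class name `MultiChoiceTable` and its parameters are illustrative, not taken from the slides.

```python
import random


class MultiChoiceTable:
    """d subtables, one element per cell; no elements are ever moved."""

    def __init__(self, d, cells_per_subtable, seed=0):
        rng = random.Random(seed)
        self.d = d
        self.m = cells_per_subtable
        # One salt per subtable stands in for d independent hash functions.
        self.salts = [rng.getrandbits(64) for _ in range(d)]
        self.cells = [[None] * self.m for _ in range(d)]

    def choices(self, key):
        """The key's candidate cell index in each of the d subtables."""
        return [hash((salt, key)) % self.m for salt in self.salts]

    def insert(self, key):
        """Return True if placed, False if all d candidate cells are occupied."""
        for i, pos in enumerate(self.choices(key)):
            if self.cells[i][pos] is None:
                self.cells[i][pos] = key
                return True
        return False

    def lookup(self, key):
        # In hardware the d candidate cells could be read in parallel.
        return any(self.cells[i][pos] == key
                   for i, pos in enumerate(self.choices(key)))
```

An insert that returns False here is exactly the kind of collision that the later slides handle with moves or with a CAM.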

  4. Cuckoo Hashing and Moves • Cuckoo hashing paradigm: give each element d choices, and move elements among choices as needed.

  5. Original Cuckoo Hashing • 2 subtables, left and right. Each element gets one location per subtable. • Place new element in left subtable. • If element already there, kick it out, move to right subtable. • If element already there, kick it out, move to left subtable… • Until everything placed. • Works with high probability as long as load is less than ½.
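The kick-out loop on this slide can be sketched as follows. This is an illustration under assumptions: the `max_kicks` cutoff and the `overflow` list are hypothetical additions so that the loop terminates and no key is lost, and salted calls to Python's `hash` stand in for the two hash functions.

```python
import random


class CuckooTable:
    """Two subtables (0 = left, 1 = right); each key has one cell in each."""

    def __init__(self, cells_per_subtable, max_kicks=100, seed=0):
        rng = random.Random(seed)
        self.m = cells_per_subtable
        self.max_kicks = max_kicks  # assumed cutoff, not from the slides
        self.salts = [rng.getrandbits(64), rng.getrandbits(64)]
        self.tables = [[None] * self.m, [None] * self.m]
        self.overflow = []  # assumed holding area for keys that could not be placed

    def _pos(self, side, key):
        return hash((self.salts[side], key)) % self.m

    def lookup(self, key):
        in_table = any(self.tables[s][self._pos(s, key)] == key for s in (0, 1))
        return in_table or key in self.overflow

    def insert(self, key):
        """Return the number of evicted elements that were moved, or None on failure."""
        side = 0  # new elements start in the left subtable
        for moves in range(self.max_kicks + 1):
            pos = self._pos(side, key)
            # Put the key in its cell and pick up whatever was there.
            key, self.tables[side][pos] = self.tables[side][pos], key
            if key is None:
                return moves
            side = 1 - side  # the evicted element tries its cell in the other subtable
        self.overflow.append(key)
        return None
```

The value returned by `insert` is the per-insert move count, which is the quantity the later slides try to bound.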

  6. Better Cuckoo Hashing • More choices • More elements per bucket • Generally kick out a random item. • Such schemes are not fully analyzed.
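One way to combine these three ideas is sketched below, assuming a single array of buckets shared by all d hash functions and a random-walk eviction with a hypothetical `max_kicks` cutoff. The names and the choice of a shared array are assumptions made for illustration.

```python
import random


class BucketedCuckooTable:
    """d candidate buckets per key, several slots per bucket, random eviction."""

    def __init__(self, d, buckets, slots_per_bucket, max_kicks=500, seed=0):
        self.rng = random.Random(seed)
        self.d, self.n, self.b = d, buckets, slots_per_bucket
        self.max_kicks = max_kicks  # assumed cutoff, not from the slides
        self.salts = [self.rng.getrandbits(64) for _ in range(d)]
        self.buckets = [[] for _ in range(buckets)]

    def _choices(self, key):
        return [hash((salt, key)) % self.n for salt in self.salts]

    def lookup(self, key):
        return any(key in self.buckets[c] for c in self._choices(key))

    def insert(self, key):
        """Return None on success, or the key left unplaced after max_kicks evictions."""
        for _ in range(self.max_kicks + 1):
            choices = self._choices(key)
            for c in choices:
                if len(self.buckets[c]) < self.b:
                    self.buckets[c].append(key)
                    return None
            # Every candidate bucket is full: kick out a random item.
            bucket = self.buckets[self.rng.choice(choices)]
            slot = self.rng.randrange(self.b)
            key, bucket[slot] = bucket[slot], key
        return key
```

Note that the key returned on failure may be an older element that was evicted along the way, not the one passed to `insert`.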

  7. What’s Wrong with Cuckoo Hashing? • Lots of moves per insert in worst case. • Average is constant. • But maximum is Omega(log n) with non-trivial (inverse-poly) probability. • Router hardware settings: may need bounded number of memory accesses per insert.

  8. Moves Needed per Insertion

  9. The Power of One Move • Previous work (submitted): How much gain from allowing just one move? • Framework: allow small content-addressable memory (CAM) to handle unsolvable collisions [max 0.2%]. • Multiple schemes analyzed. • With 4 choices, insertions only (no deletions), factor of 2 or larger improvement in space.
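The slide mentions several analyzed schemes without describing them, so the following is only one plausible single-move policy, written as an assumption: if all d candidate cells are full, try to relocate one occupant to an empty cell among its other choices, and fall back to the CAM (modeled here as a Python set) if that fails. All names are illustrative.

```python
import random


class OneMoveTable:
    """d subtables, one element per cell, at most one move per insertion."""

    def __init__(self, d, cells_per_subtable, seed=0):
        rng = random.Random(seed)
        self.d, self.m = d, cells_per_subtable
        self.salts = [rng.getrandbits(64) for _ in range(d)]
        self.cells = [[None] * self.m for _ in range(d)]
        self.cam = set()  # stand-in for a small content-addressable memory

    def _pos(self, i, key):
        return hash((self.salts[i], key)) % self.m

    def _place_if_empty(self, key, skip=None):
        """Place key in its first empty candidate cell, ignoring subtable `skip`."""
        for i in range(self.d):
            if i == skip:
                continue
            pos = self._pos(i, key)
            if self.cells[i][pos] is None:
                self.cells[i][pos] = key
                return True
        return False

    def insert(self, key):
        if self._place_if_empty(key):
            return "placed"
        # All d cells are full: try to relocate exactly one occupant.
        for i in range(self.d):
            pos = self._pos(i, key)
            occupant = self.cells[i][pos]
            if self._place_if_empty(occupant, skip=i):
                self.cells[i][pos] = key
                return "moved"
        self.cam.add(key)  # unsolvable collision goes to the CAM
        return "cam"

    def lookup(self, key):
        return key in self.cam or any(
            self.cells[i][self._pos(i, key)] == key for i in range(self.d))
```

Counting how often `insert` returns "cam" at a given load gives a rough feel for how large the CAM would need to be under this particular policy.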

  10. Pros/Cons of One Move Systems • Pros • Simple to implement • Efficient • High space utilization for insertion-only • Analyzable and optimizable • Cons • Performance suffers in settings with churn • Better space utilization possible with more moves

  11. The New Idea • Use the CAM as a queue for move operations. • Lookup: check the hash table and the CAM-queue. • Try move operations from queue as available. • Move attempt = 1 parallel memory lookup. • De-amortization • Use queue to make worst-case performance same as average-case performance.

  12. Queue Policy • Key point: better to give priority to “new” insertions over moves. • New insertions have d choices; moved items effectively have d – 1. • Intuition suggests older items may be less likely to be successfully placed. • True in practice. • Full priority queue may be too complex. • Simple strategy: new items placed at front, failed moves placed at back.
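The CAM queue and the simple front/back policy can be sketched as below. Several details are assumptions: the queue is a `collections.deque`, each entry remembers which subtable its key was evicted from (so a moved key has d - 1 options), and a key that finds no empty cell evicts a random occupant, which then joins the back of the queue. The `ops_per_insert` parameter bounds the queue operations done per insertion; d is assumed to be at least 2.

```python
import random
from collections import deque


class QueuedCuckooTable:
    """Cuckoo table whose pending moves wait in a small queue (the CAM)."""

    def __init__(self, d, cells_per_subtable, ops_per_insert, seed=0):
        self.rng = random.Random(seed)
        self.d, self.m, self.ops = d, cells_per_subtable, ops_per_insert
        self.salts = [self.rng.getrandbits(64) for _ in range(d)]
        self.cells = [[None] * self.m for _ in range(d)]
        # Entries: (key, subtable it was evicted from, or None for a new key).
        self.queue = deque()

    def _pos(self, i, key):
        return hash((self.salts[i], key)) % self.m

    def _queue_op(self):
        """One queue operation: one parallel read of the front key's cells."""
        key, came_from = self.queue.popleft()
        options = [i for i in range(self.d) if i != came_from]
        for i in options:
            pos = self._pos(i, key)
            if self.cells[i][pos] is None:
                self.cells[i][pos] = key
                return
        # No empty cell: take a random occupant's place, queue it at the back.
        i = self.rng.choice(options)
        pos = self._pos(i, key)
        evicted, self.cells[i][pos] = self.cells[i][pos], key
        self.queue.append((evicted, i))

    def insert(self, key):
        self.queue.appendleft((key, None))  # new insertions go to the front
        for _ in range(self.ops):  # bounded work per insertion
            if not self.queue:
                break
            self._queue_op()

    def lookup(self, key):
        in_queue = any(k == key for k, _ in self.queue)
        return in_queue or any(
            self.cells[i][self._pos(i, key)] == key for i in range(self.d))
```

Because `insert` never performs more than `ops_per_insert` queue operations, the worst-case work per insertion is fixed, and `len(table.queue)` after each insertion shows how much CAM space this policy would need.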

  13. Probability of Success vs. Age

  14. Experimental Evaluation • Table of size 32768, 4 subtables. • Target utilization u. • Insert 32768u elements, then alternate insertions/deletions to get to steady state. • Allow a fixed number, ops, of queue operations (parallel memory operations) per insertion.
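The workload described on this slide might be generated as in the sketch below. The slide does not say which element is deleted or in what order the alternation runs, so deleting a uniformly random live element and then inserting a fresh key are assumptions, as are the function and parameter names. The utilization is assumed to be large enough that at least one element is inserted.

```python
import random


def churn_workload(table_cells, utilization, steady_state_pairs, seed=0):
    """Yield ('insert', key) and ('delete', key) operations.

    First fills to int(table_cells * utilization) elements, then alternates
    one deletion and one insertion for steady_state_pairs rounds.
    """
    rng = random.Random(seed)
    live = []
    next_key = 0
    target = int(table_cells * utilization)
    for _ in range(target):
        live.append(next_key)
        yield ("insert", next_key)
        next_key += 1
    for _ in range(steady_state_pairs):
        # Assumed policy: delete a uniformly random live element.
        idx = rng.randrange(len(live))
        live[idx], live[-1] = live[-1], live[idx]
        yield ("delete", live.pop())
        # Then insert a fresh key so the load returns to the target.
        live.append(next_key)
        yield ("insert", next_key)
        next_key += 1
```

For the setup on the slide one would call `churn_workload(32768, u, rounds)` and feed the operations to the table under test, recording the queue length after each insertion.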

  15. Analysis • Currently we do not know how to analyze such systems. • For d > 2 choices, lots of open questions in cuckoo hashing analysis. • Analyzing d = 2 may be possible, but very low space utilization. • See [Kutzelnigg], asymptotic analysis of cuckoo hashing. • Need to understand distribution of move operations/element to analyze queue.

  16. Conclusions and Open Questions • Moving elements leads to much better space utilization in hash tables, at a price. • Cuckoo hashing appears implementable, with per-insert move guarantees based on de-amortization via a CAM queue. • Analysis in an idealized model? • Even analysis for basic cuckoo hashing open. • Performance on real traffic? • Bursty insertions/deletions? • Distribution of element lifetimes? • Proper sizing of CAM queue? • How does overflow probability scale?
