## Multiple Choice Hash Tables with Moves on Deletes and Inserts

Adam Kirsch, Michael Mitzenmacher

**Hashing: Modern Perspective**

- For many situations (e.g., hardware for routers), multiple choice hash tables are state of the art.
- Each item gets d possible hash locations and is placed in one of them.
- Moving items among their choices (e.g., cuckoo hashing) greatly improves space utilization.
- Only cost: an insert may take many moves.

**Previously**

- Schemes that move at most 1 item per insertion.
  - Limit cost of cuckoo hashing.
- Schemes that batch move operations in a queue.
  - Amortize cost of cuckoo hashing.
- Using content addressable memories (CAMs) to reduce chance of overflow.
  - Small CAMs yield big gains.

**Contributions**

- Consider potential of moving items on deletions.
- Focus on one move per deletion/insertion.
- Examine alternative approach using weaker hashing from [KTC, Peacock Hashing].
- Analyze limits of performance.

**Multilevel Hash Table [BK90]**

- Use a multilevel hash table (MHT).
- Can store n elements with d = log log n + O(1) levels in O(n) space with high probability.
- Example with d = 4 hash functions.
- Skew: more elements placed by early hash functions (double exponential decay).

[Figure: item x placed in a four-level table (levels 1 to 4)]

**Second Chance (SC) Scheme**

- Standard MHT fills from the top down.
  - Elements cascade from table to table.
- We try to slow the cascade at every step.

[Figure: standard MHT insertion of item x]

**CAMs**

- Last few collisions are hard to stop.
  - Can waste lots of space on few items.
- Solution: content addressable memory.
  - CAMs are fully associative.
  - Hold small numbers of items.

**Moves on Deletions**

- Harder to manage.
- What item to move up?
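The MHT and Second Chance slides above describe insertion only in outline, so the following is a minimal sketch rather than the authors' implementation. It assumes one item per bucket, level sizes that halve, salted calls to Python's `hash` standing in for the d hash functions, and a plain list standing in for the CAM. The Second Chance rule coded here is one reading of "slow the cascade at every step": if item x fits neither at level i nor at level i+1, try to push the occupant of x's level-i bucket down one level and put x in its place. The class name `MultilevelHashTable` and all method names are hypothetical.

```python
import random


class MultilevelHashTable:
    """Toy multilevel hash table: d levels, one item per bucket."""

    def __init__(self, top_size, d=4, seed=0):
        rng = random.Random(seed)
        # Assumption: each level is half the size of the one above it.
        self.sizes = [max(1, top_size >> i) for i in range(d)]
        self.levels = [[None] * s for s in self.sizes]
        # Assumption: per-level salts stand in for d independent hash functions.
        self.salts = [rng.getrandbits(64) for _ in range(d)]
        self.overflow = []  # stand-in for the small CAM

    def _slot(self, level, item):
        return hash((self.salts[level], item)) % self.sizes[level]

    def contains(self, item):
        return item in self.overflow or any(
            self.levels[i][self._slot(i, item)] == item
            for i in range(len(self.levels))
        )

    def insert_standard(self, item):
        """Place the item at the first level whose bucket is empty."""
        for i in range(len(self.levels)):
            s = self._slot(i, item)
            if self.levels[i][s] is None:
                self.levels[i][s] = item
                return i
        self.overflow.append(item)
        return None

    def insert_second_chance(self, item):
        """Like insert_standard, but with at most one move of another item."""
        d = len(self.levels)
        for i in range(d):
            s = self._slot(i, item)
            if self.levels[i][s] is None:
                self.levels[i][s] = item
                return i
            if i + 1 < d:
                if self.levels[i + 1][self._slot(i + 1, item)] is None:
                    continue  # the item fits one level down; no move needed
                # The item would cascade past level i+1, so try to move
                # the current occupant down one level instead.
                occupant = self.levels[i][s]
                t = self._slot(i + 1, occupant)
                if self.levels[i + 1][t] is None:
                    self.levels[i + 1][t] = occupant
                    self.levels[i][s] = item
                    return i
        self.overflow.append(item)
        return None


if __name__ == "__main__":
    n = 1024
    std, sc = MultilevelHashTable(n), MultilevelHashTable(n)
    for x in range(n):
        std.insert_standard(x)
        sc.insert_second_chance(x)
    print("overflow (standard):", len(std.overflow))
    print("overflow (second chance):", len(sc.overflow))
```

Comparing the two overflow counts for a given `n` and `d` gives a rough feel for how much a single move per insertion reduces pressure on the CAM; the numbers depend on the stand-in hash functions and are not a reproduction of the slides' simulations.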
[Figure for Moves on Deletions: four-level table (levels 1 to 4) with item x]

**Hint-Based Approach**

- Each cell stores a hint for where an item to move on delete is held.
- Hints can be kept fairly small.
  - About log n bits.
- Various hint approaches possible.
  - We found "replace hint on any collision" works well.
  - May depend on item lifetime distribution, etc.
- One move, recursive move variations.

**Simulation Data**

- No current method of analysis for hints.
  - Use simulations. 10,000 trials per data point.
- MHT levels decreasing in size by a factor of 2, plus a small CAM.
  - With n items, top level has size n.
  - Space usage just above 50%.
- Load table to n elements, then alternate inserts/deletes for 2^18 steps.
  - Exponentially distributed lifetimes.
- Goal: how many hash functions are needed?

**Lessons from Simulations**

- No moves: very weak.
- Second Chance (move on insert) is more powerful than hint-based move on delete.
  - But the two combine well.
- Four hash functions: better than 50% load, small CAM.

**Alternative: Weak Hashes**

- To avoid hints, overflow at each bucket splits to two buckets at the next level.
  - Each bucket receives from four buckets.
- Less spreading of items, but we know where to look on deletes.
- Conjecture: loss of randomness implies weak performance.

**Two Idealized Schemes**

- Each bucket holds a random item, splits the rest.
- Each bucket counts items passed to bucket A and bucket B at the next level, and greedily holds an item from the bucket with the larger count.
- Assume invariants are kept over insertions/deletions at all times.
- Can be analyzed recursively, level by level.
  - Get distribution of bucket loads at each level.
  - Obtain average-case performance.

**Conclusions**

- Weak hashes, based on buckets, are much less effective than hints.
  - Even under optimistic assumptions.
- One-move approaches are effective.
  - Move on insert and move on delete complement each other.
- Need methods for analysis.
  - Challenging dependencies; hard to get exact numbers.
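
The "just above 50%" space figure on the Simulation Data slide follows from the geometric level sizes. As a worked check, assuming d levels with the top level of size n and each later level half the previous one, and ignoring the small CAM:

$$
\sum_{i=0}^{d-1} \frac{n}{2^{i}} = n\left(2 - 2^{\,1-d}\right) < 2n,
\qquad
\text{load} = \frac{n}{n\left(2 - 2^{\,1-d}\right)} = \frac{1}{2 - 2^{\,1-d}} > \frac{1}{2}.
$$

For d = 4, the case highlighted on the Lessons slide, the total is 1.875n cells, so n items give a load of 8/15, roughly 53%.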