Lecture 10 Hashing
Explore the motivation, properties, collisions, solutions, universal hash function, modular arithmetic, and designing hash functions. Learn the concepts behind efficient lookup and space optimization in hashing.
Motivation: Set and Map • Goal: An array whose index can be any object. • Example: Dictionary. Dictionary[“hash”] = “a dish of diced or chopped meat and often vegetables…” • Properties: 1. Efficient lookup: ideally, lookup takes O(1) time. 2. Space: the space used is within a constant factor of that of a list.
Naïve implementation of a set • Method 1: Maintain a linked list. • Problem: Lookup takes O(n) time. • Method 2: Use a large array a, with a[i] = 1 if i is in the set. • Problem: Needs a huge amount of memory (one entry for every possible value of i).
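The two naïve methods above can be sketched as follows. This is only an illustration: the class names `ListSet` and `DirectAddressSet` are hypothetical, and a Python list stands in for the linked list (the linear-scan lookup cost is the same).

```python
class ListSet:
    """Method 1: keep the elements in a list; lookup scans the elements."""

    def __init__(self):
        self.items = []

    def add(self, x):
        if not self.contains(x):
            self.items.append(x)

    def contains(self, x):
        # Linear scan: up to n comparisons for n stored elements.
        return any(item == x for item in self.items)


class DirectAddressSet:
    """Method 2: one array entry per possible value; a[i] = 1 if i is in the set."""

    def __init__(self, universe_size):
        # Memory grows with the largest possible value, not with the set size.
        self.a = bytearray(universe_size)

    def add(self, i):
        self.a[i] = 1

    def contains(self, i):
        return self.a[i] == 1
```

Storing the single number 643523 with `DirectAddressSet` already needs an array of at least 643524 entries, which is the memory problem the slide points to.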
Hashing • Idea: Assign each number a random location in a small array a. • Example: store the set {3, 10, 3424, 643523}. • Store number i in a[f(i)]. • f: the hash function.
Collisions • Problem: We want to add 123, but f(123) = 4 = f(3424). • (Collisions are unavoidable whenever there are more possible numbers than locations, by the pigeonhole principle.) • Solution: 123 and 3424 will share this location.
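One common way to let several numbers share a location is to keep a small list (a chain) in every slot. The sketch below assumes that approach; the class name `ChainedHashSet` and the function `toy_hash` are hypothetical. The slides do not specify f, so `toy_hash` is a stand-in and the numbers that collide under it differ from the slide's example.

```python
class ChainedHashSet:
    """Set of integers where colliding elements share a slot via a chain."""

    def __init__(self, num_slots, f):
        self.f = f  # hash function: maps a number to a slot index
        self.slots = [[] for _ in range(num_slots)]

    def add(self, x):
        chain = self.slots[self.f(x)]
        if x not in chain:
            chain.append(x)

    def contains(self, x):
        # Only the chain in slot f(x) has to be searched.
        return x in self.slots[self.f(x)]


def toy_hash(i):
    # Hypothetical stand-in for f: the last decimal digit of i.
    return i % 10


s = ChainedHashSet(10, toy_hash)
for x in (3, 10, 3424, 643523, 123):
    s.add(x)
```

Under `toy_hash`, the numbers 3, 643523, and 123 all land in slot 3 and share its chain, while 3424 sits alone in slot 4.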
Fixed Hash Function • If the hash function is fixed, then lookups can be very slow on some bad inputs. • Example: We can find n numbers $x_1, x_2, \ldots, x_n$ such that $f(x_i) = y$ for some fixed y (always possible by the pigeonhole principle). • Then the hash table degenerates into a linked list. • Solution: Choose the hash function at random from a family of hash functions.
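The bad case can be reproduced with a short sketch. It assumes, purely for illustration, that the fixed hash function is `i % NUM_SLOTS`; the names `fixed_hash`, `chain_lengths`, and `bad_keys` are hypothetical. Anyone who knows the function can pick numbers that all map to the same location.

```python
NUM_SLOTS = 8


def fixed_hash(i):
    # Assumed fixed hash function, known in advance to the adversary.
    return i % NUM_SLOTS


def chain_lengths(keys, num_slots, f):
    """Count how many keys land in each slot under hash function f."""
    lengths = [0] * num_slots
    for k in keys:
        lengths[f(k)] += 1
    return lengths


# Adversarial input: 100 multiples of NUM_SLOTS, all hashing to slot 0.
bad_keys = [NUM_SLOTS * j for j in range(100)]
worst = max(chain_lengths(bad_keys, NUM_SLOTS, fixed_hash))

# Benign input for comparison: the numbers 0..99 spread across all slots.
typical = max(chain_lengths(range(100), NUM_SLOTS, fixed_hash))
```

Here `worst` is 100 (every key sits in one chain, so a lookup scans all n keys), while `typical` is 13.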
Universal Hash Function • The hash function should be as “random” as possible. • Ideally: Choose a random function out of all functions! • However: We cannot store a totally random function. • We can use modular arithmetic to construct good hash functions! • Goal: Construct a family of hash functions F such that for any x ≠ y, we have $\Pr_{f \in F}[f(x) = f(y)] \le 1/m$, where m is the number of locations in the table.
Recap: Modular Arithmetic • For a prime number p, only consider the numbers {0, 1, 2, 3, …, p-1}. • Addition, subtraction, and multiplication work the usual way (take mod p at the end). • Inverse: For any integer 0 < x < p, there is an integer 0 < y < p such that $xy \equiv 1 \pmod{p}$. • Example: p = 7, x = 2, then y = 4, since 2 · 4 = 8 ≡ 1 (mod 7). We call $y = x^{-1}$. • The inverse can be computed efficiently.
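The slides do not say which method computes the inverse. One option, assumed here, uses Fermat's little theorem: for a prime p and 0 < x < p, x^(p-1) ≡ 1 (mod p), so x^(p-2) mod p is the inverse of x. The function name `mod_inverse` is hypothetical.

```python
def mod_inverse(x, p):
    """Return y with 0 < y < p and (x * y) % p == 1, assuming p is prime."""
    if not 0 < x < p:
        raise ValueError("x must satisfy 0 < x < p")
    # Three-argument pow does fast modular exponentiation,
    # using O(log p) multiplications.
    return pow(x, p - 2, p)


y = mod_inverse(2, 7)
```

This gives `y == 4`, matching the example above. In Python 3.8 and later, `pow(x, -1, p)` computes the same inverse directly.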
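Designing the Hash function • Pick a prime number p and construct a hash family with $p^2$ functions. • For every a, b in {0, 1, 2, …, p-1}, we have $f_{a,b}(x) = (ax + b) \bmod p$. • Claim: For every x, y (x ≠ y) and any two numbers u, v in {0, 1, 2, …, p-1}, we have $\Pr_{a,b}[f_{a,b}(x) = u \text{ and } f_{a,b}(y) = v] = 1/p^2$.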
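For a small prime, the claim can be checked by brute force. The sketch below assumes the family is f_{a,b}(x) = (ax + b) mod p, the standard construction with p^2 functions, and reads the claim as: for x ≠ y, each output pair (u, v) occurs with probability exactly 1/p^2 over a random choice of (a, b). The helper names `make_hash`, `random_hash`, and `pair_counts` are hypothetical.

```python
import random
from collections import Counter
from itertools import product


def make_hash(a, b, p):
    """Return the member f_{a,b} of the family: x -> (a*x + b) mod p."""
    def f(x):
        return (a * x + b) % p
    return f


def random_hash(p, rng=random):
    """Pick one of the p*p functions uniformly at random."""
    return make_hash(rng.randrange(p), rng.randrange(p), p)


def pair_counts(x, y, p):
    """For every (a, b), record the output pair (f(x), f(y))."""
    counts = Counter()
    for a, b in product(range(p), repeat=2):
        f = make_hash(a, b, p)
        counts[(f(x), f(y))] += 1
    return counts


counts = pair_counts(2, 5, 7)
```

With p = 7, x = 2, and y = 5, each of the 49 possible pairs (u, v) is produced by exactly one choice of (a, b), so every pair has probability 1/49 = 1/p^2. In particular f(x) = f(y) happens for exactly p of the p^2 functions, which is a collision probability of 1/p.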