Randomized Algorithms CS648
Lecture 12: Hashing - II
Recap of Last Lecture
Problem Definition
• $U = \{0, 1, \ldots, m-1\}$, called the universe
• $S \subseteq U$ and $s = |S|$
Aim: Given a set $S \subseteq U$, build a data structure storing $S$ such that we can answer in O(1) time: “Does $i \in S$?” for any given $i \in U$.
Hashing
• Hash table: $T$, an array of size $n$.
• Hash function: $h : U \to \{0, 1, \ldots, n-1\}$
Answering a query “Does $i \in S$?”:
• $k \leftarrow h(i)$;
• Search the list stored at $T[k]$.
Properties of $h$:
• computable in O(1) time.
• Space required by $h$: O(1).
How many bits are needed to encode $h$?
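The query procedure above can be sketched in a few lines of Python. This is only an illustration: the class name `ChainedHashTable`, the table size `n = 8`, and the placeholder hash function `i % n` are assumptions chosen for the example, not part of the lecture.

```python
class ChainedHashTable:
    """Hash table with chaining: T is an array of n lists."""

    def __init__(self, n, h):
        self.n = n
        self.h = h                        # hash function U -> {0, ..., n-1}
        self.T = [[] for _ in range(n)]   # one (initially empty) list per cell

    def insert(self, i):
        k = self.h(i)
        if i not in self.T[k]:
            self.T[k].append(i)

    def contains(self, i):
        k = self.h(i)                     # k <- h(i)
        return i in self.T[k]             # search the list stored at T[k]


n = 8
table = ChainedHashTable(n, lambda i: i % n)   # placeholder hash function
for i in [3, 11, 42, 10**18 - 1]:
    table.insert(i)
```

Here 3 and 11 land in the same list, so the cost of a query is the length of the list it has to scan.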
Collision
Definition: Two elements $i, j \in U$ are said to collide under hash function $h$ if $h(i) = h(j)$.
Worst-case time complexity of searching an item $i$: the number of elements in $S$ colliding with $i$.
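Universal Hash Family
Definition: A collection $H$ of hash functions is said to be universal if there exists a constant $c$ such that for any $i, j \in U$ with $i \neq j$,
$\mathbf{P}_{h \in_r H}[h(i) = h(j)] \le \frac{c}{n}$
where $n$ is the size of the hash table and $h \in_r H$ denotes an $h$ chosen uniformly at random from $H$.
This definition appears strange at first, but we shall soon see that there is a very natural way to arrive at it.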
Perfect hashing using O($s^2$) space
Let $H$ be a universal hash family.
Let $X$: the number of collisions for $S$ when $h \in_r H$.
Question: What is $\mathbf{E}[X]$?
Lemma 1: $\mathbf{E}[X] \le \frac{c}{n} \cdot \frac{s(s-1)}{2}$
Lemma 2: For $n = c s^2$, there will be no collision with probability at least $1/2$ (since then $\mathbf{E}[X] < 1/2$, Markov's inequality gives $\mathbf{P}[X \ge 1] < 1/2$).
Algorithm 1: Perfect hashing for $S$
Fix $n \leftarrow c s^2$;
Repeat
• Pick $h \in_r H$;
• $t \leftarrow$ the number of collisions for $S$ under $h$.
Until $t = 0$.
Build the hash table.
Theorem: A perfect hash function can be computed for $S$ in expected O($s^2$) time.
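Algorithm 1 can be sketched as follows. The sketch needs a concrete universal family, so it assumes the family $h_x(i) = (ix \bmod p) \bmod n$ with $c = 2$; the prime `P = 2**61 - 1`, the function names, and the sample set `S` are likewise assumptions made for illustration.

```python
import random

P = 2**61 - 1  # assumed prime modulus, larger than every key in the example


def count_collisions(S, h, n):
    """Number of colliding pairs of S under h, for a table of size n."""
    load = [0] * n
    for i in S:
        load[h(i)] += 1
    return sum(l * (l - 1) // 2 for l in load)


def perfect_hash(S, c=2, rng=random):
    """Repeat until a collision-free h_x is found; returns (x, n)."""
    s = len(S)
    n = max(1, c * s * s)                   # Fix n <- c * s^2
    while True:
        x = rng.randrange(1, P)             # Pick h uniformly at random from H
        h = lambda i, x=x: ((i * x) % P) % n
        if count_collisions(S, h, n) == 0:  # Until t = 0
            return x, n


S = [3, 11, 42, 1000, 10**18 - 1]
x, n = perfect_hash(S)
slots = sorted(((i * x) % P) % n for i in S)
```

Each trial succeeds with probability at least 1/2 under these assumptions, so the expected number of iterations of the loop is at most 2.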
Optimal space hashing with worst case O(1) search time
Let $H$ be a universal hash family.
$X$: no. of collisions for $S$ when $h \in_r H$.
Lemma 1: $\mathbf{E}[X] \le \frac{c}{n} \cdot \frac{s(s-1)}{2}$.
Question: What is $\mathbf{E}[X]$ when $n = s$?
Answer: At most $\frac{c(s-1)}{2}$.
Algorithm:
Fix $n \leftarrow s$;
Repeat
• Pick $h \in_r H$;
• $t \leftarrow$ no. of collisions for $S$ under $h$;
Until $t \le cs$; // by Markov's inequality, each trial succeeds with probability more than 1/2
Build the hash table $T$; // primary hash table
For each $0 \le k < n$
If size of list $T[k]$ > 1
1. Build a perfect hash table for list $T[k]$;
2. Make $T[k]$ point to this hash table;
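Let $H$ be a universal hash family.
$X$: no. of collisions for $S$ when $h \in_r H$.
Lemma 1: $\mathbf{E}[X] \le \frac{c(s-1)}{2}$ when $n = s$.
Let $s_k$: number of elements in $T[k]$.
Extra space required: $\sum_k c\, s_k^2$
Is there any relation between $X$ and the $s_k$'s?
• $X = \sum_k \frac{s_k(s_k-1)}{2}$
• $\sum_k s_k^2 = 2X + \sum_k s_k = 2X + s$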
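A short derivation of the space bound, written out as a sketch. It assumes that the primary hash function is accepted only when its collision count satisfies $X \le cs$; that threshold is an assumption, and any O($s$) threshold leads to the same conclusion.

```latex
\begin{align*}
X &= \sum_{k=0}^{n-1} \binom{s_k}{2}
   = \frac{1}{2}\sum_{k} s_k^2 - \frac{1}{2}\sum_{k} s_k, \\
\sum_{k} s_k^2 &= 2X + s
   \qquad \text{(since } \textstyle\sum_k s_k = s\text{)}, \\
\sum_{k} c\, s_k^2 &= c\,(2X + s) \le c\,(2cs + s) = (2c^2 + c)\, s = O(s).
\end{align*}
```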
Theorem: A given set $S$ of $s$ elements can be preprocessed in expected O($s$) time to build a data structure (2-level hash table) of O($s$) size such that any search query can be answered in worst case O(1) time.
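The 2-level construction can be sketched end to end as below. This is a minimal illustration under stated assumptions, not the lecture's own code: it uses the family $h_x(i) = (ix \bmod p) \bmod n$ with $c = 2$, the prime `P = 2**61 - 1`, the acceptance threshold `c * s` for the primary table, distinct keys, and hypothetical function names.

```python
import random

P = 2**61 - 1  # assumed prime modulus, larger than every key


def make_hash(x, n):
    return lambda i: ((i * x) % P) % n


def build_two_level(S, c=2, rng=random):
    s = len(S)
    n = max(1, s)
    # Primary level: retry until the number of colliding pairs is at most c*s.
    while True:
        x = rng.randrange(1, P)
        h = make_hash(x, n)
        buckets = [[] for _ in range(n)]
        for i in S:
            buckets[h(i)].append(i)
        collisions = sum(len(b) * (len(b) - 1) // 2 for b in buckets)
        if collisions <= c * s:
            break
    # Secondary level: a collision-free table of size c * s_k^2 per bucket.
    secondary = []
    for b in buckets:
        size = c * len(b) ** 2
        while True:
            y = rng.randrange(1, P)
            slots = [None] * size
            ok = True
            for i in b:
                k = make_hash(y, size)(i)
                if slots[k] is not None:   # collision: pick a new y
                    ok = False
                    break
                slots[k] = i
            if ok:
                break
        secondary.append((y, slots))
    return x, n, secondary


def contains(structure, i):
    """Worst-case O(1) query: one primary probe, one secondary probe."""
    x, n, secondary = structure
    y, slots = secondary[make_hash(x, n)(i)]
    if not slots:
        return False
    return slots[make_hash(y, len(slots))(i)] == i


S = [i * i + 7 for i in range(200)]
structure = build_two_level(S)
total_slots = sum(len(slots) for _, slots in structure[2])
```

With `c = 2` and the threshold above, the total secondary space is at most `c * (2 * c * s + s) = 10 * s` cells, which is the O($s$) bound in the theorem.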
Why does hashing work so well in practice?
A simple hash function: $h(i) = i \bmod n$.
• It works well in practice because the set $S$ is usually a uniformly random subset of $U$. As a result, few elements of $S$ collide.
• It is easy to fool this hash function so that it needs O($s$) search time.
This makes us think: “Can we achieve expected O(1) search time for any given set $S$?”
We faced a similar question with Quick Sort, and the answer there was Randomized Quick Sort.
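A small illustration of how the simple hash function can be fooled. The values `n = 10` and the adversarial set `S_bad` (all multiples of `n`) are assumptions chosen for the example.

```python
n = 10
S_bad = [n * k for k in range(1, 21)]   # 20 keys, all multiples of n

buckets = [[] for _ in range(n)]
for i in S_bad:
    buckets[i % n].append(i)            # h(i) = i mod n

longest = max(len(b) for b in buckets)
```

Every key lands in cell 0, so a search has to scan a list of length $s$.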
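Universal Hash Family
A simple hash function: $h(i) = i \bmod n$.
Definition: A collection $H$ of hash functions is said to be universal if there exists a constant $c$ such that for any $i, j \in U$ with $i \neq j$,
$\mathbf{P}_{h \in_r H}[h(i) = h(j)] \le \frac{c}{n}$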
The starting point
The simple hash function: $h(i) = i \bmod n$.
Problem: Two elements $i, j \in U$ are bound to collide if $n$ divides $|i - j|$.
Is there some operation which, when applied to any $i, j$, distributes $|i - j|$ uniformly at random over $\{0, 1, \ldots, n-1\}$?
mod operation
$a$: a non-negative integer
$p$: a positive integer
$a \bmod p \in \{0, 1, \ldots, p-1\}$.
Question: How is $|a \bmod p - b \bmod p|$ related to $|a - b| \bmod p$?
Consider some examples:
• $|55 \bmod 31 - 43 \bmod 31| = 12$ and $|55 - 43| \bmod 31 = 12$
• $|91 \bmod 31 - 102 \bmod 31| = 20$ and $|91 - 102| \bmod 31 = 11$
Answer: Let $r = |a - b| \bmod p$. Then $|a \bmod p - b \bmod p| \in \{r, p - r\}$.
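The two examples and the general answer can be checked mechanically. The helper names `mod_gap` and `gap_is_r_or_p_minus_r` are hypothetical, introduced only for this check.

```python
def mod_gap(a, b, p):
    """|a mod p - b mod p|"""
    return abs(a % p - b % p)


def gap_is_r_or_p_minus_r(a, b, p):
    """Check that |a mod p - b mod p| is r or p - r, where r = |a - b| mod p."""
    r = abs(a - b) % p
    return mod_gap(a, b, p) in {r, p - r}


first = (mod_gap(55, 43, 31), abs(55 - 43) % 31)     # the first example
second = (mod_gap(91, 102, 31), abs(91 - 102) % 31)  # the second example
```

In the second example the gap is 20 = 31 - 11, which is the $p - r$ case.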
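mod operation
$p$: a prime number
$Z_p^* = \{1, 2, \ldots, p-1\}$
Consider any $a \in Z_p^*$.
Question: What can we say about the set $M_a = \{a x \bmod p \mid x \in Z_p^*\}$?
Example: $p = 7$. For $a = 3$, the values $a x \bmod p$ for $x = 1, \ldots, 6$ are 3, 6, 2, 5, 1, 4. For $a = 4$, they are 4, 1, 5, 2, 6, 3.
Fact: $M_a = Z_p^*$ for all $a \in Z_p^*$.
Proof: Suppose $a x \bmod p = a x' \bmod p$ for some $x \neq x'$ in $Z_p^*$.
• Then $p$ divides $a x - a x'$,
• so $p$ divides $a(x - x')$,
• so $p$ divides $a$ or $p$ divides $(x - x')$, because $p$ is prime.
Not possible, since $1 \le a < p$ and $0 < |x - x'| < p$. Hence the $p - 1$ values $a x \bmod p$ are distinct and nonzero.
Question: If $x \in_r Z_p^*$, then what can we say about $a x \bmod p$?
Answer: It is distributed uniformly at random over $Z_p^*$.
Can you now see that the above answer plays the key role in formulating the hash function?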
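The fact is easy to check by brute force for small primes. The function name `multiples_mod` is hypothetical; the composite modulus 6 is included only as a contrast, to show where primality is used.

```python
def multiples_mod(a, p):
    """The sequence a*x mod p for x = 1, ..., p-1."""
    return [(a * x) % p for x in range(1, p)]


row_3 = multiples_mod(3, 7)
row_4 = multiples_mod(4, 7)

# For a prime p, every row is a permutation of {1, ..., p-1}.
is_permutation = all(
    sorted(multiples_mod(a, 7)) == list(range(1, 7)) for a in range(1, 7)
)

# For a composite modulus the fact fails: 2*3 = 6 is divisible by 6.
composite_row = multiples_mod(2, 6)
```

For the composite modulus the value 0 appears and other values repeat, so a random multiplier no longer yields a uniform result.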
Good fact: An element $i$ is mapped to a uniformly random element of $\{1, \ldots, p-1\}$.
Slightly bad fact: Once element $i$ is mapped to a location, the mapping of $j$ is no longer random. So it is not clear whether $|i - j|$ is mapped uniformly at random over $\{0, \ldots, n-1\}$.
So let us look at $h_x(\cdot)$ a bit more closely…
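Probability of collision between $i$ and $j$
Let $h_x(i) = (i x \bmod p) \bmod n$, where $p$ is a prime larger than every element of $U$ and $x \in \{1, \ldots, p-1\}$. Take $i, j \in U$ with $i > j$.
$i$ and $j$ will collide under $h_x$ if $|i x \bmod p - j x \bmod p|$ is divisible by $n$.
Question: What is the relation between $|i x \bmod p - j x \bmod p|$ and $(i - j) x \bmod p$?
Answer: $|i x \bmod p - j x \bmod p|$ is either $(i - j) x \bmod p$ or $p - ((i - j) x \bmod p)$.
Lemma: If $i$ and $j$ collide under $h_x$, then either $(i - j) x \bmod p$ is divisible by $n$ or $p - ((i - j) x \bmod p)$ is divisible by $n$.
$\{(i - j) x \bmod p \mid x \in \{1, \ldots, p-1\}\} = \{1, \ldots, p-1\}$
Let $x \in_r \{1, \ldots, p-1\}$.
Probability of collision between $i$ and $j$
$\le \mathbf{P}\big((i - j) x \bmod p$ is divisible by $n$, or $p - ((i - j) x \bmod p)$ is divisible by $n\big)$
$\le 2\,\mathbf{P}\big((i - j) x \bmod p$ is divisible by $n\big) = \frac{2 \lfloor (p-1)/n \rfloor}{p-1} \le \frac{2}{n}$
Students must realize that this is a necessary condition for a collision, not a sufficient one. To get an idea, study the example given on the last slide of this lecture.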
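The bound can be checked exactly for small parameters by enumerating every multiplier $x$. This is a sketch assuming the family $h_x(i) = (ix \bmod p) \bmod n$; the function names and the parameter choices ($p = 31$, $n = 5$, and the pair $i = 2$, $j = 1$ with $p = 7$, $n = 4$) are illustrative assumptions.

```python
from fractions import Fraction


def collision_probability(i, j, p, n):
    """Exact P[h_x(i) = h_x(j)] over x uniform in {1, ..., p-1}."""
    hits = sum(1 for x in range(1, p) if (i * x % p) % n == (j * x % p) % n)
    return Fraction(hits, p - 1)


def necessary_condition_probability(i, j, p, n):
    """Exact probability that alpha or p - alpha is divisible by n,
    where alpha = |i - j| * x mod p."""
    d = abs(i - j)
    hits = 0
    for x in range(1, p):
        alpha = (d * x) % p
        if alpha % n == 0 or (p - alpha) % n == 0:
            hits += 1
    return Fraction(hits, p - 1)


p, n = 31, 5
pairs = [(i, j) for i in range(p) for j in range(i)]
bound_holds = all(
    collision_probability(i, j, p, n)
    <= necessary_condition_probability(i, j, p, n)
    <= Fraction(2, n)
    for i, j in pairs
)

# The condition is necessary but not sufficient: here it holds for two
# values of x, yet i = 2 and j = 1 never actually collide.
actual = collision_probability(2, 1, 7, 4)
necessary = necessary_condition_probability(2, 1, 7, 4)
```

The last two values show the gap between the two events: the necessary condition has probability 1/3 while the true collision probability is 0.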
Theorem: Let $h_x(i) = (i x \bmod p) \bmod n$. Then $H = \{h_x \mid 1 \le x \le p-1\}$ is universal.
Example , . Observe that =1.
Question: How many collisions between and ?
Answer: two (for =3,4). Here for =4. And for =3.
Answer: No collisions! (although for here.)
[Table with columns 1 to 6; caption begins "Table storing"]
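Homework: Let $h_{x,y}(i) = ((i x + y) \bmod p) \bmod n$. Then prove that $H = \{h_{x,y} \mid 1 \le x \le p-1,\ 0 \le y \le p-1\}$ is universal. In particular, show that for any $i, j \in U$ with $i \neq j$,
$\mathbf{P}_{x,y}[h_{x,y}(i) = h_{x,y}(j)] \le \frac{1}{n}$
Hence it is slightly better than the hash family discussed just now.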
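Before attempting the proof, the claim can be sanity-checked numerically. The sketch below assumes the homework family is $h_{x,y}(i) = ((ix + y) \bmod p) \bmod n$ with $1 \le x \le p-1$ and $0 \le y \le p-1$; the function name and the small parameters $p = 13$, $n = 4$ are illustrative assumptions. It is a check on one instance, not a proof.

```python
from fractions import Fraction


def collision_probability_affine(i, j, p, n):
    """Exact P[h_{x,y}(i) = h_{x,y}(j)] over all (x, y) pairs."""
    hits = sum(
        1
        for x in range(1, p)
        for y in range(p)
        if ((i * x + y) % p) % n == ((j * x + y) % p) % n
    )
    return Fraction(hits, (p - 1) * p)


p, n = 13, 4
worst = max(
    collision_probability_affine(i, j, p, n)
    for i in range(p)
    for j in range(i)
)
```

For these parameters the worst pair stays below $1/n$, consistent with the bound the homework asks for.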