Understanding IP Address Lookup and Hashing Techniques
Learn about IP address lookup, hashing, chaining, hash functions, universal hash functions, perfect hashing, bloom filters, false positive probabilities, and optimal hash function selection.
Understanding IP Address Lookup and Hashing Techniques
E N D
Presentation Transcript
Look-up problem IP address did we see the IP address before?
Hashing + chaining use IP address as an index linked list index hash function IP address
How to choose a hash function? depends on the distribution of the data x [0,1] x x.n interpret x as a number x x mod n IP-addresses n=256 bad n=257 good?
Universal hash functions choose a hash function “randomly” n = number of entries in the hash table U = the universe h: U {0,...,n-1} a hash function
Universal hash functions choose a hash function “randomly” n = number of entries in the hash table U = the universe h: U {0,...,n-1} a hash function a set of hash functions H is universal if x,y U and random h H P ( h(x) = h(y) ) 1/n
Universal hash functions a set of hash functions H is universal if x,y U and random h H P ( h(x) = h(y) ) 1/n For IP addresses choose a1,a2,a3,a4 {0,1,...,256} (x1,x2,x3,x4) a1x1+a2x2+a3x3+a4x4 mod 257
Perfect hashing Goal: worst-case O(1) search space used O(m) static set of elements
Perfect hashing Goal: worst-case O(1) search space used O(m) static set of elements n = m2 i.e., space used (m2) H = family of universal hash functions hash function h H with no collision
Perfect hashing Goal: worst-case O(1) search space used O(m) n = m H = family of universal hash functions x1,...,xn the number of elements that map to 1,2,...,n h H such that xi2 = O(m)
h H such that xi2 = O(m) Perfect hashing Goal: worst-case O(1) search space used O(m) n = m H = family of universal hash functions x1,...,xn the number of elements that map to 1,2,...,n secondary hash table of size xi2
Bloom filter Goal: store an m element subset of IP addresses IP address HASH HASH HASH 0 0 0 n-bits of storage
Bloom filter - insert INSERT(x) for i from 1 to k do A(hi(x)) 1 IP address HASH HASH HASH 1 1 1 n-bits of storage
Bloom filter – member MEMBER(x) for i from 1 to k do if A(hi(x))=0 then return FALSE return TRUE IP address HASH HASH HASH 1 1 1 n-bits of storage
Bloom filter – member MEMBER(x) for i from 1 to k do if A(hi(x))=0 then return FALSE return TRUE sometimes gives false positive answer error parameter: false positive probability
Bloom filter – analysis error parameter: false positive probability m = number of items to be stored n = number of bits of storage k = number of hash functions
Bloom filter – analysis error parameter: false positive probability m = number of items to be stored n = number of bits of storage k = number of hash functions p = fraction of the bits filled p e-km/n
Bloom filter – analysis error parameter: false positive probability m = number of items to be stored n = number of bits of storage k = number of hash functions p e-km/n p = fraction of the bits filled false positive probability (1-p)k
Bloom filter – analysis error parameter: false positive probability m = number of items to be stored n = number of bits of storage k = number of hash functions optimal k 0.7 m/n false positive rate 0.6185m/n