Given by: Erez Eyal Uri Klein

Approximate Nearest Neighbor (Locality Sensitive Hashing) - Theory

Overview: Detailed Lecture Outline
• Exact Nearest Neighbor search
  • Definition
  • Low dimensions
  • KD-Trees
• Approximate Nearest Neighbor search (LSH based)
  • Locality Sensitive Hashing families
  • Algorithm for Hamming Cube
  • Algorithm for Euclidean space
• Summary

Nearest Neighbor Search in Springfield

Nearest “Neighbor” Search for Homer Simpson
Features compared: home planet distance, height, weight, color

Nearest Neighbor (NN) Search
• Given: a set $P$ of $n$ points in $\mathbb{R}^d$ ($d$ is the dimension)
• Goal: a data structure which, given a query point $q$, finds the nearest neighbor $p$ of $q$ in $P$ (in terms of some distance function $D$)

Nearest Neighbor Search
We are interested in designing a data structure with the following objectives:
• Space: $O(dn)$
• Query time: $O(d \log n)$
• Data structure construction time is not important


Simple cases: 1-D (d = 1)
[Figure: sorted points 1, 4, 7, 8, 13, 19, 25, 32 on a line, with query q = 9]
• A binary search gives the solution
• Space: $O(n)$; Time: $O(\log n)$
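The 1-D case can be sketched in a few lines. This is a minimal illustration that uses the point set and query from the slide's figure; the function name `nearest_1d` is made up for this example, and it assumes the points are already sorted.

```python
from bisect import bisect_left

def nearest_1d(sorted_points, q):
    """Return the point of sorted_points closest to q (ties go to the smaller point)."""
    if not sorted_points:
        raise ValueError("point set is empty")
    i = bisect_left(sorted_points, q)                  # first index with sorted_points[i] >= q
    candidates = sorted_points[max(i - 1, 0):i + 1]    # at most the two neighbors of q
    return min(candidates, key=lambda p: abs(p - q))

points = [1, 4, 7, 8, 13, 19, 25, 32]
print(nearest_1d(points, 9))  # 8
```

The sorted list takes $O(n)$ space, and the binary search inspects $O(\log n)$ positions before comparing at most two candidates.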

Simple cases: 2-D (d = 2)
• Using Voronoi diagrams gives the solution
• Space: $O(n^2)$; Time: $O(\log n)$


KD-Trees
• A KD-tree is a data structure based on recursively subdividing a set of points with alternating axis-aligned hyperplanes.
• The classical KD-tree uses $O(dn)$ space and answers queries in time logarithmic in $n$ (worst case is $O(n)$), but exponential in $d$.

KD-Trees Construction
[Figure: points 1 to 11 in the plane, subdivided by axis-aligned lines l1 to l10, next to the corresponding tree]

KD-Trees Query
[Figure: the same subdivision and tree, with a query point q]

KD-Trees Algorithms
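As a rough illustration of construction and query, here is a minimal KD-tree sketch. It is not the pseudocode from the slides: it assumes median splits on cyclically alternating axes and Euclidean distance, and the names `build_kdtree` and `nearest` are chosen for this example only.

```python
import math

def build_kdtree(points, depth=0):
    """Recursively split the points at the median along alternating axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def nearest(node, q, best=None):
    """Return the stored point closest to q."""
    if node is None:
        return best
    if best is None or math.dist(q, node["point"]) < math.dist(q, best):
        best = node["point"]
    diff = q[node["axis"]] - node["point"][node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, q, best)
    # The far side can only help if the splitting hyperplane is closer than the best so far.
    if abs(diff) < math.dist(q, best):
        best = nearest(far, q, best)
    return best

pts = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(pts)
print(nearest(tree, (9, 2)))
```

The pruning test is what keeps queries fast in low dimensions; as d grows, fewer subtrees can be pruned, which is the exponential dependence on d mentioned above.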

A conjecture: “The curse of dimensionality”
In an exact solution, any algorithm for high dimension must use either $n^{\omega(1)}$ space or have $d^{\omega(1)}$ query time
“However, to the best of our knowledge, lower bounds for exact NN Search in high dimensions do not seem sufficiently convincing to justify the curse of dimensionality conjecture” (Borodin et al. ’99)

Why Approximate NN?
• Approximation allows a significant speedup of calculation (on the order of tens to hundreds of times)
• Fixed-precision arithmetic on computers causes approximation anyway
• Heuristics are used for mapping features to numerical values (causing uncertainty anyway)

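Approximate Nearest Neighbor (ANN) Search
• Given: a set $P$ of $n$ points in $\mathbb{R}^d$ ($d$ is the dimension) and a slackness parameter $\varepsilon > 0$
• Goal: a data structure which, given a query point $q$ whose nearest neighbor in $P$ is $a$, finds any $p$ s.t. $D(q, p) \le (1+\varepsilon) D(q, a)$
[Figure: query q, its nearest neighbor a, and the ball of radius $(1+\varepsilon) D(q, a)$ around q]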
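The ANN condition can be checked directly by brute force. The sketch below assumes Euclidean distance for D, and `is_ann` is a name invented for this example.

```python
import math

def is_ann(p, q, points, eps, dist=math.dist):
    """True if p is a (1 + eps)-approximate nearest neighbor of q among points."""
    nearest_distance = min(dist(q, a) for a in points)
    return dist(q, p) <= (1 + eps) * nearest_distance

P = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
q = (0.4, 0.0)
print(is_ann((1.0, 0.0), q, P, eps=1.0))  # True:  0.6 <= 2 * 0.4
print(is_ann((3.0, 0.0), q, P, eps=1.0))  # False: 2.6 >  2 * 0.4
```

Any point inside the enlarged ball is an acceptable answer, so the exact nearest neighbor always qualifies.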

Locality Sensitive Hashing
An $(r_1, r_2, P_1, P_2)$-Locality Sensitive Hashing (LSH) family is a family of hash functions $H$ s.t. for a random hash function $h \in H$ and for any pair of points $a, b$ we have:
• $D(a, b) \le r_1 \Rightarrow \Pr[h(a) = h(b)] \ge P_1$
• $D(a, b) \ge r_2 \Rightarrow \Pr[h(a) = h(b)] \le P_2$
• ($r_1 < r_2$, $P_1 > P_2$)
(A common method to reduce dimensionality without losing distance information)
[Indyk-Motwani ’98]

Hamming Cube
• A d-dimensional Hamming cube $Q_d$ is the set $\{0, 1\}^d$
• For any $a, b \in Q_d$ we define the Hamming distance $H$: $H(a, b) = \sum_{i=1}^{d} |a_i - b_i|$

LSH – Example in Hamming Cube
• $H = \{h \mid h(a) = a_i,\ i \in \{1, \dots, d\}\}$
  $\Pr[h(q) = h(a)] = 1 - H(q, a)/d$
  Pr is a monotonically decreasing function in $H(q, a)$
• Multi-index hashing: $G = \{g \mid g(a) = (h_1(a), h_2(a), \dots, h_k(a))\}$
  $\Pr[g(q) = g(a)] = (1 - H(q, a)/d)^k$
  Pr is a monotonically decreasing function in $k$
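The two collision probabilities above can be computed exactly by counting agreeing coordinates. This is a small sketch with made-up helper names; it assumes the k coordinates of g are sampled uniformly and independently (with replacement).

```python
from fractions import Fraction

def hamming(a, b):
    """Number of coordinates in which a and b differ."""
    return sum(x != y for x, y in zip(a, b))

def collision_probability(q, a, k=1):
    """Exact Pr[g(q) = g(a)] when g samples k coordinates uniformly and independently."""
    d = len(q)
    agree = sum(x == y for x, y in zip(q, a))   # coordinates i with q_i = a_i
    return Fraction(agree, d) ** k

q = (0, 1, 1, 0, 1, 0, 0, 1)
a = (0, 1, 0, 0, 1, 1, 0, 1)   # differs from q in 2 of 8 coordinates
print(collision_probability(q, a))        # 3/4
print(collision_probability(q, a, k=3))   # 27/64
```

Raising k lowers the collision probability for every pair, but it lowers it much faster for far pairs than for near ones, which is what widens the gap between $P_1$ and $P_2$.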

LSH – ANN Search Basic Scheme
Preprocess:
• Construct several such ‘g’ functions for each $l \in \{1, \dots, d\}$
• Store each $a \in P$ at the place $g_i(a)$ of the corresponding hash table
Query:
• Perform binary search on $l$
• In each step retrieve $g_i(q)$ (of $l$, if exists)
• Return the last non-empty result
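The store and retrieve steps can be sketched for a single value of l. This is a simplified illustration, not the full scheme: it omits the binary search over l, uses bit-sampling g functions, and the class `BitSamplingIndex` with its parameters `k` and `num_tables` is hypothetical.

```python
import random
from collections import defaultdict

class BitSamplingIndex:
    """Several hash tables, each keyed by a random k-coordinate projection g."""

    def __init__(self, d, k, num_tables, seed=0):
        rng = random.Random(seed)
        # Each g is a tuple of k coordinate indices sampled with replacement.
        self.projections = [tuple(rng.randrange(d) for _ in range(k))
                            for _ in range(num_tables)]
        self.tables = [defaultdict(list) for _ in range(num_tables)]

    def _key(self, g, point):
        return tuple(point[i] for i in g)

    def add(self, point):
        """Store the point at place g_i(point) of every table."""
        for g, table in zip(self.projections, self.tables):
            table[self._key(g, point)].append(point)

    def candidates(self, q):
        """Union of the buckets g_i(q) over all tables."""
        found = set()
        for g, table in zip(self.projections, self.tables):
            found.update(table.get(self._key(g, q), ()))
        return found

index = BitSamplingIndex(d=8, k=3, num_tables=4)
stored = [(0, 1, 1, 0, 1, 0, 0, 1), (1, 1, 1, 1, 0, 0, 0, 0), (0, 0, 0, 0, 0, 0, 0, 1)]
for p in stored:
    index.add(p)
print(index.candidates((0, 1, 1, 0, 1, 0, 0, 0)))
```

Points must be hashable (tuples here). A stored point is always among its own candidates, while far points collide with the query only rarely.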

ANN Search in Hamming Cube
β-test $t$:
• Pick a subset $C$ of $\{1, 2, \dots, d\}$ at random, including each coordinate independently w.p. $\beta$
• For each $i \in C$, pick independently and uniformly $r_i \in \{0, 1\}$
• For any $a \in Q_d$: $t(a) = \sum_{i \in C} r_i a_i \bmod 2$
(Equivalently, we may pick $R \in \{0, 1\}^d$ s.t. $R_i$ is 1 w.p. $\beta/2$, and the test is an inner product of $R$ and $a$ modulo 2. Such $R$ represents a β-test $t$)
[Kushilevitz et al. ’98]
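The equivalent R-representation of a test is easy to write down. A minimal sketch, assuming the inner product is taken modulo 2; the function names are invented for this example.

```python
import random

def make_beta_test(d, beta, rng):
    """Sample R in {0,1}^d with each R_i = 1 independently w.p. beta / 2."""
    return tuple(1 if rng.random() < beta / 2 else 0 for _ in range(d))

def apply_test(R, a):
    """t(a): inner product of R and a, taken modulo 2."""
    return sum(r * x for r, x in zip(R, a)) % 2

rng = random.Random(0)
R = make_beta_test(d=8, beta=0.25, rng=rng)
a = (0, 1, 1, 0, 1, 0, 0, 1)
print(R, apply_test(R, a))
```

Because the test is linear over GF(2), t(a) and t(b) differ exactly when the test of the coordinatewise XOR of a and b equals 1, so only the coordinates where a and b differ matter.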

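ANN Search in Hamming Cube
• Define: $\Delta(a, b) = \Pr[t(a) \ne t(b)]$
• For a query $q$, let $H(a, q) \le l$ and $H(b, q) > l(1+\varepsilon)$. Then for $\beta = 1/(2l)$: $\Delta(a, q) \le \delta_1 < \delta_2 < \Delta(b, q)$
  Where: $\delta_1 = \frac{1}{2}\left(1 - \left(1 - \frac{1}{2l}\right)^{l}\right)$ and $\delta_2 = \frac{1}{2}\left(1 - \left(1 - \frac{1}{2l}\right)^{(1+\varepsilon)l}\right)$
  And define: $\delta = \delta_2 - \delta_1 = \Theta(1 - e^{-\varepsilon/2})$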
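One way to see where thresholds of this kind come from is the standard parity argument for a β-test, sketched below as a reconstruction from the definitions rather than as the slide's own derivation. If none of the $H(a, b)$ differing coordinates lands in $C$, the test cannot separate $a$ and $b$; otherwise the parity of the chosen $r_i$ over those coordinates is a uniform bit.

```latex
\begin{align*}
\Delta(a, b) &= \Pr[t(a) \ne t(b)]
              = \Pr\Bigl[\sum_{i \in C,\; a_i \ne b_i} r_i \equiv 1 \pmod 2\Bigr] \\
             &= \tfrac{1}{2}\bigl(1 - (1 - \beta)^{H(a, b)}\bigr)
\end{align*}
```

This expression increases with $H(a, b)$, so substituting $\beta = 1/(2l)$ with $H = l$ and $H = (1+\varepsilon)l$ gives an upper bound for near points and a lower bound for far points. Their difference is $\frac{1}{2}\left(1 - \frac{1}{2l}\right)^{l}\left(1 - \left(1 - \frac{1}{2l}\right)^{\varepsilon l}\right)$, which behaves like a constant times $1 - e^{-\varepsilon/2}$.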

ANN Search in Hamming Cube
Data structure: $S = \{S_1, \dots, S_d\}$
Positive integers: $M$, $T$
For any $l \in \{1, \dots, d\}$, $S_l = \{G_1, \dots, G_M\}$
For any $j \in \{1, \dots, M\}$, $G_j$ consists of a set $\{t_1, \dots, t_T\}$ (each $t_k$ is a $(1/(2l))$-test) and a table $A_j$ of $2^T$ entries

ANN Search in Hamming Cube
In each $S_l$, construct $G_j$ as follows:
• Pick $\{t_1, \dots, t_T\}$ independently at random
• For $v \in Q_d$, the trace $t(v) = (t_1(v), \dots, t_T(v)) \in \{0, 1\}^T$
• An entry $z \in \{0, 1\}^T$ in $A_j$ contains a point $a \in P$ if $H(t(a), z) \le (\delta_1 + \frac{1}{3}\delta) T$ (else empty)
The space complexity:

ANN Search in Hamming Cube
For any query $q$ and $a, b \in P$ s.t. $H(q, a) \le l$ and $H(q, b) > (1+\varepsilon) l$, it can be proven using Chernoff bounds that:
This gives the result that the trace $t$ functions as an LSH family (in its essence)
(When the event presented in these inequalities occurs for some $G_j$ in $S_l$, $G_j$ is said to ‘fail’)
[Alon & Spencer ’92]

ANN Search in Hamming Cube
Search Algorithm: we perform a binary search on $l$. In every step:
• Pick $G_j$ in $S_l$ uniformly at random
• Compute $t(q)$ from the list of tests in $G_j$
• Check the entry labeled $t(q)$ in $A_j$:
  • If the entry contains a point from $P$, restrict the search to lower values of $l$
  • Otherwise, restrict the search to greater values of $l$
Return the last non-empty entry in the search
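The search can be sketched end to end on a toy instance. This is a simplified illustration under several assumptions that are not taken from the slides: instead of a table with $2^T$ entries, each structure keeps the traces of all points and `lookup` scans them (which gives up the query-time bound); the thresholds use $\delta_1 = \frac{1}{2}(1 - (1 - \beta)^l)$ and $\delta_2 = \frac{1}{2}(1 - (1 - \beta)^{(1+\varepsilon)l})$ with $\beta = 1/(2l)$; and all function names are hypothetical.

```python
import random

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def trace(tests, v):
    """Apply every test R (inner product modulo 2) to v."""
    return tuple(sum(r * x for r, x in zip(R, v)) % 2 for R in tests)

def build(points, d, eps, M, T, seed=0):
    """S[l] is a list of M structures: (tests, threshold, traces of all points)."""
    rng = random.Random(seed)
    S = {}
    for l in range(1, d + 1):
        beta = 1 / (2 * l)
        delta1 = 0.5 * (1 - (1 - beta) ** l)                # ASSUMED form of delta_1
        delta2 = 0.5 * (1 - (1 - beta) ** ((1 + eps) * l))  # ASSUMED form of delta_2
        threshold = (delta1 + (delta2 - delta1) / 3) * T
        S[l] = []
        for _ in range(M):
            tests = [tuple(int(rng.random() < beta / 2) for _ in range(d))
                     for _ in range(T)]
            S[l].append((tests, threshold, [(trace(tests, a), a) for a in points]))
    return S

def lookup(G, q):
    """Stand-in for reading entry t(q) of A_j: some point whose trace is close to t(q)."""
    tests, threshold, traces = G
    tq = trace(tests, q)
    for ta, a in traces:
        if hamming(ta, tq) <= threshold:
            return a
    return None

def search(S, q, d, rng):
    """Binary search on l; returns the last non-empty entry, or None."""
    lo, hi, result = 1, d, None
    while lo <= hi:
        l = (lo + hi) // 2
        entry = lookup(rng.choice(S[l]), q)
        if entry is not None:
            result, hi = entry, l - 1   # non-empty: restrict to lower l
        else:
            lo = l + 1                  # empty: restrict to greater l
    return result

far, near = (1,) * 16, (0,) * 16
S = build([far, near], d=16, eps=1.0, M=2, T=200, seed=1)
print(search(S, near, 16, random.Random(2)))
```

With T this large, the trace of a far point lands well above the threshold at small l, so the search keeps moving to lower l and ends on a near point with high probability.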

ANN Search in Hamming Cube
Search Algorithm: Example
[Flowchart: Initialize $l = d/2$ → Access $S_l$ → Choose $G_j$ → Calculate $t(q)$ → Is $A_j(t(q))$ empty? Yes: $l \leftarrow$ upper half. No: $Res \leftarrow A_j(t(q))$, $l \leftarrow$ lower half → $l$ covered already? No: repeat. Yes: stop]

ANN Search in Hamming Cube
• Construction of $S$ is said to ‘fail’ if for some $l$ more than $\mu M/\log(d)$ structures $G_j$ in $S_l$ ‘fail’
• Define (for some $\gamma$, $\mu$):
  Then the construction of $S$ fails w.p. of at most $\gamma$
• If $S$ does not fail, then for every query the search algorithm fails to find an ANN w.p. of at most $\mu$

ANN Search in Hamming Cube
• Query time complexity:
• Space complexity:
• Complexities are also proportional to $\varepsilon^{-2}$

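Euclidean Space
• The d-dimensional Euclidean space $l_i^d$ is $\mathbb{R}^d$ endowed with the $L_i$ distance
• For any $a, b \in \mathbb{R}^d$ we define the $L_i$ distance: $L_i(a, b) = \left(\sum_{j=1}^{d} |a_j - b_j|^i\right)^{1/i}$
• The algorithm presented deals with $l_2^d$, and with $l_1^d$ under minor changes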

Euclidean Space
Define:
• $B(a, r)$ is the closed ball around $a$ with radius $r$
• $D(a, r) = P \cap B(a, r)$ (a subset of $\mathbb{R}^d$)
[Kushilevitz et al. ’98]

LSH – ANN Search Extended Scheme
Preprocess:
• Prepare a data structure for each ‘Hamming ball’ induced by any $a, b \in P$
Query:
• Start with some maximal ball
• In each step calculate the ANN
• Stop according to some threshold

ANN Search in Euclidean Space
For $a \in P$, define a Euclidean-to-Hamming mapping ($h: D(a, r) \to \{0, 1\}^{DF}$):
• Define a parameter $L$
• Given a set of i.i.d. unit vectors $z_1, \dots, z_D$
• For each $z_i$, the cutting points $c_1, \dots, c_F$ are equally spaced on:
• Each $z_i$ and $c_j$ define a coordinate in the DF-Hamming cube, on which the projection of any $b \in D(a, r)$ is 0 iff
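A mapping of this shape can be sketched as follows. Two details are assumptions made only for this illustration, since they are not given above: the cutting points are placed equally spaced inside the interval $[\langle a, z_i \rangle - Lr,\ \langle a, z_i \rangle + Lr]$, and a coordinate is 0 iff the projection of $b$ on $z_i$ does not exceed $c_j$. The function names are hypothetical.

```python
import math
import random

def random_unit_vector(d, rng):
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def make_mapping(a, r, d, D, F, L, seed=0):
    """Return h: R^d -> {0,1}^(D*F), built from D directions with F cut points each."""
    rng = random.Random(seed)
    directions = [random_unit_vector(d, rng) for _ in range(D)]
    cuts = []
    for z in directions:
        center = dot(a, z)
        # ASSUMPTION: F cut points equally spaced inside [center - L*r, center + L*r].
        step = 2 * L * r / (F + 1)
        cuts.append([center - L * r + (j + 1) * step for j in range(F)])

    def h(b):
        bits = []
        for z, cs in zip(directions, cuts):
            proj = dot(b, z)
            # ASSUMPTION: the coordinate is 0 iff the projection does not exceed the cut point.
            bits.extend(0 if proj <= c else 1 for c in cs)
        return tuple(bits)

    return h

h = make_mapping(a=(0.0, 0.0, 0.0), r=1.0, d=3, D=2, F=3, L=1.0)
print(h((0.1, 0.2, 0.3)), h((0.9, -0.4, 0.2)))
```

The Hamming distance between two images counts the cut points that separate the two projections, summed over all directions, which is why it tracks the Euclidean distance on average.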

ANN Search in Euclidean Space
Euclidean to Hamming Mapping Example: d = 3, D = 2, F = 3
[Figure: points a and b projected onto the unit vectors z1 and z2, with the resulting bit strings h(a) and h(b)]

ANN Search in Euclidean Space
• It can be proven that, in expectation, the mapping $h$ preserves the relative distances between points in $P$
• This mapping gets more accurate as $r$ grows smaller:

ANN Search in Euclidean Space
Data structure: $S = \{S_a \mid a \in P\}$
Positive integers: $D$, $F$, $L$
For any $a \in P$, $S_a$ consists of:
• A list of all other elements of $P$, sorted by increasing distance from $a$
• A structure $S_{a,b}$ for any $b \ne a$ ($b \in P$)

ANN Search in Euclidean Space
Let $r = L_2(a, b)$; then $S_{a,b}$ consists of:
• A list of $D$ i.i.d. unit vectors $\{z_1, \dots, z_D\}$
• For each unit vector $z_i$, a list of $F$ cutting points
• A Hamming Cube data structure of dimension $DF$, containing $D(a, r)$
• The size of $D(a, r)$

ANN Search in Euclidean Space
Search Algorithm (using a positive integer $T$):
• Pick a random $a_0 \in P$ where $b_0$ is the farthest point from $a_0$, and start from $S_{a_0,b_0}$ ($r_0 = L_2(a_0, b_0)$)
• For any $S_{a_j,b_j}$:
  • Query for the ANN of $h(q)$ in the Hamming Cube data structure and get the result $h(a')$
  • If $L_2(q, a')$ > r-1/10, return $a'$
  • Otherwise, pick $T$ points of $D(a_j, r_j)$ at random, and let $a''$ be the closest to $q$ among them
  • Let $a_{j+1}$ be the closest to $q$ of $\{a_j, a', a''\}$

ANN Search in Euclidean Space
• Let $b' \in P$ be the farthest from $a_{j+1}$ s.t. $2 L_2(a_{j+1}, q) \ge L_2(a_{j+1}, b')$, using a binary search on the sorted list of $S_{a_{j+1}}$
• If no such $b'$ exists, return $a_{j+1}$
• Otherwise, let $b_{j+1} = b'$
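The binary search for $b'$ can be sketched with the standard `bisect` module. The sketch assumes the sorted list of $S_a$ is stored as two parallel lists (distances from $a$ and the matching points); `next_far_point` is a name invented for this example.

```python
import math
from bisect import bisect_right

def next_far_point(distances, neighbors, a, q):
    """distances[i] = L2(a, neighbors[i]), sorted in increasing order.

    Return the farthest b' with L2(a, b') <= 2 * L2(a, q), or None if there is none.
    """
    limit = 2 * math.dist(a, q)
    i = bisect_right(distances, limit)   # number of neighbors within the limit
    return neighbors[i - 1] if i > 0 else None

a = (0.0, 0.0)
neighbors = [(1.0, 0.0), (2.0, 0.0), (5.0, 0.0), (9.0, 0.0)]
distances = [math.dist(a, b) for b in neighbors]
print(next_far_point(distances, neighbors, a, (1.5, 0.0)))  # (2.0, 0.0)
print(next_far_point(distances, neighbors, a, (0.2, 0.0)))  # None
```

Returning `None` corresponds to the case where the search stops and outputs $a_{j+1}$.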

ANN Search in Euclidean Space
Each ball in the search contains q’s (exact) NN
[Figure: ball defined by $a_i$ and $b_i$, with the query $q$]

ANN Search in Euclidean Space
• contains only points from
• contains at most points w.p. of at least $1 - 2^{-T}$
[Figure: points $a_{i-1}$, $a_i$, $b_{i-1}$ and the query $q$]

ANN Search in Euclidean Space
[Figure: points $a_{i-1}$, $a_i$, $b_i$ and the query $q$]

ANN Search in Euclidean Space
Conclusion: in expectation, this gives $O(\log n)$ iterations

ANN Search in Euclidean Space
Search Algorithm: Example
[Figure: successive pairs $(a_0, b_0)$ and $(a_1, b_1)$ closing in on the query $q$]

ANN Search in Euclidean Space
• Construction of $S$ is said to ‘fail’ if, for some $S_{a,b}$, $h$ does not preserve the relative distances
• Define (for some $\zeta$):
  Then the construction of $S$ fails w.p. of at most $\zeta$
• If $S$ does not fail, then for every query the search algorithm finds an ANN

ANN Search in Euclidean Space
• Query time complexity:
• Space complexity:
• Complexities are also proportional to $\varepsilon^{-2}$
