1 / 9

Lecture 10: Sketching S3: Nearest Neighbor Search

Lecture 10: Sketching S3: Nearest Neighbor Search. Plan. PS2 due yesterday, 7pm Sketching Nearest Neighbor Search Scriber?. Sketching. short bit-strings given and , should be able to estimate some function of and With success probability ( ) , norm: words

duman
Télécharger la présentation

Lecture 10: Sketching S3: Nearest Neighbor Search

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Lecture 10:SketchingS3: Nearest Neighbor Search

  2. Plan • PS2 due yesterday, 7pm • Sketching • Nearest Neighbor Search • Scriber?

  3. Sketching • short bit-strings • given and ,should be able to estimate some function of and • With success probability () • , norm: words • Decision version: given in advance… • Lemma: , norm: bits 010110 010101 Estimate Distinguish between

  4. Sketching: decision version Consider Hamming space: Lemma: for any , can achieve -bit sketch. [on blackboard]

  5. Conclusion • Dimension reduction: • [Johnson-Lindenstrauss’84]: • a random linear projection into dimensions • preserves , up to approximation • with probability • Random linear projection: • Can be where is Gaussian, or entry-wise • Hence: preserves distance between points as long as • Can do faster than time • Using Fast Fourier Transform • In : no dimension reduction • But can do sketching • Using -stable distributions (Cauchy for ) • Sketching: decision version, constant : • For , can do with bits!

  6. Section 3: Nearest Neighbor Search

  7. Approximate NNS -approximate -near neighbor: given a query point • assuming there is a point within distance , • report a point s.t.

  8. NNS: approach 1 • Boosted sketch: • Let = sketch for the decision version (90% success probability) • new sketch : • keep copies of • estimator is the majority answer of the estimators • Sketch size: bits • Success probability: (Chernoff) • Preprocess: compute sketches for all the points • Query: compute sketch , and compute distance to all points using sketch • Time: improved from to

  9. NNS: approach 2 • Query time below ? • Theorem [KOR98]: query time and space for approximation. • Proof: • Note that has bits • Only possible sketches! • Store an answer for each of possible inputs • In general: • if a distance has constant-size sketch, admits a poly-space NNS data structure! • Space closer to linear? • approach 3 will require more specialized sketches…

More Related