
Range Spaces

Small-size ε-nets for Axis-Parallel Rectangles and Boxes. Boris Aronov (Polytechnic Institute of NYU), Esther Ezra (Duke University), Micha Sharir (Tel-Aviv University).





Presentation Transcript


  1. Small-size ε-nets for Axis-Parallel Rectangles and Boxes. Boris Aronov (Polytechnic Institute of NYU), Esther Ezra (Duke University), Micha Sharir (Tel-Aviv University).

  2. Range Spaces. Range space (X, R): X – ground set (the “universe”). R – ranges: subsets of X, so |R| ≤ 2^|X|. Abstract form: hypergraphs. X – vertices. R – hyperedges.

  3. Geometric Range Spaces. Specification: X ⊆ ℝ^d, R = set of simply-shaped regions in ℝ^d. Examples: X – points on the real line, R – intervals; X – points in the plane, R – halfplanes, disks, … For simplicity, assume X is finite; then |R| is polynomial in |X|.

  4. ε-nets for range spaces. Given: • A range space (X, R), where X is finite, |X| = n. • A parameter 0 < ε < 1. An ε-net for (X, R) is a subset N ⊆ X that hits every range Q ∈ R with |Q ∩ X| ≥ εn. That is, N is a hitting set for all the “heavy” ranges, those that capture at least an ε-fraction of the universe. Example: points and intervals on the real line: |N| = 1/ε. The bound does not depend on n.
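
The interval example on this slide can be made concrete with a short sketch. It assumes distinct points on the real line; the function names `interval_eps_net` and `hits_all_heavy_intervals` are illustrative and do not come from the talk. A heavy interval contains at least k = ⌈εn⌉ consecutive points in sorted order, so taking every k-th point hits all heavy intervals with at most 1/ε points.

```python
import math

def interval_eps_net(points, eps):
    """Return an eps-net for distinct points on a line with respect to intervals."""
    xs = sorted(points)
    n = len(xs)
    # A heavy interval holds at least eps*n points, i.e. at least k consecutive
    # points in sorted order. The small tolerance guards against float round-off.
    k = max(1, math.ceil(eps * n - 1e-9))
    # Every k-th point: any window of k consecutive points contains one of them.
    return xs[k - 1::k]

def hits_all_heavy_intervals(points, net, eps):
    """Brute-force check that `net` hits every interval with >= eps*n points."""
    xs = sorted(points)
    n = len(xs)
    k = max(1, math.ceil(eps * n - 1e-9))
    chosen = set(net)
    # Every heavy interval contains a window of exactly k consecutive points,
    # so it suffices to check those windows.
    return all(any(x in chosen for x in xs[i:i + k]) for i in range(n - k + 1))
```

The net has ⌊n/k⌋ ≤ 1/ε points, independent of n, which matches the bound stated on the slide.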

  5. The hitting-set problem. A hitting set for (X, R) is a subset H ⊆ X s.t. for any Q ∈ R, Q ∩ H ≠ ∅. Goal: find a smallest hitting set. Useful applications: art gallery, sensor networks, and more.

  6. Hardness of hitting sets. Finding a hitting set of smallest size is NP-hard, even for geometric range spaces! Use an approximation algorithm instead. Abstract range spaces [Chvatal 79]: greedy algorithm, approximation factor O(log |X|). Geometric range spaces [Bronimann-Goodrich 95], [Clarkson 93]: improved approximation factor O(log OPT), or smaller, where OPT = size of the smallest hitting set. This is achieved via ε-nets: small-size ε-nets imply small approximation factors!
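
For reference, the greedy algorithm for abstract range spaces mentioned above can be sketched as follows. This is a minimal illustration, not code from the talk: the name `greedy_hitting_set` and the representation of ranges as Python sets are assumptions. Empty ranges cannot be hit and are skipped.

```python
def greedy_hitting_set(universe, ranges):
    """Greedy heuristic: repeatedly pick the element hitting the most unhit ranges."""
    remaining = [set(r) for r in ranges if r]  # empty ranges cannot be hit; skip them
    chosen = []
    while remaining:
        # Element of the universe contained in the largest number of unhit ranges.
        best = max(universe, key=lambda x: sum(x in r for r in remaining))
        if not any(best in r for r in remaining):
            raise ValueError("some range contains no element of the universe")
        chosen.append(best)
        remaining = [r for r in remaining if best not in r]
    return chosen
```

Each round hits at least one remaining range, so the loop terminates. The O(log |X|) factor on the slide refers to this kind of greedy choice; the sketch makes no attempt at efficiency.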

  7. An upper bound for the ε-net size. The ε-net theorem [Haussler-Welzl 87]: if the ranges are simply-shaped regions, then, for any ε > 0, a random sample of size O((1/ε) log (1/ε)) is an ε-net, with constant probability. The bound does not depend on n. Remark: in fact, it is sufficient to assume that the number of ranges is only polynomial in n. Is it optimal?

  8. The lower bound. Theorem [Komlos, Pach, Woeginger 92]: the bound is tight! The construction: artificial, on abstract hypergraphs (non-geometric!). No lower bound better than Ω(1/ε) is known in geometry. What is the actual bound? O(1/ε)? (This is achieved by points and intervals on the real line.) Goal: obtain smaller bounds for geometric range spaces. Ideally O(1/ε), but anything better than O((1/ε) log (1/ε)) is “exciting”!

  9. Previous results. Points and halfspaces in 2D, 3D: O(1/ε) [Matousek 92], [Pyrga, Ray 08], [Har-Peled et al. 08]. Points and disks, or pseudo-disks, in 2D: O(1/ε) [Matousek, Seidel, Welzl 90], [Pyrga, Ray 08]. (Figure: pseudo-disks.)

  10. Our results. Points and axis-parallel rectangles in the plane: ε-net size is O((1/ε) log log (1/ε)). Points and axis-parallel boxes in 3-space: ε-net size is O((1/ε) log log (1/ε)). Points and α-fat triangles in the plane (each of the angles ≥ α): ε-net size is O((1/ε) log log (1/ε)). Points uniformly distributed over the unit cube, and axis-parallel boxes in d-space: ε-net size is O((1/ε) log log (1/ε)).

  11. Improved approximation factors for geometric hitting sets.
      Ranges                                               Previous bound   New bound
      Axis-parallel rectangles                             log OPT          log log OPT
      Axis-parallel 3-boxes                                log OPT          log log OPT
      α-fat triangles                                      log OPT          log log OPT
      Axis-parallel d-boxes (points uniform in [0,1]^d)    log OPT          log log OPT

  12. Main idea: use two-level sampling. Primary sampling step: obtain an initial sample S of ~1/ε points of X. On average, each heavy rectangle Q (one that contains at least εn points) must satisfy Q ∩ S ≠ ∅. Second sampling step (repair step): in each heavy rectangle Q ∈ R with Q ∩ S = ∅, sample additional points to guarantee that Q is stabbed by the net.

  13. The ε-net construction. Input: X – a set of n points. Parameter: r := 1/ε. Primary sample: produce a random sample S ⊆ X of size |S| = r, and make S part of the output. Then apply the second sampling step in each empty rectangle… Instead of processing all input rectangles, we consider a smaller set of representative rectangles.

  14. The set of maximal S-empty rectangles. A maximal S-empty rectangle M satisfies int(M) ∩ S = ∅, and for each rectangle M’ ⊋ M, int(M’) ∩ S ≠ ∅. M is defined by ≤ 4 points of S. ℳ – the set of all maximal S-empty rectangles. Apply the repair step on ℳ instead of on the input rectangles.

  15. Why is it sufficient to consider ℳ? For each heavy input rectangle Q with Q ∩ S = ∅, expand Q until each of its sides touches a point of S or continues to ±∞; the result is a maximal S-empty rectangle M ⊇ Q. Since Q is heavy, a sufficiently large sample in M will hit Q, with high probability. Otherwise (Q ∩ S ≠ ∅), done!
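
The expansion of Q into a maximal S-empty rectangle can be sketched in a few lines. The representation of a rectangle as a tuple `(x1, x2, y1, y2)` and the name `expand_to_maximal` are assumptions made for illustration, and the sketch assumes points in general position (no sample point exactly on a corner of Q). It first pushes the left and right sides outward, then the bottom and top sides within the widened strip.

```python
import math

def expand_to_maximal(q, sample):
    """Expand an S-empty rectangle q = (x1, x2, y1, y2) to a maximal S-empty one."""
    x1, x2, y1, y2 = q
    # Precondition: no sample point lies in the open interior of q.
    assert not any(x1 < x < x2 and y1 < y < y2 for x, y in sample)
    # Push the left and right sides outward until a sample point inside the
    # open horizontal strip y1 < y < y2 blocks them (or they reach infinity).
    x1 = max((x for x, y in sample if x <= x1 and y1 < y < y2), default=-math.inf)
    x2 = min((x for x, y in sample if x >= x2 and y1 < y < y2), default=math.inf)
    # Push the bottom and top sides outward inside the widened vertical strip.
    y1 = max((y for x, y in sample if y <= y1 and x1 < x < x2), default=-math.inf)
    y2 = min((y for x, y in sample if y >= y2 and x1 < x < x2), default=math.inf)
    return (x1, x2, y1, y2)
```

Different expansion orders may produce different maximal rectangles containing Q; any one of them can serve as the representative for Q.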

  16. The repair step [CF-90, CV-07]. Recall r = 1/ε. Consider a heavy rectangle M with |M ∩ X| = t·n/r, 1 ≤ t ≤ log r; t is the excess of M. Second sampling step: construct a (1/t)-net N_M inside M by sampling O(t log t) points in M; the “universe” size is now t·n/r. According to the ε-net theorem, each input (empty) rectangle Q ∈ R, Q ⊆ M, with |Q ∩ X| ≥ n/r, must be stabbed by N_M!
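
To illustrate the sample-then-repair structure, here is a one-dimensional analogue, which is a deliberate simplification and not the construction from the talk. Intervals replace rectangles, the gaps between consecutive sample points play the role of the maximal S-empty rectangles, and the repair step inside a heavy gap is deterministic (every k-th point) rather than a random (1/t)-net. All names are hypothetical.

```python
import math
import random

def two_level_net_1d(points, eps, rng=None):
    """1D analogue of primary sample + repair step, for intervals on a line."""
    rng = rng or random.Random()
    xs = sorted(points)
    n = len(xs)
    r = max(1, math.ceil(1 / eps - 1e-9))   # primary sample size r = 1/eps
    k = max(1, math.ceil(eps * n - 1e-9))   # a heavy interval holds >= k consecutive points
    # Primary sample: r random points, stored as indices into the sorted order.
    sample_idx = sorted(rng.sample(range(n), min(r, n)))
    net_idx = set(sample_idx)
    # Gaps between consecutive sample points are the maximal S-empty intervals.
    boundaries = [-1] + sample_idx + [n]
    for lo, hi in zip(boundaries, boundaries[1:]):
        gap = range(lo + 1, hi)             # indices strictly between two sample points
        if len(gap) >= k:                   # only heavy gaps need repair
            net_idx.update(gap[k - 1::k])   # every k-th point of the gap
    return [xs[i] for i in sorted(net_idx)]

def is_eps_net_1d(points, net, eps):
    """Brute-force check: every window of k consecutive points contains a net point."""
    xs = sorted(points)
    n = len(xs)
    k = max(1, math.ceil(eps * n - 1e-9))
    chosen = set(net)
    return all(any(x in chosen for x in xs[i:i + k]) for i in range(n - k + 1))
```

A heavy interval either contains a sample point or lies inside a single gap, where it contains k consecutive gap points and hence a repair point. So the output is always an ε-net, whatever the random sample is; only its size depends on the sample.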

  17. The final ε-net. Output: the union of S and ∪_{M ∈ ℳ} N_M. What is the expected size of the ε-net? r + E{ Σ_{t ≥ 1} t log t · |ℳ_t| }, where ℳ_t = set of rectangles in ℳ with excess t. Exponential Decay Lemma [Chazelle, Friedman 90], [Agarwal, Matousek, Schwarzkopf 98]: E{ |ℳ_t| } = O(2^{-t} E{ |ℳ| }), where E{ |ℳ| } is the expected number of maximal empty rectangles. The number of heavy rectangles decreases exponentially!

  18. An improved ε-net. Theorem: E{ |ℳ| } = O(r log r) ⇒ E{ |ℳ_t| } = O(r log r) for t ≥ 1 (here |S| = r and |M ∩ X| ≥ n/r). The expected ε-net size is O((1/ε) log (1/ε)). No improvement yet… Key observation: use oversampling. Choose a slightly larger primary sample, |S| = c r log log r with c > 1, and repair only rectangles M with excess t ≥ c log log r.

  19. What have we gained? “On average”, an S-empty rectangle now contains at most O(n/(r log log r)) << n/r points. So a heavy M cannot be an “average” S-empty rectangle; it is much heavier. Exponential Decay Lemma: E{ |ℳ_t| } = O(2^{-t} E{ |ℳ| }), with s = |S| = c r log log r and t = c log log r. The number of maximal heavy S-empty rectangles is much smaller: E{ |ℳ_t| } = O(s log s / polylog r) = o(s) = o(r). The number of heavy (empty) rectangles is only sublinear in r! The expected ε-net size is O(r log log r).
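
The size bound can be traced through in one chain of estimates. The sketch below uses the simplified form of the Exponential Decay Lemma as stated on the slides, the bound E|ℳ| = O(s log s), and a repair cost of O(t log t) points per rectangle of excess t. The shorthand t_0 = c log log r is introduced here for readability; logarithms are base 2 and c > 1.

```latex
\begin{align*}
\mathbb{E}\,|N|
  &\le s + \sum_{t \ge t_0} O(t \log t)\,\mathbb{E}\,|\mathcal{M}_t|
   && t_0 = c \log\log r,\quad s = c\,r \log\log r \\
  &= s + O\!\Big(\mathbb{E}\,|\mathcal{M}| \sum_{t \ge t_0} t \log t \cdot 2^{-t}\Big)
   && \text{(Exponential Decay Lemma)} \\
  &= s + O\!\big(s \log s \cdot t_0 \log t_0 \cdot 2^{-t_0}\big)
   && \text{(the sum is dominated by its first term)} \\
  &= s + O\!\Big(r \log r \cdot \frac{(\log\log r)^{3}}{\log^{c} r}\Big)
   && 2^{-t_0} = \log^{-c} r,\quad s \log s = O(r \log r \log\log r) \\
  &= s + o(r) \;=\; O(r \log\log r)
   && \text{(since } c > 1\text{)}.
\end{align*}
```

With r = 1/ε this gives the O((1/ε) log log (1/ε)) bound: the primary sample dominates, and the repair step contributes only a lower-order term.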

  20. Oversampling: a trick or a technique? By oversampling at the preliminary step, we significantly decrease the size of the secondary sample. Note: the number of maximal S-empty rectangles is O(s log s); however, we do not traverse all of them, only the heavy ones! New concept: the sample points and the maximal S-empty rectangles are two different entities.

  21. Thank you very much!

  22. Bounding the number of maximal S-empty rectangles. Upper bound: O(s^2). Each rectangle is determined by its two opposite corners. Problem: the bound O(s^2) is bad for the analysis, and yields an ε-net of size O(1/ε^2)! Recall s = c (1/ε) log log (1/ε).

  23. Quadratic lower bound construction. A staircase construction: each point in the upper staircase is matched with each point in the lower staircase, giving Ω(s^2) empty rectangles. We can prune away most of these rectangles and remain with only O(s log s) rectangles.

  24. An O(s log s) bound for |ℳ|. Key observation: consider a vertical line l and all points to its left. Claim: the number of maximal S-empty rectangles anchored at l is only linear. Next step: use a tree decomposition built on top of X in order to obtain the O(s log s) bound. (Figure: rectangles Q and Q’, a node v, and the splitting lines l1, l2, l‘2, l3, l‘3, l‘‘3.)

  25. Dual (Geometric) Range Spaces. Flip the roles of X and R, and obtain (R, X*). R = set of regions in ℝ^d, X* = {R_p | p ∈ X}, R_p = {r | r ∈ R, r contains p}. Examples: R – intervals, X* – subsets of intervals containing a common point in ℝ^1; R – disks, X* – subsets of disks containing a common point in ℝ^2.
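
For intervals on the real line, the dual ranges R_p and the covering condition of a dual ε-net can be written out directly. The sketch below is illustrative only: intervals are assumed to be closed pairs `(a, b)`, and the names `dual_ranges`, `depth`, and `covers_deep_points` are not from the talk.

```python
def dual_ranges(points, intervals):
    """Map each point p to R_p, the set of indices of intervals that contain p."""
    return {
        p: frozenset(i for i, (a, b) in enumerate(intervals) if a <= p <= b)
        for p in points
    }

def depth(p, intervals):
    """Number of intervals that cover p."""
    return sum(1 for a, b in intervals if a <= p <= b)

def covers_deep_points(points, intervals, net_idx, eps):
    """Check that the chosen intervals cover every point of depth >= eps * |R|."""
    threshold = eps * len(intervals)
    return all(
        any(intervals[i][0] <= p <= intervals[i][1] for i in net_idx)
        for p in points
        if depth(p, intervals) >= threshold
    )
```

Here `net_idx` is a collection of indices into `intervals`; the chosen intervals form a dual ε-net exactly when `covers_deep_points` returns `True`.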

  26. ε-nets for dual range spaces. depth(p) = #ranges that cover p ∈ X. An ε-net for (R, X*) is a subset N ⊆ R that covers all points at depth ≥ ε|R|. An ε-net is a set cover for all the “deep” points. Upper bound [Haussler-Welzl 87]: O((δ/ε) log (δ/ε)), where δ is the VC-dimension of (R, X*).

  27. Previous results. Range space (R, X*) s.t. for each T ⊆ R, |T| = m, the union ∪T has (a small) complexity o(m log m): ε-net size o((1/ε) log (1/ε)) [Clarkson, Varadarajan 07]. Theorem [Clarkson, Varadarajan 07]: the complexity of the union is O(m φ(m)) ⇒ ε-net size is O((1/ε) φ(1/ε)), where φ(·) is a slowly growing function. In fact, this should be the complexity of the vertical decomposition of the complement of the union.

  28. More about the Clarkson-Varadarajan technique. Example: disks (or pseudo-disks) and points. Input: a set T of m (pseudo-)disks. Union complexity: O(m) [Kedem et al. 86] ⇒ ε-net size is O(1/ε). Example: fat triangles and points. Input: a set T of m α-fat triangles (each of the angles ≥ α). Union complexity: O(m log log m) [Matousek et al. 1994] ⇒ ε-net size is O((1/ε) log log (1/ε)).

  29. Our results: dual. Theorem [Clarkson, Varadarajan 07]: the complexity of the union is O(m φ(m)) ⇒ ε-net size is O((1/ε) φ(1/ε)). Using the oversampling concept: Theorem (improvement!): the complexity of the union is O(m φ(m)) ⇒ ε-net size is O((1/ε) log φ(1/ε)).

  30. Proof sketch. Draw a random sample S of s = (c/ε) log φ(1/ε) regions. Construct the union of S and decompose its complement into O(s φ(s)) “trapezoidal cells”. Each cell Δ is defined by ≤ 4 regions. Claim: with high probability, Δ meets ≤ (n/s) log s regions of the input.

  31. Proof sketch. Apply a repair step on the heavy cells: sample O(t log t) regions in each cell Δ that meets t·n/s regions, for t ≥ c log φ(1/ε). Each point at depth ≥ εn is covered by at least one region. Use the Exponential Decay Lemma to show: # regions sampled at the repair step = o(1/ε). Overall ε-net size: O((1/ε) log φ(1/ε)).

  32. New ε-net bounds. Fat triangles: union complexity O(m log log m) ⇒ ε-net size is O((1/ε) log log log (1/ε)). Locally γ-fat objects: union complexity O(m polylog m) ⇒ ε-net size is O((1/ε) log log (1/ε)). And several other improved bounds. (Figure: an object O and a disk D, with area(D ∩ O) ≥ γ · area(D), 0 < γ ≤ 1.)

  33. Open problems. • Improve our upper bound O((1/ε) log log (1/ε)) for points and axis-parallel rectangles. Conservative goal: obtain a weak ε-net of size o((1/ε) log log (1/ε)); the points of a weak ε-net are not necessarily chosen from X. • Extend our bound to points and axis-parallel boxes in d ≥ 4. Best known upper bound: O((1/ε) log (1/ε)). • Dual range spaces for rectangles and points. Best known upper bound: O((1/ε) log (1/ε)). Can it be improved to O((1/ε) log log (1/ε))?

  34. Motivation: approximation for geometric hitting sets. The Bronimann-Goodrich technique / LP-relaxation: if (X, R) admits an ε-net of size f(1/ε), then there exists a polynomial-time approximation algorithm that reports a hitting set of size O(f(OPT)). Idea: assign weights on X s.t. each range Q ∈ R becomes heavy. Construct an ε-net for the weighted range space. Each range is hit by the ε-net. Small-size ε-nets imply small approximation factors!

  35. The repair step. On average, each heavy rectangle Q must satisfy Q ∩ S ≠ ∅, so the number of “bad” rectangles is small. It is sufficient to consider a set ℳ of maximal S-empty rectangles, instead of R. ℳ is defined over the points of S, so |ℳ| = f(1/ε) does not depend on n, and neither does the number of points sampled at the repair step.

  36. An O(s log s) bound for |ℳ|. Key observation: consider a vertical line l and all points to its left. Claim: the number of maximal S-empty rectangles anchored at l is only linear. Handling a query rectangle Q: one of the halves Q’ of Q contains at least n/(2r) points. Q’ is anchored at l. Expand Q’ on the “heavier” side of l.

  37. Tree decomposition. Each node is a vertical strip. • Build a balanced binary tree T on X, sorted by x-coordinate. • Stop the expansion of T when nodes have n/r points. T has O(log r) = O(log s) levels. At each level, the number of maximal S-empty anchored rectangles is O(s). Overall (over all levels): O(s log s). (Figure: a node v and the splitting lines l1, l2, l‘2, l3, l‘3, l‘‘3.)

  38. Query rectangle Q. For an input rectangle Q with at least n/r points: find the first (highest) node of T whose bounding line l meets Q. Expand Q within the “heavier” strip σ_v of a node v bounded by l. The maximal S-empty anchored rectangles comprise the representative set for R.

  39. Is the bound optimal? Theorem [Komlos, Pach, Woeginger 92]: the bound is tight! The construction: artificial, on abstract hypergraphs (non-geometric!). No lower bound better than Ω(1/ε) is known in geometry. What is the actual bound? O(1/ε)? (This is achieved by points and intervals on the real line.) Goal: obtain smaller bounds for geometric range spaces. Ideally O(1/ε), but anything better than O((1/ε) log (1/ε)) is “exciting”!

  40. Bounding the ε-net size. Exponential Decay Lemma [Chazelle, Friedman 90], [Agarwal, Matousek, Schwarzkopf 98]: E{ |ℳ_t| } = O(2^{-t} E{ |ℳ'| }), where: • S' is a smaller random sample, each point chosen with probability s/(t n). • ℳ_t – all maximal S-empty rectangles M with excess t_M ≥ t. • ℳ' – all maximal S'-empty rectangles.

  41. Bounding the final ε-net size. A very useful tool: Exponential Decay Lemma [Chazelle, Friedman 90], [Agarwal, Matousek, Schwarzkopf 98]: E{ |ℳ_t| } = O(2^{-t} E{ |ℳ| }), where ℳ_t is the set of all maximal S-empty rectangles M with excess t_M ≥ t. The number of heavy rectangles decreases exponentially!

  42. A nearly-linear bound for |ℳ|. Fix a node v of T and its strip σ_v: X_v = X ∩ σ_v, S_v = S ∩ σ_v. Lemma: the number of maximal S_v-empty anchored rectangles in σ_v is O(|S_v|). At a fixed level i of T, the overall number is O(s). Overall: O(s log r). (Figure: the entry side of σ_v.)

  43. The set-cover problem. Primal: a hitting set for (X, R) is a subset H ⊆ X s.t. for any Q ∈ R, Q ∩ H ≠ ∅. Dual: a set cover for (X, R) is a subset S ⊆ R s.t. any x ∈ X is covered by S. A set cover for (X, R) is a hitting set for (R, X*). Finding a set cover of smallest size is NP-hard (even for geometric range spaces)! Achieve improved approximation factors via ε-nets (using the Bronimann-Goodrich technique / LP-relaxation): an ε-net of size f(1/ε) yields an approximation of size O(f(OPT)).

  44. ε-nets for dual range spaces. depth(p) = #ranges that cover p ∈ X. An ε-net for (R, X*) is a subset N ⊆ R that covers all points at depth ≥ ε|R|. An ε-net is a set cover for all the deep points. Example: intervals and points on the real line: |N| = 1/ε.

  45. Extensions to axis-parallel boxes in 3-space. Use similar machinery, with s = c r log log r, and a 3-level range tree decomposition (by x-order, y-order, and z-order). At each fixed triple-level of the tree, we have a subdivision of space into (clipped) orthants.

  46. Axis-parallel boxes in 3-space. Fix an orthant σ. Consider the points in σ, and the set ℳ_σ of all maximal S-empty boxes anchored at the apex of σ. All these boxes grow from a common point; they behave as maximal S-empty orthants! Claim: |ℳ_σ| = O(s). E{ |ℳ| } = E{ Σ_σ |ℳ_σ| } = O(s log^3 s). The expected size of the ε-net is O((1/ε) log log (1/ε)).
