Embedding Metric Spaces in Their Intrinsic Dimension. Ittai Abraham, Yair Bartal*, Ofer Neiman. The Hebrew University (* also Caltech)
Embedding Metric Spaces • Metric spaces (X,d_X), (Y,d_Y). • An embedding is a function f : X→Y. • The distortion of f is the minimal α such that d_X(x,y) ≤ d_Y(f(x),f(y)) ≤ α·d_X(x,y) for all x,y ∈ X.
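The distortion definition can be evaluated directly on a finite example. The sketch below is illustrative only: the function name `distortion`, the 4-cycle, and the unit-square map are assumptions, not part of the talk. It computes (maximum expansion) × (maximum contraction), which equals the α of the slide once f is rescaled to be non-contractive.

```python
from itertools import combinations
import math

def distortion(points, d_X, d_Y, f):
    """Distortion of f up to rescaling: (max expansion) * (max contraction).

    Assumes f is injective, so no image distance is zero.
    """
    expansion = 0.0
    contraction = 0.0
    for x, y in combinations(points, 2):
        dx = d_X(x, y)
        dy = d_Y(f(x), f(y))
        expansion = max(expansion, dy / dx)
        contraction = max(contraction, dx / dy)
    return expansion * contraction

# Hypothetical example: the 4-cycle with shortest-path distances,
# mapped to the corners of the unit square in the Euclidean plane.
cycle = [0, 1, 2, 3]
corners = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (1.0, 1.0), 3: (0.0, 1.0)}

def d_cycle(a, b):
    return min(abs(a - b), 4 - abs(a - b))

alpha = distortion(cycle, d_cycle, math.dist, corners.get)
print(alpha)
```

Adjacent vertices keep distance 1 while opposite vertices shrink from 2 to √2, so this map has distortion √2.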
Intrinsic Dimension • Doubling constant: the minimal λ such that any ball of radius r>0 can be covered by λ balls of radius r/2. • Doubling dimension: dim(X) = log_2 λ. • The problem: the relation between metric dimension and intrinsic dimension.
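For a small finite metric, the doubling constant can be bounded from above by brute force. The sketch below is an assumption-laden illustration, not the talk's method: it uses closed balls, restricts the centers of the covering balls to points of X, and covers greedily, so it returns an upper bound on λ rather than the exact minimum. The name `doubling_constant_upper_bound` is hypothetical.

```python
import math

def doubling_constant_upper_bound(points, d):
    """Greedy upper bound on the doubling constant (closed balls, centers in X)."""
    worst = 1
    for x in points:
        # The ball B(x, r) only changes when r passes a distance from x,
        # and covering is hardest at the smallest such r.
        for r in sorted({d(x, y) for y in points if y != x}):
            uncovered = {y for y in points if d(x, y) <= r}
            count = 0
            while uncovered:
                # Pick the center whose (r/2)-ball covers the most uncovered points.
                best = max(points,
                           key=lambda c: sum(1 for y in uncovered if d(c, y) <= r / 2))
                uncovered -= {y for y in uncovered if d(best, y) <= r / 2}
                count += 1
            worst = max(worst, count)
    return worst

# Hypothetical example: eight equally spaced points on a line.
line = list(range(8))
lam = doubling_constant_upper_bound(line, lambda a, b: abs(a - b))
dim_bound = math.log2(lam)  # upper bound on the doubling dimension
print(lam, dim_bound)
```

The loop always terminates because every uncovered point covers itself. The cost is polynomial but high, so this is only suitable for toy inputs.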
Previous Results • Given a λ-doubling finite metric space (X,d) and 0<γ<1, its snowflake version (X,d^γ) can be embedded into L_p with distortion and dimension depending only on λ [Assouad 83]. • Conjecture (Assouad): this holds for γ=1. • Disproved by Semmes. • A lower bound of Ω(√(log n)) on the distortion for L_2, with a matching upper bound [GKL 03].
Rephrasing the Question • Is there a low-distortion embedding of a finite metric space into its intrinsic dimension? • Main result: Yes.
Main Results • Any finite metric space (X,d) embeds into L_p: • With distortion O(log^{1+θ} n) and dimension O(dim(X)/θ), for any θ>0. • With constant average distortion and dimension O(dim(X)·log(dim(X))).
Additional Result • Any finite metric space (X,d) embeds into L_p: • With distortion [...] and dimension [...], for all D ≤ (log n)/dim(X). • In particular, Õ(log^{2/3} n) distortion and dimension [...] into L_2. • Matches the best known distortion result [KLMN 03] for D=(log n)/dim(X), with dimension O(log n·log(dim(X))).
Distance Oracles • A compact data structure that approximately answers distance queries. • For general n-point metrics: • [TZ 01] O(k) stretch with O(k·n^{1/k}) bits per label. • For a finite λ-doubling metric: • O(1) average stretch with Õ(log λ) bits per label. • O(k) stretch with Õ(λ^{1/k}) bits per label. • Follows from a variation on the “snowflake” embedding (Assouad).
First Result • Thm: For any finite λ-doubling metric space (X,d) on n points and any 0<θ<1, there exists an embedding of (X,d) into L_p with distortion O(log^{1+θ} n) and dimension O((log λ)/θ).
Probabilistic Partitions • P={S_1,S_2,…,S_t} is a partition of X if the clusters S_i are pairwise disjoint and their union is X. • P(x) is the cluster containing x. • P is Δ-bounded if diam(S_i)≤Δ for all i. • A probabilistic partition is a distribution over a set of partitions. • A Δ-bounded probabilistic partition is η-padded if for all x∈X: Pr[B(x,ηΔ) ⊆ P(x)] ≥ 1/2.
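To make these definitions concrete, here is a minimal sketch of a Δ-bounded probabilistic partition together with an empirical estimate of the padding probability. It uses ball carving with a random center order and a random radius in [Δ/4, Δ/2], in the style of [CKR01]; this is a swapped-in stand-in, not the uniformly padded local partition constructed in the talk. All function names are hypothetical.

```python
import random

def ckr_partition(points, d, delta, rng):
    """Sample one delta-bounded partition; returns {point: center of its cluster}."""
    radius = rng.uniform(delta / 4, delta / 2)
    order = list(points)
    rng.shuffle(order)
    # Each point joins the first center (in random order) within the radius.
    # A point is within distance 0 of itself, so a center always exists.
    return {x: next(c for c in order if d(x, c) <= radius) for x in points}

def is_padded(x, points, d, cluster_of, eta, delta):
    """True if the ball B(x, eta * delta) lies inside the cluster of x."""
    return all(cluster_of[y] == cluster_of[x]
               for y in points if d(x, y) <= eta * delta)

def padding_probability(x, points, d, delta, eta, trials, seed=0):
    """Monte Carlo estimate of Pr[B(x, eta * delta) is contained in P(x)]."""
    rng = random.Random(seed)
    hits = sum(is_padded(x, points, d, ckr_partition(points, d, delta, rng), eta, delta)
               for _ in range(trials))
    return hits / trials

# Hypothetical example: 16 points on a line, clusters of diameter at most 4.
line = list(range(16))
dist = lambda a, b: abs(a - b)
sample = ckr_partition(line, dist, 4, random.Random(1))
print(sample)
print(padding_probability(5, line, dist, delta=4, eta=0.25, trials=200))
```

Every cluster lies within the sampled radius of its center, so its diameter is at most twice the radius, which is at most Δ.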
η-padded Partitions • The parameter η determines the quality of the embedding. • [Bartal 96]: η=Ω(1/log n) for any metric space. • [CKR01+FRT03]: improved partitions with η(x)=1/log(ρ(x,Δ)). • [GKL 03]: η=Ω(1/log λ) for λ-doubling metrics. • [KLMN 03]: used to embed general + doubling metrics into L_p: distortion O((log λ)^{1-1/p}·(log n)^{1/p}), dimension O(log^2 n). • The local growth rate of x at radius r is: ρ(x,r) = [...]
Uniform Local Padding Lemma • A local padding: the padding probability for x is independent of the partition outside B(x,Δ). • A uniform padding: the padding parameter η(x) is equal for all points in the same cluster. • There exists a Δ-bounded probabilistic partition with local uniform padding parameter η(x): • η(x)=Ω(1/log λ) • η(x)=Ω(1/log(ρ(x,Δ))) • [Figure: clusters C1 and C2 containing points v1, v2, v3, with padding parameters η(v1) and η(v3).]
Plan: • A simpler result: • Distortion O(log n). • Dimension O(loglog n·log λ). • Obtaining the lower dimension of O(log λ). • Brief overview of: • Constant average distortion. • Distortion-dimension tradeoff.
Embedding into One Dimension • For each scale i∈Z, create a uniformly padded local probabilistic 8^i-bounded partition P_i. • For each cluster S choose σ_i(S)~Ber(½) i.i.d. • Define f_i(x)=σ_i(P_i(x))·min{η_i^{-1}(x)·d(x,X\P_i(x)), 8^i} and f=Σ_i f_i. • Deterministic upper bound: |f(x)-f(y)| ≤ O(log n·d(x,y)), using [...] • [Figure: a point x inside its cluster P_i(x), at distance d(x,X\P_i(x)) from the rest of X.]
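The single-scale coordinate f_i can be computed directly once a partition, the padding parameters and the cluster bits are given. The sketch below takes all three as inputs rather than sampling them. The function name `scale_coordinate`, the two-cluster example and the value η=0.5 are assumptions made for illustration.

```python
def scale_coordinate(points, d, cluster_of, sigma, eta, scale):
    """f_i(x) = sigma(P(x)) * min(d(x, X \\ P(x)) / eta(x), scale) for one scale.

    cluster_of: point -> cluster label, sigma: cluster label -> 0 or 1,
    eta: point -> padding parameter, scale: the bound 8^i of this scale.
    """
    f = {}
    for x in points:
        outside = [d(x, y) for y in points if cluster_of[y] != cluster_of[x]]
        # If the cluster is all of X, the distance to the complement is infinite.
        dist_out = min(outside) if outside else float("inf")
        f[x] = sigma[cluster_of[x]] * min(dist_out / eta[x], scale)
    return f

# Hypothetical example: two clusters on a line, uniform eta, scale 8.
line = list(range(8))
cluster_of = {x: "A" if x < 4 else "B" for x in line}
sigma = {"A": 1, "B": 0}      # one fixed outcome of the Ber(1/2) choices
eta = {x: 0.5 for x in line}  # equal inside each cluster (uniform padding)
f = scale_coordinate(line, lambda a, b: abs(a - b), cluster_of, sigma, eta, 8)
print(f)
```

In this example the coordinate falls from 8 at x=0 to 2 at x=3 and is 0 on cluster B, so it changes by at most d(x,y)/η between any two points. Summing such bounds over the scales is what leads to the upper bound on |f(x)-f(y)|.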
Lower Bound - Overview • Create an r_i-net for every integer i. • Define the success event for a pair (u,v) in the r_i-net with d(u,v)≈8^i as having contribution >8^i/4 in many coordinates. • In every coordinate, a net pair (u,v) has constant probability of having contribution. • Use the Lovász Local Lemma. • Show the lower bound for other pairs.
Lower Bound – Other Pairs? • Let x,y be some pair with d(x,y)≈8^i, and let u,v be the points of the r_i-net nearest to x,y. • Suppose that |f(u)-f(v)|>8^i/4. • We want to choose the net such that |f(u)-f(x)|<8^i/16: choose r_i=8^i/(16·log n). • Using the upper bound: |f(u)-f(x)| ≤ log n·d(u,x) ≤ 8^i/16. • |f(x)-f(y)| ≥ |f(u)-f(v)|-|f(u)-f(x)|-|f(v)-f(y)| ≥ 8^i/4-2·8^i/16 = 8^i/8. • [Figure: x and y with their nearest net points u and v, each within distance 8^i/(16 log n).]
Lower Bound: • Let (u,v) be an r_i-net pair. Can assume that 8^i≈d(u,v)/4. • It must be that P_i(u)≠P_i(v), since P_i is 8^i-bounded and d(u,v)>8^i. • With probability ½: d(u,X\P_i(u))≥η_i·8^i. • With probability ¼: σ_i(P_i(u))=1 and σ_i(P_i(v))=0.
Lower Bound – Net Pairs • d(u,v)≈8^i. Consider R = [...] • If R<8^i/2: • With prob. 1/8, f_i(u)-f_i(v)≥8^i. • If R≥8^i/2: • With prob. 1/4, f_i(u)=f_i(v)=0. • In any case [...] • Lower scales do not matter. • The good event for a pair in scale i depends on higher scales, but has constant probability given any outcome for them. It is oblivious to lower scales. • [Figure: net pair u, v with padding radius η_i(u)·8^i around u.]
Local Lemma • Lemma (Lovász): Let A_1,…,A_n be “bad” events. Let G=(V,E) be a directed graph whose vertices correspond to the events, with out-degree at most d. Let c:V→N be a “rating” function on the events such that if (A_i,A_j)∈E then c(A_i)≥c(A_j). If [...] and [...], then [...] • Rating = radius of scale.
Lower Bound – Net Pairs • A success event E(u,v) for a net pair u,v: there is contribution from at least 1/16 of the coordinates. • Locality of the partition: the event for a net pair depends only on “nearby” points, at distance <8^i. • With doubling constant λ and r_i≈8^i/log n, there are at most λ^{loglog n} such points, so d=λ^{loglog n}. • Taking D=O(log λ·loglog n) coordinates gives roughly e^{-D}=λ^{-loglog n} failure probability. • By the local lemma, there exists an embedding such that E(u,v) holds for all net pairs.
Obtaining Lower Dimension • To use the LLL, the probability of failing in more than 15/16 of the coordinates must be <λ^{-loglog n}. • Instead of taking more coordinates, increase the success probability in each coordinate. • If the probability of obtaining contribution in each coordinate is >1-1/log n, it is enough to take O(log λ) coordinates. • Similarly, if the failure probability in each coordinate is <log^{-θ} n, it is enough to take O((log λ)/θ) coordinates.
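The coordinate count can be checked with a short union-bound calculation. The sketch below rests on stated assumptions: the D coordinates are independent, each fails with probability at most p = log^{-θ} n, logarithms are base 2, and n is large enough that loglog n ≥ 32/(15θ). The constants are illustrative and are not taken from the talk.

```latex
\begin{align*}
\Pr\Big[\text{at least } \tfrac{15}{16}D \text{ coordinates fail}\Big]
  &\le \binom{D}{\lceil 15D/16 \rceil}\, p^{15D/16}
   \le 2^{D}\,(\log n)^{-15\theta D/16} \\
  &= 2^{-D\left(\frac{15\theta}{16}\log\log n \,-\, 1\right)}
   \le 2^{-D\cdot\frac{15\theta}{32}\log\log n}.
\end{align*}
The last inequality uses $\log\log n \ge 32/(15\theta)$. Taking
\[
  D \;\ge\; \frac{32}{15\,\theta}\,\log\lambda
\]
bounds the failure probability by
$2^{-\log\lambda\cdot\log\log n} = \lambda^{-\log\log n}$,
so $D = O((\log\lambda)/\theta)$ coordinates suffice.
```

With a constant per-coordinate failure probability the same calculation only gives 2^{-Ω(D)}, which is why the simpler result needs D=O(log λ·loglog n).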
Using Several Scales • Create nets only every θ·loglog n scales. • A pair (x,y) in scale i’ (i.e. d(x,y)≈8^{i’}) will find a close net pair in the nearest smaller scale i. • 8^{i’}<log^θ n·8^i, so we lose a factor of log^θ n in the distortion. • Consider scales i-θ·loglog n,…,i. • [Figure: scale axis marking i-θ·loglog n, i, i’ and i+θ·loglog n, with i’-i<θ·loglog n.]
Using Several Scales • Take u,v in the net with d(u,v)≈8^i. • A success in one of these scales gives contribution >8^{i-θ·loglog n}=8^i/log^θ n. • The success for u,v in each scale is: • Unaffected by events in higher scales. • Independent of events “far away” in the same scale. • Oblivious to events in lower scales. • The probability that all scales fail is <(7/8)^{θ·loglog n}. • Take only D=O((log λ)/θ) coordinates. • Lose a factor of log^θ n in the distortion. • [Figure: scale axis marking i-θ·loglog n, i and i+θ·loglog n.]
Constant Average Distortion • Scaling distortion: for every 0<ε<1, at most ε·n^2 pairs have distortion >polylog(1/ε). • Upper bound of log(1/ε), by standard techniques. • Lower bound: • Define a net for any scale i>0 and ε=exp{-8^j}. • Every pair (x,y) needs contribution that depends on: • d(x,y). • The ε-value of x,y. • Sieve the nets to avoid dependencies between different scales and different values of ε. • Show that if a net pair succeeded, the points near it will also succeed.
Constant Average Distortion • Lower bound, continued: • The local lemma graph depends on ε; use the general case of the local lemma. • For a net pair (u,v) in scale 8^i, consider scales 8^{i-loglog(1/ε)},…,8^{i-loglog(1/ε)/2}. • Requires dimension O(log λ·loglog λ). • The net depends on λ.
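Why scaling distortion yields constant average distortion can be seen from a dyadic summation. The sketch below writes the polylog bound as α(ε) = C·log^c(1/ε) with base-2 logarithms; C and c are hypothetical constants, N is the number of pairs being counted, and dist(x,y) denotes the distortion of a single pair.

```latex
% Scaling distortion: for every k >= 1, at most a 2^{-k} fraction of the pairs
% has dist(x,y) > \alpha(2^{-k}), where \alpha(\varepsilon) = C \log^{c}(1/\varepsilon).
% Bucket the pairs by the range
%   \alpha(2^{-k}) < dist(x,y) \le \alpha(2^{-(k+1)}).
\begin{align*}
\frac{1}{N}\sum_{x \ne y} \mathrm{dist}(x,y)
  &\le \alpha\big(\tfrac12\big) + \sum_{k \ge 1} 2^{-k}\,\alpha\big(2^{-(k+1)}\big) \\
  &= C + C \sum_{k \ge 1} \frac{(k+1)^{c}}{2^{k}}
   \;=\; O(1).
\end{align*}
% The series converges for every fixed c, and every pair lands in some bucket:
% once 2^{-k} < 1/N, no pair can have distortion above \alpha(2^{-k}).
```

The resulting constant depends on C and c but not on n, which is the sense in which the average distortion is constant.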
Distortion-Dimension Tradeoff • Distortion: [...] • Dimension: [...] • Instead of assigning all scales to a single coordinate: • For each point x: divide the scales into D bunches of coordinates, in each [...] • Create a hierarchical partition. • D ≤ (log n)/log λ. • The upper bound needs the x,y scales to be in the same coordinates.
Conclusion • Main result: • Embedding metrics into their intrinsic dimension. • Open problems: • Best distortion in dimension O(log λ). • Dimension reduction in L_2: • For a doubling subset X of L_2, is there an embedding into L_2 with O(1) distortion and dimension O(dim(X))? • For p>2 there is a doubling metric space requiring dimension at least Ω(log n) for embedding into L_p with distortion O(log^{1/p} n).