## Locality Sensitive Distributed Computing


**Locality Sensitive Distributed Computing**

David Peleg, Weizmann Institute

**Structure of mini-course**

• Basics of distributed network algorithms
• Locality-preserving network representations
• Constructions and applications

**Part 2: Representations**

• Clustered representations
 - Basic concepts: clusters, covers, partitions
 - Sparse covers and partitions
 - Decompositions and regional matchings
• Skeletal representations
 - Spanning trees and tree covers
 - Sparse and light-weight spanners

**Basic idea of locality-sensitive distributed computing**

• Utilize locality to both simplify control structures and algorithms and reduce their costs
• An operation performed in a large network may concern only a few processors in a small region (a global operation may have local sub-operations)
• Reduce costs by utilizing "locality of reference"

**Components of locality theory**

• A general framework, complexity measures and algorithmic methodology
• Suitable graph-theoretic structures and efficient construction methods
• Adaptation to a wide variety of applications

**Fundamental approach**

• Clustered representation: impose a clustered hierarchical organization on the given network, and use it efficiently for bounding the complexity of distributed algorithms
• Skeletal representation: sparsify the given network, and execute applications on the remaining skeleton, reducing complexity

**Clusters, covers and partitions**

• Cluster = a connected subset of vertices S ⊆ V
• Cover of G(V,E,w) = a collection of clusters S = {S1,...,Sm} containing all vertices of G (i.e., s.t. ∪S = V)
• Partial partition of G = a collection of disjoint clusters S = {S1,...,Sm}, i.e., s.t.
Si ∩ Sj = ∅ for i ≠ j
• Partition = cover & partial partition

**Evaluation criteria**

Locality and sparsity:
• Locality level: cluster radius
• Sparsity level: vertex / cluster degrees

Locality-sparsity tradeoff: the locality and sparsity parameters go opposite ways: better sparsity ⇔ worse locality (and vice versa).

**Locality measures**

Weighted distances:
• Length of a path (e1,...,es) = ∑_{1≤i≤s} w(ei)
• dist(u,w,G) = (weighted) length of a shortest u-w path
• dist(U,W) = min{ dist(u,w) | u ∈ U, w ∈ W }
• Diameter and radius: as before, except weighted
• Denote logD = ⌈log Diam(G)⌉
• For a collection of clusters S: Diam(S) = max_i Diam(Si), Rad(S) = max_i Rad(Si)

**Neighborhoods**

• Γ(v) = neighborhood of v = the set of neighbors of v in G (including v itself)
• Γ_ℓ(v) = ℓ-neighborhood of v = the vertices at distance ℓ or less from v (e.g., Γ_0(v) ⊆ Γ_1(v) ⊆ Γ_2(v))

**Neighborhood covers**

For W ⊆ V: Γ̂_ℓ(W) = ℓ-neighborhood cover of W = { Γ_ℓ(v) | v ∈ W } (the collection of ℓ-neighborhoods of the vertices of W)
• E.g., Γ̂_0(V) = the partition into singleton clusters
• E.g., Γ̂_1(W) = a cover of the nodes of W by their neighborhoods

**Sparsity measures**

Different representations call for different ways of measuring sparsity.

Cover sparsity measure (overlap):
• deg(v,S) = # occurrences of v in clusters S ∈ S, i.e., the degree of v in the hypergraph (V,S) (e.g., a vertex occurring in 3 clusters has deg(v,S) = 3)
• Δ_C(S) = maximum degree of the cover S
• AvD(S) = average degree of S = ∑_{v∈V} deg(v,S) / n = ∑_{S∈S} |S| / n

Partition sparsity measure (adjacency):
Intuition: "contract" the clusters into super-nodes and look at the resulting cluster graph of S, G(S) = (S, Ē), where
Ē = { (S,S') | S,S' ∈ S, G contains an edge (u,v) with u ∈ S and v ∈ S' }
The edges of Ē are the inter-cluster edges.

**Cluster-neighborhood**

Def: Given a partition S, a cluster S ∈ S and an integer ℓ ≥ 0, the cluster-neighborhood of S is the neighborhood of S in the cluster graph G(S):
Γc_ℓ(S,G) = Γ_ℓ(S, G(S))

**Sparsity measure**

Average cluster-degree of a
partition S: AvDc(S) = ∑_{S∈S} |Γc(S)| / n

Note: AvDc(S) reflects the number of inter-cluster edges.

**Example: A basic construction**

Goal: produce a partition S with:
1. clusters of radius ≤ k
2. few inter-cluster edges (or, low AvDc(S))

Algorithm BasicPart operates in iterations, each constructing one cluster. At the end of an iteration:
• Add the resulting cluster S to the output collection S
• Discard it from V
• If V is not empty then start a new iteration

**Iteration structure**

• Arbitrarily pick a vertex v from V
• Grow a cluster S around v, adding layer by layer
• Vertices added to S are discarded from V
• The layer-merging process is carried out repeatedly until reaching the required sparsity condition: the next layer no longer increases # vertices by a factor of more than n^{1/k} (i.e., layers are merged only while |Γ(S)| > |S| · n^{1/k})

**Analysis**

Av-Deg-Partition Thm: Given an n-vertex graph G(V,E) and an integer k ≥ 1, Algorithm BasicPart creates a partition S satisfying:
1. Rad(S) ≤ k−1
2. # inter-cluster edges in G(S) ≤ n^{1+1/k} (or, AvDc(S) ≤ n^{1/k})

**Analysis (cont)**

Proof:

Correctness:
• Every S added to S is a (connected) cluster
• The generated clusters are disjoint (the algorithm erases from V every v added to a cluster)
• S is a partition (it covers all vertices)

Property (2): [ # inter-cluster edges ≤ n^{1+1/k} ]
By the termination condition of the internal loop, the resulting S satisfies |Γ(S)| ≤ n^{1/k}·|S|, hence (# inter-cluster edges touching S) ≤ n^{1/k}·|S|.
This number can only decrease in later iterations, if adjacent vertices get merged into the same cluster. Therefore
# inter-cluster edges ≤ ∑_{S∈S} n^{1/k}·|S| = n^{1+1/k}

Property (1): [ Rad(S) ≤ k−1 ]
Consider an iteration of the main loop.
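The layer-growing construction described above can be sketched in runnable form. This is a best-effort reconstruction, not the course's own code; the adjacency-list graph representation and the function name are our assumptions.

```python
# Best-effort sketch of Algorithm BasicPart (not the course's own code).
# graph: dict mapping vertex -> set of neighbors (undirected, unweighted).

def basic_part(graph, k):
    """Partition V into clusters of radius <= k-1 with few inter-cluster edges."""
    n = len(graph)
    remaining = set(graph)
    clusters = []
    while remaining:
        v = min(remaining)              # arbitrary (but deterministic) pick from V
        cluster = {v}
        while True:
            # next layer: neighbors of the cluster still remaining in V
            layer = {u for w in cluster for u in graph[w]
                     if u in remaining} - cluster
            # stop once the next layer would no longer grow the cluster
            # by a factor of more than n^(1/k), i.e. |Γ(S)| <= n^(1/k)·|S|
            if len(cluster) + len(layer) <= n ** (1.0 / k) * len(cluster):
                break
            cluster |= layer            # merge the layer into S
        clusters.append(cluster)
        remaining -= cluster            # discard S from V
    return clusters
```

For example, on a star graph with k = 2 the first layer multiplies the cluster size by more than √n, so the whole star is swallowed into a single cluster of radius 1 (= k−1), while on sparser spots the growth stops early and singleton clusters result.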
Let J = # times the internal loop was executed, and let Si be the S constructed in the i-th internal iteration.
|Si| > n^{(i−1)/k} for 2 ≤ i ≤ J (by induction on i)
⇒ J ≤ k (otherwise |S| > n)
Note: Rad(Si) ≤ i−1 for every 1 ≤ i ≤ J (S1 is composed of a single vertex, and each additional layer increases Rad(Si) by 1)
⇒ Rad(SJ) ≤ k−1

**Variant - Separated partial partitions**

Sep(S) = separation of a partial partition S = the minimal distance between any two clusters of S.
When Sep(S) = s, we say S is s-separated (e.g., a 2-separated partial partition).

**Coarsening**

A cover T = {T1,...,Tq} coarsens S = {S1,...,Sp} if every cluster of S is fully subsumed in some cluster of T.
The radius ratio of the coarsening = Rad(T) / Rad(S).

Motivation: given a "useful" S with high overlaps, coarsen S by merging some clusters together, getting a coarsening cover T with
• larger clusters
• better sparsity
• increased radii

**Sparse covers**

Goal: for an initial cover S, construct a coarsening T with low overlaps, paying little in cluster radii.
Inherent tradeoff: lower overlap ⇔ higher radius ratio (and vice versa).
Simple goal: low average degree.

**Algorithm AvCover**

• Operates in iterations; each iteration merges together some clusters of S into one output cluster Z ∈ T
• At the end of an iteration:
 - Add the resulting cluster Z to the output collection T
 - Discard the merged clusters from S
 - If S is not empty then start a new iteration

**Iteration structure**

• Arbitrarily pick a cluster S0 in S (as the kernel Y of the cluster Z constructed next)
• Repeatedly merge the cluster with intersecting clusters from S (adding one layer at a time)
• Clusters added to Z are discarded from S
• The layer-merging process is carried out repeatedly until reaching the required sparsity condition: adding the next layer increases # vertices by a factor of ≤ n^{1/k} (i.e., |Z| ≤ |Y| · n^{1/k})

**Analysis**

Thm: Given a graph G(V,E,w), a cover S and an integer k ≥ 1, Algorithm AvCover constructs a
cover T s.t.:
• T coarsens S
• Rad(T) ≤ (2k+1) · Rad(S) (radius ratio ≤ 2k+1)
• AvD(T) ≤ n^{1/k} (low average sparsity)

**Analysis (cont)**

Corollary (for ℓ-neighborhood covers): Given G(V,E,w) and integers k, ℓ ≥ 1, there exists a cover T = T_{ℓ,k} s.t.
• T coarsens the neighborhood cover Γ̂_ℓ(V)
• Rad(T) ≤ (2k+1)·ℓ
• AvD(T) ≤ n^{1/k}

Proof of Thm:

Property (1): [ T coarsens S ]
Holds directly from the construction (each Z added to T is a (connected) cluster, since at the beginning S contained clusters).

Claim: The kernels Y corresponding to the clusters Z generated by the algorithm are mutually disjoint.
Proof: By contradiction. Suppose there is a vertex v s.t. v ∈ Y ∩ Y', and w.l.o.g. suppose Y was created before Y'.
Since v ∈ Y', there is a cluster S' s.t. v ∈ S' and S' was still in S when the algorithm started constructing Y'.
But then S' satisfies S' ∩ Y ≠ ∅, so the final merge creating Z from Y should have added S' into Z and eliminated it from S; contradiction.

Property (2): [ Rad(T) ≤ (2k+1)·Rad(S) ]
Consider some iteration of the main loop (starting with a cluster S ∈ S). Let
• J = # times the internal loop was executed
• Z0 = the initial set Z
• Zi = the Z constructed in the i-th internal iteration (1 ≤ i ≤ J), and respectively Yi

Note 1: |Zi| > n^{i/k} for every 1 ≤ i ≤ J−1 ⇒ J ≤ k
Note 2: Rad(Yi) ≤ (2i−1)·Rad(S) for every 1 ≤ i ≤ J ⇒ Rad(YJ) ≤ (2k−1)·Rad(S)
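The merging procedure of Algorithm AvCover described above can likewise be sketched in code. This is a best-effort reconstruction under our own representation assumptions (clusters as vertex sets, the function name is ours), not the course's implementation, and it tracks only cluster membership, not weighted radii.

```python
# Best-effort sketch of Algorithm AvCover (not the course's own code).
# vertices: the vertex set V; cover: a list of clusters (sets of vertices).

def av_cover(vertices, cover, k):
    """Coarsen `cover` into a cover T with average degree at most n^(1/k)."""
    n = len(vertices)
    S = [set(c) for c in cover]         # working copy of the input cover
    T = []
    while S:
        Z = S.pop()                     # arbitrary cluster, seed of the kernel
        while True:
            Y = set(Z)                  # current kernel
            # merge every remaining cluster intersecting the kernel Y
            for c in [c for c in S if c & Y]:
                Z |= c
                S.remove(c)
            # sparsity condition: the new layer grew Z by a factor <= n^(1/k)
            if len(Z) <= n ** (1.0 / k) * len(Y):
                break
        T.append(Z)
    return T
```

Every cluster of the input cover is either chosen as a seed or merged into some Z, so the output T coarsens the input, matching Property (1) of the theorem.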