Locality Sensitive Distributed Computing

David Peleg, Weizmann Institute

Structure of mini-course • Basics of distributed network algorithms • Locality-preserving network representations • Constructions and applications

Part 2: Representations • Clustered representations • Basic concepts: clusters, covers, partitions • Sparse covers and partitions • Decompositions and regional matchings • Skeletal representations • Spanning trees and tree covers • Sparse and light weight spanners

Basic idea of locality-sensitive distributed computing • Utilize locality to both simplify control structures and algorithms and reduce their costs • An operation performed in a large network may concern only a few processors in a small region (a global operation may have local sub-operations) • Reduce costs by utilizing “locality of reference”

Components of locality theory • General framework, complexity measures and algorithmic methodology • Suitable graph-theoretic structures and efficient construction methods • Adaptation to wide variety of applications

Fundamental approach • Clustered representation: • Impose clustered hierarchical organization on given network • Use it efficiently for bounding complexity of distributed algorithms. • Skeletal representation: • Sparsify given network • Execute applications on remaining skeleton, reducing complexity

Clusters, covers and partitions • Cluster = connected subset of vertices S ⊆ V

Clusters, covers and partitions • Cover of G(V,E,w) = collection of clusters 𝒮 = {S1,...,Sm} containing all vertices of G (i.e., s.t. ∪𝒮 = V)

Partitions • Partial partition of G = collection of disjoint clusters 𝒮 = {S1,...,Sm}, i.e., s.t. Si ∩ Sj = ∅ for i ≠ j • Partition = cover & partial partition

Evaluation criteria • Locality and sparsity • Locality level: cluster radius • Sparsity level: vertex / cluster degrees

Evaluation criteria • Locality-sparsity tradeoff: locality and sparsity parameters go opposite ways: better sparsity ⇔ worse locality (and vice versa)

Evaluation criteria: locality measures • Weighted distances: • Length of path (e1,...,es) = ∑_{1≤i≤s} w(ei) • dist(u,w,G) = (weighted) length of shortest path between u and w in G • dist(U,W) = min{ dist(u,w) | u∈U, w∈W }
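The weighted distance definitions above can be sketched in code as follows. This is only an illustration: the adjacency-dict representation and the names `dijkstra` and `set_distance` are assumptions of this sketch, not part of the slides, and Dijkstra's algorithm is used here simply as one standard way to obtain shortest weighted paths (it assumes non-negative weights).

```python
import heapq

def dijkstra(adj, source):
    """Weighted shortest-path distances from source.
    adj maps each vertex to a dict {neighbor: edge weight} (weights >= 0)."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def set_distance(adj, U, W):
    """dist(U, W) = min over u in U, w in W of dist(u, w)."""
    best = float("inf")
    for u in U:
        d = dijkstra(adj, u)
        for w in W:
            best = min(best, d.get(w, float("inf")))
    return best

# Hypothetical example: the path a-b-c-d with edge weights 1, 2, 3.
graph = {"a": {"b": 1}, "b": {"a": 1, "c": 2},
         "c": {"b": 2, "d": 3}, "d": {"c": 3}}
print(dijkstra(graph, "a")["d"])              # dist(a, d)
print(set_distance(graph, {"a", "b"}, {"d"}))  # dist({a, b}, {d})
```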

Evaluation criteria • Diameter, radius: as before, except weighted • Denote logD = ⌈log Diam(G)⌉ • For a collection of clusters 𝒮: Diam(𝒮) = max_i Diam(Si), Rad(𝒮) = max_i Rad(Si)
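A small sketch of the radius and diameter of a collection of clusters. The slide refers back to earlier definitions ("as before") that are not part of this transcript, so the sketch assumes the usual ones: the radius of a cluster is the smallest eccentricity of any of its vertices, the diameter is the largest pairwise distance, and distances are measured inside the subgraph induced by the cluster. All function names are hypothetical.

```python
from itertools import product

def cluster_distances(adj, cluster):
    """All-pairs weighted distances inside the subgraph induced by cluster
    (Floyd-Warshall). adj maps vertex -> {neighbor: weight}."""
    nodes = list(cluster)
    inf = float("inf")
    d = {(u, v): 0 if u == v else adj[u].get(v, inf)
         for u, v in product(nodes, repeat=2)}
    # product varies its first component slowest, so m is the outer loop.
    for m, u, v in product(nodes, repeat=3):
        if d[u, m] + d[m, v] < d[u, v]:
            d[u, v] = d[u, m] + d[m, v]
    return d

def rad(adj, cluster):
    d = cluster_distances(adj, cluster)
    return min(max(d[c, v] for v in cluster) for c in cluster)

def diam(adj, cluster):
    return max(cluster_distances(adj, cluster).values())

def rad_of_collection(adj, clusters):
    return max(rad(adj, S) for S in clusters)

def diam_of_collection(adj, clusters):
    return max(diam(adj, S) for S in clusters)

# Hypothetical example: the path a-b-c-d with edge weights 1, 2, 3.
graph = {"a": {"b": 1}, "b": {"a": 1, "c": 2},
         "c": {"b": 2, "d": 3}, "d": {"c": 3}}
clusters = [{"a", "b", "c"}, {"d"}]
print(rad_of_collection(graph, clusters), diam_of_collection(graph, clusters))
```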

Neighborhoods • Γ(v) = neighborhood of v = set of neighbors of v in G (including v itself)

Neighborhoods • Γ_l(v) = l-neighborhood of v = vertices at distance l or less from v (Figure: nested neighborhoods Γ_0(v), Γ_1(v), Γ_2(v))

Neighborhood covers • For W ⊆ V: Γs_l(W) = l-neighborhood cover of W = { Γ_l(v) | v∈W } (collection of l-neighborhoods of W vertices)

Neighborhood covers • E.g.: Γs_0(V) = partition into singleton clusters

Neighborhood covers • E.g.: Γs_1(W) = cover of W nodes by their neighborhoods (Figure: W = colored nodes, with the cover Γs_1(W))
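The l-neighborhood and the l-neighborhood cover can be sketched as below. To keep the example short it assumes unit edge weights, so "distance l or less" becomes "at most l hops" and plain breadth-first search suffices; the names `neighborhood` and `neighborhood_cover` are hypothetical.

```python
from collections import deque

def neighborhood(adj, v, l):
    """Vertices within hop distance l of v (unit edge weights assumed).
    adj maps each vertex to a list of its neighbors."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        if dist[u] == l:
            continue  # do not expand beyond distance l
        for x in adj[u]:
            if x not in dist:
                dist[x] = dist[u] + 1
                queue.append(x)
    return frozenset(dist)

def neighborhood_cover(adj, W, l):
    """Collection of the l-neighborhoods of the vertices of W.
    Stored as a set, so identical neighborhoods collapse into one cluster."""
    return {neighborhood(adj, v, l) for v in W}

# Hypothetical example: the path 1-2-3-4-5.
path = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
print(sorted(neighborhood(path, 3, 1)))
print(neighborhood_cover(path, path, 0))  # singleton clusters
```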

Sparsity measures • Different representations ⇒ different ways to measure sparsity

Cover sparsity measure - overlap • deg(v,𝒮) = # occurrences of v in clusters S∈𝒮, i.e., degree of v in hypergraph (V,𝒮) • DC(𝒮) = maximum degree of cover 𝒮 • AvD(𝒮) = average degree of 𝒮 = ∑_{v∈V} deg(v,𝒮) / n = ∑_{S∈𝒮} |S| / n (Figure: a vertex v with deg(v) = 3)
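The overlap measures above translate almost directly into code. In this sketch a cover is just a list of Python sets, and the function names are hypothetical; the two formulas for the average degree are both computed so their agreement can be checked.

```python
def degree(v, cover):
    """deg(v, cover): number of clusters of the cover that contain v."""
    return sum(1 for S in cover if v in S)

def max_degree(vertices, cover):
    """Maximum degree of the cover over all vertices."""
    return max(degree(v, cover) for v in vertices)

def average_degree(vertices, cover):
    """Average degree: sum of cluster sizes divided by n."""
    return sum(len(S) for S in cover) / len(vertices)

# Hypothetical example: 5 vertices, 3 overlapping clusters.
vertices = range(1, 6)
cover = [{1, 2, 3}, {3, 4}, {3, 4, 5}]
print(degree(3, cover), max_degree(vertices, cover))
print(average_degree(vertices, cover))
# Same value via the per-vertex formula:
print(sum(degree(v, cover) for v in vertices) / len(vertices))
```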

Partition sparsity measure - adjacency • Intuition: “contract” clusters into super-nodes, look at the resulting cluster graph of 𝒮, G(𝒮) = (𝒮, E)

Partition sparsity measure - adjacency • G(𝒮) = (𝒮, E), where E = { (S,S') | S,S'∈𝒮, G contains an edge (u,v) with u∈S and v∈S' } • E edges = inter-cluster edges

Cluster-neighborhood • Def: Given partition 𝒮, cluster S∈𝒮, integer l≥0: cluster-neighborhood of S = neighborhood of S in the cluster graph G(𝒮): Γc_l(S,G) = Γ_l(S, G(𝒮)) (Figure: a cluster S and its cluster-neighborhood Γc(S,G))

Sparsity measure • Average cluster-degree of partition 𝒮: AvDc(𝒮) = ∑_{S∈𝒮} |Γc(S)| / n • Note: AvDc(𝒮) ~ # inter-cluster edges
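A sketch of the cluster graph and the average cluster-degree of a partition. It assumes that the cluster-neighborhood of S includes S itself (mirroring the vertex neighborhood, which includes v), so the sum of neighborhood sizes equals the number of clusters plus twice the number of cluster-graph edges. Function names are hypothetical.

```python
def cluster_graph_edges(adj, partition):
    """Edges of the cluster graph: unordered pairs of distinct clusters
    (given by their index in partition) joined by at least one edge of G."""
    owner = {v: i for i, S in enumerate(partition) for v in S}
    edges = set()
    for u in adj:
        for v in adj[u]:
            if owner[u] != owner[v]:
                edges.add(frozenset((owner[u], owner[v])))
    return edges

def average_cluster_degree(adj, partition):
    """Sum over clusters of the cluster-neighborhood size, divided by n.
    Each cluster counts itself once and each cluster-graph edge twice."""
    edges = cluster_graph_edges(adj, partition)
    return (len(partition) + 2 * len(edges)) / len(adj)

# Hypothetical example: the path 1-...-6 split into three pairs.
path = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 6], 6: [5]}
partition = [{1, 2}, {3, 4}, {5, 6}]
print(len(cluster_graph_edges(path, partition)))
print(average_cluster_degree(path, partition))
```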

Example: A basic construction • Goal: produce a partition 𝒮 with: 1. clusters of radius ≤ k 2. few inter-cluster edges (or, low AvDc(𝒮)) • Algorithm BasicPart: operates in iterations, each constructing one cluster

Example: A basic construction • At end of iteration: • Add resulting cluster S to output collection 𝒮 • Discard it from V • If V is not empty then start a new iteration

Iteration structure • Arbitrarily pick a vertex v from V • Grow cluster S around v, adding layer by layer • Vertices added to S are discarded from V

Iteration structure • Layer merging process is carried out repeatedly until reaching the required sparsity condition: adding the next layer would increase # vertices by a factor of ≤ n^(1/k) (i.e., |Γ(S)| ≤ |S| · n^(1/k))
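The iteration structure described above can be sketched roughly as follows. This is a sequential illustration under stated assumptions, not the course's reference implementation: the graph is an adjacency dict, the neighborhood of the growing cluster is taken only among vertices still remaining in V, the "arbitrary" vertex is chosen with `min` purely for determinism, and the name `basic_part` is hypothetical.

```python
def basic_part(adj, k):
    """Partition the vertices by growing clusters layer by layer.
    Growth stops once the next layer would enlarge the cluster
    by a factor of at most n**(1/k)."""
    n = len(adj)
    threshold = n ** (1.0 / k)
    remaining = set(adj)          # the shrinking vertex set V
    partition = []
    while remaining:
        v = min(remaining)        # arbitrary pick (min for determinism)
        cluster = {v}
        while True:
            # Cluster plus its neighbors among the remaining vertices.
            grown = set(cluster)
            for u in cluster:
                grown.update(x for x in adj[u] if x in remaining)
            if len(grown) <= threshold * len(cluster):
                break             # sparsity condition reached
            cluster = grown       # merge the next layer
        partition.append(cluster)
        remaining -= cluster      # discard clustered vertices from V
    return partition

# Hypothetical example: a star with center 0 and 15 leaves (n = 16).
star = {0: list(range(1, 16))}
star.update({leaf: [0] for leaf in range(1, 16)})
print(basic_part(star, 2))        # one cluster containing everything
print(len(basic_part(star, 1)))   # k = 1 forces radius 0: singletons
```

On the star, k = 2 gives a single cluster of radius 1 = k-1 with no inter-cluster edges, while k = 1 gives 16 singletons and 15 inter-cluster edges, within the n^(1+1/k) = 256 bound.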

Analysis • Av-Deg-Partition Thm: Given n-vertex graph G(V,E), integer k≥1, Alg. BasicPart creates a partition 𝒮 satisfying: (1) Rad(𝒮) ≤ k-1, (2) # inter-cluster edges in G(𝒮) ≤ n^(1+1/k) (or, AvDc(𝒮) ≤ n^(1/k))

Analysis (cont) • Proof: • Correctness: • Every S added to 𝒮 is a (connected) cluster • The generated clusters are disjoint (Alg. erases from V every v added to a cluster) • 𝒮 is a partition (covers all vertices)

Analysis (cont) • Property (2): [ |E(G(𝒮))| ≤ n^(1+1/k) ] • By the termination condition of the internal loop, the resulting S satisfies |Γ(S)| ≤ n^(1/k)·|S| ⇒ (# inter-cluster edges touching S) ≤ n^(1/k)·|S| • This number can only decrease in later iterations, if adjacent vertices get merged into the same cluster ⇒ |E| ≤ ∑_{S∈𝒮} n^(1/k)·|S| = n^(1+1/k)

Analysis (cont) • Property (1): [ Rad(𝒮) ≤ k-1 ] • Consider an iteration of the main loop • Let J = # times internal loop was executed • Let Si = S constructed on i'th internal iteration • |Si| > n^((i-1)/k) for 2≤i≤J (by induction on i)

Analysis (cont) • ⇒ J ≤ k (otherwise, |S| > n) • Note: Rad(Si) ≤ i-1 for every 1≤i≤J (S1 is composed of a single vertex, each additional layer increases Rad(Si) by at most 1) ⇒ Rad(SJ) ≤ k-1
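The counting argument behind Property (1) can be written out step by step. The following is a sketch of the induction as read from the slides, using their notation (S_1 is the single starting vertex and S_i is the cluster after i-1 merged layers); the explicit base case and the contradiction step are spelled out here and are not quoted from the source.

```latex
\begin{align*}
|S_1| &= 1, \\
|S_i| &> n^{1/k}\,|S_{i-1}| \quad (2 \le i \le J)
  && \text{(the loop did not stop at } S_{i-1}\text{)} \\
\Rightarrow\quad |S_i| &> n^{(i-1)/k} \quad (2 \le i \le J)
  && \text{(induction on } i\text{)} \\
J \ge k+1 \quad&\Rightarrow\quad |S_{k+1}| > n^{k/k} = n
  && \text{(impossible, since } S_{k+1} \subseteq V\text{)} \\
\Rightarrow\quad J &\le k, \qquad \mathrm{Rad}(S_J) \le J-1 \le k-1 .
\end{align*}
```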

Variant - Separated partial partitions • Sep(𝒮) = separation of partial partition 𝒮 = minimal distance between any two clusters of 𝒮 • When Sep(𝒮) = s, we say 𝒮 is s-separated (Figure: example of a 2-separated partial partition)
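A short sketch of how the separation of a partial partition could be computed. It assumes unit edge weights (hop distances) so that a multi-source breadth-first search gives the distance between two clusters; the names `hop_distance` and `separation` are hypothetical.

```python
from collections import deque
from itertools import combinations

def hop_distance(adj, U, W):
    """Minimum hop distance between vertex sets U and W
    (multi-source BFS started from all of U; unit weights assumed)."""
    W = set(W)
    dist = {u: 0 for u in U}
    queue = deque(U)
    while queue:
        x = queue.popleft()
        if x in W:
            return dist[x]
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return float("inf")

def separation(adj, clusters):
    """Minimal distance between any two clusters of the collection."""
    return min(hop_distance(adj, A, B) for A, B in combinations(clusters, 2))

# Hypothetical example: the path 1-...-7 with three clusters.
path = {i: [j for j in (i - 1, i + 1) if 1 <= j <= 7] for i in range(1, 8)}
clusters = [{1, 2}, {4}, {6, 7}]
print(separation(path, clusters))  # this partial partition is 2-separated
```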

Coarsening • Cover 𝒯 = {T1,...,Tq} coarsens 𝒮 = {S1,...,Sp} if the clusters of 𝒮 are fully subsumed in clusters of 𝒯 (i.e., for every Si there is a Tj with Si ⊆ Tj)

Coarsening (cont) • The radius ratio of the coarsening = Rad(𝒯) / Rad(𝒮) = R / r (Figure: a cluster of 𝒮 with radius r inside a cluster of 𝒯 with radius R)
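The coarsening relation is a simple containment check, sketched below with covers given as lists of sets. The function names are hypothetical, and the check only tests subsumption; it does not verify that either collection is a cover or that its members are connected.

```python
def coarsens(T, S):
    """True if every cluster of S is fully contained in some cluster of T."""
    return all(any(set(s) <= set(t) for t in T) for s in S)

def radius_ratio(rad_T, rad_S):
    """Radius ratio R / r of a coarsening, given the two radii."""
    return rad_T / rad_S

# Hypothetical example.
S = [{1, 2}, {2, 3}, {4, 5}]
T = [{1, 2, 3}, {4, 5, 6}]
print(coarsens(T, S))  # T coarsens S
print(coarsens(S, T))  # but not the other way around
```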

Coarsening (cont) • Motivation: given a “useful” cover 𝒮 with high overlaps, coarsen 𝒮 by merging some clusters together, getting a coarsening cover 𝒯 with: • larger clusters • better sparsity • increased radii

Sparse covers • Goal: For initial cover 𝒮, construct a coarsening 𝒯 with low overlaps, paying little in cluster radii • Inherent tradeoff: lower overlap ⇔ higher radius ratio (and vice versa) • Simple goal: low average degree

Sparse covers • Algorithm AvCover • Operates in iterations • Each iteration merges together some 𝒮 clusters into one output cluster Z∈𝒯 • At end of iteration: • Add resulting cluster Z to output collection 𝒯 • Discard merged clusters from 𝒮 • If 𝒮 is not empty then start a new iteration

Sparse covers • Algorithm AvCover – high-level flow (figure)

Iteration structure • Arbitrarily pick cluster S0 in 𝒮 (as kernel Y of the cluster Z constructed next) • Repeatedly merge the cluster with intersecting clusters from 𝒮 (adding one layer at a time) • Clusters added to Z are discarded from 𝒮

Iteration structure • Layer merging process is carried out repeatedly until reaching the required sparsity condition: adding the next layer increases # vertices by a factor of ≤ n^(1/k) (|Z| ≤ |Y| · n^(1/k))
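The merging process described above might look roughly like this in code. It is a sequential sketch under stated assumptions: clusters are non-empty Python sets, the "arbitrary" starting cluster is simply the first remaining one, the clusters discarded at the end of an iteration are exactly those intersecting the final kernel Y, and the name `av_cover` is hypothetical.

```python
def av_cover(clusters, n, k):
    """Coarsen a cover by repeatedly merging intersecting clusters.
    n is the number of vertices of the graph; k >= 1 is the tradeoff parameter."""
    threshold = n ** (1.0 / k)
    remaining = [set(S) for S in clusters]
    output = []
    while remaining:
        merged = [remaining[0]]                  # start from an arbitrary cluster S0
        while True:
            kernel = set().union(*merged)        # Y: vertices merged so far
            merged = [S for S in remaining if S & kernel]
            Z = set().union(*merged)             # Y plus one more layer
            if len(Z) <= threshold * len(kernel):
                break                            # sparsity condition reached
        # Discard every cluster that was merged into Z.
        remaining = [S for S in remaining if not (S & kernel)]
        output.append(Z)
    return output

# Hypothetical example: 1-neighborhoods of the path 1-2-3-4-5.
S = [{1, 2}, {1, 2, 3}, {2, 3, 4}, {3, 4, 5}, {4, 5}]
print(av_cover(S, 5, 1))
print(av_cover(S, 5, 5))
```

With k = 1 the first layer is always accepted, so the output has two overlapping clusters and average degree 7/5; with k = 5 the threshold is small, a second layer is merged, and the whole path becomes one cluster of larger radius.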

Analysis • Thm: Given graph G(V,E,w), cover 𝒮, integer k≥1, Algorithm AvCover constructs a cover 𝒯 s.t.: (1) 𝒯 coarsens 𝒮 (2) Rad(𝒯) ≤ (2k+1)·Rad(𝒮) (radius ratio ≤ 2k+1) (3) AvD(𝒯) ≤ n^(1/k) (low average sparsity)

Analysis (cont) • Corollary for l-neighborhood cover: Given G(V,E,w), integers k,l≥1, there exists a cover 𝒯 = 𝒯_{l,k} s.t. • 𝒯 coarsens the neighborhood cover Γs_l(V) • Rad(𝒯) ≤ (2k+1)·l • AvD(𝒯) ≤ n^(1/k)

Analysis (cont) • Proof of Thm: • Property (1): [ 𝒯 coarsens 𝒮 ] • Holds directly from construction (each Z added to 𝒯 is a (connected) cluster, since at the beginning 𝒮 contained clusters)

Analysis (cont) • Claim: The kernels Y corresponding to clusters Z generated by the algorithm are mutually disjoint. • Proof: By contradiction. Suppose there is a vertex v s.t. v ∈ Y ∩ Y'. W.l.o.g. suppose Y was created before Y'. v ∈ Y' ⇒ there is a cluster S' s.t. v∈S' and S' was still in 𝒮 when the algorithm started constructing Y'.

Analysis (cont) • But S' satisfies S' ∩ Y ≠ ∅ ⇒ the final merge creating Z from Y should have added S' into Z and eliminated it from 𝒮; contradiction.

Output clusters and kernels (Figure: the kernels and the clusters of the cover 𝒯)

Analysis (cont) • Property (2): [ Rad(𝒯) ≤ (2k+1)·Rad(𝒮) ] • Consider some iteration of the main loop (starting with cluster S∈𝒮) • J = # times internal loop was executed • Z0 = initial set Z • Zi = Z constructed on i'th internal iteration (1≤i≤J) • Yi = the corresponding kernel Y on the i'th internal iteration

Analysis (cont) • Note 1: |Zi| > n^(i/k) for every 1≤i≤J-1 ⇒ J ≤ k • Note 2: Rad(Yi) ≤ (2i-1)·Rad(𝒮) for every 1≤i≤J ⇒ Rad(YJ) ≤ (2k-1)·Rad(𝒮)
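The radius bound can be chained together as follows. The transcript stops at the bound on the last kernel, so the final line is a reconstruction rather than a quotation: it assumes the standard argument that each merged cluster intersects the kernel and has diameter at most 2·Rad(𝒮), so one extra layer adds at most 2·Rad(𝒮) to the radius, and that each kernel is the previous Z.

```latex
\begin{align*}
\mathrm{Rad}(Y_1) &\le \mathrm{Rad}(\mathcal{S})
  && \text{(}Y_1\text{ is a single cluster of } \mathcal{S}\text{)} \\
\mathrm{Rad}(Z_i) &\le \mathrm{Rad}(Y_i) + 2\,\mathrm{Rad}(\mathcal{S}),
  \qquad Y_{i+1} = Z_i
  && \text{(each merged cluster meets } Y_i\text{)} \\
\Rightarrow\quad \mathrm{Rad}(Y_i) &\le (2i-1)\,\mathrm{Rad}(\mathcal{S})
  \quad (1 \le i \le J) \\
\Rightarrow\quad \mathrm{Rad}(Z_J) &\le (2J+1)\,\mathrm{Rad}(\mathcal{S})
  \le (2k+1)\,\mathrm{Rad}(\mathcal{S})
  && \text{(using } J \le k\text{)}
\end{align*}
```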
