
Connected Components & All Pairs Shortest Paths


Presentation Transcript


  1. Connected Components & All Pairs Shortest Paths Presented by Wooyoung Kim, 3/4/09, CSc 8530 Parallel Algorithms, Spring 2009, Dr. Sushil K. Prasad

  2. Outline • Adjacency matrix and connectivity matrix • Parallel algorithm for computing the connectivity matrix • Parallel algorithm for computing connected components • Sequential algorithms for all-pairs shortest paths • Parallel algorithm for all-pairs shortest paths • Analysis • Related recent research • References

  3. 9.3. Connected Components

  4. Connected Components • Let G = (V, E) be a graph with V = {v0, v1, …, vn-1} • G can be represented by an n x n adjacency matrix A defined as: ajk = 1 if (vj, vk) is an edge of G, and ajk = 0 otherwise • A connected component of an undirected graph G is a connected subgraph of G of maximal size • Given such a graph G, we develop an algorithm for computing its connected components on a hypercube interconnection network parallel computer

  5. Adjacency Matrix – Examples [figures: Example 1, an undirected graph on v0–v5 with its 6 x 6 adjacency matrix; Example 2, a directed graph on v0–v4 with its 5 x 5 adjacency matrix]

  6. Applications for Connected Components • Identifying clusters. We can represent each item by a vertex and add an edge between each pair of items that are "similar." The connected components of this graph correspond to different classes of items. • Component labeling is commonly used in image processing, to join neighboring pixels into connected regions, which are the shapes in the image. • Testing whether a graph is connected is an essential preprocessing step for many graph algorithms.

  7. Computing the Connectivity Matrix • A key step in the algorithm for finding the connected components is to compute the so-called connectivity matrix • Definition: the connectivity matrix of a (directed or undirected) graph G with n vertices is the n x n matrix C defined as: cjk = 1 if j = k or there is a path from vj to vk, and cjk = 0 otherwise, for 0 ≤ j, k ≤ n-1 • C is also known as the reflexive and transitive closure of G • Given the adjacency matrix A of G, it is required to compute C

  8. Computing the Connectivity Matrix – cont. • Approach: Boolean matrix multiplication • The matrices to be multiplied, and the product matrix, are all binary • The logical "and" operation replaces regular multiplication • The logical "or" operation replaces regular addition • If X, Y and Z are n x n Boolean matrices, where Z is the Boolean product of X and Y, then zij = (xi1 and y1j) or (xi2 and y2j) or … or (xin and ynj) (in the regular product: zij = xi1 y1j + xi2 y2j + … + xin ynj)
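The Boolean product above can be sketched sequentially as follows (a minimal illustration, not the hypercube version; the function name is illustrative):

```python
def boolean_mat_mult(X, Y):
    """Boolean product of n x n 0/1 matrices: 'and' replaces
    multiplication, 'or' replaces addition."""
    n = len(X)
    Z = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # z[i][j] = (x[i][0] and y[0][j]) or ... or (x[i][n-1] and y[n-1][j])
            Z[i][j] = int(any(X[i][k] and Y[k][j] for k in range(n)))
    return Z
```

For example, multiplying the matrix of edge (v0, v1) by the matrix of edge (v1, v0) yields a 1 in position (0, 0), reflecting the length-2 path v0 → v1 → v0.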

  9. Computing the Connectivity Matrix – cont. • 1st step: obtain an n x n matrix B from A as follows: bjk = 1 if j = k, and bjk = ajk otherwise, for 0 ≤ j, k ≤ n-1 i.e. B is equal to A with 1's added along the diagonal • B represents all the paths in G of length less than 2, i.e. paths of length 0 (each vertex to itself) or length 1 (single edges)

  10. Computing the Connectivity Matrix – cont. • Then B2 = B x B (the Boolean product of B with itself) represents paths of length 2 or less • b2ik = 1 indicates a path of length at most 2 from vi to vk, possibly through an intermediate vertex vj • Generally, Bn represents paths of length n or less • Observe: if there is a path from vi to vj, it cannot have length more than n-1, since G has only n vertices • Hence, the connectivity matrix C = Bn-1 • Bn-1 is computed through successive squaring: B, B2, B4, …

  11. Computing the Connectivity Matrix – cont. • C is obtained after ⌈log (n-1)⌉ Boolean matrix multiplications • When n-1 is not a power of 2, C is obtained from Bm, where m = 2⌈log (n-1)⌉ (the smallest power of 2 larger than n-1); this is correct since Bm = Bn-1 for m > n-1 Implementation: • We use the algorithm HYPERCUBE MATRIX MULTIPLICATION, adapted to perform Boolean matrix multiplication • Input: the adjacency matrix A of G • Output: the connectivity matrix C
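A self-contained sequential sketch of this computation (adding 1's to the diagonal, then squaring until the exponent reaches n-1; names are illustrative, and the real algorithm distributes the products over the hypercube):

```python
def connectivity_matrix(A):
    """C = B^(n-1) under Boolean products, where B is A with
    1's on the diagonal; computed by successive squaring."""
    n = len(A)
    # B represents paths of length < 2
    B = [[1 if j == k else A[j][k] for k in range(n)] for j in range(n)]

    def bool_mult(X, Y):
        return [[int(any(X[i][l] and Y[l][j] for l in range(n)))
                 for j in range(n)] for i in range(n)]

    # ceil(log2(n-1)) Boolean squarings suffice, since B^m = B^(n-1) for m > n-1
    m = 1
    while m < n - 1:
        B = bool_mult(B, B)
        m *= 2
    return B
```

On a graph with a single edge (v0, v1) and an isolated vertex v2, the result marks v0 and v1 as mutually reachable and leaves v2 connected only to itself.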

  12. Computing the Connectivity Matrix – cont. • The hypercube used has N = n3 processors: P0, P1, …, PN-1 • Arranged in an n x n x n array; Pr occupies position (i, j, k), where r = in2 + jn + k, 0 ≤ i, j, k ≤ n-1 • Processor Pr has 3 registers: A(i, j, k), B(i, j, k), C(i, j, k) • Initially, the processors in positions (0, j, k) (0 ≤ j, k ≤ n-1) contain the adjacency matrix: A(0, j, k) = ajk • At the end, the same processors contain the connectivity matrix: C(0, j, k) = cjk (0 ≤ j, k ≤ n-1)

  13. Algorithm HYPERCUBE CONNECTIVITY (A, C)
Step 1: for j = 0 to n-1 do in parallel
            A(0, j, j) ← 1
        end for
Step 2: for j = 0 to n-1 do in parallel
            for k = 0 to n-1 do in parallel
                B(0, j, k) ← A(0, j, k)
            end for
        end for
Step 3: for i = 1 to ⌈log (n-1)⌉ do
            (3.1) HYPERCUBE MATRIX MULTIPLICATION (A, B, C)
            (3.2) for j = 0 to n-1 do in parallel
                      for k = 0 to n-1 do in parallel
                          (i) A(0, j, k) ← C(0, j, k)
                          (ii) B(0, j, k) ← C(0, j, k)
                      end for
                  end for
        end for

  14. Analysis of the HYPERCUBE CONNECTIVITY algorithm • Steps 1, 2 and (3.2) take constant time • HYPERCUBE MATRIX MULTIPLICATION runs in O(log n) time, and Step (3.1) is iterated ⌈log (n-1)⌉ times • Total running time: t(n) = O(log2 n) • Since p(n) = n3, the cost is c(n) = O(n3 log2 n)

  15. Algorithm for Connected Components • Construct an n x n matrix D from the connectivity matrix C: djk = vk if cjk = 1, and djk = 0 otherwise, for 0 ≤ j, k ≤ n-1 i.e. row j of D contains the names of the vertices to which vj is connected by a path • The connected components of G are found by assigning each vertex to a component in the following way: • vj is assigned to component l if l is the smallest index for which djl ≠ 0

  16. Implementation of the Connected Components algorithm • Implemented on a hypercube using the HYPERCUBE CONNECTIVITY algorithm • It runs on a hypercube with N = n3 processors, each with three registers A, B and C • Processors are arranged in an n x n x n array, as required for the HYPERCUBE CONNECTIVITY algorithm • Initially: A(0, j, k) = ajk for 0 ≤ j, k ≤ n-1 • At the end: C(0, j, 0) contains the component number for vertex vj

  17. Algorithm HYPERCUBE CONNECTED COMPONENTS (A, C)
Step 1: HYPERCUBE CONNECTIVITY (A, C)
Step 2: for j = 0 to n-1 do in parallel   {creates matrix D}
            for k = 0 to n-1 do in parallel
                if C(0, j, k) = 1 then C(0, j, k) ← vk end if
            end for
        end for
Step 3: for j = 0 to n-1 do in parallel
            (3.1) The n processors in row j find the smallest l for which C(0, j, l) ≠ 0
            (3.2) C(0, j, 0) ← l
        end for

  18. Analysis of the HYPERCUBE CONNECTED COMPONENTS algorithm • Step 1 requires O(log2 n) time • Steps 2 and (3.2) take constant time • Step (3.1): the n processors in row j form a (log n)-dimensional hypercube; this step is a reduction operation (Step 3 of HYPERCUBE MATRIX MULTIPLICATION with "+" replaced by "min") • Overall running time: t(n) = O(log2 n) • p(n) = n3, so c(n) = O(n3 log2 n)
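Steps 2 and 3 of the labeling can be sketched sequentially as a one-liner (an illustration only; the function name is made up, and the parallel version performs the "min" as a hypercube reduction):

```python
def label_components(C):
    """Given the connectivity matrix C, assign each vertex j the
    smallest index l with C[j][l] = 1 as its component label."""
    n = len(C)
    return [min(l for l in range(n) if C[j][l]) for j in range(n)]
```

For a 3-vertex graph where v0 and v2 form one component and v1 is isolated, the labels come out as [0, 1, 0]: each vertex is named after the smallest-index vertex it can reach.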

  19. Example: computing connected components on a hypercube [figures: a graph G on vertices v0–v7 and its 8 x 8 adjacency matrix]

  20. Example – cont. [figure: the Boolean product A2 = A x A for the graph G of slide 19]

  21. Example 2 – computing the Connectivity Matrix [figure: A4 = A2 x A2; since A4 = A2, the computation stops]

  22. Example 2 – cont. [figures: the connectivity matrix of G and the corresponding matrix of connected components]

  23. Example 2 – cont. [figure: matrix of connected components] Component 1: {v0, v5, v7} Component 2: {v1, v2, v4} Component 3: {v3, v6}

  24. 9.5. All-Pairs Shortest Paths

  25. Graph Terminology • G = (V, E) • W = weight matrix • wij = weight/length of edge (vi, vj) • wij = ∞ if vi and vj are not connected by an edge • wii = 0 • W may contain positive, zero and negative values • For this problem, G cannot contain a negative-sum cycle

  26. Weighted Graph and Weight Matrix [figure: an undirected weighted graph on v0–v4 and its 5 x 5 weight matrix]

  27. Directed Weighted Graph and Weight Matrix [figure: a directed weighted graph on v0–v5 and its 6 x 6 weight matrix]

  28. All-Pairs Shortest Paths Problem • For every pair of vertices vi and vj in V, it is required to find the length of the shortest path from vi to vj along edges in E. • Specifically, a matrix D is to be constructed such that dij is the length of the shortest path from vi to vj in G, for all i and j. • Length of a path (or cycle) is the sum of the lengths (weights) of the edges forming it.

  29. Sample Shortest Path [figure: the directed weighted graph of slide 27] The shortest path from v0 to v4 is along edges (v0, v1), (v1, v2), (v2, v4) and has length 6

  30. Disallowing Negative-length Cycles • The APSP problem does not allow the input to contain negative-length cycles • This is necessary because: • If such a cycle existed within a path from vi to vj, one could traverse it indefinitely, producing paths of ever shorter length from vi to vj • If a negative-length cycle exists, then all paths which contain it have length -∞

  31. Sequential Algorithms for APSP • The Floyd-Warshall algorithm is Θ(V3) • Appropriate for dense graphs: |E| = O(|V|2) • Johnson's algorithm • Appropriate for sparse graphs: |E| = O(|V|) • O(V2 log V + VE) if using a Fibonacci heap • O(VE log V) if using a binary min-heap
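For reference, the Θ(V3) Floyd-Warshall baseline can be written in a few lines (a standard sequential sketch, included only for comparison with the parallel algorithm that follows):

```python
INF = float('inf')

def floyd_warshall(W):
    """Sequential APSP: W is the n x n weight matrix, with INF for
    missing edges and 0 on the diagonal. Returns the distance matrix."""
    n = len(W)
    D = [row[:] for row in W]  # don't mutate the input
    for k in range(n):         # allow vertex k as an intermediate
        for i in range(n):
            for j in range(n):
                if D[i][k] + D[k][j] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j]
    return D
```

On the chain v0 → v1 (weight 3), v1 → v2 (weight 1), the algorithm fills in D[0][2] = 4 when k = 1 is considered as an intermediate vertex.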

  32. Properties of Interest • Let dij(k) denote the length of the shortest path from vi to vj that goes through at most k - 1 intermediate vertices (i.e. at most k edges) • dij(1) = wij (edge length from vi to vj) • If i ≠ j and there is no edge from vi to vj, then dij(1) = ∞ • Also, dii(k) = 0 for all k • Given that there are no negative-weight cycles in G, there is no advantage in visiting any vertex more than once on the shortest path from vi to vj • Since there are only n vertices in G, dij = dij(n-1)

  33. Guaranteeing Shortest Paths • If the shortest path from vi to vj contains vr and vs (where vr precedes vs), the subpath from vr to vs must itself be minimal (or it would not be part of the shortest path) • Thus, to obtain the shortest path from vi to vj, we can compute all combinations of optimal sub-paths (whose concatenations form paths from vi to vj), and then select the shortest one

  34. Iteratively Building Shortest Paths [figure: a shortest path from vi to vj built by combining a path from vi to some intermediate vertex vl with the edge (vl, vj) of weight wlj, minimized over all candidates v1, …, vn]

  35. Recurrence Definition • For k > 1, dij(k) = min over l of ( dil(k/2) + dlj(k/2) ) i.e. a shortest path of at most k edges from vi to vj splits at some intermediate vertex vl into two halves of at most k/2 edges each • This guarantees O(log k) steps to calculate dij(k)

  36. Similarity [figure: the recurrence has the same structure as matrix multiplication, with '+' playing the role of 'x' and 'min' playing the role of '+']

  37. Computing D • Let Dk = the matrix with entries dij(k) for 0 ≤ i, j ≤ n-1 • Given D1, compute D2, D4, …, Dm, where m = 2⌈log (n-1)⌉ • D = Dm • To calculate Dk from Dk/2, use a special form of matrix multiplication: • 'x' → '+' • '+' → 'min'
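This "special form" is the min-plus (tropical) matrix product, which can be sketched sequentially as follows (an illustration only; the parallel algorithm performs the same product on the hypercube):

```python
def min_plus_mult(X, Y):
    """'Modified' matrix product: '+' replaces 'x', 'min' replaces '+'.
    Entry (i, j) is min over l of X[i][l] + Y[l][j]."""
    n = len(X)
    return [[min(X[i][l] + Y[l][j] for l in range(n))
             for j in range(n)] for i in range(n)]
```

Squaring D1 with this product yields D2: for a chain v0 → v1 (weight 1), v1 → v2 (weight 2), the product discovers the 2-edge path of length 3 from v0 to v2.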

  38. "Modified" Matrix Multiplication
Step 2: for r = 0 to N - 1 do in parallel
            Cr ← Ar + Br
        end for
Step 3: for m = 2q to 3q - 1 do
            for all r < N with rm = 0 do in parallel
                Cr ← min(Cr, Cr(m))
            end for
        end for

  39. "Modified" Example (1) [figure: the eight hypercube processors P000–P111 with the initial matrix entries, from Section 9.2]

  40. "Modified" Example (2) [figure: processor contents after step (1.1)]

  41. "Modified" Example (3) [figure: processor contents after step (1.2)]

  42. "Modified" Example (4) [figure: processor contents after step (1.3)]

  43. "Modified" Example (5) [figure: processor contents after the modified Step 2 (elementwise '+')]

  44. "Modified" Example (6) [figure: processor contents after the modified Step 3 ('min' reduction)]

  45. Hypercube Setup • Begin with a hypercube of n3 processors • Each has registers A, B and C • Arrange them in an n x n x n array (cube) • Set A(0, j, k) = wjk for 0 ≤ j, k ≤ n-1 • i.e. the processors in positions (0, j, k) contain D1 = W • When done, C(0, j, k) contains the APSP matrix D = Dm

  46. Setup Example [figure: the directed weighted graph of slide 27 and its weight matrix D1 = W = wjk, stored as A(0, j, k)]

  47. APSP Parallel Algorithm
Algorithm HYPERCUBE SHORTEST PATH (A, C)
Step 1: for j = 0 to n-1 do in parallel
            for k = 0 to n-1 do in parallel
                B(0, j, k) ← A(0, j, k)
            end for
        end for
Step 2: for i = 1 to ⌈log (n-1)⌉ do
            (2.1) HYPERCUBE MATRIX MULTIPLICATION (A, B, C)   {with '+' and 'min'}
            (2.2) for j = 0 to n-1 do in parallel
                      for k = 0 to n-1 do in parallel
                          (i) A(0, j, k) ← C(0, j, k)
                          (ii) B(0, j, k) ← C(0, j, k)
                      end for
                  end for
        end for
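The overall scheme can be simulated sequentially in a few lines (a sketch under the obvious simplification of running the min-plus squarings one after another instead of on the hypercube; names are illustrative):

```python
from math import inf

def apsp_by_squaring(W):
    """Repeated min-plus squaring: D2 = D1*D1, D4 = D2*D2, ...,
    stopping once the path-length bound reaches n-1."""
    n = len(W)
    D = [row[:] for row in W]

    def min_plus(X, Y):
        return [[min(X[i][l] + Y[l][j] for l in range(n))
                 for j in range(n)] for i in range(n)]

    m = 1
    while m < n - 1:   # ceil(log2(n-1)) squarings
        D = min_plus(D, D)
        m *= 2
    return D
```

On the 4-vertex chain v0 → v1 → v2 → v3 with weights 1, 2, 3, the first squaring finds all 2-edge paths and the second finds the full path of length 6, matching Dm = D(n-1).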

  48. An Example [figures: the matrices D1, D2, D4 and D8 computed by successive "modified" squarings]

  49. Analysis • Steps 1 and (2.2) require constant time • There are ⌈log (n-1)⌉ iterations of Step (2.1) • Each requires O(log n) time • The overall running time is t(n) = O(log2 n) • p(n) = n3 • Cost is c(n) = p(n) t(n) = O(n3 log2 n) • Efficiency is O(n3) / O(n3 log2 n) = O(1 / log2 n), relative to the O(n3) sequential algorithm

  50. Related Paper H. Edwin Romeijn and Robert L. Smith, "Parallel Algorithms for Solving Aggregated Shortest Path Problems", Computers and Operations Research, Special Issue on Aggregation, Volume 26, Issue 10-11, pp. 941-953, 1999
