Graph Algorithms

Graph Algorithms: Minimum Spanning Trees (MST) and Union-Find. Dana Shapira. Spanning tree: a spanning tree of G is a subset T ⊆ E of edges such that the subgraph G' = (V, T) is connected and acyclic. Minimum Spanning Tree.


Presentation Transcript


  1. Graph Algorithms: Minimum Spanning Trees (MST) and Union-Find. Dana Shapira

  2. Spanning Tree. A spanning tree of G = (V, E) is a subset T ⊆ E of edges such that the subgraph G' = (V, T) is connected and acyclic.

  3. Minimum Spanning Tree. Given a graph G = (V, E) and an assignment of weights w(e) to the edges of G, a minimum spanning tree T of G is a spanning tree with minimum total edge weight. [Figure: a weighted example graph.]

  4. How To Build A Minimum Spanning Tree • General strategy: • Maintain a set of edges A such that (V, A) is a spanning forest of G and there exists an MST (V, F) of G with A ⊆ F. • As long as (V, A) is not a tree, find an edge that can be added to A while maintaining the above property (a safe edge).
      Generic-MST(G = (V, E))
        A ← ∅
        while A is not a spanning tree of G do
          choose a safe edge e = (u, v) ∈ E
          A ← A ∪ {e}
        return A

  5. Cuts. A cut (X, Y) of a graph G = (V, E) is a partition of the vertex set V into two sets X and Y = V \ X. An edge (v, w) is said to cross the cut (X, Y) if v ∈ X and w ∈ Y. A cut (X, Y) respects a set A of edges if no edge in A crosses the cut.

  6. A Cut Theorem. Theorem: Let A be a subset of the edges of some minimum spanning tree of G; let (X, Y) be a cut that respects A; and let e be a minimum weight edge that crosses (X, Y). Then A ∪ {e} is also a subset of the edges of a minimum spanning tree of G; that is, edge e is safe. [Figure: example graph with a cut.]


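  9. A Cut Theorem. [Figure: a spanning tree T, an edge e = (u, v) crossing the cut, and an edge f on the path from u to v in T that also crosses the cut; w(e) ≤ w(f), hence w(T') ≤ w(T).]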

  10. Proof: • Let T be an MST such that A ⊆ T. • If e = (u,v) ∈ T, we are done. Otherwise (e ∉ T), add e to T. • The edge e = (u,v) forms a cycle with the edges on the path p from u to v in T. Since u and v are on opposite sides of the cut, there is at least one edge f = (x,y) in T on the path p that also crosses the cut. • f ∉ A since the cut respects A. Since f is on the unique path from u to v in T, removing it breaks T into two components, which e reconnects. • w(e) ≤ w(f), because e is a minimum weight edge crossing the cut and f also crosses it. • Let T' = (T - {f}) ∪ {e}. Then w(T') ≤ w(T), so T' is also an MST, and A ∪ {e} ⊆ T'.

  11. A Cut Theorem. Corollary: Let G = (V, E) be a connected undirected graph, let A be a subset of E included in a minimum spanning tree T for G, and let C = (V_C, E_C) be a tree in the forest G_A = (V, A). If e is a light edge (a minimum weight edge) connecting C to some other component in G_A, then e is safe for A. Proof: The cut (V_C, V \ V_C) respects A, and e is a light edge for this cut. Therefore, e is safe.

  12. Kruskal’s Algorithm
      Kruskal(G)
        A ← ∅
        for every edge e = (v, w) of G, sorted by weight do
          if v and w belong to different connected components of (V, A) then
            add edge e to A
      [Figure: example graph on vertices a to i.] Edges sorted by weight: (a, d):1 (h, i):1 (c, e):1 (f, h):2 (g, h):2 (b, c):3 (b, f):3 (b, e):4 (c, d):5 (f, g):5 (e, i):6 (d, g):8 (a, b):9 (c, f):12

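  13. Correctness Proof. Sorted edge sequence: e_1, e_2, e_3, …, e_i, e_(i+1), …, e_n. When e_i = (v, w) is added, consider the cut (X, V \ X), where X is the connected component of (V, A) that contains v. This cut respects A, and every earlier edge is either in A or joins two vertices of the same component, so no earlier edge crosses it. Every edge e_j that crosses the cut therefore has weight w(e_j) ≥ w(e_i). Hence, edge e_i is safe.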

  14. Union-Find Data Structures • Given a set S of n elements, maintain a partition of S into subsets S_1, S_2, …, S_k • Support the following operations: • Union(x, y): Replace sets S_i and S_j such that x ∈ S_i and y ∈ S_j with S_i ∪ S_j in the current partition • Find(x): Returns a member r(S_i) of the set S_i that contains x • In particular, Find(x) and Find(y) return the same element if and only if x and y belong to the same set. • It is possible to create a data structure that supports the above operations in O(α(n)) amortized time, where α is the inverse Ackermann function.

  15. Kruskal’s Algorithm Using Union-Find Data Structure
      Kruskal(G, w)
        A ← ∅
        for each vertex v ∈ V do
          Make-Set(v)
        sort the edges in E in non-decreasing order of weight w
        for each edge (u, v) ∈ E, in that order, do
          if Find-Set(u) ≠ Find-Set(v) then
            A ← A ∪ {(u, v)}
            Union(u, v)
        return A
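The pseudocode above can be sketched in Python as follows. This is a minimal illustration, not the author's implementation: the function name `kruskal`, the `(weight, u, v)` tuple layout, and the path-halving `find` are assumptions made for the example. The edge list is the one given on slide 12.

```python
def kruskal(vertices, edges):
    """Return (total_weight, tree_edges) of a minimum spanning forest.

    `edges` is a list of (weight, u, v) tuples (an assumed layout).
    """
    parent = {v: v for v in vertices}  # Make-Set for every vertex

    def find(x):
        # Walk to the representative, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree, total = [], 0
    for w, u, v in sorted(edges):      # non-decreasing weight order
        ru, rv = find(u), find(v)
        if ru != rv:                   # u and v lie in different components
            parent[ru] = rv            # Union
            tree.append((u, v, w))
            total += w
    return total, tree


# Edge list of the example graph on slide 12.
EDGES = [(1, "a", "d"), (1, "h", "i"), (1, "c", "e"), (2, "f", "h"),
         (2, "g", "h"), (3, "b", "c"), (3, "b", "f"), (4, "b", "e"),
         (5, "c", "d"), (5, "f", "g"), (6, "e", "i"), (8, "d", "g"),
         (9, "a", "b"), (12, "c", "f")]

total, tree = kruskal("abcdefghi", EDGES)
print(total, len(tree))  # 18 8
```

On this graph the sketch accepts eight edges (one fewer than the nine vertices) with total weight 18; the edges (b, e) and (f, g) are rejected because their endpoints are already connected.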

  16. Kruskal’s Algorithm Using Union-Find Data Structure • Analysis: • O(|E| log |E|) time for everything except the operations on the union-find structure S (this is the cost of sorting the edges) • Cost of operations on S: • O(α(|E|,|V|)) amortized time per operation on S • |V| Make-Set operations • |V| - 1 Union operations • 2|E| Find-Set operations (two per edge) • Total: O((|V| + |E|) α(|E|,|V|)) time for the operations on S • Total running time: O(|E| lg |E|).

  17. Prim’s Algorithm
      Prim(G)
        for every vertex v of G do
          label v as unexplored
        for every edge e of G do
          label e as unexplored and as a non-tree edge
        s ← some vertex of G
        mark s as visited
        Q ← Adj(s)
        while Q is not empty do
          (u, w) ← DeleteMin(Q)
          if (u, w) is unexplored then
            if w is unexplored then
              mark edge (u, w) as a tree edge
              mark vertex w as visited
              Insert(Q, Adj(w))
      [Figure: example weighted graph on vertices a to i.]
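A compact Python sketch of Prim's algorithm, assuming a binary heap from the standard `heapq` module as the priority queue Q. The names `build_adj` and `prim` and the adjacency layout are illustrative choices, and the sketch tracks only visited vertices instead of labelling edges. It is run on the edge list from slide 12.

```python
import heapq


def build_adj(edges):
    """Build an undirected adjacency map from (weight, u, v) tuples."""
    adj = {}
    for w, u, v in edges:
        adj.setdefault(u, []).append((w, v))
        adj.setdefault(v, []).append((w, u))
    return adj


def prim(adj, start):
    """Return (total_weight, tree_edges) for the component of `start`."""
    visited = {start}
    heap = [(w, start, v) for w, v in adj[start]]   # Q <- Adj(s)
    heapq.heapify(heap)
    total, tree = 0, []
    while heap:
        w, u, v = heapq.heappop(heap)               # DeleteMin(Q)
        if v in visited:
            continue                                # edge no longer crosses the cut
        visited.add(v)
        tree.append((u, v, w))
        total += w
        for w2, x in adj[v]:                        # Insert(Q, Adj(v))
            if x not in visited:
                heapq.heappush(heap, (w2, v, x))
    return total, tree


EDGES = [(1, "a", "d"), (1, "h", "i"), (1, "c", "e"), (2, "f", "h"),
         (2, "g", "h"), (3, "b", "c"), (3, "b", "f"), (4, "b", "e"),
         (5, "c", "d"), (5, "f", "g"), (6, "e", "i"), (8, "d", "g"),
         (9, "a", "b"), (12, "c", "f")]

prim_total, prim_tree = prim(build_adj(EDGES), "a")
print(prim_total, len(prim_tree))  # 18 8
```

The total weight, 18, matches what Kruskal's algorithm yields on the same graph, as it must, since every minimum spanning tree has the same weight.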

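  18. Correctness Proof. Observation: At all times during the algorithm, the set of tree edges defines a tree that contains all visited vertices; priority queue Q contains all unexplored edges incident to these vertices. Hence the edge returned by DeleteMin that leads to an unvisited vertex is a minimum weight edge crossing the cut between visited and unvisited vertices, and it is safe by the cut theorem. Corollary: Prim’s algorithm constructs a minimum spanning tree of G.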

  19. Union/Find • Assumptions: • The sets are disjoint. • Each set is identified by a representative of the set. • Initial state: • A union/find structure begins with n elements, each considered to be a one-element set. • Functions: • Make-Set(x): Creates a new set with element x in it. • Union(x,y): Makes one set out of the sets containing x and y. • Find-Set(x): Returns a pointer to the representative of the set containing x.

  20. Basic Notation • The elements in the structure will be numbered 0 to n-1 • Each set will be referred to by the number of one of the elements it contains • Initially we have sets S_0, S_1, …, S_(n-1) • If we were to call Union(S_2, S_4), these sets would be removed from the list, and the new set would now be called either S_2 or S_4 • Notation: • n Make-Set operations • m total operations • n ≤ m

  21. First Attempt • Represent the Union/Find structure as an array arr of n elements • arr[i] contains the set number of element i • Initially, arr[i]=i (Make-Set(i)) • Find-Set(i) just returns the value of arr[i] • To perform Union(Si,Sj): • For every k such that arr[k]=j, set arr[k]=i
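The array scheme described above can be sketched in Python like this. The class name `QuickFind` is an assumption made for the example; the operations follow the slide, with Union relabelling every member of one set.

```python
class QuickFind:
    """arr[i] holds the set number of element i."""

    def __init__(self, n):
        self.arr = list(range(n))        # arr[i] = i: n one-element sets

    def find(self, i):
        return self.arr[i]               # O(1)

    def union(self, i, j):
        si, sj = self.arr[i], self.arr[j]
        if si == sj:
            return                       # already in the same set
        for k in range(len(self.arr)):   # scan of the whole array
            if self.arr[k] == sj:
                self.arr[k] = si


qf = QuickFind(6)
qf.union(2, 4)
qf.union(4, 5)
print(qf.arr)  # [0, 1, 2, 3, 2, 2]
```

Every `union` scans the full array regardless of how small the sets are, which is the cost that the later attempts try to avoid.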

  22. Analysis • The worst-case analysis: • Find(i) takes O(1) time • Union(S_i, S_j) takes Θ(n) time • A sequence of n Unions takes Θ(n²) time

  23. Second Attempt • Represent the Union/Find structure using linked lists. • Each set is a linked list: each element points to the next element of its set. • The representative is the first element of the list. • Each element also points to the representative. • How do we perform Union(S_i, S_j)?

  24. Analysis • The worst-case analysis: • Find(i) takes O(1) time • Make-Set(i) takes O(1) time • Union(S_i, S_j) takes Θ(n) time (Why?) • A sequence of n Union and Find operations takes Θ(n²) time (Example?)

  25. Up-Trees • A simple data structure for implementing disjoint sets is the up-tree. • We visualize each element as a node • A set will be visualized as a directed tree • Arrows will point from child to parent • The set will be referred to by its root [Figure: two up-trees. H, A and W belong to the same set; H is the representative. X, B, R and F are in the same set; X is the representative.]

  26. Operations in Up-Trees. Follow pointers to the representative element.
      find(x) {
        if (x ≠ p(x))          // x is not the representative
          p(x) ← find(p(x));
        return p(x);
      }

  27. Union • Union is more complicated. • Make one representative element point to the other, but which way? Does it matter?

  28. Union(H, X). [Figure, first option: X points to H; B, R and F are now deeper. Second option: H points to X; A and W are now deeper.]

  29. A worst case for Union. Union can be done in O(1), but may cause find to become O(n). [Figure: five single-element sets A, B, C, D, E.] Consider the result of the following sequence of operations: Union(A, B), Union(C, A), Union(D, C), Union(E, D)

  30. Array Representation of Up-tree • Assume each element is associated with an integer i=0…n-1. From now on, we deal only with i. • Create an integer array, A[n] • An array entry is the element’s parent • A -1 entry signifies that element i is the representative element.

  31. Array Representation of Up-tree. Now the union algorithm might be:
      Union(x, y) {
        A[y] = x;   // attaches y to x; x and y must be representatives
      }
    The find algorithm would be:
      find(x) {
        if (A[x] < 0) return x;
        else return find(A[x]);
      }
    Performance: ???

  32. Analysis • Worst case: • Union(S_i, S_j) takes O(1) time • Find(i) takes O(n) time • Can we do better in an amortized analysis? • What is the maximum amount of time n operations could take us? • Suppose we perform n/2 unions followed by n/2 finds • The n/2 unions could give us one tree of height n/2-1 • Thus the total time would be n/2 + (n/2)(n/2) = O(n²) • This strategy doesn’t really help…
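The degenerate case can be reproduced with a short Python sketch of the plain up-tree array. It assumes the convention from slide 31 (Union(x, y) attaches root y to root x, and a negative entry marks a representative) and uses an iterative `find` that also counts pointer hops, which is an addition for illustration. The operation sequence is the one from slide 29, with A to E mapped to 0 to 4.

```python
def make_sets(n):
    return [-1] * n              # -1 marks a representative


def union(parent, x, y):
    parent[y] = x                # attach root y to root x, no balancing


def find(parent, x):
    """Return (representative, number of parent pointers followed)."""
    steps = 0
    while parent[x] >= 0:
        x = parent[x]
        steps += 1
    return x, steps


A, B, C, D, E = range(5)
parent = make_sets(5)
for x, y in [(A, B), (C, A), (D, C), (E, D)]:
    union(parent, x, y)

root, steps = find(parent, B)
print(root == E, steps)  # True 4
```

The four unions produce the chain B, A, C, D, E, so a find on B follows n - 1 = 4 pointers.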

  33. Array Representation of Up-tree • There are two heuristics that improve the performance of union-find. • Union by weight • Path compression on find

  34. Union by Weight Heuristic. Always attach the smaller tree to the larger.
      Make-Set(x) {
        p(x) ← x;
        rank(x) ← 0;
      }
      Union(x, y) {
        Link(Find-Set(x), Find-Set(y));
      }
      Link(x, y) {
        if (rank(x) > rank(y))
          p(y) ← x;
        else {
          p(x) ← y;
          if (rank(x) = rank(y))
            rank(y) ← rank(y) + 1;   // y is the new root, so its rank grows
        }
      }
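A minimal Python sketch of linking by rank. The dictionaries `p` and `rank`, the explicit same-root guard, and a `find_set` without path compression are assumptions made for the example; without compression, the rank of a root equals the height of its tree.

```python
def make_set(p, rank, x):
    p[x] = x
    rank[x] = 0


def find_set(p, x):
    while p[x] != x:          # no path compression in this sketch
        x = p[x]
    return x


def link(p, rank, x, y):
    """Link two roots; the root of lower rank goes under the other."""
    if x == y:
        return                # same set, nothing to do
    if rank[x] > rank[y]:
        p[y] = x
    else:
        p[x] = y
        if rank[x] == rank[y]:
            rank[y] += 1      # only a tie increases the new root's rank


def union(p, rank, x, y):
    link(p, rank, find_set(p, x), find_set(p, y))


p, rank = {}, {}
for v in range(4):
    make_set(p, rank, v)
union(p, rank, 0, 1)
union(p, rank, 2, 3)
union(p, rank, 0, 2)
print(find_set(p, 0), rank[find_set(p, 0)])  # 3 2
```

In this run every union is a tie, so each one raises the rank of the surviving root by one.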

  35. Union by Weight Heuristic. Let’s change the weight from rank to number of nodes:
      union(x, y) {
        rep_x = find(x);
        rep_y = find(y);
        if (rep_x == rep_y) return;   // already in the same set
        if (weight[rep_x] < weight[rep_y]) {
          A[rep_x] = rep_y;
          weight[rep_y] += weight[rep_x];
        } else {
          A[rep_y] = rep_x;
          weight[rep_x] += weight[rep_y];
        }
      }

  36. Implementation • Still represent this with an array of n nodes • If element i is a “root node”, then arr[i] = -s, where s is the size of that set. • Otherwise, arr[i] is the index of i’s “parent” • To unite roots i and j: if arr[i] < arr[j] (the set of i is larger), then set arr[i] to arr[i]+arr[j] and set arr[j] to i • Else, set arr[j] to arr[i]+arr[j] and set arr[i] to j
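The single-array encoding can be sketched in Python as below. The function names and the guard for two elements that already share a root are assumptions made for the example; a root stores the negated size of its set, so the more negative entry belongs to the larger set.

```python
def find(arr, i):
    while arr[i] >= 0:           # negative entry: i is a root
        i = arr[i]
    return i


def union(arr, x, y):
    i, j = find(arr, x), find(arr, y)
    if i == j:
        return                   # same set already
    if arr[i] < arr[j]:          # more negative: the set of i is larger
        arr[i] += arr[j]         # new size, stored negated
        arr[j] = i               # smaller root j now points to i
    else:
        arr[j] += arr[i]
        arr[i] = j


arr = [-1] * 6                   # six one-element sets
union(arr, 0, 1)
union(arr, 2, 3)
union(arr, 0, 2)
union(arr, 0, 5)
print(arr)  # [1, 3, 3, -5, -1, 3]
```

After the four unions, element 3 is the root of a set of size 5 (stored as -5) and element 4 is still alone (stored as -1).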

  37. Implementation. [Figure: an example array for elements 0 to 9 and the up-trees it encodes.]

  38. Performance w/ Union by Weight • If unions are done by weight, the depth of any element is never greater than lg n. • Initially, every element is at depth zero. • When its depth increases as a result of a union operation (it is in the smaller tree), it is placed in a tree at least twice as large as the one it was in before (exactly twice as large for a union of two equal-size trees). • How often can this happen to one element? At most lg n times, because after lg n doublings its tree contains all n elements. • Therefore, find becomes O(lg n) when union by weight is used.

  39. New Bound on h. Theorem: Assume we start with a Union/Find structure where each set has 1 node, and perform a sequence of Weighted Unions. Then any tree T of m nodes has height no greater than log₂ m.

  40. Proof • Base case: If m = 1, then this is clearly true • Assumption: Assume it is true for all trees of size m-1 or less • Proof: Let T be a tree of m nodes created by a sequence of Weighted Unions. Consider the last union: Union(S_j, S_k). Assume S_j is the smaller tree. If S_j has a nodes, then S_k has m-a nodes, and 1 ≤ a ≤ m/2.

  41. Proof (continued) • Let T_j and T_k be the trees of S_j and S_k. The height of T is either: • The height of T_k • One more than the height of T_j • Since a ≤ m-a ≤ m-1, the induction hypothesis applies to both T_k and T_j • If T has the height of T_k, then h ≤ log₂(m-a) ≤ log₂ m • If the height of T is one greater than the height of T_j, then h ≤ log₂ a + 1 ≤ log₂(m/2) + 1 = log₂ m

  42. Example • Is the bound tight? • Yes: “pair them off” • Union(S_0,S_1), Union(S_2,S_3), Union(S_4,S_5), Union(S_6,S_7), Union(S_0,S_2), Union(S_4,S_6), Union(S_0,S_4) [Figure: eight single-node trees 0 to 7.]

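  43. Example. [Figure: after the first four unions: roots 0, 2, 4 and 6, with children 1, 3, 5 and 7 respectively.]

  44. Example. [Figure: after Union(S_0,S_2) and Union(S_4,S_6): two trees of height 2, rooted at 0 and 4.]

  45. Example. [Figure: after Union(S_0,S_4): a single tree of 8 nodes rooted at 0, with height 3 = log₂ 8.]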
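The tightness example can be checked with a small Python sketch. It assumes union by size with ties broken by attaching the second root under the first, and the helper names (`union`, `depth`) are illustrative. It replays the seven unions listed on slide 42 and measures the height of the resulting tree.

```python
import math


def find(parent, x):
    while parent[x] != x:
        x = parent[x]
    return x


def union(parent, size, x, y):
    rx, ry = find(parent, x), find(parent, y)
    if rx == ry:
        return
    if size[rx] < size[ry]:
        rx, ry = ry, rx          # make rx the root of the larger tree
    parent[ry] = rx              # on a tie, the second root goes under the first
    size[rx] += size[ry]


def depth(parent, x):
    d = 0
    while parent[x] != x:
        x = parent[x]
        d += 1
    return d


n = 8
parent, size = list(range(n)), [1] * n
for x, y in [(0, 1), (2, 3), (4, 5), (6, 7), (0, 2), (4, 6), (0, 4)]:
    union(parent, size, x, y)

height = max(depth(parent, v) for v in range(n))
print(height, math.log2(n))  # 3 3.0
```

Element 7 ends up three links below the root 0, so the height equals log₂ 8 and the bound is met with equality.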

  46. Analysis • Worst case: • Union is still O(1) • Find is now O(log n) • Amortized case: • A “worst amortized case” can be achieved if we perform n/2 unions and n/2 finds • This takes O(n log n) time • Conclusion: This is better, but we can improve it further

  47. Path Compression • Each time we do a find on an element x, we make every element on the path from x to the root an immediate child of the root, by setting each such element’s parent to the representative.
      find(x) {
        if (A[x] < 0) return x;
        A[x] = find(A[x]);
        return A[x];
      }
    • When path compression is done, a sequence of m operations takes O(m lg n) time. Amortized time is O(lg n) per operation.
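The effect of path compression can be seen in a direct Python transcription of the find routine. The starting array is an assumption: it encodes the eight-element tree built by the pairing sequence on slide 42, with -1 marking the root and ties attaching the second root under the first.

```python
def find(A, x):
    if A[x] < 0:
        return x                 # x is the representative
    A[x] = find(A, A[x])         # re-point x directly at the root
    return A[x]


# Parent array: 0 is the root; 7 -> 6 -> 4 -> 0 is the longest path.
A = [-1, 0, 0, 2, 0, 4, 4, 6]
root = find(A, 7)
print(root, A)  # 0 [-1, 0, 0, 2, 0, 4, 0, 0]
```

After the call, elements 6 and 7 point straight at the root, while element 3, which was not on the search path, keeps its parent 2.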

  48. Find(7). [Figure: the tree rooted at 0 before and after Find(7); afterwards 6 and 7 are immediate children of the root 0.]

  49. Analysis • The worst-case analysis does not change • In fact, path compression increases the worst-case time of Find by a constant factor • The amortized analysis does get better • To state the improved bound, we need to define Ackermann’s function

  50. Performance with Both Optimizations • When both optimizations are performed, a sequence of m operations (unions and finds, m ≥ n) takes no more than O(m lg* n) time. • lg* n is the iterated (base 2) logarithm of n: the number of times you take lg before the result becomes ≤ 1. • Example: • lg* 16 = 3 • lg* 65536 = 4 • lg* 2^65536 = 5 • Union-find is essentially O(m) for a sequence of m operations (amortized O(1)).
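The iterated logarithm is easy to compute directly. The sketch below, with the assumed name `log_star`, follows the definition above and reproduces the three example values; it relies on Python's `math.log2` accepting arbitrarily large integers.

```python
import math


def log_star(n):
    """Number of times lg must be applied before the value is <= 1."""
    count = 0
    while n > 1:
        n = math.log2(n)
        count += 1
    return count


print(log_star(16), log_star(65536), log_star(2 ** 65536))  # 3 4 5
```

Since 2^65536 is far larger than any input size that fits in memory, lg* n is at most 5 in practice, which is why the bound is described as essentially linear.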
