1 / 61

Large Graph Mining: Power Tools and a Practitioner’s Guide

Large Graph Mining: Power Tools and a Practitioner’s Guide. Christos Faloutsos Gary Miller Charalampos (Babis) Tsourakakis CMU. Outline. Reminders Adjacency matrix Intuition behind eigenvectors: Bipartite Graphs Walks of length k Laplacian Connected Components

orien
Télécharger la présentation

Large Graph Mining: Power Tools and a Practitioner’s Guide

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Large Graph Mining:Power Tools and a Practitioner’s Guide Christos Faloutsos Gary Miller Charalampos (Babis) Tsourakakis CMU

  2. Outline • Reminders • Adjacency matrix • Intuition behind eigenvectors: Bipartite Graphs • Walks of length k • Laplacian • Connected Components • Intuition: Adjacency vs. Laplacian • Sparsest Cut and Cheeger Inequality : • Derivation, intuition • Example • Normalized Laplacian Faloutsos, Miller, Tsourakakis

  3. Outline • Reminders • Adjacency matrix • Intuition behind eigenvectors: Bipartite Graphs • Walks of length k • Laplacian • Connected Components • Intuition: Adjacency vs. Laplacian • Sparsest Cut and Cheeger Inequality : • Derivation, intuition • Example • Normalized Laplacian Faloutsos, Miller, Tsourakakis

  4. Matrix Representations of G(V,E) Associate a matrix to a graph: • Adjacency matrix • Laplacian • Normalized Laplacian Main focus Faloutsos, Miller, Tsourakakis

  5. Matrix as an operator The image of the unit circle (sphere) under any mxn matrix is an ellipse (hyperellipse). e.g., Faloutsos, Miller, Tsourakakis

  6. More Reminders • Let M be a symmetric nxn matrix. λ eigenvaluex eigenvector Faloutsos, Miller, Tsourakakis

  7. More Reminders 1-Dimensional Invariant Subspaces Diagonal: No rotation y x (λ,u) Ax Ay Faloutsos, Miller, Tsourakakis

  8. Keep in mind! • For the rest of slides we are talking for square nxn matricesand unless noticed symmetric ones, i.e, Faloutsos, Miller, Tsourakakis

  9. Outline • Reminders • Adjacency matrix • Intuition behind eigenvectors: Bipartite Graphs • Walks of length k • Laplacian • Connected Components • Intuition: Adjacency vs. Laplacian • Cheeger Inequality and Sparsest Cut: • Derivation, intuition • Example • Normalized Laplacian Faloutsos, Miller, Tsourakakis

  10. Adjacency matrix Undirected 4 1 A= 2 3 Faloutsos, Miller, Tsourakakis

  11. Adjacency matrix Undirected Weighted 4 10 1 4 0.3 A= 2 3 2 Faloutsos, Miller, Tsourakakis

  12. Adjacency matrix Directed 4 1 ObservationIf G is undirected,A = AT 2 3 Faloutsos, Miller, Tsourakakis

  13. Spectral Theorem Theorem [Spectral Theorem] • If M=MT, thenwhere 0 0 Reminder 2: xi i-th principal axis λilength of i-th principal axis Reminder 1: xi,xj orthogonal λi λj xi xj Faloutsos, Miller, Tsourakakis

  14. Outline • Reminders • Adjacency matrix • Intuition behind eigenvectors: Bipartite Graphs • Walks of length k • Laplacian • Connected Components • Intuition: Adjacency vs. Laplacian • Sparsest Cut and Cheeger Inequality : • Derivation, intuition • Example • Normalized Laplacian Faloutsos, Miller, Tsourakakis

  15. Bipartite Graphs Any graph with no cycles of odd length is bipartite e.g., all trees are bipartite K3,3 1 4 2 5 Can we check if a graph is bipartitevia its spectrum?Can we get the partition of the verticesin the two sets of nodes? 3 6 Faloutsos, Miller, Tsourakakis

  16. Bipartite Graphs Why λ1=-λ2=3?Recall: Ax=λx, (λ,x) eigenvalue-eigenvector Adjacency matrix K3,3 1 4 where 2 5 3 6 Eigenvalues: Λ=[3,-3,0,0,0,0] Faloutsos, Miller, Tsourakakis

  17. Bipartite Graphs 1 1 3=3x1 3 2 1 1 1 4 1 4 1 1 1 2 5 5 1 1 1 3 6 6 Repeat same argument for the other nodes Faloutsos, Miller, Tsourakakis

  18. Bipartite Graphs 1 -1 -3=(-3)x1 -3 -2 -1 -1 1 4 1 4 1 -1 -1 2 5 5 1 -1 -1 3 6 6 Repeat same argument for the other nodes Faloutsos, Miller, Tsourakakis

  19. Bipartite Graphs • Observationu2 gives the partition of the nodes in the two sets S, V-S! S V-S Question: Were we just “lucky”? Answer: No Theorem: λ2=-λ1iff G bipartite. u2 gives the partition. Faloutsos, Miller, Tsourakakis

  20. Outline • Reminders • Adjacency matrix • Intuition behind eigenvectors: Bipartite Graphs • Walks of length k • Laplacian • Connected Components • Intuition: Adjacency vs. Laplacian • Sparsest Cut and Cheeger Inequality : • Derivation, intuition • Example • Normalized Laplacian Faloutsos, Miller, Tsourakakis

  21. Walks • A walk of length r in a directed graph:where a node can be used more than once. • Closed walk when: 4 4 1 1 Closed walk of length 3 2-1-3-2 Walk of length 2 2-1-4 2 2 3 3 Faloutsos, Miller, Tsourakakis

  22. Walks • Theorem G(V,E) directed graph, adjacency matrix A. The number of walks from node u to node v in G with length r is (Ar)uv • Proof Induction on k. See Doyle-Snell, p.165 (i,i1),..,(ir-1,j) (i, i1),(i1,j) (i,j) Faloutsos, Miller, Tsourakakis

  23. Walks 4 1 2 3 4 i=3, j=3 i=2, j=4 4 1 1 2 3 2 3 Faloutsos, Miller, Tsourakakis

  24. Walks 4 1 2 3 Always 0,node 4 is a sink 4 1 2 3 Faloutsos, Miller, Tsourakakis

  25. Walks • Corollary A adjacency matrix of undirected G(V,E) (no self loops), e edges and t triangles. Then the following hold:a) trace(A) = 0 b) trace(A2) = 2ec) trace(A3) = 6t 1 Computing Ar is a bad idea: High memory requirements,expensive! 1 2 1 2 3 Faloutsos, Miller, Tsourakakis

  26. Outline • Reminders • Adjacency matrix • Intuition behind eigenvectors: Bipartite Graphs • Walks of length k • Laplacian • Connected Components • Intuition: Adjacency vs. Laplacian • Sparsest Cut and Cheeger Inequality : • Derivation, intuition • Example • Normalized Laplacian Faloutsos, Miller, Tsourakakis

  27. Laplacian 4 1 L= D-A= 2 3 Diagonal matrix, dii=di Faloutsos, Miller, Tsourakakis

  28. Weighted Laplacian 4 10 1 4 0.3 2 3 2 Faloutsos, Miller, Tsourakakis

  29. Outline • Reminders • Adjacency matrix • Intuition behind eigenvectors: Bipartite Graphs • Walks of length k • Laplacian • Connected Components • Intuition: Adjacency vs. Laplacian • Sparsest Cut and Cheeger Inequality : • Derivation, intuition • Example • Normalized Laplacian Faloutsos, Miller, Tsourakakis

  30. Connected Components • LemmaLet G be a graph with n vertices and c connected components. If L is the Laplacian of G, then rank(L)=n-c. • Proof see p.279, Godsil-Royle Faloutsos, Miller, Tsourakakis

  31. Connected Components G(V,E) 1 2 3 L= 4 6 #zeros = #components 7 5 eig(L)= Faloutsos, Miller, Tsourakakis

  32. Connected Components G(V,E) 1 2 3 L= 0.01 4 6 #zeros = #components Indicates a “good cut” 7 5 eig(L)= Faloutsos, Miller, Tsourakakis

  33. Outline • Reminders • Adjacency matrix • Intuition behind eigenvectors: Bipartite Graphs • Walks of length k • Laplacian • Connected Components • Intuition: Adjacency vs. Laplacian • Sparsest Cut and Cheeger Inequality : • Derivation, intuition • Example • Normalized Laplacian Faloutsos, Miller, Tsourakakis

  34. Adjacency vs. Laplacian Intuition V-S Let x be an indicator vector: S Consider now y=Lx k-th coordinate Faloutsos, Miller, Tsourakakis

  35. Adjacency vs. Laplacian Intuition G30,0.5 S Consider now y=Lx k Faloutsos, Miller, Tsourakakis

  36. Adjacency vs. Laplacian Intuition G30,0.5 S Consider now y=Lx k Faloutsos, Miller, Tsourakakis

  37. Adjacency vs. Laplacian Intuition G30,0.5 S Consider now y=Lx k k Laplacian: connectivity, Adjacency: #paths Faloutsos, Miller, Tsourakakis

  38. Outline • Reminders • Adjacency matrix • Intuition behind eigenvectors: Bipartite Graphs • Walks of length k • Laplacian • Connected Components • Intuition: Adjacency vs. Laplacian • Sparsest Cut and Cheeger Inequality : • Derivation, intuition • Example • Normalized Laplacian Faloutsos, Miller, Tsourakakis

  39. Why Sparse Cuts? • Clustering, Community Detection • And more: Telephone Network Design, VLSI layout, Sparse Gaussian Elimination, Parallel Computation cut 4 8 1 5 9 2 3 6 7 Faloutsos, Miller, Tsourakakis

  40. Quality of a Cut • Edge expansion/Isoperimetric number φ 4 1 2 3 Faloutsos, Miller, Tsourakakis

  41. Quality of a Cut • Edge expansion/Isoperimetric number φ 4 1 and thus 2 3 Faloutsos, Miller, Tsourakakis

  42. Why λ2? V-S Characteristic Vector x S Edges across cut Then: Faloutsos, Miller, Tsourakakis

  43. Why λ2? S V-S cut 4 8 1 5 9 2 3 6 7 x=[1,1,1,1,0,0,0,0,0]T xTLx=2 Faloutsos, Miller, Tsourakakis

  44. Why λ2? Ratio cut Sparsest ratio cut NP-hard Relax the constraint: ? Normalize: Faloutsos, Miller, Tsourakakis

  45. Why λ2? Sparsest ratio cut NP-hard Relax the constraint: λ2 Normalize: because of the Courant-Fisher theorem (applied to L) Faloutsos, Miller, Tsourakakis

  46. Why λ2? OSCILLATE x1 xn Each ball 1 unit of mass Node id Eigenvector value Faloutsos, Miller, Tsourakakis

  47. Why λ2? Fundamental mode of vibration: “along” the separator Faloutsos, Miller, Tsourakakis

  48. Cheeger Inequality • Step 1: Sort vertices in non-decreasing order according to their assigned by the second eigenvector value. • Step 2: Decide where to cut. • Bisection • Best ratio cut Two common heuristics Faloutsos, Miller, Tsourakakis

  49. Outline • Reminders • Adjacency matrix • Intuition behind eigenvectors: Bipartite Graphs • Walks of length k • Laplacian • Connected Components • Intuition: Adjacency vs. Laplacian • Sparsest Cut and Cheeger Inequality : • Derivation, intuition • Example • Normalized Laplacian Faloutsos, Miller, Tsourakakis

  50. Example: Spectral Partitioning • K500 • K500 dumbbell graph A = zeros(1000); A(1:500,1:500)=ones(500)-eye(500); A(501:1000,501:1000)= ones(500)-eye(500); myrandperm = randperm(1000); B = A(myrandperm,myrandperm); In social network analysis, such clusters are called communities Faloutsos, Miller, Tsourakakis

More Related