
MIS 644 Social Network Analysis 2017/2018 Spring


dsheehan




Presentation Transcript


  1. MIS 644 Social Network Analysis 2017/2018 Spring • Chapter 3: Measures and Metrics

  2. Outline • Centrality Measures • Structural Balance • Similarity • Homophily and Assortative Mixing

  3. Structure of a network – calculate • various useful quantities or measures • that capture features of the topology of the network

  4. Centrality Measures • Degree • Eigenvector • Katz • Closeness • Betweenness

  5. Centrality Measures • Which are the most important or central vertices in a network? • many possible definitions of importance

  6. Degree Centrality • undirected networks – degree • directed networks – in-degree and out-degree • E.g.: • SNs: individuals with many connections have more prestige and better access to information and resources • Citation networks: papers with high in-degree are cited more and tend to be the more influential papers
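A minimal sketch of degree centrality for a directed network, assuming the convention used later in these slides that $A_{ij} = 1$ for an edge from $j$ to $i$. The function name `in_out_degrees` and the 4-paper citation network are hypothetical, made up only for illustration.

```python
def in_out_degrees(A):
    """Return (in-degrees, out-degrees) for an adjacency matrix A,
    assuming A[i][j] = 1 means an edge from vertex j to vertex i."""
    n = len(A)
    in_deg = [sum(A[i]) for i in range(n)]                         # row sums
    out_deg = [sum(A[i][j] for i in range(n)) for j in range(n)]   # column sums
    return in_deg, out_deg

# Hypothetical citation network: papers 1, 2 and 3 cite paper 0,
# and paper 3 also cites paper 2.
A = [
    [0, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
in_deg, out_deg = in_out_degrees(A)
print(in_deg)   # [3, 0, 1, 0]
print(out_deg)  # [0, 1, 1, 2]
```

Paper 0 has the highest in-degree, so under degree centrality it is the most cited and most central paper in this toy network.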

  7. Eigenvector Centrality • not all neighbors are equivalent • a vertex's importance is increased by having connections to other vertices that are themselves important • instead of treating each neighboring vertex equally • give each vertex $i$ a score $x_i$ reflecting its importance • the score of $i$ is proportional to the sum of the scores of its neighbors: $x_i \propto \sum_j A_{ij} x_j$, i.e. $x_i = \kappa \sum_j A_{ij} x_j$ for some constant $\kappa$

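  8. Or in matrix form, for a constant $\kappa$: $X = \kappa A X$, so $A X = \kappa^{-1} X$ • let $\kappa^{-1} = \lambda$: $A X = \lambda X$ • $X$: right eigenvector of $A$, and $\lambda$ the corresponding eigenvalue • for a symmetric $n \times n$ matrix there are $n$ real eigenvalues with corresponding real eigenvectors • But which eigenvector and eigenvalue?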

  9. AX = X • (A-I)X = 0 • non trivial solutions of this eq • making A-I singular • or det(A-I) = 0 • solve this for  making determinant 0

  10. Start: initial guess of the centrality of each vertex $i$ • 1, or the degree centrality, for each $x_i$ • update $x'_i = \sum_j A_{ij} x_j$ • in matrix notation • $X' = A X$ • where $A$: adjacency matrix • $X$: vector of scores • repeating this for $t$ steps: • $X(t) = A^t X(0)$ • $X(0) = \sum_i c_i v_i$, a linear combination of the eigenvectors $v_i$ of $A$

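  11. $A = V^{-1} \Lambda V$, with $\Lambda$ the diagonal matrix of eigenvalues • for symmetric matrices $V^{-1} = V^T$, • eigenvectors (taken with unit length) are orthogonal: $v_i^T v_j = 0$ for distinct $i$ and $j$ • $A = V^T \Lambda V$ • $A^t = V^{-1} \Lambda^t V$ • for symmetric matrices $A^t = V^T \Lambda^t V = \sum_i \lambda_i^t v_i v_i^T$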

  12. X(t)=Aticivi =iticivi =t1i(ti/t1)civi, • Where i is the leading eigenvalue maximum value since i<1 for all j other then 1 • So (ti/t1) = (i/1)t  0 as t   • X(t)= 1c1v1, • The converged score vector is proportional to the leading eigenvector • As eigenvector are invariant up to mulitplication by a constant: vi is an eigen vector then cvi is

  13. AX = 1X • where • 1 is the leading eigenvalue • normalization of the X eigenvector • normalize to n – average centrality to 1 • Undirected networks more sutiable • Directed nets: • A is asymetric – right and left eigenvectors hence two leading eigenvectors • Rigth eigenvector – inlinks

  14. Example • [Figure: undirected network with vertices 1 to 7] • Adjacency matrix (rows and columns in vertex order 1 to 7):
      0,1,1,0,0,0,0
      1,0,1,0,0,0,0
      1,1,0,1,0,0,0
      0,0,1,0,1,0,0
      0,0,0,1,0,1,1
      0,0,0,0,1,0,1
      0,0,0,0,1,1,0

  15. Leading Eigenvector of A
      vertex: 1     2     3     4     5     6     7
      score:  0.894 0.894 1.200 1.025 1.200 0.894 0.894
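The repeated update $X' = AX$ with rescaling can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function name `eigenvector_centrality` and the fixed step count are not from the slides; the matrix is the 7-vertex example, the starting guess is 1 for every vertex, and after each step the scores are rescaled to sum to $n$.

```python
def eigenvector_centrality(A, steps=200):
    """Power iteration X' = A X, rescaled each step so the scores sum to n."""
    n = len(A)
    x = [1.0] * n                                   # initial guess: 1 per vertex
    for _ in range(steps):
        x = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(x)
        x = [n * v / total for v in x]              # average centrality becomes 1
    return x

# Adjacency matrix of the 7-vertex example (vertices 1..7 are indices 0..6).
A = [
    [0, 1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 1, 1],
    [0, 0, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 1, 1, 0],
]
x = eigenvector_centrality(A)
print([round(v, 3) for v in x])
```

The printed scores can be compared with the leading eigenvector listed for this example: the two vertices that bridge toward the middle (3 and 5) score highest, followed by the middle vertex 4.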

  16. Illustrative Iterations (one row per iteration; columns are vertices 1 to 7)
      0.875 0.875 1.312 0.875 1.312 0.875 0.875
      0.921 0.921 1.105 1.105 1.105 0.921 0.921
      0.875 0.875 1.273 0.955 1.273 0.875 0.875
      0.909 0.909 1.144 1.077 1.144 0.909 0.909
      0.882 0.882 1.244 0.983 1.244 0.882 0.882
      0.903 0.903 1.167 1.056 1.167 0.903 0.903
      0.887 0.887 1.226 1.000 1.226 0.887 0.887
      0.899 0.899 1.180 1.044 1.180 0.899 0.899
      0.890 0.890 1.216 1.010 1.216 0.890 0.890
      0.897 0.897 1.188 1.036 1.188 0.897 0.897
      0.891 0.891 1.210 1.016 1.210 0.891 0.891
      0.896 ...

  17. Directed Networks • $x_i = \lambda_1^{-1} \sum_j A_{ij} x_j$ • or $A X = \lambda_1 X$ • $X$: right eigenvector • each row of $A$ is multiplied by $X$ • $A_{ij} = 1$ for ingoing links (an edge from $j$ to $i$) • if a node $i$ has no ingoing links, $A_{ij} = 0$ for all $j$ • hence $x_i$ for that vertex is 0 • any outgoing links from it get a weight of 0 as well • only vertices in strongly connected components, or in their out-components, have non-zero centrality • Acyclic networks, e.g. citation networks: • no strongly connected components • centrality of all nodes is 0

  18. A portion of a directed network. Vertex A in this network has only outgoing edges and hence will have eigenvector centrality zero. Vertex B has outgoing edges and one ingoing edge, but the ingoing one originates at A, and hence vertex B will also have centrality zero.

  19. Iterations (columns are vertices 1 to 6)
      vertex: 1     2     3     4     5     6
              0.000 0.000 0.000 2.400 3.600 0.000
              0.000 0.000 0.000 3.600 2.400 0.000
              0.000 0.000 0.000 2.400 3.600 0.000
      Eigenvector Centrality: 0.000 0.000 0.000 2.400 3.600 0.000

  20. [Figure: vertices A and B] • A has no ingoing links, so its centrality is zero • B has links from A only, so its centrality is also zero
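The way zero centrality spreads through a network without cycles can be seen by iterating $X' = AX$ by hand. The sketch below uses a hypothetical 3-vertex chain 0 → 1 → 2 (not one of the slide examples) with $A_{ij} = 1$ for an edge from $j$ to $i$; the helper name `iterate` is also an assumption.

```python
def iterate(A, x, steps):
    """Apply the update X' = A X the given number of times (no rescaling)."""
    n = len(A)
    for _ in range(steps):
        x = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    return x

# Hypothetical acyclic chain 0 -> 1 -> 2; A[i][j] = 1 for an edge from j to i.
A = [
    [0, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
]
start = [1.0, 1.0, 1.0]
print(iterate(A, start, 1))  # [0.0, 1.0, 1.0]  vertex 0 has no ingoing links
print(iterate(A, start, 2))  # [0.0, 0.0, 1.0]  vertex 1 is fed only by vertex 0
print(iterate(A, start, 3))  # [0.0, 0.0, 0.0]  nothing is left to pass on
```

Each step wipes out one more level of the chain, which is why every vertex of an acyclic network ends up with eigenvector centrality zero.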

  21. Strongly connected component and its out-components • Adjacency matrix (vertices 1 to 8):
      0,0,1,0,0,0,0,0
      1,0,0,0,0,0,0,0
      0,1,0,0,0,0,1,0
      1,0,0,0,0,1,0,0
      0,0,0,1,0,0,0,0
      0,0,0,0,0,0,0,0
      0,0,0,0,0,0,0,0
      0,0,0,0,0,0,1,0

  22. Acyclical network

  23. Results (columns are vertices 1 to 8)
      Eigenvector Centrality: 2.667 1.333 1.333 1.333 1.333 0.000 0.000 0.000
      Largest alpha: 2.0; with alpha = 1.5
      Katz Centrality:        1.806 1.484 1.613 1.484 1.613 0.000 0.000 0.000
      A few iterations:
              1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
              1.000 1.000 1.600 1.600 1.000 0.400 0.400 1.000
              1.566 1.063 1.399 1.399 1.566 0.224 0.224 0.559
              1.398 1.549 1.297 1.752 1.398 0.135 0.135 0.337
      PageRank Centrality:    1.539 1.017 1.383 1.326 1.491 0.363 0.363 0.518

  24. Transpose of the adjacency matrix:
      0.00 1.00 0.00 1.00 0.00 0.00 0.00 0.00
      0.00 0.00 1.00 0.00 0.00 0.00 0.00 0.00
      1.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
      0.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00
      0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
      0.00 0.00 0.00 1.00 0.00 0.00 0.00 0.00
      0.00 0.00 1.00 0.00 0.00 0.00 0.00 1.00
      0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
      Cocitation matrix (Cit Mat):
      1.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
      0.00 1.00 0.00 1.00 0.00 0.00 0.00 0.00
      0.00 0.00 2.00 0.00 0.00 0.00 0.00 1.00
      0.00 1.00 0.00 2.00 0.00 0.00 0.00 0.00
      0.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00
      0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
      0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
      0.00 0.00 1.00 0.00 0.00 0.00 0.00 1.00
      Bibliographic coupling matrix (Bib Mat):
      1.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
      0.00 1.00 0.00 1.00 0.00 0.00 0.00 0.00
      0.00 0.00 2.00 0.00 0.00 0.00 0.00 1.00
      0.00 1.00 0.00 2.00 0.00 0.00 0.00 0.00
      0.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00
      0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
      0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
      0.00 0.00 1.00 0.00 0.00 0.00 0.00 1.00

  25. Results (columns are vertices 1 to 8)
      Authority Centrality: 0.000 1.528 2.472 2.472 0.000 0.000 0.000 1.528
      Hub Centrality:       2.472 1.528 0.000 0.000 0.000 1.528 2.472 0.000
      Hub Centrality2:      4.000 2.472 0.000 0.000 0.000 2.472 4.000 0.000
      Authority Eigenvalue: 1.0 • Hub Eigenvalue: 2.618033988749895

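  26. Katz Centrality • give a small amount of centrality to each vertex for free: $x_i = \alpha \sum_j A_{ij} x_j + \beta$ • in matrix form $X = \alpha A X + \beta \mathbf{1}$ • where $\mathbf{1}$ is the vector of 1s $(1,1,1,\ldots,1)$ • $X = \beta (I - \alpha A)^{-1} \mathbf{1}$ • make $\beta = 1$: $X = (I - \alpha A)^{-1} \mathbf{1}$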

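  27. Free parameter $\alpha$: controls the balance between the eigenvector term and the constant term • increase $\alpha$ until $I - \alpha A$ becomes singular, at $\det(A - \alpha^{-1} I) = 0$ • the characteristic roots are the eigenvalues: $\alpha^{-1} = \lambda_1$, or $\alpha = 1/\lambda_1$ • so $\alpha$ must stay below $1/\lambda_1$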

  28. How to compute • use $X = \alpha A X + \beta \mathbf{1}$ • start with an initial estimate of $X$ ($X_0 = 0$) • $X_1 = \alpha A X_0 + \beta \mathbf{1}$ • stop when it converges • can be applied to undirected networks as well • gives a centrality to a node by virtue of its existence

  29. Katz Centrality of the Example (one row per iteration)
      vertex: 1     2     3     4     5     6     7
              1.000 1.000 1.000 1.000 1.000 1.000 1.000
              0.940 0.940 1.149 0.940 1.149 0.940 0.940
              0.934 0.934 1.136 0.992 1.136 0.934 0.934
              0.921 0.921 1.166 0.983 1.166 0.921 0.921
              ...
              0.902 0.902 1.189 1.015 1.189 0.902 0.902
              0.902 0.902 1.189 1.015 1.189 0.902 0.902
      Katz Centrality: 0.902 0.902 1.189 1.015 1.189 0.902 0.902
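The iteration $X \leftarrow \alpha A X + \beta \mathbf{1}$ can be sketched in plain Python as follows. The function name `katz_centrality`, the tolerance, and the value $\alpha = 0.3$ are assumptions chosen for illustration (the slides do not state which $\alpha$ produced the table above, so the numbers need not match it); the matrix is the 7-vertex undirected example, and the final rescaling to sum $n$ is likewise an assumption.

```python
def katz_centrality(A, alpha, beta=1.0, tol=1e-12, max_steps=10000):
    """Iterate X <- alpha * A X + beta * 1 from X = 0 until it stops changing.
    Converges only when alpha is below 1 / (leading eigenvalue of A)."""
    n = len(A)
    x = [0.0] * n
    for _ in range(max_steps):
        new = [alpha * sum(A[i][j] * x[j] for j in range(n)) + beta
               for i in range(n)]
        if max(abs(a - b) for a, b in zip(new, x)) < tol:
            return new
        x = new
    return x

# Adjacency matrix of the 7-vertex example (vertices 1..7 are indices 0..6).
A = [
    [0, 1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 1, 1],
    [0, 0, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 1, 1, 0],
]
raw = katz_centrality(A, alpha=0.3)          # alpha = 0.3 is an assumed value
scaled = [len(A) * v / sum(raw) for v in raw]  # rescale so scores sum to n
print([round(v, 3) for v in scaled])
```

Because every vertex receives the constant $\beta$, no score can be zero, which is the property that distinguishes Katz centrality from plain eigenvector centrality on directed networks.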

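  30. Extension • make $\beta$ different for each vertex: $X = \alpha A X + \boldsymbol{\beta}$, with $\boldsymbol{\beta}$ a vector of constants $\beta_i$ • solution $X = (I - \alpha A)^{-1} \boldsymbol{\beta}$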

  31. PageRank • Problem with Katz • if a vertex with high Katz centrality points to many other vertices • those others all get high centrality • e.g., Yahoo has high centrality • if it points to my page, should my page have high centrality as well?

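  32. PageRank • the centrality a vertex passes on is divided by its out-degree: $x_i = \alpha \sum_j A_{ij} \frac{x_j}{k_j^{out}} + \beta$ • problem when $k_j^{out} = 0$ • in matrix form $X = \alpha A D^{-1} X + \beta \mathbf{1}$ • where $\mathbf{1}$ is the vector of 1s • $D$: diagonal matrix with $D_{ii} = \max(k_i^{out}, 1)$ • $X = \beta (I - \alpha A D^{-1})^{-1} \mathbf{1}$ • make $\beta = 1$: $X = (I - \alpha A D^{-1})^{-1} \mathbf{1} = D (D - \alpha A)^{-1} \mathbf{1}$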

  33. Free parameter $\alpha$ can be set to small values • $\alpha <$ inverse of the largest eigenvalue of $A D^{-1}$ • the largest eigenvalue is 1 by the Perron-Frobenius theorem • for a matrix whose columns sum to 1 • there is an eigenvalue 1 • for symmetric $A$ (undirected networks) no other eigenvalue exceeds 1 • Google sets $\alpha$ to 0.85
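A minimal sketch of the PageRank update $x_i = \alpha \sum_j A_{ij} x_j / k_j^{out} + \beta$, assuming $A_{ij} = 1$ for an edge from $j$ to $i$ (so the out-degree of $j$ is a column sum) and using $\max(k_j^{out}, 1)$ to avoid division by zero. The function name `pagerank`, the fixed step count, and the choice of test network (the 6-vertex directed matrix that appears later in these slides) are assumptions for illustration.

```python
def pagerank(A, alpha=0.85, beta=1.0, steps=200):
    """Iterate x_i <- alpha * sum_j A[i][j] * x_j / k_out[j] + beta."""
    n = len(A)
    # Out-degree of j is the j-th column sum; max(..., 1) guards against 0.
    k_out = [max(sum(A[i][j] for i in range(n)), 1) for j in range(n)]
    x = [1.0] * n
    for _ in range(steps):
        x = [alpha * sum(A[i][j] * x[j] / k_out[j] for j in range(n)) + beta
             for i in range(n)]
    return x

# 6-vertex directed example; every vertex has at least one outgoing edge.
A = [
    [0, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 1, 0, 0, 0, 0],
]
x = pagerank(A)
print([round(v, 3) for v in x])
```

When no vertex has out-degree zero, each column of $A D^{-1}$ sums to 1, so the unnormalized scores sum to $n\beta/(1-\alpha)$; that gives a quick sanity check on the iteration.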

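  34. Extensions • make $\beta$ different for each vertex: $x_i = \alpha \sum_j A_{ij} \frac{x_j}{k_j^{out}} + \beta_i$, $X = \alpha A D^{-1} X + \boldsymbol{\beta}$ • solution $X = D (D - \alpha A)^{-1} \boldsymbol{\beta}$ • or make $\beta$ zero: $x_i = \alpha \sum_j A_{ij} \frac{x_j}{k_j^{out}}$, similar to eigenvector centrality • for undirected networks • $x_i = k_i$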

  35. Iterations and PageRank Centrality (one row per iteration)
      vertex: 1     2     3     4     5     6
              1.000 1.000 1.000 1.000 1.000 1.000
              0.585 0.751 0.834 1.498 1.083 1.249
              0.465 0.597 0.719 1.582 1.477 1.160
              0.404 0.519 0.625 1.830 1.573 1.050
              0.366 0.469 0.565 1.878 1.773 0.950
              ...
      PageRank Centrality: 0.239 0.306 0.369 2.287 2.179 0.620

  36. Converged iterations (columns are vertices 1 to 6)
              0.558 1.135 1.439 0.707 1.265 0.896
              0.558 1.135 1.439 0.707 1.265 0.896
              0.558 1.135 1.439 0.707 1.265 0.896
      • vold: 0.558 1.135 1.439 0.707 1.265 0.896
      • new:  0.707 1.439 1.823 0.896 1.603 1.135
      • 1.26
      • Largest alpha: 0.789

  37. Adjacency matrix (vertices 1 to 6):
      0,0,0,1,0,0
      0,0,1,0,0,0
      1,0,0,0,1,0
      0,0,0,0,0,1
      0,0,0,1,0,1
      0,1,0,0,0,0

  38. With $\beta = 1$ and a suggested $\alpha$ of 0.45
      vertex:              1     2     3     4     5     6
      Katz Centrality:     0.623 1.104 1.417 0.731 1.242 0.884
      PageRank Centrality: 0.440 1.299 1.351 0.683 0.973 1.254

  39. Hubs and Authorities • so far, a vertex has high centrality if it is pointed to by high-centrality vertices • authorities: contain useful information on a topic • hubs: tell where the best authorities can be found • authority – hub • e.g., review articles • centrality for directed networks • authority and hub centrality • hyperlink-induced topic search (HITS) by Kleinberg

  40. HITS • authority centrality is defined in terms of hub centrality and vice versa • $x_i = \alpha \sum_j A_{ij} y_j$, $x_i$: authority centrality of vertex $i$ • $y_i = \beta \sum_j A_{ji} x_j$, $y_i$: hub centrality of vertex $i$ • in matrix notation $x = \alpha A y$, $y = \beta A^T x$ • or combining both: $A A^T x = \lambda x$, $A^T A y = \lambda y$ • where $\lambda = (\alpha\beta)^{-1}$

  41. Solution • the authority and hub centralities are given by the eigenvectors of $A A^T$ and $A^T A$ respectively • with the same eigenvalue • the leading eigenvalue $\lambda$ • both $A A^T$ and $A^T A$ have the same leading eigenvalue $\lambda$ • multiplying $A A^T x = \lambda x$ by $A^T$: $A^T (A A^T) x = \lambda A^T x$ • $(A^T A) A^T x = \lambda A^T x$, compared with $(A^T A) y = \lambda y$ • so $A^T x = y$ • $A A^T$: cocitation matrix • $A^T A$: bibliographic coupling matrix

  42. Solves the problem with Eigenvector Centrality • vertices that are not cited have authority centrality zero • but they can have non-zero hub centrality • and the vertices they cite can have non-zero authority centrality

  43. Transpose of the adjacency matrix:
      0.00 0.00 1.00 0.00 0.00 0.00
      0.00 0.00 0.00 0.00 0.00 1.00
      0.00 1.00 0.00 0.00 0.00 0.00
      1.00 0.00 0.00 0.00 1.00 0.00
      0.00 0.00 1.00 0.00 0.00 0.00
      0.00 0.00 0.00 1.00 1.00 0.00
      Cocitation matrix (coCitation Mat):
      1.00 0.00 0.00 0.00 1.00 0.00
      0.00 1.00 0.00 0.00 0.00 0.00
      0.00 0.00 2.00 0.00 0.00 0.00
      0.00 0.00 0.00 1.00 1.00 0.00
      1.00 0.00 0.00 1.00 2.00 0.00
      0.00 0.00 0.00 0.00 0.00 1.00
      Bibliographic coupling matrix (Bib Mat):
      1.00 0.00 0.00 0.00 1.00 0.00
      0.00 1.00 0.00 0.00 0.00 0.00
      0.00 0.00 2.00 0.00 0.00 0.00
      0.00 0.00 0.00 1.00 1.00 0.00
      1.00 0.00 0.00 1.00 2.00 0.00
      0.00 0.00 0.00 0.00 0.00 1.00

  44. Results (columns are vertices 1 to 6)
      Authority Centrality: 1.500 0.000 0.000 1.500 3.000 0.000
      Hub Centrality:       0.000 0.000 0.000 3.000 0.000 3.000
      Hub Centrality2:      0.000 0.000 0.000 4.500 0.000 4.500
      Authority Eigenvalue: 3.0 • Hub Eigenvalue: 2.0
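The coupled HITS updates $x = Ay$ and $y = A^T x$ can be sketched as an alternating iteration. This is an illustrative sketch, not the program that produced the numbers above: the function name `hits`, the fixed step count, and the rescaling of both vectors to sum $n$ are assumptions. The matrix is the 6-vertex directed example, with $A_{ij} = 1$ for an edge from $j$ to $i$.

```python
def hits(A, steps=100):
    """Alternate x = A y (authorities) and y = A^T x (hubs),
    rescaling both vectors to sum to n after every step."""
    n = len(A)
    x = [1.0] * n   # authority scores
    y = [1.0] * n   # hub scores
    for _ in range(steps):
        x = [sum(A[i][j] * y[j] for j in range(n)) for i in range(n)]  # A y
        y = [sum(A[j][i] * x[j] for j in range(n)) for i in range(n)]  # A^T x
        sx, sy = sum(x), sum(y)
        x = [n * v / sx for v in x]
        y = [n * v / sy for v in y]
    return x, y

# 6-vertex directed example (vertices 1..6 are indices 0..5).
A = [
    [0, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 1, 0, 0, 0, 0],
]
authority, hub = hits(A)
print([round(v, 3) for v in authority])
print([round(v, 3) for v in hub])
```

On this matrix the iteration should settle on the authority and hub vectors listed above: vertices 4 and 6 are the hubs, and the vertices they point to (1, 4 and 5) are the authorities.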

  45. Closeness Centrality • measures the mean distance from a vertex to all other vertices: $l_i = \frac{1}{n} \sum_j d_{ij}$ • where • $d_{ij}$: length of a geodesic path from $i$ to $j$ • $l_i$: mean geodesic distance from $i$, averaged over all vertices $j$ • either $l_i = \frac{1}{n} \sum_j d_{ij}$ or, excluding $j = i$, $l_i = \frac{1}{n-1} \sum_{j \neq i} d_{ij}$ • $d_{ii}$ is 0 by definition

  46. Closeness centrality $C_i$: $C_i = \frac{1}{l_i} = \frac{n}{\sum_j d_{ij}}$

  47. Problems • 1 – small range • the $d_{ij}$ tend to be small, of order $\log n$ • smallest 1, largest of order $\log n$ • average in between • e.g., actor network: $n = \ldots$, $l_{min} = 2.41$, $l_{max} = 8.66$ • 2 – $d_{ij}$ is $\infty$ if $i$ and $j$ are in different components • so $C_i$ becomes 0 • alternative: average only over the component $i$ is in • but then vertices in small components have high $C$ values

  48. Harmonic mean distance between vertices: $C'_i = \frac{1}{n-1} \sum_{j \neq i} \frac{1}{d_{ij}}$ • desirable • when $d_{ij} = \infty$ the corresponding term drops out • gives more weight to vertices close to $i$ • mean geodesic distance $l = \frac{1}{n^2} \sum_{ij} d_{ij} = \frac{1}{n} \sum_i l_i$ • problems • average over components • use the harmonic mean distance • $\frac{1}{l_h} = \frac{1}{n(n-1)} \sum_{i \neq j} \frac{1}{d_{ij}} = \frac{1}{n} \sum_i C'_i$, or $l_h = \frac{n}{\sum_i C'_i}$
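Both closeness measures can be computed from breadth-first-search distances. The sketch below is an illustration under assumptions: the function names are made up, the graph is the 7-vertex undirected example written as adjacency lists, and an isolated vertex 8 is added (hypothetically) to show what happens with more than one component.

```python
from collections import deque

def bfs_distances(adj, source):
    """Geodesic distances from source; unreachable vertices are left out."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def closeness(adj, i):
    """C_i = n / sum_j d_ij, or 0 if some vertex is unreachable (d_ij infinite)."""
    n = len(adj)
    dist = bfs_distances(adj, i)
    if len(dist) < n:
        return 0.0
    return n / sum(dist.values())

def harmonic_closeness(adj, i):
    """C'_i = (1/(n-1)) * sum over j != i of 1/d_ij; infinite distances drop out."""
    n = len(adj)
    dist = bfs_distances(adj, i)
    return sum(1.0 / d for j, d in dist.items() if j != i) / (n - 1)

# The 7-vertex undirected example as adjacency lists.
adj = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3, 5],
       5: [4, 6, 7], 6: [5, 7], 7: [5, 6]}
print(closeness(adj, 4), closeness(adj, 1))   # 7/10 and 7/15
print(harmonic_closeness(adj, 4))             # (1 + 1 + 4 * 0.5) / 6

# Hypothetical variant with an isolated vertex 8 (a second component).
adj2 = dict(adj)
adj2[8] = []
print(closeness(adj2, 4))            # 0.0, because d_48 is infinite
print(harmonic_closeness(adj2, 4))   # still defined: 4 / 7
```

The second network shows the motivation for the harmonic form: ordinary closeness collapses to zero for every vertex, while the harmonic version simply ignores the unreachable vertex.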

  49. Betweenness Centrality • the extent to which a vertex lies on paths between other vertices • the number of geodesic paths the vertex lies on • betweenness centrality, or simply betweenness • high betweenness means high influence • control over information passing between others • removal of such vertices disrupts communication the most
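Counting the geodesic paths through each vertex can be sketched with a breadth-first search from every source followed by a backward accumulation (the approach usually attributed to Brandes, which is swapped in here as one standard way to do the count; the slides do not name an algorithm). Assumptions: the function name `betweenness` is made up, the network is undirected and unweighted, path endpoints are not counted, and each unordered pair of vertices is counted once.

```python
from collections import deque

def betweenness(adj):
    """Number of geodesic paths passing through each vertex (endpoints excluded).
    If a pair has several geodesics, each one carries an equal fraction."""
    b = {v: 0.0 for v in adj}
    for s in adj:
        dist = {s: 0}
        sigma = {v: 0.0 for v in adj}   # number of geodesics from s to v
        sigma[s] = 1.0
        preds = {v: [] for v in adj}    # predecessors on geodesics from s
        order = []                      # vertices in order of distance from s
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):       # accumulate from the farthest vertices
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1.0 + delta[w])
            if w != s:
                b[w] += delta[w]
    # In an undirected network each pair was visited from both ends.
    return {v: val / 2.0 for v, val in b.items()}

# The 7-vertex undirected example as adjacency lists.
adj = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3, 5],
       5: [4, 6, 7], 6: [5, 7], 7: [5, 6]}
b = betweenness(adj)
print(b[4], b[3], b[1])  # 9.0 8.0 0.0
```

Vertex 4 lies on all 9 geodesics between the two triangles, so removing it would cut the network in two, which matches the idea that high-betweenness vertices control communication.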
