CUT: Community Update and Tracking in Dynamic Social Networks

Hao-Shang Ma and Jen-Wei Huang, Knowledge and Information Discovery Lab, Dept. of Electrical Engineering, National Cheng Kung University

Presentation Transcript


  1. CUT: Community Update and Tracking in Dynamic Social Networks • Hao-Shang Ma and Jen-Wei Huang • Knowledge and Information Discovery Lab, Dept. of Electrical Engineering, National Cheng Kung University • The 7th Workshop on Social Network Mining and Analysis (SNA-KDD'13), held jointly with the 19th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD'13)

  2. About Me • Jen-Wei Huang (黃仁暐) • Knowledge and Information Discovery Lab • Dept. of Electrical Engineering, National Cheng Kung University • Email: jwhuang @ mail.ncku.edu.tw • http://kid.ee.ncku.edu.tw KID Lab, National Cheng Kung University

  3. Research • Data Mining and Database • Time Series Mining • Social Network Analysis • Multimedia Information Retrieval • Ubiquitous Computing • Mobile Computing • Cloud Computing • Bioinformatics

  4. Outline • Introduction • CUT Algorithm • Experiments • Conclusions • References

  5. Introduction • Social networking websites let users build their own personal communities or social networks based on friend relationships. http://www.facebook.com/ http://twitter.com/

  6. Introduction • Based on the relationships between users, social networks exhibit a community structure.

  7. Introduction • Community detection usually partitions the nodes of a network into groups such that nodes in the same group are densely connected to one another. • An objective function is chosen to determine the quality of a community. • Modularity [1] measures the quality of a partition in terms of the number of intra-community and inter-community edges.
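
As a rough illustration of the modularity measure mentioned above, the sketch below computes the Newman-Girvan modularity of a partition as Q = Σ_c [L_c/m − (d_c/2m)²], where m is the number of edges, L_c the number of edges inside community c, and d_c the total degree of its nodes. The function name `modularity` and the edge-list and dictionary inputs are illustrative assumptions, not taken from the slides.

```python
from collections import defaultdict

def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, unweighted graph.

    edges: iterable of (u, v) pairs, each undirected edge listed once.
    community_of: dict mapping every node to a community label.
    """
    edges = list(edges)
    m = len(edges)
    if m == 0:
        return 0.0
    intra = defaultdict(int)   # edges with both endpoints in the community
    degree = defaultdict(int)  # summed degree of the community's nodes
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            intra[cu] += 1
    # Observed intra-community edge fraction minus the fraction expected
    # if edges were placed at random with the same degrees.
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)

# Two triangles joined by a single edge, split into their natural groups.
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
labels = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}
q = modularity(edges, labels)
```

Putting every node into a single community gives Q = 0 under this definition, so a positive value indicates more intra-community edges than expected by chance.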

  8. Introduction • Social networks change constantly over time. • We want to identify the community structure of a network quickly and efficiently at every timestamp. • We update the network structure by tracking previously known information instead of recalculating the relationships of all nodes and edges in the network.

  9. Introduction • In this work, we define the seed of a community as a collection of 3-cliques that are connected to one another through shared edges. • By tracking seeds of communities, we can efficiently update and track the dynamics of communities in a social network.

  10. Example Network and 3-clique

  11. CUT Algorithm • We propose the CUT (Community Update and Tracking) algorithm to update and track seeds of communities. • The CUT algorithm has two phases. • Initial phase (executed only once): find the seeds of communities, then extend the seeds of communities to communities • Update and tracking phase: maintain and update the CAB graph

  12. Find Seed of Communities • 1. Find all 3-cliques in the network • 2. Build the CAB (Clique Adjacent Bipartite) graph • 3. Determine the seeds of communities in the network

  13. Find All 3-cliques • Backtracking algorithm
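
The slide names only a backtracking algorithm, so the following is one possible backtracking enumeration of 3-cliques, not necessarily the authors' version. The function name `find_3_cliques` and the edge-list input are assumptions, and node labels are assumed to be mutually comparable (for example, integers).

```python
def find_3_cliques(edges):
    """Return every 3-clique (triangle) once, as a sorted tuple of nodes."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    cliques = []

    def extend(partial, candidates):
        # `candidates` holds nodes adjacent to every node in `partial`
        # and larger than the last one, so each clique appears once.
        if len(partial) == 3:
            cliques.append(tuple(partial))
            return
        for node in sorted(candidates):
            narrowed = {c for c in candidates if c > node and c in adj[node]}
            extend(partial + [node], narrowed)

    extend([], set(adj))
    return cliques

# A triangle with one pendant edge, and a complete graph on four nodes.
small = find_3_cliques([(1, 2), (2, 3), (1, 3), (3, 4)])
k4 = find_3_cliques([(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)])
```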

  14. All 3-cliques in the Network

  15. Clique Adjacent Bipartite Graph

  16. All 3-cliques in CAB

  17. Determine Seed of Community • A DFS-like algorithm finds the connected components of the CAB graph

  18. CAB Graph • Tracking the CAB graph is less complex than tracking the original graph • Building the CAB graph takes O(3|C|) = O(|C|), since each 3-clique contributes three edges • Determining the connected components takes O(3|C|) = O(|C|) • Seeds of communities are easy to combine or split
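
The CAB construction and the connected-component step can be sketched as follows. This is a minimal sketch under stated assumptions: the bipartite graph is stored as a dictionary from each edge to the ids of the 3-cliques containing it, and two 3-cliques are placed in the same seed when a chain of shared edges links them. The names `build_cab` and `find_seeds` are hypothetical.

```python
from itertools import combinations

def build_cab(cliques):
    """Map each edge to the ids of the 3-cliques that contain it."""
    edge_to_cliques = {}
    for cid, clique in enumerate(cliques):
        # Each 3-clique contributes exactly three edges.
        for edge in combinations(sorted(clique), 2):
            edge_to_cliques.setdefault(edge, set()).add(cid)
    return edge_to_cliques

def find_seeds(cliques):
    """Group clique ids into connected components of the bipartite graph."""
    edge_to_cliques = build_cab(cliques)
    seen = set()
    seeds = []
    for start in range(len(cliques)):
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = []
        while stack:  # iterative depth-first search
            cid = stack.pop()
            component.append(cid)
            for edge in combinations(sorted(cliques[cid]), 2):
                for other in edge_to_cliques[edge]:
                    if other not in seen:
                        seen.add(other)
                        stack.append(other)
        seeds.append(sorted(component))
    return seeds

# Cliques 0 and 1 share edge (2, 3); clique 2 is isolated.
seeds = find_seeds([(1, 2, 3), (2, 3, 4), (5, 6, 7)])
```

Two 3-cliques that share only a node, such as (1, 2, 3) and (3, 4, 5), end up in different seeds under this assumption.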

  19. Extend to Communities • Ignore sparse nodes whose degree is smaller than 2. • Assign the remaining nodes to the closest seed of community • Closest: the seed of community that has the most links to the node
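
The extension step above might look like the sketch below. Several details are assumptions that the slide does not specify: seeds are given as sets of nodes, links are counted against the original seed nodes in a single pass, ties go to the first seed, and a node with no link to any seed stays unassigned. The function name `extend_to_communities` is hypothetical.

```python
def extend_to_communities(edges, seeds):
    """Grow each seed (a set of nodes) into a community.

    Nodes with degree < 2 are ignored; every other unassigned node joins
    the seed it has the most links to.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    communities = [set(seed) for seed in seeds]
    seeded = set().union(*seeds)
    for node in sorted(adj):
        if node in seeded or len(adj[node]) < 2:
            continue  # already placed, or too sparse to assign
        links = [len(adj[node] & seed) for seed in seeds]
        if links and max(links) > 0:
            # Assumption: ties are broken in favor of the first seed.
            communities[links.index(max(links))].add(node)
    return communities

# Node 7 has two links to the first seed and one to the second;
# node 8 has degree 1 and is ignored.
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6),
         (7, 1), (7, 2), (7, 4), (8, 6)]
result = extend_to_communities(edges, [{1, 2, 3}, {4, 5, 6}])
```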

  20. Update and Tracking Phase • Maintain and update the CAB graph • When the network changes, handle the following cases • Case 1: New nodes and new edges are added • Case 2: Old nodes and edges are removed • Extend to communities

  21. Case 1: Merge and Join • New nodes: 20, 21 • New edges: (2,8), (5,20), (9,20), (11,21) • New 3-cliques: (2,6,8) and (5,9,20)

  22. Case 1: Merge and Join • If any two edges of a new 3-clique Ck link to different seeds of communities, Si and Sj, we Merge(Si, Sj) • Otherwise, if any edge of Ck links to a seed Si, we Join(Si, Ck) • Complexity is O(3|new C|) = O(|new C|)
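
The Merge and Join rules can be illustrated with the sketch below. It assumes seeds are stored in a dictionary from seed id to a set of 3-cliques, with a second dictionary from edge to seed id; both structures and the name `add_clique` are hypothetical. Starting a fresh seed when no edge of the new clique touches an existing seed is also an assumption, since the slide does not cover that situation.

```python
from itertools import combinations

def add_clique(seeds, edge_to_seed, clique):
    """Insert a new 3-clique, merging or joining seeds as needed.

    seeds: dict seed_id -> set of 3-cliques (sorted tuples).
    edge_to_seed: dict edge -> seed_id of the seed that contains the edge.
    Returns the id of the seed that now holds the clique.
    """
    clique = tuple(sorted(clique))
    edges = list(combinations(clique, 2))
    touched = sorted({edge_to_seed[e] for e in edges if e in edge_to_seed})

    if touched:
        target = touched[0]            # Join: the clique enters this seed
        for other in touched[1:]:      # Merge: fold the other seeds in
            seeds[target] |= seeds.pop(other)
            for e, sid in list(edge_to_seed.items()):
                if sid == other:
                    edge_to_seed[e] = target
    else:
        # Assumption: an unattached clique starts a seed of its own.
        target = max(seeds, default=-1) + 1
        seeds[target] = set()

    seeds[target].add(clique)
    for e in edges:
        edge_to_seed[e] = target
    return target

seeds = {0: {(1, 2, 3)}, 1: {(4, 5, 6)}}
edge_to_seed = {(1, 2): 0, (1, 3): 0, (2, 3): 0,
                (4, 5): 1, (4, 6): 1, (5, 6): 1}
first = add_clique(seeds, edge_to_seed, (3, 4, 7))   # touches no seed
second = add_clique(seeds, edge_to_seed, (2, 3, 7))  # merges seeds 0 and 2
third = add_clique(seeds, edge_to_seed, (4, 5, 7))   # merges seeds 0 and 1
```

The relabeling loop here scans every stored edge, so this sketch does not reach the constant per-clique cost stated on the slide; a union-find structure over seed ids would be one way to avoid the scan.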

  23. Case 2: Split and Removal • If nodes are removed, we find all edges connected to the removed nodes. • Example: node 10 is removed, so edges (4,10), (6,10), (8,10), (10,12), and (10,11) are removed.

  24. Node Removed Case - Split • Remove the corresponding edges and cliques • Run the FindSeedofCommunity algorithm again to obtain the updated seeds of communities • Complexity is O(3|C| + |removed C|)
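
A small sketch of the split case: drop every 3-clique that contains the removed node, then recompute the seeds over what remains. The `find_seeds` helper here is a stand-in for the FindSeedofCommunity step and assumes that cliques linked through shared edges form one seed; `remove_node` is a hypothetical name.

```python
from itertools import combinations

def find_seeds(cliques):
    """Group 3-cliques (tuples) into seeds via shared edges."""
    edge_to_cliques = {}
    for clique in cliques:
        for edge in combinations(sorted(clique), 2):
            edge_to_cliques.setdefault(edge, []).append(clique)
    seen, seeds = set(), []
    for start in cliques:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:  # depth-first search over cliques sharing an edge
            clique = stack.pop()
            component.add(clique)
            for edge in combinations(sorted(clique), 2):
                for other in edge_to_cliques[edge]:
                    if other not in seen:
                        seen.add(other)
                        stack.append(other)
        seeds.append(component)
    return seeds

def remove_node(cliques, node):
    """Drop cliques containing `node`, then recompute the seeds."""
    remaining = [c for c in cliques if node not in c]
    return remaining, find_seeds(remaining)

# One seed held together by node 10 splits in two once it is removed.
cliques = [(1, 2, 3), (2, 3, 10), (3, 4, 10), (3, 4, 5)]
before = find_seeds(cliques)
remaining, after = remove_node(cliques, 10)
```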

  25. Joint Case • New nodes are added and edges are removed at the same time

  26. Joint Case • We handle Case 1 first and then Case 2, which reduces unnecessary splits. • Finally, we extend the seeds of communities to communities.

  27. Related Works - Update the Community Structure • Nam P. Nguyen et al. propose the QCA algorithm. [9] • QCA uses the already known community structure and handles each type of change (new nodes, new edges, removed nodes, and removed edges) based on modularity. • QCA keeps the whole community structure at each timestamp. • It runs the original CPM every time in the removal case, which is time-consuming. • It must also identify which type of case each changed node or edge belongs to, which costs additional time.

  28. Experiments • Coauthor network (2002~2010) • 1. About 20000 authors in one network • 2. Densely connected graph • 3. Each time period spans five years; t1 is 2002-2006 (first update) • 4. The network changes little between timestamps

  29. Experiments

  30. Experiments • p2p-Gnutella network • 1. t1-t4 are snapshots from August 4 to 7, 2002, with about 6000 nodes • 2. Sparsely connected graph • 3. The network changes substantially between timestamps

  31. Experiments

  32. Conclusions • We design the CUT algorithm to update community structures in dynamic social networks instead of recalculating the relationships of all nodes and edges in the network. • Keeping seeds of communities in memory at each timestamp is more efficient than keeping all communities. • Using the Clique Adjacent Bipartite graph to update and track seeds of communities leads to lower complexity.

  33. References • M. E. J. Newman and M. Girvan, "Finding and evaluating community structure in networks," Phys. Rev. E 69, 2004. • Bowen Yan and Steve Gregory, "Detecting Communities in Networks by Merging Cliques," ICIS, 2009. • A. Clauset, M. E. J. Newman, and C. Moore, "Finding community structure in very large networks," Phys. Rev. E 70, 066111, 2004. • Zhengzhang Chen, Kevin A. Wilson, Ye Jin, William Hendrix, and Nagiza F. Samatova, "Detecting and Tracking Community Dynamics in Evolutionary Networks," ICDMW, 2010.

  34. References • Yi Wang, Bin Wu, and Xin Pei, "CommTracker: A Core-Based Algorithm of Tracking Community Evolution," ADMA, 2008. • Nam P. Nguyen, Thang N. Dinh, Ying Xuan, and My T. Thai, "Adaptive Algorithms for Detecting Community Structure in Dynamic Social Networks," INFOCOM, 2011. • Vincent D. Blondel, Jean-Loup Guillaume, Renaud Lambiotte, and Etienne Lefebvre, "Fast unfolding of communities in large networks," JSTAT, 2008. • Nan Du, Bin Wu, Xin Pei, Bai Wang, and Liutong Xu, "Community Detection in Large-Scale Social Networks," SNA-KDD, 2007.
