
Scalable Learning of Collective Behavior Based on Sparse Social Dimensions

Lei Tang, Huan Liu. CIKM ’09. Speaker: Hsin-Lan, Wang. Date: 2010/02/01.


Presentation Transcript


  1. Scalable Learning of Collective Behavior Based on Sparse Social Dimensions • Lei Tang, Huan Liu • CIKM ’09 • Speaker: Hsin-Lan, Wang • Date: 2010/02/01

  2. Outline • Introduction • Collective Behavior Learning • Social Dimensions • Algorithm • Edge-Centric View • K-means Variant • Experiment Setup • Experiment Results • Conclusions and Future Work

  3. Introduction • Social media make it easy for people from all walks of life to connect with each other. • We study how networks in social media can help predict certain kinds of human behavior and individual preferences.

  4. Introduction • In social media, the connections within a single network are not homogeneous: different connections can represent different types of relations. However, this relation-type information is rarely available in practice. • A framework based on social dimensions is proposed to address this heterogeneity.

  5. Introduction • In the initial study, modularity maximization was used to extract social dimensions. • With a huge number of actors, the resulting dimensions cannot even be held in memory. • In this work, we propose an effective edge-centric approach to extract sparse social dimensions.

  6. Collective Behavior Learning • When people are exposed to a social network environment, their behavior can be influenced by the behavior of their friends. • People are more likely to connect to others who are similar to them in some way.

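  7. Collective Behavior Learning • $K$ class labels $\mathcal{Y} = \{c_1, \dots, c_K\}$ • network $G = (V, E, Y)$, where $V$ is the vertex set, $E$ is the edge set, and $Y_i \subseteq \mathcal{Y}$ are the class labels of a vertex $v_i \in V$ • Given known values of $Y_i$ for some subset of vertices $V^L$, • how can we infer the values of $Y_i$ for the remaining vertices $V^U = V - V^L$?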

  8. Social Dimensions

  9. Social Dimensions • To address the heterogeneity present in connections, we have proposed a framework (SocDim) for collective behavior learning. • The SocDim framework consists of two steps: 1. social dimension extraction 2. discriminative learning

  10. Social Dimensions • These social dimensions can be treated as features of actors. • Since the network is converted into features, a typical classifier such as a support vector machine can be employed.
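
The "dimensions as features" idea on this slide can be sketched roughly as follows. The node ids, affiliations, and labels are made up for illustration, and a simple perceptron stands in for the support vector machine named on the slide so that the example needs no third-party library; it is a sketch of the discriminative-learning step, not the authors' implementation.

```python
# Hypothetical sparse social dimensions: node id -> set of affiliation ids.
dims = {1: {0, 1}, 2: {0}, 3: {0}, 4: {1}, 5: {1}, 6: {1}}
# Hypothetical known labels (+1 / -1) for a subset of the nodes.
known = {2: +1, 5: -1}


def train_perceptron(dims, known, epochs=10):
    """Learn one weight per affiliation from the labeled nodes (perceptron rule)."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        mistakes = 0
        for node, y in known.items():
            score = bias + sum(weights.get(d, 0.0) for d in dims[node])
            pred = 1 if score > 0 else -1
            if pred != y:
                # Move the weights of the active affiliations toward the true label.
                for d in dims[node]:
                    weights[d] = weights.get(d, 0.0) + y
                bias += y
                mistakes += 1
        if mistakes == 0:
            break
    return weights, bias


def predict(weights, bias, affiliations):
    """Classify a node from its affiliations alone."""
    score = bias + sum(weights.get(d, 0.0) for d in affiliations)
    return 1 if score > 0 else -1


weights, bias = train_perceptron(dims, known)
predictions = {n: predict(weights, bias, dims[n]) for n in dims if n not in known}
print(predictions)
```

Because each node is reduced to a feature vector, any off-the-shelf classifier could replace the perceptron here without touching the extraction step.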

  11. Social Dimensions • Concerns about the scalability of SocDim with modularity maximization: • The social dimensions extracted by modularity maximization are dense. • Extracting them requires computing the top eigenvectors of a modularity matrix of size n × n. • The dynamic nature of networks calls for efficient updates of the model for collective behavior prediction.
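
To make the density concern concrete, here is a small sketch. It assumes the standard definition of the modularity matrix, B = A − d dᵀ / (2m), which the slide does not spell out; the toy path graph and the actor and dimension counts in `dense_bytes` are illustrative only.

```python
def modularity_matrix(n, edges):
    """Dense n x n matrix with B[i][j] = A[i][j] - d_i * d_j / (2m)."""
    adj = [[0.0] * n for _ in range(n)]
    deg = [0] * n
    for u, v in edges:
        adj[u][v] = adj[v][u] = 1.0
        deg[u] += 1
        deg[v] += 1
    two_m = sum(deg)  # each undirected edge contributes 2 to the degree sum
    return [[adj[i][j] - deg[i] * deg[j] / two_m for j in range(n)] for i in range(n)]


def count_nonzero(matrix):
    return sum(1 for row in matrix for x in row if x != 0.0)


def dense_bytes(n_actors, n_dims, bytes_per_value=8):
    """Memory needed to store a dense actors-by-dimensions matrix of floats."""
    return n_actors * n_dims * bytes_per_value


path_edges = [(0, 1), (1, 2), (2, 3)]      # toy 4-node path graph
B = modularity_matrix(4, path_edges)
nonzero_adjacency = 2 * len(path_edges)    # symmetric adjacency entries
nonzero_modularity = count_nonzero(B)
print(nonzero_adjacency, nonzero_modularity)
print(dense_bytes(1_000_000, 1_000) / 1e9, "GB")
```

Even on this tiny graph the adjacency matrix has only a few nonzero entries while every entry of the modularity matrix is nonzero, which is why the matrix and the dense dimensions derived from it become hard to store as n grows.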

  12. Algorithm -Edge-Centric View • Treat each edge as one instance, and the nodes that define edges as features.
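
A minimal sketch of the edge-centric view described above, using a made-up edge list: each edge becomes one instance whose only nonzero features are its two endpoint nodes. The function name `edge_features` is hypothetical.

```python
# Toy undirected network; the node ids and edges are made up for illustration.
edges = [(1, 3), (1, 4), (2, 3), (1, 5), (5, 6)]


def edge_features(edges):
    """Return one sparse row per edge: {node_id: 1} for each of its two endpoints."""
    return [{u: 1, v: 1} for u, v in edges]


rows = edge_features(edges)
for edge, row in zip(edges, rows):
    print(edge, row)
```

Every row has exactly two nonzero entries regardless of network size, so the edge-by-node matrix stays sparse.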

  13. Algorithm -Edge-Centric View • Based on the features of each edge, we can cluster the edges into two sets. • An actor is considered associated with an affiliation as long as any of its connections is assigned to that affiliation.

  14. Algorithm -Edge-Centric View • In summary, to extract social dimensions, we cluster the edges of a network, rather than its nodes, into disjoint sets. • Because an actor cannot have more affiliations than connections, the social dimensions based on edge-centric clustering are guaranteed to be sparse.
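
The step from edge clusters to node affiliations, and the sparsity bound stated above, can be sketched as follows. The edge list and the cluster assignment are hypothetical; how the edges get clustered is left out here.

```python
from collections import defaultdict

# Toy network: two triangles joined through the edge (1, 4). Made up for illustration.
edges = [(1, 2), (1, 3), (2, 3), (1, 4), (4, 5), (4, 6), (5, 6)]
# Hypothetical result of some edge clustering: one cluster id per edge.
edge_cluster = [0, 0, 0, 1, 1, 1, 1]


def social_dimensions(edges, edge_cluster):
    """A node joins an affiliation if any of its edges was assigned to it."""
    affiliations = defaultdict(set)
    for (u, v), cluster in zip(edges, edge_cluster):
        affiliations[u].add(cluster)
        affiliations[v].add(cluster)
    return dict(affiliations)


def degrees(edges):
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)


dims = social_dimensions(edges, edge_cluster)
deg = degrees(edges)
for node in sorted(dims):
    print(node, sorted(dims[node]), "degree:", deg[node])
```

Node 1 ends up in both affiliations because its edges fall into different clusters, while no node can collect more affiliations than it has edges.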

  15. Algorithm -K-means Variant
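
The algorithm listing for this slide did not survive in the transcript. As a rough stand-in, the sketch below runs plain k-means over sparse edge vectors with cosine similarity. It is not the authors' scalable variant: the deterministic initialization via `init`, the fixed iteration cap, and the toy graph are all assumptions made for illustration.

```python
import math


def cosine(row, centroid):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * centroid.get(n, 0.0) for n, w in row.items())
    norm_r = math.sqrt(sum(w * w for w in row.values()))
    norm_c = math.sqrt(sum(w * w for w in centroid.values()))
    return dot / (norm_r * norm_c) if norm_r and norm_c else 0.0


def kmeans_edges(edges, init, iters=20):
    """Cluster edges; `init` lists the indices of the edges used as first centroids."""
    rows = [{u: 1.0, v: 1.0} for u, v in edges]
    centroids = [dict(rows[i]) for i in init]
    k = len(centroids)
    assign = [-1] * len(rows)
    for _ in range(iters):
        # Assignment step: each edge goes to its most similar centroid.
        new_assign = [max(range(k), key=lambda c: cosine(r, centroids[c])) for r in rows]
        if new_assign == assign:
            break
        assign = new_assign
        # Update step: each centroid becomes the mean of its member edges.
        for c in range(k):
            members = [rows[i] for i, a in enumerate(assign) if a == c]
            if not members:
                continue
            mean = {}
            for r in members:
                for n, w in r.items():
                    mean[n] = mean.get(n, 0.0) + w / len(members)
            centroids[c] = mean
    return assign


# Two triangles joined by the bridge edge (3, 4); made up for illustration.
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]
assignment = kmeans_edges(edges, init=(0, 6))
print(assignment)
```

Since centroids are also sparse, an edge only needs to be compared against centroids that share a node with it; exploiting that is one plausible route to scalability, though this sketch compares against every centroid for simplicity.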

  16. Algorithm

  17. Experiment Setup -Social Media Data

  18. Experiment Results -Prediction Performance

  19. Experiment Results -Prediction Performance

  20. Experiment Results -Prediction Performance • Prediction performance on all the studied social media data is only around 20-30% in terms of the F1 measure. This is partly due to: • the large number of labels in the data • the use of network information alone
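
For reference, here is one way the F1 measure can be computed in a multi-label setting. The transcript does not say which averaging was used, so the sketch shows both micro- and macro-averaged F1; the true and predicted label sets are made up.

```python
def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(true_sets, pred_sets, labels):
    tp = {l: 0 for l in labels}
    fp = {l: 0 for l in labels}
    fn = {l: 0 for l in labels}
    for truth, pred in zip(true_sets, pred_sets):
        for l in labels:
            if l in pred and l in truth:
                tp[l] += 1
            elif l in pred:
                fp[l] += 1
            elif l in truth:
                fn[l] += 1
    # Micro: pool the counts over all labels; macro: average the per-label F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    macro = sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels)
    return micro, macro


# Hypothetical ground truth and predictions for four actors.
true_sets = [{"a"}, {"a", "b"}, {"b"}, {"c"}]
pred_sets = [{"a"}, {"a"}, {"b", "c"}, {"a"}]
micro, macro = micro_macro_f1(true_sets, pred_sets, ["a", "b", "c"])
print(round(micro, 3), round(macro, 3))
```

With many labels, rare labels that are never predicted correctly pull the macro average down sharply, which fits the slide's remark about the large number of labels.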

  21. Experiment Results -Scalability Study

  22. Experiment Results -Scalability Study

  23. Experiment Results -Sensitivity Study

  24. Conclusions and Future Work • To address the scalability issue, we propose an edge-centric clustering scheme to extract social dimensions and a scalable k-means variant to handle edge clustering. • The model based on sparse social dimensions shows prediction performance comparable to that of earlier approaches to extracting social dimensions.

  25. Conclusions and Future Work • In reality, each edge can be associated with multiple affiliations, whereas our current model assumes only one dominant affiliation per edge. • The proposed EdgeCluster model is sensitive to the number of social dimensions.
