This workshop delves into social influence and link prediction in online social networks, examining theories like self-interest, contagion, and homophily to predict future links. Data-driven analysis on Anobii.com offers insights into link creation dynamics and network behaviors. The speaker, Luca Maria Aiello from Università degli Studi di Torino, shares findings on dataset analysis, topical overlap, and homophily and influence dynamics. The research aims to understand profile similarity, geographic overlap, and causality between similarity and link creation. Strategies like triadic closure and supervised learning approaches are explored to predict link formation accurately.
A glimpse on social influence and link prediction in OSNs Workshop on Data Driven Dynamical Networks Speaker: • Luca Maria Aiello, PhD student • Università degli Studi di Torino • Computer Science Department • aiello@di.unito.it Keywords: link creation, link prediction, homophily, social influence, aNobii
Acknowledgments • Università degli Studi di Torino: Giancarlo Ruffo, Rossano Schifanella • ISI Foundation: Alain Barrat, Ciro Cattuto • School of Informatics and Computing, Indiana University: Filippo Menczer
Dynamics leading to link creation • Several theories from sociology: self-interest, mutual-interest, exchange, contagion (influence), balance, homophily, proximity • 2nd part: exploit the observations on these phenomena to predict future links [Figure labels: food networks, collaboration networks, social media] Les Houches 2010 - Luca Maria Aiello, Università degli Studi di Torino
Outline • Dataset • Topical overlap • Homophily and influence • Link prediction • Conclusions
Data-driven analysis on anobii.com: social network for bookworms • Profile features: library and wishlist, groups, tags • Social network: directed, friendship + neighborhood • 6 snapshots, 15 days apart • Full giant connected component
Basic statistics [Figure: log-log plot of ng(kout), nb(kout) and nw(kout) versus kout, axis ticks in powers of ten up to 10^3] • Broad distributions • Positive correlations between connectivity and activity • Assortativity
Triadic closure • Classification of new links at time t+1 between nodes already present at time t (t ∈ {1,…,5}) [Figure: link categories Double closure, Closure, Bidirectional, Direct, Reciprocated; extracted values 75%, 20%, 30%, 25%, 10%] • Reciprocation is strong (exchange) • Users tend to choose “friends of their friends” as new friends (balance)
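The classification of new links can be sketched roughly as follows. This is an illustration only: the function name `classify_new_links` and the exact category definitions are assumptions, since the slide does not define them, and "double closure" is left out because its definition is not recoverable from the transcript.

```python
def classify_new_links(edges_t, edges_t1):
    """Label each link created between two snapshots (assumed definitions).

    edges_t, edges_t1: sets of directed (source, target) pairs.
    Only links between nodes already present at time t are considered.
    """
    nodes_t = {n for edge in edges_t for n in edge}
    # Undirected neighbor sets at time t, used to detect shared contacts.
    neighbors = {n: set() for n in nodes_t}
    for u, v in edges_t:
        neighbors[u].add(v)
        neighbors[v].add(u)

    labels = {}
    for u, v in edges_t1 - edges_t:
        if u not in nodes_t or v not in nodes_t:
            continue
        tags = set()
        if (v, u) in edges_t:
            tags.add("reciprocated")   # answers an existing v -> u link
        elif (v, u) in edges_t1:
            tags.add("bidirectional")  # both directions appear in the same interval
        if neighbors[u] & neighbors[v]:
            tags.add("closure")        # u and v shared a neighbor at time t
        labels[(u, v)] = tags or {"direct"}
    return labels


# Toy snapshots (hypothetical data).
edges_t = {("a", "b"), ("b", "c"), ("d", "a"), ("e", "b")}
edges_t1 = edges_t | {("a", "c"), ("a", "d"), ("c", "d"), ("d", "c"), ("e", "d")}
labels = classify_new_links(edges_t, edges_t1)
```

Counting the labels over all new links of a snapshot pair gives the kind of percentage breakdown shown on the slide.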
Profile similarity vs. social distance Does similarity between user profiles depend on the social distance? • Topical overlap • Statistical correlation because of assortative biases? • Null model to discern real overlap from purely statistical effects • No topical overlap other than that caused by statistical mixing patterns
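A minimal sketch of the similarity-versus-distance measurement with a null model is given here. It assumes an undirected adjacency dict, set-valued profiles, Jaccard similarity, and a plain random reshuffling of profiles among users; the slide does not specify any of these, so the null model actually used in the study may be more constrained.

```python
import random
from collections import deque, defaultdict


def bfs_distances(adj, source, max_dist):
    """Shortest-path distances from source, explored up to max_dist hops."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == max_dist:
            continue
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_by_distance(adj, profiles, max_dist=3):
    """Average profile similarity of user pairs, grouped by graph distance."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for u in adj:
        for v, d in bfs_distances(adj, u, max_dist).items():
            if d > 0:
                totals[d] += jaccard(profiles[u], profiles[v])
                counts[d] += 1
    return {d: totals[d] / counts[d] for d in sorted(counts)}


def shuffled_profiles(profiles, rng):
    """Null model (assumption): reassign profiles to users at random."""
    users = list(profiles)
    items = [profiles[u] for u in users]
    rng.shuffle(items)
    return dict(zip(users, items))


# Toy path graph a-b-c-d where only neighbors share an item (hypothetical data).
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
profiles = {"a": {1, 2}, "b": {2, 3}, "c": {3, 4}, "d": {4, 5}}
observed = similarity_by_distance(adj, profiles)
null = similarity_by_distance(adj, shuffled_profiles(profiles, random.Random(0)))
```

Comparing `observed` with `null`, averaged over many shuffles, indicates whether the overlap at short distance exceeds what random mixing alone produces.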
Geographical overlap • Null model test with random link rewiring • Country-level overlap due to language barriers • City-level overlap SocialCom 2010 - Luca Maria Aiello, Università degli Studi di Torino
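One way to run a rewiring test of this kind is sketched here. The degree-preserving target swap is an assumption (the slide says only "random link rewire"), and the helper names and toy data are hypothetical.

```python
import random
from collections import Counter


def same_country_fraction(edges, country):
    """Fraction of links whose endpoints declare the same country."""
    same = sum(1 for u, v in edges if country[u] == country[v])
    return same / len(edges)


def rewire(edges, n_swaps, rng):
    """Randomize directed links by swapping targets between pairs of edges.

    Every node keeps its in-degree and out-degree (assumed null model).
    """
    edge_list = list(edges)
    edge_set = set(edge_list)
    for _ in range(n_swaps):
        i = rng.randrange(len(edge_list))
        j = rng.randrange(len(edge_list))
        (a, b), (c, d) = edge_list[i], edge_list[j]
        new1, new2 = (a, d), (c, b)
        # Skip swaps that would create self-loops or duplicate edges.
        if a == d or c == b or new1 in edge_set or new2 in edge_set:
            continue
        edge_set -= {(a, b), (c, d)}
        edge_set |= {new1, new2}
        edge_list[i], edge_list[j] = new1, new2
    return edge_set


# Toy data (hypothetical): two users per country.
edges = {("a", "b"), ("b", "c"), ("c", "d"), ("d", "a"), ("a", "c"), ("b", "d")}
country = {"a": "IT", "b": "IT", "c": "FR", "d": "FR"}
observed = same_country_fraction(edges, country)
rewired = rewire(edges, n_swaps=100, rng=random.Random(0))
expected_by_chance = same_country_fraction(rewired, country)
```

An observed fraction well above the rewired one, over many randomizations, would point to geographic overlap beyond what the degree sequence alone explains.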
Causality between similarity and link creation • What is the cause of topical overlap? • Topical overlap is observed for all profile features • Three possible explanations: • Homophily (people connect with similar people) • Social influence (social connection conveys similarity) • Mixture of the two • Explore the causality relationship between profile similarity and social linking
Similarity → link creation (homophily) • Average similarity of pairs forming new links between t and t+1 (t=4), compared with average similarity of all the pairs at distance 2 at time t • Pairs that are going to get connected show a substantially higher similarity
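The comparison above can be illustrated with a short sketch. It assumes an undirected adjacency dict at time t, profiles stored as sparse count vectors (dicts), and cosine similarity; the function names and the toy data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (dicts)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(x * x for x in a.values())) * sqrt(sum(x * x for x in b.values()))
    return dot / norm if norm else 0.0


def pairs_at_distance_two(adj):
    """Unordered pairs that share a neighbor but are not directly linked."""
    pairs = set()
    for nbrs in adj.values():
        for u, v in combinations(sorted(nbrs), 2):
            if v not in adj[u]:
                pairs.add((u, v))
    return pairs


def mean_similarity(pairs, profiles):
    pairs = list(pairs)
    return sum(cosine(profiles[u], profiles[v]) for u, v in pairs) / len(pairs)


# Toy star graph at time t; the a-c link forms before t+1 (hypothetical data).
adj_t = {"a": {"b"}, "b": {"a", "c", "d"}, "c": {"b"}, "d": {"b"}}
profiles_t = {
    "a": {"x": 1, "y": 1},
    "b": {"x": 1},
    "c": {"x": 1, "y": 1},
    "d": {"z": 1},
}
new_links = {("a", "c")}

baseline = mean_similarity(pairs_at_distance_two(adj_t), profiles_t)
about_to_link = mean_similarity(new_links, profiles_t)
```

Here `about_to_link` exceeding `baseline` corresponds to the homophily signal described on the slide.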
Link creation → similarity (influence) • Evolution of the similarity between pairs linking together at different times [Figure panels: Groups, Books]
Summary • Theories to explain link creation • Self-interest • Mutual-interest • Exchange: reciprocity in linking • Contagion: social influence • Balance: triangle closure • Homophily: for all profile features • Proximity: geographical and on social graph • Can we exploit the observations on these phenomena to predict future links?
Link prediction • Snapshots at time t and t+1 • Predict links created between t and t+1 given the whole information at time t • Supervised learning approach to combine profile and structural features [Figure: learning set example]
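A possible way to assemble such a learning set from two snapshots is sketched here. The function name `build_learning_set`, the enumeration of all ordered node pairs, and the random undersampling of negatives are illustrative assumptions, not the procedure reported in the talk.

```python
import random
from itertools import permutations


def build_learning_set(edges_t, edges_t1, rng, balanced=True):
    """Label candidate pairs: 1 if the link appears by t+1, 0 otherwise.

    Candidates are ordered pairs of nodes present at time t that are not
    linked at time t. Enumerating all pairs is quadratic, so this only
    suits small graphs.
    """
    nodes = sorted({n for edge in edges_t for n in edge})
    candidates = [p for p in permutations(nodes, 2) if p not in edges_t]
    positives = [p for p in candidates if p in edges_t1]
    negatives = [p for p in candidates if p not in edges_t1]
    if balanced:
        # Undersample negatives so both classes have the same size.
        negatives = rng.sample(negatives, min(len(positives), len(negatives)))
    return [(p, 1) for p in positives] + [(p, 0) for p in negatives]


# Toy snapshots (hypothetical data).
edges_t = {("a", "b"), ("b", "c"), ("c", "d")}
edges_t1 = edges_t | {("a", "c"), ("b", "a")}
balanced_set = build_learning_set(edges_t, edges_t1, random.Random(0))
full_set = build_learning_set(edges_t, edges_t1, random.Random(0), balanced=False)
```

Each labeled pair would then be turned into a feature vector computed from the information available at time t only, so that the classifier never sees data from t+1.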
Features • Structural: common neighbors, distance on graph, preferential attachment, resource allocation, local path • Profile: library (cosine), groups (cosine), groups (size), gender {0,1}, town {0,1}, age (|age1 – age2|), country {0,1}, vocabulary (cosine), wishlists (cosine), tagging behavior
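Four of the structural features can be sketched with their usual textbook definitions. The undirected adjacency dict and the value of `epsilon` in the local path index are assumptions; the slide does not say how direction or parameters were handled.

```python
def common_neighbors(adj, u, v):
    """Number of neighbors shared by u and v."""
    return len(adj[u] & adj[v])


def preferential_attachment(adj, u, v):
    """Product of the two degrees."""
    return len(adj[u]) * len(adj[v])


def resource_allocation(adj, u, v):
    """Sum over shared neighbors of the inverse of their degree."""
    return sum(1.0 / len(adj[z]) for z in adj[u] & adj[v])


def local_path(adj, u, v, epsilon=0.01):
    """Walks of length 2 plus epsilon times walks of length 3 between u and v."""
    walks2 = len(adj[u] & adj[v])
    walks3 = sum(len(adj[x] & adj[v]) for x in adj[u])
    return walks2 + epsilon * walks3


# Toy undirected graph (hypothetical data): a and d are not linked.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c"},
}
features = {
    "common_neighbors": common_neighbors(adj, "a", "d"),
    "preferential_attachment": preferential_attachment(adj, "a", "d"),
    "resource_allocation": resource_allocation(adj, "a", "d"),
    "local_path": local_path(adj, "a", "d"),
}
```

In a supervised setting these values would sit next to the profile features (cosine similarities, binary matches, age difference) in one feature vector per candidate pair.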
Link prediction: preliminary results • Rotation forest, 10-fold cross-validation, balanced sets • Rotation forest, 10-fold cross-validation, unbalanced sets
Conclusions and future work • Theories on social network growth are verified • Causality between similarity and social connection • Effective link detection/prediction • Topical information seems to be predictive as well as structural information • RFC: • Link prediction sampling/evaluation procedure • New challenges in prediction
Workshop on Data Driven Dynamical Networks Thank you for your attention! Speaker: Luca Maria Aiello aiello@di.unito.it www.di.unito.it/~aiello Reference: L. M. Aiello, A. Barrat, C. Cattuto, G. Ruffo, R. Schifanella, "Link creation and profile alignment in the aNobii social network", in SocialCom'10: Proceedings of the 2nd IEEE International Conference on Social Computing, Minneapolis, MN, USA, August 2010