This study explores a graph-based link prediction approach in social networks through a Pareto-optimal genetic algorithm. Key components include betweenness centrality, community detection, and a 10-dimensional genetic algorithm. The system aims to find optimal friend suggestions based on shared attributes and interactions. The evolutionary process involves fitness selection, crossover, and mutation to improve predictions. Future work includes enhancing the fitness function, weighted optimization, and validation with sociological input.
A Graph-Based Approach to Link Prediction in Social Networks Using a Pareto-Optimal Genetic Algorithm
Jeff Naruchitparames
University of Nevada, Reno - CSE
CS 790: Complex Networks, Fall 2010
[Figure panels: biological, social]
Social networks =
• Dynamic, judgmental environment
• Affect friendships over time
Heterogeneous, very dynamic
1-2 hop distance only
• Friend-of-friend
Multiple hops; >1
• Structural; purely graph-based
• No explicit correlation between potential friends...
Silva et al.
• A Graph-based Recommendation System Using Genetic Algorithms, 2010
Friends-of-Friends (2 hops), Filter, Order
Filtering
“It’s more probable that you know a friend of your friend than any other random person”
Mitchell M., Complex Systems: Network Thinking, 2006.
Indexes
What’s missing?
• Heterogeneity
• Human behavior and preferences
• Multiple hops
My approach
• Pretty much a filtering problem...
My approach
• Components (for filtering)
• Betweenness centrality
• Community detection
• Clique Percolation Method (CPM)
• Friends of friends
• 10-dimensional Pareto-optimal genetic algorithm
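Betweenness centrality is the first filtering component listed. As a minimal sketch, assuming an unweighted, undirected friendship graph stored as a plain dict of neighbor lists, it can be computed with Brandes' algorithm. The function name and graph format are illustrative choices, and the slides do not say how the resulting scores feed into the filter.

```python
from collections import deque


def betweenness_centrality(graph):
    """Unnormalized betweenness for an undirected, unweighted graph.

    `graph` maps each node to an iterable of its neighbors (hypothetical
    format; every edge must appear in both directions).
    """
    centrality = {v: 0.0 for v in graph}
    for source in graph:
        # Breadth-first search that counts shortest paths from `source`.
        order = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        sigma[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies, farthest nodes first.
        delta = {v: 0.0 for v in graph}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                centrality[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2 for v, c in centrality.items()}
```

On a three-node path a, b, c, only the middle node b lies on a shortest path between two other nodes, so it is the only node with a nonzero score.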
• Remove duplicates
• Remove our test cases
• (More on this later...)
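The friends-of-friends step with duplicate and test-case removal might look like the sketch below. It assumes one reading of the slides: test cases are known friends of the user that are hidden before candidates are gathered, so that they can be checked against the recommendations later. The function name, the `held_out` parameter, and the dict-of-sets graph format are all hypothetical.

```python
def friend_of_friend_candidates(graph, user, held_out=frozenset()):
    """Return the users exactly two hops from `user`, deduplicated.

    `graph` maps each user to a set of friends. `held_out` lists friends of
    `user` that are hidden as test cases (an assumed interpretation).
    """
    direct = set(graph.get(user, ())) - set(held_out)
    candidates = set()  # a set removes duplicates automatically
    for friend in direct:
        candidates.update(graph.get(friend, ()))
    candidates -= direct        # already friends, nothing to suggest
    candidates.discard(user)    # never suggest the user to themselves
    return candidates


graph = {
    "ann": {"bob", "cat", "dan"},
    "bob": {"ann", "cat", "eve"},
    "cat": {"ann", "bob", "eve", "fay"},
    "dan": {"ann"},
    "eve": {"bob", "cat"},
    "fay": {"cat"},
}
suggestions = friend_of_friend_candidates(graph, "ann", held_out={"bob"})
```

Here the hidden friend bob is still reachable through cat, so he reappears among the candidates, which is what makes him usable as a test case.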
The Features
• # of shared friends
• location
• age range
• general interest
• music
• attended same events
• groups
• movies
• education
• religion/politics
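The deck describes the genetic algorithm as binary and 10-dimensional, so one plausible encoding is a chromosome with one bit per feature, where a 1 means the feature takes part in the friendship decision. The sketch below assumes that encoding; the identifier names are illustrative and not taken from the original code.

```python
# One entry per feature on the slide, in the same order (names are illustrative).
FEATURES = [
    "shared_friends",
    "location",
    "age_range",
    "general_interest",
    "music",
    "events",
    "groups",
    "movies",
    "education",
    "religion_politics",
]


def selected_features(chromosome):
    """Return the feature names switched on by a 10-bit chromosome."""
    if len(chromosome) != len(FEATURES):
        raise ValueError("chromosome must have one bit per feature")
    return [name for name, bit in zip(FEATURES, chromosome) if bit]


example = selected_features([1, 0, 0, 0, 1, 0, 0, 0, 0, 0])
```

With this encoding the search space is the 2^10 = 1024 possible feature subsets.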
Pareto Optimality
• Localized to implementation of selection
• Feature subset selection
• We want to find the best combination of these subsets that can give us the best solutions for how we determine friendships
Pareto Optimality
• Compare with the test cases we removed earlier...
• For all chromosomes in population, do:
  • If ALL test cases ≥ optimal Pareto front
    • Calculate fitness
    • Good to go
  • Else
    • Calculate fitness
    • Continue onto next chromosome
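The comparison against a Pareto front rests on the standard notion of dominance: one score vector dominates another if it is at least as good in every dimension and strictly better in at least one. The sketch below shows that textbook definition for vectors where larger is better. It is not the author's exact test-case check, which the slides do not spell out.

```python
def dominates(a, b):
    """True if score vector `a` Pareto-dominates `b` (larger is better)."""
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Return the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


scores = [(3, 1), (1, 3), (2, 2), (1, 1), (2, 1)]
front = pareto_front(scores)
```

In this toy example (3, 1), (1, 3) and (2, 2) are mutually non-dominated and form the front, while (1, 1) and (2, 1) are dominated by (3, 1).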
Fitness Function
$$\sum_{i=1}^{n} \sum_{j=1}^{10} p_i \ln(f_j)\, p_{i-1}$$
Continuing on with the Evolutionary Process
• Apply fitness proportional selection
• Randomly select 2 parents to mate
• Apply 1-point crossover (82% chance)
• Bit mutation (0.05% chance)
• Do this until ALL test cases better than Pareto front OR fitness does not improve for 5 consecutive generations
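The operators listed above can be sketched as follows, using the crossover and mutation rates from the slide. Several details are assumptions: fitness values are taken to be non-negative, the mutation chance is applied per bit, both parents are drawn by roulette-wheel selection, and the stall check looks only at the best fitness per generation. All function names are hypothetical.

```python
import random

CROSSOVER_RATE = 0.82    # 82% chance, from the slide
MUTATION_RATE = 0.0005   # 0.05% chance, assumed to apply to each bit


def roulette_select(population, fitnesses, rng):
    """Fitness proportional selection; assumes non-negative fitness values."""
    total = sum(fitnesses)
    if total <= 0:
        return rng.choice(population)
    pick = rng.uniform(0, total)
    running = 0.0
    for chromosome, fitness in zip(population, fitnesses):
        running += fitness
        if pick < running:
            return chromosome
    return population[-1]  # guard against floating-point round-off


def one_point_crossover(a, b, rng):
    """Swap the tails of two parents at a random cut point."""
    if rng.random() >= CROSSOVER_RATE:
        return a[:], b[:]
    cut = rng.randint(1, len(a) - 1)
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


def mutate(chromosome, rng):
    """Flip each bit independently with probability MUTATION_RATE."""
    return [bit ^ 1 if rng.random() < MUTATION_RATE else bit for bit in chromosome]


def next_generation(population, fitnesses, rng):
    """Build a new population of the same size from selected parents."""
    children = []
    while len(children) < len(population):
        parent_a = roulette_select(population, fitnesses, rng)
        parent_b = roulette_select(population, fitnesses, rng)
        child_a, child_b = one_point_crossover(parent_a, parent_b, rng)
        children.extend([mutate(child_a, rng), mutate(child_b, rng)])
    return children[: len(population)]


def has_stalled(best_history, patience=5):
    """True if the best fitness has not improved in the last `patience` generations."""
    if len(best_history) <= patience:
        return False
    return max(best_history[-patience:]) <= max(best_history[:-patience])
```

A driver loop would call `next_generation` repeatedly and stop when either the Pareto condition on the test cases holds or `has_stalled` returns true. Passing an explicit `random.Random` instance keeps runs reproducible.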
Conclusion
• Complex network theory + Genetic algorithm + social theory
• Betweenness centrality
• Community detection
• Clique Percolation Method
• Binary 10-dimensional Pareto-optimal genetic algorithm
• Dominant, fitness proportional selection
• Several levels of filtering and selection (aka filtering ☺)
Future Work
• Better fitness function (need to ask sociologists)
• Weighted chromosome for Pareto optimization (as opposed to binary)
• Prove all this stuff actually works (sociology standpoint??)
• Parallelize or GPU-ize the code (it’s in Python)