Search Engine Technology (10). Prof. Dragomir R. Radev radev@cs.columbia.edu. SET Fall 2013. … 16. (Social) networks Random graph models Properties of random graphs. …. SET Fall 2013. … 17. Small worlds Scale-free networks Power law distributions
Search Engine Technology (10) Prof. Dragomir R. Radev radev@cs.columbia.edu
SET Fall 2013 … 16. (Social) networks Random graph models Properties of random graphs. …
SET Fall 2013 … 17. Small worlds Scale-free networks Power law distributions Centrality …
Interleukin-2 receptor pathway protein interaction network (from HPRD). Peri et al., Nucleic Acids Res. 2004 January 1; 32(Database issue): D497–D501. doi: 10.1093/nar/gkh070.
American Journal of Sociology, Vol. 110, No. 1. "Chains of affection: The structure of adolescent romantic and sexual networks," Bearman PS, Moody J, Stovel K.
The New York Times May 21, 2005
Networks • The Web • Citation networks • Social networks • Protein interaction networks • Technological networks • Other networks • Movie actor networks • Co-occurrence of characters in Les Miserables • Board membership
Types of networks • Directed/undirected • Can have weights • Single-mode vs. bipartite (e.g., movie-actor graphs)
Dependency network • (Figure: dependency graph over the words "bought", "Meredith", "yesterday", "apples", "green")
Lexical networks • A special case of networks where nodes are words or documents and edges link semantically related nodes • Other examples: • Words used in dictionary definitions • Names of people mentioned in the same story • Words that translate to the same word
Analyzing networks • Clustering coefficient • Watts/Strogatz cc = #triangles/#triples, computed at each node (triangles through the node over triples centered on it) and averaged over all nodes • Diameter (longest shortest path) • Average shortest path (asp) • Strongly connected component (SCC) • Weakly connected component (WCC)
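The measures listed above can be sketched on a small example. This is a minimal illustration, not code from the course: the toy graph `EDGES` and all function names are assumptions, the graph is treated as undirected and connected, and the clustering coefficient is the per-node Watts/Strogatz version averaged over nodes.

```python
from collections import deque

# Hypothetical undirected toy graph: a triangle a-b-c with a tail c-d-e.
EDGES = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e")]


def build_adjacency(edges):
    """Map each node to the set of its neighbors (undirected)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def local_clustering(adj, v):
    """Fraction of pairs of v's neighbors that are themselves linked."""
    nbrs = list(adj[v])
    k = len(nbrs)
    if k < 2:
        return 0.0  # convention: no neighbor pairs means coefficient 0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in adj[nbrs[i]]
    )
    return links / (k * (k - 1) / 2)


def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, v) for v in adj) / len(adj)


def bfs_distances(adj, source):
    """Shortest-path lengths (in hops) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def path_statistics(adj):
    """Return (diameter, average shortest path); assumes a connected graph."""
    lengths = []
    for s in adj:
        dist = bfs_distances(adj, s)
        lengths.extend(dist[t] for t in adj if t != s)
    return max(lengths), sum(lengths) / len(lengths)


adj = build_adjacency(EDGES)
cc = average_clustering(adj)
diameter, asp = path_statistics(adj)
print(f"cc={cc:.3f} diameter={diameter} asp={asp:.2f}")
```

Breadth-first search is enough here because the edges are unweighted; a weighted graph would need a shortest-path algorithm such as Dijkstra's instead.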
Degree distribution • Uniform • Poisson • Power-law (with coefficient α).
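To make the Poisson case concrete, the sketch below builds a random graph in which each pair of nodes is linked independently with probability p and compares its empirical degree distribution with a Poisson distribution of the same mean. The parameters `n`, `p`, the seed, and the function names are illustrative assumptions, not values from the slides.

```python
import math
import random
from collections import Counter


def random_graph_degrees(n, p, seed=0):
    """Degrees of a G(n, p) random graph: each pair linked with probability p."""
    rng = random.Random(seed)
    degree = [0] * n
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                degree[u] += 1
                degree[v] += 1
    return degree


def poisson_pmf(k, lam):
    """Probability of value k under a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


n, p = 1000, 0.006  # hypothetical size and link probability
degrees = random_graph_degrees(n, p)
mean_degree = sum(degrees) / n
histogram = Counter(degrees)

# Empirical fraction of nodes with degree k next to the Poisson prediction.
for k in range(13):
    print(k, round(histogram[k] / n, 3), round(poisson_pmf(k, mean_degree), 3))
```

The degrees cluster around the mean p(n - 1), which is the "characteristic value" that a power-law distribution lacks.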
Types of networks • Regular networks • Uniform degree distribution • Random networks • Memoryless • Poisson degree distribution • Characteristic value • Low clustering coefficient • Large asp • Small world networks • High transitivity • Presence of hubs (memory) • High clustering coefficient (e.g., 1000 times higher than random) • Small asp • Some are scale free • Immune to random attacks • (Very) vulnerable to targeted attacks • Power law degree distribution (typical value of α between 2 and 3)
From: Mark Newman 2003. The structure and function of complex networks
Properties of lexical networks (c/c0: clustering coefficient relative to that of a comparable random network) • Entries in a thesaurus [Motter et al. 2002] • c/c0 = 260 (n=30,000) • Co-occurrence networks [Dorogovtsev and Mendes 2001, Sole and Ferrer i Cancho 2001] • c/c0 = 1,000 (n=400,000) • Mental lexicon [Vitevitch 2005] • c/c0 = 278 (n=19,340) • (Figure: example word nodes "universe", "letter", "character", "nature", "world", "actor")
Graph-based representations • Graph G (V,E) • Square connectivity (incidence) matrix • (Figure: example graph on nodes 1 to 8)
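A square connectivity matrix for a graph on nodes 1 to 8 can be built from an edge list as sketched below. The figure's actual edges are not recoverable from this transcript, so `EXAMPLE_EDGES` is a made-up edge set, and the graph is assumed to be undirected and unweighted.

```python
def connectivity_matrix(num_nodes, edges):
    """Square 0/1 matrix with entry [i][j] = 1 when nodes i+1 and j+1 are linked."""
    matrix = [[0] * num_nodes for _ in range(num_nodes)]
    for u, v in edges:
        matrix[u - 1][v - 1] = 1
        matrix[v - 1][u - 1] = 1  # undirected: mirror the entry
    return matrix


# Hypothetical edges on nodes 1..8 (not the edges of the original figure).
EXAMPLE_EDGES = [(1, 2), (1, 6), (2, 3), (3, 4), (4, 5), (5, 7), (6, 8), (7, 8)]

matrix = connectivity_matrix(8, EXAMPLE_EDGES)
for row in matrix:
    print(" ".join(str(x) for x in row))

# The row sums give the node degrees.
node_degrees = [sum(row) for row in matrix]
```

For a directed graph, dropping the mirrored assignment gives an asymmetric matrix whose row sums are out-degrees and column sums are in-degrees.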
Bipartite graphs and one-mode projections • (Figure: bipartite graph linking nodes A, B, C, D, E to nodes 1, 2, 3, 4)
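A one-mode projection links two nodes of one kind whenever they share a neighbor of the other kind. The sketch below projects a bipartite graph onto its lettered nodes; the `MEMBERSHIP` table is a hypothetical assignment of A to E to groups 1 to 4, since the figure's edges are not preserved here.

```python
from itertools import combinations

# Hypothetical bipartite graph: group number -> lettered members.
MEMBERSHIP = {
    1: ["A", "B"],
    2: ["B", "C", "D"],
    3: ["D", "E"],
    4: ["A", "E"],
}


def one_mode_projection(membership):
    """Link every pair of members of a group; weight = number of shared groups."""
    weights = {}
    for members in membership.values():
        for u, v in combinations(sorted(members), 2):
            weights[(u, v)] = weights.get((u, v), 0) + 1
    return weights


projection = one_mode_projection(MEMBERSHIP)
for (u, v), w in sorted(projection.items()):
    print(u, v, w)
```

Projecting onto the numbered side works the same way after inverting the table. The projection discards information: every group of size m becomes a clique of m nodes.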
Power laws • Web site size (Huberman and Adamic 1999) • Power-law connectivity (Barabasi and Albert 1999): exponents 2.45 for out-degree and 2.1 for the in-degree • Others: call graphs among telephone carriers, citation networks (Redner 1998), e.g., Erdos, collaboration graph of actors, metabolic pathways (Jeong et al. 2000), protein networks (Maslov and Sneppen 2002). All values of gamma are around 2-3.
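One common way to estimate a power-law exponent from data is the maximum-likelihood formula 1 + n / Σ ln(x_i / x_min). The sketch below is an illustration under stated assumptions, not the method used in the cited studies: it treats the values as continuous (real degree data are discrete), takes `x_min` as known, and checks the estimator on synthetic samples drawn with a chosen exponent.

```python
import math
import random


def sample_power_law(n, exponent, x_min=1.0, seed=0):
    """Draw n values with density proportional to x**(-exponent) for x >= x_min."""
    rng = random.Random(seed)
    # Inverse-transform sampling; 1 - random() lies in (0, 1], so no division by zero.
    return [
        x_min * (1.0 - rng.random()) ** (-1.0 / (exponent - 1.0))
        for _ in range(n)
    ]


def estimate_exponent(samples, x_min=1.0):
    """Continuous maximum-likelihood estimate of the power-law exponent."""
    tail = [x for x in samples if x >= x_min]
    return 1.0 + len(tail) / sum(math.log(x / x_min) for x in tail)


samples = sample_power_law(20000, exponent=2.5)
estimated = estimate_exponent(samples)
print(round(estimated, 2))
```

Fitting a straight line to a log-log histogram is a frequent alternative, but it is sensitive to binning, which is one reason to prefer a likelihood-based estimate.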
Small-world networks • Diameter (defined here as the average length of the shortest path between all pairs of nodes). Example… • Milgram experiment (1967) • Kansas/Omaha --> Boston (42/160 letters) • diameter = 6 • Albert et al. 1999: average distance between two vertices is d = 0.35 + 2.06 log10(n). For n = 10^9, d = 18.89. • Six degrees of separation
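The Albert et al. fit can be checked with a few lines of arithmetic, reading the slide's network size as n = 10^9. The function name is an assumption made for this sketch.

```python
import math


def average_distance(n):
    """Average distance between two vertices per the fit d = 0.35 + 2.06 log10(n)."""
    return 0.35 + 2.06 * math.log10(n)


d = average_distance(10**9)
print(round(d, 2))  # 0.35 + 2.06 * 9 = 18.89
```

Because d grows with log10(n), multiplying the network size by ten adds only 2.06 to the average distance.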
Clustering coefficient • Cliquishness (c): for a node v with kv neighbors, the fraction of the kv (kv – 1)/2 possible pairs of neighbors that are themselves connected. • Examples: