Search Engine Technology (10)


  1. Search Engine Technology (10) • Prof. Dragomir R. Radev • radev@cs.columbia.edu

  2. SET Fall 2013 • … • 16. (Social) networks • Random graph models • Properties of random graphs • …

  3. SET Fall 2013 • … • 17. Small worlds • Scale-free networks • Power law distributions • Centrality • …

  4. Krebs 2004

  5. Interleukin-2 receptor pathway protein interaction network (from HPRD). Peri et al., Nucleic Acids Res. 2004 January 1; 32(Database issue): D497–D501. doi: 10.1093/nar/gkh070.

  6. Bearman PS, Moody J, Stovel K. "Chains of affection: The structure of adolescent romantic and sexual networks," American Journal of Sociology, Vol. 110, No. 1.

  7. The New York Times May 21, 2005

  8. Email network

  9. Networks • The Web • Citation networks • Social networks • Protein interaction networks • Technological networks • Other networks • Movie actor networks • Co-occurrence of characters in Les Misérables • Board membership

  10. Types of networks • Directed/undirected • Can have weights • Single-mode vs. bipartite (e.g., movie-actor graphs)

  11. Semantic network

  12. Dependency network [Figure: dependency tree over the words "bought", "Meredith", "yesterday", "apples", "green"]

  13. Dependency network

  14. Random network

  15. Lexical networks • A special case of networks where nodes are words or documents and edges link semantically related nodes • Other examples: • Words used in dictionary definitions • Names of people mentioned in the same story • Words that translate to the same word

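  16. Analyzing networks • Clustering coefficient • Watts/Strogatz cc = #triangles/#triples, counting each triangle once per connected triple it closes, i.e., 3 × (number of triangles) / (number of connected triples) • Example: • Diameter (longest shortest path) • Average shortest path (asp) • Strongly connected component (SCC) • Weakly connected component (WCC)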
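
The measures on this slide can be sketched in a few lines of Python. This is a minimal illustration rather than the lecture's own code: it assumes an undirected, connected graph stored as a dict of neighbor sets, and the names `transitivity`, `bfs_distances`, `diameter_and_asp`, and the graph `toy` are made up for the example.

```python
from collections import deque
from itertools import combinations


def transitivity(adj):
    """Closed triples / connected triples (= 3 * triangles / triples)."""
    closed = 0   # paths of length two whose end points are also linked
    triples = 0  # all paths of length two, counted at their center node
    for center, nbrs in adj.items():
        for a, b in combinations(sorted(nbrs), 2):
            triples += 1
            if b in adj[a]:
                closed += 1
    return closed / triples if triples else 0.0


def bfs_distances(adj, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def diameter_and_asp(adj):
    """Longest and average shortest path; assumes a connected graph."""
    lengths = []
    for s in adj:
        dist = bfs_distances(adj, s)
        lengths.extend(d for t, d in dist.items() if t != s)
    return max(lengths), sum(lengths) / len(lengths)


# Hypothetical toy graph: a triangle a-b-c with a pendant node d on c.
toy = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

cc = transitivity(toy)
diam, asp = diameter_and_asp(toy)
print(cc, diam, asp)
```

The toy graph has one triangle and five connected triples, so the ratio is 3/5. Running one breadth-first search per node is fine for small graphs; large networks usually need sampling or a graph library.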

  17. Degree distribution • Uniform • Poisson • Power-law (with coefficient α).
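
As a rough illustration of the Poisson case, the sketch below samples a random graph in which each pair of nodes is linked independently with probability p, then tabulates its empirical degree distribution. The function names, the seed, and the values n = 500 and p = 0.01 are assumptions chosen for the example, not values from the lecture.

```python
import random
from collections import Counter


def random_graph_degrees(n, p, seed=0):
    """Degrees of a G(n, p) random graph: each pair linked with prob. p."""
    rng = random.Random(seed)
    degree = [0] * n
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                degree[u] += 1
                degree[v] += 1
    return degree


def degree_distribution(degrees):
    """Fraction of nodes having each observed degree k."""
    counts = Counter(degrees)
    total = len(degrees)
    return {k: counts[k] / total for k in sorted(counts)}


degrees = random_graph_degrees(n=500, p=0.01)
dist = degree_distribution(degrees)
mean_degree = sum(degrees) / len(degrees)

print("mean degree:", mean_degree)  # expected value is p * (n - 1) = 4.99
for k, frac in dist.items():
    print(k, round(frac, 3))
```

The degrees cluster around the mean with a rapidly decaying tail, which is the "characteristic value" behavior of a Poisson distribution; a power-law distribution would instead show a few nodes with very high degree.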

  18. Types of networks • Regular networks • Uniform degree distribution • Random networks • Memoryless • Poisson degree distribution • Characteristic value • Low clustering coefficient • Large asp • Small world networks • High transitivity • Presence of hubs (memory) • High clustering coefficient (e.g., 1000 times higher than random) • Small asp • Some are scale free • Immune to random attacks • (Very) vulnerable to targeted attacks • Power law degree distribution (typical value of α between 2 and 3)

  19. From: Mark Newman 2003. The structure and function of complex networks

  20. Comparing the dependency graph to a random (Poisson) graph

  21. Properties of lexical networks • Entries in a thesaurus [Motter et al. 2002] • c/c0 = 260 (n=30,000) • Co-occurrence networks [Dorogovtsev and Mendes 2001, Sole and Ferrer i Cancho 2001] • c/c0 = 1,000 (n=400,000) • Mental lexicon [Vitevitch 2005] • c/c0 = 278 (n=19,340) [Figure: word graph with nodes "universe", "letter", "character", "nature", "world", "actor"]

  22. Graph-based representations • Square connectivity (incidence) matrix • Graph G(V, E) [Figure: example graph with nodes numbered 1 to 8]
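
A square connectivity matrix for a graph G(V, E) can be built as in the sketch below. The edges of the slide's figure are not recoverable from the transcript, so the edge list here (a ring over nodes 1 to 8) is purely hypothetical, as is the helper name `adjacency_matrix`.

```python
def adjacency_matrix(n, edges):
    """n x n 0/1 matrix for an undirected graph with nodes 1..n."""
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        matrix[u - 1][v - 1] = 1  # nodes are 1-based, rows are 0-based
        matrix[v - 1][u - 1] = 1  # undirected: mirror the entry
    return matrix


# Hypothetical edge list: the eight nodes joined in a ring.
edges = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (7, 8), (8, 1)]
A = adjacency_matrix(8, edges)

# Row sums give node degrees in an undirected graph.
degrees = [sum(row) for row in A]
for row in A:
    print(row)
```

For an undirected graph the matrix is symmetric; a directed graph would set only the (u, v) entry.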

  23. Bipartite graphs and one-mode projections [Figure: bipartite graph linking nodes A, B, C, D, E to nodes 1, 2, 3, 4]
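
A one-mode projection links two nodes of one type whenever they share a neighbor of the other type. The sketch below shows the idea under stated assumptions: the memberships are invented (the figure's actual edges are not in the transcript), and `one_mode_projection` is an illustrative name.

```python
from collections import defaultdict
from itertools import combinations


def one_mode_projection(memberships):
    """Map each pair of members to the number of groups they share."""
    weights = defaultdict(int)
    for members in memberships.values():
        for u, v in combinations(sorted(members), 2):
            weights[(u, v)] += 1
    return dict(weights)


# Hypothetical bipartite graph: groups 1-4 (e.g., movies) and
# members A-E (e.g., actors).
memberships = {
    1: {"A", "B"},
    2: {"B", "C", "D"},
    3: {"C", "D"},
    4: {"E"},
}

projection = one_mode_projection(memberships)
for pair, weight in sorted(projection.items()):
    print(pair, weight)
```

In this example C and D share two groups, so their projected edge has weight 2, while E shares no group with anyone and stays isolated. The projection discards which group produced each link, which is why the bipartite form carries more information.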

  24. Power laws • Web site size (Huberman and Adamic 1999) • Power-law connectivity (Barabási and Albert 1999): exponents 2.45 for out-degree and 2.1 for the in-degree • Others: call graphs among telephone carriers, citation networks (Redner 1998), e.g., Erdős, collaboration graph of actors, metabolic pathways (Jeong et al. 2000), protein networks (Maslov and Sneppen 2002). All values of the exponent gamma are around 2–3.

  25. Small-world networks • Diameter (used loosely here for the average length of the shortest path between all pairs of nodes; slide 16 defines it as the longest shortest path). Example… • Milgram experiment (1967) • Kansas/Omaha → Boston (42/160 letters) • diameter = 6 • Albert et al. 1999: the average distance between two vertices is d = 0.35 + 2.06 log10(n). For n = 10^9, d = 18.89. • Six degrees of separation
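
The quoted value d = 18.89 follows from the Albert et al. formula with n = 10^9, since log10(10^9) = 9 and 0.35 + 2.06 × 9 = 18.89. A short check, with `albert_average_distance` as an illustrative name:

```python
import math


def albert_average_distance(n):
    """Average distance d = 0.35 + 2.06 * log10(n) from the slide."""
    return 0.35 + 2.06 * math.log10(n)


d = albert_average_distance(10**9)
print(round(d, 2))

# Because d grows with log10(n), a tenfold larger graph adds only 2.06 steps.
growth = albert_average_distance(10**10) - d
print(round(growth, 2))
```

The logarithmic growth is the small-world property: the network can grow by orders of magnitude while typical distances barely change.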

  26. Clustering coefficient • Cliquishness (c): the fraction of edges actually present between the kv(kv – 1)/2 possible pairs of neighbors of a vertex v with kv neighbors. • Examples:
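
The per-vertex definition can be sketched as follows, assuming an undirected graph stored as a dict of neighbor sets. The function name `local_clustering`, the convention of returning 0 for vertices with fewer than two neighbors, and the graph `toy` are choices made for this example.

```python
from itertools import combinations


def local_clustering(adj, v):
    """Fraction of the k_v (k_v - 1) / 2 neighbor pairs that are linked."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0  # convention assumed here: no pairs means c = 0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return links / (k * (k - 1) / 2)


# Hypothetical toy graph: a triangle a-b-c with a pendant node d on c.
toy = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

coefficients = {v: local_clustering(toy, v) for v in toy}
average = sum(coefficients.values()) / len(coefficients)
print(coefficients, average)
```

Vertex c has three neighbors and therefore three possible neighbor pairs, of which only a-b is linked, giving c = 1/3. Averaging the per-vertex values over the graph gives a network-level clustering coefficient.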
