CSE 522 – Algorithmic and Economic Aspects of the Internet


Presentation Transcript


  1. CSE 522 – Algorithmic and Economic Aspects of the Internet Instructors: Nicole Immorlica Mohammad Mahdian

  2. Topics covered in the course • Structure and modeling of social networks: Power law graphs; Small world phenomenon; High clustering coefficient; Probabilistic and game theoretic models • Algorithms for link analysis: Crawling the web; HITS; Page Rank; Webspam; Rank aggregation; Spectral clustering • Economic aspects of the Internet: Peering relations; Alternative mechanisms for routing; P2P networks • Topics motivated by e-commerce: Reputation mechanisms; Recommendation systems; Ad auctions

  3. Logistics • Course web page: http://www.cs.washington.edu/education/courses/cse522/05au/ • Course work: • reading papers (1/week on avg) • possibly a few problem sets • How to contact us: {nickle,mahdian}@microsoft.com

  4. Social Networks • A social network is a graph that represents relationships between independent entities. • Graph of friendships (or in the virtual world, networks like orkut) • Web of sexual contact • Graph of scientific collaborations • Cross-posts in newsgroups • Web graph (links between webpages) • Internet: Inter/Intra-domain graph

  5. Scientific Collaboration Network • 400,000 nodes, authors in Mathematical Reviews database • An edge between two authors if they have a joint paper • Just 676,000 edges Picture from orgnet.com

  6. Scientific Collaboration Network • Average degree 3.36 • A few high-degrees: • Paul Erdös, 509 • Frank Harary, 268 • Yuri Alekseevich Mitropolskii, 244 • Many low-degrees: (100,000 of degree 1) Picture from orgnet.com

  7. Scientific Collaboration Network • Short paths • Max Erdös # is 13 • Any two authors connected by path of length at most 23 • Average distance between two authors is 7.64 • e.g.: John Nash → Shapley → Fulkerson → Hoffman → Paul Erdös • Many triangles … Picture from orgnet.com

  8. 9/11 Terrorist Network Picture from orgnet.com

  9. Newsgroup Cross-Post Graph • Nodes are newsgroups, essentially archived email lists • Edges are cross-posts, i.e. there is an edge between two newsgroups to which an identical email is posted • Example nodes: alt.microsoft.sucks, alt.linux.sucks

  10. Internet Graphs • Inter-domain graphs • Nodes are autonomous systems or domains • Edges are inter-domain connections • Example nodes: SPRINT, AOL

  11. Inter-domain graph Picture from caida.org

  12. Internet Graphs • Intra-domain graphs • Nodes are routers • Edges are links between routers • Example nodes: 199.45.130.13, 199.45.143.14

  13. Intra-domain graph

  14. Colored by AS number Picture from lumeta.com

  15. World Wide Web • Nodes are webpages • Arcs (i.e., directed edges) are hyperlinks • Example nodes: http://research.microsoft.com/~mahdian, http://theory.csail.mit.edu

  16. Web graph, Chicago Tribune Page Picture generated by Nicheworks

  17. Social Networks

  18. Why Study These Networks • Understand the creation of these networks • Understand viral epidemics • Help design crawling strategies for the web • Analyze behavior of algorithms (web/internet) • Predict evolution of the network and emergence of new phenomena

  19. In this lecture • Common properties of social networks • Power law degree distribution • Small world phenomenon • High clustering coefficient • Structure of the web graph

  20. Power Laws • Two quantities x and y are related by a power law if y is proportional to $x^{-c}$ for a constant c: $y = a \cdot x^{-c}$ • If x and y are related by a power law, then the graph of log(y) versus log(x) is a straight line: $\log y = -c \log x + \log a$ • The slope of the log-log plot is $-c$, where c is the power law exponent
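
The log-log relationship on this slide can be checked numerically. The following is a minimal sketch, not part of the lecture: it generates synthetic points that follow a power law exactly and fits a least-squares line to the logs. The helper name `fit_loglog`, the constant name `a`, and the sample values are assumptions chosen for illustration.

```python
import math

def fit_loglog(xs, ys):
    """Least-squares line through the points (log x, log y).

    Returns (slope, intercept). For data that follows y = a * x^(-c),
    the slope estimates -c and the intercept estimates log(a).
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Synthetic data that obeys the power law exactly (a and c are illustrative).
a, c = 5.0, 2.0
xs = [1, 2, 4, 8, 16, 32]
ys = [a * x ** (-c) for x in xs]

slope, intercept = fit_loglog(xs, ys)
print(f"slope = {slope:.3f}, expected -c = {-c}")
print(f"exp(intercept) = {math.exp(intercept):.3f}, expected a = {a}")
```

On real data the points only approximately follow a line, so the fitted slope is an estimate of the exponent rather than an exact value.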

  21. Power Law Distributions • A random variable X has a power law distribution if Pr[X=k] is proportional to $k^{-c}$ for a constant c • The cumulative distribution, Pr[X>k], of a power law distribution is proportional to $k^{-c+1}$, and is called the Pareto law • Similar to a power law, the Zipf law relates the rank r of X to its size: the r'th largest instance of X is proportional to $r^{-c'}$
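
A short derivation sketch of why the cumulative exponent is one less than the density exponent, and how the Zipf exponent relates to it. This is not from the slides; it assumes c > 1, approximates the sum by an integral, and uses n for a hypothetical number of samples and $x_r$ for the r'th largest sample.

```latex
% Pareto law: tail of a power law distribution with exponent c > 1
\Pr[X > k] \;\propto\; \sum_{j > k} j^{-c}
          \;\approx\; \int_{k}^{\infty} x^{-c}\,dx
          \;=\; \frac{k^{-(c-1)}}{c-1}
          \;\propto\; k^{-c+1}

% Zipf law: about r of the n samples exceed the r'th largest value x_r
n \cdot \Pr[X > x_r] \;\approx\; r
\quad\Longrightarrow\quad x_r^{-(c-1)} \;\propto\; r
\quad\Longrightarrow\quad x_r \;\propto\; r^{-1/(c-1)},
\qquad\text{i.e. } c' = \frac{1}{c-1}
```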

  22. Example: City Populations • New York 7,322,564 • Los Angeles 3,485,398 • Chicago 2,783,726 • Houston 1,630,553 • Philadelphia 1,585,577 • San Diego 1,110,549 • Detroit 1,027,974 • Dallas 1,006,877 • Phoenix 983,403 • San Antonio 935,933

  23. Example: City Populations • New York 7,322,564 • Los Angeles 3,485,398 • Chicago 2,783,726 • Seattle 516,259 • Spokane, WA 177,196 • Tacoma, WA 176,664 • Little Rock, AR 175,795 • Bakersfield, CA 174,820 • Fremont, CA 173,339 • Fort Wayne, IN 173,072 • Arlington, VA 170,936

  24. Example: City Populations • Power law exponent: c = 0.74
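
As a rough illustration of how a rank-size exponent can be estimated, the sketch below fits a line to log(population) versus log(rank) for the ten largest cities listed on slide 22. It uses only those ten data points, so its estimate should not be expected to match the c = 0.74 reported on this slide, which presumably comes from the full data set and possibly a different fitting method. The function name `zipf_exponent` is hypothetical.

```python
import math

# Ten largest cities from slide 22, in rank order (1 = largest).
populations = [7322564, 3485398, 2783726, 1630553, 1585577,
               1110549, 1027974, 1006877, 983403, 935933]

def zipf_exponent(sizes):
    """Estimate c' in size ~ rank^(-c') by least squares on the logs."""
    ranked = sorted(sizes, reverse=True)
    lx = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ly = [math.log(size) for size in ranked]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    # The fitted slope is -c', so negate it to report a positive exponent.
    return -sxy / sxx

c_prime = zipf_exponent(populations)
print(f"Zipf exponent estimated from the top 10 cities: {c_prime:.2f}")
```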

  25. Power Laws in Networks • Degree distribution often satisfies a power law: the fraction of nodes $f_d$ of degree d is proportional to $d^{-c}$
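
To make the degree distribution concrete, here is a minimal sketch that computes the fraction of nodes of each degree from an undirected edge list. The function name `degree_distribution` and the toy graph are assumptions for illustration, not data from the lecture.

```python
from collections import Counter

def degree_distribution(edges):
    """Map each degree d to the fraction of nodes that have degree d.

    Nodes are taken from the edge list, so isolated nodes (degree 0)
    are not counted.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(degree)
    counts = Counter(degree.values())
    return {d: counts[d] / n for d in sorted(counts)}

# Hypothetical toy graph: hub "a" linked to four nodes, plus the edge d-e.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("a", "e"), ("d", "e")]
print(degree_distribution(edges))
```

Plotting these fractions against d on log-log axes is the usual first check for a power law degree distribution.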

  26. Example: Collaboration Graph • Power law exp: c = 2.97 • With exponential decay factor, c = 2.46

  27. Example: Cross-Post Graph • Power law exponent: c = 1.3

  28. Example: Inter-Domain Internet • Power law exponent: 2.15 < c < 2.2

  29. Example: Intra-Domain Internet • Power law exponent: c = 2.48

  30. Example: Web Graph In-Degree • Power law exponent: c = 2.09

  31. Example: Web Graph Out-Degree • Power law exponent: c = 2.72

  32. Small World Phenomenon Six degrees of separation: “Everybody on this planet is separated by only six other people. Six degrees of separation between us and everyone else on this planet. The President of the United States, a gondolier in Venice, just fill in the names.”

  33. Small World Phenomenon • Milgram’s famous experiment (1960s): • Choose a random person in Nebraska, Bob • Ask Bob to deliver a letter to a random person in Massachusetts, Lashawn • Tell Bob target’s name, address, and occupation • Instruct Bob to only send letter to people he knows on a first-name basis

  34. Small World Phenomenon • Six Degrees of Separation • Bob, a farmer in Nebraska → David, mayor of Bob's town → Bernard, David's cousin who went to college with → Maya, who grew up in Boston with → Lashawn

  35. Small World Phenomenon in Graphs • The diameter of a graph is the maximum distance (number of edges) between any pair of nodes • The average distance of a graph is the average distance over all pairs of nodes • The average connected distance of a graph is the average distance over all pairs of connected nodes

  36. Small World Phenomenon in Graphs • A graph exhibits a small world phenomenon if it has low diameter or average (connected) distance • Typically, the average distance of a small world graph is on the order of log n (where n is the number of nodes)
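
The distance measures defined on the last two slides can be computed with one breadth-first search per node. The sketch below is an illustration under stated assumptions: the graph is undirected and unweighted, it is given as a hypothetical adjacency dictionary, and the maximum is taken over connected pairs only (a disconnected graph would otherwise have infinite diameter). This brute-force approach is only practical for small graphs.

```python
from collections import deque

def bfs_distances(adj, source):
    """Distance in edges from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def distance_stats(adj):
    """Return (diameter, average connected distance).

    Both are taken over pairs of distinct nodes joined by a path.
    """
    total = pairs = longest = 0
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                total += d
                pairs += 1
                longest = max(longest, d)
    average = total / pairs if pairs else 0.0
    return longest, average

# Hypothetical toy graph: path 0-1-2-3 with an extra edge 0-2.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
diameter, avg_distance = distance_stats(adj)
print(f"diameter = {diameter}, average distance = {avg_distance:.3f}")
```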

  37. Examples • Collaboration graph • 401,000 nodes, 676,000 edges (average degree 3.37) • Diameter: 23, Average distance: 7.64 • Cross-post graph, giant component • 30,000 nodes, 800,000 edges (average degree 53.3) • Diameter: 13, Average distance: 3.8 • Web graph • 200 million nodes, 1.5 billion edges (average degree 15) • Average connected distance: 16 • Inter-domain Internet • 3500 nodes, 6500 edges (average degree 3.71) • 95% of pairs of nodes within distance 5
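
The average degrees quoted on this slide follow from the node and edge counts, since each edge contributes to the degree of two nodes. The sketch below recomputes them and prints ln n next to each, as a rough point of comparison with the "order of log n" rule of thumb; the dictionary layout and function name are assumptions for illustration.

```python
import math

# (nodes, edges) as reported on the slide.
graphs = {
    "collaboration": (401_000, 676_000),
    "cross-post": (30_000, 800_000),
    "web": (200_000_000, 1_500_000_000),
    "inter-domain": (3_500, 6_500),
}

def average_degree(n, m):
    """Average degree of a graph with n nodes and m edges: 2m / n."""
    return 2 * m / n

for name, (n, m) in graphs.items():
    print(f"{name}: average degree {average_degree(n, m):.2f}, "
          f"ln n = {math.log(n):.1f}")
```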

  38. High Clustering Coefficient • The clustering coefficient of a graph is the fraction of triangles among connected triples of nodes • Intuitively, the clustering coefficient reflects the probability that your friends are themselves friends • We expect social networks to have a high clustering coefficient
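
One way to read the definition above: count every connected triple (a center node together with two of its neighbors) and report the fraction in which the two neighbors are also adjacent. The sketch below implements that reading for a small undirected graph; the function name and the toy graph are hypothetical, and other definitions of the clustering coefficient (for example, averaging a per-node value) can give different numbers.

```python
from itertools import combinations

def clustering_coefficient(adj):
    """Fraction of connected triples u-v-w (centered at v) with u adjacent to w."""
    triples = closed = 0
    for v, neighbors in adj.items():
        for u, w in combinations(neighbors, 2):
            triples += 1
            if w in adj[u]:
                closed += 1
    return closed / triples if triples else 0.0

# Hypothetical toy graph: triangle a-b-c with a pendant node d attached to c.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
cc = clustering_coefficient(adj)
print(f"clustering coefficient = {cc:.2f}")
```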

  39. Examples • Collaboration graph • Clustering coefficient is 0.14 • Density of edges is 0.000008 • Cross-post graph • Clustering coefficient is 0.4492 • Density of edges is 0.0016
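
A brief sketch of why the edge density is the natural baseline here. It assumes the density is defined as the fraction of possible edges that are present, and uses the collaboration graph counts from slide 37 (n = 401,000 nodes, m = 676,000 edges); the comparison with a random graph is a standard heuristic rather than something stated on the slide.

```latex
% Edge density of a graph with n nodes and m edges
\text{density} \;=\; \frac{m}{\binom{n}{2}} \;=\; \frac{2m}{n(n-1)}
\;\approx\; \frac{2 \cdot 676{,}000}{401{,}000 \cdot 400{,}999}
\;\approx\; 8.4 \times 10^{-6}

% If each edge were present independently with probability p = density,
% two neighbors of a node would be adjacent with probability p, so
\text{clustering coefficient (random graph)} \;\approx\; p

% Observed in the collaboration graph:
\frac{0.14}{8 \times 10^{-6}} \;\approx\; 1.7 \times 10^{4}
```

By the same reasoning, the cross-post graph's clustering coefficient of 0.4492 is more than two orders of magnitude above its density of 0.0016.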

  40. Assignment READ: A. Broder, R. Kumar, F. Maghoul, P. Raghavan, S. Rajagopalan, R. Stata, A. Tomkins, and J. Wiener, Graph structure in the web, WWW, 2000.

  41. Graph Structure of the Web • Breadth-first search from randomly chosen start nodes • Follow both forward and backward links • Reveal directed and undirected graph structure • Over 90% of nodes reachable if links are treated as undirected • Directed graph reveals complex bow-tie structure
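
The forward, backward, and undirected searches described above can be sketched on a toy directed graph. This is an illustrative sketch only, not the procedure or data from the assigned paper: the arc list, the page names, and the helper names `link_views` and `reachable` are all hypothetical.

```python
from collections import deque

def link_views(arcs):
    """Build forward, backward, and undirected adjacency maps from arcs (u -> v)."""
    fwd, bwd, und = {}, {}, {}
    for u, v in arcs:
        fwd.setdefault(u, set()).add(v)
        bwd.setdefault(v, set()).add(u)
        und.setdefault(u, set()).add(v)
        und.setdefault(v, set()).add(u)
    return fwd, bwd, und

def reachable(adj, start):
    """Breadth-first search: the set of nodes reachable from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen

# Hypothetical toy web: "in" links into a cycle a -> b -> c -> a,
# the cycle links to "out", and page "x" links to "out" only.
arcs = [("in", "a"), ("a", "b"), ("b", "c"), ("c", "a"),
        ("c", "out"), ("x", "out")]
fwd, bwd, und = link_views(arcs)

start = "a"
forward = reachable(fwd, start)    # pages reachable by following links
backward = reachable(bwd, start)   # pages that can reach the start page
core = forward & backward          # strongly connected component of start
weak = reachable(und, start)       # reachable when links are undirected

print("forward:", sorted(forward))
print("backward:", sorted(backward))
print("core:", sorted(core))
print("undirected:", sorted(weak))
```

In this toy graph the intersection of the forward and backward sets is the strongly connected core, and page "x" is reachable only when links are treated as undirected, which mirrors the gap between directed and undirected reachability noted on the slide.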

  42. Bow-Tie Structure of Web Graph Picture from the Nature journal

  43. Next Time Probabilistic models for social networks
