
Traffic-driven model of the World-Wide-Web Graph

Traffic-driven model of the World-Wide-Web Graph. A. Barrat (LPT, Orsay, France), M. Barthélemy (CEA, France), A. Vespignani (LPT, Orsay, France). Outline: the WebGraph; some empirical characteristics; various models; weights and strengths; our model (definition; analysis: analytics + numerics).


Presentation Transcript


  1. Traffic-driven model of the World-Wide-Web Graph • A. Barrat, LPT, Orsay, France • M. Barthélemy, CEA, France • A. Vespignani, LPT, Orsay, France

  2. Outline • The WebGraph • Some empirical characteristics • Various models • Weights and strengths • Our model: • Definition • Analysis: analytics+numerics • Conclusions

  3. The Web as a directed graph • nodes i: web-pages • directed links: hyperlinks • in- and out-degrees of page i: kin(i) and kout(i), the numbers of incoming and outgoing hyperlinks

  4. Empirical facts • Small world: captured by Erdős-Rényi graphs • With probability p an edge is established between each pair of vertices • <k> = p N • Poisson degree distribution
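As a quick check of the relation <k> = p N, the following minimal sketch samples one Erdős-Rényi graph and compares the mean degree with p N. The function name, the values of n and p, and the seed are illustrative choices, not taken from the slides. For a Poisson-like degree distribution the variance should also be close to the mean.

```python
import random

def erdos_renyi_degrees(n, p, seed=0):
    """Return the degree of every vertex in one sample of G(n, p)."""
    rng = random.Random(seed)
    degree = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            # Each pair of vertices is linked independently with probability p.
            if rng.random() < p:
                degree[i] += 1
                degree[j] += 1
    return degree

n, p = 2000, 0.005  # illustrative values, so that p N = 10
degrees = erdos_renyi_degrees(n, p)
mean_k = sum(degrees) / n
var_k = sum((k - mean_k) ** 2 for k in degrees) / n
print(f"<k> = {mean_k:.2f}, p N = {p * n:.1f}, variance = {var_k:.2f}")
```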

  5. Empirical facts • Small world • Large clustering: different neighbours of a node will likely know each other (higher probability to be connected) => graph models with large clustering, e.g. Watts-Strogatz 1998

  6. Empirical facts • Small world • Large clustering • Dynamical network • Broad connectivity distributions • also observed in many other contexts (from biological to social networks) • intense modeling activity (Barabási-Albert 1999; Broder et al. 2000; Kumar et al. 2000; Adamic-Huberman 2001; Laura et al. 2003)

  7. Various growing networks models • Barabási-Albert (1999): preferential attachment • Many variations on the BA model: rewiring (Tadic 2001, Krapivsky et al. 2001), addition of edges, directed model (Dorogovtsev-Mendes 2000, Cooper-Frieze 2001), fitness (Bianconi-Barabási 2001), ... • Kumar et al. (2000): copying mechanism • Pandurangan et al. (2002): PageRank+pref. attachment • Laura et al. (2002): Multi-layer model • Menczer (2002): textual content of web-pages

  8. The Web as a directed graph • nodes i: web-pages • directed links: hyperlinks • Broad P(kin); cut-off for P(kout) (Broder et al. 2000; Kumar et al. 2000; Adamic-Huberman 2001; Laura et al. 2003)

  9. Additional level of complexity: Weights and Strengths • Links carry weights/traffic: wij on the link from i to j • In- and out-strengths: sin(i) = Σ_j wji, sout(i) = Σ_j wij • Adamic-Huberman 2001: broad distribution of sin
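A minimal sketch of how in- and out-strengths follow from link weights, assuming the usual convention that the in-strength of a page is the total traffic on its incoming links and the out-strength the total traffic on its outgoing links. The dictionary layout and the example values are hypothetical.

```python
from collections import defaultdict

def strengths(weights):
    """weights maps a directed link (i, j) to its traffic w_ij.

    Returns two dicts: in-strength and out-strength per node.
    """
    s_in = defaultdict(float)
    s_out = defaultdict(float)
    for (i, j), w in weights.items():
        s_out[i] += w  # traffic leaving page i
        s_in[j] += w   # traffic arriving at page j
    return dict(s_in), dict(s_out)

# Hypothetical three-page example.
example = {("a", "b"): 2.0, ("a", "c"): 1.0, ("c", "b"): 0.5}
s_in, s_out = strengths(example)
print(s_in, s_out)
```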

  10. Model: directed network • (i) Growth: a new node n is added • (ii) Strength-driven preferential attachment (n: kout = m outlinks) • “Busy gets busier” AND...

  11. Weights reinforcement mechanism • The new traffic n-i increases the traffic i-j • “Busy gets busier”
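The two ingredients above (growth with strength-driven attachment, then weight reinforcement) can be sketched as a small simulation. The transcript does not preserve the exact rules, so everything quantitative here is an assumption: a page is chosen with probability proportional to its in-strength, every page starts with a baseline in-strength of 1 so that new pages can be chosen at all, each new link carries weight 1, and a new link n-i adds a total extra traffic delta to the out-links of i, shared in proportion to their current weights. The name grow_web and its parameters are hypothetical.

```python
import random

def grow_web(n_pages, m=2, delta=0.5, seed=0):
    """Grow a directed weighted graph; returns (out_links, s_in).

    out_links[i][j] is the weight w_ij; s_in[i] is the in-strength of i
    plus an assumed baseline of 1.
    """
    rng = random.Random(seed)
    out_links = {i: {} for i in range(m)}  # seed pages without links
    s_in = {i: 1.0 for i in range(m)}      # assumed baseline in-strength
    for n in range(m, n_pages):
        nodes = list(s_in)
        probs = [s_in[v] for v in nodes]
        targets = set()
        while len(targets) < m:            # m distinct targets, "busy gets busier"
            targets.add(rng.choices(nodes, weights=probs)[0])
        out_links[n] = {}
        s_in[n] = 1.0
        for i in targets:
            out_links[n][i] = 1.0          # new link n -> i with unit weight
            s_in[i] += 1.0
            s_out_i = sum(out_links[i].values())
            for j, w in list(out_links[i].items()):
                extra = delta * w / s_out_i  # assumed redistribution rule
                out_links[i][j] = w + extra
                s_in[j] += extra
    return out_links, s_in

out_links, s_in = grow_web(500)
print("largest in-strength:", max(s_in.values()))
```

Under these assumptions each new link adds at most 1 + delta to the total traffic, which is the bookkeeping behind an average weight close to 1 + delta.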

  12. Evolution equations (Continuous approximation) Coupling term

  13. Resolution Ansatz supported by numerics:

  14. Results

  15. Approximation • Total in-weight Σ_i sin(i): approximately proportional to the total number of in-links Σ_i kin(i), times the average weight <w> = 1+δ • Then: A = 1+δ, γ_sin ∈ [2; 2+1/m]

  16. Numerical simulations • Measure of A: prediction of … • Approximation of γ

  17. Numerical simulations NB: broad P(sout) even if kout=m

  18. Clustering spectrum • Clustering of node i: the fraction of pairs of neighbours of i that are themselves connected
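The clustering of a single node can be sketched as follows. The slide's own formula is not preserved in the transcript, so this sketch assumes the standard topological definition and treats hyperlinks as undirected; the adjacency layout and the example graph are hypothetical.

```python
from itertools import combinations

def local_clustering(adj, i):
    """Fraction of pairs of neighbours of i that are linked to each other.

    adj maps each node to the set of its neighbours (undirected view).
    """
    nbrs = adj[i]
    k = len(nbrs)
    if k < 2:
        return 0.0  # no pair of neighbours to test
    linked = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2.0 * linked / (k * (k - 1))

# Hypothetical graph: triangle 1-2-3 plus a pendant node 4 attached to 1.
adj = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2}, 4: {1}}
print(local_clustering(adj, 1), local_clustering(adj, 2))
```

Averaging this quantity over all nodes of the same degree k gives the clustering spectrum as a function of k.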

  19. Clustering spectrum • δ increases => clustering increases • New pages: point to various well-known pages, often connected together => large clustering for small nodes • Old, popular pages with large k: many in-links from many less popular pages which are not connected together => smaller clustering for large nodes

  20. Clustering and weighted clustering • The weighted clustering takes into account the relevance of triangles in the global traffic

  21. Clustering and weighted clustering Weighted Clustering larger than topological clustering: triangles carry a large part of the traffic
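A minimal sketch of a weighted clustering coefficient, assuming the definition commonly used for undirected weighted networks: each closed triple centred on i contributes the sum of the two link weights that join i to the pair, and the total is normalised by s_i (k_i - 1). The talk may use a directed variant that the transcript does not preserve. The data layout and the example are hypothetical.

```python
from itertools import combinations

def weighted_clustering(w, i):
    """Weighted clustering of node i.

    w maps each node to a dict {neighbour: weight}, with symmetric weights.
    """
    nbrs = w[i]
    k = len(nbrs)
    if k < 2:
        return 0.0
    s_i = sum(nbrs.values())  # strength of node i
    total = sum(nbrs[j] + nbrs[h]
                for j, h in combinations(nbrs, 2) if h in w[j])
    return total / (s_i * (k - 1))

# Hypothetical graph: heavy links 1-2 and 1-3 close a triangle, light link 1-4 does not.
w = {1: {2: 3.0, 3: 3.0, 4: 1.0},
     2: {1: 3.0, 3: 1.0},
     3: {1: 3.0, 2: 1.0},
     4: {1: 1.0}}
print(weighted_clustering(w, 1))
```

In this example the topological clustering of node 1 is 1/3 while the weighted value is 3/7, because the single triangle sits on the heaviest links. With equal weights on all links the two definitions coincide.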

  22. Assortativity Average connectivity of nearest neighbours of i
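The average connectivity of nearest neighbours can be sketched as below, assuming an undirected view of the links; the function names and the star-graph example are hypothetical. Grouping the result by degree k gives the spectrum knn(k), and a decreasing knn(k) signals disassortative behaviour.

```python
def average_neighbour_degree(adj, i):
    """Mean degree of the neighbours of node i (0.0 if i has none)."""
    nbrs = adj[i]
    if not nbrs:
        return 0.0
    return sum(len(adj[j]) for j in nbrs) / len(nbrs)

def knn_spectrum(adj):
    """Average neighbour degree, grouped by the degree k of the node."""
    by_degree = {}
    for i in adj:
        by_degree.setdefault(len(adj[i]), []).append(
            average_neighbour_degree(adj, i))
    return {k: sum(v) / len(v) for k, v in sorted(by_degree.items())}

# Hypothetical star graph: hub 0 linked to four leaves.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
print(knn_spectrum(star))
```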

  23. Assortativity • knn: disassortative behaviour, as usual in growing network models, and typical in technological networks • lack of correlations in popularity as measured by the in-degree

  24. Summary • Web: heterogeneous topology and traffic • Mechanism taking into account the interplay between topology and traffic • Simple mechanism => complex behaviour, scale-free distributions for connectivity and traffic • Analytical study possible • Study of correlations: non-trivial hierarchical behaviour • Possibility to add features (fitnesses, rewiring, addition of edges, etc.) and to modify the redistribution rule • Empirical studies of traffic and correlations?
