
A Statistical Physics approach for Modeling P2P Systems

Giovanna Carofiglio (1), R. Gaeta (2), M. Garetto (1), P. Giaccone (1), E. Leonardi (1), M. Sereno (2). (1) Politecnico di Torino, (2) Università di Torino, Italy. MAMA Workshop, joint with ACM SIGMETRICS 2005, Banff, June 6-10, 2005.





Presentation Transcript


  1. A Statistical Physics approach for Modeling P2P Systems. Giovanna Carofiglio (1), R. Gaeta (2), M. Garetto (1), P. Giaccone (1), E. Leonardi (1), M. Sereno (2). (1) Politecnico di Torino, (2) Università di Torino, Italy. MAMA Workshop, joint with ACM SIGMETRICS 2005, Banff, June 6-10, 2005

  2. Outline • Motivation • Basic Model • Extended Model • Content Search • Download effects

  3. P2P System Architecture [Figure labels: server, clients, peers] • A possible definition: decentralized, self-organizing distributed systems in which all or most communication is symmetric.

  4. Peer-to-Peer traffic • P2P is the single largest generator of traffic • P2P traffic significantly outweighs web traffic • P2P traffic continues to grow

  5. P2P Applications • File sharing: BitTorrent, KaZaA, Gnutella, eDonkey, Napster, etc. • DHTs: Chord, CAN, Pastry, Tapestry • Wireless ad hoc networking • Communication: Voice over IP (Skype), instant messaging • Distributed computation: Seti@home, UnitedDevices, Distributed Science

  6. Motivation • Most Internet traffic is generated by p2p applications. • Performance studies of p2p systems can guide the design of future applications. • Analytical models help analyze large and complex p2p networks.

  7. Modeling techniques
     • Traditional Markov models: they provide a detailed microscopic description, but with a huge state space. They are computationally expensive for large systems such as p2p networks (with millions of users and shared contents).
     • Fluid models: network dynamics are described at a higher level of abstraction, neglecting stochastic information. They scale well: the model is a set of differential equations that is invariant with respect to the size of the network (number of users, link capacity).

  8. Model description
     • We model a generic p2p system without focusing on a particular implementation.
     • Starting from a fluid approach as in [1] and [2], our model extends it to a second-order diffusion approximation, in which the stochasticity of the network dynamics plays a relevant role.
     • The model provides a description of the users' and contents' dynamics both in the transient and in steady state.
     [1] F. Clevenot, P. Nain, "A Simple Model for the Analysis of SQUIRREL", Infocom 2004, Hong Kong, Mar. 2004.
     [2] D. Qiu, R. Srikant, "Modeling and Performance Analysis of BitTorrent like Peer-to-Peer Networks", Sigcomm 2004, U.S.A.

  9. Model structure [Diagram blocks: Users dynamics, Search phase, Download phase, Contents dynamics]

  10. Outline • Motivation • Basic Model • Extended Model • Content Search • Download effects

  11. Users dynamics (1)
     • The number of users in the p2p network changes dynamically according to:
     • Enter-leave dynamics: λu = arrival rate of new users; 1/μu = average subscription time
     • Active-sleeping mode: 1/μas = average active time; 1/μsa = average sleeping time
     • Users in sleeping mode do not interact at all with the other users of the community.

  12. Users dynamics (2) The evolution of the number of users in active or sleeping mode, Ua and Us respectively, can be described by two fluid differential equations. [Equations not preserved in the transcript. Term labels on the slide: sleeping users who become active; new users; active users who become sleeping (listed twice); active users who leave the system.]
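
A minimal sketch of how such a pair of fluid equations could be integrated numerically. The equations themselves are not preserved in this transcript, so the form used below is an assumption reconstructed from the term labels on slide 12 and the rates defined on slide 11: dUa/dt = λu + μsa·Us − (μas + μu)·Ua and dUs/dt = μas·Ua − μsa·Us. The function name, the forward-Euler scheme and the step size are illustrative choices, not taken from the presentation.

```python
def simulate_users(lam_u, mu_u, mu_as, mu_sa, ua0, us0, t_end, dt=1.0):
    """Forward-Euler integration of an ASSUMED fluid model for (Ua, Us)."""
    ua, us = float(ua0), float(us0)
    steps = int(round(t_end / dt))
    for _ in range(steps):
        # Assumed: new users + sleeping->active - active->sleeping - departures
        d_ua = lam_u + mu_sa * us - (mu_as + mu_u) * ua
        # Assumed: active->sleeping - sleeping->active
        d_us = mu_as * ua - mu_sa * us
        ua += dt * d_ua
        us += dt * d_us
    return ua, us


# Parameter values and initial condition as listed on slide 17 (times in seconds).
ua_final, us_final = simulate_users(
    lam_u=0.1, mu_u=1 / 4000, mu_as=1 / 400, mu_sa=1 / 400,
    ua0=10, us0=10, t_end=200_000,
)
print(ua_final, us_final)
```

Under the assumed equations the fixed point is Ua = λu/μu and Us = Ua·μas/μsa, so with the slide 17 values both populations settle at 400 users.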

  13. Content Dynamics
     • The evolution of the number of available copies of a content is driven by two phenomena: the generation of new copies (downloads or off-on transitions) and the cancellation of existing copies.
     • θ = average request rate; 1/μh, 1/μ'h = average content holding time for active/sleeping users.
     • Note: ps = ps(μ'h) is the probability that sleeping users hold the content under consideration when they become active.

  14. Brownian Motion • Content dynamics are modelled through a second-order diffusion approximation. Each content is a particle with instantaneous position x(t), moving according to a Brownian motion. • The particle position obeys a Langevin equation; the evolution of the pdf f(x,t) follows the corresponding Fokker-Planck equation.
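
The equations on this slide did not survive extraction. For orientation only, the textbook one-dimensional Langevin equation and its Fokker-Planck equation (Itô convention) are given below, written with the drift m(x,t) and variance σ²(x,t) that slide 16 names; W(t) denotes a standard Wiener process, a symbol introduced here. The slide's exact notation and any extra source terms may differ.

```latex
\begin{align}
  dx(t) &= m(x,t)\,dt + \sigma(x,t)\,dW(t) \\
  \frac{\partial f(x,t)}{\partial t}
        &= -\frac{\partial}{\partial x}\bigl[m(x,t)\,f(x,t)\bigr]
           + \frac{1}{2}\,\frac{\partial^{2}}{\partial x^{2}}\bigl[\sigma^{2}(x,t)\,f(x,t)\bigr]
\end{align}
```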

  15. Content diffusion equation • The pdf F(x,t) of the number of copies follows the Fokker-Planck equation with boundary conditions [equation not preserved in the transcript; one term is labelled "introduction of new contents in the system"]. • A content can disappear when no more copies are available. The rate at which a content disappears is: [expression not preserved in the transcript]

  16. Diffusion Parameters • m(x,t) expresses the average speed at which the content-particle moves along the x axis. • hh = coefficient of variation of the holding time; hr = coefficient of variation of the inter-request time. • The variance σ2(x,t) expresses the burstiness of the processes.

  17. Case: Content disappearance (1)
     • In a single-content scenario we study the probability that the content disappears as a function of the users' dynamics.
     Network parameters:
     • λu = users' arrival rate = 0.1 users/s
     • 1/μu = avg subscription time = 4000 s
     • 1/μas = avg active period = 400 s
     • 1/μsa = avg sleeping period = 400 s
     • θ = average request rate
     • 1/μh, 1/μ'h = avg content holding time for active/sleeping users = 100 s
     Initial condition:
     • Active users = 10
     • Sleeping users = 10
     • Copies available = 1

  18. Case: Content disappearance (2) Which plot should we show? Model compared with Michele's simulator? Model only?

  19. Outline • Motivation • Basic Model • Extended Model • Content Search • Download effects

  20. Dual distribution • Relations between users’ and contents’ dynamics • The number of active and sleeping users at time t • The number of copies available at time t

  21. Dual equations • Ga(x,t) and Gs(x,t) are the pdfs of the number of active and sleeping users having x contents. [Equations not preserved in the transcript. Term labels on the slide: active users who become sleeping or leave the system; sleeping users who become active; new users.]

  22. Diffusion parameters • As for the content diffusion equation, m(x,t) expresses the average speed at which the copy-particle moves along the x axis, while σ2(x,t) expresses the variance of the associated process. • ra = rate of generation of new copies; da/s = rate of cancellation of existing copies.

  23. Multi-contents case (1)
     • In a multi-content scenario, still assuming ideal search and download, we study the steady-state distribution of the contents among users.
     Network parameters:
     • λu = users' arrival rate = 0 users/s
     • 1/μu = avg subscription time = ∞
     • 1/μas = avg active period = 6 h
     • 1/μsa = avg sleeping period = 18 h
     • θ = average request rate = 2 c/h
     • λc = contents' introduction rate = 1/600 c/s
     • 1/μh, 1/μ'h = avg content holding time for active/sleeping users = 10 h, 8 h
     Initial condition:
     • Active users = 2500
     • Sleeping users = 7500
     • Copies available = 1

  24. Multi-contents case (2) Which plots should we show? Model compared with Michele's simulator? Model only?

  25. Outline • Motivation • Basic Model • Extended Model • Content Search • Download effects

  26. The contents' transfer rate
     • In a non-ideal p2p system the transfer rate of the contents changes dynamically according to:
     • the probability of a successful search, phit(x,t) (related to content diffusion and to the search algorithm)
     • the probability of a successful download, pdown(x,t) (related to network congestion, user impatience, on-off dynamics)
     The effective retrieval rate becomes: [expression not preserved in the transcript]
     • Both the search and the download models require F(x,t) as input and provide their result as a function of time.

  27. Search Phase
     • Search algorithm: flooding in an unstructured p2p network. For each content request, a query message is forwarded to all the neighbors up to distance max_ttl.
     • Graph model: the P2P network topology is modeled as a finite random graph. [Figure legend: vertices are active peers, edges are application-level connections.] We consider a Generalized Random Graph (GRG) to allow an arbitrary vertex degree distribution.
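
The flooding rule described on this slide can be illustrated with a breadth-first search bounded by max_ttl hops. This is only a sketch: the adjacency-dictionary representation, the function name `flood_search` and the toy overlay are hypothetical, and real flooding protocols also handle duplicate suppression and message loss, which are ignored here.

```python
from collections import deque


def flood_search(adjacency, holders, source, max_ttl):
    """Return True if a peer within max_ttl hops of `source` holds the content."""
    visited = {source}
    frontier = deque([(source, 0)])
    while frontier:
        peer, dist = frontier.popleft()
        if peer != source and peer in holders:
            return True          # query reached a peer storing the content
        if dist == max_ttl:
            continue             # TTL exhausted: do not forward further
        for neighbor in adjacency.get(peer, ()):
            if neighbor not in visited:
                visited.add(neighbor)
                frontier.append((neighbor, dist + 1))
    return False


# Toy overlay (hypothetical): path 0 - 1 - 3 - 4, plus leaf 2 attached to 0.
overlay = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1, 4], 4: [3]}
found_ttl2 = flood_search(overlay, holders={4}, source=0, max_ttl=2)
found_ttl3 = flood_search(overlay, holders={4}, source=0, max_ttl=3)
```

In the toy overlay the only copy sits three hops from the requester, so the search fails with max_ttl = 2 and succeeds with max_ttl = 3.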

  28. GRG Model
     • Given the probability distribution {pk} that a vertex has k edges departing from it, we can define the generating function: [expression not preserved in the transcript]
     • It can be shown that the generating function of the number of first neighbors with a copy of the content is: [expression not preserved in the transcript], where α = x/Ua, x = number of copies, Ua = number of active users.
     • The composition of these generating functions gives the generating function of the number of neighbors at distance h.
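
As a hedged illustration of the generating-function idea: with the standard definition G0(z) = Σ pk z^k, and assuming (this is an assumption, the slide's formula is not preserved) that each neighbor holds a copy independently with probability α, the number of first neighbors holding a copy has generating function G0(1 − α + αz). The probability that no first neighbor holds a copy is then G0(1 − α). The degree distribution and the value of α below are made up for the example.

```python
from math import comb


def g0(z, degree_pmf):
    """Evaluate G0(z) = sum_k p_k z^k, with degree_pmf[k] = p_k."""
    return sum(p * z**k for k, p in enumerate(degree_pmf))


def copies_among_first_neighbors_pmf(alpha, degree_pmf):
    """pmf of the number of first neighbors holding a copy (binomial thinning)."""
    k_max = len(degree_pmf) - 1
    return [
        sum(p * comb(k, j) * alpha**j * (1 - alpha) ** (k - j)
            for k, p in enumerate(degree_pmf) if k >= j)
        for j in range(k_max + 1)
    ]


degree_pmf = [0.0, 0.2, 0.3, 0.3, 0.2]   # hypothetical p_0..p_4
alpha = 10 / 400                          # x / Ua, hypothetical values
thinned = copies_among_first_neighbors_pmf(alpha, degree_pmf)
p_no_copy = g0(1 - alpha, degree_pmf)     # equals thinned[0]
p_hit_one_hop = 1 - p_no_copy
```

The explicit thinned pmf is computed only to cross-check the generating-function shortcut; extending the search to larger distances would require composing generating functions as the slide describes.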

  29. GRG Topology
     • Now we can define the generating function for the number of neighbors at distance up to max_ttl that have a copy of the content: [expression not preserved in the transcript]. From it the hit probability is derived: [expression not preserved in the transcript]
     • To compute the pdf of the GRG node degree we adopt an M/M/∞ queue. Assuming that an external observer joins the network, the number of customers in the queue corresponds to the number of connections established by the observer.
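
A small sketch of the M/M/∞ analogy. It relies on the standard result that, in steady state, the number of customers in an M/M/∞ queue is Poisson distributed with mean equal to the arrival rate times the mean service time. Whether the presentation uses the stationary or the transient distribution is not stated, and the parameter names and values below are hypothetical.

```python
import math


def mm_inf_degree_pmf(connection_rate, mean_connection_lifetime, k_max):
    """Stationary M/M/inf occupancy pmf, read as a node degree distribution."""
    rho = connection_rate * mean_connection_lifetime   # mean number of connections
    return [math.exp(-rho) * rho**k / math.factorial(k) for k in range(k_max + 1)]


# Hypothetical values: 0.01 new connections per second, each lasting 400 s on average.
degree_pmf_mm_inf = mm_inf_degree_pmf(0.01, 400.0, k_max=60)
mean_degree = sum(k * p for k, p in enumerate(degree_pmf_mm_inf))
```

The resulting list can serve as the {pk} that a generalized random graph model takes as input.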

  30. Outline • Motivation • Basic Model • Extended Model • Content Search • Download effects

  31. Download Phase
     • Assumptions:
     • The transport network is ideal.
     • Bandwidth on the client side is infinite.
     • The peer from which the desired content is downloaded is chosen at random among those storing that content.
     The download dynamics at each peer are modelled by an M/G/1-PS queue.
     Problem: the download request rate arriving at peers is not known a priori! It depends on:
     • The distribution of contents among peers
     • The policy used by the system to distribute the load among peers

  32. Probability of successful download (1) Single Content Case
     • Let θ be the popularity of a content that is present in x copies in a network with Ua active peers. Download request rate: [expression not preserved in the transcript]
     • Assuming that the requests form a Poisson process, the queue becomes an M/G/1-PS queue with average delay: [expression not preserved in the transcript]
     • Given a download rate y = θs phit, the probability of successful download is: [expression not preserved in the transcript]
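
For reference, the mean sojourn time of an M/G/1 processor-sharing queue depends on the service-time distribution only through its mean: E[T] = E[S] / (1 − ρ), with ρ = λ·E[S] < 1. The sketch below encodes this standard result. The function and argument names are hypothetical, and the slide's own expression (which is not preserved) may be written in terms of content size and upload bandwidth instead.

```python
def mg1_ps_mean_delay(request_rate, mean_service_time):
    """Mean sojourn time of an M/G/1-PS queue; infinite if the queue is unstable."""
    rho = request_rate * mean_service_time   # offered load at the serving peer
    if rho >= 1.0:
        return float("inf")
    return mean_service_time / (1.0 - rho)


# Hypothetical numbers: 0.005 requests/s, 100 s to serve one download alone.
delay = mg1_ps_mean_delay(request_rate=0.005, mean_service_time=100.0)
```

With these illustrative numbers the load is 0.5, so the average download takes twice as long as it would on an idle peer.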

  33. Probability of successful download (2)
     • The overall probability of successful download is: [expression not preserved in the transcript], where F(x) is the pdf of the number of copies available for the content.
     Multiple Content Case
     • From F(x) we derive the probability that a peer has k contents, present in x copies: [expression not preserved in the transcript]
     • The overall download request rate seen by a peer is: [expression not preserved in the transcript]

  34. Probability of successful download (3)
     • Since all Z(x) are independent, we can approximate the distribution of Y around its average with a normal distribution.
     • The probability of successful download becomes: [expression not preserved in the transcript]
     Notes:
     • my and σy are the first two moments of Y.
     • The integral is restricted to the interval [not preserved in the transcript] for numerical reasons.

  35. Conclusions
     • We defined a stochastic fluid model of a p2p system that describes users' and contents' dynamics both in the transient and in the stationary regime.
     • A supporting model makes it possible to account for the effects of search and download on system performance.
     Work in progress:
     • Analytical solution of the equations in steady state
     • Model extension to classes of different users
     • Model extension to classes of different contents
     • Comparison between model and simulations in realistic scenarios

  36. Thank you!
