## Study on Network Size Estimation Schemes for Peer-to-Peer Networks

Hosik Cho (hscho@mmlab.snu.ac.kr), 2008/02/19

**Some Questions**
• How many people are in this room?
• Why do you think that?
• How many people are on this campus?
• Can you count them all?
• How many nodes are there in a P2P network spread over the world?

**Contents**
• Peer-to-peer networks
• Network size estimation
• Estimation methods
  • Unstructured P2P
  • Structured P2P
• Conclusion

**P2P networks**
• A peer-to-peer overlay network connects peers logically on top of IP.
  • Unstructured P2P: Gnutella, Freenet
  • Structured P2P: Chord, CAN, Pastry, …
• P2P applications
  • File sharing systems (Kazaa, Gnutella)
  • Video over IP (CoolStreaming)
  • Voice over IP (Skype)

**P2P networks**
• Characteristics
  • Scalable
  • Self-organizing
  • Resilient to failure
  • Fully decentralized
• As a consequence, monitoring the system and obtaining global statistics become much more complex.

**Network size estimation**
• Network size (N) is used for:
  • Load balancing
  • Restricted broadcasting
  • Determining network parameters
• For unstructured P2P networks, most approaches are based on broadcasting.
• For structured P2P networks, the size can be inferred directly from the density of identifiers.

**Related Works**
• Unstructured P2P
  • Sample & Collide
  • Hops Sampling
  • Gossip-based aggregation
• Structured P2P
  • Token passing
  • Neighbor sampling
  • Finger sampling

**Sample&Collide (1)**
• “Birthday paradox”: in a group of 23 people, the probability that two of them share a birthday is at least 50%.
• The initiator samples nodes uniformly at random until a sample returns a node that has already been selected.
• The number of samples X needed is on the order of $\sqrt{2N}$.
• The system size is therefore estimated as $X^2/2$.

**Sample&Collide (2)**
• The initiator sets T > 0.
• It sends the message to its neighbors.
• Each node that receives the message picks a random number U (uniform on (0, 1)) and decrements T by $-\log(U)/d_i$, where $d_i$ is the node's degree.
• If T > 0, the node forwards the message.
• If T ≤ 0, the node returns its ID to the initiator (this is the sample).

**HopsSampling (1)**
• Probabilistic polling approach.
• An initiator spreads messages in the network and estimates the system size from the replies it gets back.
• If hopCount < minHopsReporting, a response is sent with probability 1.
• Otherwise, the response is sent with probability $1/2^{(\text{hopCount} - \text{minHopsReporting})}$.
• If minHopsReporting = 2, only 25% of the nodes at distance 4 report back.

**HopsSampling (2)**
• The initiator sets hopCount = 0.
• It sends the message to its neighbors.
• If hopCount < minHopsReporting, the receiving node sends a response.
• Otherwise, it sends a response with a probability that depends on hopCount.

**Gossip-based (1)**
• Epidemic-based approach.
• If exactly one node in the system holds the value 1 and all the others hold 0, the average is 1/N.
• An initiator takes the value 1 and starts gossiping.
• The nodes it reaches join the process by setting their value to 0.
• At each cycle, each node in the network chooses one of its neighbors and exchanges its estimation parameter with it.

**Gossip-based (2)**
• Estimation ← (Estimation + neighbor's_Estimation) / 2
• To provide correct estimates, the algorithm must wait for a certain number of rounds to elapse before computing the size estimate.
• This period is the time required for the gossip to propagate through the whole network and for the values to converge.

**N Estimation in S-P2P**
• Assumptions
  • IDs are uniformly distributed.
  • Each node knows the total number of nodes (N) in the system.
  • Nodes do not join and leave frequently.

**Basic approaches**
[Figure: (a) Token passing; (b) Neighbor sampling]

**N Estimation in S-P2P**
• In actually deployed systems:
  • Nodes join and leave frequently.
  • A node must estimate how long a query takes to reach its destination: O(log N).
  • Proximity-based identifiers are adopted for efficient routing.
    • AS number
    • Geographic location

**Uniformity of Identifiers**
[Figure panels: Myth; Real]

**Estimation result (1)**
[Figure panels: uniformly distributed IDs; proximity IDs]

**Extended approach**
• Structured P2P systems maintain fingers, routing tables, contacts, etc.
• N can be estimated more precisely by using this structural information.

**Estimation result (2)**
[Figure panels: uniformly distributed IDs; proximity IDs]

**Conclusion**
• For unstructured P2P
  • There is a tradeoff between the quality of the estimate and the associated overhead.
  • The algorithm should be chosen according to its objectives and applications.
• For structured P2P
  • The distribution of identifiers may be skewed.
  • Using structural information makes the estimation results more accurate.
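
**Sketch: Sample&Collide estimate**
The birthday-paradox idea on the Sample&Collide slides can be tried out in a few lines. This is only a minimal sketch under a strong assumption: uniform node samples are drawn directly with a random number generator instead of through the T-based message forwarding, and the names `sample_and_collide` and `average_estimate` are hypothetical helpers, not part of the presentation.

```python
import random


def sample_and_collide(population_size, rng):
    """Draw uniform samples until one repeats, then return X^2 / 2.

    Assumption: rng.randrange stands in for a uniform node sample; the
    slides obtain such a sample by forwarding a message through the overlay.
    """
    seen = set()
    draws = 0
    while True:
        node = rng.randrange(population_size)
        draws += 1
        if node in seen:  # first collision: stop sampling
            break
        seen.add(node)
    return draws * draws / 2


def average_estimate(population_size, trials, seed=0):
    """Average several independent estimates to reduce their large variance."""
    rng = random.Random(seed)
    total = sum(sample_and_collide(population_size, rng) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    print(average_estimate(10_000, 2_000))
```

A single run is very noisy, which is why the sketch averages many runs; a deployed scheme would pay for each extra run with additional messages.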
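
**Sketch: HopsSampling reply probability**
The reply rule on the HopsSampling slides is easy to check numerically. The first function below follows the rule as stated there. The second one, `estimate_size`, is an assumption added for illustration: it weights each reply by the inverse of its reporting probability, which is one natural way for the initiator to extrapolate from partial replies and is not spelled out in the slides.

```python
def report_probability(hop_count, min_hops_reporting):
    """Probability that a node at distance hop_count replies."""
    if hop_count < min_hops_reporting:
        return 1.0
    return 1.0 / 2 ** (hop_count - min_hops_reporting)


def estimate_size(replies_by_hop, min_hops_reporting):
    """Assumed extrapolation: each reply counts for 1 / p nodes.

    replies_by_hop maps a hop count to the number of replies received
    from that distance. The initiator itself is not counted.
    """
    return sum(
        count / report_probability(hop, min_hops_reporting)
        for hop, count in replies_by_hop.items()
    )


if __name__ == "__main__":
    # With minHopsReporting = 2, nodes at distance 4 reply with probability 0.25.
    print(report_probability(4, 2))
    print(estimate_size({1: 3, 2: 6, 3: 5, 4: 2}, 2))
```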
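
**Sketch: gossip-based averaging**
The averaging step on the Gossip-based slides can be simulated as follows. This is a rough sketch, assuming a fully connected overlay in which every node can pick any other node as its gossip partner; a real overlay would restrict the choice to actual neighbors. `gossip_size_estimate` is a hypothetical name.

```python
import random


def gossip_size_estimate(num_nodes, rounds, seed=0):
    """Return every node's size estimate (1 / value) after some rounds."""
    rng = random.Random(seed)
    values = [0.0] * num_nodes
    values[0] = 1.0  # the initiator holds 1, all other nodes hold 0
    for _ in range(rounds):
        for node in range(num_nodes):
            # Assumption: any node may be chosen as the gossip partner.
            peer = rng.randrange(num_nodes)
            mean = (values[node] + values[peer]) / 2
            values[node] = values[peer] = mean  # both sides keep the average
    return [1.0 / v if v > 0 else float("inf") for v in values]


if __name__ == "__main__":
    estimates = gossip_size_estimate(100, 50)
    print(min(estimates), max(estimates))
```

Each exchange preserves the sum of all values, so the values can only converge to 1/N. Reading an estimate too early gives a poor result, which is the waiting period the slide refers to.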
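
**Sketch: size from identifier density**
For structured P2P, the slides state that the size can be inferred from the density of identifiers but do not give a formula. The sketch below shows one plausible reading, stated as an assumption: identifiers live on a ring normalized to [0, 1), a node looks at the span covered by its next k successors, and it estimates N as k divided by that span. `estimate_from_density` is a hypothetical helper.

```python
def estimate_from_density(sorted_ids, start, k):
    """Estimate N from the ring span covered by k successors of one node.

    Assumptions: identifiers are floats in [0, 1), sorted in ascending
    order, and k is smaller than the number of nodes.
    """
    n = len(sorted_ids)
    first = sorted_ids[start]
    last = sorted_ids[(start + k) % n]
    span = (last - first) % 1.0  # handles wrap-around past 1.0
    return k / span


if __name__ == "__main__":
    ids = [i / 1000 for i in range(1000)]  # perfectly uniform identifiers
    print(estimate_from_density(ids, 10, 8))
```

With perfectly uniform identifiers the estimate recovers N. With skewed, proximity-based identifiers the local span no longer reflects the global density, which is the problem the later slides point out.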