This paper explores the scalability challenges of Gnutella-like peer-to-peer (P2P) systems and presents Gia, a system designed to improve scalability and query efficiency. Rather than relying on Distributed Hash Tables (DHTs), Gia combines topology adaptation, active flow control, one-hop replication, and biased random walks to balance load among peers of varying capacities. The evaluation compares Gia against Gnutella- and KaZaA-style models, highlighting its ability to avoid flooding, sustain higher aggregate query rates, and remain robust under high churn.
Gia: Making Gnutella-like P2P Systems Scalable HY558 Giorgos Saloustros (gesalous@csd.uoc.gr)
Roadmap • Introduction • P2P systems • Distributed Hash Tables (DHT) • Gia Design And Implementation • Evaluation • Questions
P2P Systems • Distributed systems with no central servers • Peers form an overlay network • All peers have equivalent functionality • Advantages: scalability, resource utilization, reduced administrative costs • Lookup is the core problem every P2P network must address
Early P2P Systems • Napster • Centralized file index • P2P file transfer • Gnutella (GNU + Nutella) • Unstructured overlay network; topology and placement of files are unconstrained • Uses simple flooding to search for files among peers • KaZaA • Supernodes and ordinary nodes • Flooding only between supernodes
Distributed Hash Tables (DHT) • Hash tables distributed over multiple nodes • Lookup requires O(log n) steps • Example: Chord • Why not use a DHT? • Sensitive to high churn rates • Supports only exact-match lookups, while users need keyword searches • Most queries are for popular, well-replicated objects, which unstructured search finds easily (illustrated below)
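As a back-of-the-envelope illustration of the O(log n) claim, assuming a Chord-like protocol that at least halves the remaining ID-space distance per hop:

```python
import math

# Each Chord hop at least halves the remaining distance in the ID space,
# so a lookup among n nodes takes about log2(n) hops.
for n in (1_000, 1_000_000, 1_000_000_000):
    print(f"n = {n:>13,} nodes -> ~{math.ceil(math.log2(n))} hops")
```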
Roadmap • Introduction • P2P systems • Distributed Hash Tables (DHT) • Gia Design And Implementation • Evaluation • Questions
Gia • Goals • Scalability • Higher aggregate query rates(!) • Gia design principles • Topology adaptation: nodes migrate toward high-capacity nodes • Active flow control • One-hop replication of content indices • Biased random walks
Topology Adaptation • Bootstrapping via a host cache or an equivalent scheme • Constructs an overlay in which low-capacity nodes stay close to high-capacity nodes • Driven by a per-node satisfaction metric • Node degree is kept within [min_nbrs, max_nbrs]
Satisfaction Calculation Algorithm • Each node periodically recalculates its satisfaction level S ∈ [0, 1] and schedules the next adaptation attempt after an interval I = T · K^-(1-S) • T: maximum interval between iterations • K: aggressiveness of the adaptation (an unsatisfied node adapts up to K times more often)
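A minimal Python sketch of this loop, following the paper's definitions; representing each neighbour as a (capacity, degree) pair is my simplification:

```python
def satisfaction(own_capacity, neighbors):
    """Satisfaction level S in [0, 1]: each neighbour contributes its
    capacity divided by its degree (the share of that neighbour this
    node can claim), normalized by the node's own capacity and capped
    at 1. neighbors is a list of (capacity, degree) pairs."""
    share = sum(capacity / degree for capacity, degree in neighbors)
    return min(1.0, share / own_capacity)

def adaptation_interval(S, T, K):
    """Next adaptation attempt after I = T * K^-(1 - S): a fully
    satisfied node (S = 1) waits the maximum interval T, while a
    totally unsatisfied node (S = 0) retries every T / K."""
    return T * K ** -(1.0 - S)
```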
Flow Control • Active flow control: each node assigns tokens to its neighbours • X forwards a query to Y only if X has received a token from Y • Queries are therefore never dropped, which matters because a random walk keeps only one copy of each query in flight • The token allocation rate varies with each peer's query-processing capability and queue occupancy • Tokens are handed out via Start-time Fair Queueing, with each neighbour's advertised capacity as its weight
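A hedged sketch of the token hand-out via Start-time Fair Queueing; the class shape and the unit cost per token are my assumptions, not the paper's code:

```python
import heapq

class TokenAllocator:
    """Sketch of Gia-style active flow control: a node grants query
    tokens to neighbours using Start-time Fair Queueing (SFQ), with
    each neighbour's advertised capacity as its weight. A neighbour
    may send us a query only while it holds one of our tokens."""

    def __init__(self):
        self.virtual_time = 0.0
        self.heap = []     # (start_tag, neighbor_id), always-backlogged flows
        self.finish = {}   # neighbor_id -> finish tag of its last token
        self.weight = {}   # neighbor_id -> advertised capacity

    def add_neighbor(self, nid, capacity):
        self.weight[nid] = capacity
        start = max(self.virtual_time, self.finish.get(nid, 0.0))
        heapq.heappush(self.heap, (start, nid))

    def next_token(self):
        """Grant one token to the neighbour with the smallest start tag,
        then re-enqueue it. Each token costs 1/weight units of virtual
        time, so high-capacity neighbours receive proportionally more."""
        start, nid = heapq.heappop(self.heap)
        self.virtual_time = start
        self.finish[nid] = start + 1.0 / self.weight[nid]
        heapq.heappush(self.heap, (self.finish[nid], nid))
        return nid
```

With neighbours of capacity 10 and 1, repeated calls to next_token() grant roughly ten tokens to the first for every one granted to the second.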
One-hop Replication • Content indices are exchanged when a connection is established and updated incrementally afterwards • High-capacity peers answer queries on behalf of their low-capacity neighbours, acting as proxies
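A small sketch of the index exchange, assuming plain keyword-in-filename matching; the class and method names are illustrative only:

```python
class Node:
    """One-hop replication: each node indexes the file lists of its
    direct neighbours, so a query arriving at a high-capacity node can
    be answered on behalf of a low-capacity neighbour."""

    def __init__(self, nid, files):
        self.nid = nid
        self.files = set(files)
        self.neighbor_index = {}   # neighbor_id -> set of file names

    def connect(self, other):
        # On connection, exchange full content indices; later changes
        # would be pushed as incremental updates.
        self.neighbor_index[other.nid] = set(other.files)
        other.neighbor_index[self.nid] = set(self.files)

    def match(self, keyword):
        """Return (owner_id, file) pairs for local and one-hop matches."""
        hits = [(self.nid, f) for f in self.files if keyword in f]
        for nid, files in self.neighbor_index.items():
            hits += [(nid, f) for f in files if keyword in f]
        return hits
```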
Search Protocol • Biased random walk: forward each query to the highest-capacity neighbour from which a flow-control token has been received • Queries carry a GUID so nodes can recognize duplicates and send retries down different paths • TTL and max_responses bound propagation • Advantage: reduces flooding and congestion • Disadvantage: sensitive to peer failures • Solution: keep-alive messages, with app-level retries (sketched below)
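One step of the biased walk as a minimal sketch; the Peer and Query fields are my assumptions about the state each node would track:

```python
from dataclasses import dataclass, field

@dataclass
class Peer:
    capacity: float
    tokens: int = 0                      # tokens this peer has granted us
    seen: set = field(default_factory=set)  # GUIDs this peer already handled

@dataclass
class Query:
    guid: str
    ttl: int

def next_hop(neighbors, query):
    """Among neighbours that have granted us a token and have not yet
    seen this query's GUID, forward to the highest-capacity one."""
    if query.ttl <= 0:
        return None
    candidates = [p for p in neighbors
                  if p.tokens > 0 and query.guid not in p.seen]
    if not candidates:
        return None                      # walk stalls; sender may retry
    best = max(candidates, key=lambda p: p.capacity)
    best.tokens -= 1                     # spend the token
    best.seen.add(query.guid)            # remote node remembers the GUID
    query.ttl -= 1
    return best
```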
Roadmap • Introduction • P2P systems • Distributed Hash Tables (DHT) • Gia Design And Implementation • Evaluation • Questions
Evaluation • Gia is compared against • FLOOD: the Gnutella model • RWRT: random walks over random topologies • SUPER: the KaZaA model, flooding only between supernodes • The query rate is equal for all peers, bounded only by their capacity
Evaluation • Gia: random graph with topology adaptation, TTL = 1024 • min_nbrs = 3, max_nbrs = 128 • min_alloc = 4 • max_nbrs = min(max_nbrs, capacity / min_alloc) • RWRT: random graph with uniform degree distribution, TTL = 1024, average degree 8 • FLOOD: random graph with uniform degree distribution, TTL = 10, average degree 8 • SUPER: random graph among supernodes; ordinary nodes connect randomly to one supernode (TTL = 10)
CP and CP-HC • Collapse Point (CP): the per-node query rate at the knee of the curve, where the query success rate drops below 90%; CP is the metric for total system capacity • Hop-count before collapse (CP-HC): the average hop count of successful queries just before the collapse point
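Reading CP off a measured curve is mechanical; a small sketch, assuming measurements arrive as (query_rate, success_rate) pairs sorted by rate:

```python
def collapse_point(curve, threshold=0.9):
    """Return the highest per-node query rate whose success rate is
    still at least the threshold (90% in the paper)."""
    cp = 0.0
    for rate, success in curve:
        if success >= threshold:
            cp = rate
    return cp

# Example: success stays above 90% up to 0.4 queries/sec per node.
print(collapse_point([(0.1, 0.99), (0.2, 0.97), (0.4, 0.92), (0.8, 0.60)]))
```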
Performance Comparison • max_responses = 1
Robustness • Figures: CP and CP-HC under node churn