This paper explores peer-assisted content distribution networks and their techniques and challenges compared to traditional Content Delivery Networks (CDNs). It highlights the advantages of peer-to-peer (P2P) systems, such as reduced infrastructure costs and scalability, while addressing quality of service (QoS) concerns. The study also investigates strategies for integrating P2P with CDN for optimized content delivery. Key findings focus on traffic shaping, neighbor selection algorithms like BitTorrent's, and methods to mitigate issues related to ISP clustering.
Peer-Assisted Content Distribution Networks: Techniques and Challenges Pei Cao Stanford University
Traditional Intra-Provider Content Distribution Networks [Figure: hierarchy of National Center, Regional Centers, Branches, and Users]
Peer-to-Peer Content Distribution [Figure: National Center, Regional Centers, Branches, and Users]
P2P vs CDN • P2P: • No infrastructure cost • Supply grows linearly with demand • Simple distributed, randomized algorithms • No QoS • CDN: • Initial infrastructure cost • Centralized scheduling algorithms • Network efficiency • Capable of supporting QoS
Combine P2P with CDN? • Use P2P to complement CDN • P2P reduces load on the CDN, covers areas where CDN is not installed • Must be able to control, or “shape”, P2P traffic • Use CDN to complement P2P • CDN steps in when peer-based distribution is falling short, enabling QoS • Must be able to detect when peers won’t meet the delivery time guarantee
Outline • Review of BitTorrent • Traffic-shaping BitTorrent: biased neighbor selection • QoS in BitTorrent: delivery time prediction
BitTorrent File Sharing Network Goal: replicate K chunks of data among N nodes • Form neighbor connection graph • Neighbors exchange data
BitTorrent: Neighbor Selection [Figure: Tracker, file.torrent, Seed holding the whole file, peers 1-5, and peer A]
BitTorrent: Piece Replication [Figure: Tracker, file.torrent, Seed holding the whole file, peers 1, 3, 5, and peer A]
BitTorrent: Piece Replication Algorithms • “Tit-for-tat” (choking/unchoking): • Each peer only uploads to 7 other peers at a time • 6 of these are chosen based on amount of data received from the neighbor in the last 20 seconds • The last one is chosen randomly, with a 75% bias toward newcomers • (Local) Rarest-first replication: • When peer 3 unchokes peer A, A selects which piece to download
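The two rules on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the reference client: the function names, the data layout, and the reading of the "75% bias" as a 3-to-1 weight in favor of newcomers are all assumptions made for the example.

```python
import random
from collections import Counter

def choose_unchoked(received_bytes, newcomers, regular_slots=6, rng=random):
    """Tit-for-tat sketch: unchoke the top contributors plus one optimistic pick.

    received_bytes maps neighbor -> bytes received from it in the last window
    (20 seconds on the slide). newcomers is the set of recently joined neighbors.
    """
    ranked = sorted(received_bytes, key=received_bytes.get, reverse=True)
    unchoked = ranked[:regular_slots]
    rest = ranked[regular_slots:]
    if rest:
        # Assumption: "75% bias" is modeled as weighting a newcomer 3x.
        weights = [3 if peer in newcomers else 1 for peer in rest]
        unchoked.append(rng.choices(rest, weights=weights, k=1)[0])
    return unchoked

def pick_rarest_piece(my_pieces, uploader_pieces, neighbor_piece_sets, rng=random):
    """Local rarest-first: among pieces the uploader has and we lack,
    pick one held by the fewest of our neighbors (ties broken randomly)."""
    candidates = uploader_pieces - my_pieces
    if not candidates:
        return None
    counts = Counter(p for pieces in neighbor_piece_sets for p in pieces)
    rarest = min(counts[p] for p in candidates)
    return rng.choice(sorted(p for p in candidates if counts[p] == rarest))
```

With six regular slots and one optimistic slot, the peer uploads to at most 7 neighbors at a time, matching the figure on the slide.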
Analysis of BitTorrent • Conclusion from modeling studies: BitTorrent is nearly optimal in idealized, homogeneous networks • Demonstrated by simulation studies • Confirmed by theoretical modeling studies • Intuition: in a random graph, Prob(Peer A’s content is a subset of Peer B’s) ≤ 50%
Random Neighbor Graph • Existing studies all assume random neighbor selection • BitTorrent no longer optimal if nodes in the same ISP only connect to each other • Random neighbor selection → high cross-ISP traffic
Difficulty in Traffic-Shaping P2P Applications • ISPs: • Different links have different monetary costs • Prefer “clustering” of traffic • P2P Applications: • No knowledge of underlying ISP topology • Use randomized algorithms that don’t do well under clustering • Current solution: throttling → users suffer
A Network-Friendly BitTorrent? • ISPs inform BitTorrent of their link preferences • BitTorrent’s algorithm is adjusted so that both users and ISPs benefit • Example: Biased Neighbor Selection • Works when cost function is transitive
Biased Neighbor Selection • Idea: of N neighbors, choose N-k from peers in the same ISP, and choose k randomly from peers outside the ISP
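A minimal Python sketch of the idea, assuming the selector already knows each peer's ISP. The function name and the `isp_of` mapping are hypothetical, and the sketch does not top up the list when an ISP has fewer than N-k other peers, which the slide leaves unspecified.

```python
import random

def biased_neighbors(peer, isp_of, n, k, rng=random):
    """Return up to n neighbors for `peer`: n-k from its own ISP, k from outside.

    isp_of maps every known peer to an ISP label (hypothetical data layout).
    """
    home = isp_of[peer]
    internal = [p for p in isp_of if p != peer and isp_of[p] == home]
    external = [p for p in isp_of if isp_of[p] != home]
    inside = rng.sample(internal, min(n - k, len(internal)))
    outside = rng.sample(external, min(k, len(external)))
    return inside + outside

# Usage with the simulation scale from the evaluation slide (14 ISPs x 50 peers).
isp_of = {f"isp{i}-peer{j}": f"isp{i}" for i in range(14) for j in range(50)}
neighbors = biased_neighbors("isp0-peer0", isp_of, n=35, k=1)
```

Setting k = n recovers purely external selection, and k = 0 gives strict clustering within the ISP.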
Implementing Biased Neighbor Selection • By Tracker • Need ISP affiliations of peers • Peer to AS maps • Public IP address ranges from ISPs • Special “X-” HTTP header • By traffic shaping devices • Intercept “peer tracker” messages and manipulate responses • No need to change tracker or client
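For the tracker-side option based on public IP address ranges, the lookup could look like the sketch below. The ISP names and prefixes are hypothetical (the prefixes are reserved documentation ranges), and a real tracker would load a peer-to-AS map or ranges published by the ISPs instead.

```python
import ipaddress

# Hypothetical ISP address ranges, for illustration only.
ISP_RANGES = {
    "isp-a": [ipaddress.ip_network("192.0.2.0/24")],
    "isp-b": [
        ipaddress.ip_network("198.51.100.0/24"),
        ipaddress.ip_network("203.0.113.0/24"),
    ],
}

def isp_for(ip, ranges=ISP_RANGES):
    """Return the ISP label whose published ranges contain `ip`, or None."""
    addr = ipaddress.ip_address(ip)
    for isp, networks in ranges.items():
        if any(addr in net for net in networks):
            return isp
    return None
```

A linear scan is fine for a handful of prefixes; a tracker with a full routing table would more likely use a longest-prefix-match structure.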
Evaluation Methodology • Event-driven simulator • Use actual client and tracker code as much as possible • Calculate bandwidth contention, assume perfect fair-share from TCP • Network settings • 14 ISPs, each with 50 peers, 100Kb/s upload, 1Mb/s download • Seed node, 400Kb/s upload • Optional “university” nodes (1Mb/s upload) • Optional ISP bottleneck to other ISPs
Throttling: Cross-ISP Traffic • Redundancy: average # of times a data chunk enters the ISP
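The redundancy metric can be computed from a transfer log as in this sketch. The `(chunk, source ISP, destination ISP)` record format is an assumption made for the example, not the simulator's actual output.

```python
def traffic_redundancy(transfers, isp, num_chunks):
    """Average number of times a chunk enters `isp` from outside.

    transfers is an iterable of (chunk_id, src_isp, dst_isp) records
    (hypothetical log format); num_chunks is the number of chunks in the file.
    """
    entering = sum(1 for _chunk, src, dst in transfers if dst == isp and src != isp)
    return entering / num_chunks

log = [(0, "A", "B"), (0, "C", "B"), (1, "A", "B"), (1, "B", "B"), (0, "B", "A")]
redundancy_b = traffic_redundancy(log, "B", num_chunks=2)
```

When the seed sits outside the ISP, the best achievable value is 1: every chunk crosses the ISP boundary exactly once and is then replicated internally.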
Importance of Rarest-First Replication • Random piece replication performs badly • Increases download time by 84% - 150% • Increases traffic redundancy from 3 to 14 • Biased neighbors + Rarest-First → more uniform progress of peers
Presence of External High-Bandwidth Peers • Biased neighbor selection alone: • Average download time same as regular BitTorrent • Cross-ISP traffic increases as # of “university” peers increases • Result of tit-for-tat • Biased neighbor selection + Throttling: • Download time only increases by 12% • Most neighbors do not cross the bottleneck • Traffic redundancy (i.e. cross-ISP traffic) same as the scenario without “university” peers
Comparison with Simple Clustering • Gateway peer: only one peer connects to the peers outside the ISP, all other peers only connect to peers inside the ISP • Gateway peer must have high bandwidth • It is the “seed” for this ISP • Ends up benefiting peers in other ISPs
Combining Biased Neighbor Selection with Caches • Under random neighbor selection • bandwidth requirement of cache is high • Under biased neighbor selection • bandwidth needed from the cache is reduced by an order of magnitude
Conclusions • By choosing neighbors well, BitTorrent can achieve high peer performance without increasing ISP cost • Biased neighbor selection: choose initial set of neighbors well • Can be combined with throttling and caching • BitTorrent’s algorithm can be shaped!
Motivation • Provide delivery time guarantee under P2P+CDN • What contributes to delivery time of a download via BitTorrent? • From simulations: seed bandwidth and even replication of blocks • Missing: node join/leave dynamics, TCP effects, etc.
Side-by-Side Live Experiments • Two clients, running on the same machine, starting at the same time, downloading the same file • 13 experiments from Apr-May 2006 • File sizes: 700MB ~ 1.4GB • Network size: 1100 ~ 2100 peers • Duration: 10 hrs ~ 2 days
Results from Experiments • Effective download rate: 10 ~ 30KB/s • Speed difference between the two peers: 3% ~ 82% • What made the slower peer slow?
Suspicion #1: Slower Neighbors? • Calculate unweighted average of observed throughput at application level • R1: average from all neighbors • R2: average from neighbors uploading >250KB of data • R3: average from neighbors uploading >2.5MB of data • Low correlation between download-time ratio and neighbor-speed ratio • 0.57 for R1, 0.43 for R2, 0.47 for R3 • Faster neighbors correspond to slower downloads in 3 experiments
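The three averages and the correlation check could be computed as sketched below. The `(bytes uploaded, observed rate)` tuple layout and the decimal reading of KB and MB are assumptions, and Pearson's coefficient is used as one plausible choice since the slide does not name the correlation measure.

```python
from math import sqrt

def mean_rate(neighbors, min_bytes=None):
    """Unweighted mean throughput over neighbors that uploaded more than min_bytes.

    neighbors is a list of (bytes_uploaded, observed_rate) pairs (hypothetical layout).
    min_bytes=None keeps every neighbor (R1).
    """
    rates = [rate for uploaded, rate in neighbors
             if min_bytes is None or uploaded > min_bytes]
    return sum(rates) / len(rates) if rates else None

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

neighbors = [(100_000, 5.0), (300_000, 15.0), (3_000_000, 40.0)]
r1 = mean_rate(neighbors)             # all neighbors
r2 = mean_rate(neighbors, 250_000)    # uploaded > 250KB (decimal KB assumed)
r3 = mean_rate(neighbors, 2_500_000)  # uploaded > 2.5MB (decimal MB assumed)
```

Computing the R values for both clients in each experiment gives one neighbor-speed ratio per experiment, which `pearson` can then compare against the download-time ratios.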
Suspicion #2: Fewer Neighbors Uploading to the Peer? • Slot analysis: calculate download concurrency • Maximum number of neighbors: 35 • Neighbors come and go → align neighbors into 35 slots • Calculate time-average of number of concurrent slots with neighbors uploading • Upload concurrency varies from 7 to 11 • Explains one of the download-time/neighbor-speed reversal cases • But doesn’t explain the two others
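The time-averaged concurrency can be sketched as follows, assuming the trace has already been reduced to one `(start, end)` interval per period in which a neighbor was uploading. This input format is hypothetical, and the sketch skips the explicit alignment into 35 slots because summing the busy time gives the same average.

```python
def average_concurrency(upload_intervals, t_start, t_end):
    """Time-average of the number of neighbors uploading during [t_start, t_end].

    upload_intervals is a list of (start, end) times, one per upload period
    (hypothetical trace format). Periods are clipped to the observation window.
    """
    busy_time = sum(
        max(0.0, min(end, t_end) - max(start, t_start))
        for start, end in upload_intervals
    )
    return busy_time / (t_end - t_start)

# Two neighbors upload for 10 time units each within a 20-unit window.
concurrency = average_concurrency([(0, 10), (5, 15), (20, 30)], 0, 20)
```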
“Close” Neighbors • 90% of data downloaded from 1-4% of neighbors • Let F(p) and G(p) be the number of neighbors that provide a fraction p of the data to peers F and G; then F(p) > G(p) → peer F is slower than G • This holds for p = 90%, 75%, and 50%
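One way to compute the count used on this slide is sketched below, reading it as the smallest number of neighbors that together supply a fraction p of the downloaded data. That reading, the function name, and the input layout are assumptions.

```python
def neighbors_for_fraction(bytes_from_neighbor, p):
    """Smallest number of neighbors that together supplied at least fraction p.

    bytes_from_neighbor maps neighbor -> bytes downloaded from it
    (hypothetical layout). Neighbors are taken largest contributor first.
    """
    total = sum(bytes_from_neighbor.values())
    running = 0
    ordered = sorted(bytes_from_neighbor.values(), reverse=True)
    for count, amount in enumerate(ordered, start=1):
        running += amount
        if running >= p * total:
            return count
    return len(bytes_from_neighbor)

peer_f = {"n1": 400, "n2": 300, "n3": 200, "n4": 100}
peer_g = {"n1": 950, "n2": 50, "n3": 30, "n4": 20}
f_half = neighbors_for_fraction(peer_f, 0.5)
g_half = neighbors_for_fraction(peer_g, 0.5)
```

In this toy data peer F needs more neighbors than peer G to reach half of its data, which under the slide's observation would mark F as the slower peer.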
What makes a neighbor close? • Not related to speed, or order of connection to peer, or order of unchoking by peer
Cost of Departure of a Close Neighbor • Departure cost: if one close neighbor leaves, calculate the time until the earliest next close neighbor • The average departure cost: 30 min • Convergence of the tit-for-tat algorithm is slow
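A sketch of the departure-cost calculation, assuming the trace provides the times at which close neighbors left and the times at which neighbors that later became close first appeared. Both inputs, and the choice to skip departures with no later arrival, are assumptions about how the measurement was done.

```python
from bisect import bisect_right

def departure_costs(departure_times, close_arrival_times):
    """For each close-neighbor departure, time until the next close neighbor appears.

    Departures with no later close neighbor are skipped (an assumption).
    """
    arrivals = sorted(close_arrival_times)
    costs = []
    for left_at in departure_times:
        idx = bisect_right(arrivals, left_at)  # first arrival strictly after departure
        if idx < len(arrivals):
            costs.append(arrivals[idx] - left_at)
    return costs

# Times in seconds since the start of the download (made-up values).
costs = departure_costs([100, 500], [50, 160, 900])
average_cost = sum(costs) / len(costs)
```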
Why Do Close Neighbors Leave? • Five possible reasons • A: Random disconnect • B: Finished downloading • C: Peer broke off the relationship • D: Neighbor broke off the relationship • Results: B is most common, followed by C/D, then A
Conclusions • Content delivery time in BitTorrent is determined by: • Neighbor upload speed • Stability of neighbor relationship • Disruption of the pairing leads to long delivery time • Neighbors may leave due to random disconnection, completion of download, or finding faster neighbors
Using CDN to Complement P2P • Use CDN nodes as high-speed, specially managed seeds • Seeds are called to help whenever a node loses a close neighbor
Summary • A way to shape BitTorrent traffic • Predicting BitTorrent performance by monitoring close peer relationships
Related Work • Many modeling studies of BitTorrent • Simulation studies • Measurements of real torrents
Ongoing Work • Live experiments with biased neighbor selection • A k-regular graph algorithm with faster convergence • Prototype implementation of “P2P+CDN”