
The BitTorrent content distribution system

CS217 Advanced Topics in Internet Research. Guest lecture by Nikitas Liogkas, 5/11/2006. Motivation: the flash crowd (aka Slashdot) effect, with many clients and few servers. Problem: servers cannot handle the load. Solution: swarming.


Presentation Transcript


  1. CS217 Advanced Topics in Internet Research • Guest Lecture • Nikitas Liogkas, 5/11/2006 • The BitTorrent content distribution system

  2. Motivation • flash crowd (aka Slashdot) effect: many clients, few servers • Problem: servers cannot handle the load • Solution: swarming • clients download pieces of the file from each other • has been shown to have good scaling and performance properties

  3. Presentation outline • Joining the system • Encoding / metadata file • Tracker protocol • Peer wire protocol • Piece selection • Peer selection • Client implementations • Resources

  4. Joining a torrent • Peers are divided into seeds (have the entire file) and leechers (still downloading) • 1. obtain the metadata file (out of band) • 2. contact the tracker • 3. obtain a peer list (contains seeds & leechers) • 4. contact peers from that list for data • [Diagram: a new leecher (1) gets the metadata file from a website, (2) joins via the tracker, (3) receives a peer list, (4) sends data requests to a seed/leecher]

  5. Exchanging data • verify pieces using hashes • download sub-pieces (blocks) in parallel • advertise received pieces to the entire peer list • interested: need pieces that a given peer has • [Diagram: a seed and leechers A, B, C; a peer announces “I have!” to the others]

  6. Bencoding • encoding format of the metadata file and tracker responses (peer wire messages use a separate binary format) • four types: byte strings, integers, lists, dictionaries (mapping keys to values) • examples: 4:spam represents the string “spam”; i10e represents the integer 10
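
The four bencoding types can be sketched in a few lines of Python. This is a minimal illustration, not the code of any particular client. The function names `bencode` and `bdecode` are chosen here for the example. The list (`l...e`) and dictionary (`d...e`, keys sorted) forms follow the public BitTorrent specification rather than this slide.

```python
def bencode(value):
    """Encode ints, byte strings, lists and dicts as bencoded bytes."""
    if isinstance(value, bool):
        raise TypeError("bencoding has no boolean type")
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, str):
        value = value.encode("utf-8")  # convenience: treat text as a byte string
    if isinstance(value, bytes):
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Keys are byte strings and are written in sorted order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode " + type(value).__name__)


def bdecode(data, pos=0):
    """Decode one value starting at pos; return (value, next_position)."""
    lead = data[pos:pos + 1]
    if lead == b"i":
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode(data, pos)
            value, pos = bdecode(data, pos)
            result[key] = value
        return result, pos + 1
    if lead.isdigit():
        colon = data.index(b":", pos)
        length = int(data[pos:colon])
        start = colon + 1
        return data[start:start + length], start + length
    raise ValueError("invalid bencoded data at offset %d" % pos)
```

With this sketch, `bencode("spam")` gives `b"4:spam"` and `bencode(10)` gives `b"i10e"`, matching the two examples on the slide. Decoded strings come back as `bytes`.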

  7. Metadata file structure • contains information necessary to contact the tracker and describes the files in the torrent • announce URL of tracker • file name • file length • piece length (typically 256KB) • SHA-1 hashes of pieces for verification • also creation date, comment, creator, …
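
As a rough sketch of how the fields above fit together, the following Python builds the dictionary for a single-file torrent and checks a downloaded piece against its SHA-1 hash. The key names (`announce`, `info`, `name`, `length`, `piece length`, `pieces`) are taken from the public BitTorrent specification, not from this slide. The helper names and the tracker URL in the usage note are made up for illustration.

```python
import hashlib

PIECE_LENGTH = 256 * 1024  # typical piece length quoted on the slide
SHA1_SIZE = 20             # bytes per SHA-1 digest


def split_piece_hashes(data, piece_length=PIECE_LENGTH):
    """Return one SHA-1 digest per piece (the last piece may be shorter)."""
    return [hashlib.sha1(data[off:off + piece_length]).digest()
            for off in range(0, len(data), piece_length)]


def build_metainfo(announce, name, data, piece_length=PIECE_LENGTH):
    """Assemble the metadata dictionary for a single-file torrent."""
    return {
        "announce": announce,
        "info": {
            "name": name,
            "length": len(data),
            "piece length": piece_length,
            # All digests concatenated into one byte string.
            "pieces": b"".join(split_piece_hashes(data, piece_length)),
        },
    }


def verify_piece(metainfo, index, piece):
    """Check a downloaded piece against the hash stored in the metadata."""
    pieces = metainfo["info"]["pieces"]
    expected = pieces[SHA1_SIZE * index:SHA1_SIZE * (index + 1)]
    return hashlib.sha1(piece).digest() == expected
```

For example, `build_metainfo("http://tracker.example/announce", "demo.bin", payload)` followed by `verify_piece(meta, 0, payload[:PIECE_LENGTH])` should return `True` for an unmodified first piece.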

  8. Tracker protocol • communicates with clients via HTTP/HTTPS • client GET request parameters: info_hash (uniquely identifies the file); peer_id (chosen by and uniquely identifies the client); client IP and port; numwant (how many peers to return, defaults to 50); stats (bytes uploaded, downloaded, left) • tracker response: interval (how often to contact the tracker); list of peers, containing peer id, IP and port; stats (complete, incomplete) • tracker-less mode: based on the Kademlia DHT
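
The client request described above might be assembled as follows. This is a sketch under assumptions: the parameter names `port`, `uploaded`, `downloaded` and `left` come from the public BitTorrent specification, the function name `build_announce_url` and the tracker URL in the usage note are hypothetical, and the optional client IP parameter is left out.

```python
from urllib.parse import urlencode


def build_announce_url(announce, info_hash, peer_id, port,
                       uploaded=0, downloaded=0, left=0, numwant=50):
    """Build the HTTP GET URL a client sends to the tracker."""
    params = {
        "info_hash": info_hash,    # 20 raw bytes; urlencode percent-escapes them
        "peer_id": peer_id,        # 20 bytes chosen by the client
        "port": port,              # port the client listens on
        "uploaded": uploaded,      # stats: bytes uploaded so far
        "downloaded": downloaded,  # stats: bytes downloaded so far
        "left": left,              # stats: bytes still missing
        "numwant": numwant,        # how many peers to return
    }
    return announce + "?" + urlencode(params)
```

A call such as `build_announce_url("http://tracker.example/announce", info_hash, peer_id, 6881, left=1048576)` yields a URL that can be fetched with any HTTP client. The tracker's reply is a bencoded dictionary.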

  9. Presentation outline • Joining the system • Encoding / metadata file • Tracker protocol • Peer wire protocol • Piece selection • Peer selection • Client implementations • Resources

  10. Peer wire protocol • implemented directly on top of TCP • messages: handshake (maybe with bitfield); keep-alive; choke / unchoke; interested / not interested; have (advertisement of a newly acquired piece); request / piece; cancel (only used in “endgame mode”); port (used in tracker-less mode)
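
The slide lists the message types but not their layout. According to the public BitTorrent specification, every message after the handshake is a 4-byte big-endian length prefix, a 1-byte message id, and a payload. A keep-alive is a bare length of zero. The sketch below follows that framing. The helper names are invented for this example, and the handshake, which has its own format, is omitted.

```python
import struct

# Message ids as given in the BitTorrent specification.
MESSAGE_IDS = {
    "choke": 0, "unchoke": 1, "interested": 2, "not interested": 3,
    "have": 4, "bitfield": 5, "request": 6, "piece": 7, "cancel": 8, "port": 9,
}
MESSAGE_NAMES = {msg_id: name for name, msg_id in MESSAGE_IDS.items()}

KEEP_ALIVE = struct.pack(">I", 0)  # length prefix of zero, no id, no payload


def encode_message(name, payload=b""):
    """Frame a message as <length><id><payload>."""
    body = bytes([MESSAGE_IDS[name]]) + payload
    return struct.pack(">I", len(body)) + body


def encode_have(piece_index):
    """Advertise a newly acquired piece."""
    return encode_message("have", struct.pack(">I", piece_index))


def encode_request(piece_index, begin, length):
    """Ask for one sub-piece (block) of a piece."""
    return encode_message("request", struct.pack(">III", piece_index, begin, length))


def decode_message(data):
    """Decode one complete framed message; return (name, payload)."""
    (length,) = struct.unpack(">I", data[:4])
    if length == 0:
        return "keep-alive", b""
    return MESSAGE_NAMES[data[4]], data[5:4 + length]
```

A cancel message carries the same three fields as a request, so `encode_message("cancel", ...)` with the same payload would withdraw it.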

  11. Piece selection • when downloading starts: choose at random (get complete pieces as quickly as possible; obtain something to offer to others) • after we have 4 pieces: pick (local) rarest first (achieves the fastest replication of rare pieces; obtain something of value) • only get unique pieces from the seed • endgame mode: defense against the “last-block problem”; send requests for missing sub-pieces to all peers in our peer list; send cancel messages upon receipt of a sub-piece
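
The random-first and local-rarest-first rules can be sketched as a single selection function. This is an illustration under simplifying assumptions: piece availability is tracked as plain Python sets, endgame mode is not modelled, and the names `pick_piece` and `peer_pieces` are invented for the example.

```python
import random
from collections import Counter

RANDOM_FIRST_PIECES = 4  # threshold quoted on the slide


def pick_piece(have, peer_pieces, rng=random):
    """Choose the next piece to request.

    have        -- set of piece indices we already hold
    peer_pieces -- dict mapping each peer to the set of pieces it advertises
    rng         -- object with a choice() method (the random module by default)
    """
    # Count, for every piece we still need, how many peers in our list have it.
    counts = Counter()
    for pieces in peer_pieces.values():
        counts.update(pieces - have)
    if not counts:
        return None  # nothing useful is available right now

    candidates = sorted(counts)
    if len(have) < RANDOM_FIRST_PIECES:
        # Bootstrap: any piece will do, so pick at random.
        return rng.choice(candidates)

    # Local rarest first: restrict to the pieces seen at the fewest peers.
    rarest = min(counts.values())
    return rng.choice([p for p in candidates if counts[p] == rarest])
```

"Local" here means rarity is judged only from the peers in our own peer list, not from the whole swarm.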

  12. Last-block problem • at the end of the download, a peer may have trouble finding the few missing pieces • based on anecdotal evidence • other proposals: network coding [Gkantsidis et al., Infocom’05]; prefer to upload to peers with similar file completeness, which is unfair for the peers having most of the pieces [Tian et al., Infocom’06]

  13. Last-block problem – a myth? • is it a problem after all? • figure from [Legout et al., INRIA-TR-2006], with permission

  14. Peer selection - unchoking • periodically (typically every 10 seconds) calculate data-receiving rates • upload to (unchoke) the fastest • constant number of unchoking slots • based on the “tit-for-tat” strategy • [Diagram: a seed and leechers A, B, C]
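
One unchoking round for a leecher can be sketched as a ranking by measured download rate. The slide only says the number of slots is constant, so the default of 4 below is an assumption made for the example, as are the function and parameter names.

```python
UNCHOKE_INTERVAL_SECONDS = 10  # typical round length quoted on the slide


def choose_unchoked(download_rates, interested_peers, slots=4):
    """Return the peers to unchoke this round, fastest first.

    download_rates   -- dict mapping peer to the rate at which we receive data from it
    interested_peers -- peers that currently want pieces from us
    slots            -- number of unchoking slots (assumed value, not from the slide)
    """
    ranked = sorted(interested_peers,
                    key=lambda peer: download_rates.get(peer, 0.0),
                    reverse=True)
    return ranked[:slots]
```

Every interested peer outside the returned list stays choked until a later round, which is what makes the scheme tit-for-tat: uploading to us quickly is the way to earn a slot.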

  15. Optimistic unchoking • periodically select a peer at random and upload to it (typically every 3 unchoking rounds, i.e. 30 seconds) • multi-purpose mechanism: allows bootstrapping of new clients; continuously looks for the fastest partners; robustness: every peer has a non-zero chance of interacting with any other peer
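
A sketch of the optimistic slot, assuming rounds are counted from 1 and the slot is re-drawn on every third round. The function name, the round-counting convention, and the choice to keep the current peer when no candidate exists are all assumptions made for this example.

```python
import random

OPTIMISTIC_EVERY = 3  # unchoking rounds between optimistic picks, per the slide


def optimistic_pick(round_number, choked_interested, current=None, rng=random):
    """Return the peer holding the optimistic unchoke slot for this round.

    round_number      -- 1-based counter of unchoking rounds
    choked_interested -- peers that want data from us but are currently choked
    current           -- peer holding the slot from earlier rounds, if any
    rng               -- object with a choice() method (the random module by default)
    """
    if round_number % OPTIMISTIC_EVERY == 0 and choked_interested:
        # Uniform random choice, regardless of the peer's past upload rate.
        return rng.choice(sorted(choked_interested))
    return current
```

Because the pick ignores rates, a brand-new leecher with nothing to offer can still receive its first pieces this way.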

  16. Seed unchoking • old algorithm: unchoke the fastest leechers; problem: fastest peers may monopolize seeds • new algorithm: periodically sort all leechers according to their last unchoke time; prefer the most recently unchoked leechers; on a tie, prefer the fastest • (presumably) achieves equal spread of seed bandwidth
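
The ordering step of the new algorithm, read literally from the slide, is a two-key sort. The sketch below covers only that ordering. How a real client then rotates leechers in and out of the slots is not described here and is not modelled. The tuple layout and function name are assumptions for the example.

```python
def seed_unchoke_order(leechers):
    """Order leechers as the new seed algorithm on the slide describes.

    leechers -- iterable of (peer, last_unchoke_time, upload_rate) tuples,
                where upload_rate is how fast we are sending to that peer
    Returns peer names: most recently unchoked first, ties broken by higher rate.
    """
    ranked = sorted(leechers, key=lambda entry: (-entry[1], -entry[2]))
    return [peer for peer, _last_unchoke, _rate in ranked]
```

For instance, with entries `("a", 100, 5)`, `("b", 200, 1)` and `("c", 100, 9)`, peer `b` comes first because it was unchoked most recently, and `c` precedes `a` because it is faster at the same timestamp.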

  17. Downloading only from seeds • repeatedly query the tracker for peer lists • distinguish the seeds, and receive data from them • violates fairness model; may be harmful to honest peers • [Diagram: tracker, seed, and leechers A, B, C; labels: new list request, peer list]

  18. Rate- vs. volume-based selection • Proponents of rate-based decisions: [Cohen, P2PECON’03], and [INRIA TR’2006] • Proponents of volume-based decisions: [Bharambe et al., MSR-TR-2005], [Gkantsidis et al., Infocom’05], [Jun et al., P2PECON’05], and the eDonkey file-sharing system • No clear winner yet!

  19. Client implementations • mainline: written in Python; right now, the only one employing the new seed unchoking algorithm • Azureus: the most popular, written in Java; implements a special protocol between clients (e.g. peers can exchange peer lists) • other popular clients: ABC, BitComet, BitLord, BitTornado, μTorrent, Opera browser • various non-standard extensions: retaliation mode (detect compromised/malicious peers); anti-snubbing (ignore a peer who ignores us); super seeding (seed masquerading as a leecher)

  20. Resources #1 • Basic BitTorrent mechanisms [Cohen, P2PECON’03] • BitTorrent specification Wiki: http://wiki.theory.org/BitTorrentSpecification • Measurement studies [Izal et al., PAM’04], [Pouwelse et al., Delft TR 2004 and IPTPS’05], [Guo et al., IMC’05], and [Legout et al., INRIA-TR-2006]

  21. Resources #2 • Theoretical analysis and modeling [Qiu et al., SIGCOMM’04], and [Tian et al., Infocom’06] • Simulations [Bharambe et al., MSR-TR-2005] • Sharing incentives and exploiting them [Shneidman et al., PINS’04], [Jun et al., P2PECON’05], and [Liogkas et al., IPTPS’06]

  22. Conclusion and food for thought • BitTorrent is fast and robust • Yet, many parameters are arbitrarily set: number of unchoking slots; unchoking round duration; size of pieces / sub-pieces • What can we learn from BitTorrent for the design of future P2P content distribution protocols?
