
Distributed hash tables Protocols and applications






Presentation Transcript


  1. Distributed hash tables: Protocols and applications (Jinyang Li)

  2. Peer-to-peer systems • Symmetric in node functionalities • Anybody can join and leave • Everybody gives and takes • Great potential • Huge aggregate network capacity, storage, etc. • Resilient to single points of failure and attack • E.g. Napster, Gnutella, BitTorrent, Skype, Joost

  3. Motivations: the lookup problem • How to locate something given its name? • Examples: • File sharing • VoIP • Content distribution networks (CDN)

  4. Strawman #1: a central search service • Central service poses a single point of failure or attack • Not scalable(?) • Central service needs $$$

  5. Improved #1: Hierarchical lookup • Performs hierarchical lookup (like DNS) • More scalable than a central site • Root of the hierarchy is still a single point of vulnerability

  6. Strawman #2: Flooding ("Where is item K?") • Symmetric design: no special servers needed • Too much overhead traffic • Limited in scale • No guarantee for results

  7. Improved #2: super peers • E.g. Kazaa, Skype • Lookup only floods super peers • Still no guarantees for lookup results

  8. DHT approach: routing • Structured: • put(key, value); get(key) → value • maintain routing state for fast lookup • Symmetric: • all nodes have identical roles • Many protocols: • Chord, Kademlia, Pastry, Tapestry, CAN, Viceroy
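
As a concrete reference point, here is a minimal sketch of the put/get interface described above; the class and method names are illustrative, not taken from any particular DHT implementation.

    # Minimal sketch of the structured DHT interface (names are illustrative).
    from typing import Optional

    class DHT:
        def put(self, key: bytes, value: bytes) -> None:
            """Route to the node responsible for key and store value there."""
            raise NotImplementedError

        def get(self, key: bytes) -> Optional[bytes]:
            """Route to the node responsible for key and return its value, if any."""
            raise NotImplementedError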

  9. DHT ideas • Scalable routing needs a notion of “closeness” • Ring structure [Chord]

  10. Different notions of “closeness” • Tree-like structure [Pastry, Kademlia, Tapestry] • Distance measured as the length of the longest matching prefix with the lookup key • (Figure: node IDs 01100100, 01100101, 01100110, 01100111 grouped under prefixes 011000**, 011001**, 011010**, 011011**; lookup key 01100101)

  11. Different notions of “closeness” • Cartesian space [CAN] • Distance measured as geometric distance • (Figure: the unit square from (0,0) to (1,1) partitioned into rectangular zones, e.g. (0,0)-(0.5,0.5), (0.5,0)-(0.75,0.25), (0.5,0.25)-(0.75,0.5), (0.75,0)-(1,0.5), (0,0.5)-(0.5,1), (0.5,0.5)-(1,1))

  12. DHT: complete protocol design • Name nodes and data items • Define data item to node assignment • Define per-node routing table contents • Handle churn • How do new nodes join? • How do nodes handle others’ failures?

  13. Chord: Naming • Each node has a unique flat ID • E.g. SHA-1(IP address) • Each data item has a unique flat ID (key) • E.g. SHA-1(data content)

  14. Data-to-node assignment • Consistent hashing • A key’s successor (the first node at or after it on the ring) is responsible for it • The key’s predecessor is the node “closest” to it from below
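
A small, self-contained sketch of slides 13-14, assuming a 160-bit SHA-1 identifier space; the node addresses and example content are made up for illustration.

    import hashlib
    from bisect import bisect_left

    def flat_id(name: str) -> int:
        """Flat IDs: e.g. SHA-1(IP address) for nodes, SHA-1(content) for keys."""
        return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")

    def successor(node_ids: list[int], key: int) -> int:
        """Consistent hashing: the key is assigned to the first node at or after it."""
        ids = sorted(node_ids)
        i = bisect_left(ids, key)
        return ids[i % len(ids)]          # wrap around the ring

    nodes = [flat_id(ip) for ip in ("10.0.0.1", "10.0.0.2", "10.0.0.3")]
    key = flat_id("some file contents")
    print("key %x is stored at node %x" % (key, successor(nodes, key)))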

  15. Key-based lookup (routing) • Correctness • Each node must know its correct successor

  16. Key-based lookup • Fast lookup • A node has O(log n) fingers, exponentially “far” away • Node x’s i-th finger points to the node responsible for x + 2^i • Why is lookup fast? Each hop via the closest preceding finger at least halves the remaining ID-space distance to the key, so a lookup finishes in O(log n) hops
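
To make the hop count concrete, below is a rough sketch of finger-based lookup on a toy 6-bit ring. A real Chord node only knows its own successor and fingers, but here the whole ring is built locally so the example runs end to end; all IDs and helper names are invented.

    M = 6                          # toy identifier space: 6-bit IDs
    RING = 1 << M

    def in_interval(x, a, b):
        """True if x lies in the half-open ring interval (a, b]."""
        if a == b:
            return x != a          # the interval covers the whole ring
        return 0 < (x - a) % RING <= (b - a) % RING

    class Node:
        def __init__(self, ident):
            self.id = ident
            self.successor = None
            self.fingers = []      # fingers[i] ~ successor(id + 2**i)

        def closest_preceding_finger(self, key):
            for f in reversed(self.fingers):
                if in_interval(f.id, self.id, key) and f.id != key:
                    return f
            return self.successor

        def find_successor(self, key):
            node = self
            # Each hop roughly halves the remaining ID-space distance to the key.
            while not in_interval(key, node.id, node.successor.id):
                node = node.closest_preceding_finger(key)
            return node.successor

    ids = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]
    nodes = {i: Node(i) for i in ids}

    def succ_of(t):
        return nodes[min(ids, key=lambda j: (j - t) % RING)]

    for k, i in enumerate(ids):
        nodes[i].successor = nodes[ids[(k + 1) % len(ids)]]
        nodes[i].fingers = [succ_of((i + 2 ** b) % RING) for b in range(M)]

    print(nodes[8].find_successor(54).id)      # node 56 is responsible for key 54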

  17. Handling churn: join • New node w joins an existing network • Issue a lookup to find its successor x • Set its successor: w.succ = x • Set its predecessor: w.pred = x.pred • Obtain the subset of items it is now responsible for from x • Notify x of its new predecessor: x.pred = w
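
A toy sketch of these join steps, assuming nodes that track only their successor, predecessor, and locally stored items (no fingers or failure handling); the RingNode class and the 6-bit ID space are invented for illustration.

    M = 6
    RING = 1 << M

    class RingNode:
        def __init__(self, ident):
            self.id = ident
            self.succ = self
            self.pred = self
            self.items = {}                    # keys this node is responsible for

        def owns(self, key):
            """True if key lies in (pred.id, self.id] on the ring."""
            a, b = self.pred.id, self.id
            return True if a == b else 0 < (key - a) % RING <= (b - a) % RING

        def find_successor(self, key):
            node = self                        # linear walk; fingers make this O(log n)
            while not node.owns(key):
                node = node.succ
            return node

        def join(self, bootstrap):
            x = bootstrap.find_successor(self.id)   # 1. look up my successor
            self.succ = x                           # 2. set successor
            self.pred = x.pred                      # 3. set predecessor
            moved = {k: v for k, v in x.items.items() if self.owns(k)}
            for k in moved:                         # 4. take over my share of x's items
                del x.items[k]
            self.items.update(moved)
            x.pred = self                           # 5. x's predecessor is now me
            self.pred.succ = self                   # (in Chord this last pointer is
                                                    #  actually repaired by stabilization)

    a, b, c = RingNode(1), RingNode(20), RingNode(45)
    b.join(a)
    c.join(a)
    print([(n.id, n.succ.id) for n in (a, b, c)])   # [(1, 20), (20, 45), (45, 1)]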

  18. Handling churn: stabilization • Each node periodically fixes its state • Node w asks its successor x for x.pred • If x.pred is “closer” to w than x is, set w.succ = x.pred • Starting from any connected graph, stabilization eventually makes all nodes find their correct successors
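
The rule on this slide can be written down in a few lines. The sketch below mirrors Chord's stabilize/notify pair in simplified form and simulates it on a toy 6-bit ring where every node initially points at one hub node, showing that repeated rounds untangle the ring; the Node class and the driver are illustrative, not Chord's actual code.

    M = 6
    RING = 1 << M

    def between(x, a, b):
        """True if x lies in the open ring interval (a, b)."""
        if a == b:
            return x != a
        return 0 < (x - a) % RING < (b - a) % RING

    class Node:
        def __init__(self, ident):
            self.id = ident
            self.succ = self
            self.pred = None

        def stabilize(self):
            # Ask my successor for its predecessor; adopt it if it is closer to me.
            p = self.succ.pred
            if p is not None and between(p.id, self.id, self.succ.id):
                self.succ = p
            self.succ.notify(self)

        def notify(self, candidate):
            # candidate thinks it might be my predecessor.
            if self.pred is None or between(candidate.id, self.pred.id, self.id):
                self.pred = candidate

    # Start from a connected but incorrect ring: every node points at node 1.
    nodes = [Node(i) for i in (1, 8, 21, 38, 51)]
    for n in nodes[1:]:
        n.succ = nodes[0]

    for _ in range(10):                  # periodic stabilization rounds
        for n in nodes:
            n.stabilize()

    print(sorted((n.id, n.succ.id) for n in nodes))
    # -> [(1, 8), (8, 21), (21, 38), (38, 51), (51, 1)]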

  19. Handling churn: failures • Nodes leave without notice • Each node keeps s (e.g. 16) successors • Replicate keys on s successors

  20. Handling churn: fixing fingers • Much easier to maintain fingers • Any node at distance [2^i, 2^(i+1)) away will do as the i-th finger • Geographically closer fingers → lower latency • Periodically flush dead fingers

  21. Using DHTs in real systems • Amazon’s Dynamo key-value storage [SOSP’07] • Serve as persistent state for applications • shopping carts, user preferences • How does it apply DHT ideas? • Assign key-value pairs to responsible nodes • Automatic recovery from churn • New challenges? • Manage read-write data consistency

  22. Using DHTs in real systems • Keyword-based searches for file sharing • eMule, Azureus • How to apply a DHT? • User has file 1f3d… with name “jingle bell Britney” • Insert mappings: SHA-1(jingle) → 1f3d…, SHA-1(bell) → 1f3d…, SHA-1(britney) → 1f3d… • How to answer the queries “jingle bell” and “Britney”? • Challenges? • Some keywords are much more popular than others • What if the RIAA inserts a billion bogus “Britney spear xyz” mappings?
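
A minimal sketch of this keyword scheme, using an in-memory dict as a stand-in for the DHT; the helper names and the intersection-based query answering are illustrative assumptions, not eMule's or Azureus's actual implementation.

    import hashlib

    dht = {}                                   # key (hex digest) -> set of file IDs

    def keyword_key(word: str) -> str:
        return hashlib.sha1(word.lower().encode()).hexdigest()

    def publish(file_id: str, name: str) -> None:
        # Insert one mapping per keyword: SHA-1(keyword) -> file ID.
        for word in name.split():
            dht.setdefault(keyword_key(word), set()).add(file_id)

    def search(query: str) -> set[str]:
        # Look up each keyword and intersect the result sets.
        results = [dht.get(keyword_key(w), set()) for w in query.split()]
        return set.intersection(*results) if results else set()

    publish("1f3d", "jingle bell Britney")
    print(search("jingle bell"))               # {'1f3d'}
    print(search("Britney"))                   # {'1f3d'}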

  23. Using DHTs in real systems: CoralCDN

  24. Motivation: alleviating flash crowds • Proxies handle most client requests • Proxies cooperate to fetch content from each other • (Figure: many browsers connect to HTTP proxies, which cooperate with each other before going to the origin server)

  25. Getting content with CoralCDN • Clients use CoralCDN via a modified domain name: nytimes.com/file → nytimes.com.nyud.net/file • 1. Server selection: what CDN node should I use? • 2. Lookup(URL): what nodes are caching the URL? • 3. Content transmission: from which caching nodes should I download the file?

  26. Coral design goals • Don’t control data placement • Nodes cache based on access patterns • Reduce load at origin server • Serve files from cached copies whenever possible • Low end-to-end latency • Lookup and cache download optimize for locality • Scalable and robust lookups

  27. Lookup for cache locations • Given a URL: where is the data cached? • Map name to location: URL → {IP1, IP2, IP3, IP4} • lookup(URL) → get IPs of nearby caching nodes • insert(URL, myIP, TTL) → add me as a node caching this URL for TTL seconds • Isn’t this what a distributed hash table is for?
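
A toy, single-process sketch of the index interface on this slide (insert(URL, myIP, TTL) / lookup(URL)); a local dict stands in for the distributed index, and the names are illustrative rather than Coral's actual API.

    import time

    index = {}                                    # URL -> {IP: expiry time}

    def insert(url: str, ip: str, ttl: float) -> None:
        index.setdefault(url, {})[ip] = time.time() + ttl

    def lookup(url: str) -> list[str]:
        now = time.time()
        # Drop expired proxies, return the rest.
        fresh = {ip: exp for ip, exp in index.get(url, {}).items() if exp > now}
        index[url] = fresh
        return list(fresh)

    insert("http://nytimes.com/file", "10.0.0.1", ttl=60)
    insert("http://nytimes.com/file", "10.0.0.2", ttl=60)
    print(lookup("http://nytimes.com/file"))      # ['10.0.0.1', '10.0.0.2']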

  28. A straightforward use of a DHT • Problems • No load balance for a single URL • All inserts and lookups go to the same node (it cannot be close to everyone) • (Figure: every insert(URL1, myIP) and lookup for URL1 routes to the single node responsible for SHA-1(URL1), which stores URL1 → {IP1, IP2, IP3, IP4})

  29. #1 Solve the load balance problem: relax hash-table semantics • DHTs are designed for hash-table semantics • Insert and replace: URL → IP_last • Insert and append: URL → {IP1, IP2, IP3, IP4} • Each Coral lookup needs only a few values • lookup(URL) → {IP2, IP4} • Preferably ones close in the network
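
A sketch of the relaxed insert-and-append semantics, again with a plain dict standing in for the DHT; the cap of two returned values and the RTT numbers are made-up illustration values.

    MAX_RETURNED = 2

    store: dict[str, list[str]] = {}              # URL -> appended list of proxy IPs
    rtt_ms = {"IP1": 80, "IP2": 10, "IP3": 45, "IP4": 20}   # hypothetical measurements

    def insert(url: str, ip: str) -> None:
        store.setdefault(url, []).append(ip)      # append, don't replace

    def lookup(url: str) -> list[str]:
        ips = store.get(url, [])
        # Return only a few values, preferring ones that look close in the network.
        return sorted(ips, key=lambda ip: rtt_ms.get(ip, float("inf")))[:MAX_RETURNED]

    for ip in ("IP1", "IP2", "IP3", "IP4"):
        insert("URL1", ip)
    print(lookup("URL1"))                         # ['IP2', 'IP4']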

  30. Prevent hotspots in the index • Route convergence • O(b) nodes are 1 hop from the root • O(b^2) nodes are 2 hops away from the root, … • (Figure: tree of nodes, from leaf nodes with distant IDs at 3 hops down to the root node with the closest ID)

  31. Prevent hotspots in the index • Request load increases exponentially towards the root • (Figure: the root node, whose ID is closest to the key, holds URL → {IP1, IP2, IP3, IP4}; leaf nodes with distant IDs sit 1-3 hops away)

  32. Rate-limiting requests • A node refuses to cache more locations once it already has the max number of “fresh” IPs for a URL • Locations of popular items are therefore pushed down the tree • (Figure: nodes lower in the tree hold URL → {IP5} and URL → {IP3, IP4}; the root holds URL → {IP1, IP2, IP3, IP4})

  33. Rate-limiting requests • A node refuses to cache more locations once it already has the max number of “fresh” IPs for a URL • Locations of popular items are pushed down the tree • Except: each node leaks through at most β inserts per minute per URL • This bounds the rate of inserts towards the root, yet keeps the root fresh • (Figure: leaked inserts refresh the root’s entry to URL → {IP1, IP2, IP6, IP4})
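
A simplified sketch of the rate-limiting rule on slides 32-33; the IndexNode class, the two-level leaf-and-root layout, and the constants are assumptions for illustration, not Coral's implementation.

    import time

    MAX_IPS = 4          # max "fresh" IPs cached per URL at one node
    BETA = 12            # inserts leaked toward the root, per minute, per URL

    class IndexNode:
        def __init__(self, parent=None):
            self.parent = parent              # next hop toward the root
            self.ips = {}                     # URL -> list of cached proxy IPs
            self.leaked = {}                  # URL -> timestamps of forwarded inserts

        def insert(self, url, ip):
            cached = self.ips.setdefault(url, [])
            if len(cached) < MAX_IPS:
                cached.append(ip)             # not full: accept locally, insert stops here
                return
            # Full: forward toward the root only while under this minute's leak
            # budget, so the root stays fresh without absorbing the full rate.
            now = time.time()
            recent = [t for t in self.leaked.get(url, []) if now - t < 60]
            if self.parent is not None and len(recent) < BETA:
                self.leaked[url] = recent + [now]
                self.parent.insert(url, ip)
            # otherwise the insert stops at this node

    root = IndexNode()
    leaf = IndexNode(parent=root)
    for i in range(8):
        leaf.insert("URL1", f"IP{i}")
    print(len(leaf.ips["URL1"]), len(root.ips.get("URL1", [])))   # -> 4 4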

  34. Load balance results • 494 nodes on PlanetLab • Aggregate request rate: ~12 million / min • Rate limit per node (β): 12 / min • The root has fan-in from only 7 other nodes • (Figure: insert rates of 7β, 3β, 2β, and 1β at successive levels toward the root)

  35. #2 Solve the lookup locality problem • Cluster nodes hierarchically based on RTT • A lookup traverses up the hierarchy • Route to the “closest” node at each level

  36. Preserve locality through the hierarchy • Minimizes lookup latency • Prefer values stored by nodes within faster clusters • Cluster RTT thresholds: < 20 ms, < 60 ms, and none (global) • (Figure: ID space from 000… to 111…, showing distance to the key shrinking at each level)
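
A rough sketch of the fastest-cluster-first lookup preference from slides 35-36; the level names, data structures, and example values are invented for illustration, not Coral's implementation.

    LEVELS = ("<20ms cluster", "<60ms cluster", "global")

    def hierarchical_lookup(url, indexes):
        """indexes maps each level name to that cluster's {url: [ips]} index.

        Try the fastest (innermost) cluster first and fall back to wider,
        slower clusters only if nothing is found, which keeps lookups and
        downloads local when possible.
        """
        for level in LEVELS:
            ips = indexes.get(level, {}).get(url)
            if ips:
                return level, ips
        return None, []

    indexes = {
        "<20ms cluster": {},                                    # nothing cached nearby
        "<60ms cluster": {"http://nytimes.com/file": ["10.1.2.3"]},
        "global":        {"http://nytimes.com/file": ["192.0.2.7", "10.1.2.3"]},
    }
    print(hierarchical_lookup("http://nytimes.com/file", indexes))
    # ('<60ms cluster', ['10.1.2.3'])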

  37. Clustering reduces lookup latency • Reduces median latency by a factor of 4

  38. Putting it all together: Coral reduces server load • Most hits are served within the 20-ms Coral cluster • Local disk caches begin to handle most requests • 400+ nodes provide 32 Mbps, about 100x the capacity of the origin • Few hits reach the origin
