Peer-to-Peer Structured Overlay Networks


  1. Peer-to-Peer Structured Overlay Networks Antonino Virgillito

  2. Background Peer-to-peer systems • distribution • symmetry (communication, node roles) • decentralized control • self-organization • dynamicity

  3. Data Lookup in P2P Systems • Data items spread over a large number of nodes • Which node stores which data item? • A lookup mechanism needed • Centralized directory -> bottleneck/single point of failure • Query Flooding -> scalability concerns • Need more structure!

  4. More Issues • Organize, maintain overlay network • node arrivals • node failures • Resource allocation/load balancing • Resource location • Network proximity routing

  5. What is a Distributed Hash Table? • Exactly that • A service, distributed over multiple machines, with hash table semantics • put(key, value), Value = get(key) • Designed to work in a peer-to-peer (P2P) environment • No central control • Nodes under different administrative control • But of course can operate in an “infrastructure” sense

  6. What is a DHT? • Hash table semantics: put(key, value), Value = get(key) • Key is a single flat string • Limited semantics compared to keyword search • Put() causes value to be stored at one (or more) peer(s) • Get() retrieves value from a peer • Put() and Get() accomplished with unicast routed messages • In other words, it scales • Other API calls to support application, like notification when neighbors come and go
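
A minimal, single-process sketch of these hash-table semantics (all names here are illustrative, not any real DHT's API): keys are hashed onto a fixed set of peers, so put() and get() for the same key always reach the same peer.

    import hashlib

    # Toy illustration of DHT semantics: each "peer" holds part of the key space,
    # and a key always hashes to the same peer.
    class ToyDHT:
        def __init__(self, n_peers=4):
            self.peers = [dict() for _ in range(n_peers)]   # each peer's local store

        def _owner(self, key):
            h = int(hashlib.sha1(key.encode()).hexdigest(), 16)
            return self.peers[h % len(self.peers)]          # same key -> same peer

        def put(self, key, value):
            self._owner(key)[key] = value

        def get(self, key):
            return self._owner(key).get(key)

    dht = ToyDHT()
    dht.put("song.mp3", "some value")
    print(dht.get("song.mp3"))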

  7. Distributed Hash Tables (DHT) [Figure: key/value pairs (k1,v1) … (k6,v6) stored on the nodes of a P2P overlay network; operations: put(k,v), get(k)] • p2p overlay maps keys to nodes • completely decentralized and self-organizing • robust, scalable

  8. Popular DHTs • Tapestry (Berkeley) • Based on Plaxton trees---similar to hypercube routing • The first* DHT • Complex and hard to maintain (hard to understand too!) • CAN (ACIRI), Chord (MIT), and Pastry (Rice/MSR Cambridge) • Second wave of DHTs (contemporary with and independent of each other)

  9. DHTs Basics • Node IDs can be mapped to the hash key space • Given a hash key as a “destination address”, you can route through the network to a given node • Always route to the same node no matter where you start from • Requires no centralized control (completely distributed) • Small per-node state is independent of the number of nodes in the system (scalable) • Nodes can route around failures (fault-tolerant)

  10. Things to look at • What is the structure? • How does routing work in the structure? • How does it deal with node joins and departures (structure maintenance)? • How does it scale? • How does it deal with locality? • What are the security issues?

  11. The Chord Approach • Consistent Hashing • Logical Ring • Finger Pointers

  12. The Chord Protocol • Provides: • A mapping successor: key -> node • To lookup key K, go to node successor(K) • successor defined using consistent hashing: • Key hash • Node hash • Both Keys and Nodes hash to same (circular) identifier space • successor(K)=first node with hash ID equal to or greater than hash(K)
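
A small sketch of this successor mapping (node names and the ring size m are illustrative): keys and nodes hash into the same circular m-bit space, and a key is assigned to the first node whose id is equal to or greater than the key's hash, wrapping around the ring.

    import hashlib
    from bisect import bisect_left

    M = 8   # illustrative ring size: ids live in [0, 2**M)

    def chord_hash(name, m=M):
        # hash keys and node names into the same circular id space
        return int(hashlib.sha1(name.encode()).hexdigest(), 16) % (2 ** m)

    def successor(key_hash, node_hashes):
        # first node id equal to or greater than the key hash, wrapping around
        ring = sorted(node_hashes)
        i = bisect_left(ring, key_hash)
        return ring[i % len(ring)]

    nodes = [chord_hash(n) for n in ("nodeA", "nodeB", "nodeC")]
    print(successor(chord_hash("some-key"), nodes))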

  13. Example: The Logical Ring [Figure: identifier circle with nodes 0, 1, 3 and keys 1, 2, 6; key 1 is stored at node 1, key 2 at node 3, key 6 at node 0]

  14. Consistent Hashing [Karger et al. ‘97] • Some Nice Properties: • Smoothness: minimal key movement on node join/leave • Load Balancing: keys equitably distributed over nodes

  15. Mapping Details • Range of Hash Function • Circular ID space modulo 2^m • Compute a 160-bit SHA-1 hash and truncate it to m bits • Chance of collision is rare if m is large enough • Deterministic, but hard for an adversary to subvert
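
One way to realize this mapping is sketched below (the choice of m and of keeping the high-order bits of the digest are illustrative assumptions):

    import hashlib

    def node_id(name, m=32):
        # 160-bit SHA-1 digest, truncated to its top m bits -> id in [0, 2**m)
        digest = int(hashlib.sha1(name.encode()).hexdigest(), 16)
        return digest >> (160 - m)

    print(hex(node_id("10.0.0.7:4000")))   # hypothetical node address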

  16. Chord State • Successor/predecessor pointers in the ring • Finger pointers: n.finger[i] = successor(n + 2^(i-1)) • Each node knows more about the portion of the circle close to it!
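
Building a node's finger table from this definition can be sketched as follows (reusing chord_hash(), successor() and M from the sketch after slide 12; the node names are made up):

    def finger_table(n, node_hashes, m=M):
        # finger[i] = successor(n + 2**(i-1)) for i = 1..m
        return [successor((n + 2 ** (i - 1)) % (2 ** m), node_hashes)
                for i in range(1, m + 1)]

    nodes = sorted(chord_hash(x) for x in ("nodeA", "nodeB", "nodeC", "nodeD"))
    print(finger_table(nodes[0], nodes))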

  17. Example: Finger Tables

  18. Chord: routing protocol • Notation: n.foo() stands for a remote call to node n • A set of nodes towards id is contacted remotely • Each node is queried for the known node closest to id • The process stops when a node is found whose successor is at or past id
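
The lookup described above can be sketched as follows, in the spirit of the Chord paper's pseudocode (the class layout and field names are illustrative; remote calls are simulated by ordinary method calls on Python objects):

    def _between(x, a, b):
        # x in the open interval (a, b) on the circular id space
        return (a < x < b) if a < b else (x > a or x < b)

    def _between_right_incl(x, a, b):
        # x in the half-open interval (a, b]
        return (a < x <= b) if a < b else (x > a or x <= b)

    class ChordNode:
        def __init__(self, node_id, m=8):
            self.id = node_id
            self.successor = self            # fixed up when the ring is built
            self.predecessor = None
            self.finger = [self] * m         # finger[i] targets id + 2**i (0-based)

        def find_successor(self, key_id):
            # the key belongs to the first node at or after it on the ring
            if _between_right_incl(key_id, self.id, self.successor.id):
                return self.successor
            return self.closest_preceding_finger(key_id).find_successor(key_id)

        def closest_preceding_finger(self, key_id):
            # walk the finger table from the farthest entry back towards ourselves
            for f in reversed(self.finger):
                if f is not self and _between(f.id, self.id, key_id):
                    return f
            return self.successor            # fall back to a plain ring walk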

  19. Example: Chord Routing Finger Pointers for Node 1

  20. Lookup Complexity • With high probability: O(log N) • Proof intuition: • Let p be the successor of the targeted key; the distance to p is at least halved in each step • So within m steps the lookup would reach p • Stronger claim: in O(log N) steps the remaining distance is ≤ 2^m / N; thereafter even a linear advance suffices to keep the lookup at O(log N)

  21. Chord invariants • Every key in the network can be located as long as the following invariants are preserved after joins and leaves: • Each node’s successor is correctly maintained • For every key k, node successor(k) is responsible for k

  22. Chord: Node Joins • New node B learns of at least one existing node A via external means • B asks A to look up its finger-table information • Given that B's hash-id is b, A does a lookup for B.finger[i] = successor(b + 2^(i-1)) if the interval is not already included in finger[i-1] • B stores all finger information and sets up its pred/succ pointers
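
The first part of the join (B filling its own finger table by asking A) might look like the sketch below; it builds on the ChordNode sketch from slide 18, and the "skip if the interval is already covered by finger[i-1]" optimization from the slide is omitted for brevity.

    def init_fingers(b_id, a_node, m=8):
        # B.finger[i] = successor(b + 2**(i-1)), resolved via the existing node A
        return [a_node.find_successor((b_id + 2 ** (i - 1)) % (2 ** m))
                for i in range(1, m + 1)]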

  23. Node Joins (contd.) • Update the finger tables of existing nodes p such that: 1. p precedes b by at least 2^(i-1) 2. the i-th finger of node p succeeds b • Start from p = predecessor(b - 2^(i-1)) and proceed in the counter-clockwise direction while condition 2 holds • Transferring keys: • Only from successor(b) to b • Must send a notification to the application

  24. Example: finger table update Node 6 joins

  25. Example: transferring keys Node 1 leaves

  26. Concurrent Joins/Leaves • Need a stabilization protocol to guard against inconsistency • Note: • Incorrect finger pointers may only increase latency, but incorrect successor pointers may cause lookup failure! • Nodes periodically run stabilization protocol • Finds successor’s predecessor • Repair if this isn’t self • This algorithm is also run at join
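
A sketch of one stabilization round, written against the ChordNode sketch from slide 18 (it reuses the _between() helper and the predecessor field from that sketch; the function names follow the Chord paper's pseudocode but the structure here is illustrative):

    def stabilize(node):
        # periodic repair of node.successor; also run when a node joins
        x = node.successor.predecessor
        if x is not None and _between(x.id, node.id, node.successor.id):
            node.successor = x                 # a node joined in between: adopt it
        notify(node.successor, node)           # make sure the successor knows about us

    def notify(successor, candidate):
        # candidate believes it may be successor's predecessor
        if successor.predecessor is None or _between(candidate.id,
                                                     successor.predecessor.id,
                                                     successor.id):
            successor.predecessor = candidate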

  27. Example: node 25 joins

  28. Example: node 28 joins before 20 stabilizes (1)

  29. Example: node 28 joins before 20 stabilizes (2)

  30. CAN • Virtual d-dimensional Cartesian coordinate system on a d-torus • Example: 2-d [0,1] x [0,1] • Dynamically partitioned among all nodes • A pair (K,V) is stored by mapping key K to a point P in the space using a uniform hash function and storing (K,V) at the node whose zone contains P • Retrieve entry (K,V) by applying the same hash function to map K to P and retrieving the entry from the node whose zone contains P • If P is not contained in the zone of the requesting node or its neighboring zones, route the request towards the neighbor node whose zone is nearest to P
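
Mapping a key to a point on the d-torus can be sketched as below; slicing the SHA-1 digest into d coordinates is an illustrative choice, not a scheme mandated by CAN.

    import hashlib

    def key_to_point(key, d=2):
        # hash the key, then cut the digest into d equal chunks, one per coordinate in [0, 1)
        digest = hashlib.sha1(key.encode()).digest()    # 20 bytes
        chunk = len(digest) // d
        return tuple(int.from_bytes(digest[i * chunk:(i + 1) * chunk], "big") / 2 ** (8 * chunk)
                     for i in range(d))

    print(key_to_point("movie.avi"))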

  31. Routing in a CAN • Follow a straight-line path through the Cartesian space from the source to the destination coordinates • Each node maintains a table of the IP address and virtual coordinate zone of each local neighbor • Use greedy routing: forward to the neighbor closest to the destination • For a d-dimensional space partitioned into n equal zones, nodes maintain 2d neighbors • Average routing path length: (d/4)(n^(1/d)) hops
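
The greedy forwarding step can be sketched as follows (the neighbor table is simplified to a mapping from neighbor address to the centre of its zone, which is an assumption of this sketch):

    def torus_distance(p, q):
        # Euclidean distance on the unit d-torus (each coordinate wraps around at 1)
        return sum(min(abs(a - b), 1 - abs(a - b)) ** 2 for a, b in zip(p, q)) ** 0.5

    def can_next_hop(neighbors, dest):
        # neighbors: {address: zone_centre_point}; pick the neighbor closest to dest
        return min(neighbors, key=lambda n: torus_distance(neighbors[n], dest))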

  32. CAN Construction • Joining node locates a bootstrap node using the CAN DNS entry • Bootstrap node provides IP addresses of random member nodes • Joining node sends a JOIN request to a random point P in the Cartesian space • The node in the zone containing P splits the zone and allocates “half” to the joining node • (K,V) pairs in the allocated “half” are transferred to the joining node • Joining node learns its neighbor set from the previous zone occupant • Previous zone occupant updates its neighbor set

  33. Departure, Recovery and Maintenance • Graceful departure: the node hands over its zone and its (K,V) pairs to a neighbor • Network failure: unreachable node(s) trigger an immediate takeover algorithm that allocates the failed node's zone to a neighbor • Detected via lack of periodic refresh messages • Each neighbor node starts a takeover timer initialized in proportion to its own zone volume • On expiry it sends a TAKEOVER message containing its zone volume to all of the failed node's neighbors • If a received TAKEOVER's volume is smaller, kill the timer; if not, reply with its own TAKEOVER message • Nodes thus agree on the live neighbor with the smallest volume
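
The takeover race can be sketched as two small decision rules (the timer constant and return values are illustrative):

    def takeover_timer(zone_volume, base_delay=1.0):
        # smaller zones get shorter timers, so they fire (and win) first
        return base_delay * zone_volume

    def on_takeover_message(my_volume, sender_volume):
        # compare the advertised zone volume against our own
        if sender_volume < my_volume:
            return "cancel own timer"        # the sender is the better candidate
        return "reply with own TAKEOVER"     # contest: our zone is smaller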

  34. Pastry • Generic p2p location and routing substrate • Self-organizing overlay network • Lookup/insert an object in < log_16 N routing steps (expected) • O(log N) per-node state • Network proximity routing

  35. Pastry: Object distribution [Figure: circular 128-bit id space, 0 to 2^128 - 1, with objIds and nodeIds placed on the ring] • Consistent hashing • 128-bit circular id space • nodeIds (uniform random) • objIds (uniform random) • Invariant: the node with the numerically closest nodeId maintains the object

  36. Pastry: Object insertion/lookup [Figure: Route(X) on the 0 to 2^128 - 1 ring] • A message with key X is routed to the live node with nodeId closest to X • Problem: a complete routing table is not feasible

  37. Pastry: Routing table (# 65a1fc) [Figure: routing table of node 65a1fc with log_16 N rows; rows 0 through 3 shown]

  38. Pastry: Leaf sets • Each node maintains IP addresses of the nodes with the L/2 numerically closest larger and smaller nodeIds, respectively. • routing efficiency/robustness • fault detection (keep-alive) • application-specific local coordination

  39. Pastry: Routing procedure
  if (destination D is within range of our leaf set)
      forward to the numerically closest member of the leaf set
  else
      let l = length of the prefix shared with D
      let d = value of the l-th digit in D's address
      if (the routing-table entry R_l^d exists)
          forward to R_l^d
      else
          forward to a known node that
          (a) shares at least as long a prefix with D, and
          (b) is numerically closer to D than this node
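
A Python rendering of the procedure above for hex digits (b = 4); the leaf_set (a set of equal-length nodeId strings), routing_table (a dict keyed by (row, digit)) and known_nodes structures are illustrative assumptions, and numeric distance ignores wrap-around for simplicity.

    def shared_prefix_len(a, b):
        n = 0
        while n < len(a) and n < len(b) and a[n] == b[n]:
            n += 1
        return n

    def num_dist(a, b):
        return abs(int(a, 16) - int(b, 16))

    def pastry_next_hop(self_id, dest, leaf_set, routing_table, known_nodes):
        if self_id == dest:
            return self_id                                   # we are responsible
        if leaf_set and min(leaf_set) <= dest <= max(leaf_set):
            # destination falls within the leaf set: deliver to the numerically closest member
            return min(leaf_set | {self_id}, key=lambda n: num_dist(n, dest))
        l = shared_prefix_len(self_id, dest)
        entry = routing_table.get((l, dest[l]))              # row l, column = l-th digit of dest
        if entry is not None:
            return entry
        # rare case: any known node sharing at least as long a prefix and numerically closer
        candidates = [n for n in known_nodes
                      if shared_prefix_len(n, dest) >= l
                      and num_dist(n, dest) < num_dist(self_id, dest)]
        return min(candidates, key=lambda n: num_dist(n, dest)) if candidates else self_id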

  40. Pastry: Routing Properties • log_16 N steps • O(log N) state [Figure: example route for Route(d46a1c) from node 65a1fc, showing nodes d13da3, d4213f, d462ba, d467c4, d471f1]

  41. Pastry: Performance Integrity of overlay message delivery: • guaranteed unless L/2 simultaneous failures of nodes with adjacent nodeIds Number of routing hops: • No failures: < log16N expected, 128/b + 1 max • During failure recovery: • O(N) worst case, average case much better

  42. Pastry Join • X = new node, A = bootstrap, Z = nearest node • A finds Z for X • In process, A, Z, and all nodes in path send state tables to X • X settles on own table • Possibly after contacting other nodes • X tells everyone who needs to know about itself

  43. Pastry Leave • Noticed by leaf set neighbors when leaving node doesn’t respond • Neighbors ask highest and lowest nodes in leaf set for new leaf set • Noticed by routing neighbors when message forward fails • Immediately can route to another neighbor • Fix entry by asking another neighbor in the same “row” for its neighbor • If this fails, ask somebody a level up
