DHT designs: Overview and comparison Lintao Liu 3/23/2004
Content • Distributed Hash Table • Design considerations of DHTs • A few DHT designs: • Some interesting new designs • Comparison • Discussion
Peer-to-Peer network • P2P: sharing resources and services • Peers: a large number of heterogeneous PCs • Connection: LAN, Internet (high latency) • How to make it happen? • A directory service • Insertion, deletion, lookup, search • Must consider the special characteristics of a P2P network • Early P2P networks: • Gnutella: local directory on each peer • Napster: centralized directory server
DHT - Distributed Hash Table • Objective: • Distributed directory: • No central server, nor a completely localized index • Efficient directory services: • Insertion, deletion, lookup, search • DHT: storing <Key, Value> pairs • Each node is responsible for part of the key space • Distributed Hash Table: partition the key space • Given a key, determine where the value is stored in O(1) time • In other words, it determines where to retrieve the value. • The other half of the story: • How to reach the peer where the <Key, Value> is stored? • This is where the DHT designs differ.
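The key-partitioning idea above can be sketched in a few lines. This is an illustrative sketch, not taken from any of the designs discussed: it hashes names into a small circular ID space (the 16-bit width and the successor rule are assumptions for the example) and assigns each key to the first node at or after it on the ring.

```python
import hashlib

# Minimal sketch (illustrative assumptions): hash keys and node names into
# one circular ID space, then assign each key to its successor node.

BITS = 16  # assumed small ID space for the example

def dht_id(name: str) -> int:
    """Map an arbitrary string into the 2^BITS circular key space."""
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest[:2], "big") % (2 ** BITS)

def successor(key_id: int, node_ids: list[int]) -> int:
    """Return the first node ID at or after key_id, wrapping around the ring."""
    candidates = sorted(node_ids)
    for n in candidates:
        if n >= key_id:
            return n
    return candidates[0]  # wrap around the ring
```

Given this mapping, "where is the value for key k?" is answered locally in O(1); the open question, as the slide says, is how to *reach* that node.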
Routing in DHT-based P2P: • Overlay structure & Routing Table: • Nodes are organized in some logical structure • The links maintained by each peer constitute its routing table • Facilitates routing a message to a node • Metrics: • Lookup • Insertion, deletion • Fault tolerance
DHT: Considerations • Efficiency of the DHT • Lookup, insertion, deletion • Size of Routing Table • How much state information is maintained on each peer • O(n), O(logN) or O(1) • More state info means more maintenance cost • Flexibility of Routing Table • A rigid routing table • requires more maintenance cost • Complicates recovery • Precludes proximity-based routing
CAN: Berkeley • Overlay: • A virtual d-dimensional coordinate space • Each peer is responsible for a zone • Routing: • Each peer maintains info about its neighbors • Greedy algorithm for routing • Performance Analysis: • Expected: (d/4)·n^(1/d) steps for lookup • d: dimension
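CAN's greedy routing can be sketched as follows. This is a simplified illustration (zone centers standing in for zones, a unit torus, Euclidean distance are all assumptions of the sketch): each hop forwards to the neighbor whose coordinates are closest to the target point.

```python
# Illustrative sketch of CAN-style greedy routing on a d-dimensional torus:
# forward to the neighbor (represented by its zone center) closest to the target.

def torus_dist(a, b, size=1.0):
    """Euclidean distance between two points on a torus with wrap-around."""
    total = 0.0
    for x, y in zip(a, b):
        d = abs(x - y) % size
        total += min(d, size - d) ** 2  # take the shorter way around
    return total ** 0.5

def greedy_next_hop(neighbors, target):
    """Pick the neighbor whose zone center is closest to the target coordinate."""
    return min(neighbors, key=lambda c: torus_dist(c, target))
```

With zones of roughly equal size, each hop fixes about one coordinate's worth of distance, which is where the (d/4)·n^(1/d) expected path length comes from.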
Chord: MIT • Overlay: • Peers are organized in a ring • Successor peer • <k, value> is stored on the successor of k • Routing: • Finger Table: • More info for close part of the ring • Large jumps, then shorter jumps • Resembling a binary search (?)
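The finger-table idea above ("large jumps, then shorter jumps") can be sketched as follows; the 6-bit ring and the assumption that finger targets are themselves node IDs are simplifications for illustration.

```python
# Sketch of Chord finger targets and greedy routing on an M-bit ring.
# Assumes (for illustration) that each finger target is an actual node.

M = 6  # 2^6 = 64-slot ring, an assumed small example

def finger_targets(n: int) -> list[int]:
    """Finger i of node n points at the successor of n + 2^i (mod 2^M)."""
    return [(n + 2 ** i) % (2 ** M) for i in range(M)]

def closest_preceding(n: int, key: int, fingers: list[int]) -> int:
    """Among n's fingers, pick the one closest before `key` on the ring."""
    best, best_gap = n, (key - n) % (2 ** M)
    for f in fingers:
        gap = (key - f) % (2 ** M)
        if 0 < gap < best_gap:
            best, best_gap = f, gap
    return best
```

Each hop at least halves the remaining clockwise distance to the key, which is why the process resembles a binary search and finishes in O(logN) hops.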
Chord: Characteristics • Efficient directory operations • Insertion, deletion, lookup • Good analysis properties • O(logN) routing table size • O(logN) logical steps to reach the successor of a key k • O(log^2 N) for peer joining and leaving • High maintenance cost • A node join/leave induces state changes on other nodes • Rigidity of Routing Table: • For a given network, there is only one optimal/ideal state • Unique, and deterministic
Pastry: Rice • Circular namespace • Routing Table: • Peer p, ID: IDp • For each prefix of IDp, keep a set of peers that share the prefix but differ in the next digit. • Routing: • Plaxton algorithm • Choose a peer whose ID shares a longer prefix with the target ID • Otherwise, choose a peer whose ID is numerically closer to the target ID • Exploits locality • Similar analysis properties to Chord
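The prefix-routing step can be sketched as below. This is an illustrative sketch only: IDs are base-4 digit strings here (Pastry actually uses base-16 digits by default), and the routing table is a plain dict keyed by (matched-prefix length, next digit).

```python
# Sketch of Plaxton-style prefix routing as used by Pastry.
# IDs are digit strings; the routing table maps (row, digit) -> peer ID.

def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def next_hop(my_id: str, target: str, routing_table: dict) -> str:
    """Forward to an entry matching one more digit of the target than we do."""
    row = shared_prefix_len(my_id, target)       # digits already matched
    col = target[row]                            # next digit to match
    return routing_table.get((row, col), my_id)  # fall back to self if empty
```

Each hop extends the matched prefix by at least one digit, giving O(log N) hops for base-b digits of an O(log N)-bit ID.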
Symphony: Stanford • Distributed Hashing in a Small World • Like Chord: • Overlay structure: ring • Key ID space partitioning • Unlike Chord: • Routing Table • Two short links for immediate neighbors • k long distance links for jumping • Long distance links are built in a probabilistic way • Peers are selected using a Probability Distribution Function (pdf) • Exploit the characteristics of a small-world network • Dynamically estimate the current system size *
Symphony: Performance • Each node has k = O(1) long distance links • Lookup: • Expected path length: O((1/k)·log^2 N) hops • Join & leave • Expected: O(log^2 N) messages • Compared with Chord: • Discards the strong requirements on the routing table (finger table) • relies on the small-world structure to reach the destination.
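Symphony's probabilistic long links are drawn from a harmonic distribution over ring distance. A sketch of that sampling step, assuming a unit-circumference ring and using inverse-transform sampling of the pdf p(x) = 1/(x ln n) on [1/n, 1]:

```python
import math
import random

# Sketch of Symphony-style harmonic long-link selection on a unit ring.
# The pdf p(x) = 1/(x ln n) has CDF F(x) = 1 + ln(x)/ln(n), so inverting
# a uniform draw u gives x = n^(u-1), which lies in [1/n, 1].

def harmonic_link_distance(n: int, rng=random) -> float:
    """Draw a clockwise link distance for a network of (estimated) size n."""
    u = rng.random()                       # uniform in [0, 1)
    return math.exp(u * math.log(n)) / n   # = n**(u - 1)
```

Because short distances are heavily favored but long jumps remain possible, k such links per node suffice for O((1/k)·log^2 N) expected routing, in line with Kleinberg-style small-world results.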
Kademlia: NYU • Overlay: • Tree • Node Position: • shortest unique prefix • Service: • Locate the closest nodes to a desired ID • Routing: • based on the XOR metric • For each subtree that p does not reside in (one per prefix length of p's ID), keep up to k nodes from it • Shares a prefix with p • Magnitude of distance (XOR) • k: replication parameter (e.g. 20) • Maintenance • Re-publishing
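The XOR metric and the subtree (k-bucket) structure it induces can be sketched directly: the distance between two IDs is their bitwise XOR, and the position of the highest differing bit tells you which of p's subtrees (buckets) the other node falls into.

```python
# Sketch of Kademlia's XOR distance and the bucket index it implies.
# IDs are plain Python ints for illustration (real IDs are 160 bits).

def xor_distance(a: int, b: int) -> int:
    """Kademlia distance: bitwise XOR of the two IDs."""
    return a ^ b

def bucket_index(my_id: int, other: int) -> int:
    """Index of the subtree (k-bucket), i.e. position of the highest differing bit."""
    return xor_distance(my_id, other).bit_length() - 1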
Kademlia: • Comparing with Chord: • Like Chord: achieving similar performance • deterministic • O(logN) contacts (routing table size) • O(logN) steps for lookup service (?) • Lower node join/leave cost • Unlike Chord: • Routing table: view of the network • Flexible Routing Table • Given a topology, there are more than one routing table • “So relaxed, that maintenance is minimal” • Symmetric routing • Comparing with Pastry: • Both have flexible routing table • Better analysis properties • Pastry: “complicating attempts at formal analysis of worst-case behavior”
Skip Graphs: Yale • Based on “skip list”: 1990 • A randomized balanced tree structure organized as a tower of increasingly sparse linked lists • All nodes join the link list of level 0 • For other levels, each node joins with a fixed probability p • Each node has 2/(1-p) pointers • Average search time: O(log(n/((1-p)*log1/p)))
Skip Graph: • Skip List is not suitable for P2P environment • No redundancy, Hotspot problem • Vulnerable to failure and contention • Skip Graph: Extension of Skip List • Level 0 link list builds a Chord ring • Multiple (max 2i) lists for level i (i = 1, … logn) • Each node participate in all levels, but different lists • Membership vector m(x): decide which list to join • Every node sees its own skip list
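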
Skip Graph: • Performance: • Since the membership vector is random, the performance analysis is also probabilistic. • Expected Lookup cost: • O(logn) time and O(logn) messages (conclusion from skip list) • Insertion: • same as search, but more complicated due to the concurrent join • More fault tolerant • θ(logn) neighbors, Expansion ratio Ω(1/logn) (?) • Overall about skip graph: • Probabilistic (like Symphony) • Routing table is flexible • Given the same participating node set, no fixed network structure • O(logn) links, O(logn) hops (Symphony: O(1) links) • Simple structure, low maintenance cost, failure resistant
Koorde: MIT • Based on de Bruijn graphs • Each node has two pointers • Node ID: m, b = logn, ID length • 2m % 2b & (2m + 1) % 2b • Lookup: • Shift one bit by one bit along the way to the target ID • E.g. 010-> 011: 010->100->001->011 (3 steps) • O(logn) steps • Any similarity with Pastry? * • But P2P network is an “incomplete” Bruijn graph • Many positions are empty • Solution: imaginary node, modified lookup protocol
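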
Koorde: • Organize all nodes in a Chord Ring • Successor link: the next node on the ring • Still two links: • The next node on the ring (successor) • The predecessor of 2m % 2b • The first node which precedes 2m % 2b • Very likely it’s also the predecessor of (2m+1) % 2b • Routing: • simulate the routing in a complete graph • Imaginary de Bruijn node • Example…
Koode: • Performance: • With 2 links, O(logn) hops (expected) • at most 3*logn steps (with high prob) • With O(logn) neighbors, O(logn/loglogn) hops • Fault tolerant • Compared with other DHT designs: • Constant routing table size: • 2 well defined links, no flexibility • Expected hops: O(logn) • Probabilistic • Viceroy: • Another DHT which achieve the similar results • Emulation of butterfly network
Summary & Comparison • Size of Routing Table • Node Degree • O(logn): chord, Pastry, Kademlia, Skip Graphs • O(1): Symphony, Koorde, Viceroy • O(n): • Flexibility of Routing Table • Rigid routing table: chord, Koorde, Viceroy • Flexible routing table: • Pastry, Kademlia, Skip Graphs, Symphony • Performance:Mostly O(logN) • Deterministic: chord, pastry, Kademlia, • Probabilistic: skip graphs, symphony, Koorde, viceroy
Summary & Comparison • Optimal results: • For any constant degree k, θ(logN) hops is optimal • To provide a high degree of fault tolerance: • For a network to stay connected with constant probability when all nodes fail with probability ½ • O(logn) neighbors are needed • O(logn/loglogn) hops may be achieved • But, what is really important? • How far is these formal analysis to the real cost and performance? • Something not considered in these DHT designs?
Discussion: • What can be improved on the current DHT designs? • Exploit the locality? • It seems that only Pastry tries on this. • Low maintenance cost? • It has been very low: constant size of routing table • Other cost? Maintenance cost doesn’t depend solely on the routing table size. • Adapt to the content distribution or user interests? • Is DHT efficient for locating popular items? • Are those should be considered in DHT design or other level of the design? • Why is there no widely used DHT-based P2P network? • Is that because the designs not good enough? • Even the formal analysis give very good properties. • More?