Structured P2P Networks

Guo Shuqiao, Yao Zhen, Rakesh Kumar Gupta
CS6203 Advanced Topics in Database Systems

Introduction: P2P Network
• A peer-to-peer (P2P) network is a distributed system in which peers employ distributed resources to perform a critical function in a decentralized fashion [LW2004]
• Classification of P2P networks
  • Unstructured and Structured
  • Centralized and Decentralized
  • Hierarchical and Non-Hierarchical

Structured P2P network
• Distributed hash table (DHT)
  • A DHT is a structured overlay that offers extreme scalability and a hash-table-like lookup interface
  • CAN, Chord, Pastry
• Other techniques
  • Skip list
  • Skip graph, SkipNet

Outline
• Hash-based techniques in P2P
  • Hash-based structured P2P systems
    • Pastry
    • P-Grid
  • Two important issues
    • Load balancing
    • Neighbor table consistency preserving
  • Comparison of DHT techniques
• Skip-list based system
  • SkipNet
• Conclusion

Pastry [RD2001]
• Pastry is a P2P object location and routing scheme
  • Hash-based
• Properties
  • Completely decentralized
  • Scalable
  • Self-organized
  • Fault-resilient
  • Efficient search

Design of Pastry
• nodeID: each node has a unique 128-bit numeric identifier
  • Assigned randomly
  • Nodes with adjacent nodeIDs are diverse in geography, ownership, etc.
  • Assumption: nodeIDs are uniformly distributed in the ID space
  • Presented as a sequence of digits with base 2^b
  • b is a configuration parameter (typical value 4)

Design of Pastry (cont.)
• Each message/query has a numeric key of the same length as the nodeIDs
• The key is presented as a sequence of digits with base 2^b
• Routing: a message is routed to the node whose nodeID is numerically closest to the key
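The digit representation can be sketched in a few lines. This is a minimal illustration, not Pastry's actual code: the helper names `to_digits` and `shared_prefix_len` are hypothetical, and the 8-digit base-4 IDs are borrowed from the example node state used later in these slides.

```python
def to_digits(value: int, b: int, length: int) -> list[int]:
    """Return `value` as `length` digits in base 2**b, most significant first."""
    mask = (1 << b) - 1
    return [(value >> (b * i)) & mask for i in reversed(range(length))]


def shared_prefix_len(x: list[int], y: list[int]) -> int:
    """Number of leading digits that two equal-length IDs have in common."""
    n = 0
    for dx, dy in zip(x, y):
        if dx != dy:
            break
        n += 1
    return n


# With b = 2 every digit is in base 4, so an 8-digit ID covers 16 bits.
node = to_digits(int("10233102", 4), b=2, length=8)
key = to_digits(int("10223220", 4), b=2, length=8)
print(node, key, shared_prefix_len(node, key))
```

With b = 4, a 128-bit nodeID becomes 32 hexadecimal digits; the shared prefix length is what selects a routing table row.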

Destination of Routing
[Figure: nodes 20, 31, 23, 03 and 12 in the ID space; a message with key 10 and its destination node are marked.]

Pastry Scheme
• Given a message with key k, node A forwards it to the node whose ID is numerically closest to k among all nodes known to A
• Each node maintains some routing state

Pastry Node State
• A leaf set L
• A routing table
• A neighborhood set M
NodeID 10233102
Leaf set
  SMALLER: 10233033 10233021 10233001 10233000
  LARGER:  10233120 10233122 10233230 10233232
Routing table (entries are written prefix-next digit-rest; a lone digit is the node's own digit in that row; "." is an empty entry)
  row 0:  -0-2212102   1            -2-2301203   -3-1203203
  row 1:  0            1-1-301233   1-2-230203   1-3-021022
  row 2:  10-0-31203   10-1-32102   2            10-3-23302
  row 3:  102-0-0230   102-1-1302   102-2-2302   3
  row 4:  1023-0-322   1023-1-000   1023-2-120   3
  row 5:  10233-0-01   1            10233-2-32   .
  row 6:  0            .            102331-2-0   .
  row 7:  .            .            2            .
Neighborhood set
  13021022 10200230 11301233 31301233
  02212102 22301203 31203203 33213321

Meanings of ‘Close’
• Closest according to the proximity metric (real distance): the nearest neighbor
• Closest according to numerical value: the node with the closest nodeID
[Figure: nodes 20, 31, 23, 03 and 12 illustrating the two notions.]

Pastry Node State
• A leaf set
  • |L| nodes with the closest nodeIDs
  • |L|/2 larger ones and |L|/2 smaller ones
  • Useful in message routing
• A neighborhood set
  • |M| nearest neighbors
  • Useful in maintaining locality properties

Leaf Set and Neighborhood Set
• In this example (node A, nodeID 10233102), b = 2 and l = 8
• |L| = 2 × 2^b = 8
  • SMALLER: 10233033, 10233021, 10233001, 10233000
  • LARGER: 10233120, 10233122, 10233230, 10233232
• |M| = 2 × 2^b = 8

Routing Table
• l rows and 2^b columns
• Row i: entries share an i-digit prefix with the nodeID of A (10233102)
• Column j: the next digit after the prefix is j
• b = 2, l = 8 -> 8 rows and 4 columns
• E.g. the row for the 2-digit prefix 10: j = 0 -> 10-0-31203, j = 1 -> 10-1-32102, j = 3 -> 10-3-23302

Routing
• Step 1: If k falls within the range of nodeIDs covered by A's leaf set, forward the message to the node in the leaf set whose nodeID is closest to k
  • E.g. k = 10233022 falls in the range (10233000, 10233232), so A (nodeID 10233102) forwards it to node 10233021
• If k is not covered by the leaf set, go to step 2

Routing
• Step 2: The routing table is used, and the message is forwarded to a node whose ID shares a longer prefix with k than A's nodeID does
  • E.g. k = 10223220: A (nodeID 10233102) forwards it to node 10222302 (routing table entry 102-2-2302)
• If the appropriate entry in the routing table is empty, go to step 3

Routing
• Step 3: The message is forwarded to a node in the leaf set whose ID shares the same prefix with k as A does, but is numerically closer to k than A
  • E.g. k = 10233320: A (nodeID 10233102) forwards it to node 10233232
• If no such node exists, A is the destination node
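The three routing steps can be sketched against the example state of node A. This is an illustrative sketch only: the names `next_hop`, `value` and `shared_prefix_len` are hypothetical, IDs are kept as base-4 strings, step 3 searches only the leaf set as the slide states, and picking the closest candidate in step 3 is an assumption (any closer candidate would satisfy the rule).

```python
# State of node A, transcribed from the example slides (IDs are base-4 digit strings).
A = "10233102"
LEAF_SET = ["10233033", "10233021", "10233001", "10233000",
            "10233120", "10233122", "10233230", "10233232"]
# ROUTING_TABLE[row][col]: node sharing `row` leading digits with A, next digit `col`.
ROUTING_TABLE = {
    0: {0: "02212102", 2: "22301203", 3: "31203203"},
    1: {1: "11301233", 2: "12230203", 3: "13021022"},
    2: {0: "10031203", 1: "10132102", 3: "10323302"},
    3: {0: "10200230", 1: "10211302", 2: "10222302"},
    4: {0: "10230322", 1: "10231000", 2: "10232120"},
    5: {0: "10233001", 2: "10233232"},
    6: {2: "10233120"},
}


def value(node_id: str) -> int:
    """Numeric value of a base-4 ID string."""
    return int(node_id, 4)


def shared_prefix_len(x: str, y: str) -> int:
    """Number of leading digits two equal-length IDs share."""
    n = 0
    while n < len(x) and x[n] == y[n]:
        n += 1
    return n


def next_hop(key: str) -> str:
    """Return the node A forwards `key` to (A itself if A is the destination)."""
    def dist(node_id: str) -> int:
        return abs(value(node_id) - value(key))

    # Step 1: key lies within the leaf set range -> numerically closest node.
    leaf_values = [value(n) for n in LEAF_SET]
    if min(leaf_values) <= value(key) <= max(leaf_values):
        return min(LEAF_SET + [A], key=dist)

    # Step 2: routing table entry that extends the shared prefix by one digit.
    row = shared_prefix_len(A, key)
    entry = ROUTING_TABLE.get(row, {}).get(int(key[row]))
    if entry is not None:
        return entry

    # Step 3: leaf set node with an equally long prefix that is closer than A.
    closer = [n for n in LEAF_SET
              if shared_prefix_len(n, key) >= row and dist(n) < dist(A)]
    return min(closer, key=dist) if closer else A


print(next_hop("10233022"))  # step 1 example: 10233021
print(next_hop("10223220"))  # step 2 example: 10222302
print(next_hop("10233320"))  # step 3 example: 10233232
```

The three calls reproduce the three slide examples, one per step.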

Routing
• The routing procedure always converges, since each step chooses a node that either
  • Shares a longer prefix with the key, or
  • Shares an equally long prefix, but is numerically closer to the key
• Routing performance
  • The expected number of routing steps is log_{2^b} N
  • Assumption: accurate routing tables and no recent node failures

Performance
[Figure: average number of routing hops versus number of Pastry nodes; b = 4, |L| = 16, |M| = 32, 200,000 lookups.]

Discussion of Pastry
• The parameters make Pastry flexible
  • b is the most important parameter; it determines the power of the system
  • Trade-off between routing efficiency (log_{2^b} N hops) and routing table size (log_{2^b} N × 2^b entries)
  • Each node can choose its own |L| and |M| based on its situation
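The trade-off controlled by b can be made concrete by evaluating both expressions. A rough sketch: the function names are hypothetical and N = 65,536 is an arbitrary network size chosen for illustration.

```python
import math


def expected_hops(n_nodes: int, b: int) -> float:
    """Expected routing steps: logarithm of N to base 2^b."""
    return math.log2(n_nodes) / b


def routing_table_entries(n_nodes: int, b: int) -> float:
    """Approximate routing table size from the slide: log_{2^b}(N) * 2^b."""
    return expected_hops(n_nodes, b) * 2 ** b


# Larger b means fewer hops but a larger table per node.
for b in (1, 2, 4):
    print(b, expected_hops(65_536, b), routing_table_entries(65_536, b))
```

Going from b = 2 to b = 4 halves the expected hop count while doubling the table size at this N.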

Discussion of Pastry: routing scheme
• Is the routing choice locally optimal?
• E.g. k = 10233200 at node A (nodeID 10233102), whose leaf set in this example is SMALLER 10233033, 10233021, 10233001, 10233000 and LARGER 10233120, 10233122, 10233132, 10233133
  • X's nodeID = 10233232 (routing table entry 10233-2-32)
  • Y's nodeID = 10233133 (in the leaf set)
  • Dis(k, X's ID) = |10233200 - 10233232| = 32 (base 4)
  • Dis(k, Y's ID) = |10233200 - 10233133| = 1
  • The locally optimal node is Y, but Pastry forwards to node X
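The two distances can be checked with base-4 arithmetic. A small sketch; `value` and `to_base4` are hypothetical helper names.

```python
def value(node_id: str) -> int:
    """Numeric value of a base-4 ID string."""
    return int(node_id, 4)


def to_base4(n: int) -> str:
    """Render a non-negative integer as a base-4 digit string."""
    digits = ""
    while True:
        n, r = divmod(n, 4)
        digits = str(r) + digits
        if n == 0:
            return digits


k, x, y = "10233200", "10233232", "10233133"
dist_x = abs(value(k) - value(x))
dist_y = abs(value(k) - value(y))
print(to_base4(dist_x), to_base4(dist_y))  # 32 1
```

Y is numerically closer to k, yet k lies outside the leaf set range, so the routing table entry for prefix 10233 and next digit 2 (node X) is used.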

P-Grid [Aberer2001]
• P-Grid is a scalable access structure for P2P
• Hash-based & virtual binary search tree
• Randomized algorithms are used for constructing the access structure
[Figure: virtual binary tree with leaves 00, 01, 10, 11 and peers 1, 6, 2, 3, 4, 5; each peer's routing table lists prefix:peer references (1:3, 01:2 | 1:5, 01:2 | 1:4, 00:6 | 0:2, 11:5 | 0:6, 11:5 | 0:6, 10:4); example query k = 100, with peer 4 marked.]
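Prefix routing over the virtual binary tree can be sketched as follows. Everything here is an assumption read off the flattened figure: peers 1, 6, 2, 3, 4, 5 are matched in order to the six reference tables, each peer's path is inferred from its references, and the query is started at peer 6 purely for illustration. The function name `search` is hypothetical.

```python
# Assumed reconstruction of the figure: each peer is responsible for a binary
# path and, for every level, keeps a reference to a peer on the other side.
PEERS = {
    1: {"path": "00", "refs": {"1": 3, "01": 2}},
    6: {"path": "00", "refs": {"1": 5, "01": 2}},
    2: {"path": "01", "refs": {"1": 4, "00": 6}},
    3: {"path": "10", "refs": {"0": 2, "11": 5}},
    4: {"path": "10", "refs": {"0": 6, "11": 5}},
    5: {"path": "11", "refs": {"0": 6, "10": 4}},
}


def search(start: int, key: str) -> list[int]:
    """Return the peers visited while resolving `key`, starting at `start`."""
    route = [start]
    peer = start
    while True:
        path = PEERS[peer]["path"]
        # First bit position where the key leaves this peer's path, if any.
        mismatch = next((i for i in range(len(path)) if key[i] != path[i]), None)
        if mismatch is None:
            return route  # this peer is responsible for the key
        # Follow the reference for the prefix that the key takes instead.
        prefix = path[:mismatch] + key[mismatch]
        peer = PEERS[peer]["refs"][prefix]
        route.append(peer)


print(search(6, "100"))  # [6, 5, 4]
```

Each hop resolves at least one more bit of the key, so a search takes at most as many hops as the path length.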

P-Grid (cont.)
• Properties
  • Completely decentralized
  • Scalable with the total number of nodes and data items
  • Fault-resilient: search is robust against node failures
  • Efficient search

Discussion of Pastry and P-Grid
• Both systems make uniformity assumptions
  • Pastry: ID space
  • P-Grid: data distribution and peer behavior
• If the data/message/query distribution is skewed, Pastry and P-Grid are not able to balance the load

Load Balancing
• Consider a DHT P2P system with N nodes
  • Θ(log N) imbalance factor if item IDs are uniformly distributed [SMKKB2001]
  • Even worse if applications associate semantics with the item IDs
    • IDs would no longer be uniformly distributed
• How to
  • Minimize the load imbalance?
  • Minimize the amount of load moved?

Load Balancing
• Challenges
  • Data items are continuously inserted/deleted
  • Nodes join and depart continuously
  • The distribution of data item IDs and item sizes can be skewed
• Solution: [GLSKS2004]

Load Balancing
• Virtual server
  • Represents a peer in the DHT rather than a physical node
  • A physical node hosts one or more virtual servers
  • Load of a node = total load of its virtual servers
  • E.g., in Chord
[Figure: Chord ring with positions 0 to 7; virtual servers on the ring are mapped to physical nodes, with finger tables FT1 and FT3 shown.]

Load Balancing
• Basic idea
  • Directories
    • Store load information of the peer nodes
    • Periodically schedule reassignments of virtual servers
  • The distributed load balancing problem is reduced to a centralized problem at each directory

Load Balancing
• Load balancing algorithm
  • Node
    • Randomly chooses a directory (directory IDs are known to all nodes)
    • In a new cycle, OR when utilization > Ke (emergency load balancing), sends to the directory: (1) the loads of all virtual servers that it is responsible for, (2) its capacity
  • Directory
    • Receives information from nodes
    • Computes a schedule of virtual server transfers among the nodes contacting it, in order to reduce their maximal utilization
    • Delays T time before the next cycle

Load Balancing
• Load balancing algorithm (cont.)
  • Computing an optimal reassignment is NP-complete
  • Greedy algorithm, O(m log m)
    • For each heavily loaded node, move its least loaded virtual server to a pool
    • For each virtual server in the pool, from heaviest to lightest, assign it to the node n that minimizes the resulting load
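The greedy reassignment can be sketched as below. This is a simplified illustration under stated assumptions, not the algorithm of [GLSKS2004] verbatim: a node counts as heavily loaded when its utilization (load / capacity) exceeds a `target` threshold, it sheds its lightest virtual servers until it is under that threshold, and "resulting load" is read as resulting utilization. The function name and data layout are hypothetical, and the sketch does not aim for the O(m log m) bound.

```python
def greedy_reassign(nodes: dict, target: float) -> list[tuple[str, str, str]]:
    """Rebalance virtual servers in place; return (server, from_node, to_node) moves.

    `nodes` maps a node name to {"capacity": number, "servers": {name: load}}.
    """
    def utilization(name: str, extra: float = 0.0) -> float:
        node = nodes[name]
        return (sum(node["servers"].values()) + extra) / node["capacity"]

    # Phase 1: heavy nodes shed their least loaded virtual servers into a pool.
    pool = []
    for name in nodes:
        servers = nodes[name]["servers"]
        while servers and utilization(name) > target:
            vs = min(servers, key=servers.get)
            pool.append((vs, servers.pop(vs), name))

    # Phase 2: place pooled servers, heaviest first, where utilization stays lowest.
    moves = []
    for vs, load, origin in sorted(pool, key=lambda item: item[1], reverse=True):
        dest = min(nodes, key=lambda name: utilization(name, load))
        nodes[dest]["servers"][vs] = load
        moves.append((vs, origin, dest))
    return moves


example = {
    "a": {"capacity": 10, "servers": {"a1": 6, "a2": 5, "a3": 2}},
    "b": {"capacity": 10, "servers": {"b1": 3}},
    "c": {"capacity": 10, "servers": {"c1": 4}},
}
print(greedy_reassign(example, target=0.9))
# [('a2', 'a', 'b'), ('a3', 'a', 'c')]
```

In the example, node a starts at utilization 1.3; after the two transfers the maximum utilization is 0.8.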

Load Balancing
• Performance
  • Trade-off: load movement vs. load balance
    • Load balance is measured by max node utilization
    • When T decreases
      • Max node utilization decreases
      • Load movement increases
  • Effective in achieving load balance
    • At system utilization as high as 90%
    • While transferring only 8% of the load that arrives in the system
  • Emergency load balancing is necessary

Consistency Preserving
• Neighbor table
  • A table of neighbor pointers
  • For efficient routing in a P2P system
• Challenge
  • How to maintain consistent neighbor tables in a dynamic network where nodes may join, leave and fail concurrently and frequently?

Consistency Preserving
• Consistent network
  • For every entry in the neighbor tables, if at least one qualified node exists in the network, then the entry stores at least one qualified node
    • Qualified node for an entry of a node's neighbor table: a node whose ID ends with the suffix required by that entry
  • Otherwise, the entry is empty

Consistency Preserving
• K-consistent network
  • For every entry in the neighbor tables, if there exist H qualified nodes in the network, then the entry stores at least min(K, H) qualified nodes
  • Otherwise, the entry is empty
• For K > 0, K-consistency => consistency
• 1-consistency = consistency
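A K-consistency check can be sketched directly from the definition. The table layout is an assumption made for illustration: entry (level, digit) of node x requires the suffix formed by `digit` followed by the rightmost `level` digits of x, and a node may appear in its own table. All function names are hypothetical.

```python
DIGITS = "0123456789abcdef"


def required_suffix(node_id: str, level: int, digit: str) -> str:
    """Assumed entry rule: `digit` followed by the rightmost `level` digits of the ID."""
    return digit + (node_id[-level:] if level else "")


def build_tables(nodes: list[str], k: int, base: int) -> dict:
    """Fill every entry with up to k qualified nodes (a K-consistent starting point)."""
    tables = {}
    for x in nodes:
        tables[x] = {}
        for level in range(len(x)):
            for d in DIGITS[:base]:
                suffix = required_suffix(x, level, d)
                qualified = sorted(n for n in nodes if n.endswith(suffix))
                if qualified:
                    tables[x][(level, d)] = set(qualified[:k])
    return tables


def is_k_consistent(tables: dict, k: int, base: int) -> bool:
    """Check that each entry stores at least min(K, H) qualified nodes and nothing else."""
    nodes = set(tables)
    for x, table in tables.items():
        for level in range(len(x)):
            for d in DIGITS[:base]:
                suffix = required_suffix(x, level, d)
                qualified = {n for n in nodes if n.endswith(suffix)}
                stored = table.get((level, d), set())
                if not stored <= qualified:
                    return False
                if len(stored) < min(k, len(qualified)):
                    return False
    return True


tables = build_tables(["000", "011", "101", "111"], k=2, base=2)
print(is_k_consistent(tables, k=2, base=2))   # True
tables["000"][(0, "1")].discard("101")        # simulate a hole left by a failed neighbor
print(is_k_consistent(tables, k=2, base=2))   # False
print(is_k_consistent(tables, k=1, base=2))   # True
```

After one neighbor is dropped, the network is no longer 2-consistent but is still 1-consistent, which illustrates why storing K > 1 neighbors per entry gives slack under failures.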

Consistency Preserving
• General strategy
  • Identify a consistent subnet that is as large as possible
  • Only replace a neighbor with a closer one if both of them belong to the subnet
  • Expand the consistent subnet after new nodes join
  • Maintain consistency of the subnet when nodes fail

Consistency Preserving
• Approach of [LL2004b]
  • Design a join protocol such that
    • An initially K-consistent network remains K-consistent after the join processes of a set of nodes terminate
    • Termination of a join implies that the joined node belongs to the consistent subnet
  • Design a failure recovery protocol that
    • Recovers K-consistency of the subnet by repairing holes left by failed neighbors with qualified nodes in the subnet
    • The protocol is presented in [LL2004a], but is integrated with the join protocol in the experiments of this paper

Consistency Preserving
• Join protocol
  • Each node has a status
    • copying, waiting, notifying, cset_waiting, in_system
    • S-node: a node in status in_system
    • T-node: any other node
  • All S-nodes form a consistent subnet

Consistency Preserving
• Join protocol: status transitions of a joining node x
  • copying: copy neighbor information from S-nodes to fill in most entries of its table, level by level
    • -> waiting: when x cannot find a qualified S-node for a level i >= 1
  • waiting: try to find an S-node that shares at least the rightmost i-1 digits with x and stores x as a neighbor
    • -> notifying: when such a node, say y, is found
  • notifying: seek and notify nodes that share the rightmost j digits with x, where j is the lowest level at which x is stored in y's table
    • -> cset_waiting: when notifying is finished
  • cset_waiting: wait for the nodes that are joining concurrently and are likely to be in the same consistent subnet
    • -> in_system: when x confirms that all of those nodes have exited the notifying status

Consistency Preserving
• Performance
  • p-ratio
    • In x's table, the primary neighbor stored for an entry is y, while the true primary neighbor should be z
    • p-ratio = (delay from x to y) / (delay from x to z)
  • K-consistency is always maintained in all experiments
  • When K increases, p-ratio decreases
    • More neighbor information is stored => more messages
  • Even with massive joins and failures, tables are still optimized greatly

Comparing DHTs [DGPR2003]
• Each DHT algorithm has many details, which makes comparison difficult; a component-based analysis approach is used
  • Break the DHT design into independent components
  • Analyze the impact of each component choice separately
• Two types of components
  • Routing-level: neighbor & route selection
  • System-level: caching, replication, querying policy, latency

Metrics Used
• Metrics used in comparison
  • Flexibility: options in choosing neighbors and routes
  • Resilience: does it still route when nodes go down?
  • Load balancing: is the content distributed?
  • Proximity & latency: is the content stored nearby?
• Aspects of a DHT
  • Geometry: a structure that inspires a DHT design
  • Distance function: the distance between two nodes
  • Algorithm: rules for selecting neighbors and routes using the distance function

Algorithm & Geometry
• What are routing algorithm & geometry?
  • Routing algorithm: the exact rules for selecting neighbors and routes (e.g. Chord, CAN, PRR, Tapestry, Pastry)
  • Geometry: the algorithm's underlying structure, derived from the way in which neighbors and routes are chosen (e.g. Chord routes on a ring)
• Why is geometry important? Geometry captures the flexibility in the selection of neighbors and routes.
  • Neighbor selection: can neighbors be chosen based on proximity? This leads to shorter paths.
  • Route selection: the number of options for selecting next hops. More options lead to shorter, more reliable paths.

DHT Algorithms Analysis
• The table summarizes the geometries & algorithms.
• We will examine the metric flexibility in these two aspects
  • Flexibility in neighbor selection
  • Flexibility in route selection
[Figure: example hypercube (nodes 000 to 111), ring (nodes 0 to 7) and tree (root; 0, 1; 00, 01, 10, 11) geometries.]

Tree Geometry
• PRR uses tree geometry.
• The distance between two nodes is the height of their smallest common subtree (log N in a well-balanced tree)
• Neighbor selection flexibility: 2^(i-1) options for choosing the neighbor at distance i
• No flexibility in route selection
[Figure: binary tree (root; 0, 1; 00, 01, 10, 11) with the leaf set and subtrees of height 1 and height 2 marked.]

Hypercube Geometry
• CAN uses a d-torus hypercube.
• Each node has log n neighbors.
• Routing proceeds greedily by correcting bits in any order.
• Neighbors differ by exactly one bit, so there is no flexibility in choosing neighbors.
• Routing from a source to a destination at distance log n: the first hop has log n next-hop choices, the second hop has (log n) - 1 choices, and so on. Hence (log n)! possible routes.
[Figure: 3-dimensional hypercube with nodes 000 to 111.]
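The (log n)! count can be checked by enumerating every order in which the differing bits may be corrected. A minimal sketch on the 3-bit hypercube from the figure; the function name `hypercube_routes` is hypothetical.

```python
from itertools import permutations


def hypercube_routes(src: int, dst: int, dims: int) -> list[list[int]]:
    """All shortest routes from src to dst, one per order of bit corrections."""
    differing = [i for i in range(dims) if ((src ^ dst) >> i) & 1]
    routes = []
    for order in permutations(differing):
        node, path = src, [src]
        for bit in order:
            node ^= 1 << bit  # correct one bit, i.e. move to a neighbor
            path.append(node)
        routes.append(path)
    return routes


routes = hypercube_routes(0b000, 0b111, dims=3)
print(len(routes))  # 6, i.e. 3!
for path in routes:
    print([format(n, "03b") for n in path])
```

With n = 8 nodes, log n = 3: there are three choices for the first hop, two for the second and one for the last.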

Butterfly Geometry
• Viceroy uses butterfly geometry.
• Nodes are organized in a series of log n “stages”, where all the nodes at stage i are capable of correcting the i-th bit.
• Routing consists of 3 phases and is done in O(log n) hops.
• No flexibility in route selection or neighbor selection.

Ring Geometry
• Chord uses the ring geometry
• Each node maintains log n neighbors and routes to an arbitrary destination in O(log n) hops
• Flexibility in neighbor selection: a node has 2^(i-1) possible options for picking its i-th neighbor, giving approximately n^(log n / 2) possible routing tables for each node
• Yields (log n)! possible routes from a source to a destination at distance log n
[Figure: ring with nodes 0 to 7.]
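The routing table count follows from multiplying the number of options for each of the log n neighbors. A short derivation, assuming base-2 logarithms and independent choices per neighbor:

```latex
\prod_{i=1}^{\log_2 n} 2^{\,i-1}
  = 2^{\sum_{i=1}^{\log_2 n} (i-1)}
  = 2^{\frac{\log_2 n \,(\log_2 n - 1)}{2}}
  \approx 2^{\frac{(\log_2 n)^2}{2}}
  = n^{\frac{\log_2 n}{2}}
```

For n = 8 the exact product is 1 × 2 × 4 = 8, while the approximation gives 8^{1.5} ≈ 22.6; the estimate is meant for large n.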
