Searching via Your Neighbor’s Neighbor: The Power of Lookahead in P2P Networks
Gurmeet Manku (Stanford); Moni Naor and Udi Wieder (The Weizmann Institute of Science)
The Small World Phenomena: a very brief history
• Folklore: people are connected via short chains.
  • The graph of social networks has small diameter.
• Barabasi: the belief may have originated from a story by Frigyes Karinthy, 1929.
• Quantitative approach initiated by Milgram in the 1960s: “The six degrees of separation”.
• Mathematical modeling: model a social network by some distribution on graphs.
• A precursor of P2P: the need to locate a resource in a ‘natural’ network based on partial information.
P2P = Peer-to-Peer = a highly dynamic network
Routing in a Small World
• Common question: do short paths exist?
• Kleinberg’s algorithmic question: assuming short paths exist, how do people find them?
Modeling Small Worlds
• Kleinberg’s model [2000]:
  • People are points on a two-dimensional grid.
  • Grid edges (short range).
  • One long-range contact chosen with the harmonic distribution:
    • the probability of (u,v) is proportional to $1/d(u,v)^2$.
• Naturally generalizes to k long-range links (Symphony [MBR03], [ADS02]).
• Naturally generalizes to any dimension.
• Captures the intuitive notion that people know people who are close to them.
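The harmonic choice of a long-range contact can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes a non-wrapping `side` x `side` grid with L1 distance, and the function names `long_range_weights` and `sample_long_range_contact` are hypothetical.

```python
import random


def l1(u, v):
    """L1 (Manhattan) distance between two grid points."""
    return abs(u[0] - v[0]) + abs(u[1] - v[1])


def long_range_weights(u, side):
    """Unnormalized weights 1/d(u,v)^2 for every v != u on a side x side grid.

    Assumption: the grid does not wrap around (no torus).
    """
    return {
        (x, y): 1.0 / l1(u, (x, y)) ** 2
        for x in range(side)
        for y in range(side)
        if (x, y) != u
    }


def sample_long_range_contact(u, side, rng):
    """Draw one long-range contact for u with probability proportional to 1/d^2."""
    weights = long_range_weights(u, side)
    nodes = list(weights)
    return rng.choices(nodes, weights=[weights[v] for v in nodes], k=1)[0]
```

Calling `sample_long_range_contact((0, 0), 32, random.Random(1))` returns one grid point other than `(0, 0)`; nearby points are favored, but distant ones still appear with non-negligible probability.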
Modeling Small Worlds
• Small World Percolation:
  • People are points on a two-dimensional grid.
  • Grid edges (short range).
  • Each edge appears independently with probability equal to the inverse of its distance squared.
  • Degree of each node: $\Theta(\log n)$.
• Originates from the long-range percolation model.
• Shares structural properties with some popular randomized P2P networks: R-Chord, R-Hypercube, Skip Lists…
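A back-of-the-envelope check of the logarithmic degree, in expectation only. It assumes the L1 metric and a node $u$ far enough from the boundary that every L1 sphere of radius $r \le R$ fits inside the grid (there are exactly $4r$ grid points at L1 distance $r$). The cutoff $R$ is a hypothetical quantity, polynomial in $n$, so that $H_R = \Theta(\log n)$.

```latex
\mathbb{E}[\deg(u)]
  \;=\; \sum_{v \ne u} \frac{1}{d(u,v)^2}
  \;\approx\; \sum_{r=1}^{R} \frac{4r}{r^2}
  \;=\; 4 \sum_{r=1}^{R} \frac{1}{r}
  \;=\; 4 H_R
  \;=\; \Theta(\log n).
```

The $r = 1$ term contributes probability 1 per neighbor, which matches the grid edges always being present.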
Routing in Small Worlds
• Greedy algorithm: move to the node that minimizes the L1 distance to the target.
[Table residue: Greedy path-length bounds expressed in $\log n$ and the number of long-range links $k \le \log n$; the individual entries are not recoverable.]
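A minimal sketch of the Greedy rule stated on this slide, assuming the graph is given as an adjacency dictionary over grid coordinates. The helpers `grid_adj` and `greedy_route` are illustrative names, and long-range links are treated as directed, which is an assumption of this sketch.

```python
def l1(u, v):
    """L1 (Manhattan) distance between two grid points."""
    return abs(u[0] - v[0]) + abs(u[1] - v[1])


def grid_adj(side):
    """Adjacency lists containing only the short-range grid edges."""
    adj = {}
    for x in range(side):
        for y in range(side):
            adj[(x, y)] = [
                (x + dx, y + dy)
                for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                if 0 <= x + dx < side and 0 <= y + dy < side
            ]
    return adj


def greedy_route(adj, source, target, max_hops=10_000):
    """Repeatedly move to the neighbor that minimizes the L1 distance to target."""
    path = [source]
    current = source
    while current != target and len(path) <= max_hops:
        nxt = min(adj[current], key=lambda w: l1(w, target))
        if l1(nxt, target) >= l1(current, target):
            break  # no neighbor is closer; does not happen when all grid edges exist
        path.append(nxt)
        current = nxt
    return path
```

For example, on a 4 x 4 grid with one extra directed link from `(0, 0)` to `(3, 2)`, routing from `(0, 0)` to `(3, 3)` takes the long link first and then one grid edge; without the link the same route needs six grid hops.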
Properties of Greedy
• Simple: to understand and to implement.
• Local: if source and target are close, the path remains within a small area.
• In some cases (Hypercube, Chord), the best we can do.
• Not optimal with respect to the degree.
• Can Greedy routing be shortened?
  • Without compromising the good properties?
[Table residue: path-length bounds expressed in $\log n$ and $k \le \log n$; the individual entries are not recoverable.]
Neighbor of Neighbor (NoN) Routing
• Each node has a list of its neighbors’ neighbors.
• The message is routed greedily to the closest neighbor of neighbor (2 hops).
  • Let $w_1, w_2, \ldots, w_k$ be the neighbors of the current node $u$.
  • For each $w_i$ find $z_i$, the neighbor of $w_i$ closest to the target $t$.
  • Let $j$ be such that $z_j$ is the closest to the target $t$.
  • Route the message from $u$ via $w_j$ to $z_j$.
• Effectively it is Greedy routing on the squared graph.
  • The first hop may not be a greedy choice.
• Previous incarnations of the approach:
  • Coppersmith, Gamarnik and Sviridenko [2002]: proved an upper bound on the diameter of a small world graph.
    • No routing algorithm.
  • Manku, Bawa and Raghavan [2003]: a heuristic routing algorithm in ‘Symphony’, a Small-World-like P2P network.
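The four routing steps above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: nodes are integers on a line, `dist` is any distance function, and the names `line_adj`, `non_step` and `non_route` are hypothetical. Delivering directly when the target is already a neighbor and stopping when no progress is possible are additions of this sketch that the slide does not spell out.

```python
def line_adj(n, long_links):
    """Nodes 0..n-1 on a line with edges i <-> i+1, plus directed long links (a, b)."""
    adj = {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}
    for a, b in long_links:
        adj[a].append(b)
    return adj


def non_step(adj, u, t, dist):
    """Return (w_j, z_j): the neighbor of u whose own best neighbor is closest to t."""
    best_w, best_z = None, None
    for w in adj[u]:
        if not adj[w]:
            continue
        z = min(adj[w], key=lambda x: dist(x, t))  # z_i for this w_i
        if best_z is None or dist(z, t) < dist(best_z, t):
            best_w, best_z = w, z
    return best_w, best_z


def non_route(adj, source, target, dist, max_hops=10_000):
    """Route in 2-hop NoN steps until the target is reached or no progress is made."""
    path = [source]
    u = source
    while u != target and len(path) <= max_hops:
        if target in adj[u]:  # sketch-specific shortcut: deliver directly
            path.append(target)
            break
        w, z = non_step(adj, u, target, dist)
        if z is None or dist(z, target) >= dist(u, target):
            break
        path.extend([w, z])
        u = z
    return path
```

With `adj = line_adj(16, [(0, 5), (1, 12)])` and target 13, Greedy at node 0 would jump to 5, whereas `non_route(adj, 0, 13, lambda a, b: abs(a - b))` goes through node 1 to reach 12. That is an instance of the first hop not being a greedy choice.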
What can we show about NoN Greedy
• PSW, R-Chord, R-Hypercube are degree optimal w.h.p.
• Skip Lists: degree optimal in expectation.
• Kleinberg’s model and P2P variations: improved.
• Lower bounds for algorithms based on neighbor lists only (Greedy is a special case).
[Table residue: path-length bounds involving $\log n$, $\log\log n$ and $k$; the individual entries are not recoverable.]
Degree Optimal P2P Routing
• Different routing schemes:
  • Viceroy [MNR02]: emulates the butterfly network.
    • Constant degree.
    • O(log n) hops for routing.
  • Constructions emulating De Bruijn graphs.
    • Can achieve any degree/number-of-hops tradeoff.
    • In particular, degree O(log n) and O(log n / log log n) hops.
• Routing is not greedy.
  • Recent construction [AM] fixes that.
  • Even if target and source are close in label space, the message might be routed away.
• No (natural) prefix search.
  • Random keys are necessary.
Skip Graphs [AS02], [HDJ+03]
• Each node (resource) has a name.
• Nodes are arranged on a line sorted by name.
• Each node chooses a random string of bits.
• An edge is established if two nodes share a prefix which is not shared by the nodes between them.
• Allows prefix search.
[Figure residue: six nodes labeled a to f on a line, each annotated with random bits; the layout is not recoverable.]
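The edge rule on this slide can be illustrated with a short sketch. It assumes every node's bit string is given up front with a fixed length, and the function names `random_bits` and `skip_graph_edges` are hypothetical. For each prefix length k, nodes sharing the same k-bit prefix are linked to their nearest neighbor in name order with that prefix, which is the same as saying no node between them shares it.

```python
import random
from collections import defaultdict


def random_bits(names, length, rng):
    """Give every node an independent random bit string of the given length."""
    return {x: "".join(rng.choice("01") for _ in range(length)) for x in names}


def skip_graph_edges(names, bits):
    """Edges (a, b), a before b in name order, per the shared-prefix rule."""
    order = sorted(names)
    max_len = min(len(bits[x]) for x in order)
    edges = set()
    for k in range(max_len + 1):  # k = 0 is the empty prefix: the sorted line itself
        groups = defaultdict(list)
        for x in order:
            groups[bits[x][:k]].append(x)
        for members in groups.values():
            edges.update(zip(members, members[1:]))
    return edges
```

With names `a, b, c, d` and bits `0, 1, 0, 1`, level 0 links consecutive names and level 1 adds the edges `(a, c)` and `(b, d)`.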
Routing in Skip Graphs
• Greedy routing: use the longest edge possible.
  • Path length is $\Theta(\log n)$ w.h.p.
• The NoN algorithm optimizes over two hops.
Theorem: Using the NoN algorithm, the expected path length of any lookup is $O(\log n / \log\log n)$.
Skip Graphs: degree optimality
$X$: the number of two-hop paths between $d$ and $[0, d/\log d]$.
$D$: the event that a message reached the node $d$.
Lemma: $\Pr[X > 0 \mid D] \ge 1/2$.
Sufficiency of the lemma:
• Call a NoN 2-hop successful if it reduces the distance from $d$ to at most $d/\log d$.
• Need $O(\log n / \log\log n)$ successful 2-hops to get to distance 1.
• From the Lemma, this would take $O(\log n / \log\log n)$ 2-hops in expectation.
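A toy count of how many successful 2-hops the recurrence $d \to d/\log d$ needs. This is only a numeric illustration: it assumes base-2 logarithms and, as an assumption of this sketch, floors the shrink factor at 2 because $\log d < 1$ for very small $d$. The function name `successful_hops_needed` is hypothetical.

```python
import math


def successful_hops_needed(d):
    """Count steps of d -> d / max(2, log2(d)) until the distance is at most 1."""
    hops = 0
    d = float(d)
    while d > 1:
        d = d / max(2.0, math.log2(d))  # floor of 2 is a sketch-specific assumption
        hops += 1
    return hops
```

Starting from distance $2^{16}$ this loop needs 7 steps, compared with the 16 steps of plain halving. The small-distance tail accounts for most of the gap to $\log n / \log\log n = 4$.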
Skip Graphs: degree optimality
$X$: the number of two-hop paths between $d$ and $[0, d/\log d]$.
Want to show $\Pr[X > 0 \mid D] \ge 1/2$. Ignore the dependence on $D$.
$A_{i,j}$: there exists an edge between $i$ and $j$.
Lemma: $c_1/|i-j| \le \Pr[A_{i,j}] \le c_2/|i-j|$.
Proof: For a prefix of length $k$ the probability of an edge is $2^{-k} \cdot (1 - 2^{-k})^{|i-j|-1}$. Let $k$ be $\log(|i-j|)$.
Choice of constants: $E[X] \ge 5$ [intermediate sum not recoverable].
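Skip Graphs: degree optimality
$X$: the number of two-hop paths between $d$ and $[0, d/\log d]$.
$A_{i,j}$: there exists an edge between $i$ and $j$.
Careful calculation (deal with dependencies): $\mathrm{cov}[A_{d,i,j}, A_{d,x,y}] \le \tfrac{1}{2} \Pr[A_{d,i,j}] \cdot \Pr[A_{d,x,y}]$, with $A_{d,i,j}$ presumably the event that the two-hop path from $d$ through $i$ to $j$ exists.
Which implies: $\mathrm{var}[X] \le E[X] + \tfrac{1}{2} E[X]^2$, and therefore
$\Pr[X = 0] \le \mathrm{var}[X] / E[X]^2 \le (E[X] + \tfrac{1}{2} E[X]^2) / E[X]^2 \le 0.7$, using $E[X] \ge 5$.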
The Cost/Performance of NoN
• Cost of Neighbor of Neighbor lists:
  • Memory: $O(\log^2 n)$, marginal.
  • Communication: is it tantamount to squaring the degree?
    • Neighbor lists should be maintained (open connection, pinging, etc.).
    • NoN lists should only be kept up-to-date.
• Reduce communication by piggybacking updates on top of the maintenance protocol.
• Lazy updates: updates occur only when communication load is low; supported by simulations.
Networks of size $2^{17}$ show 30-40% improvement.
Simulation Results
[Plots: one-dimensional small world; skip graphs]
Simulation Results
[Plots: 2-dimensional small world; 1-dimensional small world where each edge fails with probability 1/2]
A Case for Randomized Topology
• Average diameter of the hypercube is $\Theta(\log n)$.
• Average diameter of a ‘perfect’ skip graph is $\Theta(\log n)$.
• Average diameter of Chord is $\Theta(\log n)$.
• Conclusion: the randomization of edges reduces the average path lengths.
• Common design rule: reduce randomization in topology.
• The long edges are at just the right density, so that NoN finds them without increasing the degree.
• Other advantages:
  • Security, fault tolerance…
Do People Use the NoN Algorithm?
• Experiment based on email [DRW03].
• About 25% sent the mail because:
  • The recipient traveled to the target’s geographical region.
  • The recipient’s family originates from the target’s geographical region.
Lower Bounds: A Probing Model
• Goal: find a path between two nodes in an unknown graph.
• The algorithm may probe a node. If the probing reveals a neighborhood of radius k, then the algorithm is k-local.
• A lower bound on the number of probes implies a lower bound on the sequential running time of routing.
• The Greedy algorithm is 1-local. NoN is 2-local.
Theorem: Every 1-local algorithm requires $\Omega(\log n)$ probes w.h.p., both in small worlds and in skip graphs.
Conclusion: some extra information is necessary.
Greedy algorithm dominates 1-local algorithms
• Let A be a 1-local algorithm. Denote by $f_d$ ($g_d$) the r.v. counting the number of probes it takes A (Greedy) to find a path between 0 and $d$.
Lemma: For all $k, d > 0$: $\Pr[g_d \le k] \ge \Pr[f_d \le k]$.
• If a probe finds node $i$, reveal all edges (prefixes) in $[d, i]$. This only increases $\Pr[f_d \le k]$.
• The ‘best chance’ of getting close to 0 is by probing the node closest to 0.
[Figure residue: a line from 0 to $d$ with a probed node $i$ and the revealed segment]
Lower Bounds on Greedy
• Partition the nodes into balls $B_0, B_1, \ldots, B_{\log d}$.
• Define $X_i$: the indicator of the event “Greedy probed a node in $B_i$”.
• The probe complexity is at least $\sum_i X_i$.
Lemma: Both for skip graphs and small worlds, there exists a constant $c$ such that $\Pr[X_i = 1 \mid X_0, X_1, \ldots, X_{i-1}] \ge c$.
Azuma’s inequality: $E[\sum_i X_i] \ge c \log n$, and $\sum_i X_i$ falls below a constant fraction of $c \log n$ only with small probability [exact tail bound not recoverable].
[Figure residue: balls $B_0$ to $B_3$ laid out between 0 and $d$]
Lower Bounds on Greedy
Lemma: Both for skip graphs and small worlds, there exists a constant $c$ such that $\Pr[X_i = 1 \mid X_0, X_1, \ldots, X_{i-1}] \ge c$.
• $X_i$ depends only on the last ball visited.
• When a ball is visited, skip to the last node.
• Assume $X_0 = 1$, $X_1 = 0$.
• The probability the dangling edge would skip over $B_2$ is at most $1 - c$.
[Figure residue: balls $B_0$ to $B_3$ laid out between 0 and $d$]
Conclusions
• NoN Greedy seems like an almost free tweak that is a good idea in many settings.
• Do not be perfect (all the time): randomization helps.
• What is more important?
  • Prefix search.
  • Easy and ‘natural’ degree optimality.
• Better understanding of the ‘small world’ phenomena.