Failure Recovery of Overlay Tree-based Structures

Presentation Transcript

  1. Doctoral Thesis: Failure Recovery of Overlay Tree-based Structures. Ing. Vladimír Dynda. Doc. RNDr. Ing. Petr Zemánek, CSc. (supervisor). Czech Technical University in Prague, Faculty of Electrical Engineering, Department of Computer Science and Engineering

  2. Agenda • Introduction • Solution • BR Platform • Bypass Routing • Leader Link Election • Tree Reconnection • Summary of Results • Conclusion Vladimír Dynda: Failure Recovery of Overlay Tree-based Structures

  3. Agenda • Introduction • Solution • BR Platform • Bypass Routing • Leader Link Election • Tree Reconnection • Summary of Results • Conclusion

  4. Introduction • Problem statement [Figure: tree T = (TM, CE) built over the network S = (N, L); the failed cluster FC leaves the fragments T0 to T6; the restored tree is TR = (TM \ FC, CE')]

  5. Introduction • Problem statement • Failure recovery • Reconnection of T0, T1, ..., TN-1 into a restored network TR = (TM \ FC, CE') • Correctness – TR is acyclic • Completeness – TR contains all the fragments
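
The two requirements on this slide can be checked mechanically. The following is a minimal sketch, assuming TR is given as a set of surviving nodes plus an undirected edge list; the function name and the input representation are illustrative and do not come from the thesis. It uses a union-find structure: an edge whose endpoints are already joined closes a cycle (violating correctness), and more than one remaining component means some fragment was left out (violating completeness).

```python
from typing import Hashable, Iterable, Tuple


def check_restored_tree(
    nodes: Iterable[Hashable],
    edges: Iterable[Tuple[Hashable, Hashable]],
) -> Tuple[bool, bool]:
    """Return (correct, complete) for an undirected graph.

    correct  -> the graph is acyclic
    complete -> all nodes (hence all fragments) are in one component
    """
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent pointers to the component representative,
        # shortening the path on the way (path halving).
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    correct = True
    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra == rb:
            correct = False  # this edge closes a cycle
        else:
            parent[ra] = rb  # merge the two components

    complete = len({find(n) for n in parent}) <= 1
    return correct, complete


# Hypothetical fragments T0 = {a, b}, T1 = {c}, T2 = {d, e}.
nodes = ["a", "b", "c", "d", "e"]
fragment_edges = [("a", "b"), ("d", "e")]
print(check_restored_tree(nodes, fragment_edges + [("b", "c"), ("c", "d")]))
```

Adding a third reconnection link such as ("e", "a") to the example would close a cycle, so the first flag would become False while the second stays True.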

  6. Introduction • Problem statement • Environment • Asynchronous distributed system • No central authority / no global knowledge • Unlimited sizes of S and T • Arbitrary traffic direction in T • Failures • Node failures only • Fail-stop failure model • Failures must not split S

  7. Introduction • Goals of the thesis • Proposal of a generic recovery platform • Illustration of the tree restoration methods • Simulation & verification of the theoretical properties • Survey of possible applications

  8. Introduction • State of the art • On-demand / preplanned recovery • Preplanned methods • Employ pre-computed backup structures • Existing preplanned methods • Complete graph (Narada) • Ancestor list (Yang-Fei, EFTMRP, HMTP) • Administrative hierarchy (Nice, Nemo) • Secondary trees (Dual-tree, Coop-net) • Link to random nodes (HMTP, Yoid)

  9. Introduction • State of the art • Weaknesses of the existing methods • Poor scalability • Restricted set of applicable trees • Single points of failure • Fixed level of fault tolerance • Unrecoverable multiple failures • Non-local restoration

  10. Agenda • Introduction • Solution • BR Platform • Bypass Routing • Leader Link Election • Tree Reconnection • Summary of Results • Conclusion

  11. BR Platform • Bypass ring platform • Ensures correctness and completeness • Forms a basis for a tree reconnection • Fabric of redundant links in T: • Bypass rings of optional diameter • Alternative paths in the event of failure • Location & routing among the fragments

  12. BR Platform • Failure recovery [Figure: recovery stages (bypass rings, bypass routing, leader link election, tree reconnection) shown on T = (TM, CE) with the failed cluster FC, the rings BRT(n1,2), BRT(n1,3), BRT(n1,4) and BRT(n2,2), the cycle BC(FC), the elected leader, and the restored tree TR = (TM \ FC, CE')]

  13. BR Platform • Elemental steps of the recovery • Initialization of the platform • Failure detection • Designated nodes discovery • Leader link election • Tree reconnection • Bypass rings reconfiguration [Diagram labels beside the steps: Bypass routing; Correctness & Completeness]

  14. Agenda • Introduction • Solution • BR Platform • Bypass Routing • Leader Link Election • Tree Reconnection • Summary of Results • Conclusion

  15. Bypass Routing • Partially ordered tree (POT) • Ordered rays • Ordered neighbor sequence [Figure: tree T = (TM, CE) with hexadecimal node IDs, showing the rays R-(A0,3C) and R+(A0,3C), the neighbor sequences SeqT(A0) and SeqT(3C), and BT(A0,3C)]

  16. Bypass Routing • Bypass ring BRT(n, d) [Figure: node n with neighbors n0 to n3 in the order SeqT(n), their BT(n,ni) and rays R+(n,ni), R-(n,ni), and the bypass rings BRT(n,2), BRT(n,3), BRT(n,4) = BRT(n,dmax) for dmax = 4]

  17. Bypass Routing • Bypass rings [Figure: nodes n, n1, n2, ..., ndmax along the ray R+(n,n1) in T = (TM, CE), with BT(n,n1), the failed cluster FC, and the bypass rings BRT(n1,2), BRT(n1,3), BRT(n2,4), BRT(n2,5), BRT(nm,dmax)]

  18. Bypass Routing • Routing algorithm • <FC>T = BT(ni, nj), nj ∈ AT(ni) [operator lost in extraction] FC [Figure: routing along BC(FC) around FC in T = (TM, CE) through BT(ni1,nj1), BT(ni2,nj2) and BT(ni3,nj3), with the ray R+(ni1,nj1)]

  19. Bypass Routing • Example [Figure: the example tree T = (TM, CE) with hexadecimal node IDs, the failed cluster FC, the cycle BC(FC), the ray R+(72,3C), and the bypass rings BRT(A0,4), BRT(3C,3), BRT(3C,2)]

  20. Bypass Routing • Properties • Memory overhead at node n ∈ T: O(degT(n) * dmax) • Routing is successful if lenX(ni, ni+1) ≤ dmax, X = R+(ni, nj), for all neighbors ni and ni+1 ∈ BC(FC) • Lower bound of the maximum size of FC: dmax/2 nodes for arbitrary clusters

  21. Agenda • Introduction • Solution • BR Platform • Bypass Routing • Leader Link Election • Tree Reconnection • Summary of Results • Conclusion

  22. Leader Link Election • Leader link election (LLE) • Guarantees correctness • Communication structure – BC(FC) • Node states • Passive – initial state of the election • Active – leader candidates • Relay – election is lost

  23. Leader Link Election • LLE on ordered rings • ID(n0) < ID(n1) < ... < ID(nN-1) [Figure: ring BC(FC) = BRT(n,2) of nodes n0 to nN-1 around n, ordered by SeqT(n); messages ELECTION(n0), ELECTION(n1), ... travel along the ring with the comparisons ID(n0) < ID(n1), ID(n1) < ID(n2), ..., ID(nN-1) < ID(n0); the leader is marked at n0; subgraph label <FC AT(FC)> with its operator lost in extraction]
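
One possible reading of the ordered-ring slide is that each node forwards an ELECTION message carrying its own ID to its successor, and the successor compares that ID with its own. On a ring whose IDs increase from n0 to nN-1, exactly one comparison fails, namely ID(nN-1) < ID(n0), which singles out one link. The sketch below simulates only that comparison step under this assumption; it is not the thesis algorithm, and the function name and list representation are hypothetical.

```python
from typing import List, Tuple


def find_leader_link(ring_ids: List[int]) -> Tuple[int, int]:
    """Return (sender_index, receiver_index) of the single ring link
    on which the comparison ID(sender) < ID(receiver) fails.

    ring_ids[i] is the ID of ring node n_i; the successor of n_i is
    n_((i + 1) mod N). Raises ValueError if the ring is not ordered
    in the sense of the slide (exactly one failing comparison).
    """
    n = len(ring_ids)
    failing = []
    for i in range(n):
        succ = (i + 1) % n
        # n_i sends ELECTION(ring_ids[i]); the successor compares it
        # with its own ID.
        if not ring_ids[i] < ring_ids[succ]:
            failing.append((i, succ))
    if len(failing) != 1:
        raise ValueError("ring is not cyclically ordered")
    return failing[0]


# IDs increase from n0 to n3, so the failing comparison is on (n3, n0).
print(find_leader_link([0x10, 0x20, 0x30, 0x40]))
```

Because the order wraps at exactly one place, every node reaches the same conclusion from purely local comparisons, which is what makes an ordered ring convenient for an election without global knowledge.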

  24. Leader Link Election • LLE in partially ordered trees • Sweep process • Hierarchical identifier HIDT(nr,ni) [Figure: root nr = 4F with SeqT(nr) and SeqT(A1); HIDT(4F,D8) = 4F.A1.BA.D8, HIDT(4F,97) = 4F.A1.BA.97, HIDT(4F,16) = 4F.A1.16; messages ELECTION(4F.*), SWEEP(4F.A1) and ELECTION(A1.BA.97) on BC(FC) along R+; comparison A1.BA < A1.16; the elected leader; subgraph label <FC AT(FC)> with its operator lost in extraction]
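
The hierarchical identifiers on this slide look like the dot-joined node IDs on the tree path from the root nr to the node ni. The sketch below builds them that way; the parent map is inferred from the three HID labels on the slide, and the function name is an assumption. How two identifiers are ordered during the election is not recoverable from the transcript, so no comparison is implemented.

```python
from typing import Dict


def hierarchical_id(parent: Dict[str, str], root: str, node: str) -> str:
    """Join the IDs on the tree path root -> node with dots."""
    path = [node]
    while path[-1] != root:
        # Raises KeyError if `node` does not lie below `root`.
        path.append(parent[path[-1]])
    return ".".join(reversed(path))


# Parent links inferred from the slide's labels:
# 4F.A1.BA.D8, 4F.A1.BA.97 and 4F.A1.16.
parent = {"A1": "4F", "BA": "A1", "16": "A1", "D8": "BA", "97": "BA"}

for leaf in ("D8", "97", "16"):
    print(leaf, hierarchical_id(parent, "4F", leaf))
```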

  25. Leader Link Election • Example [Figure: election on the example tree T = (TM, CE) around the failed cluster FC, with roots nr at 3C and A0, messages ELECTION(3C.A0.1D), SWEEP(3C.A0) and ELECTION(A0.B9.CE), the comparisons 3C.A0 < 3C.A0 and A0.B9 < A0.1D, and the elected leader; subgraph label <FC AT(FC)> with its operator lost in extraction]

  26. Leader Link Election • Properties • Average message complexity: O(N log_b N), where b is the average branching factor of FC nodes in T • Time complexity: O(N)

  27. Agenda • Introduction • Solution • BR Platform • Bypass Routing • Leader Link Election • Tree Reconnection • Summary of Results • Conclusion

  28. Tree Reconnection • Reconnection methods • Reconnect the fragments located by the routing algorithm • Abide by the results of LLE • Designed to meet the specific application requirements • Influence properties of the restored tree

  29. Tree Reconnection • LR method [Figure: LR reconnection on BC(FC) with nodes n1, n2, n3]

  30. Tree Reconnection • HR-x method • Link (q0, qi) if i ≡ 1 (mod x), link (qi-1, qi) otherwise [Figure: HR-1 example on BC(FC) with nodes n1, n2, n3 and their labels q0 to q5]

  31. Tree Reconnection • HR-x method • Link (q0, qi) if i ≡ 1 (mod x), link (qi-1, qi) otherwise [Figure: HR-2 example on BC(FC) with nodes n1, n2, n3]
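
The HR-x rule can be written out directly. This is a minimal sketch, assuming the lost operator in the rule is a congruence (i ≡ 1 mod x), that the indices run from 1, and that q0, q1, ... are supplied in the order the method visits them; how the thesis derives that order is not shown in the transcript, and the function name is illustrative.

```python
from typing import List, Tuple


def hr_links(q: List[str], x: int) -> List[Tuple[str, str]]:
    """Links created by the HR-x rule for the sequence q0, q1, ..., qk.

    For each i >= 1: link (q0, qi) if i ≡ 1 (mod x),
    otherwise link (q(i-1), qi).
    """
    if x < 1:
        raise ValueError("x must be a positive integer")
    links = []
    for i in range(1, len(q)):
        if (i - 1) % x == 0:  # i ≡ 1 (mod x); always true for x = 1
            links.append((q[0], q[i]))
        else:
            links.append((q[i - 1], q[i]))
    return links


q = ["q0", "q1", "q2", "q3", "q4"]
print(hr_links(q, 1))  # every qi attaches directly to q0
print(hr_links(q, 2))  # chains of length 2 hanging off q0
```

Each qi is linked to exactly one node with a smaller index, so the k links over k + 1 nodes always form a tree. Under this reading, x trades the degree of q0 against the depth of the reconnected structure.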

  32. Tree Reconnection • Example [Figure: HR-2 reconnection on the example tree after the election, with messages ELECTION(3C.A0.1D), SWEEP(3C.A0) and ELECTION(A0.B9.CE) around the failed cluster FC, giving the restored tree TR = (TM \ FC, CE'); subgraph label <FC AT(FC)> with its operator lost in extraction]

  33. Tree Reconnection • Properties

  34. Tree Reconnection • Properties

  35. Agenda • Introduction • Solution • BR Platform • Bypass Routing • Leader Link Election • Tree Reconnection • Summary of Results • Conclusion

  36. Summary of Results • Properties of the BR platform • Node memory overhead: • O(degT(n) * dmax) • Average message complexity: • O(N log_b N) for arbitrary failures • N for single failures • Lower bound of max. recoverable failure: • dmax/2 nodes for arbitrary failed clusters • dmax-1 nodes for internal failed clusters
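
To get a feel for these figures, the expressions on this slide can be evaluated for concrete parameters. The sketch below is only a calculator under loud assumptions: the constant factors hidden by the O-notation are taken as 1, and the function name, argument names and result keys are made up for illustration.

```python
import math
from typing import Dict, Optional


def br_platform_estimates(
    degree: int,
    dmax: int,
    n: Optional[int] = None,
    b: Optional[float] = None,
) -> Dict[str, float]:
    """Evaluate the slide's expressions with all constants assumed 1.

    degree -- degT(n), the degree of the node in the tree T
    dmax   -- maximum bypass ring diameter
    n, b   -- N and the average branching factor b (b > 1) used in
              the message complexity
    """
    estimates = {
        # O(degT(n) * dmax) stored entries per node
        "memory_entries": degree * dmax,
        # lower bounds on the largest recoverable failed cluster
        "recoverable_arbitrary": dmax / 2,
        "recoverable_internal": dmax - 1,
    }
    if n is not None and b is not None:
        estimates["messages_arbitrary"] = n * math.log(n, b)  # N log_b N
        estimates["messages_single"] = n
    return estimates


print(br_platform_estimates(degree=3, dmax=4, n=8, b=2))
```

Raising dmax increases both the per-node memory and the guaranteed recoverable cluster size linearly in these expressions.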

  37. Summary of Results • Simulation results • Successfully recovered cluster • Average diameter: dmax-2 • Average size: 1.5 dmax • Linear recovery time • dmax parameter • Controls fault-tolerance vs. costs • dmax=4 provides ample tolerance for GFS

  38. Summary of Results • Properties of the platform • Locality • Multiple failure recovery • Scalability • Application requirements consideration • Optional level of fault tolerance • Protection selectivity • Designated nodes discovery • Tree reconnection method • Independence of the protected tree type

  39. Summary of Results • Applications • Overlay multicast • Applicable in all types • Network-layer multicast • Extension with BR(n,1) needed • Sample application – GFS multicast • Designed for large-scale P2P systems • Based on a layered administrative hierarchy • Employs BR platform to achieve fault-tolerance

  40. Agenda • Introduction • Solution • BR Platform • Bypass Routing • Leader Link Election • Tree Reconnection • Summary of Results • Conclusion

  41. Conclusion • Thesis summary • Analysis of overlay trees environment and identification of recovery properties • Proposal of BR platform • Design of the specialized leader election • Illustration of the tree reconnection • Simulation of the platform • Outline of the overlay multicast scheme

  42. Conclusion • Ideas for further research • Autonomous management of fault-tolerance level and protection selectivity • More sophisticated tree reconnection methods • Extension of the platform for network-layer multicast

  43. Thank You