The paper addresses the lengthy convergence time of BGP (Border Gateway Protocol), which can take up to 15 minutes. We present the Ghost Flushing technique that reduces the worst-case convergence time from minutes to mere seconds. This technique tackles the issue of "ghost information," where outdated routing information persists and proliferates within a network. By implementing the Ghost Buster and Ghost Flushing rules, we demonstrate how to effectively streamline BGP convergence, supported by simulation results that illustrate significant performance improvements.
Improved BGP convergence via Ghost Flushing. Yehuda Afek, Anat Bremler-Barr, Shemer Schwarz. The Interdisciplinary Center, Herzliya
Problem: BGP Convergence • [Labovitz,Ahuja,Bose,Jahanian] BGP may take up to 15 minutes to converge. • Here: Reduce the worst case from minutes to seconds, in a practical way
Problem: BGP Convergence • [Labovitz,Ahuja,Bose,Jahanian] Convergence time per event (seconds, minRouteAdver = 30):
• E-Down: 30·n (n ≈ 10,000; up to 15 minutes)
• E-Up: 30·d (d ≈ 30, d = diameter)
• E-Longer: 2·30·l (l = path length)
• E-Shorter: 30·d
• Here: E-Down = l time units (unit = link delay), E-Longer = 30·d
Agenda • BGP overview • The BGP convergence problem • Ghost buster rule • Ghost flushing rule • Simulation results
BGP protocol • Distance (path) vector protocol • Receives AS-paths from its neighbors • Chooses the best one (shortest) • Eliminates routing loops using the AS-path • Two kinds of messages: announcements and withdrawals
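The selection step described on this slide can be sketched roughly as follows. This is a minimal illustration, not real BGP code: the function name `best_path`, the `received` table, and the lexicographic tie-break between equally short paths are all assumptions made for the example.

```python
from typing import Dict, Optional, Tuple

Path = Tuple[int, ...]  # AS-path: announcing neighbor first, destination AS last


def best_path(my_as: int, received: Dict[int, Optional[Path]]) -> Optional[Path]:
    """Choose the shortest loop-free AS-path among the neighbors' announcements.

    `received` (hypothetical name) maps each neighbor to the last AS-path it
    announced, or None if it sent a withdrawal.
    """
    candidates = [
        path
        for path in received.values()
        if path is not None      # skip withdrawn routes
        and my_as not in path    # loop elimination via the AS-path
    ]
    if not candidates:
        return None              # no route to the destination
    # Shortest path wins; the lexicographic tie-break is an assumption.
    return min(candidates, key=lambda p: (len(p), p))


# Node 2 in a 4-node clique, after destination AS 0 withdraws its route.
print(best_path(2, {0: None, 1: (1, 0), 3: (3, 0), 4: (4, 0)}))        # (1, 0)
print(best_path(2, {0: None, 1: (1, 2, 0), 3: (3, 1, 0), 4: (4, 1, 0)}))  # (3, 1, 0)
```

In the second call the path announced by neighbor 1 already contains AS 2, so it is discarded and the node falls back to a longer path that is still stale.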
Problem: Ghost information • One ghost (old information) makes many, and in the network it continues recursively • minRouteAdver: wait 30 sec before sending the next announcement (BGP)
[Figure, animation frames: 4-node clique (nodes 1-4) with node 0 and the destination (dst)]
• t=0: all routes shown are "dst: 0"; a withdrawal for 0 is sent
• t=1: routes shown are 1 0, 1 0, 2 0, 1 0; announcements 1 0 and 2 0 are sent; node 0 has dst: {}
• t=2: routes shown are 3 1 0, 1 2 0, {}, 1 2 0; a withdrawal is sent
• t=3 onward (the counter runs to t=31): routes shown are 3 1 0, 2 1 0, {}, 2 1 0; announcements 3 1 0 and 2 1 0 are shown
E_Down convergence • In the clique (size 4) example the scenario ends after 62 sec (≈ 30(n-2) = 60 sec for n = 4)
Without MinRouteAdver • Avalanche of messages O(n!) • Explore all possible paths of length 1, 2, 3 …
[Figure, animation frames: the same 4-node clique; each node keeps one stored path per neighbor alongside its chosen route]
• t=0: every node holds "dst: 0" with stored paths of length one (1 0, 2 0, 3 0, 4 0); a withdrawal for 0 is sent
• t=0.1: chosen routes 1 0, 1 0, 2 0, 1 0; announcements 1 0 and 2 0
• t=0.2: chosen routes 3 0, 2 0, 2 0, 2 0; announcements 2 0 and 3 0; stored paths now include 1 2 0
• t=0.3: chosen routes 3 0, 3 0, 3 0, 4 0; announcements 3 0 and 4 0; stored paths now include 2 1 0 and 2 3 0
• t=0.4: chosen routes 4 0, 1 2 0, 4 0, 4 0; announcements 1 2 0 and 4 0; stored paths now include 3 1 0
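To get a feel for the factorial growth, the sketch below enumerates every loop-free ghost path that one node of an n-node clique could, in principle, be offered once the destination is gone. It counts candidate paths only, as an upper bound on what the node may explore; it does not reproduce the exact message sequence of the frames above. The helper names `ghost_paths` and `count_ghost_paths` are made up for this illustration.

```python
from itertools import permutations
from typing import Iterator, List, Tuple


def ghost_paths(node: int, clique: List[int], dst: int = 0) -> Iterator[Tuple[int, ...]]:
    """Yield every loop-free AS-path through the other clique members to dst."""
    others = [v for v in clique if v != node]
    for length in range(1, len(others) + 1):
        for prefix in permutations(others, length):
            yield prefix + (dst,)   # e.g. (3, 1, 0) means "via 3, then 1, then 0"


def count_ghost_paths(n: int) -> int:
    """Number of candidate ghost paths seen by one node of an n-node clique."""
    clique = list(range(1, n + 1))
    return sum(1 for _ in ghost_paths(clique[0], clique))


for n in (4, 5, 6, 7):
    print(n, count_ghost_paths(n))
```

For the 4-node clique of the example this gives 15 candidate paths per node (3 of length one, 6 of length two, 6 of length three), and the count grows roughly like (n-1)! as the clique gets larger.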
Related Work • Introducing the problem [Labovitz,Ahuja,Bose,Jahanian], [Labovitz,Wattenhofer,Venkatachary,Ahuja] • real life evidence • theoretical analysis • Experimental analysis [Griffin,Premore] • Solution • Works in Counting to Infinity: • Adding states [Garcia-Luna-Aceves] – EIGRP like… • Route Poisoning with Hold-down [Cisco:Rutgers]– IGRP like... • Routes consistency [Pei,Zhao,Wang,Massey,Mankin,Wu,Zhang]
Ghost flushing rule • If the ASpath to dst becomes longer and the announcement cannot be sent (due to the minRouteAdver rule), then send a withdrawal • Motivation: Flush the ghost information ASAP
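The rule can be sketched as a per-peer decision made whenever the router's chosen path changes. This is a simplified illustration under stated assumptions, not the authors' implementation: `PeerState`, `on_route_change`, and the message tuples are hypothetical names, the new path is compared with the previously chosen path, and withdrawals are assumed never to be delayed by minRouteAdver.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Path = Tuple[int, ...]
Message = Tuple[str, Optional[Path]]

MIN_ROUTE_ADVER = 30.0  # seconds between announcements to the same peer


@dataclass
class PeerState:
    last_announce_time: float = float("-inf")
    pending: Optional[Path] = None  # announcement held back by the timer


def on_route_change(peer: PeerState, old_path: Optional[Path],
                    new_path: Optional[Path], now: float) -> List[Message]:
    """Return the messages to send to `peer` after the chosen path changes."""
    if new_path is None:
        peer.pending = None
        return [("withdraw", None)]          # assumed: withdrawals are not delayed
    timer_running = now - peer.last_announce_time < MIN_ROUTE_ADVER
    if not timer_running:
        peer.last_announce_time = now
        peer.pending = None
        return [("announce", new_path)]
    peer.pending = new_path                  # to be announced when the timer expires
    if old_path is not None and len(new_path) > len(old_path):
        return [("withdraw", None)]          # ghost flushing: erase the stale path now
    return []                                # plain BGP behaviour: just wait


peer = PeerState()
print(on_route_change(peer, (0,), (1, 0), now=0.0))       # [('announce', (1, 0))]
print(on_route_change(peer, (1, 0), (3, 1, 0), now=1.0))  # [('withdraw', None)]
```

Sending the pending announcement when the timer expires is left out to keep the sketch short.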
Ghost Flushing example
[Figure, animation frames: the same 4-node clique (nodes 1-4) with node 0 and the destination (dst)]
• t=0: all routes shown are "dst: 0"; a withdrawal for 0 is sent
• t=1: routes shown are 1 0, 1 0, 2 0, 1 0; announcements 1 0 and 2 0 are sent; node 0 has dst: {}
• t=2: routes shown are 3 1 0, 1 2 0, {}, 1 2 0; longer ASpath and minRouteAdver timer running, so "flushing" withdrawals are sent
• t=3: every route is dst: {}; withdrawals are sent
Analysis: Convergence time of the ghost flushing rule, E_down • In each time unit (= h, the maximum link delay), ghost information is erased to a distance greater by one • After k time units, ghost information with ASpath length < k has disappeared • Longest ghost ASpath = n (in theory) • Hence the worst-case convergence time is n·h
Ghost Buster Rule • The convergence time is better than expected! • Explanation: minRouteAdver blocks the propagation of ghost information, while the flushing withdrawal "eats" the ghost information • Bad (wrong) news propagates slowly
Analysis: Ghost buster rule • Add to the ghost flushing rule: • A router sends an announcement only after delta time • MinRouteAdver is similar to delta: • Common implementation: MinRouteAdver per peer • And the timer is almost always on (lots of BGP announcements!)
Analysis: Convergence time of the ghost buster rule • Every delta+h time, the length of the maximum ghost ASpath increases by one • Every h time, the length of the minimum ghost ASpath increases by one • After the failure, the length of the maximum ghost ASpath is d (diameter) • The ghost information disappears at the time t when the two lengths meet: d + t/(delta+h) = t/h • Hence: t = k·d·h/(k-1), which is linear in d, where k = (delta+h)/h is the rate of the algorithm
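As a quick numeric check of this slide, the sketch below evaluates the closed form and verifies that it satisfies the balance equation d + t/(delta+h) = t/h. The function name and the sample values (d = 30, h = 1, delta = 30) are chosen for illustration only and are not simulation results.

```python
import math


def ghost_buster_convergence_time(d: float, h: float, delta: float) -> float:
    """Time t at which the shortest surviving ghost path catches up with the longest.

    d     -- diameter (length of the longest ghost ASpath right after the failure)
    h     -- maximum link delay
    delta -- minimum spacing between announcements
    """
    k = (delta + h) / h          # rate of the algorithm
    return k * d * h / (k - 1)


d, h, delta = 30, 1.0, 30.0      # illustrative values only
t = ghost_buster_convergence_time(d, h, delta)

# The closed form must satisfy the balance equation from the slide.
assert math.isclose(d + t / (delta + h), t / h)
print(t)  # 31.0
```

Solving the balance equation directly gives t = d·h·(delta+h)/delta, which is the same expression once k = (delta+h)/h is substituted; for delta much larger than h the result approaches d·h.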
The effect on E_longer • BGP: convergence time is dominated by: • Time until the ghost information vanishes • Time until the backup path propagates in • Ghost flushing: helps the first factor
[Figure: topology with node labels 7, 2, 4, 1, 3, 6, 0, 5 and dst]
The effect on E_longer • Original BGP may err: • Because of minRouteAdver, a peer stores a wrong ASpath • BGP may then send the packet in the wrong direction • Ghost flushing: sends a withdrawal to the peer, which may by chance have an alternative path
Simulation: BGP code • Shortest path metric • Link delay between 0.2 and 2 sec • MinRouteAdver chosen randomly between 0 and 30 sec
Simulation: ISP topology
[Figure: ISP topology; visible node labels 9, 4, 8, 5, 1, 7, 3 and dst]
Example: Core Internet (ASes)
#    Out-degree   In-degree   BGP    Ghost Flushing
1    45           10          963    22
2    52           17          898    51
3    3            4           1031   36
4    112          27          1017   50
5    61           11          1034   36
6    20           24          920    33
7    1            6           2      2.5
8    18           2           1111   54
9    1            1111        981    62
10   1            98          4      5.1
E_longer: Convergence Time
[Figure: topology with node labels 7, 2, 4, 1, 3, 6, 0, 5 and dst]
Conclusion • Reduced convergence time from minutes to seconds • Does not hurt in other cases • Ghost flushing: no change to BGP messages • Ghost buster: a new counting-to-infinity solution • BGP is very sensitive to minor modifications