End-to-End Fault Tolerance Using Transport Layer Multihoming

Armando L. Caro, Jr.
Dissertation Proposal
April 8, 2003

[Figure: Host A (interfaces A1, A2) and Host B (interfaces B1, B2), each interface connected through an ISP to the Internet]
We propose to investigate transport layer multihoming for
• end-to-end fault tolerance (primary goal)
• improved application performance (secondary goal)

Why Investigate Transport Layer Multihoming?
• Many applications (e.g., mission-critical ones) require uninterrupted service
• Internet path outages are common
  • link failures
  • overloaded links
• Multiple network interfaces provide network layer redundancy
  • interfaces today are relatively cheap
Transport layer support is needed to increase connection resilience during path outages

Can’t Routing Handle Path Outages?
• Routing does not recover fast enough from link failures
  • [Labovitz 00] measured failure detection and recovery times
    • minimum: 3 minutes
    • often: tens of minutes
    • 40% required more than 30 minutes
  • [Chandra 01] (using probes): 5% required 2.75 – 27.75 hours!
  • [Paxson 97] (using probes): 1.5 – 3.3% of routes had “serious pathologies”
  • [Labovitz 98] (examining routing table logs)
    • 10% of routes available < 95% of time
    • 65% of routes available < 99.99% of time
• Routing does not recover at all from overloaded links
  • Flash crowds
  • DoS attacks (statistics in [Moore 01])
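To put the [Labovitz 98] availability thresholds in perspective, the following sketch converts an availability fraction into downtime per year. The function name and the 365-day year are illustrative assumptions, not part of the cited study.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: a 365-day year


def downtime_hours_per_year(availability: float) -> float:
    """Hours per year a route is unavailable at the given availability fraction."""
    return (1.0 - availability) * HOURS_PER_YEAR


# The two thresholds quoted from [Labovitz 98].
for availability in (0.95, 0.9999):
    hours = downtime_hours_per_year(availability)
    print(f"{availability:.2%} available -> {hours:.2f} hours of downtime per year")
```

Under these assumptions, a route below 95% availability is down for more than 18 days a year, while 99.99% corresponds to under an hour.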

SCTP Multihoming
[Figure: Host A (A1, A2) and Host B (B1, B2) connected through ISPs and the Internet]
• hosts choose 1 of 4 possible TCP connections:
  • (A1,B1) or (A1,B2) or (A2,B1) or (A2,B2)
• 1 SCTP association
  • ({A1,A2}, {B1,B2})
• concept of “primary” destination
  • Host A → B1
  • Host B → A1
• network state (RTT, cwnd, ssthresh, …) maintained per destination
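The difference between four candidate TCP connections and one SCTP association can be sketched as a data structure. This is a minimal illustration only: the class and field names (`DestinationState`, `Association`) are hypothetical, and the initial state values are placeholders.

```python
from dataclasses import dataclass, field
from itertools import product
from typing import Dict, List, Optional


@dataclass
class DestinationState:
    """Network state kept per destination address (placeholder initial values)."""
    rtt: Optional[float] = None
    cwnd: int = 0
    ssthresh: int = 0


@dataclass
class Association:
    """One SCTP association as seen by one endpoint (hypothetical model)."""
    local_addrs: List[str]
    peer_addrs: List[str]
    primary: str
    state: Dict[str, DestinationState] = field(default_factory=dict)

    def __post_init__(self):
        # One state record per peer address, not per connection.
        for addr in self.peer_addrs:
            self.state.setdefault(addr, DestinationState())


host_a = ["A1", "A2"]
host_b = ["B1", "B2"]

# TCP: a connection is bound to exactly one (local, remote) address pair.
tcp_choices = list(product(host_a, host_b))

# SCTP: a single association covers both address sets; Host A's primary is B1.
assoc_at_a = Association(local_addrs=host_a, peer_addrs=host_b, primary="B1")

print(tcp_choices)
print(sorted(assoc_at_a.state))
```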

Current SCTP Failover Mechanism
• Reachability probes
  • Explicitly with heartbeats
  • Implicitly with data
[Figure: failover timeline between Host A and Host B. Sender: Host A; primary: B1; alternate: B2; destinations D_i and D_j with i = 1, j = 2]
• Phase I (entered when D_i times out): primary = D_i; D_i has errors; new data => D_i; rtx => D_j
• Phase II (entered when Path.Max.Retrans is exceeded): primary = D_i; D_i failed; new data => D_j; rtx => D_j
• Phase III (entered when D_i responds): primary = D_i; D_i active; new data => D_i; rtx => D_j
A. Caro, J. Iyengar, P. Amer, G. Heinz, R. Stewart. Using SCTP Multihoming for Fault Tolerance & Load Balancing. SIGCOMM 2002 Poster, August 2002.
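The current failover behavior can be sketched as a small per-destination error counter. This is a simplified model under stated assumptions: the class and method names are hypothetical, only one alternate is modeled, and failure of the alternate itself is ignored.

```python
PATH_MAX_RETRANS = 5  # recommended setting cited in this proposal


class CurrentFailover:
    """Simplified sender-side model of the current SCTP failover policy."""

    def __init__(self, primary: str, alternate: str):
        self.primary = primary
        self.alternate = alternate
        self.errors = {primary: 0, alternate: 0}

    def on_timeout(self, dest: str) -> None:
        # A data or heartbeat timeout counts as one error for that destination.
        self.errors[dest] += 1

    def on_response(self, dest: str) -> None:
        # An acknowledgment of data or of a heartbeat clears the error count.
        self.errors[dest] = 0

    def is_failed(self, dest: str) -> bool:
        return self.errors[dest] > PATH_MAX_RETRANS

    def dest_for_new_data(self) -> str:
        # New data leaves the primary only while the primary is marked failed.
        return self.alternate if self.is_failed(self.primary) else self.primary

    def dest_for_retransmission(self) -> str:
        # Retransmissions go to the alternate (data originally sent to the primary).
        return self.alternate


sender = CurrentFailover(primary="B1", alternate="B2")
for _ in range(PATH_MAX_RETRANS + 1):
    sender.on_timeout("B1")
print(sender.dest_for_new_data())   # primary marked failed
sender.on_response("B1")
print(sender.dest_for_new_data())   # one response returns traffic to the primary
```

Note that the primary destination itself never changes in this model, which is the behavior Issue 1 questions.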

SCTP Failover:
• Issue 1 - Failover is “temporary”
• Issue 2 - Retransmission Policy
• Issue 3 - Failure Detection Time
• Issue 4 - No Source Interface Selection

SCTP Failover: Issue 1 - Failover is “temporary”

SCTP Failover: Issue 1 - Failover is “temporary”
• Current failover policy
  • Traffic is redirected back to the primary when the primary responds to a single heartbeat
  • i.e., the primary destination is never changed
• Why keep the primary destination?
  • Assumes the application has a preferred destination
• We found that returning to the primary may be inefficient*
  • at time of return
    • primary’s cwnd = 1 MTU & ssthresh = 2 MTU
    • alternate’s cwnd > 1 MTU & ssthresh > 2 MTU
• We propose to investigate “permanent” failover when no destination is preferred
• One successful heartbeat may not accurately indicate that a path outage has ended
  • overloaded links may need more probing
• We propose to investigate other probing techniques
* A. Caro, J. Iyengar, P. Amer, G. Heinz, R. Stewart. A Two-level Threshold Recovery Mechanism for SCTP. SCI 2002, July 2002.

Two-level Threshold Failover
• α = temporary failover
• β = auto change primary
[Figure: two-level threshold failover timeline between Host A and Host B (i = 1, j = 2), with Phases I, II, III, and IV. For each phase the figure gives the primary destination (D_i, D_i, D_j, D_i), the state of D_i (errors, failed, failed, active), the destination for new data (D_i, D_j, D_j, D_i), and the destination for retransmissions (rtx). Transition labels: D_i times out, α, β, D_i responds, i ⇔ j.]
A. Caro, J. Iyengar, P. Amer, G. Heinz, R. Stewart. A Two-level Threshold Recovery Mechanism for SCTP. SCI 2002, July 2002.
A. Caro, J. Iyengar, P. Amer, G. Heinz, R. Stewart. Using SCTP Multihoming for Fault Tolerance & Load Balancing. SIGCOMM 2002 Poster, August 2002.
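One way to read the two thresholds is as two cut-offs on the primary's error count: past α, new data moves to the alternate temporarily; past β, the alternate becomes the primary. The sketch below is an interpretation under that assumption, not the published algorithm; the class name, method names, and the threshold values in the usage lines are hypothetical.

```python
class TwoLevelFailover:
    """Illustrative two-threshold failover model (assumes alpha <= beta)."""

    def __init__(self, primary: str, alternate: str, alpha: int, beta: int):
        if alpha > beta:
            raise ValueError("alpha must not exceed beta")
        self.primary, self.alternate = primary, alternate
        self.alpha, self.beta = alpha, beta
        self.errors = {primary: 0, alternate: 0}

    def on_timeout(self, dest: str) -> None:
        self.errors[dest] += 1
        if dest == self.primary and self.errors[dest] > self.beta:
            # Second threshold: change the primary by swapping the two roles.
            self.primary, self.alternate = self.alternate, self.primary

    def on_response(self, dest: str) -> None:
        self.errors[dest] = 0

    def dest_for_new_data(self) -> str:
        # First threshold: temporary failover while the primary looks failed.
        if self.errors[self.primary] > self.alpha:
            return self.alternate
        return self.primary


# Hypothetical thresholds chosen only for the demonstration.
sender = TwoLevelFailover("B1", "B2", alpha=2, beta=4)
for _ in range(3):
    sender.on_timeout("B1")
print(sender.primary, sender.dest_for_new_data())  # temporary failover
for _ in range(2):
    sender.on_timeout("B1")
print(sender.primary, sender.dest_for_new_data())  # primary changed
```

After the primary has changed, a later response from the old primary clears its error count but does not pull traffic back, which is the "permanent" failover behavior proposed under Issue 1.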

SCTP Failover: Issue 2 – Retransmission Policy

SCTP Failover: Issue 2 – Retransmission Policy
• Current retransmission policy
  • If peer is multihomed, retransmit to an alternate destination
• Why the alternate destination?
  • Attempts to improve chances of success
  • No prior research to demonstrate benefits
• We found that this policy degrades performance in many circumstances*
  • Not enough traffic on the alternate path to accurately measure RTT
  • …so timeouts are LONG!*
• We propose to investigate alternative policies
* A. Caro, P. Amer, R. Stewart. Transport Layer Multihoming for Fault Tolerance in FCS Networks. CTA 2003, April 2003. (Submitted to MILCOM 2003)
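Why sparse traffic leads to long timeouts can be illustrated with a standard smoothed RTO estimator. The sketch assumes the estimator and recommended constants from the SCTP specification (RFC 2960: RTO.Initial = 3 s, RTO.Alpha = 1/8, RTO.Beta = 1/4) together with the (1, 60) second bounds quoted in this proposal; the class name and the 80 ms samples are hypothetical.

```python
RTO_INITIAL, RTO_MIN, RTO_MAX = 3.0, 1.0, 60.0  # seconds (assumed RFC 2960 values)
ALPHA, BETA = 1 / 8, 1 / 4                      # smoothing gains (assumed)


class RtoEstimator:
    """Per-destination retransmission timeout estimator."""

    def __init__(self):
        self.srtt = None
        self.rttvar = None
        self.rto = RTO_INITIAL  # used until the first RTT sample arrives

    def on_sample(self, rtt: float) -> None:
        if self.srtt is None:
            self.srtt, self.rttvar = rtt, rtt / 2
        else:
            self.rttvar = (1 - BETA) * self.rttvar + BETA * abs(self.srtt - rtt)
            self.srtt = (1 - ALPHA) * self.srtt + ALPHA * rtt
        self.rto = min(max(self.srtt + 4 * self.rttvar, RTO_MIN), RTO_MAX)

    def on_timeout(self) -> None:
        # Exponential backoff; only a new RTT sample brings the RTO back down.
        self.rto = min(2 * self.rto, RTO_MAX)


primary, alternate = RtoEstimator(), RtoEstimator()
for _ in range(20):
    primary.on_sample(0.08)  # busy primary path: frequent RTT samples
alternate.on_timeout()       # alternate path: no samples yet, one lost retransmission
print(primary.rto, alternate.rto)
```

Because retransmitted packets yield no RTT samples under Karn's Algorithm, a destination that carries mostly retransmissions tends to stay at the initial or backed-off RTO.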

Potential Solutions
• Solution 1: Retransmissions to Same Destination
  • Pro: uses destination with accurate RTT; cwnd benefits for primary
  • Con: fewer successful transmits if primary failed
• Solution 2: Heartbeat After RTO (Randall Stewart’s idea)
  • Pro: immediate opportunity to measure RTT after RTO backoff
  • Con: still few samples to estimate alternate RTT
• Solution 3: Timestamps
  • Pro: Karn’s Algorithm not needed; more RTT samples on alternate
  • Con: 12-byte overhead in each packet
• Solution 4: Our Multiple Fast Retransmit Algorithm
  • Pro: minimizes number of timeouts
  • Con: no extra RTT samples on alternate
• Solution 5: Rtx to Same Destination & Multiple Fast Rtx
A. Caro, P. Amer, J. Iyengar, R. Stewart. Retransmission Policies with Transport Layer Multihoming. UD CIS TR2003-05, March 2003. (Submitted to ICON 2003)

Simulation Topology

Methodology
• A→B traffic
  • 4MB file transfer
  • Packet sizes: 100% @ 1500B
• Cross-traffic
  • Self-similar (aggregation of Pareto sources)
  • Packet sizes: 50% @ 40B, 25% @ 576B, 25% @ 1500B
  • Load: 5Mbps – 11Mbps (producing varying loss rates)
• Simulation parameters (60 runs per combo)
  • Cross-traffic on primary destination path
  • Cross-traffic on alternate destination path
  • Retransmission policy (current policy, or 1 of 5 solutions)
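As a rough guide to what the cross-traffic mix implies, the sketch below computes the mean cross-traffic packet size and the resulting packet rate at the two ends of the load range. It assumes the load figures are measured at the same layer as the packet sizes; the function names are illustrative.

```python
# (size in bytes, fraction of packets) taken from the cross-traffic mix above
PACKET_MIX = [(40, 0.50), (576, 0.25), (1500, 0.25)]


def mean_packet_size(mix) -> float:
    """Expected packet size in bytes for a (size, fraction) mix."""
    return sum(size * fraction for size, fraction in mix)


def packets_per_second(load_bps: float, mix) -> float:
    """Average packet rate needed to offer load_bps with the given mix."""
    return load_bps / (8 * mean_packet_size(mix))


print(f"mean packet size: {mean_packet_size(PACKET_MIX):.0f} bytes")
for load_mbps in (5, 11):
    rate = packets_per_second(load_mbps * 1e6, PACKET_MIX)
    print(f"{load_mbps} Mbps -> about {rate:.0f} packets/s")
```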

SCTP Failover: Issue 3 - Failure Detection Time

SCTP Failover: Issue 3 - Failure Detection Time
• Current SCTP recommends static parameter settings:
  • RTO (min, max): (1, 60) seconds
  • Path.Max.Retrans: 5 attempts per destination
  • Heartbeat Interval: 30 seconds
• Best case failure detection is 1+2+4+8+16+32 = 63 seconds!*
• [Jungmaier 02] improves performance by lowering parameter settings, but
  • their experimental network had
    • fixed delays (i.e., no delay spikes)
    • no cross-traffic (i.e., no congestion)
  • RTO.Min < 1 second goes against the recommendation in [Allman 99]
• We propose to
  • further investigate static parameter settings in a more realistic environment
  • investigate dynamically changing parameters based on
    • path metrics (RTT, loss rate)
    • application requirements (high throughput, low delay, low loss)
* A. Caro, J. Iyengar, P. Amer, G. Heinz, R. Stewart. A Two-level Threshold Recovery Mechanism for SCTP. SCI 2002, July 2002.
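The 63-second figure follows from six consecutive timeouts (the initial transmission plus Path.Max.Retrans = 5 retries) with the RTO doubling each time from RTO.Min. A minimal sketch of that arithmetic, assuming back-to-back timeouts with no heartbeat-interval gaps between them; the function name is illustrative:

```python
def failure_detection_time(rto_start: float, rto_max: float,
                           path_max_retrans: int) -> float:
    """Seconds until a destination is marked failed, assuming consecutive
    timeouts with exponential backoff capped at rto_max."""
    total, rto = 0.0, rto_start
    # The error count must exceed path_max_retrans, so one extra timeout is needed.
    for _ in range(path_max_retrans + 1):
        total += rto
        rto = min(2 * rto, rto_max)
    return total


# Best case: the RTO starts at RTO.Min = 1 second.
print(failure_detection_time(rto_start=1, rto_max=60, path_max_retrans=5))
# Same model with the RTO already at RTO.Max = 60 seconds.
print(failure_detection_time(rto_start=60, rto_max=60, path_max_retrans=5))
```

The same function can be used to see how lowering Path.Max.Retrans or RTO.Min, as in [Jungmaier 02], shortens detection time under this model.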

Congestion Control Improvement
• Introduce Fast Recovery mechanism
  • Avoids multiple cwnd reductions in a single RTT
  • Similar to NewReno TCP’s Fast Recovery
• Introduce new policy that restricts cwnd increases during Fast Recovery
  • Maintains conservative behavior
• Modify SCTP’s Fast Retransmit
  • Avoids unnecessary delays of retransmissions
A. Caro, K. Shah, J. Iyengar, P. Amer, R. Stewart. SCTP and TCP Variants: Congestion Control Under Multiple Losses. UD CIS TR2003-04, February 2003. (Submitted to ACM CCR)
R. Stewart, L. Ong, I. Arias-Rodriguez, K. Poon, P. Conrad, A. Caro, M. Tuexen. SCTP Implementer’s Guide. draft-ietf-tsvwg-sctpimpguide-08.txt, March 2003.
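The first two points can be sketched as a recovery flag that blocks further cwnd reductions and cwnd growth until the data outstanding at the time of the loss has been acknowledged. This is a simplified illustration, not the text of the Implementer's Guide: the names are hypothetical, the reduction rule max(cwnd/2, 2·MTU) is assumed from RFC 2960, and cwnd growth outside Fast Recovery is reduced to one MTU per acknowledgment.

```python
class CongestionState:
    """Per-destination congestion state with a Fast Recovery guard."""

    def __init__(self, cwnd: int, mtu: int):
        self.cwnd = cwnd
        self.ssthresh = cwnd
        self.mtu = mtu
        self.recovery_exit_tsn = None  # None means "not in Fast Recovery"

    def on_fast_retransmit(self, highest_outstanding_tsn: int) -> None:
        if self.recovery_exit_tsn is not None:
            return  # already reduced cwnd for this window of data
        self.ssthresh = max(self.cwnd // 2, 2 * self.mtu)
        self.cwnd = self.ssthresh
        self.recovery_exit_tsn = highest_outstanding_tsn

    def on_cumulative_ack(self, cum_tsn: int, bytes_acked: int) -> None:
        if self.recovery_exit_tsn is not None:
            # No cwnd growth while in Fast Recovery (the conservative policy).
            if cum_tsn >= self.recovery_exit_tsn:
                self.recovery_exit_tsn = None
            return
        self.cwnd += min(bytes_acked, self.mtu)  # simplified growth rule


state = CongestionState(cwnd=15000, mtu=1500)
state.on_fast_retransmit(highest_outstanding_tsn=100)  # first loss: cwnd halved
state.on_fast_retransmit(highest_outstanding_tsn=100)  # second loss, same window: no change
print(state.cwnd)
```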

Drop Scenarios
[Figure panels: One drop, Two drops, Three drops, Four drops]
Scenarios from: Kevin Fall and Sally Floyd. Simulation-based Comparisons of Tahoe, Reno, and SACK TCP. In ACM Computer Communication Review, 26(3):5-21, July 1996.

SCTP Failover: Issue 4 - No Source Interface Selection

SCTP Failover: Issue 4 - No Source Interface Selection
[Figure: Host A and Host B. Sender: Host A; primary: B1; alternate: B2]
• Current SCTP
  • transport sender only specifies destination IP address
  • but network layer determines outgoing source IP address/interface
• Why is this a problem?
  • Suppose A’s network layer routes packets to B1 & B2 via A1
  • then an outage on A1’s path cuts off both destinations, even though A2 is available

SCTP Failover: Issue 4 (cont’d)
• For full multihoming flexibility
  • endpoint’s IP stack should support multiple default routes
  • SCTP should specify the source-destination pair for sending traffic
• Stewart and Lei’s KAME implementation
  • supports experimental options for source interface selection
  • maintains network state per destination
  • varies source address to same destination until destination failure detected
• [Kubo 03] propose a failover scheme that
  • maintains network state per source-destination pair
  • detects failures per source-destination pair
• We propose to further investigate source interface selection
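The effect of source interface selection can be illustrated by counting the source-destination pairs that remain usable after an interface outage. The sketch below is a toy model: the function name is hypothetical, and the no-selection case assumes the scenario above in which A's network layer routes all traffic via A1.

```python
from itertools import product

LOCAL_ADDRS = ["A1", "A2"]
PEER_ADDRS = ["B1", "B2"]


def usable_paths(failed_interfaces, source_selection: bool):
    """Return the (source, destination) pairs Host A can still send on."""
    if source_selection:
        # Transport layer may pick any source-destination pair.
        candidates = product(LOCAL_ADDRS, PEER_ADDRS)
    else:
        # Assumed scenario: the network layer sends everything out via A1.
        candidates = product(["A1"], PEER_ADDRS)
    return [
        (src, dst)
        for src, dst in candidates
        if src not in failed_interfaces and dst not in failed_interfaces
    ]


outage = {"A1"}  # hypothetical outage on the path attached to A1
print(usable_paths(outage, source_selection=False))
print(usable_paths(outage, source_selection=True))
```

Tracking state per pair, as in the [Kubo 03] scheme, corresponds to keeping one record for each element of the second list rather than one per destination.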

Plan of Study

Plan of Study (in progress)
• Retransmission policies with multihoming (issue 2)
  • other file sizes
  • other cross traffic types (Exponential aggregate, etc.)
• SCTP vs TCP variants under multiple losses (issue 3)
  • more extensive loss scenarios
• Analytic SCTP model (issues 1 & 3)
  • build on TCP models in [Padhye 98] and [Cardwell 00]
  • use to investigate static failover parameter settings

Plan of Study (future)
• Adaptive failover algorithm (issue 3)
  • dynamically adjust thresholds based on
    • path metrics (RTT, loss rate)
    • app requirements (high throughput, low delay, low loss)
• Probing mechanism (issue 1)
  • investigate use of packet pairs or small packet trains
• Source interface selection (issue 4)
  • evaluate proposed solutions by [Stewart KAME] and [Kubo 03]
  • investigate other possible solutions
• Final failover mechanism evaluation
  • simulation
  • empirical study

Any Questions?
