
FAST TCP


Presentation Transcript


  1. FAST TCP Bartek Wydrowski Steven Low netlab.CALTECH.edu

  2. Acks & Collaborators • Internet2 • Almes, Shalunov • Abilene GigaPoP’s • GATech, NCSU, PSC, Seattle, Washington • Cisco • Aiken, Doraiswami, McGugan, Smith, Yip • Level(3) • Fernes • LANL • Wu • Caltech • Bunn, Choe, Doyle, Hegde, Jin, Li, Low, Newman, Papadopoulos, Ravot, Singh, Tang, J. Wang, Wei, Wydrowski, Xia • UCLA • Paganini, Z. Wang • StarLight • deFanti, Winkler • CERN • Martin • SLAC • Cottrell • PSC • Mathis

  3. Outline • Background, motivation • FAST TCP • Architecture and algorithms • Experimental evaluations • Loss recovery • MaxNet, SUPA FAST

  4. Performance at large windows • ns-2 simulation: capacity = 155Mbps, 622Mbps, 2.5Gbps, 5Gbps, 10Gbps; 100 ms round trip latency; 100 flows (J. Wang, Caltech, June 02) • DataTAG Network: CERN (Geneva) – StarLight (Chicago) – SLAC/Level3 (Sunnyvale); capacity = 1Gbps; 180 ms round trip latency; 1 flow; average utilization 95% (FAST) vs. 19%–27% (Linux TCP, txq = 100 and 10000) (C. Jin, D. Wei, S. Ravot, etc., Caltech, Nov 02)

  5. Average Queue vs Buffer Size Dummynet • capacity = 800Mbps • delay = 200ms • 1 flow • Buffer size: 50, …, 8000 pkts (S. Hegde, B. Wydrowski, etc., Caltech)

  6. Is large queue necessary for high throughput?

  7. Congestion control • Example congestion measure pl(t): • Loss (Reno) • Queueing delay (Vegas) [diagram: links feed back congestion measure pl(t); sources adjust rate xi(t)]

  8. TCP/AQM • Congestion control is a distributed asynchronous algorithm to share bandwidth • It has two components • TCP: adapts sending rate (window) xi(t) to congestion; e.g. Reno, Vegas • AQM: adjusts & feeds back congestion information pl(t); e.g. DropTail, RED, REM/PI, AVQ • They form a distributed feedback control system • Equilibrium & stability depend on both TCP and AQM • And on delay, capacity, routing, #connections

  9. ACK: W  W + 1/W Loss: W  W – 0.5W • Packet level • Flow level • Equilibrium • Dynamics pkts Packet & flow level Reno TCP (Mathis formula)

  10. Reno TCP • Packet level • Designed and implemented first • Flow level • Understood afterwards • Flow level dynamics determines • Equilibrium: performance, fairness • Stability • Design flow level equilibrium & stability • Implement flow level goals at packet level

  11. Reno TCP • Packet level • Designed and implemented first • Flow level • Understood afterwards • Flow level dynamics determines • Equilibrium: performance, fairness • Stability Packet level design of FAST, HSTCP, STCP guided by flow level properties

  12. ACK: W  W + 1/W Loss: W  W – 0.5W • Reno AIMD(1, 0.5) ACK: W  W + a(w)/W Loss: W  W – b(w)W • HSTCP AIMD(a(w), b(w)) ACK: W  W + 0.01 Loss: W  W – 0.125W • STCP MIMD(a, b) • FAST Packet level

  13. Flow level: Reno, HSTCP, STCP, FAST • Similar flow level equilibrium (throughput in pkts/sec), with constant a = 1.225 (Reno), 0.120 (HSTCP), 0.075 (STCP)

  14. Flow level: Reno, HSTCP, STCP, FAST • Common flow level dynamics! window adjustment = (control gain) × (distance from flow level goal) • Different gain k and utility Ui • They determine equilibrium and stability • Different congestion measure pi • Loss probability (Reno, HSTCP, STCP) • Queueing delay (Vegas, FAST)
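
A hedged reconstruction of this common dynamics, following the FAST TCP papers: each source i adjusts its window in proportion to how far its congestion measure q_i is from the flow-level goal u_i = U_i'(x_i) (the marginal utility),

\[ \dot{w}_i(t) \;=\; \kappa_i(t)\left(1 - \frac{q_i(t)}{u_i(t)}\right), \qquad u_i(t) = U_i'(x_i(t)), \]

so at equilibrium q_i = U_i'(x_i). The protocols differ in the gain \kappa_i, the utility U_i, and whether q_i is loss probability or queueing delay.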

  15. Implementation strategy • Common flow level dynamics: window adjustment = (control gain) × (distance from flow level goal) • Small adjustment when close, large when far away • Need to estimate how far the current state is from the target • Scalable • Window adjustment independent of pi • Depends only on current window • Difficult to scale

  16. Difficulties at large window • Equilibrium problem • Packet level: AI too slow, MD too drastic • Flow level: required loss probability too small • Dynamic problem • Packet level: must oscillate on binary signal • Flow level: unstable at large window

  17. Problem: no target • Reno: AIMD(1, 0.5) ACK: W → W + 1/W Loss: W → W – 0.5W • HSTCP: AIMD(a(w), b(w)) ACK: W → W + a(w)/W Loss: W → W – b(w)W • STCP: MIMD(1/100, 1/8) ACK: W → W + 0.01 Loss: W → W – 0.125W

  18. Solution: estimate target • FAST: scalable to any w* [diagram: FAST state machine with Slow Start, Conv, Equil and Loss Rec states]

  19. Difficulties at large window • Equilibrium problem • Packet level: AI too slow, MD too drastic • Flow level: required loss probability too small • Dynamic problem • Packet level: must oscillate on binary signal • Flow level: unstable at large window

  20. TCP Problem: binary signal oscillation

  21. Solution: multibit signal FAST stabilized

  22. Difficulties at large window • Equilibrium problem • Packet level: AI too slow, MD too drastic • Flow level: required loss probability too small • Dynamic problem • Packet level: must oscillate on binary signal • Flow level: unstable at large window Use multi-bit signal! Stabilize flow dynamics!

  23. Outline • Background, motivation • FAST TCP • Architecture and algorithms • Experimental evaluations • Loss recovery • MaxNet, SUPA FAST

  24. Architecture [diagram: loss recovery operates at sub-RTT (<RTT) timescale; other components at RTT timescale]

  25. Architecture Each component • designed independently • upgraded asynchronously

  26. Architecture Each component • designed independently • upgraded asynchronously [diagram highlights: Window Control]

  27. Window control algorithm • Full utilization • regardless of bandwidth-delay product • Globally stable • exponential convergence • Fairness • weighted proportional fairness • parameter a

  28. Window control algorithm

  29. Window control algorithm [figure: the update drives the measured backlog toward the target backlog]
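
A minimal sketch of the window update reported in the FAST TCP papers (run roughly once per RTT), which drives the measured backlog toward the target backlog a; the variable names and the gain value gamma = 0.5 below are illustrative assumptions, not the exact kernel implementation.

    def fast_window_update(w, base_rtt, avg_rtt, alpha, gamma=0.5):
        """One FAST window update (roughly once per RTT):
        w <- min{ 2w, (1 - gamma) w + gamma (base_rtt/avg_rtt * w + alpha) }.
        (base_rtt/avg_rtt) * w estimates packets in flight on the wire;
        alpha is the target number of packets queued inside the network."""
        target = (base_rtt / avg_rtt) * w + alpha
        return min(2.0 * w, (1.0 - gamma) * w + gamma * target)

    # Example: 100 ms propagation delay, 110 ms measured RTT, alpha = 200 pkts.
    # (In a real network avg_rtt would itself respond to the window.)
    w = 1000.0
    for _ in range(20):
        w = fast_window_update(w, base_rtt=0.100, avg_rtt=0.110, alpha=200)
    print(f"window after 20 updates: {w:.0f} pkts")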

  30. Outline • Background, motivation • FAST TCP • Architecture and algorithms • Experimental evaluations • Loss recovery • MaxNet, SUPA FAST

  31. Dynamic sharing: 3 flows [plots: FAST, Linux] Dynamic sharing on Dummynet • capacity = 800Mbps • delay = 120ms • 3 flows • iperf throughput • Linux 2.4.x (HSTCP: UCL)

  32. Dynamic sharing: 3 flows [plots: steady throughput for FAST, Linux, HSTCP, BIC]

  33. Dynamic sharing on Dummynet [plots: 30 min queue, loss, throughput for FAST, Linux, HSTCP, STCP] • capacity = 800Mbps • delay = 120ms • 14 flows • iperf throughput • Linux 2.4.x (HSTCP: UCL)

  34. Room for mice! [plots: 30 min queue, loss, throughput for FAST, Linux, HSTCP, BIC]

  35. Aggregate throughput [plots: small window (800 pkts) vs large window (8000 pkts)] Dummynet: cap = 800Mbps; delay = 50-200ms; #flows = 1-14; 29 expts

  36. Fairness Dummynet: cap = 800Mbps; delay = 50-200ms; #flows = 1-14; 29 expts

  37. Stability: stable in diverse scenarios. Dummynet: cap = 800Mbps; delay = 50-200ms; #flows = 1-14; 29 expts

  38. Responsiveness Dummynet: cap = 800Mbps; delay = 50-200ms; #flows = 1-14; 29 expts

  39. I2LSR, SC2004 Bandwidth Challenge Harvey Newman’s group, Caltech http://dnae.home.cern.ch/dnae/lsr4-nov04 [map: OC48, OC192 links] November 8, 2004 Caltech and CERN transferred • 2,881 GBytes in one hour (6.86Gbps) • between Geneva - US - Geneva (25,280 km) • through LHCnet/DataTag, Abilene and CENIC backbones • using 18 FAST TCP streams • on Linux 2.6.9 kernel with a 9000-byte MTU • at 174 petabit-meters per second (Pb·m/s)

  40. Internet2 Abilene Weather Map OC48 OC192 7.1G: GENV-PITS-LOSA-SNVA-STTL-DNVR-KSCY-HSTON-ATLA-WASH-NYCM-CHIN-GENV Newman’s group, Caltech

  41. “Ultrascale” protocol development: FAST TCP • Based on TCP Vegas • Uses end-to-end delay and loss to dynamically adjust the congestion window • Defines an explicit equilibrium • Capacity = OC-192 9.5Gbps; 264 ms round trip latency; 1 flow [plots: bandwidth use of Linux TCP, Westwood+, BIC TCP and FAST: 30%, 40%, 50%, 79%] (Yang Xia, Caltech)

  42. FAST backs off to make room for Reno Periodic losses every 10mins (Yang Xia, Harvey Newman, Caltech)

  43. Linux Experiment by Yusung Kim KAIST, Korea, Oct 2004 • Dummynet • Capacity = 622Mbps • Delay=200ms • Router buffer size = 1BDP (11,000 pkts) • 1 flow • Application: iperf • BIC, FAST, HSTCP, STCP, Reno (Linux), CUBIC http://netsrv.csc.ncsu.edu/yskim/single_traffic/curves/

  44. Throughput (Yusung Kim, KAIST, Korea, 10/2004) [plots: throughput and RTT for FAST, HSTCP, BIC; RTT = 400ms, double the baseRTT] • All can achieve high throughput except Reno • FAST adds negligible queueing delay • Loss-based control (almost) fills the buffer … • adding delay and reducing the ability to absorb bursts

  45. Queue and cwnd (Yusung Kim, KAIST, Korea, 10/2004) [plots: queue and FAST cwnd; FAST, HSTCP, BIC] • FAST needs smaller buffers at both routers and hosts • Loss-based control limited at the host in these expts

  46. Outline • Background, motivation • FAST TCP • Architecture and algorithms • Experimental evaluations • Loss recovery • MaxNet, SUPA FAST

  47. Loss Recovery Section Overview • Linux & TCP loss recovery have problems, especially in environments with non-congestion loss. • New Loss Architecture: • Determining packet loss & PIF (packets in flight) • Decoupled window control • Testing in high loss environment • Receiver window issues • Forward Retransmission • SACK processing optimization • Reorder Detection • Testing in small buffer environment

  48. New Loss Recovery Architecture • New architecture for loss recovery motivated by new environments: • High loss: wireless, 802.11, Satellite • Low loss, but large BDP • Measure of path ‘difficulty’ should be extended • BDLP: Bandwidth x Delay x (1/(1 - Loss))
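
A quick worked example of the BDLP measure (illustrative numbers, not from the slides): for a 1 Gbps path with 100 ms delay and 1% loss,

\[ \mathrm{BDLP} = 10^{9}\,\mathrm{bit/s} \times 0.1\,\mathrm{s} \times \frac{1}{1 - 0.01} \approx 1.01 \times 10^{8}\,\mathrm{bits}, \]

so 1% loss barely inflates the plain bandwidth-delay product, while 50% loss would double it; the 1/(1 - Loss) factor presumably reflects the expected number of transmissions needed per delivered packet.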

  49. Periodic losses every 10mins (Yang Xia, Harvey Newman, Caltech)

  50. Haystack - 1 Flow (Atlanta -> Japan) • Iperf used to generate traffic. • Sender is a Xeon 2.6 GHz. • Window was constant. • Burstiness in rate due to host processing and ACK spacing.
