
FAST TCP for Multi-Gbps WAN: Experiments and Applications


Presentation Transcript


  1. FAST TCP for Multi-Gbps WAN: Experiments and Applications Les Cottrell & Fabrizio Coccetti– SLAC Prepared for the Internet2, Washington, April 2003 http://www.slac.stanford.edu/grp/scs/net/talk/fast-i2-apr03.html Partially funded by DOE/MICS Field Work Proposal on Internet End-to-end Performance Monitoring (IEPM), by the SciDAC base program.

  2. Outline • High throughput challenges • New TCP stacks • Tests on Unloaded (testbed) links • Performance of multi-streams • Performance of various stacks • Tests on Production networks • Stack comparisons with single streams • Stack comparisons with multiple streams • Fairness • Where do I find out more?

  3. High Speed Challenges • After a loss it can take over an hour for stock TCP (Reno) to recover to maximum throughput at 1Gbits/s • i.e. loss rate of 1 in ~2 Gpkts (3 Tbits), or BER of 1 in 3.6×10^12 • PCI bus limitations (66MHz × 64 bit = 4.2Gbits/s at best) • At 2.5Gbits/s and 180msec RTT requires 120MByte window • Some tools (e.g. bbcp) will not allow a large enough window (bbcp limited to 2MBytes) • Slow start problem: at 1Gbits/s it takes about 5-6 secs for a 180msec link, i.e. if we want 90% of the measurement in the stable (non slow start) regime, we need to measure for 60 secs, i.e. ship >700MBytes at 1Gbits/s [Plot: Sunnyvale-Geneva, 1500Byte MTU, stock TCP]
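To make the arithmetic behind these figures concrete, here is a minimal back-of-the-envelope sketch (not from the original talk; it assumes 1500 Byte packets and Reno's one-packet-per-RTT increase, so the results are order-of-magnitude only and will not match the slide's quoted numbers exactly):

```python
# Rough arithmetic behind the "High Speed Challenges" figures.
# Assumptions (mine, not the talk's): 1500 Byte packets, Reno adds one
# packet per RTT in congestion avoidance; exact figures depend on MSS,
# header overhead and where recovery starts from.

PKT_BITS = 1500 * 8  # bits per 1500 Byte packet

def bdp_mbytes(rate_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product: minimum window to keep the pipe full."""
    return rate_bps * rtt_s / 8 / 1e6

def reno_recovery_minutes(rate_bps: float, rtt_s: float) -> float:
    """Time for Reno to climb back from half the full window."""
    window_pkts = rate_bps * rtt_s / PKT_BITS
    return (window_pkts / 2) * rtt_s / 60

if __name__ == "__main__":
    # Raw BDP at 2.5 Gbits/s, 180 ms is ~56 MBytes; quoted window sizes
    # are usually a small multiple of this (the slide quotes 120 MBytes).
    print(f"BDP: {bdp_mbytes(2.5e9, 0.18):.0f} MBytes")
    # Tens of minutes at 1 Gbits/s, 180 ms with this simple model; the
    # slide's "over an hour" depends on the exact assumptions (MSS, RTT,
    # whether recovery starts from a timeout).
    print(f"Reno recovery: {reno_recovery_minutes(1e9, 0.18):.0f} minutes")
```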

  4. New TCP Stacks • Reno (AIMD) based, loss indicates congestion • Back off less when they see congestion • Recover more quickly after backing off • Scalable TCP: exponential recovery • Tom Kelly, "Scalable TCP: Improving Performance in Highspeed Wide Area Networks", submitted for publication, December 2002 • High Speed TCP: same as Reno for low performance, then increases the window more & more aggressively as the window grows, using a table • Vegas based, RTT indicates congestion • Caltech FAST TCP, quicker response to congestion, but … [Plot: Standard, Scalable and High Speed TCP; cwnd = 38 pkts ~ 0.5 Mbits]
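As a concrete illustration of how these control laws differ, here is a simplified Python sketch of the window updates. It is my reading of the published algorithms (Kelly's Scalable TCP, RFC 3649 HighSpeed TCP, the Caltech FAST papers), not code from any of the actual stacks, and the constants are the commonly quoted defaults, so treat them as assumptions:

```python
# Simplified congestion-window update rules (illustrative only).

def reno_on_ack(cwnd: float) -> float:
    # Standard AIMD: +1/cwnd per ACK, i.e. +1 packet per RTT.
    return cwnd + 1.0 / cwnd

def reno_on_loss(cwnd: float) -> float:
    return cwnd / 2                     # halve on congestion

def scalable_on_ack(cwnd: float) -> float:
    # Scalable TCP: fixed per-ACK increase => exponential recovery.
    return cwnd + 0.01

def scalable_on_loss(cwnd: float) -> float:
    return cwnd * (1 - 0.125)           # back off by 1/8 instead of 1/2

# HighSpeed TCP (RFC 3649) behaves like Reno below ~38 packets, then looks
# up increasingly aggressive increase/decrease parameters from a table
# (omitted here).

def fast_per_rtt(cwnd: float, base_rtt: float, rtt: float,
                 alpha: float = 100.0, gamma: float = 0.5) -> float:
    # FAST TCP: delay-based; steers towards keeping ~alpha packets queued
    # in the path, so it reacts to a rising RTT without needing a loss.
    return min(2 * cwnd,
               (1 - gamma) * cwnd + gamma * (base_rtt / rtt * cwnd + alpha))
```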

  5. Typical testbed [Diagram: Sunnyvale (SNV) - Chicago (CHI) - Amsterdam (AMS) - Geneva (GVA), > 10,000 km; OC192/POS (10Gbits/s) between Sunnyvale and Chicago, 2.5Gbits/s across EU+US; 7609, T640 and GSR routers; 12*2cpu and 6*2cpu servers plus 4 disk servers at the ends; Sunnyvale section deployed for SC2002 (Nov 02)]

  6. Testbed Collaborators and sponsors • Caltech: Harvey Newman, Steven Low, Sylvain Ravot, Cheng Jin, Xiaoling Wei, Suresh Singh, Julian Bunn • SLAC: Les Cottrell, Gary Buhrmaster, Fabrizio Coccetti • LANL: Wu-chun Feng, Eric Weigle, Gus Hurwitz, Adam Englehart • NIKHEF/UvA: Cees DeLaat, Antony Antony • CERN: Olivier Martin, Paolo Moroni • ANL: Linda Winkler • DataTAG, StarLight, TeraGrid, SURFnet, NetherLight, Deutsche Telecom, Information Society Technologies • Cisco, Level(3), Intel • DoE, European Commission, NSF

  7. Windows and Streams • Well accepted that multiple streams (n) and/or big windows are important to achieve optimal throughput • Effectively reduces impact of a loss by 1/n, and improves recovery time by 1/n • Optimum windows & streams changes with changes (e.g. utilization) in path, hard to optimize n • Can be unfriendly to others
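A small sketch of the 1/n argument above (my own illustration, not from the talk): with the aggregate window split over n streams, a single loss halves only one sub-window, so the aggregate drops by 1/(2n) and the deficit is rebuilt in correspondingly fewer RTTs.

```python
# Why n parallel streams soften a single loss (illustrative figures only).

def single_loss_impact(total_window_pkts: float, n_streams: int):
    per_stream = total_window_pkts / n_streams
    dropped = per_stream / 2                      # only the hit stream halves
    fraction_lost = dropped / total_window_pkts   # = 1 / (2 n)
    rtts_to_recover = dropped                     # +1 packet per RTT per stream
    return fraction_lost, rtts_to_recover

for n in (1, 4, 8, 16):
    frac, rtts = single_loss_impact(15_000, n)    # ~15k pkts at 1G, 180 ms
    print(f"{n:2d} streams: aggregate drops {frac:.1%}, ~{rtts:.0f} RTTs to recover")
```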

  8. Even with big windows (1MB) still need multiple streams with Standard TCP • Above the knee, performance still improves slowly, maybe due to squeezing out others and taking more than a fair share because of the large number of streams • ANL, Caltech & RAL reach a knee (between 2 and 24 streams); above this the gain in throughput is slow

  9. Stock vs FAST TCP, MTU=1500B • Need to measure all the parameters to understand the effects of parameters and configurations: • Windows, streams, txqueuelen, TCP stack, MTU, NIC card • Lots of variables • Examples of 2 TCP stacks • FAST TCP no longer needs multiple streams; this is a major simplification (reduces the number of variables to tune by 1) [Plots: Stock TCP, 1500B MTU, 65ms RTT; FAST TCP, 1500B MTU, 65ms RTT; FAST TCP, 1500B MTU, 65ms RTT]
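To give a feel for the "lots of variables" point, a quick sketch of how the test matrix multiplies up (the specific values below are examples I chose, not the configurations actually measured at SLAC):

```python
# How quickly the tuning space grows (example values, not the real matrix).
from itertools import product

windows    = ["64KB", "1MB", "8MB", "16MB"]
streams    = [1, 2, 4, 8, 16]
txqueuelen = [100, 1000, 10000]
stacks     = ["Reno", "HighSpeed", "Scalable", "FAST"]
mtus       = [1500, 9000]

configs = list(product(windows, streams, txqueuelen, stacks, mtus))
print(f"{len(configs)} combinations before repeat runs")   # 480

# If FAST removes the need for multiple streams, the streams axis collapses
# to a single value, shrinking this grid by a factor of 5.
```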

  10. TCP stacks with 1500B MTU @ 1Gbps [Plot: txqueuelen]
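For reference, txqueuelen (the NIC's transmit queue length in packets) can be inspected from Python on Linux via sysfs; a tiny sketch, assuming an interface name of eth0 (adjust for your system), and noting that changing the value is normally done with ip link or ifconfig rather than from here:

```python
# Read the transmit queue length of a NIC on Linux (sketch; assumes the
# interface is called "eth0"). Changing it is usually done with
# `ip link set dev eth0 txqueuelen N` (needs root), not from Python.
from pathlib import Path

def tx_queue_len(iface: str = "eth0") -> int:
    return int(Path(f"/sys/class/net/{iface}/tx_queue_len").read_text())

if __name__ == "__main__":
    print(f"eth0 txqueuelen: {tx_queue_len('eth0')} packets")
```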

  11. Jumbo frames, new TCP stacks at 1 Gbits/s SNV-GVA • But: Jumbos are not part of the GE or 10GE standard • Not widely deployed in end networks

  12. Production network tests • All 6 hosts have 1GE interfaces (2 SLAC hosts send simultaneously, one running the "New" TCP under test and one running Reno TCP); competing flows, no jumbos [Map: SLAC/Stanford to remote hosts at CERN (GVA, RTT = 202 ms), NIKHEF (AMS, RTT = 158 ms), Caltech (RTT = 25 ms) and APAN (via Seattle, RTT = 147 ms), over ESnet, Abilene, CalREN and SURFnet via Chicago (CHI) and Sunnyvale (SNV), on OC12/OC48/OC192 links]

  13. High Speed TCP vs Reno – 1 Stream • 2 separate hosts @ SLAC sending simultaneously to 1 receiver (2 iperf processes), 8MB window, pre-flush TCP config, 1500B MTU • RTT bursty = congestion? • Checked Reno vs Reno with 2 hosts; very similar, as expected

  14. N.b. large RTT = congestion?

  15. Large RTTs => poor FAST

  16. Scalable vs multi-streams SLAC to CERN, duration 60s, RTT 207ms, 8MB window

  17. FAST & Scalable vs. Multi-stream Reno (SLAC>CERN ~230ms) • Bottleneck capacity 622Mbits/s • For short durations, very noisy, hard to distinguish; congestion events often sync [Plots: Reno 1 stream 87 Mbits/s average; FAST 1 stream 244 Mbits/s average; Reno 8 streams 150 Mbits/s average; FAST 1 stream 200 Mbits/s average]

  18. Scalable & FAST TCP with 1 stream vs Reno with n streams

  19. Fairness FAST vs Reno • 1 stream, 16MB window, SLAC to CERN • Reno alone: 221Mbps • FAST alone: 240Mbps • Competing: Reno 45Mbps & FAST 285Mbps
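One standard way to put a number on this sharing is Jain's fairness index; the slide itself does not compute it, so the snippet below is just an illustration using the figures quoted above:

```python
# Jain's fairness index for the competing-flow throughputs quoted above.
# 1.0 means a perfectly even split; 1/n means one flow takes everything.

def jain_index(throughputs):
    n = len(throughputs)
    return sum(throughputs) ** 2 / (n * sum(x * x for x in throughputs))

competing = [45, 285]    # Mbps: Reno and FAST sharing SLAC-CERN
print(f"Jain fairness index: {jain_index(competing):.2f}")   # ~0.65
```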

  20. Summary (very preliminary) • With single flow & empty network: • Can saturate 2.5 Gbps with standard TCP & jumbos • Can saturate 1Gbps with new stacks & 1500B frame or with standard & jumbos • With production network: • FAST can take a while to get going • Once going, FAST TCP with one stream looks good compared to multi-stream Reno • FAST can back down early compared to Reno • More work needed on fairness • Scalable • Does not look as good vs. multi-stream Reno

  21. What’s next? • Go beyond 2.5Gbits/s • Disk-to-disk throughput & useful applications • Need faster cpus (extra 60% MHz/Mbits/s over TCP for disk to disk), understand how to use multi-processors • Further evaluate new stacks with real-world links, and other equipment • Other NICs • Response to congestion, pathologies • Fairness • Deploy for some major (e.g. HENP/Grid) customer applications • Understand how to make 10GE NICs work well with 1500B MTUs • Move from “hero” demonstrations to commonplace

  22. More Information • 10GE tests • www-iepm.slac.stanford.edu/monitoring/bulk/10ge/ • sravot.home.cern.ch/sravot/Networking/10GbE/10GbE_test.html • TCP stacks • netlab.caltech.edu/FAST/ • datatag.web.cern.ch/datatag/pfldnet2003/papers/kelly.pdf • www.icir.org/floyd/hstcp.html • Stack comparisons • www-iepm.slac.stanford.edu/monitoring/bulk/fast/ • www.csm.ornl.gov/~dunigan/net100/floyd.html • www-iepm.slac.stanford.edu/monitoring/bulk/tcpstacks/

  23. Extras

  24. FAST TCP vs. Reno – 1 stream N.b. RTT curve for Caltech shows why FAST performs poorly against Reno (too polite?)

  25. Scalable vs. Reno - 1 stream 8MB windows, 2 hosts, competing

  26. Other high speed gotchas • Large windows and a large number of streams can cause the last stream to take a long time to close • Linux memory leak • Linux TCP configuration caching • What is the window size actually used/reported? • 32 bit counters in iperf and routers wrap; need latest releases with 64 bit counters (see the wrap-handling sketch below) • Effects of txqueuelen (number of packets queued for the NIC) • Routers do not pass jumbos • Performance differs between drivers and NICs from different manufacturers • May require tuning a lot of parameters
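As an example of handling the 32 bit counter wrap mentioned above, a minimal sketch (real monitoring code also has to cope with counter resets and with polling intervals long enough for multiple wraps):

```python
# Difference between successive readings of a wrapping (e.g. 32 bit) counter.

def counter_delta(prev: int, curr: int, width_bits: int = 32) -> int:
    """Bytes/packets transferred between two polls, allowing one wrap."""
    return (curr - prev) % (2 ** width_bits)

# Example: the counter wrapped between two polls
print(counter_delta(4_294_000_000, 1_500_000))   # -> 2467296
```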
