TCP performance
Sven Ubik, ubik@cesnet.cz

Géant, Internet2 vs. 10 Mb/s Ethernet?

Server              FTP throughput   Capacity   Load
ftp.uninett.no      12.3 Mb/s        1.2 Gb/s   80 Mb/s (6.6%)
ftp.stanford.edu    1.3 Mb/s         600 Mb/s   180 Mb/s (30%)

Protocols: TCP 95%, UDP 3%, other 2%
International traffic: 30% (April 2002)

BW * delay product

From CESNET:

Server                 Ping RTT   Max. throughput for 64 kB owin
ftp.uninett.no         38 ms      13.8 Mb/s
ftp.cs.columbia.edu    90 ms      5.8 Mb/s
ftp.tamu.edu           133 ms     3.9 Mb/s
ftp.stanford.edu       166 ms     2.6 Mb/s
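
The window limit behind this table can be sketched in a few lines of Python. The function names are hypothetical and the model is the simple one implied by the slide: at most one window of data is in flight per round trip, so throughput cannot exceed window / RTT, and filling a path needs a window of capacity * RTT.

```python
def max_throughput_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound on throughput (Mb/s) when at most window_bytes
    can be outstanding per round-trip time."""
    return window_bytes * 8 / (rtt_ms / 1000.0) / 1e6


def required_window_bytes(capacity_mbps: float, rtt_ms: float) -> float:
    """Bandwidth * delay product: bytes that must be in flight
    to keep a path of the given capacity full."""
    return capacity_mbps * 1e6 / 8 * (rtt_ms / 1000.0)


if __name__ == "__main__":
    # 64 kB window (65535 bytes, the largest unscaled TCP window), 38 ms RTT
    print(round(max_throughput_mbps(65535, 38), 1), "Mb/s")
    # Window needed to fill a 1.2 Gb/s path at 38 ms RTT
    print(round(required_window_bytes(1200, 38)), "bytes")
```

With these inputs, a 1.2 Gb/s path at 38 ms needs several megabytes in flight, far more than the 64 kB an unscaled TCP window allows.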

Window Scale TCP Option (RFC 1323)

Advertised rwnd is shifted internally by 1-14 bits.

1. OS must support it, for example, Linux 2.4:
     sysctl -w net/ipv4/tcp_window_scaling=1
2. Application must use it:
   a) default for all TCP connections
        sysctl -w net/ipv4/tcp_rmem="4096 1048576 1048576"
        sysctl -w net/ipv4/tcp_wmem="4096 1048576 1048576"
   b) application sets its own
        setsockopt(sockfd, SOL_SOCKET, SO_RCVBUF, (char *)&size, sizeof(int));
        setsockopt(sockfd, SOL_SOCKET, SO_SNDBUF, (char *)&size, sizeof(int));
      before connect() or listen() (e.g., netperf, modified ncftp+wuftpd)
   c) OS tunes automatically (dynamic right-sizing)
3. Timestamps + PAWS (Protection Against Wrapped Sequence Numbers)
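
Option 2b can be tried from Python's standard socket module, as a minimal sketch. The helper names and the 1 MB size are illustrative assumptions. The operating system may grant a different size than requested (for example, it may cap or round the value), so the sketch reads the buffers back.

```python
import socket


def make_socket_with_buffers(size_bytes: int) -> socket.socket:
    """Create a TCP socket and request send/receive buffers of size_bytes.

    The request is made before connect() or listen(), as the slide requires.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size_bytes)
    return sock


def granted_buffers(sock: socket.socket) -> tuple:
    """Return the (receive, send) buffer sizes the OS actually reports."""
    rcv = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    snd = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    return rcv, snd


if __name__ == "__main__":
    s = make_socket_with_buffers(1048576)  # 1 MB, as in the sysctl example
    print("granted (rcv, snd):", granted_buffers(s))
    s.close()
```

If the reported sizes are much smaller than requested, the system-wide limits are the likely cause and need raising before per-application settings can take effect.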

TCP congestion control

Initialization: cwnd <= 2*MSS, ssthresh high (rwnd)
Slow start (cwnd < ssthresh): each new ACK received => cwnd = cwnd + MSS
Congestion avoidance (cwnd > ssthresh): each RTT => cwnd = cwnd + MSS (usually approximated)
Timeout: ssthresh = max(owin/2, 2*MSS), cwnd = MSS (implies slow start)
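
The rules above can be sketched as a small Python model. Several things here are assumptions for illustration: the function names, MSS = 1460 bytes, one ACK per segment (no delayed ACKs), and the per-ACK increment MSS*MSS/cwnd, which is the common approximation of "one MSS per RTT" given in RFC 2581.

```python
MSS = 1460  # bytes; assumed example value


def on_ack(cwnd: float, ssthresh: float, mss: int = MSS) -> float:
    """Return cwnd after one new ACK."""
    if cwnd < ssthresh:
        return cwnd + mss            # slow start: +1 MSS per ACK
    # Congestion avoidance (also used when cwnd == ssthresh):
    # per-ACK approximation of +1 MSS per RTT.
    return cwnd + mss * mss / cwnd


def on_timeout(owin: float, mss: int = MSS) -> tuple:
    """Return (cwnd, ssthresh) after a retransmission timeout."""
    return mss, max(owin / 2, 2 * mss)


def one_rtt(cwnd: float, ssthresh: float, mss: int = MSS) -> float:
    """Apply one round trip worth of ACKs (one ACK per segment in flight)."""
    for _ in range(int(cwnd // mss)):
        cwnd = on_ack(cwnd, ssthresh, mss)
    return cwnd


if __name__ == "__main__":
    cwnd, ssthresh = 2 * MSS, 16 * MSS
    for rtt in range(8):
        print(rtt, round(cwnd / MSS, 2), "MSS")
        cwnd = one_rtt(cwnd, ssthresh)
    cwnd, ssthresh = on_timeout(cwnd)
    print("after timeout:", cwnd / MSS, "MSS, ssthresh", round(ssthresh / MSS, 2), "MSS")
```

The printed trace shows the window doubling each RTT until it reaches ssthresh, then growing by slightly less than one MSS per RTT, and collapsing to one MSS after the timeout.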

TCP congestion control throughput limitation

• CESNET-UNINETT path: MSS = 1460 bytes, RTT = 44 ms, packet loss rate ~ 5*10^-6, timeout = 250 ms
• Padhye [1] equation: BW ~ 102.8 Mb/s
• Mathis [2] equation (BW ~ MSS/RTT * C/sqrt(p)): BW ~ 110.4 Mb/s

Higher MSS to speed up congestion avoidance?

[1] J. Padhye, V. Firoiu, D. Towsley, J. Kurose. "Modeling TCP Throughput: A Simple Model and its Empirical Validation".
[2] M. Mathis, J. Semke, J. Mahdavi, T. Ott. "The Macroscopic Behavior of the TCP Congestion Avoidance Algorithm".
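
Both estimates can be reproduced with a short Python sketch. The slide does not state the constant C or the number of segments acknowledged per ACK (b). The values C = 0.93 and b = 2 used below are assumptions, chosen because they reproduce the quoted figures. The Padhye function is the widely used approximate form of the model in [1], without the receiver-window cap.

```python
from math import sqrt


def mathis_bw_mbps(mss_bytes: int, rtt_s: float, p: float, c: float = 0.93) -> float:
    """Mathis estimate: BW ~ MSS/RTT * C/sqrt(p), in Mb/s.
    c = 0.93 is an assumed constant, not given on the slide."""
    return mss_bytes * 8 / rtt_s * c / sqrt(p) / 1e6


def padhye_bw_mbps(mss_bytes: int, rtt_s: float, p: float, t0_s: float, b: int = 2) -> float:
    """Approximate Padhye estimate in Mb/s, ignoring the receiver-window cap.
    b = 2 (segments acknowledged per ACK) is an assumed value."""
    denom = (rtt_s * sqrt(2 * b * p / 3)
             + t0_s * min(1.0, 3 * sqrt(3 * b * p / 8)) * p * (1 + 32 * p * p))
    return mss_bytes * 8 / denom / 1e6


if __name__ == "__main__":
    mss, rtt, loss, timeout = 1460, 0.044, 5e-6, 0.250
    print("Mathis:", round(mathis_bw_mbps(mss, rtt, loss), 1), "Mb/s")
    print("Padhye:", round(padhye_bw_mbps(mss, rtt, loss, timeout), 1), "Mb/s")
```

In both formulas the estimate scales linearly with MSS, which is the motivation for the question about a higher MSS.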

[Figure: CESNET-UNINETT UDP throughput, via Géant and via Teleglobe. Created by qosplot, http://www.cesnet.cz/english/project/qosip]

Path capacity estimation tools

pathrate:
  "Phase I was aborted"
  Final capacity estimate: 757 Mbps to 792 Mbps

pathchar:
  pathchar to tcp4-ge.uninett.no (158.38.0.194)
   0 localhost
   |    61 Mb/s,   48 us (293 us)
   1 195.113.147.1 (195.113.147.1)
   |   348 Mb/s,   13 us (354 us)
   2 r21-pos0-0-stm16.cesnet.cz (195.113.156.114)
   |   108 Mb/s,   64 us (594 us)
   3 cesnet.cz1.cz.geant.net (62.40.103.29)
   |   614 Mb/s,   4.15 ms (8.91 ms)
   4 cz.de1.de.geant.net (62.40.96.38)
   |   557 Mb/s,   7 us (8.95 ms)
   5 de1-1.de2.de.geant.net (62.40.96.130)
   |  1229 Mb/s,   10.9 ms (30.7 ms)
   6 de.se1.se.geant.net (62.40.96.66)
   |    ?? b/s,   -13 us (30.6 ms)
   7 nordunet-gw.se1.se.geant.net (62.40.103.118)
   |   805 Mb/s,   3.67 ms (38.0 ms)
   8 no-gw.nordu.net (193.10.68.30)
   |   884 Mb/s,   13 us (38.0 ms)
   9 oslo-gw1.uninett.no (193.10.68.50)
   |   656 Mb/s,   3.39 ms (44.8 ms)
  10 trd-gw.uninett.no (128.39.0.250)
   |    99 Mb/s,   29 us (45.0 ms), 13% dropped
  11 tcp4-ge.uninett.no (158.38.0.194)

CESNET-UNINETT TCP throughput

max. rwnd   CESNET -> UNINETT   UNINETT -> CESNET   CESNET -> UNINETT (Teleglobe)
[bytes]     [Mb/s]              [Mb/s]              [Mb/s]
65535       7.5                 7.5                 8.7
131072      15.2                14.4                17.0
262144      28.7                27.6                33.4
524288      58.1                54.4                62.9
1048576     105.1               100.4               103.4
2097152     205.7               177.4               103.3
4194304     451.6               65.9                98.9
8388608     394.9               67.0                98.9

FTP 150 MB:
  standard FTP: 130 s
  rwnd increased to 4 MB: 5 s

Measurement tools

• Simulation (ns-2, sim.cesnet.cz)
• Emulation (NIST Net)
• Path capacity estimation tools (pathchar, pathrate, …)
• Capture + follow-up analysis (tcpdump + tcptrace and others)
• On-the-fly monitoring of TCP state variables (web100)

Capture + follow-up analysis

  tcpdump -i eth1 -p -s 96 -w trace.log tcp and host tcp4-ge.uninett.no
  tcptrace -l -f 's_port!=12865' -T -A300 -G trace.log
  xplot tcpplot

On-the-fly monitoring of TCP state variables

www.web100.org
• "Tools for end hosts to automatically achieve high bandwidth"
• kernel data structures (approx. 120 variables), library and userland tools

  readvars 0.01 CurrentCwnd CurrentRwndRcvd

Parallel TCP

• GridFTP, LFTP
• ~100 Mb Uninett -> Cesnet:

Command       Time             Throughput
pget -n 5     14.7 s           6.5 Mb/s
pget -n 10    7.8 s            12.1 Mb/s
pget -n 15    5.5 s to 10 s    17.4 Mb/s to 9.5 Mb/s
pget -n 20    9.9 s to 12 s    9.6 Mb/s to 7.9 Mb/s
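
Parallel transfer tools fetch one file over several TCP connections, each carrying a separate part of the file. The Python sketch below shows one simple way to divide a file into byte ranges. It is an illustration only, not the actual algorithm used by LFTP or GridFTP, and the function name is hypothetical.

```python
def split_ranges(total_bytes: int, n_conn: int) -> list:
    """Split [0, total_bytes) into up to n_conn contiguous byte ranges.

    Returns a list of (first_byte, last_byte) pairs, both inclusive.
    Range sizes differ by at most one byte.
    """
    if total_bytes <= 0 or n_conn <= 0:
        raise ValueError("total_bytes and n_conn must be positive")
    n = min(n_conn, total_bytes)        # never create empty ranges
    base, extra = divmod(total_bytes, n)
    ranges = []
    start = 0
    for i in range(n):
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


if __name__ == "__main__":
    for first, last in split_ranges(100_000_000, 5):
        print(f"bytes {first}-{last}")
```

Each range would then be requested over its own connection, so every connection runs its own congestion window.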

Further research

E2E performance:
• No changes to network.
• Is it best-effort? Is it fair?

Ultimate goal of E2E performance:
• Fully automatic adjustment to network and receiver conditions inside the operating system to maximize utilization of available resources.
• Use SACKs to avoid slow start after timeout

Are you far enough from us? You are welcome!
Sven Ubik, ubik@cesnet.cz