
Reliable Transport II: TCP and Congestion Control

Brad Karp, UCL Computer Science. CS 6007/GC15/GA07, 27th-28th February, 2008.

Presentation Transcript


  1. Reliable Transport II: TCP and Congestion Control • Brad Karp, UCL Computer Science • CS 6007/GC15/GA07 • 27th-28th February, 2008

  2. Outline • Packet header format • Connection establishment • Data transmission • Retransmit timeouts • RTT estimator • AIMD Congestion control • Throughput, loss, and RTT equation • Connection teardown • Protocol state machine

  3. TCP Packet Header • TCP packet: IP header + TCP header + data • TCP header: 20 bytes long (without options) • Checksum covers TCP header, data, and a “pseudo header” consisting of: • IP header source and destination addresses, protocol • Length of TCP segment (TCP header + data)
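
The checksum coverage described on this slide can be sketched in Python as follows. This is a minimal illustration for IPv4 only; the function names `internet_checksum` and `tcp_checksum` are hypothetical, and the segment passed in is assumed to have its checksum field (bytes 16-17 of the TCP header) set to zero when the checksum is first computed.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """One's-complement of the one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum over the IPv4 pseudo header plus the TCP segment.

    `segment` is the TCP header followed by data (assumption: its
    checksum field is zero when computing a fresh checksum).
    """
    pseudo_header = struct.pack(
        "!4s4sBBH",
        socket.inet_aton(src_ip),   # IP source address
        socket.inet_aton(dst_ip),   # IP destination address
        0,                          # zero padding byte
        socket.IPPROTO_TCP,         # protocol number (6)
        len(segment),               # length of TCP header + data
    )
    return internet_checksum(pseudo_header + segment)
```

A receiver can run the same computation over the segment as received, checksum field included; the result is zero when nothing was corrupted.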

  4. TCP Header Details • Connections inherently bidirectional; all TCP headers carry both data and ACK sequence numbers • 32-bit sequence numbers are in units of bytes • Source and destination ports • multiplexing of TCP by applications • UNIX: local ports below 1024 reserved (only root may use them) • Window: advertisement of number of bytes advertiser willing to accept

  5. TCP Connection Establishment: Motivation • Goals: • Start TCP connection between two hosts • Avoid mixing data from old connection in new connection • Avoid confusing previous connection attempts with current one • Prevent (most) third parties from impersonating (spoofing) one endpoint • SYN packets (SYN flag in TCP header set) used to establish connections • Use retransmission timer to recover from lost SYNs • What protocol meets above goals?

  6. TCP Connection Establishment: Non-Solution (I) • Use two-way handshake • A sends SYN to B • B accepts by returning SYN to A • A retransmits SYN if not received • A and B can ignore duplicate SYNs after connection established • What about delayed data packets from old connection? • Takeaway: connections shouldn’t start with constant sequence number; risks mixing data between old and new connections • [Figure: time-sequence diagram between A and B; labels in order: SYN; SYN; data, seqno = 1; data, seqno = 512; closed; SYN; data, seqno = 1024; SYN; data, seqno = 1; data, seqno = 512; data, seqno = 1024]

  7. TCP Connection Establishment: Non-Solution (II) • Two-way handshake, as before • But enclose random initial sequence numbers on SYNs • What about delayed SYNs from old connection? • A wrongly believes connection successfully established • B will drop all of A’s data! • Takeaway: connection attempts should explicitly acknowledge which SYN they are accepting! • [Figure: time-sequence diagram between A and B; labels in order: SYN, seqno = i; closed; SYN, seqno = k; SYN, seqno = j; data, seqno = k+1; data ignored!]

  8. TCP Connection Establishment: 3-Way Handshake • Set SYN on connection request • Each side chooses random initial sequence number • Each side explicitly ACKs the sequence number of the SYN it’s responding to • [Figure: time-sequence diagram; A to B: SYN, seqno = i; B to A: SYN, seqno = j, ACK = i+1; A to B: seqno = i+1, ACK = j+1]

  9. Robustness of 3-Way Handshake: Delayed SYN • Suppose A’s SYN i delayed, arrives at B after connection closed • B responds with SYN/ACK for i+1 • A doesn’t recognize i+1; responds with reset, RST flag set in TCP header • A rejects connection • [Figure: time-sequence diagram between A and B; labels in order: SYN, seqno = i; closed; SYN, seqno = j, ACK = i+1; RST, ACK = j]

  10. Robustness of 3-Way Handshake: Delayed SYN/ACK • A attempts connection to B • Suppose B’s SYN k/ACK p delayed, arrives at A during new connection attempt • A rejects SYN k; sends RST to B • Connection from A to B succeeds unimpeded • [Figure: time-sequence diagram between A and B; labels in order: closed; SYN, seqno = i; SYN, seqno = k, ACK = p; RST, ACK = k; SYN, seqno = j, ACK = i+1; seqno = i+1, ACK = j+1]
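
The acknowledgment check that makes the 3-way handshake robust (slides 8 to 10) can be sketched in Python as follows. This is a minimal illustration, not TCP's actual state machine: the function name `respond_to_syn_ack`, the dictionary packet representation, and the use of `None` to mean "no connection attempt in progress" are assumptions made for this example. The ACK value carried on the RST mirrors the slides' diagrams.

```python
from typing import Optional

MOD = 2**32  # sequence numbers are 32 bits wide


def respond_to_syn_ack(my_isn: Optional[int], pkt_seqno: int, pkt_ack: int) -> dict:
    """Host A's reaction to an incoming SYN/ACK.

    my_isn    -- ISN of A's outstanding SYN, or None if A has no
                 connection attempt in progress (hypothetical convention)
    pkt_seqno -- sequence number on the received SYN/ACK
    pkt_ack   -- ACK number on the received SYN/ACK
    """
    if my_isn is not None and pkt_ack == (my_isn + 1) % MOD:
        # The SYN/ACK acknowledges the SYN A actually sent: complete the handshake.
        return {
            "flags": "ACK",
            "seqno": (my_isn + 1) % MOD,
            "ack": (pkt_seqno + 1) % MOD,
        }
    # The SYN/ACK answers some old or unknown SYN: reject it.
    return {"flags": "RST", "ack": pkt_seqno}
```

For example, `respond_to_syn_ack(100, 500, 101)` completes the handshake, while `respond_to_syn_ack(100, 700, 42)` yields a reset because ACK 42 does not match the outstanding SYN.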

  11. Robustness of 3-Way Handshake: Source Spoofing • Suppose host B trusts host A, based on A’s IP address • e.g., allows any account creation request from host A • Adversary M may not control host A, but may seek to impersonate, or spoof, host A • Adversary may not need to receive data from B; only send data (e.g., “create an account l33thax0r”) • Can M establish a connection to B as A? • Takeaway: unless he is on path between A and B, adversary cannot spoof A to B or vice versa! Why: random ISNs on SYNs • [Figure: hosts A and B and adversary M; labels in order: SYN, seqno = j, ACK = i+1; IP = A, SYN, seqno = i; IP = A, seqno = i+1, ACK = ??]

  12. TCP: Data Transmission (I) • Each byte numbered sequentially, mod 2^32 • Sender buffers data in case retransmission required • Receiver buffers data for in-order reassembly • Sequence number (seqno) field in TCP header indicates first user payload byte in packet • Receiver indicates receive window size explicitly to sender in window field in TCP header • corresponds to available buffer space at receiver
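
Because bytes are numbered mod 2^32, sequence number arithmetic has to wrap around. The following Python sketch shows one common way to handle this; the helper names `seq_add` and `seq_lt` are hypothetical, and the "less than half the space ahead" comparison rule is an assumption not stated on the slide.

```python
MOD = 1 << 32  # size of the 32-bit sequence number space


def seq_add(seq: int, n: int) -> int:
    """Advance a sequence number by n bytes, wrapping mod 2^32."""
    return (seq + n) % MOD


def seq_lt(a: int, b: int) -> bool:
    """True if a comes before b, treating the space as circular.

    Assumption: b counts as 'after' a when it lies less than half
    the sequence space (2^31 bytes) ahead of a.
    """
    return a != b and ((b - a) % MOD) < (1 << 31)
```

With these helpers, `seq_add(0xFFFFFFF0, 0x20)` wraps to `0x10`, and `seq_lt(0xFFFFFFF0, 0x10)` still reports the pre-wrap number as earlier.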

  13. TCP: Data Transmission (II) • Sender’s transmit window size: amount of buffer space at sender • Sender uses window that is minimum of send and receive window sizes • Receiver sends cumulative ACKs • ACK number in TCP header names highest contiguous byte number received thus far, +1 • one ACK per received packet, OR • Delayed ACK also possible: receiver batches ACKs, sends one for every pair of data packets (200 ms max delay) • Current window at sender: • low byte advances as packets sent • high byte advances as receive window updates arrive
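
Cumulative ACK generation at the receiver can be sketched in Python as follows. This is a simplified illustration under stated assumptions: one ACK is sent per received packet (no delayed ACKs), sequence number wraparound is ignored, and the function name `cumulative_acks` and the `(seqno, length)` tuple format are hypothetical.

```python
def cumulative_acks(first_seq, arrivals):
    """Return the ACK number the receiver sends after each arrival.

    first_seq -- sequence number of the first expected byte
    arrivals  -- list of (seqno, length) pairs in arrival order
    """
    next_expected = first_seq
    buffered = {}  # out-of-order segments held for reassembly: seqno -> length
    acks = []
    for seqno, length in arrivals:
        if seqno >= next_expected:
            buffered[seqno] = length
        # Advance past every segment that is now contiguous.
        while next_expected in buffered:
            next_expected += buffered.pop(next_expected)
        # The ACK names the highest contiguous byte received, plus one.
        acks.append(next_expected)
    return acks
```

For four 512-byte segments starting at seqno 1, where the second arrives last, `cumulative_acks(1, [(1, 512), (1025, 512), (1537, 512), (513, 512)])` gives `[513, 513, 513, 2049]`: the ACK number stalls at 513 until the gap is filled.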

  14. Outline • Packet header format • Connection establishment • Data transmission • Retransmit timeouts • RTT estimator • AIMD Congestion control • Throughput, loss, and RTT equation • Connection teardown • Protocol state machine

  15. TCP: Retransmit Timeouts • Sender sets timer for each sent packet • when ACK returns, timer canceled • if timer expires before ACK returns, packet resent • Expected time for ACK to return: RTT • TCP estimates round-trip time using EWMA • measurements m_i from timed packet/ACK pairs • RTT_i = (1-α) × RTT_{i-1} + α × m_i • Retransmit timeout: RTO_i = β × RTT_i • original TCP: β = 2 • Is this accurate enough? • Recall dangers of too-short and too-long RTT estimates from previous lecture
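
The EWMA estimator and the original timeout rule on this slide can be sketched in Python as follows. The function name `ewma_rto`, the default α = 0.125, and seeding the estimate with the first measurement are assumptions for illustration; only β = 2 comes from the slide.

```python
def ewma_rto(samples, alpha=0.125, beta=2.0):
    """Return a list of (RTT estimate, RTO) pairs, one per measurement.

    samples -- RTT measurements m_i (any consistent time unit)
    alpha   -- EWMA gain (assumed value; the slide does not give one)
    beta    -- timeout multiplier (2 in original TCP, per the slide)
    """
    rtt = samples[0]  # assumption: seed the estimate with the first sample
    results = []
    for m in samples:
        rtt = (1 - alpha) * rtt + alpha * m  # RTT_i = (1-a) RTT_{i-1} + a m_i
        results.append((rtt, beta * rtt))    # RTO_i = b RTT_i
    return results
```

With measurements of 100, 100, and 200 ms, the final estimate is 112.5 ms and the timeout 225 ms: the estimate moves only one eighth of the way toward the new sample.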

  16. Mean and Variance: Jacobson’s RTT Estimator • Above link load of 30% at router, β × RTT_i will retransmit too early! • Response to increasing load: waste bandwidth on duplicate packets • Result: congestion collapse! • [Jacobson 88]: estimate v_i, mean deviation (EWMA of |m_i – RTT_i|), stand-in for variance: v_i = (1-γ) × v_{i-1} + γ × |m_i – RTT_i| • Use RTO_i = RTT_i + 4 × v_i • Mean and variance RTT estimator used by all modern TCPs
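
The mean-plus-deviation estimator can be sketched in Python as follows. It follows the slide's formulas literally, so the deviation is measured against the updated estimate RTT_i; RFC 6298 instead measures it against the previous estimate. The function name `jacobson_rto`, the gains α = 0.125 and γ = 0.25, and the initial values are assumptions for illustration.

```python
def jacobson_rto(samples, alpha=0.125, gamma=0.25):
    """Return a list of (RTT estimate, mean deviation, RTO) per measurement.

    samples -- RTT measurements m_i (any consistent time unit)
    alpha   -- gain for the RTT mean (assumed value)
    gamma   -- gain for the mean deviation (assumed value)
    """
    rtt = samples[0]  # assumption: seed the mean with the first sample
    dev = 0.0         # assumption: start with zero deviation
    results = []
    for m in samples:
        rtt = (1 - alpha) * rtt + alpha * m               # EWMA of the mean
        dev = (1 - gamma) * dev + gamma * abs(m - rtt)    # EWMA of |m_i - RTT_i|
        results.append((rtt, dev, rtt + 4 * dev))         # RTO_i = RTT_i + 4 v_i
    return results
```

With measurements of 100, 100, and 200 ms, the timeout stays at 100 ms while the samples are steady and rises to 200 ms after the outlier, because the deviation term reacts to the jump.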

  17. Retransmit Behavior • Original TCP, before [Jacobson 88]: • at start of connection, send full window of packets • retransmit each packet immediately after its timer expires • Result: window-sized bursts of packets sent into network

  18. Pre-Jacobson TCP (Obsolete!) • Time-sequence plot taken at sender • Bursts of packets: vertical lines • Spurious retransmits: repeats at same y value • Dashed line: available 20 Kbps capacity

  19. Self-Clocking: Conservation of Packets • Goal: self-clocking transmission • as each ACK returns, one data packet is sent • spacing of returning ACKs: matches spacing of packets in time at slowest link on path

  20. Reaching Equilibrium: Slow Start • At connection start, sender sets congestion window size, cwnd, to pktSize (one packet’s worth of bytes), not whole window • Sender sends up to minimum of receiver’s advertised window and cwnd • Upon return of each ACK until receiver’s advertised window size reached, increase cwnd by pktSize bytes • “Slow” means exponential window increase! • Takes log_2 W RTTs to reach receiver’s advertised window size W
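
The exponential growth of slow start can be checked with a short Python sketch. It counts windows in packets rather than bytes and assumes every packet in a window is ACKed within one RTT with no loss; the function name `slow_start_rtts` is hypothetical.

```python
def slow_start_rtts(rwnd_pkts: int) -> int:
    """Number of RTTs for cwnd to grow from 1 packet to rwnd_pkts packets."""
    cwnd = 1
    rtts = 0
    while cwnd < rwnd_pkts:
        # Each returning ACK adds one packet to cwnd, and a full window
        # of ACKs returns per RTT, so cwnd doubles each RTT.
        cwnd = min(2 * cwnd, rwnd_pkts)
        rtts += 1
    return rtts
```

For an advertised window of 64 packets this gives 6 RTTs, matching log_2 64; for windows that are not a power of two the count rounds up.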

  21. Post-Jacobson TCP: Slow Start and Mean+Variance RTT Estimator • Time-sequence plot at sender • “Slower” start • No spurious retransmits

  22. Outline • Packet header format • Connection establishment • Data transmission • Retransmit timeouts • RTT estimator • AIMD Congestion control • Throughput, loss, and RTT equation • Connection teardown • Protocol state machine

  23. Goals in Congestion Control • Achieve high utilization on links; don’t waste capacity! • Divide bottleneck link capacity fairly among users • Be stable: converge to a steady allocation among users • Avoid congestion collapse

  24. Congestion Collapse • Cliff behavior observed in [Jacobson 88] • [Figure: throughput (bps) vs. offered load (bps), with the knee and the congestion-collapse cliff marked]

  25. Congestion Requires Slowing Senders • Recall: bigger buffers cannot prevent congestion • Senders must slow to alleviate congestion • Absence of ACKs implicitly indicates congestion • TCP sender’s window size determines sending rate • Recall: correct window size is bottleneck bandwidth-delay product • How can sender learn this value? • Search for it, by adapting window size • Feedback from network: ACKs return (window OK) or do not return (window too big)

  26. Avoiding Congestion: Multiplicative Decrease • Recall that sender uses sending window of size min(cwnd, rwnd), where rwnd is receiver’s advertised window • Upon timeout for sent packet, sender presumes packet lost to congestion, and: • sets ssthresh = cwnd / 2 • sets cwnd = pktSize • uses slow start to grow cwnd up to ssthresh • End result: cwnd = cwnd / 2, via slow start • Sender sends one window per RTT; halving cwnd halves transmit rate

  27. Avoiding Congestion: Additive Increase • Drops indicate TCP sending more than its fair share of bottleneck • No feedback to indicate TCP using less than its fair share of bottleneck • Solution: speculatively increase window size as ACKs return • Additive increase: for each returning ACK, cwnd = cwnd + (pktSize × pktSize)/cwnd • Increases cwnd by ~pktSize bytes per RTT • Combined algorithm: Additive Increase, Multiplicative Decrease (AIMD)
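
The window rules from the multiplicative decrease and additive increase slides can be combined in a small Python sketch. This is an illustration under stated assumptions, not a full TCP sender: the class name `AimdSender`, the default sizes, and initializing ssthresh to the receiver's window are hypothetical choices, and fast retransmit is left out.

```python
class AimdSender:
    """Tracks cwnd and ssthresh in bytes, following the slides' update rules."""

    def __init__(self, pkt_size=1000, rwnd=64000):
        self.pkt_size = pkt_size
        self.rwnd = rwnd
        self.cwnd = float(pkt_size)   # slow start begins at one packet
        self.ssthresh = float(rwnd)   # assumption: no threshold until first loss

    def window(self):
        """Bytes the sender may have outstanding: min(cwnd, rwnd)."""
        return min(self.cwnd, self.rwnd)

    def on_ack(self):
        if self.cwnd < self.ssthresh:
            # Slow start: one packet per ACK, doubling cwnd each RTT.
            self.cwnd += self.pkt_size
        else:
            # Additive increase: about one packet per RTT in total.
            self.cwnd += self.pkt_size * self.pkt_size / self.cwnd

    def on_timeout(self):
        # Multiplicative decrease: remember half the window, restart slow start.
        self.ssthresh = self.cwnd / 2
        self.cwnd = float(self.pkt_size)
```

With cwnd at 16000 bytes and above ssthresh, one ACK adds 1000 × 1000 / 16000 = 62.5 bytes; sixteen such ACKs (one window's worth) add roughly one packet per RTT.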

  28. Refinement: Fast Retransmit (I) • Sender must wait well over RTT for timer to expire before loss detected • TCP’s minimum retransmit timeout: 1 second • Another loss indication: duplicate ACKs • Suppose sender sends 1, 2, 3, 4, but 2 lost • Receiver receives 1, 3, 4 • Receiver sends cumulative ACKs 2, 2, 2 • Loss causes duplicate ACKs!

  29. Fast Retransmit (II) • Upon arrival of 3 duplicate ACKs, sender: • sets cwnd = cwnd/2 • retransmits “missing” packet • no slow start • Not only loss causes dup ACKs • Reordering, too • [Figure: time-sequence diagram; A to B: data, seqno = 1; data, seqno = 513; data, seqno = 1025; data, seqno = 1537; B to A: ACK = 513; ACK = 513; ACK = 513; A to B: data, seqno = 513]
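
Duplicate-ACK detection at the sender can be sketched in Python as follows. The function name `fast_retransmit_events` is hypothetical, and the sketch counts only repeats after the first ACK for a given number as duplicates, which is an assumption about how "3 duplicate ACKs" is counted. The cwnd halving is omitted to keep the focus on loss detection.

```python
def fast_retransmit_events(acks, dup_threshold=3):
    """Return the ACK numbers at which the sender would fast-retransmit.

    acks          -- ACK numbers in the order they arrive at the sender
    dup_threshold -- duplicates needed to trigger a retransmission
    """
    last_ack = None
    dup_count = 0
    retransmits = []
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == dup_threshold:
                # The ACK number names the first missing byte: resend from there.
                retransmits.append(ack)
        else:
            # A new ACK number means progress: reset the duplicate counter.
            last_ack = ack
            dup_count = 0
    return retransmits
```

For example, `fast_retransmit_events([513, 513, 513, 513, 2049])` returns `[513]`, while mild reordering that produces only one or two duplicates triggers nothing.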

  30. AIMD in Action • Sender searches for correct window size

  31. Why AIMD? • Other control rules possible • E.g., MIMD, AIAD, … • Recall goals: • Links fully utilized (efficient) • Users share resources fairly • TCP adapts all flows’ window sizes independently • Must choose a control that will always converge to an efficient and fair allocation of windows

  32. Chiu-Jain Phase Plots • Consider two users sharing a bottleneck link • Plot bandwidths allocated to each • Efficiency: sum of two users’ rates fixed • Fairness: two users’ rates equal • Equi-Fairness: ratio of two users’ rates fixed • [Figure: phase plot of User 1 (bps) vs. User 2 (bps), showing the fairness line (AI), equi-fairness line (MI), efficiency line, optimum point, and overload and underload regions]

  33. Chiu-Jain: AIMD • AIMD converges to optimum efficiency and fairness • [Figure: phase plot with fairness line and efficiency line]

  34. Chiu-Jain: AIAD • AIAD doesn’t converge to optimum point! • Similar oscillations for MIMD • [Figure: phase plot with fairness line and efficiency line]
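
The convergence behavior in the phase plots can be reproduced with a small Python simulation of two users sharing one link. This is a simplified sketch: both users are assumed to receive the same overload signal in the same round and to react synchronously, and the function name `simulate`, the starting rates, and the step sizes are hypothetical.

```python
def simulate(x1, x2, capacity, rounds, increase, decrease):
    """Evolve two users' rates under a shared congestion signal.

    In each round, both users decrease if the link is overloaded
    (x1 + x2 > capacity) and increase otherwise.
    """
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = decrease(x1), decrease(x2)
        else:
            x1, x2 = increase(x1), increase(x2)
    return x1, x2


def add_one(x):
    return x + 1.0            # additive increase


def halve(x):
    return x / 2              # multiplicative decrease


def subtract_one(x):
    return max(x - 1.0, 0.0)  # additive decrease


# AIMD: the gap between the users halves at every decrease, so rates equalize.
aimd = simulate(10.0, 80.0, 100.0, 1000, add_one, halve)

# AIAD: increases and decreases both preserve the gap, so unfairness persists.
aiad = simulate(10.0, 80.0, 100.0, 1000, add_one, subtract_one)
```

Starting from rates of 10 and 80 on a link of capacity 100, the AIMD pair ends with nearly equal rates, while the AIAD pair still differs by exactly 70.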

  35. Outline • Packet header format • Connection establishment • Data transmission • Retransmit timeouts • RTT estimator • AIMD Congestion control • Throughput, loss, and RTT equation • Connection teardown • Protocol state machine

  36. Modeling Throughput, Loss, and RTT • How do packet loss rate and RTT affect throughput TCP achieves? • Assume: • only fast retransmits • no timeouts (so no slow starts in steady-state)

  37. Evolution of Window Over Time • Average window size: 3W/4 • One window sent per RTT • Bandwidth: • 3W/4 packets per RTT • (3W/4 × packet size) / RTT bytes per second • W depends on loss rate… • [Figure: window size vs. time, with levels W/2 and W marked]

  38. Loss and Window Size • Assume no delayed ACKs, fixed RTT • cwnd grows by one packet per RTT • So it takes W/2 RTTs to go from window size W/2 to window size W; this period is one cycle • How many packets sent in total? • ((3W/4) / RTT) × (W/2 × RTT) = 3W^2/8 • One loss per cycle (as window reaches W) • loss rate: p = 8/(3W^2) • W = sqrt(8/(3p))

  39. Throughput, Loss, and RTT Model • W = sqrt(8/(3p)) = (4/3) × sqrt(3/(2p)) • Recall: • Bandwidth: B = (3W/4 × packet size) / RTT • B = packet size / (RTT × sqrt(2p/3)) • Consequences: • Increased loss quickly reduces throughput • At same bottleneck, flow with longer RTT achieves less throughput than flow with shorter RTT!
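
The throughput model can be evaluated numerically with a short Python sketch. The function names `peak_window` and `tcp_throughput` are hypothetical, and the model's own assumptions carry over: only fast retransmits, no timeouts, fixed RTT, and one loss per cycle.

```python
from math import sqrt


def peak_window(loss_rate: float) -> float:
    """Peak window W in packets for loss rate p: W = sqrt(8 / (3p))."""
    return sqrt(8 / (3 * loss_rate))


def tcp_throughput(pkt_size_bytes: float, rtt_s: float, loss_rate: float) -> float:
    """Modeled throughput in bytes per second: B = pkt / (RTT * sqrt(2p / 3))."""
    return pkt_size_bytes / (rtt_s * sqrt(2 * loss_rate / 3))
```

Two consequences from the slide follow directly: `tcp_throughput(1460, 0.2, 0.01)` is half of `tcp_throughput(1460, 0.1, 0.01)`, and quadrupling the loss rate at a fixed RTT halves the modeled throughput.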

  40. Outline • Packet header format • Connection establishment • Data transmission • Retransmit timeouts • RTT estimator • AIMD Congestion control • Throughput, loss, and RTT equation • Connection teardown • Protocol state machine

  41. TCP: Connection Teardown • Data may flow bidirectionally • Each side independently decides when to close connection • In each direction, FIN answered by ACK • Must reliably terminate connection for both sides • During TIME_WAIT state at first side to send FIN, ACK valid FINs that arrive • Must avoid mixing data from old connection with new one • During TIME_WAIT state, disallow all new connections for 2 × max segment lifetime • [Figure: time-sequence diagram between A and B; labels in order: FIN, seqno = i; ACK = i+1; FIN, seqno = j; ACK = j+1; enter TIME_WAIT state]

  42. TCP: Protocol State Machine

  43. Summary: TCP and Congestion Control • Connection establishment and teardown • Robustness against delayed packets crucial • Round-trip time estimation • EWMAs estimate both RTT mean and deviation • Congestion detection at sender • Timeout: retransmit timer expires, halve window, slow start from one packet • Fast Retransmit: three duplicate ACKs, halve window, no slow start • Search for optimal sending window size • Additive increase, multiplicative decrease (AIMD) • AIMD converges to high utilization, fair sharing
