Chapter 3: Transport Layer – Part B • Adapted from Computer Networking: A Top-Down Approach, 6th edition, Jim Kurose and Keith Ross, Addison-Wesley, March 2012
Outline • Principles of Congestion Control • TCP Congestion Control Review • Congestion Control – Beyond TCP
Principles of congestion control • congestion, informally: “too many sources sending too much data too fast for the network to handle” • different from flow control! • manifestations: lost packets (buffer overflow at routers), long delays (queueing in router buffers) • a top-10 problem!
Causes/costs of congestion: scenario 1 • two senders, two receivers • one router, infinite buffers • output link capacity: R • no retransmission • Host A and Host B share the router’s unlimited output link buffers • original data rate: λin; throughput: λout • maximum per-connection throughput: R/2 • large delays as the arrival rate λin approaches capacity (figures: λout vs. λin saturates at R/2; delay vs. λin grows without bound near R/2)
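To make the delay claim concrete, here is a minimal numeric sketch (not from the slides) that assumes a simple 1/(capacity − load) queueing-delay model purely to reproduce the shape of the delay curve; R, the per-connection share R/2, and the sampled loads are illustrative values.

```python
# Minimal sketch (assumed model, not from the slides): queueing delay blows up
# as the arrival rate lambda_in approaches the per-connection capacity R/2.

R = 1.0                      # output link capacity (normalized)
capacity_per_conn = R / 2    # two connections share the link

for load in [0.10, 0.25, 0.40, 0.45, 0.49, 0.499]:
    delay = 1.0 / (capacity_per_conn - load)   # grows without bound near R/2
    print(f"lambda_in = {load:5.3f}  ->  delay ~ {delay:8.1f}")
```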
Causes/costs of congestion: scenario 2 • one router, finite buffers • sender retransmits timed-out packets • application-layer input = application-layer output: λin = λout • transport-layer input includes retransmissions: λ'in ≥ λin • λin: original data; λ'in: original data plus retransmitted data • Host A and Host B share the router’s finite output link buffers
Causes/costs of congestion: scenario 2 • idealization: perfect knowledge – sender sends only when router buffer space is available, so nothing is lost (figure: λout vs. λin is a straight line capped at R/2)
Causes/costs of congestion: scenario 2 • idealization: known loss – packets can be lost (dropped at the router due to full buffers), but the sender resends only packets known to be lost
Causes/costs of congestion: scenario 2 • idealization: known loss (continued) – when sending at R/2, some packets are retransmissions, but asymptotic goodput is still R/2 (why?) (figure: λout vs. λ'in, approaching R/2)
Causes/costs of congestion: scenario 2 • realistic: duplicates – packets can be lost (dropped at the router due to full buffers); the sender may also time out prematurely, sending two copies, both of which are delivered • when sending at R/2, some packets are retransmissions, including duplicates that are delivered! (figure: λout vs. λ'in falls below R/2)
Causes/costs of congestion: scenario 2 • realistic: duplicates (continued) • “costs” of congestion: • more work (retransmissions) for a given “goodput” • unneeded retransmissions: the link carries multiple copies of a packet, decreasing goodput
Causes/costs of congestion: scenario 3 • four senders (Hosts A–D), multihop paths, timeout/retransmit, finite shared output link buffers • Q: what happens as λin and λ'in increase? • A: as red λ'in increases, all arriving blue packets at the upper queue are dropped, and blue throughput → 0
Causes/costs of congestion: scenario 3 • another “cost” of congestion: when a packet is dropped, any upstream transmission capacity used for that packet was wasted! (figure: λout vs. λ'in collapses toward 0 beyond C/2)
Approaches towards congestion control • two broad approaches: • end-end congestion control: no explicit feedback from the network; congestion inferred from end-system observed loss and delay; approach taken by TCP • network-assisted congestion control: routers provide feedback to end systems – a single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM) or an explicit rate for the sender to send at
TCP congestion control • goal: TCP sender should transmit as fast as possible, but without congesting the network • Q: how to find a rate just below the congestion level? • decentralized: each TCP sender sets its own rate, based on implicit feedback: • ACK: segment received (a good thing!), network not congested, so increase sending rate • lost segment: assume loss is due to a congested network, so decrease sending rate
TCP Congestion Control • Challenge • determining the available capacity in the first place (without additional protocols or APIs) • adjusting to changes in the available capacity (adjustments must be made quickly, since a large window may already be out on the network) • Implementation • increase CongestionWindow when congestion goes down (slowly) • decrease CongestionWindow when congestion goes up (quickly) • Question: how does the source determine whether or not the network is congested?
TCP congestion control: bandwidth probing • “probing for bandwidth”: increase the transmission rate on receipt of ACKs until loss eventually occurs, then decrease the transmission rate • continue to increase on ACK, decrease on loss (since available bandwidth changes as other connections come and go) • ACKs being received, so increase rate; loss, so decrease rate – this gives TCP’s “sawtooth” of sending rate over time • Q: how fast to increase/decrease? details to follow
TCP Congestion Control: details • sender limits transmission: LastByteSent − LastByteAcked ≤ cwnd • cwnd is dynamic, a function of perceived network congestion • sender sequence number space: last byte ACKed, bytes sent but not yet ACKed (“in-flight”), last byte sent • TCP sending rate, roughly: send cwnd bytes, wait one RTT for ACKs, then send more bytes, so rate ≈ cwnd / RTT bytes/sec
TCP Slow Start • when a connection begins, increase the rate exponentially until the first loss event: • initially cwnd = 1 MSS • double cwnd every RTT • done by incrementing cwnd by 1 MSS for every ACK received • summary: initial rate is slow but ramps up exponentially fast (figure: Host A sends one segment, then two, then four per RTT)
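A minimal sketch of the slow-start rule just described: cwnd starts at 1 MSS and grows by 1 MSS per ACK, which doubles it every RTT; the MSS value and function names are illustrative, not taken from any real TCP implementation.

```python
MSS = 1460                 # bytes per segment (typical value, assumed here)
cwnd = 1 * MSS             # slow start begins with one segment

def on_ack_slow_start():
    """Slow start: +1 MSS per ACK, so cwnd doubles every RTT."""
    global cwnd
    cwnd += MSS

def one_rtt():
    """Model one RTT in which every segment sent this round is ACKed."""
    for _ in range(cwnd // MSS):
        on_ack_slow_start()

for rtt in range(4):
    print(f"RTT {rtt}: cwnd = {cwnd // MSS} MSS")
    one_rtt()
# prints 1, 2, 4, 8 MSS: exponential growth until the first loss event
```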
TCP: detecting, reacting to loss • loss indicated by timeout: cwnd set to 1 MSS; window then grows exponentially (as in slow start) to a threshold, then grows linearly • loss indicated by 3 duplicate ACKs (TCP Reno): dup ACKs indicate the network is capable of delivering some segments; cwnd is cut in half, and the window then grows linearly • TCP Tahoe always sets cwnd to 1 MSS (whether loss is indicated by timeout or by 3 duplicate ACKs)
TCP: switching from slow start to CA • Q: when should the exponential increase switch to linear? • A: when cwnd gets to 1/2 of its value before loss • implementation: variable ssthresh • on a loss event, ssthresh is set to 1/2 of cwnd just before the loss event
TCP: congestion avoidance (AIMD) • when cwnd > ssthresh, grow cwnd linearly • increase cwnd by 1 MSS per RTT • approach possible congestion more slowly than in slow start • implementation: cwnd = cwnd + MSS·(MSS/cwnd) for each ACK received • AIMD – Additive Increase, Multiplicative Decrease: • ACKs: increase cwnd by 1 MSS per RTT (additive increase) • loss: cut cwnd in half (for loss detected by duplicate ACKs, not timeout) – multiplicative decrease
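A minimal sketch of the congestion-avoidance increment above: adding MSS·(MSS/cwnd) on each ACK yields roughly +1 MSS per RTT, and a loss signalled by duplicate ACKs halves cwnd; variable names and the starting window are illustrative.

```python
MSS = 1460
cwnd = 10 * MSS            # assume we are already in congestion avoidance

def on_ack_congestion_avoidance():
    """Additive increase: ~1 MSS per RTT, spread over one window's ACKs."""
    global cwnd
    cwnd += MSS * MSS // cwnd

def on_dup_ack_loss():
    """Multiplicative decrease: halve cwnd when loss is signalled by 3 dup ACKs."""
    global cwnd
    cwnd = max(cwnd // 2, MSS)

for _ in range(cwnd // MSS):       # roughly one RTT's worth of ACKs
    on_ack_congestion_avoidance()
print(cwnd)                        # grew by about one MSS (14600 -> ~16000)
on_dup_ack_loss()
print(cwnd)                        # cut roughly in half
```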
TCP AIMD (figure: cwnd sawtooth – additive increase between losses, multiplicative decrease on loss)
TCP Congestion Control • sender limits transmission: LastByteSent − LastByteAcked ≤ CongWin • roughly, rate = CongWin / RTT bytes/sec • CongWin is dynamic, a function of perceived network congestion • how does the sender perceive congestion? loss event = timeout or 3 duplicate ACKs • the TCP sender reduces its rate (CongWin) after a loss event • three mechanisms: slow start, AIMD, conservative behavior after timeout events
TCP Tahoe • slow start • congestion control upon timeout or duplicate ACKs • when the sender receives 3 duplicate ACKs for the same sequence number, it infers a loss • congestion window is reduced to 1 MSS and slow start is performed again • simple, but its congestion control is too aggressive
TCP Reno • Tahoe + fast retransmit • packet loss detected both through timeouts and through duplicate ACKs • on 3 duplicate ACKs, the sender reduces the window by half, ssthresh is set to half of the current window, and congestion avoidance is performed (the window increases by only 1 MSS every round-trip time) • fast recovery ensures that the pipe does not become empty • the window is cut down to 1 MSS (with subsequent slow start) only on a timeout
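A minimal sketch contrasting the Tahoe reaction (previous slide) with the Reno fast retransmit/fast recovery reaction described here when 3 duplicate ACKs arrive; the helper names and the ssthresh floor are illustrative, not a faithful RFC implementation.

```python
MSS = 1460

def on_triple_dup_ack_tahoe(cwnd, ssthresh):
    """Tahoe: any loss signal sends the connection back to slow start."""
    ssthresh = max(cwnd // 2, 2 * MSS)
    cwnd = 1 * MSS
    return cwnd, ssthresh

def on_triple_dup_ack_reno(cwnd, ssthresh):
    """Reno: retransmit the missing segment (fast retransmit), halve the
    window, and continue in congestion avoidance (fast recovery)."""
    ssthresh = max(cwnd // 2, 2 * MSS)
    cwnd = ssthresh                 # window halved, no return to slow start
    return cwnd, ssthresh

print(on_triple_dup_ack_tahoe(20 * MSS, 64 * MSS))   # cwnd back to 1 MSS
print(on_triple_dup_ack_reno(20 * MSS, 64 * MSS))    # cwnd halved to 10 MSS
```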
TCP New-Reno • TCP Reno with more intelligence during fast recovery • in TCP Reno, the first partial ACK brings the sender out of the fast recovery phase • this results in timeouts when there are multiple losses • in TCP New-Reno, a partial ACK is taken as an indication of another lost packet (which is immediately retransmitted) • the sender leaves fast recovery only after all packets outstanding at the time of the first loss have been ACKed
TCP SACK • TCP (Tahoe, Reno, and New-Reno) uses cumulative acknowledgements • when there are multiple losses, TCP Reno and New-Reno can retransmit only one lost packet per round-trip time • SACK enables the receiver to give the sender more information about which packets were received, allowing the sender to recover from multiple packet losses faster
TCP SACK (Example) • assume packets 5–25 are transmitted • let packets 5, 12, and 18 be lost • the receiver sends back CACK = 5 and SACK = (6–11, 13–17, 19–25) • the sender knows that packets 5, 12, and 18 are lost and retransmits them immediately
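A minimal sketch of how the sender in this example can infer the lost packets from the cumulative ACK and the SACK blocks; the packet-number representation of the SACK option is illustrative.

```python
def lost_packets(cum_ack, sack_blocks, highest_sent):
    """Packets at or above the cumulative ACK, up to the highest packet sent,
    that are not covered by any SACK block are inferred to be lost."""
    sacked = set()
    for lo, hi in sack_blocks:
        sacked.update(range(lo, hi + 1))
    return [p for p in range(cum_ack, highest_sent + 1) if p not in sacked]

# The example above: packets 5-25 sent, packets 5, 12 and 18 lost.
print(lost_packets(5, [(6, 11), (13, 17), (19, 25)], 25))   # -> [5, 12, 18]
```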
TCP Vegas • idea: the source watches for signs that some router’s queue is building up and congestion will happen soon, e.g.: • RTT is growing • sending rate flattens
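A minimal sketch of the Vegas idea: compare the throughput expected from the window and the base (minimum) RTT against the throughput actually measured; when the measured RTT grows and the actual rate flattens, back off before loss occurs. The alpha/beta thresholds and names are illustrative, not TCP Vegas' actual parameters.

```python
MSS = 1460

def vegas_adjust(cwnd, base_rtt, measured_rtt, alpha=1, beta=3):
    """Vegas-style congestion avoidance (illustrative thresholds, in MSS)."""
    expected = cwnd / base_rtt             # rate if no queue were building up
    actual = cwnd / measured_rtt           # rate actually being achieved
    diff = (expected - actual) * base_rtt  # ~ bytes of ours sitting in queues
    if diff < alpha * MSS:
        return cwnd + MSS                  # queues look empty: increase
    if diff > beta * MSS:
        return cwnd - MSS                  # queue building up: back off
    return cwnd                            # in the sweet spot: hold steady

print(vegas_adjust(10 * MSS, base_rtt=0.100, measured_rtt=0.101))  # increases
print(vegas_adjust(10 * MSS, base_rtt=0.100, measured_rtt=0.150))  # decreases
```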
Summary: TCP Congestion Control • when cwnd < ssthresh, the sender is in the slow-start phase; the window grows exponentially • when cwnd ≥ ssthresh, the sender is in the congestion-avoidance phase; the window grows linearly • when a triple duplicate ACK occurs, ssthresh is set to cwnd/2 and cwnd is set to ≈ ssthresh • when a timeout occurs, ssthresh is set to cwnd/2 and cwnd is set to 1 MSS
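The four summary rules above collected into one small event handler, as a minimal sketch (new ACK, triple duplicate ACK, timeout); it omits fast-recovery window inflation, SACK, and everything else a real TCP implements.

```python
MSS = 1460

class TcpCongestionControl:
    """Minimal sketch of the four summary rules above."""

    def __init__(self):
        self.cwnd = 1 * MSS            # start in slow start
        self.ssthresh = 64 * 1024      # arbitrary initial threshold

    def on_new_ack(self):
        if self.cwnd < self.ssthresh:              # slow start: exponential
            self.cwnd += MSS
        else:                                      # congestion avoidance: linear
            self.cwnd += MSS * MSS // self.cwnd

    def on_triple_dup_ack(self):
        self.ssthresh = max(self.cwnd // 2, 2 * MSS)
        self.cwnd = self.ssthresh                  # cwnd set to ~ssthresh

    def on_timeout(self):
        self.ssthresh = max(self.cwnd // 2, 2 * MSS)
        self.cwnd = 1 * MSS                        # back to slow start
```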
TCP throughput • average TCP throughput as a function of window size and RTT? • ignore slow start, assume there is always data to send • W: window size (measured in bytes) at which loss occurs • the window oscillates between W/2 and W, so the average window size (# in-flight bytes) is 3/4 W • average throughput is 3/4 W per RTT: avg TCP throughput = (3/4) · W / RTT bytes/sec
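A quick numeric check of the formula above; the window size and RTT below are illustrative values only.

```python
W = 100_000                          # bytes in flight when loss occurs (assumed)
RTT = 0.100                          # seconds (assumed)

avg_window = 0.75 * W                # window oscillates between W/2 and W
avg_throughput = avg_window / RTT    # bytes per second
print(f"{avg_throughput * 8 / 1e6:.1f} Mbps")   # -> 6.0 Mbps
```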
TCP Futures: TCP over “long, fat pipes” • example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput • requires W = 83,333 in-flight segments • throughput in terms of segment loss probability L [Mathis 1997]: TCP throughput = 1.22 · MSS / (RTT · √L) • to achieve 10 Gbps throughput, need a loss rate of L = 2·10⁻¹⁰ – a very small loss rate! • new versions of TCP are needed for such high-speed paths
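A quick check of the numbers on this slide, solving the Mathis formula for the window and the loss rate needed to sustain 10 Gbps:

```python
MSS_BITS = 1500 * 8     # 1500-byte segments, in bits
RTT = 0.100             # seconds
TARGET = 10e9           # 10 Gbps

# Window needed: throughput = W * MSS_BITS / RTT
W = TARGET * RTT / MSS_BITS
print(f"W = {W:,.0f} in-flight segments")          # ~83,333

# Mathis: throughput = 1.22 * MSS / (RTT * sqrt(L))  =>  L = (1.22*MSS/(RTT*T))^2
L = (1.22 * MSS_BITS / (RTT * TARGET)) ** 2
print(f"L = {L:.1e}")                              # ~2e-10
```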
TCP Fairness • fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K (figure: TCP connections 1 and 2 through a bottleneck router of capacity R)
Why is TCP fair? • two competing sessions: • additive increase gives a slope of 1 as throughput increases • multiplicative decrease reduces throughput proportionally • repeated rounds of congestion avoidance (additive increase) followed by loss (window decreased by a factor of 2) move the two connections’ throughputs toward the equal-bandwidth-share line (figure: connection 2 throughput vs. connection 1 throughput, bounded by R)
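A minimal sketch of the argument on this slide: two AIMD flows sharing a bottleneck of capacity R, each adding a fixed increment per round and halving whenever their combined rate exceeds R, converge toward an equal share no matter how unequally they start; the capacity, step size, and starting rates are illustrative.

```python
R = 100.0                     # bottleneck capacity (arbitrary units)
x1, x2 = 80.0, 10.0           # very unequal starting throughputs

for _ in range(40):
    if x1 + x2 > R:           # loss: both halve (multiplicative decrease)
        x1, x2 = x1 / 2, x2 / 2
    else:                     # no loss: both add the same step (additive increase)
        x1, x2 = x1 + 1.0, x2 + 1.0

print(round(x1, 1), round(x2, 1))   # x1 ~ x2: the flows converge to equal shares
```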
Fairness (more) • Fairness and UDP: multimedia apps often do not use TCP – they do not want their rate throttled by congestion control; instead they use UDP, sending audio/video at a constant rate and tolerating packet loss • Fairness and parallel TCP connections: an application can open multiple parallel connections between two hosts (web browsers do this) • e.g., a link of rate R with 9 existing connections: a new app asking for 1 TCP connection gets rate R/10, but a new app asking for 11 TCP connections gets about R/2 (11 of the 20 connections)