
Congestion control



  1. Congestion control Lecture 6 CS 653

  2. Why congestion control?

  3. Causes/costs of congestion: scenario 1 • two senders, two receivers • one router, infinite buffers • no retransmission • λin: original data; λout: delivered throughput • Result: large delays when congested; throughput saturates [Figure: Hosts A and B send at λin through one router with unlimited shared output link buffers]

  4. Causes/costs of congestion: scenario 2 • one router, finite shared output link buffers • sender retransmits lost packets • λin: original data; λ'in: original data plus retransmitted data; λout: goodput [Figure: Hosts A and B share a router with finite buffers]

  5. Causes/costs of congestion: scenario 2 • always: λin = λout (goodput) • “perfect” retransmission (only when loss): λ'in > λout • retransmission of delayed (not lost) packets makes λ'in larger (than the perfect case) for the same λout [Figure: three plots (a, b, c) of λout vs λin with both axes running to R/2: (a) ideal, λout reaches R/2; (b) λout saturates at R/3; (c) λout saturates at R/4] “costs” of congestion: • more work (retransmission) for a given “goodput” • unneeded retransmissions: the link carries multiple copies of a packet

  6. Causes/costs of congestion: scenario 3 • four senders • multihop paths • timeout/retransmit • λin: original data; λ'in: original data plus retransmitted data • finite shared output link buffers Q: what happens as λin and λ'in increase? [Figure: Hosts A and B among four senders on multihop paths through shared routers]

  7. Causes/costs of congestion: scenario 3 [Figure: λout collapses as offered load increases] Another “cost” of congestion: • when a packet is dropped, any “upstream” transmission capacity used for that packet was wasted!

  8. Two broad approaches towards congestion control End-end congestion control: • no explicit feedback from the network • congestion inferred from end-system observed loss, delay • approach taken by TCP Network-assisted congestion control: • routers provide feedback to end hosts • single bit indicating congestion (SNA, DECbit, ATM, TCP/IP ECN) • explicit rate at which the sender should send • recent proposals [XCP] [RCP] revisit ATM ideas

  9. TCP congestion control

  10. Components of TCP congestion control • Slow start: multiplicatively increase (double) the window each RTT • Congestion avoidance: additively increase the window (by 1 MSS each RTT) • Loss (triple duplicate ACKs): multiplicatively decrease (halve) the window • Timeout: set cwnd to 1 MSS; multiplicatively increase (double) the retransmission timeout upon each further consecutive loss
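
  A minimal sketch of these rules, assuming the usual ssthresh bookkeeping (the class and method names are illustrative, not from the slides; cwnd is in units of MSS):

      # Illustrative sketch of the window rules on slide 10.
      class TcpWindow:
          def __init__(self):
              self.cwnd = 1.0        # congestion window (MSS)
              self.ssthresh = 64.0   # slow-start threshold (MSS), assumed initial value
              self.rto = 1.0         # retransmission timeout (seconds)

          def on_ack(self):
              if self.cwnd < self.ssthresh:
                  self.cwnd += 1.0              # slow start: +1 per ACK, doubles per RTT
              else:
                  self.cwnd += 1.0 / self.cwnd  # congestion avoidance: +1 MSS per RTT

          def on_triple_dupack(self):
              self.ssthresh = self.cwnd / 2     # multiplicative decrease
              self.cwnd = self.ssthresh         # halve the window

          def on_timeout(self):
              self.ssthresh = self.cwnd / 2
              self.cwnd = 1.0                   # back to 1 MSS
              self.rto *= 2                     # double RTO on each consecutive loss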

  11. Retransmission timeout estimation • Calculate EstimatedRTT using an exponentially weighted moving average: EstimatedRTT_i = (1 − α)·EstimatedRTT_{i−1} + α·SampleRTT_i • Calculate the deviation with respect to the moving average: DevRTT_i = (1 − β)·DevRTT_{i−1} + β·|SampleRTT_i − EstimatedRTT_{i−1}| • Timeout = EstimatedRTT + 4·DevRTT
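
  A small sketch of this estimator; the gains α = 1/8 and β = 1/4 are the conventional choices (an assumption here, not given on the slide):

      # EWMA-based RTO estimation (slide 11); alpha/beta are assumed values.
      class RttEstimator:
          def __init__(self, first_sample, alpha=1/8, beta=1/4):
              self.alpha = alpha
              self.beta = beta
              self.estimated_rtt = first_sample
              self.dev_rtt = first_sample / 2

          def update(self, sample_rtt):
              # deviation is computed against the previous estimate
              self.dev_rtt = (1 - self.beta) * self.dev_rtt + \
                             self.beta * abs(sample_rtt - self.estimated_rtt)
              self.estimated_rtt = (1 - self.alpha) * self.estimated_rtt + \
                                   self.alpha * sample_rtt
              return self.estimated_rtt + 4 * self.dev_rtt  # the timeout value

  For example, RttEstimator(0.1).update(0.12) returns the timeout after one new 120 ms sample.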

  12. TCP Throughput

  13. TCP throughput: A very very simple model • What’s the average throughput of TCP as a function of window size W and RTT T? • Ignore slow start • Let W be the window size when loss occurs • When the window is W, throughput is W/T • Just after a loss, the window drops to W/2, throughput to W/2T • Average throughput: 3W/4T
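
  The 3W/4T figure is just the midpoint of the sawtooth, since the window grows linearly from W/2 back to W:

      \bar{B} \;=\; \frac{1}{T}\cdot\frac{1}{2}\left(\frac{W}{2} + W\right) \;=\; \frac{3W}{4T}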

  14. TCP throughput: A very simple model • But what is W when loss occurs? • When the window is w and the queue has q packets, TCP is sending at rate w/(T + q/C) • For maintaining utilization in steady state: • Just before loss, rate = W/(T + Q/C) = C • Just after loss, rate = W/2T = C • For Q = CT (a common rule of thumb for setting router buffer sizes), a loss occurs every (W/2)·(3W/4) = 3W²/8 packets C = link capacity in packets/sec; Q = queue capacity in number of packets
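
  Filling in the step the slide leaves implicit: with buffer Q = CT, the queue is full just before a loss and empty just after, and both rate conditions hold at capacity C:

      \frac{W}{T + Q/C} = \frac{W}{2T} = C \;\Rightarrow\; W = 2CT,
      \qquad
      \frac{W/2}{T} = \frac{CT}{T} = C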

  15. Deriving TCP throughput/loss relationship [Figure: sawtooth of TCP window size vs time (in RTTs), oscillating between W/2 and W; # packets sent per “period” = 3W²/8]

  16. Deriving TCP throughput/loss relationship • 1 packet lost per “period” implies: p_loss = 1/(3W²/8) = 8/(3W²) • or: W = √(8/(3·p_loss)) • so average throughput = (3/4)·(W/T) = √(3/2)/(T·√p_loss) ≈ 1.22/(T·√p_loss) [Figure: same window sawtooth between W/2 and W]

  17. Alternate fluid model • Rate of change of sending rate = a term inversely proportional to the current rate with probability (1 − p), minus a term proportional to the current rate with probability p: dx/dt = a·(1 − p)/x − b·p·x (a, b constants) • In steady state, dx/dt = 0 gives x = √(a(1 − p)/(b·p)), i.e., x ∝ 1/√p as in the previous derivation

  18. TCP throughput: A better loss rate based “simple” model [PFTK] • With many flows, loss rate and delay are not affected much by a single TCP flow • TCP behavior is completely specified by the loss and delay pattern along the path (bounded by bottleneck capacity) • Given loss rate p and delay T, what is TCP’s throughput B (packets/sec), taking timeouts into account?

  19. What is PFTK modeling? • Independent loss probability p across rounds • Loss ≡ triple duplicate ACKs • Bursty loss within a round: if some packet is lost, all following packets in that round are also lost • Timeout if fewer than three duplicate ACKs are received
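
  For reference, the full PFTK throughput approximation (from the Padhye et al. paper, not reproduced on the slide; b is the number of packets acknowledged per ACK and T_0 the base timeout):

      B(p) \;\approx\; \frac{1}{T\sqrt{\frac{2bp}{3}} \;+\; T_0\,\min\!\left(1,\, 3\sqrt{\frac{3bp}{8}}\right) p\,(1 + 32p^2)}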

  20. PFTK empirical validation: Low loss

  21. PFTK empirical validation: High loss

  22. Loss-based TCP • Evolution of loss-based TCP • Tahoe (without fast retransmit) • Reno (triple duplicate acks + fast retransmit) • NewReno (Reno + handling multiple losses better) • SACK (selective acknowledgment) common today • Q: what if loss not due to congestion?

  23. Delay-based TCP Vegas • Uses delay as a signal of congestion • Idea: try to keep a small constant number of packets in the bottleneck queue • Expected = W/BaseRTT • Actual = W/CurRTT • Diff = Expected − Actual • Try to keep Diff between fixed thresholds 1 and 3 • More recent FAST TCP is based on Vegas • Delay-based TCP is not widely used today
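
  A sketch of the per-RTT adjustment these bullets imply; note that Vegas compares Diff × BaseRTT (an estimated backlog in packets) against the 1-to-3 packet thresholds, a detail the slide compresses:

      # Vegas-style window adjustment (slide 23); names are illustrative.
      def vegas_update(cwnd, base_rtt, cur_rtt, alpha=1.0, beta=3.0):
          expected = cwnd / base_rtt             # rate if there were no queueing
          actual = cwnd / cur_rtt                # observed rate
          diff = (expected - actual) * base_rtt  # estimated packets queued at bottleneck
          if diff < alpha:
              return cwnd + 1                    # too little backlog: speed up
          elif diff > beta:
              return cwnd - 1                    # too much backlog: slow down
          return cwnd                            # within [alpha, beta]: hold steady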

  24. TCP-Friendliness • Can we try MyFavNew TCP? • Well, is it TCP-friendly? • Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet, or be isolated from TCP • To coexist with TCP, it must impose the same long-term load on the network: • No greater long-term throughput as a function of packet loss and delay, so TCP doesn’t suffer • Not significantly less long-term throughput, or it’s not too useful

  25. TCP friendly rate control (TFRC) • Use a model of TCP’s throughput as a function of the loss rate and RTT directly in a congestion control algorithm • If the transmission rate is higher than that given by the model, reduce the transmission rate to the model’s rate • Otherwise, increase the transmission rate • E.g., DCCP (Datagram Congestion Control Protocol), for unreliable congestion control • Q: how to measure/use loss rate and RTT?
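
  A sketch of this control loop, using the simple 1.22/(T√p) model from slide 16 in place of the full PFTK equation; the increase factor is an illustrative assumption:

      import math

      def tcp_model_rate(loss_rate, rtt):
          # simple-model TCP throughput in packets/sec (slide 16)
          return 1.22 / (rtt * math.sqrt(loss_rate))

      def tfrc_adjust(current_rate, loss_rate, rtt, increase_factor=1.1):
          allowed = tcp_model_rate(loss_rate, rtt)
          if current_rate > allowed:
              return allowed                     # clamp down to the model's rate
          return current_rate * increase_factor  # otherwise probe gently upward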

  26. High speed TCP

  27. TCP in high speed networks • Example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput • Requires window size W = 83,333 in-flight segments • Throughput in terms of loss rate: B = 1.22·MSS/(RTT·√p) • ➜ p = 2·10⁻¹⁰, or equivalently at most one drop every couple of hours! • New versions of TCP for high-speed networks needed!
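
  Checking the slide’s numbers:

      W = \frac{10^{10}\ \text{b/s} \times 0.1\ \text{s}}{1500 \times 8\ \text{b}} \approx 83{,}333
      \qquad
      \sqrt{p} = \frac{1.22 \times 1500 \times 8\ \text{b}}{0.1\ \text{s} \times 10^{10}\ \text{b/s}} \approx 1.5\times10^{-5}
      \;\Rightarrow\; p \approx 2\times10^{-10}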

  28. TCP’s long recovery delay • More than an hour to recover from a loss or timeout [Figure: additive-increase recovery after a loss: ~41,000 packets of window regained over ~60,000 RTTs ≈ 100 minutes]

  29. High-speed TCP • Proposals • Scalable TCP, HSTCP, FAST, CUBIC • General idea is to use superlinear window increase • Particularly useful in high bandwidth-delay product regimes

  30. Alternate choices of response functions • Scalable TCP: S = 0.15/p (throughput scales as 1/p rather than TCP’s 1/√p) • Q: Whatever happened to TCP-friendly?

  31. High speed TCP [Floyd] • additive increase, multiplicative decrease • increments, decrements depend on window size

  32. Scalable TCP (STCP) [T. Kelly] • multiplicative increase, multiplicative decrease • W ← W + a per ACK • W ← W − b·W per window with loss
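
  A sketch of the STCP update rules; the constants a = 0.01 and b = 0.125 are the values from Kelly’s proposal, an assumption here since the slide leaves them unspecified:

      # Scalable TCP (slide 32): MIMD window update.
      A, B = 0.01, 0.125   # Kelly's proposed constants (assumed)

      def stcp_on_ack(cwnd):
          return cwnd + A          # +a per ACK, i.e. +a*cwnd per RTT:
                                   # multiplicative (exponential) increase

      def stcp_on_loss(cwnd):
          return cwnd * (1 - B)    # multiplicative decrease: W <- W - b*W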

  33. STCP dynamics From 1st PFLDnet Workshop, Tom Kelly

  34. Active Queue Management

  35. Router Queue Management • normally, packets are dropped only when the queue overflows • “drop-tail” queueing [Figure: FCFS scheduler with queued packets P1–P6 at routers between ISPs and the Internet]

  36. The case against drop-tail queue management • Large queues in routers are “a bad thing” • Delay: end-to-end latency dominated by the length of queues at switches in the network • Allowing queues to overflow is “a bad thing” • Fairness: connections transmitting at high rates can starve connections transmitting at low rates • Utilization: connections can synchronize their response to congestion [Figure: FCFS scheduler with a full queue]

  37. Idea: early random packet drop • When the queue length exceeds a threshold, drop packets with a queue-length-dependent probability • probabilistic packet drop: flows see the same loss rate • problem: bursty traffic (a burst arriving when the queue is near the threshold) can be over-penalized [Figure: FCFS scheduler with queued packets]

  38. Random early detection (RED) packet drop • Use an exponential average of the queue length to determine when to drop • avoid overly penalizing short-term bursts • react to longer-term trends • Tie drop probability to the weighted average queue length • avoids over-reaction to mild overload conditions [Figure: average queue length vs time: below min threshold, no drop; between min and max thresholds, probabilistic early drop; above max threshold up to max queue length, forced drop]

  39. Random early detection (RED) packet drop [Figure: drop probability vs weighted average queue length: zero below the min threshold, rising linearly to max_p at the max threshold, then jumping to 100% (forced drop) up to the maximum queue length]
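
  A sketch of the RED drop decision described on these two slides; the weight and threshold values are illustrative (tuning them is exactly the difficulty slide 41 notes):

      import random

      # RED drop decision (slides 38-39); parameter values are assumptions.
      W_Q = 0.002              # EWMA weight for the average queue length
      MIN_TH, MAX_TH = 5, 15   # min/max thresholds, in packets
      MAX_P = 0.1              # drop probability at MAX_TH

      def update_avg(avg_qlen, qlen):
          # exponential average of instantaneous queue length
          return (1 - W_Q) * avg_qlen + W_Q * qlen

      def red_drop(avg_qlen):
          if avg_qlen < MIN_TH:
              return False                       # no drop
          if avg_qlen >= MAX_TH:
              return True                        # forced drop
          # linear ramp between the thresholds: probabilistic early drop
          p = MAX_P * (avg_qlen - MIN_TH) / (MAX_TH - MIN_TH)
          return random.random() < p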

  40. RED summary: why random drop? • Provide gentle transition from no-drop to all-drop • Provide “gentle” early warning • Avoid synchronized loss bursts among sources • Provide same loss rate to all sessions: • With tail-drop, low-sending-rate sessions can be completely starved

  41. Random early detection (RED) today • Many (5) parameters: nontrivial to tune (at least for HTTP traffic) • Gains over drop-tail FCFS not that significant • Still not widely deployed …

  42. Why is randomization important? • Synchronization of periodic routing updates • Periodic losses observed in end-end Internet traffic (source: Floyd, Jacobson 1994)

  43. Router update operation (time spent in each state depends on messages received from others: weak coupling between routers’ processing) [State machine: receive update from neighbor → process (time: TC2) → prepare own routing update (time: TC) → <ready> → send update (time: Td to arrive at dest), start_timer (uniform: Tp ± Tr) → wait; from wait, a timeout or link fail/update returns to prepare, and receiving an update from a neighbor returns to process]

  44. Router synchronization • 20 (simulated) routers broadcasting updates to each other • x-axis: time at which the routing update is sent, relative to the start of the round • By t = 100,000 all router rounds are of length 120! • synchronization, or lack thereof, depends on system parameters

  45. Avoiding synchronization • Add enough randomization to avoid synchronization • Choose the random timer component Tr large (e.g., several multiples of TC) [State machine as on slide 43: receive update from neighbor → process (time: TC2) → prepare own routing update (time: TC) → <ready> → send update (time: Td to arrive), start_timer (uniform: Tp ± Tr) → wait]
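
  The fix is one line of timer logic; Tp and Tr follow the slide’s notation, and the numeric values are illustrative:

      import random

      T_P = 120.0   # nominal update period (seconds) -- assumed value
      T_R = 60.0    # random component; slide 45 says to choose Tr large

      def next_update_delay():
          # uniform in [Tp - Tr, Tp + Tr], decoupling routers' update times
          return T_P + random.uniform(-T_R, T_R)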

  46. Randomization • Takeaway message: • randomization makes a system simple and robust

  47. Background transport: TCP Nice

  48. What are background transfers? • Data that humans are not waiting for • Non-deadline-critical • Unlimited demand • Examples • Prefetched traffic on the Web • File system backup • Large-scale data distribution services • Background software updates • Media file sharing

  49. Desired Properties • Utilization of spare network capacity • No interference with regular transfers • Self-interference • applications hurt their own performance • Cross-interference • applications hurt other applications’ performance

  50. TCP Nice • Goal: abstraction of free infinite bandwidth • Applications say what they want • OS manages resources and scheduling • Self tuning transport layer • Reduces risk of interference with foreground traffic • Significant utilization of spare capacity by background traffic • Simplifies application design
