Sliding Window in Computer Networks
Presentation Transcript
TCP Tutorial- Part II - Internet Computing Laboratory @ KUT (http://icl.kut.ac.kr) Youn-Hee Han It is licensed under a Creative Commons Attribution 2.5 License
Sliding Window Computer Network
Using Sliding Window: Transport Versus Data Link Layer • Potentially connects many different hosts: need explicit connection establishment and termination • Potentially different RTT: need adaptive timeout mechanism • Potentially long delay in network: need to be prepared for arrival of very old segments • Potentially different capacity at destination: need to accommodate different node capacity (flow control) • Potentially different network capacity: need to be prepared for network congestion
Sliding Window • Sliding Window Used By TCP • Measured in byte positions • Illustration • Bytes through 2 have been sent and acknowledged • Bytes 3 through 6 have been sent but not yet acknowledged • Bytes 7 through 9 are waiting to be sent • Bytes above 9 lie outside the window and cannot be sent • TCP sliding window mechanism operates at the octet (byte) level • TCP allows the window size to vary over time • A variable-size window means that TCP provides “flow control”
Flow Control & TCP Window • The receiver controls the flow by telling the sender the size of its currently available buffer, measured in bytes • Each acknowledgement contains a window advertisement that specifies how many additional bytes of data the receiver is prepared to accept • The advertised window size represents the receiver’s currently available buffer space • The sender never sends more than the advertised window size, so the receiver buffer never overflows
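Flow Control & TCP Window • Sender = client, Receiver = server • [Figure: message timeline; the values shown beside the receiver are 0, 4000, 2000, 3000, consistent with the number of bytes held in a 4000-byte buffer] • Receiver → Sender: Ack: 1001, win: 4000 • Sender → Receiver: seq: 1001, 4000 bytes • Receiver → Sender: Ack: 5001, win: 0 • Receiver → Sender: Ack: 5001, win: 2000 • Sender → Receiver: seq: 5001, 1000 bytes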
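The exchange above can be replayed with a small sketch. It is only an illustration: the `Receiver` class, its method names, and the 4000-byte buffer size are assumptions chosen to match the numbers on the slide, not part of any real TCP stack.

```python
class Receiver:
    """Hypothetical receiver with a fixed-size buffer that advertises its free space."""

    def __init__(self, buffer_size, first_seq):
        self.buffer_size = buffer_size
        self.buffered = 0               # bytes held, not yet read by the application
        self.next_expected = first_seq  # next sequence number expected

    def window(self):
        """Advertised window = free buffer space in bytes."""
        return self.buffer_size - self.buffered

    def receive(self, seq, length):
        """Accept an in-order segment and return (ack, advertised window)."""
        if seq != self.next_expected:
            raise ValueError("this sketch handles in-order data only")
        if length > self.window():
            raise ValueError("sender exceeded the advertised window")
        self.buffered += length
        self.next_expected += length
        return self.next_expected, self.window()

    def app_read(self, length):
        """The application consumes data, which frees buffer space."""
        self.buffered -= min(length, self.buffered)
        return self.next_expected, self.window()


rx = Receiver(buffer_size=4000, first_seq=1001)
trace = [
    rx.receive(1001, 4000),  # buffer is now full, window closes
    rx.app_read(2000),       # application drains 2000 bytes, window reopens
    rx.receive(5001, 1000),  # sender uses part of the reopened window
]
print(trace)
```

The first two entries of `trace` correspond to the "Ack: 5001, win: 0" and "Ack: 5001, win: 2000" messages on the slide; the third is the ACK the receiver would send next under these assumptions.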
Window Size Issue • See: http://icl.kut.ac.kr/2007_1/G_Course/tcp.shtml • Default Window Size
MSS (Maximum Segment Size)
MSS (Maximum Segment Size) • Overview • Maximum Transmission Unit (MTU) is defined by the maximum payload size of the Layer 2 frame. • MTU determines the maximum size of a Layer 3 packet/fragment. • Layer 3 payload determines Layer 4 Maximum Segment Size (MSS)
MSS (Maximum Segment Size) • Overview • MSS: Maximum Segment Size • Largest payload size that TCP can send for this connection. • Usually, MSS is calculated as the Maximum Transmission Unit (MTU) minus 40 bytes (a 20-byte IP header plus a 20-byte TCP header).
MSS (Maximum Segment Size) • Overview • An example of MSS negotiation • In this example, both sides use 960 bytes as MSS.
Link MTU • Link MTU • The maximum packet size that can be transmitted over a link • If a router receives a packet that is bigger than its outbound link MTU, it must fragment the packet. • Most modern router and link implementations support an MTU of 1500 • but some older routers (e.g., on international links) do not support 1500.
Path MTU • Path MTU • The minimum link MTU of all links in a path between a source and a destination • The source host can fragment upper-layer payloads whose packet size is larger than the Path MTU • All IP hosts (and routers) are required to accept or reassemble fragments of 576 octets, so the default (and safe) value of Path MTU is 576 • Path MTU Discovery • Used to send packets bigger than 576 bytes • Increasing the Path MTU • To detect increases in a path’s PMTU, a node periodically increases its assumed PMTU • An increase must not be attempted less than 5 minutes after an ICMP Packet Too Big message has been received (recommended: 10 minutes) • A minimal implementation can omit Path MTU Discovery as long as all packets are kept to 576 bytes
Path MTU • Path MTU Discovery • [Figure: source node and a path whose links are labeled with MTUs of 1500, 1600, 1400, and 576] • 1. Source node initially assumes PMTU = MTU of first hop = 1500 • 2. ICMP Packet Too Big message (MTU = 1400) (Note: packet discarded) • 3. Source node assumes PMTU = MTU notified by ICMP = 1400 • 4. ICMP Packet Too Big message (MTU = 576) (Note: packet discarded) • 5. Source node assumes PMTU = MTU notified by ICMP = 576
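The five discovery steps can be traced with a short simulation. This is a sketch under stated assumptions: `discover_path_mtu` is a hypothetical helper, the path is modeled as a plain list of link MTUs, and the order of the links (1500, 1600, 1400, 576) is assumed, since the figure itself is not available.

```python
def discover_path_mtu(link_mtus):
    """Return (path_mtu, probes) for a path given as a list of link MTUs.

    The source starts with the first-hop MTU. Each time a link on the path
    is too small, the packet is discarded and the source adopts the MTU
    reported in the (simulated) ICMP Packet Too Big message.
    """
    pmtu = link_mtus[0]
    probes = []
    while True:
        probes.append(pmtu)
        # First link along the path that cannot carry a packet of size pmtu.
        too_small = next((mtu for mtu in link_mtus if mtu < pmtu), None)
        if too_small is None:
            return pmtu, probes  # the packet reached the destination
        pmtu = too_small         # MTU notified by ICMP


path_mtu, probes = discover_path_mtu([1500, 1600, 1400, 576])
print(path_mtu, probes)
```

Under these assumptions the source tries 1500, then 1400, then 576, which mirrors steps 1, 3, and 5 on the slide.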
Path MTU • How to get Path MTU by yourself? • ping <IP-Address/Domain Name> -f -l <estimated MTU - 28> (Windows ping options) • 28 represents the IP header (20 bytes) and the ICMP header (8 bytes) • -f: Don’t Fragment • -l: Payload Size • Example: ping -f -l 1472 www.yahoo.com … Reply from 209.131.36.158: bytes=1472 time=141ms TTL=50 … • ping -f -l 1473 www.yahoo.com … Packet needs to be fragmented but DF set …
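The payload arithmetic, and the search one would do by hand with repeated pings, can be sketched as follows. The function names are hypothetical, and `fits` is a stand-in predicate supplied by the caller (it does not send any packets); in practice it would be answered by running the ping command above.

```python
IP_HEADER = 20    # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes


def ping_payload_for_mtu(mtu):
    """Payload size to pass to ping so the whole packet is exactly `mtu` bytes."""
    return mtu - IP_HEADER - ICMP_HEADER


def mtu_from_ping_payload(payload):
    """Path MTU implied by the largest payload that was not fragmented."""
    return payload + IP_HEADER + ICMP_HEADER


def largest_unfragmented_payload(fits, low, high):
    """Binary-search the largest payload in [low, high] for which fits(payload) is true.

    Assumes fits(low) is true and that fits never becomes true again once it is false.
    """
    while low < high:
        mid = (low + high + 1) // 2
        if fits(mid):
            low = mid
        else:
            high = mid - 1
    return low


# Stand-in for the slide's result: 1472 passes, 1473 needs fragmentation.
best = largest_unfragmented_payload(lambda size: size <= 1472, low=548, high=1972)
print(best, mtu_from_ping_payload(best))
```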
Path MTU & MSS • How to determine TCP MSS • SndMSS = MIN(Path MTU - sizeof(TCPHDR) - sizeof(IPHDR), Advertised MSS) • Case I: both the IP header and the TCP header are minimum size, that is, 20 octets each • SndMSS = MIN(576 - 20 - 20, Advertised MSS) = MIN(536, Advertised MSS) • Case II: the IP Security option (11 octets) is in use • SndMSS = MIN(576 - 20 - 20 - 11, Advertised MSS) = MIN(525, Advertised MSS) • In the modern Internet, the path MTU is usually 1500 and the MSS can be 1460 • Self-check: http://www.speedguide.net:8080
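The SndMSS formula translates directly into code. A minimal sketch, in which the function name `snd_mss` and its keyword parameters are illustrative choices rather than any stack's API:

```python
def snd_mss(path_mtu, advertised_mss, ip_header=20, tcp_header=20, ip_options=0):
    """SndMSS = MIN(Path MTU - sizeof(TCPHDR) - sizeof(IPHDR), Advertised MSS).

    ip_options is the size in octets of any IP options in use.
    """
    return min(path_mtu - ip_header - tcp_header - ip_options, advertised_mss)


# Case I: minimum-size headers on a 576-byte path.
case_1 = snd_mss(576, advertised_mss=1460)
# Case II: the 11-octet IP Security option is in use.
case_2 = snd_mss(576, advertised_mss=1460, ip_options=11)
# Typical modern path: MTU 1500.
modern = snd_mss(1500, advertised_mss=1460)
print(case_1, case_2, modern)
```

When the peer advertises a smaller value (for instance 960, as in the negotiation example earlier), the advertised MSS wins.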
TCP in Action
Reliability in TCP • Checksum used to detect bit level errors • Sequence numbers used to detect sequencing errors • Duplicates are ignored • Out of order packets are reordered (or dropped) • Lost packets are retransmitted • Timeouts used to detect lost packets • Requires RTT calculation • Requires sender to maintain data until it is ACKed
TCP in Action • The sending node will: • split the data sequence into packets • for each packet, give: • destination address (for routers) • source address (for replies) • sequence number (for reconstruction) • checksum (for error detection) • send the packets • The receiving node will: • discard any corrupted packets (for which the checksum doesn't agree) • request retransmission of any missing packets • restore packets to original order • reconstruct the original byte stream
TCP data exchange: simple telnet scenario • Seq. # is the byte-stream “number” of the first byte in the segment’s data • Ack. # is the seq # of the next byte expected from the other side (cumulative ACK) • [Figure: timeline between Site 1 and Site 2] • User types ‘C’: Seq=42, ACK=79, data = ‘C’ • Host ACKs receipt of ‘C’ and echoes back ‘C’: Seq=79, ACK=43, data = ‘C’ • Host ACKs receipt of echoed ‘C’: Seq=43, ACK=80
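The sequence and acknowledgement numbers of the telnet scenario follow from two counters per side. The sketch below is illustrative only: `Endpoint` is a hypothetical class, segments are plain dictionaries, and loss, reordering, and the handshake are ignored.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    next_seq: int       # sequence number of the next byte this side will send
    next_expected: int  # sequence number of the next byte expected from the peer

    def send(self, data=b""):
        """Build a segment; the Ack field always carries next_expected."""
        segment = {"seq": self.next_seq, "ack": self.next_expected, "data": data}
        self.next_seq += len(data)
        return segment

    def receive(self, segment):
        """Advance next_expected when in-order data arrives."""
        if segment["seq"] == self.next_expected:
            self.next_expected += len(segment["data"])


user = Endpoint(next_seq=42, next_expected=79)
host = Endpoint(next_seq=79, next_expected=42)

typed = user.send(b"C")   # user types 'C'
host.receive(typed)
echo = host.send(b"C")    # host ACKs 'C' and echoes it back
user.receive(echo)
final_ack = user.send()   # user side ACKs the echo; no data
print(typed, echo, final_ack)
```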
Sequence Number • Indicates the position of a segment’s data in the byte stream • Every byte is sequenced • Used for re-ordering packets and finding lost packets • The Initial Sequence Number (ISN) is randomly assigned for every TCP connection • For security reasons, the ISN should be hard to guess • [Note] • SYN and FIN packets also consume 1 sequence number each, although they do not include any data.
Cumulative Acknowledgement • TCP Ack specifies the sequence number of the next octet that the receiver expects to receive • An acknowledgment of sequence number X indicates that all bytes up to but not including X have been received. • TCP Ack is called cumulative because it reports how much of the stream has accumulated • Pros. • Ack is easy to generate unambiguously • Cons. • The sender does not receive information about all successful transmissions
Cumulative Acknowledgement • Consider the following scenario (1/3) • [Figure: sender/receiver timeline with a Timeout marker; positions 101, 201, 301, 401, 501, 601 are labeled on the receiver side] • Sender transmits six 100-byte segments: Seq. # = 101, 201, 301, 401, 501, 601 • Receiver returns Ack. # = 201, 401, 501, 601, 701
Cumulative Acknowledgement • Consider the following scenario (2/3) • [Figure: sender/receiver timeline with a Timeout marker; positions 101, 201, 301, 401, 501, 601 are labeled on the receiver side] • Sender transmits six 100-byte segments: Seq. # = 101, 201, 301, 401, 501, 601 • Receiver returns Ack. # = 201 five times • Sender retransmits Seq. # = 201, 301, 401, 501, 601 (100 bytes each)
Cumulative Acknowledgement • Consider the following scenario (3/3) • [Figure: sender/receiver timeline with a Timeout marker; positions 101, 201, 301, 401, 501, 601 are labeled on the receiver side] • Sender transmits six 100-byte segments: Seq. # = 101, 201, 301, 401, 501, 601 • Receiver returns Ack. # = 201 five times, then Ack. # = 701 • Sender retransmits Seq. # = 201 and Seq. # = 301 (100 bytes each) • Duplicate ACKs & Fast Retransmit
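How a receiver arrives at these acknowledgement numbers can be sketched in a few lines. Assumptions: `cumulative_acks` is a hypothetical helper, all segments have the same length, and the receiver buffers out-of-order segments rather than dropping them. The arrival order used below (segment 201 lost, then retransmitted last) is one plausible reading of the scenario, not a transcription of the figure.

```python
def cumulative_acks(arrivals, first_seq, segment_len):
    """Return the Ack number sent after each arriving segment.

    arrivals: sequence numbers of segments in the order they reach the receiver.
    The Ack number is always the next byte the receiver expects in order.
    """
    buffered = set()           # out-of-order segments waiting for a gap to fill
    next_expected = first_seq
    acks = []
    for seq in arrivals:
        if seq >= next_expected:
            buffered.add(seq)
        # Deliver every segment that is now contiguous with the stream.
        while next_expected in buffered:
            buffered.remove(next_expected)
            next_expected += segment_len
        acks.append(next_expected)
    return acks


in_order = cumulative_acks([101, 201, 301, 401, 501, 601], 101, 100)
one_loss = cumulative_acks([101, 301, 401, 501, 601, 201], 101, 100)
print(in_order)
print(one_loss)
```

With segment 201 missing, every later arrival repeats Ack 201; once the retransmitted 201 fills the gap, a single Ack 701 covers all buffered data.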
Duplicate ACK and Fast Retransmit • How does the TCP sender detect segment loss? • Timeout • Receipt of duplicate ACKs from the receiver • Duplicate ACK • The sender sends a sequence of segments to the receiver • The receiver fails to receive an expected segment • The receiver sends “duplicate ACKs” • Example • The sender sends segments #1 to #8 to the receiver • The receiver does not receive segment #5, but receives segment #6 • In response to segment #6, the receiver still sends an ACK for segment #5 (the next segment it expects) • Although the receiver then receives segments #7 and #8, it keeps sending ACKs for segment #5
Duplicate ACK and Fast Retransmit • The cases of sending “Duplicate ACK” • CASE I: The segments are simply out of order • While one or two duplicate ACKs are arriving, the reordered segment is likely to reach the receiver, so the sender will probably receive the ACK it originally expected • CASE II: The segment is lost • The sender receives many duplicate ACKs in succession • “Three” Duplicate ACKs and Fast Retransmit • One or two duplicate ACKs alone do not distinguish CASE I from CASE II. • It is highly likely that the segment was lost if the sender receives three duplicate acknowledgements (the duplicate ACK threshold = 3) • That is, if the sender receives three duplicate ACKs, it should retransmit the corresponding segment immediately, without waiting for the retransmission timer to expire.
Duplicate ACK and Fast Retransmit • The time-out period is often relatively long: long delay before resending a lost packet • Detect lost segments via “duplicate ACKs” • The sender often sends many segments back-to-back • If a segment is lost, there will likely be many duplicate ACKs • If the sender receives 3 ACKs for the same data, it supposes that the segment was lost • Fast retransmit: resend the segment before the timer expires
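The sender-side bookkeeping for fast retransmit amounts to counting repeated ACK numbers. A minimal sketch, assuming the threshold of three duplicate ACKs described earlier; `fast_retransmits` is a hypothetical helper and ignores timers and congestion control entirely.

```python
DUP_ACK_THRESHOLD = 3  # duplicate ACKs needed before retransmitting


def fast_retransmits(acks):
    """Return the sequence numbers a sender would fast-retransmit.

    acks: Ack numbers in the order the sender receives them.
    The first ACK for a value is not a duplicate; only repeats are counted.
    """
    last_ack = None
    dup_count = 0
    retransmitted = []
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == DUP_ACK_THRESHOLD:
                retransmitted.append(ack)  # resend the segment starting at `ack`
        else:
            last_ack = ack
            dup_count = 0
    return retransmitted


lost_segment = fast_retransmits([201, 201, 201, 201, 201, 701])
reordered = fast_retransmits([201, 201, 201, 401])
print(lost_segment, reordered)
```

In the second call only two duplicates arrive before the ACK advances, so nothing is retransmitted; this corresponds to the out-of-order case.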
Delayed Ack • TCP has a rule like the following: • If you send me two packets, I will send you one acknowledgement (ACK). • If you send me one packet, I will wait up to 200 ms before I respond with an ACK. (The IETF RFC sets an upper bound of 500 ms) • Delayed ACK (Optional, But Recommended) • The receiver sends an ACK for every two segments received. • The receiver sends an ACK within 500 ms at the latest. • If the receiver has data to send to the sender within those 500 ms, it piggybacks the ACK on that data segment • This rule reduces the number of unnecessary ACKs.
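TCP Scenario (1/5) Segment Corruption • [Figure: sender/receiver timeline] • Sender → Receiver: Seq: 1001, 200 bytes (OK) • Sender → Receiver: Seq: 1201, 200 bytes (OK) • Sender → Receiver: Seq: 1401, 200 bytes (segment 3, corrupted) • Receiver → Sender: ACK: 1401 • Timeout at the sender • Sender → Receiver: Seq: 1401, 200 bytes (OK) • Receiver → Sender: ACK: 1601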
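TCP Scenario (2/5) Lost Segment • [Figure: sender/receiver timeline] • Sender → Receiver: Seq: 1001, 200 bytes (OK) • Sender → Receiver: Seq: 1201, 200 bytes (OK) • Sender → Receiver: Seq: 1401, 200 bytes (segment 3, lost) • Receiver → Sender: ACK: 1401 • Timeout at the sender • Sender → Receiver: Seq: 1401, 200 bytes (OK) • Receiver → Sender: ACK: 1601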
TCP Scenario (3/5)
TCP Scenario (4/5)
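TCP Scenario (5/5) Cumulative Ack Scenario II • [Figure: sender/receiver timeline] • Sender → Receiver: Seq: 1001, 200 bytes (OK) • Sender → Receiver: Seq: 1201, 200 bytes (OK) • Sender → Receiver: Seq: 1401, 200 bytes (OK) • Receiver → Sender: ACK: 1401 (acknowledgement lost) • Receiver → Sender: ACK: 1601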
TCP ACK generation [RFC 1122, RFC 2581] • Event at receiver: arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed. TCP receiver action: delayed ACK; wait up to 500 ms for next segment; if no next segment, send ACK • Event at receiver: arrival of in-order segment with expected seq #; one other segment has ACK pending. TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments • Event at receiver: arrival of out-of-order segment with higher-than-expected seq #; gap detected. TCP receiver action: immediately send duplicate ACK, indicating seq # of next expected byte • Event at receiver: arrival of segment that partially or completely fills gap. TCP receiver action: immediately send single cumulative ACK
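The four event/action pairs can be written as a small decision function. This is only a sketch of the table: `receiver_action`, its parameters, and the returned strings are illustrative, and the "fills gap" case is simplified to a segment that starts exactly at the expected sequence number while a gap exists.

```python
def receiver_action(seq, expected_seq, ack_pending=False, gap_exists=False):
    """Map an arriving segment to the receiver action from the table.

    seq:          sequence number of the arriving segment
    expected_seq: next in-order sequence number the receiver expects
    ack_pending:  an earlier in-order segment has not been ACKed yet
    gap_exists:   out-of-order data is already buffered above expected_seq
    """
    if seq > expected_seq:
        return "immediately send duplicate ACK for expected_seq"
    if seq == expected_seq and gap_exists:
        return "immediately send single cumulative ACK"
    if seq == expected_seq and ack_pending:
        return "immediately send single cumulative ACK covering both segments"
    if seq == expected_seq:
        return "delayed ACK: wait up to 500 ms for next segment"
    return "not covered by the table (old or duplicate data)"


print(receiver_action(1001, 1001))
print(receiver_action(1201, 1201, ack_pending=True))
print(receiver_action(1601, 1401))
print(receiver_action(1401, 1401, gap_exists=True))
```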
TCP Timeout and Retransmission
Internet Environment • Designed for Internet environment • Delays on one connection vary over time • Delays vary widely between connections • [Figure: real-time plot of measured round-trip time (RTT) delay from lancelet.caida.org (in Ann Arbor, MI) to www.ucsd.edu (in San Diego, CA)]
Timeout and Retransmission • Fixed value for timeout will fail • Waiting too long introduces unnecessary delay • Not waiting long enough (early timeout) wastes network bandwidth with unnecessary retransmission • Retransmission strategy must be adaptive
RTT (Round-trip Time) • TCP keeps an estimate of the round-trip time (RTT) • RTT: • The time at which an ACK was received minus the time at which the corresponding data was sent • The “Timeout Interval” is calculated from the RTT • It is derived from the observed RTT • The appropriate retransmission time differs greatly from one communication path to another.
Adaptive Retransmission • Difficulties with adaptive retransmission • Segments or ACKs can be lost or delayed, making round-trip estimation difficult or inaccurate • Round-trip times vary over several orders of magnitude between different connections • Traffic is bursty, so round-trip times fluctuate wildly on a single connection • Retransmission can cause congestion on routers or hosts
RTT Smoothing • Solution: Smoothing • Adaptive retransmission schemes keep a statistically smoothed round-trip estimate • Smoothing keeps running average from fluctuating wildly, and keeps TCP from overreacting to change • Difficulty: choice of smoothing scheme
RTT Smoothing • Smoothing Scheme • Let “EstimatedRTT” be current (old) average round-trip time • Let “NewRTT” be a new sample • Compute • EstimatedRTT = a * EstimatedRTT + b * NewRTT • where a + b = 1 • Recommended values [RFC2988]: a = 0.875, b = 0.125 (=1/8) • Large a makes estimate less susceptible to a single long delay (more stable) • Large b makes estimate track changes in round-trip time quickly
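The smoothing scheme is a one-line update applied to each new sample. A minimal sketch: `smooth_rtt` is a hypothetical helper, the RTT values are made-up milliseconds, and seeding the estimate with the first sample is an assumption the slide does not state.

```python
def smooth_rtt(samples, a=0.875):
    """Return the EstimatedRTT after each sample.

    EstimatedRTT = a * EstimatedRTT + b * NewRTT, with b = 1 - a.
    The first sample is used as the initial estimate (an assumption).
    """
    b = 1 - a
    estimated = samples[0]
    history = [estimated]
    for new_rtt in samples[1:]:
        estimated = a * estimated + b * new_rtt
        history.append(estimated)
    return history


samples = [100, 180, 100]            # one long delay in the middle
stable = smooth_rtt(samples)         # a = 0.875: barely moved by the spike
reactive = smooth_rtt(samples, a=0.1)  # large b: tracks each sample closely
print(stable)
print(reactive)
```

With the recommended values the 180 ms sample moves the estimate only from 100 to 110, whereas with b = 0.9 the estimate jumps to about 172.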
TCP Timeout and Retransmission • [Figure: smoothed RTT]
Smoothing Effect • Use Moving Average! • Smoothed SNR = a * SNR_p + b * SNR_c • SNR_p = Previous SNR • SNR_c = Current SNR • [Figure: measured SNR values #1-2 (axes in dB and ms) with smoothed curves for b = 0.1, b = 0.3, b = 0.6, and b = 0.9]
Original Algorithm • Adaptive Retransmission • Compute • EstimatedRTT = a * EstimatedRTT + b * NewRTT • where a + b = 1 • Recommended values [RFC2988]: a = 0.875, b = 0.125 (=1/8) • Set timeout based on EstimatedRTT • TimeOut Interval = 2 * EstimatedRTT
Jacobson/Karels Algorithm • Jacobson/Karels Algorithm • Finer determination of the timeout interval • EstimatedRTT plus a “safety margin” • Large variation in EstimatedRTT → larger safety margin • DevRTT = (1 - β) * DevRTT + β * |NewRTT - EstimatedRTT| (typically, β = 0.25) • Then set the improved timeout interval: • Timeout Interval = EstimatedRTT + 4 * DevRTT • DevRTT is a good approximation of the standard deviation • By using DevRTT, we can avoid computing a square root.
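The two updates and the timeout formula can be combined into one loop. A sketch under stated assumptions: `jacobson_karels` is a hypothetical helper; the initial values (EstimatedRTT = first sample, DevRTT = half of it) and the choice to update DevRTT before EstimatedRTT follow common practice and are not specified on the slide; the samples are made-up milliseconds.

```python
def jacobson_karels(samples, alpha=0.125, beta=0.25):
    """Return (EstimatedRTT, DevRTT, TimeoutInterval) after each sample.

    alpha plays the role of b in the smoothing formula (weight of the new sample).
    Assumed initialisation: EstimatedRTT = first sample, DevRTT = first sample / 2.
    """
    estimated = samples[0]
    dev = samples[0] / 2
    history = [(estimated, dev, estimated + 4 * dev)]
    for new_rtt in samples[1:]:
        # Deviation is measured against the estimate before it is updated.
        dev = (1 - beta) * dev + beta * abs(new_rtt - estimated)
        estimated = (1 - alpha) * estimated + alpha * new_rtt
        history.append((estimated, dev, estimated + 4 * dev))
    return history


for estimated, dev, timeout in jacobson_karels([100, 120]):
    print(estimated, dev, timeout)
```

Steady samples shrink DevRTT and pull the timeout toward EstimatedRTT, while jittery samples widen the safety margin.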
TCP Timeout Interval based on Jacobson/Karels Algorithm • Measurement of Internet delays for 100 successive packets at 1-second intervals • TCP round-trip estimation for sampled Internet delays
Retransmission Ambiguity • [Figure: two timelines between hosts A and B; in each, an original transmission is followed by an RTO, a retransmission, and an ACK, and a Sample RTT interval is marked] • Which one is correct? So, what should we do?
Karn’s Algorithm • Karn’s Algorithm • Improves the accuracy of the RTT measurement. • RTT measurement with packet loss • [Figure: timeline showing Data, Retransmission, and Ack, with durations A and B marked] • Duration A: use the most recent retransmission for RTT measurement. • Duration B: use the original transmission for RTT measurement. • Which duration is suitable for RTT measurement? • Example 1: • We should use "duration A" as RTT in this case. • But this assumption is not always correct.
Karn’s Algorithm • Karn’s Algorithm • Example 2: • [Figure: timeline showing Data, Retransmission, and Ack, with durations A and B marked] • We cannot use "duration A" as RTT in this case! • So, what should we do?
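Karn's algorithm resolves the ambiguity by not sampling at all: an ACK for a segment that was retransmitted is never used to update the RTT estimate, and the timeout is backed off (doubled) on each retransmission until a clean sample arrives. The sketch below illustrates that rule; `KarnEstimator` and its method names are hypothetical, and the timeout uses the simple 2 * EstimatedRTT rule from the original algorithm.

```python
class KarnEstimator:
    """Hypothetical RTT estimator that applies Karn's rule."""

    def __init__(self, initial_rtt, a=0.875):
        self.a = a
        self.estimated_rtt = initial_rtt
        self.timeout = 2 * initial_rtt

    def on_timeout(self):
        """A retransmission timer expired: back off instead of re-estimating."""
        self.timeout *= 2

    def on_ack(self, sample_rtt, was_retransmitted):
        """Update the estimate only for segments that were sent exactly once."""
        if was_retransmitted:
            return  # ambiguous sample (duration A or B?), so discard it
        self.estimated_rtt = self.a * self.estimated_rtt + (1 - self.a) * sample_rtt
        self.timeout = 2 * self.estimated_rtt


estimator = KarnEstimator(initial_rtt=100)
estimator.on_timeout()                             # segment lost, timer doubles
after_backoff = estimator.timeout
estimator.on_ack(350, was_retransmitted=True)      # ignored: sample is ambiguous
after_ambiguous = (estimator.estimated_rtt, estimator.timeout)
estimator.on_ack(180, was_retransmitted=False)     # clean sample updates estimate
after_clean = (estimator.estimated_rtt, estimator.timeout)
print(after_backoff, after_ambiguous, after_clean)
```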