Datagram Congestion Control Protocol (DCCP) CISC 856 - TCP/IP and Upper Layer Protocols Presented by Ke Li (kli@cis.udel.edu) 2007/12/4 Thanks to Prof. Amer, Kireeti Valicherla, and Xiaofeng Han
Overview • Motivation • Connections • Unreliable datagram transfer • Modular congestion control • Miscellaneous issues
DCCP: Which Layer? [Figure: protocol stack diagram with DCCP and SCTP labeled. Adapted from Figure 2-11, TCP/IP Protocol Suite, Behrouz A. Forouzan]
Streaming Media • What does streaming media need? • Timeliness of data • What doesn't streaming media need? • Retransmissions of lost/expired packets • Annoying “rebuffering…” caused by head-of-line (HOL) blocking Source: http://streaming.wisconsin.edu/Accessible_Tutorials/Tutorial1/p1-3.htm
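Streaming Media Over TCP [Figure: server-to-client timeline. The server sends D12 and the client returns A12; D13 does not arrive; D14 - D16 arrive and the client returns A12 again; the server retransmits D13, but that data is no longer useful] • D: data TCP-PDU • A: ack TCP-PDU
Streaming Media Over UDP [Figure: server-to-client timeline] • UDP flows have no congestion control • Harmful to Internet health
Streaming Media with SCTP [Figure: endpoint A (IP A1, IP A2) and endpoint B (IP B1, IP B2, IP B3) connected through an IP network] • Multiple streams over a single association • Uses TCP-like congestion control • Retransmission • Partial reliability: requires at least 1 RTT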
Other target applications • Internet Telephony • Constant-packet-rate sources • Change data rate by adjusting packet size • Extremely sensitive to delay • Demands a slower congestion response • Interactive games • Can quickly make use of available bandwidth • Prefers TCP-like sawtooth congestion response
Solution: DCCP • Provides an unreliable flow of datagrams • Provides congestion control, using acknowledgments and sequence numbers • Connection-oriented • Does not provide: • Full reliability (no-loss, no-error, in-order, no-duplicate) • Flow control • Streaming (bytestream semantics) • DCCP = UDP + congestion control, or DCCP = TCP - bytestream semantics - full reliability
DCCP connections [Figure: DCCP A and DCCP B exchanging Data and Ack] • Full-duplex, bidirectional connection • Two logical half-connections • A-to-B half-connection: • Application data sent from A to B • Corresponding acks from B to A • In practice the two half-connections overlap: a DataAck carries both data and an ack • Each half-connection can have independent features, negotiated during connection initiation, e.g., a different congestion control mechanism
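DCCP Connection Initiation [Figure: handshake timeline] • Client states: CLOSED, REQUEST, PARTOPEN, OPEN • Server states: CLOSED, LISTEN, RESPOND, OPEN • PDUs in order: DCCP Request (client to server), DCCP Response (server to client), DCCP Ack (client to server)
DCCP Data Transfer Phase [Figure: message sequence between DCCP A (client) and DCCP B (server); states shown: PARTOPEN, OPEN; PDUs shown: DCCP Data, DCCP Ack, DCCP DataAck]
DCCP Connection Termination • Client-initiated close: the client sends DCCP Close and enters CLOSING; the server replies with DCCP Reset and enters CLOSED; the client enters TIMEWAIT, waits 2 MSL, then enters CLOSED • Server-initiated close: the server sends DCCP CloseReq and enters CLOSEREQ; the client replies with DCCP Close and enters CLOSING; the server sends DCCP Reset and enters CLOSED; the client enters TIMEWAIT, waits 2 MSL, then enters CLOSED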
DCCP Data Transfer Example 1. without loss • Seq # counts DCCP-PDUs, not bytes • Each PDU carries a Seq # • Seq # increases per PDU • Used to detect loss (for congestion control) • Network duplicates are ignored • Pure acks also consume a Seq #, which makes it possible to detect ack loss • No cumulative ack: the ack # is normally the Greatest Seq # Received (GSR) • Exchange between DCCP A and DCCP B: A sends Data(seq #1); B sends Ack(seq #10, ack #1); A sends Data(seq #2); B sends Ack(seq #11, ack #2)
DCCP Data Transfer Example 2. non-large burst of loss • Data may be lost: there are no data retransmissions • Separate options report PDU loss or ECN info: Ack Vector (SACK-like) • Ack of ack: lets the receiver clear its ack state • A PDU is ackable if its header has been successfully processed (e.g., valid header checksum and seq #) • Acked PDUs may still be dropped: no guarantee of data delivery • Due to receiver buffer overflow or corruption (endpoint loss) • Reported with the Data Dropped option • PDUs shown between DCCP A and DCCP B, in order: Data(seq #1); Ack(seq #10, ack #1); Data(seq #2); Data(seq #3); Ack(seq #11, ack #3); Data(seq #4); Ack(seq #12, ack #4)
Sequence Validity Check • Both endpoints keep expected seq#/ack# windows and normally stay in sync • Detects seq# attacks, significant reordering, or a crashed endpoint • Endpoints fall out of sync after a large burst of loss • There is no cumulative ack, so separate DCCP-Sync/SyncAck PDUs are used to recover • Sequence number variables • Maintained at each endpoint for each connection • GSR: Greatest Seq# Received • GSS: Greatest Seq# Sent • Sequence validity windows • Window width W: Sequence Window feature • Expected seq# in [SWL, SWH], SWL = GSR + 1 - W/4, SWH = GSR + 3W/4 • Expected ack# in [AWL, AWH], AWL = GSS + 1 - W, AWH = GSS • Seq#/ack# out of range: the PDU is sequence-invalid; ignore it and send a Sync PDU [Figure: expected seq# window reaching W/4 below and 3W/4 above GSR; expected ack# window of width W ending at GSS]
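The window arithmetic on this slide can be sketched in a few lines of Python. This is a minimal illustration, not an implementation: the class and method names are hypothetical, both windows use the same width W, and 48-bit wraparound, the initial-sequence-number clamps, and the special rules for Sync PDUs are all ignored. The lower bounds SWL = GSR + 1 - W/4 and AWL = GSS + 1 - W are the values that reproduce the windows shown in the examples that follow (for instance [19, 26] for GSR=20 and [13, 20] for GSS=20 with W=8).

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    """Per-connection sequence state (hypothetical names, for illustration)."""
    gsr: int      # Greatest Seq# Received
    gss: int      # Greatest Seq# Sent
    w: int = 8    # window width W, the value used in the slide examples

    def seq_window(self):
        # [SWL, SWH] = [GSR + 1 - floor(W/4), GSR + ceil(3W/4)]
        swl = self.gsr + 1 - self.w // 4
        swh = self.gsr + (3 * self.w + 3) // 4
        return swl, swh

    def ack_window(self):
        # [AWL, AWH] = [GSS + 1 - W, GSS]
        return self.gss + 1 - self.w, self.gss

    def is_sequence_valid(self, seqno, ackno=None):
        """True if seqno (and ackno, when present) fall inside the windows."""
        swl, swh = self.seq_window()
        if not swl <= seqno <= swh:
            return False
        if ackno is not None:
            awl, awh = self.ack_window()
            return awl <= ackno <= awh
        return True


# Endpoint B in the examples: it has received seq #20 and sent seq #60.
b = Endpoint(gsr=20, gss=60)
print(b.seq_window())          # window B expects data seq# to fall in
print(b.is_sequence_valid(31)) # Data(seq #31) after a large burst of loss
```

A sequence-invalid result is where a real endpoint would send a Sync PDU instead of processing the packet.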
DCCP Data Transfer Example 3. large burst of loss (Window Size = 8) • A sends Data(seq #20): A has GSS=20, exp ack# [13, 20]; B has GSR=20, exp seq# [19, 26] • B sends Ack(seq #60, ack #20) • A sends Data(seq #21): GSS=21, exp ack# [14, 21] • … • A sends Data(seq #30): GSS=30, exp ack# [23, 30] • A sends Data(seq #31): GSS=31, exp ack# [24, 31] • At B, seq# 31 > 26 is out of range, so B sends Sync(seq #61, ack #31); Sync.ack# = the out-of-sync seq# received • If the Sync is ackable, A replies SyncAck(seq #32, ack #61) with SyncAck.ack# = Sync.seq#; A has GSS=32, exp ack# [25, 32] • If the SyncAck carries a valid ack#, B updates GSR=32, exp seq# [31, 38], and is back in sync • GSR: Greatest Sequence number Received • GSS: Greatest Sequence number Sent
DCCP Data Transfer Example 4. slight reordering (Window Size = 8) • A sends Data(seq #20): A has GSS=20, exp ack# [13, 20]; B has GSR=20, exp seq# [19, 26] • B sends Ack(seq #60, ack #20) • A sends Data(seq #21): GSS=21, exp ack# [14, 21] • A sends Data(seq #22): GSS=22, exp ack# [15, 22] • Data(seq #22) arrives first: B has GSR=22, exp seq# [21, 28], and sends Ack(seq #61, ack #22) • Data(seq #21) arrives late: seq# 21 is within range, so B sends Ack(seq #62, ack #22) with Ack.ack# = GSR; data may be delivered out of order • A sends Data(seq #23): GSS=23, exp ack# [16, 23] • B sends Ack(seq #63, ack #23): GSR=23, exp seq# [22, 29] • GSR: Greatest Sequence number Received • GSS: Greatest Sequence number Sent
DCCP Data Transfer Example 5. medium reordering (Window Size = 8) • A sends Data(seq #18), which is delayed, then … up to Data(seq #22): A has GSS=22, exp ack# [15, 22] • B receives Data(seq #22): GSR=22, exp seq# [21, 28]; B sends Ack(seq #61, ack #22) • Data(seq #18) arrives late: seq# 18 < 21 is out of range, so B sends Sync(seq #62, ack #18) • If the Sync is ackable, A replies SyncAck(seq #23, ack #62) with SyncAck.ack# = Sync.seq#; A has GSS=23, exp ack# [16, 23] • If the SyncAck carries a valid ack#, B updates GSR=23, exp seq# [22, 29], and is back in sync • GSR: Greatest Sequence number Received • GSS: Greatest Sequence number Sent
DCCP Data Transfer Example 6. significant reordering (or blind attack) (Window Size = 8) • Data(seq #10) is badly delayed (or forged), then … up to Data(seq #22): A has GSS=22, exp ack# [15, 22] • B receives Data(seq #22): GSR=22, exp seq# [21, 28]; B sends Ack(seq #61, ack #22) • Data(seq #10) arrives: seq# 10 < 21 is out of range, so B sends Sync(seq #62, ack #10) • At A, ack# 10 < 15 is out of range, so the Sync is not ackable and is ignored • A sends Data(seq #23): GSS=23, exp ack# [16, 23] • B sends Ack(seq #63, ack #23): GSR=23, exp seq# [22, 29] • GSR: Greatest Sequence number Received • GSS: Greatest Sequence number Sent
DCCP PDU Types • Connection initialization: Request, Response • Data transfer: Data, Ack, DataAck • Connection termination: CloseReq, Close, Reset • Resynchronization: Sync, SyncAck
DCCP PDU Formats • Generic header: 16 bytes (with 48-bit seq#) or 12 bytes (with short 24-bit seq#) • Additional fields: fixed-length fields, e.g., ack# (48 or 24 bits) • Options: variable-length field of up to 1008 bytes, e.g., Init Cookie, Ack Vector, Change, Confirm, Slow Receiver, Data Dropped
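DCCP Generic Header (X = 1) [Figure: header bit layout, columns at bits 0, 8, 16, 24]
DCCP Generic Header: short (X = 0) [Figure: header bit layout, columns at bits 0, 8, 16, 24]
Acknowledgement Sub-Header [Figure: bit layouts for X = 1 and X = 0, columns at bits 0, 8, 16, 24]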
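The header figures did not survive in this transcript, so the following Python sketch decodes the generic header using the field layout given in RFC 4340, Section 5.1 (ports, Data Offset, CCVal, CsCov, Checksum, then a byte holding 3 reserved bits, the 4-bit Type, and the X bit). The function name and the returned dictionary keys are hypothetical; options and the acknowledgement sub-header are not parsed, and the checksum is not verified.

```python
import struct

# PDU type numbers as assigned in RFC 4340.
PDU_TYPES = {0: "Request", 1: "Response", 2: "Data", 3: "Ack", 4: "DataAck",
             5: "CloseReq", 6: "Close", 7: "Reset", 8: "Sync", 9: "SyncAck"}


def parse_generic_header(packet: bytes) -> dict:
    """Decode the generic header at the start of a DCCP packet (illustrative)."""
    if len(packet) < 12:
        raise ValueError("packet is shorter than the 12-byte short header")
    # Ports, Data Offset, CCVal|CsCov, Checksum, then the Res|Type|X byte.
    src, dst, offset, cc_cs, checksum, res_type_x = struct.unpack_from(
        "!HHBBHB", packet)
    x = res_type_x & 0x01
    if x:
        if len(packet) < 16:
            raise ValueError("X=1 but packet is shorter than 16 bytes")
        # Byte 9 is reserved; bytes 10..15 hold the 48-bit sequence number.
        seq = int.from_bytes(packet[10:16], "big")
    else:
        # Bytes 9..11 hold the short 24-bit sequence number.
        seq = int.from_bytes(packet[9:12], "big")
    return {
        "src_port": src,
        "dst_port": dst,
        "data_offset_bytes": offset * 4,  # Data Offset counts 32-bit words
        "ccval": cc_cs >> 4,
        "cscov": cc_cs & 0x0F,
        "checksum": checksum,
        "type": PDU_TYPES.get((res_type_x >> 1) & 0x0F, "Reserved"),
        "x": x,
        "seq": seq,
        "generic_header_len": 16 if x else 12,
    }
```

The X bit alone decides which of the two header sizes applies, which is why it must be read before the sequence number.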
DCCP Checksum [Figure: header row with field boundaries at bits 0, 4, 31] • Checksum Coverage (CsCov): 4 bits • CsCov = 0: covers the DCCP header (generic, additional), DCCP options, network-layer pseudoheader, and all application data in the packet (possibly with some padding) • CsCov = 1-15: covers the DCCP header, DCCP options, network-layer pseudoheader, and the initial (CsCov-1)*4 bytes of the packet's application data • Applications that can tolerate corruption can request that the checksum cover only part of the application data, or none of it • Corrupted data could then be delivered, with an impact on congestion control • Can improve delivery rate and perceived performance • The Data Checksum option provides a strong CRC over all application data
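The CsCov rule above is simple arithmetic; a small Python sketch makes the boundary cases explicit. The function name is hypothetical. Rejecting a CsCov value that claims more covered bytes than the packet carries is how this sketch reads RFC 4340, which treats such packets as invalid; that behavior is an assumption not stated on the slide.

```python
def covered_app_bytes(cscov: int, app_data_len: int) -> int:
    """Return how many application-data bytes the checksum covers."""
    if not 0 <= cscov <= 15:
        raise ValueError("CsCov is a 4-bit field")
    if cscov == 0:
        # CsCov = 0: the checksum covers all application data.
        return app_data_len
    covered = (cscov - 1) * 4
    if covered > app_data_len:
        # Assumption: coverage beyond the end of the data marks the packet invalid.
        raise ValueError("CsCov covers more data than the packet carries")
    return covered


# CsCov = 1 protects the headers only; corruption in the payload is tolerated.
print(covered_app_bytes(1, 100))
```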
Modular Congestion Control • Each congestion control mechanism supported by DCCP is assigned a 1-byte congestion control identifier, or CCID: a number from 0 to 255 • CCID 0 and CCID 1 are reserved • CCID 2: TCP-like congestion control • CCID 3: TCP-Friendly Rate Control (TFRC) • CCID 4-255 are reserved • The CCID is a feature that is negotiated and agreed on by both endpoints
CCID 2: TCP-like Congestion Control • Congestion control like TCP [RFC4341] • Ack contains the Seq# of all received packets within some window • Congestion event (packet loss or ECN mark): halve the congestion window • Abrupt rate changes • Reverse-path (Ack) congestion control: • Ack Ratio R, integer, in [2, cwnd/2] • For each congestion window of data where at least one of the corresponding acks was lost or ECN-marked, R is doubled • For each cwnd/(R^2 - R) consecutive congestion windows of data whose acks were not lost or ECN-marked, R is decreased by 1 • This formula comes from wanting to increase the number of acks per congestion window, namely cwnd/R, by 1 for every congestion-free window that passes
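The interval for decreasing R (read here as cwnd divided by R squared minus R) follows from a short calculation, written out below as a sketch of the reasoning on the slide. Lowering the Ack Ratio from R to R - 1 raises the number of acks per congestion window from cwnd/R to cwnd/(R - 1):

```latex
\[
\frac{\mathit{cwnd}}{R-1} - \frac{\mathit{cwnd}}{R}
  = \frac{\mathit{cwnd}\,R - \mathit{cwnd}\,(R-1)}{R\,(R-1)}
  = \frac{\mathit{cwnd}}{R^{2}-R}
\]
```

Each decrement therefore adds cwnd/(R^2 - R) acks per window. Making one decrement every cwnd/(R^2 - R) congestion-free windows averages out to one additional ack per window for every window that passes, which is the stated goal.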
CCID 2: TCP-like Congestion Control [RFC4341] • Applications using this: • Respond quickly to changes in available bandwidth • Must tolerate abrupt rate changes (sawtooth) [Figure: cwnd over time, showing the (initial) ssthresh and the drop in cwnd at each loss] • Online interactive games prefer this kind of congestion control
CCID 3: TCP Friendly Rate Control [RFC 4342] • Receiver-based feedback mechanism • Equation-based congestion control • Minimizes abrupt changes in sending rate • Maintains longer-term fairness with TCP • Streaming media prefers the steadier, less bursty traffic provided by TFRC
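The sawtooth can be reproduced with a toy model. The sketch below is a simplification under stated assumptions, not the CCID 2 algorithm itself: it updates cwnd once per round trip instead of once per ack, models only slow start, congestion avoidance, and halving on a congestion event, and takes the rounds in which loss occurs as an input. The function and parameter names are hypothetical.

```python
def cwnd_trace(rounds, loss_rounds, ssthresh=16, cwnd=1):
    """Return the congestion window (in packets) at the start of each round trip."""
    trace = []
    for r in range(rounds):
        trace.append(cwnd)
        if r in loss_rounds:
            # Congestion event: halve the window and continue from there.
            cwnd = max(cwnd // 2, 1)
            ssthresh = cwnd
        elif cwnd < ssthresh:
            # Slow start: double per round trip, up to ssthresh.
            cwnd = min(cwnd * 2, ssthresh)
        else:
            # Congestion avoidance: one more packet per round trip.
            cwnd += 1
    return trace


print(cwnd_trace(10, loss_rounds={6}, ssthresh=8))
# prints [1, 2, 4, 8, 9, 10, 11, 5, 6, 7]
```

The abrupt drop from 11 to 5 is the rate change that applications choosing this CCID have to tolerate.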
CCID 3: TCP Friendly Rate Control • The receiver measures the loss event rate and feeds this information back to the sender • The sender uses these feedback messages to measure the round-trip time (RTT) • The loss event rate and RTT are then fed into TFRC's throughput equation, giving the acceptable transmit rate • The sender then adjusts its transmit rate to match the calculated rate
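The slides do not reproduce the throughput equation. The sketch below uses the form given in the TFRC specification (RFC 3448, later RFC 5348), with the spec's suggested simplifications b = 1 and t_RTO = 4 * RTT as defaults. The function and parameter names are hypothetical, and the real protocol adds rate limits and feedback timers that are not modeled here.

```python
from math import sqrt


def tfrc_rate(s, rtt, p, b=1, t_rto=None):
    """Allowed sending rate in bytes per second.

    s     -- packet size in bytes
    rtt   -- round-trip time in seconds
    p     -- loss event rate, 0 < p <= 1
    b     -- packets acknowledged by a single ack
    t_rto -- retransmission timeout in seconds (defaults to 4 * rtt)
    """
    if not 0 < p <= 1:
        raise ValueError("loss event rate must be in (0, 1]")
    if t_rto is None:
        t_rto = 4 * rtt
    denominator = (rtt * sqrt(2 * b * p / 3)
                   + t_rto * (3 * sqrt(3 * b * p / 8)) * p * (1 + 32 * p ** 2))
    return s / denominator


# A higher loss event rate yields a lower allowed rate.
print(tfrc_rate(1460, 0.1, 0.01) > tfrc_rate(1460, 0.1, 0.05))
```

Because the rate is a smooth function of the measured loss event rate and RTT, the sender changes speed gradually instead of halving on every loss.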
Congestion-related options • Slow Receiver option • The receiver sends this option to its sender to indicate that it is having trouble keeping up with the sender's data • The sender shouldn't increase its sending rate for about 1 RTT • Data Dropped option • Indicates that a packet was dropped due to corruption, receiver buffer overflow, application requirement, or other non-congestion reasons
Feature Negotiation Options • A DCCP feature is a connection attribute on whose value the two endpoints agree • Examples • Congestion control identifier (CCID) • ECN capable / incapable • Ack Ratio • Allow Short Seq# • DCCP features are identified by a feature number and an endpoint • Notation “F/X” is used: feature F located at endpoint X • Change and Confirm options are used to negotiate
F/X Notation [Figure: endpoints A and B] • A is the feature location for all F/A and the feature remote for all F/B • B is the feature location for all F/B and the feature remote for all F/A
Feature Negotiation • General-purpose, reliable negotiation mechanism • Almost always happens during the connection initiation handshake, but it can begin at any time • Each negotiation happens in a single option exchange • Options can list multiple values in preference order
Feature Negotiation Example 1 • Client sends Change R (CCID, 2); server replies Confirm L (CCID, 2): CCID/Server agreed as 2 • Client sends Change L (CCID, 3 4); server replies Confirm R (CCID, 4, 4 2): CCID/Client agreed as 4
Feature Negotiation Example 2 • Client sends Change L(CCID, 3 2); the figure shows this PDU four times, i.e., it is retransmitted until answered • Server replies Confirm R(CCID, 3, 3 2): CCID/Client agreed as 3
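Both examples are consistent with the server-priority rule that RFC 4340 defines for features such as CCID: the agreed value is the first entry in the server's preference list that also appears in the client's list. The slides do not state the rule, so treat the following Python sketch as an assumption-labeled illustration; the function name is hypothetical, and returning None for "no common value" is a simplification.

```python
def negotiate_server_priority(client_prefs, server_prefs):
    """Pick the first value in the server's list that the client also lists."""
    for value in server_prefs:
        if value in client_prefs:
            return value
    return None  # no common value; the feature value stays unresolved here


# Example 1, second exchange: Change L(CCID, 3 4) answered by Confirm R(CCID, 4, 4 2).
print(negotiate_server_priority([3, 4], [4, 2]))  # prints 4

# Example 2: Change L(CCID, 3 2) answered by Confirm R(CCID, 3, 3 2).
print(negotiate_server_priority([3, 2], [3, 2]))  # prints 3
```

The Confirm option echoes the chosen value first, followed by the confirming endpoint's own preference list, which is why the replies read "4, 4 2" and "3, 3 2".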
DCCP: Miscellaneous issues • Maximum Packet Size (MPS) • Maintained for each DCCP session • Minimum of the congestion control MPS (CCMPS) and the path MTU (PMTU) • Generally, DCCP should NOT fragment data, because fragmentation reduces robustness • Applications can usually get better error tolerance by producing packets smaller than the PMTU • Security concerns • Init Cookie prevents SYN-flooding-like DDoS attacks • Sequence number attacks are prevented by: • Large sequence numbers • Sequence and acknowledgement number windows
DCCP: Summary • Transport layer protocol • Unreliable datagrams • Modular congestion control • Negotiable features
Implementation • Linux Kernel Version 2.6.14 • Preliminary FreeBSD implementation • tcpdump 3.9.4 and later includes DCCP support
References • Designing DCCP: Congestion Control Without Reliability, by Eddie Kohler, Mark Handley, and Sally Floyd. Proc. ACM SIGCOMM 2006, September 2006. • RFC 4340 - Datagram Congestion Control Protocol (DCCP), Eddie Kohler, Mark Handley, and Sally Floyd, March 2006 • DCCP Overview, Eddie Kohler and Sally Floyd, July 2003 • DCCP link from one of the authors http://www.read.cs.ucla.edu/dccp/
Questions and Comments ? Thank you! Have a nice holiday!