End-to-End Protocols

Presentation Transcript


  1. End-to-End Protocols • Outline • Simple Demultiplexer • Reliable Byte-Stream • Remote Procedure Call • Performance

  2. End-to-End Protocols • Common end-to-end services • guarantee message delivery • deliver messages in the same order they are sent • deliver at most one copy of each message • support arbitrarily large messages • support synchronization • allow the receiver to flow control the sender • support multiple application processes on each host • Underlying best-effort network • drops messages • reorders messages • delivers duplicate copies of a given message • limits messages to some finite size • delivers messages after an arbitrarily long delay

  3. Simple Demultiplexer (UDP) • User Datagram Protocol (UDP): unreliable and unordered datagram service • Adds multiplexing to allow multiple application processes on each host to share the network • A port is the abstraction of a communication endpoint • Use a <port/mailbox, host> pair to identify a process • Endpoints are identified by ports • Servers have well-known ports, e.g. DNS: 53, talk: 517 • See /etc/services on Unix

  4. Simple Demultiplexer (UDP) • A port is implemented by a message queue. • UDP has no flow control. • UDP header format • Four 16-bit fields (SrcPort, DstPort, Length, Checksum) followed by Data • Optional checksum: pseudo header + UDP header + data • Pseudo header: protocol number, source IP address, destination IP address, and UDP length field • The pseudo header lets the receiver verify that the message has been delivered between the correct two endpoints.
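As a rough illustration of the checksum described on this slide, the following C sketch computes the 16-bit ones' complement Internet checksum over a pseudo header plus the UDP header and data. The function names (`sum_words`, `fold`, `inet_checksum`, `udp_checksum`) are made up for this example, and the sketch assumes IPv4 addresses and a checksum field that is zero while the sum is computed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add a byte range to a running sum of big-endian 16-bit words. */
static uint32_t sum_words(uint32_t sum, const uint8_t *buf, size_t len)
{
    size_t i;
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (i < len)                       /* odd length: pad with a zero byte */
        sum += (uint32_t)buf[i] << 8;
    return sum;
}

/* Fold the carries back in and return the ones' complement. */
static uint16_t fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xFFFFu) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Checksum of an arbitrary buffer (no pseudo header). */
uint16_t inet_checksum(const uint8_t *buf, size_t len)
{
    return fold(sum_words(0, buf, len));
}

/* Checksum over pseudo header + UDP header + data (IPv4, protocol 17). */
uint16_t udp_checksum(const uint8_t src_ip[4], const uint8_t dst_ip[4],
                      const uint8_t *udp, uint16_t udp_len)
{
    uint8_t pseudo[12];
    int i;

    assert(udp_len >= 8);              /* at least a full UDP header */
    for (i = 0; i < 4; i++) {
        pseudo[i] = src_ip[i];
        pseudo[4 + i] = dst_ip[i];
    }
    pseudo[8] = 0;                     /* zero padding */
    pseudo[9] = 17;                    /* IP protocol number for UDP */
    pseudo[10] = (uint8_t)(udp_len >> 8);
    pseudo[11] = (uint8_t)(udp_len & 0xFF);
    return fold(sum_words(sum_words(0, pseudo, sizeof pseudo), udp, udp_len));
}
```

A receiver can run the same function over the segment as received, checksum field included; a result of 0 means the data and the endpoint addresses are consistent with what the sender summed.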

  5. Reliable Byte-Stream (TCP) • Outline • Connection Establishment/Termination • Sliding Window Revisited • Flow Control • Adaptive Timeout

  6. TCP Overview • Transmission Control Protocol (TCP) is a reliable, connection-oriented byte-stream service. • A byte-stream service • application writes bytes • TCP sends segments • application reads bytes • TCP is a full-duplex protocol. • TCP supports a demultiplexing mechanism.

  7. TCP Overview • [Figure: the sending application process writes bytes into TCP's send buffer; TCP transmits segments; the receiving TCP places them in its receive buffer, from which the receiving application process reads bytes] • Flow control: keep sender from overrunning receiver • Congestion control: keep sender from overrunning network • TCP uses the sliding window algorithm.

  8. Data Link Versus Transport • Potentially have many connections between different hosts • need explicit connection establishment and termination • Potentially different RTT • need adaptive timeout mechanism • Potentially long delay in network • need to be prepared for arrival of very old packets • Potentially different capacity at destination • need to accommodate different node capacity • Potentially different network capacity • need to be prepared for network congestion

  9. TCP Segment Format • The packets exchanged between TCP peers are called segments. • How does TCP decide that it has enough bytes to send a segment? • TCP maintains a variable, called the maximum segment size (MSS), and it sends a segment as soon as it has collected MSS bytes from the sending process. • TCP supports a push operation, and the sending process invokes this operation to effectively flush the buffer of unsent bytes. • The final trigger is a timer that periodically fires.

  10. Segment Format

  11. TCP Header Format • SrcPort: source port, DstPort: destination port • The Acknowledgment, SequenceNum, and AdvertisedWindow fields are all involved in TCP's sliding window algorithm. • The 6-bit Flags field is used to relay control information between TCP peers: • SYN, FIN: establish and terminate a TCP connection • RESET: abort the connection • PUSH: the sender invoked the push operation • URG: the segment contains urgent data; UrgPtr marks where the urgent data ends • ACK: the Acknowledgment field is valid

  12. Segment Format (cont) • [Figure: the sender transmits Data (SequenceNum) to the receiver; the receiver returns Acknowledgment + AdvertisedWindow] • Each connection is identified by a 4-tuple: • (SrcPort, SrcIPAddr, DstPort, DstIPAddr) • Sliding window + flow control • Acknowledgment, SequenceNum, AdvertisedWindow • Flags • SYN, FIN, RESET, PUSH, URG, ACK • Checksum • pseudo header + TCP header + data
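To make the 4-tuple and the Flags field concrete, here is a small C sketch of how a host might match an arriving segment to a connection. The bit values are the standard TCP flag positions; the `struct conn` layout and the function names are assumptions made for this example, and only IPv4 addresses are considered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag bits as they appear in the TCP header. */
enum {
    TCP_FIN = 0x01, TCP_SYN = 0x02, TCP_RST = 0x04,
    TCP_PSH = 0x08, TCP_ACK = 0x10, TCP_URG = 0x20
};

/* One connection, seen from the local host (hypothetical layout). */
struct conn {
    uint32_t local_ip, remote_ip;
    uint16_t local_port, remote_port;
};

/* True when a segment with the given source and destination belongs to
 * connection c: its source must be our remote end and its destination
 * our local end. */
bool conn_matches(const struct conn *c,
                  uint32_t src_ip, uint16_t src_port,
                  uint32_t dst_ip, uint16_t dst_port)
{
    assert(c != NULL);
    return c->remote_ip == src_ip && c->remote_port == src_port &&
           c->local_ip == dst_ip && c->local_port == dst_port;
}

/* True for the second segment of the three-way handshake. */
bool is_syn_ack(uint8_t flags)
{
    return (flags & (TCP_SYN | TCP_ACK)) == (TCP_SYN | TCP_ACK);
}
```

Because all four values take part in the comparison, two clients that pick the same source port on different hosts still map to different connections on the server.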

  13. Three-Way Handshake • The algorithm used by TCP to establish and terminate a connection is called a three-way handshake. • A timer is scheduled for each of the first two segments. • The client and server each select an initial sequence number at random and exchange these starting sequence numbers at connection setup time. • This protects against the chance that a segment from an earlier connection might interfere with a later one. • TCP can be specified as a state-transition diagram.

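  14. Connection Establishment and Termination • [Figure: three-way handshake between the active participant (client) and the passive participant (server)] • Client to server: SYN, SequenceNum = x • Server to client: SYN + ACK, SequenceNum = y, Acknowledgment = x + 1 • Client to server: ACK, Acknowledgment = y + 1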

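  15. State Transition Diagram • [Figure: TCP state-transition diagram; each arc is labeled event/action] • CLOSED -> LISTEN: passive open • CLOSED -> SYN_SENT: active open / SYN • LISTEN -> SYN_RCVD: SYN / SYN + ACK • LISTEN -> SYN_SENT: send / SYN • LISTEN -> CLOSED: close • SYN_SENT -> SYN_RCVD: SYN / SYN + ACK • SYN_SENT -> ESTABLISHED: SYN + ACK / ACK • SYN_SENT -> CLOSED: close • SYN_RCVD -> ESTABLISHED: ACK • SYN_RCVD -> FIN_WAIT_1: close / FIN • ESTABLISHED -> FIN_WAIT_1: close / FIN • ESTABLISHED -> CLOSE_WAIT: FIN / ACK • FIN_WAIT_1 -> FIN_WAIT_2: ACK • FIN_WAIT_1 -> CLOSING: FIN / ACK • FIN_WAIT_1 -> TIME_WAIT: ACK + FIN / ACK • FIN_WAIT_2 -> TIME_WAIT: FIN / ACK • CLOSING -> TIME_WAIT: ACK • TIME_WAIT -> CLOSED: timeout after two segment lifetimes • CLOSE_WAIT -> LAST_ACK: close / FIN • LAST_ACK -> CLOSED: ACK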
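The state-transition diagram can be read as a lookup from (state, event) to the next state. The C sketch below encodes the arcs of the diagram that way; the enum and function names are invented for this example, the segment sent on each arc appears only as a comment, and events without an arc are assumed to leave the state unchanged.

```c
#include <assert.h>

enum tcp_state {
    CLOSED, LISTEN, SYN_SENT, SYN_RCVD, ESTABLISHED, FIN_WAIT_1,
    FIN_WAIT_2, CLOSING, TIME_WAIT, CLOSE_WAIT, LAST_ACK
};

enum tcp_event {
    EV_PASSIVE_OPEN, EV_ACTIVE_OPEN, EV_SEND, EV_CLOSE,
    EV_RCV_SYN, EV_RCV_SYN_ACK, EV_RCV_ACK, EV_RCV_FIN,
    EV_RCV_FIN_ACK, EV_TIMEOUT
};

/* Return the state reached from s on event e. */
enum tcp_state next_state(enum tcp_state s, enum tcp_event e)
{
    assert((int)s >= CLOSED && (int)s <= LAST_ACK);
    switch (s) {
    case CLOSED:
        if (e == EV_PASSIVE_OPEN) return LISTEN;
        if (e == EV_ACTIVE_OPEN)  return SYN_SENT;     /* send SYN */
        break;
    case LISTEN:
        if (e == EV_RCV_SYN) return SYN_RCVD;          /* send SYN + ACK */
        if (e == EV_SEND)    return SYN_SENT;          /* send SYN */
        if (e == EV_CLOSE)   return CLOSED;
        break;
    case SYN_SENT:
        if (e == EV_RCV_SYN_ACK) return ESTABLISHED;   /* send ACK */
        if (e == EV_RCV_SYN)     return SYN_RCVD;      /* send SYN + ACK */
        if (e == EV_CLOSE)       return CLOSED;
        break;
    case SYN_RCVD:
        if (e == EV_RCV_ACK) return ESTABLISHED;
        if (e == EV_CLOSE)   return FIN_WAIT_1;        /* send FIN */
        break;
    case ESTABLISHED:
        if (e == EV_CLOSE)   return FIN_WAIT_1;        /* send FIN */
        if (e == EV_RCV_FIN) return CLOSE_WAIT;        /* send ACK */
        break;
    case FIN_WAIT_1:
        if (e == EV_RCV_ACK)     return FIN_WAIT_2;
        if (e == EV_RCV_FIN)     return CLOSING;       /* send ACK */
        if (e == EV_RCV_FIN_ACK) return TIME_WAIT;     /* send ACK */
        break;
    case FIN_WAIT_2:
        if (e == EV_RCV_FIN) return TIME_WAIT;         /* send ACK */
        break;
    case CLOSING:
        if (e == EV_RCV_ACK) return TIME_WAIT;
        break;
    case TIME_WAIT:
        if (e == EV_TIMEOUT) return CLOSED;            /* two segment lifetimes */
        break;
    case CLOSE_WAIT:
        if (e == EV_CLOSE) return LAST_ACK;            /* send FIN */
        break;
    case LAST_ACK:
        if (e == EV_RCV_ACK) return CLOSED;
        break;
    }
    return s;
}
```

Feeding a client the events active open, SYN + ACK received, close, ACK received, FIN received, and timeout walks it through SYN_SENT, ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT, and back to CLOSED.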

  16. Sliding Window • TCP’s sliding window algorithm serves several purposes: • It guarantees the reliable delivery of data. • It ensures that data is delivered in order. • It enforces flow control between the sender and the receiver. • In order to keep the sender from overrunning the receiver’s buffer, the receiver advertises a window size to the sender by specifying the AdvertisedWindow field in the TCP header.

  17. Sliding Window Revisited • [Figure: the sending TCP keeps the pointers LastByteAcked, LastByteSent, and LastByteWritten into its buffer; the receiving TCP keeps LastByteRead, NextByteExpected, and LastByteRcvd] • Sending side • LastByteAcked <= LastByteSent • LastByteSent <= LastByteWritten • buffer bytes between LastByteAcked and LastByteWritten • Receiving side • LastByteRead < NextByteExpected • NextByteExpected <= LastByteRcvd + 1 • buffer bytes between LastByteRead and LastByteRcvd

  18. Flow Control • Send buffer size: MaxSendBuffer • Receive buffer size: MaxRcvBuffer • Receiving side • LastByteRcvd - LastByteRead <= MaxRcvBuffer • AdvertisedWindow = MaxRcvBuffer - (LastByteRcvd - LastByteRead) • Sending side • LastByteSent - LastByteAcked <= AdvertisedWindow • EffectiveWindow = AdvertisedWindow - (LastByteSent - LastByteAcked) • LastByteWritten - LastByteAcked <= MaxSendBuffer • block the sending process if (LastByteWritten - LastByteAcked) + y > MaxSendBuffer, where y is the number of bytes it tries to write • Always send an ACK in response to an arriving data segment • The sender persists (keeps probing) when AdvertisedWindow = 0
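The flow-control formulas on this slide translate almost directly into code. The following C sketch is one way to write them, using LastByteRead as the application's read pointer on the receiving side; the struct and function names are assumptions for this example, and sequence-number wraparound is ignored to keep the arithmetic readable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct rcv_side {
    uint32_t last_byte_read, last_byte_rcvd, max_rcv_buffer;
};

struct snd_side {
    uint32_t last_byte_acked, last_byte_sent, last_byte_written,
             max_send_buffer;
};

/* AdvertisedWindow = MaxRcvBuffer - (LastByteRcvd - LastByteRead) */
uint32_t advertised_window(const struct rcv_side *r)
{
    uint32_t buffered = r->last_byte_rcvd - r->last_byte_read;
    assert(buffered <= r->max_rcv_buffer);
    return r->max_rcv_buffer - buffered;
}

/* EffectiveWindow = AdvertisedWindow - (LastByteSent - LastByteAcked),
 * clamped at zero when more data is in flight than the window allows. */
uint32_t effective_window(const struct snd_side *s, uint32_t advertised)
{
    uint32_t in_flight = s->last_byte_sent - s->last_byte_acked;
    return in_flight >= advertised ? 0 : advertised - in_flight;
}

/* Block the writer if y more bytes would overflow the send buffer. */
bool write_would_block(const struct snd_side *s, uint32_t y)
{
    uint64_t pending = s->last_byte_written - s->last_byte_acked;
    return pending + y > s->max_send_buffer;
}
```

With a 1000-byte receive buffer holding 200 unread bytes, the receiver advertises 800; a sender with 500 bytes in flight may then send 300 more.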

  19. Protection Against Wrap Around • 32-bit SequenceNum • Time until wrap around, by bandwidth: • T1 (1.5 Mbps): 6.4 hours • Ethernet (10 Mbps): 57 minutes • T3 (45 Mbps): 13 minutes • FDDI (100 Mbps): 6 minutes • STS-3 (155 Mbps): 4 minutes • STS-12 (622 Mbps): 55 seconds • STS-24 (1.2 Gbps): 28 seconds
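The wraparound times follow from dividing the size of the 32-bit sequence space, 2^32 bytes, by the link rate. A minimal C sketch of that arithmetic, with a made-up function name and integer division that rounds down to whole seconds:

```c
#include <assert.h>
#include <stdint.h>

/* Seconds needed to send 2^32 bytes (the whole sequence space)
 * at the given rate in bits per second, rounded down. */
uint64_t seq_wrap_seconds(uint64_t bits_per_second)
{
    assert(bits_per_second > 0);
    return ((UINT64_C(1) << 32) * 8) / bits_per_second;
}
```

For example, `seq_wrap_seconds(10000000)` gives 3435 seconds, which is the 57 minutes listed for 10 Mbps Ethernet, and `seq_wrap_seconds(622000000)` gives the 55 seconds listed for STS-12.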

  20. Keeping the Pipe Full • 16-bit AdvertisedWindow • Delay x bandwidth product, by bandwidth: • T1 (1.5 Mbps): 18 KB • Ethernet (10 Mbps): 122 KB • T3 (45 Mbps): 549 KB • FDDI (100 Mbps): 1.2 MB • STS-3 (155 Mbps): 1.8 MB • STS-12 (622 Mbps): 7.4 MB • STS-24 (1.2 Gbps): 14.8 MB
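The figures in this table are consistent with a round-trip time of 100 ms, which the slide does not state; treat that value as an assumption. The C sketch below computes the delay x bandwidth product in bytes and the smallest left shift of a 16-bit window that would cover it. Both function names are invented for this example, and the shift is capped at 14, the largest value the TCP window scale option permits.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed in flight to keep the pipe full:
 * rate (bits/s) * RTT (ms) / 8000. */
uint64_t pipe_bytes(uint64_t bits_per_second, uint64_t rtt_ms)
{
    /* guard against overflow of the product */
    assert(rtt_ms == 0 || bits_per_second <= UINT64_MAX / rtt_ms);
    return bits_per_second * rtt_ms / 8000;
}

/* Smallest shift s (0..14) such that a 16-bit window scaled by 2^s
 * covers the given number of bytes; returns 14 if even that falls short. */
unsigned window_shift_needed(uint64_t bytes)
{
    unsigned shift = 0;
    while (shift < 14 && (UINT64_C(65535) << shift) < bytes)
        shift++;
    return shift;
}
```

At 10 Mbps and 100 ms the pipe holds 125000 bytes (122 KB), already more than an unscaled 16-bit window can advertise, so a shift of at least 1 is needed.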

  21. Adaptive Retransmission (Original Algorithm) • Measure SampleRTT for each segment/ACK pair • Compute weighted average of RTT • EstRTT = a × EstRTT + b × SampleRTT • where a + b = 1 • a between 0.8 and 0.9 • b between 0.1 and 0.2 • Set timeout based on EstRTT • TimeOut = 2 × EstRTT
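As a small worked example of the original algorithm, the C sketch below updates the estimate with a weighted average and doubles it to get the timeout. The function names are assumptions for this example, and b is taken to be 1 - a, as the slide requires.

```c
#include <assert.h>

/* EstRTT = a * EstRTT + (1 - a) * SampleRTT */
double update_est_rtt(double est_rtt, double sample_rtt, double a)
{
    assert(a >= 0.0 && a <= 1.0);
    return a * est_rtt + (1.0 - a) * sample_rtt;
}

/* TimeOut = 2 * EstRTT */
double timeout_from_est(double est_rtt)
{
    return 2.0 * est_rtt;
}
```

With EstRTT = 100 ms, a = 0.875, and a new sample of 200 ms, the estimate moves to 112.5 ms and the timeout to 225 ms: a single slow sample shifts the estimate only modestly.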

  22. Karn/Partridge Algorithm • Do not sample RTT when retransmitting • Double timeout after each retransmission • [Figure: two sender/receiver timelines, each with an original transmission, a retransmission, and one ACK; SampleRTT is measured from the original transmission in one and from the retransmission in the other]
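The two Karn/Partridge rules can be sketched as a pair of event handlers. In the C sketch below, the struct, the function names, and the weight 0.875 (one value in the range given for the original algorithm) are assumptions for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct rtt_state {
    double est_rtt;   /* current estimate */
    double timeout;   /* current retransmission timeout */
};

/* Called when an ACK arrives. Samples from retransmitted segments are
 * ambiguous (the ACK may answer either copy), so they are ignored. */
void on_ack(struct rtt_state *s, double sample_rtt, bool was_retransmitted)
{
    assert(s != NULL);
    if (was_retransmitted)
        return;
    s->est_rtt = 0.875 * s->est_rtt + 0.125 * sample_rtt;
    s->timeout = 2.0 * s->est_rtt;
}

/* Called when the retransmission timer fires: back off exponentially. */
void on_timeout(struct rtt_state *s)
{
    assert(s != NULL);
    s->timeout *= 2.0;
}
```

Starting from an estimate of 100 ms and a timeout of 200 ms, one timeout raises the timeout to 400 ms; it stays there until an ACK for a segment that was sent only once supplies a clean sample.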

  23. Jacobson/Karels Algorithm • New calculations for average RTT • Diff = SampleRTT - EstRTT • EstRTT = EstRTT + (δ × Diff) • Dev = Dev + δ × (|Diff| - Dev) • where δ is a factor between 0 and 1 • Consider variance when setting timeout value • TimeOut = μ × EstRTT + φ × Dev • where μ = 1 and φ = 4 • Notes • algorithm only as good as granularity of clock (500 ms on Unix) • accurate timeout mechanism important to congestion control (later)
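A short C sketch of the Jacobson/Karels update may help in following the formulas. The struct and function names are invented for this example; the gain factor (called `delta` here) is set to 0.125 as one possible value between 0 and 1, and the timeout uses the multipliers 1 and 4 given on the slide.

```c
#include <assert.h>
#include <stddef.h>

struct jk_state {
    double est_rtt;   /* smoothed RTT estimate */
    double dev;       /* smoothed mean deviation */
};

/* Fold one RTT sample into the state and return the new timeout. */
double jk_update(struct jk_state *s, double sample_rtt)
{
    const double delta = 0.125;               /* assumed gain factor */
    double diff, abs_diff;

    assert(s != NULL);
    diff = sample_rtt - s->est_rtt;           /* uses the old estimate */
    abs_diff = diff < 0 ? -diff : diff;
    s->est_rtt += delta * diff;
    s->dev += delta * (abs_diff - s->dev);
    return 1.0 * s->est_rtt + 4.0 * s->dev;   /* TimeOut */
}
```

Starting from EstRTT = 100 and Dev = 10, a sample of 140 yields EstRTT = 105, Dev = 13.75, and a timeout of 160: the deviation term makes the timeout grow when samples become erratic, even if the average barely moves.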

  24. TCP Extensions • Implemented as header options • Store timestamp in outgoing segments • Extend sequence space with 32-bit timestamp (PAWS) • Shift (scale) advertised window

  25. Remote Procedure Call • Outline • Basics • Protocol Stack • Presentation Formatting

  26. Remote Procedure Call Basics • Problems with sockets • Socket programming uses a read/write (input/output) mechanism. • This differs from the procedure calls that programmers normally use. • Explicit input/output is not the best way to make distributed computing location-transparent.

  27. Remote Procedure Call Basics • A procedure call is a standard abstraction in local computation. • Remote Procedure Call (RPC) extends procedure calls to distributed computation, as shown in Figure 5.11. • A caller invokes execution of a procedure in the callee via the local stub procedure. • The implicit network programming hides all network I/O code from the programmer. • Objectives are simplicity and ease of use.

  28. Remote Procedure Call Basics • The concept is to provide a transparent mechanism that enables the user to utilize remote services through standard procedure calls. • Client sends request, then blocks until a remote server sends a response (reply). • Advantages: user may be unaware of remote implementation (handled in a stub in library); uses standard mechanism. • Disadvantages: prone to failure of components and network; different address spaces; separate process lifetimes.

  29. RPC Components • [Figure: the caller (client) passes arguments to the client stub, which sends a request through the RPC protocol to the server stub; the server stub passes the arguments to the callee (server); the return value travels back the same way as a reply] • Protocol Stack • BLAST: fragments and reassembles large messages • CHAN: synchronizes request and reply messages • SELECT: dispatches request to the correct process • Stubs

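  30. RPC Timeline • [Figure: client and server timelines] • The server is blocked until a request arrives • The client sends the request and blocks • The server computes the result and sends the reply, then blocks again • The client resumes when the reply arrives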

  31. SunRPC • IP implements BLAST-equivalent • except no selective retransmit • SunRPC implements CHAN-equivalent • except not at-most-once • UDP + SunRPC implement SELECT-equivalent • UDP dispatches to program (ports bound to programs) • SunRPC dispatches to procedure within program

  32. Sun RPC • It was designed for client-server communication in the Sun Network File System (NFS). • UDP or TCP can be used. If UDP is used, the message length is restricted to 64 KB, but to 8 - 9 KB in practice. • Sun XDR (External Data Representation) defines how data is represented on the wire. • Valid data types supported by XDR include int, unsigned int, long, structure, fixed array, string (null-terminated char *), and binary encoded data (for other data types such as lists).
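To give a feel for what XDR does with the data types listed here, the C sketch below hand-encodes two of them: an unsigned integer as four big-endian bytes, and a string as a 4-byte length followed by the characters, zero-padded to a multiple of four bytes. The helpers `put_u32` and `put_string` are written for this example only and are not part of the Sun RPC library, which provides its own conversion routines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write a 32-bit unsigned integer in XDR form: 4 bytes, big-endian.
 * Returns the number of bytes written. */
size_t put_u32(uint8_t *out, uint32_t v)
{
    assert(out != NULL);
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
    return 4;
}

/* Write a string in XDR form: 4-byte length, the bytes, then zero
 * padding up to a multiple of 4. The caller must supply at least
 * 4 + strlen(s) + 3 bytes. Returns the number of bytes written. */
size_t put_string(uint8_t *out, const char *s)
{
    size_t len, padded;

    assert(out != NULL && s != NULL);
    len = strlen(s);
    padded = (len + 3) & ~(size_t)3;
    put_u32(out, (uint32_t)len);
    memcpy(out + 4, s, len);
    memset(out + 4 + len, 0, padded - len);
    return 4 + padded;
}
```

Encoding the string "hello" this way takes 12 bytes: the length 5, the five characters, and three padding bytes. The terminating null of the C string is not sent.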

  33. Sun XDR • A program number and a version number are supplied. • Each procedure definition is identified by a procedure number. • Each procedure takes a single input parameter and returns a single result.

  34. Files interface in Sun XDR

      const MAX = 1000;
      typedef int FileIdentifier;
      typedef int FilePointer;
      typedef int Length;
      struct Data {
          int length;
          char buffer[MAX];
      };
      struct writeargs {
          FileIdentifier f;
          FilePointer position;
          Data data;
      };
      struct readargs {
          FileIdentifier f;
          FilePointer position;
          Length length;
      };
      program FILEREADWRITE {
          version VERSION {
              void WRITE(writeargs) = 1;
              Data READ(readargs) = 2;
          } = 2;
      } = 9999;

  35. Sun RPC • The interface compiler rpcgen generates the following from the interface definition: • client stub procedures • server main procedure, dispatcher, and server stub procedures • XDR marshalling and unmarshalling procedures used by the dispatcher and by the client and server stub procedures • Binding: • the port mapper records program number, version number, and port number. • If there are multiple instances running on different machines, clients make multicast remote procedure calls by broadcasting them to all the port mappers.

  36. RPC Interface Compiler

  37. Example (Sun RPC) • long sum(long) example • client localhost 10 • result: 55 • Need RPC specification file (sum.x) • defines procedure name, arguments & results • Run (interface compiler) rpcgen sum.x • generates sum.h, sum_clnt.c, sum_xdr.c, sum_svc.c • sum_clnt.c & sum_svc.c: Stub routines for client & server • sum_xdr.c: XDR (External Data Representation) code takes care of data type conversions

  38. RPC XDR File (sum.x)

      struct sum_in {
          long arg1;
      };
      struct sum_out {
          long res1;
      };
      program SUM_PROG {
          version SUM_VERS {
              sum_out SUMPROC(sum_in) = 1;   /* procedure number = 1 */
          } = 1;                             /* version number = 1 */
      } = 0x32123000;                        /* program number */

  39. Example (Sun RPC) • Program-number is usually assigned as follows: • 0x00000000 - 0x1fffffff defined by SUN • 0x20000000 - 0x3fffffff defined by user • 0x40000000 - 0x5fffffff transient • 0x60000000 - 0xffffffff reserved
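The ranges on this slide can be checked mechanically. The C sketch below classifies a program number according to them; the enum and function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

enum prog_range { PROG_SUN, PROG_USER, PROG_TRANSIENT, PROG_RESERVED };

/* Classify an RPC program number by the ranges listed above. */
enum prog_range classify_program(uint32_t prog)
{
    if (prog <= 0x1fffffffu) return PROG_SUN;
    if (prog <= 0x3fffffffu) return PROG_USER;
    if (prog <= 0x5fffffffu) return PROG_TRANSIENT;
    assert(prog >= 0x60000000u);   /* everything left is reserved */
    return PROG_RESERVED;
}
```

The program number 0x32123000 used in sum.x falls in the user-defined range, which is why it is a safe choice for an example service.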

  40. RPC Client Code (rsum.c)

      #include <stdio.h>
      #include <stdlib.h>
      #include "sum.h"

      int main(int argc, char *argv[])
      {
          CLIENT *cl;
          sum_in in;
          sum_out *outp;

          if (argc != 3) {
              fprintf(stderr, "usage: %s <host> <n>\n", argv[0]);
              exit(1);
          }
          /* create RPC client handle; need to know server's address */
          cl = clnt_create(argv[1], SUM_PROG, SUM_VERS, "tcp");
          if (cl == NULL) {
              clnt_pcreateerror(argv[1]);
              exit(1);
          }
          in.arg1 = atol(argv[2]);   /* n: the server returns 1 + 2 + ... + n */
          /* call RPC; rpcgen names the client stub <procedure>_<version> */
          if ((outp = sumproc_1(&in, cl)) == NULL) {
              clnt_perror(cl, argv[1]);
              exit(1);
          }
          printf("result: %ld\n", outp->res1);
          return 0;
      }

  41. RPC Server Code (sum_serv.c)

      #include "sum.h"

      /* the server function has a different name than the client call */
      sum_out *sumproc_1_svc(sum_in *inp, struct svc_req *rqstp)
      {
          /* why static? the server stub reads the result after this
             function has returned, so it must outlive the call */
          static sum_out out;
          long i;

          out.res1 = inp->arg1;
          for (i = inp->arg1 - 1; i > 0; i--)
              out.res1 += i;
          return &out;
      }
      /* the server's main() is generated by rpcgen */

  42. Compilation and Linking

      rpcgen sum.x
      cc -c rsum.c -o rsum.o
      cc -c sum_clnt.c -o sum_clnt.o
      cc -c sum_xdr.c -o sum_xdr.o
      cc -o client rsum.o sum_clnt.o sum_xdr.o
      cc -c sum_serv.c -o sum_serv.o
      cc -c sum_svc.c -o sum_svc.o
      cc -o server sum_serv.o sum_svc.o sum_xdr.o

  43. Internal Details of Sun RPC • Initialization • Server runs: it registers the RPC program with the port mapper on the server host (check with rpcinfo -p) • Client runs: clnt_create contacts the server's port mapper and establishes a TCP connection with the server (or a UDP socket) • Client • The client calls a local procedure (the client stub sumproc_1, generated by rpcgen). The client stub packages the arguments, puts them in standard format (XDR), and prepares network messages (marshaling). • Network messages are sent to the remote system by the client stub. • Network transfer is accomplished with TCP or UDP.

  44. Internal Details of Sun RPC • Server • Server stub (generated by rpcgen) unmarshals arguments from network messages. Server stub executes local procedure (sumproc_1_svc) passing arguments received from network messages. • When server procedure is finished, it returns to server stub with return values. • Server stub converts return values (XDR), marshals them into network messages, and sends them back to client • Back to Client • Client stub reads network messages from kernel • Client stub returns results to client function

  45. Details of RPC

  46. SunRPC Header Format • [Figure: request and reply headers, each 32 bits wide] • Request header: XID, MsgType = CALL, RPCVersion = 2, Program, Version, Procedure, Credentials (variable), Verifier (variable), Data • Reply header: XID, MsgType = REPLY, Status = ACCEPTED, Data • XID (transaction id) is similar to CHAN's MID • Server does not remember last XID it serviced • Problem if client retransmits request while reply is in transit
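The problem noted on this slide arises because a retransmitted request is executed a second time. The C sketch below shows, purely as an illustration, what remembering the last XID would look like on the server: a duplicate request gets the cached reply instead of a second execution. This is not how SunRPC behaves according to the slide; the struct, the function names, and the single-entry cache are all assumptions for this example, and the procedure is the sum from the earlier example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct server_state {
    bool     have_last;    /* has any request been serviced yet? */
    uint32_t last_xid;     /* XID of the most recent request */
    long     last_reply;   /* reply sent for that request */
    unsigned executions;   /* how many times the procedure really ran */
};

/* The remote procedure: 1 + 2 + ... + n. */
static long sum_to(long n)
{
    long total = 0;
    while (n > 0)
        total += n--;
    return total;
}

/* Handle one request; a repeated XID replays the cached reply. */
long serve(struct server_state *s, uint32_t xid, long arg)
{
    assert(s != NULL);
    if (s->have_last && s->last_xid == xid)
        return s->last_reply;        /* retransmission: do not rerun */
    s->executions++;
    s->last_reply = sum_to(arg);
    s->last_xid = xid;
    s->have_last = true;
    return s->last_reply;
}
```

For an idempotent procedure such as the sum, running twice is harmless; the cache matters for procedures with side effects, such as a file write.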