Application Level Protocol Design • atomic units used by protocol: "messages" • encoding • reusable, protocol-independent TCP server • LinePrinting protocol implementation
Protocol Definition • set of rules governing the communication details between two parties (processes) • different forms and levels: • protocols for exchanging bits across a wire • protocols governing administration of supercomputers. • application level protocols - define interaction between computer applications
Protocol Communication Rules • syntax: how do we phrase the information we exchange. • semantics: what actions/response for information received. • synchronization: whose turn it is to speak (given the above defined semantics).
Protocols Skeleton • all protocols follow a simple skeleton. • exchange information using messages, which define the syntax. • difference between protocols: syntax used for messages, and semantics of protocol.
Protocol Initialization (hand-shake) • communication begins when one party sends an initiation message to the other party. • synchronization: each party sends one message in a round robin fashion.
TCP 3-Way Handshake • establish / tear down TCP socket connections • computers attempting to communicate can negotiate a network TCP socket connection • both ends can initiate and negotiate separate TCP socket connections at the same time
A sends a SYNchronize packet to B • B receives A's SYN • B sends a SYNchronize-ACKnowledgement • A receives B's SYN-ACK • A sends ACKnowledge • B receives ACK. • TCP socket connection is ESTABLISHED.
HTTP (Hyper Text Transfer Protocol) • exchanging special text files over the network. • brief (not complete) protocol description: • synchronization: client initiates connection, sends a single request, receives a reply from the server. • syntax: text based, see RFC 2616. • semantics: server either sends the client the page asked for, or returns an error.
What next? • syntax and semantics aspects of protocols. • assume: synchronization works in round robin, i.e., each party sends one message at a time.
Message Format • Protocol syntax: message is the atomic unit of data exchanged throughout the protocol. • message = letter • concentrate on the delivery mechanism.
Framing • streaming protocols - TCP • need to separate different messages • all messages are sent on the same stream, one after the other • receiver should distinguish between different messages. • Solution: message framing - taking the content of the message, and encapsulating it in a frame (letter - envelope).
Framing – what is it good for? • sender and receiver agree on the framing method beforehand • framing is part of the message format/protocol • enables the receiver to discover where a message starts/ends in a stream of bytes
Framing – how? • Simple framing protocol for strings: • special FRAMING character (e.g., a line break). • each message is framed by two FRAMING characters at beginning and end. • message will not contain a FRAMING character • framing protocol by adding a special tag at start and end. • message can be framed using <begin> / <end> strings. • avoid having <begin> / <end> in message body.
Framing – how? • framing protocol by employing a variable length message format • special tag to mark start of a frame • message contains information on message's length
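The variable length format can be illustrated with a minimal sketch. It assumes a 4-byte big-endian length prefix and, for brevity, omits the special start-of-frame tag mentioned above; the class and method names (LengthPrefixedFraming, writeFrame, readFrame) are hypothetical and not part of the course code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: each frame is [4-byte big-endian length][payload bytes].
class LengthPrefixedFraming {

    // Writes one frame: the payload length first, then the payload itself.
    static void writeFrame(DataOutputStream out, byte[] payload) throws IOException {
        out.writeInt(payload.length);
        out.write(payload);
        out.flush();
    }

    // Reads one frame; readFully blocks until the whole payload has arrived.
    static byte[] readFrame(DataInputStream in) throws IOException {
        int length = in.readInt();
        if (length < 0) {
            throw new IOException("negative frame length: " + length);
        }
        byte[] payload = new byte[length];
        in.readFully(payload);
        return payload;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        writeFrame(out, "hello".getBytes(StandardCharsets.UTF_8));
        // Payload bytes that would confuse a delimiter-based scheme are harmless here.
        writeFrame(out, new byte[] {0, 10, 0});

        DataInputStream in = new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()));
        System.out.println(new String(readFrame(in), StandardCharsets.UTF_8));
        System.out.println(readFrame(in).length);
    }
}
```

Because the receiver knows the length in advance, the payload may contain any byte value, which is why this style of framing suits binary data. A real server would probably also cap the accepted length to avoid huge allocations.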
Textual data • Many protocols exchange data in textual form • strings of characters, in some character encoding (e.g., UTF-8) • very easy to document/debug - print messages • Limitation: difficult to send non-textual data. • how do we send a picture? video? audio file?
Binary Data • non-textual data is called binary data. • all data is eventually encoded in "binary" format, as a sequence of bits • "binary data" = data that cannot be encoded as a readable string of characters?
Binary Data • Sending binary data in raw binary format in a stream protocol is dangerous. • it may contain any byte sequence, and so may corrupt the framing protocol. • one remedy: devise a variable length message format.
Base64 Encoding Binary Data • encode binary data using an encoding algorithm • Base64 encoding - encodes binary data into a string • converts every 3-byte sequence of the binary data into 4 ASCII characters. • used by many "standard" protocols (e.g., email, to encode file attachments of any type of data).
Encoding using Poco • In C++, the Poco library includes a module for encoding/decoding byte arrays into/from Base64 encoded ASCII data. • functionality is modeled as a stream "filter" • performs encode/decode on all data flowing through the stream • classes Base64Encoder / Base64Decoder.
Encoding in Java • iharder library. • modeled as stream filters (wrappers around Input/Output Java streams).
Encoding binary data • advantage: any stream of bytes can be "framed" as ASCII data regardless of character encoding used by protocol. • disadvantage: size of the message increases by about 33% (4 output characters for every 3 input bytes). • (we will use UTF-8 encoding scheme)
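As a small illustration of the encoding step, the sketch below uses the JDK's built-in java.util.Base64 (Java 8 and later) rather than the iharder library named above; that substitution is an assumption made to keep the example dependency-free. Base64 maps each 3-byte group to 4 ASCII characters, so the encoded text never contains a '\0' or line-break byte that could collide with a delimiter. The class name Base64Demo is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical demo class; uses the JDK encoder, not the iharder library.
class Base64Demo {

    // Encodes arbitrary bytes into an ASCII-only string.
    static String encode(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Decodes a Base64 string back into the original bytes.
    static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        // Bytes that are not printable text and include a 0 byte.
        byte[] raw = {0, (byte) 0xFF, 10};
        String text = encode(raw);
        System.out.println(text);
        System.out.println(Arrays.equals(raw, decode(text)));

        // Size overhead: 300 input bytes become 400 output characters.
        System.out.println(encode(new byte[300]).length());

        // The encoded form is plain ASCII, so it can travel inside a UTF-8 text message.
        byte[] onTheWire = text.getBytes(StandardCharsets.UTF_8);
        System.out.println(onTheWire.length == text.length());
    }
}
```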
Protocol and Server Separation • code reuse is one of our design goals! • generic implementation of server, which handles all communication details • generic protocol interface: • handles incoming messages • implements protocol's semantics • generates the reply messages.
Protocol-Server Separation: protocol object • protocol object is in charge of implementing the expected behavior of our server: • what actions should be performed upon the arrival of a request. • requests may be related to one another, meaning the protocol should save an appropriate state per client.
Example: authenticated session • some protocols require user authentication (login), • only authorized users can perform certain actions. • protocol is stateful - serving requests of a client can be in at least 2 distinct states: • authenticated (user has already logged in) • non-authenticated (user has not provided login). • the behavior of the protocol object depends on its state
Protocol and Server Separation • separate the different tasks the server must perform. • Accept new connections from new clients. • Receive new bytes from connected clients. • Parse incoming bytes from clients into messages ("de-serialization" / "unframing"). • Dispatch message to the right method on the server side to execute the requested operation. • Send back an answer to a connected client after an action has been executed.
The key participants in this architecture are: • Tokenizer - syntax, tokenizing a stream of data into messages. • MessagingProtocol - semantics, handling received messages and generating responses.
implementations of interfaces: • generic server • MessageTokenizer • LinePrintingProtocol
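The two-state behavior can be sketched as follows. This is only an illustration under assumptions: the "LOGIN" command, the reply strings, and the class name AuthProtocolSketch are all hypothetical, and the sketch accepts any login instead of checking credentials.

```java
// Hypothetical stateful protocol object: one instance per connected client.
class AuthProtocolSketch {

    // Per-client state: has this client logged in yet?
    private boolean authenticated = false;

    // Handles one message; the reply depends on the current state.
    String processMessage(String message) {
        if (message.startsWith("LOGIN ")) {
            // Sketch only: a real protocol would verify the credentials here.
            authenticated = true;
            return "OK";
        }
        if (!authenticated) {
            return "ERROR login required";
        }
        return "DONE " + message;
    }

    public static void main(String[] args) {
        AuthProtocolSketch protocol = new AuthProtocolSketch();
        System.out.println(protocol.processMessage("PRINT report"));
        System.out.println(protocol.processMessage("LOGIN alice"));
        System.out.println(protocol.processMessage("PRINT report"));
    }
}
```

The same request ("PRINT report") gets two different replies, which is exactly why the server needs a separate protocol object, with its own state, for each client.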
Interfaces • implement separation between protocol and server. Define: • message (can be encoded in various ways: Base64, XML, text). • Our messages encoded as plain UTF-8 text. • framing of messages - delimiters between messages sent in stream. • protocol interface which handles each individual message.
ConnectionHandler • server accepted a new connection from a client. • server creates a ConnectionHandler - it will handle all incoming messages from this client. • ConnectionHandler - maintains the state of the connection for a specific client • Ex: user performs "login" - the ConnectionHandler object remembers this in its state
ConnectionHandler - Socket • ConnectionHandler has access to the Socket connecting the server to the client process. • TCP server - Socket connection is viewed as a pair of InputStream and OutputStream. • streams of bytes – client and server exchange a bunch of bytes.
Tokenizer - in charge of parsing a stream of bytes into a stream of messages • Tokenizer interface: filter between the Socket input stream and the protocol • Protocol accesses the input stream only through the tokenizer. • instead of "seeing" a stream of bytes, it sees a stream of messages. • Many libraries model such "filters" on streams as wrappers around a lower-level stream. • OutputStreamWriter - wraps an output stream and encodes the characters written to it into bytes using a given character encoding • BufferedReader - adds a layer of buffering around an unbuffered character reader.
Tokenizer • splits incoming bytes from the socket into messages. • For simplicity, we model the Tokenizer as an iterator… • protocol will see the input stream from the socket as an iterator over messages (instead of an iterator over bytes).
Messaging Protocol • protocol interface • wraps together: socket and Tokenizer • pass incoming messages to the MessagingProtocol - it executes the action requested by the client. • looks at the message and decides on an action • decision may depend on the state • once the action is performed, the answer comes back from the MessagingProtocol.
We use a String to pass data from Tokenizer to Protocol, and back from Protocol. • Serialization/Deserialization (encode/decode parameters to/from Strings) performed by Protocol - and not by the Tokenizer. • Tokenizer is only in charge of deframing (split bytes into messages).
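A minimal sketch of this division of labor: the protocol receives an already deframed String and decodes the parameters itself. The method names processMessage and shouldClose, the "SUM" and "BYE" commands, and the SumProtocol class are hypothetical, since the text names the MessagingProtocol type but not its methods.

```java
// Hypothetical interface shape: one deframed message in, one reply out.
interface MessagingProtocol {
    String processMessage(String message);
    boolean shouldClose();
}

// Hypothetical protocol: "SUM 2 3" is decoded into numbers here, not in the tokenizer.
class SumProtocol implements MessagingProtocol {

    private boolean closed = false;

    @Override
    public String processMessage(String message) {
        // Deserialization: split the message string into a command and its parameters.
        String[] parts = message.trim().split("\\s+");
        if (parts[0].equals("BYE")) {
            closed = true;
            return "BYE";
        }
        if (!parts[0].equals("SUM")) {
            return "ERROR unknown command";
        }
        try {
            long sum = 0;
            for (int i = 1; i < parts.length; i++) {
                sum += Long.parseLong(parts[i]);
            }
            // Serialization: the numeric result goes back out as a String.
            return Long.toString(sum);
        } catch (NumberFormatException e) {
            return "ERROR bad number";
        }
    }

    @Override
    public boolean shouldClose() {
        return closed;
    }

    public static void main(String[] args) {
        MessagingProtocol protocol = new SumProtocol();
        System.out.println(protocol.processMessage("SUM 2 3"));
        System.out.println(protocol.processMessage("BYE"));
    }
}
```

Keeping parameter decoding inside the protocol means the same tokenizer can serve any protocol whose messages are delimited the same way.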
Connection Handler • active object: • handles one connection to one client for the whole period during which the client is connected • (from the moment the connection is accepted, until one of the sides decides to close it). • modeled as a Runnable class.
Connection Handler • holds references to: • TCP socket connected to the client, • Tokenizer • an instance of the MessagingProtocol.
connection handler is generic, works with any implementation of a messaging protocol. • assumes data exchanged between client and server is in form of encoded strings • encoder passed to constructor as an Encoder interface.
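The pieces above can be put together in a rough sketch of a generic connection handler. Several things here are assumptions rather than the course implementation: the interface methods are hypothetical, the Tokenizer is modeled as an Iterator over Strings, and the Encoder interface is replaced by a fixed UTF-8 writer to keep the example short.

```java
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

// Hypothetical interface shapes; only the type names come from the text.
interface Tokenizer extends Iterator<String> { }

interface MessagingProtocol {
    String processMessage(String message);
    boolean shouldClose();
}

// Active object: serves one client from accept until either side closes.
class ConnectionHandler implements Runnable {

    private final Socket socket;
    private final Tokenizer tokenizer;
    private final MessagingProtocol protocol;
    private final char delimiter;

    ConnectionHandler(Socket socket, Tokenizer tokenizer,
                      MessagingProtocol protocol, char delimiter) {
        this.socket = socket;
        this.tokenizer = tokenizer;
        this.protocol = protocol;
        this.delimiter = delimiter;
    }

    @Override
    public void run() {
        // try-with-resources closes the socket however the loop ends.
        try (Socket s = socket;
             Writer out = new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8)) {
            // The loop ends when the protocol asks to close or the client stops sending.
            while (!protocol.shouldClose() && tokenizer.hasNext()) {
                String reply = protocol.processMessage(tokenizer.next());
                if (reply != null) {
                    out.write(reply);
                    out.write(delimiter);  // frame the reply the same way as requests
                    out.flush();
                }
            }
        } catch (IOException e) {
            System.err.println("connection error: " + e.getMessage());
        }
    }
}
```

A server would typically create one such handler per accepted socket and run it on its own thread, for example with new Thread(handler).start(). The handler never inspects message contents, which is what lets it work with any protocol implementation.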
What’s left? • only need to implement: • specific framing handler (tokenizer) • specific protocol we wish to use. • continue our line printing example…
Message Tokenizer • we use a framing method based on a single character delimiter. • assume a stream of messages delimited by a FRAMING character; we will use the character '\0'
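A minimal sketch of such a tokenizer, modeled as an iterator over messages. It assumes the bytes have already been wrapped in a character Reader (for a socket, an InputStreamReader with UTF-8), and that an unterminated trailing message is simply dropped; both choices, and the internals of the class, are illustrative assumptions.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Sketch of a delimiter-based tokenizer: splits a character stream on '\0'.
class MessageTokenizer implements Iterator<String> {

    private static final char FRAMING = '\0';

    private final Reader reader;
    private String pending;  // next complete message, or null if none is buffered

    MessageTokenizer(Reader reader) {
        this.reader = reader;
    }

    // Blocks until a full message arrives or the stream ends.
    @Override
    public boolean hasNext() {
        if (pending == null) {
            pending = readMessage();
        }
        return pending != null;
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        String message = pending;
        pending = null;
        return message;
    }

    // Reads characters up to the next delimiter; returns null at end of stream.
    private String readMessage() {
        StringBuilder sb = new StringBuilder();
        try {
            int c;
            while ((c = reader.read()) != -1) {
                if (c == FRAMING) {
                    return sb.toString();
                }
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return null;  // stream ended; an unterminated partial message is dropped
    }

    public static void main(String[] args) {
        MessageTokenizer tokenizer = new MessageTokenizer(new StringReader("first\0second\0"));
        while (tokenizer.hasNext()) {
            System.out.println(tokenizer.next());
        }
    }
}
```

With a socket, the reader would be constructed as new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8). This only works if message bodies never contain '\0', which is the restriction delimiter-based framing always carries.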
important part is connection termination and exception handling at any moment • most of the code in low-level input/output and socket manipulation relates to error handling and connection termination.