The Application Layer WWW

keefe

Presentation Transcript


  1. The Application Layer: WWW. Chapter 7

  2. WWW: HTTP The HyperText Transfer Protocol (HTTP) transfers pages between a client and a server. It is stateless: the server maintains no information about the client. • Communication consists of pairs of one request and one response • Originally a separate TCP connection was established and closed for each pair • Now several pairs can share one connection (a persistent connection) • This gives less overhead and better tuning of TCP's self-learning parameters

  3. HTTP request message • Requests are in ASCII text, e.g. • GET /somedir/page.html HTTP/1.1 • Host: www.cucg.gh (the server name; one IP address may host more than one server name) • Connection: close • User-agent: Mozilla/4.0 • Accept-language: fr
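As a minimal sketch of the message format on this slide, the request can be assembled and taken apart as plain ASCII text. The function names `build_request` and `parse_request` are illustrative assumptions, not part of any library.

```python
def build_request(method, path, headers):
    """Return an HTTP/1.1 request message as ASCII bytes."""
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Each line ends with CRLF; an empty line marks the end of the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def parse_request(raw):
    """Split a request message into request-line parts and a header dict."""
    head = raw.decode("ascii").split("\r\n\r\n", 1)[0]
    request_line, *header_lines = head.split("\r\n")
    method, path, version = request_line.split(" ")
    headers = dict(line.split(": ", 1) for line in header_lines)
    return method, path, version, headers


request = build_request("GET", "/somedir/page.html", {
    "Host": "www.cucg.gh",
    "Connection": "close",
    "User-agent": "Mozilla/4.0",
    "Accept-language": "fr",
})
print(request.decode("ascii"))
```

The sketch ignores message bodies and repeated headers; it only shows the request line, the header lines and the terminating empty line.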

  4. HTTP Methods • Conditional GET: a GET with an If-Modified-Since header • PUT, POST and DELETE allow changing a site using HTTP
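The conditional GET can be sketched from the server's point of view: if the page has not changed since the date the client supplies, the server answers 304 (Not Modified) without a body. The function name `conditional_get` and its arguments are assumptions made for this illustration.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_get(request_headers, last_modified, body):
    """Return (status, headers, body) for a GET that may carry If-Modified-Since.

    last_modified must be a timezone-aware datetime in UTC.
    """
    # HTTP dates have one-second resolution, so drop microseconds first.
    last_modified = last_modified.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    since = request_headers.get("If-Modified-Since")
    if since is not None and last_modified <= parsedate_to_datetime(since):
        return 304, headers, b""      # the client's cached copy is still valid
    return 200, headers, body         # send the full page


modified = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
status, headers, _ = conditional_get({}, modified, b"<html></html>")
print(status, headers["Last-Modified"])
```

A client that stores the `Last-Modified` value and sends it back as `If-Modified-Since` on its next request would receive status 304 as long as the page stays unchanged.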

  5. HTTP Message Headers The Accept headers tell the server what the client is willing to accept, in case the client has a limited repertoire of what it can handle. They also allow the server to send back a page in a certain language, if it has a choice.
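A minimal sketch of how a server might use an Accept-Language header to choose among the languages it has for a page. It handles only the common `value;q=weight` form; the function names and the sample languages are assumptions.

```python
def parse_accept(header):
    """Parse an Accept-style header into (value, quality) pairs, best first."""
    choices = []
    for item in header.split(","):
        value, _, params = item.strip().partition(";")
        params = params.strip()
        quality = float(params[2:]) if params.startswith("q=") else 1.0
        choices.append((value.strip(), quality))
    # sorted() is stable, so equal qualities keep the order the client gave.
    return sorted(choices, key=lambda choice: choice[1], reverse=True)


def pick_language(header, available, default):
    """Return the client's most preferred language that the server has."""
    for language, quality in parse_accept(header):
        if quality > 0 and language in available:
            return language
    return default


print(pick_language("fr, en;q=0.8", {"en", "nl"}, default="nl"))
```

Here the client prefers French but the server only has English and Dutch, so English is chosen.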

  6. Browser • A web page may contain HTML code, images in GIF or JPEG format, sound in MP3 format, video in MPEG format, documents in PDF, MS Word or other formats, or information in many other formats. • Some are handled directly by the browser. • Some are handled by a plug-in, a code module that the browser fetches from disk and installs as an extension to itself. • For others the browser starts up a helper application as a separate process.

  7. Client side actions • The user clicks in a browser on “http://www.cs.ru.nl/~ths/index.html”. • The steps that then occur are: • The browser determines the URL (by seeing what was selected) • The browser asks DNS for the IP address of www.cs.ru.nl • DNS answers with the IP address • The browser makes a TCP connection to that address on port 80 • It then sends a GET /~ths/index.html command • The www.cs.ru.nl server sends the file index.html • The TCP connection is released • The browser displays all the text in index.html • The browser fetches all images indicated in index.html, establishing a TCP connection for each of them, and displays them.
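The steps above, up to releasing the connection, can be sketched with the standard `socket` module. This is an illustration under assumptions: `build_get` and `fetch` are hypothetical helper names, the request uses a `Connection: close` header so the server ends the transfer, and rendering and image fetching are left out.

```python
import socket
from urllib.parse import urlsplit


def build_get(url):
    """Determine host, port and path from the URL and prepare the GET request."""
    parts = urlsplit(url)
    host, port = parts.hostname, parts.port or 80
    request = (f"GET {parts.path or '/'} HTTP/1.1\r\n"
               f"Host: {host}\r\n"
               "Connection: close\r\n\r\n")
    return host, port, request.encode("ascii")


def fetch(url, timeout=10.0):
    """Resolve the host, connect over TCP, send the GET, read until close."""
    host, port, request = build_get(url)
    address = socket.gethostbyname(host)            # ask DNS for the IP address
    with socket.create_connection((address, port), timeout=timeout) as conn:
        conn.sendall(request)                       # send the GET command
        chunks = []
        while chunk := conn.recv(4096):             # server sends the file
            chunks.append(chunk)
    return b"".join(chunks)                         # connection is released here


host, port, request = build_get("http://www.cs.ru.nl/~ths/index.html")
print(host, port)
```

Calling `fetch(...)` needs network access, and the page from the slide may no longer exist, so only `build_get` is exercised here.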

  8. URLs – Uniform Resource Locators A URL consists of 3 parts: a protocol, the DNS name of the host, and the file name.
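The three parts can be picked out with Python's `urllib.parse`, shown here as a small illustration using the URL from the previous slide.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://www.cs.ru.nl/~ths/index.html")
protocol, host, file_name = parts.scheme, parts.hostname, parts.path
print(protocol, host, file_name)
```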

  9. Server side actions • The server performs the following steps in its main loop: • Accept a TCP connection from a client. • Resolve the name of the page requested. • Authenticate the client if needed. • Perform access control on the client: can the requested page be sent, given the client's identity and location? • Perform access control on the web page: some pages may only be sent to clients in particular domains, e.g. inside the company. • Check whether the page is in the cache; otherwise get it from disk. • Determine the MIME type and include it in the header of the reply. • Perform other possible tasks, like building a user profile, gathering statistics or making an entry in a log file. • Return a reply, either the requested file or error information. • Release the TCP connection.
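Several of these steps can be sketched as one function that handles a single request. The network parts (accepting and releasing the TCP connection) and authentication are left out, and the dictionaries standing in for the disk, the cache and the access rules are assumptions of this sketch.

```python
import mimetypes


def handle_request(path, client_domain, documents, cache, restricted):
    """Return (status, headers, body) for one request.

    documents  maps paths to page bytes (stands in for the disk)
    cache      dict used as an in-memory page cache
    restricted maps paths to the only domain allowed to read them
    """
    if path == "/":                       # resolve the name of the page
        path = "/index.html"
    allowed = restricted.get(path)
    if allowed is not None and client_domain != allowed:
        return 403, {}, b"Forbidden"      # access control on the page
    if path not in cache:                 # check the cache, else go to "disk"
        if path not in documents:
            return 404, {}, b"Not Found"  # reply with error information
        cache[path] = documents[path]
    body = cache[path]
    # Determine the MIME type from the file name and put it in the header.
    mime_type = mimetypes.guess_type(path)[0] or "application/octet-stream"
    return 200, {"Content-Type": mime_type}, body


documents = {"/index.html": b"<html></html>", "/internal.html": b"<html></html>"}
restricted = {"/internal.html": "company.example"}
cache = {}
print(handle_request("/", "elsewhere.example", documents, cache, restricted))
```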

  10. Statelessness and Cookies • For newer applications the server wants to know more about the user requesting pages, e.g. to keep information between requests. • IP addresses are not suitable for that, because of dynamic IP addresses and NAT, and because there may be more than one user on a computer. • When a client requests a page, the server may send a “cookie” in the reply header: a small text string of at most 4 KB. • Browsers may accept it. When the browser later sends a request, it checks whether it has cookies for the domain the request is for. It includes them in the request so the server can use them.
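The exchange can be sketched with the standard `http.cookies` module: the server builds a Set-Cookie value, and the browser stores it per domain and replays it on later requests. The class `TinyCookieJar`, the function `set_cookie_header` and the cookie name `session` are assumptions made for this sketch; expiry, paths and security attributes are ignored.

```python
from http.cookies import SimpleCookie


def set_cookie_header(name, value):
    """Server side: build the value of a Set-Cookie reply header."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = "/"
    return cookie[name].OutputString()


class TinyCookieJar:
    """Browser side: remember cookies per domain and replay them."""

    def __init__(self):
        self.by_domain = {}

    def store(self, domain, set_cookie_value):
        received = SimpleCookie(set_cookie_value)
        jar = self.by_domain.setdefault(domain, {})
        for name, morsel in received.items():
            jar[name] = morsel.value

    def cookie_header(self, domain):
        """Value of the Cookie request header for this domain ('' if none)."""
        jar = self.by_domain.get(domain, {})
        return "; ".join(f"{name}={value}" for name, value in jar.items())


browser = TinyCookieJar()
browser.store("www.ru.nl", set_cookie_header("session", "abc123"))
print(browser.cookie_header("www.ru.nl"))
```

Requests to any other domain get an empty Cookie header, which is what keeps one site's cookies from being sent to another.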

  11. HTML – HyperText Markup Language Because the markup commands are embedded within each HTML file, a browser can reformat any web page. A web page can be shown full screen on a 1024 x 768 display with 24-bit color, but also in a small window on a 640 x 480 screen with 8-bit color. The designer of a web page can indicate how the page is best displayed, but the client can override these settings. This is in contrast to Adobe Acrobat and Flash. HTML is an application of SGML (Standard Generalized Markup Language); XHTML uses XML (Extensible Markup Language).

  12. Cascading Style Sheets HTML is constantly changing. Version 1.0 was the de facto standard used in the Mosaic browser. When new browsers came along, version 2.0 became an Internet standard. Version 3.0 added many new features, including tables, toolbars and cascading style sheets. Style sheets give page designers more control over the desired appearance of pages in browsers. The semantics of a text are defined in the HTML file, while a style sheet defines the appearance:
      h1 { color: #FF0000; }
      h2 { color: #0000FF; }
      body { color: #000000; background: #ffffff; }
      .red { color: #FF0000; }
      e.g. <p class="red">…</p>
      HTML 4.01 is now the current version.

  13. XML XML (eXtensible Markup Language) describes Web content in a structured way. On the left, a structure called book_list is defined: a list of books, each having 3 fields. The structure could have repeated fields (e.g. multiple authors), optional fields (e.g. the title of an included CD-ROM) and alternative fields.
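The figure with the book_list structure is not part of this transcript, so the sketch below uses a stand-in document. The field names `title`, `author` and `year` and all the values are assumptions; only the name `book_list` comes from the slide. It shows how such structured content, including a repeated field, can be read with Python's `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the slide's figure: field names are assumed.
DOCUMENT = """<book_list>
  <book>
    <title>Example Title One</title>
    <author>A. Author</author>
    <author>B. Author</author>
    <year>2001</year>
  </book>
  <book>
    <title>Example Title Two</title>
    <author>C. Author</author>
    <year>2002</year>
  </book>
</book_list>"""

root = ET.fromstring(DOCUMENT)
books = [
    {
        "title": book.findtext("title"),
        "authors": [author.text for author in book.findall("author")],
        "year": int(book.findtext("year")),
    }
    for book in root.findall("book")
]
for book in books:
    print(book["title"], book["authors"], book["year"])
```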

  14. XSL How the XML page is to be formatted and displayed on a screen is determined by an XSL (eXtensible Stylesheet Language) file. It looks like HTML but has stricter syntax requirements: a browser should reject it if, for instance, a closing tag like </th> is missing. XSL commands are given with an xsl tag, like <xsl:xxxx>. The for-each command iterates over the given structure, the list of books. XHTML (X from eXtensible) is essentially HTML 4 reformulated in XML. It needs an XSL file to provide display meaning to its tags. Strict conformance to the syntax is required: closing tags, tags and attributes in lower case, attributes in quotation marks and proper nesting of tags.
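The stricter syntax can be illustrated with an XML parser, which rejects markup that a lenient HTML browser would accept. This sketch checks well-formedness only (closing tags, quoted attributes, proper nesting); it does not check the lower-case rule, and the helper name `is_well_formed` is an assumption.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


print(is_well_formed("<table><tr><th>Title</th></tr></table>"))  # closing tags present
print(is_well_formed("<table><tr><th>Title</tr></table>"))       # </th> is missing
print(is_well_formed("<p class=red>text</p>"))                   # attribute not quoted
```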

  15. Forms for interaction Input is returned in a string added to the URL: http://www.ru.nl/cgi-bin/query?name=jan&city=a… A + indicates a space, a %2B indicates a typed-in +, etc. On the server, CGI (Common Gateway Interface) starts the script (or program) 'query' with the string after the ? as its parameter. The script does its work, e.g. searching a database, and returns its result as an HTML page.
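The encoding of form input can be reproduced with `urllib.parse`. The field values `den haag` and `1+1` are made up for this illustration, chosen to show how a space and a typed-in + are encoded and then decoded again on the server side.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Browser side: encode the form fields and append them to the URL.
fields = {"name": "jan", "city": "den haag", "sum": "1+1"}
query = urlencode(fields)
url = "http://www.ru.nl/cgi-bin/query?" + query
print(url)

# Server side: the script receives the string after the ? and decodes it.
decoded = parse_qs(urlsplit(url).query)
print(decoded)
```

`parse_qs` returns a list per field because a form may submit the same field name more than once.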

  16. Server-side Dynamic Web Pages Another way to generate dynamic content is to embed little scripts inside HTML pages, to be executed by the server to generate the page. A popular language for this is PHP (PHP: Hypertext Preprocessor). To use it the server has to understand PHP; pages containing PHP usually have the file extension 'php' rather than 'html' or 'htm'. JSP (Java Server Pages) is similar to PHP, except that the dynamic part is written in the Java programming language. ASP (Active Server Pages) is Microsoft's version, using Visual Basic Script for generating the dynamic content.

  17. with PHP The PHP commands are enclosed in the tag <?php ... ?>. At the top is a form with 2 entry fields. Below is the 'action.php' file with the PHP commands. They have access to the information filled in on the form using the names of the fields, e.g. $age. They produce a text string which is included in the output sent to the client. PHP is a powerful programming language oriented towards interfacing between the Web and a server database. It is open source and freely available, and specifically designed to work well with Apache, which is also open source and is the world's most widely used Web server.
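The form and the 'action.php' file from the slide are not included in this transcript. As a rough Python analogue (not PHP, and not the slide's actual script), the sketch below shows the same idea: the server fills values from the submitted form fields into the page it sends back. The field name `name`, the page text and the function `render_action` are assumptions; only the field `age` is mentioned on the slide.

```python
from html import escape
from string import Template

# Hypothetical reply page; $name and $next_age are filled in per request.
PAGE = Template(
    "<html><body><h1>Reply:</h1>"
    "<p>Hello $name. Prediction: next year you will be $next_age.</p>"
    "</body></html>"
)


def render_action(form):
    """Build the reply page from the submitted form fields 'name' and 'age'."""
    return PAGE.substitute(
        name=escape(form["name"]),          # escape user input before embedding
        next_age=int(form["age"]) + 1,
    )


print(render_action({"name": "Jan", "age": "24"}))
```

Escaping the submitted text before placing it in the page prevents a visitor from injecting markup of their own into the reply.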

  18. Client-Side Dynamic Web Pages Here a program contained in a web page is executed by the browser and the result is displayed. No information is sent to the server. JavaScript can be used for this, a scripting language very loosely inspired by some ideas from Java. It is a full-blown programming language, with variables, strings, arrays, objects, functions, and all the usual control structures. Another way to make web pages highly interactive is through the use of applets. These are small Java programs embedded with the 'applet' tag and executed by a Java Virtual Machine. As they are interpreted, the interpreter can prevent them from doing Bad Things. In theory at least; in practice many bugs were found. Microsoft's answer to Sun's applets was to allow web pages to hold ActiveX controls. They are faster than applets, but only run on Windows machines.

  19. Client-Side JavaScript JavaScript has the ability to manage windows and frames, set and get cookies, deal with forms and handle hyperlinks. As these things are rather internal to browsers, and often differ between browsers and versions, it is difficult to write JavaScript programs which work correctly for all browsers, versions and platforms. It can also track mouse movements and actions: for example, when the mouse is over a link, a window with a certain image is displayed. It is embedded in an HTML page using the 'script' tag or inline at certain locations.

  20. Client-Server overview Cascading Style Sheets are part of HTML. Plug-ins or helpers can display other content, such as PostScript, PDF, video, sound and images, e.g. SVG (Scalable Vector Graphics). CGI scripts can be written in various languages: Perl, Python, C, etc.
