Chapter 3
Chapter 3 Web Technology
Web Publishing • Static documents • HTML, ASCII text, Postscript, PDF • GIF, JPEG, MOV, Quicktime, AVI • AU, WAV, MP3, RealAudio • Dynamic documents • executable content • Java, Javascript, Active-X, Dynamic HTML • Variable documents • Dynamically-generated (on-the-fly) • CGI, FastCGI, JSP, PHP
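A variable document is produced by a program at request time rather than read from a file. The following is a minimal Python sketch of what a CGI-style program does; the function name `generate_page` and the query parameter `name` are assumptions made for this illustration only.

```python
from datetime import datetime, timezone
from html import escape
from urllib.parse import parse_qs


def generate_page(query_string: str, now: datetime) -> str:
    """Build a CGI-style response (header block + HTML body) on the fly."""
    params = parse_qs(query_string)
    # "name" is a hypothetical query parameter chosen for this example.
    name = params.get("name", ["world"])[0]
    body = (
        "<HTML><HEAD><TITLE>Greeting</TITLE></HEAD>"
        f"<BODY>Hello, {escape(name)}! Generated at {now.isoformat()}</BODY></HTML>"
    )
    # A CGI program writes the headers, an empty line, then the document.
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    print(generate_page("name=hugo", datetime.now(timezone.utc)))
```

The user-supplied value is passed through `escape` before it is placed in the page, so markup in the query string is not interpreted by the browser.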
HyperText Markup Language • Document structure description • (sub-)sections • headings • tables • No (?) layout information • style sheets • font mapping • Defined in SGML / XML (XHTML) • Document Type Definition (DTD)
HTML Basic Elements • <HTML> • <HEAD> • <TITLE>Hello World</TITLE> • </HEAD> • <BODY> • Oh, what a beautiful morning.... • </BODY> • </HTML>
HTML Elements • physical text styles • logical text styles • text segmentation • tables • inline pictures • anchors/links • forms • specials • image maps (client/server) • background pic/audio, marquees
HTML Development • Browser war (NS Navigator vs. MSIE) • World Wide Web Consortium (W3C) • rendering • internationalization • forms • active content • object models
HTML History • HTML 2.0 • 1st version conforming to SGML • Core HTML (lists, forms, headings, fonts, image maps, ...) • HTML 3.2 (no HTML 3.0 & HTML 3.1) • many elements get additional attributes • font sizes, font faces, colors • form-based file upload • client-side image maps • tables, APPLET tag • STYLE and SCRIPT tags (placeholders, no exact mechanisms)
HTML 3.2 • Script Tags • JavaScript, VBScript, JScript • eventless • APPLET tag • Styles (CSS) • ALIGNs generalized • client-side image-maps • FONT • formulas • FORM based file upload
HTML 4.0 • accessibility • labels, legend, clusters • iterators, access keys • character encodings, language codes (lang) • bidirectional text (dir) • incremental rendering • tables, images • objects (nested) • scripts and events • frames
HTML 4.0 (Frames) • TARGETs <A href="slide2.html" target="dynamic">slide 2.</A> • IFRAMEs <IFRAME src="foo.html" width="400" height="500" scrolling="auto" frameborder="1"> [Your user agent does not support frames] </IFRAME>
HTML 4.0 (Scripts) • client-side execution at • loading • user input (focus, pointer movement) • language selection • <META http-equiv="Content-Script-Type" content="type"> • <SCRIPT type="text/vbscript" src="http://someplace.com/progs/vbcalc"> • </SCRIPT> • evaluation order, object modification (DOM)
HTML 4.0 (Events) • Loading • onload, onunload • Pointer • onmousemove, onmouseover, ... • Keyboard • onkeypress, onkeydown, ... • Form handling • onsubmit, onreset, onselect, onchange • <INPUT NAME="userName" onblur="validUserName(this.value)">
HTML 4.01 • Common attributes for many elements: Id, Title, Style, Class • Improved tables (Thead, Tbody, Tfoot, column handling, formatting), Incremental rendering of tables • Improved forms • STYLE element now encloses style sheet instructions (CSS) • Style sheets: separation of content and styles (CSS)
HTML 4.01 • Frames, Embedded documents (IFRAME) • APPLET element deprecated (OBJECT) • Intrinsic events (onclick, ondblclick, onmouseover, onmouseout, ...) • OBJECT element (IMAGE, APPLET, IFRAME) • SCRIPT element (client-side scripting - JavaScript, VBScript, ...)
<OBJECT data="http://some.server.com/me.png" type="image/png"> A picture of me. </OBJECT> <OBJECT codetype="application/java" classid="java:AudioItem.start" codebase="http://some.server.com/java/AudioStuff/" width="15" height="15"> <PARAM name="sound" value="Welcome.au"> Play a welcoming sound. </OBJECT> <OBJECT data="embed_me.html"> Warning: embed_me.html could not be embedded. </OBJECT> OBJECT Element Examples
More OBJECT examples <OBJECT data="canyon.png" type="image/png"> This is a <EM>closeup</EM> of the Grand Canyon. </OBJECT> <OBJECT classid="http://www.miamachina.it/clock.py"> An animated clock. </OBJECT> <OBJECT data="embed_me.html"> Warning: embed_me.html could not be included. </OBJECT>
Client-side Scripting 1 • Languages: JavaScript, VBScript, Python, Tcl, ... • Security ?
<SCRIPT type="text/vbscript" src="http://someplace.com/vbcalc"> </SCRIPT> <INPUT NAME="userName" onblur="validUserName(this.value)"> <INPUT NAME="edit1" size="50"> <SCRIPT type="text/vbscript"> Sub edit1_changed() If edit1.value = "abc" Then button1.enabled = True Else button1.enabled = False End If End Sub </SCRIPT> Client-side Scripting 2
<INPUT NAME="num" onchange="if (!checkNum(this.value, 1, 10)) { this.focus(); this.select(); } else {thanks()}" VALUE="0"> <BUTTON type="button" name="mybutton" value="10"> <SCRIPT type="text/javascript"> function my_onclick() { ... } document.form.mybutton.onclick = my_onclick </SCRIPT> </BUTTON> Client-side Scripting 3
Cascading Style Sheets • Separation of content (HTML, XML documents) and presentation style (CSS) • simplified Web authoring • easier Web site maintenance • CSS vs. XSL • CSS was defined earlier • XSL is still a draft while CSS is already supported by browsers • XSL is more powerful => too complex for many users/applications
<HTML> <HEAD> <STYLE TYPE="text/css"> H1 { color: yellow; font-style: italic; font-family: helvetica } .verde { color: #00FF00 } </STYLE> </HEAD> <BODY> <H1>Heading 1</H1> <H2 class="verde">Heading 2</H2> <P STYLE="color: red"> paragraph text </P> yet another paragraph </BODY> </HTML> Heading 1 Heading 2 paragraph text yet another paragraph Style and HTML
<HTML> <HEAD> <TITLE>Document with Cascading Style Sheets</TITLE> <LINK rel="alternate stylesheet" title="compact" href="small-base.css" type="text/css"> <LINK rel="alternate stylesheet" title="compact" href="small-extras.css" type="text/css"> <LINK rel="alternate stylesheet" title="big print" href="bigprint.css" type="text/css"> <LINK rel="stylesheet" href="common.css" type="text/css"> </HEAD> <BODY> ... </BODY> </HTML> @import "fineprint.css" print; @import url("bluish.css") projection, tv; External CSS and Cascading
BODY { background: #def url("recbg.jpg"); color: black; margin: 0.5em; } A:link { color: #00ff00; } A:visited { color: #0000ff; } TABLE.navigation { border-style: none; } TD.navigation { background: black; text-align: center; font-weight: bold; font-family: helvetica, sans-serif; padding: 0.2em; vertical-align: top; } <HTML> <HEAD> <LINK rel="STYLESHEET" href="MyStyle.css" type="text/css"> </HEAD> <BODY> <TABLE class="navigation" COLS=4> <TR> <TD class="navigation"> <A HREF="...">Previous</A></TD> ... CSS by Example (1/3)
H1, H2, H3, H4, H5, H6 { font-family: helvetica, sans-serif; text-align: left; font-weight: normal; font-style: italic; } H1 { font-weight: bold; font-style: normal; } DIV.quote { text-align: right; font-style: italic; color: blue; } LI { list-style: square; } <H1>Heading 1</H1> <DIV class="quote"> He brewed a song of love and ... </DIV> <H2>Heading 2</H2> <H3>Heading 3</H3> <UL> <LI>item 1</LI> <LI>item 2</LI> </UL> CSS by Example (2/3)
DIV.listing { border: solid thin; margin-top: 0.5em; margin-right: 1.5em; margin-left: 1.5em; margin-bottom: 0.5em; background: white; } SPAN.ltitle { text-align: center; font-family: helvetica, sans-serif; font-style: normal; font-weight: bold; font-size: 100%; margin-top: 0.25em; } DIV.warning { border: solid thick; border-color: red; margin-top: 0.5em; margin-right: 1.5em; margin-left: 1.5em; margin-bottom: 0.5em; } CAPTION { text-align: center; color: #088; font-style: italic; font-weight: bold; font-size: 100%; } <DIV CLASS="listing"> <SPAN CLASS="ltitle">C shell</SPAN> <HR> <PRE> setenv CVSROOT /path/to/CVS cvs checkout Project </PRE> </DIV> <CAPTION>Figure 1: This ...</CAPTION> <DIV CLASS="warning"> This is a warning! </DIV> CSS by Example (3/3)
Extensible Markup Language • XML (1998) is an application of SGML • Standard Generalized Markup Language (1986): ISO 8879 • influenced by HTML (SGML Document Type Definition) • Structure description language • Meta-language: language to describe other languages • Tags enclose identifiable parts of a document • markup (type-setting systems)
XML Example <warning> <para>This substance is <emph>hazardous</emph> to health</para> <para>See procedure 12A.7 for information on protective clothing.</para> <image .../> </warning>
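To show how tags enclose identifiable parts of a document, here is a small Python sketch that parses the warning example with the standard library and lists its element tree. The slide elides the attributes of `<image .../>`; the `url="hazard.png"` value below is a hypothetical placeholder, not taken from the original.

```python
import xml.etree.ElementTree as ET

# The image URL is an assumed placeholder; the slide leaves it elided.
DOC = """<warning>
  <para>This substance is <emph>hazardous</emph> to health</para>
  <para>See procedure 12A.7 for information on protective clothing.</para>
  <image url="hazard.png"/>
</warning>"""


def outline(element, depth=0):
    """Return one indented line per element, in document order."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(outline(ET.fromstring(DOC))))
```

The nesting of the output mirrors the nesting of the tags: `emph` sits inside the first `para`, and both `para` elements and `image` sit inside `warning`.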
XML Documents [Figure: document hierarchy with the levels Document, Unit, Sub-unit]
<!ELEMENT warning (para*, image?)> <!ELEMENT para (#PCDATA | emph)*> <!ELEMENT image EMPTY> <!ATTLIST image url CDATA #REQUIRED> <!ELEMENT emph (#PCDATA)*> Document Type Definition • DTD defines the elements allowed • A parser compares the DTD rules against a given XML document => validation • XML DTDs can be applied for data-type definitions (XML-RPC), data exchange (EDI, push, RDBMS), etc.
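Validation means checking a document against the content models in the DTD. Python's standard library does not include a validating parser, so the sketch below checks by hand only two of the rules shown above: the content model `(para*, image?)` of `warning`, rewritten as a regular expression over child element names, and the `#REQUIRED` `url` attribute of `image`. The function name `warning_is_valid` is an assumption for this example, and a real application would use a validating parser instead.

```python
import re
import xml.etree.ElementTree as ET

# (para*, image?) expressed as a regex over space-terminated child names.
WARNING_MODEL = re.compile(r"(para )*(image )?")


def warning_is_valid(xml_text: str) -> bool:
    """Check a <warning> document against two rules of the DTD above."""
    root = ET.fromstring(xml_text)
    if root.tag != "warning":
        return False
    children = "".join(child.tag + " " for child in root)
    if not WARNING_MODEL.fullmatch(children):
        return False
    # <!ATTLIST image url CDATA #REQUIRED>
    return all("url" in image.attrib for image in root.findall("image"))


if __name__ == "__main__":
    print(warning_is_valid('<warning><para>a</para><image url="x.png"/></warning>'))
    print(warning_is_valid('<warning><image url="x.png"/><para>a</para></warning>'))
```

The second document is rejected because `image` appears before `para`, which the sequence in the content model does not allow. The mixed content model of `para` is not checked in this sketch.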
<warning> <para>This substance is <emph>hazardous</emph> to health</para> <para>See procedure 12A.7 for information on protective clothing.</para> <image .../> </warning> XML Document Presentation • Style sheets specify output format • 1 XML document, n alternative style sheets depending on audience, media, etc. WARNING: This substance is hazardous to health See procedure 12A.7 for information on protective clothing.
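The idea of one XML document with several alternative style sheets can be imitated in a few lines of Python. This is not CSS or XSL; it is a simplified sketch in which each "style sheet" is a dictionary mapping an element name to the text emitted before and after it. The names `PLAIN`, `HTML` and `render` are assumptions for this example, the elided `<image .../>` element is left out, and no escaping of special characters is performed.

```python
import xml.etree.ElementTree as ET

DOC = (
    "<warning>"
    "<para>This substance is <emph>hazardous</emph> to health</para>"
    "<para>See procedure 12A.7 for information on protective clothing.</para>"
    "</warning>"
)

# Each style maps an element name to (text before, text after).
PLAIN = {"warning": ("WARNING: ", ""), "para": ("", "\n"), "emph": ("*", "*")}
HTML = {
    "warning": ('<DIV CLASS="warning">', "</DIV>"),
    "para": ("<P>", "</P>"),
    "emph": ("<EM>", "</EM>"),
}


def render(element, style):
    """Render an element and its children using the given style mapping."""
    before, after = style.get(element.tag, ("", ""))
    parts = [before, element.text or ""]
    for child in element:
        parts.append(render(child, style))
        parts.append(child.tail or "")  # text following the child element
    parts.append(after)
    return "".join(parts)


if __name__ == "__main__":
    root = ET.fromstring(DOC)
    print(render(root, PLAIN))
    print(render(root, HTML))
```

The document is parsed once and rendered twice; only the style mapping changes between the plain-text and the HTML output.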
<?xml encoding="US-ASCII"?> <!-- DTD for user groups in JSEF --> <!ELEMENT usergroups (user)*> <!ATTLIST usergroups lastChanged CDATA #REQUIRED changedBy CDATA #REQUIRED> <!ELEMENT user (addgroups | subgroups)*> <!ATTLIST user userName ID #REQUIRED> <!ELEMENT addgroups (group)+> <!ELEMENT subgroups (group)+> <!ELEMENT group EMPTY> <!ATTLIST group groupName CDATA #REQUIRED> XML by Example (1/2)
<?xml version="1.0"?> <!DOCTYPE usergroups SYSTEM "usergroups.dtd"> <usergroups lastChanged="10/21/1999" changedBy="sysadmin"> <user userName = "Charly_Brown"> <addgroups> <group groupName = "User" /> <group groupName = "InternalUser" /> </addgroups> <subgroups> <group groupName = "Developer" /> </subgroups> </user> <user userName = "Hugo_Boss"> <addgroups> <group groupName = "Admin" /> </addgroups> </user> </usergroups> XML by Example (2/2)
Extensible Stylesheet Language • XSL is a language for expressing stylesheets and consists of • a language for transforming XML documents (XSLT) • an XML vocabulary for specifying formatting semantics (Formatting Objects DTD - FO DTD) • CSS < XSL < DSSSL (SGML's Document Style Semantics and Specification Language) • Style sheets • target specific elements => closely related to DTDs
XSL and XSLT Processing [Figure labels: XSL processor, XSLT stylesheet, XSLT processor, document (source DTD), new document (FO DTD)]
Document Reengineering • analysis • data sources, responsibilities and update dynamics • data model • EER (extended entity-relationship model) • mapping logical -> physical • reverse association: how to link back • forward association: how to link forward
HTML Code Creation • Editors • Tags, Syntax • Validators • Halsoft, htmllint, .... • Converters • which HTML? DTD as parameter • how to map the document structure into HTML? • special symbols? mathematical formulas? • what happens with hyperlinks in original document?
HTML Creation (cont'd) • Development Environments • versioning • staging • TODO database • link consistency • upload to server • integration of functionality • CGIs • backend applications, databases • client-side scripting
Webserver State maintenance • HTTP interactions are "isolated", i.e., HTTP does not include means to hand over state information between interactions -> difficult • Advanced web applications, e.g. shopping basket, require that state can be shared between interactions (between web client and web server) • External apps have their own state space
WWW Gateways [Figure: Web browser -> (HTTP) -> Web server -> (CGI (HTTP)) -> WWW CGI gateway -> (gateway-specific protocol) -> non-WWW application]
Stateful Gateways • A permanently running gateway process keeps up a connection with the external application and serves successive HTTP requests, i.e. the gateway maintains the session state. • Problem: state bookkeeping • client caches • back button • interrupted requests (recover ?) • time-out for follow-up requests (bound resources ?) • Example: DBs (expensive login)
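The bookkeeping a stateful gateway has to do, including the time-out for follow-up requests, can be sketched as a session table held in the memory of the long-running process. This is a minimal Python illustration under assumed names (`SessionTable`, `get`); it ignores concurrency and the release of external resources such as database logins.

```python
import time


class SessionTable:
    """Keeps per-session state inside a long-running gateway process."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout = timeout_seconds
        self.clock = clock  # injectable so the time-out can be tested
        self._sessions = {}  # session id -> (last access time, state dict)

    def get(self, session_id):
        """Return the state for session_id, creating or renewing it."""
        now = self.clock()
        self._expire(now)
        _, state = self._sessions.get(session_id, (now, {}))
        self._sessions[session_id] = (now, state)
        return state

    def _expire(self, now):
        """Drop every session that has been idle longer than the time-out."""
        stale = [
            sid for sid, (seen, _) in self._sessions.items()
            if now - seen > self.timeout
        ]
        for sid in stale:
            del self._sessions[sid]


if __name__ == "__main__":
    table = SessionTable(timeout_seconds=600)
    table.get("hugo")["basket"] = ["4711", "0815"]
    print(table.get("hugo"))
```

A follow-up request that arrives after the time-out finds an empty state, which is exactly the situation the gateway then has to explain to the user.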
Stateless Gateways • Gateway or external application generates state information which is stored at the client and sent with every request. • State can be stored in • URLs • hidden fields • Cookies • Solve state consistency problem ?
http://some.host.org/gateway.pl?user=hugo&items=4711+0815+... <FORM METHOD="POST" ACTION="/gateway.pl"> <INPUT TYPE="HIDDEN" NAME="user" VALUE="hugo"> <INPUT TYPE="HIDDEN" NAME="items" VALUE="4711+0815+..."> ... State in URLs / Hidden Fields • State information can become large • User can change state information (reservation ?) • Sessions may have to be replayed until the state for the next step is reached • Unreadable URLs are no solution • Passwords ?
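Both encodings of client-side state can be sketched in Python with the standard library. The helper names `state_to_url`, `state_to_hidden_fields` and `state_from_url` are assumptions for this example; the state values follow the slide, with the elided part of the item list left out.

```python
from html import escape
from urllib.parse import parse_qs, urlencode, urlsplit


def state_to_url(base: str, state: dict) -> str:
    """Append the state to a URL as a query string."""
    return base + "?" + urlencode(state)


def state_to_hidden_fields(state: dict) -> str:
    """Render the state as hidden form fields, one per name/value pair."""
    return "\n".join(
        f'<INPUT TYPE="HIDDEN" NAME="{escape(name)}" VALUE="{escape(value)}">'
        for name, value in state.items()
    )


def state_from_url(url: str) -> dict:
    """Recover the state from the query string of a request URL."""
    return {k: v[0] for k, v in parse_qs(urlsplit(url).query).items()}


if __name__ == "__main__":
    state = {"user": "hugo", "items": "4711 0815"}
    url = state_to_url("http://some.host.org/gateway.pl", state)
    print(url)
    print(state_to_hidden_fields(state))
    print(state_from_url(url))
```

Nothing here stops the user from editing the query string or the form before sending it back, which is the tampering problem named in the list above.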
Cookies • A cookie is a small data structure holding name/value pairs; it is sent back and forth between web client and web server for certain URLs • Several incompatible "standards" • original standard by Netscape (Set-Cookie) • RFC 2109 (Set-Cookie) • New Internet Draft (Set-Cookie2)
Cookie example • User shops around and gathers 2 items in his shopping basket (server -> client): Set-Cookie: Basket="4711+0815"; Version="1"; Max-Age="3600"; Path="/cgi-bin/order"; Domain=".supershop.com" • User decides to buy the 2 items and selects http://www.supershop.com/cgi-bin/order/buy.pl (client -> server): Cookie: $Version="1"; Basket="4711+0815"; $Path="/cgi-bin/order"; $Domain=".supershop.com"
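The shopping basket exchange can be reproduced with Python's `http.cookies` module. This is a sketch under assumptions: the helper names `build_set_cookie` and `read_basket` are invented for the example, the incoming header is shown in the simple `name=value` form rather than with the RFC 2109 `$` attributes, and the exact formatting of the generated header is left to the library.

```python
from http.cookies import SimpleCookie


def build_set_cookie(items):
    """Server side: put the basket into a Set-Cookie response header."""
    cookie = SimpleCookie()
    cookie["Basket"] = "+".join(items)
    cookie["Basket"]["path"] = "/cgi-bin/order"
    cookie["Basket"]["domain"] = ".supershop.com"
    cookie["Basket"]["max-age"] = 3600  # lifetime in seconds
    return cookie.output()


def read_basket(cookie_header):
    """Server side: extract the basket items from a Cookie request header."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    if "Basket" not in cookie:
        return []
    return cookie["Basket"].value.split("+")


if __name__ == "__main__":
    print(build_set_cookie(["4711", "0815"]))
    print(read_basket("Basket=4711+0815"))
```

The client returns the cookie only for URLs that match the domain and path attributes, which is why the order script under /cgi-bin/order receives it.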
More about Cookies • Cookies can enhance or break privacy • tracking vs. no user database • Cookies are kept in memory • Persistent cookies: cookies[.txt] file • Example entry: .netscape.com TRUE / FALSE 946684799 NETSCAPE_ID 100103 • Fields in order: creator (domain), access for all hosts in domain ?, path, require secure connection ?, expiration, name, value
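A persistent cookie entry like the .netscape.com line above can be split into its seven fields with a few lines of Python. The sketch assumes the fields are separated by tab characters, as in the Netscape cookies.txt format; the names `CookieRecord` and `parse_cookie_line` are invented for the example.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class CookieRecord(NamedTuple):
    domain: str      # creator of the cookie
    all_hosts: bool  # accessible for all hosts in the domain?
    path: str
    secure: bool     # sent over secure connections only?
    expires: int     # seconds since 1970-01-01 00:00:00 UTC
    name: str
    value: str


def parse_cookie_line(line: str) -> CookieRecord:
    """Split one tab-separated cookies.txt entry into its fields."""
    domain, all_hosts, path, secure, expires, name, value = (
        line.rstrip("\n").split("\t")
    )
    return CookieRecord(
        domain, all_hosts == "TRUE", path, secure == "TRUE",
        int(expires), name, value,
    )


if __name__ == "__main__":
    record = parse_cookie_line(
        ".netscape.com\tTRUE\t/\tFALSE\t946684799\tNETSCAPE_ID\t100103"
    )
    print(record)
    print(datetime.fromtimestamp(record.expires, tz=timezone.utc))
```

The expiration value 946684799 corresponds to the last second of 1999 in UTC.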
httpd.conf: AddType text/html .shtml AddHandler server-parsed .shtml Options ... Include ... Server-side Includes • .shtml files are parsed for special commands • executed before the file is sent to the client
<!--#element attribute=value attribute=value ... --> SSI Syntax Main elements Variables (additionally to CGI environment variables) DATE_GMT, DOCUMENT_NAME, DOCUMENT_URI, LAST_MODIFIED Flow control <!--#if expr="test condition" --> <!--#elif expr="test condition" --> <!--#else --> <!--#endif -->
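To make the parsing step concrete, here is a minimal Python sketch of how a server could expand one SSI element, `echo`, using the `<!--#element attribute=value -->` syntax shown above. It is a simplification and not the server's actual implementation: the function name `expand_echo` is an assumption, only `echo var="..."` is handled, and flow control is ignored.

```python
import re

# Matches <!--#echo var="NAME" --> and captures NAME.
SSI_ECHO = re.compile(r'<!--#echo\s+var="([^"]+)"\s*-->')


def expand_echo(text: str, variables: dict) -> str:
    """Replace every echo element with the value of the named variable."""
    def substitute(match):
        # Assumed fallback text for variables that are not set.
        return variables.get(match.group(1), "(none)")
    return SSI_ECHO.sub(substitute, text)


if __name__ == "__main__":
    page = '<P>File: <!--#echo var="DOCUMENT_NAME" --></P>'
    print(expand_echo(page, {"DOCUMENT_NAME": "index.shtml"}))
```

Because every .shtml file has to be scanned like this on each request, parsing adds to the cost of serving the page.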
SSI problems • Performance • parsing • command execution • Security • exec command, etc. • IncludesNoExec • Unreadable -> maintenance ?