This session provides an overview of XML, its use in universal data representation, and examples of XML in action. It also explores the evolution of HTML and the birth of XML as a language for distributed information on the web.
An Introduction to XML: The eXtensible Markup Language Manitoba Library Associations Conference 2000 Ian Graham, Session T4, 11 May, 2000
An Introduction to XML: The eXtensible Markup Language Ian GRAHAM Centre for Academic Technology, Information Commons, University of Toronto Tel: (416) 978-4548 Email: <ian.graham@utoronto.ca> Talk: http://www.utoronto.ca/ian/talks/
Overview • Web history and the birth of HTML • HTML is not enough -- why? • XML for universal data • Examples of XML in action • Profound conclusions ...
The Birth of the Web • The HyperText Markup Language (HTML) • A simple language for distributing text-based information • Combined with other Web technologies to yield…. • A distributed information Web
Four Main Components • URL: For addressing things • HTTP: For transporting data • CGI: For adding functionality • HTML: For encoding text information
[Diagram labels: Web Server; HTML; URLs; HTTP; CGI; Databases & other software; NNTP; Shoutcast; FTP]
HTML • A simple, general-purpose language • Simple hypermedia ( <a href="…"> … ) • Original concept -- • Collaborative authoring • Merging of roles of authoring/viewing
HTML Evolution • Started with very few tags … • Simple requirements (you only need to know a little about the tags, and can then just muddle through) • Language evolved -- more tags: • forms, images, tables, frames, fonts, … • Driven by functional and marketing demands
HTML Problems (1) • Many wanted personalized tags • Want to put other data into HTML • mathematics, database entries, literary text, poems, purchase orders, graphic layouts …. • Different conceptions for the language • HTML just isn’t designed for that!
HTML Problems (2) • Software processing: server management of data (library Web site, any large site) • But HTML is so ill-formed that this is hard! [Diagram: a Web server engine, several HTML chunks, and the resulting HTML]
HTML Problems (3) • Software processing: client data processing (machine-to-machine communication) • But HTML is so ill-formed that this is hard! [Diagram: Web software takes HTML data (from somewhere on the Web ...) into a database, or other tool]
Idea: Back to Basics • HTML is defined using SGML • Standard Generalized Markup Language • A meta-language for defining languages • I.e. -- can define your own tags • Complex, sophisticated, powerful • Idea: Use SGML
Languages based on SGML [Diagram: HTML, TEI, DocBook, ... each defined using SGML]
SGML Problems • Too complicated • Rules too strict • Can’t distribute ‘muddle-able’, loosely formatted text (like HTML) • Not good in a distributed environment • Can’t mix different data together • Can’t add arbitrary tags
Idea (2): “Webified” SGML • New eXtensible Markup Language: XML • Can use XML to define new languages • Distributes easily on the Web • Can mix different types of data together • Can easily add new tags, and tell a browser what to do with them (more or less ...)
Basic XML Rules • Tags written as in HTML, but ... • Technical details • Tag names are case-sensitive • Always need end tags • Special empty-element tags (that don’t have end tags) • Always quote attribute values
Like this example …
    <?xml version="1.0" encoding="iso-8859-1"?>
    <html xmlns="http://www.w3.org/TR/xhtml1">
      <head>
        <title> Title of text XHTML Document </title>
      </head>
      <body>
        <div class="myDiv">
          <h1> Heading of Page </h1>
          …..
          <p>And here is another paragraph, this one containing an <img src="image.gif" alt="waste of time" /> inline image, and a <br /> line break. </p>
        </div>
      </body>
    </html>
[Slide callout: XML stuff]
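The basic XML rules (case-sensitive tag names, mandatory end tags, empty-element tags, quoted attribute values) can be checked mechanically. The following is a minimal sketch, assuming Python and its standard-library xml.etree.ElementTree parser, neither of which appears in the talk; the sample snippets and the helper name is_well_formed are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical snippets, one obeying and one breaking each rule.
samples = {
    "end tag present": "<p>Some text</p>",
    "missing end tag": "<p>Some text",
    "case mismatch": "<P>Some text</p>",
    "empty-element tag": "<p>Line one<br />line two</p>",
    "HTML-style empty tag": "<p>Line one<br>line two</p>",
    "quoted attribute": '<div class="myDiv"></div>',
    "unquoted attribute": "<div class=myDiv></div>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}

for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'rejected'}")
```

An XML parser refuses the loosely written variants outright, which is exactly what makes XML easier for software to process than "muddle-able" HTML.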
Special XML Things • <?xml version="1.0" encoding="iso-8859-1" ?> • Says that this is an XML document • <html xmlns="http://www.w3.org/TR/xhtml1"> • Says that the tags inside (and including) the html element all belong to the same “space” of names • xmlns declares an XML namespace
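To see what the xmlns declaration means to software, here is a small sketch, again assuming Python's standard-library ElementTree parser (an assumption, not something the talk uses). ElementTree reports every element name as '{namespace}local', so the namespace declared on html visibly applies to every tag inside it. The helper split_name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened version of the XHTML example, using the same namespace URI.
doc = """<html xmlns="http://www.w3.org/TR/xhtml1">
  <head><title>Title</title></head>
  <body><p>Some text</p></body>
</html>"""


def split_name(tag):
    """Split ElementTree's '{namespace}local' form into (namespace, local)."""
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return None, tag


root = ET.fromstring(doc)
names = [split_name(element.tag) for element in root.iter()]

for namespace, local in names:
    print(f"{local:6} belongs to {namespace}")
```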
Evolution of XML • Many XML-defined languages, optimised for different roles • MathML -- for mathematics • SMIL -- for synchronised multimedia • RDF -- for describing “things” • XUL -- for describing the Navigator 6 user interface • SpeechML -- for synthesised voices
MathML • Designed to express layout of maths • Can also express semantics • Cut & paste into Maple, Mathematica • Possible support in Navigator 6 • Example: x² + 4x + 4 = 0
    <mrow>
      <mrow>
        <msup> <mi>x</mi> <mn>2</mn> </msup>
        <mo>+</mo>
        <mrow> <mn>4</mn> <mo>&InvisibleTimes;</mo> <mi>x</mi> </mrow>
        <mo>+</mo>
        <mn>4</mn>
      </mrow>
      <mo>=</mo>
      <mn>0</mn>
    </mrow>
SMIL • Synchronised Multimedia Integration Language • Integration of multimedia with text, audio, video • Support in RealPlayer G2
SMIL Example
    <smil>
      <head>
        <meta name="title" content="Online Teaching Services promo" />
        <meta name="author" content="Jay Moonah, CAT" />
        <layout type="text/smil-basic-layout">
          <root-layout width="280" height="316" background-color="white"/>
          <region id="AnimChannel1" title="AnimChannel1" left="0" top="0" height="265" width="280" fit="hidden"/>
        </layout>
      </head>
      <body>
        <par title="Online Teaching Services promo" author="Jay Moonah, CAT" >
          <audio src="final.rm" id="Soundtrack" title="Soundtrack"/>
          <animation src="otscompfin.swf" id="Animation" region="AnimChannel1" title="Animation" fill="freeze"/>
          <text src="cc.rt" id="caption" region="cc" title="cc" fill="freeze"/>
        </par>
      </body>
    </smil>
XHTML: NextGen HTML
    <?xml version="1.0" encoding="iso-8859-1"?>
    <html xmlns="http://www.w3.org/TR/xhtml1">
      <head>
        <title> Title of text XHTML Document </title>
      </head>
      <body>
        <div class="myDiv">
          <h1> Heading of Page </h1>
          <p> here is a paragraph of text. I will include inside this paragraph a bunch of wonky text so that it looks fancy. </p>
          <p>Here is another paragraph with <em>inline emphasized</em> text, and <b> absolutely no</b> sense of humor. </p>
          <p>And another paragraph, this one with an <img src="image.gif" alt="waste of time" /> image, and a <br /> line break. </p>
        </div>
      </body>
    </html>
XHTML • Just like HTML, but based on XML rules • Will support integration of different data into a single document • (Doesn’t quite work that way now, unfortunately)
XHTML and other Data
    <?xml version="1.0" encoding="iso-8859-1"?>
    <html xmlns="http://www.w3.org/TR/xhtml1">
      <head>
        <title> Title of XHTML Document </title>
      </head>
      <body>
        <div class="myDiv">
          <h1> Heading of Page </h1>
          <mathml xmlns="http://www.w3.org/TR/mathml">
            … MathML markup …
          </mathml>
          <p> more html stuff goes here </p>
          <smil xmlns="http://www.w3.org/TR/smil1">
            … SMIL markup …
          </smil>
        </div>
      </body>
    </html>
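A program reading such a mixed document can tell the different kinds of data apart by namespace. The sketch below assumes Python's standard-library ElementTree and a cut-down copy of the document above; the mi and par children stand in for the elided MathML and SMIL markup and are placeholders only, and namespace_of is a hypothetical helper.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Cut-down mixed document; <mi> and <par> are placeholder content.
mixed = """<html xmlns="http://www.w3.org/TR/xhtml1">
  <body>
    <h1>Heading of Page</h1>
    <mathml xmlns="http://www.w3.org/TR/mathml"><mi>x</mi></mathml>
    <p>more html stuff goes here</p>
    <smil xmlns="http://www.w3.org/TR/smil1"><par /></smil>
  </body>
</html>"""


def namespace_of(element):
    """Return the namespace URI of an element, or '' if it has none."""
    tag = element.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


# Count how many elements belong to each namespace.
counts = Counter(namespace_of(el) for el in ET.fromstring(mixed).iter())

for namespace, count in counts.items():
    print(f"{count} element(s) in {namespace}")
```

A browser or other tool could use the same information to hand each island of markup to the component that understands it.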
Displaying XML • More complicated than HTML • XML represents data only, not how it looks • Need extra instructions (a “style sheet” document) to define how things should look
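The separation between data and appearance can be sketched in a few lines. This is not CSS or XSL; it is a deliberately tiny, hypothetical "style sheet" (a Python dictionary mapping data element names to HTML tags) applied with the standard-library ElementTree parser, purely to show that the same data can be given different looks. All names here (data, style_sheet, render) are assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical data document: it says what the data is, not how it looks.
data = """<order>
  <desc>Gold grommets</desc>
  <quantity>12</quantity>
</order>"""

# Hypothetical "style sheet": maps each data element name to an HTML tag.
style_sheet = {"order": "div", "desc": "h2", "quantity": "p"}


def render(element, rules):
    """Render `element` and its children as HTML using `rules`.

    Text that follows a child element (its tail) is ignored in this sketch.
    """
    html_tag = rules.get(element.tag, "span")
    text = escape((element.text or "").strip())
    children = "".join(render(child, rules) for child in element)
    return f"<{html_tag}>{text}{children}</{html_tag}>"


root = ET.fromstring(data)
page = render(root, style_sheet)
print(page)
```

Swapping in a different dictionary, for example one that maps order to ul and the other elements to li, changes the presentation without touching the data.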
What Do Browsers Do Now? • Navigator 4, Internet Explorer 4 • Uggh…… (can’t handle XML at all) • Internet Explorer 5 -- shows a tree of elements • Mozilla/Netscape 6 -- ignores the tags ... or so it seems (see examples)
Other Use: Data Abstraction • XML as a universal format for data interchange • Machines exchange data as XML-format messages • Eliminates proprietary data formats • Lots of XML processing software available
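As a rough illustration of machines exchanging XML-format messages, the sketch below has one function play the sender and another the receiver, using Python's standard-library ElementTree. The element names order, part and quantity and the function names are assumptions chosen for the example, not a real interchange standard.

```python
import xml.etree.ElementTree as ET


def build_order(ref, part_number, quantity):
    """Sender side: serialise an order as an XML message (a string)."""
    order = ET.Element("order", {"ref": ref})
    ET.SubElement(order, "part", {"number": part_number})
    ET.SubElement(order, "quantity").text = str(quantity)
    return ET.tostring(order, encoding="unicode")


def read_order(message):
    """Receiver side: parse the XML message back into plain data."""
    order = ET.fromstring(message)
    return {
        "ref": order.get("ref"),
        "part_number": order.find("part").get("number"),
        "quantity": int(order.find("quantity").text),
    }


message = build_order("x23-2112-2342", "23-23221-a12", 12)
received = read_order(message)

print(message)
print(received)
```

Neither side needs to know how the other stores its data internally; they only need to agree on the message format.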
XML Messaging: Business [Diagram: a Factory places orders with several Suppliers and receives their responses]
XML Messaging: Database [Diagram: a Database and several other DBs request/send data]
Example Message
    <partorders xmlns="http://myco.org/Spec/partorders.desc">
      <order ref="x23-2112-2342" date="25aug1999-12:34:23h">
        <desc> Gold sprockel grommets, with matching hamster</desc>
        <part number="23-23221-a12" />
        <quantity units="gross"> 12 </quantity>
        <delivery-date date="27aug1999-12:00h" />
      </order>
      <order ref="x23-2112-2342" date="25aug1999-12:34:23h">
        …. Order something else …..
      </order>
    </partorders>
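A receiving program might pull the fields out of such a message as follows. This is a sketch only, assuming Python's standard-library ElementTree and a well-formed copy of the first order (straight quotes, and delivery-date written as an empty-element tag); the po prefix and the dictionary keys are arbitrary choices for the example.

```python
import xml.etree.ElementTree as ET

# Well-formed copy of the first order from the example message.
message = """<partorders xmlns="http://myco.org/Spec/partorders.desc">
  <order ref="x23-2112-2342" date="25aug1999-12:34:23h">
    <desc> Gold sprockel grommets, with matching hamster</desc>
    <part number="23-23221-a12" />
    <quantity units="gross"> 12 </quantity>
    <delivery-date date="27aug1999-12:00h" />
  </order>
</partorders>"""

# Map an arbitrary prefix to the message's namespace for lookups.
NS = {"po": "http://myco.org/Spec/partorders.desc"}

orders = []
for order in ET.fromstring(message).findall("po:order", NS):
    quantity = order.find("po:quantity", NS)
    orders.append({
        "ref": order.get("ref"),
        "description": order.findtext("po:desc", namespaces=NS).strip(),
        "part": order.find("po:part", NS).get("number"),
        "quantity": int(quantity.text),
        "units": quantity.get("units"),
        "delivery": order.find("po:delivery-date", NS).get("date"),
    })

for entry in orders:
    print(entry)
```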
Other Examples • XUL: XML User Interface Language • How Navigator 6 configures its interface • Defines structure and software integration (www.mozilla.org) • RDF: Resource Description Framework • For describing things • Used by the Netscape Open Directory project to define Web-accessible resources (www.dmoz.org)
The XML Family Tree [Diagram: SGML at the root; HTML, TEI, ... and XML defined from SGML; SMIL, SpeechML, XUL, XHTML, MathML, RDF, ... defined from XML]
XML Summary • an integration tool for mixing different types of data • a universal format for exchanging data between machines • a framework for distributing information on the Web