Learn about XML, its importance in web data, evolution from HTML, issues with HTML, transition to XML, basic XML rules, and evolution of XML into specialized languages like MathML and SMIL.
An XML Introduction Next Generation Web Data Ian GRAHAM Centre for Academic Technology Tel: 978-4548 Email: <ian.graham@utoronto.ca> Talk: http://www.utoronto.ca/ian/talks/
Overview • An XML example • -- so what’s so special about XML? • The birth of the Web -- HTML • HTML is not enough -- why? • XML for universal data • Common uses and applications
XML Example: test.xml <?xml version="1.0" encoding="iso-8859-1" ?> <html xmlns="http://www.w3.org/TR/xhtml1" > <head> <title> Title of text XHTML Document </title> </head> <body> <div class="myDiv"> <h1> Heading of Page </h1> <p> here is a paragraph of text. I will include inside this paragraph a bunch of wonky text so that it looks fancy. </p> <p>Here is another paragraph with <em>inline emphasized</em> text, and <b> absolutely no</b> sense of humor. </p> <p>And here is another paragraph, this one containing an <img src="image.gif" alt="waste of time" /> inline image, and a <br /> line break. </p> </div> </body></html>
It Looks Like HTML …. • Sort of …. • Tags look just like HTML tags (although XML lets you ‘create’ your own) • The “red bits” are special XML stuff (we will discuss them later) • It’s got .xml at the end
The Birth of the Web • The HyperText Markup Language • A simple language for distributing text • All that other stuff • URLs, HTTP, CGI ...
HTML Evolution • Started with very few tags … • Language evolved, as more tags were added • forms • tables • fonts • frames
HTML Problems • Desire for personalized tags • Want to put data into HTML form • mathematics, database entries, literary text, poems, purchase orders …. • HTML just isn’t designed for that!
HTML Problems (2) • Software processing • Server management of data • But -- HTML is so ill-formed, this is hard!
Idea: Back to the Basics • HTML was defined using SGML • Standard Generalized Markup Language • A meta-language for defining languages. • Complex, sophisticated, powerful • Idea: Use SGML
Languages based on SGML • HTML • TEI • DocBook • . . .
Problems with SGML • Too complicated a language • Rules are too strict • Can’t distribute ‘loosely’ formatted text (like HTML) • Not good in a distributed environment • Can’t mix different data together • Can’t add arbitrary tags
Idea (2): “Webified” SGML • New eXtensible Markup Language: XML • Can use XML to define new languages • Distributes easily on the Web • Can mix different types of data together • Can easily add new tags, and tell a browser what to do with them
Basic XML Rules • Tags like in HTML, but ... • Technical details • Tag names are case-sensitive • Always need end tags • Special empty-element tags • Always quote attribute values
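The rules above can be tried out with any XML parser. The following is a minimal sketch, assuming Python and its standard `xml.etree.ElementTree` module (neither is part of the original talk); `is_well_formed` and the sample strings are hypothetical names made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if `text` parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# One sample that follows the rules, and one violation per rule.
samples = {
    "follows the rules": '<p class="x">text <br /></p>',
    "tag case mismatch": '<P>text</p>',
    "missing end tag": '<p>text <br></p>',
    "unquoted attribute": '<p class=x>text</p>',
}

for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")
```

An HTML browser would accept all four samples; an XML parser rejects every one that breaks a rule, which is what makes software processing of XML predictable.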
Like this example ….. <?xml version="1.0" encoding="iso-8859-1"?> <html xmlns="http://www.w3.org/TR/xhtml1" > <head> <title> Title of text XHTML Document </title> </head><body> <div class="myDiv"> <h1> Heading of Page </h1> ….. <p>And here is another paragraph, this one containing an <img src="image.gif" alt="waste of time" /> inline image, and a <br /> line break. </p> </div> </body></html>
XML Things • <?xml version="1.0" encoding="iso-8859-1" ?> • Says that this is an XML document • <html xmlns="http://www.w3.org/TR/xhtml1"> • Says that the meaning of the tags inside (and including) the html “element” is defined here.
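To see what the `xmlns` attribute does in practice, here is a small sketch, again assuming Python's standard `xml.etree.ElementTree` (an assumption, not something the talk uses). The shortened document and the `split_tag` helper are hypothetical. ElementTree reports each element name as `{namespace}local`, so the namespace declared on `html` shows up on every element inside it.

```python
import xml.etree.ElementTree as ET

# A shortened version of the example document (XML declaration omitted).
doc = (
    '<html xmlns="http://www.w3.org/TR/xhtml1">'
    '<head><title>Title</title></head>'
    '<body><p>text</p></body>'
    '</html>'
)


def split_tag(tag):
    """Split ElementTree's '{namespace}local' form into (namespace, local)."""
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return "", tag


root = ET.fromstring(doc)
for element in root.iter():
    namespace, local = split_tag(element.tag)
    print(f"{local}: {namespace}")
```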
Evolution of XML • Many XML languages, optimised for different roles • MathML -- for mathematics • SMIL -- for synchronised multimedia • RDF -- for describing “things” • XUL -- for describing the Navigator 5 user interface
MathML • Designed to express semantics of maths • Can also express layout • Cut & paste into Maple, Mathematica • Example: x² + 4x + 4 = 0 <mrow> <mrow> <msup> <mi>x</mi> <mn>2</mn> </msup> <mo>+</mo> <mrow> <mn>4</mn> <mo>&InvisibleTimes;</mo> <mi>x</mi> </mrow> <mo>+</mo> <mn>4</mn> </mrow> <mo>=</mo> <mn>0</mn> </mrow>
SMIL • Synchronised Multimedia Integration Language • Integration of multimedia with text, audio, video • Support in RealPlayer G2
SMIL Example <smil> <head> <meta name="title" content="Online Teaching Services promo" /> <meta name="author" content="Jay Moonah, CAT" /> <layout type="text/smil-basic-layout"> <root-layout width="280" height="316" background-color="white"/> <region id="AnimChannel1" title="AnimChannel1" left="0" top="0" height="265" width="280" fit="hidden"/> </layout> </head> <body> <par title="Online Teaching Services promo" author="Jay Moonah, CAT" > <audio src="final.rm" id="Soundtrack" title="Soundtrack"/> <animation src="otscompfin.swf" id="Animation" region="AnimChannel1" title="Animation" fill="freeze"/> <text src="cc.rt" id="caption" region="cc" title="cc" fill="freeze"/> </par> </body></smil>
XHTML: NextGen HTML <?xml version="1.0" encoding="iso-8859-1"?> <html xmlns="http://www.w3.org/TR/xhtml1" > <head> <title> Title of text XHTML Document </title> </head> <body> <div class="myDiv"> <h1> Heading of Page </h1> <p> here is a paragraph of text. I will include inside this paragraph a bunch of wonky text so that it looks fancy. </p> <p>Here is another paragraph with <em>inline emphasized</em> text, and <b> absolutely no</b> sense of humor. </p> <p>And another paragraph, this one with an <img src="image.gif" alt="waste of time" /> image, and a <br /> line break. </p> </div> </body></html>
XHTML • Just like HTML, but based on XML rules • Will support integration of different data into a single document
XHTML and other Data <?xml version="1.0" encoding="iso-8859-1"?> <html xmlns="http://www.w3.org/TR/xhtml1" > <head> <title> Title of XHTML Document </title> </head><body> <div class="myDiv"> <h1> Heading of Page </h1> <mathml xmlns="http://www.w3.org/TR/mathml"> … MathML markup … </mathml> <p> more html stuff goes here </p> <smil xmlns="http://www.w3.org/TR/smil1"> … SMIL markup … </smil> </div> </body></html>
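One way software might take such a mixed document apart is to sort elements by namespace and hand each group to the right processor. This is only a sketch under assumptions: it uses Python's standard `xml.etree.ElementTree`, the helper names are hypothetical, and the elided MathML and SMIL markup is left as placeholder text exactly as on the slide.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# The mixed document from the slide, shortened, with straight quotes.
doc = '''<html xmlns="http://www.w3.org/TR/xhtml1">
  <body>
    <div class="myDiv">
      <h1>Heading of Page</h1>
      <mathml xmlns="http://www.w3.org/TR/mathml"> … MathML markup … </mathml>
      <p>more html stuff goes here</p>
      <smil xmlns="http://www.w3.org/TR/smil1"> … SMIL markup … </smil>
    </div>
  </body>
</html>'''


def namespace_of(element):
    """Return the namespace URI from ElementTree's '{uri}local' tag form."""
    tag = element.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def local_name(element):
    """Return the element name without its namespace part."""
    return element.tag.split("}", 1)[-1]


def elements_by_namespace(root):
    """Group local element names by namespace, in document order."""
    groups = defaultdict(list)
    for element in root.iter():
        groups[namespace_of(element)].append(local_name(element))
    return dict(groups)


for namespace, names in elements_by_namespace(ET.fromstring(doc)).items():
    print(namespace, names)
```

Each namespace URI acts as a label saying which language an element belongs to, so the three kinds of data can share one document without their tag names clashing.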
Displaying XML • More complicated than HTML • XML represents data only, not how it looks • Need extra instructions (a “style sheet” document) to define how things should look
What Do Browsers Do Now? • Netscape 5 -- ignores the tags ... or so it seems ... • Internet Explorer 5 -- shows a tree of elements • Navigator 4, Internet Explorer 4 • Uggh…… (can’t handle it)
Other Use: Data Abstraction • XML as a universal format for data interchange • Machines exchange data as XML-format messages • Eliminates proprietary data formats • Lots of XML processing software available
XML Messaging (diagram: a Factory and several Suppliers, with arrows labelled “Place order” and “Response”)
XML Messaging (diagram: a Database and several Other DBs, with arrows labelled “Request/send data”)
Example Message <partorders xmlns="http://myco.org/Spec/partorders.desc"> <order ref="x23-2112-2342" date="25aug1999-12:34:23h"> <desc> Gold sprockel grommets, with matching hamster</desc> <part number="23-23221-a12" /> <quantity units="gross"> 12 </quantity> <delivery-date date="27aug1999-12:00h" /> </order> <order ref="x23-2112-2342" date="25aug1999-12:34:23h"> …. Order something else ….. </order> </partorders>
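A receiving machine could read such a message along these lines. This is a minimal sketch, assuming Python's standard `xml.etree.ElementTree`; the `read_orders` function, the `po` prefix and the dictionary keys are hypothetical. The message is the first order from the slide, written with straight quotes and with `delivery-date` as an empty-element tag so that it is well-formed.

```python
import xml.etree.ElementTree as ET

# Map a local prefix (our choice) to the namespace used by the message.
NS = {"po": "http://myco.org/Spec/partorders.desc"}

message = '''<partorders xmlns="http://myco.org/Spec/partorders.desc">
  <order ref="x23-2112-2342" date="25aug1999-12:34:23h">
    <desc> Gold sprockel grommets, with matching hamster</desc>
    <part number="23-23221-a12" />
    <quantity units="gross"> 12 </quantity>
    <delivery-date date="27aug1999-12:00h" />
  </order>
</partorders>'''


def read_orders(text):
    """Return one dictionary per <order> element in the message."""
    root = ET.fromstring(text)
    orders = []
    for order in root.findall("po:order", NS):
        quantity = order.find("po:quantity", NS)
        orders.append({
            "ref": order.get("ref"),
            "part": order.find("po:part", NS).get("number"),
            "quantity": int(quantity.text),  # int() ignores surrounding spaces
            "units": quantity.get("units"),
            "delivery": order.find("po:delivery-date", NS).get("date"),
        })
    return orders


for order in read_orders(message):
    print(order)
```

Because the message is plain XML, the supplier does not need the factory's software to read it; any XML parser will do.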
The XML Family Tree • SGML • HTML • TEI • . . . • XML • SMIL • SpeechML • XUL • XHTML • MathML • RDF • . . .
Other Examples • XUL: XML User Interface Language • How Navigator 5 configures its interface • RDF: Resource Description Framework • For describing things • Used by Netscape Open Catalog project to define Web accessible resources
Summary • XML is: • a framework for distributing data on the Web • an integration tool for mixing different types of data • a universal format for exchanging data between machines