< XML > and the Future of Internet-based Computing. 11 March 2002 Ian GRAHAM Emerging Business Strategy, Bank of Montreal E: <ian.graham@bmo.com> or <ian.graham@utoronto.ca> T: (416) 513.5656 / F: (416) 513.5590 Web: http://www.utoronto.ca/ian/talks/. ian.graham@bmo.com / 416.513.5656.
 
                
Overview • A history lesson • The Web and the birth of XML • when, why, and who • What does XML give us? • Examples, illustrations, and applications • The future
In The Beginning ... • ... was the birth of the Web (Tim Berners-Lee, 1989-1991) [Diagram: a Web server, fronting databases and other software, sits beside FTP, News, and Email on top of the Internet communication protocols. Three pieces make it work: URLs for location (e.g. http://www.foo.org/boo.html), HTTP for transfer, and HTML for data/display. Sample page text: "Hello There. Here's a zippy HTML page, with lots of Colors and Links ...!!! Fun, Eh?"]
Three Core Concepts • HTTP -- HyperText Transfer Protocol • A protocol for transferring data between machines on the Internet • URL -- Uniform Resource Locator • A scheme for referencing, using a simple text string, the specific location of a resource (Web page, audio file, program) somewhere on the Internet (e.g. http://www.utoronto.ca/ian/talks/ ) • HTML -- HyperText Markup Language • A markup language for encoding information to be read / viewed by people • HTTP and URLs have stood the test of time pretty well. But by 1996, HTML was already showing signs of age ...
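As a rough illustration of how a URL and HTTP fit together, the sketch below splits the example URL from this slide into its parts and builds the opening lines of the request a browser might send. It uses only Python's standard library; the request text is a simplified assumption (a real client sends more headers), and nothing is actually sent over the network.

```python
from urllib.parse import urlsplit

# The example URL quoted on the slide.
url = "http://www.utoronto.ca/ian/talks/"
parts = urlsplit(url)

print(parts.scheme)  # which protocol to speak
print(parts.netloc)  # which machine to contact
print(parts.path)    # which resource to ask that machine for

# Simplified sketch of the HTTP/1.1 request a client would send for this URL.
# Real browsers add many more headers; this is only the minimum.
request = f"GET {parts.path} HTTP/1.1\r\nHost: {parts.netloc}\r\n\r\n"
print(request)
```

The server's reply would carry the HTML document, which is the third of the three concepts.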
Simple HTML Example: HTML (not XML) Markup [the original slide shows the browser rendering alongside the markup]
    <HTML>
    <HEAD>
      <TITLE>The XML Specification Guide -- Website Home Page</TITLE>
      <LINK REL="stylesheet" HREF="style.css">
    </HEAD>
    <BODY BGCOLOR="#FFFFFF" TEXT="black" LINK="#0066CB" ALINK="#00A000" VLINK="#808080">
    <TABLE WIDTH="100%" CELLPADDING="0" CELLSPACING="0" BORDER="0">
      <TR>
        <TD VALIGN="top" ALIGN="left"><FONT CLASS="toolbar" FACE="arial,helvetica" SIZE="-1">The XML Specification Guide</FONT></TD>
        ... more tags and text ...
The Problems with HTML • HTML was designed to serve one role: simple hypertext documents, with simple user interaction (forms, etc.). But people soon wanted to display other types of data: • mathematical expressions, literary text • graphics, multimedia, interactive content ... • commercial forms, purchase orders, generic data • ... and “connect” these parts together (so they can interact) • ... and dynamically mix/edit chunks of data together • ... and build dynamic networks that exchange information • ... and make sure this works reliably, anywhere.
HTML Scope was Too Limited • Single model for data (hypertext) • Syntax too lenient ... It’s easy to create HTML that can be mis-processed by other systems • Result: • can’t create arbitrary custom data that can be universally understood [Diagram: Web Evolution, from HTML toward interchange of data between machines, modeling of different types of data, and presentation of different types of data]
The Birth of XML ... • ... happened in 1996, when a group of experts assembled to try to find a way out of the problem. • First draft came out in late 1996 ... Final version of the XML 1.0 specification came out in February 1998 • Large Canadian contribution -- 3 out of 18 WG members, plus 1 of the 3 editors [Tim Bray] • Followed in 1999 by a second ‘core’ XML specification (also with Tim Bray as co-editor) Core Principles • Simple • But not as simple as HTML, in particular with stricter formal syntax • Extensible • So you can create your own tags, or elements • Distributed environment-friendly • like HTML, but better
An XML Example
    <?xml version="1.0" ?>
    <partorders xmlns="http://myco.org/Spec/partorders">
      <order ref="x23-2112-2342" date="25aug1999-12:34:23h">
        <desc> Gold sprockel grommets, with matching hamster </desc>
        <part number="23-23221-a12" />
        <quantity units="gross"> 12 </quantity>
        <deliveryDate date="27aug1999-12:00h" />
      </order>
      <order ref="x23-2112-2342" date="25aug1999-12:34:23h">
        . . . Order something else . . .
      </order>
    </partorders>
What is XML? • Specification of a syntax for “encoding” text-based data (words, phrases, numbers, ...), with strict syntax rules about how to do so. • A text-based syntax -- written using printable characters (no explicit binary data) • Extensible -- you can define your own tags (essentially data types), within the constraints of the syntax rules • Universal -- the syntax rules ensure that all XML processing software MUST identically handle a given piece of XML. If you can read and process it, so can anybody else
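To make the "strict syntax rules" point concrete, here is a minimal sketch using the XML parser in Python's standard library. The helper name `is_well_formed` and the sample strings are invented for illustration; the point is only that a conforming parser rejects input that breaks the rules, instead of guessing at what was meant the way an HTML browser would.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Illustrative samples (not from the original slides).
good = '<order ref="x23"><quantity units="gross">12</quantity></order>'
unclosed = '<order ref="x23"><quantity units="gross">12</order>'  # missing </quantity>
unquoted = '<order ref=x23><quantity>12</quantity></order>'       # attribute value not quoted

for sample in (good, unclosed, unquoted):
    print(is_well_formed(sample), sample)
```

Both broken samples would usually be tolerated in HTML; in XML they are errors, which is what lets every processor agree on the result.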
Example Revisited
    <partorders xmlns="http://myco.org/Spec/partorders">
      <order ref="x23-2112-2342" date="25aug1999-12:34:23h">
        <desc>Gold sprockel grommets, with matching hamster </desc>
        <part number="23-23221-a12" />
        <quantity units="gross">12</quantity>
        <deliveryDate date="27aug1999-12:00h" />
      </order>
      <order ref="x23-2112-2342" date="25aug1999-12:34:23h">
        . . . Order something else . . .
      </order>
    </partorders>
[Callouts on the slide: "element tags" (pointing at <quantity> and </quantity>) and "attribute of this quantity element" (pointing at units="gross")]
Hierarchical, structured information
Processing XML -- creating data structures
    <partorders xmlns="...">
      <order date="..." ref="...">
        <desc> ..text.. </desc>
        <part />
        <quantity />
        <delivery-date />
      </order>
      <order ref=".." .../>
    </partorders>
[Diagram: the markup above becomes a tree]
    partorders (xmlns=)
      order (ref=, date=)
        desc -> text
        part
        quantity
        delivery-date
      order (ref=, date=)
XML syntax rules guarantee the same result, always
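The step from markup to data structure can be sketched with Python's standard-library parser. The document string reuses the part-order example from the earlier slides; the `outline` helper is a made-up name for this illustration and simply walks the parsed tree to print one line per element.

```python
import xml.etree.ElementTree as ET

# The part-order example from the earlier slides, with one order.
doc = """<partorders xmlns="http://myco.org/Spec/partorders">
  <order ref="x23-2112-2342" date="25aug1999-12:34:23h">
    <desc>Gold sprockel grommets, with matching hamster</desc>
    <part number="23-23221-a12" />
    <quantity units="gross">12</quantity>
    <deliveryDate date="27aug1999-12:00h" />
  </order>
</partorders>"""

# ElementTree stores each name as "{namespace-uri}localname".
NS = "{http://myco.org/Spec/partorders}"

def outline(element, depth=0):
    """Return an indented outline of the element tree, one line per element."""
    name = element.tag.replace(NS, "")
    attrs = " ".join(f'{key}="{value}"' for key, value in element.attrib.items())
    lines = ["  " * depth + (f"{name} [{attrs}]" if attrs else name)]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

root = ET.fromstring(doc)
tree_lines = outline(root)
print("\n".join(tree_lines))

# Once parsed, values are reached by walking the structure, not by string matching.
quantity = root.find(f"{NS}order/{NS}quantity")
print(quantity.text, quantity.get("units"))
```

Any other conforming parser, in any language, should build an equivalent tree from the same text.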
XML: Why it's this way • Simple (like HTML) • But not quite so simple • Stricter syntax rules, to eliminate processing errors • syntax defines structure (hierarchically), and names structural parts (element names) -- it is self-describing data • Extensible (unlike HTML, vocabulary is not fixed) • Can create your own language of tags/elements, with rules • Strict syntax ensures that custom tags can be reliably processed • Designed for a distributed environment (like HTML) • Can have data all over the place: can retrieve and use it reliably • Can mix different data types together (unlike HTML) • Can mix one set of tags with another set: resulting data can still be reliably processed
Mixing dialects together: namespaces
    <?xml version="1.0" encoding="iso-8859-1"?>
    <html xmlns="http://www.w3.org/1999/xhtml"
          xmlns:mt="http://www.w3.org/1998/Math/MathML">
      <head>
        <title> Title of XHTML Document </title>
      </head>
      <body>
        <div class="myDiv">
          <h1> Heading of Page </h1>
          <mt:math>
            <mt:msup> ... MathML markup ... </mt:msup>
          </mt:math>
          <p> more html stuff goes here </p>
        </div>
      </body>
    </html>
[Callouts on the slide: the default 'type' is XHTML; the mt: prefix indicates 'type' MathML (a different language)]
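A small sketch of what namespaces buy a processing program: after parsing, every element carries the identifier of the dialect it belongs to, so software can route each part to the right handler. The document below is a cut-down variant of the slide's example, using the namespace URIs the W3C assigns to XHTML and MathML; the MathML content (x squared) and the helper name `split_tag` are invented for illustration.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
MATHML = "http://www.w3.org/1998/Math/MathML"

# Cut-down mixed document: XHTML by default, MathML under the mt: prefix.
doc = f"""<html xmlns="{XHTML}" xmlns:mt="{MATHML}">
  <body>
    <h1>Heading of Page</h1>
    <mt:math><mt:msup><mt:mi>x</mt:mi><mt:mn>2</mt:mn></mt:msup></mt:math>
    <p>more html stuff goes here</p>
  </body>
</html>"""

def split_tag(tag):
    """Split ElementTree's "{namespace-uri}localname" form into its two parts."""
    uri, _, local = tag[1:].partition("}")
    return uri, local

root = ET.fromstring(doc)

# Group element names by the dialect (namespace) they belong to, in document order.
by_dialect = {}
for element in root.iter():
    uri, local = split_tag(element.tag)
    by_dialect.setdefault(uri, []).append(local)

for uri, names in by_dialect.items():
    print(uri, names)
```

Note that the prefix itself (mt:) disappears after parsing; only the namespace URI matters, so another author could pick a different prefix and the result would be the same.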
XML Specification(s) Chart [Chart: W3C recommendations so far -- XML 1.0, XML names]
Classes of XML Dialects • XML gives us a tool for expressing data in a universally shareable way. • Many XML 'dialects,' optimised for different roles. • Can roughly break these down into five categories • presentation & data: stuff people read, look at, or exchange • metadata: for describing things; for use by other software • distributed apps: data delivery; distributed applications, Web services • XML utilities: XSLT, Schemas, … • software utilities: variety of things … • We’ll now look at some examples from the first three categories.
Classes of XML Dialects • 1) Presentational Languages (for people/applications) • SMIL -- for multimedia (RealPlayer, multimedia players) • WML -- wireless (WAP phones) • XUL -- user interface (Netscape 6) • VoiceXML -- voice interfaces (telephone-based ...) • XHTML -- XMLized version of HTML • … • Some languages with specific academic relevance: • TEI -- Text encoding http://www.tei-c.org/ • MathML -- for mathematics http://www.w3.org/Math • XHTML -- new HTML http://www.w3.org/MarkUp • SVG -- for graphics http://www.w3.org/Graphics/SVG • HEML -- historical events http://www.heml.org
TEI -- Text Encoding Initiative • ... represent all kinds of literary and linguistic texts for online research and teaching, using an encoding scheme that is maximally expressive and minimally obsolescent. † • Recently migrated to be compatible with XML (TEI-Lite) • Namespaces let you re-use XHTML ‘links’ • XML also has its own more expressive linking/pointing mechanisms • Some online examples via .... [ www.utoronto.ca/ian/talks/11mar02/examples.html ] • Gain: universally accessible literary/academic texts, with networked capabilities †From: TEI home page, http://www.tei-c.org, 16 Jan 2002
MathML, SVG: for Mathematics and Graphics • XML dialects that model essential “types” of data for presentations and display. • The “namespace” mechanism lets you mix these different types of information together, and with other dialects (like XHTML) • Some online examples ... [ www.utoronto.ca/ian/talks/11mar02/examples.html ] • Advantages: Can communicate both structural and semantic information (how it looks and what it means) • Interactive mathematical example documents • Interfaces with tools like Mathematica, Maple • Non-proprietary languages, interfaces
HEML: Historical Event Markup and Linking • ... elements that are flexible enough to represent most known events in the past while working well with existing document encoding schemes, such as XHTML, TEI-Lite and Docbook. † • Online examples at ...[ www.utoronto.ca/ian/talks/11mar02/examples.html ] • A “web” of historical events, cross-linking documents with resources, timelines, etc. †From: HEML home page, http://www.heml.org, 16 Jan 2002
And others • CML - Chemical Markup Language • CellML - biological models • BSML - bioinformatic sequences • MAGE-ML - Microarray Gene Expression • XSTAR - for archaeological research • XMLMARC - MARC in XML • AML - astronomy markup language • ... many (dozens and dozens) more ... • There has been an explosion of activity towards developing “universal” XML formats for encoding, exchanging and linking information. • “Evolutionary” forces still at play (many languages are born, but only a few will survive) • Prediction -- this will lead to a big change in how academic information is created, shared, and stored.
Informational Data: Metadata and Packages • Can use XML to encode information about data • Indexes, catalog records, etc. • data about non-text resources (images, people, whatever) • Can also use XML to package up information (data + catalog) • Example: IMS Content packaging • A standard for “packaging” Web content relevant to Web based instructional applications • Will allow for interoperable content -- so it can be moved between different IMS-compliant learning systems. • A growing number of learning systems, including WebCT, support this standard • One of the core components for creating learning objects
Distributed Data • The networking of the data is becoming more important than the data itself • XML is becoming the tool for creating such networks, and for transporting data from place to place in that network. • The preceding example languages can sometimes do this sort of thing, but there are also specific XML languages aimed at this role. • These ideas -- and some of the existing tools -- can be used in Portal / Website development, creation of distributed databases, etc.
Distributed data application: Open Directory • RDF -- Resource Description Framework • A language for encoding metadata about resources • Used by the Open Directory Project to create an open, shareable directory of Web resources • Can search the directory site (like Yahoo), or download the entire directory and integrate it into your own. • Current directory has: • 46,000 human editors • 45,000 categories • millions of ‘resources’ catalogued • re-used by ~290 sites around the world • Online examples from ... [ www.utoronto.ca/ian/talks/11mar02/examples.html ]
Open Directory Model [Diagram: dmoz.org publishes RDF data feeds as <XML>; consumers such as infospace, Ask Jeeves, Google, and the UK Labour party download the XML data from a well-known location]
Distributed data application: RSS • RSS -- Rich/Resource/RDF Site Summaries • A language for encoding summary data about Web pages/sites, and related metadata (update interval, etc.) • Designed for syndicated distribution of information about pages • Rather like headlines for newspapers • There are currently 850+ syndicators of such data, and several thousand RSS ‘feeds’ • News agencies • Web sites with updated content • individuals with ‘blogs’ • Online examples from ...[ www.utoronto.ca/ian/talks/11mar02/examples.html ]
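What an RSS consumer does with a feed can be sketched in a few lines of standard-library Python. The feed below is entirely made up (the site name, headlines, and example.org URLs are placeholders) and follows the simple rss/channel/item layout of the early RSS formats; a real consumer would fetch the XML from the syndicator's URL first.

```python
import xml.etree.ElementTree as ET

# A made-up feed for illustration; real feeds are downloaded from a site's RSS URL.
feed = """<?xml version="1.0"?>
<rss version="0.91">
  <channel>
    <title>Example News</title>
    <link>http://news.example.org/</link>
    <description>Made-up headlines for illustration</description>
    <item>
      <title>XML turns four</title>
      <link>http://news.example.org/xml-turns-four</link>
    </item>
    <item>
      <title>Aggregators multiply</title>
      <link>http://news.example.org/aggregators</link>
    </item>
  </channel>
</rss>"""

root = ET.fromstring(feed)
channel = root.find("channel")

# Each <item> is one "headline": a title plus the link to the full page.
headlines = [
    (item.findtext("title"), item.findtext("link"))
    for item in channel.findall("item")
]

print(channel.findtext("title"))
for title, link in headlines:
    print(f"- {title}: {link}")
```

An aggregator is essentially this loop run over many feeds, with the results merged and re-published.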
RSS Syndication Model [Diagram: Web sites feed an RSS aggregator; RSS consumers of the aggregated data include a Web site, a desktop app (e.g., Headline Viewer), a JavaScript component, and others (another aggregator, ...). Black lines: <XML>] • ‘one-way’ XML -- simple querying of the ‘aggregator’ via URLs: http://ag.org/?news
Distributed data application: Jabber • open, XML-based protocol for instant messaging and presence. Jabber-based software is deployed on thousands of servers across the internet and is used by over a million people worldwide. • A complete XML-based distributed application toolset. †From: TEI home page, http://www.tei-c.org, 16 Jan 2002
Jabber: [Diagram: Jabber clients connect to Jabber servers, and the servers connect to one another] Services offered by a server: • Presence • User directory • Proxies to Yahoo, ICQ • Other services
Jabber Example [Diagram: two Jabber clients, each connected to its own Jabber server; each server keeps a contact database, and the two servers exchange messages] • Connect, register presence • Look up user • Send text message • Requests and responses all sent in XML • Generic XML protocol for exchanging messages, plus some services. Can be extended to non-text messaging applications
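As a hedged sketch of "requests and responses all sent in XML", the code below builds and reads back a Jabber-style message element: a `message` with `to` and `from` addresses and a `body` child. The addresses and the helper names `make_message` and `read_message` are invented for illustration, and the sketch leaves out the enclosing XML stream, namespaces, and authentication that a real Jabber session needs.

```python
import xml.etree.ElementTree as ET

def make_message(to, sender, text):
    """Serialize a simplified Jabber-style chat message as an XML string."""
    message = ET.Element("message", {"to": to, "from": sender, "type": "chat"})
    ET.SubElement(message, "body").text = text
    return ET.tostring(message, encoding="unicode")

def read_message(xml_text):
    """Parse a message produced by make_message back into its parts."""
    message = ET.fromstring(xml_text)
    return message.get("to"), message.get("from"), message.findtext("body")

# Hypothetical addresses; the text includes characters that XML must escape.
wire = make_message("alice@example.org", "bob@example.com", "Lunch at noon? <bring & share>")
print(wire)
print(read_message(wire))
```

Because the payload is ordinary XML, the same envelope could carry something other than chat text, which is the extension point the slide refers to.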
XML for networked applications • XML for encoding data • XML for transporting information between applications • XML for encoding instructions to send to another application • XML interfaces to other applications • Creation of Web Services • Software made available to others via a generic XML interface, with supporting facilities (directory service for ‘finding’ them, etc.) • XML is becoming the core tool for building distributed, dynamically configured applications
How can this be used? [Diagram: an Integrated Application reaches a Web site, News Feeds, Jabber/chat, and Banking through an XML interface (SOAP, XML-RPC, other...)] • Web content distribution • Calendar aggregation • Portlets for Web sites • Distributed catalogs / db’s
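To show what an "XML interface" looks like on the wire, this sketch uses the XML-RPC encoder and decoder in Python's standard library. The method name `convert` and its arguments are hypothetical (no such banking service is described in the talk), and no server is contacted; the code only encodes a call and a reply as XML and decodes them again.

```python
import xmlrpc.client

# Encode a call to a hypothetical remote method "convert" as an XML-RPC request.
call = xmlrpc.client.dumps(("CAD", "USD", 100), methodname="convert")
print(call)

# The receiving application decodes the XML back into a method name and arguments.
params, method = xmlrpc.client.loads(call)
print(method, params)

# The reply travels the same way: a value wrapped in an XML-RPC response document.
reply = xmlrpc.client.dumps((75.5,), methodresponse=True)
result, _ = xmlrpc.client.loads(reply)
print(result[0])
```

Neither side needs to know what language or platform the other runs on; both only need to agree on the XML, which is the property the diagram relies on.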
The result of all this activity • Enormous drive to create all the XML technologies needed behind the scenes • Many “core” XML languages, plus many supporting standards • Evolution has been very quick, as the new Web model is not that n
XML (and related) Specifications [Chart. Areas shown: XML Core, APIs, Style, Protocols, Web Services, Data/presentation, Application areas. Status legend: W3C rec, W3C draft, industry std, ‘Open’ std. Specifications named: XML 1.0, Xfragment, XML names, RDF, Canonical, Xpath, MathML, XSLT, SMIL 1 & 2, Xpointer, XML base, JDOM, VoiceXML, JAXP, Xlink, Infoset, XSL, SVG, DOM 1, DOM 2, DOM 3, XHTML events, XML signature, XHTML 1.0, XML query, XHTML basic, Xforms, XML schema, SAX 1, SAX 2, UDDI, Modularized XHTML, RSS, SOAP, IFX, TEI, Biztalk, IMS, XML-RPC, CSS 1, CSS 2, CSS 3, HEML, Docbook, ebXML, XMI, WDDX, CellML, XUL, WSDL, Jabber, ... 100's more ...]
In Conclusion • XML is changing the way we think about ‘raw’ information • Open • Universal • Shareable • Distributable • Collective, complex, and emergent • ... and, with the Internet model, is changing the way we think about applications • Networked (via XML) collections of individually simple apps. • Value in aggregation, not the individual parts
Conclusion II • “A large part of how we think about music is influenced by the methods by which it has conventionally been distributed. We think of pop songs as being three or four minutes long because 40 years ago that was all that could fit on one side of a vinyl single.” Moby • We think of Internet-based computing in the same way -- in terms of what we know or knew -- not what it can be, or will become • Our great opportunity is to help define this future