Overview of XML-related standards

Overview of XML-related standards. Steven J. DeRose, Ph.D. Brown University Scholarly Technology Group Steven_DeRose@brown.edu http://www.stg.brown.edu/~sjd. XML and related specs. XML: The basic syntax Plus Namespaces, Schemas, InfoSet DOM: API to the Information Set XML Linking


Presentation Transcript


  1. Overview of XML-related standards • Steven J. DeRose, Ph.D. • Brown University Scholarly Technology Group • Steven_DeRose@brown.edu • http://www.stg.brown.edu/~sjd

  2. XML and related specs • XML: The basic syntax • Plus Namespaces, Schemas, InfoSet • DOM: API to the Information Set • XML Linking • XPath: Expressions to find XML nodes • XPointer: XPath++ for addressing • XLink: hypermedia connections • Stylesheet Attachment • XSL: stylesheets and transforms

  3. XML specification • A “Recommendation” since 2/1998 • The highest level for a W3C specification • Defines the syntax/grammar • Not any particular processing/semantics • Schemas or DTDs define applications (poem, manual, eCommerce,...) • All these can be parsed by generic XML parsers, just as new words can be readily fitted into existing sentence structures • Schemas are political as well as technical

  4. XML Namespaces • Disambiguate element type names <head><html:title>On cataloging</html:title>…<biblio><entry id='DeRo98'> <loc:title>Navigation, Access, Control… • Declaring prefixes <sec xmlns:loc="http://foo.com/mynamesp" xmlns:html='http://www.w3.org/1999/xhtml' xmlns="http://…"> <loc:title>… • Declaration without prefix sets default • Attributes can have namespaces • No renaming (x:foo to y:bar)
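
A minimal sketch of how a namespace-aware parser keeps the two title elements apart, using Python's standard xml.etree.ElementTree. The sample document is modeled on the slide; its URIs are placeholders and the prefix "h" used for lookup is an arbitrary choice, not something the slide defines.

```python
import xml.etree.ElementTree as ET

# Hypothetical document modeled on the slide; the URIs are placeholders.
DOC = """<sec xmlns:loc="http://foo.com/mynamesp"
     xmlns:html="http://www.w3.org/1999/xhtml">
  <html:title>On cataloging</html:title>
  <loc:title>Navigation, Access, Control</loc:title>
</sec>"""

root = ET.fromstring(DOC)

# ElementTree expands each prefixed name to "{namespace-URI}local-name",
# so the two <title> elements no longer collide.
expanded_names = [child.tag for child in root]

# A lookup may use any prefix, as long as it is mapped to the same URI.
ns = {"h": "http://www.w3.org/1999/xhtml"}
html_title = root.find("h:title", ns).text
```

The prefix is only a local abbreviation; the comparison that matters is on the URI plus the local name.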

  5. XML Schemas • Let you define a document type • What elements/attributes are defined? • Where can they occur? • What content is allowed? • What datatypes are represented? • Required for validation • Similar to DTDs, but • More powerful (esp. for datatyping) • Use XML syntax

  6. XML Information Set • What data in XML document “counts”? • Elements, attributes, content • Order and hierarchy of nodes • Required for interoperability • Applications must count nodes consistently • Not whitespace inside tags • Not which kind of quotes around attributes • Candidate recommendation 2001-05-14 • http://www.w3.org/TR/xml-infoset
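
The idea that some lexical details do not "count" can be sketched by comparing two serializations after parsing. This is an illustration with Python's xml.etree.ElementTree, not the Infoset data model itself; the helper name summarize is hypothetical.

```python
import xml.etree.ElementTree as ET

def summarize(elem):
    """Nested tuple of what counts here: name, attributes, text, ordered children."""
    return (
        elem.tag,
        tuple(sorted(elem.attrib.items())),  # attribute order is not significant
        elem.text or "",
        tuple(summarize(child) for child in elem),
    )

# Same information, different quoting and different whitespace inside tags.
a = ET.fromstring("<div n='1'><p>Hello</p></div>")
b = ET.fromstring('<div   n = "1" ><p >Hello</p ></div >')
# A real difference: the attribute value changes.
c = ET.fromstring("<div n='2'><p>Hello</p></div>")

same = summarize(a) == summarize(b)
different = summarize(a) != summarize(c)
```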

  7. 7 types of Infoset Nodes • Root: Above the document • <?foo ?> <doc>…</doc> <!-- hi --> • Element: Main structure • <div n='1'>…</div> • Text: Spans of unbroken text • Attribute: Properties of elements • Namespace: Prefixes/URIs • Processing Instr: <?…?> • Comment: <!-- … --> [Callout on the slide: These are always leaves in the Infoset]

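  8. Example [Tree diagram: ROOT sits above the doc element. doc contains title, abstract, and three chapter elements whose IDs are 'intro', 'concepts', and 'summary'. Lower levels show a title with the text "Introduction", sections containing title, p, and list elements, a p with ID='p37', an a with name='baz', and an xref with href='#id(intro)'. Legend: Root, Element, Attribute, Text node (others omitted).]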

  9. More Infoset details [Diagram labels: Namespace node (http://www.w3.org/1999/xhtml); item; Mixed content; p containing the text "Everything is deeply intertwingled", with "deeply" inside an em element; Processing instruction (PI: TEX:pgbrk); XML Comment ("Added 7/00").]

  10. DOM • "Document Object Model" • An API for accessing the Infoset • Many tools use this • Level 1 complete • http://www.w3.org/TR/REC-DOM-Level-1 • Level 2 core complete • http://www.w3.org/TR/DOM-Level-2-Core
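
As a small illustration of what an API onto the document tree looks like, here is a sketch using xml.dom.minidom, the lightweight DOM implementation in the Python standard library. The sample document is invented for the example.

```python
from xml.dom import minidom

# Invented sample document.
doc = minidom.parseString("<doc><title>Intro</title><p id='p37'>Hello</p></doc>")
root = doc.documentElement

# Walk the children of the document element and keep only element nodes.
child_names = [n.nodeName for n in root.childNodes if n.nodeType == n.ELEMENT_NODE]

# Look an element up by name, then read its attribute and text child.
p = doc.getElementsByTagName("p")[0]
p_id = p.getAttribute("id")
p_text = p.firstChild.data
```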

  11. XML Base • Similar to the HTML <base> element • Useful for keeping URIs simpler and uniform. • Applies to relative URLs <html><head><base href="http://www.example.com/">…</head> <body>… <a href="fig/mosquito.png"> • The hrefs combine to make whole URI: http://www.example.com/fig/mosquito.png

  12. XML Base • XML Base provides similar feature • By a reserved attribute <?xml version="1.0"?> <doc xml:base="http://eg.org/today/"> See <link xlink:type="simple" xlink:href="new.xml">the news</link> • Applies to attributes & descendants • Can be overridden on descendants • Final REC as of 2001-06-27 • http://www.w3.org/TR/xmlbase/
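
The inheritance and override behavior can be sketched as a recursive walk that carries the current base URI down the tree. This is an approximation using urllib.parse.urljoin, not a full XML Base processor; the archive element and its xml:base value are hypothetical additions to the slide's example.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XML_NS = "http://www.w3.org/XML/1998/namespace"
XLINK_NS = "http://www.w3.org/1999/xlink"

# The slide's example plus a hypothetical <archive> that overrides the base.
DOC = """<doc xml:base="http://eg.org/today/"
     xmlns:xlink="http://www.w3.org/1999/xlink">
  See <link xlink:type="simple" xlink:href="new.xml">the news</link>
  <archive xml:base="../2001/">
    <link xlink:type="simple" xlink:href="old.xml">older news</link>
  </archive>
</doc>"""

def resolved_hrefs(elem, base=""):
    """Collect xlink:href values, resolved against the nearest xml:base."""
    # A descendant's xml:base is itself resolved against the inherited base.
    base = urljoin(base, elem.get(f"{{{XML_NS}}}base", ""))
    href = elem.get(f"{{{XLINK_NS}}}href")
    found = [urljoin(base, href)] if href is not None else []
    for child in elem:
        found.extend(resolved_hrefs(child, base))
    return found

hrefs = resolved_hrefs(ET.fromstring(DOC))
```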

  13. Stylesheet attachment • Lets documents point to stylesheets • Based on HTML <link rel='stylesheet'> • Multiple, anywhere in XML prolog • May point to CSS, XSL, etc. • Example: • <?xml-stylesheet alternate="yes" href="mystyle.css" title="Medium" type="text/css"?> • Equivalent of HTML: <LINK href="mystyle.css" title="Medium" rel="alternate stylesheet" type="text/css"> • REC: http://www.w3.org/TR/xml-stylesheet
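
A sketch of how an application might read the stylesheet processing instruction. The pseudo-attributes inside the PI are not real XML attributes, so this example pulls them out with a simple regular expression; that pattern is a simplification (it does not handle a value containing the other kind of quote), and the function name stylesheet_links is hypothetical.

```python
import re
from xml.dom import minidom

DOC = """<?xml version="1.0"?>
<?xml-stylesheet alternate="yes" href="mystyle.css" title="Medium" type="text/css"?>
<doc/>"""

def stylesheet_links(xml_text):
    """Return one dict of pseudo-attributes per xml-stylesheet PI in the prolog."""
    links = []
    for node in minidom.parseString(xml_text).childNodes:
        is_pi = node.nodeType == node.PROCESSING_INSTRUCTION_NODE
        if is_pi and node.target == "xml-stylesheet":
            # Simplified name="value" matcher for the PI's data string.
            pairs = re.findall(r"""([\w-]+)\s*=\s*["']([^"']*)["']""", node.data)
            links.append(dict(pairs))
    return links

links = stylesheet_links(DOC)
```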

  14. XSL specification • Stylesheet language • Based on ISO DSSSL and W3C CSS • 2 major pieces: • XSLT: document transformation • Builds on XPath (more later) • Match elements, then construct output • XSL-FO: Formatting objects • To actually render blocks, fonts, tables, etc. • Hypermedia support unfinished (=CSS) • http://www.w3.org/TR/xsl/

  15. Current XML organization • XML Plenary coordinates several WGs • Some related WGs have liaisons [Organization chart under W3C Architecture. Labels: Plenary, Core, Schema, Query, XSL, Linking, DOM, UI, Canonical, Infoset, Fragments, Namespaces, Assoc. Style, XML Base, XPointer, XPath, XLink, XInclude]

  16. XML-Linking specifications • XPath: expressions on infoset nodes • REC: http://www.w3.org/TR/xpath • XPointer: XPath + ranges, in URIs • CR: http://www.w3.org/TR/WD-xptr • XLink: gather locations to make links • REC: http://www.w3.org/TR/xlink/ • (XML Base)

  17. XML-Linking goals: end user • Links from un-writable documents • Which is most of the Web, for any person • Perhaps the most important single feature • ->Bidirectional and multi-ended links • ->Annotations and annotation sharing • Dynamic updates, patches, highlighting • Precise link attachment in any media • Large sets/databases of managed links • An entirely new market for links per se • Anyone can publish/sell their commentary

  18. Pointing vs. linking • In HTML, many things are combined: <a href="eg.org/foo">wow</a> • Technically: • "eg.org/foo" is a pointer (namely a URI) • The abstract connection itself is the link • The <a> element is a link representation • "wow" is the local anchor • Anchors are also called link-ends • Data at eg.org is the remote anchor • HTML specifies the link behavior

  19. XPointer: locators • <xml>… <xref target="http://z.com/foo.xml#id('p37')">See Section 1.</xref> • A way of locating data in XML structure, used to attach link end(s) to data • A pointer identifies or locates some part of a document -- this is only the part highlighted in yellow on the slide [Diagram: the document tree from slide 8, with IDs intro, concepts, summary, and p37, an a with name='baz', and an xref with href='#id(intro)']

  20. XLink: connections • Describes a relationship of referenced location(s), • To each other • To descriptions • XLink provides some key ones • A link connects data and meta-data portions, including their relationship -- really just the lines • A link may be expressed at a unique source end, or out in a link database [Diagram: several locations labeled "Someplace" joined by lines, each line labeled "role"]

  21. XPointer… • Locates parts of XML resources • Even things without IDs • Even things that aren't whole nodes • XPointer adds (beyond XPath): • Way to refer to point and range selections • Way to use inside URI fragment identifiers • TEI “extended pointer” notation plus XPath logical expressions • Typically, a browser might load a document and scroll to/highlight the part

  22. Anatomy of a URI reference • URI reference: http://example.com/foo.htm#bing • URI: http://example.com/foo.htm • scheme: http • domain: example.com • path: /foo.htm • fragment identifier: bing (XPointer defines this part)
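
The same decomposition can be reproduced with urllib.parse from the Python standard library. This is only an illustration of the parts named on the slide; note that Python calls the domain part "netloc".

```python
from urllib.parse import urldefrag, urlsplit

ref = "http://example.com/foo.htm#bing"

# Split the URI reference into scheme, domain (netloc), path, and fragment.
parts = urlsplit(ref)

# Separate the URI proper from the fragment identifier after "#".
uri, fragment = urldefrag(ref)
```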

  23. Fragment identifiers • Part of URIs after "#" • Says where in document is actual target • Separate form for each media type • Identifiers for graphics differ from those for text • IETF MIME definition specifies form • HTML • To scroll to <a name="coyote"> http://example.com/hello.html#coyote

  24. The 3 XPointer/XPath forms • Bare names • An XML "name"* finds element with that ID • For (X)HTML compatibility • HTML uses "NAME", not ID • Child sequences • Stepwise down through elements: /1/4/27/2 • May start with an ID: intro/4/3/2 • Full XPointers • scheme1(args) scheme2(args)… • For now, the only "scheme" is "xpointer" *Name: Letters, digits, hyphen, underscore, period.
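
A child sequence can be resolved with a few lines of tree walking. The sketch below uses xml.etree.ElementTree and makes two assumptions that the slide does not state: an attribute literally named "id" is treated as the ID (no DTD is consulted), and the leading "/1" selects the document element. The function name resolve_child_sequence and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

DOC = """<doc>
  <chapter id="intro">
    <title>Introduction</title>
    <section><p>first</p><p>second</p></section>
  </chapter>
  <chapter id="concepts"/>
</doc>"""

def resolve_child_sequence(doc_elem, pointer, id_attr="id"):
    """Resolve pointers such as '/1/1/2/2' or 'intro/2/1'; return None if no match."""
    first, *steps = pointer.split("/")
    if first:
        # Starts with an ID: find the element carrying it (assumed attribute "id").
        node = next((e for e in doc_elem.iter() if e.get(id_attr) == first), None)
    else:
        # Starts at the root: step "1" is the document element itself.
        if not steps or steps[0] != "1":
            return None
        node, steps = doc_elem, steps[1:]
    for step in steps:
        if node is None:
            return None
        children = list(node)  # element children, in document order
        index = int(step) - 1  # child sequences count from 1
        node = children[index] if 0 <= index < len(children) else None
    return node

root = ET.fromstring(DOC)
by_root = resolve_child_sequence(root, "/1/1/2/2")
by_id = resolve_child_sequence(root, "intro/2/1")
missing = resolve_child_sequence(root, "/1/9")
```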

  25. XPointer's 2 parts • Provide 'scheme' mechanism • Identify media-specific pointer types • Allow multiple ones to co-exist • Pointing methods for XML • Point to ranges, sets, id's, coords… • Point descriptively

  26. XPointer schemes • Each media type needs pointer type • pngRect(0,10 100,200) • vrml(camera=1,2,3 light=4,50,500) • map(W010’/ N5130’) • Xml(…) • Schemes label fragment identifier types • #scheme1(args) scheme2(args)… • Escape any extra ( ) -- tlg('^(apax') • XPointer() is the first scheme

  27. Multiple schemes in a URL? • When a server responds to a URI, it • Checks what media the client can handle • Picks one of those to send • “content negotiation” • If a visually-impaired user clicks • <a href="http://www.example.com/foo.gif#gif(0,0 1,1) xpointer(id(chap1))"> • The server may fall back to an XML file • The client tries fragment identifiers left-to-right, and uses the first one that works
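
The left-to-right fallback can be sketched as a small parser that splits a fragment into scheme(args) parts and picks the first scheme the client supports. It assumes balanced parentheses with "^" as the escape character, as in the tlg example on the previous slide; the function names are hypothetical, and "works" is reduced here to "the scheme is in a supported set".

```python
def split_scheme_parts(fragment):
    """Split 'a(x) b(y(z))' into [('a', 'x'), ('b', 'y(z)')]."""
    parts = []
    i, n = 0, len(fragment)
    while i < n:
        if fragment[i].isspace():
            i += 1
            continue
        open_at = fragment.index("(", i)  # raises ValueError for a bare name
        scheme = fragment[i:open_at]
        depth, j = 1, open_at + 1
        while j < n and depth:
            ch = fragment[j]
            if ch == "^":
                j += 1  # skip the escaped character that follows
            elif ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
            j += 1
        if depth:
            raise ValueError("unbalanced parentheses in fragment")
        parts.append((scheme, fragment[open_at + 1 : j - 1]))
        i = j
    return parts

def first_supported(fragment, supported):
    """Return the first (scheme, args) part whose scheme the client supports."""
    for scheme, args in split_scheme_parts(fragment):
        if scheme in supported:
            return scheme, args
    return None

chosen = first_supported("gif(0,0 1,1) xpointer(id(chap1))", {"xpointer"})
```

A client that received the GIF would instead pass a supported set containing "gif" and stop at the first part.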

  28. Anatomy of a location step • child::para[@type="weak"][3] • axis name: child • node test: para • predicate: [@type="weak"] • attribute reference: @type • literal string: "weak" • position test: [3] • Case matters • Finds the third child of the current node that (a) is an element of type 'para' and (b) has a 'type' attribute whose value is 'weak'
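
The meaning of the step can be spelled out by hand: apply the axis and node test, then each predicate in order. The sketch below does this over an xml.etree.ElementTree element rather than calling an XPath engine; the function name location_step and the sample data are hypothetical.

```python
import xml.etree.ElementTree as ET

DOC = """<sec>
  <para type="weak">a</para>
  <para type="strong">b</para>
  <note type="weak">c</note>
  <para type="weak">d</para>
  <para type="weak">e</para>
  <Para type="weak">f</Para>
</sec>"""

def location_step(context, name, attr, value, position):
    """Evaluate child::name[@attr="value"][position] against a context element."""
    # Axis "child" plus node test: element children with exactly this name.
    candidates = [child for child in context if child.tag == name]
    # First predicate: keep those whose attribute has the literal value.
    candidates = [c for c in candidates if c.get(attr) == value]
    # Second predicate: 1-based position among the survivors.
    return candidates[position - 1] if 0 < position <= len(candidates) else None

context = ET.fromstring(DOC)
third_weak_para = location_step(context, "para", "type", "weak", 3)
```

The element named Para is skipped because names are case-sensitive, and the position is counted after the attribute predicate has filtered the list.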

  29. Summary: axes and functions • root( ), id( ) • parent, self, child • ancestor, ancestor-or-self • descendant, descendant-or-self • preceding-sibling, following-sibling • preceding, following • attribute, namespace • here( ), origin( ) • string-range( ), range-to( ) [Side labels on the slide: Absolute, Relative, Absolute]

  30. Counting locations

  31. Points and Ranges • Point • What you get by click-selection • Gap before/after node or char • Range • What you get by drag-selection • From a start point to an end point • Not generally a well-formed XML subtree • May partially contain some elements: • <p>Hello, world.</p><p>Hi, back</p> • Crucial for creating hypertext links • How often do you click/drag exactly one entire element? [Figure: the sample text "Hello, world." shown twice]
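
To see why a range is generally not a well-formed subtree, the sketch below selects a span of characters that starts inside one paragraph and ends inside the next. It models points as plain character offsets into the concatenated paragraph text, which is a simplification and not XPointer's own point model; the function name range_pieces is hypothetical.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<doc><p>Hello, world.</p><p>Hi, back</p></doc>")

def range_pieces(root, start, end):
    """Return (paragraph number, selected text) for each <p> the range touches."""
    pieces, offset = [], 0
    for number, p in enumerate(root.iter("p"), start=1):
        text = p.text or ""
        # Clip the requested range to this paragraph's span of offsets.
        lo, hi = max(start, offset), min(end, offset + len(text))
        if lo < hi:
            pieces.append((number, text[lo - offset : hi - offset]))
        offset += len(text)
    return pieces

# A drag-selection from just before "world" to just after "Hi,".
pieces = range_pieces(root, 7, 16)
```

The selection covers part of each paragraph and all of neither, so no single element or list of whole elements describes it.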

  32. XLink is a language that... • Lets you invent your own linking elements and their meanings • In keeping with XML approach overall • Lets you create link databases • Links become first-class objects in the model • Provides some basic traversal behavior • E.g., “Open the target in a new window” • The rest is left to a style mechanism such as XSL

  33. XLink terminology • Linking element • Identifies, connects, and describes anchors • Locator • Locates the data of some link end (anchor) • Link end or anchor • A data portion reachable as part of a link • Arc • Explicit connection between two link ends • Resource • Anything you can point at on the Web • Using an arc is called Traversal

  34. What links do with link-ends • A link identifies where its ends are • Using some kind of locators • URI#XPointer will be the locator for XML • URI#scheme()scheme() in general • A link attaches metadata to each end • Its formal role in relation to the other ends • A title by which to refer to it (say, in menus) • Some traversal behaviors • Arcs to say which traversals happen • Link itself can also have type, other info

  35. Inline links • Linking element itself (better, the origin() end) is one of the link’s ends

  36. Out-of-line links • Linking element itself isn't automatically made into one of its own resources • Requires that there be a way to find link databases in the first place

  37. Anatomy of an XML link • <link type="annotated-reference"> <loc role="ref" href="xptr.xml#child(2,div)"/> <loc role="src" href="knut73.tex#s4.2.2"/> <loc role="com" href="http://x.com/note.html"/> </link> • Link need not be “at” a link-end • Each link-end can be described • Link may have any number of ends • Link-ends need not be XML • Link-ends need not be marked up [Diagram: the locators point at three sample targets: <html>Knuth’s right.</html>; <!DOCTYPE spec...<spec><div>…</div><div> <head>...</head>...; and \{… 4.2.2: A tree is a set of nodes where each node has one parent, except for a root node, which has none….}]

  38. Arcs • Arcs specify traversal rules • Multi-ended links may restrict travel among their endpoints • Restrictions generic or app-specific • Arcs enable the description of both • An arc is a pair of roles, plus metadata • Enables traversal between ends with the given roles • May be multiple locators per role (useful for document assembly, multiple-choice travel)

  39. Arc example: fuel-type annotations • ARCS: vehicle → fuel-type, fuel-type → warning [Diagram: link body 1 joins three ends with role "vehicle", one end with role "fuel-type" (gasoline), and two ends with role "warning" ("Warning: explosive", "Warning: toxic")]
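
The fuel-type example can be modeled as data: locators grouped by role, and arcs as pairs of roles. This is a sketch of the idea rather than XLink syntax; every file name is hypothetical, since the slide names only the roles and the warning texts.

```python
from itertools import product

# Locators grouped by role; the file names are placeholders.
locators = {
    "vehicle": ["car.xml", "truck.xml", "mower.xml"],
    "fuel-type": ["gasoline.xml"],
    "warning": ["explosive.xml", "toxic.xml"],
}

# Each arc is a pair of roles: traversal is allowed from the first to the second.
arcs = [("vehicle", "fuel-type"), ("fuel-type", "warning")]

def allowed_traversals(locators, arcs):
    """Expand role-to-role arcs into concrete (source, destination) pairs."""
    return [
        (src, dst)
        for from_role, to_role in arcs
        for src, dst in product(locators[from_role], locators[to_role])
    ]

traversals = allowed_traversals(locators, arcs)
```

Because no arc joins "vehicle" to "warning" directly, a reader can reach a warning from a vehicle only by way of the fuel type, which is the restriction the arcs are meant to express.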

  40. How to detect links • Could have any name and content at all • <footnote>, <criticism>, … • xlink:type attribute marks linking elements for applications to find: <!ELEMENT footnote EMPTY> <!ATTLIST footnote xlink:type CDATA #FIXED "simple" xlink:href CDATA #REQUIRED> • "simple" is the default value for the xlink:type attribute • For example: ...has studied the issue.<footnote xlink:href="http://www.doctools.com" />
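
Detection by attribute rather than by element name can be sketched as a scan for any element carrying xlink:type. In this example the attribute is written explicitly in the instance, so DTD-supplied defaults are not exercised; the sample document and the function name find_links are hypothetical.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"

DOC = """<doc xmlns:xlink="http://www.w3.org/1999/xlink">
  <p>...has studied the issue.<footnote xlink:type="simple"
      xlink:href="http://www.doctools.com"/></p>
  <criticism xlink:type="simple" xlink:href="review.xml">See the review.</criticism>
  <p>No link here.</p>
</doc>"""

def find_links(root):
    """Return (element name, link type, href) for every element marked as a link."""
    type_attr = f"{{{XLINK}}}type"
    href_attr = f"{{{XLINK}}}href"
    return [
        (elem.tag, elem.get(type_attr), elem.get(href_attr))
        for elem in root.iter()
        if type_attr in elem.attrib
    ]

links = find_links(ET.fromstring(DOC))
```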

  41. Arcs and Traversals • Traversal is split into: • Behavior • Author's intention for behavior of a link. • Input to style mechanism • Not a presentation command • Actuation • Defines the event that triggers a link • Events are very generic, intentionally

  42. Two kinds of behavior policies • show attribute • new to traverse and provide new “context” • replace to display in existing “context” • embed to display in the body of the initiating resource • Some semantic details are left unspecified: combining multiple ends, style inheritance, etc. • actuate attribute • onRequest to require external request • onLoad to traverse when link processed

  43. Link databases let you… • Attach descriptive information from afar • Annotate other people's stuff • Maintain links more easily • When a destination changes, you don’t have to touch documents with links to it • Engage in online commerce in links • Express, package, and sell point-of-view • Collect out of line links as databases

  44. External Linksets • Users will have persistent linkdbs • Subscriptions, interest groups, private,... • Document can specify relevant link dbs • Linked by special type of extended link • Included within regular documents too • LinkDBs enable link management • Needed to author using external links • Example: Public annotations on….

  45. An external Linkset Instance <xls><linkbase xlink:href="linkset1.xml" /><linkbase xlink:href="linkset2.xml" /><linkbase xlink:href="linkset3.xml" /> </xls>
