
The Semantic Blessings of XSLT

Diederik Gerth van Wijk (dg@doxatrix.nl), DOXATRIX. XML Holland 2008, Planetarium Gaasperplas, Amsterdam, 20 November. Intended audience: understands English, knows what XML is about, cares about meaning, processing and validation.





Presentation Transcript


  1. The Semantic Blessings of XSLT. Diederik Gerth van Wijk (dg@doxatrix.nl), DOXATRIX. XML Holland 2008, Planetarium Gaasperplas, Amsterdam, 20 November.

  2. Intended audience
     • Understands English
     • Knows what XML is about
     • Cares about meaning, processing and validation
     • Does not need to know about XSLT
     • Does not need to be a programmer
     • But might be aware that computers need to be programmed

  3. Semantic? Blessings? XSLT?
     • XML is about the structure of a document
     • Semantics are about “meaning”
     • A schema can say that a document should have a title (structure)
     • The documentation might add that a title is used for identification (unique within a set of documents), and gives a clue about what the document is about (semantics)
     • The words used in the title are really semantics
     • Blessings are good, helpful, you want them
     • What is XSLT?
     • How can XSLT help you in adding, verifying and using semantic markup?

  4. Why bother marking up explicitly?

  5. NLP (natural language processing) is good, explicit markup is better
     • “Plein 26 Den Haag” = <street>Plein</street><nr>26</nr><city>Den Haag</city>
     • “Plein 1813 Den Haag” = <street>Plein 1813</street><city>Den Haag</city>
     • XML is about tagging structure
     • A schema adds semantics
     • <name>Quattro Stagioni</name>: pizza by Mario or piece by Vivaldi?
     • I don’t care (in this presentation)
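The two addresses on this slide can be used to sketch why guessing structure from plain text is fragile. The following is a minimal illustration in Python, not part of the talk: the regular expression, the function names and the `<address>` wrapper element are all assumptions made for the example.

```python
import re
import xml.etree.ElementTree as ET

def guess_address(text):
    """Naive heuristic (hypothetical): street, then a number, then the city."""
    m = re.match(r"^(?P<street>\D+?)\s+(?P<nr>\d+)\s+(?P<city>.+)$", text)
    return m.groupdict() if m else None

def read_address(xml_text):
    """Read explicitly marked-up fields; a missing element yields None."""
    root = ET.fromstring(xml_text)
    return {tag: root.findtext(tag) for tag in ("street", "nr", "city")}

# The heuristic cannot know that "1813" belongs to the street name.
guessed = guess_address("Plein 1813 Den Haag")

# The <address> wrapper is an assumption; the slide shows only the children.
marked = read_address(
    "<address><street>Plein 1813</street><city>Den Haag</city></address>"
)

print(guessed)
print(marked)
```

The heuristic reads "1813" as a house number, while the marked-up version keeps it in the street name and reports that no house number was given.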

  6. eXtensible Stylesheet Language - Transformations
     • XSL: the eXtensible Stylesheet Language
     • Family of three W3C recommendations for transformation and presentation
         • XML Path Language (XPath)
         • XSL Transformations (XSLT)
         • XSL Formatting Objects (XSL-FO)
     [Diagram: XML source document(s) go into an XSLT processor; with XSLT stylesheet 1 the result is an XSL-FO document, which an XSL-FO processor turns into PDF; with XSLT stylesheet 2 the result is HTML pages.]

  7. XSLT characteristics
     • An XSLT style sheet is an XML document
     • Input is one or more XML documents
     • Output is one or more XML (XSLT!), HTML, XSL-FO or plain text (CSS!) documents
     • Style sheet can look like a template of the result document (data pull)
     • Or be event driven (data push)
     • Elements and attributes are “events”
     • Functional programming language
     • Rule based
     • Declarative
     • No side effects
     • Statements can be executed in any order
     • Embeds XPath
     • XSLT 2.0 and XPath 2.0 know XML Schema types
     • XSLT 2.0 can compute from implicit structure

  8. XSLT engines
     • Stand alone:
         • Saxon (open source, Michael Kay)
         • Altova (free, XML Spy)
         • MSXML
     • On server:
         • Saxon + .NET
         • Altova + .NET
         • MSXML + ASP
     • Built into the browser:
         • Internet Explorer 6 and higher
         • Firefox 1 and higher
         • Opera 9 and higher

  9. What’s the competition?
     • CSS (Cascading Style Sheets)
         • Easier, simpler
         • Don’t transform
     • Perl, Python, Java, JavaScript, C(++), (V)Basic
         • Generic programming or scripting languages
         • No built-in knowledge of XML, but lots of libraries for DOM or SAX
     • JSP, ASP, PHP
         • Server-side processing
         • Not really XML aware
         • Little or no transformation
     • IS-10179 DSSSL: Document Style Semantics and Specification Language
         • SGML based
         • Rarely used

  10. XSLT and semantics...
     • XML elements describe what the content is (semantics)
     • XSLT stylesheets describe what to do with it (processing)
     • How can a processing stylesheet be a semantic blessing?

  11. Blessing 3: XSLT 2.0 may be schema aware
     • A schema defines the semantics of a document type
     • XSLT 2.0 is based on XPath 2.0
     • XSLT 2.0 may use schemas
     • Then XPath 2.0 can use the declared type of elements and attributes
     • So it can know whether to treat an attribute as a string or as an integer ("12" < "3" if the type is string, "12" > "3" if the type is integer)
     • But will it sort correctly?
         <song title="50 ways to leave your lover" performer="Paul Simon" />
         <song title="1919 rag" performer="Kid Ory" />
       or
         <king name="Henry VIII" born="1491-06-28" died="1547-01-28" />
         <king name="Henry IX" born="1725-03-11" died="1807-07-13" />
       (yes, if the Roman numerals were coded as &#x2167; and &#x2168;)
     • With the “instance of” operator you can use information that is not in the document, but is in the schema
     • Therefore, XSLT 2.0 discourages stand-alone processing
     • From a semantic point of view, that’s a blessing
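The comparison and sorting questions on this slide can be tried out in a few lines. The sketch below uses Python rather than a schema-aware XSLT processor, so it only illustrates the principle: Python compares strings by code point, whereas the collation an XSLT processor applies by default may differ. The variable names are made up for the example.

```python
# Typed versus untyped comparison, using the values from the slide.
as_strings = "12" < "3"             # compared character by character: "1" < "3"
as_integers = int("12") > int("3")  # compared as numbers

# The song titles compared as plain text.
titles = ["50 ways to leave your lover", "1919 rag"]
sorted_titles = sorted(titles)

# Regnal numbers spelled with ASCII letters, versus the single characters
# U+2167 (ROMAN NUMERAL EIGHT) and U+2168 (ROMAN NUMERAL NINE).
ascii_kings = sorted(["Henry VIII", "Henry IX"])
coded_kings = sorted(["Henry \u2167", "Henry \u2168"])

print(as_strings, as_integers)
print(sorted_titles)
print(ascii_kings)
print(coded_kings)
```

With ASCII letters the ninth Henry sorts before the eighth, because "I" precedes "V"; with the dedicated Roman numeral characters the code points themselves carry the order.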

  12. Blessing 4: Schema independent processing (1)
     • In a sequence group, the order contains no information:
         (title, abbreviated-title?)   (1)
       is equivalent to
         (abbreviated-title?, title)   (2)
     • Suppose you want to print the abbreviated title if one is coded, and otherwise the full title
     • In stream processing, the quick-and-dirty solution might be as simple as:
         temp = getNextElement; if existsNextElement then write(getNextElement) else write(temp);   (1)
       or
         write(getNextElement);   (2)
     • But what if you decide to change from order (1) to (2)?
     • Or add an optional element toc-title?
         (title, abbreviated-title?, toc-title?)   (1)
         (toc-title?, abbreviated-title?, title)   (2)
     • The simple program breaks

  13. Blessing 4: Schema independent processing (2)
     • In XSLT, you have access to the elements by name, in arbitrary order
     • The style sheet fragment looks like
         <xsl:choose>
           <xsl:when test="./abbreviated-title">
             <xsl:value-of select="abbreviated-title"/>
           </xsl:when>
           <xsl:otherwise>
             <xsl:value-of select="title"/>
           </xsl:otherwise>
         </xsl:choose>
     • If the schema (and documents) change order, the stylesheet remains the same
     • If an optional toc-title is added, the stylesheet remains the same
     • Verbosity turns out to be simpler, in the long run
     • By the way, if sequence matters in the document, it shouldn’t in the schema
     • Reasons to prescribe sequence:
         • to ease input
         • to enforce cardinality
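The contrast between position-based and name-based access can be sketched outside XSLT as well. The Python below is only an illustration under assumptions: the `<doc>` wrapper element, the sample titles and both function names are invented for the example.

```python
import xml.etree.ElementTree as ET

def short_title_by_position(doc):
    """Quick-and-dirty: assumes the order (title, abbreviated-title?)."""
    children = list(doc)
    return (children[1] if len(children) > 1 else children[0]).text

def short_title_by_name(doc):
    """Name-based, like the xsl:choose fragment: the order does not matter."""
    abbreviated = doc.findtext("abbreviated-title")
    return abbreviated if abbreviated is not None else doc.findtext("title")

# Hypothetical documents in order (1) and order (2).
order_1 = ET.fromstring(
    "<doc><title>The Semantic Blessings of XSLT</title>"
    "<abbreviated-title>Blessings</abbreviated-title></doc>")
order_2 = ET.fromstring(
    "<doc><abbreviated-title>Blessings</abbreviated-title>"
    "<title>The Semantic Blessings of XSLT</title></doc>")

for doc in (order_1, order_2):
    print(short_title_by_position(doc), "|", short_title_by_name(doc))
```

The positional version returns the full title as soon as the order changes to (2); the name-based version gives the abbreviated title in both cases.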

  14. Blessing 5: functional programming
     • No variables that can be reassigned
     • Suppose you want to sort items alphabetically and act on each new letter
     • First idea:
         <xsl:variable name="PrevLetter" select="' '" />
         <xsl:for-each select="book">
           <xsl:sort select="title" data-type="text" order="ascending"/>
           <xsl:variable name="ThisLetter" select="substring(title/.[1],1,1)" />
           <xsl:if test="$PrevLetter!=$ThisLetter">
             <H2><xsl:value-of select="$ThisLetter"/></H2>
           </xsl:if>
           <xsl:variable name="PrevLetter" select="$ThisLetter" />
           <H3><xsl:value-of select="title"/></H3>
         </xsl:for-each>
     • No good: an XSLT variable cannot be reassigned. The inner xsl:variable declares a new PrevLetter that exists only within that iteration, so every iteration of the for-each loop starts again with the outer value ' '

  15. Would this work?
         <xsl:for-each select="book">
           <xsl:sort select="title" data-type="text" order="ascending"/>
           <xsl:variable name="PrevLetter" select="substring(preceding-sibling::book[1]/title/.[1],1,1)" />
           <xsl:variable name="ThisLetter" select="substring(title/.[1],1,1)" />
           <xsl:if test="$PrevLetter!=$ThisLetter">
             <H2><xsl:value-of select="$ThisLetter"/></H2>
           </xsl:if>
           <H3><xsl:value-of select="title"/></H3>
         </xsl:for-each>
     • Better, but the preceding-sibling axis operates on the original document order, not on the sorted order...
     • Is that a bug or a feature?
     • It’s a blessing!
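Why looking at the preceding sibling goes wrong can be shown with a small simulation. This is a Python sketch, not XSLT: the list stands in for the book elements in document order, the titles are invented, and `previous_in_document` plays the role of `preceding-sibling::book[1]`.

```python
# Hypothetical titles, in the order the books appear in the document.
titles = ["Ulysses", "Emma", "Dracula", "Utopia"]

def previous_in_document(title):
    """Stand-in for preceding-sibling::book[1]: ignores any sorting."""
    i = titles.index(title)
    return titles[i - 1] if i > 0 else None

headings = []
for title in sorted(titles):            # iterate in sorted order
    prev = previous_in_document(title)  # but look back in document order
    if prev is None or prev[0] != title[0]:
        headings.append(title[0])

print(headings)
```

Here the letter U gets two headings: Utopia follows Ulysses in the sorted output, but its predecessor in the document is Dracula.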

  16. The solution
         <xsl:for-each-group select="book" group-by="substring(title/.[1],1,1)">
           <xsl:sort select="current-grouping-key()" data-type="text" order="ascending"/>
           <H2><xsl:value-of select="current-grouping-key()"/></H2>
           <xsl:for-each select="current-group()">
             <xsl:sort select="title" data-type="text" order="ascending"/>
             <H3><xsl:value-of select="title"/></H3>
           </xsl:for-each>
         </xsl:for-each-group>
     • The xsl:sort on the grouping key puts the letter headings in alphabetical order; without it, groups are processed in order of first appearance
     • Think XML
     • Think in creating hierarchies: groups of titles starting with the same letter
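The same "think in hierarchies" idea can be sketched in Python for readers who do not know XSLT: first build groups of titles that share an initial letter, then emit one heading per group. The titles and the name `index` are assumptions made for the example, and `itertools.groupby` on a sorted list only approximates what xsl:for-each-group does.

```python
from itertools import groupby

# Hypothetical titles, in document order.
titles = ["Ulysses", "Emma", "Dracula", "Utopia"]

# Sort first, then group adjacent titles by their first letter.
index = [(letter, list(group))
         for letter, group in groupby(sorted(titles), key=lambda t: t[0])]

for letter, group in index:
    print(letter)                # corresponds to the H2 heading
    for title in group:
        print("  " + title)      # corresponds to the H3 entries
```

No "previous letter" has to be remembered: each heading belongs to a group, so the loop needs no mutable state.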

  17. The ultimate semantic normalisation
     • “PCDATA considered harmful” (Han Nonnekes, Shell Oil)
     • Text is the surface structure, in a specific language, of a deeper meaning
     • You should encode a text as that deeper tree
     • With references to abstract words (concepts)
     • For each language (“English, upper class, around 1850”) give a dictionary and transformation rules
     • Then generate the text

  18. Questions?
     • Ask me now
     • Ask me during lunch or tea break
     • Ask me during buffet
     • Mail dg@doxatrix.nl
     • Presentation can be downloaded from
         • www.xmlholland2008.nl
         • www.doxatrix.nl/dg

  19. Semantic Blessings of XSLT
