Languages for the Semantic Web: Enhancing Knowledge Representation and Machine Interpretation
This presentation explores the critical role of markup languages, such as XML and RDF, in enabling formalized knowledge representation on the Semantic Web. By examining the layered architecture proposed by Tim Berners-Lee and the functionalities of various languages (DAML, XQuery, etc.), we discuss how structured metadata improves machine comprehension and automation across applications. The talk emphasizes the importance of knowledge capture, linking, and the various markup languages suited for different types of information, promoting a universal format for web-distributed knowledge.
Presentation Transcript
Languages for Semantic Web
Ching-Long Yeh (葉慶隆), Department of Computer Science and Engineering, Tatung University
Email: chingyeh@cse.ttu.edu.tw
URL: www.cse.ttu.edu.tw/chingyeh
Sources
• Harold Boley, Stefan Decker, and Michael Sintek, "Knowledge Markup and Resource Semantics," IJCAI-01 Tutorial, http://www.ijcai-01.org/
• XML Fundamentals, http://www.ibiblio.org/xml/slides/sd2001east/fundamentals/XML_Fundamentals.html
• Anupriya Ankolekar, et al., "DAML-S: Semantic Markup for Web Services," Proceedings of SWWS'01, the First Semantic Web Working Symposium, California, USA, July 30 - August 1, 2001.
Overview
• Increasing demand for formalized knowledge on the Web: AI's chance!
• XML- & RDF-based markup languages provide a 'universal' storage/interchange format for such Web-distributed knowledge representation
• In this talk, we focus on Semantic Web languages: XML, RDF(S), DAML.
[Figure: map of languages and tools around XML and RDF[S], labeled Namespaces, DTDs, Stylesheets, CSS, XSLT, Transformations, Queries, XQL, XQuery, XML-QL, Rules, HornML, RuleML, Frames, SHOE, Agents, DAML, Ontobroker, TopicMaps, Acquisition, Protégé]
Web Languages for Knowledge Capturing
• Human knowledge is (partially) captured on the Web as informal texts, semiformal documents, and structured metadata
• Each kind of knowledge has its (preferred) markup language
Web Languages for Machine Interpretation
• XML (Extensible Markup Language): Semiformal documents range between non-formatted texts and fully formatted databases
• RDF (Resource Description Framework): Structured metadata describe arbitrary heterogeneous Web pages/objects in a homogeneous manner.
• Machines (e.g. search engines) can analyze XML or RDF markups better than full HTML
The Semantic Web Activity of the W3C
• "The Semantic Web is a vision: the idea of having data on the Web defined and linked in a way that it can be used by machines not just for display purposes, but for automation, integration and reuse of data across various applications." (http://www.w3.org/2001/sw/Activity)
The Semantic Web Layered Architecture
Tim Berners-Lee: "Axioms, Architecture and Aspirations," W3C all-working group plenary meeting, 28 February 2001 (http://www.w3.org/2001/Talks/0228-tbl/slide5-0.html)
XML Fundamentals Source: http://www.ibiblio.org/xml/slides/sd2001east/fundamentals/XML_Fundamentals.html
What is XML?
• Extensible Markup Language
• A syntax for documents
• A Meta-Markup Language
• A Structural and Semantic language, not a formatting language
• Not just for Web pages
Extensible Markup Language
• Language
  • It has a grammar
  • It has a vocabulary (sort of)
  • It can be parsed by machines
• Markup Language
  • It says what things are; not what they do
  • It is not a programming language
  • It is not compiled
• Extensible
  • You can add words to the language
XML is a Meta Markup Language
• Not like HTML, troff, LaTeX
• Make up the tags you need as you need them
• The tags you create can be documented in a Document Type Definition (DTD)
• A meta syntax for domain-specific markup languages like MusicML, MathML, and XHTML
XML Applications
• A specific markup language that uses the XML meta-syntax is called an XML application
• Different XML applications have their own more constricted syntaxes and vocabularies within the broader XML syntax
• Further syntax can be layered on top of this; e.g. data typing through schemas
XML describes structure and semantics, not formatting
• XML documents form a tree
  • Document Object Model (DOM)
• Element and attribute names reflect the kind of the element
  • DTD, Schema
• Formatting can be added with a style sheet
  • Cascading Style Sheets (CSS)
  • Extensible Stylesheet Language (XSL)
XML Hypertext
• A Uniform Resource Identifier (URI) names or locates a resource
• An XLink defines connections between two or more documents identified by URIs
• XPath identifies particular nodes within a document
• An XPointer adds an XPath to a URI
• XBase defines the URI against which relative URIs are resolved
• XInclude embeds a document identified by a URI inside an XML document.
A Song Description in HTML
  <dt>Hot Cop
  <dd> by Jacques Morali, Henri Belolo, and Victor Willis
  <ul>
    <li>Producer: Jacques Morali
    <li>Publisher: PolyGram Records
    <li>Length: 6:20
    <li>Written: 1978
    <li>Artist: Village People
  </ul>
A Song Description in XML
  <SONG>
    <TITLE>Hot Cop</TITLE>
    <COMPOSER>Jacques Morali</COMPOSER>
    <COMPOSER>Henri Belolo</COMPOSER>
    <COMPOSER>Victor Willis</COMPOSER>
    <PRODUCER>Jacques Morali</PRODUCER>
    <PUBLISHER>PolyGram Records</PUBLISHER>
    <LENGTH>6:20</LENGTH>
    <YEAR>1978</YEAR>
    <ARTIST>Village People</ARTIST>
  </SONG>
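Because the XML version names each piece of information, a program can pull fields out by element name instead of guessing from list text. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the variable names are illustrative and not part of the original slides.

```python
import xml.etree.ElementTree as ET

# The song document from the slide, embedded as a string.
SONG_XML = """<SONG>
  <TITLE>Hot Cop</TITLE>
  <COMPOSER>Jacques Morali</COMPOSER>
  <COMPOSER>Henri Belolo</COMPOSER>
  <COMPOSER>Victor Willis</COMPOSER>
  <PRODUCER>Jacques Morali</PRODUCER>
  <PUBLISHER>PolyGram Records</PUBLISHER>
  <LENGTH>6:20</LENGTH>
  <YEAR>1978</YEAR>
  <ARTIST>Village People</ARTIST>
</SONG>"""

# Parsing yields a tree whose root is the SONG element.
song = ET.fromstring(SONG_XML)

# Element names tell us what each value is, so lookups are by name.
title = song.findtext("TITLE")
composers = [c.text for c in song.findall("COMPOSER")]
year = int(song.findtext("YEAR"))
```

The same lookups against the HTML version would have to parse strings such as "Written: 1978", which is the contrast the two slides draw.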
Style Sheets Provide Formatting (CSS)
  SONG {display: block; font-family: "New York", "Times New Roman", serif}
  TITLE {display: block; font-size: 24pt; font-weight: bold; font-family: Helvetica, sans-serif}
  COMPOSER {display: block}
  PRODUCER {display: block}
  YEAR {display: block}
  PUBLISHER {display: block}
  LENGTH {display: block}
  ARTIST {display: block; font-style: italic}
Attaching Style Sheets to Documents
  <?xml-stylesheet type="text/css" href="song.css"?>
  <SONG>
    <TITLE>Hot Cop</TITLE>
    <COMPOSER>Jacques Morali</COMPOSER>
    <COMPOSER>Henri Belolo</COMPOSER>
    <COMPOSER>Victor Willis</COMPOSER>
    <PRODUCER>Jacques Morali</PRODUCER>
    <PUBLISHER>PolyGram Records</PUBLISHER>
    <LENGTH>6:20</LENGTH>
    <YEAR>1978</YEAR>
    <ARTIST>Village People</ARTIST>
  </SONG>
An XSLT Stylesheet (Part 1)
  <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="/">
      <html>
        <head><title>Song</title></head>
        <body>
          <xsl:apply-templates select="SONG"/>
        </body>
      </html>
    </xsl:template>
An XSLT Stylesheet (Part 2)
    <xsl:template match="SONG">
      <h1>
        <xsl:value-of select="TITLE"/> by the <xsl:value-of select="ARTIST"/>
      </h1>
      <ul>
        <li>Length: <xsl:value-of select="LENGTH"/></li>
        <li>Producer: <xsl:value-of select="PRODUCER"/></li>
        <li>Publisher: <xsl:value-of select="PUBLISHER"/></li>
        <li>Year: <xsl:value-of select="YEAR"/></li>
        <xsl:apply-templates select="COMPOSER"/>
      </ul>
    </xsl:template>
    <xsl:template match="COMPOSER">
      <li>Composer: <xsl:value-of select="."/></li>
    </xsl:template>
  </xsl:stylesheet>
Transforming the Document
[Figure: the XSL document (template rules) and the XML document are fed to an XSLT processor (IE 5), which produces the output]
Output:
  <html>
    <head>
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
      <title>Song</title>
    </head>
    <body>
      <h1>Hot Cop by the Village People </h1>
      <ul>
        <li>Length: 6:20</li>
        <li>Producer: Jacques Morali</li>
        <li>Publisher: PolyGram Records</li>
        <li>Year: 1978</li>
        <li>Composer: Jacques Morali</li>
        <li>Composer: Henri Belolo</li>
        <li>Composer: Victor Willis</li>
      </ul>
    </body>
  </html>
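To make the template-rule idea concrete, here is a rough Python analogue of the stylesheet above. It is not an XSLT processor: each function simply plays the role of one template rule, and the function names are hypothetical. It assumes only the standard library.

```python
import xml.etree.ElementTree as ET
from html import escape

SONG_XML = """<SONG>
  <TITLE>Hot Cop</TITLE>
  <COMPOSER>Jacques Morali</COMPOSER>
  <COMPOSER>Henri Belolo</COMPOSER>
  <COMPOSER>Victor Willis</COMPOSER>
  <PRODUCER>Jacques Morali</PRODUCER>
  <PUBLISHER>PolyGram Records</PUBLISHER>
  <LENGTH>6:20</LENGTH>
  <YEAR>1978</YEAR>
  <ARTIST>Village People</ARTIST>
</SONG>"""


def render_composer(node):
    # Plays the role of <xsl:template match="COMPOSER">.
    return f"<li>Composer: {escape(node.text or '')}</li>"


def render_song(song):
    # Plays the role of <xsl:template match="SONG">.
    def value_of(tag):
        return escape(song.findtext(tag, default=""))

    items = [
        f"<li>Length: {value_of('LENGTH')}</li>",
        f"<li>Producer: {value_of('PRODUCER')}</li>",
        f"<li>Publisher: {value_of('PUBLISHER')}</li>",
        f"<li>Year: {value_of('YEAR')}</li>",
    ]
    # Equivalent of <xsl:apply-templates select="COMPOSER"/>.
    items += [render_composer(c) for c in song.findall("COMPOSER")]
    heading = f"<h1>{value_of('TITLE')} by the {value_of('ARTIST')}</h1>"
    return heading + "<ul>" + "".join(items) + "</ul>"


def render_document(xml_text):
    # Plays the role of <xsl:template match="/">.
    root = ET.fromstring(xml_text)
    body = render_song(root) if root.tag == "SONG" else ""
    return f"<html><head><title>Song</title></head><body>{body}</body></html>"


html_output = render_document(SONG_XML)
```

The point of the sketch is the separation of concerns: the XML stays free of presentation, and the mapping to HTML lives entirely in the rules.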
A DTD for Songs
  <!ELEMENT SONG (TITLE, COMPOSER+, PRODUCER*, PUBLISHER*, LENGTH?, YEAR?, ARTIST+)>
  <!ELEMENT TITLE (#PCDATA)>
  <!ELEMENT COMPOSER (#PCDATA)>
  <!ELEMENT PRODUCER (#PCDATA)>
  <!ELEMENT PUBLISHER (#PCDATA)>
  <!ELEMENT LENGTH (#PCDATA)>
  <!-- This should be a four digit year like "1999", not a two-digit year like "99" -->
  <!ELEMENT YEAR (#PCDATA)>
  <!ELEMENT ARTIST (#PCDATA)>
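The SONG content model reads like a regular expression over child element names, and that is one way to check it by hand. The sketch below is an illustration only, not a DTD validator: it assumes a flat, comma-separated model of plain element names with optional +, * or ? suffixes, and the helper names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Content model of SONG, copied from the DTD above.
SONG_MODEL = "TITLE, COMPOSER+, PRODUCER*, PUBLISHER*, LENGTH?, YEAR?, ARTIST+"


def model_to_regex(model):
    """Turn a flat sequence model into a regex over 'NAME ' tokens."""
    parts = []
    for item in model.split(","):
        item = item.strip()
        if item[-1] in "+*?":
            name, occurrence = item[:-1], item[-1]
        else:
            name, occurrence = item, ""
        parts.append(f"(?:{re.escape(name)} ){occurrence}")
    return re.compile("".join(parts))


def children_match(element, pattern):
    """Check the sequence of child tag names against the compiled model."""
    sequence = "".join(child.tag + " " for child in element)
    return pattern.fullmatch(sequence) is not None


SONG_PATTERN = model_to_regex(SONG_MODEL)

valid_song = ET.fromstring(
    "<SONG><TITLE>Hot Cop</TITLE><COMPOSER>Jacques Morali</COMPOSER>"
    "<YEAR>1978</YEAR><ARTIST>Village People</ARTIST></SONG>"
)
missing_artist = ET.fromstring(
    "<SONG><TITLE>Hot Cop</TITLE><COMPOSER>Jacques Morali</COMPOSER></SONG>"
)
```

Here children_match(valid_song, SONG_PATTERN) holds, while the second document fails because ARTIST+ requires at least one ARTIST. As the DTD comment notes, the four-digit year rule cannot be expressed this way, since #PCDATA accepts any text.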
Well-formedness
• Rules:
  • Open and close all tags
  • Empty tags end with />
  • There is a unique root element
  • Elements may not overlap
  • Attribute values are quoted
  • < and & are only used to start tags and entities
  • Only the five predefined entity references are used
  • Plus more...
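Any conforming XML parser rejects a document that breaks these rules, so a well-formedness check can be as small as attempting a parse. A minimal sketch with Python's standard library follows; is_well_formed is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# One sample per rule from the slide.
samples = {
    "nested correctly": "<a><b></b></a>",
    "overlapping elements": "<a><b></a></b>",
    "unquoted attribute value": "<a x=1/>",
    "two root elements": "<a/><b/>",
    "bare ampersand": "<a>AT&T</a>",
    "escaped ampersand": "<a>AT&amp;T</a>",
}
results = {label: is_well_formed(doc) for label, doc in samples.items()}
```

Only the first and last samples are accepted; the other four each violate one of the listed rules.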
Validity
• To be valid, an XML document must:
  • be well-formed
  • have a Document Type Definition (DTD)
  • comply with the constraints specified in the DTD
What Is XML Used for?
• Domain-Specific Markup Languages
  • XML in industrial applications: http://www.xml.org/xml/industry_industrysectors.jsp
• Self-Describing Data
  • Much data is lost due to format problems.
• Interchange of Data Among Applications
  • Electronic business: RosettaNet, ebXML
XML Namespaces
• XML namespaces are akin to namespaces, packages, and modules in programming languages
• Disambiguation of tag and attribute names from different XML applications ("spaces") through different prefixes
• A prefix is separated from the local name by a ":", obtaining prefix:name tags
• Namespaces constitute a layer on top of XML 1.0, since prefix:name is again a valid tag name and namespace bindings are ignored by some tools
Namespace Bindings
• Prefixes are bound to namespace URIs by attaching an xmlns:prefix attribute to the prefixed element prefix:name_1, ..., prefix:name_n or to one of its ancestors
• The value of the xmlns:prefix attribute is a URI, which may or (unlike for DTDs!) may not point to a description of the namespace's syntax
• An element can use bindings for multiple namespaces via attributes xmlns:prefix_1, ..., xmlns:prefix_m
Two-Namespace Example: Snail-Mail and Telecoms Address Parts
  <mail:address xmlns:mail="http://www.deutschepost.de/"
                xmlns:tele="http://www.telekom.de/">
    <mail:name>Xaver M. Linde</mail:name>
    <mail:street>Wikingerufer 7</mail:street>
    <mail:town>10555 Berlin</mail:town>
    <mail:bill>12.50</mail:bill>
    <tele:phone>030/1234567</tele:phone>
    <tele:phone>030/1234568</tele:phone>
    <tele:fax>030/1234569</tele:fax>
    <tele:bill>76.20</tele:bill>
  </mail:address>
Note: the two bill elements are disambiguated through the mail and tele prefixes
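A namespace-aware parser resolves each prefix to its URI, so the two bill elements end up with different expanded names. The sketch below shows this with Python's xml.etree.ElementTree on a shortened copy of the address document; the NS mapping and variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Shortened version of the address document from the slide.
ADDRESS_XML = """<mail:address xmlns:mail="http://www.deutschepost.de/"
              xmlns:tele="http://www.telekom.de/">
  <mail:name>Xaver M. Linde</mail:name>
  <mail:bill>12.50</mail:bill>
  <tele:phone>030/1234567</tele:phone>
  <tele:bill>76.20</tele:bill>
</mail:address>"""

# Prefix-to-URI map used for lookups; the prefixes here are local to this
# code and only need to map to the same URIs as in the document.
NS = {
    "mail": "http://www.deutschepost.de/",
    "tele": "http://www.telekom.de/",
}

address = ET.fromstring(ADDRESS_XML)

# Same local name "bill", two different namespaces, two different values.
mail_bill = address.findtext("mail:bill", namespaces=NS)
tele_bill = address.findtext("tele:bill", namespaces=NS)

# ElementTree stores tags in expanded form: {namespace-uri}localname.
expanded_tags = [child.tag for child in address]
```

The expanded form makes it visible that the prefix itself carries no meaning; only the URI it is bound to does.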
Resource Description Framework (RDF)
Source: Knowledge Markup and Resource Semantics, by Harold Boley, Stefan Decker, and Michael Sintek, IJCAI-01 Tutorial, http://www.ijcai-01.org/
Outline
• Motivation: Why XML is not enough
• Introduction to RDF
  • Requirements for KR on the Web
  • The RDF Data Model
• RDF Schema
• Extensions of RDF(S)
• Tools for RDF and RDF Schema
  • Parser, Query, and Inference Engines
Why The Shift Towards More Semantics?
• Information overload
  • Information on the Web is currently aimed at human consumption
  • Consuming information takes too much time
• Search engines fail more and more
  • Their combined coverage is less than 42% of the HTML Web
• Data interchange is growing (e.g. B2B)
  • It needs a common semantics
Extensible Markup Language (XML) Revisited
• Key idea: separate structure from presentation
• XML DTDs or Schemas define document structure
• Replace HTML with two things
  • A domain-specific markup language (defined in XML)
  • A map from that markup language to HTML (defined using XSLT)
• DTD enables document recipients to tell whether they've received a grammar-conforming document
  • Gives a minimal level of validation
Why XML is Not Enough
• The main advantage of using XML is reuse of the parser and of document validation
• There are many different ways to encode a domain of discourse
• This leads to difficulties when foreign documents have to be understood
==> Next step: separate content from structure!
Encoding of Knowledge: Example
"The Creator of the Resource http://www.w3.org/Home/Lassila is Ora Lassila"
[Figure: http://www.w3.org/Home/Lassila --Creator--> Ora Lassila]
Endless encoding possibilities in XML:
  <Creator>
    <uri>http://www.w3.org/Home/Lassila</uri>
    <name>Ora Lassila</name>
  </Creator>

  <Document uri="http://www.w3.org/Home/Lassila">
    <Creator>Ora Lassila</Creator>
  </Document>

  <Document uri="http://www.w3.org/Home/Lassila" Creator="Ora Lassila"/>
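The practical cost of these encoding choices is that each one needs its own extraction code, even though all three carry the same statement. The sketch below, using only Python's standard library, recovers one (resource, property, value) triple from each encoding; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# The three encodings from the slide, with the tags closed properly.
ENCODING_1 = (
    "<Creator><uri>http://www.w3.org/Home/Lassila</uri>"
    "<name>Ora Lassila</name></Creator>"
)
ENCODING_2 = (
    '<Document uri="http://www.w3.org/Home/Lassila">'
    "<Creator>Ora Lassila</Creator></Document>"
)
ENCODING_3 = '<Document uri="http://www.w3.org/Home/Lassila" Creator="Ora Lassila"/>'


# Each encoding needs different, hand-written knowledge of where the
# resource, the property, and the value are located in the tree.
def triple_from_encoding_1(root):
    return (root.findtext("uri"), "Creator", root.findtext("name"))


def triple_from_encoding_2(root):
    return (root.get("uri"), "Creator", root.findtext("Creator"))


def triple_from_encoding_3(root):
    return (root.get("uri"), "Creator", root.get("Creator"))


triples = {
    triple_from_encoding_1(ET.fromstring(ENCODING_1)),
    triple_from_encoding_2(ET.fromstring(ENCODING_2)),
    triple_from_encoding_3(ET.fromstring(ENCODING_3)),
}
```

All three extractors produce the same triple, so the set holds a single element. A recipient who has not been told which encoding is in use cannot know which extractor to apply, which is the difficulty the slide points at.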
Point-to-Point Communication for Machine-Understandable Data
[Figure: a conceptual domain model (objects and relations, e.g. Person is_a Mammal, Student is_a Person) goes through a translation step into a DTD or XML Schema (<xsd:schema xmlns:xsd="http://..."> <xsd:annotation> A-Schema </xsd:...</xsd:schema>), which is deployed to both parties. A sender using DTD A and a recipient using DTD A exchange XML, which an XML parser turns into a parse tree; the shared DTD A gives them a common semantics.]
Many Previously Unknown Communication Partners
New Partners Don't Understand Each Other
[Figure: the sender and recipient using DTD A still communicate through the XML parser and parse tree, but communication partners using DTD B and DTD C cannot interpret the exchange (links marked "?").]
Merging Steps Between Models
[Figure: translating between DTD A and DTD B, each with its own schema: <xsd:schema xmlns:xsd="http://..."> <xsd:annotation>A-Schema </xsd:...</xsd:schema> and <xsd:schema xmlns:xsd="http://..."> <xsd:annotation>B-Schema </xsd:...</xsd:schema>]
Steps:
• Reengineering of the conceptual model
• Matching
• XML document translation generation (e.g. in XSLT): <xsl:stylesheet version="1.0" xmlns:xsl="http://....Transform" <xsl:template match="/"> .... </xsl:template> </xsl:stylesheet>
• XML document translation from DTD A to DTD B (and B to A)
Merging/Aligning Models
• The reengineering step is costly, and unnecessary when a conceptual language is in use
• Generating document translation procedures is likewise complicated and unnecessary
==> Use a level on top of XML
• What are the requirements for such a level?
Postulates: Fundamental Requirements for KR on the Web
1. Knowledge on the Web is distributed (link knowledge on the Web)
2. Knowledge on the Web is biased: there is no universal truth, so it must be possible to dispute statements
3. Many different user communities: extensibility and simplicity
==> Resource Description Framework (RDF)
Introduction to RDF
• RDF (Resource Description Framework)
• Beyond machine readable to machine understandable
• RDF unites a wide variety of stakeholders:
  • Digital librarians, content-raters, privacy advocates, B2B industries, AI...
• Significant (but less than XML) industrial momentum, led by W3C
• RDF consists of two parts
  • RDF Model (a set of triples)
  • RDF Syntax (different XML serialization syntaxes)
• RDF Schema for definition of Vocabularies (simple Ontologies) for RDF (and in RDF)
RDF Data Model
• Resources
  • A resource is a thing you talk about (can reference)
  • Resources have URIs
  • RDF definitions are themselves Resources (linkage, see requirement 1)
• Properties
  • Slots; they define relationships to other resources or atomic values
• Statements
  • "Resource has Property with Value"
  • (Values can be resources or atomic XML data)
• Similar to Frame Systems
A Simple Example
• Statement
  • "Ora Lassila is the creator of the resource http://www.w3.org/Home/Lassila"
• Structure
  • Resource (subject): http://www.w3.org/Home/Lassila
  • Property (predicate): http://www.schema.org/#Creator
  • Value (object): "Ora Lassila"
• Directed graph
  • http://www.w3.org/Home/Lassila --s:Creator--> Ora Lassila
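In code, such a statement is just a three-field record. The sketch below models it as a Python NamedTuple; the class name Statement and the helper describe are illustrative choices, not part of RDF or of the slides.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    """One RDF statement: "Resource has Property with Value"."""
    subject: str    # the resource being described
    predicate: str  # the property
    object: str     # the value (a resource URI or a literal)


# The statement from the slide.
stmt = Statement(
    subject="http://www.w3.org/Home/Lassila",
    predicate="http://www.schema.org/#Creator",
    object="Ora Lassila",
)


def describe(s):
    """Render a statement in the slide's "has Property with Value" form."""
    return f"{s.subject} has {s.predicate} with value {s.object!r}"


sentence = describe(stmt)
```

Unlike the XML encodings earlier in the talk, there is only one shape to agree on: every statement has exactly these three parts.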
Another Example
• To add properties to Creator, point through an intermediate Resource.
[Figure, as edges:]
  http://www.w3.org/Home/Lassila --s:Creator--> Person://fi/654645635
  Person://fi/654645635 --Name--> Ora Lassila
  Person://fi/654645635 --Email--> lassila@w3.org
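With the creator represented as its own resource, looking up the creator's e-mail address becomes a two-step walk through the graph. The following sketch stores the three edges as plain tuples; the helper names objects and creator_property are hypothetical.

```python
# The three statements from the slide as (subject, predicate, object).
TRIPLES = {
    ("http://www.w3.org/Home/Lassila", "s:Creator", "Person://fi/654645635"),
    ("Person://fi/654645635", "Name", "Ora Lassila"),
    ("Person://fi/654645635", "Email", "lassila@w3.org"),
}


def objects(subject, predicate, triples=TRIPLES):
    """All values of a property for one resource."""
    return [o for (s, p, o) in triples if s == subject and p == predicate]


def creator_property(page, prop):
    """Follow s:Creator to the intermediate resource, then read a property."""
    return [
        value
        for person in objects(page, "s:Creator")
        for value in objects(person, prop)
    ]


emails = creator_property("http://www.w3.org/Home/Lassila", "Email")
names = creator_property("http://www.w3.org/Home/Lassila", "Name")
```

The intermediate resource is what lets several properties (name, e-mail) attach to the same creator instead of to the page.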
Collection Containers
• Multiple occurrences of the same PropertyType don't establish a relation between the values
  • The Millers own a boat, a bike, and a TV set
  • The Millers need (a car or a truck)
  • (Sarah and Bob) bought a new car
• RDF defines three special Resources:
  • Bag: unordered values (rdf:Bag)
  • Sequence: ordered values (rdf:Seq)
  • Alternative: single value (rdf:Alt)
• Core RDF does not enforce 'set' semantics amongst values
Example: Bag
• The students in course 6.001 are Amy, Tim, John, Mary, and Sue
[Figure, as edges:]
  /courses/6.001 --students--> bagid1
  bagid1 --rdf:type--> rdf:Bag
  bagid1 --rdf:_1--> /Students/Amy
  bagid1 --rdf:_2--> /Students/Tim
  bagid1 --rdf:_3--> /Students/John
  bagid1 --rdf:_4--> /Students/Mary
  bagid1 --rdf:_5--> /Students/Sue
Example: Alternative
• The source code for X11 may be found at ftp.x.org, ftp.cs.purdue.edu, or ftp.eu.net
[Figure, as edges:]
  http://x.org/package/X11 --source--> altid
  altid --rdf:type--> rdf:Alt
  altid --rdf:_1--> ftp.x.org
  altid --rdf:_2--> ftp.cs.purdue.edu
  altid --rdf:_3--> ftp.eu.net
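Both container examples follow the same pattern: one rdf:type statement on the container resource plus one numbered membership statement per value. A small sketch that generates those statements is given below; the function name container_triples is hypothetical, and triples are written as (subject, predicate, object).

```python
CONTAINER_KINDS = ("rdf:Bag", "rdf:Seq", "rdf:Alt")


def container_triples(container_id, kind, members):
    """Build the type statement and the rdf:_1, rdf:_2, ... statements."""
    if kind not in CONTAINER_KINDS:
        raise ValueError(f"unknown container kind: {kind}")
    triples = [(container_id, "rdf:type", kind)]
    for n, member in enumerate(members, start=1):
        triples.append((container_id, f"rdf:_{n}", member))
    return triples


# The Bag example: students of course 6.001.
students = ["/Students/Amy", "/Students/Tim", "/Students/John",
            "/Students/Mary", "/Students/Sue"]
course_graph = [("/courses/6.001", "students", "bagid1")]
course_graph += container_triples("bagid1", "rdf:Bag", students)

# The Alternative example: download sites for X11.
sites = ["ftp.x.org", "ftp.cs.purdue.edu", "ftp.eu.net"]
x11_graph = [("http://x.org/package/X11", "source", "altid")]
x11_graph += container_triples("altid", "rdf:Alt", sites)
```

The numbering is present for every container kind; whether the order is meaningful depends on the declared type (it is for rdf:Seq, not for rdf:Bag).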
Statements About Statements (Requirement 2: Dispute Statements)
• Making statements about statements requires a process for transforming them into Resources
  • subject: the original resource
  • predicate: the original property
  • object: the original value
  • type: rdf:Statement
A Formal Model of RDF
• RDF itself is mathematically straightforward:
• Basic Definitions
  • $Resources$
  • $Properties \subseteq Resources$
  • $Literals$
  • $Statements = Properties \times Resources \times (Resources \cup Literals)$
• Typing
  • $\texttt{rdf:type} \in Properties$
  • $\{\texttt{RDF:type}, sub, obj\} \in Statements \Rightarrow obj \in Resources$
• For triples like {p,r1,r2} the RDF spec should use some different bracketing, like (p,r1,r2)
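These definitions translate almost directly into set-membership checks. The sketch below uses tiny example sets; their contents, including the class name s:Person, are assumptions made for illustration. Triples follow the slide's (pred, sub, obj) order.

```python
# Illustrative universe; Properties must be a subset of Resources.
RESOURCES = {
    "http://www.w3.org/Home/Lassila",
    "Person://fi/654645635",
    "rdf:type",
    "s:Creator",
    "s:Person",  # hypothetical class, not from the slides
}
PROPERTIES = {"rdf:type", "s:Creator"}
LITERALS = {"Ora Lassila"}


def is_statement(triple):
    """Membership in Properties x Resources x (Resources U Literals)."""
    pred, sub, obj = triple
    return (
        pred in PROPERTIES
        and sub in RESOURCES
        and (obj in RESOURCES or obj in LITERALS)
    )


def respects_typing(triple):
    """If the predicate is rdf:type, the object must be a resource."""
    pred, _sub, obj = triple
    return pred != "rdf:type" or obj in RESOURCES


good = ("s:Creator", "http://www.w3.org/Home/Lassila", "Ora Lassila")
bad_predicate = ("Ora Lassila", "http://www.w3.org/Home/Lassila", "s:Creator")
typed = ("rdf:type", "Person://fi/654645635", "s:Person")
bad_type = ("rdf:type", "Person://fi/654645635", "Ora Lassila")
```

Using tuples rather than sets for the triples also sidesteps the bracketing issue raised in the last bullet, since the position of each element matters.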
Formal Model of RDF II
• Reification
  • $\texttt{rdf:Statement} \in Resources \setminus Properties$
  • $\{\texttt{rdf:predicate}, \texttt{rdf:subject}, \texttt{rdf:object}\} \subseteq Properties$
  • Reification of a triple {pred, sub, obj} of Statements is an element r of Resources representing the reified triple and the elements s1, s2, s3, and s4 of Statements such that
    • s1: {RDF:predicate, r, pred}
    • s2: {RDF:subject, r, sub}
    • s3: {RDF:object, r, obj}
    • s4: {RDF:type, r, [RDF:Statement]}
• Collections
  • $\{\texttt{RDF:Seq}, \texttt{RDF:Bag}, \texttt{RDF:Alt}\} \subseteq Resources \setminus Properties$
  • There is a subset of Properties corresponding to the ordinals (1, 2, 3, ...) called Ord. We refer to elements of Ord as RDF:_1, RDF:_2, RDF:_3, ...
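The reification rule can be sketched as a function that maps one triple to the four statements s1 to s4. Triples are kept in the slide's (pred, sub, obj) order; the node identifier stmt1 and the property ex:disputedBy are hypothetical names used only to show how a dispute could then be attached.

```python
def reify(statement, node_id):
    """Return s1..s4 describing `statement` as the resource `node_id`."""
    pred, sub, obj = statement
    return [
        ("rdf:predicate", node_id, pred),          # s1
        ("rdf:subject", node_id, sub),             # s2
        ("rdf:object", node_id, obj),              # s3
        ("rdf:type", node_id, "rdf:Statement"),    # s4
    ]


# The running example in (pred, sub, obj) order.
original = ("s:Creator", "http://www.w3.org/Home/Lassila", "Ora Lassila")
reified = reify(original, "stmt1")

# Once the statement is a resource, other statements can refer to it,
# for example to dispute it (requirement 2). The property name is made up.
disputed = reified + [("ex:disputedBy", "stmt1", "Person://fi/654645635")]
```

Note that reifying a statement does not assert it: the four statements only describe the original triple.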