
The Semantic Web – introduction to the basic technology Week 2 - XML

Presentation Transcript


  1. The Semantic Web – introduction to the basic technology. Week 2 - XML. Lee McCluskey

  2. Recap
     • The Semantic Web (SW) is a vision (not a current reality) of an internet whose resources are machine-understandable, or accessible to automated processes. Machines should do much more than present information visually or perform human-consumable information retrieval (IR).
     • Central idea: we agree on a way of SPECIFYING vocabularies rather than agreeing on particular vocabularies/languages. In communication, processes then only need to point to the language (vocabulary) they are using. This is much more flexible than a common language.
     • XML is like a “machine code” in the SW.
     • Processes on the SW will need to perform reasoning to fully exploit it, e.g. for knowledge acquisition.

  3. WWW
     • A tool for people to access information
     • Interface to certain (online) databases, and to businesses
     • Human interface to some services (information retrieval, weather, train timetables etc.)
     The WWW is successful largely through its layers of internationally accepted standards (TCP/IP, HTML), and now because it is
     • Ubiquitous
     • Organic + Distributed
     • Dynamic + Unbounded

  4. WWW - a standard
     - ‘first generation’ - hand-written HTML pages
     - ‘second generation’ - dynamic web - pages created by programs to display the results of a process, or the output of a query on a database.
     Web pages are used as an interface to networked processes (services) as well as for general information display.

  5. WWW
     + Much R&D has been directed at writing programs/services that utilise HTML web information, e.g. the travel assistant from the University of Southern California's Information Sciences Institute (ISI): a web service that uses other web services (weather, timetables, hotels) to make travel plans in response to a high-level directive such as “I need to be in X on days Y using budget Z”.
     BUT: this is very hard because the web's data is unstructured. E.g. ISI's travel assistant has to use a learning program to induce web page ‘wrappers’ before it can reliably extract data.

  6. WWW html example
     <html>
     <head><title> Lee McCluskey </title></head>
     <body bgcolor="#ffffff">
     <h1> McCluskey, Thomas Leo </h1> <br>
     BSc (Maths), MSc (Maths), PhD (Computer Science), MBCS, C.Eng <br>
     Professor of Software Technology <br> <br>
     School of Computing and Engineering, <br>
     University of Huddersfield, <br>
     Huddersfield, <br>
     West Yorkshire, <br>
     HD1 3DH, <br>
     United Kingdom.
     <p>
     <b>email:</b> t.l.mccluskey followed by @hud.ac.uk <br>
     <b>telephone (direct):</b> (+44) (0) 1484 472247 <br>
     <b>telephone (internal):</b> 2247 <br>
     <b>telephone (messages):</b> (+44) (0) 1484 472150 <br>
     <b>fax:</b> (+44) (0) 1484 421106 <br>
     <b>room number:</b> CW2/09
     </p>
     </body>
     </html>
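To make the "wrapper" problem concrete, here is a minimal Python sketch of a hand-written extractor for a fragment of the page above. The function name `scrape_field` and the regular expression are assumptions made for illustration; they are not how ISI's learned wrappers work. The point is only that the extractor depends on presentation details (bold labels, `<br>` breaks) and not on any declared meaning.

```python
import re
from typing import Optional

# Fragment of the HTML page from the slide. The bold labels are presentation
# only; nothing tells a program that a value is a fax number or a room.
HTML = (
    "<b>telephone (direct):</b> (+44) (0) 1484 472247 <br> "
    "<b>fax:</b> (+44) (0) 1484 421106 <br> "
    "<b>room number:</b> CW2/09"
)


def scrape_field(html: str, label: str) -> Optional[str]:
    """Hypothetical hand-written 'wrapper': return the text after a bold label."""
    pattern = r"<b>" + re.escape(label) + r":</b>\s*(.*?)\s*(?:<br>|</p>|$)"
    match = re.search(pattern, html)
    return match.group(1) if match else None


fax = scrape_field(HTML, "fax")
room = scrape_field(HTML, "room number")
missing = scrape_field(HTML, "email")  # label not in the fragment
```

If the page author changes `<b>` to `<strong>`, or rewords a label, this extractor silently returns nothing, which is the fragility the slide describes.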

  7. Metadata and XML
     • We can start giving ‘meaning’ to information on the web using META-DATA, e.g. using tags around data to describe its content.
     • In XML (eXtensible Mark-up Language) tags are not fixed: one can invent new tags to structure the information in a web page.
     • XML is considered to be the basis for all Semantic Web languages: the “machine code” of the new-generation web.

  8. Rough Hierarchy of Languages in the Semantic Web
     OWL  .. ontology language
     DAML .. gives logic
     RDFS .. gives classes
     RDF  .. gives tuples
     XML  .. gives content

  9. XML Overview
     • XML is a subset of SGML (Standard Generalized Markup Language), which was originally written for electronic documents and publications.
     • XML has the advantages of HTML: it is platform-independent and a standardised language, see http://www.w3.org/TR/REC-xml/
     But HTML has a FIXED set of tags, and holds no MEANING about the data in its document.

  10. Rough syntax of XML = list of <name attributes> content </name>
      • XML structures information using TAGS in a composite fashion, e.g.
        <someTag> …… </someTag>
        <someTag Attribute="Value"> …… </someTag>
      • A start tag, its matching end tag and everything between them together form an “element”; the information between the tags is the element's content.

  11. XML
      • XML allows content to be structured so that it is easy for a machine to extract meaningful data from an XML page. It is a meta-language: a language used to describe other languages.
      • It can be used to structure data in a database, or as a communication language.
      • It can be formatted using a style sheet language called XSL (like CSS for HTML).

  12. Example
      <?xml version="1.0"?>
      <email date="30/09/04">
        <to>fred</to>
        <from>sue</from>
        <subject>xml example</subject>
        <message>This is the message</message>
      </email>
      • Every start tag has a matching end tag
      • Tags must be correctly nested, so that the document forms a tree
      • Tags can have attributes
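The three rules above can be tried out with a short Python sketch using the standard library's `xml.etree.ElementTree`. The helper `is_well_formed` and the deliberately mis-nested string `BAD_XML` are illustrative assumptions, not part of the slides.

```python
import xml.etree.ElementTree as ET

EMAIL_XML = """<?xml version="1.0"?>
<email date="30/09/04">
  <to>fred</to>
  <from>sue</from>
  <subject>xml example</subject>
  <message>This is the message</message>
</email>"""

# Parsing yields a tree: <email> is the root, its children are the fields.
root = ET.fromstring(EMAIL_XML)
sent_on = root.get("date")                          # attribute of <email>
fields = {child.tag: child.text for child in root}  # tag name -> text content


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Tags closed in the wrong order do not form a tree, so parsing fails.
BAD_XML = "<email><to>fred</email></to>"
```

Here `fields["from"]` gives the sender directly by tag name; no knowledge of the page layout is needed.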

  13. Example - better
      <?xml version="1.0"?>
      <email>
        <to>fred</to>
        <from>sue</from>
        <date>
          <day>30</day>
          <month>9</month>
          <year>2004</year>
        </date>
        <subject>xml example</subject>
        <message>this is the message</message>
      </email>
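One way to see why this version is "better": with the date split into tagged parts, a program does not have to guess whether a string like 30/09/04 is day/month/year or some other order. A minimal Python sketch, assuming the standard library parser and an abbreviated copy of the document above:

```python
import datetime
import xml.etree.ElementTree as ET

EMAIL_XML = """<?xml version="1.0"?>
<email>
  <to>fred</to>
  <from>sue</from>
  <date>
    <day>30</day>
    <month>9</month>
    <year>2004</year>
  </date>
  <subject>xml example</subject>
  <message>this is the message</message>
</email>"""

root = ET.fromstring(EMAIL_XML)
date_el = root.find("date")

# Each part of the date is addressed by name, so no string format is assumed.
sent = datetime.date(
    int(date_el.findtext("year")),
    int(date_el.findtext("month")),
    int(date_el.findtext("day")),
)
```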

  14. Elements .. Logically every element has four key pieces:
      • A name
      • The attributes of the element
      • The namespaces in scope on the element
      • The content of the element
      The content can be text, comments, more tagged information or processing instructions, e.g.
      <?xml-stylesheet type="text/xml" href="limited.xsl"?>
      This is meta-information about the document.

  15. DTDs
      • XML is self-describing: a document can use a DTD (Document Type Definition) to formally describe the structure of its contents.
      • An XML document is WELL-FORMED if its syntax is correct according to the XML standard. It is VALID if it additionally conforms to its DTD.
      • DTDs are written so that we can share our document structures with other parties. Knowing our DTD, they can write programs to process our XML documents.

  16. Example with DTD
      <?xml version="1.0"?>
      <!DOCTYPE email [
        <!ELEMENT email (to,from,subject,message)>
        <!ATTLIST email date CDATA #IMPLIED>
        <!ELEMENT to (#PCDATA)>
        <!ELEMENT from (#PCDATA)>
        <!ELEMENT subject (#PCDATA)>
        <!ELEMENT message (#PCDATA)>
      ]>
      <email date="30/09/04">
        <to>fred</to>
        <from>sue</from>
        <subject>xml example</subject>
        <message>this is the message</message>
      </email>

  17. DTDs are like grammars..
      <!ELEMENT address_book (listing+) >
      <!ELEMENT listing (name, address) >
      <!ELEMENT name (last_name, first_name) >
      <!ELEMENT last_name (#PCDATA) >
      <!ELEMENT first_name (#PCDATA) >
      <!ELEMENT address (street, city, (state|province), zip) >
      <!ELEMENT street (#PCDATA) >
      <!ELEMENT city (#PCDATA) >
      <!ELEMENT state (#PCDATA) >
      <!ELEMENT province (#PCDATA) >
      <!ELEMENT zip (#PCDATA) >
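The grammar analogy can be sketched in Python by hand-translating each content model above into a regular expression over the names of an element's children. This is only an illustration of the idea: Python's standard `xml.etree.ElementTree` does not validate against DTDs, and the names `CONTENT_MODELS`, `TEXT_ONLY` and `conforms` are invented here. A real application would use a validating parser.

```python
import re
import xml.etree.ElementTree as ET

# Content models from the slide, hand-translated into regular expressions
# over the sequence of child tag names (each name followed by one space).
CONTENT_MODELS = {
    "address_book": r"(listing )+",
    "listing": r"name address ",
    "name": r"last_name first_name ",
    "address": r"street city (state |province )zip ",
}
# Elements declared as (#PCDATA): text only, no child elements.
TEXT_ONLY = {"last_name", "first_name", "street", "city",
             "state", "province", "zip"}


def conforms(element: ET.Element) -> bool:
    """Check an element and its descendants against the models above."""
    if element.tag in TEXT_ONLY:
        return len(element) == 0
    model = CONTENT_MODELS.get(element.tag)
    if model is None:
        return False  # element type is not declared at all
    child_names = "".join(child.tag + " " for child in element)
    if re.fullmatch(model, child_names) is None:
        return False
    return all(conforms(child) for child in element)


GOOD = ("<address_book><listing>"
        "<name><last_name>Smith</last_name><first_name>Ann</first_name></name>"
        "<address><street>1 High St</street><city>Leeds</city>"
        "<province>Yorkshire</province><zip>LS1</zip></address>"
        "</listing></address_book>")
# Same listing, but with both <state> and <province>: violates (state|province).
BAD = GOOD.replace("<province>", "<state>WY</state><province>")

good_ok = conforms(ET.fromstring(GOOD))
bad_ok = conforms(ET.fromstring(BAD))
```

Both strings are well-formed XML; only the first follows the grammar, which is the well-formed versus valid distinction.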

  18. DOMs “.. The promise of the Internet is very much tied to interoperability and the value proposition of e-business depends on the ability to truly collaborate with partners and customers in a meaningful and efficient way..” http://www.4infinitesolutions.com/course%20XML%20DTDs_Schema_DOM.htm

  19. DOMs
      • Document Object Models (DOMs) give an (abstract) program interface for constructing, querying, accessing and manipulating XML documents.
      • Concrete DOMs define methods and properties (instantiated for each programming language) which can be used to access/change XML documents from programs.
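As one concrete DOM binding, Python's standard library ships `xml.dom.minidom`. The sketch below queries, changes and extends a small email document; the document content and the replacement value "wilma" are made up for illustration.

```python
from xml.dom import minidom

doc = minidom.parseString("<email><to>fred</to><from>sue</from></email>")

# Query: find the <to> element and read its text node.
to_el = doc.getElementsByTagName("to")[0]
recipient = to_el.firstChild.data

# Manipulate: change the text in place.
to_el.firstChild.data = "wilma"

# Construct: build a new <subject> element and attach it to the root.
subject = doc.createElement("subject")
subject.appendChild(doc.createTextNode("xml example"))
doc.documentElement.appendChild(subject)
doc.documentElement.setAttribute("date", "30/09/04")

# Serialise the modified tree back to XML text.
result = doc.documentElement.toxml()
```

The same method names (`getElementsByTagName`, `createElement`, `appendChild`, `setAttribute`) come from the W3C DOM interface, so they appear in DOM bindings for other languages too.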

  20. The Uniform Resource Identifier (URI) !!!
      A “URI” is fundamental to the SW: it is a string that uniquely identifies a resource. Often (but not always) a URI points to a web page or an XML document.
      In XML, element type names (tags) and attribute names may be qualified with a URI, so that the name can be understood globally.

  21. The Uniform Resource Identifier (URI)
      Example: you need to refer to an ELEMENT tagged <email> as defined in the document
      http://scom.hud.ac.uk/scomtlm/namespaces/example
      You would set up a “namespace” in your XML document, say
      tlm = http://scom.hud.ac.uk/scomtlm/namespaces/example
      Then in your document you would use
      tlm:email
      to denote that this <email> tag is the same as the one in http://scom.hud.ac.uk/scomtlm/namespaces/example

  22. Namespaces - xmlns examples
      <tlm:email xmlns:tlm="http://scom.hud.ac.uk/scomtlm/namespaces/example">
        …
        <tlm:message …. >
      </tlm:email>
      You can also define a default namespace:
      <email xmlns="http://scom.hud.ac.uk/scomtlm/namespaces/example">
      </email>
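A short Python sketch shows that the prefix is only a local abbreviation: what a parser sees is the pair (namespace URI, local name). The two small documents and the message text "hello" are assumptions for illustration; the URI is the one from the slide.

```python
import xml.etree.ElementTree as ET

NS = "http://scom.hud.ac.uk/scomtlm/namespaces/example"

# The same element written with a prefix and with a default namespace.
prefixed = (f'<tlm:email xmlns:tlm="{NS}">'
            '<tlm:message>hello</tlm:message></tlm:email>')
default = f'<email xmlns="{NS}"><message>hello</message></email>'

a = ET.fromstring(prefixed)
b = ET.fromstring(default)

# ElementTree expands every qualified name to "{uri}localname",
# so both roots carry exactly the same tag.
same_name = a.tag == b.tag

# When searching, any prefix may be used as long as it maps to the same URI.
msg_a = a.find("tlm:message", {"tlm": NS}).text
msg_b = b.find("x:message", {"x": NS}).text
```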

  23. Exercises
      Read through some XML tutorials from relevant sites on the web, e.g.
      • http://www.ddj.com/documents/s=2803/nam1012432263/
      • http://www.ddj.com/documents/s=2799/nam1012432259/
      • http://www.xmlfiles.com/xml/
      • http://www.dcs.napier.ac.uk/~andrew/xml/ (this has some nice tutorial questions and answers!)
      Try the following exercises:
      1. Write a small XML bibliography, and then write a DTD for it.
      2. Write a small XML address book, and then write a DTD for it.
      3. Cut and paste an XSL style sheet from one of the example websites and try to use it to present your XML files.
      For the week ahead: continue to read through the tutorials, and write down some notes on the meaning and different roles of DTD, XSL, DOM and all the other jargon you come across!
