
Semantic Web Standards



Presentation Transcript


  1. Semantic Web Standards. Slides based on Ian Horrocks’s class

  2. Where we are Today: the Syntactic Web [Hendler & Miller 02]

  3. The Syntactic Web is… • A hypermedia, a digital library • A library of documents (called web pages) interconnected by a hypermedia of links • A database, an application platform • A common portal to applications accessible through web pages, and presenting their results as web pages • A platform for multimedia • BBC Radio 4 anywhere in the world! Terminator 3 trailers! • A naming scheme • Unique identity for those documents • A place where computers do the presentation (easy) and people do the linking and interpreting (hard). Why not get computers to do more of the hard work? [Goble 03]

  4. Hard Work using the Syntactic Web… • Find images of Peter Patel-Schneider, Frank van Harmelen and Alan Rector… • [Image caption on slide: Rev. Alan M. Gates, Associate Rector of the Church of the Holy Spirit, Lake Forest, Illinois]

  5. Impossible (?) using the Syntactic Web… • Complex queries involving background knowledge • Find information about “animals that use sonar but are not either bats or dolphins”, e.g., Barn Owl • Locating information in data repositories • Travel enquiries • Prices of goods and services • Results of human genome experiments • Finding and using “web services” • Visualise surface interactions between two proteins • Delegating complex tasks to web “agents” • Book me a holiday next weekend somewhere warm, not too far away, and where they speak French or English

  6. What is the Problem? • Consider a typical web page: • Markup consists of: • rendering information (e.g., font size and colour) • Hyper-links to related content • Semantic content is accessible to humans but not (easily) to computers…

  7. What information can we see… WWW2002 The eleventh international world wide web conference Sheraton waikiki hotel Honolulu, hawaii, USA 7-11 may 2002 1 location 5 days learn interact Registered participants coming from australia, canada, chile denmark, france, germany, ghana, hong kong, india, ireland, italy, japan, malta, new zealand, the netherlands, norway, singapore, switzerland, the united kingdom, the united states, vietnam, zaire Register now On the 7th May Honolulu will provide the backdrop of the eleventh international world wide web conference. This prestigious event … Speakers confirmed Tim berners-lee Tim is the well known inventor of the Web, … Ian Foster Ian is the pioneer of the Grid, the next generation internet …

  8. What information can a machine see… WWW2002 The eleventh international world wide web conference Sheraton waikiki hotel Honolulu, hawaii, USA 7-11 may 2002 1 location 5 days learn interact Registered participants coming from australia, canada, chile denmark, france, germany, ghana, hong kong, india, ireland, italy, japan, malta, new zealand, the netherlands, norway, singapore, switzerland, the united kingdom, the united states, vietnam, zaire Register now On the 7th May Honolulu will provide the backdrop of the eleventh international world wide web conference. This prestigious event … Speakers confirmed Tim berners-lee Tim is the well known inventor of the Web, … Ian Foster Ian is the pioneer of the Grid, the next generation internet …

  9. Solution: XML markup with “meaningful” tags? <name>WWW2002 The eleventh international world wide webcon</name> <location>Sheraton waikiki hotel Honolulu, hawaii, USA</location> <date>7-11 may 2002</date> <slogan>1 location 5 days learn interact</slogan> <participants>Registered participants coming from australia, canada, chile denmark, france, germany, ghana, hong kong, india, ireland, italy, japan, malta, new zealand, the netherlands, norway, singapore, switzerland, the united kingdom, the united states, vietnam, zaire</participants> <introduction>Register now On the 7th May Honolulu will provide the backdrop of the eleventh international world wide web conference. This prestigious event … Speakers confirmed</introduction> <speaker>Tim berners-lee</speaker> <bio>Tim is the well known inventor of the Web,</bio>…

  10. But What About… <conf>WWW2002 The eleventh international world wide webcon</conf> <place>Sheraton waikiki hotel Honolulu, hawaii, USA</place> <date>7-11 may 2002</date> <slogan>1 location 5 days learn interact</slogan> <participants>Registered participants coming from australia, canada, chile denmark, france, germany, ghana, hong kong, india, ireland, italy, japan, malta, new zealand, the netherlands, norway, singapore, switzerland, the united kingdom, the united states, vietnam, zaire</participants> <introduction>Register now On the 7th May Honolulu will provide the backdrop of the eleventh international world wide web conference. This prestigious event … Speakers confirmed</introduction> <speaker>Tim berners-lee</speaker> <bio>Tim is the well known inventor of the Web,…
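The problem on the two slides above can be sketched in a few lines of Python. This is only an illustration: the `<event>` wrapper element and the `extract_location` helper are assumptions made for the example, not part of the slides. A program written against the first tag vocabulary silently finds nothing in the second, although a human reads both pages the same way.

```python
import xml.etree.ElementTree as ET

# Two pages describing the same event with different tag names.
# The <event> wrapper is a hypothetical root added so the text parses as XML.
PAGE_A = """<event>
  <name>WWW2002</name>
  <location>Sheraton waikiki hotel Honolulu, hawaii, USA</location>
  <date>7-11 may 2002</date>
</event>"""

PAGE_B = """<event>
  <conf>WWW2002</conf>
  <place>Sheraton waikiki hotel Honolulu, hawaii, USA</place>
  <date>7-11 may 2002</date>
</event>"""


def extract_location(xml_text):
    """Return the text of the <location> element, or None if it is absent."""
    root = ET.fromstring(xml_text)
    node = root.find("location")
    return node.text if node is not None else None


location_a = extract_location(PAGE_A)  # found: the program knows <location>
location_b = extract_location(PAGE_B)  # None: <place> means nothing to it
print(location_a)
print(location_b)
```

The tags are "meaningful" only to the people who chose them; nothing tells the program that `<place>` and `<location>` describe the same thing.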

  11. Need to Add “Semantics” • External agreement on meaning of annotations • E.g., Dublin Core • Agree on the meaning of a set of annotation tags • Problems with this approach • Inflexible • Limited number of things can be expressed • Use Ontologies to specify meaning of annotations • Ontologies provide a vocabulary of terms • New terms can be formed by combining existing ones • Meaning (semantics) of such terms is formally specified • Can also specify relationships between terms in multiple ontologies

  12. History of the Semantic Web • Web was “invented” by Tim Berners-Lee (amongst others), a physicist working at CERN • TBL’s original vision of the Web was much more ambitious than the reality of the existing (syntactic) Web: “... a goal of the Web was that, if the interaction between person and hypertext could be so intuitive that the machine-readable information space gave an accurate representation of the state of people's thoughts, interactions, and work patterns, then machine analysis could become a very powerful management tool, seeing patterns in our work and facilitating our working together through the typical problems which beset the management of large organizations.” • TBL (and others) have since been working towards realising this vision, which has become known as the Semantic Web • E.g., article in May 2001 issue of Scientific American…

  13. Scientific American, May 2001: Beware of the Hype

  14. Beware of the Hype • Hype seems to suggest that Semantic Web means: “semantics + web = AI” • “A new form of Web content that is meaningful to computers will unleash a revolution of new abilities” • More realistic to think of it as meaning: “semantics + web + AI = more useful web” • Realising the complete “vision” is too hard for now (probably) • But we can make a start by adding semantic annotation to web resources Images from Christine Thompson and David Booth

  15. Web “Schema” Languages • Existing Web languages extended to facilitate content description • XML → XML Schema (XMLS) • RDF → RDF Schema (RDFS) • XMLS not an ontology language • Changes format of DTDs (document schemas) to be XML • Adds an extensible type hierarchy • Integers, Strings, etc. • Can define sub-types, e.g., positive integers • RDFS is recognisable as an ontology language • Classes and properties • Sub/super-classes (and properties) • Range and domain (of properties)

  16. RDF and RDFS • RDF stands for Resource Description Framework • It is a W3C candidate recommendation (http://www.w3.org/RDF) • RDF is a graphical formalism (+ XML syntax + semantics) • for representing metadata • for describing the semantics of information in a machine-accessible way • RDFS extends RDF with “schema vocabulary”, e.g.: • Class, Property • type, subClassOf, subPropertyOf • range, domain
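As a rough sketch of what the `range` and `domain` vocabulary buys, the Python below treats triples as plain tuples and derives `type` statements from them. The choice of `Person` as the domain and range of `hasColleague` is an assumption for illustration, and the dictionary lookup simplifies real RDFS, where a property may have several domains or ranges.

```python
def infer_types(triples):
    """Derive (resource, 'type', Class) statements from domain/range declarations."""
    # Simplification: at most one domain and one range per property.
    domains = {s: o for s, p, o in triples if p == "domain"}
    ranges = {s: o for s, p, o in triples if p == "range"}
    inferred = set()
    for s, p, o in triples:
        if p in domains:
            inferred.add((s, "type", domains[p]))  # subject belongs to the domain
        if p in ranges:
            inferred.add((o, "type", ranges[p]))   # object belongs to the range
    return inferred


# Hypothetical schema: hasColleague relates Persons to Persons.
schema = {
    ("hasColleague", "domain", "Person"),
    ("hasColleague", "range", "Person"),
}
data = {("Ian", "hasColleague", "Uli")}

types = infer_types(schema | data)
print(sorted(types))
```

Note that in RDFS the declarations are used to infer class membership rather than to reject data that does not fit.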

  17. The RDF Data Model • Statements are <subject, predicate, object> triples, e.g.: • <Ian,hasColleague,Uli> • [Figure: node Ian linked to node Uli by an edge labelled hasColleague] • Can be represented using XML serialisation • Statements describe properties of resources • A resource is a URI representing a (class of) object(s): • a document, a picture, a paragraph on the Web; • http://www.cs.man.ac.uk/index.html • a book in the library, a real person (?) • isbn://5031-4444-3333 • … • Properties themselves are also resources (URIs)
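A minimal sketch of this data model in Python, assuming a made-up namespace `http://example.org/` for the resource names (the slide gives only the short names Ian, Uli and hasColleague). Because properties are themselves resources, a statement can also have a property as its subject; the `comment` predicate and its text are hypothetical.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


BASE = "http://example.org/"  # hypothetical namespace, not from the slides

statements = [
    # <Ian, hasColleague, Uli>
    Triple(BASE + "Ian", BASE + "hasColleague", BASE + "Uli"),
    # A property is a resource too, so it can be described by a statement.
    Triple(BASE + "hasColleague", BASE + "comment", "relates two co-workers"),
]


def objects_of(subject, predicate, triples):
    """Return every object o such that <subject, predicate, o> is stated."""
    return [t.obj for t in triples
            if t.subject == subject and t.predicate == predicate]


ian_colleagues = objects_of(BASE + "Ian", BASE + "hasColleague", statements)
print(ian_colleagues)
```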

  18. URIs • URI = Uniform Resource Identifier • “The generic set of all names/addresses that are short strings that refer to resources” • URIs may or may not be dereferenceable • URLs (Uniform Resource Locators) are a particular type of URI, used for resources that can be accessed on the WWW (e.g., web pages) • In RDF, URIs typically look like “normal” URLs, often with fragment identifiers to point at specific parts of a document: • http://www.somedomain.com/some/path/to/file#fragmentID
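The split between a document and its fragment identifier can be seen with Python's standard `urllib.parse` module. This is just a small illustration using the example URI from the slide and the `isbn://` identifier from the previous one.

```python
from urllib.parse import urldefrag, urlparse

uri = "http://www.somedomain.com/some/path/to/file#fragmentID"

# urldefrag separates the document part from the fragment identifier.
document, fragment = urldefrag(uri)
print(document)
print(fragment)

# A URI need not be a dereferenceable URL; only its scheme tells us how
# (or whether) it could be accessed.
scheme = urlparse("isbn://5031-4444-3333").scheme
print(scheme)
```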

  19. Linking Statements • The subject of one statement can be the object of another • Such collections of statements form a directed, labeled graph • Note that the object of a triple can also be a “literal” (a string) • Note also that RDF triples don’t by themselves give meaning • You know that (1) Ian and Carole are most likely colleagues (barring multiple jobs for Uli) and (2) (Uli hasColleague Ian) holds (“colleagueness”, unlike “love”, is symmetric). But DOES YOUR PROGRAM KNOW THIS?
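To make the closing question concrete, here is a small sketch in Python with triples stored as tuples. The helper names are invented for the example. The program answers "no" to (Uli hasColleague Ian) until someone tells it explicitly that hasColleague is symmetric.

```python
triples = {
    ("Ian", "hasColleague", "Uli"),
    ("Carole", "hasColleague", "Uli"),
    ("Uli", "hasHomePage", "http://www.cs.mam.ac.uk/~sattler"),
}


def holds(s, p, o, graph):
    """A statement 'holds' only if it is literally present in the graph."""
    return (s, p, o) in graph


def with_symmetry(graph, symmetric_predicates):
    """Add the reversed statement for every predicate declared symmetric."""
    closed = set(graph)
    for s, p, o in graph:
        if p in symmetric_predicates:
            closed.add((o, p, s))
    return closed


# Without background knowledge the program does not know this.
before = holds("Uli", "hasColleague", "Ian", triples)

# Once symmetry is stated, the reversed statement can be derived.
after = holds("Uli", "hasColleague", "Ian",
              with_symmetry(triples, {"hasColleague"}))
print(before, after)
```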

  20. RDF Syntax • RDF has an XML syntax that has a specific meaning: • Every Description element describes a resource • Every attribute or nested element inside a Description is a property of that resource with an associated object resource • Resources are referred to using URIs <Description about="some.uri/person/ian_horrocks"> <hasColleague resource="some.uri/person/uli_sattler"/> </Description> <Description about="some.uri/person/uli_sattler"> <hasHomePage>http://www.cs.mam.ac.uk/~sattler</hasHomePage> </Description> <Description about="some.uri/person/carole_goble"> <hasColleague resource="some.uri/person/uli_sattler"/> </Description>
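The reading rules on this slide can be sketched with Python's standard XML parser. This is a simplification, not a real RDF/XML parser: the slide's markup omits namespaces, and the `<rdf>` root element below is an assumption added only so the three Description elements form one well-formed document.

```python
import xml.etree.ElementTree as ET

# The slide's example, wrapped in a hypothetical <rdf> root element.
RDF_XML = """<rdf>
  <Description about="some.uri/person/ian_horrocks">
    <hasColleague resource="some.uri/person/uli_sattler"/>
  </Description>
  <Description about="some.uri/person/uli_sattler">
    <hasHomePage>http://www.cs.mam.ac.uk/~sattler</hasHomePage>
  </Description>
  <Description about="some.uri/person/carole_goble">
    <hasColleague resource="some.uri/person/uli_sattler"/>
  </Description>
</rdf>"""


def to_triples(xml_text):
    """Turn each nested element of a Description into a (s, p, o) triple."""
    result = []
    root = ET.fromstring(xml_text)
    for description in root.findall("Description"):
        subject = description.get("about")
        for prop in description:
            # The object is either a referenced resource or literal text.
            obj = prop.get("resource")
            if obj is None:
                obj = (prop.text or "").strip()
            result.append((subject, prop.tag, obj))
    return result


triples = to_triples(RDF_XML)
for triple in triples:
    print(triple)
```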

  21. RDF Schema (RDFS) • RDF gives a formalism for metadata annotation, and a way to write it down in XML, but it does not give any special meaning to vocabulary such as subClassOf or type • Interpretation is an arbitrary binary relation • I.e., <Person,subClassOf,Animal> has no special meaning • RDF Schema defines “schema vocabulary” that supports definition of ontologies • gives “extra meaning” to particular RDF predicates and resources (such as subClassOf) • this “extra meaning”, or semantics, specifies how a term should be interpreted
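One way to picture the “extra meaning” of subClassOf and type is as inference rules applied until nothing new follows. The Python below is a minimal sketch of two such rules (subClassOf is transitive, and instances of a class are instances of its superclasses); the class `LivingThing` and the individual `Ian` are hypothetical additions to the slide's <Person,subClassOf,Animal> example.

```python
def rdfs_closure(triples):
    """Apply two RDFS-style rules repeatedly until no new triples appear."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            if p != "subClassOf":
                continue
            for s2, p2, o2 in inferred:
                # Rule 1: subClassOf is transitive.
                if p2 == "subClassOf" and s2 == o:
                    new.add((s, "subClassOf", o2))
                # Rule 2: an instance of a class is an instance of its superclass.
                if p2 == "type" and o2 == s:
                    new.add((s2, "type", o))
        if new <= inferred:
            return inferred
        inferred |= new


facts = {
    ("Person", "subClassOf", "Animal"),
    ("Animal", "subClassOf", "LivingThing"),  # hypothetical extra class
    ("Ian", "type", "Person"),                # hypothetical instance
}

closure = rdfs_closure(facts)
for triple in sorted(closure - facts):
    print(triple)
```

Plain RDF would store the three facts and stop; it is the schema semantics that licenses the derived statements.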

  22. RDF Schema is really RDF background knowledge! [Figure labels: “Background Theory”, “Instances”]

  23. RDF/RDFS vs. General Knowledge Rep & Reasoning • We noted that RDF can be seen as “base level facts” and RDFS can be seen as “background theory/facts/rules” • At this level, inference with RDF/RDFS seems to be just a special case of Knowledge Representation and Reasoning • This is good (CSE471 Ahoy!) and bad (reasoning over most non-trivial logics is NP-hard or much, much worse) • RDF/RDFS can be seen as an attempt to limit the complexity of reasoning by limiting what can be expressed • RDF/RDFS together can be seen as capturing a certain tractable subset of First Order Logic • … already there is trouble in paradise, with people complaining that the expressiveness is not enough • Enter OWL, which attempts to provide expressiveness equivalent to “description logics” (a sort of inheritance reasoning in First-order logic)

  24. Problems with RDFS • RDFS too weak to describe resources in sufficient detail • No localised range and domain constraints • Can’t say that the range of hasChild is person when applied to persons and elephant when applied to elephants • No existence/cardinality constraints • Can’t say that all instances of person have a mother that is also a person, or that persons have exactly 2 parents • No transitive, inverse or symmetrical properties • Can’t say that isPartOf is a transitive property, that hasPart is the inverse of isPartOf or that touches is symmetrical • … • Difficult to provide reasoning support • No “native” reasoners for non-standard semantics • May be possible to reason via FO axiomatisation
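To see what is being asked for, the sketch below shows the conclusions a reasoner could draw if properties could be declared transitive, inverse or symmetric. It is plain Python over tuples, not an RDFS or OWL implementation; the function name and the example facts (finger, hand, arm, table) are made up for illustration, while the property names come from the slide.

```python
def apply_characteristics(triples, transitive=(), symmetric=(), inverse_of=None):
    """Close a set of triples under declared property characteristics."""
    inverse_of = inverse_of or {}
    closed = set(triples)
    while True:
        new = set()
        for s, p, o in closed:
            if p in symmetric:
                new.add((o, p, s))
            if p in inverse_of:
                new.add((o, inverse_of[p], s))
            if p in transitive:
                for s2, p2, o2 in closed:
                    if p2 == p and s2 == o:
                        new.add((s, p, o2))
        if new <= closed:
            return closed
        closed |= new


# Hypothetical facts using the slide's property names.
facts = {
    ("finger", "isPartOf", "hand"),
    ("hand", "isPartOf", "arm"),
    ("hand", "touches", "table"),
}

closed = apply_characteristics(
    facts,
    transitive={"isPartOf"},
    symmetric={"touches"},
    inverse_of={"isPartOf": "hasPart"},
)
for triple in sorted(closed - facts):
    print(triple)
```

RDFS has no vocabulary for the three keyword arguments, so none of the derived statements would follow from an RDFS description alone.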

  25. RDF Schema is now being superseded by OWL

  26. Layer 4½: Mapping Between Ontologies • Taxonomy Crisis: • How can your agent know that my “title” is your “name”?! • How can my agent know that some of your “address” objects are post-boxes, not physical addresses?! • How can my agent know that many Asian first names correspond to Western surnames? • Semantic Web Solution: Services for translating/mapping between “related” ontologies. • Suppose Amazon.com uses Dublin Core (“title”), while Fred Hanna uses its own document ontology (“name”). So far … my agent is forced to choose an ontology, or must be carefully crafted to understand both languages • A better solution: A niche now exists for an independent entity (UniversalBookInfo.com) that maps “title” ↔ “name” etc.
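A toy version of such a mapping service, sketched in Python. The mapping table, record layout and function names are assumptions for illustration; the slide only says that “title” in one vocabulary corresponds to “name” in the other.

```python
# Hypothetical mapping of the kind a service such as UniversalBookInfo.com
# might publish: Fred Hanna's field names -> Dublin Core field names.
FRED_HANNA_TO_DUBLIN_CORE = {"name": "title"}


def translate(record, mapping):
    """Rename the fields of a record according to an ontology mapping."""
    return {mapping.get(field, field): value for field, value in record.items()}


def find_by_title(records, title):
    """An agent that only understands the Dublin Core 'title' field."""
    return [r for r in records if r.get("title") == title]


amazon_record = {"title": "War & Peace"}
fred_hanna_record = {"name": "War & Peace"}

# Without the mapping, the agent misses Fred Hanna's copy.
untranslated = find_by_title([amazon_record, fred_hanna_record], "War & Peace")

# With the mapping applied first, both shops are found.
translated = find_by_title(
    [amazon_record, translate(fred_hanna_record, FRED_HANNA_TO_DUBLIN_CORE)],
    "War & Peace",
)
print(len(untranslated), len(translated))
```

The agent itself stays simple; the knowledge about how the two vocabularies relate lives in the mapping, which can be maintained by a third party.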

  27. Without UniversalBookInfo.com [Figure labels: Nick wants to buy War & Peace; Nick’s very complicated agent; €€€; Programmer’s bank account; Amazon ontology; Fred Hanna ontology; Amazon; Fred Hanna]

  28. With UniversalBookInfo.com [Figure labels: Nick wants to buy War & Peace; Nick’s agent; Joe’s agent; Jane’s agent; €; UniversalBookInfo.com; Amazon; Fred Hanna; €€€; Bank Account]

  29. (In)famous “Layer Cake” [Figure: layer-cake diagram with annotations “Data Exchange”, “Relational Data ?”, “Semantics+reasoning ?” and “???”] • Relationship between layers is not clear • OWL DL extends “DL subset” of RDF

  30. Who will annotate the data? • Semantic web works if the users annotate their pages using some existing ontology (or their own ontology, but with mapping to other ontologies) • But users typically do not conform to standards… • and are not patient enough for delayed gratification… • Three Solutions • 1. Intercede in the way pages are created (act as if you are helping them write web pages) • What if we change MS FrontPage/Claris Home Page so that they (slyly) add annotations? • E.g., the Mangrove project at U. Wash. • Help users tag their data (allow graphical editing) • Provide instant gratification by running services that use the tags. • 2. Collaborative tagging! • “Folksonomies” (look at Wikipedia article) • Flickr, Technorati, Del.icio.us etc. • 3. Automated information extraction (next topic)

  31. Folksonomies—The good • Bottom-up approach to taxonomies/ontologies • [In systems like] Furl, Flickr and Del.icio.us... people classify their pictures/bookmarks/web pages with tags (e.g. wedding), and then the most popular tags float to the top (e.g. Flickr's tags or Del.icio.us on the right).... • [F]olksonomies can work well for certain kinds of information because they offer a small reward for using one of the popular categories (such as your photo appearing on a popular page). People who enjoy the social aspects of the system will gravitate to popular categories while still having the freedom to keep their own lists of tags. Classic case of research playing catch-up with practice ;-)

  32. Works best when Many people Tag the same Info…

  33. Folksonomies… the bad • On the other hand, not hard to see a few reasons why a folksonomy would be less than ideal in a lot of cases: • None of the current implementations have synonym control (e.g. "selfportrait" and "me" are distinct Flickr tags, as are "mac" and "macintosh" on Del.icio.us). • Also, there's a certain lack of precision involved in using simple one-word tags, like which Lance are we talking about? (Though this is great for discovery, e.g. hot or Edmonton) • And, of course, there's no hierarchy and the content types (bookmarks, photos) are fairly simple. • For indexing and library people, folksonomies are about as appealing as Wikipedia is to encyclopedia editors. • But… there's some interesting stuff happening around them. [Callout: Computizing Eyeballs, (brain) cycle stealing]
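The missing synonym control is easy to demonstrate. In the Python sketch below the tagged items and the synonym table are invented for illustration (only the tag pairs "mac"/"macintosh" and "selfportrait"/"me" come from the slide); the point is that popularity counts are split across spellings until someone supplies the equivalences.

```python
from collections import Counter

# Hypothetical items, each with the free-form tags its owner chose.
tagged_items = [
    ["mac", "laptop"],
    ["macintosh"],
    ["selfportrait"],
    ["me", "mac"],
]

# Hypothetical hand-curated synonym table: variant -> preferred tag.
SYNONYMS = {"macintosh": "mac", "me": "selfportrait"}


def count_tags(items, synonyms=None):
    """Count tag popularity, optionally folding synonyms into one tag."""
    synonyms = synonyms or {}
    return Counter(synonyms.get(tag, tag) for tags in items for tag in tags)


raw = count_tags(tagged_items)               # counts split across variants
merged = count_tags(tagged_items, SYNONYMS)  # variants folded together
print(raw.most_common())
print(merged.most_common())
```

Someone still has to build and maintain the synonym table, which is exactly the controlled-vocabulary work that folksonomies set out to avoid.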

  34. Collaborative Computing, AKA Brain Cycle Stealing, AKA Computizing Eyeballs • A lot of exciting research related to the web currently involves “co-opting” the masses to help with large-scale tasks • It is like “cycle stealing”, except we are stealing “human brain cycles” (the most idle of the computers if there ever was one ;-) • Remember the mice in The Hitchhiker's Guide to the Galaxy? (…who were running a mass-scale experiment on the humans to figure out the question…) • Collaborative knowledge compilation (Wikipedia!) • Collaborative curation • Collaborative tagging • Many big open issues • How do you pose the problem such that it can be solved using collaborative computing? • How do you “incentivize” people into letting you steal their brain cycles?
