1 / 69

The Semantic Web and (vs?) Knowledge Representation ENC 2004, September 2004

The Semantic Web and (vs?) Knowledge Representation ENC 2004, September 2004. Peter F. Patel-Schneider Bell Labs Research Murray Hill, NJ, USA pfps@lucent.com. Abstract.

chaylse
Télécharger la présentation

The Semantic Web and (vs?) Knowledge Representation ENC 2004, September 2004

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. The Semantic Web and (vs?) Knowledge Representation ENC 2004, September 2004 Peter F. Patel-Schneider Bell Labs Research Murray Hill, NJ, USA pfps@lucent.com

  2. Abstract The vision of the Semantic Web is a collection of documents in the World Wide Web whose content is meaningful to computers. In this vision computers can directly and effectively process information stored in the Web without having to depend on human guidance. This vision also underlies Knowledge Representation, a subdiscipline of Artificial Intelligence. Research in Knowledge Representation over the last four decades has produced some surprising results but has also identified some very difficult problems that need to be overcome if this vision is to be realized in its entirety. The Semantic Web vision provides several opportunities for Knowlege Representation that have not been explioted in the past as well as some new difficulties that could hinder application of techniques from Knowledge Representation to the Semantic Web. Achieving the vision of the Semantic Web require exploiting the successes of Knoweldge Representation in a challenging new environment.

  3. Acknowledgments and Caveat Some of the slides for this talk have been taken from various talks on the Semantic Web by Ian Horrocks and others, including Jeen Broekstra, Carole Goble, Frank van Harmelen, Austin Tate, and Raphael Volz. Thanks go to all of them. This talk contains my personal opinions on the Semantic Web and Knowledge Representation. Other researchers in both areas have differing opinions.

  4. What is the Semantic Web? The Semantic Web aims to make information on the World Wide Web accessible to computers. • Not only parsable by computers (i.e., XML), but also understandable (in some sense) by computers. • Prior agreements between humans are not needed to provide meaning (as is the case for XML). • Human guidance is not always needed.

  5. “... a goal of the Web was that, if the interaction between person and hypertext could be so intuitive that the machine-readable information space gave an accurate representation of the state of people's thoughts, interactions, and work patterns, then machine analysis could become a very powerful management tool, seeing patterns in our work and facilitating our working together through the typical problems which beset the management of large organizations.” History of the Semantic Web • Web was “invented” by Tim Berners-Lee (amongst others), a physicist working at CERN • TBL’s original vision of the Web was much more ambitious than the reality of the existing (syntactic) Web: • TBL (and others) have since been working towards realising this vision, which has become known as the Semantic Web • E.g., article in May 2001 issue of Scientific American…

  6. Scientific American, May 2001:

  7. What is Knowledge Representation? Knowledge Representation aims to make information accessible to computers. • Not only parsable by computers (i.e., databases) but also understandable (in some sense) by computers. • Prior agreements between humans are not needed to provide meaning (as is the case for databases). • Human guidance is not always needed.

  8. Why the Semantic Web matches Knowledge Representation • Semantic Web is not just data • Divergent interpretations of reality • Missing information (and not just null values) • Doesn‘t match assumptions of databases or object-oriented programming • Open world (no closed-world assumption) • Objects change status over time • Semantic Web is a representation system

  9. The Semantic Web and/vsKnowledge Representation • The Semantic Web is both an opportunity and a challenge for Knowledge Representation • An opportunity because the Semantic Web is (or will be) a source of information with very similar goals to those of Knowledge Representation • A challenge because some of the characteristics of the Semantic Web violate some of the assumptions that have generally been held in Knowledge Representation • Knowledge Representation is both a resource and a cautionary tale for the Semantic Web • Knowledge Representation techniques can be utilized in the Semantic Web • Problems encountered in Knowledge Representation have already plauged the Semantic Web.

  10. Why Knowledge Representation is a resource for the Semantic Web • Knowledge Representation provides formal rigor • Meaning of information-bearing constructs are formally determined • Needed for computers to process the constructs (compare with formal syntax, also required for computer processing) • Knowledge Representation is concerned with reasoning • Determining what follows from a collection of information • Provides an account of what can be (reliably) done by a computer • Knowledge Representation systems are becoming quite powerful and reliable

  11. Why Knowledge Representation is a cautionary tale for the Semantic Web • Early Knowledge Representation was not formal • Lead to many interminable debates about the meaning of constructs • Current work in Knowledge Representation is very concerned with the formal meaning of constructs • Computing with representations is difficult • Many problems are intractable or even undecidable • Lead to a retreat to simpler languages • Current systems are quite capable, even on expressive languages • Heavily optimized code • Computers are much more powerful

  12. Why the Semantic Web is an opportunity for Knowledge Representation • The World Wide Web forms a (very) large source of information (albeit in awkward formats) • The Semantic Web aims to transform much of this information into a form compatible with the goals of Knowledge Representation • The Semantic Web also will contain services that can be controlled by other computers (and reasoned about) • A potential solution to the Grounding Problem in Knowledge Representation

  13. Why the Semantic Web is a challenge for Knowledge Representation • Very large amounts of information will be part of the Semantic Web • Can overwhelm formal reasoning methods • The Semantic Web contains differing interpretations of reality • Which one(s) to choose? • Recovering from inconsistencies • How to determine how inconsistencies arise • How to determine who to trust

  14. Why is the Semantic Web a good idea?

  15. Where we are Today: the Syntactic Web [Hendler & Miller 02]

  16. The Syntactic Web is… • A hypermedia, a digital library • A library of documents called (web pages) interconnected by a hypermedia of links • A database, an application platform • A common portal to applications accessible through web pages, and presenting their results as web pages • A platform for multimedia • BBC Radio 4 anywhere in the world! Terminator 3 trailers! • A naming scheme • Unique identity for those documents A place where computers do the presentation (easy) and people do the linking and interpreting (hard). Why not get computers to do more of the hard work? [Goble 03]

  17. Hard Work using the Syntactic Web… Find images of Peter Patel-Schneider, Frank van Harmelen and Alan Rector… Rev. Alan M. Gates, Associate Rector of the Church of the Holy Spirit, Lake Forest, Illinois

  18. Impossible (?) using the Syntactic Web… • Complex queries involving background knowledge • Find information about “animals that use sonar but are not either bats or dolphins” • Locating information in data repositories • Travel enquiries • Prices of goods and services • Results of human genome experiments • Finding and using “web services” • Visualise surface interactions between two proteins • Delegating complex tasks to web “agents” • Book me a holiday next weekend somewhere warm, not too far away, and where they speak French or English

  19. What is the Problem? • Consider a typical web page: • Markup consists of: • rendering information (e.g., font size and colour) • Hyper-links to related content • Semantic content is accessible to humans but not (easily) to computers…

  20. What information can we see… WWW2002 The eleventh international world wide web conference Sheraton waikiki hotel Honolulu, hawaii, USA 7-11 may 2002 1 location 5 days learn interact Registered participants coming from australia, canada, chile denmark, france, germany, ghana, hong kong, india, ireland, italy, japan, malta, new zealand, the netherlands, norway, singapore, switzerland, the united kingdom, the united states, vietnam, zaire Register now On the 7th May Honolulu will provide the backdrop of the eleventh international world wide web conference. This prestigious event … Speakers confirmed Tim berners-lee Tim is the well known inventor of the Web, … Ian Foster Ian is the pioneer of the Grid, the next generation internet …

  21. What information can a machine see… WWW2002 The eleventh international world wide web conference Sheraton waikiki hotel Honolulu, hawaii, USA 7-11 may 2002 1 location 5 days learn interact Registered participants coming from australia, canada, chile denmark, france, germany, ghana, hong kong, india, ireland, italy, japan, malta, new zealand, the netherlands, norway, singapore, switzerland, the united kingdom, the united states, vietnam, zaire Register now On the 7th May Honolulu will provide the backdrop of the eleventh international world wide web conference. This prestigious event … Speakers confirmed Tim berners-lee Tim is the well known inventor of the Web, … Ian Foster Ian is the pioneer of the Grid, the next generation internet …

  22. Solution: XML markup with “meaningful” tags? <name>WWW2002 The eleventh international world wide webcon</name> <location>Sheraton waikiki hotel Honolulu, hawaii, USA</location> <date>7-11 may 2002</date> <slogan>1 location 5 days learn interact</slogan> <participants>Registered participants coming from australia, canada, chile denmark, france, germany, ghana, hong kong, india, ireland, italy, japan, malta, new zealand, the netherlands, norway, singapore, switzerland, the united kingdom, the united states, vietnam, zaire</participants> <introduction>Register now On the 7th May Honolulu will provide the backdrop of the eleventh international world wide web conference. This prestigious event … Speakers confirmed</introduction> <speaker>Tim berners-lee</speaker> <bio>Tim is the well known inventor of the Web,</bio>…

  23. But What About… <conf>WWW2002 The eleventh international world wide webcon</conf> <place>Sheraton waikiki hotel Honolulu, hawaii, USA</place> <date>7-11 may 2002</date> <slogan>1 location 5 days learn interact</slogan> <participants>Registered participants coming from australia, canada, chile denmark, france, germany, ghana, hong kong, india, ireland, italy, japan, malta, new zealand, the netherlands, norway, singapore, switzerland, the united kingdom, the united states, vietnam, zaire</participants> <introduction>Register now On the 7th May Honolulu will provide the backdrop of the eleventh international world wide web conference. This prestigious event … Speakers confirmed</introduction> <speaker>Tim berners-lee</speaker> <bio>Tim is the well known inventor of the Web,…

  24. Machine sees… <name>WWW2002 The eleventh international world wide webc</name> <location>Sheraton waikiki hotel Honolulu, hawaii, USA</location> <date>7-11 may 2002</date> <slogan>1 location 5 days learn interact</slogan> <participants>Registered participants coming from australia, canada, chile denmark, france, germany, ghana, hong kong, india, ireland, italy, japan, malta, new zealand, the netherlands, norway, singapore, switzerland, the united kingdom, the united states, vietnam, zaire</participants> <introduction>Register now On the 7th May Honolulu will provide the backdrop of the eleventh international world wide web conference. This prestigious event … Speakers confirmed</introduction> <speaker>Tim berners-lee</speaker> <bio>Tim is the well known inventor of the W</bio> <speaker>Ian Foster</speaker> <bio>Ian is the pioneer of the Grid, the ne</bio>

  25. Need to Add “Semantics” Two very different possible approaches: • External agreement on meaning of annotations • Agree on the meaning of a set of annotation tags, e.g., Dublin core • Problems with this approach • Inflexible • Limited number of things can be expressed • Use on-line Ontologies to specify meaning of annotations • Ontologies provide a vocabulary of terms • New terms can be formed by combining existing ones • Meaning (semantics) of such terms is formally specified • Can also specify relationships between terms in multiple ontologies Semantic Web takes second approach

  26. Characteristics of the Semantic Web • Part of the Web • Uses Web addressing (URIs) • Adheres to Web philosophy • Connected to the rest of the Web (e.g, services) • Large part of the Web • Very many connected documents • Diverse, conflicting • Semantic • Contains ontological information about meaning of objects

  27. What are Ontologies?

  28. Ontology: Origins and History Ontology in Philosophy a philosophical discipline—a branch of philosophy that deals with the nature and the organisation of reality • Science of Being (Aristotle, Metaphysics, IV, 1) • Tries to answer the questions: What characterizes being? Eventually, what is being?

  29. Concept Relates to activates Form Referent Stands for ? [Ogden, Richards, 1923] Ontology in Linguistics “Tank“

  30. Ontology in Computer Science • An ontology is an engineering artifact: • It is constituted by a specific vocabulary used to describe a certain reality, plus • a set of explicit assumptions regarding the intended meaning of the vocabulary. • Thus, an ontology describes a formal specification of a certain domain: • Shared understanding of a domain of interest • Formal and machine manipulable model of a domain of interest “An explicit specification of a conceptualisation” [Gruber93]

  31. Structure of an Ontology Ontologies typically have two distinct components: • Names for important concepts in the domain • Elephant is a concept whose members are a kind of animal • Herbivore is a concept whose members are exactly those animals who eat only plants or parts of plants • Adult_Elephant is a concept whose members are exactly those elephants whose age is greater than 20 years • Background knowledge/constraints on the domain • Adult_Elephants weigh at least 2,000 kg • All Elephants are either African_Elephants or Indian_Elephants • No individual can be both a Herbivore and a Carnivore

  32. Ontology Design and Deployment • Given key role of ontologies in the Semantic Web, it will be essential to provide tools and services to help users: • Design and maintain high quality ontologies, e.g.: • Meaningful— all named classes can have instances • Correct— captured intuitions of domain experts • Minimally redundant— no unintended synonyms • Richly axiomatised— (sufficiently) detailed descriptions • Store (large numbers) of instances of ontology classes, e.g.: • Annotations from web pages • Answer queries over ontology classes and instances, e.g.: • Find more general/specific classes • Retrieve annotations/pages matching a given description • Integrate and align multiple ontologies

  33. Ontology Languages • Wide variety of languages for “Explicit Specification” • Graphical notations • Semantic networks • Topic Maps (see http://www.topicmaps.org/) • UML • RDF • Logic based • Description Logics (e.g., OIL, DAML+OIL, OWL) • Rules (e.g., RuleML, LP/Prolog) • First Order Logic (e.g., KIF) • Conceptual graphs • (Syntactically) higher order logics (e.g., LBase) • Non-classical logics (e.g., Flogic, Non-Mon, modalities) • Probabilistic/fuzzy • Degree of formality varies widely • Increased formality makes languages more amenable to machine processing (e.g., automated reasoning)

  34. Many ontology languages use “object oriented” model based on • Objects/Instances/Individuals • Elements of the domain of discourse • Types/Classes/Concepts • Sets of objects sharing certain characteristics • Relations/Properties/Roles • Sets of pairs (tuples) of objects • Such languages are/can be: • Well understood • Formally specified • (Relatively) easy to use • Amenable to machine processing Description Logics are a family of such ontology languages

  35. What are Description Logics?

  36. Description Logics • A family of logics that can be used to represent ontologies • Based on • Individuals, e.g., John • Concepts, e.g., Person, Parent • Roles, e.g, childOf • Concepts can have defining characteristics • E.g., Parent is precisely those Persons who have a child • Description Logics are • Well understood • Formally specified with model-theoretic semantics • (Relatively) easy to use • Amenable to machine processing

  37. Sample Description Logic Ontology Fragment child  Person Person father  inverse(child) spouse  Person Person spouse is symmetric Student  Person Parent = Person  1 child john  Person  name=“John Smith” <john,sally>  spouse sally name=“Sally Brown” <sally,paul>  father paul Person   0 spouse

  38. Description Logic Inferences sally  Person  spouse:Person because sally is john’s spouse, spouse’s range is Person, and spouse is symmetric paul  Parent because paul is a Person, paul is sally’s father, father is a subrole of the inverse of child paul  spouse : {sally} because paul has no spouse mary  ( spouse : Person )  ( spouse:  {john} ) because if mary has a non-Person spouse, it can’t be john, because john is a Person

  39. Reasoning in Description Logics • Description Logics can be quite complex • Boolean constructs (, , ) • (Counted) modalities (, , , ) • Concept definitions (=) • Determining inference in Description Logics can be difficult • Computationally intractable (EXPTIME complete or worse) • Generally decidable, however • Nevertheless, systems exist that are effective in practice • Highly optimized, tuned for normal situations • Always return the right answer • Almost always return very quickly • E.g., FaCT, RACER, DLP

  40. Characteristics of Description Logics • Logics • Formal basis • Model theory provides meaning • Inference can be used to determine implicit information • Ontology Languages • Provide definitions for terms (concepts) • Provide information about individuals • Expressive • Can say lots of things (but not everything) • Determining inference is difficult • Implemented • Powerful systems exist (FaCT, DLP, RACER)

  41. (Representation) Languages for the Semantic Web

  42. Initial Semantic Web Languages • RDF (Resource Description Framework) • W3C recommendation (http://www.w3.org/RDF) • RDF is a language ( + XML syntax + semantics) • for representing metadata • for describing the semantics of information in a machine- accessible way • RDFS (RDF Schema) extends RDF with “schema vocabulary” • Class, Property • type, subClassOf, subPropertyOf • range, domain • RDFS is a very simple ontology language

  43. hasColleague Ian Uli The RDF Data Model • Statements are <subject, predicate, object> triples: <Ian,hasColleague,Uli> • Can be represented as a graph: • Statements describe properties of resources • A resource is any object that can be pointed to by a URI: • a document, a picture, a paragraph on the Web; • http://www.cs.man.ac.uk/index.html • a book in the library, a real person (?) • isbn://5031-4444-3333 • … • Properties themselves are also resources (URIs)

  44. URIs • URI = Uniform Resource Identifier • "The generic set of all names/addresses that are short strings that refer to resources" • URLs (Uniform Resource Locators) are a particular type of URI, used for resources that can be accessed on the WWW (e.g., web pages) • In RDF, URIs typically look like “normal” URLs, often with fragment identifiers to point at specific parts of a document: • http://www.somedomain.com/some/path/to/file#fragmentID

  45. RDF Syntax • RDF has an XML syntax • Every Description element describes a resource • Every attribute or nested element inside a Description is apropertyof that Resource • We can refer to resources by using URIs <Description about="some.uri/person/ian_horrocks"> <hasColleague resource="some.uri/person/uli_sattler"/> </Description> <Description about="some.uri/person/uli_sattler"> <hasHomePage>http://www.cs.mam.ac.uk/~sattler</hasHomePage> </Description> <Description about="some.uri/person/carole_goble"> <hasColleague resource="some.uri/person/uli_sattler"/> </Description>

  46. RDF Schema (RDFS) • RDF gives a language for meta data annotation, and a way to write it down in XML, but it does not provide any way to structure the annotations • RDF Schema augments RDF to allow you to define vocabulary terms and the relations between those terms • it gives “extra meaning” to particular RDF predicates and resources • e.g., Class, subClassOf, domain, range • These terms are the RDF Schema building blocks (constructors) used to create vocabularies: <Person,type,Class> <hasColleague,type,Property> <hasColleague,range,Person> <hasColleague,domain,Person> <Professor,subClassOf,Person> <Carole,type,Professor> <Carole,hasColleague,Ian>

  47. RDF and RDFS circa 2001 • Initial definition of RDF and RDFS was informal • A document giving an English description of what everything meant • Not adequate for representation • Debate on exact meaning of constructs, e.g., blank nodes • Similar to problems with informal Knowledge Representation work • W3C chartered the RDF Core Working Group to fix this (and other problems) • Produced cleaned up syntax for RDF • Produced formal semantics • RDF and RDFS are now real representation languages • Formal syntax, formal semantics, inference

  48. Semantics and Model Theories • Ontology/KR languages aim to model (part of) world • Terms in language correspond to entities in world • Meaning given by a Model Theory (MT) • MT defines relationship between syntax and interpretations • Can be many interpretations (models) of one piece of syntax • Models are supposed to be analogue of (part of) world • E.g., elements of model correspond to objects in world • Formal relationship between syntax and models • Structure of models reflect relationships specified in syntax • Inference (e.g., subsumption) is defined in terms of MT • E.g., T² A  B iff in every model of T, ext(A)  ext(B)

  49. RDF/RDFS Semantics • RDF has non-standard semantics to deal with certain bits of RDF • Semantics given by RDF Model Theory (MT) • In RDF MT, an interpretation I of a vocabulary V consists of: • IR, a non-empty set of resources • IS, a mapping from V into IR • IP, a distinguished subset of IR (the properties) • A vocabulary element v 2 V is a property iff IS(v) 2 IP • IEXT, a mapping from IP into the powerset of IR£IR • I.e., a set of elements <x,y>, with x,y elements of IR • IL, a mapping from typed literals into IR • Class interpretation ICEXT simply induced by IEXT(IS(type)) • ICEXT(C) = {x | <x,C> 2 IEXT(IS(type))}

  50. RDFS Interpretations • RDFS adds extra constraints on interpretations • E.g., interpretations of <C,subClassOf,D> constrained to those where ICEXT(IS(C)) µ ICEXT(IS(D)) • Can deal with triples such as • <Species,type,Class> <Lion,type,Species> <Leo,type,Lion> • <SelfInst,type,SelfInst> • And even with very unusual triples such as • <type,subPropertyOf,subClassOf> • But not clear if meaning of unusual triples matches intuition (if there is one)

More Related