
The Semantic Web


Presentation Transcript


  1. The Semantic Web • Stefan Decker, Information Sciences Institute, University of Southern California

  2. Outline • Semantic Web Overview • Vision, Challenges, Rationales • Semantic Web in SCEC

  3. Semantic Web • coined by Tim Berners-Lee (1997) • “The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation.” (T. Berners-Lee, J. Hendler, O. Lassila, “The Semantic Web”, Scientific American, May 2001)

  4. [Figure: appointment scenario from “The Semantic Web”, Scientific American, May 2001. Lucy's agent and Pete's agent arrange a doctor's appointment for Mom. Labels: Physician's agent, Insurance Co., Rating, Provider sites; required treatment; in-plan? close-by? specialist?; schedule appointment; driving schedule.]

  5. Means to Achieve the Vision • Explicit Ontologies • Needed to understand each other's data (e.g., a joint notion of what a schedule is) • Web Services • Required to actively interconnect systems (e.g., automatically make an appointment)

  6. Technical challenges • Interoperability • Inaccurate, incomplete, heterogeneous data • Unreliable, ill-defined, evolving services • Natural language processing, data mining • make information explicit • Human-computer interaction • querying interfaces, visualization • Scalability • Subsecond performance

  7. Social challenges • Standardization is hard • DublinCore • Bogus or inaccurate metadata • Physician rating, profile • Competition and commoditization • Economic incentive • Chicken and egg • Complexity: developers and users

  8. Jump Starters • Machine Readable Data: • .org (human-edited directory) • .org (Music encyclopedia) • RSS (RDF Site Summary) • (embedded metadata) • CC/PP (Composite Capability/Preference Profiles) • P3P (Platform for Privacy Preferences)

  9. Jump Starters • B2B Vocabulary Projects • PapiNet.org: Vocabulary for Paper Industry • BPMI.org: Vocabulary for exchanging Business Process Models • XML-HR: Vocabularies for human resources (HR) • DMTF (Distributed Management Task Force): Vocabularies for managing enterprises • … • Research Vocabulary Projects • Gene Ontology Working Group • Earth Sciences • MathNet • …

  10. How do we get there? • Research communities: DL, AI, DB, … • Standards bodies: W3C, OMG, … • Non-profit: US, EC, Japan • Industry: IBM, Nokia, HP, Microsoft(?), ... • Business.semanticweb.org

  11. Non-profit • DARPA • “DARPA Agent Markup Language” • since Aug 2000 • NSF • Co-sponsored events (e.g., SWWS) • Further support in the loop • European Council • “Semantic Web Technologies”, Framework 6 • Japan • Interoperability Technology Association for Information Processing, Japan (INTAP) • Links: www.daml.org, www.semanticweb.org/SWWS, www.ontoweb.org, www.net.intap.or.jp/INTAP/

  12. AI: “Add logic to the Web” • Assertions, rules • Agents • Interoperability • First-order logics • Ontologies, description logics • Logic programming, datalog • Problem-solving methods • … Distributed knowledge base

  13. DB: “Everything is syntax” • Semistructured data • Web services • Interoperability • Data integration • Mediation, query rewriting • Model management • Conceptual modeling Conglomerate of distributed heterogeneous (semistructured) databases

  14. Many Previously Unknown Communication Partners

  15. Heterogeneous Data • Too many data formats/languages

  16. Step 1 • Define uniform, underlying syntax • Lowest common denominator: labeled graphs (semi-structured data) -> RDF
      [Figure: two sources, each redrawn as a labeled graph.
       Relational database: table Person with rows (ID 1, L-name Decker, F-name Stefan) and (ID 2, L-name Decker, F-name Birgit).
       Structured text (e.g., vCard): begin: vcard / fn: Stefan / n: Decker;Stefan / end: vcard, drawn as node vcard1 with arcs fn and n.]
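
A minimal Python sketch of this step, assuming the Person table and the vCard shown on the slide. Each source is flattened into the same (node, label, value) arc form. The function names, the "Person/rowN" node identifiers, and the "vcard1" node name are illustrative choices, not part of the original deck.

```python
# Sketch: flatten two source formats into one labeled-graph (triple) form.
# Node identifiers and function names are illustrative assumptions.

def row_to_triples(table, row):
    """Turn one relational row (a dict) into (node, label, value) arcs."""
    node = f"{table}/row{row['ID']}"
    return {(node, column, value) for column, value in row.items()}

def vcard_to_triples(node, text):
    """Turn 'key: value' vCard lines into arcs, skipping begin/end markers."""
    triples = set()
    for line in text.strip().splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key.lower() in ("begin", "end"):
            continue
        triples.add((node, key, value))
    return triples

person_rows = [
    {"ID": 1, "L-name": "Decker", "F-name": "Stefan"},
    {"ID": 2, "L-name": "Decker", "F-name": "Birgit"},
]
vcard_text = """begin: vcard
fn: Stefan
n: Decker;Stefan
end: vcard"""

# One graph now holds data from both sources in the same shape.
graph = set()
for row in person_rows:
    graph |= row_to_triples("Person", row)
graph |= vcard_to_triples("vcard1", vcard_text)

for triple in sorted(graph, key=str):
    print(triple)
```

Once both sources share one shape, a single query or storage layer can serve them, which is the point of choosing labeled graphs as the lowest common denominator.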

  17. XML • Containment, hierarchy • Adjacency (A followed by B) • Attributes (atomic values) • Opaque reference (IDREF) Good for serialization, poor for modeling relational semantics

  18. Encoding of Information • Statement: “The Creator of the Resource http://www.w3.org/Home/Lassila is Ora Lassila” • Graph: http://www.w3.org/Home/Lassila, arc Creator, value Ora Lassila • Endless encoding possibilities in XML:
      <Creator> <uri>http://www.w3.org/Home/Lassila</uri> <name>Ora Lassila</name> </Creator>
      <Document uri="http://www.w3.org/Home/Lassila"> <Creator>Ora Lassila</Creator> </Document>
      <Document uri="http://www.w3.org/Home/Lassila" Creator="Ora Lassila"/>
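
To illustrate why the many possible encodings are a problem, here is a small Python sketch using the standard library's xml.etree.ElementTree. It parses three XML encodings modeled on this slide and shows that each needs its own reader to recover the same statement. The reader function names are hypothetical.

```python
import xml.etree.ElementTree as ET

URI = "http://www.w3.org/Home/Lassila"

# Three encodings of the same statement, modeled on the slide's variants.
as_child_elements = f"<Creator><uri>{URI}</uri><name>Ora Lassila</name></Creator>"
as_nested_element = f'<Document uri="{URI}"><Creator>Ora Lassila</Creator></Document>'
as_attributes = f'<Document uri="{URI}" Creator="Ora Lassila"/>'

def read_child_elements(xml):
    """Resource and value are child elements; the property is the root tag."""
    root = ET.fromstring(xml)
    return (root.findtext("uri"), root.tag, root.findtext("name"))

def read_nested_element(xml):
    """Resource is an attribute; the property is the child tag."""
    root = ET.fromstring(xml)
    child = root[0]
    return (root.get("uri"), child.tag, child.text)

def read_attributes(xml):
    """Resource and value are attributes; the reader must know the property name."""
    root = ET.fromstring(xml)
    return (root.get("uri"), "Creator", root.get("Creator"))

statements = [
    read_child_elements(as_child_elements),
    read_nested_element(as_nested_element),
    read_attributes(as_attributes),
]
for statement in statements:
    print(statement)
```

All three readers return the same (resource, property, value) statement, but none of them works on the other two documents. A consumer has to know the encoding convention in advance.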

  19. Introduction to RDF • RDF (Resource Description Framework) • Beyond machine-readable to machine-understandable • RDF unites a wide variety of stakeholders: • Digital librarians, content-raters, privacy advocates, B2B industries, AI... • Significant (but less than XML) industrial momentum, led by W3C • RDF consists of two parts • RDF Model (a set of triples) • RDF Syntax (different XML serialization syntaxes) • RDF Schema for definition of Vocabularies (simple Ontologies) for RDF (and in RDF)

  20. A Simple Example • Describing Resources • URIs: global OIDs, literals • Binary relationships between objects • Arcs (relationships) are first-class objects • Blank (anonymous) nodes • “Ora Lassila is the creator of the resource http://www.w3.org/Home/Lassila” • Structure • Resource (subject): http://www.w3.org/Home/Lassila • Property (predicate): http://www.schema.org/#Creator • Value (object): "Ora Lassila"
      [Figure: node http://www.w3.org/Home/Lassila, arc s:Creator, value Ora Lassila]
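
The subject/predicate/object structure can be sketched in a few lines of Python. This is an illustration of the RDF model as a set of triples, not an RDF library; the Triple type and the match helper are hypothetical names introduced here.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str    # resource being described
    predicate: str  # property (the arc label)
    obj: str        # value: another resource or a literal

S = "http://www.schema.org/#"  # namespace behind the slide's s:Creator

graph = {
    Triple("http://www.w3.org/Home/Lassila", S + "Creator", "Ora Lassila"),
}

def match(graph, subject=None, predicate=None, obj=None):
    """Return all triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in graph
        if subject in (None, t.subject)
        and predicate in (None, t.predicate)
        and obj in (None, t.obj)
    ]

# "Who is the creator of anything?"
for t in match(graph, predicate=S + "Creator"):
    print(t.subject, "was created by", t.obj)
```

Because every statement has the same three-part shape, one generic pattern matcher replaces the per-encoding readers that plain XML requires.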

  21. RDF • Graph-based universal syntax • Semantics in a global, open environment?
      [Figure: (Agent-) Applications on top of an RDF layer (single data format, query and storage system), which sits on top of the sources: Scheduling Service, Insurance, Ratings, Calendar.]

  22. Step 2: Ontologies • What is an Ontology? “An ontology is a specification of a conceptualization.” (Tom Gruber, 1993) • Ontologies are social contracts • Agreed, explicit semantics • Understandable to outsiders • (Often) derived in a community process • Ontologies require Knowledge Representation • Is_a hierarchy, part of, attributes, axioms

  23. RDF and Ontologies • Idea: Define an Ontology Language by defining predefined nodes and arcs • The Ontology Language itself is just an Ontology • Ontologies are used to tag data from sources

  24. Step 2: Layers on Top of RDF • Tim Berners-Lee: “Axioms, Architecture and Aspirations”, W3C all-working group plenary meeting, 28 February 2001
      [Figure: the Person table graph (rows: ID 1, L-name Decker, F-name Stefan; ID 2, L-name Decker, F-name Birgit) linked to a statement from an ontology: Person subClassOf LivingThing.]

  25. W3C Semantic Web Activity • Working Groups: RDF Core, Web Ontology • Advanced development: Annotation (Annotea), Access control, Calendaring, Collaboration, Logic, Rules, Workflows

  26. RDF Core Working Group • Resource Description Framework (RDF) • Goals • Improve RDF abstract model and XML syntax according to implementors' feedback • Define precise semantics for RDF and RDF Schema • Clarify ties with XML family

  27. Web Ontology Working Group • Standard definition language for ontologies (conceptual models) • Derived from Description Logics • But partial mapping to Database and Datalog possible -> (see Horrocks, Volz, Decker, Grosof: WWW2003) • Extension of RDF Schema and DAML+OIL • Class Expressions (Intersection, Union, Complement) • XML Schema Datatypes • Enumerations • Property Restrictions • Cardinality Constraints • Value Restrictions

  28. The Layer Cake • Tim Berners-Lee: “Axioms, Architecture and Aspirations”, W3C all-working group plenary meeting, 28 February 2001
      [Figure: layer cake diagram, with layers marked as Research Phase, Standardization Phase, or Recommendation Phase.]

  29. SCEC/IT Architecture for a Community Modeling Environment

  30. Tasks within SCEC - CME • Towards an Earth Sciences Ontology: • Cataloging and Unification of Existing Databases • E.g., Fissures and Fault Activity Database • Building a Mediation Environment • Organizing a Community Process • Enriching of Web Services and Grid Infrastructure with Semantics • Service Discovery and Match Making

  31. Fault Activity Database • Hand-Maintained within SCEC (Sue Perry) • Re-engineering of the Database Schemata
      <rdfs:Class rdf:about="&FAD_v1;AVG_RECURRENCE_INTERVAL" rdfs:label="AVG_RECURRENCE_INTERVAL">
        <a:_slot_constraints rdf:resource="&FAD_v1;SCFADsep_02_00106"/>
        <rdfs:subClassOf rdf:resource="&rdfs;Resource"/>
      </rdfs:Class>
      <rdfs:Class rdf:about="&FAD_v1;AVG_SLIP_PER_EVENT" rdfs:label="AVG_SLIP_PER_EVENT">
        <rdfs:subClassOf rdf:resource="&rdfs;Resource"/>
      </rdfs:Class>
      <rdfs:Class rdf:about="&FAD_v1;AVG_SLIP_PER_EVENT_METHOD" rdfs:label="AVG_SLIP_PER_EVENT_METHOD">
        <rdfs:subClassOf rdf:resource="&rdfs;Resource"/>
      </rdfs:Class>
      <rdf:Property rdf:about="&FAD_v1;CFM-A_coord_file_URL" a:maxCardinality="1" rdfs:label="CFM-A_coord_file_URL">
        <rdfs:domain rdf:resource="&FAD_v1;FAULT"/>
        <rdfs:range rdf:resource="&rdfs;Literal"/>
      </rdf:Property>

  32. Planned: Mediation Environment with RDF-based Rule Language
      [Figure: Applications on top of a mediation layer (Mediation with RDF-based Rule Language), which sits on top of the sources: Fault Activity Database, Fissures, Grid Services.]

  33. Motivation: Why Rule Languages for the Web • Plethora of data available • Data needs to be adapted and combined • “Time to Market”: Faster to write rules than code • Data Transformation and Integration • Logic specification, not programming • Tabled evaluation/bottom-up evaluation • Semi-structured data • Multiple semantics (Relational Data, UML, ER, TopicMaps, DAML+OIL, XML-Schema, special purpose data models) • Distributed, heterogeneous sources

  34. What’s Wrong With Existing Approaches? • Built-in semantics (e.g. SiLRI, RQL, DQL) • but: many RDF-based languages with different semantics (DAML+OIL, RDF Schema, UML/RDF, TopicMaps/RDF, DMTF, …) • For each language a specialized query language ????

  35. TRIPLE: Language Overview • Native support for • Resources & namespaces • Abbreviations • Models (sets of RDF statements) • Reification • Rules with expressive bodies (full FOL syntax) • Inspired by F-Logic: • subject[predicate -> object] (“molecule”)

  36. Language Description I
      • Namespace and resource abbreviations:
          rdf := "http://www.w3.org/1999/02/22-rdf-syntax-ns#".
          isa := rdf:subClassOf.
      • Statements, triples, molecules:
          subject[predicate -> object]
          subject[p1 -> o1; p2 -> o2; ...]
          s1[p1 -> s2[p2 -> o]]
      • Models, model expressions, parameterized models:
          s[p -> o]@m                “triple <s,p,o> in model m”
          s[p -> o]@(m1 ∩ m2)        model intersection, union, diff.
          s[p -> o]@sf(m1, X, Y)     Skolem function
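
Since a model is a set of RDF statements, model expressions can be pictured with ordinary set operations. The Python sketch below is an analogy under that assumption, not the TRIPLE implementation; the sample statements and the holds helper are made up for illustration, and parameterized (Skolem function) models are not covered.

```python
# Two models, each a set of (subject, predicate, object) statements.
m1 = {
    ("db:d_01_01", "dc:creator", "Stefan Decker"),
    ("db:d_01_01", "dc:subject", "RDF"),
}
m2 = {
    ("db:d_01_01", "dc:subject", "RDF"),
    ("db:d_01_01", "dc:subject", "triples"),
}

def holds(s, p, o, model):
    """Analogue of s[p -> o]@model: is the triple <s,p,o> in the model?"""
    return (s, p, o) in model

# Model expressions map onto set operators.
intersection = m1 & m2
union = m1 | m2
difference = m1 - m2

print(holds("db:d_01_01", "dc:subject", "RDF", intersection))        # True
print(holds("db:d_01_01", "dc:creator", "Stefan Decker", m2))        # False
print(holds("db:d_01_01", "dc:creator", "Stefan Decker", difference))  # True
```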

  37. Language Description II
      • Reification:
          stefan[believes -> <Ora[isAuthorOf -> homepage]>]
      • Logical formulae:
          usual logical connectives and quantifiers (AND, OR, NOT, ->, <-, FORALL, EXISTS)
          all variables introduced via FORALL (or EXISTS)
      • Clauses:
          facts: s[p1 -> o1; p2 -> o2; ...].
          rules: FORALL X s1[p1 -> X] <- s2[p2 -> X] AND ... .
      • Model blocks:
          @model { clauses }
          FORALL Mdl @model(Mdl) { clauses }

  38. TRIPLE Example: Dublin Core
      namespace abbreviations:
          dc := "http://purl.org/dc/elements/1.0/".
          db := "http://www-db.stanford.edu/".
          ...
      model block with a fact:
          @db:documents {
            db:d_01_01[ dc:title -> TRIPLE; dc:creator -> "Stefan Decker"; dc:subject -> RDF; dc:subject -> triples; ... ].
          }
      rule:
          FORALL N p(N)[rdf:type -> xyz:Person; xyz:name -> N] <- EXISTS D D[dc:creator -> N].
      query “find all names”:
          FORALL N <- EXISTS P P[rdf:type -> xyz:Person; xyz:name -> N]@db:documents.
      result:
          N = "Stefan Decker"
      [Figure: the fact as RDF triples (db:d_01_01 with arcs dc:title TRIPLE, dc:creator Stefan Decker, dc:subject RDF, dc:subject triples, ...) and the derived Person node with arcs rdf:type and name (Stefan Decker).]
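
The effect of the rule and the query on this slide can be imitated in plain Python. This is a hand-written sketch of one rule application, not a TRIPLE engine: the function names are hypothetical, and the Skolem term p(N) is represented as the tuple ("p", N).

```python
# Facts from the db:documents model block (the elided "..." entries are omitted).
documents = {
    ("db:d_01_01", "dc:title", "TRIPLE"),
    ("db:d_01_01", "dc:creator", "Stefan Decker"),
    ("db:d_01_01", "dc:subject", "RDF"),
    ("db:d_01_01", "dc:subject", "triples"),
}

def apply_person_rule(model):
    """For every D[dc:creator -> N], derive a person node p(N) with type and name."""
    derived = set(model)
    for _document, predicate, name in model:
        if predicate == "dc:creator":
            person = ("p", name)  # stands in for the Skolem term p(N)
            derived.add((person, "rdf:type", "xyz:Person"))
            derived.add((person, "xyz:name", name))
    return derived

def find_all_names(model):
    """Names N such that some P has rdf:type xyz:Person and xyz:name N."""
    persons = {s for s, p, o in model if p == "rdf:type" and o == "xyz:Person"}
    return sorted(o for s, p, o in model if p == "xyz:name" and s in persons)

print(find_all_names(apply_person_rule(documents)))  # ['Stefan Decker']
```

The rule turns a literal that only appears as a dc:creator value into a first-class Person resource, so later queries can ask about people rather than about documents.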

  39. Example: Specification of RDF Schema Semantics
      namespace abbreviations:
          rdf := 'http://www.w3.org/...rdf-syntax-ns#'.
          rdfs := 'http://www.w3.org/.../PR-rdf-schema-...#'.
      resource abbreviations:
          type := rdf:type.
          subPropertyOf := rdfs:subPropertyOf.
          subClassOf := rdfs:subClassOf.
      model block:
          FORALL Mdl @rdfschema(Mdl) {
            FORALL O,P,V O[P->V] <- O[P->V]@Mdl.                  (“copy” triples from Mdl)
            FORALL O,V O[subClassOf->V] <-
              EXISTS W (O[subClassOf->W] AND W[subClassOf->V]).   (transitivity of subClassOf)
            …
          }

  40. Example: Cars Ontology with RDF Schema Semantics
      @cars {
        xyz:MotorVehicle[rdfs:subClassOf -> rdfs:Resource].
        xyz:PassengerVehicle[rdfs:subClassOf -> xyz:MotorVehicle].
        xyz:Truck[rdfs:subClassOf -> xyz:MotorVehicle].
        xyz:Van[rdfs:subClassOf -> xyz:MotorVehicle].
        xyz:MiniVan[ rdfs:subClassOf -> xyz:Van; rdfs:subClassOf -> xyz:PassengerVehicle].
      }
      [Figure: class hierarchy with xyz:MotorVehicle at the top; xyz:Truck, xyz:Van, and xyz:PassengerVehicle below it; xyz:MiniVan below both xyz:Van and xyz:PassengerVehicle.]
      Query against the plain model:
          FORALL X <- X[rdfs:subClassOf -> xyz:MotorVehicle]@cars.
          X = xyz:Van
          X = xyz:Truck
          X = xyz:PassengerVehicle
      Query against the model with RDF Schema semantics:
          FORALL X <- X[rdfs:subClassOf -> xyz:MotorVehicle]@rdfschema(cars).
          X = xyz:Van
          X = xyz:Truck
          X = xyz:PassengerVehicle
          X = xyz:MiniVan
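
The difference between the two queries can be reproduced with a small bottom-up (fixpoint) evaluation in Python. This sketch covers only the two rules named on the RDF Schema semantics slide, copying triples and transitivity of subClassOf; it is an illustration written for this transcript, not the TRIPLE engine, and the function names are hypothetical.

```python
SUBCLASS = "rdfs:subClassOf"

# The cars model from the slide as (subject, predicate, object) triples.
cars = {
    ("xyz:MotorVehicle", SUBCLASS, "rdfs:Resource"),
    ("xyz:PassengerVehicle", SUBCLASS, "xyz:MotorVehicle"),
    ("xyz:Truck", SUBCLASS, "xyz:MotorVehicle"),
    ("xyz:Van", SUBCLASS, "xyz:MotorVehicle"),
    ("xyz:MiniVan", SUBCLASS, "xyz:Van"),
    ("xyz:MiniVan", SUBCLASS, "xyz:PassengerVehicle"),
}

def rdfschema(model):
    """Copy the model, then add transitive subClassOf arcs until nothing changes."""
    closed = set(model)  # rule 1: copy all triples from the source model
    while True:
        # rule 2: O subClassOf V if O subClassOf W and W subClassOf V
        new = {
            (o, SUBCLASS, v)
            for (o, p1, w) in closed if p1 == SUBCLASS
            for (w2, p2, v) in closed if p2 == SUBCLASS and w2 == w
        } - closed
        if not new:
            return closed
        closed |= new

def subclasses_of(model, cls):
    """Analogue of FORALL X <- X[rdfs:subClassOf -> cls]@model."""
    return sorted(s for s, p, o in model if p == SUBCLASS and o == cls)

print(subclasses_of(cars, "xyz:MotorVehicle"))
# ['xyz:PassengerVehicle', 'xyz:Truck', 'xyz:Van']
print(subclasses_of(rdfschema(cars), "xyz:MotorVehicle"))
# ['xyz:MiniVan', 'xyz:PassengerVehicle', 'xyz:Truck', 'xyz:Van']
```

Keeping the semantics in a separate function mirrors the parameterized model rdfschema(Mdl): the same data can be queried with or without the schema rules, and a different language's semantics could be swapped in without touching the data.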

  41. Grid Computing and Web Services (ongoing) • Matchmaking between Jobs and Resources • Hard-Coded in Globus Toolkit • Reengineering using an ontology- and rule-based solution • RDF and DMTF Vocabulary (www.dmtf.org)
      <rdfs:Class rdf:ID="CIM_ComputerSystem">
        <rdfs:subClassOf rdf:resource="#CIM_System"/>
        <version><![CDATA["2.6.0"]]></version>
        <rdfs:comment parseType="Literal"><![CDATA["A class derived from System that is a special collection of ManagedSystemElements. This collection provides compute capabilities and serves as aggregation point to associate one or more of the following elements: FileSystem, OperatingSystem, Processor and Memory (Volatile and/or NonVolatile Storage)."]]></rdfs:comment>
        <rdfs:subClassOf>
          <daml:Restriction>
            <daml:toClass rdf:resource="#string"/>
            <daml:onProperty>
              <daml:DatatypeProperty rdf:ID="NameFormat">
                <daml:toClass rdf:resource="http://www.w3.org/2001/XMLSchema#string"/>
              </daml:DatatypeProperty>
            </daml:onProperty>
          </daml:Restriction>
        </rdfs:subClassOf>
      </rdfs:Class>

  42. Semantic Web and Earth Sciences • The Semantic Web field provides technologies for making vocabularies explicit and mediating data • Standards-based, many resources available • Editors, Rule Engines, APIs • Effort feeds back into other domains
