
Introduction to Semantic Web What? Why? How? So far? Next?

Frank van Harmelen, AI Department, Vrije Universiteit Amsterdam. Creative Commons License: allowed to share & remix, but must attribute & non-commercial.


Presentation Transcript


  1. Introduction to Semantic Web: What? Why? How? So far? Next? • Frank van Harmelen, AI Department, Vrije Universiteit Amsterdam • Creative Commons License: allowed to share & remix, but must attribute & non-commercial

  2. Who am I • Frank van Harmelen • Prof in AI at Vrije Universiteit Amsterdam • Knowledge Representation • Early Semantic Web projects (from 1999 onwards) • Co-designed OWL • Tech advisor of Aduna (Sesame) • Scientific Director of LarKC (Large Knowledge Collider) • I know nothing about image analysis…

  3. Who are you? • who knows roughly what Semantic Web is? • who has heard of RDF & OWL? • who has studied RDF & OWL? • who has used RDF & OWL? • who expects ever to use RDF & OWL? • who is a logician? • who is a KR researcher? • who is a Web researcher? • who is an image researcher?

  4. General idea of the Semantic Web

  5. General idea of Semantic Web • Make current web more machine accessible (currently all the intelligence is in the user) • Motivating use-cases • search • personalisation • semantic linking • data integration • web services • ...

  6. General idea of Semantic Web • Make current web more machine accessible (currently all the intelligence is in the user) • Do this by: • Making data and meta-data available on the Web in machine-understandable form (formalised) • Structuring the data and meta-data in ontologies • These are non-trivial design decisions. Alternative would be:

  7. What’s wrong with the Web? • Today: linked web pages, written by people, written for people, used only by people... (a web page in English about Frank, another web page about Frank, a page about the Vrije Universiteit, a page about LarKC, a page about Stefano) • Many of these pages already come from data, usable by computers! • Goal: linked data, usable by computers, useful for people! • But we can’t link the data....

  8. Semantic Web = "Web of Data" (Tim Berners-Lee) • expose data on the web (“facts”) in interoperable form (RDF) • expose knowledge on the web with interoperable semantics (ontologies, RDF Schema, OWL) • Apply lightweight inference for • Interoperability • Query answering • Search • Unexpected reuse • …

  9. Not just data, also knowledge • All of this: • Low expressivity logic (RDF) • That allows some inference: property inheritance, domain/range inference • Some of this: • Medium expressive logic (OWL) • That allows more inference: (in)equality, number restrictions, datatypes

  10. Desideratum: On the Web of Data, anyone can say anything about anything • Need for total decoupling of • data • vocabulary • meta-data • (Diagram: x, T and the statement [<x> IsOfType <T>], e.g. with <village>, have different owners & locations)

  11. Two versions of Semantic Web story: • V1: Semantic Web = annotated Web: 1 & 2 are embedded in text & images on the Web • V2: Semantic Web = Web of Data: 1 & 2 live in dedicated repositories (triple stores) • (Diagram: x, T and the statement [<x> IsOfType <T>], e.g. with <village>, have different owners & locations)

  12. Why is this hard?

  13. Machine accessible meaning (What it’s like to be a machine) • (Diagram: META-DATA tags <treatment>, <name>, <symptoms>, <drug>, <disease>, <drug administration>, connected by the relations alleviates and IS-A)

  14. What is meta-data? • it's just data • it's data describing other data • it's meant for machine consumption • (Diagram labels: name, symptoms, disease, drug administration)

  15. What is required?

  16. Required are: • one or more standard vocabularies • so search engines, producers and consumers all speak the same language • a standard syntax • so meta-data can be recognised as such • lots of resources with meta-data attached

  17. Bluffer’s Guide to RDF & RDF Schema

  18. Bluffer’s Guide to RDF • Express relations between things: Subject, Predicate, Object • Results in labelled network (“graph”) • All labels are actually web-addresses (URIs) • You can “ping” any label and find out more • Bits of the graph can live at physically different locations & have different owners • (Diagram: a graph with nodes Frank, x, y, MIT and edges labelled AuthorOf and publishedBy)
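
The graph idea on this slide can be sketched in a few lines of Python. This is a minimal illustration, not an RDF library: the `http://example.org/` URIs are made up, and the exact edges (Frank is author of x and y, x is published by MIT) are an assumption read off the slide's diagram labels.

```python
# Each statement is a (subject, predicate, object) triple; every label is a URI.
# The example.org namespace is hypothetical, used only for illustration.
EX = "http://example.org/"

# Two fragments of the graph, imagined to live at different locations
# and to belong to different owners.
source_a = {
    (EX + "Frank", EX + "AuthorOf", EX + "x"),
    (EX + "Frank", EX + "AuthorOf", EX + "y"),
}
source_b = {
    (EX + "x", EX + "publishedBy", EX + "MIT"),
}

# Because all labels are global identifiers, merging is plain set union.
graph = source_a | source_b


def objects(graph, subject, predicate):
    """Return all o such that (subject, predicate, o) is in the graph."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def publishers_of_author(graph, author):
    """Follow an AuthorOf edge and then a publishedBy edge from an author."""
    found = set()
    for work in objects(graph, author, EX + "AuthorOf"):
        found.update(objects(graph, work, EX + "publishedBy"))
    return sorted(found)


print(publishers_of_author(graph, EX + "Frank"))
```

Neither source alone connects Frank to a publisher; the connection only appears once the two fragments are merged.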

  19. Bluffer’s Guide to RDF Schema • types for subjects & objects & predicates • Types organised in a hierarchy • Inheritance of properties • (Diagram: types person, artifact, publisher, book, author, man placed above the graph with nodes Frank, x, y, MIT and edges AuthorOf and publishedBy)

  20. So what’s special about RDF(S)? • statements about an identifier can be distributed: <owl:Individual ID="CENTRAL-COAST" /> <owl:Individual rdf:about="CENTRAL-COAST"> <type rdf:resource="#CALIFORNIA-REGION"/> </owl:Individual> • no unique name assumption • no closed world assumption • Remember web-style decoupling
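
What the two missing assumptions mean for a consumer of the data can be sketched as follows. This is a toy illustration with plain strings as identifiers; the `differentFrom` predicate name and the alternative spelling `CentralCoast` are assumptions introduced for the example.

```python
# Statements about one identifier, published by two (hypothetical) sources.
source_1 = {("CENTRAL-COAST", "type", "Individual")}
source_2 = {("CENTRAL-COAST", "type", "CALIFORNIA-REGION")}
graph = source_1 | source_2


def ask(graph, triple):
    """Open-world lookup: a missing statement is unknown, not false."""
    return "yes" if triple in graph else "unknown"


def may_be_same(graph, a, b):
    """Without a unique name assumption, two names may denote the same
    thing unless the graph explicitly states that they are different."""
    stated_different = (
        (a, "differentFrom", b) in graph or (b, "differentFrom", a) in graph
    )
    return not stated_different


print(ask(graph, ("CENTRAL-COAST", "type", "CALIFORNIA-REGION")))
print(ask(graph, ("CENTRAL-COAST", "type", "OREGON-REGION")))
print(may_be_same(graph, "CENTRAL-COAST", "CentralCoast"))
```

A closed-world system such as a typical database would answer "no" to the second question; here the answer stays open because another source may still add the statement.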

  21. Remember: • Need for total decoupling of • data • vocabulary • meta-data • (Diagram: x, T and the statement [<x> IsOfType <T>], e.g. with <village>, have different owners & locations)

  22. RDF(S) has a (very small) formal semantics • Defines what other statements are implied by a given set of RDF(S) statements • Ensures mutual agreement on minimal content between parties without further contact • In the form of “entailment rules” • Very simple to compute (and not explosive in practice)

  23. RDF(S) semantics: examples • Aspirin isOfType Painkiller + Painkiller subClassOf Drug ⇒ Aspirin isOfType Drug • aspirin alleviates headache + alleviates range symptom ⇒ headache isOfType symptom

  24. RDF(S) semantics: examples • Aspirin isOfType Painkiller + Painkiller subClassOf Drug ⇒ Aspirin isOfType Drug • aspirin alleviates headache + alleviates range symptom ⇒ headache isOfType symptom

  25. RDF(S) semantics • X R Y + R domain T ⇒ X IsOfType T • X R Y + R range T ⇒ Y IsOfType T • T1 SubClassOf T2 + T2 SubClassOf T3 ⇒ T1 SubClassOf T3 • X IsOfType T1 + T1 SubClassOf T2 ⇒ X IsOfType T2 • Semantics = predictable inference
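
The four rules on this slide can be applied mechanically until nothing new is derived. The sketch below is a naive forward-chaining loop over plain string triples, written for readability rather than speed; the function name `rdfs_closure` and the spelling of the predicates are choices made for this example, not part of any standard API.

```python
def rdfs_closure(triples):
    """Apply the four entailment rules until no new triple appears."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            for s2, p2, o2 in inferred:
                # X R Y + R domain T  =>  X isOfType T
                if p2 == "domain" and s2 == p:
                    new.add((s, "isOfType", o2))
                # X R Y + R range T  =>  Y isOfType T
                if p2 == "range" and s2 == p:
                    new.add((o, "isOfType", o2))
                # T1 subClassOf T2 + T2 subClassOf T3  =>  T1 subClassOf T3
                if p == "subClassOf" and p2 == "subClassOf" and o == s2:
                    new.add((s, "subClassOf", o2))
                # X isOfType T1 + T1 subClassOf T2  =>  X isOfType T2
                if p == "isOfType" and p2 == "subClassOf" and o == s2:
                    new.add((s, "isOfType", o2))
        if new <= inferred:
            return inferred
        inferred |= new


facts = {
    ("Aspirin", "isOfType", "Painkiller"),
    ("Painkiller", "subClassOf", "Drug"),
    ("Aspirin", "alleviates", "Headache"),
    ("alleviates", "range", "Symptom"),
}

derived = rdfs_closure(facts) - facts
print(sorted(derived))
```

On the aspirin example the loop derives exactly the two conclusions from the examples slides: Aspirin is a Drug and Headache is a Symptom.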

  26. Bluffer’s Guide to OWL

  27. OWL: things RDF Schema can’t do • equality • enumeration • number restrictions • Single-valued/multi-valued • Optional/required values • inverse, symmetric, transitive • boolean algebra • Union, complement • …

  28. Layered language • OWL Lite: • Classification hierarchy • Simple constraints • OWL DL: • Maximal expressiveness • While maintaining tractability • Standard formalisation • OWL Full: • Very high expressiveness • Losing tractability • Non-standard formalisation • All syntactic freedom of RDF (self-modifying) • (Diagram: nested layers Lite, DL, Full, showing syntactic layering and semantic layering)

  29. Language Layers • OWL Lite • (sub)classes, individuals • (sub)properties, domain, range • conjunction • (in)equality • cardinality 0/1 • datatypes • inverse, transitive, symmetric • hasValue • someValuesFrom • allValuesFrom • OWL DL • Negation • Disjunction • Full Cardinality • Enumerated types • OWL Full • Allow meta-classes etc • (Diagram: layers RDF Schema, Lite, DL, Full)

  30. Backward compatibility with RDF <owl:Class rdf:ID="City"> <rdfs:subClassOf rdf:resource="#GeographicEntity"/> <rdfs:subClassOf> <owl:Restriction> <owl:onProperty rdf:resource="#ruler"/> <owl:allValuesFrom rdf:resource="#Mayor"/> </owl:Restriction> </rdfs:subClassOf> </owl:Class> • OWL agents understand everything…

  31. Backward compatibility with RDF <owl:Class rdf:ID="City"> <rdfs:subClassOf rdf:resource="#GeographicEntity"/> <daml:subClassOf> <daml:Restriction> <daml:onProperty rdf:resource="#ruler"/> <daml:toClass rdf:resource="#Mayor"/> </daml:Restriction> </daml:subClassOf> </owl:Class> • OWL agents understand everything… • … others still the most important aspects

  32. OWL also has a formal semantics • Defines what other statements are implied by a given set of statements • Ensures mutual agreement on content (both minimal and maximal) between parties without further contact • Can be used for integrity/consistency checking • Hard to compute (and rarely/sometimes/always explosive in practice)

  33. OWL semantics: minimal • vanGogh isOfType Impressionist + Impressionist subClassOf Painter ⇒ vanGogh isOfType Painter • vanGogh painter-of sunflowers + painter-of domain painter ⇒ vanGogh isOfType painter

  34. OWL semantics: maximal • vanGogh isOfType Impressionist + Impressionist disjointFrom Cubist ⇒ NOT: vanGogh isOfType Cubist • painted-by has-cardinality 1 + sun-flowers painted-by vanGogh + Picasso different-individual-from vanGogh ⇒ NOT: sun-flowers painted-by Picasso
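
The "maximal" side of the semantics, statements that can no longer be added without contradiction, can be sketched as a small consistency check. This is a toy checker covering only the two constructs on the slide (class disjointness and cardinality 1); the predicate spellings and the helper names `is_consistent` and `is_ruled_out` are assumptions for this example, and a real OWL reasoner does far more.

```python
def is_consistent(triples):
    """Check disjointness and cardinality-1 constraints over a set of triples."""
    # Collect the classes asserted for each individual.
    types = {}
    for s, p, o in triples:
        if p == "isOfType":
            types.setdefault(s, set()).add(o)

    # Disjoint classes may not share an individual.
    for a, p, b in triples:
        if p == "disjointFrom":
            if any(a in classes and b in classes for classes in types.values()):
                return False

    # A cardinality-1 property may not have two values known to be different.
    single_valued = {s for s, p, o in triples if p == "hasCardinality" and o == 1}
    different = set()
    for s, p, o in triples:
        if p == "differentFrom":
            different.update({(s, o), (o, s)})
    for s, p, o in triples:
        if p in single_valued:
            for s2, p2, o2 in triples:
                if s2 == s and p2 == p and (o, o2) in different:
                    return False
    return True


def is_ruled_out(facts, candidate):
    """A candidate is ruled out if adding it makes consistent facts inconsistent."""
    return is_consistent(facts) and not is_consistent(facts | {candidate})


facts = {
    ("vanGogh", "isOfType", "Impressionist"),
    ("Impressionist", "disjointFrom", "Cubist"),
    ("paintedBy", "hasCardinality", 1),
    ("sunflowers", "paintedBy", "vanGogh"),
    ("Picasso", "differentFrom", "vanGogh"),
}

print(is_ruled_out(facts, ("vanGogh", "isOfType", "Cubist")))
print(is_ruled_out(facts, ("sunflowers", "paintedBy", "Picasso")))
```

Note the role of the `differentFrom` statement: without it, and without a unique name assumption, a second value for `paintedBy` would not be a contradiction, since Picasso and vanGogh could be two names for the same individual.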

  35. Remember: Required are • standard vocabularies • a standard syntax • lots of resources with meta-data attached

  36. Ontologies: real life examples • handcrafted • music: CDnow (2410/5), MusicMoz (1073/7) • biomedical: SNOMED (200k), GO (15k), Emtree (45k+190k), Systems biology • ranging from lightweight (Yahoo, UNSPSC, Open directory (400k)) to heavyweight (Cyc (300k)) • ranging from small (METAR) to large (UNSPSC)

  37. Biomedical ontologies (a few..) • MeSH • Medical Subject Headings, National Library of Medicine • 22,000 descriptions • EMTREE • Commercial Elsevier, Drugs and diseases • 45,000 terms, 190,000 synonyms • UMLS • Integrates 100 different vocabularies • SNOMED • 200,000 concepts, College of American Pathologists • Gene Ontology • 15,000 terms in molecular biology • NCBI Cancer Ontology • 17,000 classes (about 1M definitions)

  38. Remember: Required are • standard vocabularies • a standard syntax • lots of resources with meta-data attached

  39. Who makes the meta-data? • Don’t throw away what we already have: • Databases (Amazon.com) • Navigation structures • meta-data in documents • Office, Acrobat, MP3, JPG • As a spin-off of what we already do • MIT Media Lab photo annotator • Automated analysis • Text, Images, Video

  40. Summary so far

  41. Linked Data/Semantic Web • Identification • Uniform Resource Identifier (URI) • Global identifier (NB: persistent!) • Looks like a URL, is often an Internationalized Resource Identifier (IRI) • Description • Resource Description Framework (RDF) • RDF Schema (RDFS) • Simple Knowledge Organization System (SKOS) • Web Ontology Language (OWL) • Querying • RDF Triple stores • SPARQL Query Language
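
The querying layer can be illustrated without a triple store. The sketch below matches a list of triple patterns against an in-memory set of triples, which is roughly the idea behind a SPARQL basic graph pattern; it is not a SPARQL implementation, and the `match` function, the short string identifiers and the sample data are assumptions for this example.

```python
def match(graph, patterns, binding=None):
    """Yield variable bindings that satisfy every triple pattern.

    Terms starting with "?" are variables; all other terms must match exactly.
    """
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in graph:
        candidate = dict(binding)
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against an earlier binding.
                if candidate.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield from match(graph, rest, candidate)


graph = {
    ("Frank", "AuthorOf", "x"),
    ("Frank", "AuthorOf", "y"),
    ("x", "publishedBy", "MIT"),
}

# Roughly: SELECT ?work ?publisher
#          WHERE { :Frank :AuthorOf ?work . ?work :publishedBy ?publisher }
query = [
    ("Frank", "AuthorOf", "?work"),
    ("?work", "publishedBy", "?publisher"),
]
results = list(match(graph, query))
print(results)
```

The shared variable `?work` is what joins the two patterns, so only the work that has both an author and a publisher is returned.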

  42. What does RDF look like? • The data model is a (directed) graph • Each data item is a ‘resource’ with a URI as identifier • Each property is a binary relation: • a ‘triple’ • Between resources: <subjectURI, predicateURI, objectURI> • Between a resource and a ‘literal’: <subjectURI, predicateURI, “literal value”>
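
The two kinds of triple on this slide can be written out in the line-based N-Triples syntax, where URIs go in angle brackets and literals in double quotes. The sketch below is a simplified serializer: the `Literal` class and `to_ntriples` function are names invented for this example, the `example.org` URIs are made up, and language tags, datatypes and blank nodes are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """A plain string value, as opposed to a resource identified by a URI."""
    value: str


def to_ntriples(triples):
    """Serialize (subject, predicate, object) triples as N-Triples lines."""
    lines = []
    for s, p, o in triples:
        if isinstance(o, Literal):
            escaped = (
                o.value.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
            )
            obj = f'"{escaped}"'
        else:
            obj = f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


EX = "http://example.org/"  # hypothetical namespace
triples = [
    (EX + "Amsterdam", EX + "isOfType", EX + "City"),   # resource to resource
    (EX + "Amsterdam", EX + "name", Literal("Amsterdam")),  # resource to literal
]
print(to_ntriples(triples))
```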

  43. Why is this a Web of data? • Global unique identifiers • Reuse of identifiers in other datasets • For data: (two sources say something about ‘Amsterdam’) • For schema: (two sources each use the same concept ‘City’) • This reuse builds “links” between datasets
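
How reuse of identifiers creates links can be shown with two tiny datasets. This is a sketch under stated assumptions: the `example.org` URIs, the predicate names and the contents of both datasets are invented for illustration.

```python
EX = "http://example.org/"  # hypothetical shared namespace

# Two independently published datasets.
dataset_a = {
    (EX + "Amsterdam", EX + "isOfType", EX + "City"),
    (EX + "Amsterdam", EX + "locatedIn", EX + "Netherlands"),
}
dataset_b = {
    (EX + "Rijksmuseum", EX + "locatedIn", EX + "Amsterdam"),
    (EX + "Utrecht", EX + "isOfType", EX + "City"),
}


def resources(dataset):
    """All identifiers used in subject or object position."""
    return {term for s, p, o in dataset for term in (s, o)}


def shared_identifiers(a, b):
    """Identifiers reused by both datasets: these are the 'links'."""
    return resources(a) & resources(b)


links = shared_identifiers(dataset_a, dataset_b)
print(sorted(links))
```

Here `Amsterdam` is reuse at the data level and `City` is reuse at the schema level; with locally scoped names instead of global identifiers, neither overlap could be detected by a simple intersection.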

  44. Does this work in practice?

  45. Linked Open Data cloud • already many billions of facts & rules • May ‘09 estimate: > 4.2 billion triples + 140 million interlinks • any CD ever recorded (almost) • life-science databases • basic facts on every country on the planet • hierarchical dictionaries (UK, FR, NL) • common sense rules & facts (100,000s) • scientific bibliographies • names of artists & art works (10,000s) • geographic names (millions) • encyclopedia • It gets bigger every month

  46. It gets bigger every month

  47. And remember: not just data • All of this: • Low expressivity logic (RDF/RDFS) • That allows some inference: property inheritance, domain/range inference • Some of this: • Medium expressive logic (OWL) • That allows more inference: (in)equality, number restrictions, datatypes

  48. Nice in the lab, but are you getting anywhere in practice?

  49. Semantic Web News Quiz • Google • Reuters • New York Times • Microsoft • Zemanta • Obama Government • BBC (music, worldcup, wildlife) • BestBuy.com • Facebook

  50. Challenges
