
Metadata



Presentation Transcript


  1. Metadata (CS3352)
     • The Semantic Web
     • Directories and Thesauri
     • XML is not enough
     • Topic maps
     • RDF

  2. Sources of Knowledge for finding documents [DeRose99]
     • “The user, including their current explicit query and any historical or profile information the system may have gained earlier.
     • The documents in the library or on the web, including their nominal "content" and whatever metadata has been attached.
     • The world, about which the system may have certain information, such as dictionaries and thesauri of natural language terms; basic knowledge of object categories ("dog is-a animal"), and much more…”
     [Diagram labels: Text, image; Mark-up, Links, Catalogue database; Ontologies, Thesauri; Knowledge]

  3. What is metadata?
     • Data cataloguing resources
       • Administrative cataloguing: acquisition history, author…
       • Structural: size, image format…
     • Data describing the content and meaning of resources
     [Image annotation example: royal, UK, male, trophy presenter; footballer, trophy winner]

  4. Metadata Representation
     • Expressive, so we can say what we want;
     • Compositional, so that we can build complex terms out of simple pieces;
     • Controlled, so we only say consistent and coherent things;
     • Incremental, so we can keep adding descriptions.

  5. Dublin Core
     • A standard for metadata defined by the digital library community
     • Others: MARC, VRA…
     • 15 Elements: Title, Subject, Description, Creator, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language, Relation, Coverage, Rights
     • Core elements defined in RFC 2413:
       • http://src.doc.ic.ac.uk/computing/internet/rfc/rfc2413.txt
       • http://www.ariadne.ac.uk
       • http://www.ukoln.ac.uk
     From: Metadata for images, Michael Day, http://www.ukoln.ac.uk
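
A Dublin Core description can be pictured as a mapping from element names to values. The following is a minimal Python sketch under that assumption: the element names are the 15 listed on the slide, while the helper `unknown_elements` and the sample record are hypothetical illustrations and not part of any Dublin Core tooling.

```python
# The 15 Dublin Core elements, as listed on the slide.
DUBLIN_CORE_ELEMENTS = frozenset({
    "Title", "Subject", "Description", "Creator", "Publisher",
    "Contributor", "Date", "Type", "Format", "Identifier",
    "Source", "Language", "Relation", "Coverage", "Rights",
})


def unknown_elements(record):
    """Return the sorted keys of a record that are not Dublin Core elements."""
    return sorted(key for key in record if key not in DUBLIN_CORE_ELEMENTS)


# Hypothetical record, invented for illustration.
record = {
    "Title": "Being a Dog Is a Full-Time Job",
    "Creator": "Charles M. Schulz",
    "Identifier": "ISBN 0836217462",
    "Colour": "black and white",  # deliberately not a Dublin Core element
}

print(unknown_elements(record))
```

A controlled element set like this is what lets independent catalogues be searched with the same field names.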

  6. Metadata on the web yesterday
     • Meta tags
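
To make the idea of meta tags concrete, here is a small sketch that reads `<meta name="..." content="...">` pairs from an HTML page using Python's standard `html.parser` module. The page text and the names `MetaTagParser` and `extract_meta` are assumptions made up for this illustration.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collect name/content pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name is not None and content is not None:
            self.metadata[name.lower()] = content


# Hypothetical page, invented for illustration.
PAGE = """<html><head>
<title>Olympic rowing</title>
<meta name="keywords" content="rowing, Olympic Games, coxless fours">
<meta name="description" content="Results of Olympic rowing events">
</head><body></body></html>"""


def extract_meta(html_text):
    """Parse the page and return its meta name -> content mapping."""
    parser = MetaTagParser()
    parser.feed(html_text)
    parser.close()
    return parser.metadata


meta = extract_meta(PAGE)
print(meta["keywords"])
```

The values are free text chosen by the page author, so nothing guarantees that two pages use the same keywords for the same thing.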

  7. Metadata on the Web yesterday
     <?xml version="1.0" encoding="utf-8"?>
     <book isbn="0836217462">
       <title>Being a Dog Is a Full-Time Job</title>
       <author>Charles M. Schulz</author>
       <character>
         <name>Snoopy</name>
         <friend-of>Peppermint Patty</friend-of>
         <since>1950-10-04</since>
         <qualification>extroverted beagle</qualification>
       </character>
       <character>
         <name>Peppermint Patty</name>
         <since>1966-08-22</since>
         <qualification>bold, brash and tomboyish</qualification>
       </character>
     </book>
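
The book document on this slide can be read with Python's standard `xml.etree.ElementTree`, as sketched below. The XML declaration is omitted so the text can be parsed directly from a string; the variable names are assumptions for illustration. Note that the program only gets a tree of labelled nodes: the fact that `friend-of` relates two characters is knowledge written into the code, not supplied by the XML.

```python
import xml.etree.ElementTree as ET

# The slide's document, without the XML declaration.
BOOK_XML = """<book isbn="0836217462">
  <title>Being a Dog Is a Full-Time Job</title>
  <author>Charles M. Schulz</author>
  <character>
    <name>Snoopy</name>
    <friend-of>Peppermint Patty</friend-of>
    <since>1950-10-04</since>
    <qualification>extroverted beagle</qualification>
  </character>
  <character>
    <name>Peppermint Patty</name>
    <since>1966-08-22</since>
    <qualification>bold, brash and tomboyish</qualification>
  </character>
</book>"""

root = ET.fromstring(BOOK_XML)

# Map each character's name to its qualification.
characters = {
    character.findtext("name"): character.findtext("qualification")
    for character in root.findall("character")
}

# Interpret <friend-of> as a relation between characters (our assumption).
friends = [
    (character.findtext("name"), friend.text)
    for character in root.findall("character")
    for friend in character.findall("friend-of")
]

print(root.get("isbn"))
print(friends)
```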

  8. Metadata on the web yesterday

  9. World Wide Web
     • Tim Berners-Lee reprise…
     “... a goal of the Web was that, if the interaction between person and hypertext could be so intuitive that the machine-readable information space gave an accurate representation of the state of people's thoughts, interactions, and work patterns, then machine analysis could become a very powerful management tool, seeing patterns in our work and facilitating our working together through the typical problems which beset the management of large organizations.” (Berners-Lee 1996)

  10. Web = Data + Information - Knowledge
     • Browse the Links
     • Search using Words (examples: "steamer"; "steamer, tank")
     • Search using experience
     • Link structure is content
       • rhetorical narratives
     • Search using indexes
     • Metadata and classifications

  11. “Find a very successful European team-based sports person”
     • Metadata
     • Knowledge
     • Inference
     [Diagram: candidate resources]
     • Resource describing UK soccer players and their careers
     • Resource listing sporting competitions including FA Cup and Superbowl
     • Steve Redgrave’s home page
     • Resource describing the Olympic Games
     • Resource that lists teams that have won the FA Cup

  12. [Diagram: sports ontology]
     • Concepts: Event, Country, People, Competition, Sport, Sports Person, Sports Tournament, Tennis Tournament, Soccer Tournament, Olympic Games, Wimbledon, FA Cup, Rowing, Soccer, Coxless Fours, Soccer player, Rower, UK, Europe
     • Relations: nationality, win, participates, holds, partof
     • Attribute values: participants = 11, participants = 4
     • Defined concepts: Rower win Olympic Games; Soccer player wins FA Cup once; UK Rower win Olympic Games > 2 times
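
The query on slide 11 needs inference over knowledge like that in the ontology diagram. The sketch below is one hedged way to picture it in Python: an is-a hierarchy, a part-of hierarchy, and a filter that combines them. All individuals, win counts, the "Tennis player" class, and the reading of "very successful" as "more than 2 wins" are assumptions invented for illustration; the team sizes follow the `participants` values in the diagram.

```python
# is-a links between concepts (child -> parent).
IS_A = {
    "Rower": "Sports Person",
    "Soccer player": "Sports Person",
    "Tennis player": "Sports Person",  # hypothetical class
}

# part-of links between places.
PART_OF = {"UK": "Europe"}

# Number of participants per side; 1 means an individual sport (assumption).
PARTICIPANTS = {"Coxless Fours": 4, "Soccer": 11, "Tennis": 1}

# Hypothetical individuals, invented for illustration.
PEOPLE = [
    {"name": "rower-1", "type": "Rower", "nationality": "UK",
     "sport": "Coxless Fours", "wins": 3},
    {"name": "rower-2", "type": "Rower", "nationality": "USA",
     "sport": "Coxless Fours", "wins": 4},
    {"name": "player-1", "type": "Soccer player", "nationality": "UK",
     "sport": "Soccer", "wins": 1},
    {"name": "player-2", "type": "Tennis player", "nationality": "UK",
     "sport": "Tennis", "wins": 5},
]


def reaches(start, target, links):
    """Follow child -> parent links from start; True if target is reached."""
    current = start
    while current is not None:
        if current == target:
            return True
        current = links.get(current)
    return False


def successful_team_people(region, min_wins):
    """Names of sports people from the region, in team sports, with > min_wins."""
    return [
        person["name"]
        for person in PEOPLE
        if reaches(person["type"], "Sports Person", IS_A)
        and reaches(person["nationality"], region, PART_OF)
        and PARTICIPANTS.get(person["sport"], 1) > 1
        and person["wins"] > min_wins
    ]


print(successful_team_people("Europe", 2))
```

None of the individual records mentions "Europe" or "team"; both come from the background knowledge, which is the point of the slide.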

  13. A Shared Understanding
     • Metadata
       • Data describing the content and meaning of resources
       • But everyone must speak the same language…
     • Terminologies
       • Shared and common vocabularies
       • For search engines, agents, curators, authors and users
       • But everyone must mean the same thing…
     • Ontologies
       • Shared and common understanding of a domain
       • Essential for exchange and discovery

  14. Ontologies
     • “The [reusable] specification of conceptualizations, used to help programs and humans share knowledge” [Gruber93]
     • An ontology will include:
       • a vocabulary of terms, and
       • some specification of their meaning
       • structure on the domain, constraining the possible interpretations of terms [Uschold99]
       • a precise notion of what meaning means
     • Ontologies provide:
       • a shared and common understanding of a domain that can be communicated across people and applications

  15. Ontology: a precise notion of what meaning means
     • formal, explicit, rigorous
     • unambiguous
     • for agents, not just people
     • machine computable
     • from machine-readable to machine-understandable
     • use knowledge representation and reasoning to supply the meaning

  16. What is an Ontology?
     [Diagram: spectrum of ontology kinds, from simplest to most expressive]
     • Catalog/ID
     • Terms/glossary
     • Thesauri, “narrower term” relation
     • Informal is-a
     • Formal is-a
     • Formal instance
     • Frames (properties)
     • Value restrictions
     • Disjointness, Inverse, part-of…
     • General logical constraints
     From Debbie McGuinness

  17. Ontologies and E-Anything
     Simple ontologies provide:
     • Controlled shared vocabulary (search engines, authors, users, databases, programs all speak same language)
     • Organization (and navigation support)
     • Expectation setting (left side of many web pages)
     • Browsing support (tagged structures such as Yahoo!)
     • Search support (query expansion approaches such as FindUR, e-Cyc)
     • Sense disambiguation
     • Conflict detection
     • Structured, comparative search
     • Generalization/Specialization
     • …
     From Debbie McGuinness
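
Query expansion, mentioned under search support, can be sketched as follows: a search term is widened with its narrower terms from a thesaurus before matching. This is a minimal illustration only; the thesaurus, the documents, and the function names are assumptions, and it is not a description of how FindUR or e-Cyc work.

```python
# Hypothetical thesaurus: term -> narrower terms.
NARROWER = {
    "sport": ["rowing", "soccer", "tennis"],
    "rowing": ["coxless fours"],
}

# Hypothetical document collection.
DOCUMENTS = {
    "doc1": "results of the coxless fours final",
    "doc2": "a history of steam engines",
}


def expand_query(term):
    """Return the term followed by all transitively narrower terms."""
    expanded = []
    pending = [term]
    while pending:
        current = pending.pop(0)
        if current in expanded:
            continue  # guard against cycles in the thesaurus
        expanded.append(current)
        pending.extend(NARROWER.get(current, []))
    return expanded


def keyword_search(term):
    """Plain substring match on the literal term only."""
    return sorted(doc for doc, text in DOCUMENTS.items() if term in text)


def expanded_search(term):
    """Match any document containing the term or one of its narrower terms."""
    terms = expand_query(term)
    return sorted(
        doc for doc, text in DOCUMENTS.items()
        if any(candidate in text for candidate in terms)
    )


print(keyword_search("sport"))
print(expanded_search("sport"))
```

The plain search misses the rowing document because the word "sport" never appears in it; the expanded search finds it through the thesaurus.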

  18. The Semantic Web
     • http://www.semanticweb.org

  19. Metadata on the web tomorrow
     • Resources annotated with metadata using knowledge as a shared vocabulary
     • Metadata held outside the resource
     • Knowledge structures for holding the ontology
       • XML DTDs
       • Product classifications
       • Directories
         • Home > Recreation > Sports > Events > International Games > Olympic Games >
       • W3C: RDF and RDFS
         • Resource Description Framework
       • Topic maps
       • DAML+OIL
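
Metadata held outside the resource can be pictured as RDF-style statements of the form (resource, property, value). The sketch below uses plain Python tuples instead of an RDF library, which is a simplification; the resource URL and the values are hypothetical, while the property names use the Dublin Core element namespace.

```python
# Dublin Core element namespace.
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical resource being described; the metadata lives outside it.
RESOURCE = "http://example.org/olympic-rowing"

TRIPLES = [
    (RESOURCE, DC + "title", "Olympic rowing results"),
    (RESOURCE, DC + "subject", "Rowing"),
    (RESOURCE, DC + "language", "en"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


for _, predicate, value in match(TRIPLES, subject=RESOURCE):
    print(predicate, "=", value)
```

Because every statement names its resource by URL, descriptions from different publishers can be merged by simply concatenating their triple lists.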

  20. XML is not good for describing ontologies
     • XML defines grammars to verify and structure documents
     • The grammar enforces constraints on tags
     • Different grammars define the same content
     • XML lacks a semantic model: it only has a surface model, which is a tree.
     • node = label + attr/values + contents
     <course date="...">
       <title>...</title>
       <teacher>...</teacher>
       <name>...</name>
       <http>...</http>
       <students>...</students>
     </course>
     [Diagram: tree with nodes course, title, teacher, students, name, http]
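
The point that different grammars can carry the same content is illustrated below with two hypothetical XML encodings of one fact ("Smith teaches Metadata"). The documents, the names, and the two `facts_*` functions are assumptions for illustration. The trees differ, and only interpretation code written separately for each grammar recovers the shared meaning.

```python
import xml.etree.ElementTree as ET

# Two hypothetical encodings of the same fact.
DOC_A = "<course><title>Metadata</title><teacher>Smith</teacher></course>"
DOC_B = '<teacher name="Smith"><teaches course="Metadata"/></teacher>'


def shape(element):
    """Describe a tree by its labels only: (tag, attribute names, children)."""
    return (
        element.tag,
        sorted(element.attrib),
        [shape(child) for child in element],
    )


def facts_a(root):
    """Interpretation agreed for grammar A: teacher and title are elements."""
    return {(root.findtext("teacher"), "teaches", root.findtext("title"))}


def facts_b(root):
    """Interpretation agreed for grammar B: names are attributes."""
    return {
        (root.get("name"), "teaches", child.get("course"))
        for child in root.findall("teaches")
    }


root_a = ET.fromstring(DOC_A)
root_b = ET.fromstring(DOC_B)

print(shape(root_a) == shape(root_b))
print(facts_a(root_a) == facts_b(root_b))
```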

  21. XML is not good for describing ontologies
     • Meaning of XML documents is intuitively clear
       • “semantic” markup tags are domain terms
     • But computers do not have intuition
       • Tag names per se do not provide semantics
       • The semantics are encoded outside the XML specification
     • XML makes no commitment on:
       1. Domain-specific ontological vocabulary
       2. Ontological modelling primitives
     • So it requires pre-arranged agreement on 1 & 2
     • Feasible for closed collaboration
       • agents in a small & stable community
       • pages on a small & stable intranet

  22. XML DTDs and XML Schema
     • DTD does not distinguish between objects and relations
     • XML Schema’s type extension mechanism is a red herring: it can’t be used to model ontological subtypes
     • XML has been used as a serialisation syntax for other markup languages, e.g. SMIL, XOL
     <class>
       <name>person</name>
     </class>
     <slot>
       <name>year-of-birth</name>
       <domain>person</domain>
       <slot-cardinality>1</slot-cardinality>
     </slot>
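
When XML is used only as a serialisation syntax, the meaning of elements such as `slot-cardinality` has to be supplied by the program that reads them. The sketch below shows one way this might look in Python. It assumes the class/slot fragment is wrapped in a hypothetical `<ontology>` root so that it parses as one document, and it assumes `slot-cardinality` means "exactly this many values"; neither assumption is taken from the XOL specification.

```python
import xml.etree.ElementTree as ET

# The slide's fragment, wrapped in a hypothetical <ontology> root element.
ONTOLOGY_XML = """<ontology>
  <class><name>person</name></class>
  <slot>
    <name>year-of-birth</name>
    <domain>person</domain>
    <slot-cardinality>1</slot-cardinality>
  </slot>
</ontology>"""


def load_slots(xml_text):
    """Read slot definitions into {slot name: {domain, cardinality}}."""
    root = ET.fromstring(xml_text)
    slots = {}
    for slot in root.findall("slot"):
        name = slot.findtext("name").strip()
        slots[name] = {
            "domain": slot.findtext("domain").strip(),
            "cardinality": int(slot.findtext("slot-cardinality")),
        }
    return slots


def cardinality_violations(instance_class, values, slots):
    """List (slot, count) pairs where an instance breaks the cardinality.

    values maps a slot name to the list of values given for one instance.
    """
    problems = []
    for name, spec in slots.items():
        if spec["domain"] != instance_class:
            continue
        count = len(values.get(name, []))
        if count != spec["cardinality"]:
            problems.append((name, count))
    return problems


slots = load_slots(ONTOLOGY_XML)
print(cardinality_violations("person", {"year-of-birth": [1950, 1951]}, slots))
```

An XML parser accepts an instance with two birth years without complaint; the constraint exists only because this code chooses to enforce it.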

  23. Requirements for an Ontology-language
     • Well designed
       • Useful and proven modelling primitives
       • Intuitive to human users
       • Can say simple things simply
       • Expressive enough to capture many ontologies
       • Efficient, sound and complete reasoning support
     • Well defined
       • Clear syntax: to read ontologies
       • Formal semantics: to understand (process) ontologies and to facilitate machine interpretation of those semantics
       • Expressive enough to capture many ontologies
     • Compatible
       • Easy mapping to/from other ontology languages
       • Maximum compatibility with XML and RDF(S)

  24. Sem Web Research Issues
     • Ontology creation
     • Millions of ontologies will be built
     • Ontology Engineering is difficult and time-consuming
     • Ontology Learning
     • Scalable RDF Repositories (everything is built on top of the same data model!)
     • Infrastructure
     • Scalable reasoning services for different languages
     • Resource-ID Management
     • Versioning of ontologies and corresponding metadata

  25. Sem Web Research Issues
     • Metadata Management
     • legacy data (HTML, XML, ...) -> legacy data migration:
       • Annotation of Web documents (HTML, PDF, ...)
       • Semi-automation using information extraction
       • XML-Wrapper / Transformer
       • Database Converter / Exporter
     • Maintenance of metadata, ontologies and resources
       • sources, ontologies, and metadata have to be maintained in a consistent way
       • organizational process is needed
       • tools are needed
       • metadata have to reflect changes of the sources
       • metadata have to reflect changes of the ontologies

  26. Selected Semantic Web Projects
     • COHSE: http://inanna.ecs.soton.ac.uk/cohse/
     • Ontobroker: http://ontobroker.aifb.uni-karlsruhe.de/
     • SHOE: http://www.cs.umd.edu/projects/plus/SHOE/
