INLS 520

  1. INLS 520 Information Organization INLS 520 Erik Mitchell

  2. Review • Controlled vocabularies • Term Lists, Hierarchies, Trees, Paradigms, Facets, Folksonomies • Knowledge organization systems • Term Lists, Thesauri, Taxonomies, Ontologies

  3. Today • Protégé tutorial • Create a thesaurus • Create an ontology • Ontologies • Basic concept • Building in Protégé • RDF (?) • OWL (?)

  4. Assignment 1 recap • Required XML tags • <?xml ... ?> • Required DC elements • None; you need a content wrapper <dc> and at least one element (<title>, <creator>, etc.) • Advanced Concepts • Namespaces • Schemas/DTDs • MARC & DC • Advantages / disadvantages • Techniques for discovering data • View Source • DC DOT Metadata generator

  5. CV Concepts & definitions • Controlled Vocabularies • Organized Lists • Relationships between concepts • Knowledge organization systems • Typed relationships • Direct / inferable knowledge

  6. Thesauri Definitions • “Guide to use of terms, showing relationships between them, for the purpose of providing standardized, controlled vocabulary for information storage and retrieval” (Monash) • “A list of words showing similarities, differences, dependencies, and other relationships to each other” (USG)

  7. Thesauri Concepts • Preferred terms • Non-preferred terms • Semantic relations between terms • How to apply terms (guidelines, rules) • Scope notes • Adding terms (How to produce terms that are not listed explicitly in the thesaurus)

  8. Common thesaural identifiers • SN Scope Note • Instruction, e.g. don’t invert phrases • USE Use (another term in preference to this one) • UF Used For • BT Broader Term • NT Narrower Term • RT Related Term
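
The identifiers above come in reciprocal pairs (BT/NT, USE/UF) plus a symmetric one (RT). The following is a minimal sketch of how a program might keep those pairs in sync. The `Thesaurus` class, its method names, and the sample terms are all hypothetical and are not taken from any thesaurus standard or tool.

```python
from collections import defaultdict


class Thesaurus:
    """Toy thesaurus that keeps reciprocal relations in sync."""

    def __init__(self):
        self.broader = defaultdict(set)    # BT: term -> broader terms
        self.narrower = defaultdict(set)   # NT: term -> narrower terms
        self.related = defaultdict(set)    # RT: term -> related terms
        self.use = {}                      # USE: non-preferred -> preferred
        self.used_for = defaultdict(set)   # UF: preferred -> non-preferred
        self.scope_note = {}               # SN: term -> usage instruction

    def add_broader(self, term, broader_term):
        # Recording BT on one term records NT on the other.
        self.broader[term].add(broader_term)
        self.narrower[broader_term].add(term)

    def add_related(self, a, b):
        # RT is symmetric, so store it in both directions.
        self.related[a].add(b)
        self.related[b].add(a)

    def add_use(self, non_preferred, preferred):
        # USE on the non-preferred term implies UF on the preferred term.
        self.use[non_preferred] = preferred
        self.used_for[preferred].add(non_preferred)

    def preferred(self, term):
        # Resolve an entry term to its preferred form (itself if none).
        return self.use.get(term, term)


# Hypothetical sample terms, chosen only for illustration.
thesaurus = Thesaurus()
thesaurus.add_broader("Ales", "Beer")
thesaurus.add_use("Suds", "Beer")
thesaurus.add_related("Beer", "Brewing")
thesaurus.scope_note["Beer"] = "Use for fermented malt beverages."
```

Storing both directions at write time means a lookup such as `thesaurus.narrower["Beer"]` never has to scan the whole vocabulary.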

  9. Thesauri Guides • National Information Standards Organization. (2005). Guidelines for the construction, format, and management of monolingual thesauri. ANSI/NISO Z39.19-2005. Bethesda, MD: NISO Press. • http://www.niso.org/standards/resources/Z39-19-2005.pdf?CFID=5559601&CFTOKEN=31747314 • Aitchison, Jean & Gilchrist, Alan. Thesaurus Construction: A Practical Guide. 3rd ed. London: Aslib, 1997. • Willpower Information Management Consultants • http://www.willpower.demon.co.uk/thesprin.htm

  10. Ontology Definitions • “The study of being or existence” • “A specification of a conceptualization” (Gruber) • “An ontology formally defines a common set of terms that are used to describe and represent a domain.” (OWL)

  11. Webster’s Dictionary • Webster’s Third New International Dictionary defines Ontology as: • A science or study of being, specifically a branch of metaphysics* relating to the nature and relations of being. • A theory concerning the kinds of entities and specifically the kinds of abstract entities that are to be admitted to a language system. • *Metaphysics: Nature of being “or” existence.

  12. Ontology Concepts • Classes • Names of objects in the domain • Relationships between classes • Connections between classes • Properties of classes • Background or identifying knowledge of these objects • Constraints on these properties & relationships • Limits and parameters of the relationships
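
As a rough illustration of how classes, properties, and constraints fit together, here is a small sketch. It is not how Protégé or OWL represent these ideas internally. The names `Slot`, `OntClass`, and `violations`, and the Food/Pizza example, are assumptions made for this example only.

```python
from dataclasses import dataclass, field


@dataclass
class Slot:
    """A property of a class, with two simple constraints."""
    name: str
    value_type: type      # constraint: allowed type of each value
    max_values: int = 1   # constraint: maximum number of values


@dataclass
class OntClass:
    """A class with optional parent classes and its own slots."""
    name: str
    parents: list = field(default_factory=list)
    slots: dict = field(default_factory=dict)

    def all_slots(self):
        # Slots are inherited from every parent, then locally overridden.
        merged = {}
        for parent in self.parents:
            merged.update(parent.all_slots())
        merged.update(self.slots)
        return merged


def violations(cls, instance):
    """Return a list of constraint violations for an instance (a dict)."""
    problems = []
    slots = cls.all_slots()
    for name, values in instance.items():
        slot = slots.get(name)
        if slot is None:
            problems.append(f"unknown slot: {name}")
            continue
        if len(values) > slot.max_values:
            problems.append(f"too many values for {name}")
        if any(not isinstance(v, slot.value_type) for v in values):
            problems.append(f"wrong type for {name}")
    return problems


# Hypothetical domain: a Pizza class that inherits "name" from Food.
food = OntClass("Food", slots={"name": Slot("name", str)})
pizza = OntClass(
    "Pizza",
    parents=[food],
    slots={"topping": Slot("topping", str, max_values=3)},
)
```

Calling `violations(pizza, {"name": ["Margherita"], "topping": ["basil", "tomato"]})` returns an empty list, while an instance with two names or an undeclared slot is reported.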

  13. Class exercise • Protégé overview • Orientation • Object types (Classes, Slots, Instances) • Relationships (hierarchies, associative) • As a group, we will work through the Protégé training guide • http://protege.stanford.edu/doc/tutorial/get_started/get-started.pdf

  14. What is the Semantic Web? • URI (Uniform Resource Identifier) • OWL/RDFS • All built on top of the regular web • RDF is the underlying language of the Semantic Web • XML represents data (document based) • RDF represents pure information (anyone can use, re-harvestable); you could call this knowledge • Examples • Swoogle • Goog411

  15. Ontologies (review) • “A common set of terms that are used to describe and represent a domain” • Classes, Relationships, Properties, Constraints • A formal organization of knowledge • The primary role of an ontology is to define a language which people and computers in a given domain can share

  16. A good ontology has • Features: • Meaningful – all classes have instances • Accurate / correct • Non-redundant – each class/instance is represented in a single way • Rich in description – context, content • Enabled functionality: • Able to use queries to connect new pieces of information • Use XML & definitions to integrate knowledge across domains

  17. Ontology Continuum • Keyword Lists • Basic Thesauri • Complex Thesauri • Taxonomies • Simple Ontologies (WordNet) • Complex Ontologies (OWL)

  18. SHOE Ontology project • Possible to build an ontology for anything • Simple HTML Ontology Extensions (SHOE) Project • http://www.cs.umd.edu/projects/plus/SHOE/ • http://www.cs.umd.edu/projects/plus/SHOE/html-pages.html • Sample projects • Beer Ontology • http://www.cs.umd.edu/projects/plus/SHOE/onts/index.html#beer • Document Ontology • http://www.cs.umd.edu/projects/plus/SHOE/onts/docmnt1.0.html

  19. Ontology Concepts • Multiple inheritance • Vertical and horizontal relationships • Decomposed subject/object • Predicate based description (isRelatedto, hasVersion) • First Order Predicate Logic • Statements broken down into subjects/predicates • Proposition • All men are mortal, Socrates is a man • Therefore • Socrates is mortal
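
The Socrates syllogism above can be worked through mechanically once the statements are broken into subject/predicate/object form. The sketch below is a toy forward-chaining loop over a single rule shape ("every member of class A is a member of class B"), not a general first-order logic prover. The predicate name `isA` and the function `infer` are assumptions made for this example.

```python
# Premise 2 as a subject/predicate/object statement: Socrates is a man.
facts = {("Socrates", "isA", "Man")}

# Premise 1 as a class-level rule: all men are mortal.
subclass_of = {"Man": "Mortal"}


def infer(facts, subclass_of):
    """Repeatedly apply the subclass rule until nothing new is derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for subject, predicate, obj in list(derived):
            if predicate == "isA" and obj in subclass_of:
                new_fact = (subject, "isA", subclass_of[obj])
                if new_fact not in derived:
                    derived.add(new_fact)
                    changed = True
    return derived


conclusions = infer(facts, subclass_of)
```

After the loop finishes, `conclusions` contains the original fact plus the derived statement that Socrates is mortal.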

  20. Creating a CV review • Design methods • Re-use existing, start with content & desired use ideas • Committee / community approach • Top-down • Concept driven • Bottom-up • Document driven • Empirical approach • Deductive approach • Select terms, create relationships, perform term control • Inductive approach • Establish CV at outset, build hierarchies on as needed basis

  21. Creating a CV review (2) • Top-down • Identify audience • Identify all topics, concepts, uses, and context of the domain • Sort topics identified into an appropriate organization scheme (enumerative, hierarchical, faceted) • Solidify structure and clean up gaps & redundancies • Assign documents to categories, test retrieval • Bottom-up • Identify audience • Survey documents for topics/concepts • Build system on the fly – let content drive structure and limits of system • Identify gaps & redundancies in system • Test retrieval

  22. Creating a CV review (3) • Think about scope, use, content, maintenance • Gather Terms • Based on existing systems, content • Based on user needs/expectations • Investigate issues of specificity, exhaustivity, granularity • Build hierarchies, relationships • Broader/narrower terms, Related terms, Use/Use for, see/see also • Establish Rules • Implement • Evaluate • Maintain http://www.boxesandarrows.com/view/creating_a_controlled_vocabulary

  23. Creating an Ontology • Determine scope of field, define boundaries • Check for existing ontologies, vocabularies • Select a top-down/bottom-up approach • Identify concepts, vocabulary, parameters, constraints • Identify relationships • Multiple hierarchies, inheritance • Build, test, maintain

  24. Class exercise • Design your own ontology • In groups, pick a domain of knowledge • Type of food (pizza, soup, beer), field of study (library science, math), etc. • Come up with a basic ontological framework and begin creating it in Protégé • Be prepared to share a brief overview with the class which will include • Domain area • Top level classes • Instance definitions • Relationships

  25. Assignment 2 • Overview • In this assignment you will create an ontology on a topic of your choice. Your ontology should contain multiple classes and instances and be focused on a specific purpose. This assignment includes an implementation of the ontology in Protégé and a brief paper explaining your ontology. • Guidelines • Select a topic of interest and determine the top level (e.g. Basketball, Chocolate, etc.). • Define the scope (depth/breadth) and purpose of the ontology. Define specific classes and facets (known as slots in Protégé) that describe those classes. Your ontology should have between 5 and 10 classes with multiple (2-5) slots for each class. Think about the use of hierarchy and multiple inheritance in your ontology. • Summarize your ontology in a short paper (no more than two pages). Outline your ontology and discuss your rationale and key decisions (e.g. scope, purpose, classes and slots, defining relationships). • Implement the ontology in Protégé. Define your classes and instances. Create two queries that illustrate ways in which the data could be retrieved. • Dates & groupwork • Due – November 6th • Groupwork is acceptable

  26. RDF • Subject, property, object triples • Transmitted in XML • RDFS extends RDF with an ontology language • Properties, specialization • OWL • More powerful extension of RDFS • Uses the same syntax as RDF

  27. RDF Model • Subject (Resource): Webpage http://www.stuff.com • Predicate (Property type): Author • Object (Value): “Saki Knafo” • “The author of the stuff webpage is Saki Knafo” • A literal, a triple, a statement
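
To make the subject/predicate/object model concrete, the statement on this slide can be held as a plain three-part record and queried by pattern. This is only a sketch in ordinary Python, not an RDF library. The `Triple` type, the `match` helper, and the second (title) statement are introduced here for illustration.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    subject: str     # the resource being described
    predicate: str   # the property type
    obj: str         # the value


statements = [
    Triple("http://www.stuff.com", "author", "Saki Knafo"),
    # Second statement added only so the query has something to filter out.
    Triple("http://www.stuff.com", "title", "The Hang: The Island of Black Jeans"),
]


def match(triples, subject: Optional[str] = None,
          predicate: Optional[str] = None, obj: Optional[str] = None):
    """Return triples that agree with every part that is not None."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


authors = [t.obj for t in match(statements, predicate="author")]
```

Leaving a position as `None` acts as a wildcard, which is the basic idea behind pattern-based queries over triples.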

  28. How is RDF different? • RDF is a descriptive model that • Allows variable contextualized description • Deconstructs the descriptive process • Allows more granular automated processing of data • Uses exact markup to indicate the context of values (namespaces, schemas)

  29. Encoding RDF in XML

      <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
               xmlns:dcterms="http://purl.org/dc/terms/"
               xmlns:dc="http://purl.org/dc/elements/1.1/"
               xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
        <rdf:Description rdf:about="http://www.stuff.com/">
          <dc:title>The Hang: The Island of Black Jeans</dc:title>
          <dc:creator>SAKI KNAFO</dc:creator>
          <dc:date>Sun, 16 Sep 2007 01:04:40 GMT</dc:date>
          <dc:description>descriptive content</dc:description>
        </rdf:Description>
      </rdf:RDF>
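
One way to see that the XML above is just a container for triples is to read them back out. The sketch below uses only Python's standard `xml.etree.ElementTree` module. It handles the simple case shown on this slide (literal-valued properties directly inside `rdf:Description`) and is not a full RDF/XML parser. The function name `triples_from_rdfxml` and the shortened sample document are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Shortened copy of the slide's record, kept to two properties.
RDF_XML = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://www.stuff.com/">
    <dc:title>The Hang: The Island of Black Jeans</dc:title>
    <dc:creator>SAKI KNAFO</dc:creator>
  </rdf:Description>
</rdf:RDF>"""


def triples_from_rdfxml(text):
    """Extract (subject, predicate URI, literal) tuples from simple RDF/XML."""
    root = ET.fromstring(text)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for child in description:
            # ElementTree writes tags as "{namespace}local"; joining the
            # two parts gives the full property URI.
            namespace, local = child.tag[1:].split("}")
            triples.append((subject, namespace + local, (child.text or "").strip()))
    return triples


triples = triples_from_rdfxml(RDF_XML)
```

Each child element becomes one statement about the resource named in `rdf:about`, with the namespace turning a short tag such as `dc:title` into a globally unique property URI.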

  30. Iterative RDF description

      <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
               xmlns:vcard="http://dli.grainger.uiuc.edu/publications/metadatacasestudy/dc_schemas/vcard.xsd"
               xmlns:dc="http://purl.org/dc/elements/1.1/"
               xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
        <rdf:Description rdf:about="http://www.stuff.com">
          <dc:title>The Hang: The Island of Black Jeans</dc:title>
          <dc:creator rdf:resource="#Creator_001"/>
          <dc:identifier>http://www.stuff.com</dc:identifier>
          <dc:date>Sun, 16 Sep 2007 01:04:40 GMT</dc:date>
          <dc:description>descriptive content</dc:description>
        </rdf:Description>
        <rdf:Description ID="Creator_001" rdf:about="http://dli.grainger.uiuc.edu/publications/metadatacasestudy/dc_,,,">
          <vcard:given>Saki</vcard:given>
          <vcard:family>Knafo</vcard:family>
          <vcard:email>
            <vcard:userid>knafo@www.nytimes.com</vcard:userid>
          </vcard:email>
        </rdf:Description>
      </rdf:RDF>

  31. RDFS • RDF Schema • Defines additional RDF elements that help type relationships • Special Classes • Based on RDF Classes / Properties / Attributes with additional • http://www.w3schools.com/rdf/rdf_reference.asp • Allows the creation of vocabularies / ontologies

  32. OWL (Web Ontology Language) • An ontology language geared towards representing information on the web • Classes, properties, and relationships that describe URIs and their facets • Based on the triple concept • Subject, Predicate, Object • 3 versions: OWL-Lite, OWL-DL, OWL-Full • Formatted in RDF/XML • Uses RDF and RDFS as a foundation • Adds new elements in the owl namespace

  33. OWL Versions • OWL-Lite • Simple hierarchies, constraints • OWL-DL • Uses description logics • Logic-based semantic markup based on first-order predicate logic • Still guarantees finite relationship processing • Best suited for automation • OWL-Full • Most complex • Open ended, possible to get into infinite processing

  34. OWL Example (Chris Welty)

      <?xml version="1.0"?>
      <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
               xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
               xmlns:owl="http://www.w3.org/2002/07/owl#"
               xmlns="http://www.w3.org/2001/sw/BestPractices/OEP/SimplePartWhole/part.owl#"
               xml:base="http://www.w3.org/2001/sw/BestPractices/OEP/SimplePartWhole/part.owl">
        <owl:Ontology rdf:about="">
          <owl:versionInfo rdf:datatype="http://www.w3.org/2001/X...">1.0</owl:versionInfo>
          <rdfs:comment rdf:datatype="http://www.w3.org/2001/XMLSchema#string">
            An ontology containing the basic part relations: partOf, hasPart,
            partOf_directly, and hasPart_directly. These are described in the
            accompanying note. Author: Chris Welty
          </rdfs:comment>
        </owl:Ontology>
        <owl:TransitiveProperty rdf:ID="partOf">
          <owl:inverseOf>
            <owl:TransitiveProperty rdf:ID="hasPart"/>
          </owl:inverseOf>
        </owl:TransitiveProperty>
        <owl:ObjectProperty rdf:ID="hasPart_directly">
          <rdfs:subPropertyOf rdf:resource="#hasPart"/>
          <owl:inverseOf>
            <owl:ObjectProperty rdf:ID="partOf_directly">
              <rdfs:subPropertyOf rdf:resource="#partOf"/>
            </owl:ObjectProperty>
          </owl:inverseOf>
        </owl:ObjectProperty>
      </rdf:RDF>
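
The example declares `partOf` as transitive, `hasPart` as its inverse, and `partOf_directly` as a sub-property of `partOf`. The sketch below shows, in plain Python rather than with an OWL reasoner, what a reasoner would be entitled to conclude from those three declarations. The piston/engine/car data and the function names are hypothetical.

```python
def transitive_closure(pairs):
    """Add (a, c) whenever (a, b) and (b, c) are present, until stable."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def inverse(pairs):
    """Swap subject and object, as owl:inverseOf implies."""
    return {(b, a) for a, b in pairs}


# Hypothetical asserted facts using the non-transitive direct relation.
part_of_directly = {("piston", "engine"), ("engine", "car")}

# subPropertyOf: every partOf_directly pair is also a partOf pair.
# TransitiveProperty: partOf is then closed under chaining.
part_of = transitive_closure(part_of_directly)

# inverseOf: hasPart holds in the opposite direction of partOf.
has_part = inverse(part_of)
```

Here the statement that the piston is part of the car is never asserted; it follows only from the property declarations, which is the kind of inferable knowledge an ontology adds over a plain term list.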

  35. More OWL Examples • Airport • Pizza

  36. Next Week(s) • Fall Break – Enjoy • 10/30 – Guest speaker Lorrie Eakin • 11/6 – First Group presentations
