
Information Retrieval on the Semantic Web Using Ontology-based Visualization



Presentation Transcript


  1. Information Retrieval on the Semantic Web Using Ontology-based Visualization Larry Reeve INFO780 – XML and Databases Dr. Han - Spring 2004

  2. Overview • Semantic Web and Ontologies • RDF and OWL • Visualization Uses • Cluster Map • Futures

  3. Semantic Web • Machine-processable Web • How to model meaning? • Common framework that allows data to be shared and reused • Extension of current web • Funding • DARPA - $70 million • European Union - € 55 million

  4. Existing Web • Resources: • identified by URIs • untyped • Links: • href, src, ... • limited, non-descriptive • User: • Semantics of a resource gleaned from its content • Machine: • Little information available; the significance of a link is evident only from the context around the anchor. Source: W3C

  5. Semantic Web • Resources: • Globally identified by URIs • or locally scoped (blank nodes) • Extensible • Relational • Links: • Identified by URIs • Extensible • Relational • User: • Richer user experience • Exchange knowledge effectively • Machine: • More processable information is available Source: W3C

  6. Semantic Web Architecture Source: W3C

  7. Ontology • Specification of a conceptualization (Gruber) • Provide common definition of a domain • Documents annotated with metadata to determine “meaning”

  8. Ontology • Play central role in Semantic Web • Used for: • Querying • Presentation • Navigation • Move from keyword-based searching to logic-based searching

  9. Keyword Search + Taxonomy

  10. Ontology Types • Lightweight • Simple keyword hierarchies • (Yahoo, Open Directory Project) • Well-defined • Complex concept hierarchies, properties, value restrictions, axiomatised relationships

  11. Ontology • Many ontologies currently defined: • DAML – DARPA Agent Markup Language (www.daml.org) • DAML Ontology Library – 282 entries • Baseball Teams • GPS coordinate systems • Employment hierarchy for CMU • Stanford • OntoLingua Server (www-ksl-svc.stanford.edu) • Protégé Ontologies Library (protege.stanford.edu)

  12. W3C Standards • RDF – Resource Description Framework • data model for representing resources and the relations between them • OWL – Web Ontology Language • provides a vocabulary for describing properties and classes and allows for greater expressive complexity than RDF alone • Both issued as W3C Recommendations in February 2004

  13. RDF • Commonly serialized as XML (RDF/XML) • An RDF statement is a triple composed of a subject, a predicate, and an object • Each RDF statement is modeled as a graph structure: • subjects and objects are nodes • the predicate is an arc • Example: • index.html has a creator whose value is John Smith • subject(“index.html”) → predicate(“creator”) → object(“John Smith”) • Helps IR by giving a search engine more detail than keywords alone
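The triple model on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Triple` type, the `match` helper, and every statement other than the slide's index.html example are hypothetical, and a real application would use an RDF library rather than plain tuples.

```python
from collections import namedtuple

# A statement is a (subject, predicate, object) triple.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

triples = [
    Triple("index.html", "creator", "John Smith"),  # the slide's example
    Triple("index.html", "title", "Home Page"),     # hypothetical
    Triple("about.html", "creator", "Jane Doe"),    # hypothetical
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.object == obj)
    ]


def pages_created_by(triples, name):
    """Answer 'which resources have this creator?', something a bare keyword match cannot express."""
    return sorted(t.subject for t in match(triples, predicate="creator", obj=name))
```

Calling `pages_created_by(triples, "John Smith")` selects resources by the creator relation rather than by any page that merely contains the words, which is the extra detail the slide refers to.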

  14. RDF Fragment
      <RDF xmlns:r="http://www.w3.org/TR/RDF/"
           xmlns:d="http://purl.org/dc/elements/1.0/"
           xmlns="http://directory.mozilla.org/rdf">
        <Topic r:id="Top">
          <tag catid="1"/>
          <d:Title>Top</d:Title>
          <narrow r:resource="Top/Arts"/>
          <narrow r:resource="Top/Business"/>
          <narrow r:resource="Top/Computers"/>
          <narrow r:resource="Top/Games"/>
          <narrow r:resource="Top/Health"/>
        </Topic>
      </RDF>
      Source: Open Directory Project (www.dmoz.org)
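As a rough illustration of how a program might read this fragment, the sketch below uses Python's standard `xml.etree.ElementTree` to list the narrower topics of each `Topic`. The function name `narrower_topics` is an assumption made for this example, and the code treats the fragment as plain namespaced XML instead of running a full RDF parser.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from the fragment on the slide.
RDF_NS = "http://www.w3.org/TR/RDF/"
ODP_NS = "http://directory.mozilla.org/rdf"

FRAGMENT = """<RDF xmlns:r="http://www.w3.org/TR/RDF/"
     xmlns:d="http://purl.org/dc/elements/1.0/"
     xmlns="http://directory.mozilla.org/rdf">
  <Topic r:id="Top">
    <tag catid="1"/>
    <d:Title>Top</d:Title>
    <narrow r:resource="Top/Arts"/>
    <narrow r:resource="Top/Business"/>
    <narrow r:resource="Top/Computers"/>
    <narrow r:resource="Top/Games"/>
    <narrow r:resource="Top/Health"/>
  </Topic>
</RDF>"""


def narrower_topics(xml_text):
    """Map each Topic id to the list of topics it narrows to, in document order."""
    root = ET.fromstring(xml_text)
    result = {}
    for topic in root.findall(f"{{{ODP_NS}}}Topic"):
        topic_id = topic.get(f"{{{RDF_NS}}}id")
        result[topic_id] = [
            narrow.get(f"{{{RDF_NS}}}resource")
            for narrow in topic.findall(f"{{{ODP_NS}}}narrow")
        ]
    return result
```

Each `narrow` element is effectively one triple: the subject is the enclosing topic, the predicate is `narrow`, and the object is the referenced resource.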

  15. OWL • Considered an extension of RDF • The vocabulary provided by OWL describes items such as: • relations between classes • cardinality • equality • richer typing of properties • characteristics of properties • enumerated classes • Composed of three sublanguages: • OWL Lite • OWL DL (Description Logics) • OWL Full

  16. OWL • OWL Lite: • for building classification hierarchies and simple constraints • OWL DL (Description Logics): • includes all OWL language constructs, used under certain restrictions, while retaining computational completeness (all conclusions are guaranteed to be computable) and decidability (all computations finish in finite time) • OWL Full: • provides all OWL features with maximum expressiveness but no computational guarantees

  17. OWL Fragment
      <rdf:RDF
          xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
          xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
          xmlns:owl="http://www.w3.org/2002/07/owl#"
          xmlns:first="http://www.w3.org/2002/03owlt/Ontology/premises001#"
          xml:base="http://www.w3.org/2002/03owlt/Ontology/premises001" >
        <owl:Ontology rdf:about="" />
        <owl:Class rdf:ID="Car">
          <owl:equivalentClass>
            <owl:Class rdf:ID="Automobile"/>
          </owl:equivalentClass>
        </owl:Class>
        <first:Car rdf:ID="car">
          <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Thing" />
        </first:Car>
        <first:Automobile rdf:ID="auto">
          <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Thing" />
        </first:Automobile>
      </rdf:RDF>
      Source: W3C
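The fragment declares `Car` and `Automobile` equivalent, so an OWL-aware tool can conclude that the individual `car` is also an `Automobile` and that `auto` is also a `Car`. The sketch below mimics only that one inference in plain Python; the function names and dictionary layout are assumptions for illustration, and an actual reasoner handles far more than `equivalentClass`.

```python
def equivalence_groups(pairs):
    """Merge equivalentClass pairs into groups of mutually equivalent classes."""
    groups = {}
    for a, b in pairs:
        merged = groups.get(a, {a}) | groups.get(b, {b})
        for cls in merged:
            groups[cls] = merged
    return groups


def inferred_types(asserted, pairs):
    """Extend each individual's asserted classes with all equivalent classes."""
    groups = equivalence_groups(pairs)
    return {
        individual: set().union(*(groups.get(c, {c}) for c in classes))
        for individual, classes in asserted.items()
    }


# Assertions taken from the fragment (the owl:Thing typing is omitted for brevity).
asserted = {"car": {"Car"}, "auto": {"Automobile"}}
equivalent = [("Car", "Automobile")]
types = inferred_types(asserted, equivalent)
```

A keyword search for "Automobile" would miss the individual `car`; a query evaluated against the inferred types would not, which is the step from keyword-based toward logic-based searching.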

  18. Ontology-based IViz • Ontology Life Cycle • Development • IsAViz, Protégé • Instantiation • Manual, semi-automatic • Deployment • Analyze, query, and navigate an ontology-based information space

  19. Ontology-based IViz • Ontology Characteristics • Light-weight • (Taxonomies with few logical class relations) • Large number of instances • Instance overlaps between classes • Incomplete

  20. IViz in Deployment Stage • Analysis Visualization • Overview; pattern detection • Requires: data set, ontology, classifier • Query Visualization • Use ontology in query construction • Query Navigation • Information spaces / result sets

  21. Analysis Visualization • Requires: data set, ontology, classifier • Analysis within single domain • Same document set with different ‘perspectives’ • Comparison of different data sets • Information change over time
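The three ingredients named above (data set, ontology, classifier) can be illustrated with a toy sketch. Everything here is hypothetical: the documents, the two light-weight ontologies, and the keyword-matching classifier are stand-ins chosen only to show how one document set yields different "perspectives" depending on the ontology applied.

```python
def classify(documents, ontology):
    """Map each class to the ids of documents mentioning any of its keywords."""
    result = {cls: set() for cls in ontology}
    for doc_id, text in documents.items():
        words = set(text.lower().split())
        for cls, keywords in ontology.items():
            if words & keywords:
                result[cls].add(doc_id)
    return result


# Hypothetical data set.
documents = {
    "d1": "Bank opens branch in Asia",
    "d2": "Retail sales grow in Europe",
    "d3": "European bank merges with retail chain",
}

# Two hypothetical perspectives on the same documents.
sector = {"Banking": {"bank"}, "Retail": {"retail"}}
region = {"Asia": {"asia"}, "Europe": {"europe", "european"}}

by_sector = classify(documents, sector)
by_region = classify(documents, region)
```

Note that `d3` lands in both `Banking` and `Retail`. Overlaps of this kind are exactly what a visualization of class membership has to display.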

  22. Analysis within single domain Economic Sector Geographic Region

  23. Comparison of Different Data Sets Two banking web sites analyzed using the same ontology

  24. Monitoring Three ontology classes changing over time

  25. Query Visualization • Query Formulation; Review of Results; Query Refinement

  26. Query Navigation • Visualization is not primary interface • Serves as a global map • Select ontology classes • Documents displayed in text list

  27. Existing IViz Techniques • Hyperbolic Tree • “The Brain” • Self-Organizing Maps (SOMs)

  28. Hyperbolic Tree (Source: http://www.inxight.com)

  29. The Brain Source: http://www.thebrain.com

  30. Kohonen SOM Source: http://websom.hut.fi/websom/

  31. Cluster Map

  32. Cluster Map - Class Positioning • Spring Embedder algorithm • Nodes repel one another • Edges attract the nodes they connect • …until a stable state is attained • Semantic Closeness • Two classes are close when they share many instances • Two instances are close when they belong to the same class
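A generic spring embedder can be sketched as follows: every pair of nodes pushes apart, while nodes joined by an edge pull together, and the loop runs until positions settle. This is not the Cluster Map's own layout variant. The force formulas, step size, iteration count, and the choice to weight each edge by the number of shared instances are all assumptions made for this illustration.

```python
import math
from itertools import combinations


def shared_instances(members):
    """Edge weight between two classes = number of instances they share."""
    return {
        (a, b): len(members[a] & members[b])
        for a, b in combinations(sorted(members), 2)
    }


def spring_layout(members, steps=500, step_size=0.01):
    """Place classes in 2D so that classes sharing many instances end up close."""
    names = sorted(members)
    weights = shared_instances(members)
    # Deterministic start: nodes evenly spaced on the unit circle.
    pos = {
        n: [math.cos(2 * math.pi * i / len(names)),
            math.sin(2 * math.pi * i / len(names))]
        for i, n in enumerate(names)
    }
    for _ in range(steps):
        force = {n: [0.0, 0.0] for n in names}
        for a, b in combinations(names, 2):
            dx = pos[b][0] - pos[a][0]
            dy = pos[b][1] - pos[a][1]
            dist = math.hypot(dx, dy) or 1e-9
            # Attraction grows with edge weight and distance; repulsion is 1/distance.
            magnitude = weights[(a, b)] * dist - 1.0 / dist
            fx = magnitude * dx / dist
            fy = magnitude * dy / dist
            force[a][0] += fx
            force[a][1] += fy
            force[b][0] -= fx
            force[b][1] -= fy
        for n in names:
            pos[n][0] += step_size * force[n][0]
            pos[n][1] += step_size * force[n][1]
    return pos


def distance(pos, a, b):
    return math.hypot(pos[a][0] - pos[b][0], pos[a][1] - pos[b][1])


# Hypothetical classes: A and B share two instances, C shares none.
layout = spring_layout({"A": {1, 2, 3}, "B": {2, 3, 4}, "C": {9}})
```

In the resulting layout, `A` and `B` sit closer to each other than either does to `C`, matching the slide's notion of semantic closeness. A fixed number of steps stands in for a proper "stable state" test to keep the sketch short.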

  33. Cluster Map UI

  34. Cluster Map Advantages • All classes and class instances are displayed at one time • Non-tree-like hierarchies can be displayed (not just tree structures) • Overlap between classes is exploited • Good for categorizing IR query results using a light-weight ontology
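One way to see how overlap between classes can be exploited is to group instances by the exact set of classes they belong to, so that instances shared by two classes form their own cluster. The sketch below assumes that definition of a cluster; the function name and the sample data are hypothetical.

```python
from collections import defaultdict


def clusters(members):
    """Group instances by the exact set of classes they belong to."""
    classes_of = defaultdict(set)
    for cls, instances in members.items():
        for inst in instances:
            classes_of[inst].add(cls)
    grouped = defaultdict(set)
    for inst, classes in classes_of.items():
        grouped[frozenset(classes)].add(inst)
    return dict(grouped)


# Hypothetical query results categorized by a two-class ontology.
members = {"Banking": {"d1", "d3"}, "Retail": {"d2", "d3"}}
result = clusters(members)
```

Here `d3` forms a separate cluster keyed by both classes, which a display could draw between the `Banking` and `Retail` nodes instead of duplicating the document under each.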

  35. Cluster Map Weaknesses • Light-weight ontologies • Number of classes is small compared to the number of class instances • Some classes will be densely populated • Increasing specialization will help • Scaling to a large number of instances • Doesn’t show document similarity • Can only view by class membership

  36. Displaying Document Similarity • Document analysis is subordinate to navigation and querying • Can show document list with ranking • Seeling Proposal: • Document Map Visualization

  37. Seeling Visualization • Basic idea: • Select ontology class • See all documents against the document space

  38. Seeling UI

  39. Document Similarity – Volvox • Extend Cluster Map • Replace document containers with volvox containers • Retains global display • No separate “document space” display • Another benefit: unlimited nesting allows drill-down • Named by Dr. McCain / Henry Small after the similarly shaped microorganism Source: www.groxis.com

  40. Cluster Map with Volvox Extension

  41. Non-class membership Views • View data by combining classes • Use information (properties, sub-classes) that relate classes to one another • Example: • Data about people, projects, and organization-produced papers • Visualize people and papers together to show their interaction
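The people-and-papers example can be sketched by following a property that relates the two classes instead of viewing each class by membership alone. The property names (`authorOf`, `memberOf`), the individuals, and both helper functions below are hypothetical and serve only to illustrate the idea.

```python
from collections import defaultdict
from itertools import combinations


def link_view(triples, prop):
    """For each subject, collect the objects it is linked to by the given property."""
    links = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == prop:
            links[subject].add(obj)
    return dict(links)


def interactions(links):
    """Pairs of subjects that share at least one linked object, with the shared objects."""
    return {
        (a, b): links[a] & links[b]
        for a, b in combinations(sorted(links), 2)
        if links[a] & links[b]
    }


# Hypothetical data about people, papers, and an organization.
triples = [
    ("alice", "authorOf", "paper1"),
    ("bob", "authorOf", "paper1"),
    ("bob", "authorOf", "paper2"),
    ("carol", "authorOf", "paper3"),
    ("alice", "memberOf", "orgX"),
]

authorship = link_view(triples, "authorOf")
coauthors = interactions(authorship)
```

The `authorship` mapping supplies the edges for a combined people-and-papers view, and `coauthors` shows the interaction between people that a per-class view would hide.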

  42. Semantic Views

  43. Summary • Ontologies useful in categorizing IR search results • Cluster Map visualizes small document spaces effectively • Can be adapted to handle larger document spaces • Alternate views, complex ontologies will require other visualization methods • More research is needed to support the use of ontologies in visualization

  44. Information Retrieval on the Semantic Web Using Ontology-based Visualization • Questions
