RDF Databases

RDF Databases. By: Chris Halaschek. Outline: Motivation / Requirements; Storage Issues; Sesame (General Introduction, Architecture, Scalability); RQL Introduction; Demo; Future Directions.


Presentation Transcript


  1. RDF Databases By: Chris Halaschek

  2. Outline • Motivation / Requirements • Storage Issues • Sesame • General Introduction • Architecture • Scalability • RQL Introduction • Demo • Future Directions

  3. Motivation • Having metadata available is not enough • Need tools to process, transform, and reason with the information • Need a way to store the metadata and interact with it

  4. Requirements • Scalable • Good performance • Useful query language

  5. Storage Issues • How to store the data? • In relational database as tables • Querying requires many joins…costly • Triples • Native graph structure • Querying requires graph traversals…need efficient algorithms
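The join cost mentioned on this slide can be made concrete with a minimal sketch. It assumes a single three-column `triples` table in an in-memory SQLite database; the table layout, the sample data, and the function names are illustrative assumptions, not Sesame's actual schema. Following a two-step path (painter, painting, technique) already needs one self-join, and every further hop adds another.

```python
import sqlite3


def build_store():
    """Create an in-memory triple table with a few made-up statements."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
    conn.executemany(
        "INSERT INTO triples VALUES (?, ?, ?)",
        [
            ("picasso", "paints", "guernica"),
            ("guernica", "technique", "oil on canvas"),
            ("rodin", "sculpts", "thinker"),
        ],
    )
    return conn


def painter_techniques(conn):
    """Follow paints -> technique; each hop in the path costs one self-join."""
    return conn.execute(
        """
        SELECT t1.s, t1.o, t2.o
        FROM triples AS t1
        JOIN triples AS t2 ON t2.s = t1.o
        WHERE t1.p = 'paints' AND t2.p = 'technique'
        """
    ).fetchall()


print(painter_techniques(build_store()))
```

A path of length n over this layout needs n - 1 self-joins, which is the cost the slide refers to.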

  6. Sesame - Introduction • Open source RDF Schema-based repository and querying facility • Developed as a research prototype by Aidministrator Nederland bv • NLnet Foundation sponsors its further development as open source software

  7. Sesame - Introduction • Can handle RDF data in XML-serialized RDF and N-Triples format • Can extract the contents of a Sesame repository in XML-serialized RDF, N-Triples, and N3 format

  8. Sesame – Architecture

  9. Repository • Many options due to Repository Abstraction Layer (RAL) • DBMS – relational, object-relational, etc • Existing RDF stores • RDF files • RDF network services

  10. Repository Abstraction Layer (RAL) • Interface that translates RDF-specific methods to a specific DBMS • Defined by an RDF API • Created their own set of interfaces rather than adopt or extend the existing RDF API proposal • Existing API targeted main memory model • Theirs offers specific operations that support RDF Schema semantics (e.g. subsumption reasoning)
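As a rough illustration of what such an abstraction layer might look like, the sketch below defines an abstract interface and one in-memory implementation. The class names, method names, and the `rdfs:subClassOf` shorthand are hypothetical and are not taken from Sesame's API; the point is only that clients call RDF-level methods (including a subsumption check) and never see the storage behind them.

```python
from abc import ABC, abstractmethod


class RepositoryLayer(ABC):
    """Hypothetical RDF-level interface; a backend decides how data is stored."""

    @abstractmethod
    def add_statement(self, s, p, o): ...

    @abstractmethod
    def get_statements(self, s=None, p=None, o=None): ...

    @abstractmethod
    def is_subclass_of(self, sub, sup): ...


class InMemoryRepository(RepositoryLayer):
    SUBCLASS = "rdfs:subClassOf"  # shorthand used for this sketch only

    def __init__(self):
        self._triples = set()

    def add_statement(self, s, p, o):
        self._triples.add((s, p, o))

    def get_statements(self, s=None, p=None, o=None):
        # None acts as a wildcard for that position.
        return sorted(
            t for t in self._triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        )

    def is_subclass_of(self, sub, sup):
        # Subsumption: walk subClassOf edges upward (reflexive and transitive).
        seen, stack = set(), [sub]
        while stack:
            current = stack.pop()
            if current == sup:
                return True
            if current in seen:
                continue
            seen.add(current)
            stack.extend(
                o for _, _, o in self.get_statements(s=current, p=self.SUBCLASS)
            )
        return False


repo = InMemoryRepository()
repo.add_statement("Cubist", "rdfs:subClassOf", "Painter")
repo.add_statement("Painter", "rdfs:subClassOf", "Artist")
print(repo.is_subclass_of("Cubist", "Artist"))
```

A relational backend would implement the same three methods with SQL, leaving the calling modules unchanged.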

  11. RAL Continued • Several of Sesame’s functional modules are clients of the RAL • Problems: • Must read from repository – performance decrease • Solution – selectively caching data in memory • For small repositories, all data can be cached
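The caching idea can be sketched as a thin wrapper that remembers the answers to earlier lookups. Everything here is an assumption made for illustration: `ListBackend` stands in for a slow repository, and the cache is a plain dictionary with no eviction policy, which only suits repositories small enough to fit in memory.

```python
class ListBackend:
    """Stand-in for a slow repository (hypothetical, for this sketch only)."""

    def __init__(self, triples):
        self._triples = list(triples)

    def get_statements(self, s=None, p=None, o=None):
        return [
            t for t in self._triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        ]


class CachingRepository:
    """Serve repeated lookups from memory instead of the backend."""

    def __init__(self, backend):
        self._backend = backend
        self._cache = {}
        self.backend_reads = 0  # counts how often the backend is really hit

    def get_statements(self, s=None, p=None, o=None):
        key = (s, p, o)
        if key not in self._cache:
            self.backend_reads += 1
            self._cache[key] = self._backend.get_statements(s, p, o)
        return self._cache[key]

    def invalidate(self):
        # Must be called after any write, otherwise stale results are served.
        self._cache.clear()


cached = CachingRepository(ListBackend([("picasso", "paints", "guernica")]))
cached.get_statements(p="paints")
cached.get_statements(p="paints")
print(cached.backend_reads)
```

The second lookup is answered from the cache, so the backend is read only once.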

  12. Functional Modules • Interact with RAL • RQL query module • Evaluates RQL queries • RDF administration module • Allows uploading RDF data and schema information, as well as deleting information • RDF export module • Allows extraction of schema and/or data from repository

  13. RQL Query Module • Proposed RQL: • Developed within the European IST project C-Web • Follow-up project by ICS at FORTH, in Greece • Adopts the syntax of OQL • Sesame’s implementation of RQL is slightly different from the proposed RQL • Better compliance with W3C specifications • Support for optional domain and range restrictions • Queries are translated into sets of calls to the RAL • Note: Also supports RDQL – based on SquishQL

  14. RQL Query Module

  15. Admin Module • Main functions: • Add RDF data/schema information • Clear repository • Retrieves information from an RDF(S) source and parses it using the SiRPAC RDF parser • Parser delivers information to the admin module in statement form – (S,P,O) • Module checks statements for consistency and then inserts the data

  16. RDF Export Module • Exports the contents of a repository formatted in XML-serialized RDF • Supplies a basis for using Sesame in combination with other RDF tools

  17. Communication with Sesame • Multiple options for various contexts • HTTP • RMI • SOAP • Intermediaries between the functional modules and their clients

  18. Sesame – Architecture

  19. Sesame - Scalability • Performance Tests • Uploaded and queried a collection of nouns from WordNet – 400,000 RDF statements • Performed on Sun UltraSPARC 5, 256 MB RAM • Used Java Servlets running on a web server to communicate over HTTP • PostgreSQL version 7.1.2 repository

  20. Scalability Continued • Uploading nouns • 94 minutes • 71 statements per second • Querying was much slower than expected • Due to distributed storage over multiple tables • Retrieving data required doing many joins
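The reported upload rate follows directly from the two figures on the slides (400,000 statements in 94 minutes). The short check below only redoes that arithmetic; the function name is made up for the example.

```python
def upload_rate(statements, minutes):
    """Average statements per second over the whole upload."""
    return statements / (minutes * 60)


rate = upload_rate(400_000, 94)
print(round(rate))  # 400,000 / 5,640 s is about 70.9, which rounds to 71
```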

  21. Sesame’s Future • Migration of Sesame to alternate repositories to boost performance • DAML+OIL support

  22. RQL Introduction • Museum schema example

  23. RQL - Syntax • Query typically built upon three clauses • Select • Projection over query results • From • Bind variables to specific locations in graph model • Where • Optional – constraint on values of variables in the from clause

  24. RQL - Example select X, @P from {X} @P {Y} where Y like "Pablo" • X and Y are bound to nodes • @P is bound to a connecting edge - the @ prefix signifies that the variable is bound to properties • The $ prefix signifies classes • http://sesame.aidministrator.nl/sesame/actionFrameset.jsp?repository=museum
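To make the variable bindings in this query concrete, here is a small sketch that evaluates the same pattern over a list of (subject, property, object) tuples. The sample data is invented, and `like` is approximated as a case-sensitive substring test, which is an assumption for this sketch rather than RQL's defined matching semantics.

```python
TRIPLES = [
    ("picasso", "first_name", "Pablo"),
    ("picasso", "last_name", "Picasso"),
    ("rodin", "first_name", "Auguste"),
]


def select_x_and_p(triples, needle):
    """Mimic: select X, @P from {X} @P {Y} where Y like needle.

    X binds to the subject node, @P to the connecting edge, and Y to the
    object node; only X and @P are projected into the result.
    """
    return [(x, p) for x, p, y in triples if needle in y]


print(select_x_and_p(TRIPLES, "Pablo"))
```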

  25. RQL - Namespaces • In RDF, nodes and edges are identified by URIs • Can be very long • Namespace abbreviation mechanism • Extra clause • using namespace cult = http://www.icom.com/schema.rdf# • Simply type: cult:paints
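The abbreviation mechanism amounts to a prefix lookup. The sketch below expands an abbreviated name such as `cult:paints` into its full URI using the namespace declared on the slide; the function name and the dictionary representation are assumptions made for illustration.

```python
NAMESPACES = {"cult": "http://www.icom.com/schema.rdf#"}


def expand(name, namespaces):
    """Replace a declared prefix with its namespace URI; leave other names alone."""
    prefix, sep, local = name.partition(":")
    if sep and prefix in namespaces:
        return namespaces[prefix] + local
    return name


print(expand("cult:paints", NAMESPACES))
```

Names whose prefix is not declared, including full URIs, are returned unchanged.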

  26. RQL – Path Expressions • Specify a linear path through the graph select PAINTER, PAINTING, TECH from {PAINTER} cult:paints {PAINTING}. cult:technique {TECH} using namespace cult = http://www.icom.com/schema.rdf# • http://sesame.aidministrator.nl/sesame/actionFrameset.jsp?repository=museum
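A path expression like the one above chains two edges through a shared middle node. The sketch below evaluates that two-step path as a traversal over an index keyed by (node, property); the data and function names are invented for the example, and the approach is one plausible evaluation strategy, not a description of how Sesame executes the query.

```python
from collections import defaultdict

TRIPLES = [
    ("picasso", "cult:paints", "guernica"),
    ("guernica", "cult:technique", "oil on canvas"),
    ("rodin", "cult:sculpts", "thinker"),
]


def follow_path(triples, first, second):
    """Return (start, middle, end) for every start -first-> middle -second-> end."""
    index = defaultdict(list)  # (node, property) -> reachable nodes
    for s, p, o in triples:
        index[(s, p)].append(o)

    results = []
    for s, p, o in triples:
        if p != first:
            continue
        for end in index.get((o, second), []):
            results.append((s, o, end))
    return results


print(follow_path(TRIPLES, "cult:paints", "cult:technique"))
```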

  27. RQL – Querying Schema • Retrieving the class of a resource select X, $X, Y from {X : $X} cult:paints {Y} using namespace cult = http://www.icom.com/schema.rdf# • Variable $X is matched to the class of the resource value of X • http://sesame.aidministrator.nl/sesame/actionFrameset.jsp?repository=museum

  28. RQL – Querying Schema • Constraining resources to a schema select X, Y from {X : cult:Cubist } cult:paints {Y} using namespace cult = http://www.icom.com/schema.rdf#
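The two schema queries differ only in whether the class position is a variable or a fixed class. The sketch below shows both cases over invented data, using `rdf:type` statements to look up each resource's class. It matches directly asserted types only; whether subclass instances should also match is left out here, so treat this as a simplification rather than RQL's exact behavior.

```python
TRIPLES = [
    ("picasso", "rdf:type", "cult:Cubist"),
    ("rembrandt", "rdf:type", "cult:Painter"),
    ("picasso", "cult:paints", "guernica"),
    ("rembrandt", "cult:paints", "nightwatch"),
]


def painters_with_class(triples, wanted_class=None):
    """Mimic {X : $X} cult:paints {Y}; pass wanted_class to fix the class."""
    types = {}
    for s, p, o in triples:
        if p == "rdf:type":
            types.setdefault(s, []).append(o)

    rows = []
    for s, p, o in triples:
        if p != "cult:paints":
            continue
        for cls in types.get(s, []):
            if wanted_class is None or cls == wanted_class:
                rows.append((s, cls, o))
    return rows


print(painters_with_class(TRIPLES))                 # class returned as a binding
print(painters_with_class(TRIPLES, "cult:Cubist"))  # class used as a constraint
```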

  29. RQL – Standard Functions • Class (also Property) • subClassOf (also subPropertyOf) • typeOf • In all of the above, use ^ for only direct descendants (e.g. subClassOf^( cult:Painter ) )

  30. RQL – subClassOf • Example: select X, @P, Y from {X} @P {Y} where X in subClassOf^( cult:Painter ) using namespace cult = http://www.icom.com/schema.rdf#
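The difference between `subClassOf` and `subClassOf^` is the difference between the transitive closure and a single step. The sketch below computes both over a small class hierarchy; the hierarchy itself and the function names are hypothetical and chosen only to illustrate the distinction.

```python
# (subclass, superclass) pairs; an invented hierarchy for illustration.
EDGES = [
    ("Painter", "Artist"),
    ("Sculptor", "Artist"),
    ("Cubist", "Painter"),
    ("Flemish", "Painter"),
]


def direct_subclasses(edges, cls):
    """Analogue of subClassOf^(cls): one step down only."""
    return {sub for sub, sup in edges if sup == cls}


def all_subclasses(edges, cls):
    """Analogue of subClassOf(cls): every descendant, at any depth."""
    found, frontier = set(), [cls]
    while frontier:
        for sub in direct_subclasses(edges, frontier.pop()):
            if sub not in found:
                found.add(sub)
                frontier.append(sub)
    return found


print(sorted(direct_subclasses(EDGES, "Artist")))
print(sorted(all_subclasses(EDGES, "Artist")))
```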

  31. RQL – Advanced Queries • Set Operators • Union, Intersection, Difference • Logical Operators • Domain and Range Constraints • Comprehensive List: http://sesame.aidministrator.nl/publications/rql-tutorial.html

  32. Future of RDF Databases • Standard query language • Improved storage structures • Native graph model

  33. References / Links • Sesame: http://sesame.aidministrator.nl/ • NLnet Foundation: http://www.nlnet.nl/ • Original Specifications of RQL: http://139.91.183.30:9090/RDF/RQL
