The TAP system aims to create a cohesive global knowledge base by integrating data from diverse web services into a unified framework. This architecture facilitates data publishing, API consumption, and robust querying through a minimalist interface, GetData. Edutella complements this by offering P2P networking infrastructure for RDF metadata, supporting standardized queries, replication, and semantic mediation. Together, TAP and Edutella enhance data interoperability, making it easier to query and aggregate knowledge across the semantic web landscape.
Semantic Web Search By Raluca PAIU [paiu@l3s.de] Raluca Paiu
Overview
• The TAP System
• Edutella
• Edutella Wrapper
The TAP System
• Goal: create a single, schematically unified global knowledge base by knitting together data from disparate web services into a coherent whole.
TAP Architecture [1]
• TAP provides:
  • A facility for publishing data
  • A library that implements an application programming interface for consuming this data
  • A registry
TAP Architecture [2]
• Publishing data: TAPache
  • Functions as a module for the Apache HTTP server
  • Provides the GetData interface
  • Offers a mechanism for aggregating the data in multiple RDF files
TAP Architecture [3]
• Consuming data: through a minimalist query interface called GetData
TAP Architecture [4]
• The registry:
  • Available as a separate server
  • Can be abstracted as a lookup table
  • Redirects queries to the appropriate sites
  • Caching
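The lookup-table view of the registry can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Registry` class, the choice of the property name as lookup key, the `fake_fetch` stand-in for a network call, and the example site URL are hypothetical and are not part of TAP.

```python
class Registry:
    """Hypothetical stand-in for the TAP registry (not the TAP API)."""

    def __init__(self, table, fetch):
        self.table = dict(table)  # assumed key: property name -> site
        self.fetch = fetch        # callable(site, resource, prop) -> list of values
        self.cache = {}           # (resource, prop) -> cached answer

    def get_data(self, resource, prop):
        key = (resource, prop)
        if key not in self.cache:
            site = self.table[prop]  # redirect to the appropriate site
            self.cache[key] = self.fetch(site, resource, prop)
        return self.cache[key]


calls = []


def fake_fetch(site, resource, prop):
    """Stand-in for a network call; records which site was contacted."""
    calls.append(site)
    return ["Paris"] if (resource, prop) == ("Yo-Yo Ma", "birthplace") else []


registry = Registry({"birthplace": "http://people.example.org"}, fake_fetch)
first = registry.get_data("Yo-Yo Ma", "birthplace")
second = registry.get_data("Yo-Yo Ma", "birthplace")  # served from the cache
```

In this sketch the second call never reaches `fake_fetch`, which is the effect the caching bullet describes.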
GetData [1]
• Simple query interface to network-accessible data presented as directed labeled graphs
• Requirements:
  • Simplicity
  • Predictability
GetData [2]
• Allows a client program to access the values of one or more properties (or their inverses) of a resource from a graph
• Each GetData query is a SOAP message
• A message specifies two arguments:
  • The resource whose properties are being accessed
  • The properties that are being accessed
• Optional arguments: whether the client wants the inverse of the properties, and the number of answers desired
• The answer to a GetData query is a graph that contains the queried resource along with the properties specified in the query and their respective targets (or sources, for inverse queries)
GetData [3]
• The abstract syntax of a GetData query:
  • GetData(<resource>, <property>) -> <value>
  • GetData(<resource>, <property>, "inverse=yes") -> <value>
• For a triple with subject S, property P and object O (S -P-> O):
  • GetData(S, P) -> O
  • GetData(O, P, "inverse=yes") -> S
GetData [4]
• Example:
  • GetData(<Yo-Yo Ma>, birthplace) => <Paris>
  • GetData(<Yo-Yo Ma>, Author, inverse=yes) => <Appalachian Journey>, <Taverner>
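The direct and inverse lookups in this example can be mimicked over an in-memory list of triples. This is only a sketch: the real GetData is a SOAP call that returns a graph, whereas the hypothetical `get_data` function below returns plain Python lists, and the triple list is built from the two examples on the slide.

```python
# (subject, property, object) triples taken from the slide's examples.
TRIPLES = [
    ("Yo-Yo Ma", "birthplace", "Paris"),
    ("Appalachian Journey", "Author", "Yo-Yo Ma"),
    ("Taverner", "Author", "Yo-Yo Ma"),
]


def get_data(graph, resource, prop, inverse=False):
    """Return the objects of (resource, prop), or the subjects if inverse is set."""
    if inverse:
        return [s for (s, p, o) in graph if p == prop and o == resource]
    return [o for (s, p, o) in graph if p == prop and s == resource]


birthplace = get_data(TRIPLES, "Yo-Yo Ma", "birthplace")
works = get_data(TRIPLES, "Yo-Yo Ma", "Author", inverse=True)
```

Here `birthplace` holds the direct answer and `works` holds the sources of the `Author` property that point at the resource.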
Edutella
• P2P networking infrastructure based on RDF
• Offers the following services:
  • Query Service – standardized query and retrieval of RDF metadata
  • Replication Service – for availability, balancing and data persistence
  • Mapping Service – translation between different metadata vocabularies
  • Mediation Service – mediates access between different services
  • Clustering Service – sets up semantic routing and semantic clusters
Edutella Query Service
• Standardized query exchange mechanism for RDF metadata stored in distributed RDF repositories
• The Edutella network uses the query exchange language family RDF-QEL-i (based on Datalog semantics) as its standardized query exchange format; queries are transmitted in RDF/XML.
• The query language levels are defined as follows:
  • RDF-QEL-1 – restricted to conjunctive formulas only
  • RDF-QEL-2 – extends RDF-QEL-1 with disjunction
  • RDF-QEL-3 – contains the full Datalog semantics (conjunction, disjunction, negation)
  • Further levels allow different models of recursion
Datalog Semantics [1]
• A Datalog program can be expressed as:
  • A set of rules/implications:
    • Head – one positive literal in the consequent of the rule
    • Body – conjunction of one or more literals in the antecedent of the rule, including conditions on variables
  • A set of facts – single positive literals
  • The actual query literals (a rule without a head)
• Literals – predicate expressions describing relations between any combination of variables and constants
Datalog Semantics [2]
• Disjunction – expressed as a set of rules with identical head
• A Datalog query is formed by:
  • A conjunction of query literals
  • A possibly empty set of rules
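A small sketch may make the two slides above concrete: a conjunction of query literals is matched against facts, and disjunction is expressed as two rules with the same head. The helper names, the `?x` convention for variables, and the `teaches`/`tutors` facts are all made up for illustration; the evaluator applies rules in a single pass, so it handles neither recursion nor negation.

```python
def is_var(term):
    """Assumed convention for this sketch: variables start with '?'."""
    return term.startswith("?")


def match(literal, fact, binding):
    """Unify one literal with one ground fact; return the extended binding or None."""
    if literal[0] != fact[0] or len(literal) != len(fact):
        return None
    new = dict(binding)
    for term, value in zip(literal[1:], fact[1:]):
        if is_var(term):
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def solve(body, facts, binding=None):
    """Yield every binding that satisfies the conjunction of literals in body."""
    binding = {} if binding is None else binding
    if not body:
        yield binding
        return
    for fact in facts:
        extended = match(body[0], fact, binding)
        if extended is not None:
            yield from solve(body[1:], facts, extended)


def apply_rules(rules, facts):
    """Add the head facts derivable from the rules in one pass (no recursion)."""
    derived = set(facts)
    for head, body in rules:
        for b in solve(body, facts):
            derived.add((head[0],) + tuple(b.get(t, t) for t in head[1:]))
    return derived


FACTS = {("teaches", "alice", "logic"), ("tutors", "bob", "logic")}
# Disjunction: two rules with the identical head instructs(?x, ?c).
RULES = [
    (("instructs", "?x", "?c"), [("teaches", "?x", "?c")]),
    (("instructs", "?x", "?c"), [("tutors", "?x", "?c")]),
]
QUERY = [("instructs", "?x", "logic")]

answers = sorted(b["?x"] for b in solve(QUERY, apply_rules(RULES, FACTS)))
```

The query is a single literal here; adding more literals to `QUERY` turns it into a conjunction that all bindings must satisfy.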
Edutella Wrapper [1]
• Every wrapper must perform the following process:
  • Receive a QEL query, as a string, that uses the Elena Common Ontology
  • Parse and interpret the QEL query
  • Map the Elena Common Ontology to the local ontology
  • Convert the QEL query to the local query language
  • Send the transformed query to the repository
  • Receive the results from the repository
  • Transform the results into a variable binding table
  • Return the results
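The middle steps of this process (ontology mapping, sending the query, building the binding table) can be sketched as a small pipeline. The sketch assumes the QEL string has already been parsed into triple patterns, and every name in it (`map_ontology`, `to_binding_table`, `run_wrapper`, the `common:author` and `local:creator` terms, the fake repository) is hypothetical rather than part of Edutella or Elena.

```python
def map_ontology(patterns, mapping):
    """Rewrite common-ontology terms to local terms; unknown terms pass through."""
    return [tuple(mapping.get(term, term) for term in pattern) for pattern in patterns]


def to_binding_table(variables, rows):
    """Turn result rows into a list of {variable: value} bindings."""
    return [dict(zip(variables, row)) for row in rows]


def run_wrapper(patterns, mapping, repository, variables):
    """Map the ontology, query the repository, and return a binding table."""
    local_patterns = map_ontology(patterns, mapping)
    rows = repository(local_patterns)
    return to_binding_table(variables, rows)


MAPPING = {"common:author": "local:creator"}  # made-up vocabulary terms
seen = []


def fake_repository(local_patterns):
    """Stand-in for a real repository; records the query it receives."""
    seen.append(local_patterns)
    return [("book1", "alice")]


table = run_wrapper([("X", "common:author", "Y")], MAPPING, fake_repository, ["X", "Y"])
```

Keeping the repository behind a plain callable means the same pipeline could, in principle, sit in front of different local query languages.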
Edutella Wrapper [2]
• Wrapping QEL to GetData:
  • Map the QEL query to an N-Tree
    • Every node corresponds to a variable or a resource
    • A node corresponding to a variable may have restrictions associated with it
  • Traverse the N-Tree to find the order in which the GetData queries have to be sent
    • Top-down – for direct search
    • Bottom-up – for inverse search
  • Bind the results to the variables
Edutella Wrapper [3]
• For a node that corresponds to a variable and has more than one child, intersect the results obtained on each branch
• Apply the restrictions (if any) to the node corresponding to a variable
• If the query is made of rules, there is one N-Tree per rule, and the results for a variable must be unioned across the trees
• Return the results as RDF graph answers
[Figure: node X with children Y1, Y2, …, Yn, reached through properties P1, P2, …, Pn]
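The three set operations named on this slide (intersect across branches, filter by restrictions, union across rule trees) can be sketched with Python sets. The function names and the sample values are hypothetical, and restrictions are modelled here as simple predicates, which is an assumption about their form.

```python
def combine_branches(branch_results):
    """Intersect the candidate values found on each child branch of a variable node."""
    sets = [set(values) for values in branch_results]
    return set.intersection(*sets) if sets else set()


def apply_restrictions(values, restrictions):
    """Keep only the values that pass every restriction (modelled as predicates)."""
    return {v for v in values if all(test(v) for test in restrictions)}


def union_over_rules(per_rule_results):
    """Union the values bound to one variable across the N-Trees of all rules."""
    return set().union(*per_rule_results)


# Made-up candidates for a variable X found via two branches (P1 and P2).
via_p1 = ["alice", "bob", "carol"]
via_p2 = ["bob", "carol", "dave"]
candidates = combine_branches([via_p1, via_p2])
restricted = apply_restrictions(candidates, [lambda v: v.startswith("c")])
merged = union_over_rules([restricted, {"erin"}])
```
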
Edutella Wrapper [4]
• Example:
  ?- qel:s(X,<http://localhost/data/tap.rdf/teamMember>,Y),
     qel:s(Y,<http://localhost/data/tap.rdf/hasResearchArea>, <http://localhost/data/tap.rdf/Artificial_Intelligence>).
• The corresponding tree:
  Node (Name: X, Type: variable, Restrictions: null)
    <teamMember>
    Node (Name: Y, Type: variable, Restrictions: null)
      <hasResearchArea>
      Node (Name: Artificial_Intelligence, Type: resource, Restrictions: null)
Edutella Wrapper [5]
• The tree corresponds to an inverse search -> bottom-up traversal (first all the children of a node, then the node itself)
• Y <- GetData(<Artificial_Intelligence>, <hasResearchArea>, inverse=yes)
• For each binding Yi of Y: X <- GetData(<binding_Yi>, <teamMember>, inverse=yes)
• Return the results as RDF graph answers
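The two GetData calls on this slide can be traced over a toy graph. The triples below are invented sample data, `get_data` is a hypothetical in-memory stand-in for the SOAP interface, and the result is returned as a variable binding table rather than as RDF graph answers.

```python
# Invented sample data: (subject, property, object) triples.
GRAPH = [
    ("alice", "hasResearchArea", "Artificial_Intelligence"),
    ("bob", "hasResearchArea", "Databases"),
    ("team_a", "teamMember", "alice"),
    ("team_b", "teamMember", "bob"),
]


def get_data(graph, resource, prop, inverse=False):
    """Return the objects of (resource, prop), or the subjects if inverse is set."""
    if inverse:
        return [s for (s, p, o) in graph if p == prop and o == resource]
    return [o for (s, p, o) in graph if p == prop and s == resource]


def answer_example(graph):
    """Bottom-up: bind Y from the resource leaf first, then X from each Y."""
    table = []
    for y in get_data(graph, "Artificial_Intelligence", "hasResearchArea", inverse=True):
        for x in get_data(graph, y, "teamMember", inverse=True):
            table.append({"X": x, "Y": y})
    return table


bindings = answer_example(GRAPH)
```

Starting from the only known resource, the leaf, keeps every call fully specified: each GetData request has a concrete resource and property.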
Thank You!