This paper discusses high-level data access via query rewriting in ontology-based data access (OBDA) systems. It emphasizes the importance of abstracting from diverse data sources to optimize query answering, particularly in domains with large, distributed, and inconsistent datasets. We explore algorithms for rewriting queries into unions of conjunctive queries (UCQs) and generating efficient SQL queries for relational databases. The paper also highlights ongoing research trends in the Semantic Web and presents benchmarks comparing query rewriting methodologies, aiming for improved efficiency and expressiveness in real-world applications.
High-level Data Access Based on Query Rewritings Ekaterina Stepalina Higher School of Economics
High-Level Data Access • Concentration on application domain tasks • Abstraction from data sources • Efficient work • Research • The problem is actively discussed at recent scientific conferences on knowledge representation and ontologies: OWLED (2009), (ICDE IIMAS, 2008), and in the Semantic Web journal (2011, the Mastro system) • W3C developed OWL 2 and OWL 2 QL (2008), among others.
Ontology-Based Data Access (OBDA) • Large amounts of data (distributed, inconsistent) • Main task – query answering (domain-oriented and efficient)
What Is an Ontology? • An ontology is a description of a knowledge domain in some knowledge representation language. • Entity-Relationship and UML class diagrams can be seen as ontology languages.
Logic-Based Knowledge Representation • Enables semantic processing of data • Enables inference of implicit knowledge • Well studied and actively developed • Description logics (Baader, 1999), esp. DL-Lite • Standardized • OWL 2 Profiles
DL-Lite Best Suits OBDA • Highly expressive yet computationally efficient • Allows delegating query answering to DBMSs and using all the advantages of modern relational technologies • Supported by the W3C standard OWL 2 QL
Query Answering Problem • Given a knowledge base $K = (T, A)$, a query $q$, and an n-tuple $\vec{a}$ of objects from $A$, decide whether $K \models q(\vec{a})$, i.e. whether the n-tuple is an answer to $q$ with respect to $K$. For knowledge represented in DL-Lite, we can formulate queries over domain concepts, translate them into ordinary SQL queries, and execute them over separate databases.
OBDA System Architecture • Ontology Editor • OBDA-Enabled Reasoner • Mapping Processor • Data Source Manager • Consistency Checker
Query Rewritings • The OBDA-enabled reasoner rewrites the initial ontology query into a union of conjunctive queries (UCQ). • The mapping processor builds an SQL query from the UCQ and the given mappings. • The syntax of the initial query may vary (SPARQL, Datalog, etc.)
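The second step above, building SQL from a UCQ and mappings, can be sketched as follows. This is a minimal illustration under strong assumptions: each predicate is mapped to a single table with positional columns (real OBDA mappings are usually arbitrary SQL queries), and the names `cq_to_sql`, `ucq_to_sql`, `Employee`, and the mapping itself are hypothetical, not taken from any particular system.

```python
from typing import Dict, List, Tuple

Atom = Tuple[str, Tuple[str, ...]]                 # (predicate, variables)
Mapping = Dict[str, Tuple[str, Tuple[str, ...]]]   # predicate -> (table, columns)


def cq_to_sql(answer_vars: List[str], atoms: List[Atom], mapping: Mapping) -> str:
    """Translate one conjunctive query into a SELECT with equality joins."""
    first_seen: Dict[str, str] = {}   # variable -> first column reference
    from_items: List[str] = []
    conditions: List[str] = []
    for i, (pred, variables) in enumerate(atoms):
        table, columns = mapping[pred]
        alias = f"t{i}"
        from_items.append(f"{table} {alias}")
        for var, col in zip(variables, columns):
            ref = f"{alias}.{col}"
            if var in first_seen:
                # A repeated variable becomes a join condition.
                conditions.append(f"{first_seen[var]} = {ref}")
            else:
                first_seen[var] = ref
    select = ", ".join(first_seen[v] for v in answer_vars)
    sql = f"SELECT DISTINCT {select} FROM {', '.join(from_items)}"
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql


def ucq_to_sql(answer_vars: List[str], cqs: List[List[Atom]], mapping: Mapping) -> str:
    """A UCQ becomes the UNION of the SQL for each conjunctive query."""
    return "\nUNION\n".join(cq_to_sql(answer_vars, cq, mapping) for cq in cqs)


# Hypothetical mapping: predicate name -> (table, columns in argument order).
mapping: Mapping = {
    "Person": ("Person", ("name", "age")),
    "Lives": ("Lives", ("person", "city")),
    "Employee": ("Manages", ("employee",)),
}

join_sql = cq_to_sql(["x"], [("Person", ("x", "a")), ("Lives", ("x", "c"))], mapping)
union_sql = ucq_to_sql(["x"], [[("Person", ("x", "a"))], [("Employee", ("x",))]], mapping)
print(join_sql)
print(union_sql)
```

Shared variables turn into join conditions and each disjunct of the UCQ turns into one branch of the UNION, which is why a smaller UCQ generally yields a cheaper SQL query.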
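TBox and ABox in DL • TBox is a finite set of concept and role inclusion axioms of the form $C_1 \sqsubseteq C_2$ and $R_1 \sqsubseteq R_2$ • ABox is a finite set of assertions such as $A(a_i)$ and $P(a_i, a_j)$ • Here $a_i$, $a_j$ are object names, $A$ is a concept name, $P$ is a role name, and $q$ is an integer (as used in number restrictions $\geq q\,R$).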
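Interpretation • An interpretation (a particular instance of the KB) is a pair $I = (\Delta^I, \cdot^I)$ of a non-empty domain $\Delta^I$ and an interpretation function $\cdot^I$: $a_i^I \in \Delta^I$, $A^I \subseteq \Delta^I$, and $P^I \subseteq \Delta^I \times \Delta^I$. • UNA (unique name assumption): $a_i^I \neq a_j^I$ for all $i \neq j$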
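To make the definition concrete: an interpretation assigns a domain element to each object name, a subset of the domain to each concept name, and a set of domain pairs to each role name. The sketch below checks these conditions, the UNA, and satisfaction of simple ABox assertions for a tiny example. All names and data (`ann`, `bob`, `Person`, `manages`, the helper functions) are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set, Tuple


def is_interpretation(domain: Set[str],
                      objects: Dict[str, str],
                      concepts: Dict[str, Set[str]],
                      roles: Dict[str, Set[Tuple[str, str]]]) -> bool:
    """Check a^I in Delta, A^I subset of Delta, P^I subset of Delta x Delta."""
    if not domain:                       # the domain must be non-empty
        return False
    objects_ok = all(d in domain for d in objects.values())
    concepts_ok = all(ext <= domain for ext in concepts.values())
    roles_ok = all(x in domain and y in domain
                   for ext in roles.values() for (x, y) in ext)
    return objects_ok and concepts_ok and roles_ok


def satisfies_una(objects: Dict[str, str]) -> bool:
    """UNA: distinct object names denote distinct domain elements."""
    return len(set(objects.values())) == len(objects)


def satisfies_abox(objects: Dict[str, str],
                   concepts: Dict[str, Set[str]],
                   roles: Dict[str, Set[Tuple[str, str]]],
                   concept_assertions: List[Tuple[str, str]],
                   role_assertions: List[Tuple[str, str, str]]) -> bool:
    """Check assertions A(a) and P(a, b) against the interpretation."""
    return (all(objects[a] in concepts[A] for A, a in concept_assertions)
            and all((objects[a], objects[b]) in roles[P]
                    for P, a, b in role_assertions))


# Hypothetical example data.
domain = {"d1", "d2"}
objects = {"ann": "d1", "bob": "d2"}
concepts = {"Person": {"d1", "d2"}}
roles = {"manages": {("d1", "d2")}}

valid = is_interpretation(domain, objects, concepts, roles)
una = satisfies_una(objects)
model = satisfies_abox(objects, concepts, roles,
                       [("Person", "ann"), ("Person", "bob")],
                       [("manages", "ann", "bob")])
print(valid, una, model)
```

Mapping both `ann` and `bob` to the same element `d1` would still be an interpretation, but it would violate the UNA.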
OWL 2 QL • UNA is not adopted; (in)equality must be stated explicitly • Expressive power is restricted to $\textit{DL-Lite}^{H}_{core}$ (also denoted $\textit{DL-Lite}_{R}$). • Basic conceptual modeling relations are available: (A)sym, (Ir)Ref, Tran • Main constraints of $\textit{DL-Lite}^{H}_{core}$: • Functional relations cannot be defined • Roles cannot be restricted to specific concepts; all roles apply to all concepts • Disjunctive covering of the knowledge domain cannot be defined
Query Rewriting Sample • RDB tables: Person(name, age), Lives(person, city), Manages(boss, employee). • Query: Get the names and ages of all people living in the same city as their boss. • UCQ: • Simplified UCQ: • SQL query: • SELECT P.name, P.age • FROM Person P, Manages M, Lives L1, Lives L2 • WHERE P.name = L1.person AND P.name = M.employee AND M.boss = L2.person AND L1.city = L2.city
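The SQL query from this sample can be tried out on an in-memory SQLite database. Only the table and column names and the query itself come from the slide; the rows inserted below are hypothetical sample data.

```python
import sqlite3

# Hypothetical sample rows; the schema follows the slide.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person(name TEXT, age INTEGER);
    CREATE TABLE Lives(person TEXT, city TEXT);
    CREATE TABLE Manages(boss TEXT, employee TEXT);
    INSERT INTO Person VALUES ('Ann', 45), ('Bob', 30), ('Eve', 28);
    INSERT INTO Lives VALUES ('Ann', 'Moscow'), ('Bob', 'Moscow'), ('Eve', 'Kazan');
    INSERT INTO Manages VALUES ('Ann', 'Bob'), ('Ann', 'Eve');
""")

# The query from the slide: people living in the same city as their boss.
SQL = """
    SELECT P.name, P.age
    FROM Person P, Manages M, Lives L1, Lives L2
    WHERE P.name = L1.person
      AND P.name = M.employee
      AND M.boss = L2.person
      AND L1.city = L2.city
"""
rows = conn.execute(SQL).fetchall()
print(rows)
```

With this data only Bob qualifies: he and his boss Ann both live in Moscow, while Eve lives in a different city and Ann has no boss.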
Query Rewriting Algorithms • CGLLR (Calvanese et al., 2007) • Applies all suitable TBox axioms to the query • Replaces axioms containing existential quantification with three other axioms, which increases the size of the UCQ • RQR (Pérez-Urbina, Horrocks, Motik, 2009) • Generates clauses from TBox axioms and then resolves the clauses with the query • Potentially supports more expressive DLs
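The general idea of applying TBox axioms to a query can be illustrated with a deliberately simplified sketch. It handles only atomic concept inclusions (`sub ⊑ sup`) and is neither CGLLR nor RQR: existential quantification, role inclusions, and atom unification are all omitted. The function `rewrite` and the example concepts are hypothetical.

```python
from typing import FrozenSet, List, Set, Tuple

Atom = Tuple[str, Tuple[str, ...]]   # (predicate, arguments)
CQ = FrozenSet[Atom]                 # a conjunctive query as a set of atoms


def rewrite(query: CQ, inclusions: List[Tuple[str, str]]) -> Set[CQ]:
    """Expand a CQ into a UCQ using atomic inclusions (sub, sup): sub is subsumed by sup.

    Simplified illustration only: an atom sup(x) is replaced by sub(x)
    whenever the TBox contains sub ⊑ sup, until no new CQ appears.
    """
    result: Set[CQ] = {query}
    frontier: List[CQ] = [query]
    while frontier:
        cq = frontier.pop()
        for atom in cq:
            pred, args = atom
            for sub, sup in inclusions:
                if sup == pred and len(args) == 1:
                    new_cq = (cq - {atom}) | {(sub, args)}
                    if new_cq not in result:
                        result.add(new_cq)
                        frontier.append(new_cq)
    return result


# Hypothetical TBox: Professor ⊑ Teacher, Teacher ⊑ Person.
tbox = [("Professor", "Teacher"), ("Teacher", "Person")]
query: CQ = frozenset({("Person", ("x",))})

ucq = rewrite(query, tbox)
for cq in sorted(ucq, key=sorted):
    print(sorted(cq))
```

For the query `Person(x)` this yields three conjunctive queries, one each for `Person`, `Teacher`, and `Professor`. Even in this toy setting the UCQ grows with the depth of the concept hierarchy, which hints at why the number of generated queries is a natural metric for comparing rewriting algorithms.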
Query Rewriting Benchmark • 9 ontologies with axioms containing existential quantification: • Vicodi (V) • Stock exchange (S) • University (U, UX) • Adolena (A, AX) • Synthetic (P1, P5, P5X)
Comparison Results • RQR is preferable to CGLLR for implementation in OBDA-enabled reasoners: • Generates fewer conjunctive queries in the UCQ, especially for ontologies with many existential quantifications • Can be further optimized and extended to more expressive DLs
Current Work • Preparing an ontology for a real application, an interactive television (IPTV) platform, to test the algorithms on real data • Optimizing RQR by reducing the number of generated clauses • Main idea: rather than extending RQR itself, support greater expressiveness and all OWL 2 QL constructors through powerful mappings
References • F. Baader, D. Calvanese, D. McGuinness, D. Nardi, and P. F. Patel-Schneider, editors. The Description Logic Handbook: Theory, Implementation and Applications. Cambridge University Press, 2002. ISBN 0521781760. • F. Baader. Logic-Based Knowledge Representation. In M. J. Wooldridge and M. Veloso, editors, Artificial Intelligence Today, Recent Trends and Developments, number 1600 in Lecture Notes in Computer Science, pages 13–41. Springer-Verlag, 1999. • A. Artale, D. Calvanese, R. Kontchakov, and M. Zakharyaschev. The DL-Lite Family and Relations. Journal of Artificial Intelligence Research, 36(1), pages 1–69, 2009. ISSN 1076-9757. • H. Pérez-Urbina, I. Horrocks, and B. Motik. Efficient Query Answering for OWL 2. In Proceedings of the 8th International Semantic Web Conference (ISWC 2009), Chantilly, Virginia, USA, 2009. • H. Pérez-Urbina, B. Motik, and I. Horrocks. Tractable Query Answering and Rewriting under Description Logic Constraints. Journal of Applied Logic, 2009.