Parallel ABox Reasoning of EL Ontologies

Yuan Ren, Jeff Z. Pan and Kevin Lee
University of Aberdeen, UK; NICTA, Australia
JIST2011

Motivation
• Computing infrastructure has improved significantly in the last decades.
  • Computer networks → cloud computing
  • Integrated circuits → multi-core processors
• Computation can be, and already has been, parallelised in many applications.
• But most off-the-shelf DL reasoners do not support parallelised reasoning.

Existing Works
• Parallel reasoning with multiple computational nodes
  • Marvin system: RDF reasoning;
  • Weaver and Hendler’s work: RDFS inference;
  • SAOR: join-free pD* inference;
  • DRAGO: OWL DL reasoning;
  • Anne and Heiner’s work: ALCHIQ reasoning;
  • MapReduce-based approaches such as WebPIE: RDFS, pD*, EL TBox reasoning; pD* justification
• Parallel reasoning with multiple computational cores in a single computer
  • Soma and Prasanna’s work: pD* reasoning;
  • Liebig and Müller’s work: SHN tableau reasoning;
  • Meissner’s work: ALC tableau reasoning;
  • Aslani and Haarslev’s work: TBox reasoning;
  • ELK algorithm by Kazakov: ELHR+ TBox classification;
[Figure: landscape of languages covered by existing work. Labels: SROIQ, SHIN, ALC, EL+, pD*, RDFS, RDF; regions "intractable" and "tractable"; ELHR+ marked "TBox only"]
• Parallel reasoning for large amounts of data in the EL profile is missing

Supporting ELH⊥,R+ ABox Reasoning
• Why the EL family?
  • Some of the largest well-known terminologies are in the EL family,
  • E.g. SNOMED CT;
• Why ABox reasoning?
  • Semantic applications will populate terminologies with data
  • E.g. Chintan Patel et al. (SWJ2007) populated SNOMED CT with 59 million ABox assertions.
• Why is the “bottom” non-trivial?
  • It enables inconsistency checking
  • E.g. in SNOMED CT, Groin is defined as Abdomen AND Leg, which is inaccurate and can be detected if Abdomen and Leg are disjoint.
  • The role hierarchy cannot be pre-computed

ELH⊥,R+ Syntax and Semantics

TBox Reasoning in ELHR+
• EL reasoning is realised by applying completion rules
  • Starting from the original axioms
  • Check which axioms can be joined to trigger rules
  • Increase the entailment set until it is closed under the rules
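The saturation loop described above can be sketched as a sequential worklist procedure. This is a minimal illustration for a small EL fragment (atomic subsumptions, binary conjunctions and existential restrictions), not the rule set used in the talk; the function name `saturate` and the tuple encodings of axioms are assumptions made for this example.

```python
from collections import deque

def saturate(concepts, told_sub, conj, exist_rhs, exist_lhs):
    """Derive all atomic subsumptions (C, D), read as C ⊑ D.

    told_sub:  set of (A, B)       for axioms A ⊑ B
    conj:      set of (A1, A2, B)  for axioms A1 ⊓ A2 ⊑ B
    exist_rhs: set of (A, r, B)    for axioms A ⊑ ∃r.B
    exist_lhs: set of (r, A, B)    for axioms ∃r.A ⊑ B
    (Hypothetical encoding, chosen only for this sketch.)
    """
    subs = set()    # derived C ⊑ D
    links = set()   # derived C ⊑ ∃r.D, stored as (C, r, D)
    todo = deque(("sub", c, c) for c in concepts)  # start from C ⊑ C

    while todo:
        item = todo.popleft()
        if item[0] == "sub":
            _, c, d = item
            if (c, d) in subs:
                continue
            subs.add((c, d))
            # C ⊑ D and D ⊑ E give C ⊑ E
            for a, b in told_sub:
                if a == d:
                    todo.append(("sub", c, b))
            # C ⊑ D1, C ⊑ D2 and D1 ⊓ D2 ⊑ E give C ⊑ E
            for a1, a2, b in conj:
                if d in (a1, a2) and (c, a1) in subs and (c, a2) in subs:
                    todo.append(("sub", c, b))
            # C ⊑ D and D ⊑ ∃r.E give C ⊑ ∃r.E
            for a, r, b in exist_rhs:
                if a == d:
                    todo.append(("link", c, r, b))
            # E ⊑ ∃r.C, C ⊑ D and ∃r.D ⊑ F give E ⊑ F
            for e, r, filler in list(links):
                if filler == c:
                    for r2, a, b in exist_lhs:
                        if r2 == r and a == d:
                            todo.append(("sub", e, b))
        else:
            _, c, r, d = item
            if (c, r, d) in links:
                continue
            links.add((c, r, d))
            # C ⊑ ∃r.D, D ⊑ E and ∃r.E ⊑ F give C ⊑ F
            for d1, d2 in list(subs):
                if d1 == d:
                    for r2, a, b in exist_lhs:
                        if r2 == r and a == d2:
                            todo.append(("sub", c, b))
    return subs
```

For example, with A ⊑ B, B ⊑ ∃r.C and ∃r.C ⊑ D, the returned set contains ("A", "D"). The loop stops once no rule produces a new entailment, which is the closure condition named in the last bullet.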

TBox Reasoning in ELHR+
• A naïve approach requires guarding shared data collections with locks
  • Inserting into and retrieving from a set cannot be performed at the same time!
• Solution 1: guarding with locks
• Solution 2: separating the inferences and data collections

Parallel TBox Reasoning in ELHR+
• Key data structures:
  • Axiom: a GCI
    • RI closures are pre-computed;
  • Context: a concept
    • Context.scheduled: a queue of Axioms to be processed;
    • Context.processed: a set of Axioms already processed;
    • Context.isActive = true iff Context.scheduled is non-empty;
  • ActiveContexts: a queue of Contexts
    • Every element must be unique
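The data structures listed above might be laid out as follows. This is a single-threaded sketch: the class and method names (`Context`, `ActiveContexts`, `schedule`, `poll`, `deactivate_if_idle`) are assumptions for illustration, and a real multi-threaded implementation would need concurrent queues and an atomic test-and-set on the activation flag, which plain Python attributes do not provide.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass(eq=False)  # identity-based equality: one object per concept
class Context:
    concept: str
    scheduled: deque = field(default_factory=deque)  # axioms still to process
    processed: set = field(default_factory=set)      # axioms already processed
    is_active: bool = False                          # true iff queued as active

class ActiveContexts:
    """Queue of contexts in which every element appears at most once."""

    def __init__(self):
        self._queue = deque()

    def schedule(self, context, axiom):
        # Add the axiom, and enqueue the context only on the
        # inactive -> active transition, which keeps elements unique.
        context.scheduled.append(axiom)
        if not context.is_active:
            context.is_active = True
            self._queue.append(context)

    def poll(self):
        # Hand the next active context to a worker, or None if idle.
        return self._queue.popleft() if self._queue else None

    def deactivate_if_idle(self, context):
        # Called by a worker after draining context.scheduled.
        if not context.scheduled:
            context.is_active = False

    def __len__(self):
        return len(self._queue)
```

Scheduling two axioms into the same inactive context enqueues that context once, so no two workers would be handed the same context.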

Parallel TBox Reasoning in ELHR+
[Figure: Worker 1 and Worker 2, each operating on the scheduled queue and processed set of a context]

Parallel TBox Reasoning in ELHR+
[Figure: Worker 2 and Worker 1, each operating on the scheduled queue and processed set of a different context]
• Reasoning is completely and independently separated into different contexts

Parallel TBox Reasoning with ELHR+
• Get contexts of axioms
  • Contexts are only needed for premise axioms in rules
  • Not for side condition axioms!
• Once a new axiom is derived
  • It must be added into the schedule of ALL of its contexts
  • And later be saved into the processed set of ALL of its contexts
• Optimising by reducing premises in rules
• Optimising by reducing contexts in axioms
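The routing of a derived axiom to all of its contexts can be sketched as below. The context assignment used here (a subsumption C ⊑ D belongs to context C, an existential C ⊑ ∃r.D belongs to contexts C and D) is an assumption in the style of ELK, not something stated on the slide. The `derive` callback standing in for the completion rules is hypothetical, and the driver runs sequentially where the talk uses several workers.

```python
from collections import deque

class Context:
    def __init__(self, concept):
        self.concept = concept
        self.scheduled = deque()   # axioms waiting to be processed
        self.processed = set()     # axioms already processed

def contexts_of(axiom):
    """Assumed context assignment for the two axiom shapes of this sketch."""
    if axiom[0] == "sub":          # ("sub", C, D) encodes C ⊑ D
        return [axiom[1]]
    if axiom[0] == "ex":           # ("ex", C, r, D) encodes C ⊑ ∃r.D
        return list(dict.fromkeys([axiom[1], axiom[3]]))  # C and D, no repeats
    raise ValueError(f"unknown axiom shape: {axiom!r}")

def schedule(axiom, contexts, active):
    # A derived axiom goes into the schedule of ALL of its contexts.
    for name in contexts_of(axiom):
        ctx = contexts.setdefault(name, Context(name))
        ctx.scheduled.append(axiom)
        if ctx not in active:
            active.append(ctx)

def run(initial_axioms, derive):
    """derive(ctx, axiom) yields new axioms using only ctx.processed."""
    contexts, active = {}, deque()
    for axiom in initial_axioms:
        schedule(axiom, contexts, active)
    while active:
        ctx = active.popleft()     # in the parallel setting: one worker per ctx
        while ctx.scheduled:
            axiom = ctx.scheduled.popleft()
            if axiom in ctx.processed:
                continue
            ctx.processed.add(axiom)   # saved in this context's processed set
            for new_axiom in derive(ctx, axiom):
                schedule(new_axiom, contexts, active)
    return contexts
```

Because `derive` only reads the processed set of the context it is given, two workers holding different contexts never touch the same collection, which is the separation the earlier slides rely on.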

Extending TBox with Bottom
• The bottom rule:
• It still has a common context for all premise axioms
• Lock-free parallelisation guaranteed.

Extending to ABox Reasoning
• ABox reasoning
  • Computing the atomic types of all individuals
  • Computing the atomic relations between all individuals
• A simple approach by reusing the TBox algorithm
  • Internalising the ABox with nominals
  • Treating singleton nominals as atomic concepts
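Internalisation can be illustrated with the standard encoding: a class assertion C(a) becomes {a} ⊑ C and a role assertion R(a,b) becomes {a} ⊑ ∃R.{b}, with each singleton nominal then handled like an atomic concept name. The function names and tuple encodings below are assumptions made for this sketch.

```python
def nominal(individual):
    """Name the singleton nominal {a} so it can be used as an atomic concept."""
    return "{" + individual + "}"

def internalise(class_assertions, role_assertions):
    """Turn ABox assertions into TBox-style axioms over nominals.

    class_assertions: iterable of (concept, individual)    for C(a)
    role_assertions:  iterable of (role, subject, object)  for R(a, b)
    Returns ("sub", {a}, C) for {a} ⊑ C and ("ex", {a}, R, {b}) for {a} ⊑ ∃R.{b}.
    """
    axioms = []
    for concept, a in class_assertions:
        axioms.append(("sub", nominal(a), concept))
    for role, a, b in role_assertions:
        axioms.append(("ex", nominal(a), role, nominal(b)))
    return axioms
```

The resulting axioms can be fed to a TBox classification procedure unchanged, which is what makes this approach simple.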

Mixing TBox and ABox Reasoning
• Introducing redundancies
  • […] has to be maintained in A.scheduled and A.processed, waiting for the derivation of […].
[Figure: Worker 1 and Worker 2 with the scheduled queue and processed set of a context]

Separating TBox and ABox Reasoning • C.scheduledC.processed • contains no nominal! • Can always be computed earlier than • Can be used as side conditions in rules. • C does not need to be a context in

Separating TBox and ABox Reasoning
[Figure: TBox rules marked as applicable or NOT applicable to nominals, and their extension to ABox rules for two cases: when the filler is NOT a nominal, and when the filler is a nominal]

ABox Rules
• First perform TBox reasoning
  • Only non-nominals are used as contexts
• Then perform ABox reasoning with ABox rules
  • Only singleton nominals are used as contexts
• Sound & complete (Theorem 1)
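The two-stage order can be sketched as follows: the TBox results are computed first and then only read, as side conditions, while types are propagated per individual. This is a simplified illustration, not the ABox rule set of the talk: it covers atomic subsumptions and axioms of the form ∃R.C ⊑ D only, ignores conjunctions and the role hierarchy, and all names and encodings are assumptions.

```python
from collections import deque

def abox_types(tbox_subs, exist_lhs, class_assertions, role_assertions):
    """Compute atomic types per individual from a precomputed TBox.

    tbox_subs:        set of (C, D)        for entailed C ⊑ D (read-only)
    exist_lhs:        set of (R, C, D)     for axioms ∃R.C ⊑ D (read-only)
    class_assertions: iterable of (C, a)   for C(a)
    role_assertions:  iterable of (R, a, b) for R(a, b)
    """
    supers = {}
    for c, d in tbox_subs:
        supers.setdefault(c, set()).add(d)

    types = {}      # individual -> set of atomic concepts
    incoming = {}   # b -> list of (R, a) for each R(a, b)
    for r, a, b in role_assertions:
        incoming.setdefault(b, []).append((r, a))
        types.setdefault(a, set())
        types.setdefault(b, set())

    todo = deque((a, c) for c, a in class_assertions)
    while todo:
        a, c = todo.popleft()
        if c in types.setdefault(a, set()):
            continue
        types[a].add(c)
        # a : C and C ⊑ D (TBox side condition) give a : D
        for d in supers.get(c, ()):
            todo.append((a, d))
        # R(p, a), a : C and ∃R.C ⊑ D (TBox side condition) give p : D
        for r, p in incoming.get(a, ()):
            for r2, filler, d in exist_lhs:
                if r2 == r and filler == c:
                    todo.append((p, d))
    return types
```

With Student ⊑ Person, ∃teaches.Student ⊑ Teacher, Student(bob) and teaches(alice, bob), the result assigns Student and Person to bob and Teacher to alice. Each work item belongs to exactly one individual, which mirrors the use of singleton nominals as the only contexts in the ABox stage.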

Separating Relations and Types • {a}.scheduled{a}.processed • won’t affect relations! • Can always be computed later than • can be used as side conditions in rules. • {a} does not need to be a context in in type stage.

Separating Relations and Types
• Relation computations are perfectly parallelised
  • R(a,b) → S(a,b) with RIs as side conditions;
  • R(a,b) → R(c,b) with R(c,a) in ABox and trans(R) as side conditions;
  • R(a,b) → R(c,b) with a=c in ABox as side condition;
  • b=a → R(c,b) with R(c,a) in ABox as side condition;
• Relation computations can be performed in parallel with TBox classification
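The first two relation rules above can be sketched as a fixpoint computation. This is a plain sequential closure under role inclusions and transitivity: it does not reproduce the side-condition scheme that restricts joins to original ABox assertions, it omits the two equality rules, and the function name and encodings are assumptions.

```python
from collections import deque

def close_relations(role_assertions, role_inclusions, transitive_roles):
    """Close role assertions under role inclusions and transitivity.

    role_assertions:  iterable of (R, a, b)  for R(a, b)
    role_inclusions:  iterable of (R, S)     for R ⊑ S
    transitive_roles: set of role names R with trans(R)
    """
    derived = set()
    todo = deque(role_assertions)
    while todo:
        r, a, b = todo.popleft()
        if (r, a, b) in derived:
            continue
        derived.add((r, a, b))
        # R(a,b) and R ⊑ S give S(a,b)
        for sub, sup in role_inclusions:
            if sub == r:
                todo.append((sup, a, b))
        if r in transitive_roles:
            for r2, c, d in list(derived):
                if r2 != r:
                    continue
                if d == a:   # R(c,a) and R(a,b) give R(c,b)
                    todo.append((r, c, b))
                if c == b:   # R(a,b) and R(b,d) give R(a,d)
                    todo.append((r, a, d))
    return derived
```

Given partOf(a,b), partOf(b,c), trans(partOf) and partOf ⊑ locatedIn, the closure contains partOf(a,c) and locatedIn(a,c). No concept axiom is consulted anywhere in the loop, which is why this stage can run alongside TBox classification.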

Evaluation
• Benchmark
  • VICODI ontology
  • NotGalen TBox + synthetic ABox generated by SyGENiA
• Environment
  • AWS EC2 cloud computing, 64-bit Linux, 7 GB RAM, each worker ≈ 2.5-3.0 GHz
[Table: off-the-shelf reasoners compared with PEL]

Evaluation
• Scalability evaluation
  • NotGalen TBox + synthetic ABox generated by SyGENiA
  • AWS EC2 cloud computing, 64-bit Linux, 70 GB RAM, each worker ≈ 3.5-4.2 GHz

Summary
• Parallel ABox reasoning can handle 1 million individuals and 9 million ABox assertions
  • Optimising the order of different inferences can reduce redundancies.
• Parallel ABox reasoning can still be improved
  • It does not scale linearly
  • Frequent RAM I/O vs. limited bandwidth is a potential cause.
  • Distributed reasoning as future work
• The language is still not expressive enough
  • Role chains and nominals in the TBox are hard to parallelise
  • Extension to other languages as future work
• Full materialisation is too memory-consuming
  • Target-oriented QA algorithm and optimisation as future work

Thank You! • Q & A
