This research presents an efficient method for diagnosing and repairing incoherent ontology alignments generated both automatically and manually. Incoherent alignments pose significant issues in various applications, such as inconsistent ontologies during instance migration and empty result sets in query translation. The proposed local optimal diagnosis method aims to efficiently address coherency by computing minimal hitting sets and local optimal diagnoses that ensure alignment integrity.
An Efficient Method for Computing Alignment Diagnoses
Christian Meilicke, Heiner Stuckenschmidt
University of Mannheim, Lehrstuhl für Künstliche Intelligenz
{christian, heiner}@informatik.uni-mannheim.de
Computing a local optimal diagnosis
Problem Statement
• Automatically and manually (!) generated ontology alignments are often incoherent
  • See the OAEI-2008 results of the conference track
• => Incoherent alignments are a problem in many application scenarios*
  • Instance migration results in inconsistent ontologies
  • Query translation results in 'a priori' empty result sets
• Goal: find a way to repair incoherent alignments automatically and very efficiently, because …
  • 'Agents on the web' require coherent alignments on the fly
  • Large ontologies require efficient algorithms
* C. Meilicke and H. Stuckenschmidt. Incoherence as a Basis for Measuring the Quality of Ontology Mappings. OM-08.
Outline
• Alignment Semantics
  • Incoherence of an alignment, MIPS alignments
• Alignment Diagnosis
  • Diagnosis, Minimal Hitting Set, Local Optimal Diagnosis
• Computing a Local Optimal Diagnosis (LOD)
  • Brute-Force LOD and Efficient LOD
• Experimental Results
  • Runtime, Quality of the Diagnosis
"Natural" Semantics
An alignment A and two ontologies O1 and O2 are combined into the merged ontology O1 ∪A O2.
Correspondences:
<1#Person, 2#Person, =, 0.98>
<1#hasName, 2#name, =, 0.87>
<1#writtenBy, 2#docWrittenBy, =, 0.7>
<1#authorOf, 2#hasWritten, =, 0.56>
<1#firstAuthor, 2#Author, ⊑, 0.56>
Axioms:
1#Person ≡ 2#Person
…
1#firstAuthor ⊑ 2#Author
Incoherence of an Alignment
Definition: Incoherence of an Alignment
An alignment A between ontologies O1 and O2 is incoherent iff there exists a concept i#C or property i#R, i ∈ {1,2}, that is satisfiable in Oi but unsatisfiable in O1 ∪A O2. (The satisfiability of a property i#R can be reduced to the satisfiability of ∃i#R.⊤.)
Definition: MIPS Alignment (minimal conflict set)
Let A be an incoherent alignment between ontologies O1 and O2. A subalignment M ⊆ A is a MIPS alignment (minimal incoherence preserving subalignment) iff M is incoherent and there exists no M' ⊂ M such that M' is incoherent.
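The MIPS definition can be checked mechanically once a coherence test is available. The Python sketch below assumes a hypothetical oracle `is_coherent`; here it is a toy stand-in that knows a fixed list of conflict sets, whereas a real implementation would call a description logic reasoner on O1 ∪A O2. Since adding correspondences (axioms) can never make an unsatisfiable concept satisfiable again, it is enough to test the subsets obtained by dropping one correspondence.

```python
from typing import Callable, FrozenSet, Iterable, List, Set


def make_toy_oracle(conflicts: Iterable[Set[str]]) -> Callable[[Iterable[str]], bool]:
    """Build a toy coherence test (hypothetical stand-in for a reasoner).

    An alignment counts as incoherent iff it contains one of the given
    conflict sets completely.
    """
    frozen: List[FrozenSet[str]] = [frozenset(c) for c in conflicts]

    def is_coherent(alignment: Iterable[str]) -> bool:
        present = frozenset(alignment)
        return not any(conflict <= present for conflict in frozen)

    return is_coherent


def is_mips(m: Iterable[str], is_coherent: Callable[[Iterable[str]], bool]) -> bool:
    """Return True iff m is incoherent and every proper subset is coherent.

    By monotonicity of incoherence, checking m minus one element suffices.
    """
    m = frozenset(m)
    if is_coherent(m):
        return False
    return all(is_coherent(m - {c}) for c in m)


# Toy data: correspondence names are purely illustrative.
oracle = make_toy_oracle([{"a", "b"}, {"b", "c", "d"}])
```

In this toy setting `{"a", "b"}` is a MIPS, while `{"a", "b", "c"}` is incoherent but not minimal.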
"Terminology"
[Figure legend: alignment; correspondence; alignment with MIPS shown as subsets; alignment as a sequence ordered by confidence, with MIPS depicted by red-dotted links]
Alignment Diagnosis
Definition: Alignment Diagnosis
An alignment ∆ ⊆ A is an alignment diagnosis for O1 and O2 iff A \ ∆ is coherent with respect to O1 and O2 and, for each ∆' ⊂ ∆, the alignment A \ ∆' is incoherent with respect to O1 and O2.
Proposition: Alignment Diagnosis and Minimal Hitting Sets
An alignment ∆ ⊆ A is an alignment diagnosis for O1 and O2 iff ∆ is a minimal hitting set over all MIPS in A.
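The proposition reduces the diagnosis test to a purely combinatorial check. As a minimal sketch, assuming the MIPS are already known and given as plain sets (the function names and the toy data are illustrative, not from the talk):

```python
from typing import FrozenSet, Hashable, Iterable, List


def is_hitting_set(delta: FrozenSet[Hashable], mips_list: List[FrozenSet[Hashable]]) -> bool:
    """delta hits every MIPS iff it shares at least one element with each."""
    return all(delta & m for m in mips_list)


def is_minimal_hitting_set(delta: Iterable[Hashable],
                           mips_list: Iterable[Iterable[Hashable]]) -> bool:
    """Check the hitting-set property and that no single element is redundant.

    Any subset of a non-hitting set is also non-hitting, so testing the
    subsets with one element removed is sufficient for minimality.
    """
    delta = frozenset(delta)
    frozen = [frozenset(m) for m in mips_list]
    if not is_hitting_set(delta, frozen):
        return False
    return all(not is_hitting_set(delta - {c}, frozen) for c in delta)


# Toy MIPS over correspondences numbered 1..5 (illustrative only).
toy_mips = [{1, 2}, {2, 3}, {4, 5}]
```

Here `{2, 4}` is a diagnosis, `{1, 2, 4}` is not (1 is redundant), and `{2}` is not (it misses `{4, 5}`).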
Local Optimal Diagnosis (LOD)
• Definition: Accused correspondence
  • A correspondence c ∈ A is accused by A iff there exists a MIPS M in A with c ∈ M such that for all c' ≠ c in M it holds that
  • (1) conf(c') > conf(c) and
  • (2) c' is not accused by A.
• Definition: Local optimal diagnosis (LOD)
  • The set of all accused correspondences is referred to as the local optimal diagnosis (LOD).
[Figure: correspondences ordered from high confidence to low confidence; callout "important!"]
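The recursive definition of "accused" can be evaluated directly by visiting correspondences from high to low confidence, because condition (2) only refers to correspondences with higher confidence. The sketch below is an illustration under two assumptions that are not part of the slides: all MIPS are given explicitly, and confidence values are pairwise distinct.

```python
from typing import Dict, Hashable, Iterable, List, Set


def lod_from_mips(conf: Dict[Hashable, float],
                  mips_list: Iterable[Iterable[Hashable]]) -> Set[Hashable]:
    """Return the set of accused correspondences.

    conf maps each correspondence to its confidence (assumed distinct);
    mips_list contains every MIPS of the alignment.
    """
    frozen: List[frozenset] = [frozenset(m) for m in mips_list]
    accused: Set[Hashable] = set()
    # Descending confidence: when c is visited, the status of every
    # correspondence with higher confidence is already decided.
    for c in sorted(conf, key=conf.get, reverse=True):
        for m in frozen:
            if c not in m:
                continue
            others = m - {c}
            if all(conf[o] > conf[c] and o not in accused for o in others):
                accused.add(c)
                break
    return accused


# Toy data (illustrative): a chain of three overlapping MIPS.
toy_conf = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.6}
toy_mips = [{"a", "b"}, {"b", "c"}, {"c", "d"}]
toy_lod = lod_from_mips(toy_conf, toy_mips)
```

In the toy chain, `b` is accused through `{a, b}`; `c` is not accused through `{b, c}` because `b` already is, which illustrates why condition (2) matters; `d` is then accused through `{c, d}`.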
Algorithm 1
[Animation over correspondences 1 to 10, ordered by confidence. Successive frames ask "Coherent?" with the answers: YES, YES, NO, "Now it is!", YES, YES, NO, "Now it is!", … continue the same way.]
Algorithm 1: Result
• … and after a few more slides we would end up like this: [Figure: correspondences 1 to 10]
• Note:
  • Coherence is checked 10 times to construct a local optimal diagnosis, which is a minimal hitting set over all MIPS
  • We have not computed a single MIPS alignment!
First sketch: Meilicke, Völker, Stuckenschmidt. Learning Disjointness for Debugging Mappings between Lightweight Ontologies (EKAW-08)
The relation to belief revision is the focus of: Qi, Ji, Haase. A Conflict-based Operator for Mapping Revision (ISWC-09)
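The walk-through of Algorithm 1 can be sketched in a few lines of Python. This is an illustration rather than the authors' implementation: `is_coherent` is a hypothetical oracle standing in for complete reasoning on the merged ontology, and the toy conflict sets are invented for the example. The alignment is assumed to be sorted by descending confidence.

```python
from typing import Callable, Hashable, List, Sequence


def brute_force_lod(alignment_sorted: Sequence[Hashable],
                    is_coherent: Callable[[List[Hashable]], bool]) -> List[Hashable]:
    """Add correspondences one by one; drop each one that breaks coherence."""
    kept: List[Hashable] = []
    diagnosis: List[Hashable] = []
    for c in alignment_sorted:
        if is_coherent(kept + [c]):
            kept.append(c)
        else:
            diagnosis.append(c)  # c is accused and removed
    return diagnosis


class CountingToyOracle:
    """Toy coherence test that also counts how often it is called."""

    def __init__(self, conflicts):
        self.conflicts = [frozenset(c) for c in conflicts]
        self.calls = 0

    def __call__(self, alignment) -> bool:
        self.calls += 1
        present = frozenset(alignment)
        return not any(conflict <= present for conflict in self.conflicts)


# Hypothetical conflicts between correspondences 1..10.
toy_oracle = CountingToyOracle([{2, 3}, {5, 6}, {3, 7}])
toy_diagnosis = brute_force_lod(list(range(1, 11)), toy_oracle)
```

For an alignment of size n the loop issues exactly n coherence checks and never materialises a MIPS, which matches the note on the slide.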
"Pattern-based" reasoning
• Idea: Use an incomplete method for incoherence detection in A' ⊆ A
• Classify O1 and O2 once, then check for each pair of correspondences in A' whether a certain pattern occurs
  • If the pattern occurs for some pair in an alignment A', then A' is incoherent
  • If no pattern occurs, A' can nevertheless be incoherent!
[Figure: pattern between Oi and Oj]
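A pairwise pattern test of this kind might look as follows. The concrete pattern is an assumption chosen for illustration: two equivalence correspondences (A, A') and (B, B') clash if A ⊑ B holds in one ontology while A' and B' are disjoint in the other. The lookup tables stand in for the results of classifying O1 and O2 once; all entity names are hypothetical.

```python
from itertools import combinations
from typing import Callable, Iterable, Tuple

Correspondence = Tuple[str, str]  # (entity in O1, entity in O2), read as equivalence

# Precomputed classification results (hypothetical toy content).
SUBSUMED_IN_O1 = {("1#Reviewer", "1#Person")}             # 1#Reviewer ⊑ 1#Person
DISJOINT_IN_O2 = {frozenset({"2#Review", "2#Person"})}    # 2#Review ⊑ ¬2#Person


def subsumption_disjointness_pattern(c1: Correspondence, c2: Correspondence) -> bool:
    """True if c1 = (A, A') and c2 = (B, B') with A ⊑ B in O1 and A', B' disjoint in O2."""
    a, a_prime = c1
    b, b_prime = c2
    return (a, b) in SUBSUMED_IN_O1 and frozenset({a_prime, b_prime}) in DISJOINT_IN_O2


def pattern_incoherent(alignment: Iterable[Correspondence],
                       pattern: Callable[[Correspondence, Correspondence], bool]) -> bool:
    """Sound but incomplete test: True proves incoherence, False proves nothing."""
    return any(pattern(c1, c2) or pattern(c2, c1)
               for c1, c2 in combinations(list(alignment), 2))


toy_alignment = [("1#Reviewer", "2#Review"), ("1#Person", "2#Person")]
```

The test only performs table lookups per pair, so no reasoning on the merged ontology is needed; the price is that a `False` result leaves the question of coherence open.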
That doesn't work …
• Use the efficient coherence test instead of complete reasoning in the algorithm described above
• Reasoning about A' ⊆ A then no longer requires reasoning in O1 ∪A' O2; it is replaced by iterating over all pairs in A'
• However: the resulting alignment might still be incoherent, and ∆ is not a LOD
  • Missing one MIPS might result in a chain of incorrect follow-up decisions!
  • Thus, removing the missed MIPS afterwards does not work!
• How can we exploit the efficient method while still constructing a LOD?
Algorithm 2: Example
[Figure: correspondences 1 to 10 with conflicts. Legend: detectable by efficient method; only detectable by complete method; resolved due to removal of correspondence]
Algorithm 2: Example
• Run the brute-force (BF) algorithm with efficient reasoning. Still incoherent?
• Verification step: use binary search to detect the correspondence k such that A[0 … k-1] is coherent and A[0 … k] is incoherent
[Figure: correspondences 1 to 10 with k=8. The part up to k is safe, since efficient reasoning did not fail up to k; the part after k is incorrect and must be recomputed.]
Algorithm 2: Example
• Run the main algorithm again with efficient reasoning for A[k+1 … n], where ∆1-k ∪ A[k], the diagnosis for A[1 … k], is a fixed part of the resulting diagnosis.
• Still incoherent? If yes, we have k_new > k_old; repeat the same verification step.
[Figure: correspondences 1 to 10, split into A[1 … k] and A[k+1 … n]]
Algorithm 2: Example
• Final result is a LOD.
[Figure: correspondences 1 to 10 showing the final result]
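The steps of the Algorithm 2 example can be put together as the following Python sketch. It is a reading of the slides, not the authors' code: `is_coherent` is a hypothetical complete coherence oracle, `pattern_conflict` a hypothetical sound but incomplete pairwise test, correspondences are assumed to be unique and sorted by descending confidence, and the empty alignment is assumed to be coherent.

```python
from typing import Callable, Hashable, List, Sequence


def efficient_lod(alignment_sorted: Sequence[Hashable],
                  is_coherent: Callable[[List[Hashable]], bool],
                  pattern_conflict: Callable[[Hashable, Hashable], bool]) -> List[Hashable]:
    verified_kept: List[Hashable] = []   # kept part confirmed by complete reasoning
    diagnosis: List[Hashable] = []       # fixed part of the resulting diagnosis
    rest = list(alignment_sorted)        # part that still has to be (re)computed
    while True:
        # Brute-force pass that uses only the efficient pairwise test.
        kept, removed = list(verified_kept), []
        for c in rest:
            if any(pattern_conflict(k, c) for k in kept):
                removed.append(c)
            else:
                kept.append(c)
        # Verification with complete reasoning.
        if is_coherent(kept):
            return diagnosis + removed
        # Binary search: kept[:lo] is coherent, kept[:hi] is incoherent.
        lo, hi = len(verified_kept), len(kept)
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if is_coherent(kept[:mid]):
                lo = mid
            else:
                hi = mid
        culprit = kept[lo]               # conflict missed by the efficient test
        cut = rest.index(culprit)
        safe_kept = set(kept[len(verified_kept):lo])
        # Decisions before the culprit are safe; everything after is recomputed.
        diagnosis += [c for c in rest[:cut] if c not in safe_kept] + [culprit]
        verified_kept = kept[:lo]
        rest = rest[cut + 1:]


class CountingToyOracle:
    """Toy complete coherence test that counts its calls."""

    def __init__(self, conflicts):
        self.conflicts = [frozenset(c) for c in conflicts]
        self.calls = 0

    def __call__(self, alignment) -> bool:
        self.calls += 1
        present = frozenset(alignment)
        return not any(conflict <= present for conflict in self.conflicts)


# Hypothetical toy setting: the conflict {4, 8} is invisible to the pattern test.
complete_oracle = CountingToyOracle([{2, 3}, {4, 8}, {8, 9}])
PATTERN_PAIRS = {frozenset({2, 3}), frozenset({8, 9})}


def toy_pattern_conflict(x, y) -> bool:
    return frozenset({x, y}) in PATTERN_PAIRS


toy_result = efficient_lod(list(range(1, 11)), complete_oracle, toy_pattern_conflict)
```

In the toy run the first pass wrongly keeps 8 and therefore wrongly removes 9, an instance of the incorrect follow-up decisions mentioned earlier. The binary search locates 8, the part after it is recomputed, and 9 is kept in the final result.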
Runtime Considerations (Theory)
• n = size of alignment A
• m = number of times the binary search is applied
• The more complete the pattern-based reasoning is, the fewer verification steps/iterations are necessary
• The runtime of the pattern-based reasoning itself hardly matters for the overall runtime!
• Runtime comparison (counted in complete coherence checks)
  • Brute-force LOD: O(n)
  • Efficient LOD: O(log(n) * m)
• Do we have m << n?
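To get a feeling for when m << n pays off, the two bounds can be compared numerically. The helper below is a rough estimate under an assumption that goes beyond the slide: each of the m repair rounds costs one verification plus at most ⌈log2 n⌉ binary-search checks, and one final verification confirms coherence.

```python
import math


def brute_force_checks(n: int) -> int:
    """Brute-force LOD: one complete coherence check per correspondence."""
    return n


def efficient_checks_upper_bound(n: int, m: int) -> int:
    """Assumed bound: (m + 1) verifications plus m binary searches."""
    steps_per_search = math.ceil(math.log2(n)) if n > 1 else 0
    return (m + 1) + m * steps_per_search


# Illustrative comparison for a few (n, m) pairs; the values are not measurements.
comparison = {
    (n, m): (brute_force_checks(n), efficient_checks_upper_bound(n, m))
    for n, m in [(10, 1), (100, 3), (1000, 100)]
}
```

With n = 100 and m = 3 the estimate is 25 complete checks instead of 100, whereas with n = 1000 and m = 100 it exceeds the brute-force count, so the gain depends on the pattern test catching most conflicts.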
Results: Runtime
• Based on experiments with the OAEI conference ontologies and submissions from 2007/08
  • Expressivity: SHIN(D), ELI(D), SIF(D), ALCIF(D)
  • Four different state-of-the-art matching systems
[Table with columns n and m not preserved]
• Better results for the benchmark datasets: 5 to 10 times faster
Results: Quality of Diagnosis
• Removing the LOD results in an alignment with increased precision and slightly decreased recall => slightly increased f-measure
• For alignments with low precision, the positive effects are very strong.
• In rare cases, an incorrect correspondence annotated with high confidence has negative effects
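The quality measures mentioned here are the standard precision, recall, and f-measure of an alignment against a reference alignment. A small sketch with invented toy data (the correspondences and the diagnosis are hypothetical, not taken from the experiments):

```python
from typing import Hashable, Iterable, Tuple


def precision_recall_f(alignment: Iterable[Hashable],
                       reference: Iterable[Hashable]) -> Tuple[float, float, float]:
    """Compute precision, recall, and f-measure against a reference alignment."""
    alignment, reference = set(alignment), set(reference)
    correct = len(alignment & reference)
    precision = correct / len(alignment) if alignment else 0.0
    recall = correct / len(reference) if reference else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall > 0 else 0.0)
    return precision, recall, f_measure


# Toy data: "c" and "d" are incorrect; the toy diagnosis removes "d".
toy_reference = {"a", "b", "e"}
toy_alignment = {"a", "b", "c", "d"}
toy_diagnosis = {"d"}

before = precision_recall_f(toy_alignment, toy_reference)
after = precision_recall_f(toy_alignment - toy_diagnosis, toy_reference)
```

In this toy case only an incorrect correspondence is removed, so precision and f-measure rise while recall stays the same; recall drops as soon as a diagnosis removes a correct correspondence.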
Summary
• Algorithm 1: Algorithm for computing a LOD
  • Without computing MIPS or MUPS!
• Algorithm 2: General approach for improving algorithms of type 1
  • Shown for the natural interpretation of correspondences as axioms and a specific type of incomplete reasoning
  • In principle applicable to any semantics for which we can find a similar efficient reasoning approach!
• Good results for natural interpretation + pattern-based reasoning: between 2 and 10 times faster!
Thanks for your attention. Questions?
Back-Up Slides
Property Pattern Example
O2: ∃readPaper.⊤ ⊑ Reviewer, Reviewer ⊑ Person, Document ⊑ ¬Person
O1: ∃reviewOfPaper.⊤ ⊑ Review ⊑ Document
Correspondences: readPaper ≡ reviewOfPaper, Document ≡ Document
[Figure: ∃reviewOfPaper.⊤ and ∃readPaper.⊤ are marked as disjoint]