An Efficient Method for Computing Alignment Diagnoses

Christian Meilicke, Heiner Stuckenschmidt
University of Mannheim, Lehrstuhl für Künstliche Intelligenz
{christian, heiner}@informatik.uni-mannheim.de

Computing a local optimal diagnosis
Problem Statement
• Automatically and manually (!) generated ontology alignments are often incoherent
  • See the OAEI-2008 results of the conference track
• => Incoherent alignments are a problem in many application scenarios*
  • Instance migration results in inconsistent ontologies
  • Query translation results in 'a priori' empty result sets
• Find a way to repair incoherent alignments automatically and very efficiently, because …
  • 'Agents on the web' require coherent alignments on the fly
  • Large ontologies require efficient algorithms
* C. Meilicke and H. Stuckenschmidt. Incoherence as a Basis for Measuring the Quality of Ontology Mappings. OM-08.

Outline
• Alignment Semantics
  • Incoherence of an alignment, MIPS alignments
• Alignment Diagnosis
  • Diagnosis, Minimal Hitting Set, Local Optimal Diagnosis
• Computing a Local Optimal Diagnosis (LOD)
  • Brute-Force LOD and Efficient LOD
• Experimental Results
  • Runtime, Quality of the Diagnosis

"Natural" Semantics
An alignment A and two ontologies O1 and O2
Correspondences:
<1#Person, 2#Person, =, 0.98>
<1#hasName, 2#name, =, 0.87>
<1#writtenBy, 2#docWrittenBy, =, 0.7>
<1#authorOf, 2#hasWritten, =, 0.56>
<1#firstAuthor, 2#Author, ⊑, 0.56>
Merged ontology O1 ∪A O2, with the correspondences read as axioms:
1#Person ≣ 2#Person
1#firstAuthor ⊑ 2#Author
…
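The reading of correspondences as axioms can be sketched in a few lines of Python. This is only an illustration: the `Correspondence` class and the `to_axiom` helper are hypothetical names that do not appear in the slides, and the axioms are rendered as plain strings rather than handed to a reasoner.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Correspondence:
    """One alignment entry <source, target, relation, confidence> (hypothetical type)."""
    source: str        # entity of O1, e.g. "1#Person"
    target: str        # entity of O2, e.g. "2#Person"
    relation: str      # "=" (equivalence) or "⊑" (subsumption)
    confidence: float


def to_axiom(c: Correspondence) -> str:
    """Render a correspondence as the axiom it contributes to the merged ontology."""
    symbol = {"=": "≣", "⊑": "⊑"}[c.relation]
    return f"{c.source} {symbol} {c.target}"


# The five correspondences listed on the slide.
alignment = [
    Correspondence("1#Person", "2#Person", "=", 0.98),
    Correspondence("1#hasName", "2#name", "=", 0.87),
    Correspondence("1#writtenBy", "2#docWrittenBy", "=", 0.7),
    Correspondence("1#authorOf", "2#hasWritten", "=", 0.56),
    Correspondence("1#firstAuthor", "2#Author", "⊑", 0.56),
]

axioms = [to_axiom(c) for c in alignment]
```

The merged ontology is then the union of the axioms of O1, the axioms of O2, and the strings in `axioms`.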

Incoherence of an Alignment
Definition: Incoherence of an Alignment
An alignment A between ontologies O1 and O2 is incoherent iff there exists a satisfiable concept i#C or property i#R in Oi, i ∈ {1,2}, that is unsatisfiable in O1 ∪A O2. (The satisfiability of a property i#R can be reduced to the satisfiability of ∃i#R.⊤.)
Definition: MIPS Alignment (minimal conflict set)
Let A be an incoherent alignment between ontologies O1 and O2. A subalignment M ⊆ A is a MIPS alignment (= minimal incoherence preserving subalignment) iff M is incoherent and there exists no M' ⊂ M such that M' is incoherent.
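The MIPS definition can be made concrete with a toy coherence test. In the sketch below the reasoner is replaced by a hypothetical oracle built from a hand-written list of conflict sets; `make_oracle`, `is_mips`, and `all_mips` are illustrative names, not part of the talk. The check relies on incoherence being monotone (a superset of an incoherent alignment stays incoherent), so it is enough to test the subsets obtained by dropping one correspondence.

```python
from itertools import combinations


def make_oracle(conflicts):
    """Toy stand-in for a reasoner: a subalignment is incoherent iff it
    contains one of the given conflict sets (an assumption of this sketch)."""
    conflict_sets = [frozenset(c) for c in conflicts]

    def is_incoherent(alignment):
        present = frozenset(alignment)
        return any(c <= present for c in conflict_sets)

    return is_incoherent


def is_mips(m, is_incoherent):
    """M is a MIPS iff M is incoherent and no proper subset is incoherent.
    By monotonicity it suffices to remove one correspondence at a time."""
    m = frozenset(m)
    if not is_incoherent(m):
        return False
    return all(not is_incoherent(m - {c}) for c in m)


def all_mips(alignment, is_incoherent):
    """Enumerate every MIPS by brute force (exponential, illustration only)."""
    items = sorted(alignment)
    found = []
    for size in range(1, len(items) + 1):
        for subset in combinations(items, size):
            if is_mips(subset, is_incoherent):
                found.append(frozenset(subset))
    return found


oracle = make_oracle([{1, 2}, {2, 3}])
mips = all_mips({1, 2, 3, 4}, oracle)
```

The exponential enumeration is exactly the cost that the algorithms later in the talk avoid.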

"Terminology"
[Figure legend: a correspondence; an alignment; an alignment with MIPS shown as subsets; an alignment as a sequence ordered by confidence, with MIPS depicted by red-dotted links]

Alignment Diagnosis
Definition: Alignment Diagnosis
Alignment ∆ ⊆ A is an alignment diagnosis for O1 and O2 iff A \ ∆ is coherent with respect to O1 and O2 and for each ∆' ⊂ ∆ the alignment A \ ∆' is incoherent with respect to O1 and O2.
Proposition: Alignment Diagnosis and Minimal Hitting Sets
Alignment ∆ ⊆ A is an alignment diagnosis for O1 and O2 iff ∆ is a minimal hitting set over all MIPS in A.
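The hitting-set side of the proposition is easy to check mechanically. The following is a minimal sketch with hypothetical helper names, assuming the MIPS of A are already given as sets of correspondence identifiers.

```python
def is_hitting_set(delta, mips):
    """True iff delta shares at least one correspondence with every MIPS."""
    delta = frozenset(delta)
    return all(delta & m for m in mips)


def is_minimal_hitting_set(delta, mips):
    """True iff delta hits every MIPS and no proper subset does.
    Hitting is monotone, so dropping one element at a time is sufficient."""
    delta = frozenset(delta)
    if not is_hitting_set(delta, mips):
        return False
    return all(not is_hitting_set(delta - {c}, mips) for c in delta)


# Toy input: two overlapping MIPS over correspondences 1, 2, 3.
mips = [frozenset({1, 2}), frozenset({2, 3})]

candidates = {
    "{2}": is_minimal_hitting_set({2}, mips),
    "{1, 3}": is_minimal_hitting_set({1, 3}, mips),
    "{1, 2}": is_minimal_hitting_set({1, 2}, mips),
    "{1}": is_minimal_hitting_set({1}, mips),
}
```

Here both {2} and {1, 3} are diagnoses, which shows that a diagnosis is in general not unique. This is the reason for singling out one of them via confidence values.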

Local Optimal Diagnosis (LOD)
• Definition: Accused correspondence
  • A correspondence c ∈ A is accused by A iff there exists a MIPS M in A with c ∈ M such that for all c' ≠ c in M it holds that
    • (1) conf(c') > conf(c) and
    • (2) c' is not accused by A.
• Definition: Local optimal diagnosis (LOD)
  • The set of all accused correspondences is referred to as the local optimal diagnosis (LOD).
[Figure annotations: correspondences ordered from high confidence to low confidence; one note marked "important!"]
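The recursive definition of "accused" can be evaluated directly when the MIPS are known. The sketch below is an illustration of the definition only, not the algorithm of the talk (which avoids computing MIPS). It assumes pairwise distinct confidence values, and `local_optimal_diagnosis` is a hypothetical name. Processing correspondences from high to low confidence guarantees that every c' with conf(c') > conf(c) has already been decided when c is examined.

```python
def local_optimal_diagnosis(mips, conf):
    """Return the set of accused correspondences.

    mips: iterable of sets of correspondence ids
    conf: dict mapping correspondence id -> confidence (assumed distinct)
    """
    mips = [frozenset(m) for m in mips]
    accused = set()
    # Highest confidence first, so condition (2) refers to settled decisions.
    for c in sorted(conf, key=conf.get, reverse=True):
        for m in mips:
            if c not in m:
                continue
            others = m - {c}
            # Conditions (1) and (2) of the definition for every c' in M.
            if all(conf[o] > conf[c] and o not in accused for o in others):
                accused.add(c)
                break
    return accused


conf = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.6}
mips = [{"a", "b"}, {"b", "c"}, {"c", "d"}]
lod = local_optimal_diagnosis(mips, conf)
```

In this toy input `c` is not accused by the MIPS {b, c} because `b` is already accused, which is exactly the role of condition (2).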

Algorithm 1
[Animation over correspondences 1 to 10, one coherence check per frame: "Coherent? YES!", "Coherent? NO!", "Coherent? Now it is!", "Coherent? YES!", "Coherent? NO!", "Coherent? Now it is!"]
… continue the same way

Algorithm 1: Result
• … and after a few more slides we would end up like this:
[Figure: correspondences 1 to 10 with the final diagnosis marked]
• Note:
  • Coherence is checked 10 times to construct a local optimal diagnosis, which is a minimal hitting set over all MIPS
  • We have not computed a single MIPS alignment!
First sketch: Meilicke, Völker, Stuckenschmidt. Learning Disjointness for Debugging Mappings between Lightweight Ontologies (EKAW-08)
Relation to belief revision discussed in: Qi, Ji, Haase. A Conflict-based Operator for Mapping Revision (ISWC-09)
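One way to read the animation is the following brute-force loop: walk through the correspondences in order of descending confidence, add each one to the part kept so far, and move it to the diagnosis if the result is incoherent. This is a sketch under that reading, not the authors' code. The reasoner is replaced by a hypothetical toy oracle that counts its calls, and the conflict sets and confidence values are made up for the example.

```python
def make_oracle(conflicts):
    """Toy coherence test (assumption): incoherent iff a conflict set is contained."""
    conflict_sets = [frozenset(c) for c in conflicts]
    calls = {"n": 0}

    def is_incoherent(alignment):
        calls["n"] += 1
        present = frozenset(alignment)
        return any(c <= present for c in conflict_sets)

    return is_incoherent, calls


def brute_force_lod(alignment, conf, is_incoherent):
    """One coherence check per correspondence; no MIPS is ever computed."""
    kept, diagnosis = [], []
    for c in sorted(alignment, key=conf.get, reverse=True):
        if is_incoherent(kept + [c]):
            diagnosis.append(c)   # c is accused and stays out
        else:
            kept.append(c)        # still coherent, keep c
    return diagnosis


# Ten correspondences, numbered by descending confidence as on the slides.
conf = {i: 1.0 - i / 100 for i in range(1, 11)}
oracle, calls = make_oracle([{1, 3}, {3, 5}, {2, 6}, {6, 8}])
lod = brute_force_lod(list(conf), conf, oracle)
```

With ten correspondences the oracle is called exactly ten times, matching the note on the slide. Correspondence 5 is kept because its only conflict partner, 3, has already been removed.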

"Pattern-based" reasoning
• Idea: Use an incomplete method for incoherence detection in A' ⊆ A
  • Classify O1 and O2 once, then check for each pair of correspondences in A' whether a certain pattern occurs
  • If the pattern occurs for some pair in an alignment A', then A' is incoherent
  • If no pattern occurs, A' can nevertheless be incoherent!
[Figure: pattern between Oi and Oj]
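A pairwise pattern check could look like the sketch below. The concrete pattern is an assumption chosen for illustration, since this transcript does not spell the patterns out: two equivalence correspondences <A, B, => and <C, D, => conflict if O1 entails that one of A, C subsumes the other while O2 entails that B and D are disjoint. The sets `subsumed_in_o1` and `disjoint_in_o2` stand for the results of classifying each ontology once; all names are hypothetical.

```python
from itertools import combinations


def pair_conflict(c1, c2, subsumed_in_o1, disjoint_in_o2):
    """Check one illustrative pattern for a pair of equivalence correspondences.

    c1, c2: (entity_in_O1, entity_in_O2) pairs
    subsumed_in_o1: set of (sub, super) pairs entailed by O1
    disjoint_in_o2: set of frozenset({x, y}) pairs entailed disjoint by O2
    """
    (a, b), (c, d) = c1, c2
    related = (a, c) in subsumed_in_o1 or (c, a) in subsumed_in_o1
    return related and frozenset((b, d)) in disjoint_in_o2


def pattern_incoherent(alignment, subsumed_in_o1, disjoint_in_o2):
    """Sound but incomplete: True proves incoherence, False proves nothing."""
    return any(
        pair_conflict(x, y, subsumed_in_o1, disjoint_in_o2)
        for x, y in combinations(alignment, 2)
    )


# Toy classification results (made up for the example).
subsumed_in_o1 = {("1#Reviewer", "1#Person")}
disjoint_in_o2 = {frozenset(("2#Reviewer", "2#Document"))}

bad = [("1#Reviewer", "2#Reviewer"), ("1#Person", "2#Document")]
fine = [("1#Reviewer", "2#Reviewer")]

bad_result = pattern_incoherent(bad, subsumed_in_o1, disjoint_in_o2)
fine_result = pattern_incoherent(fine, subsumed_in_o1, disjoint_in_o2)
```

No reasoning over the merged ontology is needed at check time: each test is a pair of set lookups against the precomputed classification.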

That doesn't work …
• Use the efficient coherence test instead of complete reasoning in the algorithm described above
  • Reasoning about A' ⊆ A then no longer requires reasoning in O1 ∪A' O2; it is replaced by iterating over all pairs in A'
• However: the resulting alignment might still be incoherent, and ∆ is not a LOD
  • Missing a single MIPS might result in a chain of incorrect follow-up decisions!
  • Thus, removing the missed MIPS afterwards does not work!
• How can we exploit the efficient method while still constructing a LOD?

Algorithm 2: Example
[Figure: correspondences 1 to 10 with conflicts. Legend: detectable by efficient method; only detectable by complete method; resolved due to removal of correspondence]

Algorithm 2: Example
• Run the brute-force (BF) algorithm with efficient reasoning. Still incoherent?
• Verification step: use binary search to detect the correspondence k such that A[0 … k-1] is coherent and A[0 … k] is incoherent
[Figure: correspondences 1 to 10 with k = 8; the part before k is the safe part (efficient reasoning did not fail up to k), the rest is the incorrect part (recompute!)]

Algorithm 2: Example
• Run the main algorithm again with efficient reasoning for A[k+1 … n], where ∆1-k ∪ A[k] for A[1 … k] is a fixed part of the resulting diagnosis.
• Still incoherent? If yes, we have k_new > k_old; repeat the same verification step.
[Figure: correspondences 1 to 10, split into A[1 … k] and A[k+1 … n]]

Algorithm 2: Example
• Final result is a LOD.
[Figure: correspondences 1 to 10 with the final diagnosis marked]
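The steps of the example can be put together as follows. This is a hedged sketch of one possible reading of the slides, not the authors' implementation: `fast_incoherent` stands for the incomplete pattern-based test, `full_incoherent` for complete reasoning, and both are hypothetical toy oracles here. The sketch assumes that the fast test is sound (it never reports incoherence wrongly), that incoherence is monotone, and that confidence values are distinct.

```python
def make_oracle(conflicts):
    """Toy coherence test (assumption): incoherent iff a conflict set is contained."""
    sets = [frozenset(c) for c in conflicts]
    return lambda alignment: any(c <= frozenset(alignment) for c in sets)


def efficient_lod(alignment, conf, fast_incoherent, full_incoherent):
    ordered = sorted(alignment, key=conf.get, reverse=True)
    fixed = set()   # verified part of the diagnosis
    start = 0       # decisions for ordered[:start] are verified
    while True:
        # Brute-force pass over the unverified suffix, using only the fast test.
        removed = set(fixed)
        kept = [c for c in ordered[:start] if c not in removed]
        for c in ordered[start:]:
            if fast_incoherent(kept + [c]):
                removed.add(c)
            else:
                kept.append(c)

        def prefix(k):
            # First k correspondences without the removed ones.
            return [c for c in ordered[:k] if c not in removed]

        # Verification with complete reasoning.
        if not full_incoherent(prefix(len(ordered))):
            return [c for c in ordered if c in removed]

        # Binary search for the smallest k with prefix(k + 1) incoherent.
        lo, hi = start, len(ordered) - 1
        while lo < hi:
            mid = (lo + hi) // 2
            if full_incoherent(prefix(mid + 1)):
                hi = mid
            else:
                lo = mid + 1
        k = lo
        # Decisions before k were correct; ordered[k] must be removed as well.
        fixed = {c for c in ordered[:k] if c in removed} | {ordered[k]}
        start = k + 1


conf = {i: 1.0 - i / 100 for i in range(1, 11)}
full = make_oracle([{1, 3}, {3, 5}, {2, 6}, {6, 8}])
fast = make_oracle([{1, 3}, {6, 8}])   # misses {3, 5} and {2, 6}
lod = efficient_lod(list(conf), conf, fast, full)
```

In the toy run the first pass wrongly keeps 6 and removes 8. The verification step locates 6 as the first undetected conflict, fixes it as part of the diagnosis, and the second pass then keeps 8. The loop terminates because `start` strictly increases in every round.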

Runtime Considerations (Theory)
• n = size of alignment A
• m = number of times the binary search is applied
• The more complete the pattern-based reasoning is, the fewer verification steps/iterations are necessary
• The runtime of the pattern-based reasoning itself hardly matters for the overall runtime!
• Runtime Comparison
  • Brute Force LOD: O(n)
  • Efficient LOD: O(log(n) * m)
• Do we have m << n?
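To get a feeling for the comparison, the two bounds can be turned into rough counts of complete coherence checks. The formula for the efficient variant is an assumption of this sketch: each of the m rounds is taken to cost one verification plus one binary search of at most ⌈log2 n⌉ steps, followed by one final verification. The function names are hypothetical.

```python
import math


def full_checks_brute_force(n: int) -> int:
    """Brute-force LOD: one complete coherence check per correspondence."""
    return n


def full_checks_efficient_upper_bound(n: int, m: int) -> int:
    """Assumed cost model: m binary searches of at most ceil(log2 n) checks,
    plus m + 1 verification checks (one per round and a final one)."""
    return (m + 1) + m * math.ceil(math.log2(n))


n, m = 100, 5
brute = full_checks_brute_force(n)
efficient = full_checks_efficient_upper_bound(n, m)
```

Under this cost model the efficient variant only pays off when m stays well below n / log2(n), which is why the question "m << n?" matters.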

Results: Runtime
• Based on experiments with the OAEI conference ontologies and submissions from 2007/08
  • Expressivity: SHIN(D), ELI(D), SIF(D), ALCIF(D)
  • Four different state-of-the-art matching systems
[Table with columns n and m; values not preserved in this transcript]
• Better results for benchmark datasets: 5 to 10 times faster

Results: Quality of Diagnosis
• Removing the LOD results in an alignment with increased precision and slightly decreased recall => slightly increased f-measure
• For alignments with low precision, the positive effects are very strong.
• In rare cases, an incorrect correspondence annotated with high confidence has negative effects

Summary
• Algorithm 1: Algorithm for computing a LOD
  • Without computing MIPS or MUPS!
• Algorithm 2: General approach for improving algorithms of type 1
  • Shown for the natural interpretation of correspondences as axioms and a specific type of incomplete reasoning
  • In principle applicable to any semantics for which we can find a similarly efficient reasoning approach!
• Good results for natural interpretation + pattern-based reasoning: between 2 and 10 times faster!

Thanks for your attention
Questions?

Back-Up Slides

Property Pattern Example
O2: ∃readPaper.⊤ ⊑ Reviewer, Reviewer ⊑ Person, Document ⊑ ¬Person
O1: ∃reviewOfPaper.⊤ ⊑ Review ⊑ Document
[Diagram: the correspondences readPaper ≣ reviewOfPaper and Document ≣ Document link the two ontologies; two links between the domains ∃readPaper.⊤, ∃reviewOfPaper.⊤ and Document are labelled "disjoint"]
