This presentation describes a grass-roots approach to ontology alignment for peers who share structured data but use different terminologies. Existing methods rely on heuristics such as name similarity, structure similarity, and instance co-occurrence, or on domain experts who produce precise, centralized alignments. In the grass-roots approach, end users produce cursory alignments that are collected and reused, which reduces the effort each user has to spend. The slides cover the properties of such alignments (approximate, inconsistent, intransitive), the challenge of reusing them, a four-step class alignment algorithm, and future work.
Grass-Roots Class Alignment
Baoshi Yan
Information Sciences Institute, University of Southern California
Motivation
• Sharing structured data among peers
• However, peers might use different terminology (ontology)
• Hence the need for ontology alignment
What is Alignment
• Correspondence between concepts
Alignment: State of the Art
• Heuristics-based:
  • Name similarity
  • Structure similarity
  • Instance
  • Constraints
  • Co-occurrence
• Domain Expert
• Centralized
• Precise Alignment
Our Approach
• Cursory alignment by end users
  • Easy to produce
• Combining different users' alignments
  • Reuse to reduce the effort of each user
• Grass-Roots Alignment
[Figure labels: Alignment Corpus, Peer-to-Peer Alignment]
Grass-roots Alignment Example: WebScripter tool
• When a user puts items from different sources into the same column, the user implies that they mean the same thing
• Inferred Alignment: iswc:Person = isi:Div2Member
• Inferred Alignment: iswc:phone = isi:phonenumber
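The column-based inference in this example can be sketched in a few lines of Python. This is only an illustration of the idea, not WebScripter's actual implementation or API: the function name `infer_alignments` and the list-of-columns input format are assumptions made for this sketch.

```python
from itertools import combinations

def infer_alignments(columns):
    """Infer term alignments from a user-built report (hypothetical helper).

    `columns` is a list of columns; each column is the list of ontology
    terms the user placed in it. Any two distinct terms that share a
    column are taken to mean the same thing.
    """
    aligned = set()
    for terms in columns:
        # sorted() gives each inferred pair a stable order
        for left, right in combinations(sorted(set(terms)), 2):
            aligned.add((left, right))
    return aligned

# The two columns from the slide's example
report = [
    ["iswc:Person", "isi:Div2Member"],
    ["iswc:phone", "isi:phonenumber"],
]
alignments = infer_alignments(report)
```

Each inferred pair is a cursory, user-level alignment, so it may be approximate; the later slides describe how such pairs are interpreted conservatively.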
Properties of Grass-Roots Alignment
• Might be:
  • Approximate
  • Inconsistent
  • Intransitive
[Figure: ontologies O1 to O4 with classes Graduate, GraduateStudent, PhDStudent, MSStudent, Doctoral Student, Master Student]
Challenge
• How to reuse approximate or inconsistent grass-roots alignments for alignment purposes
  • Approximation: conservative semantics of alignment
  • Inconsistency: evidence
Observations & Assumptions
• Users tend to pick the closest alignment candidate
[Figure: four panels (a) to (d), each relating classes A, B, C in ontologies O1 and O2]
Basic Idea:
• Class relationships specified in the ontology
  • Definite
• Class relationships indicated by previous alignments
  • Indefinite/ambiguous
• Inference to get more definite class relationships
• Use these class relationships for future alignment
Class Alignment Algorithm: Step 1
• Subclass Relationships Specified in the Ontology
Class Alignment Algorithm: Step 2
• Class Relationships Implied by Grass-roots Alignments: the Semantics of Grass-roots Alignments
[Figure: an alignment between classes A, B, C in ontologies O1 and O2, and the class relationships it implies, including NOT and OR facts]
The Semantics of Grass-roots Alignments (Cont)
[Figure: ontologies O1 and O2 with classes A, B, C, and an implied NOT fact involving C and B]
The Semantics of Grass-roots Alignments (Cont)
[Figure: ontologies O1 and O2 with classes A, B, C, D, and implied relationships between A and D and between B and C]
Class Alignment Algorithm: Step 2
• Class Relationships Implied by Alignments
Class Alignment Algorithm: Step 3
• Forward-chaining Inference
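The slide names forward chaining, but the transcript does not list the inference rules. As a minimal sketch, assume facts are pairs (X, Y) read as "X > Y" (X is a superclass of Y) and assume that transitivity is one of the rules. The function name `forward_chain` and the choice of rule are illustrative assumptions, not taken from the slides.

```python
def forward_chain(facts):
    """Apply one assumed rule, transitivity of '>', until a fixpoint.

    `facts` is an iterable of (x, y) pairs meaning "x > y".
    Returns the original facts plus everything derivable from them.
    """
    kb = set(facts)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(kb):
            for (c, d) in list(kb):
                # a > b and b > d  =>  a > d (skip reflexive results)
                if b == c and a != d and (a, d) not in kb:
                    kb.add((a, d))
                    changed = True
    return kb

# Illustrative input: two definite facts, one derived by the rule
kb = forward_chain({("Student", "Graduate"), ("Graduate", "MasterStudent")})
```

A fuller version would also carry the evidence for each derived fact, combined as described on the next slide.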
Dealing with Evidence
• If (f1, e1) AND (f2, e2) AND ... AND (fi, ei) => (f, e), then the evidence for f is e = e1*e2*...*ei.
• If the same fact is supported by evidence e1, e2, ..., ei, then e = e1+e2+...+ei.
• The same evidence does not count twice: e1 + e1 = e1 and e1 * e1 = e1.
• Quantifying evidence:
  • V(e): a numerical value in (0, 1)
  • V(e1+e2) = 1-(1-V(e1))*(1-V(e2))
  • V(e1*e2) = V(e1)*V(e2)
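One way to honor the rule that the same evidence never counts twice is to keep evidence symbolic until it is valued. The sketch below assumes a representation that the slides do not specify: evidence is a set of terms, and each term is a set of atomic evidence ids (for example, ids of individual user alignments). The names `atom`, `combine_and`, `combine_or`, and `value` are hypothetical.

```python
from itertools import product
from math import prod

def atom(name):
    """A single piece of evidence: one term containing one atomic id."""
    return frozenset({frozenset({name})})

def combine_and(e1, e2):
    """e1 * e2: join every term of e1 with every term of e2.
    Set union makes e * e == e for atomic evidence."""
    return frozenset(t1 | t2 for t1, t2 in product(e1, e2))

def combine_or(e1, e2):
    """e1 + e2: alternative support for the same fact.
    Set union makes e + e == e."""
    return e1 | e2

def value(e, atom_values):
    """V(e), using the slide's formulas: a product inside each term,
    and 1 - prod(1 - V(term)) across alternative terms."""
    term_values = [prod(atom_values[a] for a in term) for term in e]
    return 1.0 - prod(1.0 - v for v in term_values)

u1, u2 = atom("u1"), atom("u2")
values = {"u1": 0.5, "u2": 0.5}
v_or = value(combine_or(u1, u2), values)    # 1 - (1 - 0.5) * (1 - 0.5)
v_and = value(combine_and(u1, u2), values)  # 0.5 * 0.5
```

Because combination happens on sets, repeated use of the same atomic evidence collapses before any number is computed, so `value(combine_or(u1, u1), values)` equals `value(u1, values)`.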
Class Alignment Algorithm: Step 4: Class Alignment Using Facts KB
• Sup(A): the set of superclasses of A
• Sub(A): the set of subclasses of A
• Ind(A): all B such that
  • (A > B OR B > A) is in the KB
  • neither A > B nor B > A is in the KB
  • i.e., B and A are indistinguishable according to the facts KB
• Dealing with inconsistencies:
  • For each B in Sup(A), if there is a better-supported fact A > B, NOT(B > A), or B||A, remove B from Sup(A). Do the same for Sub(A).
Class Alignment Using Facts KB (cont)
• Examples:
  • Ind(MasterStudent) = {MSStudent}
  • Sup(MasterStudent) = {Graduate, Student, UnivStudent}
  • Sub(Graduate) = {MasterStudent, MSStudent, DoctoralStudent}
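The three sets can be computed directly from a facts KB. The sketch below assumes a KB layout that the slides do not prescribe: `gt` holds definite facts as pairs (X, Y) read as "X > Y" (X is a superclass of Y), and `either` holds the disjunctive facts (X > Y OR Y > X) as unordered pairs. The sample facts are chosen only so that the results match the examples above; inconsistency filtering is left out.

```python
def sup(a, gt):
    """Sup(a): every class recorded as a superclass of a."""
    return {x for (x, y) in gt if y == a}

def sub(a, gt):
    """Sub(a): every class recorded as a subclass of a."""
    return {y for (x, y) in gt if x == a}

def ind(a, gt, either):
    """Ind(a): classes b with (a > b OR b > a) in the KB while
    neither disjunct is known on its own."""
    result = set()
    for pair in either:
        if a in pair:
            for b in pair - {a}:
                if (a, b) not in gt and (b, a) not in gt:
                    result.add(b)
    return result

# Assumed sample KB (not taken from the slides)
gt = {
    ("Graduate", "MasterStudent"),
    ("Student", "MasterStudent"),
    ("UnivStudent", "MasterStudent"),
    ("Graduate", "MSStudent"),
    ("Graduate", "DoctoralStudent"),
}
either = {frozenset({"MasterStudent", "MSStudent"})}
```

With this KB, `ind("MasterStudent", gt, either)`, `sup("MasterStudent", gt)`, and `sub("Graduate", gt)` return the three sets listed in the examples.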
Class Alignment Using Facts KB (cont)
• Given A from O1, find the best alignment B in O2 in the following order:
  • O2 ∩ Ind(A)
  • O2 ∩ Sup(A)
    • If B, B1 ∈ O2 ∩ Sup(A), pick B if B1 > B
  • O2 ∩ Sub(A)
    • If B, B1 ∈ O2 ∩ Sub(A), pick B if B > B1
  • Everything else being equal, pick the better-supported candidate
• Otherwise there is no alignment candidate for A in O2.
Class Alignment Using Facts KB (cont)
• Example:
  • Ind(MasterStudent) = {MSStudent}
  • Sup(DoctoralStudent) = {Graduate, Student, UnivStudent}
  • Ind(Student) = {UnivStudent}
[Figure: ontologies O1 and O2 with classes UnivStudent, Student, Graduate, DoctoralStudent, MasterStudent, MSStudent]
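The selection order from Step 4 can be sketched as a single function. Several things here are assumptions rather than content of the slides: the KB layout (`gt` for definite "X > Y" facts with X the superclass, `either` for disjunctive facts), the `support` mapping from a candidate to the numeric support of the fact linking it to A, the membership of O2, and the fact UnivStudent > Graduate. Inconsistency filtering is omitted.

```python
def best_alignment(a, o2, gt, either, support=None):
    """Pick the best O2 class for `a`: Ind first, then the most specific
    superclass, then the most general subclass, else None."""
    support = support or {}

    def pick(candidates):
        # Ties are broken by support, then alphabetically for determinism
        return max(sorted(candidates), key=lambda b: support.get(b, 0.0))

    ind_c = {b for pair in either if a in pair for b in pair - {a}
             if (a, b) not in gt and (b, a) not in gt} & o2
    if ind_c:
        return pick(ind_c)

    sup_c = {x for (x, y) in gt if y == a} & o2
    if sup_c:
        # Drop B1 whenever some other candidate B satisfies B1 > B
        lowest = {b for b in sup_c if not any((b, o) in gt for o in sup_c)}
        return pick(lowest or sup_c)

    sub_c = {y for (x, y) in gt if x == a} & o2
    if sub_c:
        # Drop B1 whenever some other candidate B satisfies B > B1
        highest = {b for b in sub_c if not any((o, b) in gt for o in sub_c)}
        return pick(highest or sub_c)

    return None

# Assumed KB reproducing the slide's example sets
o2_classes = {"UnivStudent", "Graduate", "MSStudent"}
gt_facts = {
    ("UnivStudent", "Graduate"),
    ("Graduate", "DoctoralStudent"),
    ("Student", "DoctoralStudent"),
    ("UnivStudent", "DoctoralStudent"),
}
either_facts = {
    frozenset({"MasterStudent", "MSStudent"}),
    frozenset({"Student", "UnivStudent"}),
}
```

Under these assumptions, MasterStudent and Student are resolved through Ind, while DoctoralStudent falls through to its superclasses in O2, where Graduate is preferred over UnivStudent because it is the more specific of the two.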
Evaluation (qualitative analysis)
• In the ideal case:
  • Each previous alignment is the best possible
  • Then: guaranteed correctness in some cases
  • Sup(DoctoralStudent) = {UnivStudent, Graduate}
• In the not-so-ideal case:
  • Bad facts are likely filtered out
[Figure: ontologies O1 and O2 with classes UnivStudent, Student, Graduate, DoctoralStudent]
Evaluation
• 26 ontologies in the university student domain
• Compare the resulting facts KB against a reference KB
Related Work:
• Schema mediation, schema reconciliation, schema matching, semantic coordination, semantic mapping, and ontology mapping
• ONION, PROMPT, LSD, GLUE, Automatch, SemInt, CUPID, COMA, MGS-DCM, HSDM Mediator, MOBS…
• Name similarity, structure similarity, domain constraints, instance features, instance similarity, multi-strategy learning, statistical analysis, alignment reuse
• Little work on peer-to-peer alignment
Summary
• An alignment approach:
  • Ontology alignment carried out by end users in a peer-to-peer fashion
  • Peers are both alignment consumers and producers
• Future work:
  • Detailed experiments, theoretical analysis
  • Property alignment with class as context
Thank You!