
Multi-Concept Alignment and Evaluation



Presentation Transcript


  1. Multi-Concept Alignment and Evaluation Shenghui Wang, Antoine Isaac, Lourens van der Meij, Stefan Schlobach Ontology Matching Workshop Oct. 11th, 2007

  2. Multi-Concept Alignment and Evaluation Introduction: Multi-Concept Alignment • Mappings involving combinations of concepts • o1:FruitsAndVegetables → (o2:Fruits OR o2:Vegetables) • Also referred to as multiple or complex alignment • Problem: only a few matching tools consider it • Cf. [Euzenat & Shvaiko]
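
Concretely, such a mapping can be represented as a rule from a set of source concepts to a set of target concepts. A minimal Python sketch, with hypothetical concept identifiers:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MultiConceptRule:
    """A mapping from a combination of source concepts to a combination
    of target concepts."""
    source: frozenset  # concept identifiers in ontology o1
    target: frozenset  # concept identifiers in ontology o2

# The slide's example: o1:FruitsAndVegetables -> (o2:Fruits OR o2:Vegetables)
rule = MultiConceptRule(
    source=frozenset({"o1:FruitsAndVegetables"}),
    target=frozenset({"o2:Fruits", "o2:Vegetables"}),
)
```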

  3. Multi-Concept Alignment and Evaluation Why is MCA a Difficult Problem? • Much larger search space: |O1| × |O2| grows to 2^|O1| × 2^|O2| (e.g. two vocabularies of 1,000 concepts each yield 10^6 candidate concept pairs, but 2^1000 × 2^1000 candidate pairs of concept sets) • How to measure similarity between sets of concepts? • Based on which information and strategies? “Fruits and vegetables” vs. “Fruits” and “Vegetables” together • Formal frameworks for MCA? • Representation primitives • owl:intersectionOf? skosm:AND? • Semantics: A skos:broader (skosm:AND B C) ⇔ A skos:broader B & A skos:broader C?

  4. Multi-Concept Alignment and Evaluation Agenda • The multi-concept alignment problem • The Library case and the need for MCA • Generating MCAs for the Library case • Evaluating MCAs in the Library case • Conclusion

  5. Multi-Concept Alignment and Evaluation Yet MCA is needed in real-life problems • KB (Koninklijke Bibliotheek) collections (cf. OAEI slides) • Scenario: re-annotation of GTT-indexed books with Brinkman concepts

  6. Multi-Concept Alignment and Evaluation Yet MCA is needed in real-life problems • Books can be indexed by several concepts • With post-coordination, co-occurrence matters: {G1=“History”, G2=“the Netherlands”} in GTT → a book about Dutch history • The granularity of the two vocabularies differs → {B1=“Netherlands; History”} • The alignment should associate combinations of concepts

  7. Multi-Concept Alignment and Evaluation Agenda • The multi-concept alignment problem • The Library case and the need for MCA • Generating MCAs for the Library case • Evaluating MCAs in the Library case • Conclusion

  8. Multi-Concept Alignment and Evaluation MCA for Annotation Translation: Approach • Produce similarity measures between individual concepts • Sim(A,B) = X • Group concepts based on their similarity • {G1,B1,G2,G3,B2} • Create conversion rules • {G1,G2,G3} → {B1,B2} • Extract a deployable alignment
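
A minimal sketch of the grouping and rule-creation steps, assuming a precomputed symmetric similarity table `sim` (dict-of-dicts) and hypothetical `gtt:`/`brinkman:` prefixes; the greedy grouping is only a stand-in for the ranking and clustering methods described on the later slides:

```python
def group_by_similarity(concepts, sim, threshold=0.5):
    """Greedily put a concept into the first group containing a
    sufficiently similar member; otherwise start a new group."""
    groups = []
    for c in concepts:
        for g in groups:
            if any(sim[c][d] >= threshold for d in g):
                g.add(c)
                break
        else:
            groups.append({c})
    return groups

def derive_rules(concepts, sim, threshold=0.5):
    """Split each mixed group, e.g. {G1,B1,G2,G3,B2}, into a GTT
    antecedent and a Brinkman consequent: {G1,G2,G3} -> {B1,B2}."""
    rules = []
    for group in group_by_similarity(concepts, sim, threshold):
        gtt = frozenset(c for c in group if c.startswith("gtt:"))
        brinkman = frozenset(c for c in group if c.startswith("brinkman:"))
        if gtt and brinkman:
            rules.append((gtt, brinkman))
    return rules
```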

  9. Multi-Concept Alignment and Evaluation MCA Creation: Similarity Measures • The KB scenario has dually indexed books • Brinkman and GTT concepts co-occur • Instance-based alignment techniques can be used • Between concepts from the same vocabulary, similarity mirrors possible combinations!

  10. Multi-Concept Alignment and Evaluation MCA Creation: 2 Similarity Measures • Jaccard overlap measure applied to concept extensions • Latent Semantic Analysis • Computation of a similarity matrix • Filters noise due to insufficient data • Similarity between concepts across vocabularies and inside vocabularies
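
A sketch of the two measures, assuming a 0/1 concept-by-book occurrence matrix; the latent rank k and the cosine normalisation are illustrative assumptions, not the authors' settings:

```python
import numpy as np

def jaccard(ext_a, ext_b):
    """Jaccard overlap of two concept extensions (sets of book ids)."""
    a, b = set(ext_a), set(ext_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def lsa_similarity_matrix(occurrence, k=50):
    """Cosine similarities between concepts in a rank-k latent space.
    `occurrence` is a (num_concepts x num_books) 0/1 numpy array."""
    u, s, _ = np.linalg.svd(occurrence, full_matrices=False)
    latent = u[:, :k] * s[:k]                      # concept vectors
    norms = np.linalg.norm(latent, axis=1, keepdims=True)
    latent /= np.clip(norms, 1e-12, None)          # avoid divide-by-zero
    return latent @ latent.T                       # concept-concept matrix
```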

  11. Multi-Concept Alignment and Evaluation MCA Creation: 2 Concept Aggregation Methods • Simple ranking • For a concept, take the top k most similar concepts • Gather the GTT concepts and the Brinkman ones • Clustering • Partition concepts into similarity-based clusters • Gather concepts • Global approach: the most relevant combinations should be selected
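
The ranking method reduces to a top-k selection per concept; a sketch, again over a dict-of-dicts similarity table `sim`:

```python
def top_k_partners(sim, concept, k=3):
    """Simple ranking: the k concepts most similar to `concept`."""
    candidates = [(other, s) for other, s in sim[concept].items()
                  if other != concept]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return [other for other, _ in candidates[:k]]
```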

  12. Multi-Concept Alignment and Evaluation Generated Rules • Clustering generated far fewer rules • But with more concepts per rule

  13. Multi-Concept Alignment and Evaluation Agenda • The multi-concept alignment problem • The Library case and the need for MCA • Generating MCAs for the Library case • Evaluating MCAs in the Library case • Conclusion

  14. Multi-Concept Alignment and Evaluation Evaluation Method: Data Sets • Training and evaluation sets built from dually-indexed books • 2/3 training, 1/3 testing • Two training sets (samples) • Random • Rich: books that have at least 8 annotations (over both thesauri)
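
A sketch of how such samples could be drawn, with the 2/3–1/3 split and the 8-annotation threshold from the slide; the function and parameter names are hypothetical:

```python
import random

def make_samples(books, annotation_count, rich_threshold=8, seed=0):
    """2/3 training, 1/3 testing; the 'rich' sample keeps only training
    books with at least `rich_threshold` annotations over both thesauri."""
    books = list(books)
    random.Random(seed).shuffle(books)
    cut = (2 * len(books)) // 3
    train, test = books[:cut], books[cut:]
    rich = [b for b in train if annotation_count[b] >= rich_threshold]
    return train, rich, test
```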

  15. Multi-Concept Alignment and Evaluation Evaluation Method: Applying Rules • Rules Gr1→Br1, Gr2→Br2, Gr3→Br3 are matched against a book’s GTT annotation Gt • Several configurations for firing rules: • 1. Gt = Gr • 2. Gt ⊇ Gr • 3. Gt ⊆ Gr • 4. ALL
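
A sketch of the four firing configurations; reading configurations 2 and 3 as superset and subset tests, and ALL as any overlap, is an assumption based on the slide's notation:

```python
def rule_fires(gt, gr, strategy):
    """Decide whether rule antecedent `gr` fires on book annotation `gt`."""
    gt, gr = set(gt), set(gr)
    if strategy == 1:        # exact match: Gt = Gr
        return gt == gr
    if strategy == 2:        # assumed: annotation contains antecedent (Gt ⊇ Gr)
        return gr <= gt
    if strategy == 3:        # assumed: annotation contained in antecedent (Gt ⊆ Gr)
        return gt <= gr
    if strategy == 4:        # ALL: assumed any overlap between Gt and Gr
        return bool(gt & gr)
    raise ValueError(f"unknown strategy: {strategy}")
```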

  16. Multi-Concept Alignment and Evaluation Evaluation Measures • Precision and recall for matched books • Books that were given at least one good Brinkman annotation • Pb, Rb • Precision and recall for annotation translation • Averaged over books
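
A sketch of the measures; the exact definitions of Pb and Rb are inferred from the bullets, so treat this as illustrative:

```python
def annotation_scores(translated, gold):
    """Per-book precision/recall of translated Brinkman annotations."""
    tp = len(set(translated) & set(gold))
    precision = tp / len(translated) if translated else 0.0
    recall = tp / len(gold) if gold else 0.0
    return precision, recall

def evaluate(books):
    """`books` is a list of (translated, gold) annotation-set pairs.
    Returns the matched-book ratio (at least one good Brinkman concept)
    and the annotation precision/recall averaged over books."""
    matched, p_sum, r_sum = 0, 0.0, 0.0
    for translated, gold in books:
        p, r = annotation_scores(translated, gold)
        matched += bool(set(translated) & set(gold))
        p_sum += p
        r_sum += r
    n = len(books) or 1
    return matched / n, p_sum / n, r_sum / n
```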

  17. Multi-Concept Alignment and Evaluation Results: for ALL Strategy

  18. Multi-Concept Alignment and Evaluation Results: Rich vs. Random Training Set • Rich does not improve the results much • Bias towards richly annotated books • Jaccard performance goes down • LSA does better • Statistical corrections allow simple grouping techniques to cope with data complexity

  19. Multi-Concept Alignment and Evaluation Results: for Clustering

  20. Multi-Concept Alignment and Evaluation Results: Jaccard vs. LSA • For strategies 3 and ALL, LSA outperforms Jaccard • For strategies 1 and 2, Jaccard outperforms LSA • Simple similarity is better at finding explicit similarities • Those really occurring in books • LSA is better at finding potential similarities

  21. Multi-Concept Alignment and Evaluation Results: using LSA

  22. Multi-Concept Alignment and Evaluation Results: Clustering vs. Ranking • Clustering performs better on strategies 1 and 2 • It matches existing annotations better • It has better precision • Ranking has higher recall but lower precision • A classical tradeoff (ranking keeps noise)

  23. Multi-Concept Alignment and Evaluation Agenda • The multi-concept alignment problem • The Library case and the need for MCA • Generating MCAs for the Library case • Evaluating MCAs in the Library case • Conclusion

  24. Multi-Concept Alignment and Evaluation Conclusions • There is an important problem: multi-concept alignment • Not extensively dealt with in the current literature • Needed by applications • We have first approaches to create such alignments • And to deploy them! • We hope that further research will improve the situation (with our ‘deployer’ hat on) • Better alignments • More precise frameworks (methodology research)

  25. Multi-Concept Alignment and Evaluation Conclusions: Performances • Evaluation shows mixed results • Performances are generally very low • These techniques cannot be used alone • Notice: dependence on requirements • Settings where a manual indexer chooses among several candidates allow for lower precision • Notice: indexing variability • OAEI has demonstrated that manual evaluation somewhat compensates for the bias of the automatic one

  26. Multi-Concept Alignment and Evaluation Thanks!
