Towards a Logic Formalization of Taxonomic Concepts



Presentation Transcript


  1. Towards a Logic Formalization of Taxonomic Concepts
     Dave Thau, Bertram Ludäscher, Shawn Bowers
     UC Davis
     thau@learningsite.com

  2. Names are Confusing (adapted from R. Peet)
     [Figure: names used by four authorities (Gray 1834, Chapman 1860, Kral 1998, Thau 2006). Names shown: Ranunculus plumosa (three times), R. plumosa var intermedia, R. plumosa var plumosa, Ranunculus pinetcola, Ranunculus homunculus.]
     5th International Conference on Ecological Informatics

  3. Impact on Data Analysis
     • Can’t find data
         • If A ≡ B, a search on A should retrieve B
         • Same if A ⊇ B
     • Can’t aggregate data
         • If A ⊆ B, you should be able to combine data from A into B
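The two rules on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a search on A should also retrieve B when A is equivalent to or includes B, and data from A can be combined into B when A is equivalent to or included in B. The concept identifiers, the relation names and the triple format are all hypothetical, and the lookup follows a single mapping step rather than chaining mappings.

```python
# Hypothetical concept-to-concept mappings, written as (A, relation, B).
# "equiv" means A ≡ B, "includes" means A ⊇ B, "included_in" means A ⊆ B.
MAPPINGS = [
    ("Benson1948:X", "equiv", "Kartesz2004:X"),
    ("Benson1948:Y", "includes", "Kartesz2004:Y1"),
    ("Benson1948:Z", "included_in", "Kartesz2004:W"),
]

# Reading a mapping from the other side flips the direction of inclusion.
FLIP = {"equiv": "equiv", "includes": "included_in", "included_in": "includes"}


def both_directions(mappings):
    """Yield every mapping as stated and as read from the other concept."""
    for a, rel, b in mappings:
        yield a, rel, b
        yield b, FLIP[rel], a


def search_targets(concept, mappings):
    """Concepts whose records a search on `concept` should also retrieve."""
    return {b for a, rel, b in both_directions(mappings)
            if a == concept and rel in ("equiv", "includes")}


def aggregation_targets(concept, mappings):
    """Concepts into which data recorded under `concept` can be combined."""
    return {b for a, rel, b in both_directions(mappings)
            if a == concept and rel in ("equiv", "included_in")}


print(search_targets("Benson1948:Y", MAPPINGS))
print(aggregation_targets("Kartesz2004:Y1", MAPPINGS))
```

Equivalence is treated as inclusion in both directions, so an equivalent concept is both a search target and an aggregation target.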

  4. Where in Greece Can I Find Ranunculus aquatilis?
     [Figure labels: R. aquatilis, R. trichophyllus]

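  5. Mapping Taxonomies
     [Figure: Venn diagrams of the five basic relations between two taxa A and B: A overlaps B, A is disjoint from B, and the three relations defined on slide 10 (A is included in B, A includes B, A and B are equivalent).]
     [Figure: a mapping between Benson, 1948 and FNA-03, 1997. Taxa shown: Ranunculus aquatilis (in both taxonomies), R.a. var calvescens, R.a. var capillaceus, R.a. var aquatilis, R.a. var diffusus, R.a. var hispidulus. Relation symbols, including ≡, link taxa across the two taxonomies.]
     This results in 5^12 (more than 240 million) possible sets of relationships.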
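The count quoted on this slide is consistent with five possible relations for each of twelve pairs of taxa, that is, 5 to the power 12. The short Python check below does that arithmetic. Reading the exponent as the number of cross-taxonomy pairs (for example, 4 taxa in one taxonomy times 3 in the other) is an assumption, as are the relation labels.

```python
# The five basic relations between two taxa (labels are assumptions).
RELATIONS = ["equivalent", "includes", "is included in", "overlaps", "disjoint"]

# Exponent taken from the slide; interpreting it as the number of
# cross-taxonomy pairs of taxa is an assumption.
PAIRS = 12


def possible_mappings(num_relations: int, num_pairs: int) -> int:
    """Ways to pick exactly one relation for every pair of taxa."""
    return num_relations ** num_pairs


total = possible_mappings(len(RELATIONS), PAIRS)
print(f"{total:,}")  # 244,140,625
```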

  6. Overview
     • The problems: names change, experts disagree, data become incomparable
     • The partial solution: Taxonomic Concepts
     • Another part of the solution: Logic
     • Representing taxonomy in logic
     • Using the representation to detect inconsistencies and discover new relations
     • Applications

  7. Logic, why?
     • Precise modeling language
     • Solid mathematical basis
     • Good tools for reasoning are available
     • Explicit, “portable” representation (not buried in code)

  8. Basic Taxonomy
     • Rooted tree
     • Only “isa” relations
     T = (N, E)
     N = {A, B, C}
     E = {B isa A, C isa A}
     Φ_isa(T) = { ∀x: m(x) → n(x) | (m isa n) ∈ E, T = (N, E) }
     [Figure: tree with root A and children B and C, each joined to A by an isa edge]
     In the basic taxonomy, Φ(T) = Φ_isa(T).
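A minimal Python sketch of this definition, assuming each isa edge contributes one axiom of the form "every member of the child is a member of the parent". The function names and the specimens s1 to s3 are hypothetical; the nodes and edges are the ones on the slide.

```python
# T = (N, E) from the slide: nodes A, B, C with edges "B isa A" and "C isa A".
N = {"A", "B", "C"}
E = {("B", "A"), ("C", "A")}  # each edge is (child, parent)


def isa_axioms(edges):
    """One readable axiom per isa edge: forall x: child(x) -> parent(x)."""
    return [f"forall x: {m}(x) -> {n}(x)" for m, n in sorted(edges)]


def satisfies_isa(edges, extension):
    """True if every member of a child node is also a member of its parent."""
    return all(extension[m] <= extension[n] for m, n in edges)


# Hypothetical assignments of specimens to nodes.
good = {"A": {"s1", "s2", "s3"}, "B": {"s1"}, "C": {"s2", "s3"}}
bad = {"A": {"s1"}, "B": {"s1"}, "C": {"s2"}}  # s2 is in C but not in A

print(isa_axioms(E))
print(satisfies_isa(E, good), satisfies_isa(E, bad))
```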

  9. Some Additional Constraints
     [Figure: tree with root A and children B and C, joined by isa edges]
     • No empty nodes
         • All nodes have at least one element
         • Φ_nonempty(T) = { ∃x: n(x) | n ∈ N, T = (N, E) }
     • Disjointness
         • The children of a node are disjoint
         • Φ_disjoint(T) = { ¬∃x: n1(x) ∧ n2(x) | (n1 isa m) ∈ E, (n2 isa m) ∈ E, n1 ≠ n2, T = (N, E) }
     • Closed World
         • A node with children is defined as the union of those children
         • This one’s formula is a bit long; trust me…
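The three constraints can be checked against a finite assignment of elements to nodes. The Python sketch below is an illustration only: the function names and test data are hypothetical, and because the slide does not give the closed-world formula, the check used here (a node with children equals the union of those children) is an assumed reading of the bullet text.

```python
from itertools import combinations

N = {"A", "B", "C"}
E = {("B", "A"), ("C", "A")}  # (child, parent), as in the basic taxonomy


def children(edges, parent):
    return sorted(m for m, n in edges if n == parent)


def non_empty(nodes, extension):
    """No empty nodes: every node has at least one element."""
    return all(len(extension[n]) > 0 for n in nodes)


def siblings_disjoint(nodes, edges, extension):
    """Disjointness: two distinct children of a node share no element."""
    return all(extension[a].isdisjoint(extension[b])
               for p in nodes
               for a, b in combinations(children(edges, p), 2))


def closed_world(nodes, edges, extension):
    """Assumed reading: a node with children equals the union of its children."""
    for p in nodes:
        kids = children(edges, p)
        if kids and extension[p] != set().union(*(extension[k] for k in kids)):
            return False
    return True


# Hypothetical extensions; integers stand for specimens.
ok = {"A": {1, 2, 3}, "B": {1}, "C": {2, 3}}
empty_child = {"A": {1}, "B": {1}, "C": set()}
overlapping = {"A": {1, 2}, "B": {1, 2}, "C": {2}}
open_world = {"A": {1, 2, 3}, "B": {1}, "C": {2}}  # 3 is in A but in no child
```

Each extension violates at most one constraint, which makes it easy to see which check fires for which kind of data problem.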

  10. Mapping Formulae
     • Mappings between nodes in two different taxonomies have their own formulae
     • In the slides and proofs to come I will use these symbols:
         A ⊆ B: A is included in B
         A ⊇ B: A includes B
         A ≡ B: A and B are equivalent

  11. Inferring Unstated Correspondences
     Benson, 1948: Ranunculus arizonicus, with R.a. var chihuahua and R.a. var typicus
     Kartesz, 2004: Ranunculus arizonicus
     Given (Peet, 2005):
     • B.1948:R. arizonicus is congruent to K.2004:R. arizonicus (≡)
     • B.1948:R.a.typicus is included in K.2004:R. arizonicus (⊆)
     We can demonstrate: B.1948:R.a. var chihuahua is included in K.2004:R. arizonicus

  12. Proving New Mappings
     Benson, 1948: A = Ranunculus arizonicus, with children B = R.a. var chihuahua and C = R.a. var typicus
     Kartesz, 2004: D = Ranunculus arizonicus
     Given: A ≡ D and C ⊆ D
     Show: B ⊆ D and ¬(D ⊆ B)
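The claim on this slide can be checked mechanically. The Python sketch below is not the authors' proof; it is a brute-force check under stated assumptions. It takes the givens to be A ≡ D and C ⊆ D, applies the isa, non-emptiness, disjointness and closed-world constraints to Benson's tree, and reasons over "regions": each region is one combination of memberships that a single element could have. Every universal axiom either allows or forbids a region outright, and a model is any set of allowed regions in which every node has at least one element.

```python
from itertools import product

NODES = ["A", "B", "C", "D"]  # labels from the slide


def allowed(r):
    """Per-element axioms; r maps a node name to membership of one element."""
    return (
        (not r["B"] or r["A"])                # B isa A
        and (not r["C"] or r["A"])            # C isa A
        and not (r["B"] and r["C"])           # siblings B and C are disjoint
        and (not r["A"] or r["B"] or r["C"])  # closed world: A is covered by B and C
        and r["A"] == r["D"]                  # assumed given: A ≡ D
        and (not r["C"] or r["D"])            # assumed given: C ⊆ D
    )


REGIONS = [dict(zip(NODES, bits))
           for bits in product([False, True], repeat=len(NODES))]
ALLOWED = [r for r in REGIONS if allowed(r)]


def all_nonempty(regions):
    """Non-emptiness: every node has some region in which it holds."""
    return all(any(r[n] for r in regions) for n in NODES)


def entails_subset(x, y):
    """x ⊆ y holds in every model iff no allowed region has x without y."""
    return not any(r[x] and not r[y] for r in ALLOWED)


def entails_not_subset(x, y):
    """¬(x ⊆ y) holds in every model iff discarding every region with
    'x without y' leaves some node with no possible element."""
    remaining = [r for r in ALLOWED if not (r[x] and not r[y])]
    return not all_nonempty(remaining)


assert all_nonempty(ALLOWED)  # the axioms are satisfiable to begin with
print(entails_subset("B", "D"), entails_not_subset("D", "B"))
```

Both checks succeed: B ⊆ D follows from B ⊆ A and A ≡ D, and D ⊆ B would force C, which is non-empty and disjoint from B, to be empty.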

  13. Formal Proof of Mapping (Part 1, Part 2)

  14. Inconsistent Mapping
     Benson, 1948: Ranunculus hydrocharoides, with R.h. var natans, R.h. var stolonifer, R.h. var typicus
     Kartesz, 2004: Ranunculus hydrocharoides, with R.h. var stolonifer, R.h. var typicus
     Peet, 2005:
     • B.1948:R.h.stolonifer is congruent to K.2004:R.h.stolonifer
     • B.1948:R.h.typicus is congruent to K.2004:R.h.typicus
     • B.1948:R. hydrocharoides is congruent to K.2004:R. hydrocharoides

  15. Proving Inconsistency
     Benson, 1948: Ranunculus hydrocharoides, with R.h. var natans, R.h. var stolonifer, R.h. var typicus
     Kartesz, 2004: Ranunculus hydrocharoides, with R.h. var stolonifer, R.h. var typicus
     Mappings: R. hydrocharoides ≡ R. hydrocharoides, R.h. var stolonifer ≡ R.h. var stolonifer, R.h. var typicus ≡ R.h. var typicus
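One way to see the inconsistency is to search for a taxon that cannot have any members. The Python sketch below is an illustration, not the authors' proof. It assumes both trees obey the isa, disjointness and closed-world constraints and that the three congruences hold, then lists every combination of memberships a single specimen could have. The abbreviated node names are hypothetical.

```python
from itertools import combinations, product

# Parent -> children, abbreviated from the slide (B = Benson 1948, K = Kartesz 2004).
TREES = {
    "B.h": ["B.natans", "B.stolonifer", "B.typicus"],
    "K.h": ["K.stolonifer", "K.typicus"],
}
EQUIV = [("B.h", "K.h"),
         ("B.stolonifer", "K.stolonifer"),
         ("B.typicus", "K.typicus")]
NODES = [n for parent, kids in TREES.items() for n in [parent, *kids]]


def allowed(r):
    """True if one specimen with memberships r violates no universal axiom."""
    for parent, kids in TREES.items():
        if any(r[k] and not r[parent] for k in kids):             # isa
            return False
        if any(r[a] and r[b] for a, b in combinations(kids, 2)):  # disjointness
            return False
        if r[parent] and not any(r[k] for k in kids):             # closed world
            return False
    return all(r[a] == r[b] for a, b in EQUIV)                    # congruences


ALLOWED = [r for bits in product([False, True], repeat=len(NODES))
           for r in [dict(zip(NODES, bits))] if allowed(r)]

# A node that appears in no allowed combination must be empty,
# which contradicts the "no empty nodes" constraint.
forced_empty = [n for n in NODES if not any(r[n] for r in ALLOWED)]
print(forced_empty)
```

Under these assumptions the only node forced to be empty is Benson's var natans: Kartesz's species is exactly stolonifer plus typicus, so a congruent Benson species leaves no room for a third, disjoint variety.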

  16. Formal Proof of Inconsistency

  17. Showing Inconsistency Using Popular Tools
     Benson, 1948: Ranunculus, with Ranunculus macranthus, Ranunculus petiolaris, …
     Kartesz, 2004: Ranunculus, with Ranunculus petiolaris, …
     B.48:R. petiolaris ⊆ K.04:R. petiolaris ⊆ B.48:R. macranthus contradicts “B.48:R. macranthus and B.48:R. petiolaris are disjoint.”
     Peet, 2005:
     • B.1948:R. macranthus contains K.2004:R. petiolaris
     • B.1948:R. petiolaris is contained by K.2004:R. petiolaris
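The transcript does not name the tools used on this slide, so the sketch below is plain Python rather than any of them. It illustrates the contradiction: chain the stated inclusions transitively, then look for a pair of taxa that is both required to be disjoint and linked by inclusion. With no empty taxa allowed, any such pair is a contradiction. The function names are hypothetical, and this detects only this one pattern of inconsistency.

```python
# Inclusions stated on the slide, as (smaller, larger).
INCLUDED_IN = {
    ("B.48:R. petiolaris", "K.04:R. petiolaris"),
    ("K.04:R. petiolaris", "B.48:R. macranthus"),
}
# Siblings in Benson's taxonomy are disjoint.
DISJOINT = {("B.48:R. macranthus", "B.48:R. petiolaris")}


def transitive_closure(pairs):
    """Add (a, d) whenever (a, b) and (b, d) are present, until nothing changes."""
    closure = set(pairs)
    while True:
        new = {(a, d) for a, b in closure for c, d in closure if b == c} - closure
        if not new:
            return closure
        closure |= new


def conflicts(included_in, disjoint):
    """Disjoint pairs in which one taxon is (transitively) included in the other."""
    closure = transitive_closure(included_in)
    return [(a, b) for a, b in sorted(disjoint)
            if (a, b) in closure or (b, a) in closure]


print(conflicts(INCLUDED_IN, DISJOINT))
```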

  18. Resolving Inconsistencies
     • The inconsistency comes from trying to satisfy non-emptiness, disjointness and the closed world simultaneously
     • Relaxing any one of these makes the mapping consistent, giving us clues to hidden truths
     • It turns out that Kartesz and Benson focus on different localities.
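The effect of relaxing a constraint can be tried out on the Ranunculus hydrocharoides mapping from slide 14. The Python sketch below is an illustration under stated assumptions, not the authors' method: it encodes the two trees and the three congruences, lets each of the three constraints be switched off, and reports whether some assignment of specimens still satisfies the rest. The node names and the function `consistent` are hypothetical.

```python
from itertools import combinations, product

# Parent -> children (B = Benson 1948, K = Kartesz 2004), abbreviated.
TREES = {
    "B.h": ["B.natans", "B.stolonifer", "B.typicus"],
    "K.h": ["K.stolonifer", "K.typicus"],
}
EQUIV = [("B.h", "K.h"),
         ("B.stolonifer", "K.stolonifer"),
         ("B.typicus", "K.typicus")]
NODES = [n for parent, kids in TREES.items() for n in [parent, *kids]]


def consistent(non_empty=True, disjoint=True, closed_world=True):
    """True if the mapping is satisfiable with the selected constraints on."""
    def allowed(r):
        for parent, kids in TREES.items():
            if any(r[k] and not r[parent] for k in kids):  # isa always holds
                return False
            if disjoint and any(r[a] and r[b] for a, b in combinations(kids, 2)):
                return False
            if closed_world and r[parent] and not any(r[k] for k in kids):
                return False
        return all(r[a] == r[b] for a, b in EQUIV)

    regions = [dict(zip(NODES, bits))
               for bits in product([False, True], repeat=len(NODES))]
    possible = [r for r in regions if allowed(r)]
    # With non-emptiness on, every node needs at least one possible member.
    return (not non_empty) or all(any(r[n] for r in possible) for n in NODES)


print(consistent())                    # all three constraints
print(consistent(non_empty=False))     # var natans may be empty
print(consistent(disjoint=False))      # var natans may overlap a sibling
print(consistent(closed_world=False))  # a species may exceed its varieties
```

Each relaxation suggests a different explanation. For example, dropping the closed world corresponds to a species concept that covers specimens outside the listed varieties, which fits the observation that the two authors focus on different localities.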

  19. Inconsistent Mapping
     Benson, 1948: Ranunculus hydrocharoides, with R.h. var natans, R.h. var stolonifer, R.h. var typicus
     Kartesz, 2004: Ranunculus hydrocharoides, with R.h. var stolonifer, R.h. var typicus
     Peet, 2005:
     • B.1948:R.h.stolonifer is congruent to K.2004:R.h.stolonifer
     • B.1948:R.h.typicus is congruent to K.2004:R.h.typicus
     • B.1948:R. hydrocharoides is congruent to K.2004:R. hydrocharoides

  20. Summary
     • Taxonomic Concepts are important
     • Logic is a useful tool when reasoning about mappings between taxonomies
     • We have the beginnings of a representation for taxonomies
     • That representation can find unstated mappings
     • And detect inconsistent mappings

  21. Future Work
     • Beefing up the representation
         • Formalizing more constraints, such as rank
         • Working in other factors, such as locality
     • Adding ‘intelligence’ to tools which build mappings
     • Using the representation in a workflow system to aid data integration

  22. Thanks! Questions?
     • We would like to acknowledge:
         • Bob Peet for the Ranunculus data set
         • NSF, under SEEK awards 0225676, 0225665, 0225635, and 0533368
