ZOOMA – Optimal Ontology Mapping Application

ZOOMA – Optimal Ontology Mapping Application. Tony Burdett 29 nd April. What is ZOOMA?. ZOOMA is an ontology mapping application, designed to find optimal matches between “text values” and “ontology terms”. A bit of background:


Presentation Transcript


  1. ZOOMA – Optimal Ontology Mapping Application Tony Burdett 29nd April

  2. What is ZOOMA?
     ZOOMA is an ontology mapping application, designed to find optimal matches between “text values” and “ontology terms”.
     A bit of background:
     • ZOOMA grew out of the need to offer queries against EFO for data newly loaded into Atlas 2.0.
     • At load, user-supplied text values are not resolved against ontology terms.
     • Curators need to be able to “map” text values for newly loaded data in order to support ontology-enabled queries.

  3. Wider Use Case
     • Our pressing requirement, right now, is to map text values to ontology terms in the Atlas.
     • But this is a wider problem: such mappings are found in everything we do.
     • Atlas, ArrayExpress2, BII, MAGE-TAB and submitters all need to do this sort of ontology mapping.
     • So it makes sense to write the mapping logic once and reuse it!

  4. Some Jargon
     • We probably all understand what “text values” and “ontology terms” mean (or we think we do).
     • “Text values” are things that a user, or maybe a curator, has entered.
       • This isn’t necessarily (but probably is) some sort of controlled term.
     • “Ontology terms” are things which come (only) from an ontology.
       • If you can’t find it, it’s not an ontology term!
     • A “mapping” is an assertion that some text value somehow relates to some ontology term(s).

  5. Atlas World View
     • The Atlas DB has Property, PropertyValue and OntologyTerm tables.
     • Properties are the “types” (e.g. organismpart) and Property Values are the values (e.g. heart).
       • These are controlled (reused between experiments).
     • There are join tables, AssayPVOntology and SamplePVOntology, to join property values to OntologyTerms on the assay and sample level.

  7. ArrayExpress2 World View
     • The ArrayExpress database also has Property, PropertyValue, OntologyTerm and OntologyEntry tables.
     • However, they’re joined in different ways.
       • In the Atlas, Properties and Property Values are reused, and the join table determines uniqueness.
       • In ArrayExpress2, there are join tables to join Properties to OntologyEntries, and Property Values to OntologyTerms (basically).
       • This means Properties and Property Values must be unique per “usage”.

  9. MAGE-TAB World View
     • In MAGE-TAB, certain columns can be followed by a “Term Source REF” column, which designates a link to an ontology term.
     • The value entered into the column is a text value.
     • The Term Source REF indicates that we should be able to find an ontology term with the matching “name” in the given ontology.
     • But this can be hit and miss! Using the term source accession is more reliable.
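The difference between name-based and accession-based lookup can be sketched as follows. This is a minimal illustration, not ZOOMA or MAGE-TAB parser code: the `TermSourceLookup` class, its `Term` type, both lookup methods and the `EX:` accessions are all hypothetical, and the ontology is just an in-memory list.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch: resolving a MAGE-TAB text value against a term source. */
class TermSourceLookup {

    /** Minimal stand-in for an ontology term: an accession plus a display name. */
    static final class Term {
        final String accession;
        final String name;

        Term(String accession, String name) {
            this.accession = accession;
            this.name = name;
        }
    }

    /** Name-based lookup: succeeds only if the text value equals a term name, ignoring case. */
    static Term findByName(List<Term> ontology, String textValue) {
        String wanted = textValue.trim().toLowerCase(Locale.ROOT);
        for (Term term : ontology) {
            if (term.name.toLowerCase(Locale.ROOT).equals(wanted)) {
                return term;
            }
        }
        return null; // no term with that name: the "miss" case
    }

    /** Accession-based lookup: does not depend on how the text value was spelled. */
    static Term findByAccession(List<Term> ontology, String accession) {
        for (Term term : ontology) {
            if (term.accession.equals(accession)) {
                return term;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Made-up accessions, for illustration only.
        List<Term> ontology = Arrays.asList(
                new Term("EX:0001", "heart"),
                new Term("EX:0002", "liver"));

        System.out.println(findByName(ontology, "Heart") != null);        // true
        System.out.println(findByName(ontology, "heart tissue") != null); // false
        System.out.println(findByAccession(ontology, "EX:0001").name);    // heart
    }
}
```

A free-text value such as "heart tissue" misses under the name-based lookup even though a curator would recognise it, whereas an accession identifies the term regardless of spelling.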

  10. MAGE-TAB World View
     [Diagram labels: Property (IDF may link to OntologyEntry); Property value; Ontology term]

  11. Now that’s over… what does ZOOMA do?
     In general, ZOOMA has three main modes of operation:
     • Automatic
       • This highlights any optimal mappings that don’t need curation.
       • One text value maps to the same set of terms every time.
     • Error detection
       • Highlight any mappings from text value to ontology term that might be in error.
       • This only makes sense if you have a “repository of mappings”, e.g. the Atlas.
     • Mapping suggestions
       • Take values, and propose new mappings to terms.
       • Requires a curator’s eye to decide which mapping is best.
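One way to picture the first two modes is as a consistency check over a repository of existing mappings. The sketch below is an assumption, not ZOOMA's actual logic: the `MappingModes` class, its `Mapping` type and the `EX:` accessions are hypothetical. It treats a text value as "automatic" when every occurrence maps to the same set of terms, and flags it for review when the term sets disagree.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch: sorting text values by how consistently they are mapped. */
class MappingModes {

    /** One mapping in the repository: a text value and the term accessions it maps to. */
    static final class Mapping {
        final String textValue;
        final Set<String> terms;

        Mapping(String textValue, String... terms) {
            this.textValue = textValue;
            this.terms = new TreeSet<>(Arrays.asList(terms));
        }
    }

    /** Collects the distinct term sets seen for each text value, in first-seen order. */
    static Map<String, Set<Set<String>>> termSetsByValue(List<Mapping> repository) {
        Map<String, Set<Set<String>>> seen = new LinkedHashMap<>();
        for (Mapping mapping : repository) {
            seen.computeIfAbsent(mapping.textValue, key -> new HashSet<>()).add(mapping.terms);
        }
        return seen;
    }

    /** Selects text values whose mappings are (or are not) consistent across the repository. */
    private static List<String> select(List<Mapping> repository, boolean consistent) {
        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, Set<Set<String>>> entry : termSetsByValue(repository).entrySet()) {
            if ((entry.getValue().size() == 1) == consistent) {
                selected.add(entry.getKey());
            }
        }
        return selected;
    }

    /** Text values that map to the same set of terms every time. */
    static List<String> automatic(List<Mapping> repository) {
        return select(repository, true);
    }

    /** Text values mapped to different term sets: candidates for error detection. */
    static List<String> needsReview(List<Mapping> repository) {
        return select(repository, false);
    }

    public static void main(String[] args) {
        // Made-up accessions, for illustration only.
        List<Mapping> repository = Arrays.asList(
                new Mapping("heart", "EX:0001"),
                new Mapping("heart", "EX:0001"),
                new Mapping("liver", "EX:0002"),
                new Mapping("liver", "EX:0003"));

        System.out.println(automatic(repository));   // [heart]
        System.out.println(needsReview(repository)); // [liver]
    }
}
```

Without a repository there is nothing to compare against, so the check has no input to work on.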

  12. What ZOOMA does
     • These three modes are currently implemented for the Atlas, and for a list of values submitted by file.
     • In the Atlas, ZOOMA will…
       • Look up property values from the Atlas (these are our text values), and find the optimal ontology term hits against EFO.
       • Look up property values from the Atlas and find inherited ontology term hits from elsewhere in the database.
       • Look up property values from the Atlas and find possible matches to terms not in EFO, querying BioPortal/OLS (using OntoCAT).
     • Currently, this data goes into a report, and is then written back to the database.
     • But automatic writing back to the Atlas is coming soon!

  13. What ZOOMA does
     • When running over a submitted text file, ZOOMA can…
       • Look up user-supplied text values (from a supplied file) and find optimal hits against your ontology of choice.
       • Look up user-supplied text values and recover mappings from the Atlas.
       • Look up user-supplied text values and find possible matches in BioPortal/OLS, again using OntoCAT.
     • Actually, the implementation for the different inputs is the same: we just need to change where our text values come from.
     • Again, this generates a report.
     • Error detection doesn’t really mean anything in this case: there are no existing mappings in which to detect errors!

  14. Some tech…
     • The idea behind ZOOMA is to isolate the logic from the sources of text values and ontology terms.
     • To achieve this, ZOOMA has several top-level interfaces:
       • OntologyMapper
       • OntologyMappingFormulator
       • OntologyTermRetriever
       • OntologyMappingHypothesis
       • OntologyMappingHypothesisFactory
       • OntologyMappingEvaluator
       • OntologyMappingCalculator
       • OntologyMappingOutcome

  15. Some more tech…
     • With the infrastructure described, supporting a new database just means writing a new OntologyTermRetriever implementation.
     • An OntologyTermRetriever fetches OntologyTerms that possibly match a text value, with enough information to decide how good the match is (the OntologyMappingContext).
     • New mapping logic can be added by adding implementations of the other interfaces. At the moment, for example, there is a “RankingBasedCalculator” that implements OntologyMappingCalculator.
     • Implementations can quickly be wired up with Spring.
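The separation between retrieving candidate terms and choosing among them might look roughly like this. Only the names OntologyTermRetriever, OntologyMappingCalculator and RankingBasedCalculator come from the slides; the method signatures, the `Candidate` type, the scoring rule, the in-memory retriever and the `EX:` accessions are assumptions made for this sketch, and the Spring wiring is replaced by plain constructor calls.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch of the retriever/calculator split; signatures are assumed. */
class ZoomaSketch {

    /** A candidate term; the score stands in for the context used to judge the match. */
    static final class Candidate {
        final String accession;
        final String label;
        final double score;

        Candidate(String accession, String label, double score) {
            this.accession = accession;
            this.label = label;
            this.score = score;
        }
    }

    /** Assumed signature: fetch terms that possibly match a text value. */
    interface OntologyTermRetriever {
        List<Candidate> retrieve(String textValue);
    }

    /** Assumed signature: choose the best candidate, or null if there are none. */
    interface OntologyMappingCalculator {
        Candidate calculate(List<Candidate> candidates);
    }

    /** Toy retriever over an accession-to-label map: exact label scores 1.0, substring 0.5. */
    static final class InMemoryRetriever implements OntologyTermRetriever {
        private final Map<String, String> labelsByAccession;

        InMemoryRetriever(Map<String, String> labelsByAccession) {
            this.labelsByAccession = labelsByAccession;
        }

        @Override
        public List<Candidate> retrieve(String textValue) {
            String wanted = textValue.trim().toLowerCase(Locale.ROOT);
            List<Candidate> candidates = new ArrayList<>();
            for (Map.Entry<String, String> term : labelsByAccession.entrySet()) {
                String label = term.getValue().toLowerCase(Locale.ROOT);
                if (label.equals(wanted)) {
                    candidates.add(new Candidate(term.getKey(), term.getValue(), 1.0));
                } else if (label.contains(wanted)) {
                    candidates.add(new Candidate(term.getKey(), term.getValue(), 0.5));
                }
            }
            return candidates;
        }
    }

    /** Picks the highest-scoring candidate; a guess at what a ranking-based calculator does. */
    static final class RankingBasedCalculator implements OntologyMappingCalculator {
        @Override
        public Candidate calculate(List<Candidate> candidates) {
            Candidate best = null;
            for (Candidate candidate : candidates) {
                if (best == null || candidate.score > best.score) {
                    best = candidate;
                }
            }
            return best;
        }
    }

    /** The mapping logic depends only on the two interfaces, not on any data source. */
    static Candidate map(OntologyTermRetriever retriever,
                         OntologyMappingCalculator calculator,
                         String textValue) {
        return calculator.calculate(retriever.retrieve(textValue));
    }

    public static void main(String[] args) {
        // Made-up accessions, for illustration only.
        Map<String, String> terms = new LinkedHashMap<>();
        terms.put("EX:0001", "heart");
        terms.put("EX:0002", "heart ventricle");

        OntologyTermRetriever retriever = new InMemoryRetriever(terms);
        OntologyMappingCalculator calculator = new RankingBasedCalculator();

        System.out.println(map(retriever, calculator, "Heart").accession);     // EX:0001
        System.out.println(map(retriever, calculator, "ventricle").accession); // EX:0002
        System.out.println(map(retriever, calculator, "kidney"));              // null
    }
}
```

Swapping `InMemoryRetriever` for a database-backed implementation would leave `map` and the calculator untouched, which is the point of isolating the logic from the sources.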

  16. What’s next?
     • Very short term: do a production release! Ele has already been using ZOOMA to do Atlas mappings, but it’s been a bit ad hoc so far.
     • I need to add support for writing our new mappings into the Atlas database, instead of the report.
     • I want to add an ArrayExpress2 retriever, so we can determine mappings here.
       • And also add write support, to map terms there too.
     • Possibly create an OWL-driven backend?

  17. Questions?
