
Phenotype annotation

Phenotype annotation. Chris Mungall Lawrence Berkeley Labs NCBO GO. Outline. Principles of Compositionality Tour of PATO Pre vs post composition Quantitative phenotypes Next steps. Phenotype annotation: why?. To shed light on the relationships between genes, environment and phenotype


Presentation Transcript


  1. Phenotype annotation Chris Mungall Lawrence Berkeley Labs NCBO GO

  2. Outline • Principles of Compositionality • Tour of PATO • Pre vs post composition • Quantitative phenotypes • Next steps

  3. Phenotype annotation: why? • To shed light on the relationships between genes, environment and phenotype • To compare genes and phenotypes across organisms • To improve human health and wellbeing

  4. Difficulties • Phenotypes can be complex • Descriptions are often composite • Encompass relationships between different kinds of entities, at different levels of granularity • Different ways of describing the same thing • Descriptions must be rigorous and unambiguous • Ensures meaningful analyses and comparisons within and between organisms

  5. Compositionality is essential for describing phenotypes • Compositionality is a principle of good ontology design • aka building blocks, cross-products, normalised/modular design • Create complex descriptions (definitions) from simpler ones • Descriptions can be composed at any time • Ontology construction time (pre-composition) • Annotation time (post-composition)

  6. An example of compositionality • Plasma membrane of spermatocyte • Plasma membrane [GO CC] • Spermatocyte [OBO Cell] • Formal means of composition • Genus-differentia: a plasma membrane [genus, GO-CC] which is part_of [OBO-REL] a spermatocyte [differentia, Cell]

  7. Compositionality and ontology tools • Composition supported by: • Phenote • OBO-Edit • Cross-product plugin • Protégé-OWL • SWOOP • …and others

  8. Advantage: Automatic DAG calculation • a membrane which is part_of a germ cell • a plasma membrane which is part_of a spermatocyte
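
A minimal sketch of how the link between the two descriptions on this slide could be computed from their genus-differentia definitions. The `IS_A` table, the `CrossProduct` class and the `subsumes` rule are illustrative assumptions, not the algorithm used by OBO-Edit or any other tool named in the talk; a real reasoner would also handle relation hierarchies and transitivity of part_of.

```python
from dataclasses import dataclass

# Toy is_a links (child -> parents). Hypothetical data for illustration,
# not loaded from GO or the OBO Cell ontology.
IS_A = {
    "plasma membrane": {"membrane"},
    "spermatocyte": {"germ cell"},
    "germ cell": {"cell"},
}


def is_a(child, ancestor):
    """Reflexive, transitive is_a check over the toy table."""
    if child == ancestor:
        return True
    return any(is_a(parent, ancestor) for parent in IS_A.get(child, ()))


@dataclass(frozen=True)
class CrossProduct:
    genus: str        # e.g. a GO cellular component term
    relation: str     # e.g. part_of from OBO-REL
    differentia: str  # e.g. an OBO Cell term


def subsumes(general, specific):
    """True if `specific` can be placed under `general` in the DAG."""
    return (
        general.relation == specific.relation
        and is_a(specific.genus, general.genus)
        and is_a(specific.differentia, general.differentia)
    )


germ_cell_membrane = CrossProduct("membrane", "part_of", "germ cell")
spermatocyte_pm = CrossProduct("plasma membrane", "part_of", "spermatocyte")
```

Under these assumptions, `subsumes(germ_cell_membrane, spermatocyte_pm)` holds because both the genus and the differentia of the second description are subtypes of those in the first, so the is_a link does not have to be asserted by hand.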

  9. The building blocks of phenotype descriptions: EQ • Entities and qualities (EQ) • (Bearer) Entity • E.g. compound eye, spermatocyte, blood, wing growth, scale morphogenesis • Quality (aka property, attribute) • A kind of dependent continuant • Defined in PATO • E.g. green, hot, squamous, rugose, edematous, light-sensitivity, luminescent, ectopic, arrested, decomposed

  10. Formal treatment of EQ • We must be clear about what we mean when we compose an E and a Q • Otherwise we will have incomplete query results and erroneous statistics in annotations • The meaning must be computable • Formally, an EQ description defines: a Quality which inheres_in a bearer entity • Which implicitly refers to: a bearer entity which bears a Quality
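
One way to make the two readings on this slide concrete is a small record type that renders both forms. This is a sketch only: the `EQ` class and its method names are hypothetical and are not part of PATO or Phenote.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EQ:
    entity: str   # the bearer entity, e.g. "root"
    quality: str  # the quality, e.g. "length"

    def definition(self):
        """The quality-centred reading: a Q which inheres_in an E."""
        return f"a {self.quality} which inheres_in a {self.entity}"

    def bearer_reading(self):
        """The implicit bearer-centred reading: an E which bears a Q."""
        return f"a {self.entity} which bears a {self.quality}"
```

For example, `EQ("root", "length").definition()` gives the string "a length which inheres_in a root". Keeping both readings derivable from one record means a query can be phrased from either side without storing the annotation twice.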

  11. Example normal eya[1]/eya[1]

  12. Kinds of entities which can be bearers of biological qualities • Continuants (3D entities) • Cell parts (GO) • Cells (OBO Cell ontology) • Gross anatomical entities (CARO, FMA, flyAO, MA, zfishAO, …) • Aggregates of organisms (?) • Occurrents (4D entities) • Biological processes (GO)

  13. PATO normal eya[1]/eya[1] GO FlyAO

  14. Tour of PATO • Tour from the top-down • The top level of PATO has been built according to formal ontological principles • This helps us define terms in a consistent and unambiguous way • The top level can be hidden from end-users by means of ontology views (aka slims) • Still subject to change • Feedback welcome!

  15. PATO: Top level division (note: some nodes omitted for brevity) • Quality • Quality of a continuant: a quality which inheres in a continuant • Quality of an occurrent: a quality which inheres in a process or spatiotemporal region • Nodes shown in the diagram: physical quality, cellular quality, morphology, duration, rate, color, density, shape, size, structure, arrested, premature, delayed

  16. Divisions by granularity • Monadic quality of a continuant … • Physical quality: a quality that exists through action of continuants at the physical level of organisation • color: green, pink, yellow • temperature: hot, cold • mass: large mass, small mass • Cellular quality: a quality that exists at the cellular level of organisation • nucleate quality: anucleate, binucleate • ploidy: diploid, haploid, aneuploid • potency: multipotent, totipotent, oligopotent

  17. Monadic vs relational quality of a continuant (C) • Monadic quality of a C: a quality of a C that inheres solely in the bearer and does not require another entity • Physical quality • Cellular quality • Morphology: shape, size, structure • Relational quality of a C: a quality of a C that requires another entity apart from its bearer to exist • Sensitivity (to) • Displacement (with) • Connected-ness (to)

  18. Example relational quality • Sensitivity • Directed towards some entity type • E.g. • Sensitivity of an eye to red light • The quality inheres_in the eye • With respect to (towards) red light • Pheno-syntax: • E= eye Q= sensitivity E2= red_light
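
The pheno-syntax line on this slide can be read mechanically. The parser below is a rough sketch that assumes only the `E=`, `Q=` and `E2=` fields seen in this talk, with values that contain no spaces; the full pheno-syntax grammar is not given here, so `parse_pheno` and its regular expression are hypothetical.

```python
import re

# Matches "E= value", "Q= value" or "E2= value"; the space after "=" is optional.
FIELD = re.compile(r"\b(E2|E|Q)=\s*(\S+)")


def parse_pheno(line):
    """Return a dict of the E, Q and (optional) E2 fields found in `line`."""
    return {key: value for key, value in FIELD.findall(line)}


def is_relational(fields):
    """A description is treated as relational when it names a second entity."""
    return "E2" in fields
```

For instance, `parse_pheno("E= eye Q= sensitivity E2= red_light")` yields a dict with the keys `E`, `Q` and `E2`, and `is_relational` reports that the quality is directed towards a second entity.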

  19. On absence • Annotation patterns for absence and counts are currently under discussion • “spermatocyte devoid of asters” • E= CL:spermatocyte • Inheres in the spermatocyte • Q= PATO:lacks_part • The quality/relation of missing some part or parts • E2= GO-CC:aster • The quality is with respect to the type “aster”

  20. Pre- vs post- composition • When do we build the phenotype description? • In the ontology • During annotation? • Reconciling pre and post composition: An analysis of the plant_trait ontology

  21. When do we build the phenotype description? • Early? • Pre-composed phenotype definitions • MP:0000017 “big ears” • TO:0000227 “root length” • TO:0000029 “chlorine sensitivity” • Late? • Post-composed phenotype definitions • E= MA:ear Q= PATO:big • E= PO:root Q= PATO:length • E= organism Q= PATO:sensitivity E2= CHEBI:chlorine

  22. Is this comparable? • Pre-composed: MP:0000285 “abnormal cardiac valve morphology”, MP:0000287 “heart valve hypoplasia” • Post-composed: E= MA:heart_valve Q= PATO:hypoplastic • PATO terms: PATO:0000051 “morphology”, PATO:0000141 “structure”, PATO:0000645 “hypoplastic”

  23. Yes: if term is decomposable • Pre-composed: MP:0000285 “abnormal cardiac valve morphology”, MP:0000287 “heart valve hypoplasia” • Def of MP:0000287: a hypoplasticity which inheres_in a heart valve • Equivalent post-composed form: E= MA:heart_valve Q= PATO:hypoplastic • PATO terms: PATO:0000051 “morphology”, PATO:0000141 “structure”, PATO:0000645 “hypoplastic”
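
A small sketch of why decomposability makes the two styles comparable: once a pre-composed term has a logical definition, both kinds of annotation reduce to the same (entity, quality) pair. The `LOGICAL_DEFS` table and helper functions are hypothetical, and the quality hierarchy (hypoplastic under structure under morphology) is an assumed reading of the slide's diagram rather than something taken from a PATO release.

```python
# Logical definitions for pre-composed terms, keyed by term ID.
# Only the MP:0000287 definition comes from the slide.
LOGICAL_DEFS = {
    "MP:0000287": ("MA:heart_valve", "PATO:hypoplastic"),
}

# Assumed toy hierarchy: child quality -> parent quality.
QUALITY_IS_A = {
    "PATO:hypoplastic": "PATO:structure",
    "PATO:structure": "PATO:morphology",
}


def quality_ancestors(quality):
    """Return the quality followed by all of its ancestors."""
    chain = [quality]
    while chain[-1] in QUALITY_IS_A:
        chain.append(QUALITY_IS_A[chain[-1]])
    return chain


def to_eq(annotation):
    """Normalise a pre-composed ID or a post-composed pair to (entity, quality)."""
    if isinstance(annotation, str):
        return LOGICAL_DEFS[annotation]
    return annotation


def matches(annotation, entity, quality):
    """True if the annotation is about `entity` and its quality is_a `quality`."""
    found_entity, found_quality = to_eq(annotation)
    return found_entity == entity and quality in quality_ancestors(found_quality)
```

With this normalisation, a query for morphology of the heart valve returns the MP:0000287 annotation and the post-composed annotation alike, which is the comparison the slide asks about.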

  24. Comparing phenotypes • We want to compare and query both within and across species • For gross anatomical phenotypes to be compared across species, descriptions must be decomposed or decomposable to anatomical terms • Anatomical terms must be comparable • Homology links • CARO: Common Anatomy Reference Ontology

  25. Case study: Defining plant traits with PATO • OBO Plant Trait ontology • Pre-composed phenotype terms • Analogous to OBO mammalian_phenotype ontology • Task: Define these terms with PATO • A good test of PATO • Demonstration of compositional approach • Allows meaningful comparison across plant species • Pilot study before applying to metazoans http://www.bioontology.org/wiki/index.php/PATO:Pre_vs_Post_Coordinating

  26. Methods • Creation of genus-differentia definitions • First pass: Obol • Second pass: manual editing • Ontologies used • PATO • Plant anatomical entities (PO) • Gramene environment (GEO) • Chemical entities of biological interest (CHEBI) • GO

  27. Basic phenotype terms • “root length” (TO:0000227) • E= PO:root Q= PATO:length • Formally: Def: a length which inheres_in a root

  28. Relational qualities involving types of chemical • “Chlorine sensitivity” [TO:0000029] • Directed towards an additional entity type • Q= PATO:sensitivity E2= CHEBI:chlorine • Def: a sensitivity which is directed towards chlorine [ inheres_in organism ]

  29. Relational qualities involving the environment • “drought sensitivity” [TO:0000029] • Directed towards an additional entity type • Q= PATO:sensitivity E2= EO:drought • Def: a sensitivity which is directed towards drought [ inheres_in organism ] • OBO needs a good environment ontology

  30. Complex phenotypes • “Chinsura boro” • “Abortion of microspore development at trinucleate stage” • Def: an arrested which inheres_in ( microspore development which during trinucleate stage )

  31. Results of plant_trait analysis • 252/784 terms provided with genus-differentia definitions so far • Helped find inconsistencies and problems in the ontology • New term suggestions for PATO • proportionality • Approach should work for animal phenotype ontologies

  32. Bacterial phenotypes • Performed similar analysis on bacterial phenotype terms • Provided by Garrity & Hozzein • Results (morphological only): • 26 new terms added to PATO • Rugose, rhizoidal, lobate, filamentous, … • Todo: chemical utilization phenotypes • Required: • Ontologies for aggregates of organisms • Assay ontology

  33. Measurements • Ontologies provide qualitative partitions on the kinds of entities we find in nature • We may also want to record quantitative information • Comes from measurements of qualities • The measurement is not the phenotype • Phenotypes exist independently of our measurements of them

  34. Measurement schema • A measurement record consists of • The quality being measured • E.g. the length of a particular mouse tail • The unit type • From PATO UO • A magnitude • Floating point number • Error measure [optional]
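
The schema on this slide maps naturally onto a small record type. The sketch below is an assumption about how such a record might be laid out: the `Measurement` class and its field names are hypothetical, and treating the optional error measure as a symmetric plus-or-minus bound is an illustrative choice that the slide does not specify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Measurement:
    quality: str                   # the quality being measured, e.g. a tail length
    unit: str                      # the unit type, e.g. a term from PATO UO
    magnitude: float               # floating point number
    error: Optional[float] = None  # optional error measure

    def interval(self):
        """Return (low, high), assuming the error is a symmetric bound."""
        if self.error is None:
            return (self.magnitude, self.magnitude)
        return (self.magnitude - self.error, self.magnitude + self.error)
```

Keeping the measured quality as a field, rather than making the number itself the phenotype, mirrors the point on the previous slide that the measurement is not the phenotype.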

  35. Sample of PATO UO • Unit • Base unit • Length unit • Angstrom • meter • Mass unit • Dalton • Gram • Substance unit • Derived unit • Concentration unit • pH • Quality • Morphology • Size length • Physical quality • Mass

  36. Phenotype exchange formats • Genotypes and phenotypes: • Pheno-syntax • Pheno-XML • General purpose • OWL (using canonical EQ encoding) • Also has Obo equivalent • GO annotation files • Works with pre-coordinated terms only

  37. OBD-Phenotype • A database for phenotype associations • Built on OBD framework • Tuned for inference and reasoning • Graph traversal built in from the start • Results • Annotations on data from OMIM, ZFIN and FlyBase • Currently too small a dataset to do analysis

  38. Next steps • Get PATO & Phenote used across multiple organisms and projects • MODs, BIRN, OMIM • Collect annotation data from multiple sources in one repository (OBD) • Both pre + post composed • Demonstrate improved analysis of annotation data using PATO

  39. filamentous - having thin filamentous extensions at its edge • pleomorphic - a quality inhering in a cell by virtue of its ability to take on two or more different shapes during its life cycle • pulvinate - shaped like a cushion or having a marked convex cushion-like form • umbonate - having a knob or knoblike protuberance • rugose - having many wrinkles or creases on the surface • glistening - emitting or reflecting lots of light • dull - emitting or reflecting little or no light • viscid - covered with a sticky or clammy coating • mucoid - having the consistency of mucus • spiral - plane curve traced by a point circling about the center but at increasing distances from the center • rhizoidal - having root-like extensions radiating from its center • spiny - having spines, thorns or similar stiff projections on its surface • warty - having a hard rough surface; not smooth • curled - having parallel chains in undulate fashion on the border • fragile - easily damaged or disrupted; brittle • butyraceous - resembling butter in appearance and consistency • undulate - having a wavy, shallow edge • punctiform - small and resembling a point • lobate - a morphological quality in which the bearer has deeply undulated edges forming lobes • erose - having an irregularly toothed edge • raised - a thick colony that appears above the medium surface with terraced edges • convex - a shape that obtains by virtue of having inward facing edges; having a surface or boundary that curves or bulges outward, as the exterior of a sphere

  40. Proportions • “amylose to amylopectin ratio” TO:0000372 • Def: a compositionality which is directed towards amylose relative_to amylopectin [ inheres_in organism ]
