BioPAX: The Birth of a Data Exchange Language for Biological Pathways
Joanne Luciano, BioPAX Core Group, www.biopax.org
7th International Annual Bio-Ontologies Meeting, 30 July 2004, Glasgow, Scotland, United Kingdom
Introduction
BioPAX = Biopathway Exchange Language
Emerged at ISMB
• conceived at ISMB ’01
• born at ISMB ’02
• crawling at ISMB ’03 (Level 0.5)
• walking at ISMB ’04 (Level 1.0)
• now approaching the “terrible twos”
7th BioOntologies Workshop
What is a pathway? Depends on who you ask
Research Community Need: Integrated Pathway Database
Pathway Databases: WIT, BioCyc, Reactome, aMAZE, KEGG, BIND, DIP, HPRD, MINT, IntAct, PSI format, CSNDB, TRANSPATH, TRANSFAC, PubGene, GeneWays
Pathway types: Metabolic, Protein Interaction, Signal Transduction, Gene Regulatory
Design Goals
• Encapsulation: An entire pathway in one record
• Compatible: Use existing standards wherever possible
• Computable: From file reading to logical inference
• Successful: Buy-in from the research community
Technical Logistics & Goals
Interoperability
• Integration and exchange of pathway data
• Interchange through a common (standard) representation
• Accommodate existing database representations
• Provide a basis for future databases
• Enable development of tools for searching and reasoning over the database
Technical Logistics (cont’d)
Why OWL? Why OWL DL?
Expressivity (biology = “complex relationships”)
• W3C Standard (use existing standards), “Semantic Web enabled”
• XML based (the exchange language in computing)
• Machine computable
• Facilitates integration of knowledge, data, tool development
• Uncovers inconsistencies and new knowledge
OWL DL
• Enables full reasoning capability for users, from file reading to logical inference
• Complete: all conclusions are guaranteed to be computed
• Decidable: all computations will finish in finite time (with OWL Lite, in a short amount of time)
Social Logistics
Get organized
Make the decision & commitment
2 or 3 dedicated individuals
Small core group
• Bi-weekly conference calls, bi-monthly F2F
• Commitment & resources
• Participants willing and able to cover their costs
• Outside funding (DOE)
Special interests and needs form subgroup task forces
• Core group member(s)
• Outside experts
International representation & participation (Outreach & Community Building)
• conferences and mailing lists
• follow-up and individual
Collaborate with complementary/competing representations
Social Logistics: how we engendered buy-in from the field, which made life much easier
Take things in steps:
• Pathway Database vision -> Data Exchange Format as 1st step
• Data Exchange Format -> Release in Levels of increasing complexity
Level 1 supports Metabolic pathways, Level 2
Early success leads to early adoption, which increases the probability of overall project success.
Get “buy in” and get involvement; this leads to acceptance later
• Support the existing databases (BioCyc, WIT, BIND, etc.)
• Got database sources to agree to participate in the development to ensure that their DBs will be properly represented
• Got database sources to agree to export in the new format once it is defined
Social Logistics (cont’d)
Get “buy in” (continued)
• Community Involvement and Support: Core group (represents voice of community; small, committed), Mailing List, User community, Subgroups
• International Meetings and Presentations: Tool developers, Modelers, Users (researchers), Ontology developers, Database providers, Complementary representations (SBML, CellML), Like minds, General Community
Implementation of BioPAX
Designed using GKB Editor and Protégé
BioPAX uses OWL to define the schema
BioPAX instances store the data
BioPAX – Ontology
OWL (schema) Instances (Individuals) data
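The split between an OWL schema and the instances that hold the data can be sketched in plain Python. This is only an illustration under stated assumptions: the class and slot names (`Entity`, `Reaction`, `left`, `right`, and so on) are hypothetical stand-ins rather than the BioPAX Level 1 vocabulary, and the check is a simple type test, not OWL DL reasoning.

```python
# Toy "schema": class name -> (parent class, {slot name: expected value type}).
# All names here are hypothetical, chosen for illustration only.
SCHEMA = {
    "Entity": (None, {"name": str}),
    "PhysicalEntity": ("Entity", {}),
    "SmallMolecule": ("PhysicalEntity", {}),
    "Interaction": ("Entity", {}),
    "Reaction": ("Interaction", {"left": "PhysicalEntity",
                                 "right": "PhysicalEntity"}),
}

# Toy "instances": the data, expressed only in terms the schema defines.
INSTANCES = {
    "glucose": {"type": "SmallMolecule", "slots": {"name": ["glucose"]}},
    "atp": {"type": "SmallMolecule", "slots": {"name": ["ATP"]}},
    "g6p": {"type": "SmallMolecule", "slots": {"name": ["glucose 6-phosphate"]}},
    "adp": {"type": "SmallMolecule", "slots": {"name": ["ADP"]}},
    "hexokinase_rxn": {
        "type": "Reaction",
        "slots": {"left": ["glucose", "atp"], "right": ["g6p", "adp"]},
    },
}


def ancestors(cls):
    """Yield cls and then each of its superclasses, most specific first."""
    while cls is not None:
        yield cls
        cls = SCHEMA[cls][0]


def allowed_slots(cls):
    """Collect the slots a class defines or inherits."""
    slots = {}
    for c in ancestors(cls):
        for slot, expected in SCHEMA[c][1].items():
            slots.setdefault(slot, expected)
    return slots


def fits(value, expected, instances):
    """True if value is a string (for str slots) or an instance of the expected class."""
    if expected is str:
        return isinstance(value, str)
    target = instances.get(value)
    return (target is not None and target["type"] in SCHEMA
            and expected in ancestors(target["type"]))


def validate(instances):
    """Return a list of problems; an empty list means the data fits the schema."""
    problems = []
    for iid, inst in instances.items():
        cls = inst["type"]
        if cls not in SCHEMA:
            problems.append(f"{iid}: unknown class {cls!r}")
            continue
        slots = allowed_slots(cls)
        for slot, values in inst.get("slots", {}).items():
            if slot not in slots:
                problems.append(f"{iid}: slot {slot!r} is not defined for {cls}")
                continue
            for value in values:
                if not fits(value, slots[slot], instances):
                    problems.append(f"{iid}: {slot} value {value!r} does not fit the schema")
    return problems


print(validate(INSTANCES))  # → []
```

Keeping the schema separate from the instances means the same checking code works for any data set written against that schema, which is the practical benefit of a shared exchange format.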
Complex Relationships Captured
Ontology Slot Definitions
Integration -> Knowledge
Knowledge is Power
Data in the same format: Metabolic, Protein-Protein Interaction, Signal Transduction, Gene Regulation
Facilitates
• Centralized public pathway DB
• Share data between existing DBs
• Distribute public and proprietary data
• Knowledge Assembly
• Reasoning
A Common Exchange Language
Promotes collaboration (big science), accessibility
[Figure: databases, applications, and users exchanging data with BioPAX and without BioPAX; >100 DBs and tools]
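The contrast between exchanging data with and without a common language can be made concrete with a little arithmetic. The sketch below assumes, as a simplification not stated on the slide, that without a shared format every resource needs a one-way converter to every other resource, while with a shared format each resource needs only one exporter and one importer. The function names are hypothetical.

```python
def pairwise_converters(n: int) -> int:
    """One-way converters needed when each of n resources talks to every other directly."""
    return n * (n - 1)


def hub_converters(n: int) -> int:
    """Converters needed when each resource only reads and writes one shared format."""
    return 2 * n


for n in (5, 20, 100):
    print(n, pairwise_converters(n), hub_converters(n))
# 5 20 10
# 20 380 40
# 100 9900 200
```

Under that assumption, at the scale of 100 databases and tools the shared format replaces thousands of point-to-point converters with a few hundred.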
Consistency Checking: Nutrient-related analysis of a BioPAX knowledge base
[Figure legend: Known Nutrient set, Fired Reaction, Unfired Reaction, Essential compounds, Missing essential compound, Biomass]
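The slide does not spell out the algorithm, but the legend (fired and unfired reactions, essential and missing compounds) suggests a forward-propagation check. A minimal sketch, assuming reactions are modeled as (substrates, products) pairs and a reaction "fires" once all of its substrates are available; the compound and reaction names are placeholders, not real pathway data:

```python
def fire_reactions(reactions, nutrients):
    """Propagate availability from the nutrient set until no new reaction can fire.

    reactions: dict mapping reaction name -> (substrates, products)
    nutrients: iterable of compounds assumed to be available at the start
    Returns (available compounds, fired reactions, unfired reactions) as sets.
    """
    available = set(nutrients)
    fired = set()
    changed = True
    while changed:
        changed = False
        for name, (substrates, products) in reactions.items():
            if name not in fired and set(substrates) <= available:
                fired.add(name)
                available |= set(products)
                changed = True
    return available, fired, set(reactions) - fired


def missing_essentials(available, essentials):
    """Essential compounds that the fired reactions could not produce."""
    return set(essentials) - set(available)


# Placeholder network: A and C are nutrients; D and F are essential compounds.
reactions = {
    "r1": (["A"], ["B"]),
    "r2": (["B", "C"], ["D"]),
    "r3": (["E"], ["F"]),
}
available, fired, unfired = fire_reactions(reactions, ["A", "C"])
print(sorted(fired))                                      # → ['r1', 'r2']
print(sorted(unfired))                                    # → ['r3']
print(sorted(missing_essentials(available, ["D", "F"])))  # → ['F']
```

A missing essential compound points to a gap or inconsistency in the knowledge base: either a reaction is absent, or the nutrient set is incomplete.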
What Next?
• BioPAX future development
• Level 2, 3, future levels
• BOF (check schedule)
• Talk later today by Gary Bader at BioPathways SIG
• Poster in Main Conference (check program)
• Development of tools and API
• libBioPAX
• Semantic Web Life Science Initiatives
• BOF Sunday
BioPAX Supporting Groups
Databases
• BioCyc (www.biocyc.org)
• BIND (www.bind.ca)
• WIT (wit.mcs.anl.gov/WIT2)
• PharmGKB (www.pharmgkb.org)
Grants
• Department of Energy (Workshop)
Groups
• Memorial Sloan-Kettering Cancer Center: G. Bader, M. Cary, J. Luciano, C. Sander
• SRI Bioinformatics Research Group: P. Karp, S. Paley, J. Pick
• University of Colorado Health Sciences Center: I. Shah
• BioPathways Consortium: J. Luciano, E. Neumann, A. Regev, V. Schachter
• Argonne National Laboratory: N. Maltsev, E. Marland
• Samuel Lunenfeld Research Institute: C. Hogue
• Harvard Medical School: E. Brauner, D. Marks, J. Luciano, A. Regev
• NIST: R. Goldberg
• Stanford: T. Klein
• Columbia: A. Rzhetsky
• Dana Farber Cancer Institute: J. Zucker
Collaborating Organizations:
• Proteomics Standards Initiative (PSI)
• Systems Biology Markup Language (SBML)
• CellML
• Chemical Markup Language (CML)
The BioPAX Community
Exchange Formats in the Pathway Data Space
[Figure labels: Database Exchange Formats; Simulation Model Exchange Formats (SBML, CellML); PSI; Biochemical Reactions; Protein Interaction Networks; Rate Formulas; Metabolic Pathways (Low Detail, High Detail); Regulatory Pathways (Low Detail, High Detail)]
Level 1 BioPAX, Released July 2004
[Figure labels: BioPAX Level 1; Database Exchange Formats; Simulation Model Exchange Formats (SBML, CellML); PSI; Molecular Interactions (Pro:Pro, All:All); Interaction Networks (Molecular, Non-molecular; Pro:Pro, TF:Gene, Genetic); Genetic Interactions; Biochemical Reactions; Rate Formulas; Metabolic Pathways, Regulatory Pathways, and Small Molecules (each Low Detail, High Detail)]
Exchange Formats in the Pathway Data Space
[Figure labels: BioPAX; Database Exchange Formats; Simulation Model Exchange Formats (SBML, CellML); PSI; Molecular Interactions (Pro:Pro, All:All); Interaction Networks (Molecular, Non-molecular; Pro:Pro, TF:Gene, Genetic); Genetic Interactions; Biochemical Reactions; Rate Formulas; Metabolic Pathways, Regulatory Pathways, and Small Molecules (each Low Detail, High Detail)]