Introduction to the Pathway Tools Software and BioCyc Database Collection


Presentation Transcript


  1. Introduction to the Pathway Tools Software and BioCyc Database Collection

  2. BioCyc Collection of 673 Pathway/Genome Databases • Pathway/Genome Database (PGDB) – combines information about • Pathways, reactions, substrates • Enzymes, transporters • Genes, replicons • Transcription factors/sites, promoters, operons • Tier 1: Literature-Derived PGDBs • MetaCyc • EcoCyc -- Escherichia coli K-12 • Tier 2: Computationally-derived DBs, Some Curation -- 28 PGDBs • HumanCyc • Mycobacterium tuberculosis • Tier 3: Computationally-derived DBs, No Curation -- 643 DBs
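
The tier counts on this slide add up to the stated collection size. A minimal Python sketch of that arithmetic, assuming Tier 1 consists of exactly the two databases named on the slide (MetaCyc and EcoCyc):

```python
# PGDB counts per tier, taken from the slide.
# Assumption: Tier 1 is exactly the two databases named (MetaCyc, EcoCyc).
tier_counts = {
    "Tier 1 (literature-derived)": 2,
    "Tier 2 (computationally derived, some curation)": 28,
    "Tier 3 (computationally derived, no curation)": 643,
}

total = sum(tier_counts.values())
print(total)
```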

  3. BioCyc Tier 3 • Tier 3 DBs created by subjecting annotated genomes to PathoLogic computational processing pipeline: • Predicts metabolic pathways • Predicts which genes code for missing enzymes in metabolic pathways • Infers transport reactions from transporter names • Predicts operons (bacteria) • Tier 3 genomes are regenerated on a regular basis
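
To make the idea of "missing enzymes in metabolic pathways" concrete, here is a toy Python sketch that lists the reactions of a reference pathway for which a genome has no annotated enzyme. It is an illustration only, not PathoLogic's actual algorithm: the function name, reaction IDs, and gene names are all hypothetical, and the sketch only finds the gaps rather than predicting which genes fill them.

```python
def find_pathway_holes(pathway_reactions, annotated_reactions):
    """Return the reactions in a reference pathway that have no annotated enzyme.

    pathway_reactions: ordered list of reaction IDs in the reference pathway.
    annotated_reactions: dict mapping reaction ID -> list of genes annotated to it.
    """
    return [rxn for rxn in pathway_reactions if rxn not in annotated_reactions]


# Hypothetical reference pathway and genome annotation (made-up IDs).
reference_pathway = ["RXN-A", "RXN-B", "RXN-C", "RXN-D"]
genome_annotation = {
    "RXN-A": ["gene1"],
    "RXN-C": ["gene7"],
    "RXN-D": ["gene9"],
}

holes = find_pathway_holes(reference_pathway, genome_annotation)
fraction_present = 1 - len(holes) / len(reference_pathway)
print(holes, fraction_present)
```

A real pipeline would weigh much more evidence than a simple presence check before calling a pathway present; the sketch only shows what a "pathway hole" is.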

  4. Source of BioCyc Tier 3 Genomes • Tier 3 is in transition • Majority of Tier 3 genomes from JCVI CMR • Some Tier 3 genomes from RefSeq • In 2011 we expect all Tier 3 genomes to be from RefSeq • Currently, only BioCyc genomes derived from CMR have ortholog data • In 2011, all Tier 3 genomes will have ortholog data

  5. Obtaining a PGDB for Organism of Interest • Find existing curated PGDB • Find existing PGDB in BioCyc • Create your own

  6. Pathway Tools Software: PGDBs Created Outside SRI • 2,100+ licensees: 180 groups applying software to 1,600 organisms • Saccharomyces cerevisiae, SGD project, Stanford University -- 135 pathways / 565 publications • Candida albicans, CGD project, Stanford University • dictyBase, Northwestern University • Mouse, MGD, Jackson Laboratory • Drosophila, FlyBase, Harvard University • Under development: C. elegans, WormBase • Arabidopsis thaliana, TAIR, Carnegie Institution of Washington -- 288 pathways / 2282 publications • PlantCyc, Carnegie Institution of Washington • Six Solanaceae species, Cornell University • GrameneDB, Cold Spring Harbor Laboratory • Medicago truncatula, Samuel Roberts Noble Foundation

  7. Pathway Tools Software: PGDBs Created Outside SRI • F. Brinkman, Simon Fraser Univ, Pseudomonas aeruginosa • Genoscope, Acinetobacter • M. Bibb, John Innes Centre, Streptomyces coelicolor • E. Uberbacher, ORNL, Multiple Bioenergy-related organisms • G. Serres, MBL, Shewanella oneidensis • R.J.S. Baerends, University of Groningen, Lactococcus lactis IL1403, Lactococcus lactis MG1363, Streptococcus pneumoniae TIGR4, Bacillus subtilis 168, Bacillus cereus ATCC14579 • Matthew Berriman, Sanger Centre, Trypanosoma brucei, Leishmania major • Sergio Encarnacion, UNAM, Sinorhizobium meliloti • Mark van der Giezen, University of London, Entamoeba histolytica, Giardia intestinalis • Michael Gottfert, Technische Universität Dresden, Bradyrhizobium japonicum • Artiva Maria Goudel, Universidade Federal de Santa Catarina, Brazil, Chromobacterium violaceum ATCC 12472

  8. Pathway Tools Software: PGDBs Created Outside SRI • Large scale users: • C. Medigue, Genoscope, 200+ PGDBs • G. Sutton, J. Craig Venter Institute, 80+ PGDBs • G. Burger, U Montreal, 60+ PGDBs • Bart Weimer, UC Davis, Lactococcus lactis, Brevibacterium linens, Lactobacillus acidophilus, Lactobacillus plantarum, Lactobacillus johnsonii, Listeria monocytogenes • Partial listing of outside PGDBs at http://biocyc.org/otherpgdbs.shtml

  9. Pathway Tools Overview (diagram) • Annotated Genome • MetaCyc Reference Pathway DB • PathoLogic • Pathway/Genome Database • Pathway/Genome Navigator • Pathway/Genome Editors • Reference: Briefings in Bioinformatics 11:40-79 2010

  10. Pathway Tools Software: PathoLogic • Computational creation of new Pathway/Genome Databases • Transforms genome into Pathway Tools schema and layers inferred information above the genome • Predicts operons • Predicts metabolic network • Predicts which genes code for missing enzymes in metabolic pathways • Infers transport reactions from transporter names Bioinformatics 18:S225 2002

  11. Pathway Tools Software: Pathway/Genome Editors • Interactively update PGDBs with graphical editors • Support geographically distributed teams of curators with object database system • Gene editor • Protein editor • Reaction editor • Compound editor • Pathway editor • Operon editor • Publication editor

  12. What is Curation? • Ongoing updating and refinement of a PGDB • Correcting false-positive and false-negative predictions • Incorporating information from experimental literature • Authoring of comments and citations • Updating database fields • Gene positions, names, synonyms • Protein functions, activators, inhibitors • Addition of new pathways, modification of existing pathways • Defining TF binding sites, promoters, regulation of transcription initiation and other processes

  13. Pathway Tools Software: Pathway/Genome Navigator • Querying and visualization of: • Pathways • Reactions • Metabolites • Proteins • Genes • Chromosomes • Two modes of operation: • Web mode • Desktop mode • Most functionality shared, but each has unique functionality

  14. Pathway Tools Ontology / Schema • Ontology classes: 1621 • Datatype classes: Define objects from genomes to pathways • Classification systems for pathways, chemical compounds, enzymatic reactions (EC system) • Protein Feature ontology • Controlled vocabularies: • Cell Component Ontology • Evidence codes • Comprehensive set of 248 attributes and relationships

  15. What is a Pathway? • A connected sequence of biochemical reactions • Occurs in one organism • Conserved through evolution • Regulated as a unit • Starts or stops at one of 13 common intermediate metabolites
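
The first criterion, "a connected sequence of biochemical reactions", can be sketched as a simple check that each reaction produces at least one substrate of the next. This is a toy illustration under stated assumptions: the dictionary layout, the function name, and the metabolite labels are hypothetical and do not reflect the Pathway Tools schema.

```python
def is_connected_sequence(reactions):
    """Return True if every reaction yields at least one substrate of the next one.

    reactions: ordered list of dicts with "substrates" and "products" lists.
    An empty or single-reaction list is trivially connected.
    """
    return all(
        set(prev["products"]) & set(nxt["substrates"])
        for prev, nxt in zip(reactions, reactions[1:])
    )


# Hypothetical metabolites A..D; "linear" chains A -> B -> C.
linear = [
    {"substrates": ["A"], "products": ["B"]},
    {"substrates": ["B"], "products": ["C"]},
]
# "broken" has a gap: the first reaction makes B, but the second consumes D.
broken = [
    {"substrates": ["A"], "products": ["B"]},
    {"substrates": ["D"], "products": ["C"]},
]

print(is_connected_sequence(linear), is_connected_sequence(broken))
```

The other criteria on the slide (conservation, regulation as a unit, boundaries at common intermediates) are biological judgments that a check like this does not capture.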

  16. Enzyme Data Available in MetaCyc • Reaction(s) catalyzed • Alternative substrates • Activators, inhibitors, cofactors, prosthetic groups • Subunit structure • Genes • Features on protein sequence • Cellular location • pI, molecular weight • Gene Ontology terms • Links to other bioinformatics databases

  17. Family of Pathway/Genome Databases • EcoCyc • CauloCyc • AraCyc • MtbRvCyc • HumanCyc • MetaCyc

  18. Comparison of BioCyc to KEGG • KEGG approach: A static collection of reference pathway diagrams is color-coded to produce organism-specific views • KEGG vs MetaCyc: Resource on literature-derived pathways • KEGG maps are not pathways (Nuc Acids Res 34:3687 2006) • KEGG maps contain multiple biological pathways • KEGG maps are composites of pathways in many organisms -- they do not identify which specific pathways were elucidated in which organisms • KEGG has no literature citations, no comments, less enzyme detail • KEGG vs BioCyc organism-specific PGDBs • KEGG does not curate or customize pathway networks for each organism • Highly curated PGDBs now exist for important organisms such as E. coli, yeast, mouse, Arabidopsis • KEGG re-annotates entire genome for each organism

  19. Comparison of Pathway Tools to KEGG • Inference tools • KEGG does not predict presence or absence of pathways • KEGG lacks pathway hole filler, operon predictor • Curation tools • KEGG does not distribute curation tools • No ability to customize pathways to the organism • Pathway Tools schema much more comprehensive • Visualization and analysis • KEGG does not perform automatic pathway layout • No comparative pathway analysis

  20. Pathway Tools Implementation Details • Allegro Common Lisp • PC/Windows, Linux, Macintosh platforms • Ocelot object database • 420,000+ lines of code • Lisp-based WWW server at BioCyc.org • Manages 480+ PGDBs

  21. What We Will Not Cover Today

  22. Desktop BioCyc • Linux/PC, Windows/PC, Macintosh • Why install locally? • Runs faster • Significantly more functionality • More searches • Regulatory Overview, Genome Overview, Metabolite Tracing • Write programs using APIs • Access BioCyc when you can’t access the Internet • See: http://biocyc.org/download.shtml • Takes advantage of large monitors

  23. Comparative Analyses in Web and Desktop Modes • Web: Comparative analysis reports • Compare reaction complements • Compare pathway complements • Compare transporter complements • Both: Comparative genome browser • Both: Comparative pathway table • Desktop: Cellular Overview
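
Comparing the pathway (or reaction, or transporter) complements of two organisms comes down to set operations. The sketch below is a toy illustration, not the BioCyc report itself: the function name is hypothetical, and the pathway lists are made-up placeholders rather than data about any real organism.

```python
def compare_complements(first, second):
    """Split two collections of pathway names into shared and organism-specific sets."""
    first, second = set(first), set(second)
    return {
        "shared": sorted(first & second),
        "only_first": sorted(first - second),
        "only_second": sorted(second - first),
    }


# Made-up pathway complements for two hypothetical organisms.
organism_a = ["glycolysis", "TCA cycle", "histidine biosynthesis"]
organism_b = ["glycolysis", "TCA cycle", "nitrogen fixation"]

result = compare_complements(organism_a, organism_b)
for category, pathways in result.items():
    print(category, pathways)
```

The same function works unchanged for reaction or transporter identifiers, since it only relies on the items being hashable and sortable.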

  24. Metabolite Tracing

  25. Not Covered Today • Reachability analysis • Creation and editing of PGDBs • Search → Advanced
