
Curation of the EcoCyc Database: The EcoCyc Update Project

Curation of the EcoCyc Database: The EcoCyc Update Project. Martha Arnaud Scientific Database Curator Bioinformatics Research Group SRI International. http://www.ecocyc.org. http://www.biocyc.org. EcoCyc Organization. EcoCyc collects information about multiple types of database objects


Presentation Transcript


  1. Curation of the EcoCyc Database: The EcoCyc Update Project • Martha Arnaud, Scientific Database Curator, Bioinformatics Research Group, SRI International • http://www.ecocyc.org • http://www.biocyc.org

  2. EcoCyc Organization • EcoCyc collects information about multiple types of database objects: • Pathway * • Reaction * • Compound * • Protein • Gene * • Transcription Unit * • Legend: * hierarchies • Diagram labels: Genes, Proteins, Pathway, Reactions, Compounds

  3. EcoCyc Statistics • 176 pathways • 992 enzymes • 1006 enzymatic reactions • 169 transporters • 828 transcription units • 1929 proteins have a comment (598 of these comments are longer than 300 characters)

  4. EcoCyc Pathway Information http://biocyc.org:1555/ECOLI/new-image?type=PATHWAY&object=ALANINE-VALINESYN-PWY&detail-level=2

  5. EcoCyc Pathway Information http://biocyc.org:1555/ECOLI/new-image?type=PATHWAY&object=ALANINE-VALINESYN-PWY&detail-level=2

  6. …viewed with “More Detail”
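The pathway URL on slides 4 and 5 carries three query parameters (`type`, `object`, `detail-level`). As a minimal sketch of how such a URL could be assembled and taken apart again, the Python below rebuilds it from those parts. The helper name `pathway_url` is hypothetical, the base address and parameter values are copied from the slide, and nothing here checks whether the server still answers at that address or which other `detail-level` values it accepts.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Base endpoint copied from the URL shown on slides 4 and 5.
BASE = "http://biocyc.org:1555/ECOLI/new-image"


def pathway_url(pathway_id, detail_level=2):
    """Hypothetical helper: build a pathway URL shaped like the slide's example."""
    query = urlencode({
        "type": "PATHWAY",
        "object": pathway_id,
        "detail-level": detail_level,
    })
    return f"{BASE}?{query}"


# Rebuild the slide's URL, then split it back into its parameters.
url = pathway_url("ALANINE-VALINESYN-PWY")
params = parse_qs(urlsplit(url).query)

print(url)
print(params)
```

Building the query with `urlencode` rather than string concatenation keeps the parameters escaped consistently if a pathway identifier ever contains reserved characters.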

  7. EcoCyc Protein Information • reaction • comment • citations

  8. EcoCyc Gene Information

  9. EcoCyc Metabolic Overview • Static or animated views of expression data • http://biocyc.org/ov-expr.shtml

  10. EcoCyc Curation • names and synonyms • gene classes • subunit composition of protein complexes • location of gene product • protein or complex molecular weight • enzyme activity name • enzyme properties (activators, inhibitors, cofactors) • comment fields • evidence • citations • reactions catalyzed • pathway information

  11. Build a new MOD or add a “Pathway Module”! • Pathway Tools Software: takes an annotated genome; generates a database, including pathway predictions; freely available (academics/non-profits) • Pathway Tools software environment for creation, curation, analysis, and Web publishing of MODs: http://bioinformatics.ai.sri.com/ptools/ • Contact: ptools-info@ai.sri.com • Current Pathway Tools Users: • Saccharomyces cerevisiae: SGD, Stanford University • Arabidopsis thaliana: Carnegie Institution of Washington • Plasmodium falciparum: Stanford University • Mycobacterium tuberculosis: Stanford University • Synechocystis: Carnegie Institution of Washington • Methanococcus jannaschii: EBI

  12. EcoCyc Strengths • Metabolism • Transport • Transcription regulation

  13. EcoCyc into the Future: “EcoCyc is not just metabolism anymore!” …an integrated, review-level information resource on E. coli genomics and biochemistry…

  14. The EcoCyc Update Project: • What do we need to do? Goals • Can we possibly get it done? Quantification • Where do we start? Priorities • How is it going? Progress

  15. EcoCyc Update: Curation Goals Curate every gene product: • literature-based descriptions • comprehensive reference lists • Expand database scope beyond metabolism, transporters, and transcription • Curate associated reactions and pathways • Stay current with the latest papers

  16. EcoCyc Update: Quantification • 4405 genes - 175 transcription factors - 168 transporters = 4062 genes to curate • Staffing: full-time curator (4 days/week on curation) plus part-time curator (70%) in years 2-4 • Year 1: 1600 hours • Year 2: 3000 hours • Year 3: 3000 hours • Year 4: 3000 hours • Total: 10,600 hours / 4062 genes = 2.6 hours per gene • Curation of abstracts
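The arithmetic on slide 16 can be checked in a few lines. This is a sketch that uses only the figures given on the slide; the variable names are illustrative.

```python
# Figures taken directly from slide 16.
TOTAL_GENES = 4405
TRANSCRIPTION_FACTORS = 175
TRANSPORTERS = 168
HOURS_PER_YEAR = {1: 1600, 2: 3000, 3: 3000, 4: 3000}

# Genes left once transcription factors and transporters are set aside.
genes_to_curate = TOTAL_GENES - TRANSCRIPTION_FACTORS - TRANSPORTERS

# Curation hours available over the four-year plan.
total_hours = sum(HOURS_PER_YEAR.values())

# Average time budget per gene.
hours_per_gene = total_hours / genes_to_curate

print(genes_to_curate)           # 4062
print(total_hours)               # 10600
print(round(hours_per_gene, 1))  # 2.6
```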

  17. EcoCyc Update: Priorities • 1. Problems raised by users and advisors • 2. Gene products that have new characterizations published in the literature • 3. Gene products that have not yet been thoroughly curated • 4. Gene products that have been curated, but have not been updated lately

  18. Where are we now? • 807 gene products curated • 807/4062 = 19.9% of the total (excluding transporters and transcription factors) • 4-year plan: curate 615 genes in Year 1 • We are meeting our goal!
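The progress figures on slide 18 can be reproduced from the numbers on slide 16. The slide does not say how the Year 1 target of 615 genes was derived; the sketch below assumes it is the Year 1 hours divided by the rounded rate of 2.6 hours per gene, which happens to give the same number.

```python
# Figures from slides 16 and 18.
GENES_TO_CURATE = 4062
CURATED = 807
YEAR1_HOURS = 1600
HOURS_PER_GENE = 2.6  # rounded rate from slide 16

# Share of the curation backlog completed so far.
fraction_done = CURATED / GENES_TO_CURATE

# Assumed derivation of the Year 1 target (not stated on the slide).
year1_target = round(YEAR1_HOURS / HOURS_PER_GENE)

# How far the curated count exceeds that target.
ahead_by = CURATED - year1_target

print(f"{fraction_done:.1%}")  # 19.9%
print(year1_target)            # 615
print(ahead_by)                # 192
```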

  19. The EcoCyc Collaboration • UNAM: Julio Collado-Vides, Project Leader; Socorro Gama-Castro, Curator; Martin Peralta, Curator • TIGR: Ian Paulsen, Project Leader; Mark Hance, Curator • UCSD: Milton Saier, Project Leader; Can Tran, Curator • SRI: Peter Karp, PI; Suzanne Paley, Software Engineer; John Pick, Software Engineer; Martha Arnaud, Curator • UCD: John Ingraham, Project Leader • MBL: Monica Riley, Editor Emerita • Funding: NIH National Center for Research Resources

  20. Pathway/Genome DBs Created by External Users • Saccharomyces cerevisiae, Stanford University: pathway.yeastgenome.org/biocyc/ • Plasmodium falciparum, Stanford University: plasmocyc.stanford.edu • Mycobacterium tuberculosis, Stanford University: BioCyc.org • Arabidopsis thaliana and Synechocystis, Carnegie Institution of Washington: Arabidopsis.org:1555 • Methanococcus jannaschii, EBI: Maine.ebi.ac.uk:1555 • Other PGDBs in progress by 40 other users • Software freely available • Each PGDB owned by its creator
