First Microme Jamboree – June, Monday 27 and Tuesday 28

MicroScope functionalities to support pathways curation. LABGeM team, Laboratory of Bioinformatic Analysis in Genomic and Metabolism, CEA/DSV/IG/Genoscope & CNRS UMR8030. The MicroScope platform.



Presentation Transcript


  1. First Microme Jamboree – June, Monday 27 and Tuesday 28. MicroScope functionalities to support pathways curation. LABGeM team, Laboratory of Bioinformatic Analysis in Genomic and Metabolism, CEA/DSV/IG/Genoscope & CNRS UMR8030.

  2. The MicroScope platform (http://www.genoscope.cns.fr/agc/microscope). Labelled in 2006 (RIO) and in 2009.
     • October 2002: beginning of the Acinetobacter baylyi ADP1 genome annotation.
     • Computational platform for the annotation and comparative analysis of bacterial genomes:
       - equipment (servers, disk storage, backups)
       - software and data
       - human resources (development, training, support)
     • => It offers the community of microbiologists high-technology resources for the automatic and expert analysis of genomic data.

  3. Usage of the platform
     • 859 personal accounts: 493 in France, 175 in Europe, 81 in the USA, and 110 in other countries.
     • Since 2004: 33 ‘genome’ papers (4 announcements). Specific genomic analysis: 22 other publications.
     • Expert annotations: 370,000 expert annotations; 5,000 expert annotations a month (2010).
     • About 980 bacterial genomes: 345 genomes annotated in the system (mostly sequenced at Genoscope and in the USA...) and 635 from public databanks.

  4. Three MicroScope components
     • Process management: JBPM workflows. More than 25 methods (syntactic, functional / relational) integrated in a workflow management system, covering primary databank updates, annotations, and analyses (jobs, releases, history) => full automation: genome annotation, primary data up to date.
     • Data management: PkGDB and MicroCyc databases (primary databanks, genome objects, computational results, pathway/genome databases). Vallenet D. et al. «MicroScope - a platform for microbial genome annotation and comparative genomics», Database, 2009.
     • Visualization: MaGe web interface. Vallenet D. et al. «MaGe - a microbial genome annotation system supported by synteny results», Nucleic Acids Research, 2006.
       Items shown on the slide: Login, Tutorial, Genome overview, Genome browser, Gene editor, Gene cart, Keyword search, Blast and Pattern, Phylogenetic Profile, Fusion / Fission, Tandem duplications, Minimal Gene Set, RGPfinder, SNPs / InDels, Synteny maps, LinePlot, Synton display, Artemis, CGView, KEGG, MicroCyc, Metabolic Profile, Pathway / Synteny, Data Export.

  5. Tools for the syntactic & functional annotation
     • Syntactic annotation
       - Public tools: RepSeek (repeats), Oriloc (oriC/terC position), tRNAscan-SE (tRNA genes), Blast on Rfam (snRNA genes).
       - “Homemade” tools: findrRNA (rRNA genes), AMIMat (gene models according to codon usage), AMIGene (based on GeneMark), MICheck (re-annotation of public bacterial genomes).
     • Functional annotation
       - Public tools: BLAST (searches in specialized databases and UniProt), InterProScan (domains and functional sites), COGnitor (COG protein families), PRIAM (enzymatic functions), Pathway Tools (metabolic pathways reconstruction), SignalP & TMHMM & PSORT (protein localisation).
       - “Homemade” tools: Syntonizer (gene context analysis) and, at the end, AutoFAssign, the automatic functional annotation procedure: Blast on ‘reference genome annotations’ & syntenies > HAMAP results > TIGRfam/Pfam results & Blast on UniProt.

  6. Classification of protein genes
     • Functional classifications from annotation tools
       - Gene Ontology (GO classification) <- InterProScan results
       - COG classification <- COGnitor results
     • Functional classifications (Gene Editor)
       - MultiFun (E. coli; M. Riley)
       - TIGR main roles
     • Other kind of classification: inspired by the ‘protein name confidence’ defined in PseudoCAP = Pseudomonas aeruginosa community annotation project (www.pseudomonas.com)

  7. Results available to correct/complete annotation
     • Annotations from reference genomes
     • MicroScope curated annotations
     • Synteny results on available complete bacterial genomes
     • TrEMBL contains functional annotations which often come from automatic procedures only: ‘IPMed?’ is used for proteins that may have an experimentally validated function.

  8. TrEMBL Blast similarities: example IPMed = Interesting PubMed?

  9. The MicroScope platform: data management -1-
     Relational database PkGDB (Prokaryotic Genome DataBase)
     • Data organisation and persistence:
       - Public/primary data
       - Data generated during the annotation process (analysis results and expert annotations)
     • One instance of PkGDB for all MicroScope projects
     • Collaborative annotation:
       - Annotator accounts and rights on sequences
       - Annotation history

  10. The MicroScope platform: data management -2-
     MicroCyc (http://www.genoscope.cns.fr/agc/microcyc). Today: 977 organisms, 20 GB.
     • Inputs: the bacterial genome, enzymatic activities prediction (PRIAM), EC numbers correspondence.
     • Experimentally elucidated metabolic pathways: 1600 pathways from 2000 organisms (P. Karp, SRI, USA).
     • Pathway Tools: a metabolic database is built for each annotated microbial genome. PGDB = Pathway/Genome Database (orgname_Cyc).

  11. «Metabolic profiles» functionality (PkGDB)
     • Select pathway classes
     • Select organisms to compare
     • Value shown for each pathway: number of reactions for pathway x in a given organism / total number of reactions in pathway x
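The ratio on slide 11 can be illustrated with a minimal Python sketch. It is not MicroScope code: the function name `pathway_completion`, the pathway names, and the reaction identifiers are all hypothetical, and the sketch assumes that each pathway is simply a set of reaction identifiers.

```python
from typing import Dict, Set


def pathway_completion(
    pathway_reactions: Dict[str, Set[str]],
    organism_reactions: Set[str],
) -> Dict[str, float]:
    """For each pathway, return (reactions found in the organism) / (reactions in the pathway)."""
    completion = {}
    for pathway, reactions in pathway_reactions.items():
        if not reactions:
            continue  # skip empty pathway definitions to avoid dividing by zero
        found = reactions & organism_reactions
        completion[pathway] = len(found) / len(reactions)
    return completion


# Hypothetical data: two pathways and the reactions predicted in one organism.
pathways = {
    "hypothetical_pathway_A": {"R1", "R2", "R3", "R4"},
    "hypothetical_pathway_B": {"R5", "R6"},
}
organism = {"R1", "R2", "R3", "R6", "R9"}

profile = pathway_completion(pathways, organism)
print(profile)
```

Computing this value for several organisms and placing the results side by side gives one row per pathway and one column per organism, which is the kind of comparison the slide describes.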

  12. Metabolic phyloprofile : example of results

  13. Using the “Keywords Search” functionality

  14. Available datasets to be explored?
     • Automatically annotated genes + validated genes
     • Only all/personal validated genes
     • Only annotations from databank files or from our annotation pipeline
     • Gene/protein features: G+C%, MW, pI
     • Specific fields of the gene editor: Comments/Note
     • BlastP/Synteny results against:
       - The set of genomes of the MicroScope project
       - Escherichia coli (updated annotation) or Bacillus subtilis (SubtiList database) annotations
       - The set of E. coli, B. subtilis, or P. aeruginosa essential genes
     • Genes involved in synteny groups and annotated as Protein of Unknown Function or Putative enzyme
     • The set of similarities obtained with different sources:
       - HAMAP (High-quality Automated/Manual Annotation)
       - SwissProt or TrEMBL databank, limited or not to blast hits having a possible interesting PubMedID
       - PRIAM enzymatic profiles (Enzyme Commission)
       - COG databank
       - InterPro databank
     • Genes encoding enzymes involved in KEGG and BioCyc metabolic pathways
     • The results obtained with SignalP, TMHMM, PsortB and Coiled Coil

  15. Query on P. putida annotation
     • Step 1: genes annotated as « unknown function » => 2093 results (35%)
     • Step 2: which ones have blast similarities (<> unknown functions) with UniProt entries linked to a PubMedID?

  16. Results of the query... Result: 216 genes (123 in SwissProt and 93 in TrEMBL). « Get gene » => 114 genes (can be re-annotated)
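The two-step query of slides 15 and 16 can be sketched as a simple filter. This is only an illustration in Python, not the MaGe query engine: the record layout (`label`, `product`, `hits`, `pubmed_ids`), the gene labels, and the placeholder PubMed identifiers are assumptions made for the example.

```python
def is_unknown(product: str) -> bool:
    """True when an annotation or a hit description carries no functional information."""
    return "unknown function" in product.lower()


def reannotation_candidates(genes):
    """Step 1: keep genes annotated as unknown function.
    Step 2: keep those with at least one hit that has a known function
    and is linked to a PubMed identifier."""
    selected = []
    for gene in genes:
        if not is_unknown(gene["product"]):
            continue
        for hit in gene["hits"]:
            if hit["pubmed_ids"] and not is_unknown(hit["product"]):
                selected.append(gene["label"])
                break
    return selected


# Hypothetical records; labels and identifiers are placeholders, not real data.
example_genes = [
    {"label": "GENE_A", "product": "protein of unknown function",
     "hits": [{"databank": "SwissProt", "product": "amine dehydrogenase subunit",
               "pubmed_ids": ["PMID_PLACEHOLDER"]}]},
    {"label": "GENE_B", "product": "conserved protein of unknown function",
     "hits": [{"databank": "TrEMBL", "product": "Protein of unknown function",
               "pubmed_ids": ["PMID_PLACEHOLDER"]}]},
    {"label": "GENE_C", "product": "protein of unknown function",
     "hits": [{"databank": "TrEMBL", "product": "putative transporter",
               "pubmed_ids": []}]},
    {"label": "GENE_D", "product": "tryptophan synthase alpha chain", "hits": []},
]

candidates = reannotation_candidates(example_genes)
print(candidates)
```

Only GENE_A passes both steps here: GENE_B matches an entry that is itself of unknown function, GENE_C has no PubMed link, and GENE_D already has a function.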

  17. Syntactic re-annotation of P. putida. Genes shown: PP3459, PP3460, PP3461, PP3462, PP3463, PP3464, PP3465, PP3466; PSEPK3868, PSEPK3872, PSEPK3873. Quinohemoprotein amine dehydrogenase.

  18. Bacterial synteny: parameters
     • Correspondence relationship = sequence similarity: BlastP Bidirectional Best Hit OR at least 30% identity on 80% of the shortest sequence
     • Co-localization: Gap = 5
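A minimal Python sketch of how the two parameters of slide 18 could be applied. It is not the Syntonizer implementation. It assumes that "80% of the shortest sequence" means alignment length divided by the length of the shorter protein, and that "Gap = 5" means at most five intervening genes between two corresponding genes on each genome. The `Hit` class and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    identity: float        # percent identity of the BlastP alignment
    aligned_length: int    # alignment length, in residues
    query_length: int
    subject_length: int
    is_bbh: bool           # True for a bidirectional best hit


def is_correspondence(hit, min_identity=30.0, min_coverage=0.8):
    """BBH, OR at least 30% identity over 80% of the shortest sequence."""
    if hit.is_bbh:
        return True
    shortest = min(hit.query_length, hit.subject_length)
    coverage = hit.aligned_length / shortest
    return hit.identity >= min_identity and coverage >= min_coverage


def group_colocalized(pairs, max_gap=5):
    """Group corresponding gene pairs (rank_in_A, rank_in_B) that are close on both
    genomes: at most max_gap genes lie between neighbours (assumed reading of Gap = 5).
    Returns only groups of two or more pairs."""
    unvisited = set(pairs)
    groups = []
    while unvisited:
        seed = unvisited.pop()
        group, stack = [seed], [seed]
        while stack:
            a, b = stack.pop()
            near = [p for p in unvisited
                    if abs(p[0] - a) - 1 <= max_gap and abs(p[1] - b) - 1 <= max_gap]
            for p in near:
                unvisited.remove(p)
                group.append(p)
                stack.append(p)
        if len(group) >= 2:
            groups.append(sorted(group))
    return sorted(groups)


# Hypothetical gene ranks on two genomes.
pairs = [(1, 10), (2, 11), (5, 14), (30, 40), (31, 41), (3, 100)]
syntons = group_colocalized(pairs)
print(syntons)
```

The pair (3, 100) is left out: it is close to the first group on genome A but far from it on genome B, so it does not satisfy the co-localization constraint on both genomes.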

  19. How to read the synteny maps?
     • ACIAD2450: a putative ortholog to ACIAD2440 on the E. coli genome.
     • ACIAD2440: a putative paralog to ACIAD2450, with two other co-localized ADP1 genes (in yellow).
     • Another putative paralog to ACIAD2450, elsewhere on the ADP1 chromosome.
     • This P. putida « ortholog » (PP0114) is in synteny with two other genes (coloured in blue-purple).
     • These two P. putida genes (PP0220 and PP4425) are similar to ACIAD2450 (putative paralogs of PP0114?).

  20. How are genes organized in a synteny group? -2-

  21. « Syntonome » results in the gene annotation editor: PkGDB proteomes; NCBI + WGS proteomes

  22. MicroScope web interfaces: MaGe
     • Menus: MicroScope project, Options, Authentication, Help
     • Genome Overview, Export, Synteny map
     • Annotation editor: EXPERT CURATION
     • Exploration: Artemis, Metabolic pathways, KeyWords, Blast / Motifs, Phylogenetic profiles, Fusions / Fissions, Genomic islands, Metabolic profiles, LinePlot, Synton visualization, CGView

  23. MicroScope tutorial

  24. Annotation data in the ‘Gene Validation’ section of the editor
     • This automatic information does not need to be changed.
     • This information must be completed or corrected by the annotator, with the help of the Analysis Results section.
     • This information is optional.

  25. New

  26. Adding gene-protein-reaction association (MetaCyc reactions). PP0082 = trpA gene. List of the predicted reactions linked to the gene. Step 1: click on EC to search for all MetaCyc reactions corresponding to the annotated EC number.

  27. Adding gene-protein-reaction association (MetaCyc reactions). PP0082 = trpA gene. Added for PP PP0083 = trpB gene

  28. Demo (David Vallenet): please go to http://www.genoscope.cns.fr/agc/microscope/
