April 2006

Presentation Transcript


  1. Ensembl Training Xosé Mª Fernández European Bioinformatics Institute Sept 2008 April 2006

  2. Ensembl Training • Ensembl • What can we offer • EBI Training • Logistics

  3. Ensembl - Project • Joint project • EMBL – European Bioinformatics Institute (EBI) • Wellcome Trust Sanger Institute • Produce accurate, automatic genome annotation • Focused on selected eukaryotic genomes • Integrate external (distributed) biological data • Presentation of the analysis to all via the Web at http://www.ensembl.org • Open distribution of the analysis to the community • Development of open, collaborative software (databases and APIs)

  5. Collating genomic annotation

  6. Current Status • 21,916 protein-coding genes in the human genome (e! 50), with additional segments ‘predicted’ to be protein-coding genes • Assemblies: NCBI34, NCBI35, NCBI36

  8. The era of sequencing genomes • Anopheles gambiae • Aedes aegypti • Drosophila melanogaster • Dasypus novemcinctus • Loxodonta africana • Echinops telfairi • Tupaia belangeri • Homo sapiens • Pan troglodytes • Macaca mulatta • Otolemur garnettii • Mus musculus • Rattus norvegicus • Spermophilus tridecemlineatus • Cavia porcellus • Oryctolagus cuniculus • Erinaceus europaeus • Myotis lucifugus • Canis familiaris • Felis catus • Bos taurus • Monodelphis domestica • Ornithorhynchus anatinus • Gallus gallus • Xenopus tropicalis • Gasterosteus aculeatus • Oryzias latipes • Takifugu rubripes • Tetraodon nigroviridis • Danio rerio • Ciona intestinalis • Ciona savignyi • Caenorhabditis elegans • Saccharomyces cerevisiae

  9. Ensembl v50

  10. Ensembl Genomes

  12. Integration

  16. How can we help? • We want to know how we can help you to make the most of Ensembl: • Workshops to train users • What data do you use? (e.g. clinical cytogeneticists use Ensembl to design FISH probes; we are exploring adding additional DAS tracks) • Help you share information using DAS • Publish ‘case study’ or ‘protocol’ papers in journals widely used by the community • Attend conferences with hands-on sessions • Share bookmarks and configurations by setting up groups (with specific profiles for clinical molecular geneticists)

  17. 1800 bps in chr 11… ............................cccgtggagccacaccctagggttggccaatc tactcccaggagcagggagggcaggagccagggctgggcataaaagtcagggcagagcca tctattgcttgcaggagccagggctgggcataaaagtcagggcagagccatctattgctt ACATTTGCTTCTGACACAACTGTGTTCACTAGCAACCTCAAACAGACACCATGGTGCATC TGACTCCTGAGGAGAAGTCTGCCGTTACTGCCCTGTGGGGCAAGGTGAACGTGGATGAAG TTGGTGGTGAGGCCCTGGGCAGGGTTGGTATCAAGGTTACAAGACAGGTTTAAGGAGACC AATAGAAACTGGGCATGTGGAGACAGAGAAGACTCTTGGGTTTCTGATAGGCACTGACTC TCTCTGCCTATTGGTCTATTTTCCCACCCTTAGCTGCTGGTGGTCTACCCTTGGACCCAG AGGTTCTTTGAGTCCTTTGGGGATCTGTCCACTCCTGATGCTGTTATGGGCAACCCTAAG GTGAAGGCTCATGGCAAGAAAGTGCTCGGTGCCTTTAGTGATGGCCTGGCTCACCTGGAC AACCTCAAGGGCACCTTTGCCACACTGAGTGAGCTGCACTGTGACAAGCTGCACGTGGAT CCTGAGAACTTCAGGgtgagtctatgggacgcttgatgttttctttccccttcttttcta tggttaagttcatgtcataggaaggggataagtaacagggtacagtttagaatgggaaac agacgaatgattgcatcagtgtggaagtctcaggatcgttttagtttcttttatttgctg ttcataacaattgttttcttttgtttaattcttgctttctttttttttcttctccgcaat ttttactattatacttaatgccttaacattgtgtataacaaaaggaaatatctctgagat acattaagtaacttaaaaaaaaactttacacagtctgcctagtacattactatttggaat atatgtgtgcttatttgcatattcataatctccctactttattttcttttatttttaatt gatacataatcattatacatatttatgggttaaagtgtaatgttttaatatgtgtacaca tattgaccaaatcagggtaattttgcatttgtaattttaaaaaatgctttcttcttttaa tatacttttttgtttatcttatttctaatactttccctaatctctttctttcagggcaat aatgatacaatgtatcatgcctctttgcaccattctaaagaataacagtgataatttctg ggttaaggcaatagcaatatctctgcatataaatatttctgcatataaattgtaactgat gtaagaggtttcatattgctaatagcagctacaatccagctaccattctgcttttatttt atggttgggataaggctggattattctgagtccaagctaggcccttttgctaatcatgtt catacctcttatcttcctcccacagCTCCTGGGCAACGTGCTGGTCTGTGTGCTGGCCCA TCACTTTGGCAAAGAATTCACCCCACCAGTGCAGGCTGCCTATCAGAAAGTGGTGGCTGG TGTGGCTAATGCCCTGGCCCACAAGTATCACTAAGCTCGCTTTCTTGCTGTCCAATTTCT ATTAAAGGTTCCTTTGTTCCCTAAGTCCAACTACTAAACTGGGGGATATTATGAAGGGCC TTGAGCATCTGGATTCTGCCTAATAAAAAACATTTATTTTCATTGCaatgatgtatttaa attatttctgaatattttactaaaaagggaatgtgggaggtcagtg.............. 
............................cccgtggagccacaccctagggttggccaatc tactcccaggagcagggagggcaggagccagggctgggcataaaagtcagggcagagcca tctattgcttgcaggagccagggctgggcataaaagtcagggcagagccatctattgctt acatttgcttctgacacaactgtgttcactagcaacctcaaacagacaccatggtgcatc tgactcctgaggagaagtctgccgttactgccctgtggggcaaggtgaacgtggatgaag ttggtggtgaggccctgggcagggttggtatcaaggttacaagacaggtttaaggagacc aatagaaactgggcatgtggagacagagaagactcttgggtttctgataggcactgactc tctctgcctattggtctattttcccacccttagctgctggtggtctacccttggacccag aggttctttgagtcctttggggatctgtccactcctgatgctgttatgggcaaccctaag gtgaaggctcatggcaagaaagtgctcggtgcctttagtgatggcctggctcacctggac aacctcaagggcacctttgccacactgagtgagctgcactgtgacaagctgcacgtggat cctgagaacttcagggtgagtctatgggacgcttgatgttttctttccccttcttttcta tggttaagttcatgtcataggaaggggataagtaacagggtacagtttagaatgggaaac agacgaatgattgcatcagtgtggaagtctcaggatcgttttagtttcttttatttgctg ttcataacaattgttttcttttgtttaattcttgctttctttttttttcttctccgcaat ttttactattatacttaatgccttaacattgtgtataacaaaaggaaatatctctgagat acattaagtaacttaaaaaaaaactttacacagtctgcctagtacattactatttggaat atatgtgtgcttatttgcatattcataatctccctactttattttcttttatttttaatt gatacataatcattatacatatttatgggttaaagtgtaatgttttaatatgtgtacaca tattgaccaaatcagggtaattttgcatttgtaattttaaaaaatgctttcttcttttaa tatacttttttgtttatcttatttctaatactttccctaatctctttctttcagggcaat aatgatacaatgtatcatgcctctttgcaccattctaaagaataacagtgataatttctg ggttaaggcaatagcaatatctctgcatataaatatttctgcatataaattgtaactgat gtaagaggtttcatattgctaatagcagctacaatccagctaccattctgcttttatttt atggttgggataaggctggattattctgagtccaagctaggcccttttgctaatcatgtt catacctcttatcttcctcccacagctcctgggcaacgtgctggtctgtgtgctggccca tcactttggcaaagaattcaccccaccagtgcaggctgcctatcagaaagtggtggctgg tgtggctaatgccctggcccacaagtatcactaagctcgctttcttgctgtccaatttct attaaaggttcctttgttccctaagtccaactactaaactgggggatattatgaagggcc ttgagcatctggattctgcctaataaaaaacatttattttcattgcaatgatgtatttaa attatttctgaatattttactaaaaagggaatgtgggaggtcagtg..............
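
The mixed-case sequence on slide 17 can be split into its uppercase and lowercase stretches programmatically. What follows is a minimal Python sketch, assuming that uppercase runs mark the annotated (exonic) blocks and lowercase marks flanking or intronic sequence; the slide itself does not state this convention. The function name `uppercase_blocks` and the short example string are hypothetical and are not taken from the slide.

```python
import re


def uppercase_blocks(seq):
    """Return (start, end, bases) for each uppercase run.

    Coordinates are 0-based and end-exclusive, counted after padding is removed.
    Assumption: uppercase marks annotated blocks, lowercase everything else.
    """
    # Drop spaces, line breaks and the dotted padding used on the slide.
    cleaned = re.sub(r"[^A-Za-z]", "", seq)
    return [(m.start(), m.end(), m.group()) for m in re.finditer(r"[A-Z]+", cleaned)]


# Hypothetical toy input laid out like the slide (dots, spaces, mixed case).
example = "....ccaatc ACATTT GCTTgtgagt ctatCTCC TGGGaatg...."
blocks = uppercase_blocks(example)
for start, end, bases in blocks:
    print(start, end, bases)
```

Applied to the full slide text, the same function would list each uppercase block with its offset inside the 1800 bp window; mapping those offsets to chromosome 11 coordinates would need the window's start position, which the slide does not give.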

  18. 1800 bps in chr 11… • CAP • Enhancer • Promoter • Poly(A)

  19. Compara • Multiple alignments • Constrained elements • Syntenies

  20. Sequence Conservation Over Time

  21. Pan comparative analysis

  22. Functional Genomics • Integrates diverse genome-wide functional and epigenetic data to annotate the active genome • Components: • Ensembl Regulatory Build • ChIP-seq analysis • DNA methylation resources • External collaborative projects

  23. Regulatory Features

  24. ENCODE: Providing a map of the genome • Pilot project completed in 2007: 1% of human genome • Assess every possible computational and experimental approach • Comparative genomics, sequencing, expression, ChIP-chip, etc. • Summary of results: • Majority of human bases are transcribed • Identification of many novel non-protein-coding transcripts • Identification of transcription start sites • Deciphering enhancer and regulatory regions of the genome • Regulatory elements are on either side of the transcription start site • Chromatin accessibility and histone modification patterns are very predictive of presence and activity of transcription start sites • DNA replication timing correlates with chromatin structure • ENCODE2 (started 2008) extended to 100% of genome

  25. Gene concept post-ENCODE Gene as discrete unit • Union of genomic sequences encoding a coherent set of potentially overlapping functional products. • Statistical model to help interpret and provide concise summarisation to potentially noisy experimental data.
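
The "union of genomic sequences" wording on slide 25 can be pictured as merging the coordinate intervals of overlapping products. The sketch below is a deliberate simplification under stated assumptions: intervals are end-inclusive integer coordinates on one strand, and the two transcripts and the function name `union_of_intervals` are hypothetical. It illustrates the set operation only, not how Ensembl or ENCODE build gene models.

```python
def union_of_intervals(intervals):
    """Merge overlapping or directly adjacent (start, end) intervals.

    Coordinates are integers and end-inclusive; the input order does not matter.
    """
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical exon coordinates for two overlapping transcripts.
transcript_a = [(100, 200), (400, 500)]
transcript_b = [(150, 250), (400, 450), (700, 800)]

gene_footprint = union_of_intervals(transcript_a + transcript_b)
print(gene_footprint)
```

The merged list is the combined footprint of both products, which is one way to read "union" in the definition above.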

  26. Every individual has a unique genome

  27. SNPs in Ensembl • GeneSNPView • Gene Variation Report • Variations in region of gene • Variations and consequences

  28. Jim, Craig, YanHuang No 1, Marjolein… Jimome vs Craigome • Craig Venter: • Sequence & analysis since 2003 • 32 million sequences (20 billion bp) • More variability than anticipated • Jim Watson: • 454 technology (7.4x) • 100 million unpaired reads (25 billion bp) • $1,000,000 • “The Diploid Genome Sequence of an Individual Human” PLoS Biology 5(10): 2113-2144 (2007) • “The Complete Genome of an Individual by Massively Parallel DNA Sequencing” Nature 452: 872-876 (2008)
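
The read counts and base totals on slide 28 imply a mean read length for each project, and a rough fold coverage once a genome size is assumed. The Python sketch below works through that arithmetic. The genome size of 3.1 billion bp is an assumption introduced here, not a figure from the slide, and the function names are illustrative.

```python
def mean_read_length(total_bases, n_reads):
    """Average number of bases per read."""
    return total_bases / n_reads


def fold_coverage(total_bases, genome_size):
    """Raw fold coverage: sequenced bases divided by genome length."""
    return total_bases / genome_size


# Assumption (not from the slide): haploid human genome of about 3.1 billion bp.
ASSUMED_GENOME_SIZE = 3.1e9

# Figures quoted on the slide.
venter_len = mean_read_length(20e9, 32e6)    # 20 billion bp in 32 million sequences
watson_len = mean_read_length(25e9, 100e6)   # 25 billion bp in 100 million reads
watson_cov = fold_coverage(25e9, ASSUMED_GENOME_SIZE)

print(venter_len, watson_len, round(watson_cov, 2))
```

Under these assumptions the mean read lengths come out at 625 bp and 250 bp. The raw coverage estimate for the Watson data is slightly above the 7.4x quoted on the slide; the quoted value presumably rests on a different denominator or on filtered reads, which the slide does not specify.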

  29. Spot the difference • Venter TTCTTCATTGGGCCGAACTTTCTGGTCCTCATCCAACAGCTCTTCTATCAYGTGTTCGAAAGTGTCAGCCAATGATGTCAAGCCTCTTGAACCTGCCTTGGGCCCATTCACGCTCTCCAGAGTCCCATGGGTCCGCACACCTGGGTAGGCCAAGCCACCTTGTCCTCGGATGTTTGCTTCTTTCATGGGGGCAGCCTTCATGCAACCAAAGTATGAAATAACCATAGTAAGGAAAAGGATGGTCATCACTCTTCTCACCTGGTGGAACTGTAGGGAGAAAGCAGAAACAAGACAGAAAACTGGTTAGGGCTTTCTTTCACCGGGATGCCATGTGGCCCATCTGATTGTAATTCCAGGCCATTCT • Watson TTCTTCATTGGGCCGAACTTTCTGGTCCTCATCCAACAGCTCTTCTATCATGTGTTCGAAAGTGTCAGCCAATGATGTCAAGCCTCTTGAACCTGCCTTGGGCCCATTCACGCTCTCCAGAGTCCCATGGGTCCGCACACCTGGGTAGGCCAAGCCACCTTGTCCTCGGATGTTTGCTTCTTTCATGGGGGCAGCCTTCATGCAACCAAAGTATGAAATAACCATAGTAAGGAAAAGGATGGTCATCACTCTTCTCACCTGGTGGAACTGTAGGGAGAAAGCAGAAACAAGACAGAAAACTGGTTAGGGCTTTCTTTCACCGGGATGCCATGTGGCCCATCTGATTGTAATTCCAGGCCATTCT
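
The difference on slide 29 can be found by a position-by-position comparison. The sketch below uses a 14-base excerpt copied from the two sequences on the slide and expands IUPAC ambiguity codes, where `Y` stands for C or T. Reading the `Y` as a heterozygous C/T site in the Venter sequence is an interpretation, not something the slide states; the function name `differences` is illustrative.

```python
# Subset of the standard IUPAC nucleotide codes (two-base ambiguity codes only).
IUPAC = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
    "R": {"A", "G"}, "Y": {"C", "T"}, "S": {"C", "G"},
    "W": {"A", "T"}, "K": {"G", "T"}, "M": {"A", "C"},
}


def differences(seq_a, seq_b):
    """Return (position, base_a, base_b) wherever two equal-length sequences differ.

    Positions are 0-based.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


# Excerpts taken from the Venter and Watson sequences on the slide.
venter = "TCTATCAYGTGTTC"
watson = "TCTATCATGTGTTC"

diffs = differences(venter, watson)
for pos, a, b in diffs:
    print(pos, a, sorted(IUPAC[a]), b, sorted(IUPAC[b]))
```

The only mismatch in the excerpt is the `Y` against `T`. Since `Y` covers both C and T, the two sequences are compatible at that position but not identical.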

  30. TranscriptSNPView • SNP in different strains • Variations and consequences • Individual genotypes • Variations in region of gene

  31. HapMap “The International HapMap Project” Nature 426, 789–796 (18 Dec 2003)

  32. BioMart

  33. European Genotype Archive http://www.ebi.ac.uk/ega/

  34. Ensembl Training • Ensembl • What can we offer • EBI Training • Logistics

  35. Databases at EBI • Literature and ontologies: CitExplore, GO • Nomenclature: HGNC • Genomes: Ensembl, Integr8 • Nucleotide sequence: EMBL Archive • Proteomes: UniProt, PRIDE • Gene expression: ArrayExpress • Protein structure: ePDB • Protein families, motifs and domains: InterPro • Chemical entities: ChEBI • Protein interactions: IntAct • Pathways: Reactome • Systems: BioModels

  36. A tripartite user-training programme • Training any time, anywhere, at any pace • Training comes to you • Hands-on user training on all our core data resources for lab-based researchers

  37. Interactive training for all levels of experience • Hands-on training in our purpose-built IT training suite at EMBL-EBI, Hinxton, Cambridge • Learn from the EBI’s experts through a combination of talks and practical exercises • Take a two-day tour of all our core data resources, or focus in on specific data types • Full programme at www.ebi.ac.uk/training/handson

  38. Coming up in our Hands-on Training (2008–2009) • 6–8 October: A two-day dip into the EBI’s resources • 24–27 November: Programmatic access in Java: webservices and workflows • 19–22 January: Transcriptomics resources and data analysis • 23–26 February: Bioinformatics resources for protein structure • 16–18 March: Sequence to genes: genome informatics • 27–29 April: Programmatic access to biological databases • 11–15 May: A walk through EBI Bioinformatics Resources

  39. The Bioinformatics Roadshow • Supported under the EU Integrated Infrastructures Initiative FELICS (www.felics.org) • FELICS provides access to many of Europe’s most widely used data resources: EBI, Swiss Institute of Bioinformatics, BRENDA, and the European Patent Office • We provide hands-on training in a wide variety of data resources and tools, where you want it, when you want it and targeted to your organization’s needs • For more information see www.ebi.ac.uk/training/roadshow or e-mail copeland@ebi.ac.uk

  40. eLearning platform • Courses available: Ensembl; Sequence searching • Courses under development: ArrayExpress; UniProt; MSD/PDBe; PRIDE; Gene Ontology; Literature searching and mining; Patent searching

  41. eLearning platform (2)

  42. Each course is modular • A course contains 3–5 modules (~30 min each) • Modules contain… • Video tutorial: learn by watching and listening • Print tutorial: learn by reading • Quiz: learn by testing your understanding • Reflective task: learn by practicing

  43. Roadshow modules • Genomes: Ensembl, EMBL-Bank • Structures: MSD, PDBSum, ProFunc • Transcriptomes: ArrayExpress, Expression Profiler • Proteomes: UniProt, InterPro, IntAct, PRIDE, OLS • Mini modules: Web services; BioMart; SRS; Chemistry; GO/GOA; Alignments; Literature • Pathways: Reactome, BioModels

  44. Ensembl Training • Ensembl • What can we offer • EBI Training • Logistics

  45. Workshops 2007-2008 • UK 32 • Belgium 5 • US 13 (+) • Kenya 3 • Germany 7 • South Africa 3 (+) • Netherlands 6 • Spain 2 • Portugal 6 • Norway 2 • China 5

  46. Ensembl 2007-2008