The European Molecular Biology Laboratory (EMBL) is supported by sixteen countries. It consists of the main Laboratory in Heidelberg (Germany), Outstations in Hamburg (Germany), Grenoble (France) and Hinxton (U.K.), and an external Research Programme in Monterotondo (Italy). http://www.embl.de/
EMBL: http://www.embl.de/ (from 1974) • EBI: http://www.ebi.ac.uk/ (from 1996)
The EBI Mission • To provide Bioinformatics Facilities for the Scientific Community • To become a flagship laboratory for research in bioinformatics • To provide bioinformatics training • To help disseminate standards & technologies
Role of Bioinformatics • To Support Experimental Biology • To Collect and Archive Data • To provide Framework and Integration • To give Easy Access to Data • To make New Discoveries through Data Analysis • To predict through modelling • To facilitate application and exploitation of academic research in Medicine, Agriculture, Health and Environment
Dramatic Changes in Biology over the last 5 years • Data Explosion & New Types of Data • Move towards High-Throughput Biology • Move towards Systems Biology • Much larger community – often naïve users • Growth of Applied Biology – molecular medicine, agriculture, food, environmental sciences
Bioinformatics [diagram labels] • Genomes • Literature • Expression profiling • Metabolic data • Proteome data • Biochemistry • Comparative genomics • Mutant/RNAi data • Hypotheses and in silico models
Molecules to Cells to Organisms [figure labels: Protein, E. coli, Genome, Genomes]
Systems Biology [figure: signalling pathway diagram running from Input to Output; labelled components: Adaptor, CheR, CheZ, CheB, CheA, CheW, CheY, FliM, Methyl, ATP, ADP, Pi]
Molecular Basis of Disease • p53 tumour suppressor core domain – cancers of many types • Cu-Zn Superoxide Dismutase – autosomal dominant amyotrophic lateral sclerosis
• Linking to Domain data, eFamily • Sequence Mapping, SIFTS • MSDchem ligand data • PQS biological assemblies • Electron Density Visualisation, AstexViewer • MSDPro, MSDlite • MSDsite Active sites • SSM fold matching • Surface Matching
From Structure To Biochemical Function: Gene → Protein → 3D Structure → Function. Given a protein structure: • Where is the functional site? • What is the multimeric state of the protein? • Which ligands bind to the protein? • What is the biochemical function?
High throughput • A new sequence every 4 seconds • 600 000 web requests a day • 100 000 users • 5-10 core databases • 20 000 000 cross-references • About 160 other databases
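The rates quoted on this slide can be converted between "per day" and "per second" with simple arithmetic. The sketch below is only an illustration; the helper names `per_day` and `per_second` are made up here, and the inputs are the slide's own round figures.

```python
# Convert the slide's headline figures between time scales.
# Helper names are illustrative only; inputs are the slide's round numbers.
SECONDS_PER_DAY = 24 * 60 * 60


def per_day(interval_seconds):
    """Events per day, given one event every `interval_seconds` seconds."""
    return SECONDS_PER_DAY / interval_seconds


def per_second(count_per_day):
    """Average events per second, given a daily total."""
    return count_per_day / SECONDS_PER_DAY


sequences_per_day = per_day(4)             # "a new sequence every 4 seconds"
requests_per_second = per_second(600_000)  # "600 000 web requests a day"

print(f"New sequences per day: {sequences_per_day:.0f}")
print(f"Average web requests per second: {requests_per_second:.1f}")
```

On these figures, one sequence every 4 seconds corresponds to 21 600 sequences a day, and 600 000 requests a day to roughly 7 requests a second on average.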
FTP
Year   Files (millions)   Terabytes
2001    4.5               11914
2002    5.6               11809
2003   13.5               43860
2004   17.3               60508
2005   26.3               85396
Web Servers
Year   Requests     Requests (millions)
2002   118631650    118
2003   255399724    255
2004   354235704    354
2005   482076196    482
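As a rough illustration of how fast web traffic grew, the yearly request totals above can be turned into year-on-year growth factors. This is a minimal sketch; the variable and function names are chosen here for illustration, and only the four totals come from the slide.

```python
# Yearly web-server request totals, copied from the slide.
REQUESTS = {
    2002: 118_631_650,
    2003: 255_399_724,
    2004: 354_235_704,
    2005: 482_076_196,
}


def growth_factors(totals):
    """Return {year: total / previous year's total} for consecutive years."""
    years = sorted(totals)
    return {
        year: totals[year] / totals[prev]
        for prev, year in zip(years, years[1:])
    }


factors = growth_factors(REQUESTS)
overall = REQUESTS[2005] / REQUESTS[2002]

for year, factor in factors.items():
    print(f"{year}: x{factor:.2f} relative to {year - 1}")
print(f"2002 to 2005 overall: x{overall:.2f}")
```

On these numbers, requests roughly doubled from 2002 to 2003 and then grew by a little over a third in each of the following two years, for about a fourfold increase overall.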
Distinct hosts served
Year   Number     Users (millions)
2002   1586883    1.5
2003   2784974    2.7
2004   3656109    3.6
2005   3919564    3.9
Dynamic pages: domains (2005)
 1. .uk (United Kingdom)               21.14%
 2. .com (Commercial)                  17.16%
 3. [unknown domain]                   13.37%
 4. [unresolved numerical addresses]   11.05%
 5. .edu (USA Higher Education)         5.29%
 6. .net (Networks)                     5.27%
 7. .fr (France)                        4.76%
 8. .it (Italy)                         4.68%
 9. .de (Germany)                       2.81%
10. .nl (Netherlands)                   2.00%
The Services of the EBI • Nucleotide sequences • Genes • Transcription information • Protein sequences • Protein families • Macromolecular structures • Molecular interactions • Pathways • Metabolic information • Scientific Literature
Structure of EBI: Services • Database Integration and External Services [organisation chart; names shown: Lopez; Apweiler, Stoesser; Stoehr, Zhu; Henrick; Brazma; Birney]
Structure of EBI: Research • Text Mining • Computational Genomics • Structural Proteomics • Phylogeny & Evolution • Neuroinformatics
• GKB: Pathways • EMBL-Bank: DNA sequences • SWISS-PROT + TrEMBL: Protein Sequences • EnsEMBL: Human Genome Gene Annotation • Array-Express: Microarray Expression Data • EMSD: Macromolecular Structure Data • IntAct: Protein Interactions
Integrative science demands integrative resources • EBI databases have a backbone of integrative links • 20 000 000 cross-references support trans-database navigation • Is this good enough? • sparse and coarse-grained • not straightforward to use
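To make "trans-database navigation" concrete, cross-references can be pictured as links between (database, record) pairs that a user follows from one resource to the next. The sketch below is a toy model, not how the EBI stores or serves its links; the record identifiers (`SEQ1`, `PROT1`, and so on), the `XREFS` table and the `reachable` function are all hypothetical.

```python
from collections import deque

# Hypothetical cross-reference table: each (database, record id) maps to
# the records it links to. Identifiers are invented for illustration.
XREFS = {
    ("EMBL-Bank", "SEQ1"): [("SWISS-PROT", "PROT1")],
    ("SWISS-PROT", "PROT1"): [("EMSD", "STRUCT1"), ("IntAct", "INT1")],
    ("EMSD", "STRUCT1"): [],
    ("IntAct", "INT1"): [],
}


def reachable(start, xrefs):
    """Breadth-first walk over cross-references, returning records in visit order."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        record = queue.popleft()
        order.append(record)
        for linked in xrefs.get(record, []):
            if linked not in seen:
                seen.add(linked)
                queue.append(linked)
    return order


for database, record_id in reachable(("EMBL-Bank", "SEQ1"), XREFS):
    print(f"{database}: {record_id}")
```

In this toy model a link either exists between two whole records or it does not, which is one way to read the slide's "sparse and coarse-grained" concern.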
Integrative science demands integrative resources. Major efforts involved in integration: • InterPro: database of protein families, domains and functional sites. • Integr8: data integration project co-ordinated by the EBI, to provide an integrated layer for the exploitation of genomic and proteomic data. • GRID technologies
European Patent Office • Support the inclusion of sequence data in the public databases • Development of tools to capture sequence data • Run their searches at the EBI • (similar arrangements in USA and Japan ensure exchange) • Analogous systems being developed for structure information
Industry Support • Current successful Industry programme for Pharma • Quarterly meetings • R&D Training - workshops • Industry Forum • Funded by subscriptions • New SME programme under development
New Data • Expression Data • Proteomic Data • ChIP-on-chip • Atlases • Electron tomographs • Human Variation • Disease Links • Metabolome Data • ??