Bioinformatics: Data-driven molecular biology

Bioinformatics: Data-driven molecular biology. Mikhail Gelfand, A.A. Kharkevich Institute for Information Transmission Problems, RAS, Moscow. 2nd Spanish-Russian Forum on Information and Communication Technologies, Madrid, 21-25 September 2009. Exponential increase of data volume.

Presentation Transcript


  1. Bioinformatics: Data-driven molecular biology
     Mikhail Gelfand, A.A. Kharkevich Institute for Information Transmission Problems, RAS, Moscow
     2nd Spanish-Russian Forum on Information and Communication Technologies, Madrid, 21-25 September 2009

  2. Exponential increase of data volume
     Red: papers (PubMed); blue: sequence fragments (GenBank); green: nucleotides (GenBank).
     Of 18 million papers in PubMed, ~675 thousand have keywords “bioinformat* OR comput*”.

  3. 622 complete genomes (bacteria)

  4. >45 thousand Google hits on “genome deciphered”
     Top 10 hits:
     • bioremediation
       • bacterium Pseudomonas
     • agriculture and biotech
       • crop and biofuel plant Sorghum
       • rice
     • medicine
       • pathogenic bacterium Staphylococcus
       • SARS (atypical pneumonia) virus
       • Brugia worm (elephantiasis)
     • individual genome (medicine)
       • James Watson
     • science / model organism
       • macaque
     • science / evolution
       • mammoth (mitochondrial)
       • platypus

  5. Sequencing is just the beginning
     Bacterial genome: several million nucleotides, 600 to 9,000 genes (~90% of a genome codes for proteins). This slide: 0.1% of the Escherichia coli genome.
     Human genome: 3 billion nucleotides, 25-30 thousand genes.
     • polymorphisms (individual differences): ~1 per 1,000 nucleotides
     • differences between human and chimpanzee: ~1 per 100 nucleotides
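
The rates on slide 5 translate into absolute numbers with simple arithmetic. The sketch below is illustrative only: the constant and function names are made up for this example, and the inputs are the rounded figures quoted on the slide, so the results are order-of-magnitude estimates.

```python
# Rounded figures quoted on the slide (order-of-magnitude values).
HUMAN_GENOME_NT = 3_000_000_000   # "3 billion nucleotides"
NT_PER_POLYMORPHISM = 1_000       # ~1 individual difference per 1,000 nucleotides
NT_PER_CHIMP_DIFFERENCE = 100     # ~1 human-chimpanzee difference per 100 nucleotides


def expected_differences(genome_length: int, nt_per_difference: int) -> int:
    """Expected number of differing positions at a rate of one per nt_per_difference."""
    return genome_length // nt_per_difference


individual_diffs = expected_differences(HUMAN_GENOME_NT, NT_PER_POLYMORPHISM)
chimp_diffs = expected_differences(HUMAN_GENOME_NT, NT_PER_CHIMP_DIFFERENCE)

print(f"Differences between two people: ~{individual_diffs:,}")
print(f"Differences between human and chimpanzee: ~{chimp_diffs:,}")
```

With these inputs, two human genomes differ at roughly 3 million positions, and human and chimpanzee genomes at roughly 30 million.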

  6. Not just genomes
     Other types of large-scale experiments / datasets:
     • State of the genome (gene expression)
       • methylation
       • nucleosome positioning
       • histone modifications
     • Transcriptomics, protein abundance (gene expression)
     • Protein-protein interactions
       • signaling etc.
       • functional complexes
     • Protein-DNA interactions (regulation)
     • etc. etc.

  7. Goals
     • Functional annotation of genes and proteins
       • biological function
       • regulation (in what conditions)
     • Functional annotation of genomes
       • metabolic reconstruction and modeling
       • regulatory networks and development
       • prediction of organism properties from its genome

  8. Applications: biotechnology
     • Improvement of production strains (chemistry, pharma, food industry)
       • via modeling of metabolic pathways
     • New enzymes (new functions, stress tolerance)
       • via sequencing and functional annotation
     • Biofuels
       • fast-growing, stress-tolerant plants; identification of genes
       • microbes as producers of ethanol or fatty acids: targeted genome design

  9. Applications: medicine and pharma
     • Personalized medicine
       • identification of predisposing alleles: lifestyle
       • pharmacogenomics (metabolic alleles)
       • diagnostics
     • Drug targets (chronic disease)
       • analysis of signaling pathways
     • Anti-infectives
       • identification of drug targets
     • Drug design; identification of drug candidates
       • modeling of protein structure and interactions of proteins with small molecules

  10. Methods. Integration of data
      • Systems biology: integration of diverse datasets for one organism
      • Comparative genomics: simultaneous analysis of genomic data for many organisms
      • Comparative systems biology: understanding the evolution of gene regulation and expression, signaling etc.
      • Comparative structural biology

  11. Bioinformatics in Russia
      • Few high-throughput experiments
      • Open data
      • Collaborations
      • Theory (evolution), methods, algorithms
      • Highlights:
      • Evolution (IITP RAS) and taxonomy (IPCB MSU)
      • Regulation (FBB MSU, GosNIIGenetika, IITP RAS, ICaG SB RAS)
      • Annotation (FBB MSU, IITP RAS)
      • Protein Structure (IPR RAS, IMB RAS, IPCB MSU, BF MSU)
      • Modeling
      • Metabolism (IPCB MSU, ICaG SB RAS)
      • Regulation (SpBSPU, ICaG SB RAS)
      • Drug design (IBMC RAMS)

  12. Research and Training Center “Bioinformatics”, Institute for Information Transmission Problems (5 years: 2003-2009)
      • Molecular evolution
      • Alternative splicing as a driver of evolution in eukaryotes
      • Positive selection
      • Comparative genomics of regulation in bacteria
      • Evolution of regulatory pathways
      • Protein-DNA interactions
      • Annotation
      • Gene recognition
      • Functional annotation
      • Regulation

  13. Comparative genomics in action: confirmed predictions
      • Regulatory mechanisms
        • riboswitches (riboflavin: vitamin B2; thiamin: vitamin B1)
        • antisense regulation of the methionine-cysteine pathway
        • role of the ribosome in zinc homeostasis
      • Regulators: NrdR, MtaR/MetR, CmbR, NiaR
      • Enzymes: FadE, ThiN, TenA, CobZ, CobX/CbiZ, PduX, NagP, NagB-II
      • Microcins (capistruin, Burkholderia thailandensis)
      • Transporters
        • ABC transporters with universal energizing components: Co, Ni, biotin (vitamin H), thiamin (vitamin B1), riboflavin (vitamin B2)
        • other: threonine, methionine, oligogalacturonides, N-acetylglucosamine, corrinoids, niacin, riboflavin, Co
      • Regulatory motifs: nitrogen fixation, fatty acid biosynthesis, iron homeostasis, catabolism of chitin and pectin
      • Regulatory sites: several dozen

  14. Functional annotation of genomes
      First Russian bacterial genome, Acholeplasma laidlawii (2008). Sequencing and proteomics: Institute of Physico-Chemical Medicine; annotation: IITP.
      ~1.5 Mb; ~1400 genes. Function established for ~80% of genes; metabolic reconstruction.
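
The figures on slide 14 can be turned into a rough annotation summary. This is a minimal sketch using only the approximate values quoted on the slide; the variable names are illustrative, and the bases-per-gene value is the average amount of genome per gene, not the average gene length.

```python
# Approximate figures quoted on the slide for Acholeplasma laidlawii.
GENOME_SIZE_BP = 1_500_000   # "~1.5 Mb"
GENE_COUNT = 1_400           # "~1400 genes"
ANNOTATED_PERCENT = 80       # function established for "~80%" of genes

# Average amount of genome per gene (includes intergenic regions).
bp_per_gene = GENOME_SIZE_BP / GENE_COUNT

# Integer arithmetic keeps the gene counts exact.
annotated_genes = GENE_COUNT * ANNOTATED_PERCENT // 100
unannotated_genes = GENE_COUNT - annotated_genes

print(f"~{bp_per_gene:.0f} bp of genome per gene")
print(f"~{annotated_genes} genes with an established function, "
      f"~{unannotated_genes} without")
```

On these numbers, roughly 1,120 genes have an assigned function and about 280 remain uncharacterized.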

  15. Publications (refereed)

  16. Collaborations (bold: ongoing; * former students)
      • European Laboratory of Molecular Biology *
      • Germany
        • Humboldt University, Berlin
        • Munich Technical University
      • France
        • Lyon University
      • United Kingdom
        • University of East Anglia
      • Spain
        • Center for Genome Regulation (Barcelona)
      • USA
        • MIT
        • Burnham Institute *
        • Lawrence Berkeley National Laboratory *
        • Stowers Institute *
        • Rutgers University
      • China
        • China-Germany Partner Institute of Molecular Genetics (Shanghai)
      • Industry
        • Biomax (Germany)
        • Integrated Genomics (USA)
