K-mer based strategies for studying infectious diseases
Dr Fredrik Vannberg
CDC - GEORGIA TECH - INTEL GENOMICS OF INFECTIOUS DISEASES WORKSHOP, JULY 16 2014
Background
• Oxford University
• Statistical genetics
• Harvard Medical School
• Dana Farber High Throughput DNA Sequencing Core
• Harvard Institute of Proteomics
• Harvard School of Public Health
• HIV genome phylogenetics
• Emory School of Medicine
• Tuberculosis virulence studies
Georgia Tech Genomic Center for Infectious Diseases
• Viral, Bacterial, Fungal, Parasite
• Genome Center
• Bioinformatics / High Performance Computing
• Administrative Core / Dissemination
Molecular Epidemiology
• Molecular epidemiology encompasses both host and pathogen
• Studies to understand host genetic variation
• Studies to understand pathogen genetic variation
• Phenotypic consequences of varying effect size for each variant within the host and pathogen
DNA sequence ‘variation’ refers to differences in sequences between individuals or alleles
Sequence1 ATCGGTCAATCGAT-CGT
Sequence2 ATCGGTCGATCGATACGT
How does variation relate to disease?
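The two aligned sequences on this slide differ at two columns. As an illustration only (the function name `aligned_differences` and the substitution/indel labels are assumptions, not taken from the talk), a minimal Python sketch that walks the alignment column by column could look like this:

```python
def aligned_differences(seq1, seq2, gap="-"):
    """Return (position, base1, base2, kind) for every differing column.

    Positions are 1-based. Both inputs must already be aligned, so they
    must have the same length; `gap` marks an insertion/deletion.
    """
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have equal length")
    diffs = []
    for pos, (a, b) in enumerate(zip(seq1, seq2), start=1):
        if a == b:
            continue
        # A gap character on either side means an indel, otherwise a substitution.
        kind = "indel" if gap in (a, b) else "substitution"
        diffs.append((pos, a, b, kind))
    return diffs


if __name__ == "__main__":
    seq1 = "ATCGGTCAATCGAT-CGT"
    seq2 = "ATCGGTCGATCGATACGT"
    for pos, a, b, kind in aligned_differences(seq1, seq2):
        print(pos, a, b, kind)
    # prints:
    # 8 A G substitution
    # 15 - A indel
```

For the slide's example this reports one substitution (A to G at column 8) and one single-base indel (column 15).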
Host genetic susceptibility studies
Wong, et al, PLoS Pathogens (2010)
Immunological quantitative traits
Fairfax, et al, Nature Genetics
Positive selection in the human genome related to pathogens
Grossman, et al, Cell (2013)
Global metrics for shared genetic information content
• Boolean
• Discrete Math
• Eigenvector/PCA
• Linear Algebra
Global metrics for shared genetic information content
• Discrete Math
• Linear Algebra
Global metrics for shared genetic information content
Cai Huang, PhD student
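The slides name a Boolean, discrete-math view of shared genetic content but do not spell out the metric. One common way to make that concrete, shown here purely as an assumption and not as the method used in the talk, is to record which k-mers are present in each sequence and compare the two sets with the Jaccard index. The helper names `kmer_set` and `jaccard`, and the choice of k, are hypothetical.

```python
def kmer_set(seq, k):
    """Return the set of all length-k substrings (k-mers) present in seq."""
    if k <= 0:
        raise ValueError("k must be positive")
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard index |A & B| / |A | B| of two sets; 0.0 if both are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


if __name__ == "__main__":
    # Toy inputs: the slide's example sequences with the gap character removed.
    s1 = "ATCGGTCAATCGATCGT"
    s2 = "ATCGGTCGATCGATACGT"
    k = 4  # assumed k-mer length, chosen only for illustration
    shared = jaccard(kmer_set(s1, k), kmer_set(s2, k))
    print(f"Jaccard similarity for k={k}: {shared:.3f}")
```

Because only presence or absence of each k-mer is used, no alignment is needed, which is part of the appeal of k-mer approaches for comparing many genomes. Real analyses would typically also count reverse-complement k-mers and use much longer k.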
Georgia Tech Computational Science and Engineering
Bioinformatics at Georgia Tech
Development and application of computational methods to questions in genomics, molecular and systems biology. Bioinformatics integrates approaches from computer science, mathematics and engineering for the processing, analysis and interpretation of biological data.
Georgia Tech relevant sub- and related disciplines:
• Computational genomics (Aluru, Borodovsky, Gibson, Hammer, Jordan, Konstantinidis, Stewart, McDonald, Vannberg, Yi)
• Molecular epidemiology (Jordan, Vannberg, Weitz)
• High performance computing (Aluru, Bader, Skolnick, Song, Vannberg)
• Algorithm development (Borodovsky, Heitsch, Jordan, Skolnick, Weitz, Vannberg)
• Modeling and simulation (Kemp, Lee, Song, Voit, Wang, Weitz, Weiss)