
Disease, natural selection and the 1000 Genomes Project

Disease, natural selection and the 1000 Genomes Project. Elinor K. Karlsson Sabeti lab @ Harvard University and Broad Institute. Human migration & recent evolution. Most diversity. “Out of Africa” ~50,000 years ago. 1000 Genomes populations. 5 European populations. 5 East Asian


Presentation Transcript


  1. Disease, natural selection and the 1000 Genomes Project Elinor K. Karlsson Sabeti lab @ Harvard University and Broad Institute

  2. Human migration & recent evolution [Map labels: most diversity; “Out of Africa” ~50,000 years ago]

  3. 1000 Genomes populations • 5 European populations • 5 East Asian populations • 5 Southeast Asian populations • 5 African populations [Map labels: most diversity; “Out of Africa” ~50,000 years ago]

  4. Genomic Signals of Natural Selection [Figure: before vs. after; axes: prevalence, generations]

  5. Genomic Signals of Natural Selection [Figure: before vs. after; axes: prevalence, generations*] * In humans, tests are sensitive to events within the last ~50,000 years

  6. Genomic Signals of Natural Selection (before vs. after) Tests: • 1) Long-range correlations • iHS, XP-EHH [Figure labels: broken haplotype; long correlations]
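
The long-range correlation tests named on this slide (iHS, XP-EHH) are built on extended haplotype homozygosity (EHH): the chance that two haplotypes carrying the same core allele are identical all the way out to a given marker. The following is a minimal sketch of EHH only, not of iHS or XP-EHH themselves. The function name `ehh` and the string encoding of haplotypes are assumptions made for illustration.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core, end):
    """Extended haplotype homozygosity between marker indices core and end.

    haplotypes: equal-length sequences (e.g. strings of '0'/'1') that all
    carry the same core allele. Returns the probability that two haplotypes
    drawn without replacement are identical over the whole interval.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    lo, hi = min(core, end), max(core, end)
    # Group haplotypes by their exact sequence over the interval.
    classes = Counter(tuple(h[lo:hi + 1]) for h in haplotypes)
    identical_pairs = sum(comb(c, 2) for c in classes.values())
    return identical_pairs / comb(n, 2)


# Four haplotypes carrying the core allele at index 0 (hypothetical data).
carriers = ["1111", "1111", "1111", "1110"]
print(ehh(carriers, 0, 0))  # 1.0: identical at the core itself
print(ehh(carriers, 0, 3))  # 0.5: one haplotype has broken down by marker 3
```

A recently selected allele sits on haplotypes whose EHH decays slowly with distance, which is the "long correlations" pattern in the figure.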

  7. Genomic Signals of Natural Selection (before vs. after) Tests: • 1) Long-range correlations • iHS, XP-EHH • 2) High-frequency derived alleles [Figure: derived allele frequency vs. position; tree labels: ancestor, chimp, human, gorilla]
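
The second test looks for derived (newer) alleles at unusually high frequency. A minimal sketch, assuming the ancestral allele at a site has already been inferred from outgroup species such as the chimp and gorilla shown on the slide. The function name and the toy data are hypothetical.

```python
def derived_allele_frequency(sample_alleles, ancestral_allele):
    """Fraction of sampled chromosomes that carry a non-ancestral allele."""
    if len(sample_alleles) == 0:
        raise ValueError("no alleles sampled")
    derived = sum(1 for a in sample_alleles if a != ancestral_allele)
    return derived / len(sample_alleles)


# Hypothetical site: outgroups carry 'A', most sampled humans carry 'G'.
print(derived_allele_frequency("GGGA", "A"))  # 0.75
```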

  8. Genomic Signals of Natural Selection (before vs. after) Tests: • 1) Long-range correlations • iHS, XP-EHH • 2) High-frequency derived alleles • 3) High differentiation • FST [Figure: differentiation vs. position]
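
FST measures how strongly allele frequencies differ between populations. The sketch below uses the simple heterozygosity-based definition, FST = (HT - HS) / HT, for one biallelic site in two equally weighted populations. This is an assumed textbook estimator chosen for illustration; the slides do not say which FST estimator was used.

```python
def fst_two_populations(p1, p2):
    """FST at one biallelic site from allele frequencies p1 and p2.

    Assumes two populations with equal weight and no sample-size correction.
    """
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)          # expected heterozygosity, pooled
    if h_total == 0:
        return 0.0                              # site is fixed everywhere
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


print(fst_two_populations(0.5, 0.5))  # 0.0: no differentiation
print(fst_two_populations(0.0, 1.0))  # 1.0: fixed difference
print(fst_two_populations(0.1, 0.9))  # about 0.64
```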

  9. Localize: Composite of Multiple Signals • 1. Long-range correlation • 2. High frequency of newer allele • 3. Differentiation [Figure: each signal plotted against position]
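
One way to combine several signals into a single per-variant score is a naive-Bayes style product: each test contributes the posterior probability that the variant is the selected one, and the contributions are multiplied. The sketch below assumes the per-test likelihoods under "selected" and "unselected" are already available (for example from simulations) and that the tests are treated as independent. The names `composite_score` and `prior` are hypothetical, and this is an illustration of the idea rather than the exact Composite of Multiple Signals implementation.

```python
def composite_score(likelihood_pairs, prior):
    """Combine per-test evidence for one variant into a single score.

    likelihood_pairs: iterable of (P(score | selected), P(score | unselected)),
    one pair per test. prior: prior probability that this variant is the
    selected one (e.g. 1 / number of variants in the region).
    """
    if not 0 < prior < 1:
        raise ValueError("prior must be strictly between 0 and 1")
    score = 1.0
    for p_selected, p_unselected in likelihood_pairs:
        numerator = p_selected * prior
        score *= numerator / (numerator + p_unselected * (1 - prior))
    return score


# Two hypothetical variants scored by three tests each.
strong = [(0.8, 0.2), (0.7, 0.3), (0.9, 0.1)]
weak = [(0.5, 0.5), (0.4, 0.6), (0.5, 0.5)]
print(composite_score(strong, prior=0.5) > composite_score(weak, prior=0.5))  # True
```

Because the score is a product, a variant has to look unusual on all tests at once to rank highly, which is what narrows a broad region down to a few candidates.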

  10. Simulations: CMS narrows signal of selection • 1 Mb to 104 kb (dense genotype data) • 1 Mb to 89 kb (full sequence data) • 500-1500 variants to 100 • Causal variant among top 20 variants in 50% of tests • Causal variant was top variant in 25% of tests CMS on real data: 185 selected regions from Human Haplotype Map ...

  11. Human Haplotype Map signals

  12. 1000 Genomes Project = perfect for selection tests • Positive selection tests detect common variants (>20%) • 1000 Genomes Project: all variants over 1-5% frequency in population • Find candidate variants • Test function

  13. 1000 Genomes Project sequencing 2500 people

  14. 1000 Genomes Project (European, East Asian, Yoruban) 412 candidate regions: • 35 nonsynonymous SNPs • 147 with single gene • 88 with multiple genes • 48 lincRNAs • 56 eQTLs

  15. 1000 Genomes Project: West Africa (Yoruba)

  16. CMS signal overlapping gene eQTL

  17. CMS signal overlapping LINC eQTL

  18. CMS signal overlapping enhancer signal

  19. Non-synonymous mutation in TLR5 (Yorubans) TLR5: sensing & clearance of bacterial pathogens • dimerization and activation domain

  20. TLR5 variant lowers NF-κB response

  21. Selection + Association • Strength of selection: 1000 Genomes data, selection signal, population • Trait association: GWAS data, association signal, affected vs. unaffected

  22. Pathogens = recent, strong selective pressure Global migration and agricultural revolution ... • new pathogen environments • increased population density • closer contact with animal disease vectors • Neolithic demographic transition (~12,000 years ago)

  23. Pathogens = recent, strong selective pressure Many pathogen receptors / modifiers in selected regions • RHOA and OTUB1: Yersinia pestis • DAG1: Mycobacterium leprae • TLR1: H. pylori, M. leprae and others • TLR5: Salmonella typhimurium and others • LARGE: Lassa virus • DARC: Plasmodium vivax malaria • PVRL4: measles virus • VDR: Mycobacterium tuberculosis • APOL1: Trypanosoma brucei

  24. Few GWAS of pathogen susceptibility • 3% pathogen-related (NHGRI GWAS Catalogue)

  25. Project 1: Lassa fever susceptibility in West Africa • Lassa virus causes hemorrhagic fever, which kills >20,000 people each year • Endemic in West Africa: Nigeria, Mano River Union • Persistently infected rodent reservoir: M. natalensis

  26. Signal of selection at LARGE in Yoruba population Function of LARGE connected to Lassa fever: Sabeti PC et al, Nature (2007)

  27. 1000 Genomes Project West Africa populations Mende (Sierra Leone) Esan & Yoruban (Nigeria)

  28. Distinct populations genetically Figure not shown

  29. Selection signal at LARGE in all three populations Figure not shown Yoruba (Nigeria) Figure not shown Esan (Nigeria) Figure not shown Mende (Sierra Leone)

  30. Association signal at LARGE in Mende and Esan Figure not shown

  31. African populations have low LD and poor tagging • Illumina 2.5M Array: % captured (r² > 0.8)
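
A variant counts as "captured" by an array when some genotyped SNP is correlated with it at r² above the threshold. A minimal sketch of the standard pairwise r² from phased haplotypes, plus the r² > 0.8 rule from the slide. The function names and toy haplotypes are hypothetical.

```python
def r_squared(haplotypes):
    """LD r^2 between two biallelic sites.

    haplotypes: list of (a, b) pairs, each 0 or 1, one pair per chromosome.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes given")
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0                            # at least one site is monomorphic
    return d * d / denom


def is_captured(r2_with_array_snps, threshold=0.8):
    """True if any array SNP tags the variant above the r^2 threshold."""
    return any(r2 > threshold for r2 in r2_with_array_snps)


print(r_squared([(1, 1), (1, 1), (0, 0), (0, 0)]))  # 1.0: perfect LD
print(r_squared([(1, 1), (1, 0), (0, 1), (0, 0)]))  # 0.0: no LD
```

With low LD, fewer variants have any array SNP above the threshold, so a smaller percentage is captured.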

  32. Impute with 1000 Genomes data Imputation with 1000G Association analysis

  33. Imputation boosts association signal Figure not shown

  34. Association signal overlaps with selection signal Association in Mende (Sierra Leone) Figure not shown Association in Esan (Nigeria) Figure not shown Signal of selection in Yoruba (Nigeria) Figure not shown

  35. Next: more data • Selection scan with new 1000 Genomes data • Bigger GWAS with imputation • Combine selection and association genome-wide

  36. Project 2: Cholera in Bangladesh • Ancient disease • 5th century BC • Common disease • 50% exposed by age 15 • High fatality • historically up to 70% • higher in children • Risk is heritable • First-degree relatives have ~3x higher susceptibility

  37. CMS: ~300 regions of natural selection [Figure: CMS (selection) vs. position in genome (by chromosome)] * FPR = 0.1%
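
A false positive rate such as 0.1% can be tied to a score cutoff by looking at scores from data with no selection. The sketch below assumes the cutoff is calibrated empirically against a list of null scores (for example from neutral simulations); the slides do not state how the FPR was set, and `threshold_for_fpr` is a hypothetical helper.

```python
def threshold_for_fpr(null_scores, fpr):
    """Cutoff such that at most a fraction fpr of null scores exceed it."""
    if not 0 < fpr < 1:
        raise ValueError("fpr must be strictly between 0 and 1")
    if len(null_scores) == 0:
        raise ValueError("no null scores given")
    ranked = sorted(null_scores, reverse=True)
    allowed = int(fpr * len(ranked))   # number of null scores allowed above the cutoff
    return ranked[allowed]


# Toy null distribution: eight scores, allow a quarter of them to pass.
cutoff = threshold_for_fpr(list(range(8)), 0.25)
print(cutoff)  # 5: only the null scores 6 and 7 are strictly above it
```

Regions whose score is strictly above the cutoff would then be reported as candidates.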

  38. INRICH enrichment analysis of gene sets • Genes ~ IKBKG (p = 5×10⁻⁵) • Potassium ion transport (p = 2×10⁻³) • Kell blood group genes (p = 4×10⁻³) [Figure axis: chromosomes 1-22] http://atgu.mgh.harvard.edu/inrich/ (Lee et al. 2012)

  39. Top region of selection = top region of association [Figure: CMS (selection) and −log10(p) (association*)] * Test = PLINK DFAM (combines data from discordant sibships, parent-offspring trios and unrelated cases/controls)

  40. Selected genes link to NFκB / inflammasome

  41. Next: More data • 1. Natural selection scan: • NF-κB pathway & inflammasome enriched for selection in Bangladesh • Next: selection scan with 1000 Genomes data • 2. Cholera susceptibility association study • Strongest selected locus associated with cholera susceptibility • Next: GWAS with imputation • 3. Experimental follow-up • Cholera toxin stimulates inflammasome • Next: RNAi / RNA-Seq

  42. Same approach can be applied to other diseases Malaria, dengue fever, leprosy, tuberculosis ... • Strong (dead = no children) • Recent (many new pathogens in last 50,000 years) • Diverse (varies by population) Other diseases? Autoimmune disorders?

  43. Acknowledgements • Sabeti Lab: Pardis Sabeti, Shervin Tabrizi, Ilya Shlyakhter, Shari Grossman, Danny Park, Sameer Gupta, Yana Kamberov, Kristian Andersen, Rachel Sealfon, Stephen F. Schaffner, Ridhi Tariyal, Matt Stremlau, Stephen Gire, Christian Matranga, Sarah Winnicki, and everyone else! • MGH: Regina LaRocque, Jason Harris, Crystal Ellis, Christine Becker, Ed Ryan, Lynda Stuart, Steve Calderwood, Sarah Shin • Broad Institute: Colm O'Dushlaine, Phil Hyoun Lee (MGH), Shaun Purcell (MGH), Nick Patterson, Yves Boie, Andrew Crenshaw, Scott Mahan, Shannon Power, Genomics Platform • Sierra Leone: Richard Fonnie, Augustine Goba, Donald Grant, Simbirie Jalloh, Lansana Kanneh, Danielle Levy • Bangladesh: Firdausi Qadri, Atiqur Rahman • Nigeria: Christian Happi, Wunmi Omoniwa, Philomena Egiaghe, Odia Ikponmwosa • Tulane University: Robert Garry, John Schieffelin
