
Lecture 22: Signatures of Selection and Introduction to Linkage Disequilibrium


tricia




Presentation Transcript


  1. Lecture 22: Signatures of Selection and Introduction to Linkage Disequilibrium November 12, 2012

  2. Last Time • Sequence data and quantification of variation • Infinite sites model • Nucleotide diversity (π) • Sequence-based tests of neutrality • Tajima’s D • Hudson-Kreitman-Aguade • Synonymous versus Nonsynonymous substitutions • McDonald-Kreitman

  3. Today • Signatures of selection based on synonymous and nonsynonymous substitutions • Multiple loci and independent segregation • Estimating linkage disequilibrium

  4. Using Synonymous Substitutions to Control for Factors Other Than Selection: dN/dS or Ka/Ks Ratios

  5. Types of Mutations (Polymorphisms)

  6. Synonymous versus Nonsynonymous SNPs • First- and second-position SNPs often change the amino acid • UCA, UCU, UCG, and UCC all code for serine • Third-position SNPs are often synonymous • The majority of positions are nonsynonymous • Not all amino acid changes affect fitness: allozymes
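The position effect described on this slide can be checked directly against the standard genetic code. The sketch below is illustrative only: the compact code table, the function name `synonymous_fraction`, and the choice to skip stop codons as starting points are assumptions, not part of the lecture.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... ('*' = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def synonymous_fraction(position):
    """Fraction of single-base changes at a codon position (0, 1, or 2)
    that leave the amino acid unchanged, over all sense codons."""
    same = total = 0
    for codon, aa in CODE.items():
        if aa == "*":
            continue  # assumption: stop codons are not used as starting points
        for base in BASES:
            if base == codon[position]:
                continue
            mutant = codon[:position] + base + codon[position + 1:]
            total += 1
            same += CODE[mutant] == aa  # a change to a stop counts as nonsynonymous
    return same / total

for pos in range(3):
    print(f"codon position {pos + 1}: {synonymous_fraction(pos):.2f} synonymous")
```

Under these assumptions the third position should show by far the largest synonymous fraction, which is the pattern the slide describes.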

  7. Synonymous & Nonsynonymous Substitutions • The synonymous substitution rate can be used to set the neutral expectation for the nonsynonymous rate • dS is the relative rate of synonymous substitutions per synonymous site • dN is the relative rate of nonsynonymous substitutions per nonsynonymous site • ω = dN/dS • If ω = 1, neutral evolution • If ω < 1, purifying selection • If ω > 1, positive Darwinian selection • For human genes, ω ≈ 0.1
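The interpretation of the dN/dS ratio can be written as a small decision rule. This is a minimal sketch: the function names and the tolerance `tol` around 1 are hypothetical, and a real analysis would use a statistical test rather than a fixed band.

```python
def omega(dn, ds):
    """Return dN/dS; dS must be positive for the ratio to be defined."""
    if ds <= 0:
        raise ValueError("dS must be > 0")
    return dn / ds

def interpret(w, tol=0.05):
    # tol is an arbitrary illustrative band around 1, not a significance test
    if w > 1 + tol:
        return "positive selection"
    if w < 1 - tol:
        return "purifying selection"
    return "consistent with neutrality"

# Hypothetical rates chosen so the ratio is near the value quoted for human genes
print(interpret(omega(dn=0.01, ds=0.10)))
```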

  8. Complications in Estimating dN/dS • Multiple mutations in a codon give multiple possible paths, e.g. CGT(Arg)->AGA(Arg) may proceed as CGT(Arg)->AGT(Ser)->AGA(Arg) or as CGT(Arg)->CGA(Arg)->AGA(Arg) • The two types of nucleotide substitution that produce SNPs, transitions and transversions, are not equally likely (http://www.mun.ca/biology/scarr/Transitions_vs_Transversions.html) • Back-mutations are invisible • Complex evolutionary models using likelihood and Bayesian approaches must be used to estimate dN/dS (also called KA/KS or KN/KS depending on method) (PAML package)

  9. dN/dS ratios for 363 mouse-rat comparisons (Hartl and Clark 2007) • Most genes show purifying selection (dN/dS < 1) • Some evidence of positive selection, especially in genes related to the immune system • interleukin-3: mast cells and bone marrow cells in the immune system

  10. McDonald-Kreitman Test • Conceptually similar to the HKA test • Uses only one gene • Contrasts the ratio of nonsynonymous to synonymous divergence between species with the ratio of nonsynonymous to synonymous polymorphism within species • Gene provides internal control for evolution rates and demography
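The McDonald-Kreitman comparison reduces to a 2x2 table of counts: nonsynonymous and synonymous differences, each split into fixed (divergence) and polymorphic. The sketch below computes a neutrality index and a two-sided Fisher exact p-value from such a table. The counts are illustrative, not taken from the lecture, and the function names are hypothetical.

```python
from math import comb

def neutrality_index(dn, ds, pn, ps):
    """(Pn/Ps) / (Dn/Ds); values below 1 suggest excess nonsynonymous divergence."""
    return (pn / ps) / (dn / ds)

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test for the table [[a, b], [c, d]]:
    sums the probabilities of all tables no more likely than the observed one."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):  # hypergeometric probability of x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Illustrative counts: fixed (Dn, Ds) and polymorphic (Pn, Ps) differences
dn, ds, pn, ps = 7, 17, 2, 42
print("NI =", round(neutrality_index(dn, ds, pn, ps), 3))
print("p  =", round(fisher_exact_two_sided(dn, ds, pn, ps), 4))
```

A small p-value together with a neutrality index below 1 would point toward positive selection on the gene, while an index above 1 would point toward segregating deleterious variants.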

  11. Application of McDonald-Kreitman Test • Aligned 11,624 gene sequences between human and chimp • Calculated synonymous and nonsynonymous substitutions between species (Divergence) and within humans (SNPs) • Identified 304 genes showing evidence of positive selection (blue) and 814 genes showing purifying selection (red) in humans • Positive selection: defense/immunity, apoptosis, sensory perception, and transcription factors • Purifying selection: structural and housekeeping genes Bustamante et al. 2005. Nature 437, 1153-1157

  12. Genes showing purifying (red) or positive (blue) selection in the human genome based on the McDonald-Kreitman Test Bustamante et al. 2005. Nature 437, 1153-1157

  13. How can you differentiate between effects of selection and demographic effects on sequence variation? Will this work for organellar DNA?

  14. Extending to Multiple Loci • So far, only considering dynamics of alleles at single loci • Loci occur on chromosomes, linked to other loci! “The fitness of a single locus ripped from its interactive context is about as relevant to real problems of evolutionary genetics as the study of the psychology of individuals isolated from their social context is to an understanding of man’s sociopolitical evolution” Richard Lewontin (quoted in Hedrick 2005) • Size of region that must be considered depends on Linkage Disequilibrium

  15. Gametic (Linkage) Disequilibrium (LD) • Nonrandom association of alleles at different loci into gametes • Haplotype: Genotype of a group of closely linked loci • LD is a major factor in evolution • LD itself provides insights into population history • Estimation of LD is critical for ALL population genetic data

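  16. Nomenclature and concepts • Two loci, two alleles • Frequency of allele i at locus 1 (Ai) is pi • Frequency of allele i at locus 2 (Bi) is qi [Figure: two chromosomes, one carrying A1 and B1 with frequencies p1 and q1, the other carrying A2 and B2 with frequencies p2 and q2]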

  17. Nomenclature and concepts • Genotype is written as A1B1/A2B2 • A1 and B1 are in coupling phase • A1 and B2 are in repulsion phase [Figure: chromosome pair A1 B1 / A2 B2]

  18. Gametic Disequilibrium • Easiest to think about physically linked loci, but not necessarily the case • What are the expected frequencies of gametes in a population under independent assortment? A1B1: p1q1, A1B2: p1q2, A2B1: p2q1, A2B2: p2q2 [Figure: meiosis in an A1B1/A2B2 individual producing gametes A1B1, A2B2, A1B2, and A2B1]

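  19. What are the expected frequencies of gametes with complete linkage? • Gamete frequencies: x11 (A1B1), x12 (A1B2), x21 (A2B1), x22 (A2B2) • Allele frequencies: p1 (A1), p2 (A2), q1 (B1), q2 (B2) [Figure: meiosis in an A1B1/A2B2 individual producing gametes A1B1, A2B2, A1B2, and A2B1]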

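  20. Linkage disequilibrium measure, D • Independent assortment: x11 = p1q1, x12 = p1q2, x21 = p2q1, x22 = p2q2 • With LD: x11 = p1q1 + D, x12 = p1q2 - D, x21 = p2q1 - D, x22 = p2q2 + D • Substituting from the table above: D = x11x22 - x12x21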
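As a worked illustration of D, the sketch below assumes the common convention in which x11, x12, x21, and x22 are the frequencies of gametes A1B1, A1B2, A2B1, and A2B2, and computes D both as the deviation of x11 from its independent-assortment expectation and as x11x22 - x12x21. The function name and the example frequencies are hypothetical.

```python
def linkage_disequilibrium(x11, x12, x21, x22):
    """Return (D as x11 - p1*q1, D as x11*x22 - x12*x21) for gamete frequencies."""
    total = x11 + x12 + x21 + x22
    if abs(total - 1.0) > 1e-9:
        raise ValueError("gamete frequencies must sum to 1")
    p1 = x11 + x12  # frequency of allele A1
    q1 = x11 + x21  # frequency of allele B1
    d_deviation = x11 - p1 * q1
    d_product = x11 * x22 - x12 * x21
    return d_deviation, d_product

# Hypothetical population with an excess of A1B1 and A2B2 gametes
d_a, d_b = linkage_disequilibrium(0.4, 0.1, 0.1, 0.4)
print(d_a, d_b)
```

The two expressions are algebraically identical, so any difference between the printed values is floating-point rounding.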

  21. Problem: D is sensitive to allele frequencies • Can’t have negative gamete frequencies • Maximum D set by allele frequencies • Solution: D' = D/Dmax, which ranges from -1 to 1 • Dmax calculation: if D is positive, Dmax is the lesser of p1q2 or p2q1; if D is negative, Dmax is the lesser of p1q1 or p2q2 • Example, if D is positive: with p1=0.5 and q2=0.5, Dmax=0.25, but with p1=0.1 and q2=0.9, Dmax=0.09
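The Dmax rule on this slide translates directly into code. This is a minimal sketch with hypothetical function names; it takes D and the frequencies p1 (allele A1) and q1 (allele B1) as inputs and derives p2 and q2 as their complements.

```python
def d_max(d, p1, q1):
    """Largest magnitude of D permitted by the allele frequencies, given the sign of D."""
    p2, q2 = 1 - p1, 1 - q1
    if d > 0:
        return min(p1 * q2, p2 * q1)
    return min(p1 * q1, p2 * q2)

def d_prime(d, p1, q1):
    """D standardized by Dmax; keeps the sign of D and lies between -1 and 1."""
    if d == 0:
        return 0.0
    return d / d_max(d, p1, q1)

# The slide's two examples for positive D (q2 = 0.5 and q2 = 0.9)
print(d_max(0.01, p1=0.5, q1=0.5))
print(d_max(0.01, p1=0.1, q1=0.1))
```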

  22. LD can also be estimated as correlation between alleles • r can also be standardized to a -1 to 1 scale • It is equivalent to D’ in this case
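The equation for r did not survive in this transcript. The sketch below assumes the standard definition, r = D / sqrt(p1 p2 q1 q2), where p1 and q1 are the frequencies of alleles A1 and B1; the function name and example values are hypothetical. It also shows one situation in which r and D' coincide (equal allele frequencies at the two loci with positive D) and one in which they differ.

```python
from math import sqrt

def ld_correlation(d, p1, q1):
    """Correlation between alleles at two loci (assumed standard formula)."""
    p2, q2 = 1 - p1, 1 - q1
    return d / sqrt(p1 * p2 * q1 * q2)

# Equal allele frequencies: Dmax for positive D is p1*p2 = 0.25, so D' = 0.6 as well
print(ld_correlation(0.15, p1=0.5, q1=0.5))

# Unequal allele frequencies: D equals Dmax (0.05), so D' = 1, but r is smaller
print(ld_correlation(0.05, p1=0.1, q1=0.5))
```

Squaring r gives the r² statistic commonly reported in LD studies.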

  23. Recombination • Shuffling of parental alleles during meiosis • Occurs for unlinked loci and linked loci • Rate of recombination for linked markers is partially a function of physical distance [Figure: A1B1/A2B2 parent producing gametes A1B1, A1B2, A2B2, and A2B1]

  24. What is the expected recombination rate for unlinked loci? • c = nr / (nr + nc), where nr is the number of repulsion-phase gametes and nc is the number of coupling-phase gametes [Figure: meiosis in an A1B1/A2B2 individual; gametes A1B1 and A2B2 are coupling phase, gametes A1B2 and A2B1 are repulsion phase]
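The recombination rate can be estimated by counting gametes from a double heterozygote. The sketch below assumes an A1B1/A2B2 parent, so that A1B1 and A2B2 gametes are coupling phase and A1B2 and A2B1 gametes are repulsion phase, and computes the proportion of repulsion-phase gametes. The gamete labels, function name, and counts are illustrative.

```python
from collections import Counter

COUPLING = {"A1B1", "A2B2"}   # parental types for an A1B1/A2B2 parent
REPULSION = {"A1B2", "A2B1"}  # recombinant types for the same parent

def recombination_rate(gametes):
    """Proportion of repulsion-phase gametes among all scored gametes."""
    counts = Counter(gametes)
    n_c = sum(counts[g] for g in COUPLING)
    n_r = sum(counts[g] for g in REPULSION)
    return n_r / (n_r + n_c)

# Hypothetical sample from linked loci
linked = ["A1B1"] * 40 + ["A2B2"] * 40 + ["A1B2"] * 10 + ["A2B1"] * 10
# Hypothetical sample from unlinked loci: all four gamete types equally common
unlinked = ["A1B1"] * 25 + ["A2B2"] * 25 + ["A1B2"] * 25 + ["A2B1"] * 25
print(recombination_rate(linked), recombination_rate(unlinked))
```

With independent assortment the four gamete types are equally frequent, so the rate comes out at one half.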

  25. LD is partially a function of recombination rate • [Table: expected proportions of gametes produced by various genotypes over two generations, first generation (second generation)] • After one generation of recombination, D1 = D0(1 - c), where c is the recombination rate and D0 is the initial amount of LD

  26. Recombination degrades LD over time • Dt = D0(1 - c)^t ≈ D0 e^(-ct), where t is time (in generations) and e is the base of the natural logarithm (2.718)
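The decay of LD under recombination can be tabulated for a few recombination rates. This sketch assumes the standard recursion in which D shrinks by a factor of (1 - c) each generation, along with its exponential approximation for small c; the function names and the starting value of D0 = 0.25 are illustrative.

```python
import math

def ld_after(d0, c, t):
    """LD remaining after t generations, assuming D shrinks by (1 - c) per generation."""
    return d0 * (1 - c) ** t

def ld_after_approx(d0, c, t):
    """Exponential approximation, reasonable when c is small."""
    return d0 * math.exp(-c * t)

for c in (0.5, 0.1, 0.01):
    decay = [round(ld_after(0.25, c, t), 4) for t in (0, 1, 5, 10)]
    print(f"c = {c}: D at t = 0, 1, 5, 10 -> {decay}")
```

Even at c = 0.5 the association is only halved each generation, so it takes several generations to approach zero.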

  27. Effects of recombination rate on LD • Decline in LD over time with different theoretical recombination rates (c) • Even with independent segregation (c=0.5), multiple generations required to break up allelic associations • Genome-wide linkage disequilibrium can be caused by demographic factors (more later)
