
Human Genetic Variation


paul2

Presentation Transcript


  1. Human Genetic Variation: Genetics of Complex Diseases

  2. Challenges

  3. Challenge 2: Correcting genotyping errors • How can we detect genotyping errors? • Test for deviations from Hardy-Weinberg equilibrium. • If we have mother-father-child trios, we can check Mendelian consistency.
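
Both checks on this slide can be sketched in a few lines. This is a minimal illustration, not the presenter's pipeline: the function names, the genotype encoding (two-character strings such as "AG"), and the use of a simple chi-square statistic are assumptions made for the example.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic (1 degree of freedom) for deviation of
    observed genotype counts from Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation (monomorphic SNP).
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def mendelian_consistent(mother, father, child):
    """True if the child could have received one allele from each parent.
    Genotypes are hypothetical 2-character strings, e.g. "AG"."""
    a, b = child
    return (a in mother and b in father) or (b in mother and a in father)


# A SNP with no heterozygotes at all is a strong hint of a genotyping problem.
print(hwe_chi_square(50, 0, 50))
# Two "AA" parents cannot have an "AG" child.
print(mendelian_consistent("AA", "AA", "AG"))
```

A statistic above 3.84 corresponds to p < 0.05 with one degree of freedom. In practice a much stricter threshold is usually applied, because the test is repeated for every SNP.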

  4. Challenge 3: Population Substructure • Imagine that all the cases are collected from Africa, and all the controls are from Europe. • Many association signals are going to be found. • The vast majority of them are false. Why? Different evolutionary forces (drift, selection, mutation, migration, population bottlenecks) have given the two populations different allele frequencies.
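
The effect can be made concrete with a small worked example. The sketch below assumes a SNP that has no influence on disease but whose allele frequency is 0.6 in one population and 0.3 in the other; these frequencies, the sample sizes, and the function name are invented for illustration.

```python
def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Chi-square statistic for a 2x2 table of allele counts
    (cases vs. controls, alternate vs. reference allele)."""
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    numerator = n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2
    denominator = ((case_alt + case_ref) * (ctrl_alt + ctrl_ref)
                   * (case_alt + ctrl_alt) * (case_ref + ctrl_ref))
    return numerator / denominator


# Hypothetical allele frequencies of a SNP unrelated to the disease.
FREQ_POP1, FREQ_POP2 = 0.6, 0.3
CHROMOSOMES = 1000  # chromosomes sampled per group

# All cases from population 1, all controls from population 2.
stratified = allelic_chi_square(
    case_alt=int(CHROMOSOMES * FREQ_POP1),
    case_ref=int(CHROMOSOMES * (1 - FREQ_POP1)),
    ctrl_alt=int(CHROMOSOMES * FREQ_POP2),
    ctrl_ref=int(CHROMOSOMES * (1 - FREQ_POP2)),
)

# Cases and controls each drawn half from each population.
mixed_alt = int(CHROMOSOMES * (FREQ_POP1 + FREQ_POP2) / 2)
matched = allelic_chi_square(mixed_alt, CHROMOSOMES - mixed_alt,
                             mixed_alt, CHROMOSOMES - mixed_alt)

print(stratified, matched)
```

With unmatched sampling the statistic is very large even though the SNP is not causal; with ancestry-matched sampling it drops to zero.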

  5. Shaping Genetic Variation • Mutations add to genetic variation • Natural Selection controls the frequency of certain traits and alleles • Genetic drift
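
Genetic drift is easy to simulate. The sketch below uses the standard Wright-Fisher model, which the slides do not name, so treat the model choice, the function name, and the parameters as assumptions for illustration.

```python
import random


def wright_fisher(p0, n_chromosomes, generations, seed=0):
    """Simulate the frequency of one allele under pure drift.
    Each generation, every chromosome picks its parent allele at random
    from the previous generation (no selection, mutation, or migration)."""
    rng = random.Random(seed)
    p = p0
    trajectory = [p]
    for _ in range(generations):
        copies = sum(rng.random() < p for _ in range(n_chromosomes))
        p = copies / n_chromosomes
        trajectory.append(p)
    return trajectory


# Two populations split from the same ancestral frequency and then drift
# independently (different seeds stand in for independent histories).
pop_a = wright_fisher(0.5, n_chromosomes=200, generations=100, seed=1)
pop_b = wright_fisher(0.5, n_chromosomes=200, generations=100, seed=2)
print(pop_a[-1], pop_b[-1])
```

Smaller populations drift faster, which is one reason a bottleneck can shift allele frequencies sharply.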

  6. Ancestral population

  7. Ancestral population migration

  8. Genetic drift • Populations descended from the ancestral population end up with different allele frequencies.

  9. Population Substructure • Imagine that all the cases are collected from Africa, and all the controls are from Europe. • Many association signals are going to be found. • The vast majority of them are false. What can we do about it?

  10. Ancestry Inference • To what extent can population structure be detected from SNP data? • What can we learn from these inferences? • Can we build the tree of life? • How do we analyze complex (mixed) populations? Novembre et al., Nature, 2008

  11. Principal Component Analysis • Dimensionality reduction • Based on linear algebra • Intuition: find the ‘most important’ features of the data.

  12. Principal Component Analysis • Plotting the data on a one-dimensional line for which the spread is maximized.

  13. Principal Component Analysis • In our case, we want to look at two dimensions at a time. • The original data points have many dimensions – each SNP corresponds to one dimension.
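
The idea of finding the line of maximal spread can be sketched without any external library. The example below uses power iteration on a tiny invented genotype matrix (rows are individuals, columns are SNPs coded 0/1/2); the data, function names, and iteration count are assumptions, and a real analysis would use a linear algebra library and more than one component.

```python
import math


def center_columns(rows):
    """Subtract each SNP's mean so the spread is measured around the centroid."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def first_principal_component(rows, iterations=200):
    """Return (direction, projections): the unit vector along which the
    centered data has maximal spread, and each individual's coordinate on it."""
    x = center_columns(rows)
    m = len(x[0])
    # Deterministic, non-uniform start vector. Power iteration stalls only if
    # the start is exactly orthogonal to the answer, which is unlikely.
    v = [float(j + 1) for j in range(m)]
    norm = math.sqrt(sum(c * c for c in v))
    v = [c / norm for c in v]
    for _ in range(iterations):
        scores = [sum(a * b for a, b in zip(row, v)) for row in x]
        w = [sum(s * row[j] for s, row in zip(scores, x)) for j in range(m)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:  # no variation left in the data
            break
        v = [c / norm for c in w]
    projections = [sum(a * b for a, b in zip(row, v)) for row in x]
    return v, projections


# Six hypothetical individuals, four SNPs; the first three individuals come
# from one population and the last three from another.
genotypes = [
    [2, 2, 0, 1], [2, 1, 0, 1], [2, 2, 0, 1],
    [0, 0, 2, 1], [0, 1, 2, 1], [0, 0, 2, 1],
]
direction, projections = first_principal_component(genotypes)
print(projections)
```

The two groups land on opposite sides of zero along the first component. The overall sign is arbitrary, so only the separation is meaningful.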

  14. Data Available

  15. The HapMap Project: an international consortium that aims to genotype the genomes of 270 individuals from four different populations. HUJI 2006

  16. Launched in 2002. • First phase (2005): ~1 million SNPs for 270 individuals from four populations. • Second phase (2007): ~3.1 million SNPs for 270 individuals from four populations. • Third phase (ongoing): > 1 million SNPs for 1115 individuals across 11 populations.

  17. HapMap Populations: MKK, LWK, YRI, GIH, ASW, MEX, JPT, CHD, CHB, CEU, TSI

  18. HapMap PCA 1-2

  19. HapMap PCA 1-3

  20. HapMap PCA 1,2,4

  21. Lessons from the HapMap • African populations have higher genetic diversity than other populations. • Evidence for bottlenecks or founder effects in the other populations. • Evidence for the out-of-Africa theory. • HapMap was used to detect: common deletions across the genome; regions under selection; recombination rates and hotspots; associations of SNPs with disease.

  22. Example: detection of deletions using SNPs Conrad et al., Nature Genetics, 2006

  23. Example: detection of deletions using SNPs • Conrad et al. applied the method to the HapMap data and found: • Typical individuals have roughly 30-50 deletions larger than 5 kb (500-750 kb total sequence length). • Deletions tend to be gene-poor. • The deletions detected in the HapMap span 267 known and predicted genes. • Deletions have been linked to conditions such as schizophrenia (Stefansson et al., 2008), lupus glomerulonephritis (Aitman et al., Nature, 2006), and others.

  24. Distribution of deletion length Conrad et al., Nature Genetics, 2006

  25. Significant Region • Why do we have differences between data1 and data2? • How come so many SNPs seem to be associated in this region? • Maybe there are multiple ‘causal SNPs’? • Or maybe there are correlations between the SNPs?
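
Correlation between nearby SNPs is commonly summarized by the r² statistic. The sketch below assumes phased haplotypes encoded as strings of "0" and "1" (an encoding chosen for this example); the function name is likewise hypothetical.

```python
def ld_r_squared(haplotypes, i, j):
    """r^2 between biallelic sites i and j.
    haplotypes: list of equal-length strings over {"0", "1"}."""
    n = len(haplotypes)
    p_a = sum(h[i] == "1" for h in haplotypes) / n
    p_b = sum(h[j] == "1" for h in haplotypes) / n
    p_ab = sum(h[i] == "1" and h[j] == "1" for h in haplotypes) / n
    d = p_ab - p_a * p_b  # deviation from independence
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        return 0.0  # at least one site is monomorphic
    return d * d / denominator


# Perfectly correlated sites: knowing one allele determines the other.
print(ld_r_squared(["11", "11", "00", "00"], 0, 1))
# Independent sites: every combination is equally common.
print(ld_r_squared(["10", "11", "00", "01"], 0, 1))
```

When r² is close to 1, a single causal SNP makes all of its correlated neighbours look associated as well, which is one explanation for a whole region lighting up.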

  26. Linkage Disequilibrium Signatures of History

  27. Linkage Disequilibrium

  28. Haplotypes vs. Genotypes • Cost-effective genotyping technology gives genotypes and not haplotypes. • Haplotypes (mother chromosome, father chromosome): ATCCGA, AGACGC • Genotype: A {T,G} {C,A} C G {C,A} • Possible phases: ATACGA / AGCCGC, AGACGA / ATCCGC, ….
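
The ambiguity on this slide can be enumerated directly. The sketch below lists every haplotype pair compatible with a genotype; the encoding (one string of observed alleles per site) and the function name are assumptions for illustration.

```python
from itertools import product


def possible_phases(genotype):
    """List all unordered haplotype pairs consistent with a genotype.
    genotype: one string per site holding the observed alleles,
    e.g. "A" for a homozygous site and "TG" for a heterozygous one."""
    sites = [tuple(sorted(set(site))) for site in genotype]
    het = [k for k, s in enumerate(sites) if len(s) == 2]
    phases = []
    # Fix the orientation of the first heterozygous site so that each
    # unordered pair is produced exactly once.
    for flips in product((0, 1), repeat=max(len(het) - 1, 0)):
        orientation = dict(zip(het, (0,) + flips))
        first = "".join(s[orientation.get(k, 0)] for k, s in enumerate(sites))
        second = "".join(s[1 - orientation[k]] if k in orientation else s[0]
                         for k, s in enumerate(sites))
        phases.append((first, second))
    return phases


# The genotype from the slide: three heterozygous sites.
slide_genotype = ["A", "TG", "CA", "C", "G", "CA"]
phases = possible_phases(slide_genotype)
for pair in phases:
    print(pair)
```

A genotype with k heterozygous sites has 2^(k-1) possible phasings, so the slide's example has four, of which only one is the true pair of haplotypes.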

  29. Haplotypes cluster naturally

  30. Haplotypes cluster naturally
