Analyzing human population genetic history through the study of genetic variation

Mark Mata. Mentor: Eleazar Eskin, UCLA ZarLab. SoCalBSI 2009.

Presentation Transcript


  1. Analyzing human population genetic history through the study of genetic variation • Mark Mata • Mentor: Eleazar Eskin, UCLA ZarLab • SoCalBSI 2009

  2. Background • To study human population genetic history is to study parts of human evolution • Human evolution is one of the fundamental questions in science • We ask ourselves many questions like: • Where do we come from? • Why are we all different? • How are we all different?

  3. Background • The ZarLab studies the most recent events in human evolution: • Now that we have modern humans, what variations have occurred in our genes since our ancient African ancestors? • To answer this question, our group is looking at human variation to produce a genetic history of these changes

  4. Why do we care? • Many diseases are caused by variations that have occurred in our genetic history • Better understanding of our genetic history and human variation may eventually lead to better treatment plans • Personalized medicine: • “The right drug, in the right dose, to the right person, at the right time.” (PerkinElmer website: http://las.perkinelmer.com/content/snps/genotyping.asp#snps)

  5. Human Variation • Modern humans share 99.9% of our DNA • The remaining 0.1% accounts for the variation between humans • Of this, 80% of the variation is the result of SNPs • SNP (single-nucleotide polymorphism): a position in the genome where two different bases are present in the population. The base at a SNP on a chromosome is referred to as the “allele” • A haplotype is the sequence of alleles on a genome • The other 20% comes from deletions or insertions in the genome (PerkinElmer website: http://las.perkinelmer.com/content/snps/genotyping.asp#snps)

  6. International HapMap Project • Study done by the International HapMap Consortium • “…create a public, genome-wide database of common human sequence variation…” • Identified SNPs and compiled the SNP alleles into a database of haplotypes for four different populations (Phase 1) • The population used was a group of 60 Mormons in Utah • They have been widely studied in the past • They are of Western and Northern European descent • They have very detailed records • Their chromosome 19 was used (“A haplotype map of the human genome”, The International HapMap Consortium, Nature, published 27 October 2005)

  7. My Project Goals • Reconstruct human genetic history • This is a very difficult problem • Sub-problem: Identify recent genetic events • Make the assumption that these new genetic events are rare or very few in number • Easier to classify and identify relationships when compared to older more common haplotypes • These new events are important because they identify shared recent ancestry • Disease causing variations could be from recent events

  8. Identifying Recent Genetic Events • Select a region in a haplotype and find the frequency of variation • Group variations into common and rare • Find recent point mutations • Find recent recombination events

  9. Workflow
     Individual's haplotypes:
       TTTTTTTTTTTTTTT
       AAAAAAAAAAAAAAA
       TTTTTTTTTTTTTTT
       AAAAAAAAAAAAAAA
       TTTTTTTTTTTTTTT
       AAAAAAAAAAAAAAA
       TTTTTTTTTTTTTTT
       AAAAAAAAATTTTTT
       AATTTTTTTTTTTTT
       TTTTTTATTTTTTTT
       AAAAAAAAAAAAAAA
       AAAAAAAAAAAAAAA
     Frequency of variation:
       Common   AAAAAAAAAA   49%
       Common   TTTTTTTTTT   48%
       Rare     AAAAAAAAAT    1%
       Rare     AATTTTTTTT    1%
       Rare     TTTTTTATTT    1%
     Identified events (common haplotypes, then the marked rare haplotype):
       AAAAAAAAAA               AAAAAAAAAT*
       AAAAAAAAAA, TTTTTTTTTT   AA|TTTTTTTT
       TTTTTTTTTT               TTTTTTA*TTT

  10. Frequency of Variation
     Individual's haplotype   Region
     TTTTTTTTTTTTTTT          TTTTTTTTTT
     AAAAAAAAAAAAAAA          AAAAAAAAAA
     TTTTTTTTTTTTTTT          TTTTTTTTTT
     AAAAAAAAAAAAAAA          AAAAAAAAAA
     TTTTTTTTTTTTTTT          TTTTTTTTTT
     AAAAAAAAAAAAAAA          AAAAAAAAAA
     TTTTTTTTTTTTTTT          TTTTTTTTTT
     AAAAAAAAATTTTTT          AAAAAAAAAT
     AATTTTTTTTTTTTT          AATTTTTTTT
     TTTTTTATTTTTTTT          TTTTTTATTT
     AAAAAAAAAAAAAAA          AAAAAAAAAA
     AAAAAAAAAAAAAAA          AAAAAAAAAA
     How many:
       AAAAAAAAAA   59
       TTTTTTTTTT   58
       AAAAAAAAAT    1
       AATTTTTTTT    1
       TTTTTTATTT    1

  11. Frequency of Variation
     Individual's haplotype ("|" marks the end of the selected region):
       TTTTTTTTTT|TTTTT
       AAAAAAAAAA|AAAAA
       TTTTTTTTTT|TTTTT
       AAAAAAAAAA|AAAAA
       TTTTTTTTTT|TTTTT
       AAAAAAAAAA|AAAAA
       TTTTTTTTTT|TTTTT
       AAAAAAAAAT|TTTTT
       AATTTTTTTT|TTTTT
       TTTTTTATTT|TTTTT
       AAAAAAAAAA|AAAAA
       AAAAAAAAAA|AAAAA
     Region       How many   Frequency of variation
     AAAAAAAAAA   59/120     ~49%
     TTTTTTTTTT   58/120     ~48%
     AAAAAAAAAT    1/120     ~1%
     AATTTTTTTT    1/120     ~1%
     TTTTTTATTT    1/120     ~1%
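The counting step on the slides above can be sketched in a few lines of Python. This is a minimal illustration, not the author's program: the function name `region_frequencies` and the toy `population` list are assumptions, built only to reproduce the 59/58/1/1/1 counts shown on the slide.

```python
from collections import Counter


def region_frequencies(haplotypes, start, length):
    """Count each distinct subsequence found in haplotype[start:start+length].

    Returns {subsequence: (count, fraction of all haplotypes)}.
    """
    regions = [h[start:start + length] for h in haplotypes]
    counts = Counter(regions)
    total = len(regions)
    return {seq: (n, n / total) for seq, n in counts.items()}


# Toy population (an assumption) matching the slide: 120 haplotypes of 15 SNPs.
population = (
    ["A" * 15] * 59
    + ["T" * 15] * 58
    + ["AAAAAAAAATTTTTT", "AATTTTTTTTTTTTT", "TTTTTTATTTTTTTT"]
)

freqs = region_frequencies(population, start=0, length=10)
for seq, (n, fraction) in freqs.items():
    print(f"{seq}  {n}/{len(population)}  ~{fraction:.0%}")
```

Using a `Counter` keeps the step linear in the number of haplotypes, which matters once the same count is repeated for every window along a chromosome.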

  12. Grouping Variations • Subsequences are classified as either common or rare haplotypes • Make the assumption that new genetic events are rare or very few in number • A cutoff of 5% frequency was used: subsequences at 5% or higher are common, and those below 5% are rare • The 5% figure comes from the International HapMap Consortium study (“A haplotype map of the human genome”, The International HapMap Consortium, Nature, published 27 October 2005)

  13. Grouping Variations
     Individual's genes:
       TTTTTTTTTT|TTTTT
       AAAAAAAAAA|AAAAA
       TTTTTTTTTT|TTTTT
       AAAAAAAAAA|AAAAA
       TTTTTTTTTT|TTTTT
       AAAAAAAAAA|AAAAA
       TTTTTTTTTT|TTTTT
       AAAAAAAAAT|TTTTT
       AATTTTTTTT|TTTTT
       TTTTTTATTT|TTTTT
       AAAAAAAAAA|AAAAA
       AAAAAAAAAA|AAAAA
     Frequency of variation   Group
     AAAAAAAAAA   49%         Common
     TTTTTTTTTT   48%         Common
     AAAAAAAAAT    1%         Rare
     AATTTTTTTT    1%         Rare
     TTTTTTATTT    1%         Rare
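A minimal sketch of the grouping step, assuming the counts from the frequency slide and the 5% cutoff stated in the talk. The function name `split_common_rare` and the dictionary layout are illustrative assumptions, not the author's code.

```python
def split_common_rare(counts, cutoff=0.05):
    """Split subsequences into common (frequency >= cutoff) and rare (< cutoff)."""
    total = sum(counts.values())
    common = [seq for seq, n in counts.items() if n / total >= cutoff]
    rare = [seq for seq, n in counts.items() if n / total < cutoff]
    return common, rare


# Counts taken from the frequency slide (120 haplotypes in total).
counts = {
    "AAAAAAAAAA": 59,
    "TTTTTTTTTT": 58,
    "AAAAAAAAAT": 1,
    "AATTTTTTTT": 1,
    "TTTTTTATTT": 1,
}

common, rare = split_common_rare(counts)
print("Common:", common)
print("Rare:  ", rare)
```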

  14. Recent Events • Make comparisons to identify two forms of variation: • Point mutations • Recombination events
     Common: AAAAAAAAAA, TTTTTTTTTT
     Rare: AAAAAAAAAT, AATTTTTTTT, TTTTTTATTT

  15. Point Mutations • Same haplotypes and frequencies as the Workflow slide (slide 9), with all identified events shown:
     AAAAAAAAAA               AAAAAAAAAT*
     AAAAAAAAAA, TTTTTTTTTT   AA|TTTTTTTT
     TTTTTTTTTT               TTTTTTA*TTT

  16. Point Mutations • Same haplotypes and frequencies as the Workflow slide (slide 9), with only one point mutation event shown:
     TTTTTTTTTT               TTTTTTA*TTT

  17. Recent Events • Point mutations • Are found by comparing a common haplotype with a rare haplotype • A difference at exactly one position shows that the rare haplotype is a point mutation of the common haplotype • Marked by a “*” next to the point mutation
     Common: TTTTTTTTTT
     Rare: TTTTTTATTT
     Marked: TTTTTTA*TTT
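The comparison described above amounts to a Hamming-distance test. The sketch below is one way to write it, under the assumption that all subsequences have equal length; `hamming` and `find_point_mutations` are hypothetical names, and the "*" is placed directly after the changed allele as on the slides.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def find_point_mutations(common, rare):
    """Return (rare, common, position, marked) for each rare haplotype that
    differs from a common haplotype at exactly one position."""
    events = []
    for r in rare:
        for c in common:
            if hamming(c, r) == 1:
                pos = next(i for i, (x, y) in enumerate(zip(c, r)) if x != y)
                marked = r[:pos + 1] + "*" + r[pos + 1:]
                events.append((r, c, pos, marked))
    return events


common = ["AAAAAAAAAA", "TTTTTTTTTT"]
rare = ["AAAAAAAAAT", "AATTTTTTTT", "TTTTTTATTT"]

for r, c, pos, marked in find_point_mutations(common, rare):
    print(f"{c} -> {marked}  (position {pos}, 0-based)")
```

On the slide's example, `AATTTTTTTT` is not reported because it differs from both common haplotypes at two or more positions.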

  18. Recombination • Same haplotypes and frequencies as the Workflow slide (slide 9), with all identified events shown:
     AAAAAAAAAA               AAAAAAAAAT*
     AAAAAAAAAA, TTTTTTTTTT   AA|TTTTTTTT
     TTTTTTTTTT               TTTTTTA*TTT

  19. Recombination • Same haplotypes and frequencies as the Workflow slide (slide 9), with only the recombination event shown:
     AAAAAAAAAA, TTTTTTTTTT   AA|TTTTTTTT

  20. Recent Events • Recombination • Combine portions of two common haplotypes and see if they form a rare haplotype
     Common: AAAAAAAAAA, TTTTTTTTTT
     Possible recombinations:
       AA|TTTTTTTT
       AAA|TTTTTTT
       AAAA|TTTTTT
       AAAAA|TTTTT
       AAAAAA|TTTT
       AAAAAAA|TTT
       AAAAAAAA|TT

  21. Rare Mutations • Marked by a “|” at the border between one haplotype and another haplotype
     Possible recombinations:
       AA|TTTTTTTT
       AAA|TTTTTTT
       AAAA|TTTTTT
       AAAAA|TTTTT
       AAAAAA|TTTT
       AAAAAAA|TTT
       AAAAAAAA|TT
     Actual recombinations:
       AA|TTTTTTTT
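The recombination search can be sketched as a brute-force scan over breakpoints. This is an illustration under stated assumptions, not the author's implementation: `find_recombinations` is a hypothetical name, and the `min_segment=2` default is inferred from the slide's list of candidates, which runs from `AA|TTTTTTTT` to `AAAAAAAA|TT` and so never takes fewer than two alleles from either parent.

```python
def find_recombinations(common, rare, min_segment=2):
    """Return (rare, left, right, breakpoint, marked) for each rare haplotype
    equal to a prefix of one common haplotype joined to the suffix of another.

    min_segment is the smallest number of alleles taken from either parent
    (an assumption inferred from the slide's candidate list).
    """
    events = []
    for r in rare:
        for left in common:
            for right in common:
                if left == right:
                    continue
                for k in range(min_segment, len(r) - min_segment + 1):
                    if left[:k] + right[k:] == r:
                        marked = r[:k] + "|" + r[k:]
                        events.append((r, left, right, k, marked))
    return events


common = ["AAAAAAAAAA", "TTTTTTTTTT"]
rare = ["AAAAAAAAAT", "AATTTTTTTT", "TTTTTTATTT"]

for r, left, right, k, marked in find_recombinations(common, rare):
    print(f"{left} + {right} -> {marked}  (breakpoint after {k} alleles)")
```

With `min_segment=1` the rare haplotype `AAAAAAAAAT` would also be reported as a recombination, which is the ambiguity the overlapping-window slides address.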

  22. Sample input and output
     chr-haplotypes.txt     new_chr-haplotypes.txt
     Indv1                  Indv1
     TTTTTTTTTTTTTTT        T T T T T T T T T T
     Indv1                  Indv1
     AAAAAAAAATTTTTT        A A A A A A A A A T*
     Indv2                  Indv2
     AATTTTTTTTTTTTT        A A|T T T T T T T T
     Indv2                  Indv2
     TTTTTTATTTTTTTT        T T T T T T A*T T T
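The output lines in the sample appear to be the region's alleles separated by spaces, with "*" or "|" taking the place of a space where an event was found. A small sketch of that formatting follows; `format_region` and its parameters are hypothetical, and the exact file format is inferred from the slide rather than documented.

```python
def format_region(region, mutation_pos=None, breakpoint=None):
    """Render a region as space-separated alleles.

    mutation_pos: 0-based index of a point mutation; "*" follows that allele.
    breakpoint:   number of alleles before a recombination border; "|" is
                  placed between allele breakpoint-1 and allele breakpoint.
    """
    parts = []
    for i, allele in enumerate(region):
        parts.append(allele)
        if i == mutation_pos:
            parts.append("*")
        elif breakpoint is not None and i + 1 == breakpoint:
            parts.append("|")
        else:
            parts.append(" ")
    # Drop the separator that would otherwise trail the last allele.
    return "".join(parts).rstrip(" ")


print(format_region("TTTTTTTTTT"))
print(format_region("AAAAAAAAAT", mutation_pos=9))
print(format_region("AATTTTTTTT", breakpoint=2))
print(format_region("TTTTTTATTT", mutation_pos=6))
```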

  23. Visualization Tool

  24. Expanding to the Whole Chromosome • Now that we have a way to look for variations in regions of a chromosome, we can expand the technique to look for variations in a whole chromosome • We used a technique of overlapping windows
     AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
     |AAAAAAAAAA|
          |AAAAAAAAAA|
               |AAAAAAAAAA|
                    |AAAAAAAAAA|
                         |AAAAAAAAAA|
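A minimal sketch of the overlapping-window scan, assuming a window of 10 SNPs that slides by 5, as the later "First 10" and "Slide over 5 and next 10" labels suggest. The generator name `overlapping_windows` is an assumption, and how the original program treats a final partial window is not stated; this sketch simply skips it.

```python
def overlapping_windows(n_snps, size=10, step=5):
    """Yield (start, end) index pairs for windows of `size` SNPs, moving by `step`.

    A trailing window shorter than `size` is skipped (an assumption).
    """
    for start in range(0, n_snps - size + 1, step):
        yield start, start + size


haplotype = "A" * 30
for start, end in overlapping_windows(len(haplotype)):
    print(f"{start:2d}-{end:2d}  {haplotype[start:end]}")
```

For the 30-SNP haplotype drawn on the slide this yields five windows, matching the five bracketed regions in the diagram.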

  25. Overlapping Windows • Same haplotypes and frequencies as the Workflow slide (slide 9), with all identified events shown:
     AAAAAAAAAA               AAAAAAAAAT*
     AAAAAAAAAA, TTTTTTTTTT   AA|TTTTTTTT
     TTTTTTTTTT               TTTTTTA*TTT

  26. Overlapping Windows • Same haplotypes and frequencies as the Workflow slide (slide 9), with only one event shown:
     AAAAAAAAAA               AAAAAAAAAT*

  27. Overlapping • Recombination events that looked like point mutations
     Full haplotypes:
       Common: AAAAAAAAAAAAAAA, TTTTTTTTTTTTTTT
       Rare: AAAAAAAAATTTTTT
     First 10:
       Common: AAAAAAAAAA
       Rare: AAAAAAAAAT*
     Slide over 5 and next 10:
       Common: AAAAAAAAAA, TTTTTTTTTT
       Rare: AAAA|TTTTTT
     Combined: AAAAAAAAA|T*TTTTT
     Resolved: AAAAAAAAA|TTTTTT
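The example above can be traced in code. The sketch below shows one plausible way to reconcile the two windows, not necessarily the rule the author used: the first window sees a single-allele difference, the second window (shifted by 5) finds a clean breakpoint, and mapping that breakpoint back to full-haplotype coordinates gives the resolved annotation. The helper names are hypothetical.

```python
def hamming(a, b):
    """Number of differing positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def breakpoint_between(left, right, target, min_segment=2):
    """Return k such that left[:k] + right[k:] == target, or None."""
    for k in range(min_segment, len(target) - min_segment + 1):
        if left[:k] + right[k:] == target:
            return k
    return None


common_a, common_t = "A" * 15, "T" * 15
rare = "AAAAAAAAATTTTTT"

# Window 1 (SNPs 0-9): the rare region differs from the A haplotype at one site.
looks_like_point_mutation = hamming(common_a[0:10], rare[0:10]) == 1

# Window 2 (SNPs 5-14): the same haplotype is a join of the two common ones.
start = 5
k = breakpoint_between(common_a[start:start + 10],
                       common_t[start:start + 10],
                       rare[start:start + 10])

# Map the window-local breakpoint back to full-haplotype coordinates.
global_breakpoint = start + k
resolved = rare[:global_breakpoint] + "|" + rare[global_breakpoint:]

print(looks_like_point_mutation, k, resolved)
```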

  28. Applying to a Population’s Chromosome • Now that we have a technique to look for new variations in a whole chromosome • We can apply it to a population and identify regions where recent genetic events took place

  29. Identified Recent Genetic Events • In chromosome 19:
     Unique point mutations = 13723
     Unique recombination events = 4065
     Total unique events = 15697
     Total point mutations = 46072
     Total recombination events = 11381
     Total number of events = 57453
     Average point mutations per individual = 383
     Average recombination events per individual = 94
     Average events per individual = 478
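As a quick arithmetic check, the reported averages match the totals divided by 120 and rounded down. That divisor is an inference from the 120 haplotypes used in the frequency slides (60 people, two haplotypes each); the slide itself does not state how the averages were computed.

```python
# Totals as reported on the slide.
totals = {
    "point mutations": 46072,
    "recombination events": 11381,
    "all events": 57453,
}

# Assumption: averages are taken over 120 haplotypes, not 60 people.
HAPLOTYPES = 120

averages = {name: total // HAPLOTYPES for name, total in totals.items()}

for name, avg in averages.items():
    print(f"average {name}: {avg}")

# The two event totals add up to the reported overall total.
assert totals["point mutations"] + totals["recombination events"] == totals["all events"]
```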

  30. Point Mutations (plot: number of events by SNP position in the haplotype)

  31. Recombination Events (plot: number of events by SNP position in the haplotype)

  32. Point Mutations and Recombination Events (plot: number of events by SNP position in the haplotype)

  33. Conclusion • We have developed an algorithm for identifying recent genetic events in an individual • More point mutations were identified than recombination events • Some regions in the genome contain many recent genetic events, while other regions contain few

  34. Future Work • Run the algorithm over the whole genome • Extend the algorithm to multiple populations • Identify recent events that are unique to a population vs. ones that are shared • Identify genetic relations between common haplotypes • Create a chronological order of recent events in an individual • Adapt the algorithm for high-throughput sequencing data

  35. UCLA ZarLab • Dr. Eleazar Eskin • All the lab people
     SoCalBSI • Dr. Jamil Momand • Dr. Sandra Sharp • Dr. Nancy Warter-Perez • Dr. Wendie Johnston • Dr. Beverly Krilowicz • Dr. Silvia Heubach • Dr. Jennifer Faust • Ronnie Cheng
     Funded By: • SoCalBSI 2009 Interns

  36. Determining ancestors • The other ancestors are determined through SNP differences of 2 or more

  37. My Project • Red line: point mutation • Blue line: ancestor-to-common relationship • Black dashed line: haplotype that resulted from a crossover (recombination) event

  38. Graph • The graph is generated by Graphviz, a graph visualization program
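A sketch of how such a graph could be written out as Graphviz DOT text, using the line styles from the legend slide (red for point mutations, blue for ancestor links, black dashed for recombination). The function `to_dot`, the graph name, and the edge lists are assumptions; the talk does not show the actual DOT file. The sketch only builds the text and does not call Graphviz itself.

```python
def to_dot(point_mutations, ancestor_links, recombinations):
    """Build DOT text from (source, target) haplotype pairs."""
    lines = ["digraph haplotypes {"]
    for common, rare in point_mutations:
        lines.append(f'  "{common}" -> "{rare}" [color=red];')
    for ancestor, common in ancestor_links:
        lines.append(f'  "{ancestor}" -> "{common}" [color=blue];')
    for parent, child in recombinations:
        lines.append(f'  "{parent}" -> "{child}" [color=black, style=dashed];')
    lines.append("}")
    return "\n".join(lines)


dot_text = to_dot(
    point_mutations=[("AAAAAAAAAA", "AAAAAAAAAT"), ("TTTTTTTTTT", "TTTTTTATTT")],
    ancestor_links=[],  # left empty: the ancestor rule is not detailed enough to sketch
    recombinations=[("AAAAAAAAAA", "AATTTTTTTT"), ("TTTTTTTTTT", "AATTTTTTTT")],
)
print(dot_text)
```

Saving the printed text to a file and running it through the `dot` command-line tool (for example `dot -Tpng`) would render the picture.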

  39. Graph
