
The coalescent with recombination (Chapter 5, Part 1)

Explore the importance of relaxing the last assumption in the Wright-Fisher model - the absence of recombination. Discover how recombination affects genealogical relationships and the mathematical complexity of its analysis.

duaned

Presentation Transcript


  1. The coalescent with recombination (Chapter 5, Part 1)

  2. Six assumptions of the Wright-Fisher model • Discrete and non-overlapping generations • Haploid individuals or two subpopulations • The population size is constant • All individuals are equally fit • The population has no geographical or social structure • The genes are not recombining • Slide annotations: no need to be relaxed; have been relaxed in Chapter 4; to be relaxed soon. Comp 790-Coalescent with recombination

  3. No recombination: the last assumption • This is the last assumption that needs to be relaxed. • Why does it need to be relaxed? • Recombination occurs in most real data sets. • Why is it the last one to be relaxed? • It is more mathematically complex to analyze. • The sampled sequences are no longer related by a tree, but by a graph or a collection of trees.

  4. Outline • What is recombination? • An example of recombination • Hudson’s model of recombination • Wright-Fisher model with recombination • ARG Simulation Algorithm

  5. What is recombination? • Recall the slides from lecture 5. • Recombination: a process in which new gene combinations are introduced • E.g. crossover, gene conversion

  6. What is the result of recombination? • [Figure: two panels, "No recombination" and "Recombination", each showing a grandparents layer, a parents layer, and a children layer.]

  7. An example of recombination • The Apolipoprotein E gene • 31 different haplotypes (rows) • 21 segregating sites (columns) • Some pairs of sites cannot be fitted on a single tree. • There must have been recombination.

  8. Pairwise LD measure • LD (linkage disequilibrium) is an indirect measure of the correlation of genealogical trees for different segregating sites. • The higher the LD, the more correlated the pair of sites. • The color denotes the significance. • There is a weak tendency for highly significant LD to be found between close sites.

  9. LD at different distances • LD is smaller the further apart the sites are. • Recombination leads to this pattern. • Sites far apart experience more recombination events between them.

  10. A summary of the example • We cannot use the previous model, which has no recombination, to fit these sequences. • Recombination is the cause. • Recombination can generate incompatibilities between pairs of sites. • Segregating sites far apart experience more recombination events, so they become less correlated.
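
One common way to detect that a pair of sites "cannot be fitted on a single tree" is the four-gamete test; the slides do not name the test they used, so this is an assumption. Under the additional assumption that each site mutated at most once, two biallelic sites that show all four allele combinations cannot share one tree. The function name `incompatible_pairs` and the toy haplotypes are hypothetical and are not the Apolipoprotein E data.

```python
from itertools import combinations

def incompatible_pairs(haplotypes):
    """Return index pairs of sites that show all four gametes (00, 01, 10, 11)."""
    n_sites = len(haplotypes[0])
    bad = []
    for i, j in combinations(range(n_sites), 2):
        # Collect the distinct two-site haplotypes observed in the sample.
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            bad.append((i, j))
    return bad

# Hypothetical toy sample: rows are haplotypes, columns are segregating sites.
toy = ["000", "010", "100", "110", "001"]
print(incompatible_pairs(toy))
```

In the toy sample only the first two sites show all four combinations, so only that pair is flagged.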

  11. Hudson’s model of recombination • Forward perspective: the parental chromosome is inherited directly from the grandparental chromosomes. • Choose a point uniformly at random. • Copy the genetic material from chromosome A to the left of that point. • Copy the genetic material from chromosome B to the right of that point. • [Figure: chromosomes A and B recombining.]
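
The forward step can be sketched in a few lines of Python. This is a minimal illustration, assuming chromosomes are equal-length strings and the breakpoint falls between two positions; `recombine` is a hypothetical helper, not code from the course.

```python
import random

def recombine(chrom_a, chrom_b, rng=random):
    """Return (child, point): material left of point from A, right of it from B."""
    if len(chrom_a) != len(chrom_b) or len(chrom_a) < 2:
        raise ValueError("need two chromosomes of equal length >= 2")
    # Breakpoint chosen uniformly among the L-1 gaps between positions.
    point = rng.randint(1, len(chrom_a) - 1)
    return chrom_a[:point] + chrom_b[point:], point

child, point = recombine("AAAAAAAA", "BBBBBBBB", random.Random(1))
print(point, child)
```

Using letters A and B for the two grandparental chromosomes makes the breakpoint visible in the printed child.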

  12. Hudson’s model of recombination (cont.) • Reversed perspective: choose a chromosome from a parent. • The chromosome splits into two grandparental chromosomes. • [Figure: recombination viewed backwards in time.]

  13. Modeling recombination and coalescence • Recombination events are the opposite of coalescent events. • Looking backwards: coalescence is a combining event; recombination is a splitting event. • But how can we model both of these events? • Use an idea similar to the one we used before (when adding mutation events to the coalescent). • Question 1: What is this idea?

  14. Another exponential distribution • We model the waiting time until a recombination event as exponentially distributed. • This distribution is independent of the coalescent process. • The parameter (the intensity of recombination) depends on the recombination rate (ρ) of a sequence multiplied by the number of ancestral lineages.

  15. From Hudson’s model to the Wright-Fisher model • Hudson’s model simplifies the recombination process relative to the biological facts. • The mechanisms of recombination are complicated and differ greatly among eukaryotes, bacteria, and viruses. • The process is still not well understood at the molecular level. • Still, the model forms the basis for most applications of coalescent theory to recombining sequences. • We now modify the Wright-Fisher model to include this simplified model of recombination.

  16. Wright-Fisher model with recombination • Diploid Wright-Fisher Model • An individual perspective

  17. Wright-Fisher model with recombination (cont.) • Haploid Wright-Fisher Model • We can ignore the existence of individuals under some conditions. • A sequence perspective

  18. Discrete time formulation • In the discrete model, let $r$ be the probability per generation that a sequence was created by recombination. • $T_R$ denotes the number of generations until the first recombination event. • The probability that a sequence was created by recombination exactly $j$ generations back is $P(T_R = j) = r(1-r)^{j-1}$. • $T_R$ is geometrically distributed.
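
A quick numerical check of the geometric waiting time, P(T_R = j) = r(1 - r)^(j-1): the probabilities should sum to 1 and the mean should be 1/r. The value of `r` below is a hypothetical choice for illustration only.

```python
def geometric_pmf(j, r):
    """P(T_R = j): no recombination for j-1 generations, then one in generation j."""
    return r * (1.0 - r) ** (j - 1)

r = 0.01  # hypothetical per-generation recombination probability
generations = range(1, 5001)  # truncation point; the remaining tail is negligible here
probs = [geometric_pmf(j, r) for j in generations]
total = sum(probs)
mean = sum(j * p for j, p in zip(generations, probs))
print(total, mean)  # total should be close to 1, mean close to 1/r
```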

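  19. Continuous time approximation • Let the scaled recombination rate be $\rho = 4Nr$, similar to $\theta$ for mutation, and measure time in units of $2N$ generations, $j = 2Nt$. • Then $P(T_R > 2Nt) = (1 - \rho/(4N))^{2Nt} \to e^{-\rho t/2}$ as $N \to \infty$, so the scaled waiting time is exponentially distributed, $\mathrm{Exp}(\rho/2)$. • Note that the probability so far is for only one sequence.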
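
The approximation can be checked numerically. With ρ = 4Nr and j = 2Nt, the discrete survival probability (1 - r)^(2Nt) should approach exp(-ρt/2) as N grows. The sketch below uses hypothetical values of ρ, t, and N.

```python
import math

def survival_discrete(t, N, rho):
    """P(no recombination in 2Nt generations) when r = rho / (4N)."""
    r = rho / (4.0 * N)
    return (1.0 - r) ** (2.0 * N * t)

def survival_continuous(t, rho):
    """Survival function of Exp(rho/2) at scaled time t."""
    return math.exp(-rho * t / 2.0)

rho, t = 2.0, 1.5  # hypothetical values
for N in (10, 100, 10_000):
    print(N, survival_discrete(t, N, rho), survival_continuous(t, rho))
```

The gap between the two columns shrinks as N increases, which is the sense in which the geometric distribution is replaced by an exponential one.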

  20. Continuous time approximation (cont.) • If there are $k$ sequences, the parameter of the exponential distribution will be $k\rho/2$. • Question 2: Why? • The waiting times until recombination in the individual sequences are independent and each exponentially distributed, i.e. $\mathrm{Exp}(\rho/2)$. • The intensity of recombination in any of the $k$ sequences equals the sum of the intensities of the individual sequences.
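
The answer to Question 2 rests on a standard fact: the minimum of k independent Exp(ρ/2) waiting times is Exp(kρ/2), with mean 2/(kρ). A small Monte Carlo sketch, with hypothetical values of k and ρ and a fixed seed, illustrates this.

```python
import random

def mean_first_recombination(k, rho, trials, rng):
    """Average time until the first of k independent Exp(rho/2) events."""
    total = 0.0
    for _ in range(trials):
        total += min(rng.expovariate(rho / 2.0) for _ in range(k))
    return total / trials

rng = random.Random(0)
k, rho = 5, 2.0  # hypothetical values
estimate = mean_first_recombination(k, rho, 20_000, rng)
print(estimate, 2.0 / (k * rho))  # simulated mean vs. theoretical mean 2/(k*rho)
```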

  21. Continuous time approximation (cont.) • Again, the coalescence events and recombination events among $k$ sequences are independent, with exponentially distributed waiting times. • The waiting time until the first of these events is $\mathrm{Exp}\left(k(k-1)/2 + k\rho/2\right)$. • The probability that the first event is a coalescence is $\frac{k(k-1)/2}{k(k-1)/2 + k\rho/2} = \frac{k-1}{k-1+\rho}$. • The probability that it is a recombination is $\frac{k\rho/2}{k(k-1)/2 + k\rho/2} = \frac{\rho}{k-1+\rho}$.
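
For k lineages the coalescence rate is k(k - 1)/2 and the recombination rate is kρ/2, so the total rate and the two event probabilities follow directly. The helper below is a hypothetical illustration; it also confirms that the coalescence probability simplifies to (k - 1)/(k - 1 + ρ).

```python
def event_rates(k, rho):
    """Return (total rate, P(coalescence first), P(recombination first))."""
    coal = k * (k - 1) / 2.0   # rate of coalescence among k lineages
    rec = k * rho / 2.0        # rate of recombination among k lineages
    total = coal + rec
    return total, coal / total, rec / total

total, p_coal, p_rec = event_rates(4, 1.0)
print(total, p_coal, p_rec)
```

With k = 4 and ρ = 1 the rates are 6 and 2, so the next event is a coalescence with probability 3/4.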

  22. ARG Simulation algorithm • 1. Start with k = n genes. • 2. For k sequences with ancestral material, draw a random number from the exponential distribution with parameter k(k − 1)/2 + kρ/2. This is the time to the next event. • 3. With probability (k − 1)/(k − 1 + ρ) the event is a coalescence event, otherwise it is a recombination event. • 4. If it is a coalescence event, choose two of the ancestral sequences at random and merge them into one sequence that carries the ancestral material of both. Decrease k by one. If k = 1, end the process; otherwise go to step 2.

  23. ARG Simulation algorithm (cont.) • 5. If it is a recombination event, draw a random sequence and a random point on the sequence. Create one ancestor sequence with the ancestral material to the left of the chosen point and a second ancestor with the ancestral material to the right of the recombination point. Increase the number of ancestral sequences k by one and go to step 2. • Question 3: Where can we find the missing material of the ancestors? • [Figure: a sequence splitting at a random point.]
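
Steps 1 to 5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the course's implementation: each sequence is the unit interval [0, 1), ancestral material is stored as a list of half-open intervals, a lineage is kept even when a recombination leaves it with no ancestral material (so every recombination increases k by one, as in step 5), and the loop returns to the waiting-time step after each event. The names `merge`, `split`, and `simulate_arg` are hypothetical.

```python
import random

def merge(a, b):
    """Union of two lists of half-open intervals (coalescence)."""
    out = []
    for lo, hi in sorted(a + b):
        if out and lo <= out[-1][1]:
            out[-1] = (out[-1][0], max(out[-1][1], hi))
        else:
            out.append((lo, hi))
    return out

def split(material, x):
    """Ancestral material left and right of breakpoint x (recombination)."""
    left = [(lo, min(hi, x)) for lo, hi in material if lo < x]
    right = [(max(lo, x), hi) for lo, hi in material if hi > x]
    return left, right

def simulate_arg(n, rho, rng):
    """Return (time to single ancestor, event list, material of that ancestor)."""
    lineages = [[(0.0, 1.0)] for _ in range(n)]   # step 1: k = n
    t, events = 0.0, []
    while len(lineages) > 1:
        k = len(lineages)
        # Step 2: waiting time to the next event.
        t += rng.expovariate(k * (k - 1) / 2.0 + k * rho / 2.0)
        # Step 3: choose the event type.
        if rng.random() < (k - 1) / (k - 1 + rho):
            # Step 4: merge two lineages chosen at random.
            i, j = rng.sample(range(k), 2)
            merged = merge(lineages[i], lineages[j])
            lineages = [m for idx, m in enumerate(lineages) if idx not in (i, j)]
            lineages.append(merged)
            events.append((t, "coalescence"))
        else:
            # Step 5: split a random lineage at a random point.
            i = rng.randrange(k)
            left, right = split(lineages.pop(i), rng.random())
            lineages += [left, right]
            events.append((t, "recombination"))
    return t, events, lineages[0]

t_end, events, material = simulate_arg(5, 1.0, random.Random(42))
n_rec = sum(1 for _, kind in events if kind == "recombination")
print(round(t_end, 3), len(events), n_rec, material)
```

Because no lineage is ever discarded, the last remaining ancestor carries the whole interval, and the number of coalescences always exceeds the number of recombinations by n - 1.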

  24. Is the single ancestor ever reached? • A coalescence event decreases k by one. • A recombination event increases k by one. • Question 4: Is there an end for the process? • YES! • Why? • It is a birth-death process. • The coalescence intensity is k(k−1)/2 [death rate: k decreases]. • The recombination intensity is kρ/2 [birth rate: k increases]. • The coalescence rate is quadratic in k while the recombination rate is linear, so k(k−1)/2 ≥ kρ/2 whenever k ≥ ρ + 1. • The GMRCA (grand most recent common ancestor) is always found, but it may take a LONG time.
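
To see the birth-death behaviour, it is enough to follow the number of lineages k from event to event: k drops by one with probability (k − 1)/(k − 1 + ρ) and rises by one otherwise. The sketch below uses hypothetical values of n and ρ and a fixed seed; `lineage_count_path` is an illustrative name.

```python
import random

def lineage_count_path(n, rho, rng):
    """Follow the number of lineages event by event until it reaches 1."""
    k, path = n, [n]
    while k > 1:
        p_coal = (k - 1) / (k - 1 + rho)
        k += -1 if rng.random() < p_coal else 1   # coalescence or recombination
        path.append(k)
    return path

path = lineage_count_path(10, 2.0, random.Random(7))
print(len(path) - 1, max(path))  # number of events, largest k visited
```

Larger values of ρ push k upward more often, so the path takes many more events to reach 1, in line with the slide's remark that it may take a long time.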

  25. Genealogical structure: From tree to graph • With recombination, we must use a graph rather than a tree to model the relationships among sequences. • ARG (Ancestral Recombination Graph): the graph resulting from the algorithm.

  26. Genealogical structure: From graph to a collection of trees • However, if we focus on a single point on the sequence, there will be no recombination! • Question 5: Why? • Each point of a child sequence is always inherited from only one parent sequence. • Local tree: the tree relating the sequences at a single position. • The genealogy graph can be seen as a collection of local trees, one for each position.

  27. Next time • More on the simulation algorithm • Effect of a single recombination event • Coalescent events with gene conversion
