
Towards Identifying Lateral Gene Transfer Events

Towards Identifying Lateral Gene Transfer Events. L. Addario-Berry, M. Hallett, J. Lagergren. Presented by: Jeff Mathew. Roadmap: key terms, τ-transfer problem, H-moves and I-moves algorithm, tree generation for simulation, experimental results, conclusions and future work. LGT = HGT.


Presentation Transcript


  1. Towards Identifying Lateral Gene Transfer Events L. Addario-Berry, M. Hallett, J. Lagergren Presented By: Jeff Mathew

  2. Roadmap • Key terms • τ-transfer problem • H-moves and I-moves algorithm • Tree generation for simulation • Experimental results • Conclusions and future work

  3. Lateral transfer scenario • LGT = HGT (lateral gene transfer is the same as horizontal gene transfer) • The root of the scenario tree must correspond to the root of the gene tree • The scenario tree is connected and respects the direction of evolution implied by the arcs of T and S.

  4. α-activity • An α-active scenario for a gene tree and a species tree allows at most α copies of a gene to exist simultaneously in the genome of an ancestral taxon. • The authors focus on 1-active scenarios, though intractability results were proved earlier for α ≥ 1.

  5. τ-transfer problem • Input: species tree S, gene tree T, integer τ • Output: a τ*-lateral-transfer scenario for S and T with τ* ≤ τ • τ is the number of lateral transfer events needed to explain the difference between S and T • Intractability result: the decision version of the α-active, τ-transfer problem (does there exist an α-active scenario with cost ≤ τ?) is NP-complete.

  6. Algorithm • Two-phase approach • Phase 1 • While H-fat or I-fat vertices remain • Perform an H-fat move or an I-fat move • At the end of phase 1, the scenario is guaranteed to be 1-active. What about cycles? • Phase 2 • Remove the minimum number of LGT events from each candidate to make it acyclic. • Running time: O(2^(4τ) n^2)

  7. Simulating species trees • Create a random species tree S on n leaves, with Θ(log n) expected depth • S is supposed to reflect the actual evolutionary relationships between taxa • S is ultrametric, so edge weights correspond to time. • Randomly assign weights to the edges such that every root-to-leaf path has weighted sum 1.
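The slide does not say how the random topology or the edge weights were drawn, so the following is only a minimal sketch under stated assumptions: the topology is grown by repeatedly splitting a uniformly chosen leaf, and split times are sorted uniform draws, which makes the tree ultrametric by construction. The function names `random_ultrametric_tree`, `edge_weights` and `root_to_leaf_sums` are hypothetical and not taken from the paper.

```python
import random

def random_ultrametric_tree(n, rng=None):
    """Return (children, time) for a random binary tree with n leaves.

    children[v] = (left, right) for internal vertices; leaves have no entry.
    time[v] is the time of vertex v: 0.0 at the root, 1.0 at every leaf.
    Assumption (not from the slides): grow the tree by splitting a
    uniformly chosen leaf at each step.
    """
    if n < 2:
        raise ValueError("need at least two leaves")
    rng = rng or random.Random()
    # The root splits at time 0; the other n-2 splits get sorted random times,
    # so a vertex is always split later than its parent.
    split_times = [0.0] + sorted(rng.random() for _ in range(n - 2))
    children, time = {}, {}
    leaves = [0]          # vertex 0 is the root, initially unsplit
    next_id = 1
    for t in split_times:
        v = leaves.pop(rng.randrange(len(leaves)))
        time[v] = t
        children[v] = (next_id, next_id + 1)
        leaves.extend(children[v])
        next_id += 2
    for leaf in leaves:
        time[leaf] = 1.0
    return children, time

def edge_weights(children, time):
    """Weight of edge (parent, child) is the time elapsed along it."""
    return {(p, c): time[c] - time[p]
            for p, kids in children.items() for c in kids}

def root_to_leaf_sums(children, time, root=0):
    """Sum the edge weights along every root-to-leaf path."""
    w = edge_weights(children, time)
    sums, stack = [], [(root, 0.0)]
    while stack:
        v, d = stack.pop()
        if v not in children:
            sums.append(d)
        else:
            stack.extend((c, d + w[(v, c)]) for c in children[v])
    return sums
```

Because every leaf sits at time 1 and the root at time 0, each root-to-leaf sum telescopes to 1 (up to floating-point rounding), which is the property the slide asks for.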

  8. Simulating gene trees • Begin with generated ultrametric species tree • Lateral transfer events occur according to a Poisson process with mean rate λ • Moving from root to leaves, for each vertex x0 with children x1 and x2, examine both edges • If the Poisson process provides us with a lateral transfer event along (x0, x1), we add it and point it to a randomly chosen edge alive at that point in time. • Else add a speciation event for x1 • Repeat the analysis for (x0, x2)
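The per-edge step above can be sketched as follows. This is an illustration under assumptions, not the authors' simulator: it only samples the transfer events (at most one per edge, using the first arrival of a rate-λ Poisson process) and does not build the resulting gene tree. The tree representation (`children`, `time`) and the name `sample_transfers` are hypothetical.

```python
import random

def sample_transfers(children, time, lam, rng=None):
    """Sample lateral transfer events on an ultrametric species tree.

    children[v] = (left, right) for internal vertices; time[v] is the
    time of vertex v. Returns a time-sorted list of
    (t, source_edge, target_edge) tuples. Edges without an event are
    treated as plain speciation.
    """
    if lam <= 0:
        return []          # no transfers when the rate is zero
    rng = rng or random.Random()
    edges = [(p, c) for p, kids in children.items() for c in kids]
    events = []
    for (p, c) in edges:
        # First arrival of the Poisson process, started at the top of the edge.
        t = time[p] + rng.expovariate(lam)
        if t >= time[c]:
            continue       # no event before the edge ends
        # Candidate targets: every other edge alive at time t.
        alive = [(a, b) for (a, b) in edges
                 if (a, b) != (p, c) and time[a] <= t < time[b]]
        if alive:
            events.append((t, (p, c), rng.choice(alive)))
    return sorted(events)
```

For example, on the three-leaf tree `children = {0: (1, 2), 1: (3, 4)}` with `time = {0: 0.0, 1: 0.5, 2: 1.0, 3: 1.0, 4: 1.0}`, every sampled event has a source and a target edge that both span the event time.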

  9. Degenerate cases • The simulation can produce biologically plausible events that the algorithm cannot detect. • Useless transfers: LGTs that do not change the gene tree • Transfer-loss events: one child of a node is an LGT event and the other child is a loss event.

  10. Results • Ω = number of repetitions • τ = true number of LGT events • τ′ = cost of the minimum-cost LGT scenario found by the algorithm • λ = mean rate of LGTs from the Poisson process

  11. Finding the saturation point • The saturation point is the point at which the average τ′ stops increasing. • Random trees from a large pool were chosen as gene trees and species trees • Trials suggest that the saturation point is slightly above n/2, i.e., when τ > n/2 the algorithm stops detecting new LGT events • Thus, if τ′ > n/2, the correspondence between T and S via LGT events is not very meaningful.

  12-14. Results (continued) • Same legend as slide 10: Ω, τ, τ′ and λ

  15. Conclusions • Empirically verified the feasibility of the τ-transfer algorithm • Degenerate events, such as transfer-loss events that lead to over-estimates of the number of transfers, occur with low probability • Achieved near-optimal scenarios when λ is low enough not to cause saturation • The cycle-elimination phase of the algorithm is rarely needed in practice, implying an O(2^(2τ) n^2) running time.
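To see what skipping the cycle-elimination phase buys, the two running-time bounds quoted in the slides can be compared numerically. This sketch assumes the bounds read 2^(4τ) n^2 (slide 6) and 2^(2τ) n^2 (slide 15) and ignores the constants hidden by the O-notation; the function names are hypothetical.

```python
def bound_with_cycle_phase(tau, n):
    """Assumed reading of the slide 6 bound: 2^(4*tau) * n^2."""
    return 2 ** (4 * tau) * n ** 2

def bound_without_cycle_phase(tau, n):
    """Assumed reading of the slide 15 bound: 2^(2*tau) * n^2."""
    return 2 ** (2 * tau) * n ** 2

def speedup(tau, n):
    """Ratio of the two bounds; it equals 2^(2*tau) and is independent of n."""
    return bound_with_cycle_phase(tau, n) // bound_without_cycle_phase(tau, n)

if __name__ == "__main__":
    for tau in (1, 2, 5, 10):
        print(tau, speedup(tau, 100))
```

Both bounds are polynomial in n for fixed τ; the gap between them grows only with the number of transfers, as a factor of 2^(2τ).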

  16. Future work and open problems • Use weighted gene trees and species trees • Species trees are nearly ultrametric while gene trees are not • Do fast algorithms exist when the input is a set of gene trees with no species tree? • Tractability on larger phylogenies • Can we consider gene duplication, lateral gene transfer, and other events simultaneously? • Can we use probabilistic models that assign likelihoods to the various events and optimize over such models in a tractable manner?
