The Evolutionary Basis of Bioinformatics: An Introduction to Phylogenetics
> Sequence 1 GAGGTAGTAATTAGATCCGAAA…
> Sequence 2 GAGGTAGTAATTAGATCTGAAA…
> Sequence 3 GAGGTAGTAATTAGATCTGTCA…
http://bioquest.org/bedrock
What is phylogenetics?
Phylogenetics is the study of evolutionary relationships among and within species.
[Figure: birds, snakes, rodents, primates, crocodiles, marsupials, lizards.]
What is phylogenetics?
This is an example of a phylogenetic tree.
[Figure: phylogenetic tree with tips labeled crocodiles, birds, lizards, snakes, rodents, primates, marsupials.]
Applications of phylogenetics
• Forensics: Did a patient’s HIV infection result from an invasive dental procedure performed by an HIV+ dentist?
• Conservation: How much gene flow is there among local populations of island foxes off the coast of California?
• Medicine: What are the evolutionary relationships among the various prion-related diseases?
To be continued…
Phylogenetic concepts: Interpreting a Phylogeny
[Figure: tree of Sequence A, Sequence B, Sequence C, Sequence D, and Sequence E, with a time axis.]
Which sequence is most closely related to B? A, because B diverged from A more recently than from any other sequence.
Physical position in the tree is not meaningful! Only tree structure matters.
Phylogenetic concepts: Rooted and Unrooted Trees
[Figure: unrooted and rooted trees of A, B, C, and D, with candidate root positions marked and a time axis.]
Rooting and Tree Interpretation
[Figure: trees of chicken, human, fruit fly, oak, bacteria, and archaea (archaebacteria), annotated with the characters – bones / + bones and – cell nuclei / + cell nuclei.]
Rooting Methods
Outgroup root:
• Add 2+ taxa whose branches contain the tree’s new root.
• Must already know the position of the new tree’s root (often go from a higher to a lower taxonomic unit, e.g. family → genus).
[Figure: trees of shark, ray, trout, eagle, bat, and mouse.]
How Many Trees? (assuming bifurcation only)

# sequences   # pairwise    Unrooted trees              Rooted trees
              distances     # trees       # branches    # trees       # branches
                                          /tree                       /tree
3             3             1             3             3             4
4             6             3             5             15            6
5             10            15            7             105           8
6             15            105           9             945           10
10            45            2,027,025     17            34,459,425    18
30            435           8.69 × 10^36  57            4.95 × 10^38  58

General case, N sequences:
• # pairwise distances: N(N - 1)/2
• Unrooted: # trees = (2N - 5)! / [2^(N - 3) (N - 3)!]; # branches/tree = 2N - 3
• Rooted: # trees = (2N - 3)! / [2^(N - 2) (N - 2)!]; # branches/tree = 2N - 2
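
The general-case formulas for N sequences can be checked against the tabulated values with a few lines of Python. This is a minimal sketch; the function names are illustrative and not part of the slides.

```python
from math import factorial

def pairwise_distances(n):
    """Number of pairwise distances among n sequences: N(N - 1)/2."""
    return n * (n - 1) // 2

def unrooted_trees(n):
    """Bifurcating unrooted trees for n >= 3 sequences: (2N-5)! / [2^(N-3) (N-3)!]."""
    return factorial(2 * n - 5) // (2 ** (n - 3) * factorial(n - 3))

def rooted_trees(n):
    """Bifurcating rooted trees for n >= 2 sequences: (2N-3)! / [2^(N-2) (N-2)!]."""
    return factorial(2 * n - 3) // (2 ** (n - 2) * factorial(n - 2))

for n in (3, 4, 5, 6, 10, 30):
    print(n, pairwise_distances(n), unrooted_trees(n), 2 * n - 3,
          rooted_trees(n), 2 * n - 2)
```

Because each rooted tree corresponds to placing a root on one of the 2N - 3 branches of an unrooted tree, `rooted_trees(n)` equals `unrooted_trees(n) * (2 * n - 3)`.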
Tree Types
Evolutionary trees measure time. Phylograms measure change.
[Figure: two rooted trees of sharks, seahorses, frogs, owls, crocodiles, armadillos, and bats; one has a scale bar of 50 million years, the other a scale bar of 5% change.]
Tree Properties
Ultrametricity: All tips are an equal distance from the root. In the figure, a = b + c + d + e.
Additivity: Distance between any two tips equals the total branch length between them. In the figure, XY = a + b + c + d + e.
In simple scenarios, evolutionary trees are ultrametric and phylograms are additive.
[Figure: two rooted trees with tips X and Y and branch lengths a, b, c, d, e.]
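
Both properties can be tested directly on a distance matrix. The sketch below uses two standard criteria that are not stated on the slide and are brought in here as assumptions: the three-point condition for ultrametricity (in every triple of tips, the two largest distances are equal) and the four-point condition for additivity (in every quadruple, the two largest of the three pairwise sums are equal). Both example matrices are made up for illustration.

```python
from itertools import combinations

def dist_table(pairs):
    """Turn {(x, y): distance} into a lookup keyed by unordered pairs."""
    return {frozenset(k): v for k, v in pairs.items()}

def is_ultrametric(labels, dist, tol=1e-9):
    """Three-point condition: the two largest distances in each triple are equal."""
    for x, y, z in combinations(labels, 3):
        d = sorted(dist[frozenset(p)] for p in ((x, y), (x, z), (y, z)))
        if abs(d[2] - d[1]) > tol:
            return False
    return True

def is_additive(labels, dist, tol=1e-9):
    """Four-point condition: the two largest pairwise sums in each quadruple are equal."""
    for w, x, y, z in combinations(labels, 4):
        sums = sorted([
            dist[frozenset((w, x))] + dist[frozenset((y, z))],
            dist[frozenset((w, y))] + dist[frozenset((x, z))],
            dist[frozenset((w, z))] + dist[frozenset((x, y))],
        ])
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True

labels = ["A", "B", "C", "D"]

# Hypothetical clock-like distances: every tip is equally far from the root.
clock_like = dist_table({("A", "B"): 2, ("A", "C"): 6, ("B", "C"): 6,
                         ("A", "D"): 10, ("B", "D"): 10, ("C", "D"): 10})

# Hypothetical distances read off a tree ((A:1,B:3):1,(C:2,D:2)) with unequal rates.
uneven = dist_table({("A", "B"): 4, ("A", "C"): 4, ("A", "D"): 4,
                     ("B", "C"): 6, ("B", "D"): 6, ("C", "D"): 4})

print(is_ultrametric(labels, clock_like), is_additive(labels, clock_like))
print(is_ultrametric(labels, uneven), is_additive(labels, uneven))
```

The second matrix shows why the two properties are listed separately: distances can sum correctly along branches (additive) even when tips sit at different distances from the root (not ultrametric).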
Tree Building Exercise
Using the distance matrix given, construct an ultrametric tree.
Ultrametricity: All tips are an equal distance from the root (a = b + c + d + e).
[Figure: rooted tree with tips X and Y and branch lengths a, b, c, d, e.]
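
The exercise's distance matrix is not reproduced in this text, so the sketch below uses a made-up four-taxon matrix. It applies UPGMA, a standard clustering method that always yields an ultrametric tree; the slide does not name a method, so treating UPGMA as the intended approach is an assumption.

```python
from itertools import combinations

def upgma(labels, dist):
    """Build an ultrametric tree by UPGMA.

    dist maps frozenset({a, b}) to the distance between leaves a and b.
    Returns (newick_string, root_height).
    """
    # Each cluster is keyed by its Newick text and stores (size, height).
    clusters = {lab: (1, 0.0) for lab in labels}
    d = {frozenset((a, b)): float(dist[frozenset((a, b))])
         for a, b in combinations(labels, 2)}
    while len(clusters) > 1:
        pair = min(d, key=d.get)            # closest pair of clusters
        a, b = sorted(pair)
        height = d.pop(pair) / 2.0          # new node sits halfway between them
        (size_a, h_a), (size_b, h_b) = clusters.pop(a), clusters.pop(b)
        merged = f"({a}:{height - h_a:g},{b}:{height - h_b:g})"
        for c in clusters:
            # Size-weighted average distance from the merged cluster to c.
            dac = d.pop(frozenset((a, c)))
            dbc = d.pop(frozenset((b, c)))
            d[frozenset((merged, c))] = (size_a * dac + size_b * dbc) / (size_a + size_b)
        clusters[merged] = (size_a + size_b, height)
    (newick, (_, root_height)), = clusters.items()
    return newick + ";", root_height

# Hypothetical distance matrix (not the one from the exercise).
example = {frozenset(k): v for k, v in {
    ("A", "B"): 2, ("A", "C"): 6, ("B", "C"): 6,
    ("A", "D"): 10, ("B", "D"): 10, ("C", "D"): 10}.items()}

tree, root_height = upgma(["A", "B", "C", "D"], example)
print(tree)         # Newick string with branch lengths
print(root_height)  # distance from the root to every tip
```

In the resulting tree every root-to-tip path has the same total length, which is the ultrametric property the exercise asks for.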
Phylogenetic Methods
Many different procedures exist. Three of the most popular:
• Neighbor-joining: Minimizes distance between nearest neighbors
• Maximum parsimony: Minimizes total evolutionary change
• Maximum likelihood: Maximizes likelihood of observed data
Phylogenetic concepts: Homology and Homoplasy
Homology: identity due to shared ancestry (evolutionary signal)
Homoplasy: identity despite separate ancestry (evolutionary noise)
[Figure: tree of bat, chimp, and hawk, starting from "no hair, no wings", with changes marked + hair (once) and + wings (twice).]
Trees are hypotheses about evolutionary history So far, we’ve looked at understanding and formulating these hypotheses. Now, let’s turn our attention to testing them.
Tree Testing
Let’s study the following four sequences:
P. A C A T A C G
Q. G T A T A C G
R. G C A C A T G
S. G C A C A C A
How can we explain the indicated character?
• Homology: Changed just once.
• Homoplasy: Changed twice or more.
Homology more likely, but homoplasy still feasible.
[Figure: tree with tips P, Q, S, and R.]
Tree Testing
Now let’s look at four other sequences:
W. A C A T G T C A G A C G
X. G T A T G T C A G A C G
Y. G C A C A C T G A A T G
Z. G C A C A C T G A A C A
Same two explanations possible. Any changes to their relative likelihood?
Homology much more likely; homoplasy implausible.
[Figure: tree with tips P, Q, S, and R.]
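
One way to see why the second data set makes homology so much more convincing is to count, for each of the three possible groupings of four sequences, how many alignment columns support it (two sequences share one base and the other two share a different base). The sketch below is an illustration rather than the slides' own procedure, and the function name is made up.

```python
def split_support(seqs):
    """Count columns supporting each of the three 2-vs-2 groupings of four sequences.

    seqs maps four names to equal-length sequences.
    """
    a, b, c, d = list(seqs)
    splits = {
        f"{a}{b}|{c}{d}": ((a, b), (c, d)),
        f"{a}{c}|{b}{d}": ((a, c), (b, d)),
        f"{a}{d}|{b}{c}": ((a, d), (b, c)),
    }
    counts = dict.fromkeys(splits, 0)
    for i in range(len(seqs[a])):
        col = {name: seq[i] for name, seq in seqs.items()}
        for label, ((p, q), (r, s)) in splits.items():
            # A column supports a grouping when each pair agrees internally
            # and the two pairs differ from each other.
            if col[p] == col[q] and col[r] == col[s] and col[p] != col[r]:
                counts[label] += 1
    return counts

first = {"P": "ACATACG", "Q": "GTATACG", "R": "GCACATG", "S": "GCACACA"}
second = {"W": "ACATGTCAGACG", "X": "GTATGTCAGACG",
          "Y": "GCACACTGAATG", "Z": "GCACACTGAACA"}

print(split_support(first))   # one column groups P with Q
print(split_support(second))  # six columns group W with X
```

With a single supporting column, one extra change elsewhere in the tree could produce the same pattern by chance. With six columns all pointing to the same grouping, explaining them by homoplasy would require many independent coincidences.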
Tree Testing
Basic principle:
• Long branches → Strong evolutionary signal
• Short branches → Weak evolutionary signal
• Zero-length branches → NO evolutionary signal
Tree-testing methods: Bootstrapping, Jackknifing, Split decomposition, …
[Figure: three trees of A, B, C, and D, with long, short, and zero-length branches.]
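
Bootstrapping, the first method listed, resamples alignment columns with replacement and asks how often the same grouping is recovered. The toy sketch below applies it to the two four-sequence alignments from the preceding Tree Testing slides. As a stand-in for real tree building, each replicate simply picks the grouping supported by the most columns; that shortcut, the function names, and the replicate count are all assumptions made for illustration.

```python
import random

def best_split(columns):
    """Return the 2-vs-2 grouping with strictly the most supporting columns, else None."""
    counts = {"12|34": 0, "13|24": 0, "14|23": 0}
    for s1, s2, s3, s4 in columns:
        if s1 == s2 and s3 == s4 and s1 != s3:
            counts["12|34"] += 1
        elif s1 == s3 and s2 == s4 and s1 != s2:
            counts["13|24"] += 1
        elif s1 == s4 and s2 == s3 and s1 != s2:
            counts["14|23"] += 1
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[0][0] if ranked[0][1] > ranked[1][1] else None

def bootstrap_support(seqs, split, replicates=1000, seed=0):
    """Fraction of bootstrap replicates in which `split` is the best grouping."""
    columns = list(zip(*seqs))          # one tuple of four bases per alignment column
    rng = random.Random(seed)           # seeded so the run is repeatable
    hits = 0
    for _ in range(replicates):
        sample = rng.choices(columns, k=len(columns))  # resample with replacement
        if best_split(sample) == split:
            hits += 1
    return hits / replicates

short_alignment = ["ACATACG", "GTATACG", "GCACATG", "GCACACA"]            # P, Q, R, S
long_alignment = ["ACATGTCAGACG", "GTATGTCAGACG",
                  "GCACACTGAATG", "GCACACTGAACA"]                          # W, X, Y, Z

weak = bootstrap_support(short_alignment, "12|34")
strong = bootstrap_support(long_alignment, "12|34")
print(weak, strong)
```

The grouping backed by one column disappears from any replicate that happens to miss that column, so its support is only moderate; the grouping backed by six columns survives almost every replicate. This mirrors the basic principle above: more change along a branch gives a stronger, more testable signal.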
Applications of phylogenetics
1. Forensics: Did a patient’s HIV infection result from an invasive dental procedure performed by an HIV+ dentist?
So what do the results mean?
• 2 of 3 patients closer to dentist than to local controls. Statistical significance? More powerful analyses?
• Do we have enough data to be confident in our conclusions? What additional data would help?
• If we determine that the dentist’s virus is linked to those of patients E and G, what are possible interpretations of this pattern? How could we test between them?
Applications of phylogenetics
2. Conservation: How much gene flow is there among local populations of island foxes off the coast of California?
Wayne, R. K., Morin, P. A. 2004. Conservation Genetics in the New Molecular Age. Frontiers in Ecology and the Environment 2: 89-97. (ESA publication) http://bioquest.org/bedrock/
Applications of phylogenetics
3. Medicine: What are the evolutionary relationships among the various prion-related diseases?
Linking Sequence and Structure Enolase