1 / 18

MCB 372 #12: Tree, Quartets and Supermatrix Approaches

MCB 372 #12: Tree, Quartets and Supermatrix Approaches. J. Peter Gogarten University of Connecticut Dept. of Molecular and Cell Biology. Collaborators: Olga Zhaxybayeva (Dalhousie) Jinling Huang (ECU) Tim Harlow (UConn) Pascal Lapierre (UConn) Greg Fournier (UConn).

cerin
Télécharger la présentation

MCB 372 #12: Tree, Quartets and Supermatrix Approaches

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. MCB 372 #12: Tree, Quartets and Supermatrix Approaches J. Peter GogartenUniversity of ConnecticutDept. of Molecular and Cell Biology Collaborators: Olga Zhaxybayeva(Dalhousie)Jinling Huang(ECU) Tim Harlow (UConn)Pascal Lapierre (UConn)Greg Fournier (UConn) Edvard Munch, The Dance of Life (1900) Fundedthrough the NASA Exobiology and AISR Programs, and NSF Microbial Genetics

  2. In the Felsenstein Zone A B A B “long branches attract” 0.8 0.8 0.1 0.1 0.1 C D “true” tree C D inferred tree A C D B

  3. Protpars reconstructions A C B D A C D B

  4. ML reconstructions with alignment step

  5. the two longest branches join together seq. from A seq. from D seq. from C seq. from B seq. from A True Tree: seq. from C seq. from D seq. from B long branch attraction artifact 100% bootstrap support for bipartition (AD)(CB) What could you do to investigate if this is a possible explanation? use only slow positions, use an algorithm that corrects for ASRV

  6. Consensus of all trees from all bootstrap samples +----------------------------------2 P_mobilis | | +------6 T_petrophila | +134.0-| | +675.0-| +------7 T_RQ2 | | | | +-60.0-| +-------------5 T_maritima | | | +------| +--------------------4 T_lettingae | | +------3 Thermosipho +--------------206.0-| +------1 F_nodosum 2 P_mobilis 3 Thermosipho 1 F_nodosum 4 T_lettingae 5 T_maritima 7 T_RQ2 6 T_petrophila

  7. Consensus of all consensus trees +------6 T_petrophila +318.0-| +745.0-| +------7T_RQ2 | | +330.0-| +-------------5T_maritima | | +------| +--------------------4 T_lettingae | | | | +------3 Thermosipho | +--------------562.0-| | +------1 F_nodosum | +----------------------------------2 P_mobilis 2 P_mobilis 3 Thermosipho 1 F_nodosum 4 T_lettingae 5 T_maritima 7 T_RQ2 6 T_petrophila

  8. Consensus of all collapsed (<95%) consensus trees +----------------------------------2 P_mobilis | | +------6 T_petrophila | +134.0-| | +675.0-| +------7 T_RQ2 | | | | +-60.0-| +-------------5 T_maritima | | | +------| +--------------------4 T_lettingae | | +------3 Thermosipho +--------------206.0-| +------1 F_nodosum If you still have difficulties with tree do the tree tests at http://www.sciencemag.org/cgi/content/full/310/5750/979/DC1

  9. core METAGENOME Strain-specific Dispensable Welch et al, 2002 Edwards-Venn cogwheel A.W.F. Edwards 1998 Dispensable + + + Pan-genome

  10. Genomic Islands Binnewies, Motro et al., Funct. Integr. Genomics (2006) 6: 165–185

  11. Gene frequency in a typical genome • Pick a random gene from any of the 293 genomes • Search in how many genomes this gene is present • Sampling of 15,000 genes (Accessory pool) (Character genes) (Extended Core) F(x) = sum [ An*exp(Kn*x)]

  12. Kézdy-Swinbourne Plot Novel genes after looking in x + ∆x genomes ~230 novel genes per genome If f(x)=K+A • exp(-k•x), then f(x+∆x)=K+A • exp(-k•(x+∆x)). Through elimination of A: f(x+∆x)=exp(-k • ∆x) • f(x) + K’ And for x, f(x)K, f(x+∆x)K Novel genes after looking in x genomes only values with x ≥ 80 genomes were included Even after comparing to a very large (infinite) number of bacterial genomes, on average, each new genome will contain about 230 genes that do not have a homolog in the other genomes.

  13. Gene frequency in individual genomes Accessory Pool(mainly genes acquired form the mobilome) Character Genes ca. 25% ca.70% ca. 5% Extended Core Approximate number of genes sampled in 200 bacterial genomes: 25,160 core genes 453,781 extended core genes 156,259 accessory genes

  14. The Phylogenetic Position of Thermotoga (a) concordant genes, (b) all genes & according to 16S (c) according to phylogenetically discordant genes Gophna, U., Doolittle, W.F. & Charlebois, R.L.: Weighted genome trees: refinements and applications. J. Bacteriol. (2005)

  15. From: Delsuc F, Brinkmann H, Philippe H. Phylogenomics and the reconstruction of the tree of life. Nat Rev Genet. 2005 May;6(5):361-75.

  16. Supertreevs. Supermatrix From: Alan de Queiroz John Gatesy: The supermatrix approach to systematics Trends Ecol Evol. 2007 Jan;22(1):34-41 Schematic of MRP supertree (left) and parsimony supermatrix (right) approaches to the analysis of three data sets. Clade C+D is supported by all three separate data sets, but not by the supermatrix. Synapomorphies for clade C+D are highlighted in pink. Clade A+B+C is not supported by separate analyses of the three data sets, but is supported by the supermatrix. Synapomorphies for clade A+B+C are highlighted in blue. E is the outgroup used to root the tree.

  17. B) Generate 100 datasets using Evolver with certain amount of HGTs A) Template tree C) Calculate 1 tree using the concatenated dataset or 100 individual trees D) Calculate Quartet based tree using Quartet Suite Repeated 100 times…

  18. Supermatrix versus Quartet based Supertree inset: simulated phylogeny

More Related