Genes to Trees Daniel Ayres and Adam Bazinet

Presentation Transcript


  1. Genes to Trees
     Daniel Ayres and Adam Bazinet
     CMSC858P - Project 2 Proposal

  2. Phylogenetic tree reconstruction: “Genes to Trees”
     • GenBank
     • Data collection
     • Data curation
     • Multiple sequence alignment (ClustalW, Muscle, MAFFT)
     • Phylogenetic analysis (PAUP, MrBayes, GARLI)
     • Visual inspection and post-processing

  3. How does it work?
     • User inputs:
        • Set of DNA or amino acid sequences
        • Taxonomic constraints
     • Homologous sequences obtained from GenBank
     • Smaller groups eliminated
     • Multiple alignment of each group made
     • Uninformative columns removed
     • “Super-matrix” of all sequences created
     • Phylogenetic analysis performed
     • Output:
        • Phylogenetic tree of closely related organisms
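
The group-filtering, column-removal, and super-matrix steps on slide 3 can be sketched roughly as below. This is a minimal illustration in plain Python, not the authors' Perl/BioPerl implementation. The function names, the dict-based alignment type, and the `?` padding for taxa missing from a gene are all hypothetical. The slide does not define "uninformative", so the sketch assumes it means a column with fewer than two distinct non-gap states.

```python
from typing import Dict, List

# Assumed representation: taxon name -> aligned sequence (all equal length).
Alignment = Dict[str, str]

GAP_CHARS = set("-?")


def drop_small_groups(groups: List[Alignment], min_taxa: int) -> List[Alignment]:
    """Keep only groups of homologous sequences with at least min_taxa members."""
    return [g for g in groups if len(g) >= min_taxa]


def remove_uninformative_columns(alignment: Alignment) -> Alignment:
    """Drop columns that show fewer than two distinct non-gap states.

    This is an assumed definition of "uninformative"; a stricter
    parsimony-informative test could be substituted here.
    """
    taxa = list(alignment)
    if not taxa:
        return {}
    length = len(alignment[taxa[0]])
    if any(len(alignment[t]) != length for t in taxa):
        raise ValueError("all sequences in an alignment must have equal length")
    keep = []
    for col in range(length):
        states = {alignment[t][col].upper() for t in taxa} - GAP_CHARS
        if len(states) >= 2:
            keep.append(col)
    return {t: "".join(alignment[t][c] for c in keep) for t in taxa}


def build_supermatrix(alignments: List[Alignment], missing: str = "?") -> Alignment:
    """Concatenate per-gene alignments, padding taxa absent from a gene."""
    taxa = sorted({t for aln in alignments for t in aln})
    parts: Dict[str, List[str]] = {t: [] for t in taxa}
    for aln in alignments:
        width = len(next(iter(aln.values()))) if aln else 0
        for t in taxa:
            parts[t].append(aln.get(t, missing * width))
    return {t: "".join(chunks) for t, chunks in parts.items()}
```

In this sketch each gene's alignment is filtered on its own and then passed to `build_supermatrix`, so every taxon ends up with one row of equal length that a phylogenetic analysis program could take as input after conversion to its file format.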

  4. Is it feasible?
     • Scripting will be done with Perl
     • Extensive use of BioPerl libraries
        • Collection of modules for bioinformatics programming
        • Accessing sequence data from local and remote databases
        • Manipulating individual sequences
        • Searching for similar sequences
        • Creating and manipulating sequence alignments

  5. Why is this relevant?
     • Results can serve as a starting point for further analysis
     • Multiple analyses can be run in parallel
     • Workflow is modular
     • A step towards robust, high-throughput phylogenetics
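
One way to picture the "modular" and "run in parallel" points on slide 5 is a pipeline built from interchangeable stage functions, with independent datasets processed concurrently. The sketch below is plain Python rather than the Perl the proposal plans to use. `run_pipeline`, `run_in_parallel`, and the stage signature are hypothetical names, and a thread pool is chosen on the assumption that each stage mostly waits on an external program.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List

# A stage takes the output of the previous stage and returns new data.
Stage = Callable[[Any], Any]


def run_pipeline(data: Any, stages: List[Stage]) -> Any:
    """Apply each stage in order; swapping a stage changes one step only."""
    for stage in stages:
        data = stage(data)
    return data


def run_in_parallel(
    datasets: Dict[str, Any], stages: List[Stage], max_workers: int = 4
) -> Dict[str, Any]:
    """Run the same pipeline on several independent datasets concurrently."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            name: pool.submit(run_pipeline, data, stages)
            for name, data in datasets.items()
        }
        # result() re-raises any exception from the worker thread.
        return {name: future.result() for name, future in futures.items()}
```

Because each stage is just a function, an alignment or tree-inference step could be replaced by a wrapper around a different tool without touching the rest of the pipeline.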
