
Nicola Segata and Nick Loman Principal Investigator Laboratory of Computational Metagenomics Centre for Integrative Bio

Web Valley 2014: 16S sequencing for microbiome studies. Nicola Segata and Nick Loman, Principal Investigator, Laboratory of Computational Metagenomics, Centre for Integrative Biology, University of Trento, Italy. The human microbiome: 10x more microbial than human cells.



Presentation Transcript


  1. Web Valley 2014: 16S sequencing for microbiome studies. Nicola Segata and Nick Loman, Principal Investigator, Laboratory of Computational Metagenomics, Centre for Integrative Biology, University of Trento, Italy

  2. The human microbiome
     • 10x more microbial than human cells
     • 1M times as many microbes inside each of us as humans on earth
     • 100x more microbial than human genes
     Nature 486(7402)
     Who's there? What are they doing?
     Scientific American, May 2012
     Metagenomics:
     • Study of uncultured microorganisms from the environment, which can include humans or other living hosts
     • Focus on taxonomic and functional characteristics of the total collection of microorganisms within a community
     • Main experimental tool is high-throughput sequencing: ~10M short (~100 nt) reads per dataset

  3. 16S sequencing
     Liu, Bo, et al. "Accurate and fast estimation of taxonomic profiles from metagenomic shotgun sequences." BMC Genomics 12.Suppl 2 (2011): S4.
     • PROS:
       • Cost-effective
       • Avoids non-bacterial contamination
       • The resulting dataset is reasonable in size and complexity
       • Mature analysis software available
       • Can potentially catch low abundance bacteria
     • CONS:
       • Not genome-wide (so no metabolic potential)
       • Limited taxonomic resolution
       • Not effective for pathogen profiling
       • Cannot catch viruses and eukaryotes
       • Several (usually underestimated) biases
       • Almost impossible cross-study comparisons

  4. 16S-based “metagenomics”
     PCR to amplify the single 16S rRNA marker gene
     [Figure labels: Samples; Microbes; Classify sequence → microbe; Counts; variable regions V2 and V6. Image credit: George Rice, Montana State University]

  5. The ribosome
     Ribosomes are the universal machinery that translates the genetic code into proteins.
     • The ribosomal machinery is composed of:
       • Two subunits
       • Several proteins
       • mRNAs
       • tRNAs
       • rRNA (5S, 16S, 23S)

  6. The ribosome

  7. The ribosome

  8. The 16S rRNA (image credit: Center for Molecular Biology of RNA, University of California)

  9. The 16S rRNA gene 1/3
     This annotation was performed on a representative E. coli 16S sequence.
     Baker, G. C., J. J. Smith, and Donald A. Cowan. Journal of Microbiological Methods 55.3 (2003): 541-555.

  10. The 16S rRNA gene 2/3

  11. The 16S rRNA gene 3/3

  12. The 16S rRNA
      [Figure labels: variable regions V1 to V9. Image credit: Center for Molecular Biology of RNA, University of California]

  13. 16S: The 530 loop structure of six species

  14. The 16S gene: statistical view of the variable regions
      Variability within the 16S rRNA gene [figure labels: V1 to V9]
      Andersson, Anders F., et al. PLoS ONE 3.7 (2008)
      Claesson, Marcus J., et al. Nucleic Acids Research 38.22 (2010)
      • Which HTM would you choose?
        • 454 is historically well suited (~400 nt reads → 3 regions), with a good cost/throughput trade-off
        • Illumina (HiSeq) is not optimal (shorter reads, unnecessarily high throughput)
        • Illumina MiSeq and IonTorrent can be a nice compromise
      Multiple variable regions can be targeted simultaneously (if you have long enough reads!)
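A "statistical view" of variability can be illustrated by scoring each column of a multiple-sequence alignment. The sketch below uses Shannon entropy per column, which is one common choice; the slide does not say which measure the cited papers use, so treat the measure as an assumption. The four aligned sequences in `ALIGNMENT` are made up for illustration.

```python
import math
from collections import Counter

def column_entropy(column):
    """Shannon entropy (in bits) of the bases observed in one alignment column."""
    counts = Counter(column)
    n = len(column)
    return sum((c / n) * math.log2(n / c) for c in counts.values())

def variability_profile(alignment):
    """Per-position entropy for a list of equal-length aligned sequences."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return [column_entropy(col) for col in zip(*alignment)]

# Hypothetical toy alignment: positions 0-2 and 5 are conserved,
# positions 3 and 4 vary between sequences.
ALIGNMENT = ["ACGUAC", "ACGUGC", "ACGACC", "ACGCUC"]

profile = variability_profile(ALIGNMENT)
for position, score in enumerate(profile):
    print(position, round(score, 2))
```

Conserved columns score 0 bits and a column showing all four bases equally often scores 2 bits, so peaks in such a profile correspond to the variable regions and the flat stretches to candidate primer sites.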

  15. Which HTM would you choose?
      Throughput:
      • Very low (~1 seq / sample)
      • Medium (~3k seqs / sample)
      • High (~50k seqs / sample)

  16. The data revolution is now

  17. One of the challenges: which technology? http://flxlexblog.files.wordpress.com/

  18. One of the challenges: which technology? Mol Ecol Resour. 2011 Sep;11(5):759-69

  19. One of the challenges: which technology? Mol Ecol Resour. 2011 Sep;11(5):759-69

  20. In silico primer validation/testing
      The idea: use the available (taxonomically labeled) 16S sequences to check which organisms are targeted by the primers
      • http://www.arb-silva.de/search/testprobe (to test single probes)
      • http://www.arb-silva.de/search/testprime (to test pairs of probes, below)

  21. An example of “universal” primers
      Fw: CCTACGGGRSGCAGCAG
      Rev: ATTACCGCGGCTGCT
      (our primers)

  22. An example of “universal” primers
      • Archaea, 49.2% matches
      • Bacteria, 94.7% matches
      • Proteobacteria, 97.1% matches
      • WS6 candidate division, 2.9% matches
      BE AWARE: universal primers do not exist, and the choice of primers is going to bias your study no matter what!
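The forward primer in slide 21 contains the IUPAC degenerate bases R (A or G) and S (C or G), so testing it against a sequence means expanding those codes first. The sketch below shows that idea with a regular expression. It is not how the SILVA TestProbe/TestPrime tools work: it assumes exact matching with no mismatches allowed, it only handles a reverse primer without degenerate bases (true for the one above), and the three `TOY_SEQUENCES` entries are invented for illustration.

```python
import re

# IUPAC nucleotide codes written as regex character classes (DNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

FORWARD = "CCTACGGGRSGCAGCAG"  # from slide 21
REVERSE = "ATTACCGCGGCTGCT"    # from slide 21

def primer_to_regex(primer):
    """Compile a primer with degenerate bases into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))

def reverse_complement(seq):
    """Reverse complement of a sequence made of A, C, G, T only."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def amplicon(seq, forward=FORWARD, reverse=REVERSE):
    """Return the region between the two primer sites, or None if either
    primer fails to match exactly."""
    fw_hit = primer_to_regex(forward).search(seq)
    if fw_hit is None:
        return None
    # The reverse primer anneals to the opposite strand, so on the given
    # strand we look for its reverse complement downstream of the forward site.
    rev_start = seq.find(reverse_complement(reverse), fw_hit.end())
    if rev_start == -1:
        return None
    return seq[fw_hit.end():rev_start]

# Hypothetical sequences, not taken from any database.
TOY_SEQUENCES = {
    "toy_A": "TTCCTACGGGAGGCAGCAGTGGGGAATATTAGCAGCCGCGGTAATAC",
    "toy_B": "GGCCTACGGGGCGCAGCAGTAGGGAATCTTAGCAGCCGCGGTAATCA",
    "toy_C": "TTCCTACGGGTGGCAGCAGTGGGGAATATTAGCAGCCGCGGTAATAC",  # T at the R position
}

for name, seq in TOY_SEQUENCES.items():
    print(name, amplicon(seq))
```

Running the same check over a taxonomically labeled collection and counting hits per taxon would give per-group match percentages of the kind listed in slide 22.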

  23. Validation of hypervariable regions using a mock community. Ward, Doyle V., et al. PLoS ONE 7.6 (2012): e39315.

  24. Variability within hypervariable regions

  25. A high level 16S analysis workflow Hamady, Micah, and Rob Knight. Genome research 19.7 (2009): 1141-1152.

  26. Schematic 16S analysis workflow
      Steps: Input dataset (one sample) → Multiple-sequence alignment → Operational taxonomic unit (OTU) definition → lookup in a 16S DB with taxonomic information
      Input dataset (one sample):
        CAAGCCGAAUGCAGCUAUUC
        CAAGCCUGAUGCAGCCAUGC
        CAUGCCUGAGACAGCCUUGC
        CAAGCCUGAUGCAGCCAUGC
        CAAGCCGAAUGCAGCUAUCC
        CAAGGCUGAGACAGCCUUGC
        CAAGCCUGAUGCUGCCAUGC
        CAAGCCGAAUGCAGCUAUGC
        CAAGCCGGAGACAGCCUUGC
        AAAGCCUGAUGCAGCCAUGC
      Multiple-sequence alignment:
        CAAGCCGAAUGCAGCUAUUC
        CAAGCCUGAUGCAGCCAUGC
        CAUGCCUGAGACAGCCUUGC
        CAAGCCUGAUGCAGCCAUGC
        CAAGCCGAAUGCAGCUAUCC
        CAAGGCUGAGACAGCCUUGC
        CAAGCCUGAUGCUGCCAUGC
        CAAGCCGAAUGCAGCUAUGC
        CAAGCCGGAGACAGCCUUGC
        CAAGCCUGAUGCAGCCAUGC
      OTU definition (reads grouped by similarity; figure labels OTU_1, OTU_2, OTU_3):
        CAAGCCGAAUGCAGCUAUUC
        CAAGCCGAAUGCAGCUAUCC
        CAAGCCGAAUGCAGCUAUGC
        CAUGCCUGAGACAGCCUUGC
        CAAGGCUGAGACAGCCUUGC
        CAAGCCGGAGACAGCCUUGC
        CAAGCCUGAUGCAGCCAUGC
        CAAGCCUGAUGCAGCCAUGC
        CAAGCCUGAUGCUGCCAUGC
        AAAGCCUGAUGCAGCCAUGC
      OTU abundances: OTU_1 → 30%, OTU_2 → 30%, OTU_3 → 40%
      Taxonomic assignment (16S DB with taxonomic information): OTU_1 → E. coli, OTU_2 → S. aureus, OTU_3 → S. pneumoniae
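The OTU definition step can be sketched on the ten input reads from slide 26. This is a minimal illustration and not the method of any particular 16S pipeline: it assumes the reads are already aligned and of equal length, uses a greedy centroid clustering, and sets a threshold of 2 mismatches (90% identity on 20 nt), chosen only because the toy reads are so short. The OTU numbering follows the order in which clusters are first seen, so it need not match the labels in the slide.

```python
from collections import Counter

# The ten input reads shown in slide 26.
READS = [
    "CAAGCCGAAUGCAGCUAUUC",
    "CAAGCCUGAUGCAGCCAUGC",
    "CAUGCCUGAGACAGCCUUGC",
    "CAAGCCUGAUGCAGCCAUGC",
    "CAAGCCGAAUGCAGCUAUCC",
    "CAAGGCUGAGACAGCCUUGC",
    "CAAGCCUGAUGCUGCCAUGC",
    "CAAGCCGAAUGCAGCUAUGC",
    "CAAGCCGGAGACAGCCUUGC",
    "AAAGCCUGAUGCAGCCAUGC",
]

MAX_MISMATCHES = 2  # assumption: 90% identity on these 20 nt toy reads

def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))

def greedy_otus(reads, max_mismatches):
    """Assign each read to the first centroid within the threshold;
    a read that matches no centroid becomes a new centroid."""
    centroids, labels = [], []
    for read in reads:
        for index, centroid in enumerate(centroids):
            if hamming(read, centroid) <= max_mismatches:
                labels.append(index)
                break
        else:
            centroids.append(read)
            labels.append(len(centroids) - 1)
    return centroids, labels

centroids, labels = greedy_otus(READS, MAX_MISMATCHES)
counts = Counter(labels)
for index in range(len(centroids)):
    share = 100 * counts[index] / len(READS)
    print(f"OTU_{index + 1}: {counts[index]} reads ({share:.0f}%)")
```

On these reads the sketch finds three clusters of 3, 4 and 3 reads, which is the same 30% / 30% / 40% split as in the slide up to the order of the labels. The final step, naming each OTU, would compare each centroid against a reference 16S database.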

  27. Intro to diversity analysis
      • Alpha-diversity
        • A measure of how diverse (complex) a microbial community is
        • “Within sample” diversity
        • Species richness (i.e. the number of species) is a widely used alpha-diversity index
      • Beta-diversity
        • A measure of how different two microbial communities are
        • “Between sample” diversity
        • The inverse of the number of shared species is one possibility to estimate beta-diversity
      Jurasinski, G., Retzer, V., & Beierkuhnlein, C. (2009). Oecologia, 159(1), 15-26.
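Both indices named in slide 27 are easy to compute from per-sample OTU counts. The sketch below implements species richness and the inverse of the number of shared species as described above. It also adds the Jaccard distance, which is not mentioned in the slide and is included only as a familiar, bounded alternative. The two samples and their counts are hypothetical.

```python
def present(sample):
    """Set of OTUs observed at least once in a sample (dict of OTU -> count)."""
    return {otu for otu, count in sample.items() if count > 0}

def richness(sample):
    """Alpha-diversity as species richness: number of OTUs observed."""
    return len(present(sample))

def inverse_shared(sample_a, sample_b):
    """Beta-diversity as 1 / (number of shared OTUs); infinite if none are shared."""
    shared = len(present(sample_a) & present(sample_b))
    return float("inf") if shared == 0 else 1 / shared

def jaccard_distance(sample_a, sample_b):
    """1 - shared / total OTUs; an alternative index not taken from the slide."""
    a, b = present(sample_a), present(sample_b)
    if not a | b:
        return 0.0
    return 1 - len(a & b) / len(a | b)

# Hypothetical OTU count tables for two samples.
sample_a = {"OTU_1": 3, "OTU_2": 3, "OTU_3": 4}
sample_b = {"OTU_2": 5, "OTU_3": 1, "OTU_4": 4, "OTU_5": 0}

print("richness A:", richness(sample_a))
print("richness B:", richness(sample_b))
print("inverse shared:", inverse_shared(sample_a, sample_b))
print("Jaccard distance:", jaccard_distance(sample_a, sample_b))
```

Note that OTU_5 has a count of zero in the second sample and therefore does not contribute to its richness. Both functions here ignore abundances and look only at presence or absence.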

  28. Practical tutorial time http://nickloman.github.io
