Transcriptome analysis


Presentation Transcript


  1. BIT 815: Analysis of Deep Sequencing Data
     Transcriptome analysis
     • With a reference
     • Challenging due to the size and complexity of datasets
     • Many tools available, driven by biomedical research
     • GATK and R/Bioconductor offer many options
     • Start by mapping reads to the reference genome with a mapping/alignment tool
       – deal with exon-intron junctions
     • Reconstruct transcripts from mapped reads
       – deal with alternative splicing products
     • Calculate relative abundance of different transcripts
     • Estimate biological significance based on annotation
     • Example tools: Bowtie/TopHat, Cufflinks, Myrna
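The "calculate relative abundance" step can be illustrated with a minimal sketch. It assumes read counts per transcript and transcript lengths are already available, and it uses transcripts per million (TPM) as a simple length-normalized measure. This is a swapped-in illustration: Cufflinks itself reports FPKM from a statistical model, and the function name and example numbers below are hypothetical.

```python
def tpm(counts, lengths):
    """Convert raw read counts to transcripts per million (TPM).

    counts  -- reads assigned to each transcript (illustrative values)
    lengths -- transcript lengths in bases, in the same order
    """
    if len(counts) != len(lengths):
        raise ValueError("counts and lengths must have the same length")
    # Reads per base: longer transcripts yield more reads at equal abundance,
    # so divide each count by the transcript length first.
    rates = [c / l for c, l in zip(counts, lengths)]
    total = sum(rates)
    if total == 0:
        return [0.0] * len(counts)
    # Rescale so that the values sum to one million.
    return [r / total * 1e6 for r in rates]


# Hypothetical example: three transcripts with different lengths.
example_counts = [100, 200, 50]
example_lengths = [1000, 4000, 500]
example_tpm = tpm(example_counts, example_lengths)
```

In this example the second transcript has the most reads but the lowest TPM, because its reads are spread over a much longer sequence.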

  2. Workflow summary from a review “From RNA-seq reads to differential expression results”, by Oshlack et al, Genome Biol 11:220, 2010. Note emphasis on statistical analysis methods; an equal emphasis should be placed on experimental design.

  3. The ‘Tuxedo’ suite of programs: Bowtie, TopHat, Cufflinks and CummeRbund
     See Trapnell et al, Nature Protocols 7:562–578, 2012 for details.

  4. • TopHat maps reads
     • Cufflinks assembles transcripts
     • Cuffmerge merges transcript data detected in different treatments
     • Cuffdiff evaluates differential expression
     • CummeRbund provides visualization tools
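The order of the steps above can be sketched as a list of command lines. The sketch only builds the commands and does not run them. The sample labels, file names, and output directory names are hypothetical. The options shown (`-o` for output directories, `-s` for the reference FASTA in cuffmerge, `-L` for condition labels in cuffdiff) follow the usage described in Trapnell et al 2012, but should be checked against the installed versions.

```python
def assembly_list(samples):
    """Lines for the text file that cuffmerge reads: one transcripts.gtf per sample."""
    return [f"{label}_clout/transcripts.gtf" for label in samples]


def tuxedo_commands(samples, genome_index, genome_fasta):
    """Build TopHat -> Cufflinks -> Cuffmerge -> Cuffdiff command lines.

    samples      -- dict mapping a condition label to a FASTQ path (hypothetical)
    genome_index -- Bowtie index prefix for the reference genome
    genome_fasta -- reference genome FASTA file
    """
    cmds = []
    for label, fastq in samples.items():
        # Map reads; TopHat writes accepted_hits.bam into its output directory.
        cmds.append(["tophat", "-o", f"{label}_thout", genome_index, fastq])
        # Assemble transcripts from the mapped reads of this sample.
        cmds.append(["cufflinks", "-o", f"{label}_clout",
                     f"{label}_thout/accepted_hits.bam"])
    # Merge per-sample assemblies; assemblies.txt must hold assembly_list(samples).
    cmds.append(["cuffmerge", "-s", genome_fasta, "assemblies.txt"])
    labels = list(samples)
    # Test for differential expression against the merged annotation.
    cmds.append(["cuffdiff", "-o", "diff_out", "-L", ",".join(labels),
                 "merged_asm/merged.gtf"]
                + [f"{label}_thout/accepted_hits.bam" for label in labels])
    return cmds


commands = tuxedo_commands({"C1": "C1.fq", "C2": "C2.fq"}, "genome", "genome.fa")
```

Each entry in `commands` could be passed to `subprocess.run` once the tools are installed. CummeRbund is an R package and would be run separately on the `diff_out` directory.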

  5. Why merge data across treatments?

  6. Differential transcript abundance mechanisms

  7. Transcriptome analysis
     • Without a reference
     • First step is assembly
     • Transcriptome assembly pipelines
       • Velvet/Oases – Oases is a post-assembly processor for Velvet
       • Trans-ABySS (BCGSC) – based on the ABySS parallel assembler
       • Rnnotator – based on Velvet
       • Trinity (Broad Institute) – a set of three programs
     • Common strategy: assembly at multiple k-values, then merging of the resulting contigs, followed by refinement
     • Once an assembly is available, continue with analysis as before
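The merging step of the multiple-k strategy can be illustrated with a toy sketch. It assumes the contigs from each k-value are already available as plain strings. It pools them and drops any contig that is identical to, or fully contained in, a longer contig on either strand. This is not how Oases, Trans-ABySS, or Rnnotator merge assemblies; those tools use more elaborate methods. The function names and sequences are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def merge_contigs(assemblies):
    """Pool contigs from several k-values and remove redundant ones.

    assemblies -- dict mapping a k-value to a list of contig strings
    """
    # Pool all contigs, removing exact duplicates, longest first.
    pooled = sorted({c for contigs in assemblies.values() for c in contigs},
                    key=lambda c: (-len(c), c))
    kept = []
    for contig in pooled:
        rc = reverse_complement(contig)
        # Skip contigs contained in an already kept contig on either strand.
        if any(contig in longer or rc in longer for longer in kept):
            continue
        kept.append(contig)
    return kept


# Hypothetical contigs from two assemblies with different k-values.
example_assemblies = {
    21: ["ACGTACGTTT", "GGGAAA"],
    25: ["ACGTACGTTTCC", "TTTCCC"],
}
merged = merge_contigs(example_assemblies)
```

Here the shorter k=21 contig is absorbed by its longer k=25 extension, and `TTTCCC` is dropped because it is the reverse complement of `GGGAAA`.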

  8. After Transcriptome Assembly…
     • Some amount of analysis of differential splicing versus differential promoter activity is possible, but conclusions may be less robust in the absence of a reference
     • The fraction of the total number of genes that can be discovered by RNA-seq depends on the diversity of tissue types and developmental stages analyzed, as well as the depth of sequencing
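The dependence of gene discovery on sequencing depth can be illustrated with a toy model. It assumes that the read count for each gene follows a Poisson distribution whose mean is the depth times the gene's share of the transcript pool, and that a gene counts as detected once it has at least one read. Both assumptions, and the abundance values, are illustrative and are not taken from the slides.

```python
import math


def expected_fraction_detected(abundances, n_reads):
    """Expected fraction of genes with at least one read at a given depth.

    abundances -- relative expression levels, one per gene (arbitrary units)
    n_reads    -- total number of sequenced reads
    """
    total = sum(abundances)
    # Under Poisson sampling, P(at least one read) = 1 - exp(-expected count).
    probs = [1.0 - math.exp(-n_reads * a / total) for a in abundances]
    return sum(probs) / len(probs)


# Hypothetical expression levels spanning five orders of magnitude.
example_abundances = [1000.0, 100.0, 10.0, 1.0, 0.1, 0.01]
depths = [100, 10_000, 1_000_000]
detected = [expected_fraction_detected(example_abundances, n) for n in depths]
```

Under this model the detected fraction rises with depth but with diminishing returns, because the remaining undetected genes are the rarest ones. Genes that are not expressed in the sampled tissues are never detected at any depth, which is why tissue and stage diversity matters as well.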

  9. 330 million SOLiD reads from a human cell line detect only about 67% of all annotated transcripts in the human genome.
     Source: Labaj et al, “Characterization and improvement of RNA-Seq precision in quantitative transcript expression profiling”, Bioinformatics 27:i383-91, 2011

  10. Transcriptome analysis with RSEM (RNA-Seq by Expectation Maximization)
      Li & Dewey, BMC Bioinformatics 12:323, 2011
      (a) Allows estimation of transcript abundance without a reference genome, based on alignments to assembled transcripts; the transcripts can instead be taken from a reference genome sequence if one is available.
      (b) Uses the Bowtie aligner by default, and accounts for reads that map to multiple locations in the reference transcript collection.
      (c) For each sample, files of estimated gene and isoform abundance are produced, along with SAM files of alignments.
      (d) The gene and isoform abundance files can be used to evaluate differential expression using tools from R and Bioconductor.
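How expectation maximization can handle reads that map to multiple transcripts is shown in the toy sketch below. It is a deliberately simplified model, not RSEM's actual implementation: it ignores transcript length, read quality, and fragment-size information, all of which RSEM's model includes. The function name and the read data are hypothetical.

```python
def em_abundance(read_compat, n_transcripts, n_iter=100):
    """Estimate relative transcript abundances with a toy EM algorithm.

    read_compat   -- for each read, the list of transcript indices it aligns to
    n_transcripts -- number of transcripts in the reference collection
    """
    # Start from equal abundances.
    theta = [1.0 / n_transcripts] * n_transcripts
    for _ in range(n_iter):
        counts = [0.0] * n_transcripts
        # E-step: split each read among its transcripts in proportion
        # to the current abundance estimates.
        for compat in read_compat:
            if not compat:
                continue  # unaligned read, contributes nothing
            denom = sum(theta[t] for t in compat)
            for t in compat:
                counts[t] += theta[t] / denom
        total = sum(counts)
        if total == 0:
            return theta
        # M-step: new abundances are the normalized fractional counts.
        theta = [c / total for c in counts]
    return theta


# Hypothetical data: 6 reads unique to transcript 0, 2 unique to
# transcript 1, and 4 reads that align equally well to both.
example_reads = [[0]] * 6 + [[1]] * 2 + [[0, 1]] * 4
estimates = em_abundance(example_reads, n_transcripts=2)
```

For this input the estimates converge to about 0.75 and 0.25: the ambiguous reads are shared in the 3:1 ratio suggested by the uniquely mapped reads, rather than being discarded or split evenly.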
