
Transcriptome Analysis

Transcriptome Analysis. Technology and Analysis overview. Roy Williams PhD; Burnham Institute for Medical Research. Measuring Gene Expression.


Presentation Transcript


  1. Transcriptome Analysis: Technology and Analysis overview. Roy Williams PhD; Burnham Institute for Medical Research

  2. Measuring Gene Expression Idea: measure the amount of mRNA to see which genes are being expressed in (used by) the cell. Measuring protein would be more direct, but is currently harder.

  3. General assumption of microarray technology • Use mRNA transcript abundance level as a measure of expression for the corresponding gene • Transcript abundance is assumed to be proportional to the degree of gene expression

  4. How to measure RNA abundance • Several different approaches with similar themes • Illumina bead array – highly redundant oligo array • Affymetrix GeneChip – highly redundant oligo array • Nimblegen – highly redundant oligo array • 2-colour array (very long cDNA; low redundancy) • SAGE (random sequencing of cDNA library)

  5. The Illumina BeadArray Technology • Highly redundant (~50 copies of each bead) • 60mer oligos • Absolute expression • Each array is deconvoluted using a colour coding tag system • Human, Mouse, Rat, Custom

  6. Figure 1. Design of a randomly assembled gene-specific probe array (figure labels: x,y array coordinate). Kenneth Kuhn et al. Genome Res. 2004; 14: 2347-2356

  7. Affymetrix Technology • Highly redundant (~25 short oligos per gene) • Absolute expression • PM-MM (perfect match/mismatch) oligo system valuable for detecting cross-hybridization • Human, Mouse, E. coli, Yeast, … • Affymetrix and Illumina arrays have been systematically compared

  8. Spotted Arrays • Low redundancy • cDNA and oligo • Two dyes Cy5/Cy3 • Relative expression • Cost and custom
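The "relative expression" reported by a two-dye spotted array is commonly summarised as a log2 ratio of the two channel intensities. The sketch below is only an illustration of that idea; the function name, the choice of Cy5 as the numerator, and the intensity values are assumptions, and real pipelines would also apply background correction and normalisation first.

```python
import math

def log2_ratio(cy5_intensity, cy3_intensity):
    """Relative expression of one spot as log2(Cy5 / Cy3).

    Positive values mean the Cy5-labelled sample has more signal,
    negative values mean the Cy3-labelled sample has more signal.
    """
    return math.log2(cy5_intensity / cy3_intensity)

# Hypothetical spot intensities, not taken from the presentation
up_regulated = log2_ratio(800.0, 200.0)    # four-fold higher in Cy5
down_regulated = log2_ratio(200.0, 800.0)  # four-fold lower in Cy5
unchanged = log2_ratio(500.0, 500.0)       # equal signal in both channels
```

Working on the log scale makes a four-fold increase and a four-fold decrease symmetric around zero, which is why ratios are rarely analysed raw.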

  9. Single Colour Labelling

  10. Corporate Cartoons • Measuring Gene Expression • http://www.affymetrix.com/corporate/outreach/lesson_plan/downloads/function.swf

  11. Microarrays in action (figure labels: off, on)

  12. Areas Being Studied with Microarrays • Differential gene expression between two (or more) sample types • Similar gene expression across treatments • Tumor sub-class identification using gene expression profiles • Classification of malignancies into known classes • Identification of “marker” genes that characterize different cell types • Identification of genes associated with clinical outcomes (e.g. survival)

  13. Microarray experiments: mRNA levels compared in many different contexts • Different tissues, same organism (brain vs liver) • See GNF Tissue Atlas • Same tissue, same organism (treated vs control, tumor vs non-tumor, undifferentiated vs differentiated) • Same tissue, different organisms (wild type vs knockout) • Time course experiments (effect of treatment, development)

  14. ChIP-on-chip hybridization: snapshot of transcription factors bound to locations in the genome

  15. Genome-wide data sets: considerations and complications. Factors to consider when analysing genome-wide data sets: • How were the data generated? • What method was used? Are there any technical limitations? • On how many repeats is the data set based? How reproducible are the data? • Has the error rate been estimated? If so, how high is it? • Can known and trusted examples be confirmed? • How complete is the data set? Which genes or proteins are missing? • How were the data analysed? • Do genes or proteins of interest also appear in other data sets? Complications when comparing genome-wide data sets: • Different isolates and experimental conditions have been used • Databases are not interconnected • There is no unifying data format (the .GCT format is close) • Keywords for database searches are not standardized • Some data are not readily accessible or in the public domain

  16. Microarrays and Stem Cells • Isolate and analyse as many different types of stem cells (and others) as possible • Discover structure/patterns in data • Classify/cluster genes according to their expression profile • Determine transcription circuitry

  17. Aim: • Group together (cluster) genes that behave similarly across different conditions • Define/quantify similarities • There are dozens of similarity metrics • Euclidean distance • Pearson correlation • Standard correlation
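To make the difference between two of the listed metrics concrete, here is a minimal sketch in plain Python. The gene profiles are invented for illustration and the function names are not from the presentation.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pearson_correlation(a, b):
    """Pearson correlation: compares profile shape, ignoring offset and scale."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical expression profiles across four conditions
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [11.0, 12.0, 13.0, 14.0]  # same shape as gene_a, higher baseline
gene_c = [4.0, 3.0, 2.0, 1.0]      # opposite trend to gene_a

dist_ab = euclidean_distance(gene_a, gene_b)
corr_ab = pearson_correlation(gene_a, gene_b)
corr_ac = pearson_correlation(gene_a, gene_c)
```

Here gene_a and gene_b are far apart by Euclidean distance yet perfectly correlated, so the choice of metric decides whether they end up in the same cluster.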

  18. Hierarchical Gene Clustering (figure: genes vs 153 stem cell samples)
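Hierarchical clustering can be sketched as repeatedly merging the two closest groups of genes. The version below assumes Euclidean distance and average linkage, neither of which is stated in the slides, and it uses a tiny invented data set rather than the 153 stem cell samples.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(profiles, c1, c2):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(euclidean(profiles[i], profiles[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))

def hierarchical_clusters(profiles, n_clusters):
    """Agglomerative clustering: merge the closest pair until n_clusters remain."""
    clusters = [[i] for i in range(len(profiles))]  # start with one gene per cluster
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(profiles, clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge b into a
        del clusters[b]
    return [sorted(c) for c in clusters]

# Hypothetical profiles: genes 0-1 share one pattern, genes 2-3 another
profiles = [
    [1.0, 1.1, 5.0, 5.2],
    [0.9, 1.0, 5.1, 5.0],
    [5.0, 5.1, 1.0, 0.9],
    [5.2, 4.9, 1.1, 1.0],
]
gene_groups = hierarchical_clusters(profiles, 2)
```

Recording the order and distance of each merge, instead of stopping at a fixed number of clusters, is what yields the tree (dendrogram) usually drawn beside the heat map.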

  19. Cell Cycle Gene Cluster: Guilt by Association!

  20. Cell Cycle Genes In Stem Cells

  21. K-means and NMF clustering • Classifies genes into non-overlapping groups • The number of clusters is specified by the user (k) • Unsupervised methods

  22. K-means Clustering • INPUT

  23. K-means Clustering • OUTPUT: sorted into 16 groups
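A minimal sketch of k-means on gene expression profiles follows. It assumes squared Euclidean distance and takes the starting centroids as an argument so the result is reproducible; the data and names are illustrative, and the slide's 16 groups would correspond to passing 16 starting centroids.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_profile(rows):
    """Column-wise mean of a list of equal-length profiles."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def k_means(profiles, initial_centroids, max_iter=100):
    """Assign each profile to its nearest centroid, then recompute centroids."""
    centroids = [list(c) for c in initial_centroids]
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            min(range(len(centroids)),
                key=lambda k: squared_distance(p, centroids[k]))
            for p in profiles
        ]
        if new_assignment == assignment:
            break  # converged: no gene changed cluster
        assignment = new_assignment
        for k in range(len(centroids)):
            members = [p for p, a in zip(profiles, assignment) if a == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = mean_profile(members)
    return assignment, centroids

# Hypothetical profiles for four genes across four conditions
profiles = [
    [1.0, 1.1, 5.0, 5.2],
    [0.9, 1.0, 5.1, 5.0],
    [5.0, 5.1, 1.0, 0.9],
    [5.2, 4.9, 1.1, 1.0],
]
assignment, centroids = k_means(profiles, [profiles[0], profiles[2]])
```

Each gene receives exactly one cluster label, which is the "non-overlapping groups" property, and the number of clusters is fixed by how many starting centroids the user supplies.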

  24. NMF clustering: current favourite • Output is a correlation matrix • Can see relationships between clusters • Gives QC for output • Used for: cancer classification, stem cell classification • NMF is the most accurate classification technique: proven! • Nested groups
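For readers unfamiliar with NMF (non-negative matrix factorization), the sketch below factorises a small genes-by-samples matrix V into non-negative factors W and H using the standard multiplicative update rules. It is a toy illustration under stated assumptions (pure Python, fixed random seed, invented data) and not the pipeline used in the presentation, which is not described in the slides.

```python
import random

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def frobenius_error(v, w, h):
    """Sum of squared differences between V and the product W*H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, rank, n_iter=200, seed=0, eps=1e-9):
    """Factorise V (genes x samples) as W (genes x rank) times H (rank x samples)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h

# Hypothetical non-negative expression matrix: 4 genes x 4 samples
v = [
    [6.0, 6.0, 2.0, 2.0],
    [6.0, 6.0, 2.0, 2.0],
    [2.0, 2.0, 6.0, 6.0],
    [2.0, 2.0, 6.0, 6.0],
]
w, h = nmf(v, rank=2)
error = frobenius_error(v, w, h)
# Each sample can be assigned to the factor with the largest coefficient in H
sample_groups = [max(range(2), key=lambda k: h[k][j]) for j in range(4)]
```

Because the result depends on the random starting point, NMF is typically run many times and the agreement between runs is summarised in a sample-by-sample matrix, which may be what the slide's output matrix and QC refer to.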

  25. Which types of genes are enriched in a cluster? • Idea: compare your cluster of genes with lists of genes with common properties (function, expression, location). • Find how many genes overlap between your cluster and a gene list. • Calculate the probability of obtaining the overlap by chance; this measures whether the enrichment is significant. • This analysis provides an unbiased way of detecting connections between expression and function. (Figure: overlap between our cluster and the Gene Ontology cell cycle gene list; figure labels: 0, 15000, 25, 7)
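The probability of seeing an overlap at least this large by chance is usually computed with a hypergeometric (one-sided Fisher) test. The sketch below assumes that test; the slides do not name the statistic, and all numbers in the example are hypothetical.

```python
from math import comb

def enrichment_p_value(population, annotated, cluster_size, overlap):
    """P(overlap or more annotated genes in the cluster) by chance.

    population   : total number of genes measured
    annotated    : genes in the reference list (e.g. a functional category)
    cluster_size : genes in the cluster being tested
    overlap      : genes present in both the cluster and the reference list
    """
    max_overlap = min(annotated, cluster_size)
    total = comb(population, cluster_size)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, cluster_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total

# Small hypothetical case that can be checked by hand:
# 10 genes, 4 annotated, cluster of 3, at least 2 annotated in the cluster
p_small = enrichment_p_value(10, 4, 3, 2)

# Larger hypothetical case; these numbers are illustrative assumptions
p_large = enrichment_p_value(population=15000, annotated=500,
                             cluster_size=25, overlap=7)
```

When many gene lists are tested against the same cluster, the resulting p-values should also be corrected for multiple testing before any enrichment is called significant.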

  26. Pathway Analysis: Stem Cells

  27. Zoomed IN:

  28. Demo of Ingenuity Pathway Analysis • Detects networks in your data • Allows you to look for connections between genes and drugs/small molecules • User friendly
