
Expression Data and Microarrays

CMMB, November 29, 2001. Todd Scheetz.


Presentation Transcript


  1. Expression Data and Microarrays CMMB November 29, 2001 Todd Scheetz

  2. Overview Gene expression • mRNA • protein Northern Blots RT-PCR SAGE MicroArray

  3. Gene Expression Review Transcription • generation of mRNA from genomic DNA • a complete copy is made, including both introns and exons → pre-mRNA [Diagram: genomic DNA → pre-mRNA with poly(A) tail (AAAA...)]

  4. Gene Expression Review Processing / Splicing • removal of the introns from the pre-mRNA → mature mRNA • the mature mRNA is also exported from the nucleus to the cytoplasm • alternative splicing [Diagram: one pre-mRNA (AAAA...) → several mature mRNAs (splice variants), each with a poly(A) tail (AAAA...)]

  5. Gene Expression Review Translation • takes an mRNA molecule and uses it to construct an amino acid sequence. • the ribosome is the underlying machinery used in the process of translation.
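A minimal Python sketch of the translation step described on slide 5. It assumes the standard genetic code and a reading frame that begins at the first AUG; the function name `translate` and the example sequence are illustrative and do not come from the slides.

```python
from itertools import product

# Standard genetic code, listed in the conventional U, C, A, G order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(mrna: str) -> str:
    """Translate from the first AUG up to (not including) the first stop codon."""
    mrna = mrna.upper().replace("T", "U")  # accept cDNA-style input as well
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":  # stop codon
            break
        protein.append(aa)
    return "".join(protein)

print(translate("GGAUGGCCAAAUAAGG"))
```

The ribosome reads the message one codon (three bases) at a time, which is what the stride of 3 in the loop mimics.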

  6. Measuring Gene Expression Two major differentiating factors… • Quantitative vs. Qualitative • mRNA vs. protein Most techniques can be used to determine quantitative expression levels. Ex. EST sequencing

  7. Measuring Gene Expression More sophisticated experiments… Comparing expression levels of multiple genes Comparing co-regulation or differential regulation. Ex. EST sequencing

  8. Northern Blot Measure relative expression levels of mRNA 1. mRNA isolation and purification 2. electrophorese on a gel 3. The gel is probed by hybridizing with a labeled clone for the gene under study.

  9. Northern Blot

  10. Northern Blot

  11. RT-PCR Measures relative expression of mRNA 1. Isolate and purify mRNA 2. reverse transcription 3. PCR amplification 4. run on gel and probe/hybridize

  12. RT-PCR

  13. RT-PCR Why use RT-PCR? • Can observe very low levels of expression • Requires very small amounts of mRNA The bad… • Potential expression-level skew due to non-linearity of PCR • Have to design multiple custom primers for each gene.
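One way to picture the expression-level skew mentioned on slide 13 is an idealized exponential model in which each cycle multiplies the copy number by (1 + efficiency). The model, the efficiency values, and the function name below are assumptions made for illustration; they are not taken from the slides.

```python
def amplified_copies(initial_copies: float, efficiency: float, cycles: int) -> float:
    """Idealized PCR: every cycle multiplies the copy number by (1 + efficiency)."""
    return initial_copies * (1.0 + efficiency) ** cycles

# Two hypothetical transcripts that start at the SAME abundance but amplify
# with slightly different (assumed) per-cycle efficiencies.
gene_a = amplified_copies(1000, efficiency=0.95, cycles=30)
gene_b = amplified_copies(1000, efficiency=0.85, cycles=30)

apparent_ratio = gene_a / gene_b
print(f"apparent A/B ratio after 30 cycles: {apparent_ratio:.1f}")
```

Under these assumptions a small per-cycle difference compounds into a several-fold apparent difference, even though the starting amounts were equal.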

  14. SAGE

  15. SAGE

  16. SAGE Tags are isolated and concatemerized. Relative expression levels can be compared between cells in different states.
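The comparison on slide 16 can be sketched as a tag-counting exercise: count how often each tag appears in a library, convert counts to fractions of the library, and put two cell states side by side. The tag sequences and function names below are made up for illustration, and the sketch assumes the tags have already been extracted from the sequenced concatemers.

```python
from collections import Counter

def tag_frequencies(tags):
    """Return each tag's share of the library (count / total tags)."""
    counts = Counter(tags)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tag: n / total for tag, n in counts.items()}

def compare_libraries(tags_a, tags_b):
    """Map each tag to (frequency in A, frequency in B); absent tags get 0.0."""
    freq_a, freq_b = tag_frequencies(tags_a), tag_frequencies(tags_b)
    return {
        tag: (freq_a.get(tag, 0.0), freq_b.get(tag, 0.0))
        for tag in sorted(set(freq_a) | set(freq_b))
    }

# Made-up tags observed in two hypothetical cell states.
state_1 = ["AAACCCGGGT", "AAACCCGGGT", "TTTGGGCCCA", "GGGTTTAAAC"]
state_2 = ["AAACCCGGGT", "TTTGGGCCCA", "TTTGGGCCCA", "TTTGGGCCCA"]

for tag, (f1, f2) in compare_libraries(state_1, state_2).items():
    print(tag, f"{f1:.2f}", f"{f2:.2f}")
```

Mapping each tag back to a gene is a separate lookup step, as slide 17 indicates.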

  17. SAGE --gene to tag mapping http://www.ncbi.nlm.nih.gov/SAGE/SAGEcid.cgi?cid=28726

  18. MicroArray What are they? allow 1000’s of expression analyses to be performed concurrently. What technologies are used? How to analyze the image? How to analyze the expression data? What bioinformatics challenges are there?

  19. Potential Microarray Applications • Drug discovery / toxicology studies • Mutation/polymorphism detection Differing expression of genes over: • Time • Tissues • Disease States • Sub-typing complex genetic diseases

  20. DNA Array Technology

  21. Physical Spotting

  22. MicroArray

  23. Glass Microarray 326 Rat Heart Genes, 2x spotting

  24. Photolithographic

  25. MicroArray

  26. MicroArray

  27. MicroArray

  28. MicroArray Overview of data capture • two different mRNA populations, labeled with different fluors • excited by a laser • each fluor fluoresces at a different wavelength, which is captured using a photodetector attached to a filter tuned to that particular fluor

  29. MicroArray Overview of image analysis • spot identification • grid alignment • skew • image normalization • variable background • uneven hybridization

  30. Microarray Data Pipeline

  31. Image Analysis/Data Quantization • Feature (target  probe) segmentation • Data extraction and quantization of: • Background • Feature • Correlation of feature identity and location within image • Display of pseudo-color image
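A rough sketch of the feature and background quantization listed on slide 31, assuming a circular spot of known center and radius and a local background taken from the surrounding square. The geometry, the use of medians, and the function name `quantify_spot` are assumptions for illustration; real image-analysis packages use more elaborate segmentation.

```python
from statistics import median

def quantify_spot(image, center_row, center_col, radius):
    """Return (feature_median, background_median, background-corrected signal).

    Pixels within `radius` of the center count as feature; the remaining
    pixels in the surrounding square (out to 2 * radius) count as background.
    """
    feature, background = [], []
    outer = 2 * radius
    row_range = range(max(0, center_row - outer), min(len(image), center_row + outer + 1))
    for r in row_range:
        col_range = range(max(0, center_col - outer), min(len(image[r]), center_col + outer + 1))
        for c in col_range:
            dist_sq = (r - center_row) ** 2 + (c - center_col) ** 2
            if dist_sq <= radius ** 2:
                feature.append(image[r][c])
            else:
                background.append(image[r][c])
    f_med, b_med = median(feature), median(background)
    return f_med, b_med, max(f_med - b_med, 0)

# Synthetic 9x9 image: background of 100 with a bright spot of 900 at (4, 4).
image = [
    [900 if (r - 4) ** 2 + (c - 4) ** 2 <= 4 else 100 for c in range(9)]
    for r in range(9)
]
print(quantify_spot(image, 4, 4, 2))
```

Medians are used here because they are less sensitive than means to dust specks and saturated pixels; that choice is a design assumption, not something the slides prescribe.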

  32. Image Segmentation

  33. Microarray Experiment Design • Type I: (n = 2) • How is this gene expressed in target 1 as compared to target 2? • Which genes show up/down regulation between the two targets? • Type II: (n > 2) • How does the expression of gene A vary over time, tissues, or treatments? • Do any of the expression profiles exhibit similar patterns of expression?

  34. Motivation & Design Constraints • Probe set design involves the prioritizing and parsing of an initial data set containing potentially hundreds of thousands of probe candidates to define a reasonably sized set for use in a microarray experiment • A single hybridization can produce several thousand data tuples, each containing multiple (n>10) measurements • No “All-in-one” software package is currently available; therefore, communication of data between the packages must be facilitated by the pipeline

  35. Probe Set Design • Goal of probe set design is to identify a reasonably sized subset of probes from a much larger starting set from a variety of sources • By defining a set of criteria, an investigator should be able to create new probe sets or refine existing sets • Pruning a data set should be done in several stages: • Use readily available information to limit scope of data • Obtain more information about remaining probes • Narrow focus based on additional information • Iterate until desired data set is obtained
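The staged pruning on slide 35 can be sketched as repeated filtering with investigator-defined criteria. Everything concrete below (the `Probe` fields, the example clones, the BLAST confidence lookup, and the 0.9 cutoff) is hypothetical and only illustrates the shape of the iteration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Probe:
    clone_id: str
    species: str
    tissue: str
    cluster_size: int
    blast_confidence: Optional[float] = None  # unknown until a later stage

def prune(probes, *criteria):
    """Keep only probes that satisfy every criterion (a predicate on Probe)."""
    return [p for p in probes if all(test(p) for test in criteria)]

candidates = [
    Probe("c1", "rat", "heart", 5),
    Probe("c2", "rat", "liver", 8),
    Probe("c3", "mouse", "heart", 3),
    Probe("c4", "rat", "heart", 1),
]

# Stage 1: use readily available (direct) information to limit scope.
stage1 = prune(candidates,
               lambda p: p.species == "rat",
               lambda p: p.tissue == "heart")

# Stage 2: obtain more information about the remaining probes
# (a hard-coded stand-in for a real BLAST run).
blast_scores = {"c1": 0.99, "c4": 0.40}
for probe in stage1:
    probe.blast_confidence = blast_scores.get(probe.clone_id)

# Stage 3: narrow the focus using the additional information.
final = prune(stage1,
              lambda p: p.blast_confidence is not None and p.blast_confidence >= 0.9)

print([p.clone_id for p in final])
```

Because each criterion is just a predicate, new probe sets can be defined or existing ones refined by changing the list of criteria rather than the pruning code.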

  36. Sample Probe Set Design Criteria 1° -- Direct • Species • Tissue • Chromosome • Sequence Available • Quality • Tail/Poly(A) signal • Map position known? • Cluster size 2° -- Indirect • Blast results • Confidence value • Homology (or lack of) • Annotation contains words like “transfer” • 3’ & 5’ EST reads hit same gene • Syntenic Map Information • Known phenotypes in other species

  37. cDNA Microarray Slide Creation • cDNA clones defining a probe set must be re-arrayed from their sources (e.g. local storage or commercial) into a format suitable for amplification and printing (e.g. 96-well microtiter plates) • Based on the size of the probe set and the limitations of the printer, a parameter set (# of pens, spot spacing, grid dimensions,…) must be defined for printing the probe set onto the slide(s) • A mapping operation must be performed in order to track each probe from source to destination in order to correlate known information with a particular “spot” in a microarray image

  38. MicroArray Overview of data analysis • expression vs. time • expression vs. other genes • co-regulation • differential regulation • pathway identification

  39. Data Analysis • Data analysis consists of several post-quantization steps: • Statistics/Metrics Calculations • Scaling/Normalization of the Data • Differential Expression • Coordinated Gene Expression (aka clustering) • Most software packages perform only a limited number of analysis tasks • Databases can facilitate the movement of data between packages

  40. Scaling and/or Normalization • Positive Controls • ‘Spiked’ DNA • Housekeeping Genes • Total Array • Negative Controls • Foreign DNA • ‘Empty’ spots

  41. Scaling and/or Normalization • Linear regression • Log-linear regression • Ratio statistics • Log(ratio) mean/median centering • Nonlinear regression
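Of the approaches listed on slide 41, log(ratio) median centering is the simplest to sketch: take the log ratio of the two channels for every spot, then subtract the array-wide median so that a typical spot sits at zero. The channel names and intensity values below are made up, and the sketch assumes strictly positive, background-corrected intensities.

```python
from math import log2
from statistics import median

def median_center_log_ratios(channel_1, channel_2):
    """Return log2(ch1 / ch2) per spot, shifted so the array median is 0."""
    log_ratios = [log2(a / b) for a, b in zip(channel_1, channel_2)]
    shift = median(log_ratios)
    return [lr - shift for lr in log_ratios]

# Hypothetical intensities: channel 1 is globally about 2x brighter,
# for example because of a labeling or scanner difference.
channel_1 = [200, 400, 800, 1600, 400]
channel_2 = [100, 200, 400, 200, 400]

print(median_center_log_ratios(channel_1, channel_2))
```

The underlying assumption is that most genes do not change between the two samples, so the median log ratio reflects technical bias rather than biology.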

  42. MicroArray Bioinformatics challenges 1. data management 2. utilizing data from multiple experiments (type II) 3. utilizing data from multiple groups * with different technologies * with only processed data available

  43. [Figure: aligned sequences (3’ … ACGGGC … ATG … 5’, with one variant ACGGGA) found by local alignment against database(s) within a search window; a table of expression levels (+ / - / ?) for genes A-E under conditions 1-4; expression-level profiles (-, 0, +) of genes A, B, C over time (0, 60, 120, 180) and over timepoints 1-4]

  44. MicroArray data management • clone - spot • clone - gene • raw expression level • normalized expression level • annotation/links • expression profile

  45. MArray Expt Mgmt Redux Experiment 5-Tuple: (Probe Set_ID, Target_ID, Hyb Condition_ID, Hyb Iteration_ID, GenePix_Analysis_ID)
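The experiment 5-tuple on slide 45 maps naturally onto an immutable record that can serve as a lookup key. The field types and all example identifier values below are assumptions; only the five field names come from the slide.

```python
from typing import NamedTuple

class Experiment(NamedTuple):
    probe_set_id: str
    target_id: str
    hyb_condition_id: str
    hyb_iteration_id: int
    genepix_analysis_id: str

# Hypothetical identifiers; two iterations (replicates) of the same hybridization.
exp_1 = Experiment("probeset_A", "target_01", "cond_X", 1, "analysis_0001")
exp_2 = Experiment("probeset_A", "target_01", "cond_X", 2, "analysis_0002")

# Tuples are hashable, so each experiment can key its quantified results.
results = {exp_1: "results for iteration 1", exp_2: "results for iteration 2"}

# Find all iterations of one probe set / target / condition combination.
replicates = [e for e in results if e[:3] == ("probeset_A", "target_01", "cond_X")]
print(len(replicates))
```

In a relational database the same five identifiers would typically form a composite key.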

  46. Database Support (EBI Schema) http://www.ebi.ac.uk/arrayexpress/ http://www.bioinf.man.ac.uk/microarray/maxd

  47. Differential Expression • Type I analysis • Look for genes with vastly different expression under different conditions • How do you measure “vastly different”? • What role should derived statistics play?
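One simple answer to "how do you measure vastly different?" on slide 47 is a fold-change cutoff on the log ratio. The sketch below uses a 2-fold threshold, which is an arbitrary assumption, and it ignores replicate variance, which is one reason derived statistics matter. Gene names and values are made up.

```python
from math import log2

def differential_genes(expr_1, expr_2, min_fold=2.0):
    """Return {gene: log2(expr_2 / expr_1)} for genes changing >= min_fold either way."""
    threshold = log2(min_fold)
    flagged = {}
    for gene in sorted(expr_1.keys() & expr_2.keys()):
        log_ratio = log2(expr_2[gene] / expr_1[gene])
        if abs(log_ratio) >= threshold:
            flagged[gene] = log_ratio
    return flagged

# Hypothetical normalized expression levels under two conditions.
condition_1 = {"geneA": 100, "geneB": 100, "geneC": 100}
condition_2 = {"geneA": 400, "geneB": 120, "geneC": 25}

print(differential_genes(condition_1, condition_2))
```

Working in log space treats a 4-fold increase (+2) and a 4-fold decrease (-2) symmetrically, which a raw ratio does not.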

  48. Type I: Differential Expression

  49. Coordinated Gene Expression • Type II analysis • “Eisen”ized data (dendrograms) • Self-Organizing Maps • Principal Component Analysis • k-means Clustering
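As an illustration of one method from slide 49, here is a small k-means sketch over expression profiles. To stay deterministic it seeds the centroids with the first k profiles, which is an assumption of this sketch; practical tools usually use random restarts. The profiles are made up.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_profile(profiles):
    return [sum(column) / len(profiles) for column in zip(*profiles)]

def k_means(profiles, k, iterations=20):
    """Cluster expression profiles; returns one cluster index per profile."""
    centroids = [list(p) for p in profiles[:k]]  # assumed, deterministic seeding
    labels = [0] * len(profiles)
    for _ in range(iterations):
        # Assignment step: each profile joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in profiles
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = mean_profile(members)
    return labels

# Hypothetical profiles over four timepoints: two rising, two falling.
profiles = [
    [1.0, 2.0, 3.0, 4.0],
    [4.0, 3.0, 2.0, 1.0],
    [1.1, 2.1, 2.9, 4.2],
    [3.9, 3.1, 2.0, 0.9],
]
print(k_means(profiles, k=2))
```

Unlike hierarchical clustering, k-means requires choosing the number of clusters in advance.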

  50. Hierarchical Clustering
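A compact sketch of agglomerative hierarchical clustering, the method behind the dendrograms mentioned on slide 49. It assumes average linkage with distance = 1 - Pearson correlation, a common choice for expression data; the slides do not state which metric or linkage was used. Gene names and profiles are made up, and each profile must have nonzero variance.

```python
from math import sqrt

def pearson(a, b):
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    sd_a = sqrt(sum((x - mean_a) ** 2 for x in a))
    sd_b = sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (sd_a * sd_b)

def hierarchical_cluster(profiles):
    """Return the dendrogram as nested tuples of gene names.

    `profiles` maps gene name -> list of expression values.
    """
    # Each cluster is (subtree, list of member gene names).
    clusters = [(name, [name]) for name in profiles]

    def linkage(c1, c2):
        # Average linkage: mean pairwise distance between members.
        dists = [1 - pearson(profiles[a], profiles[b]) for a in c1[1] for b in c2[1]]
        return sum(dists) / len(dists)

    while len(clusters) > 1:
        pairs = ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)] + [merged]
    return clusters[0][0]

genes = {
    "A": [1, 2, 3, 4],
    "B": [2, 4, 6, 8.5],
    "C": [4, 3, 2, 1],
    "D": [9, 6, 4, 1],
}
print(hierarchical_cluster(genes))
```

The nested tuples record the merge order, which is the information a dendrogram displays.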
