
Introduction to genome biology and microarrays experiment

Introduction to genome biology and microarrays experiment. Statistics for Microarray Data Analysis – Lecture 1. The Fields Institute for Research in Mathematical Sciences, May 25, 2002.



Presentation Transcript


  1. Introduction to genome biology and microarrays experiment • Statistics for Microarray Data Analysis – Lecture 1 • The Fields Institute for Research in Mathematical Sciences • May 25, 2002

  2. The human genome • The cell is the fundamental working unit of every living organism. • Humans: trillions of cells (metazoa); other organisms like yeast: one cell (unicellular). • Cells are of many different types (e.g. blood, skin, nerve cells), but all can be traced back to a single cell, the fertilized egg.

  3. Genes • The human genome is distributed along 23 pairs of chromosomes. • 22 autosomal pairs; • the sex chromosome pair, XX for females and XY for males. • In each pair, one chromosome is paternally inherited, the other maternally inherited. • Chromosomes are made of compressed and entwined DNA. • A (protein-coding) gene is a segment of chromosomal DNA that directs the synthesis of a protein.

  4. Chromosomes and DNA

  5. DNA • A deoxyribonucleic acid or DNA molecule is a double-stranded polymer composed of four basic molecular units called nucleotides. • Each nucleotide comprises a phosphate group, a deoxyribose sugar, and one of four nitrogen bases: adenine (A), guanine (G), cytosine (C), and thymine (T). • The two chains are held together by hydrogen bonds between nitrogen bases. • Base-pairing occurs according to the following rule: G pairs with C, and A pairs with T.
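The pairing rule on this slide can be illustrated with a short Python sketch. The function names are hypothetical, and the reversal step relies on the standard fact (not stated on the slide) that the two strands run antiparallel.

```python
# Base-pairing rule from the slide: G pairs with C, and A pairs with T.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand."""
    return "".join(PAIRS[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the partner strand read in its own 5'->3' direction.

    Assumes antiparallel strands, so the complement is reversed.
    """
    return complement(strand)[::-1]


if __name__ == "__main__":
    print(complement("GATTACA"))          # CTAATGT
    print(reverse_complement("GATTACA"))  # TGTAATC
```

The same rule underlies hybridization on a microarray: a labeled strand binds the spotted strand whose sequence is its reverse complement.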

  6. Genetic and physical maps

  7. Genetic and physical maps • Physical distance: number of base pairs (bp). • Genetic distance: expected number of crossovers between two loci, per chromatid, per meiosis. • Measured in Morgans (M) or centiMorgans (cM). • 1cM ~ 1 million bp (1Mb) in humans
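A minimal sketch of the unit conversions on this slide, in Python. The constant is the slide's rough human average of 1 cM per 1 Mb; actual recombination rates vary along the genome, so the result is an order-of-magnitude guide only. The function names are hypothetical.

```python
# Rule of thumb from the slide: 1 cM corresponds to roughly 1 million bp in humans.
BP_PER_CM_HUMAN = 1_000_000


def genetic_to_physical_bp(distance_cm: float,
                           bp_per_cm: float = BP_PER_CM_HUMAN) -> float:
    """Approximate physical distance (bp) for a genetic distance in cM."""
    return distance_cm * bp_per_cm


def expected_crossovers(distance_cm: float) -> float:
    """Expected crossovers per chromatid per meiosis (1 Morgan = 100 cM)."""
    return distance_cm / 100.0


if __name__ == "__main__":
    print(genetic_to_physical_bp(5))  # 5000000
    print(expected_crossovers(50))    # 0.5
```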

  8. Central dogma • The expression of the genetic information stored in the DNA molecule occurs in two stages: (i) transcription, during which DNA is transcribed into mRNA; (ii) translation, during which mRNA is translated to produce a protein. DNA → mRNA → protein • Other important aspects of regulation: methylation, alternative splicing, etc. • The correspondence between DNA's four-letter alphabet and a protein's twenty-letter alphabet is specified by the genetic code, which relates nucleotide triplets to amino acids.
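The two stages can be sketched in Python as follows. This is a simplified illustration, not a model of real gene expression: it assumes the input is the coding strand, starts translating at the first base instead of searching for a start codon, ignores splicing, and uses the standard genetic code with one-letter amino-acid symbols. The function names are hypothetical.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each of the 64 RNA triplets to an amino acid ('*' marks a stop codon).
CODON_TABLE = {
    "".join(codon).replace("T", "U"): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """Stage (i): the mRNA matches the coding strand with U in place of T."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Stage (ii): read triplets from position 0 until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    mrna = transcribe("ATGGCCTAA")
    print(mrna)             # AUGGCCUAA
    print(translate(mrna))  # MA
```

The table has 4 x 4 x 4 = 64 triplets for 20 amino acids plus stop signals, which is why several codons map to the same amino acid.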

  9. Idea: measure the amount of mRNA to see which genes are being expressed in (used by) the cell. Measuring protein might be better, but is currently harder.

  10. Differential expression • Each cell contains a complete copy of the organism's genome. • Cells are of many different types and states, e.g. blood, nerve, and skin cells, dividing cells, cancerous cells, etc. • What makes the cells different? • Differential gene expression, i.e., when, where, and in what quantity each gene is expressed. • On average, 40% of our genes are expressed at any given time.

  11. RNA • A ribonucleic acid or RNA molecule is a nucleic acid similar to DNA, but • single-stranded; • ribose sugar rather than deoxyribose sugar; • uracil (U) replaces thymine (T) as one of the bases. • RNA plays an important role in protein synthesis and other chemical activities of the cell. • Several classes of RNA molecules, including messenger RNA (mRNA), transfer RNA (tRNA), ribosomal RNA (rRNA), and other small RNAs.

  12. Exons and introns • Genes comprise only about 2% of the human genome; the rest consists of non-coding regions, whose functions may include providing chromosomal structural integrity and regulating when, where, and in what quantity proteins are made (regulatory regions). • The terms exon and intron refer to coding (translated into a protein) and non-coding DNA, respectively.

  13. Functional genomics • The various genome projects have yielded the complete DNA sequences of many organisms, e.g. human, mouse, yeast, fruit fly, etc. • Human: 3 billion base pairs, 30-40 thousand genes. • Challenge: go from sequence to function, i.e., define the role of each gene and understand how the genome functions as a whole.

  14. Introduction to microarrays

  15. Gene expression assays • DNA microarrays rely on the hybridization properties of nucleic acids to monitor DNA or RNA abundance on a genomic scale in different types of cells. • The main types of gene expression assays: • High-density nylon membrane arrays; • Serial analysis of gene expression (SAGE); • Short oligonucleotide arrays (Affymetrix); • Long oligonucleotide arrays (Agilent); • Fibre optic arrays (Illumina); • cDNA arrays (Brown/Botstein)*.

  16. Applications of microarrays • Measuring transcript abundance; • Genotyping; • Estimating DNA copy number; • Determining identity by descent; • Measuring mRNA decay rates; • Identifying protein binding sites; • Determining sub-cellular localization of gene products; • ……..

  17. The process • Building the chip: massive PCR; PCR purification and preparation; preparing slides; printing. • RNA preparation: cell culture and harvest; RNA isolation; cDNA production. • Hybing the chip: probe labeling; array hybridization; post processing; data analysis. • Taken from Schena & Davis

  18. Building the chip • Arrayed library (96- or 384-well plates of bacterial glycerol stocks) • PCR amplification directly from colonies with SP6-T7 primers in 96-well plates • Consolidate into 384-well plates • Spot as microarray on glass slides • (Ngai Lab, UC Berkeley)

  19. [Figure] Pins collect cDNA from the wells of a 384-well plate (contains cDNA probes) and print it onto a glass slide (array of bound cDNA probes). • cDNA clones are spotted in duplicate. • 4x4 blocks = 16 print-tip groups (print-tip groups 1 and 6 are labeled in the figure).

  20. Sample preparation

  21. Hybridization • Binding of cDNA target samples to cDNA probes on the slide. • Hybridize for 5-12 hours. [Figure label: cover slip]

  22. Hybridization chamber • 3X SSC • Humidity • Temperature • Formamide (lowers the Tm) [Figure labels: hyb chamber, array, LifterSlip, slide, slide label]

  23. Expression profiling with DNA microarrays [Figure labels: cDNA “A”, Cy5 labeled; cDNA “B”, Cy3 labeled; hybridization; scanning with laser 1 and laser 2; image capture; analysis]

  24. RGB overlay of Cy3 and Cy5 images

  25. Raw data • Human cDNA arrays • ~43K spots; • 16-bit TIFFs: ~20 MB per channel; • ~2,000 x 5,500 pixels per image; • Spot separation: ~136 µm; • For a “typical” array: mean = 43, median = 32, SD = 26 pixels per spot.
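The per-channel file size follows from the image dimensions and bit depth. A small Python check, assuming an uncompressed image and ignoring TIFF header overhead:

```python
# Figures quoted on the slide (approximate).
WIDTH_PX = 2_000
HEIGHT_PX = 5_500
BITS_PER_PIXEL = 16


def image_size_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Raw size of an uncompressed single-channel image, in bytes."""
    return width * height * bits_per_pixel // 8


if __name__ == "__main__":
    size = image_size_bytes(WIDTH_PX, HEIGHT_PX, BITS_PER_PIXEL)
    print(size)                   # 22000000 bytes for one channel
    print(round(size / 1e6, 1))   # 22.0 (megabytes)
    print(2 * size)               # 44000000 bytes for the two-dye pair
    print(2 ** BITS_PER_PIXEL - 1)  # 65535, the largest 16-bit intensity
```

Two bytes per pixel over about 11 million pixels gives roughly 22 million bytes, which agrees with the figure of about 20 megabytes per channel.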

  26. Image analysis • The raw data from a cDNA microarray experiment consist of pairs of image files, 16-bit TIFFs, one for each of the dyes. • Image analysis is required to extract measures of the red and green fluorescence intensities for each spot on the array.

  27. Image analysis GenePix

  28. Image analysis • 1. Addressing. Estimate location of spot centers. • 2. Segmentation. Classify pixels as foreground (signal) or background. • 3. Information extraction. For each spot on the array and each dye: signal intensities; background intensities; quality measures. • Result: R and G for each spot on the array.
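Steps 2 and 3 can be sketched in Python for a single spot in a single channel. This is a deliberately simple illustration and makes several assumptions not taken from the slides: the spot center is already known from addressing, segmentation uses a fixed circle of a given radius, and background is the median of the surrounding pixels in a square patch. The function name `extract_spot` is hypothetical.

```python
from statistics import mean, median


def extract_spot(image, center_row, center_col, radius):
    """Segment one spot with a fixed circle and summarise its intensities.

    image is a list of rows of pixel intensities for one dye channel.
    Pixels inside the circle are foreground; the remaining pixels of a
    square patch of half-width 2 * radius are treated as local background.
    """
    foreground, background = [], []
    reach = 2 * radius
    for r in range(center_row - reach, center_row + reach + 1):
        for c in range(center_col - reach, center_col + reach + 1):
            if not (0 <= r < len(image) and 0 <= c < len(image[0])):
                continue  # skip patch pixels that fall off the image
            inside = (r - center_row) ** 2 + (c - center_col) ** 2 <= radius ** 2
            (foreground if inside else background).append(image[r][c])
    return {
        "foreground": mean(foreground),    # signal intensity
        "background": median(background),  # local background intensity
        "n_foreground": len(foreground),   # spot area in pixels
    }
```

Running the same function on the red and the green image at the same coordinates gives the pair of intensities R and G for that spot.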

  29. Spot • Batch automatic addressing. • Segmentation. Seeded region growing (Adams & Bischof 1994): adaptive segmentation method, no restriction on the size or shape of the spots. • Intensity extraction • Foreground. Mean of pixel intensities within a spot. • Background. Morphological opening: non-linear filter which generates an image of the estimated background intensity for the entire slide. • Spot quality measures. • Software package. Spot, built on the freely available software package R.
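Morphological opening can be sketched in plain Python as an erosion (local minimum) followed by a dilation (local maximum). This is a generic textbook version with a square window, not the implementation used in Spot; the window shape, the edge handling, and the function names are assumptions made for illustration.

```python
def _window_filter(image, half_width, reducer):
    """Apply reducer (min or max) over a square window around each pixel.

    Windows are clipped at the image border.
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - half_width), min(rows, r + half_width + 1))
                for cc in range(max(0, c - half_width), min(cols, c + half_width + 1))
            ]
            out[r][c] = reducer(window)
    return out


def erode(image, half_width):
    """Replace each pixel by the minimum of its neighbourhood."""
    return _window_filter(image, half_width, min)


def dilate(image, half_width):
    """Replace each pixel by the maximum of its neighbourhood."""
    return _window_filter(image, half_width, max)


def opening(image, half_width):
    """Erosion followed by dilation: removes bright features smaller than the window."""
    return dilate(erode(image, half_width), half_width)
```

When the window is larger than a spot, the erosion wipes the spot out and the dilation restores the surrounding level, so the result is an image of estimated background for every pixel, including those under the spots.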

  30. Quality measures • Spot • One channel, R or G • Signal/noise ratio; • Variation in pixel intensities; • Identification of “bad spots” (no signal), etc. • Two channels, R/G • Brightness: foreground/background ratio; • Uniformity: variation in pixel intensities and ratios of intensities; • Morphology: area, perimeter, circularity. • Array (slide) • Percentage of spots with no signal; • Range of intensities; • Distribution of spot signal area, etc. • How to use quality measures in subsequent analyses?
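A few of these spot-level measures can be written down concretely. The slide does not give formulas, so the definitions below are common choices assumed for illustration: brightness as the ratio of mean foreground to mean background, variation as the coefficient of variation of pixel intensities, and circularity as 4πA/P², which equals 1 for a perfect circle.

```python
import math
from statistics import mean, pstdev


def brightness(foreground_pixels, background_pixels):
    """Foreground/background ratio of mean intensities (assumed definition)."""
    return mean(foreground_pixels) / mean(background_pixels)


def coefficient_of_variation(pixels):
    """Standard deviation of pixel intensities relative to their mean."""
    return pstdev(pixels) / mean(pixels)


def circularity(area, perimeter):
    """4 * pi * area / perimeter**2: 1 for a circle, smaller for other shapes."""
    return 4 * math.pi * area / perimeter ** 2


if __name__ == "__main__":
    print(brightness([100, 100], [10, 10]))  # 10.0
    print(circularity(4, 8))                 # pi / 4 for a 2 x 2 square, about 0.785
```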

  31. Terminology • Probe: DNA spotted on the array, a.k.a. spot, immobile substrate. • Target: DNA hybridized to the array, mobile substrate. • Sector: collection of spots printed using the same print-tip (or pin), a.k.a. print-tip group, pin group, spot matrix, grid. • Batch: collection of slides with the same probe layout. • The terms slide and array are often used to refer to the printed microarray.

  32. Microarray life cycle [Figure labels: biological question; sample preparation; microarray reaction; microarray detection; data analysis & modelling] Taken from Schena & Davis

  33. WWW resources • Complete guide to “microarraying” http://cmgm.stanford.edu/pbrown/mguide/ • http://www.microarrays.org • Parts and assembly instructions for printer and scanner; • Protocols for sample prep; • Software; • Forum, etc. • Animation: http://www.bio.davidson.edu/courses/genomics/chip/chip.html

  34. Acknowledgments • Introduction to biology: based on Bioconductor short course lecture 1 (http://www.bioconductor.org/) and Temple short course lecture 1, with pictures from http://www.accessexcellence.org/ • Introduction to microarrays: based on IPAM, UCLA, http://www.ipam.ucla.edu/programs/fg2000/tutorials.html#terry_speed • Thanks also to: Natalie Thorne (WEHI), Ingrid Lönnstedt (Uppsala)
