Biology and Cells

Biology and Cells. All living organisms consist of cells. Humans have trillions of cells. Yeast - one cell. Cells are of many different types (blood, skin, nerve), but all arose from a single cell (the fertilized egg)


Presentation Transcript


  1. Biology and Cells • All living organisms consist of cells. • Humans have trillions of cells. Yeast - one cell. • Cells are of many different types (blood, skin, nerve), but all arose from a single cell (the fertilized egg) • Each cell contains a complete copy of the genome (the program for making the organism), encoded in DNA.

  2. DNA • DNA molecules are long double-stranded chains; 4 types of bases are attached to the backbone: adenine (A), guanine (G), cytosine (C), and thymine (T). A pairs with T, C with G. • A gene is a segment of DNA that specifies how to make a protein. • Human DNA has about 30,000-35,000 genes. • Rice has about 50,000-60,000, but shorter genes.
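
The pairing rule on this slide (A with T, C with G) can be illustrated with a minimal Python sketch. The function names and the example sequence are illustrative only and do not come from the slides.

```python
# Watson-Crick pairing as stated on the slide: A pairs with T, C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand."""
    return "".join(PAIRS[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the complementary strand, read in the opposite direction."""
    return complement(strand)[::-1]


if __name__ == "__main__":
    print(complement("ATGC"))          # TACG
    print(reverse_complement("ATGC"))  # GCAT
```

Applying `complement` twice returns the original strand, which is one way to see that each strand of the double helix fully determines the other.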

  3. Exons and Introns: Data and Logic? • exons are coding DNA (translated into a protein), which make up only about 2% of the human genome • introns are non-coding DNA, which provide structural integrity and regulatory (control) functions • exons can be thought of as program data, while introns provide the program logic • Humans have much more control structure than rice

  4. Gene Expression • Cells are different because of differential gene expression. • About 40% of human genes are expressed at one time. • A gene is expressed by transcribing its DNA into single-stranded mRNA • mRNA is later translated into a protein • Microarrays measure the level of mRNA expression

  5. Gene Expression Measurement • mRNA expression represents dynamic aspects of the cell • mRNA expression can be measured with the latest technology • mRNA is isolated and labeled with fluorescent protein • mRNA is hybridized to the target; the level of hybridization corresponds to the light emission, which is measured with a laser

  6. Molecular Biology Overview [Figure labels: Cell; Nucleus; Chromosome; Gene (DNA); Gene (mRNA), single strand; Protein] Graphics courtesy of the National Human Genome Research Institute

  7. Gene Expression Microarrays The main types of gene expression microarrays: • Short oligonucleotide arrays (Affymetrix) • cDNA or spotted arrays (Brown/Botstein) • Long oligonucleotide arrays (Agilent Inkjet) • Fiber-optic arrays • ...

  8. DNA Chip Microarrays • Put a large number (~100K) of cDNA sequences or synthetic DNA oligomers onto a glass slide in known locations on a grid. • Label an RNA sample and hybridize (or label 2 RNA samples, control vs. experimental, with 2 different colors of fluorescent dye) • Mix the two labeled RNAs and hybridize to the chip • Measure the amount of RNA bound to each square in the grid • Make comparisons • Cancerous vs. normal tissue • Treated vs. untreated • Time course

  9. Spot your own Chip (plans available for free from Pat Brown’s website) [Figure labels: robot spotter; ordinary glass microscope slide]

  10. cDNA Spotted Microarrays

  11. Affymetrix “Gene chip” system • Uses 25 base oligos synthesized in place on a chip (20 pairs of oligos for each gene) • RNA labeled and scanned in a single “color” • one sample per chip • Can have as many as 20,000 genes on a chip • Arrays get smaller every year (more genes) • Chips are expensive • Proprietary system: “black box” software, can only use their chips

  12. Affymetrix Microarrays [Figure: raw image; scale labels 50 µm and 1.28 cm] • ~10^7 oligonucleotides: half Perfectly Match the mRNA (PM), half have one Mismatch (MM) • Raw gene expression is the intensity difference: PM - MM
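
The PM - MM difference on this slide can be sketched in a few lines of Python. Combining a gene's probe pairs with a plain mean is an assumption made here for illustration (the slides only say that each gene has several probe pairs), and the intensity values are made up.

```python
from statistics import mean


def raw_expression(pm, mm):
    """Average PM - MM difference over one gene's probe pairs.

    Assumption: probe pairs are combined with a plain mean; the slides
    do not say how the per-pair differences are summarised.
    """
    if len(pm) != len(mm):
        raise ValueError("need one MM intensity per PM intensity")
    return mean(p - m for p, m in zip(pm, mm))


# Hypothetical intensities for three probe pairs of one gene.
pm = [1200.0, 950.0, 1100.0]
mm = [300.0, 400.0, 250.0]
print(raw_expression(pm, mm))
```

Note that the difference is negative whenever the mismatch probes are brighter than the perfect-match probes, which is why negative expression values can appear in data computed this way.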

  13. Data Acquisition • Scan the arrays • Quantitate each spot • Subtract background • Normalize • Export a table of fluorescent intensities for each gene in the array

  14. Normalization • Can control for many of the experimental sources of variability (systematic, not random or gene specific) • Bring each image to the same average brightness • Can use simple math or fancy - • divide by the mean (whole chip or by sectors) • LOESS (locally weighted regression) • No sure biological standards
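
The "divide by the mean" option can be sketched as follows. This is a minimal illustration, not the procedure of any particular software package; the target brightness of 1000 and the two toy chips are hypothetical.

```python
from statistics import mean


def scale_to_target(intensities, target=1000.0):
    """Divide by the chip mean, then rescale so the chip mean equals `target`.

    `target` is an arbitrary common brightness chosen for this sketch.
    """
    chip_mean = mean(intensities)
    if chip_mean <= 0:
        raise ValueError("chip mean must be positive")
    return [x * target / chip_mean for x in intensities]


# Two hypothetical scans of the same sample; chip_b is twice as bright overall.
chip_a = [200.0, 400.0, 600.0]
chip_b = [400.0, 800.0, 1200.0]
norm_a = scale_to_target(chip_a)
norm_b = scale_to_target(chip_b)
print(norm_a)
print(norm_b)
```

After scaling, both chips have the same average brightness, so a purely systematic difference in overall intensity no longer looks like a difference in expression. Intensity-dependent effects are not removed by this simple scaling, which is where methods such as LOESS come in.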

  15. Multiple Comparisons • In a microarray experiment, each gene (each probe or probe set) is really a separate experiment • Yet if you treat each gene as an independent comparison, you will always find some with significant differences • (the tails of a normal distribution)
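
A small simulation makes the point concrete. It assumes 10,000 genes with no real difference at all, so each p-value is uniform on [0, 1]; the gene count, the 0.05 cutoff, and the seed are choices made for this sketch. The Bonferroni correction shown at the end is one common remedy and is not mentioned in the slides.

```python
import random


def count_significant(pvalues, alpha):
    """Count how many p-values fall below the significance cutoff."""
    return sum(p < alpha for p in pvalues)


rng = random.Random(0)  # fixed seed so the run is repeatable
n_genes = 10_000

# Under the null hypothesis (no gene truly differs), p-values are uniform.
null_pvalues = [rng.random() for _ in range(n_genes)]

# Treating every gene as its own test at alpha = 0.05.
uncorrected = count_significant(null_pvalues, 0.05)

# Bonferroni (an addition, not from the slides): divide alpha by the number of tests.
bonferroni = count_significant(null_pvalues, 0.05 / n_genes)

print("significant without correction:", uncorrected)
print("significant with Bonferroni:   ", bonferroni)
```

With no correction, roughly 5% of the genes (about 500 here) come out "significant" purely by chance, which is the tail-of-the-distribution effect the slide describes.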

  16. Microarray Potential Applications • Biological discovery • new and better molecular diagnostics • new molecular targets for therapy • finding and refining biological pathways • Recent examples • molecular diagnosis of leukemia, breast cancer, ... • appropriate treatment for genetic signature • potential new drug targets

  17. Microarray Data Analysis Types • Gene Selection • find genes for therapeutic targets • avoid false positives (FDA approval?) • Classification (Supervised) • identify disease (biomarker study) • predict outcome / select best treatment • Clustering (Unsupervised) • find new biological classes / refine existing ones • Understanding regulatory relationship/pathway • exploration • …

  18. Microarray Data Mining Challenges • too few records (samples), usually < 100 • too many columns (genes), usually > 1,000 • too many columns are likely to lead to false positives • for exploration, a large set of all relevant genes is desired • for diagnostics or identification of therapeutic targets, the smallest set of genes is needed • model needs to be explainable to biologists

  19. Data Preparation Issues • Thresholding: usually min 20, max 16,000 • For older Affy chips (new Affy chips do not have negative values) • Filtering - remove genes with insufficient variation • e.g. MaxVal - MinVal < 500 and MaxVal/MinVal < 5 • biological reasons • feature reduction for algorithmic reasons • For clustering, normalize each gene (sample) separately to Mean = 0, Std. Dev = 1
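
The three preparation steps on this slide (thresholding, variation filtering, per-gene standardization) can be sketched as below. The cutoffs are the ones quoted on the slide; the function names, the toy data, and the use of the population standard deviation are assumptions of this sketch.

```python
from statistics import mean, pstdev


def threshold(values, lo=20.0, hi=16000.0):
    """Clip every value into [lo, hi] (slide: min 20, max 16,000)."""
    return [min(max(v, lo), hi) for v in values]


def has_enough_variation(values, min_diff=500.0, min_ratio=5.0):
    """Slide's rule: drop a gene when MaxVal - MinVal < 500 and MaxVal/MinVal < 5."""
    lo, hi = min(values), max(values)
    return not (hi - lo < min_diff and hi / lo < min_ratio)


def standardize(values):
    """Rescale one gene to mean 0 and standard deviation 1.

    Assumption: population standard deviation; the slide does not say which.
    """
    mu, sd = mean(values), pstdev(values)
    if sd == 0:
        raise ValueError("cannot standardize a constant gene")
    return [(v - mu) / sd for v in values]


def prepare(genes):
    """Threshold, filter, and standardize a {gene name: intensities} mapping."""
    prepared = {}
    for name, values in genes.items():
        clipped = threshold(values)
        if has_enough_variation(clipped):
            prepared[name] = standardize(clipped)
    return prepared


# Hypothetical data: one nearly flat gene and one that varies strongly.
genes = {
    "flat": [-10.0, 30.0, 25.0, 40.0],
    "variable": [50.0, 900.0, 20000.0, 300.0],
}
prepared = prepare(genes)
print(sorted(prepared))
```

Thresholding runs first so that the ratio MaxVal/MinVal is always computed on positive values; the negative reading in the "flat" gene is raised to 20 before the filter is applied.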

  20. Normalization issues Within-slide • What genes to use • Location • Scale Paired-slides (dye swap) • Self-normalization Between slides

  21. [Workflow diagram] • Test RNA sample and control RNA sample • Reverse transcription into radio-labelled cDNA probes • Hybridization to microarray filters • Use a Phosphor Imager laser scanner to obtain the density of each spot on the filter • Compare densities at each spot to determine if treatment changes gene expression • Compile the subset of differentially expressed genes, e.g. Gene A: control 1X, test 3X; ... ; Gene Z: control 1X, test 0.5X
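
The comparison step can be sketched as a fold-change filter. Genes A and Z use the ratios shown in the slide's table; gene B, the absolute density values, and the two-fold cutoff are hypothetical choices made for this example.

```python
def differentially_expressed(densities, fold=2.0):
    """Return {gene: test/control ratio} for genes changed by at least `fold`.

    `densities` maps a gene name to a (control, test) pair of spot densities.
    The two-fold cutoff is an assumption; the slides give no threshold.
    """
    selected = {}
    for gene, (control, test) in densities.items():
        ratio = test / control
        if ratio >= fold or ratio <= 1.0 / fold:
            selected[gene] = ratio
    return selected


# A and Z follow the slide (1X -> 3X and 1X -> 0.5X); B is a made-up unchanged gene.
spots = {"A": (100.0, 300.0), "B": (100.0, 110.0), "Z": (100.0, 50.0)}
print(differentially_expressed(spots))
```

Gene A (up three-fold) and gene Z (down two-fold) are kept, while gene B is dropped because its ratio stays close to 1.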

  22. Normalization continued • Intensity-dependent normalization (Yang, YH, 2002) • Do an M-A plot to check the data distribution, where $M = \log_2(R/G)$ and $A = \tfrac{1}{2}\log_2(R \cdot G)$ • Use the lowess function in R to perform the normalization, where $c(A)$ is the lowess fit to the M-A plot • Transform the data by $M' = M - c(A)$ • The method is locally nonparametric and is robust to a small number of differentially expressed genes.
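
The transformation M' = M - c(A) can be sketched in plain Python. The slide uses R's lowess function; the fit below is a simplified stand-in (tricube-weighted local linear regression without lowess's robustness iterations), and the synthetic data, the `frac` window fraction, and all function names are assumptions of this sketch.

```python
import math
import random
from statistics import mean


def local_linear_fit(x, y, frac=0.5):
    """Simplified lowess: a tricube-weighted straight line fitted around each x.

    Unlike R's lowess, no robustness iterations are performed.
    """
    n = len(x)
    k = min(n, max(2, math.ceil(frac * n)))  # points in each local window
    fitted = []
    for x0 in x:
        dist = [abs(xi - x0) for xi in x]
        h = sorted(dist)[k - 1] or 1e-12  # bandwidth: distance to k-th nearest point
        w = [(1 - min(d / h, 1.0) ** 3) ** 3 for d in dist]  # tricube weights
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def lowess_normalize(m, a, frac=0.5):
    """Return M' = M - c(A), where c(A) is the local fit to the M-A plot."""
    c = local_linear_fit(a, m, frac)
    return [mi - ci for mi, ci in zip(m, c)]


# Synthetic M-A data with an intensity-dependent dye bias (made up for the demo).
rng = random.Random(1)
a_vals = [rng.uniform(6.0, 14.0) for _ in range(200)]
m_vals = [0.5 + 0.2 * (a - 10.0) + rng.gauss(0.0, 0.1) for a in a_vals]
m_norm = lowess_normalize(m_vals, a_vals)
print("mean M before:", round(mean(m_vals), 2))
print("mean M after: ", round(mean(m_norm), 2))
```

Because the curve is fitted to the bulk of the spots, a small number of genuinely changed genes barely moves c(A), so their M' values still stand out after the correction.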

  23. (R,G) → (M,A) Transformation • "Observed" data {(R,G)}: R = red channel signal, G = green channel signal (background corrected or not) • Transformed data {(M,A)}: $M = \log_2(R/G)$ (ratio), $A = \log_2 (R \cdot G)^{1/2} = \tfrac{1}{2}\log_2(R \cdot G)$ (intensity) • Inverse: $R = (2^{2A+M})^{1/2}$, $G = (2^{2A-M})^{1/2}$
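
The transformation and its inverse can be checked numerically with a short Python sketch. The function names and the example intensities are illustrative; the formulas are the ones on this slide.

```python
import math


def to_ma(r, g):
    """(R, G) -> (M, A): M = log2(R/G), A = 1/2 * log2(R*G)."""
    m = math.log2(r / g)
    a = 0.5 * math.log2(r * g)
    return m, a


def from_ma(m, a):
    """(M, A) -> (R, G): R = (2^(2A+M))^(1/2), G = (2^(2A-M))^(1/2)."""
    r = 2 ** ((2 * a + m) / 2)
    g = 2 ** ((2 * a - m) / 2)
    return r, g


# A spot four times brighter in red than in green has M = log2(4) = 2.
m, a = to_ma(800.0, 200.0)
print(m, a)
print(from_ma(m, a))
```

A spot with equal red and green signal has M = 0 regardless of how bright it is, which is why the M-A plot separates the ratio from the overall intensity.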

  24. Normalization • Regression normalization: • Fit the linear regression model: • Assumption: all the genes on the array have the same variance (homogeneity) • Test the significance of the intercept. Fit a linear regression without the intercept if it is insignificant. • Transform the treatment data: • Problem: • assumption may not hold • nonlinear trend (the third replicate of the RL95 data has a slight quadratic trend).
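
The equations for this slide did not survive in the transcript, so the following is only one plausible reading, stated as an assumption: regress the treatment log intensities on the control log intensities by ordinary least squares, then remove the fitted intercept and slope from the treatment data. The function names and data are hypothetical, and the intercept significance test mentioned on the slide is omitted.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def regression_normalize(control, treatment):
    """Assumed transform: y' = (y - b0) / b1, which puts the treatment data
    on the control scale. The slide's own formula is not available."""
    b0, b1 = fit_line(control, treatment)
    return [(t - b0) / b1 for t in treatment]


# Hypothetical log intensities: the treatment array has an offset and a gain.
control = [6.0, 8.0, 10.0, 12.0]
treatment = [0.5 + 1.2 * c for c in control]
print(regression_normalize(control, treatment))
```

A single straight line can only remove a linear distortion. If the scatter plot bends, as the slide notes for one replicate, an intensity-dependent method such as lowess is the more natural choice.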

  25. Scatter plot of log intensity before and after regression normalization
