(2) Ratio statistics of gene expression levels and applications to microarray data analysis

Bioinformatics, Vol. 18, no. 9, 2002. Yidong Chen, Vishnu Kamat, Edward R. Dougherty, Michael L. Bittner, Paul S. Meltzer, and Jeffrey M. Trent.


Presentation Transcript


  1. (2) Ratio statistics of gene expression levels and applications to microarray data analysis. Bioinformatics, Vol. 18, no. 9, 2002. Yidong Chen, Vishnu Kamat, Edward R. Dougherty, Michael L. Bittner, Paul S. Meltzer, and Jeffrey M. Trent

  2. Outline • Introduction • Ratio Statistics • Quality Metric for Ratio Statistics • Conclusion

  3. Introduction • Motivation Expression-based analysis for large families of genes has recently become possible owing to the development of cDNA microarrays, which allow simultaneous measurement of transcript levels for thousands of genes. For each spot on a microarray, signals in two channels must be extracted from their backgrounds. This requires algorithms to extract signals arising from tagged mRNA hybridized to arrayed cDNA locations and algorithms to determine the significance of signal ratios.

  4. Introduction • Results 1. Estimation of signal ratios from the two channels, and of the significance of those ratios. 2. A refined hypothesis test in which the measured intensities forming the ratio are assumed to be combinations of signal and background. The new method involves a signal-to-noise ratio, and for a high signal-to-noise ratio the new test reduces (to a close approximation) to the original test. The effect of a low signal-to-noise ratio on the ratio statistics is the main theme of the paper. 3. A quality metric formulated for spots.

  5. Ratio Statistics

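  6. Ratio Statistics assuming a constant coefficient of variation • Consider a microarray having n genes, with red and green fluorescent expression values labeled by $R_1, R_2, \ldots, R_n$ and $G_1, G_2, \ldots, G_n$, respectively. • Hypothesis test: $H_0: \mu_{R_k} = \mu_{G_k}$ versus $H_1: \mu_{R_k} \neq \mu_{G_k}$ • Assumption: a constant coefficient of variation $c$, that is, $\sigma_{R_k} = c\,\mu_{R_k}$ and $\sigma_{G_k} = c\,\mu_{G_k}$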

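  7. Ratio Statistics assuming a constant coefficient of variation (cont.) • Ratio test statistic: the ratio of the red to the green expression value for each gene. • Assuming the red and green values to be normally and identically distributed, the ratio has a density function that depends only on the coefficient of variation c.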

  8. Ratio Statistics assuming a constant coefficient of variation (cont.) • self-self experiment • Duplicate

  9. Ratio Statistics assuming a constant coefficient of variation (cont.) • Confidence interval 1. The confidence interval is obtained by integrating the ratio density function. 2. The confidence interval is determined by the parameter c, which can be estimated either from pre-selected housekeeping genes or from a set of duplicate genes.
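
The interval described on this slide can also be approximated numerically. The following is a minimal sketch, not the authors' procedure: instead of integrating the density analytically, it draws red and green intensities from normal distributions with a common mean and a constant cv c, and reads the interval off the empirical quantiles of the ratio. The function name, the sample size, and the choice to discard non-positive draws are illustrative assumptions.

```python
import random


def ratio_confidence_interval(c, level=0.99, n_samples=200_000, seed=0):
    """Monte Carlo estimate of the central `level` interval of T = R/G under H0.

    Assumes R and G are independent normals with the same mean and a
    constant coefficient of variation c (hypothetical helper, for illustration).
    """
    rng = random.Random(seed)
    mu = 1.0  # common mean; under constant cv the ratio does not depend on it
    ratios = []
    for _ in range(n_samples):
        r = rng.gauss(mu, c * mu)
        g = rng.gauss(mu, c * mu)
        if r > 0 and g > 0:  # assumption: drop non-physical intensities
            ratios.append(r / g)
    ratios.sort()
    tail = (1.0 - level) / 2.0
    lower = ratios[int(tail * len(ratios))]
    upper = ratios[int((1.0 - tail) * len(ratios)) - 1]
    return lower, upper


if __name__ == "__main__":
    lower, upper = ratio_confidence_interval(c=0.2)
    print(f"approximate 99% interval for the ratio: [{lower:.3f}, {upper:.3f}]")
```

Because the ratio and its reciprocal have the same distribution under the null hypothesis, the two bounds should be approximately reciprocal, which gives a quick sanity check on the estimate.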

  10. Ratio Statistics for low signal-to-noise ratio • The actual expression intensity measurement is of the form

  11. Ratio Statistics for low signal-to-noise ratio (cont.) • Null hypothesis of interest: test statistics:

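  12. Ratio Statistics for low signal-to-noise ratio (cont.) • Major differences: 1. The assumption of a constant cv applies to the signals themselves (without the background), not to the measured red and green intensities. 2. The ratio density derived under the constant-cv assumption is no longer applicable. • SNR (signal-to-noise ratio)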

  13. SNR (signal-to-noise ratio) • Assuming that are independent,

  14. The Expression intensity scatter plot

  15. Confidence interval for the test statistics • Assumption:

  16. Confidence interval for the test statistics (cont.) • Under the assumption of constant cv for the signal (without the background),

  17. The 99% confidence interval for ratio statistic

  18. Correction of background estimation • Owing to interaction between the fluorescent signal and background, local-background estimation is often biased. • To estimate the bias difference, we find the relationship between the red and green intensities under the null hypothesis by assuming a linear relation, G = aR+b.

  19. Correction of background estimation (cont.) • Simulation 1. Generate 10,000 data points from an exponential distribution with mean 2,000 to simulate 10,000 gene expression levels. 2. Simulate the intensity measurement for each channel using a normal distribution whose mean is the intensity drawn from the exponential distribution, with a constant cv of 0.2. 3. Simulate the background level by a normal distribution: (1) no bias: background level ~ N(0, 100); (2) some bias: background level ~ N(b, 100).
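
The three simulation steps can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' code: the function name and parameters are hypothetical, the exponential parameter 2,000 is read as the mean, the value 100 in N(0, 100) is treated as a standard deviation (the slide does not say whether it is a standard deviation or a variance), and the bias b is applied to the green channel only.

```python
import random


def simulate_null_array(n_genes=10_000, mean_expr=2000.0, cv=0.2,
                        bg_sd=100.0, bg_bias=0.0, seed=0):
    """Simulate red/green intensities under the null hypothesis.

    Hypothetical helper: both channels share one true expression level per gene.
    """
    rng = random.Random(seed)
    red, green = [], []
    for _ in range(n_genes):
        # Step 1: true expression level drawn from an exponential distribution.
        mu = rng.expovariate(1.0 / mean_expr)
        # Step 2: per-channel signal with constant coefficient of variation.
        signal_r = rng.gauss(mu, cv * mu)
        signal_g = rng.gauss(mu, cv * mu)
        # Step 3: additive background; bg_bias shifts the green channel only
        # (assumption, chosen to match the form G = aR + b).
        red.append(signal_r + rng.gauss(0.0, bg_sd))
        green.append(signal_g + rng.gauss(bg_bias, bg_sd))
    return red, green


if __name__ == "__main__":
    red, green = simulate_null_array(bg_bias=200.0)
    offset = sum(green) / len(green) - sum(red) / len(red)
    print(f"mean green - mean red = {offset:.1f}")
```

With a non-zero `bg_bias`, the mean difference between the channels should land close to the bias, which is the offset the linear relation is meant to recover.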

  20. Scatter plot of simulated expression data (dog-leg effect)

  21. Correction of background estimation (cont.) • To fit G = aR + b, we employ a chi-square fitting method that minimizes
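
The objective function from this slide is not reproduced in the transcript. As a minimal sketch, assume the standard chi-square for a straight line with measurement error in both coordinates, that is, the sum over genes of (G_k - a R_k - b)^2 / (sd_G_k^2 + a^2 sd_R_k^2). Whether this matches the authors' exact objective is an assumption, and the function names, the search range for the slope, and the use of golden-section search are illustrative choices.

```python
import math


def chi_square(a, red, green, sd_red, sd_green):
    """Return (chi2, b) for slope a, with the intercept b profiled out."""
    weights = [1.0 / (sg * sg + a * a * sr * sr)
               for sr, sg in zip(sd_red, sd_green)]
    # For fixed a, the minimizing intercept is a weighted mean of G - a*R.
    b = sum(w * (g - a * r) for w, r, g in zip(weights, red, green)) / sum(weights)
    chi2 = sum(w * (g - a * r - b) ** 2 for w, r, g in zip(weights, red, green))
    return chi2, b


def fit_line(red, green, sd_red, sd_green, a_lo=0.2, a_hi=5.0, tol=1e-6):
    """Golden-section search over the slope.

    Assumes chi2 has a single minimum on [a_lo, a_hi].
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0

    def objective(a):
        return chi_square(a, red, green, sd_red, sd_green)[0]

    x1 = a_hi - inv_phi * (a_hi - a_lo)
    x2 = a_lo + inv_phi * (a_hi - a_lo)
    f1, f2 = objective(x1), objective(x2)
    while a_hi - a_lo > tol:
        if f1 < f2:  # minimum lies in [a_lo, x2]
            a_hi, x2, f2 = x2, x1, f1
            x1 = a_hi - inv_phi * (a_hi - a_lo)
            f1 = objective(x1)
        else:        # minimum lies in [x1, a_hi]
            a_lo, x1, f1 = x1, x2, f2
            x2 = a_lo + inv_phi * (a_hi - a_lo)
            f2 = objective(x2)
    a = 0.5 * (a_lo + a_hi)
    return a, chi_square(a, red, green, sd_red, sd_green)[1]
```

Profiling out the intercept reduces the problem to a one-dimensional search over the slope, which keeps the sketch free of external dependencies; a general-purpose optimizer would serve equally well.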

  22. Quality Metric for Ratio Statistics • For a given cDNA target, the following factors affect ratio measurement quality: • Weak fluorescent intensities • A smaller than normal detected target area • A very high local background level • A high standard deviation of target intensity

  23. (1) Fluorescent intensity measurement quality • Under the null hypothesis, the signal means are equal, so that

  24. (2) Target area measurement quality

  25. (3) Background flatness quality • Define background flatness

  26. (4) Signal intensity consistency quality • Typical target shapes (figure: example targets with cv = 0.48, 0.81, 0.45, 0.98, 0.31, and 0.59)

  27. (4) Signal intensity consistency quality (cont.)
