Bias, Variance, and Fit for Three Measures of Expression: AvDiff, Li & Wong's, and AvLog(PM-BG)

Rafael A. Irizarry, Department of Biostatistics, JHU (joint work with Bridget Hobbs and Terry Speed, Walter & Eliza Hall Institute of Medical Research).

Presentation Transcript


  1. Bias, Variance, and Fit for Three Measures of Expression: AvDiff, Li & Wong's, and AvLog(PM-BG). Rafael A. Irizarry, Department of Biostatistics, JHU (joint work with Bridget Hobbs and Terry Speed, Walter & Eliza Hall Institute of Medical Research)

  2. Summary
     • Summarize the expression level of a probe set by Average Log2(PM-BG)
     • PMs need to be normalized
     • Background makes no use of probe-specific MM
     • Evaluate and compare through bias, variance and model fit to AvDiff and the Li & Wong algorithm
     • Use Gene Logic spike-in and dilution study
     • All three expression measures performed well
     • AvLog(PM-BG) is arguably the best of the three

  3. SD vs. Avg of Defective Probes

  4. Normalization at Probe Level

  5. Expression after Normalization

  6. Background Distribution

  7. Average Log2(PM-BG)
     • Normalize probe level data
     • Compute BG = background mean by estimating the mode of the MM distribution
     • Subtract BG from each PM
     • If PM-BG < 0, use the minimum of the positive PM-BG values divided by 2
     • Take the average of the Log2(PM-BG) values
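The steps on this slide can be sketched in Python with NumPy. This is only an illustration under stated assumptions: the slides do not say how the mode of the MM distribution is estimated, so `estimate_mode` (a hypothetical helper) simply takes the midpoint of the fullest histogram bin, and the "minimum of positives" is taken over the PM values passed in. The function names and the `bins` parameter are not from the talk.

```python
import numpy as np

def estimate_mode(values, bins=50):
    """Crude mode estimate: midpoint of the fullest histogram bin.
    Assumption: the slides do not specify the mode estimator."""
    counts, edges = np.histogram(np.asarray(values, dtype=float), bins=bins)
    k = int(np.argmax(counts))
    return 0.5 * (edges[k] + edges[k + 1])

def avlog_pm_bg(pm, bg):
    """Average log2(PM - BG) for one probe set on one chip.
    pm: normalized PM intensities; bg: global background value."""
    diff = np.asarray(pm, dtype=float) - bg
    positives = diff[diff > 0]
    if positives.size == 0:
        raise ValueError("no PM value exceeds the background")
    # Non-positive differences are replaced by half the smallest positive one.
    # (Zero is treated like a negative value so the log stays finite.)
    floor = positives.min() / 2.0
    diff = np.where(diff > 0, diff, floor)
    return float(np.mean(np.log2(diff)))

# Toy numbers, not data from the study.
pm = np.array([108.0, 132.0, 90.0, 356.0])
expression = avlog_pm_bg(pm, bg=100.0)
```

In practice `bg` would come from something like `estimate_mode(mm)` applied to the chip's MM intensities after normalization.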

  8. Spike-In Experiments
     • Add 11 foreign-species cRNAs to the hybridization mixture at concentrations from 0.5 pM to 100 pM
     • Set A: 11 control cRNAs were spiked in, all at the same concentration, which varied across chips
     • Set B: 11 control cRNAs were spiked in, all at different concentrations, which varied across chips. The concentrations were arranged in a 12x12 cyclic Latin square (with 3 replicates)
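A cyclic Latin square like the one used for Set B can be generated in a few lines. The sketch below only builds the index pattern: entry (i, j) is the index of the concentration level used in row i, column j. The actual concentration levels and the mapping of rows and columns to chips and transcripts are not given in the slides, so they are left out here.

```python
import numpy as np

def cyclic_latin_square(n):
    """n x n cyclic Latin square: each row is the previous row shifted by one.
    Entry (i, j) is a level index in 0..n-1."""
    idx = np.arange(n)
    return (idx[:, None] + idx[None, :]) % n

# 12 x 12 design; which axis is "chip" and which is "transcript group"
# is an assumption left to the reader.
design = cyclic_latin_square(12)
```

Every level index appears exactly once in each row and each column, which is what lets each spiked-in transcript be observed at every concentration level across the set of chips.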

  9. Why Remove Background?

  10. Probe Level Data (12 chips)

  11. What Did We Learn?
     • Don't subtract or divide by MM
     • Probe effect is additive on log scale
     • Take logs

  12. Expression Level

  13. Spike-In B • Later we consider 24 different combinations of concentrations

  14. Differential Expression

  15. Observed vs True Ratio

  16. Dilution Experiment
     • cRNA hybridized to human chip (HGU_95) in a range of proportions and dilutions
     • Dilution series begins at 1.25 μg cRNA per GeneChip array and rises through 2.5, 5.0, 7.5 and 10.0 to 20.0 μg per array. Five replicate chips were used at each dilution
     • Normalize just within each set of 5 replicates
     • For each probe set compute expression, average and SD over replicates, and fit a line to log expression vs. log concentration
     • Regression line should have slope 1 and high R²
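The slope and R² check described on this slide amounts to an ordinary least-squares line fit. A minimal sketch, assuming base-2 logs and using synthetic expression values (not data from the study) that follow the ideal slope of 1; `slope_and_r2` is an illustrative name:

```python
import numpy as np

def slope_and_r2(log_conc, log_expr):
    """Least-squares line of log expression on log concentration.
    Returns (slope, R^2)."""
    x = np.asarray(log_conc, dtype=float)
    y = np.asarray(log_expr, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    fitted = slope * x + intercept
    ss_res = np.sum((y - fitted) ** 2)      # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)    # total sum of squares
    return float(slope), float(1.0 - ss_res / ss_tot)

# cRNA amounts per array (micrograms), as listed on the slide.
amounts_ug = np.array([1.25, 2.5, 5.0, 7.5, 10.0, 20.0])
log_conc = np.log2(amounts_ug)

# Synthetic, idealized probe set: expression exactly proportional to amount.
log_expr = log_conc + 6.0
slope, r2 = slope_and_r2(log_conc, log_expr)
```

For real data, `log_expr` would hold the replicate-averaged log expression of one probe set at each dilution, and the fit would be repeated per probe set and per expression measure.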

  17. Dilution Experiment Data

  18. Expression and SD

  19. Slope Estimates and R²

  20. Model check
     • Compute observed SD of 5 replicate expression estimates
     • Compute RMS of 5 nominal SDs
     • Compare by taking the log ratio
     • Closeness of observed and nominal SD taken as a measure of goodness of fit of the model
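The comparison above can be written out directly. This is a sketch under assumptions: the slide does not state the base of the log or the SD convention, so the code uses log2 and the sample SD (`ddof=1`); `model_check_log_ratio` is an illustrative name and the numbers are toy values.

```python
import numpy as np

def model_check_log_ratio(estimates, nominal_sds):
    """log2(observed SD / RMS of nominal SDs) for one probe set.
    estimates: replicate expression estimates (e.g. 5 chips).
    nominal_sds: model-based SD reported for each replicate."""
    observed_sd = np.std(np.asarray(estimates, dtype=float), ddof=1)
    rms_nominal = np.sqrt(np.mean(np.asarray(nominal_sds, dtype=float) ** 2))
    return float(np.log2(observed_sd / rms_nominal))

# Toy example: the nominal SDs match the observed spread, so the ratio is 1.
estimates = [1.0, 2.0, 3.0, 4.0, 5.0]
nominal = np.full(5, np.sqrt(2.5))
log_ratio = model_check_log_ratio(estimates, nominal)
```

A value near 0 means the model's nominal SD agrees with the replicate-to-replicate variation; positive values mean the model understates the variability, negative values mean it overstates it.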

  21. Observed vs. Model SE

  22. Observed vs. Model SE

  23. Conclusion
     • Take logs
     • PMs need to be normalized
     • Using global background improves on use of probe-specific MM
     • The Gene Logic spike-in and dilution studies show that all three expression measures performed very well
     • AvLog(PM-BG) is arguably the best in terms of bias, variance and model fit
     • Future: better BG; robust/resistant summaries

  24. Acknowledgements
     • Gene Brown's group at Wyeth/Genetics Institute, and Uwe Scherf's Genomics Research & Development Group at Gene Logic, for generating the spike-in and dilution data
     • Gene Logic for permission to use these data
     • Francois Collin (Gene Logic)
     • Ben Bolstad (UC Berkeley)
     • Magnus Åstrand (Astra Zeneca Mölndal)
