
Topics in the Development and Validation of Gene Expression Profiling Based Predictive Classifiers


Presentation Transcript


  1. Topics in the Development and Validation of Gene Expression Profiling Based Predictive Classifiers Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute linus.nci.nih.gov/brb

  2. BRB Website http://linus.nci.nih.gov/brb • PowerPoint presentations and audio files • Reprints & Technical Reports • BRB-ArrayTools software • BRB-ArrayTools Data Archive • Sample Size Planning for Targeted Clinical Trials

  3. Simplified Description of Microarray Assay • Extract mRNA from cells of interest • Each mRNA molecule is transcribed from a single gene and has a linear sequence complementary to that gene • Convert the mRNA to cDNA, incorporating a fluorescently labeled dye into each molecule • Distribute the cDNA sample over a solid surface containing “probes” of DNA representing all “genes”; the probes are at known locations on the surface • Let the molecules from the sample hybridize with the probes for the corresponding genes • Wash off excess sample and illuminate the surface with a laser at the frequency corresponding to the dye • Measure the intensity of fluorescence over each probe

  4. Resulting Data • Intensity over a probe is approximately proportional to abundance of mRNA molecules in the sample for the gene corresponding to the probe • 40,000 variables measured for each case • Excessive hype • Excessive skepticism • Some familiar statistical paradigms don’t work well

  5. Good Microarray Studies Have Clear Objectives • Class Comparison (Gene Finding) • Find genes whose expression differs among predetermined classes, e.g. tissue or experimental condition • Class Prediction • Prediction of predetermined class (e.g. treatment outcome) using information from gene expression profile • Survival risk-group prediction • Class Discovery • Discover clusters of specimens having similar expression profiles

  6. Class Comparison and Class Prediction • Not clustering problems • Supervised methods

  7. Class Prediction ≠ Class Comparison • A set of genes is not a predictive model • Emphasis in class comparison is often on understanding biological mechanisms • More difficult than accurate prediction and usually requires a different experiment • Demonstrating statistical significance of prognostic factors is not the same as demonstrating predictive accuracy

  8. Components of Class Prediction • Feature (gene) selection • Which genes will be included in the model • Select model type • E.g. Diagonal linear discriminant analysis, Nearest-Neighbor, … • Fitting parameters (regression coefficients) for model • Selecting value of tuning parameters

  9. Feature Selection • Genes that are differentially expressed among the classes at a significance level α (e.g. 0.01) • The α level is a tuning parameter • Number of false discoveries is not of direct relevance for prediction • For prediction it is usually more serious to exclude an informative variable than to include some noise variables
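The selection rule on this slide can be sketched in a few lines of Python. This is an illustrative sketch on simulated data, not BRB-ArrayTools code: for brevity it thresholds the two-sample t statistic at |t| > 2.58 (roughly the two-sided α = 0.01 cutoff for moderate degrees of freedom) instead of computing exact p-values, and all function names are assumptions.

```python
import random
import statistics

def t_statistic(x, y):
    """Two-sample t statistic with pooled variance."""
    nx, ny = len(x), len(y)
    sp2 = ((nx - 1) * statistics.variance(x) +
           (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    return (statistics.mean(x) - statistics.mean(y)) / (sp2 * (1/nx + 1/ny)) ** 0.5

def select_genes(class1, class2, t_cutoff=2.58):
    """Indices of genes whose |t| exceeds the cutoff
    (|t| > 2.58 is roughly two-sided alpha = 0.01 for moderate df)."""
    selected = []
    for g in range(len(class1[0])):
        x = [s[g] for s in class1]
        y = [s[g] for s in class2]
        if abs(t_statistic(x, y)) > t_cutoff:
            selected.append(g)
    return selected

random.seed(0)
# 10 arrays per class, 1,000 genes; only gene 0 is truly differential
class1 = [[random.gauss(0, 1) for _ in range(1000)] for _ in range(10)]
class2 = [[random.gauss(0, 1) for _ in range(1000)] for _ in range(10)]
for s in class2:
    s[0] += 3.0  # shift the informative gene in class 2

genes = select_genes(class1, class2)
print(len(genes), 0 in genes)  # informative gene plus a few noise genes
```

Consistent with the slide, a handful of noise genes pass the cutoff; for prediction this is usually less harmful than excluding an informative gene.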

  10. Optimal significance level cutoffs for gene selection. 50 differentially expressed genes out of 22,000 genes on the microarrays

  11. Complex Gene Selection • Small subset of genes which together give most accurate predictions • Genetic algorithms • Little evidence that complex feature selection is useful in microarray problems

  12. Linear Classifiers for Two Classes

  13. Linear Classifiers for Two Classes • Fisher linear discriminant analysis • Diagonal linear discriminant analysis (DLDA) • Ignores correlations among genes • Compound covariate predictor • Golub’s weighted voting method • Support vector machines with inner product kernel • Perceptrons
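A minimal sketch of diagonal linear discriminant analysis as described on this slide: each gene contributes a variance-scaled squared distance to the class mean, and correlations among genes are ignored. Equal class prevalences are assumed, and the function names and toy data are illustrative, not from any particular package.

```python
import statistics

def dlda_fit(X, y):
    """Per-class gene means plus pooled per-gene variances
    (correlations among genes are ignored)."""
    classes = sorted(set(y))
    n_genes = len(X[0])
    means = {c: [statistics.mean(X[i][g] for i in range(len(X)) if y[i] == c)
                 for g in range(n_genes)]
             for c in classes}
    var = []
    for g in range(n_genes):
        ss, df = 0.0, 0
        for c in classes:
            vals = [X[i][g] for i in range(len(X)) if y[i] == c]
            ss += sum((v - means[c][g]) ** 2 for v in vals)
            df += len(vals) - 1
        var.append(ss / df)  # pooled within-class variance for gene g
    return classes, means, var

def dlda_predict(model, x):
    """Assign x to the class minimizing the variance-scaled
    squared distance to the class mean profile."""
    classes, means, var = model
    return min(classes,
               key=lambda c: sum((x[g] - means[c][g]) ** 2 / var[g]
                                 for g in range(len(x))))

# toy data: gene 0 separates the classes, gene 1 is noise
X = [[0.1, 5.0], [0.2, 4.0], [2.1, 5.1], [1.9, 4.2]]
y = [0, 0, 1, 1]
model = dlda_fit(X, y)
print(dlda_predict(model, [0.0, 4.5]))  # -> 0
print(dlda_predict(model, [2.0, 4.5]))  # -> 1
```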

  14. When p>>n • It is always possible to find a set of features and a weight vector for which the classification error on the training set is zero. • There is generally not sufficient information in p>>n training sets to effectively use more complex methods

  15. Myth • Complex classification algorithms such as neural networks perform better than simpler methods for class prediction.

  16. Comparative studies have shown that simpler methods work as well or better for microarray problems because they avoid overfitting the data.

  17. Other Simple Methods • Nearest neighbor classification • Nearest k-neighbors • Nearest centroid classification • Shrunken centroid classification

  18. Evaluating a Classifier • Most statistical methods were not developed for p>>n prediction problems • Fit of a model to the same data used to develop it is no evidence of prediction accuracy for independent data • Demonstrating statistical significance of prognostic factors is not the same as demonstrating predictive accuracy • Testing whether analysis of independent data results in selection of the same set of genes is not an appropriate test of predictive accuracy of a classifier

  19. Internal Validation of a Classifier • Re-substitution estimate • Develop classifier on dataset, test predictions on same data • Very biased for p>>n • Split-sample validation • Cross-validation

  20. Split-Sample Evaluation • Training set • Used to select features, select model type, determine parameters and cut-off thresholds • Test set • Withheld until a single model is fully specified using the training set • The fully specified model is applied to the expression profiles in the test set to predict class labels • The number of errors is counted
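The split-sample procedure above can be sketched as follows, here with a nearest-centroid classifier on simulated data: all model building uses only the training set, and the held-out test set is touched exactly once. Names and data are illustrative assumptions.

```python
import random
import statistics

def fit_nearest_centroid(X, y):
    """Mean expression profile for each class."""
    classes = sorted(set(y))
    return {c: [statistics.mean(X[i][g] for i in range(len(X)) if y[i] == c)
                for g in range(len(X[0]))]
            for c in classes}

def predict(centroids, x):
    """Assign x to the class with the nearest centroid."""
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))

random.seed(1)
# synthetic two-class data: 30 cases, 20 genes, gene 0 informative
X, y = [], []
for i in range(30):
    label = i % 2
    profile = [random.gauss(0, 1) for _ in range(20)]
    profile[0] += 2.0 * label
    X.append(profile)
    y.append(label)

# one split; the test set is withheld until the model is fully specified
train_idx, test_idx = list(range(20)), list(range(20, 30))
model = fit_nearest_centroid([X[i] for i in train_idx], [y[i] for i in train_idx])

# the fully specified model is applied once to the test set; errors are counted
errors = sum(predict(model, X[i]) != y[i] for i in test_idx)
print(errors, "errors on", len(test_idx), "test cases")
```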

  21. Leave-one-out Cross Validation • Omit sample 1 • Develop multivariate classifier from scratch on training set with sample 1 omitted • Predict class for sample 1 and record whether prediction is correct

  22. Leave-one-out Cross Validation • Repeat analysis for training sets with each single sample omitted one at a time • e = number of misclassifications determined by cross-validation • Subdivide e for estimation of sensitivity and specificity

  23. With proper cross-validation, the model must be developed from scratch for each leave-one-out training set; in particular, feature selection must be repeated within each leave-one-out training set. • Simon R, Radmacher MD, Dobbin K, McShane LM. Pitfalls in the analysis of DNA microarray data. Journal of the National Cancer Institute 95:14-18, 2003. • The cross-validated estimate of misclassification error is an estimate of the prediction error of the model obtained by applying the specified algorithm to the full dataset

  24. Prediction on Simulated Null Data • Generation of Gene Expression Profiles • 14 specimens (Pi is the expression profile for specimen i) • Log-ratio measurements on 6000 genes • Pi ~ MVN(0, I6000) • Can we distinguish between the first 7 specimens (Class 1) and the last 7 (Class 2)? • Prediction Method • Compound covariate prediction • Compound covariate built from the log-ratios of the 10 most differentially expressed genes.
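The simulation on this slide can be reproduced in miniature (1,000 genes rather than 6,000 to keep the sketch fast; the function names are assumptions). On pure-noise data, a compound covariate built from the 10 most differentially expressed genes fits the training set almost perfectly, while honest LOOCV, with gene selection repeated in every fold, shows roughly 50% error.

```python
import math
import random
import statistics

def t_stat(x, y):
    """Two-sample t statistic with pooled variance."""
    nx, ny = len(x), len(y)
    sp2 = ((nx - 1) * statistics.variance(x) +
           (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    return (statistics.mean(x) - statistics.mean(y)) / math.sqrt(sp2 * (1/nx + 1/ny))

def build_cc_classifier(X, y, k=10):
    """Compound covariate predictor: weight the k most differentially
    expressed genes by their t statistics, then classify by the nearer
    class mean of the compound covariate."""
    scores = []
    for g in range(len(X[0])):
        x0 = [X[i][g] for i in range(len(X)) if y[i] == 0]
        x1 = [X[i][g] for i in range(len(X)) if y[i] == 1]
        scores.append((g, t_stat(x0, x1)))
    top = sorted(scores, key=lambda s: abs(s[1]), reverse=True)[:k]
    cc = lambda prof: sum(t * prof[g] for g, t in top)
    m0 = statistics.mean(cc(X[i]) for i in range(len(X)) if y[i] == 0)
    m1 = statistics.mean(cc(X[i]) for i in range(len(X)) if y[i] == 1)
    return lambda prof: 0 if abs(cc(prof) - m0) < abs(cc(prof) - m1) else 1

random.seed(2)
n, p = 14, 1000
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]  # pure noise
y = [0] * 7 + [1] * 7

# Resubstitution: genes selected and model fit on all 14 specimens
clf = build_cc_classifier(X, y)
resub_errors = sum(clf(X[i]) != y[i] for i in range(n))

# Proper LOOCV: gene selection redone from scratch in every fold
cv_errors = 0
for i in range(n):
    clf_i = build_cc_classifier(X[:i] + X[i+1:], y[:i] + y[i+1:])
    cv_errors += clf_i(X[i]) != y[i]

print(resub_errors, cv_errors)  # near-perfect fit vs roughly 50% CV error
```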

  25. Major Flaws Found in 40 Studies Published in 2004 • Inadequate control of multiple comparisons in gene finding • 9/23 studies had unclear or inadequate methods to deal with false positives • 10,000 genes × 0.05 significance level = 500 expected false positives • Misleading reports of prediction accuracy • 12/28 reports based on incomplete cross-validation • Misleading use of cluster analysis • 13/28 studies invalidly claimed that expression clusters based on differentially expressed genes could help distinguish clinical outcomes • 50% of studies contained one or more major flaws

  26. Myth • Split sample validation is superior to LOOCV or 10-fold CV for estimating prediction error

  27. Comparison of Internal Validation Methods (Molinaro, Pfeiffer & Simon) • For small sample sizes, LOOCV is much less biased than split-sample validation • For small sample sizes, LOOCV is preferable to 10-fold or 5-fold cross-validation or repeated k-fold versions • For moderate sample sizes, 10-fold is preferable to LOOCV • Some claims for bootstrap resampling for estimating prediction error are not valid for p>>n problems

  28. Simulated Data: 40 cases, 10 genes selected from 5000

  29. Simulated Data: 40 cases

  30. DLBCL Data

  31. Ordinary bootstrap • Training and test sets overlap • Bootstrap cross-validation (Fu, Carroll, Wang) • Perform LOOCV on bootstrap samples • Training and test sets still overlap • Leave-one-out bootstrap • Predict for cases not in the bootstrap sample • Training sets are too small • Out-of-bag bootstrap (Breiman) • Predict for case i by majority rule over predictions from bootstrap samples not containing case i • .632+ bootstrap • Weighted combination w·LOOBS + (1−w)·RSB of the leave-one-out bootstrap (LOOBS) and re-substitution (RSB) estimates

  32. Permutation Distribution of Cross-validated Misclassification Rate of a Multivariate Classifier • Randomly permute class labels and repeat the entire cross-validation • Re-do for all (or 1000) random permutations of class labels • Permutation p value is fraction of random permutations that gave as few misclassifications as e in the real data
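A sketch of the permutation test described above: the entire cross-validation, including gene selection in every fold, is repeated for each random relabeling. For brevity this illustration scores genes by mean difference rather than the t statistic, uses 50 permutations rather than 1,000, and simulates a small informative dataset; all names and settings are assumptions.

```python
import random
import statistics

def loocv_error(X, y, k=10):
    """LOOCV misclassifications of a simple compound-covariate classifier;
    gene scoring and selection are repeated in every fold."""
    n, p = len(X), len(X[0])
    errors = 0
    for i in range(n):
        tr = [j for j in range(n) if j != i]
        def class_mean(c, g):
            vals = [X[j][g] for j in tr if y[j] == c]
            return sum(vals) / len(vals)
        # score genes by mean difference (a t statistic in the real method)
        diffs = [(class_mean(1, g) - class_mean(0, g), g) for g in range(p)]
        top = sorted(diffs, key=lambda d: abs(d[0]), reverse=True)[:k]
        cc = lambda prof: sum(d * prof[g] for d, g in top)
        m0 = statistics.mean(cc(X[j]) for j in tr if y[j] == 0)
        m1 = statistics.mean(cc(X[j]) for j in tr if y[j] == 1)
        pred = 0 if abs(cc(X[i]) - m0) < abs(cc(X[i]) - m1) else 1
        errors += pred != y[i]
    return errors

random.seed(3)
n, p = 14, 100
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [0] * 7 + [1] * 7
for i in range(7, 14):          # genes 0-4 are truly differential
    for g in range(5):
        X[i][g] += 2.5

e = loocv_error(X, y)           # misclassifications with the real labels

# permutation distribution: repeat the ENTIRE cross-validation per relabeling
n_perm, count = 50, 0
labels = y[:]
for _ in range(n_perm):
    random.shuffle(labels)
    if loocv_error(X, labels) <= e:
        count += 1
p_value = count / n_perm        # fraction of permutations with <= e errors
print(e, p_value)
```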

  33. Does an Expression Profile Classifier Predict More Accurately Than Standard Prognostic Variables? • Not an issue of which variables are significant after adjusting for which others or which are independent predictors • Predictive accuracy, not significance • The two classifiers can be compared by ROC analysis as functions of the threshold for classification • The predictiveness of the expression profile classifier can be evaluated within levels of the classifier based on standard prognostic variables

  34. Does an Expression Profile Classifier Predict More Accurately Than Standard Prognostic Variables? • Some publications fit logistic model to standard covariates and the cross-validated predictions of expression profile classifiers • This is valid only with split-sample analysis because the cross-validated predictions are not independent

  35. Survival Risk Group Prediction • For analyzing right censored data to develop predictive classifiers it is not necessary to make the data binary • Can do cross-validation to predict high or low risk group for each case • Compute Kaplan-Meier curves of predicted risk groups • Permutation significance of log-rank statistic • Implemented in BRB-ArrayTools • BRB-ArrayTools also provides for comparing the risk group classifier based on expression profiles to one based on standard covariates and one based on a combination of both types of variables
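To make the risk-group display concrete, here is a minimal Kaplan-Meier estimator. The risk-group assignments below are hypothetical stand-ins for cross-validated predictions; this is an illustrative sketch, not the BRB-ArrayTools implementation, and it omits the log-rank permutation test.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier curve: at each distinct event time t,
    S *= 1 - d_t / n_t (d_t deaths at t, n_t at risk just before t)."""
    s, curve = 1.0, []
    for t in sorted(set(times)):
        n_t = sum(1 for tt in times if tt >= t)
        d_t = sum(1 for tt, ev in zip(times, events) if tt == t and ev == 1)
        if d_t:
            s *= 1 - d_t / n_t
            curve.append((t, s))
    return curve

# hypothetical cross-validated risk groups (event: 1 = death, 0 = censored)
high_risk = kaplan_meier([2, 4, 5, 7, 9], [1, 1, 0, 1, 1])
low_risk = kaplan_meier([8, 10, 12, 15, 20], [0, 1, 0, 0, 1])
print(high_risk)
print(low_risk)
```

The two curves would then be plotted together, with separation between them assessed by the permutation significance of the log-rank statistic, as the slide describes.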

  36. Myth • Huge sample sizes are needed to develop effective predictive classifiers

  37. Sample Size Planning References • K Dobbin, R Simon. Sample size determination in microarray experiments for class comparison and prognostic classification. Biostatistics 6:27-38, 2005 • K Dobbin, R Simon. Sample size planning for developing classifiers using high dimensional DNA microarray data. Biostatistics (2007)
