Comparing Predictive Accuracy and Correct Classification

Yaacov Petscher, Barbara Foorman, Leilani Saez, Anne Bishop, & Christopher Schatschneider
The Florida State University
Florida Center for Reading Research

Abstract / Introduction

With the passage of the No Child Left Behind Act (NCLB), educators have focused on identifying students who are likely to be at risk for future reading problems. A significant issue facing researchers today is the development and validation of screening instruments to assess such problems. Traditional approaches to diagnostic accuracy maximize correct classification as the predominant means of establishing the clinical or practical utility of a screening instrument. This practice has been accepted largely because of screening practices that are typical in the medical and psychological research communities. A shortcoming of this paradigm is that it ignores the base rate of the problem in one’s sample. It is well known that base rate information is typically ignored in the assessment of diagnostic validity, and that predictive accuracy varies as a function of the base rate. Correct classification indices (sensitivity and specificity) are unaffected by base rates, whereas predictive accuracy indices (positive and negative predictive power) are affected by them.

The purpose of the present study was to examine the trade-off between maximizing the percentage of students that a new screener correctly classifies as at-risk or not at-risk for reading comprehension failure on the SAT-10 and maximizing the screener's predictive accuracy for risk on the SAT-10. Typical assessments seek to maximize correct classification based on sensitivity and specificity; we were instead interested in achieving 90% negative predictive power (i.e., 90% of students who are identified as not at-risk on the screen end up not at-risk on the criterion).

Method

Our analyses were based on a representative sample of 1,935 kindergarten through second grade students who were administered a new screening inventory.
The screener consisted of four tasks: Letter Naming, Letter Sounds, Phonological Awareness, and Word Reading. Students were also tested on the SESAT (kindergarten) or the SAT-10 (first and second grade). Logistic regression and ROC analyses were used to determine cut-points, and the cut-points that maximized correct classification were compared with those that maximized predictive accuracy.

Results

[Figure panels: Fall, Winter, Spring. Based on cumulative frequency %.]

Conclusions

When looking at the predictive accuracy of a screener, it is important to carefully consider the trade-offs between correct classification and accounting for the prevalence of risk. While correct classification has been argued to be the most important element in diagnostic accuracy, this argument has largely been relevant to studies in the medical and clinical psychology fields. Accounting for base rates, while leading to possible over-identification, reduces the likelihood of “missing” students who are at risk and thus decreases the need for intensive interventions.
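
To illustrate the base-rate argument, here is a minimal Python sketch. It is not the authors' analysis, which used logistic regression and ROC analyses. The function names (`predictive_values`, `classification_counts`, `lowest_cut_meeting_npv`), the sensitivity and specificity of .80, and the assumption that students scoring below the cut-point are flagged as at-risk are all hypothetical choices made for illustration. The first function shows that positive and negative predictive power change with the base rate even when sensitivity and specificity are held fixed. The last one searches for the lowest cut-point that reaches a target negative predictive power such as .90.

```python
def predictive_values(sensitivity, specificity, base_rate):
    """Return (PPV, NPV) implied by Bayes' rule for a given base rate."""
    true_pos = sensitivity * base_rate
    false_neg = (1 - sensitivity) * base_rate
    true_neg = specificity * (1 - base_rate)
    false_pos = (1 - specificity) * (1 - base_rate)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


def classification_counts(scores, at_risk, cut):
    """Count (TP, FP, TN, FN), flagging scores below `cut` as at-risk.

    Assumption: lower screener scores indicate greater risk.
    """
    tp = fp = tn = fn = 0
    for score, truly_at_risk in zip(scores, at_risk):
        flagged = score < cut
        if flagged and truly_at_risk:
            tp += 1
        elif flagged:
            fp += 1
        elif truly_at_risk:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def lowest_cut_meeting_npv(scores, at_risk, target_npv=0.90):
    """Return the lowest observed score usable as a cut-point with NPV >= target."""
    for cut in sorted(set(scores)):
        tp, fp, tn, fn = classification_counts(scores, at_risk, cut)
        if tn + fn == 0:
            continue  # no student screened as not at-risk; NPV undefined
        if tn / (tn + fn) >= target_npv:
            return cut
    return None  # no cut-point reaches the target


if __name__ == "__main__":
    # Illustrative values only; these are not estimates from the study.
    for base_rate in (0.10, 0.30, 0.50):
        ppv, npv = predictive_values(0.80, 0.80, base_rate)
        print(f"base rate {base_rate:.2f}: PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

A lower cut-point flags fewer students, so returning the lowest cut-point that meets the target limits over-identification while still keeping the share of missed at-risk students among those screened as not at-risk at or below 10%.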