The effect of standardized testing on student achievement: Meta-analyses and summary of the research. Richard P. PHELPS. International Test Commission, 7th Conference, Hong Kong, July 2010.
The effect of standardized testing on student achievement: Meta-analyses and summary of the research Richard P. PHELPS © 2010, Richard P PHELPS International Test Commission, 7th Conference, Hong Kong, July, 2010
Meta-analysis • A method for summarizing a large research literature, with a single, comparable measure.
The effect of standardized testing on student achievement • 12-year study, almost finished • analyzed close to 700 separate studies and more than 1,500 separate effects • 2,000 other studies were reviewed and found incomplete or inappropriate • for lack of time and money, hundreds of other studies will not be reviewed
Looking for studies to include in the meta-analyses • Included only those studies that found an effect from testing on student achievement or on teacher instruction…
Studies included in the meta-analyses • …when: • a test is newly introduced, or newly removed • quantity of testing is increased or reduced • test stakes are introduced or increased, or removed or reduced
Studies included in the meta-analyses • …plus previous research summaries • Kulik, Kulik, Bangert-Drowns, & Schwalb (1983-1991) on mastery testing, frequency of testing, and programs for high-risk university students • Basol & Johanson (2009) on testing frequency • Jaekyung Lee (2007) on cross-state studies • W.J. Haynie (2007) in career-tech education
Number of studies of effects, by methodology type
Effect size: Cohen’s d • $d = (Y_E - Y_C) / S_{pooled}$ • $Y_E$ = mean, experimental group • $Y_C$ = mean, control group • $S_{pooled}$ = pooled standard deviation
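The definition above can be sketched in a few lines of Python. The slide does not say how the pooled standard deviation is formed, so this sketch assumes the common degrees-of-freedom-weighted pooling of the two sample variances. The function name `cohens_d` and the score lists are hypothetical, chosen only for illustration.

```python
import math
from statistics import mean, variance


def cohens_d(experimental, control):
    """Cohen's d: difference in group means divided by the pooled SD.

    Assumption (not stated on the slide): the pooled variance weights
    each sample variance by its degrees of freedom (n - 1).
    """
    n_e, n_c = len(experimental), len(control)
    pooled_variance = (
        (n_e - 1) * variance(experimental) + (n_c - 1) * variance(control)
    ) / (n_e + n_c - 2)
    return (mean(experimental) - mean(control)) / math.sqrt(pooled_variance)


# Hypothetical scores, invented for this example only.
tested_group = [2, 3, 4]
untested_group = [1, 2, 3]
print(cohens_d(tested_group, untested_group))
```

Both hypothetical groups have a sample variance of 1, so the one-point difference in means is also a one-standard-deviation difference.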
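Effect size: Other formulae • from a t statistic: $d = t \sqrt{(n_1 + n_2) / (n_1 n_2)}$ • from a correlation: $d = 2r / \sqrt{1 - r^2}$ • from a pre-post design with a control group: $d = [(Y_{E,post} - Y_{E,pre}) - (Y_{C,post} - Y_{C,pre})] / S_{pooled,post}$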
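When a study reports only a t statistic or a correlation, the first two conversions on this slide are one-liners. The sketch below assumes an independent-groups t test with group sizes `n1` and `n2`; the function names are hypothetical.

```python
import math


def d_from_t(t, n1, n2):
    """Convert an independent-groups t statistic to d."""
    return t * math.sqrt((n1 + n2) / (n1 * n2))


def d_from_r(r):
    """Convert a correlation r to d (requires |r| < 1)."""
    return 2 * r / math.sqrt(1 - r ** 2)


# Illustrative inputs, not taken from any study in the meta-analysis.
print(d_from_t(2.0, 50, 50))
print(d_from_r(0.6))
```

Putting every study on the d scale in this way is what lets results from different designs be combined into a single summary measure.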
Effect size: Interpretation • d between 0.25 and 0.50: weak effect • d between 0.50 and 0.75: medium effect • d more than 0.75: strong effect
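The thresholds above map directly onto a small lookup function. The slide leaves three details open, so the sketch assumes them: values below 0.25 are returned as "unclassified", boundary values fall into the higher band except that exactly 0.75 stays "medium" (the slide says "more than 0.75"), and the sign of d is ignored. The function name is hypothetical.

```python
def interpret_effect_size(d):
    """Label an effect size using the slide's thresholds.

    Assumptions (not on the slide): the magnitude |d| is used, values
    under 0.25 get the label "unclassified", and ties at 0.25 and 0.50
    go to the higher band.
    """
    magnitude = abs(d)
    if magnitude > 0.75:
        return "strong"
    if magnitude >= 0.50:
        return "medium"
    if magnitude >= 0.25:
        return "weak"
    return "unclassified"


for value in (0.10, 0.30, 0.53, 0.80):
    print(value, interpret_effect_size(value))
```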
Quantitative studies: Preliminary results
Quantitative studies: Effect size • “Bare bones” calculation: d ≈ +0.53 …a medium effect • Bare bones effect size adjusted for measurement error: d ≈ +0.70 …a stronger effect • Adjustments have yet to be made for other attenuations; the estimated d is greater than +0.70
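The slide does not spell out either calculation. As a rough sketch, assuming the Hunter-Schmidt usage of the terms, a "bare bones" estimate is the sample-size-weighted mean of the study effect sizes, and the adjustment for measurement error divides d by the square root of the outcome measure's reliability. The function names, the study values, and the reliability below are all hypothetical and are not the presentation's data.

```python
import math


def bare_bones_mean_d(studies):
    """Sample-size-weighted mean effect size.

    `studies` is a list of (d, n) pairs: effect size and total sample size.
    """
    total_n = sum(n for _, n in studies)
    return sum(d * n for d, n in studies) / total_n


def correct_for_attenuation(d, reliability):
    """Disattenuate d for measurement error in the outcome measure.

    Assumption: `reliability` is the reliability coefficient of the
    achievement measure, with 0 < reliability <= 1.
    """
    return d / math.sqrt(reliability)


# Hypothetical studies as (d, n) pairs, invented for illustration.
example_studies = [(0.4, 100), (0.6, 300)]
mean_d = bare_bones_mean_d(example_studies)
print(mean_d)
print(correct_for_attenuation(mean_d, 0.81))
```

Because reliability is at most 1, the correction can only increase the magnitude of d, which matches the direction of the adjustment reported on the slide.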
Which predictors matter? • Testing vs. not testing • Higher stakes vs. lower stakes • More testing vs. less testing • Correlation with accountability index
Moderators – Stronger effects • level of education: university +0.29 over elementary-secondary • scale of test administration: classroom +0.24 over large-scale • study design: experiment +0.22 over multivariate or pre-post studies
Moderators – Medium effects • jurisdiction of test administration: local +0.17 over state/national; state/national +0.18 over international • stakes are for whom?: student +0.11 over school; school +0.16 over teacher • stakes involved: medium +0.14 over low; low +0.08 over high
Moderators – Small effects • location of study: outside USA +0.13 over in USA • provision of feedback: strong effect at classroom level and in experimental studies; no discernible effect in large-scale studies
Surveys and opinion polls
Number and percent of survey items, by respondent group and type of study
Number and percent of survey items, by test stakes and group affected by stakes
Opinion polls, by year • 250 polls between 1958 and 2008, in the U.S. and Canada • 800 unique question-response combinations • close to 700,000 individual respondents
Surveys and opinion polls: Regular standardized tests, performance tests
Qualitative studies: Summary (One cannot calculate an effect size.)
Qualitative studies, by methodology type
Qualitative studies: Effect on student achievement 244 studies conducted in the past century in over 30 countries
Qualitative studies: Testing improves student achievement and teacher instruction
Qualitative studies: Variation by rigor and test stakes
Qualitative studies: Regular standardized tests and performance tests
An enormous research literature • But, assertions that it does not exist at all are common • Some claims are made by those who oppose standardized testing, and may be wishful thinking • Others are “firstness” claims
Dismissive research reviews • With a dismissive research literature review, a researcher assures all that no other researcher has studied the same topic
Firstness claims • With a firstness claim, a researcher insists that he or she is the first to ever study a topic
Social costs are enormous • Research conducted by those without power or celebrity is ignored and lost • Public policies are based exclusively on the research results of those with power or celebrity • Society pays again and again for research that has already been done