Validity Issues for Accountability Systems

Eva L. Baker
UCLA Graduate School of Education & Information Studies
Center for the Study of Evaluation
National Center for Research on Evaluation, Standards, and Student Testing
AERA 50.09 April 2002

Theory of Action for Accountability Systems • Accurate data and reports • Valid interpretations • Willingness to act • Alternative actions available • Requisite knowledge • Action implemented • Action will improve subsequent results

Purposes and Uses • Assignment into summer school • High school diploma • Awards for teachers • Parents allowed to transfer students • Special assistance to schools • Accreditation of schools

Validity • “… the degree to which evidence and theory support the interpretation of test scores entailed by the proposed use of tests” • “… validity is, therefore, the most fundamental consideration in developing and evaluating tests” • Standards for Educational and Psychological Testing (AERA, APA, NCME, 1999, p. 9)

Areas of Validity Discussion • Assessment purposes • Test specification and representation • Special issues with high stakes • Student and school classification errors • Multiple ways of demonstrating knowledge • Multiple occasions • Student characteristics

Strong Forms of Validity Will Not Be Preconditions for Use • Which is most important? • Instructional sensitivity—test is sensitive to growth substantially due to instruction

Improving Accountability • Guidance and targets • CRESST/CPRE Standards • CCSSO (Gong); organization reports • AERA/APA/NCME Test Standards, Code of Fair Testing…, Responsible Test Use (Eyde et al.), NRC reports (Elmore & Rothman), Heubert and Hauser • Models (ECS, CRESST, ACHIEVE) • Bar setting

Quality Accountability System Standards
5. The weighting of elements in the system, different test content, and different information sources should be made explicit.
• The validity of measures that have been administered as part of an accountability system should be documented for the various purposes of the system.
• If tests are to help improve system performance, there should be information provided to document that test results are modifiable by quality instruction and student effort.
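
As an illustration of what "made explicit" could look like for element weights, here is a minimal Python sketch. It assumes a simple weighted-sum composite on a common 0 to 100 scale; the element names, scores, and weights are hypothetical and are not taken from any actual accountability system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """One component of a hypothetical accountability index."""
    name: str
    score: float   # assumed to be on a common 0-100 scale
    weight: float  # published share of the composite


def composite_index(elements):
    """Weighted sum of element scores; refuses weights that do not sum to 1."""
    total_weight = sum(e.weight for e in elements)
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError(f"weights sum to {total_weight}, expected 1.0")
    return sum(e.score * e.weight for e in elements)


def weight_report(elements):
    """The explicit statement of weights that would accompany the index."""
    return {e.name: e.weight for e in elements}


# Hypothetical school-level data, for illustration only.
elements = [
    Element("reading_test", 70.0, 0.4),
    Element("math_test", 60.0, 0.4),
    Element("attendance", 90.0, 0.2),
]
index = composite_index(elements)
report = weight_report(elements)
```

Publishing the output of `weight_report` next to the index is one way to let readers see how much each information source contributes; other aggregation rules (for example, conjunctive rules) would need a different sketch.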

Quality Accountability System Standards (Cont.)
• If test data are used as a basis of rewards or sanctions, evidence of technical quality of the measures and error rates associated with misclassification of individuals or institutions should be published.
• Evidence of test validity for students with different language backgrounds should be made publicly available.
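
To make "error rates associated with misclassification" concrete, the following Python sketch estimates how likely a single student is to land on the wrong side of a cut score. It assumes a classical model in which the observed score equals the true score plus normally distributed error with standard deviation equal to the standard error of measurement (SEM); the function names and the numbers in the example are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def misclassification_probability(true_score, cut_score, sem):
    """Probability that the observed score falls on the opposite side of
    the cut from the true score, assuming observed = true + N(0, sem^2)."""
    if sem <= 0:
        raise ValueError("sem must be positive")
    # Probability that the observed score is below the cut.
    p_observed_below = normal_cdf((cut_score - true_score) / sem)
    if true_score >= cut_score:
        # Truly at or above the cut: the error is scoring below it.
        return p_observed_below
    # Truly below the cut: the error is scoring at or above it.
    return 1.0 - p_observed_below


# Hypothetical values: cut score 50, SEM 5.
p_far = misclassification_probability(60.0, 50.0, 5.0)   # two SEMs above the cut
p_near = misclassification_probability(51.0, 50.0, 5.0)  # just above the cut
```

Under these assumptions, students whose true scores sit close to the cut are misclassified far more often than those well away from it, which is why a published error rate depends on where the bar is set.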

Evaluation
• Longitudinal studies should be planned, implemented, and reported evaluating effects of the accountability program. Minimally, questions should determine the degree to which the system
  • builds capacity of staff;
  • affects resource allocation;
  • supports high-quality instruction;
  • promotes student equity access to education;
  • minimizes corruption;
  • affects teacher quality, recruitment, and retention; and
  • produces unanticipated outcomes.

Evaluation (Cont.)
• The validity of test-based inferences should be subject to ongoing evaluation. In particular, evaluation should address
  • aggregate gains in performance over time; and
  • impact on identifiable student and personnel groups.
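
A rough Python sketch of how aggregate gains might be tracked separately for identifiable groups is shown below. It assumes records of the form (group, year, score) and defines gain as the change in the group's mean score between its first and last year on record; the group labels and scores are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def mean_gain_by_group(records):
    """Change in mean score from each group's earliest to latest year.

    records: iterable of (group, year, score) tuples.
    Returns {group: last_year_mean - first_year_mean}.
    """
    scores = defaultdict(lambda: defaultdict(list))
    for group, year, score in records:
        scores[group][year].append(score)

    gains = {}
    for group, by_year in scores.items():
        first_year, last_year = min(by_year), max(by_year)
        gains[group] = mean(by_year[last_year]) - mean(by_year[first_year])
    return gains


# Hypothetical records, for illustration only.
records = [
    ("A", 2001, 50.0), ("A", 2001, 60.0),
    ("A", 2002, 58.0), ("A", 2002, 62.0),
    ("B", 2001, 40.0), ("B", 2002, 41.0),
]
gains = mean_gain_by_group(records)
```

Comparing the per-group gains side by side is one simple way to see whether an overall improvement is shared across groups. The sketch ignores changes in who is tested from year to year, which a real longitudinal study would need to address.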
