The Relationship Between State High School Exit Exams and Mathematical Proficiency: Analyses of the Complexity, Content, and Format of Items and Assessment Protocols By Blake Regan
Overview
• Problem statement and motivation for the study
• Conceptual model
• Research questions
• Analyses
• Participating states and results
• Discussion
• Limitations
• Recommendations
• Implications
• Conclusion
Problem Statement
• Pressures of the No Child Left Behind Act of 2001
  • Mathematics and reading assessments yearly in Grades 3 through 8 and once in Grades 10 through 12
  • Schools are financially rewarded or penalized based on test results
• Teaching to the test
  • Teachers are pressured to push content-focused learning aside in order to prepare students for tests
• Students who have been classified as proficient by high school exit exams are failing to be college- and career-ready
(Boyd, 2008; Bunch, 2004; Kupermintz, 2001; Lutzer, Rodi, Kirkman, & Maxwell, 2007; NCLB, 2002)
Motivation for Study
To determine whether the high-stakes assessments used across the nation to certify high school students as proficient in mathematics, and that serve as a gateway to graduation, encourage teachers to teach in a manner likely to result in genuine development of proficiency across the full range of desirable mathematical behaviors.
Research Questions
• Analysis A (Student Scores)
  • Which complexity level of items best predicts student success on high school exit exams?
  • Which content strand addressed by the items best predicts student success on high school exit exams?
  • Which item format best predicts student success on high school exit exams?
• Analysis B (Assessment Protocols)
  • Are states’ high school exit exams and cut scores aligned with their respective definitions of mathematical proficiency?
  • To what extent, if at all, do state high school exit exams encourage mathematical proficiency as defined by the NRC (2001)?
Analysis A
• Binary logistic regression
  • “used to analyze data in studies where the outcome variable is binary or dichotomous” (Warner, 2008, p. 931)
• Students’ proficiency classification was used as the outcome variable
  • Proficient or better was scored 1
  • Below proficient was scored 0
• Students’ sub-scores on different types of items were used as the predictor variables
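The modeling setup described above can be sketched in a few lines of Python. This is an illustrative example only, not the study’s data or code: the item-type predictors, score ranges, and cut score below are all hypothetical, and the outcome is simulated from a made-up passing rule.

```python
# Sketch of Analysis A's binary logistic regression, using simulated data.
# Outcome: 1 = proficient or better, 0 = below proficient.
# Predictors: a student's sub-scores on each item type (hypothetical
# complexity levels and point totals; none of this is from the study).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000
low = rng.integers(0, 11, n)       # points earned on low-complexity items
moderate = rng.integers(0, 11, n)  # points earned on moderate-complexity items
high = rng.integers(0, 6, n)       # points earned on high-complexity items
X = np.column_stack([low, moderate, high])

# Simulated classification: 1 if the total meets a hypothetical cut score.
y = ((low + moderate + high) >= 13).astype(int)

model = LogisticRegression().fit(X, y)
coefs = model.coef_[0]          # one coefficient per item-type predictor
accuracy = model.score(X, y)    # proportion of students classified correctly
print(coefs, accuracy)
```

Comparing which predictor (item type) carries the most weight, or which model improves most over the null (intercept-only) model, is the kind of comparison the research questions above call for.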
Exam 1
• 68,784 student samples
• Cut score was 30 of 60 possible points
• Complexity level of items was not provided
  • Researcher assembled a group of volunteers: mathematicians, a scientist, businessmen, parents, teachers, a principal, and a superintendent (15 participants in total)
  • Volunteers were e-mailed definitions of each complexity level and an electronic version of the exam
  • A phone conference was held to classify each item; a unanimous decision was required before an item was classified
Exam 2a
• One of two assessments used by the state to assess students’ achievement of mathematical proficiency
• Administered in the eleventh grade
• 62,043 student samples
• Cut score was 40 of 65 possible points
• Complexity level of each item was provided
Exam 2b
• Second of the two assessments used by the state to assess students’ achievement of mathematical proficiency
• Students who are not classified as proficient by Exam 2a must be classified as proficient by Exam 2b to qualify for graduation
• Administered in the eleventh grade
• 62,043 student samples
• Cut score was 28 of 40 possible points
Exam 3
• Administered in the tenth grade
• 65,535 student samples
• Cut score was 17 of 46 possible points
• Complexity level of items was provided
Summary of Results (p. 136)
• Summary of results for Exams 1, 2a, 2b, and 3
Discussion (State 1)
• Exam 1
  • Requires students to earn at least one point in every content strand and a minimum of four points from moderate- and high-complexity items
  • Recommend increasing the total number of items so the categorical-concurrence requirement can be met
  • Appropriately assesses students’ achievement of mathematical proficiency
Discussion (State 2)
• Exam 2a
  • Requires students to earn at least one point in every content strand and a minimum of 17 points from moderate- and high-complexity items
  • Appropriately assesses students’ achievement of mathematical proficiency
• Exam 2b
  • Fails to require students to correctly answer an item from each content strand
  • Inappropriately assesses students’ achievement of mathematical proficiency
Discussion (State 3)
• Exam 3
  • Fails to require students to earn one point in each content strand, as well as a point from either moderate- or high-complexity items
  • Inappropriately assesses students’ achievement of mathematical proficiency
  • Recommend setting the cut score at 23, half of the total possible points
  • Moderate-complexity items
  • High-complexity and extended-response items have more predictive power than the null model
  • Depth-of-knowledge consistency
Discussion (Overall)
• No single item type was the best predictor of student classification across all the exams analyzed
  • Greatest amount of variation was among complexity levels
  • Least amount of variation was among the content strands addressed
• States balanced the weight and power of the content strands
  • They need to be just as vigilant with complexity level
• Range-of-knowledge correspondence
• Categorical concurrence of each content strand separately, and of low-complexity and moderate-complexity items
  • Not one exam met the categorical-concurrence requirement for each complexity-by-content-strand category
Recommendations
• Other assessments of mathematical proficiency
• Correlation between these findings and teachers’ attitudes toward these assessments
• Correlation between these findings and teachers’ techniques and practices for preparing students for the exams
• Assessments prior to NCLB
• Assessments in other content areas
Implications
• Influence on teachers
• Cut scores
• Assessments for the Common Core State Standards
• Meeting the NCLB deadline of 2014
• Relationship between Exam 1 and State 1’s rank according to NAEP 2009
Conclusion
To meet the challenges set forth by NCLB and an ever-expanding technological world, and, most importantly, to support the success of U.S. students, it is imperative that exams that propose to assess student achievement be designed appropriately and evaluated critically.
References
Boyd, B. T. (2008). Effects of state tests on classroom test items in mathematics. School Science and Mathematics, 108(6), 251–262.
Bunch, M. B. (2004). Ohio Graduation Tests standard setting report: Reading and mathematics (T. Moore, Ed.). Columbus, OH: Ohio Department of Education.
Kopko, E. (2009). State SAT scores 2009. Retrieved from Best and Worst States: http://blog.bestandworststates.com/2009/08/25/state-sat-scores-2009.aspx
Kupermintz, H. S. (2001). Teacher effects as a measure of teacher effectiveness: Construct validity considerations in TVAAS (Tennessee Value Added Assessment System). Paper presented at the annual meeting of the National Council on Measurement in Education, Seattle, WA.
Lutzer, D. J., Rodi, S. B., Kirkman, E. E., & Maxwell, J. W. (2007). Statistical abstract of undergraduate programs in the mathematical sciences in the United States: Fall 2005 CBMS survey. Providence, RI: American Mathematical Society.
National Center for Education Statistics. (2010). The nation’s report card: Grade 12 reading and mathematics 2009 national and pilot state results. Washington, DC: U.S. Department of Education. Retrieved from http://nces.ed.gov/nationsreportcard/pdf/main2009/2011455.pdf
No Child Left Behind (NCLB) Act of 2001, Pub. L. No. 107-110, § 115 Stat. 1425 (2002).
Warner, R. M. (2008). Applied statistics: From bivariate through multivariate techniques. Thousand Oaks, CA: Sage.