
A Comprehensive Assessment System: Tough Choices for the RTT Assessment Competition


Presentation Transcript


  1. A Comprehensive Assessment System: Tough Choices for the RTT Assessment Competition Scott Marion National Center for the Improvement of Educational Assessment Race to the Top Assessment Public and Expert Input Meeting Boston, MA November 12, 2009

  2. Overview of Comments • An explicit theory of action • Purposes and uses • Sound design principles • My proposed design • Access and equity • A note about psychometrics • High schools • Some advice on the proposed RFP/RFA

  3. A Preview of My Vision… • A conceptually coherent, comprehensive assessment system that incorporates explicit curriculum/opportunity-to-learn (OTL) links • End-of-year summative assessments built on well-articulated content and performance standards • Interim performance tasks embedded in mini curricular units • Formative assessment supports/probes • Focused professional development • An actionable reporting system to help reveal student and school strengths and weaknesses

  4. A Theory of Action • Before finalizing the RFP, USED must articulate a clear and explicit theory of action • Describes how particular, clearly stated goals will be achieved as a result of the particular assessment system(s) • Specific mechanisms—how does USED expect we will get from A to B? What is the evidence to support this expectation? • Explicitly describes prioritized design choices, e.g.: • Influencing and shaping teaching and learning, OR • Measuring existing knowledge, OR • Making cross-state comparisons • The theory of action is a check on the logic of the underlying assumptions

  5. Purposes and Uses • The plethora of design requirements in the RTT notice will stress any assessment system, even a comprehensive one • USED must have a firm sense of the likely accountability uses before letting the RFP • Even though Congress will ultimately reauthorize ESEA • Clarity on purposes/uses will serve as an important touchstone during complicated design deliberations—this is where choices are made explicit, for example: • Trying to have BOTH diagnostic information for each child and a common proficiency test for all students can be incompatible (certainly within the same reasonable-length test) • Similarly, growth models could produce more valid information if we measured students over much of the achievement continuum rather than clustering test information around the proficient cutscore (a small sketch follows this slide)
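To make that last point concrete, here is a minimal sketch, not from the presentation, using the standard two-parameter logistic (2PL) item response model; the item parameters are invented. A test whose difficulties cluster at the proficient cutscore is precise only near that score, while one with difficulties spread across the continuum also measures low- and high-achieving students well enough to support growth.

```python
# Minimal illustration (invented item parameters): why clustering test
# information at a proficiency cutscore limits growth measurement.
import math

def p_correct(theta: float, a: float, b: float) -> float:
    """2PL probability of a correct response at ability theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta: float, a: float, b: float) -> float:
    """Fisher information of a 2PL item: a^2 * P * (1 - P)."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

def test_information(theta, items):
    """Sum of item information over a test's (a, b) item parameters."""
    return sum(item_information(theta, a, b) for a, b in items)

# Test A: 20 items whose difficulties cluster at a proficient cut of theta = 0
clustered = [(1.2, 0.0)] * 20
# Test B: 20 items with difficulties spread from theta = -3.0 to +3.3
spread = [(1.2, d / 3.0) for d in range(-9, 11)]

for theta in (-2.0, 0.0, 2.0):
    print(f"theta={theta:+.1f}  "
          f"clustered={test_information(theta, clustered):5.2f}  "
          f"spread={test_information(theta, spread):5.2f}")
# The clustered test dominates at theta = 0 but is nearly uninformative
# two standard deviations away; the spread test is informative throughout.
```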

  6. Overarching Goal ALL students should have meaningful opportunities to develop deep understanding of important content and critical skills, to allow for viable postsecondary choices (e.g., college/work ready) and for becoming contributing members of society. I propose a system that is intended to support this overall goal…

  7. My Prioritized Purposes/Uses • Measuring a limited number of big ideas at deeper levels of understanding to provide students opportunities to develop robust knowledge and skills for use in novel and complex settings • Better integration of curriculum, instruction, and assessment, because we cannot address these challenges with just an “assessment fix” • Measuring student longitudinal growth as a foundation for valid accountability systems and as information for school improvement • Notice that I am limiting myself to two main purposes, because I do not think a system can do more than 2-3 well • Intentionally not focusing on cross-state comparisons…I think my proposed design purposes will help us meet the overall goal better, and the trade-offs are too great to focus on cross-state comparisons

  8. Design Principles: Theoretically Based [Diagram residue: the assessment triangle (cognition, observation, interpretation)] • The RFP must require proposed designs to be based on theoretically sound design models; two examples include: • Evidence-centered design (ECD; Mislevy, 1994, 1996) • Student model—exactly what do you want students to know, and how (well) do we want them to know it? • Evidence model—what will you accept as evidence that the student has the desired knowledge? • Task model—what tasks will students perform to demonstrate/communicate their knowledge? • Knowing What Students Know (Pellegrino et al., 2001)
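As one way to see how ECD's three models fit together, here is a minimal sketch of them as plain data structures. This is my own hypothetical rendering for illustration, not Mislevy's formalism; all class and field names are invented.

```python
# Hypothetical data-structure rendering of ECD's three linked models.
from dataclasses import dataclass, field

@dataclass
class StudentModel:
    """What students should know, and how well."""
    claim: str        # e.g., "can reason proportionally"
    depth: str        # e.g., "applies the idea in novel contexts"

@dataclass
class EvidenceModel:
    """What counts as evidence that the claim is met."""
    observables: list[str]   # behaviors the task should elicit
    scoring_rule: str        # how observations map back to the claim

@dataclass
class TaskModel:
    """What students actually do to produce the evidence."""
    prompt: str
    features: list[str] = field(default_factory=list)  # task design variables

@dataclass
class ECDDesign:
    student: StudentModel
    evidence: EvidenceModel
    task: TaskModel

design = ECDDesign(
    student=StudentModel(claim="interprets linear rates of change",
                         depth="explains and applies in a new context"),
    evidence=EvidenceModel(observables=["correct rate computed",
                                        "justification references slope"],
                           scoring_rule="2-point rubric per observable"),
    task=TaskModel(prompt="Compare two cell-phone plans and recommend one.",
                   features=["real-world context", "open-ended response"]),
)
```

The point of the structure is the dependency: the task exists only to generate the observables, and the observables exist only to support the claim.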

  9. My Vision… • A conceptually coherent, comprehensive assessment system that incorporates explicit curricular connections • End-of-year summative assessments built on well-articulated content and performance standards • Interim performance tasks embedded in mini curricular units • Formative assessment supports/prompts • Focused professional development • Actionable reporting system to help reveal student and school strengths and weaknesses • This proposal is designed to build a coherent system that bridges curriculum, multiple forms of assessment, and supports for instruction

  10. Design reporting systems up front • Too often the reports are simply an add-on • Must be conceived as a system of reports • Different purposes, users, and levels of information • Must be actionable—leads to appropriate inferences, decisions, and instructional/programmatic actions • See http://www.schoolview.org/ for a terrific example of what’s possible • Should support the theory of action

  11. The Curricular Units • Approximately 2-6 of these units throughout the year, varied by grade level (can phase in) • The units could be as short as a few days or as long as a couple of weeks • Each unit is focused on a “big idea” of the domain • Can be strategically used within existing curricula (e.g., perhaps at the end of a longer unit of study) • Serves as the basis for performance tasks and as a context for summative assessment • Includes training materials and supports for implementing formative assessment and progress-monitoring strategies within each unit • Flexible enough to use each year with new/comparable contexts: a different science experiment or grade-level text, but assessing the same concepts • Provides a vehicle for structuring equitable OTL and access for all

  12. Summative Assessment • Serves as the foundation for growth measurement • Some of the content and specific examples will come from curricular units (so we can measure more than “general” knowledge) • Should be administered toward the end of the school year • Should include rich representation of knowledge and skills (i.e., plenty of open-ended tasks) • Why the obsession with “instant” results? • Who needs the results, in what form, and by when? • What will be done with these results? • Remember, our current accountability schedule has driven this “need” for a rapid turnaround • Everything comes with a cost!

  13. Interim performance tasks • These rich and engaging tasks are the foundation of this system • Contextualized within the curricular units • Scored locally and incorporated within local assessment and grading (graduation) systems • Local scoring audited (e.g., Kentucky’s portfolios) so results can be used in state accountability systems (a sketch of such an audit follows this slide) • School level for K-8 • School and individual (e.g., graduation) levels for high school • Tasks should be designed using ECD principles to reveal students’ need for additional support • Most tasks should be released each year
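Below is a minimal sketch of the kind of audit the slide mentions: the state rescores a random sample of locally scored tasks and accepts the local results only if agreement clears a threshold. This is loosely in the spirit of Kentucky's portfolio audits, not a reproduction of them; the function names, sample fraction, threshold, and data are all hypothetical.

```python
# Hypothetical audit of locally scored performance tasks: rescore a random
# sample at the state level and check exact-agreement before accepting
# local scores for accountability use.
import random

def audit(local_scores: dict[str, int],
          state_rescore,              # callable: task_id -> state's score
          sample_frac: float = 0.1,
          min_agreement: float = 0.8) -> bool:
    """Return True if local scoring passes the agreement threshold."""
    n = max(1, int(sample_frac * len(local_scores)))
    sample = random.sample(sorted(local_scores), n)
    exact = sum(local_scores[t] == state_rescore(t) for t in sample)
    return exact / len(sample) >= min_agreement

# Simulated data: 200 tasks scored locally on a 0-4 rubric, and a state
# rescorer that disagrees (by one point, downward) about 10% of the time.
local = {f"task{i}": random.randint(0, 4) for i in range(200)}

def rescorer(task_id: str) -> int:
    s = local[task_id]
    return s if random.random() < 0.9 else max(0, s - 1)

print("local scoring accepted:", audit(local, rescorer))
```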

  14. Formative assessment • The curricular units and associated materials should be designed to facilitate formative assessment probes and processes • Professional development provided to increase teachers’ capacity for implementing and using formative assessment to improve instruction • Formative assessment training and strategies should include a focus on helping all students achieve expectations • Maintain a clear separation between formative assessment and district/state accountability systems • Attaching stakes changes (corrupts) everything

  15. Opportunity, Access, and Equity • I argue that we have much more of an instruction (OTL) problem than an assessment problem • Assessment can’t make up for lack of OTL • The proposed curricular units are designed to help level the curriculum and instruction playing field • Provide supports for teachers to help them ensure that all students access the knowledge and skills • Build formative assessment capacity and use so students don’t fall so far behind • Design tasks with multiple and varied opportunities for students to validly participate in the assessment system • Finally, assessment guidelines need to focus first on fair access and less on narrow definitions of comparability • Capitalize on tremendous advances in innovative technological approaches for access and accommodations

  16. A “New” Psychometrics • A system such as the one I’m proposing will require some serious re-examination of our current psychometric practices • We’ve traded a lot (of validity) in the past for student-level reliability, pretty scales, and overly strict notions of comparability • Yes, we will have serious equating challenges (a toy equating sketch follows this slide) • The foundations for “new” approaches have been established (e.g., Linn, Baker, & Dunbar, 1991; Mislevy, 1994; Pellegrino et al., 2001), but they still need more attention to work in large-scale, efficient practice • The RFP should push for requirements and expectations beyond the current “safe” methods
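For readers unfamiliar with equating, here is a toy sketch of linear equating, one of the traditional methods whose limits the slide alludes to: scores on Form X are mapped onto Form Y's scale by matching means and standard deviations. The score data are invented; rich performance tasks strain even this simple machinery.

```python
# Toy linear equating (invented data): express a Form X score on Form Y's
# scale by matching the two forms' means and standard deviations.
from statistics import mean, stdev

def linear_equate(x_scores: list[float], y_scores: list[float]):
    """Return a function mapping a Form X score to the Form Y scale."""
    mx, sx = mean(x_scores), stdev(x_scores)
    my, sy = mean(y_scores), stdev(y_scores)
    return lambda x: my + (sy / sx) * (x - mx)

# Invented example: Form X ran easier (higher mean) than Form Y, so raw
# scores on the two forms are not directly comparable.
form_x = [52, 60, 64, 70, 75, 81, 88]
form_y = [45, 55, 58, 66, 70, 77, 85]
to_y_scale = linear_equate(form_x, form_y)
print(round(to_y_scale(70), 1))   # a 70 on Form X, on Form Y's scale
```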

  17. High Schools • The assessment system should be situated in specific “indicator” or core courses up to some point (e.g., 10th grade) • After this point, there should be more choice in the assessment (and accountability) system to allow for specialization and choice by students • Interim performance tasks can be used as part of a student accountability system like Wyoming’s or Rhode Island’s graduation systems

  18. Some advice on RFA/RFP • Development is an ONGOING cost, not a one-time purchase! • Recognize and embrace the differences between high schools and elementary schools • Determine the absolutely essential pieces and then examine costs for additional components • Reconsider the current practice of having every student tested on every item • Matrix sampling is still a viable approach (see the sketch after this slide) • Allow for multiple awards • Nobody has the “right” answer, and even if they think they do, it won’t be “right” in all contexts • Especially true in high school • According to Rich Hill, a good RFP is: • Exceptionally clear on goals • Flexible on specific means, unless you are absolutely clear on what you want • Think about a phase-in over the next 5 years • Recognize critical operational and bureaucratic constraints • Existing contracts, state laws, procurement rules
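A minimal sketch of matrix sampling, assuming a simple random assignment of item blocks to students; the block count and data sizes are invented for illustration. Each student takes only a fraction of the item pool, yet the school's results cover the whole pool.

```python
# Hypothetical matrix-sampling assignment: split the item pool into blocks
# and give each student one block, so the pool is covered at the school
# level without testing every student on every item.
import random

def assign_blocks(student_ids: list[str], item_ids: list[str], n_blocks: int):
    """Randomly assign each student one of n_blocks item blocks."""
    items = item_ids[:]
    random.shuffle(items)
    blocks = [items[i::n_blocks] for i in range(n_blocks)]
    order = random.sample(student_ids, len(student_ids))
    return {s: blocks[i % n_blocks] for i, s in enumerate(order)}

students = [f"s{i}" for i in range(30)]
pool = [f"item{j}" for j in range(60)]
assignment = assign_blocks(students, pool, n_blocks=4)

print(len(assignment["s0"]))   # each student sees 15 of the 60 items
print(len({i for block in assignment.values() for i in block}))  # all 60 covered
```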

  19. For more information Formal comments will be submitted by December 2, 2009 and will be available on request: smarion@nciea.org www.nciea.org
