Designing a Statewide System for Measuring Teacher and Leader Effectiveness


Presentation Transcript


  1. Designing a Statewide System for Measuring Teacher and Leader Effectiveness Wyoming Accountability Advisory Committee Scott Marion & Chris Domaleski Center for Assessment June 14, 2012

  2. Overview of presentation… • Some background • Outline key decisions for creating educator evaluation systems • Our purpose today is to highlight some of the key decisions we will need to make through the interim • We’ll be asking far more questions than we answer, but we will need to answer these questions in order to move forward… • A process note: given the number of people on the WebEx/call, I will pause at specific places in the presentation to respond to questions. Center for Assessment. WY Accountability Advisory Committee (6/14/12)

  3. Introduction • Wyoming, like an increasing number of states, intends to revise its teacher and leader evaluation practices • Educator effectiveness will be determined “in part by student achievement” • This enterprise holds great promise, but also presents real challenges • We are fortunate to be able to build off of the work in many other states. We are closely involved in: • CO, RI, NH, GA, PA, UT, NYC, HI, LA Center for Assessment. WY Accountability Advisory Committee (6/14/12)

  4. Rationale • Why the interest in new forms of teacher evaluation? • Nobody doubts the critical influence of teacher quality on student achievement • Current (traditional) evaluation systems rarely identify either highly effective or ineffective teachers Center for Assessment. WY Accountability Advisory Committee (6/14/12)

  5. Key Decisions Center for Assessment. WY Accountability Advisory Committee (6/14/12) • From Aspen Report and our experience: • Vision and Goals • State-Local Roles and Responsibilities • Theory of Action • General Evaluation Model • Coherence • Specific Measurement Model(s) • Attribution rules • Combining multiple measures • Information Requirements • Capacity Requirements • Reporting & Communication • Consequences & Support • Monitoring and Evaluation

  6. Goals and key principles Center for Assessment. WY Accountability Advisory Committee (6/14/12) What is the vision and what are the guiding principles of the system we will design? For example, will the system be designed to identify and “counsel out” low-quality educators, or is it designed primarily to improve the performance of the majority of educators?

  7. Excerpt from NH’s draft system Center for Assessment. WY Accountability Advisory Committee (6/14/12) • The primary purpose of the system is to maximize student learning • The system is designed to maximize educator development by providing specific information, including appropriate formative information that can be used to improve teaching quality. • Local instantiations of the State Model system must be designed collaboratively among teachers, leaders, and other key stakeholders such as parents and students as appropriate. Individual educators will have input into the specific nature of their evaluation and considerable involvement in the establishment of their specific goals. • The effectiveness rating of each educator shall be based on multiple measures of teaching practice and student outcomes, including multiple years of data when available, especially for measures of student learning. • The Model system is designed to ensure that the framework, methods, and tools lead to a coherent system that is also coherent with the developing NH Leader Evaluation System. • The Model system shall be applied by well-trained leaders and evaluation teams using the multiple sources of evidence along with professional judgment to arrive at an overall evaluation for each educator.

  8. Major policy decisions Center for Assessment. WY Accountability Advisory Committee (6/14/12) • What will be the “reach” of the state in defining local systems? • What factors must be considered in this decision? • Comparability/portability vs. flexibility • Support and capacity building • Oversight and monitoring • Required framework, “State Model,” or state-required system • We are proceeding here on the assumption that there will at least be a state-required framework.

  9. A Theory of Action… • Grounds our design • Clarifies the assumptions, purposes, and goals of the system • Specifies the various indicators and mechanisms by which the system will fulfill its purposes (and minimize unintended negative consequences) • Serves as a framework for evaluation • The ToA on the following slide is oversimplified and somewhat naïve, but it is what is driving much of the policy. We’ll be working with more complex and honest ToAs as we do our work. Center for Assessment. WY Accountability Advisory Committee (6/14/12)

  10. A Simplified Theory of Action for Reformed Educator Evaluation Systems [diagram: measures of educator effectiveness and evaluation processes inform hiring, placement, professional development, compensation, dismissal, and career-ladder decisions, which in turn improve student outcomes] Center for Assessment. WY Accountability Advisory Committee (6/14/12)

  11. Basic Structure of a Theory of Action [diagram: assumptions or antecedents → activities and mechanisms → proximal indicators → activities and mechanisms → intermediate indicators → distal indicators (intended outcomes), with consequences arising along the chain] Center for Assessment. WY Accountability Advisory Committee (6/14/12)

  12. Theory of Action Center for Assessment. WY Accountability Advisory Committee (6/14/12) Let’s look at a more reasonable approximation for an improvement-based educator evaluation system

  13. Simple ToA for an “improvement” system [diagram: the educator evaluation system focuses educators’ attention on productive practices and student performance is well measured; evaluation results are used to improve instruction; student learning improves] Center for Assessment. WY Accountability Advisory Committee (6/14/12)

  14. Thinking Through a Theory of Action • Policy makers should have to say, very explicitly, why and how implementing test-based approaches to support educator effectiveness for these grades and subjects will lead to improved educational opportunities for students • For example, one might postulate that holding teachers accountable for increases in student test scores on classroom-based assessments will lead to the development of both better assessments and improvements in student learning. • What are the specific mechanism(s) by which the intended outcomes will occur? • E.g., targeted instruction, better PD, and/or more appropriate curricular materials? Center for Assessment. WY Accountability Advisory Committee (6/14/12)

  15. The General Evaluation Model Center for Assessment. WY Accountability Advisory Committee (6/14/12) • What will be the major components of our system? • Measures of teacher practice • Measures of student performance • Student voice? • Peer input? • Other? • How will these be combined and weighted? • How will these classes of indicators be integrated to form a coherent picture?

  16. Coherence Center for Assessment. WY Accountability Advisory Committee (6/14/12) • Involves ensuring that the school accountability and educator accountability systems are sending similar messages to schools and stakeholders • It would make sense to use data from the school accountability system to augment information from the educator system • Further, it would also make sense to integrate the various components of the educator evaluation system to avoid a silo effect

  17. Specific Measurement Model Center for Assessment. WY Accountability Advisory Committee (6/14/12) The following slides present some of the key decisions related to the measurement model that will need to be made as we proceed. As you know, the “devil is in the details,” and there are many details with which to contend. This is even more complicated when trying to reconcile and be clear about the state role.

  18. Measures of Educator Practice Center for Assessment. WY Accountability Advisory Committee (6/14/12) • What are the indicators that operationalize the knowledge & skills that define educator practice? For example, domains from Danielson’s Framework for Teaching include: • Planning and Preparation • The Classroom Environment • Instruction • Professional Responsibilities • Should these be the default “standards of professional practice,” or should WY adopt more general standards (e.g., ISLLC, NC, CO) or leave it up to districts?

  19. Measures of Educator Practice Center for Assessment. WY Accountability Advisory Committee (6/14/12) • Whatever standards are selected/developed, how shall they be measured? • Classroom observations? • Document (artifact) analysis? • Structured interviews? • Professional portfolios? • What about required data collection strategies and protocols (e.g., 4 observations/year)? • What are the expected levels of performance on the various indicators? • What about observer training and certification?

  20. Student Performance Measures and Analytics • What indicators of student growth should be used for PAWS grades and content areas? • What performance (growth) indicators should be used for non-PAWS grades and content areas? • This is a huge issue! • Should state-level measures of student growth be combined with local measures of student performance for each educator determination? If so, how? Center for Assessment. WY Accountability Advisory Committee (6/14/12)

  21. Student Performance: Analyzing Growth • What analytic approach (model) will be used for analyzing State test data? • What are the technical and policy issues that need to be considered in choosing a model? • What are the advantages/disadvantages of using SGPs for educator evaluation? • What is the standard for ‘good enough’ growth? • Should growth expectations be “conditioned” on factors other than prior performance such as poverty, etc.? • What information should be reported to whom and at what level? Center for Assessment. WY Accountability Advisory Committee (6/14/12)

  22. Mapping educators to standards, assessments & growth (Lee, 2010, based on preliminary data from MA DOE) Center for Assessment. WY Accountability Advisory Committee (6/14/12)

  23. The Non-Tested Challenge Center for Assessment. WY Accountability Advisory Committee (6/14/12) • Lack of high quality measures of student performance, particularly for the purposes for which they are being used • Limitations of analytical options for calculating educator contributions to student performance • Comparability concerns • Lack of technical capacity at the local and even state levels • Lack of predictable course sequences • Not enough time • Not enough money • Too much policy pressure (e.g., 50%) • Huge risk of corruption • Challenging issues of attribution • Many of these are challenges for tested as well as non-tested, but may be exacerbated for non-tested subjects and grades

  24. All Educators in NTSG are Not the Same Center for Assessment. WY Accountability Advisory Committee (6/14/12) • Instead of dealing with each individual case, it makes sense to create an approach for addressing categories of educators • The general categorization can occur at the state level and should be fine-tuned at the district or even school level • One classification approach is based on the data available for the various groups of educators • The following excerpt of a chart, created for Colorado, provides examples of the nominal types of educators that would fall into the different data categories

  25. Comparability • What do we mean by comparability in this context? • Educators within the units of analysis are held to similar levels of expectations, at least in some relative sense • For example, it would be a threat to the system if the teachers in grades 4-8 reading and math received noticeably lower ratings than the rest of the teachers (NTSG) in the school • At what levels is comparability important? • Within schools? Clearly yes. • Within districts? Probably yes. • Within states? It would be nice, but it might be too high a bar right now. Center for Assessment. WY Accountability Advisory Committee (6/14/12)
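As a concrete illustration of the within-school comparability check described on the slide above, the sketch below flags schools where mean ratings for tested-grade teachers diverge noticeably from those of their NTSG colleagues. The data, column names, and 0.5-point threshold are all invented for illustration; they are not part of any proposed system.

```python
# Sketch of a within-school comparability check: flag schools where
# teachers in tested grades/subjects receive noticeably different mean
# ratings than their NTSG colleagues. Data and threshold are invented.
import pandas as pd

ratings = pd.DataFrame({
    "school": ["A", "A", "A", "A", "B", "B", "B", "B"],
    "tested": [True, True, False, False, True, True, False, False],
    "rating": [2.1, 2.3, 3.4, 3.6, 3.0, 3.1, 3.2, 2.9],
})

by_group = ratings.groupby(["school", "tested"])["rating"].mean().unstack()
by_group["gap"] = by_group[False] - by_group[True]  # NTSG mean minus tested mean
flagged = by_group[by_group["gap"].abs() > 0.5]     # illustrative threshold
print(flagged)  # school A shows a 1.3-point gap and would be flagged
```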

  26. What Measurement Approaches Are Being Proposed? • 1. Norm-referenced tests (NRTs) • 2. Commercial interim assessments • 3. State- or district-created end-of-course exams (both externally and locally developed) • Includes new assessment development in places like DE, CO, and Hillsborough, FL • 4. School- or teacher-developed measures of student performance • Often includes Student Learning Objectives *Note: 1 & 2 rarely cover courses beyond the core content areas and even then, not well in HS. Center for Assessment. WY Accountability Advisory Committee (6/14/12)

  27. Analytic Approaches Center for Assessment. WY Accountability Advisory Committee (6/14/12) If you thought the measurement/assessment issue was daunting…. It pales in comparison to the analytic challenges (i.e., how growth is calculated at local levels) Remember, using the most sophisticated VAM models with high quality state test data has been rightfully questioned based on challenges with causal inferences, unreliability (year-to-year), and other technical issues (e.g., EPI report, Braun, et al., 2010, Rothstein, 2009 & 2010)

  28. What Approaches Are Being Proposed for NTSG? Center for Assessment. WY Accountability Advisory Committee (6/14/12) • Growth models using pre- and post-tests from the same subject • Value-added models • Pre- and post-test scores in the same subject • Conditioned on data other than a pretest from the same content area as the posttest • Student Growth Percentiles • Shared attribution of aggregate growth/VAM results • Student learning objectives (SLO)

  29. Definitions • Growth refers to measures of performance for the same students at two or more points in time and requires a common, often vertical, scale to evaluate the magnitude of change. This is the only true growth model listed here. • VAM: generally describes multivariate models that include certain variables to produce an expectation against which actual performance is evaluated. • Student Growth Percentiles (SGP): a regression-based measure of growth that works by evaluating current achievement based on prior achievement and describing performance (using percentiles) relative to other students with the “same” prior achievement histories. • Student Learning Objectives (SLO): a general approach (often called Student Growth Objectives) whereby educators establish goals for individual or groups of students (often in conjunction with administrators) and then evaluate the extent to which the goals have been achieved. Center for Assessment. WY Accountability Advisory Committee (6/14/12)
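The SGP definition above lends itself to a small illustration. The sketch below approximates the idea by binning students on prior scores and percentile-ranking current scores within each bin. Operational SGPs (e.g., Betebenner's methodology) instead use quantile regression over multiple years of prior scores, so this is only a conceptual sketch, and the column names are hypothetical.

```python
# Simplified illustration of the Student Growth Percentile idea:
# rank each student's current score among peers with similar prior
# achievement. Operational SGPs use quantile regression over multiple
# prior years; this binned version conveys only the concept.
import pandas as pd

def simple_sgp(df: pd.DataFrame, n_bins: int = 20) -> pd.Series:
    """Percentile rank (1-99) of current_score within prior-score bins."""
    # Group students into bins of comparable prior achievement.
    prior_bin = pd.qcut(df["prior_score"], q=n_bins, duplicates="drop")
    # Within each bin, rank current scores as percentiles.
    pct = df.groupby(prior_bin, observed=True)["current_score"].rank(pct=True)
    return (pct * 100).clip(1, 99).round()

scores = pd.DataFrame({
    "prior_score":   [410, 455, 430, 500, 480, 415, 520, 470],
    "current_score": [430, 450, 470, 505, 520, 405, 560, 460],
})
scores["sgp"] = simple_sgp(scores, n_bins=2)
print(scores)
```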

  30. Attribution • Attribution: linking educator behavior to student outcomes • Assigning accountability • Multiple educators contribute to instruction • “Contact time” requirements: how long does a student need to be in the teacher’s classroom to count? • Opportunity to employ shared attribution strategies • Must be tied to local theories of action or theories of improvement Center for Assessment. WY Accountability Advisory Committee (6/14/12)
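To make the contact-time and shared-attribution ideas on the slide above concrete, here is a minimal sketch of roster-based attribution rules. The 60% threshold, the column names, and the proportional-weighting rule are all assumptions for illustration, not a prescribed policy.

```python
# Hypothetical sketch of roster-based attribution: a student "counts"
# for a teacher only if enrolled for a minimum share of the course,
# and co-teachers share attribution in proportion to contact time.
import pandas as pd

MIN_CONTACT = 0.60  # illustrative contact-time threshold (60% of course days)

roster = pd.DataFrame({
    "student_id":    [1, 1, 2, 3],
    "teacher_id":    ["T1", "T2", "T1", "T3"],
    "days_enrolled": [160, 20, 90, 175],
    "course_days":   [180, 180, 180, 180],
})

roster["contact_share"] = roster["days_enrolled"] / roster["course_days"]
# Apply the contact-time rule: drop student-teacher links below threshold.
linked = roster[roster["contact_share"] >= MIN_CONTACT].copy()
# Shared attribution: normalize remaining links within each student so
# co-taught students split their weight across teachers.
linked["weight"] = (linked["contact_share"]
                    / linked.groupby("student_id")["contact_share"].transform("sum"))
print(linked[["student_id", "teacher_id", "weight"]])
```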

  31. Combining Multiple Measures • How should we arrive at an overall judgment of educator effectiveness? • Weighting of student performance and knowledge & skills • What are the different types of information that should be employed when evaluating principals compared with teachers? • We know the specific indicators and even standards will differ • Who should be responsible for making these overall judgments? Center for Assessment. WY Accountability Advisory Committee (6/14/12)
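One common way to arrive at the overall judgment the slide above asks about is a weighted composite of the component scores. The sketch below shows only the mechanics; the weights, the 1-4 scale, and the category cut points are placeholders, since those are precisely the policy decisions still to be made (slide 23 notes the policy pressure toward a 50% student-growth weight).

```python
# Illustrative composite rating. Weights, scale, and cut points are
# invented for the sketch; in practice they are policy decisions.
WEIGHTS = {"practice": 0.50, "student_growth": 0.40, "other": 0.10}
CUTS = [(3.5, "Highly Effective"), (2.5, "Effective"),
        (1.5, "Developing"), (0.0, "Ineffective")]

def composite(scores: dict[str, float]) -> tuple[float, str]:
    """Weighted average of component scores (each on a 1-4 scale),
    mapped to an overall effectiveness category."""
    total = sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)
    label = next(name for cut, name in CUTS if total >= cut)
    return round(total, 2), label

print(composite({"practice": 3.2, "student_growth": 2.8, "other": 3.0}))
# -> (3.02, 'Effective')
```

Note that a compensatory composite like this lets a strong practice score offset weak growth results; a conjunctive design (minimum score required on each component) is an equally plausible choice and would answer the slide's question differently.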

  32. Information Requirements Center for Assessment. WY Accountability Advisory Committee (6/14/12) • Data system requirements to link students with teachers at the state level • Data system requirements to manage the data at the local level • Dealing with student mobility • Dealing with missing data, especially non-random missing data • “Full academic year” rules

  33. Capacity Requirements Center for Assessment. WY Accountability Advisory Committee (6/14/12) • How will this be managed at the state level? • Data, information, and analytics • Reporting and communication • Support and capacity building • Training and monitoring • How will this be managed at the local level? • Capacity for implementation • Conducting observations, document analysis, etc. • Induction, mentoring, and support • Training • Record keeping • Reporting and feedback • Decision making and appeals

  34. Reporting & Communication Center for Assessment. WY Accountability Advisory Committee (6/14/12) How will results be communicated to educators to improve practice? How will information about the system be communicated to the public and policy makers while protecting educators?

  35. Consequences & Support • What sanctions, rewards, and/or consequences are appropriate to advance prioritized outcomes? • What strategies will be employed to use information to support schools/teachers/students? • Is there capacity in the state (in the districts) to improve educator quality in WY? • What resources will be required for this improvement to occur? • Where will they come from? Center for Assessment. WY Accountability Advisory Committee (6/14/12)

  36. Negative Consequences • As we consider the design and implementation of WY’s new educator evaluation system, we must be mindful that the likelihood of getting this wrong (i.e., leading to unintended negative consequences) is at least as high as the chance of getting it right (i.e., improving teacher quality and student learning) • Unintended consequences could include: • Narrowing curriculum • Competition vs. cooperation • Assignment of students or teachers to selected classes for reasons unrelated to educational benefit • Educator transition • Educator attrition Center for Assessment. WY Accountability Advisory Committee (6/14/12)

  37. Campbell’s Law • "The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.” (emphases added) http://en.wikipedia.org/wiki/Campbell%27s_Law • Educator accountability systems will invite significantly more implicit and explicit corruption than has been seen with school accountability Center for Assessment. WY Accountability Advisory Committee (6/14/12)

  38. Monitoring and Evaluation • What types of formative evaluation approaches need to be put in place to monitor implementation and consequences? • Evaluate claims in theory of action • Evaluate impact • Establish criteria to determine if results are reasonable • Develop methods and standards to assess the precision and stability of results • Does the system meet important utility criteria? Center for Assessment. WY Accountability Advisory Committee (6/14/12)
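One of the monitoring criteria on the slide above, the precision and stability of results, can be checked empirically by correlating the same teachers' results across years. A minimal sketch with made-up ratings; a low correlation would echo the year-to-year reliability concerns raised on slide 27.

```python
# Sketch of one stability check: year-to-year rank correlation of the
# same teachers' effectiveness results. Ratings here are invented.
import pandas as pd

y1 = pd.Series({"T1": 3.1, "T2": 2.4, "T3": 3.8, "T4": 2.9, "T5": 3.3})
y2 = pd.Series({"T1": 2.8, "T2": 2.6, "T3": 3.5, "T4": 3.4, "T5": 3.0})

# Series.corr aligns teachers by index before computing the correlation.
stability = y1.corr(y2, method="spearman")
print(f"Year-to-year rank correlation: {stability:.2f}")
```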

  39. Next steps… Center for Assessment. WY Accountability Advisory Committee (6/14/12) How should we plan our work going forward? Who’s going to do what? How will we work? Goals for next meeting…
