
Evaluating teacher effectiveness with multiple measures


Presentation Transcript


  1. Evaluating teacher effectiveness with multiple measures Laura Goe, Ph.D. Research Scientist, ETS, and Principal Investigator for the National Comprehensive Center for Teacher Quality Performance Evaluation Advisory Committee Hartford, CT  March 20, 2012

  2. Laura Goe, Ph.D. • Former teacher in rural & urban schools • Special education (7th & 8th grade, Tunica, MS) • Language arts (7th grade, Memphis, TN) • Graduate of UC Berkeley’s Policy, Organizations, Measurement & Evaluation doctoral program • Principal Investigator for the National Comprehensive Center for Teacher Quality • Research Scientist in the Performance Research Group at ETS

  3. The National Comprehensive Center for Teacher Quality • A federally-funded partnership whose mission is to help states carry out the teacher quality mandates of ESEA • Vanderbilt University • Learning Point Associates, an affiliate of American Institutes for Research • Educational Testing Service

  4. Today’s presentation available online • To download a copy of this presentation or look at it on your iPad, smart phone or laptop now, go to www.lauragoe.com • Go to Publications and Presentations page • Today’s presentation is at the bottom of the page

  5. The goal of teacher evaluation

  6. Agenda • What we value in teachers and teaching • The role of teaching standards • An overview of teacher evaluation measures • Observations of practice • Indicators of professional responsibility • Peer, student and parent feedback • Multiple indicators of student learning • An overview of teacher evaluation models • Teacher professional development • Weighting, performance levels, exceptions • Final thoughts

  7. Goe, Bell, & Little (2008) definition of teacher effectiveness • Have high expectations for all students and help students learn, as measured by value-added or alternative measures. • Contribute to positive academic, attitudinal, and social outcomes for students, such as regular attendance, on-time promotion to the next grade, on-time graduation, self-efficacy, and cooperative behavior. • Use diverse resources to plan and structure engaging learning opportunities; monitor student progress formatively, adapting instruction as needed; and evaluate learning using multiple sources of evidence. • Contribute to the development of classrooms and schools that value diversity and civic-mindedness. • Collaborate with other teachers, administrators, parents, and education professionals to ensure student success, particularly the success of students with special needs and those at high risk for failure.

  8. Teaching standards • A set of practices teachers should aspire to • A teaching tool in teacher preparation programs • A guiding document with which to align: • Measurement tools and processes for teacher evaluation, such as classroom observations, surveys, portfolios/evidence binders, student outcomes, etc. • Teacher professional growth opportunities, based on evaluation of performance on standards • A tool for coaching and mentoring teachers: • Teachers analyze and reflect on their strengths and challenges and discuss with consulting teachers

  9. Measures and models: Definitions • Measures are the instruments, assessments, protocols, rubrics, and tools used in determining teacher effectiveness • Models are the state or district systems of teacher evaluation, including all of the inputs and decision points (measures, instruments, processes, training, scoring, etc.) that result in determinations about individual teachers’ effectiveness

  10. Multiple measures of teacher effectiveness • Evidence of growth in student learning and competency • Standardized tests, pre/post tests in untested subjects • Student performance (art, music, etc.) • Curriculum-based tests given in a standardized manner • Classroom-based tests such as DIBELS • Evidence of instructional quality • Classroom observations • Lesson plans, assignments, and student work • Student surveys such as Harvard’s Tripod • Evidence binder (next generation of portfolio) • Evidence of professional responsibility • Administrator/supervisor reports, parent surveys • Teacher reflection and self-reports, records of contributions

  11. Teacher observations: strengths and weaknesses • Strengths • Great for teacher professional growth • If the observation is followed by an opportunity to discuss results • If support is provided for those who need it • Helps evaluators (principals or others) understand teachers’ needs across a school or district • Weaknesses • Essential to have alignment between teaching standards and the observation instrument • Resource intensive (personnel time, training, calibrating) • Validity of observation results may vary with who conducts them and how well trained and calibrated the observers are

  12. Example: University of Virginia’s CLASS observation tool

  13. Example: Charlotte Danielson’s Framework for Teaching

  14. Validity of classroom observations is highly dependent on training • A teacher should get the same score no matter who observes him • This requires that all observers be trained on the instruments and processes • Occasional “calibrating” should be done; more often if there are discrepancies or new observers • Who the evaluators are matters less than the fact that they are trained to recognize evidence and score it consistently • Teachers should also be trained on the observation forms and processes so they can participate actively and fully in the process

  15. Risk management vs. one-size-fits-all in teacher observations • Conducting high-quality observations is a resource-intensive process • A more efficient use of resources is for teachers who have not yet demonstrated competence to be on a more intensive observation schedule • New teachers • Teachers who have changed teaching assignments or schools • Other measures are less resource intensive and can be used routinely (surveys, student outcomes, portfolios)

  16. Reliability results when using different combinations of raters and lessons [Figure 2, “Errors and Imprecision: the reliability of different combinations of raters and lessons,” from Hill et al., 2012 (see references list); used with permission of the author.]
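
The figure itself is not reproduced here, but its point is that scores averaged over more raters and more lessons are more reliable. As a rough illustration only (Hill et al. examine this with a generalizability-study design, which this sketch is not), the classic Spearman-Brown relationship shows how reliability grows as observations are added; the single-observation reliability below is an assumed value, not a number from the study.

```python
# Illustrative sketch: how averaging over more observations (raters x lessons)
# improves score reliability, per the Spearman-Brown prophecy formula.
# The value of `single` is assumed for illustration, not taken from Hill et al.

def spearman_brown(single_reliability: float, k: int) -> float:
    """Reliability of a score averaged over k comparable, independent observations."""
    r = single_reliability
    return k * r / (1 + (k - 1) * r)

single = 0.35  # assumed reliability of one rater scoring one lesson
for k in (1, 2, 4, 6):
    print(f"{k} observation(s): estimated reliability ~ {spearman_brown(single, k):.2f}")
```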

  17. Formal vs. informal observations • Formal observations are likely to • Be announced and scheduled in advance according to a pre-determined yearly schedule • Include pre- and post-conferences with review of lesson plans and artifacts • Last an entire class period • Result in a set of scores on multiple indicators • Informal observations are likely to • Be unannounced, drop-in • Last less than an entire class period • Result in informal verbal or written feedback to the teacher, perhaps on only one indicator

  18. Questions to ask about observations • How many observations per year? • Vary for new vs. experienced teachers? • Vary by demonstrated competence? • Combination of formal and informal? • Who should conduct the observations? • Will multiple observers be required? • How will they be trained? • Workshops? Online (video-based)? • Will they need to be certified?

  19. Valuing other professional contributions, other student outcomes • Professional contributions • Working closely with parents and community • Participating in Response to Intervention teams • Curriculum teams (within & across grades) • Leadership in grade/subject/school • Student outcomes • Successful outcomes for special populations may include social and behavioral outcomes • Improvements in attendance, behavior, participation in class, engagement, etc.

  20. Measuring Professional Contributions • Some observation instruments include this category (Charlotte Danielson’s, TAP, Marzano) while others don’t (CLASS) • If professional contributions are not included, may want to consider teacher-constructed portfolios • Specific types of documents according to guidelines, not just a bunch of “kudos” • Document level of participation and contributions, not just “participation”

  21. Peer feedback • Research on using peer feedback in K-12 teaching is scarce (maybe non-existent) but can be found in higher ed for formative and summative (tenure) purposes • Done in pairs or teams • Two key areas • Classroom observation with feedback • Feedback on classroom artifacts (lesson plans, teaching resources, assessments, etc.)

  22. Parent feedback • Very little research on the use of parent surveys and feedback in teacher evaluation • A study by Peterson et al (2003) suggests parents took the anonymous survey seriously (answers were not random) • Positive: Teachers liked being able to select the survey as one of the measures for their evaluation (from a menu of measures) • Negative: Principals were worried about sampling (could be a problem in elementary or whenever you have small classes)

  23. Parent survey (from Peterson et al, 2003) • Did you ask the teacher for, and did the teacher give you (yes/no responses): • 1. An overview of class content & goals? • 2. Description of student’s progress? • 3. Ideas for home support of learning? • Answered on a 5-point scale from Yes (5) through Somewhat (3) to No (1): • 4. Did your child know what was expected in this class? • 5. Was the classroom work the right difficulty for your child? • 6. Did the teacher treat your child with respect, care, and knowledge of the child’s needs? • 7. Were you satisfied with your child’s overall school experience as provided by this teacher? • Do you have any comments for the teacher?

  24. Tripod Student Survey (1) • Harvard’s Tripod Survey – the 7 C’s • Caring about students (nurturing productive relationships); • Controlling behavior (promoting cooperation and peer support); • Clarifying ideas and lessons (making success seem feasible); • Challenging students to work hard and think hard (pressing for effort and rigor); • Captivating students (making learning interesting and relevant); • Conferring (eliciting students’ feedback and respecting their ideas); • Consolidating (connecting and integrating ideas to support learning)

  25. Tripod Student Survey (2) • Improved student performance depends on strengthening three legs of teaching practice: content, pedagogy, and relationships • There are multiple versions: K-2, 3-5, 6-12 • Measures: • student engagement • school climate • home learning conditions • teaching effectiveness • youth culture • family demographics • Takes 20-30 minutes • Available in English and Spanish • Available in paper and online versions

  26. Tripod Student Survey (3) • Control is the strongest correlate of value added gains • However, it is important to keep in mind that a good teacher achieves control by being good on the other dimensions

  27. Tripod Student Survey (4) • Different combinations of the 7 C's predict different outcomes (student learning is one outcome) • Using the data, you can determine what a teacher needs to focus on to improve important outcomes • Besides student learning, other important outcomes include: • happiness • good behavior • healthy responses to social pressures • self-consciousness • engagement/effort • satisfaction

  28. Surveys as evaluation measures • Relative to other measures, they’re very inexpensive (though there is a cost for Tripod) • They can provide useful, actionable information to teachers and to principals about aspects of teachers’ performance not captured in other measures • The survey can be repeated over time to show improvement in problem areas

  29. Rhode Island’s SLO language • “Student Learning Objectives are not set by educators in isolation; rather, they are developed by teams of administrators, grade-level teams or groups of content-alike teachers and, are aligned to district and school priorities, wherever possible.” (p. 12) From Rhode Island’s “Guide to Measures of Student Learning for Administrators and Teachers 2011-2012” http://www.ride.ri.gov/educatorquality/educatorevaluation/Docs/GuideSLO.pdf

  30. Evidence of growth in student learning • Evidence is strongest when it is • Standardized, meaning that all teachers used the assessment in exactly the same way • Gave the assessment on the same day • Gave students a specific amount of time to complete the test • Used the same preparation/instructions prior to the test • Recorded/reported results accurately • Valid, meaning that it measures what is intended • Items (questions) accurately capture students’ understanding and knowledge • Progress towards proficiency in a subject is captured because there are sufficient items to measure students at all levels • Recorded, meaning that student progress can be compared across classrooms and schools

  31. Collect evidence in a standardized way (to the extent possible) • Evidence of student learning growth • Locate or develop rubrics with explicit instructions and clear indicators of proficiency for each level of the rubric • Establish time for teachers to collectively examine student work and come to a consensus on performance at each level • Identify “anchor” papers or examples • Provide training for teachers to determine how and when assessments should be given, and how to record results in specific formats

  32. The 4 Ps (Projects, Performances, Products, Portfolios) • Yes, they can be used to demonstrate teachers’ contributions to student learning growth • Here’s the basic approach • Use a high-quality rubric to judge initial knowledge and skills required for mastery of the standard(s) • Use the same rubric to judge knowledge and skills at the end of a specific time period (unit, grading period, semester, year, etc.)
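
The pre/post rubric approach is simple enough to express directly. The sketch below uses made-up rubric dimensions and scores (nothing here comes from the presentation): the same student work is scored on the same rubric at the start and end of a grading period, and growth is summarized per dimension.

```python
# Minimal sketch of the 4 Ps pre/post approach: score the project, performance,
# product, or portfolio on the same rubric at the start and end of the period,
# then report growth. Rubric dimensions and scores below are hypothetical.

PRE = {"technique": 1, "interpretation": 2, "craftsmanship": 1}    # rubric levels 1-4
POST = {"technique": 3, "interpretation": 3, "craftsmanship": 2}

growth = {dim: POST[dim] - PRE[dim] for dim in PRE}
average_growth = sum(growth.values()) / len(growth)

for dim, delta in growth.items():
    print(f"{dim}: {PRE[dim]} -> {POST[dim]} (growth {delta:+d})")
print(f"Average growth across dimensions: {average_growth:.2f} rubric levels")
```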

  33. Assessing Musical Behaviors: The type of assessment must match the knowledge or skill • Four types of musical behaviors: Responding, Creating, Performing, Listening • Types of assessment: Rubrics, Playing tests, Written tests, Practice sheets, Teacher Observation, Portfolios, Peer and Self-Assessment • Slide used with permission of authors Carla Maltas, Ph.D. and Steve Williams, M.Ed. See reference list for details.

  34. The “caseload” educators • For nurses, counselors, librarians and other professionals who do not have their own classroom, what counts for you is your “caseload” • May be all the students in the school • May be a specific set of students • May be other teachers • May be all of the above!

  35. New Haven “matrix” • In the matrix, asterisks indicate a mismatch: a teacher is very high on one area (practice or growth) and very low on the other area.

  36. Considerations for choosing and implementing measures • Consider whether human resources and capacity are sufficient to ensure fidelity of implementation • Conserve resources by encouraging districts to join forces with other districts or regional groups • Establish a plan to evaluate measures to determine if they can effectively differentiate among teacher performance • Examine correlations among measures • Evaluate processes and data each year and make needed adjustments
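
The "examine correlations among measures" step needs very little machinery. The sketch below, with invented measures and scores for six hypothetical teachers, computes pairwise Pearson correlations across the teacher roster; a near-zero correlation between two measures suggests they capture different aspects of teaching, or that one of them is unreliable.

```python
# A minimal sketch (with made-up scores) of examining correlations among
# evaluation measures across a set of teachers.
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical ratings for six teachers on three measures (1-4 scale).
measures = {
    "observation":    [2.5, 3.0, 3.5, 2.0, 4.0, 3.0],
    "student_growth": [2.0, 3.5, 3.0, 2.5, 3.5, 2.5],
    "student_survey": [3.0, 3.0, 4.0, 2.0, 3.5, 3.5],
}

names = list(measures)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        print(f"{a} vs {b}: r = {pearson(measures[a], measures[b]):.2f}")
```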

  37. Measuring teachers’ contributions to student learning growth: A summary of current models

  38. Interpreting results for alignment with teacher professional learning options • Different approach; not looking at “absolute gains” • Requires ability to determine and/or link student outcomes to what likely happened instructionally • Requires ability to “diagnose” instruction and recommend and/or provide appropriate professional growth opportunities • Individual coaching/feedback on instruction • Observing “master teachers” • Group professional development (when several teachers have similar needs)

  39. Memphis professional development system: An aligned system • Teaching and Learning Academy began in April 1996 • Nationally commended program intended to • “…provide a collegial place for teachers, teacher leaders and administrators to meet, study, and discuss application and implementation of learning…to impact student growth and development” • Practitioners propose and develop courses • Responsive to school/district evaluation results • Offerings must be aligned with NSDC standards • 300+ online and in-person courses on many topics

  40. Growth opportunities for all teachers (figure from Duke & Stiggins, 1986, p. 15)

  41. Measures that help teachers grow • Measures that motivate teachers to examine their own practice against specific standards • Measures that allow teachers to participate in or co-construct the evaluation (such as “evidence binders”) • Measures that give teachers opportunities to discuss the results with evaluators, administrators, colleagues, teacher learning communities, mentors, coaches, etc. • Measures that are aligned with professional development offerings • Measures which include protocols and processes that teachers can examine and comprehend

  42. Results inform professional growth opportunities • Are evaluation results discussed with individual teachers? • Do teachers collaborate with instructional managers to develop a plan for improvement and/or professional growth? • All teachers (even high-scoring ones) have areas where they can grow and learn • Are effective teachers provided with opportunities to develop their leadership potential? • Are struggling teachers provided with coaches and given opportunities to observe/be observed?

  43. Effectiveness can be improved! • Most teachers are doing the best they can • Help them do better with feedback, support, coaching, and a focus on classroom environment and relationships with students • Teachers who are discouraged may need to see successful teachers with similar kids • Teachers who are consistently effective should be encouraged to model and teach specific practices to less effective teachers • Classroom learning environment is key: helping teachers create and maintain a better classroom learning environment improves student outcomes

  44. Weights and measures • There are no “rules” here; weights are likely to be determined by local priorities and beliefs • Need to decide whether a high score on one measure/component can make up for a low score on another (“compensatory”) • Need to decide whether to have a minimum score • High score on another component will not compensate • The specific “mix” of measures may be locally determined within state guidelines • The mix should be evaluated year-to-year to see how the set of measures and weights are working
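
To make the compensatory-versus-minimum-score distinction concrete, here is a minimal sketch; the weights, the 1-4 scale, and the cut score are hypothetical local choices, not recommendations from the presentation. The composite is compensatory (a weighted average), but any measure that falls below its minimum is flagged regardless of the composite.

```python
# A minimal sketch (assumed weights and cut scores) of combining multiple
# measures into a composite rating, with an optional minimum-score rule.

def composite_rating(scores, weights, minimums=None):
    """Combine per-measure scores (e.g., on a 1-4 scale) into a weighted composite.

    scores:   dict mapping measure name -> score
    weights:  dict mapping measure name -> weight (should sum to 1.0)
    minimums: optional dict mapping measure name -> minimum acceptable score;
              falling below any minimum flags the teacher regardless of the composite
    """
    composite = sum(weights[m] * scores[m] for m in weights)
    below_minimum = [
        m for m, floor in (minimums or {}).items() if scores[m] < floor
    ]
    return composite, below_minimum

# Example with hypothetical local weights: compensatory overall, but a low
# observation score cannot be fully offset by the other measures.
scores = {"observation": 2.0, "student_growth": 3.5, "surveys": 3.0}
weights = {"observation": 0.5, "student_growth": 0.35, "surveys": 0.15}
minimums = {"observation": 2.5}

composite, flags = composite_rating(scores, weights, minimums)
print(f"Composite: {composite:.2f}; below minimum on: {flags or 'none'}")
```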

  45. Teacher evaluation in isolated and/or low-capacity districts • External evaluators may need to be brought in for very small, isolated districts • For example, a district where the superintendent is also principal, history teacher, and bus driver • May also be needed when evaluators’ objectivity is impacted by factors such as fear of losing teachers or damaging long-term relationships in the community • Evaluators could be “exchanged” across districts within a specific region (“you evaluate mine, and I’ll evaluate yours”) or regional evaluators could serve a set of districts

  46. Before you implement teacher and principal evaluation systems, ask yourself… • How will this component of the teacher and principal evaluation system impact teaching and learning in classrooms and schools? • How will this component look different in low-capacity vs. high-capacity schools? • How will reporting on this component be done (to provide actionable information to teachers, principals, schools, districts, teacher preparation programs, and the state)? • How will we know if this component is working as we intended?

  47. Final thoughts • The limitations: • There are no perfect measures • There are no perfect models • Changing the culture of evaluation is hard work • The opportunities: • Evidence can be used to trigger support for struggling teachers and acknowledge effective ones • Multiple sources of evidence can provide powerful information to improve teaching and learning • Evidence is more valid than “judgment” and provides better information for teachers to improve practice

  48. Resources and links • Memphis Professional Development System • Main site: http://www.mcsk12.net/aoti/pd/index.asp • PD Catalog: http://www.mcsk12.net/aoti/pd/docs/PD%20Catalog%20Spring%202011lr.pdf • Individualized Professional Development Resource Book: http://www.mcsk12.net/aoti/pd/docs/Resource%20guide%2011-11.pdf • Charlotte Danielson’s Framework for Teaching http://www.danielsongroup.org/theframeteach.htm • CLASS http://www.teachstone.org/ • Peer Review of Teaching (Higher Ed) http://www.teaching-learning.utas.edu.au/__data/assets/pdf_file/0010/1054/Peer_review_of_teaching.pdf

  49. Resources and links (cont’d) • Harvard’s Tripod Survey http://www.tripodproject.org/index.php/index/ • National Response to Intervention Center Progress Monitoring Tools http://www.rti4success.org/chart/progressMonitoring/progressmonitoringtoolschart.htm • Colorado Content Collaboratives http://www.cde.state.co.us/ContentCollaboratives/index.asp • New York State approved teacher and principal practice rubrics http://usny.nysed.gov/rttt/teachers-leaders/practicerubrics/ • Rhode Island Department of Education Teacher Evaluation – Student Learning Objectives http://www.ride.ri.gov/educatorquality/educatorevaluation/SLO.aspx • Tennessee Teacher Evaluation http://team-tn.org/

  50. References • Anderson, L. (1991). Increasing teacher effectiveness. Paris: UNESCO, International Institute for Educational Planning. • Betebenner, D. W. (2008). A primer on student growth percentiles. Dover, NH: National Center for the Improvement of Educational Assessment (NCIEA). http://www.cde.state.co.us/cdedocs/Research/PDF/Aprimeronstudentgrowthpercentiles.pdf • Braun, H., Chudowsky, N., & Koenig, J. A. (2010). Getting value out of value-added: Report of a workshop. Washington, DC: National Academies Press. http://www.nap.edu/catalog.php?record_id=12820 • Duke, D. L., & Stiggins, R. J. (1986). Teacher evaluation: Five keys to growth. West Haven, CT: National Education Association. ERIC #ED275069 (full text, p. 15). • Ellerson, N. M. (2009). Exploring the possibility and potential for pay for performance in America’s public schools. Washington, DC: American Association of School Administrators. • Finn, C. (2010, July 12). Blog response to topic “Defining Effective Teachers.” National Journal Expert Blogs: Education. http://education.nationaljournal.com/2010/07/defining-effective-teachers.php • Fuller, E., & Young, M. D. (2009). Tenure and retention of newly hired principals in Texas. Austin, TX: Texas High School Project Leadership Initiative. • Glazerman, S., Goldhaber, D., Loeb, S., Raudenbush, S., Staiger, D. O., & Whitehurst, G. J. (2011). Passing muster: Evaluating evaluation systems. Washington, DC: Brown Center on Education Policy at Brookings. http://www.brookings.edu/reports/2011/0426_evaluating_teachers.aspx
