Waiting Room • Today’s webinar will begin shortly. • REMINDERS: • Dial 800-503-2899 and enter the passcode 6496612# to hear the audio portion of the presentation • Download today’s materials from the sign-in page: • Webinar Series Part 6 PowerPoint slides • Correlation Example Excel file
Determining How to Integrate Assessments into Educator Evaluation: Developing Business Rules and Engaging Staff Webinar Series Part 6
Audience & Purpose • Target audience • District teams that will be engaged in the work of identifying, selecting, and piloting District-Determined Measures. • After today participants will understand: • Examples of practical solutions to issues of fairness in using District-Determined Measures (DDMs). • Practical examples of engaging educators in the process of implementing DDMs.
Agenda • Student Impact Rating Rollout Reminder • DDM Comparability • Identifying Bias • Standardizing DDMs • Ensuring Sufficient Variability • Q&A and Next Steps
Student Impact Rating Rollout: *ESE will release the June 2014 submission template and DDM implementation extension request form in December 2013.
DDM Key Questions • Is the measure aligned to content? • Does it assess what the educators intend to teach and what’s most important for students to learn? • Is the measure informative? • Do the results tell educators whether students are making the desired progress, falling short, or excelling? • Do the results provide valuable information to schools and districts about their educators?
Refining your Pilot DDMs • Districts will employ a variety of approaches to identify pilot DDMs (e.g., build, borrow, buy). • Key considerations: • How well does the assessment measure growth? • Is there a common administration protocol? • Is there a common scoring process? • How do results correspond to low, moderate, or high growth? • Is the assessment comparable to other DDMs? • Use the DDM Key Questions and these considerations to strengthen your assessments during the pilot year.
DDM Comparability: Two Types • DDMs must be “comparable across schools, grades, and subject matter district-wide.” (Per 603 CMR 35.09(2)a) • Comparability = Two types • (Type 1) Comparable across schools • (Type 2) Comparable across grades and subject matter • Learn more in Technical Guide B, page 9 and appendix G
Comparability (Type 1) • Comparable across schools • Example: Teachers with the same job (e.g., all 5th grade teachers) • Where possible, measures are identical • Easier to compare identical measures • Do identical measures provide meaningful information about all students? • When might they not be identical? • Different content (different sections of Algebra I) • Differences in untested skills (reading and writing on math test for ELL students) • Other accommodations (fewer questions to students who need more time)
Error and Bias • Error is the difference between true ability and a student’s score. • Random error • Student sleeps poorly, makes a lucky guess, etc. • Systematic error (bias) • Error occurs for one type or group of students • ELL student misreads a set of questions • Why does this matter? • Random error (OK) decreases with longer/additional measures • Bias (BAD) does not decrease with longer/additional measures • Even with identical DDMs, bias threatens comparability
When does bias occur? • Situation: Students who score high on the pre-test have less of an opportunity to grow because they cannot get more than a top score (Ceiling Effect). • Situation: Special education students gain fewer points from pre-post test, and as a result are less likely to be labeled as having high growth.
Checking for Bias • Do all students have an equal chance to grow? • Is there a relationship between the initial score and gain score? • We can check this in Excel using correlation • We have • Pre-Test Score • Post-Test Score • Gain Score (Post-Test Score minus Pre-Test Score) • Type “=correl” and click the formula • Highlight the Pre-Test Scores, press “Comma” • Highlight the Gain Scores, close parentheses, press “Enter” • Correlation formula in Excel: =CORREL(PRE-TEST SCORES, GAIN SCORES)
Interpreting Correlation • Correlation is the degree to which two sets of numbers are related • Correlation • Number between -1 and 1. • A zero correlation means the numbers are unrelated • Closer to 1 or -1 means a stronger correlation • DDMs should provide all students an opportunity to demonstrate growth • We want to see little to no correlation between pre-test scores and gain scores • A correlation above .3 or below -.3 suggests that there are systematic differences in gain for low and high ability students
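The check described above can also be sketched outside Excel. The example below is a minimal illustration, assuming made-up scores on a 20-point test; `pearson_r` is a helper written for this sketch and computes the Pearson correlation, which is the statistic Excel's CORREL returns. The ±.3 cut-off is the one given on the slide.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical pre- and post-test scores (maximum score: 20).
pre = [4, 7, 9, 12, 15, 18, 19]
post = [12, 14, 15, 17, 18, 20, 20]

# Gain score = post-test score minus pre-test score.
gain = [after - before for before, after in zip(pre, post)]

r = pearson_r(pre, gain)
flagged = abs(r) > 0.3  # threshold from the slide

print(f"correlation between pre-test and gain: {r:.2f}")
print("possible bias, investigate further" if flagged else "no strong relationship")
```

In this invented data set, students who start near the top of the scale have little room to grow, so the correlation is strongly negative and the check flags the measure, which is the ceiling-effect pattern.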
Correlation Example • Demonstration of computing correlation between pre-test and gain scores • Very Low Correlation • Students of all ability levels were equally likely to demonstrate growth • Negative Correlation • Students of high ability systematically demonstrated less growth (due to ceiling effect) • Positive Correlation • Students with lower scores generally grew less (bias)
Interpreting Correlation • Strong correlation is an indication of a problem • A low correlation is not a guarantee of no bias! • Strong effect in small sub-population • Counteracting effects at both low and high end • Use common sense • Always look at a graph! • Create a scatter-plot graph and look for patterns
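To illustrate why a low correlation is not a guarantee, here is a small sketch with invented data in which students at both the low and the high end gain less than students in the middle. The two effects cancel out in the correlation, but comparing average gains by pre-test group (a numeric stand-in for looking at the scatter plot) reveals the pattern. All scores and the three-way split are assumptions made for this example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def mean(values):
    return sum(values) / len(values)

# Hypothetical data: gains are small at both ends of the pre-test scale.
pre = [2, 4, 6, 8, 10, 12, 14, 16, 18]
gain = [1, 2, 4, 6, 7, 6, 4, 2, 1]

r = pearson_r(pre, gain)

# Split students into three equal-sized groups by pre-test score.
pairs = sorted(zip(pre, gain))
third = len(pairs) // 3
low = mean([g for _, g in pairs[:third]])
middle = mean([g for _, g in pairs[third:2 * third]])
high = mean([g for _, g in pairs[2 * third:]])

print(f"correlation: {r:.2f}")
print(f"average gain (low / middle / high pre-test): {low:.1f} / {middle:.1f} / {high:.1f}")
```

The correlation is essentially zero, yet the middle group gains far more than the other two, so the correlation alone would have missed the problem.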
Example of Bias at Teacher Level • Teacher A • Teacher B • Even though similar students gained the same amount: • Teacher A’s average gain is 2 • Teacher B’s average gain is 5
Solution: Grouping • Grouping allows teachers to be compared based on similar students, even when the number of those students is different
Addressing Bias: Grouping • How many groups? • What bias are you addressing? • Enough students in each group? • Using Groups • Weighted average • Rule based (all groups must be above cut off) • Professional judgment
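A minimal sketch of the weighted-average option follows. The rosters, the group labels "X" and "Y", and the equal weights are all hypothetical; the gains were chosen only so that the raw averages come out to 2 and 5, as in the teacher-level example. Within each group, both teachers' students gain the same amount.

```python
from statistics import mean

# Hypothetical rosters: each student is (group, gain).
# Group "X" students gain 1 point and group "Y" students gain 6 points,
# regardless of which teacher they have.
teacher_a = [("X", 1)] * 8 + [("Y", 6)] * 2
teacher_b = [("X", 1)] * 2 + [("Y", 6)] * 8

def raw_average(roster):
    """Average gain across all students, ignoring groups."""
    return mean([gain for _, gain in roster])

def group_means(roster):
    """Average gain within each group."""
    by_group = {}
    for group, gain in roster:
        by_group.setdefault(group, []).append(gain)
    return {group: mean(gains) for group, gains in by_group.items()}

def weighted_average(roster, weights):
    """Combine group averages using the same weights for every teacher."""
    means = group_means(roster)
    return sum(weights[group] * means[group] for group in weights)

# Assumption: both groups count equally, whatever the class composition.
weights = {"X": 0.5, "Y": 0.5}

print(raw_average(teacher_a), raw_average(teacher_b))  # 2 and 5
print(weighted_average(teacher_a, weights),
      weighted_average(teacher_b, weights))            # 3.5 and 3.5
```

Once the same weights are applied to each group's average, the two teachers look identical, which matches the fact that similar students gained the same amount. This sketch raises a `KeyError` if a teacher has no students in one of the groups, which is one reason to ask whether there are enough students in each group.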
Comparability (Type 2) • Comparability across different DDMs • Across different grades and subject matter • Are different DDMs held to the same standard of rigor? • Does not require identical number of students in each of the three groups of low, moderate, and high • Common sense judgment of fairness
One option: Standardization • Standardization is a process of putting different measures on the same scale • For example • Most cars cost $25,000 give or take $5,000 • Most apples cost $1.50 give or take $.50 • Getting a $5,000 discount on a car is about equal to what discount on an apple? • Technical terms • “Most are” = mean • “Give or take” = standard deviation
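One way to work through the car and apple question is to express the car discount in "give or take" units (standard deviations) and then convert that number back into apple dollars. The sketch below uses only the figures from the slide; the function name is made up for illustration.

```python
def in_standard_units(amount, standard_deviation):
    """Express an amount as a number of standard deviations."""
    return amount / standard_deviation

car_sd = 5000.00    # cars: "give or take $5,000"
apple_sd = 0.50     # apples: "give or take $.50"

# A $5,000 car discount is one standard deviation.
discount_in_sd = in_standard_units(5000.00, car_sd)

# The same number of standard deviations, expressed in apple dollars.
apple_discount = discount_in_sd * apple_sd

print(f"car discount = {discount_in_sd} standard deviation(s)")
print(f"equivalent apple discount = ${apple_discount:.2f}")
```

Under this reading, a $5,000 discount on a car is comparable to a $0.50 discount on an apple, because each is one standard deviation on its own scale.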
Guest Speaker • Jamie LaBillois – • Executive Director of Instruction, Norwell Public Schools
Developing Local Norms • Student A • English: 15/20 • Math: 22/25 • Art: 116/150 • Social Studies: 6/10 • Science: 70/150 • Music: 35/35 • We learned early on that we needed a process that would create one universal measurement unit to discuss student progress.
How? • Step One • Calculated the difference between Post and Pre (or any approach from Technical Guide B) • Step Two • Find the mean (average) of the difference scores • Step Three • Find the standard deviation of the difference scores
How? • Now, we’re ready to “transform” the difference scores into a universal measurement system. • Step Four • Calculate the z-score of each individual difference score • $Z = \frac{\text{observation} - \text{mean}}{\text{standard deviation}}$ • Step Five • Calculate percentile rank for each z-score
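Steps One through Five can be sketched as follows. This is an illustration under stated assumptions, not the district's actual procedure: the difference scores are invented, the population standard deviation (`pstdev`) is used, and the percentile rank is read from the standard normal curve, which assumes the difference scores are roughly normally distributed.

```python
import math
from statistics import mean, pstdev

def z_scores_and_percentiles(differences):
    """Return (difference, z-score, percentile rank) for each difference score."""
    m = mean(differences)      # Step Two: mean of the difference scores
    sd = pstdev(differences)   # Step Three: standard deviation (population form, an assumption)
    results = []
    for d in differences:
        z = (d - m) / sd       # Step Four: z = (observation - mean) / standard deviation
        # Step Five: percentile rank from the standard normal curve (an assumption).
        percentile = 100 * 0.5 * (1 + math.erf(z / math.sqrt(2)))
        results.append((d, z, percentile))
    return results

# Step One: hypothetical difference scores (post-test minus pre-test).
differences = [2, 4, 4, 4, 5, 5, 7, 9]

for d, z, pct in z_scores_and_percentiles(differences):
    print(f"difference {d}: z = {z:+.2f}, percentile = {pct:.0f}")
```

Because every measure ends up on the same percentile scale, results from tests with different point totals can be discussed side by side.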
Developing Local Norms • Student A • English: 15/20 • Math: 22/25 • Art: 116/150 • Social Studies: 6/10 • Science: 70/150 • Music: 35/35 • Student A • English: 62 %ile • Math: 72 %ile • Art: 59 %ile • Social Studies: 71 %ile • Science: 70 %ile • Music: 61 %ile
Examining an Educator’s Impact • Grade 4 DIBELS Oral Reading Fluency • MEDIAN %ile per class: • Teacher 1: 65 %ile • Teacher 2: 71 %ile • Teacher 3: 59 %ile • Teacher 4: 59 %ile • Teacher 5: 62 %ile • Teacher 6: 57 %ile • Teacher 7: 29 %ile • Teacher 8: 50 %ile Evaluator’s Focus
Lessons Learned • Growth vs. Achievement • Robust Tool • Timely Analysis • Re-Assessment of Instruction • Re-Assessment of Ability vs. Disability • Development of Building-Based Evaluators • Educator Engagement is Essential
Ensuring Sufficient Variability • Technical Guide B’s two key questions: • Is the DDM aligned to content? • Does the DDM provide information to educators and evaluators? • Lack of variability reduces information
Looking for Variability • The second graph is problematic: so many students fall into the “high” growth category that it gives us little information about the difference between average and high growth.
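A rough numeric version of this check is to count what share of students lands in each growth category. In the sketch below, the gain scores, the cut-offs (5 and 10 points), and the 60% warning level are all hypothetical values chosen for illustration; districts would substitute their own.

```python
from collections import Counter

def growth_category(gain, low_cut, high_cut):
    """Label a gain score as low, moderate, or high growth."""
    if gain < low_cut:
        return "low"
    if gain >= high_cut:
        return "high"
    return "moderate"

def category_shares(gains, low_cut, high_cut):
    """Share of students falling into each growth category."""
    counts = Counter(growth_category(g, low_cut, high_cut) for g in gains)
    return {cat: counts[cat] / len(gains) for cat in ("low", "moderate", "high")}

# Hypothetical gain scores and cut-offs.
gains = [3, 9, 10, 11, 12, 12, 13, 14, 15, 15]
shares = category_shares(gains, low_cut=5, high_cut=10)

# Hypothetical warning level: one category holding more than 60% of students.
crowded = [cat for cat, share in shares.items() if share > 0.6]

print(shares)
print("categories with too little variability:", crowded)
```

Here most students fall into the "high" category, so the measure says little about who grew an average amount and who grew a lot.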
Guest Speaker • Experience with constructing measures with greater variability
Wrap-Up • Today, we discussed three strategies for evaluating the fairness of your DDMs • Check for bias by computing the correlation between pre-test scores and gain scores. • Remember: A correlation near zero suggests that all students have an equal chance to demonstrate growth, but it is not a guarantee, so also look at a scatter plot. • Standardization can help you compare DDMs in different content areas. • Look for variability in student growth. A lack of variability reduces the amount of information available to educators about their students.
Resources Available Now at http://www.doe.mass.edu/edeval/ddm/: • Technical Guide B • DDMs and Assessment Literacy Webinar Series • Technical Assistance and Networking Sessions • Core Course Objectives and Example DDMs Coming Soon • Using Current Assessments Guidance (Curriculum Summit) • Model Contract Language • DDM Pilot Plan Cohorts
Register for Webinar Series Part 7 • Part 7: Communicating Results • Date: December 5th, 2013 • Time: 4-5pm EST (60 minutes) • Register: https://air-event500.webex.com/air-event500/onstage/g.php?d=597905353&t=a
Questions • Contact Craig Waterman at email@example.com or Ron Noble at firstname.lastname@example.org • Feedback • Tell us how we did: http://www.surveygizmo.com/s3/1421848/District-Determined-Measures-amp-Assessment-Literacy-Webinar-6-Feedback