Join Dr. Deborah Brady in exploring straightforward methods for measuring student growth and standardizing assessments. This session highlights alternative analytical approaches for comparing student performance across similar assessments while keeping the data trustworthy. Participants will learn about the significance of substantial assessments aligned with curriculum standards, as well as practical strategies like pre/post tests, repeated measures, and holistic rubrics for gauging academic progress. Prepare your questions, and enhance your assessment toolkit with insights tailored for educators.
DDM Part II: Analyzing the Results
Dr. Deborah Brady
Agenda
• Overview of how to measure growth in 4 "common sense" ways
• A quick look at "standardization"
• Not all analyses are statistical or new; we'll use familiar ways of looking at student work
• Excel might help when you have a whole grade's scores, but it is not essential
• Time for your questions; exit slips
• My email: dbrady3702@msn.com
• PowerPoint and handouts at http://tinyurl.com/k23opk6
Two Considerations for "Local DDMs"
1. Comparable across Schools
• Example: teachers with the same job (e.g., all 5th grade teachers)
• Where possible, measures are identical; identical measures are easier to compare
• Do identical measures provide meaningful information about all students?
• Exceptions: when might assessments not be identical?
• Different content (different sections of Algebra I)
• Differences in untested skills (reading and writing on a math test for ELL students)
• Other accommodations (fewer questions for students who need more time)
• NOTE: Roster verification and group size will be considerations by DESE
2. Comparable across the District
• Aligned to your curriculum (comparable content), K-12, in all disciplines
• Appropriate for your students
• Aligned to your district's content
• Informative and useful to teachers and administrators
• "Substantial" assessments (comparable rigor): "substantial" units with multiple standards and/or concepts assessed (DESE has recently suggested that midterms and finals are preferable). See the Core Curriculum Objectives (CCOs) on the DESE website if you are concerned: http://www.doe.mass.edu/edeval/ddm/example/
• Quarterly exams, benchmarks, midterms, and common end-of-year exams
• NOTE: All of this data stays in your district. Only the High/Moderate/Low (HML) rating goes to DESE, with a MEPID for each educator.
Examples of 4+1 Methods for Calculating Growth (each is in the handout)
• Pre/post test
• Repeated measures
• Holistic rubric (analytical rubric)
• Post test only
• A look at "standardization" with percentiles
Typical Gradebook and Distribution (page 1 of handout)
• Alphabetical order (effectively random with respect to scores)
• Sorted low to high
• Determine "cut scores" (validate them in the student work)
• Use the "stoplight method" to help see cut scores
• Graph of the distribution of all scores
• Graph of the distribution of High, Moderate, and Low scores
"Cut" scores and "common sense": validate them against actual student performances. Which work is not moving at an average rate? Which work shows accelerated growth? Some benchmarks have predetermined rates of growth over time. A minimal sorting sketch appears below.
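As a concrete illustration, here is a minimal Python sketch of the sort-and-stoplight step: order one class's scores, then band them around tentative cut values. The names, scores, and cut values (70 and 85) are invented; as the slide advises, real cuts would be validated against the student work itself.

```python
# Hypothetical gradebook; the cut values are placeholders to be
# validated against real student work.
scores = {"Ana": 92, "Ben": 67, "Cai": 78, "Dee": 85, "Eli": 55, "Fay": 74}

LOW_CUT, HIGH_CUT = 70, 85  # tentative cut scores

# Sort low to high, then label each score with a "stoplight" band.
for name, score in sorted(scores.items(), key=lambda kv: kv[1]):
    if score < LOW_CUT:
        band = "Low (red)"
    elif score < HIGH_CUT:
        band = "Moderate (yellow)"
    else:
        band = "High (green)"
    print(f"{name}: {score}  {band}")
```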
Pre/Post Test
• Description: the same or similar assessments administered at the beginning and at the end of the course or year
• Example: a Grade 10 ELA writing assessment aligned to the College and Career Readiness Standards, given at the beginning and end of the year
• Measuring growth: the difference between pre- and post-test scores
• Check that all students have an equal chance of demonstrating growth
Pre/Post Tests. Cut score? Look at the work. Look at the distribution. A minimal sketch of the calculation follows.
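A minimal sketch of the pre/post difference, assuming matched pre- and post-test scores per student (all numbers invented):

```python
# Invented matched pre- and post-test scores for one class.
pre  = {"Ana": 40, "Ben": 55, "Cai": 38, "Dee": 60, "Eli": 25}
post = {"Ana": 70, "Ben": 62, "Cai": 66, "Dee": 80, "Eli": 41}

# Growth is simply post minus pre; sorting shows the distribution of gains.
growth = {s: post[s] - pre[s] for s in pre}
for student, gain in sorted(growth.items(), key=lambda kv: kv[1]):
    print(f"{student}: pre {pre[student]}, post {post[student]}, growth {gain:+d}")

# Equity check from the slide: students who started near the ceiling may
# have less room to demonstrate growth than students who started low.
```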
Holistic
• Description: assess growth across student work collected throughout the year
• Example: Tennessee Arts Growth Measure System
• Measuring growth: a growth rubric (see example)
• Considerations: an option for multifaceted performance assessments; rating can be challenging and time-consuming
Holistic Example (unusual rubric). Example taken from Austin, a first grader from Anser Charter School in Boise, Idaho. Used with permission from Expeditionary Learning. Learn more about this and other examples at http://elschools.org/student-work/butterfly-drafts
Holistic rubrics are easier for large-scale assessments like MCAS (one score rather than separate scores for rubric categories such as Topic and Conventions) and are useful when categories overlap.
Pre and Post Rubric (2 Criteria) Growth: add the scores. Rubrics do not represent percentages; a student who received a 1 would probably receive a 50 (an F?).
• 1 = 50 (F): seriously at risk
• 2 = 60-72 (75?) (D to C-): at risk
• 3 = 76-88 (89?) (C+ to B+): average
• 4 = 90-100 (A to A+): above most
Converting Rubrics to Percentages
Not recommended for classroom use because it distorts the meaning of the descriptors; it may, however, facilitate large-scale use. A district decision.
Common-sense analysis: Was the assessment too difficult? Three zeros on the pretest, zero growth, and only 1 student improved. Change the assessment scale? Look at all of the grade-level assessments. The % conversion is probably not helpful in this case. A sketch of the conversion follows.
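To make the distortion concrete, here is a minimal sketch that converts rubric levels to percentages using the midpoints of the bands from the table above (1 = 50, 2 = 60-72, 3 = 76-88, 4 = 90-100). The midpoint choice, names, and scores are all invented for illustration.

```python
# Midpoints of the percentage bands from the rubric-conversion table.
BAND_MIDPOINT = {1: 50, 2: 66, 3: 82, 4: 95}

# Invented pre/post rubric levels for three students.
pre_rubric  = {"Ana": 1, "Ben": 2, "Cai": 3}
post_rubric = {"Ana": 2, "Ben": 3, "Cai": 4}

for s in pre_rubric:
    pre_pct = BAND_MIDPOINT[pre_rubric[s]]
    post_pct = BAND_MIDPOINT[post_rubric[s]]
    print(f"{s}: rubric {pre_rubric[s]} -> {post_rubric[s]}, "
          f"converted {pre_pct}% -> {post_pct}% (gain {post_pct - pre_pct})")

# Note how each identical one-level rubric gain becomes a different
# percentage gain (16, 16, 13 here): the distortion the slide warns about.
```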
Repeated Measures
• Description: multiple assessments given throughout the year
• Example: running records, attendance, the mile run
• Measuring growth: graphically, ranging from the sophisticated to the simple
• Considerations: less pressure on each administration; authentic tasks (reading aloud, running)
Repeated Measures Example: running record errors in reading, averaged for the high-, moderate-, and low-error groups. A minimal trend sketch follows.
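A minimal sketch of summarizing repeated measures graphically or numerically: a simple least-squares trend per student across administrations. The error counts are invented; since these are reading errors, a negative slope indicates growth.

```python
# Invented running-record error counts across four administrations.
errors = {"Ana": [12, 10, 7, 5], "Ben": [9, 9, 8, 8], "Cai": [15, 14, 14, 13]}

def slope(ys):
    """Least-squares slope of ys against administration number 0..n-1."""
    n = len(ys)
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den

for student, ys in errors.items():
    # Fewer errors over time is growth, so a negative trend is good news.
    print(f"{student}: errors {ys}, trend {slope(ys):+.2f} per administration")
```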
Post Test Only
AP exam: use it as a baseline to show growth for each level, or… for the classroom:
• This assessment does not have a "normal curve"
• An alternative for a post-test-only classroom measure that still shows student growth is to give a mock AP exam as both pre and post
Looking for Variability
• The second graph is problematic: because so many students fall into the "high" growth category, it tells us nothing about the difference between average and high growth.
• NOTE: Look at the work and make "common sense" decisions.
• Consider the whole grade level; one class's variation may be caused by the teacher's effectiveness.
• Critical question: Do all students have an equal possibility for success? (A tallying sketch follows.)
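A minimal sketch of the variability check described above: tally how many students land in each growth band. The band assignments are invented.

```python
from collections import Counter

# Invented band assignments for one grade level.
bands = ["High", "High", "High", "High", "High", "Moderate", "High", "Low"]

tally = Counter(bands)
for band in ("Low", "Moderate", "High"):
    share = tally[band] / len(bands)
    print(f"{band}: {tally[band]} students ({share:.0%})")
# If most students land in one band, the measure cannot distinguish
# average growth from high growth, which is the problem flagged above.
```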
"Standardizing" Local Norms: Percentages versus Percentiles
Percentages compare within a class/course; percentiles compare across all courses in the district. Many assessments, with different standards, for Student A:

Subject          Raw score   Percentage   "Standardized" percentile (normal curve)
English          15/20       75%          62nd
Math             22/25       88%          72nd
Art              116/150     77%          59th
Social Studies   6/10        60%          71st
Science          70/150      46%          70th
Music            35/35       100%         61st

A sketch of the percentile computation follows.
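A minimal sketch of turning raw scores from assessments with different point scales into percentile ranks on a common scale. The district score lists are invented, so the resulting ranks will not match the table above; the point is only that a percentile locates a student among all test-takers regardless of each test's own scale.

```python
def percentile_rank(score, all_scores):
    """Percent of scores at or below this one (one common definition)."""
    return 100 * sum(s <= score for s in all_scores) / len(all_scores)

# Invented district-wide score lists for two differently scaled assessments.
english = [8, 10, 11, 12, 13, 14, 15, 15, 16, 18]   # out of 20
math    = [10, 14, 15, 17, 18, 19, 20, 21, 22, 24]  # out of 25

print(f"English 15/20 -> {percentile_rank(15, english):.0f}th percentile")
print(f"Math    22/25 -> {percentile_rank(22, math):.0f}th percentile")
```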
Standardization in Everyday Terms
• Standardization is a process of putting different measures on the same scale.
• For example: most cars cost $25,000, give or take $5,000; most apples cost $1.50, give or take $0.50.
• Getting a $5,000 discount on a car is about equal to what discount on an apple? One "give or take" unit in each case, so about $0.50 (see the sketch below).
• Technical terms: "most are" = mean; "give or take" = standard deviation.
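A minimal sketch of the same idea as a z-score, (value − mean) / standard deviation, confirming the slide's analogy: $5,000 off a car and $0.50 off an apple are both exactly one standard deviation below the mean price.

```python
def z_score(value, mean, sd):
    """How many standard deviations a value sits from the mean."""
    return (value - mean) / sd

# Discounted prices: $5,000 off a $25,000 car, $0.50 off a $1.50 apple.
car_z   = z_score(25_000 - 5_000, mean=25_000, sd=5_000)  # -1.0
apple_z = z_score(1.50 - 0.50, mean=1.50, sd=0.50)        # -1.0

print(f"Car discount:   z = {car_z:+.1f}")
print(f"Apple discount: z = {apple_z:+.1f}")
```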
Excel Functions: sorting (high to low or low to high), the graphing function, and statistical functions including percentiles and standard deviation
• Student grades can be sorted from highest to lowest score with one command
• A table of student scores can be easily graphed with one command
• Excel will easily calculate percentages, but this is probably not necessary
"Common Sense"
• The purpose of DDMs is to assess teacher impact.
• The student scores and the Low, Moderate, and High growth rankings are entirely internal.
• DESE (in two years) will see only MEPIDs and an L, M, or H next to each MEPID.
• The important part of this process needs to be the focus: your discussions about student learning with colleagues, your discussions about student learning with your evaluator, and an ongoing process.