Understanding and Implementing Academic Growth Standards

Interpreting State Test Growth Model Emeteric PVAAS

PVAAS Growth Methods

The Growth Standard Methodology • Each year a cohort’s estimated achievement (using all historical PSSA data available) will be located on the appropriate grade level distribution from the 2005-06 statewide distributions. • The 2005-06 performance distributions are used to establish “typical” performance at each grade level so that growth in consecutive years can be measured relative to the same standard each year.

Growth Standard Methodology: Grades 4 through 8, Reading & Math • A cohort makes one year’s growth when the estimated achievement for the current year maintains the same relative position as the estimated achievement for the previous year in the statewide database of all cohorts’ estimated achievement. [Figure: cohort position on the 2005-06 4th grade distribution and on the 2005-06 5th grade distribution]

Mean Observed Score Predictive Methodology: Writing, Science, and Grade 11 Reading & Math • A cohort makes one year’s growth when the mean observed score from the actual test is not significantly different from the mean predicted score for the cohort. • The mean predicted score is calculated based on all reading and math data in each student’s record in the cohort. • Mean Observed Score ≈ Mean Predicted Score, that is, within Mean Predicted Score ± error.
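The "observed ≈ predicted ± error" rule can be sketched in a few lines of Python. This is only an illustration: the function name, the use of a standard error as the "error" term, and the default multiplier of 2 are assumptions, not values stated in this presentation.

```python
def made_one_years_growth(mean_observed, mean_predicted, standard_error, num_se=2.0):
    """Return True when the observed mean lies within the error band
    around the predicted mean.

    ASSUMPTION: the band is predicted +/- num_se * standard_error.
    The multiplier (2.0) is a hypothetical placeholder; the slides do
    not say how "significantly different" is decided.
    """
    return abs(mean_observed - mean_predicted) <= num_se * standard_error


# Hypothetical cohort: predicted mean 1300 with a standard error of 8.
print(made_one_years_growth(1310.0, 1300.0, 8.0))  # inside the band
print(made_one_years_growth(1330.0, 1300.0, 8.0))  # outside the band
```

The point of the sketch is that the comparison is made against a band rather than a single number, so small differences between observed and predicted means still count as one year’s growth.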

Summary: One Year’s Growth

Using the Growth Standard • What is a Growth Standard and how is it set? • The Growth Standard specifies the minimal acceptable academic gain from grade to grade for a cohort of students.

Using the Growth Standard • How can we compare scores across different years? • The Growth Standard converts PSSA scores to an equal-interval score that allows scores to be compared. Without the conversion, the scores cannot be compared.

Using the Growth Standard • The use of a Growth Standard creates the possibility that ALL schools can demonstrate appropriate growth.

An Analogy

An Analogy • Doctors plot a child’s length/height over time. • Each child may have a unique growth curve.

When is growth “acceptable”? • The length/height measurement is increasing over time. • The length/height measurement maintains the approximate position in its length/height distributions as the child grows. • The child’s length/height continues to increase in a consistent manner.

When is growth “acceptable”? • The PSSA growth standard acts in a similar manner to a child’s growth chart. • Deviation from “typical” → further investigation is needed.

What is the Growth Standard for a child’s length/height? • The standard is that the child maintains approximately the same position in each of the increasing distributions of lengths/heights as the child grows. • A significant deviation from that pattern does not indicate a problem; it indicates a need for further investigation.

Simulated Growth Standard Charts for Academic Achievement • Let us build an Academic Achievement Growth Chart. • Collect the average performances of a large sample of students using a uniform assessment during each year of their career through school. • Plot curves to represent appropriate percentile patterns. • An example: Suppose the following table represents the means and SDs of a group of students on the PSSA beginning in 3rd grade and continuing through 8th grade and ultimately 11th grade.

A Growth Standard Chart for Academic Achievement

An example of a cohort’s growth… This cohort’s mean performances have met the Growth Standard since • The growth curve approximately maintains its position in the distribution of scores. • There are no significant deviations in the pattern of growth over time. Note that there is a problem of comparing scaled scores across years…

A Problem… • It will take six years to create an academic growth chart. • We can use Base Year distributions. • Distributions of the Base Year match the distributions of a single cohort over time.

We use the base year distributions. The base year for PVAAS is 2006.

Using the Base Year 2006 • Suppose the distributions from 2006 are given by a table of grade-level means and SDs [table not recovered]. • Conversion to NCE scores will use the Base Year distributions in its calculations.

Suppose the means of a cohort in two consecutive years are: 2007: 3rd → 1390 and 2008: 4th → 1450. • NCE scores are calculated for both using the 2006 means and SDs: 2007: 3rd → 1390 → NCE score; 2008: 4th → 1450 → NCE score. • All future PSSA scaled scores will be converted to NCE scores using the 2006 Base Year parameters, so that the mean gain of a cohort of students can be calculated.

The NCE Growth Curves

Some Thoughts… This Growth Standard concept demonstrates the need for longitudinal data when considering academic growth, since each student has his/her own academic growth curve. But… The example also exhibits the remaining two issues for PVAAS value-added methods: • Comparing scores from year to year • Estimating the “true” level of achievement for input into the growth curve.

Calculation of Gain from year to year Student growth is measured by the difference in performance in consecutive years. But there is a problem with this! Scaled scores from different years are not comparable!

Comparing scaled scores on the PSSA from different years PSSA tests have different means and standard deviations at each grade and for different years. For example, in 8th grade:

A Solution: Conversion to NCE Scores • NCE scores indicate the position of a scaled score on a reference scale (mean = 50, SD = 21.06) so that the scaled scores from different distributions with different scales can be compared. • The use of NCE scores does not impose a normal distribution on the data, nor does the use of NCE scores have any relationship to norm-referenced tests. • “NCEs are excellent for looking at scores over time.” (Victoria L. Bernhardt, Using Data to Improve Student Learning in High Schools)

NCE Scores Are About Position To calculate an NCE score: • Calculate the z-score of the data value of interest, that is, the number of standard deviations the data value is from the mean of its distribution: z = (score − mean) / SD • The NCE score is calculated using the following formula: NCE = 50 + 21.06 × z

The need for uniform scales… • George scores a 655 on the SAT mathematics exam. • George also scores a 28 on the ACT mathematics exam. Which score should he report to his colleges if he wants to provide the “better” score?

A Matter of Comparison How do we compare George’s scores? The nature of each distribution is irrelevant to the question of interest:

A Solution • Conversion of both scores to NCE scores allows for the identification of the position of each score on the same scale. • This identification of position provides the capability of comparison since the converted scores will be based on the same distribution parameters.

Which Score Should George Choose to Report? Using an NCE scale with mean 50 and standard deviation 21.06… • SAT score of 655 → NCE score 75.85 • ACT score of 28 → NCE score 80.74 • Clearly, he should report his ACT score!

Consider Another Hypothetical Scenario… In 2006, Wilma was in 4th grade and scored as follows on the 4th grade PSSA: Mean for 4th Grade – 2006 = 1303.24 Standard Deviation for 4th Grade – 2006 = 164.20 Wilma’s scaled score = 1425 In 2005, Wilma was in 3rd grade and scored as follows on the 3rd grade PSSA: Mean for 3rd Grade – 2005 = 1356.75 Standard Deviation for 3rd Grade – 2005 = 126.20 Wilma’s scaled score = 1425 Do these scores indicate that Wilma progressed during 4th grade?

Let’s Look at it Graphically… Even though Wilma’s scaled scores were the same (both 1425), the distributions were different, so we really can’t compare the two scores…

A Tentative Solution: Conversion to Percentiles In our example, Wilma’s score of 1425 was in the 66th percentile for 2005 but was in the 76th percentile for 2006. These percentiles focus on Wilma’s position in each distribution.

But… • We cannot calculate Wilma’s gain – the difference of percentiles does not make sense… • Percentiles are not meaningful for calculating means for different years, gains, etc., since they are calculated from different distributions.

The Complete Solution: Conversion to NCE Scores • To establish a basis of comparison for different distributions from different schools in different years, we convert the scaled scores to units in the SAME scale. • The scale we will use is from the NCE distribution with mean 50 and standard deviation approximately equal to 21.06.

The NCE Distribution and Wilma Wilma’s NCE score for 2005 (3rd grade) is 61, while her score for 2006 (4th grade) is 66.

Wilma’s gain… Wilma’s gain = 2006 NCE score (4th grade) – 2005 NCE score (3rd grade) = 66 – 61 = +5 • The mean gain of all of the students in Wilma’s cohort, computed from individual gains like her +5 NCE points, can now be compared to the Growth Standard for Wilma’s cohort.
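Wilma’s numbers can be reproduced with a short Python sketch. It assumes the standard NCE definition, NCE = 50 + 21.06 × z with z = (score − mean) / SD, and uses the means and standard deviations quoted in the hypothetical scenario; the function and variable names are illustrative only.

```python
NCE_MEAN = 50.0
NCE_SD = 21.06


def nce_score(scaled_score, dist_mean, dist_sd):
    """Convert a scaled score to an NCE score using the mean and SD
    of the distribution the score came from."""
    z = (scaled_score - dist_mean) / dist_sd  # position in SD units
    return NCE_MEAN + NCE_SD * z


# Wilma scored 1425 in both years, but against different distributions.
nce_2005 = nce_score(1425, 1356.75, 126.20)  # 3rd grade, 2005
nce_2006 = nce_score(1425, 1303.24, 164.20)  # 4th grade, 2006

# The slides work with NCE scores rounded to whole numbers.
gain = round(nce_2006) - round(nce_2005)
print(round(nce_2005), round(nce_2006), gain)  # 61 66 5
```

Because both scores now sit on the same scale, the subtraction is meaningful, which is not the case for the raw scaled scores or for the percentiles.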

What about estimating the true level of achievement of a cohort of students?

The Assessment Dilemma True Student Achievement Any test is just a snapshot in time!

PVAAS Statewide Methodology • Student A Test Score (2009) → Student A Base Year NCE Score (2006) → 2009 Observed School Mean NCE Scores

The Problem with the Mean of the Observed Scores The mean of the observed NCE scores at best represents a single snapshot in time of student achievement of the PSSA Anchors… Is it the most comprehensive assessment of the school’s TRUE level of achievement? How about the Bad Day syndrome?

Observed vs. Composite Estimate…Which is better? What if we combined the new, observed data with all of the prior PSSA assessment information that we have for this cohort of students? Would not a longitudinal view of the cohort’s performance yield a more precise and reliable estimate of the true level of achievement? This is the essence and power of the PVAAS value-added growth methodology!

Consider an Example… Determine the percent of candies that are blue… If you were to open only one bag and find that 13% of the candies are blue, how much confidence would you have in your estimate of the true percentage of blue candies for all candies?

Only One Sample? A Bit Risky… Let’s open 50 bags and look at the distribution of the percents of blue candies… Looking at these 50 bags, what would you estimate as the “true” percent of blue candies for all candies?

What If? Let’s open 50 more bags and add them to the 50 selected earlier… [Figures: distribution with n = 50; distribution with n = 100] With this additional data, we can make a better estimate of the true percent of blue candies!
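The candy example can be simulated to see why more bags give a steadier estimate. The sketch below is illustrative only: the true blue fraction (20%) and the bag size (50 candies) are hypothetical values chosen for the simulation, not figures from the presentation.

```python
import random


def blue_fraction_in_bag(rng, candies_per_bag, true_blue_fraction):
    """Simulate one bag and return the fraction of blue candies in it."""
    blue = sum(1 for _ in range(candies_per_bag) if rng.random() < true_blue_fraction)
    return blue / candies_per_bag


def estimate_blue_fraction(rng, num_bags, candies_per_bag=50, true_blue_fraction=0.20):
    """Average the blue fraction over num_bags simulated bags.
    The defaults (50 candies, 20% blue) are HYPOTHETICAL."""
    fractions = [
        blue_fraction_in_bag(rng, candies_per_bag, true_blue_fraction)
        for _ in range(num_bags)
    ]
    return sum(fractions) / num_bags


def standard_error(num_bags, candies_per_bag=50, true_blue_fraction=0.20):
    """Theoretical standard error of the averaged estimate."""
    p = true_blue_fraction
    return (p * (1 - p) / (num_bags * candies_per_bag)) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
for num_bags in (1, 50, 100):
    estimate = estimate_blue_fraction(rng, num_bags)
    print(num_bags, round(estimate, 3), round(standard_error(num_bags), 4))
```

The standard error shrinks as bags are added, which mirrors the argument for combining the current test with all prior assessment data rather than relying on a single snapshot.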

The Function of Estimates • The PVAAS methodology provides estimates of current and previous achievement, and subsequent gain for the school entity using all information for each student, no matter how complete or sparse. • This process yields fair estimates of the impact of schooling on the rates of progress of the student populations and mitigates the problem of student mobility.

PVAAS Statewide Methodology • 2009 Observed School Mean NCE Scores → 2009 Estimated School Mean NCE Score (computed together with the 2008, 2007, and 2006 Estimated School Mean NCE Scores) • Gain = 2009 Estimate – 2008 Estimate • Compare to Growth Standard → School Rating

How to Measure Growth of a School? Using a Growth Standard • Student scaled scores are converted to NCE scores (2006 parameters). • The mean NCE score for each school is calculated. • PVAAS revises all earlier estimates based on the addition of the current data. • PVAAS calculates an estimated NCE mean score. Estimated Mean NCE Gain = Current Estimated NCE mean – Previous Estimated NCE mean • Gain is compared to Growth Standard for School Effect Rating.
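The arithmetic of these steps can be sketched in Python under some loud assumptions. PVAAS compares estimated means that draw on all prior data; this sketch substitutes plain observed means to keep the example short. The grade 4 base-year parameters are the 2006 values from the Wilma example; the grade 3 parameters and all student scores are hypothetical.

```python
from statistics import mean

NCE_MEAN = 50.0
NCE_SD = 21.06

# Base-year (2006) mean and SD by grade.
BASE_YEAR_PARAMS = {
    3: (1300.0, 150.0),    # HYPOTHETICAL placeholder values
    4: (1303.24, 164.20),  # 2006 4th grade values from the Wilma example
}


def to_nce(scaled_score, grade):
    """Convert a scaled score to NCE using the base-year parameters."""
    base_mean, base_sd = BASE_YEAR_PARAMS[grade]
    return NCE_MEAN + NCE_SD * (scaled_score - base_mean) / base_sd


def mean_nce(scaled_scores, grade):
    """Mean NCE score of a cohort in one grade."""
    return mean(to_nce(score, grade) for score in scaled_scores)


def mean_nce_gain(previous_scores, previous_grade, current_scores, current_grade):
    """Current mean NCE minus previous mean NCE (observed means only;
    PVAAS itself uses estimated means, which this sketch does not model)."""
    return mean_nce(current_scores, current_grade) - mean_nce(previous_scores, previous_grade)


# Hypothetical cohort of three students, grade 3 then grade 4.
grade3_scores = [1250.0, 1320.0, 1390.0]
grade4_scores = [1280.0, 1350.0, 1450.0]
print(round(mean_nce_gain(grade3_scores, 3, grade4_scores, 4), 2))
```

A cohort that holds the same relative position in both base-year distributions has an NCE gain near zero, which is the sense in which maintaining position corresponds to one year’s growth.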

Here is the Fall 2009 PVAAS District/School Report
