Evaluating Impacts of MSP Grants


Presentation Transcript


  1. Evaluating Impacts of MSP Grants: Common Issues and Potential Solutions. Ellen Bobronnikov, January 6, 2009

  2. Overview • Purpose of MSP Program • GPRA Indicators • Teacher Content Knowledge • Student Achievement • Evaluation Design • Application of Rubric to Determine Rigor of Evaluation Design • Key Criteria for a Rigorous Design • Common Issues and Potential Solutions

  3. Purpose of MSP Program The MSP Program supports partnerships between STEM faculty at institutions of higher education (IHEs) and teachers in high-need school districts. These partnerships focus on: • Facilitating professional development activities that improve teacher content knowledge, • Improving classroom instruction, and • Improving student achievement. These goals are linked to indicators that the MSP Program must report on annually.

  4. GPRA Indicators for MSP Program Under the Government Performance and Results Act (GPRA), all federal agencies are required to develop indicators in order to report to the U.S. Congress on federal program impacts and outcomes. For the MSP Program, the following indicators have been developed to look at the effects of the program on teacher and student outcomes: Teacher Knowledge • The percentage of MSP teachers who significantly increase their content knowledge as reflected in project-level pre- and post-assessments. Student Achievement • The percentage of students in classrooms of MSP teachers who score at the basic/proficient level or above on State assessments of mathematics or science. Note: The information necessary to report on these indicators is taken directly from the Annual Performance Report (APR).

  5. GPRA Indicators for MSP Program (continued) In order to provide information about the impact of the MSP intervention on teacher and student outcomes, a rigorous evaluation design is necessary. The following indicator gets at design issues. Evaluation Design • The percentage of MSP projects that use an experimental or quasi-experimental design for their evaluations that are conducted successfully and that yield scientifically valid results.

  6. Measuring GPRA Indicators – Evaluation Design Criteria for Evaluating Designs: • We apply the Criteria for Classifying Designs of MSP Evaluations (hereafter referred to as “the rubric”) to projects to determine which projects have rigorous evaluations. • The rubric sets the minimum criteria for an MSP evaluation to be considered rigorous. It includes seven criteria, and an evaluation has to meet each of the seven in order to meet the GPRA indicator. • Based on our previous experience, one of the most common issues in meeting all of the criteria is missing data. Therefore, throughout this presentation, we will note the information we need to apply the rubric. Information Sources: • We apply the rubric to final-year projects only. • We primarily use the information contained in the final evaluation reports, but we compare it to the evaluation data contained in the APRs, and the data do not always agree. It is important to ensure the information contained in all sources is consistent and that the information contained in the final evaluation report is complete.

  7. Rubric Criteria • Type of design – needs to be experimental or quasi-experimental with a comparison group • Equivalence of groups at baseline – for quasi-experimental designs, groups should be matched at baseline on variables related to key outcomes • Sufficient sample size – to detect a real impact rather than chance findings • Quality of measurement instruments – need to be valid and reliable • Quality of data collection methods – methods, procedures, and timeframes used to collect the key outcome data need to be comparable for both groups • Attrition rates – at least 70% of the original sample retained overall, and no more than 15% differential attrition between groups (or the difference is accounted for in the statistical model) • Relevant statistics reported – treatment and comparison group post-test means, and tests of statistical significance for key outcomes

  8. Applying the Rubric – Type of Design 1. Type of Design • To determine impact on teacher and student outcomes, an evaluation needs to use an experimental or quasi-experimental design with a comparison group. Common Issues: • Many projects used one-group pre-post studies. These do not account for changes that would have occurred naturally in the absence of the intervention. Potential Solutions: • Adding a comparison group makes the study much more rigorous.

  9. Applying the Rubric – Baseline Equivalence 2. Baseline Equivalence of Groups (Quasi-Experimental Only) • Demonstration of no significant differences between the treatment and comparison groups at baseline on variables related to the study’s key outcomes. • Pre-test scores should be provided for treatment and comparison groups. • A statistical test of differences should be applied to the treatment and comparison groups.

  10. Applying the Rubric – Baseline Equivalence Common Issues: • No pre-test information on outcome-related measures. • Pre-test results given for the treatment and comparison groups, but no tests of between groups differences. Potential Solutions: • Administer pre-test to both groups and test for differences between groups. • Alternatively, provide means, standard deviations, and sample sizes of pretest scores for both groups, so differences can be tested. • If there were differences at baseline, control for the differences between groups in statistical analyses.
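As an illustration of the second potential solution above, the sketch below shows how baseline differences can be tested when only group means, standard deviations, and sample sizes are reported. It uses Python with SciPy, and all numbers are hypothetical placeholders rather than values from any MSP project.

```python
# Baseline equivalence check from reported summary statistics only.
# All numbers are hypothetical; substitute the pre-test statistics
# reported for the treatment and comparison groups.
from scipy.stats import ttest_ind_from_stats

t_stat, p_value = ttest_ind_from_stats(
    mean1=42.1, std1=8.3, nobs1=55,   # treatment group pre-test
    mean2=41.4, std2=8.9, nobs2=60,   # comparison group pre-test
    equal_var=False,                  # Welch's test; no equal-variance assumption
)

print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
if p_value < 0.05:
    print("Groups differ at baseline; control for pre-test scores in the impact analysis.")
else:
    print("No significant baseline difference detected on this measure.")
```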

  11. Applying the Rubric – Sample Size 3. Sample Size • Sample size is adequate • Based on a power analysis with the recommended parameters: • significance level = 0.05 • power = 0.8 • minimum detectable effect informed by the literature or otherwise justified
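The power analysis itself takes only a few lines; the sketch below uses Python with statsmodels and the recommended significance level and power. The 0.4 standard-deviation effect size is an assumed placeholder, so a real project would justify its minimum detectable effect from prior literature.

```python
# Required sample size per group for a two-group comparison,
# using the recommended alpha = 0.05 and power = 0.8.
# The effect size (0.4 SD) is assumed for illustration only.
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(
    effect_size=0.4,            # minimum detectable effect (Cohen's d)
    alpha=0.05,                 # significance level
    power=0.8,                  # statistical power
    alternative="two-sided",
)
print(f"Approximately {n_per_group:.0f} participants are needed per group.")
```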

  12. Applying the Rubric – Sample Size Common Issues: • Power analyses rarely conducted. • Different sample sizes given throughout the APR and Evaluation Report. • Sample sizes and subgroup sizes not reported for all teacher and student outcomes or are reported inconsistently. Potential Solutions: • Conduct power analyses. • Provide sample sizes for all groups and subgroups.

  13. Applying the Rubric – Measurement Instruments 4. Quality of the Measurement Instruments • The study used existing data collection instruments that had already been deemed valid and reliable to measure key outcomes; or • Data collection instruments developed specifically for the study were sufficiently pre-tested with subjects who were comparable to the study sample, and instruments were found to be valid and reliable.

  14. Applying the Rubric – Measurement Instruments Common Issues: • Locally developed instruments are not tested for validity or reliability. • An instrument is identified as “not tested for validity or reliability” in the APR, but the instrument was a pre-existing one that had already been tested for validity and reliability. • Projects use many instruments but do not report validity or reliability for all of them. • Assessments are aligned with the intervention (this provides an unfair advantage to treatment participants). Potential Solutions: • Report validity and reliability for all instruments. If an instrument was designed for the study, conduct a validity and reliability study (see the sketch below). • If using a pre-existing instrument, cite the validity and reliability of the instrument. If using parts of existing instruments, consider using full subscales rather than selecting a limited number of items. • Do not use instruments that may provide an “unfair advantage” to a particular group.
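For a locally developed instrument, one common piece of a reliability study is an internal-consistency estimate from pilot data. The sketch below is a minimal Python example of Cronbach's alpha, assuming item responses are arranged as a respondents-by-items array; the pilot scores are hypothetical.

```python
import numpy as np

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a respondents-by-items matrix of item scores."""
    item_scores = np.asarray(item_scores, dtype=float)
    n_items = item_scores.shape[1]
    sum_item_variances = item_scores.var(axis=0, ddof=1).sum()
    total_variance = item_scores.sum(axis=1).var(ddof=1)
    return (n_items / (n_items - 1)) * (1 - sum_item_variances / total_variance)

# Hypothetical pilot data: 5 respondents answering 4 items on a 1-5 scale.
pilot = np.array([
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [4, 4, 5, 4],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
])
print(f"Cronbach's alpha = {cronbach_alpha(pilot):.2f}")
```

Values of roughly 0.7 or higher are commonly cited as acceptable internal consistency, though reliability evidence should still come from respondents comparable to the study sample, as the criterion above notes.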

  15. Applying the Rubric – Data Collection Methods 5. Quality of the Data Collection Methods • The methods, procedures, and timeframes used to collect the key outcome data from treatment and comparison groups were comparable. Common Issues: • Little or no information is provided about data collection, or it is provided for the treatment group only. • The timing of the tests was not comparable for treatment and comparison groups. Potential Solutions: • Provide the names and timing of all assessments given to both groups.

  16. Applying the Rubric – Attrition 6. Attrition • Need to retain at least 70% of original sample; AND • Show that if there is differential attrition of more than 15% between groups, it is accounted for in the statistical model. Common Issues: • Attrition information is typically not reported, or is reported for treatment groups only. • Sample and subsample sizes are not reported for all groups or are reported inconsistently. Potential Solutions: • Provide initial and final sample sizes for all groups and subgroups.
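As a minimal sketch with hypothetical counts, the two attrition thresholds above can be checked directly from the initial and final sample sizes for each group, which is why reporting those counts matters.

```python
# Attrition check against the rubric thresholds (hypothetical counts).
initial = {"treatment": 80, "comparison": 75}
final = {"treatment": 62, "comparison": 66}

retention = {group: final[group] / initial[group] for group in initial}
overall_retention = sum(final.values()) / sum(initial.values())
differential_attrition = abs(retention["treatment"] - retention["comparison"])

print(f"Overall retention: {overall_retention:.0%} (at least 70% is required)")
print(f"Differential attrition: {differential_attrition:.0%} "
      f"(more than 15% must be accounted for in the statistical model)")
```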

  17. Applying the Rubric – Statistics Reported 7. Relevant Statistics Reported • Include treatment and comparison group post-test means and tests of significance for key outcomes; OR • Provide sufficient information for calculation of statistical significance (e.g., mean, sample size, standard deviation/standard error).

  18. Applying the Rubric – Statistics Reported Common Issues: • Projects report that the results were significant or non-significant but do not provide supporting data. • Projects provide p-values but do not provide means or standard deviations. • Projects report gain scores for the treatment and comparison groups but do not provide between-group tests of significance. Potential Solutions: • Provide full data (means, sample sizes, and standard deviations/errors) for treatment and comparison groups on all key outcomes. • Provide complete information about statistical tests that were performed for both groups.
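To make the first potential solution concrete, the sketch below shows the kind of complete reporting the rubric looks for: group means, standard deviations, sample sizes, and a between-group significance test. The post-test scores are hypothetical, and the test shown (Welch's t-test in Python with SciPy) is just one reasonable choice of analysis.

```python
import numpy as np
from scipy.stats import ttest_ind

# Hypothetical post-test scores for one key outcome.
treatment = np.array([78, 82, 75, 88, 91, 70, 84, 79, 86, 77])
comparison = np.array([74, 69, 80, 72, 77, 68, 75, 71, 79, 73])

for name, scores in [("Treatment", treatment), ("Comparison", comparison)]:
    print(f"{name}: n = {scores.size}, mean = {scores.mean():.1f}, SD = {scores.std(ddof=1):.1f}")

t_stat, p_value = ttest_ind(treatment, comparison, equal_var=False)
print(f"Between-group test: t = {t_stat:.2f}, p = {p_value:.3f}")
```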

  19. Projects with Rigorous Designs Projects that meet all of the rubric criteria will be able to make a more accurate determination of the impact of their program on teacher and student outcomes.
