Evaluation Research Issues, Methods, and Opportunities William H Fisher, PhD Center for Mental Health Services Research
What is Evaluation Research? • Evaluation is the systematic assessment of the worth or merit of something • In our context – systematic assessment of a policy, practice or service intervention. • Many of the same research principles apply – just in more complicated ways
Types of Evaluation Research • outcome evaluations investigate whether the program or technology caused demonstrable effects on specifically defined target outcomes • impact evaluation is broader and assesses the overall or net effects -- intended or unintended -- of the program or technology as a whole • i.e. – it did this, but it also did that • cost-effectiveness and cost-benefit analysis address questions of efficiency by standardizing outcomes in terms of • dollar costs and values • “social costs” – we reduced hospitalizations but increased incarcerations • “opportunity costs” – because we did “A” we couldn’t do “B” • secondary analysis reexamines existing data to address new questions or use methods not previously employed • can be used to do all of the above relatively inexpensively
A little history • 1965 – the big year for evaluation research • Great Society Programs • Recognition at HEW of social science • The “Head Start Program” – an early example • Question – did kids in that program benefit compared to those not in the program? • Donald Campbell (1975) “Reforms as Experiments” • Mental Health arena – Community Support Program • Did people benefit from the services they received? • Did community-based services keep people out of the hospital?
Evaluation Research vs. “Regular Research” • Traditional research – the scientific method: Theory -- Hypotheses -- Operationalization -- Analysis • Evaluations • Scientific method, but not always theory driven • Less theory; instead, “logic models” (which may sometimes be theory driven) • A logic model is a guide to what the intervention is going to do and what its goals are
Methodological Issues in Evaluation Research • Gold Standard of Experimental Designs – Randomized Clinical Trials – often not feasible • Over the years, new approaches have been developed – QUASI-EXPERIMENTAL DESIGNS
Counterfactual Explanation • Basic thrust of evaluation: determine what the world would look like if the event, intervention, law, etc. hadn’t been implemented
Quasi-Experiments and “True” Experiments True Experiments: The Randomized Controlled Trial • RCTs function to minimize threats to validity – i.e., factors that can contaminate one’s study • Randomize to “arms” from a pool of individuals who meet pre-specified criteria • Able to isolate effects of change attributable to the intervention – everything else is controlled OOOOO --------- “Treatment” ----------- OOOOOOOO OOOOO -------- “No Treatment” ----------- OOOOOOOO
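To make the randomization step concrete, here is a minimal Python sketch of assigning a pre-screened pool to two arms; the participant IDs and function name are illustrative placeholders, not part of any specific trial protocol.

import random

def randomize_to_arms(eligible_ids, seed=42):
    # Shuffle the eligible pool reproducibly and split it into two equal arms
    rng = random.Random(seed)
    ids = list(eligible_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    return {"treatment": ids[:half], "control": ids[half:]}

# Example: 20 hypothetical participants who met the pre-specified criteria
arms = randomize_to_arms([f"P{i:03d}" for i in range(1, 21)])
print(len(arms["treatment"]), len(arms["control"]))  # 10 10

Because assignment is by chance alone, any remaining differences between the arms should be due to the intervention plus random variation, which is what lets the RCT isolate the treatment effect.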
Quasi-Experimental designs • The “bread and butter” of evaluation research • Missing one or more features of “true experiments” • Methods aimed at minimizing the extent of these factors • Vary in “strength”; which one is used depends on what opportunities there are for data collection
Goal: Reduce Threats to Validity • Internal Validity • External Validity
Threats to Internal Validity • “Internal validity” refers to problems with the design that lead to inadequate control of extraneous variation. • For example: • Sampling bias – control and treatment groups not sufficiently comparable (e.g., one group all male, the other all female) • History effects – factors in the environment of the treatment or intervention that could have affected the outcome independently of it, e.g.: • a change in reimbursement rates or hospital admission policies • the advent of a new medication • an economic downturn during an employment program
Threats to External Validity • Inability to generalize findings because • Poor internal validity – the study is so badly designed that it can’t be generalized to another setting • The study population or site is very idiosyncratic (which comes up occasionally in evaluations) • May be okay if it’s just a local effect you’re looking for and you’re not planning to publish
Examples of Some Quasi-Experimental Designs Post-test only, no control (weakest) X o • Very weak design • Sometimes called a “case study” • “Here’s what happened when we did ‘X’” • Useful for guiding other studies • Needs to be couched within lots of caveats
Pre-Post, No Control oooo X oooo • Time series analysis is an example • Interrupted Time Series Analysis • Get a measure of the outcome for time periods before and after the intervention • The “interruption” is the intervention • Good for looking at legal or policy changes
Questions with Time Series • How strong is the effect? • Is the effect • Abrupt or gradual? • Temporary or permanent?
Statistical Methods with Time Series Analysis • Depends on type and nature of data • ARIMA, Poisson Regression • All take the same approach, built on the counterfactual explanation • Based on previous patterns in the time series, • Forecast the future (i.e., what it would be in the absence of the intervention) • Compare observed post-intervention with forecast
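As one illustration of this forecast-and-compare logic, here is a minimal sketch using an ARIMA model from statsmodels; the series, intervention date, and model order are placeholder assumptions to be chosen from the actual data (e.g., after checking the seasonality and autocorrelation issues noted below).

import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

def its_counterfactual(series, intervention_date, order=(1, 0, 0)):
    # Split the series at the intervention, model the pre-intervention pattern,
    # forecast what would have happened without the intervention, and compare.
    cutoff = pd.Timestamp(intervention_date)
    pre = series[series.index < cutoff]            # pre-intervention observations
    post = series[series.index >= cutoff]          # observed post-intervention values
    fit = ARIMA(pre, order=order).fit()            # e.g., an AR(1) model of the prior pattern
    forecast = fit.get_forecast(steps=len(post))
    counterfactual = forecast.predicted_mean       # the forecast "no intervention" world
    effect = post.values - counterfactual.values   # observed minus expected
    return counterfactual, effect

Here `series` would be something like monthly hospital admission counts indexed by date; the gap between the observed post-intervention values and the forecast is the estimated intervention effect.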
Issues in Time Series • Seasonality • Secular trends – long-term trends – years, not months • Autocorrelation – neighboring data points may be related • Data on a long pre-intervention period is good • A total of around 50 observations is a common benchmark
Pre-Post, Non-Equivalent Control Group • One of the better designs • E.g., comparing two types of case management where you can’t randomize oooooooooo X ooooooooooo oooooooooo Y ooooooooooo X = treatment, intervention, etc. Y = alternative, placebo, etc.
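One common way to analyze this design (not named on the slide, so treat it as an illustrative choice) is a difference-in-differences comparison: the change in the group that got X minus the change in the group that got Y. A minimal sketch with hypothetical group means:

def difference_in_differences(pre_treat, post_treat, pre_ctrl, post_ctrl):
    # Change in the treated group minus change in the non-equivalent control group
    return (post_treat - pre_treat) - (post_ctrl - pre_ctrl)

# Hypothetical mean outcome scores for the two case-management conditions
effect = difference_in_differences(pre_treat=42.0, post_treat=55.0,
                                   pre_ctrl=41.0, post_ctrl=46.0)
print(effect)  # 8.0 -> the extra change seen under X, beyond the change under Y

The control group’s change stands in for what the treated group would have done anyway – the counterfactual again – which is why the comparability of the two groups matters so much.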
Example: Northampton Consent Decree Study • Federal court deinstitutionalization order in Western Massachusetts but not elsewhere • Compared Northampton State Hospital and Worcester State Hospital • Argued that the two areas were only “sort of different” except for the consent decree • Looked at changes in state hospital use and other factors
Which design to choose • What resources are available • Research assistants, etc • Funding • Time lines • What data are already available? • How accessible are subjects? • Can they be assessed before a treatment or intervention occurs? • Is there a sensible choice for a control? • e.g., Central Mass as control for Western Mass
Evaluating interventions in “real time” • Common evaluation activity • E.g., SAMHSA grants often include a requirement that an evaluation be done (usually with inadequate resources) • E.g., a wellness program is being initiated at a local mental health center • MISSION-DIRECT VET • This will be monitored
What do we need to do? • Operationalize expectations into measurable outcomes • Develop plans for: • collecting data • examining “what happens” as the project goes forward – “process evaluation”
Process Analysis • What really happened? • Was the intervention delivered as proposed? • Was the design (sampling, followup, etc.) implemented as proposed? • Need to understand how the implementation actually occurs so that the outcome is more interpretable.
Is the Intervention what we said it would be? Fidelity • Many interventions have a strong evidence base • Some – for example, the Program for Assertive Community Treatment (PACT) – have formal “fidelity measures.” • If you say you’re doing PACT, the program must include a specified set of elements; otherwise you’re not doing PACT
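As a purely hypothetical sketch of how a fidelity checklist might be scored – the element names below are illustrative placeholders, not the actual PACT fidelity scale:

# Illustrative required elements for a model program (placeholders, not the real PACT items)
REQUIRED_ELEMENTS = {"multidisciplinary_team", "low_caseload_ratio",
                     "24_hour_coverage", "in_vivo_services"}

def fidelity_score(program_elements):
    # Fraction of required model elements the program actually delivers
    present = REQUIRED_ELEMENTS & set(program_elements)
    return len(present) / len(REQUIRED_ELEMENTS)

score = fidelity_score({"multidisciplinary_team", "in_vivo_services"})
print(f"Fidelity: {score:.0%}")  # 50% -> the program is not faithful to the model it claims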
Evaluation Research as an Activity • Major call for data-driven decision making • Scarce resources – can’t be wasted on things that don’t work or work poorly • Evaluation research now thrives as a major focus for researchers • American Evaluation Association • Evaluation Research journal devoted to methods and examples from multiple fields
Issues for Academic Researchers • Working with public agencies and other service providers can be rewarding and interesting • Cultures are different • Specific issues: “Quality Assurance/Evaluation” vs. “Research” • Who owns the data? • IRB issues
Some consumers of evaluation research may not be interested in a fancy study
Publishing • Academic researchers want to publish • Need to straighten out issues in advance • Who can publish? • Does the organizational partner want to be involved? • Review of findings? • “Censorship”
Final thoughts • Evaluation research makes a significant contribution to behavioral health research • Informs policy and practice • Our department recognizes its importance • Lots of opportunities