RealWorld Evaluation: Designing Evaluations under Budget, Time, Data and Political Constraints. International Perspectives on Impact Evaluation, Professional pre-session workshop #5, Cairo, 29 March 2009. Facilitated by Jim Rugh. Note: This PowerPoint presentation and the summary chapter of the book are available at: www.RealWorldEvaluation.org
Workshop Objectives 1. The seven steps of the RealWorld Evaluation approach for addressing common issues and constraints faced by evaluators such as: when the evaluator is not called in until the project is nearly completed and there was no baseline nor comparison group; or where the evaluation must be conducted with inadequate budget and insufficient time; and where there are political pressures and expectations for how the evaluation should be conducted and what the conclusions should say
Workshop Objectives • Defining what impact evaluation should be • Identifying and assessing various design options that could be used in a particular evaluation setting • Ways to reconstruct baseline data when the evaluation does not begin until the project is well advanced or completed. • How to identify and to address threats to the validity or adequacy of quantitative, qualitative and mixed methods designs with reference to the specific context of RealWorld evaluations
Workshop Objectives Note: This workshop will focus on project-level impact evaluations. There are, of course, many other purposes, scopes, evaluands and types of evaluations. Some of these methods may apply to them, but our examples will be based on project impact evaluations, most of them in the context of developing countries.
Workshop agenda 1. Introduction [15 minutes] 2. Quick summary of the RealWorld Evaluation (RWE) approach [30 minutes] 3. Small group self-introductions and sharing of RWE issues you have faced in your own practice. [45 minutes] 4. Scoping the evaluation and identifying budget and time constraints, also logic models, evaluation designs [60 minutes] --- short break --- 5. Addressing data constraints [30 minutes] 6. Mixed methods [30 minutes] 7.a. Group exercise: preparing an evaluation design when working under budget, time, data or political constraints. The cases will also illustrate the different evaluation agendas and perspectives of evaluation consultants, project implementers and funding agencies. [30 minutes] --- lunch [60 minutes] --- 7.b. Small group work continues [60 minutes] 8. Plenary: Identifying threats to validity [30 minutes] 9. Paired groups negotiate their ToRs [45 minutes] 10-11-12: Feedback, wrap-up discussion, evaluation of the workshop
RealWorld Evaluation: Designing Evaluations under Budget, Time, Data and Political Constraints. OVERVIEW OF THE RWE APPROACH
RealWorld Evaluation Scenarios Scenario 1: Evaluator(s) not brought in until near end of project For political, technical or budget reasons: • There was no baseline survey • Project implementers did not collect adequate data on project participants at the beginning or during the life of the project • It is difficult to collect data on comparable control groups
RealWorld Evaluation Scenarios Scenario 2: The evaluation team is called in early in the life of the project But for budget, political or methodological reasons: • The ‘baseline’ was a needs assessment, not comparable to eventual evaluation • It was not possible to collect baseline data on a comparison group
Reality Check – Real-World Challenges to Evaluation • All too often, project designers do not think evaluatively – evaluation not designed until the end • There was no baseline – at least not one with data comparable to evaluation • There was/can be no control/comparison group. • Limited time and resources for evaluation • Clients have prior expectations for what the evaluation findings will say • Many stakeholders do not understand evaluation; distrust the process; or even see it as a threat (dislike of being judged)
RealWorld Evaluation Quality Control Goals • Achieve maximum possible evaluation rigor within the limitations of a given context • Identify and control for methodological weaknesses in the evaluation design • Negotiate with clients trade-offs between desired rigor and available resources • Presentation of findings must recognize methodological weaknesses and how they affect generalization to broader populations
The Need for the RealWorld Evaluation Approach • As a result of these kinds of constraints, many of the basic principles of rigorous impact evaluation design (comparable pre-test-post test design, control group, adequate instrument development and testing, random sample selection, control for researcher bias, thorough documentation of the evaluation methodology etc.) are often sacrificed.
The RealWorld Evaluation Approach An integrated approach to ensure acceptable standards of methodological rigor while operating under realworld budget, time, data and political constraints. See handout summary chapter extracted from RealWorld Evaluation book for more details
The RealWorld Evaluation approach • Developed to help evaluation practitioners and clients • managers, funding agencies and external consultants • Still a work in progress (more to be learned) • Originally designed for developing countries, but equally applicable in industrialized nations
Special Evaluation Challenges in Developing Countries • Unavailability of needed secondary data • Scarce local evaluation resources • Limited budgets for evaluations • Institutional and political constraints • Lack of an evaluation culture (though evaluation associations are addressing this) • Many evaluations are designed by and for external funding agencies and seldom reflect local and national stakeholder priorities
Special Evaluation Challenges in Developing Countries Despite these challenges, there is a growing demand for methodologically sound evaluations which assess the impacts, sustainability and replicability of development projects and programs.
Most RealWorld Tools are not New—Only the Integrated Approach is New • Most of the RealWorld Evaluation data collection and analysis tools will be familiar to most evaluators • What is new is the integrated approach which combines a wide range of tools to produce the best quality evaluation under realworld constraints
Who Uses RealWorld Evaluation and When? • Two main users: • evaluation practitioners • managers, funding agencies and external consultants • The evaluation may start at: • the beginning of the project • after the project is fully operational • during or near the end of project implementation • after the project is finished
What is Special About the RealWorld Evaluation Approach? • There is a series of steps, each with checklists for identifying constraints and determining how to address them • These steps are summarized on the following slide and then the more detailed flow-chart … (See page 6 of handout)
The Steps of the RealWorld Evaluation Approach Step 1: Planning and scoping the evaluation Step 2: Addressing budget constraints Step 3: Addressing time constraints Step 4: Addressing data constraints Step 5: Addressing political constraints Step 6: Assessing and addressing the strengths and weaknesses of the evaluation design Step 7: Helping clients use the evaluation
The RealWorld Evaluation Approach
Step 1: Planning and scoping the evaluation • Defining client information needs and understanding the political context • Defining the program theory model • Identifying time, budget, data and political constraints to be addressed by the RWE • Selecting the design that best addresses client needs within the RWE constraints
Step 2: Addressing budget constraints A. Modify evaluation design B. Rationalize data needs C. Look for reliable secondary data D. Revise sample design E. Economical data collection methods
Step 3: Addressing time constraints All Step 2 tools plus: F. Commissioning preparatory studies G. Hire more resource persons H. Revising format of project records to include critical data for impact analysis I. Modern data collection and analysis technology
Step 4: Addressing data constraints • Reconstructing baseline data • Recreating comparison groups • Working with non-equivalent comparison groups • Collecting data on sensitive topics or from difficult to reach groups • Multiple methods
Step 5: Addressing political influences A. Accommodating pressures from funding agencies or clients on evaluation design B. Addressing stakeholder methodological preferences C. Recognizing influence of professional research paradigms
Step 6: Assessing and addressing the strengths and weaknesses of the evaluation design An integrated checklist for multi-method designs A. Objectivity/confirmability B. Replicability/dependability C. Internal validity/credibility/authenticity D. External validity/transferability/fittingness
Step 7: Helping clients use the evaluation A. Utilization B. Application C. Orientation D. Action
Self-introductions What constraints of these types have you faced in your evaluation practice? How did you cope with them?
RealWorld Evaluation: Designing Evaluations under Budget, Time, Data and Political Constraints. The challenge of the counterfactual
Attribution and counterfactuals How do we know if the observed changes in the project participants or communities (income, health, attitudes, school attendance, etc.) are due to the implementation of the project (credit, water supply, transport vouchers, school construction, etc.) or to other unrelated factors (changes in the economy, demographic movements, other development programs, etc.)?
The Counterfactual • What would have been the condition of the project population at the time of the evaluation if the project had not taken place?
Where is the counterfactual? After families had been living in a new housing project for 3 years, a study found that average household income had increased by 50%. Does this show that housing is an effective way to raise income?
Comparing the project with two possible comparison groups [Figure: income (vertical axis, ticks at 250, 500, 750) from 2000 to 2002. Project group: 50% increase. Scenario 1: no increase in comparison group income; potential evidence of project impact. Scenario 2: 50% increase in comparison group income; no evidence of project impact.]
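The two scenarios can be checked with simple arithmetic. The sketch below is illustrative only: the income figures are hypothetical values chosen to match a 50% rise, and the helper names `percent_change` and `net_change` are assumptions, not part of the workshop material.

```python
def percent_change(before: float, after: float) -> float:
    """Percentage change from the first measurement to the second."""
    return (after - before) / before * 100.0


def net_change(project: tuple, comparison: tuple) -> float:
    """Project group's percent change minus the comparison group's."""
    return percent_change(*project) - percent_change(*comparison)


# Hypothetical average household incomes as (year 2000, year 2002).
project = (500.0, 750.0)     # project group: 50% increase
scenario_1 = (500.0, 500.0)  # comparison group with no increase
scenario_2 = (500.0, 750.0)  # comparison group with a 50% increase

print(net_change(project, scenario_1))  # 50.0 -> potential evidence of impact
print(net_change(project, scenario_2))  # 0.0  -> no evidence of impact
```

The same 50% rise in the project group supports opposite conclusions depending on what happened to the comparison group, which is why the project figure alone says little about impact.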
5 main evaluation strategies for addressing the counterfactual Randomized designs I. True experimental designs II. Randomized selection of participants & control Quasi-experimental designs III. Strong quasi-experimental designs IV. Weaker quasi-experimental designs Non-experimental designs V. No logically defensible counterfactual
The most rigorous statistical designs: Randomized experimental or at least strong quasi-experimental evaluation designs. Subjects are randomly assigned to the project and control groups, or the control group is selected using statistical or judgmental matching. Conditions of both groups are not controlled during the project. Gain score [impact] = (P2 – P1) – (C2 – C1)
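Written out in symbols, the gain score might look as follows. This is a sketch under two assumptions: that P and C stand for the project and control (or comparison) group averages on the impact indicator, with subscripts 1 and 2 for the pre-test and post-test measurements, and that the intended formula is the standard difference-in-differences.

```latex
\[
\text{Gain score (impact)} = (P_2 - P_1) - (C_2 - C_1)
\]
```

With hypothetical values, if the project group rises from 500 to 750 while the control group rises from 500 to 600, the gain score is (750 - 500) - (600 - 500) = 150.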
Control group and comparison group • Control group = randomized allocation of subjects to project and non-treatment group • Comparison group = separate procedure for sampling project and non-treatment groups that are as similar as possible in all aspects except the treatment (intervention)
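The distinction can be sketched in code. This is a minimal illustration, not a recommended sampling procedure: the function names, the single matching characteristic (`income`) and the sample records are all hypothetical.

```python
import random


def randomize(subjects, seed=0):
    """Control-group logic: allocate subjects at random to project or control."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def match_comparison(project_units, candidates, key):
    """Comparison-group logic: pair each project unit with the most similar
    unused non-project candidate on one observed characteristic."""
    available = list(candidates)
    pairs = []
    for unit in project_units:
        if not available:
            break
        best = min(available, key=lambda c: abs(c[key] - unit[key]))
        available.remove(best)
        pairs.append((unit, best))
    return pairs


# Hypothetical households, matched on baseline income only.
project_units = [{"id": "P1", "income": 400}, {"id": "P2", "income": 650}]
candidates = [
    {"id": "C1", "income": 700},
    {"id": "C2", "income": 390},
    {"id": "C3", "income": 1000},
]

project_group, control_group = randomize(range(10), seed=1)
pairs = match_comparison(project_units, candidates, key="income")
print([(p["id"], c["id"]) for p, c in pairs])  # [('P1', 'C2'), ('P2', 'C1')]
```

Random allocation balances observed and unobserved characteristics on average, whereas matching only balances the characteristics chosen for the match, so a real comparison group would normally be matched on several variables.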
Reference sources for randomized field trial designs 1. MIT Poverty Action Lab www.povertyactionlab.org 2. Center for Global Development “When will we ever learn?” http://www.cgdev.org/content/publications/detail/7973
The limited use of strong evaluation designs • In the realworld we estimate that • Less than 5-10% of impact evaluations use a strong quasi-experimental design • Significantly less than 5% use randomized control trials (experimental design)
There are other methods for assessing the counterfactual • Reliable secondary data that depicts relevant trends in the population • Longitudinal monitoring data (if it includes non-reached population) • Qualitative methods to obtain perspectives of key informants, participants, neighbors, etc. • We’ll talk more about this in the 5th session
RealWorld Evaluation: Designing Evaluations under Budget, Time, Data and Political Constraints. Step 1 PLANNING AND SCOPING THE EVALUATION
Step 1: Planning and Scoping the Evaluation • Understanding client information needs • Defining the program theory model • Preliminary identification of constraints to be addressed by the RealWorld Evaluation
A. Understanding client information needs Typical questions clients want answered: • Is the project achieving its objectives? • Are all sectors of the target population benefiting? • Are the results sustainable? • Which contextual factors determine the degree of success or failure?
A. Understanding client information needs A full understanding of client information needs can often reduce the types of information collected and the level of detail and rigor necessary. However, this understanding could also increase the amount of information required!
B. Defining the program theory model All programs are based on a set of assumptions (hypotheses) about how the project’s interventions should lead to desired outcomes. • Sometimes this is clearly spelled out in project documents. • Sometimes it is only implicit and the evaluator needs to help stakeholders articulate the hypotheses through a logic model.
B. Defining the program theory model • Defining and testing critical assumptions is an essential (but often ignored) element of program theory models. • The following is an example of a model to assess the impacts of microcredit on women’s social and economic empowerment
Critical Hypothesis for a Gender-Inclusive Micro-Credit Program • Outputs • If credit is available women will be willing and able to obtain loans and technical assistance. • Short-term outcomes • If women obtain loans they will start income-generating activities. • Women will be able to control the use of loans and reimburse them. • Medium/long-term impacts • Economic and social welfare of women and their families will improve. • Increased women’s economic and social empowerment. • Sustainability • Structural changes will lead to long-term impacts.
[Figure: generic problem tree. Consequences appear above the PROBLEM. Below it are Primary cause 1, Primary cause 2 and Primary cause 3. Primary cause 2 branches into Secondary causes 2.1, 2.2 and 2.3. Secondary cause 2.2 branches into Tertiary causes 2.2.1, 2.2.2 and 2.2.3.]
[Figure: example problem tree. Problem: High infant mortality rate. Causes shown: Children are malnourished • Insufficient food • Diarrheal disease • Poor quality of food • Need for improved health policies • Contaminated water • Unsanitary practices • Flies and rodents • Do not use facilities correctly • People do not wash hands before eating]
[Figure: example results hierarchy for a microcredit program. Elements shown: Reduction in poverty • Women empowered • Women in leadership roles • Women able to reimburse loans • Women educated • Improved economic conditions • Women achieve rights within household • Credit provided to entrepreneurs • S&L groups organized • MFI provides credit • Training of agents]
What does it take to measure indicators at each level? Program Impact: Population-based survey (program baseline; program evaluation some time after projects are completed). Project Impact: Population-based survey (baseline, evaluation). Effect: a) Follow-up survey of participants (can be done annually); b) Population-based survey (usually only during baseline and evaluation). Output: Measured by project staff annually. Activities: On-going (monitoring). Inputs: On-going (financial accounts).
We need to recognize which evaluative process is most appropriate for measurement at various levels. Levels: Impact • Effect • Output • Activities • Inputs. Evaluative processes: PROGRAM EVALUATION • PROJECT EVALUATION • PERFORMANCE MONITORING
Coming to agreement on what levels of the logic model to include in evaluation • This can be a sensitive issue: Project staff generally don’t like to be held accountable for more than the output level, while donors (and intended beneficiaries) may insist on evaluating higher-level outcomes. • An approach evaluators might take is that if the correlation between intermediary outcomes (or even qualified outputs) and impact has been adequately established through research and program evaluations, then assessing intermediary outcome-level indicators might suffice, as long as the contexts can be shown to be sufficiently similar to where such hypotheses have been tested.
Determining appropriate (and feasible) evaluation design • Based on an understanding of client information needs, required level of rigor, and what is possible given the constraints, the evaluator and client need to determine what evaluation design is required and possible under the circumstances.
Let’s focus for a while on evaluation design (a quick review) 1: Review different evaluation (experimental/quasi-experimental) designs 2: Develop criteria for determining appropriate Terms of Reference (ToR) for evaluating a project, given its own (planned or un-planned) evaluation design. 3: Use decision tree to make choices of what’s required (or feasible) to include in an evaluation ToR. 4: A life-of-project evaluation design perspective.
An introduction to various evaluation designs: Illustrating the need for quasi-experimental longitudinal time series evaluation design [Figure: scale of major impact indicator (vertical axis) over time (baseline, end of project evaluation, post project evaluation), with separate lines for project participants and the comparison group.]
OK, let’s stop the action to identify each of the major types of evaluation (research) design … one at a time, beginning with the most rigorous design.