
Educational Assessment


Presentation Transcript


  1. Educational Assessment Assessment Issues and Program Evaluation Procedures

  2. Outcomes Assessment • a.k.a. How Do I Know If I’m Doing What I Think I’m Doing? • 1st: Identify what you are trying to do. This may include general outcomes and specific outcomes. For example: • Increase the number of women entering the fields of math and engineering (general) • Improve high school girls’ attitudes about math and engineering (specific) • 2nd: Identify ways to accurately assess whether these outcomes are occurring. • 3rd: Establish a procedure for program evaluation

  3. Identify What You Are Trying To Do • Some examples: • Change attitudes about math and engineering • Increase girls’ sense of self-efficacy in math and engineering • Improve motivation to engage in math and engineering • Increase skills in math and engineering • Increase the number of girls who go on to major in math and engineering from your high school • Increase the number of women who graduate from college with math and engineering majors

  4. Critical Issues for Assessment Tools • Reliability • Consistency of test scores • The extent to which performance is not affected by measurement error • Validity • The extent to which a test actually measures what it is supposed to measure

  5. Types of Reliability • Test-Retest • Correlation of two tests taken on separate occasions by the same individual • Limits: Practice effects, recall of former responses • Alternate Form • Correlation of scores obtained on two parallel forms • Limits: May have practice effects, alternate forms often not available
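
Both test-retest and alternate-form reliability boil down to correlating two sets of scores from the same people. A minimal sketch of that calculation in Python, using made-up scores purely for illustration:

```python
import numpy as np

# Hypothetical scores for the same six students on two occasions
# (or on two parallel forms of the test)
form_a = np.array([72, 85, 90, 65, 78, 88], dtype=float)
form_b = np.array([70, 88, 86, 68, 75, 91], dtype=float)

# The reliability coefficient is the Pearson correlation between the two administrations
reliability = np.corrcoef(form_a, form_b)[0, 1]
print(f"Test-retest / alternate-form reliability: r = {reliability:.2f}")
```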

  6. Types of Reliability • Split-half • Correlation between the two halves of a test • Limits: Halving shortens the test, which lowers reliability; difficult with tests that measure different things within the same test (heterogeneous tests) • Kuder-Richardson and Coefficient Alpha • Inter-item consistency: Average correlation of each item with every other item • Limits: Not useful for heterogeneous tests
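
A sketch of how split-half reliability (with the Spearman-Brown correction) and coefficient alpha can be computed from item-level data; the item matrix below is invented for illustration:

```python
import numpy as np

# Hypothetical item-level responses: rows = respondents, columns = test items (1-5 scale)
items = np.array([
    [4, 5, 4, 3, 5, 4],
    [2, 3, 2, 2, 3, 2],
    [5, 4, 5, 5, 4, 5],
    [3, 3, 4, 3, 3, 3],
    [1, 2, 1, 2, 2, 1],
], dtype=float)

# Split-half: correlate scores on the odd items with scores on the even items,
# then apply the Spearman-Brown correction to estimate full-length reliability
odd_half = items[:, ::2].sum(axis=1)
even_half = items[:, 1::2].sum(axis=1)
r_half = np.corrcoef(odd_half, even_half)[0, 1]
split_half = 2 * r_half / (1 + r_half)

# Coefficient alpha: compares summed item variances to total-score variance
k = items.shape[1]
alpha = (k / (k - 1)) * (1 - items.var(axis=0, ddof=1).sum()
                         / items.sum(axis=1).var(ddof=1))

print(f"Split-half reliability (Spearman-Brown corrected): {split_half:.2f}")
print(f"Coefficient alpha: {alpha:.2f}")
```

The same formula underlies Kuder-Richardson 20, which is coefficient alpha applied to dichotomously scored (right/wrong) items.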

  7. Types of Validity • Content Validity • Checking to make sure that you’ve picked questions that cover the areas you want to cover, thoroughly and well. • Difficulties: “Adequate sampling of the item universe.” Important to ensure that all major aspects are covered by the test items and in the correct proportions • Specific Procedures: Content validity is built into the test from the outset through the choice of appropriate items.

  8. Types of Validity • Concurrent and Predictive Validity • Definition: The relationship between a test and some criterion; the practical validity of a test for a specific purpose. Examples: • Do high school girls who score high on this test go on to succeed in college as engineering majors? (predictive) • Do successful women engineering majors score high on this test? (concurrent) • Difficulties: Criterion contamination; trainers must not know examinees’ test scores • Specific Procedures: Infinite, based on the purpose of the test
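
Criterion-related (predictive or concurrent) validity is also just a correlation, this time between test scores and the criterion of interest. A sketch with invented numbers; the criterion used here, first-year engineering GPA, is only an example of what might be predicted:

```python
import numpy as np

# Hypothetical data: high school attitude-test scores and a later criterion
# (e.g., first-year GPA in an engineering major); names and values are illustrative
test_scores = np.array([55, 72, 64, 80, 47, 69], dtype=float)
criterion   = np.array([2.4, 3.1, 2.9, 3.6, 2.1, 3.0])

# The criterion-related validity coefficient is the correlation with the criterion
validity = np.corrcoef(test_scores, criterion)[0, 1]
print(f"Criterion-related validity: r = {validity:.2f}")
```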

  9. Types of Validity • Construct Validity • Definition: the extent to which the test may be said to measure a theoretical construct or trait • Any data throwing light on the nature of the trait and the conditions affecting its development and manifestations represent appropriate evidence for this validation • Example: I have designed a program to lower girls’ math phobia. The girls who complete my program should have lower scores on the Math Phobia Measure compared to their scores before the program and compared to the scores of girls who have not completed the program

  10. Optimizing Reliability & Validity • Here are some tips for making sure your test will be reliable and valid for your purpose (circumstances that affect reliability and validity): • The more questions the better (the number of test items) • Ask questions several times in slightly different ways (homogeneity) • Get as many people as you can in your program (N) • Get different kinds of people in your program (sample heterogeneity) • (Linear relationship between the test and the criterion)
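
The first tip ("the more questions the better") can be made concrete with the Spearman-Brown prophecy formula, which predicts how reliability rises as a test is lengthened; the starting reliability of 0.60 below is an arbitrary illustration:

```python
def spearman_brown(r: float, factor: float) -> float:
    """Predicted reliability when a test is lengthened by `factor` times."""
    return factor * r / (1 + (factor - 1) * r)

# Hypothetical: a 10-item test with reliability 0.60, doubled and tripled in length
r10 = 0.60
for factor in (1, 2, 3):
    print(f"{10 * factor} items -> predicted reliability {spearman_brown(r10, factor):.2f}")
```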

  11. Selecting and Creating Measures 1. Define the construct(s) that you want to measure clearly 2. Identify existing measures, particularly those with established reliability and validity 3. Determine whether those measures will work for your purpose and identify any areas where you may need to create a new measure or add new questions 4. Create additional questions/measures 5. Identify criteria that your measure should correlate with or predict, and develop procedures for assessing those criteria

  12. Measuring Outcomes • Pre and post tests • Involves giving the measure before the intervention/training and again afterward in order to measure change as a result of the intervention • Important to identify what you are trying to change with your intervention (the constructs) in order to use measures that will pick up that change • Be sure to avoid criterion contamination • Limitations: If your group is preselected for the program, variability will be restricted
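
One common way to test whether pre-to-post change is larger than chance is a paired t-test on the two administrations. A minimal sketch with hypothetical attitude scores, assuming SciPy is available:

```python
import numpy as np
from scipy import stats

# Hypothetical pre- and post-program scores for the same participants
pre  = np.array([40, 52, 47, 38, 55, 44], dtype=float)
post = np.array([48, 58, 50, 45, 60, 49], dtype=float)

# Paired t-test: did scores change, on average, from pre to post?
t, p = stats.ttest_rel(post, pre)
print(f"Mean change = {np.mean(post - pre):.1f}, t = {t:.2f}, p = {p:.3f}")
```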

  13. Measuring Outcome • Follow-up Procedures • These may involve re-administering your pre/post measure after some interval following the end of the program, or assessing any other criterion that should theoretically be predicted by your intervention, such as: • choosing to take math/engineering courses • choosing to major in math/engineering • choosing a career in math/engineering
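
Follow-up criteria like these are often binary (e.g., did the student later choose a math/engineering major?); correlating them with program measures gives a point-biserial validity coefficient. A sketch with invented follow-up data:

```python
import numpy as np

# Hypothetical follow-up data: post-program attitude score and whether the
# student later chose a math/engineering major (1 = yes, 0 = no)
post_score = np.array([48, 58, 50, 45, 60, 49, 62, 41], dtype=float)
chose_stem = np.array([0, 1, 1, 0, 1, 0, 1, 0], dtype=float)

# Point-biserial correlation (Pearson correlation with a binary criterion)
r_pb = np.corrcoef(post_score, chose_stem)[0, 1]
print(f"Post-score vs. later STEM major choice: r = {r_pb:.2f}")
```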

  14. Measuring Outcome Control Groups One critical problem faced by anyone who conducts an intervention is whether any observed changes are related to the intervention or to some other factor (e.g., time, preselection, etc.). The only way to be sure that your intervention is causing the desired changes is to use a control group. The control group must be the same as the treatment group in every way (usually achieved by random assignment to groups), except that the control group does not receive the intervention. Any differences between these groups can then be attributed to the intervention.
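
With a randomly assigned control group, the simplest analysis compares pre-to-post change scores across the two groups; a reliably larger change in the treatment group supports attributing the difference to the intervention. A sketch with hypothetical change scores:

```python
import numpy as np
from scipy import stats

# Hypothetical pre-to-post change scores for randomly assigned groups
treatment_change = np.array([8, 6, 9, 5, 7, 10], dtype=float)
control_change   = np.array([2, 1, 3, 0, 2, 1], dtype=float)

# Independent-samples t-test: is the change larger in the treatment group?
t, p = stats.ttest_ind(treatment_change, control_change)
print(f"Treatment vs. control difference in change: t = {t:.2f}, p = {p:.3f}")
```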

  15. Measuring Outcome • Alternatives to randomly assigned control groups: • Matched controls • Comparison groups • Comparison across programs • Remember, you’ll need to use the same assessment and follow-up procedures for both groups

  16. Comparing Across Programs • In order to compare successfully across programs, you will also need to assess: • Program characteristics • Participant characteristics • So you will need to also ask yourselves: • What are the important aspects of the programs that I should know about? • What are the important characteristics of the girls that I should know about?

  17. An Ideal Outcome Assessment • All participants fill out initial questionnaires • Participants are randomly assigned to conditions • The treatment group receives the intervention; the control group receives no intervention • All participants fill out post-questionnaires • All participants are followed through college and to their first job

  18. A More Realistic Outcome Assessment? • Girls involved in each program fill out pre-tests • Girls participate in the programs • Girls fill out post-questionnaires • Each program reports data plus program characteristics and client characteristics • Programs conducting follow-ups report follow-up data
