This document explores technical issues of validity and reliability in data collection, framed around Quiz 1. It raises concerns about the fairness of using quiz scores as part of grading and stresses the importance of ensuring that tests measure what they are intended to measure. Four types of validity are examined: content, construct, criterion-related, and consequential. Reliability is then analyzed in terms of stability, equivalence, and internal consistency, along with factors that can adversely affect both validity and reliability.
Technical Issues • Two concerns • Validity • Reliability
Data Collection – Quiz 1 • Answer the five questions on Quiz 1.
Data Collection – Quiz 1 Answers • Score your paper using the following key • A • B • A • B • B
Data Collection – Quiz 1 • How well did you do? • Should I use this score as a part of your grade? • Does this score indicate your level as a graduate student? • “What we have here is a serious lack of communication!” • Most students object strongly to using this score as a part of their grade because it isn’t fair. • Most students object strongly to being labeled “bright” or “challenged” on the basis of this score. • Their reasoning is that the test isn’t fair – it doesn’t cover material relevant to this course. • Welcome to the technical world of instrumentation.
Technical Issues • Validity – extent to which interpretations made from a test score are appropriate • Characteristics • The most important technical characteristic • Situation specific • Does not refer to the instrument but to the interpretations of scores on the instrument • Best thought of in terms of degree
Technical Issues • Validity (continued) • Four types • Content – to what extent does the test measure what it is supposed to measure • Item validity • Sampling validity • Determined by expert judgment
Technical Issues • Validity (continued) • Construct – the extent to which a test measures the construct it represents • Underlying difficulty in defining constructs • Estimated in many ways • Criterion-related • Predictive – to what extent does the test predict a future performance • Concurrent – to what extent does the test predict a performance measured at the same time • Estimated by a correlation between the test and the criterion measure (see the sketch below)
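To make the correlation estimate concrete, here is a minimal Python sketch of a criterion-related validity coefficient. The score arrays are hypothetical illustration data, not drawn from the quiz above.

```python
# A minimal sketch: criterion-related validity as a Pearson correlation
# between test scores and a criterion measure. All scores are invented.
import numpy as np

test_scores = np.array([72, 85, 90, 65, 78, 88, 70, 95])  # predictor test
criterion   = np.array([70, 82, 94, 60, 75, 85, 72, 91])  # e.g., a later performance measure

# np.corrcoef returns the 2x2 correlation matrix; the off-diagonal
# entry is the validity coefficient.
validity_coefficient = np.corrcoef(test_scores, criterion)[0, 1]
print(f"Criterion-related validity estimate: {validity_coefficient:.2f}")
```

With a future criterion (e.g., next semester's grades) this estimates predictive validity; with a criterion gathered at the same time, concurrent validity.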
Technical Issues • Validity (continued) • Consequential – to what extent are the consequences that occur from the test harmful • Estimated by empirical and expert judgment • Factors affecting validity • Unclear test directions • Confusing and ambiguous test items • Vocabulary that is too difficult for test takers
Technical Issues • Factors affecting validity (continued) • Overly difficult and complex sentence structure • Inconsistent and subjective scoring • Items covering untaught material • Failure to follow standardized administration procedures • Cheating by participants or teaching to the specific test items
Technical Issues • Reliability – the degree to which a test consistently measures whatever it is measuring • Characteristics • Expressed as a coefficient ranging from 0 to 1 • A necessary but not sufficient condition for validity
Technical Issues • Test reliability • Stability – consistency over time with the same instrument • Test-retest • Estimated by a correlation between the two administrations of the same test • Equivalence – consistency between two parallel forms of the test administered at the same time • Parallel forms • Estimated by a correlation between the parallel forms (see the sketch below)
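The same computation serves both stability and equivalence; only the source of the second score column differs. A minimal sketch, again with invented scores:

```python
# A minimal sketch of a stability (test-retest) estimate: correlate the
# same examinees' scores from two administrations. Data are hypothetical.
import numpy as np

administration_1 = np.array([55, 63, 71, 48, 80, 66, 59, 74])
administration_2 = np.array([58, 60, 73, 50, 78, 68, 61, 70])

stability = np.corrcoef(administration_1, administration_2)[0, 1]
print(f"Test-retest reliability estimate: {stability:.2f}")

# The identical computation, applied to scores from two parallel forms
# given at the same time, yields an equivalence coefficient instead.
```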
Technical Issues • Test reliability (continued) • Internal consistency – consistency among the items within a single administration, e.g., by artificially splitting the test into halves • Several coefficients – split-half, KR-20, KR-21, Cronbach’s alpha • All coefficients provide estimates ranging from 0 to 1 (see the sketch below)
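A minimal sketch of two of these coefficients, using a hypothetical 0/1 item-response matrix; for dichotomous items scored this way, Cronbach’s alpha reduces to KR-20.

```python
# A minimal sketch of two internal-consistency estimates from a single
# administration: split-half (with the Spearman-Brown correction) and
# Cronbach's alpha. The item-response matrix is hypothetical
# (rows = examinees, columns = items, scored 0/1).
import numpy as np

items = np.array([
    [1, 1, 0, 1, 1, 0],
    [1, 0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0],
    [1, 1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1, 1],
])

# Split-half: correlate odd-item totals with even-item totals, then step
# the half-test correlation up to full length with Spearman-Brown.
odd_half  = items[:, 0::2].sum(axis=1)
even_half = items[:, 1::2].sum(axis=1)
r_half = np.corrcoef(odd_half, even_half)[0, 1]
split_half = 2 * r_half / (1 + r_half)

# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / total variance).
k = items.shape[1]
item_variances = items.var(axis=0, ddof=1)
total_variance = items.sum(axis=1).var(ddof=1)
alpha = (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

print(f"Split-half (Spearman-Brown): {split_half:.2f}")
print(f"Cronbach's alpha: {alpha:.2f}")
```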