Validity and Reliability • Define different types of validity • Define reliability
Criteria for Good Instruments Validity • Validity refers to the degree to which a test measures what it is supposed to measure. • Validity is the most important test characteristic.
Criteria for Good Instruments • There are several established types of validity: • Content validity • Criterion-related validity (concurrent and predictive) • Construct validity
Content Validity • Content validity addresses whether the test measures the intended content area; it is sometimes loosely called face validity. • Content validity is the extent to which the test questions are representative of all the questions that could be asked about that content. • Content validity is established through expert assessment and judgment (content validation).
Content Validity • Content validity is concerned with both: • Item validity: Do the individual items measure the intended content? • Sampling validity: Do the items, taken together, cover the full range of the content area being tested? • One example of a lack of content validity is a math test with heavy reading requirements. It may measure not only math but also reading ability and is therefore not a valid math test.
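Expert judgments from content validation are sometimes summarized numerically. One common index (not mentioned on the slides above) is Lawshe's content validity ratio; the sketch below uses hypothetical panel counts to show how it is computed.

```python
# Sketch: Lawshe's content validity ratio (CVR) for expert item ratings.
# CVR = (n_e - N/2) / (N/2), where n_e = experts rating the item "essential"
# and N = total number of experts. Panel data below are hypothetical.

def content_validity_ratio(n_essential: int, n_experts: int) -> float:
    """Return Lawshe's CVR for one item; ranges from -1 to +1."""
    half = n_experts / 2
    return (n_essential - half) / half

# Hypothetical ratings: number of experts (out of 10) calling each item essential.
essential_counts = {"item_1": 9, "item_2": 6, "item_3": 3}

for item, n_e in essential_counts.items():
    print(item, round(content_validity_ratio(n_e, 10), 2))
# Items with CVR near +1 are judged representative of the content domain;
# low or negative values flag items for revision or removal.
```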
Criterion-Related Validity • Criterion-related validity is determined by relating performance on a test to performance on an alternative test or other measure (the criterion).
Criterion-Related Validity • The two types of criterion-related validity are: • Concurrent: The scores on a test are correlated with scores on an alternative test given at the same time (e.g., two measures of reading achievement). • Predictive: The degree to which a test can predict how well a person will do in a future situation (e.g., the GRE, where the predictor is the GRE score and the criterion is success in graduate school).
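Because criterion-related validity is reported as a correlation between test scores and a criterion measure, it can be sketched in a few lines of Python. The GRE and graduate GPA values below are invented purely for illustration.

```python
# Sketch: predictive validity as a correlation between a predictor (test score)
# and a later criterion. Scores below are fabricated for illustration.
import numpy as np

gre_scores = np.array([310, 325, 300, 335, 315, 320, 305, 330])  # predictor
grad_gpa   = np.array([3.2, 3.7, 3.0, 3.9, 3.4, 3.6, 3.1, 3.8])  # criterion

# Pearson correlation: the validity coefficient.
validity_coefficient = np.corrcoef(gre_scores, grad_gpa)[0, 1]
print(f"Predictive validity coefficient: {validity_coefficient:.2f}")

# For concurrent validity, the criterion would simply be scores on an
# alternative test administered at (about) the same time.
```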
Construct Validity • Construct validity is considered the most important form of validity. • Construct validity assesses whether the test actually measures the intended construct and whether the results are meaningful and useful. • Construct validity is very challenging to establish.
Construct Validity • Construct validity requires confirmatory and disconfirmatory evidence. • Scores on tests should relate to scores on similar tests and NOT relate to scores on tests of other constructs. • For example, scores on a math test should be more highly correlated with scores on another math test than they are to scores from a reading test.
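The math/reading example above can be illustrated directly: construct validity is supported when the convergent correlation (two math tests) clearly exceeds the discriminant correlation (math vs. reading). The scores below are hypothetical.

```python
# Sketch: convergent vs. discriminant evidence for construct validity.
# Hypothetical scores for the same students on two math tests and a reading test.
import numpy as np

math_a  = np.array([55, 62, 70, 48, 80, 66, 59, 73])
math_b  = np.array([58, 60, 74, 50, 78, 68, 61, 70])   # similar construct
reading = np.array([72, 50, 65, 60, 55, 75, 68, 58])   # different construct

convergent   = np.corrcoef(math_a, math_b)[0, 1]    # expected to be high
discriminant = np.corrcoef(math_a, reading)[0, 1]   # expected to be much lower

print(f"math A vs. math B:  r = {convergent:.2f}")
print(f"math A vs. reading: r = {discriminant:.2f}")
# Construct validity is supported when the convergent correlation clearly
# exceeds the discriminant one.
```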
Validity • Some factors that threaten validity: • Unclear directions • Confusing or unclear items • Vocabulary or required reading ability too difficult for test takers • Subjective scoring • Cheating • Errors in administration
Reliability • Reliability refers to the consistency with which an instrument measures a construct. • Reliability is expressed as a reliability coefficient based on a correlation. • Reliability coefficients should be reported for all measures. • Reliability affects validity. • There are several forms of reliability.
Reliability • Test-Retest (Stability) reliability measures the stability of scores over time. • To assess test-retest reliability, a test is given to the same group twice and a correlation is computed between the two sets of scores. • This correlation is referred to as the Coefficient of Stability.
Reliability • Alternate forms (Equivalence) reliability measures the relationship between two versions of a test that are intended to be equivalent. • To assess alternate forms reliability, both tests are given to the same group and the scores on each test are correlated. • The correlation is referred to as the Coefficient of Equivalence.
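Both the Coefficient of Stability and the Coefficient of Equivalence described above are simply Pearson correlations between two sets of scores from the same examinees; a minimal sketch with invented scores follows.

```python
# Sketch: Coefficient of Stability (test-retest) or Coefficient of Equivalence
# (alternate forms). Both are Pearson correlations between two score sets
# obtained from the same group. Scores below are invented for illustration.
import numpy as np

administration_1 = np.array([78, 85, 62, 90, 71, 68, 95, 74])  # first testing / Form A
administration_2 = np.array([80, 83, 65, 88, 70, 72, 93, 76])  # retest / Form B

reliability = np.corrcoef(administration_1, administration_2)[0, 1]
print(f"Reliability coefficient: {reliability:.2f}")
```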
Reliability • Internal Consistency reliability represents the extent to which items in a test are similar to one another. • Split-half: The test is divided into halves and a correlation is computed between the scores on the two halves. • Coefficient alpha (Cronbach's alpha) and the Kuder-Richardson formulas estimate the relationships among all items and between each item and the total score of the test.
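Internal consistency can be sketched from a persons-by-items score matrix. The data below are hypothetical, and the split-half estimate is stepped up with the Spearman-Brown formula, a standard adjustment that the slide does not name explicitly.

```python
# Sketch: internal-consistency estimates from a persons x items score matrix.
# Data are hypothetical (8 examinees, 6 items scored 0-5).
import numpy as np

scores = np.array([
    [4, 5, 3, 4, 5, 4],
    [2, 3, 2, 2, 3, 2],
    [5, 5, 4, 5, 4, 5],
    [3, 2, 3, 3, 2, 3],
    [1, 2, 1, 2, 1, 1],
    [4, 4, 5, 4, 5, 4],
    [3, 3, 3, 2, 3, 3],
    [5, 4, 4, 5, 5, 4],
])

# Split-half: correlate odd-item totals with even-item totals, then apply the
# Spearman-Brown correction to estimate full-length reliability.
odd, even = scores[:, 0::2].sum(axis=1), scores[:, 1::2].sum(axis=1)
r_half = np.corrcoef(odd, even)[0, 1]
split_half = 2 * r_half / (1 + r_half)

# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of total scores).
k = scores.shape[1]
item_vars = scores.var(axis=0, ddof=1)
total_var = scores.sum(axis=1).var(ddof=1)
alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)

print(f"Split-half (Spearman-Brown): {split_half:.2f}")
print(f"Cronbach's alpha:            {alpha:.2f}")
```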
Reliability • Scorer and rater reliabilities reflect the extent to which independent scorers or a single scorer over time agree on a score. • Interjudge (inter-rater) reliability: Consistency of two or more independent scorers. • Intrajudge (intra-rater) reliability: Consistency of one person over time.
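Inter-rater consistency is often reported as simple percent agreement or, for categorical scores, as Cohen's kappa (a chance-corrected index not named on the slide, shown here as one common choice). The ratings below are hypothetical.

```python
# Sketch: inter-rater agreement for two independent scorers assigning
# categorical ratings. Cohen's kappa corrects raw agreement for chance.
# Ratings below are hypothetical.
from collections import Counter

rater_1 = ["pass", "pass", "fail", "pass", "fail", "pass", "pass", "fail"]
rater_2 = ["pass", "fail", "fail", "pass", "fail", "pass", "pass", "pass"]

n = len(rater_1)
observed = sum(a == b for a, b in zip(rater_1, rater_2)) / n

# Chance agreement: product of each rater's marginal proportions, summed over categories.
c1, c2 = Counter(rater_1), Counter(rater_2)
expected = sum((c1[cat] / n) * (c2[cat] / n) for cat in set(rater_1) | set(rater_2))

kappa = (observed - expected) / (1 - expected)
print(f"Percent agreement: {observed:.2f}")
print(f"Cohen's kappa:     {kappa:.2f}")
```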