Evaluating Psychological Tests

Psychological testing • Suffers a credibility problem in the eyes of the general public • Two main problems • Tests used inappropriately • Goddard (1912) used a translation of Binet’s test to test the ability of immigrants to America; conclusion: 79% of Italian immigrants = ‘feeble-minded’ (bias) • Tests themselves can be flawed • Often measure supposed constructs that are not supported by proper factor analysis (e.g., internal locus of control)

External bias in tests • Do group differences imply test bias (difficulty unrelated to the characteristic being assessed)? • View 1: innate abilities can differ across groups (Reynolds, 1995; Kline, 1993) • Japanese have higher-than-average spatial abilities • African Americans have ‘lower IQ’ (Herrnstein & Murray, 1996) • View 2: ethnic and gender groups must have the same underlying abilities; evidence to the contrary must be a product of measuring something other than what is relevant • Kline: ‘egalitarian fallacy’

Dealing with differences • Bias is detected through different regression equations across groups, not through different group means • What purpose does research in this area serve? • Within-group differences far outweigh between-group differences
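The point about regression equations versus means can be sketched in a few lines of Python. This is a minimal illustration, not a procedure from the slides: the scores, the criterion measure, and the `fit_line` helper are all hypothetical. Groups A and B differ in mean test score but share one regression line (no predictive bias); groups A and C have the same mean test score but different lines (bias in the intercept).

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical test scores and criterion scores (e.g., a later performance rating).
test_a = [90, 100, 110, 120, 130]
crit_a = [0.5 * t + 5 for t in test_a]

# Group B scores 10 points lower on the test, but the same line links test to criterion.
test_b = [80, 90, 100, 110, 120]
crit_b = [0.5 * t + 5 for t in test_b]

# Group C has the same test scores as A, yet does better on the criterion at every score.
test_c = [90, 100, 110, 120, 130]
crit_c = [0.5 * t + 15 for t in test_c]

fit_a = fit_line(test_a, crit_a)
fit_b = fit_line(test_b, crit_b)
fit_c = fit_line(test_c, crit_c)

print("A:", fit_a)  # same line as B despite different test means
print("B:", fit_b)
print("C:", fit_c)  # different intercept despite the same test mean as A
```

In this made-up case the test under-predicts the criterion for group C, which a comparison of mean test scores alone would never reveal.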

Detecting internal bias • If only gross scores are considered, hard and easy items for each group might balance each other out, giving a false impression of the test’s ‘health’ • Alternative – run a mixed factorial ANOVA • Each test item (question) is entered as a level of a repeated measures factor • Group = between-subjects variable • Main effect of item – expected • Main effect of group shows external bias • Interaction shows internal bias, in that the pattern of responding differs across the groups • Such a method is susceptible to manipulation of statistical power (e.g., sample size)
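A rough sketch of the mixed factorial ANOVA described above, for a balanced design only. The data are invented 0/1 item scores and `mixed_anova` is a hypothetical helper, not a library function. It returns F ratios only; p-values would need the F distribution, which is left out here. The two groups have identical total scores, so the group effect is zero, but the item pattern is reversed between them, so the interaction is large.

```python
def mixed_anova(data):
    """F ratios for a balanced group (between) x item (within) design.

    data maps group name -> list of subjects, each a list of item scores.
    Every group must have the same number of subjects and items.
    """
    groups = list(data)
    g = len(groups)                 # number of groups
    n = len(data[groups[0]])        # subjects per group
    k = len(data[groups[0]][0])     # items per subject
    scores = [x for grp in groups for subj in data[grp] for x in subj]
    grand = sum(scores) / len(scores)

    group_means = [sum(sum(s) for s in data[grp]) / (n * k) for grp in groups]
    item_means = [sum(s[j] for grp in groups for s in data[grp]) / (g * n)
                  for j in range(k)]
    cell_means = [sum(s[j] for s in data[grp]) / n
                  for grp in groups for j in range(k)]

    ss_total = sum((x - grand) ** 2 for x in scores)
    ss_subjects = k * sum((sum(s) / k - grand) ** 2
                          for grp in groups for s in data[grp])
    ss_group = n * k * sum((m - grand) ** 2 for m in group_means)
    ss_item = g * n * sum((m - grand) ** 2 for m in item_means)
    ss_cells = n * sum((m - grand) ** 2 for m in cell_means)
    ss_inter = ss_cells - ss_group - ss_item
    ss_subj_within = ss_subjects - ss_group           # error term for group
    ss_error = ss_total - ss_subjects - ss_item - ss_inter  # error for item terms

    ms_subj_within = ss_subj_within / (g * (n - 1))
    ms_error = ss_error / (g * (n - 1) * (k - 1))
    return {
        "F_group": (ss_group / (g - 1)) / ms_subj_within,
        "F_item": (ss_item / (k - 1)) / ms_error,
        "F_interaction": (ss_inter / ((g - 1) * (k - 1))) / ms_error,
    }


# Hypothetical responses (1 = correct): items 1-2 are easy for A, items 3-4 for B.
responses = {
    "A": [[1, 1, 0, 0], [1, 1, 1, 0], [1, 0, 0, 0], [1, 1, 0, 0]],
    "B": [[0, 0, 1, 1], [0, 1, 1, 1], [0, 0, 0, 1], [0, 0, 1, 1]],
}
result = mixed_anova(responses)
print(result)
```

Because the F ratios grow with the number of subjects, the same item pattern can be made significant or non-significant by changing sample size, which is the power issue noted on the slide.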

Bias - performance characteristics • Response bias • Individuals are more likely to agree than disagree (Cronbach, 1946) – response set of acquiescence • Does not cause a problem if everyone behaves in the same manner – standard scores will be unaffected • But there are considerable individual differences in acquiescence, so it can cause a major problem • Changing the polarity of some items removes this difficulty • Social desirability • Counteracted by lie scales and consistency measures
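Changing polarity can be illustrated with a small scoring sketch. It assumes a 5-point agreement scale and a six-item questionnaire; both, and the `score` helper, are hypothetical. A respondent who agrees with everything gets the maximum score when all items are keyed the same way, but lands on the scale midpoint once half the items are reverse-keyed.

```python
SCALE_MIN, SCALE_MAX = 1, 5  # assumed 5-point scale (1 = disagree, 5 = agree)


def score(responses, reversed_items):
    """Total score after reverse-keying the item indices in reversed_items."""
    total = 0
    for index, value in enumerate(responses):
        if index in reversed_items:
            # Reverse keying maps 5 -> 1, 4 -> 2, 3 -> 3, and so on.
            value = SCALE_MIN + SCALE_MAX - value
        total += value
    return total


yea_sayer = [5, 5, 5, 5, 5, 5]  # agrees strongly with every statement

all_positive = score(yea_sayer, reversed_items=set())
balanced = score(yea_sayer, reversed_items={1, 3, 5})

print(all_positive)  # 30: acquiescence inflates the score
print(balanced)      # 18: the midpoint for six items on a 1-5 scale
```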

Obvious influences • Motivation • Expectation • Anxiety • Test-specific practice

Revisiting Validity

Validity – different definitions • Correctness or truth of an inference • Validity with respect to the IV • Are we truly manipulating that which we think we are? • Often relies on the construct of interest being adequately described • How do you manipulate something like the unconscious? • Validity with respect to the DV • Extent to which you are measuring what you claim to measure

Different types of validity • Content validity • Whether the target construct is adequately addressed • When measuring depression, the test should assess aspects such as fatigue, anxiety, appetite, motivation, libido • Is assessed through expert opinion • Has a certain amount of subjectivity

Different types of validity • Criterion-related validity • How the measure compares to some already validated measure • Two types • Predictive • Concurrent
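Criterion-related validity is commonly summarised as a correlation between the new measure and the criterion. The sketch below assumes that approach; the scores and the `pearson_r` helper are hypothetical. The same calculation serves both types: for concurrent validity the criterion is collected at the same time, for predictive validity it is collected later.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for six people.
new_test = [12, 15, 9, 20, 17, 11]
established_test = [30, 34, 25, 41, 39, 27]  # taken in the same session (concurrent)
later_outcome = [3, 4, 2, 5, 5, 2]           # measured months later (predictive)

concurrent_r = pearson_r(new_test, established_test)
predictive_r = pearson_r(new_test, later_outcome)

print(f"concurrent validity coefficient: {concurrent_r:.2f}")
print(f"predictive validity coefficient: {predictive_r:.2f}")
```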

Different types of validity • Construct validity • Most important – are the experimental manipulations that we make really manipulating the construct of interest? • Evaluation requires • Clear definition of the construct • Can be difficult, e.g., IQ has many different facets • Assessing the match between the construct and the operations used to represent it (experimental manipulations) • Can involve criterion and content validity • Viewed as an evolving, never-ending process

Different types of validity • Internal validity – degree to which the independent and dependent variables are causally linked • External validity – degree to which the causal relationship holds across different settings

How relevant is validity to you? • Reviewing articles is essentially addressing validity and reliability issues • In an examination situation it would be useful, although not essential, to talk about the different forms of validity • In the discussion sections of reports you are again essentially evaluating the results with respect to validity and reliability • You would not really use the formal language used here – it is a style issue
