Automatic Generation of Verbal Analogy Items
Alan D. Mead
Illinois Institute of Technology
AIG in employment testing
• Rise of unproctored Internet testing (UIT)
• UIT may cause many security problems
• One is item theft and coaching
• Solution: Generate entire test from scratch for each examinee
  • Item theft less of a problem
  • Coaching less effective
  • Items could be “watermarked”
• Also reduces cost and speeds deployment
AIG in employment testing (cont.)
• Need a variety of test content
  • Verbal analogies
  • Vocabulary
  • Math
  • Perceptual speed and accuracy
  • Spatial ability
  • Personality
  • Situational Judgment
  • Etc.
Verbal Analogies
Word responses: Shovel:Dig::Fork
• Buy
• Cry
• Eat
• Stop
Pair responses: Shovel:Dig
• Bag:Buy
• Baby:Cry
• Fork:Eat
• Car:Stop
• Identify a “bridge”; you DIG with a SHOVEL
• Find a matching answer; you EAT with a FORK
Generating Verbal Analogies
• Identified database of relationships (e.g., “RIDER operates a BIKE”)
• Identified additional bridge relationships (“BOVINE means COW-like” & “ABSENT is the opposite of PRESENT”)
• Gathered data on word frequency and (part of this study) word familiarity
Generating Verbal Analogies (cont.)
• Randomly select a bridge
• Randomly select TWO pairs for this bridge (one for the stem, one for the key)
• Randomly select 2-3 additional pairs from other bridges
• Randomly assign key pair; fill in remaining pairs
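The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not the generator used in the study: the `BRIDGES` dictionary, its bridge labels, the function names, and the choice to draw each foil from a different bridge are all assumptions made for the example. Word frequency and familiarity are ignored here.

```python
import random

# Hypothetical bridge database (an assumption for illustration): each bridge
# label maps to word pairs that share that relationship.
BRIDGES = {
    "second word operates first": [("rocket", "astronaut"), ("jet", "pilot"), ("bike", "rider")],
    "second word is described by first": [("paternal", "father"), ("juvenile", "child")],
    "words are opposites": [("unfold", "fold"), ("demand", "supply"), ("absent", "present")],
    "first word is used to do second": [("shovel", "dig"), ("fork", "eat")],
    "first word means second-like": [("bovine", "cow")],
}


def generate_item(bridges, n_foils=3, rng=None):
    """Build one pair-response analogy item following the four steps on the slide."""
    rng = rng or random.Random()
    # Step 1: randomly select a bridge (it needs two pairs: stem and key).
    eligible = [b for b, pairs in bridges.items() if len(pairs) >= 2]
    bridge = rng.choice(eligible)
    # Step 2: randomly select two distinct pairs for this bridge.
    stem, key = rng.sample(bridges[bridge], 2)
    # Step 3: randomly select foil pairs from other bridges
    # (one foil per bridge here, which is an assumption).
    other_bridges = [b for b in bridges if b != bridge]
    foils = [rng.choice(bridges[b]) for b in rng.sample(other_bridges, n_foils)]
    # Step 4: randomly place the key among the options.
    options = foils + [key]
    rng.shuffle(options)
    return {"bridge": bridge, "stem": stem, "options": options,
            "key_index": options.index(key)}


def format_item(item):
    """Render an item as text, e.g. 'rocket:astronaut:: ?' plus lettered options."""
    first, second = item["stem"]
    lines = [f"{first}:{second}:: ?"]
    for letter, (a, b) in zip("abcd", item["options"]):
        lines.append(f"{letter}. {a}:{b}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_item(generate_item(BRIDGES, rng=random.Random(1))))
```

Passing a seeded `random.Random` makes a generated form reproducible, which may be useful when a test is generated from scratch for each examinee and later needs to be rescored.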
Sample Items
1. paternal:father:: ?
   a. juvenile:child
   b. microphone:sound
   c. chalk:writer
   d. unfold:fold
3. rocket:astronaut:: ?
   a. lamp:light
   b. stick:skating rink
   c. jet:pilot
   d. demand:supply
Alternative format
1. paternal:father::juvenile:?
   a. child
   b. sound
   c. writer
   d. fold
3. rocket:astronaut::jet:?
   a. light
   b. skating rink
   c. pilot
   d. supply
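Comparing the two formats shows that a word-response item can be derived from a pair-response item: the first word of the key pair moves into the stem, and each option keeps only its second word. A minimal sketch of that conversion follows; the function name and the tuple representation of pairs are assumptions for illustration.

```python
def to_word_format(stem, options, key_index):
    """Convert a pair-response item into the word-response format.

    stem      -- (first, second) pair shown in the stem
    options   -- list of (first, second) answer pairs
    key_index -- position of the correct pair in options
    """
    first, second = stem
    # The first word of the key pair is moved into the stem.
    key_first = options[key_index][0]
    stem_text = f"{first}:{second}::{key_first}:?"
    # Each option keeps only its second word; the key position is unchanged.
    word_options = [second_word for _, second_word in options]
    return stem_text, word_options


stem_text, word_options = to_word_format(
    ("paternal", "father"),
    [("juvenile", "child"), ("microphone", "sound"),
     ("chalk", "writer"), ("unfold", "fold")],
    key_index=0,
)
print(stem_text)     # paternal:father::juvenile:?
print(word_options)  # ['child', 'sound', 'writer', 'fold']
```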
Keys
1. paternal:father:: ? [Bridge: FATHER is described by PATERNAL]
   a. juvenile:child ***
   b. microphone:sound (unrelated: sound is a (typical) theme of microphone)
   c. chalk:writer (unrelated: writer is a (typical) agent of chalk)
   d. unfold:fold (unrelated: unfold and fold are opposites/opposed)
3. rocket:astronaut:: ? [Bridge: ASTRONAUT operates ROCKET]
   a. lamp:light (unrelated: lamp is a (typical) result of light)
   b. stick:skating_rink (unrelated: skating_rink is a (typical) location of stick)
   c. jet:pilot ***
   d. demand:supply (unrelated: supply and demand are opposites/opposed)
Present Study
• H1: Two forms of AIG analogies (word responses and pair responses) will have comparable reliability & validity
• H2: AIG scales will have reliability comparable to manually-written scale
• H3: AIG scales will have construct and criterion validity comparable to manually-written scale
Method
• Sample of N=251 gathered online and from psychology classes
• Measures:
  • n=20 AIG & human-written verbal analogy scales
  • N=40 vocabulary
  • Self-reported performance at work & school
Feasibility
• Manually examined items for feasibility
• 40/64 (63%) items were feasible
• Reasons for infeasibility
  • Over-use of a bridge or a pair (some bridges have few pairs)
  • Ambiguous pairs (drum:drum?)
  • Foil inadvertently a correct key
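The study screened items by hand, but two of the reasons listed above lend themselves to simple mechanical checks. The sketch below is only an illustration under stated assumptions: the function names, the `max_uses` threshold, and the dictionary layout of the bridge database are hypothetical. It can flag only foils that the database itself lists under the stem's bridge, so a foil that is a valid key for a reason the database does not encode would still need human review.

```python
from collections import Counter


def screen_item(stem, key, foils, bridge, bridges):
    """Return a list of problems found in one generated item (empty if none)."""
    problems = []
    # Ambiguous pairs: both words are the same string (e.g. drum:drum).
    for first, second in [stem, key, *foils]:
        if first.lower() == second.lower():
            problems.append(f"ambiguous pair {first}:{second}")
    # Foil inadvertently a correct key: the foil is also listed under the stem's bridge.
    for foil in foils:
        if foil in bridges.get(bridge, []):
            problems.append(f"foil {foil[0]}:{foil[1]} also fits the stem's bridge")
    return problems


def overused_bridges(item_bridges, max_uses=2):
    """Return bridges used by more than max_uses items on one form (threshold is an assumption)."""
    counts = Counter(item_bridges)
    return sorted(b for b, n in counts.items() if n > max_uses)


# Hypothetical one-bridge database and a deliberately flawed item.
example_bridges = {"operates": [("rocket", "astronaut"), ("jet", "pilot"), ("bike", "rider")]}
example_problems = screen_item(
    stem=("rocket", "astronaut"),
    key=("jet", "pilot"),
    foils=[("drum", "drum"), ("bike", "rider")],
    bridge="operates",
    bridges=example_bridges,
)
print(example_problems)
```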
Results for H1
• H1: Two forms of AIG analogies (word responses and pair responses) will have comparable reliability & validity
• CONFIRMED
Results for H2
• H2: AIG scales will have reliability comparable to manually-written scale
• NOT CONFIRMED, because the AIG scales had better reliability than the manually-written scale
Results for H3
• H3: AIG scales will have construct and criterion validity comparable to manually-written scale
• CONFIRMED
Future Directions
• Better handling of senses (DRUM is for DRUMMING)
• Better difficulty calculations based on larger sample of items
• Automated feasibility checking
• Enhanced database of relationships
• Choosing foils to have more semantic similarity to other words