Valid, Reliable & Efficient: A Psychometric Evaluation of “Flash” Word Recognition and “NSSI” Passage Reading Measures
Kathleen J. Brown and Matthew K. Fields, University of Utah; R. Darrell Morris, Appalachian State University
Impetus for Current Study • Need for valid, reliable, efficient instruments to determine instructional reading level • Flaws with current instruments: DRA (no rate measure; time-consuming), DIBELS (screening only), IRIs (psychometric evaluation often weak or missing)
Impetus for Current Study • Growing use of “Flash” & selected graded passages, a.k.a. “NSSI” (Virginia/ASU effect) • Initial psychometric evaluations positive (Frye, 2004; Frye & Trathen, 2004; Frye, Trathen, Olson, & Schlagal, 2002; Palmer, Trathen, Olson, & Schlagal, 2002)
Theoretical Framework (Anastasi, 1988; APA, 1985)
Methods • 4 schools: 2 Title 1 (1 public, 1 parochial) and 2 non-Title 1 (both public, mixed SES) • 192 students in G2-G5, March 2006 • Students rank-ordered by DIBELS or QRI, then 12 sampled per grade (4 high, 4 average, 4 poor) to achieve a representative distribution for testing
Methods • 135 minutes of assessment in 3 sessions • Presentation order counterbalanced • Flash item selection counterbalanced • 9 members on data team; 4 hours of protocol training • Manual Flash interrater differences not significant (n.s.)
Alternate Form Reliability • measure of temporal stability for scores • measure of consistency of response for scores
Alternate Form Reliability • To what extent are NSSI A passage scores equivalent to NSSI B passage scores? • To what extent are computer “Flash” scores equivalent to manual “Flash” scores?
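One common way to quantify the equivalence of two forms is to correlate each student's score on one form with the same student's score on the other. The sketch below computes a Pearson correlation in plain Python. The transcript does not state which statistic the authors used, so the choice of Pearson's r is an assumption, and the score lists are invented for illustration only; they are not data from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when one list has no variance")
    return cross / sqrt(ss_x * ss_y)


# Hypothetical instructional levels (grade) for six students on each form.
# These numbers are made up for the example, not taken from the study.
form_a = [2, 3, 3, 4, 5, 5]
form_b = [2, 3, 4, 4, 5, 6]

r = pearson_r(form_a, form_b)
print(f"alternate-form r = {r:.2f}")
```

A value of r close to 1 would indicate that the two forms rank students in nearly the same order; it does not by itself show that the forms yield the same mean level.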
Results: Alternate Form Reliability (results table not preserved in transcript; significance key: **p < .01)
Content Validity • provides evidence that items on test represent a specific domain • provides evidence that the format and response properties of the test represent the domain
Content Validity • To what extent do the NSSI passages reflect/measure expected grade-level benchmarks? (Possible analysis: report separate means for accuracy, rate, and comprehension.) • To what extent does the Flash measure reading instructional level?
Concurrent Validity • To what extent are Flash scores and NSSI scores consistent with scores achieved on a “flagship” standardized reading measure (i.e., the GORT)?
Results: Concurrent Validity (results table not preserved in transcript; significance key: **p < .01)
Conclusions: For G2-G5 • NSSI A and NSSI B seem to have high validity for identifying students’ instructional reading levels • NSSI A and NSSI B can be considered equivalent forms
Conclusions: For G2-G5 • Manual Flash and Computer Flash seem to have high validity for identifying students’ instructional reading levels when the criterion is set at 85% • Manual Flash and Computer Flash seem to be equivalent forms
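To make the 85% criterion concrete, here is a minimal sketch of how a cutoff like this could be applied to graded word-list scores. The transcript does not describe the scoring procedure, so the rule used here (take the highest grade-level list on which the student meets the criterion, stopping at the first list that falls below it) is an assumption. The function name `flash_instructional_level` and the scores are hypothetical.

```python
def flash_instructional_level(percent_correct_by_grade, criterion=85.0):
    """Return the highest grade level whose list score meets the criterion.

    Lists are checked from the lowest grade upward, and checking stops at
    the first list that falls below the criterion. Returns None if even
    the lowest list is below the criterion.
    """
    level = None
    for grade in sorted(percent_correct_by_grade):
        if percent_correct_by_grade[grade] >= criterion:
            level = grade
        else:
            break
    return level


# Hypothetical percent-correct scores on grade 1-5 word lists (invented data).
scores = {1: 100, 2: 95, 3: 85, 4: 70, 5: 40}

level = flash_instructional_level(scores)
print("Estimated instructional level:", level)
```

Raising the criterion (for example, `criterion=90`) would place this hypothetical student one level lower, which is why the conclusion is tied to a specific cutoff.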
Conclusions: For G2-G5 • The GORT does not seem to have high validity for identifying students’ instructional levels—at any grade level. • The GORT over-predicts instructional level—by approx. 2 years. • Note: most GORT comp questions are passage independent (Keenan & Betjemann, 2006)
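Over-prediction of this kind can be summarized as a mean signed difference between the level a test assigns and a reference instructional level. The sketch below shows that calculation. The transcript does not say how the authors arrived at the 2-year figure, so this method is an assumption, and the numbers are invented to mirror the direction of the reported finding; they are not the study's data.

```python
def mean_signed_difference(predicted, reference):
    """Average of (predicted - reference); positive means over-prediction."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("need two non-empty, equal-length lists")
    return sum(p - r for p, r in zip(predicted, reference)) / len(predicted)


# Hypothetical grade-level estimates for four students (invented data).
gort_levels = [4, 6, 5, 7]        # level suggested by the standardized test
reference_levels = [2, 3, 4, 5]   # reference instructional level

bias = mean_signed_difference(gort_levels, reference_levels)
print(f"mean over-prediction: {bias:.1f} grade levels")
```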