Psychometric Evaluation of "Flash" and "NSSI" Reading Measures in Elementary Education

Valid, Reliable & Efficient: A Psychometric Evaluation of “Flash” Word Recognition and “NSSI” Passage Reading Measures

Kathleen J. Brown, Matthew K. Fields, University of Utah • R. Darrell Morris, Appalachian State University

Impetus for Current Study • Need for valid, reliable, efficient instruments to determine instructional reading level • Flaws with current instruments • DRA – no rate & time consuming • DIBELS – screen only • IRIs – psychometric evaluation often weak or missing

Impetus for Current Study • Growing use of “Flash” & selected graded passages a.k.a. “NSSI” (Virginia/ASU effect) • Initial psychometric evaluations positive (Frye, 2004; Frye & Trathen, 2004; Frye, Trathen, Olson, & Schlagal, 2002; Palmer, Trathen, Olson, & Schlagal, 2002)

Theoretical Framework (Anastasi, 1988; APA, 1985)

Methods • 4 schools • 2 = Title 1 (1 = public, 1 = parochial) • 2 = non-Title 1 (both = public & mixed SES) • 192 students in G2-G5 in March 2006 • Rank-ordered students by DIBELS or QRI, then sampled 12 students per grade: 4 high, 4 average, 4 poor, to achieve a representative distribution for testing
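The rank-then-sample step above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the slides do not say how the high, average, and poor bands were cut or how students were drawn within a band, so the equal-thirds split, the random draw, the function name `sample_by_tier`, and the roster IDs are all assumptions.

```python
import random


def sample_by_tier(ranked_ids, per_tier=4, seed=0):
    """Split a best-to-worst ranked list into thirds and draw per_tier IDs from each.

    Assumption: tiers are equal thirds of the ranked list; the slides do not
    specify the cut points.
    """
    n = len(ranked_ids)
    third = n // 3
    tiers = {
        "high": ranked_ids[:third],
        "average": ranked_ids[third:n - third],
        "poor": ranked_ids[n - third:],
    }
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return {name: rng.sample(members, per_tier) for name, members in tiers.items()}


# Hypothetical roster of 30 students in one grade, already ranked by screening score.
roster = [f"S{i:02d}" for i in range(1, 31)]
picked = sample_by_tier(roster)
```

Each call returns 4 IDs per band, 12 per grade, matching the per-grade sample size stated on the slide.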

Methods • 135 minutes of assessment in 3 sessions • Presentation order counterbalanced • Flash item selection counterbalanced • 9 on data team; 4 hours protocol training • Manual flash interrater differences = n.s.

Alternate Form Reliability • measure of temporal stability for scores • measure of consistency of response for scores

Alternate Form Reliability • To what extent are NSSI A passage scores equivalent to NSSI B passage scores? • To what extent are computer “Flash” scores equivalent to manual “Flash” scores?
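One common way to quantify how equivalent two forms are is to correlate each student's score on one form with the same student's score on the other. The sketch below assumes a Pearson correlation, which the slides do not name; the function `pearson_r` and the score lists are hypothetical and made up purely for illustration, not taken from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two paired score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two paired lists with at least 2 scores each")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list has no variance")
    return sxy / sqrt(sxx * syy)


# Hypothetical instructional levels for six students on two forms (illustrative only).
form_a = [1, 2, 2, 3, 4, 5]
form_b = [1, 2, 3, 3, 4, 5]
r = pearson_r(form_a, form_b)
```

A value of `r` near 1 would indicate that the two forms rank students almost identically; the same calculation applies to computer versus manual Flash scores.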

Results: Alternate Form Reliability **p < .01

Content Validity • provides evidence that items on test represent a specific domain • provides evidence that the format and response properties of the test represent the domain

Content Validity • To what extent do the NSSI passages reflect/measure expected grade level benchmarks? • Maybe look at separate means for accuracy, rate, & comp & report those to show • To what extent does the Flash measure reading instructional level?

NSSI Reading Level Criteria

Performance Level for NSSI by Grade

Performance Level for Flash by Format & Criterion

Concurrent Validity • To what extent are Flash scores and NSSI scores consistent with scores achieved on a “flagship” standardized reading measure (i.e., the GORT)?

Results: Concurrent Validity **p < .01

Average Performance Level for NSSI & GORT by Grade Level

Conclusions: For G2-G5 • NSSI A and NSSI B seem to have high validity for identifying students’ instructional reading levels • NSSI A and NSSI B can be considered equivalent forms

Conclusions: For G2-G5 • Manual Flash and Computer Flash seem to have high validity for identifying students’ instructional reading levels when the criterion is set at 85% • Manual Flash and Computer Flash seem to be equivalent forms
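To make the 85% criterion concrete, here is a minimal sketch of how a cutoff could turn per-list Flash scores into a single level. The scoring rule is an assumption, since the slides state only the criterion value: the sketch takes the highest graded list at which the student meets the criterion, stopping at the first list that falls short. The function name and the example scores are hypothetical.

```python
def highest_level_meeting_criterion(percent_by_level, criterion=85.0):
    """Return the highest consecutive level whose percent-correct meets the criterion.

    Assumption: lists are checked from the lowest level upward and checking
    stops at the first list below the criterion. Returns None if even the
    lowest list falls short.
    """
    level_reached = None
    for level in sorted(percent_by_level):
        if percent_by_level[level] >= criterion:
            level_reached = level
        else:
            break
    return level_reached


# Hypothetical percent-correct on graded word lists 1-4 for one student.
scores = {1: 100.0, 2: 90.0, 3: 80.0, 4: 95.0}
level = highest_level_meeting_criterion(scores)
```

Under this rule the later list at 95% does not count, because the student already fell below the criterion on list 3; a different stopping rule would give a different level.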

Conclusions: For G2-G5 • The GORT does not seem to have high validity for identifying students’ instructional levels—at any grade level. • The GORT over-predicts instructional level—by approx. 2 years. • Note: most GORT comp questions are passage independent (Keenan & Betjemann, 2006)
