
Using Diagnostic Assessment To Guide Timely Interventions






Presentation Transcript


  1. Using Diagnostic Assessment To Guide Timely Interventions Natalie Rathvon, Ph.D.

  2. What We’ll Cover • A research-based framework for selecting and using diagnostic reading assessments • Steps in the diagnostic assessment process • Issues related to assessing the five reading components • Diagnostic assessment options for each component • Case examples

  3. Reading First Assessments • Screening: Brief measures to identify which students are at risk for reading problems • Progress monitoring: Brief measures to determine if students are making adequate progress in acquiring reading skills • Diagnostic: A comprehensive assessment to locate the source(s) of reading difficulty for individual students to guide instruction • Outcome: An assessment to determine the extent to which all students have achieved grade-level expectations in reading

  4. Questions to be Answered by Diagnostic Assessments • In which reading skill areas is this student achieving at expected levels? • In which reading skill areas is the student making less than expected progress? • What types, intensity, and duration of interventions are likely to be effective in addressing this student’s skill needs?

  5. So many tests, so few guidelines . . . • Growing number of print and online tests purporting to assess reading • Standards for Educational and Psychological Testing (AERA, APA, & NCME, 1999) • Provides general guidelines, not specific criteria, for evaluating psychometric quality

  6. Myths about Reading Assessment • All claims that a measure is “scientifically based” are equally valid. • A valid and reliable measure is equally valid and reliable for all examinees. • All measures of the same reading component yield similar results for the same examinee.

  7. The Case of Tim (Grade 1): Poor or Proficient Reader?

  8. Accelerating Student Outcomes (diagram: a cycle linking Assessment, Data-Based Instructional Planning, and Instruction)

  9. Reading Assessment Models • Traditional "Standard Battery" (one size fits all): Assumes reading problems arise from internal child deficits; designed to provide a categorical label for educational programming • Component-based: Targets domains related to the identified deficits; assumes most reading problems arise from experiential and/or instructional deficits; designed to provide information for guiding instruction

  10. Two Sets of Considerations in Selecting Assessments • Technical adequacy: Psychometric soundness • Usability: Degree to which practitioners can actually use a measure in applied settings

  11. Assessment Checklists • Checklist 1: Evaluating the technical adequacy of diagnostic reading measures • Checklist 2: Evaluating the usability of diagnostic reading measures

  12. Five Key Technical Adequacy Characteristics • Norms • Test floors • Item gradients • Reliability • Validity • Checklist 1: Evaluating Technical Adequacy

  13. Norms: How Do We Interpret Performance? • Norm-referenced measures: Comparisons with age/grade peers • Criterion-referenced measures: Comparisons with pre-determined performance standards • Nonstandardized measures: Research norms or examiner judgment

  14. Evaluating the Adequacy of Norms • Are they representative? • Criteria: Should match a national or appropriate reference population • Are they recent? • Criteria: No more than 7-12 years old • Are subgroup and sample sizes large enough? • Criteria: At least 100 and 1,000, respectively

  15. Evaluating Norms, II • Are norm table intervals small enough to reflect changes in skill development? • Criteria: • No more than 6 months for students aged 7-11 (7 years, 11 months) and younger • No more than 1 year for students aged 8-0 to 18
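To make the norm-adequacy criteria on slides 14-15 concrete, here is a minimal Python sketch that codifies the recency and sample-size checks; the function names and the 12-year default are illustrative assumptions, not part of the presentation.

```python
from datetime import date

def norms_are_recent(norming_year: int, max_age_years: int = 12) -> bool:
    """Norms should be no more than roughly 7-12 years old (slide 14 criterion)."""
    return date.today().year - norming_year <= max_age_years

def norm_samples_adequate(smallest_subgroup_n: int, total_n: int) -> bool:
    """At least 100 examinees per subgroup and 1,000 in the total norming sample."""
    return smallest_subgroup_n >= 100 and total_n >= 1000
```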

  16. Reliability: Are Scores Consistent and Accurate? • Alternate-form: Form A vs. Form B • Internal consistency: Item A vs. Item B • Test-retest: Time A vs. Time B • Interscorer: Scorer A vs. Scorer B • Criteria: ≥ .80 for screening measures and ≥ .90 for diagnostic measures
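As a hedged illustration of the internal-consistency and benchmark criteria above, the sketch below computes coefficient (Cronbach's) alpha from an examinees-by-items score matrix and applies the .80/.90 cutoffs; the data layout and function names are assumptions for the example, not part of the presentation.

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Coefficient alpha (one common internal-consistency estimate) computed from
    an examinees-by-items matrix of item scores."""
    k = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1)
    total_variance = scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

def meets_reliability_criterion(coefficient: float, purpose: str = "diagnostic") -> bool:
    """Apply the benchmarks above: >= .80 for screening, >= .90 for diagnostic measures."""
    return coefficient >= (0.90 if purpose == "diagnostic" else 0.80)
```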

  17. Hidden Threat to Reliability • Examiner variance: Differences among assessors in administering tasks and recording responses • Especially likely on: • Live-voice tasks (phoneme blending) • Fluency-based tasks (CBM, TOWRE) • Tasks with complex administration or scoring systems (DIBELS ISF, LAC–3)

  18. Test Floors: Can the Test Detect Poor Readers? • Test floor: Lowest possible standard score when a student answers 1 item correctly • Adequate floors: Permit identification of students with very weak skills • Inadequate floors: Overestimate students’ level of skills

  19. Test Floor Criteria • A subtest raw score of 1 should yield a standard score more than 2 SDs below the subtest mean. • SS of 3 or less for a subtest mean of 10 (SD = 3) • SS of 69 or less for a test mean of 100 (SD = 15)

  20. Which Tests and Tasks Are Likely to Display Floor Effects? • “Cradle-to-grave” tests (WJ III) • Phonemic manipulation tasks (deletion, substitution, reversal) • Oral reading fluency tests • Pseudoword reading tests • Spelling tests • Reading comprehension tests

  21. Why Floor Effects Matter • TOWRE Phoneme Decoding Efficiency • A student in the 2nd month of Grade 1 with 1 item correct earns a SS of 97 (average). • WJ III Reading Vocabulary • A student in the 3rd month of Grade 1 with 1 item correct earns a SS of 94 (average).
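The floor criterion on slide 19 and the two examples above reduce to a simple comparison; the sketch below is illustrative only, and the function name is an assumption.

```python
def floor_is_adequate(ss_at_raw_score_1: float, mean: float = 100.0, sd: float = 15.0) -> bool:
    """Adequate floor: a raw score of 1 converts to a standard score more than 2 SDs
    below the mean (69 or lower when M = 100, SD = 15; 3 or lower when M = 10, SD = 3)."""
    return ss_at_raw_score_1 < mean - 2 * sd

floor_is_adequate(97)                 # TOWRE PDE example above -> False (inadequate floor)
floor_is_adequate(94)                 # WJ III Reading Vocabulary example -> False
floor_is_adequate(3, mean=10, sd=3)   # scaled-score subtest meeting the criterion -> True
```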

  22. Item Gradients: Can the Test Detect Small Differences? • Item gradient: Steepness with which standard scores change from 1 raw score unit to another • Adequate gradient: Sensitive to small differences in performance • Steep gradient: Obscures differences among performance levels

  23. Item Gradient Criteria • At least 6 raw-score items between the subtest floor and the mean for scaled-score subtests (M = 10), or • At least 10 raw-score items between the floor and the mean for standard-score tests (M = 100) • Example of a steep gradient: GRADE Listening Comprehension (K) • 17 items correct = 5th stanine • 18 items correct (100%) = 8th stanine
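A minimal sketch of the item-gradient counts above, assuming you have already tallied the raw-score items between a test's floor and its mean; the function name is hypothetical.

```python
def gradient_is_adequate(items_between_floor_and_mean: int, test_mean: float = 100.0) -> bool:
    """Per the criteria above: at least 6 items between floor and mean for scaled-score
    subtests (M = 10), at least 10 for standard-score tests (M = 100)."""
    required = 6 if test_mean == 10 else 10
    return items_between_floor_and_mean >= required
```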

  24. Test Floors and Item Gradients: Special Cases • Screening tests • Critical issue is cutoff score accuracy, not floor/gradient violations • Tests not yielding standard scores • Deciles, percentiles, quartiles, stanines • Rasch-model tests • Preclude direct inspection of raw score-standard score relationships • WJ family: WJ III, WRMT-R/NU, WDRB

  25. Validity: Are the Results Meaningful? • Content validity: Effectiveness in assessing the relevant domain • Criterion-related validity: Effectiveness in predicting performance now (concurrent validity) or later (predictive validity) • Construct validity: Effectiveness in measuring what the test is supposed to measure • Criteria: Evidence of all three types of validity for the target population

  26. Content Validity: Are Tests Assessing the Same Domain?

  27. Predictive and Diagnostic Validity • Does the test predict reading outcomes for the target age/grade group? • Concurrent vs. predictive validity evidence • Does the test differentiate between students with and without reading problems? • Group differentiation studies

  28. The Rest of the Story: Usability Considerations • Usability often has more influence in test selection and use than technical adequacy. • “I know how to give it.” • “It doesn’t take long to give.” • “It’s easy to carry around.” • “I think I saw one in the storage closet.”

  29. Practical Characteristics • Test construction • Administration • Accommodations and adaptations • Scores and scoring • Interpretation • Links to intervention • Checklist 2: Evaluating Usability

  30. The Critical Usability Issue in Diagnostic Assessment • Is there evidence that test results can be used to design instruction to address the reading deficits that have been identified?

  31. The Diagnostic Assessment Process • What can we learn from the results of screening and/or progress monitoring measures? • Are there weaknesses in fluency, phonics, or phonemic awareness? • What can we learn from the results of outcome measures (if available)? • Are there weaknesses in vocabulary and/or comprehension?

  32. Types of Students with Reading Problems (diagram): students with specific phonological processing problems; students with global language deficits; reading performance problems; attentional problems; disruptive behavior problems

  33. Future Language Deficits?

  34. Identified Deficit (diagram mapping the identified deficit to assessment domains): Comprehension, Fluency, Phonics, Vocabulary, Phonemic Awareness, Reading-Related Cognitive Abilities

  35. The Critical Role of Fluency

  36. Issues in Assessing Fluency • Floor effects common • Task variations: foundational skills vs. word reading vs. contextual reading • Variations in level of text difficulty • Oral vs. silent reading formats • Interexaminer variance • Differences in fluency definitions

  37. Fluency Options • BEAR = WPM + Fluency Scale • CBM (student’s own text) = WCPM • CBM (DIBELS) = WCPM • GORT–4 Rate & Fluency = SS, PR, GE, AE • FOX Fluency = WCPM + Fluency Scale • Virginia PALS = WPM + Fluency Scale • Center City Consortium PALS = WCPM • TPRI = WCPM

  38. Best Practices in Assessing Fluency • Administer graded passages with documented readability levels. • Use WCPM (words correct per minute) as the fluency metric. • Assess at the passage level (i.e., more than 1 minute of reading). • Take running records to obtain diagnostic and intervention planning information. • Beware of floor effects in norm-referenced tests.
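Since WCPM is the recommended metric, here is a minimal sketch of the computation: words read minus uncorrected errors, prorated to a one-minute rate. The function name and the sample numbers are illustrative, not drawn from the presentation.

```python
def words_correct_per_minute(words_read: int, uncorrected_errors: int, seconds: float) -> float:
    """WCPM: words read correctly in a timed passage, scaled to a per-minute rate."""
    return (words_read - uncorrected_errors) * 60.0 / seconds

words_correct_per_minute(112, 7, 90)  # hypothetical 90-second reading -> 70.0 WCPM
```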

  39. Phonics Subskills

  40. Issues in Assessing Phonics • Wide differences in content coverage for alphabet knowledge • WJ III Letter-Word ID: 13 letters • TERA–3 Alphabet: 13 letters • ERDA–2 Letter Recognition: 26 letters • WRMT–R/NU Letter ID: 51 letters • Floor effects common for pseudoword reading and spelling tests

  41. Phonics Issues, II • Differences in task types • Pseudoword reading = recognition • Spelling = recall (more sensitive) • Differences in pseudoword construction • vake = many neighbors (easier to read) • vaik = few neighbors (harder to read) • Pseudoword reading tests vulnerable to examiner variance and interscorer inconsistency

  42. Alphabet Knowledge Options (NS = nonstandardized, NR = norm-referenced, CR = criterion-referenced) • Book Buddies NS • CORE Phonics Survey NS • ERDA–2 NR • FOX CR • PALS CR • TPRI CR • Random letter arrays NS

  43. Spelling Options: Looking in through the “Phonics Window” • Book Buddies (NS - developmental scoring) • CORE Phonics Survey (CR) • FOX (CR) • PALS (CR - developmental scoring) • TPRI (CR) • WIAT–II Spelling (NR) • WJ III Spelling, Spelling of Sounds (NR)

  44. Pseudoword Reading Options • CORE Phonics Survey NS • ERDA–2/WIAT–II NR • FOX Decoding & Sight Words CR • PAT Decoding NR & CR • Phonics-Based Reading Test NR & CR • WRMT–R/NU Word Attack NR • WJ III Word Attack NR • Informal pseudoword measures

  45. Best Practices in Assessing Phonics • Assess all relevant phonics components. • Select measures with adequate content coverage. • Include both recognition (pseudoword reading) and recall measures (spelling). • Include developmental spelling measures with differentiated scoring systems.

  46. Phonological vs. Phonemic Awareness • Phonological awareness: General awareness of the sound structure of language, as opposed to its meaning • Phonemic awareness: Understanding that speech is composed of individual sounds that can be analyzed and manipulated

  47. Issues in Assessing Phonemic Awareness • Variations in linguistic unit, presentation and response formats, coverage, item types, and scoring (all or nothing vs. partial credit) • Variations in predictive power, depending on children’s stage of literacy development • Vulnerable to examiner and interscorer variance, especially for live-voice measures

  48. Which skills are being measured and how?

  49. Phonemic Awareness Options • CTOPP (7 tasks) NR • FOX (7 tasks) CR • LAC-3 (2 tasks) NR & CR • PALS (4 tasks) CR • PAT (6 tasks) NR & CR • TPRI (5 tasks) CR

  50. Best Practices in Assessing Phonemic Awareness • Select multiple measures with adequate content coverage for the domain. • Maximize diagnostic power by matching measures to children’s stage of literacy development. • Use individually administered measures with oral response formats. • Provide training and reliability checks for complex and live-voice measures.
