Reading Assessment: Still Time for a Change

Presentation Transcript

  1. Reading Assessment: Still Time for a Change • P. David Pearson, UC Berkeley, Professor and Former Dean • Slides available at

  2. Why did I pick such a boring topic? • I’m a professor! • Who needs fun? • The consequences are too grave. • I have a perverse standard of fun.

  3. Valencia and Pearson (1987). Reading Assessment: Time for a Change. In The Reading Teacher.

  4. What is Thinking? • You do it in your head, without a pencil. (Alexandra, age 4) • You shouldn’t do it in the dark. It’s too scary. (Thomas, age 5)

  5. What is Thinking? • Thinking is when you’re doing math and getting the answers right. (Sissy, age 5) • And in response… • NO! You do the thinking when you DON’T know the answer. (Alex, age 5)

  6. What is Thinking? • It’s very, very easy. The way you do it is just close your eyes and look inside your head. (Robert, age 4)

  7. What is Thinking? • You think before you cross the street! • What do you think about? • You think about what you would look like smashed up! (Leon, age 5)

  8. What is Thinking? • You have to think in swimming class. • About what? • About don’t drink the water because maybe someone peed in it…and don’t drown!

  9. Why Did We Take This Stance? • We need a mini-history of assessment to understand our motives.

  10. The Scene in the US in the 1970s and early 1980s • Behavioral objectives • Mastery learning • Criterion-referenced assessments • Curriculum-embedded assessments • Minimal competency tests: New Jersey • Statewide assessments: Michigan & Minnesota

  11. Historical relationships between instruction and assessment: The 1970s • Diagram: Skill 1 and Skill 2, each with its own Teach, Assess, Conclude cycle • Skills management mentality: teach a skill, assess it for mastery, reteach it if necessary, and then go on to the next skill. • Foundation: Benjamin Bloom’s ideas of mastery learning

  12. The 1970s, cont. • Diagram: Skills 1 through 6, each with its own Teach, Assess, Conclude cycle • And we taught each of these skills until we had covered the entire curriculum for a grade level.

  13. Dangers in the Mismatch we Saw in 1987 • False sense of security • Instructionally insensitive to progress on new curricula • Accountability will do us in and force us to teach to the tests and all the bits and pieces

  14. Pearson’s First Law of Assessment • The finer the grain size at which we monitor a process like reading and writing, the greater the likelihood that we will end up teaching and testing bits and pieces rather than global processes like comprehension and composition.

  15. The ideal • The best possible assessment: teachers observe and interact with students as they read authentic texts for genuine purposes. • They evaluate the way in which the students construct meaning, intervening to provide support or suggestions when the students appear to have difficulty.

  16. Pearson’s Second Law of Assessment • An assessment tool is valued to the degree that it can approximate the good judgment of a professional teacher!

  17. A new conceptualization of the goal

  18. A 1987 Agenda for the Future

  19. Pearson’s Third Law of Assessment • When we ask an assessment to serve a purpose for which it was not designed, it is likely to crumble under the pressure, leading to invalid decisions and detrimental consequences.

  20. Early 1990s in the USA • Standards-based reform • State initiatives • IASA model • Trading flexibility for accountability • Move from being accountable for the means and leaving the ends up for grabs (doctor or lawyer model) TO being accountable for the ends and leaving the means up for grabs (carpenter or product model)

  21. Mid 1990s Developments • Assessment got situated within the standards movement • Content Standards: Know and be able to do? • Performance Standards: What counts? • Opportunity to Learn Standards: Quid pro quo?

  22. Standards-Based Reform: The Initial Theory of Action • Diagram elements: Standards; Assessment; Accountability; Clear Expectations; Motivation; Higher Student Learning • À la Tucker and Resnick in the early 1990s

  23. Expanded Theory of Action • Diagram elements: Standards; Assessment; Accountability; Clear Expectations; Motivation; Instruction; Professional Development; Higher Student Learning • À la Elmore and Resnick in the late 1990s

  24. The Golden Years of the 90s? • A flying start in the late 1980s and early 1990s • International activity in Europe, Down Under, North America • Developmental Rubrics • Performance Tasks • New Standards • CLAS • Portfolios of various sorts: storage bins; showcase (best work); compliance (Walden, NYC) • Increase the use of constructed response items in NRTs

  25. Late 1980s/early 1990s: Portfolios, Performance Assessments, Make Assessment Look Like Instruction • Diagram: Activities, from which we draw Conclusions on standards 1-n • We engage in instructional activities, from which we collect evidence which permits us to draw conclusions about student growth or accomplishment on several dimensions (standards) of interest.

  26. The complexity of modern assessment practices: one to many • Diagram: Activity X linked to Standards 1 through 5 • Any given activity may offer evidence for many standards, e.g., responding to a story.

  27. The complexity of performance assessment practices: many to one • Diagram: Standard X linked to Activities 1 through 5 • For any given standard, there are many activities from which we could gather relevant evidence about growth and accomplishment, e.g., reads fluently.

  28. The complexity of portfolio assessment practices: many to many • Diagram: Activities 1 through 5 cross-linked with Standards 1 through 5 • Any given artifact/activity can provide evidence for many standards • Any given standard can be indexed by many different artifacts/activities
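
The many-to-many relationship on slides 26 through 28 can be pictured as a pair of indexes over the same evidence. The following is only a minimal sketch: the activity and standard names are invented for illustration and do not come from the talk, and `index_evidence` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical evidence log: each entry says "this activity produced
# evidence relevant to this standard". All names are invented.
evidence = [
    ("story response", "comprehension"),
    ("story response", "written composition"),
    ("oral reading", "fluency"),
    ("reading log", "fluency"),
    ("reading log", "comprehension"),
]

def index_evidence(pairs):
    """Build both views of the same (activity, standard) pairs."""
    by_activity = defaultdict(set)  # one activity -> many standards
    by_standard = defaultdict(set)  # one standard -> many activities
    for activity, standard in pairs:
        by_activity[activity].add(standard)
        by_standard[standard].add(activity)
    return dict(by_activity), dict(by_standard)

by_activity, by_standard = index_evidence(evidence)
```

Here `by_activity["story response"]` holds two standards (one to many), `by_standard["fluency"]` holds two activities (many to one), and the two dictionaries together describe the many-to-many case.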

  29. The perils of performance assessment: or maybe those multiple-choice assessments aren’t so bad after all… • "Thunder is a rich source of loudness." • "Nitrogen is not found in Ireland because it is not found in a free state."

  30. The perils of performance assessment • "Water is composed of two gins, Oxygin and Hydrogin. Oxygin is pure gin. Hydrogin is gin and water." • "The tides are a fight between the Earth and moon. All water tends towards the moon, because there is no water in the moon, and nature abhors a vacuum. I forget where the sun joins in this fight."

  31. The perils of performance assessment • "Germinate: To become a naturalized German." • "Vacumm: A large, empty space where the pope lives." • "Momentum is something you give a person when they go away."

  32. The perils of performance assessment • "The cause of perfume disappearing is evaporation. Evaporation gets blamed for a lot of things people forget to put the top on." • "Mushrooms always grow in damp places which is why they look like umbrellas." • "Genetics explains why you look like your father, and if you don't, why you should."

  33. The perils of performance assessment • "When you breath, you inspire. When you do not breath, you expire."

  34. Post 1996: The Demise of Performance Assessment • A definite retreat from performance-based assessment as a wide-scale tool • Psychometric issues • Cost issues • Labor issues • Political issues

  35. The Remains… • Still alive inside classrooms and schools • Hybrid assessments based on the NAEP model: multiple-choice, short answer, extended response • The persistence of standards-based reform

  36. No Child Left Behind • Accountability in Spades • Every grade level reporting • Census assessment rather than sampling (everybody takes the same test) • Disaggregated reporting by income, exceptionality, language, ethnicity

  37. NCLB, continued • Assessments for varied purposes: placement, progress monitoring, diagnosis, outcomes/program evaluation • Scientifically based curriculum too

  38. There is good reason to worry about disaggregation • Chart: achievement (low to high) for School 1 and School 2

  39. Disaggregation and masking: Simpson’s Paradox? • Chart: achievement (low to high) for groups A and B in School 1 and School 2, with large-N and small-N bars • Height of bar = average achievement; width = number of students
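
A small worked example may make the masking concern on slide 39 concrete. The enrollment counts and group means below are invented for illustration (none of them come from the talk), and `weighted_mean` is a hypothetical helper. The sketch assumes that each school's overall score is simply the enrollment-weighted average of its group means.

```python
def weighted_mean(groups):
    """Overall mean from {group: (n_students, group_mean)}."""
    total_n = sum(n for n, _ in groups.values())
    return sum(n * mean for n, mean in groups.values()) / total_n

# Hypothetical data: bar width = n_students, bar height = group mean.
school_1 = {"A": (90, 70.0), "B": (10, 40.0)}  # large-N A, small-N B
school_2 = {"A": (10, 75.0), "B": (90, 45.0)}  # small-N A, large-N B

overall_1 = weighted_mean(school_1)  # 67.0
overall_2 = weighted_mean(school_2)  # 48.0

# School 2 does better with group A (75 vs 70) and with group B (45 vs 40),
# yet its aggregate is lower because of how its enrollment is distributed.
```

In this made-up case the aggregate alone would rank School 1 well above School 2, while the disaggregated means rank School 2 higher for both groups, which is the pattern usually called Simpson's paradox.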

  40. Disaggregation: Damned if we do and damned if we don’t • Don’t report: render certain groups invisible • Do report: blame the victim (they are the group that did not meet the standard).

  41. Pearson’s Fourth Law of Assessment • Disaggregation is the right approach to reporting results. Just be careful where the accountability falls.

  42. Pearson’s Fourth Law: A Corollary • Accountability, in general, falls to the lowest level of reporting in the system.

  43. Assessment can be the friend or the enemy of teaching and learning • The Dark Side: the curious case of DIBELS … and other benchmark assessments

  44. A word about benchmark assessments… • The world is filled with assessments that provide useful information… • But are not worth teaching to • They are good thermometers or dipsticks • Not good curriculum

  45. The ultimate assessment dilemma… • What do we do with all of these timed tests of fine-grained skills: • Words correct per minute • Words recalled per minute • Letter sounds named per minute • Phonemes identified per minute • Scott Paris: Constrained versus unconstrained skills • Pearson: Mastery skills versus growth constructs

  46. Why they are so seductive • Mirror at least some of the components of the NRP report • Correlate with lots of other assessments that have the look and feel of real reading • Take advantage of the well-documented finding that speed metrics are almost always correlated with ability, especially verbal ability • Example: alphabet knowledge • 90% of the kids might be 90% accurate but… • They will be normally distributed in terms of LNPM
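
The alphabet-knowledge example on slide 46 can be illustrated with a toy class. The ten children and their scores below are invented, and LNPM is read here as a letters-per-minute rate, which is an assumption about the slide's abbreviation. The sketch only shows that a rate measure can spread children out when accuracy is bunched near the ceiling; it does not demonstrate normality.

```python
from statistics import mean, pstdev

# Hypothetical class of ten children (invented for illustration only).
accuracy = [0.92, 0.96, 0.92, 1.00, 0.92, 0.96, 0.92, 0.96, 0.96, 0.65]
lnpm = [18, 25, 31, 36, 40, 44, 48, 53, 60, 12]  # assumed per-minute rate

def coefficient_of_variation(values):
    """Spread relative to the mean (population standard deviation / mean)."""
    return pstdev(values) / mean(values)

# Children who are at least 90% accurate.
at_ceiling = [i for i, a in enumerate(accuracy) if a >= 0.90]
share_at_ceiling = len(at_ceiling) / len(accuracy)

# Compare relative spread of the two measures among those children.
cv_accuracy = coefficient_of_variation([accuracy[i] for i in at_ceiling])
cv_rate = coefficient_of_variation([lnpm[i] for i in at_ceiling])
```

With these made-up numbers, nine of the ten children clear 90% accuracy and look nearly identical on that measure, while their per-minute rates differ widely, so `cv_rate` comes out much larger than `cv_accuracy`.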