Using the IRT and Many-Facet Rasch Analysis for Test Improvement

Presentation Transcript

  1. Using the IRT and Many-Facet Rasch Analysis for Test Improvement. Desislava Dimitrova, Dimitar Atanasov, New Bulgarian University. “Aligning Training and Testing in Support of Interoperability”, BILC Seminar, 10-15 October 2010, Varna

  2. Outline • Examination procedure • Main concepts and observations • Socio-cognitive test validation framework, Cyril Weir (2005) and criteria • Scoring validity for listening and reading parts of the test • Scoring validity for essay

  3. Test structure 1. Listening paper: two tasks • 15 MCQ 2. Reading paper: five tasks • 6 items matching response format • 10 items bank-cloze response format • 10 items open-cloze response format • 16 items short-answer response format • 2 open-ended questions • 5 MCQ 3. Essay: 180-220 words

  4. Too much? • The concept of communicative language ability (CEFR) • The concept of test usefulness (Bachman) • The concept of justifying the use of language assessment in the real world (Bachman) • The concept of validity • The Code of Practice (ALTE*, for example) *Association of Language Testers in Europe

  5. Statements: The NBU exam is high-stakes. The NBU exam is criterion-oriented. The NBU exam is ‘independent’. Evidence for test validation had not been established, BUT there was a routine practice for the test development process and test administration.

  6. The Socio-cognitive Framework for test validation, Cyril Weir (2005). Test-taker characteristics and: • Context validity • Theory-based validity • Scoring validity • Consequential validity • Criterion-related validity

  7. “Before-the-test event”: • Context validity • Theory-based validity. “After-the-test event”: • Scoring validity • Consequential validity • Criterion-related validity

  8. Scoring validity for the listening and reading parts of the test is established by: • Item analysis • Internal consistency • Error of measurement • Marker reliability. Not just looking at them! Investigate, discuss, learn and make decisions!
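
Two of the indices listed on this slide, internal consistency and error of measurement, can be illustrated with a minimal sketch. It assumes a small matrix of dichotomous item scores (one row per test taker) and uses Cronbach's alpha as the internal-consistency coefficient; the slides do not say which coefficient the authors used, and the function names and data below are hypothetical.

```python
from statistics import pvariance, pstdev

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one row of item scores per test taker)."""
    k = len(scores[0])  # number of items
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1.0 - sum(item_vars) / total_var)

def standard_error_of_measurement(scores, reliability):
    """SEM = SD of total scores * sqrt(1 - reliability)."""
    sd_total = pstdev([sum(row) for row in scores])
    return sd_total * (1.0 - reliability) ** 0.5

# Hypothetical data: 4 test takers, 3 items scored 0/1.
demo_scores = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
alpha = cronbach_alpha(demo_scores)
sem = standard_error_of_measurement(demo_scores, alpha)
print(round(alpha, 3), round(sem, 3))
```

With real data the rows would come from the scored listening and reading papers; the SEM is expressed in raw-score points.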

  9. Analysis: 3-parameter IRT model. Advantages: • Item parameter estimates are independent of the group of examinees used • Test taker ability estimates are independent of the particular set of items used. Degree of difficulty; to specify the discrimination; to specify the content
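
The slide names the 3-parameter IRT model without showing it. As a minimal sketch, the standard 3PL item response function is written out below; the parameter names (a for discrimination, b for difficulty, c for the pseudo-guessing lower asymptote, theta for ability) follow common usage, and the item values are hypothetical, not taken from the NBU data.

```python
import math

def p_correct_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model.

    theta: test-taker ability
    a: item discrimination, b: item difficulty, c: pseudo-guessing parameter
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical four-option MCQ item: guessing floor of 0.25.
for theta in (-2.0, 0.0, 2.0):
    print(theta, round(p_correct_3pl(theta, a=1.0, b=0.0, c=0.25), 3))
```

When ability equals the item difficulty, the probability lies halfway between c and 1, which is why difficulty under the 3PL model does not correspond to a 50% chance of success.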

  10. Summer session, 2010

  11. Possible decisions • Remedial procedures • Classroom assessment • Only certification decision

  12. Scoring validity for writing is established by: • Criteria/rating scale • Rating procedures: rater training, standardization, rating conditions, rating, moderation, statistical analysis • Raters • Grading

  13. Conclusion for the essay: Good: • Two raters • Analytic writing scale • Rubrics and input. Negative: • The score depends on the raters • No task-specific scale • No standardization
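
The title refers to many-facet Rasch analysis, but the transcript does not show the model itself. A commonly used formulation (Linacre's rating-scale form) is given here as an assumption about what was applied to the essay ratings; it makes the rater effect noted on this slide explicit as a separate severity term.

```latex
\log\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = B_n - D_i - C_j - F_k
```

Here $P_{nijk}$ is the probability that test taker $n$ receives category $k$ from rater $j$ on criterion (or task) $i$, $B_n$ is the test taker's ability, $D_i$ the difficulty of the criterion, $C_j$ the severity of the rater, and $F_k$ the difficulty of the step from category $k-1$ to $k$.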

  14. It is now a fact that we will continue our work on: • item writers’ training • content and statistical specification of the items • test review and test revision

  15. Sharing: Investigation (small steps to “strong” validity). Comparison (language ability of the same population at the same level). Cooperation (in a research project).

  16. Thank you New Bulgarian University www.nbu.bg