
CAT’s Journey in Georgia






Presentation Transcript


  1. CAT’s Journey in Georgia

  2. Introduction of GCAT • Decision: Fall 2010 • Visit to CITO: November 2010 • Tryouts: December 2010 • First database of items (3P): December 2010 • Calibration, fine-tuning: January 2011 • Algorithm, software: February 2011 • Simulations, infrastructure: March 2011 • Large-scale pre-test: April 2011 • First GCAT: May 2011

  3. History of CAT in Georgia • Used in School Leaving Exams • Administered yearly to 12th- and 11th-graders* • Usually administered at the end of the school year (May-June) • 8 subjects: Georgian Language, Foreign Language, History, Geography, Mathematics, Physics, Chemistry, Biology

  4. Scale and stakes • About 40,000 students take the test each year • A passing grade in all 8 subjects is necessary to obtain a school leaving certificate • The school leaving certificate is needed to enter university, to work in the civil sector, etc. • If failed, the exam can be retaken the next year

  5. What is CAT? • Computerized Adaptive Testing • Administered using a computer • The test is formed “on the fly”, adapting to the student’s performance • Correct equating of the results is achieved using Item Response Theory (IRT) • Result: tailor-made tests for each student, with standardized scores

  6. Analogy: 20 Questions Game • I am thinking of something. • You have 20 “yes-or-no” questions to figure it out. • What is the best strategy? • Is it writing up a set of 20 questions ahead of time? • Is it a living thing? • Is it a vegetable? • Is it red? • Is it bigger than a human being? • …

  7. 20 Questions Game • Isn’t it a better strategy to base the next question on the replies to the previous ones? • In the absence of information, start with something that has a 50/50 chance of being true. • As information builds up along the way, ask more precise questions

  8. Game Test Run • Is it a living thing?

  9. Game Test Run • Is it a living thing? YES

  10. Game Test Run • Is it a living thing? YES • Is it a wild animal?

  11. Game Test Run • Is it a living thing? YES • Is it a wild animal? NO

  12. Game Test Run • Is it a living thing? YES • Is it a wild animal? NO • Is it bigger than a human?

  13. Game Test Run • Is it a living thing? YES • Is it a wild animal? NO • Is it bigger than a human? NO

  14. Game Test Run • Is it a living thing? YES • Is it a wild animal? NO • Is it bigger than a human? NO • Is it furry?

  15. Game Test Run • Is it a living thing? YES • Is it a wild animal? NO • Is it bigger than a human? NO • Is it furry? YES

  16. It’s a CAT

  17. Same principle used in CAT • The computer keeps track of the student’s pattern of responses so far • As the test progresses, we learn more about the student’s ability • The computer chooses the next item to get maximal information about the student’s level of ability • Purpose of assessment: get the best possible information about the student’s ability

  18. Why CAT? • Measurement precision • More information with fewer items • Security • Large item bank, individual test forms • Equating • Done automatically, using Item Response Theory (IRT) • Good predictability • Using simulations and IRT

  19. Item Response Theory • Also called Latent Trait Theory • Assumes the “thing to be measured” is a single entity expressible as a number – call it True Ability, usually denoted by θ • Assumes that the student’s ability is related in a specific probabilistic way to the response the student gives to a particular item • Why probability?

  20. Why probability? • Approach: measure a student’s ability in terms of how difficult an item she can solve. But how? • [Diagram: a scale of items from easy to hard; the student does everything near the easy end, has a 75% chance of solving a random item around her own level, and does little near the hard end.]

  21. Item Response Function • The student’s ability and the item parameters determine the probability of a correct response. • In the two-parameter logistic model (2PL), this probability is P(θ) = 1 / (1 + e^(−a(θ − b))), where θ is the student’s ability, b the item difficulty, and a the item discrimination.
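A minimal sketch of the 2PL response function in Python (illustrative only, not the GCAT software; the parameter names follow standard IRT notation: theta for ability, a for discrimination, b for difficulty):

```python
import math

def p_correct_2pl(theta, a, b):
    """Probability of a correct response under the 2PL model:
    P(theta) = 1 / (1 + exp(-a * (theta - b)))."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# An average-difficulty item (b = 0, a = 1.2) answered by students of
# below-average, average, and above-average ability:
for theta in (-1.0, 0.0, 1.0):
    print(theta, round(p_correct_2pl(theta, a=1.2, b=0.0), 3))
```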

  22.–25. Information functions [four slides of item information function plots, not captured in the transcript]
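The plots themselves are not in the transcript. As background (standard 2PL IRT, not taken from the slides), the item information function is I(θ) = a² · P(θ) · (1 − P(θ)); it peaks where the student’s ability is close to the item’s difficulty, which is why an adaptive test keeps choosing items near the current ability estimate. A minimal sketch:

```python
import math

def item_information_2pl(theta, a, b):
    """Fisher information of a 2PL item at ability theta:
    I(theta) = a^2 * P(theta) * (1 - P(theta)).
    Information peaks where the ability is close to the item difficulty b."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

# The same item (b = 0) tells us most about students whose ability is near 0:
for theta in (-2.0, 0.0, 2.0):
    print(theta, round(item_information_2pl(theta, a=1.2, b=0.0), 3))
```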

  26. CAT Algorithm • Administer 3 random items • Estimate ability • Choose the next item (maximum information) • Administer the item • Check stopping conditions: if NO, return to estimating ability; if YES, estimate ability, scale and display the score, and terminate.
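A minimal, self-contained Python sketch of the loop on this slide. The item bank format, the grid-based maximum-likelihood estimator, and the fixed-length stopping rule are illustrative assumptions, not the actual GCAT implementation:

```python
import math
import random

def p2pl(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def info(theta, a, b):
    """2PL item information at ability theta."""
    p = p2pl(theta, a, b)
    return a * a * p * (1.0 - p)

def estimate_theta(responses):
    """Crude maximum-likelihood ability estimate over a grid from -4 to 4.
    responses: list of (a, b, correct) tuples."""
    grid = [g / 10.0 for g in range(-40, 41)]
    def loglik(theta):
        total = 0.0
        for a, b, correct in responses:
            p = min(max(p2pl(theta, a, b), 1e-6), 1.0 - 1e-6)
            total += math.log(p if correct else 1.0 - p)
        return total
    return max(grid, key=loglik)

def run_cat(bank, answer, max_items=30):
    """bank: list of dicts with keys 'a' and 'b'.
    answer: callable taking an item dict and returning True/False.
    Returns the final ability estimate."""
    used, responses = set(), []
    # 1. Start with 3 random items.
    for i in random.sample(range(len(bank)), 3):
        used.add(i)
        responses.append((bank[i]["a"], bank[i]["b"], answer(bank[i])))
    # 2. Loop: estimate ability, pick the most informative unused item,
    #    administer it, then check the (simplified) stopping condition.
    while len(responses) < max_items:
        theta_hat = estimate_theta(responses)
        i = max((j for j in range(len(bank)) if j not in used),
                key=lambda j: info(theta_hat, bank[j]["a"], bank[j]["b"]))
        used.add(i)
        responses.append((bank[i]["a"], bank[i]["b"], answer(bank[i])))
    # 3. Final ability estimate (scaling to a reported score is omitted).
    return estimate_theta(responses)
```

To exercise the sketch, a simulated student with true ability 1.0 can be plugged in as answer = lambda item: random.random() < p2pl(1.0, item["a"], item["b"]) against a randomly generated bank, and the returned estimate should land near 1.0.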

  27. Typical run of the CAT (2011, Geo) [plot of the ability estimate against the item number]

  28. Issues • Content validity across subdomains • A certain proportion of items across subject domains must be maintained • Item exposure control • The most informative items tend to get overused • Difficulty control • A student with low ability might get an overly difficult item, and a high-ability student might get an overly easy one • New item calibration • To replenish the item bank, new items need to be tested in realistic conditions

  29. Item Bank (Mathematics)

  30. Exposure Control • Every 5th item is chosen at random from a difficulty interval around the current ability estimate. • Overexposed (~3000 views) items are suspended.
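A hedged sketch of these two rules combined with the maximum-information default. The ~3000-view limit and the every-5th-item rule come from the slide; the width of the difficulty window and all names are assumptions (info_fn would be an item information function such as the one sketched earlier):

```python
import random

EXPOSURE_LIMIT = 3000     # items with roughly 3000 views are suspended (from the slide)
RANDOM_EVERY = 5          # every 5th item is drawn at random near the ability estimate
DIFFICULTY_WINDOW = 0.5   # assumed half-width of the difficulty interval

def next_item(bank, theta_hat, item_number, info_fn):
    """Pick the next item from a bank of dicts with keys 'a', 'b', 'views'."""
    # Suspend overexposed items.
    available = [it for it in bank if it["views"] < EXPOSURE_LIMIT]
    if item_number % RANDOM_EVERY == 0:
        # Every 5th item: random choice from a difficulty interval around theta_hat;
        # fall back to the whole available pool if the interval is empty.
        near = [it for it in available
                if abs(it["b"] - theta_hat) <= DIFFICULTY_WINDOW]
        return random.choice(near or available)
    # Otherwise: maximum-information selection.
    return max(available, key=lambda it: info_fn(theta_hat, it["a"], it["b"]))
```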

  31. Exposure for math items (subdomain 4)

  32. Solutions • Difficulty control • The item is chosen from a restricted difficulty interval surrounding the current ability estimate of the student. • New items • Pilot (unscored) items are administered to each student at regular intervals during the actual test.
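For the new-item solution, a small sketch of interleaving unscored pilot items at regular positions in the test; the interval length is an assumption, since the slide only says “at regular intervals”:

```python
import random

PILOT_EVERY = 10   # assumed interval between pilot items

def maybe_pilot_item(item_number, pilot_pool):
    """Return an unscored pilot item at regular positions in the test, else None.
    Responses to pilot items are kept for later calibration and do not
    affect the student's ability estimate or final score."""
    if pilot_pool and item_number % PILOT_EVERY == 0:
        return random.choice(pilot_pool)
    return None
```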

  33. The journey continues…
