
Item Analysis - Outline




Presentation Transcript


  1. Item Analysis - Outline 1. Types of test items • Selected response items • Constructed response items 2. Parts of test items 3. Guidelines for writing test items

  2. Item Analysis - Outline 4. Item Analysis • Distracter measures • Item difficulty measures • Item discrimination measures 5. Item Response Theory • ICCs (item characteristic curves) • Adaptive testing

  3. 1. Types of test items • Selected response • Multiple choice • Likert scale • Category • Q-sort • Constructed response

  4. Multiple choice or forced choice Task is to choose among a fixed set of answers Advantage: ease of scoring Advantage: scoring requires little skill Disadvantage: may test memory rather than comprehension A. Selected response

  5. Multiple choice or forced choice Correct response must be distinct Distracters should not be obvious or ambiguous If distracters are bad, adding more of them makes the test less reliable Use 3-4 distracters per item A. Selected response

  6. Multiple choice or forced choice Likert format Test-taker chooses a point on a scale that expresses their attitude or belief Data lend themselves to factor analysis A. Selected response

  7. Likert scale example item: "Parking costs at the university are fair." 1 = Strongly agree, 2 = Agree, 3 = Neutral, 4 = Disagree, 5 = Strongly disagree

  8. Multiple choice or forced choice Likert format Category Similar to Likert but with more choices, which demands more commitment from the test-taker Reliability depends on good instructions and the number of categories (≤ 10) Scoring shows context effects A. Selected response

  9. Multiple choice or forced choice Likert format Category Q-sort A large set of cards, each with a statement referring to a "target" Test-taker sorts cards into piles in terms of how accurate the statements are as a description of the target Generally 9 piles A. Selected response

  10. 1. Types of test items • Selected response • Constructed response • Free response • Fill-in-the-blank • Essay tests • Portfolios • In-basket technique

  11. Free response Test-taker responds without constraint Describes what is important to him/her B. Constructed response items

  12. Free response Fill-in-the-blank Used to test for knowledge or to find out about beliefs and attitudes B. Constructed response items

  13. Free response Fill-in-the-blank Essay tests Preferred when you want to assess the test-taker's ability to think analytically, integrate ideas, and express him/herself B. Constructed response items

  14. Free response Fill-in-the-blank Essay tests Portfolios Not really a test Collections of things the person being evaluated has produced Let you evaluate things you can’t assess with a selected response test B. Constructed response items

  15. Free response Fill-in-the-blank Essay tests Portfolios In-basket technique Used in business Job candidate gets a set of “everyday” problems, says how he or she would deal with those problems Requires expert raters to grade response B. Constructed response items

  16. Strengths Assess higher-order skills More useful feedback to test-taker Positive influence on study habits? Easier to create items B. Constructed response items

  17. Weaknesses Time consuming to use Possible subjectivity in scoring B. Constructed response items

  18. 2. Parts of test items • Stimulus or item stem • Response format or method • Conditions governing the response • Procedures for scoring the response

  19. Stimulus or item stem What the subject responds to 2. Parts of test items

  20. Response format or method Typically multiple choice or constructed response 2. Parts of test items

  21. Conditions governing the response e.g., time limits; allowing probes for ambiguous responses; how response is recorded... 2. Parts of test items

  22. Procedures for scoring the response particularly important for constructed response items 2. Parts of test items

  23. To some extent, your choices on each of these parts will be dictated by: Precedent What did you do last time? Experience Did that work? Practical considerations How many people have to be tested? How much time is available? 2. Parts of test items

  24. 3. Writing test items – guidelines • Define clearly • Generate a pool of potential items • Monitor reading level • Use unitary items • Avoid long items • Break any response “set”

  25. Define clearly Why are you testing? What do you want to know? 3. Writing test items – guidelines

  26. Define clearly Generate a pool of potential items The larger the pool of items you select from, the better the test Selection from this pool based on item-analysis (see below) 3. Writing test items – guidelines

  27. Define clearly Generate a pool of potential items Monitor reading level Level too low? More sophisticated test-takers may get bored. Level too high? You're testing reading skill as well as the domain you think you're testing. 3. Writing test items – guidelines

  28. Define clearly Generate a pool of potential items Monitor reading level Use unitary items Then the meaning of the response is clear 3. Writing test items – guidelines

  29. Define clearly Generate a pool of potential items Monitor reading level Use unitary items Avoid long items Longer items are more likely to be misinterpreted by test-takers Short items are more likely to be unitary 3. Writing test items – guidelines

  30. Define clearly Generate a pool of potential items Monitor reading level Use unitary items Avoid long items Break any response "set" Use reverse-scored items to prevent test-takers from getting into a response set such as just responding "5" for every item on a Likert scale 3. Writing test items - guidelines
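
As a sketch of reverse scoring in Python (the 1-5 scale and the item indices below are assumptions for illustration, not from the slides):

```python
def reverse_score(response: int, scale_min: int = 1, scale_max: int = 5) -> int:
    """Recode a reverse-keyed Likert response, e.g. 5 -> 1 on a 1-5 scale."""
    return scale_max + scale_min - response

# A test-taker in a response set answers "5" to every item.
responses = [5, 5, 5, 5]
reversed_items = {1, 3}  # hypothetical indices of the reverse-keyed items
scored = [reverse_score(r) if i in reversed_items else r
          for i, r in enumerate(responses)]
print(scored)  # [5, 1, 5, 1] -- the response set no longer inflates the total
```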

  31. 4. Item analysis • Multiple choice distracter analysis • Item difficulty measure P • Discrimination index D • Item – total correlation

  32. How many people choose each distracter? Distracters should be equally attractive Correct choice should be based on knowledge Where knowledge is lacking, choice should be random A. Multiple choice – distracter measures
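
One way to check this is simply to tally how often each option is chosen. A minimal sketch, with invented responses to a single item keyed "B":

```python
from collections import Counter

# Invented answers to one multiple-choice item; "B" is the keyed response.
answers = ["B", "A", "B", "C", "B", "D", "B", "A", "B", "C"]
counts = Counter(answers)

# Among test-takers who miss the item, choices should be roughly uniform;
# a distracter that almost nobody picks is doing no work.
for option in "ABCD":
    print(option, counts.get(option, 0))
```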

  33. Difficulty determined by item and population tested P(i) = (# who got item i correct) / (# taking the test) B. Item Difficulty Measure P

  34. P = .50 is best. P = 0 or P = 1: such items do not distinguish ability levels B. Item Difficulty Measure P
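
A minimal sketch of computing P, assuming responses are stored as a 0/1 NumPy matrix (rows = test-takers, columns = items; the data are invented):

```python
import numpy as np

# 1 = correct, 0 = incorrect; five hypothetical test-takers, three items
scores = np.array([[1, 0, 1],
                   [1, 1, 0],
                   [0, 0, 1],
                   [1, 1, 1],
                   [1, 0, 0]])

# P(i) = proportion of test-takers who answered item i correctly
p = scores.mean(axis=0)
print(p)  # [0.8 0.4 0.6] -- values near 0 or 1 flag items that distinguish poorly
```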

  35. C. Item Discrimination Measures • Discrimination index D • Item-total correlation

  36. Extreme groups method U = # getting item correct in 'top' group L = # getting item correct in 'bottom' group nU = # in top group nL = # in bottom group D = U/nU - L/nL Discrimination Index D
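
A sketch of the extreme-groups calculation. The slide does not fix the group size; taking the top and bottom 27% of total scores is a common convention, and the data here are invented:

```python
import numpy as np

def discrimination_index(scores: np.ndarray, item: int, frac: float = 0.27) -> float:
    """D = U/nU - L/nL for one item, with groups formed on total test score."""
    order = np.argsort(scores.sum(axis=1))   # test-takers sorted by total score
    n = max(1, int(round(frac * len(order))))
    bottom, top = order[:n], order[-n:]
    return scores[top, item].mean() - scores[bottom, item].mean()

scores = np.array([[1, 0, 1],   # invented 0/1 response matrix
                   [1, 1, 0],
                   [0, 0, 1],
                   [1, 1, 1],
                   [0, 0, 0]])
print(discrimination_index(scores, item=0))  # close to +1 = good discriminator
```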

  37. Good item High correlation People who get item correct have high score on the test People who get item wrong have low score on the test Poor item Low correlation: look at wording – may be testing reading skill Item Total Correlation
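
A sketch of a corrected item-total correlation, where the item is excluded from the total so it does not correlate with itself (data invented):

```python
import numpy as np

def item_total_correlation(scores: np.ndarray, item: int) -> float:
    """Pearson r between one item's scores and the total of the other items."""
    rest = np.delete(scores, item, axis=1).sum(axis=1)
    return float(np.corrcoef(scores[:, item], rest)[0, 1])

scores = np.array([[1, 0, 1],
                   [1, 1, 0],
                   [0, 0, 1],
                   [1, 1, 1],
                   [0, 0, 0]])
print(item_total_correlation(scores, item=0))  # high r = item tracks the test
```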

  38. 5. Item Response theory • Item characteristic curves • Adaptive testing using computers

  39. Most important idea: Item Characteristic Curves (ICCs) One curve for each test item X axis: test-taker ability (given by test score) Y axis: probability of choosing an answer A. Item characteristic curves

  40. [Figure: item characteristic curves for Items 1, 2, and 3; x axis = test score, y axis = probability of correct response]

  41. Slope: how quickly the curve rises. Indicates how well the item discriminates among persons of differing abilities, like the discrimination index D in Classical Test Theory, but sample-invariant A. Item Characteristic Curves
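
The slides do not give a formula, but one common parametric form for an ICC is the two-parameter logistic (2PL) model, P(θ) = 1 / (1 + e^(-a(θ - b))), where a is the slope (discrimination) and b is the item's location (difficulty). A sketch with illustrative parameter values:

```python
import math

def icc_2pl(theta: float, a: float, b: float) -> float:
    """2PL item characteristic curve: P(correct) given ability theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# A steep item (a = 2.0) separates abilities near b = 0 much more sharply
# than a flat one (a = 0.5); both values are illustrative, not from the slides.
for theta in (-2, -1, 0, 1, 2):
    print(theta, round(icc_2pl(theta, 2.0, 0.0), 2), round(icc_2pl(theta, 0.5, 0.0), 2))
```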

  42. Obtaining stable estimates of IRT parameters requires rather large samples Computationally complex IRT model assumes that the trait being measured is one-dimensional. It may not be. Problems with Item Response Theory

  43. Computer selects harder or easier questions as the test-taker gets each question right or wrong Lets you tailor questions for each test-taker Test-taker does not spend most of their time on questions that are too easy or too difficult B. Adaptive Testing Using Computers

  44. Facilitates testing of diverse ability groups Output = level of difficulty test-taker can deal with B. Adaptive Testing Using Computers
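
A toy sketch of the selection loop: after each response the ability estimate moves up or down, and the next item is the unused one whose difficulty is closest to the current estimate. Real adaptive tests estimate ability with IRT models; the fixed step rule and all the numbers below are simplifying assumptions.

```python
difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]  # invented item difficulty bank
used = set()
ability, step = 0.0, 1.0

def next_item(ability: float) -> int:
    """Pick the unused item whose difficulty is nearest the current estimate."""
    pool = [i for i in range(len(difficulties)) if i not in used]
    return min(pool, key=lambda i: abs(difficulties[i] - ability))

for correct in (True, True, False):  # simulated right/wrong responses
    item = next_item(ability)
    used.add(item)
    ability += step if correct else -step
    step *= 0.5  # shrink the step so the estimate settles
    print(f"item {item} (b={difficulties[item]}), new estimate {ability:.2f}")
```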
