
Test Development Strategies

Ronna Turner, Ph.D., Educational Statistics and Research Methods. Teaching and Faculty Support Center Workshop, January 7, 2010, University of Arkansas. rcturner@uark.edu


Presentation Transcript


  1. Test Development Strategies Ronna Turner, Ph.D. Educational Statistics and Research Methods Teaching and Faculty Support Center Workshop January 7, 2010 University of Arkansas rcturner@uark.edu

  2. Academic Sources • Ackerman, T. (September, 1994). EDPSY 490 course handout. University of Illinois, Champaign-Urbana, IL. • Berk, Ronald A. (November, 1998). Writing test items is like a box of chocolates… Annual Lilly Conference on College Teaching, Oxford, OH. • Crocker, L., & Algina, J. (1986). Introduction to classical and modern test theory. Wadsworth Publishing. • Frey, B., Petersen, S., Edwards, L., Pedrotti, J., & Peyton, V. (2005). Item-writing rules: Collective wisdom. Teaching and Teacher Education, 21, 357-364. • Hopkins, K. (1997). Educational and psychological measurement and evaluation (8th edition). Allyn and Bacon, Old Tappan, NJ. • Kline, T. (2005). Psychological Testing: A practical approach to design and evaluation. Sage Publications, Inc. • Kubiszyn, T. & Borich, G. (2007). Educational testing and measurement: Classroom application and practice (8th edition). Wiley, Hoboken, NJ. • National Board of Osteopathic Medical Examiners (2007). Item writing strategies for medical educators: Workbook. Chicago, IL. • Nunnally, J., & Bernstein, I. (1994). Psychometric theory. McGraw-Hill, Inc. • Turner, R. (January, 2010). You can blame me for all of the bad examples.

  3. TFSC Workshop – Test Development Strategies Overview • Measurement and types of assessments • What are we going to focus on today? • Preparation for item writing • Creating a table of specifications; selecting item formats • Inquiry activity with item examples • Yes – you get to answer test questions! • You also get to rip apart examples I have provided! • Isn’t it nice to do the criticizing rather than hearing the complaints for a change? • Item-writing guidelines and suggestions • Item development strategies/tools • Post-hoc item evaluation procedures

  4. Types of Assessments • Psychological, Attitudinal, Aptitude, Achievement • Norm-referenced assessment: focus is on the comparison of a respondent to others completing the instrument • Criterion-referenced assessment: instrument designed to measure mastery of content knowledge, with comparisons being made to pre-determined proficiency levels rather than other test-takers Common Purposes of Academic Assessments • Admissions • Scholarship • Placement • Diagnostic • Content Mastery • Formative • Summative • Licensure

  5. Measurement • Measurement is the act of quantifying a behavior or attribute sampled by a test (Crocker & Algina, 1986). • In this workshop, I am only addressing content or constructs that we believe are quantifiable, and that apply to achievement or aptitude types of tests. • There are objectives we have for our students that may not be quantifiably measurable within the context of a classroom situation. • For example, I may want my students to gain an appreciation for the importance of obtaining evidence of validity for an instrument. However, it is probably not something I would attempt to measure in relation to their course performance.

  6. Examples of Assessment Formats Psychological/Attitudinal Inventory • Likert-Type Scale • Forced Choice • Semantic Differential Scale / Bipolar Adjective Pairs • Fill in the Blank Achievement/Aptitude Test • Multiple Choice • Matching • Completion / Fill in the Blank • Short Answer • Essay • Performance Assessment • Experiment • Project • Paper / Report • Portfolio

  7. Which types of items are best? (Table information from http://www.edtech.vt.edu/edtech/id/assess/items.html)

  8. Prerequisites to Writing Items for a Test / Exam
     Develop a table of specifications to guide item development. (This significantly reduces the likelihood of a disparity between what is tested and what was taught, stressed in the class, or identified as being important.)
     Columns: Knowledge / Comp | Application | Analysis / Syn / Eval
     Test Development (20%; 8 items)
       Test Purpose / Norm Group (2): 1 1
       Operational Definitions (3): 2 1
       Item Writing (3): 3
     Reliability (35%; 14 items)
       Internal Consistency (4): 1 2 1
       Test-Retest / Alternate Forms (3): 2 1
       Standard Error of Measurement (3): 1 2
       Composite Variability (2): 1 1
       Criterion-Referenced Methods (2): 2
     Validity (45%; 18 items)
       Content (5): 1 2 1
       Criterion-Related (4): 1 3
       Construct (9): 2 4 3
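The percentages in the table of specifications translate into item counts by simple proportion: with a 40-item exam, 20% / 35% / 45% gives 8 / 14 / 18 items. The following is a minimal sketch of that arithmetic, not part of the workshop materials. The function name `allocate_items` and the largest-remainder rounding rule (used when percentages do not divide evenly) are assumptions made for illustration.

```python
def allocate_items(percent_weights, total_items):
    """Split total_items across content areas in proportion to integer
    percent weights, using largest-remainder rounding (an assumed rule)."""
    if sum(percent_weights.values()) != 100:
        raise ValueError("percent weights must sum to 100")

    counts = {}
    remainders = {}
    for area, pct in percent_weights.items():
        # Integer arithmetic avoids floating-point rounding surprises.
        counts[area], remainders[area] = divmod(pct * total_items, 100)

    # Hand out any items lost to rounding, largest remainder first.
    leftover = total_items - sum(counts.values())
    for area in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        counts[area] += 1
    return counts


# Weights and exam length taken from the table of specifications.
blueprint = allocate_items(
    {"Test Development": 20, "Reliability": 35, "Validity": 45},
    total_items=40,
)
print(blueprint)
```

The same function can be reused one level down to split an area's items across its subtopics once those weights are decided.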

  9. Can you answer these items correctly? • A group of hedgehogs is called an • a) parliament • b) rookery • c) array • d) bevy • e) gaggle • True/False items are never as effective as essay items in measuring mastery at the synthesis cognitive level. • True or False • Which of the following doctors on the long-running series MASH was not married? • a) Doogie Houser • b) Hawkeye Pierce • c) Madam Curie • d) Albert Einstein • e) Sharon Gaber

  10. Can you answer these items correctly? (avoid clues that allow examinees who have not learned the content to answer items correctly) • A group of hedgehogs is called a(an) • a) parliament ostentation • b) rookery • c) array • d) bevy • e) gaggle • Essay items are more effective than true/false items at measuring mastery at the synthesis cognitive level. • True or False • Which of the following doctors on the long-running series MASH was not married? • a) Trapper John McIntyre • b) Hawkeye Pierce • c) B.J. Hunnicutt • d) Frank Burns • e) Henry Blake

  11. Make comments about any issues you see with the following items. • 4. Discuss the approach a counselor utilizing a Freudian conceptual framework would use when counseling a married couple with adultery issues. • 5. Which of the following types of errors are not included in the random error estimated in true score models? a) a headache that impairs exam performance b) having a stomach virus on test day c) a visual impairment d) all of the above

  12. Make comments about any issues you see with the following items. • 4. Discuss the approach a counselor utilizing a Freudian conceptual framework would use when counseling a married couple with adultery issues. • 4. Make 3 comparisons (including at least 1 similarity and 1 difference) between a Freudian and an Adlerian approach to counseling a married couple with adultery issues. Which do you believe would be most effective? Provide at least 2 reasons to support your choice. • 5. Which of the following types of errors are not included in the random error estimated in true score models? a) a headache that impairs exam performance b) having a stomach virus on test day c) a visual impairment d) all of the above • 5. Which of the following types of errors are included in the random error estimated in true score models? (circle all that apply) a) a headache that impairs exam performance b) guessing correctly on an item c) a visual impairment d) dyslexia

  13. Make comments about any issues you see with the following items. • 6. Solve for x: 3x + 5 = 7 a) 2/3 b) 1 c) 3 d) 4 e) 8 • 7. Solve when x = 3. 4(x + 2)^2 = a) 196 b) 28 c) 100 d) 40 e) 400 f) 52

  14. Make comments about any issues you see with the following items. • 6. Solve for x: 3x + 5 = 7 a) 2/3 b) 1 c) 3 d) 4 e) 8 • 6. Solve for x: 3x + 5 = 7 a) 2/3 b) 1 ½ c) 2 1/3 d) 4 e) 6 • 7. Solve when x = 3. 4(x + 2)^2 = a) 196 b) 28 c) 100 d) 40 e) 400 f) 52 • 7. Solve when x = 3. 4(x + 2)^2 = a) 28 b) 40 c) 52 d) 100 e) 196 f) 400

  15. Make comments about any issues you see with the following items. • 8. Convert the following equation into the slope-intercept form of a line: 4x + 2y = 8 a) y = 4x - 8 b) y = 2x - 4 c) y = 2x + 4 d) y = -2x - 4 e) y = -2x + 4 • 9. John and Mark are playing three holes of golf in which four is par for each hole. John’s final score is 2 above par. How many strokes did John make in total?

  16. Make comments about any issues you see with the following items. • 8. Convert the following equation into the slope-intercept form of a line: 4x + 2y = 8 a) y = 4x - 8 b) y = 2x - 4 c) y = 2x + 4 d) y = -2x - 4 e) y = -2x + 4 • 8. Convert the following equation into the slope-intercept form of a line: 4x + 2y = 8 a) x = -½ y + 2 b) x + ½ y = 2 c) y = 2x + 4 d) y + 2x = 4 e) y = -2x + 4 • 9. John and Mark are playing three holes of golf in which four is par for each hole. John’s final score is 2 above par. How many strokes did John make in total? • 9. John played three games of golf. He hit the ball 32 times during each game. How many times did John hit the ball in total?

  17. Make comments about any issues you see with the following items. • 10. Write a paragraph (including a topic sentence, the supporting body of the paragraph, and a conclusion) describing one of your most positive personal assets. • 11*. Epinephrine: a) is never used along with a local anesthetic b) causes an increase in glycogenolysis c) causes a decrease in lipolysis d) causes bronchial smooth muscle constriction e) usually leads to hyperkalemia • 12*. Which of the following characteristics is correct? a) Vesicular breath sounds may be heard over the trachea b) Bronchial sounds are heard over most of the lung fields c) Bronchovesicular sounds are medium-pitched d) A and B e) B and C f) All of the above g) None of the above  * National Board of Osteopathic Medical Examiners (2007). Item writing strategies for medical educators: Workbook. Chicago, IL.

  18. Make comments about any issues you see with the following items. • 13*. Recollection of my adolescent years causes me to become melancholy a) strongly disagree b) disagree c) neither agree nor disagree d) agree e) strongly agree * Ackerman, T. (September, 1994). EDPSY 490 course handout. University of Illinois, Champaign-Urbana, IL.

  19. Make comments about any issues you see with the following items. • 13*. Recollection of my adolescent years causes me to become melancholy a) strongly disagree b) disagree c) neither agree nor disagree d) agree e) strongly agree • 13. Thinking about my life as a teenager makes me sad. a) strongly disagree b) disagree c) neither agree nor disagree d) agree e) strongly agree * Ackerman, T. (September, 1994). EDPSY 490 course handout. University of Illinois, Champaign-Urbana, IL.

  20. General Item Writing Recommendations • Limit complexity of item wording unless it is part of the construct being measured • Keep item stems as short and concise as possible; avoid ambiguous item stems • Answer options (distractors) should not be longer than the stem • Avoid grammatical clues between the stem and the response options (e.g., is / are, singular / plural) • Avoid specific determiners (e.g., always, never, all, none) and indefinite qualifiers (e.g., sometimes, frequently) • Minimize the use of stems written in negative form; when used, EMPHASIZE the negative word • For fill in the blank items, place the blank toward the end of the item • Place items with similar format together • Format item responses vertically • Order verbal responses logically or alphabetically; order quantitative responses from lowest to highest

  21. General Item Writing Recommendations • Use plausible distractors; select distractors that diagnose specific misconceptions • Avoid stating the correct answer in textbook or more technical language when distractors are not (however it can be an effective technique to use for a distractor) • Avoid synonyms in distractor choices (once two distractor options are identified as representing a similar concept, they will automatically be disregarded in a “one correct answer” situation) • Similarly, “all of the above” distractor options can increase guessing probabilities if one of the remaining distractors can be discounted
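One way to get distractors that "diagnose specific misconceptions" is to compute each one from a specific wrong procedure. The sketch below does this for item 7 from the earlier slides (4(x + 2)^2 evaluated at x = 3). Each of the item's distractors can be reproduced by one plausible error, but the error labels are guesses made for illustration and do not come from the workshop. The function name `item_7_options` is also hypothetical.

```python
def item_7_options(x):
    """Return the key, misconception-based distractors, and sorted options
    for the expression 4(x + 2)^2."""
    key = 4 * (x + 2) ** 2

    # Each distractor is the result of one assumed (hypothetical) error.
    distractors = {
        "squared only the 2": 4 * (x + 2 ** 2),
        "doubled instead of squaring": 4 * (x + 2) * 2,
        "squared each term separately": 4 * (x ** 2 + 2 ** 2),
        "multiplied 4 by x only, then squared": (4 * x + 2) ** 2,
        "squared the 4 along with the parentheses": (4 * (x + 2)) ** 2,
    }

    # Quantitative responses are ordered from lowest to highest; the set
    # drops any distractor that happens to coincide with another value.
    options = sorted({key, *distractors.values()})
    return key, distractors, options


key, distractors, options = item_7_options(3)
print(key)      # correct answer
print(options)  # response options in ascending order
```

Because every wrong option is tied to a named error, a tally of which distractor each student chose points to what needs reteaching.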

  22. General Item Writing Recommendations • Create items from instructional objectives • Focus on one objective per item • Avoid extraneous information in items that increases item difficulty (e.g., examinees get an item incorrect based on their lack of knowledge of other content not intended to be tested) • Limit the use of items that are dependent on each other (or only use them when subsequent items can be assessed independently of the previous answer) • True-false items should be entirely true or entirely false • Create rubrics for how an essay question will be evaluated BEFORE administering it • Include the measurement of higher-order thinking skills

  23. General Item Writing Recommendations • Avoid opinion items • Make items representative of diverse subpopulations • Avoid potentially offensive, stereotypical, or biased language and scenarios • Referencing students or class characteristics in your items can provide humor / interest during test taking and help students relax.

  24. Item Forms • To calculate the area of __A__, we must know the ____________. a) radius b) length and width c) base and altitude d) number of sides and side length e) length of bases and height • where the replacement set for A is circle, rectangle, triangle, parallelogram, or trapezoid. • A(n) __A__ is to the robot as a _____ is to the human body. a) brain b) limb c) finger d) joint • where the replacement set A includes servo, computer chip, _____, ______

  25. Item Forms • The concept of __A__ was chiefly promoted by _______________: a) Rudolf Dreikurs b) Alfie Kohn c) Barbara Coloroso d) B.F. Skinner e) Haim Ginott • where the replacement set for A is logical consequences, classroom community, inner discipline, behaviorism, congruent communication.
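An item form is a template plus a replacement set, so it can be expanded mechanically into a family of parallel items. The sketch below expands the classroom-management form. The concept-to-theorist pairing is an assumption inferred from the order in which the options and the replacement set are listed, and the names `STEM`, `KEY_FOR`, and `build_item` are hypothetical.

```python
STEM = "The concept of {concept} was chiefly promoted by _______________:"

OPTIONS = [
    "Rudolf Dreikurs",
    "Alfie Kohn",
    "Barbara Coloroso",
    "B.F. Skinner",
    "Haim Ginott",
]

# Assumed answer key: replacement-set entry -> keyed option.
KEY_FOR = {
    "logical consequences": "Rudolf Dreikurs",
    "classroom community": "Alfie Kohn",
    "inner discipline": "Barbara Coloroso",
    "behaviorism": "B.F. Skinner",
    "congruent communication": "Haim Ginott",
}


def build_item(concept):
    """Fill the stem with one replacement-set entry and return
    (item text, letter of the keyed option)."""
    letters = "abcde"
    lines = [STEM.format(concept=concept)]
    for letter, option in zip(letters, OPTIONS):
        lines.append(f"{letter}) {option}")
    answer = letters[OPTIONS.index(KEY_FOR[concept])]
    return "\n".join(lines), answer


for concept in KEY_FOR:
    text, answer = build_item(concept)
    print(text)
    print(f"Key: {answer}\n")
```

Every generated item shares the same options, which keeps the items parallel in difficulty and makes alternate test forms easy to assemble.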

  26. Item Algorithms • Calculate the ROI (return on investment) using Schmidt, Hunter, & Pearlman’s utility formula for the following case study: • Utility = • (Create a scenario in which the years of duration (YD), number of employees trained (NT), performance difference between trained and untrained employees (PD), and cost per trainee (C) are provided.) • Simplify the following radical: a^(–3/2) • where a is a number whose square root is a whole number (e.g., 4, 9, 16, 25, 36) • Change the irregular adjective __A__ to its feminine form. • where A is any French irregular adjective not in its feminine form.
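An item algorithm pairs a stem with a rule for generating its values and its key. As a rough illustration, the sketch below generates the radical item for each perfect square listed on the slide. Since a^(-3/2) = 1 / (sqrt(a))^3, a whole-number square root always gives an exact fraction. The function name `radical_item` and the plain-text exponent notation are assumptions.

```python
from fractions import Fraction


def radical_item(root):
    """Build one 'simplify a^(-3/2)' item where a = root squared.

    Returns (stem, exact answer as a Fraction)."""
    a = root ** 2
    # a^(-3/2) = 1 / (sqrt(a))^3 = 1 / root^3
    answer = Fraction(1, root ** 3)
    stem = f"Simplify the following radical: {a}^(-3/2)"
    return stem, answer


# Roots 2 through 6 give a = 4, 9, 16, 25, 36, as on the slide.
items = [radical_item(root) for root in range(2, 7)]
for stem, answer in items:
    print(f"{stem}    Key: {answer}")
```

Restricting a to perfect squares keeps the arithmetic from adding difficulty that the item is not meant to test.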

  27. Evaluating Test Items
      Multiple Choice Options (* Correct Item Response)

      Item 5     Item-Test Correlation .37     Item Difficulty .62
      Group             1     2*    3     4     5
      N = 4 UPPER       0   100     0     0     0
      N = 5 MIDDLE     20    80     0     0     0
      N = 4 LOWER      25     0    25     0    50

      Item 8     Item-Test Correlation .62     Item Difficulty .54
      Group             1     2     3*    4     5
      N = 4 UPPER       0     0    75    25     0
      N = 5 MIDDLE      0     0    60    20    20
      N = 4 LOWER      25     0    25    25    25

      Item 2     Item-Test Correlation .00     Item Difficulty 1.00
      Group             1     2     3     4     5*
      N = 4 UPPER       0     0     0     0   100
      N = 5 MIDDLE      0     0     0     0   100
      N = 4 LOWER       0     0     0     0   100

      Item 12    Item-Test Correlation -.38    Item Difficulty .46
      Group             1*    2     3     4     5
      N = 4 UPPER      25    75     0     0     0
      N = 5 MIDDLE     40    40    20     0     0
      N = 4 LOWER      75    25     0     0     0
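The item difficulty values in the item analysis can be reproduced from the group rows, assuming the cell entries are the percent of each group choosing each option. Difficulty is then the proportion of all 13 examinees who chose the keyed option. The sketch below also computes a simple upper-minus-lower discrimination index, which is a common companion statistic added here for illustration and is not reported on the slide. The function and variable names are hypothetical.

```python
# Group sizes from the item analysis (upper, middle, lower scorers).
GROUP_SIZES = {"upper": 4, "middle": 5, "lower": 4}

# Percent of each group choosing the keyed (starred) option, per item.
PERCENT_CORRECT = {
    5: {"upper": 100, "middle": 80, "lower": 0},
    8: {"upper": 75, "middle": 60, "lower": 25},
    2: {"upper": 100, "middle": 100, "lower": 100},
    12: {"upper": 25, "middle": 40, "lower": 75},
}


def item_difficulty(pct_by_group):
    """Proportion of all examinees answering correctly (higher = easier)."""
    n_correct = sum(GROUP_SIZES[g] * pct / 100 for g, pct in pct_by_group.items())
    return n_correct / sum(GROUP_SIZES.values())


def discrimination_index(pct_by_group):
    """Upper-group proportion correct minus lower-group proportion correct."""
    return (pct_by_group["upper"] - pct_by_group["lower"]) / 100


for item, pcts in PERCENT_CORRECT.items():
    p = item_difficulty(pcts)
    d = discrimination_index(pcts)
    print(f"Item {item:>2}: difficulty = {p:.2f}, upper-lower index = {d:+.2f}")
```

Read this way, Item 12 is the one to revisit: low scorers chose the keyed option more often than high scorers, which matches its negative item-test correlation, while Item 2 was answered correctly by everyone and therefore cannot separate the groups.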

  28. Beginning of Test Theory & Psychometrics (Wundt, Weber, Fechner, Galton, Cattell in the late 1800s; Simon, Binet, Thorndike in the early 1900s) “I am sending you a dreadful book which I have written, which is no end scientific but devoid of any spark of human interest. You must make all your research men read it, but never look within its covers yourself. The figures, curves, and formulae would drive you mad.” E. L. Thorndike (1904) in a letter to his former mentor, William James, introducing him to the first published text on test theory.
