
Standardization: The Properties of Objective Tests


rissa

Presentation Transcript


  1. Standardization: The Properties of Objective Tests

  2. Properties of Objective Tests • There are three standards by which you can judge an objective test • Standardization • Reliability • Validity

  3. Properties of Objective Tests • Standardization – scoring and use of scores do not vary across situations • Reliability – scores are consistent and remain stable over time • Validity – the test measures what it is intended to measure

  4. Standardization Principles • Objective Scoring • Directions • Consistency • Accuracy and timeliness

  5. Standardization Principles • Administration • Appropriate conditions specified • Materials • Probing / Coaching

  6. Standardization Principles • Guidelines for interpretation and use • With whom? • For what purpose? • What do high and low scores mean?

  7. Standardization Principles • Norm tables • Based on large, representative samples • From a defined population

  8. Standardization Principles • Specialized norm tables • Subgroup differences • For example: age, gender, race, primary language, etc.

  9. Standardization Principles • Raw scores and standard scores provided where appropriate • Standard scores • Percentile ranks • Age standardized scores

  10. Standardization Principles • Technical manual • Test development process • Guidelines for administration, scoring, and interpretation • Norm tables • Meets the standards for educational and psychological tests

  11. Norm Tables • Meaningful for interpretation when: • Norm referenced interpretation meets the goal of the test • Not a criterion referenced test

  12. Norm Tables • Meaningful for interpretation when: • Relative position in a group has interpretative meaning • Examinee is a member of the population

  13. Norm Tables • Meaningful for interpretation when: • The norm sample is large and representative of the population • The right norm table is used

  14. Norm Tables • Everyone who takes the test at a given administration may serve as the norm sample when the purpose is admissions or personnel selection

  15. Norm Tables • However, the correct reference group varies by the purpose • Career counseling • Placement in the appropriate courses • Selection for a remedial program

  16. Interpreting Standard Scores • The raw score is transformed into a standard score • z = (score – mean)/SD • The z score is the number of SD units the score lies from the mean • It incorporates a measure of center (the mean) and a measure of spread (the SD)

  17. Interpreting Standard Scores • z = 0: average score • z ≤ -1: low score • z ≥ 1: high score • z is converted to some other scaling: • Mean = 50, SD = 10 • Mean = 100, SD = 15 • Mean = 500, SD = 100
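The transformation on slides 16 and 17 can be sketched in a few lines of Python. This is a minimal illustration, not a scoring procedure from any test manual: the function names and the norm-group values (mean 20, SD 5, raw score 25) are hypothetical and were chosen only to keep the arithmetic easy to follow.

```python
def z_score(raw, norm_mean, norm_sd):
    """Number of SD units the raw score lies from the norm-group mean."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return (raw - norm_mean) / norm_sd


def rescale(z, new_mean, new_sd):
    """Express a z score on another scale with the given mean and SD."""
    return new_mean + new_sd * z


# Hypothetical norm group: mean 20, SD 5. A raw score of 25 is one SD above the mean.
z = z_score(25, 20, 5)
print(z)  # 1.0

# The three scalings listed on slide 17.
for new_mean, new_sd in [(50, 10), (100, 15), (500, 100)]:
    print(new_mean, new_sd, rescale(z, new_mean, new_sd))
# 50 10 60.0
# 100 15 115.0
# 500 100 600.0
```

The same z score lands at 60, 115, or 600 depending on the scale, which is why a standard score is only interpretable once its mean and SD are known.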

  18. Interpreting Standard Scores • pp. 42, 43, and 48 in the book give guidelines • Standard scores are easiest to use when converted to percentiles • A percentile is the % of the population that scores at or below a given score • Can be thought of as a rank out of 100 members of the population

  19. Interpreting Standard Scores • Common interpretation strategies: • Normal range is middle 68% of the population (T=40-60, z=-1 to 1, etc.) • Low and high scores fall outside this range (lower and upper 16%)

  20. Interpreting Standard Scores • Common interpretation strategies: • Normal range is middle 50% of the population (Quartiles 2 & 3) • Low and high scores fall outside this range (Quartiles 1 and 4)
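The two strategies on slides 19 and 20 can give different labels to the same score. The sketch below expresses both as percentile-rank bands. The function names are illustrative, and treating a score exactly on a boundary as within the normal range is an assumption, since the slides do not say how ties are handled.

```python
def classify(percentile_rank, lower, upper):
    """Label a percentile rank relative to a 'normal range' band.

    Scores exactly on a boundary are treated as within the range (an assumption).
    """
    if percentile_rank < lower:
        return "Low"
    if percentile_rank > upper:
        return "High"
    return "Within the normal range"


def classify_middle_68(percentile_rank):
    # Middle 68%: roughly the 16th to 84th percentile (z = -1 to 1).
    return classify(percentile_rank, 16, 84)


def classify_middle_50(percentile_rank):
    # Middle 50%: the 25th to 75th percentile (quartiles 2 and 3).
    return classify(percentile_rank, 25, 75)


for pr in (10, 20, 50, 80, 90):
    print(pr, classify_middle_68(pr), "|", classify_middle_50(pr))
# 10 Low | Low
# 20 Within the normal range | Low
# 50 Within the normal range | Within the normal range
# 80 Within the normal range | High
# 90 High | High
```

A percentile rank of 80 is "within the normal range" under the 68% rule but "high" under the 50% rule, so a report should state which convention it uses.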

  21. Interpreting Standard Scores • It is safer to make broad classifications such as “Low”, “Within the normal, or expected, range”, or “High” than to draw fine distinctions. • All scores contain some measurement error. • Look for patterns across the battery and across multiple sources.

  22. An Example from WCCS • Christina, a 1st grade student at our school, took the Stanford Achievement Test last year. Here are her Word Study Skills subtest scores.

  23. Percent Correct • The number of correct responses, or the raw score, is divided by the total number of questions, then multiplied by 100 and expressed as a percentage.

  24. Percent Correct • Christina gave the correct answer to 83.33% of the questions on the Word Study Skills section of the test.
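The percent-correct calculation from slide 23 is shown below as a short sketch. The transcript does not give Christina's raw score or the number of items, so the counts used here (25 correct out of 30) are hypothetical; they are simply one combination that yields 83.33%.

```python
def percent_correct(raw_score, total_items):
    """Raw score divided by the number of items, expressed as a percentage."""
    if total_items <= 0:
        raise ValueError("total_items must be positive")
    if not 0 <= raw_score <= total_items:
        raise ValueError("raw_score must be between 0 and total_items")
    return 100 * raw_score / total_items


# Hypothetical counts: 25 correct answers on a 30-item subtest.
print(round(percent_correct(25, 30), 2))  # 83.33
```

Percent correct describes performance on the items themselves. Unlike the norm-referenced scores on the following slides, it says nothing about how other students performed.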

  25. Scaled Score • The raw score is standardized and normalized, then rescaled to the desired scaling. • z = (Raw Score – Mean) / SD • Scaled Score ≈ 500 + (100*z)
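One common reading of "normalized" is an area transformation: the percentile rank is converted to the z score that a normal distribution places at that percentile, and that z is then rescaled. The sketch below follows that reading as an assumption. The slides do not describe the publisher's actual scaling procedure, and the function names are illustrative.

```python
from statistics import NormalDist


def normalized_z(percentile_rank):
    """z score that a standard normal distribution places at this percentile rank."""
    if not 0 < percentile_rank < 100:
        raise ValueError("percentile rank must be strictly between 0 and 100")
    return NormalDist().inv_cdf(percentile_rank / 100)


def scaled_score(z, mean=500, sd=100):
    """Rescale a z score as on slide 25: Scaled Score ≈ 500 + (100 * z)."""
    return mean + sd * z


for pr in (16, 50, 84):
    z = normalized_z(pr)
    print(pr, round(z, 2), round(scaled_score(z)))
```

Under this approach the 50th percentile maps to a scaled score of 500, and the 16th and 84th percentiles map to values close to 400 and 600.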

  26. Scaled Score • Scaled Scores have many convenient properties from a statistical standpoint. • However, for most people, percentile ranks are easier to understand.

  27. Scaled Score • Christina scored more than one Standard Deviation above average. Her scores are in the above average range.

  28. Percentile Rank • A percentile rank is a statement of the percentage of persons in a given group who fall at or below a given score. • The most common way of reporting test scores and the easiest to use.
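The definition on slide 28 translates directly into code. In this minimal sketch the two norm groups are invented for illustration and are not real test data; they show how the same raw score receives a different percentile rank depending on the reference group.

```python
def percentile_rank(score, norm_scores):
    """Percentage of the norm group scoring at or below `score`."""
    if not norm_scores:
        raise ValueError("norm_scores must not be empty")
    at_or_below = sum(1 for s in norm_scores if s <= score)
    return 100 * at_or_below / len(norm_scores)


# Two hypothetical norm groups (illustrative values only).
broad_group = [10, 12, 14, 15, 17, 18, 20, 22, 25, 28]
higher_scoring_group = [18, 20, 22, 24, 25, 26, 27, 28, 29, 30]

print(percentile_rank(25, broad_group))           # 90.0
print(percentile_rank(25, higher_scoring_group))  # 50.0
```

Real norm tables are built from much larger, representative samples, but the counting rule is the same.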

  29. Percentile Rank • Christina scored as well as or better than 81% of all students in the nation who took this section of the test.

  30. Percentile Rank • Christina scored as well as or better than 57% of all students in ACSI schools who took this section of the test.

  31. Percentile Rank • This pattern is typical for our students on average. • ≈ 80th percentile nationally • ≈ 60th percentile for ACSI students • What does this mean?

  32. Stanine • “Standard nine”: a standard score with nine units • Developed by the military to fit test score information into one column on an IBM punch card • Nine groups (1-9), each ½ SD wide, each covering a range of percentile ranks

  33. Stanine • Christina’s scores fall in the 7th stanine, or above average compared to all students nationally. • Christina’s scores fall in the 5th stanine, or average for ACSI students.
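The stanine rule from slide 32 (nine groups, each ½ SD wide) can be sketched as follows. The sketch assumes normally distributed scores and centers stanine 5 on the mean; placing a score that falls exactly on a boundary in the higher stanine is also an assumption. Published norm tables assign stanines from percentile bands, which this rule approximates.

```python
import math
from statistics import NormalDist


def stanine_from_z(z):
    """Stanine for a z score, using half-SD bands centered on z = 0.

    Band boundaries fall at z = ±0.25, ±0.75, ±1.25, ±1.75; stanines 1 and 9
    are open-ended. A score exactly on a boundary goes to the higher stanine.
    """
    return max(1, min(9, math.floor(2 * z + 5.5)))


def stanine_from_percentile(percentile_rank):
    """Stanine for a percentile rank, assuming normally distributed scores."""
    if not 0 < percentile_rank < 100:
        raise ValueError("percentile rank must be strictly between 0 and 100")
    return stanine_from_z(NormalDist().inv_cdf(percentile_rank / 100))


# Christina's percentile ranks from slides 29 and 30.
print(stanine_from_percentile(81))  # 7 (national norms)
print(stanine_from_percentile(57))  # 5 (ACSI norms)
```

Both results agree with the stanines reported on slide 33.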

  34. Grade Equivalent Scores • Attempt to translate a test score into the grade level (grade and month) at which that score is typical. • Have an intrinsic appeal. • Are problematic statistically. • Are based on extrapolations.

  35. Grade Equivalent Scores • Christina, a 1st grade student at our school, in the area of Word Study Skills, is performing at the level of a typical 3rd grade student in the seventh month of the school year (on the 1st grade test).

  36. An SAT Example • Mark, a 12th grade student at our school, took the SAT last year. Here are his scores.

  37. An SAT Example • Section mean ≈ 500, SD ≈ 100 • Range = 200-800 (z = -3 to +3) • Total mean ≈ 1000, SD ≈ 200 • Range = 400-1600

  38. An SAT Example • Mark scored a 620 on the verbal section of the test. His score was more than one Standard Deviation above the mean and is considered above average.

  39. An SAT Example • Mark’s score on the verbal section of the test was as good as or better than 83% of the students who took the test.

  40. An SAT Example • Mark scored a 570 on the quantitative section of the test. His score was within the normal range and is considered average.

  41. An SAT Example • Mark’s score on the quantitative section of the test was as good as or better than 66% of the students who took the test.

  42. An SAT Example • Mark scored a 1190 total score and his score was within the normal range and is considered average.

  43. An SAT Example • Mark’s total score was as good as or better than 61% of the students who took the test.
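The "above average" and "average" labels in Mark's example follow from his z scores. The sketch below uses the approximate means and SDs from slide 37 and the z = -1 to 1 normal range from slide 19. The variable names are illustrative, and because the slide's means and SDs are approximate, the resulting z scores are approximate too.

```python
# Approximate (mean, SD) for each score, taken from slide 37.
SCALES = {
    "verbal": (500, 100),
    "quantitative": (500, 100),
    "total": (1000, 200),
}

# Mark's scores from slides 38, 40, and 42.
marks_scores = {"verbal": 620, "quantitative": 570, "total": 1190}


def label(z):
    """Broad classification using the z = -1 to 1 normal range."""
    if z > 1:
        return "above average"
    if z < -1:
        return "below average"
    return "average"


results = {}
for section, score in marks_scores.items():
    mean, sd = SCALES[section]
    z = (score - mean) / sd
    results[section] = (z, label(z))
    print(f"{section}: z = {z:.2f} ({label(z)})")
# verbal: z = 1.20 (above average)
# quantitative: z = 0.70 (average)
# total: z = 0.95 (average)
```

Only the verbal score lies more than one SD above the mean, which matches the classifications given on the slides.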

  44. General Principles • Tests do not measure innate ability • Test scores result from a combination of: • Innate ability • Environmental influences • Test taker motivation • Properties of the test itself

  45. Cautions about Interpretation • A low score in one norm group may be high in another, and vice versa. • A low score on one test will not necessarily lead to a low score on another test.

  46. Cautions about Interpretation • Interpretation is partly an art that draws on clinical intuition and experience. • Become familiar with the case studies in test manuals.
