Basic Measurement and Statistics in Testing

Outline • Central Tendency and Dispersion • Standardized Scores • Error and Standard Error of Measurement (Sm) • Item Analysis

Central Tendency and Dispersion

Central Tendency • Measures of central tendency are measures of the location of the middle or the center of a distribution. The definition of "middle" or "center" is purposely left somewhat vague so that the term "central tendency" can refer to a wide variety of measures. The mean is the most commonly used measure of central tendency.

Mean • The arithmetic mean is what is commonly called the average. The mean is the sum of all the scores divided by the number of scores. • The formula in summation notation is: $\sum X / N$ • The mean is a good measure of central tendency for roughly symmetric distributions but can be misleading in skewed distributions, since it can be greatly influenced by scores in the tail. Therefore, other statistics such as the median may be more informative for distributions such as reaction time or family income, which are frequently very skewed.

Median • The median is the middle of a distribution: half the scores are above the median and half are below the median. • The median is less sensitive to extreme scores than the mean, and this makes it a better measure than the mean for highly skewed distributions. • Computation of Median: When there is an odd number of numbers, the median is simply the middle number. For example, the median of 2, 4, and 7 is 4. When there is an even number of numbers, the median is the mean of the two middle numbers. Thus, the median of the numbers 2, 4, 7, 12 is (4 + 7)/2 = 5.5.

Mode • The mode is the most frequently occurring score in a distribution and is used as a measure of central tendency. It is the only measure of central tendency that can be used with nominal data. • The mode is greatly subject to sample fluctuations and is therefore not recommended as the only measure of central tendency. A further disadvantage of the mode is that many distributions have more than one mode. These distributions are called "multimodal." • In a normal distribution, the mean, median, and mode are identical.
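
The three measures of central tendency can be compared side by side in a short Python sketch. It uses the standard `statistics` module and reuses the median examples from the slides; the lists `skewed_scores` and `bimodal_scores` are hypothetical data added only for illustration.

```python
import statistics

odd_scores = [2, 4, 7]               # from the median example
even_scores = [2, 4, 7, 12]          # from the median example
skewed_scores = [2, 4, 7, 12, 100]   # hypothetical: one extreme score added
bimodal_scores = [1, 2, 2, 3, 3]     # hypothetical: two modes

median_odd = statistics.median(odd_scores)     # middle number: 4
median_even = statistics.median(even_scores)   # mean of 4 and 7: 5.5

# The extreme score pulls the mean far more than the median.
mean_skewed = statistics.mean(skewed_scores)       # 125 / 5 = 25
median_skewed = statistics.median(skewed_scores)   # 7

# multimode returns every most-frequent value, so it copes with multimodal data.
modes = statistics.multimode(bimodal_scores)       # [2, 3]
```

With the single extreme score of 100, the mean moves to 25 while the median stays at 7, which is the sensitivity to tail scores described above.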

Spread, Dispersion, Variability • A variable's spread is the degree to which scores on the variable differ from each other. If every score on the variable were about equal, the variable would have very little spread. There are many measures of spread. Two distributions can have the same mean but differ in spread. [Figure: two distributions with the same mean; the one on the bottom is more spread out.] • Variability and dispersion are synonyms for spread.

Range • The range is the simplest measure of spread or dispersion: it is equal to the difference between the largest and the smallest values. • The range can be a useful measure of spread because it is so easily understood. However, it is very sensitive to extreme scores since it is based on only two values. • The range should almost never be used as the only measure of spread, but can be informative if used as a supplement to other measures of spread. • Example: The range of the numbers 1, 2, 4, 6, 12, 15, 19, 26 = 26 - 1 = 25
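
As a small illustration of why the range is sensitive to extreme scores, the sketch below computes it for the numbers in the example and again with the top score dropped. The helper name `score_range` is an assumption made for this sketch.

```python
def score_range(scores):
    """Return the difference between the largest and the smallest score."""
    return max(scores) - min(scores)

scores = [1, 2, 4, 6, 12, 15, 19, 26]      # from the example
full_range = score_range(scores)           # 26 - 1 = 25
trimmed_range = score_range(scores[:-1])   # top score dropped: 19 - 1 = 18
```

Removing one score out of eight changes the range from 25 to 18, because only the two end values enter the calculation.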

Variance • The variance is a measure of how spread out a distribution is. In other words, it is a measure of variability. • The variance is computed as the average squared deviation of each number from the mean. • For example, for the numbers 1, 2, and 3, the mean is 2 and the variance will be: $\frac{(1-2)^2 + (2-2)^2 + (3-2)^2}{3} = \frac{2}{3} \approx 0.667$
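
The calculation for 1, 2, and 3 can be checked with a minimal Python sketch. It follows the slide's definition (divide by N, the population form) and compares it with `statistics.pvariance`. The sample variance, which divides by N - 1, is shown only for contrast and is not part of the slide.

```python
import statistics

scores = [1, 2, 3]

# Step by step: mean, squared deviations, then their average.
mean = sum(scores) / len(scores)
squared_deviations = [(x - mean) ** 2 for x in scores]
variance_by_hand = sum(squared_deviations) / len(scores)   # 2 / 3

# Same quantity from the standard library (divides by N).
variance_stdlib = statistics.pvariance(scores)

# For contrast only: the sample variance divides by N - 1 instead.
sample_variance = statistics.variance(scores)              # 2 / 2 = 1
```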

Standard Deviation • The standard deviation formula is very simple: it is the square root of the variance. It is the most commonly used measure of spread. • In a normal distribution, about 68% of the scores are within one standard deviation of the mean and about 95% of the scores are within two standard deviations of the mean. • The standard deviation has proven to be an extremely useful measure of spread in part because it is mathematically tractable. Many formulas in inferential statistics use the standard deviation.
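
The 68% and 95% figures can be verified numerically. The sketch below assumes Python 3.8 or later, where `statistics.NormalDist` is available, and uses the standard normal distribution (mean 0, standard deviation 1).

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Proportion of scores within one and two standard deviations of the mean.
within_one_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)   # about 0.68
within_two_sd = standard_normal.cdf(2) - standard_normal.cdf(-2)   # about 0.95
```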

Different ways of calculating the standard deviation – the raw score method and the deviation method • Standard deviation score and standard deviation value
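
The slide names the two methods without giving their formulas. The sketch below assumes the forms commonly taught: the deviation method averages the squared deviations from the mean, and the raw score method works from ΣX and ΣX² directly. Both divide by N, matching the variance definition used earlier. The function names are hypothetical.

```python
import math

def sd_deviation_method(scores):
    """Square root of the average squared deviation from the mean."""
    n = len(scores)
    mean = sum(scores) / n
    return math.sqrt(sum((x - mean) ** 2 for x in scores) / n)

def sd_raw_score_method(scores):
    """Square root of (sum of X squared / N) minus (sum of X / N) squared."""
    n = len(scores)
    sum_x = sum(scores)
    sum_x_squared = sum(x ** 2 for x in scores)
    return math.sqrt(sum_x_squared / n - (sum_x / n) ** 2)

scores = [1, 2, 4, 6, 12, 15, 19, 26]   # numbers from the range example
sd_dev = sd_deviation_method(scores)
sd_raw = sd_raw_score_method(scores)
```

The two functions are algebraically equivalent, so `sd_dev` and `sd_raw` agree up to floating-point rounding. The raw score form is convenient for hand calculation because it does not require computing each deviation.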

Standardized Scores Z scores and T scores and their uses

Standardized Scores: Z scores • Z-score • $Z = \frac{\text{raw score} - \text{mean score}}{\text{standard deviation}}$ • Example:

Standardized Scores : Z scores • Using the Z-score • Comparing between scores in two tests • Example, compare previous score with this:

Standardized scores – T scores • Z scores are unfamiliar to many test users, especially negative (‘-’) scores • Formula for T-score: $T = 10Z + 50$
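
The worked examples from the original slides did not survive in this transcript, so the sketch below uses hypothetical scores, means, and standard deviations to show how Z and T scores allow a comparison between two tests. The function names are also assumptions.

```python
def z_score(raw, mean, sd):
    """Z = (raw score - mean score) / standard deviation."""
    return (raw - mean) / sd

def t_score(z):
    """T = 10 * Z + 50."""
    return 10 * z + 50

# Hypothetical data: the raw score is higher on Test A,
# but the standing relative to the group is better on Test B.
z_test_a = z_score(raw=70, mean=60, sd=10)   # 1.0
z_test_b = z_score(raw=60, mean=50, sd=5)    # 2.0

t_test_a = t_score(z_test_a)                 # 60.0
t_test_b = t_score(z_test_b)                 # 70.0

# A score below the mean gives a negative Z but a positive T.
z_below_mean = z_score(raw=45, mean=60, sd=10)   # -1.5
t_below_mean = t_score(z_below_mean)             # 35.0
```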

Error and Standard Error of Measurement (Sm)

Error and Standard Error of Measurement (Sm) • Every score has an error • Error either adds to or subtracts from your true score • True score = Obtained score ± Error • How to calculate error? • $S_m = SD\sqrt{1 - r}$, where r is the reliability coefficient of the test

Example • Obtained score = 20; SD = 2; r = 0.64 • $S_m = SD\sqrt{1 - r} = 2\sqrt{1 - 0.64} = 2\sqrt{0.36} = 2 \times 0.6 = 1.2$ • True score = 20 – 1.2 = 18.8 and 20 + 1.2 = 21.2; that is, between 18.8 and 21.2 (at 1 SEM)
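
The same calculation can be written as a short Python sketch. The function names are hypothetical, and `reliability` stands for the r in the formula.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """Sm = SD * sqrt(1 - r)."""
    return sd * math.sqrt(1 - reliability)

def true_score_band(obtained, sem):
    """Return the interval obtained score - 1 SEM to obtained score + 1 SEM."""
    return obtained - sem, obtained + sem

sem = standard_error_of_measurement(sd=2, reliability=0.64)   # about 1.2
low, high = true_score_band(obtained=20, sem=sem)             # about 18.8 and 21.2
```

Note that a perfectly reliable test (r = 1) would give Sm = 0, and the band would collapse to the obtained score.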

Item Analysis Item difficulty Item discrimination Distractor analysis

Item difficulty (p) • How difficult is the item? • Sometimes referred to as item facility. • Used only with objective-type tests • Number of students who got the item correct divided by the number of students who attempted the item. • Every item has an item difficulty value • Possible values are from 0 to 1, with values near 0 indicating a difficult item and values near 1 indicating an easy item

Example • 30 students attempted the item • Responses per option (* marks the key): A = 4, B = 0, C = 8, *D = 18 • Find p • p = (No. of students who got it right) / (No. of students who attempted) = 18/30 = .60 • Note, this is also equal to 60 percent correct
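
A minimal sketch of the same calculation in Python, assuming the responses are stored as a count per option. The function name and the dictionary layout are assumptions for illustration.

```python
def item_difficulty(option_counts, key):
    """p = number who chose the key / number who attempted the item."""
    attempted = sum(option_counts.values())
    return option_counts[key] / attempted

counts = {"A": 4, "B": 0, "C": 8, "D": 18}   # from the example; D is the key
p = item_difficulty(counts, key="D")          # 18 / 30 = 0.6
```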

Item Discrimination (D) • Shows how well an item discriminates between good and weak students • Must determine the good and weak students first • The number of good students who answered correctly minus the number of weak students who answered correctly, divided by the number of students in either group • Every item has an item discrimination value, which ranges from -1 to 1

Example • Total number of students = 45 • Number of students in Upper Group and Lower Group = 15 each
Options      A   B   C   *D
Upper (Ug)   2   0   3   10
Lower (Lg)   2   1   6    6
• Compute D • D = (No. in Ug correct – No. in Lg correct) / (No. of students in either group) = (10 – 6) / 15 = 0.267
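
The discrimination index for this item can be reproduced with a short Python sketch. The function name and the per-option dictionaries are assumptions. The check that both groups have the same size reflects the slide's phrase "either group".

```python
def item_discrimination(upper_counts, lower_counts, key):
    """D = (upper group correct - lower group correct) / group size."""
    group_size = sum(upper_counts.values())
    if group_size != sum(lower_counts.values()):
        raise ValueError("upper and lower groups must be the same size")
    return (upper_counts[key] - lower_counts[key]) / group_size

upper = {"A": 2, "B": 0, "C": 3, "D": 10}   # from the example; D is the key
lower = {"A": 2, "B": 1, "C": 6, "D": 6}
d_index = item_discrimination(upper, lower, key="D")   # 4 / 15, about 0.267
```

A negative value would mean that more weak students than good students answered the item correctly.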

Deciding on Good and Bad Items • Item difficulty • Item discrimination • Check for miskeying, ambiguity and guessing • Evidence for miskeying: more students chose a distractor than the key • Guessing: equal spread across options • Ambiguity: an equal number of students chose one distractor and the key
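
The three checks can be sketched as a rough screening function. This is one possible reading of the rules on the slide, not a standard procedure: the function name, the order of the checks, and the `tolerance` used to decide what counts as an "equal spread" are all assumptions.

```python
def flag_item(option_counts, key, tolerance=1):
    """Return a rough label for an item based on its option counts.

    tolerance is a hypothetical cut-off: if the largest and smallest
    option counts differ by no more than this, the spread is treated as equal.
    """
    key_count = option_counts[key]
    distractor_counts = [n for opt, n in option_counts.items() if opt != key]

    # Guessing is checked first, because an equal spread also makes
    # every distractor look as popular as the key.
    if max(option_counts.values()) - min(option_counts.values()) <= tolerance:
        return "possible guessing"
    if any(n > key_count for n in distractor_counts):
        return "possible miskeying"
    if any(n == key_count for n in distractor_counts):
        return "possible ambiguity"
    return "no flag"

# Upper and lower group counts from the discrimination example (key = D).
upper_flag = flag_item({"A": 2, "B": 0, "C": 3, "D": 10}, key="D")
lower_flag = flag_item({"A": 2, "B": 1, "C": 6, "D": 6}, key="D")

# Hypothetical items, made up to trigger the other two flags.
miskey_flag = flag_item({"A": 4, "B": 0, "C": 18, "D": 8}, key="D")
guess_flag = flag_item({"A": 8, "B": 7, "C": 8, "D": 7}, key="D")
```

In the lower group, option C attracts as many students as the key, so that group's responses are flagged for possible ambiguity, while the upper group's responses raise no flag. A flag is only a prompt to review the item wording and the answer key.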

END
