Measures of Central Tendency and Dispersion
Content Analysis- Challenges • Lose some nuance when coding • How to select material from universe of possible material? • Is material accurate? • Unintentional problems • Purposeful distortion • Ultimately a question of validity • Are coders accurate? • Can establish reliability • Harder to establish validity
Statistics • Provides description of a sample or population • Simplification • Univariate- Only interested in one attribute at a time • Bivariate- consider relationships between 2 attributes • Multivariate- the sky is the limit
Percentages • Useful for comparing groups with unequal numbers
Percentages • To compute: • (# with trait of interest / total #) × 100 • Example 1: Sample of 4 cats, one is black • (1/4) × 100 = 25% • Example 2: Sample of 750, 612 approve of the president • (612/750) × 100 = 81.6%
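The percentage formula on this slide can be sketched in a few lines of Python. The function name `percentage` is an illustrative choice, not something from the slides; the two calls reuse the cat and presidential-approval examples.

```python
def percentage(count_with_trait, total):
    """Share of cases with the trait of interest, expressed as a percentage."""
    if total <= 0:
        raise ValueError("total must be a positive number of cases")
    return 100 * count_with_trait / total


cats = percentage(1, 4)          # 1 black cat in a sample of 4
approval = percentage(612, 750)  # 612 approvers in a sample of 750

print(f"Black cats: {cats:.1f}%")
print(f"Approve of the president: {approval:.1f}%")
```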
What Constitutes the Denominator? • Percentage of Total: all cases, including missing ones • Percentage of Valid Cases • Excludes missing cases • Typically more appropriate • Cumulative Percent: what percentage of cases so far have reached this level
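To make the three denominators concrete, here is a small sketch using made-up survey responses (the data, the `order` list, and the variable names are all hypothetical). `None` stands in for a missing case, and the cumulative percent is accumulated over valid cases in the stated category order.

```python
from collections import Counter

# Hypothetical responses; None marks a missing case.
responses = ["agree", "agree", "neutral", None, "disagree",
             "agree", None, "neutral", "disagree", "agree"]
order = ["disagree", "neutral", "agree"]  # assumed low-to-high ordering

counts = Counter(r for r in responses if r is not None)
total_cases = len(responses)        # includes missing cases
valid_cases = sum(counts.values())  # excludes missing cases

rows = []
cumulative = 0.0
for level in order:
    n = counts[level]
    pct_total = 100 * n / total_cases
    pct_valid = 100 * n / valid_cases
    cumulative += pct_valid
    rows.append((level, n, pct_total, pct_valid, cumulative))

for level, n, pct_total, pct_valid, cum in rows:
    print(f"{level:<9} n={n}  total={pct_total:.1f}%  "
          f"valid={pct_valid:.1f}%  cumulative={cum:.1f}%")
```

With two of ten cases missing, every category's valid percent is larger than its percent of total, which is why the choice of denominator should always be reported.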
Measures of Central Tendency • Mode • Mean (Average) • Median
Computing the Mean • Requires interval or ratio data • (Y1 + Y2 + Y3 + … + Yn)/n, where n is the number of cases • Example: people with incomes of 10,000, 15,000, 25,000, 55,000, 32,000, 29,500 • Mean = (10,000 + 15,000 + 25,000 + 55,000 + 32,000 + 29,500)/6 = 27,750
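The income example can be checked directly: add the values and divide by the number of cases. A minimal sketch (variable names are illustrative):

```python
incomes = [10_000, 15_000, 25_000, 55_000, 32_000, 29_500]

# Mean = sum of all values divided by the number of cases.
mean_income = sum(incomes) / len(incomes)

print(f"Mean income: {mean_income:,.0f}")
```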
Mode • Most common with nominal data • Count frequencies, find most common • Ask 30 1st graders favorite color • 7 blue • 3 chartreuse • 4 purple • 2 yellow • 10 red • 3 green • 1 Black • Mode- Red
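Finding the mode is just a frequency count followed by picking the largest count. The sketch below uses the first-grader color tallies from the slide; `collections.Counter` is one convenient way to do this, not the only one.

```python
from collections import Counter

# Frequencies from the favorite-color example (30 first graders).
favorite_colors = Counter({
    "blue": 7, "chartreuse": 3, "purple": 4, "yellow": 2,
    "red": 10, "green": 3, "black": 1,
})

# most_common(1) returns a list with the single highest-count (value, count) pair.
mode_color, mode_count = favorite_colors.most_common(1)[0]

print(f"Mode: {mode_color} ({mode_count} of {sum(favorite_colors.values())})")
```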
Computing the Median • Requires at least ordinal data • Put values in order • If odd number of cases: the middle value (half are above, half below) • If even number of cases: average of the two middle cases • Income example: • Unordered: 10,000, 15,000, 25,000, 55,000, 32,000, 29,500 • Ordered: 10,000, 15,000, 25,000, 29,500, 32,000, 55,000 • Median = (25,000 + 29,500)/2 = 27,250
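The ordering-then-middle procedure can be written out as a short function. This is a sketch of the steps listed on the slide; the function name `median` is illustrative, and Python's standard library offers `statistics.median` for the same job.

```python
def median(values):
    """Middle value of the ordered data; average of the two middle values if n is even."""
    if not values:
        raise ValueError("median of an empty list is undefined")
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


incomes = [10_000, 15_000, 25_000, 55_000, 32_000, 29_500]
median_income = median(incomes)  # averages the two middle values, 25,000 and 29,500

print(f"Median income: {median_income:,.0f}")
```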
When To Use Which? • Mode: nominal data • With few choices, better to give the percentage for every category, e.g. 33% red, 10% green • Mean: interval or ratio data without extreme values • Median: ordinal data, or cases where a few extreme values might skew the mean • Outlier: data point with an extreme value
Median vs. Mean • Created a fake town with 100 residents • Incomes 19,000-138,000 • Mean = 57,600, Median = 49,500 • Suppose one person with 30,000 moves away, replaced by a millionaire (1,000,000) • Mean = 67,300, Median = 55,000 • Replaced by 50,000,000 • Mean = 557,300, Median = 55,000 • Replaced by Bill Gates (50 billion) • Mean = about 500 million, Median = 55,000
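The same effect shows up in a much smaller example. The five-person town below is hypothetical (the slide's 100 incomes are not given), but it follows the same script: swap one modest income for a larger and larger one and watch which statistic moves.

```python
from statistics import mean, median

# Hypothetical five-resident town, not the 100-resident data from the slide.
town = [20_000, 30_000, 40_000, 50_000, 60_000]


def replace_resident(incomes, leaving, arriving):
    """Return a new list in which one resident's income is swapped out."""
    updated = list(incomes)
    updated[updated.index(leaving)] = arriving
    return updated


with_millionaire = replace_resident(town, 30_000, 1_000_000)
with_billionaire = replace_resident(town, 30_000, 50_000_000_000)

for label, incomes in [("original", town),
                       ("millionaire", with_millionaire),
                       ("50 billion", with_billionaire)]:
    print(f"{label:<12} mean={mean(incomes):>17,.0f}  median={median(incomes):>8,.0f}")
```

The median shifts once, when the newcomer lands above the middle instead of below it, and then stays put no matter how extreme the new income is. The mean keeps climbing with the size of the outlier.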
Measures of Dispersion • Measure of Central Tendency loses something • Income example? • Dispersion • Measure of how much divergence there is from the mean • Histogram • Horizontal Axis breaks variable down into ranges • Vertical Axis-count within each range
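A histogram is built by sorting each value into a range and counting how many land in each. The sketch below does that by hand for the six incomes used earlier in the deck; the 20,000 bin width and the function name `histogram_counts` are assumptions chosen for illustration.

```python
def histogram_counts(values, bin_width):
    """Count values per range; each key is the lower edge of a range."""
    counts = {}
    for value in values:
        lower_edge = (value // bin_width) * bin_width
        counts[lower_edge] = counts.get(lower_edge, 0) + 1
    return dict(sorted(counts.items()))


incomes = [10_000, 15_000, 25_000, 55_000, 32_000, 29_500]
bins = histogram_counts(incomes, 20_000)

# Text version: ranges listed down the page, one '#' per case in the range.
for lower_edge, count in bins.items():
    upper_edge = lower_edge + 20_000 - 1
    print(f"{lower_edge:>6,}-{upper_edge:>6,} | {'#' * count}")
```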
Quantifying Dispersion- Standard Deviation • Find difference from mean for each observation • Square each difference • Add the squared differences up • Divide by the number of cases minus 1 • Take the square root of the result
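As a worked sketch, the usual sample standard deviation squares each difference from the mean before summing, divides by the number of cases minus 1, and takes the square root at the end. The function name `sample_std` is illustrative; the data are the six incomes from the mean example, and the result is cross-checked against the standard library's `statistics.stdev`.

```python
import math
from statistics import stdev


def sample_std(values):
    """Sample standard deviation with an n - 1 denominator."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two cases")
    mean = sum(values) / n
    # Squaring keeps positive and negative differences from cancelling out.
    squared_diffs = [(v - mean) ** 2 for v in values]
    variance = sum(squared_diffs) / (n - 1)
    return math.sqrt(variance)


incomes = [10_000, 15_000, 25_000, 55_000, 32_000, 29_500]
sd = sample_std(incomes)

print(f"Standard deviation: {sd:,.1f}")
print("Matches statistics.stdev:", math.isclose(sd, stdev(incomes)))
```

Without the squaring step the differences from the mean always sum to zero, so the measure would say nothing about spread.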
Standard Deviation from Previous cases • Mean= 50,024, S.D=992.5 • Min=46,834, Max=52,935
Mean=50,255 S.D.=4792 • Min=35,671 Max=65,095
Mean=50,311 S.D.=10,124 • Min=22,522 Max=78,642
Mean=50,982 S.D.=18,898 • Min=1591 Max=105,957
Gore Thermometer • Mean=57.4, S.D.=25.7 • 0=4.6%, 100= 5.6%
George W Bush Thermometer • Mean=56.1 S.D.=24.9 • 0= 4.4% 100=4.7%
Clinton Thermometer • Mean=55.2 S.D.=29.7 • 0=9.5% 100=7.1%
For Next Time • The Normal Distribution • Bivariate Relationships • Get stats assignments