Review

Review
I volunteer in my son’s first grade class on library day. Each kid gets to check out one book. Here are the types of books they picked this week:
        Astronauts  Ninjas  Ponies  Birds  Total
Boys             4       3       1      1      9
Girls            2       1       6      3     12
Total            6       4       7      4     21
Suppose we want to know whether sex and book type are independent. Which of the following is NOT a correct statement of the null hypothesis?
• The distribution of book preferences is the same for boys and girls.
• Boys like all book types equally, and so do girls.
• Knowing whether a kid is male or female gives no information about his or her likely book preference.
• Knowing a kid’s book preference gives no information about the kid’s sex.

Review
I volunteer in my son’s first grade class on library day. Each kid gets to check out one book. Here are the types of books they picked this week:
        Astronauts  Ninjas  Ponies  Birds  Total
Boys             4       3       1      1      9
Girls            2       1       6      3     12
Total            6       4       7      4     21
If book type is independent of sex, how many boys would be expected to pick pony books?
• 2.25
• 2.63
• 3.00
• 3.50
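The expected count for any cell under independence is (row total × column total) / grand total. A minimal Python sketch of that rule, applied to the table above; the function name `expected_counts` is chosen here for illustration and is not part of the lecture:

```python
# Observed counts from the library-day table.
# Rows: boys, girls. Columns: astronauts, ninjas, ponies, birds.
observed = [
    [4, 3, 1, 1],
    [2, 1, 6, 3],
]

def expected_counts(table):
    """Expected cell counts if rows and columns are independent."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Each cell: row total * column total / grand total.
    return [[r * c / n for c in col_totals] for r in row_totals]

expected = expected_counts(observed)
for label, row in zip(["Boys", "Girls"], expected):
    print(label, [round(x, 2) for x in row])
```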

Review
Here are the observed frequencies, with the expected frequencies in parentheses:
        Astronauts  Ninjas   Ponies   Birds    Total
Boys    4 (2.6)     3 (1.7)  1 (3.0)  1 (1.7)  9
Girls   2 (3.4)     1 (2.3)  6 (4.0)  3 (2.3)  12
Total   6           4        7        4        21
Calculate the χ² statistic for testing independence.
• 2.32
• 5.89
• 9.04
• 9.69
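The χ² statistic sums (observed − expected)² / expected over all cells. The sketch below is one way to compute it in plain Python; the function name is hypothetical. It uses unrounded expected frequencies, so it gives about 5.93, whereas working by hand from the one-decimal values shown on the slide gives about 5.89.

```python
# Observed counts (rows: boys, girls; columns: the four book types).
observed = [
    [4, 3, 1, 1],
    [2, 1, 6, 3],
]

def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n  # expected under independence
            chi2 += (obs - exp) ** 2 / exp
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df

chi2, df = chi_square_independence(observed)
print(round(chi2, 2), df)
```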

Non-Parametric Tests 12/5

Parametric vs. Non-parametric Statistics
• Parametric statistics
  • Most common type of inferential statistics
    • r, t, F
  • Make strong assumptions about the population
    • Mathematically fully described, except for a few unknown parameters
  • Powerful, but limited to situations consistent with assumptions
• When parametric statistics fail
  • Assumptions not met
  • Ordinal data: Assumptions not meaningful
• Non-parametric statistics
  • Alternatives to parametric statistics
  • "Naive" approach: Far fewer assumptions about data
  • Work in wider variety of situations
  • Not as powerful as parametric statistics (when applicable)

Assumption Violations
• Parametric statistics only work if data obey certain properties
• Normality
  • Shape of population distribution
  • Determines shape of sampling distributions
  • Tells how likely extreme results should be; critical for correct p-values
  • More important with small sample sizes (Central Limit Theorem)
• Homogeneity of variance
  • Variance of groups is equal (t-test or ANOVA)
  • Variance from regression line does not depend on values of predictors
• Linear relationships
  • Pearson correlation cannot recognize nonlinear relationships

Assumption Violations
• Parametric statistics only work if data obey certain properties
• If these assumptions are true:
  • Population is almost fully described in advance
  • Goal is simply to estimate a few unknown parameters
• If assumptions violated:
  • Parametric statistics will not give correct answer
  • Need more conservative and flexible approach
[Figure: two population distributions, Normal(μ1, σ²) and Normal(μ2, σ²)]

Ordinal Data
• Some variables have ordered values but are not as well-defined as interval/ratio variables
  • Preferences
  • Rankings
  • Nonlinear measures, e.g. money as indicator of value
• Can't do statistics based on differences of scores
  • Mean, variance, r, t, F
• More structure than nominal data
  • Scores are ordered
  • Chi-square goodness of fit ignores this structure
• Want to answer same types of questions as with interval data, but without parametric statistics
  • Are variables correlated?
  • Do central tendencies differ?

Non-parametric Tests
• Can use without parametric assumptions and with ordinal data
• Basic idea
  • Convert raw scores to ranks
  • Do statistics on the ranks
• Answer similar questions as parametric tests
• Your job: Understand what each is used for and in what situations

Spearman Correlation
• Alternative to Pearson correlation
  • Produces correlation between -1 and 1
• Convert data on each variable to ranks
  • For each subject, find rank on X and rank on Y within sample
• Compute Pearson correlation from ranks
• Works for
  • Ordinal data
  • Monotonic nonlinear relationships (consistently increasing or decreasing)
[Figure: Y plotted against X, and the ranks RY plotted against RX]
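The rank-then-correlate recipe can be sketched in a few lines of plain Python. The helper names and the data below are made up for illustration; tied scores receive the average of the ranks they span, which is a common convention the slide does not spell out.

```python
def average_ranks(values):
    """Rank from 1 (smallest); tied values share the average of their ranks."""
    s = sorted(values)
    return [s.index(v) + 1 + (s.count(v) - 1) / 2 for v in values]

def pearson(x, y):
    """Ordinary Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical data: y = x**3 is monotonic but not linear.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(pearson(x, y), spearman(x, y))
```

On this example the Spearman correlation is at its maximum because the ranks agree perfectly, while the Pearson correlation on the raw scores falls short of 1.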

Mann-Whitney Test
• Alternative to independent-samples t-test
  • Do two groups differ?
• Combine groups and rank-order all scores
  • If groups differ, high ranks should be mostly in one group and low ranks in the other
• Test statistic (U) measures how well the groups' ranks are separated
• Compare U to its sampling distribution
  • Is it smaller than expected by chance?
  • Compute p-value in usual way
• Works for
  • Ordinal data
  • Non-normal populations and small sample sizes
[Figure: scale for U, from 0 (perfect separation) up to larger values (no separation)]
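A minimal sketch of how U can be computed, assuming the common definition U = rank sum − n(n+1)/2 for each group, with the smaller of the two values reported. The function names and data are hypothetical, and the sketch stops at the statistic; a p-value would come from a table of U or from a statistics package.

```python
def average_ranks(values):
    """Rank from 1 (smallest); tied values share the average of their ranks."""
    s = sorted(values)
    return [s.index(v) + 1 + (s.count(v) - 1) / 2 for v in values]

def mann_whitney_u(group_a, group_b):
    """Smaller of the two U values; 0 means perfectly separated groups."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))  # rank the combined scores
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a  # the two U values always sum to n_a * n_b
    return min(u_a, u_b)

# Hypothetical scores.
separated = mann_whitney_u([1, 2, 3], [4, 5, 6])    # every low rank in one group
interleaved = mann_whitney_u([1, 4, 5], [2, 3, 6])  # ranks mixed across groups
print(separated, interleaved)
```

With three scores per group, U can range from 0 to 4.5, so the interleaved example sits near the "no separation" end.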

Wilcoxon Test
• Alternative to single- or paired-samples t-test
  • Does median differ from μ0?
  • Does median difference score differ from 0?
• Subtract μ0 from all scores
  • Can skip this step for paired samples or if μ0 = 0
• Rank-order the absolute values
• Sum the ranks separately for positive and negative difference scores
  • If μ > μ0, positive scores should be larger
  • If μ < μ0, negative scores should be larger
• If sums of ranks are more different than likely by chance, reject H0
• Works for
  • Ordinal data
  • Non-normal populations and small sample sizes
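The ranking steps can be sketched as follows. The function name, the parameter `mu0`, and the scores are made up for illustration; dropping scores exactly equal to the hypothesized value is a common convention that the slide does not mention. The sketch returns the two rank sums only and leaves the comparison to the sampling distribution aside.

```python
def average_ranks(values):
    """Rank from 1 (smallest); tied values share the average of their ranks."""
    s = sorted(values)
    return [s.index(v) + 1 + (s.count(v) - 1) / 2 for v in values]

def signed_rank_sums(scores, mu0=0):
    """Return (sum of ranks for positive differences, sum for negative ones)."""
    # Subtract the hypothesized value; zero differences carry no sign, so drop them.
    diffs = [s - mu0 for s in scores if s != mu0]
    ranks = average_ranks([abs(d) for d in diffs])  # rank the absolute values
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return w_plus, w_minus

# Hypothetical scores tested against a hypothesized value of 10.
w_plus, w_minus = signed_rank_sums([12, 15, 9, 14, 18], mu0=10)
print(w_plus, w_minus)
```

For paired samples, the same function can be applied to the difference scores with `mu0=0`.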

Kruskal-Wallis Test
• Alternative to simple ANOVA
  • Do groups differ?
• Extends Mann-Whitney Test
• Combine groups and rank-order all scores
• Sum ranks in each group
  • If groups differ, then their sums of ranks should differ
• Test statistic (H) essentially measures variance of sums of ranks
• If H is larger than likely by chance, reject null hypothesis that populations are equal
• Works for
  • Ordinal data
  • Non-normal populations and small sample sizes
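A sketch of H under the standard textbook formula H = 12 / (N(N+1)) × Σ(R²/n) − 3(N+1), where R is a group's rank sum, n its size, and N the total sample size. The formula is not given on the slide, the function names and data are hypothetical, and no correction for ties is applied.

```python
def average_ranks(values):
    """Rank from 1 (smallest); tied values share the average of their ranks."""
    s = sorted(values)
    return [s.index(v) + 1 + (s.count(v) - 1) / 2 for v in values]

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic, without a tie correction."""
    combined = [score for group in groups for score in group]
    ranks = average_ranks(combined)  # rank all scores together
    n_total = len(combined)
    weighted = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])  # this group's rank sum
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    return 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)

# Hypothetical scores: three groups with no overlap at all.
h = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(round(h, 2))
```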

Friedman Test
• Alternative to repeated-measures ANOVA
  • Do measurements differ?
• Look at each subject separately and rank-order his/her scores
  • Best to worst for that subject, or favorite to least favorite
• For each measurement, sum ranks from all subjects
  • If measurements differ, then their sums of ranks should differ
• Test statistic (χ²_r) essentially measures variance of sums of ranks
  • If larger than likely by chance, reject H0
• Works for
  • Ordinal data
  • Non-normal populations and small sample sizes
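A sketch of the Friedman statistic under the standard textbook formula χ²_r = 12 / (nk(k+1)) × ΣR² − 3n(k+1), for n subjects and k measurements, where R is the rank sum for one measurement. The formula is not given on the slide, and the function names and data are hypothetical.

```python
def average_ranks(values):
    """Rank from 1 (smallest); tied values share the average of their ranks."""
    s = sorted(values)
    return [s.index(v) + 1 + (s.count(v) - 1) / 2 for v in values]

def friedman_statistic(scores):
    """Friedman statistic; scores holds one row per subject, one column per measurement."""
    n = len(scores)      # number of subjects
    k = len(scores[0])   # number of measurements per subject
    ranked = [average_ranks(row) for row in scores]  # rank within each subject
    rank_sums = [sum(col) for col in zip(*ranked)]   # sum ranks per measurement
    return 12 / (n * k * (k + 1)) * sum(r ** 2 for r in rank_sums) - 3 * n * (k + 1)

# Hypothetical data: all three subjects order the measurements the same way.
stat = friedman_statistic([[10, 20, 30], [5, 6, 9], [1, 7, 8]])
print(round(stat, 2))
```

Only the within-subject ordering matters here: the three subjects have very different raw scores but identical ranks, so the statistic is as large as it can be for three subjects and three measurements.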

Summary

Review
Comparing average score between two groups, on an ordinal-scale variable. What test should you use?
• Friedman
• Mann-Whitney
• Spearman
• Kruskal-Wallis

Review
Five subjects each measured in three conditions, on an interval-scale variable with a non-normal distribution. What test should you use?
• Repeated-measures ANOVA
• Kruskal-Wallis
• Friedman
• Spearman

Review
Comparing average scores among five groups, on an interval-scale variable with a non-normal distribution. 100 subjects per group. What test should you use?
• Kruskal-Wallis
• Friedman
• One-way ANOVA
• Mann-Whitney
