STAT 3120 Statistical Methods I
Lecture Notes 6: Analysis of Variance (ANOVA)
Testing for Relationships Among Variables
ANalysis Of VAriance
A few points about ANOVA:
• Used to determine whether statistically significant differences exist among the means of three or more groups (Question: why not use multiple t-tests?);
• Assumes that the groups are of approximately equal size, have approximately normal distributions, and are independent of each other;
• The test statistic for ANOVA is the F-statistic.
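One common answer to the question about multiple t-tests is that the chance of at least one false positive grows with the number of pairwise comparisons. The sketch below illustrates this under the simplifying assumption that the pairwise tests are independent (they are not exactly independent in practice, so treat the numbers as approximate); the function name `familywise_error_rate` is made up for this example.

```python
from math import comb

def familywise_error_rate(num_groups: int, alpha: float = 0.05) -> float:
    """Approximate chance of at least one Type I error when every pair
    of groups is compared with its own t-test at level alpha.

    Assumption (simplification): the pairwise tests are independent.
    """
    num_tests = comb(num_groups, 2)  # number of distinct pairs of groups
    return 1 - (1 - alpha) ** num_tests

# With 4 groups there are 6 pairwise tests, and the overall error rate
# is roughly 0.265 rather than the nominal 0.05.
for k in (3, 4, 5):
    print(k, comb(k, 2), round(familywise_error_rate(k), 3))
```

ANOVA avoids this inflation by testing all group means in a single F-test.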
ANOVA
There are two “templates” to keep in mind when conducting ANOVA. The first involves the structure of the data. This notation can be found on page 387.
ANOVA
The second involves the actual computation of the test statistic. This notation can be found on page 389.
ANOVA
Consider an experiment in which 4 groups of 5 subjects are each exposed to one of 4 different advertising strategies, and we measure the level of retention for the subjects in each group, as shown:
ANOVA
• If the sample means were identical or very similar, one would not claim that the Ad strategy affects retention.
• If the sample means are substantially different, one might conclude that the Ad strategy affects retention.
ANOVA
• According to the null hypothesis, the Ad strategy does not influence retention.
• According to the alternate hypothesis, the Ad strategy affects retention.
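In symbols, the two hypotheses for the four-group advertising example can be written as follows. The symbol $\mu_j$, standing for the population mean retention under the jth Ad strategy, is introduced here for illustration and does not appear in the slides.

```latex
H_0:\ \mu_1 = \mu_2 = \mu_3 = \mu_4
\qquad\text{versus}\qquad
H_a:\ \text{at least one } \mu_j \text{ differs from the others}
```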
ANOVA
• If the classification into groups is ignored, one can compute the sum of squared deviations of the observations about the mean of all of the scores (the grand mean). This is called the TOTAL SUM OF SQUARES.
• That value can be decomposed into two independent sources of variability.
ANOVA
• Some of the variability among the observations is a result of differences among people (or things) within the same group. This is called the WITHIN GROUPS sum of squares.
• Some of the variability among the observations is a result of differences among the groups. This is called the BETWEEN GROUPS sum of squares.
Decomposition of Total Deviation
SST = Total Sum of Squares
SSW = Sum of Squares Within Groups
SSB = Sum of Squares Between Groups
$\bar{X}$ = mean of the data for all the sample groups combined
$\bar{X}_j$ = mean of the jth sample group
$X_{ij}$ = the ith element from the jth group
n = number of observations in each group
SST = SSW + SSB
$$\sum_i \sum_j (X_{ij} - \bar{X})^2 = \sum_i \sum_j (X_{ij} - \bar{X}_j)^2 + n \sum_j (\bar{X}_j - \bar{X})^2$$
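The decomposition can be sketched in a few lines of Python. This is a minimal illustration, not the course's own code: the function name `sums_of_squares` is made up, and the data are hypothetical values chosen only to keep the arithmetic easy (they are not the retention scores from the lecture).

```python
def sums_of_squares(groups):
    """Return (SST, SSW, SSB) for a list of groups of observations.

    Uses each group's own size, so it reduces to the equal-n formula
    when every group has the same number of observations.
    """
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Total: every observation about the grand mean.
    sst = sum((x - grand_mean) ** 2 for x in all_values)
    # Within: every observation about its own group mean.
    ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    # Between: group means about the grand mean, weighted by group size.
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    return sst, ssw, ssb

# Hypothetical data: 3 groups of 3 observations each.
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
sst, ssw, ssb = sums_of_squares(groups)
print(sst, ssw, ssb)  # 60.0 6.0 54.0, and 60 = 6 + 54
```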
Computation of Sums of Squares
• Total Sum of Squares (SST): $\sum_i \sum_j (X_{ij} - \bar{X})^2 = 210 + 55 + 155 + 54 = 474$
• Within Groups Sum of Squares (SSW): $\sum_i \sum_j (X_{ij} - \bar{X}_j)^2 = 30 + 50 + 30 + 34 = 144$
• Between Groups Sum of Squares (SSB): $n \sum_j (\bar{X}_j - \bar{X})^2 = 5(6^2 + 1^2 + 5^2 + 2^2) = 330$
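As a quick arithmetic check, the per-group figures quoted above can be added up and tested against the identity SST = SSW + SSB. The sketch uses only the numbers on the slide; the variable names are illustrative, and the deviations 6, 1, 5, 2 are treated as magnitudes because their signs are not given.

```python
n = 5  # observations per group (4 groups of 5 subjects)

total_ss_by_group = [210, 55, 155, 54]   # each group's contribution to SST
within_ss_by_group = [30, 50, 30, 34]    # each group's contribution to SSW
mean_deviations = [6, 1, 5, 2]           # |group mean - grand mean|; signs not given

sst = sum(total_ss_by_group)
ssw = sum(within_ss_by_group)
ssb = n * sum(d ** 2 for d in mean_deviations)

print(sst, ssw, ssb)  # 474 144 330

# The decomposition holds overall and also group by group.
assert sst == ssw + ssb
for total, within, d in zip(total_ss_by_group, within_ss_by_group, mean_deviations):
    assert total == within + n * d ** 2
```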
Decomposition of SST
SST (474) = Within: Impact of All Other Variables (144) + Between: Exp. Factor (330)
Degrees of Freedom
• Degrees of freedom must be computed for each source of variability.
• The degrees of freedom for total are (nT - 1) = 19, where nT = 20 is the total number of observations.
• The degrees of freedom for between groups are (t - 1) = 3, where t = 4 is the number of groups.
• The degrees of freedom for within groups are (nT - t) = 16.
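With the sums of squares and degrees of freedom in hand, the F-statistic follows from the standard one-way ANOVA definitions: each mean square is a sum of squares divided by its degrees of freedom, and F is the ratio of the between-groups mean square to the within-groups mean square. The sketch below applies those textbook formulas to the numbers from this example; the slides shown here do not carry the calculation this far, so the variable names and the final value are an illustration, not the lecturer's own result.

```python
ssb, ssw = 330, 144   # between and within sums of squares from the example
n_total, t = 20, 4    # total observations and number of groups

df_between = t - 1          # 3
df_within = n_total - t     # 16
df_total = n_total - 1      # 19

# The between and within degrees of freedom add up to the total.
assert df_between + df_within == df_total

msb = ssb / df_between      # mean square between = 110.0
msw = ssw / df_within       # mean square within  = 9.0
f_stat = msb / msw          # about 12.22

print(df_between, df_within, round(f_stat, 2))
```

This value would then be compared with the critical value of the F distribution with 3 and 16 degrees of freedom at the chosen significance level.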