STATISTICAL INFERENCE PART IX: HYPOTHESIS TESTING - APPLICATIONS – MORE THAN TWO POPULATIONS
INFERENCES ABOUT POPULATION MEANS • Example: • H0: μ1 = μ2 = μ3 where • μ1 = population mean for group 1 • μ2 = population mean for group 2 • μ3 = population mean for group 3 • H1: Not all are equal.
Assumptions • Each of the populations is normally distributed (or the sample sizes are large enough to use the CLT), with equal variances (a sketch of checking normality and equal variances in R follows below) • Populations are independent • Cases within each sample are independent
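A minimal sketch of how the normality and equal-variance assumptions could be checked in R, assuming the va1 data frame (Sales as the response, Design as the grouping variable) introduced in the example later in this section:
# Make sure Design is treated as a factor, then fit the one-way ANOVA
va1$Design <- factor(va1$Design)
aov1 <- aov(Sales ~ Design, data = va1)
# Normality of residuals (Shapiro-Wilk test)
shapiro.test(residuals(aov1))
# Equal variances across groups (Bartlett's test)
bartlett.test(Sales ~ Design, data = va1)
# Independence (between populations and within samples) follows from the
# study design, e.g. random assignment of markets; it is not tested from the data.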
INFERENCES ABOUT POPULATION MEANS - ANOVA • If the difference in means is large relative to the overall variability, F tends to be large. • If the difference in means is small relative to the overall variability, F tends to be small. • Larger F-values typically yield more significant results. How large is large enough? We will compare F with the tabulated value (a sketch of this comparison follows below).
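To make the comparison with the tabulated value concrete, here is a minimal sketch that computes F by hand as MSTR/MSE and compares it with the critical value from qf(); it assumes the va1 data frame (Sales, Design) from the example below and should reproduce the F value reported there by summary(aov1):
# Between-group and within-group sums of squares
grand.mean  <- mean(va1$Sales)
group.means <- tapply(va1$Sales, va1$Design, mean)
group.sizes <- tapply(va1$Sales, va1$Design, length)
SSTR <- sum(group.sizes * (group.means - grand.mean)^2)             # between groups
SSE  <- sum((va1$Sales - group.means[as.character(va1$Design)])^2)  # within groups
k <- length(group.means)   # number of groups (designs)
n <- nrow(va1)             # total sample size
F.stat <- (SSTR / (k - 1)) / (SSE / (n - k))   # MSTR / MSE
F.crit <- qf(0.95, df1 = k - 1, df2 = n - k)   # tabulated value at alpha = 0.05
F.stat > F.crit                                # TRUE -> reject H0 of equal means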
INFERENCES ABOUT POPULATION MEANS • If the F test shows that there are significant differences between the means, then apply pairwise t-tests to see which one(s) are different (a short R sketch follows below). • Apply a multiple testing correction to control the Type I error rate.
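A minimal sketch of such pairwise comparisons with a multiple testing correction, again assuming the va1 data frame from the example below; pairwise.t.test() in base R reports the adjusted p-values directly:
# Pairwise t-tests between all design groups,
# with Bonferroni-adjusted p-values to control the Type I error rate
pairwise.t.test(va1$Sales, va1$Design, p.adjust.method = "bonferroni")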
Example • The Kenton Food Company wants to test 4 different package designs for a new product. The designs are introduced in 20 randomly selected markets that are similar to each other in terms of location and sales records. Due to a fire incident, one of these markets was removed from the study, leading to an unbalanced study design. The example is taken from: Neter, J., Kutner, M.H., Nachtsheim, C.J., & Wasserman, W. (1996). Applied Linear Statistical Models, 4th edition, Irwin.
Example Is there a difference among designs in terms of their average sales?
Example
> va1=read.table("VAT1.txt",header=T)
> head(va1,3)
  Case Design Market Sales
1    1      1      1    11
2    2      1      2    17
3    3      1      3    16
> aov1 = aov(Sales ~ Design,data=va1)
> summary(aov1)
            Df Sum Sq Mean Sq F value    Pr(>F)
Design       1 483.08  483.08  31.186 3.289e-05 ***
Residuals   17 263.34   15.49
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Degrees of freedom are wrong! Since there are 4 different designs, d.f. should be 3 (Design was read in as an integer, so R treated it as a single numeric predictor).
Example
> class(va1[,2])
[1] "integer"
> va1[,2]=as.factor(va1[,2])
> aov1 = aov(Sales ~ Design,data=va1)
> summary(aov1)
            Df Sum Sq Mean Sq F value    Pr(>F)
Design       3 588.22 196.074  18.591 2.585e-05 ***
Residuals   15 158.20  10.547
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# or, alternatively:
> aov1 = aov(Sales ~ factor(Design),data=va1)
Not all 4 designs have the same mean sales. But which one(s) are different?
Example
> library(multcomp)
> c1=glht(aov1, linfct = mcp(Design = "Tukey"))
> summary(c1)
Simultaneous Tests for General Linear Hypotheses
Multiple Comparisons of Means: Tukey Contrasts
Fit: aov(formula = Sales ~ Design, data = va1)
Linear Hypotheses:
           Estimate Std. Error t value Pr(>|t|)
2 - 1 == 0   -1.200      2.054  -0.584   0.9352
3 - 1 == 0    4.900      2.179   2.249   0.1545
4 - 1 == 0   12.600      2.054   6.135   <0.001 ***
3 - 2 == 0    6.100      2.179   2.800   0.0584 .
4 - 2 == 0   13.800      2.054   6.719   <0.001 ***
4 - 3 == 0    7.700      2.179   3.534   0.0141 *
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Adjusted p values reported -- single-step method)
The 4th design has higher average sales than all other designs. The difference between the 3rd and 2nd designs is only marginally significant (adjusted p ≈ 0.06).
Example
# or, alternatively
> TukeyHSD(aov1, "Design", conf.level=0.9)
• There are many functions in R available for multiple testing correction. For instance, you can look into the "p.adjust" function in the stats package for other types of corrections (e.g. Bonferroni): supply raw p-values to obtain adjusted p-values (a short sketch follows below).
• Different ANOVA types (e.g. 2-factor, repeated, ...) in R; reference: Ilk, O. (2011) R Yazilimina Giris, ODTU, Chp. 7.
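A minimal sketch of p.adjust(); the raw p-values here are made up purely to illustrate the call:
raw.p <- c(0.001, 0.012, 0.030, 0.210)   # hypothetical raw p-values
p.adjust(raw.p, method = "bonferroni")   # Bonferroni correction
p.adjust(raw.p, method = "BH")           # Benjamini-Hochberg (FDR) correction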