
3. Analysis of Variance (ANOVA)


Presentation Transcript


  1. 3. Analysis of Variance (ANOVA) Example: We want to find out whether the beverage that people drink affects reaction time. If there were only two groups (e.g. water and fruit juice), a two-sample t-test could be used. With three groups (water, fruit juice, coffee), or in general with three or more groups, a different approach is needed: the analysis of variance (ANOVA).

  2. ANOVA uses an F-test based on the ratio of two variance estimates to determine whether or not significant differences exist among the means of several groups of observations: the variance within each group and the variance between the groups. Null hypothesis: the population means are equal, H0: μ1 = μ2 = … = μk.

  3. [Figures: Example 1 and Example 2]

  4. Example: A quantity of each of three chemical fertilizers was applied to three groups of five corn plants each, with all plants growing under identical conditions of temperature, rainfall, soil, seed, etc. From the following measurements of corn growth (height after one month), determine whether there is any reason to believe that one fertilizer is better than another. Data and notation:

  5. There are two possible estimates of the population variance: a) On the basis of the variances within the groups: the within-groups sum of squares and the mean square within groups (MSW). If the number of measurements is the same in every group, MSW equals the average of the sample variances of the groups, with df = k(n-1) degrees of freedom:
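
In the usual notation (k groups, n observations per group; x̄i and si² denote the mean and sample variance of group i), the quantities referred to here are:

$$SS_W=\sum_{i=1}^{k}\sum_{j=1}^{n}\left(x_{ij}-\bar{x}_i\right)^2,\qquad MS_W=\frac{SS_W}{k(n-1)}=\frac{1}{k}\sum_{i=1}^{k}s_i^2$$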

  6. b) On the basis of the variance of the group means: the between-groups sum of squares and the mean square between groups (MSB). If the number of measurements is the same in every group, MSB equals n times the variance of the group means, with df = k-1 degrees of freedom:
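
With x̄ denoting the grand mean, the corresponding between-groups quantities are

$$SS_B=n\sum_{i=1}^{k}\left(\bar{x}_i-\bar{x}\right)^2,\qquad MS_B=\frac{SS_B}{k-1}$$

which, for equal group sizes, is n times the sample variance of the k group means.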

  7. If there is no difference between the groups, the deviation of the group means has the same source (e.g. measurement error) as the deviation within the groups. Hence both MSB and MSW are estimates of the population variance, and MSB / MSW follows an F-distribution. If there is a difference between the groups, MSB / MSW is greater than the critical value of the F-distribution. The equality of the two variance estimates is examined by an F-test. The test statistic:

  8. In the example 0.468 < 3.89, so the null hypothesis is accepted. Thus, there is no reason to believe that one fertilizer promotes growth more than another.
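
A minimal Python sketch of these calculations is given below. The height values are placeholders rather than the measured data from the slides, but the critical value F0.05(2, 12) ≈ 3.89 quoted above follows from k = 3 groups of n = 5 plants.

```python
import numpy as np
from scipy import stats

# Placeholder data: 3 fertilizer groups x 5 plants (NOT the slide's actual table)
groups = [np.array([48., 50., 47., 52., 49.]),
          np.array([49., 51., 50., 48., 50.]),
          np.array([47., 49., 51., 50., 48.])]

k = len(groups)                    # number of groups
n = len(groups[0])                 # observations per group (balanced design)

# Within groups: average of the sample variances (equal group sizes), df = k(n-1)
ms_w = np.mean([g.var(ddof=1) for g in groups])
# Between groups: n times the variance of the group means, df = k-1
ms_b = n * np.var([g.mean() for g in groups], ddof=1)

f0 = ms_b / ms_w
f_crit = stats.f.ppf(0.95, dfn=k - 1, dfd=k * (n - 1))   # ≈ 3.89 for (2, 12)
print(f"F0 = {f0:.3f}, critical F = {f_crit:.2f}")

# Cross-check with SciPy's built-in one-way ANOVA
f_scipy, p_value = stats.f_oneway(*groups)
print(f"scipy F = {f_scipy:.3f}, p = {p_value:.3f}")
```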

  9. ANOVA table. Balanced design: the same number of observations at every level of the factors.
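
For a balanced one-way design the table has the usual layout:

Source of variation    SS     df          MS                    F
Between groups         SSB    k - 1       MSB = SSB/(k-1)       MSB/MSW
Within groups          SSW    k(n - 1)    MSW = SSW/(k(n-1))
Total                  SST    kn - 1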

  10. Excel results

  11. 4. Factorial design Design of experiments (DOE) is a series of tests in which purposeful changes are made to the input variables (factors) of a system and the effects on response variables are measured. DOE is an effective tool for maximizing the amount of information gained from a study while minimizing the amount of data to be collected. Factorial experimental designs investigate the effects of many different factors by varying them simultaneously instead of changing only one factor at a time.

  12. Factorial designs allow estimation of the sensitivity to each factor and also to the combined effect of two or more factors. The main uses of design of experiments are • Discovering interactions among factors • Estimating the response variables • Establishing and maintaining quality control • Optimizing a process

  13. a) Classical design of experiments: one factor is varied at a time. b) Factorial design: all the factors are varied simultaneously. [Figure panels a) and b)]

  14. 2^p full factorial design: p factors, each at two levels, N = 2^p trials. Linear model. Example 4-1. Two response functions (y1, y2) are examined by a 2^2 factorial design. The results are:
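
With the coded factor levels ±1, the linear (main-effects) model referred to here has the standard form

$$\hat{y}=b_0+b_1x_1+b_2x_2+\dots+b_px_p$$

which for the 2^2 example reduces to ŷ = b0 + b1·x1 + b2·x2.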

  15. Transformation:
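
The usual coding, which maps each natural factor z_i onto the levels -1 and +1 (with z_i^0 the centre of the investigated range and Δz_i its half-width), is

$$x_i=\frac{z_i-z_i^{0}}{\Delta z_i},\qquad z_i^{0}=\frac{z_i^{\max}+z_i^{\min}}{2},\qquad \Delta z_i=\frac{z_i^{\max}-z_i^{\min}}{2}$$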

  16. The linear model for the response variable. In the example, the estimation: For uniform treatment of the terms, the design can be extended by a variable x0 kept at the level +1.

  17. The parameters are estimated by multiple linear regression: I. Evaluation of the response y1: The estimation for y1 (equation of a plane):
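
A minimal sketch of this step in Python. The response values y1 are placeholders chosen only so that the fitted coefficients reproduce the b1 = -2 and b2 = 4 discussed later; they are not the measured data of Example 4-1.

```python
import numpy as np

# Coded 2^2 design matrix extended with the constant column x0 = +1
# columns: x0, x1, x2; rows: the N = 4 trials
X = np.array([[1., -1., -1.],
              [1.,  1., -1.],
              [1., -1.,  1.],
              [1.,  1.,  1.]])

# Placeholder responses (NOT the slide's measured y1 values)
y1 = np.array([23., 19., 31., 27.])

# Multiple linear regression: b = (X'X)^-1 X'y.
# Because the coded design is orthogonal, X'X = N*I, so this reduces to b = X'y / N.
b, *_ = np.linalg.lstsq(X, y1, rcond=None)
b0, b1, b2 = b
print(f"y1_hat = {b0:.2f} + ({b1:.2f})*x1 + ({b2:.2f})*x2")  # equation of a plane
```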

  18. Determining the factor effects. It can easily be seen that: Effect graphs:
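
For coded ±1 levels the relation in question is presumably the standard one: the main effect of factor x_j, i.e. the change in the mean response when x_j moves from -1 to +1, equals twice the regression coefficient,

$$\mathrm{Effect}(x_j)=\bar{y}(x_j=+1)-\bar{y}(x_j=-1)=2\,b_j$$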

  19. Interaction between the factors (interaction graph): The effect lines of x1 belonging to the lower and upper levels of x2 are parallel, i.e. the effect of x1 is the same at both levels of x2, so there is no interaction between the factors.

  20. II. Evaluation of the response y2: Estimates of the parameters of the linear model. The estimation for y2 (equation of a plane):

  21. Determining the factor effects: Effect graphs:

  22. Interaction between the factors: The effect lines of x1 belonging to the lower and upper levels of x2 are not parallel, i.e. the effect of x1 depends on the level of x2, so there is interaction between the factors.

  23. The model must be modified to take the interaction between the factors into account: The estimation for y2:
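
With coded levels, the modified model is the standard two-factor model with an interaction term, whose coefficient is estimated from the element-wise product x1·x2 in the same way as the main effects:

$$\hat{y}_2=b_0+b_1x_1+b_2x_2+b_{12}x_1x_2,\qquad b_{12}=\frac{1}{N}\sum_{u=1}^{N}x_{1u}x_{2u}\,y_{2u}$$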

  24. If there is no interaction between the factors, the estimated two-factor model is the equation of a plane. If there is interaction between the factors, the surface is warped. The estimated models can be used for interpolation and extrapolation.

  25. Significance of the estimated model parameters. Does bj differ from zero significantly? The standard deviation of the parameters (sb) is needed, which can be calculated from the standard deviation of the response variable (sy): sy can be determined from replicated measurements performed in the centre of the design, where all factor levels are 0. In the example, the y1 results in the centre are: 25, 25, 26.

  26. Does bj differ from zero significantly? Null hypothesis: Test statistic: The null hypothesis is rejected if: t0.025(2) = 4.3, so a parameter differs from zero significantly if its absolute value is greater than 0.289 × 4.3 = 1.243. In the example both b1 = -2 and b2 = 4 are significant.
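
The quoted numbers can be reproduced with a short sketch. The centre replicates 25, 25, 26 and the coefficients b1 = -2, b2 = 4 are taken from the slides; the relation sb = sy/√N for an orthogonal two-level design with N = 4 trials is an assumption.

```python
import numpy as np
from scipy import stats

centre = np.array([25., 25., 26.])   # replicated measurements at the design centre
N = 4                                # number of trials in the 2^2 design

s_y = centre.std(ddof=1)             # std. dev. of the response, df = kc - 1 = 2
s_b = s_y / np.sqrt(N)               # assumed relation sb = sy / sqrt(N)  ->  about 0.289

t_crit = stats.t.ppf(1 - 0.025, df=len(centre) - 1)   # t_0.025(2) ≈ 4.30
threshold = t_crit * s_b                              # ≈ 1.24

for name, b in [("b1", -2.0), ("b2", 4.0)]:
    print(f"{name} = {b}: significant = {abs(b) > threshold}")
```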

  27. Is the linear model adequate, or is a second-order model necessary? The centre of the design is examined. If the real y1 surface is also linear, then the expected value of b0 is identical to the expected value of the y1 measurements in the centre. Null hypothesis: Alternative hypothesis: Test statistic (similar to the two-sample t-test): where: df: l: the number of parameters in the model, kc: the number of replicates in the centre.
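
By analogy with the two-sample t-test, the statistic presumably compares b0 (the mean of the N factorial runs) with the mean ȳc of the kc centre replicates, using sy as the standard deviation; the degrees of freedom depend on how sy is estimated (the slide combines l, the number of model parameters, and kc):

$$t_0=\frac{\left|\bar{y}_c-b_0\right|}{s_y\sqrt{\dfrac{1}{N}+\dfrac{1}{k_c}}}$$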

  28. In the example t0 < t0.025, so the null hypothesis is accepted and the linear model is adequate.

  29. The figure illustrates the case when the linear model is not adequate.
