
Experimental design and statistical analyses of data


navid

Presentation Transcript


  1. Experimental design and statistical analyses of data. Lesson 5: Mixed models, nested anovas, split-plot designs

  2. Randomized block design
      • All treatments are allocated to the same experimental units
      • Treatments are allocated at random
      Treatments (a = 4), blocks (b = 3)

  3. Blocks (patients), treatments (drugs)

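  4. An alternative way of writing a GLM: $y_{ij} = \mu + \alpha_i + \beta_j + \varepsilon_{ij}$
      • $y_{ij}$: response of patient j receiving drug i
      • $\mu$: overall mean
      • $\alpha_i = \mu_i - \mu$: effect of drug i
      • $\beta_j = \mu_j - \mu$: effect of patient j
      • $\varepsilon_{ij}$: residual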

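  5. $y_{ij} = \hat{y}_{ij} + \varepsilon_{ij}$
      • $y_{ij}$: response of patient j receiving drug i
      • $\hat{y}_{ij} = \mu + \alpha_i + \beta_j$: predicted value of y, with $\alpha_i = \mu_i - \mu$ and $\beta_j = \mu_j - \mu$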

  6. Effects of drugs Effects of patients Ex: Patient 2 receiving treatment C:
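The drug and patient effects can be recomputed from the data listed on slide 17. The following is a minimal Python sketch, not part of the original SAS material: the variable and function names are hypothetical, and the effects are estimated as marginal means minus the overall mean, following $\alpha_i = \mu_i - \mu$ and $\beta_j = \mu_j - \mu$.

```python
# Data from slide 17: response y for each (patient, drug) combination.
data = {
    ("1", "A"): 5.17, ("2", "A"): 6.23, ("3", "A"): 4.93,
    ("1", "B"): 5.21, ("2", "B"): 7.34, ("3", "B"): 4.55,
    ("1", "C"): 4.91, ("2", "C"): 6.18, ("3", "C"): 4.64,
    ("1", "D"): 4.74, ("2", "D"): 6.31, ("3", "D"): 4.61,
}
patients = sorted({p for p, _ in data})
drugs = sorted({d for _, d in data})

# Overall mean (estimate of mu).
grand_mean = sum(data.values()) / len(data)

# Estimated drug effects: mean of drug i minus the overall mean.
alpha = {d: sum(data[p, d] for p in patients) / len(patients) - grand_mean
         for d in drugs}

# Estimated patient effects: mean of patient j minus the overall mean.
beta = {p: sum(data[p, d] for d in drugs) / len(drugs) - grand_mean
        for p in patients}


def predict(patient, drug):
    """Predicted value of y for a patient/drug pair in the additive model."""
    return grand_mean + alpha[drug] + beta[patient]


# Example from the slide: patient 2 receiving treatment C.
y_hat = predict("2", "C")
residual = data["2", "C"] - y_hat
print(f"predicted = {y_hat:.3f}, observed = {data['2', 'C']:.2f}, "
      f"residual = {residual:.3f}")
```

Because the effects are deviations from the overall mean, the drug effects sum to zero, and so do the patient effects.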

  7. Consider the two questions:
      • Are the three patients different?
      • Are patients in general different?
      In the first case, "patients" is considered a fixed factor. In the second case, "patients" is considered a random factor.

  8. "Patients" is a random effect: if patients are randomly chosen, $\beta_j$ is a stochastic variable. $\beta_j$ is assumed to be iid $\mathrm{ND}(0, \sigma_b^2)$, i.e. independently and identically normally distributed with zero mean and variance $\sigma_b^2$. [Figure: probability density of $\beta$]

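  9. Variances: $V(y) = V(\mu + \alpha_i + \beta_j + \varepsilon) = V(\mu) + V(\alpha_i) + V(\beta_j) + V(\varepsilon) = \sigma_a^2 + \sigma_b^2 + \sigma^2$
      • $\sigma_a^2$: variance due to drug (factor a)
      • $\sigma_b^2$: variance due to patient (factor b)
      • $\sigma^2$: residual variance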

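  10. Both factors are fixed
      $V(y) = V(\mu + \alpha_i + \beta_j + \varepsilon) = V(\mu) + V(\alpha_i) + V(\beta_j) + V(\varepsilon) = \sigma_a^2 + \sigma_b^2 + \sigma^2$, where fixed effects contribute no variance
      Variance of a single observation: $V(y) = \sigma^2$
      Variance of an average: [formula not preserved]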

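  11. "Patients" is a random factor (mixed anova)
      $V(y) = V(\mu + \alpha_i + \beta_j + \varepsilon) = V(\mu) + V(\alpha_i) + V(\beta_j) + V(\varepsilon) = \sigma_a^2 + \sigma_b^2 + \sigma^2$, where only the random patient effect and the residual contribute variance
      Variance of a single observation: $V(y) = \sigma_b^2 + \sigma^2$
      Variance of an average: [formula not preserved]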

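  12. Both factors are random
      $V(y) = V(\mu + \alpha_i + \beta_j + \varepsilon) = V(\mu) + V(\alpha_i) + V(\beta_j) + V(\varepsilon) = \sigma_a^2 + \sigma_b^2 + \sigma^2$
      Variance of a single observation: $V(y) = \sigma_a^2 + \sigma_b^2 + \sigma^2$
      Variance of an average: [formula not preserved]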

  13. Expected mean squares

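  14. Expected mean squares
      • Drugs: $E[MS_a] = b\sigma_a^2 + \sigma^2$, df = $a - 1$; $H_0$: $\alpha_A = \alpha_B = \alpha_C = \alpha_D = 0$, i.e. $\sigma_a^2 = 0$
      • Patients: $E[MS_b] = a\sigma_b^2 + \sigma^2$, df = $b - 1$; $H_0$: $\beta_1 = \beta_2 = \beta_3 = 0$, i.e. $\sigma_b^2 = 0$
      • Residual: $E[MS_e] = \sigma^2$, df = $(a - 1)(b - 1)$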

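  15. If "patients" is a random factor, $\sigma_b^2$ is estimated from $E[MS_b] = a\sigma_b^2 + \sigma^2$: $\hat{\sigma}_b^2 = (MS_b - MS_e)/a = (3.824 - 0.117)/4 = 0.927$
      Variance of a single observation: $V(y) = \sigma_b^2 + \sigma^2 = 0.927 + 0.117 = 1.044$
      Variance of the average: [formula not preserved]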
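The arithmetic on this slide can be checked with a few lines of Python. This is a sketch only: the mean squares are copied from the SAS output on slide 18, the variable names are hypothetical, and the last line rests on an assumption that is not confirmed by the transcript, namely that the average in question is a drug mean taken over the b = 3 independently sampled patients.

```python
# Mean squares taken from the SAS output (slide 18).
ms_b = 3.82415833   # mean square for patients (PAT)
ms_e = 0.11733611   # residual mean square (Error)
a = 4               # number of drugs
b = 3               # number of patients

# Solve E[MSb] = a*sigma_b^2 + sigma^2 for sigma_b^2, using MSe for sigma^2.
var_patient = (ms_b - ms_e) / a

# Variance of a single observation when patients are random.
var_single = var_patient + ms_e

# ASSUMPTION: the average is a drug mean over b different patients, so each
# observation carries an independent patient effect and residual.
var_drug_mean = var_single / b

print(f"sigma_b^2 = {var_patient:.3f}")
print(f"V(y)      = {var_single:.3f}")
print(f"V(mean)   = {var_drug_mean:.3f}  (under the stated assumption)")
```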

  16. How to do it with SAS

  17. DATA eks5_1;
        INPUT pat $ treat $ y;    /* read the data */
        CARDS;                    /* the data follow; they can also be read from a file */
      1 A 5.17
      2 A 6.23
      3 A 4.93
      1 B 5.21
      2 B 7.34
      3 B 4.55
      1 C 4.91
      2 C 6.18
      3 C 4.64
      1 D 4.74
      2 D 6.31
      3 D 4.61
      ;
      PROC GLM;                   /* procedure General Linear Models */
        TITLE 'Eksempel 5.1';     /* include if a title is wanted */
        CLASS pat treat;          /* pat and treat are class (qualitative) variables */
        MODEL y = pat treat;
        RANDOM pat;               /* patients are a random factor */
      RUN;

  18. Eksempel 5.1                                   13:18 Monday, November 5, 2001
      General Linear Models Procedure
      Dependent Variable: Y

          Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
          Model               5       8.09475000    1.61895000     13.80   0.0031
          Error               6       0.70401667    0.11733611
          Corrected Total    11       8.79876667

          R-Square   C.V.       Root MSE     Y Mean
          0.919987   6.341443   0.34254359   5.40166667

          Source   DF   Type I SS     Mean Square   F Value   Pr > F
          PAT       2   7.64831667    3.82415833      32.59   0.0006
          TREAT     3   0.44643333    0.14881111       1.27   0.3666

          Source   DF   Type III SS   Mean Square   F Value   Pr > F
          PAT       2   7.64831667    3.82415833      32.59   0.0006
          TREAT     3   0.44643333    0.14881111       1.27   0.3666

      Annotations: MSe is the Error mean square, MSb the PAT mean square, and MSa the TREAT mean square.
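As a cross-check of the SAS table, the sums of squares, mean squares and F values can be recomputed directly from the data on slide 17. The sketch below uses plain Python with hypothetical variable names; like the default PROC GLM output, it uses MSe as the denominator for both F ratios.

```python
# Data from slide 17: (patient, drug) -> response.
data = {
    ("1", "A"): 5.17, ("2", "A"): 6.23, ("3", "A"): 4.93,
    ("1", "B"): 5.21, ("2", "B"): 7.34, ("3", "B"): 4.55,
    ("1", "C"): 4.91, ("2", "C"): 6.18, ("3", "C"): 4.64,
    ("1", "D"): 4.74, ("2", "D"): 6.31, ("3", "D"): 4.61,
}
patients = sorted({p for p, _ in data})
drugs = sorted({d for _, d in data})
a, b = len(drugs), len(patients)   # a = 4 treatments, b = 3 blocks

grand = sum(data.values()) / (a * b)
drug_mean = {d: sum(data[p, d] for p in patients) / b for d in drugs}
pat_mean = {p: sum(data[p, d] for d in drugs) / a for p in patients}

# Sums of squares for the randomized block design (one observation per cell).
ss_total = sum((y - grand) ** 2 for y in data.values())
ss_treat = b * sum((m - grand) ** 2 for m in drug_mean.values())
ss_pat = a * sum((m - grand) ** 2 for m in pat_mean.values())
ss_error = ss_total - ss_treat - ss_pat

# Mean squares and F ratios with MSe as the denominator.
ms_treat = ss_treat / (a - 1)
ms_pat = ss_pat / (b - 1)
ms_error = ss_error / ((a - 1) * (b - 1))
f_pat = ms_pat / ms_error
f_treat = ms_treat / ms_error

print(f"PAT:   SS = {ss_pat:.5f}  MS = {ms_pat:.5f}  F = {f_pat:.2f}")
print(f"TREAT: SS = {ss_treat:.5f}  MS = {ms_treat:.5f}  F = {f_treat:.2f}")
print(f"Error: SS = {ss_error:.5f}  MS = {ms_error:.5f}")
```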

  19. Eksempel 5.1                                   09:00 Friday, November 16, 2001
      General Linear Models Procedure

          Source   Type III Expected Mean Square
          PAT      Var(Error) + 4 Var(PAT)
          TREAT    Var(Error) + Q(TREAT)

  20. Nested designs

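  21. Crossed design: patient j is the same for all drugs
      Factor A (drug): A, B, C, D
      Factor B (patient): 1, 2, 3 (the same three patients under every drug)
      Replicate: 1, 2 for each combination of drug and patient
      Model: $y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{(ij)k}$

      Nested design: patient j is not the same for all drugs; patients are said to be nested within drugs
      Factor A (drug): A, B, C, D
      Factor B (patient): 1, 2, 3 within each drug (different patients under each drug)
      Replicate: 1, 2 for each patient
      Model: $y_{ijk} = \mu + \alpha_i + \beta_{(i)j} + \varepsilon_{(ij)k}$

      Replicates can also be regarded as nested within drugs and patients.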

  22. Rules for finding the EMS (after Dunn and Clark)
      (1) For each effect, write down every possible variance component containing every letter of the effect name. For example, in a two-way design with r replicates per cell, the EMS for factor A includes $\sigma_a^2$, $\sigma_{ab}^2$ and $\sigma_{(ab)e}^2$, but not $\sigma_b^2$.
      (2) For any nested factor, add in parentheses to the effect name the name(s) of the factor(s) within which it is nested; e.g. if B is nested in A, $\sigma_{(a)b}^2$ is the variance of $\beta_{(i)j}$.
      (3) For the coefficient of each variance component, use all letters not in the subscripts of the variance component.
      (4) For each variance component, look at any subscripts outside parentheses that are not in the effect name; if any of these letters corresponds to a fixed effect, omit that variance component.

  23. Two-way anova (A and B fixed)
      Factor A (drug): A, B, C, D; Factor B (patient): 1, 2, 3 under every drug; Replicate: 1, 2 in every cell
      Model: $y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{(ij)k}$
      • $(\alpha\beta)_{ij}$: interaction between drug and patient
      • $\varepsilon_{(ij)k}$: residual of the kth replicate nested within drug i and patient j

  24. Rule (1): For each effect, write down every possible variance component containing every letter of the effect name. For example, in a two-way design with r replicates per cell, the EMS for factor A includes $\sigma_a^2$, $\sigma_{ab}^2$ and $\sigma_{(ab)e}^2$, but not $\sigma_b^2$.
      Factor A: $\sigma_a^2 + \sigma_{ab}^2 + \sigma_{(ab)e}^2$
      Factor B: $\sigma_b^2 + \sigma_{ab}^2 + \sigma_{(ab)e}^2$
      Factor AB: $\sigma_{ab}^2 + \sigma_{(ab)e}^2$
      Residual: $\sigma_{(ab)e}^2$

  25. Rule (2): For any nested factor, add in parentheses to the effect name the name(s) of the factor(s) within which it is nested; e.g. if B is nested in A, $\sigma_{(a)b}^2$ is the variance of $\beta_{(i)j}$.
      Factor A: $\sigma_a^2 + \sigma_{ab}^2 + \sigma_{(ab)e}^2$
      Factor B: $\sigma_b^2 + \sigma_{ab}^2 + \sigma_{(ab)e}^2$
      Factor AB: $\sigma_{ab}^2 + \sigma_{(ab)e}^2$
      Residual: $\sigma_{(ab)e}^2$

  26. Rule (3): For the coefficient of each variance component, use all letters not in the subscripts of the variance component.
      Factor A: $br\sigma_a^2 + r\sigma_{ab}^2 + \sigma_{(ab)e}^2$
      Factor B: $ar\sigma_b^2 + r\sigma_{ab}^2 + \sigma_{(ab)e}^2$
      Factor AB: $r\sigma_{ab}^2 + \sigma_{(ab)e}^2$
      Residual: $\sigma_{(ab)e}^2$

  27. Rule (4): For each variance component, look at any subscripts outside parentheses that are not in the effect name; if any of these letters corresponds to a fixed effect, omit that variance component.
      Factor A: $br\sigma_a^2 + r\sigma_{ab}^2 + \sigma_{(ab)e}^2$
      Factor B: $ar\sigma_b^2 + r\sigma_{ab}^2 + \sigma_{(ab)e}^2$
      Factor AB: $r\sigma_{ab}^2 + \sigma_{(ab)e}^2$
      Residual: $\sigma_{(ab)e}^2$
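Rules (1), (3) and (4) are mechanical enough to be written as a short function. The following Python sketch is an illustration only and covers fully crossed designs with r replicates per cell, so rule (2) on nesting does not come into play. The function name `ems_crossed` and the text notation `var(...)` are hypothetical; for a fixed factor, `var(...)` should be read as the corresponding quadratic term (SAS prints it as Q).

```python
from itertools import combinations


def ems_crossed(factors, fixed):
    """Expected mean squares for a fully crossed design with r replicates.

    factors: tuple of factor letters, e.g. ("a", "b")
    fixed:   set of the letters that denote fixed factors
    Returns a dict mapping each effect name to its EMS written as text.
    """
    # Every main effect and interaction is a non-empty subset of the factors.
    effects = [frozenset(c)
               for n in range(1, len(factors) + 1)
               for c in combinations(factors, n)]
    table = {}
    for effect in effects:
        terms = []
        for comp in effects:
            # Rule (1): keep components containing every letter of the effect.
            if not effect <= comp:
                continue
            # Rule (4): drop the component if one of its extra letters is fixed.
            if (comp - effect) & set(fixed):
                continue
            # Rule (3): coefficient = all letters not in the subscript, plus r.
            coef = "".join(f for f in factors if f not in comp) + "r"
            name = "".join(f for f in factors if f in comp)
            terms.append(f"{coef}*var({name})")
        # The residual is random and contains all letters, so it always stays.
        terms.append("var(e)")
        label = "".join(f for f in factors if f in effect)
        table[label] = " + ".join(terms)
    table["e"] = "var(e)"
    return table


for fixed in (set(), {"a"}, {"a", "b"}):
    print("fixed factors:", sorted(fixed) or "none")
    for effect, ems in ems_crossed(("a", "b"), fixed).items():
        print(f"  {effect:>2}: {ems}")
```

With A fixed and B random, the interaction term is kept in the EMS for A but dropped from the EMS for B, which is why the choice of fixed and random factors changes the appropriate denominator of each F test.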

  28. Two-way anova (A and B fixed)
      Factor A (drug): A, B, C, D; Factor B (patient): 1, 2, 3 under every drug; Replicate: 1, 2 in every cell
      Model: $y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{(ij)k}$

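  29. Two-way anova (A fixed, B random)
      Factor A (drug): A, B, C, D; Factor B (patient): 1, 2, 3 under every drug; Replicate: 1, 2 in every cell
      Model: $y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{(ij)k}$
      $\beta_j$ is $\mathrm{ND}(0, \sigma_b^2)$
      NB! $(\alpha\beta)_{ij}$ is $\mathrm{ND}(0, \sigma_{ab}^2(1 - 1/a))$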

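  30. Two-way anova (A and B random)
      Factor A: A, B, C, D; Factor B: 1, 2, 3 under every level of A; Replicate: 1, 2 in every cell
      Model: $y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{(ij)k}$
      $\alpha_i$ is $\mathrm{ND}(0, \sigma_a^2)$
      $\beta_j$ is $\mathrm{ND}(0, \sigma_b^2)$
      $(\alpha\beta)_{ij}$ is $\mathrm{ND}(0, \sigma_{ab}^2)$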

  31. Nested anova (A fixed, B random)
      Factor A (drug): A, B, C, D; Factor B (patient): 1, 2, 3 within each drug; Replicate: 1, 2 for each patient
      Model: $y_{ijk} = \mu + \alpha_i + \beta_{(i)j} + \varepsilon_{(ij)k}$
      $\beta_{(i)j}$ is $\mathrm{ND}(0, \sigma_{(a)b}^2)$

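  32. Nested anova (A and B random)
      Factor A (doctor): A, B, C, D; Factor B (patient): 1, 2, 3 within each doctor; Replicate: 1, 2 for each patient
      Model: $y_{ijk} = \mu + \alpha_i + \beta_{(i)j} + \varepsilon_{(ij)k}$
      $\alpha_i$ is $\mathrm{ND}(0, \sigma_a^2)$
      $\beta_{(i)j}$ is $\mathrm{ND}(0, \sigma_{(a)b}^2)$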

  33. Four level nested anova
      Treatment (a = 3): 40%, 20%, 0%
      Tree (b = 2): 1, 2 within each treatment
      Leaf (c = 3): 1, 2, 3 within each tree
      Replicate (r = 2): 1, 2 within each leaf
      Model: $y_{ijkl} = \mu + \alpha_i + \beta_{(i)j} + \gamma_{(ij)k} + \varepsilon_{(ijk)l}$
      $\beta_{(i)j}$ is $\mathrm{ND}(0, \sigma_{(a)b}^2)$
      $\gamma_{(ij)k}$ is $\mathrm{ND}(0, \sigma_{(ab)c}^2)$

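  34. Expected mean squares for the four level nested anova
      $E[MS_{(ab)c}] = r\sigma_{(ab)c}^2 + \sigma^2$
      $E[MS_{(a)b}] = cr\sigma_{(a)b}^2 + r\sigma_{(ab)c}^2 + \sigma^2 = cr\sigma_{(a)b}^2 + E[MS_{(ab)c}]$
      $E[MS_a] = bcr\sigma_a^2 + cr\sigma_{(a)b}^2 + r\sigma_{(ab)c}^2 + \sigma^2 = bcr\sigma_a^2 + E[MS_{(a)b}]$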
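For the balanced design on slide 33 (a = 3, b = 2, c = 3, r = 2), the coefficients in these expressions and the degrees of freedom of each level follow from simple counting. The sketch below is a small Python illustration with hypothetical dictionary keys modelled on the SAS source names; the results can be compared with the SAS output on slides 37 and 46.

```python
# Balanced four level nested design from slide 33.
a, b, c, r = 3, 2, 3, 2   # treatments, trees per treatment, leaves per tree, replicates per leaf

# Degrees of freedom for each level of the hierarchy.
df = {
    "treat": a - 1,
    "tree(treat)": a * (b - 1),
    "leaf(treat*tree)": a * b * (c - 1),
    "error": a * b * c * (r - 1),
}

# Coefficient of each level's own variance component in the EMS:
# the number of observations that share one level of that factor.
coef = {
    "treat": b * c * r,
    "tree(treat)": c * r,
    "leaf(treat*tree)": r,
    "error": 1,
}

total_df = a * b * c * r - 1
for source in df:
    print(f"{source:<18} df = {df[source]:>2}   coefficient = {coef[source]:>2}")
print(f"{'corrected total':<18} df = {total_df}")
```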

  35. How to do it with SAS

  36. DATA nested;    /* Nested anova (example 6-4 in the lecture notes) */
        INFILE 'H:\lin-mod\eks6x.prn' FIRSTOBS=2;
        INPUT treat $ tree $ leaf $ disc $ Nitro;
      PROC GLM;
        CLASS treat tree leaf disc;
        MODEL Nitro = treat tree(treat) leaf(tree treat);
          /* treatment is a fixed factor, while trees and leaves are random */
        RANDOM tree(treat) leaf(tree treat);
          /* gives the expected mean squares */
      RUN;

  37. General Linear Models Procedure
      Dependent Variable: NITRO

          Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
          Model              17     134.04000000    7.88470588      8.00   0.0001
          Error              18      17.75000000    0.98611111
          Corrected Total    35     151.79000000

          R-Square   C.V.       Root MSE     NITRO Mean
          0.883062   3.271932   0.99303127   30.35000000

          Source              DF   Type I SS      Mean Square   F Value   Pr > F
          TREAT                2   71.78000000    35.89000000     36.40   0.0001
          TREE(TREAT)          3   36.04666667    12.01555556     12.18   0.0001
          LEAF(TREAT*TREE)    12   26.21333333     2.18444444      2.22   0.0618

          Source              DF   Type III SS    Mean Square   F Value   Pr > F
          TREAT                2   71.78000000    35.89000000     36.40   0.0001
          TREE(TREAT)          3   36.04666667    12.01555556     12.18   0.0001
          LEAF(TREAT*TREE)    12   26.21333333     2.18444444      2.22   0.0618

      NB! These F values are based on MSe as the error term, which is wrong!

  38. DATA nested;    /* Nested anova (example 6-4 in the lecture notes) */
        INFILE 'H:\lin-mod\eks6x.prn' FIRSTOBS=2;
        INPUT treat $ tree $ leaf $ disc $ Nitro;
      PROC GLM;
        CLASS treat tree leaf disc;
        MODEL Nitro = treat tree(treat) leaf(tree treat);
          /* treatment is a fixed factor, while trees and leaves are random */
        RANDOM tree(treat) leaf(tree treat);
          /* gives the expected mean squares */
      RUN;

  39. General Linear Models Procedure

          Source              Type III Expected Mean Square
          TREAT               Var(Error) + 2 Var(LEAF(TREAT*TREE)) + 6 Var(TREE(TREAT)) + Q(TREAT)
          TREE(TREAT)         Var(Error) + 2 Var(LEAF(TREAT*TREE)) + 6 Var(TREE(TREAT))
          LEAF(TREAT*TREE)    Var(Error) + 2 Var(LEAF(TREAT*TREE))

  40. PROC GLM;
        CLASS treat tree leaf disc;
        MODEL Nitro = treat tree(treat) leaf(tree treat);
          /* treatment is a fixed factor, while trees and leaves are random */
        RANDOM tree(treat) leaf(tree treat);
          /* gives the expected mean squares */
        TEST H=treat E=tree(treat);
          /* tests for the difference between treatments with MS for tree(treat) as denominator */
        TEST H=tree(treat) E=leaf(tree treat);
          /* tests for the difference between trees with MS for leaf(tree treat) as denominator */

  41. General Linear Models Procedure
      Dependent Variable: NITRO

      Tests of Hypotheses using the Type III MS for TREE(TREAT) as an error term

          Source         DF   Type III SS    Mean Square   F Value   Pr > F
          TREAT           2   71.78000000    35.89000000      2.99   0.1933

      Tests of Hypotheses using the Type III MS for LEAF(TREAT*TREE) as an error term

          Source         DF   Type III SS    Mean Square   F Value   Pr > F
          TREE(TREAT)     3   36.04666667    12.01555556      5.50   0.0130
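The F values in these tables are ratios of each mean square to the mean square of the level nested immediately below it, as the expected mean squares on slide 39 require. The sketch below recomputes them in Python from the mean squares reported on slide 37 and also estimates the variance components by solving the EMS equations; the variable names are hypothetical, and p-values are left out because they need an F distribution function that the standard library does not provide.

```python
# Mean squares copied from the SAS output on slide 37.
ms = {
    "treat": 35.89,
    "tree(treat)": 12.01555556,
    "leaf(treat*tree)": 2.18444444,
    "error": 0.98611111,
}
c, r = 3, 2   # leaves per tree, replicates per leaf

# F ratios with the denominators indicated by the expected mean squares.
f_treat = ms["treat"] / ms["tree(treat)"]
f_tree = ms["tree(treat)"] / ms["leaf(treat*tree)"]
f_leaf = ms["leaf(treat*tree)"] / ms["error"]

# Variance components obtained by solving the EMS equations step by step.
var_error = ms["error"]
var_leaf = (ms["leaf(treat*tree)"] - ms["error"]) / r
var_tree = (ms["tree(treat)"] - ms["leaf(treat*tree)"]) / (c * r)

print(f"F(treat) = {f_treat:.2f}, F(tree) = {f_tree:.2f}, F(leaf) = {f_leaf:.2f}")
print(f"Var(error) = {var_error:.3f}, Var(leaf) = {var_leaf:.3f}, "
      f"Var(tree) = {var_tree:.3f}")
```

Using MSe as the denominator for the treatment effect gives F = 36.40 on slide 37, whereas the tree(treat) mean square gives F = 2.99, so the choice of error term changes the conclusion about treatments.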

  42. PROC GLM;
        CLASS treat tree leaf disc;
        MODEL Nitro = treat tree(treat) leaf(tree treat);
          /* treatment is a fixed factor, while trees and leaves are random */
        RANDOM tree(treat) leaf(tree treat);
          /* gives the expected mean squares */
        TEST H=treat E=tree(treat);
          /* tests for the difference between treatments with MS for tree(treat) as denominator */
        TEST H=tree(treat) E=leaf(tree treat);
          /* tests for the difference between trees with MS for leaf(tree treat) as denominator */
        MEANS treat / TUKEY DUNNETT('Control') E=tree(treat) CLDIFF;
          /* finds possible significant differences among the treatments and between each treatment and the control */
      RUN;

  43. Tukey's Studentized Range (HSD) Test for variable: NITRO
      NOTE: This test controls the type I experimentwise error rate.
      Alpha= 0.05  Confidence= 0.95  df= 3  MSE= 12.01556
      Critical Value of Studentized Range= 5.910
      Minimum Significant Difference= 5.9134
      Comparisons significant at the 0.05 level are indicated by '***'.

          TREAT            Simultaneous Lower   Difference      Simultaneous Upper
          Comparison       Confidence Limit     Between Means   Confidence Limit
          20% - 40%              -3.663              2.250             8.163
          20% - Control          -2.513              3.400             9.313
          40% - 20%              -8.163             -2.250             3.663
          40% - Control          -4.763              1.150             7.063
          Control - 20%          -9.313             -3.400             2.513
          Control - 40%          -7.063             -1.150             4.763

  44. Dunnett's T tests for variable: NITRO
      NOTE: This tests controls the type I experimentwise error for comparisons of all treatments against a control.
      Alpha= 0.05  Confidence= 0.95  df= 3  MSE= 12.01556
      Critical Value of Dunnett's T= 3.866
      Minimum Significant Difference= 5.4714
      Comparisons significant at the 0.05 level are indicated by '***'.

          TREAT            Simultaneous Lower   Difference      Simultaneous Upper
          Comparison       Confidence Limit     Between Means   Confidence Limit
          20% - Control          -2.071              3.400             8.871
          40% - Control          -4.321              1.150             6.621

  45. PROC NESTED;
        CLASS treat tree leaf;
        VAR Nitro;
      RUN;

  46. Coefficients of Expected Mean Squares

          Source   TREAT   TREE   LEAF   ERROR
          TREAT       12      6      2       1
          TREE         0      6      2       1
          LEAF         0      0      2       1
          ERROR        0      0      0       1
