Group 5 AMS 572 Professor: Wei Zhu

Members:
• Foram Sanghvi: Brief review of ANOVA
• Shihui Xiang: Introduction to Repeated Measures Design
• Qianzhu Wu: One-way repeated measures ANOVA
• Yue Tang: Using the REPEATED statement of PROC ANOVA
• Yan Xu: Two-Factor ANOVA with Repeated Measures on One Factor
• Weina Gao: Two-Factor experiments with Repeated Measures on both factors
• Yi Hu: Three-Factor experiments with a repeated measure on the last factor
• Xiaoke Fei: Three-Factor experiments with repeated measures on two factors
• Yuzhou Song: Mixed Model

Review of ANOVA and need for repeated measures design
Foram Sanghvi

Review of ANOVA
The one-way ANOVA tests the equality of several population means. It is an extension of the pooled-variance t-test to more than two groups. That is, for a groups:
• H0 (null hypothesis): µ1 = µ2 = µ3 = … = µa
• Ha (alternative hypothesis): At least one of the means differs from the rest.
Assumptions:
• Equal population variances
• Normal populations
• Independent samples

Test statistic: $F_0 = \dfrac{MSA}{MSE} \sim F_{a-1,\,N-a}$ under H0
Conclusion: Reject H0 if $F_0 > F_{a-1,\,N-a,\,\alpha}$

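Meaning of the terms:
• MSA = variance of the group means (mean square among groups)
• MSE = mean of the within-group variances (mean square error)
• Total sample size: $N = \sum_{j=1}^{a} n_j$, where $n_j$ is the number of observations in group j
• Sample mean of group j: $\bar{Y}_j = \frac{1}{n_j}\sum_{i=1}^{n_j} Y_{ij}$
• Grand mean: $\bar{Y} = \frac{1}{N}\sum_{j=1}^{a}\sum_{i=1}^{n_j} Y_{ij}$
• $Y_{ij}$ = observed response from experimental unit i when receiving effect j, with $Y_{ij} \sim N(\mu_j, \sigma^2)$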
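As a sketch of how the two mean squares are built, the standard one-way ANOVA definitions are written out below. The notation is assumed here for illustration: a groups, $n_j$ observations in group j, N observations in total, $\bar{Y}_j$ the mean of group j and $\bar{Y}$ the grand mean.

```latex
\[
\begin{aligned}
MSA &= \frac{SSA}{a-1} = \frac{1}{a-1}\sum_{j=1}^{a} n_j\,(\bar{Y}_j - \bar{Y})^2,\\
MSE &= \frac{SSE}{N-a} = \frac{1}{N-a}\sum_{j=1}^{a}\sum_{i=1}^{n_j} (Y_{ij} - \bar{Y}_j)^2,\\
F_0 &= \frac{MSA}{MSE}.
\end{aligned}
\]
```

With equal group sizes, MSE is exactly the average of the a within-group sample variances, and with a = 2 the statistic $F_0$ equals the square of the pooled-variance t statistic.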

The most distinct disadvantage of the analysis of variance (ANOVA) method is that it requires assumptions that we rarely have the luxury of meeting in real-world applications:
1. The samples from each data group must be independent.
2. All variances from each data group must be (roughly) equal.
In addition, the normal subject-to-subject variation may strongly affect the error sum of squares.

A repeated measures design is one in which at least one of the factors consists of repeated measurements on the same subjects or experimental units, under different conditions.

• A repeated measures design involves measuring subjects at different points in time (typically after different treatments).
• It can be viewed as an extension of the paired-samples t-test (which involves only two related measures).
• Thus the measures, unlike in “regular” ANOVA, are correlated, i.e., the observations are not independent.

• Data are collected at a sequence of evenly spaced points in time
• Treatments are assigned to experimental units

By collecting data from the same participants under repeated conditions, individual differences can be eliminated or reduced as a source of between-group differences. Also, the sample is not divided between conditions or groups, so inferential testing becomes more powerful. This design is also economical when sample members are difficult to recruit, because each member is measured under all conditions.

As with any ANOVA, repeated measures ANOVA tests the equality of means. However, repeated measures ANOVA is used when all members of a random sample are measured under a number of different conditions.
• As the sample is exposed to each condition, the measurement of the dependent variable is repeated.
• Using a standard ANOVA in this case is not appropriate because it fails to model the correlation between the repeated measures: the data violate the ANOVA assumption of independence.

• The simplest example of a repeated measures design is a paired t-test. Each subject is measured twice (time 1 and time 2) on the same variable, or the two members of each matched pair are assigned to the two treatment levels.
• If we observe participants at more than two time points, then we need to conduct a repeated measures ANOVA.

What we would like to do is to decompose the variability into:
(1) A random effect
(2) A fixed effect
• The effect of participants is always a random effect.
• We will only consider situations where the factor is a fixed effect.

$Y_{ij} = \mu_j + S_i + \varepsilon_{ij}$
• $\mu_j$ = the fixed effect
• $S_i$ = the random effect of subject i
• $\varepsilon_{ij}$ = the random error, independent of $S_i$

Assumptions of a repeated measures design
• For a repeated measures design, we start with the same assumptions as a paired t-test:
  • Participants are independent and randomly selected from the population.
  • Normality (actually symmetry).
• Because we have more than two measurements on each participant, we have an additional assumption on the variances.

The assumptions we have to check for a repeated measures design are:
1. Participants are independent and randomly selected from the population
2. Normality (actually symmetry)
3. Compound symmetry
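Compound symmetry means that the repeated measures share a common variance and a common covariance. As a sketch of why the subject-effect model $Y_{ij} = \mu_j + S_i + \varepsilon_{ij}$ produces this structure, assume that $S_i$ has variance $\sigma_S^2$ and that the errors $\varepsilon_{ij}$ have variance $\sigma^2$, are mutually independent, and are independent of $S_i$. These distributional details and the symbols $\sigma_S^2$ and $\sigma^2$ are assumptions made for illustration.

```latex
\[
\begin{aligned}
\operatorname{Var}(Y_{ij}) &= \operatorname{Var}(S_i) + \operatorname{Var}(\varepsilon_{ij}) = \sigma_S^2 + \sigma^2,\\
\operatorname{Cov}(Y_{ij}, Y_{ij'}) &= \operatorname{Var}(S_i) = \sigma_S^2 \qquad (j \neq j'),\\
\operatorname{Corr}(Y_{ij}, Y_{ij'}) &= \frac{\sigma_S^2}{\sigma_S^2 + \sigma^2}.
\end{aligned}
\]
```

Under these assumptions every pair of conditions has the same correlation, which is the pattern the compound symmetry assumption asks us to check.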

Consider the following experiment: We have four drugs (1,2,3 and 4) that relieve pain. Each subject is given each of the four drugs. The subject’s pain tolerance is then measured. Enough time is allowed to pass between successive drug administrations so that we can be sure there’s no residual effect from the previous drug. The null hypothesis is: Mean(1)=Mean(2)=Mean(3)=Mean(4)

In the one-way analysis of variance without a repeated measure, we would have each subject receive only one of the four drugs. In this design, each subject is measured under each of the drug conditions. This has several important advantages.

Each subject acts as his or her own control. That is, drug effects are calculated from the deviations between each drug score and that subject's average score across the drugs. The normal subject-to-subject variation can thus be removed from the error sum of squares.
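A sketch of how the subject variation leaves the error term follows. The notation is assumed here rather than taken from the slides: n subjects, k drugs, $\bar{Y}_{i\cdot}$ the mean for subject i, $\bar{Y}_{\cdot j}$ the mean for drug j, and $\bar{Y}_{\cdot\cdot}$ the grand mean. The total sum of squares splits into three parts:

```latex
\[
\begin{aligned}
\underbrace{\sum_{i=1}^{n}\sum_{j=1}^{k} (Y_{ij}-\bar{Y}_{\cdot\cdot})^2}_{SS_{\text{total}}}
&= \underbrace{k\sum_{i=1}^{n} (\bar{Y}_{i\cdot}-\bar{Y}_{\cdot\cdot})^2}_{SS_{\text{subjects}}}
 + \underbrace{n\sum_{j=1}^{k} (\bar{Y}_{\cdot j}-\bar{Y}_{\cdot\cdot})^2}_{SS_{\text{drug}}}
 + \underbrace{\sum_{i=1}^{n}\sum_{j=1}^{k} (Y_{ij}-\bar{Y}_{i\cdot}-\bar{Y}_{\cdot j}+\bar{Y}_{\cdot\cdot})^2}_{SS_{\text{error}}},\\
F &= \frac{SS_{\text{drug}}/(k-1)}{SS_{\text{error}}/\big((n-1)(k-1)\big)}.
\end{aligned}
\]
```

If the same scores were analyzed while ignoring which subject produced them, $SS_{\text{subjects}}$ would stay inside the error term; here it is set aside before the drug effect is tested.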

SAS code without using the REPEATED statement

    DATA PAIN;
       INPUT SUBJ DRUG PAIN;
    DATALINES;
    1 1 5
    1 2 9
    1 3 6
    1 4 11
    2 1 7
    2 2 12
    ……
    ;

The same data set can be built by reading one line per subject:

    DATA PAIN;
       INPUT SUBJ @;
       DO DRUG = 1 to 4;
          INPUT PAIN @;
          OUTPUT;
       END;
    DATALINES;
    1 5 9 6 11
    2 7 12 8 9
    3 11 12 10 14
    4 3 8 5 8
    ;

    PROC ANOVA DATA=PAIN;
       TITLE 'without repeated statement';
       CLASS SUBJ DRUG;
       MODEL PAIN=SUBJ DRUG;
       MEANS DRUG/DUNCAN;
    RUN;

SAS code without using the REPEATED statement

    DATA PAIN;
       INPUT SUBJ @;         /* trailing @: keep reading from the same line of data */
       DO DRUG = 1 to 4;     /* iterative loop: a lot easier! */
          INPUT PAIN @;
          OUTPUT;
       END;
    DATALINES;
    1 5 9 6 11
    2 7 12 8 9
    3 11 12 10 14
    4 3 8 5 8
    ;

SAS code without using the REPEATED statement
Remark 1: about the DO statement
The general form:

    DO variable = start TO end BY increment;
       (SAS statements)
    END;

• start: initial value
• end: ending value
• increment: default is 1

SAS code without using the REPEATED statement
Remark 1: about the DO statement
In our example:

    DO DRUG = 1 to 4;     /* initial value: 1, ending value: 4 */
       INPUT PAIN @;      /* @: keep reading from the same line of data */
       OUTPUT;
    END;                  /* return to "DO" */

SAS code without using the REPEATED statement
Remark 2: about the ANOVA procedure

    PROC ANOVA DATA=PAIN;
       TITLE 'without repeated statement';
       CLASS SUBJ DRUG;
       MODEL PAIN=SUBJ DRUG;
       MEANS DRUG/DUNCAN;
    RUN;

• No “|” in the MODEL statement: SUBJ and DRUG are each main effects, with no interaction term between them.
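As a hand check of what this model partitions (a sketch computed here from the four data lines of the example, not output copied from SAS): the subject totals are 31, 36, 47 and 24, the drug totals are 26, 41, 29 and 42, the grand total is 138, and the sum of the 16 squared scores is 1324. The symbol CF for the correction factor is notation assumed for this sketch.

```latex
\[
\begin{aligned}
CF &= \frac{138^2}{16} = 1190.25,\\
SS_{\text{total}} &= 1324 - 1190.25 = 133.75,\\
SS_{\text{SUBJ}} &= \frac{31^2+36^2+47^2+24^2}{4} - CF = 1260.5 - 1190.25 = 70.25,\\
SS_{\text{DRUG}} &= \frac{26^2+41^2+29^2+42^2}{4} - CF = 1240.5 - 1190.25 = 50.25,\\
SS_{\text{error}} &= 133.75 - 70.25 - 50.25 = 13.25,\\
F_{\text{DRUG}} &= \frac{50.25/3}{13.25/9} = \frac{16.75}{1.4722} \approx 11.38 \quad \text{on } (3,\,9) \text{ df}.
\end{aligned}
\]
```

More than half of the total variation (70.25 of 133.75) is subject-to-subject variation, which the SUBJ term keeps out of the error sum of squares.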

SAS code using the REPEATED statement

    DATA REPEAT;
       INPUT PAIN1-PAIN4;
    DATALINES;
    5 9 6 11
    7 12 8 9
    11 12 10 14
    3 8 5 8
    ;
    PROC ANOVA DATA=REPEAT;
       TITLE 'using repeated statement';
       MODEL PAIN1-PAIN4 = / NOUNI;
       REPEATED DRUG 4 (1 2 3 4);
    RUN;

SAS code using the REPEATED statement
Remark 1: about the data set
We need the data set in the form:

    SUBJ PAIN1 PAIN2 PAIN3 PAIN4

NOTICE that it does not have a DRUG variable.

SAS code using the REPEATED statement
Remark 2: about the REPEATED statement
To compute pairwise comparisons, the general form is:

    REPEATED factor_name CONTRAST(n);

• n is a number from 1 to k, with k being the number of levels of the repeated factor;
• To get all pairwise contrasts, we need k-1 REPEATED statements.

SAS code using the REPEATED statement
Remark 2: about the REPEATED statement
In our example:

    PROC ANOVA DATA=REPEAT;
       TITLE 'using repeated statement';
       MODEL PAIN1-PAIN4 = / NOUNI;
       REPEATED DRUG 4 CONTRAST(1) / SUMMARY;   /* SUMMARY: request ANOVA tables for each contrast */
       REPEATED DRUG 4 CONTRAST(2) / SUMMARY;
       REPEATED DRUG 4 CONTRAST(3) / SUMMARY;
    RUN;

SAS code using the REPEATED statement
Remark 3: more explanation of the ANOVA procedure

    PROC ANOVA DATA=REPEAT;
       TITLE 'using repeated statement';
       MODEL PAIN1-PAIN4 = / NOUNI;
       REPEATED DRUG 4 (1 2 3 4);
    RUN;

• No CLASS: our data set does not have an independent variable
• NOUNI: do not conduct a separate analysis for each of the four PAIN variables
• 4: the repeated factor “DRUG” has four levels; optional
• (1 2 3 4): the labels we want printed for each level of DRUG

SAS code using the REPEATED statement
Remark 4: comparison of the DATA steps

Without the REPEATED statement, one line per subject read with a DO loop:

    DATA PAIN;
       INPUT SUBJ @;
       DO DRUG = 1 to 4;
          INPUT PAIN @;
          OUTPUT;
       END;
    DATALINES;
    1 5 9 6 11
    2 7 12 8 9
    3 11 12 10 14
    4 3 8 5 8
    ;

Without the REPEATED statement, one line per observation:

    DATA PAIN;
       INPUT SUBJ DRUG PAIN;
    DATALINES;
    1 1 5
    1 2 9
    1 3 6
    1 4 11
    2 1 7
    2 2 12
    ……
    ;

With the REPEATED statement:

    DATA REPEAT;
       INPUT PAIN1-PAIN4;
    DATALINES;
    5 9 6 11
    7 12 8 9
    11 12 10 14
    3 8 5 8
    ;

Design layout:
• Factor A: GROUP (Control, Treatment)
• Repeated Factor B: TIME (PRE, POST)
• Rows of the layout: Subject

Source                                        df
Between subjects
  Treatment                                   a-1
  Error due to subjects within treatment      a(n-1)
Within subjects
  Time                                        b-1
  Treatment × time                            (a-1)×(b-1)
  Error or residual                           a×(n-1)×(b-1)
Total variance                                N-1

a: # of treatment groups
b: # of time points
n: # of subjects per treatment
N = a×b×n: total # of measurements
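As a quick consistency check on these degrees of freedom, the components should add up to N-1. The values a = 2, b = 2, n = 3 used in the second line are those of the pretest/posttest example in this section (two groups, two time points, three subjects per group):

```latex
\[
\begin{aligned}
\underbrace{(a-1) + a(n-1)}_{\text{between subjects}}
 + \underbrace{(b-1) + (a-1)(b-1) + a(n-1)(b-1)}_{\text{within subjects}}
&= (an-1) + an(b-1) = abn - 1 = N-1,\\
(1 + 4) + (1 + 1 + 4) &= 11 = 12 - 1.
\end{aligned}
\]
```

The two error lines, each with 4 degrees of freedom in that example, are the denominators of the between-subjects and within-subjects F tests respectively.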

    data prepost;
       input subj group $ pretest postest;
    datalines;
    1 c 80 83
    2 c 85 86
    3 c 83 88
    4 t 82 94
    5 t 87 93
    6 t 84 98
    ;
    run;
    proc anova data=prepost;
       title 'Two-way ANOVA with a Repeated Measure on One Factor';
       class group;
       model pretest postest = group / nouni;
       repeated time 2 (0 1);
       means group;
    run;

MANOVA Test Criteria and Exact F Statistics for the Hypothesis of no time Effect

Statistic                 Value         F Value   Num DF   Den DF   Pr > F
Wilks' Lambda             0.13216314    26.27     1        4        0.0069
Pillai's Trace            0.86783686    26.27     1        4        0.0069
Hotelling-Lawley Trace    6.56640625    26.27     1        4        0.0069
Roy's Greatest Root       6.56640625    26.27     1        4        0.0069

MANOVA Test Criteria and Exact F Statistics for the Hypothesis of no time*group Effect

Statistic                 Value         F Value   Num DF   Den DF   Pr > F
Wilks' Lambda             0.32611465    8.27      1        4        0.0452
Pillai's Trace            0.67388535    8.27      1        4        0.0452
Hotelling-Lawley Trace    2.06640625    8.27      1        4        0.0452
Roy's Greatest Root       2.06640625    8.27      1        4        0.0452

Tests of Hypotheses for Between Subjects Effects

Source        DF   Anova SS       Mean Square    F Value   Pr > F
group         1    90.75000000    90.75000000    11.84     0.0263
Error         4    30.66666667    7.66666667

Univariate Tests of Hypotheses for Within Subject Effects

Source        DF   Anova SS       Mean Square    F Value   Pr > F
time          1    140.0833333    140.0833333    26.27     0.0069
time*group    1    44.0833333     44.0833333     8.27      0.0452
Error(time)   4    21.3333333     5.3333333

Level of          -----------pretest-----------    -----------postest-----------
group       N     Mean           Std Dev           Mean           Std Dev
c           3     82.6666667     2.51661148        85.6666667     2.51661148
t           3     84.3333333     2.51661148        95.0000000     2.64575131
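As a check on how the F values in this output arise (a sketch using only the sums of squares and mean squares printed in the output), each F is a mean square divided by its matching error mean square: Error for the between-subjects effect and Error(time) for the within-subject effects. Because time has only two levels, the multivariate statistics for the time effect follow from the same two sums of squares:

```latex
\[
\begin{aligned}
F_{\text{group}} &= \frac{90.75}{7.6667} \approx 11.84, &
F_{\text{time}} &= \frac{140.0833}{5.3333} \approx 26.27, &
F_{\text{time}\times\text{group}} &= \frac{44.0833}{5.3333} \approx 8.27,\\
\Lambda_{\text{time}} &= \frac{21.3333}{21.3333 + 140.0833} \approx 0.1322, &
\text{Hotelling-Lawley}_{\text{time}} &= \frac{140.0833}{21.3333} \approx 6.5664.
\end{aligned}
\]
```

The time*group interaction (p = 0.0452, below 0.05) matches the pattern in the table of means: the treatment group gains about 10.7 points from pretest to posttest, while the control group gains 3.0.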
