Mixing it up: Mixed Models

Mixing it up: Mixed Models Tracy Tomlinson December 11, 2009

Outline • What are fixed effects • What are random effects • How do I know if my effects are fixed or random • Why do I care about fixed and random effects • Mixed models • SAS and mixed models • SPSS and mixed models

Fixed Effects • Specific levels of interest of a factor are selected • May use all levels or a subset of levels • These are the specific levels of interest • Interest in comparing these levels • Inference only for these levels

Random Effects • Levels of a factor selected from a probability distribution • Interested in the extent to which the random factor accounts for variance in the dependent variable • May be a control variable or a variable of interest • Rather than being interested in the individual means across the levels of the fixed factor, we are interested in the variance of means across the levels of a random factor

Fixed Effects: One Factor • Running a clinical trial in which a drug is administered at four different dose levels • Model Equation: $Y_{ij} = \mu + \alpha_i + e_{ij}$ • $i$ indexes the four dose levels (1, 2, 3, or 4) • $\alpha_i$ is the effect of dose level $i$ on the mean

Random Effects: One Factor • Clinical trial using a new drug at 20 different clinics in DC selected at random • Model Equation: $Y_{ij} = \mu + a_i + e_{ij}$ • where $i$ indexes the 20 clinics • where $\mu$ represents the mean across all clinics in the population, not just those observed in the study • The effects $a_i$ are random variables with mean 0 and variance $\sigma_a^2$ • Compare the fixed effects model, $Y_{ij} = \mu + \alpha_i + e_{ij}$, where the $\alpha_i$ are fixed parameters
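To make "interested in the variance across levels" concrete, the sketch below estimates the two variance components of a balanced one-way random effects model. It is a minimal illustration and not part of the original slides: the function name `variance_components` and the toy clinic data are hypothetical, and it uses the textbook ANOVA (method-of-moments) estimators rather than the likelihood-based estimates a mixed model procedure would report.

```python
from statistics import mean


def variance_components(groups):
    """ANOVA estimators for a balanced one-way random effects model.

    groups: list of equally sized lists, one list of observations per level
            (for example, one list per clinic).
    Returns (sigma2_a, sigma2_e): between-level and within-level variance.
    """
    k = len(groups)          # number of levels of the random factor
    n = len(groups[0])       # observations per level
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes a balanced design")

    grand = mean(y for g in groups for y in g)
    group_means = [mean(g) for g in groups]

    ms_between = n * sum((m - grand) ** 2 for m in group_means) / (k - 1)
    ms_within = sum(
        (y - m) ** 2 for g, m in zip(groups, group_means) for y in g
    ) / (k * (n - 1))

    sigma2_e = ms_within
    # E(MS_between) = sigma2_e + n * sigma2_a, so solve for sigma2_a;
    # truncate at zero because a variance cannot be negative.
    sigma2_a = max(0.0, (ms_between - ms_within) / n)
    return sigma2_a, sigma2_e


# Hypothetical data: three "clinics" with two patients each.
clinics = [[1, 3], [4, 6], [7, 9]]
sigma2_a, sigma2_e = variance_components(clinics)
print(sigma2_a, sigma2_e)
```

With a fixed factor the same data would instead be summarized by the three clinic means and their pairwise differences.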

Fixed Versus Random • Fixed: levels of a factor chosen because they are of specific interest; interested in the means across the chosen levels of the factor • Random: levels of a factor selected from a probability distribution; interested in the variance of means across the levels of the factor

Determining Fixed and Random Effects • YOU determine what effects you have! • As the researcher you select your levels of interest: • Are the specific levels of interest? • Are you interested in comparing group means? • Did you sample the levels from a larger population? • YOU determine the population!

Fixed or Random Effect? Is it reasonable to assume that the levels of the factor come from a probability distribution? • No: Fixed Factor • Yes: Random Factor

Fixed or Random Effect? Do you care about comparing the specific factor level means? • Yes: Fixed Factor • No: Random Factor

Why does it Matter, or Does it? • Assumptions about random effects differ from those for fixed effects • Error terms are different depending on fixed versus random effects • Random effects have additional error terms beyond $\sigma^2$ • The effects $a_i$ are random variables with mean 0 and variance $\sigma_a^2$ • Important for inferential statistics

Fixed and Random Effects Error Terms • Fixed effects error = $MS_{error}$ = $MS_{within}$, which estimates $\sigma^2$ • Random effects error: depends on the nature of the random and fixed effects
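As a worked illustration of why the error term depends on which effects are random, consider a two-factor design with A fixed ($a$ levels), B random ($b$ levels), and $n$ observations per cell. The expected mean squares below are the standard textbook results for this balanced case; the symbols $a$, $b$, $n$, and $\sigma_{ab}^2$ are notation assumed here for the illustration.

```latex
\begin{aligned}
E(MS_A)        &= \sigma^2 + n\,\sigma_{ab}^2 + \frac{n b \sum_{i=1}^{a} \alpha_i^2}{a - 1} \\
E(MS_{AB})     &= \sigma^2 + n\,\sigma_{ab}^2 \\
E(MS_{within}) &= \sigma^2 \\[4pt]
F_A            &= \frac{MS_A}{MS_{AB}}, \qquad F_{AB} = \frac{MS_{AB}}{MS_{within}}
\end{aligned}
```

Under the null hypothesis that all $\alpha_i = 0$, $E(MS_A)$ and $E(MS_{AB})$ coincide, which is why the fixed effect A is tested against $MS_{AB}$ and not against $MS_{within}$.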

Two Factor Model
In every model below, $k$ indexes observations within each A-by-B cell, Greek letters denote fixed effects, and Roman letters denote random effects. In all cases the highest-level interaction term is tested against $MS_{within}$ ($\sigma_e^2$).
• A fixed, B fixed: $Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + e_{ijk}$
  One error term: $MS_{within}$ ($\sigma_e^2$). All effects are tested against $MS_{within}$.
• A fixed, B random: $Y_{ijk} = \mu + \alpha_i + b_j + (\alpha b)_{ij} + e_{ijk}$
  Two error terms: 1) $MS_{within}$ ($\sigma_e^2$) and 2) $MS_{ab}$ ($\sigma_{ab}^2$). Fixed effect A is tested against $MS_{ab}$; random effect B is tested against $MS_{within}$.
• A random, B fixed: $Y_{ijk} = \mu + a_i + \beta_j + (a\beta)_{ij} + e_{ijk}$
  Two error terms: 1) $MS_{within}$ ($\sigma_e^2$) and 2) $MS_{ab}$ ($\sigma_{ab}^2$). Random effect A is tested against $MS_{within}$; fixed effect B is tested against $MS_{ab}$.
• A random, B random: $Y_{ijk} = \mu + a_i + b_j + (ab)_{ij} + e_{ijk}$
  Two error terms: 1) $MS_{within}$ ($\sigma_e^2$) and 2) $MS_{ab}$ ($\sigma_{ab}^2$). Both main effects are tested against $MS_{ab}$; only the interaction is tested against $MS_{within}$.
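The choice of error term can be written as a small rule: a main effect is tested against the interaction mean square whenever the other factor is random, and the interaction is always tested against the within-cell mean square. The Python sketch below encodes that rule for a balanced two-factor design. The function name `f_ratios` and the mean-square values are hypothetical, and the rule follows the restricted mixed-model convention; for the case where both factors are random it uses the standard expected-mean-square result that both main effects go against the interaction.

```python
def f_ratios(ms_a, ms_b, ms_ab, ms_within, a_random=False, b_random=False):
    """F ratios for a balanced two-factor design with replication.

    A main effect uses MS_AB as its error term when the OTHER factor is
    random; otherwise it uses MS_within. The A-by-B interaction always
    uses MS_within.
    """
    denom_a = ms_ab if b_random else ms_within
    denom_b = ms_ab if a_random else ms_within
    return {
        "A": ms_a / denom_a,
        "B": ms_b / denom_b,
        "AB": ms_ab / ms_within,
    }


# Hypothetical mean squares: A fixed, B random.
mixed = f_ratios(40.0, 30.0, 10.0, 2.0, a_random=False, b_random=True)
# Same mean squares, both factors fixed.
fixed = f_ratios(40.0, 30.0, 10.0, 2.0)
print(mixed)
print(fixed)
```

The same mean squares give a much smaller F for factor A once B is treated as random, which is the practical reason the fixed/random decision matters for inference.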

Fixed and Random Effects and Effect Size • Effect size estimates assume effects are fixed or random • Fixed • $\eta^2$ • $\omega^2$ • Random • $\rho$ (intraclass correlation)
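The sketch below computes the usual one-way effect size measures from the same sums of squares, assuming the fixed-effect measures intended here are eta-squared and omega-squared and the random-effect measure is the intraclass correlation. The function name `one_way_effect_sizes` and the toy data are hypothetical, and the formulas are the standard balanced one-way versions.

```python
from statistics import mean


def one_way_effect_sizes(groups):
    """Effect sizes for a balanced one-way design.

    Returns eta-squared and omega-squared (fixed-effects view) and the
    intraclass correlation (random-effects view).
    """
    k = len(groups)
    n = len(groups[0])
    grand = mean(y for g in groups for y in g)
    group_means = [mean(g) for g in groups]

    ss_between = n * sum((m - grand) ** 2 for m in group_means)
    ss_within = sum(
        (y - m) ** 2 for g, m in zip(groups, group_means) for y in g
    )
    ss_total = ss_between + ss_within
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (k * (n - 1))

    return {
        "eta2": ss_between / ss_total,
        "omega2": (ss_between - (k - 1) * ms_within) / (ss_total + ms_within),
        "icc": (ms_between - ms_within) / (ms_between + (n - 1) * ms_within),
    }


# Hypothetical data: three levels, two observations each.
sizes = one_way_effect_sizes([[1, 3], [4, 6], [7, 9]])
print(sizes)
```

The three numbers differ for the same data, so a reported effect size is only interpretable once it is clear whether the factor was treated as fixed or random.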

Why does it Matter, or Does it? • Assumptions about random effects differ from those for fixed effects • Error terms are different depending on fixed versus random effects • Random effects have additional error terms beyond $\sigma^2$ • The effects $a_i$ are random variables with mean 0 and variance $\sigma_a^2$ • Important for inferential statistics • Interpretation differs • Fixed effects interpretations are constrained to the levels of the factor(s) in the study • Random effects interpretations generalize more broadly to the population of interest

Mixed Models • Contains both fixed and random effects • Randomized blocks designs • Nested/ Hierarchical designs • Split-plot designs • Clustered designs • Repeated measures • Analysis comparable • One way ANOVA • Two way ANOVA • ANCOVA

Linear Mixed Model (LMM) • Handles data where observations are not independent • LMM correctly models correlated errors, whereas procedures in the general linear model family (GLM) usually do not • The nature (structure) of the correlation must be correctly modeled for the tests of mean differences to be unbiased • LMM is a further generalization of GLM to better support analysis of a continuous dependent variable with: • Random effects • Hierarchical effects • Repeated measures

Mixed Models Randomized Blocks Example
• Testing four drugs, with subjects grouped into blocks of four carefully matched on demographic variables. Each person in a block gets one of the four drugs.
• Fixed effect of treatment
• Random effect of blocks
• Model Equation: $Y_{ij} = \mu + \alpha_i + b_j + e_{ij}$
• $\mu$ and $\alpha_i$ represent unknown fixed parameters: the intercept and the four drug treatment effects
• $b_j$ and $e_{ij}$ are random variables representing blocks and error
• $b_j$ is assumed to have variance $\sigma_b^2$
• The error $e_{ij}$ is assumed to have variance $\sigma^2$
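A randomized blocks analysis can be sketched by hand for a tiny balanced table with one observation per treatment-by-block cell. The Python below is a hedged illustration only: the function name `randomized_blocks_anova` and the 3-treatment by 2-block toy table are hypothetical (the slide's example has four drugs), and the block variance uses the ANOVA method-of-moments estimator rather than REML.

```python
from statistics import mean


def randomized_blocks_anova(table):
    """table[i][j] is the response for treatment i in block j."""
    t = len(table)        # number of treatments (fixed)
    b = len(table[0])     # number of blocks (random)
    grand = mean(y for row in table for y in row)
    trt_means = [mean(row) for row in table]
    blk_means = [mean(row[j] for row in table) for j in range(b)]

    ss_trt = b * sum((m - grand) ** 2 for m in trt_means)
    ss_blk = t * sum((m - grand) ** 2 for m in blk_means)
    # Residual after removing additive treatment and block effects.
    ss_err = sum(
        (table[i][j] - trt_means[i] - blk_means[j] + grand) ** 2
        for i in range(t)
        for j in range(b)
    )

    ms_trt = ss_trt / (t - 1)
    ms_blk = ss_blk / (b - 1)
    ms_err = ss_err / ((t - 1) * (b - 1))

    return {
        "F_treatment": ms_trt / ms_err,
        # E(MS_block) = sigma2 + t * sigma2_b, truncated at zero.
        "sigma2_block": max(0.0, (ms_blk - ms_err) / t),
        "sigma2_error": ms_err,
    }


# Hypothetical responses: rows are treatments, columns are blocks.
result = randomized_blocks_anova([[1, 3], [2, 6], [6, 6]])
print(result)
```

Removing the block sum of squares from the error term is what the random block effect buys: the treatment comparison is made within matched groups.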

Mixed Models Hierarchical Example
• Using four dosage levels of a drug in 20 clinics. In each clinic each patient was randomly assigned to one of the 4 dose levels.
• Model Equation: $Y_{ijk} = \mu + \alpha_i + b_j + c_{ij} + e_{ijk}$
• where $\alpha_i$, $b_j$, and $c_{ij}$ are the effects due to drug dose $i$, clinic $j$, and the clinic-by-dose interaction
• If you assume that the 20 clinics are not sampled, this experiment may now have only fixed effects: $Y_{ijk} = \mu + \alpha_i + \beta_j + c_{ij} + e_{ijk}$

Mixed Models Repeated Measures and Split-Plot
• Three drug treatments randomly assigned to subjects, with subjects observed at 1, 2, …, 7, and 8 hours post-treatment
• Model Equation: $Y_{ijk} = \mu + \alpha_i + s(\alpha)_{ij} + \tau_k + (\alpha\tau)_{ik} + e_{ijk}$
• where $\alpha$ represents treatment effects
• $\tau$ represents time (or hour) effects
• $s(\alpha)$ represents the random subject-within-treatment effects

PROC MIXED • Specifically designed to fit mixed effect models. • It can model: • Random and mixed effect data • Repeated measures • Spatial data • Data with heterogeneous variances and autocorrelated observations • The MIXED procedure is more general than GLM in the sense that it gives a user more flexibility in specifying the correlation structures, particularly useful in repeated measures and random effect models • GLM uses OLS estimation • Mixed uses ML, REML, or MIVQUE0 estimation

PROC MIXED v. GLM • The PROC MIXED syntax is similar to the syntax of PROC GLM. • The random effects and repeated statements are used differently • Random effects are not listed in the model statement for MIXED • GLM has MEANS and LSMEANS statements • MIXED has only the LSMEANS statement

General SAS Mixed Model Syntax
    PROC MIXED options;
      CLASS variable-list;
      MODEL dependent = fixed-effects / options;
      RANDOM random-effects / options;
      REPEATED repeated-effects / options;
      CONTRAST 'label' fixed-effect values | random-effect values / options;
      ESTIMATE 'label' fixed-effect values | random-effect values / options;
      LSMEANS fixed-effects / options;
      MAKE 'table' OUT=SAS-data-set <options>;
    RUN;

PROC MIXED statement PROC MIXED options; PROC MIXED noclprint covtest ; • The NOCLPRINT option prevents the printing of the CLASS level information • The first time you run the program you probably don’t want to include noclprint • When there are lots of group units, use NOCLPRINT to suppress the printing of group names. • The COVTEST option tells SAS that you would like hypothesis tests for the variance and covariance components.

CLASS statement CLASS variable-list; CLASS IDpatient IDclinic; • The CLASS statement indicates that IDpatient and IDclinic are classification variables whose values do not contain quantitative information • The variables that we want SAS to treat as categorical variables go here. • Character variables (e.g., city names) used in the model must be on this line (it won’t run otherwise).

MODEL statement MODEL dependent = fixed effects/options; MODEL dosage = /solution; • If you have no fixed effects you would have no independent variables listed in the model statement • /solution option asks SAS to print the estimates for the fixed effects • Intercept is included as a default in SAS • If you would like to fit a model without the intercept you add the /NOINT option to the model statement

Random statement RANDOM intercept /sub=IDclinic; • By default there is always at least one random effect, usually the lowest-level residual • You can specify the intercept on the RANDOM statement: This indicates the presence of a second random effect and the intercept should be treated not only as a fixed effect but also as a random effect • The SUB option on the RANDOM statement specifies the multilevel structure, indicating how the individuals (level-1 units) are divided into higher level groups (level-2 units)

General SPSS Mixed Model Syntax
    mixed dependent with independent
      /print=solution
      /fixed=fixed effects
      /random intercept random effects | subject(random effect grouping) covtype(options)
      /repeated=repeated effect | subject(repeated effect grouping) covtype(options).

Recap Main Points Slide • Fixed versus random effects • Error terms • Effect size • Interpretation/inference • Sampling • Independence of observations • SAS and SPSS syntax for random and mixed models

For your Reading Pleasure • Schabenberger, O. (2006). SAS system for mixed models (Second Edition). Cary, NC: SAS Institute. • McCullagh, P., & Nelder, J. A. (1989). Generalized linear models (Second Edition). New York: Chapman and Hall. • McCulloch, C., & Searle, S. (2008). Generalized, linear, and mixed models (Second Edition). New York: Wiley. • Verbeke, G. E., & Molenberghs, G. (1997). Linear mixed models in practice: A SAS-oriented approach. New York: Springer. • Fahrmeir, L., & Tutz, G. (1994). Multivariate statistical modeling based on generalized linear models. Heidelberg: Springer-Verlag. • Lindsey, J. (1993). Models for repeated measurements. Oxford: Clarendon Press. • Singer website: http://gseweb.harvard.edu/~faculty/singer/

Data Example with PROC MIXED • High School and Beyond data example (Bryk & Raudenbush, 1992) • 7,185 students in 160 schools • MATHACH: Student-level (level-1) outcome is math achievement • SES: Student-level (level-1) covariate is socio-economic status • MEANSES: School-level (level-2) covariate of mean SES for the school • SECTOR: School-level (level-2) covariate of school type (dummy coded, public=0 and Catholic=1) • Source: Singer (1998)

Random Effects Model
• Unconditional means model examining the variation of MATHACH across schools
• One-way random effects ANOVA model
• Model Equation: $Y_{ij} = \mu + \alpha_j + r_{ij}$
• SAS syntax:
    proc mixed;
      class school;
      model mathach = ;    /* no fixed predictors: intercept only */
      random school;       /* school is the random factor */
    run;

MIXED Model Two-Level Approach
• Level 1: a student's outcome ($Y_{ij}$) is expressed as the sum of an intercept for the student's school ($\beta_{0j}$) and a random error term ($r_{ij}$) associated with the $i$th student in the $j$th school: $Y_{ij} = \beta_{0j} + r_{ij}$
• Level 2: we express the school-level intercept as the sum of an overall mean ($\gamma_{00}$) and a series of random deviations from that mean ($u_{0j}$): $\beta_{0j} = \gamma_{00} + u_{0j}$
• This leads to a model equation of $Y_{ij} = \gamma_{00} + u_{0j} + r_{ij}$
• SAS syntax:
    proc mixed noclprint covtest;
      class school;
      model mathach = / solution;
      random intercept / sub=school;
    run;

Mixed Model Two-Level Approach Output • Variance component between schools: $\tau_{00}$ • Variance component within schools: $\sigma^2$

SPSS for Random Effects Model
• SAS syntax:
    proc mixed noclprint covtest;
      class school;
      model mathach = / solution;
      random intercept / sub=school;
    run;
• SPSS syntax:
    mixed mathach
      /print=solution
      /random intercept | subject(school).

Including Level-2 Predictors
• School-level (level-2) predictor of MEANSES
• Level 1: a student's outcome ($Y_{ij}$) is expressed as the sum of an intercept for the student's school ($\beta_{0j}$) and a random error term ($r_{ij}$) associated with the $i$th student in the $j$th school: $Y_{ij} = \beta_{0j} + r_{ij}$
• Level 2: we express the school-level intercept as the sum of an overall mean ($\gamma_{00}$), the effect of MEANSES, and a series of random deviations ($u_{0j}$): $\beta_{0j} = \gamma_{00} + \gamma_{01}\,\text{MEANSES}_j + u_{0j}$
• This leads to a model equation of $Y_{ij} = [\gamma_{00} + \gamma_{01}\,\text{MEANSES}_j] + [u_{0j} + r_{ij}]$, where the first bracket holds the fixed effects and the second the random effects
• SAS syntax:
    proc mixed noclprint covtest;
      class school;
      /* ddfm=bw: compute df for fixed effects with the "between/within" method */
      model mathach = meanses / solution ddfm=bw;
      random intercept / sub=school;
    run;

SAS Output • Random effect: $\tau_{00}$ • Random effect: $\sigma^2$ • Fixed effect: $\gamma_{00}$ • Fixed effect: $\gamma_{01}$

SPSS for Including Level-2 Predictor
• SAS syntax:
    proc mixed noclprint covtest;
      class school;
      model mathach = meanses / solution ddfm=bw;
      random intercept / sub=school;
    run;
• SPSS syntax:
    mixed mathach with meanses
      /print=solution
      /fixed=meanses
      /random intercept | subject(school).

Including Level-1 Predictors
• Student-level (level-1) predictor of SES
• Level 1: a student's outcome ($Y_{ij}$) is expressed as a function of an intercept for the student's school ($\beta_{0j}$), the student's CSES (SES centered at the school mean, MEANSES), and a random error term ($r_{ij}$) associated with the $i$th student in the $j$th school: $Y_{ij} = \beta_{0j} + \beta_{1j}\,\text{CSES}_{ij} + r_{ij}$
• Level 2: we express the school-level intercept and the school-level CSES slope each as the sum of an overall mean ($\gamma_{00}$, $\gamma_{10}$) and a series of random deviations from that mean ($u_{0j}$, $u_{1j}$): $\beta_{0j} = \gamma_{00} + u_{0j}$ and $\beta_{1j} = \gamma_{10} + u_{1j}$
• This leads to a model equation of $Y_{ij} = [\gamma_{00} + \gamma_{10}\,\text{CSES}_{ij}] + [u_{0j} + u_{1j}\,\text{CSES}_{ij} + r_{ij}]$, where the first bracket holds the fixed effects and the second the random effects
• SAS syntax:
    /* noitprint: don't print the iteration history page */
    proc mixed noclprint covtest noitprint;
      class school;
      model mathach = cses / solution ddfm=bw notest;
      /* type=un: unstructured covariance for the random intercept and slope */
      random intercept cses / sub=school type=un;
    run;
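CSES is each student's SES minus the mean SES of that student's school, so it has to be computed before the model is fit. A minimal Python sketch of that group-mean centering step follows. The function name `add_group_centered`, the dictionary keys, and the four toy records are hypothetical; the variable names `meanses` and `cses` mirror the ones used in the syntax.

```python
from collections import defaultdict
from statistics import mean


def add_group_centered(rows, group_key="school", value_key="ses"):
    """Return copies of rows with the group mean and the centered value added."""
    by_group = defaultdict(list)
    for row in rows:
        by_group[row[group_key]].append(row[value_key])
    group_means = {g: mean(values) for g, values in by_group.items()}

    centered = []
    for row in rows:
        m = group_means[row[group_key]]
        centered.append({**row, "meanses": m, "cses": row[value_key] - m})
    return centered


# Hypothetical student records from two schools.
students = [
    {"school": "A", "ses": 1.0},
    {"school": "A", "ses": 3.0},
    {"school": "B", "ses": 4.0},
    {"school": "B", "ses": 8.0},
]
centered = add_group_centered(students)
print([r["cses"] for r in centered])
```

Because CSES averages to zero within every school, the intercept for each school is that school's expected outcome for a student at the school's own mean SES.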

Including Level-1 Predictors in SPSS
• Model Equation: $Y_{ij} = [\gamma_{00} + \gamma_{10}\,\text{CSES}_{ij}] + [u_{0j} + u_{1j}\,\text{CSES}_{ij} + r_{ij}]$
• SAS syntax:
    proc mixed noclprint covtest noitprint;
      class school;
      model mathach = cses / solution ddfm=bw notest;
      random intercept cses / sub=school type=un;
    run;
• SPSS syntax:
    mixed mathach with cses
      /print=solution
      /fixed=cses
      /random intercept cses | subject(school) covtype(un).

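Including Level-1 AND Level-2 Predictors
• Model with the effect of student SES (CSES) and school SES (MEANSES), with a second level-2 factor of SECTOR
• Level 1: $Y_{ij} = \beta_{0j} + \beta_{1j}\,\text{CSES}_{ij} + r_{ij}$
• Level 2 (intercept): $\beta_{0j} = \gamma_{00} + \gamma_{01}\,\text{MEANSES}_j + \gamma_{02}\,\text{SECTOR}_j + u_{0j}$
• Level 2 (slope): $\beta_{1j} = \gamma_{10} + \gamma_{11}\,\text{MEANSES}_j + \gamma_{12}\,\text{SECTOR}_j + u_{1j}$
• This leads to a model equation of $Y_{ij} = \gamma_{00} + \gamma_{01}\,\text{MEANSES}_j + \gamma_{02}\,\text{SECTOR}_j + \gamma_{10}\,\text{CSES}_{ij} + \gamma_{11}\,\text{MEANSES}_j\,\text{CSES}_{ij} + \gamma_{12}\,\text{SECTOR}_j\,\text{CSES}_{ij} + u_{0j} + u_{1j}\,\text{CSES}_{ij} + r_{ij}$
• SAS syntax:
    proc mixed noclprint covtest noitprint;
      class school;
      model mathach = meanses sector cses meanses*cses sector*cses / solution ddfm=bw notest;
      random intercept cses / type=un sub=school;
    run;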
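The fixed part of the combined equation is easier to read as "a school-specific intercept plus a school-specific slope times CSES", where both depend on MEANSES and SECTOR. The Python sketch below evaluates that fixed part for one student. The function name `predicted_mathach` and every coefficient value are hypothetical placeholders chosen for easy arithmetic, not estimates from the High School and Beyond data.

```python
def predicted_mathach(cses, meanses, sector, gamma):
    """Fixed-effects prediction from the combined two-level model.

    gamma maps "g00".."g12" to the fixed coefficients; the random terms
    (school deviations and student residual) are left out, so this is the
    expected outcome for an average school with these characteristics.
    """
    intercept = gamma["g00"] + gamma["g01"] * meanses + gamma["g02"] * sector
    slope = gamma["g10"] + gamma["g11"] * meanses + gamma["g12"] * sector
    return intercept + slope * cses


# Hypothetical coefficients, for illustration only.
gamma = {"g00": 12.0, "g01": 5.0, "g02": 1.0,
         "g10": 2.0, "g11": 1.0, "g12": -1.0}

# A student 2 SES units above the school mean, in a Catholic school
# (sector = 1) whose mean SES is 0.5.
prediction = predicted_mathach(cses=2.0, meanses=0.5, sector=1, gamma=gamma)
print(prediction)
```

The cross-level interaction terms (`meanses*cses` and `sector*cses` in the MODEL statement) are exactly the pieces that let the CSES slope differ between schools with different MEANSES and SECTOR.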
