A Practical Guide to Multiplicative Interaction Variables in Policy Research
Garry Young
GW Institute of Public Policy
January 25, 2006
Why Interactive Variables?
Answer: Sometimes the impact of a given independent variable depends on (is conditional on) the level of another independent variable.
Example: The impact of education on income may depend on gender or race.
What I’ll Cover
• Interactive effects using OLS
• Case of a single discrete (dummy) variable interacted with a continuous variable
• Case of two continuous variables interacted
• Centering, significance tests, interpretation
• Very brief discussion of extensions
Misc.
• Data, Stata do-file, and this PowerPoint are available at www.gwu.edu/~gwipp.
• The data are fake and include the following variables:
  • Race (Klingon and Earthling)
  • Income (100 – 280 credits)
  • Education (4 – 16 years)
  • Age (25 – 60)
Interactive Model
• Does education affect income differently by race?
• Find out by multiplying each observation's Education value by its value on the race dummy:
  Education_i × Earthling_i
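As a rough sketch of how the product term might be built in Stata: the guide's actual dataset is not reproduced here, so the block simulates stand-in data first. The simulation coefficients, the seed, and the variable name educ_x_earth are assumptions made only for illustration.

```stata
* Hypothetical stand-in data (NOT the guide's dataset); all values are made up
clear
set seed 12345
set obs 100
* earthling: 1 = Earthling, 0 = Klingon
generate earthling = runiform() < 0.5
* education: whole years from 4 to 16
generate education = 4 + floor(13 * runiform())
generate income = 100 + 8*education + 20*earthling + 5*education*earthling + rnormal(0, 15)

* Product term: Education_i x Earthling_i (the name educ_x_earth is an assumption)
generate educ_x_earth = education * earthling

* Both constituent variables stay in the model alongside the product term
regress income education earthling educ_x_earth
```

With the real data, only the last two commands would be needed after loading the file.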
Common Mistakes
• Omitting variables that are part of the interaction
  • All variables that are part of the interaction stay in the equation
  • e.g., don’t drop the Education and Earthling variables while leaving in Education * Earthling
• Not performing an F-test
  • Need to know whether the interaction contributes to the model
• Failure to understand the conditional nature of coefficients
  • The Education coefficient is conditional on Earthling = 0; the Earthling coefficient is conditional on Education = 0
• Failure to test whether the conditional slopes are statistically different from zero
Evaluating the Overall Model
• Interactive terms reduce parsimony and make interpretation harder.
• Don’t include an interactive term unless it adds explanatory power.
• For OLS, perform an F-test.
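F-Test Formula
The F-test formula is
\[ F = \frac{(R_2^2 - R_1^2)/(k_2 - k_1)}{(1 - R_2^2)/(N - k_2 - 1)} \]
where k denotes the number of variables, N denotes the number of observations, subscript 1 refers to the original model, and subscript 2 refers to the expanded model.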
F-Test
\[ F = \frac{(.74 - .70)/(3 - 2)}{(1 - .74)/(100 - 3 - 1)} = 14.8 \]
Critical value for F(1, 96) at the .05 level ≈ 3.94
14.8 > 3.94, so the interactive term adds significantly to the model
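The hand calculation above can be checked in Stata with a few scalars. This is a minimal sketch: the scalar names are arbitrary, and the .05 significance level is an assumption, since the slide does not state one.

```stata
* Values taken from the worked example: R-squared .70 (original), .74 (expanded)
scalar r2_orig = .70
scalar r2_expd = .74
scalar k_orig  = 2
scalar k_expd  = 3
scalar n_obs   = 100

* Numerator and denominator degrees of freedom
scalar df_num = k_expd - k_orig
scalar df_den = n_obs - k_expd - 1

* F statistic for the gain in R-squared
scalar f_stat = ((r2_expd - r2_orig) / df_num) / ((1 - r2_expd) / df_den)

display "F statistic:          " %6.2f f_stat
display "Critical value (.05): " %6.2f invFtail(df_num, df_den, .05)
display "p-value:              " %6.4f Ftail(df_num, df_den, f_stat)
```

When the two regressions are available in memory, running `test` on the product term after the expanded model should give the same statistic, because only one variable is added.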
Mean Centering Def.: Subtracting the mean from each observation of the independent variable of interest so that the new mean is equal to zero.
Why mean center?
• Makes coefficients easier to interpret
• Some argue it reduces multicollinearity (Cronbach 1987)
• Otherwise doesn’t affect the substance of the results, e.g., R² is unaffected.
Centering Education Using Stata:
    summarize education, meanonly
    gen educmean = r(mean)
    gen educ_ct = education - educmean
    * show two variables in comparison
    summarize education educ_ct

        Variable |       Obs        Mean    Std. Dev.       Min        Max
    -------------+--------------------------------------------------------
       education |       100       11.11     3.06131          4         16
         educ_ct |       100    3.43e-07     3.06131      -7.11       4.89
The Impact of Education
• Impact of Education conditional on Klingon (Earthling = 0): simply take the value of the Education coefficient: 10.31
• Impact of Education conditional on Earthling (Earthling = 1): Education coef. + Interactive coef. = 10.31 + 6.85 = 17.16
The Impact of Race
• Earthling income conditional on average education: 166.43 + 53.22(1) = 219.65
• Klingon income conditional on average education: 166.43 + 53.22(0) = 166.43
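The conditional effects above can be read off one fitted equation. The sketch below assumes that the four reported values (166.43, 10.31, 53.22, 6.85) come from the same regression with mean-centered Education, written here as Educ^c; that notation is introduced only for this illustration.

```latex
\[
\begin{aligned}
\widehat{\text{Income}}_i &= 166.43 + 10.31\,\text{Educ}^{c}_i + 53.22\,\text{Earthling}_i
  + 6.85\,(\text{Educ}^{c}_i \times \text{Earthling}_i) \\[4pt]
\frac{\partial\,\widehat{\text{Income}}}{\partial\,\text{Educ}^{c}} &= 10.31 + 6.85\,\text{Earthling}_i
  = \begin{cases} 10.31 & \text{Klingon } (\text{Earthling}_i = 0) \\
                  17.16 & \text{Earthling } (\text{Earthling}_i = 1) \end{cases} \\[4pt]
\text{Earthling} - \text{Klingon gap} &= 53.22 + 6.85\,\text{Educ}^{c}_i
  = 53.22 \quad \text{at average education } (\text{Educ}^{c}_i = 0)
\end{aligned}
\]
```

The last line shows why centering helps: the Earthling coefficient is the income gap at average education rather than at zero years of education, a value outside the observed 4 to 16 range.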
Slope Significance
• Is each individual slope statistically distinct from zero?
• Two ways to calculate:
  • Formula using the variance-covariance matrix. See Friedrich (1982: 810).
  • A simple trick: rescoring and recomputing
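For reference, the variance-covariance approach rests on the standard variance of a linear combination of coefficients. The notation below is an assumption made for this sketch (b_1 is the Education coefficient, b_3 the interaction coefficient, Z the value of the conditioning variable), and it is not copied from Friedrich's page.

```latex
\[
\begin{aligned}
\text{conditional slope} &= b_1 + b_3 Z \\
\operatorname{Var}(b_1 + b_3 Z) &= \operatorname{Var}(b_1) + Z^{2}\operatorname{Var}(b_3)
  + 2Z\operatorname{Cov}(b_1, b_3) \\
t &= \frac{b_1 + b_3 Z}{\sqrt{\operatorname{Var}(b_1 + b_3 Z)}}
\end{aligned}
\]
```

With a dummy conditioning variable, Z = 0 reduces this to the usual t-test on b_1, while Z = 1 requires the covariance term, which ordinary regression output does not display.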
Recall that the slope for Education is conditional on Klingon. The slope is statistically significant.
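How to Rescore and Recompute
• Create a new variable where Klingon = 1:
    gen klingon = 0
    replace klingon = 1 if earthling == 0
• Create a new product interactive variable using the new race variable: Education X Klingon
• Re-run the regression. The Education coefficient is now the slope for Earthlings (Klingon = 0), so its reported significance test applies to the Earthling slope.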
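A sketch of the full rescoring sequence, run on simulated stand-in data rather than the guide's dataset (the simulation values, seed, and product-term names are assumptions). It also runs `lincom`, which computes the same conditional slope from the variance-covariance matrix, so the two approaches can be compared.

```stata
* Hypothetical stand-in data (NOT the guide's dataset); all values are made up
clear
set seed 12345
set obs 100
generate earthling = runiform() < 0.5
generate education = 4 + floor(13 * runiform())
generate income = 100 + 8*education + 20*earthling + 5*education*earthling + rnormal(0, 15)

* Mean-center Education
summarize education, meanonly
generate educ_ct = education - r(mean)

* Original coding: the educ_ct coefficient is the Klingon slope (earthling == 0)
generate educ_x_earth = educ_ct * earthling
regress income educ_ct earthling educ_x_earth

* Earthling slope and its standard error from the variance-covariance matrix
lincom educ_ct + educ_x_earth

* Rescored coding: the educ_ct coefficient is now the Earthling slope (klingon == 0)
generate klingon = 0
replace klingon = 1 if earthling == 0
generate educ_x_kling = educ_ct * klingon
regress income educ_ct klingon educ_x_kling
```

The educ_ct row of the second regression should match the `lincom` estimate and standard error, since rescoring changes the reference group but not the underlying model.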
Interacting 2 Continuous Variables
• Say we think Age and Education interact
• Steps:
  • Center both variables
  • Create new product term (Age * Education)
  • Run regressions
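The three steps might look like this in Stata. The data are simulated stand-ins, not the guide's dataset, and the simulation coefficients, seed, and variable names (age_ct, educ_ct, age_x_educ) are assumptions for illustration only.

```stata
* Hypothetical stand-in data (NOT the guide's dataset); all values are made up
clear
set seed 12345
set obs 100
generate education = 4 + floor(13 * runiform())
generate age = 25 + floor(36 * runiform())
generate income = 60 + 6*education + 1*age + 0.1*education*age + rnormal(0, 15)

* Step 1: center both variables
summarize education, meanonly
generate educ_ct = education - r(mean)
summarize age, meanonly
generate age_ct = age - r(mean)

* Step 2: create the product term from the centered variables
generate age_x_educ = age_ct * educ_ct

* Step 3: run the original and expanded regressions
regress income age_ct educ_ct
regress income age_ct educ_ct age_x_educ

* F-test on the added product term
test age_x_educ
```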
Analysis
• Carry out F-test as before
• Interpretation of Age: impact of Age on Income conditional on Education = 0
  • Note that Education is mean centered, so the Age coefficient is conditional on Education = mean
• Interpretation of Education: impact of Education on Income conditional on Age = 0
  • Note that Age is mean centered, so the Education coefficient is conditional on Age = mean
Analysis (cont)
• Other diagnostics are the same as before
• Can use rescoring to center variables at values of interest, e.g., what is the impact of Age on Income conditional on Education being set to high, medium, and low values? See, e.g., Young and Perkins (2005: 1197-1198).
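One way to rescore at chosen values is sketched below. The slide does not define "high," "medium," and "low," so this example assumes the mean plus one standard deviation, the mean, and the mean minus one standard deviation. The data are simulated stand-ins and every variable and scalar name is hypothetical.

```stata
* Hypothetical stand-in data (NOT the guide's dataset); all values are made up
clear
set seed 12345
set obs 100
generate education = 4 + floor(13 * runiform())
generate age = 25 + floor(36 * runiform())
generate income = 60 + 6*education + 1*age + 0.1*education*age + rnormal(0, 15)

summarize age, meanonly
generate age_ct = age - r(mean)

* Assumed cut points: mean - 1 SD (low), mean (medium), mean + 1 SD (high)
summarize education
scalar ed_avg    = r(mean)
scalar ed_spread = r(sd)
generate educ_lo = education - (scalar(ed_avg) - scalar(ed_spread))
generate educ_md = education - scalar(ed_avg)
generate educ_hi = education - (scalar(ed_avg) + scalar(ed_spread))

* Re-run the model once per rescored Education variable.
* In each run, the age_ct coefficient is the Age slope at that Education level.
foreach v in lo md hi {
    generate age_x_educ_`v' = age_ct * educ_`v'
    regress income age_ct educ_`v' age_x_educ_`v'
}
```

The interaction coefficient and R² should be identical across the three runs. Only the Age coefficient, its standard error, and the intercept change with the centering point.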
Extensions
• Standardized variables: see Jaccard and Turrisi (2003)
• Multiple dummy variables, e.g., the case where Romulans, Earthlings, and Klingons yield two dummy variables. See, e.g., Young and Perkins (2006)
• Three-way interactions, e.g., Age * Education * Tenure. See Friedrich (1982) or Jaccard and Turrisi (2003)
• Other complex interactions, e.g., quadratic: see Friedrich (1982) or Jaccard and Turrisi (2003)
• Non-linear functional forms (Jaccard 2001)
Places for More Info
• Friedrich (1982): Still considered the gold standard, at least in political science
• Jaccard, Turrisi, and Wan (1990): Better at providing technical details than its successor
• Jaccard and Turrisi (2003): Good for practical applications
• Braumoeller (2004): Good tips on hypothesis testing
• Brambor, Clark, and Golder (2005): Dos and Don’ts
References
• Brambor, Thomas, William Roberts Clark, and Matt Golder. 2005. “Understanding Interaction Models: Improving Empirical Analyses.” Political Analysis 13: 1-20.
• Braumoeller, Bear. 2004. “Hypothesis Testing and Multiplicative Interaction Terms.” International Organization 58: 807-820.
• Cronbach, L.J. 1987. “Statistical Tests for Moderator Variables.” Psychological Bulletin 87: 51-57.
• Friedrich, Robert. 1982. “In Defense of Multiplicative Terms in Multiple Regression Equations.” American Journal of Political Science 26: 797-833.
• Jaccard, James. 2001. Interaction Effects in Logistic Regression. Thousand Oaks: Sage Publications.
• Jaccard, James, and Robert Turrisi. 2003. Interaction Effects in Multiple Regression. Second Edition. Thousand Oaks: Sage Publications.
• Jaccard, James, Robert Turrisi, and Choi K. Wan. 1990. Interaction Effects in Multiple Regression. Thousand Oaks: Sage Publications.
• Young, Garry, and William Perkins. 2005. “Presidential Rhetoric, the Public Agenda, and the End of Presidential Television’s ‘Golden Age.’” Journal of Politics 67: 1190-1205.