Graphing interactions so that curious people can actually *understand* them
Dan Kahan
Posted on Thursday, October 4, 2012 at 11:03AM
A friend & collaborator asked me,
So...could you send me a quick tip/reference on how to best graph interactions in regression? I'm just thinking of simple line-charts, comparing divergent slopes for two or three different groups after controlling for the other vars in the equation. I'm *sure* this is easily done, but I'm blanking on how. I mean, it's easy enough to draw the slope based on the unstandardized coefficient. And the Y-intercept to start that line from is...what? the B of the constant?
My response:
There are excellent papers that reflect general dissatisfaction w/ how social scientists tend to graphically report (or not) the results of multivariate regression -- but the ones I'm thinking of don't focus on interactions per se. They include:
- Gelman, A., Pasarica, C. & Dodhia, R. Let's Practice What We Preach: Turning Tables into Graphs. Am Stat 56, 121-130 (2002).
- King, G., Tomz, M. & Wittenberg, J. Making the Most of Statistical Analyses: Improving Interpretation and Presentation. Am. J. Pol. Sci. 44, 347-361 (2000).
- Kastellec, J.P. & Leoni, E.L. Using Graphs Instead of Tables in Political Science. Perspectives on Politics 5, 755-771 (2007).
some tips I'd give:
a. *don't* graph data after splitting the sample (e.g., into "high," "medium" & "low" in political sophistication)... Graph the results of the model that includes all the relevant predictors & cross-product interaction terms as applied to the entire sample; those are the results you are trying to display & splitting the sample will change/bias the parameter estimates.
b. consider z-score normalization for the outcome variable: you won't have to worry about the intercept (it should be zero, of course), and you'll avoid lots of meaningless "white space" if the values within +/-1 or +/-2 SDs (the end points for the y-axis) are concentrated within a middling portion of the outcome measure. Also, for most readers, reporting the impact in terms of SDs of the outcome variable will be more intelligible than differences in raw units of some arbitrary scale (the sort you'd get by summing the Likert items to form a composite Likert scale, e.g.)
c. rather than graphing *slopes*, consider plotting regression estimates based on sensibly contrasting values for the predictors (and corresponding values for the cross-product interaction term); the "practical effect" of the interaction is likely to be easier to grasp that way than comparison of visual differences in slopes
d. if you are using OLS to model responses to a Likert item, consider using ordered logit instead -- maybe you should be doing this anyway, but in any case, probabilities of responding at a particular level (or maybe range of levels; say "agree either slightly, moderately or strongly" vs. "disagree slightly, moderately, or strongly") conditional on levels of predictor & moderator are graphically more intelligible than estimated values on an arbitrary continuous scale.
e. consider graphing estimated *differences* (& corresponding CIs) in the outcome variable at different levels of the moderator; e.g., if the difference between subjects who are from different groups (or who vary +/- 1 SD on some continuous predictor) increases conditional on the value of some continuous moderator, then use a bar graph w/ CIs or some such to show how much greater the estimated difference between the two groups is at the two levels of the moderator
f. consider Monte Carlo simulation of the estimated impact of contrasting sets of predictors & moderators (& associated interactions); do kernel-density plots for 1,000 or 2,000 values of each -- it's a *really* good way to show both the contrast in the estimates & the precision of the estimates (much better than standard CIs). See King et al. above
g. usually prefer connected lines to bar graphs to display contrasts; former are more comprehensible
h. in general, don't use standardized regression coefficients but do center continuous predictors (or convert them to z-scores) so that people who are reading the table can more readily interpret them
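To make tips b, c, e & f a bit more concrete, here is a minimal sketch in Python (NumPy only). Everything in it is an assumption for illustration: the data are simulated, and the variable names (`group`, `moderator`, `outcome`) are hypothetical, not taken from any CCP study. It fits one model w/ a cross-product term to the whole sample, computes estimates at contrasting predictor values, and then simulates coefficient draws roughly in the spirit of King et al. It is a simplified version that ignores uncertainty in the residual variance, not a reproduction of their procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: a 0/1 group indicator, a continuous moderator on an
# arbitrary raw scale, and an outcome whose group effect grows w/ the moderator.
n = 500
group = rng.integers(0, 2, size=n).astype(float)
moderator_raw = rng.normal(50.0, 10.0, size=n)
outcome_raw = (2.0 + 0.5 * group + 0.03 * moderator_raw
               + 0.08 * group * (moderator_raw - 50.0)
               + rng.normal(0.0, 1.0, size=n))


def zscore(v):
    """Center a variable and scale it to unit standard deviation."""
    return (v - v.mean()) / v.std(ddof=1)


# Tips b & h: z-score the outcome and the continuous predictor.
moderator = zscore(moderator_raw)
outcome = zscore(outcome_raw)

# Tip a: one model, full sample, w/ the cross-product interaction term.
X = np.column_stack([np.ones(n), group, moderator, group * moderator])
beta, *_ = np.linalg.lstsq(X, outcome, rcond=None)
resid = outcome - X @ beta
sigma2 = resid @ resid / (n - X.shape[1])
cov_beta = sigma2 * np.linalg.inv(X.T @ X)  # classical OLS covariance


def design_row(g, m):
    """Predictor values for one scenario, incl. the matching product term."""
    return np.array([1.0, g, m, g * m])


# Tip c: estimates at sensibly contrasting values (each group at -1 / +1 SD).
scenarios = {(g, m): design_row(g, m) for g in (0.0, 1.0) for m in (-1.0, 1.0)}
point_estimates = {k: float(row @ beta) for k, row in scenarios.items()}

# Tip f: draw 2,000 coefficient vectors from their estimated sampling
# distribution and compute the expected outcome for each scenario.
draws = rng.multivariate_normal(beta, cov_beta, size=2000)
simulated = {k: draws @ row for k, row in scenarios.items()}

# Tip e: the group difference at each level of the moderator, w/ 95% intervals.
diff_low = simulated[(1.0, -1.0)] - simulated[(0.0, -1.0)]
diff_high = simulated[(1.0, 1.0)] - simulated[(0.0, 1.0)]
for label, d in (("moderator -1 SD", diff_low), ("moderator +1 SD", diff_high)):
    lo, hi = np.percentile(d, [2.5, 97.5])
    print(f"group difference at {label}: {d.mean():.2f} SD [{lo:.2f}, {hi:.2f}]")
```

Each array in `simulated` (or `diff_low` & `diff_high`) holds 2,000 simulated values, so it can be handed directly to a kernel-density plot, e.g. w/ `scipy.stats.gaussian_kde` and matplotlib; the plotting itself is left out here to keep the sketch short.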
Have attached [reproduced below] a bunch of CCP study examples that reflect one or another of these strategies or related ones. BTW, of course, all of these reflect things that I learned to do from collaborating w/ Don [Braman], who like all great teachers teaches people how to teach themselves.