Graphing interactions so that curious people can actually *understand* them

A friend & collaborator asked me,

So...could you send me a quick tip/reference on how to best graph interactions in regression? I'm just thinking of simple line-charts, comparing divergent slopes for two or three different groups after controlling for the other vars in the equation. I'm *sure* this is easily done, but I'm blanking on how. I mean, it's easy enough to draw the slope based on the unstandardized coefficient. And the Y-intercept to start that line from is...what? the B of the constant?

My response:

I'm sure you are asking b/c you are unsatisfied, understandably in my view, w/ the graphing recommendations that appear in references like Aiken, L.S., West, S.G. & Reno, R.R. Multiple Regression: Testing and Interpreting Interactions. (Sage Publications, Newbury Park, Calif.; 1991) & Jaccard, J. & Turrisi, R. Interaction Effects in Multiple Regression, 2nd Edn. (Sage Publications, Thousand Oaks, Calif.; 2003) -- even though those are definitely the best references for understanding the statistical logic of interactions & making intelligent modeling choices.

There are excellent papers that reflect general dissatisfaction w/ how social scientists tend to graphically report (or not) the results of multivariate regression models. They include:
  • Gelman, A., Pasarica, C. & Dodhia, R. Let's Practice What We Preach: Turning Tables into Graphs. Am Stat 56, 121-130 (2002).
  • King, G., Tomz, M. & Wittenberg, J. Making the Most of Statistical Analyses: Improving Interpretation and Presentation. Am. J. Pol. Sci 44, 347-361 (2000).
  • Kastellec, J.P. & Leoni, E.L. Using Graphs Instead of Tables in Political Science. Perspectives on Politics 5, 755-771 (2007).
They don't deal w/ interactions per se, but b/c they address the objective of how to make regression model results intelligible in general, you can easily derive from them ideas about strategies that work w/ models that include cross-product interaction terms.
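
On the narrow intercept question, the algebra is worth writing out. This is only a sketch, assuming a model w/ one focal predictor x, one moderator z (a 0/1 group dummy or a continuous variable), their cross-product, & other covariates w_k held at their sample means; the symbols are mine, not taken from the references above:

```latex
\begin{aligned}
\hat{y} &= b_0 + b_1 x + b_2 z + b_3 (x z) + \sum_k c_k \bar{w}_k \\
        &= \underbrace{\Big(b_0 + b_2 z + \sum_k c_k \bar{w}_k\Big)}_{\text{intercept of the line for a given } z}
         + \underbrace{\big(b_1 + b_3 z\big)}_{\text{slope for a given } z}\, x
\end{aligned}
```

So the line for each group starts from the constant plus that group's own shift (plus the covariates' contribution at their means), not from the constant alone. The constant by itself is the intercept only for the case z = 0 w/ every covariate at zero.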

I'll show you some examples below but here are some general tips I'd give: 

a. *don't* graph data after splitting sample (e.g., into "high," "medium" & "low" in political sophistication)... Graph the results of the model that includes all the relevant predictors & cross-product interaction terms as applied to the entire sample; those are the results you are trying to display & splitting sample will change/bias the parameter estimates.

b. consider z-score normalization for the outcome variable: you won't have to worry about the intercept (it should be zero, or very close to it, so long as the predictors are centered too), & you'll avoid lots of meaningless "white space" if the values within +/-1 or +/-2 SDs (the end points for the y-axis) are concentrated within a middling portion of the outcome measure. Also, for most readers, reporting the impact in terms of SDs of the outcome variable will be more intelligible than differences in raw units of some arbitrary scale (the sort you'd get by summing Likert items to form a composite Likert scale, e.g.)

c. rather than graphing *slopes*, consider plotting regression estimates based on sensibly contrasting values for the predictors (and corresponding values for the cross-product interaction term); the "practical effect" of the interaction is likely to be easier to grasp that way than comparison of visual differences in slopes

d. if you are using OLS to model responses to a Likert item, consider using ordered logit instead -- maybe you should be doing this anyway, but in any case, probabilities of responding at a particular level (or maybe a range of levels; say, "agree slightly, moderately, or strongly" vs. "disagree slightly, moderately, or strongly") conditional on levels of predictor & moderator are graphically more intelligible than estimated values on an arbitrary continuous scale.

e. consider graphing estimated *differences* (& corresponding CIs) in the outcome variable at different levels of the moderator; e.g., if the difference between subjects who are from different groups (or who vary +/- 1 SD on some continuous predictor) increases conditional on the value of some continuous moderator, then use a bar graph w/ CIs or some such to show how much greater the estimated difference between the two groups is at the two levels of the moderator

f. consider Monte Carlo simulation of the estimated impact of contrasting sets of predictors & moderators (& associated interactions); do kernel-density plots for 1,000 or 2,000 simulated values of each -- it's a *really* good way to show both the contrast in the estimates & the precision of the estimates (much better than standard CIs). See King et al. above

g. usually prefer connected lines to bar graphs to display contrasts; the former are more comprehensible

h. in general, don't use standardized regression coefficients but do center continuous predictors (or convert them to z-scores) so that people who are reading the table can more readily interpret them
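
Tips a, b, c, e & h can be sketched together in a few lines. What follows is a minimal illustration under loud assumptions: the data are simulated, the helper names (`solve`, `fit`, `predict`, `group_gap`, `interval`) are hypothetical, & the percentile bootstrap is a dependency-free stand-in I've swapped in to get intervals. It is not the simulation method King et al. describe, & it is not the code behind any CCP study.

```python
import random
import statistics

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit(rows):
    """OLS of y on [1, g, m, g*m]; rows are (g, m, y) tuples."""
    X = [[1.0, g, m, g * m] for g, m, _ in rows]
    y = [r[2] for r in rows]
    k = 4
    XtX = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    Xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

def predict(b, g, m):
    """Model estimate for group g (0/1) at moderator value m."""
    return b[0] + b[1] * g + b[2] * m + b[3] * g * m

def group_gap(b, m):
    """Estimated group difference (g=1 minus g=0) at moderator value m."""
    return predict(b, 1, m) - predict(b, 0, m)

def interval(values, level=0.95):
    """Percentile interval from a list of resampled estimates."""
    s = sorted(values)
    return s[int((1 - level) / 2 * len(s))], s[int((1 + level) / 2 * len(s)) - 1]

# ASSUMPTION: simulated data with a built-in group x moderator interaction.
rng = random.Random(1)
raw = []
for _ in range(300):
    g = rng.choice([0, 1])
    m = rng.gauss(0, 1)
    raw.append((g, m, 3.0 + 0.2 * g + 0.1 * m + 0.5 * g * m + rng.gauss(0, 1)))

# Tips b & h: z-score the outcome, center the continuous moderator.
ms = [r[1] for r in raw]
ys = [r[2] for r in raw]
m_mean, m_sd = statistics.fmean(ms), statistics.stdev(ms)
y_mean, y_sd = statistics.fmean(ys), statistics.stdev(ys)
data = [(g, m - m_mean, (y - y_mean) / y_sd) for g, m, y in raw]

# Tip a: one model, whole sample, cross-product term included.
b = fit(data)

# Tip c: estimates at sensibly contrasting values (-1 SD vs. +1 SD).
low, high = -m_sd, m_sd
predictions = {(g, lvl): predict(b, g, m)
               for g in (0, 1) for lvl, m in (("low", low), ("high", high))}

# Tip e: the group difference at each level of the moderator.
gap_low, gap_high = group_gap(b, low), group_gap(b, high)

# Bootstrap stand-in for CIs: resample rows, refit, recompute the gaps.
# Centering and z-scoring stay fixed at their full-sample values.
boot = []
for _ in range(500):
    bb = fit([rng.choice(data) for _ in data])
    boot.append((group_gap(bb, low), group_gap(bb, high)))
ci_low = interval([lo for lo, _ in boot])
ci_high = interval([hi for _, hi in boot])
ci_diff = interval([hi - lo for lo, hi in boot])

for (g, lvl), val in predictions.items():
    print(f"group {g}, moderator {lvl}: {val:+.2f} SD")
print(f"gap at -1 SD: {gap_low:+.2f} SD, 95% interval {ci_low}")
print(f"gap at +1 SD: {gap_high:+.2f} SD, 95% interval {ci_high}")
print(f"change in gap: {gap_high - gap_low:+.2f} SD, 95% interval {ci_diff}")
```

The four values in `predictions` are what you'd plot as two connected lines (one per group) across the low & high moderator values; `gap_low` & `gap_high` w/ their intervals are what you'd plot if you went the differences route instead. The list `boot` holds the resampled gaps, so it's also the raw material for a density plot.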

Have attached [reproduced below] a bunch of CCP study examples that reflect one or another of these strategies or related ones. BTW, of course, all of these reflect things that I learned to do from collaborating w/ Don [Braman], who like all great teachers teaches people how to teach themselves.

note: all examples below are clickable thumbnails that expand to larger size for closer inspection



