Key Insight

Score! Former Freud expert & current stats legend Andrew Gelman posted a blog (one he likely wrote in the late 1990s; he stockpiles his dispatches, so probably by the time he sees mine he’ll have completely forgotten this whole thing, & even if he does respond I’ll be close to 35 yrs. old by then & will be interested in other things like drinking and playing darts) in which he said he liked one of my graphics!

Actually, he said mine was “not wonderful”—but that it kicked the ass of one that really sucked!

Time to get back to the never-ending project of self-improvement that I’ve dedicated my life to.

The question is, How can I climb to that next rung—“enh,” the one right above “not wonderful”?

I’m going to show you a couple of graphics. They aren’t the same ones Gelman showed but they are using the same strategy to report more interesting data.  Because the data are more interesting (not substantively, but from a graphic-reporting point of view), they’ll supply us with even more motivation to generate a graphic-reporting performance worthy of an “enh”—or possibly even a “meh,” if we can get really inspired here.

I say we because I want some help. I’ve actually posted the data & am inviting all of you, including former Freud expert & current stats legend Gelman (who also is a bully of WTF study producers, whose only recourse is to puff themselves up to look really big, like a scared cat would), to show me what you’d do differently with the data.

Geez, we’ll make it into a contest, even! The “Gelman Graphic Reporting Challenge Cup,” we’ll call it, which means the winner will get a cup, which I will endeavor to get Gelman himself to sign, unless of course he wins, in which case I’ll sign it & award it to him!

Okay, then. The data, collected from a large nationally representative sample, show the relationship among religiosity, left-right political outlooks, and beliefs about climate change.

It turns out that religiosity and left-right outlooks actually interact. That is, the impact of one on the likelihood someone will report “believing in” human-caused climate change depends on the value of the other.

That’s a scatter plot with left_right, the continuous measure of political outlooks, on the x-axis, and “belief in human-caused climate change” on the y-axis.

Belief in climate change is actually a binary variable—0 for “disbelief” and 1 for “belief.”

But in order to avoid having the observations completely clumped up on one another, I’ve “jittered” them—that is, added a tiny bit of random noise to the 0’s and 1’s (and a bit too for the left_right scores) to space the observations out and make them more visible.
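For anyone who wants to try the jittering at home, here is a minimal sketch in plain Python. The `jitter` helper, the noise widths (0.05 and 0.02), and the toy data are all my own assumptions for illustration; they are not the values used to make the graphic.

```python
import random

def jitter(values, scale, seed=0):
    """Return a copy of `values` with uniform noise in [-scale, +scale] added."""
    rng = random.Random(seed)  # seeded so the plot looks the same on every run
    return [v + rng.uniform(-scale, scale) for v in values]

# Hypothetical toy data: binary belief (0/1) and a continuous left_right score.
belief = [0, 1, 1, 0, 1, 0, 0, 1]
left_right = [1.2, -0.8, -1.5, 0.9, -0.3, 1.7, 0.4, -1.1]

# More noise on the 0/1 outcome, only a bit on the predictor.
belief_jittered = jitter(belief, scale=0.05, seed=1)
left_right_jittered = jitter(left_right, scale=0.02, seed=2)
```

The jittered values are for plotting only; any model should still be fit to the original 0’s and 1’s.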

Plus I’ve color-coded them based on religiosity!  I’ve selected orange for people who score above the mean on the religiosity scale and light blue for those who score below the mean. That way you can see how religiosity matters at the same time that you can see that political outlook matters in determining whether someone believes in climate change.
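The mean split behind the color coding is about as simple as it sounds. A sketch, with a hypothetical helper name (`color_by_mean_split`) and made-up religiosity scores:

```python
def color_by_mean_split(scores, above="orange", below="lightblue"):
    """Assign one color to scores above the sample mean, another to the rest."""
    mean = sum(scores) / len(scores)
    return [above if s > mean else below for s in scores]

# Hypothetical religiosity scores (roughly centered on zero).
religiosity = [-1.3, 0.4, 2.1, -0.2, 0.9, -1.9]
colors = color_by_mean_split(religiosity)
```

The resulting list can be handed to whatever plotting tool you like as the per-point color.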

Or at least you can sort of see that. It’s still a bit blurry, right?

So I’ve added the locally weighted regression lines to add a little resolution.  Locally weighted regression is a nonmodel way to model the data. Rather than assuming the data fit some distributional form (linear, sigmoidal, whatever) and then determining the “best fitting” parameters consistent with that form, the locally weighted regression basically slices the x-axis predictor  into zillions of tiny bits, with individual regressions being fit over those tiny little intervals and then stitched together.

It’s the functional equivalent of getting a running tally of the proportion of observations at many many many contiguous points along left_right (and hence my selection of the label “proportion agreeing” on the y-axis, although “probability of agreeing” would be okay too; the lowess regression can be conceptualized as estimating that).
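To make the “lots of little regressions stitched together” idea concrete, here is a stripped-down sketch of the core step: a tricube-weighted linear fit around a single point. It leaves out the robustness iterations a full lowess routine would run, and the function name, the `frac` setting, and the simulated data are assumptions of mine, not what produced the lines in the graphic.

```python
import math
import random

def local_linear_fit(x, y, x0, frac=0.5):
    """Estimate y at x0 from a tricube-weighted linear regression over the
    nearest `frac` of the observations (lowess minus robustness iterations)."""
    n = len(x)
    k = min(n, max(2, int(round(frac * n))))
    dists = sorted(abs(xi - x0) for xi in x)
    h = dists[k - 1] or 1e-12  # bandwidth: distance to the k-th nearest point
    # Tricube weights: near 1 close to x0, falling to 0 at distance h and beyond.
    w = [(1 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (x0 - mx)

# Simulated stand-in data: belief gets less likely as left_right increases.
rng = random.Random(0)
x = [rng.uniform(-2, 2) for _ in range(400)]
y = [1 if rng.random() < 1 / (1 + math.exp(2 * xi)) else 0 for xi in x]

# Evaluate the local fit at a few points along the x-axis and stitch together.
grid = [-1.5, -0.5, 0.5, 1.5]
curve = [local_linear_fit(x, y, g, frac=0.3) for g in grid]
```

Because each little fit is linear, the estimate can stray slightly outside 0 and 1 near the edges, which is one reason to read the curve as a descriptive “running proportion” rather than a model.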

What the lowess lines help us “see” is that in fact the impact of political outlooks is a bit more intense for subjects who are “low” in religiosity. The slope for their S-shaped curve is a bit steeper, so that those at the “top,” on the far left, are more likely to believe in human-caused climate change. Those at the “bottom,” on the right, seem comparably skeptical.

The difference in those S-shaped curves is what we can model with a logistic regression (one that assumes that the probability of “agreeing” will be S-shaped in relation to the x-axis predictor). To account for the possible difference in the slopes of the curves, the model should include a cross-product interaction term that indicates how differences in religiosity affect the impact of differences in political outlooks on “believing” in human-caused climate change.
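Here is a sketch of what a model with a cross-product term looks like once it is written down. The coefficient values are hypothetical, picked only to mimic the pattern described above; they are not the estimates in the table.

```python
import math

def prob_belief(left_right, religiosity, b0, b_lr, b_rel, b_int):
    """Predicted probability of 'belief' from a logistic model with a
    left_right x religiosity cross-product term."""
    logit = (b0 + b_lr * left_right + b_rel * religiosity
             + b_int * left_right * religiosity)
    return 1 / (1 + math.exp(-logit))

def left_right_slope(religiosity, b_lr, b_int):
    """Slope of the logit in left_right at a given religiosity level."""
    return b_lr + b_int * religiosity

# Hypothetical coefficients, NOT the fitted values from the table.
coefs = dict(b0=0.0, b_lr=-1.0, b_rel=-0.5, b_int=0.3)

slope_low = left_right_slope(-1.0, coefs["b_lr"], coefs["b_int"])   # low religiosity
slope_high = left_right_slope(1.0, coefs["b_lr"], coefs["b_int"])   # high religiosity
```

The point of the interaction term is visible in `left_right_slope`: the effect of political outlook is no longer one number but `b_lr + b_int * religiosity`, so with these made-up values the curve is steeper for the low-religiosity group.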

I’ve fit such a model, the parameters of which are in the table in the inset.
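If you want to see the mechanics of fitting a model like that without any statistics package, here is a bare-bones sketch that simulates data and estimates the coefficients by gradient ascent on the log-likelihood. Everything in it is an assumption for illustration (the simulated data, the “true” coefficients, the step size and number of steps); a real analysis would use a standard routine that also reports standard errors.

```python
import math
import random

def simulate(n, coefs, seed=0):
    """Simulate (features, outcome) rows from a logistic model whose features
    are [1, left_right, religiosity, left_right * religiosity]."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        lr = rng.gauss(0, 1)
        rel = rng.gauss(0, 1)
        features = [1.0, lr, rel, lr * rel]
        logit = sum(b * f for b, f in zip(coefs, features))
        p = 1 / (1 + math.exp(-logit))
        rows.append((features, 1 if rng.random() < p else 0))
    return rows

def fit_logistic(rows, steps=300, rate=1.0):
    """Maximize the average log-likelihood by plain gradient ascent."""
    k = len(rows[0][0])
    n = len(rows)
    beta = [0.0] * k
    for _ in range(steps):
        grad = [0.0] * k
        for features, y in rows:
            logit = sum(b * f for b, f in zip(beta, features))
            p = 1 / (1 + math.exp(-logit))
            for j in range(k):
                grad[j] += (y - p) * features[j]  # score contribution
        beta = [b + rate * g / n for b, g in zip(beta, grad)]
    return beta

# Hypothetical "true" values: intercept, left_right, religiosity, interaction.
true_coefs = [0.0, -1.0, -0.5, 0.3]
rows = simulate(2000, true_coefs, seed=42)
estimates = fit_logistic(rows)
```

With a couple of thousand simulated observations the estimates should land near the values used to generate the data, including the sign of the interaction term.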

That  regression actually corroborates, as it were, what we “saw” in the raw data: the parameter estimates for both religiosity and political outlooks “matter” (they have values that are practically and statistically significant), and so does the parameter estimate for the cross-product interaction term.