Wanna see more data? Just ask! Episode 1: another helping of GM food

Okay, here’s a new feature for the blog: “Wanna see more data? Just ask!”

The way it works is that if anyone sees interesting data in one of my posts or in any of our studies, and has some interesting question that could be addressed by additional analyses, that person can post a request (in the comments section or by email to me) & I’ll do the analyses and post the results. (This assumes the study was one I worked on; for others, I’ll pass on requests, but don’t necessarily expect an answer; some of my colleagues have actual lives.)

Now notice I said the question has to be “interesting.” Whether it meets that standard is something I’ll decide, using personal judgement, etc. But here are some general, overlapping, related criteria:

1.  The request has to be motivated by some conjecture or question.  Basically, you have to have some sort of theoretically grounded hypothesis in mind that can be tested by the analysis you’d like to see. The most obvious candidate would be a conjecture/question/hypothesis that’s in the spirit of a plausible alternative explanation for whatever conclusion it was that I reached (or the study did) in presenting the data in the first place. But in any case, give some indication (can be brief; should be!) of what the question/hypothesis/conjecture that you are curious about is & why.

2. Tell me how I can do the analysis and why doing it that way can be expected to generate some result that gives us more reason to accept a particular answer to the motivating question, or more reason to accept or reject the motivating hypothesis, than we would have had without the analysis.  The “how to do” part obviously will be constrained by what sorts of variables are in the dataset. Usually we have lots of demographic data as well as our cultural outlook variables and so forth. The “why” question requires specifying the nature of the causal inference that you think can be drawn from the analysis.  It’s gotta make sense to be interesting.

3. No friggin’ fishin’ trips! Don’t ask me to correlate global warming with the price of cottage cheese just because you think that would be an amusing thing to do.

4. Don’t even think of asking me to plug every conceivable variable into the right-hand side of a regression and see what sort of gibberish pops out. Of course, I’m happy to do multivariate analyses, but each variable has to be justified as part of a model that relates in a specifiable way to the interesting conjecture motivating the request and to the nature of the inference that can be drawn from the analysis. Or to put it another way, the analysis has to reflect a cogent modelling strategy. Overspecified regression analyses are usually a signature of the lack of a hypothesis: people just see what turns out to be significant (something always will with enough variables) & then construct a post-hoc, just-so story for the result. In addition, the coefficients for overspecified models are often meaningless phantoms: they purport to measure the impact of one influence “holding constant” other influences that in the real world are never “constant” in relation to it…. I’ll write another post on why “over-controlling” is such a pernicious, mindless practice….
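To make the “something always will with enough variables” point concrete, here is a minimal simulation sketch in Python. Nothing in it comes from a real dataset: the outcome and every predictor are pure random noise, and the function names (`count_false_positives`, `share_with_any_hit`) are made up for the illustration. It also simplifies matters by testing each predictor one at a time with a correlation test rather than fitting a full regression, and it hard-codes 1.984 as the approximate two-sided 5% critical value of t for 98 degrees of freedom.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def count_false_positives(n_obs, n_predictors, rng, t_crit=1.984):
    """Count noise predictors that look 'significant' against a noise outcome.

    t_crit is an assumed cutoff that suits n_obs=100 (98 degrees of freedom).
    """
    outcome = [rng.gauss(0, 1) for _ in range(n_obs)]
    hits = 0
    for _ in range(n_predictors):
        predictor = [rng.gauss(0, 1) for _ in range(n_obs)]
        r = pearson_r(predictor, outcome)
        # Standard t statistic for testing whether a correlation is zero.
        t_stat = r * math.sqrt((n_obs - 2) / (1 - r * r))
        if abs(t_stat) > t_crit:
            hits += 1
    return hits


def share_with_any_hit(n_datasets=200, n_obs=100, n_predictors=40, seed=1):
    """Fraction of simulated datasets with at least one 'significant' predictor."""
    rng = random.Random(seed)
    with_hit = sum(
        count_false_positives(n_obs, n_predictors, rng) > 0
        for _ in range(n_datasets)
    )
    return with_hit / n_datasets


if __name__ == "__main__":
    print("1 predictor:  ", share_with_any_hit(n_predictors=1))
    print("40 predictors:", share_with_any_hit(n_predictors=40))
```

With 40 unrelated predictors, the chance that at least one clears the 5% bar is about 1 - 0.95^40, or roughly 87%, so most of the simulated datasets should hand you a “finding” to build a just-so story around.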

Okay. This first installment is responsive to questions posed in response to “part 3” of the GM food risk series. Discussants there were curious about whether the “middling” mean score for the GM food risk item was best understood as “not sure; huh?,” as I proposed, or as a genuine, mid-level of concern. One suggested seeing some more raw data might help, and on reflection I can think of some ways to look at them that might, at least a bit.

Consider these histograms, which reflect the distribution of responses to the 8-point industrial-strength risk perception item for “Global warming” (left) and “Genetically modified foods” (right):

Here are some things to note. First, the GM food distribution is much more “normal” (bell shaped) than the global warming distribution. Indeed, if you compare the black line (the statistical “normal density distribution” given the mean & SD for the global warming data) with the red one (the kernel density plot, which smooths the observed responses into a continuous curve), you can see that the distribution for global warming risk perceptions is closer to bimodal, meaning that the subjects are actually pretty divided between those who see “low risk” and those who see “high.” There’s not so much division for GM foods.
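For readers who want to see the mechanics of that comparison, here is a rough Python sketch. It is not the code or the data behind the figures: the response counts are invented, the 8 points are assumed to be coded 0 through 7, and the kernel bandwidth of 0.8 is an arbitrary choice. It overlays two curves on the same responses, a normal density built from the mean and SD and a Gaussian kernel density estimate, and then counts how many peaks the kernel curve has.

```python
import math
from statistics import mean, pstdev


def normal_pdf(x, mu, sigma):
    """Height of the normal density with mean mu and SD sigma at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def kde(x, data, bandwidth):
    """Gaussian kernel density estimate at x: the average of one small bump per response."""
    return sum(normal_pdf(x, d, bandwidth) for d in data) / len(data)


def expand(counts):
    """Turn per-category counts (categories 0..7) into a flat list of responses."""
    return [k for k, c in enumerate(counts) for _ in range(c)]


# Evaluation grid from -3.0 to 10.0, wide enough to cover both tails.
GRID = [i / 10 for i in range(-30, 101)]


def count_peaks(data, bandwidth=0.8):
    """Number of local maxima of the kernel density curve on the grid."""
    ys = [kde(x, data, bandwidth) for x in GRID]
    return sum(1 for i in range(1, len(ys) - 1) if ys[i - 1] < ys[i] > ys[i + 1])


def max_gap_from_normal(data, bandwidth=0.8):
    """Largest vertical distance between the kernel curve and the fitted normal curve."""
    mu, sigma = mean(data), pstdev(data)
    return max(abs(kde(x, data, bandwidth) - normal_pdf(x, mu, sigma)) for x in GRID)


# Invented counts for illustration only, not survey results.
polarized = expand([20, 10, 6, 4, 5, 8, 15, 32])
single_peaked = expand([4, 9, 16, 25, 22, 14, 7, 3])

if __name__ == "__main__":
    for name, data in [("polarized", polarized), ("single-peaked", single_peaked)]:
        print(name, count_peaks(data), round(max_gap_from_normal(data), 3))
```

The invented “polarized” counts should produce a kernel curve with two peaks that sits far from its normal curve, while the single-peaked counts should track the normal curve much more closely. A different bandwidth would smooth more or less, so the peak count is only as trustworthy as that choice.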

Second, the GM foods distribution has a kind of a fat mid-point (low kurtosis). That’s because a lot of survey respondents picked “3,” “4,” & “5.” Because an excess of “middle choices” is a signature of “umm, not sure” for risk perception measures of this sort, I am now even more persuaded that the 800 members of this nationally representative sample didn’t really have strong views about GM foods in relation to the other risks, all of which were ones that displayed substantial cultural polarization.

But my confidence in this conclusion is only modest.  The cases in which a middling mean signifies generalized “don’t know” often have much more dramatic concentrations of responses toward the middle of the scale (high kurtosis); indeed, the labels that were assigned to each point on the Likert item risk-perception measure were designed to mitigate the middle/don’t-know effect, which is usually associated with scales that ask respondents to estimate a probability for some contingency (in which case people who don’t know mean to convey that with “50%”).
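The “low kurtosis” versus “high kurtosis” contrast can be checked with a few lines of Python. This is only a sketch on invented counts (again assuming the 8 points are coded 0 through 7), using the plain moment definition of kurtosis, under which a normal curve scores 3. Kurtosis is a blunt summary, so treat it as a rough cue rather than a test.

```python
from statistics import mean


def kurtosis(data):
    """Fourth central moment divided by squared variance; a normal curve scores 3."""
    mu = mean(data)
    m2 = mean([(x - mu) ** 2 for x in data])
    m4 = mean([(x - mu) ** 4 for x in data])
    return m4 / (m2 ** 2)


def expand(counts):
    """Turn per-category counts (categories 0..7) into a flat list of responses."""
    return [k for k, c in enumerate(counts) for _ in range(c)]


# Invented counts for illustration only, not survey results.
fat_middle = expand([3, 8, 18, 21, 21, 18, 8, 3])   # responses spread across 2 to 5
spiked_middle = expand([3, 4, 6, 37, 37, 6, 4, 3])  # responses piled on the two middle points

if __name__ == "__main__":
    print("fat middle:   ", round(kurtosis(fat_middle), 2))
    print("spiked middle:", round(kurtosis(spiked_middle), 2))
```

On these made-up counts the broad, fat-middle pattern comes out at roughly 2.4 (below the normal benchmark of 3), and the pattern with a sharp pile-up on the two middle points comes out at roughly 4.5 (above it).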

Now consider these two figures:

These are the kernel density estimates for responses to these two risk-perception items when the sample is split at the mean of the “individualism-communitarianism” scale. Basically, the figures allow us to compare how “individualists” and “communitarians” are divided on global warming (left) and GM foods (right).

Do you see what I do? The individualists and communitarians are starkly divided on climate change: the latter is skewed strongly toward high risk, and the former toward low (although perhaps a bit less so; if I looked at “hierarch individualists,” you’d really see skewing). That division (which, again, is compounded when the hierarchical disposition of the subjects is taken into account as well) is the source of the semi-bimodal distribution of responses to the global warming item.

Now look at individualists & communitarians on GM foods. They see more or less eye-to-eye. This is corroboration of my conclusion in the last post that there isn’t, at least not yet, any meaningful cultural division over GM foods. (BTW, the pictures would look the same if I had divided the subjects into “hierarchs” and “egalitarians”; I picked one of the two worldview dimensions for the sake of convenience and clarity).
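The split-at-the-mean comparison is easy to sketch in Python. Everything below is simulated, not the survey data: the worldview score, the two items, and the strength of the relationship (the -1.5 slope) are assumptions chosen so that one item is polarized and the other is not. The sketch splits subjects at the mean of the score and compares the two groups’ average responses on each item.

```python
import random
from statistics import mean


def clamp_to_scale(value, low=0, high=7):
    """Round to the nearest whole point and keep it on the assumed 0..7 scale."""
    return max(low, min(high, round(value)))


def simulate(n=800, seed=7):
    """Return (scores, polarized_item, consensus_item) for n made-up subjects."""
    rng = random.Random(seed)
    scores = [rng.gauss(0, 1) for _ in range(n)]
    # Polarized item: responses fall as the worldview score rises.
    polarized = [clamp_to_scale(3.5 - 1.5 * s + rng.gauss(0, 1.5)) for s in scores]
    # Consensus item: responses ignore the worldview score entirely.
    consensus = [clamp_to_scale(3.5 + rng.gauss(0, 1.5)) for _ in scores]
    return scores, polarized, consensus


def split_at_mean(scores, responses):
    """Split responses into below-mean and at-or-above-mean groups on the score."""
    cutoff = mean(scores)
    below = [r for s, r in zip(scores, responses) if s < cutoff]
    above = [r for s, r in zip(scores, responses) if s >= cutoff]
    return below, above


def mean_gap(scores, responses):
    """Average response of the below-mean group minus that of the above-mean group."""
    below, above = split_at_mean(scores, responses)
    return mean(below) - mean(above)


if __name__ == "__main__":
    scores, polarized, consensus = simulate()
    print("polarized item gap:", round(mean_gap(scores, polarized), 2))
    print("consensus item gap:", round(mean_gap(scores, consensus), 2))
```

A large gap on one item and a gap near zero on the other is the numerical counterpart of two density curves that pull apart versus two that sit on top of each other. Comparing means throws away the shape of each group’s distribution, which is why the density plots are the more informative view.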

Whaddya think? Wanna see some more? Just ask!


Bruine de Bruin, W., Fischhoff, B., Millstein, S.G. & Halpern-Felsher, B.L. Verbal and Numerical Expressions of Probability: “It’s a Fifty–Fifty Chance”. Organizational Behav. & Human Decision Processes 81, 115-131 (2000).
