Some of the thoughtful things people said in connection with my 3-part series on the “external validity” of science-communication studies made me realize that it would be helpful to say a bit more about that concept and its connection to doing evidence-based science communication.
In the posts, I described “internal validity” as referring to qualities of the design that support drawing inferences about what is happening in the study, and “external validity” as referring to qualities of the design that support drawing inferences from the study to the real-world dynamics it is supposed to be modeling.
I’m going to stick with that.
But what makes me want to elaborate is that I noticed some people understood me to be referring to “external validity” more broadly as the amenability of a science-communication study to immediate or direct application. I was thought to be saying “be careful: you can’t just take the stimulus of a ‘framing’ experiment or what have you, send it to people in the mail or wave it around, etc., and expect to see the results from the lab reproduced in the world.”
I would (often) say that!
But I’d say it about many studies that are externally valid.
That is, these studies are modeling something of consequence in the world, and telling us things about how those dynamics work that it is important to know. But they aren’t always telling us what to do to make effective use of that knowledge in the world.
That’s usually a separate question, requiring separate study.
This is the very point I stress in my paper, “Making Climate Science Communication Evidence-based—All the Way Down.” There I say there must be no story-telling anywhere in an evidence-based system of science communication.
It’s a mistake—an abuse of decision-science—for someone (anyone, including a social scientist) to reach into the grab-bag of mechanisms, pull out a few, fabricate a recommendation for some complicated phenomenon, and sell it to people as “empirically grounded” etc.
Because there are in fact so many real mechanisms of cognition that play a role in one or another aspect of risk perception and the like, there will always be more plausible accounts of some problem—like the persistence of public conflict over climate change—than are true!
Such accounts are thus conjectures or hypotheses that warrant study, and should be clearly designated as such.
The hypotheses have to be tested—with internally and externally valid methods—designed to generate evidence that warrants treating one or another conjecture as more worthy of being credited than another.
Very very important!
But almost never enough.
The kinds of studies that help to decide between competing plausible mechanisms in science communication typically involve simplified models of the real-world problem in question. The models deliberately abstract away from the cacophony of influences in those settings that make it impossible to be sure what’s going on.
An internally valid study is one that has successfully isolated competing mechanisms from these confounding effects and generated observations that give us more reason to credit one, and less reason to credit the other, than we otherwise would have had.
(Yes, one can test “one” mechanism against the “null” but then one is in effect testing that mechanism against all others. Such designs frequently founder on the shoals of internal validity precisely because, when they “reject the null,” they fail to rule out that some other plausible mechanism could have produced the same effect. I’ll elaborate on why it makes more sense to use designs that examine the relative strength of competing mechanisms instead “tomorrow.”)
Such a study is useful, of course, only if the mechanisms that are being tested really are of consequence in the real-world, and only if the simplifying model hasn’t abstracted away from influences of consequence for the operation of those mechanisms.
That’s the focus of external validity.
But once someone has done all that—guess what?
Such a study won’t (or at least almost never will) tell a real-world communicator “what to do.”
How could it? The researcher, in order to be confident that she is observing the influence of the mechanisms of interest and that they are behaving in ways responsive to whatever experimental manipulation she performed, has deliberately created a model that abstracts away from the myriad influences that apply in any particular real-world setting.
If the study succeeds, it helps to identify what plausible mechanisms of consequence a real-world communicator should be addressing and—just as importantly—which plausibly consequential ones he should in fact ignore.
But here, too, there will be more plausible ways to engage that mechanism, in a manner that reproduces in the world the results the experimenter observed in the lab, than are true!
The only way to connect the insight generated by the lab study to the real-world is to do in the real-world exactly what was done in the lab to sort through the surplus of plausible conjectures: that is, by constructing internally and externally valid field studies that give real-world communicators more reason to believe than they had before that one plausible conjecture about how to engage the communication mechanism of consequence is more likely correct than another one.
In other words, evidence-based science communication practice must be evidence based all the way down.
No story telling in lieu of internally and externally valid studies of the mechanisms of cognition that one might surmise are at work.
And no story telling about how a lab study supports one or another real-world strategy for communication.
Researchers who carry on as if their lab studies support concrete prescriptions in particular real-world settings are being irresponsible. They should instead be telling real-world communicators exactly what I’m saying here: that field testing, informed by the judgment of those who have experience in the relevant domain, is necessary.
And if they have the time, inclination, and patience, they should then offer to help carry out such studies.
This is the m.o. of the Southeast Florida Evidence-based Science Communication Initiative that the Cultural Cognition Project, with very generous and much appreciated funding from the Skoll Global Threats Fund, is carrying out in support of the science-communication efforts of the Southeast Florida Climate Compact.
But now, getting back to the “external validity” concept, it should be easier to see that when I say a study is “externally invalid,” I’m not saying merely that it fails to generate an immediately operational communication strategy for the field.
But the same can be said for almost all externally valid lab studies.
When I say that a study isn’t “externally valid,” I’m saying it is in fact not modeling the real-world dynamics of consequence. Accordingly, I mean to be asserting that it furnishes no reliable guidance at all.
So to be clear about all this, let’s add a new term to the discussion: operational validity.
“Operational validity,” a term I’m adapting from Schellenberger (1974), refers to that quality of a study design that supports the inference that doing what was done in the study will itself generate in the real-world the effects observed in the study.
A study has “high operational validity” if in fact it tests a communication-related technique that real-world actors can themselves apply and expect to work. For the most part, those will be field-based studies.
A study that is internally and externally valid has “low operational validity” if, in order for it to contribute to science communication in the real-world, additional empirical studies connecting that study’s insights to one or another real-world communication setting will still need to be performed.
A study with “low operational validity” can still be quite useful.
Indeed, there is often no realistic way to get to the point where one can conduct studies with high operational validity without first doing the sort of stripped-down, pristine, “low operational validity” lab studies suited to winnowing down the class of cognitive mechanisms plausibly responsible for a science-communication problem.
But the fact is that when researchers have generated these sorts of studies, more empirical work must still be done before a responsible science-communication advisor can purport to answer the “what do I do?” question (or answer it other than by saying "you tell me! & I'll measure ...").
Three distinct concepts: internal validity; external validity; operational validity.
All three matter.
This is, admittedly, too abstract a discussion. I should illustrate. But I’ve spent enough time on this post (about 25 mins; 30 mins is the limit).
If there is interesting discussion, then maybe I’ll do another post calling attention to examples suggested by others or crafted by me.