Recent blog entries

Homework assignment: what's the relationship between science literacy & persistent political conflict over decision-relevant science?

I've agreed to do a talk at the annual meeting of the American Geophysical Union in December. It will be part of a collection on "climate science literacy."

Here's the synopsis I submitted:

The value of civic science literacy

The persistence of public conflict over climate change is commonly understood to be evidence of the cost democracy bears as a result of the failure of citizens to recognize the best available decision-relevant science. This conclusion is true; what’s not is the usual understanding of cause and effect that accompanies this perspective. Ordinarily, the inability of citizens to comprehend decision-relevant science is identified as the source of persistent political conflict over climate change (along with myriad other issues that feature disputed facts that admit of scientific investigation). The truth, however, is that it is the persistence of public conflict that disables citizens from recognizing and making effective use of decision-relevant science. As a result, efforts to promote civic science literacy can’t be expected to dissipate such conflict. Instead, the root, cultural and psychological sources of such conflict must themselves be extinguished (with the use of tools and strategies themselves identified through valid scientific inquiry) so that our democracy can realize the value of educators' considerable skills in making citizens science literate. 

I have ideas along these lines -- ones that have figured in various papers I've written, informed various studies I've worked on, and appeared in one or another blog post on this site.

But I haven't come close to working all this out.  

What's more, I worry (as always) that I could be completely wrong about everything.

So I welcome reflections by others on the basic claim expressed here-- reflections on how to convey it effectively; on what to do about the practical problem it reflects; but also on how to continue to probe and test to see whether it is true and to help identify any alternative account that's even more well founded and that furnishes an even more useful guide to action.

So get going-- don't put this off until the day before the talk & pull an all nighter! 



Can we SENCERize the communication of science?

I had the tremendous privilege—which yielded an even larger benefit in enlargement of personal knowledge—of being able to participate in the SENCER summer institute at Santa Clara University last week.

SENCER—which stands for Science Education for New Civic Engagements and Responsibilities—is an integrated set of practical research initiatives aimed at promoting the development and use of scientific knowledge on how to teach science. It is actually one of a family of programs created to carry out the broader mission of the National Center for Science and Civic Engagement, “to inspire, support, and disseminate campus-based science education reform strategies that strengthen learning and build civic accountability among students in colleges and universities.”

It’s not amusing that those whose job it is to impart knowledge on empirical methods so infrequently even ask themselves whether their own methods for doing so—from the mode of teaching they use in the classroom to the materials and exercises they assign to students to the examinations they administer to test student comprehension—are valid and reliable.

On the contrary, it’s an outright scandal that demeans the culture of science.

SENCER comprises a sprawling, relentless, and expanding array of resources aimed at dissolving this embarrassing contradiction. These include a growing stockpile of empirical research findings; a trove of practical materials designed to enable use of this knowledge to improve science education; the sponsorship of regular events at which such knowledge is shared and plans for enlarging it formulated; a set of regional centers that coordinate efforts to promote evidence-based methods in the teaching of science; and most important of all a critical mass of intelligent and passionate people committed to the program’s ends.

The occasion for SENCER—the peculiar insularity of a craft dedicated to propagating valid empirical methods from empirical evidence relating to the realization of its own goals—is not unique to science education.

It is at the root, too, of what I have called the science communication problem—the failure of ample, compelling, readily accessible and indeed widely disseminated evidence to quiet persistent public controversy over risks and other facts to which that evidence directly speaks. Climate change is, of course, the most conspicuous example of the science communication problem but it is hardly the only consequential instance of it.

Immense resources are being dedicated to solving this problem and appropriately so.

But the aggressive resistance to evidence-based practice that pervades the climate-change advocacy community and their counterparts on other issues means that the vast majority of these resources are simply wasted. 

I’m not kidding: hundreds and hundreds of millions of dollars are foreseeably expended on programs that are certain not to have any positive impact (aside from raising the profile of those who operate the programs)—not so much because the initiatives being sponsored are ill-considered (although many indisputably are!) but because those who are being awarded the money to carry them out aren’t genuinely committed to (or maybe just not genuinely capable of) considering empirical evidence.

They don’t meaningfully engage existing evidence on communication dynamics to determine what psychological and political mechanisms their initiatives presuppose and what is known about those mechanisms.

They don’t carry out their initiatives in a manner that is geared to generating what might be called programmatic evidence in the form of pretest results or early-return data that can be used to refine and calibrate communication efforts as they are unfolding.

And worst of all, they lack any protocols that assure information on the impact of their efforts (including the lack thereof) is collected, preserved, and freely distributed in the manner that enables the progressive accretion of knowledge.

Instead, every surmise from every source—no matter how innocent of the conclusions of those who have previously used scientific methods to test theirs—is created equal in the world of science communication advocacy. 

Every day is a new day, to be experienced free of the burden to take seriously what was learned (from failure as well as success) the day before.

I have written a paper about this.

So has Amy Luers, in a perceptive, evidence-informed article in Climatic Change that was addressed specifically to the foundations that are the primary sources of support for efforts to promote constructive engagement with climate science.

Her article is evidence of a heartening awareness that the evidence-free culture that has characterized science communication in this area of public policy and others is barren of the supportive practices and habits and outlooks that nourish growth of empirical knowledge.

Maybe things will change.

But there are still other science-communication professions that are puzzlingly—unacceptably, intolerably!—innocent of science in their own operations.

Science journalism—including (here) popular science writing and science documentary production as well as science news writing—is one. 

I have said before that I regard these professionals with awe—and gratitude, too. Much as the bumblebee defies the calculations of physicists who insist that its capacity for flight defies physical laws, so science journalists seem to defy basic mechanisms of psychology by creating a form of commensurability in understanding that enables the curious nonscientist to participate in—and thus experience the wonder of—what scientists, by applying their highly specialized knowledge, discover about the mysteries of nature.

There is no communication alchemy involved here. Using a form of professional judgment exquisitely tuned by experience, the science journalist mines the fields of common cultural understanding for the resources needed to construct this remarkably engineered bridge of insight.

Yet how to do what they do is a matter that constantly confronts the members of this special profession with factual questions that they themselves do not have confident answers to—or have confident but conflicting opinions about.

Do norms of journalistic neutrality—such as “balanced” coverage of science issues that generate controversy, within science or without—distort public understanding or help inform curious individuals of the nature of competing claims?

Is the segment of the population that experiences wonder and awe at scientific discovery more culturally diverse than the current regular consumers of the highest quality science documentaries? If so, do those programs convey meanings collateral to their core, scientific content that constrain the size and diversity of their audience?

(These are issues that figured, actually, in two of the sessions of my Science of Science Communication course from last spring; I am delinquent in my promise to report on the nature of those sessions.)

These are empirical questions, ones the answers to which would be made better if journalists had evidence generated specifically to inform the ongoing collective discussion and practice that are the source of their craft knowledge. But instead, we see here, too, the sort of “every-conjecture-created-equal,” “every-day-a-new-day” style of engagement that is the signature of evidence-free, nonscientific thought that by its nature is incapable of creating incremental enlargement of knowledge.

I could go on, not just about science journalism, but about many other evidence- or science-communication professions that are evidence-free about the nature of their own practices. Like the law, e.g.

But the point is that these professions, too, are ripe for SENCERizing. They need to be fortified with the sorts of resources and programs that SENCER comprises. And to get that fortification they require not only a core of practitioners who agree with this philosophy—I think they all already have them, actually—but also structures of collective action that will, through the dynamics of reciprocity, create the self-reinforcing contributions of those practitioners to those resources and programs.

SENCER itself might well be a vehicle for such developments. Its gracious invitation to me to participate in its summer institute reflects the interest of its members in enlarging the scope of their endeavor to the communication of decision-relevant science.

But it would be a mistake to think that SENCERizing science communication generally means relying on SENCER, or SENCER alone, to facilitate the advent of evidence-based practices within the relevant science-communication professions.

The remarkable founder of SENCER, Wm. David Burns, made this clear to me, in fact.

I asked him if he himself regarded the program as an “engine for” or a “model of” what needs to be done to make science education and science communication generally more evidence based.

He answered that the only appropriate way to think of SENCER is as an “experiment” of a fractal nature: by enabling those who believe science education must be evidence based to continuously form, refine, and test competing conjectures about how to build on and refine their knowledge of how to effectively impart scientific knowledge, SENCER itself is a test of a hypothesis that the particular mode of organization that it is and will become in such a process is an effective way to achieve its own ends.

SENCER, then, is surely a model (an iterative, self-updating one at that!) of the style of conjecture and refutation that is the engine that drives scientific discovery.

And such a model is necessarily one that cannot be reduced to a particular form or formula. For the very logic on which its own success is founded consists in the continuous engagement of competing models, whose successive remedies for one another's inevitable imperfections are what continuously make us smarter than we were before.


Weekend update: Yale professor does *what*, you say?

Maybe @Paul Mathews has a point after all, but I think the commenter who offered to sell the Brooklyn Bridge to the author of this blog post has the better of the argument (you might have thought the title of my post would have given him a clue as well).

Needless to say, I am a tad anxious about Preet Bharara getting wind of all this...


Motivated system 2 reasoning--experimental evidence & its significance for explaining political polarization

My paper Ideology, Motivated Reasoning, and Cognitive Reflection was published today in the journal Judgment and Decision Making.

I’ve blogged on the study that is the focus of the paper before.  In those posts, I focused on the relationship of the study to the “asymmetry thesis,” the view that ideologically motivated reasoning is distinctive of (or at least disproportionately associated with) conservativism.

The study does, I believe, shed light on (by ripping a fairly decent-sized hole in) the asymmetry thesis. But the actual motivation for and significance of the study lie elsewhere.

The cultural cognition thesis (CCT) holds that individuals can be expected to form risk perceptions that reflect and reinforce their connection to groups whose members subscribe to shared understandings of the best life and the ideal society.

It is opposed to various other accounts of public controversy over societal risks, the most significant of which, in my view, is the bounded rationality thesis (BRT).

Associated most prominently with Kahneman’s account of dual process reasoning, BRT attributes persistent conflict over climate change, nuclear power, gun control, the HPV vaccine, etc. to the public’s over-reliance on rapid, visceral, affect-laden, heuristic reasoning—“System 1” in Kahneman’s terms—as opposed to more deliberate, conscious, analytical reasoning— “System 2,” which is the kind of thinking, BRT theorists assert, that characterizes the risk assessments of scientists and other experts.

BRT is quite plausible; indeed, I’m happy to admit, every bit as plausible as CCT. Nearly all interesting problems in social life admit of multiple plausible but inconsistent explanations. Likely that’s what makes them interesting. It’s also what makes empirical testing—as opposed to story-telling—the only valid way to figure out why such problems exist and how to solve them.

In my view, every Cultural Cognition Project study is a contribution to the testing of CCT and BRT.  Every one of them seeks to generate empirical observations from which valid inferences can be drawn that give us more reason than we otherwise would have had to view either CCT or BRT as more likely to be true.

In one such study, CCP researchers examined the relationship between perceptions of climate change risk, on the one hand, and science literacy and numeracy, on the other. If the reason that the public is confused (that’s one way to characterize polarization) about climate change and other risk issues (we examined nuclear power risk perceptions in this study too) is that it doesn’t know what scientists know or think the way scientists think, then one would expect convergence in risk perceptions among those members of the public who are highest in science literacy and technical reasoning ability.

The study didn’t find that.  On the contrary, it found that members of the public highest in science literacy and numeracy are the most divided on climate change risks (nuclear power ones too).

That’s contrary to what BRT would predict, particularly insofar as numeracy is a very powerful indicator of the disposition to use “slow” System 2 reasoning.

That science literacy and numeracy magnify rather than dissipate polarization is strongly supportive of CCT.  If people are unconsciously motivated to fit their perceptions of risk and comparable facts to their group commitments, then those who enjoy highly developed reasoning capacities and dispositions can be expected to use those abilities to achieve that end.

In effect, by opportunistically engaging in System 2 reasoning, they’ll do an even “better” job at forming culturally congruent perceptions of risk.

Now enter Ideology, Motivated Reasoning, and Cognitive Reflection. The study featured in that paper was aimed at further probing and testing of that interpretation of the results of the earlier CCP study on science literacy/numeracy and climate change polarization.

The Ideology, Motivated Reasoning, and Cognitive Reflection study was an experimental follow-up aimed at testing the hypothesis that individuals of diverse cultural predispositions will use their “System 2” reasoning dispositions opportunistically to form culturally congenial beliefs and avoid forming culturally dissonant ones.

The experiment reported in the paper corroborates that hypothesis.  That is, it shows that individuals who are disposed to use “System 2” reasoning—measured in this study by use of the Cognitive Reflection Test, another performance based measure of the disposition to use deliberate, conscious (“slow”) as opposed to heuristic-driven (“fast”) reasoning—exhibit greater motivated reasoning with respect to evidence that either affirms or challenges their ideological predispositions.

The evidence on which subjects demonstrated motivated reasoning concerned how “closed-minded” and “unreflective” individuals of opposing ideologies are.

“Closed mindedness” is a very undesirable trait generally.

It’s also what those on each side of politically polarized debates like the one over climate change identify as the explanation for the other’s refusal to accept what each side sees as the clear empirical evidence in favor of its own position.

One might thus expect individuals who have a stake in forming perceptions of facts congenial to their cultural commitments to react in a defensive way to evidence that those who share their commitments are less “open-minded” and “reflective” than those who harbor opposing commitments.

So I tested that.  I advised subjects that psychological evidence suggests that the Cognitive Reflection Test measures “open-mindedness” (some psychologists take that position; I actually think they are wrong—as I’ll explain in a moment!).  Members of a control group were told no more than this.  But subjects in two other groups were told either that climate change “skeptics” score higher than climate change “believers” or vice versa.

I found that subjects displayed motivated reasoning with respect to the evidence of the “validity” of the Cognitive Reflection Test as a measure of “open mindedness.” That is, they credited the evidence that the CRT is a “valid” test of “open-mindedness” and “reflection” much more readily if they were advised that individuals who hold the climate-change position consistent with the subjects’ ideologies scored higher, but rejected that evidence when they were informed that those same individuals score lower, than individuals with the opposing position on climate change.

Moreover, this tendency was highest among individuals with the highest Cognitive Reflection Test scores.
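
To make the comparison in the last few paragraphs concrete, here is a minimal sketch of how the "motivated reasoning gap" could be computed at different levels of cognitive reflection. Everything in it is an illustrative assumption rather than the study's actual analysis: the ratings are made-up toy values, the 1-6 rating scale, the 0-3 score range, the function name, and the low/high split are hypothetical, and the control group is left out for brevity.

```python
from statistics import mean

# Hypothetical illustration only: these values are invented to show the shape
# of the comparison. They are NOT data or results from the study.
# Each record: (crt_score, condition, validity_rating)
#   crt_score:       assumed range 0-3
#   "congruent":     told that people sharing the subject's position scored higher
#   "incongruent":   told that people sharing the subject's position scored lower
#   validity_rating: assumed 1-6 agreement that the test measures open-mindedness
TOY_RESPONSES = [
    (0, "congruent", 4), (0, "incongruent", 3),
    (1, "congruent", 4), (1, "incongruent", 3),
    (2, "congruent", 5), (2, "incongruent", 2),
    (3, "congruent", 6), (3, "incongruent", 1),
]


def motivated_reasoning_gap(responses, min_crt, max_crt):
    """Mean rating in the congruent condition minus the mean rating in the
    incongruent condition, for subjects with min_crt <= crt_score <= max_crt."""
    congruent = [rating for crt, cond, rating in responses
                 if min_crt <= crt <= max_crt and cond == "congruent"]
    incongruent = [rating for crt, cond, rating in responses
                   if min_crt <= crt <= max_crt and cond == "incongruent"]
    return mean(congruent) - mean(incongruent)


low_gap = motivated_reasoning_gap(TOY_RESPONSES, 0, 1)
high_gap = motivated_reasoning_gap(TOY_RESPONSES, 2, 3)
print(f"gap among low scorers:  {low_gap}")
print(f"gap among high scorers: {high_gap}")
```

In this toy setup a gap near zero at every score level would mean subjects judged the test's validity the same way no matter whom it flattered. A gap that grows with the score is the pattern the experiment reported. A real analysis would of course use the full sample and a regression model rather than a two-bin comparison.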

That finding is highly inconsistent with BRT, which assumes that a deficit in System 2 reasoning capacities explains the failure of the members of the public to converge on conclusions supported by the best available decision-relevant science.

But it is very much consistent with CCT, which predicts that individuals will use their System 2 reasoning capacities strategically and opportunistically to reinforce beliefs that their cultural group’s positions on such issues reflect the best available evidence and that opposing groups’ positions do not.

It's consistent, too, with a growing collection of findings in political psychology. This research shows not only that ideologically motivated reasoning drives political polarization (generating perverse effects, such as hardening of commitment to mistaken beliefs when "fact checkers" try to correct false claims), but also that this effect intensifies as individuals become more sophisticated about politics.

Some could have attributed this effect to a convergence between political knowledge and intensity of partisanship.  But the result in my study makes it more plausible to see the magnification of polarization associated with political knowledge as reflecting the tendency of people who simply have a better comprehension of matters political to use their knowledge in an opportunistic way so as to maintain congruence between their beliefs and their ideological identities. (I've addressed before how "cultural cognition" relates to the concept of ideologically motivated reasoning generally, and will even say a bit more on that below.)

As for the asymmetry thesis, the study also found, as predicted, that this tendency was symmetric with respect to right-left ideology.  That’s not what scholars who rely on the “neo-authoritarian personality” literature—which rests on correlations between conservativism and various self-report measures of “open-mindedness”—would likely have expected to see here.

Interestingly, I also found that there is no meaningful correlation between cognitive reflection and conservativism.

The Cognitive Reflection Test is considered a “performance” or “behavioral” based “corroborator” of the self-report tests (like “Need for Cognition,” which involves agreement or disagreement with statements like “I usually end up deliberating about issues even when they do not affect me personally” and "thinking is not my idea of fun") that are the basis of the neo-authoritarian-personality literature on which the “asymmetry thesis” rests.

It has also been featured in numerous studies that show that religiosity, which is indeed negatively correlated with cognitive reflection, predicts greater resistance to engaging evidence that challenges pre-existing beliefs.

Accordingly, one might have expected, if the “asymmetry thesis” is correct, that Cognitive Reflection Test scores would be negatively correlated with conservativism.  Studies based on nonrepresentative samples—ones consisting of M Turk workers or of individuals who visited a web site dedicated to disseminating research findings on moral reasoning style—have reported such a finding.

But in my large, nationally representative sample, scores on the Cognitive Reflection Test were not meaningfully correlated with political outlooks.

Actually, there was a very small positive correlation between cognitive reflection and identification with the Republican Party.  But it was too tiny to be of any consequence for anything as consequentially large as the conflict over climate change.

Moreover, there was essentially zero correlation between cognitive reflection and a more reliable, composite measure of ideology and political party membership.

Because I think the only valid way to test for motivated reasoning is to do experimental tests that feature that phenomenon, I don’t really care that much about correlations between cognitive style measures and ideology.

But if I were someone who did think that such correlations were important, I’d likely find it pretty interesting that conservativism doesn’t correlate with Cognitive Reflection Test scores. Because this test is now widely regarded as a better measure of the disposition to engage in critical reasoning than are the variety of self-report measures on which the “asymmetry thesis” literature rests—and, as I said, has been featured prominently in recent studies of the cognitive reasoning style associated with religiosity—the lack of any correlation between it and conservative political outlooks raises some significant questions about exactly what the correlations reported in that literature were truly measuring.

For this reason, I anticipate that “asymmetry thesis” supporters will focus their attention on this particular finding in the study. Yet it’s actually not the finding that is most damaging to the “asymmetry thesis”; the experimental finding of symmetry in motivated reasoning is! Indeed, I obviously don’t think the Cognitive Reflection Test—or any other measure of effortful, conscious information processing for that matter—is a valid test of open-mindedness (which isn't to say there might not be one; I'd love to find it!). But it has been amusing—a kind of illustration of the experiment result itself—to see “asymmetry thesis” proponents, in various responses to the working paper version of the study, attack the Cognitive Reflection Test as “invalid” as a measure of the sort of “closed mindedness” that their position rests on!

One final note:

The study characterizes differences in individuals’ predispositions with a measure of their right-left political leanings rather than their cultural worldviews. I’ve explained before that “liberal-conservative ideology” and “cultural worldviews” can be viewed as alternative observable “indicators” of the same latent motivating disposition. I think cultural worldviews are better, but I used political outlooks here in order to maximize engagement with those researchers who study motivated reasoning in political psychology, including those who are interested in the “asymmetry thesis,” the probing of which was, as indicated, a secondary but still important objective of the study. I have also analyzed the study data using cultural worldviews as the predisposition measure and reported the results in a separate blog post.


Weekend update 2: Money talks, bullshit on scientific consensus (including lack thereof) walks

The comment thread following yesterday's "update" on the persistent, and persistently unenlightening, debate over the most recent "97% consensus" study has only renewed my conviction that anyone genuinely interested in helping confused and curious members of the public to assess the significance of the best available evidence on climate change would not be bothering with surveys of scientists but would instead be creating a market index in securities the value of which depends on global warming actually occurring.

I've explained previously how such an index would operate as a beacon of collective wisdom, beaming a signal of considered judgment through a filter of economic self-interest that removes the distorting influence of cultural cognition & like forms of bias.

I just instructed my broker to place an order for $153,252 worth of stocks in firms engaged in arctic shipping. I wonder how many of the people arguing against the validity of the Cook et al. study are shorting those same securities?





Weekend update: The distracting, counterproductive "97% consensus" debate grinds on

I don’t want to go back there but since 10's of millions of people get all their news exclusively from this blog (oh, btw, there was a royal baby, everyone, in case any of you care) I felt that I ought to note that controversy continues to attend the Cook et al. study finding that “97%” of climate scientists agree that human activity is contributing to climate change.

Studies making materially identical findings have been appearing at regular intervals for the better part of a decade. Every time, they are widely heralded; indeed, the media have been saturated with claims that there is “scientific consensus” on climate change since at least 2006, when Al Gore made that message the centerpiece of a $300-million effort to build public support for policies to reduce carbon emissions in the U.S.

But it is demonstrably the case (I'm talking real-world evidence here) that the regular issuance of these studies, and the steady drum beat of “climate skeptics are ignoring scientific consensus!” that accompanies them, have had no—zero, zilch—net effect on professions of public “belief” in human-caused climate change in the U.S.

On the contrary, there’s good reason to believe that the self-righteous and contemptuous tone with which the “scientific consensus” point is typically advanced (“assault on reason,” “the debate is over” etc.) deepens polarization.  That's because "scientific consensus," when used as a rhetorical bludgeon, predictably excites reciprocally contemptuous and recriminatory responses by those who are being beaten about the head and neck with it.

Such a mode of discourse doesn't help the public to figure out what scientists believe. But it makes it as clear as day to them that climate change is an "us-vs.-them" cultural conflict, in which those who stray from the position that dominates in their group will be stigmatized as traitors within their communities.  

This is not a condition conducive to enlightened self-government.

Nevertheless, the authors of the most recent study announced (in a press release issued by the lead author’s university) that “when people understand that scientists agree on global warming, they’re more likely to support policies that take action on it,” a conclusion from which the authors inferred that “making the results of our paper widely-known is an important step toward closing the consensus gap and increasing public support for meaningful climate action.”

Unsurprisingly, the study has in the months since its publication supplied a focal target for climate skeptics, who have challenged the methods the authors employ.

It’s silly to imagine that ordinary members of the public can be made familiar with results of particular studies like this.  

But it’s very predictable that they will get wind of continuing controversy over “what scientists believe” so long as advocates keep engaging in impassioned, bitter, acrimonious debates about the validity of studies like this one.

That’s too bad because, again, the best evidence suggests that the reason the public remains divided on climate change is the surfeit of cues that the issue is one that culturally divides people. Those cues motivate members of the public to reject any evidence of “scientific consensus” that suggests it is contrary to the position that predominates in their group. Under these circumstances, one can keep telling people that there is scientific consensus on issues of undeniable practical significance, and a substantial proportion of them just won’t believe what one is saying.

The debate over the latest “97%” paper multiplies the stock of cues that climate change is an issue that defines people as members of opposing cultural groups. It thus deepens the wellsprings of motivation that they have to engage evidence in a way that reinforces what they already believe. The recklessness  that the authors displayed in fanning the flames of unreason that fuels this dynamic is what motivated me to express dismay over the new study.

But look: Matters like these are admittedly complex and open to reasonable disagreement. I could be wrong, and I welcome evidence & reasoned argument that would give me reason to revise my views. In the best spirit of scholarly conversation, the lead author of the latest "97%" study, John Cook, penned a very perceptive, engaging, and gracious response--and I urge people to take a look at it & decide for themselves if my reaction was well-founded.

So what’s the new development?

Mike Hulme, a climate scientist who is famous for his own conjectures about public conflict over climate change, has apparently added his voice to the chorus of critics.

I say apparently because the comments attributed to Hulme appear in a short on-line comment on a blog post that described an interview of the UK Secretary of State for Energy and Climate Change. I assume Hulme must be the actual author of the comment because no one seems to be challenging that and he hasn’t disavowed it. 

Anyway, in the comment, Hulme (assuming it’s him!) acidly states:

Needless to say, the comment—because it comes from a figure of significant stature among proponents of aggressive policy engagement with the risks posed by climate change—has lifted the frenzy surrounding the latest “97%” study to new heights (most noticeably in dueling twitter posts, a form of exchange more suited for playground-style taunting than serious discussion).

What to say?

First, what a sad spectacle.  Honestly, it’s hard for me to conceive of an issue that could be further removed from the important questions here—ones involving what the best empirical evidence reveals about climate change and about the pathologies that make public debate impervious to the same—than whether the latest “97%” study is “sound.”

Second, I think Hulme’s frustration, while probably well-founded, is not as well articulated as it should be.  What exactly does he mean, e.g., when he says “public understanding of the climate issue has moved on”?  The statement admits of myriad interpretations, many of which would be clearly false (such as that polarization in the U.S., e.g., has abated). 

Of course, it's not reasonable to expect perfect clarity or cogency in a 5-sentence blog comment. Hulme has written a very thoughtful essay in which he presents an admirably clear and engaging case against trying to buy public consensus in the currency of appeals to the authority of "scientific consensus." His argument is founded on the manifestly true point that science's way of knowing consists neither in nose counting nor appeals to authority--and to proceed as if that weren't so demeans science and makes the source of the argument look like a fool.

My position is slightly different from his, I think.

I'd say it makes perfect sense for the public to try to give weight to what they perceive to be the dominant view on decision-relevant science. Indeed, it's a form of charming but silly romanticism to think that ordinary members of the public should "take no one's word for it" (nullius in verba) but rather try to figure out for themselves who is right when there are (as is inevitably so) debates over decision-relevant science.

Members of the public are not experts on scientific matters. Rather they are experts in figuring out who the experts are, and in discerning what the practical importance of expert opinion is for the decisions they have to make as individuals and citizens.  

Ordinary citizens are amazingly good at this.  Their use of this ability, moreover, is not a substitute for rational thought; it is an exercise of rational thought of the most impressive sort.

But in a science communication environment polluted with toxic partisan meanings, the faculties they use to discern what most scientists believe are impaired.

The problem with the suggestion of the authors of the latest "97%" study that the key is to "mak[e] the results of [their] paper widely-known" is that it diverts serious, well-intentioned people from efforts to clear the air of the toxic meanings that impede the processes that usually result in public convergence on the best available (and of course always revisable!) scientific conclusions about how people can protect themselves from serious risks.

Indeed, as I indicated, the particular manner in which the "scientific consensus" trope is used by partisan advocates tends only to deepen the toxic fog of cultural conflict that makes it impossible for ordinary citizens to figure out what the best scientific evidence is. 

Meanwhile, time is “running out.”  On what? Maybe on the opportunity to engage in constructive policies on climate change.

But more immediately, time is running out on the opportunity to formulate a set of genuinely evidence-based strategies for promoting constructive engagement with the IPCC’s 5th Assessment, which will be issued in installments beginning this fall. It will offer an authoritative statement of best current evidence on climate change.

Much of what it has to say, moreover, will consist in important revisions and reformulations of conclusions contained in the 4th Assessment.

That’s inevitable; it is in the nature of science for all conclusions to be provisional, and subject to revision with new evidence.

In the case of climate change, moreover, revised assessments and forecasts can be expected to occur with a high degree of frequency because the science involved consists in iterative modeling of complex, dynamic systems—a strategy for advancing knowledge that (as I’ve discussed before) self-consciously contemplates calibration through a process of prediction & error-correction carried out over time.

My perspective is limited, of course. But from what I see, it is becoming clearer and clearer that those who have dedicated themselves to promoting public engagement with the best available scientific evidence on climate change are not dealing with the admittedly sensitive and challenging task of explaining why it is normal, in this sort of process, to encounter discrepancies between forecasting models and subsequent observations and to adjust the models based on them.  And why such adjustment in the context of climate change is cause for concluding neither that “the science was flawed” nor that “there is in fact nothing for anyone to be concerned about.”

Part of the evidence, to me, that they aren’t preparing to do this is how much time they are wasting instead debating irrelevant things like whether “97%” of scientists believe a particular thing.

p.s. Please don’t waste your & readers’ time by posting comments saying (a) that I am arguing there isn’t scientific consensus on issues of practical significance on climate change (I believe there is); (b) that I think it is “unimportant” for the public to know that (it’s critical that it be able to discern this); or (c) that I am offering up no “alternative” to continuing to rely on a strategy that I say doesn’t work (not true; but if it were-- then what? I should nod approvingly if you propose that we all resort to prayer, too?).  Not only are none of these things either stated or implied in what I’ve written. They are mistakes that I’ve corrected multiple times (e.g., here, here, here . . .).




Dual process reasoning, Liberalism, & disgust

Interesting discussion ongoing in connection with Yoel Inbar's guest post Is Disgust a Uniquely "Conservative" Moral Emotion? I think the contributions made to it so far are more interesting than anything I have to say today, and I am loath to preempt additional contributions to that discussion. So today is an official "more discussion" day.

But just to give a sense of the nature of the matters being discussed, among the interesting questions that came up (in an exchange w/ Inbar initiated by Jon Baron)  is the relationship between the "disgust is conservative" thesis (DIC) and dual-process reasoning theories (DRT) in moral psychology.  Consider two possibilities:

A. The two could be combined. E.g., one could take the view (1) that moral reasoning is reliable & valid only when it is guided either exclusively by conscious reflection or by intuitive sensibilities including emotions the content of which would be validated by reflection; (2) that disgust is unreliable because either unreflective or, on reflection, not valid because on reflection not susceptible to validation by a normatively defensible moral theory; and (3) disgust is characteristically "conservative" either b/c conservatism is associated with a cognitive style hostile to cognitive reflection or b/c disgust involves moral appraisals that on reflection are "conservative"--or, more interestingly, illiberal in the sense of being antagonistic to key premises of Liberalism understood in the political philosophical sense.

B. Alternatively, one could separate DIC from DRT.  The validity of moral reasoning, on this account, doesn't depend on it involving or being validated by reflection. Indeed, one might believe that emotions and other "automatic," "intuitive," "unconscious," "perceptive" etc. forms of cognition play some indispensable role in moral reasoning-- a role that can't be reproduced by conscious reflection, etc. On this view, then, diverse moral styles would be distinguished not by the degree of reflection they involve, necessarily, but by the nature of the appraisals that are embodied in the emotions that those who subscribe to them use to size up goods and states of affairs.  "Disgust" would be "conservative," this account would say, insofar as "disgust" reliably guides appraisals to the ones that fit the "conservative" moral style. But "liberals" would then be understood to be relying on some alternative emotion or set of emotions calibrated to generating "liberal" perceptions and related affective stances toward those same goods and states of affairs.

Baron, as I understood him, was taking issue with Inbar on the assumption that Inbar subscribed to something like position A.  Inbar replied that he was somewhere closer to B -- or at least that he thought "liberals" as well as "conservatives" were relying on emotion to the same extent in their reasoning; he expressed uncertainty as to whether emotion is simply a heuristic substitution for reflection in moral reasoning or a unique and indispensable ingredient of it.

I had tried to identify scholars who clearly are committed to either A or B.  I proposed Martha Nussbaum for B.  For A, I suggested maybe John Jost, although in fact he has not (as far as I know) written about disgust. I suggested that I saw Haidt as sometimes A & sometimes B, although Inbar offered that he viewed Haidt as pretty clearly in camp B.

As it turns out, I happened to read an excellent article yesterday that is pure, unadulterated A:

We review evidence that disgust, in the context of bodily moral violations, differs from other emotions of moral condemnation, particularly anger, in three different senses of the word unreasoning. First, bodily moral disgust is weakly associated with situational appraisals, such as whether a behavior is harmful or justified. Instead, it tends to be based on associations with a category of object or act; certain objects are just disgusting. Second, bodily moral disgust is relatively insensitive to context, both in thoughts and behaviors, and therefore disgust is less likely to change from varying contexts. Third, bodily moral disgust is less likely to be justified with external reasons; instead, persons often use their feelings of disgust as a tautological justification. These unreasoning traits can make disgust a problematic sociomoral emotion for a liberal society because it ignores factors that are important to judgments of fairness, such as intentionality, harm, and justifiability.

Very much worth reading! And further evidence, as Inbar emphasized in his excellent post, that debate in this area remains vibrant and ongoing.

There were other interesting issues under debate too, including regular commentator Larry's surmise that disgust is a kind of feigned strategic posturing on the part of "liberals."

I propose that additional comments -- I hope there will be some! -- be added to the existing trail originating in Yoel's post.


"Integrated & reciprocal": Dual process reasoning and science communication part 2

This is the second in what was to be a two-part series on dual process reasoning and science communication.  Now I’ve decided it must be three!

In the first, I described a conception of dual process reasoning that I don’t find compelling. In this one, I’ll describe another that I find more useful, at least for trying to make sense of and dispel the science communication problem. What I am planning to do in the 3rd is something you’ll find out if you make it to the end of this post.

A brief recap (skip down to the red type below if you have a vivid recollection of part 1):

Dual process theories (DPT) have been around a long time and come in a variety of flavors. All the various conceptions, though, posit a basic distinction between information processing that is largely unconscious, automatic, and more or less instantaneous, on the one hand, and information processing that is conscious, effortful, and deliberate, on the other. The theories differ, essentially, over how these two relate to one another.

In the first post I criticized one conception of DPT, which I designated the “orthodox” view to denote its current prominence in popular commentary and synthetic academic work relating to risk perception and science communication.

The orthodox conception, which reflects the popularity and popularization of Kahneman’s appropriately influential work, sees the “fast,” unconscious, automatic type of processing—which it refers to as “System 1”—as the default mode of processing.

System 1 is tremendously useful, to be sure. Try to work out the optimal path of evasion by resort to a methodical algorithm and you’ll be consumed by the saber-tooth tiger long before you complete your computations (etc).

But System 1 is also prone to error, particularly when used for assessing risks that differ from the ones (like being eaten by saber-tooth tigers) that were omnipresent at the moment of our evolutionary development during which our cognitive faculties assumed their current form.

Our prospects for giving proper effect to information about myriad modern risks--including less vivid and conspicuous but nevertheless highly consequential ones, like climate change; or more dramatic and sensational but actuarially less significant ones like those arising from terrorism or from technologies like nuclear power and genetically modified foods the benefits of which might be insufficiently vivid to get System 1’s attention--depend on our capacity, time, and inclination to resort to the more effortful, deliberate, “slow” kind of reasoning, which the orthodox account labels “System 2.”

This is the DPT conception I don’t like.

I don’t like it because it doesn’t make sense.

The orthodox position’s picture of “reliable” System 2 “monitoring” and “correcting” “error-prone” System 1 commits what I called the “System 2 ex nihilo fallacy”--the idea that System 2 creates itself “out of nothing” in some miraculous act of spontaneous generation.

Nothing makes its way onto the screen of consciousness that wasn’t instants earlier floating happily along in the busy stream of unconscious impressions.  Moreover, what yanked it from that stream and projected it had to be some unconscious mental operation too, else we face a problem of infinite regress: if it was “consciously” extracted from the stream of unconsciousness, something unconscious had to tell consciousness to perform that extraction.

I accept that the sort of conscious reflection on and re-assessment of intuition associated with System 2 truly & usefully occurs.  But those things can happen only if something in System 1 itself—or at least something in the nature of a rapid, automatic, unconscious mental operation—occurs first to get System 2's attention.

So the Orthodox DPT conception is defective. What’s better?

I will call the conception of DPT that I find more compelling “IRM,” which stands for the “integrated, reciprocal model."

The orthodox conception sees “System 1” and “System 2” as discrete and hierarchical.  That is, the two are separate, and System 2 is “higher” in the sense of more reliably connected to sound information processing.

“Discrete and hierarchical” is clearly how Kahneman describes the relationship between the two modes of information processing in his Nobel lecture.

For him, System 1 and 2 are "sequential": System 1 operations automatically happen first; System 2 ones occur next, but only sometimes. So the two are necessarily separate. 

Moreover, what System 2 does when it occurs is check to see if System 1 has gotten it right. If it hasn’t, it “corrects” System 1’s mistake. So System 2 “knows better,” and thus sits atop the hierarchy of reasoning processes within an ordering that ranks their contribution to rational thought.

IRM sees things differently. It says that “rational thought” occurs as a result of System 1 and System 2 working together, each supplying a necessary contribution to reasoning. That’s the integrated part.

Moreover, IRM posits that the ability of each to make its necessary contribution is dependent on the other’s contribution. 

As the “System 2 ex nihilo” fallacy helps us to see, conscious reflection can make its distinctive contribution only if summoned into action by unconscious, automatic System 1 processes, which single out particular unconscious judgments as fit for the sort of interrogation that System 2 is able uniquely to perform.

But System 1 must be selective:  there are far too many unconscious operations going on for all to be monitored, much less forced onto the screen of conscious thought, which would be overwhelmed by such indiscriminate summoning! But in being selective, it has to pick out the "right" impressions for attention & not ignore the ones unreflective reliance on which would defeat an agent's ends.

How does System 1 learn to perform this selection function reliably? From System 2, of course.

The ability to perform the conscious reasoning that consists in making valid inferences from observation, and the experience of doing so regularly, are what calibrate unconscious processes, and train them to select some impressions for the attention of System 2, which is then summoned to attend to them.

When it is summoned, moreover, System 2 does exactly what the orthodox view imagines: it checks and corrects, and on the basis of mental operations that are indeed more likely to get the “right” answer than those associated with System 1.  That event of correction will itself conduce to the calibration and training of System 1.

That’s the reciprocal part of IRM: System 2 acts on the basis of signals from System 1, the capacity of which to signal reliably is trained by System 2.

I do not by any means claim to have invented IRM!  I am synthesizing it from the work of many brilliant decision scientists.

The one who has made the biggest contribution to my view that IRM, and not the Orthodox conception of DPT, is correct is the brilliant social psychologist Howard Margolis.

Margolis presented an IRM account, as I’ve defined it, in his masterful trilogy (see the references below) on the contribution that “pattern recognition” makes to reasoning.

Pattern recognition is a mental operation in which a phenomenon apprehended via some mode of sensory perception is classified on the basis of a rapid, unconscious process that assimilates the phenomenon to a large inventory of “prototypes” (“dog”; “table”; “Hi, Jim!”; “losing chess position”; “holy shit--those are nuclear missile launchers in this aerial U2 reconnaissance photo! Call President Kennedy right away!” etc).

For Margolis, every form of reasoning involves pattern recognition.  Even when we think we are performing conscious, deductive or algorithmic mental operations, we are really just manipulating phenomena in a manner that enables us to see the pattern in the manner requisite to an accurate and reliable form of unconscious prototypical classification. Indeed, Margolis ruthlessly shreds theories that identify critical thinking with conscious, algorithmic or logical assessment by showing that they reflect the incoherence I've described as the "System 2 ex nihilo fallacy."

Nevertheless, how well we perform pattern recognition, for Margolis, will reflect the contribution of conscious, algorithmic types of reasoning.  The use of such reasoning (particularly in collaboration with experienced others, who can vouch through the use of their trained pattern-recognition sensibilities that we are arriving at the “right” result when we reason this way) stocks the inventory of prototypes and calibrates the unconscious mental processes that are used to survey and match them to the phenomena we are trying to understand.

As I have explained in a previous post (one comparing science communication and “legal neutrality communication”), this position is integral to Margolis’s account of conflicts between expert and lay judgments of risk. Experts, through a process that involves the conscious articulation and sharing of reasons, acquire a set of specialized prototypes, and an ability reliably to survey them, suited to their distinctive task. 

The public necessarily uses a different set of prototypes—and sees different things—when it views the same phenomena.  There are bridging forms of pattern recognition that enable nonexperts to recognize who the “experts” are—in which case, the public will assent to the experts’ views (their “pictures,” really).  But sometimes the bridges collapse; and there is discord.

Margolis’s account is largely (and brilliantly) synthetic—an interpretive extrapolation from a wide range of sources in psychology and related disciplines.  I don’t buy it in its entirety, and in particular would take issue with him on certain points about the sources of public conflict on risk perception.

But the IRM structure of his account seems right to me.  It is certainly more coherent—because it avoids the ex nihilo fallacy—than the Orthodox view.  But it is also in better keeping with the evidence. 

That evidence, for me, consists not only in the materials surveyed by Margolis.  It includes, too, work by contemporary decision scientists.

The work of some of those decision scientists—and in particular that of Ellen Peters—will be featured in Part 3.

I will also take up there what is in fact the most important thing, and likely what I should have started with: why any of this matters.

Any “dual process theory” of reasoning will necessarily be a simplification of how reasoning “really” works.

But so will any alternative theory of reasoning or any theory whatsoever that has any prospect of being useful.

Better than simplifications, we should say such theories are, like all theories in science, models of phenomena of interest.

The success of theories as models doesn’t depend on how well they “correspond to reality.”  Indeed, the idea that that is how to assess them reflects a fundamental confusion: the whole point of “modeling” is to make tractable and comprehensible phenomena that otherwise would be too complex and/or too remote from straightforward ways of seeing to be made sense of otherwise.

The criteria for judging the success of competing models of that sort are pragmatic: How good is this model relative to that one in allowing us to explain, predict, and formulate satisfying prescriptions for improving our situation?

In Part 3, then, I will also be clear about the practical criteria that make the IRM conception so much more satisfying than the Orthodox conception of dual process reasoning.

Those criteria, of course, are ones that reflect my interest (and yours; it is inconceivable you have gotten this far otherwise) in advancing the scientific study of science communication--& thus perfecting the Constitution of the Liberal Republic of Science.



Is Disgust a Uniquely "Conservative" Moral Emotion?

As the 14 billion regular readers of this blog know, I went through a period where I was obsessed with disgusting things. Not incest or coprophagia, or any of that mundane stuff but rather things like the "Crickett," the miniaturized but fully functional .22 rifle that is marketed under the logo "My first rifle!," and that is intended to be purchased by parents for preadolescent children (they come in a variety of styles featuring child-attractive motifs, like pink-colored laminated stocks meant to appeal to young girls) in order to introduce them to the wonders of a cultural style in which guns are symbols of shared commitments and also instruments or tools that enable various sorts of role-specific behavior that transmit and propagate commitment to that style.... People who harbor an opposing style say they are disgusted by the Crickett--and I see (feel) where they are coming from.  That place, moreover, is very remote from "conservative" political ideology or a "conservative" moral style, which Jonathan Haidt and others have identified in extremely important and appropriately influential work as uniquely (or at least disproportionately) associated with the use of "disgust" as a moral sensibility. Rather, they seem like the people who subscribe to the "liberal" moral style that, in the work of Haidt and others, makes no or at least less use of disgust as a form of moral appraisal and instead relies on perceptions of harm.
The reaction to the Crickett--that it and the way of life in which it figures are disgusting (a reaction widely expressed in the aftermath of the widely covered tragic accidental shooting of a two-year-old Kentucky girl by her Crickett-toting five-year-old brother)--seemed like evidence to me for a different position, one I associate with Mary Douglas and William Miller, who view disgust as a universal moral sensibility that adherents to diverse cultural systems across place and time make use of to focus their perception of the objects and behavior characteristic of opposing styles, and to motivate their denunciation of them in terms that are strikingly illiberal in the sense of being disconnected from harm, which is imputed to behavior that offends the cultural norms of those experiencing this reaction...

Readers also know that one of my favorite strategies for advancing my own knowledge and that of others is to recklessly offer my own conjectures on matters such as this as a way of luring/provoking those who know more to respond & correct the myriad mistakes they see in my ruminations!  Well, I've succeeded once again!

Below is an amazingly thoughtful & penetrating response from Yoel Inbar. Inbar is a social psychologist whose work on disgust, which is broadly in alignment with the account I attributed to Haidt, is of tremendous quality and importance and central to ongoing scholarly discussion of the role of disgust in informing moral and related sensibilities.  He takes issue with me, of course! I am much smarter as a result of reading and thinking about his essay & offer it to my loyal readers so that they can enjoy the same benefit!

Is Disgust a Uniquely "Conservative" Moral Emotion?

Yoel Inbar

Among politically liberal academics, the emotion of disgust has an unsavory reputation. The philosopher Martha Nussbaum has argued that disgust is wielded by privileged social groups to marginalize and dehumanize those of lower status, and indeed research has found that the disgust-prone are more negative towards immigrants, foreigners, and "social deviants." Furthermore, disgust seems to have a relationship with political conservatism: self-described political conservatives are more easily disgusted, and states where people are on average more disgust sensitive were (all else equal) more likely to go for McCain over Obama in the 2008 U.S. presidential election. A tempting conclusion for liberals might be that disgust is an irrational, immoral, and politically suspect emotion, at least when it is applied to morality.

Yet the view that disgust as a moral emotion is only important to political conservatives has a problem: on its face, it seems obviously wrong. As Dan Kahan pointed out on this blog, political liberals often use the word "disgust" when talking about things they find immoral: liberals say they are disgusted by multi-million-dollar Wall Street bonuses, gun manufacturers who make weapons for 10-year-olds, racism, and lots of other things. Doesn't this mean that liberals are just as likely as conservatives to base their moral judgments on disgust? Perhaps (liberal) researchers are simply more likely to label moral positions that they disagree with as disgust-based (and therefore, by implication, irrational) while giving positions they agree with a free pass.

Although political bias in social psychology is a real problem, this objection misses a crucial difference between liberals and conservatives, namely what they find morally objectionable. There are some behaviors that are at least in theory harmless, but (for lack of a better word) gross. For example, consider a man who, every Saturday, buys a whole chicken at the supermarket, masturbates into it, cooks it, and eats it for dinner (this wonderful and by now famous story was invented by Jon Haidt). Almost everyone finds this disgusting. However, most liberals will concede that despite being disgusting, having sex with a chicken and consuming it is not morally wrong, because no one is harmed (after all, the chicken is already dead). Many conservatives (although by no means all) will say that despite being harmless, this behavior is wrong--because it is disgusting. In fact, conservatives are more likely than liberals to say that many different kinds of disgusting-but-harmless behaviors are morally wrong. Unusual habits regarding food, hygiene, and (especially) sex are often seen by conservatives as immoral regardless of whether they directly harm anyone. And the emotion that people feel when contemplating these kinds of behaviors (which Haidt and his colleagues have called purity violations) is disgust. Certainly Western liberals may also feel disgusted when considering these behaviors, but they are often reluctant to call them immoral unless they can point to a victim--to someone who is directly harmed.

Of course, many people who morally object to (for example) certain kinds of sex between consenting adults claim that their objection is motivated by the putative harm caused by the behavior, not by the observer's queasy feelings. In such a case, how are we to know whether beliefs about harm caused the moral conviction, or whether they are merely post-hoc rationalizations of a (disgust-based) moral intuition? This is a difficult question, but there are several good reasons to think the latter answer is right: 1) When Jon Haidt and his collaborator, Matthew Hersh, asked liberals and conservatives to defend their views about the moral permissibility of anal sex between two men, conservatives but not liberals were likely to defend their beliefs even when they admitted they could not give (harm-based) justifications for them (a phenomenon Haidt has called moral dumbfounding); 2) in the same study, judgments of moral permissibility were statistically predicted by subjects' self-reported emotional reactions to imagining the acts in question, and not by their judgments of their harmfulness; 3) when people are asked directly about how much different considerations are relevant to deciding whether something is right or wrong, conservatives rate "whether someone violated standards of purity and decency" and "whether or not someone did something disgusting" as more morally relevant than do liberals.

What, then, of liberals who say they're disgusted by gun manufacturers or Goldman Sachs? Well, it turns out that "disgust" is a tricky term, at least in English--many laypeople use "disgusted" in a metaphorical sense, to mean "angry." As David Pizarro and I recently argued, with one or two exceptions there's very little evidence that people are physically disgusted by immoral behavior that doesn't involve food, cleanliness, or sex. In fact, recent research by Roberto Gutierrez, Roger Giner-Sorolla, and Milica Vasiljevic suggests that people use the word "disgust" to mean physically disgusted when judging unusual sexual or dietary practices, but use the same word to mean something much closer to "angry" when judging instances of deceit or exploitation.  Of course, this is an area that's actively being researched at the moment, and this may change, but the balance of evidence so far suggests that when people use "disgust" to refer to their reactions to unfairness, exploitation, or violations of someone's rights, they are doing so metaphorically, not literally.

This is not to say that disgust qua disgust plays no role in liberals' moral judgments. For example, consider another story invented by Jon Haidt: Mark and Julie are siblings who are vacationing together in the south of France. One night, they decide that it would be fun and interesting if they tried making love. Julie is on birth control, but just to be safe Mark also uses a condom. They both enjoy the experience, but they decide not to do it again and to keep it a special secret between the two of them. Was this morally wrong? Here, liberals and conservatives seem equally likely to say "yes"--and equally unable to back up those judgments with harm-based justifications. When Jon Haidt and Matthew Hersh asked their undergraduate subjects about the moral permissibility of incest, they found that liberals were just as likely as conservatives to reject it, and just as likely to become morally dumbfounded when attempting to defend their judgments. For both liberals and conservatives, visceral disgust sometimes leads to moral revulsion, but this seems to be more common for conservatives. This is likely to be for two reasons: 1) conservatives are more readily disgusted in general; and 2) conservatives seem to be more comfortable pointing to feelings of disgust as a justification for moral beliefs (for example, conservative bioethicist Leon Kass's well-known argument for the "wisdom of repugnance").

Does this mean that liberals are better moral decision-makers than conservatives? After all, if conservatives base more of their moral judgments on disgust, an unreasoned emotion, and liberals base more of their moral judgments on whether someone was harmed or treated unfairly, doesn't this mean that liberals are more careful, thoughtful, and reasoned in their moral judgments? The answer is unambiguously no. There is no evidence that liberals are any less likely to base their moral judgments on (unreasoned) intuitions than conservatives, although liberals and conservatives do often rely on different moral intuitions. But what moral intuitions underlie the moral judgments of political liberals, and why these intuitions can be just as fallible as those of conservatives, are questions big enough to leave for a separate post.


"System 1" and "System 2" are intuitively appealing but don't make sense on reflection: Dual process reasoning & science communication part 1

“Dual process” theories of cognition (DPT) have been around a long time but have become dominant in accounts of risk perception and science communication only recently, and in a form that reflects the particular conception of DPT popularized by Daniel Kahneman, the Nobel Prize winning behavioral economist.

In this post--the first in a 2-part series-- I want to say something about why I find this conception of DPT unsatisfying.  In the next, I'll identify another that I think is better.

Let me say at the outset, though, that I don't necessarily see my argument as a critique of Kahneman so much as an objection to how his work has been used by scholars who study public risk perceptions and science communication.  Indeed, it's possible Kahneman would agree with what I'm saying, or qualify it in ways that are broadly consistent with it and that I agree improve it.

So what I describe as “Kahneman’s conception,” while grounded in his own exposition of his views, should be seen as how his position is understood and used by scholars diagnosing and offering prescriptions for the pathologies that afflict public risk perceptions in the U.S. and other liberal democratic societies.

This conception of DPT posits a sharp distinction between two forms of information processing: “System 1,” which is “fast, automatic, effortless, associative and often emotionally charged,” and thus “difficult to control or modify”; and “System 2,” which is “slower, serial, effortful, and deliberately controlled,” and thus “relatively flexible and potentially rule-governed.” (Kahneman did not actually invent the “system 1/system 2” terminology; he adapted it from Keith Stanovich and Richard West, psychologists whose masterful synthesis of dual process theories is subject to even more misunderstanding and oversimplification than Kahneman's own.)

While Kahneman is clear that both systems are useful, essential, “adaptive,” etc., System 2 is more reliably connected to sound thinking.  

In Kahneman’s scheme, System 1 and 2 are serial: the assessment of a situation suggested by System 1 always comes first, and is then—time, disposition, and capacity permitting—interrogated more systematically by System 2 and consciously revised if in error.

All manner of “bias,” for Kahneman, can in fact be understood as manifestations of people’s tendency to make uncorrected use of intuition-driven System 1 “heuristics” in circumstances in which the assessments that style of reasoning generates are wrong.

Human rationality is “bounded” (an idea that Kahneman and those who elaborate his framework take from the pioneer decision scientist Herbert Simon) but how perfectly individuals manifest rationality in their decisionmaking, on Kahneman’s account, reflects how adroitly they make use of the “monitoring and corrective functions of System 2” to avoid the “mistakes they commit” as a result of over-reliance on System 1 heuristics.

This account has attained something akin to the status of an orthodoxy in writings on public risk perception and science communication (particularly in synthetic works in the nature of normative and prescriptive “commentaries,” as opposed to original empirical studies).  Popular writers and even many scholars use the framework as a sort of template for explaining myriad public risk perceptions—from those posed by climate change and terrorism to nuclear power and genetically modified foods—that, in these writers’ views, the public is over- or underestimating as a result of its reliance on “rapid, intuitive, and error-prone” System 1 thinking, and that experts are “getting right” by relying on methods (such as cost-benefit analysis) that faithfully embody the “deliberative, calculative, slower, and more likely to be error-free” assessments of System 2.

This is the account I don’t buy.

It has considerable intuitive appeal, I agree.  But when you actually slow down a bit and reflect on it, it just doesn’t make sense.

The very idea that "conscious" thought "monitors" and "corrects" unconscious mental operations is psychologically incoherent.

There is no thought that registers in human consciousness that wasn’t, an instant earlier, residing (in some form, but unlikely one that could usefully be described as a “thought” or at least anything with a concrete, articulable propositional content) in some element of a person’s “unconsciousness.”

Moreover, whatever yanked it out of the stream of unconscious “thought” and projected it onto the screen of consciousness also had to be an unconscious mental operation.  Even if we imagine (cartoonishly) that there was a critical moment in which a person consciously “noticed” a useful unconscious “thought” floating along and “chose” to fish it out, some unconscious cognitive operation had to occur prior to that for the person to “notice” that thought, as opposed to the literally infinite variety of other alternative stimuli, inside the mind and out, that the person could have been focusing his or her conscious attention on instead.

Accordingly, whenever someone successfully makes use of the “slower, serial, effortful, and deliberately controlled” type of information processing associated with System 2 to “correct” the “fast, automatic, effortless, associative and often emotionally charged” type of information processing associated with System 1, she must be doing so in response to some unconscious process that has reliably identified the perception at hand as one in genuine need of conscious attention.

Whatever power “deliberative, calculative, slower” modes of conscious thinking have to "override" the mistakes associated with the application of “rapid, intuitive, and error-prone” intuitions about risk, then, necessarily signifies the reliable use of some other form of unconscious or pre-conscious mental operations that in effect “summon” the faculties associated with effortful System 2 information processing to make the contribution that they are suited to making.

Thus, System 2 can’t reliably “monitor” and “correct” System 1 (Kahneman’s formulation) unless System 1 (in the form of some pre-conscious, intuitive, affective, automatic, habitual, uncontrolled, etc. mental operation) is reliably monitoring itself.

The use of System 1 cognitive processes might be integral to the “boundedness” of human rationality.  But how close anyone can come to perfecting rationality necessarily depends on the quality of those very same processes.

The problem with the orthodox picture of deliberate, reliable, conscious "System 2" checking impetuous, impulsive "System 1" can be called the “System 2 ex nihilo fallacy”: the idea that the form of conscious, deliberate thinking one can use to "monitor" and “correct” automatic, intuitive assessments just spontaneously appears—magically, “out of nothing,” and in particular without the prompting of unconscious mental processes—whenever heuristic reasoning is guiding one off the path of sound reasoning.

The “System 2 ex nihilo fallacy” doesn’t, in my view, mean that dual process reasoning theories are “wrong” or “incoherent” per se.

It means only that the truth that such theories contain can’t be captured by a scheme that posits the sort of discrete, sequential operation of “unconscious” and “conscious” thinking that is associated with the view I’ve been describing—a conception of DPT that is, as I’ve said, pretty much an orthodoxy in popular writing on public risk perception and science communication.

In part 2 of this series, I’ll suggest a different conception of DPT that avoids the “System 2  ex nihilo fallacy.”

It is an account that is in fact strongly rooted in focused study of risk perception and science communication in particular.  And it furnishes a much more reliable guide for the systematic refinement and extension of the study of those phenomena than the particular conception of DPT that I have challenged in this post.

Kahneman, D. Maps of Bounded Rationality: Psychology for Behavioral Economics. Am Econ Rev 93, 1449-1475 (2003).

Simon, H.A. Models of bounded rationality (MIT Press, Cambridge, Mass.; 1982).

Stanovich, K.E. & West, R.F. Individual differences in reasoning: Implications for the rationality debate? Behavioral and Brain Sciences 23, 645-665 (2000).

Sunstein, C.R. Laws of Fear: Beyond the Precautionary Principle. (Cambridge University Press, Cambridge, UK ; New York; 2005).



A measured view of what can be validly measured with M Turk samples

As the nation continues to be convulsed by polarized debate and street demonstrations following last week's publication of Chandler, Mueller & Paolacci, Nonnaïveté among Amazon Mechanical Turk workers: Consequences and solutions for behavioral researchers, Behavior Research Methods (advance on-line 2013), CCP Blog is proud to present ... an EXCLUSIVE guest post from Jesse Chandler, lead author of this important article! Jesse offers his views on the critique I posted on the validity of MT samples for studying the interaction of culture, cognition, and perceptions of risk and other policy-relevant facts.

I wanted to elaborate a bit on some of the issues that Dan raised, amplifying some of his concerns about generalizability, but also mounting something of a defense of Mechanical Turk workers, and their use in research. In brief, I wanted to reinforce the point that he made about understanding the sample that is used and consciously deciding what inferences to make from it. However, I also wanted to push back a bit on his claim that MTurk is demonstrably worse than other methods of collecting data. Along the way, I also have to dispute his characterization of MTurk workers as liars and frauds.

MTurk is not representative, but it is more representative than many samples researchers currently use 

As Dan notes, Mechanical Turk is, in principle, fine as a sample for any research study for which college students are currently deemed “representative enough” (which is a larger proportion of the social sciences than the site’s readers may appreciate).  If anything, MTurk samples are more representative than other convenience samples, and discovering that a finding observed among college students is robust in a different and more heterogeneous population is useful.

Moreover, in the social sciences it should generally be assumed that any observed process generalizes unless there is a reason to think that it does not (nothing insidious here, just Occam’s razor). If a researcher believes that a finding would not replicate on another population, then they should try to replicate it across both samples and compare results. Ideally, they have a reason why they expect the populations to differ that they can articulate, operationalize and use in mediational analysis. In other words, concerns about the validity of findings on MTurk represent an opportunity to advance theory, not a reason to dismiss findings out of hand.

Know thy sample 

Perhaps more importantly, I think Dan is spot on in emphasizing the importance of understanding the sample one is using and the question being asked. “Representative enough” is clearly not suitable for some research questions, and some inferences do not logically follow from non-representative samples. Likewise, for researchers interested in specific populations, MTurk results may vary. Some populations (like conservatives) may be missing or underrepresented in this sample, which is bad for Dan. Other populations, like the unemployed, underemployed, and socially anxious, may be over-represented, which is great for someone else. For researchers with limited budgets who work at homogeneous colleges, some populations, like people from other cultures or who speak other languages, may only be available on MTurk.

Another closely related point Dan alludes to that I also want to reemphasize is that the constituents of a particular MTurk sample cannot be taken for granted. Workers are not randomly selected from the pool of available workers and assigned to studies. They choose what they want to participate in. While there are ways to convert selection bias based on study content into attrition (e.g. by placing the consent form after workers accept the HIT), other procedural factors may influence who completes a HIT. We show, for example, that if a HIT makes it onto Reddit, the sample can end up much younger and disproportionately male. It is likely that sample characteristics may also depend on other variables including the requester’s reputation, the sample size, payment and the minimum reputation of the recruited workers (none of which have been thoroughly studied).

It is important to collect relevant variables from participants directly, rather than only appealing to the demographic characteristics collected by other researchers. Very simple demographic differences can fundamentally change point estimates on survey responses. As Dan notes, MTurk is overwhelmingly pro-Obama. There might be a complicated reason for this, but it may also reflect the fact that  American MTurk workers are more likely to be young, lower income, and female - all of these demographic characteristics predict more support for Obama.

Dan thinks the Internet is full of weirdos and frauds.

Despite agreeing with the spirit of Dan's comments, I have to take issue with his argument that Mechanical Turk workers are more likely to engage in immoral behavior than other samples, and thus MTurk samples are inferior to other kinds of panel data. 

I take particular issue with these claims because the take home implication from them is that data provided by MTurk workers is less credible, not because the workers are a non-representative population, but because the data are more likely to be fabricated than data obtained from other sources. If this were true, this issue of internal validity would be a far more serious threat to the usefulness of findings on MTurk and would call into question all data collected on it. However, there is little evidence to suggest these concerns are true. These are comparative arguments for which comparative data does not exist, and often even the data for MTurk itself is missing or misleading.

Yes, adult content exists on MTurk, but workers must opt in to view HITs that may contain adult content (including flagging it for removal from non-adult sites). Around 80,000 workers have opted to do so. We don’t know how many workers actually view this content, let alone how this proportion compares to the population of Internet users who watch adult content.

Yes, some workers probably intend to engage in fraudulent behavior on MTurk. Again, we don’t know how many workers do this. Dan notes that a large proportion of posted HITs commit fraud, in the sense that they ask workers to “like” social media posts contrary to Amazon’s ToS. Taking this as evidence for worker fraud relies on the assumptions that i.) these HITs are actually completed, ii.) by workers in general and not just a subsample of super productive fraudsters (analogous to our research superturkers), that iii.) there is overlap between the sample that completes spam HITs and research HITs and iv.) that workers even understand that this is a fraudulent activity (Dan read Amazon’s terms of service, but hey, he is a lawyer).

Another variation of the argument that workers are somehow fundamentally strange comes from the question “who would work for $1.50 an hour?” If I had to guess who works for these low wages, I would say that it is the large number of long term unemployed and other people living at the margin in a country muddling through an economic catastrophe. Although MTurk pays little, the money it does pay is at the margin. Moreover, there may be good reasons why workers accept low wages:  MTurk work is flexible enough to be completed in slack time, and accommodate other life commitments (for a discussion see here). Also, we live in a country where people pay to click mice repeatedly. Knowing this, it is not so surprising that people will do the same to earn money. I would not be surprised though if different workers had different reserve wages, and if sample characteristics changed as a function of wages, or in response to external economic conditions.

Workers are people. Don’t be surprised if they act like… people

Problems with worker data quality do not need to be explained by pathologizing workers. Many of the issues that vex researchers could arise from workers acting basically like ordinary people.

Workers will lie or distort the truth if incentivized to do so. Indeed, research shows that MTurk workers lie for money (see here), but a close reading of the paper will show that they may lie less than “real world” participants who participated in similar studies on participant honesty. This may explain why workers misreport US residency: US workers are paid in cash, while those in many other countries are paid in Amazon credit.

Workers, like other people, are forgetful, and workers who “refuse” to opt out of studies they have already completed should surprise nobody. Large proportions of people forget things like spending the night in a hospital or being the victim of a violent crime (see here), all of which are more important to their lives than Study 3 of your dissertation. Researchers who want to avoid duplicate workers (they should) should make life easy for both workers and themselves by preventing duplicates automatically.

It is true that you cannot know what work workers have completed for other researchers, but these concerns can be greatly reduced if researchers took the time to create their own stimuli. I am sometimes surprised at the laziness of researchers. Gabriele Paolacci and I used a simple attention check (“have you ever had a fatal heart attack”) once, three years ago. We mentioned this in a paper and it shows up verbatim all the time on MTurk. The Oppenheimer “Instructional Manipulation Check” is also frequently copied verbatim. Seriously. Stop. It. Now. 

If there is one thing that workers hate, it is negative feedback. This means they will generally bend over backwards to accommodate requesters. They generally understand that researchers do not like people talking about the contents of their HITs and try to avoid this. When they do communicate information, they seem to assume that the details they reveal will not matter, and methodologically problematic slips (e.g. discussing information in one condition but not another) are inadvertent. However, they also hate it when requesters reject their work because they failed an “attention check.” From a worker’s perspective, this probably feels unfair in only the way that an elite private school refusing to give out nickels can. Oh, and this problem is not unique to MTurk, sharing information for mutual benefit happens in college samples too.

Are Panels any Good?

All of these concerns about the quality of data collected on MTurk assume that workers are somehow different from respondents in other sample pools, and that these issues will simply go away if only data were collected somewhere else. This may be true, but how much do we really know about panel respondents and panel data quality? It is unfair to compare observed data in MTurk against a Platonic ideal of a survey panel. If MTurk workers lie to be eligible for studies (like our malingerers), why wouldn’t panel members lie for yet larger incentives? Likewise, if we are going to worry that MTurk samples are not representative because workers look at naked people on the internet, then perhaps we should worry about whether panels built using random digit dialing are representative, given that almost every normal person screens their calls.

Researchers who use other pay panels should be as critical toward these samples as Dan would like us to all be toward Mechanical Turk. Paid sources vary a lot in methodology and it is likely that beyond differences in how they are supposed to be constructed, there are yet larger differences in how the panel design is executed. Research always seems cleaner when you don’t know how the sausage gets made. Dig deep. Get worried. While data quality, representativeness and honesty may be issues that are particularly salient for MTurk samples, we (as in social scientists who are not survey research methodologists) may simply know more about their issues because the sample is relatively transparent and somebody happened to look.

The Take Home Message

In sum, Dan notes issues with Mechanical Turk that I agree are potential problems. However, I think the most important lessons that can be drawn from this discussion are what questions to ask about our hypotheses and our sample, and how to collect data from them, rather than who to collect data from. Further, the solutions to the problems he identifies lie ultimately in better research design, with or without finding a better sample population.




Proof of ideologically motivated reasoning--strong vs. weak

A couple of weeks ago I posted the abstract & link to Nam, Jost & Van Bavel’s “Not for All the Tea in China!” Political Ideology and the Avoidance of Dissonance, and asked readers to comment on whether they thought the article made a good case for the “asymmetry thesis.”

The "asymmetry thesis"—a matter I’ve actually commented on about a billion times on this blog (e.g., here, here, here, here, here . . .)—is the claim that individuals who subscribe to a conservative or “right-wing” political orientation are uniquely or disproportionately vulnerable to closed-minded resistance to evidence that challenges their existing beliefs.

The readers' responses were great.

Well, I thought I’d offer my own view at this point.  

I like the study. It's really interesting.  

Nevertheless, I don't think it supplies much, if any, more evidence for treating the asymmetry thesis as true than one would have had before the study. Consequently, if one didn't find the thesis convincing before (I didn't), then NJV-B doesn't furnish much basis for reconsidering.

One reason the study isn't very strong is that NJV-B relied on a Mechanical Turk sample.  I just posted a two-part set of blog entries explaining why I think MT samples do not support valid inferences relating to cultural cognition and like forms of motivated reasoning.

But even leaving that aside, the NJV-B study, in my view, rests on a weak design, one that defeats confident inferences that any ideological “asymmetries” observed in the study correspond to how citizens engage real-world evidence on climate change, gun control, the death penalty, health care, or other policies that turn on contested empirical claims.

NJV-B purported to examine whether “conservatives” are more averse to “cognitive dissonance” than “liberals” with respect to their respective political positions—a characteristic that would, if true, suggest that the former are less likely to expose themselves to or credit challenging evidence.

They tested this proposition by asking subjects to write “counterattitudinal essays”—ones that conflicted with the positions associated with subjects’ self-reported ideologies—on the relative effectiveness of Democratic and Republican Presidents.  Democrats were requested to write essays comparing Bush favorably to Obama, and Reagan favorably to Clinton; Republicans to write ones comparing Obama favorably to Bush, and Clinton favorably to Reagan.

They found that a greater proportion of Democrats complied with these requests. On that basis, they concluded that Republicans have a lower tolerance for actively engaging evidence that disappoints their political predispositions.

Well, sure, I guess.  If the two groups had demonstrated an equal likelihood to resist writing such essays, then I suppose that would count as evidence of “symmetry,” so by the same token the Republicans’ greater unwillingness to write them is evidence the other way.

The problem is that it’s not clear that the intensity of the threat that the respective tasks posed to Republicans’ and Democrats’ predispositions was genuinely equal.  As a result, it’s not clear whether the “asymmetry” NJV-B observed in the willingness of the subjects to perform the requested tasks connotes a comparable differential in the disposition of Democrats and Republicans to engage open-mindedly with evidence that challenges their views in real-world political conflicts.

By analogy, imagine I hypothesized that Southerners were lazier than Northerners. To test this proposition, I asked Southerners to run 5 miles and Northerners to do 50 sit-ups. Observing that a greater proportion of Northerners agreed to my request, I conclude that indeed Southerners are lazier—more averse to physical and likely all other manner of exertion—than Northerners are.

This is obviously bogus.  One could reasonably suspect that doing 50 sit-ups is less taxing than running 5 miles. If so, then we’d expect agreement from more members of a group of people asked to do the former than from members of a group asked to do the latter—even if the two groups’ members are equally disposed to exert themselves.

Well, is it as “dissonant” for a Democrat to compare Bush favorably to Obama, and Reagan favorably to Clinton, as it is for a Republican to compare Obama favorably to Bush and Clinton favorably to Reagan? 

I think we could come up with lots of stories—but the truth is, who the hell knows? We don’t have any obvious metric by which to compare how threatening or dissonant or “ideologically noncongruent” such tasks are for the respective groups, and hence no clear way to assess the probative significance of differences in the willingness of each to engage in the respective tasks they were requested to perform.

So, sure, we have evidence consistent with “asymmetry” in NJV-B—but since we have no idea what weight or strength to assign it, only someone motivated to credit the “asymmetry” thesis could expect a person who started out unconvinced of it to view this study as supplying much reason to change his or her mind, given all the evidence out there that is contrary to the asymmetry thesis.

The evidence contrary to the asymmetry thesis rests on study designs that don’t have the sort of deficiency that NJV-B displays.  Specifically, the studies I have in mind use designs that measure how individuals of diverse ideologies assess one and the same item of evidence, and show that they are uniformly disposed to credit or discredit it selectively, depending on whether the researcher has induced the study subjects to believe that the piece of evidence in question supports or challenges, affirms or threatens, a position congenial to their respective group commitments.

One example involved the CCP study featured in the paper They Saw a Protest. There, subjects, acting as jurors in a hypothetical trial, were instructed to view a videotape of a political protest and determine whether the demonstrators physically threatened bystanders. Half the subjects were told that the demonstrators were anti-abortion activists protesting outside of an abortion clinic, and half that they were pro-gay/lesbian activists protesting “don’t ask, don’t tell” outside of a military recruitment center.

We found that what “Republicans” and “Democrats” alike reported seeing—protestors “blocking” and “screaming” in the face of “fearful” bystanders or instead noncoercive advocacy inducing shame, embarrassment, and resentment among those seeking to enter the facility—flipped depending on which type of protest they believed they were watching.

Are Republicans and Democrats (actually, we used cultural worldview measures, but also reported the results using partisan self-identification, too) “equally” invested in their respective positions on abortion and gay rights?

I don’t know.  But I don't need to in order to draw inferences from this design.  For however strongly each feels, they both were equally prone to conform their assessment of evidence to the position that was most congenial to their ideologies.

That’s evidence of symmetry in motivated reasoning. And I think it is pretty darn strong.

I’ve addressed this point more generally in previous posts that describe what counts as a “valid” design for an ideologically motivated reasoning experiment. In those posts, I’ve shown how motivated reasoning relates to a Bayesian process of information processing.

Bayesianism describes the logical operations necessary for assimilating new information or evidence with one’s existing views (which themselves reflect an aggregation of all the other evidence at one’s disposal).  Basically, one revises (updates) one’s existing view of the probability of a proposition (or hypothesis) in proportion to how much more consistent the new evidence is with that proposition as opposed to some other, alternative hypothesis—a property of the information known as the “likelihood ratio” (a ratio of how likely one would be to see the evidence if the proposition were true to how likely one would be to see it if the alternative were true).
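To make the arithmetic concrete, here is a minimal sketch in Python of Bayesian updating in its odds form (posterior odds = prior odds x likelihood ratio). The function name and the numbers are illustrative assumptions of mine, not quantities drawn from any of the studies discussed in this post.

```python
def update(prior_prob, likelihood_ratio):
    """Return the posterior probability of a proposition after new evidence.

    likelihood_ratio is P(evidence | proposition true) divided by
    P(evidence | alternative true). Values above 1 favor the proposition.
    """
    prior_odds = prior_prob / (1.0 - prior_prob)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical numbers: someone who starts at 25% and sees evidence that is
# three times more likely if the proposition is true than if it is false.
posterior = update(0.25, 3.0)
print(round(posterior, 2))
```

Note that a likelihood ratio of exactly 1 leaves the prior untouched: evidence that is equally expected under both hypotheses gives no reason to move.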

In Bayesian terms, the reasoning deficiency associated with motivated reasoning consists in the opportunistic adjustment of the likelihood ratio.  When they display ideologically or culturally motivated reasoning, individuals treat the new information or evidence as “more consistent” or “less consistent” with the proposition in question (the film shows the protestor “blocked entry to the building” or instead “made an impassioned verbal appeal”) depending on whether the proposition is one that gratifies or disappoints their motivating ideological or cultural commitments.

When people's reasoning reflects motivated cognition, their ideological commitments shape both their prior beliefs and the likelihood ratio they attach to new evidence.  As a result, they won't update their “prior beliefs” based on “new evidence,” but rather assign to new evidence whatever weight best "fits" their ideologically determined priors.  

Under these conditions, ideologically diverse people won’t converge in their assessments of a disputed fact (like whether the earth is heating up as a result of human CO2 emissions), even when they are basing their assessments on the very same evidence.
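A minimal sketch of that non-convergence, under assumptions of my own choosing: the priors, the likelihood ratio of 2, the ten evidence items, and the rule that a motivated reasoner inverts the likelihood ratio of uncongenial evidence are all hypothetical, picked only to make the logic visible.

```python
def update(prior_prob, likelihood_ratio):
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio."""
    prior_odds = prior_prob / (1.0 - prior_prob)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


def final_belief(prior_prob, true_lr, n_items, congenial, motivated):
    """Belief after n_items pieces of identical evidence.

    Assumption (illustrative only): a motivated reasoner credits the evidence
    at its true likelihood ratio when the proposition is congenial, and
    inverts the ratio when it is not.
    """
    perceived_lr = true_lr if (congenial or not motivated) else 1.0 / true_lr
    belief = prior_prob
    for _ in range(n_items):
        belief = update(belief, perceived_lr)
    return belief


TRUE_LR = 2.0   # hypothetical: each item is twice as likely if the proposition is true
N_ITEMS = 10    # hypothetical number of evidence items

# Two people with opposed priors who see the very same evidence.
honest_gap = abs(
    final_belief(0.8, TRUE_LR, N_ITEMS, congenial=True, motivated=False)
    - final_belief(0.2, TRUE_LR, N_ITEMS, congenial=False, motivated=False)
)
motivated_gap = abs(
    final_belief(0.8, TRUE_LR, N_ITEMS, congenial=True, motivated=True)
    - final_belief(0.2, TRUE_LR, N_ITEMS, congenial=False, motivated=True)
)
print(f"gap with a shared likelihood ratio: {honest_gap:.4f}")
print(f"gap with motivated likelihood ratios: {motivated_gap:.4f}")
```

When both people assign the same likelihood ratio, their beliefs end up close together despite starting far apart. When each fits the likelihood ratio to what he or she already wants to believe, the same evidence pushes them further apart.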

The study in They Saw a Protest involved a design aimed at testing whether individuals do this.  The information that the subjects received--the images displayed in the video--was held constant, while the ideological stake the subjects had in giving that information effect with respect to whether the protestors resorted to physical intimidation was manipulated.

The study found that subjects gave selective effect to the evidence--opportunistically adjusted the likelihood ratio in Bayesian terms--in a manner that gratified their ideologies.  Moreover, they did that whether their outlooks were "liberal" or "conservative."

So again, I believe that’s convincing evidence of “symmetry” in the vulnerability of ideologically diverse citizens to motivated reasoning--evidence that is a lot more probative (has a much higher likelihood ratio, in Bayesian terms!) than what NJV-B observed in their study given the relative strength of the respective study designs.

Nor is our They Saw a Protest study the only one that used this kind of design to look at ideologically motivated reasoning. In a companion follow-up post, I’ll identify a variety of others, some by CCP researchers and some by others, that use the same design and reach the same conclusion.

All the studies I am aware of that use this design for testing motivated reasoning (one, again, that manipulates the ideological motivation that subjects have to credit or discredit evidence, or opportunistically adjust the "likelihood ratio" they assign to one and the same piece of information) reach the conclusion that ideologically motivated reasoning is symmetric.

The only studies that support the asymmetry thesis are ones that use designs that either are not valid or that suffer from a design limitation that defeats reliable comparison of the reasoning styles of subjects of opposing predispositions.

NJV-B is in the latter category. As a result, I give it a likelihood ratio of, oh, 1.001 in support of the asymmetry thesis.




Let's keep discussing M Turk sample validity

Lots of great discussion on “Fooled twice, shame on who?,” part 2 of the 2-part set on validity of M Turk samples for study of individual differences in cognition rooted in ideological, cultural & like dispositions.  Indeed, some of the discussion appears over at Stats Legend Andrew Gelman’s Legendary Statistical Modeling, Causal Inference, and Social Science blog.

The comments make for more interesting reading than anything I would have to say today, and maybe others will want to add to them.

But here are some of the interesting points that have come up & that have furnished me w/ reason to reflect on whether & how what I had to say warrants refinement, qualification, revision etc:

1. Contextualization

I wanted to be clear that the sort of “sample validity” issue I was raising about M Turk was specific to study of a particular class of psychological dynamics—the ones that I myself am most interested in—involving the interaction of critical thinking dispositions and the sort of group commitments that typically are assessed with measures of ideology, cultural worldviews & the like. That was why I broke my discussion into two posts, the first of which stressed that “sample validity” shouldn’t be identified with some checklist of abstract properties like “representativeness” but instead addressed in a fine-grained manner aimed at determining whether subjects selected in a particular fashion support reliable and valid inferences about the psychological dynamics being investigated.

But I’m convinced I didn’t do a good enough job on this. 

Part of what made me realize that was a comment by Hal Pashler in the discussion at Statistical Modeling, Causal Inference. Pashler argued convincingly that researchers had through careful testing confirmed the validity of M Turk samples for a range of fundamental cognitive dynamics (primarily ones involving rapid, automatic processing of visual stimuli).

I fully accept this and agree with the overall thrust of Pashler's comment! But the need for him to make it (in part in response to the course of the discussion at the SMCI blog) was proof to me that I had failed—in part by having neglected to identify dynamics that differ in relevant respects from the one I was focusing on (again, the influence of group values in assessment of evidence on societal risks & related policy-relevant facts) & that as a result might well admit of valid study w/ M Turk samples.

So: avoid generalization; determine “sample validity” by focusing on the particular issues relevant to determining whether reliable, valid inferences can be drawn from any given sample about the psychological dynamic under investigation; and recognize, then, that M Turk samples might be “valid” for some purposes and not others.  Check!

2. Validation of “partisan typicality”

One of the main reasons I don’t regard M Turk samples as valid for studying individual differences in cognition related to ideology is that I think there is reason to believe the self-described “conservatives” who are participating in M Turk samples are not typical of self-described conservatives in the general population.

Solomon Messing convincingly pointed out that the way to address this is to look at studies that address exactly that by comparing how MT subjects respond to questions in relation to how ones included in familiar samples such as those in American National Election Studies surveys—and he cited studies that do exactly that (here & here).

He’s right; I’m eager to read those papers.

Jarret Crawford amplified this point, referring to studies he’s done (here & here; I have read those; they are excellent & reflect ingenious designs; I’ve been meaning to run a blog post on them!) that furnish evidence of the “symmetry” of motivated reasoning in conservatives & liberals, a convergence with non-MT sample studies that ought to give us more confidence in MT samples (provided, of course, the designs of the studies are valid).

I have a hunch that the Messing & Crawford responses demonstrate that even in assessing the validity of M Turk for studying public opinion & political partisanship, one needs to be very precise about the fit between MT samples and the kinds of hypotheses being tested.  But in any case, they show I need to think more.  Good.

3. “Fixing” M Turk

Messing also discusses the possibility that the defects in M Turk samples might be “fixed” with some appropriate types of protocols, a matter that Chandler, Mueller & Paolacci address in their new study.

This is indeed a point that merits further discussion.  As I suggested in some of my own responses, I think what CMP suggest needs to be done actually can’t be feasibly expected to happen. 

In effect, to avoid the “repeat exposure” of MT subjects to cognitive-performance measures, there would have to be a “central registry” that would keep track of all the ID numbers of MT “workers” who have participated in social science studies and the measures that have been administered to them.

Who is going to set up this registry? Who will administer it? How will compliance of researchers with the registry be monitored and enforced?

Don't look at Amazon! It’s not a survey firm & couldn't care less about whether MT workers furnish a valid source of subjects for social science research or, if they do at t1, about making sure they continue to at t2, t3, . . . tn.

Even if we started the registry today, moreover, we still wouldn't know whether the "newly registered" M Turk subjects had already participated in studies featuring CRT and other common measures.

And what do we do now, as we wait for such a registry to be created? Should researchers be continuing to use M Turk for studies featuring measures the validity of which is compromised by prior exposure? And should journals be continuing to accept such studies?

* * * *

So still plenty more to discuss! Add your own thoughts (in the discussion thread following the “Fooled Twice” post)!


Fooled twice, shame on who? Problems with Mechanical Turk study samples, part 2

From Mueller, Chandler, & Paolacci, Soc'y for P&SP, 1/28/12

This is the second post in a two-part series on what I see as the invalidity of studies that use samples of Mechanical Turk workers to test hypotheses about cognition and political conflict over societal risks and other policy-relevant facts.

In the first, I discussed the concept of a “valid sample” generally.  Basically, I argued that it’s a mistake to equate sample “validity” with any uniform standard or any single, invariant set of recruitment or stratification procedures.

Rather, the validity of the sample depends on one thing only: whether it supports valid and reliable inferences about the nature of the psychological processes under investigation.

College student samples are fine, e.g., if the dynamic being studied is reasonably understood to be uniform for all people.

A nonstratified general population sample will be perfectly okay for studying processes that vary among people of different characteristics so long as (1) there are enough individuals from subpopulations whose members differ in the relevant respect and (2) the recruitment procedure didn’t involve methods that might have either discouraged participation by typical members of those groups or unduly encouraged participation by atypical ones.

Indeed, a sample constructed by methods of recruitment and stratification designed to assure “national representativeness” might not be valid (or at least not support valid inferences) if the dynamic being studied varies across subgroups whose members aren’t represented in sufficient number to enable testing of hypotheses relating specifically to them.


Now I will explain why, on the basis of this pragmatic understanding of what sample validity consists in, MT samples aren’t valid for the study of culturally or ideologically grounded forms of “motivated reasoning” and like dynamics that it is reasonable to believe account for polarization over climate change, gun control, nuclear power, and other facts that admit of empirical study.

I don’t want to keep anybody in suspense (or make it necessary for busy people to deal with more background than they think they need or might already know), so I’ll just start by listing what I see as the three decisive “sample validity” problems here. I’ll then supply a bit more background--including a discussion of what Mechanical Turk is all about, and a review of how this service has been used by social scientists--before returning to the three validity issues, which I’ll then spell out in greater detail.

Ready? Here are the three problems:

1.  Selection bias.  Given the types of tasks performed by MT workers, there’s good reason to suspect subjects recruited via MT differ in material ways from the people in the world whose dispositions we are interested in measuring, particularly conservative males.

2.  Prior, repeated exposure to study measures.  Many MT workers have participated multiple times in studies that use performance-based measures of cognition and have discussed among themselves what the answers are. Their scores are thus not valid.

3.  MT subjects misrepresent their nationality.  Some fraction of the MT work force participating in studies that are limited to “U.S. residents only” aren't in fact U.S. residents, thereby defeating inferences about how psychological dynamics distinctive of U.S. citizens of diverse ideologies operate. 

That’s the short answer. Now some more detail.

A. What is MT? To start, let’s briefly review what Mechanical Turk is--and thus who the subjects in studies that use MT samples are.

Operated by Amazon, MT is essentially an on-line labor market.  Employers, who are known as “requesters,” post solicitations for paid work, which can be accepted by “workers,” using their own computers.

Pay is very modest: it is estimated that MT workers make about $1.50/hr.

The tasks they perform are varied: transcription, data entry, research, etc.

But MT is also a well-known instrument for engaging in on-line fraud.

MT workers get paid for writing fake product or service reviews—sometimes positive, sometimes negative, as the requester directs.

They can also garner a tiny wage for simply “clicking” on specified links in order to generate bogus web traffic at the behest of “requesters” who themselves have contracted to direct visitors to legitimate websites, who are in this case the victims of the scam.

These kinds of activities are contrary to the “terms of use” for MT, but that doesn’t restrain either “requesters” from soliciting “workers” or “workers” from agreeing to engage in them.

Another common MT labor assignment—one not contrary to MT rules—is the indexing of sex acts performed in internet pornography.

MT Requester solicitation for porn indexing, July 10, 2013

B. The advent of MT “study samples.” A lot of MT workers take part in social science studies.  Indeed, many workers take part in many, many such studies.

The appeal of using MT workers in one’s study is pretty obvious. They offer a researcher a cheap, abundant supply of eager subjects.  In addition, for studies that examine dynamics that are likely to vary across different subpopulations, the workers offer the prospect of the sort of diversity of characteristics one won’t find, say, in a sample of college students.

A while back researchers from a variety of social science disciplines published studies aimed at “validating” MT samples for research that requires use of diverse subjects drawn from the general population of the U.S. Encouragingly, these studies reported that MT samples appeared reasonably “representative” of the general population and performed in manners comparable to how one would expect members of the general public generally to perform.

On this basis, the floodgates opened, and journals of all types--including elite ones--began to publish studies based on MT samples.

To be honest, I find the rapidity of the decision of these journals to embrace MT samples mystifying.  

Even taking the initial studies purporting to find MT samples “representative” at face value, the fact remains that Amazon is not in the business of supplying valid social science research samples.  It is in the business (in this setting) of brokering on-line labor contracts. To satisfy the booming demand for such services, it is constantly enrolling new “workers.”  As it enlarges its MT workforce, Amazon does nothing—zip—to assure that the characteristics of its “workers” won’t change in ways that make them unsuited for social science research.

In any case, the original papers—which reflect data that are now several years old—certainly can’t be viewed as conferring a “life time” certification of  validity on MT samples.  If journals care about sample validity, they need to insist on up-to-date evidence that MT samples support valid inferences relating to the matters under investigation.

The most recently collected evidence—in particular Chandler, Mueller, Paolacci (in press) [actually, now published!] & Shapiro, Chandler & Mueller (2013)—doesn’t justify that conclusion.  On the contrary, it shows very convincingly that MT samples are invalid, at least for studies of individual differences in cognition and their effect on political conflict in the U.S.

C.  Three major defects of MT samples for the study of culturally/ideologically motivated reasoning

1.  Selection bias

Whatever might have been true in 2010,  it is clear that the MT workforce today is not a picture of America.

MT workers are “diverse,” but are variously over- and under-representative of lots of groups.

Like men: researchers can end up with a sample that is 62% female.

African Americans are also substantially under-represented: 5% rather than the 12% they make up in the general population.

There are other differences too, but the one that is of most concern to me--because the question I’m trying to answer is whether MT samples are valid for study of cultural cognition and like forms of ideologically motivated reasoning--is that MT grossly underrepresents individuals who identify themselves as “conservatives.”

This is clear in the frequencies that researchers relying on MT samples report. In Pennycook et al. (2012),  e.g., 53% of the subjects in their sample self-identified as liberal and 25% identified as conservative.  Stratified national surveys (from the same time as this study) suggest that approximately 20% of the general population self-identifies as liberal and 40% as conservative.

In addition to how they “identify” themselves, MT worker samples don’t behave like ones that consisted of ordinary U.S. conservatives (a point that will take on more significance when I return to their falsification of their nationality).  In a 2012 Election Day survey, Richey & Taylor (2012) report that “73% of these MTurk workers voted for Obama, 15% for Romney, and 12% for ‘Other’ ” (this assumes we can believe they were eligible to vote in the U.S. & did; I’ll get to this).

But the reason to worry about the underrepresentation of conservatives in MT samples is not simply that the samples are ideologically “unrepresentative” of the general population.  If that were the only issue, one could simply oversample conservatives when doing MT studies (as I’ve seen at least some authors do).
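To put numbers on the imbalance, and on the reweighting one would apply if representation really were the only issue, here is a rough sketch. The shares are the ones quoted above (Pennycook et al.'s MT sample vs. stratified national surveys); the function names are hypothetical, and the remaining respondents (moderates and so on) are simply left out.

```python
# Shares quoted in the text; the groups do not sum to 1 because
# everyone who is neither "liberal" nor "conservative" is omitted.
sample_share = {"liberal": 0.53, "conservative": 0.25}
population_share = {"liberal": 0.20, "conservative": 0.40}


def representation_ratio(sample, population):
    """How many times over (or under) each group appears relative to the population."""
    return {group: sample[group] / population[group] for group in sample}


def poststratification_weight(sample, population):
    """Weight that would restore each group's population share in the sample."""
    return {group: population[group] / sample[group] for group in sample}


ratios = representation_ratio(sample_share, population_share)
weights = poststratification_weight(sample_share, population_share)
for group in sample_share:
    print(f"{group}: representation ratio {ratios[group]:.3f}, weight {weights[group]:.3f}")
```

On these figures liberals appear at about 2.65 times their population share and conservatives at about 0.625 times theirs, so each conservative respondent would need a weight of 1.6. Weights like these fix the headcount only; they say nothing about whether the conservatives who did enroll resemble the ones who did not.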

The problem is what the underrepresentation of conservatives implies about the selection of individuals into the MT worker “sample.” There’s  something about being part of the MT workforce, obviously, that is making it less appealing to conservatives.

Maybe conservatives are more affluent and don’t want to work for $1.50/hr.

Or maybe they are more likely to have qualms about writing fake product reviews or watching hours of porn and indexing various sex acts. After all,  Jonathan Haidt & others have found that conservatives have more acute  disgust sensibilities than liberals.

But in any case, since we know that conservatives by and large are reluctant to join the MT workforce, we also can infer there is something different about the conservatives who do sign up from the ones who don’t.

What's different about them, moreover, might well be causing them to respond differently in studies from how ordinary conservatives in the U.S. population would.  Something must be different, considering how many of them claim to have voted for Obama or a third-party candidate in the 2012 election!

If they are less partisan, then, they might not demonstrate as strong a motivated reasoning effect as ordinary conservatives would.

Alternatively, their decision to join the MT workforce might mean they are less reflective than ordinary conservatives and are thus failing to ponder the incongruity between indexing porn, say, and their political values.

For all these reasons, if one is interested in learning about how dispositions to engage in systematic information  processing are affected by ideology, one just can’t be sure that what we see in “MT conservatives” will generalize to the real-world population of conservatives.

I’ve seen one study based on an MT sample that reports a negative correlation between “conservativism” and scores on the Cognitive Reflection Test, the premier measure of the disposition to engage in conscious, effortful assessment of evidence (slow, “System 2” in Kahneman’s terms) as opposed to the rapid, heuristic-driven, error-prone, evidence-neglectful sort (“System 1”).

That was the study based on the particular MT sample I mentioned as grossly overrepresenting liberals and underrepresenting conservatives.

I’ve collected data on CRT and ideology in multiple general population surveys—ones that were designed to and did generate nationally representative panels by using recruitment and stratification methods validated by the accuracy of surveys using them to predict national election results. I consistently find no correlation between ideology and CRT.

In short, the nature of the MT workforce—what it does, how it is assembled, and what it ends up generating—makes me worry that the underrepresentation of conservatives reflects a form of selection bias relative to the sort of individual differences in cognition that I’m trying to measure.

That risk is too big for me to accept in my own research, and even if it weren't, I'd expect it to be too big for many consumers of my work to accept were they made aware of the problem I'm identifying. 

BTW, the only other study I’ve ever seen that reports a negative correlation between conservativism and CRT also had serious selection bias issues.  That study used subjects enticed to participate in an experiment at an internet site that is targeted to members of the public interested in moral psychology. As an incentive to participate in the study, researchers promised to tell the subjects what their study results indicated about their cognitive style. One might think that such a site, and such an incentive, would appeal only to highly reflective people, and indeed the mean CRT scores reported for study participants (liberals, conservatives, and libertarians) rivaled or exceeded the ones attained by students at elite universities and were (for all ideological groups) much higher than those typically attained by members of the general public.   As a colleague put it, purporting to infer how different subgroups will score on the CRT from such a sample is the equivalent of a researcher reporting that “women like football as much as men” based on a sample of visitors to!

2. Pre- & multiple-exposure to cognitive performance measures

Again, Amazon isn’t in the business of furnishing valid study samples.  One of the things that firms that are in that business do is keep track of which studies the subjects they recruit have participated in, so that researchers won’t be testing people repeatedly with measures that don’t generate reliable results in subjects who’ve already been exposed to them.

The Cognitive Reflection Test fits that description.  It involves three questions, each of which seems to have an obvious answer that is in fact wrong; people disposed to search for and reflect on evidence that contradicts their intuitions are more likely to get those answers right.

But even the most unreflective, visceral thinker is likely to figure out the answers eventually, if he or she sees the questions over & over. 

That’s what happens on M Turk.  Subjects are repeatedly recruited to participate in studies on cognition that use the CRT and similar tests of cognitive style.

What’s more, they talk about the answers to such tests with each other.  MT workers have on-line “hangouts” where they share tips and experiences.  One of the things they like to talk about is the answers to the CRT.  Another is why researchers keep administering an “intelligence test” (that’s how they interpret the CRT, not unreasonably) that the workers clearly already know the answers to.

These facts have been documented by Chandler, Mueller, and Paolacci in an article in press [now out--hurry & get yours before news stand sells out!] in Behavior Research Methods.

Not surprisingly, MT workers achieve highly unrealistic scores on the CRT, ones comparable to those recorded among students at elite universities and far above those typically reported for general population samples.

Other standard measures relating to moral reasoning style--like the famous "trolley problem"--also get administered to and answered by the same MT subjects over & over, and discussed by them in chat forums.  I'm guessing that's none too good for the reliability/validity of responses to those measures either.

As Chandler, Mueller & Paolacci note:

There exists a sub-population of extremely productive workers which is disproportionately likely to appear in research studies. As a result, knowledge of some popular experimental designs has saturated the population of those who quickly respond to research HITs; further, workers who read discussion blogs pay attention to requester reputation and follow the HITs of favored requesters, leading individual researchers to collect fans who will undoubtedly become familiar with their specific research topics.

There’s nothing that an individual researcher can effectively do to counteract this problem.  He or she can’t ask Amazon for help: again, it isn’t a survey firm and doesn’t give a shit whether its workforce is fit for participation in social science studies.

The researcher can, of course, ask prospective MT “subjects” to certify that they haven’t seen the CRT questions previously.  But there is a high probability that the workers—who know that their eligibility to participate as a paid study subject requires such certification—will lie.

MT workers have unique id numbers.  Researchers have told me that they have seen plenty of MT workers who say they haven’t taken the CRT before but who in fact have—in those researchers’ own studies.  In such cases, they simply remove the untruthful subject from their dataset.

But these and other researchers have no way to know how many of the workers they’ve never themselves tested before are lying too when they claim to be one of the shrinking number of MT workers who have never been exposed to the CRT. 
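The screening those researchers describe amounts to something like the sketch below. Everything in it is hypothetical: the field names, the worker IDs, and the function itself are made up for illustration and aren't taken from anyone's actual procedure.

```python
def screen_subjects(current_subjects, prior_crt_worker_ids):
    """Split subjects into (kept, dropped) using the researcher's own records.

    current_subjects: list of dicts with hypothetical keys "worker_id" and
        "claims_naive" (True if the worker certified never having seen the CRT).
    prior_crt_worker_ids: set of worker IDs from the researcher's own
        earlier CRT studies.
    """
    kept, dropped = [], []
    for subject in current_subjects:
        seen_before = subject["worker_id"] in prior_crt_worker_ids
        if seen_before or not subject["claims_naive"]:
            dropped.append(subject)
        else:
            kept.append(subject)
    return kept, dropped


# Made-up IDs for illustration only.
prior_ids = {"W001", "W002"}
subjects = [
    {"worker_id": "W001", "claims_naive": True},   # contradicted by own records
    {"worker_id": "W003", "claims_naive": True},   # cannot be verified either way
    {"worker_id": "W004", "claims_naive": False},  # admits prior exposure
]
kept, dropped = screen_subjects(subjects, prior_ids)
print("kept:", [s["worker_id"] for s in kept])
print("dropped:", [s["worker_id"] for s in dropped])
```

The worker who survives the screen survives only because this particular researcher has no record of him or her. Whether that worker met the CRT in someone else's study is exactly what the check can't reveal.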

So researchers who collect data on performance-based cognition measures from MT workers really have no way to be sure  that these very high-scoring subjects are genuinely super reflective or just super dishonest.

I sure wouldn’t take a risk like this in my own research.  And I’m also not inclined to take the risk of being misled by relying on studies by researchers who have disregarded it in reporting how scores on CRT or other cognitive performance measures relate to ideology (or religion or any other individual difference of interest).

3. Misrepresentation of nationality (I know who these guys are; but who are MT workers? I mean—really?)

Last but by no means least: Studies based on MT samples don’t support valid inferences about the interaction of ideology and cognition in polarizing U.S. policy debates because it’s clear that some fraction of the MT subjects who claim to be from the U.S. when they contract to participate in a study aren’t really from the United States.

This is a finding from Shapiro, Chandler and Mueller (2013), who in a survey determined that a “substantial” proportion of the MT workers who are “hired” for studies with “US only” eligibility are in fact participating in them via foreign internet-service providers.

I also know of cases in which researchers have detected MT subjects using Indian IP addresses participating in their "US only" studies. 

Amazon requires MT workers to register their nationality when joining the MT labor force. But because MT workers recognize that some “requesters” attach “US worker only” eligibility criteria to their labor requests, MT workers from other countries—primarily India, the second largest source of MT labor outside the U.S.—have an incentive to misrepresent their nationality. 

I'm not sure how easy this is to pull off since Amazon now requires US citizens to supply Social Security numbers and non-US citizens who reside in the US to supply comparable information relevant to tax collection.

But it clearly isn't impossible for determined, internet-savvy and less-than-honest people to do. 

Part of pulling off the impersonation of a US resident involves signing up for MT through an account at a firm that uses a VPN to issue US IP addresses to internet users outside the U.S.  Indeed, aspiring non-US MT workers have an even bigger incentive to do that now because Amazon, in response to fraudulent use of its services, no longer enrolls new non-US workers into the MT labor force.

Shapiro, Chandler & Mueller recommend checking the IP addresses of subjects in “US only” studies and removing from the sample those whose IP addresses showed they participated from India or another country.

But this is not a very satisfying suggestion.  Just as MT workers can use a VPN to misrepresent themselves as U.S.-residents when they initially enroll in MT, so they can use a VPN to disguise the location from which they are participating in U.S.-only studies. 

Why wouldn’t they? If they didn’t lie, they might not be eligible to “work” as study subjects--or work period if they signed up after Amazon stopped enrolling non-US workers.

True, lying is dishonest.  But so are a great many of the things that MT workers routinely do for paying MT requesters.

Charmingly, Shapiro, Chandler and Mueller (2013) also found that MT subjects, who are notorious for performing MT tasks at the office when they are supposed to be working, score high on a standard measure of the disposition to engage in “malingering.”

That’s a finding I have complete confidence in. Remember, samples that are not “valid” for studying certain types of dynamics can still be perfectly valid for studying others.

* * * *

The name for Amazon’s “Mechanical Turk” service comes from a historical episode in the late 18th century in which a con artist duped amazed members of the public into paying him a small fee for the chance to play chess against “the Turk”—a large, turban-wearing, pipe-smoking manikin who appeared to be spontaneously moving his own pieces with his mechanized arm and hand.

The profitable ruse went on for decades, until finally, in the 1820s, it was discovered that the “Turk” was being operated by a human chess player hidden underneath its boxy chassis.

Today social scientists are lining up to pay a small fee--precisely because it is so much smaller than what it costs to recruit a valid general population sample--to collect data on Amazon’s “Mechanical Turk.”

But if the prying open of the box reveals that the subjects performing the truly astonishing feats of cognition being observed in these researchers’ studies are “malingering” college students in Mumbai posing as  “U.S. Democrats” and “Republicans” in between jobs writing bogus product reviews and cataloging sex acts in on-line porn clips, I suspect these researchers will feel more foolish than anyone who paid to play chess with the original “Turk.”

Some references

Berinsky, A. J., Huber, G. A., & Lenz, G. S. (2011). Using Mechanical Turk as a subject recruitment tool for experimental research. Political Analysis, 20(3), 351-368. 

Chandler, J., Mueller, P., & Paolacci, G. (in press). Methodological concerns and advanced uses of crowdsourcing in psychological research. Behavior Research Methods.

Experimental Turk: a blog on social science experiments on Amazon Mechanical Turk

Mueller, Chandler & Paolacci, Advanced uses of Mechanical Turk in psychological research, presentation at Society for Personality & Social Psychology, Jan. 28, 2012.

Pennycook, G., Cheyne, J. A., Seli, P., Koehler, D. J., & Fugelsang, J. A. (2012). Analytic cognitive style predicts religious and paranormal belief. Cognition, 123(3), 335-346. doi: 10.1016/j.cognition.2012.03.003

Richey, S., & Taylor, B. (2012). How Representative Are Amazon Mechanical Turk Workers? The Monkey Cage.

Shapiro, D. N., Chandler, J., & Mueller, P. A. (2013). Using Mechanical Turk to Study Clinical Populations. Clinical Psychological Science. doi: 10.1177/2167702612469015



What's a "valid" sample? Problems with Mechanical Turk study samples, part 1

It’s commonplace nowadays to see published psychology studies based on samples consisting of “workers” hired to participate in them via Amazon’s “Mechanical Turk,” a proprietary system that enables Amazon to collect a fee for brokering on-line employment relationships.

I’ve been trying to figure out for a while now what I think about this practice.

After considerable reading and thinking, I’ve concluded that “MT” samples are in fact a horribly defective basis for the study of the dynamics I myself am primarily interested in—namely, ones relating to how differences in group commitments interact with the cognitive processes that generate cultural or political polarization over societal risks and other facts that admit of scientific study.

I’m going to explain why, and in two posts.  To lay the groundwork for my assessment of the flaws in MT samples, this post will set out a very basic account of how to think about the “validity” of psychology samples generally.

Sometimes people hold forth on this as if sample validity were some disembodied essence that could be identified and assessed independently of the purpose of conducting a study. They say things like, “That study isn’t any good—it’s based on college students!” or make complex mathematics-pervaded arguments about “probability based stratification” of general population samples and so forth.

The reason to make empirical observations is to generate evidence that gives us more reason or less than we otherwise would have had to believe some proposition or set of propositions (the ones featured in the study hypotheses) about how the world works.

The validity of a study sample, then, depends entirely on whether it can support inferences of that sort. 

Imagine someone is studying some mental operation that he or she has reason to think is common to all people everywhere—say, “perceptual continuity,” which involves the sort of virtual, expectation-based processing of sensory stimuli that makes people shockingly oblivious to what seem like shockingly obvious but unexpected phenomena, like the sudden appearance of a gorilla among a group of basketball players or the sudden substitution of one person for another during a conversation between strangers.

Again, on the researcher's best understanding of the mechanisms involved, everyone everywhere is subject to this sort of effect, which reflects processes that are in effect “hard wired” and invariant.  If that’s so, then pretty much any group of people—so long as they haven’t suffered some sort of trauma that might change the operation of the relevant mental processes—will do.

So if a researcher wants to test whether a particular intervention--like telling people about this phenomenon--will help to counteract it, he or she can go ahead and test it on any group of normal people that researcher happens to have ready access to, like college undergraduates.

But now imagine that one is studying a phenomenon that one has good reason to believe will generate systematic differences among individuals identified with reference to certain specific characteristics. 

That’s true of “cultural cognition” and like forms of motivated reasoning that figure in the tendency of people to fit their assessments of information—from scientific “data” to expository arguments to the positions of putative experts to (again!) their own sense impressions—to positions on risk and like facts that dominate among members of their group.

Because the phenomenon involves individual differences, a sample that doesn’t contain the sorts of individuals who differ in the relevant respects won’t support reliable inferences.

E.g., there’s a decent amount of evidence that white males with hierarchic and individualistic values (or with “conservative” political orientations; cultural values and measures of political ideology or party affiliation are merely alternative indicators of the same latent disposition, although I happen to think cultural worldviews tend to work better) are motivated to be highly skeptical of environmental and technological risks. Such risk claims, this work suggests, are psychically threatening to such individuals, because their status and authority in society tends to be bound up with commercial and industrial activities that are being identified as dangerous, and worthy of regulation.

If one wants to investigate how a particular way of “framing” information might dissipate dismissiveness and promote open-minded engagement with evidence on climate change, then it makes no sense to test such a hypothesis on, say, predominantly female undergraduates attending a liberal east-coast university.  How they respond to the messages in question won’t generate reliable inferences about how white, hierarchical individualistic males will—and they are the group in the world that we have reason to believe is reacting in the most dismissive way to scientific evidence on climate change.

Obviously, this account of “sample validity” depends on one being right when one thinks one has “good reason to know” that the dynamics of interest are uniform across people or vary in specific ways across subpopulations of them.

But there’s no getting around that! If one uses a “representative” general population sample to study a phenomenon that in fact varies systematically across subpopulations, then the inferences one draws will also be faulty, unless one both tests for such individual differences and assures that the sample contains a sufficiently large number of the subpopulation members to enable detection of such effects. Indeed, to assure that there are enough members of the subpopulations--particularly if one of them is small, like, say, a racial minority--is to oversample, generating a nonrepresentative sample!
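The arithmetic behind the oversampling point can be sketched in a few lines of Python. This is only an illustration: the 12.5% subpopulation share and the target of 200 members per group are made-up numbers, not figures from any study, and the function names are hypothetical.

```python
import math

def representative_n_needed(pop_share, min_group_n):
    """Total size a proportionally representative sample must reach so that
    a subgroup with the given population share is expected to contribute
    at least min_group_n members."""
    return math.ceil(min_group_n / pop_share)

def poststrat_weights(pop_shares, sample_counts):
    """Weight for each group = population share / sample share.
    Weights like these are one common way to use an oversampled
    (nonrepresentative) design for population-level estimates."""
    total = sum(sample_counts.values())
    return {g: pop_shares[g] / (sample_counts[g] / total) for g in pop_shares}

# Hypothetical numbers: a subgroup that makes up 12.5% of the population,
# and a target of 200 members per group for subgroup comparisons.
pop_shares = {"minority": 0.125, "majority": 0.875}
n_representative = representative_n_needed(pop_shares["minority"], 200)

# Oversampled alternative: recruit 200 members of each group directly.
oversample = {"minority": 200, "majority": 200}
weights = poststrat_weights(pop_shares, oversample)

print(n_representative)  # total N a representative sample would need
print(weights)           # down-weights the oversampled group
```

Under these assumed numbers, a representative design needs a far larger total sample than the oversampled one to deliver the same number of subgroup members, which is why oversampling is often the practical choice when the study's inferences are about subgroup differences.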

The point is that the validity of a sample depends on its suitability for the inferences to be drawn about the dynamics in question.  That feature of a sample can’t be determined in the abstract, according to any set of mechanical criteria.  Rather it has to be assessed in a case-specific way, with the exercise of judgment. 

And like anything of that sort—or just anything that one investigates empirically—the conclusions one reaches will need to be treated as provisional insofar as later on someone might come along and show that the dynamics in question involved some feature that evaded detection with one’s sample, and thus undermines the inferences one drew.  Hey, that's just the way science works!

Maybe on this account Mechanical Turk samples are “valid” for studying some things.   

But I’m convinced they are not valid for the study of how cultural or ideological commitments influence motivated cognition: because of several problematic features of such samples, one cannot reliably infer from studies based on them how this dynamic will operate in the real world.

I’ll identify those problematic features of MT samples in part two of this series.



No one is afraid but we can still learn a lot from studying nanotechnology risk perceptions

No one is afraid of nanotechnology.

And at this point, no one any longer seems to be afraid that people will become afraid of nanotechnology.

There used to be a lot of anxiety about that.  Federal research agencies, foundations, and private industry all supported studies aimed at predicting how the public would react to nanotechnology—and how unreasoning fear, or self-reinforcing dynamics of political polarization, might be averted through adroit communication strategies. The worry was that nanotechnology might follow the path of nuclear power or genetically modified foods in Europe.

The concern seemed reasonable.  Indeed, CCP did a study, which found that when individuals were exposed to balanced information on nanotechnology’s potential risks and benefits, they polarized along lines that reflected their cultural predispositions toward other environmental and technological risks, such as climate change and nuclear power.

But nanotechnology has been around for many years now, and nothing of any consequence has happened. The public remains largely oblivious—neither concerned in general nor polarized.

To illustrate, consider some data collected from a large, nationally representative sample in April.  Nanotechnology is a risk-perception blip compared to climate change and nuclear power.

The public not only has failed to become anxious about nanotechnology. It still hasn't really noticed that nanotechnology exists.  Surveys continue to show that very few people say they have heard about or know much of anything about it.

Maybe things will still “heat up.”  But I’d be surprised at this point. Very surprised.

That doesn’t mean I think it was a waste for researchers to have studied public reactions to nanotechnology. 

On the contrary, I think the self-conscious effort to try to forecast its possible risk-perception trajectories—for the purpose, if possible, of guiding it away from influences inimical to reasoned and constructive engagement of the best available scientific evidence—was a model one, well worth emulating for emerging technologies in general.

The number of putative risk sources that are amenable to cultural polarization will always exceed by a large margin the number that actually do generate polarization. But because the public welfare costs of such conflict are so high, it makes sense to try to learn what influences cause emerging technologies to become suffused with this pathology, and what sorts of steps can be taken to steer them clear of it.

Still, it does seem to me that at this point the question is not so much “whether” nanotechnology will become suffused in controversy but why exactly it didn’t.  Researchers who are continuing to focus on nanotechnology should try to figure that out—by, say, taking a close look at the career of nanotechnology in public discourse and comparing it with various other technologies, both ones that have generated high degrees of concern and controversy and ones that haven’t.

The only valid way to learn about causation is to examine carefully both the occurrence and non-occurrence of events of interest.

 Now, here’s a conjecture to get things started.

Although most people still haven’t heard of nanotechnology, my own casual observations suggest that if one asks people to “complete the phrase” that begins “nano. . .,” the most likely response (after “huh?”) is “iPod.”

I suspect Apple has immunized nanotechnology from controversy by infusing it with the positive connotation evoked, more or less universally, by its more or less universally beloved entertainment technology.

Can this outcome—which surely was a matter of sheer happenstance—be consciously directed in the future?

Today there is widespread concern that synthetic biology will be the next “nuclear power” or “GMO” in risk-perception terms.

To forestall this, I propose—the Synbio iPad!

Using the engineering techniques associated with synthetic biology, researchers have in fact created a form of E. coli capable of solving complicated math problems.

So, just fuse some E. coli with the processor of the iPad 4 or 5 or 9 or whatever we are up to--and voilà: as the tide of public infatuation rises, good will will spill over onto synthetic bio (“synbio . . . iPad!”), immunizing it from the mindless contentiousness that has infected climate change, GM foods (in Europe), and nuclear power (everywhere but in France) etc.!

(Actually, E. coli are the rock stars of synbio--being taught how to do all sorts of astonishing things, including how to fill the air with pleasant fragrances; a very nice turnaround for an organism that has borne a wretched stigma since the beginning of human understanding of microbial life.)

As I said, merely a conjecture.

But inspired insight of this sort is among the predictable public-welfare returns on investments in scientific  risk-perception forecasting.


Does communicating research on public polarization polarize the public?

One of the things that makes this blog so astonishingly popular (we recently broke through the 14 billion unique daily readers ceiling!) is its relentless topicality.

Well, just yesterday, world famous  world class USA Today science journalist Dan Vergano published an amazingly informative story on research into the psychology of public conflict over climate change—and today we present a guest post from the same Dan Vergano on what it’s like to write about the psychology of public conflict over climate change!

DV addresses the challenges of communicating information on polarization to a polarized general public.  Is effective communication of scientific research on this topic constrained by the same dynamics that account for polarization? Does trying to explain the phenomenon of cultural polarization itself polarize citizens?

I’m sure there will be consensus among this site's 14+ billion regular readers that these are fascinating and difficult questions, and that DV’s insights are penetrating.

I've added my own reflections on what the experience of communicating work like mine to the public is like. I anticipate the usual dissensus among site commentators on the coherence & value of those -- indeed, I'd be disappointed by anything other than that!


Dan Vergano: Pole-Vaulting a polarized public?

How do you solve a problem like Dan Kahan and his polarization puzzle? I confess it worries me. How, for example, do I write about his finding that conservative-minded men view risks in a way poles apart from other people without  feeding into that very same polarization? And more important, how do I write about it in a way that doesn’t prevent me from doing my job?

I write news for a living. Sadly a rare thing now, I write news stories for the general reader, the average Joe, the man-or-woman on the street, the likely not-you if you are reading this post.

Continue reading Vergano


Dan Kahan: Enabling consensus on the sources & consequences of cultural dissensus

Dan Vergano initially asked me if I had any recommendations about the challenges of communicating the science of science communication -- & what it says about the sources of polarization --  to the polarized public, and in particular how to do this without triggering the sorts of dynamics that polarize culturally diverse citizens.   

I thought initially I’d just draw on my own experience in this regard—and realized that would be utterly unhelpful because the sort of “public” I communicate with is significantly different from the one he writes for. Indeed, I realized that what DV is up to is quite amazing and that I really wanted him to tell me & others how he pulls it off.

Continue reading Kahan


Decisive strike in the "asymmetry" debate?!

I've been underwater & unable to post with my normal frequency (indeed, I'm underwater b/c of posting with my normal frequency, and thus falling behind on other things!)

But here is something to consider: a new paper from Nam, Jost & van Bavel on whether "conservatives" are more prone to "cognitive dissonance avoidance" than "liberals."

But the question: does the result bear on the "asymmetry thesis" (AT)?

AT asserts that conservatives should be more disposed to ideologically motivated reasoning than liberals.

The basis for this hypothesis is the finding of Jost and other scholars who correlate ideology with self-report measures of critical thinking -- Need for Cognition, Need for Closure,  Dogmatic thinking, and other scales assessing attitudes toward complexity & uncertainty etc. -- that "conservatives" display a more closed-minded cognitive style.

I've posted 913 entries on the asymmetry thesis (here, here, and here, for example) & also done my own study that tries to test it.

But maybe it's game over? This paper is the decisive strike?  

"Cognitive dissonance avoidance" is very much related to motivated reasoning (itself a tendency to adjust one's assessments of facts to avoid disappointing one's predispositions). And here NJV-B report data that they see as demonstrating asymmetry -- conservatives are more disposed to "cognitive dissonance avoidance," they say, than liberals.

Chris Mooney, who has done an admirable job in synthesizing the relevant literature and making it publicly accessible in his book "The Republican Brain" sees this as compelling proof in favor of AT.

Obviously, I have views.  But not time to express them right now.  And besides, my views are not usually nearly so interesting as the ones that emerge in the discussion that they are the occasion for.

So let's do an experiment: can we have an interesting discussion w/o my saying anything (other than "hey-- what about this?")?

So what do others think of this study? Game over? 

It would be a relief to have the debate on AT resolved, I suppose, since researchers could then turn all their attention to more important questions, like what the American public thinks of the NSA's policy on collecting metadata!



UK Environment Secretary Owen Paterson wants a "constructive, well informed and evidence-led" public discussion on GM foods. Any advice?

I received a thoughtful message from Jonas Kathage, who related interesting news about public discussion of GM foods in the UK and asked for my reaction.  Since “he asked for it,” I let him have it, blasting him with a massive barrage of verbiage.  Putting content aside, I do think the length of my response is an accurate measure of the importance and difficulty of the issues his query raises. I reproduce the exchange below & of course invite others to offer their advice (whether or not they have the time & patience to work through my own) on how Secretary Paterson can achieve his objective of initiating a “constructive, well informed and evidence-led” public discussion of the risks and benefits of GM food technology.


UK environment secretary Owen Paterson recently delivered a speech on genetically modified crops, calling for relaxing the restrictions on their cultivation we have in Europe. The speech is broadly supportive and points to various benefits of gm crops for people and environment. The full speech is here.

Now aside from the predictable outrage among anti-gmo groups, I was struck by a piece in the Guardian that seemed happy about efforts to restart a public discussion about gm crops, but at the same time argued Paterson wasted the opportunity by following the outdated deficit model and suggesting people are stupid. Here's the Guardian piece.

I'm turning to you because you are an expert on the science of science communication. While I remember you don't consider the communication environment to be polluted in the US, I feel it's a bit different in Europe. I'm wondering whether you agree with the Guardian commentator that Paterson's speech represents a wasted opportunity. Since I couldn't get a good idea from the piece or my follow-up on twitter about how the speech could be improved, what are your thoughts?

My response:

This is a very interesting development and a nice example of the challenges that are involved in promoting engaged and constructive public interaction with decision-relevant science. Thanks for pointing out the story and the Guardian piece, and also for your thoughtful framing of the issues -- without that I'm confident I wouldn't have been able to appreciate the value that reflecting on Paterson's speech presents.

I agree, the issue is about how to address a “polluted science communication environment.”  I’m sure ours—in the US—is as bad as yours.

Or really what I’m sure is that we both have problems that can be characterized this way, ones in which the ordinary, and ordinarily reliable, rational faculties that ordinary people use to identify the best available decision-relevant science become enfeebled by “toxic meanings,” which turn positions on facts into badges of group commitment and loyalty. You have a problem like that on GM foods; I don’t think we have it on that particular issue, yet. But obviously we have it. On lots of issues.

That said, I feel obliged to risk disappointing you by adopting a decidedly uncertain stance.  For two reasons, one general and one specific.  I hope I won’t wear out your patience in making you invest the time it will take to work through what I feel impelled to say in order to get what I agree will likely be a modest return (more in the nature of an investment strategy than a return in fact).

1. To start, I think affecting a posture of certainty and confidence on "how to communicate" on issues that feature or are vulnerable to cultural polarization will quite often be a mistake.  

We do know a good amount, as a result of careful empirical study, about the dynamics that generate toxic meanings; about steps that can help neutralize them; and about strategies that can help detoxify the science communication environment if those steps fail, or weren't taken to begin with.  

But the sort of knowledge we have tends to be very general.  It concerns the mechanisms of consequence and how they interact with various influences.

Having that knowledge is extremely valuable, because there are many genuine mechanisms of social psychology and the like that could be playing a role, and without knowledge about which really are and how, then the likelihood that anyone will ever figure out what to do or not do (or even know whether what they did contributed to making things better or worse) will be essentially nil.  We will drown in an ocean of just-so stories.

But most of that knowledge was gleaned in studies carefully designed to bear in on the mechanisms of interest and exclude everything else.  That’s essential for one to be able to manipulate the mechanisms in revealing ways and to observe with confidence how they are responding (I’m sure you likely know all of this, so forgive me for the wind up).

But those kinds of pristine settings are--by design--simplified models. The settings in which one has to act will be much more complicated.  One knows from the studies--from the models--what sorts of mechanisms it makes sense to try to engage in those settings.  But one doesn’t know precisely how.

There’s only one way to figure that out: through use of the same methods that one used to identify the mechanisms of consequence in the first place! One has to engage in empirical field studies aimed at testing hypotheses about what sorts of “communication strategies” (very broadly understood; the strategies necessary to avert or treat a polluted science communication environment will often involve things other than just uttering words) can reproduce in the world the effects one attained in the “lab.”

Indeed, if one doesn’t do that, we will simply find ourselves again drowning in stories. For just as there are more plausible accounts of why we see cultural polarization than are actually true, there are more plausible accounts about how to use the genuine insights of the science of science communication to treat that pathology.  The currency of storytelling is just as valueless, and will buy us just as little real progress, at the “how to” stage as it did at the “what’s the problem” one.

I feel very strongly about this. So at the cost of an opportunity maybe to enjoy flattering attention, when you or some other thoughtful person in the middle of a communication problem (as I gather you are) asks me, “So what do we do, given what you’ve told us you know about the psychology of cultural polarization and science communication,” I feel constrained to say w/o equivocation, “I don’t know.”

But to recover – what? maybe some semblance of dignity! but more importantly the opportunity to be of use – I then add:

You tell me! And I will help you at that point by helping you to collect the evidence that will help you to figure out if you are right.

You are the one in the middle of this real world situation. You know lots of specific, relevant things about it—much more than I (or anyone else who studies the dynamics of communication for a living) does.  I’ve told you things that are important for you to know and that can help you make informed decisions about which of the things you were thinking about doing (likely one of them is the right thing; but which one?) is likely to work.  So the likelihood that you’ll know what to do is higher than the likelihood I will if I just make a wild-ass guess.

So tell me--now that you have the benefit of knowing what I do--what you think it’s possible for you to do that might produce the effects I’ve been describing to you.  Indeed, tell me four or five such things, and we’ll talk them through.

Then I will again do what I am equipped to do.  I will help you set up your communication operation in a way that is suited to generating evidence that will help you assess whether the things it occurred to you might work really are working. And just as important, help you recognize why they have and haven’t—so that you can refine and adjust and extend.

Then I’ll add that, while I’m genuinely willing to help in this way, the only condition I myself would impose on assisting is that this person agree to share what we learn from this exercise with anyone else interested in helping to promote .  Because the situation you are in, you agree, is both maddeningly familiar and bad.  If enough of the others who had been in this kind of situation had done what I’m proposing and shared the results--each building on what the others had learned--then maybe you, and me, and millions of others wouldn’t still be in this situation, or in it so often, and on so many issues. . ..

2.  So—that was the general part!

On the specific.

I am going to be modest, and in an even more “local” way here.  I don’t know enough of the background to have an opinion on whether the Guardian columnist is right about Paterson.

I will say, though, that I do have an opinion--and a very strong one!--that the columnist is right to be thinking along the lines reflected in his essay.

For sure there is “more to it” than just getting “the information” out.  In addition, the “more” includes things in the nature of the ones that the columnist emphasizes.  For sure, those engaged in communicating need to address those on both sides in a manner that avoids conveying any sense that those on the other side are “stupid” or “anti-science”; absurd! Absurd in the U.K., absurd in the U.S., absurd in every nation that has had the benefit of passing over the threshold of social development that marks a society’s entry into the privileged domain of liberal market democracy.  That sort of reckless, obnoxious talk is a form of science communication pollution--or in any case, should be reserved for the serious occasions when we are looking at the real thing.

Those engaged in the “debate” also have to show that they recognize why other reasoning citizens feel differently from them.

They have to demonstrate too--by seizing every opportunity that presents itself--that they are themselves not “precommitted,” and are thus willing to take seriously claims and proposals one might have expected them to resist. Also that they are unwilling to tolerate any of their own number engaging in distortions of fact.

The columnist is thinking, and demonstrating how to think clearly, about those issues.

But I don’t know if he has cause to see Paterson’s proposal for a “debate” as insensitive to these sorts of concerns.  I just don’t know enough to know.

Also—and I’d be shocked if the columnist disagreed—while it’s foolish to carry on as if “the facts,” the “evidence,” the “science” were all there were to it (a form of presentation that evinces contempt for those with whom one disagrees; one is implying, necessarily, that they are “idiots” or “liars”), it certainly is part of what there is to it!

Indeed, it is the most important part.  We are--or at least I am--motivated by the goal of assuring that the best available decision-relevant science is actually made use of by all those whose welfare it can enhance.  The problem of a polluted science communication environment is that it makes it so much harder for people to recognize what the best available decision-relevant science is and what its significance is for them.

What its significance is is for them to decide, as free, reasoning individuals and citizens.

But what free and reasoning person is confident he or she can reliably see what the best evidence is or what it implies given his or her values through the toxic fog of cultural recrimination that pervades issues like climate change, nuclear power, and--in the UK, I gather--GM foods?

So I think Paterson is surely right to want the UK to engage the best available information. And to want to be sure that citizens can recognize what the best available evidence is.

That’s not a goal anyone could be criticized for. The only issue involves means.

And I’m so useless, sadly, on that for you!  I don’t know whether the means contemplated by Paterson are the wrong ones.

I’m sure too, though, the columnist would agree that what the right means of promoting informed public engagement with GM foods in the UK are is not “obvious” but rather something that requires the sort of evidence-based orientation that I described in part 1.

Well, you asked!  That’s my reaction.

And thanks again for giving me something very interesting to think about.



How religiosity and science literacy interact: Evolution & science literacy part 2

This is the second of two posts on science literacy and evolution.

And religion.

And liberal democratic society as the naturally congenial but sometimes precariously raucous—or maybe better, simultaneously congenial and precarious because naturally raucous—home for science.

And how the common misunderstanding of what public “disbelief” in “evolution” truly signifies can actually interfere with popular dissemination of scientific knowledge.  Plus compromise norms of respect for cultural pluralism that are essential to the practice of liberal democracy.

See? Get it?

Okay, well, in the last post I described the vast body of long established but persistently--weirdly--ignored work that social scientists have amassed on the relationship between public “disbelief” in evolution and public understanding of evolution and other basic elements of science.

That work shows that there  isn't any relationship. What people say they “believe” about evolution is a measure of who they are, culturally.  It’s not a measure of what they know about what’s known to science.

Indeed, many people who say they “believe” in evolution don’t have the foggiest idea how the modern synthesis hangs together. Those who say they “disbelieve” are not any less likely to understand evolutionary theory--but they aren't any more likely to either.

That so few members of the public have a meaningful understanding of the workings of genetic variance, random mutation, and natural selection (the core elements of the modern synthesis) is a shame, and definitely a matter of concern for the teaching of science education.

But it’s a problem about what people “know” and not what they say they “believe.” What people say they "believe" and what they "know" about evolution are vastly different things. That's what the ample scientific evidence on public understandings of science shows.

In this post I want to add a modest increment of additional evidence corroborating this important point.

The evidence has to do specifically with the relationship between religion, science literacy, and belief in evolution.

The evidence is from a survey of 2,000 US adults recruited and stratified in a manner designed to assure national representativeness. 

The survey instrument included the NSF science indicators.

It also contained various measures of religiosity, including regularity of church attendance; regularity of prayer; and perceived “importance of God” in one’s life. These cohered in a manner that enabled them to be formed into a reliable “religiosity” scale.
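For readers who want to see what checking that items "cohere" can look like, here is a minimal sketch in Python. It computes Cronbach's alpha, one commonly used reliability statistic; the post does not say which statistic was actually used, so that choice is an assumption. The scores below are invented toy values for six imaginary respondents, not the survey data, and the function name is hypothetical.

```python
from statistics import pvariance

def cronbach_alpha(items):
    """items: list of equal-length lists, one list per scale item,
    each holding the respondents' scores on that item.
    Returns k/(k-1) * (1 - sum of item variances / variance of totals)."""
    k = len(items)
    # Each respondent's total score across all items.
    totals = [sum(scores) for scores in zip(*items)]
    item_var = sum(pvariance(col) for col in items)
    return (k / (k - 1)) * (1 - item_var / pvariance(totals))

# Toy, made-up responses (NOT the survey data), each on a 1-5 scale.
attendance = [1, 2, 3, 4, 5, 5]
prayer     = [1, 3, 3, 4, 4, 5]
importance = [2, 2, 3, 5, 5, 5]

alpha = cronbach_alpha([attendance, prayer, importance])
print(round(alpha, 2))
```

Values near 1 indicate that respondents who score high on one item tend to score high on the others, which is the sense in which a set of items can be treated as measuring a single underlying disposition.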

And the survey contained an item that Gallup and other pollsters routinely use to measure the public’s “beliefs” about evolution.

What do these data show?

Well, I’ll state in summary form what I regard as the findings of interest, and then supply the supporting details:

1.   Neither the “Evolution” nor the “Big Bang” items in the NSF’s "Science Indicators" battery can plausibly be viewed as reliably measuring “scientific literacy” in subjects who are even modestly religious.

2. When subjects who are highly science literate but highly religious answer “False” to the NSF Indicator’s Evolution item, their response furnishes no reason to infer that they lack knowledge of the basic elements of the best scientific understanding of evolution.

3. For respondents who are below average in religiosity, a high score in “science literacy” predicts a higher probability of “believing” in “Naturalistic Evolution”—and so does a low score!

4. For those who are above average in religiosity, a high score in science literacy doesn’t predict a higher probability of believing in Naturalistic Evolution. But it does predict a higher probability of believing in Theistic Evolution.

5.  A higher score in science literacy predicts a lower probability of believing in Young Earth Creationism—whether respondents are below or above average in religiosity.

Okay. Here are the specifics.

1. In general, religiosity (measured, as I said, by aggregating items on church attendance, frequency of prayer, and perceived personal importance of God) is correlated negatively with science literacy.

But the effect is modest. The large overlap in the density distribution plots to the left makes it clear that the portions of population “above” and “below average” in religiosity (“AARs” and “BARs,” let’s call them) both comprise individuals of a wide range of scores on the NSF science literacy battery.

Or at least they do when one leaves Evolution and Big Bang out of the tally, as the NSF itself decided to do in 2010, and as I have here. To make the science literacy scale more reliable and discerning, I’ve added items from the Indicators' “science process” battery, which tests knowledge relating to probability and validity of experimental methods.

Consider, though, how AARs and BARs scoring in the top 50% of the science literacy test so measured respond to Evolution and Big Bang:

The difference in the percentages of the two moderately “science literate” groups who answer “true” to these questions is stunningly high. 

Now one can use even more intricate statistical tests—ones involving, say, Cronbach’s alpha, factor analysis, and structural equation modeling—to convincingly show that Evolution and Big Bang are not measuring the same latent proficiency in acquiring scientific knowledge as are the remaining NSF Indicator items. 

But nothing more intricate than this discrepancy in the performance of modestly science literate AARs and BARs is necessary to see that these two items aren’t a valid measure of science literacy in the former.
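The simple check described above can be sketched in a few lines. This is only an illustration under stated assumptions: the record layout (`group`, `sci_lit`, `evolution`) is hypothetical, "top 50%" is taken to mean at or above the sample median, and the toy records are invented, not drawn from the survey.

```python
from statistics import median

def share_true_among_top_half(respondents, item):
    """Share answering True on `item`, by religiosity group, among respondents
    at or above the median science literacy score."""
    cutoff = median(r["sci_lit"] for r in respondents)
    shares = {}
    for group in ("AAR", "BAR"):
        top = [r for r in respondents
               if r["group"] == group and r["sci_lit"] >= cutoff]
        shares[group] = sum(r[item] for r in top) / len(top) if top else None
    return shares

# Invented records for illustration only (NOT the survey data)
toy = [
    {"group": "AAR", "sci_lit": 9, "evolution": False},
    {"group": "AAR", "sci_lit": 8, "evolution": False},
    {"group": "AAR", "sci_lit": 8, "evolution": True},
    {"group": "AAR", "sci_lit": 3, "evolution": False},
    {"group": "BAR", "sci_lit": 9, "evolution": True},
    {"group": "BAR", "sci_lit": 8, "evolution": True},
    {"group": "BAR", "sci_lit": 4, "evolution": True},
    {"group": "BAR", "sci_lit": 2, "evolution": False},
]
shares = share_true_among_top_half(toy, "evolution")
print(shares)
```

If an item measured the same proficiency as the rest of the battery, the two shares should be close among respondents with similar scores; a large gap suggests the item is picking up something else.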

2. The NSF Indicators test of science literacy is far from perfect, but I think it’s reasonable to infer that people who do above average have acquired more understanding of basic science knowledge than those who score below average.

I doubt that a majority of BARs who score in the top 50% of the NSF Indicator battery (sans Evolution and Big Bang and avec the process items) know the basic elements of the theory of evolution, including the role that genetic variance, random mutation, and natural selection play in it. 

But I think more of them are likely to understand those things than BARs who score in the bottom 50%.

By the same token, there’s reason to believe that AARs who score in the top 50% on the NSF science literacy test are more likely to have acquired an elementary knowledge of evolutionary theory than those—BARs or AARs—who score in the bottom 50%.   

Nothing in how the above-average science literacy AARs answer the Evolution item furnishes any reason to doubt this. How they respond to that item, I’ve just pointed out, is not, for them at least, a measure of what they know about science.  And in any case, as has been established by researchers on multiple occasions, there’s zero correlation between whether one says one “believes in” evolution and whether one can give a passable account of the modern synthesis.

3. Now let’s consider what we can learn from the responses to the “popular opinion poll” item on beliefs in evolution.

That item asks respondents to indicate “which one of the following statements comes closest to your views on the origin and development of human beings—” 

  • Humans developed over millions of years from less advanced forms of life, but God guided this process
  • Human beings have developed over millions of years from less advanced forms of life, but God had no part in this process; or
  • God created human beings pretty much in their present form at one time within the last 10,000 years or so." 

Let’s call these responses “Theistic Evolution,” “Naturalistic Evolution,” and "Young Earth Creationism," respectively.

Theistic Evolution was the most popular response but was supported by only a plurality (38%). Young Earth Creationism was second and Naturalistic (or "Godless") Evolution third, but the proportions who selected each differed by only a slight amount (32% vs. 29%, respectively).

These numbers, by the way, differ a bit from what Gallup tends to report. The percent selecting Theistic Evolution is consistent with Gallup's figures. But Godless Evolution runs closer to Young Earth Creationism than it does in Gallup polls.

What to make of this? Well, I’ll write a blog soon about the validity of on-line public opinion samples. But suffice it to say that based on the predictive accuracy of surveys conducted by YouGov, the premier on-line survey firm that recruited the sample for this study, and surveys conducted by Gallup in the 2010 and 2012 elections, YouGov is probably getting closer to the “true” general population values.

What we are interested in, though, is how science literacy and religiosity influence selection of these responses.

Consider first the relationship between these responses & science literacy.

Whoa ... the Jesus fish symbol popped out of my regression!

Maybe not shocking, but note that support for Naturalistic Evolution peaks at only about 55% even among the most science literate. The relationship between support for that position and science literacy, moreover, is “U”-shaped: higher at both the low and high ends. This relationship was confirmed by a multinomial logistic regression with appropriate quadratic terms; the fitted values from that regression are what I’m graphing (these plots are very true to what one would see in the “raw” data).
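To see how a quadratic term in a multinomial logistic model can produce a “U”-shaped fitted probability, here is a minimal sketch. The coefficients are hypothetical, picked only to show the shape; they are not estimates from this survey, and the choice of Theistic Evolution as the reference category is likewise an assumption.

```python
import math

# Hypothetical coefficients (NOT estimates from the survey):
# (intercept, linear, quadratic) on a standardized science literacy score z.
# Theistic Evolution is treated as the reference category (all zeros).
COEFS = {
    "Theistic": (0.0, 0.0, 0.0),
    "Naturalistic": (-0.6, 0.2, 0.5),
    "YoungEarth": (-0.2, -0.6, 0.0),
}

def predicted_probs(z):
    """Multinomial-logit (softmax) probabilities of each response at score z."""
    scores = {k: b0 + b1 * z + b2 * z ** 2 for k, (b0, b1, b2) in COEFS.items()}
    denom = sum(math.exp(s) for s in scores.values())
    return {k: math.exp(s) / denom for k, s in scores.items()}

for z in (-2, -1, 0, 1, 2):
    p = predicted_probs(z)
    print(z, {k: round(v, 2) for k, v in p.items()})
```

With a positive quadratic coefficient, the probability of the Naturalistic response rises at both extremes of `z` and dips in the middle. Letting these coefficients differ by religiosity group (an interaction) is what would allow the curve to be “U”-shaped for one group and flat for another.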

Now add religiosity. The following plots contrast the probabilities that AARs and BARs will select one or another of the responses to the popular pollster item. They are derived from the same multinomial logistic regression, which confirmed that the impact of science literacy on the probability of selecting one response or another varies depending on level of religiosity.

It’s clear that the “U”-shaped relationship between science literacy and believing in Naturalistic Evolution is being driven by BARs.

In other words, BARs are more likely to believe in Naturalistic Evolution as they become either extremely science literate or extremely science illiterate!

Is this a surprise? Well, I wasn’t expecting this. My inspection of the data was pretty much exploratory, without strong hypotheses.

But I was reminded of a finding in what I regard as one of the very best studies of how high-quality instruction in the teaching of evolutionary theory generates improvements in knowledge but not changes in belief.

In the study, Anton Lawson and collaborators found that high school students, particularly those scoring highest in critical reasoning skills, readily acquired knowledge of various aspects of evolution through instruction, but that acquisition of such knowledge did not produce a corresponding shift in belief among the students who began as nonbelievers.  

Nevertheless, the subgroup of such students who did back away from two particular beliefs hostile to naturalistic evolution (that the “living world is controlled by a force greater than humans” and that “all events in nature occur as part of a predetermined master plan”) consisted of the students who scored the lowest in critical reasoning skills. 

Speculating on why, Lawson et al. noted that “experience tells us that people change their beliefs for other than rational reasons. For example, hearing the opinion of an acknowledged authority figure could cause one to change a belief. Perhaps intuitive [students] are more likely than reflective students to change their beliefs for this reason.”

Lawson et al. don’t themselves explicitly suggest this, but a consistent conjecture might be that students who are higher in critical reasoning skills might be more inclined to push back on identity-threatening “beliefs” (even while taking on more knowledge) than those who are less reflective. That would be consistent with findings that motivated reasoning can be amplified by science literacy and cognitive reflection.

Someone should do a study to test that hypothesis!

4.  For AARs, in contrast, an increase in science literacy does not predict belief in Naturalistic Evolution. On the contrary, it seems to predict a slight decrease, although the effect is pretty much zero for all but those AARs whose scores are quite low.

So much for the idea that “disbelief” in evolution is a sign of low science literacy.  It isn’t.  “Disbelief” is just as consistent with being high in science literacy as low.

The only thing “disbelief” in Naturalistic Evolution reliably signifies is that one is religious.  This is consistent with the hypothesis that evolution “beliefs” are actually measures of cultural identity (as reflected in religiosity).

This conclusion is strongly corroborated by the relationship between science literacy and the increased probability of believing in Theistic Evolution among AARs. Offered the opportunity--as they aren’t in the NSF Science Indicators science knowledge battery--to select a position simultaneously consistent with “belief” in evolution and religious identity, the most science literate AARs grab hold of it!

5. Indeed, those same subjects—AARs who score high in science literacy—are less likely to espouse Young Earth Creationism than their less science literate counterparts.

What does this tell us? I suppose other interpretations are possible, but I’d say that AARs high in science literacy are in fact eager to affirm their “belief” in evolution, so long as they can be presented with a means of doing so that doesn’t denigrate their cultural identities.

Not surprisingly, BARs are also less likely to express support for Young Earth Creationism as they become more science literate.

Support for Young Earth Creationism is associated disproportionately with being simultaneously above average in religiosity and below average in science literacy.

* * * * *

Some concluding thoughts:

1. “Disbelief” in evolution doesn’t reflect a deficiency in science literacy or shortcomings in science education in our society.  

I think it is very reasonable to think members of our society are not as science literate as they should be, and also that our education system must do better in imparting scientific knowledge to citizens generally. 

But it’s wrong to think that the level of “disbelief” in evolution is evidence of those things.  It’s wrong to think that because that view is contrary to empirical evidence.

The evidence that many researchers have compiled and that I’ve added to in a very modest way here shows overwhelmingly that an individual's unwillingness to profess “belief” in evolution doesn't indicate science illiteracy or her unfamiliarity with the rudiments of evolutionary theory. 

It measures her expression of her cultural identity. What saying “I don’t believe in evolution” means, culturally speaking, is that one belongs to a community whose members subscribe to a particular set of understandings of the best way to live.

2.  Those dedicated to the critical task of promoting scientific literacy, including public knowledge of the best scientific understanding of evolution, should not be focusing on what percentage of the population says they “believe” in evolution.

They shouldn’t be focusing on that because that information tells us nothing about how much scientific knowledge or even knowledge of evolution the public has.  Those who want to test how well society is doing in imparting knowledge of evolution should be measuring instead what fraction of the population can give a cogent account of genetic variance, random mutation, and natural selection. It’s pitifully small, among both those who say they “believe” in evolution and those who say they don’t.

But even more important, those who want to promote public acquisition of scientific knowledge should avoid making professions of “belief” in evolution their aim because doing so is much more likely to deter than promote acquisition of basic scientific knowledge.

People who have a religious identity—who include plenty of science literate people and people capable of becoming even more so—see profession of “belief” as denigrating their cultural identities.  Naturally, then, they will see the demand that they not only learn but publicly affirm their "belief” in evolution as an attack on their community by members of another who harbor a shared understanding of the best life hostile to theirs.

They’ll resent that.  And with good reason. It's appropriate--absolutely essential, even--that a liberal democracy oblige those who furnish the public good of education to impart to people of all cultural identities the best available understanding of how the universe works, including the career of life on earth.  But citizens who make it their business to force others who have cultural views different from theirs to submit to purely symbolic rituals of identity-abnegation are engaged in a noxious, fundamentally illiberal form of conduct.

Such behavior, moreover, predictably breeds motivated resistance to acquiring knowledge of what science knows. Fear of the loss of status associated with "assenting" to facts symbolically linked to the identity of a rival cultural group is exactly what blocks citizens from converging on the best scientific evidence on issues such as climate change, nuclear power, the HPV vaccine, and other culturally contested policies.

In their study of how effectively imparting knowledge of evolutionary theory does not produce “belief,” Anton Lawson & William Worsnop conclude:

Of course, every teacher who has addressed the issue of special creation and evolution in the classroom already knows that highly religious students are not likely to change their belief in special creation as a consequence of relatively brief lessons on evolution. Our suggestion is that it is best not to try to do so, not directly at least. Rather, our experience and results suggest to us that a more prudent plan would be to utilize instruction time, much as we did, to explore the alternatives, their predicted consequences, and the evidence in a hypothetico-deductive way in an effort to provoke argumentation and the use of reflective thought. Thus, the primary aims of the lesson should not be to convince students of one belief or another, but, instead, to help students (a) gain a better understanding of how scientists compare alternative hypotheses, their predicted consequences, and the evidence to arrive at belief and (b) acquire skill in the use of this important reasoning pattern--a pattern that appears to be necessary for independent learning and critical thought.

This is a sensible prescription for those who (very appropriately!) want to promote the widest dissemination of basic science knowledge in the general public.

But it also happens to be a prescription consistent with the basic liberal injunction to respect the entitlement of individual citizens to freely use their own reason both to understand what is known by science and to decide for themselves what constitutes a virtuous life.

The convergence of the two is not any sort of accident.  It reflects a deep truth about the reciprocal affinity of science and political liberalism.
