
Perplexed--once more--by "emotions in criminal law," Part 2: The "evaluative conception"

This is the second in an n-part series describing my evolving view of the significance of emotions in substantive criminal law.

Actually, "shifting view" would be a better way to put it.  I took a position at one point that I later concluded missed if not the point then a very important point, one that had caused me to lose confidence in the original position.

Now I find myself thinking that the successor position is also likely inadequate. Maybe the earlier position was right after all. Or perhaps some sort of dialectical synthesis will reveal itself to me if I think more about how the pieces of evidence before me actually fit together.

I'm really not sure!

Should I be worried that I don't know whether either of the announced positions I took before is right, and thus what I actually believe anymore?

The point of this series of posts, in addition to inviting reflection & comment on an interesting part of the law, is to explore "changing one's mind." 

One of my principal research interests is the ubiquity of defensive resistance to evidence that challenges people's perceptions of risk and like facts on culturally contested issues--climate change, gun control, etc.

But more intriguing to me at this particular moment is that it seems just as unusual for scholars studying this very phenomenon--or pretty much any other intriguing aspect of human behavior or cognition--to change their minds about what explains it.

Why would this be so?  By hypothesis, those scholars are using empirical methods to make sense of complex phenomena, the workings of which don't admit of direct observation and that must therefore be investigated indirectly, on the basis of the observations of other things we'd expect to see or not depending on the truth of different plausible theories of how those unobserved phenomena work.

Given the very nature of this activity, one might expect shifts in position to be common-place. If the phenomena in question are complex and not open to direct observation; if multiple plausible theories compete to account for them; and if the evidence for deciding between those theories consists of observations that necessarily do nothing more than alter incrementally the balance of then-existing considerations in favor of one position or another, then why wouldn't individual researchers' positions display the character of successive estimates of a random variable subject to imperfect measurement?

Meaningful shifts might be expected to abate over time, as sound studies--valid measurements of the quantity of interest--start to converge on some value, estimates of which are less and less affected by the marginal impact of additional studies.  But where something is complex, and measuring instruments imperfect, that sort of stability will often take quite a while to emerge.  Moreover, it is during the interval it takes for such a state to form that we should expect to see the greatest volume of active, intense research--and thus the most occasion for those carrying out such investigations to shift positions as they update their views based on new evidence.

Scholarly inquiry as a whole takes this form.  We view such shifts in prevailing understanding as signs of "progress," a byproduct of the enlargement of knowledge associated with the use of science's signature method of inquiry. (I really do mean to be talking only about "normal science," or as I prefer "progressive research programs," the operation of which is predominantly made up of successive incremental advances driven by investigation of competing solutions to unresolved questions or unexplained anomalies; so-called "paradigm shifts" are another matter altogether.)

So why shouldn't we observe this same thing in the career of individual researchers' own understandings of the complex phenomena they are studying? If scholars' own research programs are progressing, and their knowledge of the phenomena they are studying enlarging as a result, then shouldn't their own work be expected to furnish them periodically with reason not just for refinement and fine tuning of their previous understandings but with cause for announcing that they've discovered some decisive objection to an inference they drew earlier?

In Part 1, I reproduced an excerpt from Two Conceptions of Two Conceptions of Emotion in Criminal Law: An Essay Inspired by Bill Stuntz, in The Political Heart of Criminal Procedure 163 (David Skeel, Michael Klarman & Carol Steiker eds., 2011).  In that excerpt, I sketched out the hard question that the treatment of emotion in criminal law puts: namely, "what is really going on" when courts selectively treat impassioned behavior as a grounds for mitigating or aggravating the law's appraisal of the moral quality of an offender's, or victim's, conduct?

Here's another snippet from that same essay, one in which I trace out an answer developed in Two Conceptions of Emotion in Criminal Law, 96 Colum. L. Rev. 269 (1996), an article I coauthored with Martha Nussbaum.  It's a position that, for reasons I'll discuss in "tomorrow's" post, I decided, at the time this article was written, no longer seemed right to me. The "day after tomorrow" I'll explain why I now don't think the reason I rejected that earlier paper seems right either.

But I'll tell you now how I feel about this: kind of excited, actually.

* * *

I will call [one account of the discordant themes that pervade the criminal law’s discussion of emotions] the two conceptions thesis, or “TCT.” This label derives from [Kahan & Nussbaum (1996)]. But the basic position—this particular solution to the puzzle of emotions in criminal law—was in line with ones that other scholars, including Sam Pillsbury and Victoria Nourse were developing at roughly the same time, and that many others, including Cynthia Lee and Carol Steiker, have since refined and extended.

TCT posits that substantive criminal law features two competing views of what emotions are and why they matter. The first is the mechanistic conception, which sees emotions as thoughtless surges of affect or “impulses.” Emotions excuse or mitigate, on this account, because—and to the extent—that they deprive an individual of the power to control his or her actions.

The second account is the evaluative conception of emotion. This view treats emotions and related sensibilities as a species of moral judgment that express an actor’s evaluation of contingencies that threaten or promote ends the actor cares about. As such, emotions, on this view, can be evaluated, not just as strong or weak, but as good or bad, right or wrong, reasonable or unreasonable, depending on whether the values they express are ones we think are appropriate or not for someone in the actor’s situation.

Each conception of emotion has an impressive pedigree in philosophy and psychology, and both are on display in the Oklahoma Court of Criminal Appeals decisions I started with. The mechanistic figures in those portions of the opinions emphasizing the “intensity of mental shock,” and resulting “loss of control,” “unseating of reason,” and “unbalancing of mind” that attend the discovery of adultery; the evaluative in those that distinguish between the “man of good character” and “refined sensibilities,” whose aggrievement warrants our solicitude, from the “rounder and libertine,” whose resentment of a man whose disregard for “the sanctity of the home” and “the virtue of women” he himself shares does not.

On their surface, the doctrines of criminal law are pervaded by mechanistic idioms and metaphors. But at their core, TCT asserts, they are evaluative. All of the doctrines contain one or another normative element that invites (or at least enables) decisionmakers to confine their liability-discharging or punishment-mitigating consequence to offenders whose emotional evaluations decisionmakers morally approve of. If they find that element to be satisfied, they needn’t find that the offenders’ passion embodied any particular quantum of volition-destroying force; if they find this particular quality to be absent, they needn’t afford the slightest dispensation no matter how overwhelming or irresistible the offender’s (or victim’s, in the case of “intervening causation”) passion was.

The anger of the man who kills his wife or her paramour, for example, is right for someone in his situation, because adultery is “the gravest possible offence which a wife can commit against her husband” and “the highest invasion of [a man's] property” by another man. Having “no such right to control the woman as a husband has to control his wife,” in contrast, the resentment of the man who kills the lover of his mistress reveals a morally incorrect overvaluation of his own prerogatives. Only the “heat of passion” of the former, then, will be deemed to have been “adequately provoked” for purposes of the voluntary manslaughter doctrine.

The fear of the woman who aids the armed robber to protect her child appropriately loves her children more than she loves strangers, whereas one who acquiesces in the abuse of her own child to avoid harm to herself excessively prefers her own well-being to her children’s. The threat to the former, then, but not the latter is sufficient to “overcome the will of a person of reasonable firmness”—not because their wills were any more or less compromised but because reasonable women appropriately value their children’s well-being over that of anyone else’s, including their own.

What’s “true” about the man who stands his ground and kills is his character: like a “true beam,” it is straight, not warped. Because he appropriately values his “rights,” “liberty,” and “sacredness of . . . person” more than the life of a “wrongful” aggressor who tries to drive him from a public place where he has a “right to be,” he “reasonably” perceives flight to be as destructive of his “self-preservation” as death. The true woman, quite evidently, does not make the mistake of thinking her right to stay put ahead of the life of her abusive husband, even if the alternative is to remain “a life of the worst kind of torture and . . . degradation.”

The law refuses to accept any expert definition of “mental disease” for purposes of insanity. “[F]or all his insight into the dynamics of behavior, [the medical expert] has not solved the riddle of blame. The question remains an ethical one, the answer to which lies beyond scientific truth.” However implausible, then, it might be to think the explosive shock of infidelity invariably reverberates with greater intensity in the mind of a “man of refined sensibilities, having high conceptions of the sanctity of the home and the virtue of women,” than in that of a “moral degenerate, in the habit of consorting with prostitutes and dissolute women,” it is perfectly compatible with the law to characterize the former alone as sick.

The TCT solution to the puzzle of emotions in criminal law has three principal strengths. The first is its explanatory power. The evaluations that decisionmakers make of the values expressed in impassioned offenders’ emotions are informed by social norms. It is thus no surprise to see decisionmakers who are using the evaluative conception of emotion selectively exonerating (in whole or in part) offenders whose emotional valuations conform to prevailing expectations of what goods and states of affairs individuals occupying particular social roles are expected to value.

These norms, of course, are not fixed. They shift over time, and at any given moment might well be in a state of flux and contestation. . . . TCT thus explains . . . why the law’s appraisal of impassioned offenders shifts over time and why at any given moment it can be the focus of intense political conflict.***

A second, related strength of TCT is its critical power. . . . TCT proponents have often successfully exposed the conservative bias of [commentators], who piously denounce as “political” any shift or proposed reform in the law’s treatment of impassioned offenders while displaying a comically blind eye to the necessarily political content of the evaluations that inform traditional doctrines and their applications. . . .

The third and final attraction of TCT is its prescriptive power. Critical commentary begs the question: what should the law be? Accounts that treat the mechanistic veneer of the doctrine seriously don’t help; at best they produce muddle, and at worst they make us unwitting apologists for the norms that just below the surface inform traditional doctrine and doctrinal applications. If the core of the law is evaluative, then those who want to make the law as good as it can be should be self-consciously evaluative, TCT proponents (myself included!) argued. We should face up to the necessity and appropriateness of making the law a reflection of the best moral and political understanding we can fashion of the values that good people ought to have.



Perplexed--once more--by "emotions in criminal law": Part 1

So to try to terminate my obsession with the " 'hot hand' fallacy" fallacy, I have resorted to intellectual methadone, finding a new puzzle that I can substitute to quench my cravings but that I'm sure I'll be able to drop once those subside....

Actually, it is the issue that was in the background of yesterday's post on "changing my mind." I offered up the topic of "emotions in criminal law"--the question how the law conceives of their nature and their normative significance--as a matter on which I had acknowledged, in a published paper (Kahan 2011), that the position I had taken in an article written yrs earlier (Kahan & Nussbaum 1996) had come to seem wrong to me based on things I had learned in the interim.

But in the course of reminding myself what position I had adopted in the later paper, it occurred to me that there were certain things about it that now seemed hard to reconcile with what I'd learned in the 4 yrs since I wrote that paper....

So I'm going to try to work out what my new for-now position should be based on the current state of how I understand various not directly observable things in the world to work.

In the course of doing that, moreover, I want to advance a claim about being in exactly this situation -- of finding that what one offered as a well-considered account of some phenomenon has to be qualified or simply replaced with a different position based on new things one has learned.

The claim is that this should be a normal, even common-place thing.  Or at least it should be if one, first, chooses to devote one's attention to matters of genuine complexity, phenomena the workings of which are not demonstrable on the basis of direct inspection but rather only indirectly inferrable on the basis of evidence, i.e., additional phenomena that can be observed and that one has reason to believe are caused by those nonobservable complex matters; and, second, recognizes that anything pertinent one discovers under these conditions necessarily doesn't settle the issue but rather supplies one only with more or less reason to credit one plausible account, rather than another, about what's really going on.

For in that situation, whatever one's current best understanding is will be in the nature of an estimate of a very fine quantity, and one's work in the nature of progressively more precise measurements, which can be expected to jump from one side of some critical value to the other and back again as one's knowledge continues to expand . . . .

This is actually how things look, more or less, within a "progressive research program" that engages the collaborative, conversational attention of a group of researchers engaged in scholarly conversation.

So shouldn't it be the way the work of any particular researcher working within such a program looks, too, if he or she is genuinely trying to figure out the truth about some complex thing, the operations of which cannot be directly seen but rather only indirectly inferred on the basis of disciplined observation & measurement?....

Well, anyway, this post is the first of what I anticipate will be between 3 and 600 on the evolution of my understanding on "emotions in criminal law," which has been marked by a series of shifting positions animated by a constant state of perplexity.

In this first part, I reproduce an excerpt from Two Conceptions of Two Conceptions of Emotion, the essay I mentioned in yesterday's post, which is designed to conjure apprehension of the unobservable phenomenon apprehension of which is the goal of the inquiry.

* * *


To introduce (or re-introduce) the puzzle I am concerned with, I will start with a pair of old decisions, both by the Oklahoma Court of Criminal Appeals. The issue in each was the same: whether the trial court erred by foreclosing the effective presentation of an insanity defense by a man charged with murder for killing his wife’s paramour.

In the first case, the court reversed the defendant’s conviction.[1] “Two doctors,” the court noted, “testified that the defendant . . . temporarily lost control of his mental processes” as a result of the “provocation” of his wife’s seduction.[2] “[W]e can perceive,” the court continued, that

a man of good moral character such as that possessed by the defendant, highly respected in his community, having regard for his duties as a husband and the virtue of women, upon learning of the immorality of his wife, might be shocked, or such knowledge might prey upon his mind and cause temporary insanity. In fact it would appear that such would be the most likely consequence of obtaining such information.[3]

In the second case, however, the court affirmed the conviction.[4] In that case, the court noted, “the state, over the objections of the defendant,” introduced evidence of “specific conduct tending to show . . . the defendant [to be] . . . a rounder and a libertine”:[5]

Facts were shown indicating that defendant's ideals of the sanctity of the home and the virtue of women were not so exalted, and that therefore the shock to his mind and finer sensibilities could not be so very great--at least not so great as to unbalance his mind. . . .

We think, in reason, that the shock would not be so great as it would to a man of refined sensibilities, having high conceptions of the sanctity of the home and the virtue of women.[6]

Thus, any trial rulings that prevented him from presenting a temporary insanity defense, the court held, were at most harmless error.

What’s really going on here? That is the question that any thoughtful reader who sets these two opinions out next to each other will feel compelled to ask. The court’s conclusion is straightforward: discovery of a wife’s infidelity is likely to deprive a sexually faithful man of his ability to comprehend or control his actions; such a discovery is not likely to have that effect, however, on an unfaithful man. But what’s not so straightforward is how to integrate the mélange of psychological and moral concepts that inform the court’s reasoning—“intensity of mental shock,” “unbalan[cing of] mind,” “loss of control,” on the one hand; “good moral character,” “regard for . . . the virtue of women,” “rounder and libertine,” on the other—into a coherent whole. How exactly does the court conceive of the nature of the emotional state of the “mentally insane” offender? What is it, precisely, about that condition that entitles someone to a defense?

These questions try to make sense of the decisions in philosophical or jurisprudential terms; but we might also feel impelled to ask “what is going on here” from a psychological or even political point of view. Do the judges really believe their own explanation of the distinction between two cases? Or are they deliberately concealing part of what they think from view? If concealing, are they trying to fool us, or are they just being coy? Do we imagine them straight-faced and earnest, or winking and slyly grinning, as they pronounce their judgments?

What’s likely to strike thoughtful readers as puzzling about these two decisions, it turns out, is the puzzle of emotions in criminal law. The discordant pictures that the decisions paint (of “highly respected” men of “good moral character” who are “shocked” to the point of mindless “loss of control,” on the one hand; of “rounders and libertines,” whose own lack of virtue insulates them from “mind-unbalancing” assaults on their reason, on the other) pervade basic doctrines and their application.

“Detached reflection cannot be demanded in the presence of an uplifted knife,” we are told.[7] Hence we cannot blame the “true man” who refuses to flee “an assailant, who by violence or surprise maliciously seeks to” drive him from a public place “where [he] has the right to be.”[8] But the woman who “believed herself . . . doomed . . . to a life of the worst kind of torture and . . . degradation” cannot on that basis be excused for killing her abusive husband in his sleep: because she had the option of leaving their home and striking out on her own, her will was not overcome by the “primal impulse” of “self-preservation.”[9]

A man who “discovered his wife in flagrante delicto with a man who was a total stranger to him, and at a time when [he] was trying to save his marriage and was deeply concerned about both his wife and his young child,” will necessarily experience the form of “ungovernable passion” that mitigates first-degree murder to manslaughter.[10] The same volitional impairment cannot be imputed to the man who kills the lover of his mistress, however, for he “has no such right to control the woman as a husband has to control his wife.”[11]

The deep “shame” of being subjected to rape is one of the “physical and mental injuries, the natural and probable result of which would render [an unmarried woman] mentally irresponsible,” making her subsequent commission of suicide an act attributable to her rapist, who could therefore be convicted of murder.[12] But a man could not be deemed to have “caused” the death of his (8-months pregnant) wife, “a high tempered woman” who was “hard to get along with” and who on previous “occasions ran off and left her husband” alone with the couple’s infant, because her decision to expose herself to the nighttime cold of winter in fleeing their farmhouse was her own choice following a fight.[13]

Again and again, we are confronted with a kaleidoscope of dissonant reports of virtuous offenders too mentally enfeebled to obey the law and impassioned ones too vicious not to be deemed to have “voluntarily” chosen to transgress. So what is really going on?


[1] Hamilton v. State, 244 P.2d 328 (Okla. Crim. App. 1952).

[2] Id. at 335.

[3] Id.

[4] Coffeen v. State, 210 P. 288 (Okla. Crim. App. 1922).

[5] Id. at 290.

[6] Id. at 290-91.

[7] Brown v. United States, 256 U.S. 335, 343 (1921) (Holmes, J.).

[8] State v. Bartlett, 71 S.W. 148, 152 (Mo. 1902).

[9] State v. Norman, 378 S.E.2d 8, 11, 12-13 (N.C. 1989).

[10] State v. Thornton, 730 S.W.2d 309, 312, 315 (Tenn. 1987).

[11] Rex v. Greening, 3 K.B. 846, 849 (1913).

[12] Stephenson v. State, 179 N.E. 633, 635, 649 (Ind. 1932).

[13] Hendrickson v. Commonwealth, 3 S.W. 166, 167 (Ky. Ct. App. 1887).



Changing my mind on "emotions in criminal law"

I sometimes get asked--sometimes in a challenging way--whether I've ever "changed my mind" or "admitted I was wrong" about something.  Hell yeah! Here's an example: Kahan, D. M. (2011), Two Conceptions of Two Conceptions of Emotion in Criminal Law: An Essay Inspired by Bill Stuntz, in The Political Heart of Criminal Procedure, D. Skeel, M. Klarman & C. Steiker (Eds.), (pp. 163-176): Cambridge University Press (working paper version here), where I shift my views on a number of key points from an earlier paper, Kahan, D. M., & Nussbaum, M. C. (1996). Two Conceptions of Emotion in Criminal Law. Colum. L. Rev., 96, 269. There's more where this came from, too!

Indeed, I was looking at this particular paper the other day (after I offered it as an example to someone challenging me to show that I've ever acknowledged I was "wrong") & wondering if maybe it's wrong in light of Kahan, D. M., Hoffman, D. A., Evans, D., Devins, N., Lucci, E. A., & Cheng, K. (in press), 'Ideology' or 'Situation Sense'? An Experimental Investigation of Motivated Reasoning and Professional Judgment. U. Pa. L. Rev., 164.  There's at least a tension to be explained....Maybe the first paper was right...

Do I like saying I've changed my mind? Sure, if the reason is that I actually managed to figure out something that I didn't know before. If one never had occasion to announce that one had changed his or her mind for that reason, it would mean either (a) one was studying unchallenging, non-complex things (boring); or (b) one wasn't actually advancing in understanding in the course of study & reflection.

Do I worry that, as a result of saying "I think I wasn't right on X," people might not "believe me" when I say I think I know something in the future? No. First of all, they ought to be thinking critically about anything I say. Second, they ought to trust me more when they know that if I conclude I was wrong or have to qualify my previous view in some important way, I'll make an effort to tell them! Those who prefer to put their trust in scholars who wouldn't change their minds when they should, or wouldn't tell them when they did, are ones whose confidence I take no particular pride in earning.

Two Conceptions of Two Conceptions of Emotion in Criminal Law: An Essay Inspired by Bill Stuntz

Dan M. Kahan 

This essay examines alternative explanatory theories of the treatment of emotion in criminal law. In fact, it re-examines a previous exposition on this same topic. In Two Conceptions of Emotion in Criminal Law (Kahan & Nussbaum 1996), I argued that the law, despite a surface profession of fidelity to a mechanistic conception of emotion, in fact reflects an evaluative one: rather than thoughtless surges of affect that impair an actor’s volition, emotions, on this account, embody a moral evaluation of the actor that is in turn subject to moral evaluation by legal decisionmakers as “right” or “wrong,” “virtuous” or “vicious,” and not merely as “strong” or “weak” in relation to the actor’s volition. I now qualify this claim, and indeed reject certain parts of it. I do so on the basis of an alternative conception of the evaluative conception of emotion: whereas the position in Kahan & Nussbaum (1996) treats the evaluative conception as implementing a conscious moral appraisal on the part of decisionmakers, the alternative sees it, at least sometimes, as a product of decisionmakers’ unconscious vulnerability to appraisals they themselves would view as subversive of the law’s moral principles, which might well invest volitional impairment with normative significance. I examine the empirical evidence, amassed by various researchers including (without giving this point much thought) by me, for this third view, which I label the “cognitive conception” as opposed to the earlier (Kahan & Nussbaum 1996) “moral conception” of the “evaluative” view of emotions in criminal law.



Weekend update: Still fooled by non-randomness? Some gadgets to help you *see* the " 'hot hand' fallacy" fallacy

Well, I'm still obsessed with the " 'hot hand fallacy' fallacy." Are you?

As discussed previously, the classic "'hot hand' fallacy"  studies purported to show that people are deluded when they perceive that basketball players and other athletes enjoy temporary "hot streaks" during which they display an above-average level of proficiency.

The premise of the studies was that ordinary people are prone to detect patterns and thus to  confuse chance sequences of events (e.g., a consecutive string of successful dice rolls in craps) as evidence of some non-random process (e.g., a "hot streak," in which a craps player can be expected to defy the odds for a specified period of time).

For sure, people are disposed to see signal in noise.

But the question is whether that cognitive bias truly accounts for the perception that athletes are on a "hot streak."

The answer, according to an amazing paper by Joshua Miller & Adam Sanjurjo, is no.

Or in any case, they show that the purported proof of the "hot hand fallacy" itself reflects an alluring but false intuition about the conditional independence of binary random events.

The "test" the "hot hand fallacy" researchers applied to determine whether a string of successes indicates a genuine "hot hand"--as opposed to the illusion associated with our over-active pattern-detection imaginations--was to examine whether basketball players were more likely to hit shots after some specified string of "hits" than they were to hit shots after an equivalent string of misses.

If the success rates for shots following strings of "hits" were not "significantly" different from the success rates for shots following strings of "failures," then one could infer that the probability of hitting a shot after either a string of hits or misses was not significantly different from the probability of hitting a shot regardless of the outcome of previous shots. Strings of successful shots being no longer than what we should expect by chance in a random binary process, the "hot hand" could be dismissed as a product of our vulnerability to see patterns where they ain't, the researchers famously concluded.


This analytic strategy itself reflects a cognitive bias-- an understanding about the relationship of independent events that is intuitively appealing but in fact incorrect.

Basically, the mistake -- which for sure should now be called the " 'hot hand fallacy' fallacy" -- is to treat the conditional probability of success following a string of successes in a past sequence of outcomes as if it were the same as the conditional probability of success following a string of successes in a future or ongoing sequence. In the latter situation, the occurrence of independent events generated by a random process is (by definition) unconstrained by the past.  But in the former situation -- where one is examining a past sequence of such events --  that's not so.  

In the completed past sequence, there is a fixed number of each outcome.  If we are talking about successful shots by a basketball player, then in a season's worth of shots, he or she will have made a specifiable number of "hits" and "misses."

[Image caption: The cool Miller-Sanjurjo machine! It can be yours, because you -- unlike some *other* people (or robots or aliens or badgers with operational internet connections) who shall remain nameless -- never miss an episode of this blog! Just click!]

Accordingly, if we examine the sequence of shots after the fact, the probability the next shot in the sequence will be a "hit" will be lower immediately following a specified number of "hits" for the simple reason that the proportion of "hits" in the remainder of the sequence will necessarily be lower than it was before the previous successful shot or shots.

By the same token, if we observe a string of "misses," the proportion of "misses" in the remainder will be lower than it had been before the first shot in the string.  As a result, following a string of "misses," we can deduce that the probability has now gone up that the next shot in the sequence will turn out to have been a "hit."

Thus, it is wrong to expect that, on average, when we examine a past sequence of random binary outcomes, P(success|specified string of successes) will be equal to P(success|specified string of failures).  Instead, in that situation, we should expect P(success|specified string of successes) to be less than P(success|specified string of failures).

That means the original finding of the "hot hand fallacy" researchers that P(success|specified string of successes) = P(success|specified string of failures) in their samples of basketball player performances wasn't evidence that the "hot hand" perception is an illusion.  If P(success|specified string of successes) = P(success|specified string of failures) within an adequate sample of sequences, then we are observing a higher success rate following a string of successes than we would expect to see by chance.

In other words, the data reported by the original "hot hand fallacy" studies supported the inference that there was a hot-hand effect after all!

So goes M&S's extremely compelling proof, which I discussed in a previous blog.  The M&S paper was featured in Andrew Gelman's Statistical Modeling, Causal Inference blog, where the comment thread quickly frayed and broke, resulting in a state of total mayhem and bedlam!

How did the "hot hand fallacy" researchers make this error? Why did it go undetected for 30 yrs, during which the studies they did have been celebrated as classics in the study of "bounded rationality"? Why do so many smart people find it so hard now to accept that those studies themselves rest on a mistaken understanding of the logical properties of random processes?

The answer I'd give  for all of these questions is the priority of affective perception to logical inference.

Basically, we see valid inferences before we apprehend, through ratiocination, the logical cogency of the inference.

What makes people who are good at drawing valid inferences good at that is that they more quickly and reliably perceive or feel the right answer -- or feel the wrongness of a seemingly correct but wrong one -- than those less adept at such inferences.

This is an implication of a conception of dual process reasoning that, in contrast to the dominant "System 1/System 2" one, sees unconscious reasoning and conscious effortful reasoning as integrated and reciprocal rather than discrete and hierarchical.

The "discrete & hierarchical" position imagines that people immediately form a heuristic response ("System 1") and then, if they are good reasoners, use conscious, effortful processing ("System 2")  to "check" and if necessary revise that judgment.

The "integrated and reciprocal" position, in contrast, says that good reasoners are more likely to experience an unconscious feeling of the incorrectness of a wrong answer, and the need for effortful processing to determine the right answer, than are people who are poor reasoners. 

The reason the former are more likely to feel that right answers are right and wrong answers wrong is that they have through the use of their proficiency in conscious, effortful information processing trained their intuitions to alert them to the features of a problem that require the deployment of conscious, effortful processing.

Now what makes the fallacy inherent in the " 'hot hand fallacy' fallacy" so hard to detect, I surmise, is that those who've acquired reliable feelings about the wrongness of treating independent random events as dependent (the most conspicuous instance of this is the "gambler's fallacy") will in fact have trained their intuitions to recognize as right the corrective method of analyzing such events as genuinely independent.

If the "hot hand" perception is an illusion, then it definitely stems from mistaking an independent random process for one that is generating systematically interdependent results.

So fix it -- by applying a test that treats those same events as independent!

That's the intuition that the "hot hand fallacy" researchers had, and that 1000's & 1000's of other smart people have shared in celebrating their studies for three decades -- but it's wrong wrong wrong wrong wrong!!!!!

But because it feels right right right right right to those who've trained their intuitions to avoid heuristic biases involving the treatment of independent events as interdependent, it is super hard for them to accept that the method reflected in the "hot hand fallacy" studies is indeed incorrect.

So how does one fix that problem?

Well, no amount of logical argument will work!  One must simply see that the right result is right first; only then will one be open to working out the logic that supports what one is seeing.

And at that point, one has initiated the process that will eventually (probably not in too long a time!) recalibrate one's reciprocal and integrated dual-process reasoning apparatus so as to purge it of the heuristic bias that concealed the " 'hot hand fallacy' fallacy" from view for so long!

BTW, this is an account that draws on the brilliant exposition of the "integrated and reciprocal" dual process reasoning offered by Howard Margolis.

For Margolis, reason giving is not what it appears: a recitation of the logical operations that make an inference valid. 

Rather it is a process of engaging another reasoner's affective perception, so that he or she sees why a result is correct, at which point the "reason why" can be conjured through conscious processing.  (The "Legal Realist" scholar Karl Llewellyn gave the same account of legal arguments, btw.)

To me, the way in which the " 'hot hand fallacy' fallacy" fits Margolis's account -- and also Ellen Peters's of the sorts of heuristic biases that only those high in Numeracy are likely to be vulnerable to -- is what makes the M&S paper so darn compelling!

But now...

If you, like me and 10^6s of others, are still having trouble believing that the analytic strategy of the original "hot hand" studies was wrong, here are some gadgets that I hope will enable you, if you play with them, to see that M&S are in fact right.  Because once you see that, you'll have vanquished the intuition that bars the path to your conscious, logical apprehension of why they are right.  At which point, the rewiring of your brain to assimilate M&S's insight, and avoid the "'hot hand fallacy' fallacy" can begin!

Indeed, in my last post, I offered an argument that was in the nature of helping you to imagine or see why the " 'hot hand fallacy' fallacy" is wrong. 

But here--available exclusively to the 14 billion regular subscribers to this blog (don't share it w/ nonsubscribers; make them bear the cost of not being as smart as you are about how to use your spare time!)-- are a couple of cool gadgets that can help you see the point if you haven't already.

Gadget 1 is the "Miller-Sanjurjo Machine" (MSM). MSM is an Excel sheet that randomly generates a sequence of 100 coin tosses.  It also keeps track of how each successive toss changes the probability that the next toss in the sequence will be a "heads."  By examining how that probability goes up & down in relation to strings of "heads" and "tails," one can see why it is wrong to simply expect P(H|any specified string of Hs) - P(H|any specified string of Ts) to be zero.

MSM also keeps track of how many times "heads" occurs after three previous "heads" and how many times "heads" occurs after three previous "tails."  If you keep doing tosses, you'll see that most of the time P(H|HHH)-P(H|TTT) < 0.

Or you'll likely think you see that. 

Because you have appropriately trained yourself to feel something isn't quite right about that way of proceeding, you'll very sensibly wonder if what you are seeing is real or just a reflection of the tendency of you as a human (assuming you are; apologies to our robot, animal, and space alien readers) to see pattern signals in noise.

Hence, Gadget 2: the "Miller-Sanjurjo Turing Machine" (MSTM)! 

[Image caption: OMG!!! A Miller-Sanjurjo Turing Machine! No matter how many times you run it, you'll swear it's another human being who behaves just the way you do!!]

MSTM is not really a "Turing machine" (& I'm conflating "Turing machine" with "Turing test") -- but who cares?  It's a cool name for what is actually just a simple statistical simulation that does 1,000 times what its baby sister MSM does only once -- that is, flip 100 coins and tabulate P(H|HHH) & P(H|TTT). 

MSTM then reports the average difference between the two.  That way you can see in fact it's true that P(H|HHH) - P(H|TTT) for sure should be expected to be < 0. 

Indeed, you can see exactly how much less than 0 we should expect P(H|HHH) - P(H|TTT) to be: about 8%. That amount is the bias that was built into the original "hot hand" studies against finding a "hot hand."

(Actually, as M&S explain, the size of the bias could be more or less than that depending on the length of the sequences of shots one includes in the sample and the number of previous "hits" one treats as the threshold for a potential "hot streak".)

MSTM is written to operate in Stata.  But if you don't have Stata, you can look at the code (opening the file as a .txt document) & likely get how it works & come up with an equivalent program to run on some other application.
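For readers without Stata, here is a rough Python sketch of the kind of simulation described above. It is not the MSTM code itself: the function names, the seed, and the choice to skip any sequence in which either streak never occurs are my own assumptions, made only for illustration.

```python
import random


def streak_rates(flips, k=3):
    """Return (P(H|k heads), P(H|k tails)) within one recorded sequence.

    Each rate is the share of heads among flips that immediately follow
    k identical outcomes; it is None if no such flip exists.
    """
    after_h = []  # outcomes of flips that follow k straight heads
    after_t = []  # outcomes of flips that follow k straight tails
    for i in range(k, len(flips)):
        window = flips[i - k:i]
        if all(window):
            after_h.append(flips[i])
        elif not any(window):
            after_t.append(flips[i])
    rate_h = sum(after_h) / len(after_h) if after_h else None
    rate_t = sum(after_t) / len(after_t) if after_t else None
    return rate_h, rate_t


def average_difference(n_sequences=1000, n_flips=100, k=3, seed=1):
    """Average P(H|HHH) - P(H|TTT) over many simulated fair-coin sequences."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_sequences):
        flips = [rng.random() < 0.5 for _ in range(n_flips)]  # True = heads
        rate_h, rate_t = streak_rates(flips, k)
        # Assumption: skip sequences where either streak never occurs.
        if rate_h is not None and rate_t is not None:
            diffs.append(rate_h - rate_t)
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    print(f"average P(H|HHH) - P(H|TTT): {average_difference():+.3f}")
```

Run it a few times with different seeds (or bump up `n_sequences`) and the average should stay below zero. Changing `n_flips` or `k` lets you see how the size of the bias moves with sequence length and streak length.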

Have fun seeing, ratiocinating, and rewiring [all in that order!] your affective perception of valid inferences! 


Two threats to the public-health good of childhood vaccines ... a fragment

From something in the pipeline:

The tremendous benefit that our society enjoys by virtue of universal childhood immunizations is being put in jeopardy by two threats.  The first is the deliberate miscommunication of scientific evidence on vaccine safety. The second is our society’s persistent neglect of the best available scientific evidence on risk communication.  Indeed, these two threats are linked: the void created by the absence of scientifically informed, professional risk communication is predictably being filled by uninformed, ad hoc, unprofessional alternatives, which nourish the state of confusion that miscommunicators aim to sow.  The value of the scientific knowledge embodied in childhood vaccinations demands a commensurate investment in effectively using science to protect the science communication environment in which ordinary members of the public come to know what is known by science. Every constituent of the public health establishment -- from government agencies to research universities, from professional associations to philanthropic organizations -- must contribute its share to this vital public good.


Holy smokes! The "'hot-hand fallacy' fallacy"!

It's super-duper easy to demonstrate that individuals of low to moderate Numeracy --an information-processing disposition that consists in the capacity & motivation to engage in quantitative reasoning -- are prone to all manner of biases--like "denominator neglect," "confirmation bias," "covariance [non]detection," the "conjunction fallacy," etc.

It's harder, but not impossible, to show that individuals high in Numeracy are more prone to biased reasoning under particular conditions.

In one such study, Ellen Peters and her colleagues did an experiment in which subjects evaluated the attractiveness of proposed wagers.

For one group of subjects, the proposed wager involved outcomes of a positive sum & nothing, with respective probabilities adding to 1.  

For another group, the proposed wager had a slightly lower positive expected value and proposed outcomes were a positive sum & a negative sum (again with respective probabilities adding to 1).

Because the second wager had a lower expected value, and added "loss aversion" to boot, one might have expected subjects to view the first as more attractive.

But in fact subjects low in Numeracy ranked the two comparable in attractiveness.  Maybe they couldn't do the math to figure out the EVs. 

But the real surprise was that among subjects high in Numeracy, the second wager-- the one that coupled a potential gain and a potential loss-- was rated as being substantially more attractive than the first -- the one that coupled a potential gain with a potential outcome of zero and a higher EV.  

Go figure!

This result, which is hard to make sense of if we assume that people generally prefer to maximize their wealth, fit Peters et al.'s hypothesis that the cognitive proficiency associated with high Numeracy guides decisionmaking through its influence in calibrating affective perceptions.  

Because those high in Numeracy literally feel the significance of quantitative information, the necessity of doing the computations necessary to evaluate the second wager, Peters et al. surmised, would generate a more intense experience of positive affect for them than would the process of evaluating the first wager, the positive expected value of which can be seen without doing any math at all.  Lacking the same sort of emotional connection to quantitative information, the subjects low in Numeracy wouldn't perceive much difference between the two wagers.

Veeeeery interesting.   

But can we find real-world examples of biases in quantitative information-processing distinctive to individuals high in Numeracy?  Being able to is important not only to show that the Peters et al. result has "practical" significance but also to show that it is valid.  Their account of what they expected to and did find hangs together, but as always there are alternative explanations for their results.  We'd have more reason to credit the explanation they gave-- that high Numeracy can actually cause individuals to make mistakes in quantitative reasoning that low Numeracy ones wouldn't -- if we could observe such mistakes in the real world. 

That way of thinking is an instance of the principle of convergent validity: because we can never be "certain" that the inference we are drawing from an empirical finding isn't an artifact of some peculiarity of the study design, the corroboration of that finding by an empirical study using different methods -- ones not subject to whatever potential defect diminished our confidence in the first -- will supply us with more reason to treat the first finding as valid.

Indeed, the confidence enhancement will be reciprocal: because there will always be some alternative explanation for the findings associated with the second method, too, the concordance of the results reached via those means with the results generated by whatever method informed the first study gives us more reason to credit the inference we are drawing from the second.

Okay, so  now we have some realllllllly cool "real world" evidence of the distinctive vulnerability of high Numeracy types to a certain form of quantitative-reasoning bias.

It comes in a paper, the existence of which I was alerted to in the blog of stats legend  (& former Freud expert) Andrew Gelman, that examines the probability that we'll observe the immediate recurrence of an outcome if we examine some sequence of binary outcomes generated by a process in which the outcomes are independent of one another-- e.g., of getting "heads" again after getting "heads" rather than "tails" in the previous flip of a fair coin.

We all know that if the events are independent, then obviously the probability of the previous event recurring is exactly the same as the probability that it would occur in the first place.

So if someone flipped a coin 100 times, & we then examined her meticulously recorded results, we'd discover the probability that she got "heads" after any particular flip of "heads" was 0.50, the same as it would be had she gotten "tails" in the previous flip.

Indeed, only real dummies don't get this!  The idea that the probability of independent events is influenced by the occurrence of past events is one of the mistakes that those low to moderate Numeracy dolts make!  

They (i.e., most people) think that if a string of "heads" comes up in a "fair" coin toss (we shouldn't care if the coin is fair; but that's another stats legend /former Freud expert Andrew Gelman blog post), then the probability we'll observe "heads" on the next toss goes down, and the probability that we'll observe "tails" goes up. Not!

Only a true moron, then, would think that if we looked at a past series of coin flips, the probability of a "heads" after a "heads" would be lower than the probability of a "heads" after a "tail"! Ha ha ha ha ha! I want to play that dope in poker! Ha ha ha!

Um ... not so fast, say Miller & Sanjurjo in their working paper, "Surprised by the Gambler’s and Hot Hand Fallacies? A Truth in the Law of Small Numbers."

The "assumption that in a sequence of coin tosses, the relative frequency of heads on those flips that immediately follow a streak of heads is expected to be equal to the relative frequency of heads on those flips that immediately follow a streak of tails" is "seemingly correct, but mistaken" (p. 19).

Yeah, right.

"We prove," M&S announce (p. 22),

that in a finite sequence generated by repeated trials of a Bernoulli random variable the expected conditional relative frequency of successes, on those realizations that immediately follow a streak of successes, is strictly less than the fixed probability of success.

What? (I'm asking myself this at the same time you are asking me). "That can't possibly be the case"!

You'll feel like someone is scratching his fingers on a chalkboard as you do it, but read the first 6 pages of their paper (two or three times if you can't believe what you conclude the first time) & you'll be convinced this is true.

Can I explain this really counterintuitive (for high Numeracy people, at least) result in conceptual terms? Not sure but I'll try!

If we flip a coin a "bunch" of times, we'll get roughly 0.50 "heads" & 0.50 "tails" (it will land on its edge 10^-6 of the time). But if we go back & count the "heads" that came up only after a flip of "heads," we'll come up w/ less than 0.5 x 1 "bunch."

If we look at any sequence in the "bunch," there will be some runs of "heads" in there.  Consider "THHTHTTTHTHHHTHT..."  In this sequence of 16, there were (conveniently!) 8 "heads" & 8 "tails."  But only 3 of the 8 (conveniently!) occurred after a previous flip of "heads"; 5 of the 8 occurred after a flip of "tails."

In this sample, then, the probability of getting "heads" again after getting "heads" on the previous flip was not 0.5. It was 3/8 or .375 or ... about 0.4!
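If you don't trust my counting, here is a small Python sketch that tallies the 16 flips reported above. The variable names are just mine; the only input is the sequence as written.

```python
sequence = "THHTHTTTHTHHHTHT"

# Pair every flip (from the second onward) with the flip that preceded it.
pairs = list(zip(sequence, sequence[1:]))

after_heads = [nxt for prev, nxt in pairs if prev == "H"]
after_tails = [nxt for prev, nxt in pairs if prev == "T"]

heads_after_heads = after_heads.count("H")
heads_after_tails = after_tails.count("H")

print(sequence.count("H"), sequence.count("T"))  # 8 8
print(heads_after_heads, len(after_heads))       # 3 8
print(heads_after_tails, len(after_tails))       # 5 7
```

So 3 of the 8 flips that followed a "heads" were "heads" (0.375), while 5 of the 7 flips that followed a "tails" were "heads." There are only 7 of the latter because the last flip in the sequence is a "tails" with nothing after it.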

You might wonder (because for sure you are searching for the flaw in the reasoning) whether this result was just a consequence of the sequence I happened to "report" for my (N = 16) "experiment." 

You'd not be wrong to respond that way!

But if you think hard enough & start to play around with the general point --that we are looking at the history of a past sequence of coin tosses  -- you'll see (eventually!) that the probability of "heads" in the sample that occur after a previous "heads" (not to mention "several" heads in a row!) always is lower than the overall probability that any particular flip in that sequence was "heads."

That indeed it has to be. 
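One way to "play around" without relying on any single hand-picked sequence is brute force. The sketch below is my own illustration, not anything taken from the M&S paper: it lists every possible sequence of a short run of fair-coin flips, computes within each one the share of "heads" among the flips that follow a "heads," and averages those shares exactly. The function name is hypothetical, and sequences in which no flip follows a "heads" are dropped because the share is undefined for them.

```python
from fractions import Fraction
from itertools import product


def expected_heads_after_heads(n_flips):
    """Exact average, over all equally likely sequences of a fair coin,
    of the within-sequence share of heads among flips that follow a heads.

    Sequences in which no flip follows a heads are left out, since the
    share is undefined for them.
    """
    shares = []
    for seq in product("HT", repeat=n_flips):
        followers = [nxt for prev, nxt in zip(seq, seq[1:]) if prev == "H"]
        if followers:
            shares.append(Fraction(followers.count("H"), len(followers)))
    return sum(shares) / len(shares)


for n in (3, 4, 6, 10):
    value = expected_heads_after_heads(n)
    print(n, value, float(value))
```

For 3 flips the average comes out to 5/12 and for 4 flips to 17/42, both below 1/2. Note that what falls below 1/2 is the average across sequences; an individual sequence (all "heads," say) can still come out above it.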

What will you be seeing/feeling when you "get" this? Perhaps this: 

  1. Imagine I perform 100 coin tosses and observe 50 "heads" and 50 "tails." (No problem so far, right?)
  2. If I now observe the recorded sequence and begin to count backwards from 50 every time I see a "heads," I'll always know how many "heads" remain in the sequence.  (Still okay?  Good.)
  3. Necessarily, the number goes down by 1 every time I see a "heads" in the sequence. 
  4. And necessarily the number does not go down -- it stays the same -- every time I see a "tails" in the sequence.
  5. From this we can deduce that the probability that the next flip in the sequence will be a "heads" is always lower if the previous flip was a "heads" than if it was a "tails."
  6. Oh, btw, steps 2-5 still apply if you happened to get 51 "heads," or 48 or 55 or whatever, in your 100 tosses. Think about it!
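The counting-backwards steps above can be sketched in a few lines of Python. This is only a rendering of the heuristic as I've stated it, under the assumption that we already know the totals for the recorded sequence and are asking what share of the not-yet-examined flips are "heads"; the function name and arguments are made up for the illustration.

```python
def chance_next_is_heads(total_flips, total_heads, flips_seen, heads_seen):
    """Share of heads among the flips not yet examined in a recorded
    sequence whose overall totals are already known."""
    remaining_flips = total_flips - flips_seen
    remaining_heads = total_heads - heads_seen
    return remaining_heads / remaining_flips


# Recorded sequence: 100 tosses, 50 heads. Look at the first flip only.
after_a_heads = chance_next_is_heads(100, 50, flips_seen=1, heads_seen=1)
after_a_tails = chance_next_is_heads(100, 50, flips_seen=1, heads_seen=0)
print(round(after_a_heads, 4), round(after_a_tails, 4))  # 0.4949 0.5051
```

Swap in 51 or 48 or 55 for the total number of "heads" (step 6) and the first number still comes out smaller than the second.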

At this point you are saying, um, "now I'm not sure anymore"; go through that again.  Okay...

But here is the really cool & important thing: M&S show that the methodology used in the literature examining the so-called "hot hand fallacy" doesn't reflect this logic.

Those studies have been understood to "debunk" the common perception that basketball players go through "hot streaks" during which it makes sense for others to expect them to achieve a level of shooting success that exceeds their usual or average level of success.

The researchers who purported to "debunk" the perception of "hot hands" report that if one examines game data, the probability of players making a shot after making a specified number of shots in a row is roughly their average level of success. Just as one would expect if shots are independent events-- so there's no "hot hand" in reality--only in our fallible, error-prone minds!

But this method of analyzing the data, M&S demonstrate, is wrong. 

It overlooks that, "by conditioning on a streak of hits within a sequence of finite length, one creates a selection bias towards observing shots that are misses" (p. 19).

Yeah, that's what I was trying to say!

So if the data show, as the "hot hand fallacy" researchers found, that the probability a player would make his or her next shot after making a specified number in a row was the same as the probability that he or she would make a shot overall, their data, contrary to their conclusion, support the inference that players do indeed enjoy "hot streaks" longer than one would expect to observe by chance in a genuinely random process (& necessarily, somewhere along the line, "cold streaks" longer than one would expect by chance too).

I'm sold!

But for me, the amazing thing is not the cool math but the demonstration, w/ real world evidence, of high Numeracy people being distinctively prone to a bias in quantitative reasoning.

The evidence consists in the mistake made by the authors of the original "hot hand" studies and repeated by 100s or even 1000s (tens of thousands?) of decision science researchers who have long celebrated these classic studies and held them forward as a paradigmatic example of the fallibility of human perception.

As M&S point out, this was a mistake that we would expect only a high Numeracy person to make. A low Numeracy person is more prone to believe that independent events are not independent; that's what the "gambler's fallacy" is about. 

Someone who gets why the gambler's fallacy is a fallacy will feel that the way in which "hot hand fallacy" researchers analyzed their data was obviously correct: because events that are independent occur with the same probability irrespective of past outcomes, it seems to make perfect sense to test the "hot hand" claim by examining whether players' shooting proficiency immediately after making a shot differs significantly from their proficiency immediately after missing.

But in fact, that's not the right test!  Seriously, it's not!  But it really really really seems like it is to people whose feelings of correctness have been shaped in accord with the basic logic of probability theory--i.e., to high Numeracy people!  (I myself still can't really accept this even though I accept it!)

That's what Peters says happens when people become more Numerate: they develop affective perceptions attuned to sound inferences from quantitative information.  Those affective perceptions help to alert high Numeracy people to the traps that low Numeracy ones are distinctively vulnerable to.

But they can create their own traps -- they come with their own affective "Sirens," luring the highly Numerate to certain nearly irresistible but wrong inferences....

Holy smokes!

M&S don't make a lot of this particular implication of their paper. That's okay-- they like probability theory, I like cognition!

But they definitely aren't oblivious to it. 

On the contrary, they actually propose-- in a casual way in a footnote (p. 2, n.2)-- a really cool experiment that could be used to test the hypothesis that the "'hot hand fallacy' fallacy" is one that high Numerate individuals are more vulnerable to than low ones:

Similarly, it is easy to construct betting games that act as money pumps while defying intuition. For example, we can offer the following lottery at a $5 ticket price: a fair coin will be flipped 4 times. if the relative frequency of heads on flips that immediately follow a heads is greater than 0.5 then the ticket pays $10; if the relative frequency is less than 0.5 then the ticket pays $0; if the relative frequency is exactly equal to 0.5, or if no flip is immediately preceded by a heads, then a new sequence of 4 flips is generated. While, intuitively, it seems like the expected payout of this ticket is $0, it is actually $-0.71 (see Table 1). Curiously, this betting game may be more attractive to someone who believes in the independence of coin flips, rather that someone who holds the Gambler’s fallacy.

If someone did that study & got the result-- high Numeracy taking the bet more often than low--we'd have "convergent validation" of the inference I am drawing from M&S's paper, which I now am treating (for evidentiary purposes) as part of a case study in how those who know a lot can make distinctive -- spectacular, colossal even! -- errors.

But my whole point is that M&S's paper, by flushing this real-world mistake out of hiding, convergently validates the experimental work of Peters et al.

But for sure, more experiments should be done! Because empirical proof never "proves" anything; it only gives us more reason than we otherwise would have had for believing one thing rather than another to be true....

Two last points: 

1.  The gambler's fallacy is still a fallacy! Coin tosses are independent events; getting "heads" on one flip doesn't mean that one is "less likely" to get "heads" on the next.

The gambler's fallacy concerns the tendency of people mistakenly to treat independent events as non-independent when they make predictions about future events.

The " 'hot hand fallacy' fallacy" -- let's call it--involves expecting that the probability that binary outcomes will immediately recur is the same as the probability that they will occur on average in the sample.  That's a logical error that reflects failing to detect a defect in the inference strategy reflected in the "hot-hand" studies.

Indeed, the same kind of defect in reasoning can explain why the gambler's fallacy is so prevalent -- or at least M&S surmise.

In the world, when we see independent events occurring, we observe or collect data in relatively short bursts -- let's call them “attention span” units (M&S present some data on self-reports of the longest series of coin tosses observed: the mean was a mere 6; strange, because I would have guessed every person flipped a coin at least 1000 times in a row at some point during his or her childhood!). If, in effect, we "sample" all the sequences recorded during “attention span” units, we'll observe that in fact the recurrence of an outcome immediately after it occurred was generally less than the probability it would occur on average.

That's correct.

But it's not correct to infer from such experience that, in any future sequence, the probability of that event recurring will be lower than the probability of it occurring in the first place.  That's the gambler's fallacy.

The "'hot hand fallacy' fallacy" involves not noticing that correcting the logical error in the gambler's fallacy does not imply that if we examine a past sequence of coin tosses, we should expect to observe that "heads" came up just as often immediately after one or more "tails" as it did immediately after one or more "heads."

Ack! I find myself not believing this even though I know it's true!

2. Is "motivated numeracy" an instance of a bias that is more prevalent among high Numeracy persons?

That depends!

"Motivated Numeracy" is the label that my collaborators-- who include Ellen Peters -- & I give to the tendency of individuals who are high in Numeracy to display a higher level of motivated reasoning in analyzing quantitative information.  We present experimental evidence of this phenomenon in the form of a covariance-detection task in which high-Numeracy partisans were more likely to construe (fictional) gun control data in a manner consistent with their ideological predispositions than low-Numeracy partisans.

The reason was that the low-Numeracy subjects couldn't reason well enough with quantitative information to recognize when the data were and weren't consistent with their ideological predispositions.  The high-Numeracy subjects could do that, and so never failed to credit predisposition-affirming evidence or to explain away predisposition-confounding evidence.

But whether that's a bias depends on what you think people are trying to do when they reason about societal risks.  If they are trying to get the "right answer," then yes, Motivated Numeracy is a bias.

But if they are trying to form identity-congruent beliefs for the sake of conveying their membership in and loyalty to important affinity groups, the answer is no; motivated Numeracy is an example of how one can do an even better job of that form of rational information processing if one is high in Numeracy.

I think the latter interpretation is right ... I guess ... hmmmm.... "Now I'm not sure anymore..."

But I am sure that the "hot hand" study authors, and all those who have celebrated their studies, were really trying to get the right answer.

They didn't, because their high Numeracy tempted them to error.

p.s. I'll bet $10^3 against this, but if someone proves the paper wrong, the example of high Numeracy subjects being led to error by an argument only they could be seduced by still holds!


Three points about "believing in" evolution ... a travel report

[Image caption: the colored bars are 0.95 CIs!!]

0. I was ambushed!

Emlen Metz and Michael Weisberg, my fellow panelists at the International Society for the History, Philosophy, and Social Studies of Biology, were lying in wait and bombarded me with a fusillade of counter-proofs and thoughtful alternative explanations! 

For such treachery, they should, at a minimum, compensate me by sharing summaries of their own presentations with the 14 billion readers of this blog, so that subscribers can see for themselves the avalanche of critical reason that crashed down on me.  I am working to exact this settlement.

For my part, I made three points about “believing in” evolution:  one empirical, one political, and one philosophical. (Slides here.)

1. The empirical point was that what people "believe" about evolution doesn’t measure what they know about science but rather expresses who they are, culturally speaking. 

Not a new point for me; I relied primarily on data from The Measurement Problem study to illustrate.

Whipping out my bewildering array of multi-colored item response profiles, I showed that the probability of correctly responding to the NSF Science Indicators Evolution item -- “human beings evolved from an earlier species of animals -- true or false?” -- doesn’t vary in relation to people’s scores on the Ordinary Science Intelligence (OSI) assessment. Instead the probability of responding correctly depends on the religiosity of the test taker.

Indeed, using factor analysis, one can see that the Evolution item doesn’t share the covariance structure of the items that indicate OSI but instead shares that of the items that indicate religiosity.

Finally, I showed how it’s possible to unconfound the Evolution item’s measurement of identity from its measurement of “science literacy” by introducing it with the phrase, “According to the theory of evolution . . . .”

At that point, religious test takers don’t have to give a response that misrepresents who they are in order to demonstrate that they know science’s understanding of the natural history of human beings.  As a result, the gap between responses to the item and the OSI scores of non-religious and religious respondents, respectively, essentially disappears.

Unconfounding identity and knowledge, I noted, is essential not only to assessing understanding of evolutionary science but also to imparting it. The classic work of Lawson and Worsnop (1992; see also Lawson 1999), I told the audience, demonstrates that kids who say they “don’t believe in” evolution can learn the essential elements of the modern synthesis just as readily as kids who say they “do believe in it” (and who are otherwise not any more likely to be able to give a cogent account of natural selection, genetic variance and random mutation).

But because what one says one “believes” about evolution  is in fact not an indicator of knowledge but an indicator of identity, teaching religiously inclined students how the theory of evolution actually works doesn’t make them any more likely to profess “acceptance” of it.

Indeed, Lawson stresses that the one way to assure that more religiously inclined students won’t learn the essential elements of evolutionary science is to make them perceive that the point of the instruction is to change their “beliefs”: when people are put in the position of having to choose between being who they are and knowing what’s known by science, they will predictably choose being who they are, and will devote all of their formidable reasoning proficiencies to that.

The solution to the measurement problem posed by people's "beliefs in" evolution, then, is the  science communication disentanglement principle: “Don’t  make reasoning, free people choose between knowing what’s known & being who they are.”

2.  The political point I made was the imperative to enforce the science communication disentanglement principle in every domain in which citizens acquire and make use of scientific information.

Liberal market democracies are the form of society distinctively suited both to the generation of scientific knowledge and to the protection of free and reasoning individuals' formation of their own understandings of the best way to live.

In my view, the citizens of such states have the individual right to enjoy both of these benefits without having to trade off one for the other.   To secure that right, liberal democratic societies must use the science of science communication to repel the dynamics that conspire to make what science knows a focal point for cultural status competition (Kahan in press).

Here  I focused on the public controversy over climate change.

Drawing on Measurement Problem and other CCP studies (Kahan, Peters, et al. 2012), I showed that what “belief in” human-caused climate change measures is not what people know but who they are as well.

The typical opinion poll item on “belief in” climate change, this evidence suggests, is also a valid indicator of the sort of latent cultural identity indicated variously by cultural cognition worldview items and conventional “right-left” political outlook ones.

People with those identities don’t converge but rather polarize as their OSI scores increase.

Using techniques derived from unconfounding identity and knowledge in the assessment of what people understand about evolution, one can fashion an assessment instrument—the “Ordinary Climate Science Intelligence” (OCSI) test—that unconfounds people's identity from what they understand about the causes and consequences of climate change.

They don’t understand very much, it turns out, but they get the basic message that climate scientists are conveying: human activity is causing climate change and putting all of us at immense risk.

Nevertheless those who score the highest on the OCSI still are the most politically polarized on whether they “believe in” human climate change—because the question they are answering when they respond to a survey item on that is “who are you, whose side are you on?”

To enable people to acquire and make use of the knowledge that climate scientists are generating, science communication researchers are going to have to do the same sort of hard & honest work that education researchers did to figure out how to disentangle knowledge of evolutionary science from identity.

But they're going to need to figure out how to do that not only in the classroom but also in the democratic political realm.  The science communication environment is now filled with toxic meanings that force people in their capacity as democratic citizens to choose between knowing what’s known about climate and being who they are.

Because individuals forced to make that choice will predictably--rationally-- use their reasoning proficiencies to express their identities, culturally diverse citizens will be unable to make collective decisions informed by what science knows about climate change until the disentanglement project is extended to our public discourse.

Indeed, conflict entrepreneurs (posing as each other's enemy as they symbiotically feed off one another's noxious efforts to stimulate a self-reinforcing atmosphere of contempt among rival groups) continue to pollute our science communication environment with antagonistic cultural meanings on evolution as well. 

Those who actually care about making it possible for diverse citizens to be able to know what’s known by science without having to pay the tax of acquiescing in others' denigration of their cultural identities are obliged to oppose these tapeworms of cognitive illiberalism no matter “whose side” they purport to be on in the dignity-annihilating, reason-enervating cultural status competition in which positions on climate change & evolution have been rendered into tribal totems.

3. The philosophical point was the significance of cognitive dualism.

Actually, cognitive dualism is not, as I see it, a philosophical concept or doctrine. 

It is a conjecture, to be investigated by empirical means, on what is “going on in heads” of those who—like the Pakistani Dr and the Kentucky Farmer—both “believe” and “disbelieve” in facts like human evolution and human-caused climate change.

But what the tentative and still very formative nature of the conjecture shows us, in my view, is just how much in need  the disentanglement project is of philosophers' help.

In the study of “beliefs” in evolution, cases like these are typically assumed to involve a profound cognitive misfire. 

The strategies skillful science teachers use to disentangle knowledge from identity in the classroom, far from being treated as a solution to a practical science communication dilemma, are understood to present us with another “problem”—that of the student who “understands” what he or she is taught but who will not “accept” it as true.

In my view, the work that reflects this stance is failing to engage meaningfully with the question of what it means to "believe in" evolution, climate change etc.

The work I have in mind simply assumes that “beliefs” are atomistic propositional stances identified by reference to the states of affairs (“natural history of humans,” “rising temperature of the globe”) that are their objects.

In this literature there is no cognizance of an alternative view—one with a rich tradition in philosophy (Peirce 1877; Braithwaite 1932, 1946; Hetherington 2011)—of “beliefs” as dispositions to action.  

Haven't figured out yet what to get Kentucky Farmer for X-mas? Here's a hint!

On this account, beliefs as mental objects always inhere in clusters of intentional states  (emotions, values, desires, and the like) that are distinctively suited for doing particular things.

The Pakistani Dr’s belief in evolution is integral to the mental routines that enable him to be (and take pride in being) a Dr; his disbelief in it is part of a discrete set of mental routines that he uses to be a member of a particular religious community (Everhart & Hameed 2013).  The Kentucky Farmer disbelieves in “human caused climate change” in order to be a hierarchical individualist but believes in it—indeed, excitedly downloads onto his iPad custom-tailored predictions based on the same "major climate-change models ... under constant assault by doubters" in order to be a successful farmer.

If as mental objects “beliefs” exist only as components of more elaborate ensembles of action-enabling mental states, then explanations of the self-contradiction or "self-deception" of the Pakistani Dr, Kentucky Farmer--or of the creationist high school student who wants to be a veterinarian but "loves animals too much" to simply "forget" what she has learned about natural selection in her AP biology course-- are imposing a psychologically false criterion of identity on the contents of their minds.

So long as there is no conflict in the things that these actors are enabled to do with the clusters of mental states in which their opposing stances toward evolution or toward climate change inhere, there is no "inconsistency" to explain.

There is also no “problem” to "solve" when actors who use their acceptance of what science knows to do what scientific knowledge is uniquely suited for don't "accept" it in order to do something on which science has nothing to say.  

Unless the "problem" is really that what they are doing with nonacceptance is being the kind of person whose behavior or politics or understandings of the best way to live bother or offend us.  But if so, say that -- & don't confuse matters by suggesting that one's goals have anything to do with effectively communicating science.

Or at least that is what the upshot of cognitive dualism would be if in fact it is the right account of the Pakistani Dr, and the Kentucky Farmer, and the many many many other people in whose mental lives such "antinomies" coexist.

Of course,  it doesn’t bother me that cognitive dualism is not now the dominant explanation of “who believes what” about evolution or climate change and “why.”

But what does is the innocence of those who are studying these phenomena of the very possibility that the account of "belief" of which cognitive dualism is a part might account for what they are investigating, a state of inattention that assures that they will fail to conduct valid empirical research-- and fail to reflect consciously on the moral significance of their prescriptions.

This is exactly the sort of misadventure that philosophers ought to protect empirical researchers from experiencing, I told the roomful of curious and reflective people who paid us the privilege of attending our session and sharing their views on our research.

And for the first time in all my experiences introducing people to the Pakistani Dr and the Kentucky Farmer, no one seemed to disagree with me . . . .


Braithwaite, R.B. The nature of believing. Proceedings of the Aristotelian Society 33, 129-146 (1932).

Braithwaite, R.B. The Inaugural Address: Belief and Action. Proceedings of the Aristotelian Society, Supplementary Volumes 20, 1-19 (1946).

Everhart, D. & Hameed, S. Muslims and evolution: a study of Pakistani physicians in the United States. Evo Edu Outreach 6, 1-8 (2013).

Hetherington, S.C. How to know : a practicalist conception of knowledge (J. Wiley, Chichester, West Sussex, U.K. ; Malden, MA, 2011).

Kahan, D.M. Climate-Science Communication and the Measurement Problem. Advances in Political Psychology 36, 1-43 (2015).

Kahan, D.M. What is the science of science communication? J. Sci. Comm. (in press).

Kahan, D.M., Peters, E., Wittlin, M., Slovic, P., Ouellette, L.L., Braman, D. & Mandel, G. The polarizing impact of science literacy and numeracy on perceived climate change risks. Nature Climate Change 2, 732-735 (2012).

Lawson, A.E. & Worsnop, W.A. Learning about evolution and rejecting a belief in special creation: Effects of reflective reasoning skill, prior knowledge, prior belief and religious commitment. Journal of Research in Science Teaching 29, 143-166 (1992).

Lawson, A.E. A scientific approach to teaching about evolution & special creation. The American Biology Teacher, 266-274 (1999).

Peirce, C.S. Philosophical Writings of Peirce, The Fixation of Belief. Popular Science Monthly (1877).


In Montreal, asking philosophers (& others) to help make sense of Pakistani Drs & Kentucky Farmers

As the 1.4 x 10^10 regular readers of this blog know, I've committed this summer to visiting every country in the world to introduce people to the Pakistani Dr and the Kentucky Farmer.  

So I've done UK (England, Wales) & France.  

Today is Canada.  

Next week Macao (CCP headquarters-- ironic that I haven't even done a talk there on this topic!).  

Then N. Korea. Then Netherlands Antilles, & after that Las Vegas.  

Then I'll be all done!

I'll be giving today's talk as my contribution to this really cool panel:

This is a great venue for discussing the Pakistani Dr & Kentucky Farmer b/c I think philosophers really need to get in on this issue.  I'm convinced the empirical study of "belief/disbelief in" both evolution & climate change is being impeded by a failure to engage reflectively with the concept of "belief," & that philosophers are best situated to help empiricists see that. 

Maybe I'll have something more to say about this event "tomorrow."

Anyway, turns out Manny & Krista are attending this conference, too!

Total coincidence--they just came b/c Krista really "likes learning about this stuff" and because Manny had nothing else to do, having refused to get a job for the summer to protest the failure of the U.S. to furnish free universal college education.



Ambivalence about "messaging"

State of the art "messaging" 2008

From correspondence with a reflective person & friend who asked my opinion on how one might use "message framing" to promote public engagement with specific climate-mitigation policies:

A couple of things occur to me; I hope they are not completely unhelpful.

1. I think one has to be cautious about both the external & operational validity of "messaging" & "framing" studies in this area.  

The external validity concern goes to the usual problem w/ measuring public opinion on any particularly specific public policy proposal: there's likely no opinion to measure.  

People have a general affective orientation toward climate change. You'll know you are measuring it if the responses they give to what you are asking them are highly correlated with what they say they "believe" about climate change. 

But people know essentially nothing about climate change in particular.  For or against it (as it were), they will say things like "human carbon emissions are expected to kill plants in greenhouses." Seriously.

Accordingly, if you start asking them specific things about policy, very soon you'll no longer be measuring the "thing" inside them that is their only true attitude toward climate change.  This is what makes it possible [for some researchers] to say ridiculous things like "70% of Republicans want to regulate carbon emissions!" when only 25% of Republicans say "yes" to the question "are human beings causing climate change."  What’s being measured with the policy questions is a non-opinion.

In sum, the point is, as soon as you get into specifics about policy, you'll be very uncertain what you are measuring, & as a result whether you are learning something about how opinion works in the real world.

I'm not saying that it's impossible to do studies like the one you are proposing, only that it's much easier to do invalid than valid ones.  Likely you are nodding your head saying "yes, yes, I know..."

The "operational validity" point has to do with the translation of externally valid lab studies of how people process information on these issues into real-world communication materials that will effectively make use of that knowledge.  

To pick on myself for a change, I'm positive that our framing study on "geoengineering" & open-minded assessment of climate science has "zero" operational validity.  

I do think it was internally & externally valid: that is, I think the design supported the inference we were drawing about the results we were observing in the experiment, and that the experiment was in turn modeling a mechanism of information-processing that matters for climate-science communication outside the lab.

But I don't think that anything we learned in the study supports any concrete form of "messaging." For sure it would be ridiculous, e.g., to send our study stimulus to every white hierarchical individualist male & expect climate skepticism to disappear!  

There almost certainly is something one can do in the real world that will reproduce the effects that we observed in the lab.  But what that is is something one would have to use empirical methods, conducted in the field & not the lab, to figure out.

Knowing you, you are likely planning to test communication materials that will be actually used in the real-world, and in a way that will give you & others more confidence or less to believe that one or another plausible strategy will work (that's what valid studies do of course!).

But I feel compelled to say all of this just b/c I know so many people don't think the way you do -- & b/c I am genuinely outraged at how many people who study climate-science communication refuse to admit what I just said, and go around making empirically insupportable pronouncements about "what to do" (here’s what they need to do: get off their lazy asses & do some field research).

Definitely a PR coup for the organization that dreamed up this plan, but what is the "message" people get when they read (or are told about) a NY Times story that applauds a clever strategy to "message" them?

2.  I myself have become convinced that "messaging" is not relevant to climate-change science communication.  Or at least that the sort of "messaging" people have in mind when they do framing studies, & then propose extravagant social marketing campaigns based on them, is not.

For "messaging" to work, we have to imagine either one of 2 things to be true.  The first is that there is some piece of information that people are getting "wrong" about climate change & will get right if it is "framed" properly.

But we know that there is zero correlation between people's positions on climate change & any information relating to it.  Or any information relating to it other than "this is my side's position, & this theirs."  And they aren't wrong at all, sadly, about that information.

State of the art 2014...

The second thing we might imagine, then, is that a "messaging" campaign featuring appropriately selected “messengers” could change people's assessment of what "their side's" position is.  

I don't believe it.  

I don't believe it, first, because people aren't that gullible: they know people are trying to shape that understanding via "messaging" (in part b/c the people doing it are foolish enough to discuss their plans within earshot of those whose beliefs they are trying to “manage” in this way).  

I don't believe it, second, b/c it's been tried already & flopped big time.

There have been multiple "social marketing campaigns" that say, "see? even Republicans like you believe in climate change & want to do something! Therefore you should feel that way or you'll be off  the team!" 

There has been zero purchase.  Probably b/c people just aren't gullible enough to believe stuff like that when they live in a world filled with accurate information about what "their side" "believes."

To make progress, then, you have to go into their world & show them something that's true but obscured by the pollution that pervades our science communication environment: that "their side" already is engaging climate change in a way that evinces belief in the science & a resolve to do something.  

That's the lesson of SE Fla "climate political science ..."    I've seen that in action.  It really really really does work.  

But it really really really doesn't satisfy the motivations of those who want to use the climate change controversy to gratify their appetite to condemn those who have different cultural values from theirs as evil and selfish.  So its successes get ignored, its power to reconfigure the political economy of climate change in the U.S. never tapped.

As always, & as you know, this is what I think for now.  One knows nothing unless one knows it provisionally w/ a commitment to revising based on new evidence. You are the sort of person I know full well will produce evidence, on a variety of things, that will enable me to update & move closer to truth.

But for now, I think the truth is that "messaging" (as normally understood) isn't the answer.


For the 10^6 time: GM foods is *not* a polarizing issue in the U.S., plus an initial note on Pew's latest analysis of its "public-vs.-scientists" survey

Keith Kloor asked me whether a set of interesting reflections by Mark Lynas on social and cultural groundings of conflict over GM food risks in Europe generalize to the U.S.

The answer, in my view, is: no.

In Europe, GM food risks are a matter of bitter public controversy, of the sort that splinters people of opposing cultural outlooks (Finucane 2002).

But as scholars of risk perception are fully aware (Finucane & Holup 2005), that ain't so in the U.S.


These data come from the study reported in Climate-Science Communication and the Measurement Problem, Advances in Pol. Psych. (2015).

But there are tons more where this came from.  And billions of additional blog posts in which I've addressed this question! Including:

I'm pretttttttttty sure, in fact, that Keith was "setting me up," "throwing me a softball," "yanking my chain" etc-- he knows all of this stuff inside & out.

One of the things he knows is that general population surveys of GM food risks in the US are not valid.

Ordinary Americans don't have any opinions on GM foods; they just eat them in humongous quantities.

Accordingly, if one surveys them on whether they are "afraid" of "genetically modified X" -- something they are likely chomping on as they are being interviewed but in fact don't even realize exists-- one ends up not with a sample of real public opinion but with the results of a weird experiment in which ordinary Americans are abducted by pollsters and probed w/ weird survey items being inserted into places other than where their genuine risk perceptions reside.

Pollsters who don't acknowledge this limitation on public opinion surveys -- that surveys presuppose that there is a public attitude to be measured & generate garbage otherwise (Bishop 2005) -- are to legitimate public opinion researchers what tabloid reporters are to real science journalists.

A while back, I criticized Pew, which is not a tabloid pollster operation, for resorting to tabloid-like marketing of its own research findings after it made a big deal out of the "discrepancy" between "public" and "scientist" (i.e., AAAS member) perceptions of GM food risks.

So now I'm happy to note that Pew is doing its part to try to disabuse people of the persistent misconception that there is meaningful public conflict over GM foods in the U.S.

It issued a supplementary analysis of its public-vs.-AAAS-member survey, in which it examined how the public's responses related to individual characteristics of various sorts:

As this graphic shows, neither "political ideology" nor "religion" -- two characteristics that Lynas identifies as important for explaining conflict over GM foods in Europe -- is meaningfully related to variance in perceptions of GM food risks in the U.S.

Pew treats "education or science knowledge" as having a "strong effect." 

I'm curious about this.

I know from my own analyses of GM food risks that even when one throws every conceivable individual predictor at them, only the tiniest amount of variance is explained.

In other words, variation is mainly noise.

click for regression analysis of gm food risk perceptions... yum!

One can see from my own data above that science comprehension, as measured by the "ordinary science intelligence test," reduces risk perceptions (for both right-leaning and left-leaning respondents).

But the pct of variance explained (R^2) is less than 2% of the total variance in the sample. It's a "statistically significant" effect but for sure I wouldn't characterize it as "strong"!
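
To make the arithmetic behind that last point concrete, here is a minimal sketch in Python. It is not the regression reported in this post: the data are simulated, and the sample size, the slope, and the names `simulate_small_effect` and `ols_summary` are all assumptions chosen only for illustration. What it shows is how, in a big enough sample, a predictor can clear the conventional "statistical significance" threshold by a wide margin while still explaining only around 1% of the variance.

```python
import numpy as np


def simulate_small_effect(n=20000, slope=-0.1, seed=0):
    """Simulate a weak linear effect (hypothetical values, not survey data)."""
    rng = np.random.default_rng(seed)
    # Stand-in for a standardized science comprehension score.
    x = rng.standard_normal(n)
    # Stand-in for a risk perception rating: small slope plus unit-variance noise.
    y = slope * x + rng.standard_normal(n)
    return x, y


def ols_summary(x, y):
    """Return (slope, t statistic, R^2) for a one-predictor least-squares fit."""
    n = len(x)
    x_c = x - x.mean()
    y_c = y - y.mean()
    sxx = np.sum(x_c ** 2)
    slope = np.sum(x_c * y_c) / sxx
    resid = y_c - slope * x_c
    sse = np.sum(resid ** 2)          # unexplained sum of squares
    sst = np.sum(y_c ** 2)            # total sum of squares
    r2 = 1.0 - sse / sst              # share of variance explained
    se = np.sqrt(sse / (n - 2) / sxx)  # standard error of the slope
    return slope, slope / se, r2


if __name__ == "__main__":
    x, y = simulate_small_effect()
    slope, t, r2 = ols_summary(x, y)
    print(f"slope = {slope:.3f}, t = {t:.1f}, R^2 = {r2:.4f}")
```

In a one-predictor regression R^2 equals t^2 / (t^2 + n - 2), so with enough respondents the t statistic can sail past 1.96 while R^2 stays tiny. "Significant" and "strong" are answers to different questions.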

I looked at Pew's own account of how it determined its characterizations of effects as "strong" & have to admit I couldn't understand it.

But with its characteristic commitment to helping curious and reflective people learn, Pew indicates that it will furnish more information on these analyses on request.

So I'll make a request, & figure out what they did.  Wouldn't be surprised if they figured out something I don't know!

Stay tuned...


Bishop, G.F. The illusion of public opinion : fact and artifact in American public opinion polls (Rowman & Littlefield, Lanham, MD, 2005).

Finucane, M.L. Mad cows, mad corn and mad communities: the role of socio-cultural factors in the perceived risk of genetically-modified food. P Nutr Soc 61, 31-37 (2002). 

Finucane, M.L. & Holup, J.L. Psychosocial and cultural factors affecting the perceived risk of genetically modified food: an overview of the literature. Soc Sci Med 60, 1603-1612 (2005).



Two publics, two modes of reasoning, two forms of information in science communication: a fragment . . .

From something I'm working on . . .

Members of the public vary in the mode of reasoning they use to engage information on decision-relevant science. To be sure, many—including not just official decisionmakers but leaders of important stakeholder groups, media professionals, and also ordinary citizens of high civic engagement—apply their reason to making informed judgment about science content.  Evidence-based methods (Kahan 2014; Han & Stenhouse 2014) are essential to anticipating how affect, numeracy, and cultural cognition interact when these "proximate information evaluators" assess scientific information (Peters, Burraston & Mertz 2004; Dieckmann, Peters & Gregory 2015; Slovic, Finucane et al. 2004; Kahan, Peters et al. 2012).

Most members of the public, however, use a different reasoning strategy to assess the validity and consequence of decision-relevant science. Because everyone (even scientists, outside of their own domain) must accept as known by science much more than they could possibly comprehend on their own, individuals—all of them—become experts at using social cues to recognize valid science of consequence to their lives (Baron 1993).

The primary cue that these "remote information evaluators" use consists not  in anything communicated directly by scientists or other experts. Instead, it consists in the confidence that other ordinary members of the public evince in scientific knowledge through their own words and actions. The practical endorsement of science-informed practices and policies by others with whom individuals have contact in their everyday lives and whom they regard as socially competent and informed furnishes ordinary members of the public with a reliable signal that relying on the underlying science is “the sensible, normal thing to do” (Kahan 2015).

Much of the success of the Southeast Florida Regional Climate Compact in generating widespread public support for the initiatives outlined in its Regional Climate Action Plan reflects the Compact’s success in engaging this mode of public science communication. Because so many diverse private actors—from business owners to leaders of prominent civic organizations to officers in neighborhood resident associations—participated in the planning and decisionmaking that produced the RCAP, the process the Compact used created a science communication environment amply stocked with actors who play this certifying role in the diverse opinion-formation communities in which "remote evaluators" exercise this rational form of information processing (Kahan 2015).

As was so in Southeast Florida, evidence-based methods are essential for effective transmission of information to "remote evaluators." In particular, communicators must take steps to protect the science communication environment from contamination by antagonistic cultural meanings, which predictably disable the rational faculties ordinary citizens use to recognize the best available evidence (Kahan 2012). . . .


Baron, J. Why Teach Thinking? An Essay. Applied Psychology 42, 191-214 (1993).

Dieckmann, N.F., Peters, E. & Gregory, R. At Home on the Range? Lay Interpretations of Numerical Uncertainty Ranges. Risk Analysis (2015).

Han, H. & Stenhouse, N. Bridging the Research-Practice Gap in Climate Communication: Lessons From One Academic-Practitioner Collaboration. Science Communication, 1075547014560828 (2014).

Kahan, D.M., Peters, E., Wittlin, M., Slovic, P., Ouellette, L.L., Braman, D. & Mandel, G. The polarizing impact of science literacy and numeracy on perceived climate change risks. Nature Climate Change 2, 732-735 (2012).

Kahan, D.M. Climate-Science Communication and the Measurement Problem. Advances in Political Psychology 36, 1-43 (2015).

Kahan, D.M. Making Climate-Science Communication Evidence-Based—All the Way Down. in Culture, Politics and Climate Change (ed. M. Boykoff & D. Crow) 203-220 (Routledge Press, New York, 2014).

Kahan, D. Why we are poles apart on climate change. Nature 488, 255 (2012).

Peters, E.M., Burraston, B. & Mertz, C.K. An Emotion-Based Model of Risk Perception and Stigma Susceptibility. Risk Analysis 24, 1349-1367 (2004).

Slovic, P., Finucane, M.L., Peters, E. & MacGregor, D.G. Risk as Analysis and Risk as Feelings: Some Thoughts About Affect, Reason, Risk, and Rationality. Risk Analysis 24, 311-322 (2004).


Self-deception at L'université Toulouse: an encore!

I offered a report on my presentation at the fun "self-deception" symposium sponsored by the Institute for Advanced Study at L'université Toulouse Capitole (UT Capitole). I also described my ambivalence toward characterizing identity-protective cognition--the species of motivated reasoning that is at work in public conflict over societal risks & like facts-- as a form of "self-deception."

These reflections have now inspired/provoked a report from another of the symposium participants, Joël van der Weele, who presented really cool study results on the dynamics of self-deception in job interviewing.  In addition to summarizing the study highlights, Joël's post widens the lens to take in how "self-deception" has figured more generally in the study of behavioral economics.  Having read & reflected on the post, I would definitely now qualify my own ambivalence. I think "self-deception" fits more comfortably when the "self" is the object as well as the subject of the asserted "deception" than it does when the objects are societal risks.... But I'm perplexed, which is good!

Strategic self-deception

Joël van der Weele

 (with thanks to Peter Schwardmann for input)

Like Dan, I attended the workshop on self-deception in Toulouse, and like Dan, I will focus on my own talk. Unlike Dan, my viewpoint is that of a behavioral economist, with associated convictions and blind spots, of which I am happy to be reminded.

Joël van der Weele, steely resisting self-deception

Most of the empirical literature on motivated cognition and self-deception is focused on establishing the existence of this phenomenon. Social psychologists in particular have made great progress in showing that people will systematically bias their beliefs and their information processing in a self-serving manner, and end up believing that they are smarter, nicer and more beautiful than they really are, and that the world is a safer, more just and more manageable place than it really is.

As usual, behavioral economists arrived at this research area a few decades after the psychologists, and are now confirming some of these results in economic contexts, using their own experimental and theoretical paradigms. While they have questioned whether some of the overconfidence evidence is really inconsistent with rationality (Benoît and Dubra, 2011), they also find that much of it seems to be a truly self-serving bias.

At the workshop, several talks were dedicated to summarizing or adding to the evidence of when and where this kind of motivated cognition may occur, for example in the domain of information seeking about stock performance (George Loewenstein), scientific but politicized beliefs about gun control and climate change (Dan Kahan), trust in others (Roberto Weber), and self-inferences from test scores (Zoë Chance).

At the same time, economic studies are showing that overconfidence is expensive. Both in real world data (Barber and Odean, 2000) and experiments (Biais et al. 2005), traders who are overconfident tend to trade too much and make less money. There is more anecdotal evidence from other domains: I am sure that you all know people who think they are really good at something they really are not that good at, with embarrassing or painful results.

Given these costs, why would people deceive themselves? A popular account in both psychology and economics is that people simply like to think well of themselves, or like to think that things turn out well for them in the future, but this is not a very satisfactory explanation. Why wouldn’t evolution or the market take care of those sentimental souls in favor of more hard-boiled types? Where, in other words, are the material benefits that self-deception can bring?

The answer to this question is still mostly in the hands of theorists. Roland Bénabou, who gave the opening talk at the conference, has, together with his co-author Jean Tirole, proposed an explanation in terms of motivation (Bénabou and Tirole, 2002). If people suffer from laziness or have other difficulties in seeing through their plans, overconfidence may be a helpful `anti-bias’ that gets them out of their seat and into action. I don’t know of experiments testing this idea, but if you can help me out I am happy to hear of some. 

Another influential idea has been put forward by a biologist, Robert  Trivers, in several publications since the mid ‘80s (most prominently Von Hippel and Trivers, 2011),  including this book. Trivers argues that self-deception enables you to better deceive others and thus achieve social gain. If you truly believe you are great, you will do a much better job at convincing others that you are. This will help you impress potential sexual partners, achieve sales, land jobs, etc. Self-deception is useful because if you are not aware of lying about being great, you’ll be less likely to feel bad about your deception, give yourself away or face retribution in case of subsequent failure to live up to your proclaimed greatness.

This hypothesis is strikingly consistent with the folk wisdom peddled in the popular self-help literature. Just search for “success” and “confidence” on Amazon, and you will find a score of books telling you that if you just believe in yourself (no matter the evidence), riches will soon be yours. While this may be true of the authors of these books, the kind of evidence that is cited in this literature is not very convincing to someone trained in scientific inference (“Look at Person X, she’s confident and rich. So if you become as confident as X, you’ll surely be rich.”).

So my co-author Peter Schwardmann and I decided to subject the folk wisdom to a proper experimental test. We got about 300 people to the lab to perform a cognitively challenging task.  We then split the group in two. Our treatment group was told that they would be able to earn about 15 euros ($17) if they could persuade others in a face-to-face “interview” that they were amongst the top performers in the task. The control group was not told anything.

Before actually conducting the interviews, participants in both groups then privately report their beliefs about the likelihood of being in the top half of performers on the task, where we pay them for submitting an accurate belief. We find that treatment and control group are both overconfident on average, with the average belief of being in the top half being 60%, i.e. 10 percentage points higher than the true number.

In line with Trivers’ hypothesis, the shadow of future interactions increases overconfidence by about 50%, from 8% to 12%. This effect does not go away after we give participants some noisy information about their actual performance, as the prospect of future deception responsibilities also reduces responsiveness to new information about performance. Thus, anticipation of future deception opportunities indeed causes a more optimistic self-assessment amongst our participants, a case of strategic self-deception.
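The arithmetic behind these figures can be laid out in a short sketch. It assumes that "from 8% to 12%" means percentage points of belief above the true 50% benchmark, which is a reading of the post and not something the study reports in this form; the variable names are illustrative only.

```python
# Assumption: overconfidence is measured in percentage points above the
# true share of participants in the top half (50% by construction).
TRUE_SHARE_TOP_HALF = 50.0

control_overconfidence = 8.0     # percentage points, figure quoted in the post
treatment_overconfidence = 12.0  # percentage points, figure quoted in the post


def relative_increase(baseline, treated):
    """Relative change of `treated` over `baseline`, in percent."""
    return 100.0 * (treated - baseline) / baseline


# Implied average stated beliefs of being in the top half.
control_belief = TRUE_SHARE_TOP_HALF + control_overconfidence
treatment_belief = TRUE_SHARE_TOP_HALF + treatment_overconfidence

# The treatment effect expressed two ways.
effect_points = treatment_overconfidence - control_overconfidence
effect_relative = relative_increase(control_overconfidence, treatment_overconfidence)

print(control_belief, treatment_belief, effect_points, effect_relative)
```

On this reading, the "about 50%" increase is a relative change in overconfidence, while the shift in stated belief itself is 4 percentage points.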

Our next question was whether self-deception paid off in the interview phase, i.e. whether increased confidence made a participant more likely to be flagged as a good performer, conditional on real performance. The interactions followed a speed-dating protocol, where we promoted the control group to interviewers, tasked with assessing the performance of the treatment group.

The results in this phase of the experiment crucially depend on the details of the environment. We had given some of the interviewers a short tutorial in lie-detection. It turned out that these interviewers were pretty good at spotting the true good performers and the self-deceptive strategies of the interviewees were ineffective. Against untrained interviewers, however, the average level of self-deception in our experiment (i.e. the increase in overconfidence of our treatment group) led to a substantial increase in the chance of being flagged as a top performer and the associated earnings. 

All of this is somewhat preliminary, as we are currently refining results and putting them on paper. As far as we know, there are no other studies showing causal evidence for strategic self-deception in social contexts, although some are suggestive of it (Burks et al. 2013, Charness et al. 2014). If this finding holds up in a wider array of settings, we may find that the pop psychology literature is not that wrong after all.


Barber, B. M. and T. Odean. 2000. "Trading is hazardous to your wealth: Common stock investment performance of individual investors", Journal of Finance 55, 773-806.

Bénabou, Roland and Jean Tirole. 2002. “Self-confidence and Personal Motivation”, Quarterly Journal of Economics, 117:3, 871-915.

Benoît, J.P. and J. Dubra. 2011. “Apparent Overconfidence”, Econometrica, 79:5, 1591-625.

Biais, B., D. Hilton, K. Mazurier and S. Pouget. 2005. “Judgemental overconfidence, self-monitoring, and trading performance in an experimental financial market”, Review of Economic Studies, 72:2, 287-311.

Burks, S. V., J. P. Carpenter, L. Goette and A. Rustichini. 2013. “Overconfidence and Social Signaling”, Review of Economic Studies, 80:3, 949-983.

Charness, G., A. Rustichini, and J. van de Ven. 2014. “Self-confidence and strategic behavior”, Amsterdam University mimeo.

Von Hippel, William, and Robert Trivers. "The evolution and psychology of self-deception." Behavioral and Brain Sciences 34.01 (2011): 1-16.



On the provisionality & conjectural status of claims about Pakistani Drs & Kentucky Farmers

This is a response to a friend & scholar who wrote to me with some responses to "yesterday's" post on identity-protective reasoning & self-deception.  In the response, I found myself being clearer than I usually am in my posts about the tentative & conjectural status of the views I have been advancing about "cognitive dualism"--the state in which an actor appears to entertain opposing states of belief within bundles or ensembles of action-enabling mental routines that are summoned for discrete activities.  

So I'm posting this portion of my response, both to remedy the failure to be as consistently clear as I should be that "cognitive dualism" is a conjecture and to create a "location" for this qualification when I have occasion to discuss this concept in the future & wish to emphasize what my attitude actually is about its status as an explanation for certain intriguing phenomena.

* * *

Thanks for the feedback & by all means feel free to share any portions of the post with others who you think might find the ideas expressed & arguments advanced to be of value. 

On the "belief/disbelief" issue: I should start by saying that my views on this are certainly very provisional. This is always true, at least for anyone who knows how empirical proof works and is committed to treating it as his or her guide for enlarging knowledge.  But in this case, my intuitions are way out in front of my evidence; I am eager to lessen the gap.

I am drawn to this by two types of observations. The first is the results of a study in which I tried to develop a climate-change knowledge assessment that unconfounded the "affective identity" measured by most questions about "belief in" climate change from genuine knowledge.  The results of that study suggested, not surprisingly, that there is essentially no correlation between understanding of the basic mechanisms of climate science (ones relating to causes or consequences) and "beliefs in" it (whether it is happening, human caused, etc.); the latter are simply indicators of identity of the same nature as response to political outlook questions.

The thing that disoriented me was what to make of the finding that the individuals who scored highest on the assessment (& who also scored highest on a general science knowledge assessment) were also the most polarized. They obviously "know" what the best evidence is & yet say they "believe" or "disbelieve" in a manner that indicates their political identity.  What is going on in their heads-- I asked myself this & was asked the question over & over again by many curious & reflective people.

So I tried to come up with a taxonomy of explanations, one of which was the "cognitive dualism" explanation.

This account -- which is based on various general sources on the nature of belief & action but also on specific investigations of "disbelief in" evolution among people who use such knowledge professionally -- starts with a psychological conception of "beliefs" as "dispositions to action."  It then proceeds to the proposition that beliefs of opposing valences can be bundled into discrete complexes of intentional states suited for doing distinct things-- like being a good Muslim & a Dr; or being a good Hierarch individualist & a good farmer; or being a good cosmologist & a good mother.  Yes, the "beliefs" that are elements of the discrete bundles "conflict" as propositional assertions; but as mental objects, they don't exist independently of the action-enabling ensembles of mental states of which they are a part.  If those don't conflict, then there is no practical, experienced contradiction.  The criterion of identity that is used to individuate the "beliefs" & find contradiction in them is one that is alien to the psychology of the actor & likely to confuse us about how that person's reason works.

You ask about what happens when the actions that are enabled do conflict.  I want to say that is in fact an entirely different sort of phenomenon or set of mental dynamics.  In the taxonomy, it would be "compartmentalization," which refers to the conscious, effortful separation of contradictory action-enabling beliefs & associated mental states in the mind of the same actor.  Think of the closeted gay person who belongs to a religious group that persecutes gays, e.g.  This is a form of dissonance avoidance.  It is distinct from what happens with "cognitive dualism."  It is not what is going on, I think, in the case of the Pakistani Dr or the Kentucky Farmer (or his prospective veterinarian daughter).

It is also not what is going on, in my view, in South East Florida.  My experience there in doing field-based science communication studies is the second source of my interest in this issue.

There I see people who "don't believe in" climate change when they are being who they are as members of cultural groups, but who do when they are deliberating as citizens about what to do in their local political communities to try to protect their way of life from impending climate impacts.  I think they are enabled to do this by cognitive dualism.  But I think they are enabled to pursue the cognitive dualism strategy only as a result of astute leaders who create an environment in which there isn't conflict in being who they are and using what they know in their local political life...  This is a very profound accomplishment in my view, one I discuss in the same paper that presents the results of the climate-science comprehension assessment instrument.

I am now in the course of designing studies that bear down more on this phenomenon, that try to conjure the observations that would give us more reason or less to credit one or another of the candidate accounts (which are not limited to "cognitive dualism" & "compartmentalization") of what is "going on in their heads."

And I am eager for feedback-- even if quite critical, since I agree that there is more than one plausible account of what is going on & those who are drawn to accounts different from the one I find most consistent with what I've already seen can help me to identify what sorts of observations it would be helpful to make to decide the relative strength of the competing explanations.


Travel report: Self-deception at L'université Toulouse

I attended a great conference on "self-deception" sponsored by the Institute for Advanced Study at L'université Toulouse Capitole (UT Capitole).

The concept of "self-deception" encompasses forms of information-processing that predictably bias individuals' beliefs toward some self-serving end or goal.

The main theoretical/scholarly issues are two: first, whether "self-deception" is at least under some circumstances "rational" or in any case beneficial to those who engage in it; and second, whether there is a cogent psychological mechanism that could explain the feasibility of this sort of rational or "adaptive" self-deception, given that presumably it is self-defeating to pursue such a state consciously (b/c if one knows one is deceiving oneself, one will not be deceived into subscribing to the false belief).

We heard many interesting takes on these questions.

I myself gave a talk on "Motivated System 2 Reasoning." 

Slides here.

I made two principal points. 

First, contrary to the dominant decision-science and political science accounts, identity-protective cognition --the species of motivated reasoning that generates political polarization on decision-relevant science -- is not a consequence of over-reliance on heuristic or "system 1" information processing; indeed, it is magnified by proficiency in one or another of the reasoning dispositions associated with the conscious, effortful form of information processing known as "System 2."

Or so I argued on the basis of various CCP study results.

To me this suggests it is not tenable to see identity-protective reasoning as a "cognitive bias."

It is individually rational to process information on societal risks in this manner when one's own exposure to that risk is not materially affected by the correctness of one's views but where one's status in one's cultural group is very much affected by the congruity of one's beliefs with those that predominate in the group.

This is so for climate change, gun control, fracking, etc.

Of course, if everyone engages in this individually rational mode of information processing at the same time, the results can be collectively disastrous.  Under these conditions, culturally diverse citizens will fail to converge on the best currently available evidence essential to enactment of democratic laws that protect the welfare of all.

That consequence, though, won't change anyone's individual psychic incentives to process information in the personally beneficial manner associated with identity-protective cognition.  This is, as I've described it before, the "tragedy of the science communications commons."

This point aligned me pretty squarely with the economist contingent at the conference, which was mainly intent on demonstrating that "self-deception" is "rational" in the sense of welfare-maximizing at the individual level.

My second point was less in line with the views of the economists but likely more in line with at least some of the members of psychologist contingent at the conference (& I think with Richard Holton, the lone philosopher on the program, who gave a very insightful & helpful talk).

The point was that I didn't really think it was theoretically cogent or psychologically realistic to describe identity-protective reasoning as a form of self-deception.

It's true that this mode of information processing systematically promotes formation of beliefs that aren't aligned to the best currently available evidence. (There was some pushback on this along the predictable "but that's perfectly consistent with Bayesianism..." lines.  It never ceases to astonish me how many economists & political scientists have trouble grasping the conceptual distinction between truth-convergent Bayesian updating, in which one's priors are updated on the basis of evidence the likelihood ratio or weight of which is determined on the basis of independent truth-convergent criteria; and confirmation bias, in which one uses one's priors to determine the likelihood ratio assigned to new evidence.)
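The distinction drawn in that parenthetical can be made concrete with a toy sketch. Everything in it is hypothetical: the evidence stream, the two starting priors, and in particular the `biased_likelihood_ratio` rule, which is just one simple way to let a prior determine the weight given to new evidence. It is offered as an illustration of the conceptual difference, not as a model taken from any CCP study.

```python
def bayes_update(prior_prob, likelihood_ratio):
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio."""
    prior_odds = prior_prob / (1.0 - prior_prob)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


def biased_likelihood_ratio(prior_prob, true_lr):
    """Hypothetical confirmation-bias rule: evidence keeps its weight only when
    it points the same way as the current belief; otherwise it is treated as
    uninformative (likelihood ratio of 1)."""
    evidence_supports = true_lr > 1.0
    belief_favors = prior_prob > 0.5
    return true_lr if evidence_supports == belief_favors else 1.0


def run(prior_prob, evidence_lrs, biased):
    """Process a sequence of evidence, with or without the biased weighting."""
    belief = prior_prob
    for true_lr in evidence_lrs:
        used_lr = biased_likelihood_ratio(belief, true_lr) if biased else true_lr
        belief = bayes_update(belief, used_lr)
    return belief


# Made-up evidence stream: four items favor the hypothesis 3:1, one cuts against it.
evidence = [3.0, 3.0, 1.0 / 3.0, 3.0, 3.0]
skeptic_prior, believer_prior = 0.2, 0.8

for biased in (False, True):
    skeptic = run(skeptic_prior, evidence, biased)
    believer = run(believer_prior, evidence, biased)
    print(f"biased={biased}: skeptic={skeptic:.3f}, believer={believer:.3f}")
```

When the likelihood ratio is fixed independently of the prior, the two agents see the same evidence and move toward each other. When the prior decides how much each item counts, the same evidence pushes them further apart.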

But I don't really see why this makes identity-protective cognition an instance of "self-deception."

People do things with information other than use it to form "accurate beliefs."  One of those other things they use information for is to cultivate dispositions that evince their commitment to values that unite them with other members of affinity groups important to their identity.

Sometimes the way to evince such commitments is by holding certain beliefs about risks or other related facts that, by virtue of one or another socially and historically contingent set of events, have come to be understood as badges of membership in a particular cultural group.

If the person has no other purpose for the belief in question, then someone who forms beliefs using this style of information processing is not deceiving him- or herself at all, any more than such a person would be if the person used this form of information processing, say, to form the disposition to leave a tip at a restaurant (Frank 1988).

Or so it seems to me.

I think the reason so many scholars regard this form of information processing as "self-deception" is rooted in a psychologically implausible view of "beliefs" as isolated states of assent or nonassent to factual propositions.

The mind is not a registry of atomistic propositional stances.

It comprises a wide array of mental routines, which themselves consist of bundles of intentional states--desires, emotions, moral evaluations--each of which is suited for doing something.

As elements of these action-enabling ensembles, beliefs are dispositions to action (Peirce 1877; Braithwaite 1946).

If someone is using a style of information processing to form clusters of intentional states that reliably alert and motivate him or her to display identity-congruent societal risk perceptions in appropriate circumstances, then that person is doing with his or her reason something akin to what someone does when internalizing a disposition to conform to norms that signify being a socially competent actor.

In this sense, "beliefs" in "climate change," "evolution," "the deterrent effect of gun control laws" & the like are more akin to action-promoting attitudes than bare states of assent or non-assent to context-free factual propositions.  

If one accepts this view, none of the puzzles that vex "self-deception" need arise.  

A person who forms "beliefs" on these issues in the course of cultivating affective states that express his or her identity (Akerlof & Kranton 2000; Anderson 1993) is not "deceiving" him- or herself -- or anyone else -- about anything.

This assumes, of course, that this is what a person is doing with information relevant to forming a "belief" on a risk or like fact.

Sometimes people do other things with such beliefs-- like be good "doctors," or "farmers," or "judges" or other types of professionals.  

In that case, we might see "cognitive dualism," the condition in which the actor forms opposing states of beliefs as part of separate and discrete action-enabling ensembles of intentional states.

The Pakistani Dr "disbelieves in" evolution at home to be a good Muslim, but "believes in" it at work to be a good Dr.

The Kentucky Farmer, likewise, "disbelieves in" climate change to be a good Hierarch Individualist, in the settings where that is what he is doing; but "believes in" it when he is atop his tractor engaged in "zero tillage" or like practices that he knows will help him master the challenges that global warming is going to create for success in his occupation.

The propositional stances in the disbelief-belief couplings are indeed inconsistent if we abstract them from the action-enabling ensemble of mental states of which they are a part.  

But doing that is not faithful to the agent's psychology.  The opposing "beliefs" and "disbeliefs" don't exist apart from the action-enabling bundles of intentional states they reside in.  If those actions aren't inconsistent, then there's no "conflict" between any meaningful mental objects that reside in the agent's mind.

Introduced with a discussion of the Pakistani Dr & the Kentucky Farmer, this last point -- about cognitive dualism -- predictably dominated discussion.  

I'm not sure how I feel about that.

It's interesting and fun to see people struggle with the point (especially when one invokes Kantian dualism & adds a Laplacian cosmologist who is proud of his or her children to the mix).

But if that point isn't really the point of the presentation, it can end up being a bit of a show stealer and ultimately a distraction.

That doesn't make me doubt "cognitive dualism," of course.  If anything, it strengthens my resolve to investigate it; that it bothers and disorients people so much means something, I suspect.

But "cognitive dualism" is severable from "motivated system 2" reasoning, certainly, and I don't want to leave anyone with any misimpressions about that.

Better to address difficult issues one at a time.

But here is something that can be figured out w/o any great difficulty at all: L'université Toulouse is really cool!  I was awed at the number of talented scholars engaged both in high-level investigations of human behavior and high-level scholarly exchange w/ one another across disciplines.


Akerlof, G.A. & Kranton, R.E. Economics and identity. The Quarterly Journal of Economics 115, 715-753 (2000).

Anderson, E. Value in Ethics and Economics (Harvard University Press, Cambridge, Mass., 1993).

Frank, R.H. Passions Within Reason: The Strategic Role of the Emotions (Norton, New York, 1988).

Braithwaite, R.B. The Inaugural Address: Belief and Action. Proceedings of the Aristotelian Society, Supplementary Volumes 20, 1-19 (1946).

Peirce, C.S. The Fixation of Belief. Popular Science Monthly, 12, 1-15 (1877).






MAPKIA #73 part IV: Revenge of the disgust skeptics! Does *disgust* really play any role in vaccine & GM-food risk perceptions?

So I’ve spent a day or so reflecting on the really great Wendell & Clifford guest post, along with their fantastic “in press” paper, on disgust sensibilities and vaccine-risk and GM-food risk perceptions.  I learned a ton from doing so.

I have some questions, certainly.

But in my experience, the best studies are always the ones that make you pay for the solution to a vexing puzzle by obliging you to see multiple additional ones that you now feel impelled to find an explanation for.  That's the way I feel about W&C's post & paper.

I’ve divided my reactions into two parts.  The first set addresses W&C’s own data, the second their “alternative interpretation” of the data analyses that earned @Mw her now disputed 5th straight MAPKIA! crown (the Chair of CCP Gaming Commission has stripped her of the synthetic biology giganto E. coli first prize. . . heart breaking . . .).

A. W&C's data

1. High or low, disgust sensitivities predict a high level of support for vaccines, no? Unlike a lot of researchers, W&C don’t hang their hat on disembodied correlation coefficients with long strings of asterisks. They get that a “statistically significant” correlation is not equivalent to a practically meaningful influence.  They respect the reason of readers by showing them the raw data, so that readers can meaningfully reflect on whether they agree the relationship expressed in the correlation bears the interpretation—because that’s inevitably what it is!—assigned to it.

I certainly respect and value the account they give to support their conclusion.

But when I look at the cool W&C data, I infer that people who vary in “pathogen” disgust are not in much disagreement: childhood vaccines are a good idea. 

W&C don’t describe the wording of the individual survey items used to form the “opposition to vaccines” scale, but their scatterplot does make it possible for us to see that all the subjects in their sample are heavily concentrated at the lowest values of “opposition.” In other words, across the items, the sample was highly skewed toward responses that evince “support” for vaccines.

Even the individuals who scored high on the “pathogen disgust sensibilities” (PDS) scale were many times more likely to hold a positive than a negative attitude toward vaccines.  The “r = 0.15” (students) and “r = 0.20” (M turk) coefficients, then, don’t bear out the inference that high-PDS subjects were afraid of or against vaccines; they imply only that the high degree of support that those subjects had for vaccines wasn’t quite as high as was that of subjects low in PDS.
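For a rough sense of scale, the two coefficients can be converted into more tangible quantities. The sketch below uses only textbook identities for a simple linear relationship (shared variance is r squared; the standardized slope equals r). It is not a reanalysis of W&C's data, and the function names are made up for illustration.

```python
def variance_explained(r):
    """Share of variance two variables have in common under a linear fit."""
    return r ** 2


def predicted_shift_in_sd(r, predictor_shift_in_sd=1.0):
    """In a standardized simple regression, moving the predictor by k standard
    deviations moves the predicted outcome by r * k standard deviations."""
    return r * predictor_shift_in_sd


# Correlations between PDS and vaccine opposition as quoted in the post.
reported = {"students": 0.15, "M Turk": 0.20}

for sample, r in reported.items():
    print(
        f"{sample}: r = {r:.2f}, "
        f"shared variance = {variance_explained(r):.2%}, "
        f"predicted shift per 1 SD of PDS = {predicted_shift_in_sd(r):.2f} SD"
    )
```

Neither quantity says anything about where subjects sit on the opposition scale. A small positive slope is fully compatible with nearly everyone, high-PDS subjects included, clustering at the "support" end.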

Just to try to add some perspective to the admirably concrete picture W&C show us, consider these data from the  CCP Vaccine Risk Perceptions and Ad Hoc Risk Communication Report:

These are the sort of data that make it possible to see that those who think that there is meaningful ideological contestation over vaccine risks are uninformed (to put it politely).  Yes, subjects who are more left-leaning in their outlooks love vaccines a smidgen more than those who are right-leaning. But it is clear enough that those who are “right-leaning” love them too!

The correlation between this item and left-right ideology (r = -0.14) is about the same size as the one that W&C report in their student sample.

The correlation that W&C report in their other M Turk subjects (r = 0.20) is a bit higher.

But here is what an "r = 0.20" relationship looks like in raw data relating the Industrial Strength Risk Perception measures for childhood vaccines, and in comparison to perceptions of a bunch of other putative risks (again from the CCP Vaccine Risk Perceptions and Ad Hoc Risk Communication Report):

The point of showing the data that stand behind disembodied “statistically significant” correlations is to see whether they support the inferences that people draw from them.

Just as I think it would be unreasonable for someone to treat these CCP data as saying “conservative ideology predicts fear of” or “opposition to” childhood vaccines, so I think it is not persuasive to treat W&C’s data as suggesting that high pathogen-disgust sensitivities predict any sort of opposition to or concern about childhood vaccines in either their M Turk or student samples.

Indeed, in their excellent paper, W&C characterize the relationship between PDS and the perception that vaccines cause autism as "weak and not statistically significant” (p. 26) for their student subjects.

2. Inferential sufficiency? W&C show us that pathogen-disgust sensitivities are correlated, but not very strongly, with both GM-food and vaccine risk perceptions.  But that’s not actually enough information for us to assess whether either, much less both of these risk perceptions, are meaningfully explained by variance in disgust sensitivity.

Before we can draw that inference, we'd need to  be shown, first, that the relationship between PDS and both GM-food and vaccine risk perceptions is comparable to what we’d expect to see between PDS and the perceived risks of other putative risk sources that we are already confident do provoke pathogen-disgust reactions. If the relationship is smaller, then that’s a reason for thinking that disgust sensitivities aren’t that important in the case of GM-food and vaccine-risk perceptions.

Second, we’d need to be shown what the relationship is between PDS and other putative risk sources that we have good reason to believe don’t provoke meaningful pathogen disgust sensitivities.  If those relationships are comparable in size to those between PDS and either GM-food or vaccine-risk perceptions, that would be reason, too, for discounting the inference that GM-food and vaccine-risk perceptions are meaningfully “explained” by differences in pathogen-disgust sensitivities.

This was the nub of @Mw’s case against treating disgust sensitivities as linking GM-food and vaccine-risk perceptions.  The relationship between the two was the same as the one between each of those risks and myriad other risk perceptions of putative risk sources, like drones and nuclear power, that didn’t seem to have much to do with disgust.
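The two comparisons described above can be written as a simple decision rule. The sketch below is only an illustration: the function name, the `margin` tolerance, and every correlation passed to it are hypothetical placeholders, not values from W&C's paper or from any CCP data set.

```python
def disgust_inference_supported(r_target, r_known_disgust, r_unrelated, margin=0.05):
    """Return True only if both benchmark checks pass.

    r_target        -- correlation between PDS and the risk perception of interest
    r_known_disgust -- correlation between PDS and risks already believed to
                       provoke pathogen disgust (first benchmark)
    r_unrelated     -- correlation between PDS and risks believed to have
                       nothing to do with disgust (second benchmark)
    margin          -- arbitrary tolerance for calling two correlations
                       "comparable" (an assumption of this sketch)
    """
    # Check 1: the target should not fall meaningfully below the disgust benchmark.
    comparable_to_known = r_target >= r_known_disgust - margin
    # Check 2: the target should clearly exceed the no-disgust benchmark.
    exceeds_unrelated = r_target > r_unrelated + margin
    return comparable_to_known and exceeds_unrelated


# Hypothetical scenario A: target looks like the unrelated risks, not the disgust ones.
print(disgust_inference_supported(r_target=0.20, r_known_disgust=0.45, r_unrelated=0.18))

# Hypothetical scenario B: target tracks the disgust benchmark and stands apart from the rest.
print(disgust_inference_supported(r_target=0.42, r_known_disgust=0.45, r_unrelated=0.10))
```

A real analysis would compare correlations with their sampling uncertainty in view instead of a fixed margin, but the structure of the argument is the same: a modest correlation carries little weight until it is set against both kinds of benchmark.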

W&C don’t present this sort of info—the equivalent of what one would need to fill in a 2x2 covariance matrix—in the blog post, but they do have some data on other risk perceptions in their excellent paper.

Others should look and see what they think, but I found this data somewhat puzzling.

E.g., they report that neither drugs nor cigarettes, which they say are recognized in the literature as exciting pathogen-disgust sensitivities, seemed to have meaningful relationships with PDS in their sample.  Indeed, they reported that sexual-disgust sensitivities were more meaningfully associated with anti-drug attitudes in their sample than pathogen-disgust ones!

If the disgust scale didn’t perform as we expected on risk perceptions that we think are related to disgust, then I’m left confused about what to make of the (pretty modest) relationships that they report between the scale and attitudes toward vaccines and GM foods.

Perhaps this is something W&C can clarify in a follow up or in fact do address in a revised version of the paper.

3. Why aren’t conservatives disgust sensitive? I found it remarkable that there was no meaningful correlation between PDS and ideology in the W&C sample. The idea that conservatives are “disgust sensitive” is a big theme in the moral psychology literature; the claim is made about “pathogen” as well as “sexual disgust” sensitivities.

I’d surmise that the atypicality of the M Turk subjects, whose ideologies (W&C report) were heavily skewed toward liberalism, might have something to do with the explanation, except that on Twitter, Clifford supplied data showing that PDS had no meaningful relation with ideology in a YouGov sample, which I presume was recruited and stratified for national representativeness.

I gather that “sex disgust sensitivities” (SDS) are generally understood to have a higher correlation with conservatism than PDS ones.  But the two are supposed to be correlated.  That, plus the W&C results on the relationship between SDS and drug laws, and the very modest relationships reported in studies that do seem to show an ideological-disgust relationship, have  now made me wonder whether the relationship between disgust and conservatism is as meaningful as it is made out to be by many commentators.

I’m sure moral psychologists will sort all this out!

B. @Mw's "factor 1"

1. Who sees what as a “pathogen” and why?  I myself was not entirely persuaded that the loading of GM food risks on @Mw’s “factor 1” supports W&C's inference that variance in GM food risk perceptions is explained by PDS.

For one thing, it seems ad hoc to treat the eclectic assortment of risks that happened to load on “factor 1” as evincing a latent PDS sensibility.

@Mw's Factor Analysis from disputed MAPKIA #73 episode

Why did “residential exposure to magnetic field of high-voltage power lines” (POWER) and “user exposure to radio waves from cell phones” (CELL) load on factor 1?

[Figure: the ISRPMs]

I suppose the explanation would be that high PDS subjects are prone to see even invisible electronic waves travelling through the air as “pathogens” penetrating their bodies.

But then why didn’t nuclear power load on that factor? The idea that nuclear power plant radiation is hazardous is in fact a much more conspicuous, much more contentious matter in our society than that either cell phones or high-voltage power lines harm anyone.

Why didn’t “fracking”—which involves injecting noxious chemicals into bedrock, where it can leach into the groundwater—load on “factor 1” if it is measuring a latent PDS sensibility?

Again, drug use is generally understood in the literature to excite PDS.  So why didn’t marijuana legalization load on “factor 1”?

What about "drinking raw milk (milk that has not been pasteurized)" (RAWMILK)? That stuff is brimming with delicious E. coli, salmonella & other pathogens.  Shouldn't it load on Factor 1 if Factor 1 is about "pathogen disgust" sensibilities?

“Private operation of drones in U.S. airspace” (DRONES) correlates more strongly with “Factor 1” (r = 0.20, p < 0.01) than does raw milk (r = 0.09, p < 0.01).  That’s weird, I think, if the factor is supposed to be measuring some generic anxiety about bodily invasion by foreign agents (there are some really small drones--they’re adorable!--but none will make it very easily into your bloodstream!).

I suggested that “factor 1” is a catchall: there isn’t much concern in the US general public about any of the risks that load on it, including consumption of GM foods.  What explains variance in them is just some unobserved disposition to worry about things not many other people do.

But I accept for sure that there might be more to it.

Indeed, one possibility that occurs to me is a weak form of “environmental risk” sensitivity that is associated with being culturally egalitarian.

Actually, I don’t have cultural outlook scores in this dataset!

But I do have right-left ideology, which is correlated with being egalitarian and communitarian and definitely is an indicator of environmental-risk concern.

I also have the Ordinary Science Intelligence scale.

[Figure: regression output]

When I regress “factor 1” on those two variables and their interaction, it turns out that being more “left-leaning” predicts a higher level of the “factor 1” latent risk concern.

Moreover, the disposition to worry about the Factor 1 risks becomes even more politically polarized as science comprehension increases, a sign that identity-protective reasoning played a role in the formation of the relevant risk perceptions.
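For readers who want to see the mechanics of a model like this, here is a minimal sketch: an ordinary least squares regression of a factor score on ideology, science comprehension, and their product. Everything in it is an assumption made for illustration. The data are synthetic and noise-free, and the variable names (`left_right`, `osi`, `factor1`) and coefficient values are made up; none of it is the actual dataset or its results.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols_with_interaction(x1, x2, y):
    """Fit y = b0 + b1*x1 + b2*x2 + b3*(x1*x2) via the normal equations."""
    rows = [[1.0, a, b, a * b] for a, b in zip(x1, x2)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic illustration only (hypothetical values, not real survey data).
left_right = [-2, -1, 0, 1, 2] * 4                              # higher = more conservative
osi = [v for v in (-1.5, -0.5, 0.5, 1.5) for _ in range(5)]     # science comprehension score
true_b = (0.0, -0.10, 0.05, -0.05)                              # made-up coefficients
factor1 = [true_b[0] + true_b[1] * a + true_b[2] * b + true_b[3] * a * b
           for a, b in zip(left_right, osi)]

b0, b1, b2, b3 = ols_with_interaction(left_right, osi, factor1)
# b1 < 0: more left-leaning subjects score higher on the factor.
# b3 < 0: the left-right gap widens as science comprehension increases.
```

In a fit like this, the sign of the interaction coefficient is what carries the "polarization increases with science comprehension" reading; the main effect of ideology alone would not show it.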

So there’s an explanation that competes with catchall: an environmental risk concern that is characteristic of an egalitarian-communitarian identity but that is less proximate to that identity than concerns about the more culturally freighted risks that figure in “factor 3.”

The effects are not big at all. But given that “conservatives” supposedly have greater PDS, it’s hard to reconcile these data with the proposition that “factor 1” is measuring a risk sensitivity related to pathogen disgust sensitivities.

Unless, of course, “disgust sensitivities” are themselves programmed by cultural outlooks, in which case, contrary to “moral foundations theory,” we’d expect disgust sensitivities to be symmetric with respect to cultural outlooks or political ideologies but to attach to different putative risk sources in patterns that reflect the cultural meanings that the sources in question have for the types involved.

I find that very plausible—even with respect to drones. 

(A last point: the @Mw “factors” were rotated so that they would be, or be close to, orthogonal.  Accordingly, it is not really useful to compare the correlations of the factors to one another, as @W&C had helpfully suggested.  Nevertheless, if we do that, it turns out that “factor 1” is in fact more strongly correlated (r = 0.13, p < 0.01) with “factor 3,” the “white hierarchical male” risk-skepticism group, than with “factor 2” (r = 0.05, p = 0.02), the social-deviancy “disgust” factor.)

2. No one sees vaccines as a “pathogen.” In any case, as @W&C note, vaccine risk perceptions do not load on “factor 1.”  So if “factor 1” is a latent PDS sensibility, concern over vaccines isn’t associated meaningfully with PDS.

[Figure: the "affect" heuristic at work in vaccine risk/benefit perceptions]

W&C suggest that maybe vaccines, because they confer health benefits as well as risks, might not excite PDS.  That sounds like a reason for thinking the hypothesis (that people who are vaccine hesitant are motivated by their disgust with needles in their veins) is false, not a reason to think the industrial strength risk perception measure for vaccine risks isn’t a valid measure of vaccine risk perceptions.

For sure the industrial strength measure is a valid indicator of the general affective orientation that people have toward vaccines, one that informs all manner of assessments they make about vaccine risks and benefits. (That's another of the findings from the CCP Vaccine Risk Perceptions and Ad Hoc Risk Communication Report.)

* * *

So those are some of the thoughts & questions that occur to me.  Thanks a ton to W&C for making me both better informed and more perplexed!

[Note: I'm closing off comments here so that the discussion of W&C's own analysis occurs in 1 place-- after their post.]


MAPKIA! episode 73 sequel: Scholars who genuinely know something explain disgust's contribution to vaccine & GM food risk perceptions

This post is part of the settlement of the class action lawsuit filed after @Mw was declared the winner of the now infamous "MAPKIA!" episode 73. The other part of the settlement was a $54.75 billion punitive damage award to loyal listener @Cortland. But anyway, this is a really cool post on data from an "in press" paper that examines the impact of disgust on GM-food- and vaccine-risk perceptions. Enjoy!  

Needles in our veins and in our food: Disgust sensitivity predicts attitudes toward vaccines and genetically modified foods

Dane Wendell & Scott Clifford

the beginning of the epic MAPKIA battle ...

The MAPKIA question we arrived at via Twitter was

What sorts of individual characteristics or predispositions, if any, account for the observed relationship between vaccine- and GM-food-risk perceptions and what, if anything, can we learn about risk perceptions generally from this relationship?

We enjoyed and appreciated this follow-up post from Dan, which argues that attitudes towards vaccines and GM food are predicted by a generalized disposition to be worried about anything, rather than a substantively meaningful dimension such as disgust sensitivity. But, we disagree with that explanation! And we want to put forward three points:

  • First, disgust sensitivity is a very good potential explanation.
  • Second, we have evidence that disgust sensitivity has a fairly robust relationship to genetically modified food (GM Food) and anti-vaccine attitudes (anti-vax). And, these attitudes are unrelated or weakly related to political ideology.
  • Third, the risk perceptions evidence in the previous post may actually reinforce our argument, not dismantle it.

Why disgust?

Disgust is part of the behavioral immune system, an emotion that motivates avoidance of contamination, such as the consumption of toxins, physical contact with a diseased person, or any breaking of the skin, as well as the expulsion of potential toxins from the body. Disgust is a powerful drive that deeply motivates humans because it leads to bodily health and reproductive fitness. Disgust is extremely hard for us to inhibit.

In one of our favorite studies, Rozin and colleagues (1986) find that subjects are reluctant to eat delicious, safe chocolate if the chocolate has been molded to resemble dog poop.

The purpose of disgust is to help us avoid illness. When our team realized that GM foods and anti-vaccination attitudes did not seem related to political ideology, we began to wonder what could be underlying those attitudes. The cases of vaccines and GMO foods both involve literally introducing gross, unnatural things into the body. Because of this, we began to suspect that disgust sensitivity could be related to these attitudes.

It's a plausible surmise, and it ought to be directly tested. So, we did!

What does our evidence say?

Our argument, and indeed, a lot of evidence that we’ve collected, suggests that both vaccine attitudes and GM food attitudes are correlated with pathogen disgust sensitivity. Our paper under review examines disgust sensitivity and a number of issues related to food and health politics in three studies (a total of 612 Amazon Mechanical Turk workers and 177 students). We find that people who are more disgust sensitive in this way are also more opposed to, or skeptical of, vaccinations and GM foods.

Our outcome measures are not the same as the risk perceptions: we are measuring policy attitudes like mandatory labeling of GMOs and vaccination beliefs about safety and efficacy.

The scatterplots show the basic relationships, but note that full regressions with control variables (ideology, education, sex, income, age) make the relationships even more pronounced. Here is a link to the pre-print paper, which includes this discussion as well as some null findings.

It is also worth noting that self-described political ideology is, itself, unrelated to pathogen disgust sensitivity. Disgust sensitivity explains something about these attitudes that political ideology does not.

We can also note that several specific political attitudes (e.g. expanding War on Terror, defense spending) also do not seem related to pathogen disgust sensitivity, suggesting, again, that pathogen disgust sensitivity does not necessarily affect all political attitudes, just those that have a clear health connection.

How does risk perceptions analysis demonstrate disgust?

So, what do we make of all of the other risk perceptions that were presented in the MAPKIA episode 73 "answer"?

When looking at the factor analysis provided, we believe that the two-factor structure is actually supportive of our theories.

We argue that factor 1 is related to pathogen disgust, and factor 2 is related to sexual disgust.

According to Tybur and colleagues (Tybur et al. 2009), pathogen disgust is concerned with the avoidance of infectious microorganisms, while sexual disgust is the avoidance of sexual partners and behaviors that threaten reproductive fitness. We have found that these domains of disgust are rather important for the study of political attitudes. For example, in our research, sexual disgust is strongly correlated with political ideology, but pathogen disgust is uncorrelated or weakly correlated. Not specifying the disgust domain risks conflating what is really going on in the data.

Pathogen disgust is distinguishable from sexual disgust, so we would not expect a very strong relationship between GM attitudes (pathogen disgust) and pornography (sexual disgust), for example. Similarly, in our data, sexual disgust does not predict GM attitudes once pathogen disgust is accounted for.

These disgust domains potentially hold great explanatory power for our question today. Our interpretation is that the first factor is picking up concerns about pathogen disgust (while the second is related to sexual disgust). What do GM foods, pesticides, food coloring, saccharin, and (presumably faulty) beef all have in common? Well, they’re “unnatural” things that you consume, and thus raise pathogen concerns.

Now, power lines and cell phones fit less clearly with our explanation (and load less strongly), but both fit with concerns about unseen things causing cancer (disease!).

[Figure: the risk perception items]

True, as Dan notes, vaccines do not load strongly on that first factor. This could be an interesting consequence of how vaccines both contaminate the individual and protect the individual from illness. How respondents answer when asked how risky "vaccines" are may depend on how/where they assess the risk (initially risky, or risky in the long term?) or for whom. That said, we would have expected vaccines to load on the first factor, alongside other food/health risks.

Two additional tests come to mind.

First, if the first factor getting picked up in the factor analysis is just a general risk disposition, then it should be strongly correlated with both of the remaining factors. And the more strongly correlated it is, the more evidence in favor of Mw.

Second, our own hypothesis would predict that the first factor is more strongly related to the second factor than the third. This is because while pathogen and sexual disgust are distinct, they are of course related. So if we are right, and this first factor represents pathogen concerns, then it should be more strongly related to sexual concerns than concerns about harm and authority (or “hierarch communitarians” and “egalitarian individualists” in Dan’s terminology).

We look forward to seeing the results!

We also think this approach might shed some light on misconceptions about anti-vaccination and anti-GM attitudes.

As Dan notes at the end, there are many stereotypes about these people, particularly that they are made up of one distinct group of Whole Foods People aka "Over-privileged Rich People".

But the data doesn’t bear this out. We don’t find this particularly surprising, precisely because these attitudes arguably do not form a widely adopted cultural group. There are likely a few relatively visible cases of people who fit this whole foods stereotype and have created a belief system that upholds all of these attitudes. But most people don’t read Natural News and haven’t been exposed to all of these debates and thus have not yet had the relevant dispositions activated. Not to mention, they probably have lots of good countervailing reasons to not hold these attitudes.


Rozin, Paul, Linda Millman, and Carol Nemeroff. 1986. “Operation of the Laws of Sympathetic Magic in Disgust and Other Domains.” Journal of Personality and Social Psychology 50(4): 703–12.

Tybur, J. M., Lieberman, D., & Griskevicius, V. (2009). Microbes, mating, and morality: Individual differences in three functional domains of disgust.



"Politically Motivated Reasoning Paradigm" (PMRP): what it is, how to measure it

1. What’s this about. Here are some reflections on measuring the impact of “motivated reasoning” in mass political opinion formation.

They are not materially different from ones I’ve either posted here previously or discussed in published papers (Kahan 2015; Kahan 2012). But they display points of emphasis that complement and extend those, and thus maybe add something.

In any case, the question of how to measure “motivated reasoning” in this setting demands more reflection, not just by me but by the scholars doing work in this area, since in my view many of the methods being used are plainly not valid.

2. Terminology. “Identity-protective reasoning” is the tendency of individuals selectively to credit or discredit all manner of evidence on contested issues in patterns that support the position that predominates among persons with whom they share some important, identity-defining affinity (Sherman & Cohen 2006).

This is the form of information processing that creates polarization on politically charged issues like climate change, gun control, nuclear power, the HPV vaccine, and fracking.  Frankly, I don’t think very many people “define” themselves with reference to ideological groups (and certainly not many ordinary ones; only very odd people spend a lot of time thinking about politics). But the persons in the groups with whom they do share ties are likely to share various kinds of important values that have political significance; as a result, political outlooks (and better still, cultural ones) will often furnish a decent proxy (or indicator) for the particular group affinities that define people’s identities.

For simplicity, though, I will just refer to the species of motivated reasoning that figures in the study of mass political opinion formation as “politically motivated reasoning.”

What I want to do is suggest a conception of politically motivated reasoning that simultaneously reflects a cogent account of what it is and a corresponding valid way to experimentally assess what impact it has if any.

I will call this the “Politically Motivated Reasoning Paradigm”—or PMRP.

3. Information-processing mechanisms.  In my view, it is useful to specify PMRP in relation to a very basic, no-frills Bayesian information-processing model. Indeed, I think that’s the way to specify pretty much any posited cognitive mechanism of information-processing.  When obliged to identify how the mechanism in question differs from the no-frills Bayesian model, the person giving the account is forced to be clear and precise about the key features of the information-processing dynamic she has in mind. This sort of account, moreover, is the one most likely to enable reflective people to discern forms of empirical investigation aimed at assessing whether the mechanism is real and how it operates.

So start with this figure: 

The Bayesian model (A) not only directs individuals to use new evidence to update their existing or prior belief on the probability of some factual proposition but also tells them to what degree they should adjust that belief: by a factor equal to its “likelihood ratio,” which represents how much more consistent the evidence is with that proposition than some alternative.  The Bayesian “likelihood ratio” is the “weight of the evidence” in practical or everyday terms (Good 1985).
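As a minimal sketch of model (A), Bayes' rule in odds form can be written in a couple of lines. The function names and the numbers are illustrative assumptions, not anything taken from the studies cited here.

```python
def update_odds(prior_odds, likelihood_ratio):
    """Bayes' rule in odds form: posterior odds = prior odds x likelihood ratio."""
    return prior_odds * likelihood_ratio


def odds_to_probability(odds):
    """Convert odds in favor of a proposition into a probability."""
    return odds / (1.0 + odds)


prior = 1.0   # even odds (probability 0.5) that the proposition is true
lr = 3.0      # evidence is 3x more consistent with the proposition than with its rival
posterior = update_odds(prior, lr)
print(posterior, odds_to_probability(posterior))   # 3.0 0.75
```

A likelihood ratio of 1 leaves the odds untouched; that is the sense in which such evidence has no "weight."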

When an individual displays “confirmation bias” (B), that person credits evidence selectively based on its consistency with his or her existing beliefs.  In relationship to a simple Bayesian model, then, confirmation bias involves an endogeneity between priors and likelihood ratio: that is, rather than updating one’s priors based on the weight of the evidence, a person assigns weight to the new evidence based on its conformity with his or her priors.

This might well be “consistent” with Bayesianism, which only tells a person what to do with his or her prior odds and likelihood ratio—multiply them together—and not how to derive either. But if one's goal is to form accurate beliefs, one should assign new information a likelihood ratio derived from some set of valid, truth-convergent criteria independent of one’s priors, as in (A)  (Stanovich 2011, p. 135).  If a person determines the likelihood ratio (weight of the new evidence) based entirely on his or her priors, that person will in fact never change his or her position or even how intensely he or she holds it no matter what valid evidence that  individual encounters (Rabin & Schrag 1999). 

In a less extreme case, if such a person incorporates his or her priors along with independent, valid, truth-convergent criteria into his or her determination of the likelihood ratio, that person will, eventually, start to form more accurate beliefs, but at a slower rate than if he or she had determined the likelihood ratio with valid criteria wholly independent of his or her priors.
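The extreme and less extreme cases can be put side by side in a small sketch. The discounting rule below is an assumption chosen only for illustration: a parameter `w` between 0 and 1 sets how much of the weight of belief-contradicting evidence is stripped away, with `w = 0` reproducing model (A) and `w = 1` the case in which priors alone dictate the likelihood ratio. Log odds are in base 2 so the arithmetic is easy to follow.

```python
def assigned_log_lr(valid_log_lr, current_log_odds, w):
    """Log likelihood ratio a person actually assigns to a piece of evidence.

    w = 0: weight comes only from valid, prior-independent criteria (model A).
    w = 1: evidence that contradicts the current belief gets no weight at all.
    (Illustrative assumption, not a rule taken from the cited papers.)
    """
    contradicts_prior = valid_log_lr * current_log_odds < 0
    return (1.0 - w) * valid_log_lr if contradicts_prior else valid_log_lr


def belief_after(prior_log_odds, valid_log_lr, w, n_pieces):
    """Log odds after seeing n_pieces of evidence, each with the same valid weight."""
    log_odds = prior_log_odds
    for _ in range(n_pieces):
        log_odds += assigned_log_lr(valid_log_lr, log_odds, w)
    return log_odds


# Prior odds of 1:16 against the proposition; each piece of evidence favors it 2:1.
start, evidence = -4.0, 1.0
bayesian = belief_after(start, evidence, 0.0, 4)   # reaches even odds: 0.0
partial = belief_after(start, evidence, 0.5, 4)    # moves half as fast: -2.0
total = belief_after(start, evidence, 1.0, 4)      # never moves: -4.0
```

The partially biased updater does get to even odds eventually, just later than the Bayesian; the fully biased one stays where it started no matter how much valid evidence arrives.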

Again, motivated reasoning refers to the tendency to weight evidence in relation to some external goal or end independent of forming an accurate belief. Reasoning is “politically motivated” when the external goal or end is congruence between one’s beliefs and those associated with those who share one’s political outlooks (Kahan 2013).  In relation to the Bayesian model (A), then, an ideological predisposition is what determines the likelihood ratio one assigns new evidence (C).

As should be reasonably clear, politically motivated reasoning is not the same thing as confirmation bias.  Under confirmation bias, it is a person’s priors, not his or her ideological or political predispositions, that govern the likelihood ratio he or she assigns new information.

Because someone who processes information in an ideologically motivated way will predictably end up with beliefs or priors that reflect his or her ideology, it will often look as if that person is engaged in “confirmation bias” when she assigns weight to the evidence based on its conformity to her political predispositions.  But the appearance is in fact spurious: the person’s priors are not determining his or her likelihood ratio; rather his or her priors and the likelihood ratio he or she assigns to new information are both being determined by that person’s political predispositions (D).

This matters. A theory that posits individuals will conform the likelihood ratio of new information to their political predispositions generates different predictions than one that posits they will simply conform their likelihood ratio of new information to their existing beliefs.  E.g., the former but not the latter furnishes reason to expect systematic partisan differences in assessments of information relating to novel issues, on which individuals have no meaningful priors (Kahan et al. 2009).  The former also helps to identify conditions in which individuals will actually consider counter-attitudinal information open-mindedly (Kahan et al. 2015).
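A toy comparison makes the difference in predictions concrete. Both functions below are hypothetical stand-ins: one discounts evidence that cuts against a person's existing belief (B), the other discounts evidence that cuts against what is congenial to the person's political predispositions (D). The discounting rule and the parameter `w` are assumptions for illustration only.

```python
def lr_confirmation_bias(valid_lr, prior_odds, w):
    """Model (B): discount evidence that cuts against one's existing belief."""
    against_prior = (valid_lr > 1 and prior_odds < 1) or (valid_lr < 1 and prior_odds > 1)
    return valid_lr ** (1.0 - w) if against_prior else valid_lr


def lr_politically_motivated(valid_lr, congenial, w):
    """Model (D): discount evidence that cuts against one's political predispositions.

    congenial is True when accepting the proposition fits the person's outlook.
    """
    against_outlook = (valid_lr > 1) != congenial
    return valid_lr ** (1.0 - w) if against_outlook else valid_lr


# A novel issue: neither partisan has a meaningful prior (even odds for both).
valid_lr, w = 4.0, 0.5
b_left = lr_confirmation_bias(valid_lr, prior_odds=1.0, w=w)
b_right = lr_confirmation_bias(valid_lr, prior_odds=1.0, w=w)
d_left = lr_politically_motivated(valid_lr, congenial=True, w=w)
d_right = lr_politically_motivated(valid_lr, congenial=False, w=w)
```

Under (B) the two partisans assign the same weight, because there is no prior for the evidence to conflict with. Under (D) they already diverge, which is the kind of partisan difference on a novel issue that only the second theory predicts.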

4. Validly measuring “politically motivated reasoning.”  Understanding politically motivated reasoning in relation to Bayesianism, and getting how it differs from confirmation bias, also makes it possible to evaluate the validity of study designs that test for politically motivated reasoning.

For one thing, it does not suffice to show (as many invalid studies do) that individuals do not “change their mind” (or that partisans do not converge) when furnished with counter-attitudinal information.  Such a result is consistent with someone actually crediting ideologically noncongruent evidence but persisting in his or her position (albeit with a reduced level of intensity) based on the strength of his or her priors (Gerber & Green 1999).
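A quick numerical sketch of that point, with made-up numbers: a person who gives counter-attitudinal evidence its full weight can still end up on the same side of the issue if his or her priors are strong enough.

```python
from fractions import Fraction

prior_odds = Fraction(20, 1)     # strongly favors the proposition (hypothetical)
assigned_lr = Fraction(1, 4)     # counter-attitudinal evidence, fully credited
posterior_odds = prior_odds * assigned_lr

still_agrees = posterior_odds > 1               # True: the position is unchanged
less_confident = posterior_odds < prior_odds    # True: but it is held less intensely
print(posterior_odds)                           # 5
```

Observing only that this person "did not change his or her mind" would miss that the evidence was credited without any bias at all.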

This design also disregards pre-treatment effects. Subjects who have been bombarded with arguments on issues like global warming or the death penalty prior to the study might disregard counter-attitudinal evidence furnished by the experimenter (that is, assign it a likelihood ratio of one) not because they are biased but because they’ve seen and evaluated it or the equivalent already (Druckman 2012).

Another common but patently defective design is to furnish partisans with distinct pieces of “contrary evidence.” Those on one side of an issue—the death penalty, say—might be furnished with separate “pro-” and “con-” arguments.  Or “liberals” who are opposed to nuclear power might be shown evidence that it is safe, and “conservatives” who don’t believe in climate change evidence that it is occurring, is caused by humans, and is dangerous.  Then the researcher measures how much partisans of each type “change” their respective positions.

In such a design, it is impossible to determine whether the “contrary” evidence furnished conservatives on the death penalty or on global warming (in my examples) is in fact as strong—has as high a likelihood ratio—as the “contrary evidence” furnished liberals on the death penalty or on nuclear power. Accordingly, the failure of one group to "change its views" or change them to the same extent as the others supports no inferences about the relative impact of their political predispositions on the weight (likelihood ratios) they assigned to the evidence.
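To see why the comparison is uninformative, consider a sketch in which both groups update without any bias and only the strength of the evidence they happen to be shown differs. The prior odds and likelihood ratios are hypothetical.

```python
import math


def shift_in_log_odds(prior_odds, evidence_lr):
    """Change in log odds for an unbiased updater who gives evidence its true weight."""
    posterior_odds = prior_odds * evidence_lr
    return math.log(posterior_odds) - math.log(prior_odds)


# Same starting point, same unbiased updating; only the evidence strength differs.
liberal_shift = shift_in_log_odds(prior_odds=0.25, evidence_lr=2.0)        # weaker evidence
conservative_shift = shift_in_log_odds(prior_odds=0.25, evidence_lr=8.0)   # stronger evidence
```

One group moves three times as far as the other (in log odds) even though neither is reasoning in a motivated way, so unequal movement in this design cannot be attributed to political predispositions.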

The design is invalid, then, plain and simple.

The “most compelling experimental test” of politically motivated reasoning “involves manipulating the hypothesized motivating stake” by changing the perceived ideological significance of the evidence “and then assessing how that manipulation affects the weight individuals of opposing [ideological] identities assign to one and the same piece of evidence (say, a videotape of a political protest)” (Kahan 2015, p. 59).  If the subjects “opportunistically adjust the weight they assign the evidence consistently with its perceived” ideological valence, then they are displaying ideologically motivated reasoning (ibid.).  If they in fact use this form of information processing in the real world, individuals of opposing outlooks will not converge but instead polarize even when they rely on the same information (Kahan et al. 2011).

5. PMRP. That’s PMRP, then. Again, conceptually, PMRP consists in the opportunistic adjustment of the likelihood ratio assigned to evidence based on its conformity to conclusions associated with one’s political outlooks or predispositions.  Methodologically, it is reliably tested for by experimentally manipulating the perceived ideological significance of one and the same piece of evidence and assessing whether individuals, consistent with the manipulation, adjust their assessment of the validity or weight (the likelihood ratio, conceptually speaking) assigned to the evidence.
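In analysis terms, a design of this kind comes down to comparing how the same evidence is rated when it is described as supporting one side versus the other, separately for subjects of each outlook. Here is a minimal sketch; the ratings, the 1-6 scale, and the condition labels are all invented for illustration and are not data from any of the studies cited here.

```python
from statistics import mean

# Hypothetical ratings (1-6 agreement that the evidence is valid), keyed by
# (subject's outlook, side the evidence was described as supporting).
ratings = {
    ("left", "supports_left"): [5, 5, 4, 6],
    ("left", "supports_right"): [3, 2, 3, 2],
    ("right", "supports_left"): [2, 3, 2, 3],
    ("right", "supports_right"): [5, 4, 6, 5],
}


def congeniality_effect(outlook):
    """Mean rating when the evidence favors one's own side minus when it favors the other."""
    own = "supports_" + outlook
    other = "supports_right" if outlook == "left" else "supports_left"
    return mean(ratings[(outlook, own)]) - mean(ratings[(outlook, other)])


left_effect = congeniality_effect("left")      # 5 - 2.5
right_effect = congeniality_effect("right")    # 5 - 2.5
```

Because the evidence itself is held constant, subjects who weighed it on criteria independent of their outlooks would show effects near zero; effects of similar size in both groups are what symmetric politically motivated reasoning would look like.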

There are many studies that reflect PMRP (e.g., Cohen 2003).  I plan to compile a list of them and to post it “tomorrow.”

But for now, here's a collection of CCP studies that have been informed by PMRP.  They show things like individuals polarizing over whether filmed political protestors resorted to violence against onlookers (Kahan et al. 2012); whether particular scientists are subject matter experts on issues like climate change, gun control, and nuclear power (Kahan et al. 2011); whether the Cognitive Reflection Test is a valid way to measure the open-mindedness of partisans on issues like climate change (Kahan 2013); whether a climate-change study was valid (Kahan et al. 2015); and what inferences are supported by experimental evidence on gun control reported in a 2x2 contingency table (Kahan et al. 2013).

There are many many many more studies that purport to study “politically motivated reasoning” that do not reflect PMRP.  I won’t bother to compile and post a list of those.

6. Blowhard blowdowns of straw people are boring. I will say, though, that scholars who—quite reasonably—are skeptical about “politically motivated reasoning” should not think they are helping anyone to learn anything by pointing out the flaws in studies that don’t conform to PMRP.  The studies that do reflect PMRP were designed with exactly those flaws in mind.

So if one wants to cast doubt on the reality or significance of “politically motivated reasoning” (or cast doubt on it in the minds of people who actually know what the state of the scholarship is; go ahead and attack straw people if you just want to get attention and commendation from people who are unfamiliar), they should focus on PMRP studies.


Cohen, G.L. Party over Policy: The Dominating Impact of Group Influence on Political Beliefs. J. Personality & Soc. Psych. 85, 808-822 (2003).

Druckman, J.N. The Politics of Motivation. Critical Review 24, 199-216 (2012).

Gerber, A. & Green, D. Misperceptions about Perceptual Bias. Annual Review of Political Science 2, 189-210 (1999).

Good, I.J. Weight of evidence: A brief survey. in Bayesian statistics 2: Proceedings of the Second Valencia International Meeting (ed. J.M. Bernardo, M.H. DeGroot, D.V. Lindley & A.F.M. Smith) 249-270 (Elsevier, North-Holland, 1985).

Kahan, D.M. Cognitive Bias and the Constitution. Chi.-Kent L. Rev. 88, 367-410 (2012).

Kahan, D.M. Ideology, Motivated Reasoning, and Cognitive Reflection. Judgment and Decision Making 8, 407-424 (2013).

Kahan, D.M. Laws of Cognition and the Cognition of Law. Cognition 135, 56-60 (2015).

Kahan, D.M., Braman, D., Slovic, P., Gastil, J. & Cohen, G. Cultural Cognition of the Risks and Benefits of Nanotechnology. Nature Nanotechnology 4, 87-91 (2009).

Kahan, D.M., Jenkins-Smith, H., Tarantola, T., Silva, C. & Braman, D. Geoengineering and Climate Change Polarization: Testing a Two-Channel Model of Science Communication. Annals of the American Academy of Political and Social Science 658, 192-222 (2015).

Kahan, D.M., Hoffman, D.A., Braman, D., Evans, D. & Rachlinski, J.J. They Saw a Protest: Cognitive Illiberalism and the Speech-Conduct Distinction. Stan. L. Rev. 64, 851-906 (2012).

Kahan, D.M., Jenkins-Smith, H. & Braman, D. Cultural Cognition of Scientific Consensus. J. Risk Res. 14, 147-174 (2011).

Kahan, D.M., Peters, E., Dawson, E. & Slovic, P. Motivated Numeracy and Enlightened Self Government. Cultural Cognition Project Working Paper No. 116  (2013).

Rabin, M. & Schrag, J.L. First Impressions Matter: A Model of Confirmatory Bias. Quarterly Journal of Economics 114, 37-82 (1999).

Sherman, D.K. & Cohen, G.L. The Psychology of Self-defense: Self-Affirmation Theory. in Advances in Experimental Social Psychology 183-242 (Academic Press, 2006).

Stanovich, K.E. Rationality and the reflective mind (Oxford University Press, New York, 2011).

*See* "cognitive reflection" *magnify* (ideologically symmetric) motivated reasoning ... (not for faint of heart)

So this is in the category of "show me the data, please!"

I'm all for statistical models to test, discipline, and extend inference from experimental (or observational) data.

But I'm definitely against the use of models in lieu of displaying raw data in a manner that shows that there really is a prospective inference to test, discipline, and extend.  

Statistics are a tool to help probe and convey information about effects captured in data; they are not a device to conjure effects that aren't there.

They are also a device to promote rather than stifle critical engagement with evidence. But that's another story--one that goes to effective statistical modeling and graphic presentation.  

The point I'm making now, and have before, is that researchers who either present a completely perfunctory summary of the raw data (say, a summary of means for an arbitrarily selected number of points for continuous data) or simply skip right over summarizing the raw data and proceed to multivariate modeling are not furnishing readers with enough information to appraise the results.

The validity of the modeling choice in the statistical analysis--and of the inferences that the model supports--can't be determined unless one can *see* the data!

Like I said, I've made that point before.

And all of this is a wind-up for a simple "animated" presentation of the raw data from one CCP study, Kahan, D.M., Ideology, Motivated Reasoning, and Cognitive Reflection. Judgment and Decision Making 8, 407-424 (2013).

That study featured an experiment to determine how the critical reasoning proficiency measured by the Cognitive Reflection Test (CRT) interacts with identity-protective reasoning--the species of motivated reasoning that consists in the tendency of individuals to selectively credit or discredit data in a manner that protects their status within an identity-defining affinity group.

The experiment involved, first, having the subjects take the CRT, a short (3-item) performance based measure of their capacity and disposition to interrogate their intuitions and preconceptions when engaging information. 

It's basically considered the "gold standard" for assessing vulnerability to the sorts of biases that reflect overreliance on heuristic information processing.  With some justification, many researchers also think of it as a measure of how willing people are to open-mindedly revise their beliefs in light of empirical evidence, a finding that is at least modestly supported by several studies of how CRT and religiosity interact.

I've actually commented a bit on what I regard as the major shortcoming of CRT: it's too hard, and thus fails to capture individual differences in the underlying critical reasoning disposition among those who likely are in the bottom 50th percentile with respect to it.  But that's nit picking; it's a really really cool & important measure, and vastly superior to self-report measures like "Need for Cognition," "Need for Closure" and the like.

After taking the test, subjects were divided into three treatment groups. One was a control, which got information that explained social psychologists had collected data and concluded that the CRT was a valid measure of how "open-minded and reflective" a person is.

Another was the "believer scores higher" condition: in that one, subjects were told in addition that individuals who believe in climate change have been determined to score higher on the CRT.

Finally there was the "skeptic scores higher" condition: in that one, subjects were told that individuals who are skeptical of climate change have been found to score higher.

Subjects in all three conditions then indicated what they thought of the validity of the CRT by indicating how strongly they agreed or disagreed with the statement "I believe the word-problem test that I just took supplies good evidence of how reflective and open-minded a person is." 

Because belief in climate change is associated with membership in identity-defining cultural groups that are indicated by political outlooks (and of course even more strongly by cultural worldviews), one would expect identity-protective reasoning to unconsciously motivate individuals to selectively credit or dismiss the information on the validity of the CRT conditional on whether they had been advised that it showed that individuals who subscribed to their group's position on climate change were more or less "reflective" and "open-minded" than those who subscribed to the rival group's position.

The study tested that proposition, then.

But it also was designed to pit a number of different theories of motivated reasoning against each other, including what I called the "bounded rationality thesis" (BRT) and the "ideological asymmetry thesis" (IAT). 

BRT sees motivated reasoning as just another one of the cognitive biases associated with over-reliance on heuristic rather than effortful, conscious information-processing.  It thus should predict that identity-protective reasoning, as measured in this experiment, will be lower in individuals who score higher in CRT.

IAT, in contrast, attributes politically motivated reasoning to a supposedly dogmatic reasoning style (one supposedly manifested by self-report measures of the sort that are vastly inferior to CRT) on the part of individuals who are politically conservative.  Because CRT has been used as a measure of open-minded engagement with evidence (particularly in studies of religiosity), IAT would predict that motivated reasoning ought to be more pronounced among conservatives than among liberals.

The third position was the "expressive rationality thesis" (ERT). ERT posits that it is individually rational, once positions on disputed risks and comparable facts have acquired a social meaning as badges of membership in and loyalty to a self-defining affinity group, to process information about societal risks (ones their individual behavior can't affect meaningfully anyway) in a manner that promotes beliefs consistent with the ones that predominate in their group.  That kind of reasoning style will tend to make the individuals who engage in it fare better in their everyday interactions with peers--notwithstanding its undesirable social impact in inhibiting diverse democratic citizens from converging on the best available evidence.

Contrary to IAT, ERT predicts that identity-protective reasoning will be ideologically symmetric.  Being "liberal" is an indicator of being a member of an identity-defining affinity group just as much as being "conservative" is, and thus furnishes the same incentive in individual group members to process information in a manner that promotes status-protecting beliefs in line with those of other group members.

Contrary to BRT and IAT, ERT predicts that this identity-protective reasoning effect will increase as individuals become more proficient in the sort of critical reasoning associated with CRT.  Because it is perfectly rational--at an individual level--for individuals to process information relevant to social risks and related issues in a manner that protects their status within their identity-defining affinity groups, those who possess the sort of reasoning proficiency associated with CRT can be expected to use it to do that even more effectively.

The experiment supported ERT more than BRT or IAT. 

When I say this, I ought to be able to enable you to see that in the raw data!

By "raw data," I mean the data before it has been modeled statistically. Obviously, to "see" anything in it, one has to arrange the raw data in the manner that makes it admit of visual interpretation.

So for that purpose, I plotted the subjects (N = 1750) on a grid comprising their "right-left" political outlooks (as measured with a composite scale that combined their responses to a conventional 7-point party self-identification measure and a 5-point liberal-conservative ideology measure) on the x-axis and their assessment of the CRT as measured by the 6-point "agree-disagree" outcome variable on the y-axis.
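For readers who want to build a comparable x-axis variable, here is a minimal sketch. The post does not say how the two items were combined, so the approach below (standardize each item, average them, then re-standardize) is an assumption, as are the coding directions and the made-up responses.

```python
from statistics import mean, pstdev

def zscores(values):
    """Standardize a list of numbers to mean 0 and standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def composite_scale(party_id, ideology):
    """Average the two standardized items, then re-standardize the result.

    Assumed coding (not stated in the post): higher values are more
    Republican / more conservative on both items.
    """
    zp, zi = zscores(party_id), zscores(ideology)
    raw = [(a + b) / 2 for a, b in zip(zp, zi)]
    return zscores(raw)

# Hypothetical responses for seven subjects.
party_id = [1, 2, 4, 6, 7, 3, 5]   # 7-point party self-identification
ideology = [1, 2, 3, 4, 5, 2, 4]   # 5-point liberal-conservative ideology
scale = composite_scale(party_id, ideology)

for p, i, s in zip(party_id, ideology, scale):
    print(f"party={p} ideology={i} composite={s:+.2f}")
```

Because the result is centered at zero, "left of the mean" and "right of the mean" correspond simply to negative and positive scores under this coding.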

There are, unfortunately, too many subjects to present a scatterplot: the subjects would end up clumped on top of each other in blobs that obscured the density of observations at particular points, a problem called "overplotting."
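A quick back-of-the-envelope check shows why. If the x-axis score is built only from a 7-point and a 5-point item, it can take at most 35 distinct values, and the 6-point outcome gives at most 210 distinct positions on the grid for 1750 subjects. The sketch below uses uniformly random, purely hypothetical responses (real responses would be lumpier, making the stacking worse) to count how many subjects land on the same spot.

```python
import random
from collections import Counter

random.seed(0)
N = 1750  # sample size reported in the post

# Hypothetical responses: (party ID 1-7, ideology 1-5, outcome 1-6).
points = [
    (random.randint(1, 7), random.randint(1, 5), random.randint(1, 6))
    for _ in range(N)
]

stacks = Counter(points)          # how many subjects share each position
max_positions = 7 * 5 * 6         # upper bound on distinct plot positions
distinct = len(stacks)
largest_stack = max(stacks.values())

print(f"{N} subjects occupy {distinct} of at most {max_positions} positions")
print(f"largest stack of identical points: {largest_stack}")
```

Whatever the actual distribution of responses, at least one position must hold nine or more subjects, since 1750 / 210 is a bit over 8.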

But "lowess" or "locally weighted regression" is a technique that allows one to plot the relative proportions of the observations in relation to the coordinates on the grid.  Lowess is a kind of anti-model modeling of the data; it doesn't impose any particular statistical form on the data but in effect just traces the moving average or proportion along tiny increments of the x-axis. 

Plotting a lowess line faithfully reveals the tendency in the data one would be able to see with a scatterplot but for the overplotting.
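To make the idea concrete, here is a minimal pure-Python sketch of locally weighted linear regression with tricube weights. It is an illustration of the general technique, not the software or settings used for the plots in this post (those aren't specified here); it omits the robustness iterations some lowess implementations add, and the data, the `frac` value, and the function names are all hypothetical.

```python
import random

def tricube(u):
    """Tricube kernel: 1 at u = 0, falling smoothly to 0 at |u| >= 1."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0

def lowess(xs, ys, frac=0.5, grid=None):
    """Locally weighted linear regression, without robustness iterations.

    At each grid point x0, fit a weighted least-squares line in which
    observations closer to x0 count more, and return the fit at x0.
    """
    n = len(xs)
    k = max(2, min(n, round(frac * n)))      # neighbours used in each local fit
    grid = sorted(set(xs)) if grid is None else list(grid)
    fitted = []
    for x0 in grid:
        dists = sorted(abs(x - x0) for x in xs)
        # Bandwidth: distance to the k-th nearest point, widened slightly so
        # that tied neighbours still receive a nonzero weight.
        h = dists[k - 1] * 1.001 + 1e-12
        w = [tricube((x - x0) / h) for x in xs]
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        slope = sxy / sxx if sxx > 1e-12 else 0.0   # flat fit if x has no spread
        fitted.append(my + slope * (x0 - mx))
    return grid, fitted

# Hypothetical data: standardized political outlook vs. a 1-6 agreement rating
# that tends to fall as outlook moves right.
random.seed(1)
outlook = [random.gauss(0, 1) for _ in range(300)]
rating = [min(6, max(1, round(3.5 - 0.6 * x + random.gauss(0, 1)))) for x in outlook]

grid, trend = lowess(outlook, rating, frac=0.6, grid=[-1.5, -0.5, 0.5, 1.5])
for g, t in zip(grid, trend):
    print(f"outlook {g:+.1f}: smoothed rating {t:.2f}")
```

Running the same function separately on each experimental condition, and plotting the fitted values against the grid, would give one line per condition of the kind shown in the animation.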

Okay, so here I've created an animation that plots the lowess regression line successively for the control, the "believer scores higher," and the "skeptic scores higher" conditions:

What you can see is that there is essentially no meaningful relationship between the perceived validity of CRT and political outlooks in the "control" condition.

In "believer scores higher," however, the willingness of subjects to credit the data slopes downward: the more "liberal, Democratic" subjects are, the more they credit it, while the more "conservative, Republican" they are the less they do so.

Likewise, in the "skeptic scores higher" condition, the willingness of subjects to credit the data slopes upward: the more "liberal, Democratic" subjects are, the less they credit it, while the more "conservative, Republican" they are the more they do so.

That's consistent with identity-protective reasoning.

All of the theories--BRT, IAT, and ERT--predicted that.

But IAT predicted the effect would be asymmetric with respect to ideology.  Doesn't look that way to me...

Now consider the impact of the experimental assignment in relation to scores on CRT.  This animation plots the effect of ideology on the perceived validity of the CRT separately for subjects based on their own CRT scores (information, of course, with which they were not supplied):

What you can see is that the steepness of the slopes is intensifying--the relative proportion of subjects who are moving in the direction associated with identity-protective reasoning getting larger--as CRT goes from 0 (the minimum score), to 0.65 (the sample mean), to 1 (about 80th percentile) to >1 (approximately 90th percentile & above).

That result is inconsistent with BRT, which sees motivated reasoning as a product of overreliance on heuristic reasoning, but consistent with ERT, which predicts that individuals will use their cognitive reasoning proficiencies to engage in identity-protective reasoning.

Notice, too, that there is no meaningful evidence of the sort of asymmetry predicted by IAT.

The equivalent of these "raw data" summaries appears in the paper--although they aren't animated, which I think is a shame!

So that's that.

Or not really.  That's what the data look like--and the inference that they seem to support.

To discipline and extend those inferences, we can now fit a model.

I applied an ordered logistic regression to the experimental data, one the results of which confirmed that the observed effects were "statistically significant."  But because the regression output is also not particularly informative to a reflective person trying to understand the practical effect of the data, I also used the model to predict the impact of the experimental assignment on typical partisans (setting the predictor levels at "liberal Democrat" and "conservative Republican," respectively) and for both "low CRT" (CRT=0) and "high CRT" (CRT=2).
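As a rough illustration of how an ordered logit turns predictor settings into predicted responses, here is a minimal sketch. Every number in it is invented for the example: the cutpoints, the coefficients, the simplified set of predictors, and the assumption that responses 4-6 on the 6-point item count as "agree." None of it is taken from the paper's fitted model.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def category_probs(eta, cutpoints):
    """Probabilities of each response category under a proportional-odds
    model, where P(Y <= j) = logistic(cutpoint_j - eta)."""
    cum = [logistic(c - eta) for c in cutpoints] + [1.0]
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]

def prob_agree(eta, cutpoints, first_agree=4):
    """Probability of a response at or above `first_agree` (assumed: 4-6 = agree)."""
    return sum(category_probs(eta, cutpoints)[first_agree - 1:])

# Hypothetical cutpoints for a 6-point outcome and hypothetical coefficients.
CUTPOINTS = [-2.0, -1.0, 0.0, 1.0, 2.0]
COEF = {"outlook_x_skeptic": 0.4, "outlook_x_skeptic_x_crt": 0.3}

def linear_predictor(outlook, skeptic, crt):
    """outlook: -1 = liberal Democrat, +1 = conservative Republican (assumed coding);
    skeptic: 1 if assigned to 'skeptic scores higher', 0 if control."""
    return (COEF["outlook_x_skeptic"] * outlook * skeptic
            + COEF["outlook_x_skeptic_x_crt"] * outlook * skeptic * crt)

for label, outlook in [("liberal Democrat", -1), ("conservative Republican", +1)]:
    for crt in (0, 2):
        p = prob_agree(linear_predictor(outlook, skeptic=1, crt=crt), CUTPOINTS)
        print(f"{label}, CRT={crt}: P(agree CRT is valid) = {p:.2f}")
```

With these made-up coefficients the gap between the two partisans widens as CRT rises, which is the pattern ERT predicts; a real analysis would plug in the estimated coefficients and report the uncertainty around each predicted probability.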

Not graphically reporting multivariate analyses--leaving readers staring at columns of regression coefficients with multiple asterisks, the practical import of which is indecipherable even to someone who understands what the output means--is another thing that researchers shouldn't do.

But even if they do a good job graphically reporting their statistical model results, they must first show the reader that the raw data support the inferences that the model is being used to test or discipline and refine.

Otherwise there's no way to know whether the modeling choice is valid -- and no way to assess whether the results support the conclusion the researcher has reached.

Good bye!


Against "consensus messaging" . . .

Post-debate press conference... did I mention my sore shoulder?

This is more or less what I remember saying in my "opening statement" in the University of Bristol "debate" with Steve Lewandowsky over the utility of "consensus messaging." Obviously, I don't remember exactly what I said b/c Steve knocked me unconscious with a lightning-quick 1-6-3-2 (i.e., Jab-Right uppercut-Left hook-rt-hand) combination. But the exchange was fruitful, especially after we abandoned the pretense of being "opposed" to one another and entered into conversation about what we know, what we don't, and what sorts of empirical observations might help us all to learn more. 

 Slides here.

* * *

I want to start with what I am not against.

I’m not against the proposition that there is a scientific consensus that human activity is causing climate change. That to me is the plain inference to be drawn from the concurrence of expert sources such as U.S. National Academy of Sciences, the Royal Society, and the IPCC.

I am also by no means against communicating scientific consensus on climate change. Indeed, both Steve and I have done studies that find that when there is cultural polarization over a societal risk, both sides always agree that scientific consensus should inform public policy.

What I am against is the proposition that the way to dispel polarization over global warming in the U.S. is to continue a decade-long “social marketing campaign”--one on which literally hundreds of millions of dollars have already been spent--that features the claim that “97% [or 98% or 100% etc] of scientists accept human caused climate change.”

I am against this "communication strategy"--

  • first, because it misunderstands the nature of the problem;
  • second, because it diverts resources from alternative approaches that have a much better prospect for success; and
  • third, because it predictably reinforces the toxicity of the climate change debate for our science communication environment.

1. Misunderstands the problem. The most logical place to start is with what members of the public actually think climate scientists believe about the causes and consequences of climate change.

About 75% of the individuals whose political outlooks are “liberal” (meaning to the “left” of the mean on a political outlook scale that aggregates their responses to items on partisan identification and liberal-conservative ideology) are able to correctly identify “carbon dioxide” as the “gas . . . most scientists believe causes temperatures in the atmosphere to rise.”

That’s very close to the same percentage of “liberals” who agree that human activity is causing climate change.

But if you think that that's a causal relationship, think again: about 75% of “conservatives” (individuals with political outlooks to the “right” of the mean on the same scale) know that scientists believe CO2 emissions increase atmospheric temperatures, too.  Yet only 25% of them say they “believe in” human-caused climate change.

The vast majority of liberals and conservatives, despite being polarized on whether global warming is occurring, also have largely the same impression of climate scientists' views of the risks that global warming poses.

Indeed, by substantial majorities, members of the public on both the left and right agree that climate scientists attribute all manner of risk to global warming that in fact no climate scientists attribute to it.

Contrary to what the vast majority of “liberal” and “conservative” members of the public think, climate scientists do not believe that climate change will increase the incidence of skin cancer.

Contrary to what the vast majority of “liberal” and “conservative” members of the public think, climate scientists do not believe sea levels will rise if the north pole ice cap melts (unlike the south pole ice cap, which sits atop a land mass, the north pole “ice cap” is already floating in the sea, a point that various “climate science literacy” guides issued by scientific bodies like NASA and NOAA emphasize).

And contrary to what the vast majority of “liberal” and “conservative” members of the public think, climate scientists do not believe that “the increase of atmospheric carbon dioxide associated with the burning of fossil fuels will reduce photosynthesis by plants.”

They haven’t quite gotten the details straight, it’s true.

But both “liberals” and “conservatives” have “gotten the memo” that scientists think human activity is causing climate change and that we are in deep shit as a result. 

So why should we expect that telling them what they already know will dispel the controversy reflected in persisting poll results showing that they are polarized on global warming?

I know what you are thinking: maybe climate-consensus messaging would work better if the "message" actually helped educate people on climate change science.

Well, I can give you some relevant data on that, too.

The individuals who scored the highest on this climate-literacy assessment aren’t any less divided when asked if they “believe in” climate change.  On the contrary, the “liberals” and “conservatives” who score highest--the ones who consistently distinguish the positions that climate scientists actually hold from the ones they do not--are the most polarized of all.

“Ah,” you are thinking.  “Then the problem must be that conservatives don’t trust climate scientists!”

I don’t think that’s right.

But if one took that position, then one would presumably think “consensus messaging” is pointless. Why should right-leaning citizens care that “97% of scientists accept climate change” if they don’t trust a word they are saying?

That’s logical.  But it’s not the view of those who support “consensus messaging.”  Indeed, the researchers who purport to “prove” that conservatives “distrust” climate scientists are the very same ones who are publishing studies (or republishing the same study over and over) that they interpret as “proving” consensus-messaging will work (despite their remarkable but unremarked failure to report any evidence that being exposed to the message affected the proportion of people who "believe in" climate change).

These meticulous researchers are hedged: no matter what happens, they will have predicted it!

Here, though, is some evidence on whether those who “don’t believe” in climate change trust climate scientists.

Leaving partisanship aside, farmers are probably the most skeptical segment of the US population. But they are also the segment that makes the greatest use of climate science in their practical decisionmaking.

The same ones who say they don’t think climate change has been “scientifically proven” are already busily adapting—self-consciously so—to climate change by adopting practices like no-till farming.

They also anticipate buying more crop-failure insurance.  Which is why Monsanto, which is pretty good at figuring out what farmers believe, recently acquired an insurance operation.

Because Monsanto knows how farmers really feel about climate scientists, it also recently acquired a firm that specializes in synthesizing government and university climate-science data for the purpose of issuing made-to-order forecasts tailored to users’ locations.  It expects the consumption of this fine-grained, local forecasting data to be a $20 billion market. Because farmers, you see, really really really want to know what climate scientists think is going to happen.

I’ll tell you someone else who you can be sure knows what farmers really think about climate scientists: their representatives in Congress.

Consider Congressman Frank Lucas, Republican, 3d district of Oklahoma.  He has been diagnosed, in the charming idiom of the “climate change debate,” as suffering from “climate denier disorder syndrome.”  He is the “vice-chair” of the House Committee on Science (sic), Space (sic) and Technology (sic), which recently proposed slashing NASA’s budget for climate change research.

I’m sure his skeptical farmer constituents appreciate all that.

But they also are very pleased that Lucas, as the chair of the House Agriculture Committee, sponsored the 2014 Agriculture bill, which appropriated over a billion dollars for scientific research on the impact of climate change on farming.  His skeptical farmer constituents know they need science’s help to protect their cattle from climate change.  They got it to the tune of $10 million, which is what the USDA awarded Oklahoma State University at Stillwater, which is in Lucas’s district!

But he’s not selfish. His bill enabled huge appropriations for the other skeptical-farmer-filled states, too!

You see, there are really two “climate changes” in America.

There’s the one people “believe in” or “disbelieve in” solely for the purpose of expressing their allegiance in a mean, ugly, illiberal status competition between opposing cultural groups.

Then there’s the one that people “believe in” in order to do things—like being a farmer—that depend on the best available scientific evidence.

As you can imagine, it’s a challenge for a legislator to keep all this straight. 

Bob Inglis, from the farming state of South Carolina, for example, announced that he “believed in” climate change and wanted Congress to address the issue.

Wrong climate change!  That’s the one his constituents don’t believe in.  

Didn’t you notice, they ask, how funny it was when Senator Inhofe (of Oklahoma, who for sure didn't oppose the appropriation of all that money in the farm bill to support scientific research to help farmers adapt to global warming) brought a snow ball onto the floor of the Senate to show Al Gore how stupid he is for thinking there is scientific evidence of global warming?

"You're out of here!," Inglis’s constituents said, retiring him in a primary against a climate-skeptical Republican opponent.

Some people say that Republican members of Congress who reject climate change are stupid. But actually, it takes considerable mental dexterity not to get messed up on which “climate change” one’s farmer constituents don’t believe in and which they do.

2. Diversion of resources.  The only way to promote constructive collective decisionmaking on the climate change that ordinary people, left and right, are worried about, and that farmers and other practical individuals are taking steps to protect themselves from, is to protect our science communication environment from the toxic effects of the other climate change--the one that people believe or disbelieve in to express their tribal loyalties.

That’s the lesson of Southeast Florida climate political science.

Because people in that region are as diverse in their outlooks as the rest of the Nation, they are as polarized on the “whose side are you on” form of “climate change” as everyone else.

Nevertheless, the member counties of the Southeast Florida Climate Change Compact—Broward, Miami-Dade, Palm Beach, and Monroe—have approved a joint “Regional Climate Action Plan,” which consists of some 100 mitigation and adaptation items.

The leaders in these counties didn’t bombard their constituents with “consensus messaging.”  Instead they adopted a style of political discourse that disentangled the question of “who are you, whose side are you on” from the question of “what should we do with what we know?”

Because they have banished the former “climate change question”  from their political discourse, a Republican member of the House doesn’t bear the risk that he’ll be confused for a cultural traitor when he calls a press conference and says “I sure as hell do believe in climate change, and I am going to demand that Congress address the threat that it poses to my constituents.”

There are some really great organizations that are helping the members of the Southeast Florida Compact and other local governments to remove the toxic “whose side are you on” question from their science communication environments.

But they are not getting nearly the support that they need from those who care about climate change policymaking, because nearly all of that support—in the form of hundreds of millions of dollars—is going instead to groups that prefer to pound the other team’s members over the head with “consensus messaging.”

The 2013 Cook et al. study was not telling us anything new. There had already been six previous studies finding an overwhelming scientific consensus on climate change, the first of which was published in Science, a genuinely significant event, in 2004.

The people advocating “consensus messaging” aren’t advocating anything new either. Al Gore’s Alliance for Climate Protection spent over $300 million to promote “consensus messaging,” which was featured in Gore’s 2006 movie An Inconvenient Truth (no doubt the organization gave $1 million to an advertising agency, which conducted a focus group to validate its seat-of-the-pants guess that “reframing” the organization’s name as “Climate Reality” would convince farmers to “believe in” climate change).

Public opinion on climate change—whether it is “happening,” is “human caused,” etc.—didn’t move an inch at all during that time.

But we are supposed to think that that’s irrelevant because immediately after experimenters told them “97% of scientists accept climate change,” a group of study subjects, while not changing their own positions on whether climate change is happening, increased by a very small amount their expressed estimate of the percentage of scientists who believe in climate change?   Seriously?

The willingness of people to continue to “believe in” consensus messaging is itself a science communication problem.  That one will get solved only if researchers resolve to tell people what they need to know, and not simply what they want to hear.

3. Perpetuating a toxic discourse.  No doubt part of the appeal of “consensus messaging” is how well suited it is as an idiom for expressing contempt.  The kinds of real-world “messaging campaigns” that feature the “97% agree” slogan all say “you are an idiot” to those for whom not believing climate change has become identity defining.  It is exactly that social meaning that must be removed from the climate change question before people can answer it with what they know: that their well-being and the well-being of others they actually care about requires doing sensible things with the best available current evidence.

Did you ever notice how all of the “consensus messages” invoke NASA?  The reason is that poorly designed studies, using invalid measures, found that people say they “trust NASA” more than various other science entities, the majority of which they've never even heard of.

I don't doubt, though, that the US general public used to revere NASA. But now bashing NASA is seen as more effective than bringing a snowball onto the floor of the Senate as a way to signal to farmers and other groups whose cultural identity is associated with skepticism that one has the values that make him or her fit to represent them in Congress.

Did I say “consensus messaging” hadn’t achieved anything?  If so, I spoke too soon.

Yay team.

* * *

Climate science models get updated after a decade of real-world observations.

The same is necessary for climate-science-communication models.

A decade’s experience shows that “consensus messaging” doesn’t work.  Our best lab and field studies, as well as a wealth of relevant experience by people who are doing meaningful communication rather than continuously fielding surveys that don't even measure the right thing, tell us why: "consensus messaging" is unresponsive to the actual dynamics driving the climate change controversy.

So it is time to update our models.  Time to give alternative approaches--ones that reflect rather than ignore evidence of the mechanisms of cultural conflict over societal risks--a fair trial, during which we can observe and measure their effects, and after which we can revise our understandings once more, incorporate what we have learned into refined approaches, and repeat the process yet again.

Otherwise the “science of science communication” isn’t scientific at all.




A Pigovian tax solution (for now) for review/publication of studies that use M Turk samples

I often get asked to review papers that use M Turk samples.

This is a problem because I think M Turk samples, while not invalid for all forms of study, are invalid for studies of how individual differences in political predispositions and cognitive-reasoning proficiencies influence the processing of empirical information relevant to risk and other policy issues.

I've discussed this point at length.

And lots of serious scholars now have engaged this issue seriously.   

"Seriously" not in the sense of merely collecting some data on the demographics of M Turk samples at one point in time and declaring them "okay" for all manner of studies once & for all. Anyone who produces a study like that, or relies on it to assure readers his or her own use of an M Turk sample is "okay," either doesn't get the underlying problem or doesn't care about it.

I mean really seriously in the sense of trying to carefully document the features of the M Turk work force that bear on the validity of it as a sample for various sorts of research, and in the sense of engaging in meaningful discussion of the technical and craft issues involved.

I myself think the work and reflections of these serious scholars reinforce the conclusion that it is highly problematic to rely on M Turk samples for the study of information processing relating to risk and other facts relevant to public policy.

The usual reply is, "but M Turk samples are inexpensive! They make it possible for lots & lots of scholars to do and publish empirical research!"

Well, thought experiments are even cheaper.  But they are not valid.  

If M Turk samples are not valid, it doesn't matter that they are cheap. Validity is a non-negotiable threshold requirement for use of a particular sampling method. It's not an asset or currency that can be spent down to buy "more" research-- for the research that such a "trade off" subsidizes in fact has no value.

Another argument is, "But they are better than university student samples!"  If student samples are not valid for a particular kind of research, then journals shouldn't accept studies that use them either. But in any case, it's now clear that M Turk workers don't behave the way U.S. university students do when responding to survey items that assess whether subjects are displaying the sorts of reactions one would expect in people who  claim that they are members of the U.S. public with particular political outlooks (Krupnikov & Levine 2014).

I think serious journals should adopt policies announcing that they won't accept studies that use M Turk samples for types of studies they are not suited for.

But in any case, they ought at least to adopt policies one way or the other--rather than put authors in the position of not knowing before they collect the data whether journals will accept their studies, and authors and reviewers in the position of having a debate about the appropriateness of using such a sample over & over.  Case-by-case assessment is not a fair way to handle this issue, nor one that will generate a satisfactory overall outcome.

So ... here is my proposal: 

Pending a journal's adoption of a uniform policy on M Turk samples, the journal should oblige authors who do use M Turk samples to give a full account--in their paper-- of why the authors believe it is appropriate to use M Turk workers to model the reasoning process of ordinary members of the U.S. public.  The explanation should  consist of a full accounting of the authors’ own assessment of why they are not themselves troubled by the objections that have been raised to the use of such samples; they shouldn't be allowed to dodge the issue by boilerplate citations to studies that purport to “validate” such samples for all purposes, forever & ever.  Such an account helps readers to adjust the weight that they afford study findings that use M Turk samples in two distinct ways: by flagging the relevant issues for their own critical attention; and by furnishing them with information about the depth and genuineness of the authors’ own commitment to reporting research findings worthy of being credited by people eager to figure out the truth about complex matters.

There are a variety of key points that authors should be obliged to address.

First, M Turk workers recruited to participate in “US resident only” studies have been shown to misrepresent their nationality.  Obviously, inferences about the impact of partisan affiliations distinctive of the US general public cannot validly be made on the basis of samples that contain a “substantial” proportion of individuals from other societies (Shapiro, Chandler and Mueller 2013).  Some scholars have recommended that researchers remove from their “US only” M Turk samples those subjects who have non-US IP addresses.  However, M Turk workers are aware of this practice and openly discuss in on-line M Turk forums how to defeat it by obtaining US-IP addresses for use on “US worker” only projects.  If authors are purporting to empirically test hypotheses about how members of the U.S. general public reason on politically contested matters, why don't they see the incentive of M Turk workers to misrepresent their nationality as a decisive objection to using them as their study sample?

Second, M Turk workers have demonstrated by their behavior that they are not representative of the sorts of individuals that studies of political information-processing are supposed to be modeling. Conservatives are grossly under-represented among M Turk workers who represent themselves as being from the U.S. (Richey 2012).  One can easily “oversample” conservatives to generate adequate statistical power for analysis. But the question is whether it is satisfactory to draw inferences about real US conservatives generally from individuals who are doing something that such a small minority of real U.S. conservatives are willing to do.  It’s easy to imagine that the M Turk US conservatives (if really from the US) lack sensibilities that ordinary US conservatives normally have--such as the sort of disgust sensitivities that are integral to their political outlooks (Haidt & Hersh 2001), and that would likely deter them from participating in a "work force" a major business activity of which is “tagging” the content of on-line porn. These unrepresentative US conservatives might well not react as strongly or dismissively toward partisan arguments on a variety of issues.  So why is this not a concern for the authors? It is for me, and I’m sure would be for many readers trying to assess what to make of a study that nevertheless uses an M Turk sample.

Third, there are in fact studies that have investigated this question and concluded that M Turk workers do not behave the way that US general population or even US student samples do when participating in political information-processing experiments (Krupnikov & Levine 2014).   Readers will care about this—and about whether the authors care.

Fourth, Amazon M Turk worker recruitment methods are not fixed and are neither designed nor warranted to generate samples suitable for scholarly research. No serious person who cares about getting at the truth would accept the idea that a particular study done at a particular time could “validate” M Turk, for the obvious reason that Amazon doesn’t publicly disclose its recruitment procedures, can change them anytime and has on multiple occasions, and is completely oblivious to what researchers care about.  A scholar who decides it’s “okay” to use M Turk anyway should tell readers why this does not trouble him or her.

Fifth, M Turk workers share information about studies and how to respond to them (Chandler, Mueller & Paolacci 2014).   This makes them completely unsuitable for studies that use performance-based reasoning proficiency measures, which M Turk workers have been massively exposed to.  But it also suggests that the M Turk workforce is simply not an appropriate place to recruit subjects from for any sort of study in which subject communication can contaminate the sample. Imagine you discovered that the firm you had retained to recruit your sample had a lounge in which subjects about to take the study could discuss it w/ those who had just completed it; would you use the sample, and would you keep coming back to that firm to supply you with study subjects in the future? If this does not bother the authors, they should say so; that’s information that many critical readers will find helpful in evaluating their work.

I feel pretty confident M Turk samples are not long for this world for studies that examine individual differences in reasoning relating to politically contested risks and other policy-relevant facts (again, there are no doubt other research questions for which M Turk samples are not nearly so problematic).  

Researchers in this area will not give much weight to studies that rely on M Turk samples as scholarly discussion progresses.  

In addition, there is a very good likelihood that an on-line sampling resource that is comparably inexpensive but informed by genuine attention to validity issues will emerge in the not too distant future.

E.g., Google Consumer Surveys now enables researchers to field a limited number of questions for between $1.10 & $3.50 per complete-- a fraction of the cost charged by on-line firms that use valid & validated recruitment and stratification methods.

Google Consumer Surveys has proven its validity in the only way that a survey mode--random-digit dial, face-to-face, on-line--can: by predicting how individuals will actually evince their opinions or attitudes in real-world settings of consequence, such as elections.  Moreover, if Google Consumer Surveys goes into the business of supplying high-quality scholarly samples, it will be obliged to be transparent about its sampling and stratification methods and to maintain them (or update them for the purpose of making them even more suited for research) over time.  

As I said, Amazon couldn't care less whether the recruitment methods it uses for M Turk workers now or in the future make them suited for scholarly research.

The problem right now w/ Google Consumer Surveys is that the number of questions is limited and so, as far as I can tell, is the complexity of the instrument that one is able to use to collect the data, making experiments infeasible.

But I predict that will change.

We'll see.

But in the meantime, obliging researchers who think it is "okay" to use M Turk samples to explain why they apparently are untroubled by the serious issues being raised about the validity of these samples would be an appropriate way, it seems to me, to make those who use such samples internalize the cost that polluting the research environment with M Turk studies is imposing on social science research on cognition and political conflict.


Chandler, J., Mueller, P. & Paolacci, G. Nonnaïveté among Amazon Mechanical Turk workers: Consequences and solutions for behavioral researchers. Behavior Research Methods 46, 112-130 (2014).

Haidt, J. & Hersh, M.A. Sexual morality: The cultures and emotions of conservatives and liberals. J Appl Soc Psychol 31, 191-221 (2001). 

Kahan, D. Fooled Twice, Shame on Who? Problems with Mechanical Turk Study Samples. Cultural Cognition Project (2013a).

Krupnikov, Y. & Levine, A.S. Cross-Sample Comparisons and External Validity. Journal of Experimental Political Science 1, 59-80 (2014).

Richey, S. & Taylor, B. How Representative Are Amazon Mechanical Turk Workers? The Monkey Cage (2012).

Shapiro, D.N., Chandler, J. & Mueller, P.A. Using Mechanical Turk to Study Clinical Populations. Clinical Psychological Science 1, 213-220 (2013).