
Critical, must-do 2016 CCP NY resolutions!


Coolest article of the yr-- hot hands down!

Boy, it's not even close.

I’m going to resist summarizing Miller & Sanjurjo’s “Surprised by the Gambler's and Hot Hand Fallacies? A Truth in the Law of Small Numbers,” not only because I’ve already tried to do that multiple times but also because any attempt to do so results in a mental misadventure of staggering proportions.

Actually, that is what’s so cool about the article.  At least in my view. 

Like lots of other people--including, to his credit, the scholar most prominently identified with the classic “hot hand fallacy” study--I think it is really neat that M&S have re-opened the question whether the performance of athletes really does vary in patterns that defy the fluctuations one would expect to see by chance (i.e., whether NBA basketball players and others really do go on “hot streaks” etc).

I also am filled with admiration for their mathematical dexterity in exposing the error in the original “hot hand fallacy” research (viz., the assumption that the shooting consistency of basketball players over a finite set of observations should be measured in relation to the variance associated with a binomial distribution).
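For anyone who would rather see the bias than take it on faith, here is a minimal sketch in Python. It is my own toy illustration with fair coin flips, not M&S's code or their full analysis, and the function name is made up. It enumerates every equally likely sequence of a given length, computes within each sequence the share of "hits" among the shots that immediately follow a hit, and averages those shares. Intuition says the average should be 1/2; for sequences of four shots it is 17/42, or about 0.405.

```python
from fractions import Fraction
from itertools import product


def expected_share_of_hits_after_a_hit(n_shots):
    """Average, over all equally likely sequences of a 50% shooter, of the
    within-sequence proportion of hits among shots that follow a hit."""
    total = Fraction(0)
    usable_sequences = 0
    for seq in product((0, 1), repeat=n_shots):  # 1 = hit, 0 = miss
        # Outcomes of the shots taken immediately after a hit.
        followers = [seq[i + 1] for i in range(n_shots - 1) if seq[i] == 1]
        if not followers:
            continue  # no shot follows a hit, so the proportion is undefined
        total += Fraction(sum(followers), len(followers))
        usable_sequences += 1
    return total / usable_sequences


for n in (3, 4, 6, 10):
    share = expected_share_of_hits_after_a_hit(n)
    print(n, share, round(float(share), 3))
```

The shortfall comes entirely from averaging proportions within finite sequences, which is why benchmarking a shooter's observed streakiness against a flat 50% stacks the deck against finding a "hot hand."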

But what really intrigues me is what M&S's accomplishment tells us about cognition.  Or really what it tells us about what we don’t know but should about how intuition and conscious reflection operate in expert judgment.

How could researchers so familiar with probability theory, and so accomplished in exposing the errors people routinely make when attempting to detect patterns in random events, fail to detect the mistaken assumption that they themselves were making about how to detect such a pattern in this particular setting?

How could the error have evaded the notice of those who reviewed their work—and much more fundamentally the notice of thousands of scholars who for decades have held up the original “hot hand fallacy” study (along with its many progeny) as the paradigmatic demonstration of a particular cognitive bias (one that no one disputes really exists) and of a method for detecting defects in human rationality generally?

Why when they are shown incontrovertible (really!) proof of the error that the “hot hand” researchers made (and re-made over the course of numerous successor studies) do so many highly intelligent, reflective people—ones who unquestionably possess the knowledge and reasoning proficiency that it takes to understand the logic of the M&S argument—so strongly and stubbornly resist accepting it before (in the vast majority of cases, at least) finally acknowledging (often with a gratifying display of appreciative surprise) that M&S are right?

What is the cognitive process, in short, that makes individuals who have cultivated the habits of mind necessary to resist commonplace but mistaken intuitions about randomness vulnerable to being misled by mistaken intuitions about randomness that only those highly proficient in reasoning about randomness could have developed in the first place?

The project to answer this question started before 2015.

But the vividness imparted to this puzzle by the astonishing M&S paper, and the resulting amplification and dissemination of the motivation to solve it, will, I predict, energize researchers for years to come.

Still fooled by non-randomness? Some gadgets to help you *see* the " 'hot hand' fallacy" fallacy

Mining even more insight from the Pew "public science knowledge/attitudes" data--but hoping for even better extraction equipment (fracking technology, maybe?) in 2016...

Futzing around "yesterday" with the "public" portion of the "public vs. scientists" study (Pew 2015), I presented some data consistent with previous findings (Kahan 2014, 2015) that "beliefs" in human evolution and human-caused climate change measure cultural identity, not any aspect of science comprehension.

Well, there's actually still more fun things one can do with the Pew data (a way to pass the time, actually, as I wait for some new data on climate-science literacy... stay tuned!).

"Today" I'll share with you some interesting correlations between the Pew "science literacy" battery (also discussed yesterday; but actually, a bit more about it at the end of the post) & various "science-informed" policy issues.  I'll also show how those relationships interact with (vary in relation to) right-left political outlooks.

Ready? ...

Okay -- consider this one!

See? It's scary to eat GM foods, but people of all political outlooks & levels of science literacy agree that it makes sense to put blue GM tomatoes (or even a single "potatoe") in the gas tank of their SUVs.

But you know my view here: "what do you think of GM ..." in a survey administered to the general public measures non-opinion.  Fun for laughs, and for creating fodder for professional "anti-science" commentators, but not particularly helpful in trying to make genuine sense of public risk perceptions.

Just my opinion....

Here's another:

Okay, now this is meaningful stuff.  Not news, of course, but still nice to be able to get corroboration with additional high-quality data.

When polarization on a "societal risk" doesn't abate but increases conditional on science comprehension, that's a super strong indicator of a polluted science communication environment.  It is a sign that positions on an issue have become entangled in antagonistic social meanings that transform them into badges of identity in and loyalty to groups (Kahan 2012). When that happens, people will predictably use their reasoning proficiencies to fit their understanding of evidence to the view that predominates in their group.

Here one can reasonably question the inference I'm drawing, since Pew's items aren't about "risk perceptions" but rather "policy preferences." 

But if one is familiar with the "affect heuristic"--which refers to the tendency of people to conform their understanding of all aspects of a putative risk source to a generic pro- or con- attitude (Slovic, Peters, Finucane & MacGregor 2005; Loewenstein, Weber, Hsee & Welch 2001)--then one would be inclined to treat the Pew question as just another indicator of that risk-perception-generating sensibility.

The "affect heuristic" is what makes the "Industrial Strength Risk Perception Measure" so powerful.  Using ISRPM, CCP studies have found that the perceived risks of both fracking and nuclear power (not to mention climate change, of course) display the signature "polluted science communication environment" characteristic of increased cultural polarization conditional on greater reasoning proficiency.

I, anyway, am inclined to view the Pew data as more corroboration of this relationship, just as in "yesterday's" post I explained how the Pew data corroborated the findings that greater science comprehension generally and greater comprehension of climate science in particular magnify polarization.

But before signing off here, let me observe one thing about the Pew science literacy battery.

You likely noticed that the values on the y-axes of the figures start to get more bunched together at the high end.

That's because scores on the six-item, basic facts science literacy battery used in the Pew 2015 report are highly skewed in the direction of a high score.

Some 30% of the nationally representative sample got all six questions correct!

The distribution is a bit less skewed when one scores the responses to the battery using Item Response Theory, which takes account of the relative difficulty and measurement precision (or discrimination) of the individual items. But only a bit less. (You can't tell from the # of bins in the histogram, but there are actually over 5-dozen "science literacy" levels under the IRT model, as opposed to the 7 that result when one simply adds the number of correct responses; pretty cool illustration of how much more "information," as it were, one can get using IRT rather than "classic test theory" scoring.)
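Here is a rough sketch of where the extra levels come from. Everything in it is an assumption made for illustration: the two-parameter logistic (2PL) model, the item parameters (which are invented, not estimates from the Pew data), and the use of posterior-mean (EAP) scoring on a simple grid. The point is only that with six right/wrong items, adding up correct answers can yield 7 scores, while a model that weights items by discrimination can in principle give a different score to each of the 2^6 = 64 response patterns.

```python
import math
from itertools import product

# Hypothetical 2PL item parameters for six items (NOT estimates from the Pew data).
DISCRIMINATION = [0.85, 0.9, 1.0, 1.2, 1.6, 2.4]
DIFFICULTY = [-1.5, -1.2, -1.0, -1.0, -0.8, -0.9]

# Grid over the latent trait, weighted by an (unnormalized) standard normal prior.
GRID = [-4 + 0.05 * k for k in range(161)]
PRIOR = [math.exp(-0.5 * t * t) for t in GRID]


def p_correct(theta, a, b):
    """2PL probability of a correct answer."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def eap_score(pattern):
    """Posterior mean of the latent trait given a 0/1 response pattern."""
    weights = []
    for theta, prior in zip(GRID, PRIOR):
        likelihood = 1.0
        for x, a, b in zip(pattern, DISCRIMINATION, DIFFICULTY):
            p = p_correct(theta, a, b)
            likelihood *= p if x else 1.0 - p
        weights.append(likelihood * prior)
    return sum(t * w for t, w in zip(GRID, weights)) / sum(weights)


patterns = list(product((0, 1), repeat=6))
sum_levels = {sum(p) for p in patterns}
irt_levels = {round(eap_score(p), 6) for p in patterns}
print(len(sum_levels), "sum-score levels;", len(irt_levels), "IRT (EAP) levels")
```

Two respondents who each got four items right land on the same sum score, but they get different IRT scores if one of them answered the more discriminating items correctly.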

To put it plainly, the Pew battery is just too darn easy. 

The practical consequence-- a serious one-- is that the test won't do a very good job in helping us to determine whether differences in science comprehension affect perceptions of risk or other science-related attitudes among individuals whose scores are above the population average.

Actually, the best way to see that is to look at the Item Response Theory test information and reliability characteristics for the Pew battery:

If you need a refresher on the significance of these measures, then check out this post & this one.

But what they are telling us is that the power of the Pew battery to discern differences in science comprehension is concentrated at about 1 SD below the estimated population mean. Even there, the measurement precision is modest -- a reliability coefficient of under 0.6 (0.7 is better). 

More importantly, it quickly tails off to zero by +0.5 SD. 

In other words, above the 60th percentile in the population the test can furnish us with no guidance on differences in science literacy levels.  And what it can tell us even at the population mean ("0" on the y-axis) is pretty noisy (reliability = 0.40).
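To make the logic concrete, here is a minimal sketch of how a test information curve behaves when every item is easy. The item parameters are hypothetical (they are not the Pew estimates, so the numbers will not match the figures discussed here), the model is assumed to be 2PL, and the conversion from information to reliability, information / (information + 1), is one common convention when the latent trait is scaled to unit variance.

```python
import math

# Hypothetical (discrimination, difficulty) pairs for six easy items.
ITEMS = [(0.85, -1.5), (0.9, -1.2), (1.0, -1.0), (1.2, -1.0), (1.6, -0.8), (2.4, -0.9)]


def scale_information(theta):
    """2PL test information: sum over items of a^2 * P * (1 - P)."""
    info = 0.0
    for a, b in ITEMS:
        p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
        info += a * a * p * (1.0 - p)
    return info


def reliability(theta):
    """Conditional reliability, assuming a unit-variance latent trait."""
    info = scale_information(theta)
    return info / (info + 1.0)


grid = [round(-3 + 0.1 * k, 1) for k in range(61)]
peak = max(grid, key=scale_information)
print("information peaks near theta =", peak)
for theta in (-1.0, 0.0, 0.5, 1.0, 2.0):
    print(theta, round(scale_information(theta), 2), round(reliability(theta), 2))
```

Because each item contributes the most information near its own difficulty, a battery made up of easy items piles its precision up below the mean and has little left to distinguish people above it.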

As I've explained in previous posts, the NSF Indicators have exactly the same problem. The Pew battery is an admirable effort to try to improve on the familiar NSF science literacy test, but with these items, at least, it hasn't made a lot of progress.

As the last two posts have shown, you can in fact still learn a fair amount from a science literacy scale whose measurement precision is this skewed toward the lower end of the distribution of this sort of proficiency.

But if we really want to learn more, we desperately need a better public science comprehension instrument.

That conviction has informed the research that generated the "Ordinary Science Intelligence" assessment.  An 18-item test, OSI combines a modest number of "basic fact" items (ones derived from the Indicator and from a previous Pew battery) with critical reasoning measures that examine cognitive reflection and numeracy, dispositions essential to being able to recognize and give proper effect to valid science.

OSI was deliberately constructed to possess a high degree of measurement precision across the entire range of the underlying latent (or unobserved) disposition that it's measuring. 

That's a necessary quality, I'd argue, for an instrument suited to advance scholarly investigation of how variance in public science comprehension affects perceptions of risk and related facts relevant to individual and collective decisionmaking.

Is OSI (actually "OSI_2.0") perfect?

Hell no

Indeed, while better for now than the NSF Indicators battery (on which it in fact builds) for the study of risk perception and science communication, OSI_2.0 is primarily intended to stimulate other scholars to try to do even better, either by building on and refining OSI or by coming up with instruments that they can show (by conducting appropriate assessments of the instruments' psychometric characteristics and their external validity) are even better.

I hope that there are a bunch of smart researchers out there who have made contributing to the creation of a better public science comprehension instrument one of their New Year's resolutions.

If the researchers at Pew Research Center are among them, then I bet we'll all be a lot smarter by 2017!


Kahan, D.M. Climate-Science Communication and the Measurement Problem. Advances in Political Psychology 36, 1-43 (2015).

Kahan, D.M. “Ordinary Science Intelligence”: A Science Comprehension Measure for Use in the Study of Risk Perception and Science Communication. Cultural Cognition Project Working Paper No. 112 (2014).

Kahan, D. Why we are poles apart on climate change. Nature 488, 255 (2012).

Loewenstein, G.F., Weber, E.U., Hsee, C.K. & Welch, N. Risk as feelings. Psychological Bulletin 127, 267-287 (2001).

Pew Research Center (2015). Public and Scientists' Views on Science and Society.

Slovic, P., Peters, E., Finucane, M.L. & MacGregor, D.G. Affect, Risk, and Decision Making. Health Psychology 24, S35-S40 (2005).


Replicate "Climate-Science Communication Measurement Problem"? No sweat (despite hottest yr on record), thanks to Pew Research Center!

One of the great things about Pew Research Center is that it posts all (or nearly all!) the data from its public opinion studies.  That makes it possible for curious & reflective people to do their own analyses and augment the insight contained in Pew's own research reports. 

I've been playing around with the "public" portion of the "public vs. scientists" study, which was issued last January (Pew 2015). Actually Pew hasn't released the "scientist" (or more accurately, AAAS membership) portion of the data. I hope they do!

But one thing I thought it would be interesting to do for now would be to see if I could replicate the essential finding from "The Climate Science Communication Measurement Problem" (Kahan 2015).

In that paper, I presented data suggesting, first, that neither "belief" in evolution nor "belief" in human-caused climate change was a measure of general science literacy.  Rather both were better understood as measures of forms of "cultural identity" indicated, respectively, by items relating to religiosity and items relating to left-right political outlooks.

Second, and more importantly, I presented data suggesting that there is no relationship between "belief" in human-caused climate change & climate science comprehension in particular. On the contrary, the higher individuals scored on a valid climate science comprehension measure (one specifically designed to avoid the confound between identity and knowledge that compromises most "climate science literacy" measures), the more polarized the respondents were on "belief" in AGW--which, again, is best understood as simply an indicator of "who one is," culturally speaking.

Well, it turns out one can see the same patterns, very clearly, in the Pew data.

Patterned on the NSF Indicators "basic facts" science literacy test (indeed, "lasers" is an NSF item), the Pew battery consists of six items:

As I've explained before, I'm not a huge fan of the "basic facts" approach to measuring public science comprehension. In my view, items like these aren't well-suited for measuring what a public science comprehension assessment ought to be measuring: a basic capacity to recognize and give proper effect to valid scientific evidence relevant to the things that ordinary people do in their ordinary lives as consumers, workforce members, and citizens.

One would expect a person with that capacity to have become familiar with certain basic scientific insights (earth goes round sun, etc.) certainly.  But certifying that she has stocked her "basic fact" inventory with any particular set of such propositions doesn't give us much reason to believe that she possesses the reasoning proficiencies & dispositions needed to augment her store of knowledge and to appropriately use what she learns in her everyday life.

For that, I believe, a public science comprehension battery needs at least a modest complement of scientific-thinking measures, ones that attest to a respondent's ability to tell the difference between valid and invalid forms of evidence and to draw sound inferences from the former.  The "Ordinary Science Intelligence" battery, used in the Measurement Problem paper, includes "cognitive reflection" and "numeracy" modules for this purpose.

Indeed, Pew has presented a research report on a more extensive science comprehension battery that might be better in this regard, but it hasn't released the underlying data for that one.

[Figure: psychometric properties of Pew science literacy battery]

But anyway, the new items that Pew included in its battery are more current & subtle than the familiar Indicator items, & the six-member Pew group forms a reasonably reliable (α = 0.67), one-dimensional scale--suggesting they are indeed measuring some sort of science-related aptitude.
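For readers who haven't computed a reliability coefficient by hand, here is a minimal sketch of Cronbach's α using the standard formula, α = k/(k-1) × (1 - sum of item variances / variance of total scores). The responses below are made up for illustration; they are not the Pew data, and the function name is my own.

```python
from statistics import variance


def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (all the same length)."""
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Made-up right/wrong (1/0) answers to three items from five respondents.
toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0], [1, 1, 1]]
print(round(cronbach_alpha(toy), 3))
```

α rises as the items covary more strongly relative to their individual variances, which is why it is read as a sign that the items are tapping something in common.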

But the fun stuff starts when one examines how the resulting Pew science literacy scale relates to items on evolution, climate change, political outlooks, and religiosity.

For evolution, Pew used its two-part question, which first asks whether the respondent believes (1) "Humans and other living things have evolved over time" or (2) "Humans and other living things have existed in their present form since the beginning of time." 

Subjects who pick (1) then are asked whether (3) "Humans and other living things have evolved due to natural processes such as natural selection" or (4) "A supreme being guided the evolution of living things for the purpose of creating humans and other life in the form it exists today."

Basically, subjects who select (2) are "young earth creationists." Subjects who select (4) are generally regarded as believing in "theistic evolution."  Intelligent design isn't the only variant of "theistic evolution," but it is certainly one of the accounts that fit this description.

Subjects who select (3)-- "humans and other living things have evolved due to natural processes such as natural selection" -- are the only ones furnishing the response that reflects science's account of the natural history of humans. 

So I created a variable, "evolution_c," that reflects this answer, which was in fact selected by only 35% of the subjects in Pew's U.S. general public sample.

On climate change, Pew assessed (using two items that tested for item order/structure effects that turned out not to matter) whether subjects believed (1) "the earth is getting warmer mostly because of natural patterns in the earth’s environment," (2) "the earth is getting warmer mostly because of human activity such as burning fossil fuels," or (3) "there is no solid evidence that the earth is getting warmer."

About 50% of the respondents selected (2).  I created a variable, gw_c, to reflect whether respondents selected that response or one of the other two.

For political orientations, I combined the subjects' responses to a 5-point liberal-conservative ideology item and their responses to a 5-point partisan self-identification item (1 "Democrat"; 2 "Independent leans Democrat"; 3 "Independent"; 4 "Independent leans Republican"; and 5 "Republican").  The composite scale had modest reliability (α = 0.61).

For religiosity, I combined two items.  One was a standard Pew item on church attendance. The other was a dummy variable, "nonrelig," scored "1" for subjects who said they were either "atheists," "agnostics" or "nothing in particular" in response to a religious-denomination item (α = 0.66).

But the very first thing I did was toss all of these items -- the 6 "science literacy" ones, belief in evolution (evolution_c), belief in human-caused climate change (gw_c), ideology, partisan self-identification, church attendance, and nonreligiosity--into a factor analysis (one based on a polychoric covariance matrix, which is appropriate for mixed dichotomous and multi-response Likert items).


Not surprisingly, the covariance structure was best accounted for by three latent factors: one for science literacy, one for political orientations, and one for religiosity.

But the most important result was that neither belief in evolution nor belief in human-caused climate change loaded on the "science literacy" factor.  Instead they loaded on the religiosity and right-left political orientation factors, respectively.

This analysis, which replicated results from a paper dedicated solely to examining the properties of the Ordinary Science Intelligence test, supports the inference that belief in evolution and belief in climate change are not indicators of "science comprehension" but rather indicators of cultural identity, as manifested respectively by religiosity and political outlooks.

[Figure: cultural bias in "belief in evolution" as science literacy assessment item]

To test this inference further, I used "differential item functioning" or "DIF" analysis (Osterlind & Everson, 2009).

Based on item response theory, DIF examines whether a test item is "culturally biased"--not in an animus sense but a measurement one: the question is whether the responses to the item measure the "same" latent proficiency (here, science literacy) in diverse groups.  If it doesn't-- if there is a difference in the probability that members of the two groups who have equivalent science literacy scores will answer it "correctly"--then administering that question to members of both will result in a biased measurement of their respective levels of that proficiency.
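A toy simulation can make the idea tangible. This is a sketch under loudly stated assumptions: the two groups ("A" and "B"), the item parameters, and the sample are all invented, and respondents are matched on a simple count of correct answers to six well-behaved items rather than on the IRT-based scores and regression models used in the analyses reported here. One simulated item depends only on the latent proficiency; the other is built so that proficiency pushes the two groups' answers in opposite directions.

```python
import math
import random

random.seed(1)


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate_respondent():
    group = random.choice(("A", "B"))   # hypothetical identity groups
    theta = random.gauss(0.0, 1.0)      # latent science literacy
    # Six anchor items that behave the same way in both groups.
    rest_score = sum(random.random() < logistic(1.2 * (theta + 0.5)) for _ in range(6))
    # A "fair" item: the chance of a correct answer depends only on theta.
    fair = random.random() < logistic(1.2 * theta)
    # A "biased" item: theta raises the chance in group A but lowers it in group B.
    slope = 1.0 if group == "A" else -0.3
    biased = random.random() < logistic(-0.5 + slope * (theta + 1.5))
    return group, rest_score, fair, biased


def gap_by_score(people, item_index):
    """Difference (A minus B) in proportion 'correct' at each matching score."""
    gaps = {}
    for score in range(7):
        shares = {}
        for g in ("A", "B"):
            answers = [p[item_index] for p in people if p[0] == g and p[1] == score]
            shares[g] = sum(answers) / len(answers)
        gaps[score] = shares["A"] - shares["B"]
    return gaps


people = [simulate_respondent() for _ in range(20000)]
fair_gaps = gap_by_score(people, 2)
biased_gaps = gap_by_score(people, 3)
for score in range(7):
    print(score, round(fair_gaps[score], 2), round(biased_gaps[score], 2))
```

For the fair item, group members with the same matching score answer correctly at about the same rate. For the biased item, the gap between the groups widens as the matching score rises, which is the signature of an item that is not measuring the same thing in both groups.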

In Measurement Problem, I used DIF analysis to show that belief in evolution is "biased" against individuals who are high in religiosity. 

Using the Pew data (regression models here), one can see the same bias:

Relatively nonreligious subjects, but not relatively religious ones, become more likely to indicate acceptance of science's account of the natural history of humans as their science literacy scores increase. This isn't so for other items in the Pew science literacy battery (which here is scored using an item response theory model; the mean is 0, and units are standard deviations). 

The obvious conclusion is that the evolution item isn't measuring the same thing in subjects who are relatively religious and nonreligious as are the other items in the Pew science literacy battery. 

In Measurement Problem, I also used DIF to show that belief in climate change is a biased (and hence invalid) measure of climate science literacy.  That analysis, though, assessed responses to a "belief in climate change" item (one identical to Pew's) in relation to scores on a general climate-science literacy assessment, the "Ordinary Climate Science Intelligence" (OCSI) assessment.  Pew's scientist-AAAS study didn't have a climate-science literacy battery.

[Figure: demonstration of cultural bias in standardized assessment item]

Its general science literacy battery, however, did have one climate-science item, a question of theirs that in fact I had included in OCSI: "What gas do most scientists believe causes temperatures in the atmosphere to rise? Is it Carbon dioxide, Hydrogen, Helium, or Radon?" (CO2).

Below are the DIF item profiles for CO2 and gw_c (regression models here). Regardless of their political outlooks, subjects become more likely to answer CO2 correctly as their science literacy score increases--that makes perfect sense!

But as their science literacy score increases, individuals of diverse political outlooks don't converge on "belief in human-caused climate change"; they become more polarized.  That question is measuring who the subjects are, not what they know about climate science.

So there you go!

I probably will tinker a bit more with these data and will tell you if I find anything else of note.

But in the meantime, I recommend you do the same! The data are out there & free, thanks to Pew.  So reciprocate Pew's contribution to knowledge by analyzing them & reporting what you find out!


Kahan, D.M. Climate-Science Communication and the Measurement Problem. Advances in Political Psychology 36, 1-43 (2015).

Kahan, D.M. “Ordinary Science Intelligence”: A Science Comprehension Measure for Use in the Study of Risk Perception and Science Communication. Cultural Cognition Project Working Paper No. 112 (2014).

Osterlind, S. J., & Everson, H. T. (2009). Differential item functioning. Thousand Oaks, CA: Sage.

Pew Research Center (2015). Public and Scientists' Views on Science and Society.


Two theories of "values," "identity" & "politically motivated reasoning"

This is a bit of correspondence with a thoughtful scholar & friend who was commenting on The Politically Motivated Reasoning Paradigm.

He stated,

Biggest question [for me] is what is the relationship between values and identities. You make clear that people can be acting to protect any type of prior but those two seem distinct in some ways and may benefit from more discussion. . . . .

[I am interested in the] larger question about whether you would call cultural cognition orientations an identity. The question arose because [I have a colleague] who is writing . . . on cases of identity-value conflict such as when a minority holds distinct values from the modal member of his/her identity group.

My response:

I’m eager to offer a response or acknowledge I don’t have a very good one to the sort of “value-identity” conflict you are envisioning. 

But I think we need to "iterate" a bit more in order to converge on a common conception of the issue here.

So I'm not going to try to address the "identity-value" conflict right off. Instead, I am going to discuss different understandings of how "values" & "identity" relate to one another in a research program that looks at the sort of "fact polarization" of interest to cultural cognition & other conceptions of PMR.

I'll start w/ two theories of why one might measure "values" to operationalize the source of "motivation" in PMR: dissonance avoidance & status protection.  

As a preliminary point, neither theory understands the sorts of "values" being measured as what  motivates information processing.  For both, the theoretically posited "motivator" is some unobserved (latent) disposition that causes the observable expression of "values," which are then treated simply as "indicators" or imperfect measures of that latent disposition.  

For that reason, both theories are agnostic on whether the relevant values are "truly, really" "political," "cultural" or something else.  All "value" frameworks are just alternative measures of the same unobserved latent dispositions.  The only issue is what measurement strategy works best for explanation, prediction, & prescription -- a criterion that will itself be specific to the goal of the research (e.g., I myself use much more fine-grained indicators, corresponding to much narrower specifications of the underlying dispositions, when I'm doing "field based" science communication in a region like S.E. Florida than I do when I'm participating in a scholarly conversation about mass opinion formation in "American society": the constructs & measurement instruments in former context wouldn't have same traction in latter context but the ones w/ most traction in latter furnish less in the former, where the consumers of the information are trying to do something that is advanced by a framework fitted more to their conditions).

Okay, the 2 theories:

1. Dissonance avoidance (DA). We might imagine that as "political beings" individuals are like diners at a restaurant that serves a "prix fixe" menu of "ideologies" or "worldviews" or whathaveyou. After making their selections, it would be psychologically painful for these individuals to have to acknowledge that the world is configured in a way that forecloses achieving states of affairs associated with their preferred "worldview" or "ideology" or whatever: e.g., that unconstrained private orderings of the sort prized by individualists will burden the natural environment with toxic byproducts that make such a way of life unsustainable. They are therefore motivated to construe information in a manner that "fits" the evidence on risk and like facts to positions ("beliefs") supportive of policies congenial to their worldviews & unsupportive of policies uncongenial to the same.

2. Status protection (SP).  DA is a relatively individualistic conception of PMR; SP is more "social."  On this account, individual well-being is understood to be decisively linked to membership in important "affinity groups," whose members are bound together by their shared adherence to ways of life. Cultivating affective styles that evince commitment to the positions conventionally associated with these groups will be essential to signaling membership in and loyalty to one or another of them.  "Policy" positions will routinely bear such associations. But sometimes risks and like policy-relevant facts will come to bear social meanings (necessarily antagonistic ones in relation to the opposing groups) that express group membership &  loyalty too.  In those cases, PMR will be a mode of information processing rationally suited to forming the affective styles that reliably & convincingly express an individual's "group identity."

Avoiding the psychic disappointment of assenting to facts uncongenial to an individual's personal "policy preferences" is not the truth-extrinsic goal that "motivates" cognition on this view.  Status protection--i.e., the maintenance of the sort of standing in one's group essential to enjoying access to the benefits, material and emotional, that membership imparts--is.

Okay, those are the two theories.

But let me be clear: neither of these theories is "true"! 

Not because some other one is -- but because no theories are.  All theories are simplified, imperfect "models"-- or pictures or metaphors, even! -- that warrant our acceptance to the extent that they enable us to do what we want to do w/ an empirical research program: enlarge our capacity to explain, predict & prescribe.

On this basis, I view SP as "true" & DA "false."

For now at least.

But in any case, my question is whether your & your colleague's question --whether "cultural cognition orientations" are "an identity" -- can be connected to this particular account of how "values," "identities," & PMR are connected?  If so, then, I might have something more helpful to say!  If not, then maybe what you have to say about why not will help me engage this issue more concretely.



The "asymmetry thesis": another PMRP issue that won't go away

I feel like I've done 10^8 posts on this .... That's wrong: I counted, and in fact I've done 10.3^14.

But that's because it's a difficult question. Or at least it is if one treats it as one of "measurement" & "weight of the evidence."  I remain convinced that it is not of great practical significance--that is, even if "motivated reasoning" and like dynamics are "asymmetric" across the ideological spectrum (or cultural spectra) that define the groups polarized on policy-consequential facts, the evidence is overwhelming and undeniable that members of all such groups are subject to this dynamic, & to an extent that makes addressing its general impact -- rather than singling out one or another group as "anti-science" etc. -- the proper normative aim for those dedicated to advancing enlightened self-govt.

But issues of "measurement" & "weight of the evidence" etc. are still, in my view, perfectly legitimate matters of scholarly inquiry. Indeed, pursuit of them in this case will, I'm sure, enlarge knowledge, theoretical and practical.

"Asymmetry" is an open question--& not just in the sense that nothing in science is ever resolved but in the sense that those on both "sides" (i.e., those who believe politically motivated reasoning is symmetric and those who believe it is asymmetric) ought to wonder enough about the correctness of their own position to wish that they had more evidence.

Here's an excerpt from my The Politically Motivated Reasoning Paradigm survey/synthesis essay addressing the state of the "debate":

4. Asymmetry thesis

The “factual polarization” associated with politically motivated reasoning is pervasive in U.S. political life. But whether politically motivated reasoning is uniform across opposing cultural groups is a matter of considerable debate (Mooney 2012).

In the spirit of the classic “authoritarian personality” thesis (Adorno 1950), one group of scholars has forcefully advanced the claim that it is not. Known as the “asymmetry thesis,” their position links biased processing of political information with characteristics associated with right-wing political orientations. Their studies emphasize correlations in observational studies between conventional ideological measures and scores on self-report reasoning-style scales such as “need for closure” and “need for cognition” and on personality-trait scales such as “openness to experience” (Jost, Glaser, Kruglanski & Sulloway 2003; Jost, Hennes & Lavine 2013).

But the research that the “neo-authoritarian personality” school features supplies weak evidence for the asymmetry thesis. First, the reasoning style measures that they feature are of questionable validity. It is a staple of cognitive psychology that defects in information processing are not open to introspective observation or control (Pronin 2007) –a conclusion that applies to individuals high as well as more modest in cognitive proficiency (West, Meserve & Stanovich 2012). There is thus little reason to believe a person’s own perception of the quality of his reasoning is a valid measure of the same.

Indeed, tests that seek to validate such self-report reasoning style scales consistently find them to be inferior to performance-based measures such as the Cognitive Reflection Test and Numeracy in predicting the disposition to resort to conscious, effortful information processing (Toplak, West & Stanovich 2011; Liberali, Reyna, Furlan & Pardo 2011). Those measures, when applied to valid general population samples, show no meaningful correlation with party affiliation or liberal-conservative ideology (Kahan 2013; Baron 2015).

More importantly, there is no evidence that individual differences in reasoning style predict vulnerability to politically motivated reasoning. On the contrary, as will be discussed in the next part, evidence suggests that proficiency in dispositions such as cognitive reflection, numeracy, and science comprehension magnifies politically motivated reasoning (Fig. 6).

Ultimately, the only way to determine if politically motivated reasoning is asymmetric with respect to ideology or other diverse systems of identity-defining commitments is through valid experiments. There is a collection of intriguing experiments that variously purport to show that one or another form of judgment—e.g., moral evaluation, willingness to espouse counter-attitudinal positions, the political valence of positions formed while intoxicated, individual differences in activation of “brain regions” etc.—is ideologically asymmetric or symmetric (Thórisdóttir & Jost 2011; Nam, Jost & Van Bavel 2013; Eidelman et al. 2012; Crawford & Brandt 2013; Schreiber, Fonzo et al. 2013). These studies vary dramatically in validity and insight. But even the very best and genuinely informative ones (e.g., Conway, Gornick, et al. 2015; Liu & Ditto 2013; Crawford 2012) are in fact examining a form of information processing distinct from PMRP and with methods other than the PMRP design or its equivalent.

One study that did use the PMRP design found no support for the “asymmetry thesis” (Kahan 2013). In it, individuals of left- and right-wing political outlooks displayed perfectly symmetric forms of politically motivated reasoning in evaluating evidence that people who reject their group’s position on climate change have been found to engage in open-minded evaluation of evidence (Figure 5).

But that’s a single study, one that like any other is open to reasonable alternative explanations that themselves can inform future studies. In sum, it is certainly reasonable to view the “asymmetry thesis” issue as unresolved. The only important point is that progress in resolving it is unlikely to occur unless it is studied with designs that reflect the PMRP design or ones equivalently suited to support inferences consistent with the PMRP model.


Adorno, T.W. The Authoritarian personality (Harper, New York, 1950).

Baron, J. Supplement to Deppe et al. (2015). Judgment and Decision Making 10, 2 (2015).

Conway, L.G., Gornick, L.J., Houck, S.C., Anderson, C., Stockert, J., Sessoms, D. & McCue, K. Are Conservatives Really More Simple‐Minded than Liberals? The Domain Specificity of Complex Thinking. Political Psychology (2015), advance on-line, DOI: 10.1111/pops.12304.

Crawford, J.T. The ideologically objectionable premise model: Predicting biased political judgments on the left and right. Journal of Experimental Social Psychology 48, 138-151 (2012).

Eidelman, S., Crandall, C.S., Goodman, J.A. & Blanchar, J.C. Low-Effort Thought Promotes Political Conservatism. Pers. Soc. Psychol. B. (2012).

Jost, J.T., Glaser, J., Kruglanski, A.W. & Sulloway, F.J. Political Conservatism as Motivated Social Cognition. Psychological Bulletin 129, 339-375 (2003).

Jost, J.T., Hennes, E.P. & Lavine, H. “Hot” political cognition: Its self-, group-, and system-serving purposes. in Oxford handbook of social cognition (ed. D.E. Carlson) 851-875 (Oxford University Press, New York, 2013).

Kahan, D. M. Ideology, Motivated Reasoning, and Cognitive Reflection. Judgment and Decision Making 8, 407-424 (2013).

Liberali, J.M., Reyna, V.F., Furlan, S., Stein, L.M. & Pardo, S.T. Individual Differences in Numeracy and Cognitive Reflection, with Implications for Biases and Fallacies in Probability Judgment. Journal of Behavioral Decision Making 25, 361-381 (2012).

Nam, H.H., Jost, J.T. & Van Bavel, J.J. “Not for All the Tea in China!” Political Ideology and the Avoidance of Dissonance. PLoS ONE 8(4), e59837. doi:10.1371/journal.pone.0059837 (2013).

Pronin, E. Perception and misperception of bias in human judgment. Trends in cognitive sciences 11, 37-43 (2007).

Thórisdóttir, H. & Jost, J.T. Motivated Closed-Mindedness Mediates the Effect of Threat on Political Conservatism. Political Psychology 32, 785-811 (2011).

Toplak, M., West, R. & Stanovich, K. The Cognitive Reflection Test as a predictor of performance on heuristics-and-biases tasks. Memory & Cognition 39, 1275-1289 (2011).

West, R.F., Meserve, R.J. & Stanovich, K.E. Cognitive sophistication does not attenuate the bias blind spot. Journal of Personality and Social Psychology 103, 506 (2012).



Weekend update: "Color" preprint of " 'Ideology' vs. 'Situation Sense' "!

I've posted a revised "preprint" version of Kahan, D.M., Hoffman, D.A., Evans, D., Devins, N., Lucci, E.A. & Cheng, K. 'Ideology' or 'Situation Sense'? An Experimental Investigation of Motivated Reasoning and Professional Judgment, U. Pa. L. Rev. 164 (in press).

It is prettttttttttttttttttty darn close to final.

Main difference is that it has color rather than B&W graphics.  I have a feeling, w/ all the advances in information technology associated with "our internet," & w/ humans now having walked on the moon & all, that I might still live to see the day when all scholarly journals use color graphics (at least for their on-line versions; I think I've already lived long enough to see the day when no one reads the "hardcopy"/"print" versions of journals!).... Call me a dreamer!

I'm sure, too, you all remember but in case not:  This is the study that examines a sample of judges, lawyers, law students & ordinary people to test competing theories about how identity-protective cognition relates to critical reasoning & professional judgment. 

We find that judges & lawyers who are as culturally polarized on societal risks--like climate change & marijuana legalization--as are members of the general population converge in readings of manifestly ambiguous statutes despite experimental manipulations that were intended to and did polarize culturally diverse members of the public (and to a modest extent culturally diverse law students).

We view this result as most consistent with the theory that professional judgment furnishes experts with a degree of immunity from "identity-protective reasoning" when they perform "in-domain" but not "out-of-domain" decisionmaking tasks.

But as I emphasized in another recent post (one that presents an excerpt from another "in press" paper, The Politically Motivated Reasoning Paradigm), the "weight" of the evidence the study furnishes in this regard-- particularly as it relates to other types of experts like scientists who study contested societal risks--is indeed modest.  More study is called for!

I'm sure I'll live long enough to see this & every other interesting question about cognition definitively resolved too.  At which point, life will be so damn boring that people will stop fretting about its finite duration.

Anyway, happy clicking on graphics!


1. Summary data

2. Multivariate regression model estimates

3. "Weight of the evidence" likelihood ratios

4. Data-collection process

David Hoffman carefully extracts cultural worldview from a state supreme court judge



Solving 2 nasty confounds: The "Politically Motivated Reasoning Paradigm [PMRP] Design"

Okay, so “yesterday,” I discussed the significance of two “confounds” in studies of “politically motivated reasoning.”

“Politically motivated reasoning” is the tendency of individuals to conform their assessment of the significance of evidence on contested societal risks and like facts to positions that are congenial to their political or cultural outlooks.

The “confounds” were heterogeneous priors and pretreatment effects. “Today” I want to address how to avoid the nasty effects of these confounds.

The inference-defeating consequences of heterogeneous priors and pretreatment effects are associated with a particular kind of study design. 

In it, the researcher exposes individuals of opposing political or cultural identities to counter-attitudinal information on a hotly contested topic such as gun control or climate change. Typically, the information is in the form of empirical studies or advocacy materials, real or fictional. If the information exposure fails to narrow, or even widens, the gap in the positions of subjects of opposing identities, this outcome is treated as evidence of politically motivated reasoning.

But as I explained in the last post, this inference is unsound.  

Imagine, e.g., that members of one politically identifiable group might be more uniformly committed to “their side’s” position than those of another, some of whose members might be weakly supportive of the former’s position. If so, we would expect members of the latter group to be overrepresented among the subjects who “change their minds” when members of both groups are exposed to evidence more supportive of the other group’s position.  This is the “heterogeneous priors” confound.

You can't judge an experiment by its results; only by its design . . . .

Alternatively, a greater proportion of one group might already have been exposed to evidence equivalent to that featured in the study design.  In that case, fewer members of that group would be expected to change their mind—not because they were biased but because they would have already adjusted their beliefs to take account of it. This is the “pretreatment effect” confound.

Put these two confounds together, and it’s clear that, under the design I described, no outcome is genuinely inconsistent with subjects having assessed the information in the “politically unbiased” manner associated with Bayesian information processing (Druckman, Fein & Leeper 2012; Druckman 2012; Bullock 2009; Gerber & Green 1999).

The solution, then, is to change the design.

If you can't remember exactly what the difference is between politically motivated reasoning, confirmation bias, and Bayesian information processing, click here. If you can, click here anyway!

That’s one of the central points of The Politically Motivated Reasoning Paradigm (in press).  In that paper, I describe studies (e.g., Uhlmann, Pizarro, Tannenbaum, & Ditto 2009; Bolsen, Druckman & Cook 2014; Scurich & Shniderman 2014) that use a common strategy to avoid the confounding effects of heterogeneous priors and pretreatment effects.  I refer to it as the “PMRP” (for “Politically Motivated Reasoning Paradigm”) “design.”

Under the PMRP design, the researcher manipulates the subjects’ perception of the consequences of crediting one and the same piece of evidence.  What’s compared is not individual subjects’ reported beliefs before and after being exposed to information but rather the weight or significance subjects of opposing predispositions attach to the evidence conditional on the experimental manipulation (cf. Koehler 1993). If subjects credit the evidence when they perceive it is consistent with their political predispositions but dismiss it when it’s not, then we can be confident that it is their politically biased weighing of evidence and not any discrepancy in priors or pre-study exposure to evidence that is driving subjects of opposing cultural or political identities apart.
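To make the comparison concrete, here is a minimal Python sketch of the contrast the PMRP design looks for. Everything in it is an assumption for illustration: the ratings are made up, the 1-6 "how convincing is this evidence" scale is hypothetical, and the names cell_mean and pmr_contrast are mine, not anything from the studies cited. A real analysis would fit a regression model and report the uncertainty of the estimate.

```python
from statistics import mean

# Made-up ratings (hypothetical 1-6 scale) of how convincing subjects found
# ONE fixed piece of evidence. "condition" records which side the evidence
# was portrayed as supporting under the experimental manipulation.
ratings = [
    ("left", "supports_left", 5), ("left", "supports_left", 6),
    ("left", "supports_right", 2), ("left", "supports_right", 3),
    ("right", "supports_left", 2), ("right", "supports_left", 1),
    ("right", "supports_right", 5), ("right", "supports_right", 5),
]


def cell_mean(data, identity, condition):
    """Mean rating for one identity group in one experimental condition."""
    return mean(r for i, c, r in data if i == identity and c == condition)


def pmr_contrast(data):
    """Identity-by-condition interaction (a difference in differences).

    Zero means both groups gave the evidence the same weight regardless of
    whose side it was said to support. A positive value means each group
    credited the evidence more when it was congenial to that group.
    """
    left_shift = (cell_mean(data, "left", "supports_left")
                  - cell_mean(data, "left", "supports_right"))
    right_shift = (cell_mean(data, "right", "supports_left")
                   - cell_mean(data, "right", "supports_right"))
    return left_shift - right_shift


print(pmr_contrast(ratings))
```

Because the evidence is held constant and only its perceived political significance varies, neither differences in priors nor pre-study exposure can produce a nonzero contrast here; only a difference in the weight assigned to the evidence can.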

One CCP study used the PMRP design to examine how study subjects of opposing cultural identities would assess the behavior of political protestors (Kahan, Hoffman, Braman, Evans & Rachlinski 2012). Instructed to adopt the perspective of jurors in a civil case, the subjects examined a digital recording of demonstrators alleged to have assaulted passersby. The cause and identity of the demonstrators were manipulated: in one condition, they were described as “anti-abortion protestors” assembled outside the entrance to an abortion clinic; in the other, they were described as “gay-rights advocates” protesting the military’s “Don’t ask, don’t tell” policy outside a military-recruitment center.

Subjects of opposing “cultural worldviews” who were assigned to the same experimental condition—and who thus believed they were watching the same type of protest—reported forming opposing perceptions of whether the protestors “blocked” and “screamed in the face” of pedestrians trying to access the facility. At the same time, subjects who were assigned to different conditions—and who thus believed they were watching different types of protests—formed perceptions comparably different from those of subjects who shared their cultural worldviews.

go ahead, click it -- it won't bite!
In line with these opposing perceptions, the results in the two conditions produced mirror-image states of polarization on whether the behavior of the protestors met the factual preconditions for liability. 

But that outcome—an increased state of political polarization, in effect, in “beliefs”—is not, in my view, an essential one under the PMRP design. Indeed, if the issue featured in a study is familiar (like whether human beings are causing climate change, or whether permitting individuals to carry concealed firearms in public increases or decreases crime), we shouldn’t expect a one-shot exposure to evidence in the lab to change subjects' “positions.”

The only thing that matters is whether subjects of opposing outlooks opportunistically shifted the weight  (or in Bayesian terms, the likelihood ratio) they assigned to one and the same piece of evidence based on its congruence with their political predispositions.  If that’s how individuals of opposing cultural identities behave outside the lab, then contrary to what would occur under a Bayesian model of information processing they will not converge on politically contested facts no matter how much valid evidence they are furnished with.

Or won’t unless & until something is done in the world that changes the stake individuals with outlooks like those have in conforming their assessment of evidence to the positions then associated with their cultural identities (Kahan 2015).
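The non-convergence point can be illustrated with a toy simulation. This is only a sketch under stated assumptions: each valid study is assumed to carry a true likelihood ratio of 3 in favor of a hypothesis, and a motivated reasoner is assumed to dismiss uncongenial evidence entirely (a likelihood ratio of 1). The function name, the priors, and the number of studies are all hypothetical.

```python
TRUE_LR = 3.0  # assumed weight of each valid study in favor of the hypothesis


def odds_after(prior_odds, n_studies, congenial, motivated):
    """Odds in favor of the hypothesis after n_studies valid studies.

    An unbiased (Bayesian) reasoner multiplies by TRUE_LR every time.
    A motivated reasoner is ASSUMED to assign a likelihood ratio of 1
    (i.e., to dismiss the study) whenever crediting it is uncongenial.
    """
    odds = prior_odds
    for _ in range(n_studies):
        lr = 1.0 if (motivated and not congenial) else TRUE_LR
        odds *= lr
    return odds


for motivated in (False, True):
    supporter = odds_after(2.0, 10, congenial=True, motivated=motivated)
    opponent = odds_after(0.01, 10, congenial=False, motivated=motivated)
    print("motivated:", motivated, "supporter odds:", supporter,
          "opponent odds:", opponent)
```

Under the unbiased model the opponent's odds eventually swing in favor of the hypothesis despite a very skeptical prior; under the motivated model they never move, no matter how many studies are supplied.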

The PMRP design is definitely not the only one that validly measures politically motivated reasoning. Indeed, the consistency of findings of studies that reflect the PMRP design and those based on other designs (e.g., Binning, Brick, Cameron, Cohen, & Sherman 2015; Nyhan, Reifler & Ubel 2015;  Druckman & Bolsen 2011; Bullock 2007; Cohen 2003) furnishes more reason for confidence that the results of both are valid. Nevertheless, the test that the PMRP design is self-consciously constructed to pass—demonstration that individuals are opportunistically adjusting the weight they assign evidence to conform it to their political identities—supplies the proper standard for assessing whether the design of any particular study supports an inference of politically motivated reasoning.


Binning, K.R., Brick, C., Cohen, G.L. & Sherman, D.K. Going Along Versus Getting it Right: The Role of Self-Integrity in Political Conformity. Journal of Experimental Social Psychology 56, 73-88 (2015).

Bolsen, T., Druckman, J.N. & Cook, F.L. The influence of partisan motivated reasoning on public opinion. Polit. Behav. 36, 235-262 (2014).

Bullock, J. The enduring importance of false political beliefs. Unpublished Manuscript, Stanford University  (2007).

Bullock, J.G. Partisan Bias and the Bayesian Ideal in the Study of Public Opinion. The Journal of Politics 71, 1109-1124 (2009).

Cohen, G.L. Party over Policy: The Dominating Impact of Group Influence on Political Beliefs. J. Personality & Soc. Psych. 85, 808-822 (2003).

Druckman, J.N. & Bolsen, T. Framing, Motivated Reasoning, and Opinions About Emergent Technologies. Journal of Communication 61, 659-688 (2011).

Druckman, J.N., Fein, J. & Leeper, T.J. A source of bias in public opinion stability. American Political Science Review 106, 430-454 (2012).

Druckman, J.N. The Politics of Motivation. Critical Review 24, 199-216 (2012).

Gerber, A. & Green, D. Misperceptions about Perceptual Bias. Annual Review of Political Science 2, 189-210 (1999).

Kahan, D. M. The Politically Motivated Reasoning Paradigm. Emerging Trends in Social & Behavioral Sciences (in press).

Kahan, D. M. What is the “science of science communication”? J. Sci. Comm., 14(3), 1-12 (2015).

Kahan, D. M., Hoffman, D. A., Braman, D., Evans, D., & Rachlinski, J. J. They Saw a Protest: Cognitive Illiberalism and the Speech-Conduct Distinction. Stan. L. Rev., 64, 851-906 (2012).

Nyhan, B. & Reifler, J. The roles of information deficits and identity threat in the prevalence of misperceptions (2015).

Scurich, N. & Shniderman, A.B. The Selective Allure of Neuroscientific Explanations. PLoS One 9 (2014).

Uhlmann, E.L., Pizarro, D.A., Tannenbaum, D. & Ditto, P.H. The motivated use of moral principles. Judgment and Decision Making 4 (2009).


Testing for "politically motivated reasoning": 2 nasty confounds

The paper I posted “yesterday”—“The Politically Motivated Reasoning Paradigm”—is mainly about what “politically motivated reasoning” is and how to design studies to test whether it is affecting citizens’ assessment of evidence and by how much. 

The paper  is concerned, in particular, with two confounds—alternative explanations, essentially—that typically constrain the inferences that can be drawn from such studies.  The problems are heterogeneous priors and pretreatment effects (Druckman, Fein & Leeper 2012; Druckman 2012; Bullock 2009; Gerber & Green 1999).

Rather than describe these constraints abstractly, let me try to illustrate the problem they present.

Imagine a researcher is doing an experiment on “politically motivated reasoning”—the asserted tendency of individuals to conform evidence on disputed risks or other policy-relevant facts to the positions that are associated with their political outlooks.

She collects information on the subjects' “beliefs” in, say, “human caused global warming” and the strength of those beliefs (reflected in their reported probability that humans are the principal cause of it). She then presents the subjects with evidence—in the form of a study that suggests human activity is the principal cause of global warming--and measures their beliefs and their confidence in those beliefs again.

This is what she observes: 

Obviously, the subjects have become even more sharply divided. The difference in the proportion of Democrats and Republicans who accept AGW widened, as did the difference in their respective estimates of the probability of AGW.

Does the result support an inference that the subjects selectively credited or discredited the evidence consistent with their political predispositions?

Not really, no.

The claim that individuals are engaged in “politically motivated reasoning” implies they aren’t assessing the information in an unbiased manner, uninfluenced by the relationship between that information and outcomes congenial to their political views.

We can represent this kind of “unbiased” information processing in a barebones Bayesian model, in which individuals revise their existing belief in the probability of a hypothesis, expressed in odds, by a factor equivalent to how much more consistent the new information is with that hypothesis than with a rival one. That factor is known as the “likelihood ratio,” and conceptually speaking reflects the “weight” of the new information with respect to the competing hypotheses.
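In odds form, the barebones model is just: posterior odds = prior odds x likelihood ratio. A minimal Python sketch, with function names of my own choosing (they are illustrative, not from the paper) and example numbers that are purely hypothetical:

```python
def update_odds(prior_odds: float, likelihood_ratio: float) -> float:
    """Bayesian updating in odds form: posterior = prior x likelihood ratio."""
    if prior_odds < 0 or likelihood_ratio < 0:
        raise ValueError("odds and likelihood ratios must be non-negative")
    return prior_odds * likelihood_ratio


def odds_to_probability(odds: float) -> float:
    """Convert odds in favor of a hypothesis (1.5 means 3:2) to a probability."""
    return odds / (1.0 + odds)


# Illustrative values only: prior odds of 1:2 in favor (0.5) and a piece of
# evidence three times more consistent with the hypothesis than its rival.
posterior = update_odds(0.5, 3.0)
print(posterior, odds_to_probability(posterior))
```

A likelihood ratio of 1 leaves the odds unchanged, which is what it means for a piece of information to carry no weight with respect to the competing hypotheses.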

The distinctive feature of “politically motivated reasoning” is the endogeneity of the likelihood ratio and individuals’ political predispositions.  The political congeniality of crediting the evidence determines the weight they assign it.  Because “whose side does this evidence support—yours or mine?” is a criterion unrelated to its validity, individuals who reason this way will fail to converge on the best understanding of the best available evidence.

But in the hypothetical study I described, we really don’t know if that’s happening.  Certainly, we would expect to see a result like the one reported—partisans becoming even more “polarized” as they examine the “same” evidence--if they were engaged in politically motivated reasoning.

But we could in fact see exactly this dynamic under the unbiased, Bayesian information-processing model.

As a simplification, imagine the members of a group of deliberating citizens, Rita, Ron, and Rose—all of whom are Republicans—and Donny, Dave, Daphne—all Democrats.  Each has a “belief” about the contribution of human beings to “human caused climate change,” and each has a sense of how confident they are about their beliefs—a sensibility we can represent in terms of how probable they think it is (expressed in odds) that human beings are the principal cause of climate change.

The table to the left represents this information 

Now imagine that they are shown a study.  The study presents evidence supporting the conclusion that humans are the principal cause of climate change. 

Critically, all of the individuals in this group agree about the weight properly afforded the evidence in the study!

They all agree, let’s posit, that the study has modest weight—a likelihood ratio of 3, let’s say, which means that it is three times more consistent with the hypothesis that human beings are responsible for climate change than with the contrary hypothesis (don’t confuse likelihood ratios with “p-values” please; the latter have nothing to do with the inferential weight evidence bears).

In other words, none of them adjusts the likelihood ratio or weight afforded to the evidence to fit their predispositions.

Nevertheless, the results of the hypothetical study I described could still display the polarization the researcher found!

This table shows how: 

First, the individuals in this "sample" started with different priors.  Daphne, e.g., put the odds that human beings were causing climate change at 2:1 against (0.5:1 in favor) before she got the information.  Rita’s prior odds were 1000:1 against (.001:1 in favor). 

When they both afforded the new information a likelihood ratio of 3, Daphne flipped from the view that human beings “probably” weren’t responsible for climate change to the view that they probably were (1.5:1 or 3:2 in favor).  But because Rita was more strongly convinced that human beings weren’t causing climate change, she persisted in her belief that humans probably weren’t responsible for climate change even after appropriately adjusting her odds against downward (from 1000:1 to about 333:1) (Bullock 2009).

Second, the individuals in our sample started with differing amounts of knowledge about the existing evidence on climate change.  

In particular, Ron and Rose, it turns out, already knew about the evidence that the researcher showed them in the experiment! That's hardly implausible: members of the public are constantly being bombarded with information on climate change and similarly contentious topics.  Their priors—10:1 against human-caused climate change, and 2:1 in favor, respectively--already reflected their unbiased (I’m positing) assessment of that information (or its practical equivalent). 

They thus assigned the evidence a likelihood ratio of “1” in reporting their "after evidence" beliefs in the study not because they were conforming the likelihood ratio to their predispositions—indeed, they agree that the evidence is 3x more consistent with the hypothesis that humans are causing climate change than that they are not—but because their priors already reflected having given the information that weight when they previously encountered it in the real world.

If the “outcome variable” of the study is “what percentage of Republicans and Democrats think human activity is a principal cause of climate change,” then we will see polarization even with Bayesian information processing—i.e., without the sort of selective crediting of information that is the signature of politically motivated reasoning--because of the heterogeneity of the group members' priors.

Likewise, if we examine the “mean” probabilities assigned to AGW by the Democrats and Republicans, we find the differential grew in the information-exposed condition.  The reason, however, wasn't differences in how much weight they gave the information, but pre-treatment (pre-study) differences in their exposure to information equivalent to that conveyed to them in the experiment (Druckman, Fein & Leeper 2012).
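Here is a rough Python sketch of the six-person example. The priors for Rita, Ron, Rose, and Daphne are the ones given in the text. The priors for Donny and Dave are not stated in the text, so the values below are made up, as is the assumption that neither of them had seen the evidence before. The class and function names are mine.

```python
from dataclasses import dataclass

LIKELIHOOD_RATIO = 3.0  # weight everyone agrees the study deserves


@dataclass
class Citizen:
    name: str
    party: str          # "R" or "D"
    prior_odds: float   # odds in favor of human-caused climate change
    pretreated: bool    # True if the prior already reflects this evidence


CITIZENS = [
    Citizen("Rita", "R", 0.001, False),
    Citizen("Ron", "R", 0.1, True),
    Citizen("Rose", "R", 2.0, True),
    Citizen("Donny", "D", 3.0, False),   # hypothetical prior, not in the text
    Citizen("Dave", "D", 1.5, False),    # hypothetical prior, not in the text
    Citizen("Daphne", "D", 0.5, False),
]


def posterior_odds(c):
    # Pretreated citizens already counted this evidence, so in the study
    # their effective likelihood ratio is 1. No one's weighting is biased.
    effective_lr = 1.0 if c.pretreated else LIKELIHOOD_RATIO
    return c.prior_odds * effective_lr


def summarize(party, odds_of):
    odds = [odds_of(c) for c in CITIZENS if c.party == party]
    share_accepting = sum(o > 1.0 for o in odds) / len(odds)
    mean_probability = sum(o / (1.0 + o) for o in odds) / len(odds)
    return share_accepting, mean_probability


def partisan_gaps(odds_of):
    d_share, d_mean = summarize("D", odds_of)
    r_share, r_mean = summarize("R", odds_of)
    return d_share - r_share, d_mean - r_mean


before = partisan_gaps(lambda c: c.prior_odds)
after = partisan_gaps(posterior_odds)
print("gap in share accepting AGW: before %.2f, after %.2f" % (before[0], after[0]))
print("gap in mean probability:    before %.2f, after %.2f" % (before[1], after[1]))
```

With these numbers both gaps widen after exposure even though every citizen agrees the study has a likelihood ratio of 3, which is the point: the widening is produced by the spread of priors and by pretreatment, not by anyone fitting the weight of the evidence to their politics.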

In sum, given the study design, we can’t draw confident inferences that the subjects engaged in politically motivated reasoning.  They could have.  But because of the confounds of heterogeneous priors and pretreatment exposure to information, we could have ended up with exactly these results even if they were engaged in unbiased, Bayesian information processing.

To draw confident inferences, then, we need a better study design for politically motivated reasoning—one that avoids these confounds.

I describe that design in the “Politically Motivated Reasoning Paradigm” paper.  I call it the “Politically Motivated Reasoning Paradigm” (PMRP) design.

I’ll say more about it . . . “tomorrow”!


Druckman, J.N. The Politics of Motivation. Critical Review 24, 199-216 (2012).

Druckman, J.N., Fein, J. & Leeper, T.J. A source of bias in public opinion stability. American Political Science Review 106, 430-454 (2012).

Bullock, J.G. Partisan Bias and the Bayesian Ideal in the Study of Public Opinion. The Journal of Politics 71, 1109-1124 (2009).

Gerber, A. & Green, D. Misperceptions about Perceptual Bias. Annual Review of Political Science 2, 189-210 (1999).

Kahan, D.M. The "Politically Motivated Reasoning Paradigm." Emerging Trends in Social & Behavioral Sciences (in press).




New paper: "The Politically Motivated Reasoning Paradigm"

What is it, how do you measure it, is it ideologically symmetric, do any of the herbal supplements advertised as counteracting it really work, etc.  Take a look & find out.

Still time for revisions, so comments welcome!


"*Scientists* & identity-protective cognition? Well, on the one hand ... on the other hand ... on the *other* other hand ..." A fragment

Scientific proof that "skeptical" scientists are biased!

From something I'm working on. I'll post the rest of it "tomorrow," in fact.  But likely this section will end up on the cutting room floor (that's okay; there's lots of stuff down there & eventually I expect to find use for most of it someplace; is a bit of a fire hazard, though . . . .)

6. Professional judgment

Ordinary members of the public predictably fail to get the benefit of the best available scientific evidence when their collective deliberations are pervaded by politically motivated reasoning. But even more disturbingly, politically motivated reasoning might be thought to diminish the quality of the best scientific evidence available to citizens in a democratic society (Curry 2013).

Not only do scientists—like everyone else—have cultural identities. They are also highly proficient in the forms of System 2 information processing known to magnify politically motivated reasoning.   Logically, then, it might seem to follow that scientists’ factual beliefs about contested societal risks are likely skewed by the stake they have in conforming information to the positions associated with their cultural groups.

But a contrary inference would be just as “logical.” The studies linking politically motivated reasoning with the disposition to use System 2 information processing have been conducted on general public samples, none of which would have had enough scientists in them to detect whether being one matters. Unlike nonscientists with high CRT or Numeracy scores, scientists use professional judgment when they evaluate evidence relevant to disputed policy-relevant facts. Professional judgment consists in habits of mind, acquired through training and experience and distinctively suited to specialized forms of decisionmaking.  For risk experts, those habits of mind confer resistance to many cognitive biases that can distort the public’s perceptions (Margolis 1996).  It is perfectly plausible to believe that one of the biases that professional judgments can protect risk experts from is “politically motivated reasoning.”

Here, too, neither values nor positions on disputed policies can help decide between these competing empirical claims. Only evidence can.  To date, however, there are few studies of how scientists might be affected by politically motivated reasoning, and the inferences they support are equivocal. 

My spidey sense tells me this is a future classic!

Some observational studies find correlations between the positions of scientists on contested risk issues and their cultural or political orientations (Bolsen, Druckman, & Cook 2015; Carlton, Perry-Hill, Huber & Prokopy 2015).  The correlations, however, are much less dramatic than ones observed in general-population samples.  In addition, with one exception (Slovic, Malmfors et al. 1995), these studies have not examined scientists’ perceptions of facts in their own domains of expertise.

This is an important point. Professional judgment inevitably comprises not just conscious analytical reasoning proficiencies but perceptive sensibilities that activate those proficiencies when they are needed (Bedard & Biggs 1991; Marcum 2012). Necessarily preconscious (Margolis 1996), these sensibilities reflect the assimilation of the problem at hand to an amply stocked inventory of prototypes. But because these prototypes reflect the salient features of problems distinctive of the expert’s field, the immunity from bias that professional judgment confers can’t be expected to operate reliably outside the domain of her expertise (Dane & Pratt 2007).

A study that illustrates this point examined legal professionals.  In it, lawyers and judges, as well as a sample of law students and members of the public, were instructed to perform a set of statutory interpretation problems. Consistent with the PMRP design, the facts of the problems—involving behavior that benefited either illegal aliens or “border fence” construction workers; either a pro-choice or pro-life family counseling clinic—were manipulated in a manner designed to provoke responses consistent with identity protective cognition in competing cultural groups.  The manipulation had exactly that effect on members of the public and on law students.  But it didn’t on either judges or lawyers:  despite the ambiguity of the statutes and the differences in their own cultural values, those study subjects converged in their responses, just as one would predict if one expected their judgments to be synchronized by the common influence of professional judgment. Nevertheless, this relative degree of resistance to identity-protective reasoning was confined to legal-reasoning tasks: the judges and lawyers’ respective perceptions of disputed societal risks—from climate change to marijuana legalization—reflected the same identity-protective patterns observed in the general public and student samples (Kahan, Hoffman, Evans, Lucci, Devins & Cheng in press). Extrapolating, then, we might expect to see the same effect in risk experts: politically motivated divisions on policy-relevant facts outside the boundaries of their specific field of expertise; but convergence guided by professional judgment inside of them.

Or alternatively we might expect convergence not on positions that are true necessarily but that are so intimately bound up with a field’s own sense of identity that acceptance of them has become a marker of basic competence (and hence a precondition of recognition and status) within it.  In Koehler (1993), scientists active in either defending or discrediting scientific proof of “parapsychology” were instructed to review the methods of a fictional ESP study. The result of the study was experimentally manipulated: Half the scientists got one that purported to find evidence supporting ESP, the other half one that purported to find evidence not supporting it. The scientists’ assessments of the quality of the study’s methods turned out to be strongly correlated with the fit between the represented result and the scientists’ existing positions on the scientific validity of parapsychology—although Koehler found that this effect was in fact substantially more dramatic among the “skeptic” than the “non-skeptic” scientists. 

Koehler’s study reflects the core element of the PMRP design: the outcome measure was the weight that members of opposing groups gave to one and the same piece of evidence conditional on the significance of crediting it. Because the significance was varied in relation to the subjects’ prior beliefs and not their stake in some goal independent of forming an accurate assessment, the study can be, and normally is, understood to be a demonstration of confirmation bias.  But obviously, the “prior beliefs” in this case were ones integral to membership in opposing groups, the identity-defining significance of which for the subjects was attested to by how much time and energy they had devoted to promoting public acceptance of their respective groups’ core tenets. Extrapolating, then, one might infer that professional judgment might indeed fail to insulate from the biasing effects of identity-protective cognition scientists whose professional status has become strongly linked with particular factual claims.

So we are left with only competing plausible conjectures.  There’s nothing at all unusual about that. Indeed, it is the occasion for empirical inquiry—which here would take the form of the use of the PMRP design or one of equivalent validity to assess the vulnerability of scientists to politically motivated reasoning—both in and outside of the domains of their expertise, and with and without the pressure to affirm “professional-identity-defining” beliefs.


Curry, J. Scientists and Motivated Reasoning. Climate Etc. (Aug. 20, 2013)

Bedard, J.C. & Biggs, S.F. Pattern recognition, hypotheses generation, and auditor performance in an analytical task. Accounting Review, 622-642 (1991).

Bolsen, T., Druckman, J.N. & Cook, F.L. Citizens’, scientists’, and policy advisors’ beliefs about global warming. The ANNALS of the American Academy of Political and Social Science 658, 271-295 (2015).

Carlton, J.S., Perry-Hill, R., Huber, M. & Prokopy, L.S. The climate change consensus extends beyond climate scientists. Environmental Research Letters 10, 094025 (2015).

Dane, E. & Pratt, M.G. Exploring Intuition and its Role in Managerial Decision Making. Academy of Management Review 32, 33-54 (2007).

Kahan, D.M., Hoffman, D.A., Evans, D., Devins, N., Lucci, E.A. & Cheng, K. 'Ideology' or 'Situation Sense'? An Experimental Investigation of Motivated Reasoning and Professional Judgment. U. Pa. L. Rev. 164 (in press).

Koehler, J.J. The Influence of Prior Beliefs on Scientific Judgments of Evidence Quality. Org. Behavior & Human Decision Processes 56, 28-55 (1993).

Marcum, J.A. An integrated model of clinical reasoning: dual-process theory of cognition and metacognition. Journal of Evaluation in Clinical Practice 18, 954-961 (2012).

Margolis, H. Dealing with risk : why the public and the experts disagree on environmental issues (University of Chicago Press, Chicago, IL, 1996).

Margolis, H. Patterns, thinking, and cognition : a theory of judgment (University of Chicago Press, Chicago, 1987).

Slovic, P., Malmfors, T., Krewski, D., Mertz, C.K., Neil, N. & Bartlett, S. Intuitive toxicology .2. Expert and lay judgments of chemical risks in Canada. Risk Analysis 15, 661-675 (1995).




Disentanglement principle corollary no. 16a: "You don't have to choose ... between being a reality tv star & being excited to learn what science knows (including what it knows about how people come to know what's known by science)"

Sometimes 1 or 2 of the 14 billion regular followers of this blog ask, "are there really 14 billion regular followers of this blog?..." 

Yeah. There really are!


"Hey Joe": "Practical scholarship" on climate "science communication"

Sorry for lack of context here, but my guess is that it will become clear enough after a few sentences.

Dear Joe:

I apologize for disparaging your work at the Society for Risk Analysis session yesterday.  You perceived my remarks that way, and on reflection I can see why you did, & why others likely formed the same impression.  I truly regret that.

In fact, it wasn’t your work that I meant to be criticizing. 

My intention was to respond to the argument you presented (with the admirable degree of clarity I wish I had been able to summon in response) in favor of “practical scholarship.”  Because you see, I don’t think the sort of work you defended is either practical or scholarly.

You  proposed to those in the room that the empirical study of climate science communication should be evaluated in light of its contribution to a “goal” of promoting a “world war II scale mobilization” of public opinion (I encourage you to post your slides; they were very well done). 

Research aimed at identifying the significance of values & science comprehension for public conflict on climate change (the subject of the panel we were both on; great new research unveiled by the Shi, Visschers, Siegrist team!) doesn’t meet this criterion, you made clear. Indeed, it detracts from it, because, in your opinion, it implies change will take a “long time” (I disagree it implies any such thing but that’s another matter).

As an example of research that is “practical,” you offered your own, which you characterized as aimed at convincing democratic representatives that their prospects for re-election depend on honoring the sorts of “public preferences” revealed by the structured preference-elicitation  methods you described.

You also stated that your work, along with that of others, is intended to “create cover” for officials to take positions supportive of climate change policies (a common refrain among researchers who generate endless streams of public opinion polls purporting to find that there is in fact widespread public consensus for one or another climate change mitigation initiative). 

We should all pitch in to help achieve this result, you exhorted.

Again, to be clear, my point is that this vision of empirical work on science communication is neither “scholarly” nor “practical.”

Scholarship—of the empirical variety, in any event—tries to help people figure out what’s true, particularly under conditions in which there are multiple plausible understandings of phenomena of consequence.  That’s what the scholarship on the relationship between “values” and “science literacy” that you disparaged is about.  The occasion for that scholarly inquiry is a practical one: to figure out what sorts of dynamics are blocking public engagement with the best available evidence on climate change.

What’s definitely not practical (as Theda Skocpol has noted) is to think that public opinion researchers can be mobilized into a project to “show” elected officials what the public “really” wants.

Elected officials are in the profession of satisfying the expectations of their constituents. They invest plenty of money, most of the time wisely, to figure out how to do that.

They know that surveys purporting to show that a “majority” of Republicans support “the EPA's greenhouse gas emission standards” are measuring non-opinion.   They know too that the sort of preference-elicitation methods you demonstrated—however truly valuable they might be for learning about cognition—are not modeling the decisionmaking dynamics that determine election outcomes. 

Most importantly, they know—because those who agree with your conception of “practical scholarship” are constantly proclaiming this-- that your goal is to create an impression in these actors for your own purposes: to help “shove” them into supporting a particular set of policies (enough with these “nudges” already, you inspiringly proclaimed: we are facing the moral equivalent of Hitler invading Europe!), not help them get re-elected. 

They know, in short, that “non-opinion” survey methods are actually intended to message them!  And I would have sort of thought this was obvious, but it’s not a very good “messaging strategy” to incessantly go on & on within earshot of Republicans about “strategies” for “overcoming” the “Republicans' cognitive resistance to climate mitigation.”

The targeted politicians (Democrat and Republican) therefore sensibly discount (ignore really) everything produced by researchers who are following this "message the politicians" strategy.  They listen instead to the professionals, who tell them something very different from what these "practical scholars" are saying (over & over & over; “keep repeating—that it hasn't worked yet is proof that we just need to do it for longer!,”--another refrain inside this bubble).  Politicians who take what these researchers say at face value, they’ve observed, get knocked out of office. 

I believe there is plenty that science communication researchers  can do to help actual people, including elected officials, promote science-informed decisionmaking relating to climate change by collaborating with them to adapt and test lab insights to their real-world problems. 

The form of research that I think is best for that aims to help those decisionmakers change the meaning of climate change in their communities, so that discussions of it no longer are perceived as being about “whose side are you on” but instead about “what do we know, what more do we need to know, and what should we do.”

That research doesn't try to conjure a new world into existence by disseminating "studies" that constantly purport to find it already exists. 

It tries to supply people who actually are acting to make such a world with empirical information that they can use to exercise their judgment as best as they can.

Indeed, what motivated my rebuke of you yesterday was frustration at how closely aligned the program you defended (very clearly, very articulately) is with divisive forms of partisan advocacy that actually perpetuate the social meanings that make climate change a “struggle for the soul of America” rather than a practical problem that all Americans, regardless of their cultural identities, have a common interest in fighting. 

Frustration too at how much the sort of "practical" "scholarship" you called for is distracting and diverting and confusing people who are looking to empirical researchers for help.

At how self-defeating it obviously is ever to propose that a criterion other than “figuring out & sharing one’s best understanding of the truth on contested empirical issues” could possibly be practical.   

How twisted it is to call that singularly unscientific orientation  “science communication” research!

It's pretty simple really: Tell people what they need to know, not what they want to hear.

That’s both ethical and practical.

Again, sorry I disparaged your scholarly work, which I think can teach people a lot about how people think. 

The intended target was your conception of “practical scholarship.”  And I did very much intend to be critical of that view and of those who are propagating the mindset you very much evinced in your talk.




p.s. My slides from talk on the challenge of "unconfounding" knowledge & identity in measuring "climate change science comprehension."


Mine goes to 11 ... or 10, at least, for now

What to do when stuck in Ft. Lauderdale airport b/c missing connecting flight to Keys?....

See what happens when the "Rules of Evidence Are Impossible CBR Simulator" is expanded from "8 item of proof" size cases to "10 item of proof" size ones!

Lots of people, no doubt thinking of the wildly popular "Miller-Sanjurjo Turing Machine" (MSTM), have been writing asking if a version of the CBR simulator will be made available for home use by CCPB subscribers... Stay tuned!


Cultural "fact polarization" trumps cultural "value" polarization -- a fragment

Working on this.  Rest "tomorrow."

1. The new politics of “fact polarization”

Polarization over questions of fact is one of the signature features of contemporary democratic political life.  Citizens divided over the relative weight of “liberty” and “equality” are less sharply divided today over the justice of progressive taxation (Moore 2015) than over the evidence that human  CO2 emissions are driving up global temperatures (Frankovic 2015).  Democrats and Republicans argue less strenuously about whether states should be permitted to require the "reading of the Lord's prayer" in school than whether permitting citizens to carry concealed handguns in public increases homicide rates—by multiplying the number of firearms in society—or instead decreases them by equipping law-abiding citizens to protect themselves from predation (Newport 2015).

Members of cultural groups that confer status to women for their mastery of domestic roles love their daughters as much as members of those who celebrate the world of commerce and public affairs as status-conferring arenas for men and women alike (Luker 1984). Yet the two cannot agree about the consequences of universally immunizing middle-school girls against the human papilloma virus: does that policy promote the girls’ health by protecting them later in life from an extremely prevalent sexually  transmitted disease linked to cervical cancer; or endanger them by lulling them into unprotected sex right now, thereby increasing their risks of becoming pregnant and of contracting other, even more deadly STDs (Kahan, Braman, Cohen, Gastil & Slovic 2010)?

These are admittedly complex questions.  But they are empirical ones. Values can’t supply the answers; only evidence can. The evidence that is relevant to any one of these factual issues, moreover, is completely distinct from the evidence relevant to any of the others.  There is simply no logical reason, in sum, for positions on these and various other policy-relevant facts (the safety of deep geologic isolation of nuclear wastes, the deterrent impact of the death penalty, the efficacy of invasive forms of surveillance to combat terrorism, etc.) to cluster at all, much less to form packages of beliefs that so strongly unite citizens of shared cultural commitments and so persistently divide citizens of opposing ones.

But there is a psychological explanation for today’s politics of “fact polarization.”  Or at least a very strong candidate explanation, the emergence of which has supplied an energizing focus for research and debate in the decision sciences over the course of the last decade. . . . 


Frankovic, K. Most republicans do not think humans are causing climate change. YouGov. (2015).

General Social Survey (2014).

Luker, K. Abortion and the politics of motherhood (University of California Press, Berkeley, 1984).



Weekend update: Is critical reasoning domain independent or domain specific?... a fragment of an incomplete rumination

An adaptation of a piece of correspondence--one no longer, really, than this-- w/ a thoughtful person who proposed that people have "corrective mechanisms" for the kind of "likelihood ratio cascade" that I identified with "coherence based reasoning" and that I  asserted makes "rules of evidence" impossible:

What are these corrective mechanisms?

I ask not because I doubt they exist but because I suspect that they do -- & that their operation has evaded full understanding because of a mistaken assumption central to the contemporary study of cognition.

That assumption is that reasoning proficiencies--the capacity to recognize covariance, give proper effect to base rates, distinguish systematic relationships from chance co-occurrences, & perform like mental operations essential to making valid inferences--are more or less discrete, stand-alone "modules" within a person's cognitive repertoire.

If the modules are there, and are properly calibrated, a person will reliably summon them for any particular task that she happens to be doing that depends on that sort of mental operation.

Call this the "domain independent" conception (DI) of cognitive proficiency. DI is presupposed by standardized assessments like the Cognitive Reflection Test (Frederick 2005) and Numeracy (Peters et al. 2006), which purport to measure the specified latent reasoning capacities "in general," that is, abstracted from anything in particular one might use them for.

Another conception sees cognitive proficiency as intrinsically domain specific. On this view--call it the DS conception--it's not accurate to envision reasoning abilities of the sort I described as existing independently of the activities that people use them for (cf. Hetherington 2011).

Accordingly, a person who performs miserably in a context-free assessment of, say, the kind of logical-reasoning proficiency measured by an abstract version of the Wason Selection Task-- one involving cards with vowels and numbers on either side -- might in fact always (or nearly always!) perform that sort of mental operation correctly in all the real-world contexts that she is used to encountering that require it. In fact, people do very well at the Wason Selection Task when it is styled as something more familiar--like detecting a norm violator (Gigerenzer & Hug 1992).

In sum, reasoning proficiencies are not stand-alone modules but integral components of action-enabling mental routines that are reliably summoned to mind by a person's perception of the sorts of recurring problem situations those routines, including their embedded reasoning proficiencies, help her to negotiate.

DS is suspicious of standardized assessments, including the usual stylized word problems that are thought by decision scientists to evince one or another type of "cognitive bias."  By (very deliberately) effacing the contextual cues that summon to mind the mental routines and embedded reasoning proficiencies necessary to address recurring problem situations, such tests grossly overstate the "boundedness" of human rationality (Gigerenzer 2000).

Indeed, by abstracting from any particular use to which people might put the reasoning proficiencies they are evaluating, such assessments and problems are actually measuring only how good people are at doing tests. In fact, people can train themselves to become very proficient at a difficult type of reasoning task for purposes of taking an exam on it and then evince complete innocence of that same sort of knowledge in the real-world settings where it actually applies (DiSessa 1982)!

DI and DS have different accounts of "expertise" in fields that involve reasoning tasks that are vulnerable to recurring cognitive biases. DI  identifies that expertise with the cultivation of general, context-free habits of mind that evince the disposition to use "conscious, effortful" ("system 2") forms of information processing (Sunstein 2005).

DS, in contrast, asserts that "expertise" consists in the possession of  mental routines, and their embedded reasoning proficiencies, specifically suited for specialized tasks. Those mental routines  include the calibration of rapid, intuitive, pre-conscious, affective forms of cognition (or better, recognition) that reliably alert the expert to the need to bring certain conscious, effortful mental operations to bear on the problem at hand. The proper integration of reciprocal forms of intuitive and conscious forms of cognition tailored to specialized tasks is the essence of professional judgment.

Nonexperts can be expected to display one or another bias when confronted with those same problems.  But the reason isn't that the nonexpert "thinks differently" from the expert; it's that the expert has acquired through training and experience mental routines suited to do things that are different from anything the ordinary person has occasions to do in his or her life  (Margolis 1987, 1993, 1996). 

Indeed, if one confronts an expert with a problem divorced from all the cues that reliably activate the cognitive proficiencies she uses when she performs professional tasks, one is likely to find that the expert, too, is vulnerable to all manner of cognitive bias.

But if one infers from that that the expert therefore can't be expected to resist those biases in her professional domain, one is making DI's signature mistake of assuming that reasoning proficiencies are stand-alone modules that exist independent of mental routines specifically suited for doing particular things  (cf. Kahan, Hoffman, Evans, Lucci, Devins & Cheng in press) ....

Or that at least is what a DS proponent would say.

She might, then, also agree that the reason-eviscerating quality of "coherence based reasoning" supplies us with grounds to professionalize fact-finding in legal proceedings.

Not because "jurors" or other "nonexperts" are "stupid." But because it is stupid to think that doing what is required to make accurate findings of fact in legal proceedings does not depend on the cultivation of habits of mind specifically suited for that task.

I tend to think the DS proponent comes closer to getting it right. But of course, I'm not really sure.


DiSessa, A.A. Unlearning Aristotelian Physics: A Study of Knowledge‐Based Learning. Cognitive science 6, 37-75 (1982).

Frederick, S. Cognitive Reflection and Decision Making. Journal of Economic Perspectives 19, 25-42 (2005).

Gigerenzer, G. Adaptive thinking : rationality in the real world (Oxford University Press, New York, 2000).

Gigerenzer, G. & Hug, K. Domain-specific reasoning: Social contracts, cheating, and perspective change. Cognition 43, 127-171 (1992). 

Hetherington, S.C. How to know : a practicalist conception of knowledge (J. Wiley, Chichester, West Sussex, U.K. ; Malden, MA, 2011).

Kahan, D.M., Hoffman, D.A., Evans, D., Devins, N., Lucci, E.A. & Cheng, K. 'Ideology' or 'Situation Sense'? An Experimental Investigation of Motivated Reasoning and Professional Judgment. U. Pa. L. Rev. 164 (in press).

Margolis, H. Dealing with risk : why the public and the experts disagree on environmental issues (University of Chicago Press, Chicago, IL, 1996).

Margolis, H. Paradigms and Barriers (1993).

Margolis, H. Patterns, thinking, and cognition : a theory of judgment (University of Chicago Press, Chicago, 1987).

Peters, E., Västfjäll, D., Slovic, P., Mertz, C.K., Mazzocco, K. & Dickert, S. Numeracy and Decision Making. Psychol Sci 17, 407-413 (2006).

Sunstein, C.R. Laws of fear : beyond the precautionary principle (Cambridge University Press, Cambridge, UK ; New York, 2005). 



"Inherent internal contradictions" don't cause bad institutions to collapse; they just suck ... "Rules of evidence are impossible," part 3 (another report for Law & Cognition seminar)

Time for part 3 of this series: Are Rules of Evidence Impossible?

The answer is yes, as I said at the very beginning.

But I didn’t say why & still haven’t.

Instead, I spent the first two parts laying the groundwork necessary for explanation.  Maybe you can build the argument on top of it yourself at this point?! If so, skip ahead to “. . . guess what?”—or even skip the rest of this post altogether & apply your reason to something likely to teach you something new!

But in the event you can’t guess the ending, or simply need your “memory refreshed” (see Fed. R. Evid. 612), a recap:

Where were we? In the first part, I described a conception of the practice of using “rules of evidence”—the Bayesian Cognitive Correction Model (BCCM). 

BCCM conceives of rules of evidence as instruments for “cognitively fine tuning” adjudication. By selectively admitting and excluding items of proof, courts can use the rules to neutralize the accuracy-diminishing impact of one or another form of biased information processing--from identity-protective reasoning to the availability effect, from hindsight bias to baserate neglect, etc.  The threat these dynamics pose to accurate factfinding is their tendency to induce the factfinder to systematically misestimate the weight, or in Bayesian terms the “likelihood ratio” (LR), to be assigned items of proof (Kahan 2015). 
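To make the “likelihood ratio” vocabulary concrete, here is a minimal Python sketch of Bayesian updating in odds form, where posterior odds equal prior odds times the product of the LRs. The function names and the specific LR values are hypothetical, chosen only for illustration; nothing here is taken from the earlier parts of this series.

```python
from math import prod

def posterior_odds(prior_odds, likelihood_ratios):
    """Bayes' rule in odds form: posterior odds = prior odds * product of LRs."""
    return prior_odds * prod(likelihood_ratios)

def odds_to_probability(odds):
    """Convert odds in favor of a proposition into a probability."""
    return odds / (1.0 + odds)

# Hypothetical case: even prior odds and three items of proof.
true_lrs = [2.0, 0.5, 4.0]
unbiased = posterior_odds(1.0, true_lrs)

# Same case, but a bias leads the factfinder to overweight the third item
# (a perceived LR of 8 instead of the assumed "true" LR of 4).
perceived_lrs = [2.0, 0.5, 8.0]
biased = posterior_odds(1.0, perceived_lrs)

print(unbiased, odds_to_probability(unbiased))
print(biased, odds_to_probability(biased))
```

On this way of putting it, a cognitive bias matters to accuracy exactly to the extent that it pushes the perceived LRs away from the true ones.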

In part 2, I discussed a cognitive dynamic that has that sort of consequence: “coherence based reasoning” (CBR).

Under CBR (Simon 2004; Simon, Pham, Quang & Holyoak 2001; Carlson & Russo 2001), the factfinder’s motivation to find “coherence” in the trial proof creates a looping feedback effect. 

Once the factfinder forms the perception that the accumulated weight of the evidence supports one side, he begins to inflate or discount the weight of successive items of proof as necessary to conform them to that position.  He also turns around and revisits already-considered items of proof and reweights them to make sure they fit that position, too. 

His reward is an exaggerated degree of confidence in the correctness of that outcome—and thus the peace of mind that comes from never ever having to worry that maybe, just maybe he got the wrong answer.

The practical consequences are two.  First, by virtue of the exaggerated certainty the factfinder has in the result, he will sometimes rule in favor of a party that hasn’t carried its burden under a heightened standard of proof like, say, “beyond a reasonable doubt,” which reflects the law’s aversion to “Type 1” errors when citizens’ liberty is at stake.

Second, what position the factfinder comes to be convinced is right will be arbitrarily sensitive to the order of proof.  The same strong piece of evidence that a factfinder dismisses as inconsistent with what she is now committed to believing is true could have triggered a “likelihood ratio cascade” in exactly the opposite direction had that item of proof appeared “sooner”-- in which case the confidence it instilled in its proponent's case would have infected the factfinder's evaluation of all the remaining items of proof.

If you hung around after class last time for the “extra credit”/“optional” discussion, I used a computer simulation to illustrate these chaotic effects, and to show why we should expect the accuracy-eviscerating consequences of them to be visited disproportionately on innocent defendants in criminal proceedings.
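For readers who missed it, the order-of-proof effect can be sketched in a few lines of Python. This is not the simulator from last time. It is a toy model resting on one loudly flagged assumption: that the perceived LR of each new item equals its true LR multiplied by the factfinder's current odds raised to a power k (the name cbr_final_odds, the value k = 0.5, and the LRs are all hypothetical). It also leaves out the re-weighting of already-considered items described above.

```python
def cbr_final_odds(true_lrs, k=0.5, prior_odds=1.0):
    """Toy CBR model: each perceived LR is pulled toward the currently favored side.

    Assumption (not from the post): perceived LR = true LR * (current odds ** k).
    With k = 0 the factfinder is an unbiased Bayesian.
    """
    odds = prior_odds
    for t_lr in true_lrs:
        p_lr = t_lr * odds ** k  # biased, perceived likelihood ratio
        odds *= p_lr             # update on the perceived weight
    return odds

# A "dead heat": the true LRs multiply to 1.
strong_item_first = [25.0, 0.2, 0.2]
strong_item_last = [0.2, 0.2, 25.0]

unbiased = cbr_final_odds(strong_item_first, k=0.0)
odds_first = cbr_final_odds(strong_item_first)
odds_last = cbr_final_odds(strong_item_last)

print(unbiased, odds_first, odds_last)
```

Under these assumptions the same three items of proof leave the factfinder strongly favoring one side when the strong item comes first and favoring the other side when it comes last, even though the unbiased odds are even in both orderings.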

This is definitely the sort of insult to rational-truth-seeking that BCCM was designed to rectify!

But guess what?

It can’t! The threat CBR poses to accuracy is one the BCCM conception of “rules of evidence” can’t possibly counteract!

As I explained in part 1, BCCM consists of three basic elements:

  1. Rule 401, understood as a presumption that evidence with LR ≠ 1 is admissible (Lempert 1977);

  2. a conception of “unfair prejudice” under Rule 403 that identifies it as the tendency of a piece of relevant evidence to induce a flesh-and-blood factfinder to assign incorrect LRs to it or other items of proof (Lempert 1977); and
  3. a strategy for Rule 403 weighing that directs the court to exclude “relevant” evidence when the tendency it has to induce the factfinder to assign the wrong LR to that or other pieces of evidence diminishes accurate assessment of the trial proof to a greater extent than constraining the factfinder to effectively treat the evidence in question as having no weight at all, or LR = 1 (Kahan 2010).

The problem is that CBR injects this “marginal probative value vs. marginal prejudice” apparatus with a form of self-contradiction, both logical and practical.

There isn’t normally any such contradiction. 

Imagine, e.g., that a court was worried that evidence of a product redesign intended to avoid a harmful malfunction might trigger “hindsight bias,” which consists in the tendency to inflate the LRs associated with items of proof that bear on how readily one might have been able to predict the need for and utility of such a design ex ante (Kamin & Rachlinski 1995).  (Such evidence is in theory—but not in practice— “categorically excluded” under Rule 407, when the correction was made after the injury to the plaintiff; but in any case, Rule 407 wouldn’t apply, only Rule 403 would, if the change in product design were made after injuries to third parties but before the plaintiff herself was injured by the original product—even though the same “hindsight bias” risk would be presented).

“All” the judge has to do in that case is compare the marginal accuracy-diminishing impact of [1] giving no weight at all to the evidence (LR = 1) on the "facts of consequence"  it should otherwise have made "more probable" (e.g., the actual existence of alternative designs and their cost-effectiveness) and [2] the inflationary effect of admitting it on the LRs assigned to the evidence bearing on every other fact of consequence (e.g., what a reasonable manufacturer would have concluded about the level of risk and feasibility of alternative designs at the time the original product was designed).
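A rough Python sketch of that comparison follows. It rests on illustrative assumptions that are mine, not the rules': accuracy loss is measured as the absolute distortion in log-odds, admitting the item inflates the LR of each affected item by a constant factor, and all function names and numbers are hypothetical.

```python
from math import log

def exclusion_error(item_lr):
    """Log-odds lost by treating the item as LR = 1 instead of its true LR."""
    return abs(log(item_lr))

def admission_error(inflation_factor, n_affected_items):
    """Log-odds distortion from inflating each affected item's LR by a fixed factor."""
    return n_affected_items * abs(log(inflation_factor))

def should_exclude(item_lr, inflation_factor, n_affected_items):
    """Exclude only if admitting distorts the outcome more than excluding does."""
    return admission_error(inflation_factor, n_affected_items) > exclusion_error(item_lr)

# Hypothetical redesign evidence with true LR = 3 that inflates other LRs by 1.5x.
few_items_affected = should_exclude(3.0, 1.5, n_affected_items=2)
many_items_affected = should_exclude(3.0, 1.5, n_affected_items=4)

print(few_items_affected, many_items_affected)
```

Note that in this sketch the inflation factor is set independently of the item's own LR, which is what makes the two quantities separable for a bias like hindsight.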

A thoughtful person might wonder about the capacity of a judge to make that determination accurately, particularly because weighing the “marginal accuracy diminishing impact” associated with admission and with exclusion, respectively,  actually requires the judge to gauge the relative strength of all the remaining evidence in the case. See Old Chief v. U.S., 519 U.S. 172, 182-85 (1997).

But making such a determination is not, in theory at least, impossible.

What is impossible is doing this same kind of analysis when the source of the “prejudice” is CBR.  When a judge uses BCCM to manage the impact of hindsight bias (or any other type of dynamic inimical to rational information-processing), “marginal probative value” and “marginal prejudice”—the quantities she must balance—are independent.

But when the bias the judge is trying to contain is CBR, “marginal probative value” and “marginal prejudice” are interdependent—and indeed positively correlated.

What triggers the “likelihood ratio cascade” that is characteristic of CBR as a cognitive bias is the correct LR the factfinder assigned to whatever item of proof induced the factfinder to form the impression that one side’s position was stronger than the other’s. Indeed, the higher (or lower) the “true” LR of that item of proof, the more confident the factfinder will be in the position that evidence supports, and hence the more biased the factfinder will thereafter be in her assessment of the weight due other pieces of evidence (or equivalently, the more indifferent she'll become to the risk of erring in the direction of that position (Scurich 2012)).

To put it plainly, CBR creates a war between the two foundational “rules of evidence”: the more relevant evidence is under Rule 401 the more unfairly prejudicial it becomes for purposes of Rule 403.  To stave off the effects of CBR on accurate factfinding, the court would have to exclude from the case the evidence most integral to reaching an accurate determination of the facts.

Maybe an illustration would be useful?

This is one case plucked from the sort of simulation that I ran yesterday:

It shows how, as a result of CBR, a case that was in fact a “dead heat” can transmute into one in which the factfinder forms a supremely confident judgment that the facts supporting one side’s case are “true.”

The source of the problem, of course, is that the very “first” item of proof had LR = 25, initiating a “likelihood ratio cascade” as reflected in the discrepancy between the "true" LRs—tLRs—and "biased" perceived LRs—pLRs—for each subsequent item of proof.

A judge applying the BCCM conception of Rule 403 would thus recognize that "item of proof No. 1" is injecting a huge degree of “prejudice” into the case. She should thus exclude proof item No. 1, but only if she concludes that admitting it will diminish the accuracy of the outcome more than preventing the factfinder from giving this highly probative piece of evidence any effect whatsoever.

When the judge engages in this balancing, she will in fact observe that the effect of excluding that evidence distorts the accuracy of the outcome just as much as admitting it does--but in the opposite direction. In this simulated case, assigning item No. 1 an LR = 1—the formal effect of excluding it—now induces the factfinder to conclude that the odds against that party’s position being true are 5.9x10^2:1, or that there is effectively a 0% chance that that party’s case is well-founded.

That’s because the very next item of proof has LR = 0.04 (the inverse of LR = 25), and thus triggers a form of “rolling confirmation bias” that undervalues every subsequent item of proof.
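Here is a minimal sketch of that exclusion problem, for anyone who wants to poke at it. Everything specific in it is an assumption rather than the simulation behind the figure: the six-item case (only the LR = 25 and LR = 0.04 items come from the discussion above), the helper names, and the functional form of the bias, which multiplies each true LR by 2^log10(current odds), so that the distortion is 1 at even odds and reaches a factor of 2 at 10:1.

```python
import math

def cbr_factor(odds, strength=2.0):
    # Assumed bias: 1 at 1:1 odds, `strength` at 10:1, 1/strength at 1:10.
    return strength ** math.log10(odds)

def final_odds(true_lrs, biased=True, prior_odds=1.0):
    # Update item by item; under CBR each perceived LR is the true LR
    # distorted by the factfinder's current odds.
    odds = prior_odds
    for t_lr in true_lrs:
        p_lr = t_lr * cbr_factor(odds) if biased else t_lr
        odds *= p_lr
    return odds

def exclude(true_lrs, index):
    # Excluding an item is treated as removing it (i.e., LR = 1).
    return true_lrs[:index] + true_lrs[index + 1:]

# Hypothetical dead-heat case: the true LRs multiply to 1.
CASE = [25.0, 0.04, 2.0, 0.5, 2.0, 0.5]

print("true odds:            ", final_odds(CASE, biased=False))
print("CBR, all admitted:    ", final_odds(CASE))
print("CBR, item 1 excluded: ", final_odds(exclude(CASE, 0)))
print("CBR, items 1-2 out:   ", final_odds(exclude(exclude(CASE, 0), 0)))
print("everything excluded:  ", final_odds([]))
```

Under these assumptions, admitting everything tips the factfinder toward the first party, while excluding item No. 1 hands the cascade to the LR = 0.04 item and tips her the other way. Only the empty case lands exactly on the 1:1 prior.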

So if the judge were to exclude item No. 1 b/c of its tendency to excite CBR, she’d face the same issue again in ruling on a motion to exclude item No. 2.

And guess what? If she assesses the impact of excluding that super probative piece of evidence (one that favored one party’s position 25x more than the other’s), she’ll again find that the “accuracy diminishing impact” of doing so is as high as not excluding: the remaining evidence in the case is configured so that the factfinder is impelled to a super-confident conclusion in favor of the first party once more!

And so forth and so on.

As this illustration should remind you, CBR also has the effect of making outcomes arbitrarily sensitive to the order of proof. 

Imagine item 1 and item 2 had been “encountered” in the opposite “order” (whether by virtue of the point at which they were introduced at trial, the relative salience of them to the factfinder as he or she reflected on the proof as a whole, or the role that post-trial deliberations had in determining the sequence with which particular items of proof were evaluated). 

The factfinder in that case would indeed have formed just as confident a judgment--but one in support of the opposite party:

Again, the judge will be confronted with the question whether the very “first” item of proof—what was item No. 2  in the last version of this illustration—should be excluded under Rule 403. When she works this out, moreover, she’ll end up discovering that the consequence of excluding it is the same as was the consequence of excluding item No. 1—LR = 25—in our alternative-universe version of the case: a mirror-image degree of confidence on the factfinder's part about the strength of the opposing party’s case.  And so on and so forth.

See what’s going on?

The only way for the judge to assure that this case gets decided “accurately” is to exclude every single piece of evidence from the trial, remitting the jury to its priors—1:1—which, by sheer accident, just happened to reflect the posterior odds a “rational factfinder” would have ended up with after fairly assigning each piece of evidence its “true” LR.

Not much point having a trial at all under those circumstances!

Of course, the evidence, when properly considered, might have more decisively supported one side or the other.  But what a more dynamic simulation--one that samples from all the various distributions of case strength one cares to imagine-- shows us is that there’s still no guarantee the factfinder would have formed an accurate impression of the strength of the evidence in that circumstance either.

To assure an accurate result in such a case, the judge, under the BCCM conception of the rules of evidence, would still have been obliged to try to deflect the accuracy-vitiating impact of CBR away from the factfinder’s appraisal of the evidence by Rule 403 balancing. 

And the pieces of evidence that the judge would be required in such a case to exclude would be the ones most entitled to be given a high degree of weight by a rational factfinder!  The impact of doing so would be to skew consideration of the remainder of the evidence without offsetting exclusions of similarly highly relevant pieces of proof. . . . 

Again, no point in even having  a trial if that’s how things are going to work. The judge should just enter judgment for the party she thinks “deserves” to win.

There is of course no reason to believe a judge could “cognitively fine-tune” a case with the precision that this illustration envisions.  But all that means is that the best a real judge can ever do will always generate an outcome that we have less reason to be confident is “right” than we would have had had the judge just decided the stupid case herself on the basis of her own best judgment of the evidence.

Of course, why should we assume the judge herself could make an accurate assessment, or reasonably accurate one, of the trial proof?  Won’t she be influenced by CBR too—in a way that distorts her capacity to do the sort of “marginal probative value vs. marginal prejudice” weighing that the BCCM conception of Rule 403 imagines?

If you go down this route, then you again ought to conclude that “rules of evidence are impossible” even without contemplating the uniquely malicious propensities of CBR.  Because if this is how you see things (Schauer 2006), there will be just as much reason to think that the judge’s performance of such balancing will be affected by all the other forms of cognitive bias that she is trying to counteract by use of BCCM’s conception of Rule 403 balancing.

I think that anxiety is in fact extravagant—indeed silly.

There is plenty of evidence that judges, by virtue of professionalization, develop habits of mind that reasonably insulate them from one or another familiar form of cognitive bias when the judges are making in-domain decisions—i.e., engaging in the sort of reasoning they are supposed to as judges (Kahan, Hoffman, et al. in press; Guthrie, Rachlinski & Wistrich 2007).

That’s how professional judgment works generally!

But now that I’ve reminded you of this, maybe you can see what the “solution” is to the “impossibility” of the rules of evidence?

Even a jurist with exquisite professional judgment cannot conceivably perform the kind of “cognitive fine-tuning” envisioned by the “rules of evidence” -- the whole enterprise is impossible.

But what makes such fine tuning necessary in the first place is the law’s use of  non-professional decisionmakers divorced from any of the kinds of insights and tools that professional legal truthseekers would actually use.

Jurors aren’t stupid.  They are equipped with all the forms of practical judgment that they need to be successful in their everyday lives.

What's stupid is to think that making reliable assessments of fact in the artificial environment of a courtroom adversarial proceeding is one of the things everyday life equips them to do. 

Indeed, it's absurd to think that that environment is conducive to the accurate determination of facts by anyone.

A procedural mechanism that was suited for accurately determining the sorts of facts relevant to legal determinations would have to look different from anything we see in everyday life, b/c making those sorts of determinations isn't something that everyday life requires.

No more than having to practice medicine, repair foreign automobiles, or write publicly accessible accounts of relativity is (btw, happy birthday Die Feldgleichungen der Gravitation).

Ordinary, sensible people rely on professionals -- those who dedicate themselves to acquiring expert knowledge and corresponding forms of reasoning proficiency -- to perform specialized tasks like these.

The “rules of evidence” are impossible because the mechanism we rely on to determine the “truth” in legal proceedings—an adversary system with lay factfinders—is intrinsically flawed. 

No amount of fine-tuning by “rules of evidence” will  ever make that system capable of delivering the accurate determinations of their rights and obligations that citizens of an enlightened democratic state are entitled to.

We need to get rid of the current system of adjudication and replace it with a professionalized system that avails itself of everything we know about how the world works, including how human beings reason and how they can be trained to reason when doing  specialized tasks.

And we need to replace, too, the system of legal scholarship that generates the form of expertise that consists in being able to tell  soothing, tranquilizing, narcotizing just-so stories about how well suited the “adversary system” would be for truth-seeking with just a little bit  more "cognitive fine tuning" to be implemented through the rules of evidence.

That element of our legal culture is as antagonistic to the goal of truth-seeking as any of the myriad defects of the adversary system itself. . . .

The end!


Guthrie, C., Rachlinski, J.J. & Wistrich, A.J. Blinking on the bench: How judges decide cases. Cornell Law Rev 93, 1-43 (2007).

Kahan, D.M. The Economics—Conventional, Behavioral, and Political—of "Subsequent Remedial Measures" Evidence. Columbia Law Rev 110, 1616-1653 (2010).

Kahan, D.M., Hoffman, D.A., Evans, D., Devins, N., Lucci, E.A. & Cheng, K. 'Ideology' or 'Situation Sense'? An Experimental Investigation of Motivated Reasoning and Professional Judgment. U. Pa. L. Rev. 164 (in press).

Kahan, D.M. Laws of cognition and the cognition of law. Cognition 135, 56-60 (2015).

Kamin, K.A. & Rachlinski, J.J. Ex Post ≠ Ex Ante - Determining Liability in Hindsight. Law Human Behav 19, 89-104 (1995).

Lempert, R.O. Modeling Relevance. Mich. L. Rev. 75, 1021-57 (1977).

Pennington, N. & Hastie, R. A Cognitive Theory of Juror Decision Making: The Story Model. Cardozo L. Rev. 13, 519-557 (1991).

Schauer, F. On the Supposed Jury-Dependence of Evidence Law. U. Pa. L. Rev. 155, 165-202 (2006).

Scurich, N. The Dynamics of Reasonable Doubt. (Ph.D. dissertation, University of Southern California, 2012). 

Simon, D. A Third View of the Black Box: Cognitive Coherence in Legal Decision Making. Univ. Chi. L.Rev. 71, 511-586 (2004).

Simon, D., Pham, L.B., Le, Q.A. & Holyoak, K.J. The Emergence of Coherence over the Course of Decisionmaking. J. Experimental Psych. 27, 1250-1260 (2001).


Check out wild & crazy "coherence based reasoning"! Are rules of evidence "impossible"?, part 2 (another report from Law & Cognition seminar)

This is part 2 in a 3-part series, the basic upshot of which is that “rules of evidence” are impossible.

A recap. Last time I outlined a conception of “the rules of evidence” I called the “Bayesian Cognitive Correction Model” or BCCM.  BCCM envisions judges using the rules to “cognitively fine-tune” trial proofs in the interest of simulating/stimulating jury fact-finding more consistent with a proper Bayesian assessment of all the evidence in a case. 

Cognitive dynamics like hindsight bias and identity-protective cognition can be conceptualized as inducing the factfinder to over- or undervalue evidence relative to its “true” weight—or likelihood ratio (LR).  Under Rule 403, judges should thus exclude an admittedly “relevant” item of proof (Rule 401: LR ≠ 1) when the tendency of that item of proof to induce jurors to over- or undervalue other items of proof (i.e., to assign them LRs that differ from 1 more than they actually do) impedes verdict accuracy more than constraining the factfinder to assign the item of proof in question no weight at all (LR = 1).

“Coherence based reasoning”—CBR—is one of the kinds of cognitive biases a judge would have to use the BCCM strategy to contain.  This part in the series describes CBR and the distinctive threat it poses to rational factfinding in adjudication.

Today's episode. CBR can be viewed as an information-processing dynamic rooted in aversion to residual uncertainty.

A factfinder, we can imagine, might initiate her assessment of the evidence in a reasonably unbiased fashion, assigning modestly probative pieces of evidence more or less the likelihood ratios they are due.

But should she encounter a piece of evidence that is much more consistent with one party’s position, the resulting confidence in that party’s case (a state that ought to be only provisional, in a Bayesian sense) will dispose her to assign the next piece of evidence a likelihood ratio supportive of the same inference—viz., that that party’s position is “true.”  As a result, she’ll be all the more confident in the merit of that party’s case—and thus all the more motivated to adjust the weight assigned the next piece of evidence to fit her “provisional” assessment, and so forth and so on  (Carlson & Russo 2001). 

Once she has completed her evaluation of trial proof, moreover, she will be motivated to revisit earlier-considered pieces of evidence, readjusting the weight she assigned them so that they now fit with what has emerged as the more strongly supported position (Simon, Pham, Le & Holyoak 2001; Holyoak & Simon 1999; Pennington & Hastie 1991). When she concludes, she will necessarily have formed an inflated assessment of the probability of the facts that support the party whose “strong” piece of evidence initiated this “likelihood ratio cascade.”

What does this matter?

Well, to start, in the law, the party who bears the “burden of proof” will often be entitled to win only if she establishes the facts essential to her position to a heightened degree of certainty like “beyond a reasonable doubt.”  One practical consequence of the overconfidence associated with CBR, then, will be to induce the factfinder to decide in favor of a party whose evidence, if evaluated in an unbiased fashion, would not have satisfied the relevant proof standard (Simon 2004).  Indeed, one really cool set of experiments (Scurich 2012) suggests that "coherence based reasoning" effects might actually reflect a dissonance-avoidance mechanism that manifests itself in factfinders reducing the standard of proof after exposure to highly probative items of proof! 

But even more disconcertingly, CBR makes the outcome sensitive to the order in which critical pieces of evidence are considered (Carlson, Meloy & Russo 2006). 

A piece of evidence that merits considerable weight might be assigned a likelihood ratio of 1 or < 1 if the factfinder considers it after having already assigned a low probability to the position it supports.  In that event, the evidence will do nothing to shake the factfinder’s confidence in the opposing position.

But had the factfinder considered that same piece of evidence “earlier”—before she had formed a confident estimation of the cumulative strength of the previously considered proof—she might well have given that piece of evidence the greater weight it was due. 

If that had happened, she would then have been motivated to assign subsequent pieces of proof likelihood ratios higher than they in fact merited. Likewise, to achieve a “coherent” view of the evidence as a whole, she would have been motivated to revisit and revise upward the weight assigned to earlier considered, equivocal items of proof.  The final result would thus have been a highly confident determination in exactly the opposite direction from the one she in fact reached.

This is not the way things should work if one is engaged in Bayesian information processing—or at least any normatively defensible understanding of Bayesian information processing geared to reaching an accurate result!

Indeed, this is the sort of spectacle that BCCM directs the judge to preempt by the judicious use of Rule 403 to exclude evidence the “prejudicial” effect of which “outweighs” its “probative value.”

But it turns out that using the rules of evidence to neutralize CBR in that way is IMPOSSIBLE!

Why? I’ll explain that in Part 3!

# # #

But right now I’d like to have some more, “extra-credit”/“optional” fun w/ CBR! It turns out it is possible & very enlightening to create a simulation to model the accuracy-annihilating effects I described above.

Actually, I’m just going to model a “tame” version of CBR—what Carlson & Russo call “biased predecisional processing.” Basically, it’s the “rolling confirmation bias” of CBR without the “looping back” that occurs when the factfinder decides for good measure to reassess the more-or-less unbiased LRs she awarded to items of proof before she became confident enough to start distorting all the proof to fit one position. 

Imagine that a factfinder begins with the view that the “truth” is equally likely to reside in either party’s case—i.e., prior odds of 1:1. The case consists of eight “pieces” of evidence, four pro-prosecutor (likelihood ratio > 1) and four pro-defendant (likelihood ratio <1). 

The factfinder makes an unbiased assessment of the “first” piece of evidence she considers, and forms a revised assessment of the odds that reflects its “true” likelihood ratio.  As a result of CBR, however, her assessment of the likelihood ratio of the next piece of evidence—and every piece thereafter—will be biased by her resulting perception that one side’s case is in fact “stronger” than the other’s.

To operationalize this, we need to specify a “CBR factor” of some sort that reflects the disposition of the factfinder to adjust the likelihood ratios of successive pieces of proof up or down to match her evolving (and self-reinforcing!) perception of the strength disparity in the parties’ cases.

Imagine the factfinder misestimates the likelihood ratio of all pieces of evidence by a continuous amount that results in her over-valuing or under-valuing an item of proof by a factor of 2 at the point she becomes convinced that the odds in favor of one party’s position rather than the other’s position being “true” have reached 10:1.

What justifies selecting this particular “CBR factor”? Well, I suppose nothing, really, besides that it supplies a fairly tractable starting point for thinking critically about the practical upshot of CBR. 

But also, it’s cool to use this function b/c it reflects a “weight of the evidence” metric developed by Turing and Good to help them break the Enigma code! 

For Turing and Good, a piece of evidence with a likelihood ratio of 10 was judged to have a weight of “1 ban.” They referred to a piece of proof whose weight was 1/10 that big (a likelihood ratio of 10^0.1, or roughly 1.26) as a “deciban”—and were motivated to use that as the fundamental unit of evidentiary currency in their code-breaking system based on their seat-of-the-pants conjecture that a “deciban” was the smallest shift in the relative likelihoods of two hypotheses that human beings could plausibly perceive (Good 1985). 
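The ban/deciban bookkeeping is just a base-10 logarithm of the likelihood ratio, which a few lines make concrete. This is only a sketch of the unit conversion; the function names are made up for illustration.

```python
import math

def weight_in_bans(likelihood_ratio):
    # Weight of evidence in bans: log base 10 of the likelihood ratio.
    return math.log10(likelihood_ratio)

def weight_in_decibans(likelihood_ratio):
    # One ban is ten decibans.
    return 10.0 * math.log10(likelihood_ratio)

def lr_from_decibans(decibans):
    # Inverse conversion: the likelihood ratio carrying a given weight.
    return 10.0 ** (decibans / 10.0)

print(weight_in_bans(10.0))       # an LR of 10 weighs 1 ban
print(weight_in_decibans(2.0))    # an LR of 2 weighs about 3 decibans
print(lr_from_decibans(1.0))      # 1 deciban is an LR of about 1.26
```

Because weights are logs, the weights of independent items add where their likelihood ratios multiply.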

So with this “CBR factor,” I am effectively imputing to the factfinder a disposition to “add to”  (or subtract from) an item of proof one “deciban”—the smallest humanly discernible “evidentiary weight,” in Turing and Good’s opinion—for every 1-unit increase (1:1 to 2:1; 2:1 to 3:1, etc.) or decrease (1:1 to 1:2; 1:3 to 1:4) in the “odds” of that party’s position being true.

And this figure illustrates how this distorting potential can be affected by CBR generally:

In the “unbiased” table, “prior” reflects the factfinder’s current estimate of the probability of the “prosecutor’s” position being true, and “post odds” the revised estimate based on the weight of the current “item” of proof, which is assigned the likelihood ratio indicated in the “LR” column.  The “post %” column transforms the revised estimate of the probability of “guilt” into a percentage. 

I’ve selected an equal number of pro-prosecution (LR >1) and pro-defense (LR<1) items of proof, and arranged them so they are perfectly offsetting—resulting in a final estimate of guilt of 1:1 or 50%.

In the “coherence based reasoning” table, “tLR” is the “true likelihood ratio” and “pLR” the perceived likelihood ratio assigned the current item of proof. The latter is derived by applying the CBR factor to the former.  When the odds are 1:1, CBR is 1, resulting in no adjustment of the weight of the evidence. But as soon as the odds shift in one party’s favor, the CBR factor biases the assessment of the next item of proof accordingly.

As can be seen, the impact of CBR in this case is to push the factfinder to an inflated estimate of the odds of the prosecution’s position being true, which the factfinder puts at 29:1 or 97% by the “end” of the case.
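For readers who want to rebuild tables of this kind, here is a rough sketch. It is not the spreadsheet behind the figure: the eight LRs are hypothetical (chosen only so that they offset perfectly), and the bias function is one assumed way of satisfying the description above, namely a multiplier of 2^log10(current odds), which equals 1 at 1:1 and 2 at 10:1. The numbers it prints will therefore differ from the 29:1 in the figure.

```python
import math

def cbr_factor(odds, strength=2.0):
    # Assumed CBR factor: no distortion at even odds, a factor of
    # `strength` once the odds reach 10:1 (and 1/strength at 1:10).
    return strength ** math.log10(odds)

def run_case(true_lrs, biased, prior_odds=1.0):
    # Returns one row per item: (tLR, pLR, posterior odds, posterior prob).
    odds = prior_odds
    rows = []
    for t_lr in true_lrs:
        p_lr = t_lr * cbr_factor(odds) if biased else t_lr
        odds *= p_lr
        rows.append((t_lr, p_lr, odds, odds / (1.0 + odds)))
    return rows

# Hypothetical case: four pro-prosecution and four pro-defense items
# whose true LRs multiply to exactly 1 (a dead heat).
CASE = [2.0, 0.5, 2.0, 10.0, 0.5, 2.0, 0.5, 0.1]

for label, biased in (("unbiased", False), ("coherence based reasoning", True)):
    print(label)
    for t_lr, p_lr, odds, prob in run_case(CASE, biased):
        print(f"  tLR={t_lr:<5} pLR={p_lr:8.2f}  post odds={odds:10.2f}  post %={100 * prob:5.1f}")
```

The first item is always assessed without distortion because the factor is 1 at the 1:1 prior; every later item inherits whatever lean the earlier ones produced.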

But things could have been otherwise. Consider:

I’ve now swapped the “order” of proof items “4” and “8,” respectively.  That doesn't make any difference, of course, if one is "processing" the evidence the way a Bayesian would; but it does if one is CBRing.

The reason is that the factfinder now “encounters” the defendant’s strongest item of proof -- LR = 0.1—earlier than the prosecution’s strongest—LR = 10.0.

Indeed, it was precisely because the factfinder encountered the prosecutor’s best item of proof “early” in the previous case that she was launched into a self-reinforcing spiral of overvaluation that made her convinced that a dead-heat case was a runaway winner for the prosecutor.

The effect when the proof is reordered this way is exactly the opposite: a devaluation cascade that convinces the factfinder that the odds in favor of the prosecutor’s case are infinitesimally small!
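The order effect is easy to reproduce in a toy version. Same caveats as before: the LRs are hypothetical, and the bias function (true LR times 2^log10(current odds)) is an assumed form, so the magnitudes will not match the figures.

```python
import math

def cbr_factor(odds, strength=2.0):
    # Assumed CBR factor: 1 at 1:1 odds, `strength` at 10:1.
    return strength ** math.log10(odds)

def final_odds(true_lrs, biased):
    odds = 1.0  # prior odds of 1:1
    for t_lr in true_lrs:
        odds *= t_lr * cbr_factor(odds) if biased else t_lr
    return odds

def swap(items, i, j):
    # Copy of `items` with positions i and j (0-based) exchanged.
    out = list(items)
    out[i], out[j] = out[j], out[i]
    return out

# Hypothetical dead-heat case; item 4 is the prosecution's strongest
# (LR = 10) and item 8 the defense's strongest (LR = 0.1).
ORIGINAL = [2.0, 0.5, 2.0, 10.0, 0.5, 2.0, 0.5, 0.1]
REORDERED = swap(ORIGINAL, 3, 7)

for name, case in (("original order", ORIGINAL), ("items 4 and 8 swapped", REORDERED)):
    print(name)
    print("  Bayesian odds:", final_odds(case, biased=False))
    print("  CBR odds:     ", final_odds(case, biased=True))
```

Multiplication commutes, so the Bayesian odds come out at 1:1 either way. The CBR odds do not: in this toy case they end up heavily pro-prosecution in the original order and on the defense side once the two items are swapped.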

These illustrations are static, and based on “pieces” of evidence with stipulated LRs “considered” in a specified order (one that could reflect the happenstance of when particular pieces register in the mind of the factfinder, or are featured in post-trial deliberations, as well as when they are “introduced” into evidence at trial—who the hell knows!).

But we can construct a simulation that randomizes those values in order to get a better feel for the potentially chaotic effect that CBR injects into evidence assessments. 

The simulation constructs trial proofs for 100 criminal cases, each consisting of eight pieces of evidence. Half of the 800 pieces of evidence reflect LRs drawn randomly from a uniform distribution between 0.05 and 0.95; these are “pro-defense” pieces of evidence. Half reflect LRs drawn randomly from a uniform distribution between 1.05 and 20. They are “pro-prosecution” pieces.

We can then compare the “true” strength of the evidence in the 100 cases —the probability of guilt determined by Bayesian weighting of each one’s eight pieces of evidence—to the “biased” assessment generated when the likelihood ratios for each piece of evidence are adjusted in a manner consistent with CBR.
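A bare-bones version of that simulation might look like the following. Several things here are my assumptions rather than details given above: each case gets exactly four pro-defense and four pro-prosecution items in shuffled order, the bias takes the form true LR times 2^log10(current odds), a “close” case means a final probability strictly between 0.05 and 0.95, and the seed is arbitrary.

```python
import math
import random

def cbr_factor(odds, strength=2.0):
    # Assumed CBR factor: 1 at 1:1 odds, `strength` at 10:1.
    return strength ** math.log10(odds)

def final_probability(true_lrs, biased):
    odds = 1.0  # prior odds of 1:1
    for t_lr in true_lrs:
        odds *= (t_lr * cbr_factor(odds)) if biased else t_lr
    return odds / (1.0 + odds)

def make_case(rng, n_items=8):
    # Half the items favor the defense, half the prosecution,
    # drawn from the uniform ranges described in the text.
    pro_defense = [rng.uniform(0.05, 0.95) for _ in range(n_items // 2)]
    pro_prosecution = [rng.uniform(1.05, 20.0) for _ in range(n_items // 2)]
    items = pro_defense + pro_prosecution
    rng.shuffle(items)  # order of consideration is random
    return items

def n_close(probabilities, low=0.05, high=0.95):
    return sum(low < p < high for p in probabilities)

rng = random.Random(2015)
cases = [make_case(rng) for _ in range(100)]
true_p = [final_probability(c, biased=False) for c in cases]
cbr_p = [final_probability(c, biased=True) for c in cases]

print("close cases under Bayesian updating:", n_close(true_p))
print("close cases under CBR:              ", n_close(cbr_p))
```

Changing the seed, the `strength` parameter, or the split between pro-defense and pro-prosecution items is a quick way to see how sensitive the picture is to these choices.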

This figure compares the relative distribution of outcomes in the 100 cases:


As one would expect, a factfinder whose evaluation is influenced by CBR will encounter many fewer “close” cases than will one that engages in unbiased Bayesian updating.

This tendency to form overconfident judgments will, in turn, affect the accuracy of case outcomes.  Let’s assume, consistent with the “beyond a reasonable doubt” standard, that the prosecution is entitled to prevail only when the probability of its case being “true” is ≥ 0.95.  In that case, we are likely to see this sort of divergence between outcomes informed by rational information processing and outcomes informed by CBR:


The overall “error rate” is “only” about 0.16.  But there are 7x as many incorrect convictions as incorrect acquittals.  The "false conviction" rate is 0.21, whereas the "false acquittal" rate is 0.04....

The reason for the asymmetry between false convictions and false acquittals is pretty straightforward. In the CBR-influenced cases, there are a substantial number of “close” cases that the factfinder concluded “strongly” supported one side or the other. Which side—prosecution or defendant—got the benefit of this overconfidence is roughly equally divided.  However, a defendant is no less entitled to win when the factfinder assesses the strength of the evidence to be 0.5 or 0.6 than when the factfinder assesses the strength of the evidence as 0.05 or 0.06.  Accordingly, in all the genuinely “close” cases in which CBR induced the factfinder to form an overstated sense of confidence in the weakness of the prosecution’s case, the resulting judgment of “acquittal” was still the correct one.  But by the same token, the result was incorrect in every close case in which CBR induced the factfinder to form an exaggerated sense of confidence in the strength of the prosecution’s case.  The proportion of cases, in sum, in which CBR can generate a “wrong” answer is much higher in ones that defendants deserve to win than in ones in which the prosecution does.
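The asymmetry falls straight out of how outcomes get scored against the 0.95 threshold. A small sketch (the function names and the example probabilities are invented for illustration):

```python
THRESHOLD = 0.95  # "beyond a reasonable doubt", as operationalized above

def verdict(probability_of_guilt):
    return "convict" if probability_of_guilt >= THRESHOLD else "acquit"

def classify(true_probability, perceived_probability):
    # Compare the verdict a rational factfinder would reach with the
    # verdict the CBR-influenced factfinder actually reaches.
    deserved = verdict(true_probability)
    reached = verdict(perceived_probability)
    if reached == deserved:
        return "accurate"
    return "false conviction" if reached == "convict" else "false acquittal"

# A genuinely close case pushed to overconfidence in either direction:
print(classify(0.55, 0.99))  # overconfident for the prosecution
print(classify(0.55, 0.02))  # overconfident for the defense
# A genuinely strong prosecution case that CBR drags below the threshold:
print(classify(0.99, 0.40))
```

In a close case, overconfidence for the defense still yields the verdict the defendant deserved, while overconfidence for the prosecution yields a false conviction. A false acquittal requires CBR to drag a truly strong case below the threshold.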

This feature of the model is an artifact of the strong “Type 1” error bias of the “beyond a reasonable doubt” standard.  The “preponderance of the evidence” standard, in contrast, is theoretically neutral between “Type 1” and “Type 2” errors.  Accordingly, were we to treat the simulated cases as “civil” rather than “criminal” ones, the false “liability” outcomes and false “no liability” ones would be closer to the overall error rate of 16%.

Okay, I did this simulation once for 100 cases.  But let’s do it 1,000 times for 100 cases—so that we have a full-blown Monte Carlo simulation of the resplendent CBR at work!

These are the kernel distributions for the “accurate outcome,” “false acquittal,” and “false conviction” rates over 1000 trials of 100 cases each:
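A Monte Carlo run of this kind can be sketched in a few dozen lines. The usual caveats apply: the bias function (true LR times 2^log10(current odds)), the four-and-four composition of each case, the seed, and the choice to express every rate as a share of all 100 cases in a trial are assumptions of this sketch, so its numbers need not match the ones reported above.

```python
import math
import random
import statistics

def cbr_factor(odds, strength=2.0):
    # Assumed CBR factor: 1 at 1:1 odds, `strength` at 10:1.
    return strength ** math.log10(odds)

def both_probabilities(true_lrs):
    # Track the Bayesian and the CBR factfinder in a single pass.
    true_odds = cbr_odds = 1.0
    for t_lr in true_lrs:
        true_odds *= t_lr
        cbr_odds *= t_lr * cbr_factor(cbr_odds)
    return true_odds / (1.0 + true_odds), cbr_odds / (1.0 + cbr_odds)

def make_case(rng):
    items = [rng.uniform(0.05, 0.95) for _ in range(4)]    # pro-defense
    items += [rng.uniform(1.05, 20.0) for _ in range(4)]   # pro-prosecution
    rng.shuffle(items)
    return items

def trial_rates(rng, n_cases=100, threshold=0.95):
    counts = {"accurate": 0, "false conviction": 0, "false acquittal": 0}
    for _ in range(n_cases):
        true_p, cbr_p = both_probabilities(make_case(rng))
        should_convict = true_p >= threshold
        does_convict = cbr_p >= threshold
        if should_convict == does_convict:
            counts["accurate"] += 1
        elif does_convict:
            counts["false conviction"] += 1
        else:
            counts["false acquittal"] += 1
    return {key: n / n_cases for key, n in counts.items()}

rng = random.Random(0)
trials = [trial_rates(rng) for _ in range(1000)]

for key in ("accurate", "false conviction", "false acquittal"):
    values = [t[key] for t in trials]
    print(f"{key:17s} mean={statistics.mean(values):.3f}  sd={statistics.stdev(values):.3f}")
```

Feeding the three lists of per-trial rates to any density plotter yields curves of the kind described here.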

Okay—see you later!


Carlson, K.A., Meloy, M.G. & Russo, J.E. Leader‐driven primacy: using attribute order to affect consumer choice. Journal of Consumer Research 32, 513-518 (2006).

Carlson, K.A. & Russo, J.E. Biased interpretation of evidence by mock jurors. Journal of Experimental Psychology: Applied 7, 91-103 (2001).

Good, I.J. Weight of Evidence: A Brief Survey. In Bayesian Statistics 2: Proceedings of the Second Valencia International Meeting (J.M. Bernardo, et al. eds., 1985).

Holyoak, K.J. & Simon, D. Bidirectional Reasoning in Decision Making by Constraint Satisfaction. J. Experimental Psych. 128, 3-31 (1999).

Kahan, D.M. Laws of cognition and the cognition of law. Cognition 135, 56-60 (2015). 

Pennington, N. & Hastie, R. A Cognitive Theory of Juror Decision Making: The Story Model. Cardozo L. Rev. 13, 519-557 (1991).

Simon, D. A Third View of the Black Box: Cognitive Coherence in Legal Decision Making. Univ. Chi. L.Rev. 71, 511-586 (2004).

Scurich, N. The Dynamics of Reasonable Doubt. (Ph.D. dissertation, University of Southern California, 2012). 

Simon, D., Pham, L.B., Le, Q.A. & Holyoak, K.J. The Emergence of Coherence over the Course of Decisionmaking. J. Experimental Psych. 27, 1250-1260 (2001).




Report from "Law & Cognition" class: Are “rules of evidence impossible”? Part 1 

Well, I didn't do a good job of sharing the to & fro of this semester's Law & Cognition seminar w/ the 14 billion of you who signed up to take the course on-line. I'm happy to refund your enrollment fees--I actually parlayed them into a sum 10^3 x as large by betting incredulous behavioral economists that P(H|HHH) < P(H) when sampling from finite sequences w/o replacement-- but stay tuned & I'll try to fill you in over time...

If you’re a Bayesian, you’ll easily get how the Federal Rules of Evidence work. 

But if you accept that “coherence based reasoning” characterizes juries’ assessments of facts (Simon, Pham, Le & Holyoak 2001; Carlson & Russo 2001), you’ll likely conclude that administering the Rules of Evidence is impossible.

Or so it seems to me.  I’ll explain but it will take some time—about 3 posts’ worth.

The "Rules of Evidence Impossibility Proof"--Paaaaaaart 1!

There are really only two major rules of evidence. There are a whole bunch of others but they are just variations on a theme.

The first is Rule 401, which states that evidence is “relevant” (and hence presumptively admissible under Rule 402) if it “has any tendency to make a fact  [of consequence to the litigation] more or less probable” in the assessment of a reasonable factfinder.

As Richard Lempert observed (1977) in his classic paper Modeling Relevance, Rule 401 bears a natural Bayesian interpretation.

The “likelihood ratio” rendering of Bayes’s Theorem—Posterior odds = Prior odds x Likelihood Ratio—says that one should update one’s existing or “prior” assessment of the probability of some hypothesis (expressed in odds) by a factor that reflects how much more consistent the new information is with that hypothesis than with some rival hypothesis.  If this factor—the likelihood ratio—is greater than one, the probability of the hypothesis increases; if it is less than one, it decreases.

Accordingly, by defining as “relevant” any evidence that gives us reason to treat a “fact of consequence” as “more or less probable,” Rule 401 indicates that evidence should be treated as relevant (and thus presumptively admissible) so long as it has a likelihood ratio different from 1—the factor by which one should revise one’s prior odds when new evidence is equally consistent with the hypothesis and with its negation.
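In code, the odds form of Bayes’s Theorem and the Bayesian reading of Rule 401 come to very little. The following is just a sketch with illustrative function names; the tolerance used to decide whether an LR “differs from 1” is an arbitrary choice of mine.

```python
import math

def update_odds(prior_odds, likelihood_ratio):
    # Posterior odds = prior odds x likelihood ratio.
    return prior_odds * likelihood_ratio

def odds_to_probability(odds):
    # Odds of a:1 in favor correspond to a probability of a / (1 + a).
    return odds / (1.0 + odds)

def is_relevant(likelihood_ratio):
    # Rule 401 on the Bayesian reading: relevant iff the LR differs from 1.
    return not math.isclose(likelihood_ratio, 1.0, rel_tol=0.0, abs_tol=1e-12)

prior = 1.0                            # 1:1 odds, i.e., 50%
posterior = update_odds(prior, 4.0)    # evidence 4x more consistent with the hypothesis
print(posterior, odds_to_probability(posterior))
print(is_relevant(4.0), is_relevant(0.5), is_relevant(1.0))
```

An LR of 4 and an LR of 0.25 are both “relevant” in this sense; they simply move the odds in opposite directions.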


Second is Rule 403, which states that “relevant evidence” should be excluded if its “probative value is substantially outweighed by . . . unfair prejudice.”  Evidence is understood to be “unfairly prejudicial” when (the Advisory Committee Notes tell us) it has a “tendency to suggest decision on an improper basis.” 

There’s a natural Bayesian rendering of this concept, too: because the proper basis for decision reflects the updating of one’s priors by a factor equal to the product of the likelihood ratios associated with all the (independent) items of proof, evidence is prejudicial when it induces the factfinder to weight items of proof in a manner inconsistent with their true likelihood ratios.

An example would be evidence that excites a conscious intention—born perhaps of animus, or alternatively of sympathy—to reach a particular result regardless of the Bayesian import of the proof in the case.

More interestingly, a piece of evidence might be “unfairly prejudicial” if it triggers some unconscious bias that skews the assignment of the likelihood ratio to that or another piece of evidence (Gold 1982).

E.g., it is sometimes said (I think without much basis) that jurors “overvalue” evidence of character traits—that is, that they assign to a party’s disposition a likelihood ratio, or degree of weight, incommensurate with what it is actually due when assessing the probability that the party acted in a manner that reflected such a disposition on a particular occasion (see Fed. R. Evid. 404).

Or the “unfairly prejudicial effect” might consist in the tendency of evidence to excite cognitive dynamics that bias the weight assigned to other pieces of evidence (or all of it). Evidence that an accident occurred, e.g., might trigger “hindsight bias,” causing the factfinder to assign more weight than is warranted to evidence that bears on how readily that accident could have been foreseen before its occurrence (Kamin & Rachlinski 1995).

By the same token, evidence that excites “identity-protective cognition” might unconsciously motivate a factfinder to selectively credit or dismiss (i.e., opportunistically adjust the likelihood ratio of) all the evidence in the case in a manner geared to reaching an outcome that affirms rather than denigrates the factfinder’s cultural identity (Kahan 2015).
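The Bayesian notion of prejudice sketched in the preceding paragraphs can be made concrete with a toy calculation. Everything in it is hypothetical: the three items of proof, their "true" likelihood ratios, and the inflated value the factfinder is imagined to assign to the third one.

```python
import math


def posterior_odds(prior_odds: float, likelihood_ratios: list[float]) -> float:
    """Update prior odds by the product of the likelihood ratios of independent items of proof."""
    return prior_odds * math.prod(likelihood_ratios)


prior_odds = 1.0  # assumed even odds before any proof is heard

# Hypothetical likelihood ratios the three items of proof actually warrant.
true_lrs = [2.0, 0.5, 3.0]

# What a biased factfinder assigns: the third item is over-valued threefold.
perceived_lrs = [2.0, 0.5, 9.0]

proper = posterior_odds(prior_odds, true_lrs)
biased = posterior_odds(prior_odds, perceived_lrs)

print(f"proper posterior odds: {proper:.1f}")
print(f"biased posterior odds: {biased:.1f}")
print(f"distortion factor: {biased / proper:.1f}")
```

The same arithmetic covers the "spillover" forms of prejudice (hindsight bias, identity-protective cognition): there the distortion simply shows up in the perceived likelihood ratios of the other items rather than, or in addition to, the item that triggered the bias.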

Rule 403 directs the judge to weigh probity and prejudice.

Again, there’s a Bayesian rendering. Admitting a prejudicial item of proof induces the factfinder to assign an incorrect likelihood ratio to that item or to others; excluding it constrains the factfinder to assign that item a likelihood ratio of 1. A court should exclude a “relevant” item of proof as “unfairly prejudicial” when the marginal distortion of accuracy associated with the former is bigger than the marginal distortion of accuracy associated with the latter (Kahan 2010).

[Image caption: what it looks like to do Bayesian analysis of evidence rules *after* emerging from a night of partying at Studio 54 circa 1977]

If you work this out, you’ll see (perhaps counterintuitively, perhaps not!) that courts should be much more reluctant to exclude evidence on Rule 403 grounds in otherwise close cases. As cases become progressively closer, the risk of error associated with under-valuing (by failing to consider) relevant evidence increases faster than the risk of error associated with over-valuing that or other pieces of evidence: from the point of view of deciding a case, being “overconfident” is harmless so long as one gets the right result. Likewise, the risk that admitting "prejudicial" evidence will result in error increases more rapidly as the remaining proof becomes weaker: that's the situation in which a factfinder is most likely to decide for a party she would not have favored but for her biased over-valuing of that item of proof or others (Kahan 2010).
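Here is a toy version of that comparison. It is a sketch under loud assumptions and not the model in Kahan (2010): the verdict goes to the proponent whenever the posterior odds exceed 1 (a stand-in for a preponderance standard), the contested item truly warrants a likelihood ratio of 3, and the factfinder, if allowed to see it, will treat it as worth 9.

```python
def finds_for_proponent(odds: float, threshold: float = 1.0) -> bool:
    """Assumed decision rule: find for the proponent when posterior odds exceed the threshold."""
    return odds > threshold


def compare_rulings(other_proof_odds: float, true_lr: float, perceived_lr: float) -> dict:
    """Check whether admitting or excluding the item produces the wrong verdict."""
    correct = finds_for_proponent(other_proof_odds * true_lr)
    if_admitted = finds_for_proponent(other_proof_odds * perceived_lr)
    if_excluded = finds_for_proponent(other_proof_odds * 1.0)  # exclusion forces LR = 1
    return {
        "admitting_causes_error": if_admitted != correct,
        "excluding_causes_error": if_excluded != correct,
    }


TRUE_LR = 3.0       # hypothetical weight the item actually deserves
PERCEIVED_LR = 9.0  # hypothetical over-valued weight the factfinder assigns

# Close case: the rest of the proof leaves the odds just short of even.
close_case = compare_rulings(0.9, TRUE_LR, PERCEIVED_LR)

# Weak case: the rest of the proof leaves the proponent far behind.
weak_case = compare_rulings(0.2, TRUE_LR, PERCEIVED_LR)

print("close case:", close_case)
print("weak case:", weak_case)
```

In the close case the over-valuation is harmless (the factfinder is overconfident but right), while exclusion flips the outcome. In the weak case it is the other way around: only the inflated likelihood ratio gets the proponent over the line, so admitting the item is what produces the error.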

For an alternative analysis, consider Friedman (2003). I think he's wrong but for sure maybe I am! You tell me!

The point is how cool it is, and how much structure & discipline it adds to the analysis, to conceptualize Rules of Evidence as an instrument for closing the gap between what a normatively desirable Bayesian assessment of trial proof would yield and what a psychologically realistic account of human information processing tells us to expect (someday, of course, we'll replace human legal decisionmakers with AI evidence-rule robots! but we aren't quite there yet ...).

Let's call this approach to understanding/perfecting evidence law the "Bayesian Cognitive Correction Model" (BCCM).

But is BCCM itself psychologically realistic?  

Is it plausible to think a court can reliably “maximize” the accuracy of adjudication by this sort of cognitive fine-tuning of the trial proof?

Not if you think that coherence-based reasoning (CBR) is one of the reasoning deficiencies that a court needs to anticipate and offset by this strategy.

I’ll describe how CBR works in part 2 of this series—and then get to the “impossibility proof” in part 3!


Carlson, K.A. & Russo, J.E. Biased interpretation of evidence by mock jurors. Journal of Experimental Psychology: Applied 7, 91-103 (2001).

Friedman, R.D. Minimizing the Jury Over-valuation Concern. Mich. State L. Rev. 2003, 967-986 (2003).

Gold, V.J. Federal Rule of Evidence 403: Observations on the Nature of Unfairly Prejudicial Evidence. Wash. L. Rev. 58, 497 (1982).

Kahan, D.M. The Economics—Conventional, Behavioral, and Political—of "Subsequent Remedial Measures" Evidence. Columbia Law Rev 110, 1616-1653 (2010).

Kahan, D.M. Laws of cognition and the cognition of law. Cognition 135, 56-60 (2015).

Kamin, K.A. & Rachlinski, J.J. Ex Post ≠ Ex Ante - Determining Liability in Hindsight. Law Human Behav 19, 89-104 (1995).

Lempert, R.O. Modeling Relevance. Mich. L. Rev. 75, 1021-57 (1977).

Simon, D., Pham, L.B., Le, Q.A. & Holyoak, K.J. The Emergence of Coherence over the Course of Decisionmaking. J. Experimental Psych. 27, 1250-1260 (2001).


My remote post-it notes for my HLS African-American teachers