Law & Cognition 2016, Sessions 6 & 7 recap: To bias or not to debias--that is the question about deliberation
That was the question we took up in the last couple of sessions of Law & Cognition.
The answer, I’d say, is . . . who the hell knows!
The basis for this assessment is a pair of excellent studies, one of which seems to put deliberation in a really good light, and another that seems to put it in a really bad one.
Seems to is the key part of the assessment in both cases.
The first study was Sommers, S.R., On Racial Diversity and Group Decision Making: Identifying Multiple Effects of Racial Composition on Jury Deliberations, Journal of Personality and Social Psychology 90, 597-612 (2006). I identified this one several years ago as the “coolest debiasing study I’ve ever read,” and I haven’t read anything since that affects its ranking.
Sommers examines the effect of varying the racial composition of mock jury panels assigned to hear a case against an African-American who is alleged to have sexually assaulted a white victim. White jurors, he reports, formed more pro-defense views and also engaged in higher quality deliberations when they were on racially mixed panels as opposed to all-white ones.
But the key finding was that this effect had nothing to do with actual deliberations; instead it had to do with the anticipation of them.
White members of the mixed panels were more disposed to see the defendant as innocent even before deliberations began.
Once deliberations did start, moreover, the whites on the mixed panels were less likely to make erroneous statements and more likely to make correct ones independently of any contributions to the discussion made by the African-American jurors.
The prospect of having to give an account to a racially mixed panel, Sommers convincingly surmises, activated unconscious processes that accentuated the attention that whites on the mixed-race panels paid to the trial proof and thus improved the accuracy of their information processing.
It’s a really great example of how environmental cues can achieve a debiasing effect that a conscious instruction to “be objective” or “fair” or to “pay attention” etc. demonstrably cannot (indeed, such instructions, the readings for this week reminded us, often have a perverse effect).
I’m not sure, though, that the result tells us anything about whether and when deliberation in general can be expected to have a positive effect on information processing in legal settings.
Indeed, the second study we read, Schkade, D., Sunstein, C.R. & Kahneman, D., Deliberating about dollars: The severity shift, Columbia Law Rev. 100, 1139-1175 (2000), furnished us with reason to think that deliberation can be expected to exacerbate legal reasoning biases, at least in some circumstances.
SSK did a massive study in which 500 6-member panels deliberated on 15 separate civil cases presenting demands for punitive damages. After watching films of these cases, the subjects individually completed forms that solicited their rankings of the “level” of punishment that was appropriate on a 0-8 scale and their assessment of the amount of punitive damages that should be awarded. They then deliberated with their fellow mock jurors and made collective determinations on the same issues.
SSK found two interesting things.
First, in relation to the 0-8 “level of punishment” judgments, there was a group-polarization dynamic. Panels tended to reach punishment-level judgments that were less severe than those of their median members in cases that presented relatively less egregious behavior. In cases that presented relatively more egregious behavior, they tended to reach punishment-level judgments that were more severe than those of their median members.
Yet second, in all cases, there was a “severity shift” in the dollar amount of punitive damages awarded. That is, in both the less egregious and the more egregious cases, the jury panels tended to agree on damage awards larger than the one favored by their median members—and indeed, in many cases, larger than the biggest one favored by any individual jury member before deliberation.
This is just plain weird, right? I mean, the damages awards got bigger relative to what individual jurors favored on average even in the cases in which the panels’ deliberations produced a “punishment level” assessment that was less severe than that of the median member of the panel!
As SSK show, moreover, the resulting punitive awards displayed a massive amount of variability. SSK don't supply any graphic displays of the distributions (the biggest shortcoming of the paper, in my view), but they do supply enough information in tabular form to demonstrate that the distribution of awards was massively right skewed.
Indeed, SSK gravely rehearse just how severely the variability generated by the dynamics they uncovered would hamper the efforts of parties to predict the outcome of cases, something that generally is bad for the rational operation of law and for the decisionmaking of people who have to live with it.
But I have to be honest: I’m not 100% sure they really made the case on unpredictability.
They argued that it’s really difficult to pin down the likely outcome if one is drawing results randomly from a massively skewed distribution. But they didn’t show that someone who knows about the dynamics they uncovered would be unable to use that information to improve his or her predictions of likely case outcomes.
For sure, those dynamics involved some pretty whacky shit at the micro-level—in the formation of individual jury verdicts.
But the question is whether the resulting macro-level pattern of judgments admits of statistical explanation based on the available information.
That information consists of the “punishment level” ratings of the individual jurors and the 6-member panels; what was the relationship between those and the resulting punitive verdicts?
SSK don’t say anything about that!
Just for fun, I created a little simulation (here’s the Stata code) to see if it might at least be possible that something that looked as whacky as what SSK observed might still be amenable to a measure of statistical discipline.
In the simulation, I created 3000 jurors, each of whom, like SSK’s subjects, individually rated a “case” on a 0-8 “punishment level” scale.
I then put the jurors on 500 juries, whose members, like SSK’s subjects, evinced (by design, of course) a group-polarization effect in their collective “punishment level” judgments.
Then, to generate massively skewed punitive awards like SSK’s, I multiplied those jury-level “punishment level” judgments by a factor drawn randomly from a massively right-skewed distribution of values. The resulting array of punitive awards looked just as chaotically lopsided as SSK’s.
Nevertheless, when I regressed the damage awards on the jury verdicts I was able to explain 33% of the variance. Not bad!
I did even better, explaining 40% of the variance, when I regressed the log-transformed awards on the verdicts, a conventional statistical trick when one is dealing with right-skewed data.
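For readers who’d rather see the logic than chase down the Stata, here’s a minimal Python sketch of the same idea. The distributions and the polarization rule are illustrative stand-ins of my own choosing, not the actual parameters of my simulation, so the R² values it prints will differ from the 33%/40% figures above:

```python
import numpy as np

rng = np.random.default_rng(0)
n_juries, jury_size = 500, 6

# Individual 0-8 "punishment level" ratings for 3000 simulated jurors.
individual = rng.integers(0, 9, size=(n_juries, jury_size))

# Jury-level judgment with a crude built-in group-polarization effect:
# medians below the scale midpoint shift down, above it shift up.
medians = np.median(individual, axis=1)
jury_level = np.clip(medians + np.where(medians < 4, -1.0, 1.0), 0, 8)

# Punitive award: jury-level judgment times a massively right-skewed
# multiplier (lognormal), yielding a chaotically lopsided award array.
awards = jury_level * rng.lognormal(mean=10, sigma=1.5, size=n_juries)

def r_squared(y, x):
    """R^2 from an OLS regression of y on x (with intercept)."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return 1 - (y - X @ beta).var() / y.var()

r2_raw = r_squared(awards, jury_level)
# log1p handles any zero awards; the log transform is the usual
# trick for taming right-skewed outcome data.
r2_log = r_squared(np.log1p(awards), jury_level)
print(f"R^2 raw: {r2_raw:.2f}, R^2 log: {r2_log:.2f}")
```

The point is simply that micro-level chaos (a wildly skewed random multiplier) is perfectly compatible with a meaningful share of macro-level variance being explained by the jury-level punishment judgments.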
This result turned out to be very much in line with observational studies, which suggest that a simple model that regresses punitive awards on compensatory ones can explain over half the variance in punitive judgments (Eisenberg et al. 1997)!
Practically speaking, then, there’s potentially still a lot one can do to predict results even in a world as whacky as SSK’s. All a lawyer would have to be able to do to make such predictions is form a reasonable estimate of the punishment-level assessment jurors would make of a particular case, and then he or she would be able to give advice reflecting an analysis that accounts for 40% of the variance in the resulting punitive damage awards.
Making the punishment-level estimate, moreover, wouldn’t be that hard. SSK demonstrated that, unlike their damage-award judgments, the study subjects’ 0-8 punishment level assessments displayed a remarkable degree of coherence. People basically agreed, in other words, how egregious the behavior in the experiment's 15 cases was.
An experienced lawyer would thus likely be able to intuit “how bad” an average juror would think the behavior in such a case was. And if the lawyer were really on the ball, then he or she could fortify his or her judgment with the results of a mock-juror experiment that solicited 150 or so mock jurors’ assessments.
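To make that prediction step concrete, here’s a hypothetical back-of-the-envelope sketch. Everything in it is made up for illustration: the log-linear coefficients are not estimates from SSK or Eisenberg et al., and the mock-juror poll is invented. The logic is just: poll for a punishment-level estimate, plug it into a fitted log-award model, and exponentiate to get a point prediction and a rough range:

```python
import math
import statistics

# Hypothetical fitted model: log(award) = a + b * punishment_level.
# These coefficients are illustrative assumptions, not real estimates.
a, b = 8.0, 0.6
resid_sd = 1.2  # assumed residual SD of log(award)

# Step 1: estimate the punishment level from a mock-juror poll
# (invented 0-8 ratings standing in for the 150-juror poll above).
mock_ratings = [5, 6, 5, 7, 6, 5, 6, 4, 6, 5]
est_level = statistics.mean(mock_ratings)

# Step 2: point prediction and a rough 95% range on the award scale.
log_pred = a + b * est_level
point = math.exp(log_pred)
low = math.exp(log_pred - 1.96 * resid_sd)
high = math.exp(log_pred + 1.96 * resid_sd)
print(f"predicted award ~= {point:,.0f} (rough 95% range {low:,.0f}-{high:,.0f})")
```

Note how wide the range stays even with a decent model: explaining 40% of log-scale variance disciplines the prediction, but it doesn’t make a right-skewed world precise.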
I definitely can’t be sure that the data in the SSK experiment would be as well behaved as my simulated data were, of course.
But I think we can be sure that looking inside the kitchen door of individual juries’ deliberations is not actually the right way to figure out how predictable their judgments are. One has to take a nice statistical bite of the results and see how much variance one can digest!
But that said, SSK definitely is in the running for my “coolest biased-deliberation study I’ve ever read” award. . . .
Eisenberg, T., Goerdt, J., Ostrom, B., Rottman, D. & Wells, M.T., The predictability of punitive damages, Journal of Legal Studies 26, 623-661 (1997).