## Holy smokes! The "'hot-hand fallacy' fallacy"!

It's super-duper easy to demonstrate that individuals of low to moderate *Numeracy* -- an information-processing disposition that consists in the capacity & motivation to engage in quantitative reasoning -- are prone to all manner of biases -- like "denominator neglect," "confirmation bias," "covariance [non]detection," the "conjunction fallacy," etc.

It's harder, but not impossible, to show that individuals *high* in Numeracy are more prone to biased reasoning under particular conditions.

In one such study, Ellen Peters and her colleagues did an experiment in which subjects evaluated the attractiveness of proposed wagers.

For one group of subjects, the proposed wager involved outcomes of a positive sum & nothing, with respective probabilities adding to 1.

For another group, the proposed wager had a slightly *lower* positive expected value and proposed outcomes were a positive sum & a *negative* sum (again with respective probabilities adding to 1).

Because the second wager had a lower expected value, and added "loss aversion" to boot, one might have expected subjects to view the first as more attractive.

But in fact subjects low in Numeracy ranked the two comparable in attractiveness. Maybe they couldn't do the math to figure out the EVs.

But the real surprise was that among subjects *high* in Numeracy, the second wager -- the one that coupled a potential gain and a potential loss -- was rated as being substantially *more* attractive than the first -- the one that coupled a potential gain with a potential outcome of zero and a higher EV.

Go figure!

This result, which is hard to make sense of if we assume that people generally prefer to maximize their wealth, fit Peters et al.'s hypothesis that the cognitive proficiency associated with high Numeracy guides decisionmaking through its influence in calibrating affective perceptions.

Because those high in Numeracy literally *feel* the significance of quantitative information, the necessity of doing the computations necessary to evaluate the second wager, Peters et al. surmised, would generate a more intense experience of positive affect for them than would the process of evaluating the first wager, the positive expected value of which can be seen without doing any math at all. Lacking the same sort of emotional connection to quantitative information, the subjects low in Numeracy wouldn't perceive much difference between the two wagers.

Veeeeery interesting.

But can we find real-world examples of biases in quantitative information-processing distinctive to individuals high in Numeracy? Being able to is important not only to show that the Peters et al. result has "practical" significance but also to show that it is *valid*. Their account of what they expected to and did find hangs together, but as always there are alternative explanations for their results. We'd have more reason to credit the explanation they gave -- that high Numeracy can actually cause individuals to make *mistakes* in quantitative reasoning that low Numeracy ones wouldn't -- if we could observe that happening in the real world.

That way of thinking is an instance of the principle of *convergent validity*: because we can never be "certain" that the inference we are drawing from an empirical finding isn't an artifact of some peculiarity of the study design, the corroboration of that finding by an empirical study using *different methods* -- ones not subject to whatever potential defect diminished our confidence in the first -- will supply us with more reason to treat the first finding as valid.

Indeed, the confidence enhancement will be reciprocal: because there will always be some alternative explanation for the findings associated with the second method, too, the concordance of the results reached via those means with the results generated by whatever method informed the *first* study gives us more reason to credit the inference we are drawing from the second.

Okay, so now we have some realllllllly cool "real world" evidence of the distinctive vulnerability of high Numeracy types to a certain form of quantitative-reasoning bias.

It comes in a paper, the existence of which I was alerted to in the blog of stats legend (& former Freud expert) Andrew Gelman, that examines the probability that we'll observe the immediate recurrence of an outcome if we examine some sequence of binary outcomes generated by a process in which the outcomes are independent of one another -- e.g., of getting "heads" again after getting "heads" rather than "tails" in the previous flip of a fair coin.

We all know that if the events are independent, then obviously the probability of the previous event recurring is *exactly the same* as the probability that it would occur in the first place.

So if someone flipped a coin 100 times, & we then examined her meticulously recorded results, we'd discover the probability that she got "heads" after any particular flip of "heads" was 0.50, the same as it would be had she gotten "tails" in the previous flip.

Indeed, only *real dummies* don't get this! The idea that the probability of independent events is influenced by the occurrence of past events is one of the mistakes that those low to moderate Numeracy dolts make!

They (i.e., most people) think that if a string of "heads" comes up in a "fair" coin toss (we shouldn't care if the coin is fair; but that's another stats legend /former Freud expert Andrew Gelman blog post), then the probability we'll observe "heads" on the next toss goes down, and the probability that we'll observe "tails" goes up. Not!

Only a true *moron*, then, would think that if we looked at a past series of coin flips, the probability of a "heads" after a "heads" would be lower than the probability of a "heads" after a "tails"! Ha ha ha ha ha! I want to play that dope in poker! Ha ha ha!

Um ... not so fast, say Miller & Sanjurjo in their working paper, "Surprised by the Gambler’s and Hot Hand Fallacies? A Truth in the Law of Small Numbers."

The "assumption that in a sequence of coin tosses, the relative frequency of heads on those flips that immediately follow a streak of heads is expected to be equal to the relative frequency of heads on those flips that immediately follow a streak of tails" is "seemingly correct, but mistaken" (p. 19).

Yeah, right.

"We prove," M&S announce (p. 22),

> that in a finite sequence generated by repeated trials of a Bernoulli random variable the expected conditional relative frequency of successes, on those realizations that immediately follow a streak of successes, is strictly less than the fixed probability of success.

What? (I'm asking myself this at the same time you are asking me). "That can't possibly be the case"!

You'll feel like someone is scratching his fingers on a chalkboard as you do it, but read the first 6 pages of their paper (two or three times if you can't believe what you conclude the first time) & you'll be convinced this is true.
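If you'd rather brute-force it than squint at a proof, here is a minimal Python sketch that enumerates every possible sequence for a small number of trials and computes the expected relative frequency of successes right after a streak of successes. It's my own illustration, not M&S's code: the function names `rate_after_streak` and `expected_rate` are invented, and I'm assuming (as I read the paper) that sequences in which no trial follows a streak simply get dropped from the average.

```python
from itertools import product

def rate_after_streak(seq, k):
    """Share of successes among trials that immediately follow k successes in a row.
    Returns None when no trial in the sequence follows such a streak."""
    followers = [seq[i] for i in range(k, len(seq)) if all(seq[i - k:i])]
    return sum(followers) / len(followers) if followers else None

def expected_rate(n, p, k):
    """Probability-weighted average of rate_after_streak over all 2**n sequences
    of n Bernoulli(p) trials, conditional on the rate being defined."""
    total = weight = 0.0
    for seq in product((1, 0), repeat=n):
        rate = rate_after_streak(seq, k)
        if rate is None:
            continue  # nothing follows a streak here, so nothing to average
        prob = p ** sum(seq) * (1 - p) ** (n - sum(seq))
        total += prob * rate
        weight += prob
    return total / weight

print(expected_rate(4, 0.5, 1))  # about 0.405, i.e. 17/42, not 0.5
```

Try other values of `n`, `p` and `k`; for the handful I've reasoned through, the answer stays below `p`, just as the theorem says it must.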

Can I explain this really counterintuitive (for high Numeracy people, at least) result in conceptual terms? Not sure but I'll try!

If we flip a coin a "bunch" of times, we'll get roughly 0.50 "heads" & 0.50 "tails" (it will land on its edge 10^-6 of the time). But if we go back & count the "heads" that came up *only* after a flip of "heads," we'll come up w/ less than 0.5 x 1 "bunch."

If we look at any sequence in the "bunch," there will be some runs of "heads" in there. Consider "T*HH*THTTTHT*HHH*THT..." In this sequence of 16, there were (conveniently!) 8 "heads" & 8 "tails." But only 3 of the 8 "heads" (conveniently!) occurred after a previous flip of "heads"; 5 of the 8 occurred after a flip of "tails."

In this sample, then, the probability of getting "heads" *again* after getting "heads" on the previous flip was not 0.5. It was 3/8 or .375 or ... about 0.4!
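If you don't trust my counting (I wouldn't), here is a tiny Python sketch that tallies the same 16 flips. The variable names are mine, made up for the illustration.

```python
seq = "THHTHTTTHTHHHTHT"  # the 16 flips from the example above

heads = seq.count("H")
pairs = list(zip(seq, seq[1:]))  # (previous flip, current flip)

heads_after_heads = sum(1 for prev, cur in pairs if prev == "H" and cur == "H")
heads_after_tails = sum(1 for prev, cur in pairs if prev == "T" and cur == "H")
flips_after_heads = sum(1 for prev, _ in pairs if prev == "H")

print(heads, heads_after_heads, heads_after_tails)  # 8 3 5
print(heads_after_heads / flips_after_heads)        # 0.375
```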

You might wonder (because for sure you are searching for the flaw in the reasoning) whether this result was just a consequence of the sequence I happened to "report" for my (*N* = 16) "experiment."

You'd *not* be wrong to respond that way!

*But* if you think hard enough & start to play around with the general point -- that we are looking at the history of a *past* sequence of coin tosses -- you'll see (eventually!) that the probability of "heads" in the sample that occur *after* a previous "heads" (not to mention "several" heads in a row!) always is lower than the overall probability that any particular flip in that sequence was "heads."

That indeed it *has* to be.

What will you be seeing/feeling when you "get" this? Perhaps this:

1. Imagine I perform 100 coin tosses and observe 50 "heads" and 50 "tails." (No problem so far, right?)
2. If I now observe the recorded sequence and begin to count backwards from 50 every time I see a "heads," I'll always know how many "heads" remain in the sequence. (Still okay? Good.)
3. Necessarily, the number goes down by 1 every time I see a "heads" in the sequence.
4. And necessarily the number does *not* go down -- it stays the same -- every time I see a "tails" in the sequence.
5. From this we can deduce that the probability that the next flip in the sequence will be a "heads" is *always lower* if the previous flip was a "heads" than if it was a "tails."
6. Oh, btw, steps 2-5 still apply if you happened to get 51 "heads," or 48 or 55 or whatever, in your 100 tosses. Think about it!
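The countdown logic fits in a few lines of Python. This is only a sketch of the argument as I've stated it, and it assumes we already know how many "heads" the recorded sequence contains; the function names `p_heads_next` and `pooled_check` are invented for the illustration. The first function does the countdown; the second checks it by brute force on a sequence small enough to enumerate.

```python
from fractions import Fraction
from itertools import combinations

def p_heads_next(n, k, prev):
    """Chance that the flip right after a given flip is heads, in a recorded
    sequence of n flips known to contain exactly k heads."""
    remaining_heads = k - 1 if prev == "H" else k  # a heads uses one up
    return Fraction(remaining_heads, n - 1)

def pooled_check(n, k):
    """Brute force: over every arrangement of k heads in n flips, the share
    of heads (with a flip after them) that are followed by another heads."""
    hh = h_any = 0
    for heads_at in combinations(range(n), k):
        spots = set(heads_at)
        for i in range(n - 1):
            if i in spots:
                h_any += 1
                hh += (i + 1) in spots
    return Fraction(hh, h_any)

print(p_heads_next(100, 50, "H"))  # 49/99
print(p_heads_next(100, 50, "T"))  # 50/99
print(pooled_check(6, 3) == p_heads_next(6, 3, "H"))  # True
```

Either way you slice it, 49/99 is less than 50/99, which is all the list above is claiming.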

At this point you are saying, um, "now I'm not sure anymore"; go through that again. Okay...

But here is the *really cool & important thing*: M&S show that the *methodology* used in the literature examining the so-called "hot hand fallacy" doesn't reflect this logic.

Those studies have been understood to "debunk" the common perception that basketball players go through "hot streaks" during which it makes sense for others to expect them to achieve a level of shooting success that exceeds their usual or average level of success.

The researchers who purported to "debunk" the perception of "hot hands" report that if one examines game data, the probability of players making a shot after making a specified number of shots in a row is roughly their average level of success. Just as one would expect if shots are independent events-- so there's no "hot hand" in reality--only in our fallible, error-prone minds!

But this method of analyzing the data, M&S demonstrate, is wrong.

It overlooks that, "by conditioning on a streak of hits within a sequence of finite length, one creates a selection bias towards observing shots that are misses" (p. 19).

Yeah, that's what I was trying to say!
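To watch the selection bias show up, here is a rough simulation sketch in Python. Everything specific in it is my assumption rather than anything taken from the original studies or from M&S: each simulated shooter takes 100 shots, hits each one with a constant probability of 0.5 (so no "hot hand" at all), and a "streak" means 3 hits in a row. For each shooter we compute the hit rate on shots that follow a streak, then average those rates across shooters, which is the comparison the quoted passage says is biased toward misses.

```python
import random

def hit_rate_after_streak(shots, streak=3):
    """Share of hits among shots that immediately follow `streak` hits in a row.
    Returns None if no shot in the record follows such a streak."""
    followers = [shots[i] for i in range(streak, len(shots))
                 if all(shots[i - streak:i])]
    if not followers:
        return None
    return sum(followers) / len(followers)

def average_rate(players=5000, n_shots=100, p=0.5, streak=3, seed=1):
    """Average the per-player post-streak hit rate over many simulated
    players whose shots are independent with constant hit probability p."""
    rng = random.Random(seed)
    rates = []
    for _ in range(players):
        shots = [rng.random() < p for _ in range(n_shots)]
        rate = hit_rate_after_streak(shots, streak)
        if rate is not None:
            rates.append(rate)
    return sum(rates) / len(rates)

print(average_rate())  # should come out below 0.5 even though p is exactly 0.5
```

So a shooter with no "hot hand" whatsoever should look a bit *cold* after a streak under this method; a shooter who merely matches his or her average after a streak is doing better than chance would predict.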

**So if the data show, as the "hot hand fallacy" researchers found, that the probability a player would make his or her next shot after making a specified number in a row was the same as the probability that he or she would make a shot overall, their data, contrary to their conclusion, support the inference that players do indeed enjoy "hot streaks" longer than one would expect to observe by chance in a genuinely random process (& necessarily, somewhere along the line, "cold streaks" longer than one would expect by chance too)**.

I'm sold!

But for me, the amazing thing is not the cool math but the demonstration, w/ real world evidence, of high Numeracy people being distinctively prone to a bias in quantitative reasoning.

The evidence consists in the mistake made by the authors of the original "hot hand" studies and repeated by 100s or even 1000s (tens of thousands?) of decision science researchers who have long celebrated these classic studies and held them forward as a paradigmatic example of the fallibility of human perception.

As M&S point out, this was a mistake that we *would expect* only a high Numeracy person to make. A *low* Numeracy person is more prone to believe that independent events are not independent; that's what the "gambler's fallacy" is about.

Someone who *gets* why the gambler's fallacy is a fallacy will *feel* that the way in which "hot hand fallacy" researchers analyzed their data was obviously correct: because events that are independent occur with the same probability irrespective of past outcomes, it *seems* to make perfect sense to test the "hot hand" claim by examining whether players' shooting proficiency immediately after making a shot differs significantly from their proficiency immediately after missing.

But in fact, that's not the right test! Seriously, it's *not*! But it really really really seems like it is to people whose *feelings* of correctness have been shaped in accord with the basic logic of probability theory--i.e., to *high* Numeracy people! (I myself still can't really accept this even though I accept it!)

That's what Peters says happens when people become more Numerate: they develop affective perceptions attuned to sound inferences from quantitative information. Those affective perceptions help to alert high Numeracy people to the traps that low Numeracy ones are distinctively vulnerable to.

But they can create their own traps -- they come with their own affective "Sirens," luring the highly Numerate to certain nearly irresistible but *wrong* inferences....

Holy smokes!

M&S don't make a lot of this particular implication of their paper. That's okay-- they like probability theory, I like cognition!

But they definitely aren't oblivious to it.

On the contrary, they actually propose -- in a casual way in a footnote (p. 2, n.2) -- a *really cool* experiment that could be used to test the hypothesis that the "'hot hand fallacy' fallacy" is one that *high* Numeracy individuals are more vulnerable to than low ones:

> Similarly, it is easy to construct betting games that act as money pumps while defying intuition. For example, we can offer the following lottery at a $5 ticket price: a fair coin will be flipped 4 times. If the relative frequency of heads on flips that immediately follow a heads is greater than 0.5 then the ticket pays $10; if the relative frequency is less than 0.5 then the ticket pays $0; if the relative frequency is exactly equal to 0.5, or if no flip is immediately preceded by a heads, then a new sequence of 4 flips is generated. While, intuitively, it seems like the expected payout of this ticket is $0, it is actually $-0.71 (see Table 1).
>
> Curiously, this betting game may be more attractive to someone who believes in the independence of coin flips, rather than someone who holds the Gambler’s fallacy.

If someone did that study & got the result-- high Numeracy taking the bet more often than low--we'd have "convergent validation" of the inference I am drawing from M&S's paper, which I now am treating (for evidentiary purposes) as part of a *case study* in how those who know a lot can make distinctive -- spectacular, colossal even! -- errors.

But my whole point is that M&S's paper, by flushing this real-world mistake out of hiding, convergently validates the experimental work of Peters et al.

But for sure, more experiments should be done! Because empirical proof never "proves" anything; it only gives us more reason than we otherwise would have had for believing one thing rather than another to be true....

Two last points:

1. The gambler's fallacy *is* still a fallacy! Coin tosses are independent events; getting "heads" on one flip doesn't mean that one is "less likely" to get "heads" on the next.

The gambler's fallacy concerns the tendency of people mistakenly to treat independent events as non-independent when they make predictions about future events.

The "'hot hand fallacy' fallacy" -- let's call it -- involves expecting that the probability that binary outcomes will immediately recur is the same as the probability that they will occur on average in the sample. That's a *logical error* that reflects failing to detect a defect in the inference strategy reflected in the "hot-hand" studies.

Indeed, the same kind of defect in reasoning can explain *why* the gambler's fallacy is so prevalent -- or at least M&S surmise.

In the world, when we see independent events occurring, we observe or collect data in relatively short bursts -- let's call them “attention span” units (M&S present some data on self-reports of the longest series of coin tosses observed: the mean was a mere 6; strange, because I would have guessed every person flipped a coin at least 1000 times in a row at some point during his or her childhood!). If, in effect, we "sample" all the sequences recorded during “attention span” units, we'll observe that in fact the recurrence of an outcome immediately after it occurred was generally less than the probability it would occur on average.

That's correct.

But it's not correct to infer from such experience that, in any future sequence, the probability of that event recurring will be lower than the probability of it occurring in the first place. *That's* the gambler's fallacy.

The "'hot hand fallacy' fallacy" involves not noticing that correcting the logical error in the gambler's fallacy *does not imply* that if we examine a *past* sequence of coin tosses, we should expect to observe that "heads" came up just as often immediately after one or more "tails" as it did immediately after one or more "heads."

Ack! I find myself not believing this even though I know it's true!
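For the part of me that still doesn't believe it, here is a minimal Python sketch that puts the two claims side by side for sequences of 4 fair flips. The names (`followers_of_heads`, `pooled_share`, `averaged_share`) are mine, and treating "each recorded sequence counts equally" as the right way to model looking back at a past sequence is my reading of M&S, not a quotation from them. Calculation (a) pools every flip that follows a "heads" across all 16 possible sequences; calculation (b) works out the share inside each sequence first and then averages.

```python
from fractions import Fraction
from itertools import product

sequences = list(product("HT", repeat=4))  # all 16 equally likely records

def followers_of_heads(seq):
    """The flips that come immediately after a heads in one recorded sequence."""
    return [cur for prev, cur in zip(seq, seq[1:]) if prev == "H"]

# (a) Pool every flip that follows a heads, across all 16 sequences.
pooled = [f for seq in sequences for f in followers_of_heads(seq)]
pooled_share = Fraction(pooled.count("H"), len(pooled))

# (b) Compute the share within each sequence, then average those shares,
#     skipping the sequences in which no flip follows a heads.
per_seq = [Fraction(f.count("H"), len(f))
           for f in map(followers_of_heads, sequences) if f]
averaged_share = sum(per_seq) / len(per_seq)

print(pooled_share)    # 1/2
print(averaged_share)  # 17/42
```

The pooled 1/2 is why the gambler's fallacy is still a fallacy; the 17/42 is why expecting 1/2 *within* a recorded sequence is a mistake too.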

2. Is "motivated numeracy" an instance of a bias that is more prevalent among *high* Numeracy persons?

That depends!

"Motivated Numeracy" is the label that my collaborators-- who include Ellen Peters -- & I give to the tendency of individuals who are high in Numeracy to display a *higher* level of motivated reasoning in analyzing quantitative information. We present experimental evidence of this phenomenon in the form of a covariance-detection task in which high-Numeracy partisans were more likely to construe (fictional) gun control data in a manner consistent with their ideological predispositions than low-Numeracy partisans.

The reason was that the low-Numeracy subjects couldn't *reason well enough* with quantitative information to *recognize* when the data were and weren't consistent with their ideological predispositions. The high-Numeracy subjects could do that, and so never failed to credit predisposition-affirming evidence or to explain away predisposition-confounding evidence.

But whether that's a bias depends on what you think people are trying to *do* when they reason about societal risks. If they are trying to get the "right answer," then yes, Motivated Numeracy is a bias.

But if they are trying to form identity-congruent beliefs for the sake of conveying their membership in and loyalty to important affinity groups, the answer is *no*; Motivated Numeracy is an example of how one can do an even better job of that form of *rational information processing* if one is high in Numeracy.

I think the latter interpretation is right ... I guess ... hmmmm.... "Now I'm not sure anymore..."

But I *am* sure that the "hot hand" study authors, and all those who have celebrated their studies, *were* really trying to get the right answer.

They didn't, because their *high* Numeracy tempted them to error.

p.s. I'll bet $10^3 against this, but if someone proves the paper wrong, the example of high Numeracy subjects being led to error by an argument only *they* could be seduced by still holds!

Oy!

I revised this post to remove material that @Mw points out incorrectly implied the probability of recurrence of "heads" was 0.4 no matter how long the "finite sequence" of coin flips. That's *not* what the authors say. The authors' point is only that within an observed sample of sequences, the probability of "heads" recurring immediately after 1 or more "heads" is *less* than 0.5. As the authors state & as @Mw correctly emphasizes, how *much* less than 0.5 that probability is shrinks as the "finite sequence" is increased in length; "0.4" was offered by the authors as an illustration, reflecting the average probability of "heads" immediately recurring after one or more "heads" in sequences of 4 tosses.
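To see how the gap depends on the length of the sequence, here is a short Python sketch that does the exact calculation by brute force for a few short lengths. It's my illustration, not the authors' code: it looks only at "heads" immediately after a single "heads," counts every possible sequence of fair flips equally, and drops the sequences in which no flip follows a "heads." The function name `average_share` and the particular lengths are arbitrary choices of mine.

```python
from fractions import Fraction
from itertools import product

def average_share(n):
    """Average, over all sequences of n fair flips, of the within-sequence
    share of heads among flips that immediately follow a heads."""
    shares = []
    for seq in product("HT", repeat=n):
        followers = [cur for prev, cur in zip(seq, seq[1:]) if prev == "H"]
        if followers:  # skip sequences where no flip follows a heads
            shares.append(Fraction(followers.count("H"), len(followers)))
    return sum(shares) / len(shares)

for n in (4, 6, 8, 10, 12):
    print(n, round(float(average_share(n)), 3))  # n = 4 gives about 0.405
```

For these lengths the figure creeps up toward 0.5 as the sequences get longer but never gets there.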

Obviously, the fate of the "hot hand" thesis -- that basketball players or other athletes *do* display "streaks" of superior performance that exceed what we'd expect to observe by chance -- depends on exactly how much the analytic strategy used in the "hot hand fallacy" studies overstated the true chance-probability of observing a successful shot (or other unit of performance) following a previous string of successes. Figuring that out involves the sort of math that would give both GI Joe & Barbie severe migraines, in no small part b/c both the duration of the observed periods of performance and the number of immediately preceding "successes" necessary for a string of outcomes to count as a "hot streak" are judgment calls.

Conceivably, once the right chance-probability baseline is applied to the data, the results will still show that the perception of "hot hands" is an illusion. Even though I've followed this M&S pied piper duo this far, I'm still unsure who is right about how to "fix" the defect in the original "hot hand" studies.

But however that debate turns out, it will remain the case that those classic studies displayed the "'hot hand fallacy' fallacy" -- the assumption that if we examine a finite sequence of past coin tosses (or other events generated by a random binary process), we should expect the probability of flipping "heads" (or the like) immediately after a consecutive string of "heads" to be the same as the probability of flipping "heads" after a consecutive string of "tails" & the same as the probability of flipping "heads" *irrespective* of what the previous toss yielded.

Unless, of course, M&S (along with those whom they've now managed to persuade) are wrong!

More here-- including cool gadgets!

## Reader Comments (4)

>I would have guessed every person flipped a coin at least 1000 times in a row at some point during his or her childhood!

Perhaps you did, but I very much doubt I exceeded 100x, more like 10-20x. I'd guess I'm closer to average, but this is hardly a representative sample. Does max childhood coin flips correlate with risk attitudes?

Quick comment on the paper itself: I don't read it as saying that "if someone has been flipping coins all day long, when we examine the recorded sequence of outcomes we'll find that the probability of 'heads' recurring after an immediate flip of 'heads' is ... oh, about 0.4." The "0.4" is only for a sequence of four flips. The more times you do four flips, the more you'll converge on the average conditional probability of heads given a previous heads -- where each 4-flip is weighted equally -- being 0.4. The longer your sequence is, the closer you'll get to p, or 0.5. I just simulated 500 flips on Excel, and of the Hs, 48% were followed by another H. And if you look at Table 1, half of the Hs are followed by another H, half are followed by a T (consistent with gambler's fallacy being a fallacy). It's only when you look at the average of each group of four that you get the lower number. That all sound right?

Also, is this really numerate people being drawn to a bias, per se? Or is it just a little bit of knowledge being dangerous? In the Peters et al. study, as you describe it, the low numeracy subjects behaved more or less rationally, and (importantly) for the right reasons. The high numeracy subjects deviated from this because they got extra excitement from the math. (Let me know if I'm reading that incorrectly.) Here, low numeracy people aren't doing anything particularly right. They've just lucked their way into the right answer, as opposed to the high numeracy people who are only thinking through the problem part way. And yes, affect is involved, but it doesn't seem to be disrupting what would otherwise be a rational process, no? (This seems more like: Person 1 doesn't know any spelling rules, writes "weird," Person 2 proudly learns "i before e, except after c," writes "wierd" for years and has mind blown when Person 3 says it's "weird.") Or maybe the hypothesis is something simpler?

@Mw--

I agree!

The authors don't argue that P(heads|heads on previous toss) is 0.4 regardless of the length of the observed sequence; only that it is always less than P(heads), or 0.5. I have edited the post to make it clear that 0.4 was just the average P(H|H on previous toss) for sequences of 4 tosses.

Actually, I think you were commenting at the same time I was editing: a commenter on the Gelman site didn't recognize that the 0.4 was offered by the authors as an avg for sequences of 4, and it occurred to me that anyone reading my post *for sure* wouldn't have recognized that & for exactly the reasons you state (I hope I didn't mislead *him*)!

Agree too that the upshot of the "'hot hand fallacy' fallacy" is *not* that low to moderate Numeracy individuals are "less" prone to bias, either generally *or* in the very sort of task at issue here.

On the contrary, my point is that the "'hot hand fallacy' fallacy" is one that can be made only by those sufficiently Numerate to resist an even more common misfire-- the "gambler's fallacy."

The same would be true for Motivated Numeracy (if one considers that a bias): individuals of low & moderate Numeracy wouldn't be able to recognize covariance *selectively* enough to be as biased as the high Numeracy subjects in our experiment were.

ack^2!

The link to the M&S paper is broken. Should be

http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2627354