So I finished Nate Silver’s The Signal and the Noise a couple of days ago. I loved it!
In fact, it managed, very unexpectedly, to sneak in as a late entrant and catch what looked like the sure winner for “my personal favorite book of the year,” Sharon Bertsch McGrayne’s The Theory That Would Not Die: How Bayes' Rule Cracked the Enigma Code, Hunted Down Russian Submarines, and Emerged Triumphant from Two Centuries of Controversy. The two ended up in a dead heat, I’d say (super honorable mention to George Dyson’s Turing’s Cathedral & James Gleick’s The Information).
McGrayne is the better storyteller. Silver’s not bad in that regard, but the reason his book managed to score so high is that it contains some genuine elements of wisdom. Or at least I thought so!
Over the course of three posts (I’m guessing), I’m going to feature three of the nuggets of insight that I spotted in the book.
The first has to do with the important parallels between earthquakes and terrorist attacks.
As likely everyone—whether or not they’ve read the book—knows, Signal is about the simultaneous indispensability and fallibility of data-driven forecasting. The book essentially uses certain cases as prototypes for what forecasting can and can’t do, when it does and doesn’t work, and why.
Baseball, e.g., works well because there are so much data and the outcomes of interest are so straightforward.
Economic forecasting is absurdly bad, in large part because of the bad statistical hygiene of economists, who are theoretically dogmatic, enamored of post-hoc storytelling (the statistical term for this vice is “overfitting”), and uninterested in iterative calibration of their models.
Another prototype is forecasting of earthquake occurrence. Pretty much can’t be done: there aren’t enough events to form and test models; the underlying dynamics are likely chaotic—if not technically so, then practically, given the high number of interacting mechanisms; and measurement of those mechanisms is confined to exceedingly crude proxies.
But the kind of forecasting that can’t be done has to do with predicting the timing of earthquakes, particularly ones of significant magnitude.
What can be predicted, however, is the likely scale and frequency of earthquakes in particular locales.
The reason that can be done (and actually, one of the reasons that precise forecasting of timing probably can’t be) is that earthquake magnitude and frequency reflect a “power law” distribution.
In a power-law distribution, there will be a predictable incidence of extreme values: ones many more standard deviations from the mean than you’d ever expect to see in a normal, Gaussian or “bell-shaped” distribution. The classic illustration is height, which is normally distributed, vs. wealth, which follows a power law. You’ll never find even one person 20 SDs taller than average (10.5', or so), whereas people who are 20 SDs wealthier than average (net worth of, oh, $2 billion) or even 100 SDs ($10 billion, more or less), while not commonplace, definitely are out there. (For this reason, it is not very meaningful to talk about "standard deviations" in wealth.)
Well, if you have enough data to estimate the key parameters of a power law, you can form a reasonably precise estimate of the frequency of extreme values relative to more typical, more commonplace ones.
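The contrast between the two kinds of tails can be made concrete in a few lines of code. This is my own illustration, not anything from the book: the Pareto exponent of 1.16 below is the classic “80/20” wealth value, an assumption chosen purely for the sake of the example.

```python
import math

def normal_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def pareto_tail(x, x_min=1.0, alpha=1.16):
    """P(X > x) for a Pareto (power-law) variable with scale x_min.
    alpha = 1.16 is the classic '80/20' wealth exponent -- an
    illustrative assumption, not a figure from Silver's book."""
    return (x_min / x) ** alpha if x >= x_min else 1.0

# A 20-SD excursion is essentially impossible under the bell curve...
print(normal_tail(20))   # ~2.8e-89
# ...but a value 20x the scale parameter is routine under a power law.
print(pareto_tail(20))   # ~0.031
```

The bell-curve tail dies off like exp(−z²/2), so “20 SDs out” is astronomically improbable; the power-law tail dies off only polynomially, so comparable excursions happen all the time.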
One can do this for earthquakes, it turns out. The Gutenberg-Richter law characterizes the power-law distribution for earthquakes and makes it possible, based on the frequency of relatively “mild” earthquakes, to conclude that San Francisco can be expected to have a “major” one (> 6.75 on the Richter scale) once every 30 years, whereas New York will have a major one, on average, once every 12,000 years (Signal, p. 150).
This is useful information. It doesn’t tell you when you are likely to have a major earthquake. But it tells you that if you live in San Francisco you are an idiot not to plan for one, whereas if you live in New York, you’d likely be out of your mind to insist that every building be able to withstand the same impact.
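As a rough sketch of how this kind of extrapolation works (with made-up annual quake counts of my own devising, not real seismic data), one can fit the Gutenberg-Richter relation log10 N(≥M) = a − bM to the observed frequency of small quakes and read off the implied recurrence interval of a major one:

```python
import math

# Hypothetical annual counts of quakes at or above each magnitude
# for one region (illustrative numbers only, not Silver's data).
observed = {4.0: 18.7, 4.5: 5.9, 5.0: 1.87, 5.5: 0.59}

# Gutenberg-Richter: log10 N(>=M) = a - b*M.
# Fit a and b by least squares on the log counts.
ms = list(observed)
logs = [math.log10(observed[m]) for m in ms]
n = len(ms)
mbar = sum(ms) / n
lbar = sum(logs) / n
b = -sum((m - mbar) * (y - lbar) for m, y in zip(ms, logs)) / \
    sum((m - mbar) ** 2 for m in ms)
a = lbar + b * mbar

# Extrapolate: expected annual rate of a "major" (>= 6.75) quake,
# and hence its average recurrence interval.
rate = 10 ** (a - b * 6.75)
print(round(1 / rate))   # -> 30 years, for these made-up counts
```

The small quakes alone pin down the line, and the line then tells you how often to expect the big one, even though no big one appears in the data.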
Okay, so Silver goes through all of this. But later in the book he uses this information to say something pretty insightful (to me!) about terrorism.
Terrorist incidents, Silver observes, reflect a power law distribution, too (pp. 430-31). From the number of relatively “small” incidents in a given locale, one can estimate, with a reasonable degree of precision, the expected frequency of progressively “larger” ones.
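A minimal sketch of that logic (using simulated death tolls, not Clauset’s actual data) is to estimate the power-law exponent from the bulk of “small” incidents with the standard maximum-likelihood formula and then extrapolate out into the tail:

```python
import math
import random

random.seed(1)

# Simulate incident "death tolls" from a Pareto tail (true alpha = 2),
# standing in for a historical record dominated by small attacks.
x_min = 10.0
alpha_true = 2.0
tolls = [x_min * random.random() ** (-1 / (alpha_true - 1))
         for _ in range(5000)]

# Continuous power-law MLE for the exponent:
#   alpha_hat = 1 + n / sum(ln(x_i / x_min))
n = len(tolls)
alpha_hat = 1 + n / sum(math.log(x / x_min) for x in tolls)

# Extrapolate: what fraction of incidents should be 300x the minimum
# size -- i.e., a rare "large" attack far beyond most of the data?
tail = (300 * x_min / x_min) ** (-(alpha_hat - 1))
print(alpha_hat, tail)   # alpha_hat near 2; tail on the order of 0.3%
```

The point is that the exponent recovered from the many small events fixes the expected frequency of events orders of magnitude larger, which is exactly the move Silver describes.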
Could 9/11 have been “predicted”? As to when & where—probably not (Silver hedges on this; I think he is guilty of a bit of hindsight bias). But as to whether? For sure.
Applying a power-law model with data compiled by Aaron Clauset, Silver concludes that, as of Sept. 10, 2001, “a September 11-scale attack” could have been expected to occur on average “about once every eighty years in a NATO country” (p. 432), which (by my math) comes out to over a 10% chance in a given decade.
Updating the power-law estimate to incorporate the new information associated with the occurrence of 9/11 itself, Silver reports that we now have reason to expect a 9/11-scale attack in a NATO country once every 40 years. That’s close to a 25% chance every decade (p. 432).
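The arithmetic behind those decade figures is simple if (as I’m assuming here, by way of illustration) attacks of a given scale arrive as a Poisson process: a mean recurrence of R years implies a probability of 1 − exp(−10/R) of at least one such attack in any given decade.

```python
import math

def decade_prob(mean_recurrence_years, horizon_years=10):
    """Probability of at least one event within the horizon, treating
    occurrences as a Poisson process with the given mean recurrence."""
    return 1 - math.exp(-horizon_years / mean_recurrence_years)

# Pre-9/11 estimate: one 9/11-scale attack per ~80 years.
print(round(decade_prob(80), 3))   # 0.118 -- "over a 10% chance"
# Post-9/11 update: once per ~40 years.
print(round(decade_prob(40), 3))   # 0.221 -- "close to 25%"
```

These match the back-of-the-envelope figures in the text (10/80 and 10/40 give nearly the same answers for rates this small).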
He also now estimates that there is a “10 percent chance of an attack that would kill at least 10,000 people in a NATO country,” a “3 percent chance of an attack that would kill 100,000, and a 0.6 percent chance of one that would kill one million or more” in the next decade (p. 437).
The estimates, like any generated by Bayesian techniques, are provisional, and here they are appropriately qualified in light of the relative sparseness of the data and of various judgment calls about how to define the relevant class of events.
But the point is we know more with this information than we would without it. Just as the Gutenberg-Richter law supplies information that can help determine the appropriate level of preparedness for earthquakes, this Clauset-Silver terrorism power law supplies us with information we can use to engage in more rational planning for terrorism.
There will (predictably) be many, many times more small terrorist incidents than “big” ones, just as there will be many, many times more minor than major earthquakes.
But for quakes and terrorist attacks alike, the destructive energy of progressively “larger” events increases exponentially relative to their diminishing frequency.
We might expect, say, 10 magnitude 6 quakes for every magnitude 7, but the magnitude 7 quake (according to Silver, pp. 437-38) releases 1,000x more energy than each of the 6’s and, in the absence of appropriate preparation, causes much more damage than all the 6’s combined.
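To see the concentration effect in numbers, here is an illustration of my own using the conventional seismological scalings as assumptions (roughly 10x fewer quakes and ~10^1.5 ≈ 32x more energy per whole magnitude step), rather than the book’s figures:

```python
# Hypothetical quake counts per magnitude, following a b = 1
# Gutenberg-Richter frequency scaling (10x fewer per step up).
counts = {5: 100, 6: 10, 7: 1}

def relative_energy(m):
    """Relative energy of one quake of magnitude m (E ~ 10^(1.5*m))."""
    return 10 ** (1.5 * m)

total = {m: n * relative_energy(m) for m, n in counts.items()}

# The lone magnitude-7 quake out-releases all 110 smaller quakes combined.
print(total[7] > total[5] + total[6])   # True
```

Because energy grows faster per magnitude step (~32x) than frequency falls (~10x), the total destructive energy is dominated by the rarest, largest events, which is the whole argument for preparing for them.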
Similarly, in NATO countries there were scores of terrorist acts smaller than 9/11 (a “magnitude 7” attack, as it were) in the three decades that led up to it, but the 9/11 death toll was higher than that of all of those combined (p. 438).
If you have limited resources, it makes sense to invest them where they reduce the expected harm the most.
San Francisco should devote far more resources to bracing for the magnitude 7 quake than to preparing for the damage associated with the much more frequent but much less damaging smaller ones.
We devote a huge amount of resources to trying to deter or preempt small-scale attacks. Fine. But a blown up commuter-rail train or subway station will likely kill only a few dozen people, and even the downing of a commercial airliner only a few hundred.
A “magnitude 8” or even “9” (the release of a chemical or biological agent, or the detonation of a nuclear device, in an urban area) has the potential to kill 10,000s or even 100,000s and, while much less likely, seems well within the power-law range of possibility.
Certainly we should be (and I’m sure are) doing a lot to try to avert the latter sort of attack. But trying to detect the precursors of such a thing, so that we can intervene and avert it, is very difficult; it is a lot like trying to predict when a magnitude 7 or 8 or 9 earthquake is going to occur.
Nevertheless, we should be preparing for a magnitude 7 or 8 or 9 terrorist attack, in the same way that we do prepare for magnitude 7 and above earthquakes in the places where they can be expected to occur. Cities like San Francisco mandate that structures be built to withstand major quakes, and as a result only scores of people die in quakes that kill 10,000s in countries like Iran and Armenia.
A civil preparedness capacity to respond to a magnitude 8 or 9 terrorist attack could likewise make an orders-of-magnitude difference in the number of people killed by it.
We don’t have such a regime. Relatively little of the money that the Department of Homeland Security has doled out since its inception has been used for large-scale crisis-response planning. Moreover, major cities like Los Angeles and New York make what many would regard as only half-hearted efforts to prepare their emergency-response capacities, in part because things like mock evacuations and similar drills tend to be viewed as disruptive and also anxiety-provoking.
Instead, as Silver notes, the post-9/11 world is pervaded by various forms of “security theater”: cumbersome screening procedures and goofy alert systems that gratify a public demand for action but are unlikely to contribute anything to public safety.
After the (completely inept) “underwear bomber” was foiled on Christmas Day 2009, President Obama responded to the media frenzy by upbraiding our intelligence services for “failing to connect the dots,” and “order[ed] an immediate effort to strengthen the criteria used to add individuals to our terrorist watchlists, especially the ‘no-fly’ list”—a device that experts view as too bloated to be effective and as imposing needless burdens on travelers and security personnel alike.
A guy as smart as Obama doesn't actually need Nate Silver to tell him that you don’t try to improve your ability to pick up a signal by adding noise.
Rather, what he and the rest of us need is a science of science communication that would make us as smart about managing the political-psychology dynamics of risk perception as Silver is about data-driven forecasting. If we lack the former, the substantial contribution the latter can make to securing the public welfare is wasted.