Numbers Rule Your World
Grogan: Is your last name Canseco? [irrelevant]
Canseco: Yes.
Grogan: Did you ever inject Mark McGwire with steroids or human growth hormones? [relevant]
Canseco: Yes.
Grogan: In the last ten years, have you lied to benefit yourself financially? [control]
Canseco: No.
Grogan: Is your shirt black? [irrelevant]
Canseco: Yes.
And on it went. Grogan looked for any difference in emotions when Canseco answered relevant versus control questions. "Control" questions concern vague and broad categories of wrongdoing, such as office theft and white lies, designed to make even truthful subjects experience discomfort. Liars are supposed to feel greater anxiety toward the relevant questions, while truth tellers are expected to be bothered more by control questions.
Grogan did not equivocate regarding Canseco's performance: "He's one hundred percent telling the truth on all questions regarding human growth hormones and steroids. And the computer misses nothing, not the most smallest, insignificant tracings. . . . It gave him a .01 score on every chart, which, if this was in school, would be an A-plus on every chart collected." Take that, baseball!
Many other athletes also tried to clear their names via polygraph tests. The lawyers of superstar sprinter Marion Jones, in a bid to fend off persistent rumors of her steroid use, declared that she had passed a polygraph test. Immediately, they challenged Jones's accuser-in-chief, BALCO founder Victor Conte, to submit to a polygraph himself (he never did). They taunted, "It is easy to go on national television and . . . make 'false, malicious and misleading' statements designed to do harm to Ms. Jones' character and reputation. However, it is quite another matter to take a polygraph examination that will test whether one is truthful or untruthful." When evergreen, superstar pitcher Roger Clemens found his name featured in Senator Mitchell's report, he angrily denied the implication, and he told Mike Wallace, host of "60 Minutes," he might take a polygraph test to prove his innocence (but later recanted).
Polygraph evidence has a following not only among sports icons but also among politicians, celebrities, and business leaders. Disgraced Enron CEO Jeff Skilling publicized a favorable polygraph test to buttress his assertion that he had played no role in the shady dealings that led to the apocalyptic collapse of the energy giant and wiped out the retirement savings of thousands of employees. A cousin of J. K. Rowling took a lie detector test, broadcast on American television, to prove (in vain) that he had inspired the Potter character in her Harry Potter novels. Larry Sinclair, a Minnesota man who claimed to have shared a bed with then presidential candidate Barack Obama, infamously failed a polygraph challenge sponsored by Whitehouse.com. Larry Flynt introduced polygraph evidence to show that a New Orleans prostitute was telling the truth when she revealed having an extramarital affair with Senator David Vitter.
It may therefore be a surprise to find out that U.S. courts of law have long considered polygraphs inadmissible as evidence, ever since Marston attempted to set the precedent and failed in the 1920s. The standard has been loosened slightly in recent years in selected jurisdictions. Yet lawyers continue to seize headlines with lie detection results. One reason is that the public appears to trust polygraphs. The popular sentiment was underscored by the unpredicted success of the game show "The Moment of Truth," in which contestants chained to polygraph machines were dealt embarrassing questions about personal relationships, petty crimes, and sundry private matters. In a notable episode, the audience applauded as the wife of a New York police officer publicly admitted to infidelity, a statement confirmed as true by the polygraph. After debuting on Fox in January 2008, the show ended the season as the most watched new show on network television, averaging 14.6 million viewers. Jose Canseco reportedly threw his cap in the ring to face the examiner on the popular show in his unyielding quest for credibility (though such a show was never aired).
The enthusiastic, widespread usage of lie detectors runs counter to their unaccredited status in the American legal system. In the 1920s, the courts introduced a litmus test of "general acceptance," which excluded evidence from polygraphs unless and until the science attained sufficient validation. Almost a century came and went with anemic progress: the scientific community has periodically reviewed available research and repeatedly warned the public that polygraphs make too many errors to be reliable, particularly when used to screen people. Comprehensive reports in 2002 and 1983 were barely distinguishable in their executive findings. Meanwhile, legislators sent mixed messages on the issue: Congress passed the Employee Polygraph Protection Act of 1988, prohibiting American companies from conducting polygraph screening tests on potential or current employees, but it has not restrained government agencies or the police. And in 2008, Congress took a pass on scrutinizing the PCASS (Preliminary Credibility Assessment Screening System) after learning the portable lie detector was to be deployed in Iraq and Afghanistan.
Despite the lack of judicial or scientific standing, the FBI, the CIA, and the vast majority of local police forces routinely use polygraphs in criminal investigations. They utilize lie detectors indirectly, as a means to coerce suspects into making confessions. T. V. O'Malley, president of the American Polygraph Association, has compared a polygraph examination to "confessing to a priest: you feel a little better by getting rid of your baggage." Confession evidence holds awesome power in the courtroom; a noted legal scholar believes it "makes the other aspects of a trial superfluous." For this reason, federal and local law enforcement officers regard the polygraph as "the most effective collection tool in their arsenal of security tools." In the United States, it is legal to obtain confessions via the reporting of false evidence, which means the police are free to tell a suspect he or she failed a lie detector test no matter what happened.
When the U.S. Army approved PCASS in 2007, the portable gadget was intended for security screening (of non-U.S. citizens), rather than targeted investigations. This use is not new; at least ten government entities, including the FBI, the CIA, the National Security Agency, the Secret Service, the Department of Energy, the Drug Enforcement Administration, and the Defense Intelligence Agency, as well as most police forces, use lie detectors to screen new or current employees. At its peak, the Department of Energy's screening program covered all twenty thousand employees; bowing to pressure from scientists and Congress, the department later cut the list to twenty-three hundred targets who have access to certain "high-risk" programs.
The polygraph's practitioners and supporters argue that the machine is accurate enough, and certainly more accurate than any alternative. They are convinced that the mere presence of the lie detector intimidates some subjects into telling the truth. Since the real deal is the confession evidence, they don't think accuracy matters as much as the academics say it does. Furthermore, polygraph results have broken open some very difficult cases.
A case in point was the Angela Correa murder in Peekskill, New York. On November 15, 1989, Angela strolled into the woods of Hillcrest Park to snap photographs for school. She never walked out of the park. Two days later, her partially naked body was found, covered in leaves, raped, beaten, strangled, and murdered. She was fifteen years old. We reserve the word evil for people who commit such heinous crimes. The police detectives, working quickly, obtained an offender profile from the New York Police Department: they were told to look for a white or Hispanic man, younger than twenty-five and probably under nineteen, shorter than five feet ten inches; someone with a physical handicap or mental slowness; a loner unsure around women but who knew Angela; someone not involved in school activities and with a history of assault, drugs, and alcohol.
From the first week, the detectives did not doubt that a classmate of Angela's had killed her. They had their eyes on sixteen-year-old Jeffrey Deskovic, who fit the NYPD profile of an introverted, young murderer, and they never looked back. Deskovic was said to have been absent from school at the time of Angela's death. Later, he showed unusual curiosity in the case, even volunteering to be interviewed by detectives without the presence of family, a friend, or a lawyer.
However, the investigation was stalling, not least because the scientific evidence proved wanting, in fact completely negative. None of the three hair samples collected from Angela's body came from Deskovic (the police surmised they came from the medical examiner and his assistant). No fingerprint of Deskovic's was detected on a cassette player and tape, bottles, twigs, and other items retrieved near her body. Most exasperating to the police, the DNA in the live sperm swabbed from inside her body did not match Deskovic's and instead specifically excluded him. Nor did the detectives have direct-witness testimony.
Deskovic was interviewed no fewer than seven times during the two-month investigation. He started to act as if he were part of the investigation team, sharing notes with the detectives and drawing maps of the crime scene. The police knew they had the right guy but were frustrated they had scant evidence against him.
So it was the polygraph that saved the day. On January 25, 1990, Deskovic agreed to take a polygraph exam to prove he was telling the truth. Before this day, he had steadfastly maintained his innocence. Early in the morning, he was driven to Brewster, New York, where he was cooped up for eight hours in a ten-foot by ten-foot room, facing in turns Detective McIntyre and Investigator Stephens, who played good cop, bad cop. Eventually, Stephens told Deskovic he had failed the polygraph exam, whereupon a final confrontation ensued in which Deskovic made a confession to McIntyre.
On December 7, 1990, a jury convicted Jeffrey Deskovic of murder in the second degree, rape in the first degree, and criminal possession of a weapon in the fourth degree. On January 18, 1991, he was sent to jail with a sentence of fifteen years to life. The court described it as a "classical tragedy." Deskovic ultimately served sixteen years in prison; he was released in September 2006.
Investigator Stephens, like many police officers, regarded the polygraph as a prop to "get the confession." Whereas most courts do not admit polygraph evidence, introducing a suspect's admission of guilt greatly aids a prosecutor's case; researchers have found a conviction rate of about 80 percent among U.S. criminal cases with confession evidence. In the Angela Correa case, Deskovic's confession helped the prosecutors overcome the absence of scientific evidence and witness testimony. Without the polygraph exam, there would have been no confession and thus no conviction.
In Pittsburgh, Carnegie Mellon University statistician Stephen Fienberg listened in disbelief as an MSNBC journalist spoke to him about the newly revealed PCASS, the portable lie detector. The army had already poured over $2.5 million into its development and purchased about a hundred units for troops in Iraq and Afghanistan. Professor Fienberg saw this as utter disdain for the considered opinion of American scientists on the unreliability of lie detection technologies, and of polygraphs in particular. In 2002, he had served as the technical director of the report in which the National Academy of Sciences (NAS) resoundingly rejected the polygraph as an inadequate science, especially for use in national-security screening. The key sentence of the entire report was this one: "Given [the polygraph's] level of accuracy, achieving a high probability of identifying individuals who pose major security risks in a population with a very low proportion of such individuals would require setting the test to be so sensitive that hundreds, or even thousands, of innocent individuals would be implicated for every major security violator correctly identified."
This ratio of false positives to true positives (hundreds or thousands to one) elegantly captures what the scientists have dubbed the "unacceptable trade-off," which is a variant of the conundrum faced by anti-doping scientists hoping to pick out drug cheats from squads of clean athletes. Here, polygraph examiners must set the sensitivity of their machines so as to balance the benefits of possibly identifying suspicious individuals with the costs of falsely implicating law-abiding citizens. However, different settings merely redistribute errors between false positives and false negatives, not unlike using different thresholds in the hematocrit test. Addressing the flip side of this trade-off, the NAS opined, "The only way to be certain to limit the frequency of false positives is to administer the test in a manner that would almost certainly severely limit the proportion of serious transgressors identified." This science is dismal: as false positives ebb, so false negatives flow.
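The trade-off described above can be made concrete with a small sketch. The two score distributions below are purely hypothetical (the text gives no such figures); they stand in for the physiological readings of truthful and deceptive subjects. Moving the threshold only shifts errors from one column to the other.

```python
from statistics import NormalDist

# Hypothetical score distributions, assumed for illustration only:
# deceptive subjects tend to score higher, but the two groups overlap.
truthful = NormalDist(mu=0.0, sigma=1.0)
deceptive = NormalDist(mu=1.5, sigma=1.0)


def error_rates(threshold):
    """Return (false-positive rate, false-negative rate) when every
    score above `threshold` is called deceptive."""
    false_positive_rate = 1 - truthful.cdf(threshold)  # truthful but flagged
    false_negative_rate = deceptive.cdf(threshold)     # deceptive but passed
    return false_positive_rate, false_negative_rate


thresholds = (0.0, 0.5, 1.0, 1.5, 2.0)
rates = [error_rates(t) for t in thresholds]

for t, (fpr, fnr) in zip(thresholds, rates):
    print(f"threshold={t:.1f}  false positives={fpr:.3f}  false negatives={fnr:.3f}")
```

Raising the threshold makes the false-positive column shrink while the false-negative column grows, whatever the particular numbers assumed for the two distributions.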
The 2002 NAS report specifically recommended that the government reduce or rescind the use of polygraphs for employee screening. Yet the investigative reporter from MSNBC unearthed declassified documents disclosing how the army had been aggressively pursuing a new portable polygraph gadget destined for screening use. As a knockoff of the traditional polygraph, PCASS records fewer measurements and is surely less accurate than its model. The crucial role of the examiner is abrogated, replaced by an "objective" computer program that is easily fooled by countermeasures it cannot see, such as breath control and tongue biting. What's more, NAS criticized the Johns Hopkins University lab hired to supply the computer program for being "unresponsive" to repeated requests for technical details, so that the research committee could not complete an independent evaluation of the lab's methodology. Among the uninspiring body of research on the accuracy of PCASS, most studies were conducted by the same people who developed the device itself (conflict of interest, anyone?), and none attempted to replicate the battlefield conditions under which it would be deployed. Despite the lack of serious science to back up the claims of efficacy, Congress has failed to call any hearing on PCASS.
In an act of self-regulation, the army acknowledged the weaknesses of the portable lie detector and proactively restricted its use to screening job applicants at military bases and screening potential insurgents at bomb scenes. As counterintelligence team leader David Thompson explained, the "Reds" (subjects determined to be deceptive) would face follow-on interrogations, most likely via the traditional polygraph examination, rather than immediate consequences. Implicit in this change of policy is the idea that the polygraph will be more acceptable if we set lower expectations, that is, if we let PCASS do half the job. Such a compromise seemed as though it should please the skeptical scientific community: while the decision didn't completely shelve deployment of the flawed technology, the army at least curtailed its role.
Far from being satisfied, Fienberg's panel concluded that while polygraphs are marginally useful for targeted investigations, they are essentially worthless for screening. Alas, against lowered expectations, lie detectors perform even worse! A sports commentator would say the team is playing down to the standard of the opponent. To understand why this must be so, compare the following two situations in which the polygraph attains the 90 percent accuracy level claimed by its supporters:

Situation A: Screening
The agency believes 10 spies lurk among its 10,000 employees (1 in 1,000).
Of the 10 spies, the polygraph correctly identifies 90 percent (9) and passes 1 erroneously.
Of the remaining 9,990 good employees, the polygraph erroneously fails 10 percent, or 999.
For every spy caught, 111 good employees are falsely accused.

Situation B: Police Lineup
The police seek 20 murderers out of 100 suspects (1 in 5).
Of the 20 murderers, the polygraph correctly identifies 90 percent (18) and passes 2 erroneously.
Of the remaining 80 innocent suspects, the polygraph erroneously fails 10 percent, or 8.
For every 9 murderers caught, 4 innocent citizens are falsely accused.
Notice that Situation B offers a dramatically more favorable cost-to-benefit ratio than Situation A: when the lie detector is used in a specific investigation, such as a police lineup, the price of catching each criminal is less than 1 false accusation, but when it is used for screening, the comparable cost is 111 innocents sacrificed.
Given identical accuracy levels in both situations, the real reason for this difference is the divergent ratio of criminals to innocents subject to the tests. Situation A (screening) is more trying because the presence of so many innocents (999 out of 1,000) turns even a small error rate into a bounty of false positives and a roster of ruined careers. The baseball players' union would not be pleased as Mike Lowell's worst-case scenario materialized. For security screening, one expects that almost everyone examined is neither a spy nor an insurgent, so the situation is like A, not B. To overcome the challenge of Situation A, we must have a wholly accurate technology, one that yields an exceedingly small quarry of false positives. Scientists warn against PCASS precisely because the military intends to use the gadget for screening masses of mostly innocent people; in this arena, sometimes known as "prediction of rare events," the polygraph and its variants are decidedly not Magic Lassos.
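The arithmetic behind Situations A and B can be checked in a few lines. This is only a sketch: the function name and its parameters are made up for illustration, and it assumes, as the text does, that the same 90 percent accuracy applies to the guilty and the innocent alike.

```python
from fractions import Fraction


def screening_outcomes(population, guilty, accuracy):
    """Return (true positives, false positives) for a test that flags
    `accuracy` of the guilty and wrongly flags (1 - accuracy) of the
    innocent. Names and signature are illustrative, not from the text."""
    innocent = population - guilty
    true_positives = guilty * accuracy
    false_positives = innocent * (1 - accuracy)
    return true_positives, false_positives


ACCURACY = Fraction(9, 10)  # the 90 percent level claimed by supporters

# Situation A: 10 spies among 10,000 employees.
tp_a, fp_a = screening_outcomes(10_000, 10, ACCURACY)
# Situation B: 20 murderers among 100 suspects.
tp_b, fp_b = screening_outcomes(100, 20, ACCURACY)

print("A:", tp_a, "caught,", fp_a, "falsely accused, ratio", fp_a / tp_a)
print("B:", tp_b, "caught,", fp_b, "falsely accused, ratio", fp_b / tp_b)
```

The ratio comes out to 111 false accusations per spy caught in Situation A and 4/9 (4 innocents for every 9 murderers) in Situation B. Only the mix of guilty and innocent changed, not the accuracy.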
When Jeffrey Deskovic walked out of jail on September 20, 2006, he walked out a free man. He also walked out an innocent man. Not a typo. Deskovic became a poster boy for the Innocence Project, a pro bono legal aid consultancy dedicated to overturning wrongful convictions through the latest DNA technology. Earlier that year, the project leaders had convinced Janet DiFiore, the new Westchester County district attorney, to reexamine Deskovic's DNA. The result confirmed the original forensic finding that the quiet classmate of Angela Correa had nothing whatsoever to do with her murder. More significantly, the murderer's DNA matched Steven Cunningham, whose profile was inserted into a data bank of criminals due to a separate murder conviction for which he was serving a twenty-year sentence. Cunningham later admitted to Angela's murder and rape, closing the loop for DiFiore's office.
It took Deskovic sixteen long years to win back his freedom and innocence. At the time of his release in 2006, he was thirty-three years old but just beginning his adult life. A reporter from the New York Times found him struggling with the basics of modern living, such as job hunting, balancing a checkbook, driving a car, and making friends. He said, "I lost all my friends. My family has become strangers to me. There was a woman who I wanted to marry at the time I was convicted, and I lost that too."
"He had been incarcerated half his life for a crime he did not commit." DiFiore's office did not mince words in a candid review of the Angela Correa murder case. Her report continued, "Deskovic's January 25th statement was far and away the most important evidence at the trial. Without it, the State had no case against him. He would never have been prosecuted for killing Correa. He would never have been convicted. He would never have spent a day, let alone sixteen years, in prison."
Recall what transpired on that fateful day: Deskovic consented to a polygraph interrogation, during which he confessed to a crime he did not commit. The detectives concluded that Deskovic lied when he claimed innocence, and this error of judgment caused the grave miscarriage of justice. In hindsight, Deskovic was incarcerated half his life due to a false-positive error in a polygraph exam.
In a surprising twist, DiFiore acknowledged that the tactics used by the police were in fact lawful. They are allowed to seek polygraph exams (suspects may refuse) and to elicit confessions, even by citing false evidence, such as a fictitious failed polygraph result. According to Saul Kassin, a leading forensic psychologist, such investigative methods frequently produce false confessions. Up to a quarter of the convicts exonerated by the Innocence Project had admitted to crimes they did not commit, and like Jeffrey Deskovic, many had done so during polygraphs.
One might think normal people do not make false confessions. But important research by Kassin and other psychologists has refuted this sensible assumption. Kassin has said pointedly that it is innocence itself that puts innocent people at risk. The statistics show that innocent people are more likely to waive the rights designed to protect them, such as the right to silence and to counsel, and they are more likely to agree to polygraphs, house searches, and other discretionary actions. Their desire to cooperate is fueled by another "confession myth" identified by Kassin, the incorrect belief that prosecutors, judges, or jurors will know a false confession in light of other evidence (or lack thereof). Sadly, confession evidence can be overpowering. Kassin reported that in his experiments with mock juries, even when the jurors stated that they fully discounted the confession as unreliable, the conviction rates of these cases were still significantly above those of the same cases presented without the confession evidence. Furthermore, this result held even when the jurors were specifically instructed to disregard the confession.
Kassin's number one confession myth is the misconception that trained interviewers can detect truth and deception, a direct challenge to polygraph supporters. He cited studies from around the world that have consistently found that the self-anointed experts, such as police interrogators, judges, psychiatrists, customs inspectors, and the like, do no better at discerning lies than untrained eyes. More alarmingly, there is emerging evidence that professional training in interrogation techniques does not affect accuracy but merely bolsters self-confidence: a misguided conviction, if not a delusion.
The Deskovic tragedy was a case in point. Everything other than his confession was either exculpatory or erroneous. The original scientific and forensic evidence, not just the DNA test, exonerated Deskovic but was explained away by speculative theories and then ignored by the jury. For instance, the prosecution claimed that the hair samples, which were not linked to Deskovic, could have come from the medical examiner and his assistant, and the jury accepted that explanation without proof. When Deskovic maintained his innocence through the sentencing phase and beyond, asserting that he "didn't do anything," the jury chose to believe his earlier, unrecorded confession. The NYPD profile, which purportedly fit Deskovic almost perfectly, missed the mark on all fronts: the real perpetrator, Cunningham, was black, not white or Hispanic; his age was almost thirty, not under nineteen or twenty-five; and he was a complete stranger to the victim, not somebody she knew.
The psychologists worry that we are only seeing the tip of the iceberg of wrongful convictions. Statisticians elaborate: when we deploy polygraphs for screening, like those in the PCASS project, there will be hundreds, or even thousands, of false positives for every major security threat correctly identified. Some, perhaps most, of these will lead to false confessions and wrongful convictions.
In calibrating the computer algorithm of PCASS, the army requested that Greens (those judged to be truthful) be minimized and Reds (deceptive) be favored against Yellows (inconclusive). Accordingly, the Johns Hopkins researchers set the passing rate (that is, the percentage of Greens) at less than 50 percent. This situation is as if the anti-doping agency set the hematocrit threshold at 46 percent, thus disqualifying half of the clean athletes while ensuring that all dopers are caught. The way PCASS is calibrated tells us that army leaders are worried sick about the false negative. They are reluctant to pass any job applicant unless they can be sure the person has not lied. This policy is entirely consistent with the prevalent belief that even one undetected insurgent could prove devastating. After all, some terrorist strikes, like the anthrax attacks of 2001, can be perpetrated by a criminal acting alone, as far as we know.
By focusing its energy on making sure no potential insurgent goes unnoticed, the army is certain to have made loads of false-positive errors. Stemming from the unavoidable trade-off between the two errors, this result is clear unless one believes that the majority of the applicant pool (those judged to be Reds) could consist of insurgents. The sway of the asymmetric costs works in reverse relative to the case of steroid testing: here, the false-negative error can become highly toxic and highly public, while false-positive mistakes are well hidden and may come to light only through the painstaking work of activists like the Innocence Project.
The asymmetric costs associated with national-security screening sway examiners toward condoning false positives while minimizing false negatives, which has profound consequences for all citizens. It took Jeffrey Deskovic sixteen years of perseverance, plus a pinch of good luck with the new district attorney, to expose the grave false-positive error. In several recent high-profile cases of alleged espionage, the suspects, such as the Chinese-American scientist Dr. Wen Ho Lee, reportedly flunked polygraph exams but were ultimately cleared of spying, winning multimillion-dollar settlements for their troubles. These false alarms not only cost investigators time and money in chasing dead-end leads but also tarnished the reputations and destroyed the careers of the victims and frequently also their associates. Statistical analysis confirms that many more Deskovics, perhaps hundreds or thousands a year, are out there, most likely hapless.
Even if we trust the accuracy level claimed by the Johns Hopkins researchers, we can conclude that for every true insurgent caught by PCASS, 93 regular folks would be falsely classified as deceptive, their apparent "crime" being at the wrong place at the wrong time (see Figure 4-2). The statistics largely mirror those of the earlier Situation A, in which even a tiny false-positive rate will be magnified by the presence of a large number of regular folks within the applicant pool (9,990 out of 10,000 in our example). The portable lie detector exacts a high cost for catching the 8 or 9 insurgents, with almost 800 innocents mistaken as deceptive.
Figure 4-2. How PCASS Produces 100 False Alarms for Every Insurgent Caught [image]
This ominous cost-to-benefit ratio ought to frighten us in four ways. First, it embodies a morbid calculus in which a crowd of almost 100 gets rounded up in response to 1 person's infraction, invoking unwelcome memories of the collective-accountability system. Second, among the Reds, there is no way of separating the 8 or 9 truly deceptive from the 800 falsely accused. Third, it makes a mockery of a "screening" device when it passes only about half of the subjects (4,995 out of 10,000) while calling most of the rest "inconclusive"; since we expect only 10 insurgents in the applicant pool, almost all of these Yellows are in fact harmless people. Fourth, there is still the unfinished business of the 1 insurgent incorrectly given the green or yellow light while almost 800 innocents see "red" haphazardly. Add to these problems the overstated accuracy level and the possibility of countermeasures, and we have one highly suspect technology.
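The figures quoted for PCASS can be roughly reproduced with the same kind of tally. The text does not state the underlying rates, so the two rates below are assumptions back-solved from the quoted counts (8 or 9 insurgents caught, almost 800 innocents flagged red among 10,000 applicants); they are not figures from the report or the researchers.

```python
# Applicant pool from the text: 10,000 people, of whom 10 are insurgents.
POOL = 10_000
INSURGENTS = 10
INNOCENTS = POOL - INSURGENTS

# ASSUMED rates, chosen only so the output lands near the text's counts.
RED_RATE_FOR_INSURGENTS = 0.86  # share of insurgents labeled Red (assumption)
RED_RATE_FOR_INNOCENTS = 0.08   # share of innocents labeled Red (assumption)

insurgents_caught = INSURGENTS * RED_RATE_FOR_INSURGENTS
innocents_flagged = INNOCENTS * RED_RATE_FOR_INNOCENTS
false_alarms_per_catch = innocents_flagged / insurgents_caught

print(f"insurgents caught:      {insurgents_caught:.1f}")
print(f"innocents flagged red:  {innocents_flagged:.1f}")
print(f"false alarms per catch: {false_alarms_per_catch:.0f}")
```

Under these assumed rates the tally gives between 8 and 9 insurgents caught, just under 800 innocents flagged, and a ratio close to the 93 quoted in the text. Other pairs of rates could produce similar totals.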
How many innocent lives should we ruin in the name of national security? This was the question Professor Fienberg raised when he warned against PCASS and other lie detection techniques. "It may be harmless if television fails to discriminate between science and science fiction, but it is dangerous when government does not know the difference."
Truth be told, detection systems are far from perfect. Steroid testing suffers from a bout of false-negative errors, so drug cheats like Marion Jones and many others can hide behind strings of negatives. It is one thing to look for known molecular structures inside test tubes; it is quite another thing to vet the words of suspected liars. As Jose Canseco realized, the polygraph is one of few instruments our society trusts in these situations. However, statisticians say lie detectors produce excessive false-positive errors that result in false accusations, coerced confessions, dead-end leads, or ruined lives; the cruel fate visited upon Jeffrey Deskovic and undoubtedly numerous others serves to warn us against such excesses. Worse, the accuracy of detection systems slips markedly when the targets to be detected occur rarely or are measured indirectly; this explains why preemployment screening for potential security threats is harder than screening for past security violations through indirect physiological measures, which is harder than detecting a specific steroid molecule.
In the meantime, the public discourse is focused on other matters. In steroid testing, we keep hearing about the false-positive problem: how star athletes are being hunted by supercilious testers. In national security, we fear the false negative: how the one terrorist could sneak past the screening. As a result, the powers that be have decided that falsely accusing athletes and failing to detect terrorists are more expensive than undetected drug cheats and wrongful convictions.
Statisticians tell us to evaluate both types of errors at the same time, because they are interconnected through an unavoidable trade-off. In practice, the two errors often carry asymmetric costs, and in calibrating detection systems, decision makers, knowingly or not, will be swayed by the error that is public and toxic. For drug tests, this is the false positive, and for polygraphs, it is the false negative. But the trade-off ensures that any effort to minimize this error will aggravate the other; and because the other error is less visible, its deterioration is usually unnoticed.
Short of a technological breakthrough that dramatically improves the overall accuracy of polygraphs, it is not possible to reduce false positives and false negatives simultaneously, leaving us with an unacceptable, unpleasant trade-off. This trade-off is as true in PCASS as in other large-scale screening initiatives, including various data-mining constructs rumored to have sprouted in the "War on Terror."
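The trade-off can be made concrete with a toy model. The sketch below is only an illustration: it assumes that an examiner's "deception score" follows one bell curve for truthful subjects and another, shifted curve for deceptive ones. Both curves, their parameters, and the name error_rates are hypothetical and do not come from any real polygraph data. Wherever the cutoff for "flagged" is placed, lowering one error rate raises the other.

```python
from statistics import NormalDist

# Hypothetical score distributions (assumptions for illustration only):
# truthful subjects score lower on average than deceptive ones, with overlap.
truthful = NormalDist(mu=0.0, sigma=1.0)
deceptive = NormalDist(mu=1.5, sigma=1.0)


def error_rates(threshold):
    """Flag anyone scoring above the threshold.

    Returns (false_positive_rate, false_negative_rate).
    """
    false_positive = 1.0 - truthful.cdf(threshold)  # truthful but flagged
    false_negative = deceptive.cdf(threshold)       # deceptive but cleared
    return false_positive, false_negative


# Sweep the cutoff from lenient to strict and watch the two errors move
# in opposite directions.
for threshold in (0.0, 0.5, 1.0, 1.5, 2.0):
    fp, fn = error_rates(threshold)
    print(f"cutoff {threshold:.1f}: "
          f"false positives {fp:.1%}, false negatives {fn:.1%}")
```

Only a change in the curves themselves, meaning a more accurate instrument with less overlap, would let both error rates fall together.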
After September 11, 2001, a vast new market opened up for data-mining software, previously purchased primarily by large businesses. It is generally accepted that the terrorist attacks could have been prevented if our intelligence agencies had only "connected the dots" in time. Thus, by building gigantic databases that keep tabs on everyone (storing all phone calls, e-mails, websites visited, bank transactions, tax records, and so on) and by unleashing search agents, spiders, bots, and other exotically named software species to sift through the data at lightning speeds, shaking out patterns and tendencies, our government can uncover plots before the terrorists strike. These secretive, expansive programs go by creative monikers such as TIA (Total Information Awareness; later renamed Terrorism Information Awareness), ADVISE (that's Analysis, Dissemination, Visualization, Insight, and Semantic Enhancement), and Talon (apparently not an acronym). A celebratory confidence pervades the data-mining community as they make bold promises, as in the following example from Craig Norris, CEO of Attensity, a Palo Alto, California-based start-up that counts the National Security Agency and Department of Homeland Security as customers: "You don't even have to know what it is you're searching for. But you'll know it once you find it. If a terrorist is planning a bombing, they might say, 'Let's have a barbecue.' The software can detect if the word barbecue is being used more often than usual."
If only real life were so perfect.
Using data-mining software to find terrorist plots is comparable to using polygraphs for preemployment screening in that we collect information about past or current behavior in order to predict future misconduct. In either case, reliance on indirect evidence and the sway of false negatives relative to false positives tend to produce lots of false alarms. Moreover, both applications involve prediction of rare events, and terrorist plots are even rarer than spies! Rarity is measured by how many relevant objects (say, spies) exist among the total pool of objects (say, employees). As every detail of our daily lives is sucked into those gigantic databases, the number of objects examined swells at a breakneck pace, while the number of known terrorist plots does not. Therefore, relevant objects become rarer and harder to find. If data-mining systems perform as accurately as polygraphs, they will drown under the weight of false positives in less time than it takes to sink PCASS.
Security expert Bruce Schneier has looked at data-mining systems the same way we evaluated steroid tests and polygraphs: "We'll assume the [data-mining] system has a one in 100 false-positive rate . . . and a one in 1,000 false-negative rate. Assume one trillion possible indicators to sift through: that's about 10 events-emails, phone calls, purchases, web destinations, whatever-per person in the United States per day. Also assume that 10 of them are actually terrorists plotting. This unrealistically accurate system will generate one billion false alarms for every real terrorist plot it uncovers. Every day of every year, the police will have to investigate 27 million potential plots in order to find the one real terrorist plot per month. Raise that false-positive accuracy to an absurd 99.9999 percent and you're still chasing 2,750 false alarms per day but that will inevitably raise your false negatives, and you're going to miss some of those 10 real plots."
But a realistic data-mining system does not surpass the accuracy level of polygraphs, so Schneier's numbers (diagrammed in Figure 4-3) are wildly optimistic, as he admonished.
Statisticians who perform the type of exploratory analysis Norris described, in which the computer discovers you-know-it-once-you-see-it patterns, know such findings are only approximate. To appropriate Norris's example, if no prior terrorist has ever used barbecue as a code word, no data-mining system will tag it as suspicious. If there were such a prescient system, it would throw off millions of false alarms (picnic, umbrella, beach, sneakers, etc.). We would then have to inaugurate a new class of crime, excessive communal verbal redundancy, to describe the dangerous state of repeating a word or a phrase too many times on one's own or within one's social network.
Figure 4-3 How Data-Mining Technologies Produce Billions of False Alarms [image]
Expecting intelligence agencies to "connect the dots" is a pipe dream. Which dots mattered revealed themselves only after 9/11. The idea that we knew the dots and only needed someone to join them was classic 20/20 hindsight. Imagine: if trains rather than planes had been involved, we would now be fidgeting about other dots!
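Schneier's arithmetic is easy to retrace. The sketch below uses only the figures in his quote, read as yearly totals; the variable names and the helper yearly_alarms are labels invented here for illustration. Its results land close to his rounded numbers (the quote gives 2,750 alarms per day for the improved system, where a 365-day year yields slightly fewer).

```python
# Inputs quoted from Schneier's thought experiment, read as yearly totals.
indicators = 1_000_000_000_000   # one trillion events to sift through
real_plots = 10                  # events that are actual terrorist plots
false_positive_rate = 1 / 100    # innocent events wrongly flagged
false_negative_rate = 1 / 1_000  # real plots wrongly cleared


def yearly_alarms(fp_rate, fn_rate):
    """Return (false alarms, real plots caught) over one year."""
    innocent_events = indicators - real_plots
    false_alarms = innocent_events * fp_rate
    plots_caught = real_plots * (1 - fn_rate)
    return false_alarms, plots_caught


false_alarms, plots_caught = yearly_alarms(false_positive_rate,
                                           false_negative_rate)
alarms_per_plot = false_alarms / plots_caught
alarms_per_day = false_alarms / 365

# The "absurd" 99.9999 percent case: one false alarm per million innocents.
better_alarms, _ = yearly_alarms(1 / 1_000_000, false_negative_rate)
better_per_day = better_alarms / 365

print(f"False alarms per real plot caught: {alarms_per_plot:,.0f}")
print(f"False alarms per day:              {alarms_per_day:,.0f}")
print(f"Per day at 99.9999% accuracy:      {better_per_day:,.0f}")
```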
The endless drumbeat about the miracle of data-mining systems tells us we have drawn the wrong lessons from 9/11. Sure, the tragedy made tangible the unimaginable cost of a false-negative mistake, of failing to identify potential terrorists. But too much fear of false negatives has inevitably resulted in too many false positives. It therefore makes statistical sense that few Guantanamo inmates have been convicted and many detainees were declared innocent or released without charge. When marketers use data mining to guess which customers will respond positively to sale offers, false positives may cause selected customers to receive junk mail; when banks use data mining to guess which credit card transactions may be fraudulent, false positives may cost honest customers time to call and verify blocked charges. These are inconveniences when compared with the psychological trauma and shattered lives that could stem from a charge of "excessive communal verbal redundancy." Apart from false imprisonment and loss of civil liberties, we must also consider how demoralizing, how expensive, and how counterproductive it is for our intelligence agents to be chasing down millions of bad leads.
What we should have learned from 9/11 is that terrorist plots are extremely rare events. Existing detection technologies are not accurate enough to qualify for the job; we know polygraphs can't do it, and large-scale data-mining systems perform even worse. The unacceptable trade-off remains just as unacceptable. The Magic Lasso is still elusive. We need something much, much better.
5.
Jet Crashes / Jackpots
The Power of Being Impossible
The safest part of your journey is over. Now drive home safely.
-ANONYMOUS PILOT
1,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000.
-A QUINDECILLION
On August 24, 2001, a store clerk in Ontario, Canada, claimed a CDN$250,000 prize in the Encore lottery. On October 31, 1999, a Boeing 767 jetliner plunged into the Atlantic Ocean off Nantucket Island, Massachusetts, leaving no survivors. On the surface, these two events, winning a fortune and losing everything, had nothing to do with each other. Except when one considers how improbable either of these events was. Statisticians who keep score tell us that the odds of winning the Encore lottery are one in ten million, roughly comparable to the odds of dying in a plane crash. At these long odds, practically none of us will live long enough to win the Encore lottery or to perish in a plane crash. Yet about 50 percent of Americans play the state lotteries, and at least 30 percent fear flying. Our belief in miracles underlies both these attitudes: even if rare events do not happen often, when they do happen, they will happen to us. If someone's ticket is going to win the million-dollar jackpot, it will be ours, so we gamble. If a plane is going to vanish in the Atlantic, it will be the one taking us, so we avoid flying.
Statisticians typically take the contrary view: they write off the chance of jackpots and are not worried about jet crashes. Why would they expose themselves to the risk of death but exclude themselves from the dream of riches? Can they be serious?
It was almost 2:00 A.M. on October 31, 1999, the Sunday morning of a Halloween weekend, long after the residents of the pristine Nantucket Island had waved off their reveling friends on an unseasonably cool night. Stuart Flegg, a carpenter who had moved to a southeastern cliff of Nantucket eleven years before, was lounging in his backyard with buddies and beers, under the stars. Without warning, an orange fireball growled in the night sky and then faded quietly into the darkness. Stuart rubbed his eyes, as if to check his state of mind, and poked his friend on the back, pointing in the direction of the flash. It was unlike anything he had ever seen. Stuart and his friends prattled on for a bit, and then they let the thread drop.
Word seemed to come out of nowhere and creep through the neighborhood like wild vines. A few minutes before, from thirty-three thousand feet above sea level, a Boeing 767 jet had cut out a sixty-six-degree angle as it split and nose-dove, scattering 217 souls in the azure Atlantic waters. The horrified passengers on board EgyptAir Flight 990 had endured a precipitous fall at four hundred feet per second, followed by an abrupt surge up eight thousand feet, only to tumble again, this time in finality. Many more details would trickle out over the coming months and years. At this time and place, to those witnesses, to the relatives of the perished, and their friends and neighbors, the impact was head-on and devastating.
Recognition of the disaster quickly gave way to disbelief, then to scant hope; soon information brought about acceptance of the tragic and an outpouring of sadness. With little delay, curiosity set in as well, in the form of loving concern: Do I know anyone scheduled to fly tonight? From Boston Logan? On EgyptAir? Heading to Cairo? A single no brought relief, while four yeses brought fear, denial, and busy fingers on a cell phone. For most, the fear of losing a loved one would come and go, though uncertainty would linger, now in the form of foresight: Better to cancel business trips and vacations for the near future. Better to travel on land than by air. Better to shun those dangerous foreign airlines. Better to avoid Logan, or night flights, or layovers in JFK. Never again save a penny.
Ninety miles away in Boston, the newsrooms buzzed to life. By midmorning, CBS, NBC, Fox, ABC, CNN, MSNBC, and Fox News had all swapped out their regular programming to carry continuous coverage of The Crash. Few businesses thrive on the morbid as much as the media, particularly on sleepy weekends bereft of ready-made headlines. Journalists on the disaster beat knew this was their time to shine. For the next week, if not the entire month, their reports would land on the front page and stay in the public's consciousness. The extent of the media coverage is reflected in the statistics of New York Times front-page stories: researchers found 138 articles for every 1,000 plane crash deaths, but only 2 articles for every 1,000 homicides, and only 0.02 article for every 1,000 cancer deaths.
The Sunday of the crash, the front pages of newspapers big and small announced the air disaster with absolute solemnity. Sorting through the headlines, one might recognize the five prototypal disaster beat stories: the tragedy narrative about the available facts of the case; the human interest story singling out one unfortunate victim; the feel-good story about communities pulling together to cope with the disaster; the detective report citing analyses from all angles, from engineers, insurers, passersby, psychologists, and even psychics; and the big-picture synthesis, courtesy of the editors.
The editorial piece has a predictable structure, frequently referencing a list of recent air tragedies, compiled in a table similar to this one:
Corridor of Conspiracy [image]
Presented with such a table, we look for patterns of occurrence. Seek and ye shall find: that is a law of statistics. It doesn't take a genius, or a news editor, to notice that between 1996 and 1999, a succession of jets plunged into the Atlantic near Nantucket: TWA, Swissair, EgyptAir, and John F. Kennedy Jr.'s private plane. Speaking to an Associated Press reporter, a local diving instructor lamented, "Nantucket is like the Bermuda Triangle of the Northeast." He was not the only one making that connection. Many reporters also connected the dots, so to speak.
Rounding out the piece, the editors would cite the latest polls confirming the heightened level of worry about air travel. Then they would advise readers to remain calm, reminding them that experts still consider flying to be safe relative to other forms of transportation.
The editorial stance on the EgyptAir disaster was predictable, and so was the lukewarm reaction to its call for calm. It is common during such times that emotions clash against logic, superstition against science, faith against reason. Hovering in the air was the obvious question: what had caused EgyptAir 990 to crash? The journalists were running at full tilt, consulting any and all experts, whose theories more often than not conflicted with each other. No chain of logic was denied, as one report after another flooded the news. The more information provided to the public, the greater confusion it caused, and the more speculation it created. Starting with the rational explanations, such as equipment failure or atmospheric anomaly, suspicion shifted to the sinister, such as terrorist attack or drunken pilot, and then to the bizarre, like electromagnetic interference, missile attack, or pilot suicide. Finally, logical reasoning gave way to raw emotion. Unable to point the finger at a specific airline, airport, jet manufacturer, or day of the week, the public turned its back on the concept of flying altogether. Many travelers canceled or postponed their planned trips, while others opted to jump into their cars instead. Fully half the people polled by Newsweek after the EgyptAir crash said they experienced fear when flying, and about the same proportion indicated they would avoid airlines from Egypt and other Middle Eastern countries. Wary of its stigma, the airline quietly retired the flight's number.
This fog of emotions would not clear for a few years, which is how long it takes for official air traffic accident investigations to conclude.
Taking a self-imposed moratorium on air travel is not much different from performing the rain dance to fight off a severe drought or beating drums to scare off locusts. When reason is exhausted, emotions fill up the void. But experience should teach us that logical reasoning bears the best hope, even in the face of inexplicable calamity. During the locust outbreak of 2004, African leaders convened and resolved to employ drums, the kind that hold pesticide.
At this point, you may be expecting a gotcha about misunderstanding relative risk, but we will not go there. If a lot of people come to the same conclusion about something, there must be a certain logic behind it. Interviews with people who said they would stop flying after hearing about a major crash showed that their anxiety was deeply felt. They understood that plane crashes were exceptionally rare; after all, only a handful of fatal accidents hit the developed world during the 1990s. But people feared that these accidents were more likely to occur on their flights than on others. They were asking, if air crashes were random, what could explain the unlikely coincidence of four fatal accidents in four years occurring in the same air space? They felt sure the morbid record was a conspiracy of the unknown: though nobody had yet identified the culprit, they knew something must have caused the crash. They thought the pattern was too tidy to have been the dirty work of random chance. In 1999, many were blaming the "Bermuda Triangle" over Nantucket.
While such allegations sound outlandish, the line of reasoning behind them reflects sound statistical thinking. Based on the pattern of occurrence, these folks rejected the idea that air crashes happened at random; instead, they believed in some sort of predetermination (Bermuda Triangle, equipment failure, drunken pilot, and the like). Statisticians call this logic "statistical testing," and we use it all the time, often without knowing.
If that is so, then why do the experts get so worked up over people's fears after any plane crash? In 2001, Professor Arnold Barnett, the nation's foremost airline safety expert, dared to ask rhetorically, "Is aviation safety . . . a problem that has been essentially solved [in the First World], to the extent that talking about it might suggest a personality disorder?" In even starker terms, Professor Barry Glassner, a psychologist who wrote The Culture of Fear, reckoned that the hysteria after a plane crash is as deadly as the crash itself because people who abandon their flight plans face a greater risk of dying-from road accidents. Why did these experts look at the same list of fatalities but come to the opposite conclusion? How could they explain the coincidence of four crashes in four years in the same general area? More importantly, how could they continue to trust foreign airlines?
On August 24, 2001, the Ontario Lottery and Gaming Corporation (OLG) awarded a CDN$250,000 check to Phyllis and Scott LaPlante, lucky winners of the Encore lottery on July 13, 2001. Each CDN$1 Encore ticket bought a chance to win CDN$250,000 if all six digits matched. With the odds of winning listed as one in ten million, someone who spends CDN$1 on Encore every day can expect to win once every twenty-seven thousand years, which rounds to about . . . never. Barnett, the air safety expert, has estimated that the chance of dying from a plane crash on a U.S. domestic nonstop flight is also one in ten million. Someone who takes one such flight every day would live twenty-seven thousand years before encountering a fatal crash. Thus, either event has about the same microscopic chance of happening.
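The "twenty-seven thousand years" figure follows from treating each daily ticket (or daily flight) as an independent one-in-ten-million trial, so the average wait is ten million days. The sketch below retraces that step and adds one number not in the text: the chance of at least one hit over an assumed eighty years of daily play. The eighty-year horizon and the variable names are assumptions chosen only for illustration.

```python
import math

odds = 10_000_000        # one-in-ten-million chance per ticket or per flight
days_per_year = 365.25   # one trial per day

# Mean wait for the first success in repeated independent trials is 1/p
# trials, i.e. `odds` days here.
expected_years = odds / days_per_year

# Assumed horizon (not from the text): eighty years of daily play.
trials = round(80 * days_per_year)
# P(at least one hit) = 1 - (1 - p)^trials, computed stably in log space.
p_hit_in_80_years = -math.expm1(trials * math.log1p(-1 / odds))

print(f"Expected wait: about {expected_years:,.0f} years")
print(f"Chance of at least one hit in 80 years: {p_hit_in_80_years:.2%}")
```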
What made Phyllis LaPlante's case special was her insider status: she and her husband owned Coby Milk and Variety, a small store in Coboconk, Ontario, which sold, among other things, lottery tickets. When she scanned the winning ticket, the machine chimed twice, announcing a major win. As her prize money exceeded CDN$50,000, it triggered an "insider win" investigation. Tracing the trail of tickets, the OLG staff cleverly deduced that the winner stuck to a regular set of numbers (9 4 2 9 8 1) in every lottery. So they asked for some old tickets, and the LaPlantes duly provided a few. The numbers matched.
What had actually taken place was that Phyllis LaPlante had just robbed an eighty-two-year-old man of a quarter of a million dollars. It took a statistician to prove the case decisively-using the logic of statistical testing-and determine if LaPlante was simply lucky or mighty crafty. In addressing this question, Jeffrey Rosenthal at the University of Toronto examined seven years of draws and winners of Ontario jackpots. Between 1999 and 2005, there were 5,713 "major" winners of prizes valued at CDN$50,000 or more. Rosenthal estimated that store owners and employees accounted for about CDN$22 million of the CDN$2.2 billion spent on Ontario lotteries during this period-that is to say, about CDN$1 out of every CDN$100 the OLG received. He reasoned that if store owners and employees were no luckier than everyone else, they would have won 1 out of every 100 major prizes, meaning about 57 of the 5,713 wins. But as shown in Figure 5-1, by the time Rosenthal completed his count, store insiders had actually struck gold more than 200 times! Either we had to believe that LaPlante and other store owners were blessed with extraordinary luck, or we might suspect foul play. Rosenthal was convinced of the latter.
"The fifth estate," a television program of the Canadian Broadcasting Corporation (CBC), broke the scandal on October 25, 2006, when it told the story of Bob Edmonds, the eighty-two-year-old senior citizen and bona fide winner of that Encore lottery. The winning numbers were a combination of birthdays: his own, his wife's, and his son's. The CBC hired Rosenthal to be the expert witness. He announced that the odds of lottery store insiders taking 200 of the OLG's 5,713 major prizes by luck alone by luck alone were one in a quindecillion. (That's the number 1 followed by forty-eight zeros.) No reasonable person could believe in such luck. On the strength of Rosenthal's statistical a.n.a.lysis, the sorry story expanded beyond Phyllis LaPlante; up to 140 store insider wins now appeared highly dubious, and complaints deluged the OLG. were one in a quindecillion. (That's the number 1 followed by forty-eight zeros.) No reasonable person could believe in such luck. On the strength of Rosenthal's statistical a.n.a.lysis, the sorry story expanded beyond Phyllis LaPlante; up to 140 store insider wins now appeared highly dubious, and complaints deluged the OLG.
Figure 5-1 Expected Wins Versus Actual Wins by Retailer Insiders, 19992005: Evidence of Foul Play Expected Wins Versus Actual Wins by Retailer Insiders, 19992005: Evidence of Foul Play [image]
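The logic of the test can be sketched in a few lines. The code below is not Rosenthal's actual analysis; it is a simplified stand-in that assumes every major prize independently goes to an insider with probability 1 in 100 (the rounded spending share from the text) and asks how likely 200 or more insider wins would be under that assumption. The function names are invented here. Because the inputs are rounded, the result need not match the quoted one-in-a-quindecillion figure exactly, but it comes out astronomically small in the same way.

```python
import math


def binom_log_pmf(k, n, p):
    """Natural log of the binomial probability of exactly k successes."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))


def binom_upper_tail(k_min, n, p):
    """P(X >= k_min) for X ~ Binomial(n, p), summed term by term.

    Working in logs avoids underflow in p**k; terms too small for a float
    simply contribute 0.0.
    """
    return sum(math.exp(binom_log_pmf(k, n, p)) for k in range(k_min, n + 1))


major_wins = 5713      # major prizes awarded, 1999-2005 (from the text)
insider_share = 0.01   # about CDN$1 of every CDN$100 spent (from the text)
observed_wins = 200    # insider wins, rounded down from "more than 200"

expected_wins = major_wins * insider_share
luck_probability = binom_upper_tail(observed_wins, major_wins, insider_share)

print(f"Expected insider wins under fair play: {expected_wins:.0f}")
print(f"Chance of {observed_wins}+ insider wins by luck: {luck_probability:.1e}")
```

When the chance-only explanation is this improbable, the statistician rejects it and looks for another cause, which is the same reasoning the anxious flyers applied to the Nantucket crashes.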
You may be wondering just how the LaPlantes produced those old lottery tickets matching Edmonds's regular numbers. It was an old-fashioned con game, preying on a genial old man. When Bob Edmonds handed his ticket to the clerk on his lucky day, the scanning machine buzzed twice, indicating a big win. Distrusting his own ears, Edmonds instead believed LaPlante when she told him he had won the lesser prize of a free ticket. When LaPlante reported her "win" to OLG, she was clued in about the automatic insider-win investigation. The next day, her husband called Edmonds to the store, and they peppered him with questions. They learned that the winning numbers were his regular numbers and even obtained some of Edmonds's old losing tickets. Edmonds probably did not think expired, losing tickets would be valuable to anyone. He thought he was friends with the clerks at the corner store, but he was wrong. In an interview, he even suggested LaPlante might have been coming on to him that day! He was definitely wrong there. He realized his mistakes when the local newspaper reported that the LaPlantes were lucky winners of the Encore lottery, and he immediately filed a complaint with OLG, which eventually led to the CBC's investigation.
The story had a happy ending, though. The investigation by "the fifth estate" unleashed a maelstrom of controversy. Ontario's premier, Dalton McGuinty, was particularly alarmed because in Canada, proceeds from lotteries support the provincial budgets. In Ontario, this sum amounted to CDN$650 million in 2003-2004. About CDN$30 out of every CDN$100 spent on lotteries makes its way to the government coffers (while only CDN$54 out of every CDN$100 is paid out in prize money, ensuring that on average, the house always wins handily). A collapse in public trust in these lotteries could seriously affect Ontario's health care, education, and infrastructure. Therefore, McGuinty directed the provincial ombudsman to investigate the OLG's handling of complaints from customers. Put on the defensive, the OLG belatedly apologized to and recompensed Edmonds, released him from a prior gag order, and announced tighter regulations on insider players. The LaPlantes were sued for fraud but settled out of court after surrendering CDN$150,000 to Edmonds.