Rule

How Statistics Turned a Harmless Nurse Into a Vicious Killer

Let’s do a thought experiment. Suppose you have 2 million coins at hand and a machine that will flip them all at the same time. After twenty flips, you evaluate and you come across one particular coin that showed heads twenty times in a row. Suspicious? Alarming? Is there something wrong with this coin? Let’s dig deeper. How likely is it that a coin shows heads twenty times in a row? Luckily, that’s not so hard to compute. For each flip there’s a 0.5 probability that the coin shows heads and the chance of seeing this twenty times in a row is just 0.5^20 = 0.000001 (rounded). So the odds of this happening are incredibly low. Indeed we stumbled across a very suspicious coin. Deep down I always knew there was something up with this coin. He just had this “crazy flip”, you know what I mean? Guilty as charged and end of story.

Not quite, you say? You are right. After all, we flipped 2 million coins. If the odds of twenty heads in a row are 0.000001, we should expect 0.000001 * 2,000,000 = 2 coins to show this unlikely string. It would be much more surprising not to find this string among the large number of trials. Suddenly, the coin with the supposedly “crazy flip” doesn’t seem so guilty anymore.

What’s the point of all this? Recently, I came across the case of Lucia De Berk, a dutch nurse who was accused of murdering patients in 2003. Over the course of one year, seven of her patients had died and a “sharp” medical expert concluded that there was only a 1 in 342 million chance of this happening. This number and some other pieces of “evidence” (among them, her “odd” diary entries and her “obsession” with Tarot cards) led the court in The Hague to conclude that she must be guilty as charged, end of story.

Not quite, you say? You are right. In 2010 came the not guilty verdict. Turns out (funny story), she never commited any murder, she was just a harmless nurse that was transformed into vicious killer by faulty statistics. Let’s go back to the thought experiment for a moment, imperfect for this case though it may be. Imagine that each coin represents a nurse and each flip a month of duty. It is estimated that there are around 300,000 hospitals worldwide, so we are talking about a lot of nurses/coins doing a lot of work/flips. Should we become suspicious when seeing a string of several deaths for a particular nurse? No, of course not. By pure chance, this will occur. It would be much more surprising not to find a nurse with a “suspicious” string of deaths among this large number of nurses. Focusing in on one nurse only blurs the big picture.

And, leaving statistics behind, the case also goes to show that you can always find something “odd” about a person if you want to. Faced with new information, even if not reliable, you interpret the present and past behavior in a “new light”. The “odd” diary entries, the “obsession” with Tarot cards … weren’t the signs always there?

Be careful to judge. Benjamin Franklin once said he should consider himself lucky if he’s right 50 % of the time. And that’s a genius talking, so I don’t even want to know my stats …

Statistics: The Multiplication Rule Gently Explained

Multiplication is a surprisingly powerful tool in statistics. It enables us to solve a vast amount of problems with relative ease. One thing to remember though is that the multiplication rule, to which I’ll get in a bit, only works for independent events. So let’s talk about those first.

When we roll a dice, there’s a certain probability that the number six will show. This probability does not depend on what number we rolled before. The events “rolling a three” and “rolling a six” are independent in the sense, that the occurrence of the one event does not affect the probability for the other.

Let’s look at a card deck. We draw a card and note it. Afterward, we put it back in the deck and mix the cards. Then we draw another one. Does the event “draw an ace” in the first try affect the event “draw a king” in the second try? It does not, because we put the ace back in the deck and mixed the cards. We basically reset our experiment. In such a case, the events “draw an ace” and “draw a king” are independent.

But what if we don’t put the first card back in the deck? Well, when we take the ace out of the deck, the chance of drawing a king will increase from 4 / 52 (4 kings out of 52 cards) to 4 / 51 (4 kings out of 51 cards). If we don’t do the reset, the events “draw an ace” and “draw a king” are in fact dependent. The occurrence of one changes the probability for the other.

With this in mind, we can turn to our powerful tool called multiplication rule. We start with two independent events, A and B. The probabilities for their occurrence are respectively p(A) and p(B). The multiplication rule states that the probability of both events occurring is simply the product of the probabilities p(A) and p(B). In mathematical terms:

p(A and B) = p(A) · p(B).

A quick look at the dice will make this clear. Let’s take both A and B to be the event “rolling a six”. Obviously they are independent, rolling a six on one try will not change the probability of rolling a six in the following try. So we are allowed to use the multiplication rule here. The probability of rolling a six is 1/6, so p(A) = p(B) = 1/6. Using the multiplication rule, we can calculate the chance of rolling two six in a row: p(A and B) = 1/6 · 1/6 = 1/36. Note that if we took A to be “rolling a six” and B to be “rolling a three”, we would arrive at the same result. The chance of rolling two six in a row is the same as rolling a six and then a three.

 Can we also use this on the deck of cards, even if we don’t reset the experiment? Indeed we can. But we have to take into account that the probabilities change as we go along. In more abstract terms, instead of looking at the general events “draw an ace” and “draw a king”, we need to look at the events A = “draw an ace in the first try” and B = “draw a king with one ace missing”. With the order of the events clearly set, there’s no chance of them interfering. The occurrence of both events, first drawing an ace and then drawing a king with the ace missing, has the probability: p(A and B) = p(A) · p(B) = 4/52 · 4/51 = 16/2652 or 1 in about 165 or 0.6 %.

For examples on how to apply the multiplication rule check out Multiple Choice Tests and Monkeys on Typewriters.

Statistics and Monkeys on Typewriters

Here are the first two sentences of the prologue to Shakespeare’s Romeo and Juliet:

Two households, both alike in dignity,
In fair Verona, where we lay our scene

This excerpt has 77 characters. Now we let a monkey start typing random letters on a typewriter. Once he typed 77 characters, we change the sheet and let him start over. How many tries does he need to randomly reproduce the above paragraph?

There are 26 letters in the English alphabet and since he’ll be needing the comma and space, we’ll include those as well. So there’s a 1/28 chance of getting the first character right. Same goes for the second character, third character, etc … Because he’s typing randomly, the chance of getting a character right is independent of what preceded it. So we can just start multiplying:

p(reproduce) = 1/28 · 1/28 · … · 1/28 = (1/28)^77

The result is about 4 times ten to the power of -112. This is a ridiculously small chance! Even if he was able to complete one quadrillion tries per millisecond, it would most likely take him considerably longer than the estimated age of the universe to reproduce these two sentences.

Now what about the first word? It has only three letters, so he should be able to get at least this part in a short time. The chance of randomly reproducing the word “two” is:

p(reproduce) = 1/26 · 1/26 · 1/26 = (1/26)^3

Note that I dropped the comma and space as a choice, so now there’s a 1 in 26 chance to get a character right. The result is 5.7 times ten to the power of -5, which is about a 1 in 17500 chance. Even a slower monkey could easily get that done within a year, but I guess it’s still best to stick to human writers.

.This was an excerpt from the ebook “Statistical Snacks. Liked the excerpt? Get the book here: http://www.amazon.com/Statistical-Snacks-ebook/dp/B00DWJZ9Z2. Want more excerpts? Check out The Probability of Becoming a Homicide Victim and Missile Accuracy (CEP).

The Probability of Becoming a Homicide Victim

 Each year in the US there are about 5 homicides per 100000 people, so the probability of falling victim to a homicide in a given year is 0.00005 or 1 in 20000. What are the chances of falling victim to a homicide over a lifespan of 70 years?

 Let’s approach this the other way around. The chance of not becoming a homicide victim during one year is p = 0.99995. Using the multiplication rule we can calculate the probability of this event occurring 70 times in a row:

 p = 0.99995 · … · 0.99995 = 0.9999570

 Thus the odds of not becoming a homicide victim over the course of 70 years are 0.9965. This of course also means that there’s a 1 – 0.9965 = 0.0035, or 1 in 285, chance of falling victim to a homicide during a life span. In other words: two victims in every jumbo jet full of people. How does this compare to other countries?

 In Germany, the homicide rate is about 0.8 per 100000 people. Doing the same calculation gives us a 1 in 1800 chance of becoming a murder victim, so statistically speaking there’s one victim per small city. At the other end of the scale is Honduras with 92 homicides per 100000 people, which translates into a saddening 1 in 16 chance of becoming a homicide victim over the course of a life and is basically one victim in every family.

 It can get even worse if you live in a particularly crime ridden part of a country. The homicide rate for the city San Pedro Sula in Honduras is about 160 per 100000 people. If this remained constant over time and you never left the city, you’d have a 1 in 9 chance of having your life cut short in a homicide.

Liked the excerpt? Get the book “Statistical Snacks” by Metin Bektas here: http://www.amazon.com/Statistical-Snacks-ebook/dp/B00DWJZ9Z2. For more excerpts check out Missile Accuracy (CEP), Immigrants and Crime and Monkeys on Typewriters.