# How Statistics Turned a Harmless Nurse Into a Vicious Killer

Let’s do a thought experiment. Suppose you have 2 million coins at hand and a machine that will flip them all at the same time. After twenty flips, you evaluate and you come across one particular coin that showed heads twenty times in a row. Suspicious? Alarming? Is there something wrong with this coin? Let’s dig deeper. How likely is it that a coin shows heads twenty times in a row? Luckily, that’s not so hard to compute. For each flip there’s a 0.5 probability that the coin shows heads and the chance of seeing this twenty times in a row is just 0.5^20 = 0.000001 (rounded). So the odds of this happening are incredibly low. Indeed we stumbled across a very suspicious coin. Deep down I always knew there was something up with this coin. He just had this “crazy flip”, you know what I mean? Guilty as charged and end of story.

Not quite, you say? You are right. After all, we flipped 2 million coins. If the odds of twenty heads in a row are 0.000001, we should expect 0.000001 * 2,000,000 = 2 coins to show this unlikely string. It would be much more surprising not to find this string among the large number of trials. Suddenly, the coin with the supposedly “crazy flip” doesn’t seem so guilty anymore.
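For anyone who wants to verify the arithmetic, here is a minimal sanity check in Python:

```python
# Sanity check of the coin-flip arithmetic (a minimal sketch).
p_streak = 0.5 ** 20            # probability of 20 heads in a row for one coin
n_coins = 2_000_000             # number of coins flipped simultaneously
expected = n_coins * p_streak   # expected number of coins showing the streak

print(p_streak)   # ≈ 0.000001 (rounded)
print(expected)   # ≈ 1.9, so about 2 coins
```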

What’s the point of all this? Recently, I came across the case of Lucia De Berk, a Dutch nurse who was accused of murdering patients in 2003. Over the course of one year, seven of her patients had died and a “sharp” medical expert concluded that there was only a 1 in 342 million chance of this happening. This number and some other pieces of “evidence” (among them, her “odd” diary entries and her “obsession” with Tarot cards) led the court in The Hague to conclude that she must be guilty as charged, end of story.

Not quite, you say? You are right. In 2010 came the not guilty verdict. Turns out (funny story) she never committed any murder; she was just a harmless nurse who was transformed into a vicious killer by faulty statistics. Let’s go back to the thought experiment for a moment, imperfect for this case though it may be. Imagine that each coin represents a nurse and each flip a month of duty. It is estimated that there are around 300,000 hospitals worldwide, so we are talking about a lot of nurses/coins doing a lot of work/flips. Should we become suspicious when seeing a string of several deaths for a particular nurse? No, of course not. By pure chance, this will occur. It would be much more surprising not to find a nurse with a “suspicious” string of deaths among this large number of nurses. Focusing in on one nurse only blurs the big picture.

And, leaving statistics behind, the case also goes to show that you can always find something “odd” about a person if you want to. Faced with new information, even if not reliable, you interpret the present and past behavior in a “new light”. The “odd” diary entries, the “obsession” with Tarot cards … weren’t the signs always there?

Be careful when you judge. Benjamin Franklin once said he would consider himself lucky if he was right 50 % of the time. And that’s a genius talking, so I don’t even want to know my stats …

# Statistics: The Multiplication Rule Gently Explained

Multiplication is a surprisingly powerful tool in statistics. It enables us to solve a vast number of problems with relative ease. One thing to remember though is that the multiplication rule, to which I’ll get in a bit, only works for independent events. So let’s talk about those first.

When we roll a die, there’s a certain probability that the number six will show. This probability does not depend on what number we rolled before. The events “rolling a three” and “rolling a six” are independent in the sense that the occurrence of the one event does not affect the probability of the other.

Let’s look at a card deck. We draw a card and note it. Afterward, we put it back in the deck and mix the cards. Then we draw another one. Does the event “draw an ace” in the first try affect the event “draw a king” in the second try? It does not, because we put the ace back in the deck and mixed the cards. We basically reset our experiment. In such a case, the events “draw an ace” and “draw a king” are independent.

But what if we don’t put the first card back in the deck? Well, when we take the ace out of the deck, the chance of drawing a king will increase from 4 / 52 (4 kings out of 52 cards) to 4 / 51 (4 kings out of 51 cards). If we don’t do the reset, the events “draw an ace” and “draw a king” are in fact dependent. The occurrence of one changes the probability for the other.

With this in mind, we can turn to our powerful tool called multiplication rule. We start with two independent events, A and B. The probabilities for their occurrence are respectively p(A) and p(B). The multiplication rule states that the probability of both events occurring is simply the product of the probabilities p(A) and p(B). In mathematical terms:

p(A and B) = p(A) · p(B).

A quick look at the die will make this clear. Let’s take both A and B to be the event “rolling a six”. Obviously they are independent: rolling a six on one try will not change the probability of rolling a six in the following try. So we are allowed to use the multiplication rule here. The probability of rolling a six is 1/6, so p(A) = p(B) = 1/6. Using the multiplication rule, we can calculate the chance of rolling two sixes in a row: p(A and B) = 1/6 · 1/6 = 1/36. Note that if we took A to be “rolling a six” and B to be “rolling a three”, we would arrive at the same result. The chance of rolling two sixes in a row is the same as rolling a six and then a three.

Can we also use this on the deck of cards, even if we don’t reset the experiment? Indeed we can. But we have to take into account that the probabilities change as we go along. In more abstract terms, instead of looking at the general events “draw an ace” and “draw a king”, we need to look at the events A = “draw an ace in the first try” and B = “draw a king in the second try, with one ace missing from the deck”. With the order of the events clearly set, there’s no chance of them interfering. The occurrence of both events, first drawing an ace and then drawing a king with the ace missing, has the probability: p(A and B) = p(A) · p(B) = 4/52 · 4/51 = 16/2652, which is about 1 in 165 or 0.6 %.
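Here is a short Python sketch that reproduces both calculations, using fractions to avoid any rounding:

```python
from fractions import Fraction   # exact rational arithmetic, no rounding

# Independent events: two die rolls
p_two_sixes = Fraction(1, 6) * Fraction(1, 6)        # 1/36

# Dependent events: ace first, then king without putting the ace back
p_ace_then_king = Fraction(4, 52) * Fraction(4, 51)  # 16/2652 = 4/663

print(p_two_sixes)              # 1/36
print(p_ace_then_king)          # 4/663
print(float(p_ace_then_king))   # ≈ 0.006, i.e. about 0.6 %
```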

For examples on how to apply the multiplication rule check out Multiple Choice Tests and Monkeys on Typewriters.

# Physics (And The Formula That Got Me Hooked)

A long time ago, in my teen years, this was the formula that got me hooked on physics. Why? I can’t say for sure. I guess I was very surprised that you could calculate something like this so easily. So with some nostalgia, I present another great formula from the field of physics. It will be a continuation of and a last section on energy.

To heat something, you need a certain amount of energy E (in J). How much exactly? To compute this we require three inputs: the mass m (in kg) of the object we want to heat, the temperature difference T (in °C) between initial and final state and the so-called specific heat c (in J per kg °C) of the material that is heated. The relationship is quite simple:

E = c · m · T

If you double any of the input quantities, the energy required for heating will double as well. A very helpful addition to problems involving heating is this formula:

E = P · t

with P (in watt = W = J/s) being the power of the device that delivers heat and t (in s) the duration of the heat delivery.

———————

The specific heat of water is c = 4200 J per kg °C. How much energy do you need to heat m = 1 kg of water from room temperature (20 °C) to its boiling point (100 °C)? Note that the temperature difference between initial and final state is T = 80 °C. So we have all the quantities we need.

E = 4200 · 1 · 80 = 336,000 J

Additional question: How long will it take a water heater with an output of 2000 W to accomplish this? Let’s set up an equation for this using the second formula:

336,000 = 2000 · t

t = 168 s ≈ 3 minutes
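Both formulas translate directly into a few lines of Python; this sketch reproduces the water-heating numbers:

```python
c = 4200    # specific heat of water, J per kg °C
m = 1       # mass of water, kg
T = 80      # temperature difference, °C (from 20 °C to 100 °C)

E = c * m * T   # energy needed: E = c · m · T, in J
print(E)        # 336000 J

P = 2000        # power of the water heater, W
t = E / P       # duration of heating: t = E / P, in s
print(t)        # 168 s, close to 3 minutes
```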

———————

We put m = 1 kg of water (c = 4200 J per kg °C) in one container and m = 1 kg of sand (c = 290 J per kg °C) in another next to it. This will serve as an artificial beach. Using a heater we add 10,000 J of heat to each container. By what temperature will the water and the sand be raised?

Let’s turn to the water. From the given data and the great formula we can set up this equation:

10,000 = 4200 · 1 · T

T ≈ 2.4 °C

So the water temperature will be raised by 2.4 °C. What about the sand? It also receives 10,000 J.

10,000 = 290 · 1 · T

T ≈ 34.5 °C
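Rearranging E = c · m · T for T handles both containers in one short loop (a sketch with the numbers from above):

```python
E = 10_000   # heat added to each container, J
m = 1        # mass in each container, kg

rise = {}
for material, c in [("water", 4200), ("sand", 290)]:
    rise[material] = E / (c * m)   # T = E / (c · m)
    print(material, round(rise[material], 1), "°C")
```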

So sand (or any ground in general) heats up much more strongly than water. In other words: the temperature of the ground reacts quite strongly to changes in energy input, while water is rather sluggish. This explains why the climate near oceans is milder than inland, that is, why the summers are less hot and the winters less cold. The water efficiently dampens the changes in temperature.

It also explains the land-sea-breeze phenomenon (seen in the image below). During the day, the sun’s energy will cause the ground to be hotter than the water. The air above the ground rises, leading to cooler air flowing from the ocean to the land. At night, due to the lack of the sun’s power, the situation reverses. The ground cools off quickly and now it’s the air above the water that rises.

I hope this formula got you hooked as well. It’s simple, useful and can explain quite a lot of physics at the same time. It doesn’t get any better than this. Now it’s time to leave the concept of energy and turn to other topics.

This was an excerpt from my Kindle ebook: Great Formulas Explained – Physics, Mathematics, Economics. For another interesting physics quickie, check out: Intensity (or: How Much Power Will Burst Your Eardrums?).

# Physics: Free Fall and Terminal Velocity

After a while of free fall, any object will reach and maintain a terminal velocity. To calculate it, we need several inputs.

The necessary quantities are: the mass of the object m (in kg), the gravitational acceleration g (in m/s²), the density of air D (in kg/m³), the projected area of the object A (in m²) and the drag coefficient c (dimensionless). The latter two quantities need some explaining.

The projected area is the largest cross-section in the direction of fall. You can think of it as the shadow of the object on the ground when the sun’s rays hit the ground at a ninety degree angle. For example, if the falling object is a sphere, the projected area will be a circle with the same radius.

The drag coefficient is a dimensionless number that depends in a very complex way on the geometry of the object. There’s no simple way to compute it; usually it is determined in a wind tunnel. However, you can find the drag coefficients for common shapes in the picture below. Now that we know all the inputs, let’s look at the formula for the terminal velocity v (in m/s). It is valid for objects dropped from such a great height that they manage to reach this limiting value, which is basically a result of the air resistance canceling out gravity.

v = √(2 · m · g / (c · D · A))

Let’s do an example.

Skydivers are in free fall after leaving the plane, but soon reach the terminal velocity. We will set the mass to m = 75 kg, g = 9.81 (as usual) and D = 1.2 kg/m³. In a head-first position the skydiver has a drag coefficient of c = 0.8 and a projected area A = 0.3 m². What is the terminal velocity of the skydiver?

v = √(2 · 75 · 9.81 / (0.8 · 1.2 · 0.3))

v ≈ 70 m/s ≈ 260 km/h ≈ 160 mph
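As a sketch, here is the formula with the skydiver’s numbers plugged in:

```python
from math import sqrt

def terminal_velocity(m, g, c, D, A):
    """Terminal velocity v = sqrt(2*m*g / (c*D*A)), all quantities in SI units."""
    return sqrt(2 * m * g / (c * D * A))

v = terminal_velocity(m=75, g=9.81, c=0.8, D=1.2, A=0.3)
print(v)         # ≈ 71.5 m/s, i.e. roughly 70 m/s
print(v * 3.6)   # ≈ 257 km/h, roughly 260 km/h
```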

Let’s take a look at how changing the inputs affects the terminal velocity. Two bullet points will be sufficient here:

• If you quadruple the mass (or the gravitational acceleration), the terminal velocity doubles. So a very heavy skydiver or a regular skydiver on a massive planet would fall much faster.
• If you quadruple the drag coefficient (or the density or the projected area), the terminal velocity halves. This is why parachutes work. They have a higher drag coefficient and larger area, thus effectively reducing the terminal velocity.

This was an excerpt from the Kindle ebook: Great Formulas Explained – Physics, Mathematics, Economics. Check out my BEST OF for more interesting physics articles.

# Statistics and Monkeys on Typewriters

Here are the first two sentences of the prologue to Shakespeare’s Romeo and Juliet:

Two households, both alike in dignity,
In fair Verona, where we lay our scene

This excerpt has 77 characters. Now we let a monkey start typing random letters on a typewriter. Once he typed 77 characters, we change the sheet and let him start over. How many tries does he need to randomly reproduce the above paragraph?

There are 26 letters in the English alphabet and since he’ll be needing the comma and space, we’ll include those as well. So there’s a 1/28 chance of getting the first character right. Same goes for the second character, third character, etc … Because he’s typing randomly, the chance of getting a character right is independent of what preceded it. So we can just start multiplying:

p(reproduce) = 1/28 · 1/28 · … · 1/28 = (1/28)^77

The result is about 4 times ten to the power of -112. This is a ridiculously small chance! Even if he was able to complete one quadrillion tries per millisecond, it would most likely take him considerably longer than the estimated age of the universe to reproduce these two sentences.

Now what about the first word? It has only three letters, so he should be able to get at least this part in a short time. The chance of randomly reproducing the word “two” is:

p(reproduce) = 1/26 · 1/26 · 1/26 = (1/26)^3

Note that I dropped the comma and space as a choice, so now there’s a 1 in 26 chance to get a character right. The result is 5.7 times ten to the power of -5, which is about a 1 in 17500 chance. Even a slower monkey could easily get that done within a year, but I guess it’s still best to stick to human writers.
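Both monkey probabilities are one-liners in Python:

```python
# 77 characters, 28 possible symbols (26 letters plus comma and space)
p_excerpt = (1 / 28) ** 77
print(p_excerpt)          # ≈ 4e-112

# the word "Two": 3 characters, 26 letters only
p_word = (1 / 26) ** 3
print(p_word)             # ≈ 5.7e-05
print(round(1 / p_word))  # 17576, i.e. about 1 in 17,500
```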

This was an excerpt from the ebook “Statistical Snacks”. Liked the excerpt? Get the book here: http://www.amazon.com/Statistical-Snacks-ebook/dp/B00DWJZ9Z2. Want more excerpts? Check out The Probability of Becoming a Homicide Victim and Missile Accuracy (CEP).

# My Fair Game – How To Use the Expected Value

You meet a nice man on the street offering you a game of dice. For a wager of just 2 \$, you can win 8 \$ when the die shows a six. Sounds good? Let’s say you join in and play 30 rounds. What will be your expected balance after that?

You roll a six with the probability p = 1/6. So of the 30 rounds, you can expect to win 1/6 · 30 = 5, resulting in a pay-out of 40 \$. But winning 5 rounds of course also means that you lost the remaining 25 rounds, resulting in a loss of 50 \$. Your expected balance after 30 rounds is thus -10 \$. Or in other words: for the player this game results in a loss of 1/3 \$ per round.

Let’s make a general formula for just this case. We are offered a game which we win with a probability of p. The pay-out in case of victory is P, the wager is W. We play this game for a number of n rounds.

The expected number of wins is p·n, so the total pay-out will be: p·n·P. The expected number of losses is (1-p)·n, so we will most likely lose this amount of money: (1-p)·n·W.

Now we can set up the formula for the balance. We simply subtract the losses from the pay-out. But while we’re at it, let’s divide both sides by n to get the balance per round. It already includes all the information we need and requires one less variable.

B = p · P – (1-p) · W

This is what we can expect to win (or lose) per round. Let’s check it by using the above example. We had the winning chance p = 1/6, the pay-out P = 8 \$ and the wager W = 2 \$. So from the formula we get this balance per round:

B = 1/6 · 8 \$ – 5/6 · 2 \$ = – 1/3 \$ per round

Just as we expected. Let’s try another example. I’ll offer you a dice game. If you roll two sixes in a row, you get P = 175 \$. The wager is W = 5 \$. Quite the deal, isn’t it? Let’s see. Rolling two sixes in a row occurs with a probability of p = 1/36. So the expected balance per round is:

B = 1/36 · 175 \$ – 35/36 · 5 \$ = 0 \$ per round

I offered you a truly fair game. No one can be expected to lose in the long run. Of course if we only play a few rounds, somebody will win and somebody will lose.
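The balance formula is easy to wrap in a small function and check against both games (a minimal sketch):

```python
def balance_per_round(p, P, W):
    """Expected balance per round: B = p*P - (1-p)*W."""
    return p * P - (1 - p) * W

print(balance_per_round(1/6, 8, 2))     # ≈ -0.33, the street game
print(balance_per_round(1/36, 175, 5))  # ≈ 0.00, the fair game
```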

It’s helpful to understand this balance as being sound for a large number of rounds but rather fragile in case of playing only a few rounds. Casinos are host to thousands of rounds per day and thus can predict their gains quite accurately from the balance per round. After a lot of rounds, all the random streaks and significant one-time events hardly impact the total balance anymore. The real balance will converge to the theoretical balance more and more as the number of rounds grows. This is mathematically proven by the Law of Large Numbers. Assuming finite variance, the proof can be done elegantly using Chebyshev’s Inequality.

The convergence can be easily demonstrated using a computer simulation. We will let the computer, equipped with random numbers, run our dice game for 2000 rounds. After each round the computer calculates the balance per round so far. The below picture shows the difference between the simulated balance per round and our theoretical result of – 1/3 \$ per round. (Liked the excerpt? Get the book “Statistical Snacks” by Metin Bektas here: http://www.amazon.com/Statistical-Snacks-ebook/dp/B00DWJZ9Z2)
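For readers who want to reproduce the experiment, here is a minimal simulation sketch (the exact outcome depends on the random seed):

```python
import random

random.seed(1)   # fixed seed so the run is reproducible

wager, payout = 2, 8   # the street game: 2 $ stake, 8 $ on a six
n_rounds = 2000
balance = 0.0

for _ in range(n_rounds):
    if random.randint(1, 6) == 6:
        balance += payout   # win
    else:
        balance -= wager    # loss

avg = balance / n_rounds
print(avg)   # should land near the theoretical -1/3 $ per round
```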

# The Probability of Becoming a Homicide Victim

Each year in the US there are about 5 homicides per 100000 people, so the probability of falling victim to a homicide in a given year is 0.00005 or 1 in 20000. What are the chances of falling victim to a homicide over a lifespan of 70 years?

Let’s approach this the other way around. The chance of not becoming a homicide victim during one year is p = 0.99995. Using the multiplication rule we can calculate the probability of this event occurring 70 times in a row:

p = 0.99995 · … · 0.99995 = 0.99995^70

Thus the odds of not becoming a homicide victim over the course of 70 years are 0.9965. This of course also means that there’s a 1 – 0.9965 = 0.0035, or 1 in 285, chance of falling victim to a homicide during a life span. In other words: two victims in every jumbo jet full of people. How does this compare to other countries?

In Germany, the homicide rate is about 0.8 per 100000 people. Doing the same calculation gives us a 1 in 1800 chance of becoming a murder victim, so statistically speaking there’s one victim per small city. At the other end of the scale is Honduras with 92 homicides per 100000 people, which translates into a saddening 1 in 16 chance of becoming a homicide victim over the course of a life and is basically one victim in every family.

It can get even worse if you live in a particularly crime-ridden part of a country. The homicide rate for the city San Pedro Sula in Honduras is about 160 per 100000 people. If this remained constant over time and you never left the city, you’d have a 1 in 9 chance of having your life cut short in a homicide.
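The survival-probability trick works for any rate; here is a sketch (small differences from the figures above are rounding):

```python
def lifetime_risk(rate_per_100k, years=70):
    """Chance of falling victim at least once over `years`,
    assuming a constant annual homicide rate."""
    p_year = rate_per_100k / 100_000         # annual victimization probability
    return 1 - (1 - p_year) ** years         # 1 minus the survival probability

for place, rate in [("US", 5), ("Germany", 0.8), ("Honduras", 92)]:
    r = lifetime_risk(rate)
    print(place, f"1 in {round(1 / r)}")
```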

Liked the excerpt? Get the book “Statistical Snacks” by Metin Bektas here: http://www.amazon.com/Statistical-Snacks-ebook/dp/B00DWJZ9Z2. For more excerpts check out Missile Accuracy (CEP), Immigrants and Crime and Monkeys on Typewriters.