Probability

How Statistics Turned a Harmless Nurse Into a Vicious Killer

Let’s do a thought experiment. Suppose you have 2 million coins at hand and a machine that will flip them all at the same time. After twenty flips, you evaluate and you come across one particular coin that showed heads twenty times in a row. Suspicious? Alarming? Is there something wrong with this coin? Let’s dig deeper. How likely is it that a coin shows heads twenty times in a row? Luckily, that’s not so hard to compute. For each flip there’s a 0.5 probability that the coin shows heads and the chance of seeing this twenty times in a row is just 0.5^20 = 0.000001 (rounded). So the odds of this happening are incredibly low. Indeed we stumbled across a very suspicious coin. Deep down I always knew there was something up with this coin. He just had this “crazy flip”, you know what I mean? Guilty as charged and end of story.

Not quite, you say? You are right. After all, we flipped 2 million coins. If the odds of twenty heads in a row are 0.000001, we should expect 0.000001 * 2,000,000 = 2 coins to show this unlikely string. It would be much more surprising not to find this string among the large number of trials. Suddenly, the coin with the supposedly “crazy flip” doesn’t seem so guilty anymore.
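These numbers are quick to verify (a minimal sketch in Python):

```python
# Probability that a fair coin shows heads 20 times in a row
p_streak = 0.5 ** 20

# Expected number of such coins among 2 million flipped coins
expected = p_streak * 2_000_000

print(p_streak)   # roughly 0.000001
print(expected)   # roughly 2
```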

What’s the point of all this? Recently, I came across the case of Lucia de Berk, a Dutch nurse who was accused of murdering patients in 2003. Over the course of one year, seven of her patients had died, and a “sharp” medical expert concluded that there was only a 1 in 342 million chance of this happening. This number and some other pieces of “evidence” (among them, her “odd” diary entries and her “obsession” with Tarot cards) led the court in The Hague to conclude that she must be guilty as charged, end of story.


Not quite, you say? You are right. In 2010 came the not guilty verdict. Turns out (funny story), she never committed any murder; she was just a harmless nurse who was transformed into a vicious killer by faulty statistics. Let’s go back to the thought experiment for a moment, imperfect for this case though it may be. Imagine that each coin represents a nurse and each flip a month of duty. It is estimated that there are around 300,000 hospitals worldwide, so we are talking about a lot of nurses/coins doing a lot of work/flips. Should we become suspicious when seeing a string of several deaths for a particular nurse? No, of course not. By pure chance, this will occur. It would be much more surprising not to find a nurse with a “suspicious” string of deaths among this large number of nurses. Focusing in on one nurse only blurs the big picture.

And, leaving statistics behind, the case also goes to show that you can always find something “odd” about a person if you want to. Faced with new information, even if not reliable, you interpret the present and past behavior in a “new light”. The “odd” diary entries, the “obsession” with Tarot cards … weren’t the signs always there?

So be careful when you judge. Benjamin Franklin once said he should consider himself lucky if he was right 50 % of the time. And that’s a genius talking, so I don’t even want to know my stats …

Two And A Half Fallacies (Statistics, Probability)

The field of statistics gives rise to a great number of fallacies (and intentional misuse for that matter). One of the most common is the Gambler’s Fallacy. It is the idea that an event can be “due” if it hasn’t appeared against all odds for quite some time.

In August 1913 an almost impossible string of events occurred in a casino in Monte Carlo. The roulette table showed black a record number of twenty-six times in a row. Since the chance for black on a single spin is about 0.474, the odds for this string are: 0.474^26 = 1 in about 270 million. For the casino, this was a lucky day. It profited greatly from players believing that once the table showed black several times in a row, the probability for another black to show up was impossibly slim. Red was due.

Unfortunately for the players, this logic failed. The chances for black remained at 0.474, no matter what colors appeared so far. Each spin is a complete reset of the game. The same goes for coins. No matter how many times a coin shows heads, the chance for this event will always stay 0.5. An unlikely string will not alter any probabilities if the events are truly independent.
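Both claims, the streak odds and the independence of each spin, can be checked with a short sketch (the 0.474 single-spin probability for black is the figure from above; the seed is arbitrary):

```python
import random

# Odds of 26 blacks in a row at p(black) = 0.474 per spin
p_streak = 0.474 ** 26
print(1 / p_streak)  # roughly 270 million

# Gambler's-fallacy check: after a run of blacks, black is just as likely.
random.seed(1)
after_streak = []
streak = 0  # consecutive blacks seen so far
for _ in range(1_000_000):
    black = random.random() < 0.474
    if streak >= 3:               # the previous three spins were all black
        after_streak.append(black)
    streak = streak + 1 if black else 0

print(sum(after_streak) / len(after_streak))  # close to 0.474, not lower
```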

Another common statistical fallacy is “correlation implies causation”. In countries with sound vaccination programmes, cancer rates are significantly elevated, whereas in countries where vaccination hardly takes place, only a few people suffer from cancer. This seems to be a clear case against vaccination: it correlates with (and thus surely somehow must cause) cancer.

However, taking a third variable and additional knowledge about cancer into account produces a very different picture. Cancer is a disease of old age. Because it requires a string of undesired mutations to take place, it is usually not found in young people. It is thus clear that in countries with a higher life expectancy, you will find higher cancer rates. This increased life expectancy is reached via the many different tools of health care, vaccination being an important one of them. So vaccination leads to a higher life expectancy, which in turn leads to elevated rates in diseases of old age (among which is cancer). The real story behind the correlation turned out to be quite different from what could be expected at first.

Another interesting correlation was found by the parody religion FSM (Flying Spaghetti Monster). Deducing causation here would be madness. Over the 18th and 19th centuries, piracy (the one with the boats, not the one with the files and the sharing) slowly died out. At the same time, possibly within a natural trend and/or for reasons of increased industrial activity, the global temperature started increasing. If you plot the number of pirates and the global temperature in a coordinate system, you find a relatively strong correlation between the two. The more pirates there are, the colder the planet is. Here’s the corresponding formula:

T = 16 – 0.05 · P^0.33

with T being the average global temperature and P the number of pirates. Given enough pirates (about 3.3 million to be specific), we could even freeze Earth.

[Figure: number of pirates plotted against the average global temperature]

But of course nobody in their right mind would see causality at work here; rather, we have two processes, the disappearance of piracy and global warming, that happened to occur at the same time. So you shouldn’t be too surprised that the recent rise of piracy in Somalia didn’t do anything to stop global warming.

As we saw, a correlation between quantities can arise in many ways and does not always imply causation. Sometimes there is a third, unseen variable in the line of causation; other times it’s two completely independent processes happening at the same time. So be careful when drawing conclusions.

Though not a fallacy in the strict sense, combinations of low probability and a high number of trials are also a common cause for incorrect conclusions. We computed that in roulette the odds of showing black twenty-six times in a row are only 1 in 270 million. We might conclude that it is basically impossible for this to happen anywhere.

But considering there are something on the order of 3500 casinos worldwide, each playing roughly 100 rounds of roulette per day, we get about 130 million rounds per year. With this large number of trials, it would be foolish not to expect a 1 in 270 million event to occur every now and then. So when faced with a low probability for an event, always take a look at the number of trials. Maybe it’s not as unlikely to happen as suggested by the odds.
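This back-of-the-envelope estimate can be sketched as follows (treating every round as a potential start of a 26-run is a simplification):

```python
import math

p_event = 0.474 ** 26                 # 26 blacks in a row, about 1 in 270 million
rounds_per_year = 3500 * 100 * 365    # roughly 130 million rounds worldwide
expected = p_event * rounds_per_year  # expected number of such streaks per year

# Modelling occurrences as Poisson, the chance of seeing at least one per year:
p_at_least_one = 1 - math.exp(-expected)

print(expected)        # roughly 0.5 streaks per year
print(p_at_least_one)  # a bit under 40 %
```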

Code Transmission and Probability

Not long ago, mankind sent its first rovers to Mars to analyze the planet and find out if it ever supported life. The nagging question “Are we alone?” drives us to penetrate deeper into space. A special challenge associated with such journeys is communication. There needs to be a constant flow of digital data, strings of ones and zeros, back and forth to ensure the success of the space mission.

During the process of transmission over the endless distances, errors can occur. There’s always a chance that zeros randomly turn into ones and vice versa. What can we do to make communication more reliable? One way is to send duplicates.

Instead of simply sending a 0, we send the string 00000. If not too many errors occur during the transmission, we can still decode it on arrival. For example, if it arrives as 00010, we can deduce that the originating string was with a high probability a 0 rather than a 1. The single transmission error that occurred did not cause us to incorrectly decode the string.

Assume that the probability of a transmission error is p and that we add to each 0 (or 1) four copies, as in the above paragraph. What is the chance of us being able to decode it correctly? To be able to decode 00000 on arrival correctly, we can’t have more than two transmission errors occurring. So during the n = 5 transmissions, k = 0, k = 1 and k = 2 errors are allowed. Using the binomial distribution we can compute the probability for each of these events:

p(0 errors) = C(5,0) · p^0 · (1-p)^5

p(1 error) = C(5,1) · p^1 · (1-p)^4

p(2 errors) = C(5,2) · p^2 · (1-p)^3

We can simplify these expressions somewhat. A binomial calculator provides us with these values: C(5,0) = 1, C(5,1) = 5 and C(5,2) = 10. This leads to:

p(0 errors) = (1-p)^5

p(1 error) = 5 · p · (1-p)^4

p(2 errors) = 10 · p^2 · (1-p)^3

Adding the probabilities for all these desired events tells us how likely it is that we can correctly decode the string.

p(success) = (1-p)^3 · ((1-p)^2 + 5·p·(1-p) + 10·p^2)

In the graph below you can see the plot of this function. The x-axis represents the transmission error probability p and the y-axis the chance of successfully decoding the string. For p = 10 % (1 in 10 bits arrive incorrectly) the odds of identifying the originating string are still a little more than 99 %. For p = 20 % (1 in 5 bits arrive incorrectly) this drops to about 94 %.

[Figure: probability of successful decoding plotted against the transmission error probability p]

The downside to this gain in accuracy is that the amount of data to be transmitted, and thus the time it takes for the transmission to complete, increases fivefold.
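The success probability derived above can be evaluated for any error rate p (a minimal sketch):

```python
from math import comb

def p_success(p, n=5, max_errors=2):
    """Chance of correctly decoding an n-fold repetition code when
    each bit flips independently with probability p and up to
    max_errors flips can be tolerated."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(max_errors + 1))

print(p_success(0.1))  # about 0.991
print(p_success(0.2))  # about 0.942
```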

Mathematical Model For (E-) Book Sales

It seems to be a no-brainer that with more books on the market, an author will see higher revenues. I wanted to know more about how the sales rate varies with the number of books. So I did what I always do when faced with an economic problem: construct a mathematical model. Even though it took me several tries to find the right approach, I’m fairly confident that the following model is able to explain why revenues grow overproportionally with the number of books an author has published. I also stumbled across a way to correct the marketing R/C for the number of books.

The basic quantities used are:

  • n = number of books
  • i = impressions per day
  • q = conversion probability (which is the probability that an impression results in a sale)
  • s = sales per buyer
  • r = daily sales rate

Obviously the basic relationship is:

r = i(n) * q(n) * s(n)

with the brackets indicating a dependence of the quantities on the number of books.

1) Let’s start with s(n) = sales per buyer. Suppose there’s a probability p that a buyer, who has purchased an author’s book, will go on to buy yet another book by said author. To visualize this, think of the books as a kind of mirror: each ray (sale) will either pass through the book (no further sales from this buyer) or be reflected onto another book by the author. In the latter case, the process repeats. Using this “reflective model”, the number of sales per buyer is:

s(n) = 1 + p + p² + … + p^(n-1) = (1 – p^n) / (1 – p)

For example, if the probability of a reader buying another book from the same author is p = 15 % = 0.15 and the author has n = 3 books available, we get:

s(3) = (1 – 0.15^3) / (1 – 0.15) ≈ 1.17 sales per buyer

So the number of sales per buyer increases with the number of books. However, it quickly reaches a limiting value. Letting n go to infinity results in:

s(∞) = 1 / (1 – p)

Hence, this effect is a source for overproportional growth only for the first few books. After that it turns into a constant factor.
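The geometric series and its limit, with the values from the example (a sketch):

```python
def sales_per_buyer(p, n):
    """Expected sales per buyer with n books available and
    repeat-purchase probability p."""
    return (1 - p**n) / (1 - p)

print(sales_per_buyer(0.15, 3))  # about 1.17
print(1 / (1 - 0.15))            # limiting value s(infinity), about 1.18
```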

2) Let’s turn to q(n) = conversion probability. Why should there be a dependence on the number of books at all for this quantity? Studies show that the probability of making a sale grows with the choice offered. That’s why ridiculously large malls work. When an author offers a large number of books, he is able to provide list impressions (featuring all of his books) in addition to the common single impressions (featuring only one book). With more choice, the conversion probability on list impressions will be higher than that on single impressions. The relevant quantities are:

  • qs = single impression conversion probability
  • ps = percentage of impressions that are single impressions
  • ql = list impression conversion probability
  • pl = percentage of impressions that are list impressions

with ps + pl = 1. The overall conversion probability will be:

q(n) = qs(n) * ps(n) + ql(n) * pl(n)

With ql(n) and pl(n) obviously growing with the number of books and ps(n) decreasing accordingly, we get an increase in the overall conversion probability.

3) Finally let’s look at i(n) = impressions per day. Denoting with i1, i2, … the number of daily impressions generated by book number 1, book number 2, …, the average number of impressions per day and book is:

ib = 1/n * ∑[k] ik

with ∑[k] meaning the sum over all k. The overall impressions per day are:

i(n) = ib(n) * n

Assuming all books generate the same number of daily impressions, this is linear growth. However, there might be an overproportional factor at work here. As an author keeps publishing, his experience in writing, editing and marketing will grow. Especially for initially inexperienced authors, the quality of the books and the marketing approach will improve with each book. Translated into numbers, this means that later books will generate more impressions per day:

i_(k+1) > i_k

which leads to an overproportional (instead of just linear) growth in overall impressions per day with the number of books. Note that more experience should also translate into a higher single impression conversion probability:

qs(n+1) > qs(n)

4) As a final treat, let’s look at how these effects impact the marketing R/C. The marketing R/C is the ratio of revenues that result from an ad divided by the costs of the ad:

R/C = Revenues / Costs

For an ad to be of worth to an author, this value should be greater than 1. Assume an ad generates iad single impressions in total. For one book we get the revenues:

R = iad * qs(1)

If more than one book is available, this number changes to:

R = iad * qs(n) * (1 – p^n) / (1 – p)

So if the R/C in the case of one book is (R/C)1, the corrected R/C for a larger number of books is:

R/C = (R/C)1 * qs(n) / qs(1) * (1 – p^n) / (1 – p)

In short: ads that aren’t profitable for a single book can become profitable as the author offers more books.
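Under this model, the correction to the one-book R/C can be sketched as follows; note that the qs(n)/qs(1) ratio of 1.1 used below is a made-up illustrative value:

```python
def corrected_rc(rc_one_book, p, n, qs_ratio=1.0):
    """R/C for n books, given the one-book R/C, repeat-purchase
    probability p, and the ratio qs(n)/qs(1) (assumed here)."""
    return rc_one_book * qs_ratio * (1 - p**n) / (1 - p)

# An ad that only breaks even with one book...
print(corrected_rc(1.0, 0.15, 1))                 # 1.0
# ...turns profitable once three books are available:
print(corrected_rc(1.0, 0.15, 3, qs_ratio=1.1))   # about 1.29
```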

For more mathematical modeling check out: Mathematics of Blog Traffic: Model and Tips for High Traffic.

Statistics: The Multiplication Rule Gently Explained

Multiplication is a surprisingly powerful tool in statistics. It enables us to solve a vast amount of problems with relative ease. One thing to remember though is that the multiplication rule, to which I’ll get in a bit, only works for independent events. So let’s talk about those first.

When we roll a die, there’s a certain probability that the number six will show. This probability does not depend on what number we rolled before. The events “rolling a three” and “rolling a six” are independent in the sense that the occurrence of one event does not affect the probability of the other.

Let’s look at a card deck. We draw a card and note it. Afterward, we put it back in the deck and mix the cards. Then we draw another one. Does the event “draw an ace” in the first try affect the event “draw a king” in the second try? It does not, because we put the ace back in the deck and mixed the cards. We basically reset our experiment. In such a case, the events “draw an ace” and “draw a king” are independent.

But what if we don’t put the first card back in the deck? Well, when we take the ace out of the deck, the chance of drawing a king will increase from 4 / 52 (4 kings out of 52 cards) to 4 / 51 (4 kings out of 51 cards). If we don’t do the reset, the events “draw an ace” and “draw a king” are in fact dependent. The occurrence of one changes the probability for the other.

With this in mind, we can turn to our powerful tool called multiplication rule. We start with two independent events, A and B. The probabilities for their occurrence are respectively p(A) and p(B). The multiplication rule states that the probability of both events occurring is simply the product of the probabilities p(A) and p(B). In mathematical terms:

p(A and B) = p(A) · p(B).

A quick look at dice will make this clear. Let’s take both A and B to be the event “rolling a six”. Obviously they are independent: rolling a six on one try will not change the probability of rolling a six on the following try. So we are allowed to use the multiplication rule here. The probability of rolling a six is 1/6, so p(A) = p(B) = 1/6. Using the multiplication rule, we can calculate the chance of rolling two sixes in a row: p(A and B) = 1/6 · 1/6 = 1/36. Note that if we took A to be “rolling a six” and B to be “rolling a three”, we would arrive at the same result. The chance of rolling two sixes in a row is the same as rolling a six and then a three.

 Can we also use this on the deck of cards, even if we don’t reset the experiment? Indeed we can. But we have to take into account that the probabilities change as we go along. In more abstract terms, instead of looking at the general events “draw an ace” and “draw a king”, we need to look at the events A = “draw an ace in the first try” and B = “draw a king with one ace missing”. With the order of the events clearly set, there’s no chance of them interfering. The occurrence of both events, first drawing an ace and then drawing a king with the ace missing, has the probability: p(A and B) = p(A) · p(B) = 4/52 · 4/51 = 16/2652 or 1 in about 165 or 0.6 %.
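Both calculations, dice and cards, reduce to a single multiplication (a sketch using exact fractions):

```python
from fractions import Fraction

# Two sixes in a row with a fair die
p_two_sixes = Fraction(1, 6) * Fraction(1, 6)
print(p_two_sixes)  # 1/36

# Ace first, then a king from the remaining 51 cards (no replacement)
p_ace_then_king = Fraction(4, 52) * Fraction(4, 51)
print(p_ace_then_king, float(p_ace_then_king))  # 4/663, about 0.006
```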

For examples on how to apply the multiplication rule check out Multiple Choice Tests and Monkeys on Typewriters.

The Standard Error – What it is and how it’s used

I smoke electronic cigarettes and recently I wanted to find out how much nicotine liquid I consume per day. I noted the amount used on five consecutive days:

3 ml, 3.4 ml, 7.2 ml, 3.7 ml, 4.3 ml

So how much do I use per day? Well, our best guess is to take the average, that is, sum all the amounts and divide by the number of measurements:

(3 ml + 3.4 ml + 7.2 ml + 3.7 ml + 4.3 ml) / 5 = 4.3 ml

Most people would stop here. However, there’s one very important piece of information missing: how accurate is that result? Surely an average value of 4.3 ml computed from 100 measurements is much more reliable than the same average computed from 5 measurements. Here’s where the standard error comes in and thanks to the internet, calculating it couldn’t be easier. You can type in the measurements here to get the standard error:

http://www.miniwebtool.com/standard-error-calculator/

It tells us that the standard error (of the mean, to be pedantically precise) of my five measurements is SEM = 0.75. This number is extremely useful because there’s a rule in statistics that states that with a 95 % probability, the true average lies within two standard errors of the computed average. For us this means that there’s a 95 % chance, which you could call beyond reasonable doubt, that the true average of my daily liquid consumption lies in this interval:

4.3 ml ± 1.5 ml

or between 2.8 and 5.8 ml. So the computed average is not very accurate. Note that as long as the standard deviation remains more or less constant as further measurements come in, the standard error is inversely proportional to the square root of the number of measurements. In simpler terms: if you quadruple the number of measurements, the size of the error interval halves. With 20 instead of only 5 measurements, we should be able to achieve plus/minus 0.75 accuracy.
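Instead of the online calculator, the same numbers can be computed directly (a sketch using Python’s statistics module):

```python
from statistics import mean, stdev
from math import sqrt

data = [3, 3.4, 7.2, 3.7, 4.3]  # daily liquid consumption in ml

avg = mean(data)
sem = stdev(data) / sqrt(len(data))  # standard error of the mean

print(round(avg, 1))  # 4.3
print(round(sem, 2))  # 0.75
print(f"{avg:.1f} ml ± {2 * sem:.1f} ml")  # the 95 % interval
```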

So when you have an average value to report, be sure to include the error interval. Your result is much more informative this way, and with the help of the online calculator as well as the above rule, computing it is quick and painless. It took me less than a minute.

A more detailed explanation of the average value, standard deviation and standard error (yes, the latter two are not the same thing) can be found in chapter 7 of my Kindle ebook Statistical Snacks (this was not an excerpt).

Increase Views per Visit by Linking Within your Blog

One of the most basic and useful performance indicators for blogs is the average number of views per visit. If it is high, that means visitors stick around to explore the blog after reading a post. They value the blog for being well-written and informative. But in the fast-paced, content-saturated online world, achieving a lot of views per visit is not easy.

You can help out a little by making your blog easier for readers to explore. A good way to do this is to link within your blog, that is, to provide internal links. Keep in mind though that random links won’t help much. If you link one of your blog posts to another, they should be connected in a meaningful way, for example by covering the same topic or giving relevant additional information to what a visitor just read.

Being mathematically curious, I wanted to find a way to judge what impact such internal links have on the overall views per visit. Assume you start with no internal links and observe a current number of views per visit of x. Now you add n internal links in your blog, which has in total a number of m entries. Given that the probability for a visitor to make use of an internal link is p, what will the overall number of views per visit change to? Last night I derived a formula for that:

x’ = x + (n / m) · (1 / (1-p) – 1)

For example, my blog (which has as of now very few internal links) has an average of x = 2.3 views per visit and m = 42 entries. If I were to add n = 30 internal links and assuming a reader makes use of an internal link with the probability p = 20 % = 0.2, this should theoretically change into:

x’ = 2.3 + (30 / 42) · (1 / 0.8 – 1) = 2.5 views per visit

A solid 9 % increase in views per visit and this just by providing visitors a simple way to explore. So make sure to go over your blog and connect articles that are relevant to each other. The higher the relevancy of the links, the higher the probability that readers will end up using them. For example, if I only added n = 10 internal links instead of thirty, but had them at such a level of relevancy that the probability of them being used increases to p = 40 % = 0.4, I would end up with the same overall views per visit:

x’ = 2.3 + (10 / 42) · (1 / 0.6 – 1) = 2.5 views per visit
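The formula is quick to evaluate for different link counts and relevancy levels (a sketch using the numbers from the two examples):

```python
def views_per_visit(x, n, m, p):
    """Predicted views per visit after adding n internal links
    to a blog with m entries, given click-through probability p."""
    return x + (n / m) * (1 / (1 - p) - 1)

print(round(views_per_visit(2.3, 30, 42, 0.2), 1))  # 2.5
print(round(views_per_visit(2.3, 10, 42, 0.4), 1))  # 2.5
```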

So it’s about relevancy as much as it is about amount. And in the spirit of not spamming, I’d rather add a few high-relevancy internal links than a lot of low-relevancy ones.

If you’d like to know more on how to optimize your blog, check out: Setting the Order for your WordPress Blog Posts and Keywords: How To Use Them Properly On a Website or Blog.

Probability and Multiple Choice Tests

Imagine taking a multiple choice test that has three possible answers to each question. This means that even if you don’t know any answer, your chance of getting a question right is still 1/3. How likely is it to get all questions right by guessing if the test contains ten questions?

Here we are looking at the event “correct answer” which occurs with a probability of p(correct answer) = 1/3. We want to know the odds of this event happening ten times in a row. For that we simply apply the multiplication rule:

p(all correct) = (1/3)^10 ≈ 0.000017

Doing the inverse, we can see that this corresponds to about 1 in 60000. So if we gave this test to 60000 students who only guessed the answers, we could expect only one to be that lucky. What about the other extreme? How likely is it to get none of the ten questions right when guessing?

Now we must focus on the event “incorrect answer” which has the probability p(incorrect answer) = 2/3. The odds for this to occur ten times in a row is:

p(all incorrect) = (2/3)^10 ≈ 0.017

In other words: 1 in 60. Among the 60000 guessing students, this outcome can be expected to appear 1000 times. How would these numbers change if we only had eight instead of ten questions? Or if we had four options per question instead of three? I leave this calculation up to you.
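Both guessing probabilities, as well as the suggested variants, take one line each (a sketch):

```python
# Ten questions, three options each, pure guessing
p_all_correct = (1/3) ** 10
p_all_wrong = (2/3) ** 10

print(round(1 / p_all_correct))  # 59049, roughly 1 in 60000
print(round(1 / p_all_wrong))    # 58, roughly 1 in 60

# The suggested variants:
print((1/3) ** 8)   # eight questions instead of ten
print((1/4) ** 10)  # four options per question instead of three
```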

Statistics and Monkeys on Typewriters

Here are the first two sentences of the prologue to Shakespeare’s Romeo and Juliet:

Two households, both alike in dignity,
In fair Verona, where we lay our scene

This excerpt has 77 characters. Now we let a monkey start typing random letters on a typewriter. Once he has typed 77 characters, we change the sheet and let him start over. How many tries will he need to randomly reproduce the above paragraph?

There are 26 letters in the English alphabet and since he’ll be needing the comma and space, we’ll include those as well. So there’s a 1/28 chance of getting the first character right. Same goes for the second character, third character, etc … Because he’s typing randomly, the chance of getting a character right is independent of what preceded it. So we can just start multiplying:

p(reproduce) = 1/28 · 1/28 · … · 1/28 = (1/28)^77

The result is about 4 times ten to the power of -112. This is a ridiculously small chance! Even if he was able to complete one quadrillion tries per millisecond, it would most likely take him considerably longer than the estimated age of the universe to reproduce these two sentences.

Now what about the first word? It has only three letters, so he should be able to get at least this part in a short time. The chance of randomly reproducing the word “two” is:

p(reproduce) = 1/26 · 1/26 · 1/26 = (1/26)^3

Note that I dropped the comma and space as a choice, so now there’s a 1 in 26 chance to get a character right. The result is 5.7 times ten to the power of -5, which is about a 1 in 17500 chance. Even a slower monkey could easily get that done within a year, but I guess it’s still best to stick to human writers.
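Both probabilities can be checked quickly; for the full excerpt it is safer to work with logarithms (a sketch):

```python
from math import log10

# Full 77-character excerpt, 28 possible characters each
log_p = 77 * log10(1 / 28)
print(log_p)  # about -111.4, i.e. roughly 4 times 10^-112

# Just the word "two", letters only (26 choices per character)
p_word = (1 / 26) ** 3
print(round(1 / p_word))  # 17576, about 1 in 17500
```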

This was an excerpt from the ebook “Statistical Snacks”. Liked the excerpt? Get the book here: http://www.amazon.com/Statistical-Snacks-ebook/dp/B00DWJZ9Z2. Want more excerpts? Check out The Probability of Becoming a Homicide Victim and Missile Accuracy (CEP).

Missile Accuracy (CEP) – Excerpt from “Statistical Snacks”

An important quantity when comparing missiles is the CEP (Circular Error Probable). It is defined as the radius of the circle in which 50 % of the fired missiles land. The smaller it is, the better the accuracy of the missile. The German V2 rockets for example had a CEP of about 17 km. So there was a 50/50 chance of a V2 landing within 17 km of its target. Targeting smaller cities or even complexes was next to impossible with this accuracy; one could only aim for a general area in which the missile would land rather randomly.

Today’s missiles are significantly more accurate. The latest version of China’s DF-21 has a CEP of about 40 m, allowing the accurate targeting of small complexes or large buildings, while the CEP of the American-made Hellfire is as low as 4 m, enabling precision strikes on small buildings or even tanks.

Assuming the impacts are normally distributed, one can derive a formula for the probability of striking a circular target of radius R using a missile with a given CEP:

p = 1 – exp( -0.41 · R² / CEP² )

This quantity is also called the “single shot kill probability” (SSKP). Let’s include some numerical values. Assume a small complex with the dimensions 100 m by 100 m is targeted with a missile having a CEP of 150 m. Converting the rectangular area into a circle of equal area gives us a radius of about 56 m. Thus the SSKP is:

p = 1 – exp( -0.41 · 56² / 150² ) = 0.056 = 5.6 %

So the chances of hitting the target are relatively low. But the lack of accuracy can be compensated for by firing several missiles in succession. What is the chance of at least one missile hitting the target if ten missiles are fired? First we look at the odds of all missiles missing the target and answer the question from that. One missile misses with a probability of 0.944; the chance of having this event occur ten times in a row is:

p(all miss) = 0.944^10 ≈ 0.562

Thus the chance of at least one hit is:

p(at least one hit) = 1 – 0.562 = 0.438 = 43.8 %

Still not great, considering that a single missile easily costs upwards of 10000 $. How many missiles of this kind must be fired at the complex to have a 90 % chance of a hit? A 90 % chance of a hit means that the chance of all missiles missing is 10 %. So we can turn the above formula for p(all miss) into an equation by inserting p(all miss) = 0.1 and leaving the number of missiles n undetermined:

0.1 = 0.944^n

All that’s left is doing the algebra. Applying the natural logarithm to both sides and solving for n results in:

n = ln(0.1) / ln(0.944) = 40

So forty missiles with a CEP of 150 m are required to have a 90 % chance at hitting the complex. As you can verify by doing the appropriate calculations, three DF-21 missiles would have achieved the same result.
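The SSKP formula and the missile-count calculation can be sketched as follows (the 0.944 miss probability is the rounded value from above):

```python
from math import exp, log, ceil

def sskp(radius, cep):
    """Single shot kill probability against a circular target of the
    given radius, assuming normally distributed impacts."""
    return 1 - exp(-0.41 * radius**2 / cep**2)

print(round(sskp(56, 150), 3))  # about 0.056

# Missiles needed for a 90 % chance of at least one hit,
# using the rounded miss probability 0.944 from the text
print(ceil(log(0.1) / log(0.944)))  # 40

# Same target, DF-21 accuracy (CEP about 40 m)
p_miss_df21 = 1 - sskp(56, 40)
print(ceil(log(0.1) / log(p_miss_df21)))  # 3
```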

Liked the excerpt? Get the book “Statistical Snacks” by Metin Bektas here: http://www.amazon.com/Statistical-Snacks-ebook/dp/B00DWJZ9Z2. For more excerpts see The Probability of Becoming a Homicide Victim and How To Use the Expected Value.

My Fair Game – How To Use the Expected Value

You meet a nice man on the street offering you a game of dice. For a wager of just 2 $, you can win 8 $ when the die shows a six. Sounds good? Let’s say you join in and play 30 rounds. What will be your expected balance after that?

You roll a six with the probability p = 1/6. So of the 30 rounds, you can expect to win 1/6 · 30 = 5, resulting in a pay-out of 40 $. But winning 5 rounds of course also means that you lost the remaining 25 rounds, resulting in a loss of 50 $. Your expected balance after 30 rounds is thus -10 $. Or in other words: for the player this game results in a loss of 1/3 $ per round.

Let’s make a general formula for just this case. We are offered a game which we win with a probability of p. The pay-out in case of victory is P, the wager is W. We play this game for n rounds.

The expected number of wins is p·n, so the total pay-out will be: p·n·P. The expected number of losses is (1-p)·n, so we will most likely lose this amount of money: (1-p)·n·W.

 Now we can set up the formula for the balance. We simply subtract the losses from the pay-out. But while we’re at it, let’s divide both sides by n to get the balance per round. It already includes all the information we need and requires one less variable.

B = p · P – (1-p) · W

This is what we can expect to win (or lose) per round. Let’s check it by using the above example. We had the winning chance p = 1/6, the pay-out P = 8 $ and the wager W = 2 $. So from the formula we get this balance per round:

B = 1/6 · 8 $ – 5/6 · 2 $ = – 1/3 $ per round

Just as we expected. Let’s try another example. I’ll offer you a dice game. If you roll two sixes in a row, you get P = 175 $. The wager is W = 5 $. Quite the deal, isn’t it? Let’s see. Rolling two sixes in a row occurs with a probability of p = 1/36. So the expected balance per round is:

B = 1/36 · 175 $ – 35/36 · 5 $ = 0 $ per round

I offered you a truly fair game. No one can be expected to lose in the long run. Of course if we only play a few rounds, somebody will win and somebody will lose.

It’s helpful to understand this balance as being sound for a large number of rounds but rather fragile in case of playing only a few rounds. Casinos are host to thousands of rounds per day and thus can predict their gains quite accurately from the balance per round. After a lot of rounds, all the random streaks and significant one-time events hardly impact the total balance anymore. The real balance will converge to the theoretical balance more and more as the number of rounds grows. This is mathematically proven by the Law of Large Numbers. Assuming finite variance, the proof can be done elegantly using Chebyshev’s Inequality.

The convergence can be easily demonstrated using a computer simulation. We will let the computer, equipped with random numbers, run our dice game for 2000 rounds. After each round the computer calculates the balance per round so far. The below picture shows the difference between the simulated balance per round and our theoretical result of – 1/3 $ per round.

[Figure: difference between the simulated balance per round and the theoretical value of –1/3 $ per round]
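A minimal version of such a simulation (a sketch; the seed is arbitrary, and the payoff follows the example: +8 $ on a six, –2 $ otherwise):

```python
import random

random.seed(42)

balance = 0.0
rounds = 2000
for _ in range(rounds):
    if random.randint(1, 6) == 6:
        balance += 8  # pay-out on a six
    else:
        balance -= 2  # lost wager

print(balance / rounds)  # hovers near the theoretical -1/3
```

With more rounds, the difference to the theoretical value shrinks roughly like one over the square root of the number of rounds.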

(Liked the excerpt? Get the book “Statistical Snacks” by Metin Bektas here: http://www.amazon.com/Statistical-Snacks-ebook/dp/B00DWJZ9Z2)

The Probability of Becoming a Homicide Victim

 Each year in the US there are about 5 homicides per 100000 people, so the probability of falling victim to a homicide in a given year is 0.00005 or 1 in 20000. What are the chances of falling victim to a homicide over a lifespan of 70 years?

 Let’s approach this the other way around. The chance of not becoming a homicide victim during one year is p = 0.99995. Using the multiplication rule we can calculate the probability of this event occurring 70 times in a row:

p = 0.99995 · … · 0.99995 = 0.99995^70

 Thus the odds of not becoming a homicide victim over the course of 70 years are 0.9965. This of course also means that there’s a 1 – 0.9965 = 0.0035, or 1 in 285, chance of falling victim to a homicide during a life span. In other words: two victims in every jumbo jet full of people. How does this compare to other countries?

 In Germany, the homicide rate is about 0.8 per 100000 people. Doing the same calculation gives us a 1 in 1800 chance of becoming a murder victim, so statistically speaking there’s one victim per small city. At the other end of the scale is Honduras with 92 homicides per 100000 people, which translates into a saddening 1 in 16 chance of becoming a homicide victim over the course of a life and is basically one victim in every family.

 It can get even worse if you live in a particularly crime ridden part of a country. The homicide rate for the city San Pedro Sula in Honduras is about 160 per 100000 people. If this remained constant over time and you never left the city, you’d have a 1 in 9 chance of having your life cut short in a homicide.
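The same calculation for all four homicide rates (a sketch; assumes a constant annual rate and independent years):

```python
def lifetime_risk(rate_per_100k, years=70):
    """Chance of falling victim to a homicide over a lifespan,
    assuming a constant annual rate."""
    p_year = rate_per_100k / 100_000
    return 1 - (1 - p_year) ** years

print(round(1 / lifetime_risk(5)))    # USA: about 1 in 285
print(round(1 / lifetime_risk(0.8)))  # Germany: about 1 in 1800
print(round(1 / lifetime_risk(92)))   # Honduras: about 1 in 16
print(round(1 / lifetime_risk(160)))  # San Pedro Sula: about 1 in 9
```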

Liked the excerpt? Get the book “Statistical Snacks” by Metin Bektas here: http://www.amazon.com/Statistical-Snacks-ebook/dp/B00DWJZ9Z2. For more excerpts check out Missile Accuracy (CEP), Immigrants and Crime and Monkeys on Typewriters.