Assume we are given a country with a population that is 90 % native and 10 % immigrant. As is often the case in the first world, the native population is on average older than the immigrant population.

Let’s look at a certain type of crime, say robberies. Now a statistic shows that of all the robberies in the country, 80 % have been committed by natives and 20 % by immigrants. Can we conclude from these numbers that the immigrants are more inclined to commit robbery than the natives? Many people would do so.

The police keep basic records of all crimes that have been reported. This enables us to get a closer look at the situation. Consider the graph below, which shows the age distribution of people accused of robbery in Canada in 2008. It immediately becomes clear that robbery is for the most part a “young person’s crime”. The rates are significantly elevated for ages 14 to 20 and then decrease with age. Even without crunching the numbers it is clear that, all else being equal, the younger a population is, the more robberies will occur.

Let’s go back to our fictional country of 90 % natives and 10 % immigrants, with the immigrant population being younger. Assuming the same inclination to commit robberies in both groups, the immigrant population would contribute more than 10 % to the total number of robberies for the simple reason that robbery is a crime mainly committed by young people.

Using a simplistic example, we can put this logic to the test. Let’s stick to our numbers of 90 % natives and 10 % immigrants. This time, however, we’ll crudely specify an age distribution for both. For the native population the breakdown is:

– 15 % below age 15

– 15 % between age 15 and 25

– 70 % above age 25

For the immigrants we take a slightly different distribution that results in a lower average age:

– 20 % below age 15

– 20 % between age 15 and 25

– 60 % above age 25

We’ll set the total population count to 100 million. Now assume that there’s a crime that is committed solely by people in the age group 15 to 25. Within this age group, 1 in 100,000 will commit this crime over the course of one year, regardless of which population group he or she belongs to. Note that this means that neither of the two groups is more inclined towards this crime than the other.

It’s time to crunch the numbers. There are 0.9 · 100 million = 90 million natives. Of these, 0.15 · 90 million = 13.5 million are in the age group 15 to 25. At a rate of 1 in 100,000, this means we can expect 13.5 million / 100,000 = 135 natives to commit this crime during a year.

As for the immigrants, there are 0.1 · 100 million = 10 million in the country, with 0.2 · 10 million = 2 million being in the age group of interest. They will give rise to an expected number of 2 million / 100,000 = 20 crimes of this kind per year.

In total, we can expect this crime to be committed 155 times, with the immigrants having a share of 20 / 155 = 12.9 %. This is higher than their proportional share of 10 % even though neither group is more inclined to commit said crime than the other. All that led to this result was the immigrant population being younger on average.
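The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch of the example's numbers; the names `groups`, `expected_crimes` and `immigrant_share` are chosen here for illustration and do not come from the book.

```python
# Figures taken from the example above.
TOTAL_POPULATION = 100_000_000
RATE_15_TO_25 = 1 / 100_000  # yearly rate within the age group 15 to 25

# Share of the total population and share of each group aged 15 to 25.
groups = {
    "natives": {"population_share": 0.90, "share_aged_15_to_25": 0.15},
    "immigrants": {"population_share": 0.10, "share_aged_15_to_25": 0.20},
}


def expected_crimes(group):
    """Expected yearly number of crimes committed by one population group."""
    people_aged_15_to_25 = (
        TOTAL_POPULATION
        * group["population_share"]
        * group["share_aged_15_to_25"]
    )
    return people_aged_15_to_25 * RATE_15_TO_25


expected = {name: expected_crimes(group) for name, group in groups.items()}
total = sum(expected.values())
immigrant_share = expected["immigrants"] / total

for name, count in expected.items():
    print(f"{name}: {count:.0f} expected crimes per year")
print(f"immigrant share: {immigrant_share:.1%}")  # 12.9%
```

Both groups use the same rate `RATE_15_TO_25`, so the gap between 12.9 % and 10 % comes from the age distributions alone.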

So concluding from a larger than proportional share of crime that there’s an inclination towards crime in this part of the population is not mathematically sound. To be able to draw any conclusions, we would need to know the expected share, that is, the share each group would contribute if both had the same inclination. It can be calculated from the age distribution of the crime and that of the population, and it can differ quite strongly from the proportional value.
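The same idea can be written for any number of age groups. The following is a rough sketch under the assumption that every population group has the same age-specific rates; the function `expected_share` and its parameter names are hypothetical, and the only data used are the numbers of the example above.

```python
def expected_share(group_sizes, age_distributions, rates_by_age, group):
    """Share of crimes expected from `group` if all groups share the same
    age-specific rates (an assumption of this sketch)."""
    expected = {}
    for name, size in group_sizes.items():
        # Sum over age groups: people in that age group times the yearly rate.
        expected[name] = sum(
            size * age_distributions[name][age] * rate
            for age, rate in rates_by_age.items()
        )
    return expected[group] / sum(expected.values())


group_sizes = {"natives": 90_000_000, "immigrants": 10_000_000}
age_distributions = {
    "natives": {"under 15": 0.15, "15 to 25": 0.15, "over 25": 0.70},
    "immigrants": {"under 15": 0.20, "15 to 25": 0.20, "over 25": 0.60},
}
# In the example, only the age group 15 to 25 commits the crime.
rates_by_age = {"under 15": 0.0, "15 to 25": 1 / 100_000, "over 25": 0.0}

share = expected_share(group_sizes, age_distributions, rates_by_age, "immigrants")
print(f"expected immigrant share: {share:.3f}")  # about 0.129

# With identical age distributions the share falls back to the proportional value.
same_ages = {name: age_distributions["natives"] for name in group_sizes}
same_age_share = expected_share(group_sizes, same_ages, rates_by_age, "immigrants")
print(f"share with equal age structure: {same_age_share:.3f}")  # about 0.100
```

An observed share would then be compared against this expected share rather than against the proportional 10 %.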

(Liked the excerpt? Get the book “Statistical Snacks” by Metin Bektas here: http://www.amazon.com/Statistical-Snacks-ebook/dp/B00DWJZ9Z2)