code

Code Transmission and Probability

Not long ago did mankind first send rovers to Mars to analyze the planet and find out if it ever supported life. The nagging question “Are we alone?” drives us to penetrate deeper into space. A special challenge associated with such journeys is communication. There needs to be a constant flow of digital data, strings of ones and zeros, back and forth to ensure the success of the space mission.

During the process of transmission over the endless distances, errors can occur. There’s always a chance that zeros randomly turn into ones and vice versa. What can we do to make communication more reliable? One way is to send duplicates.

Instead of simply sending a 0, we send the string 00000. If not too many errors occur during the transmission, we can still decode it on arrival. For example, if it arrives as 00010, we can deduce that the originating string was with a high probability a 0 rather than a 1. The single transmission error that occurred did not cause us to incorrectly decode the string.

Assume that the probability of a transmission error is p and that we add to each 0 (or 1) four copies, as in the above paragraph. What is the chance of us being able to decode it correctly? To be able to decode 00000 on arrival correctly, we can’t have more than two transmission errors occurring. So during the n = 5 transmissions, k = 0, k = 1 and k = 2 errors are allowed. Using the binomial distribution we can compute the probability for each of these events:

p(0 errors) = C(5,0) · p^0 · (1-p)^5

p(1 error) = C(5,1) · p^1 · (1-p)^4

p(2 errors) = C(5,2) · p^2 · (1-p)^3

We can simplify these expressions somewhat. A binomial calculator provides us with these values: C(5,0) = 1, C(5,1) = 5 and C(5,2) = 10. This leads to:

p(0 errors) = (1-p)^5

p(1 error) = 5 · p · (1-p)^4

p(2 errors) = 10 · p^2 · (1-p)^3

Adding the probabilities for all these desired events tells us how likely it is that we can correctly decode the string.

p(success) = (1-p)^3 · ((1-p)^2 + 5·p·(1-p) + 10·p^2)

In the graph below you can see the plot of this function. The x-axis represents the transmission error probability p and the y-axis the chance of successfully decoding the string. For p = 10 % (1 in 10 bits arrive incorrectly) the odds of identifying the originating string are still a little more than 99 %. For p = 20 % (1 in 5 bits arrive incorrectly) this drops to about 94 %.

Code Transmission

The downside to this gain in accuracy is that the amount of data to be transmitted, and thus the time it takes for the transmission to complete, increases fivefold.