Surprised me? Who cares? How about when Claude Shannon surprised the entire computer science community?
The surprise came out of his famous 1948 paper, *A Mathematical Theory of Communication*.
To see why it's surprising, let's set up the problem.
The goal is to communicate a string of 1's and 0's across a 'noisy' channel. That means each digit that passes through the channel gets flipped to the other digit with probability p.
How would you communicate a string to someone on the other end accurately?
As is typically advised, start simple. Say we want to communicate a single digit. Here's an idea: repeat that digit 1,000 times and send that length-1,000 string through the channel. Then have the recipient interpret the most common digit in what arrives as the digit you intended to send. The longer the repeated sequence, the less likely the recipient is to end up with the wrong digit.
But there's a cost. You've slowed the rate of communication tremendously (by a factor of 1,000) to buy that low probability of error.
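To make the scheme concrete, here's a minimal Python sketch (my own illustration, not from the post or the paper; the flip probability, block lengths, and trial count are arbitrary choices). It simulates the bit-flipping channel, repeats a single digit n times, and decodes by majority vote:

```python
import random

def noisy_channel(bits, p):
    """Flip each bit independently with probability p (a binary symmetric channel)."""
    return [b ^ (random.random() < p) for b in bits]

def repeat_encode(bit, n):
    """Repetition code: send the same digit n times."""
    return [bit] * n

def majority_decode(received):
    """Guess whichever digit appears most often in the received block."""
    return int(sum(received) > len(received) / 2)

def empirical_error(p, n, trials=2000):
    """Estimate how often majority voting recovers the wrong digit."""
    wrong = 0
    for _ in range(trials):
        bit = random.randint(0, 1)
        received = noisy_channel(repeat_encode(bit, n), p)
        wrong += majority_decode(received) != bit
    return wrong / trials

if __name__ == "__main__":
    p = 0.1  # assumed flip probability, purely for illustration
    for n in (1, 11, 101, 1001):  # odd lengths so the vote can't tie
        # Rate is 1/n: we spend n channel uses to deliver one digit.
        print(f"n={n:>4}  rate={1/n:.4f}  error≈{empirical_error(p, n):.4f}")
```

Longer blocks crush the error probability, but the rate column shows the price: it falls as 1/n.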
But what about a length-10 string? We can probably do better than naively repeating each digit, since we have more structure to play with.
So here’s the question:
For a given noise level p and a given desired error probability e (on a per-bit basis), what are the achievable communication rates R?
Picture the problem this way. For some noisy channel, you draw two axes: the communication rate R on one, and the per-bit error probability e on the other.
Now tell me - what regions in this space are achievable? Out of all conceivable algorithms, what’s impossible and what’s possible for this channel?
I don’t know, but here’s a reasonable hypothesis:
Whatever the separation is, it goes through the origin. In other words, if you want a zero probability of error, you’ll have to communicate infinitely slowly.
Makes sense, right? We aren't talking about a small probability of error - we are talking zero, impossible - it cannot happen! Such a demand for limitless perfection surely demands limitless effort. And it's certainly true of our simple repeat-digit strategy.
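A quick back-of-the-envelope check (standard binomial reasoning, not spelled out in the original post): with n repetitions and a majority vote, the decoder errs only when more than half the copies get flipped, so

$$P_{\text{error}} = \sum_{k > n/2} \binom{n}{k}\, p^k (1-p)^{n-k},$$

which, for p < 1/2, shrinks as n grows but only reaches zero in the limit n → ∞, exactly where the rate R = 1/n has collapsed to zero.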
…
…
…
But it's not true in general.
The tank that is Claude Shannon proved that the separation looks something like this:
He found that you can communicate with arbitrarily small error rates at non-zero communication rates.
That is incredible.
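Concretely, for the bit-flipping channel above, the boundary Shannon found is the channel capacity. The standard statement of the result (quoting the general theory, not something derived in this post) is

$$C = 1 - H_2(p), \qquad H_2(p) = -p\log_2 p - (1-p)\log_2(1-p),$$

where every rate R < C is achievable with as small an error probability as you like, and every rate R > C is not. For p = 0.1, C ≈ 0.53, so essentially error-free communication costs only about two channel digits per message digit, not a thousand.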
Sources
[1] Sadly, this wasn't inspired by reading the original paper. Instead, I got it from the late David MacKay's *Information Theory, Inference, and Learning Algorithms*.