The Mathematics of Codebreaking


WORKSHEET INSTRUCTIONS
Grab some paper and pens to note down your answers to the questions in this worksheet as you read through! Don’t forget to take a look at the extra resources and have a go at the activity at the end.


Secret codes aren’t just a thing from mystery novels and spy movies. They are used (sometimes unknowingly) by most people on a daily basis.

Can you think of some examples of where secret codes may be used in daily life?

If you’re using a website and there’s a lock icon in the address bar, that means that the site uses HTTPS encryption, making it hard for anyone else to see what you’re looking at. Encryption is also used whenever you’re shopping with a credit or debit card. Secure messaging apps such as WhatsApp use encryption to ensure that your conversations are kept private.

There are several different ways of encrypting (or encoding) messages and information, and these methods often require mathematics and statistics. One way of encoding a message is called a cipher. One of the earliest examples of encrypted messages in history is the Caesar cipher, used by the Roman general and statesman Julius Caesar to encode important military messages. It is known as a substitution cipher, which means that it operates by substituting one unit of ‘true’ text with a unit of ‘cipher’ text. In the case of the Caesar cipher, every letter is shifted along the alphabet by the same fixed number of places.

So, if the shift was 3, what would A be?

 

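If the shift was 3, then A would be coded as D, T as W and so on. Letters near the end of the alphabet wrap around to the start, so X would be coded as A. A recipient who knew the shift could simply reverse it to read the message, but anyone who intercepted the message had to deduce the shift in order to decode it.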
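The shifting described above can be sketched in a few lines of Python. This is only a minimal illustration: the function name `caesar_shift` is made up for this worksheet, and the sketch assumes the 26-letter English alphabet and leaves spaces and punctuation unchanged.

```python
import string

ALPHABET = string.ascii_uppercase  # "ABC...XYZ"

def caesar_shift(text, shift):
    """Shift every letter in text by `shift` places, wrapping Z round to A."""
    result = []
    for ch in text.upper():
        if ch in ALPHABET:
            # % 26 makes the alphabet wrap around (X + 3 places -> A)
            result.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            result.append(ch)  # leave spaces and punctuation untouched
    return "".join(result)

print(caesar_shift("AT", 3))   # DW
print(caesar_shift("DW", -3))  # AT
```

Decoding is just encoding with the opposite shift, which is why the second call uses -3.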

While this can be done by trial and error, some clever statistics allow us to make a better guess at what the shift of a cipher might be. Not every letter in the English language is used equally often. By looking at the average frequency of each letter in English texts, we can estimate the probability that a randomly chosen letter in a text is, say, an E.

So, we can estimate the most likely shift by comparing the average frequencies of letters in English to the frequencies of the letters we can see in a coded message. For example, E is the most frequently occurring letter in English text. If the most frequently occurring letter in a cipher is Q, then there is a good chance that the shift of the cipher is 12 (the distance between E and Q in the alphabet).
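As a rough sketch of this idea, the Python below counts the letters in a coded message and assumes that the most common one stands for E. The function name `guess_shift` is invented for this example, and the guess can easily be wrong for short messages.

```python
from collections import Counter

def guess_shift(ciphertext):
    """Guess the shift by assuming the most common cipher letter stands for E."""
    letters = [ch for ch in ciphertext.upper() if "A" <= ch <= "Z"]
    if not letters:
        raise ValueError("the message contains no letters to count")
    most_common_letter, _count = Counter(letters).most_common(1)[0]
    # Distance from E to the most common cipher letter, wrapping round the alphabet
    return (ord(most_common_letter) - ord("E")) % 26

# "MEET ME HERE" encoded with a shift of 3 is "PHHW PH KHUH"
print(guess_shift("PHHW PH KHUH"))  # 3
```

Here H is the most common letter in the coded message, and H is 3 places after E, so the guess is a shift of 3.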

However, keep in mind that the statistics of a sample don’t always match the statistics of a population (in this case English texts). Conclusions about individual letters always come with a degree of uncertainty. When decrypting a message, however, you can usually tell whether you’ve got the right method by whether your decrypted message actually forms English words or sentences!
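One way to picture this check is to try all 26 shifts and keep the one whose result contains the most recognisable words. The sketch below assumes a tiny, made-up word list (`COMMON_WORDS`) purely for illustration; a real attempt would need a much larger dictionary and would have to strip punctuation from the words.

```python
# A tiny hypothetical word list, just for this example
COMMON_WORDS = {"THE", "AND", "ME", "MEET", "HERE", "AT", "IS", "TO", "OF", "A"}

def decrypt(ciphertext, shift):
    """Undo a Caesar shift by moving every letter back `shift` places."""
    out = []
    for ch in ciphertext.upper():
        if "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") - shift) % 26 + ord("A")))
        else:
            out.append(ch)
    return "".join(out)

def best_shift(ciphertext):
    """Try every shift and return the one that produces the most known words."""
    def score(shift):
        words = decrypt(ciphertext, shift).split()
        return sum(word in COMMON_WORDS for word in words)
    return max(range(26), key=score)

shift = best_shift("PHHW PH KHUH")
print(shift, decrypt("PHHW PH KHUH", shift))  # 3 MEET ME HERE
```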

What problems can you think of with this method? How might it be different for different languages?

The Caesar cipher is rarely used on its own anymore, as it’s quite easy to break, but it is often used as a part of more complex ciphers. For example, the famous German Enigma machine of WWII passed each letter through a series of three substitution ciphers to encode messages that were then transmitted as Morse code. The first scrambler wheel turned once every time a letter was typed, so the shift of the cipher changed for every subsequent letter.
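To see why a changing shift makes life harder for a codebreaker, here is a toy Python sketch in which the shift grows by one after every letter. This is not a simulation of the Enigma machine, which was far more complicated; the function `stepping_shift` is a made-up simplification that only shows the idea of a shift that moves on with each letter.

```python
def stepping_shift(text, start_shift, decrypt=False):
    """Caesar-style cipher whose shift increases by 1 after every letter."""
    out = []
    shift = start_shift
    for ch in text.upper():
        if "A" <= ch <= "Z":
            step = -shift if decrypt else shift
            out.append(chr((ord(ch) - ord("A") + step) % 26 + ord("A")))
            shift = (shift + 1) % 26  # the "wheel" turns one place per letter
        else:
            out.append(ch)  # spaces and punctuation do not turn the wheel
    return "".join(out)

print(stepping_shift("AAAA", 3))                # DEFG
print(stepping_shift("DEFG", 3, decrypt=True))  # AAAA
```

The same letter A is coded as four different letters, so simply counting the most common letter no longer reveals a single shift.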

A Colossus Mark 2 computer being operated by Dorothy Du Boisson (left) and Elsie Booker (right), 1943

Codebreakers during the Second World War managed to crack the Enigma code using statistics: they found certain patterns in the data (the intercepted encoded messages). For example, bigrams and trigrams (sets of two or three consecutive encoded letters) in a code could be compared to known statistics about which letters are more likely to follow others.

For instance, there is roughly a 1 in 3 chance of a ‘t’ being followed by an ‘h’. Such ‘dependency’ probabilities were used to assign a probability score to bigrams and trigrams, with the highest score indicating the most likely actual letter combination. By using this technique alongside other computational and probabilistic methods, codebreakers at Bletchley Park succeeded in cracking the Enigma code and helping the Allies anticipate and counter Germany’s secret plans.
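The scoring idea can be sketched in Python as follows. This is a simplified illustration, not the method used at Bletchley Park: the function names, the short sample sentence and the `floor` value for unseen letter pairs are all assumptions made for this example, and a real estimate would need a very large sample of text.

```python
import math
from collections import Counter

def follow_probabilities(sample):
    """Estimate P(next letter | current letter) from a sample of English text."""
    # Spaces are dropped, so pairs may run across word boundaries
    letters = [ch for ch in sample.upper() if "A" <= ch <= "Z"]
    pair_counts = Counter(zip(letters, letters[1:]))
    first_counts = Counter(letters[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}

def score(candidate, probs, floor=0.001):
    """Add up log-probabilities of each bigram; a higher score is more English-like."""
    letters = [ch for ch in candidate.upper() if "A" <= ch <= "Z"]
    pairs = zip(letters, letters[1:])
    # Pairs never seen in the sample get a small assumed probability (floor)
    return sum(math.log(probs.get(pair, floor)) for pair in pairs)

# A short made-up sample; far too small to give reliable probabilities
probs = follow_probabilities("THE THIRD THING THAT THEY THOUGHT")
print(score("THE", probs) > score("TEH", probs))
```

Logarithms are used so that the probabilities of several bigrams can be added together instead of multiplied, which avoids very small numbers.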

These days, a lot of message and data encryption takes place online. This is your chance to go on the internet and do a little research into how encryption works today, and how hackers can sometimes still get through.

Further reading

Check out this Crash Course video for a start. If you’d like to learn about the mathematics of codebreaking in more detail, this video is also super interesting!

Read this article about the strengths and weaknesses of modern encryption systems.

If you’d like to learn more about the Enigma Code, read this article and listen to this podcast.
