Cracking Caesar Cipher
Visualising frequency analysis in encrypted messages using Caesar and Vigenère ciphers
Introduction
The Caesar cipher is a method of message encryption easily crackable using frequency analysis. To evade this analysis our secrets are safer using the Vigenère cipher. This is the advantage of using a polyalphabetic cipher over an affine monoalphabetic substitution cipher, in other words: the same letter is not always encrypted the same way.
To make sense of all of this estrange words I made a Python script that encrypts a massage using both Caesar and Vigenère ciphers and performs a letter frequency analysis plotting some pretty graphs to illustrate it all.
This articles tries to explain the whole process going through the code and the output. The complete project with the full code and usage instructions can be found here: https://github.com/jrinconada/cracking-caesar-cipher
Letter counting
Going through every letter in a message counting how many times it appears and saving it as a list of integer numbers.
Definitions
Usage
Output
Plotting letter count
Plotting a line with the letter count as the y coordinate and the alphabet as the x coordinate.
Definition
Usage
Output
Adding Caesar cipher
Adding another line to represent letter count in a message encrypted using Caesar cipher.
Notice that previous definitions are needed for this code to work. They are not included to avoid redundancy and keep the code cleaner.
Definition
Usage
Output
Adding Vigenère cipher
Adding yet another line to represent letter count in a message encrypted using Vigenère cipher.
Definition
Usage
Output
Plotting theoretical frequencies
Using a pie chart and a bar chart to display theoretical letter frequency in a pleasant way.
Definition
Usage
Output
Comparing theoretical vs actual frequencies
Plotting lines for theoretical and actual frequencies of a sample message and highlight the differences as green or red fills.
Definition
Usage
Output
Usage with a longer message
To illustrate how a longer message will give a letter count closer to the theoretical letter frequency.
Output
Now the differences (red and green regions) are smaller.
Showing letter frequency for every letter
Every letter is represented as a square in a grid and the frequency is the color intensity of the square.
Definition
Usage with the longer message
The same long message from before is used, omitted for code clarity.