Encryption

Lab setup

First, make sure you have completed the initial setup.

If you are part of a course

  1. Open Terminal. Run the update command to make sure you have the latest code.
    $ mwc update
  2. Move to this lab's directory.
    $ cd ~/Desktop/making_with_code/ny-csdf/unit2/lab_encryption
    
  3. Enter this lab's shell environment.
    $ poetry shell
    

If you are working on your own

  1. Move to your MWC directory.
    $ cd ~/Desktop/making_with_code
    
  2. Get a copy of this lab's materials.
    git clone https://git.makingwithcode.org/mwc/lab_encryption.git

In the previous lab we explored how encoding can be used to compress information into fewer bytes. Now let's consider another way we can use encoding: to keep information a secret. The encryption algorithms we will consider in this lab require the user to choose a secret. The secret is used to convert a plaintext message into an encrypted ciphertext. The secret is required to convert the ciphertext back into plaintext.

Therefore, if you and a friend both know a secret, you can send encrypted messages back and forth--even if someone intecepts the message, they won't be able to understand it. Similarly, you could save an encrypted text file with your secret thoughts. Even if someone else got access to your computer, they would not be able to read the file unless they had the secret.

Caesar cipher

The Caesar Cipher is ancient--it was used by Julius Caesar, the Roman emperor, in 50 BC. In this encryption algorithm, each character is converted into a number, then a secret number is added to each (if the result is too large, we loop it back around), and the result is converted back into another character. The Caesar Cipher is implemented in ciphers/caesar.py

💻 Play around with the caesar cipher.

$ python
>>> from ciphers.caesar import CaesarCipher
>>> cipher = CaesarCipher(1)
>>> cipher.encrypt("Hello")
'Ifmmp'
>>> cipher.decrypt('Ifmmp')
'Hello'
>>> cipher2 = CaesarCipher(2)
>>> cipher2.encrypt('Hello')
'Jgnnq'

The CaesarCipher is initialized with a secret number, and then its encrypt and decrypt methods can be used to convert plaintext to ciphertext and back again. When the secret number is 1, H is rotated forward 1 and becomes I. When the secret number is 2, H is rotated forward 2 and becomes J.

Cracking Caesar

Unfortunately, the Caesar Cipher has a serious weakness: frequency analysis. We know the most common characters (space, e, t, a, i, o, n, s) in English, so the most common ciphertext characters likely correspond to these plaintext characters. For example, if 'B' is the most common character in an encrypted message, it probably corresponds to a space. From here, we could calculate how far the space character is from 'B' to get the likely secret.

For frequency analysis, collections.Counter is a very useful class. It takes a list (or another sequence like a string), and returns a dict counting how often each item appears.

>>> from easybits import Bits
>>> from collections import Counter
>>> from cipher.caesar import CaesarCipher
>>> ciphertext = "v+,6B,6B$B3(5)(&7/<B1250$/Bg1*/,6+B6(17(1&(P"
>>> Counter(ciphertext)
Counter({
  'B': 6, '(': 5, '6': 4, '1': 4, ',': 3, '/': 3, '+': 2, 
  '$': 2, '5': 2, '&': 2, '7': 2, 'v': 1, '3': 1, ')': 1, 
  '<': 1, '2': 1, '0': 1, 'g': 1, '*': 1, 'P': 1
})
>>> Bits('B').int - Bits(' ').int
34
>>> cipher = CaesarCipher(34)
>>> cipher.decrypt(ciphertext)
"This is a perfectly normal English sentence."

Polyalphabetic ciphers

So far, our analysis tracks the actual historical progress of cryptography. Realizing that a Caesar Cipher can be easily cracked with frequency analysis, several people came up with an idea which today is called a polyalphabetic cipher, where a sequence of secret numbers are used instead of a single number. During encryption, each plaintext letter is encrypted with the next secret number; when the secret numbers run out we loop back around. Here's an example:

>>> from ciphers.poly import PolyCipher
>>> cipher = PolyCipher([0, 1, 2])
>>> cipher.encrypt('AAAAAAAAAAAAAAAAAAAAAAAAA')
'ABCABCABCABCABCABCABCABCA'

You can also initialize a PolyCipher with a string--in this case, we use the ASCII code for each character of the string as the secret number.

>>> cipher = PolyCipher("supersecret")
>>> cipher.encrypt('AAAAAAAAAAAAAAAAAAAAAAAAA')
'UWRGTUGETGVUWRGTUGETGVUWR'

Even when encrypting a long string of the same character, the ciphertext varies.

Polyalphabetic ciphers are much harder to crack than Caeear ciphers, but there are strategies, especially if you can collect a lot of ciphertext. During World War II, the Nazi Enigma Machine was a very fancy polyalphabetic cipher used for sending encrypted messages over the radio. Polish and British mathematicians managed to crack the Enigma cipher, which played a major role in winning the war.