Cryptography is the mathematical foundation of all digital security. Every HTTPS connection, every encrypted disk, every password stored in a database relies on cryptographic algorithms. Yet most developers and security professionals treat cryptography as a black box — they know to use AES or TLS but have only a vague understanding of how or why it works. This guide demystifies cryptography from historical ciphers to modern algorithms, with real examples of where it fails.
From Caesar to XOR: The Foundations
Julius Caesar used a substitution cipher: shift every letter by 3, so A becomes D and B becomes E. It is trivially broken. There are only 25 possible shifts to try, and frequency analysis works even without brute force: E is the most common letter in English, so the most common ciphertext letter is probably E shifted by the key. Modern cryptography evolved specifically to eliminate this kind of statistical pattern.
# Caesar cipher in Python
def caesar_encrypt(text, shift):
    result = ""
    for char in text:
        if char.isalpha():
            shifted = ord(char) + shift
            if char.isupper():
                result += chr((shifted - 65) % 26 + 65)
            else:
                result += chr((shifted - 97) % 26 + 97)
        else:
            result += char
    return result
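The frequency attack described above can be sketched in a few lines. This is a minimal illustration, assuming English plaintext long enough for E to dominate; the helper names `caesar_shift` and `guess_shift` are made up for this example, and the single-letter heuristic can guess wrong on short messages.

```python
from collections import Counter

def caesar_shift(text: str, shift: int) -> str:
    """Shift ASCII letters by `shift` positions; leave everything else alone."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)

def guess_shift(ciphertext: str) -> int:
    """Assume the most frequent ciphertext letter stands for plaintext 'e'."""
    letters = [c for c in ciphertext.lower() if c.isascii() and c.isalpha()]
    most_common = Counter(letters).most_common(1)[0][0]
    return (ord(most_common) - ord("e")) % 26

plaintext = "meet me here before seven, then we leave together"
secret = caesar_shift(plaintext, 3)

key = guess_shift(secret)              # recovered without knowing the key
recovered = caesar_shift(secret, -key)
print(key, recovered)
```

If the guess fails on a short message, trying all 25 shifts and reading the results is still only a few seconds of work.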
# XOR cipher: the simplest symmetric cipher
# XOR key with each byte -- basis of stream ciphers
# Critical property: A XOR B XOR B = A (reversible)
def xor_cipher(data: bytes, key: bytes) -> bytes:
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))
# XOR weakness: if key is short and message is long, frequency analysis breaks it
# If same key is reused for two messages: msg1 XOR msg2 = cipher1 XOR cipher2
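The key-reuse weakness is easy to demonstrate. The sketch below uses a random key as long as the message, then wrongly uses it twice; the messages and the helper name `xor_bytes` are invented for illustration.

```python
import os

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, stopping at the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))

msg1 = b"transfer $100 to alice"
msg2 = b"meeting moved to 9 pm."

key = os.urandom(len(msg1))   # random key, but reused for both messages
c1 = xor_bytes(msg1, key)
c2 = xor_bytes(msg2, key)

# The key cancels out: c1 XOR c2 == msg1 XOR msg2
leak = xor_bytes(c1, c2)

# An attacker who knows (or guesses) msg1 recovers msg2 without the key
recovered = xor_bytes(leak, msg1)
print(recovered)
```

In practice the attacker rarely knows a whole message, but guessing common words and sliding them along `leak` is often enough to peel both plaintexts apart.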
Symmetric Encryption: AES
AES (Advanced Encryption Standard) is the workhorse of modern symmetric encryption. The same key encrypts and decrypts. It operates on 128-bit blocks with 128, 192, or 256-bit keys. Understanding the different modes of operation is critical — the wrong mode choice in code can completely undermine AES security.
# ECB Mode (Electronic Code Book) -- NEVER USE THIS
# Problem: identical plaintext blocks produce identical ciphertext blocks
# This is why the "ECB penguin" demo works:
# An image encrypted with AES-ECB still visually reveals the outline of the penguin
# because identical pixel blocks encrypt to identical ciphertext blocks
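import os
from Crypto.Cipher import AES            # pycryptodome
from Crypto.Util.Padding import pad

key = os.urandom(32)                     # 256-bit key (example only; load real keys from a secret store)
plaintext = b"attack at dawn"

# WRONG - ECB reveals patterns:
cipher = AES.new(key, AES.MODE_ECB)

# BETTER - CBC (Cipher Block Chaining) with random IV:
iv = os.urandom(16)                      # ALWAYS use a fresh random IV per message
cipher = AES.new(key, AES.MODE_CBC, iv)
ciphertext = cipher.encrypt(pad(plaintext, AES.block_size))
# Store IV with ciphertext (it is not secret, but must be unpredictable and never reused)
# CBC alone does not detect tampering: without a MAC it is exposed to padding oracle attacks

# BEST - GCM mode (authenticated encryption):
nonce = os.urandom(12)                   # must never repeat under the same key
cipher = AES.new(key, AES.MODE_GCM, nonce=nonce)
ciphertext, tag = cipher.encrypt_and_digest(plaintext)
# Tag authenticates the ciphertext -- detects tampering
# Store nonce and tag alongside the ciphertext; both are needed to decrypt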
Asymmetric Encryption: RSA and Why Key Size Matters
RSA uses a mathematically linked key pair: a public key that anyone can use to encrypt, and a private key only you hold to decrypt. Security relies on the difficulty of factoring the product of two large primes. A 512-bit RSA key can be factored in hours on rented cloud hardware. A 2048-bit key gives roughly 112 bits of security and is the usual minimum today; 3072-bit or 4096-bit keys add margin for long-lived secrets at the cost of slower operations.
# Generate RSA key pair (use at least 2048 bits)
openssl genrsa -out private.pem 2048
openssl rsa -in private.pem -pubout -out public.pem
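# Encrypt a small file with recipient's public key
# (pkeyutl replaces rsautl, which is deprecated in OpenSSL 3.0; OAEP padding instead of PKCS#1 v1.5)
openssl pkeyutl -encrypt -pubin -inkey public.pem -pkeyopt rsa_padding_mode:oaep -in message.txt -out message.enc
# Decrypt with private key
openssl pkeyutl -decrypt -inkey private.pem -pkeyopt rsa_padding_mode:oaep -in message.enc -out message.txt
# RSA is slow and can only encrypt about 200 bytes with a 2048-bit key
# Never use it to encrypt large data directly
# Instead: encrypt a random AES session key with RSA, use AES for the data
# Older TLS cipher suites exchanged keys this way; TLS 1.3 uses ephemeral (EC)DHE and keeps RSA for signatures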
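The link between key size and factoring is easiest to see with toy numbers. The sketch below is textbook RSA with two tiny primes and no padding, so it is insecure by construction and only meant to show the arithmetic; the `factor` helper is a made-up name for plain trial division.

```python
# Toy RSA with tiny primes: for illustration only, never for real data
p, q = 61, 53
n = p * q                      # public modulus: 3233
phi = (p - 1) * (q - 1)        # 3120
e = 17                         # public exponent, coprime with phi
d = pow(e, -1, phi)            # private exponent: inverse of e mod phi (Python 3.8+)

message = 65
ciphertext = pow(message, e, n)      # encrypt with the public key (e, n)
decrypted = pow(ciphertext, d, n)    # decrypt with the private key (d, n)

def factor(n: int):
    """Trial division over odd numbers: instant here, hopeless at 2048 bits."""
    f = 3
    while n % f:
        f += 2
    return f, n // f

# Anyone who can factor n can rebuild the private exponent
fp, fq = factor(n)
d_attacker = pow(e, -1, (fp - 1) * (fq - 1))

print(decrypted == message, d_attacker == d)
```

Everything the attacker needs is public except the factors of `n`, which is why the modulus has to be large enough that no known factoring method finishes in a useful amount of time.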
Hashing: One-Way Functions and How They Fail
# MD5 -- BROKEN for security purposes (collision attacks exist)
# SHA-1 -- BROKEN for security purposes (collision demonstrated 2017)
# SHA-256 -- SAFE for data integrity
# SHA-3 -- SAFE, different design (Keccak sponge construction)
# bcrypt/scrypt/Argon2 -- REQUIRED for password storage (slow by design)
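# Why a slow password hash (bcrypt, scrypt, Argon2), not SHA-256?
# SHA-256 of "password123" takes well under a millisecond on a CPU
# Argon2 of "password123" takes about 0.3 seconds (configurable)
# Attacker can try SHA-256 at billions of guesses/second with a GPU
# At 0.3 seconds per hash, Argon2 holds each attacker core to ~3 guesses/second,
# and its memory cost makes GPU parallelism expensive -- game changer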
import bcrypt
# Hash a password
hashed = bcrypt.hashpw(b"mypassword", bcrypt.gensalt(rounds=12))
# Verify: checkpw re-hashes with the salt stored in `hashed` and compares in constant time
is_valid = bcrypt.checkpw(b"mypassword", hashed)
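If installing bcrypt is not an option, the same idea can be sketched with scrypt from Python's standard library. This swaps in a different algorithm from the bcrypt example above; the function names and cost parameters are illustrative assumptions and should be tuned for the hardware that will run them.

```python
import hashlib
import hmac
import os

# Illustrative cost settings (about 16 MiB of memory per hash); tune for your hardware
SCRYPT_PARAMS = dict(n=2**14, r=8, p=1, maxmem=64 * 1024 * 1024, dklen=32)

def hash_password(password: str):
    """Return (salt, digest); store both next to the user record."""
    salt = os.urandom(16)                       # unique random salt per password
    digest = hashlib.scrypt(password.encode(), salt=salt, **SCRYPT_PARAMS)
    return salt, digest

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    candidate = hashlib.scrypt(password.encode(), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(candidate, expected)   # constant-time comparison

salt, digest = hash_password("mypassword")
print(verify_password("mypassword", salt, digest))
print(verify_password("wrong-guess", salt, digest))
```

Because the salt is random, hashing the same password twice gives two different digests, which is what defeats precomputed lookup tables.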
# Length extension attack on SHA-256 (CTF/real-world relevance):
# If MAC = SHA256(secret || message), an attacker who knows the MAC and the secret's length
# can forge a valid MAC for message || padding || extra data without knowing the secret
# Fix: use a real HMAC: HMAC = SHA256(key XOR opad || SHA256(key XOR ipad || message))
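In Python the HMAC construction is available in the standard library, so there is rarely a reason to build it by hand. The sketch below uses `hmac.new` and also spells out the two-layer formula manually to show that they agree; `sign`, `verify` and `hmac_sha256_manual` are names chosen for this example.

```python
import hashlib
import hmac

def sign(key: bytes, message: bytes) -> str:
    return hmac.new(key, message, hashlib.sha256).hexdigest()

def verify(key: bytes, message: bytes, tag: str) -> bool:
    return hmac.compare_digest(sign(key, message), tag)   # constant-time comparison

def hmac_sha256_manual(key: bytes, message: bytes) -> str:
    """The HMAC formula written out, for comparison with hmac.new."""
    block_size = 64                                  # SHA-256 block size in bytes
    if len(key) > block_size:
        key = hashlib.sha256(key).digest()           # long keys are hashed first
    key = key.ljust(block_size, b"\x00")             # then zero-padded to a full block
    ipad = bytes(b ^ 0x36 for b in key)
    opad = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha256(ipad + message).digest()
    return hashlib.sha256(opad + inner).hexdigest()

key = b"server-side-secret"
tag = sign(key, b"amount=100")
print(tag == hmac_sha256_manual(key, b"amount=100"))
print(verify(key, b"amount=100&admin=true", tag))
```

The outer hash is what blocks length extension: the attacker never sees the inner digest state, only the result of hashing it again under the key.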
TLS: How HTTPS Actually Works
Every HTTPS connection performs a TLS handshake. Understanding this handshake explains why certificate pinning matters, why expired certificates cause outages, and why cipher suite choices affect security:
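- Client sends: supported TLS versions and cipher suites
- Server responds: chosen cipher suite + its certificate (which binds its public key to its domain name)
- Client validates the certificate: trusted CA chain, matching hostname, not expired
- Key exchange: client and server derive a shared session key (using ECDHE for forward secrecy)
- All further communication is encrypted with the negotiated authenticated cipher (for example AES-GCM) using the session key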
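The key exchange step can be sketched with classic Diffie-Hellman over integers. This is a simplification: real TLS uses elliptic-curve groups (ECDHE) and a dedicated key derivation function, and the modulus and base below are toy values picked for readability rather than security.

```python
import hashlib
import secrets

# Toy parameters for illustration only; real deployments use standardized groups or curves
P = 2**127 - 1     # a Mersenne prime, far too small for real use
G = 3

def ephemeral_keypair():
    """Fresh key pair per connection; discarding it afterwards gives forward secrecy."""
    private = secrets.randbelow(P - 3) + 2      # random value in [2, P - 2]
    public = pow(G, private, P)
    return private, public

client_priv, client_pub = ephemeral_keypair()
server_priv, server_pub = ephemeral_keypair()

# Only the public values cross the network; each side combines them with its own secret
client_secret = pow(server_pub, client_priv, P)
server_secret = pow(client_pub, server_priv, P)

# Simplified derivation of a 256-bit session key from the shared secret
session_key = hashlib.sha256(client_secret.to_bytes(16, "big")).digest()

print(client_secret == server_secret)
```

Since the private values are thrown away after the handshake, recording the traffic and stealing the server's long-term key later does not reveal past session keys.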
# Analyze TLS configuration of a server
openssl s_client -connect plainlysec.com:443 -servername plainlysec.com -showcerts </dev/null 2>/dev/null | head -40
# Check for weak cipher suites with testssl.sh
./testssl.sh plainlysec.com
# What to look for:
# - TLS 1.3 or 1.2 only (disable TLS 1.0 and 1.1)
# - ECDHE key exchange (forward secrecy)
# - AES-GCM or ChaCha20-Poly1305 (authenticated encryption)
# - No RC4, DES, 3DES (broken or deprecated ciphers)
# - Certificate uses SHA-256 signature (not SHA-1)
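The same checks can be scripted with Python's standard `ssl` module. This is a minimal sketch: `make_strict_context` and `inspect_tls` are names invented for this example, and the function only reports what was negotiated rather than auditing every cipher suite the server offers.

```python
import socket
import ssl

def make_strict_context() -> ssl.SSLContext:
    ctx = ssl.create_default_context()             # verifies certificate chain and hostname
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2   # refuse TLS 1.0 and 1.1
    return ctx

def inspect_tls(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Connect to host and report the negotiated protocol, cipher, and certificate expiry."""
    ctx = make_strict_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            cipher_name, _protocol, secret_bits = tls.cipher()
            return {
                "version": tls.version(),
                "cipher": cipher_name,
                "bits": secret_bits,
                "not_after": tls.getpeercert()["notAfter"],
            }
```

Calling `inspect_tls("plainlysec.com")` needs network access and returns a dictionary with the negotiated version and cipher. A handshake failure raised as `ssl.SSLError` usually means the certificate did not validate or the server only speaks protocol versions older than TLS 1.2.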
Cryptography does not fail because the mathematics breaks — it fails because developers misuse it. The wrong mode, the wrong hash function for passwords, the hardcoded key, the reused nonce — these implementation errors are the real threat. Understanding the principles behind the algorithms helps you recognize when code is making a dangerous choice.