Learning Goals
3 minBy the end of this lesson you can:
- Explain why plain text and plain fast-hashes both fail for passwords.
- Describe rainbow-table attacks and how a salt defeats them.
- State why a password hash must be slow and tunable.
- Verify a password in constant time to avoid timing leaks.
Warm-Up · The Breach That Teaches Everyone
5 minDatabases get stolen. It's not if but when. The question that decides whether it's a disaster or a non-event: what was in the password column?
Stored as... Attacker who steals the DB gets... plain text EVERY password instantly. Catastrophe. MD5 / SHA-256 (unsalted) cracks most in minutes via rainbow tables. SHA-256 + salt better, but fast hashes still fall to GPUs. bcrypt/argon2 + salt cracking is slow & expensive → most survive.
You never store the password — you store something derived from it that lets you check a login but can't be reversed. Three properties make that derivation safe: it's salted (unique per user, defeats precomputation), slow (so guessing is expensive), and verified in constant time (so the comparison leaks nothing). Miss any one and you're the next headline.
New Concept · Salt, Slow, Constant-Time
14 minWhy plain text is indefensible
If you store password = "hunter2", then anyone who reads the database — a hacker, a rogue employee, a leaked backup — has every user's actual password. And since people reuse passwords, you've also compromised their email, bank, everything. There is never a reason to store a recoverable password.
Why plain hashing isn't enough either
import hashlib # tempting but WRONG: stored = hashlib.sha256("hunter2".encode()).hexdigest() # Problem 1: identical passwords → identical hashes (visible in the DB) # Problem 2: precomputed "rainbow tables" map common-password hashes → instant crack # Problem 3: SHA-256 is FAST — a GPU tries billions of guesses/second
Attackers don't reverse the hash; they guess: hash a dictionary of common passwords and compare. Because the hash is deterministic and fast, and everyone's "password123" hashes the same, most fall quickly. Rainbow tables make it instant.
Defence 1: Salt — a unique random value per password
import hashlib, os # a salt makes each hash unique, even for identical passwords salt = os.urandom(16) # 16 random bytes, unique per user hashed = hashlib.sha256(salt + "hunter2".encode()).hexdigest() # store BOTH the salt and the hash. The salt isn't secret — it's anti-precomputation.
- Two users with the same password now have different hashes (the salt differs).
- Rainbow tables are useless — they'd need a separate table per salt.
- The salt is stored alongside the hash; it doesn't need to be secret, just unique and random.
Defence 2: Slow — make each guess expensive
For files you want hashing fast. For passwords you want it deliberately slow — because the attacker has to run it for every guess. A purpose-built password hash (bcrypt, scrypt, argon2) has a tunable "work factor": set it so one hash takes ~250ms. A legitimate login barely notices; an attacker doing billions of guesses is throttled to a crawl. As hardware improves, you raise the work factor.
Defence 3: Constant-time comparison
import hmac # WRONG: == returns early on the first differing byte → timing leak if stored == provided_hash: ... # RIGHT: compares in constant time regardless of where they differ if hmac.compare_digest(stored, provided_hash): ...
A regular == on bytes can return faster when strings differ early — measurable over many tries, leaking how much of a guess is correct (a timing attack). hmac.compare_digest always takes the same time. (bcrypt's own checkpw already does this for you — next lesson.)
The takeaway hierarchy
plain text ✗✗✗ never fast hash, no salt ✗✗ rainbow-tabled in seconds fast hash + salt ✗ GPU-cracked (still too fast) SLOW hash + salt ✓ bcrypt / argon2 — the only acceptable answer + constant-time check ✓ no timing leaks on verify
Worked Example · Demonstrate Why Salt & Speed Matter
12 minGoal: a safe, local demonstration (against passwords you invent) of how unsalted fast hashes fall and how salting + slowness defend — making the principles undeniable. No real credentials anywhere.
import hashlib, os, time # --- demo 1: unsalted fast hashes are crackable by a dictionary --- def naive_store(password: str) -> str: return hashlib.sha256(password.encode()).hexdigest() # WRONG approach # an attacker's offline dictionary attack (against MY OWN demo data): COMMON = ["123456", "password", "qwerty", "hunter2", "letmein"] def crack(stolen_hash: str) -> str | None: for guess in COMMON: # in reality, millions of words if hashlib.sha256(guess.encode()).hexdigest() == stolen_hash: return guess return None stolen = naive_store("hunter2") print("unsalted SHA-256 cracked →", crack(stolen)) # found instantly # --- demo 2: salt makes identical passwords look different --- def salted_store(password: str): salt = os.urandom(16) h = hashlib.sha256(salt + password.encode()).hexdigest() return salt.hex(), h s1 = salted_store("hunter2") s2 = salted_store("hunter2") print("same password, different stored hashes:", s1[1] != s2[1]) # True # --- demo 3: 'slow' is the real defence (illustrative timing) --- def slow_hash(password: str, salt: bytes, rounds: int = 200_000) -> str: # PBKDF2 = repeat the hash many times to make each guess expensive return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, rounds).hex() t = time.perf_counter() slow_hash("hunter2", os.urandom(16)) print(f"one slow hash took {time.perf_counter()-t:.3f}s " f"→ a billion guesses now takes a LONG time")
unsalted SHA-256 cracked → hunter2 same password, different stored hashes: True one slow hash took 0.085s → a billion guesses now takes a LONG time
Read the code
Three demos prove the three defences, all against passwords you invented (never anyone's real ones). Demo 1 cracks an unsalted SHA-256 instantly with a tiny dictionary — that's real attacks at small scale. Demo 2 shows salt making identical passwords store differently, killing rainbow tables. Demo 3 uses PBKDF2 (a slow hash via many rounds) so each guess costs ~85ms instead of nanoseconds — multiply by billions of guesses and cracking becomes infeasible. Don't roll your own for production — use bcrypt/argon2 (next lesson). This is just to make why undeniable.
Try It Yourself
13 minAll exercises use passwords you make up — never real credentials, yours or anyone's.
Hash the same made-up password three times with a fresh salt each time, and three times without a salt. Confirm the unsalted hashes are identical and the salted ones all differ.
Time PBKDF2 at 10k, 100k, and 500k rounds. Find the rounds that take ~250ms on your machine — that's a reasonable work factor. Explain why you'd increase it over the years.
Hint
import hashlib, os, time for rounds in (10_000, 100_000, 500_000): salt = os.urandom(16); t = time.perf_counter() hashlib.pbkdf2_hmac("sha256", b"demo", salt, rounds) print(f"{rounds:>7} rounds: {time.perf_counter()-t:.3f}s") # Raise rounds as CPUs get faster, to keep each guess ~costly.
Write a verify function using hmac.compare_digest. Explain in comments the timing attack that == would enable and why constant-time comparison closes it.
Hint
import hmac, hashlib, os def make(pw, rounds=200_000): salt = os.urandom(16) h = hashlib.pbkdf2_hmac("sha256", pw.encode(), salt, rounds) return salt, h def verify(pw, salt, expected, rounds=200_000): h = hashlib.pbkdf2_hmac("sha256", pw.encode(), salt, rounds) return hmac.compare_digest(h, expected) # constant time # '==' on bytes can return early at the first mismatch; timing the # response over many tries leaks how many leading bytes were right.
Mini-Challenge · A (Correct) Mini Auth Store
8 minBuild a tiny local user store with register(user, password) and login(user, password) that stores only salt + slow-hash (PBKDF2), never the password, and verifies in constant time. Test that two users with the same password get different stored values and that login succeeds only with the right password. (Next lesson swaps PBKDF2 for bcrypt — the production choice.)
Show a sample solution
import hashlib, os, hmac, json from pathlib import Path DB = Path("users.json") ROUNDS = 200_000 def _load(): return json.loads(DB.read_text()) if DB.exists() else {} def _save(d): DB.write_text(json.dumps(d)) def register(user: str, password: str) -> None: users = _load() if user in users: raise ValueError("user exists") salt = os.urandom(16) h = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ROUNDS) users[user] = {"salt": salt.hex(), "hash": h.hex()} # NO plaintext _save(users) def login(user: str, password: str) -> bool: users = _load() rec = users.get(user) if not rec: # still do work to avoid leaking whether the user exists (timing) hashlib.pbkdf2_hmac("sha256", password.encode(), os.urandom(16), ROUNDS) return False salt = bytes.fromhex(rec["salt"]) h = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ROUNDS) return hmac.compare_digest(h.hex(), rec["hash"]) register("aisha", "demo-pw-1"); register("ben", "demo-pw-1") print("different stored hashes:", _load()["aisha"] != _load()["ben"]) # True print("right pw:", login("aisha", "demo-pw-1")) # True print("wrong pw:", login("aisha", "nope")) # False
Non-negotiables: salt + slow hash, no plaintext stored, constant-time verify, identical passwords store differently, and constant work even for unknown users.
Recap
3 minNever store passwords recoverably. Plain text hands an attacker everything; plain fast hashes fall to rainbow tables and GPUs. The correct recipe has three parts: salt (unique random per password — defeats precomputation and hides identical passwords), slow (a tunable work factor so each guess is expensive — the opposite of file hashing), and constant-time verification (hmac.compare_digest, to avoid timing leaks). We demonstrated each with PBKDF2 against invented passwords — but for production, don't roll your own: use bcrypt or argon2, which is exactly the next lesson.
Vocabulary Card
- salt
- A unique random value added per password to defeat precomputation.
- rainbow table
- A precomputed map of hashes → passwords; salting makes it useless.
- work factor
- A tunable cost making each password hash deliberately slow.
- timing attack
- Inferring secrets from how long a comparison takes; constant-time defeats it.
Homework
4 minBuild the correct mini auth store and prove all four properties (no plaintext, salted, slow, constant-time). Then write a one-page explainer, for a non-technical manager, of why "we hash passwords" isn't enough and what salting + slow hashing add — using the breach framing ("when the DB is stolen, what does the attacker get?").
Sample · manager explainer
"We hash passwords" — manager explainer Assume our database WILL be stolen one day. The only question is what the thief gets from the password column. - Plain text: every customer's password, instantly. They reuse passwords, so we've also handed over their email and bank logins. - Just hashing (SHA-256): attackers don't reverse it — they guess. They hash millions of common passwords and match. Identical passwords look identical in our DB. Most crack in minutes. - + SALT (unique random per user): identical passwords now look different, and precomputed "rainbow tables" become useless. - + SLOW hashing (bcrypt): each guess takes ~0.25s instead of nanoseconds, so trying billions becomes years of compute. Real logins don't notice; attackers are stopped. Bottom line: salt + slow hashing turns a catastrophic breach into a manageable one. It's a few lines of code and an industry standard.
Non-negotiables: working four-property store, and a clear breach-framed explanation a non-technical reader understands.