Learning Goals
3 min
- Follow a multi-step TDD build from empty file to feature-complete.
- Let each new requirement arrive as a failing test.
- Keep steps tiny; refactor only on green.
- Handle error cases test-first too.
Warm-Up · The Spec (as Tests)
5 min
The classic "String Calculator" kata. The spec, which we'll discover one test at a time:
add("") → 0
add("5") → 5
add("2,3") → 5
add("1,2,3,4") → 10
add("1\n2,3") → 6 (newlines also separate)
add("-1") → ValueError ("negatives not allowed: -1")
Don't read the whole spec and build it all. Take ONE requirement, write its test (red), make it pass (green), refactor, then take the next. The design emerges; you never get ahead of your tests.
The Build, Step by Step
14 min
Step 1 — empty string → 0
# 🔴
def test_empty():
    assert add("") == 0

# 🟢
def add(s):
    return 0
Step 2 — single number
# 🔴
def test_single():
    assert add("5") == 5

# 🟢
def add(s):
    return int(s) if s else 0
Step 3 — two numbers
# 🔴
def test_two():
    assert add("2,3") == 5

# 🟢
def add(s):
    if not s:
        return 0
    return sum(int(n) for n in s.split(","))
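Why does the empty-string guard survive Step 3? A small sketch shows it: splitting an empty string yields one empty piece, and `int("")` raises. The name `naive_add` is hypothetical, used here only to show the unguarded version.

```python
def naive_add(s):
    """Step 3 logic without the empty-string guard (illustration only)."""
    return sum(int(n) for n in s.split(","))

# Splitting "" gives one empty piece, not an empty list.
print("".split(","))      # ['']
print(naive_add("2,3"))   # 5

try:
    naive_add("")
except ValueError as exc:
    # int("") is what fails here, so test_empty would go red without the guard.
    print("empty input fails:", exc)
```

Running the Step 1 test against `naive_add` would fail, which is why the guard stays in the green code.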
Step 4 — many numbers (already passes!)
# 🔴
def test_many():
    assert add("1,2,3,4") == 10

# 🟢 no code change needed; the split already handles it. ✅
Sometimes a new test passes immediately — proof your design generalised. Keep it; it's a regression guard.
Step 5 — newlines as separators
# 🔴
def test_newlines():
    assert add("1\n2,3") == 6

# 🟢
import re

def add(s):
    if not s:
        return 0
    parts = re.split(r"[,\n]", s)
    return sum(int(n) for n in parts)
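To see what the separator pattern in Step 5 does on its own, here is a minimal sketch. `SEPARATORS` and `split_numbers` are hypothetical names introduced for illustration; the pattern itself is the one used in the lesson.

```python
import re

# Character class: matches a single comma or a single newline.
SEPARATORS = r"[,\n]"

def split_numbers(s):
    """Hypothetical helper: split on either separator and return the raw pieces."""
    return re.split(SEPARATORS, s)

print(split_numbers("1\n2,3"))    # ['1', '2', '3']
print(split_numbers("1,2,3,4"))   # ['1', '2', '3', '4']
print(split_numbers("1,\n2"))     # ['1', '', '2']
```

The last call shows a case the spec does not cover: two adjacent separators produce an empty piece, which `int` would reject. Whether that should be an error is a decision for a future test.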
Step 6 — reject negatives (error path, test-first)
# 🔴
import pytest

def test_negatives():
    with pytest.raises(ValueError, match="-1"):
        add("-1")

# 🟢
def add(s):
    if not s:
        return 0
    nums = [int(n) for n in re.split(r"[,\n]", s)]
    negs = [n for n in nums if n < 0]
    if negs:
        # List the offenders without brackets, matching the spec's message
        raise ValueError(f"negatives not allowed: {', '.join(map(str, negs))}")
    return sum(nums)
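The `match` argument of `pytest.raises` is applied to the exception's message with `re.search`, so the Step 6 test passes whenever "-1" appears anywhere in the message. The sketch below is a rough plain-Python equivalent of that check. Both `reject_negatives` and `raises_matching` are hypothetical helpers, and the comma-separated message format is an assumption based on the spec line.

```python
import re

def reject_negatives(nums):
    """Hypothetical stand-in for the validation step inside add()."""
    negatives = [n for n in nums if n < 0]
    if negatives:
        listed = ", ".join(str(n) for n in negatives)
        raise ValueError(f"negatives not allowed: {listed}")
    return nums

def raises_matching(func, pattern, *args):
    """Return True if func(*args) raises ValueError with a message matching pattern."""
    try:
        func(*args)
    except ValueError as exc:
        # Same idea as pytest.raises(..., match=pattern): search, not full match.
        return re.search(pattern, str(exc)) is not None
    return False

print(raises_matching(reject_negatives, "-1", [-1]))          # True
print(raises_matching(reject_negatives, "-1", [1, 2]))        # False
print(raises_matching(reject_negatives, "-1, -3", [-1, 2, -3]))  # True
```

Because the check is a search, a loose pattern like "-1" also accepts messages you may not have intended. Tighten the pattern if the exact wording matters.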
The Finished Artifacts
12 min
# calc.py: every line exists to satisfy a test
import re

def add(s):
    if not s:
        return 0
    nums = [int(n) for n in re.split(r"[,\n]", s)]
    negatives = [n for n in nums if n < 0]
    if negatives:
        raise ValueError(f"negatives not allowed: {', '.join(map(str, negatives))}")
    return sum(nums)
# test_calc.py: the spec, as executable tests
import pytest
from calc import add

def test_empty():
    assert add("") == 0

def test_single():
    assert add("5") == 5

def test_two():
    assert add("2,3") == 5

def test_many():
    assert add("1,2,3,4") == 10

def test_newlines():
    assert add("1\n2,3") == 6

def test_negatives():
    with pytest.raises(ValueError, match="-1"):
        add("-1")
$ pytest test_calc.py -v
test_empty PASSED
test_single PASSED
test_two PASSED
test_many PASSED
test_newlines PASSED
test_negatives PASSED
6 passed
Read the diff
The calculator was never "designed" up front — it grew to satisfy six tests, each added one at a time. The test file doubles as a precise, runnable specification. And coverage is 100% by construction: there's no line that wasn't demanded by a test. That's the TDD promise delivered on a real feature.
Try It Yourself
13 min
TDD a new requirement: add ignores numbers bigger than 1000. Write the test first, watch it fail, then make it pass.
TDD support for a custom delimiter: add("//;\n1;2") == 3 (first line declares the separator). Tiny steps.
After 8+ passing tests, refactor add into smaller helper functions (parse, validate, sum). Confirm all tests stay green.
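As a starting point for the refactoring exercise, here is one possible shape for the helpers. It covers only the six behaviours built in this lesson, not the two extensions above. The helper names follow the exercise text, the built-in `sum` does the totalling, and the message format is an assumption taken from the spec.

```python
import re

def parse(s):
    """Split the input on commas or newlines and convert each piece to int."""
    if not s:
        return []
    return [int(piece) for piece in re.split(r"[,\n]", s)]

def validate(nums):
    """Raise ValueError naming every negative number; otherwise return nums."""
    negatives = [n for n in nums if n < 0]
    if negatives:
        listed = ", ".join(str(n) for n in negatives)
        raise ValueError(f"negatives not allowed: {listed}")
    return nums

def add(s):
    """Public entry point: parse, validate, then total."""
    return sum(validate(parse(s)))
```

The existing tests only call `add`, so they need no changes. If they stay green after the split, the refactor preserved behaviour.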
Mini-Challenge · TDD a Tip Splitter
8 min
From scratch, TDD split_bill(total, people, tip_pct): returns the per-person amount including tip, rounded to 2dp; raises on 0 people; rejects negative totals. Write each test before its code.
The test-first sequence:
import pytest
# Assumes split_bill is defined in, or imported into, this test file.

def test_basic():
    assert split_bill(100, 4, 0) == 25.0

def test_with_tip():
    assert split_bill(100, 4, 10) == 27.5

def test_rounding():
    assert split_bill(100, 3, 0) == 33.33

def test_zero_people():
    with pytest.raises(ValueError):
        split_bill(100, 0, 10)

def test_negative_total():
    with pytest.raises(ValueError):
        split_bill(-5, 2, 10)
Non-negotiables: each test written before the matching code; rounding and the two error paths covered.
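For comparison once you have finished your own attempt, here is a sketch of an implementation those five tests could drive out. The error messages and the choice of float arithmetic with `round` are assumptions; the challenge only fixes the signature, the 2dp rounding, and the two error cases.

```python
def split_bill(total, people, tip_pct):
    """Return the per-person share of total plus tip, rounded to 2 decimal places."""
    if people <= 0:
        raise ValueError("people must be at least 1")
    if total < 0:
        raise ValueError("total must not be negative")
    # Multiply before dividing so whole-number inputs stay exact for longer.
    with_tip = total * (100 + tip_pct) / 100
    return round(with_tip / people, 2)

print(split_bill(100, 4, 0))    # 25.0
print(split_bill(100, 4, 10))   # 27.5
print(split_bill(100, 3, 0))    # 33.33
```

Float rounding is enough for these tests. Code that handles real money often uses `decimal.Decimal` instead, which would be a new requirement and so a new test.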
Recap
3 min
A real TDD build grows the code one failing test at a time: each requirement becomes a red test, then the simplest green, then a refactor. Some tests pass immediately (good; keep them). Error cases are tested first too. The result: a runnable spec, fearless refactoring, and 100% meaningful coverage. Next: a two-part TDD project, a library catalogue.
Homework
4 min
TDD a feature with at least 8 requirements (a string calculator extension, a tip splitter, a password-strength scorer). Commit after every green. Submit the test file, code, and commit log showing the red-green-refactor rhythm. Coverage should be ~100% naturally.
The string-calculator kata (this lesson + the Try-It extensions) is a perfect submission. The commit log proving test-first order is the graded artifact.