PY-L6-37 · Testing Database Code — Real DB vs Mock

Learning Goals

3 min

Compare mock vs real-DB testing on speed and realism.
Keep real-DB tests fast: in-memory, fixtures, transaction rollback.
Choose the right approach per situation.
Catch SQL-level bugs a mock would miss.

Warm-Up · The Trade-Off Table

5 min

                  MOCK connection      REAL test DB
speed             fastest              fast (in-memory) → slow (server)
proves SQL works? NO                   YES
catches bad SQL   no                   yes (typos, constraints, joins)
catches "did we   yes                  harder
  commit?"
setup effort      low                  medium

Today's big idea

For SQLite, the real in-memory DB is both fast AND realistic — use it as the default. Mock the connection only to assert behaviour (was commit called?) or when the real DB is genuinely heavy (a remote Postgres). Realism usually wins.

New Concept · Fast, Realistic DB Tests

14 min

Default: in-memory SQLite, fresh per test

import sqlite3, pytest

@pytest.fixture
def db():
    con = sqlite3.connect(":memory:")
    con.executescript(open("schema.sql").read())   # real schema
    yield con
    con.close()

Speed trick: build the schema once, roll back per test

If schema setup is expensive, create it once (session scope) and wrap each test in a transaction that's rolled back — so every test sees a clean DB without rebuilding it.

@pytest.fixture(scope="session")
def base_db():
    con = sqlite3.connect(":memory:")
    con.executescript(SCHEMA)
    return con

@pytest.fixture
def db(base_db):
    base_db.execute("BEGIN")
    yield base_db
    base_db.rollback()        # undo everything this test did → clean for next

Rollback is the professional pattern for real databases (especially Postgres): the schema persists, but each test's data changes vanish, keeping tests both fast and isolated.

When to mock instead

Mock the connection when:
  - the real DB is a remote server too slow for unit tests
  - you're testing "did the code COMMIT / RETRY / handle an error?"
  - you can't run the real DB in CI

Otherwise prefer the real (in-memory) DB — it catches SQL bugs.

Bugs only a real DB catches

# a UNIQUE constraint, a typo'd column, a bad JOIN, a NOT NULL —
# a mock happily accepts all of these; the real DB rejects them:
def test_duplicate_email_rejected(db):
    add_user(db, "a@x.com")
    with pytest.raises(sqlite3.IntegrityError):
        add_user(db, "a@x.com")    # the real UNIQUE constraint fires

Worked Example · Both Approaches, Side by Side

12 min

# repo.py
class OrderRepo:
    def __init__(self, conn): self.conn = conn
    def place(self, item, qty):
        if qty <= 0:
            raise ValueError("qty must be positive")
        self.conn.execute("INSERT INTO orders (item, qty) VALUES (?,?)",
                          (item, qty))
        self.conn.commit()
    def total_qty(self):
        return self.conn.execute("SELECT COALESCE(SUM(qty),0) FROM orders").fetchone()[0]

# test_repo.py
import sqlite3, pytest
from unittest.mock import Mock
from repo import OrderRepo

@pytest.fixture
def db():
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE orders (item TEXT, qty INTEGER NOT NULL)")
    yield con; con.close()

# REAL DB — proves the SQL + aggregation work
def test_place_and_total(db):
    repo = OrderRepo(db)
    repo.place("roti", 3); repo.place("milo", 2)
    assert repo.total_qty() == 5

def test_empty_total(db):
    assert OrderRepo(db).total_qty() == 0      # COALESCE handles empty ✓

def test_bad_qty_rejected(db):
    with pytest.raises(ValueError):
        OrderRepo(db).place("x", 0)

# MOCK — proves behaviour (we commit) without a DB
def test_place_commits():
    conn = Mock()
    OrderRepo(conn).place("roti", 1)
    conn.commit.assert_called_once()

$ pytest test_repo.py -v
test_place_and_total PASSED      # real SQL + SUM aggregation verified
test_empty_total PASSED          # COALESCE path verified
test_bad_qty_rejected PASSED
test_place_commits PASSED        # commit behaviour verified via mock
4 passed in 0.02s

Read the diff

Most tests use the real in-memory DB — they prove the INSERT, the SUM, and the COALESCE-empty path actually work in SQL. One mock test verifies commit() is called (pure behaviour). The real-DB tests would catch a typo'd column or a broken aggregation; the mock never would. Default to real; reach for mock only for behaviour or heavy DBs.

Try It Yourself

13 min

01 🟢 Real-DB CRUD

Test a repo's create/read with an in-memory DB. Confirm a SUM/COUNT aggregation gives the right answer.

02 🟡 Constraint test

Add a UNIQUE or NOT NULL constraint to your schema and write a test proving the real DB rejects a violation. Confirm a mock would not catch it.

03 🔴 Rollback isolation

Implement the session-scoped-schema + per-test-rollback fixture pattern. Confirm tests are isolated AND the schema is built only once.

Mini-Challenge · The SQL-Bug Demo

8 min

Write a query with a deliberate bug (wrong column name, or SUM on the wrong field). Show the real-DB test fails (catching it), then show a mock-connection test of the same method PASSES (missing it). This is the definitive argument for real-DB testing where it's feasible.

Recap

3 min

Real DB vs mock is speed vs realism. For SQLite, in-memory gives you both — make it the default; it catches SQL typos, constraints, and aggregation bugs a mock never would. Use session-schema + per-test rollback to keep real-DB tests fast and isolated. Mock the connection only for behaviour checks or genuinely heavy databases. Next: testing AI/ML code, where "correct" is fuzzy.

Vocabulary Card

test database: A real DB used only for tests, reset between runs.
transaction rollback: Undoing a test's DB changes so the next test starts clean.
integrity constraint: A DB rule (UNIQUE, NOT NULL) that only a real DB enforces.
realism vs speed: The core DB-testing trade-off; in-memory SQLite gives much of both.

Homework

4 min

For a repo with a constraint and an aggregation, write a real-DB test suite (in-memory, with rollback isolation) plus one mock test for a commit. Then do the SQL-bug demo: prove the real test catches a query bug the mock misses. Submit both and a one-line conclusion on your default choice.

MOCK connection REAL test DB speed fastest fast (in-memory) → slow (server) proves SQL works? NO YES catches bad SQL no yes (typos, constraints, joins) catches "did we yes harder commit?" setup effort low medium

@pytest.fixture(scope="session") def base_db(): con = sqlite3.connect(":memory:") con.executescript(SCHEMA) return con @pytest.fixture def db(base_db): base_db.execute("BEGIN") yield base_db base_db.rollback() # undo everything this test did → clean for next

Mock the connection when: - the real DB is a remote server too slow for unit tests - you're testing "did the code COMMIT / RETRY / handle an error?" - you can't run the real DB in CI Otherwise prefer the real (in-memory) DB — it catches SQL bugs.

# a UNIQUE constraint, a typo'd column, a bad JOIN, a NOT NULL — # a mock happily accepts all of these; the real DB rejects them: def test_duplicate_email_rejected(db): add_user(db, "a@x.com") with pytest.raises(sqlite3.IntegrityError): add_user(db, "a@x.com") # the real UNIQUE constraint fires

# repo.py class OrderRepo: def __init__(self, conn): self.conn = conn def place(self, item, qty): if qty <= 0: raise ValueError("qty must be positive") self.conn.execute("INSERT INTO orders (item, qty) VALUES (?,?)", (item, qty)) self.conn.commit() def total_qty(self): return self.conn.execute("SELECT COALESCE(SUM(qty),0) FROM orders").fetchone()[0]

# test_repo.py import sqlite3, pytest from unittest.mock import Mock from repo import OrderRepo @pytest.fixture def db(): con = sqlite3.connect(":memory:") con.execute("CREATE TABLE orders (item TEXT, qty INTEGER NOT NULL)") yield con; con.close() # REAL DB — proves the SQL + aggregation work def test_place_and_total(db): repo = OrderRepo(db) repo.place("roti", 3); repo.place("milo", 2) assert repo.total_qty() == 5 def test_empty_total(db): assert OrderRepo(db).total_qty() == 0 # COALESCE handles empty ✓ def test_bad_qty_rejected(db): with pytest.raises(ValueError): OrderRepo(db).place("x", 0) # MOCK — proves behaviour (we commit) without a DB def test_place_commits(): conn = Mock() OrderRepo(conn).place("roti", 1) conn.commit.assert_called_once()

$ pytest test_repo.py -v test_place_and_total PASSED # real SQL + SUM aggregation verified test_empty_total PASSED # COALESCE path verified test_bad_qty_rejected PASSED test_place_commits PASSED # commit behaviour verified via mock 4 passed in 0.02s