Learning Goals
3 min
- Explain where AI bias comes from and how to detect it.
- Apply a fairness check to a model's predictions across groups.
- Reason about privacy, consent, and data provenance.
- Adopt the builder's checklist: transparency, human accountability, harm-minimisation.
Warm-Up · Garbage In, Bias Out
5 min
A hiring model trained on 10 years of mostly-male hires learns to prefer male candidates, not because it's "evil", but because it faithfully copied a biased history.
AI learns patterns from data — including unfair ones baked into history. A model can be technically accurate AND deeply unfair. Catching this is YOUR job as the builder; the model won't flag it for you.
New Concept · The Five Questions
14 min
1. Bias: where does it come from?
- historical bias: the data reflects past unfairness (the hiring example)
- sampling bias: some groups are under-represented in the data
- label bias: the labels themselves were assigned unfairly
- proxy bias: a "neutral" feature (postcode) stands in for a protected one
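Proxy bias can be checked by asking how well a supposedly neutral feature predicts the protected attribute on its own. The following is a minimal sketch, not a definitive test: the column names `postcode` and `group`, the helper names, and the toy data are all invented for illustration.

```python
import pandas as pd

# Toy data, invented for illustration: 'postcode' looks neutral,
# 'group' is the protected attribute.
df = pd.DataFrame({
    "postcode": ["N1", "N1", "N1", "N1", "S9", "S9", "S9", "S9"],
    "group":    ["A",  "A",  "A",  "B",  "B",  "B",  "B",  "A"],
})

def proxy_strength(df, feature, protected):
    """Share of rows guessed correctly when we predict the most common
    protected value within each feature value (1.0 = perfect proxy)."""
    table = pd.crosstab(df[feature], df[protected])
    return table.max(axis=1).sum() / len(df)

def baseline(df, protected):
    """Share guessed correctly when we always predict the most common group."""
    return df[protected].value_counts(normalize=True).max()

strength = proxy_strength(df, "postcode", "group")
base = baseline(df, "group")
print(f"guess without postcode: {base:.0%}")
print(f"guess from postcode:    {strength:.0%}")
```

If the feature lets you guess the protected attribute much better than the baseline, a model can use it to reconstruct that attribute even when the protected column itself has been dropped.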
2. Fairness: measure it across groups
    # check accuracy / approval rate per group, not just overall
    # (assumes the rows of X line up with df and predict() returns 0/1 labels)
    for group in df["gender"].unique():
        mask = df["gender"] == group
        rate = model.predict(X[mask]).mean()  # approval rate
        print(f"{group}: approval rate {rate:.1%}")
    # big gaps between groups → investigate, don't ship
Overall accuracy can hide that a model works great for one group and badly for another. Always break metrics down by relevant groups.
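One way to turn "big gap" into a number is the ratio of the lowest group approval rate to the highest. This sketch assumes 0/1 predictions. The function name and the data are made up. A common rule of thumb, the "four-fifths rule" from US employment guidance, treats ratios below 0.8 as worth investigating, but the right threshold depends on the application.

```python
import pandas as pd

def approval_rate_ratio(preds, groups):
    """Return (lowest rate / highest rate, per-group approval rates)."""
    rates = pd.Series(preds).groupby(pd.Series(groups)).mean()
    return rates.min() / rates.max(), rates

# Made-up predictions (1 = approved) for two groups.
preds  = [1, 1, 1, 0, 1, 0, 0, 0, 1, 0]
groups = ["m", "m", "m", "m", "m", "f", "f", "f", "f", "f"]

ratio, rates = approval_rate_ratio(preds, groups)
print(rates)
print(f"lowest / highest approval rate: {ratio:.2f}")
```

A ratio of 1.0 means every group is approved at the same rate; the further below 1.0, the larger the gap.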
3. Privacy & consent
- Did the people in your data agree to this use?
- Are you storing more personal data than you need?
- Could outputs re-identify someone who should stay anonymous?
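The re-identification question can be partly checked in code: if a combination of ordinary-looking attributes is shared by very few people, those people are easy to single out. This is the idea behind k-anonymity. The sketch below is a rough first check only; the field names, the records, and the choice of `k=3` are assumptions for illustration.

```python
from collections import Counter

def rare_combinations(records, quasi_identifiers, k=3):
    """Return value combinations shared by fewer than k records."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return {combo: n for combo, n in counts.items() if n < k}

# Invented records: no names, yet one person is still unique.
records = [
    {"age_band": "30-39", "postcode": "N1", "outcome": 1},
    {"age_band": "30-39", "postcode": "N1", "outcome": 0},
    {"age_band": "30-39", "postcode": "N1", "outcome": 1},
    {"age_band": "70-79", "postcode": "S9", "outcome": 0},
]

risky = rare_combinations(records, ["age_band", "postcode"], k=3)
print(risky)
```

Any combination this returns is a candidate for coarsening (wider age bands, larger areas) or removal before the data is stored or shared.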
4. Transparency
Can you explain a decision to the person it affects? Prefer interpretable models (trees, logistic regression) for high-stakes decisions; document data sources and known limits.
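Part of what makes logistic regression interpretable is that each feature's contribution to the score is just weight times value, so a decision can be broken down line by line. The sketch below uses hand-written weights to stand in for a trained model. Every number and feature name is hypothetical.

```python
import math

# Hypothetical weights standing in for a trained logistic regression.
weights = {"income_k": 0.04, "missed_payments": -0.9, "years_at_address": 0.1}
bias = -1.0

def explain(applicant):
    """Return (approval probability, contribution of each feature to the score)."""
    contributions = {name: w * applicant[name] for name, w in weights.items()}
    score = bias + sum(contributions.values())
    prob = 1 / (1 + math.exp(-score))  # logistic function
    return prob, contributions

prob, parts = explain({"income_k": 50, "missed_payments": 2, "years_at_address": 3})
for name, value in sorted(parts.items(), key=lambda kv: kv[1]):
    print(f"{name:<18}{value:+.2f}")
print(f"approval probability: {prob:.2f}")
```

The most negative line is the main reason for a low score, which gives you something concrete to tell the person affected.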
5. Human accountability
AI should ASSIST consequential decisions, not make them alone. A human must be able to review, override, and be answerable. "The algorithm decided" is never an acceptable excuse.
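In code, accountability can be enforced by design: the model only fills in a recommendation, and the final outcome stays empty until a named person sets it. The following is one possible sketch. The `Decision` record and `finalise` function are hypothetical names, not part of any library.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Decision:
    case_id: str
    model_recommendation: str            # what the model suggests
    final_outcome: Optional[str] = None  # stays empty until a human decides
    decided_by: Optional[str] = None     # the person answerable for it

def finalise(decision, reviewer, outcome=None):
    """A human confirms the recommendation or overrides it with `outcome`."""
    decision.final_outcome = outcome or decision.model_recommendation
    decision.decided_by = reviewer
    return decision

# The reviewer disagrees with the model and overrides it.
d = finalise(Decision("case-17", "reject"), reviewer="j.smith", outcome="approve")
print(d)
```

Keeping both the recommendation and the final outcome on record also makes it possible to audit later how often humans override the model, and for which groups.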
Worked Example · A Fairness Audit
12 min
    # fairness_audit.py: check a model's behaviour across a group
    import pandas as pd
    from sklearn.metrics import accuracy_score

    # df has: features, true label 'y', a sensitive column 'group', model preds
    def audit(df, group_col, y_col, pred_col):
        print(f"overall accuracy: {accuracy_score(df[y_col], df[pred_col]):.2%}\n")
        print(f"{'group':<12}{'n':>6}{'accuracy':>10}{'positive rate':>15}")
        for g, sub in df.groupby(group_col):
            acc = accuracy_score(sub[y_col], sub[pred_col])
            pos = sub[pred_col].mean()
            print(f"{str(g):<12}{len(sub):>6}{acc:>10.1%}{pos:>15.1%}")

    # audit(df, "group", "y", "pred")
Sample output
    overall accuracy: 86.63%

    group            n  accuracy  positive rate
    A              520     91.0%          62.0%
    B              180     74.0%          31.0%   ← much worse + lower approval
Read the diff
Overall accuracy (about 87%) looked fine, but group B gets 74% accuracy and half the approval rate of group A. That gap is a red flag: the model may be unfair to group B, possibly because B is under-represented in the training data. The audit makes the invisible visible. A responsible builder investigates and fixes this before shipping: collect more data for B, reweight the training examples, or do not deploy.
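Reweighting, one of the fixes mentioned above, can be sketched as giving each training row a weight inversely proportional to the size of its group, so that every group carries the same total weight. This is one simple scheme among several, and the function name is made up. Many scikit-learn estimators accept such weights through the `sample_weight` argument of `fit`.

```python
from collections import Counter

def balanced_group_weights(groups):
    """Weight each row by n / (number_of_groups * size_of_its_group)."""
    counts = Counter(groups)
    n, k = len(groups), len(counts)
    return [n / (k * counts[g]) for g in groups]

# Made-up training set: group B is under-represented.
groups = ["A"] * 6 + ["B"] * 2
weights = balanced_group_weights(groups)

total_a = sum(w for w, g in zip(weights, groups) if g == "A")
total_b = sum(w for w, g in zip(weights, groups) if g == "B")
print(f"weight per A row: {weights[0]:.2f}, per B row: {weights[-1]:.2f}")
print(f"total weight A: {total_a:.1f}, total weight B: {total_b:.1f}")
```

Reweighting is not a guaranteed fix. Re-run the audit afterwards to confirm that the gap actually narrowed.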
Try It Yourself
13 min
Run the fairness audit on a model you built earlier (e.g., Titanic by sex/class). Are metrics even across groups?
List 3 "neutral" features that could secretly proxy for a protected attribute (e.g., postcode → race/income). Why is dropping the protected column NOT enough?
Prompt an LLM with parallel sentences differing only by a name/gender/origin. Do the responses differ in tone or assumptions? Document what you find.
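For the LLM exercise, it helps to generate the parallel prompts from one template so that the name really is the only thing that changes. This is a small sketch. The template, names, and roles are illustrative choices, and sending the prompts to a model is left to whichever tool you use.

```python
from itertools import product

TEMPLATE = "{name} is applying for a {role} position. Write a one-line reference."
names = ["Emily", "Mohammed", "Wei"]   # illustrative names only
roles = ["nurse", "software engineer"]

# One prompt per (role, name) pair; within a role only the name differs.
prompts = [
    {"name": n, "role": r, "prompt": TEMPLATE.format(name=n, role=r)}
    for r, n in product(roles, names)
]

for p in prompts:
    print(p["prompt"])
```

Save each response next to its `name` and `role` so you can compare tone and assumptions side by side when you document your findings.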
Mini-Challenge · A Model Card
8 min
Write a one-page "model card" for a model you built: what it does, training data + provenance, intended use, out-of-scope uses, performance overall AND per group, known biases/limits, and who is accountable. This is industry best practice.
Show the template
    # Model Card: <name>
    - Purpose: ...
    - Training data: source, size, date, consent status
    - Intended use: ...
    - Out-of-scope: things it must NOT be used for
    - Performance: overall + per-group metrics
    - Known limits: biases, failure modes, blind spots
    - Human oversight: who reviews, who can override, who's accountable
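If you already have per-group numbers from an audit, the card can be generated rather than typed, which keeps the metrics in step with the model. This is a minimal sketch. The function name and the placeholder model name are made up, and the "..." fields are left for you to fill in.

```python
def render_model_card(name, fields, group_metrics):
    """Build the card text from a dict of fields and per-group metrics."""
    lines = [f"# Model Card: {name}"]
    lines += [f"- {key}: {value}" for key, value in fields.items()]
    lines.append("- Performance by group:")
    for group, m in group_metrics.items():
        lines.append(f"  - {group}: n={m['n']}, accuracy={m['accuracy']:.1%}")
    return "\n".join(lines)

card = render_model_card(
    "toy-classifier",  # placeholder name
    {"Purpose": "...", "Intended use": "...", "Known limits": "..."},
    {"A": {"n": 520, "accuracy": 0.91}, "B": {"n": 180, "accuracy": 0.74}},
)
print(card)
```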
Recap
3 min
AI learns biases from data; accurate ≠ fair. Measure metrics per group, watch for proxy features, respect privacy and consent, prefer transparent models for high stakes, and keep a human accountable. A model card documents all of it. Building powerful AI responsibly is part of the job, not an afterthought. Next: your capstone.
Vocabulary Card
- algorithmic bias: Systematic unfairness in a model's outputs, usually learned from data.
- proxy feature: A "neutral" feature that stands in for a protected attribute.
- fairness audit: Checking performance across groups, not just overall.
- model card: A document describing a model's purpose, data, performance, and limits.
Homework
4 min
Write a model card for one model you built in Level 5. Include a real per-group fairness check (even a small one) and at least three honest limitations. You'll attach this to your capstone next lesson.
Use the model-card template + fairness_audit.py. A strong card is honest about what the model can't do, not just what it can.