Learning Goals
3 min
By the end of this lesson you can:
- Explain software-supply-chain risk and how a trusted dependency turns malicious.
- Pin dependencies and use hash-locked installs for reproducibility + integrity.
- Scan dependencies for known vulnerabilities with pip-audit.
- Verify integrity of updates/data (signatures, checksums) and avoid insecure deserialization.
Warm-Up · You Run More Code Than You Wrote
5 min
A typical Python app imports dozens of libraries, which import hundreds more. You wrote maybe 2% of the code that runs, yet you trust the other 98% completely. If an attacker compromises one popular package (or tricks you into installing a look-alike), their code runs with your app's privileges. That's a supply-chain attack.
A08 (Software and Data Integrity Failures in the OWASP Top 10) is about trusting code and data you didn't write, and the failures come from not verifying it: installing unpinned/unverified packages, accepting unsigned updates, deserializing untrusted data into objects, or pulling a malicious look-alike package. The defences are about verifiable trust: pin exact versions, lock hashes, scan for known CVEs, verify signatures, and never deserialize untrusted input. Trust only what you can verify.
New Concept · Supply-Chain Risk & Defences
14 min
How the supply chain gets attacked
- typosquatting: a malicious "reqeusts" / "python-sqlite" mimics a real package
- dependency confusion: attacker publishes a public package with an internal name
- account takeover: a maintainer's account is hijacked; a backdoor ships in an update
- malicious update: a previously-good package adds malware in a new version
- compromised CI: the build pipeline is tampered to inject code (SolarWinds-style)
- poisoned data: a model/dataset/config you load contains an exploit
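Typosquatting in particular lends itself to a quick automated check. The following is a minimal sketch, not a complete defence: it assumes you keep an allowlist of vetted package names (the `KNOWN_GOOD` list and the `looks_like_typosquat` helper are hypothetical names for illustration) and uses the standard-library `difflib` to flag names that are close to, but not the same as, a vetted one.

```python
import difflib

# Hypothetical allowlist: in practice, the packages your team has vetted.
KNOWN_GOOD = ["requests", "flask", "jinja2", "numpy", "pyyaml"]


def looks_like_typosquat(name, known=KNOWN_GOOD, cutoff=0.8):
    """Return the vetted package that `name` closely resembles, or None.

    An exact match is fine (returns None); a near miss is suspicious.
    """
    name = name.lower()
    if name in known:
        return None
    close = difflib.get_close_matches(name, known, n=1, cutoff=cutoff)
    return close[0] if close else None


for candidate in ["requests", "reqeusts", "left-pad"]:
    match = looks_like_typosquat(candidate)
    status = f"suspicious, resembles '{match}'" if match else "no near match"
    print(f"{candidate}: {status}")
```

A similarity cutoff is a heuristic: it can miss creative look-alikes and can flag legitimate packages with similar names, so treat a hit as a prompt to check the package by hand.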
Defence 1: pin exact versions
# BAD: unpinned, so 'pip install' may pull a brand-new (possibly malicious) version
# requirements.txt:
#   requests
#   flask

# GOOD: pinned to exact versions you've vetted
#   requests==2.32.3
#   flask==3.0.3
Pinning makes builds reproducible and stops a surprise update from silently changing (or backdooring) your app. Use a lockfile (pip freeze, Poetry/uv lock, pipenv) so every install is identical.
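One way to confirm that an environment really matches its pins is to compare each pinned version with what is installed. This is a sketch under assumptions: `parse_pins` and `find_drift` are hypothetical helpers, the parser only understands simple `name==version` lines, and the demo uses a fake lookup instead of the real environment so that it runs anywhere.

```python
from importlib import metadata


def parse_pins(requirements_text):
    """Map package name -> pinned version for every 'name==version' line."""
    pins = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip().rstrip("\\").strip()
        if "==" not in line or line.startswith("-"):
            continue  # blank, comment, option line, or unpinned requirement
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.split()[0].split(";")[0].strip()
    return pins


def find_drift(pins, installed_version=metadata.version):
    """Return (name, pinned, installed) for every package that differs."""
    drift = []
    for name, wanted in pins.items():
        try:
            found = installed_version(name)
        except metadata.PackageNotFoundError:
            found = None  # pinned but not installed
        if found != wanted:
            drift.append((name, wanted, found))
    return drift


# Demo with made-up data rather than the real environment.
SAMPLE = "# vetted pins\nrequests==2.32.3\nflask==3.0.3\n"
FAKE_ENV = {"requests": "2.32.3", "flask": "3.1.0"}


def fake_lookup(name):
    if name not in FAKE_ENV:
        raise metadata.PackageNotFoundError(name)
    return FAKE_ENV[name]


print(find_drift(parse_pins(SAMPLE), fake_lookup))
```

Calling `find_drift(pins)` without the second argument checks the interpreter's real environment via `importlib.metadata.version`.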
Defence 2: hash-locked installs (integrity)
# requirements.txt with hashes: pip verifies the downloaded file matches
#   requests==2.32.3 \
#       --hash=sha256:55365417734eb18255590a9ff9eb97e9e1da868d4ccd6402399eaf68af20a760

# install with hash checking enforced:
#   pip install --require-hashes -r requirements.txt
Hashes go beyond pinning: even if an attacker replaced version 2.32.3 on the index with a malicious build, the hash wouldn't match and pip would refuse it. This is the integrity check (Lesson 11) applied to your dependencies.
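The idea behind a hash check can be sketched in a few lines with the standard library. This is an illustration of the concept, not pip's actual implementation; the helper names and the file name are hypothetical.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large downloads don't need to fit in memory."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """True only if the file's SHA-256 equals the hash recorded when it was vetted."""
    return hmac.compare_digest(sha256_of(path), expected_hex.lower())


with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "example-1.0.tar.gz"  # hypothetical file name
    artifact.write_bytes(b"vetted release contents")
    pinned = sha256_of(artifact)  # the hash you would store in requirements.txt
    ok_before = matches_expected(artifact, pinned)

    # Simulate the index serving a different file under the same version.
    artifact.write_bytes(b"vetted release contents + backdoor")
    ok_after = matches_expected(artifact, pinned)

print(ok_before, ok_after)  # True False
```

Any change to the file, even a single byte, yields a different digest, which is why a swapped build under the same version number is caught.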
Defence 3: scan for known vulnerabilities
# pip-audit checks your installed packages against vulnerability databases:
#   pip install pip-audit
#   pip-audit                      → lists packages with known CVEs + safe versions
#   pip-audit -r requirements.txt
# (also: GitHub Dependabot, 'safety', Snyk: same idea, automated.)
This is OWASP A06 (Vulnerable Components) too: an outdated library with a public, exploited CVE is a common breach vector. Scan regularly and in CI (Level 7) so you find out before attackers do.
Defence 4: verify signatures & avoid insecure deserialization
# Verify update/release signatures (Lesson 17) before trusting them.

# ⚠️ NEVER unpickle untrusted data: pickle can execute arbitrary code.
import pickle
# data = pickle.loads(user_supplied_bytes)   # ✗ RCE if attacker controls bytes!

# Safe: use JSON for untrusted data (it can't carry code).
import json
user_supplied_text = '{"name": "alice"}'     # example input for illustration
data = json.loads(user_supplied_text)        # ✓ data only, no code
pickle.loads (and yaml.load) can run code
Python's pickle deserializes into objects, and a payload can be crafted to execute arbitrary code on load, so pickle.loads(untrusted_bytes) amounts to remote code execution. The same holds for yaml.load without SafeLoader. This is an A08 "data integrity" failure: trusting serialized data you didn't produce. Use json for untrusted data, and yaml.safe_load for YAML.
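Checking integrity before parsing can be sketched with the standard library. This example uses an HMAC, a shared-secret stand-in for the public-key signatures used on real releases, and it is an assumption-laden illustration: `SECRET_KEY`, `sign`, and `verify_and_load` are hypothetical names, and a real key would come from a secret store rather than the source file.

```python
import hashlib
import hmac
import json

SECRET_KEY = b"demo-key-do-not-reuse"  # hypothetical; never hard-code real keys


def sign(payload, key=SECRET_KEY):
    """Serialize to JSON bytes and compute an HMAC-SHA256 tag over them."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    tag = hmac.new(key, body, hashlib.sha256).hexdigest()
    return body, tag


def verify_and_load(body, tag, key=SECRET_KEY):
    """Parse the JSON only after the tag has been verified."""
    expected = hmac.new(key, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, tag):
        raise ValueError("integrity check failed: data modified or wrong key")
    return json.loads(body)


body, tag = sign({"version": "1.4.2", "channel": "stable"})
print(verify_and_load(body, tag))

tampered = body.replace(b"stable", b"nightly")
try:
    verify_and_load(tampered, tag)
except ValueError as exc:
    print("rejected:", exc)
```

The order matters: the tag is checked first, and the data is parsed only if it verifies. Even then the format is JSON, so a verified payload still cannot carry code.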
Worked Example · A Dependency Integrity Checker
12 min
Goal: a tool that audits a project's dependency hygiene (unpinned packages, missing hashes, and, via pip-audit, known-vulnerable versions) and produces a supply-chain readiness report.
import json
import subprocess
from pathlib import Path


def check_pinning(req_path: str) -> list[str]:
    """Flag unpinned requirements and a requirements file without hashes."""
    text = Path(req_path).read_text(encoding="utf-8")
    issues = []
    for line in text.splitlines():
        line = line.strip().rstrip("\\").strip()
        # Skip blanks, comments, and option lines such as --hash=... continuations.
        if not line or line.startswith(("#", "-")):
            continue
        if "==" not in line:
            issues.append(f"🟡 unpinned: '{line}' (pin to ==exact.version)")
    if "--hash" not in text:
        issues.append("🟡 no hashes: use --require-hashes for integrity.")
    return issues


def scan_cves() -> list[str]:
    """Run pip-audit and list known-vulnerable installed packages."""
    try:
        out = subprocess.run(["pip-audit", "--format", "json"],
                             capture_output=True, text=True, timeout=120)
        data = json.loads(out.stdout or "{}")
    except FileNotFoundError:
        return ["(install pip-audit to scan for CVEs)"]
    except (subprocess.TimeoutExpired, json.JSONDecodeError) as exc:
        return [f"(pip-audit scan failed: {exc})"]
    vulns = []
    for dep in data.get("dependencies", []):
        for v in dep.get("vulns", []):
            vulns.append(f"🔴 {dep['name']} {dep['version']}: {v['id']} "
                         f"→ upgrade to {v.get('fix_versions', ['?'])}")
    return vulns or ["✓ no known CVEs in installed packages"]


print("=== Supply-chain readiness ===")
for issue in check_pinning("requirements.txt"):
    print(" ", issue)
print()
for v in scan_cves():
    print(" ", v)

Sample output:
=== Supply-chain readiness ===
  🟡 unpinned: 'requests' (pin to ==exact.version)
  🟡 no hashes: use --require-hashes for integrity.

  🔴 jinja2 3.1.2: GHSA-h5c8-rqwp-cp95 → upgrade to ['3.1.3']
Read the code
The checker covers the three pillars of dependency hygiene: pinning (reproducible, no surprise updates), hashing (integrity — the downloaded file is the one you vetted), and CVE scanning via pip-audit (catching known-vulnerable versions — A06/A08). Run it in CI (Level 7) and a vulnerable or unpinned dependency fails the build before it ships. The deeper habit it enforces: you don't just pip install and hope — you verify what you depend on, because that code runs with your app's full privileges.
Try It Yourself
13 min
Take a project's requirements.txt, pin everything to exact versions (from pip freeze), then run pip-audit. Note any known-vulnerable packages and their safe versions.
On your own machine, show that unpickling crafted bytes can run code (a harmless print via __reduce__), then switch to json for the same data and confirm it can't. This makes "never unpickle untrusted data" visceral.
Hint (harmless demo)
import pickle


class Demo:
    def __reduce__(self):
        # pickle calls print(...) with this argument when the object is loaded
        return (print, ("⚠️ this ran during unpickling!",))


payload = pickle.dumps(Demo())
pickle.loads(payload)  # prints the message: code ran on load!

# JSON can't do this: json.loads only produces data, never executes.
Generate a hash-locked requirements.txt (e.g. with pip-compile --generate-hashes from pip-tools) and install with --require-hashes. Then tamper with one hash and confirm pip refuses the install — integrity verification in action.
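For the JSON half of the pickle exercise above, a short sketch can make the contrast concrete. The `Demo` class mirrors the one in the hint; the `untrusted` string is made-up input that imitates an attacker trying to smuggle a callable.

```python
import json


class Demo:
    def __reduce__(self):
        return (print, ("this would run during unpickling",))


# json refuses to serialize arbitrary objects, so there is no hook to abuse.
try:
    json.dumps(Demo())
    serialized = True
except TypeError:
    serialized = False

# Even if the text names a function, json.loads only builds plain data types.
untrusted = '{"__reduce__": "print", "args": ["hello"], "n": 1}'
loaded = json.loads(untrusted)

print(serialized)               # False
print(type(loaded).__name__)    # dict
print(loaded["__reduce__"])     # print  (just a string, nothing was called)
```

Loading JSON yields only dicts, lists, strings, numbers, booleans, and None. What your code then does with those values is still your responsibility, so validate them before use.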
Mini-Challenge · A CI Supply-Chain Gate
8 min
Build a script suitable for CI (Level 7's GitHub Actions) that fails the build (exits non-zero) if: any dependency is unpinned, any has a known CVE (via pip-audit), or any code uses pickle.loads/yaml.load on potentially-untrusted input. This is a real supply-chain quality gate.
Show a sample solution
import re
import subprocess
import sys
from pathlib import Path

UNSAFE_CALL = re.compile(r"pickle\.loads?\(|yaml\.load\((?!.*SafeLoader)")


def gate() -> int:
    failures = []

    # 1. All deps pinned? Skip blanks, comments, and --hash continuation lines.
    for line in Path("requirements.txt").read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "==" not in line and "--hash" not in line:
            failures.append(f"unpinned dependency: {line}")

    # 2. Known CVEs? pip-audit exits non-zero when it finds vulnerable packages.
    try:
        audit = subprocess.run(["pip-audit"], capture_output=True, text=True)
        if audit.returncode != 0:
            failures.append("pip-audit found vulnerable packages:\n" + audit.stdout)
    except FileNotFoundError:
        failures.append("pip-audit is not installed, so CVEs could not be checked")

    # 3. Dangerous deserialization in source?
    for py in Path("src").rglob("*.py"):
        if UNSAFE_CALL.search(py.read_text(encoding="utf-8")):
            failures.append(f"unsafe deserialization in {py}")

    if failures:
        print("❌ SUPPLY-CHAIN GATE FAILED:")
        for failure in failures:
            print(" -", failure)
        return 1
    print("✅ supply-chain gate passed")
    return 0


if __name__ == "__main__":
    sys.exit(gate())
Non-negotiables: fails the build on unpinned deps, known CVEs, or unsafe deserialization — a real CI gate that exits non-zero.
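The regex in the sample gate matches raw text, so it can also fire on comments or string literals. One possible refinement, sketched here with the standard-library `ast` module, inspects actual call expressions instead. The helper names are hypothetical, and the sketch only recognizes direct `pickle.load(s)` and `yaml.load` calls; aliased imports such as `from pickle import loads` would slip past it.

```python
import ast

SAFE_YAML_LOADERS = {"SafeLoader", "CSafeLoader"}


def _uses_safe_loader(call):
    """True if yaml.load(...) is given a safe Loader, by keyword or position."""
    candidates = [kw.value for kw in call.keywords if kw.arg == "Loader"]
    candidates += call.args[1:2]  # Loader may also be the second positional arg
    for value in candidates:
        if isinstance(value, ast.Attribute):
            name = value.attr          # e.g. yaml.SafeLoader
        else:
            name = getattr(value, "id", None)  # e.g. SafeLoader
        if name in SAFE_YAML_LOADERS:
            return True
    return False


def find_unsafe_deserialization(source):
    """Return sorted (line_number, call_name) pairs for risky calls."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)):
            continue
        target = node.func.value
        if not isinstance(target, ast.Name):
            continue
        call = f"{target.id}.{node.func.attr}"
        if call in {"pickle.load", "pickle.loads"}:
            findings.append((node.lineno, call))
        elif call == "yaml.load" and not _uses_safe_loader(node):
            findings.append((node.lineno, call))
    return sorted(findings)


SAMPLE = '''
import pickle, yaml
a = pickle.loads(blob)
b = yaml.load(text)
c = yaml.load(text, Loader=yaml.SafeLoader)
d = yaml.safe_load(text)
# pickle.loads( mentioned in a comment is ignored
'''

print(find_unsafe_deserialization(SAMPLE))
```

Because the source is only parsed and never executed, the scanner is safe to run on code you do not trust.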
Recap
3 min
You run far more code than you write, so A08 is about trusting code and data you didn't author, and the failures come from not verifying it. Defences build verifiable trust: pin exact versions (reproducible, no surprise updates), hash-lock installs (the file is the one you vetted), scan with pip-audit for known CVEs (also A06), verify signatures on updates (Lesson 17), and never deserialize untrusted data (pickle.loads/yaml.load = RCE; use json/safe_load). Automate these as a CI gate so a vulnerable or unpinned dependency can't ship.
Vocabulary Card
- supply-chain attack
- Compromising a dependency/build so malicious code runs in apps that trust it.
- pinning / lockfile
- Fixing exact dependency versions for reproducible, vetted installs.
- hash-locked install
- Verifying downloaded packages match known hashes (integrity).
- insecure deserialization
- Loading untrusted serialized data (pickle/yaml) that can execute code.
Homework
4 min
Pin and CVE-scan a real project's dependencies; fix any vulnerable versions. Demonstrate the pickle-RCE risk (harmlessly) and switch to JSON. Build the CI supply-chain gate. Write a short note: the three things you now verify about every dependency, and why "just pip install latest" is a security risk.
Sample · dependency-verification note
Three things I now verify about every dependency:
1. EXACT version (==) — pinned + lockfile, so builds are reproducible
and a malicious new release can't slip in silently.
2. INTEGRITY — hash-locked (--require-hashes), so the downloaded file
must match the one I vetted; a tampered package on the index is
refused by pip.
3. KNOWN VULNERABILITIES — pip-audit in CI; a package with a public
CVE fails the build until upgraded.
Why "pip install latest" is risky: it pulls whatever version is
newest at install time — which could be a compromised update, a
typosquat, or a version with a fresh CVE. I'd be running unverified
code with my app's full privileges. Also fixed an unsafe
pickle.loads on uploaded data → switched to json.loads (pickle can
execute arbitrary code on load).
Non-negotiables: pinned+scanned deps with a real fix, the pickle→json demo, the CI gate, and the three-verifications + "latest is risky" explanation.