PY-L8-47 · Capstone: Vulnerability Scanner

The Capstone Brief

5 min

Build vulnscan: an authorized, defensive vulnerability scanner that assesses a target you own and produces an actionable report.

Authorization gate — a hard allow-list; refuses any non-permitted target (Lesson 19/20).
Checks — open ports (20), TLS health (9), security headers + config (34), exposed paths (6), dependency CVEs (35).
Findings engine — normalise results into ranked findings with evidence + remediation (46).
Evidence integrity + report — hash the evidence (11/45), output JSON/CSV/PDF (46), log the run.

⚠️ Authorized targets only — built into the tool

This is a real scanner. The authorization gate isn't optional documentation; it's code that refuses to run against anything not on your allow-list (your own apps, localhost, lab VMs, scanme.nmap.org). Scanning systems you neither own nor have explicit permission to test is illegal in many jurisdictions (Lesson 1, 19). The tool is built so you can't do it by accident.

Architecture

8 min

vulnscan <target>
   │
   ▼  assert_authorized(target)   ← GATE (refuses non-allow-listed targets)
   │
[CHECKS] run each module, collect raw results:
   ports (L8-20) · TLS (L8-09) · headers/config (L8-34)
   · exposed paths (L8-06) · dependency CVEs (L8-35)
   │
[FINDINGS] normalise → standard schema (title, severity, evidence, fix) (L8-46)
   │
[INTEGRITY] hash the evidence so the report is tamper-evident (L8-11/45)
   │
[REPORT] ranked JSON + CSV + PDF, executive summary, audit log (L8-46/45)

Integration, not invention

Every module is something you built this level. The capstone is the architecture: a pluggable check interface, a normalisation layer, and a report pipeline — plus the non-negotiable authorization gate at the front. Aim for clean structure: adding a new check should be one new function returning standard findings.

Build It · Gate & Pluggable Checks

16 min

The authorization gate (first, always)

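import socket

AUTHORISED = {"127.0.0.1", "localhost", "scanme.nmap.org"}   # + your lab IPs

def assert_authorized(target: str) -> None:
    # Refuse user:pass@host URLs outright: "http://localhost:80@evil.example/"
    # would parse as "localhost" below, while requests would connect to evil.example.
    if "@" in target:
        raise PermissionError(
            f"Refusing to scan '{target}': credentials in the URL are not supported.")
    # strip scheme, path and port to leave the bare host
    host = target.replace("https://", "").replace("http://", "").split("/")[0].split(":")[0]
    try:
        ip = socket.gethostbyname(host)      # a DNS lookup, not traffic to the target
    except socket.gaierror:
        ip = host
    if host not in AUTHORISED and ip not in AUTHORISED:
        raise PermissionError(
            f"Refusing to scan '{target}'. Only scan systems you OWN or are "
            f"AUTHORISED to test. Add it to AUTHORISED only if you're certain.")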
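
The "+ your lab IPs" comment implies adding addresses one at a time. If your lab lives in a subnet, a range check may be easier to maintain. The following is a minimal sketch using the standard-library ipaddress module; AUTHORISED_NETS, ip_is_authorised, and the 192.168.56.0/24 range are hypothetical names and values chosen for illustration, not part of the lesson's code.

```python
import ipaddress

# Hypothetical lab ranges: replace with networks you actually control.
AUTHORISED_NETS = [
    ipaddress.ip_network("127.0.0.0/8"),       # loopback
    ipaddress.ip_network("192.168.56.0/24"),   # example host-only VM network
]

def ip_is_authorised(ip: str) -> bool:
    """True if an already-resolved IP falls inside an allow-listed network."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        # not an IP literal (for example, a hostname that failed to resolve)
        return False
    return any(addr in net for net in AUTHORISED_NETS)

print(ip_is_authorised("192.168.56.10"))   # inside the example lab range
print(ip_is_authorised("8.8.8.8"))         # outside every listed range
```

Inside the gate, the resolved ip could be tested with ip_is_authorised(ip) in addition to the exact-match set. Anything that is not a valid IP fails closed.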

The standard finding + a check interface

from dataclasses import dataclass

SEVERITY = {"critical": 4, "high": 3, "medium": 2, "low": 1, "info": 0}

@dataclass
class Finding:
    title: str
    severity: str            # critical/high/medium/low/info
    category: str            # OWASP code
    impact: str
    remediation: str
    evidence: str = ""

# a "check" is any function: (target) -> list[Finding]
# adding a new check = writing one such function.
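
The scanner later ranks findings by looking up each severity in SEVERITY, so a check that returns a misspelt severity would only fail at sort time. One possible guard, sketched here as an optional addition rather than something the lesson requires, is to validate inside the dataclass with __post_init__. The lower-casing and the error message are assumptions.

```python
from dataclasses import dataclass

SEVERITY = {"critical": 4, "high": 3, "medium": 2, "low": 1, "info": 0}

@dataclass
class Finding:
    title: str
    severity: str
    category: str
    impact: str
    remediation: str
    evidence: str = ""

    def __post_init__(self):
        # normalise case, then fail fast on anything outside the scale
        self.severity = self.severity.lower()
        if self.severity not in SEVERITY:
            raise ValueError(
                f"unknown severity {self.severity!r} in finding {self.title!r}")

ok = Finding("Exposed Redis", "HIGH", "A05", "cache readable", "firewall it")
print(ok.severity)   # stored in lower case
```

With this in place, a bad severity raises inside the check that produced it, where the per-check try/except can log it against the right function.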

Example checks (reusing your modules)

import socket, requests

def check_ports(target: str) -> list[Finding]:
    host = target.split("//")[-1].split("/")[0].split(":")[0]
    findings, risky = [], {3306: "MySQL", 5432: "PostgreSQL", 6379: "Redis",
                           23: "Telnet", 21: "FTP"}
    for port, svc in risky.items():
        with socket.socket() as s:
            s.settimeout(0.5)
            if s.connect_ex((host, port)) == 0:
                findings.append(Finding(
                    title=f"Exposed {svc} on port {port}", severity="high",
                    category="A05", impact="sensitive service reachable",
                    remediation="bind to localhost / firewall (L8-34)",
                    evidence=f"port {port} open"))
    return findings

def check_headers(target: str) -> list[Finding]:
    findings = []
    try:
        r = requests.get(target, timeout=10)
    except requests.RequestException as e:
        return [Finding("Target unreachable", "info", "-", str(e), "check the URL")]
    present = {k.lower() for k in r.headers}
    wanted = {"strict-transport-security": "add HSTS (L8-34)",
              "content-security-policy": "add CSP (L8-33)",
              "x-frame-options": "set X-Frame-Options (clickjacking)"}
    for h, fix in wanted.items():
        if h not in present:
            findings.append(Finding(
                title=f"Missing security header: {h}", severity="low",
                category="A05", impact="weakens browser-side defences",
                remediation=fix, evidence=f"{h} absent from response"))
    return findings

Each check returns a list of Finding — a uniform interface. The scanner just runs every registered check and concatenates the results. Want to add the TLS-expiry check (Lesson 9) or dependency CVE scan (Lesson 35)? Write one more function with the same signature.
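
As an illustration of "one more function with the same signature", here is a rough sketch of a TLS-expiry check built on the standard-library ssl module. It is not the Lesson 9 code: the 30-day warning window, the severity mapping, the "A02" category, port 443, and the helper name expiry_findings are all assumptions. The date logic is split into a pure function so it can be tested without touching the network.

```python
import socket, ssl
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class Finding:            # same fields as the lesson's Finding, repeated so this runs alone
    title: str
    severity: str
    category: str
    impact: str
    remediation: str
    evidence: str = ""

def expiry_findings(not_after: str, now: datetime, warn_days: int = 30) -> list[Finding]:
    """Pure logic: map a certificate's notAfter string to zero or one Finding."""
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    days_left = (expires - now).days
    if days_left >= warn_days:
        return []
    expired = days_left < 0
    return [Finding(
        title="TLS certificate expired" if expired
              else f"TLS certificate expires in {days_left} days",
        severity="high" if expired else "medium",     # assumed mapping
        category="A02",                               # assumed OWASP code
        impact="clients reject the connection once the certificate expires",
        remediation="renew the certificate (L8-09)",
        evidence=f"notAfter={not_after}")]

def check_tls(target: str) -> list[Finding]:
    # same host parsing as check_ports; port 443 is an assumption
    host = target.split("//")[-1].split("/")[0].split(":")[0]
    ctx = ssl.create_default_context()
    try:
        with socket.create_connection((host, 443), timeout=5) as sock:
            with ctx.wrap_socket(sock, server_hostname=host) as tls:
                not_after = tls.getpeercert()["notAfter"]
    except ssl.SSLCertVerificationError as e:
        # an already-expired or untrusted certificate fails the handshake itself
        return [Finding("TLS certificate failed verification", "high", "A02",
                        "clients cannot trust this endpoint",
                        "install a valid certificate (L8-09)", evidence=str(e))]
    return expiry_findings(not_after, datetime.now(timezone.utc))
```

Appending check_tls to CHECKS would be the only wiring needed. A target with nothing listening on 443 raises a connection error, which the scanner's per-check try/except would log as a failed check.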

Build It · Engine, Integrity & Report

16 min

Wire the gate, checks, evidence-hashing, and reporting into one CLI.

import argparse, json, csv, hashlib, logging
from dataclasses import asdict
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("vulnscan")

CHECKS = [check_ports, check_headers]      # + check_tls, check_deps, check_paths

def run_scan(target: str) -> dict:
    assert_authorized(target)              # GATE — before any packet
    log.info("scanning %s (authorised)", target)
    findings = []
    for check in CHECKS:
        try:
            findings.extend(check(target))
        except Exception as e:
            log.warning("check %s failed: %s", check.__name__, e)

    findings.sort(key=lambda f: SEVERITY[f.severity], reverse=True)
    records = [asdict(f) for f in findings]

    report = {
        "target": target,
        "scanned_at": datetime.now(timezone.utc).isoformat(),
        "summary": {sev: sum(1 for f in findings if f.severity == sev)
                    for sev in SEVERITY},
        "findings": records,
    }
    # tamper-evident: hash the findings (L8-11/45)
    report["evidence_hash"] = hashlib.sha256(
        json.dumps(records, sort_keys=True).encode()).hexdigest()
    return report

def write_outputs(report: dict) -> None:
    # use a context manager so the JSON file is flushed and closed
    with open("vulnscan.json", "w", encoding="utf-8") as f:
        json.dump(report, f, indent=2)
    with open("vulnscan.csv", "w", newline="", encoding="utf-8") as f:
        w = csv.DictWriter(f, fieldnames=["title", "severity", "category",
                                          "impact", "remediation"],
                           extrasaction="ignore")
        w.writeheader(); w.writerows(report["findings"])
    # (PDF via reportlab — reuse L8-46 generate_report)
    log.info("wrote vulnscan.json + vulnscan.csv (evidence_hash=%s…)",
             report["evidence_hash"][:12])

def main():
    p = argparse.ArgumentParser(description="Authorized vulnerability scanner.")
    p.add_argument("target")
    a = p.parse_args()
    print("⚠️  Authorised targets only (own systems / lab / scanme.nmap.org).\n")
    try:
        report = run_scan(a.target)
    except PermissionError as e:
        log.error("%s", e); raise SystemExit(1)   # non-zero exit so scripts and CI notice the refusal
    s = report["summary"]
    print(f"\n=== {a.target} ===")
    print(f"{sum(s.values())} findings: " +
          ", ".join(f"{n} {sev}" for sev, n in s.items() if n))
    for f in report["findings"][:10]:
        print(f"  [{f['severity'].upper()}] {f['title']} → {f['remediation']}")
    write_outputs(report)

if __name__ == "__main__":
    main()

⚠️  Authorised targets only (own systems / lab / scanme.nmap.org).

2026-05-28T... INFO scanning http://127.0.0.1:5000 (authorised)
=== http://127.0.0.1:5000 ===
4 findings: 1 high, 3 low
  [HIGH] Exposed PostgreSQL on port 5432 → bind to localhost / firewall (L8-34)
  [LOW] Missing security header: strict-transport-security → add HSTS (L8-34)
  [LOW] Missing security header: content-security-policy → add CSP (L8-33)
  [LOW] Missing security header: x-frame-options → set X-Frame-Options (clickjacking)
INFO wrote vulnscan.json + vulnscan.csv (evidence_hash=a1b2c3d4e5f6…)

# (point it at an unauthorised target → PermissionError, no packets sent to that target)

Read the result

This is the whole level in one tool. The gate runs first: an unauthorised target is refused before any packet is sent to it (the gate's only network activity is a DNS lookup to resolve the hostname). Pluggable checks (each returning standard Findings) reuse your port scanner, header/config auditor, TLS and dependency checks; one check failing is isolated, not fatal. Findings are ranked and rolled into a summary, the evidence is hashed for tamper-evidence, and outputs go to JSON/CSV/PDF with an audit log. Adding a new vulnerability check is one function. This is a genuinely useful defensive tool, and a portfolio centrepiece.
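
A stored evidence_hash is only useful if someone recomputes it. The sketch below shows one way a reader of vulnscan.json might verify it, using the same canonical form as run_scan (json.dumps with sort_keys=True over the findings list). The function names evidence_hash and verify_report are hypothetical.

```python
import hashlib, json

def evidence_hash(findings: list) -> str:
    # must mirror run_scan exactly: same json.dumps options, same encoding
    return hashlib.sha256(json.dumps(findings, sort_keys=True).encode()).hexdigest()

def verify_report(report: dict) -> bool:
    """True if the findings still match the hash recorded at scan time."""
    return evidence_hash(report["findings"]) == report.get("evidence_hash")

findings = [{"title": "Exposed Redis on port 6379", "severity": "high"}]
report = {"findings": findings, "evidence_hash": evidence_hash(findings)}
print(verify_report(report))                  # untouched report

report["findings"][0]["severity"] = "low"     # simulate an edit after the scan
print(verify_report(report))                  # the mismatch is detected
```

Two limits are worth knowing. The hash covers only the findings, not target or scanned_at. And because it is stored in the same file, anyone who edits the findings can also recompute it, so it catches accidental or careless changes; proving authenticity needs the hash kept elsewhere or a signature (the Signed reports stretch).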

Build Your Capstone

20 min

Build the scanner and run it against a target you own — your Lesson 41 secured blog, a local app, or scanme.nmap.org for the port-scan parts. Tackle it in stages.

01 🟢 Gate + two checks

Implement assert_authorized, the Finding model, and two checks (ports + headers). Confirm it refuses a non-allow-listed target and produces findings for an authorised one.

02 🟡 More checks + ranking + report

Add TLS-expiry (Lesson 9), exposed-paths (Lesson 6: /.git, /.env, debug pages), and a dependency CVE check (Lesson 35). Rank findings, hash the evidence, and emit JSON + CSV + PDF with an executive summary.

03 🔴 Scan your secured blog

Run the full scanner against your Lesson 41 hardened blog. It should come back largely clean (proving your hardening worked). Then point it at the unhardened version and confirm it catches the issues. The before/after is the deliverable.
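
To turn the two runs into a deliverable, it may help to compare the reports programmatically. This is a minimal sketch, assuming both reports have the shape produced by run_scan and that a finding is identified by its (title, category) pair; before_after is a hypothetical helper name.

```python
def before_after(unhardened: dict, hardened: dict) -> dict:
    """Group findings into fixed / remaining / new, keyed by (title, category)."""
    def keys(report):
        return {(f["title"], f["category"]) for f in report["findings"]}
    before, after = keys(unhardened), keys(hardened)
    return {
        "fixed": sorted(before - after),       # present before, gone after hardening
        "remaining": sorted(before & after),   # still present
        "new": sorted(after - before),         # introduced by the hardened version
    }

# tiny inline reports standing in for the two saved JSON files
unhardened = {"findings": [
    {"title": "Exposed PostgreSQL on port 5432", "category": "A05"},
    {"title": "Missing security header: x-frame-options", "category": "A05"}]}
hardened = {"findings": [
    {"title": "Missing security header: x-frame-options", "category": "A05"}]}

for group, items in before_after(unhardened, hardened).items():
    print(group, len(items))
```

In practice the two dicts would come from json.load on the saved reports of each run.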

Stretch · Make It Production-Grade

10 min

Pick one or two to push your capstone further:

Concurrency — run checks in parallel (Lesson 20's thread pool) with rate limiting, for speed without aggression.
Diffing — compare today's scan to a baseline and report only new findings (great for CI / continuous scanning).
Config file — load the allow-list, checks, and severity thresholds from YAML/JSON (use yaml.safe_load! Lesson 35).
CI gate — exit non-zero on any new critical/high finding so a regression fails the build (Level 7).
Signed reports — sign the report with RSA (Lesson 17) so its authenticity is verifiable.
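
For the concurrency option, one possible shape is sketched below with concurrent.futures from the standard library. It keeps the "one failing check is isolated" behaviour of the sequential loop. run_checks_parallel and the max_workers default are assumptions, and a small worker cap only bounds concurrency; it is not real rate limiting.

```python
import logging
from concurrent.futures import ThreadPoolExecutor, as_completed

log = logging.getLogger("vulnscan")

def run_checks_parallel(checks, target: str, max_workers: int = 4) -> list:
    """Run each check in a thread; a check that raises is logged and skipped."""
    findings = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map each future back to its check so failures can be named
        futures = {pool.submit(check, target): check for check in checks}
        for fut in as_completed(futures):
            try:
                findings.extend(fut.result())
            except Exception as e:
                log.warning("check %s failed: %s", futures[fut].__name__, e)
    return findings

# stand-in checks for demonstration only
def ok(target): return [f"finding for {target}"]
def boom(target): raise RuntimeError("simulated failure")

print(len(run_checks_parallel([ok, boom, ok], "http://127.0.0.1:5000")))
```

Results arrive in completion order, which is fine here because the scanner sorts findings by severity afterwards. The gate should still run once, before any check is submitted.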

The diffing sketch

import json, sys
from pathlib import Path

def diff_against_baseline(report: dict, baseline_path="baseline.json") -> list:
    path = Path(baseline_path)
    # no baseline yet: treat every finding as new
    base = json.loads(path.read_text()) if path.exists() else {"findings": []}
    def key(f): return (f["title"], f["category"])
    old = {key(f) for f in base["findings"]}
    new = [f for f in report["findings"] if key(f) not in old]
    return new      # only NEW findings since the baseline: what to action now

# CI gate: fail the build on any NEW critical/high finding
# (report is the dict returned by run_scan(target))
new = diff_against_baseline(report)
critical_new = [f for f in new if f["severity"] in ("critical", "high")]
sys.exit(1 if critical_new else 0)

Non-negotiables: one production feature working — parallel checks, baseline diffing, config-driven, CI gate, or signed reports.
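
For the config-driven option, a minimal sketch using JSON (so it needs only the standard library) might look like this. The file name vulnscan.config.json, the keys authorised and fail_on, and the DEFAULTS values are all hypothetical. If you choose YAML instead, the same structure applies with yaml.safe_load.

```python
import json
from pathlib import Path

# Hypothetical defaults: used for any key the config file leaves out.
DEFAULTS = {"authorised": ["127.0.0.1", "localhost"],
            "fail_on": ["critical", "high"]}

def load_config(path: str = "vulnscan.config.json") -> dict:
    """Merge a JSON config file over DEFAULTS; unknown keys are rejected."""
    p = Path(path)
    loaded = json.loads(p.read_text(encoding="utf-8")) if p.exists() else {}
    unknown = set(loaded) - set(DEFAULTS)
    if unknown:
        # a typo such as "authorized" should fail loudly, not be ignored
        raise ValueError(f"unknown config keys: {sorted(unknown)}")
    cfg = {**DEFAULTS, **loaded}
    cfg["authorised"] = set(cfg["authorised"])   # set membership for the gate
    return cfg

cfg = load_config("does-not-exist.json")         # missing file falls back to defaults
print(sorted(cfg["authorised"]))
```

Rejecting unknown keys matters for a security tool: a misspelt allow-list key that is silently ignored could leave you scanning with settings you did not intend.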

Recap

3 min

The capstone composes the whole level: an authorization gate that refuses non-permitted targets (the ethics, in code), pluggable checks (ports, TLS, headers/config, exposed paths, dependency CVEs, each returning standard Findings), a findings engine that ranks and summarises, evidence hashing for a tamper-evident report, and multi-format output with an audit log. The architecture is the lesson: adding a check is one function, and the gate means you can't scan an unlisted target by accident. Run it against your own hardened app to prove your defences hold, and against the unhardened one to prove the scanner finds real issues.

Vocabulary Card

vulnerability scanner: A tool that checks a target for known weaknesses and reports them.
authorization gate: Code that refuses to scan targets not on an allow-list.
pluggable check: A function returning standard findings; adding one extends the scanner.
baseline diff: Reporting only findings new since a previous scan (for CI/continuous use).

Homework · Ship Your Capstone

5 min

Finish and ship vulnscan with the authorization gate, at least four checks, ranking, evidence hashing, and a multi-format report with an executive summary. Run the before/after against your hardened vs. unhardened blog and include both reports. Add one production stretch. Write a README: what it checks, how the authorization gate works, how to add a check, and the ethics statement. This is your Level 8 portfolio piece.

Sample · vulnscan README

vulnscan — authorized defensive vulnerability scanner

Checks: open risky ports (L8-20), TLS health/expiry (L8-09),
  missing security headers + debug/config exposure (L8-34),
  exposed paths /.git /.env /debug (L8-06), dependency CVEs (L8-35).
Each check returns standard Findings (title, severity, category,
  impact, remediation, evidence).

Authorization gate: assert_authorized(target) checks the host/IP
  against an AUTHORISED allow-list (my localhost, lab VMs,
  scanme.nmap.org). Any other target → PermissionError BEFORE any
  packet reaches it. Editing the allow-list is a deliberate act.

Add a check: write fn(target) -> list[Finding] and append it to
  CHECKS. Nothing else changes.

Output: ranked findings → vulnscan.json (tooling), .csv (sheets),
  .pdf (executive summary + details). Evidence is SHA-256 hashed so
  the report is tamper-evident; the run is audit-logged.

Before/after: against my hardened L8-41 blog → only 2 low-severity
  findings. Against the UN-hardened version → 1 critical (SQLi),
  1 high (IDOR), exposed DB port, missing headers. Proves both the
  scanner and my hardening work.

ETHICS: only ever run against systems you OWN or are explicitly
  AUTHORISED to test. Unauthorized scanning is a crime (L8-19).

Non-negotiables: working gated scanner with ≥4 checks + ranking + hashed evidence + multi-format report, a before/after, one stretch, and a README incl. ethics.
