PY-L8-06 · Project: Passive Reconnaissance Toolkit

The Brief

3 min

Build recon.py <domain> that gathers a domain's public footprint into a tidy report:

WHOIS — registrar, creation/expiry dates, name servers.
DNS records — A/AAAA, MX, NS, TXT (Lesson 5).
HTTP fingerprint — public response headers (server, technologies).
A defender's summary — what this footprint exposes and how to reduce it.

⚠️ Passive means passive — and authorized

Passive recon reads public records (WHOIS, DNS, search results); it does not probe, scan, or log into the target. That keeps it low-impact, but you should still only profile domains you own, domains covered by a sanctioned engagement, or domains whose owners explicitly invite it (for example, an in-scope bug-bounty target). Use it primarily to audit your own exposure. Reading public data about a third party isn't hacking, but harassment, stalking, or doxxing built on it is illegal and unethical. Recon serves defence.

Passive vs. Active

5 min

PASSIVE (this lesson)               ACTIVE (later, with authorization)
reads public records only           sends packets TO the target
WHOIS, DNS, search engines          port scans, vuln probes, login attempts
no contact with target systems      directly interacts with the target
low/no legal risk for public data   needs explicit written permission

Today's big idea

Information is published everywhere — registrars, DNS, certificate logs, job ads, GitHub. Passive recon (a.k.a. OSINT — open-source intelligence) assembles this scattered public information into a picture. The defensive value is huge: run it on yourself and you see exactly what an attacker sees, so you can shrink it. You're building an auditing tool, framed as recon.

Build It · WHOIS & DNS Gathering

14 min

WHOIS — who registered the domain

import whois        # pip install python-whois

def get_whois(domain: str) -> dict:
    try:
        w = whois.whois(domain)
        return {
            "registrar": w.registrar,
            "created": str(w.creation_date),
            "expires": str(w.expiration_date),
            "name_servers": w.name_servers,
            "org": w.org,
        }
    except Exception as e:
        return {"error": str(e)}

WHOIS is a public database of domain registrations. It reveals the registrar, key dates (a soon-to-expire domain is a risk), and sometimes the organisation. Note: many registrars now redact personal contact info under privacy laws — a good thing.
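
To make the "soon-to-expire" risk concrete, here is a minimal sketch that turns an expiry value into a day count. It assumes the value may arrive as a single datetime, a list of datetimes, or None (WHOIS parsers commonly return any of these), and it treats naive datetimes as UTC, which is a simplification. The name days_until_expiry is a hypothetical helper, not part of python-whois.

```python
from datetime import datetime, timezone


def days_until_expiry(expiration, now=None):
    """Return whole days until the earliest expiry date, or None if unknown.

    `expiration` is assumed to be a datetime, a list of datetimes, or None.
    """
    if expiration is None:
        return None
    candidates = expiration if isinstance(expiration, (list, tuple)) else [expiration]
    dates = [d for d in candidates if isinstance(d, datetime)]
    if not dates:
        return None

    def as_utc(d):
        # Assumption: a naive datetime is treated as UTC.
        return d if d.tzinfo else d.replace(tzinfo=timezone.utc)

    now = as_utc(now or datetime.now(timezone.utc))
    earliest = min(as_utc(d) for d in dates)
    return (earliest - now).days


print(days_until_expiry(datetime(2025, 1, 31), now=datetime(2025, 1, 1)))  # 30
```

A report could flag any domain where the result is below a threshold you choose, such as 60 days.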

DNS records (from Lesson 5)

import dns.resolver       # pip install dnspython

def get_dns(domain: str) -> dict:
    records = {}
    for rtype in ("A", "AAAA", "MX", "NS", "TXT"):
        try:
            records[rtype] = [str(r) for r in dns.resolver.resolve(domain, rtype)]
        except Exception:
            records[rtype] = []
    return records
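
The TXT list is usually the hardest part of this output to read, so a small sorter can help. The sketch below groups TXT strings into SPF policies, verification tokens, and everything else. It assumes each record arrives as a single string wrapped in double quotes (as str() on a TXT answer typically produces) and that verification tokens contain the word "verification" or "verify"; both are rough heuristics, and classify_txt is a hypothetical name.

```python
def classify_txt(records):
    """Group TXT strings into SPF, verification tokens, and other."""
    groups = {"spf": [], "verification": [], "other": []}
    for raw in records:
        text = raw.strip().strip('"')   # drop the surrounding quotes
        lowered = text.lower()
        if lowered.startswith("v=spf1"):
            groups["spf"].append(text)
        elif "verification" in lowered or "verify" in lowered:
            # Heuristic only: many SaaS tokens carry one of these words.
            groups["verification"].append(text)
        else:
            groups["other"].append(text)
    return groups


sample = ['"v=spf1 include:_spf.example.net ~all"',
          '"google-site-verification=abc123"']   # placeholder token
print(classify_txt(sample)["verification"])
```

Every entry in the "verification" group is worth checking against the tools you still use.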

HTTP fingerprint — public headers only

import requests

def get_http_fingerprint(domain: str) -> dict:
    try:
        # a single normal GET — what any browser sends; reads public headers
        r = requests.get(f"https://{domain}", timeout=10,
                         headers={"User-Agent": "recon-audit/1.0"})
        interesting = {k: v for k, v in r.headers.items()
                       if k.lower() in ("server", "x-powered-by",
                                        "via", "x-frame-options",
                                        "strict-transport-security",
                                        "content-security-policy")}
        return {"status": r.status_code, "headers": interesting}
    except requests.RequestException as e:
        return {"error": str(e)}

A single normal request reveals the response headers any visitor sees: the Server header may name the web server, X-Powered-By the framework — and crucially, whether security headers (HSTS, CSP, X-Frame-Options) are present. Missing security headers are a finding you can act on defensively.
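
Not every Server header is equally revealing: "nginx" names the product, while "nginx/1.18.0" also hands over the version. The sketch below separates the two cases with a simple pattern match on a slash followed by digits. The helper name discloses_version is hypothetical, and the pattern is a heuristic that will miss unusual formats.

```python
import re

# A slash followed by a dotted number, as in "nginx/1.18.0" or "Apache/2.4.41".
_VERSION = re.compile(r"/\d+(\.\d+)*")


def discloses_version(server_header):
    """Return True if a Server header value appears to include a version."""
    return bool(server_header) and _VERSION.search(server_header) is not None


for value in ("nginx", "nginx/1.18.0", "Apache/2.4.41 (Ubuntu)"):
    print(value, "->", discloses_version(value))
```

A version string lets an attacker match the host against known vulnerabilities, so it deserves a higher place on the fix list than a bare product name.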

The line we don't cross

One ordinary HTTPS GET (what a browser does) is passive — you're a visitor reading what's served. Probing many paths, guessing admin URLs, or hammering the server is active and needs authorization (Lessons 19-20). Our fingerprint makes exactly one polite request.

Build It · Assemble the Report

12 min

Wire the gatherers into one tool that profiles a domain and adds a defender's summary — pulling in your Level 7 reporting skills (JSON output, logging).

import argparse, json, logging
from datetime import datetime

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("recon")

def analyse_exposure(dns_records: dict, http: dict) -> list[str]:
    """Turn raw findings into defensive observations."""
    notes = []
    if dns_records.get("MX"):
        notes.append("MX records reveal your email provider — phishers "
                     "will spoof its login page.")
    if len(dns_records.get("TXT", [])) > 3:
        notes.append("Many TXT records — prune stale verification tokens "
                     "that leak which SaaS tools you use.")
    if "error" in http:
        # the request failed, so absent headers prove nothing about the site
        notes.append("HTTP fingerprint failed: header checks skipped.")
        return notes
    # header names are case-insensitive, so lower-case them once
    present = {k.lower() for k in http.get("headers", {})}
    if "strict-transport-security" not in present:
        notes.append("No HSTS header — add Strict-Transport-Security.")
    if "content-security-policy" not in present:
        notes.append("No CSP header — add a Content-Security-Policy (L8-33).")
    if "server" in present:
        notes.append("Server header exposes software/version — consider hiding it.")
    return notes

def recon(domain: str) -> dict:
    log.info("passive recon: %s (public records only)", domain)
    report = {
        "domain": domain,
        "scanned_at": datetime.now().isoformat(),
        "whois": get_whois(domain),
        "dns": get_dns(domain),
        "http": get_http_fingerprint(domain),
    }
    report["exposure_notes"] = analyse_exposure(report["dns"], report["http"])
    return report

if __name__ == "__main__":
    p = argparse.ArgumentParser(description="Passive recon (public data only).")
    p.add_argument("domain")
    p.add_argument("--out", default="recon.json")
    a = p.parse_args()

    print("⚠️  Only profile domains you own or are authorised to assess.\n")
    report = recon(a.domain)

    # human summary
    print(f"\n=== {a.domain} ===")
    print("Registrar:", report["whois"].get("registrar"))
    print("A records:", report["dns"].get("A"))
    print("\nDefensive notes:")
    for note in report["exposure_notes"]:
        print("  •", note)

    with open(a.out, "w", encoding="utf-8") as f:
        json.dump(report, f, indent=2, default=str)
    log.info("full report → %s", a.out)

⚠️  Only profile domains you own or are authorised to assess.

INFO passive recon: example.com (public records only)
=== example.com ===
Registrar: RESERVED-Internet Assigned Numbers Authority
A records: ['93.184.216.34']

Defensive notes:
  • No HSTS header — add Strict-Transport-Security.
  • No CSP header — add a Content-Security-Policy (L8-33).
INFO full report → recon.json

Read the result

The toolkit gathers only public data — WHOIS, DNS, and one polite HTTPS request — then the analyse_exposure step is what makes it defensive: it converts raw findings into an action list ("add HSTS," "prune stale TXT," "hide the Server header"). The on-screen warning and the "own or authorised" framing keep it ethical. Run this on your own sites and you'll find real things to fix — which is exactly the point of recon for a defender.

Build It Yourself

13 min

Profile a domain you own (or, for practice, a clearly public one such as example.com or wikipedia.org).

01 🟢 WHOIS + DNS

Get the WHOIS and DNS gatherers working and print a combined summary for one domain. Note which WHOIS fields are redacted (privacy protection in action).

02 🟡 Security-header check

Extend the HTTP fingerprint to report which of the key security headers (HSTS, CSP, X-Frame-Options, X-Content-Type-Options) are present vs. missing, as a checklist.

Hint

WANT = ["strict-transport-security", "content-security-policy",
        "x-frame-options", "x-content-type-options"]
present = {k.lower() for k in r.headers}
for h in WANT:
    print(f"  [{'✓' if h in present else ' '}] {h}")

03 🔴 Subdomain hints (public sources)

Add an optional step that reads public certificate-transparency logs (e.g. crt.sh's public JSON API) to list subdomains that have issued TLS certs. This is classic OSINT — all public. Report them and note that each subdomain is extra attack surface to audit.

Hint

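import requests

def cert_subdomains(domain: str) -> list[str]:
    # crt.sh publishes certificate transparency data (public)
    r = requests.get(f"https://crt.sh/?q=%25.{domain}&output=json", timeout=20)
    r.raise_for_status()   # fail loudly instead of parsing an error page
    names = set()
    for row in r.json():
        # one name_value can hold several names separated by newlines
        names.update(n.strip().lower() for n in row["name_value"].splitlines())
    return sorted(n for n in names if n and "*" not in n)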
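
Once you have a list of subdomains, a quick way to decide which to audit first is to flag names that look like non-production hosts. This is a minimal sketch: the label set is a hypothetical starting point (adjust it to your own naming habits), and it matches whole dot-separated labels only, so "dev-api.example.com" would not be flagged.

```python
# Hypothetical labels that often mark forgotten or non-production hosts.
REVIEW_LABELS = {"staging", "stage", "dev", "test", "old", "beta", "admin"}


def flag_for_review(names):
    """Return the names whose dot-separated labels include a review label."""
    flagged = []
    for name in names:
        labels = name.lower().rstrip(".").split(".")
        if REVIEW_LABELS.intersection(labels):
            flagged.append(name)
    return sorted(flagged)


print(flag_for_review(["www.mysite.com", "staging.mysite.com", "dev.mysite.com"]))
# ['dev.mysite.com', 'staging.mysite.com']
```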

Stretch · The Self-Audit Report

8 min

Turn the toolkit into a defensive self-audit: run it on a domain you own, produce a Markdown report (Level 7 skills) with sections for footprint, missing security headers, stale DNS records, and a prioritised "reduce your exposure" checklist. This is a genuinely useful deliverable for any site owner.

Show the report-builder sketch

from pathlib import Path
from datetime import date

def write_markdown(report: dict, out: str = "self_audit.md") -> None:
    lines = [f"# Exposure Self-Audit — {report['domain']}",
             f"_Generated {date.today()}_  ·  public records only\n",
             "## Footprint",
             f"- Registrar: {report['whois'].get('registrar')}",
             f"- A records: {report['dns'].get('A')}",
             f"- MX: {report['dns'].get('MX')}",
             f"- Nameservers: {report['dns'].get('NS')}\n",
             "## Reduce your exposure"]
    for note in report["exposure_notes"]:
        lines.append(f"- [ ] {note}")
    Path(out).write_text("\n".join(lines), encoding="utf-8")
    print("wrote", out)

Non-negotiables: own-domain only, a readable report, and a prioritised exposure-reduction checklist.
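
The report-builder sketch writes notes in the order they were found, which is not yet a prioritised list. One way to order them is a keyword-to-weight table, sketched below. The weights are hypothetical and reflect one possible view of urgency; tune them to your own risk judgement. Notes that match no keyword sort last.

```python
# Hypothetical weights: a lower number means "fix sooner".
PRIORITY_KEYWORDS = [
    ("hsts", 1),
    ("csp", 1),
    ("server header", 2),
    ("txt", 3),
    ("mx", 4),
]


def prioritise(notes):
    """Return notes sorted by the first matching keyword weight."""
    def rank(note):
        lowered = note.lower()
        for keyword, weight in PRIORITY_KEYWORDS:
            if keyword in lowered:
                return weight
        return 99   # unknown notes go to the end

    # sorted() is stable, so notes with equal weight keep their original order
    return sorted(notes, key=rank)


print(prioritise(["MX records reveal your email provider", "No HSTS header"]))
# ['No HSTS header', 'MX records reveal your email provider']
```

Passing report["exposure_notes"] through prioritise before writing the checklist would put the header fixes at the top.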

Recap

3 min

Passive reconnaissance (OSINT) assembles a target's public footprint (WHOIS, DNS, certificate logs, response headers) with no more contact than a single ordinary page request, which keeps it low-impact. The defender's use is the best use: run it on yourself to see what attackers see, then shrink it (prune stale DNS, add security headers, hide version strings). The toolkit you built gathers public data, then interprets it into a fix list. The line is firm: reading public data is passive; probing or scanning is active and needs authorization; and using what you find to harass anyone is illegal. Recon serves defence, and only against what you own or are authorised to assess.

Vocabulary Card

passive recon / OSINT: Gathering public information without interacting with the target.
WHOIS: The public registration record of a domain (registrar, dates, NS).
fingerprint: Identifying software/config from public responses (e.g. headers).
attack surface (recap): Each subdomain, header, and exposed detail an attacker can use.

Homework

4 min

Run your finished toolkit as a self-audit on a domain you own (or example.com for practice) and produce the Markdown report. Write two sentences: the most surprising thing in your public footprint, and the first exposure you'd reduce. Include the ethics banner in your tool's output.

Sample · self-audit reflection

Most surprising: crt.sh listed a "staging.mysite.com" subdomain I'd
forgotten about — it had a valid cert and was reachable. That's
extra attack surface nobody was watching.
First fix: take staging offline (or put it behind auth), then add
the missing HSTS + CSP headers flagged by the audit. Pruned two
old TXT verification tokens for tools we stopped using in 2024.

(Tool prints the "only profile what you own/are authorised for"
banner on every run.)

Non-negotiables: a real self-audit, a genuine finding, a prioritised first fix, and the ethics banner present.
