The Brief
3 min
Build recon.py <domain>, a script that gathers a domain's public footprint into a tidy report:
- WHOIS — registrar, creation/expiry dates, name servers.
- DNS records — A/AAAA, MX, NS, TXT (Lesson 5).
- HTTP fingerprint — public response headers (server, technologies).
- A defender's summary — what this footprint exposes and how to reduce it.
Passive recon reads public records (WHOIS, DNS, search results) — it does not probe, scan, or log into the target. That keeps it low-impact, but you should still only profile domains you own, run a sanctioned engagement against, or that explicitly invite it (a bug-bounty in scope). Use it primarily to audit your own exposure. Reading public data about a third party isn't hacking — but harassment, stalking, or doxxing built on it is illegal and unethical. Recon serves defence.
Passive vs. Active
5 min
PASSIVE (this lesson)               | ACTIVE (later, with authorization)
------------------------------------+-----------------------------------------
reads public records only           | sends packets TO the target
WHOIS, DNS, search engines          | port scans, vuln probes, login attempts
no contact with target systems      | directly interacts with the target
low/no legal risk for public data   | needs explicit written permission
Information is published everywhere — registrars, DNS, certificate logs, job ads, GitHub. Passive recon (a.k.a. OSINT — open-source intelligence) assembles this scattered public information into a picture. The defensive value is huge: run it on yourself and you see exactly what an attacker sees, so you can shrink it. You're building an auditing tool, framed as recon.
Build It · WHOIS & DNS Gathering
14 min
WHOIS: who registered the domain

    import whois  # pip install python-whois

    def get_whois(domain: str) -> dict:
        try:
            w = whois.whois(domain)
            return {
                "registrar": w.registrar,
                "created": str(w.creation_date),
                "expires": str(w.expiration_date),
                "name_servers": w.name_servers,
                "org": w.org,
            }
        except Exception as e:
            # lookup failed or the registry returned nothing usable
            return {"error": str(e)}
WHOIS is a public database of domain registrations. It reveals the registrar, key dates (a soon-to-expire domain is a risk), and sometimes the organisation. Note: many registrars now redact personal contact info under privacy laws — a good thing.
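The "soon-to-expire domain is a risk" point can be turned into a number. The helper below is a minimal sketch, not part of the lesson's toolkit: days_until_expiry is a hypothetical name, and it assumes the expiry value may arrive as a single datetime, a list of datetimes, or None, since WHOIS output varies by registry. It takes the raw value (for example w.expiration_date), not the stringified one stored in the report.

```python
from datetime import datetime, timezone


def _as_utc(dt):
    # Treat naive datetimes as UTC so naive and aware values can be compared.
    return dt if dt.tzinfo else dt.replace(tzinfo=timezone.utc)


def days_until_expiry(expiration, now=None):
    """Return whole days until expiry, or None if no usable date is given.

    `expiration` may be a datetime, a list/tuple of datetimes, or None.
    When several dates are present, the earliest one is used.
    """
    candidates = expiration if isinstance(expiration, (list, tuple)) else [expiration]
    dates = [_as_utc(d) for d in candidates if isinstance(d, datetime)]
    if not dates:
        return None
    now = _as_utc(now or datetime.now(timezone.utc))
    return (min(dates) - now).days


# Example with a fixed "now" so the result is reproducible.
remaining = days_until_expiry(datetime(2030, 1, 31), now=datetime(2030, 1, 1))
print("days left:", remaining)
```

A negative result means the date has already passed. What counts as "soon" (30 days, 90 days) is a judgment call for your own renewal process.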
DNS records (from Lesson 5)
    import dns.resolver  # pip install dnspython

    def get_dns(domain: str) -> dict:
        records = {}
        for rtype in ("A", "AAAA", "MX", "NS", "TXT"):
            try:
                records[rtype] = [str(r) for r in dns.resolver.resolve(domain, rtype)]
            except Exception:
                # no records of this type, or the lookup failed
                records[rtype] = []
        return records
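TXT records are where most of the interesting footprint hides, so it can help to sort them before reading them. The sketch below is an illustration rather than part of the lesson's toolkit: classify_txt and VERIFICATION_PREFIXES are hypothetical names, the prefix list is a small sample and far from complete, and it assumes each TXT record arrives as dnspython's quoted text form (long records split into several quoted chunks).

```python
# A few well-known domain-verification prefixes (sample only, not exhaustive).
VERIFICATION_PREFIXES = (
    "google-site-verification=",
    "facebook-domain-verification=",
    "apple-domain-verification=",
    "atlassian-domain-verification=",
)


def classify_txt(records):
    """Group TXT record strings into SPF, verification tokens, and the rest."""
    groups = {"spf": [], "verification": [], "other": []}
    for raw in records:
        # Join quoted chunks ('"a" "b"' -> 'ab') and drop the outer quotes.
        value = raw.replace('" "', "").strip('"')
        lowered = value.lower()
        if lowered.startswith("v=spf1"):
            groups["spf"].append(value)
        elif lowered.startswith(VERIFICATION_PREFIXES):
            groups["verification"].append(value)
        else:
            groups["other"].append(value)
    return groups


sample = ['"v=spf1 include:_spf.example.net ~all"',
          '"google-site-verification=abc123"']
print(classify_txt(sample))
```

Each entry under "verification" names a service the domain was once linked to, which makes that group the natural place to look for stale tokens.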
HTTP fingerprint — public headers only
    import requests

    def get_http_fingerprint(domain: str) -> dict:
        try:
            # a single normal GET (what any browser sends); reads public headers
            r = requests.get(f"https://{domain}", timeout=10,
                             headers={"User-Agent": "recon-audit/1.0"})
            interesting = {k: v for k, v in r.headers.items()
                           if k.lower() in ("server", "x-powered-by", "via",
                                            "x-frame-options",
                                            "strict-transport-security",
                                            "content-security-policy")}
            return {"status": r.status_code, "headers": interesting}
        except requests.RequestException as e:
            return {"error": str(e)}
A single normal request reveals the response headers any visitor sees: the Server header may name the web server, X-Powered-By the framework — and crucially, whether security headers (HSTS, CSP, X-Frame-Options) are present. Missing security headers are a finding you can act on defensively.
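A Server header that only says "nginx" tells an attacker much less than one that says "nginx/1.18.0". As a rough sketch of that distinction (version_leaks and VERSION_RE are hypothetical names, and a dotted-number pattern is a heuristic assumption, not a standard), the check below flags Server and X-Powered-By values that appear to include a version.

```python
import re

# Heuristic: a dotted number such as 1.18.0 or 2.4 looks like a version string.
VERSION_RE = re.compile(r"\d+(?:\.\d+)+")


def version_leaks(headers):
    """Return the names of Server / X-Powered-By headers that expose a version."""
    return sorted(
        name for name, value in headers.items()
        if name.lower() in ("server", "x-powered-by")
        and VERSION_RE.search(value or "")
    )


headers = {"Server": "nginx/1.18.0", "X-Powered-By": "Express"}
print(version_leaks(headers))
```

The function takes a plain dict, so it can be fed the "headers" part of the fingerprint result without any further network traffic.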
One ordinary HTTPS GET (what a browser does) is passive — you're a visitor reading what's served. Probing many paths, guessing admin URLs, or hammering the server is active and needs authorization (Lessons 19-20). Our fingerprint makes exactly one polite request.
Build It · Assemble the Report
12 min
Wire the gatherers into one tool that profiles a domain and adds a defender's summary, pulling in your Level 7 reporting skills (JSON output, logging).

    import argparse
    import json
    import logging
    from datetime import datetime

    logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
    log = logging.getLogger("recon")

    def analyse_exposure(dns_records: dict, http: dict) -> list[str]:
        """Turn raw findings into defensive observations."""
        notes = []
        if dns_records.get("MX"):
            notes.append("MX records reveal your email provider — phishers "
                         "will spoof its login page.")
        if len(dns_records.get("TXT", [])) > 3:
            notes.append("Many TXT records — prune stale verification tokens "
                         "that leak which SaaS tools you use.")
        if "headers" not in http:
            # the request failed, so "missing header" would be a false finding
            return notes
        present = {k.lower() for k in http["headers"]}
        if "strict-transport-security" not in present:
            notes.append("No HSTS header — add Strict-Transport-Security.")
        if "content-security-policy" not in present:
            notes.append("No CSP header — add a Content-Security-Policy (L8-33).")
        if "server" in present:
            notes.append("Server header exposes software/version — consider hiding it.")
        return notes

    def recon(domain: str) -> dict:
        log.info("passive recon: %s (public records only)", domain)
        report = {
            "domain": domain,
            "scanned_at": datetime.now().isoformat(),
            "whois": get_whois(domain),
            "dns": get_dns(domain),
            "http": get_http_fingerprint(domain),
        }
        report["exposure_notes"] = analyse_exposure(report["dns"], report["http"])
        return report

    if __name__ == "__main__":
        p = argparse.ArgumentParser(description="Passive recon (public data only).")
        p.add_argument("domain")
        p.add_argument("--out", default="recon.json")
        a = p.parse_args()

        print("⚠️ Only profile domains you own or are authorised to assess.\n")
        report = recon(a.domain)

        # human summary
        print(f"\n=== {a.domain} ===")
        print("Registrar:", report["whois"].get("registrar"))
        print("A records:", report["dns"].get("A"))
        print("\nDefensive notes:")
        for note in report["exposure_notes"]:
            print(" •", note)

        with open(a.out, "w", encoding="utf-8") as f:
            json.dump(report, f, indent=2, default=str)
        log.info("full report → %s", a.out)
    ⚠️ Only profile domains you own or are authorised to assess.

    INFO passive recon: example.com (public records only)

    === example.com ===
    Registrar: RESERVED-Internet Assigned Numbers Authority
    A records: ['93.184.216.34']

    Defensive notes:
     • No HSTS header — add Strict-Transport-Security.
     • No CSP header — add a Content-Security-Policy (L8-33).
    INFO full report → recon.json
Read the result
The toolkit gathers only public data — WHOIS, DNS, and one polite HTTPS request — then the analyse_exposure step is what makes it defensive: it converts raw findings into an action list ("add HSTS," "prune stale TXT," "hide the Server header"). The on-screen warning and the "own or authorised" framing keep it ethical. Run this on your own sites and you'll find real things to fix — which is exactly the point of recon for a defender.
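The banner is a reminder, not a control. One way to make the "own or authorised" rule enforceable is an allow-list checked before any lookup runs. The following is a minimal sketch under stated assumptions: in_scope is a hypothetical helper, and the allow-list is assumed to be a plain list of domains you maintain yourself.

```python
def in_scope(domain, allowed):
    """True if `domain` equals, or is a subdomain of, an allow-listed domain."""
    # Normalise case and any trailing dot so "Example.com." matches "example.com".
    domain = domain.strip().lower().rstrip(".")
    for entry in allowed:
        entry = entry.strip().lower().rstrip(".")
        # The leading dot stops "notexample.com" from matching "example.com".
        if entry and (domain == entry or domain.endswith("." + entry)):
            return True
    return False


ALLOWED = ["example.com"]  # hypothetical: domains you own or are authorised for
for candidate in ("www.example.com", "notexample.com"):
    print(candidate, "in scope" if in_scope(candidate, ALLOWED) else "refused")
```

In the command-line entry point, a check like this could sit just before the call to recon() and exit early when the domain is refused.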
Build It Yourself
13 min
Profile a domain you own (or a clearly public one such as example.com or wikipedia.org, for practice).
Get the WHOIS and DNS gatherers working and print a combined summary for one domain. Note which WHOIS fields are redacted (privacy protection in action).
Extend the HTTP fingerprint to report which of the key security headers (HSTS, CSP, X-Frame-Options, X-Content-Type-Options) are present vs. missing, as a checklist.
Hint
    # r is the requests response object inside get_http_fingerprint
    WANT = ["strict-transport-security", "content-security-policy",
            "x-frame-options", "x-content-type-options"]
    present = {k.lower() for k in r.headers}
    for h in WANT:
        print(f" [{'✓' if h in present else ' '}] {h}")
Add an optional step that reads public certificate-transparency logs (e.g. crt.sh's public JSON API) to list subdomains that have issued TLS certs. This is classic OSINT — all public. Report them and note that each subdomain is extra attack surface to audit.
Hint
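    import requests

    def cert_subdomains(domain):
        # crt.sh publishes certificate transparency data (public)
        r = requests.get(f"https://crt.sh/?q=%25.{domain}&output=json", timeout=20)
        r.raise_for_status()
        names = set()
        for row in r.json():
            # one name_value entry can hold several names separated by newlines
            names.update(row["name_value"].lower().split("\n"))
        # drop empty strings and wildcard entries such as *.example.com
        return sorted(n for n in names if n and "*" not in n)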
Stretch · The Self-Audit Report
8 min
Turn the toolkit into a defensive self-audit: run it on a domain you own, produce a Markdown report (Level 7 skills) with sections for footprint, missing security headers, stale DNS records, and a prioritised "reduce your exposure" checklist. This is a genuinely useful deliverable for any site owner.
Show the report-builder sketch
    from pathlib import Path
    from datetime import date

    def write_markdown(report: dict, out: str = "self_audit.md") -> None:
        lines = [f"# Exposure Self-Audit — {report['domain']}",
                 f"_Generated {date.today()}_ · public records only\n",
                 "## Footprint",
                 f"- Registrar: {report['whois'].get('registrar')}",
                 f"- A records: {report['dns'].get('A')}",
                 f"- MX: {report['dns'].get('MX')}",
                 f"- Nameservers: {report['dns'].get('NS')}\n",
                 "## Reduce your exposure"]
        # one unchecked task-list item per defensive note
        for note in report["exposure_notes"]:
            lines.append(f"- [ ] {note}")
        Path(out).write_text("\n".join(lines), encoding="utf-8")
        print("wrote", out)
Non-negotiables: own-domain only, a readable report, and a prioritised exposure-reduction checklist.
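The checklist is meant to be prioritised, and a report-builder that simply lists notes in the order they were found does not do that yet. One possible approach, sketched here with a made-up ranking (PRIORITY_KEYWORDS and prioritise are hypothetical, and the ordering is an assumption you should replace with your own risk judgment), is to sort the notes by keyword before writing them out.

```python
# Hypothetical ranking: lower number = fix first. Adjust to your own risk model.
PRIORITY_KEYWORDS = [
    ("hsts", 1),
    ("csp", 1),
    ("server header", 2),
    ("txt", 3),
    ("mx", 3),
]


def prioritise(notes):
    """Return notes sorted by priority; unmatched notes go last."""
    def rank(note):
        lowered = note.lower()
        for keyword, level in PRIORITY_KEYWORDS:
            if keyword in lowered:
                return level
        return 4

    # sorted() is stable, so notes with equal rank keep their original order.
    return sorted(notes, key=rank)


notes = ["MX records reveal your email provider",
         "No HSTS header",
         "Server header exposes software/version"]
for note in prioritise(notes):
    print(f"- [ ] {note}")
```

Keyword matching is crude. If the notes were produced as (priority, text) pairs in the first place, the sort would need no string inspection at all.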
Recap
3 min
Passive reconnaissance (OSINT) assembles a target's public footprint (WHOIS, DNS, certificate logs, response headers) without probing its systems, which keeps it low-impact. The defender's use is the best use: run it on yourself to see what attackers see, then shrink it (prune stale DNS, add security headers, hide version strings). The toolkit you built gathers public data, then interprets it into a fix list. The line is firm: passively reading public data is acceptable; probing or scanning is active and needs authorization, and using what you find to harass anyone is illegal. Recon serves defence, and only against what you own or are authorised to assess.
Vocabulary Card
- passive recon / OSINT: Gathering public information without interacting with the target.
- WHOIS: The public registration record of a domain (registrar, dates, NS).
- fingerprint: Identifying software/config from public responses (e.g. headers).
- attack surface (recap): Each subdomain, header, and exposed detail an attacker can use.
Homework
4 min
Run your finished toolkit as a self-audit on a domain you own (or example.com for practice) and produce the Markdown report. Write two sentences: the most surprising thing in your public footprint, and the first exposure you'd reduce. Include the ethics banner in your tool's output.
Sample · self-audit reflection
Most surprising: crt.sh listed a "staging.mysite.com" subdomain I'd forgotten about — it had a valid cert and was reachable. That's extra attack surface nobody was watching. First fix: take staging offline (or put it behind auth), then add the missing HSTS + CSP headers flagged by the audit. Pruned two old TXT verification tokens for tools we stopped using in 2024. (Tool prints the "only profile what you own/are authorised for" banner on every run.)
Non-negotiables: a real self-audit, a genuine finding, a prioritised first fix, and the ethics banner present.