PY-L8-39 · OAuth 2.0: The Authorisation Code Flow

Learning Goals

3 min

By the end of this lesson you can:

Explain what OAuth 2.0 solves: delegated access without sharing your password.
Walk the authorisation code flow and name each participant and step.
Explain the security roles of state (anti-CSRF) and PKCE.
Distinguish OAuth (authorization) from OpenID Connect (authentication).

Warm-Up · The Valet Key

5 min

You want a photo-printing app to access your Google Photos. The terrible way: give the app your Google password (now it can read your email, delete your account, everything — forever). OAuth is the valet key: Google gives the app a limited, revocable token for just your photos, and your password never leaves Google.

Today's big idea

OAuth 2.0 is a delegated authorization protocol: it lets you grant a third-party app scoped, revocable access to your data on another service without revealing your credentials to that app. The authorisation code flow is the secure standard for web apps. Understanding it means you can both integrate "Sign in with X" correctly and spot the pitfalls (missing state, leaked secrets, open redirects) that turn it into a vulnerability.

New Concept · The Flow & Its Safeguards

14 min

The four participants

RESOURCE OWNER     you (the user)
CLIENT             the app that wants access (the photo printer)
AUTHORIZATION SERVER  issues tokens after you consent (Google's login)
RESOURCE SERVER    holds your data (Google Photos API)

The authorisation code flow, step by step

1. Client redirects you to the AUTH SERVER with:
   client_id, redirect_uri, scope, response_type=code, STATE
2. You log in to the auth server (your password stays THERE) and CONSENT
   to the requested scope ("this app wants to read your photos").
3. Auth server redirects back to the client's redirect_uri with a
   short-lived AUTHORIZATION CODE (+ the same state).
4. Client (server-side) exchanges the code + its CLIENT_SECRET for an
   ACCESS TOKEN (and refresh token), over a back-channel HTTPS call.
5. Client uses the access token to call the RESOURCE SERVER for your data.

The key insight: the app never sees your password. For a server-side (confidential) client, the code from step 3 is useless without the client secret, and that secret is only ever sent server-to-server in step 4. So even if the code leaks from the redirect URL, an attacker can't redeem it.
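Step 4 mentions a refresh token. The sketch below shows one way a client might track token expiry and build the refresh request, assuming the token response carries the standard `expires_in` and `refresh_token` fields (providers differ on whether they send them). `TokenSet`, `from_response`, and `refresh_payload` are hypothetical names, and the 300-second fallback lifetime is an arbitrary assumption.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class TokenSet:
    access_token: str
    expires_at: float                      # absolute time, seconds since epoch
    refresh_token: Optional[str] = None

    @classmethod
    def from_response(cls, body: dict, now: Optional[float] = None) -> "TokenSet":
        now = time.time() if now is None else now
        # expires_in may be absent; assume a short lifetime in that case
        lifetime = float(body.get("expires_in", 300))
        return cls(body["access_token"], now + lifetime, body.get("refresh_token"))

    def is_expired(self, now: Optional[float] = None, skew: float = 30.0) -> bool:
        now = time.time() if now is None else now
        return now >= self.expires_at - skew   # refresh slightly early


def refresh_payload(tokens: TokenSet, client_id: str, client_secret: str) -> dict:
    """Form fields for the back-channel refresh call to the token endpoint."""
    if tokens.refresh_token is None:
        raise ValueError("no refresh token; restart the authorisation code flow")
    return {
        "grant_type": "refresh_token",
        "refresh_token": tokens.refresh_token,
        "client_id": client_id,
        "client_secret": client_secret,
    }
```

Like the code exchange, the refresh call goes over the back channel, so the refresh token and client secret never pass through the browser.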

Why `state` matters (anti-CSRF)

import secrets
# step 1: generate a random state, store it in the user's session
state = secrets.token_urlsafe(16)
session["oauth_state"] = state
# ...include &state=... in the redirect to the auth server...

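# step 3 (callback): the auth server returns the SAME state. Verify it.
# Check for a missing value first: None != None is False, so a bare !=
# would accept a callback that has no state and no session entry at all.
expected = session.pop("oauth_state", None)
if not expected or request.args.get("state") != expected:
    abort(400, "state mismatch: possible CSRF")   # reject!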

The state parameter is a random value you send and the auth server echoes back. Verifying it on the callback stops CSRF: an attacker can't trick your browser into completing a login they initiated, because they can't forge your session's state.

PKCE — for public clients

Mobile/SPA apps can't keep a client_secret secret (it's shipped to users). PKCE (Proof Key for Code Exchange) fixes this: the client generates a random code_verifier, sends its hash (the code_challenge, normally SHA-256) in step 1, and sends the original verifier in step 4. The auth server hashes the verifier and compares it with the challenge it stored, so only the app that started the flow can finish it and a stolen authorization code is useless on its own. PKCE is now recommended for all clients.
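A minimal sketch of the PKCE values using only the standard library, assuming the S256 method. The helper names `make_pkce_pair` and `challenge_matches` are illustrative, not part of any library; the second one mimics the comparison an auth server would perform in step 4.

```python
import base64
import hashlib
import secrets


def _b64url(raw: bytes) -> str:
    # base64url without "=" padding, as PKCE expects
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def make_pkce_pair() -> tuple[str, str]:
    """Return (code_verifier, code_challenge) for the S256 method."""
    verifier = _b64url(secrets.token_bytes(32))          # 43 URL-safe characters
    challenge = _b64url(hashlib.sha256(verifier.encode("ascii")).digest())
    return verifier, challenge


def challenge_matches(verifier: str, challenge: str) -> bool:
    """The check an auth server would run in step 4 (illustrative helper)."""
    expected = _b64url(hashlib.sha256(verifier.encode("ascii")).digest())
    return secrets.compare_digest(expected, challenge)


verifier, challenge = make_pkce_pair()
# step 1 adds: code_challenge=<challenge>&code_challenge_method=S256
# step 4 adds: code_verifier=<verifier>
```

Because only the hash travels through the browser in step 1, someone who intercepts the authorization code still lacks the verifier needed to redeem it.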

⚠️ OAuth security pitfalls

The classic OAuth mistakes

No state → CSRF / login-CSRF attacks.
Open redirect — not strictly validating redirect_uri against a registered allow-list lets attackers steal the code by redirecting it to themselves.
Client secret leaked — in front-end code, git, or logs (Lesson 44). Use PKCE for public clients; keep secrets server-side.
Implicit flow — the old flow that returned tokens directly in the URL is deprecated; use the code flow + PKCE.
Over-broad scopes — request the minimum scope you need (least privilege).
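The redirect_uri pitfall comes down to how the comparison is done. Below is a minimal sketch of exact-match validation next to the prefix check it should replace. `REGISTERED_REDIRECTS`, `is_registered_redirect`, and `naive_prefix_check` are hypothetical names, and the URLs are placeholders.

```python
REGISTERED_REDIRECTS = frozenset({
    "https://myapp.example/callback",
})


def is_registered_redirect(uri: str) -> bool:
    # Exact string comparison: no prefix matching, no wildcards,
    # no "same host is good enough".
    return uri in REGISTERED_REDIRECTS


def naive_prefix_check(uri: str) -> bool:
    # Shown only as the anti-pattern to avoid.
    return uri.startswith("https://myapp.example")


attacker_uri = "https://myapp.example.attacker.test/callback"
print(is_registered_redirect(attacker_uri))   # False: not an exact match
print(naive_prefix_check(attacker_uri))       # True: the prefix check is fooled
```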

OAuth vs. OpenID Connect

OAuth is authorization ("this app may access your photos"). OpenID Connect (OIDC) is a thin layer on top for authentication ("here is who the user is") — it adds an id_token (a JWT, Lesson 38). "Sign in with Google" is really OIDC. Using OAuth access tokens as proof of identity is a known anti-pattern; for "who is this user," use OIDC's id_token.
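To make the difference concrete, here is a sketch that peeks inside an id_token's payload. The token is built locally with a fake signature and made-up claim values purely for illustration; `peek_claims` is a hypothetical helper and performs no verification. A real client must verify the signature, issuer, audience, and expiry (typically through an OIDC library) before trusting any claim.

```python
import base64
import json


def _segment(data: dict) -> str:
    # Encode one JWT segment: compact JSON, base64url, padding stripped
    raw = json.dumps(data, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def peek_claims(jwt: str) -> dict:
    """Decode the payload segment WITHOUT verifying anything (inspection only)."""
    payload_b64 = jwt.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)   # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))


# Fabricated sample token: header.payload.signature
sample_id_token = ".".join([
    _segment({"alg": "RS256", "typ": "JWT"}),
    _segment({
        "iss": "https://provider.example",
        "sub": "user-123",
        "aud": "my-client-id",
        "exp": 1900000000,
    }),
    "fake-signature",
])

claims = peek_claims(sample_id_token)
print(claims["sub"])   # the stable user identifier OIDC is designed to convey
```

An access token carries no such promise: it may be opaque, and it is addressed to the resource server, not to your app.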

Worked Example · The Flow in Flask (shape)

12 min

Goal: the secure shape of the code flow as Flask routes — with state, server-side secret exchange, and scope. (In real apps use a library like authlib; this shows what it does under the hood.)

import os, secrets, requests
from urllib.parse import urlencode
from flask import Flask, request, session, redirect, abort

app = Flask(__name__)
app.secret_key = os.environ["SECRET_KEY"]

CLIENT_ID = os.environ["OAUTH_CLIENT_ID"]
CLIENT_SECRET = os.environ["OAUTH_CLIENT_SECRET"]   # server-side ONLY
AUTH_URL = "https://provider.example/authorize"
TOKEN_URL = "https://provider.example/token"
REDIRECT_URI = "https://myapp.example/callback"      # registered, exact match

@app.get("/login/oauth")
def start():
    state = secrets.token_urlsafe(16)                # ✓ anti-CSRF
    session["oauth_state"] = state
    params = {
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "response_type": "code",
        "scope": "read:photos",                       # ✓ minimal scope
        "state": state,
    }
    return redirect(f"{AUTH_URL}?{urlencode(params)}")

@app.get("/callback")
def callback():
    # ✓ verify state (CSRF), reject if missing/mismatched
    # (test for a missing value first: None != None is False)
    expected = session.pop("oauth_state", None)
    if not expected or request.args.get("state") != expected:
        abort(400, "state mismatch")
    if "error" in request.args:
        abort(400, request.args["error"])
    code = request.args["code"]

    # ✓ exchange code + SECRET server-side (back channel, over HTTPS)
    resp = requests.post(TOKEN_URL, data={
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": REDIRECT_URI,
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
    }, timeout=10)
    resp.raise_for_status()
    access_token = resp.json()["access_token"]        # store securely; never log

    # ✓ use the token to fetch ONLY the scoped resource
    photos_resp = requests.get("https://provider.example/photos",
                               headers={"Authorization": f"Bearer {access_token}"},
                               timeout=10)
    photos_resp.raise_for_status()                    # surface 401/403 instead of parsing an error body
    photos = photos_resp.json()
    return f"got {len(photos)} photos (no password ever shared)"

/login/oauth → redirect to provider (with state) → user consents
/callback?code=...&state=... → state verified → code+secret → access_token
→ fetch photos with the token. The user's provider password never touched my app.

Read the code

Every safeguard is visible: a random state generated and verified (CSRF defence), an exact registered redirect_uri (no open redirect), the client_secret used only in the server-side token exchange (never shipped to the browser), a minimal scope, and the access token used only for the scoped resource. The user authenticated on the provider — your app never saw their password. In production you'd use authlib (which also does PKCE), but knowing the underlying steps is what lets you configure it securely and audit it.

Try It Yourself

13 min

Use a real provider's sandbox (GitHub OAuth apps are free and easy) or a local mock — never another user's account.

01 🟢 Trace the flow

Draw the four participants and the five steps for "Sign in with GitHub." Mark exactly where the user's password is used and confirm it never reaches the client app.

02 🟡 Real integration (sandbox)

Register a GitHub OAuth app (redirect to localhost), implement /login/oauth and /callback with state verification, and fetch the user's public profile. Confirm it works without ever handling their password.

Hint

# GitHub endpoints:
#   authorize: https://github.com/login/oauth/authorize
#   token:     https://github.com/login/oauth/access_token  (Accept: application/json)
#   api:       https://api.github.com/user  (Authorization: Bearer <token>)
# Store CLIENT_ID/SECRET in .env (L8-44); redirect_uri = http://localhost:5000/callback

03 🔴 Break it to understand it

In your sandbox app, temporarily remove the state check and explain (in writing) the login-CSRF attack it opens. Then restore it. Also explain why an unvalidated redirect_uri would let an attacker steal the authorization code.

Hint

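No state: an attacker starts a flow with THEIR account, stops before the
callback, and sends you the callback URL containing their code. Your
browser completes it and you're silently logged into the attacker's
account (login CSRF), so anything you then save lands in their account.
state ties the callback to YOUR session, so a callback you never
started is rejected.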
Open redirect_uri: if the provider would redirect the code to any URL,
the attacker sets redirect_uri to their server and captures your code.
Exact allow-listed redirect_uri prevents this.

Mini-Challenge · An OAuth Config Auditor

8 min

Write a checklist auditor for an OAuth integration that verifies: state generated and checked, exact redirect_uri (no wildcards), client_secret from env (not in code), minimal scope, code flow (not implicit), and PKCE for public clients. Report pass/fail per item — the OAuth section of a security review.

Show a sample solution

import re
from pathlib import Path

def audit_oauth(path: str) -> None:
    src = Path(path).read_text(encoding="utf-8")
    checks = {
        "state generated":      "oauth_state" in src and "token_urlsafe" in src,
        "state verified":       'args.get("state")' in src or ("state" in src and "!=" in src),
        "secret from env":      "os.environ" in src and 'client_secret = "' not in src.lower(),
        # flag a wildcard only on the REDIRECT_URI assignment, not any "*" in the file
        "exact redirect_uri":   "REDIRECT_URI" in src and not re.search(r"REDIRECT_URI\s*=.*\*", src),
        "code flow (not implicit)": "response_type" in src and '"code"' in src,
        "scope present":        "scope" in src.lower(),
    }
    for name, ok in checks.items():
        print(f"  [{'✓' if ok else '✗'}] {name}")
    gaps = [n for n, ok in checks.items() if not ok]
    print("\nOAuth config OK" if not gaps else f"\n{len(gaps)} gap(s) — fix before shipping")

audit_oauth("oauth_routes.py")

Non-negotiables: checks state (gen+verify), secret-from-env, exact redirect, code flow, scope — pass/fail report.
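The challenge also asks for a PKCE check, which the sample leaves out. One possible heuristic, in the same string-matching spirit as the sample, is to look for the three PKCE markers in the source. `uses_pkce` is a hypothetical helper, and like the other checks it can be fooled by comments or dead code.

```python
def uses_pkce(src: str) -> bool:
    # A flow with PKCE sends a challenge (step 1), names the method,
    # and later sends the verifier (step 4).
    markers = ("code_challenge", "code_verifier", "S256")
    return all(m in src for m in markers)


with_pkce = '''
params = {"code_challenge": challenge, "code_challenge_method": "S256"}
data = {"code_verifier": verifier}
'''
without_pkce = 'params = {"response_type": "code", "state": state}'

print(uses_pkce(with_pkce), uses_pkce(without_pkce))
```

For a confidential server-side client this would be an advisory item, not a hard failure; for a mobile app or SPA it belongs among the required checks.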

Recap

3 min

OAuth 2.0 grants an app scoped, revocable access to your data on another service without sharing your password. The authorisation code flow: the app redirects you to the auth server (with state + scope), you log in and consent there, the server returns a short-lived code, and the app exchanges it (plus its client secret, server-side) for an access token. Safeguards: state (anti-CSRF), exact redirect_uri (no open redirect), secret kept server-side, PKCE for public clients, minimal scope, and the code flow (not deprecated implicit). OAuth is authorization; OpenID Connect adds authentication (the id_token) for "who is this user."

Vocabulary Card

OAuth 2.0: Delegated authorization — scoped access to your data without sharing credentials.
authorisation code: A short-lived code exchanged (with the client secret) for a token.
state: A random value tying the callback to your session — anti-CSRF.
PKCE: Proof Key for Code Exchange — secures public clients that can't keep a secret.

Homework

4 min

Implement a working "Sign in with GitHub" (or similar) in a sandbox app with proper state verification and server-side token exchange, keeping secrets in .env. Run the OAuth config auditor. Then write a paragraph explaining to a non-expert why "Sign in with Google" is safer than giving the app your Google password, naming at least two specific protections (for example: no password sharing, scoped and revocable access, state/redirect validation).

Sample · why OAuth beats password sharing

If I gave the photo app my Google password, it could do ANYTHING my
account can — read my email, change my password, delete everything —
forever, and I'd have to change my password to stop it.

"Sign in with Google" (OAuth/OIDC) is different:
1. No password sharing: I type my password only on Google's own
   page; the app never sees it. Google just hands the app a token.
2. Scoped + revocable: the token only grants what I consented to
   (e.g. "read photos"), not full account access, and I can revoke
   it anytime in my Google settings without changing my password.

Behind the scenes, state + an exact redirect_uri stop attackers from
hijacking the login, and the client secret (server-side only) means a
leaked authorization code can't be redeemed by anyone else. My auditor
reports all checks pass; secrets live in .env, never in code.

Non-negotiables: a working sandbox OAuth login with state + server-side exchange, the auditor run, and a clear lay explanation citing real protections.
