Learning Goals
3 min
- Understand the sigmoid: turns any number into a 0-1 probability.
- Fit LogisticRegression; read predict vs predict_proba.
- Move the decision threshold to trade precision for recall.
- Remember to scale features (it's linear under the hood, and the default solver and regularisation are sensitive to feature scale).
Warm-Up · The Sigmoid Squash
5 min
    import numpy as np

    def sigmoid(z):
        return 1 / (1 + np.exp(-z))

    print(sigmoid(-5))  # 0.007 → very unlikely
    print(sigmoid(0))   # 0.5   → coin flip
    print(sigmoid(5))   # 0.993 → very likely
Linear regression can output any number, even -200 or 3000, which is nonsense for a probability. The sigmoid squashes the linear output into the open interval (0, 1), so it reads as "probability of yes".
Logistic regression computes a linear score, then sigmoids it into a probability. Predict "yes" when the probability passes a threshold (default 0.5). Move that threshold and you trade precision for recall.
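The three steps (linear score, sigmoid, threshold) can be sketched by hand with NumPy. The weights, bias, and samples below are made up for illustration; a fitted model would learn its own values.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Hypothetical weights and bias, chosen by hand for illustration only.
w = np.array([1.5, -2.0])
b = 0.25

# Three made-up samples with two features each.
X = np.array([
    [2.0, 0.5],
    [0.0, 1.0],
    [1.0, 1.0],
])

scores = X @ w + b                     # step 1: linear score, any real number
probs = sigmoid(scores)                # step 2: squashed into (0, 1)
labels = (probs >= 0.5).astype(int)    # step 3: default threshold
lenient = (probs >= 0.4).astype(int)   # same probabilities, lower threshold

print(scores)
print(probs.round(3))
print(labels, lenient)
```

The third sample sits just below 0.5, so it is the only one whose label changes when the threshold drops to 0.4.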
New Concept · predict_proba & Thresholds
14 min
Fit (with scaling, since it's linear)
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.linear_model import LogisticRegression

    X, y = load_breast_cancer(return_X_y=True)
    Xtr, Xte, ytr, yte = train_test_split(X, y, stratify=y, random_state=0)

    # Scale first, then fit; the pipeline applies the same scaling at predict time.
    clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=5000)).fit(Xtr, ytr)

    # round() works whether score() returns a Python float or a NumPy float.
    print(round(clf.score(Xte, yte), 3))
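One way to see why scaling matters is to count how many solver iterations the fit needs with and without it. This is a rough sketch on the same dataset and split; the exact counts depend on your scikit-learn version, so none are quoted here.

```python
import warnings

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
Xtr, Xte, ytr, yte = train_test_split(X, y, stratify=y, random_state=0)

raw = LogisticRegression(max_iter=5000)
scaled = make_pipeline(StandardScaler(), LogisticRegression(max_iter=5000))

with warnings.catch_warnings():
    # The unscaled fit may warn about convergence; silence it for the demo.
    warnings.simplefilter("ignore")
    raw.fit(Xtr, ytr)
    scaled.fit(Xtr, ytr)

# n_iter_ holds the number of solver iterations actually used.
raw_iters = int(raw.n_iter_[0])
scaled_iters = int(scaled.named_steps["logisticregression"].n_iter_[0])

print("iterations without scaling:", raw_iters)
print("iterations with scaling:   ", scaled_iters)
```

The scaled pipeline should need far fewer iterations, because the features no longer differ by orders of magnitude.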
Probabilities, not just labels
    probs = clf.predict_proba(Xte)  # shape (n, 2): [P(class0), P(class1)]
    print(probs[:3].round(3))
    print(clf.predict(Xte)[:3])     # the 0/1 decision at threshold 0.5
    [[0.01 0.99]
     [0.97 0.03]
     [0.12 0.88]]
    [1 0 1]
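To connect this back to the warm-up, the sketch below checks that, for a two-class model, the class-1 column of predict_proba is the sigmoid of the linear score returned by decision_function, and that predict is the 0.5 threshold applied to that column. It refits the same pipeline so it runs on its own.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
Xtr, Xte, ytr, yte = train_test_split(X, y, stratify=y, random_state=0)
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=5000)).fit(Xtr, ytr)

scores = clf.decision_function(Xte)    # linear score for class 1
probs = clf.predict_proba(Xte)         # columns: [P(class0), P(class1)]
manual_p1 = 1 / (1 + np.exp(-scores))  # the warm-up sigmoid, applied by hand

same_probs = np.allclose(probs[:, 1], manual_p1)
rows_sum_to_one = np.allclose(probs.sum(axis=1), 1.0)
same_labels = np.array_equal(clf.predict(Xte), (probs[:, 1] >= 0.5).astype(int))

print(same_probs, rows_sum_to_one, same_labels)
```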
Custom threshold
    # Be more cautious about calling something "benign" (class 1):
    p_benign = probs[:, 1]
    strict = (p_benign >= 0.7).astype(int)  # need 70% confidence for "benign"
Raising the threshold for "benign" means fewer false "benign" calls (typically higher precision for benign) but more borderline cases flagged as malignant (lower recall for benign). This is the precision/recall dial from Lesson 11.
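A quick way to check the effect is to score both thresholds side by side. This sketch refits the same pipeline and stores the results in a dict called `results` (a name chosen here for illustration); the printed numbers depend on your split.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
Xtr, Xte, ytr, yte = train_test_split(X, y, stratify=y, random_state=0)
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=5000)).fit(Xtr, ytr)

p_benign = clf.predict_proba(Xte)[:, 1]

results = {}
for t in (0.5, 0.7):
    pred = (p_benign >= t).astype(int)
    results[t] = (
        precision_score(yte, pred),  # of the "benign" calls, how many were right
        recall_score(yte, pred),     # of the truly benign cases, how many were found
        int(pred.sum()),             # how many "benign" calls were made
    )
    print(t, [round(v, 3) for v in results[t][:2]], results[t][2])
```

Recall for benign can only stay flat or drop as the threshold rises. Precision usually rises, but that is a tendency rather than a guarantee.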
Coefficients still interpretable
    lr = clf.named_steps["logisticregression"]
    # Positive coefficient → pushes toward class 1 as the feature rises
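To actually read the coefficients, pair them with the feature names and sort by size. This is a small sketch; because the pipeline standardises the inputs, each coefficient describes the effect of a one-standard-deviation change in that feature.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()
X, y = data.data, data.target
Xtr, Xte, ytr, yte = train_test_split(X, y, stratify=y, random_state=0)
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=5000)).fit(Xtr, ytr)

lr = clf.named_steps["logisticregression"]
coefs = lr.coef_[0]                      # one coefficient per feature (binary case)
order = np.argsort(np.abs(coefs))[::-1]  # largest absolute effect first

for i in order[:5]:
    # Positive pushes toward class 1 (benign), negative toward class 0 (malignant).
    print(f"{data.feature_names[i]:<25} {coefs[i]:+.2f}")
```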
Multiclass too
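Logistic regression handles 3+ classes automatically. In recent scikit-learn versions the default solver fits a single softmax (multinomial) model; one-vs-rest is the alternative strategy. Same fit/predict.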
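As a quick check, here is the same pipeline on the three-class iris dataset (a dataset swapped in here for illustration). predict_proba now has one column per class, and predict picks the most probable one.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
Xtr, Xte, ytr, yte = train_test_split(X, y, stratify=y, random_state=0)
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=5000)).fit(Xtr, ytr)

probs = clf.predict_proba(Xte)  # one column per class
preds = clf.predict(Xte)

print(probs.shape)
print(probs[:3].round(3))
print(preds[:3])
```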
Worked Example · Threshold Tuning
12 min
    # threshold.py: see precision/recall trade as the threshold moves
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import precision_score, recall_score

    X, y = load_breast_cancer(return_X_y=True)
    # treat malignant (0) as the "positive" we care about catching
    y_pos = (y == 0).astype(int)

    Xtr, Xte, ytr, yte = train_test_split(X, y_pos, stratify=y_pos, random_state=0)
    clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=5000)).fit(Xtr, ytr)
    proba = clf.predict_proba(Xte)[:, 1]  # P(malignant)

    print(f"{'thresh':>7} {'precision':>10} {'recall':>8}")
    for t in [0.3, 0.4, 0.5, 0.6, 0.7]:
        pred = (proba >= t).astype(int)
        p = precision_score(yte, pred)
        r = recall_score(yte, pred)
        print(f"{t:>7} {p:>10.2f} {r:>8.2f}")
Sample output
     thresh  precision   recall
        0.3       0.89     1.00
        0.4       0.93     0.98
        0.5       0.95     0.95
        0.6       0.97     0.93
        0.7       1.00     0.88
Read the diff
Lower threshold → catch every malignant case (recall 1.00) at the cost of more false alarms (precision 0.89). For cancer screening you'd pick a low threshold — missing a tumour is far worse than a false alarm. The model gave you a probability; you choose the operating point based on real-world cost.
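"Choose the operating point based on real-world cost" can be made concrete by putting a price on each kind of mistake. In this sketch the costs (a missed tumour counts 10, a false alarm counts 1) are invented for illustration; real costs would come from the application.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
y_pos = (y == 0).astype(int)  # malignant is the positive class
Xtr, Xte, ytr, yte = train_test_split(X, y_pos, stratify=y_pos, random_state=0)
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=5000)).fit(Xtr, ytr)
proba = clf.predict_proba(Xte)[:, 1]  # P(malignant)

# Hypothetical costs, not taken from any real screening programme.
COST_FN = 10  # missed malignant case
COST_FP = 1   # false alarm

costs = {}
for t in [0.3, 0.4, 0.5, 0.6, 0.7]:
    pred = (proba >= t).astype(int)
    tn, fp, fn, tp = confusion_matrix(yte, pred, labels=[0, 1]).ravel()
    costs[t] = int(COST_FN * fn + COST_FP * fp)
    print(f"{t:>7} FN={fn:>3} FP={fp:>3} cost={costs[t]:>4}")

best_t = min(costs, key=costs.get)  # threshold with the lowest total cost
print("lowest-cost threshold:", best_t)
```

Changing the two cost constants shifts which threshold wins, which is the whole point: the model stays the same and only the decision rule moves.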
Try It Yourself
13 min
1. Fit logistic regression on any 2-class dataset. Print predict_proba for the first 5 test samples.
2. Use one feature only. Scatter the data (0/1) and overlay the model's predicted probability curve across that feature's range.
3. Plot the ROC curve and compute AUC with sklearn.metrics.roc_curve / roc_auc_score. What does AUC = 0.99 mean?
Hint
    from sklearn.metrics import roc_auc_score, RocCurveDisplay

    print("AUC:", round(roc_auc_score(yte, proba), 3))
    RocCurveDisplay.from_predictions(yte, proba)
AUC ≈ 1.0 means the model ranks positives above negatives almost perfectly across all thresholds.
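The ranking reading of AUC can be checked directly: take every (positive, negative) pair and count how often the positive gets the higher score, with ties counting as half. This sketch reuses the worked example's setup, where malignant is the positive class.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
y_pos = (y == 0).astype(int)  # malignant is the positive class
Xtr, Xte, ytr, yte = train_test_split(X, y_pos, stratify=y_pos, random_state=0)
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=5000)).fit(Xtr, ytr)
proba = clf.predict_proba(Xte)[:, 1]

pos = proba[yte == 1]  # scores of the true positives
neg = proba[yte == 0]  # scores of the true negatives

# Compare every positive with every negative via broadcasting.
diff = pos[:, None] - neg[None, :]
pairwise_auc = (diff > 0).mean() + 0.5 * (diff == 0).mean()

auc = roc_auc_score(yte, proba)
print(round(pairwise_auc, 4), round(auc, 4))
```

The two numbers should agree, so AUC = 0.99 reads as: pick a random positive and a random negative, and the model scores the positive higher about 99% of the time.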
Mini-Challenge · Pick the Operating Point
8 min
For a spam filter, a false positive (real email → spam) is very costly; a false negative (spam → inbox) is mild. Sweep thresholds and pick the one that gives precision ≥ 0.99 with the highest possible recall. Report it.
One possible solution
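    import numpy as np
    from sklearn.metrics import precision_score, recall_score

    # Assumes yte (true labels) and proba (P(positive)) from a fitted classifier.
    best = None
    for t in np.linspace(0.5, 0.99, 50):
        pred = (proba >= t).astype(int)
        if pred.sum() == 0:  # no positive predictions: precision is undefined
            continue
        p = precision_score(yte, pred)
        r = recall_score(yte, pred)
        if p >= 0.99 and (best is None or r > best[2]):
            best = (t, p, r)

    # Guard against the case where no threshold reaches the precision floor.
    if best is None:
        print("no threshold in the sweep reaches precision 0.99")
    else:
        print(f"chosen threshold {best[0]:.2f}: precision {best[1]:.2f}, recall {best[2]:.2f}")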
Non-negotiables: search thresholds, enforce the precision floor, maximise recall subject to it. This is how products tune classifiers to business cost.
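The same search can also be sketched with sklearn.metrics.precision_recall_curve, which evaluates every distinct score as a candidate threshold instead of a fixed grid. No spam dataset is given in this lesson, so the breast cancer data stands in here purely as an assumption for illustration.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in data: treat class 0 as the "positive" (think "spam") for this sketch.
X, y = load_breast_cancer(return_X_y=True)
y_pos = (y == 0).astype(int)
Xtr, Xte, ytr, yte = train_test_split(X, y_pos, stratify=y_pos, random_state=0)
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=5000)).fit(Xtr, ytr)
proba = clf.predict_proba(Xte)[:, 1]

precision, recall, thresholds = precision_recall_curve(yte, proba)
# precision and recall carry one extra final point (1.0, 0.0) with no threshold.
precision, recall = precision[:-1], recall[:-1]

ok = precision >= 0.99  # enforce the precision floor
if ok.any():
    idx = np.flatnonzero(ok)
    best = idx[np.argmax(recall[idx])]  # maximise recall among allowed thresholds
    chosen = (float(thresholds[best]), float(precision[best]), float(recall[best]))
    print(f"chosen threshold {chosen[0]:.3f}: precision {chosen[1]:.2f}, recall {chosen[2]:.2f}")
else:
    chosen = None
    print("no threshold reaches precision 0.99")
```

In practice the threshold should be picked on a validation split rather than the test set, so the reported precision is not optimistic.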
Recap
3 min
Logistic regression is a classifier: linear score → sigmoid → probability. predict_proba gives the probability; the threshold (default 0.5) turns it into a label. Move the threshold to trade precision for recall based on real-world cost. Scale features (it's linear). Fast, interpretable, a great baseline. Next: feature engineering, the prep that powers all of these.
Vocabulary Card
- sigmoid: S-shaped function mapping any number to (0, 1), which reads as a probability.
- predict_proba: Returns class probabilities instead of a hard label.
- threshold: The probability cutoff for predicting the positive class; your precision/recall dial.
- ROC / AUC: Curve of true-positive rate vs false-positive rate; AUC summarises ranking quality (1 = perfect, 0.5 = random guessing).
Homework
4 min
On a 2-class dataset, fit logistic regression, plot the ROC curve with AUC, and pick a threshold for a stated cost scenario (you choose precision-critical or recall-critical). Justify your choice in one sentence.
Combine the ROC code from Try-It #3 with the threshold search from the mini-challenge, adjusted to your chosen cost scenario.