PY-L5-29 · Image Classification with Pre-Trained Models

Learning Goals

3 min

Load a pre-trained model and classify a photo (ImageNet 1000 classes).
Understand transfer learning: reuse the "feature extractor", retrain only the head.
Freeze base layers; add and train a small classifier on your own classes.
Know why this beats training from scratch on small datasets.

Warm-Up · Borrow a Genius

5 min

A model trained on ImageNet has already learned, mostly in its early layers, what edges, textures, and shapes look like. Those features are general enough to help with most natural-image tasks. Transfer learning reuses them so you only train a tiny new "head" for your specific classes.

Today's big idea

Don't start from zero. A pre-trained net is a free, expert feature extractor. Freeze it, bolt on a small classifier, train only that — you get strong accuracy from a few hundred images instead of millions.

New Concept · Pretrained + Transfer Learning

14 min

1. Classify out of the box

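import numpy as np
from tensorflow.keras.applications.mobilenet_v2 import (
    MobileNetV2, preprocess_input, decode_predictions)
from tensorflow.keras.utils import load_img, img_to_array

model = MobileNetV2(weights="imagenet")     # downloads weights on first use

# load the photo at the 224x224 size the model expects
img = load_img("dog.jpg", target_size=(224, 224))
# add a batch axis -> (1, 224, 224, 3), then scale pixels to [-1, 1]
x = preprocess_input(np.expand_dims(img_to_array(img), 0))
preds = model.predict(x)                    # shape (1, 1000): one probability per class

# decode_predictions yields (class_id, name, probability) tuples
for _, name, prob in decode_predictions(preds, top=3)[0]:
    print(f"  {name:<20} {prob:.1%}")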

Example output (your photo and numbers will differ):

  golden_retriever     88.3%
  Labrador_retriever    6.1%
  cocker_spaniel        1.4%
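
To make the two helper calls less opaque, here is a minimal NumPy sketch of what they roughly do. It assumes MobileNetV2's preprocessing is the usual scale-to-[-1, 1] step and that decoding amounts to picking the highest-probability classes. The names `scale_pixels` and `top_k` and the four-label list are made up for illustration; they are not part of Keras.

```python
import numpy as np

def scale_pixels(x):
    """Map pixel values from [0, 255] to [-1, 1] (assumed MobileNetV2 scaling)."""
    return np.asarray(x, dtype=np.float32) / 127.5 - 1.0

def top_k(probs, labels, k=3):
    """Return the k (label, probability) pairs with the highest probability."""
    probs = np.asarray(probs)
    order = np.argsort(probs)[::-1][:k]      # indices sorted from most to least likely
    return [(labels[i], float(probs[i])) for i in order]

# black, mid-grey and white pixel values
scaled = scale_pixels([0.0, 127.5, 255.0])

# hypothetical 4-class output instead of the real 1000 ImageNet classes
labels = ["cat", "dog", "car", "tree"]
probs = [0.10, 0.70, 0.05, 0.15]
best = top_k(probs, labels, k=2)

for name, p in best:
    print(f"  {name:<20} {p:.1%}")
```

The real `decode_predictions` also maps class indices to ImageNet names, which this sketch replaces with a plain list.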

2. Transfer learning — your own classes

from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.applications import MobileNetV2

# load the base WITHOUT its classifier head, and freeze it
base = MobileNetV2(weights="imagenet", include_top=False,
                   input_shape=(224, 224, 3))
base.trainable = False                  # freeze the expert features

model = keras.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dropout(0.3),
    layers.Dense(3, activation="softmax"),  # YOUR 3 classes
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

The recipe

1. load pretrained base, include_top=False
2. freeze it (base.trainable = False)
3. add GlobalAveragePooling + Dense head for your classes
4. train only the head (fast, few images needed)
5. (optional) "fine-tune": unfreeze top base layers, train at a tiny LR
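
Step 3 of the recipe can be pictured without TensorFlow. The sketch below uses random numbers as a stand-in for the frozen base's output and assumes the feature map for one 224x224 image has shape 7x7x1280, which is what MobileNetV2 at default width should produce. The variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the frozen base's output on one image:
# a 7x7 grid of 1280-channel feature vectors (assumed shape).
features = rng.normal(size=(1, 7, 7, 1280)).astype(np.float32)

# GlobalAveragePooling2D: average over the two spatial axes.
pooled = features.mean(axis=(1, 2))                  # shape (1, 1280)

# Dense(3, softmax): the only weights that would be trained.
num_classes = 3
W = rng.normal(scale=0.01, size=(1280, num_classes)).astype(np.float32)
b = np.zeros(num_classes, dtype=np.float32)

logits = pooled @ W + b                              # shape (1, 3)
exp = np.exp(logits - logits.max(axis=1, keepdims=True))  # numerically stable softmax
probs = exp / exp.sum(axis=1, keepdims=True)

print(pooled.shape, probs.shape)
```

Pooling collapses each 7x7 grid to a single number per channel, so the head sees one 1280-value vector per image regardless of where in the picture a feature fired.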

Why it works on small data

The hard part — learning visual features — is already done. You're only learning "which combination of known features means my class". That needs far fewer examples and far less compute.
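
A quick back-of-the-envelope count shows how little is actually being learned. The figures are assumptions for illustration: 1280 feature channels out of the base, and roughly 2.26 million frozen base weights, which is what `model.summary()` typically reports for MobileNetV2 without its top. Check your own summary for exact numbers.

```python
feature_dim = 1280        # channels coming out of the frozen base (assumed)
num_classes = 3           # size of the new Dense head
base_params = 2_257_984   # frozen base weights (assumed; verify with model.summary())

# Dense layer: one weight per (feature, class) pair plus one bias per class
head_params = feature_dim * num_classes + num_classes
fraction = head_params / (head_params + base_params)

print(f"trainable head params: {head_params:,}")
print(f"share of all params:   {fraction:.2%}")
```

Under these assumptions, well under 1% of the network's weights are updated while the head trains.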

Worked Example · Transfer-Learn a Tiny Dataset

12 min

# transfer.py: classify your own image folders with MobileNetV2
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input

# folder structure:  data/train/<class>/*.jpg , data/val/<class>/*.jpg
train = keras.utils.image_dataset_from_directory(
    "data/train", image_size=(224, 224), batch_size=32)
val = keras.utils.image_dataset_from_directory(
    "data/val", image_size=(224, 224), batch_size=32)
class_names = train.class_names
print("classes:", class_names)

# MobileNetV2 expects pixels scaled to [-1, 1]; apply the same step at prediction time
train = train.map(lambda x, y: (preprocess_input(x), y))
val   = val.map(lambda x, y: (preprocess_input(x), y))

base = MobileNetV2(weights="imagenet", include_top=False,
                   input_shape=(224, 224, 3))
base.trainable = False

model = keras.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dropout(0.3),
    layers.Dense(len(class_names), activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train, validation_data=val, epochs=5)

Why this is amazing

With ~100 images per class and 5 epochs you can often reach 90%+
validation accuracy when the classes look clearly different.
Training from scratch would typically need thousands of images and hours.
That's the power of transfer learning.

Read the code

The frozen base does all the heavy lifting; only the tiny Dense head learns your classes. image_dataset_from_directory reads labelled folders automatically, using each subfolder name as the class label. This pattern underlies many real-world image apps; training a vision model from scratch is now the exception.
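
The folder-reading step can be pictured with a small standard-library sketch. It assumes, as the Keras documentation describes, that class names are the subfolder names in alphanumeric order and that each image gets its folder's index as an integer label. `infer_labels` is a hypothetical helper, not a Keras function; it mimics only the labelling, not image loading or batching.

```python
import tempfile
from pathlib import Path

def infer_labels(root):
    """Return (class_names, [(file_name, label_index), ...]) for root/<class>/*.jpg."""
    root = Path(root)
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    samples = []
    for index, name in enumerate(class_names):
        for path in sorted((root / name).glob("*.jpg")):
            samples.append((path.name, index))
    return class_names, samples

# Build a throwaway folder tree with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for cls, files in {"spoon": ["a.jpg"], "fork": ["b.jpg", "c.jpg"]}.items():
        folder = Path(tmp) / cls
        folder.mkdir()
        for f in files:
            (folder / f).touch()
    class_names, samples = infer_labels(tmp)

print(class_names)   # ['fork', 'spoon']
print(samples)       # [('b.jpg', 0), ('c.jpg', 0), ('a.jpg', 1)]
```

Integer labels like these are what the "sparse_categorical_crossentropy" loss expects, which is why the worked example needs no one-hot encoding.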

Try It Yourself

13 min

01 🟢 Classify 5 photos

Run MobileNetV2 on 5 photos from your phone. Print the top-3 predictions for each. Where is it confidently right? Confidently wrong?

02 🟡 Build a 2-class folder dataset

Collect ~30 images each of two things you can photograph (e.g., spoons vs forks). Transfer-learn a classifier. Report val accuracy.

03 🔴 Fine-tune

After the head trains, unfreeze the top ~20 layers of the base and continue training at a very low learning rate (1e-5). Does it improve?

Hint

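base.trainable = True                   # unfreeze everything first...
for layer in base.layers[:-20]:         # ...then re-freeze all but the last 20 layers
    layer.trainable = False
# recompile so the new trainable settings take effect
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-5),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(train, validation_data=val, epochs=3)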
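
If the `[:-20]` slice in the hint looks backwards, this plain-Python sketch may help. It uses a hypothetical `FakeLayer` class and a 10-layer list instead of a real Keras model, and keeps only the last 3 layers trainable.

```python
class FakeLayer:
    """Minimal stand-in for a Keras layer: just a name and a trainable flag."""
    def __init__(self, name):
        self.name = name
        self.trainable = True

layers_list = [FakeLayer(f"layer_{i}") for i in range(10)]

n_unfrozen = 3
# [:-n] selects everything EXCEPT the last n items, so those get frozen
for layer in layers_list[:-n_unfrozen]:
    layer.trainable = False

trainable_names = [layer.name for layer in layers_list if layer.trainable]
print(trainable_names)   # ['layer_7', 'layer_8', 'layer_9']
```

The same slicing logic applies to `base.layers`: the layers closest to the output stay trainable, while the early, general-purpose layers remain frozen.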

Mini-Challenge · A Real Mini-App

8 min

Train a 3-class classifier on something you care about (e.g., three local fruits, three dog breeds). Save the model. Write a predict script that takes an image path and prints the predicted class + confidence. (You'll wrap this in Flask in Lesson 44.)

Recap

3 min

Pre-trained models classify 1000 ImageNet categories out of the box. Transfer learning reuses their feature extractor: freeze the base, add a small head, and train only the head to get strong accuracy from a few hundred images. Optionally fine-tune the top layers at a tiny learning rate. Many real image apps are built this way. Next: classic computer vision with OpenCV.

Vocabulary Card

pre-trained model: A network already trained on a large dataset, reusable for new tasks.
transfer learning: Reusing a pre-trained model's features and retraining only a small head.
freezing: Setting trainable = False so layers' weights don't update.
fine-tuning: Unfreezing some base layers and training them at a very low learning rate.

Homework

4 min

Build a small transfer-learning classifier (2-3 classes, your own photos). Report val accuracy, show 3 correct and any wrong predictions. One paragraph: how much data + time did this take vs what training from scratch would have needed?
