Learning Goals
3 min
- Install TensorFlow/Keras; build a Sequential model.
- Add Dense layers with the right activations.
- compile with a loss + optimiser, then fit.
- Evaluate and read the training history.
Warm-Up · Install & Imports
5 min
pip install tensorflow
# (If the install is heavy on your machine, run this lesson in
# Google Colab, where TensorFlow is pre-installed and a free GPU is available.)
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

print(tf.__version__)
Keras turns the architecture you drew in Lesson 21 into code: each Dense(n, activation) is a layer of n neurons. compile picks how to learn; fit runs the training loop. You stopped doing the matrix maths by hand the moment you imported Keras.
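To see the matrix maths that Keras now handles for you, here is a minimal NumPy sketch of what one Dense(16, activation="relu") layer computes on a batch. The helper name dense_relu and the random weights are illustrative assumptions; Keras initialises and stores its own weights.

```python
import numpy as np

def dense_relu(x, W, b):
    """Forward pass of one fully-connected layer: relu(x @ W + b)."""
    z = x @ W + b            # (batch, n_in) @ (n_in, n_out) + (n_out,)
    return np.maximum(z, 0)  # ReLU replaces negative values with 0

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 4))    # 5 samples, 4 features
W = rng.normal(size=(4, 16))   # hypothetical weights for 16 neurons
b = np.zeros(16)               # one bias per neuron

out = dense_relu(x, W, b)
print(out.shape)  # (5, 16)
```

Each row of the output is one sample, each column one neuron's activation.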
New Concept · Sequential, Dense, compile, fit
14 min
Build the architecture
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(4,)),               # 4 input features (iris)
    layers.Dense(16, activation="relu"),
    layers.Dense(8, activation="relu"),
    layers.Dense(3, activation="softmax"),  # 3 classes
])
model.summary()
Sequential = a straight stack of layers. Dense(n) = a fully-connected layer of n neurons. The last layer's activation and size must match the task (Lesson 22).
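The parameter count that model.summary() reports can be checked by hand: a Dense layer has one weight per input-neuron pair plus one bias per neuron. A small sketch for the 4, 16, 8, 3 network above (dense_params is a hypothetical helper, not a Keras function):

```python
def dense_params(n_in, n_out):
    """Weights (n_in * n_out) plus one bias per neuron."""
    return n_in * n_out + n_out

layer_sizes = [4, 16, 8, 3]  # input features, then the size of each Dense layer
counts = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(counts)
print(counts, total)  # [80, 136, 27] 243
```

If the summary shows a different total, the layer sizes or the input shape are probably not what you intended.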
Compile — how to learn
model.compile(
    optimizer="adam",                        # how to update weights
    loss="sparse_categorical_crossentropy",  # what to minimise (integer labels)
    metrics=["accuracy"],
)
Loss cheat sheet:
- binary: binary_crossentropy (sigmoid output)
- multiclass (ints): sparse_categorical_crossentropy (softmax)
- multiclass (1-hot): categorical_crossentropy
- regression: mse / mae
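The two multiclass losses compute the same quantity; they differ only in how the labels are encoded. A NumPy sketch with made-up softmax outputs illustrates this (it mirrors the formula, not Keras's internal implementation):

```python
import numpy as np

probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])  # made-up softmax outputs, one row per sample
y_int = np.array([0, 1])             # integer labels
y_onehot = np.eye(3)[y_int]          # the same labels, one-hot encoded

# sparse form: pick the probability of the true class by index
sparse_loss = -np.log(probs[np.arange(len(y_int)), y_int]).mean()
# one-hot form: sum over classes; only the true class contributes
onehot_loss = -(y_onehot * np.log(probs)).sum(axis=1).mean()

print(np.isclose(sparse_loss, onehot_loss))  # True
```

Picking the wrong variant for your label format typically raises a shape error or silently trains on nonsense, so check how y is encoded before compiling.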
Fit — run the training loop
history = model.fit(
    X_train, y_train,
    validation_split=0.2,  # watch generalisation as it trains
    epochs=50,             # passes over the data
    batch_size=16,
    verbose=0,
)
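To get a feel for how epochs, batch_size and validation_split interact, this sketch counts the weight updates per epoch. The 120-row training set is an assumption for illustration, and the rounding rule for the validation slice is a simplification rather than Keras's exact behaviour.

```python
import math

def updates_per_epoch(n_samples, validation_split, batch_size):
    """Return (n_train, n_val, steps) for one epoch of mini-batch training."""
    n_val = round(n_samples * validation_split)  # assumed rounding rule
    n_train = n_samples - n_val
    steps = math.ceil(n_train / batch_size)      # the last batch may be smaller
    return n_train, n_val, steps

n_train, n_val, steps = updates_per_epoch(120, 0.2, 16)
print(n_train, n_val, steps)                  # 96 24 6
print("updates over 50 epochs:", steps * 50)  # updates over 50 epochs: 300
```

A smaller batch_size means more updates per epoch, which is one reason small datasets often train well with small batches.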
Evaluate & predict
loss, acc = model.evaluate(X_test, y_test, verbose=0)
print(f"test accuracy: {acc:.1%}")

probs = model.predict(X_test[:3])  # softmax probabilities
preds = probs.argmax(axis=1)       # pick the top class
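The argmax step can be tried without a trained model. The probabilities below are made up; the class names follow the order scikit-learn uses for iris.

```python
import numpy as np

class_names = np.array(["setosa", "versicolor", "virginica"])
probs = np.array([[0.90, 0.08, 0.02],
                  [0.05, 0.15, 0.80],
                  [0.30, 0.60, 0.10]])  # made-up values; each row sums to 1

preds = probs.argmax(axis=1)        # index of the largest probability per row
confidence = probs.max(axis=1)      # the probability assigned to that class
print(preds.tolist())               # [0, 2, 1]
print(class_names[preds].tolist())  # ['setosa', 'virginica', 'versicolor']
```

Keeping the confidence alongside the prediction makes it easy to spot borderline cases such as the third row.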
Worked Example · Iris in Keras
12 min
# iris_keras.py: a 3-layer net on iris
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from tensorflow import keras
from tensorflow.keras import layers

X, y = load_iris(return_X_y=True)
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.2, stratify=y, random_state=0)

# nets train better on scaled data; fit the scaler on the training split only
# so that no test-set statistics leak into training
scaler = StandardScaler().fit(Xtr)
Xtr, Xte = scaler.transform(Xtr), scaler.transform(Xte)

model = keras.Sequential([
    layers.Input(shape=(4,)),
    layers.Dense(16, activation="relu"),
    layers.Dense(8, activation="relu"),
    layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
history = model.fit(Xtr, ytr, validation_split=0.2, epochs=80, batch_size=8, verbose=0)

loss, acc = model.evaluate(Xte, yte, verbose=0)
print(f"test accuracy: {acc:.1%}")

# plot learning curves
plt.plot(history.history["accuracy"], label="train")
plt.plot(history.history["val_accuracy"], label="val")
plt.xlabel("epoch"); plt.ylabel("accuracy"); plt.legend()
plt.savefig("learning_curve.png", dpi=150)
Sample output
test accuracy: 96.7%
Read the diff
About ten lines of Keras (build, compile, fit, evaluate) built and trained a real neural net. Two habits to keep: scale the features (nets are sensitive to scale) and plot train vs val accuracy. If val stops rising while train keeps climbing, you're overfitting (Lesson 25). For iris, a random forest does this just as well in less time; neural nets shine on images and text, which is where we're headed.
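The same train-vs-val check can be done numerically on history.history. A minimal sketch, using invented curves in place of a real training run (best_epoch_and_gap is a hypothetical helper):

```python
def best_epoch_and_gap(history):
    """Return (epoch of best val accuracy, final train-minus-val gap)."""
    val = history["val_accuracy"]
    train = history["accuracy"]
    best = max(range(len(val)), key=val.__getitem__)  # first epoch at the peak
    gap = train[-1] - val[-1]
    return best + 1, gap  # epochs reported 1-based

# invented curves: train keeps climbing, val peaks and then slips
fake_history = {
    "accuracy":     [0.60, 0.80, 0.90, 0.95, 0.98, 1.00],
    "val_accuracy": [0.58, 0.78, 0.88, 0.90, 0.88, 0.86],
}
epoch, gap = best_epoch_and_gap(fake_history)
print(epoch, round(gap, 2))  # 4 0.14
```

A best epoch well before the end, combined with a growing gap, is the numeric signature of the overfitting pattern described above.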
Try It Yourself
13 min
Build a 2-hidden-layer net for a binary task (sigmoid output, binary_crossentropy). Print model.summary() and read the parameter count.
Try 1 hidden layer vs 3, and 8 neurons vs 64. Which generalises best on a held-out set?
Build a net for the property data (Lesson 20): linear output (1 node), loss="mse". Compare its RMSE to the random forest.
Hint
model = keras.Sequential([
    layers.Input(shape=(n_features,)),
    layers.Dense(64, activation="relu"),
    layers.Dense(32, activation="relu"),
    layers.Dense(1),  # no activation = regression
])
model.compile(optimizer="adam", loss="mse", metrics=["mae"])
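For the comparison with the random forest, RMSE is the square root of the MSE, in the same units as the target. One pitfall worth guarding against: model.predict returns shape (n, 1) while scikit-learn returns (n,), and subtracting the two without reshaping broadcasts to an (n, n) matrix. A sketch with made-up prices (rmse is a hypothetical helper):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float).reshape(y_true.shape)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

y_true = [200.0, 310.0, 150.0]          # made-up prices
net_pred = [[210.0], [300.0], [160.0]]  # shape (n, 1), like model.predict
forest_pred = [205.0, 305.0, 145.0]     # shape (n,), like a scikit-learn model

print(rmse(y_true, net_pred), rmse(y_true, forest_pred))  # 10.0 5.0
```

Using one helper for both models keeps the comparison fair.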
Mini-Challenge · Early Stopping
8 min
Add an EarlyStopping callback that halts training when validation loss stops improving and restores the best weights. This avoids wasting epochs and limits overfitting.
Show one possible solution
from tensorflow.keras.callbacks import EarlyStopping

es = EarlyStopping(monitor="val_loss", patience=10, restore_best_weights=True)
history = model.fit(Xtr, ytr, validation_split=0.2, epochs=500, batch_size=8,
                    callbacks=[es], verbose=0)
print("stopped at epoch", len(history.history["loss"]))
Non-negotiables: monitor val_loss, a patience window, restore_best_weights. Now you can set epochs high and let the callback decide when to stop.
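The patience rule itself is simple enough to sketch in plain Python. This is a simplified model of the idea, not the Keras implementation: it ignores options such as min_delta, and the loss values are invented.

```python
def early_stop_epoch(val_losses, patience):
    """Return (stop_epoch, best_epoch), both 1-based.

    Training halts once `patience` epochs in a row fail to beat the best loss.
    """
    best_loss = float("inf")
    best_epoch = 0
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0  # new best: reset the counter
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch  # stop here; best_epoch holds the weights to restore
    return len(val_losses), best_epoch    # patience never ran out

losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.51, 0.53, 0.56]  # invented
print(early_stop_epoch(losses, patience=3))  # (7, 4)
```

Training stops at epoch 7, but the weights worth keeping are from epoch 4, which is why restore_best_weights matters.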
Recap
3 min
Keras: Sequential stacks Dense layers; compile sets optimiser + loss + metrics; fit trains; evaluate scores. Scale your features, pick the loss by task, and always watch train-vs-val curves. EarlyStopping saves time and curbs overfitting. Next: what the optimiser and loss are actually doing.
Vocabulary Card
- Sequential: A linear stack of layers; the simplest Keras model.
- Dense: A fully-connected layer; every input connects to every neuron.
- epoch / batch: One full pass over the training data / a small chunk of samples processed in a single weight update.
- loss function: The number training tries to minimise; chosen by task type.
Homework
4 min
Build, compile, fit and evaluate a Keras net on any tabular dataset. Use EarlyStopping, plot the learning curves, and report test accuracy/RMSE. Compare it honestly to a scikit-learn model on the same data: which won, and was the net worth the extra complexity?
Combine iris_keras.py with EarlyStopping. Honest finding: on small tabular data, a random forest usually matches or beats the net with far less effort — nets earn their keep on images/text.