TensorFlow Lite Micro — Arduino L4

Learning Goals 5 min

TensorFlow Lite Micro (TFLM) is Google's ML runtime designed to fit on microcontrollers: the core runtime takes roughly 16 KB of code. Today you train a small model on your captured data and convert it to TFLM format, ready to embed in an Arduino sketch. By the end of this lesson you will:

Train a small neural network on your 90 captured samples using TensorFlow + Colab.
Convert the trained model to TensorFlow Lite (a smaller flatbuffer format) then to a C array.
Drop the C array into an Arduino sketch — ready for inference tomorrow.

Warm-Up 10 min

Hardware: just the laptop today. Tomorrow the model goes back onto the Nano.

Open Google Colab

Visit colab.research.google.com. Sign in. New Python 3 notebook. Free GPU available (not needed for our tiny model, but fast for bigger ones).

Upload your CSVs

Click the folder icon in Colab's sidebar → upload files. Create a folder named data and drag the 90 CSV files from yesterday into it (the loading code in Step 1 looks in data by default). Or zip them, upload the one zip, and unzip it in Python.
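If you take the zip route, the standard-library `zipfile` module can unpack the archive. This is a minimal sketch, assuming the archive is called `gestures.zip` (a hypothetical name) and the CSVs sit at its top level. The helper `unzip_samples` is illustrative, not part of the lesson code; it extracts into `data/`, the default folder used by `load` in Step 1.

```python
import zipfile
from pathlib import Path


def unzip_samples(zip_path, dest="data"):
    """Extract every .csv in the archive into dest and return their names."""
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        csv_names = [n for n in archive.namelist() if n.lower().endswith(".csv")]
        archive.extractall(dest_dir, members=csv_names)
    return sorted(csv_names)


# "gestures.zip" is an assumed file name; use whatever you uploaded.
if Path("gestures.zip").exists():
    names = unzip_samples("gestures.zip")
    print(f"extracted {len(names)} CSV files into data/")
```

Anything in the archive that is not a CSV is skipped, so stray notes or screenshots do not end up next to the training data.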

New Concept · The training pipeline 25 min

Step 1 — load the data

import numpy as np
import pandas as pd
import glob

def load(label, folder="data"):
    samples = []
    for f in sorted(glob.glob(f"{folder}/{label}_*.csv")):
        df = pd.read_csv(f)
        x = df.values   # shape (T, 3); assumes the CSV holds only the 3 axis columns
        # Pad/crop to fixed length T=150 (1.5 s at 100 Hz)
        if x.shape[0] >= 150:
            x = x[:150]
        else:
            pad = np.zeros((150 - x.shape[0], 3))
            x = np.vstack([x, pad])
        samples.append(x.flatten())   # shape (450,)
    return np.array(samples)

X_wave   = load("wave");   y_wave   = np.full(len(X_wave),   0)
X_punch  = load("punch");  y_punch  = np.full(len(X_punch),  1)
X_circle = load("circle"); y_circle = np.full(len(X_circle), 2)

X = np.vstack([X_wave, X_punch, X_circle])
y = np.concatenate([y_wave, y_punch, y_circle])

print(X.shape, y.shape)

Output: (90, 450) (90,). 90 samples, each 450 features (150 timesteps × 3 axes).
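The pad-or-crop rule inside `load` is easier to see on a toy recording. The sketch below restates it with plain Python lists and a target length of 4 rows instead of 150. `pad_or_crop` and `flatten` are illustrative helpers, not part of the lesson code.

```python
def pad_or_crop(rows, length, width=3):
    """Return exactly `length` rows: crop extras, or pad with zero rows."""
    if len(rows) >= length:
        return rows[:length]
    padding = [[0.0] * width for _ in range(length - len(rows))]
    return rows + padding


def flatten(rows):
    """Row-major flatten, the same order NumPy's flatten() uses by default."""
    return [value for row in rows for value in row]


short_rec = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]        # 2 rows: too short
long_rec = [[float(i)] * 3 for i in range(6)]          # 6 rows: too long

print(flatten(pad_or_crop(short_rec, 4)))   # 12 values, the last 6 are zeros
print(len(flatten(pad_or_crop(long_rec, 4))))          # 12
```

Every recording ends up with the same number of features, which is what lets the samples stack into one array.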

Step 2 — train/test split

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)
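To see what `stratify=y` buys you, here is a rough pure-Python sketch of a stratified split. It is not scikit-learn's actual implementation, and `stratified_split` is a hypothetical helper. It assumes 30 recordings per gesture, which is what 90 samples across 3 classes suggests.

```python
import random
from collections import Counter


def stratified_split(labels, test_fraction=0.2, seed=42):
    """Return (train_idx, test_idx) with the same class mix in both parts."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


labels = [0] * 30 + [1] * 30 + [2] * 30     # assumed: 30 recordings per gesture
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))                 # 72 18
print(Counter(labels[i] for i in test_idx))          # 6 of each class
```

Without stratification a random 18-sample test set could easily contain, say, 9 waves and 3 circles, which would make the test accuracy less representative.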

Step 3 — build a tiny neural network

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(450,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()

About 15,000 parameters across three Dense layers, which comes to roughly 15 KB once the weights are stored as 8-bit integers. Input = the 450 raw features (no fancy feature engineering; neural nets learn their own features).
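The parameter count that `model.summary()` reports can be checked by hand: a Dense layer has one weight per input-unit pair plus one bias per unit. A small sketch of that arithmetic (`dense_params` is an illustrative helper):

```python
def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit for a fully connected layer."""
    return n_inputs * n_units + n_units


layer_sizes = [450, 32, 16, 3]   # input width followed by each Dense layer
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)

print(per_layer)                                   # [14432, 528, 51]
print(total)                                       # 15011
print(f"float32: {total * 4 / 1024:.1f} KB")       # float32: 58.6 KB
print(f"int8:    {total * 1 / 1024:.1f} KB")       # int8:    14.7 KB
```

Almost all of the parameters sit in the first layer, so shrinking the input window or the first hidden layer is the quickest way to shrink the model.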

Step 4 — train

model.fit(X_train, y_train, epochs=50, batch_size=4,
          validation_data=(X_test, y_test))

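50 passes over the training data. Watch the loss go down and the accuracy go up. With 72 training samples and 3 well-separated classes, 90%+ test accuracy after 50 epochs is a reasonable expectation. Keep in mind that with only 18 test samples, each misclassification moves the score by about 5.6 percentage points.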
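The relationship between epochs, batch size and weight updates is simple arithmetic. A short sketch using the numbers from this lesson:

```python
import math

n_train = 72        # 80% of the 90 recordings
batch_size = 4
epochs = 50

# One weight update per batch; the last batch may be smaller, hence ceil.
updates_per_epoch = math.ceil(n_train / batch_size)
total_updates = updates_per_epoch * epochs
print(updates_per_epoch, total_updates)     # 18 900

n_test = 18
print(f"one test sample = {100 / n_test:.1f} percentage points")   # 5.6
```

A larger batch size means fewer, smoother updates per epoch; a smaller one means more, noisier updates.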

Step 5 — evaluate

loss, acc = model.evaluate(X_test, y_test)
print(f"test accuracy: {acc:.2%}")

# Confusion matrix
from sklearn.metrics import confusion_matrix
y_pred = model.predict(X_test).argmax(axis=1)
print(confusion_matrix(y_test, y_pred))

If test accuracy is below 70%, capture more data or vary it more; the gestures may also be too similar to tell apart.
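A confusion matrix from scikit-learn puts the true class on the rows and the predicted class on the columns, so the diagonal holds the correct predictions. The sketch below turns a matrix into per-class recall. The numbers are made up for illustration, not real results, and `per_class_recall` is a hypothetical helper.

```python
def per_class_recall(matrix, names):
    """Recall per class: correct predictions divided by the row total."""
    recall = {}
    for i, name in enumerate(names):
        row_total = sum(matrix[i])
        recall[name] = matrix[i][i] / row_total if row_total else 0.0
    return recall


# Made-up confusion matrix for an 18-sample test set
# (rows = true class, columns = predicted class).
example = [
    [6, 0, 0],   # wave:   all 6 correct
    [0, 5, 1],   # punch:  1 mistaken for circle
    [0, 2, 4],   # circle: 2 mistaken for punch
]
for name, value in per_class_recall(example, ["wave", "punch", "circle"]).items():
    print(f"{name}: {value:.0%}")
```

In this made-up case punch and circle are being confused with each other, which points at those two gestures rather than at the model as a whole.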

Step 6 — convert to TFLite

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]    # dynamic-range quantisation: weights stored as 8-bit ints
tflite_model = converter.convert()

with open("gesture_model.tflite", "wb") as f:
    f.write(tflite_model)

print(f"size: {len(tflite_model)} bytes")

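Typical for this architecture: about 15 KB or a little more, because the first Dense layer alone stores 14,400 weights at one byte each. Smaller hidden layers shrink it. Fits comfortably in flash.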
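To get a feel for what quantisation does to a weight, here is a simplified sketch of symmetric 8-bit quantisation: pick a scale so the largest weight maps to 127, round everything to integers, and multiply back by the scale at inference time. This illustrates the idea only; the converter's real scheme has more detail, and the weights below are made up.

```python
def quantise_int8(weights):
    """Symmetric 8-bit quantisation: map [-max|w|, +max|w|] onto [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantise(q, scale):
    """Recover approximate float values from the stored integers."""
    return [value * scale for value in q]


weights = [0.50, -0.30, 0.10, -0.02]        # made-up float weights
q, scale = quantise_int8(weights)
restored = dequantise(q, scale)
worst_error = max(abs(a - b) for a, b in zip(weights, restored))

print(q)                            # [127, -76, 25, -5]
print(worst_error <= scale / 2)     # True: rounding error is at most half a step
```

Each weight now takes one byte instead of four, at the cost of a rounding error no larger than half a quantisation step.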

Step 7 — to a C array

# xxd -i style conversion: write the model bytes out as a C array
with open("gesture_model.tflite", "rb") as f:
    raw = f.read()
hex_str = ",".join(f"0x{b:02x}" for b in raw)
c_code = (
    f"const unsigned char gesture_model[] = {{ {hex_str} }};\n"
    f"const unsigned int gesture_model_len = {len(raw)};\n"
)
with open("gesture_model.h", "w") as f:
    f.write(c_code)

You now have a .h file with the model as a byte array. Drop it into your Arduino sketch.
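A single line holding thousands of bytes is awkward to open in an editor. The variant below is a sketch, not the lesson's required format: it wraps the array at 12 bytes per line, adds an include guard, and parses the header back to confirm no byte was lost. `bytes_to_c_header` and `header_to_bytes` are hypothetical helper names, and the 20-byte sample stands in for the real `.tflite` contents.

```python
import re


def bytes_to_c_header(raw, name="gesture_model", per_line=12):
    """Render bytes as a C header with an include guard and wrapped lines."""
    guard = f"{name.upper()}_H"
    lines = []
    for start in range(0, len(raw), per_line):
        chunk = raw[start:start + per_line]
        lines.append("  " + ", ".join(f"0x{b:02x}" for b in chunk) + ",")
    body = "\n".join(lines)
    return (
        f"#ifndef {guard}\n#define {guard}\n\n"
        f"const unsigned char {name}[] = {{\n{body}\n}};\n"
        f"const unsigned int {name}_len = {len(raw)};\n\n"
        f"#endif  // {guard}\n"
    )


def header_to_bytes(header_text):
    """Parse the 0x.. literals back out, to check nothing was lost."""
    return bytes(int(h, 16) for h in re.findall(r"0x([0-9a-fA-F]{2})", header_text))


sample = bytes(range(20))                  # stand-in for the .tflite bytes
header = bytes_to_c_header(sample)
assert header_to_bytes(header) == sample   # round trip is lossless
print(header)
```

To use it on the real model, read `gesture_model.tflite` in binary mode and pass the bytes to `bytes_to_c_header`, then write the returned string to `gesture_model.h`.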

Worked Example · End-to-end on Colab 25 min

Open Colab. Paste in the §3 code blocks step by step. By the end you should have:

gesture_model.tflite — the trained model.
gesture_model.h — C-array version ready for Arduino.
Test accuracy printed (target: > 85%).

Common issues + fixes

Symptom	Fix
Test accuracy ~33% (random)	Data isn't separable. Re-capture with clearer gesture distinction.
Train acc 99%, test 50%	Overfit. Use a smaller model (fewer Dense neurons) or more training data.
Loss = NaN	Features wildly out of range. Normalise: `X = (X - X.mean()) / X.std()`.
Confusion matrix shows one class always wrong	That class is too similar to another. Re-capture or add more variation.
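If you do normalise, one common practice is to compute the mean and standard deviation from the training data only and reuse those two numbers for the test data (and later for live sensor readings). A minimal sketch with plain lists and made-up readings; `fit_scaler` and `apply_scaler` are illustrative helpers:

```python
import statistics


def fit_scaler(rows):
    """Global mean and population standard deviation over every value."""
    values = [v for row in rows for v in row]
    return statistics.fmean(values), statistics.pstdev(values)


def apply_scaler(rows, mean, std):
    """Shift and scale every value with the given statistics."""
    return [[(v - mean) / std for v in row] for row in rows]


train = [[100.0, 200.0], [300.0, 400.0]]     # made-up raw readings
test = [[250.0, 250.0]]

mean, std = fit_scaler(train)                # statistics from training data only
train_n = apply_scaler(train, mean, std)
test_n = apply_scaler(test, mean, std)       # reuse the same mean and std

print(mean, round(std, 2))                   # 250.0 111.8
print(test_n)                                # [[0.0, 0.0]]
```

`pstdev` is the population standard deviation, which matches what NumPy's `std()` computes by default.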

Bonus: visualise the data in 2D

from sklearn.decomposition import PCA
import matplotlib.pyplot as plt

pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)
for label, c in zip([0,1,2], ["red", "blue", "green"]):
    mask = (y == label)
    plt.scatter(X_2d[mask, 0], X_2d[mask, 1], c=c, label=["wave","punch","circle"][label])
plt.legend()
plt.show()

You should see three clouds of points; if they overlap heavily, the gestures may be too similar.

Try It Yourself 15 min

🟢 Easy

Goal: Try different hidden layer sizes. Compare test accuracy with [16,8], [32,16], [64,32]. Bigger isn't always better (overfit).

🟡 Medium

Goal: Add a 4th gesture (e.g. "tap"). Re-capture 30 examples. Re-train. Confirm 4-class accuracy.

🔴 Stretch

Goal: Use Edge Impulse instead of Colab. It does the same workflow as a managed web service: capture, label, train, evaluate, deploy. Compare experiences.

Mini-Challenge · Ship the model 10 min

Train a final model with > 85% test accuracy.
Save the C-array .h file.
Note the file size and the test accuracy.
Tomorrow we run it on the Nano.

Recap 5 min

Train in TensorFlow on a laptop or Colab, convert to TFLite with quantisation, export as a C array. Pipeline: capture → train → convert → deploy. The output: a ~15 KB model file ready to drop into an Arduino sketch. Tomorrow we run inference on the Nano.

TensorFlow Lite Micro (TFLM): Google's embedded ML runtime. ~16 KB code footprint. Runs TFLite models.
Quantisation: Reducing weights from 32-bit float to 8-bit int. Roughly 4× smaller and often faster on integer hardware, usually at the cost of a small accuracy loss.
Flatbuffer: The .tflite binary format. Compact, self-describing, mmap-able.
C array export: Converting the .tflite bytes into a const unsigned char[] for embedding in firmware.
Epoch: One full pass through the training data during training.
Batch size: How many samples processed before updating weights. 4–32 typical for small datasets.
Loss: The training objective. Lower = better fit. Goes down during training.
Validation set / test set: Data held back to honestly estimate model performance. Holding back 20% is a common choice; this lesson uses the same held-back 20% for both.

Homework 5 min

Run the §3 pipeline on Colab. Save the .h file.
Note your test accuracy.
Read ahead to ARD-L04-35 (Gesture Recognition).
