Level 4 · Lesson 31 · ARD-L04-31

What Is Edge AI?

90 minutes · Ages 13–16 · Opens Cluster F · Edge AI and Embedded ML

Learning Goals 5 min

"AI" usually means "data uploaded to a cloud server with a GPU". Edge AI moves that inference onto small devices — phones, doorbells, sensors. With TensorFlow Lite Micro a Nano 33 BLE Sense can run a real neural network and recognise gestures. By the end of this lesson you will:

  1. Compare cloud AI vs edge AI and explain why each fits different problems.
  2. Identify the resource constraints of edge devices (RAM, flash, no GPU, milliwatt power budget).
  3. Name three real edge AI uses: keyword spotting ("Hey Google"), gesture recognition, image classification.

Warm-Up 10 min

No new hardware today — conceptual lesson. Bring a Nano 33 BLE Sense if you have it; you'll meet its sensors.

Where AI runs today

Location | Hardware | Model size | Latency
Cloud (OpenAI, Google) | A100 GPU clusters | ~1 TB (LLMs) | 200 ms – seconds + network
Phone (Siri / Google Lens) | Mobile NPU | ~100 MB | ~50 ms
Edge AI (Nano 33 BLE Sense) | nRF52840 ARM Cortex-M4 | ~50 KB | ~10 ms

Edge AI runs the smallest models on the smallest chips. The trade-off: 50 KB of model vs 1 TB. You can't run ChatGPT on a Nano. But you CAN classify gestures, detect keywords ("hey"), recognise objects from a tiny camera. The use cases just have to be small enough.

New Concept · Edge vs cloud trade-offs 25 min

Why edge over cloud

  1. Privacy: data never leaves the device. Camera frames, microphone audio, biometric sensors — all stay local.
  2. Latency: no round trip to the cloud. 10 ms decisions instead of 200+ ms.
  3. Reliability: works offline. The doorbell that says "Amazon delivery" doesn't care if your WiFi is down.
  4. Cost: no per-API-call fee. You buy one £30 board, and every inference after that is free.
  5. Power: edge inference can run on milliwatts. A cloud GPU server draws hundreds of watts to kilowatts.
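
To put the latency point into numbers, here is a minimal sketch. It assumes the rough figures from the table earlier in this lesson (10 ms on-device, 200 ms for a cloud round trip); they are ballpark values, not measurements, and the function name is made up for this example.

```python
# Latency figures are the rough values from the lesson's table, not measurements.
EDGE_LATENCY_MS = 10     # on-device inference
CLOUD_LATENCY_MS = 200   # cloud inference including the network round trip


def max_decisions_per_second(latency_ms: float) -> float:
    """Upper bound on back-to-back decisions if each one takes latency_ms."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


if __name__ == "__main__":
    for name, ms in [("edge", EDGE_LATENCY_MS), ("cloud", CLOUD_LATENCY_MS)]:
        rate = max_decisions_per_second(ms)
        print(f"{name}: {ms} ms per decision, at most {rate:.0f} decisions per second")
```

With these assumed figures the edge device can react about 20 times as often as a cloud-only design, which matters for anything that follows a moving hand or a spoken word.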

Why cloud over edge

  1. Model size: GPT-class models simply don't fit on edge hardware.
  2. Updates: cloud models update centrally; edge models need OTA flashes.
  3. Collective data: cloud sees all users; edge sees one.

Edge AI examples in the wild

Product | Model | Chip class
Alexa "wake word" | ~50 KB keyword spotter | ARM Cortex-M4
Apple Face ID | Custom NN on the Neural Engine | Apple A-series NPU
Ring doorbell person-detect | ~200 KB image classifier | NPU on the doorbell SoC
Smart-watch heart-rhythm detection | Tiny LSTM | ARM Cortex-M
Industrial "is this machine failing" vibration analyser | ~10 KB autoencoder | STM32

The hardware: Nano 33 BLE Sense

Arduino's edge-AI poster board. Includes:

  • nRF52840 ARM Cortex-M4 + 1 MB flash + 256 KB RAM.
  • IMU (LSM9DS1 — 9-axis: accel + gyro + magnetometer).
  • Microphone (MP34DT05).
  • Temperature + humidity sensor (HTS221).
  • Pressure sensor (LPS22HB).
  • Gesture / proximity / light (APDS-9960).
  • BLE radio.

That's a wearable's worth of sensors on a £30 board, which makes it a good fit for recognising gestures, sounds, breathing patterns and similar signals.
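
A quick way to see what the board's memory limits mean is to do the budget arithmetic. The sketch below assumes the board figures listed above (1 MB flash, 256 KB RAM) and the rough sizes this lesson quotes for the gesture demo (about 80 KB of runtime, about 20 KB of model). The 60 KB of working memory is a hypothetical number chosen for illustration, and the calculation ignores the space taken by the sketch itself.

```python
KB = 1024

# Board capacity from the hardware list (nRF52840).
FLASH_BYTES = 1024 * KB   # 1 MB flash
RAM_BYTES = 256 * KB      # 256 KB RAM


def flash_fraction(model_bytes, runtime_bytes, flash_bytes=FLASH_BYTES):
    """Share of flash taken by model + runtime (ignores the sketch itself)."""
    return (model_bytes + runtime_bytes) / flash_bytes


def fits_on_board(model_bytes, runtime_bytes, working_ram_bytes,
                  flash_bytes=FLASH_BYTES, ram_bytes=RAM_BYTES):
    """True if both the flash budget and the RAM budget are respected."""
    return (model_bytes + runtime_bytes <= flash_bytes
            and working_ram_bytes <= ram_bytes)


if __name__ == "__main__":
    model, runtime = 20 * KB, 80 * KB   # rough figures quoted for the gesture demo
    scratch = 60 * KB                   # hypothetical working memory (assumption)
    print(f"flash used: {flash_fraction(model, runtime):.1%}")
    print("tiny model fits:", fits_on_board(model, runtime, scratch))
    print("1 TB model fits:", fits_on_board(10**12, runtime, scratch))
```

Under these assumptions the demo uses roughly a tenth of the flash, while a cloud-sized model misses the budget by many orders of magnitude.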

The Edge AI workflow

  1. Collect training data on the device (acceleration during "wave" gesture, etc.).
  2. Train a small model on a laptop or in the cloud (Google Colab, Edge Impulse).
  3. Convert the trained model to the TensorFlow Lite format (a compact .tflite file that the TensorFlow Lite Micro runtime can execute).
  4. Deploy the model onto the device as a C array, compiled into your sketch alongside the runtime.
  5. Run the model on live sensor data in your Arduino sketch.

We'll work through this end-to-end over L04-32 to L04-36.
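
Step 4 can feel abstract, so here is a minimal sketch of what "the model as a C array" means: the bytes of the model file are written out as C source text that the Arduino IDE can compile. The function name, the array name `g_model` and the sample bytes are placeholders for this example. In practice a ready-made tool such as `xxd -i` is often used for this job.

```python
def bytes_to_c_array(data: bytes, name: str = "g_model", per_line: int = 12) -> str:
    """Render raw bytes as C source: a const byte array plus its length."""
    if not data:
        raise ValueError("need at least one byte")
    lines = []
    for start in range(0, len(data), per_line):
        chunk = data[start:start + per_line]
        lines.append("  " + ", ".join(f"0x{b:02x}" for b in chunk) + ",")
    body = "\n".join(lines)
    return (
        f"const unsigned char {name}[] = {{\n{body}\n}};\n"
        f"const unsigned int {name}_len = {len(data)};\n"
    )


if __name__ == "__main__":
    # Placeholder bytes standing in for a model file; this is not a valid model.
    fake_model = bytes([0x1c, 0x00, 0x00, 0x00, 0x54, 0x46, 0x4c, 0x33])
    print(bytes_to_c_array(fake_model))
```

With a real model you would read the file using `open(path, "rb").read()` and save the returned text as a `.h` or `.cpp` file next to your sketch.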

Worked Example · Run the bundled gesture-classifier demo 20 min

If you have a Nano 33 BLE Sense

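  1. Install the "Arduino_TensorFlowLite" library (Tools → Manage Libraries). If it is not listed there, it may need to be installed manually from the library's GitHub repository.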
  2. File → Examples → Arduino_TensorFlowLite → magic_wand.
  3. Upload to your Nano 33 BLE Sense.
  4. Open Serial Monitor at 9600 baud.
  5. Hold the board flat. Move it in a clear "wing" (W), "ring" (O), or "slope" (an angle) gesture.
  6. The serial monitor prints which gesture it recognised (with confidence).

That's a real neural network classifying real-time IMU data on a £30 board with no internet. The TensorFlow Lite Micro runtime, including the operator kernels the model uses, occupies ~80 KB of flash; the model itself is ~20 KB.
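
One way to picture the last step of the demo, turning the network's output scores into a gesture name, is the sketch below. It is an illustration, not the example's actual code: the label order, the 0.8 threshold and the function name are all assumptions made for this lesson.

```python
# Hypothetical label order and threshold, chosen for illustration only.
LABELS = ["wing", "ring", "slope", "no gesture"]
THRESHOLD = 0.8


def pick_gesture(scores, labels=LABELS, threshold=THRESHOLD):
    """Return (label, score) for the top class, or (None, score) if unsure."""
    if len(scores) != len(labels):
        raise ValueError("need exactly one score per label")
    best = max(range(len(scores)), key=lambda i: scores[i])
    if scores[best] < threshold:
        return None, scores[best]
    return labels[best], scores[best]


if __name__ == "__main__":
    print(pick_gesture([0.05, 0.9, 0.03, 0.02]))   # one clear winner
    print(pick_gesture([0.4, 0.3, 0.2, 0.1]))      # no score reaches the threshold
```

The threshold is a design choice: a higher value means fewer false alarms but more missed gestures.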

If you don't have one

You can train + simulate models on Google Colab and watch the inference. The hardware is needed only for the actual on-device deployment. We'll provide simulation paths in L04-33.

Try It Yourself · Paper exercises 15 min

🟢 Q1

Edge or cloud: a real-time pedestrian detector on a car dashboard. Why?

Reveal

Edge. Latency matters (cars move fast); reliability matters (mid-tunnel = no cloud); privacy matters. NVIDIA Drive / Tesla FSD chips run these on-vehicle.

🟢 Q2

Edge or cloud: GPT-style chatbot. Why?

Reveal

Cloud. The model is too big (on the order of 1 TB of weights) and needs heavy GPUs. Even the small LLMs built to run on phones and laptops need gigabytes of RAM, far beyond the 256 KB on a Nano 33 BLE Sense.

🟡 Q3

Why is the "Alexa" wake word detected on-device, while the actual command interpretation happens in the cloud?

Reveal

The mic must listen 24/7. Streaming audio to the cloud constantly = privacy violation + bandwidth cost + latency. A tiny on-device wake-word model runs all the time; only after it detects the wake word does the device stream the next 2–5 s to the cloud for full interpretation. Edge + cloud combined.
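
The bandwidth part of this answer can be checked with a little arithmetic. The sketch below assumes uncompressed 16 kHz, 16-bit mono audio and 20 wake-ups per day. Both are illustrative assumptions, not figures from any real product; the 5 s clip length is the upper end of the range given above.

```python
SAMPLE_RATE_HZ = 16_000          # assumed sample rate
BYTES_PER_SAMPLE = 2             # assumed 16-bit mono, uncompressed
SECONDS_PER_DAY = 24 * 60 * 60


def audio_bytes(seconds):
    """Bytes of raw audio produced in the given number of seconds."""
    return int(seconds * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)


def daily_upload_bytes(wakes_per_day, seconds_per_wake):
    """Bytes uploaded per day if audio is sent only after each wake word."""
    return wakes_per_day * audio_bytes(seconds_per_wake)


if __name__ == "__main__":
    always_on = audio_bytes(SECONDS_PER_DAY)
    gated = daily_upload_bytes(wakes_per_day=20, seconds_per_wake=5)
    print(f"stream everything: {always_on / 1e6:.1f} MB per day")
    print(f"wake-word gated:   {gated / 1e6:.1f} MB per day")
```

Under these assumptions, gating on the wake word cuts the daily upload from gigabytes to a few megabytes.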

🔴 Q4

You want a baby monitor that detects when the baby is crying. Edge, cloud, or both? Justify.

Reveal

Pure edge: privacy (no streaming a baby's sounds to the cloud), reliability (works on flaky WiFi), latency (instant alert). A small audio model (~50 KB), trained to tell crying from other sounds, can run on a Nano 33 BLE Sense.

Mini-Challenge · Plan your edge AI project 10 min

Sketch in your notebook one product idea for each of:

  1. Gesture recognition (accelerometer).
  2. Keyword spotting (microphone).
  3. Vibration anomaly detection (industrial).

For each, note: model size, training data needed, why edge over cloud, target hardware.
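
For the "model size" note, a rough estimate is the number of weights times the bytes per weight. The sketch below assumes a small fully connected network; the layer widths (384 inputs, standing for 128 samples of 3 accelerometer axes, then 32, 16 and 3 units) are hypothetical, and a real model file carries some extra overhead on top of the raw weights.

```python
def dense_param_count(layer_sizes):
    """Weights plus biases for fully connected layers of the given widths."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def model_bytes(layer_sizes, bytes_per_param=4):
    """Approximate size of the weights; 4 bytes per parameter means 32-bit floats."""
    return dense_param_count(layer_sizes) * bytes_per_param


if __name__ == "__main__":
    layers = [384, 32, 16, 3]   # hypothetical gesture network
    print("parameters:", dense_param_count(layers))
    print(f"32-bit floats: {model_bytes(layers) / 1024:.1f} KB")
    print(f"8-bit values:  {model_bytes(layers, bytes_per_param=1) / 1024:.1f} KB")
```

With these made-up layer sizes the estimate lands near the ~50 KB figure used throughout this lesson, and storing each weight in one byte instead of four shrinks it to about a quarter.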

Recap 5 min

Edge AI = small models on small chips. Wins on privacy, latency, offline reliability, cost and power. Loses on model size, central updates and collective data. Nano 33 BLE Sense is Arduino's edge AI flagship: Cortex-M4 + 9-axis IMU + mic + climate + light. Tomorrow we learn ML basics; the rest of Cluster F applies them.

Edge AI
Running inference on the data-generating device, not in the cloud.
Inference
Running a trained model to make a prediction. Distinct from training (which produces the model).
TensorFlow Lite Micro
Google's ML runtime for microcontrollers. The core runtime fits in roughly 16 KB; the operator kernels a model needs add to that. Runs on Cortex-M0+ and up.
NPU (Neural Processing Unit)
Specialised AI chip, or a dedicated block inside a larger chip. Phones (Apple Neural Engine, Google Tensor) include them; Arduino-class boards like the Nano 33 BLE Sense don't.
Model size
How many bytes the trained weights occupy. Tiny models: < 100 KB. Big models: GB+.
Wake word
A small phrase ("Hey Google") detected on-device before activating cloud-based interpretation.
Keyword spotting
Detecting one of a small vocabulary of words. Doable on edge.
Anomaly detection
Spotting "this isn't the normal pattern". Small autoencoder models excel.

Homework 5 min

  1. If you have a Nano 33 BLE Sense, upload the magic_wand example. Try the gestures.
  2. Read ahead to ARD-L04-32 (ML Fundamentals).
  3. If you have a laptop, install Python 3 and the tensorflow package (or sign up for Google Colab — easier).