What Is Edge AI? — Arduino L4 · Advaslearning Hub

Learning Goals 5 min

"AI" usually means "data uploaded to a cloud server with a GPU". Edge AI moves that inference onto small devices — phones, doorbells, sensors. With TensorFlow Lite Micro a Nano 33 BLE Sense can run a real neural network and recognise gestures. By the end of this lesson you will:

Compare cloud AI vs edge AI and explain why each fits different problems.
Identify the resource constraints of edge devices (RAM, flash, no GPU, milliwatt power budget).
Name three real edge AI uses: keyword spotting ("Hey Google"), gesture recognition, image classification.

Warm-Up 10 min

No new hardware today — conceptual lesson. Bring a Nano 33 BLE Sense if you have it; you'll meet its sensors.

Where AI runs today

Location	Hardware	Model size	Latency
Cloud (OpenAI, Google)	A100 GPU clusters	~1 TB (LLMs)	200 ms – seconds + network
Phone (Siri / Google Lens)	Mobile NPU	~100 MB	~50 ms
Edge AI (Nano 33 BLE Sense)	nRF52840 ARM Cortex-M4	~50 KB	~10 ms

Edge AI runs the smallest models on the smallest chips. The trade-off is model size: about 50 KB versus about 1 TB. You can't run ChatGPT on a Nano, but you CAN classify gestures, detect keywords ("hey"), and recognise objects from a tiny camera. The use cases just have to be small enough.
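To put the table's numbers side by side, here is a small Python sketch. It uses only the approximate figures from the table above, takes the 200 ms lower bound for cloud latency, ignores network time, and assumes decimal units (1 KB = 1000 bytes) and one inference at a time.

```python
# Approximate figures copied from the table above (decimal units: 1 KB = 1000 bytes).
# Cloud latency uses the 200 ms lower bound and ignores network time.
TIERS = {
    "cloud": {"model_bytes": 1e12, "latency_ms": 200},
    "phone": {"model_bytes": 100e6, "latency_ms": 50},
    "edge": {"model_bytes": 50e3, "latency_ms": 10},
}

def size_ratio(bigger: str, smaller: str) -> float:
    """How many times larger one tier's model is than another's."""
    return TIERS[bigger]["model_bytes"] / TIERS[smaller]["model_bytes"]

def max_decisions_per_second(tier: str) -> float:
    """Upper bound on decisions per second if inferences run one after another."""
    return 1000 / TIERS[tier]["latency_ms"]

for name in TIERS:
    print(f"{name:5s}: up to {max_decisions_per_second(name):.0f} decisions/s")
print(f"cloud model is ~{size_ratio('cloud', 'edge'):,.0f}x the edge model")
```

With these figures the cloud model is roughly 20 million times larger than the edge model, which is why the edge use cases have to be small.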

New Concept · Edge vs cloud trade-offs 25 min

Why edge over cloud

Privacy: data never leaves the device. Camera frames, microphone audio, biometric sensors — all stay local.
Latency: no round trip to the cloud. 10 ms decisions instead of 200+ ms.
Reliability: works offline. The doorbell that says "Amazon delivery" doesn't care if your WiFi is down.
Cost: no per-API-call fee. One £30 board, then inference is free for the life of the device.
Power: edge inference can run on milliwatts. A cloud GPU server draws hundreds of watts to kilowatts.

Why cloud over edge

Model size: GPT-class models simply don't fit on edge hardware.
Updates: cloud models update centrally; edge models need an over-the-air (OTA) firmware update on every device.
Collective data: cloud sees all users; edge sees one.

Edge AI examples in the wild

Product	Model	Chip class
Alexa "wake word"	~50 KB keyword spotter	ARM Cortex-M4
Apple Face ID	Custom NN on the Neural Engine	Apple A-series NPU
Ring doorbell person-detect	~200 KB image classifier	NPU on the doorbell SoC
Smart-watch heart-rhythm detection	Tiny LSTM	ARM Cortex-M
Industrial "is this machine failing" vibration analyser	~10 KB autoencoder	STM32

The hardware: Nano 33 BLE Sense

Arduino's edge-AI poster board. Includes:

nRF52840 ARM Cortex-M4 + 1 MB flash + 256 KB RAM.
IMU (LSM9DS1 — 9-axis: accel + gyro + magnetometer).
Microphone (MP34DT05).
Temperature + humidity sensor (HTS221).
Pressure sensor (LPS22HB).
Gesture / proximity / light (APDS-9960).
BLE radio.

That's a wearable's worth of sensors on a £30 board, which makes it a good fit for gesture, sound, and breath recognition.

The Edge AI workflow

Collect training data on the device (acceleration during "wave" gesture, etc.).
Train a small model on a laptop or in the cloud (Google Colab, Edge Impulse).
Convert the trained model to TensorFlow Lite Micro (a tiny inference runtime).
Deploy the model + runtime onto the device as a C array.
Run the model on live sensor data in your Arduino sketch.
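Steps 1 and 5 share one mechanic: the continuous sensor stream is cut into fixed-length windows, and each window becomes one training example or one inference input. Below is a minimal Python sketch of that windowing, assuming a hypothetical window of 4 samples and a hop of 2 (real gesture models use much longer windows, and on the board the same logic would live in the Arduino sketch). The helper name `sliding_windows` is made up for this illustration.

```python
from collections import deque
from typing import Iterable, Iterator, List, Tuple

Sample = Tuple[float, float, float]  # one accelerometer reading: (x, y, z)

def sliding_windows(stream: Iterable[Sample], size: int, hop: int) -> Iterator[List[Sample]]:
    """Yield overlapping windows of `size` samples, advancing `hop` samples each time.

    Assumes hop <= size, so consecutive windows overlap or touch.
    """
    buf = deque(maxlen=size)   # keeps only the most recent `size` samples
    since_last = 0             # samples seen since the last emitted window
    for sample in stream:
        buf.append(sample)
        since_last += 1
        if len(buf) == size and since_last >= hop:
            since_last = 0
            yield list(buf)

# Fake readings standing in for live IMU data: x counts up, y and z stay fixed.
fake_stream = [(float(i), 0.0, 1.0) for i in range(8)]
for window in sliding_windows(fake_stream, size=4, hop=2):
    print([s[0] for s in window])
```

Overlapping windows mean a gesture that straddles a window boundary still appears whole in a neighbouring window.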

We'll work through this end-to-end over L04-32 to L04-36.
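The "C array" in step 4 is just the model file's bytes written out as C source, so they can be compiled into the sketch; the command-line tool `xxd -i` produces the same kind of output. Here is a hedged Python sketch of the idea. The helper `to_c_array`, the array name `g_model`, and the placeholder bytes are all assumptions for illustration, not part of any library.

```python
def to_c_array(data: bytes, name: str = "g_model", per_line: int = 12) -> str:
    """Render raw bytes as a C array definition plus a length constant."""
    lines = []
    for i in range(0, len(data), per_line):
        chunk = data[i:i + per_line]
        lines.append("  " + ", ".join(f"0x{b:02x}" for b in chunk) + ",")
    body = "\n".join(lines)
    return (
        f"const unsigned char {name}[] = {{\n{body}\n}};\n"
        f"const unsigned int {name}_len = {len(data)};\n"
    )

# Placeholder bytes, NOT a real model; a real run would read a .tflite file instead.
fake_model = bytes(range(16))
print(to_c_array(fake_model))
```

The printed text can be saved as a `.h` or `.cpp` file next to the sketch. Storing the model as a `const` array lets it live in flash rather than in the board's much smaller RAM.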

Worked Example · Run the bundled gesture-classifier demo 20 min

If you have a Nano 33 BLE Sense

Install the "Arduino_TensorFlowLite" library (Tools → Manage Libraries).
File → Examples → Arduino_TensorFlowLite → magic_wand.
Upload to your Nano 33 BLE Sense.
Open Serial Monitor at 9600 baud.
Hold the board flat. Move it in a clear "wing" (W shape), "ring" (circle), or "slope" (angle) gesture.
The serial monitor prints which gesture it recognised (with confidence).
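Reporting a gesture with its confidence typically comes down to picking the highest-scoring class and ignoring weak results. Here is a sketch of that logic in Python, assuming hypothetical labels, scores, and a 0.8 threshold (the bundled example's own labels and cut-off may differ).

```python
from typing import Optional, Sequence, Tuple

LABELS = ("wave", "circle", "tap")  # hypothetical class names
THRESHOLD = 0.8                     # assumed cut-off; below this, report nothing

def pick_gesture(scores: Sequence[float],
                 labels: Sequence[str] = LABELS,
                 threshold: float = THRESHOLD) -> Optional[Tuple[str, float]]:
    """Return (label, score) for the best class, or None if it is not confident enough."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    if scores[best] < threshold:
        return None
    return labels[best], scores[best]

print(pick_gesture([0.05, 0.90, 0.05]))  # ('circle', 0.9)
print(pick_gesture([0.40, 0.35, 0.25]))  # None
```

The threshold matters on a device that listens all the time: without it, every random movement would be reported as whichever gesture happened to score highest.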

That's a real neural network classifying real-time IMU data on a £30 board with no internet connection. The TensorFlow Lite Micro runtime occupies ~80 KB of flash; the model itself is ~20 KB.
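Those two figures can be checked against the board's 1 MB of flash with a quick calculation. This sketch uses only the approximate numbers quoted in this lesson and takes 1 MB = 1024 KB. It ignores everything else that shares the flash (the rest of the sketch, the Arduino core, the bootloader), so the real free space is smaller.

```python
from typing import Tuple

FLASH_KB = 1024   # 1 MB of flash on the nRF52840, taking 1 MB = 1024 KB
RUNTIME_KB = 80   # approximate runtime size quoted above
MODEL_KB = 20     # approximate model size quoted above

def flash_budget(flash_kb: int, *parts_kb: int) -> Tuple[int, int, float]:
    """Return (used KB, free KB, percent used) for the given components."""
    used = sum(parts_kb)
    return used, flash_kb - used, 100 * used / flash_kb

used, free, pct = flash_budget(FLASH_KB, RUNTIME_KB, MODEL_KB)
print(f"used {used} KB, free {free} KB ({pct:.1f}% of flash)")
# used 100 KB, free 924 KB (9.8% of flash)
```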

If you don't have one

You can train + simulate models on Google Colab and watch the inference. The hardware is needed only for the actual on-device deployment. We'll provide simulation paths in L04-33.

Try It Yourself · Paper exercises 15 min

🟢 Q1

Edge or cloud: a real-time pedestrian detector on a car dashboard. Why?

Reveal

Edge. Latency matters (cars move fast); reliability matters (mid-tunnel = no cloud); privacy matters. NVIDIA Drive / Tesla FSD chips run these on-vehicle.

🟢 Q2

Edge or cloud: GPT-style chatbot. Why?

Reveal

Cloud. The model is too big (~1 TB of weights) and needs heavy GPUs. Even small on-device LLMs need gigabytes of RAM, far beyond the 256 KB on a microcontroller-class edge board.

🟡 Q3

Why is the "Alexa" wake word detected on-device, while the actual command interpretation happens in the cloud?

Reveal

The mic must listen 24/7. Streaming audio to the cloud constantly would mean a privacy violation, bandwidth cost, and added latency. A tiny on-device wake-word model runs ~100% of the time; only after it detects the wake word does the device stream the next 2–5 s to the cloud for full interpretation. Edge + cloud combined.
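This edge + cloud split can be sketched as a tiny state machine: every audio frame goes through the on-device detector, but frames are only sent onward during a short window after a detection. Everything in this Python sketch is illustrative. The scores are made up, `WAKE_THRESHOLD` and `STREAM_FRAMES` are assumed values, and no real Alexa API is involved.

```python
from typing import Iterable, List

WAKE_THRESHOLD = 0.9   # assumed detector confidence needed to wake
STREAM_FRAMES = 3      # assumed number of frames uploaded after waking

def frames_to_upload(wake_scores: Iterable[float]) -> List[int]:
    """Return the indices of frames that would be streamed to the cloud."""
    uploaded = []
    remaining = 0  # frames left in the current streaming window
    for i, score in enumerate(wake_scores):
        if remaining > 0:
            uploaded.append(i)
            remaining -= 1
        elif score >= WAKE_THRESHOLD:
            remaining = STREAM_FRAMES  # the wake-word frame itself stays local
    return uploaded

# Made-up per-frame wake-word scores; only frame 2 crosses the threshold.
scores = [0.1, 0.2, 0.95, 0.3, 0.1, 0.2, 0.1, 0.1, 0.1, 0.1]
sent = frames_to_upload(scores)
print(sent)                                                  # [3, 4, 5]
print(f"{len(sent)} of {len(scores)} frames left the device")  # 3 of 10 frames left the device
```

The detector sees every frame, but only the few frames after a wake-up ever leave the device.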

🔴 Q4

You want a baby monitor that detects when the baby is crying. Edge, cloud, or both? Justify.

Reveal

Pure edge: privacy (no streaming a baby's sounds to the cloud), reliability (works on flaky WiFi), latency (instant alert). A small audio model (~50 KB), trained to tell crying from other sounds, would fit on a Nano 33 BLE Sense.

Mini-Challenge · Plan your edge AI project 10 min

Sketch in your notebook one product idea for each of:

Gesture recognition (accelerometer).
Keyword spotting (microphone).
Vibration anomaly detection (industrial).

For each, note: model size, training data needed, why edge over cloud, target hardware.
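For the "model size" note, a rough first estimate is the parameter count times the bytes per weight. This Python sketch does that for a fully connected network. The layer layout is hypothetical (384 inputs, for example 128 samples × 3 axes, down to 3 classes), and "1 byte per parameter" for a quantised model is an approximation. A real model file adds some overhead on top of the raw weights, so treat the result as a lower bound.

```python
from typing import Sequence

def dense_param_count(layer_sizes: Sequence[int]) -> int:
    """Weights plus biases for a stack of fully connected layers."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

def model_kb(layer_sizes: Sequence[int], bytes_per_weight: int) -> float:
    """Approximate size of the raw parameters in KB (1 KB = 1024 bytes)."""
    return dense_param_count(layer_sizes) * bytes_per_weight / 1024

layers = [384, 32, 16, 3]  # hypothetical gesture classifier
print(dense_param_count(layers))       # 12899 parameters
print(round(model_kb(layers, 4), 1))   # 32-bit float weights: 50.4 KB
print(round(model_kb(layers, 1), 1))   # 8-bit quantised weights: 12.6 KB
```

Going from 32-bit floats to 8-bit integers cuts the estimate by about four times, which is often the difference between fitting on a microcontroller and not.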

Recap 5 min

Edge AI = small models on small chips. Wins on privacy, latency, offline reliability, cost. Loses on model size and central updates. Nano 33 BLE Sense is Arduino's edge AI flagship — Cortex-M4 + 9-axis IMU + mic + climate + light. Tomorrow we learn ML basics; the rest of Cluster F applies them.

Edge AI: Running inference on the data-generating device, not in the cloud.
Inference: Running a trained model to make a prediction. Distinct from training (which produces the model).
TensorFlow Lite Micro: Google's ML runtime for microcontrollers. The core runtime fits in about 16 KB; the operator kernels a model uses add to that, which is why a full demo occupies more flash. Runs on Cortex-M0+ and up.
NPU (Neural Processing Unit): Specialised AI chip. Phones (Apple Neural Engine, Google Tensor) include them; Arduino-class boards don't.
Model size: How many bytes the trained weights occupy. Tiny models: < 100 KB. Big models: GB+.
Wake word: A small phrase ("Hey Google") detected on-device before activating cloud-based interpretation.
Keyword spotting: Detecting one of a small vocabulary of words. Doable on edge.
Anomaly detection: Spotting "this isn't the normal pattern". Small autoencoder models excel.

Homework 5 min

If you have a Nano 33 BLE Sense, upload the magic_wand example. Try the gestures.
Read ahead to ARD-L04-32 (ML Fundamentals).
If you have a laptop, install Python 3 and the tensorflow package (or sign up for Google Colab — easier).