ML Fundamentals — Arduino L4 · Advaslearning Hub

Learning Goals 5 min

Before we use ML on a Nano 33 BLE Sense, we need the vocabulary: just enough to discuss training, classification, neural networks, and overfitting, without scary maths. By the end of this lesson you will:

Define feature, label, training, inference, model.
Describe a simple neural network (input layer → hidden layer → output layer) with one analogy each.
Explain three common ML failure modes: underfitting, overfitting, training-on-bad-data.

Warm-Up 10 min

No hardware. Just paper.

The classic ML problem

Imagine a basket of fruit. You want a machine to sort "apple" from "banana". You measure 50 examples: length, width, weight. Each labelled apple or banana. You feed this to a learning algorithm. It builds a model that, given new measurements, predicts the label.

That's machine learning in one paragraph. Let's unpack each piece.

New Concept · The vocabulary 25 min

Five terms

Term	Meaning	Fruit example
Feature	A measurable property of one example	Length, width, weight
Label	The answer for one example	"apple" or "banana"
Dataset	A collection of (features, label) pairs	50 measured fruits
Training	The process of learning a model from a dataset	Algorithm finds rules separating apples from bananas
Inference	Using a trained model to predict on new data	New fruit: length 18, width 3, weight 110 → predicts banana
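To make the table concrete, here is a minimal sketch of how such a dataset might be held in Python. The `Example` type and the four measurements are made up for illustration; they are not real fruit data.

```python
from typing import NamedTuple


class Example(NamedTuple):
    features: tuple  # (length_cm, width_cm, weight_g)
    label: str       # "apple" or "banana"


# Made-up measurements, for illustration only.
dataset = [
    Example((7.0, 7.0, 150.0), "apple"),
    Example((6.5, 6.0, 130.0), "apple"),
    Example((19.0, 3.5, 115.0), "banana"),
    Example((20.0, 3.0, 120.0), "banana"),
]

# Training code usually wants features and labels as two parallel lists.
features = [example.features for example in dataset]
labels = [example.label for example in dataset]

print(len(dataset), "examples with", len(features[0]), "features each")
print("labels seen:", sorted(set(labels)))
```

Each row of the dataset is one (features, label) pair, which is all a training algorithm gets to see.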

The simplest model: a rule

Even before neural networks, classification can be a simple rule:

If length / width > 2 → banana.
Else → apple.

On typical fruit this rule is right most of the time (say, around 95%), but not always: some apples are oblong and some bananas are short. ML algorithms can find better rules, especially when features interact in non-obvious ways.
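The rule above can be written as a tiny function. This is only a sketch; the function name `classify_by_ratio` is an assumption made for this example.

```python
def classify_by_ratio(length_cm: float, width_cm: float) -> str:
    """Hand-written rule: fruit more than twice as long as it is wide is a banana."""
    if length_cm / width_cm > 2:
        return "banana"
    return "apple"


print(classify_by_ratio(18, 3))  # ratio 6.0 → banana
print(classify_by_ratio(7, 7))   # ratio 1.0 → apple
```

Calling the function on a new fruit is inference; nobody trained anything here, because a person chose the threshold by hand.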

The neural network analogy

Imagine 3 layers of small judges:

Input layer: takes the features (length, width, weight).
Hidden layer: each "neuron" looks at all inputs, weights them, decides "does this look like X?" (X might be "long thing" or "light thing" or "round thing" — the network learns what X means).
Output layer: takes the hidden layer's votes, decides "apple" or "banana".

Training adjusts the weights between layers so the predictions match the labels. Modern deep networks have many hidden layers and millions of weights — but the principle is the same.
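The three layers can be sketched in a few lines of Python. The weights below are hand-picked for illustration, not trained, and the sigmoid "squashing" function is one common choice that this lesson does not prescribe.

```python
import math


def sigmoid(z: float) -> float:
    """Squash any number into the range 0..1."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias):
    """One judge: weighted sum of its inputs plus a bias, then squashed."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


# Hand-picked (hypothetical) weights over inputs [length_cm, width_cm, weight_g].
HIDDEN = [
    ([1.0, 0.0, 0.0], -12.0),   # "long thing": high when length is well above 12 cm
    ([0.0, 0.0, -0.1], 12.5),   # "light thing": high when weight is well below 125 g
]
OUTPUT = ([4.0, 2.0], -3.0)     # turns the two hidden votes into a banana score


def banana_score(features):
    hidden_votes = [neuron(features, weights, bias) for weights, bias in HIDDEN]
    return neuron(hidden_votes, *OUTPUT)


def predict(features):
    return "banana" if banana_score(features) > 0.5 else "apple"


print(predict([18, 3, 110]))  # banana
print(predict([7, 7, 150]))   # apple
```

Training would replace the hand-picked numbers in `HIDDEN` and `OUTPUT` with values learned from labelled examples.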

Three failure modes

Underfitting: model too simple. Can't separate apples from bananas. Solution: more features, bigger model.
Overfitting: model memorises the training data. Perfect on training fruits, terrible on new ones. Solution: more data, simpler model, regularisation.
Training-on-bad-data: if every banana in your dataset was yellow and every apple red, the model may learn "colour" as the deciding feature. A green apple, which is not red, can then look like a banana to it. Solution: representative training data.

The training/test split

Take your 50 fruits. Use 40 for training; hold back 10 for "testing". Train on the 40. Predict on the 10. The test accuracy is your honest estimate of how good the model is. If training accuracy is 100% but test is 60%, you've overfit.
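A minimal sketch of the split, borrowing the ten fruits from the worked example later in this lesson (so 8 for training and 2 for testing rather than 40 and 10). The "memoriser" model is a deliberately overfit lookup table invented for this illustration; it is not a real learning algorithm.

```python
import random

# (length_cm, weight_g, label): the ten fruits from the worked example.
FRUITS = [
    (6, 120, "apple"), (7, 140, "apple"), (6.5, 130, "apple"),
    (8, 180, "apple"), (7.5, 150, "apple"),
    (20, 120, "banana"), (18, 110, "banana"), (22, 130, "banana"),
    (17, 100, "banana"), (19, 105, "banana"),
]


def train_test_split(examples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the data, then hold back the last test_fraction of it."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    cut = len(shuffled) - n_test
    return shuffled[:cut], shuffled[cut:]


def make_memoriser(train, default="apple"):
    """Overfit on purpose: remember every training fruit, guess `default` otherwise."""
    table = {(length, weight): label for length, weight, label in train}
    return lambda length, weight: table.get((length, weight), default)


def accuracy(model, examples):
    hits = sum(model(length, weight) == label for length, weight, label in examples)
    return hits / len(examples)


train, test = train_test_split(FRUITS)
model = make_memoriser(train)
print("train accuracy:", accuracy(model, train))  # 1.0 by construction
print("test accuracy:", accuracy(model, test))    # depends on which fruits were held out
```

The memoriser always scores 100% on the fruits it has seen. On the held-out fruits it can only guess, which is exactly the gap the test set is meant to expose.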

Edge AI workflow recap

From L04-31:

Collect features (sensor readings) labelled with the right class.
Train a small model.
Validate with a held-out test set.
Convert the model to TensorFlow Lite format so it can run under TFLite Micro.
Deploy onto Nano 33 BLE Sense.
Inference at runtime on live sensor data.

Worked Example · Tiny dataset on paper 25 min

Imagine 10 fruits:

#	Length (cm)	Weight (g)	Label
1	6	120	apple
2	7	140	apple
3	6.5	130	apple
4	8	180	apple
5	7.5	150	apple
6	20	120	banana
7	18	110	banana
8	22	130	banana
9	17	100	banana
10	19	105	banana

Find a rule by inspection

Length > 10 → banana. Else → apple. 100% on this dataset.
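Finding the rule "by inspection" can also be automated. The sketch below tries a threshold halfway between each pair of neighbouring lengths and keeps the most accurate one. This brute-force search is an illustration chosen for this example, not the method a real training library uses.

```python
# (length_cm, label) for the ten fruits in the table above.
FRUITS = [
    (6, "apple"), (7, "apple"), (6.5, "apple"), (8, "apple"), (7.5, "apple"),
    (20, "banana"), (18, "banana"), (22, "banana"), (17, "banana"), (19, "banana"),
]


def accuracy_at(threshold_cm, examples):
    """Accuracy of the rule: length > threshold → banana, else apple."""
    hits = sum(
        ("banana" if length > threshold_cm else "apple") == label
        for length, label in examples
    )
    return hits / len(examples)


def best_threshold(examples):
    """Try the midpoint between each pair of neighbouring lengths."""
    lengths = sorted({length for length, _ in examples})
    candidates = [(a + b) / 2 for a, b in zip(lengths, lengths[1:])]
    return max(candidates, key=lambda t: accuracy_at(t, examples))


threshold = best_threshold(FRUITS)
print(threshold, accuracy_at(threshold, FRUITS))  # 12.5 1.0
print(accuracy_at(10, FRUITS))                    # 1.0, the rule found by eye
```

Any threshold between 8 cm and 17 cm separates this dataset perfectly, which is why both 10 and 12.5 score 100%.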

Is weight useful?

The weight ranges of apples (120–180 g) and bananas (100–130 g) overlap between 120 g and 130 g, so weight alone is a worse feature than length. During training, ML algorithms learn to rely more on the features that separate the classes.
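The overlap can be checked directly from the table. The helper names `value_range` and `overlap` below are assumptions made for this sketch.

```python
# (length_cm, weight_g, label) for the ten fruits in the table above.
FRUITS = [
    (6, 120, "apple"), (7, 140, "apple"), (6.5, 130, "apple"),
    (8, 180, "apple"), (7.5, 150, "apple"),
    (20, 120, "banana"), (18, 110, "banana"), (22, 130, "banana"),
    (17, 100, "banana"), (19, 105, "banana"),
]


def value_range(values):
    return (min(values), max(values))


def overlap(range_a, range_b):
    """Return the shared interval of two (low, high) ranges, or None if they are disjoint."""
    low = max(range_a[0], range_b[0])
    high = min(range_a[1], range_b[1])
    return (low, high) if low <= high else None


apple_weights = [w for _, w, label in FRUITS if label == "apple"]
banana_weights = [w for _, w, label in FRUITS if label == "banana"]
apple_lengths = [l for l, _, label in FRUITS if label == "apple"]
banana_lengths = [l for l, _, label in FRUITS if label == "banana"]

print(overlap(value_range(apple_weights), value_range(banana_weights)))  # (120, 130)
print(overlap(value_range(apple_lengths), value_range(banana_lengths)))  # None
```

A feature whose class ranges do not overlap can separate the classes with a single threshold; one whose ranges overlap cannot.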

What if you add "orange"?

Oranges might be about 7 cm long and weigh about 150 g, which makes them indistinguishable from apples by length or weight. You need a third feature: colour, shape, or surface texture. Algorithms can handle tens or hundreds of features.

What if a label is wrong?

Suppose you accidentally label fruit #6 (the 20 cm one) as "apple". Now the "length > 10" rule fails on it. ML training usually tolerates a few wrong labels, because they are outweighed by the many correct examples. But many wrong labels = bad model.
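A quick arithmetic check of the mislabelled case, as a sketch. It compares the sensible "length > 10" rule with a hypothetical rule that bends to accommodate the bad label ("length > 21"), both scored against the noisy labels.

```python
LENGTHS_CM = [6, 7, 6.5, 8, 7.5, 20, 18, 22, 17, 19]
TRUE_LABELS = ["apple"] * 5 + ["banana"] * 5

# Fruit #6 (index 5, the 20 cm one) is accidentally labelled "apple".
noisy_labels = list(TRUE_LABELS)
noisy_labels[5] = "apple"


def rule_accuracy(threshold_cm, labels):
    """Score the rule: length > threshold → banana, else apple."""
    predictions = ["banana" if length > threshold_cm else "apple" for length in LENGTHS_CM]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


print(rule_accuracy(10, TRUE_LABELS))   # 1.0
print(rule_accuracy(10, noisy_labels))  # 0.9, only the mislabelled fruit counts as wrong
print(rule_accuracy(21, noisy_labels))  # 0.7, chasing the bad label costs three real bananas
```

With one bad label out of ten, the original rule still scores best, so a learner that maximises accuracy has no reason to abandon it.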

Try a tiny neural network mentally

Hidden neuron: "is length > 12?". Output: "if hidden neuron says yes, banana; else apple". That's a 1-neuron network. Real networks have hundreds, learning combinations like "long AND lightweight = banana; round AND heavy = apple".
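Rather than picking the "12" by hand, a single neuron can learn its own cut-off from the ten fruits. The sketch below uses the classic perceptron update rule, swapped in here as the simplest possible training loop; the lesson does not prescribe a training algorithm, and the scaling by 10 and the learning rate are arbitrary choices.

```python
LENGTHS_CM = [6, 7, 6.5, 8, 7.5, 20, 18, 22, 17, 19]
IS_BANANA = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]


def fires(w, b, length_cm):
    """The neuron says 'banana' (1) when its weighted input is positive."""
    return 1 if w * (length_cm / 10) + b > 0 else 0


def train_neuron(lengths, targets, max_epochs=1000, learning_rate=0.1):
    """Perceptron rule: nudge weight and bias after every wrong answer."""
    w, b = 0.0, 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for length, target in zip(lengths, targets):
            error = target - fires(w, b, length)
            if error != 0:
                w += learning_rate * error * (length / 10)
                b += learning_rate * error
                mistakes += 1
        if mistakes == 0:  # a full pass with no errors: training is done
            break
    return w, b


w, b = train_neuron(LENGTHS_CM, IS_BANANA)
boundary_cm = -b / w * 10  # length at which the neuron switches from apple to banana
print("learned weight and bias:", w, b)
print("decision boundary (cm):", boundary_cm)
print([fires(w, b, length) for length in LENGTHS_CM])
```

Because apples and bananas are cleanly separated by length, this loop is guaranteed to stop with every training fruit classified correctly. The boundary it lands on sits somewhere in the gap between the longest apple and the shortest banana.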

Try It Yourself · Pen-and-paper 15 min

🟢 Q1

You collect 5 examples of waving and 5 of clapping using an accelerometer. What's the "feature" for each?

Reveal

A sequence of (x, y, z) accel values over time. Might be 50 samples × 3 channels = 150 features per example. Or a summary: peak frequency, mean magnitude, etc.
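Both options can be sketched in Python. The readings below are placeholders (a board lying still, about 1 g on the z axis), not captured sensor data, and the helper names are assumptions made for this example.

```python
import math

SAMPLES_PER_WINDOW = 50

# Placeholder readings in g (made up): one (x, y, z) tuple per sample.
window = [(0.0, 0.0, 1.0)] * SAMPLES_PER_WINDOW


def flatten(window):
    """Raw representation: every channel of every sample becomes one feature."""
    return [value for sample in window for value in sample]


def mean_magnitude(window):
    """Summary representation: average length of the acceleration vector."""
    return sum(math.sqrt(x * x + y * y + z * z) for x, y, z in window) / len(window)


raw_features = flatten(window)
print(len(raw_features))       # 150 = 50 samples × 3 channels
print(mean_magnitude(window))  # 1.0 for this motionless placeholder
```

The raw form keeps the shape of the motion over time; a summary statistic gives a much smaller input but throws some of that shape away.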

🟢 Q2

You trained a model with 90% accuracy on the training set and 50% on the test set. Underfit or overfit?

Reveal

Overfit: the model memorised the training data and can't generalise.

🟡 Q3

You collect 100 examples of "wave hand", all performed with your right hand. The deployed product fails for left-handed users. What went wrong?

Reveal

Training data not representative. Solution: collect from both hands; even better, include variation in motion paths and speeds.

🔴 Q4

Why does 99% accuracy not mean the model is good if the "positive" class is rare?

Reveal

If positives are 1 in 100, predicting "always negative" gives 99% accuracy and is useless. Look at precision / recall / F1 score / confusion matrix instead. Class imbalance hides under accuracy.
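The arithmetic is easy to check in code. This sketch scores an "always negative" predictor on 100 examples with a single positive. Reporting precision as 0.0 when nothing was predicted positive is a convention chosen here, since the ratio is strictly undefined.

```python
labels = [1] + [0] * 99   # one positive in a hundred
predictions = [0] * 100   # the lazy model: always say "negative"

pairs = list(zip(labels, predictions))
true_pos = sum(y == 1 and p == 1 for y, p in pairs)
false_pos = sum(y == 0 and p == 1 for y, p in pairs)
false_neg = sum(y == 1 and p == 0 for y, p in pairs)
true_neg = sum(y == 0 and p == 0 for y, p in pairs)

accuracy = (true_pos + true_neg) / len(labels)
recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0

print("accuracy:", accuracy)    # 0.99
print("recall:", recall)        # 0.0, it never finds the positive
print("precision:", precision)  # 0.0 by the convention above
```

The four counts are the cells of the confusion matrix; recall shows at once what accuracy hides.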

Mini-Challenge · Design a dataset 10 min

You want to recognise three gestures with the Nano 33 BLE Sense: wave, punch, clap.

How many examples per gesture do you need? (50 is a starting point.)
What variation should each gesture include? (Different speeds? Both hands? Multiple people?)
What's the feature representation? (Raw accel sequence? Summary statistics?)
How would you split into train / test? (80 / 20 standard.)

Recap 5 min

ML vocabulary: feature, label, dataset, training, inference. Models learn from labelled examples; performance depends on data quality + representation. Watch for underfit / overfit / bad data. Tomorrow we collect real training data from the Nano 33 BLE Sense.

Feature: A measurable input variable. For gesture recognition: accelerometer values over time.
Label: The correct answer for one example ("wave", "punch").
Dataset: A collection of labelled examples.
Training: The process of learning model parameters from the dataset.
Inference: Using the trained model to predict the label for new data.
Model: The learned mathematical function mapping features to predictions.
Underfitting: Model too simple; poor on both train and test.
Overfitting: Model memorises training data; great on train, bad on test.
Train / test split: Holding back some data to honestly evaluate the model. 80/20 typical.
Class imbalance: One label has many more examples than another. Skews training; needs careful handling.
Neural network: A model made of layered "neurons" that combine features non-linearly. The dominant model class today.

Homework 5 min

Sign up for Google Colab if you haven't. Free Python ML environment.
Read ahead to ARD-L04-33 (Capturing Training Data). Bring the Nano 33 BLE Sense + a laptop.
