Learning Goals
3 minBy the end of this lesson you can:
- Implement linear search three ways: find-first, find-all, count.
- Return both the index and the value when needed.
- Reason about best, worst and average case — for a list of N items.
- Identify when linear search is the right choice and when to reach for binary search instead.
Warm-Up · You've Done This Already
5 minEvery in operator and every .index() call you've written is a linear search:
names = ["Aisyah", "Wei Jie", "Priya", "Iman"] "Priya" in names # Python walks the list until it finds Priya names.index("Priya") # Same walk; returns the index (2)
Today we write the algorithm by hand. Same logic, more control — return early on hit, handle the not-found case explicitly, count matches.
Linear search is unbeatable when the data isn't sorted, doesn't fit any index, or is searched only once. It's O(n) — slow on giant lists, but simple and always works.
New Concept · Three Flavours
14 min1 · Find-first — return the index, or -1
def find_first(lst, target): for i, item in enumerate(lst): if item == target: return i return -1 print(find_first(["a", "b", "c", "d"], "c")) # 2 print(find_first(["a", "b", "c", "d"], "z")) # -1
The canonical shape. enumerate gives index + value. return i exits the moment we hit. return -1 after the loop signals "not found".
2 · Find-all — collect every match
def find_all(lst, target): return [i for i, item in enumerate(lst) if item == target] print(find_all([10, 20, 30, 20, 40, 20], 20)) # [1, 3, 5] print(find_all([1, 2, 3], 99)) # []
List comprehension over enumerate. No early exit — must scan everything to find all matches.
3 · Count occurrences
def count_of(lst, target): n = 0 for item in lst: if item == target: n += 1 return n # Or more Pythonically: def count_of_v2(lst, target): return sum(1 for item in lst if item == target)
Same loop shape. We're still doing O(n) work — full scan unavoidable.
The cases — best, worst, average
Best case Target is the first item → 1 comparison Worst case Target is the last (or not present) → N comparisons Average case Random position in a random list → N/2 comparisons Conclusion: linear search is O(n). For N = 1,000,000, that's up to 1 million comparisons.
What if the items are objects?
Same shape — just compare a specific attribute or use a predicate.
heroes = [Hero("Aisyah", 100), Hero("Wei Jie", 80), Hero("Priya", 95)] # By attribute def find_by_name(heroes, name): for h in heroes: if h.name == name: return h return None # By any predicate def find_first_where(seq, predicate): for item in seq: if predicate(item): return item return None # Find the first hero with HP < 90 weak = find_first_where(heroes, lambda h: h.hp < 90) print(weak.name if weak else "none found")
The predicate version is the most general — works for anything you can express as a boolean condition.
Early-exit short-circuit · the built-in way
Python's next() with a generator expression gives you find-first in one line:
weak = next((h for h in heroes if h.hp < 90), None) # Returns first match, or None if no match
Same logic as the loop version, far more compact. Use this for one-off finds in real code.
Worked Example · Verifying the Bounds
12 minLet's measure linear search empirically — confirm it's O(n). Save as linear_bench.py:
# linear_bench.py — count comparisons; demonstrate O(n) behaviour import random import time class CountingList: """Wraps a list, counts comparisons against its items.""" def __init__(self, items): self.items = items self.comparisons = 0 def __iter__(self): return iter(self.items) def __getitem__(self, i): self.comparisons += 1 return self.items[i] def __len__(self): return len(self.items) def linear_search(seq, target): """Find target. Returns index or -1.""" for i in range(len(seq)): if seq[i] == target: return i return -1 # Test for sizes from 10k to 1m for size in [10_000, 100_000, 1_000_000]: data = CountingList(list(range(size))) target = -1 # not in list — worst case data.comparisons = 0 start = time.perf_counter() linear_search(data, target) elapsed = time.perf_counter() - start print(f" N = {size:>8} comparisons = {data.comparisons:>8} " f"time = {elapsed * 1000:.2f} ms")
Sample output
N = 10000 comparisons = 10000 time = 1.32 ms N = 100000 comparisons = 100000 time = 12.15 ms N = 1000000 comparisons = 1000000 time = 121.05 ms
Read the diff
Comparisons grow exactly linearly with N. Time grows roughly the same — 10× more items = 10× more time. That's O(n). Tomorrow's binary search will do the same job in ~20 comparisons for a million items — exponentially fewer.
Try It Yourself
13 minWrite find_by_score(students, target_score) that returns the first student matching, or None. Each student is a dict with name and score.
Hint
def find_by_score(students, target_score): for s in students: if s["score"] == target_score: return s return None students = [{"name": "A", "score": 50}, {"name": "B", "score": 95}] print(find_by_score(students, 95)) # {'name': 'B', 'score': 95} print(find_by_score(students, 33)) # None
Write indices_of(lst, target) that returns a list of every index where target appears.
Hint
def indices_of(lst, target): return [i for i, x in enumerate(lst) if x == target] print(indices_of([1, 2, 1, 3, 1, 4], 1)) # [0, 2, 4] print(indices_of("banana", "a")) # [1, 3, 5]
Works on strings too — strings are iterable, enumerate gives (index, char).
Write first_where(seq, predicate) that returns the first item for which predicate(item) is True. Implement both with a loop and with next() + generator expression.
Hint
# Loop version def first_where(seq, predicate): for item in seq: if predicate(item): return item return None # next() version def first_where_v2(seq, predicate): return next((item for item in seq if predicate(item)), None) # Both work the same way: print(first_where([3, -1, 4, -2, 5], lambda x: x < 0)) # -1 print(first_where_v2([3, -1, 4, -2, 5], lambda x: x < 0)) # -1
The next(gen, default) trick handles the not-found case via the default value.
Mini-Challenge · The Multi-Field Search
8 minGiven a list of dicts with name, email, phone — build search(records, query) that returns every record matching the query as a substring of any field.
records = [ {"name": "Aisyah", "email": "aisyah@example.com", "phone": "012-3456789"}, {"name": "Wei Jie", "email": "wj@school.edu.my", "phone": "017-9921122"}, {"name": "Priya", "email": "priya@school.edu.my", "phone": "019-1112233"}, ] search(records, "school") # 2 matches — Wei Jie and Priya search(records, "017") # 1 match — Wei Jie search(records, "aisyah") # 1 match — Aisyah (case-insensitive)
Show one possible solution
def search(records, query): q = query.lower() out = [] for r in records: if any(q in str(v).lower() for v in r.values()): out.append(r) return out for r in search(records, "school"): print(f" {r['name']:<10} {r['email']}") print() for r in search(records, "017"): print(f" {r['name']:<10} {r['phone']}")
Non-negotiables: case-insensitive (lower-case both sides), substring match (in), any across all field values. The shape is exactly what every "search box" in every app does — Pythonically.
Recap
3 minLinear search walks a sequence until it finds the target or runs out. Three flavours: find-first (early exit, return index), find-all (full scan, return list), count (full scan, return number). O(n) in the worst case — N comparisons for N items. in, .index(), and next() + generator are Python's built-in versions. Linear search is the right tool when the data isn't sorted, when you'll only search once, or when the list is small.
Vocabulary Card
- linear search
- Walk the sequence; check each item; stop on hit or end. O(n).
- find-first vs find-all
- Find-first can exit early; find-all must scan the whole list.
- predicate
- A function returning True/False. Generalises search beyond equality.
- next(gen, default)
- Pythonic find-first idiom — generator expression + default for not-found.
Homework
4 minBuild search_kit.py with four functions:
find_min(seq)— return the smallest item using a single pass.find_min_index(seq)— return the index of the smallest.any_negative(seq)— True if any item is negative (early exit when you see one).all_positive(seq)— True only if every item is positive (early exit on the first non-positive).
All four are O(n) in worst case — the best case for #3 is O(1) (find a negative immediately) and same for #4 (find a non-positive immediately).
Sample · search_kit.py
def find_min(seq): if not seq: return None smallest = seq[0] for x in seq[1:]: if x < smallest: smallest = x return smallest def find_min_index(seq): if not seq: return -1 idx = 0 for i in range(1, len(seq)): if seq[i] < seq[idx]: idx = i return idx def any_negative(seq): for x in seq: if x < 0: return True # early exit! return False def all_positive(seq): for x in seq: if x <= 0: return False # early exit on first non-positive return True print(find_min([3, 1, 4, 1, 5, 9, 2, 6])) # 1 print(find_min_index([3, 1, 4, 1, 5, 9, 2, 6])) # 1 (first occurrence) print(any_negative([1, 2, -3, 4])) # True print(all_positive([1, 2, 3])) # True print(all_positive([1, 2, 0])) # False
Non-negotiables: any_negative and all_positive use early exit. The Pythonic versions would be any(x < 0 for x in seq) and all(x > 0 for x in seq) — Python's built-ins are already short-circuiting.