Learning Goals
3 min
By the end of this lesson you can:
- Use yield to turn a function into a generator.
- Trace how yield pauses execution and resumes it.
- Use yield inside a loop to produce a stream of values.
- Compare a generator to yesterday's class-based iterator: same protocol, simpler code.
Warm-Up · The Smallest Generator
5 min

    def hello():
        print("before yield")
        yield 1
        print("after yield, before next")
        yield 2
        print("after second yield")

    g = hello()       # nothing prints! g is a generator object, not yet running
    print(next(g))    # → prints "before yield" then yields 1
    print(next(g))    # → prints "after yield, before next" then yields 2
    print(next(g))    # → prints "after second yield" then raises StopIteration
Three new things to absorb. (1) Calling hello() doesn't run the function — it makes a generator. (2) Each next(g) runs the function until the next yield. (3) After the last yield, falling off the end raises StopIteration automatically.
A generator is a function that pauses at each yield. Its local variables stay frozen in place while it is paused. Calling next() resumes execution until the next yield. You get the whole iterator protocol for free.
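To watch the pause-and-freeze behaviour directly, the sketch below inspects a generator between calls with the standard library's inspect module. The function counter_demo and its variable names are made up for illustration; they are not part of the lesson's code.

```python
import inspect


def counter_demo():
    # 'count' is an ordinary local variable; it survives every pause.
    count = 0
    while count < 3:
        count += 1
        yield count


g = counter_demo()
state_before = inspect.getgeneratorstate(g)          # not started yet
first = next(g)                                      # runs up to the first yield
state_paused = inspect.getgeneratorstate(g)          # paused at a yield
locals_paused = dict(inspect.getgeneratorlocals(g))  # copy of the frozen locals
second = next(g)                                     # resumes where it left off

print(state_before)   # GEN_CREATED
print(state_paused)   # GEN_SUSPENDED
print(locals_paused)  # {'count': 1}
print(first, second)  # 1 2
```

The locals are copied with dict() straight away so the snapshot reflects the moment of the pause rather than any later state.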
New Concept · yield in Anger
14 min
From yesterday's class to today's function
    # Yesterday: a class-based iterator (11 lines of code)
    class Countdown:
        def __init__(self, n):
            self.n = n

        def __iter__(self):
            return self

        def __next__(self):
            if self.n <= 0:
                raise StopIteration
            v = self.n
            self.n -= 1
            return v

    # Today: a generator function (4 lines)
    def countdown(n):
        while n > 0:
            yield n
            n -= 1

    for x in countdown(5):
        print(x)    # 5, 4, 3, 2, 1
Same behaviour. Same iterator protocol underneath. About a third of the code.
What Python does behind the scenes
When you call countdown(5):
- Python sees a yield inside the function definition, so it knows this is a generator function.
- Calling countdown(5) returns a generator object; no code from inside has run yet.
- The first next() on it runs until the first yield n, hands back n, and freezes.
- The next next() picks up right after the yield, runs n -= 1 and the loop check, then yields again.
- When the function's execution falls off the end (no more yields to hit), Python raises StopIteration.
The for-loop calls next() internally, just like with any other iterator.
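What the for-loop does internally can be written out by hand. The helper manual_for below is a hypothetical name used only for this sketch; it mirrors the loop's steps rather than reproducing Python's actual implementation.

```python
def countdown(n):
    while n > 0:
        yield n
        n -= 1


def manual_for(iterable):
    """Collect items the way a for-loop does, step by step."""
    collected = []
    it = iter(iterable)          # the for-loop first asks for an iterator
    while True:
        try:
            item = next(it)      # then calls next() repeatedly
        except StopIteration:    # and stops quietly when this is raised
            break
        collected.append(item)
    return collected


print(manual_for(countdown(3)))  # [3, 2, 1]
```

This is why StopIteration never surfaces in ordinary loops: the loop catches it and treats it as the end signal.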
Generators ARE iterators
A generator object satisfies the iterator protocol — __iter__ returns itself; __next__ resumes execution. Everywhere an iterator works, a generator works.
    g = countdown(5)
    print(type(g))        # <class 'generator'>
    print(iter(g) is g)   # True: generator returns itself from __iter__
    print(next(g))        # 5
    print(next(g))        # 4
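One consequence of being an iterator is that a generator can only be consumed once. The short check below, a sketch using the abstract base classes in collections.abc, confirms the type relationship and shows what a second pass returns.

```python
from collections.abc import Generator, Iterator


def countdown(n):
    while n > 0:
        yield n
        n -= 1


g = countdown(3)
is_iterator = isinstance(g, Iterator)     # has __iter__ and __next__
is_generator = isinstance(g, Generator)   # also has send, throw and close

first_pass = list(g)    # consumes every value
second_pass = list(g)   # nothing left: the generator is exhausted

print(is_iterator, is_generator)  # True True
print(first_pass)                 # [3, 2, 1]
print(second_pass)                # []
```

To iterate again, call countdown(3) again to get a fresh generator object.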
Yielding from a loop
The pattern you'll use most. Loop something; yield each interesting value.
    def squares(n):
        for i in range(1, n + 1):
            yield i * i

    print(list(squares(5)))    # [1, 4, 9, 16, 25]
    print(sum(squares(10)))    # 385

    def even_only(numbers):
        for n in numbers:
            if n % 2 == 0:
                yield n

    print(list(even_only(range(1, 11))))    # [2, 4, 6, 8, 10]
squares generates one value per iteration. even_only filters — yields only matching items. These are the "generator versions" of comprehensions; useful when you can't fit logic into one expression.
Multiple yields per call
A generator function can contain several yield statements. Each next() runs to the following one:
    def fanfare():
        yield "drumroll..."
        yield "trumpets!"
        yield "TA-DA!"

    for line in fanfare():
        print(line)
Generator expressions vs generator functions
From PY-L3-21, you already know generator expressions, such as (x for x in seq). They're sugar for simple generator functions. For one-liners, use the expression. For anything that needs several statements, nested logic, or state carried between values, use a def ... yield function.
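A side-by-side comparison may make the equivalence concrete. The function even_squares is an illustrative name, not something from PY-L3-21; both forms below produce the same values lazily.

```python
def even_squares(numbers):
    """Generator function: yields the square of each even number."""
    for n in numbers:
        if n % 2 == 0:
            yield n * n


data = range(1, 11)
from_function = list(even_squares(data))
from_expression = list(n * n for n in data if n % 2 == 0)

print(from_function)                     # [4, 16, 36, 64, 100]
print(from_function == from_expression)  # True
```

Here the expression is the better fit because the logic is a single filter plus a single transform.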
The yield from delegation
If you want a generator to yield everything from another iterable:
    def chain(a, b):
        yield from a
        yield from b

    print(list(chain([1, 2, 3], [4, 5])))    # [1, 2, 3, 4, 5]
yield from iterable is sugar for for x in iterable: yield x. Great for composing generators.
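The sketch below puts the explicit loop and yield from next to each other, then uses delegation recursively. The names chain_loop, chain_from and flatten are hypothetical helpers chosen for this example.

```python
def chain_loop(a, b):
    """Explicit loops: yield each item by hand."""
    for x in a:
        yield x
    for x in b:
        yield x


def chain_from(a, b):
    """Same output using yield from."""
    yield from a
    yield from b


def flatten(nested):
    """Recursively flatten nested lists into a single stream."""
    for item in nested:
        if isinstance(item, list):
            yield from flatten(item)   # delegate to the inner generator
        else:
            yield item


print(list(chain_loop([1, 2], [3])))       # [1, 2, 3]
print(list(chain_from([1, 2], [3])))       # [1, 2, 3]
print(list(flatten([1, [2, [3, 4]], 5])))  # [1, 2, 3, 4, 5]
```

The recursive case is where yield from pays off most: without it, each level would need its own inner loop.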
Worked Example · File Line-by-line
12 min
Generators shine for processing big files without slurping. Save as scan_log.py:
    # scan_log.py: pipeline of generators on a (potentially huge) log file
    import re
    from collections import Counter


    def read_lines(path):
        """Yield stripped lines from a file. Memory-light."""
        with open(path, encoding="utf-8") as f:
            for line in f:
                yield line.strip()


    def error_lines(lines):
        """Filter: yield only ERROR-level entries."""
        for line in lines:
            if "ERROR" in line:
                yield line


    def parse_user(lines):
        """Transform: yield just the user name from each line."""
        for line in lines:
            m = re.search(r"user=(\w+)", line)
            if m:
                yield m.group(1)


    # Build the pipeline
    lines = read_lines("app.log")
    errs = error_lines(lines)
    users = parse_user(errs)

    # Process: only now is anything actually read
    counts = Counter(users)
    print(counts.most_common(5))
Sample app.log
    2026-05-27T08:01:02 INFO user=aisyah action=login
    2026-05-27T08:02:11 ERROR user=wei_jie action=invalid_login
    2026-05-27T08:05:45 ERROR user=priya action=db_query
    2026-05-27T08:11:33 INFO user=iman action=logout
    2026-05-27T08:14:50 ERROR user=wei_jie action=auth_failed
    ...
Read the diff
Three generator functions composed into a pipeline. Nothing actually reads the file until Counter(users) starts iterating. Each line is touched once, then thrown away. A 10GB log file would work — no memory spike. With list comprehensions you'd build three intermediate lists, each potentially huge.
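The memory difference can be observed with sys.getsizeof. This is a rough sketch: getsizeof reports only the size of the container object itself, and the exact byte counts depend on the Python build, so the example compares the two sizes instead of printing them.

```python
import sys

n = 100_000
as_list = [i * i for i in range(n)]   # every value stored at once
as_gen = (i * i for i in range(n))    # values produced on demand

list_bytes = sys.getsizeof(as_list)   # grows with the number of items
gen_bytes = sys.getsizeof(as_gen)     # stays small whatever n is

print(gen_bytes < list_bytes)         # True
print(sum(as_gen) == sum(as_list))    # True
```

Both forms give the same total; only the list pays for holding all the values in memory at the same time.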
Each value flows through the pipeline one at a time. Counter asks users for next; users asks errs; errs asks lines; lines reads one line from the file. Repeat. Memory cost = one line. Time cost = one pass over the file.
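The one-at-a-time flow can be made visible by logging each stage. The sketch below swaps the file for an in-memory list and records events in a module-level list named events; both are assumptions made so the example runs without app.log.

```python
events = []   # records the order in which the stages run


def read_lines(raw_lines):
    # Stand-in for the file reader: takes a list instead of a path.
    for line in raw_lines:
        events.append(f"read: {line}")
        yield line


def error_lines(lines):
    for line in lines:
        if "ERROR" in line:
            events.append(f"keep: {line}")
            yield line


sample = ["INFO a", "ERROR b", "ERROR c"]
pipeline = error_lines(read_lines(sample))

assert events == []      # building the pipeline runs nothing
first = next(pipeline)   # pulls just enough lines to find one ERROR

print(first)    # ERROR b
print(events)   # ['read: INFO a', 'read: ERROR b', 'keep: ERROR b']
```

Note that "ERROR c" has not been read yet: each stage does only the work needed to answer the current next() call.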
Try It Yourself
13 min
Rewrite yesterday's Countdown class as a one-function generator. Run it.
Hint
    def countdown(n):
        while n > 0:
            yield n
            n -= 1

    print(list(countdown(5)))    # [5, 4, 3, 2, 1]
From eleven lines of class to four lines of function. Same behaviour.
Write a generator positives(seq) that yields only positive numbers from any iterable.
Hint
    def positives(seq):
        for n in seq:
            if n > 0:
                yield n

    print(list(positives([-3, 2, -1, 5, 0, 7])))    # [2, 5, 7]
Four lines. The equivalent class-based version would be about ten.
Write a generator running_total(seq) that yields the running sum.
Hint
    def running_total(seq):
        total = 0
        for n in seq:
            total += n
            yield total

    print(list(running_total([1, 2, 3, 4, 5])))    # [1, 3, 6, 10, 15]
The local total persists between yields. That's the whole point — the generator remembers state without a class.
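Each generator object keeps its own copy of that state. As a small illustration, the sketch below interleaves two running_total generators (the variable names a and b are arbitrary) and shows that their totals do not interfere.

```python
def running_total(seq):
    total = 0
    for n in seq:
        total += n
        yield total


a = running_total([1, 2, 3])
b = running_total([10, 20, 30])

# Alternate between the two generators; each resumes with its own total.
results = [next(a), next(b), next(a), next(b)]
print(results)  # [1, 10, 3, 30]
```

This is the same isolation a class would give through separate instances, without writing the class.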
Mini-Challenge · Three-Generator Pipeline
8 min
Build word_pipeline.py. Three generator functions, composed:
- lines(text): yields one line at a time, stripped.
- words(lines): yields each word from each line, lower-cased.
- long_words(words, min_len=4): yields only words at least min_len long.
Connect them and print the unique long words from a paragraph.
One possible solution
    # word_pipeline.py: composed generators
    def lines(text):
        for line in text.splitlines():
            s = line.strip()
            if s:
                yield s

    def words(lines):
        for line in lines:
            for w in line.split():
                yield w.lower()

    def long_words(words, min_len=4):
        for w in words:
            if len(w) >= min_len:
                yield w

    text = """The quick brown fox jumps over the lazy dog.
    Pack my box with five dozen liquor jugs!"""

    pipe = long_words(words(lines(text)), min_len=5)
    unique = sorted(set(pipe))
    print(unique)
Non-negotiables: three generator functions, each def+yield; the pipeline composed by nesting; set + sorted to dedupe and order. No intermediate list is built between the stages; the words are collected only at the very end, by set and sorted.
Recap
3 min
A generator is a function with yield in it. Calling it returns a generator object. next() resumes execution to the next yield. Local state is preserved across yields. Falling off the end raises StopIteration automatically. A generator is an iterator: it works in for-loops, list/set/dict, sum/max/min, and as input to other generators. yield from chains generators cleanly. Compared to class-based iterators, generators typically need a fraction of the code for the same job.
Vocabulary Card
- yield
- Pauses the function, hands back the value, freezes locals. Resumes on the next next() call.
- generator function
- Any function containing at least one yield. Calling it returns a generator object.
- generator object
- The actual iterator. Defines __iter__ and __next__.
- yield from
- Delegate to another iterable — yield every value from it.
- lazy evaluation
- Values produced only when asked for. The whole point of generators.
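Lazy evaluation is easiest to see with a generator that never ends. In the sketch below, naturals is a hypothetical endless generator, and itertools.islice from the standard library takes only the first few values from it.

```python
from itertools import islice


def naturals():
    """Endless generator: 1, 2, 3, ..."""
    n = 1
    while True:
        yield n
        n += 1


first_five = list(islice(naturals(), 5))   # take only what is needed
print(first_five)  # [1, 2, 3, 4, 5]
```

Calling list(naturals()) directly would never finish; the program stays safe only because values are produced on request.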
Homework
4 min
Write four small generators:
- evens_up_to(n): yields 2, 4, 6, ..., up to n.
- letters_of(text): yields each alphabetic character, lower-cased.
- chunk(seq, size): yields chunks of length size from seq. The last chunk may be shorter.
- cycle(seq, times): yields every item of seq, for times total cycles.
Test each.
Sample · gen_kit.py
    def evens_up_to(n):
        i = 2
        while i <= n:
            yield i
            i += 2

    def letters_of(text):
        for ch in text:
            if ch.isalpha():
                yield ch.lower()

    def chunk(seq, size):
        buf = []
        for item in seq:
            buf.append(item)
            if len(buf) == size:
                yield buf
                buf = []
        if buf:
            yield buf

    def cycle(seq, times):
        for _ in range(times):
            for item in seq:
                yield item

    print(list(evens_up_to(10)))                # [2, 4, 6, 8, 10]
    print(list(letters_of("Hello, World!")))    # ['h', 'e', 'l', 'l', 'o', 'w', 'o', 'r', 'l', 'd']
    print(list(chunk([1, 2, 3, 4, 5, 6, 7], 3)))  # [[1, 2, 3], [4, 5, 6], [7]]
    print(list(cycle(["a", "b"], 3)))           # ['a', 'b', 'a', 'b', 'a', 'b']
Non-negotiables: four generator functions, each yielding one value at a time. chunk shows that you can yield lists too — the yield gives back whatever you say.
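One edge case worth testing: cycle re-iterates seq on every pass, which works for lists but not for one-shot iterators such as other generators. The variant cycle_safe below is a hypothetical alternative, not part of the sample; it copies the items once so they can be replayed.

```python
def cycle(seq, times):
    for _ in range(times):
        for item in seq:
            yield item


def cycle_safe(seq, times):
    """Copy the items up front so any iterable can be replayed."""
    items = list(seq)
    for _ in range(times):
        yield from items


from_list = list(cycle(["a", "b"], 2))             # lists can be re-iterated
from_iterator = list(cycle(iter(["a", "b"]), 2))   # an iterator runs out
from_safe = list(cycle_safe(iter(["a", "b"]), 2))

print(from_list)      # ['a', 'b', 'a', 'b']
print(from_iterator)  # ['a', 'b']
print(from_safe)      # ['a', 'b', 'a', 'b']
```

The trade-off is memory: cycle_safe holds every item at once, so it gives up some of the streaming benefit in exchange for predictable behaviour.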