PY-L3-23 · Filtering Inside a Comprehension

Learning Goals

3 min

By the end of this lesson you can:

Use a filter if at the end to drop unwanted items.
Use a conditional expression at the front to transform items.
Combine both in one comprehension.
Spot when the comprehension is doing too much and refactor to a loop.

Warm-Up · Two ifs, Two Roles

5 min

scores = [82, 47, 95, 60, 38]

# Filter — KEEP only passing scores. Drops failures.
passing = [s for s in scores if s >= 50]
# → [82, 95, 60]

# Conditional value — TRANSFORM each score to a label. Keeps all five.
labels = ["pass" if s >= 50 else "fail" for s in scores]
# → ['pass', 'fail', 'pass', 'pass', 'fail']

Both use the keyword if — but the role and position differ wildly. The filter drops. The conditional expression keeps everyone and transforms.

Today's big idea

Position matters. if at the end = filter. x if cond else y at the front = conditional expression. Different jobs.
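One way to feel the difference is to swap the two forms and watch Python refuse them. The sketch below uses ast.parse from the standard library to check whether a snippet is valid syntax; the helper name parses is made up for this illustration and is not part of the lesson files.

```python
import ast

def parses(source):
    """Return True if source is valid Python syntax (hypothetical helper)."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False

# Filter at the end, no else: valid.
print(parses("[s for s in scores if s >= 50]"))                    # True

# Conditional expression at the front, with else: valid.
print(parses("['pass' if s >= 50 else 'fail' for s in scores]"))   # True

# A filter at the end cannot take an else.
print(parses("[s for s in scores if s >= 50 else 0]"))             # False

# A conditional expression at the front must have an else.
print(parses("['pass' if s >= 50 for s in scores]"))               # False
```

If you see a SyntaxError in a comprehension, checking which of the two forms you meant is usually the fastest fix.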

New Concept · The Filter Toolkit

14 min

The shape

[ EXPR  for VAR in SEQ  if FILTER ]
  └┬┘  └────┬─────────┘  └───┬──┘
   value     loop           filter
   (always)  (always)       (optional)
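The shape maps directly onto a plain for-loop. As a minimal sketch (the doubling is just a made-up value expression), the three parts line up like this:

```python
scores = [82, 47, 95, 60, 38]

# Comprehension: value, loop, filter.
doubled_passing = [s * 2 for s in scores if s >= 50]

# The same thing written out as a loop.
result = []
for s in scores:              # loop
    if s >= 50:               # filter
        result.append(s * 2)  # value

print(doubled_passing)            # [164, 190, 120]
print(result == doubled_passing)  # True
```

The value sits at the front of the comprehension but runs last, only for items that pass the filter.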

Multiple filters — chain ifs

nums = list(range(1, 21))

# Multiples of 3 that are also even
result = [n for n in nums if n % 3 == 0 if n % 2 == 0]
# → [6, 12, 18]

# Equivalent and (often) clearer
result = [n for n in nums if n % 3 == 0 and n % 2 == 0]
# → [6, 12, 18]

Two ifs in a row mean "all conditions must hold", exactly like joining them with and in a single filter. Pick whichever reads better.
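Both spellings also stop at the first condition that fails, so an earlier condition can guard a later one. A small sketch with made-up data, where the first check protects the second from dividing by zero:

```python
values = [4, 0, 3, 10, 0, 7]

# Keep the non-zero values that divide 100 evenly.
chained = [v for v in values if v != 0 if 100 % v == 0]
with_and = [v for v in values if v != 0 and 100 % v == 0]

print(chained)              # [4, 10]
print(chained == with_and)  # True

# Put the guard second and the zero reaches the modulo first.
try:
    [v for v in values if 100 % v == 0 if v != 0]
except ZeroDivisionError:
    print("order matters: the guard must come first")
```

So the order of the conditions is part of the logic, not just a matter of style.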

Filter on a dictionary field

students = [
    {"name": "Aisyah",  "score": 92, "year": 13},
    {"name": "Wei Jie", "score": 47, "year": 12},
    {"name": "Priya",   "score": 95, "year": 13},
    {"name": "Iman",    "score": 60, "year": 12},
]

# Year-13 students with passing scores
top_year = [s["name"] for s in students if s["year"] == 13 if s["score"] >= 50]
# → ['Aisyah', 'Priya']

# Same thing with 'and'
top_year = [s["name"] for s in students if s["year"] == 13 and s["score"] >= 50]

Conditional expression — transform without dropping

# Add 10 bonus marks to anyone below 50; leave the rest as-is
boosted = [s + 10 if s < 50 else s for s in scores]
# Same length as scores — every score appears, some boosted.

# Compare to filter — that drops failures entirely
keep_passing = [s for s in scores if s >= 50]
# Shorter than original — failures gone.

Both at once — transform AND filter

# Take only the passing scores, then label them. Two operations, one comp.
labels = ["A" if s >= 90 else "B" if s >= 75 else "C"
          for s in scores
          if s >= 50]
# scores: [82, 47, 95, 60, 38] → only [82, 95, 60] survive the filter
# then: ['B', 'A', 'C']

Front: nested conditional expression (A/B/C). End: filter (drop failures). Reads left-to-right as "for each surviving score, give me a grade label".

The mental order

Python evaluates a comprehension in the order:
  1. Loop variables and filters are applied left to right.
  2. The expression at the front fires for surviving items.
  3. The container collects the results.

So [transform if cond1 else other for x in seq if cond2]:
  for each x:
    if cond2 fails → skip
    else compute (transform if cond1 else other), include it
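The same order can be checked by writing the A/B/C example as an explicit loop. This is only a sketch for comparison; the names labels_comp and labels_loop are made up here.

```python
scores = [82, 47, 95, 60, 38]

labels_comp = ["A" if s >= 90 else "B" if s >= 75 else "C"
               for s in scores
               if s >= 50]

labels_loop = []
for s in scores:
    if s < 50:       # the filter at the end runs first: failures are skipped
        continue
    if s >= 90:      # the expression at the front runs only for survivors
        label = "A"
    elif s >= 75:
        label = "B"
    else:
        label = "C"
    labels_loop.append(label)

print(labels_loop)                 # ['B', 'A', 'C']
print(labels_loop == labels_comp)  # True
```

Because the filter runs first, the final else branch ("C") can only ever see scores of 50 or more.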

When to switch to a loop

If your filter has three different conditions and the value has nested ternaries, switch to a for-loop or a helper function. The comprehension stops earning its keep around the second nested if/else:

# ❌ Too dense to read
labels = ["A+" if s >= 95 else "A" if s >= 90 else "B" if s >= 75 else "C" if s >= 50 else "F"
          for s in scores]

# ✅ Clearer
def grade(s):
    if s >= 95: return "A+"
    if s >= 90: return "A"
    if s >= 75: return "B"
    if s >= 50: return "C"
    return "F"

labels = [grade(s) for s in scores]

Extract the multi-branch logic to a function, call it from the comprehension. Best of both.
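A side benefit of the helper is that the boundaries can be checked on their own, before the comprehension ever runs. A minimal sketch using plain assert statements (the specific test values are chosen here for illustration):

```python
def grade(s):
    if s >= 95: return "A+"
    if s >= 90: return "A"
    if s >= 75: return "B"
    if s >= 50: return "C"
    return "F"

# Check each boundary and the value just below it.
assert grade(95) == "A+"
assert grade(94) == "A"
assert grade(90) == "A"
assert grade(89) == "B"
assert grade(75) == "B"
assert grade(74) == "C"
assert grade(50) == "C"
assert grade(49) == "F"

scores = [82, 47, 95, 60, 38]
labels = [grade(s) for s in scores]
print(labels)   # ['B', 'F', 'A+', 'C', 'F']
```

Doing the same checks on the five-branch one-liner would mean building a list every time just to test one value.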

Worked Example · The Sales Report

12 min

Save as sales_filter.py:

# sales_filter.py — comprehensions doing real work

sales = [
    {"item": "nasi lemak",  "qty": 12, "price":  8.0, "kind": "food"},
    {"item": "teh tarik",   "qty":  8, "price":  3.5, "kind": "drink"},
    {"item": "roti canai",  "qty": 15, "price":  3.5, "kind": "food"},
    {"item": "milo",        "qty":  5, "price":  4.5, "kind": "drink"},
    {"item": "cendol",      "qty":  3, "price":  5.0, "kind": "drink"},
    {"item": "char kway teow", "qty": 9, "price":  9.5, "kind": "food"},
    {"item": "kopi",        "qty":  2, "price":  3.0, "kind": "drink"},
]

# 1 — items selling more than 5 units
hot = [s["item"] for s in sales if s["qty"] > 5]
print(f"Hot items: {hot}")

# 2 — total revenue from food only
food_revenue = sum(s["qty"] * s["price"] for s in sales if s["kind"] == "food")
print(f"Food revenue: RM {food_revenue:.2f}")

# 3 — Apply 10% discount to slow-moving items (< 5 sold); keep prices for the rest
adjusted = [
    {"item": s["item"], "price": s["price"] * 0.9 if s["qty"] < 5 else s["price"]}
    for s in sales
]
for a in adjusted:
    print(f"  {a['item']:<14} RM {a['price']:.2f}")

# 4 — Top-3 items by revenue (transform, sort, slice)
top3 = sorted(
    [{"item": s["item"], "revenue": s["qty"] * s["price"]} for s in sales],
    key=lambda r: r["revenue"],
    reverse=True,
)[:3]
print("\nTop 3 by revenue:")
for r in top3:
    print(f"  {r['item']:<14} RM {r['revenue']:.2f}")

Output

Hot items: ['nasi lemak', 'teh tarik', 'roti canai', 'char kway teow']
Food revenue: RM 234.00
  nasi lemak     RM 8.00
  teh tarik      RM 3.50
  roti canai     RM 3.50
  milo           RM 4.50
  cendol         RM 4.50          # discounted (qty 3 < 5)
  char kway teow RM 9.50
  kopi           RM 2.70          # discounted (qty 2 < 5)

Top 3 by revenue:
  nasi lemak     RM 96.00
  char kway teow RM 85.50
  ...

Read the diff

Each report uses a different comprehension shape. (1) filter only. (2) generator with filter, fed to sum. (3) transform with conditional value — same length out as in. (4) build, then sort, then slice — chained. All compose cleanly because each operation has a focused role.
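Shape (2) deserves a closer look: with no square brackets, the comprehension becomes a generator expression, and the consuming function reads items one at a time without building a list first. A small sketch on a shortened copy of the sales data (three rows only, to keep the arithmetic easy to check):

```python
sales = [
    {"item": "nasi lemak", "qty": 12, "price": 8.0, "kind": "food"},
    {"item": "teh tarik",  "qty":  8, "price": 3.5, "kind": "drink"},
    {"item": "kopi",       "qty":  2, "price": 3.0, "kind": "drink"},
]

# sum() consumes the generator directly; no intermediate list is built.
food_total = sum(s["qty"] * s["price"] for s in sales if s["kind"] == "food")
print(food_total)   # 96.0

# any() and all() accept the same shape.
has_slow_item = any(s["qty"] < 5 for s in sales)
all_priced = all(s["price"] > 0 for s in sales)
print(has_slow_item, all_priced)   # True True
```

Use the list form when you need the results more than once; a generator is used up after a single pass.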

Try It Yourself

13 min

01 🟢 Drop the negatives

From [-3, 2, -1, 0, 5, -7, 4], build a list of just the non-negatives.

Hint

nums = [-3, 2, -1, 0, 5, -7, 4]
print([n for n in nums if n >= 0])     # [2, 0, 5, 4]

02 🟡 Convert negatives to 0

Same input, but this time convert each negative to 0 and leave every other number unchanged. Length doesn't change.

Hint

print([n if n >= 0 else 0 for n in nums])    # [0, 2, 0, 0, 5, 0, 4]

Conditional expression at the front. Length stays the same — every element processed.
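For this particular transform, the built-in max gives the same result without a conditional expression. It is shown here only as an alternative spelling, not as the expected answer to the exercise:

```python
nums = [-3, 2, -1, 0, 5, -7, 4]

with_ternary = [n if n >= 0 else 0 for n in nums]
with_max = [max(n, 0) for n in nums]   # clamp each value at a floor of 0

print(with_max)                  # [0, 2, 0, 0, 5, 0, 4]
print(with_max == with_ternary)  # True
```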

03 🔴 Both at once (stretch)

From a list of words, keep only those longer than 3 chars, and shout the result if it's longer than 6 chars.

Hint

words = ["hi", "python", "is", "fantastic", "fun", "even"]
result = [w.upper() if len(w) > 6 else w
          for w in words
          if len(w) > 3]
print(result)
# → ['python', 'FANTASTIC', 'even']

The filter at the end drops short words. The conditional expression at the front shouts only the long survivors. Two operations, one comprehension, and still readable.

Mini-Challenge · Grading Engine

8 min

Build grading.py. Given students = [{"name": ..., "score": ...}, ...]:

Use sorted() with a key and a slice to build top_3, the three highest-scoring students (no filter is needed for this one).
Use a comprehension to build letter_grades — list of (name, grade) tuples. Grade: A ≥ 90, B ≥ 75, C ≥ 50, F otherwise. Use a helper function called from the comp.
Use a dict comprehension to build by_grade — grade letter → list of names with that grade.
Use a comprehension to build boosted — same students but everyone gets a 5-point bonus, capped at 100.

Show one possible solution

# grading.py — filter + transform pipelines

students = [
    {"name": "Aisyah",  "score": 92},
    {"name": "Wei Jie", "score": 47},
    {"name": "Priya",   "score": 95},
    {"name": "Iman",    "score": 60},
    {"name": "Aizat",   "score": 82},
    {"name": "Hafiz",   "score": 88},
]

# 1 — top 3 (sort desc, slice)
top_3 = sorted(students, key=lambda s: -s["score"])[:3]

# 2 — letter grades via helper
def grade_of(s):
    if s >= 90: return "A"
    if s >= 75: return "B"
    if s >= 50: return "C"
    return "F"

letter_grades = [(s["name"], grade_of(s["score"])) for s in students]

# 3 — by_grade (nested comp)
grades_seen = {g for _, g in letter_grades}
by_grade = {
    g: [name for name, lg in letter_grades if lg == g]
    for g in grades_seen
}

# 4 — boosted with cap
boosted = [
    {"name": s["name"], "score": min(100, s["score"] + 5)}
    for s in students
]

print("Top 3       :", [s["name"] for s in top_3])
print("Grades      :", letter_grades)
print("By grade    :", by_grade)
print("Boosted     :", boosted)

Non-negotiables: at least one filter, at least one conditional value, a helper function for multi-branch logic, and a nested dict comprehension that groups results.
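The nested dict comprehension in the solution rescans letter_grades once per grade, and because grades_seen is a set, the order of the keys is not guaranteed. When grouping gets bigger, a plain loop is a reasonable refactor. A sketch of that alternative, with letter_grades written out by hand from the sample students:

```python
letter_grades = [
    ("Aisyah", "A"), ("Wei Jie", "F"), ("Priya", "A"),
    ("Iman", "C"), ("Aizat", "B"), ("Hafiz", "B"),
]

# One pass: create the list for a grade the first time it appears.
by_grade = {}
for name, g in letter_grades:
    by_grade.setdefault(g, []).append(name)

print(by_grade)
# {'A': ['Aisyah', 'Priya'], 'F': ['Wei Jie'], 'C': ['Iman'], 'B': ['Aizat', 'Hafiz']}
```

The challenge still asks for the dict comprehension; this version is only a comparison point for the "when to switch to a loop" question.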

Recap

3 min

Two flavours of if in comprehensions. The filter at the end drops items. The conditional expression at the front transforms items without dropping. Both can appear in the same comprehension. Chain filters with multiple ifs or one and. Refactor to a helper function when the conditional value has more than one nested ternary. The rule of thumb: readable beats clever.

Vocabulary Card

filter clause: if cond at the end. Drops items.
conditional expression: x if cond else y at the front. Transforms items.
multi-filter: Two ifs in a row, or one if with and. Same effect.
helper extraction: Pull multi-branch logic out to a function. Call from the comprehension. Easier to test.

Homework

4 min

Given:

books = [
    {"title": "Charlotte's Web", "pages": 192, "year": 1952},
    {"title": "The Hobbit",       "pages": 310, "year": 1937},
    {"title": "Wings of Fire",    "pages": 304, "year": 2012},
    {"title": "Tom Gates",        "pages": 256, "year": 2011},
    {"title": "Diary of a Wimpy", "pages": 221, "year": 2007},
]

Build, each as a single comprehension:

modern_titles — titles of books published 2000 or later.
length_labels — list of (title, "short"/"medium"/"long") tuples. Short < 200, medium < 300, long ≥ 300.
two_word_titles — only titles with exactly 2 words. Use title.split() to count them.
by_decade — dict comprehension: decade (1930, 1940, …) → list of titles in that decade.

Sample · books_filter.py

modern_titles = [b["title"] for b in books if b["year"] >= 2000]

def length_of(p):
    return "short" if p < 200 else "medium" if p < 300 else "long"

length_labels = [(b["title"], length_of(b["pages"])) for b in books]

two_word_titles = [b["title"] for b in books if len(b["title"].split()) == 2]

decades = {b["year"] // 10 * 10 for b in books}
by_decade = {
    d: [b["title"] for b in books if b["year"] // 10 * 10 == d]
    for d in decades
}

print("Modern   :", modern_titles)
print("Lengths  :", length_labels)
print("Two-word :", two_word_titles)
print("By decade:", by_decade)

Non-negotiables: filter at the end for #1 and #3, helper function for the multi-branch length classifier in #2, nested comprehension for #4. year // 10 * 10 floors to the start of the decade.
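The decade arithmetic is worth seeing on its own. Integer division by 10 throws away the last digit, and multiplying by 10 puts a zero back in its place. A quick check on the five publication years from the homework data:

```python
years = [1952, 1937, 2012, 2011, 2007]

# year // 10 drops the last digit; * 10 restores the magnitude.
decade_starts = [y // 10 * 10 for y in years]
print(decade_starts)   # [1950, 1930, 2010, 2010, 2000]

# A set comprehension keeps each decade once.
decades = {y // 10 * 10 for y in years}
print(sorted(decades))   # [1930, 1950, 2000, 2010]
```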
