Learning Goals
3 min
By the end of this lesson you can:
- Read a 100-word file with one word per line into a clean Python list.
- Validate that every loaded word is exactly five letters and alphabetical only.
- Pick a random secret word with random.choice.
- Confirm randomness by sampling 1000 picks and printing the count of distinct picks.
Setup · The Word Bank
5 min
Create wordbank.txt in the same folder as today's Python file. Put one five-letter English word per line. Use lower-case letters only.
Start with this set of 30. Add more as homework. Keep them all lower-case and exactly 5 letters:
apple
brave
crisp
dance
eagle
flame
grape
honey
input
jolly
kayak
lemon
moral
night
olive
piano
quick
river
spicy
tiger
ultra
vivid
water
yacht
zesty
chess
dough
flock
mango
silly
Why keep the words in a separate file? A few reasons. (1) Files are data: you can edit the word bank without touching the game code. (2) A 1000-word list would clutter your .py file horribly. (3) It's the realistic shape: most word games load their dictionary from a file or a database. The format we use today (one word per line) is the simplest possible.
New Concept · Load, Validate, Pick
14 min
Step 1 · Load
Use the line-by-line loop from PY-L2-20. Strip whitespace, lower-case, and drop blank lines on the way in.
def load_word_bank(path):
    words = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            clean = line.strip().lower()
            if clean != "":
                words.append(clean)
    return words

bank = load_word_bank("wordbank.txt")
print(f"Loaded {len(bank)} words.")
Step 2 · Validate
Before using the bank, make sure every word is exactly five letters and alphabetical. If anything fails, complain loudly — broken word lists corrupt a game silently.
def validate(bank):
    bad = []
    for w in bank:
        if len(w) != 5 or not w.isalpha():
            bad.append(w)
    return bad

problems = validate(bank)
if len(problems) > 0:
    print("These words are invalid:")
    for w in problems:
        print(" -", repr(w))
else:
    print("All words look good.")
The string method .isalpha() returns True only when every character is a letter — no numbers, no spaces, no punctuation. Pair it with the length check and you catch most realistic data problems.
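To get a feel for the check, try it on a handful of strings. This is a minimal sketch: the helper names is_valid_word and is_valid_ascii_word and the sample strings are made up for illustration and are not part of the lesson's files. One thing worth knowing is that .isalpha() also accepts accented and non-English letters, so the second helper adds .isascii() (Python 3.7+) as an optional, stricter rule for banks that should stay a-z only.

```python
def is_valid_word(w):
    """Same rule as validate(): exactly five characters, all letters."""
    return len(w) == 5 and w.isalpha()


def is_valid_ascii_word(w):
    """Stricter variant (an assumption, not the lesson's rule): ASCII letters only."""
    return is_valid_word(w) and w.isascii()


accented = "caf\u00e9s"  # "cafes" with an accented e: five letters, but not plain a-z

samples = ["tiger", "crab", "zombie", "pyt3on", "hi-fi", "ab cd", accented]
for s in samples:
    print(repr(s), is_valid_word(s), is_valid_ascii_word(s))
```

Only "tiger" passes both checks. The accented word passes the lesson's rule but fails the stricter one, which matters if your players can only type plain English letters.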
Why repr(w) for printing? repr shows the string with its quotes, so you can see leading/trailing whitespace or weird characters that bare print hides. Useful when you're debugging a data file.
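Here is a small sketch of the difference. Both strings are invented for the demo: one is a raw file line before .strip(), the other hides a zero-width space that .strip() does not remove.

```python
# Hypothetical messy inputs; neither comes from the lesson's word bank.
raw_line = "tiger \n"        # a file line before .strip(): trailing space + newline
sneaky = "ti\u200bger"       # a zero-width space hiding inside the word

print(raw_line)              # looks like a plain "tiger"
print(repr(raw_line))        # 'tiger \n'
print(sneaky, len(sneaky))   # looks like five letters, but len() says 6
print(repr(sneaky))          # 'ti\u200bger'
print(sneaky.isalpha())      # False, so validate() would flag it
```

With bare print both values look like an ordinary "tiger". With repr the extra characters show up as escape sequences, which is exactly what you want when the validator rejects a word that looks fine.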
Step 3 · Filter out duplicates
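A word that appears twice in the bank is twice as likely to be picked as the secret, which quietly skews the game. Strip duplicates with the list(set(...)) trick from PY-L2-08, then sort so the order is stable. Because sorted(...) already returns a list, the two steps collapse into sorted(set(bank)).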
def dedupe(bank):
    return sorted(set(bank))

bank = dedupe(bank)
print(f"After dedupe: {len(bank)} words.")
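To see what dedupe does, here is a tiny run on a made-up list. The second half shows an alternative that is not used in this lesson: dict.fromkeys keeps the first occurrence of each word in its original order, which can be handy if you care about file order more than alphabetical order.

```python
raw = ["tiger", "apple", "tiger", "lemon", "apple"]  # invented sample with repeats

unique_sorted = sorted(set(raw))
print(unique_sorted)                  # ['apple', 'lemon', 'tiger']
print(len(raw) - len(unique_sorted), "duplicate entries removed")

# Alternative (not the lesson's approach): keep first-seen order instead of sorting.
unique_in_file_order = list(dict.fromkeys(raw))
print(unique_in_file_order)           # ['tiger', 'apple', 'lemon']
```

A bare set has no guaranteed order, so sorting (or using the dict.fromkeys variant) is what makes two runs on the same file produce the same list.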
Step 4 · Pick a secret
One call. Done.
import random

secret = random.choice(bank)
print(f"Today's secret has {len(secret)} letters.")
We won't print secret directly, since that would spoil tomorrow's game. Instead, the test in the worked example checks that the picker can reach every word in the bank.
Putting it together
load          → list of strings from disk
validate      → list of bad words (should be empty)
dedupe + sort → clean, ordered, unique list
random.choice → one secret word
Worked Example · The Word Bank Loader
12 min
Save as wordle_part1.py:
Code
# wordle_part1.py: load and inspect the word bank
import random

def load_word_bank(path):
    words = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            clean = line.strip().lower()
            if clean != "":
                words.append(clean)
    return words

def validate(bank):
    return [w for w in bank if len(w) != 5 or not w.isalpha()]

def dedupe(bank):
    return sorted(set(bank))

# --- main ---
bank = load_word_bank("wordbank.txt")
print(f"Raw loaded : {len(bank)} words.")

bad = validate(bank)
print(f"Invalid words : {len(bad)}", bad if bad else "")

bank = dedupe(bank)
print(f"After dedupe : {len(bank)} words.")

# Sanity-check the picker: pick 1000 times, count distinct picks
random.seed(0)
picks = [random.choice(bank) for _ in range(1000)]
print(f"In 1000 picks : {len(set(picks))} distinct words seen.")

# Lock in a secret for today's run
random.seed()  # reset to true randomness
secret = random.choice(bank)
print(f"Secret length : {len(secret)} letters.")
print(f"Secret starts : '{secret[0]}' (full word hidden)")
Sample output
Raw loaded : 30 words.
Invalid words : 0
After dedupe : 30 words.
In 1000 picks : 30 distinct words seen.
Secret length : 5 letters.
Secret starts : 't' (full word hidden)
Read the diff
Three things to spot. (1) validate uses a list comprehension to gather just the bad words — empty list means everything passed. (2) The random.seed(0)/random.seed() pair gives us a deterministic 1000-pick test, then restores real randomness for the actual game. (3) The final "starts with" line is the hint we'll use to debug Part 2 without spoiling the answer.
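If the seed pair still feels like magic, the sketch below may help. It uses a five-word stand-in bank (not your wordbank.txt) and shows two things: the same seed replays the same picks, and a Counter of 1000 picks lets you eyeball whether the picks are spread roughly evenly. Counting distinct words only tells you every word can be reached; the per-word counts say a little more.

```python
import random
from collections import Counter

bank = ["apple", "brave", "crisp", "dance", "eagle"]  # small stand-in bank

# Same seed, same sequence of picks.
random.seed(0)
first = [random.choice(bank) for _ in range(10)]
random.seed(0)
second = [random.choice(bank) for _ in range(10)]
print("Same seed, same picks?", first == second)

# How often was each word picked in 1000 tries?
random.seed(0)
counts = Counter(random.choice(bank) for _ in range(1000))
for word, n in sorted(counts.items()):
    print(f"{word}: {n}")

random.seed()  # back to an unpredictable seed for real play
```

With five words you would expect each count to land somewhere near 200. The exact numbers depend on the seed, so treat them as a rough check and not as proof.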
If validate finds problems, the program prints them but keeps going. For a hostile file you might want to raise SystemExit("Word bank corrupt") instead — but for a teaching project, telling the student is enough.
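For the stricter behaviour, one possible shape is a small wrapper around validate. The name require_valid is hypothetical (it does not appear elsewhere in this lesson), and the try/except at the bottom exists only so the demo can show the message without actually ending the program.

```python
def validate(bank):
    return [w for w in bank if len(w) != 5 or not w.isalpha()]


def require_valid(bank):
    """Hypothetical strict check: stop the program if any word is invalid."""
    bad = validate(bank)
    if bad:
        raise SystemExit(f"Word bank corrupt: {bad}")
    return bank


print(require_valid(["tiger", "yacht"]))     # passes through unchanged

try:
    require_valid(["tiger", "crab"])
except SystemExit as err:                    # caught here only for the demo
    print("Would have exited with:", err)
```

In a real script you would call require_valid(bank) without the try/except, so a broken file stops the game before a secret is ever picked.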
Try It Yourself
13 min
Print all loaded words in rows of six, each word left-aligned in a 6-character column. Use width specifiers from PY-L2-16.
Hint
for i, w in enumerate(bank):
    end = "\n" if (i + 1) % 6 == 0 else " "
    print(f"{w:<6}", end=end)
print()
The end="\n" if ... trick lets you control whether print finishes with a newline or a space. After every sixth word we wrap to a new line.
Use random.sample (PY-L2-18) to draw three different secrets in one call. Print each.
Hint
hand = random.sample(bank, 3)
print(hand)
print("All different?", len(set(hand)) == 3)
sample is the right tool when you want a hand without repeats — exactly like dealing cards in PY-L2-19.
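A short sketch of the contrast, using a three-word stand-in bank: repeated random.choice calls may return the same word twice, random.sample never does, and sample refuses to draw more items than the list holds.

```python
import random

bank = ["apple", "brave", "crisp"]  # stand-in bank, small on purpose

# choice picks independently each time, so repeats are possible.
with_repeats = [random.choice(bank) for _ in range(3)]
print("choice x3:", with_repeats)

# sample draws without replacement, so all three come out exactly once.
hand = random.sample(bank, 3)
print("sample   :", hand, "all different?", len(set(hand)) == 3)

# Asking for more than the bank holds raises ValueError.
try:
    random.sample(bank, 4)
    too_many_failed = False
except ValueError as err:
    too_many_failed = True
    print("Cannot draw 4 from 3:", err)
```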
Add three deliberately bad lines to wordbank.txt — a four-letter word, a six-letter word, and a word containing a digit. Save. Re-run the validator. Confirm all three are caught.
Hint
# wordbank.txt now contains:
#   tiger
#   crab     <- only 4 letters
#   zombie   <- 6 letters
#   pyt3on   <- has a digit
#   yacht
#   ...
bank = load_word_bank("wordbank.txt")  # reload after editing the file
bad = validate(bank)
print(bad)  # → ['crab', 'zombie', 'pyt3on']
That's the validator earning its keep. If you skip validation and play with this bank, the game will sometimes pick a word the player can never type.
Mini-Challenge · Dictionary Stats
8 min
Build wordbank_stats.py. Load and validate the bank, then print a small report. Don't print individual words; print numbers about the bank.
Your file must print:
- Total valid words.
- How many start with each letter (use a dict-of-counts).
- The single most common starting letter.
- How many words contain at least one of aeiou (which is "basically all of them"), and how many don't.
- How many distinct letters the whole bank uses (set of all characters).
Stretch goal. Print the 5 letters that appear in the most words.
Show one possible solution
# wordbank_stats.py: analyse the word bank
from collections import Counter

def load_word_bank(path):
    with open(path, encoding="utf-8") as f:
        return [line.strip().lower() for line in f if line.strip()]

def validate(bank):
    return [w for w in bank if len(w) != 5 or not w.isalpha()]

bank = load_word_bank("wordbank.txt")
bad = validate(bank)
if bad:
    print("WARNING - invalid words:", bad)
    bank = [w for w in bank if w not in bad]
print(f"Valid words : {len(bank)}")

# Starts
starts = {}
for w in bank:
    starts[w[0]] = starts.get(w[0], 0) + 1
print(f"By starting letter: {starts}")
print(f"Most common start : {max(starts, key=starts.get)}")

# Vowels
with_vowel = sum(1 for w in bank if any(ch in "aeiou" for ch in w))
print(f"With a vowel: {with_vowel}, without: {len(bank) - with_vowel}")

# Distinct letters in the whole bank
all_letters = set()
for w in bank:
    all_letters.update(w)
print(f"Distinct letters used: {len(all_letters)}", sorted(all_letters))

# Stretch: letters in most words
letter_in_count = {}
for w in bank:
    for letter in set(w):
        letter_in_count[letter] = letter_in_count.get(letter, 0) + 1
top5 = sorted(letter_in_count.items(), key=lambda kv: kv[1], reverse=True)[:5]
print(f"Top 5 letters by word-coverage: {top5}")
Non-negotiables: a count-by-starting-letter dict, a vowel check, and a distinct-letter set. The stretch piece uses a sorted-items pattern from PY-L2-04. Counter from collections is also imported here as a teaser — we'll use it properly in Level 3.
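If you are curious about the teaser, here is a rough preview of how Counter could replace the hand-built dicts. It is a sketch on a five-word stand-in bank, not the solution you are expected to write for this lesson.

```python
from collections import Counter

bank = ["apple", "brave", "crisp", "chess", "dance"]  # stand-in bank

# Count-by-starting-letter in one line.
starts = Counter(w[0] for w in bank)
print(starts.most_common(1))      # [('c', 2)]

# Word coverage: set(w) makes each letter count once per word.
coverage = Counter(letter for w in bank for letter in set(w))
print(coverage["e"])              # 4 (apple, brave, chess, dance)
```

Counter behaves like a dict whose missing keys count as zero, and most_common(n) returns the n highest counts as (key, count) pairs.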
Recap
3 min
A real word game starts with a word bank on disk. Load it with the line-by-line read from PY-L2-20, strip whitespace and lower-case as you go. Validate every entry: len(w) == 5 and w.isalpha(). Dedupe with list(set(...)) from PY-L2-08, then sort for predictability. Pick the secret with random.choice from PY-L2-18. Don't print the secret, only its length and starting letter. Tomorrow we wrap this in the actual Wordle guessing loop.
Vocabulary Card
- word bank
- A file of valid words for a word game. One per line is the simplest format.
- .isalpha()
- String method that returns True if every character is a letter.
- load → validate → pick
- The three-step shape every game with external data follows.
- repr(x)
- Show a value with its quotes and escape sequences — great for debugging mysterious strings.
Homework
4 min
Grow your word bank to 100 words in wordbank.txt. Pick from common English five-letter words: animals, foods, colours, weather, places, common verbs.
Your homework file build_bank.py must:
- Load the bank, validate it, dedupe it.
- If anything was invalid, print the list and which line numbers they came from.
- Print the size before and after dedupe — to catch any accidental duplicates you typed.
- Write the cleaned bank back to wordbank.clean.txt using PY-L2-21's file-writing pattern. (Keep the original untouched.)
Sample · build_bank.py
# build_bank.py: load, validate, dedupe, write back

def load_with_line_numbers(path):
    pairs = []
    with open(path, encoding="utf-8") as f:
        for n, line in enumerate(f, start=1):
            clean = line.strip().lower()
            if clean:
                pairs.append((n, clean))
    return pairs

pairs = load_with_line_numbers("wordbank.txt")
print(f"Lines with content: {len(pairs)}")

bad = [(n, w) for n, w in pairs if len(w) != 5 or not w.isalpha()]
if bad:
    print("\nInvalid:")
    for n, w in bad:
        print(f"  line {n}: {w!r}")

good = [w for _, w in pairs if (len(w) == 5 and w.isalpha())]
print(f"Good words : {len(good)}")

unique = sorted(set(good))
print(f"After dedupe: {len(unique)}")

with open("wordbank.clean.txt", "w", encoding="utf-8") as f:
    for w in unique:
        f.write(w + "\n")
print("Saved → wordbank.clean.txt")
Non-negotiables: line numbers reported with bad entries, dedupe with set, and writing the clean bank back to a new file. Note {w!r} — same as repr(w) but as an f-string conversion. Saves typing.
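To check the line-number logic without touching your real file, you can feed the same loop an in-memory stand-in. The sketch below uses io.StringIO with invented contents. Notice that the blank second line is skipped but still counted, so the reported numbers match what you would see in a text editor.

```python
import io

# Stand-in for wordbank.txt; line 2 is deliberately blank.
fake_file = io.StringIO("tiger\n\ncrab\npyt3on\nyacht\n")

bad = []
for n, line in enumerate(fake_file, start=1):
    clean = line.strip().lower()
    if clean and (len(clean) != 5 or not clean.isalpha()):
        bad.append((n, clean))

for n, w in bad:
    print(f"line {n}: {w!r}")     # {w!r} gives the same text as repr(w)
```

This prints line 3 for 'crab' and line 4 for 'pyt3on'. If the counter only advanced on non-blank lines, both numbers would be off by one.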