Learning Goals
3 min
- Read deeply nested JSON: dict-of-lists-of-dicts.
- Walk a JSON tree to extract specific fields.
- Use indent, sort_keys and ensure_ascii to control output.
- Update one nested value and write the file back.
Warm-Up · Read This Tree
5 min
This is the shape every dataset eventually takes:
{
  "school": "SMK Sungai Long",
  "year": 2026,
  "classes": [
    {
      "name": "Form 1A",
      "students": [
        {"id": 1, "name": "Aisyah", "scores": [88, 92, 79]},
        {"id": 2, "name": "Wei Jie", "scores": [70, 75, 81]}
      ]
    },
    {
      "name": "Form 1B",
      "students": [
        {"id": 3, "name": "Suresh", "scores": [95, 90, 88]}
      ]
    }
  ]
}
Predict: how would you reach Wei Jie's first score (70)?
Show the answer
data["classes"][0]["students"][1]["scores"][0]
Six steps: dict key → list index → dict key → list index → dict key → list index. Reading nested JSON is just chaining those.
Nested JSON is read by chaining [key] and [index] until you reach the value. To process every item, nest a for loop inside another for loop. That's the whole skill.
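To make the six steps concrete, the sketch below peels the tree one level per line and then compares the result with the chained expression. It assumes a trimmed copy of the warm-up data parsed from a string, so it runs without any file; the intermediate variable names are illustrative only.

```python
import json

# A trimmed copy of the warm-up tree, parsed from a string so the sketch runs on its own.
text = """
{
  "school": "SMK Sungai Long",
  "classes": [
    {"name": "Form 1A", "students": [
      {"id": 1, "name": "Aisyah", "scores": [88, 92, 79]},
      {"id": 2, "name": "Wei Jie", "scores": [70, 75, 81]}
    ]}
  ]
}
"""
data = json.loads(text)

# Peel one level per line; each step is either a dict key or a list index.
classes = data["classes"]        # dict key   -> list of classes
form_1a = classes[0]             # list index -> dict for Form 1A
students = form_1a["students"]   # dict key   -> list of students
wei_jie = students[1]            # list index -> dict for Wei Jie
scores = wei_jie["scores"]       # dict key   -> list of scores
first_score = scores[0]          # list index -> the value itself

# The chained form reaches the same value in one expression.
chained = data["classes"][0]["students"][1]["scores"][0]
print(first_score, chained)  # 70 70
```

Splitting a chain like this is also a handy way to debug a KeyError or IndexError: the failing line shows which level was wrong.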
New Concept · Walking the Tree
14 min
Save the data file
Create school.json with the JSON from the warm-up. Then read it:
import json

with open("school.json") as f:
    school = json.load(f)
Walk it with nested for loops
for cls in school["classes"]:
    print(f"\nClass: {cls['name']}")
    for student in cls["students"]:
        avg = sum(student["scores"]) / len(student["scores"])
        print(f"  {student['name']:<10} avg = {avg:.1f}")

Class: Form 1A
  Aisyah     avg = 86.3
  Wei Jie    avg = 75.3

Class: Form 1B
  Suresh     avg = 91.0
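Nested for loops need one loop per level, which works when the shape is known in advance. When the depth is not known, one possible alternative (not this lesson's method) is a recursive walk. The sketch below assumes a hypothetical helper named find_values; it is not part of the json module, and the warm-up data is pasted inline so the example runs without school.json.

```python
def find_values(node, key):
    """Yield every value stored under `key`, at any depth of a parsed JSON tree."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            # Keep descending: the value may itself hold more dicts or lists.
            yield from find_values(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_values(item, key)
    # Strings, numbers, booleans and None are leaves: nothing to descend into.


# Warm-up data pasted inline (same content as school.json).
school = {
    "school": "SMK Sungai Long",
    "year": 2026,
    "classes": [
        {"name": "Form 1A", "students": [
            {"id": 1, "name": "Aisyah", "scores": [88, 92, 79]},
            {"id": 2, "name": "Wei Jie", "scores": [70, 75, 81]},
        ]},
        {"name": "Form 1B", "students": [
            {"id": 3, "name": "Suresh", "scores": [95, 90, 88]},
        ]},
    ],
}

names = list(find_values(school, "name"))
print(names)
# ['Form 1A', 'Aisyah', 'Wei Jie', 'Form 1B', 'Suresh']
```

Note that the walk collects class names and student names alike, because both use the key "name". The nested-loop version stays the better fit when you need to tell the levels apart.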
Pretty-print kwargs
json.dump and json.dumps accept several formatting kwargs. These three cover most needs:
indent=2 → readable indentation
sort_keys=True → keys in alphabetical order
ensure_ascii=False → keep Unicode (中, é, 🚀) as-is
Compare:
import json

d = {"name": "Café", "type": "Kopitiam", "rating": 4.5}

print(json.dumps(d))
print("---")
print(json.dumps(d, indent=2))
print("---")
print(json.dumps(d, indent=2, sort_keys=True, ensure_ascii=False))

{"name": "Caf\u00e9", "type": "Kopitiam", "rating": 4.5}
---
{
  "name": "Caf\u00e9",
  "type": "Kopitiam",
  "rating": 4.5
}
---
{
  "name": "Café",
  "rating": 4.5,
  "type": "Kopitiam"
}

The default escapes non-ASCII characters to \uXXXX. Set ensure_ascii=False when you want real characters in the file.
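Once real non-ASCII characters go into the file, the file's text encoding starts to matter, because open() otherwise falls back to a platform-dependent default. The sketch below is one way to keep the round trip predictable by passing encoding="utf-8" on both the write and the read. The file name menu.json and the temporary folder are assumptions made so the example does not touch real files.

```python
import json
import os
import tempfile

menu = {"name": "Café", "type": "Kopitiam"}

# Write into a throwaway folder so the sketch does not touch real files.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "menu.json")  # hypothetical file name

    # ensure_ascii=False writes the real "é", so state the file encoding explicitly.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(menu, f, indent=2, ensure_ascii=False)

    # Read the raw text back with the same encoding.
    with open(path, encoding="utf-8") as f:
        raw_text = f.read()

    restored = json.loads(raw_text)

print("é" in raw_text)   # True: the character is stored as-is, not as an escape
print(restored == menu)  # True: the data survives the round trip
```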
Modify and write back
Reading + editing + saving is a classic three-step pattern:
import json

# 1. Read
with open("school.json") as f:
    school = json.load(f)

# 2. Edit: add a new score to Aisyah
school["classes"][0]["students"][0]["scores"].append(95)

# 3. Save
with open("school.json", "w") as f:
    json.dump(school, f, indent=2, ensure_ascii=False)
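Opening a file with "w" empties it before anything is written, so an error during json.dump can leave a half-written file behind. One common safeguard, offered here as a sketch rather than as part of this lesson's pattern, is to write to a temporary file in the same folder and then rename it over the original. The helper name save_json_safely is hypothetical, and the demo uses a trimmed copy of the school data inside a throwaway folder.

```python
import json
import os
import tempfile


def save_json_safely(data, path):
    """Write `data` to a temp file beside `path`, then rename it into place."""
    folder = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=folder, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f, indent=2, ensure_ascii=False)
        os.replace(tmp_path, path)  # swap the finished file in with a single rename
    except Exception:
        os.remove(tmp_path)  # clean up the partial temp file, keep the old data
        raise


# Demo in a throwaway folder, using a trimmed copy of the school data.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "school.json")
    school = {"classes": [{"name": "Form 1A", "students": [
        {"id": 1, "name": "Aisyah", "scores": [88, 92, 79]},
    ]}]}
    save_json_safely(school, path)

    # Same read-edit-save cycle as above, with the safer save step.
    school["classes"][0]["students"][0]["scores"].append(95)
    save_json_safely(school, path)

    with open(path, encoding="utf-8") as f:
        reloaded = json.load(f)

print(reloaded["classes"][0]["students"][0]["scores"])  # [88, 92, 79, 95]
```

For a classroom exercise the plain open(..., "w") version is fine. The extra step matters more when the file holds data you cannot easily rebuild.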
Worked Example · Class Report
12 min
Build a one-page printable report from the school JSON.
# class_report.py: pretty report from nested JSON
import json

with open("school.json") as f:
    school = json.load(f)

print(f"📘 {school['school']} - {school['year']}")
print("=" * 40)

for cls in school["classes"]:
    print(f"\n🏫 {cls['name']} ({len(cls['students'])} students)")
    print("-" * 40)
    for student in cls["students"]:
        scores = student["scores"]
        avg = sum(scores) / len(scores)
        best = max(scores)
        print(f"  #{student['id']:<2} {student['name']:<10} "
              f"avg={avg:5.1f} best={best}")

# Save a summary section back into the JSON
school["summary"] = {
    "total_students": sum(len(c["students"]) for c in school["classes"]),
    "class_count": len(school["classes"]),
}
with open("school.json", "w") as f:
    json.dump(school, f, indent=2, ensure_ascii=False)

print("\n📝 summary block written.")
Sample output
📘 SMK Sungai Long - 2026
========================================

🏫 Form 1A (2 students)
----------------------------------------
  #1  Aisyah     avg= 86.3 best=92
  #2  Wei Jie    avg= 75.3 best=81

🏫 Form 1B (1 students)
----------------------------------------
  #3  Suresh     avg= 91.0 best=95

📝 summary block written.
Read the diff
Two patterns to notice. The nested loop walks classes-then-students. The aggregating comprehension sum(len(c["students"]) for c in school["classes"]) totals across the whole tree in one expression. That's how data work feels in Python.
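The same aggregating pattern extends to other whole-tree figures. The sketch below is a minimal illustration with the warm-up data pasted inline so it runs without school.json; the variable names all_scores, highest and school_avg are illustrative and do not appear in class_report.py.

```python
# Warm-up data pasted inline (same content as school.json).
school = {
    "school": "SMK Sungai Long",
    "year": 2026,
    "classes": [
        {"name": "Form 1A", "students": [
            {"id": 1, "name": "Aisyah", "scores": [88, 92, 79]},
            {"id": 2, "name": "Wei Jie", "scores": [70, 75, 81]},
        ]},
        {"name": "Form 1B", "students": [
            {"id": 3, "name": "Suresh", "scores": [95, 90, 88]},
        ]},
    ],
}

# One comprehension with three "for" clauses replaces three nested loops.
all_scores = [
    n
    for c in school["classes"]
    for s in c["students"]
    for n in s["scores"]
]

total_students = sum(len(c["students"]) for c in school["classes"])
highest = max(all_scores)
school_avg = sum(all_scores) / len(all_scores)

print(total_students)       # 3
print(highest)              # 95
print(f"{school_avg:.1f}")  # 84.2
```

The for clauses read in the same order as the nested loops would be written: classes first, then students, then scores.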
Try It Yourself
13 min
Loop the school tree and print just the names, one per line, no formatting.
Hint
for cls in school["classes"]:
    for s in cls["students"]:
        print(s["name"])
Compute each class's average score across all students, then print the winning class name.
Hint
best_class, best_avg = None, -1
for cls in school["classes"]:
    all_scores = [n for s in cls["students"] for n in s["scores"]]
    avg = sum(all_scores) / len(all_scores)
    if avg > best_avg:
        best_class, best_avg = cls["name"], avg
print(f"{best_class} → {best_avg:.1f}")
Append a new student to Form 1B with id 4 and scores [60, 65, 70]. Save the file with pretty output, then reload and confirm the new student appears.
Hint
new = {"id": 4, "name": "Devi", "scores": [60, 65, 70]}
school["classes"][1]["students"].append(new)
with open("school.json", "w") as f:
    json.dump(school, f, indent=2, ensure_ascii=False)
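The hint stops at the save. The task also asks you to reload and confirm, which could look like the sketch below. It assumes the warm-up data pasted inline and a temporary folder standing in for the real school.json, so it can run anywhere without changing your file.

```python
import json
import os
import tempfile

# Warm-up data pasted inline so the sketch runs without an existing school.json.
school = {
    "school": "SMK Sungai Long",
    "year": 2026,
    "classes": [
        {"name": "Form 1A", "students": [
            {"id": 1, "name": "Aisyah", "scores": [88, 92, 79]},
            {"id": 2, "name": "Wei Jie", "scores": [70, 75, 81]},
        ]},
        {"name": "Form 1B", "students": [
            {"id": 3, "name": "Suresh", "scores": [95, 90, 88]},
        ]},
    ],
}

new = {"id": 4, "name": "Devi", "scores": [60, 65, 70]}
school["classes"][1]["students"].append(new)

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "school.json")  # stand-in for the real file
    with open(path, "w", encoding="utf-8") as f:
        json.dump(school, f, indent=2, ensure_ascii=False)

    # Reload from disk: this checks what was saved, not what is held in memory.
    with open(path, encoding="utf-8") as f:
        reloaded = json.load(f)

form_1b = reloaded["classes"][1]
ids = [s["id"] for s in form_1b["students"]]
print(form_1b["name"], ids)  # Form 1B [3, 4]
```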
Mini-Challenge · Top 3 Students School-Wide
8 min
Build top3.py. Flatten the school JSON into a single list of students, attach each student's class name and average, then print the top 3 students by average across the whole school.
Show one possible solution
# top3.py: flatten + sort + slice
import json

with open("school.json") as f:
    school = json.load(f)

flat = []
for cls in school["classes"]:
    for s in cls["students"]:
        flat.append({
            "name": s["name"],
            "class": cls["name"],
            "avg": sum(s["scores"]) / len(s["scores"]),
        })

top = sorted(flat, key=lambda x: x["avg"], reverse=True)[:3]

print("🏆 Top 3 students")
for i, s in enumerate(top, 1):
    print(f"  {i}. {s['name']:<10} ({s['class']}) avg={s['avg']:.1f}")
Non-negotiables: flatten across classes, attach the class name to each student, sort descending, slice to 3.
Recap
3 min
Nested JSON is read by chaining [key] and [index] until you reach the value. To process the whole tree, nest for loops one per level. The formatting kwargs (indent, sort_keys, ensure_ascii) control how the output looks on disk. Read + modify + write is the standard pattern.
Vocabulary Card
- nested data
- Data structures inside data structures — lists of dicts, dicts of lists, and so on.
- indent
- Number of spaces of indentation per nesting level. indent=2 is a common choice; the default, None, writes everything on one line.
- sort_keys
- When True, keys are written alphabetically. Useful for diffs.
- ensure_ascii
- When True (the default), non-ASCII chars are escaped to \uXXXX. Set False to keep them real.
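Why sort_keys helps with diffs: two dicts holding the same data can still produce different text if their keys were inserted in a different order. The short sketch below reuses the Kopitiam example with the keys shuffled; the names a and b are illustrative.

```python
import json

# Same data, keys inserted in a different order.
a = {"name": "Café", "type": "Kopitiam", "rating": 4.5}
b = {"rating": 4.5, "name": "Café", "type": "Kopitiam"}

# Without sorting, the output follows insertion order, so the text differs.
print(json.dumps(a) == json.dumps(b))  # False

# With sort_keys=True, equal data always produces identical text.
print(json.dumps(a, sort_keys=True) == json.dumps(b, sort_keys=True))  # True
```

With sorted keys, a line-by-line comparison of two saved files shows only real changes to the data, not reshuffled keys.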
Homework
4 min
Build library.json by hand. It should have a top-level library name, a list of shelves, and each shelf has a topic and a list of books (each book: title, author, year). Add at least 2 shelves and 3 books per shelf.
Then write library_view.py that:
- Prints the library name and total book count.
- For each shelf, prints the topic and every book.
- Adds a new book to one shelf, saves the JSON back with indent=2.
Sample · library_view.py
# library_view.py: walk and update a nested JSON library
import json

with open("library.json") as f:
    lib = json.load(f)

total = sum(len(s["books"]) for s in lib["shelves"])
print(f"📚 {lib['library']} ({total} books across {len(lib['shelves'])} shelves)\n")

for shelf in lib["shelves"]:
    print(f"  Shelf: {shelf['topic']}")
    for b in shelf["books"]:
        print(f"    - {b['title']} · {b['author']} ({b['year']})")

# Add a new book to the first shelf
lib["shelves"][0]["books"].append({
    "title": "Tide Pools of Penang",
    "author": "Z. Lim",
    "year": 2025,
})
with open("library.json", "w") as f:
    json.dump(lib, f, indent=2, ensure_ascii=False)

print(f"\n✏️ Added one book. New total: "
      f"{sum(len(s['books']) for s in lib['shelves'])}")
Non-negotiables: total computed across all shelves, every shelf and book printed, JSON saved back with indent=2.
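A hand-built JSON file is easy to get slightly wrong, so it can help to check its shape before running library_view.py. The sketch below is one possible checker, not a required part of the homework. The helper name check_library is hypothetical, the key names (library, shelves, topic, books, title, author, year) are taken from the sample script, and the draft data holds placeholder values that are incomplete on purpose.

```python
def check_library(lib):
    """Return a list of problems found in a parsed library dict (empty if none)."""
    problems = []
    if "library" not in lib:
        problems.append("missing top-level 'library' name")

    shelves = lib.get("shelves", [])
    if len(shelves) < 2:
        problems.append("need at least 2 shelves")

    for i, shelf in enumerate(shelves):
        if "topic" not in shelf:
            problems.append(f"shelf {i}: missing 'topic'")
        books = shelf.get("books", [])
        if len(books) < 3:
            problems.append(f"shelf {i}: need at least 3 books")
        for j, book in enumerate(books):
            for field in ("title", "author", "year"):
                if field not in book:
                    problems.append(f"shelf {i}, book {j}: missing '{field}'")
    return problems


# Placeholder draft, deliberately incomplete to show what the checker reports.
draft = {
    "library": "Placeholder Library",
    "shelves": [
        {"topic": "Placeholder topic", "books": [
            {"title": "Book A", "author": "Author A"},  # 'year' left out on purpose
        ]},
    ],
}

found = check_library(draft)
for problem in found:
    print(problem)
# need at least 2 shelves
# shelf 0: need at least 3 books
# shelf 0, book 0: missing 'year'
```

To check a real file, load it with json.load first and pass the resulting dict to the function.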