Learning Goals
3 min
By the end of this lesson you can:
- Copy files with shutil.copy2 (keeping timestamps) and trees with copytree.
- Move and rename across drives with shutil.move.
- Delete a whole folder tree with shutil.rmtree, carefully.
- Zip a directory into an archive with make_archive and unpack it with unpack_archive.
Warm-Up · The Right Tool for the Scale
5 min
You already know how to read and write one file. But how would you copy a folder containing 500 files in 30 subfolders, preserving the structure? Looping with read_bytes/write_bytes and re-creating each subfolder takes dozens of fiddly lines.
shutil ("shell utilities") is the standard-library module for high-level file operations — the things you'd do with cp -r, mv, rm -rf, and zip in a terminal. One function call replaces a whole loop, and it's battle-tested for edge cases you'd forget.
New Concept · The shutil Toolkit
14 min
Copying files
    import shutil
    from pathlib import Path

    shutil.copy("report.csv", "backup/report.csv")    # copy contents
    shutil.copy2("report.csv", "backup/report.csv")   # copy + keep timestamps
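Prefer copy2 for backups: on top of the contents it tries to preserve the modification time and other file metadata, so the backup keeps the original's timestamps. Both functions accept Path objects or strings, and the destination folder (backup/ here) must already exist.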
Copying whole trees
    shutil.copytree("project", "project-backup")

    # merge into an existing target, overwriting files with the same name (Python 3.8+):
    shutil.copytree("project", "project-backup", dirs_exist_ok=True)
copytree recreates the entire folder structure. By default it errors if the target exists; pass dirs_exist_ok=True to merge into an existing folder.
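The merge behaviour can be checked in a throwaway temporary directory, so nothing real is touched. This is a minimal sketch; the folder and file names (project, docs/notes.txt, old.txt) and the helper demo_copytree_merge are made up for illustration.

```python
import shutil
import tempfile
from pathlib import Path

def demo_copytree_merge() -> list[str]:
    """Copy a small tree twice and return the files that end up in the target."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        src = root / "project"
        (src / "docs").mkdir(parents=True)
        (src / "docs" / "notes.txt").write_text("v1")

        dst = root / "project-backup"
        shutil.copytree(src, dst)                 # first copy: target must not exist yet

        (dst / "old.txt").write_text("only in backup")
        (src / "docs" / "notes.txt").write_text("v2")

        try:
            shutil.copytree(src, dst)             # same call again: target now exists
        except FileExistsError:
            print("Second copy refused: target exists")

        shutil.copytree(src, dst, dirs_exist_ok=True)   # merge into the existing target

        # The merged copy overwrote notes.txt but left old.txt alone.
        assert (dst / "docs" / "notes.txt").read_text() == "v2"
        return sorted(p.relative_to(dst).as_posix()
                      for p in dst.rglob("*") if p.is_file())

print(demo_copytree_merge())
```

Note that merging never removes anything from the target: old.txt survives even though it no longer exists in the source.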
Moving and renaming
    shutil.move("downloads/report.csv", "archive/report.csv")
    shutil.move("old_folder", "new_location/old_folder")
shutil.move works across different drives (where Path.rename fails), and moves files or whole folders. It's the robust choice for "put this somewhere else."
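How the destination is interpreted depends on whether it already exists as a folder. The sketch below, run in a temporary directory, shows both cases; the file names and the helper demo_move are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path

def demo_move() -> tuple[str, str]:
    """Move two files and return the names they ended up with."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        downloads = root / "downloads"
        archive = root / "archive"
        downloads.mkdir()
        archive.mkdir()
        (downloads / "report.csv").write_text("a,b\n1,2\n")
        (downloads / "draft.csv").write_text("x\n")

        # Destination is a full file path: move and rename in one step.
        first = shutil.move(str(downloads / "report.csv"),
                            str(archive / "report-2026.csv"))

        # Destination is an existing folder: the file keeps its name inside it.
        second = shutil.move(str(downloads / "draft.csv"), str(archive))

        assert not any(downloads.iterdir())       # nothing left behind
        return Path(first).name, Path(second).name

print(demo_move())
```

shutil.move returns the final destination path, which is handy for logging where a file actually landed.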
Deleting trees — handle with care
    shutil.rmtree("temp_folder")   # deletes the folder AND everything inside

    # safer: refuse to run on suspicious targets
    target = Path("temp_folder")
    if target.is_dir() and target.name == "temp_folder":
        shutil.rmtree(target)
shutil.rmtree permanently deletes a tree — there's no recycle bin. Always validate the target first (does it exist? is it the folder you meant?), never build the path from unchecked user input, and consider printing what you're about to delete before doing it. This is the most dangerous function in the lesson.
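One way to guard against paths built from user input is to resolve the target and check that it sits strictly inside a folder you control. The following is a sketch under that assumption, not a complete safety net; remove_inside, base, and the dry_run flag are names invented for this example.

```python
import shutil
import tempfile
from pathlib import Path

def remove_inside(base: Path, target: Path, dry_run: bool = True) -> bool:
    """Delete target only if it is a real folder strictly inside base."""
    base = base.resolve()
    target = target.resolve()          # collapses ".." and follows symlinks
    if target == base or base not in target.parents:
        print(f"Refusing: {target} is not inside {base}")
        return False
    if not target.is_dir():
        print(f"Refusing: {target} is not a folder")
        return False
    print(f"About to delete {target}")
    if not dry_run:
        shutil.rmtree(target)
    return True

# Try it in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp) / "workspace"
    (base / "temp_folder").mkdir(parents=True)
    remove_inside(base, base / "temp_folder")                  # dry run: only prints
    remove_inside(base, base / "temp_folder" / ".." / "..")    # refused: escapes base
    remove_inside(base, base / "temp_folder", dry_run=False)   # actually deletes
```

Defaulting dry_run to True means a careless call only prints what would be removed; deleting requires an explicit opt-in.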
Archiving: zip a folder in one line
    # make_archive(base_name, format, root_dir)
    shutil.make_archive("backup-2026-05-28", "zip", "project")
    # → creates backup-2026-05-28.zip from the contents of project/

    shutil.unpack_archive("backup-2026-05-28.zip", "restored")
    # → extracts into restored/
- Formats: "zip", "tar", "gztar" (.tar.gz), "bztar", "xztar".
- Don't add the extension to base_name; make_archive adds it for you.
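The two points above can be seen in a small round trip: the same base_name yields a different file extension per format, and unpack_archive picks the format from that extension. This is an illustrative sketch in a temporary directory; demo_archive_round_trip and the file names are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path

def demo_archive_round_trip(fmt: str = "gztar") -> tuple[str, str]:
    """Archive a tiny folder, unpack it, and return (archive name, restored text)."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        project = root / "project"
        project.mkdir()
        (project / "readme.txt").write_text("hello")

        # base_name has no extension; make_archive appends the right one for fmt.
        archive = shutil.make_archive(str(root / "backup"), fmt, str(project))

        restored = root / "restored"
        # No format argument: it is inferred from the archive's extension.
        shutil.unpack_archive(archive, str(restored))

        return Path(archive).name, (restored / "readme.txt").read_text()

print(demo_archive_round_trip("zip"))
print(demo_archive_round_trip("gztar"))
```

Only unpack archives you created or trust: an archive from an unknown source can contain paths that point outside the target folder.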
Bonus: disk space and finding programs
    total, used, free = shutil.disk_usage("/")
    print(f"{free/1e9:.1f} GB free")

    print(shutil.which("python"))   # full path to a program, or None
Worked Example · A Dated Backup Tool
12 min
Goal: back up a folder into a timestamped zip, keep only the 5 newest backups, and report free disk space. This is a complete, useful tool.
    import shutil
    from pathlib import Path
    from datetime import datetime

    def backup(source: str, dest: str, keep: int = 5) -> Path:
        src = Path(source)
        if not src.is_dir():
            raise NotADirectoryError(f"{src} is not a folder")

        out = Path(dest)
        out.mkdir(parents=True, exist_ok=True)

        stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
        base = out / f"{src.name}-{stamp}"
        archive = shutil.make_archive(str(base), "zip", str(src))
        print(f"Created {archive}")

        # rotation: keep only the newest <keep> zips
        zips = sorted(out.glob(f"{src.name}-*.zip"),
                      key=lambda p: p.stat().st_mtime, reverse=True)
        for old in zips[keep:]:
            old.unlink()
            print(f"Pruned old backup: {old.name}")

        free = shutil.disk_usage(out).free / 1e9
        print(f"{free:.1f} GB free on disk")
        return Path(archive)

    backup("project", "backups", keep=5)
    Created backups/project-20260528-143012.zip
    Pruned old backup: project-20260521-090000.zip
    118.4 GB free on disk
Read the code
Four pieces combine into one tool, two from shutil and two from pathlib: shutil.make_archive zips the folder, Path.glob sorted by st_mtime puts the newest backups first so the oldest fall off the end, Path.unlink prunes them, and shutil.disk_usage reports headroom. The "keep newest N" rotation is a common pattern in backup tools, and you'll reuse it in Lessons 38-39.
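The rotation step is easier to reason about in isolation. The sketch below pulls it into a helper and exercises it on empty dummy files whose modification times are faked with os.utime; prune_old, demo_rotation, and the file names are invented for this illustration.

```python
import os
import tempfile
from pathlib import Path

def prune_old(folder: Path, pattern: str, keep: int) -> list[str]:
    """Delete all but the `keep` newest files matching pattern; return removed names."""
    newest_first = sorted(folder.glob(pattern),
                          key=lambda p: p.stat().st_mtime,
                          reverse=True)
    removed = []
    for old in newest_first[keep:]:    # everything after the first `keep` entries
        old.unlink()
        removed.append(old.name)
    return removed

def demo_rotation() -> tuple[list[str], list[str]]:
    """Create seven fake backups, keep five, and return (removed, kept)."""
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp)
        start = 1_700_000_000          # arbitrary epoch second used as "day 0"
        for day in range(1, 8):
            f = folder / f"project-2026010{day}.zip"
            f.write_bytes(b"")
            # Fake distinct modification times, one day apart, oldest first.
            stamp = start + day * 86_400
            os.utime(f, (stamp, stamp))
        removed = prune_old(folder, "project-*.zip", keep=5)
        kept = sorted(p.name for p in folder.glob("*.zip"))
        return removed, kept

removed, kept = demo_rotation()
print("removed:", removed)
print("kept:   ", kept)
```

Sorting by st_mtime rather than by file name means the rotation still works if someone renames a backup, at the cost of being fooled if an old archive is touched or re-copied without its timestamp.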
Try It Yourself
13 min
Copy a file to a backup/ folder using copy2. Then check that both files' st_mtime values match, proving the timestamp was preserved.
Write zip_folder(src, name) that archives a folder into name.zip and prints the resulting file's size in KB.
Hint
    import shutil
    from pathlib import Path

    def zip_folder(src, name):
        path = shutil.make_archive(name, "zip", src)
        kb = Path(path).stat().st_size / 1000
        print(f"{path}: {kb:.1f} KB")

    zip_folder("docs", "docs-backup")
Write safe_clean(folder) that deletes a tree with rmtree — but only after printing what's inside and confirming the folder name contains "temp" or "cache". Refuse and warn otherwise.
Hint
    import shutil
    from pathlib import Path

    def safe_clean(folder):
        p = Path(folder)
        if not p.is_dir():
            print("Not a folder")
            return
        if not any(w in p.name.lower() for w in ("temp", "cache")):
            print(f"Refusing to delete {p} (name not temp/cache)")
            return
        count = sum(1 for _ in p.rglob("*"))
        print(f"Deleting {p} ({count} items)…")
        shutil.rmtree(p)
Mini-Challenge · Snapshot & Restore
8 min
Build two functions: snapshot(folder) zips a folder into snapshots/<name>-<timestamp>.zip and returns the path; restore(zip_path, into) unpacks it into a target folder. Test that you can snapshot, change the original, and restore it back.
Show a sample solution
    import shutil
    from pathlib import Path
    from datetime import datetime

    def snapshot(folder: str) -> Path:
        src = Path(folder)
        out = Path("snapshots")
        out.mkdir(exist_ok=True)
        stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
        base = out / f"{src.name}-{stamp}"
        return Path(shutil.make_archive(str(base), "zip", str(src)))

    def restore(zip_path: str, into: str) -> None:
        shutil.unpack_archive(zip_path, into)
        print(f"Restored {zip_path} → {into}")

    snap = snapshot("project")
    restore(str(snap), "project-restored")
Non-negotiables: timestamped archive, returns the path, and a working round-trip restore.
Recap
3 min
shutil is the high-level file toolkit: copy2 (file, keeping timestamps), copytree (whole tree, with dirs_exist_ok), move (across drives), rmtree (delete a tree irreversibly, so validate first), and make_archive/unpack_archive (zip and unzip in one line). Plus disk_usage and which for system info. One call replaces a whole loop, and rmtree is the one to respect: no recycle bin, no undo.
Vocabulary Card
- copy2: Copy a file and its metadata (timestamps); best for backups.
- copytree / rmtree: Recursively copy or delete an entire folder and its contents.
- make_archive: Bundle a folder into a .zip/.tar.gz archive in one call.
- backup rotation: Keeping only the newest N backups and pruning the rest.
Homework
4 min
Combine Lessons 5-7: build tidy.py <folder> that recursively finds every file, copies each one (copy, don't move, so the originals stay safe) into tidy/<extension>/ using copy2, then zips the whole tidy/ folder into a timestamped archive and deletes the intermediate tidy/ folder. Report the archive path and how many files were processed.
Sample · tidy.py
    import argparse, shutil
    from pathlib import Path
    from datetime import datetime

    p = argparse.ArgumentParser(description="Sort + archive a folder.")
    p.add_argument("folder")
    a = p.parse_args()

    src = Path(a.folder)
    tidy = Path("tidy")
    count = 0

    for f in src.rglob("*"):
        if f.is_file():
            kind = f.suffix.lstrip(".").lower() or "other"
            dest = tidy / kind
            dest.mkdir(parents=True, exist_ok=True)
            shutil.copy2(f, dest / f.name)
            count += 1

    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    archive = shutil.make_archive(f"tidy-{stamp}", "zip", "tidy")
    shutil.rmtree(tidy)
    print(f"Processed {count} files → {archive}")
Non-negotiables: copy2 (originals preserved), extension folders, one zip, cleanup of tidy/, and a count.