Learning Goals
3 min
By the end of this lesson you can:
- Install Playwright and its bundled browsers.
- Drive a page with the sync API: navigate, click, fill, read.
- Use auto-waiting locators so you rarely need to write a manual wait.
- Decide between Playwright and Selenium for a given task.
Warm-Up · What Selenium Made You Do
5 min
Recall the Selenium routine: import By, WebDriverWait, and expected_conditions; wrap every interaction in an explicit wait; remember to call quit(). Powerful but verbose, and a single forgotten wait makes a test flaky.
    pip install playwright
    playwright install    # downloads bundled Chromium/Firefox/WebKit
Playwright's locators auto-wait: when you act on a locator, Playwright first waits for the element to be ready. A click waits until the element is visible, stable, and enabled; a read such as inner_text() waits until the element is attached to the page. You write no WebDriverWait, and timing-related flakiness drops sharply. Playwright also downloads its own browser builds (no driver-version headaches) and offers a context-manager API that cleans up for you. Same concepts as Selenium, far less ceremony.
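To make "auto-waiting" concrete, here is a minimal pure-Python sketch of the underlying idea: poll a condition until it holds or a deadline passes, then act. It is an illustration only, not Playwright's implementation; wait_until and FakeButton are hypothetical names invented for this sketch.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool], timeout: float = 5.0,
               interval: float = 0.05) -> None:
    """Poll `condition` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(interval)


class FakeButton:
    """Hypothetical stand-in for an element that becomes clickable after a delay."""

    def __init__(self, ready_after: float) -> None:
        self._ready_at = time.monotonic() + ready_after
        self.clicked = False

    def is_actionable(self) -> bool:
        return time.monotonic() >= self._ready_at

    def click(self) -> None:
        # The "auto-wait": block until the element is ready, then act.
        wait_until(self.is_actionable, timeout=2.0)
        self.clicked = True


button = FakeButton(ready_after=0.2)
button.click()          # returns only once the button has become actionable
print(button.clicked)   # True
```

In Selenium you write the equivalent of wait_until yourself around each interaction; in Playwright the action carries it for you.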
New Concept · The Playwright API
14 min
Launch with a context manager
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto("https://example.com")
        print(page.title())
        browser.close()
The with block manages startup/teardown. headless=True by default; set headless=False to watch it run while developing. Three browser engines are built in: p.chromium, p.firefox, p.webkit.
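Because the three engines share one API, the same steps can run against each in turn. A minimal sketch, assuming all three browsers were installed with playwright install; launch_options and titles_per_engine are hypothetical helper names, and slow_mo (a launch option that delays each operation by the given number of milliseconds) is included only as a debugging convenience.

```python
from __future__ import annotations

ENGINES = ("chromium", "firefox", "webkit")


def launch_options(debug: bool = False) -> dict:
    """Keyword arguments for launch(): headed and slowed down when debugging."""
    return {"headless": not debug, "slow_mo": 250 if debug else 0}


def titles_per_engine(url: str, debug: bool = False) -> dict[str, str]:
    """Open `url` in each engine and return {engine name: page title}."""
    # Imported here so this module still loads where Playwright is absent.
    from playwright.sync_api import sync_playwright

    titles: dict[str, str] = {}
    with sync_playwright() as p:
        for name in ENGINES:
            browser_type = getattr(p, name)   # p.chromium, p.firefox, p.webkit
            browser = browser_type.launch(**launch_options(debug))
            page = browser.new_page()
            page.goto(url)
            titles[name] = page.title()
            browser.close()
    return titles
```

Calling titles_per_engine("https://example.com") should return one title per engine; pass debug=True to watch each browser window while developing.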
Locators that auto-wait
    # build a locator; nothing happens yet
    heading = page.locator("h1")
    search = page.get_by_role("textbox", name="Search")
    button = page.get_by_role("button", name="Submit")

    # acting on it auto-waits for the element to be ready:
    button.click()                 # waits until clickable, then clicks
    search.fill("playwright")      # waits, clears, types
    print(heading.inner_text())    # waits until present, then reads
A locator is a recipe for finding an element, re-evaluated each time you use it. The action (click, fill) waits for the element to be actionable — so timing bugs largely disappear.
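The "re-evaluated each time" part can be seen by building a locator before the element exists and then replacing the page content under it. A minimal sketch, assuming Playwright and its Chromium build are installed; demo_lazy_locator is a hypothetical name, and the HTML is set inline with page.set_content() so no network is needed.

```python
def demo_lazy_locator() -> str:
    """Show that one locator follows the page as its content changes."""
    # Imported here so this module still loads where Playwright is absent.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()

        heading = page.locator("h1")          # built before any <h1> exists

        page.set_content("<h1>First</h1>")
        first = heading.inner_text()          # resolved now, against the current DOM

        page.set_content("<h1>Second</h1>")   # the old element is gone
        second = heading.inner_text()         # resolved again; no stale reference

        browser.close()
    return f"{first} -> {second}"
```

Run with print(demo_lazy_locator()); it should report both headings through the same locator object. A Selenium WebElement held across the second page change would instead raise a stale-element error.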
User-facing selectors (recommended)
    page.get_by_role("button", name="Login")   # by accessible role + name
    page.get_by_text("Add to cart")            # by visible text
    page.get_by_label("Email")                 # form field by its label
    page.get_by_placeholder("Search…")         # by placeholder
    page.locator("#results .item")             # CSS, when needed
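Playwright nudges you toward selectors a user would recognise (role, text, label). These are robust (the same point Lesson 23 made) and double as a rough accessibility check: if a control cannot be found by role or label, a screen reader may struggle with it too.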
Explicit assertions and waits (when you do need them)
    from playwright.sync_api import expect

    expect(page.locator(".result")).to_be_visible()          # auto-retries up to a timeout
    expect(page.locator(".count")).to_have_text("3 found")
    page.wait_for_url("**/dashboard")                        # wait for navigation
    page.wait_for_load_state("networkidle")                  # wait for network to settle
expect(...) retries the assertion until it passes or times out (5 seconds by default), which suits "wait until the result appears" checks. You rarely need raw waits.
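When a result legitimately takes longer than the default, the assertion can carry its own timeout instead of a sleep. A minimal sketch, assuming a results page with a .count element as in the snippet above; count_label and assert_result_count are hypothetical helper names.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:  # type hints only; nothing is imported at runtime
    from playwright.sync_api import Page


def count_label(n: int) -> str:
    """Text the page is assumed to show for n results, e.g. '3 found'."""
    return f"{n} found"


def assert_result_count(page: Page, n: int, timeout_ms: float = 10_000) -> None:
    """Retry until .count shows the expected label, for up to timeout_ms."""
    # Imported here so this module still loads where Playwright is absent.
    from playwright.sync_api import expect

    expect(page.locator(".count")).to_have_text(count_label(n), timeout=timeout_ms)
```

A call such as assert_result_count(page, 3) passes as soon as the text appears and raises an AssertionError only after the full timeout, so a fast page is never slowed down.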
Handy extras
    page.screenshot(path="shot.png", full_page=True)
    page.pdf(path="page.pdf")                          # save the page as PDF (chromium)
    content = page.content()                           # rendered HTML → BeautifulSoup
    page.goto(url, wait_until="domcontentloaded")
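The string returned by page.content() is ordinary HTML, so it can go straight to a parser you already know. A minimal sketch, assuming the beautifulsoup4 package is installed; quotes_from_html is a hypothetical name, and SAMPLE_HTML is invented markup standing in for what page.content() would return.

```python
from __future__ import annotations

# Invented markup standing in for the result of page.content().
SAMPLE_HTML = """
<div class="quote"><span class="text">Hello</span><small class="author">Ada</small></div>
<div class="quote"><span class="text">World</span><small class="author">Bob</small></div>
"""


def quotes_from_html(html: str) -> list[dict]:
    """Parse rendered HTML into a list of {'text', 'author'} rows."""
    # Imported here so this module still loads where BeautifulSoup is absent.
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    return [
        {
            "text": card.select_one(".text").get_text(strip=True),
            "author": card.select_one(".author").get_text(strip=True),
        }
        for card in soup.select(".quote")
    ]
```

In a real run you would call quotes_from_html(page.content()) after the page has rendered. This splits the work cleanly: Playwright renders, BeautifulSoup parses.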
Worked Example · Same Scrape, Less Code
12 min
Goal: the same JS-rendered quote scrape from Lesson 26, now in Playwright. Notice how auto-waiting removes the explicit-wait boilerplate.
    import csv
    import logging

    from playwright.sync_api import sync_playwright

    logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
    log = logging.getLogger("playwright")

    def scrape_quotes() -> list[dict]:
        quotes = []
        with sync_playwright() as p:
            browser = p.chromium.launch(headless=True)
            page = browser.new_page()
            page.goto("https://quotes.toscrape.com/js/")

            # the quotes are rendered by JavaScript; no WebDriverWait needed
            cards = page.locator(".quote")
            cards.first.wait_for()    # count() does not wait, so ensure at least one exists
            count = cards.count()
            log.info("found %d quotes", count)

            for i in range(count):
                card = cards.nth(i)
                quotes.append({
                    "text": card.locator(".text").inner_text(),
                    "author": card.locator(".author").inner_text(),
                })
            browser.close()
        return quotes

    rows = scrape_quotes()
    with open("quotes.csv", "w", newline="", encoding="utf-8") as f:
        w = csv.DictWriter(f, fieldnames=["text", "author"])
        w.writeheader()
        w.writerows(rows)
    log.info("wrote %d quotes", len(rows))

    INFO found 10 quotes
    INFO wrote 10 quotes
Read the code
Compare with Lesson 26: no By, no WebDriverWait/expected_conditions imports, no try/finally (the with block closes the browser). cards.first.wait_for() is the only explicit wait. It is needed here because count() reports whatever is on the page at that instant and does not auto-wait; had the code started with an action such as click() or inner_text(), the wait would have been implicit. cards.nth(i) indexes the matched set. Same result, noticeably less ceremony, which is why Playwright has become many teams' default for new browser automation.
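The per-card loop can also be replaced by two bulk reads, one per column. A minimal sketch, assuming the same .quote, .text, and .author markup; rows_from_columns and scrape_quotes_bulk are hypothetical names. Like count(), all_inner_texts() returns whatever matches right now, so the explicit wait stays.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:  # type hints only; nothing is imported at runtime
    from playwright.sync_api import Page


def rows_from_columns(texts: list[str], authors: list[str]) -> list[dict]:
    """Pair the n-th text with the n-th author; refuse mismatched lengths."""
    if len(texts) != len(authors):
        raise ValueError(f"{len(texts)} texts but {len(authors)} authors")
    return [{"text": t, "author": a} for t, a in zip(texts, authors)]


def scrape_quotes_bulk(page: Page) -> list[dict]:
    """Read every quote on the current page with two bulk calls."""
    page.locator(".quote").first.wait_for()   # bulk reads do not auto-wait
    texts = page.locator(".quote .text").all_inner_texts()
    authors = page.locator(".quote .author").all_inner_texts()
    return rows_from_columns(texts, authors)
```

The trade-off: fewer round trips to the browser, but the pairing relies on every card having exactly one text and one author. The length check turns a silent misalignment into an error.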
Playwright: new projects, relief from flaky tests, modern apps, or when you want bundled browsers and async support. Selenium: existing test suites, the widest language/grid ecosystem, or a tool/integration that mandates it. Both control real browsers; the concepts you learned in Lesson 26 transfer directly. Pick one and go; don't agonise.
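The async support mentioned above mirrors the sync API: import from playwright.async_api and await each call. A minimal sketch, assuming Playwright and its Chromium build are installed; fetch_titles is a hypothetical name, and the pages share one browser so they load concurrently.

```python
from __future__ import annotations

import asyncio


async def fetch_titles(urls: list[str]) -> list[str]:
    """Load all `urls` concurrently in one headless browser; return their titles."""
    # Imported here so this module still loads where Playwright is absent.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch()

        async def title_of(url: str) -> str:
            page = await browser.new_page()
            try:
                await page.goto(url)
                return await page.title()
            finally:
                await page.close()

        # gather() runs the page loads concurrently and keeps input order.
        titles = await asyncio.gather(*(title_of(u) for u in urls))
        await browser.close()
    return list(titles)


# Usage (needs Playwright, its browsers, and network access):
# print(asyncio.run(fetch_titles(["https://example.com"])))
```

For a handful of pages the sync API is simpler; the async form pays off when many pages can load at once.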
Try It Yourself
13 min
Open a site headless, print its title, and save a full-page screenshot, all inside a with sync_playwright() block. No manual cleanup needed.
On a search or login page, locate fields with get_by_label/get_by_role, fill them, click the submit button, and assert the result with expect(...).to_be_visible().
Hint
    from playwright.sync_api import expect

    page.get_by_label("Username").fill("demo")
    page.get_by_label("Password").fill("secret")
    page.get_by_role("button", name="Login").click()
    expect(page.get_by_text("Welcome")).to_be_visible()
Open a content page, wait for it to settle (wait_for_load_state("networkidle")), and save it as a PDF with page.pdf(...). Combine with Lesson 21 ideas — a webpage archived as a document.
Hint
    page.goto("https://example.com")
    page.wait_for_load_state("networkidle")
    page.pdf(path="example.pdf", format="A4")
Mini-Challenge · The Multi-Page Crawler
8 min
Write a Playwright crawler that scrapes a paginated practice site (e.g. quotes.toscrape.com), clicking the "Next" button with a locator until it's no longer visible, collecting all items across pages into one CSV. Let auto-waiting handle the page transitions.
Show a sample solution
    import csv

    from playwright.sync_api import sync_playwright

    def crawl() -> list[dict]:
        rows = []
        with sync_playwright() as p:
            page = p.chromium.launch(headless=True).new_page()
            page.goto("https://quotes.toscrape.com/")
            while True:
                cards = page.locator(".quote")
                for i in range(cards.count()):
                    c = cards.nth(i)
                    rows.append({
                        "text": c.locator(".text").inner_text(),
                        "author": c.locator(".author").inner_text(),
                    })
                nxt = page.locator("li.next a")
                if nxt.count() == 0:    # no Next link: this was the last page
                    break
                nxt.click()             # auto-waits for the link to be actionable, then clicks
                page.wait_for_load_state("domcontentloaded")
        return rows    # leaving the with block shuts the browser down

    data = crawl()
    with open("all_quotes.csv", "w", newline="", encoding="utf-8") as f:
        w = csv.DictWriter(f, fieldnames=["text", "author"])
        w.writeheader()
        w.writerows(data)
    print(f"crawled {len(data)} quotes")
Non-negotiables: locator-driven pagination, stops when Next disappears, all pages into one CSV.
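One risk with a while True crawler is a loop that never ends if the stop condition is never met. A minimal sketch of the same loop with a page cap, written against plain callables so it can be exercised without a browser; paginate, scrape_page, and go_next are hypothetical names.

```python
from __future__ import annotations

from typing import Callable


def paginate(scrape_page: Callable[[], list[dict]],
             go_next: Callable[[], bool],
             max_pages: int = 100) -> list[dict]:
    """Collect rows page by page until go_next() returns False or the cap is hit."""
    rows: list[dict] = []
    for _ in range(max_pages):
        rows.extend(scrape_page())
        if not go_next():        # no "Next" link: last page reached
            break
    return rows
```

With Playwright, scrape_page would read the .quote cards from the current page, and go_next would click li.next a and return True, or return False when that locator's count() is 0.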
Recap
3 min
Playwright is the modern take on browser automation: with sync_playwright() manages the browser, bundled engines avoid driver headaches, and locators auto-wait for elements to be ready, so flaky timing bugs mostly vanish and you rarely write an explicit wait. Prefer user-facing selectors (get_by_role, get_by_text, get_by_label), and use expect(...) for retrying assertions. The concepts carry over from Selenium (Lesson 26) with far less ceremony. Choose Playwright for new work and flaky-test relief; Selenium for legacy suites and the widest ecosystem.
Vocabulary Card
- locator
- A re-evaluated recipe for an element; actions on it auto-wait.
- auto-waiting
- Playwright waits for an element to be actionable before acting.
- get_by_role
- Selecting elements the way a user/screen-reader perceives them.
- expect()
- An assertion that retries until it passes or times out.
Homework
4 min
Port your Lesson 26 Selenium homework to Playwright. Then write a short note comparing the two: lines of code, how each handled waiting, and which you'd choose for a new project and why. Bonus: add a feature Playwright makes easy (PDF export, multi-browser run, or get_by_role selectors).
Sample · port + comparison
    from playwright.sync_api import sync_playwright
    import csv

    with sync_playwright() as p:
        page = p.chromium.launch(headless=True).new_page()
        page.goto("https://quotes.toscrape.com/js/")
        cards = page.locator(".quote")
        cards.first.wait_for()
        rows = [{
            "text": cards.nth(i).locator(".text").inner_text(),
            "author": cards.nth(i).locator(".author").inner_text(),
        } for i in range(cards.count())]
        page.screenshot(path="page.png", full_page=True)

    with open("jobs.csv", "w", newline="", encoding="utf-8") as f:
        w = csv.DictWriter(f, fieldnames=["text", "author"])
        w.writeheader()
        w.writerows(rows)

Comparison:
- Playwright: ~15 lines, no By/WebDriverWait/EC imports, no try/finally (with-block closes the browser), waiting is automatic.
- Selenium: ~25 lines, manual explicit waits, manual quit().
- New project → Playwright (less flakiness, bundled browsers).
- Legacy suite / Selenium Grid → stay on Selenium.
Non-negotiables: a working port, a Playwright-only feature, and a concrete comparison.