PY-L2-14 · String Superpowers — Advaslearning Hub

String Superpowers

Four string methods cover most of the everyday text work you'll meet. .split() breaks one string into many. .join() glues many strings into one. .strip() trims away the rubbish. .find() tells you where something is. Master these and parsing messy text becomes routine.

⏱ 1 hour · ✂️ Concept lesson · 📚 After PY-L2-13 · 💻 VS Code or online-python.com

Learning Goals

3 min

By the end of this lesson you can:

Cut a string into a list of pieces with .split("sep").
Glue a list of strings back into one string with "sep".join(list).
Trim leading/trailing whitespace with .strip().
Locate a substring with .find("needle") — and know the difference between "not found" and "found at position 0".

Warm-Up

5 min

You used .split(", ") back in PY-L2-04 to break apart an address. Today we own that method properly.

Predict the outputs:

line = "Aisyah, 12, Kuala Lumpur"
parts = line.split(", ")
print(parts)
print(len(parts))
print(", ".join(parts))

Show the answer

['Aisyah', '12', 'Kuala Lumpur']
3
Aisyah, 12, Kuala Lumpur

.split made a list of three. .join put them back together. Used with the same separator, the two methods undo each other: break apart, then weld back.

Today's big idea

Real text arrives as one long string. To work with it you almost always have to chop it up. To save it back you almost always have to glue it together. .split and .join are how.

New Concept · The Four Methods

14 min

1 · `.split(sep)` — chop into a list

Pass a separator. Get back a list of pieces — the separator itself is removed.

csv = "milo,kopi,teh tarik,horlicks"
drinks = csv.split(",")
print(drinks)
# → ['milo', 'kopi', 'teh tarik', 'horlicks']

sentence = "the quick brown fox"
words = sentence.split(" ")
print(words)
# → ['the', 'quick', 'brown', 'fox']

Call .split() with no argument and Python splits on any whitespace — runs of spaces, tabs, newlines — all in one go. That's often what you want:

messy = "  hello   world  "
print(messy.split())
# → ['hello', 'world']    (no empty strings!)

print(messy.split(" "))
# → ['', '', 'hello', '', '', 'world', '', '']    ← oof

The no-arg form is more forgiving with messy whitespace. Reach for it first; only specify a separator when you genuinely need to split on something specific.
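Here is a small sketch of the difference, using sample strings invented for this illustration. It suggests that the no-argument form treats tabs and newlines just like spaces, while an explicit separator keeps an empty string wherever two separators sit side by side.

```python
# Sample strings made up for this illustration
mixed = "milo\tkopi\n teh"      # a tab, a newline and a space between words
gappy = "milo,,kopi"            # two commas in a row

any_whitespace = mixed.split()      # no argument: any run of whitespace
on_commas = gappy.split(",")        # explicit separator: every comma counts

print(any_whitespace)   # → ['milo', 'kopi', 'teh']
print(on_commas)        # → ['milo', '', 'kopi']
```

The empty string in the second result is not a bug: there really is nothing between the two commas.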

2 · `"sep".join(list)` — weld a list into one string

Note the surprising shape: the separator goes on the left, the list goes inside the brackets. Read it as "put this separator between each item of this list".

drinks = ["milo", "kopi", "teh tarik"]
print(", ".join(drinks))     # → milo, kopi, teh tarik
print(" - ".join(drinks))    # → milo - kopi - teh tarik
print("".join(drinks))       # → milokopiteh tarik
print("\n".join(drinks))    # → milo
                             #    kopi
                             #    teh tarik

The classic typo

Writing drinks.join(", ") looks natural but it's the wrong way round — and Python crashes with AttributeError: 'list' object has no attribute 'join'. The method belongs to the string (the separator), not to the list. Always type the separator first.
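To see the typo and its fix side by side, here is a minimal sketch. It uses try/except only so the script can carry on after the error; the names error_text and menu are picked just for this example.

```python
drinks = ["milo", "kopi", "teh tarik"]

try:
    menu = drinks.join(", ")        # wrong way round: lists have no .join
except AttributeError as err:
    error_text = str(err)           # keep the message so we can read it
    menu = ", ".join(drinks)        # right way round: separator first

print(error_text)   # → 'list' object has no attribute 'join'
print(menu)         # → milo, kopi, teh tarik
```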

3 · `.strip()` — trim the rubbish

Removes whitespace (spaces, tabs, newlines) from the start and end of the string. Doesn't touch the middle.

raw = "   Aisyah   \n"
print(raw.strip())
# → Aisyah    (no extra spaces, no newline)

# Strip specific characters instead (the argument is a set of characters, not a word)
print("###Hello###".strip("#"))
# → Hello

Two close cousins: .lstrip() trims only the left (leading) side; .rstrip() trims only the right.
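A short sketch of the three variants, with sample strings invented for the purpose. repr() is used only to make the remaining spaces visible in the output.

```python
raw = "   Aisyah   "

left_trimmed = raw.lstrip()     # leading spaces gone, trailing ones kept
right_trimmed = raw.rstrip()    # trailing spaces gone, leading ones kept

# The argument is a set of characters: every "#" or "!" at either end goes
shout = "#!#Hello!#".strip("#!")

print(repr(left_trimmed))    # → 'Aisyah   '
print(repr(right_trimmed))   # → '   Aisyah'
print(shout)                 # → Hello
```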

4 · `.find(needle)` — where is it?

Returns the position of the first occurrence of needle — or -1 if it isn't there at all.

sentence = "the quick brown fox"
print(sentence.find("quick"))      # → 4
print(sentence.find("brown"))      # → 10
print(sentence.find("rabbit"))     # → -1   (not found!)

Beware the "found at 0" trap

Don't test .find()'s return value with if sentence.find("x"): because it gives the wrong answer both ways. A match at position 0 returns 0, which is falsy, and a miss returns -1, which is truthy! Always compare explicitly:

pos = sentence.find("quick")
if pos != -1:
    print("Found at index", pos)
else:
    print("Not in the sentence.")

Or, for a plain yes/no, use the operator in instead: if "quick" in sentence:.
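The trap is easier to believe once you see the numbers. This sketch reuses the sentence from above; the variable names are chosen just for this illustration.

```python
sentence = "the quick brown fox"

at_start = sentence.find("the")       # 0: found at the very beginning
missing = sentence.find("rabbit")     # -1: not found

# What a bare "if" would see
print(at_start, bool(at_start))   # → 0 False
print(missing, bool(missing))     # → -1 True

# The explicit comparison gets both cases right
safe_hit = at_start != -1
safe_miss = missing != -1
print(safe_hit, safe_miss)        # → True False
```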

The cheat-sheet

Question                              Method
"Break this string into pieces"       text.split(sep)
"Glue these pieces into one string"   sep.join(list_of_strings)
"Trim leading/trailing spaces"        text.strip()
"Where is X inside this string?"      text.find(x)    (-1 = not found)
"Is X in here at all?"                x in text       (True / False)

Worked Example · The Comma-Separated Order Form

12 min

The story

A teacher posts orders on a sticky note as one long string — names separated by semicolons, with sloppy spacing. You need to clean each name, count them, and print a tidy comma-separated rota.

Save as orders.py:

Code

# orders.py — split, strip, join

raw = "  Aisyah ; Wei Jie ; Priya ; Iman ;  Aizat  "

# 1 — split on the semicolon
parts = raw.split(";")
print("After split  :", parts)
# → ['  Aisyah ', ' Wei Jie ', ' Priya ', ' Iman ', '  Aizat  ']

# 2 — strip every piece (loop)
clean = []
for p in parts:
    clean.append(p.strip())
print("After strip  :", clean)
# → ['Aisyah', 'Wei Jie', 'Priya', 'Iman', 'Aizat']

# 3 — back together, comma-separated
tidy = ", ".join(clean)
print("Tidy rota    :", tidy)

# 4 — quick question
print("Total people :", len(clean))
print("Is Priya in? :", "Priya" in clean)

Output

After split  : ['  Aisyah ', ' Wei Jie ', ' Priya ', ' Iman ', '  Aizat  ']
After strip  : ['Aisyah', 'Wei Jie', 'Priya', 'Iman', 'Aizat']
Tidy rota    : Aisyah, Wei Jie, Priya, Iman, Aizat
Total people : 5
Is Priya in? : True

Read the diff

One string in, one string out — but in the middle it briefly became a list. split → strip → join is the most common text-cleanup pipeline you'll ever write. Memorise that shape.

The one-line version

If you're comfortable with list comprehensions (PY-L2-02), the loop in step 2 collapses to one line:

clean = [p.strip() for p in raw.split(";")]
tidy  = ", ".join(clean)

Two lines for the whole pipeline. That's the kind of code real Python projects are full of.
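If you have already met functions, the same pipeline can be wrapped up for reuse. This is only a sketch: tidy_list and its parameters sep_in and sep_out are hypothetical names, not part of the lesson's files.

```python
def tidy_list(raw, sep_in=";", sep_out=", "):
    """Split raw on sep_in, strip each piece, rejoin with sep_out."""
    clean = [p.strip() for p in raw.split(sep_in)]
    return sep_out.join(clean)

rota = tidy_list("  Aisyah ; Wei Jie ; Priya ; Iman ;  Aizat  ")
print(rota)    # → Aisyah, Wei Jie, Priya, Iman, Aizat

# Different separators in and out
menu = tidy_list("milo|kopi | teh tarik", sep_in="|", sep_out=" / ")
print(menu)    # → milo / kopi / teh tarik
```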

Try It Yourself

13 min

01 🟢 Count the words

Ask the user for a sentence with input(). Print how many words it contains, using .split() with no argument.

Hint

text = input("Sentence: ")
words = text.split()
print("Word count:", len(words))

No-arg .split() handles whatever whitespace the user types — even if they accidentally double-space.
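To check that claim, here is a sketch with a fixed sentence in place of input(), so the result is the same on every run. The sentence is invented and contains accidental extra spaces.

```python
text = "I  love   nasi lemak"     # a double and a triple space

careful = len(text.split())       # any run of whitespace counts once
naive = len(text.split(" "))      # every single space counts

print(careful)   # → 4
print(naive)     # → 7
```

The naive count is inflated by the empty strings that sit between neighbouring spaces.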

02 🟡 Reverse-join

Take the list ["Kuala", "Lumpur", "Malaysia"]. Print it joined with spaces, then with hyphens, then with newlines.

Hint

parts = ["Kuala", "Lumpur", "Malaysia"]
print(" ".join(parts))      # → Kuala Lumpur Malaysia
print("-".join(parts))      # → Kuala-Lumpur-Malaysia
print("\n".join(parts))    # → on three lines

03 🔴 First-name initials (stretch)

Given names = "Aisyah binti Hassan, Wei Jie Tan, Priya Kumar", build a list of initials — "A.", "W.", "P." — and print them joined by spaces.

Hint

names = "Aisyah binti Hassan, Wei Jie Tan, Priya Kumar"
people = [p.strip() for p in names.split(",")]
initials = [p[0] + "." for p in people]
print(" ".join(initials))
# → A. W. P.

Three pipeline steps: split the big string into people, strip each one, then build the initials list from the first character of each name. .join ties it all together for printing.

Mini-Challenge · The Tag Cleaner

8 min

You've scraped tags off the internet. They're a horrible mess — leading hashes, extra spaces, mixed case, duplicates. Build clean_tags.py that tidies them.

raw = "  #PYTHON ,  #beginner;# Python ; #LEVEL2  ,#WORDS ; #beginner "

Your file must:

Replace every ; with , using .replace(";", ",") (a method we met in PY-L1-23 — same family).
Split on commas.
For each piece: strip whitespace, strip leading "#", lower-case it.
Drop empty pieces (strings that became "" after cleaning).
Drop duplicates using a set, keeping the first occurrence of each tag so the order is preserved.
Print the cleaned tags joined by ", " with a leading "#" on each — like #python, #beginner, #level2, #words.

Stretch goal. Print the original tag count and the cleaned count side by side, like {'raw': 6, 'clean': 4}.

Show one possible solution

# clean_tags.py — tidy a messy tag string

raw = "  #PYTHON ,  #beginner;# Python ; #LEVEL2  ,#WORDS ; #beginner "

normalised = raw.replace(";", ",")
pieces     = normalised.split(",")

clean = []
for p in pieces:
    p = p.strip()
    p = p.lstrip("#")
    p = p.strip()       # strip again in case "# Python" -> " Python"
    p = p.lower()
    if p != "":
        clean.append(p)

# Drop duplicates while keeping order
seen   = set()
unique = []
for t in clean:
    if t not in seen:
        unique.append(t)
        seen.add(t)

# Print joined with hash prefix
hashed = ["#" + t for t in unique]
print(", ".join(hashed))

# Stretch
print({"raw": len(pieces), "clean": len(unique)})

Non-negotiables: .replace, .split, a loop that .strip()s and .lstrip("#")s each piece, a set-based dedupe and one .join at the end. This is the most realistic mini-pipeline you've written so far.
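As an aside, there is a shorter way to drop duplicates while keeping order. It is an alternative to the set-based loop, not a replacement for it in this challenge. The sketch assumes Python 3.7 or later, where dictionaries remember insertion order, and starts from the clean list the solution builds.

```python
# The cleaned tags, in the order the solution's loop produces them
clean = ["python", "beginner", "python", "level2", "words", "beginner"]

# dict.fromkeys keeps one key per distinct tag, first occurrence first
unique = list(dict.fromkeys(clean))

print(unique)   # → ['python', 'beginner', 'level2', 'words']
```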

Recap

3 min

Four string methods unlock most real text work. .split(sep) chops a string into a list. "sep".join(list) welds a list back into a string. .strip() trims unwanted whitespace from both ends. .find(needle) tells you where a substring sits — but watch the "found at 0" trap; prefer x in text for a clean yes/no. The common pipeline split → strip → join is the bread and butter of every text-cleaning script.

Vocabulary Card

.split(sep): Cut a string at every sep. With no argument, splits on any run of whitespace.
"sep".join(list): Glue the items of list with sep between them. The separator goes on the left of .join.
.strip() / .lstrip() / .rstrip(): Trim whitespace (or chosen characters) from both ends / the left / the right.
.find(needle): Index of the first match, or -1 if not found.
x in text: Pure yes/no membership — simpler than .find() != -1 when you don't need the position.

Homework

4 min

Save csv_to_pretty.py. Given the comma-separated string below, print a tidy markdown-style table.

raw = "name,age,city\nAisyah, 12 , Kuala Lumpur \nWei Jie ,13,Penang\nPriya,11, Ipoh"

Your file must:

Split the string on "\n" to get four lines.
For each line, split on "," and strip each cell — using a list comprehension is fine.
Print each row joined with " | ". Add a divider "----|-----|------" between the header and the data rows.

Stretch. Find the row containing "Penang" using in and print just that line on its own at the end.

Sample · csv_to_pretty.py

# csv_to_pretty.py — CSV-ish string to a markdown table

raw = "name,age,city\nAisyah, 12 , Kuala Lumpur \nWei Jie ,13,Penang\nPriya,11, Ipoh"

lines = raw.split("\n")

rows = []
for line in lines:
    cells = [c.strip() for c in line.split(",")]
    rows.append(cells)

# Print header
print(" | ".join(rows[0]))
print("----|-----|------")

# Print body
for row in rows[1:]:
    print(" | ".join(row))

# Stretch — find Penang line
for row in rows[1:]:
    if "Penang" in row:
        print()
        print("Penang row:", " | ".join(row))

Non-negotiables: outer .split("\n"), inner .split(",") with .strip() on each cell, and " | ".join for the printed line. Real CSV parsers use the csv module (you'll meet it in Level 4), but this two-split shape handles a lot.
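For the curious, here is a rough preview of how the standard-library csv module could read the same string. Treat it as a sketch rather than the Level 4 method: it assumes csv.reader is fed the lines from raw.splitlines(), and the cells still need .strip() because the module keeps the spaces around each value.

```python
import csv

raw = "name,age,city\nAisyah, 12 , Kuala Lumpur \nWei Jie ,13,Penang\nPriya,11, Ipoh"

# csv.reader accepts any iterable of lines; splitlines() provides one
reader = csv.reader(raw.splitlines())
rows = [[cell.strip() for cell in row] for row in reader]

for row in rows:
    print(" | ".join(row))
```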
