Learning Goals
3 min
- Pick one or many columns with df["col"] and df[["a", "b"]].
- Pick rows by label with df.loc[row]; ranges with df.loc[a:b] (inclusive!).
- Pick rows by integer position with df.iloc[i] and df.iloc[a:b] (exclusive).
- Combine row + column picks in one expression.
Warm-Up · Series vs DataFrame
5 min
    import pandas as pd

    df = pd.read_csv("students.csv")
    print(type(df["name"]))    # → <class 'pandas.core.series.Series'>
    print(type(df[["name"]]))  # → <class 'pandas.core.frame.DataFrame'>
Single brackets give back a 1-D Series. Double brackets (a list of columns) give back a DataFrame — even if the list has one element. Two different types; pick the one your next step expects.
loc = labels, iloc = integers. Memorise the prefix and you stop confusing the two. loc is also the one you assign through: writing df.loc[rows, col] = value in a single step is how you avoid pandas's "chained-assignment" warning.
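To see why the prefix matters, here is a minimal sketch on a made-up frame whose index labels (10, 20, 30) deliberately differ from the row positions. The name demo and its values are assumptions for illustration, not the contents of students.csv.

```python
import pandas as pd

# Hypothetical data: index labels 10, 20, 30 so that labels != positions.
demo = pd.DataFrame(
    {"name": ["Aisyah", "Ben", "Chen"], "score": [91, 78, 85]},
    index=[10, 20, 30],
)

by_label = demo.loc[10]     # the row whose index label is 10
by_position = demo.iloc[0]  # the row at position 0 (the same row here)

print(by_label["name"], by_position["name"])  # → Aisyah Aisyah
print(demo.iloc[-1]["name"])                  # → Chen

# Label 0 does not exist in this index, so .loc refuses it.
try:
    demo.loc[0]
except KeyError:
    print("no row is labelled 0")
```

With the default 0, 1, 2, ... index the two accessors happen to agree on single rows, which is why the difference is easy to miss.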
New Concept · Columns, .loc, .iloc
14 min
Columns
    df["name"]             # one column → Series
    df[["name", "score"]]  # two columns → DataFrame
.loc — by label
    df.loc[0]                       # row with label 0 (Series)
    df.loc[0:2]                     # rows 0, 1, 2 (INCLUSIVE)
    df.loc[0, "score"]              # one cell
    df.loc[:, "score"]              # every row, score column
    df.loc[0:2, ["name", "score"]]  # block
    # loc accepts boolean masks too (next lesson)
    df.loc[df["score"] > 80]
.iloc — by integer position
    df.iloc[0]       # first row
    df.iloc[0:2]     # rows 0 and 1 (EXCLUSIVE, like normal Python slicing)
    df.iloc[-1]      # last row
    df.iloc[0, 2]    # row 0, column 2
    df.iloc[:, 1:3]  # every row, columns 1-2
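The inclusive/exclusive difference is easiest to trust once you count the rows. The sketch below uses a small made-up frame (the names and scores are assumptions) with the default index, and applies the same 0:2 slice through both accessors.

```python
import pandas as pd

# Made-up data with the default 0, 1, 2, ... index.
frame = pd.DataFrame(
    {
        "name": ["Aisyah", "Ben", "Chen", "Devi"],
        "score": [91, 78, 85, 66],
        "grade": ["A", "C", "B", "D"],
    }
)

label_rows = frame.loc[0:2]      # labels 0, 1, 2: the stop label is included
position_rows = frame.iloc[0:2]  # positions 0, 1: the stop position is excluded

print(len(label_rows))     # → 3
print(len(position_rows))  # → 2

# The same rule applies when a column selection is added.
print(frame.loc[0:2, ["name", "score"]].shape)  # → (3, 2)
print(frame.iloc[0:2, 1:3].shape)               # → (2, 2)
```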
Set the index for clean .loc work
By default the index is 0, 1, 2, and so on. .set_index swaps in a meaningful column and returns a new DataFrame:
    students = df.set_index("name")
    print(students.loc["Aisyah"])
    print(students.loc["Aisyah", "score"])
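A small sketch of what .set_index actually changes, using two made-up rows in place of students.csv (the data is an assumption). It shows that the chosen column moves out of the columns and into the index, and that the original frame is left alone.

```python
import pandas as pd

# Stand-in for students.csv; the rows are invented for illustration.
df = pd.DataFrame({"name": ["Aisyah", "Ben"], "score": [91, 78]})

students = df.set_index("name")  # returns a new frame; df is not modified

print(list(df.columns))        # → ['name', 'score']
print(list(students.columns))  # → ['score']
print(list(students.index))    # → ['Aisyah', 'Ben']

# With names as labels, a cell lookup reads like a sentence.
print(students.loc["Aisyah", "score"])  # → 91
```

If you want the change to stick under the same name, reassign the result, for example df = df.set_index("name").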
Edit safely — always via .loc
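    # ✗ Bad: chained indexing writes to a temporary copy, so df is not updated
    #        (pandas may also warn about chained assignment)
    df[df["name"] == "Aisyah"]["score"] = 100

    # ✓ Good: one .loc call selects rows and column, then assigns
    df.loc[df["name"] == "Aisyah", "score"] = 100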
If you only remember one rule from this lesson: when assigning, use .loc.
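You can check the rule for yourself with a two-row frame. This is a sketch on invented data; the warnings filter is there only to keep the output quiet, and the exact warning pandas raises differs between versions.

```python
import warnings

import pandas as pd

# Invented data for the demonstration.
df = pd.DataFrame({"name": ["Aisyah", "Ben"], "score": [91, 78]})
is_aisyah = df["name"] == "Aisyah"

# Chained indexing: df[is_aisyah] builds a temporary copy, and the
# assignment lands on that copy instead of on df.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence the chained-assignment warning
    df[is_aisyah]["score"] = 100
after_chained = df.loc[0, "score"]
print(after_chained)  # → 91 (df was not changed)

# A single .loc call addresses rows and column together, so df is updated.
df.loc[is_aisyah, "score"] = 100
print(df.loc[0, "score"])  # → 100
```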
Worked Example · Slice a Bigger Frame
12 min
    import pandas as pd

    df = pd.read_csv("clean.csv", parse_dates=["date"])

    # 1. The first three rows, just two columns
    print(df.loc[:2, ["product", "quantity"]])

    # 2. Every row of one column as a Series
    total_qty = df["quantity"].sum()
    print(f"total quantity: {total_qty}")

    # 3. By position: middle row, every column
    print(df.iloc[len(df) // 2])

    # 4. Set a meaningful index, then look someone up by name
    by_customer = df.set_index("customer")
    print(by_customer.loc["Ahmad"])

    # 5. Multi-row + column block via .loc with a label-based slice
    #    (assumes "Ahmad" and "Mei" each appear once in the index; a
    #    duplicated bound label in an unsorted index raises KeyError)
    print(by_customer.loc["Ahmad":"Mei", ["product", "price"]])
Read aloud
1. "first three rows, product and quantity columns"
2. "every quantity, then sum"
3. "the middle row's full record"
4. "the row labelled Ahmad"
5. "rows from Ahmad through Mei, product and price"
Read the diff
Five lines, five distinct slices. Notice how loc is inclusive at both ends — 0:2 gives rows 0, 1, AND 2. That trips up everyone once.
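The same inclusive rule holds for string labels. Here is a sketch with four invented orders (the customers, products, and prices are assumptions, and each customer appears only once) showing which rows a "Ahmad":"Mei" slice returns.

```python
import pandas as pd

# Invented orders; each customer appears once so the label slice is unambiguous.
orders = pd.DataFrame(
    {
        "customer": ["Ahmad", "Siti", "Mei", "Raj"],
        "product": ["Nasi", "Teh", "Roti", "Kopi"],
        "price": [8.0, 2.5, 3.0, 2.0],
    }
)
by_customer = orders.set_index("customer")

block = by_customer.loc["Ahmad":"Mei", ["product", "price"]]
print(list(block.index))  # → ['Ahmad', 'Siti', 'Mei']
print(block.shape)        # → (3, 2)
```

Note that the slice follows row order, not alphabetical order: Siti is included because her row sits between Ahmad's and Mei's.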
Try It Yourself
13 min
Return the first column of the last 3 rows in one expression.
Hint
df.iloc[-3:, 0]
Set the index to the customer column. Print Ahmad's and Mei's product and price.
Hint
    bc = df.set_index("customer")
    print(bc.loc[["Ahmad", "Mei"], ["product", "price"]])
Add a new column tax equal to 6% of price. Then bump every row where product == "Nasi" to a tax rate of 8% — using .loc properly.
Hint
    df["tax"] = df["price"] * 0.06
    is_nasi = df["product"] == "Nasi"
    df.loc[is_nasi, "tax"] = df.loc[is_nasi, "price"] * 0.08
    print(df[["product", "price", "tax"]])
Mini-Challenge · Re-Order the Frame
8 min
Build a one-line expression that returns the DataFrame with columns in a custom order: customer, product, quantity, price, total, date, order_id. (Compute total first if it doesn't exist.)
Show one possible solution
    df["total"] = df["quantity"] * df["price"]
    df = df[["customer", "product", "quantity", "price", "total", "date", "order_id"]]
Selecting with a column-name list returns a new DataFrame in that order. The simplest re-ordering trick in pandas.
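A quick way to convince yourself, sketched on a tiny invented frame (the columns and values are assumptions): the selection comes back in the order you listed, and the original frame keeps its own order until you reassign.

```python
import pandas as pd

# Invented two-row frame with columns in an awkward order.
orders = pd.DataFrame(
    {"order_id": [1, 2], "price": [8.0, 2.5], "customer": ["Ahmad", "Mei"]}
)

wanted = ["customer", "price", "order_id"]
reordered = orders[wanted]  # a new DataFrame, columns in the listed order

print(list(reordered.columns))  # → ['customer', 'price', 'order_id']
print(list(orders.columns))     # → ['order_id', 'price', 'customer']
```

A name in the list that is not a column raises KeyError, which is why the challenge computes total before selecting it.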
Recap
3 min
Three notations, three roles. df["col"] picks columns. df.loc picks by label (inclusive). df.iloc picks by integer position (exclusive). Always assign via .loc. Tomorrow we filter rows with boolean masks.
Homework
4 min
Take your real CSV from yesterday. Produce five slices:
- Two columns only.
- Every row, but in reverse column order.
- The first 10 rows of a specific column.
- A single cell by label.
- A single cell by integer position.
Write the line and the one-sentence English meaning side by side. Submit a markdown file.
    df[["customer", "product"]]                     # 2 cols
    df.iloc[:, ::-1]                                # columns reversed
    df.loc[:9, "price"]                             # first 10 rows of price
    df.set_index("customer").loc["Mei", "product"]  # cell by label
    df.iat[2, 3]                                    # cell by int position (faster than iloc[r,c])