PY-L5-42 · Tool Use — Letting the LLM Call a Function

Learning Goals

3 min

Define a tool with a name, description and JSON input schema.
Detect a tool_use response and run the matching function.
Return the result and let the model finish its answer.
Understand the agent loop: think → call tool → observe → respond.

Warm-Up · The Model Asks, You Do

5 min

User:  "What's 4823 × 1947?"
Model: (can't reliably multiply) → asks to call calculate("4823 * 1947")
You:   run the function → 9,390,381
Model: "4823 × 1947 = 9,390,381"

Today's big idea

Tools give the LLM hands. It still can't run code — it just requests a tool call with arguments. YOUR code executes it (so you stay in control of what's allowed) and returns the result. The model then weaves it into its reply. That request-execute-return loop is the heart of every AI agent.

New Concept · Defining & Running Tools

14 min

1. Describe the tool

tools = [{
    "name": "calculate",
    "description": "Evaluate a basic arithmetic expression and return the number.",
    "input_schema": {
        "type": "object",
        "properties": {
            "expression": {"type": "string",
                           "description": "e.g. '4823 * 1947'"}
        },
        "required": ["expression"],
    },
}]

The description is critical — it's how the model decides when to use the tool. Write it like docs for a colleague.

2. Send the request with tools

import anthropic
client = anthropic.Anthropic()

messages = [{"role": "user", "content": "What is 4823 * 1947?"}]
resp = client.messages.create(
    model="claude-haiku-4-5", max_tokens=500,
    tools=tools, messages=messages,
)
print(resp.stop_reason)   # 'tool_use' if it wants a tool

3. Run the tool and return the result

def run_tool(name, args):
    if name == "calculate":
        # in real code, parse safely — don't eval untrusted input!
        return str(eval(args["expression"]))
    return "unknown tool"

if resp.stop_reason == "tool_use":
    # keep the model's turn in the history, tool_use blocks included
    messages.append({"role": "assistant", "content": resp.content})

    # answer every tool_use block; each result is matched to its request by id
    results = []
    for block in resp.content:
        if block.type == "tool_use":
            results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": run_tool(block.name, block.input),
            })
    messages.append({"role": "user", "content": results})

    final = client.messages.create(model="claude-haiku-4-5", max_tokens=500,
                                   tools=tools, messages=messages)
    print(final.content[0].text)

The loop

1. send user message + tool definitions
2. model replies with text OR a tool_use request
3. if tool_use: run it, append tool_result, call again
4. repeat until the model returns a final text answer
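
The four steps above can be tried without a network call by swapping the real client for a scripted stand-in. This is only a sketch: FakeModel, the add tool, and the "call_1" id are hypothetical names invented for the illustration. They imitate the response shape used in this lesson (stop_reason plus a list of content blocks) and are not part of the anthropic library. The max_turns cap is also an assumption, added so a confused model cannot loop forever.

```python
from types import SimpleNamespace

def run_tool(name, args):
    # Hypothetical tool for the sketch: adds two numbers.
    if name == "add":
        return str(args["a"] + args["b"])
    return "unknown tool"

class FakeModel:
    """Scripted stand-in for the real client (an assumption, not the anthropic API)."""
    def __init__(self):
        self.calls = 0

    def create(self, messages):
        self.calls += 1
        last = messages[-1]["content"]
        # If the latest message carries tool results, answer with plain text.
        if isinstance(last, list) and last and last[0].get("type") == "tool_result":
            text = f"The answer is {last[0]['content']}."
            return SimpleNamespace(stop_reason="end_turn",
                                   content=[SimpleNamespace(type="text", text=text)])
        # Otherwise request a tool call.
        block = SimpleNamespace(type="tool_use", id="call_1", name="add",
                                input={"a": 2, "b": 3})
        return SimpleNamespace(stop_reason="tool_use", content=[block])

def agent_loop(model, question, max_turns=5):
    messages = [{"role": "user", "content": question}]           # step 1
    for _ in range(max_turns):                                    # cap the loop
        resp = model.create(messages)                             # step 2
        if resp.stop_reason != "tool_use":
            return "".join(b.text for b in resp.content if b.type == "text")
        messages.append({"role": "assistant", "content": resp.content})
        results = [{"type": "tool_result", "tool_use_id": b.id,
                    "content": run_tool(b.name, b.input)}
                   for b in resp.content if b.type == "tool_use"]  # step 3
        messages.append({"role": "user", "content": results})     # step 4: go again
    return "stopped: too many tool calls"

model = FakeModel()
answer = agent_loop(model, "What is 2 + 3?")
print(answer)       # The answer is 5.
print(model.calls)  # 2 (one tool request, one final answer)
```

With the real client, only the model.create line would change; the message bookkeeping stays the same.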

Safety: never let the model trigger dangerous actions blindly. You decide which tools exist, and you validate their inputs. The eval above is fine for a class demo, but the expression is written by the model, so real code should treat it as untrusted input and parse it safely instead.
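
One way to make "you decide which tools exist and validate their inputs" concrete is an allow-list that maps each tool name to its function and its expected arguments. The sketch below is an illustration under assumptions: run_tool_checked, ALLOWED_TOOLS, and the word_count tool are hypothetical names, and the type check is deliberately simple (a real project might validate against the full JSON schema instead).

```python
from datetime import datetime

def get_time(args):
    # Hypothetical tool: current local time as HH:MM.
    return datetime.now().strftime("%H:%M")

def word_count(args):
    # Hypothetical tool: number of whitespace-separated words.
    return str(len(args["text"].split()))

# Allow-list: tool name -> (function, {argument name: expected type})
ALLOWED_TOOLS = {
    "get_time": (get_time, {}),
    "word_count": (word_count, {"text": str}),
}

def run_tool_checked(name, args):
    """Run a tool only if it is allow-listed and its arguments look right."""
    if name not in ALLOWED_TOOLS:
        return f"error: unknown tool {name!r}"
    func, required = ALLOWED_TOOLS[name]
    if not isinstance(args, dict):
        return "error: arguments must be an object"
    for key, expected_type in required.items():
        if key not in args:
            return f"error: missing argument {key!r}"
        if not isinstance(args[key], expected_type):
            return f"error: argument {key!r} must be {expected_type.__name__}"
    extra = set(args) - set(required)
    if extra:
        return f"error: unexpected arguments {sorted(extra)}"
    return func(args)

print(run_tool_checked("word_count", {"text": "tools give hands"}))  # 3
print(run_tool_checked("delete_files", {}))  # error: unknown tool 'delete_files'
```

Returning the error as a string, rather than raising, lets you send it back as the tool_result so the model can see what went wrong and try again.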

Worked Example · A Calculator Agent

12 min

# tool_agent.py — LLM that can do reliable maths via a tool
import anthropic
client = anthropic.Anthropic()

tools = [{
    "name": "calculate",
    "description": "Compute a math expression. Use for any arithmetic.",
    "input_schema": {
        "type": "object",
        "properties": {"expression": {"type": "string"}},
        "required": ["expression"],
    },
}]

def run_tool(name, args):
    if name == "calculate":
        try:
            return str(eval(args["expression"], {"__builtins__": {}}))
        except Exception as e:
            return f"error: {e}"
    return "unknown tool"

def ask(question, model="claude-haiku-4-5"):
    messages = [{"role": "user", "content": question}]
    while True:
        resp = client.messages.create(model=model, max_tokens=500,
                                      tools=tools, messages=messages)
        if resp.stop_reason != "tool_use":
            return resp.content[0].text
        # handle every tool the model requested this turn
        messages.append({"role": "assistant", "content": resp.content})
        results = []
        for block in resp.content:
            if block.type == "tool_use":
                out = run_tool(block.name, block.input)
                print(f"  [tool] {block.name}({block.input}) = {out}")
                results.append({"type": "tool_result",
                                "tool_use_id": block.id, "content": out})
        messages.append({"role": "user", "content": results})

print(ask("What is 4823 * 1947, and is that more than 9 million?"))

Sample output

  [tool] calculate({'expression': '4823 * 1947'}) = 9390381
4823 × 1947 = 9,390,381. Yes, that's more than 9 million (it's about 9.39 million).

Read the diff

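The while loop is the agent loop: it keeps running tools until the model is ready to answer. The model correctly chose to call the calculator, then reasoned about the result. We passed eval an empty builtins table ({"__builtins__": {}}), which blocks direct calls such as open() or __import__() but is not a true sandbox; a crafted expression can still reach other objects through attribute access, so real code should parse the expression instead of calling eval. This exact pattern (define tools, loop, validate) scales up to web search, database queries, and real agents.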
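
Passing an empty __builtins__ to eval narrows what an expression can reach, but it is not a real sandbox. One safer option, sketched here as an assumption rather than the lesson's own method, is to parse the expression with Python's ast module and allow only numbers and the four basic operators. The name safe_calculate is hypothetical; it returns a string so it can replace the eval line inside run_tool.

```python
import ast
import operator

# Only these operators are allowed; anything else is rejected.
_BINARY = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}

def safe_calculate(expression):
    """Evaluate +, -, *, / on plain numbers without calling eval."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        # Accept int and float literals only (not strings, not booleans).
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](walk(node.operand))
        # Names, calls, attribute access, ** and everything else land here.
        raise ValueError("unsupported expression")
    try:
        return str(walk(ast.parse(expression, mode="eval")))
    except (ValueError, SyntaxError, ZeroDivisionError, RecursionError) as e:
        return f"error: {e}"

print(safe_calculate("4823 * 1947"))                    # 9390381
print(safe_calculate("__import__('os').system('ls')"))  # error: unsupported expression
```

Exponentiation is left out on purpose: an input like 9 ** 9 ** 9 would tie up the program, which is exactly the kind of request a tool should refuse.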

Try It Yourself

13 min

01 🟢 Add a tool

Add a get_time tool that returns the current time. Ask "what time is it?" and watch the model use it.

02 🟡 Weather tool

Wrap the Level-4 open-meteo weather function as a tool. Ask "should I bring an umbrella in KL?" and let the model fetch + reason.

03 🔴 Two tools, one question

Give it calculate + get_time and ask something needing both ("how many minutes until midnight, times 3?"). Confirm the loop handles multiple tool calls.

Mini-Challenge · A Mini Database Agent

8 min

Give the model a lookup_student(name) tool that reads from a SQLite DB (Level 4!). Ask "what's Aisyah's score?" in natural language and let the agent fetch and answer. You've just built a natural-language database interface.

Show the tool definition

tools = [{
    "name": "lookup_student",
    "description": "Look up a student's score by name from the database.",
    "input_schema": {
        "type": "object",
        "properties": {"name": {"type": "string"}},
        "required": ["name"],
    },
}]

import sqlite3

def run_tool(name, args):
    if name == "lookup_student":
        con = sqlite3.connect("school.db")
        try:
            # ? placeholder: the name is passed as data, never spliced into the SQL
            row = con.execute("SELECT score FROM students WHERE name=?",
                              (args["name"],)).fetchone()
        finally:
            con.close()
        return str(row[0]) if row else "not found"
    return "unknown tool"

Non-negotiables: use parameterised SQL (no injection), and never let the model touch the DB directly; only your validated function does.
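
To see why the ? placeholder matters, the sketch below runs the same lookup against a throwaway in-memory database. The rows and scores are made up for the illustration, and make_demo_db is a hypothetical helper; the real exercise would use your own school.db from Level 4.

```python
import sqlite3

def make_demo_db():
    # In-memory database with made-up rows, used only for this sketch.
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE students (name TEXT, score INTEGER)")
    con.executemany("INSERT INTO students VALUES (?, ?)",
                    [("Aisyah", 91), ("Ben", 78)])
    con.commit()
    return con

def lookup_student(con, name):
    # The ? placeholder sends name as data, never as SQL text.
    row = con.execute("SELECT score FROM students WHERE name = ?",
                      (name,)).fetchone()
    return str(row[0]) if row else "not found"

con = make_demo_db()
print(lookup_student(con, "Aisyah"))        # 91
print(lookup_student(con, "x' OR '1'='1"))  # not found
```

The second call is a classic injection attempt. Because the whole string is compared against the name column as a value, it matches no row instead of rewriting the query.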

Recap

3 min

Tools give the LLM hands. You define name + description + input schema; the model requests a call; your code runs it and returns a tool_result; loop until the model answers. You stay in control of what tools exist and validate inputs. This request-execute-return loop is the foundation of every AI agent. Next: make replies feel alive with streaming.

Vocabulary Card

tool use: The model requesting that a defined function be run with arguments.
input_schema: JSON schema describing a tool's arguments.
tool_result: The message you send back containing the function's output.
agent loop: Repeating think → call tool → observe → respond until done.

Homework

4 min

Build a 2-tool agent (e.g., calculator + a lookup from a CSV/DB you have). Ask 3 natural-language questions that require the tools. Print each tool call. One paragraph: where did the model choose tools well, and where did it stumble?
