Learning Goals
3 min
By the end of this lesson, you can:
- Create a TCP socket and connect to a host:port.
- Send and receive bytes, encoding/decoding text correctly.
- Handle partial reads — TCP is a stream, not messages.
- Use timeouts and context managers so connections never hang or leak.
Warm-Up · Under requests Is a Socket
5 min
When you called requests.get("https://...") in Level 7, here is what happened underneath: a socket connected to the server's IP on port 443, the channel was TLS-encrypted, the literal HTTP bytes were sent, and the reply was read back as raw bytes. requests hid all of that. Today we lift the lid.
A socket is an endpoint for sending/receiving bytes over the network. The client flow is always the same four steps: create → connect → send/recv → close. Everything is bytes, not strings, so you encode before sending and decode after receiving. And TCP is a stream: one recv may return part of a message, so you must loop. Master this and scanners, chat servers, and protocol tools all become approachable.
New Concept · The Client Lifecycle
14 min
Create → connect → send → recv → close
    import socket

    # 1. create: AF_INET = IPv4, SOCK_STREAM = TCP
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(5)                    # never hang forever

    # 2. connect to host:port
    s.connect(("127.0.0.1", 9000))

    # 3. send bytes (encode text first!)
    s.sendall(b"hello\n")              # sendall ensures it all goes

    # 4. receive bytes (decode after)
    data = s.recv(1024)                # read up to 1024 bytes
    print("got:", data.decode())

    # 5. close
    s.close()
Use a context manager so it always closes, even on error:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(5)
        s.connect(("127.0.0.1", 9000))
        s.sendall(b"hello\n")
        print(s.recv(1024).decode())
    # socket auto-closed here
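The snippets above assume something is already listening on 127.0.0.1:9000. If nothing is, a throwaway echo server is enough to try them out. The following is only a sketch under our own assumptions, not the lesson's server material: `start_echo_server` and `echo_once` are hypothetical helper names, and the server binds to port 0 so the OS picks a free port instead of 9000.

```python
import socket
import threading


def start_echo_server(host="127.0.0.1", port=0):
    """Start a one-connection echo server in a background thread (hypothetical helper).

    Returns the listening socket and the port that was actually bound.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(1)
    bound_port = listener.getsockname()[1]

    def serve_once():
        conn, _addr = listener.accept()
        with conn:
            while True:
                chunk = conn.recv(4096)
                if not chunk:          # b"" means the client closed its side
                    break
                conn.sendall(chunk)    # echo the bytes straight back

    threading.Thread(target=serve_once, daemon=True).start()
    return listener, bound_port


def echo_once(port, payload):
    """The client lifecycle from the lesson, pointed at the throwaway server."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(5)
        s.connect(("127.0.0.1", port))
        s.sendall(payload)
        return s.recv(1024)            # may be partial in general; fine for a tiny demo


listener, port = start_echo_server()
reply = echo_once(port, b"hello\n")
print("got:", reply.decode())
listener.close()
```

Binding to port 0 avoids clashing with anything else on your machine; pass `port=9000` if you want the earlier snippets to work unchanged.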
Bytes, not strings
    text = "héllo"
    raw = text.encode("utf-8")       # str → bytes (b'h\xc3\xa9llo')
    back = raw.decode("utf-8")       # bytes → str

    # s is a connected socket, as in the examples above
    s.sendall(text.encode("utf-8"))  # always encode before sending
The network only moves bytes. Forgetting to .encode() raises TypeError; decoding with the wrong codec either mangles non-ASCII text or raises UnicodeDecodeError. Default to UTF-8 everywhere.
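Both failure modes are easy to see without a server. The sketch below uses socket.socketpair(), which returns two already-connected sockets, purely as a local test rig (our choice for the demo, not part of the lesson's client flow). It sends a str by mistake, then decodes UTF-8 bytes with the wrong codec.

```python
import socket

text = "héllo"
raw = text.encode("utf-8")

# Wrong codec: no exception here, the text is silently mangled.
mangled = raw.decode("latin-1")
print(repr(mangled))

error = None
a, b = socket.socketpair()   # two connected sockets for local experiments
with a, b:
    try:
        a.sendall(text)      # str instead of bytes
    except TypeError as exc:
        error = str(exc)
        print("TypeError:", error)

    a.sendall(raw)           # correct: bytes on the wire
    received = b.recv(1024)

print(received.decode("utf-8"))
```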
TCP is a stream — loop your reads
TCP delivers a continuous byte stream. A single recv(1024) might return half a message, or two messages glued together. You need a rule for "message complete": read until the connection closes, until you hit a delimiter (like \n), or until you've read a length you were told up front.
    def recv_all(sock) -> bytes:
        """Read until the peer closes the connection."""
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:              # b"" means the peer closed
                break
            chunks.append(chunk)
        return b"".join(chunks)

    def recv_line(sock) -> str:
        """Read until a newline delimiter."""
        buf = b""
        while not buf.endswith(b"\n"):
            chunk = sock.recv(1)       # one byte at a time: simple, but slow
            if not chunk:
                break
            buf += chunk
        return buf.decode().rstrip("\n")
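recv_line above reads one byte per call, which is easy to reason about but costs a system call per byte. One possible alternative, sketched here under our own assumptions, is to read larger chunks and keep the leftover bytes in a buffer between calls. `LineReader` is a hypothetical name, and socket.socketpair() stands in for a real connection so the glued and split messages are reproducible.

```python
import socket
from typing import Optional


class LineReader:
    """Buffered newline framing over a socket (hypothetical helper)."""

    def __init__(self, sock: socket.socket) -> None:
        self.sock = sock
        self.buf = b""

    def read_line(self) -> Optional[str]:
        """Return the next line without its newline, or None once the peer has closed."""
        while b"\n" not in self.buf:
            chunk = self.sock.recv(4096)
            if not chunk:
                # Peer closed: hand back any unterminated tail first, then None.
                if self.buf:
                    tail, self.buf = self.buf, b""
                    return tail.decode("utf-8")
                return None
            self.buf += chunk
        line, _, self.buf = self.buf.partition(b"\n")
        return line.decode("utf-8")


a, b = socket.socketpair()
with a, b:
    a.sendall(b"first\nsec")      # one and a half messages in a single send
    a.sendall(b"ond\n")           # the rest of the second message
    a.shutdown(socket.SHUT_WR)    # signal end of stream
    reader = LineReader(b)
    lines = [reader.read_line(), reader.read_line(), reader.read_line()]

print(lines)  # ['first', 'second', None]
```

The buffer is what makes this safe: bytes that arrive after a newline belong to the next message and must not be thrown away.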
Timeouts & errors
    try:
        with socket.create_connection(("127.0.0.1", 9000), timeout=5) as s:
            s.sendall(b"ping\n")
            print(s.recv(1024).decode())
    except socket.timeout:
        print("connection timed out")
    except ConnectionRefusedError:
        print("nothing is listening there")
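socket.create_connection is a tidy shortcut that creates and connects in one call, and its timeout also applies to later send and recv calls on that socket. Always set a timeout and handle ConnectionRefusedError (no server) and socket.timeout (no answer; since Python 3.10 this is an alias of the built-in TimeoutError). Security tools especially must never hang.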
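To see each outcome on your own machine, the sketch below classifies a connection attempt and then forces a read timeout. `probe` is a hypothetical helper name. The demo assumes that connecting to a just-closed localhost port is refused immediately, which is the usual behavior but can vary by operating system.

```python
import socket


def probe(host, port, timeout=1.0):
    """Classify a connection attempt: 'open', 'refused', 'timeout', or 'error' (hypothetical helper)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError:                    # any other network-level failure
        return "error"


# A listener on an OS-chosen free port, so there is something to connect to.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]

open_result = probe("127.0.0.1", port)
listener.close()
closed_result = probe("127.0.0.1", port)   # nothing listens here any more

# A read timeout: the peer is connected but never sends anything.
a, b = socket.socketpair()
with a, b:
    b.settimeout(0.2)
    try:
        b.recv(1024)
        timed_out = False
    except socket.timeout:
        timed_out = True

print(open_result, closed_result, timed_out)
```

The specific exceptions are caught before OSError because both are subclasses of it; reversing the order would hide them.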
Worked Example · A Banner-Grabbing Client (localhost)
12 min
Goal: connect to a service and read the "banner" it sends on connect (many services announce their software/version). We'll also speak raw HTTP to a local server, which shows that requests is ultimately bytes over a socket. Localhost only.
    import socket

    def grab_banner(host: str, port: int, timeout: float = 3.0) -> str:
        """Read whatever a service sends on connect. LOCALHOST/own hosts only."""
        try:
            # create_connection applies the timeout to connect and to later recv calls
            with socket.create_connection((host, port), timeout=timeout) as s:
                return s.recv(1024).decode(errors="replace").strip()
        except OSError as e:  # includes socket.timeout and ConnectionRefusedError
            return f"(no banner: {e})"

    def http_get_raw(host: str, port: int, path: str = "/") -> str:
        """Speak HTTP by hand: what requests.get does under the hood."""
        request = (f"GET {path} HTTP/1.1\r\n"
                   f"Host: {host}\r\n"
                   f"User-Agent: socket-demo/1.0\r\n"
                   f"Connection: close\r\n\r\n")   # blank line ends headers
        with socket.create_connection((host, port), timeout=5) as s:
            s.sendall(request.encode())
            # read until the server closes (Connection: close)
            chunks = []
            while True:
                chunk = s.recv(4096)
                if not chunk:
                    break
                chunks.append(chunk)
        return b"".join(chunks).decode(errors="replace")

    # against your own running dev server:
    raw = http_get_raw("127.0.0.1", 3000, "/")
    print(raw.split("\r\n\r\n")[0])   # just the status line + headers
    HTTP/1.1 200 OK
    Content-Type: text/html; charset=utf-8
    ...
    # You just performed an HTTP request with nothing but a socket and bytes.
Read the code
The http_get_raw function demystifies the whole web stack: an HTTP request is just text bytes (method, path, headers, a blank line) sent over a TCP socket, and the response is bytes you read until the server closes. requests wraps exactly this, plus TLS, redirects, and parsing. grab_banner shows the security angle — services often reveal their version on connect, which is recon (and a reason to suppress banners defensively). We stay on 127.0.0.1 per the ethics rules.
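The "parsing" that requests does for you starts with splitting the response at the blank line. A minimal sketch of that first step follows; `split_http_response` is a hypothetical helper, and the sample response is made up for illustration rather than captured from a real server. It ignores details a real parser must handle, such as repeated headers and chunked bodies.

```python
def split_http_response(raw: str):
    """Split a raw HTTP response into (status_line, headers, body). Hypothetical helper."""
    head, _, body = raw.partition("\r\n\r\n")      # blank line separates headers from body
    status_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalize them to lowercase.
        # A repeated header simply overwrites the earlier value in this sketch.
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


# Illustrative response text only, not real server output.
sample = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html; charset=utf-8\r\n"
    "Content-Length: 5\r\n"
    "\r\n"
    "hello"
)

status_line, headers, body = split_http_response(sample)
print(status_line)
print(headers["content-type"])
print(repr(body))
```

You could pass the string returned by http_get_raw straight into this function to inspect a response from your own dev server.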
Try It Yourself
13 min
With your dev server running, use http_get_raw to fetch / from 127.0.0.1. Print only the response headers (everything before the first blank line). Confirm you see the status line.
Write a client that connects to a host:port, sends a line, and reads a single line back using recv_line. Add a timeout and handle a refused connection gracefully.
Hint
    import socket

    def ask(host, port, message):
        # recv_line is the helper from "TCP is a stream" above
        with socket.create_connection((host, port), timeout=5) as s:
            s.sendall((message + "\n").encode())
            return recv_line(s)
Fetch the same local URL two ways — with your raw socket client and with requests.get — and compare the headers. Explain in comments what requests added that you had to do by hand (encoding, the blank-line terminator, reading until close).
Hint
    import requests

    r = requests.get("http://127.0.0.1:3000/")
    print("requests headers:", dict(r.headers))

    # Differences: requests built the request line + Host header for me,
    # encoded/decoded bytes, knew when the body ended (Content-Length),
    # and would have handled TLS/redirects. My socket did none of that.
Mini-Challenge · A Length-Prefixed Protocol Client
8 min
Real protocols frame messages so the receiver knows where each one ends. Build a client that sends and receives length-prefixed messages: a 4-byte unsigned length in network (big-endian) byte order, then that many bytes of payload. This solves the "TCP is a stream" problem cleanly (and you'll use it in the chat server).
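Before writing the client, it can help to look at the bytes of a single frame. The short sketch below assumes the length is packed with struct's "!I" format (network byte order, 4-byte unsigned integer) and shows the frame for the payload "hello".

```python
import struct

payload = "hello".encode("utf-8")
header = struct.pack("!I", len(payload))   # "!" = network (big-endian), "I" = 4-byte unsigned int
frame = header + payload
print(frame)  # b'\x00\x00\x00\x05hello'

# The receiver reverses the process: read 4 bytes, unpack, then read that many more.
(length,) = struct.unpack("!I", frame[:4])
print(length, frame[4:4 + length])
```

Note that the length counts encoded bytes, not characters, so a non-ASCII message has a length larger than its character count.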
Sample solution
    import socket, struct

    def send_msg(sock, text: str) -> None:
        data = text.encode("utf-8")
        sock.sendall(struct.pack("!I", len(data)) + data)   # 4-byte length + payload

    def recv_exact(sock, n: int) -> bytes:
        buf = b""
        while len(buf) < n:
            chunk = sock.recv(n - len(buf))
            if not chunk:
                raise ConnectionError("peer closed early")
            buf += chunk
        return buf

    def recv_msg(sock) -> str:
        raw_len = recv_exact(sock, 4)
        (length,) = struct.unpack("!I", raw_len)
        return recv_exact(sock, length).decode("utf-8")

    # usage (against a matching server you run on localhost):
    # with socket.create_connection(("127.0.0.1", 9000)) as s:
    #     send_msg(s, "hello"); print(recv_msg(s))
Non-negotiables: 4-byte length prefix, recv_exact loop (handles partial reads), clean framing both ways.
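One thing the framing rule leaves open is how much to trust the declared length: a 4-byte prefix can claim up to about 4 GiB, and a receiver that believes it will try to read that much. The sketch below adds a size cap and exercises the framing over socket.socketpair(). `MAX_MSG` and `recv_msg_capped` are hypothetical names, and the 1 MiB limit is an arbitrary assumption, not something the lesson specifies.

```python
import socket
import struct

MAX_MSG = 1 << 20  # hypothetical 1 MiB cap; choose what your protocol needs


def send_msg(sock, text):
    data = text.encode("utf-8")
    sock.sendall(struct.pack("!I", len(data)) + data)


def recv_exact(sock, n):
    """Loop until exactly n bytes have arrived, or fail if the peer closes first."""
    chunks = []
    remaining = n
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed early")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_msg_capped(sock, max_len=MAX_MSG):
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    if length > max_len:
        # Refuse before reading (or allocating) anything for the payload.
        raise ValueError(f"declared length {length} exceeds cap {max_len}")
    return recv_exact(sock, length).decode("utf-8")


a, b = socket.socketpair()
with a, b:
    send_msg(a, "hello")
    send_msg(a, "wörld")                         # two frames back to back in one stream
    a.sendall(struct.pack("!I", MAX_MSG + 1))    # a header that lies about its size
    messages = [recv_msg_capped(b), recv_msg_capped(b)]
    try:
        recv_msg_capped(b)
        rejected = False
    except ValueError:
        rejected = True

print(messages, rejected)  # ['hello', 'wörld'] True
```

After an oversized header the stream position is no longer trustworthy, so in practice the sensible reaction is to close the connection rather than try to resynchronize.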
Recap
3 min
A TCP client follows one lifecycle: create → connect → send → recv → close (use a context manager, for example with socket.create_connection(...), so the socket always closes). Everything is bytes: encode before sending, decode after receiving. TCP is a stream, so one recv may be partial: loop until close, until a delimiter, or until a known length (length-prefixing is the clean solution). Always set a timeout and handle ConnectionRefusedError/socket.timeout. Speaking raw HTTP over a socket shows that requests is just this plus TLS and parsing. Next: the server side.
Vocabulary Card
- socket: An endpoint for sending/receiving bytes over the network.
- SOCK_STREAM: A TCP socket (reliable, ordered byte stream).
- sendall / recv: Send all given bytes / read up to N bytes (possibly fewer).
- framing: A rule (delimiter or length prefix) marking where a message ends.
Homework
4 min
Build a small reusable sockclient.py module with create_connection-based helpers: send_line/recv_line and the length-prefixed send_msg/recv_msg, all with timeouts and clean error handling. Use it to fetch a raw HTTP response from your local dev server and to talk to a length-prefixed echo server (you'll build the matching server next lesson). Note what breaks if you forget to loop on recv.
Sample · sockclient.py (core)
    import socket, struct

    def connect(host, port, timeout=5):
        return socket.create_connection((host, port), timeout=timeout)

    def send_line(sock, text):
        sock.sendall((text + "\n").encode())

    def recv_line(sock):
        buf = b""
        while not buf.endswith(b"\n"):
            c = sock.recv(1)
            if not c:
                break
            buf += c
        return buf.decode().rstrip("\n")

    def send_msg(sock, text):
        data = text.encode()
        sock.sendall(struct.pack("!I", len(data)) + data)

    def _recv_exact(sock, n):
        buf = b""
        while len(buf) < n:
            c = sock.recv(n - len(buf))
            if not c:
                raise ConnectionError("closed early")
            buf += c
        return buf

    def recv_msg(sock):
        (n,) = struct.unpack("!I", _recv_exact(sock, 4))
        return _recv_exact(sock, n).decode()
Forgetting to loop on recv: a single recv(1024) on a 5 KB HTTP response returns at most the first 1,024 bytes, so you would parse a truncated, broken response. TCP is a stream; you must keep reading until the message is complete.
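The truncation is easy to reproduce locally. In the sketch below, socket.socketpair() stands in for the server connection (an assumption made to keep the demo self-contained) and a 5,000-byte body plays the role of the HTTP response.

```python
import socket

body = b"x" * 5000                 # stand-in for a ~5 KB response

a, b = socket.socketpair()
with a, b:
    a.sendall(body)
    a.shutdown(socket.SHUT_WR)     # the "server" is done sending

    first = b.recv(1024)           # what a single recv gives you
    chunks = [first]
    while True:                    # what the loop gives you
        chunk = b.recv(1024)
        if not chunk:
            break
        chunks.append(chunk)

total = b"".join(chunks)
print("single recv:", len(first), "bytes; looped:", len(total), "bytes")
```

The single call can never return more than the 1,024 bytes requested, while the loop recovers the whole body.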
Non-negotiables: timeout + error handling, both line and length-prefixed helpers, and the "must loop on recv" lesson stated.