A fully functional Layer 4 TCP load balancer built from scratch in Go — no external dependencies, pure standard library. Includes a multi-type client simulator, isolated server instances, a transparent TCP proxy with three load balancing algorithms, a background health checker, and a complete Docker Compose setup for simulating horizontal scaling.
A Layer 4 proxy operates at the transport layer. It accepts TCP connections from clients and forwards the raw byte stream to a backend server — it never reads or parses the application protocol. From the client's perspective it is talking directly to a server. From each server's perspective all connections come from the LB's IP.
```
┌───────────────────────────────────────────────────────────────┐
│  lb-net (Docker bridge)                                        │
│                                                                │
│  ┌──────────┐     TCP      ┌──────────────┐                    │
│  │ client   │ ───────────▶ │   lb:9000    │                    │
│  │ simulator│              │              │                    │
│  └──────────┘              │ • listener   │                    │
│                            │ • picker     │ ──▶ server-1:8080  │
│                            │ • proxy      │ ──▶ server-2:8081  │
│                            │ • health     │ ──▶ server-3:8082  │
│                            └──────────────┘                    │
└───────────────────────────────────────────────────────────────┘
```
| Component | Package | Binary | Role |
|---|---|---|---|
| Client Simulator | `pkgs/clients` | `cmd/client` | Spawns 5 client types at a configurable rate |
| Load Balancer | `pkgs/lb` | `cmd/lb` | Accepts connections, picks a backend, proxies bytes |
| Server | `pkgs/server` | `cmd/server` | Handles connections, simulates delay, enforces limits |
| Algorithm | Flag | Behaviour |
|---|---|---|
| Round Robin | `rr` | Cycles through backends in order — atomic cursor, no locks |
| Least Connections | `lc` | Picks the backend with the fewest active connections |
| Random | `rand` | Uniform random selection |
Select the algorithm with the `LB_ALGO` env var; when changed via the CLI in local mode, no restart is needed.
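As a rough illustration of the three strategies, here is a minimal sketch assuming a hypothetical `Backend` type with an atomic connection counter; the real types and picker logic live in `pkgs/lb/types.go` and `pkgs/lb/picker.go` and may differ in detail.

```go
package lb

import (
	"math/rand"
	"sync/atomic"
)

// Backend is a hypothetical stand-in for the real type in pkgs/lb/types.go.
type Backend struct {
	Addr        string
	ActiveConns atomic.Int64
}

// rrCursor is the shared round-robin cursor: one atomic increment, no lock.
var rrCursor atomic.Uint64

// pickRoundRobin cycles through backends in order.
func pickRoundRobin(backends []*Backend) *Backend {
	n := rrCursor.Add(1)
	return backends[(n-1)%uint64(len(backends))]
}

// pickLeastConns returns the backend with the fewest active connections.
func pickLeastConns(backends []*Backend) *Backend {
	best := backends[0]
	for _, b := range backends[1:] {
		if b.ActiveConns.Load() < best.ActiveConns.Load() {
			best = b
		}
	}
	return best
}

// pickRandom selects a backend uniformly at random.
func pickRandom(backends []*Backend) *Backend {
	return backends[rand.Intn(len(backends))]
}
```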
| Type | Behaviour | Models |
|---|---|---|
| `realClient` | REQ/ACK loop, random iterations | Typical HTTP keep-alive user |
| `fireAndForget` | Single PING, immediate disconnect | Health probe, UDP-like sender |
| `streamClient` | Continuous stream until cancelled | Log shipper, metrics agent |
| `idleClient` | Holds connection open, sends nothing | Zombie / leaked connection |
| `slowClient` | Sends one byte at a time with delays | Slowloris, bad-network client |
Clients are selected via weighted random dispatch — weights are defined in `pkgs/clients/launcher.go`.
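A minimal sketch of weighted random dispatch, with a simplified `clientFunc` signature and illustrative names; the actual weight table and signature live in `pkgs/clients/launcher.go` and `pkgs/clients/types.go`.

```go
package clients

import "math/rand"

// clientFunc is a simplified stand-in for the real signature in pkgs/clients/types.go.
type clientFunc func()

// weightedClient pairs a client implementation with its dispatch weight.
type weightedClient struct {
	weight int
	run    clientFunc
}

// pickClient returns a client type with probability proportional to its weight.
func pickClient(table []weightedClient) clientFunc {
	total := 0
	for _, w := range table {
		total += w.weight
	}
	r := rand.Intn(total) // 0 <= r < total
	for _, w := range table {
		if r < w.weight {
			return w.run
		}
		r -= w.weight
	}
	return table[len(table)-1].run // unreachable with positive weights
}
```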
```
L4-proxy-LB/
├── cmd/
│   ├── client/
│   │   ├── main.go        # client simulator entrypoint
│   │   └── Dockerfile
│   ├── lb/
│   │   ├── main.go        # load balancer entrypoint
│   │   └── Dockerfile
│   └── server/
│       ├── main.go        # server instance entrypoint
│       └── Dockerfile
├── pkgs/
│   ├── clients/
│   │   ├── types.go       # contextKey, clientFunc type
│   │   ├── clients.go     # 5 client implementations
│   │   └── launcher.go    # weighted dispatch, constant spawner
│   ├── lb/
│   │   ├── types.go       # Backend, Registry, Config, Algorithm
│   │   ├── picker.go      # round-robin, least-conn, random
│   │   ├── health.go      # background TCP health checker
│   │   ├── proxy.go       # bidirectional io.Copy proxy
│   │   └── lb.go          # listener loop, handleClient
│   └── server/
│       ├── types.go       # Config (delay, maxConns, timeout), Stats
│       └── server.go      # listener, handleConn, protocol dispatch
```
Zero external dependencies — the entire system uses only Go's standard library: `net`, `sync/atomic`, `io`, `context` — that's it. No frameworks, no packages.
Atomics over mutexes on the hot path — every counter (`activeConns`, `totalBytes`, `healthy`, `algo`) is an `atomic.Int64` or `atomic.Int32`. The picker reads them on every connection without ever taking a lock.
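A sketch of that counter pattern, with illustrative field and method names; the real fields live on the types in `pkgs/lb/types.go`.

```go
package lb

import "sync/atomic"

// backendStats shows the lock-free counter pattern; names are illustrative.
type backendStats struct {
	activeConns atomic.Int64
	totalBytes  atomic.Int64
	healthy     atomic.Int32 // 1 = up, 0 = down
}

// onConnOpen runs when a proxied connection starts.
func (s *backendStats) onConnOpen() { s.activeConns.Add(1) }

// onConnClose runs when it ends; n is the number of bytes forwarded.
func (s *backendStats) onConnClose(n int64) {
	s.activeConns.Add(-1)
	s.totalBytes.Add(n)
}

// isHealthy is read by the picker on every connection, lock-free.
func (s *backendStats) isHealthy() bool { return s.healthy.Load() == 1 }
```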
Context as the shutdown bus — one `cancel()` call in `main` propagates to every goroutine in the tree. No channels, no manual signalling, no waitgroups exposed to `main`.
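A sketch of the pattern, assuming signal-driven shutdown; the actual wiring in `cmd/lb/main.go` may differ in detail.

```go
package main

import (
	"context"
	"os/signal"
	"syscall"
)

func main() {
	// One root context, cancelled on SIGINT/SIGTERM; that single cancellation
	// fans out to every goroutine that was handed ctx.
	ctx, cancel := signal.NotifyContext(context.Background(), syscall.SIGINT, syscall.SIGTERM)
	defer cancel()

	go acceptLoop(ctx) // stops accepting new connections when ctx is done
	go healthLoop(ctx) // stops probing backends when ctx is done

	<-ctx.Done() // block until shutdown is requested
}

func acceptLoop(ctx context.Context) { <-ctx.Done() /* close the listener here */ }
func healthLoop(ctx context.Context) { <-ctx.Done() /* stop the ticker here */ }
```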
L4, not L7 — the proxy never reads protocol bytes. `io.Copy` forwards the raw stream. This means it works with any TCP protocol — REQ/ACK, streaming, slow clients — without modification.
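A simplified sketch of a bidirectional `io.Copy` proxy in that spirit; the real implementation is in `pkgs/lb/proxy.go` and also handles errors and per-backend counters that are omitted here.

```go
package lb

import (
	"io"
	"net"
	"sync"
)

// proxy pipes raw bytes in both directions and returns when both copies finish.
func proxy(client, backend net.Conn) {
	var wg sync.WaitGroup
	wg.Add(2)

	go func() {
		defer wg.Done()
		io.Copy(backend, client) // client -> backend, no protocol parsing
		if tc, ok := backend.(*net.TCPConn); ok {
			tc.CloseWrite() // propagate EOF so the backend sees the half-close
		}
	}()
	go func() {
		defer wg.Done()
		io.Copy(client, backend) // backend -> client
		if tc, ok := client.(*net.TCPConn); ok {
			tc.CloseWrite()
		}
	}()

	wg.Wait()
	client.Close()
	backend.Close()
}
```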
Per-instance isolation — each server process is completely isolated. No shared memory, no shared state. This accurately models real horizontal scaling.
- Go 1.23 or later
- Docker + Docker Compose (for containerised mode)
```bash
git clone https://github.com/Asthraris/L4-proxy-LB.git
cd L4-proxy-LB
go build -o server ./cmd/server
go build -o lb ./cmd/lb
go build -o client ./cmd/client
```

Terminal 1 — server-1

```bash
SERVER_PORT=8080 SERVER_ID=server-1 SERVER_DELAY_MS=0 \
SERVER_MAX_CONNS=100 SERVER_TIMEOUT_SECS=30 ./server
```

Terminal 2 — server-2

```bash
SERVER_PORT=8081 SERVER_ID=server-2 SERVER_DELAY_MS=150 \
SERVER_MAX_CONNS=100 SERVER_TIMEOUT_SECS=30 ./server
```

Terminal 3 — server-3

```bash
SERVER_PORT=8082 SERVER_ID=server-3 SERVER_DELAY_MS=500 \
SERVER_MAX_CONNS=50 SERVER_TIMEOUT_SECS=30 ./server
```

Terminal 4 — load balancer

```bash
LB_PORT=9000 LB_ID=lb-1 LB_HEALTH_INTERVAL=3 \
LB_BACKENDS=localhost:8080,localhost:8081,localhost:8082 \
LB_ALGO=rr ./lb
```

Terminal 5 — client simulator

```bash
CLIENT_TARGET=localhost:9000 CLIENT_RATE=2 ./client
```

Containerised mode:

```bash
docker compose up --build
```

First run downloads base images and compiles all binaries inside Docker. Subsequent runs use the cache — startup is near instant.

Run detached and follow logs:

```bash
docker compose up -d
docker compose logs lb -f
docker compose logs server-n -f
docker compose logs client -f
```

Stop everything:

```bash
docker compose down
```

Rebuild after code changes:

```bash
docker compose down
docker compose up --build
```

Only rebuild a single service:

```bash
docker compose up --build server-2
```

Server
| Variable | Default | Description |
|---|---|---|
| `SERVER_PORT` | required | TCP port to listen on |
| `SERVER_ID` | required | Label shown in logs |
| `SERVER_DELAY_MS` | `0` | Artificial response delay in milliseconds |
| `SERVER_MAX_CONNS` | `0` | Max concurrent connections (0 = unlimited) |
| `SERVER_TIMEOUT_SECS` | `0` | Idle connection timeout in seconds (0 = none) |
Load Balancer

| Variable | Default | Description |
|---|---|---|
| `LB_PORT` | required | TCP port to listen on |
| `LB_ID` | required | Label shown in logs |
| `LB_HEALTH_INTERVAL` | `3` | Health check interval in seconds |
| `LB_BACKENDS` | required | Comma-separated backend addresses |
| `LB_ALGO` | `rr` | Algorithm: `rr`, `lc`, or `rand` |
Client

| Variable | Default | Description |
|---|---|---|
| `CLIENT_TARGET` | required | LB address to connect to |
| `CLIENT_RATE` | `2` | Clients spawned per second |
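For reference, a generic sketch of how variables like these can be read with defaults; the actual parsing lives in each `cmd/*/main.go` and may differ.

```go
package main

import (
	"log"
	"os"
	"strconv"
)

// envInt reads an integer environment variable, falling back to def when unset.
func envInt(key string, def int) int {
	v := os.Getenv(key)
	if v == "" {
		return def
	}
	n, err := strconv.Atoi(v)
	if err != nil {
		log.Fatalf("%s: %v", key, err)
	}
	return n
}

// envRequired reads a variable that has no default.
func envRequired(key string) string {
	v := os.Getenv(key)
	if v == "" {
		log.Fatalf("%s is required", key)
	}
	return v
}

func main() {
	port := envRequired("LB_PORT")
	interval := envInt("LB_HEALTH_INTERVAL", 3)
	log.Printf("listening on :%s, health interval %ds", port, interval)
}
```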
Start everything and watch the LB stats:
```bash
docker compose logs lb -f
```

After 30 seconds the `total` column across all three backends should be within 2-3 of each other. Equal distribution confirms round robin is working.
Stop server-1 mid-traffic:
```bash
docker compose stop server-1
```

Watch the LB log — within 3 seconds (one health interval):

```
[health] backend server-1:8080 DOWN
```
Traffic redistributes to server-2 and server-3 automatically. Restart server-1:
```bash
docker compose start server-1
```

Within 3 seconds it comes back UP and receives connections again.
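Conceptually, the health checker is a ticker that dials each backend and flips an atomic flag the picker reads. A simplified sketch along the lines of `pkgs/lb/health.go`; the type and field names here are illustrative.

```go
package lb

import (
	"context"
	"net"
	"sync/atomic"
	"time"
)

// probeTarget is a hypothetical stand-in for the real Backend in pkgs/lb/types.go.
type probeTarget struct {
	Addr    string
	Healthy atomic.Int32 // 1 = up, 0 = down; read by the picker
}

// healthLoop dials each backend every interval and flips its Healthy flag.
func healthLoop(ctx context.Context, backends []*probeTarget, interval time.Duration) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			for _, b := range backends {
				conn, err := net.DialTimeout("tcp", b.Addr, interval/2)
				if err != nil {
					b.Healthy.Store(0) // marked DOWN, picker skips it
					continue
				}
				conn.Close()
				b.Healthy.Store(1) // marked UP again
			}
		}
	}
}
```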
Increase server-3's delay by setting `SERVER_DELAY_MS=2000` in the compose file, then restart it:
```bash
docker compose up --build server-3
```

Then switch the LB algorithm by changing `LB_ALGO=lc` and restarting:
```bash
docker compose up --build lb
```

Watch the active counts — server-3's `active` stays high while it's busy with slow requests. The LB stops sending to it and floods server-1 and server-2 instead. That's least-connections working correctly.
Set `SERVER_MAX_CONNS=10` on server-2 and raise `CLIENT_RATE` to 20:
```bash
docker compose up --build
```

Server-2's logs will show:

```
[server-2] at capacity (10/10) — dropping ...
```
The LB continues routing to server-1 and server-3 unaffected.
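The capacity check is essentially a counter compared at accept time. A simplified sketch of the idea in `pkgs/server/server.go`, with illustrative names; the real code also applies the delay and idle timeout.

```go
package server

import (
	"log"
	"net"
	"sync/atomic"
)

// acceptLoop enforces a hard connection cap at accept time.
func acceptLoop(ln net.Listener, id string, maxConns int64) {
	var active atomic.Int64
	for {
		conn, err := ln.Accept()
		if err != nil {
			return
		}
		if maxConns > 0 && active.Load() >= maxConns {
			log.Printf("[%s] at capacity (%d/%d) — dropping %s",
				id, active.Load(), maxConns, conn.RemoteAddr())
			conn.Close()
			continue
		}
		active.Add(1)
		go func(c net.Conn) {
			defer func() { active.Add(-1); c.Close() }()
			// handle the connection here...
		}(conn)
	}
}
```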
Observability
- Prometheus `/metrics` endpoint on a separate port — exposes `active_conns`, `total_conns`, `bytes_forwarded`, `health_status` per backend as scrapeable metrics
- Grafana dashboard consuming those metrics — live graphs of connection distribution across backends
- `pprof` endpoint for CPU and memory profiling under load — attach with `go tool pprof` to find goroutine bottlenecks
Load Balancer
- Weighted backends — adding `localhost:8080 weight=3` routes 3× more traffic to stronger instances
- Connection draining on `rm` — stop new connections, wait for active ones to finish before removing
- Sticky sessions — hash client IP to always route the same client to the same backend (useful for stateful protocols)
- Retry on dial failure with exponential backoff — safe only for idempotent protocols
- Max connections per backend soft cap with queue — hold client connections in a FIFO channel until a slot opens, drop after configurable timeout
Server
- Chaos mode — `SERVER_CHAOS_PCT=10` randomly drops 10% of connections immediately after accept — simulates a flaky backend
- Hang mode — accepts the connection but delays the first response by N seconds — simulates an overloaded server that is alive but not responding
- HTTP control endpoint on a separate port — `curl -X POST :9001/delay -d "500"` changes the delay without restarting