Merged
53 changes: 44 additions & 9 deletions CONTRIBUTING.md
@@ -16,39 +16,74 @@ Thank you for your interest in contributing to celeris!
- Go 1.26+
- [Mage](https://magefile.org) build tool: `go install github.com/magefile/mage@latest`
- Linux (for io_uring/epoll engine tests) or macOS (std engine only)
- [golangci-lint](https://golangci-lint.run/) v2.9+
- [h2spec](https://github.com/summerwind/h2spec) (installed automatically by `mage tools`)

### Build & Test

```bash
mage lint # Run linters (golangci-lint)
mage test # Run tests with race detection
mage spec # Run h2spec + h1spec compliance tests
mage fuzz # Run fuzz tests (30s default, set FUZZ_TIME to override)
mage check # Run all checks: lint + test + spec + build
```

### Linux Testing from macOS

The io_uring and epoll engines only compile and run on Linux. Use the Multipass VM targets to test from macOS:

```bash
mage testLinux # Run tests with race detection in a Linux VM
mage specLinux # Run h2spec + h1spec in a Linux VM
mage checkLinux # Full verification suite in a Linux VM
mage benchLinux # Run benchmarks in a Linux VM
```

The VM is created automatically on first use (Ubuntu Noble, 4 CPUs, 4 GB RAM). Use `mage vmStop` / `mage vmDelete` to manage it.

### Benchmarking

```bash
mage localBenchmark # A/B benchmark: main vs current branch (Multipass VM)
mage localProfile # Capture pprof profiles for main vs current branch
mage bench # Run Go benchmarks (local, any OS)
```

### Available Mage Targets

```bash
mage -l # List all available targets
```

## Pull Requests

- Keep PRs focused on a single change
- Include tests for new functionality
- Run `mage check` before submitting (or `mage checkLinux` if touching engine code)
- Follow existing code style (enforced by golangci-lint)
- Write clear commit messages following the `type: description` format (e.g., `feat:`, `fix:`, `perf:`, `docs:`, `test:`, `chore:`)
- Security-sensitive changes should note the CWE number in the commit message

## Code Style

- Follow standard Go conventions
- Only add comments where the logic is not self-evident
- Use `mage lint` to verify (runs golangci-lint across darwin + linux)
- Hot-path code: avoid allocations, avoid `defer` for Lock/Unlock, prefer inline fast-paths with fallback to function calls

## Testing

- Unit tests go in `*_test.go` files alongside the code
- Integration tests (multi-engine, adaptive, etc.) go in `test/integration/`
- Conformance tests (HTTP/1.1 RFC 9112, HTTP/2 h2spec) go in `test/spec/`
- Use the `celeristest` package for Context/ResponseRecorder test helpers
- Run with `-race` flag (all CI runs use race detection)

## Reporting Issues

Use GitHub Issues with the provided templates for bug reports and feature requests.

For security vulnerabilities, see [SECURITY.md](SECURITY.md).
143 changes: 46 additions & 97 deletions README.md
@@ -9,11 +9,9 @@ Ultra-low latency Go HTTP engine with a protocol-aware dual-architecture (io_uri

## Highlights

- **3.3M+ HTTP/2 requests/sec** on a single 8-vCPU machine (arm64 Graviton3)
- **590K+ HTTP/1.1 requests/sec** — 81% syscall-bound, zero allocations on the hot path
- **H2 is 5.7x faster than H1** thanks to stream multiplexing and inline handler execution
- **io_uring and epoll at parity** — both engines deliver equivalent throughput
- **Zero hot-path allocations** for both H1 and H2
- **Designed for throughput** — see [benchmarks](https://goceleris.dev/benchmarks) for current numbers

## Features

@@ -26,7 +24,15 @@ Ultra-low latency Go HTTP engine with a protocol-aware dual-architecture (io_uri
- **Error-returning handlers** — `HandlerFunc` returns `error`; structured `HTTPError` for status codes
- **Serialization** — JSON and XML response methods (`JSON`, `XML`); Protocol Buffers available via [`github.com/goceleris/middlewares`](https://github.com/goceleris/middlewares); `Bind` auto-detects request format from Content-Type
- **net/http compatibility** — wrap existing `http.Handler` via `celeris.Adapt()`
- **Streaming responses** — `Detach()` + `StreamWriter` for SSE and chunked responses on native engines
- **Connection hijacking** — `Hijack()` for WebSocket and custom protocol upgrades (H1 only)
- **Configurable body limits** — `MaxRequestBodySize` enforced across H1, H2, and the bridge
- **100-continue control** — `OnExpectContinue` callback for upload validation before body transfer
- **Accept control** — `PauseAccept()`/`ResumeAccept()` for graceful load shedding
- **Content negotiation** — `Negotiate`, `Respond`, `AcceptsEncodings`, `AcceptsLanguages`
- **Response buffering** — `BufferResponse`/`FlushResponse` for transform middleware (compress, ETag)
- **Zero-downtime restart** — `InheritListener` + `StartWithListener` for socket inheritance
- **Built-in metrics** — atomic counters, CPU utilization sampling, always-on `Server.Collector().Snapshot()`

## Quick Start

@@ -85,48 +91,7 @@ v2.GET("/items", listItemsV2)

## Middleware

Middleware is provided by the [`goceleris/middlewares`](https://github.com/goceleris/middlewares) module — one subpackage per middleware, individually importable.

```go
import (
"github.com/goceleris/middlewares/logger"
"github.com/goceleris/middlewares/recovery"
"github.com/goceleris/middlewares/cors"
"github.com/goceleris/middlewares/ratelimit"
)

s := celeris.New(celeris.Config{Addr: ":8080"})
s.Use(recovery.New())
s.Use(logger.New(slog.Default()))

api := s.Group("/api")
api.Use(ratelimit.New(1000))
api.Use(cors.New(cors.Config{
AllowOrigins: []string{"https://example.com"},
}))
```

See the [middlewares repo](https://github.com/goceleris/middlewares) for the full list: Logger, Recovery, CORS, RateLimit, RequestID, Timeout, BodyLimit, BasicAuth, JWT, CSRF, Session, Metrics, Debug, Compress, and more.

### Writing Custom Middleware

Middleware is just a `HandlerFunc` that calls `c.Next()`:

```go
func Timing() celeris.HandlerFunc {
return func(c *celeris.Context) error {
start := time.Now()
err := c.Next()
dur := time.Since(start)
slog.Info("request", "path", c.Path(), "duration", dur, "error", err)
return err
}
}

s.Use(Timing())
```

The `error` returned by `c.Next()` is the first non-nil error from any downstream handler. Middleware can inspect, wrap, or swallow the error before returning.

## Error Handling

@@ -161,16 +126,16 @@ func ErrorLogger() celeris.HandlerFunc {

```go
s := celeris.New(celeris.Config{
Addr: ":8080",
Protocol: celeris.Auto, // HTTP1, H2C, or Auto
Engine: celeris.Adaptive, // IOUring, Epoll, Adaptive, or Std
Workers: 8,
ReadTimeout: 30 * time.Second,
WriteTimeout: 30 * time.Second,
IdleTimeout: 120 * time.Second,
ShutdownTimeout: 10 * time.Second, // max wait for in-flight requests (default 30s)
MaxRequestBodySize: 50 << 20, // 50 MB (default 100 MB, -1 for unlimited)
Logger: slog.Default(),
})
```

@@ -188,7 +153,7 @@ s.GET("/func", celeris.AdaptFunc(func(w http.ResponseWriter, r *http.Request) {
}))
```

The bridge buffers the adapted handler's response in memory, capped at `MaxRequestBodySize` (default 100 MB). Responses exceeding this limit return an error.

## Engine Selection

@@ -201,14 +166,6 @@ The bridge buffers the adapted handler's response in memory, capped at **100 MB*

Use Adaptive (the default on Linux) unless you have a specific reason to pin an engine. On non-Linux platforms, only Std is available.

## Performance Profiles

| Profile | Optimizes For | Key Tuning |
|---------|---------------|------------|
| `celeris.Latency` | P99 tail latency | TCP_NODELAY, small batches, SO_BUSY_POLL |
| `celeris.Throughput` | Max RPS | Large CQ batches, write batching |
| `celeris.Balanced` | Mixed workloads | Default settings |

## Graceful Shutdown

Use `StartWithContext` for production deployments. When the context is canceled, the server drains in-flight requests up to `ShutdownTimeout` (default 30s).
@@ -234,7 +191,15 @@ The core provides a lightweight metrics collector accessible via `Server.Collect

```go
snap := server.Collector().Snapshot()
fmt.Println(snap.RequestsTotal, snap.ErrorsTotal, snap.ActiveConns, snap.CPUUtilization)
```

For CPU utilization tracking, register a monitor:

```go
mon, _ := observe.NewCPUMonitor()
server.Collector().SetCPUMonitor(mon)
defer mon.Close()
```

For Prometheus exposition and debug endpoints, use the [`middlewares/metrics`](https://github.com/goceleris/middlewares) and [`middlewares/debug`](https://github.com/goceleris/middlewares) packages.
@@ -252,50 +217,33 @@ For Prometheus exposition and debug endpoints, use the [`middlewares/metrics`](h
| Multishot recv | yes (6.0+) | no | no |
| Zero-alloc HEADERS | yes | yes | no |
| Inline H2 handlers | yes | yes | no |
| Detach / StreamWriter | yes | yes | yes |
| Connection hijack | yes | yes | yes |

## Benchmarks

Cloud benchmarks on arm64 c7g.2xlarge (8 vCPU Graviton3), separate server and client machines:

| Protocol | Engine | Throughput |
|----------|--------|-----------|
| HTTP/2 | epoll | **3.33M rps** |
| HTTP/2 | io_uring | **3.30M rps** |
| HTTP/1.1 | epoll | **590K rps** |
| HTTP/1.1 | io_uring | **590K rps** |
For current benchmark results and methodology, see [goceleris.dev/benchmarks](https://goceleris.dev/benchmarks).

- io_uring and epoll within **1%** of each other on both protocols
- H2 is **5.7x faster** than H1 (stream multiplexing + inline handlers)
- Zero allocations on the hot path for both H1 and H2
- All 3 engines within **0.3%** of each other in adaptive mode

Methodology: 14 server configurations (io_uring/epoll/std x latency/throughput/balanced x H1/H2) tested with `wrk` (H1, 16384 connections) and `h2load` (H2, 128 connections x 128 streams) in 9-pass interleaved runs. Full results and reproduction scripts are in the [benchmarks repo](https://github.com/goceleris/benchmarks).

## API Overview

| Type | Package | Description |
|------|---------|-------------|
| `Server` | `celeris` | Top-level entry point; owns config, router, engine |
| `Config` | `celeris` | Server configuration (addr, engine, protocol, timeouts) |
| `Config` | `celeris` | Server configuration (addr, engine, protocol, timeouts, body limits) |
| `Context` | `celeris` | Per-request context with params, headers, body, response methods |
| `HandlerFunc` | `celeris` | `func(*Context) error` — handler/middleware signature |
| `HTTPError` | `celeris` | Structured error carrying HTTP status code and message |
| `RouteGroup` | `celeris` | Group of routes sharing a prefix and middleware |
| `Route` | `celeris` | Opaque handle to a registered route |
| `RouteInfo` | `celeris` | Describes a registered route (method, path, handler count) |
| `Cookie` | `celeris` | HTTP cookie for `SetCookie`/`DeleteCookie` |
| `StreamWriter` | `celeris` | Incremental response writer for streaming/SSE |
| `EngineInfo` | `celeris` | Read-only info about the running engine (type + metrics) |
| `Collector` | `observe` | Lock-free request metrics aggregator |
| `Snapshot` | `observe` | Point-in-time copy of all collected metrics |
| `CPUMonitor` | `observe` | CPU utilization sampler (`/proc/stat` on Linux, `runtime/metrics` elsewhere) |

## Architecture

```mermaid
block-beta
columns 3
A["celeris (public API)"]:3
B["adaptive"]:1 C["observe"]:2
E["engine/iouring"]:1 F["engine/epoll"]:1 G["engine/std"]:1
H["protocol/h1"]:1 I["protocol/h2"]:1 J["protocol/detect"]:1
K["probe"]:1 L["resource"]:1 M["internal"]:1
```

## Requirements

@@ -308,12 +256,13 @@

```
adaptive/ Adaptive meta-engine (Linux)
celeristest/ Test helpers (NewContext, ResponseRecorder)
engine/ Engine interface + implementations (iouring, epoll, std)
internal/ Shared internals (conn, cpumon, ctxkit, negotiate, platform, sockopts, timer)
observe/ Metrics collector, CPUMonitor, Snapshot
probe/ System capability detection
protocol/ Protocol parsers (h1, h2, detect)
resource/ Configuration, presets, objectives
resource/ Configuration, presets, defaults
test/ Conformance, spec compliance, integration, benchmarks
```

12 changes: 10 additions & 2 deletions SECURITY.md
@@ -4,11 +4,13 @@

| Version | Supported |
|----------------|--------------------|
| >= 1.2.0 | Yes |
| < 1.2.0 | No |

Only the latest minor release receives security patches. Upgrade to the latest version to ensure you have all fixes.

Versions prior to 1.2.0 contain known vulnerabilities in the HTTP response writer, cookie handling, header sanitization, and the `Detach()` streaming mechanism that were identified and fixed during the v1.2.0 security review process. We strongly recommend upgrading immediately.

## Reporting a Vulnerability

If you discover a security vulnerability in celeris, please report it responsibly.
@@ -32,3 +34,9 @@ This policy covers the core `github.com/goceleris/celeris` module, including:
- I/O engines (io_uring, epoll, std)
- Request routing and context handling
- The net/http bridge adapter
- Response header sanitization (CRLF, null bytes, cookie attributes)
- Connection lifecycle management (Detach, StreamWriter, pool safety)
- Body size enforcement (MaxRequestBodySize across H1, H2, bridge)
- Callback safety (OnExpectContinue, OnConnect, OnDisconnect)

The middleware ecosystem (`github.com/goceleris/middlewares`) has its own security policy.
2 changes: 1 addition & 1 deletion server.go
@@ -19,7 +19,7 @@ import (
)

// Version is the semantic version of the celeris module.
const Version = "1.2.0"

// ErrAlreadyStarted is returned when Start or StartWithContext is called on a
// server that is already running.