feat: add reproducible API benchmark suite by DYSfu · Pull Request #668 · SecureBananaLabs/bug-bounty

DYSfu · 2026-05-24T03:08:50Z

/claim #30

Summary

Adds a dependency-free benchmark runner under benchmarks/ for /health plus all mounted /api/* routes.
Captures p50/p95/p99 latency, p95 time to first byte (TTFB), sustained and peak requests per second (RPS), error rate, and status-code distribution per endpoint.
Writes JSON and Markdown results under benchmarks/results/, with a smoke regression gate in GitHub Actions.
Includes an OBS-recorded demo artifact at demos/api-benchmark-demo.mp4.
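
For reviewers who want to see how the per-endpoint metrics listed above can be derived from raw samples, here is a minimal sketch. It is not the code in benchmarks/: the nearest-rank percentile method, the `{ latencyMs, status }` sample shape, the function names, and the rule that only 5xx responses count as errors are all assumptions made for illustration.

```javascript
// Nearest-rank percentile over an ascending-sorted array of numbers.
// Integer math (p * n / 100) avoids floating-point drift in the rank.
function percentile(sorted, p) {
  if (sorted.length === 0) return NaN;
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  return sorted[Math.max(rank, 1) - 1];
}

// Hypothetical sample shape: { latencyMs: number, status: number }.
function summarize(samples) {
  const latencies = samples.map((s) => s.latencyMs).sort((a, b) => a - b);
  const statusCounts = {};
  let errors = 0;
  for (const { status } of samples) {
    statusCounts[status] = (statusCounts[status] || 0) + 1;
    if (status >= 500) errors += 1; // assumption: only 5xx counts as an error
  }
  return {
    p50: percentile(latencies, 50),
    p95: percentile(latencies, 95),
    p99: percentile(latencies, 99),
    errorRate: samples.length > 0 ? errors / samples.length : 0,
    statusCounts,
  };
}
```

Under nearest-rank, p95 and p99 both resolve to the slowest sample whenever a run has fewer than 20 samples, so short smoke runs cannot separate the two.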

Endpoint	p50 ms	p95 ms	p99 ms	TTFB p95 ms	Sustained RPS	Peak RPS
GET /health	0.79	1.81	1.81	1.7	1002.54	1443.17
POST /api/auth/register	1.24	2.34	2.34	2.15	708.13	1424.5
POST /api/auth/login	0.63	1.38	1.38	1.33	1325.66	2014.61
GET /api/auth/oauth/github/callback	0.35	0.88	0.88	0.84	2075.55	2980.99
POST /api/auth/refresh	0.31	0.4	0.4	0.37	3037.44	3573.56
GET /api/users	0.28	0.28	0.28	0.26	3762.7	4109.59
POST /api/users	0.42	0.56	0.56	0.52	2310.14	2851.37
GET /api/jobs	0.3	0.41	0.41	0.37	3124.67	3576.22
POST /api/jobs	0.54	0.62	0.62	0.59	1830.86	1975.8
GET /api/proposals	0.32	0.45	0.45	0.42	3003.45	3908.16
POST /api/proposals	0.31	0.37	0.37	0.34	3032.14	3282.27
POST /api/payments	0.33	1.25	1.25	1.19	1952.97	3294.45
GET /api/reviews	0.3	0.41	0.41	0.38	3258.92	3953.88
POST /api/reviews	0.31	0.39	0.39	0.36	3087.93	3386.48
GET /api/messages	0.25	0.28	0.28	0.25	3915.94	4148.67
POST /api/messages	0.34	0.47	0.47	0.43	2689.08	3275.55
GET /api/notifications	0.25	0.36	0.36	0.34	3542.02	4137.93
POST /api/notifications	0.31	0.38	0.38	0.35	3076.29	3379.81
POST /api/uploads	0.89	2.89	2.89	2.86	827.96	1604.17
GET /api/search?q=contract	0.44	0.96	0.96	0.92	1851.88	3108.41
GET /api/admin/metrics	0.36	0.8	0.8	0.77	2196.31	3406.19
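
A smoke regression gate of the kind mentioned in the summary could compare a fresh run against rows like the ones above. The sketch below is one possible shape, not the gate shipped in this PR: `findRegressions`, the result-object layout, the 50% relative tolerance, and the 1 ms absolute floor are hypothetical. The absolute floor is there because most p95 values in the table are under 1 ms, where a purely relative threshold would trip on noise.

```javascript
// Hypothetical gate: flag an endpoint when its candidate p95 exceeds
// baseline p95 * (1 + tolerance) + floorMs, or when it is missing entirely.
function findRegressions(baseline, candidate, { tolerance = 0.5, floorMs = 1 } = {}) {
  const regressions = [];
  for (const [endpoint, base] of Object.entries(baseline)) {
    const current = candidate[endpoint];
    if (!current) {
      regressions.push({ endpoint, reason: "missing from candidate run" });
      continue;
    }
    const limit = base.p95 * (1 + tolerance) + floorMs;
    if (current.p95 > limit) {
      regressions.push({
        endpoint,
        reason: `p95 ${current.p95} ms exceeds limit ${limit.toFixed(2)} ms`,
      });
    }
  }
  return regressions;
}

// Baseline p95 values copied from two rows of the table above.
const baseline = {
  "GET /health": { p95: 1.81 },
  "POST /api/uploads": { p95: 2.89 },
};

// Made-up candidate numbers, for illustration only.
const candidate = {
  "GET /health": { p95: 2.1 },
  "POST /api/uploads": { p95: 9.0 },
};

const regressions = findRegressions(baseline, candidate);
```

In CI, a non-empty `regressions` array would typically be printed and followed by a non-zero exit code so the workflow step fails.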

Hardware

CPU model & core count: Apple M4 Pro, 12 cores
RAM (total & available during benchmark): 24 GB total, about 10 GB available before the run
Storage type (SSD / NVMe / HDD): local SSD
Network interface (Ethernet / WiFi / loopback): loopback for the benchmark target
Machine type (local workstation / cloud VM / CI runner — include instance type if cloud): local workstation
OS & version: macOS 26.4.1

Runtime

Node.js version (or relevant runtime): Node.js v22.22.0, npm 10.9.4
Any resource limits applied (Docker memory cap, cgroup limits, etc.): none applied
Other significant processes running during benchmark (yes / no — if yes, describe): yes, normal desktop/background processes
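
Most of the hardware and runtime fields above can be captured alongside the results rather than typed by hand. The following is a rough sketch using Node's built-in `os` module; `describeEnvironment` and its field names are hypothetical and are not part of this PR.

```javascript
const os = require("node:os");

// Collects a snapshot of the host and runtime for inclusion in a results file.
function describeEnvironment() {
  const cpus = os.cpus(); // one entry per logical core
  const toGiB = (bytes) => Number((bytes / 1024 ** 3).toFixed(1));
  return {
    cpuModel: cpus.length > 0 ? cpus[0].model : "unknown",
    cpuCount: cpus.length,
    totalMemGiB: toGiB(os.totalmem()),
    freeMemGiB: toGiB(os.freemem()),
    platform: os.platform(),
    osRelease: os.release(),
    nodeVersion: process.version,
  };
}

console.log(JSON.stringify(describeEnvironment(), null, 2));
```

Two caveats: `os.release()` returns the kernel release (the Darwin version on macOS), not the marketing OS version, and `os.freemem()` reports free memory, which can be lower than what the OS would consider available. Storage type and other running processes still need to be recorded manually.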

If submitted by or with an AI agent

Agent or tool name (e.g. Claude Code, Devin, Copilot Workspace, AutoGPT): OpenAI Codex desktop app
Underlying model and version (e.g. claude-sonnet-4-5, gpt-4o — if known): GPT-5-based Codex model
Inference provider (e.g. Anthropic, OpenAI, Azure, self-hosted): OpenAI
Orchestration framework if any (e.g. LangChain, AutoGen, custom): none beyond Codex shell/GitHub tooling
Execution mode (fully autonomous / human-supervised / human-initiated per step): human-initiated, agent-executed
Did the agent have shell/tool access during execution (yes / no): yes
Did the agent have internet access during execution (yes / no): yes
Were benchmark commands run by the agent directly or handed off to the human to run: run directly by the agent
Any known agent constraints or sandboxing that may have affected execution: local loopback target; benchmark results are for this local workstation rather than staging or production

feat: add reproducible API benchmark suite

e2f24db

github-actions Bot added a commit that referenced this pull request May 24, 2026

chore: update leaderboard for PR #668

6fa7bc1