These examples are intentionally split across two layers:
examples/*.yaml:void-controlexecution specsexamples/runtime-templates/*.yaml: runtime workflow templates launched byvoid-boxexamples/runtime-assets/: helper scripts and data mounted into runtime templates
Boundary:
void-controlowns orchestration concerns such asmode,policy,evaluation,variation,swarm, andsupervision.void-boxonly receives a runtime template path plus a patched per-candidate workflow spec. It does not need to understand swarm/supervision strategies.
swarm-transform-optimization-3way.yaml- canonical Transform-02 swarm test
- launches 3 sibling candidates in parallel
- most reliable live validation path today
swarm-transform-optimization.yaml- wide Transform-02 swarm stress example
- launches 8 sibling candidates in parallel
- scores
latency_p99_ms,error_rate, andcpu_pct
supervision-transform-review.yaml- canonical Transform-02 supervision example
- launches 3 worker runs under a flat supervisor
- finalizes once every worker emits
metrics.approved = 1.0
runtime-templates/transform_optimizer_agent.yaml- plain runtime template used by the swarm example
runtime-templates/transform_supervision_worker.yaml- runtime template used by the supervision example
Both Transform-02 swarm examples use measured metrics from a local fixture replay, not prompt-invented values.
Recommended test path:
- use
swarm-transform-optimization-3way.yamlfor routine live validation - use
swarm-transform-optimization.yamlonly when you explicitly want the wider 8-candidate stress case
Flow:
examples/swarm-transform-optimization-3way.yaml
->
void-control creates sibling candidates
->
void-box launches service-mode runs
->
each run mounts examples/runtime-assets read-only
->
python3 /workspace/runtime-assets/transform_benchmark.py
->
benchmark processes the same transform_02 fixture corpus
->
benchmark writes /workspace/output.json
->
void-control collects latency_p99_ms, error_rate, cpu_pct
->
weighted scoring picks the best candidate
Baseline:
- candidate 1 in both Transform-02 swarm specs is the baseline
- it uses
TRANSFORM_STRATEGY=baseline - every other candidate runs the same fixture corpus with different strategy env overrides
Metric source of truth:
examples/runtime-assets/transform_benchmark.pycomputes the metricslatency_p99_mscomes from measured per-record timingserror_ratecomes from actual validation/transform failures on the fixturescpu_pctcomes from measured process CPU time versus wall-clock time- the agent reads the measured result and may summarize it, but it must not invent or overwrite the metrics
These examples assume a sibling checkout layout:
<workspace>/void-box # the runtime repo
<workspace>/void-control # this repo
Substitute <workspace> for your local path (e.g. ~/dev/repos,
~/github, etc.). The runtime template mount path
../../void-control/examples/runtime-assets resolves to
<workspace>/void-control/examples/runtime-assets when the daemon is
started from <workspace>/void-box.
Build the production void-box rootfs in the sibling repo:
cd <workspace>/void-box
TMPDIR=$PWD/target/tmp scripts/build_claude_rootfs.shStart the void-box daemon (Linux):
cd <workspace>/void-box
export ANTHROPIC_API_KEY=sk-ant-... # or use provider: claude-personal, see below
export VOID_BOX_KERNEL=/boot/vmlinuz-$(uname -r)
export VOID_BOX_INITRAMFS=$PWD/target/void-box-rootfs.cpio.gz
cargo run --bin voidbox -- serve --listen 127.0.0.1:43100Start the void-box daemon (macOS, Apple Silicon):
cd <workspace>/void-box
export VOID_BOX_KERNEL=$PWD/target/vmlinuz
export VOID_BOX_INITRAMFS=$PWD/target/void-box-claude.cpio.gz
# ANTHROPIC_API_KEY is not needed if the runtime template sets
# `llm.provider: claude-personal` and you have a Claude subscription
# (credentials come from the macOS Keychain or ~/.claude/.credentials.json).
cargo run --bin voidbox -- serve --listen 127.0.0.1:43100Start the void-control bridge:
cd <workspace>/void-control
cargo run --features serde --bin voidctl -- serveThe checked-in runtime templates hardcode llm.provider: claude so CI keeps
using an API key. To opt into claude-personal without editing tracked
templates, set VOID_CONTROL_LLM_PROVIDER when starting the bridge (or any
other process that launches runs through void-control):
cd <workspace>/void-control
VOID_CONTROL_LLM_PROVIDER=claude-personal \
cargo run --features serde --bin voidctl -- serveThis patches llm.provider on every candidate's runtime template at launch
time. Per-candidate variation.explicit[].overrides still win, so you can
keep a mixed setup if needed.
Swarm:
cd <workspace>/void-control
curl -sS -X POST http://127.0.0.1:43210/v1/executions \
-H 'Content-Type: text/yaml' \
--data-binary @examples/swarm-transform-optimization-3way.yamlSupervision:
cd <workspace>/void-control
curl -sS -X POST http://127.0.0.1:43210/v1/executions \
-H 'Content-Type: text/yaml' \
--data-binary @examples/supervision-transform-review.yamlStress swarm:
cd <workspace>/void-control
curl -sS -X POST http://127.0.0.1:43210/v1/executions \
-H 'Content-Type: text/yaml' \
--data-binary @examples/swarm-transform-optimization.yaml- Start the UI:
cd <workspace>/void-control/web/void-control-ux
npm run dev -- --host 127.0.0.1 --port 3000- Open the Launch modal.
- Upload one of the top-level execution specs from
examples/. - Keep the referenced runtime template path unchanged unless you moved the repo.
- The transform runtime template uses
agent.output_file: /workspace/output.jsonsovoid-controlcan collect structured metrics directly. - The supervision worker template uses
examples/runtime-assets/transform_supervision_worker.pyto appendmetrics.approvedafter the measured benchmark run. That is the current v1 supervision approval contract. examples/runtime-templates/transform_optimizer_agent.yamlusessandbox.image: "python:3.12-slim"because the production Claude rootfs does not includepython3.- The transform benchmark and fixtures are mounted into the guest read-only from the host repo. Keep the canonical sibling repo layout unless you also update the mount path in the template.
- Candidate variation is applied by
void-controlby patching fields such asagent.promptandsandbox.env.*before launch. - The strategy labels in env vars are only inputs for the prompt/runtime
template.
void-boxdoes not interpret them semantically. - The 3-agent swarm is the preferred test target because it currently completes more reliably than the 8-agent stress variant while exercising the same control-plane and service-mode primitives.