Psionic is a Rust-native ML and inference stack.
It owns the machine-facing execution substrate behind local inference, serving, training, distributed execution, artifact truth, and clustered compute. The project is broader than one app or one benchmark lane. It is the crate family that OpenAgents uses for inference, training, cluster bring-up, and execution evidence.
- System architecture: docs/ARCHITECTURE.md
- Detailed workspace map: docs/WORKSPACE_MAP.md
- Inference and serving: docs/INFERENCE_ENGINE.md
- Training system: docs/TRAIN_SYSTEM.md
- Full roadmap: docs/ROADMAP.md
- Inference and local serving
- local GPT-OSS server and benchmark harness
- generic OpenAI-compatible server surfaces
- hardware validation and backend truth
- start with docs/GPT_OSS_LOCAL_SERVING.md
- Parameter Golf and distributed training
- single-H100, distributed
8xH100, submission, evidence, and score-path work - start with docs/ROADMAP_PARAMETERGOLF.md
- supporting docs: docs/PARAMETER_GOLF_SINGLE_H100_TRAINER.md, docs/PARAMETER_GOLF_DISTRIBUTED_8XH100.md, docs/PARAMETER_GOLF_RUNPOD_8XH100_RUNBOOK.md
- single-H100, distributed
- Cluster, swarm, and cross-provider compute
- local mixed-hardware swarm, Google dual-node swarm, cross-provider training contracts
- start with docs/ROADMAP_CLUSTER.md
- supporting docs: docs/FIRST_SWARM_TRUSTED_LAN_RUNBOOK.md, docs/PSION_GOOGLE_TWO_NODE_SWARM_RUNBOOK.md, docs/TRAIN_ARTIFACT_STORAGE_REFERENCE.md
- Psion learned-model program
- corpus, tokenizer, pretrain, trusted-cluster, and decentralized contribution work
- start with docs/PSION_PROGRAM_MAP.md
- supporting docs: docs/PSION_PRETRAIN_STAGE.md, docs/PSION_TRUSTED_CLUSTER_RUN.md, docs/PSION_DECENTRALIZED_CONTRIBUTION.md
- Tassadar executor lane
- executor-class research and runtime work for exact computation
- start with docs/ROADMAP_TASSADAR.md
Psionic ships a dedicated local GPT-OSS server in
crates/psionic-serve/src/bin/psionic-gpt-oss-server.rs. It exposes:
GET /healthGET /v1/modelsPOST /v1/chat/completions
Build it:
cargo build -p psionic-serve --bin psionic-gpt-oss-server --releaseRun it on a Linux NVIDIA host:
./target/release/psionic-gpt-oss-server \
-m /path/to/gpt-oss-20b-mxfp4.gguf \
--backend cuda \
--host 127.0.0.1 \
--port 8080 \
-c 4096 \
-ngl 999Run it on Apple Silicon:
./target/release/psionic-gpt-oss-server \
-m /path/to/gpt-oss-20b-mxfp4.gguf \
--backend metal \
--metal-mode native \
--host 127.0.0.1 \
--port 8080 \
-c 1024 \
-ngl 4Call it:
curl -s http://127.0.0.1:8080/v1/models | jq
curl -s http://127.0.0.1:8080/v1/chat/completions \
-H 'content-type: application/json' \
-d '{
"model": "gpt-oss-20b-mxfp4.gguf",
"messages": [
{"role": "system", "content": "You are ChatGPT."},
{"role": "user", "content": "Why does HTTPS matter?"}
]
}' | jqBenchmark it against local llama.cpp:
scripts/benchmark-gpt-oss-vs-llama.sh \
--psionic-backend cuda \
--model /path/to/gpt-oss-20b-mxfp4.gguf \
--llama-bin /path/to/llama-server \
--json-out /tmp/psionic-gpt-oss-benchMore detail lives in docs/GPT_OSS_LOCAL_SERVING.md.
The current benchmark harness is scripts/benchmark-gpt-oss-vs-llama.sh. It
uses the explicit GPT-OSS system/developer/user request contract, checks visible
output equality, and records prompt-cache-hit throughput.
The closed benchmark proof referenced publicly here is:
- OpenAgents issue comment: openagents#3248 comment 4028968842
- exact reported result on that host:
- Psionic
prompt_cache_hit:172.84 tok/s llama.cpp prompt_cache_hit:160.98 tok/sprompt_cache_hit_visible_output_match=true- visible output:
HTTPS protects users by encrypting traffic, preventing tampering, and confirming they are connected to the right website.
- Psionic
That proof is grounded in the shipped server binary, the shipped benchmark script, and the explicit hardware-validation posture in docs/HARDWARE_VALIDATION_MATRIX.md.
The main crate families are:
- framework core:
psionic-core,psionic-ir,psionic-compiler,psionic-runtime - backends:
psionic-backend-cpu,psionic-backend-cuda,psionic-backend-metal - serving and provider surfaces:
psionic-serve,psionic-provider,psionic-router - cluster and distributed execution:
psionic-cluster,psionic-collectives,psionic-distributed,psionic-net - training and eval:
psionic-train,psionic-data,psionic-eval,psionic-adapters
Use docs/WORKSPACE_MAP.md for the full doc index, crate map, and subsystem entrypoints.