Psionic

Psionic is a Rust-native ML and inference stack.

It owns the machine-facing execution substrate for local inference, serving, training, distributed execution, artifact truth, and clustered compute. The project is broader than any single app or benchmark lane: it is the crate family that OpenAgents uses for inference, training, cluster bring-up, and execution evidence.

Main Tracks

Local GPT-OSS Inference

Psionic ships a dedicated local GPT-OSS server in crates/psionic-serve/src/bin/psionic-gpt-oss-server.rs. It exposes:

  • GET /health
  • GET /v1/models
  • POST /v1/chat/completions
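
As a sketch of handling the models endpoint, model ids can be pulled out of a /v1/models reply with jq. The list shape below is an assumption based on the common OpenAI-compatible schema; the source only names the route.

```shell
# Hypothetical /v1/models response in the common OpenAI-compatible list shape
# (an assumption; only the route name is documented here).
models='{"object":"list","data":[{"id":"gpt-oss-20b-mxfp4.gguf"}]}'
echo "$models" | jq -r '.data[].id'
```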

Build it:

cargo build -p psionic-serve --bin psionic-gpt-oss-server --release

Run it on a Linux NVIDIA host:

./target/release/psionic-gpt-oss-server \
  -m /path/to/gpt-oss-20b-mxfp4.gguf \
  --backend cuda \
  --host 127.0.0.1 \
  --port 8080 \
  -c 4096 \
  -ngl 999

Run it on Apple Silicon:

./target/release/psionic-gpt-oss-server \
  -m /path/to/gpt-oss-20b-mxfp4.gguf \
  --backend metal \
  --metal-mode native \
  --host 127.0.0.1 \
  --port 8080 \
  -c 1024 \
  -ngl 4

Call it:

curl -s http://127.0.0.1:8080/v1/models | jq

curl -s http://127.0.0.1:8080/v1/chat/completions \
  -H 'content-type: application/json' \
  -d '{
    "model": "gpt-oss-20b-mxfp4.gguf",
    "messages": [
      {"role": "system", "content": "You are ChatGPT."},
      {"role": "user", "content": "Why does HTTPS matter?"}
    ]
  }' | jq
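
To pull just the assistant's visible reply out of a completion response, one option is a jq filter over the standard choices array. The fixture below is a stand-in response for illustration, not real server output.

```shell
# Stand-in chat completion response; real output comes from the curl call above.
response='{"choices":[{"message":{"role":"assistant","content":"HTTPS encrypts traffic."}}]}'
echo "$response" | jq -r '.choices[0].message.content'
```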

Benchmark it against local llama.cpp:

scripts/benchmark-gpt-oss-vs-llama.sh \
  --psionic-backend cuda \
  --model /path/to/gpt-oss-20b-mxfp4.gguf \
  --llama-bin /path/to/llama-server \
  --json-out /tmp/psionic-gpt-oss-bench

More detail lives in docs/GPT_OSS_LOCAL_SERVING.md.

GPT-OSS Benchmark Proof

The current benchmark harness is scripts/benchmark-gpt-oss-vs-llama.sh. It uses the explicit GPT-OSS system/developer/user request contract, checks visible output equality, and records prompt-cache-hit throughput.

The closed benchmark proof referenced publicly here is:

  • OpenAgents issue comment: openagents#3248 comment 4028968842
  • exact reported result on that host:
    • Psionic prompt_cache_hit: 172.84 tok/s
    • llama.cpp prompt_cache_hit: 160.98 tok/s
    • prompt_cache_hit_visible_output_match=true
  • visible output: "HTTPS protects users by encrypting traffic, preventing tampering, and confirming they are connected to the right website."
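
For context, the gap between those two prompt_cache_hit numbers works out as below; this is just a ratio of the reported throughputs, not an additional measurement.

```shell
# Relative speedup implied by the reported numbers: 172.84 vs 160.98 tok/s.
awk 'BEGIN { printf "%.1f%% faster\n", (172.84 / 160.98 - 1) * 100 }'
```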

That proof is grounded in the shipped server binary, the shipped benchmark script, and the explicit hardware-validation posture in docs/HARDWARE_VALIDATION_MATRIX.md.

Project Shape

The main crate families are:

  • framework core: psionic-core, psionic-ir, psionic-compiler, psionic-runtime
  • backends: psionic-backend-cpu, psionic-backend-cuda, psionic-backend-metal
  • serving and provider surfaces: psionic-serve, psionic-provider, psionic-router
  • cluster and distributed execution: psionic-cluster, psionic-collectives, psionic-distributed, psionic-net
  • training and eval: psionic-train, psionic-data, psionic-eval, psionic-adapters

Use docs/WORKSPACE_MAP.md for the full doc index, crate map, and subsystem entrypoints.
