Keep Codex Cheap

A cache-first Codex skill for reducing repeated prompt cost without changing how Codex thinks or works.

The idea is simple:

Keep the prefix stable. Keep the session sticky. Let official prompt caching do the discounting.

This repository is inspired by:

keep-codex-fast: safe local Codex state maintenance
XAI Router's Codex cache guidance: session-aware routing, stable prefixes, and low-rewrite Responses forwarding

It does not promise magic savings. It helps you inspect whether your Codex setup is likely to preserve prompt-cache hits across long coding sessions.

How It Becomes Cheap

Codex is expensive in long sessions because every turn can resend a large repeated prefix:

system and developer instructions
tool definitions and JSON schemas
repo rules such as AGENTS.md
stable project context
previous session and conversation identifiers

Prompt caching can make that repeated prefix cheaper, but only when the provider sees the same beginning of the request again. The cache is strict: similar text is not enough; the leading part of the request has to be identical for a match.
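To make the "same beginning" rule concrete, here is a minimal sketch that measures how much of two requests is shared from the front. It compares raw bytes, which is a simplification: real providers match on tokens and may only cache in chunks. The prompt strings and the `common_prefix_len` helper are hypothetical and are not part of this repository's CLI.

```rust
/// Returns the number of leading bytes two request bodies share.
/// Assumption: byte comparison stands in for the provider's token-level match.
fn common_prefix_len(a: &str, b: &str) -> usize {
    a.bytes().zip(b.bytes()).take_while(|(x, y)| x == y).count()
}

fn main() {
    // Hypothetical static context that is resent on every turn.
    let stable = "SYSTEM: You are a coding agent.\nTOOLS: shell, apply_patch\nRULES: see AGENTS.md\n";

    // Turn 2 only changes the task text at the end, so the stable block still matches.
    let turn1 = format!("{}USER: fix the failing test", stable);
    let turn2 = format!("{}USER: now add a changelog entry", stable);
    println!("append-only: {} shared bytes", common_prefix_len(&turn1, &turn2));

    // A timestamp at the front changes the very first bytes,
    // so the shared prefix collapses even though the text is similar.
    let turn3 = format!("TIME: 12:01\n{}USER: now add a changelog entry", stable);
    println!("edited front: {} shared bytes", common_prefix_len(&turn1, &turn3));
}
```

The second comparison is the case to avoid: a small edit near the start invalidates everything after it.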

keep-codex-cheap helps with the parts a user can control:

Stable provider: do not bounce the same task between providers or upstream keys.
Stable transport: prefer one Codex path for the task, especially the Responses API.
Stable session: WebSocket mode and session-aware routing make it easier for later turns to land on a route that already holds the cached prefix.
Stable model settings: model and reasoning effort changes can create different request buckets.
Stable static context: keep repeated rules and tool context at the front; put changing task details later when possible.
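One way to reason about the list above is to treat the settings that should stay fixed as a single "bucket" and count how often a task leaves it. The sketch below is illustrative only: the `RequestBucket` struct and its fields are hypothetical, and whether a given provider actually partitions its cache by exactly these fields is an assumption.

```rust
/// Hypothetical grouping of the settings a task should hold constant.
#[derive(Debug, Clone, PartialEq, Eq)]
struct RequestBucket {
    provider: String,
    wire_api: String,
    model: String,
    reasoning_effort: String,
}

impl RequestBucket {
    fn new(provider: &str, wire_api: &str, model: &str, reasoning_effort: &str) -> Self {
        RequestBucket {
            provider: provider.to_string(),
            wire_api: wire_api.to_string(),
            model: model.to_string(),
            reasoning_effort: reasoning_effort.to_string(),
        }
    }
}

/// Counts how many consecutive turns differ in any bucket field.
fn count_bucket_switches(turns: &[RequestBucket]) -> usize {
    turns.windows(2).filter(|pair| pair[0] != pair[1]).count()
}

fn main() {
    let turns = vec![
        RequestBucket::new("xai", "responses", "gpt-5.4", "xhigh"),
        RequestBucket::new("xai", "responses", "gpt-5.4", "xhigh"),
        // Dropping the reasoning effort for one turn leaves the bucket...
        RequestBucket::new("xai", "responses", "gpt-5.4", "medium"),
        // ...and switching back counts as a second change.
        RequestBucket::new("xai", "responses", "gpt-5.4", "xhigh"),
    ];
    println!("bucket switches: {}", count_bucket_switches(&turns));
}
```

A task that reports zero switches has kept every controllable setting stable; each switch is a turn that may have to rebuild its cache entry.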

The savings come from the provider charging or processing cached input more cheaply than uncached input. This repository does not hide context from Codex, truncate important instructions, or rewrite the model's task. It makes the official cache path easier to hit.

The rough mental model is:

same long prefix + same session route + Responses-compatible transport
  -> higher prompt-cache hit probability
  -> less repeated prefill work
  -> lower repeated-input cost and latency
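The arithmetic behind that mental model can be sketched as a small estimate. All numbers here are placeholders: the prices, the hit ratio, and the `session_input_cost` helper are hypothetical, and real billing depends on your provider's published rates and on how much of each request is actually served from cache.

```rust
/// Estimated cost of resending the prefix over a whole session.
/// Prices are per million input tokens, in whatever currency you use.
/// `hit_ratio` is the assumed fraction of prefix tokens served from cache.
fn session_input_cost(
    turns: u32,
    prefix_tokens: u64,
    hit_ratio: f64,
    price: f64,
    cached_price: f64,
) -> f64 {
    let total_tokens = turns as f64 * prefix_tokens as f64;
    let cached_tokens = total_tokens * hit_ratio;
    let uncached_tokens = total_tokens - cached_tokens;
    (uncached_tokens * price + cached_tokens * cached_price) / 1_000_000.0
}

fn main() {
    // Placeholder figures: 40 turns, 50k-token prefix, made-up prices.
    let (turns, prefix, price, cached_price) = (40, 50_000, 2.0, 0.5);

    let no_cache = session_input_cost(turns, prefix, 0.0, price, cached_price);
    let mostly_cached = session_input_cost(turns, prefix, 0.9, price, cached_price);

    println!("no cache hits:  {:.2}", no_cache);
    println!("90% cache hits: {:.2}", mostly_cached);
}
```

The estimate covers only the repeated prefix; new input and output tokens are billed as usual, which is why the savings are never total.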

What It Checks

The bundled Rust CLI is read-only by default. It inspects a Codex config.toml and reports:

whether the configured provider points at https://api.xairouter.com
whether wire_api = "responses" is set
whether WebSocket mode is enabled when you expect long sessions
whether env_key is configured and present in the current shell
whether model and reasoning settings are stable enough for repeat sessions
whether the config looks likely to drift between providers or transport modes
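For a sense of what such checks involve, here is a minimal std-only sketch of three of them. It is not the bundled CLI's implementation: `find_value` and `check_config` are hypothetical helpers, and the line scanner ignores TOML tables and inline comments, which a real TOML parser would handle.

```rust
/// Finds the first `key = value` line and returns the value without quotes.
/// Assumption: a flat line scan is enough for illustration.
fn find_value<'a>(config: &'a str, key: &str) -> Option<&'a str> {
    config.lines().find_map(|line| {
        let (k, v) = line.split_once('=')?;
        if k.trim() == key {
            Some(v.trim().trim_matches('"'))
        } else {
            None
        }
    })
}

/// Produces human-readable findings; never modifies the config.
fn check_config(config: &str) -> Vec<String> {
    let mut findings = Vec::new();
    match find_value(config, "wire_api") {
        Some("responses") => findings.push("ok: wire_api is responses".to_string()),
        Some(other) => findings.push(format!("warn: wire_api is {}", other)),
        None => findings.push("warn: wire_api is not set".to_string()),
    }
    match find_value(config, "base_url") {
        Some(url) if url.starts_with("https://api.xairouter.com") => {
            findings.push("ok: base_url points at api.xairouter.com".to_string())
        }
        Some(url) => findings.push(format!("info: base_url is {}", url)),
        None => findings.push("warn: base_url is not set".to_string()),
    }
    match find_value(config, "supports_websockets") {
        Some("true") => findings.push("ok: WebSocket mode is enabled".to_string()),
        _ => findings.push("info: WebSocket mode is not enabled".to_string()),
    }
    findings
}

fn main() {
    let config = "[model_providers.xai]\n\
                  base_url = \"https://api.xairouter.com\"\n\
                  wire_api = \"responses\"\n";
    for finding in check_config(config) {
        println!("{}", finding);
    }
}
```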

It also prints copy-ready HTTP and WebSocket configuration templates.

Quick Start

Ask Codex:

Use $keep-codex-cheap to inspect my Codex config and tell me whether it is prompt-cache friendly.

Or run the CLI directly:

cargo run --quiet

Print the recommended WebSocket template:

cargo run --quiet -- --print-ws-config

Print the simpler HTTP template:

cargo run --quiet -- --print-http-config

Inspect a custom config path:

cargo run --quiet -- --config /path/to/config.toml

Recommended WebSocket Config

Use this when you want stronger long-session continuity:

model_provider = "xai"
model = "gpt-5.4"
model_reasoning_effort = "xhigh"
plan_mode_reasoning_effort = "xhigh"
model_reasoning_summary = "none"
model_verbosity = "medium"
approval_policy = "never"
sandbox_mode = "danger-full-access"
suppress_unstable_features_warning = true

[model_providers.xai]
name = "OpenAI"
base_url = "https://api.xairouter.com"
wire_api = "responses"
requires_openai_auth = false
env_key = "XAI_API_KEY"
supports_websockets = true

[features]
responses_websockets_v2 = true

Recommended HTTP Config

Use this when you prefer a simpler, broadly compatible setup:

model_provider = "xai"
model = "gpt-5.4"
model_reasoning_effort = "xhigh"
plan_mode_reasoning_effort = "xhigh"
model_reasoning_summary = "none"
model_verbosity = "medium"
approval_policy = "never"
sandbox_mode = "danger-full-access"

[model_providers.xai]
name = "OpenAI"
base_url = "https://api.xairouter.com"
wire_api = "responses"
requires_openai_auth = false
env_key = "XAI_API_KEY"

Cheapness Checklist

Keep static instructions, tool schemas, and repo rules stable.
Avoid switching providers, models, or transport modes mid-task.
Prefer the Responses API for Codex-style workflows.
Use WebSocket mode for long interactive sessions when available.
Keep session and conversation continuity intact.
Put dynamic task details after stable context when you control prompt layout.
Do not chase artificial cache metrics by rewriting request semantics.
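When you do control prompt layout, the ordering rule from the checklist can be sketched as a small builder that always emits stable parts first. The `build_prompt` helper and the example strings are hypothetical; Codex assembles its own requests, so this only applies to prompts you construct yourself.

```rust
/// Joins prompt parts with the stable ones first, one part per line.
fn build_prompt(static_parts: &[&str], dynamic_parts: &[&str]) -> String {
    let mut prompt = String::new();
    for part in static_parts.iter().chain(dynamic_parts.iter()) {
        prompt.push_str(part);
        prompt.push('\n');
    }
    prompt
}

fn main() {
    // Stable context: identical on every turn, so it belongs at the front.
    let static_parts = ["Repo rules: see AGENTS.md", "Tools: shell, apply_patch"];

    // Changing details go last so they never disturb the shared prefix.
    let first = build_prompt(&static_parts, &["Task: fix the failing test"]);
    let second = build_prompt(&static_parts, &["Task: update the changelog"]);

    print!("{}", first);
    print!("{}", second);
}
```

Both prompts begin with the same two lines, so only the final line differs between turns.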

What It Does Not Do

It does not make every token cheap.
It does not cache model outputs or replay old answers.
It does not share cache across organizations.
It does not mutate ~/.codex/config.toml unless a future command explicitly implements that and you ask for it.
It does not print API keys.

Install

Ask Codex:

Install the keep-codex-cheap skill from https://github.com/3873225350/keep-codex-cheap

Or clone/copy this folder into your Codex skills directory as keep-codex-cheap.

Build

cargo build --release

The binary will be available at:

target/release/keep-codex-cheap

Run validation:

cargo test

Privacy And Safety

Report mode does not write files. It prints only configuration health and hides environment variable values. It never prints API keys.
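Reporting on a key without revealing it can be sketched as follows. This is an illustration of the idea, not the CLI's actual code: `describe_env_key` is a hypothetical helper that takes the looked-up value as an argument so the formatting logic can be checked without touching the real environment.

```rust
use std::env;

/// Describes whether a variable is usable without echoing its value.
fn describe_env_key(name: &str, value: Option<&str>) -> String {
    match value {
        Some(v) if !v.is_empty() => format!("{}: set (value hidden)", name),
        Some(_) => format!("{}: set but empty", name),
        None => format!("{}: not set", name),
    }
}

fn main() {
    let name = "XAI_API_KEY";
    // `ok()` also treats non-Unicode values as missing in this sketch.
    let value = env::var(name).ok();
    println!("{}", describe_env_key(name, value.as_deref()));
}
```

The value is only inspected for emptiness and never appears in the output string.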

If you share reports publicly, review local paths and provider names first.
