feat: add Wafer AI as a first-party OpenAI-compatible provider by ianye23301 · Pull Request #28637 · BerriAI/litellm

ianye23301 · 2026-05-22T18:29:10Z

Summary

Adds Wafer AI as an OpenAI-compatible provider following the documented JSON-only path.

Wafer is an OpenAI-compatible inference gateway that serves frontier open models via https://api.wafer.ai/v1 (bearer auth, standard OpenAI chat-completions shape, SSE streaming, tool/function calling). After this PR:

from litellm import completion
import os

os.environ["WAFER_API_KEY"] = "..."
resp = completion(
    model="wafer/GLM-5.1",
    messages=[{"role": "user", "content": "hi"}],
    stream=True,
)

What changed

Four files, ~260 lines:

litellm/llms/openai_like/providers.json — register wafer with base_url=https://api.wafer.ai/v1, api_key_env=WAFER_API_KEY, api_base_env=WAFER_API_BASE, and max_completion_tokens → max_tokens param mapping. No custom transformation needed.
provider_endpoints_support.json — required by the code-quality doc-coverage check for every JSON-registered provider.
model_prices_and_context_window.json + litellm/model_prices_and_context_window_backup.json — 7 chat models with per-token pricing, context windows, and capability flags:

| Model | Context | $/M in | $/M out | Tools | Vision | Reasoning |
|---|---|---|---|---|---|---|
| `wafer/GLM-5.1` | 128K | 1.50 | 4.50 | ✅ | — | — |
| `wafer/Qwen3.5-397B-A17B` | 128K | 0.60 | 3.60 | ✅ | ✅ | — |
| `wafer/Qwen3.6-35B-A3B` | 32K | 0.19 | 1.25 | ✅ | ✅ | — |
| `wafer/deepseek-v4-flash` | 128K | 0.18 | 0.35 | ✅ | — | ✅ |
| `wafer/deepseek-v4-pro` | 128K | 2.18 | 4.35 | ✅ | — | ✅ |
| `wafer/qwen3.6-max-preview` | 256K | 1.43 | 8.58 | ✅ | — | ✅ |
| `wafer/Kimi-K2.6` | 262K | 1.10 | 4.80 | ✅ | ✅ | — |

Cache-read prices included where applicable.
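
The $/M columns convert to the per-token fields checked in the test plan (`input_cost_per_token`, `output_cost_per_token`) by dividing by one million. A minimal sketch of that arithmetic for two rows of the table; `PRICES_PER_MILLION`, `per_token`, and `request_cost` are illustrative names, not litellm APIs, and cache-read pricing is ignored:

```python
# Prices copied from the table above, in USD per million tokens.
PRICES_PER_MILLION = {
    "wafer/GLM-5.1": {"input": 1.50, "output": 4.50},
    "wafer/deepseek-v4-flash": {"input": 0.18, "output": 0.35},
}


def per_token(price_per_million: float) -> float:
    """Convert a $/M-token price to a $/token price."""
    return price_per_million / 1_000_000


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request (cache-read pricing not modeled)."""
    price = PRICES_PER_MILLION[model]
    return (
        input_tokens * per_token(price["input"])
        + output_tokens * per_token(price["output"])
    )
```

For example, 1,000 input tokens and 500 output tokens on `wafer/GLM-5.1` come to 1000 × 1.5e-6 + 500 × 4.5e-6 = $0.00375.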

Docs — companion PR at BerriAI/litellm-docs#205 per the AGENTS.md guidance.

Test plan

JSONProviderRegistry.get("wafer") returns the expected config (base_url, api_key_env, param_mappings)
litellm.get_llm_provider("wafer/GLM-5.1", api_key="...") resolves to (model="GLM-5.1", provider="wafer", base="https://api.wafer.ai/v1")
completion(model="wafer/GLM-5.1", ...) hits api.wafer.ai/v1/chat/completions with the bearer header; errors surface as WaferException
python3 tests/code_coverage_tests/check_provider_folders_documented.py → 19 openai_like providers, 150 endpoint-support entries, all documented
All 7 wafer/<model> entries are in litellm.model_cost with input_cost_per_token > 0 and output_cost_per_token > 0
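
The first two checks can be pictured with a small stand-in. This is a sketch only: `REGISTRY`, `Resolved`, and `resolve` are hypothetical and do not reproduce `JSONProviderRegistry` or `get_llm_provider`; the field names and values are the ones listed under "What changed", and the `WAFER_API_BASE` override is not modeled.

```python
from typing import NamedTuple

# Hypothetical in-memory stand-in for the "wafer" entry in providers.json.
REGISTRY = {
    "wafer": {
        "base_url": "https://api.wafer.ai/v1",
        "api_key_env": "WAFER_API_KEY",
        "api_base_env": "WAFER_API_BASE",
        "param_mappings": {"max_completion_tokens": "max_tokens"},
    }
}


class Resolved(NamedTuple):
    model: str
    provider: str
    base: str


def resolve(model_string: str) -> Resolved:
    """Split "provider/model" on the first slash and look up the base URL."""
    provider, _, model = model_string.partition("/")
    if not model or provider not in REGISTRY:
        raise ValueError(f"unknown provider in {model_string!r}")
    return Resolved(
        model=model,
        provider=provider,
        base=REGISTRY[provider]["base_url"],
    )
```

Under these assumptions, `resolve("wafer/GLM-5.1")` yields the same triple the test plan expects: model `GLM-5.1`, provider `wafer`, base `https://api.wafer.ai/v1`.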

🤖 Generated with Claude Code

codecov · 2026-05-22T18:32:42Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

ianye23301 · 2026-05-22T18:32:50Z

CI summary after the rerun: 39 pass / 2 fail / 1 cancel.

The only red checks are test-server-root-path (/api/v1) and test-server-root-path (/llmproxy). Both fail at Docker build time, not in any application code:

RUN prisma generate --schema=./schema.prisma
  → npm install prisma@5.4.2
  → node: error while loading shared libraries: libatomic.so.1: cannot open shared object file: No such file or directory
  → exit code 127

This is an upstream issue with the Docker base image used by the Test Proxy SERVER_ROOT_PATH Routing workflow — libatomic.so.1 isn't present, so node can't run. The same failure is reproducing on every other recent PR I checked (fix/langfuse-closed-client-LIT-3221, fix/fireworks-schema-sanitize-nested-fields, fix/responses-adapter-cache-read-tokens, …), so it's a fleet-wide infra problem, not caused by this PR.

For reference, all Wafer-relevant checks are green: lint, code-quality, unit-test, validate-model-prices-json, llm-provider-tests, openai-anthropic-vertex-bedrock-tests, llm-handler-tests, core-utils, documentation, Analyze (python), Verify PR source branch, etc.

Happy to retrigger once the base image is fixed.

greptile-apps · 2026-05-22T18:33:20Z

Greptile Summary

This PR adds Wafer AI as a first-party OpenAI-compatible provider following the same pattern as the Featherless AI integration. The implementation is thorough and well-tested with 14 mock-only unit tests covering the full registration surface.

New WaferConfig(OpenAIGPTConfig) in litellm/llms/wafer/chat/transformation.py handles auth headers, base URL resolution, and max_completion_tokens → max_tokens aliasing; all provider-registry wiring in constants.py, __init__.py, utils.py, get_llm_provider_logic.py, and types/utils.py follows the established pattern.
Seven models added to model_prices_and_context_window.json and its backup with pricing, context windows, and capability flags.
A hardcoded wafer_models set in constants.py (bare model names without the wafer/ prefix) appears to be dead code — it is never imported or read outside that file; the authoritative set is populated at runtime from the JSON price map in __init__.py.

Confidence Score: 4/5

The change is additive and isolated; existing providers are unaffected.

The integration follows the Featherless AI template closely and all changed wiring paths are well-tested. The only notable gaps are a missing None guard when remapping max_completion_tokens (a null value would be forwarded to the upstream API) and a redundant hardcoded model list in constants.py that will silently drift from the JSON price map. Neither affects existing functionality.

litellm/constants.py (dead wafer_models set) and litellm/llms/wafer/chat/transformation.py (None handling in map_openai_params)

Important Files Changed

| Filename | Overview |
|---|---|
| litellm/llms/wafer/chat/transformation.py | New WaferConfig extending OpenAIGPTConfig; minor None-handling inconsistency in map_openai_params for max_completion_tokens |
| litellm/constants.py | Adds wafer to provider lists and a hardcoded wafer_models set that duplicates the JSON price map; follows featherless_ai pattern but is dead code |
| litellm/litellm_core_utils/get_llm_provider_logic.py | Adds endpoint-detection and provider-info branch for wafer; correctly respects explicitly passed api_key |
| litellm/utils.py | Adds env-var validation check and ProviderConfigManager mapping for wafer; follows established pattern |
| litellm/types/utils.py | Adds LlmProviders.WAFER enum entry; clean addition |
| litellm/__init__.py | Wires wafer_models set, add_known_models branch, model_list union, and models_by_provider entry |
| model_prices_and_context_window.json | Adds 7 wafer model entries with pricing, context windows, and capability flags |
| provider_endpoints_support.json | Adds wafer entry with responses:true; consistent with featherless_ai pattern |
| tests/test_litellm/llms/wafer/chat/test_wafer_chat_transformation.py | 14 mocked unit tests covering header injection, missing key, env-var precedence, param mapping, and registry wiring |

greptile-apps · 2026-05-22T18:33:24Z

+wafer_models: set = set(
+    [
+        "GLM-5.1",
+        "Qwen3.5-397B-A17B",
+        "Qwen3.6-35B-A3B",
+        "deepseek-v4-flash",
+        "deepseek-v4-pro",
+        "qwen3.6-max-preview",
+        "Kimi-K2.6",
+    ]
+)
+


Unreferenced hardcoded model list

wafer_models defined here (bare model names, no wafer/ prefix) is never imported or read anywhere in the codebase — grep finds no references outside this file. The authoritative model list already lives in model_prices_and_context_window.json, and litellm/__init__.py populates its own wafer_models set from that JSON at startup. This constants.py copy will silently drift whenever models are added or removed from the price map. The same issue exists in featherless_ai_models, but it's worth not propagating the pattern.
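
One way to avoid the drift is to derive the set from the price map instead of hardcoding it. A minimal sketch, assuming each price-map entry carries a `litellm_provider` field naming its provider (that field is not shown in this thread); `models_for_provider` is a hypothetical helper, not the code in `litellm/__init__.py`:

```python
# Hypothetical excerpt of the price map; real entries carry many more fields.
model_cost = {
    "wafer/GLM-5.1": {"litellm_provider": "wafer"},
    "wafer/Kimi-K2.6": {"litellm_provider": "wafer"},
    "gpt-4o": {"litellm_provider": "openai"},
}


def models_for_provider(price_map: dict, provider: str) -> set:
    """Collect model names whose entry is tagged with the given provider."""
    return {
        name
        for name, info in price_map.items()
        if info.get("litellm_provider") == provider
    }


# Built from the JSON data, so it cannot fall out of sync with it.
wafer_models = models_for_provider(model_cost, "wafer")
```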

greptile-apps · 2026-05-22T18:33:25Z

+        for param, value in non_default_params.items():
+            if param == "max_completion_tokens":
+                optional_params["max_tokens"] = value
+            elif param in supported_openai_params:
+                if value is not None:
+                    optional_params[param] = value


Missing None guard on the alias mapping branch

Every other param in the loop is protected by a None check, but the max_completion_tokens alias writes unconditionally to optional_params. Passing a None value for this param therefore forwards a null field to the upstream API, which can produce a validation error. Wrapping the assignment in a None check matches the convention used for all other params in this method.

ianye23301 · 2026-05-22T18:55:51Z

Update after pushing 6a731f3:

CI tally on the new SHA: 39 pass / 2 fail.

The 2 remaining red checks are both unrelated to Wafer:

test-server-root-path (/llmproxy) — Docker build fails at RUN prisma generate --schema=./schema.prisma with node: libatomic.so.1: cannot open shared object file. Same failure on every recent PR in the repo (fix/langfuse-closed-client-LIT-3221, fix/fireworks-schema-sanitize-nested-fields, fix/responses-adapter-cache-read-tokens, …). Base image needs libatomic installed.
misc / Run tests — single test failure in tests/test_litellm/interactions/test_openapi_compliance.py::TestResponseCompliance::test_status_enum_values:
```
assert ['in_progress', 'requires_action', 'completed', 'failed', 'cancelled', 'incomplete', 'budget_exceeded']
    == ['in_progress', 'requires_action', 'completed', 'failed', 'cancelled', 'incomplete']
```
The OpenAPI schema gained budget_exceeded but this test wasn't updated. No Wafer touch.

My-side checks fixed by 6a731f3:

✅ All Other Providers / Run tests — the wafer model-cost test now reads the bundled model_prices_and_context_window_backup.json instead of litellm.model_cost (which is fetched from main and lacks this PR's entries until merge).
✅ codecov/patch — wafer module at 100% line coverage.
✅ Endpoint-detection branch in get_llm_provider_logic.py now reachable (the previous "https://api.wafer.ai/v1" literal never matched the api.wafer.ai/v1 entry stored in openai_compatible_endpoints; tests now exercise both branches).
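
The mismatch in the last item is easy to reproduce in isolation: a URL with its scheme never equals a stored scheme-less entry. The sketch below shows the failure and one scheme-insensitive comparison. `strip_scheme` and `is_known_endpoint` are hypothetical helpers, and the list is a one-entry stand-in for `openai_compatible_endpoints`; the actual matching logic in `get_llm_provider_logic.py` may differ.

```python
from urllib.parse import urlsplit

# Assumed shape of the stored entries: host plus path, no scheme.
openai_compatible_endpoints = ["api.wafer.ai/v1"]


def strip_scheme(api_base: str) -> str:
    """Reduce a URL to "host/path" so it is comparable with the stored entries."""
    parts = urlsplit(api_base)
    if not parts.netloc:
        # Already scheme-less: urlsplit leaves the whole string in .path.
        return api_base.rstrip("/")
    return f"{parts.netloc}{parts.path}".rstrip("/")


def is_known_endpoint(api_base: str) -> bool:
    """True when the scheme-less form of api_base is a stored endpoint."""
    return strip_scheme(api_base) in openai_compatible_endpoints


# The original literal comparison: never matches the stored entry.
literal_matches = "https://api.wafer.ai/v1" in openai_compatible_endpoints
```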

Wafer-relevant checks all green: lint, code-quality, unit-test, validate-model-prices-json, llm-provider-tests, openai-anthropic-vertex-bedrock-tests, llm-handler-tests, core-utils, Analyze (python), Verify PR source branch, codecov/patch, documentation, etc.

Two P2 review comments from Greptile on PR BerriAI#28637:

1. **Dead `wafer_models` set in `constants.py`** — the hardcoded set was never imported or referenced anywhere; the authoritative wafer-model set is populated at startup in ``litellm/__init__.py`` from ``model_prices_and_context_window.json``. Removing the dead copy (which would silently drift from the JSON price map). (Note: the Featherless PR I templated from has the same dead code; not propagating that mistake here.)

2. **`max_completion_tokens` alias bypassed the `None` guard** in ``map_openai_params`` — every other supported param checked ``value is not None`` before forwarding, but the alias branch wrote unconditionally. A caller passing ``max_completion_tokens=None`` would therefore forward ``max_tokens: null`` to Wafer's upstream API, which rejects null max_tokens with a 400 validation error. Fixed by hoisting the ``value is None`` check to the top of the loop so it covers both the alias and passthrough branches. Added a dedicated test ``test_map_openai_params_max_completion_tokens_none_is_dropped`` that locks down the behavior so a future refactor can't quietly regress it.

Verification:

- ``pytest tests/test_litellm/llms/wafer/ --cov=litellm.llms.wafer`` → 19/19 pass, 100% line coverage on the wafer module
- ``ruff check`` + ``black --check`` clean

CLAassistant · 2026-05-22T19:11:32Z

All committers have signed the CLA.

ianye23301 · 2026-05-22T19:11:42Z

Addressed both Greptile P2 comments in 6a8e88d:

Dead wafer_models set in constants.py — removed. The authoritative set is built at startup in litellm/__init__.py from model_prices_and_context_window.json; the constants copy was unreachable and would have silently drifted. (The Featherless template I followed has the same dead block — not propagating it here.)
max_completion_tokens alias bypassed the None guard — fixed by hoisting if value is None: continue to the top of map_openai_params's loop so the alias and passthrough branches both honor it. Added a test test_map_openai_params_max_completion_tokens_none_is_dropped to lock the behavior down.

Local verification: 19/19 pass on tests/test_litellm/llms/wafer/, 100% line coverage on the wafer module, ruff check + black --check clean.
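
For readers following along, the hoisted guard looks roughly like this. It is a standalone sketch of the loop, not the method's real signature or the committed code:

```python
def map_openai_params(
    non_default_params: dict,
    optional_params: dict,
    supported_openai_params: list,
) -> dict:
    """Standalone sketch of the fixed loop from the review thread."""
    for param, value in non_default_params.items():
        if value is None:
            # Hoisted guard: covers the alias branch and the passthrough branch.
            continue
        if param == "max_completion_tokens":
            # Alias: the upstream API expects max_tokens.
            optional_params["max_tokens"] = value
        elif param in supported_openai_params:
            optional_params[param] = value
    return optional_params
```

With the guard at the top, `max_completion_tokens=None` is dropped instead of being forwarded as `max_tokens: null`.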

Re the still-red test-server-root-path / misc checks — confirmed both are upstream and unrelated to this PR:

test-server-root-path (/api/v1) and (/llmproxy): the libatomic.so.1: cannot open shared object file failure during prisma generate was repo-wide as of ~18:37 UTC, but I see runs against litellm_oss_staging started succeeding around 18:39 UTC (e.g. litellm_redis-circuit-breaker-fix, litellm_mc_purview_guardrails, litellm_shin_staging_05_22_2026). The new push should retrigger and pick that up.
misc / Run tests: failing on tests/test_litellm/interactions/test_openapi_compliance.py::TestResponseCompliance::test_status_enum_values — the OpenAPI schema gained a budget_exceeded enum but the test wasn't updated. Nothing Wafer-related touches that file.

I'll post the new CI tally once the rerun settles.

ianye23301 · 2026-05-22T19:25:58Z

Final CI tally on SHA 6a8e88d: 39 pass / 2 fail / 1 cancel.

All Wafer-relevant checks pass: unit-test, lint, code-quality, validate-model-prices-json, documentation, Verify PR source branch, llm-provider-tests, openai-anthropic-vertex-bedrock-tests, llm-handler-tests, core-utils, Analyze (python), codecov/patch, All Other Providers / Run tests, and everything else.

The 2 still-red checks are both genuinely unrelated to this PR — I dug into both:

1. `test-server-root-path (/api/v1)` and `(/llmproxy)` — Dockerfile base-image issue

The build fails at RUN prisma generate --schema=./schema.prisma:

node: error while loading shared libraries: libatomic.so.1:
       cannot open shared object file: No such file or directory
subprocess.CalledProcessError: Command '['/root/.cache/prisma-python/nodeenv/bin/npm', 'install', 'prisma@5.4.2']'
       returned non-zero exit status 127.

docker/Dockerfile.non_root uses cgr.dev/chainguard/wolfi-base and installs nodejs via apk add but doesn't pull in libatomic. Prisma 5.4.2's bundled nodeenv binary links against it.

I confirmed by surveying ~30 recent runs of the Test Proxy SERVER_ROOT_PATH Routing workflow across the repo — failures and successes alternate based on whether the run pulls a warm prisma generate layer from the GHA build cache (type=gha). My PR's cache shard does not have a warm layer, so the build re-runs and hits the libatomic error. This is not something this PR can or should fix (would be out of scope for adding a provider). A one-liner adding libatomic to the apk install block in docker/Dockerfile.non_root would resolve it.

2. `misc / Run tests` — unrelated schema-drift test

tests/test_litellm/interactions/test_openapi_compliance.py::TestResponseCompliance::test_status_enum_values
  AssertionError:
    ['in_progress', 'requires_action', 'completed', 'failed', 'cancelled', 'incomplete', 'budget_exceeded']
    ==
    ['in_progress', 'requires_action', 'completed', 'failed', 'cancelled', 'incomplete']

The OpenAPI schema gained a budget_exceeded status value; this test wasn't updated. Touches no Wafer code paths. Failing fleet-wide.
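
A set difference over the two lists in the assertion pinpoints the drift. This is only a sketch; which side of the comparison comes from the schema is inferred from the explanation above, and the variable names are illustrative.

```python
# Values copied from the failing assertion.
schema_values = [
    "in_progress", "requires_action", "completed",
    "failed", "cancelled", "incomplete", "budget_exceeded",
]
expected_values = [
    "in_progress", "requires_action", "completed",
    "failed", "cancelled", "incomplete",
]

# Values present on one side only.
added = sorted(set(schema_values) - set(expected_values))
removed = sorted(set(expected_values) - set(schema_values))
```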

Happy to retrigger CI once either is patched on litellm_oss_staging. Otherwise this PR is ready for review whenever maintainers are.

Per the maintainer feedback on BerriAI#28637 and the docs at https://docs.litellm.ai/docs/contributing/adding_openai_compatible_providers, this replaces the previous 11-file first-party wiring with the documented JSON-only approach.

Four files changed:

- ``litellm/llms/openai_like/providers.json`` — register ``wafer`` with ``base_url=https://api.wafer.ai/v1``, ``api_key_env=WAFER_API_KEY``, ``api_base_env=WAFER_API_BASE``, and ``max_completion_tokens`` → ``max_tokens`` param mapping. Wafer is an OpenAI-compatible inference gateway; no custom transformation required.
- ``provider_endpoints_support.json`` — required by the code-quality doc-coverage check for every entry in ``openai_like/providers.json``.
- ``model_prices_and_context_window.json`` (+ bundled backup) — pricing and capability metadata for the 7 current Wafer chat models: GLM-5.1, Qwen3.5-397B-A17B, Qwen3.6-35B-A3B, deepseek-v4-flash, deepseek-v4-pro, qwen3.6-max-preview, Kimi-K2.6. Per-token input / output / cache-read costs and capability flags (tools, vision, reasoning where applicable).

Verified locally:

- ``JSONProviderRegistry.get("wafer")`` returns the expected config.
- ``litellm.get_llm_provider("wafer/GLM-5.1", api_key="...")`` resolves to ``provider=wafer, base=https://api.wafer.ai/v1``.
- ``completion(model="wafer/GLM-5.1", ...)`` hits ``api.wafer.ai/v1/chat/completions`` with the bearer header and surfaces upstream errors as ``WaferException``.
- ``tests/code_coverage_tests/check_provider_folders_documented.py`` passes (19 openai_like providers, 150 entries in provider_endpoints_support.json).
- All 7 ``wafer/<model>`` entries are in ``litellm.model_cost`` with ``input_cost_per_token > 0`` and ``output_cost_per_token > 0``.

ianye23301 · 2026-05-26T19:09:26Z

Thanks for the pointer — fully reworked.

Force-pushed 67b9d91 which replaces the 11-file first-party wiring with the documented JSON-only path: a single entry in litellm/llms/openai_like/providers.json, plus the JSON keepers (provider_endpoints_support.json for the doc-coverage check and 7 pricing entries in model_prices_and_context_window.json + backup).

Net change: 4 files, 260 insertions (was 11 files, 680). No litellm/llms/wafer/ module, no enum/constants/lazy-imports/utils/init touches, no custom transformation, no Python tests — param_mappings.max_completion_tokens → max_tokens is declared inline in the JSON.
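
To illustrate what the inline mapping expresses, here is a minimal sketch of a key rename driven by a mapping dict. `apply_param_mappings` is a hypothetical helper; how litellm's JSON provider loader actually applies `param_mappings` is not shown in this thread.

```python
def apply_param_mappings(params: dict, param_mappings: dict) -> dict:
    """Rename keys per the mapping; keys without a mapping pass through."""
    return {param_mappings.get(key, key): value for key, value in params.items()}


# The single mapping declared for wafer in providers.json.
mappings = {"max_completion_tokens": "max_tokens"}
request = apply_param_mappings(
    {"max_completion_tokens": 256, "temperature": 0.2}, mappings
)
```

Here `request` ends up with `max_tokens` in place of `max_completion_tokens`, while `temperature` is unchanged.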

Locally verified:

JSONProviderRegistry.get("wafer") returns the right config
get_llm_provider("wafer/GLM-5.1") → provider=wafer, base=https://api.wafer.ai/v1
completion(model="wafer/GLM-5.1", …) actually hits api.wafer.ai/v1 (real call, returns a WaferException against the production endpoint)
tests/code_coverage_tests/check_provider_folders_documented.py passes (19 openai_like / 150 endpoint entries)
All 7 pricing entries load and have non-zero per-token costs

PR description updated to match. Ready for re-review.

greptile-apps Bot reviewed May 22, 2026

ianye23301 force-pushed the ian/add-wafer-provider-oss branch from b52dc1b to 67b9d91 on May 26, 2026 19:09

ianye23301 wants to merge 1 commit into BerriAI:litellm_oss_staging from ianye23301:ian/add-wafer-provider-oss
