feat: add Wafer AI as a first-party OpenAI-compatible provider #28637
ianye23301 wants to merge 1 commit into
Codecov Report
✅ All modified and coverable lines are covered by tests.
|
CI summary after the rerun: 39 pass / 2 fail / 1 cancel. The only red checks are This is an upstream issue with the Docker base image used by the For reference, all Wafer-relevant checks are green: Happy to retrigger once the base image is fixed. |
Greptile Summary
This PR adds Wafer AI as a first-party OpenAI-compatible provider following the same pattern as the Featherless AI integration. The implementation is thorough and well-tested with 14 mock-only unit tests covering the full registration surface.

Confidence Score: 4/5
The change is additive and isolated; existing providers are unaffected. The integration follows the Featherless AI template closely and all changed wiring paths are well-tested. The only notable gaps are a missing None guard when remapping max_completion_tokens (a null value would be forwarded to the upstream API) and a redundant hardcoded model list in constants.py that will silently drift from the JSON price map. Neither affects existing functionality.

litellm/constants.py (dead wafer_models set) and litellm/llms/wafer/chat/transformation.py (None handling in map_openai_params)
|
| Filename | Overview |
|---|---|
| litellm/llms/wafer/chat/transformation.py | New WaferConfig extending OpenAIGPTConfig; minor None-handling inconsistency in map_openai_params for max_completion_tokens |
| litellm/constants.py | Adds wafer to provider lists and a hardcoded wafer_models set that duplicates the JSON price map; follows featherless_ai pattern but is dead code |
| litellm/litellm_core_utils/get_llm_provider_logic.py | Adds endpoint-detection and provider-info branch for wafer; correctly respects explicitly passed api_key |
| litellm/utils.py | Adds env-var validation check and ProviderConfigManager mapping for wafer; follows established pattern |
| litellm/types/utils.py | Adds LlmProviders.WAFER enum entry; clean addition |
| litellm/__init__.py | Wires wafer_models set, add_known_models branch, model_list union, and models_by_provider entry |
| model_prices_and_context_window.json | Adds 7 wafer model entries with pricing, context windows, and capability flags |
| provider_endpoints_support.json | Adds wafer entry with responses:true; consistent with featherless_ai pattern |
| tests/test_litellm/llms/wafer/chat/test_wafer_chat_transformation.py | 14 mocked unit tests covering header injection, missing key, env-var precedence, param mapping, and registry wiring |
Reviews (1): Last reviewed commit: "feat: add Wafer AI as a first-party Open..."
    wafer_models: set = set(
        [
            "GLM-5.1",
            "Qwen3.5-397B-A17B",
            "Qwen3.6-35B-A3B",
            "deepseek-v4-flash",
            "deepseek-v4-pro",
            "qwen3.6-max-preview",
            "Kimi-K2.6",
        ]
    )

Unreferenced hardcoded model list
wafer_models defined here (bare model names, no wafer/ prefix) is never imported or read anywhere in the codebase — grep finds no references outside this file. The authoritative model list already lives in model_prices_and_context_window.json, and litellm/__init__.py populates its own wafer_models set from that JSON at startup. This constants.py copy will silently drift whenever models are added or removed from the price map. The same issue exists in featherless_ai_models, but it's worth not propagating the pattern.
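
One way to avoid the drift described above is to derive a provider's model set from the price map rather than hardcoding it. The following is a minimal sketch, assuming each price-map entry carries a `litellm_provider` field; the helper name `models_for_provider` and the trimmed sample entries are illustrative and are not code from this PR.

```python
from typing import Any, Dict, Set


def models_for_provider(
    model_cost: Dict[str, Dict[str, Any]], provider: str
) -> Set[str]:
    """Collect every model key whose entry is tagged with the given provider."""
    return {
        name
        for name, entry in model_cost.items()
        if entry.get("litellm_provider") == provider
    }


# Illustrative stand-in for the parsed price map; real entries carry more fields.
sample_model_cost = {
    "wafer/GLM-5.1": {"litellm_provider": "wafer"},
    "wafer/Kimi-K2.6": {"litellm_provider": "wafer"},
    "gpt-4o": {"litellm_provider": "openai"},
}

wafer_models = models_for_provider(sample_model_cost, "wafer")
```

Because the set is computed from the same data that drives pricing, adding or removing a model in the JSON file changes the set without a second edit.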
    for param, value in non_default_params.items():
        if param == "max_completion_tokens":
            optional_params["max_tokens"] = value
        elif param in supported_openai_params:
            if value is not None:
                optional_params[param] = value

Missing None guard on the alias mapping branch
Every other param in the loop is protected by a None check, but the max_completion_tokens alias writes unconditionally to optional_params. Passing a None value for this param therefore forwards a null field to the upstream API, which can produce a validation error. Wrapping the assignment in a None check matches the convention used for all other params in this method.
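
The guard suggested above could look roughly like the sketch below. It is a standalone function rather than the actual `map_openai_params` method (whose full signature is not shown here), so the name `map_params` and its parameter list are assumptions made for illustration. The `None` check is placed at the top of the loop so that it covers both the alias branch and the passthrough branch.

```python
from typing import Any, Dict, List


def map_params(
    non_default_params: Dict[str, Any],
    optional_params: Dict[str, Any],
    supported_openai_params: List[str],
) -> Dict[str, Any]:
    """Hypothetical standalone version of the loop with the None check hoisted."""
    for param, value in non_default_params.items():
        if value is None:
            # Skip unset values so no null field is forwarded upstream.
            continue
        if param == "max_completion_tokens":
            # Alias: the upstream API expects max_tokens.
            optional_params["max_tokens"] = value
        elif param in supported_openai_params:
            optional_params[param] = value
    return optional_params
```

With this shape, `map_params({"max_completion_tokens": None}, {}, [])` leaves the result empty instead of writing `max_tokens: None`.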
|
Update after pushing 6a731f3: CI tally on the new SHA: 39 pass / 2 fail. The 2 remaining red checks are both unrelated to Wafer:
My-side checks fixed by 6a731f3:
Wafer-relevant checks all green: |
Two P2 review comments from Greptile on PR BerriAI#28637:

1. **Dead `wafer_models` set in `constants.py`**: the hardcoded set was never imported or referenced anywhere; the authoritative wafer-model set is populated at startup in ``litellm/__init__.py`` from ``model_prices_and_context_window.json``. Removing the dead copy (which would silently drift from the JSON price map). (Note: the Featherless PR I templated from has the same dead code; not propagating that mistake here.)

2. **`max_completion_tokens` alias bypassed the `None` guard** in ``map_openai_params``: every other supported param checked ``value is not None`` before forwarding, but the alias branch wrote unconditionally. A caller passing ``max_completion_tokens=None`` would therefore forward ``max_tokens: null`` to Wafer's upstream API, which rejects null max_tokens with a 400 validation error. Fixed by hoisting the ``value is None`` check to the top of the loop so it covers both the alias and passthrough branches. Added a dedicated test ``test_map_openai_params_max_completion_tokens_none_is_dropped`` that locks down the behavior so a future refactor can't quietly regress it.

Verification:
- ``pytest tests/test_litellm/llms/wafer/ --cov=litellm.llms.wafer`` → 19/19 pass, 100% line coverage on the wafer module
- ``ruff check`` + ``black --check`` clean
|
Addressed both Greptile P2 comments in 6a8e88d:
Local verification: 19/19 pass on Re the still-red
I'll post the new CI tally once the rerun settles. |
|
Final CI tally on SHA All Wafer-relevant checks pass: The 2 still-red checks are both genuinely unrelated to this PR — I dug into both: 1.
|
Per the maintainer feedback on BerriAI#28637 and the docs at https://docs.litellm.ai/docs/contributing/adding_openai_compatible_providers, this replaces the previous 11-file first-party wiring with the documented JSON-only approach.

Four files changed:
- ``litellm/llms/openai_like/providers.json``: register ``wafer`` with ``base_url=https://api.wafer.ai/v1``, ``api_key_env=WAFER_API_KEY``, ``api_base_env=WAFER_API_BASE``, and ``max_completion_tokens`` → ``max_tokens`` param mapping. Wafer is an OpenAI-compatible inference gateway; no custom transformation required.
- ``provider_endpoints_support.json``: required by the code-quality doc-coverage check for every entry in ``openai_like/providers.json``.
- ``model_prices_and_context_window.json`` (+ bundled backup): pricing and capability metadata for the 7 current Wafer chat models: GLM-5.1, Qwen3.5-397B-A17B, Qwen3.6-35B-A3B, deepseek-v4-flash, deepseek-v4-pro, qwen3.6-max-preview, Kimi-K2.6. Per-token input / output / cache-read costs and capability flags (tools, vision, reasoning where applicable).

Verified locally:
- ``JSONProviderRegistry.get("wafer")`` returns the expected config.
- ``litellm.get_llm_provider("wafer/GLM-5.1", api_key="...")`` resolves to ``provider=wafer, base=https://api.wafer.ai/v1``.
- ``completion(model="wafer/GLM-5.1", ...)`` hits ``api.wafer.ai/v1/chat/completions`` with the bearer header and surfaces upstream errors as ``WaferException``.
- ``tests/code_coverage_tests/check_provider_folders_documented.py`` passes (19 openai_like providers, 150 entries in provider_endpoints_support.json).
- All 7 ``wafer/<model>`` entries are in ``litellm.model_cost`` with ``input_cost_per_token > 0`` and ``output_cost_per_token > 0``.
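
The last verification item (every `wafer/<model>` entry has positive input and output costs) can be expressed as a small check over a price-map dictionary. This is a sketch under stated assumptions: the helper name `entries_missing_positive_costs` is hypothetical, and the sample entries use made-up model names and placeholder numbers, not Wafer's real models or prices.

```python
from typing import Any, Dict, List


def entries_missing_positive_costs(
    model_cost: Dict[str, Dict[str, Any]], prefix: str
) -> List[str]:
    """Return the prefixed model keys whose input or output cost is absent or not > 0."""
    bad = []
    for name, entry in model_cost.items():
        if not name.startswith(prefix):
            continue
        # Treat a missing or null cost the same as zero.
        input_cost = entry.get("input_cost_per_token") or 0
        output_cost = entry.get("output_cost_per_token") or 0
        if input_cost <= 0 or output_cost <= 0:
            bad.append(name)
    return sorted(bad)


# Placeholder data for illustration only; names and costs are invented.
sample_model_cost = {
    "wafer/example-a": {"input_cost_per_token": 1e-6, "output_cost_per_token": 2e-6},
    "wafer/example-b": {"input_cost_per_token": 1e-6},  # output cost missing
    "other/example-c": {"input_cost_per_token": 0.0, "output_cost_per_token": 0.0},
}

problems = entries_missing_positive_costs(sample_model_cost, "wafer/")
```

An empty result means every entry under the prefix satisfies the check; entries from other providers are ignored.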
b52dc1b to 67b9d91
|
Thanks for the pointer — fully reworked. Force-pushed 67b9d91 which replaces the 11-file first-party wiring with the documented JSON-only path: a single entry in Net change: 4 files, 260 insertions (was 11 files, 680). No Locally verified:
PR description updated to match. Ready for re-review. |
Summary
Adds Wafer AI as an OpenAI-compatible provider following the documented JSON-only path.
Wafer is an OpenAI-compatible inference gateway that serves frontier open models via `https://api.wafer.ai/v1` (bearer auth, standard OpenAI chat-completions shape, SSE streaming, tool/function calling). After this PR:

What changed
Four files, ~260 lines:
- `litellm/llms/openai_like/providers.json`: register `wafer` with `base_url=https://api.wafer.ai/v1`, `api_key_env=WAFER_API_KEY`, `api_base_env=WAFER_API_BASE`, and `max_completion_tokens → max_tokens` param mapping. No custom transformation needed.
- `provider_endpoints_support.json`: required by the `code-quality` doc-coverage check for every JSON-registered provider.
- `model_prices_and_context_window.json` + `litellm/model_prices_and_context_window_backup.json`: 7 chat models with per-token pricing, context windows, and capability flags:
  - `wafer/GLM-5.1`
  - `wafer/Qwen3.5-397B-A17B`
  - `wafer/Qwen3.6-35B-A3B`
  - `wafer/deepseek-v4-flash`
  - `wafer/deepseek-v4-pro`
  - `wafer/qwen3.6-max-preview`
  - `wafer/Kimi-K2.6`

  Cache-read prices included where applicable.
Docs — companion PR at BerriAI/litellm-docs#205 per the AGENTS.md guidance.
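
For orientation, the `providers.json` registration described above can be pictured as a single entry keyed by provider name. The sketch below builds that entry in Python and renders it as JSON. The field names and values come from this PR description, but the exact schema of `providers.json` is not shown here, so the nesting under a `wafer` key and the dictionary shape of `param_mappings` are assumptions.

```python
import json

# Assumed layout: one object per provider, keyed by the provider name.
wafer_entry = {
    "wafer": {
        "base_url": "https://api.wafer.ai/v1",
        "api_key_env": "WAFER_API_KEY",
        "api_base_env": "WAFER_API_BASE",
        # Assumed shape: OpenAI param name -> name expected by the upstream API.
        "param_mappings": {"max_completion_tokens": "max_tokens"},
    }
}

rendered = json.dumps(wafer_entry, indent=2)
print(rendered)
```

Keeping the provider as data rather than code is what lets the change shrink to four files: there is no provider-specific transformation class to write or test.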
Test plan
- `JSONProviderRegistry.get("wafer")` returns the expected config (base_url, api_key_env, param_mappings)
- `litellm.get_llm_provider("wafer/GLM-5.1", api_key="...")` resolves to `(model="GLM-5.1", provider="wafer", base="https://api.wafer.ai/v1")`
- `completion(model="wafer/GLM-5.1", ...)` hits `api.wafer.ai/v1/chat/completions` with the bearer header; errors surface as `WaferException`
- `python3 tests/code_coverage_tests/check_provider_folders_documented.py` → 19 openai_like providers, 150 endpoint-support entries, all documented
- `wafer/<model>` entries are in `litellm.model_cost` with `input_cost_per_token > 0` and `output_cost_per_token > 0`

🤖 Generated with Claude Code