From b9cabf23a2c0276760e6bc920fef5d94b87e8a39 Mon Sep 17 00:00:00 2001
From: Mo King
Date: Tue, 5 May 2026 09:48:46 -0700
Subject: [PATCH 1/2] Clarify batch availability in together-batch-inference
 skill

Mirror the docs change in
https://github.com/togethercomputer/mintlify-docs/pull/782: not all
serverless models support batch, and a small set (currently
DeepSeek-R1-0528-tput and DeepSeek-V3.1) will fail if submitted.

Linear: MLE-5279

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 skills/together-batch-inference/references/api-reference.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/skills/together-batch-inference/references/api-reference.md b/skills/together-batch-inference/references/api-reference.md
index c13c5ca..1b01652 100644
--- a/skills/together-batch-inference/references/api-reference.md
+++ b/skills/together-batch-inference/references/api-reference.md
@@ -245,7 +245,7 @@ curl -X GET "https://api.together.xyz/v1/batches" \
 - `zai-org/GLM-4.5-Air-FP8`
 - `openai/whisper-large-v3`
 
-All serverless models support batch processing — models not listed have no discount.
+Most serverless models support batch processing through the chat completions endpoint; models not listed above have no discount. A small number of serverless models are not available for batch and will fail if submitted, currently `deepseek-ai/DeepSeek-R1-0528-tput` and `deepseek-ai/DeepSeek-V3.1`.
 
 ## Rate Limits

From 00c35725a407f18595171c0a028548b4f34c7157 Mon Sep 17 00:00:00 2001
From: Mo King
Date: Tue, 5 May 2026 10:00:32 -0700
Subject: [PATCH 2/2] Expand batch-unavailable model list

Mirrors the corrected list in
https://github.com/togethercomputer/mintlify-docs/pull/782 after
cross-referencing against the live Together API (the previous list was
incomplete because it was checked against a stale model snippet).

Linear: MLE-5279

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 .../references/api-reference.md              | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/skills/together-batch-inference/references/api-reference.md b/skills/together-batch-inference/references/api-reference.md
index 1b01652..5b240c1 100644
--- a/skills/together-batch-inference/references/api-reference.md
+++ b/skills/together-batch-inference/references/api-reference.md
@@ -240,7 +240,17 @@ curl -X GET "https://api.together.xyz/v1/batches" \
 - `meta-llama/Llama-3.3-70B-Instruct-Turbo`
 
-Most serverless models support batch processing through the chat completions endpoint; models not listed above have no discount. A small number of serverless models are not available for batch and will fail if submitted, currently `deepseek-ai/DeepSeek-R1-0528-tput` and `deepseek-ai/DeepSeek-V3.1`.
+Most serverless models support batch processing through the chat completions endpoint; models not listed above have no discount. The following serverless models are not currently available for batch and will fail if submitted:
+
+- `deepseek-ai/DeepSeek-R1`
+- `deepseek-ai/DeepSeek-V3.1`
+- `deepseek-ai/DeepSeek-V4-Pro`
+- `MiniMaxAI/MiniMax-M2.7`
+- `moonshotai/Kimi-K2.5`
+- `moonshotai/Kimi-K2.6`
+- `Qwen/Qwen3.5-397B-A17B`
+- `zai-org/GLM-5`
+- `zai-org/GLM-5.1`
 
 ## Rate Limits
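
A note for reviewers: since a submission naming one of the batch-unavailable models fails only after the batch is created server-side, a client could reject it up front. A minimal sketch, assuming the list in this patch (the `BATCH_UNAVAILABLE` set mirrors the patch and will drift as models change; the helper name is illustrative, not part of any Together SDK):

```python
# Models the skill doc lists as batch-unavailable (current as of this patch;
# cross-check against the live docs before relying on this set).
BATCH_UNAVAILABLE = {
    "deepseek-ai/DeepSeek-R1",
    "deepseek-ai/DeepSeek-V3.1",
    "deepseek-ai/DeepSeek-V4-Pro",
    "MiniMaxAI/MiniMax-M2.7",
    "moonshotai/Kimi-K2.5",
    "moonshotai/Kimi-K2.6",
    "Qwen/Qwen3.5-397B-A17B",
    "zai-org/GLM-5",
    "zai-org/GLM-5.1",
}


def assert_batch_eligible(model: str) -> None:
    """Raise before submission instead of letting the batch fail server-side."""
    if model in BATCH_UNAVAILABLE:
        raise ValueError(f"{model} is not available for batch inference")
```

Calling `assert_batch_eligible` before building the batch input file turns a late server-side failure into an immediate, local error.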