
docs(release): expand Anthropic streaming perf depth-item write-up (v1.87.0rc1) #225

Merged
yassin-berriai merged 2 commits into yuneng/release-notes-v1-87-0-rc-1 from claude/gracious-edison-1mBAx
May 25, 2026

Conversation

@yassin-berriai
Contributor

@yassin-berriai yassin-berriai commented May 25, 2026

Resolves LIT-3334

Stacked on top of yuneng/release-notes-v1-87-0-rc-1 — contributes my depth-item write-up to the v1.87.0rc1 release notes (per @yuneng-jiang's request in #214).

Summary

Expands the Anthropic streaming hot-path perf entry (#28289) — my depth item for this release — from a one-liner into a full write-up:

  • The four optimization groups (skip no-op per-chunk work, de-duplicate per-request work, cheaper end-of-stream reconstruction, cheaper hot-path logging).
  • Source-of-truth metrics from internal Week-4 real-deployment testing (4-pod m7i.xlarge, no HPA, 256 text_delta chunks/request, validated on both Anthropic direct and Bedrock Invoke): TTFT overhead ~90% lower (p50 2220% → 165%, p95 3057% → 316%, p99 3111% → 328%) and TPM +12% / +6% / +4% (p50 / p95 / p99).
  • Notes the wire-output parity testing.
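As a quick sanity check on the "~90% lower" headline, the relative drop can be recomputed from the before/after TTFT overhead figures quoted above. This is only an illustrative sketch: the dictionary layout and the helper name `relative_reduction` are made up for this example and are not part of the PR or of LiteLLM.

```python
# TTFT overhead figures quoted in the summary, as (before %, after %).
# The structure and names here are hypothetical, for illustration only.
ttft_overhead = {
    "p50": (2220.0, 165.0),
    "p95": (3057.0, 316.0),
    "p99": (3111.0, 328.0),
}


def relative_reduction(before: float, after: float) -> float:
    """Return the fractional drop from before to after (0.9 means 90% lower)."""
    return (before - after) / before


reductions = {
    percentile: relative_reduction(before, after)
    for percentile, (before, after) in ttft_overhead.items()
}

for percentile, reduction in reductions.items():
    print(f"{percentile}: {reduction:.1%} lower")
```

Under these inputs the drop works out to roughly 93% at p50 and just under 90% at p95 and p99, which is consistent with the rounded "~90%" figure.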

Also leads the matching Key Highlights line with the headline result.

Metrics now come from the internal Week-4 performance doc (real deployment), replacing the local mock-SSE benchmark figures from the PR description.

Not included

  • #28794 (management-endpoint SERVER span) — merged 2026-05-25, after this RC was cut, so it is not part of v1.87.0rc1. It belongs in the next release's notes.
  • The other PRs I merged this week (#28273, #28362, #28364, #28395) were already documented by Yuneng and need no change.

Test plan

  • Docusaurus builds and the note renders at /release_notes

@vercel

vercel Bot commented May 25, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project: litellm
Deployment: Ready
Actions: Preview, Comment
Updated (UTC): May 25, 2026 9:11pm


@yassin-berriai force-pushed the claude/gracious-edison-1mBAx branch from 5430012 to a6e4dba on May 25, 2026 20:45
@yassin-berriai changed the title from "docs(release): add v1.87.0-rc.1 release notes" to "docs(release): add PR #28794 to v1.87.0rc1 release notes" on May 25, 2026
@yassin-berriai changed the base branch from main to yuneng/release-notes-v1-87-0-rc-1 on May 25, 2026 20:45
Flesh out the #28289 entry in the v1.87.0rc1 notes with the specific
optimizations and benchmark numbers (depth-item write-up), and lead the
Key Highlights line with the headline speedup.

https://claude.ai/code/session_01HDDqHBK46d5bLsFih3WkEu
@yassin-berriai force-pushed the claude/gracious-edison-1mBAx branch from a6e4dba to dbb30a6 on May 25, 2026 20:52
@yassin-berriai changed the title from "docs(release): add PR #28794 to v1.87.0rc1 release notes" to "docs(release): expand Anthropic streaming perf depth-item write-up (v1.87.0rc1)" on May 25, 2026
Replace the local mock-benchmark figures with the source-of-truth
metrics from internal Week-4 testing (4-pod m7i.xlarge, Anthropic +
Bedrock Invoke): TTFT overhead ~90% lower, TPM +12/6/4%.

https://claude.ai/code/session_01HDDqHBK46d5bLsFih3WkEu
@yassin-berriai marked this pull request as ready for review on May 25, 2026 21:11
@yassin-berriai merged commit c7a358b into yuneng/release-notes-v1-87-0-rc-1 on May 25, 2026
1 of 2 checks passed
@yassin-berriai deleted the claude/gracious-edison-1mBAx branch on May 25, 2026 21:11