Disable prefix caching for qwen3.5 & glm5 AMD benchmarks #970
functionstackx wants to merge 2 commits into main
Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook. If it is not, please create a PR there first before we can merge your PR into the master branch. Let's ensure that the documentation is first class so that the entire ML community can benefit from your hard work! Thank you.
LGTM — simple, mechanical addition of --disable-radix-cache to 7 benchmark scripts, consistent with the established pattern in this repo (e.g., #966, #926).
Extended reasoning...
### Overview
Adds --disable-radix-cache to the SGLang server launch command in 7 AMD benchmark shell scripts (qwen3.5 bf16/fp8 on MI300X/MI325X/MI355X, and glm5 fp8 on MI355X). Each change is a one-line addition following an existing multi-line argument block.

### Security risks
None. These are benchmark configuration scripts with no auth, crypto, or permission-sensitive logic.

### Level of scrutiny
Low. This is a straightforward config flag addition that mirrors identically structured recent merges (#966 disabling prefix caching for minimax, #926 for kimi). The change is purely additive: no logic branches, no new code paths.

### Other factors
No bugs found by the automated system. The PR is self-contained to benchmark scripts with no production code impact. The pattern is well-established in this repo.
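For readers unfamiliar with the pattern, the one-line addition described in the overview might look roughly like the sketch below. This is not taken from the PR's scripts: the variable names, default values, and the surrounding arguments (`--model-path`, `--tp-size`, `--port`) are placeholder assumptions, and the command is only printed, not executed. The only part confirmed by the PR is the `--disable-radix-cache` flag itself.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a benchmark launch block; MODEL_PATH, TP_SIZE and PORT
# are illustrative placeholders, not values from the actual scripts.
set -euo pipefail

MODEL_PATH="${MODEL_PATH:-/path/to/model}"
TP_SIZE="${TP_SIZE:-8}"
PORT="${PORT:-30000}"

# Existing multi-line argument block, with the new flag appended as the last entry.
SERVER_ARGS=(
  --model-path "$MODEL_PATH"
  --tp-size "$TP_SIZE"
  --port "$PORT"
  --disable-radix-cache   # the one-line addition: disables SGLang prefix (radix) caching
)

# Assemble the command as a string and print it rather than starting a server.
LAUNCH_CMD="python3 -m sglang.launch_server ${SERVER_ARGS[*]}"
echo "$LAUNCH_CMD"
```

With prefix caching disabled, repeated prompt prefixes across benchmark requests are not served from cache, so each request pays its full prefill cost.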
@claude add the perf-changelog.yaml entry.
Claude finished @functionstackx's task in 45s. Added perf-changelog.yaml entry.
…glm5 AMD benchmarks
Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>
Add --disable-radix-cache to SGLang server launch command in all 7 benchmark scripts listed in #968, rebased on latest main.

Closes #968
Generated with Claude Code