Pull requests: SemiAnalysisAI/InferenceX
Disable prefix caching for qwen3.5 & glm5 AMD benchmarks
sweep-enabled
#970
opened Mar 28, 2026 by
functionstackx
[AMD/ROCM] ATOM support for new models: Kimi-K2.5 FP4, GLM-5 FP8, and MiniMax-M2.5
#963
opened Mar 27, 2026 by
seungrokj
[DRAFT] [AMD] Update Minimax M2.5 MI325 image and adjust search space
AMD
#953
opened Mar 27, 2026 by
benenzhu
[WIP] B200 Minimax FP8 vllm upgrade
NVIDIA
sweep-enabled
#947
opened Mar 26, 2026 by
kedarpotdar-nv
fix: multi-turn benchmark hangs after all clients finish
#908
opened Mar 13, 2026 by
lishicheng1996-nv
3 of 4 tasks
Add Kimi-K2.5 INT4 vLLM v0.16.0 benchmark for MI300X
AMD
sweep-enabled
#860
opened Mar 3, 2026 by
functionstackx
Add MiniMax M2.5 MXFP4 benchmark for MI355x vLLM v0.17.1 (TP=2,4)
AMD
sweep-enabled
#827
opened Mar 1, 2026 by
functionstackx
[NV] Qwen3.5 B200 SGLang FP4 configs
NVIDIA
sweep-enabled
#820
opened Feb 27, 2026 by
kedarpotdar-nv
Performance Improvements for MI300X with GEMM and FP8 Enhancements
#811
opened Feb 26, 2026 by
chunfangamd