Add GLM-4.7-FP8 tuned/untuned BF16 GEMM configs (gfx950) by omirosh · Pull Request #3285 · ROCm/aiter

omirosh · 2026-05-20T13:35:30Z

Summary

Adds tuned BF16 GEMM configurations for GLM-4.7-FP8 on gfx950, covering the shapes that the vLLM server log reported in its untuned warnings.
New files: aiter/configs/model_configs/glm47_bf16_tuned_gemm.csv and aiter/configs/model_configs/glm47_bf16_untuned_gemm.csv.

Test plan

Verify the new configs are picked up by the GEMM tuner / dispatcher on gfx950.
Re-run the vLLM server scenario that produced the untuned warnings and confirm they no longer appear.
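
One way to sanity-check the first step before re-running the server is to confirm that every shape in the untuned CSV also has a row in the tuned CSV. The sketch below is not part of this PR. It assumes both files have a header row with `M`, `N`, and `K` columns (the tuned file's exact column set is not shown here), and `load_shapes` and `missing_shapes` are hypothetical helper names.

```python
import csv
import io


def load_shapes(fileobj, keys=("M", "N", "K")):
    """Return the set of (M, N, K) tuples found in a CSV file object.

    Assumption: the CSV has a header row containing the columns in `keys`.
    Any other columns are ignored.
    """
    reader = csv.DictReader(fileobj)
    return {tuple(int(row[k]) for k in keys) for row in reader}


def missing_shapes(untuned_file, tuned_file):
    """Shapes listed in the untuned CSV that have no row in the tuned CSV."""
    return sorted(load_shapes(untuned_file) - load_shapes(tuned_file))


if __name__ == "__main__":
    # Made-up rows for illustration only; real files would be opened with
    # open("aiter/configs/model_configs/glm47_bf16_untuned_gemm.csv", newline="")
    # and the matching tuned path.
    untuned = io.StringIO("M,N,K\n1,4096,5120\n8,4096,5120\n")
    tuned = io.StringIO("gfx,cu_num,M,N,K\ngfx950,256,1,4096,5120\n")
    print(missing_shapes(untuned, tuned))
```

An empty result would suggest the tuned file covers every listed shape. It does not prove the dispatcher actually loads the file, so the server re-run in the test plan is still needed.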


github-actions · 2026-05-20T13:35:56Z

🏷️ CI Guide

Runs automatically on every PR:

✅ Pre-checks (submodule verification, code formatting)
✅ Aiter op tests (gfx942 + gfx950)
✅ Triton tests on MI35X (only when aiter/ops/triton/** or related paths are changed)

Extended tests (opt-in via labels):

| Label | Tests |
| --- | --- |
| `ci:triton-300x` | Run an additional Triton test job on MI300X in PRs; main branch always runs both MI35X and MI300X |
| `ci:sglang` | SGLang integration tests: DeepSeek-R1-MXFP4 accuracy, Qwen 3.5 accuracy |
| `ci:atom` | ATOM benchmark: DeepSeek-R1-0528, GPT-OSS-120B |
| `ci:atom_full` | ATOM accuracy suite for PR and main models from ATOM `models_accuracy.json` |
| `ci:vllm` | vLLM benchmark: GPT-OSS-120B, DeepSeek-R1-0528, Kimi-K2.5 |
| `ci:all` | All standard extended tests (excludes `ci:atom_full`) |

Only add `ci:atom_full` for FlyDSL or Triton upgrades.
Add labels via the sidebar or `gh pr edit 3285 --add-label <label>`.

Copilot

Pull request overview

Adds model-specific BF16 GEMM shape lists and their tuned kernel selections for the GLM-4.7-FP8 deployment scenario on gfx950, so the runtime GEMM dispatcher can pick optimized kernels and avoid “untuned” warnings.

Changes:

Add a GLM-4.7 BF16 untuned GEMM shape list (M,N,K,...) for reproducing/tuning and avoiding missing-shape warnings.
Add a GLM-4.7 BF16 tuned GEMM config (gfx,cu_num,...,libtype,solidx,...) targeting gfx950 with cu_num=256.
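
Since the tuned config is described as targeting gfx950 with cu_num=256, a quick consistency check is to count the `(gfx, cu_num)` pairs that appear in the file. This is a minimal sketch, not code from the PR: it assumes the tuned CSV has a header row with `gfx` and `cu_num` columns as listed above, `count_targets` is a hypothetical helper, and the sample rows (including the `libtype` and `solidx` values) are invented for illustration.

```python
import csv
import io
from collections import Counter


def count_targets(tuned_file):
    """Count tuned rows per (gfx, cu_num) pair.

    Assumption: the CSV header includes `gfx` and `cu_num` columns.
    """
    reader = csv.DictReader(tuned_file)
    return Counter((row["gfx"], int(row["cu_num"])) for row in reader)


if __name__ == "__main__":
    # Invented sample rows; the real file's full column list is not shown here.
    sample = io.StringIO(
        "gfx,cu_num,M,N,K,libtype,solidx\n"
        "gfx950,256,1,4096,5120,libA,0\n"
        "gfx950,256,8,4096,5120,libA,1\n"
    )
    targets = count_targets(sample)
    print(targets)
    # A single key ("gfx950", 256) would match the description above.
    print(set(targets) == {("gfx950", 256)})
```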

Reviewed changes

Copilot reviewed 1 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| aiter/configs/model_configs/glm47_bf16_untuned_gemm.csv | Adds the set of BF16 GEMM shapes observed in the target workload so tuning/coverage can be driven from a stable CSV. |
| aiter/configs/model_configs/glm47_bf16_tuned_gemm.csv | Provides tuned kernel selections for the above shapes on `gfx950`, enabling the dispatcher to select optimized implementations. |


Commit 0401086: Adds tuned BF16 GEMM configurations for GLM-4.7-FP8 detected from vLLM server log untuned warnings on gfx950.

omirosh requested review from a team and Copilot May 20, 2026 13:35

Copilot started reviewing on behalf of omirosh May 20, 2026 13:35

Copilot AI reviewed May 20, 2026

omirosh wants to merge 1 commit into ROCm:main from amdsiloai:glm47-opt-bf16-gemm


2 participants
