adapt w2 quant layer format for Minimax2.5 by shadowxz109 · Pull Request #111 · Ascend/sglang

shadowxz109 · 2026-03-18T08:15:06Z

Motivation

Adapt the w2 quantized weight key format for Minimax2.5.

Modifications

Add the '.0.w2.weight' key format to the lookup in get_moe_scheme.
Change the layer prefix from mlp to block_sparse_moe.
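
A minimal sketch of what the two modifications amount to, for readers without the diff at hand. The helper names `remap_moe_prefix` and `candidate_keys` and the example layer path are hypothetical and are not taken from this PR; only the two key suffixes and the mlp / block_sparse_moe names come from the description above.

```python
def remap_moe_prefix(prefix: str) -> str:
    """Replace a whole 'mlp' path segment with 'block_sparse_moe'.

    Hypothetical helper: matching whole dot-separated segments avoids
    touching names that merely contain the substring 'mlp'.
    """
    return ".".join(
        "block_sparse_moe" if part == "mlp" else part
        for part in prefix.split(".")
    )


def candidate_keys(prefix: str) -> list[str]:
    """Return the quant-config keys to try for expert 0, in priority order."""
    return [prefix + ".0.down_proj.weight", prefix + ".0.w2.weight"]


# Assumed example path; the real module names depend on the checkpoint.
prefix = remap_moe_prefix("model.layers.0.mlp.experts")
print(prefix)
print(candidate_keys(prefix))
```

Keeping the down_proj key first preserves the existing behaviour for checkpoints that already use that name, and the w2 key is only tried as a fallback.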

Accuracy Tests

Accuracy: 0.955
Invalid: 0.000
Latency: 38.473 s
Output throughput: 520.775 token/s

Benchmarking and Profiling

Checklist

Format your code according to the Format code with pre-commit.
Add unit tests according to the Run and add unit tests.
Update documentation according to Write documentations.
Provide accuracy and speed benchmark results according to Test the accuracy and Benchmark the speed.
Follow the SGLang code style guidance.

Review Process

Ping Merge Oncalls to start the PR flow. See the PR Merge Process.
Get approvals from CODEOWNERS and other reviewers.
Trigger CI tests with comments or contact authorized users to do so.
- /tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci
After green CI and required approvals, ask Merge Oncalls to merge.

ascend-robot · 2026-03-18T08:15:17Z

CLA Signature Guide

@shadowxz109 , thanks for your pull request.

The following commit(s) are not associated with a signed Contributor License Agreement (CLA).

Commit	Reason
53ac4615 adapt w2 quant weight format foe...	The email used in the commit is not linked to a signed CLA. Please verify that it matches the email you used when signing the CLA.

To sign CLA, click here.

To check if your email is configured correctly, refer to the FAQs.

Once you've signed the CLA or updated your email, please comment /check-cla to revalidate the CLA status.

shadowxz109 · 2026-03-18T08:23:04Z

/check-cla

ascend-robot · 2026-03-18T08:23:31Z

CLA Signature Pass

shadowxz109, thanks for your pull request. All authors of the commits have signed the CLA. 👍

iforgetmyname · 2026-03-19T01:28:20Z

+        prefix_in_quant_config_down = prefix + ".0.down_proj.weight"
+        prefix_in_quant_config_w2 = prefix + ".0.w2.weight"


Suggested change

-        prefix_in_quant_config_down = prefix + ".0.down_proj.weight"
-        prefix_in_quant_config_w2 = prefix + ".0.w2.weight"
+        prefix_in_quant_config_search_list = [".0.down_proj.weight", ".0.w2.weight"]
+        quant_scheme_entry = None
+        for prefix_in_quant_config in prefix_in_quant_config_search_list:
+            if (quant_scheme_entry := self.quant_description.get(prefix + prefix_in_quant_config, None)):
+                break
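
The search-list idea in the suggestion above can be tried outside the class as a standalone function. This is only a sketch under assumptions: `find_quant_scheme_entry`, the example dictionary, and the value `"W8A8_DYNAMIC"` are placeholders invented for illustration, and the real `quant_description` in the repository may hold different keys and values.

```python
from typing import Mapping, Optional

# Key suffixes to try, in priority order (taken from the suggestion above).
SUFFIX_SEARCH_LIST = (".0.down_proj.weight", ".0.w2.weight")


def find_quant_scheme_entry(
    quant_description: Mapping[str, str], prefix: str
) -> Optional[str]:
    """Return the first entry found under prefix + suffix, or None."""
    for suffix in SUFFIX_SEARCH_LIST:
        entry = quant_description.get(prefix + suffix)
        if entry is not None:
            return entry
    return None


# Placeholder data: a checkpoint that only provides the w2-style key.
quant_description = {
    "model.layers.0.block_sparse_moe.experts.0.w2.weight": "W8A8_DYNAMIC",
}
print(find_quant_scheme_entry(quant_description, "model.layers.0.block_sparse_moe.experts"))
print(find_quant_scheme_entry(quant_description, "model.layers.1.block_sparse_moe.experts"))
```

One difference from the walrus form: this version compares against None explicitly, so an empty-string entry would still count as found, whereas `if (x := d.get(...))` skips any falsy value.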

adapt w2 quant weight format foe Minimax2.5

53ac461

ascend-robot added the ascend-cla/no label Mar 18, 2026

shadowxz109 changed the title ~~adapt w2 quant weight format foe Minimax2.5~~ adapt w2 quant weight format for Minimax2.5 Mar 18, 2026

ascend-robot added ascend-cla/yes and removed ascend-cla/no labels Mar 18, 2026

shadowxz109 changed the title ~~adapt w2 quant weight format for Minimax2.5~~ adapt w2 quant weight layer format for Minimax2.5 Mar 18, 2026

shadowxz109 changed the title ~~adapt w2 quant weight layer format for Minimax2.5~~ adapt w2 quant layer format for Minimax2.5 Mar 18, 2026

iforgetmyname reviewed Mar 19, 2026

shadowxz109 wants to merge 1 commit into Ascend:release/PoC_20260310 from shadowxz109:minimax_poc
