adapt w2 quant layer format for Minimax2.5 #111

Open
shadowxz109 wants to merge 1 commit into Ascend:release/PoC_20260310 from shadowxz109:minimax_poc
Conversation

shadowxz109 commented Mar 18, 2026

Motivation

Adapt the w2 quant weight format for Minimax2.5.

Modifications

Add the '.0.w2.weight' key format to get_moe_scheme.
Change the prefix from mlp to block_sparse_moe.
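
One way to read the prefix change is as a rename of a single dotted segment in the module path. The following is a minimal sketch under that assumption; `remap_moe_prefix` and the example path are hypothetical and are not taken from this PR's diff.

```python
def remap_moe_prefix(prefix: str, old: str = "mlp", new: str = "block_sparse_moe") -> str:
    """Replace every whole dotted segment equal to `old` with `new`.

    Hypothetical helper: matching whole segments avoids touching names
    that merely contain "mlp" as a substring.
    """
    return ".".join(new if part == old else part for part in prefix.split("."))


# Illustrative path only; the real prefixes depend on the checkpoint layout.
print(remap_moe_prefix("model.layers.0.mlp.experts"))
```

Matching on whole segments rather than calling `str.replace` keeps a name such as `mlp_extra` unchanged.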

Accuracy Tests

Accuracy: 0.955
Invalid: 0.000
Latency: 38.473 s
Output throughput: 520.775 token/s

Benchmarking and Profiling

Checklist

Review Process

  1. Ping Merge Oncalls to start the PR flow. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments or contact authorized users to do so.
    • /tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci
  4. After green CI and required approvals, ask Merge Oncalls to merge.

@ascend-robot

CLA Signature Guide

@shadowxz109 , thanks for your pull request.

The following commit(s) are not associated with a signed Contributor License Agreement (CLA).

| Commit | Reason |
| --- | --- |
| 53ac4615 adapt w2 quant weight format foe... | the email used in the commit is not linked to a signed CLA! please verify that it matches the email you used when signing the CLA. |

To sign CLA, click here.

To check if your email is configured correctly, refer to the FAQs.

Once you've signed the CLA or updated your email, please comment /check-cla to revalidate CLA status.

shadowxz109 changed the title from "adapt w2 quant weight format foe Minimax2.5" to "adapt w2 quant weight format for Minimax2.5" on Mar 18, 2026

shadowxz109 commented Mar 18, 2026

/check-cla

@ascend-robot

CLA Signature Pass

shadowxz109, thanks for your pull request. All authors of the commits have signed the CLA. 👍

shadowxz109 changed the title from "adapt w2 quant weight format for Minimax2.5" to "adapt w2 quant weight layer format for Minimax2.5" on Mar 18, 2026
shadowxz109 changed the title from "adapt w2 quant weight layer format for Minimax2.5" to "adapt w2 quant layer format for Minimax2.5" on Mar 18, 2026
Comment on lines +217 to +218
prefix_in_quant_config_down = prefix + ".0.down_proj.weight"
prefix_in_quant_config_w2 = prefix + ".0.w2.weight"

Suggested change
- prefix_in_quant_config_down = prefix + ".0.down_proj.weight"
- prefix_in_quant_config_w2 = prefix + ".0.w2.weight"
+ # Entries are key suffixes; prepend prefix to form the full key.
+ prefix_in_quant_config_search_list = [".0.down_proj.weight", ".0.w2.weight"]
+ quant_scheme_entry = None
+ for prefix_in_quant_config in prefix_in_quant_config_search_list:
+     if (quant_scheme_entry := self.quant_description.get(prefix + prefix_in_quant_config, None)):
+         break
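
The fallback lookup in this suggestion can be sketched as a standalone function. This is a minimal illustration, assuming `quant_description` behaves like a plain dict keyed by full weight names; `find_quant_scheme_entry`, the example key, and the value `"W8A8_DYNAMIC"` are placeholders, not taken from this PR.

```python
from typing import Mapping, Optional, Sequence

# Key suffixes named in this PR: the existing down_proj form and the added w2 form.
DOWN_PROJ_SUFFIXES = (".0.down_proj.weight", ".0.w2.weight")


def find_quant_scheme_entry(
    quant_description: Mapping[str, str],
    prefix: str,
    suffixes: Sequence[str] = DOWN_PROJ_SUFFIXES,
) -> Optional[str]:
    """Return the first entry found under prefix + suffix, or None."""
    for suffix in suffixes:
        # Compare against None so an empty-string entry still counts as found.
        if (entry := quant_description.get(prefix + suffix)) is not None:
            return entry
    return None


# Placeholder data for illustration only.
quant_description = {
    "model.layers.0.block_sparse_moe.experts.0.w2.weight": "W8A8_DYNAMIC",
}
print(find_quant_scheme_entry(quant_description, "model.layers.0.block_sparse_moe.experts"))
```

Keeping the suffixes in a tuple means a further checkpoint naming variant needs only one new entry rather than another branch.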

3 participants