[WIP] Add HyperCLOVAX model #44956
Draft
bigshanedogg wants to merge 2 commits into huggingface:main
HanFa added a commit to HanFa/vllm that referenced this pull request on Mar 29, 2026:
Vendor the HyperCLOVAX Vision config into vLLM to fix transformers v5 compatibility. The upstream remote-code config does not handle empty initialization (`text_config=None`), which breaks v5's `@strict` config validation added in huggingface/transformers#41250.

Fixes: vllm-project#38387

TODO: Remove the vendored config once HyperCLOVAX is upstreamed to transformers. Tracking PR: huggingface/transformers#44956

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
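The failure mode described in this commit message (a config that raises under strict validation when built with no arguments) comes down to defaulting the sub-configs. A minimal sketch of that defensive pattern, using hypothetical class and field names rather than the actual vendored config:

```python
# Illustrative sketch only: the class and attribute names below are
# hypothetical, not the real vendored HyperCLOVAX config in vLLM.
class VisionLanguageConfigSketch:
    def __init__(self, text_config=None, vision_config=None, **kwargs):
        # Fall back to empty dicts so empty initialization
        # (text_config=None) does not trip strict sub-config validation.
        self.text_config = dict(text_config or {})
        self.vision_config = dict(vision_config or {})


cfg = VisionLanguageConfigSketch()  # must not raise with no arguments
```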
Force-pushed from b31ff44 to ef1e73f
Contributor

[For maintainers] Suggested jobs to run (before merge): run-slow: auto, hyperclovax
What does this PR do?
Adds native Transformers support for HyperCLOVA X SEED Think 14B,
a 14.74B-parameter Korean reasoning LLM developed by NAVER Cloud.
Architecture
LLaMA-style decoder-only transformer with two modifications:
- Post-norm (`use_post_norm`): an extra RMSNorm is applied after each sub-layer output (both attention and MLP), in addition to the standard pre-norm.
- MuP-style scaling multipliers:
  - `attention_multiplier`: replaces `1/sqrt(head_dim)` in attention
  - `residual_multiplier`: scales each sub-layer output before adding to the residual stream
  - `embedding_multiplier`: scales the token embedding output
  - `logits_scaling`: scales final logits before softmax / sampling

Implementation approach
Following the maintainer's guidance in #44957, this PR uses the modular system (`modular_hyperclovax.py`) to minimise LOC and make the diff easy to review and iterate on. (Roughly 59% of lines are generated rather than manually maintained.)

The maintainer suggested inheriting the decoder layer with post-norms from GLM4. After evaluation, Granite was chosen as the decoder-layer base instead, for the following reasons:
- `use_post_norm` is optional (`False` by default). GLM4's decoder layer has post-norms always on; inheriting from it would require logic to conditionally disable `post_self_attn_layernorm` / `post_mlp_layernorm`, adding complexity rather than reducing it.
- Granite already provides `residual_multiplier` (always-active MuP). When `use_post_norm=False`, `HyperCLOVAXDecoderLayer` is identical to `GraniteDecoderLayer`: zero extra code.
- Inheriting from GLM4 would instead require both adding `residual_multiplier` and conditionally disabling its built-in norms: two changes in opposite directions for no net gain in code reuse.

All other modules (RMSNorm, MLP, Attention, etc.) are inherited from Granite unchanged. The modular file is a few hundred LOC, as suggested.
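To make the inheritance argument concrete, here is a minimal pure-Python sketch of the decoder-layer control flow described above. `use_post_norm` and `residual_multiplier` are the config names from this PR; the scalar "activations", stand-in sub-layers, and identity `rms_norm` are illustrative only, not the actual implementation:

```python
def rms_norm(x):
    # Stand-in for RMSNorm; identity on scalars, for illustration only.
    return x


class DecoderLayerSketch:
    """Illustrative sketch of the HyperCLOVAX decoder-layer control flow."""

    def __init__(self, use_post_norm=False, residual_multiplier=1.0):
        self.use_post_norm = use_post_norm
        self.residual_multiplier = residual_multiplier

    def attention(self, x):
        return x * 0.5  # stand-in attention sub-layer

    def mlp(self, x):
        return x * 0.25  # stand-in MLP sub-layer

    def forward(self, hidden):
        # pre-norm -> attention -> optional post-norm -> scaled residual add
        out = self.attention(rms_norm(hidden))
        if self.use_post_norm:
            out = rms_norm(out)
        hidden = hidden + self.residual_multiplier * out

        # same pattern for the MLP sub-layer
        out = self.mlp(rms_norm(hidden))
        if self.use_post_norm:
            out = rms_norm(out)
        hidden = hidden + self.residual_multiplier * out
        return hidden
```

With `use_post_norm=False` and `residual_multiplier=1.0`, the flow reduces to a plain pre-norm Granite-style layer, which is the zero-extra-code case argued above.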
(WIP) Benchmark validation
External support
Code Agent Policy
A code agent was used for mechanical tasks such as aligning docstrings and comments. The core implementation was written directly by the submitter, who has reviewed every changed line and personally run the tests, including benchmark validation.
Before submitting

- [ ] Did you read the contributor guideline, Pull Request section?
- [ ] Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- [ ] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.