Add OLMo-7B-Instruct NeuronX port #100

Open
dhwanw wants to merge 1 commit into main from contrib/olmo-7b-instruct

dhwanw commented Mar 21, 2026

Description

Adds a NeuronX port of OLMo-7B-Instruct (model_type=hf_olmo) to the contrib models collection.

Model Information

| Field | Value |
|-------|-------|
| Model | allenai/OLMo-7B-Instruct |
| Architecture | OLMoForCausalLM (decoder-only, non-affine LayerNorm, SwiGLU) |
| Parameters | 6.9B |
| TP Degree | 2 |
| Precision | BF16 |
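
For reviewers unfamiliar with the term, a non-affine LayerNorm normalizes the last dimension to zero mean and unit variance without a learnable scale or shift. The sketch below illustrates that computation with explicit elementwise ops. It uses NumPy rather than the port's actual tensor code, and both the function name `layer_norm_no_affine` and the `eps` default are assumptions for illustration, not values taken from modeling_olmo.py.

```python
import numpy as np


def layer_norm_no_affine(x: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    """Normalize the last axis to zero mean / unit variance, no weight or bias.

    eps is an assumed default; the real value should come from the model config.
    """
    mean = x.mean(axis=-1, keepdims=True)
    centered = x - mean
    # Biased (population) variance, matching the usual LayerNorm definition.
    var = (centered * centered).mean(axis=-1, keepdims=True)
    return centered / np.sqrt(var + eps)


# Toy activations: 2 tokens, hidden size 4.
x = np.array([[1.0, 2.0, 3.0, 4.0], [10.0, 0.0, -10.0, 5.0]])
y = layer_norm_no_affine(x)
```

Because there are no parameters, nothing needs to be loaded from the checkpoint for this layer.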

Checklist

  • Model compiles successfully on Neuron
  • Token matching validated (93.75% greedy, 99.38% teacher-forced)
  • README with architecture details, usage, validation results
  • Integration tests included

Folder Structure

contrib/models/olmo-7b-instruct/
├── README.md
├── src/
│   ├── __init__.py
│   └── modeling_olmo.py
└── test/
    └── integration/
        └── test_model.py

Testing

  • Token Match (greedy): 93.75% (10 prompts, 32 tokens each)
  • Token Match (teacher-forced): 99.38%
  • 9/10 prompts at 100% greedy match
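
The gap between the two metrics is easier to see with a toy example. The sketch below shows one common way to compute them, assuming greedy matching feeds the candidate's own tokens back as context while teacher forcing feeds the reference tokens. Whether test_model.py computes them exactly this way is not stated here, and `greedy_match`, `teacher_forced_match`, and `candidate` are hypothetical names.

```python
from typing import Callable, List, Sequence

# A "model" here is any function mapping a token prefix to the next token id.
NextToken = Callable[[Sequence[int]], int]


def greedy_match(model: NextToken, prompt: Sequence[int], reference: Sequence[int]) -> float:
    """Free-running decode: the model's own predictions become its context."""
    context: List[int] = list(prompt)
    hits = 0
    for ref_tok in reference:
        tok = model(context)
        hits += int(tok == ref_tok)
        context.append(tok)  # feed back the prediction, so errors can cascade
    return hits / len(reference)


def teacher_forced_match(model: NextToken, prompt: Sequence[int], reference: Sequence[int]) -> float:
    """Teacher forcing: the reference tokens are always used as context."""
    context: List[int] = list(prompt)
    hits = 0
    for ref_tok in reference:
        tok = model(context)
        hits += int(tok == ref_tok)
        context.append(ref_tok)  # feed back the reference, so errors stay local
    return hits / len(reference)


def candidate(context: Sequence[int]) -> int:
    """Toy model: counts upward, but makes one mistake at context length 4."""
    step = 2 if len(context) == 4 else 1
    return context[-1] + step


prompt = [0, 1]
reference = [2, 3, 4, 5, 6, 7, 8, 9]  # what a flawless counter would emit
greedy = greedy_match(candidate, prompt, reference)
forced = teacher_forced_match(candidate, prompt, reference)
print(f"greedy={greedy:.3f} teacher_forced={forced:.3f}")
```

In the toy run a single wrong token costs one position under teacher forcing but every later position under greedy decoding. That pattern is consistent with the figures above: with nine of ten prompts at 100%, all 20 greedy mismatches out of 320 tokens fall in one prompt.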

Note: The port uses a custom OlmoInferenceConfig to map hf_olmo config fields (d_model, n_heads, n_layers) to standard names. It splits the fused QKV weight (att_proj, shape [3H, H]) into Q/K/V and the fused MLP weight (ff_proj, shape [2I, H]) into gate/up projections. The non-affine LayerNorm (no learnable weight or bias) is implemented with explicit tensor ops for Neuron traceability.
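
A rough sketch of the config remapping and weight splitting described in the note follows. It is written with NumPy on toy shapes and is not the code in modeling_olmo.py. The target names (`hidden_size`, `num_attention_heads`, `num_hidden_layers`) are assumed to be the "standard names", and all helper names are hypothetical. The gate/up ordering inside ff_proj is exposed as a flag because it must be checked against the checkpoint's SwiGLU implementation; the default here assumes the second half is the gate.

```python
import numpy as np

# Assumed mapping from hf_olmo field names to common Hugging Face names.
HF_OLMO_TO_STANDARD = {
    "d_model": "hidden_size",
    "n_heads": "num_attention_heads",
    "n_layers": "num_hidden_layers",
}


def remap_config(hf_olmo_cfg: dict) -> dict:
    """Rename known hf_olmo keys; leave every other key untouched."""
    return {HF_OLMO_TO_STANDARD.get(k, k): v for k, v in hf_olmo_cfg.items()}


def split_fused_qkv(att_proj: np.ndarray, hidden_size: int):
    """Split a fused [3H, H] weight into Q, K, V blocks of shape [H, H] (MHA)."""
    assert att_proj.shape == (3 * hidden_size, hidden_size)
    q, k, v = np.split(att_proj, 3, axis=0)
    return q, k, v


def split_fused_mlp(ff_proj: np.ndarray, intermediate_size: int, gate_first: bool = False):
    """Split a fused [2I, H] weight into (gate, up), each of shape [I, H].

    gate_first=False assumes the first half is `up` and the second is `gate`;
    verify this ordering against the checkpoint before relying on it.
    """
    assert ff_proj.shape[0] == 2 * intermediate_size
    first, second = np.split(ff_proj, 2, axis=0)
    return (first, second) if gate_first else (second, first)


# Toy sizes so the split is easy to inspect.
cfg = remap_config({"d_model": 4, "n_heads": 2, "n_layers": 1})
H, I = cfg["hidden_size"], 6
att_proj = np.arange(3 * H * H, dtype=np.float32).reshape(3 * H, H)
ff_proj = np.arange(2 * I * H, dtype=np.float32).reshape(2 * I, H)

q, k, v = split_fused_qkv(att_proj, H)
gate, up = split_fused_mlp(ff_proj, I)
```

Splitting along the output (row) axis keeps each projection's input dimension intact, so the resulting blocks can be sharded across TP ranks like ordinary separate projections.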

Compatibility

  • Neuron SDK: 2.22+
  • Instance: trn1.32xlarge

🤖 Generated with Claude Code

Port of allenai/OLMo-7B-Instruct (hf_olmo format) to NeuronX Distributed
Inference. Key features:
- Fused QKV/MLP weight splitting from hf_olmo checkpoint format
- Non-affine LayerNorm (no learnable params) with explicit ops for tracing
- SwiGLU activation, MHA with 32 heads, RoPE
- Validated at 99.38% teacher-forced / 93.75% greedy token match (TP=2, bf16)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>