A Unified Framework for Non-Autoregressive Language Models
XLM is a modular, research-friendly framework for developing and comparing non-autoregressive language models. It is built on PyTorch and PyTorch Lightning, with Hydra for configuration management.
Documentation (dev): https://dhruveshp.com/xlm-core/dev/
| Feature | Description |
|---|---|
| Modular design | Plug-and-play components: swap models, losses, predictors, and collators independently. |
| Lightning-powered | Distributed training, mixed precision, and logging via PyTorch Lightning. |
| Hydra configs | Hierarchical configuration with runtime overrides. |
| Multiple architectures | Several model families ship in xlm-models. |
| Research-oriented | Type annotations (including jaxtyping), debug modes, and hooks for metrics and evaluators. |
| Hub integration | Push checkpoints to the Hugging Face Hub. |
Install the core framework from PyPI:

```bash
pip install xlm-core
```

For the bundled model implementations in this repository:

```bash
pip install xlm-models
```

Python 3.11+ is required (setup.py).
Optional extras and contributor setup
Optional dependency groups (see setup.py):
pip install "xlm-core[safe]" # SAFE-style molecule preprocessing / evaluators
pip install "xlm-core[molgen]" # heavier GenMol / Biomemo-related stack
pip install "xlm-core[llm_eval]" # math-verify / LLM-style benchmarks
pip install "xlm-core[all]" # union of the above (used in CI)From a git checkout, install in editable mode and pull dev/test/docs/lint stacks as needed:
pip install -e .
pip install -r requirements/dev_requirements.txt
pip install -r requirements/test_requirements.txt
pip install -r requirements/docs_requirements.txt
pip install -r requirements/lint_requirements.txtFull detail: Dependencies.
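After installing, a quick smoke test confirms the package is importable. This assumes only that the import name is `xlm`, matching the `src/xlm/` layout shown under Repository layout below:

```bash
python -c "import xlm; print('xlm imported OK')"
```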
XLM is driven by Hydra. The usual entrypoint is:
```bash
xlm job_type=<JOB> job_name=<NAME> experiment=<CONFIG>
```

| Argument | Description |
|---|---|
| `job_type` | What to run (training, eval, data prep, etc.); see below. |
| `job_name` | A label for the run. |
| `experiment` | Hydra experiment config (e.g. `lm1b_ilm`). |
job_type reference
| Group | job_type values |
|---|---|
| Main | prepare_data, train, eval, generate |
| Checkpoints / Hub | extract_checkpoint (guide), push_to_hub (guide) |
| Hydra helpers | name (print resolved config tree + job name), print_predictor_params (dump predictor config as JSON) |
| External models | Additional values registered by external packages (external models, custom commands) |
External commands are dispatched by job_type after Hydra loads the config.
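For example, the `name` helper doubles as a dry run: it prints what Hydra resolves for an experiment without launching anything.

```bash
# Print the resolved config tree and job name without starting a run.
xlm job_type=name job_name=lm1b_ilm experiment=lm1b_ilm
```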
Example: ILM on LM1B (full workflow)
Prepare the data:

```bash
xlm job_type=prepare_data job_name=lm1b_prepare experiment=lm1b_ilm
```

Train:

```bash
# Quick debug: overfit a single batch
xlm job_type=train job_name=lm1b_ilm experiment=lm1b_ilm debug=overfit

# Full training
xlm job_type=train job_name=lm1b_ilm experiment=lm1b_ilm
```

Evaluate a checkpoint:

```bash
xlm job_type=eval job_name=lm1b_ilm experiment=lm1b_ilm \
    +eval.ckpt_path=<CHECKPOINT_PATH>
```

Generate:

```bash
xlm job_type=generate job_name=lm1b_ilm experiment=lm1b_ilm \
    +generation.ckpt_path=<CHECKPOINT_PATH>
```

To print generations to the console:

```bash
xlm job_type=generate job_name=lm1b_ilm experiment=lm1b_ilm \
    +generation.ckpt_path=<CHECKPOINT_PATH> \
    debug=[overfit,print_predictions]
```

Push to the Hugging Face Hub:

```bash
xlm job_type=push_to_hub job_name=lm1b_ilm_hub experiment=lm1b_ilm \
    +hub_checkpoint_path=<CHECKPOINT_PATH> \
    +hub.repo_id=<YOUR_REPO_ID>
```

A step-by-step copy of the hosted guide: Quick Start.
The companion package xlm-models registers six top-level families (see xlm-models/xlm_models.json). "Documented" means a conceptual guide exists on the docs site; "State" is a rough stability hint (the PyPI package as a whole is beta). For a cross-family comparison, see the Models overview.
| Tag | Name | Documented | State | Notes |
|---|---|---|---|---|
| `arlm` | Autoregressive LM (baseline) | Full | Beta | — |
| `ilm` | Insertion language model | Full | Beta | — |
| `mdlm` | Masked diffusion LM | Full | Beta | — |
| `mlm` | Masked language model (BERT-style) | Full | Beta | — |
| `flexmdm` | Flexible masked diffusion | Partial | Alpha | Variable-length / fragment-style masking; arXiv:2509.01025; source |
| `dream` | Dream-style decoder LM in XLM (`DreamXLMModel`, `DreamPredictor`, …) | Partial | Alpha | Source; backbone helpers in `xlm.backbones.dream` |
The API reference includes xlm and the four main xlm-models families (see API Reference).
setup.py also exposes two console scripts:

- `xlm-scaffold`: model scaffolding helper
- `xlm-push-to-hub`: dedicated Hub upload entrypoint (in addition to `job_type=push_to_hub`)
New architectures generally implement four pieces that plug into the harness (a rough sketch follows the table):
| Piece | Role |
|---|---|
| Model | Network and forward pass |
| Loss | Training objective |
| Predictor | Inference / generation |
| Collator | Batch construction |
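As an illustration of the shape these pieces take, here is a minimal sketch. Every class name and signature below is hypothetical, not the actual XLM API; the real base classes and registration hooks are covered in the guides listed next.

```python
import torch
from torch import nn

# Hypothetical sketch of the four pieces. XLM's real base classes,
# signatures, and registration hooks differ; see the guides below.

class MyModel(nn.Module):
    """Model: network and forward pass."""
    def __init__(self, vocab_size: int, d_model: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.proj = nn.Linear(d_model, vocab_size)

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        return self.proj(self.embed(input_ids))

class MyLoss:
    """Loss: map model outputs and targets to a scalar objective."""
    def __call__(self, logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        return nn.functional.cross_entropy(
            logits.view(-1, logits.size(-1)), labels.view(-1)
        )

class MyPredictor:
    """Predictor: turn a trained model into inference / generation."""
    def __init__(self, model: MyModel):
        self.model = model

    @torch.no_grad()
    def predict(self, input_ids: torch.Tensor) -> torch.Tensor:
        return self.model(input_ids).argmax(dim=-1)

class MyCollator:
    """Collator: list of raw examples -> model-ready padded batch."""
    def __call__(self, examples: list[list[int]]) -> dict[str, torch.Tensor]:
        max_len = max(len(e) for e in examples)
        padded = [e + [0] * (max_len - len(e)) for e in examples]
        return {"input_ids": torch.tensor(padded)}
```

The value of the split is that each piece varies independently: a new predictor can be benchmarked against an existing model, or a new collator swapped in under an existing loss, without touching the other pieces.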
Guides:
- Adding a task or dataset
- Data pipeline
- Metrics
- Evaluate
- External models
- Custom commands (`job_type` extensions)
- FAQ
Repository layout:

```
xlm-core/
├── src/xlm/       # Core framework (harness, datamodule, tasks, Hydra configs)
├── xlm-models/    # Model families (arlm, ilm, mlm, mdlm, flexmdm, dream, …)
├── docs/          # MkDocs source (published at https://dhruveshp.com/xlm-core/dev/)
├── tests/
├── requirements/
└── wiki/          # Legacy internal notes
```
We welcome contributions. See CONTRIBUTING.md and the Good First Issue list.
This project is licensed under the MIT License.
XLM is developed and maintained by IESL students at UMass Amherst.
Primary developers
External Model Contributors:
| Contributor | Model | Paper |
|---|---|---|
| Dhruvesh Patel | DILM | A Continuous Time Markov Chain Framework for Insertion Language Models |
We welcome external model contributions! Please see CONTRIBUTING.md for more details.
If you found this repository useful, please consider citing:
```bibtex
@article{patel2025xlm,
  title={XLM: A Python package for non-autoregressive language models},
  author={Patel, Dhruvesh and Maram, Durga Prasad and Chintha, Sai Sreenivas and Rozonoyer, Benjamin and McCallum, Andrew},
  journal={arXiv preprint arXiv:2512.17065},
  year={2025}
}
```

Built with care for the NLP research community
