A Unified Framework for Non-Autoregressive Language Models
XLM is a modular, research-friendly framework for developing and comparing non-autoregressive language models. It is built on PyTorch and PyTorch Lightning, with Hydra for configuration management.
Documentation (dev): https://dhruveshp.com/xlm-core/dev/
| Feature | Description |
|---|---|
| Modular design | Plug-and-play components: swap models, losses, predictors, and collators independently. |
| Lightning-powered | Distributed training, mixed precision, and logging via PyTorch Lightning. |
| Hydra configs | Hierarchical configuration with runtime overrides. |
| Multiple architectures | Several model families ship in xlm-models. |
| Research-oriented | Type annotations (including jaxtyping), debug modes, and hooks for metrics and evaluators. |
| Hub integration | Push checkpoints to the Hugging Face Hub. |
Install the core framework from PyPI:

```bash
pip install xlm-core
```

For the bundled model implementations in this repository:

```bash
pip install xlm-models
```

Python 3.11+ is required (setup.py).
Optional extras and contributor setup
Optional dependency groups (see setup.py):
pip install "xlm-core[safe]" # SAFE-style molecule preprocessing / evaluators
pip install "xlm-core[molgen]" # heavier GenMol / Biomemo-related stack
pip install "xlm-core[llm_eval]" # math-verify / LLM-style benchmarks
pip install "xlm-core[all]" # union of the above (used in CI)From a git checkout, install in editable mode and pull dev/test/docs/lint stacks as needed:
pip install -e .
pip install -r requirements/dev_requirements.txt
pip install -r requirements/test_requirements.txt
pip install -r requirements/docs_requirements.txt
pip install -r requirements/lint_requirements.txtFull detail: Dependencies.
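After installing, a quick smoke test confirms the package is importable. This assumes only that the import name is `xlm`, matching the `src/xlm/` layout shown under Repository layout below:

```bash
python -c "import xlm; print('xlm imported OK')"
```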
XLM is driven by Hydra. The usual entrypoint is:
```bash
xlm job_type=<JOB> job_name=<NAME> experiment=<CONFIG>
```

| Argument | Description |
|---|---|
| `job_type` | What to run (training, eval, data prep, etc.); see below. |
| `job_name` | A label for the run. |
| `experiment` | Hydra experiment config (e.g. `lm1b_ilm`). |
job_type reference
| Group | job_type values |
|---|---|
| Main | prepare_data, train, eval, generate |
| Checkpoints / Hub | extract_checkpoint (guide), push_to_hub (guide) |
| Hydra helpers | name (print resolved config tree + job name), print_predictor_params (dump predictor config as JSON) |
| External models | Additional values registered by external packages (external models, custom commands) |
External commands are dispatched by job_type after Hydra loads the config.
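For example, the `name` helper doubles as a dry run: it prints what Hydra resolves for an experiment without launching anything.

```bash
# Print the resolved config tree and job name without starting a run.
xlm job_type=name job_name=lm1b_ilm experiment=lm1b_ilm
```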
Example: ILM on LM1B (full workflow)
Prepare the data:

```bash
xlm job_type=prepare_data job_name=lm1b_prepare experiment=lm1b_ilm
```

Train:

```bash
# Quick debug: overfit a single batch
xlm job_type=train job_name=lm1b_ilm experiment=lm1b_ilm debug=overfit

# Full training
xlm job_type=train job_name=lm1b_ilm experiment=lm1b_ilm
```

Evaluate a checkpoint:

```bash
xlm job_type=eval job_name=lm1b_ilm experiment=lm1b_ilm \
    +eval.ckpt_path=<CHECKPOINT_PATH>
```

Generate:

```bash
xlm job_type=generate job_name=lm1b_ilm experiment=lm1b_ilm \
    +generation.ckpt_path=<CHECKPOINT_PATH>
```

To print generations to the console:

```bash
xlm job_type=generate job_name=lm1b_ilm experiment=lm1b_ilm \
    +generation.ckpt_path=<CHECKPOINT_PATH> \
    debug=[overfit,print_predictions]
```

Push to the Hugging Face Hub:

```bash
xlm job_type=push_to_hub job_name=lm1b_ilm_hub experiment=lm1b_ilm \
    +hub_checkpoint_path=<CHECKPOINT_PATH> \
    +hub.repo_id=<YOUR_REPO_ID>
```

A step-by-step copy of the hosted guide: Quick Start.
The companion package xlm-models registers six top-level families (see xlm-models/xlm_models.json). "Documented" means a conceptual guide exists on the docs site; "State" is a rough stability hint (the PyPI package as a whole is beta). For a cross-family comparison, see the Models overview.
| Tag | Name | Documented | State | Notes |
|---|---|---|---|---|
| `arlm` | Autoregressive LM (baseline) | Full | Beta | — |
| `ilm` | Insertion language model | Full | Beta | — |
| `mdlm` | Masked diffusion LM | Full | Beta | — |
| `mlm` | Masked language model (BERT-style) | Full | Beta | — |
| `flexmdm` | Flexible masked diffusion | Partial | Alpha | Variable-length / fragment-style masking; arXiv:2509.01025; source |
| `dream` | Dream-style decoder LM in XLM (`DreamXLMModel`, `DreamPredictor`, …) | Partial | Alpha | Source; backbone helpers in `xlm.backbones.dream` |
The API reference includes xlm and the four main xlm-models families (see API Reference).
setup.py also exposes two console scripts:

- `xlm-scaffold`: model scaffolding helper
- `xlm-push-to-hub`: dedicated Hub upload entrypoint (in addition to `job_type=push_to_hub`)
New architectures generally implement four pieces that plug into the harness (a rough sketch follows the table):
| Piece | Role |
|---|---|
| Model | Network and forward pass |
| Loss | Training objective |
| Predictor | Inference / generation |
| Collator | Batch construction |
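As an illustration of the shape these pieces take, here is a minimal sketch. Every class name and signature below is hypothetical, not the actual XLM API; the real base classes and registration hooks are covered in the guides listed next.

```python
import torch
from torch import nn

# Hypothetical sketch of the four pieces. XLM's real base classes,
# signatures, and registration hooks differ; see the guides below.

class MyModel(nn.Module):
    """Model: network and forward pass."""
    def __init__(self, vocab_size: int, d_model: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.proj = nn.Linear(d_model, vocab_size)

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        return self.proj(self.embed(input_ids))

class MyLoss:
    """Loss: map model outputs and targets to a scalar objective."""
    def __call__(self, logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        return nn.functional.cross_entropy(
            logits.view(-1, logits.size(-1)), labels.view(-1)
        )

class MyPredictor:
    """Predictor: turn a trained model into inference / generation."""
    def __init__(self, model: MyModel):
        self.model = model

    @torch.no_grad()
    def predict(self, input_ids: torch.Tensor) -> torch.Tensor:
        return self.model(input_ids).argmax(dim=-1)

class MyCollator:
    """Collator: list of raw examples -> model-ready padded batch."""
    def __call__(self, examples: list[list[int]]) -> dict[str, torch.Tensor]:
        max_len = max(len(e) for e in examples)
        padded = [e + [0] * (max_len - len(e)) for e in examples]
        return {"input_ids": torch.tensor(padded)}
```

The value of the split is that each piece varies independently: a new predictor can be benchmarked against an existing model, or a new collator swapped in under an existing loss, without touching the other pieces.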
Guides:
- Adding a task or dataset
- Data pipeline
- Metrics
- Evaluate
- External models
- Custom commands (`job_type` extensions)
- FAQ
Repository layout:

```
xlm-core/
├── src/xlm/       # Core framework (harness, datamodule, tasks, Hydra configs)
├── xlm-models/    # Model families (arlm, ilm, mlm, mdlm, flexmdm, dream, …)
├── docs/          # MkDocs source (published at https://dhruveshp.com/xlm-core/dev/)
├── tests/
├── requirements/
└── wiki/          # Legacy internal notes
```
We welcome contributions. See CONTRIBUTING.md and the Good First Issue list.
This project is licensed under the MIT License.
XLM is developed and maintained by IESL students at UMass Amherst.
Primary developers
External Model Contributors:
| Contributor | Model | Paper |
|---|---|---|
| Dhruvesh Patel | DILM | A Continuous Time Markov Chain Framework for Insertion Language Models |
We welcome external model contributions! Please see CONTRIBUTING.md for more details.
If you found this repository useful, please consider citing:
```bibtex
@article{patel2025xlm,
  title={XLM: A Python package for non-autoregressive language models},
  author={Patel, Dhruvesh and Maram, Durga Prasad and Chintha, Sai Sreenivas and Rozonoyer, Benjamin and McCallum, Andrew},
  journal={arXiv preprint arXiv:2512.17065},
  year={2025}
}
```

Built with care for the NLP research community
