XLM
A Unified Framework for Non-Autoregressive Language Models



XLM is a modular, research-friendly framework for developing and comparing non-autoregressive language models. It is built on PyTorch and PyTorch Lightning, with Hydra for configuration management.

Documentation (dev): https://dhruveshp.com/xlm-core/dev/

Key features

| Feature | Description |
| --- | --- |
| Modular design | Plug-and-play components: swap models, losses, predictors, and collators independently. |
| Lightning-powered | Distributed training, mixed precision, and logging via PyTorch Lightning. |
| Hydra configs | Hierarchical configuration with runtime overrides. |
| Multiple architectures | Several model families ship in xlm-models. |
| Research-oriented | Type annotations (including jaxtyping; see the sketch after this table), debug modes, and hooks for metrics and evaluators. |
| Hub integration | Push checkpoints to the Hugging Face Hub. |
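
To give a flavor of the shape-annotated style, here is a minimal sketch using jaxtyping with PyTorch. The function and dimension names are illustrative only and are not part of the XLM API:

import torch
from torch import Tensor
from jaxtyping import Float, Int

# Hypothetical helper in the shape-annotated style; "batch", "seq", and
# "vocab" are symbolic dimension names declared via jaxtyping annotations.
def token_log_probs(
    logits: Float[Tensor, "batch seq vocab"],
    targets: Int[Tensor, "batch seq"],
) -> Float[Tensor, "batch seq"]:
    log_probs = torch.log_softmax(logits, dim=-1)
    # Pick out the log-probability assigned to each target token.
    return log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)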

Installation

pip install xlm-core

For the bundled model implementations in this repository:

pip install xlm-models

Python 3.11+ is required (setup.py).

Optional extras and contributor setup

Optional dependency groups (see setup.py):

pip install "xlm-core[safe]"      # SAFE-style molecule preprocessing / evaluators
pip install "xlm-core[molgen]"    # heavier GenMol / Biomemo-related stack
pip install "xlm-core[llm_eval]"  # math-verify / LLM-style benchmarks
pip install "xlm-core[all]"       # union of the above (used in CI)

From a git checkout, install in editable mode and pull dev/test/docs/lint stacks as needed:

pip install -e .
pip install -r requirements/dev_requirements.txt
pip install -r requirements/test_requirements.txt
pip install -r requirements/docs_requirements.txt
pip install -r requirements/lint_requirements.txt

Full detail: Dependencies.

CLI overview

XLM is driven by Hydra. The usual entrypoint is:

xlm job_type=<JOB> job_name=<NAME> experiment=<CONFIG>
| Argument | Description |
| --- | --- |
| job_type | What to run (training, eval, data prep, etc.); see below. |
| job_name | A label for the run. |
| experiment | Hydra experiment config (e.g. lm1b_ilm). |

job_type reference

| Group | job_type values |
| --- | --- |
| Main | prepare_data, train, eval, generate |
| Checkpoints / Hub | extract_checkpoint (guide), push_to_hub (guide) |
| Hydra helpers | name (print resolved config tree + job name), print_predictor_params (dump predictor config as JSON) |
| External models | Additional values registered by external packages (external models, custom commands) |

External commands are dispatched by job_type after Hydra loads the config.
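
Since the CLI is plain Hydra, a resolved configuration can also be inspected programmatically with Hydra's compose API. The config_path and config_name below are assumptions about the layout under src/xlm, so treat this as a sketch rather than a supported entrypoint:

from hydra import compose, initialize
from omegaconf import OmegaConf

# Assumed config location and root config name; check the Hydra configs
# under src/xlm for the actual values.
with initialize(version_base=None, config_path="src/xlm/configs"):
    cfg = compose(
        config_name="config",
        overrides=["job_type=train", "job_name=lm1b_ilm", "experiment=lm1b_ilm"],
    )
    print(OmegaConf.to_yaml(cfg))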

Example: ILM on LM1B (full workflow)

1. Prepare data

xlm job_type=prepare_data job_name=lm1b_prepare experiment=lm1b_ilm

2. Train

# Quick debug: overfit a single batch
xlm job_type=train job_name=lm1b_ilm experiment=lm1b_ilm debug=overfit

# Full training
xlm job_type=train job_name=lm1b_ilm experiment=lm1b_ilm

3. Evaluate

xlm job_type=eval job_name=lm1b_ilm experiment=lm1b_ilm \
    +eval.ckpt_path=<CHECKPOINT_PATH>

4. Generate

xlm job_type=generate job_name=lm1b_ilm experiment=lm1b_ilm \
    +generation.ckpt_path=<CHECKPOINT_PATH>

To print generations to the console:

xlm job_type=generate job_name=lm1b_ilm experiment=lm1b_ilm \
    +generation.ckpt_path=<CHECKPOINT_PATH> \
    debug=[overfit,print_predictions]

5. Push to the Hugging Face Hub

xlm job_type=push_to_hub job_name=lm1b_ilm_hub experiment=lm1b_ilm \
    +hub_checkpoint_path=<CHECKPOINT_PATH> \
    +hub.repo_id=<YOUR_REPO_ID>
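
Once pushed, the checkpoint repository can be pulled back down with the standard huggingface_hub client. The repo id below is a placeholder, and the exact files XLM uploads are an assumption; see the hosted guide for the authoritative flow:

from huggingface_hub import snapshot_download

# Download every file in the pushed repo to the local HF cache and
# return the local directory path. The repo id is a placeholder.
local_dir = snapshot_download(repo_id="your-username/lm1b-ilm")
print(f"Checkpoint files are in: {local_dir}")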

For a step-by-step walkthrough of this workflow, see the hosted Quick Start guide.

Model families (xlm-models)

The companion package xlm-models registers six top-level families (see xlm-models/xlm_models.json). In the table below, Documented indicates whether the site has a conceptual guide for the family, and State is a rough stability hint (the PyPI package as a whole is beta). For a cross-family comparison, see the Models overview.

| Tag | Name | Documented | State | Notes |
| --- | --- | --- | --- | --- |
| arlm | Autoregressive LM (baseline) | Full | Beta | |
| ilm | Insertion language model | Full | Beta | |
| mdlm | Masked diffusion LM | Full | Beta | |
| mlm | Masked language model (BERT-style) | Full | Beta | |
| flexmdm | Flexible masked diffusion | Partial | Alpha | Variable-length / fragment-style masking; arXiv:2509.01025; source |
| dream | Dream-style decoder LM in XLM (DreamXLMModel, DreamPredictor, …) | Partial | Alpha | Source; backbone helpers in xlm.backbones.dream |

The API reference includes xlm and the four main xlm-models families (see API Reference).

Other CLIs

setup.py also exposes:

  • xlm-scaffold — model scaffolding helper
  • xlm-push-to-hub — dedicated Hub upload entrypoint (in addition to job_type=push_to_hub)

Extending XLM

New architectures generally implement four pieces that plug into the harness (a hypothetical skeleton follows the table):

| Piece | Role |
| --- | --- |
| Model | Network and forward pass |
| Loss | Training objective |
| Predictor | Inference / generation |
| Collator | Batch construction |
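
As a rough illustration of how the pieces divide responsibilities, here is a minimal skeleton. The class names, base classes, and method signatures are placeholders rather than xlm-core's actual interfaces; consult the guides for the real contracts:

import torch
from torch import nn

class MyModel(nn.Module):
    """Model: the network and its forward pass."""
    def forward(self, input_ids: torch.Tensor) -> torch.Tensor: ...

class MyLoss:
    """Loss: maps model outputs and a batch to a scalar training objective."""
    def __call__(self, outputs: torch.Tensor, batch: dict) -> torch.Tensor: ...

class MyPredictor:
    """Predictor: the inference / generation loop built on top of the model."""
    def generate(self, model: MyModel, prompt: torch.Tensor) -> torch.Tensor: ...

class MyCollator:
    """Collator: turns raw examples into model-ready batches."""
    def __call__(self, examples: list[dict]) -> dict: ...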

Guides:

Developers

Project layout

xlm-core/
├── src/xlm/              # Core framework (harness, datamodule, tasks, Hydra configs)
├── xlm-models/           # Model families (arlm, ilm, mlm, mdlm, flexmdm, dream, …)
├── docs/                 # MkDocs source (published as https://dhruveshp.com/xlm-core/dev/)
├── tests/
├── requirements/
└── wiki/                 # Legacy internal notes

Contributing

We welcome contributions. See CONTRIBUTING.md and the Good First Issue list.

License

This project is licensed under the MIT License.

Acknowledgements

XLM is developed and maintained by IESL students at UMass Amherst.

Primary developers

  1. Dhruvesh Patel
  2. Durga Prasad Maram
  3. Sai Sreenivas Chintha
  4. Benjamin Rozonoyer

External Model Contributors:

| Contributor | Model | Paper |
| --- | --- | --- |
| Dhruvesh Patel | DILM | A Continuous Time Markov Chain Framework for Insertion Language Models |

We welcome external model contributions! Please see CONTRIBUTING.md for more details.

Cite

If you found this repository useful, please consider citing:

@article{patel2025xlm,
  title={XLM: A Python package for non-autoregressive language models},
  author={Patel, Dhruvesh and Maram, Durga Prasad and Chintha, Sai Sreenivas and Rozonoyer, Benjamin and McCallum, Andrew},
  journal={arXiv preprint arXiv:2512.17065},
  year={2025}
}

Built with care for the NLP research community
