prefect-submitit

A Prefect 3 TaskRunner that submits tasks to SLURM clusters via submitit.

Features

  • sbatch submission -- submit individual tasks as SLURM jobs and task.map() calls as job arrays, with automatic chunking when arrays exceed cluster limits
  • srun submission -- run tasks as srun steps within an existing allocation (salloc), avoiding per-task scheduling overhead
  • Batched execution -- group multiple items per SLURM job with units_per_worker to reduce scheduling overhead (works with both sbatch and srun modes)
  • Local mode -- swap to local execution for development without changing your flow code
  • Prefect UI integration -- task run names include SLURM job IDs for easy cross-referencing with squeue/sacct

Requirements

  • Python >= 3.12
  • Prefect >= 3.6, < 4.0
  • SLURM cluster (for sbatch/srun modes) or Docker (for local development)

Installation

pip install prefect-submitit

With pixi:

pixi add prefect-submitit

With conda:

conda install -c conda-forge prefect-submitit

Quick start

from prefect import flow, task
from prefect_submitit import SlurmTaskRunner


@task
def add(x: int, y: int) -> int:
    return x + y


@flow(task_runner=SlurmTaskRunner(partition="cpu", time_limit="00:10:00"))
def my_flow():
    # Single task
    future = add.submit(1, 2)
    print(future.result())  # 3

    # Map over inputs (submitted as a SLURM job array)
    futures = add.map([1, 2, 3], [4, 5, 6])
    print([f.result() for f in futures])  # [5, 7, 9]


if __name__ == "__main__":
    my_flow()

Execution modes

The runner supports three execution modes, selected via the execution_mode parameter or the SLURM_TASKRUNNER_BACKEND environment variable (slurm, srun, or local):

Mode   Dispatch  Requires                          Best for
slurm  sbatch    SLURM access                      Batch workloads
srun   srun      Active allocation (SLURM_JOB_ID)  Interactive / low-latency
local  None      Nothing                           Development and testing

slurm (default)

Each .submit() becomes a SLURM job via sbatch. Each .map() becomes a job array with automatic chunking when the array exceeds cluster limits.

SlurmTaskRunner(execution_mode="slurm", partition="gpu", gpus_per_node=1)

srun

Runs tasks as srun steps inside an existing SLURM allocation. Requires a prior salloc or sbatch session with SLURM_JOB_ID set. Avoids per-task scheduling overhead, making it well suited to many small tasks sharing a single allocation.

salloc -N2 --mem=32G --time=02:00:00 -- python my_flow.py
SlurmTaskRunner(execution_mode="srun", mem_gb=4, cpus_per_task=2)

local

Runs tasks locally via submitit's LocalExecutor. SLURM parameters are ignored. For development and testing without a cluster.

SlurmTaskRunner(execution_mode="local")

Or via environment variable:

export SLURM_TASKRUNNER_BACKEND=local
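
Because execution_mode defaults to None and falls back to this variable (see Configuration below), the same runner construction works in every mode; a minimal sketch:

# With SLURM_TASKRUNNER_BACKEND=local this executes locally and ignores the
# SLURM parameters; unset (or set to "slurm"), it submits via sbatch.
runner = SlurmTaskRunner(partition="cpu", time_limit="00:10:00")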

Configuration

Parameter                Default         Description
partition                "cpu"           SLURM partition
time_limit               "01:00:00"      Wall time (HH:MM:SS)
mem_gb                   4               Memory per job in GB
gpus_per_node            0               GPUs per job
cpus_per_task            1               CPUs per task
units_per_worker         1               Items per SLURM job (>1 enables batched execution)
slurm_array_parallelism  1000            Max concurrent array tasks
execution_mode           None            "slurm", "srun", or "local". Falls back to SLURM_TASKRUNNER_BACKEND env var, then "slurm"
poll_interval            mode-dependent  Seconds between status checks (slurm=5.0, srun=0.5, local=1.0)
max_poll_time            None            Max seconds to poll before timing out. Default: time_limit × 2
log_folder               "slurm_logs"    Directory for submitit logs
fail_on_error            True            Raise on SLURM job failure
max_array_size           None            Override auto-detected cluster MaxArraySize
srun_launch_concurrency  128             Max concurrent srun steps (srun mode only)

Additional keyword arguments are passed through to submitit (e.g. slurm_gres="gpu:a100:1").
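
For example, combining several of the documented parameters with a pass-through submitit keyword (the resource values are illustrative, not recommendations):

runner = SlurmTaskRunner(
    partition="gpu",
    time_limit="04:00:00",
    mem_gb=32,
    gpus_per_node=1,
    cpus_per_task=8,
    units_per_worker=4,           # batch 4 mapped items into each SLURM job
    slurm_array_parallelism=200,  # cap concurrent array tasks
    slurm_gres="gpu:a100:1",      # extra kwargs are passed through to submitit
)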

Examples

The examples/ directory contains Jupyter notebooks demonstrating each feature on a real SLURM cluster:

Notebook                            Covers
01_single_task_submission           Submitting individual tasks as SLURM jobs
02_job_arrays_with_map              task.map() with automatic job array chunking
03_batched_execution                Grouping items per job with units_per_worker
04_error_handling_and_cancellation  Failure propagation and job cancellation
05_local_mode_and_development       Local execution mode for dev/testing
slurm_submit_and_run.py             Minimal script for the Docker SLURM environment

To run the notebooks: install dependencies, register the Jupyter kernel, and start the Prefect server (see Development below), then open any notebook and select the Prefect-Submitit kernel.

Prefect server

The repo includes a prefect-server CLI to run a local Prefect server backed by PostgreSQL (handles SLURM concurrency better than SQLite). The server uses UID-based port allocation to avoid conflicts on shared nodes.

pixi run prefect-start   # Start in background (PostgreSQL + Prefect)
pixi run prefect-stop    # Stop the server

The CLI automatically:

  • Initializes PostgreSQL on first run (stored in ~/.prefect-submitit/postgres/)
  • Picks UID-based ports: Prefect on even ports (4200--5798), PostgreSQL on odd ports (5433--7031)
  • Uses the node's FQDN so SLURM workers can reach it (falls back to IP if FQDN is unresolvable)
  • Writes a discovery file to ~/.prefect-submitit/server.json
  • Tunes connection pool sizes for high-concurrency SLURM workloads

Direct CLI

prefect-server start [--bg] [--sqlite] [--restart] [--port N] [--pg-port N]
prefect-server stop [-f]
prefect-server status
prefect-server init-db [--reset]

  • --sqlite uses SQLite instead of PostgreSQL
  • --restart stops any existing server before starting
  • start is idempotent -- skips if the server is already healthy

Server discovery

Workers resolve the Prefect API URL in this order: PREFECT_SUBMITIT_SERVER env var → PREFECT_API_URL env var → discovery file (auto-written by prefect-server start).
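
A rough sketch of that resolution order in Python (the discovery-file field name is assumed here, not taken from the library):

import json, os
from pathlib import Path

def resolve_api_url():
    # 1) explicit override, 2) standard Prefect variable, 3) discovery file
    url = os.environ.get("PREFECT_SUBMITIT_SERVER") or os.environ.get("PREFECT_API_URL")
    if url:
        return url
    discovery = Path.home() / ".prefect-submitit" / "server.json"
    if discovery.exists():
        return json.loads(discovery.read_text()).get("api_url")  # field name assumed
    return None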

Development

Requires pixi:

pixi install
pixi run -e dev fmt       # Format and lint
pixi run -e dev test      # Run unit tests

Docker SLURM environment

A containerized single-node SLURM cluster is included for development and integration testing without access to a real cluster:

pixi run slurm-build     # Build the Docker image
pixi run slurm-up        # Start the SLURM container
pixi run slurm-shell     # Shell into the running container
pixi run slurm-down      # Stop and remove the container

See docker/README.md for details.

Integration tests

Integration tests submit real SLURM jobs and are gated behind --run-slurm. Tests are split by submission mode:

# sbatch tests (standard SLURM submission)
pixi run -e dev test-sbatch          # On a real cluster
pixi run -e dev test-sbatch-docker   # In the Docker environment

# srun tests (within an allocation)
pixi run -e dev test-srun            # On a real cluster (wraps in salloc)
pixi run -e dev test-srun-docker     # In the Docker environment (wraps in salloc)

Tests cover single submission, job arrays, batched execution, cancellation, failure handling, polling, and environment propagation.

IDE setup (VS Code)

Python interpreter: Set the pixi environment as your VS Code Python interpreter:

pixi run which python
# Example: /home/user/prefect-submitit/.pixi/envs/default/bin/python

In VS Code: Ctrl+Shift+P → "Python: Select Interpreter" → paste the path.

Jupyter kernel: Register the pixi environment as a Jupyter kernel so notebooks use the correct packages:

pixi run install-kernel

In VS Code: open a .ipynb file → click "Select Kernel" → choose Prefect-Submitit.

License

BSD 3-Clause. See LICENSE for details.
