From 127dc89973576d0d7a51e06aa5140d342d2ee057 Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Fri, 22 May 2026 23:47:31 +0000 Subject: [PATCH 01/66] docs: rewrite into 8 task-oriented pages with auto-generated reference MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Replaces 22 small/uneven docs files with 8 longer pages modeled on the verifiers docs: each page opens with a TOC, content is grouped by task (configure, train, scale, …) instead of by feature, and a single auto-generated reference page covers every config field. New pages: overview, configuration, training, scaling, algorithms, advanced, faqs, reference. reference.md is generated by scripts/generate_docs_reference.py from the Pydantic config models; regenerate with `uv run python scripts/generate_docs_reference.py`. Co-Authored-By: Claude Opus 4.7 (1M context) --- README.md | 21 +- configs/debug/training_modes/README.md | 2 +- docs/advanced.md | 345 +++ docs/algorithms.md | 256 +++ docs/async.md | 39 - docs/benchmarking.md | 79 - docs/bring-your-own-algorithms.md | 142 -- docs/checkpointing.md | 57 - docs/configs.md | 33 - docs/configuration.md | 189 ++ docs/deployment.md | 299 --- docs/disaggregated-inference.md | 91 - docs/entrypoints.md | 67 - docs/environments.md | 32 - docs/faqs.md | 265 +++ docs/index.md | 16 - docs/kubernetes.md | 308 --- docs/logging.md | 86 - docs/memory_usage.md | 132 -- docs/metrics.md | 55 - docs/mint.json | 22 +- docs/multi_run_manager.md | 244 --- docs/multimodal.md | 60 - docs/overview.md | 74 + docs/platform-monitoring.md | 48 - docs/reference.md | 2703 ++++++++++++++++++++++++ docs/scaling.md | 581 +++++ docs/slurm.md | 298 --- docs/testing-moe-at-small-scale.md | 113 - docs/training.md | 363 ++++ docs/training_modes.md | 39 - docs/trajectories.md | 96 - docs/troubleshooting.md | 21 - scripts/generate_docs_reference.py | 364 ++++ 34 files changed, 5158 insertions(+), 2382 deletions(-) create mode 100644 docs/advanced.md 
create mode 100644 docs/algorithms.md delete mode 100644 docs/async.md delete mode 100644 docs/benchmarking.md delete mode 100644 docs/bring-your-own-algorithms.md delete mode 100644 docs/checkpointing.md delete mode 100644 docs/configs.md create mode 100644 docs/configuration.md delete mode 100644 docs/deployment.md delete mode 100644 docs/disaggregated-inference.md delete mode 100644 docs/entrypoints.md delete mode 100644 docs/environments.md create mode 100644 docs/faqs.md delete mode 100644 docs/index.md delete mode 100644 docs/kubernetes.md delete mode 100644 docs/logging.md delete mode 100644 docs/memory_usage.md delete mode 100644 docs/metrics.md delete mode 100644 docs/multi_run_manager.md delete mode 100644 docs/multimodal.md create mode 100644 docs/overview.md delete mode 100644 docs/platform-monitoring.md create mode 100644 docs/reference.md create mode 100644 docs/scaling.md delete mode 100644 docs/slurm.md delete mode 100644 docs/testing-moe-at-small-scale.md create mode 100644 docs/training.md delete mode 100644 docs/training_modes.md delete mode 100644 docs/trajectories.md delete mode 100644 docs/troubleshooting.md create mode 100644 scripts/generate_docs_reference.py diff --git a/README.md b/README.md index 5b7654868d..6a092f7dc4 100644 --- a/README.md +++ b/README.md @@ -52,7 +52,7 @@ With `[model] impl = "auto"` (the default), the trainer selects that custom stac | GLM-5 (`glm_moe_dsa`) | `zai-org/GLM-5`, `zai-org/GLM-5-FP8` | yes | ✅ | ✅ | | Qwen3 MoE (`qwen3_moe`) | `Qwen/Qwen3-30B-A3B`, … | yes | ✅ | ✅ | | Qwen3.5 MoE (`qwen3_5_moe`) | `Qwen/Qwen3.5-35B-A3B`, … | yes | ✅ | ✅ | -| Qwen3 / Qwen3.5 VLMs | [multimodal.md](docs/multimodal.md) (`qwen3_vl`, `qwen3_5`, `qwen3_5_moe`) | MoE only on MoE VLMs | MoE only | ✅ | +| Qwen3 / Qwen3.5 VLMs | see [advanced.md](docs/advanced.md#vision-language-models) (`qwen3_vl`, `qwen3_5`, `qwen3_5_moe`) | MoE only on MoE VLMs | MoE only | ✅ | | Poolside Laguna (`laguna`) | `poolside/Laguna-XS.2` | yes | ✅ | ✅ | 
| MiniMax M2 (`minimax_m2`) | `MiniMax/MiniMax-M2` | yes | ✅ | ✅ | | Nemotron H (`nemotron_h`) | `nvidia/Nemotron-3-Nano-30B-A3B`, `nvidia/Nemotron-3-Super-120B-A12B`, … | yes | ✅ | ❌ | @@ -217,17 +217,14 @@ These guides are designed to be run from a Slurm cluster but can also be adapted Check out the [docs](docs) directory for in-depth guides on how to use PRIME-RL. -- [**Entrypoints**](docs/entrypoints.md) - Overview of the main components (orchestrator, trainer, inference) and how to run SFT, RL, and evals -- [**Configs**](docs/configs.md) - Configuration system using TOML files, CLI arguments, and environment variables -- [**Environments**](docs/environments.md) - Installing and using verifiers environments from the Environments Hub -- [**Async Training**](docs/async.md) - Understanding asynchronous off-policy training and step semantics -- [**Logging**](docs/logging.md) - Logging with loguru, torchrun, and Weights & Biases -- [**Checkpointing**](docs/checkpointing.md) - Saving and resuming training from checkpoints -- [**Benchmarking**](docs/benchmarking.md) - Performance benchmarking and throughput measurement -- [**Deployment**](docs/deployment.md) - Training deployment on single-GPU, multi-GPU, and multi-node clusters -- [**Memory Usage**](docs/memory_usage.md) - Techniques for reducing memory usage (activation checkpointing, offloading, EP, CP, LoRA, etc.) 
-- [**Troubleshooting**](docs/troubleshooting.md) - Common issues and their solutions -- [**Multimodal**](docs/multimodal.md) - Training VLMs like Qwen3-VL +- [**Overview**](docs/overview.md) - Architecture, install, and a copy-pasteable end-to-end RL run +- [**Configuration**](docs/configuration.md) - TOML composition, CLI overrides, env vars, validation +- [**Training**](docs/training.md) - RL, SFT, evals, checkpointing, observability, rules of thumb +- [**Scaling**](docs/scaling.md) - Single-GPU through multi-node, FSDP/EP/CP, SLURM, Kubernetes, disaggregated inference, benchmarking +- [**Algorithms**](docs/algorithms.md) - Async/off-policy training, the AIPO loss, advantage and filter plugins, trajectory merging +- [**Advanced**](docs/advanced.md) - MoE, VLMs, LoRA, multi-run manager, environments, small-scale MoE testing +- [**FAQs**](docs/faqs.md) - Frequently-asked questions +- [**Reference**](docs/reference.md) - Auto-generated field-by-field reference for every entrypoint config ## Contributing diff --git a/configs/debug/training_modes/README.md b/configs/debug/training_modes/README.md index 67c5450947..96ccebb009 100644 --- a/configs/debug/training_modes/README.md +++ b/configs/debug/training_modes/README.md @@ -44,4 +44,4 @@ uv run rl @ configs/debug/training_modes/sft_lora.toml uv run rl @ configs/debug/training_modes/sft_external.toml ``` -See [docs/training_modes.md](../../docs/training_modes.md) for what each mode does. +See [docs/training.md](../../docs/training.md#training-modes-rl--opd--sft-via-orchestrator) for what each mode does. 
diff --git a/docs/advanced.md b/docs/advanced.md new file mode 100644 index 0000000000..a6ef4e83db --- /dev/null +++ b/docs/advanced.md @@ -0,0 +1,345 @@ +# Advanced + +This page covers the specialized features layered on top of the core training stack: MoE training and our custom model implementations, vision-language models, LoRA and the multi-run manager, environments installation and authoring, and the small-scale MoE testing workflow used during architecture work. + +## Table of Contents + +- [MoE models](#moe-models) + - [Custom vs HF implementations](#custom-vs-hf-implementations) + - [Expert parallelism backends](#expert-parallelism-backends) +- [Vision-language models](#vision-language-models) + - [Supported families](#supported-families) + - [Enabling VLM mode](#enabling-vlm-mode) + - [Limitations](#limitations) + - [Multi-turn VLM training](#multi-turn-vlm-training) +- [LoRA](#lora) +- [Multi-run manager](#multi-run-manager) + - [Run discovery](#run-discovery) + - [Eviction](#eviction) + - [Hooks](#hooks) +- [Environments](#environments) + - [Installing from the Hub](#installing-from-the-hub) + - [Authoring locally](#authoring-locally) + - [Multi-env training](#multi-env-training) +- [Testing MoE at small scale](#testing-moe-at-small-scale) + +## MoE models + +### Custom vs HF implementations + +`prime-rl` ships custom optimized model implementations for several MoE families. With `model.impl = "auto"` (default) the trainer picks the custom path when the HF config type is registered, falling back to plain HF otherwise. 
To force one: + +```toml +[trainer.model] +impl = "custom" # or "hf" to force the HF path +``` + +| Family | Example checkpoints | EP | CP | +|---|---|---|---| +| GLM-5 (`glm_moe_dsa`) | `zai-org/GLM-5`, `zai-org/GLM-5-FP8` | ✅ | ✅ | +| Qwen3 MoE | `Qwen/Qwen3-30B-A3B`, … | ✅ | ✅ | +| Qwen3.5 MoE | `Qwen/Qwen3.5-35B-A3B`, … | ✅ | ✅ | +| Qwen3 / Qwen3.5 VLMs | see [Multimodal](#vision-language-models) | MoE only | ✅ | +| Poolside Laguna | `poolside/Laguna-XS.2` | ✅ | ✅ | +| MiniMax M2 | `MiniMax/MiniMax-M2` | ✅ | ✅ | +| Nemotron H | `nvidia/Nemotron-3-Nano-30B-A3B`, … | ✅ | ❌ | +| Trinity (AFMoE) | `arcee-ai/Trinity-Mini`, … | ✅ | ✅ | +| GLM-4 / GLM-4.5 / INTELLECT-3 | `THUDM/GLM-4-9B-0414`, `zai-org/GLM-4.5`, `PrimeIntellect/INTELLECT-3`, … | ✅ | ✅ | +| GPT-OSS (HF MoE) | `openai/gpt-oss-20b`, `openai/gpt-oss-120b` | ❌ | ✅ | + +The custom path enables EP, selective activation checkpointing, FP8 training (`model.fp8 = true`, requires SM90+), and faster MoE kernels (`moe_use_grouped_mm = true`, default). Forcing `impl = "hf"` is mostly useful when debugging: it is slower and disables most MoE-specific knobs. + +### Expert parallelism backends + +`model.ep_comm_backend` picks the all-to-all kernel used for EP dispatch/combine: + +- **`torch`** (default): TorchTitan's all-to-all collective. Works everywhere, no extra install. +- **`deepep`**: Custom kernels from DeepEP. Faster, but requires a DeepEP build (`bash scripts/install_deep_gemm.sh`, `bash scripts/install_ep_kernels.sh`) and tuning of `deepep_num_sms` (default 20) and `deepep_token_chunk_size` for your hardware. + +DeepEP intranode dispatch derives the RDMA channel count as `deepep_num_sms / 2`. A lower SM count leaves more SMs for compute; a higher one speeds up dispatch. Useful starting points: 16–24 SMs on H100, 20–40 on B200. + +When you enable DeepEP, gradient clipping is auto-disabled (`optim.max_norm` set to `None`) because the kernels don't currently support it. 
This is a tradeoff — watch `grad_norm` in the trainer logs to make sure nothing diverges. + +## Vision-language models + +### Supported families + +The built-in VLM registry covers: + +| Family | `model_type` | Vision attr | LM attr | +|---|---|---|---| +| Qwen3-VL | `qwen3_vl` | `model.visual` | `model.language_model` | +| Qwen3.5 | `qwen3_5` | `model.visual` | `model.language_model` | +| Qwen3.5-MoE | `qwen3_5_moe` | `model.visual` | `model.language_model` | + +For a model not in the table, look up the attribute paths on the loaded HF model with `model.named_children()`. + +### Enabling VLM mode + +Add `[model.vlm]` and bfloat16 dtypes: + +```toml +[model] +name = "Qwen/Qwen3-VL-4B-Instruct" +optimization_dtype = "bfloat16" +reduce_dtype = "bfloat16" + +[model.vlm] +vision_encoder_attr = "model.visual" +language_model_attr = "model.language_model" +# freeze_vision_encoder = true # default; set false to fine-tune the encoder +``` + +A bad attribute path errors immediately — no silent fallbacks. The weight-broadcast key prefix is derived as `{language_model_attr}.layers.` automatically. + +To add a new model family permanently, append an entry to `VLM_REGISTRY` in `src/prime_rl/utils/vlm.py`. + +### Limitations + +- **Vision encoder frozen by default.** Set `freeze_vision_encoder = false` to fine-tune it; in that case it's FSDP-sharded per block. Has no effect under LoRA (LoRA freezes everything non-adapter regardless). +- **No multimodal-safe truncation.** Token sequences are truncated to `seq_len`, but `pixel_values` and `image_grid_thw` pass through unchanged. If a sample's tokens overflow, image tokens may get dropped while image tensors still describe the full image set. Set `seq_len` to cover your longest sample. +- **bfloat16 mandatory.** The trainer config validator refuses any other `optimization_dtype` / `reduce_dtype` for VLMs — vLLM serves VLMs in bfloat16 and a mismatch breaks the importance ratio. 
+- **Higher KL mismatch with multi-image inputs.** Expect noisier `mismatch_kl` than text-only; this is from minor numerical differences between the trainer's and vLLM's image processing. +- **Images aren't logged to monitors.** Sample logging captures the prompt text but not the actual images. + +### Multi-turn VLM training + +VLM rollouts go through the renderer-backed TITO client (`orchestrator.use_renderer = true`, required for VLMs). Per trajectory step: + +1. **Render** — the renderer tokenizes messages and emits per-image multimodal tensors (`pixel_values`, `image_grid_thw` for Qwen3-VL) as `multi_modal_data`. +2. **Pack** — `interleave_rollout` concatenates per-image tensors across a sample's merged step range into a single `mm_kwargs` dict on the `TrainingSample`. Per-token `mm_token_type_ids` (0=text, 1=image, 2=video) come from `renderer.mm_token_type_id_map`. +3. **Forward** — the trainer `**`-unpacks `mm_kwargs` into the model's `forward`. Any VLM whose HF processor and forward signature agree on kwarg names works without modifying the transport. + +Each multimodal sample becomes its own micro-batch (no packing) because image tensor sizes vary. + +`VLLM_WORKER_MULTIPROC_METHOD=spawn` is required for VLM inference — set automatically by `uv run rl`, but if you launch `uv run inference` separately for a VLM, export it yourself. + +## LoRA + +LoRA is enabled by adding `[model.lora]`: + +```toml +[model.lora] +rank = 16 +alpha = 32 +dropout = 0.0 +``` + +`target_modules` defaults to a reasonable cross-family set (`q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`, `experts`, plus a few latent-projection names for Nemotron). Unknown names are silently ignored, so the defaults work across architectures. Add architecture-specific names to extend coverage (e.g. `in_proj` / `out_proj` for Mamba). + +LoRA is supported across SFT and RL. 
For RL, `weight_broadcast.type = "nccl"` is **not** supported with LoRA — use the default filesystem transport. To save the raw adapter alongside the merged HF weights: + +```toml +[ckpt.weights] +save_adapter_separately = true +``` + +LoRA pairs naturally with the multi-run manager — each run gets its own adapter, and many runs share the same backbone in trainer memory. + +## Multi-run manager + +`MultiRunManager` is a trainer-side singleton that lets one trainer process serve multiple concurrent orchestrator deployments, each with its own LoRA adapter, optimizer, scheduler, checkpoints, and progress tracking. Enable by setting `trainer.max_concurrent_runs > 1`. + +Per-run layout under `/`: + +``` +run_abc123/ +├── control/ +│ ├── orch.toml # orchestrator config for this run +│ ├── config_validation_error.txt # populated if validation failed +│ └── evicted.txt # populated if the run was evicted +├── checkpoints/ +│ └── step_/ # orchestrator checkpoints +├── rollouts/ +│ └── step_/ # rollouts +└── broadcast/ + └── step_/ # weight snapshots for inference +``` + +### Run discovery + +Runs are added by dropping a `run_*` directory into `` with a valid `control/orch.toml`. The trainer scans periodically: + +```python +multi_run_manager.discover_runs() # master rank only +multi_run_manager.synchronize_state() # all ranks +``` + +- `discover_runs()` (master): scans, filters evicted runs, detects new/deleted, validates configs, fires `discovered_hook` / `forgotten_hook`. +- `synchronize_state()` (all ranks): master broadcasts run state over the distributed store; all ranks run `deletion_hook` then `creation_hook` so DTensor allocations and other collective ops happen in lock-step. + +Once `max_concurrent_runs` is reached, new `run_*` directories are ignored until existing runs are evicted or deleted. 
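The discovery rules described above (a `run_*` directory with a `control/orch.toml`, evicted runs filtered out, nothing admitted once `max_concurrent_runs` is reached) can be illustrated with a small filesystem-only sketch. This is not the `MultiRunManager` implementation: `discover_new_runs` and its parameters are hypothetical names, and the sketch only checks that `orch.toml` exists rather than validating it.

```python
from pathlib import Path


def discover_new_runs(output_dir: Path, known: set[str], max_concurrent_runs: int) -> list[str]:
    """Hypothetical helper: names of run_* directories that could be admitted.

    Mirrors the documented rules only; the real trainer also validates each
    orchestrator config and fires hooks, which this sketch does not do.
    """
    free_slots = max_concurrent_runs - len(known)
    admitted: list[str] = []
    for run_dir in sorted(output_dir.glob("run_*")):
        if free_slots <= 0:
            break  # capacity reached: remaining run_* directories are ignored
        if run_dir.name in known:
            continue  # already tracked
        control = run_dir / "control"
        if not (control / "orch.toml").is_file():
            continue  # no orchestrator config yet
        if (control / "evicted.txt").exists():
            continue  # evicted runs are filtered out
        admitted.append(run_dir.name)
        free_slots -= 1
    return admitted
```

Calling it repeatedly with the set of already-tracked run names approximates one discovery scan per call; broadcasting the result to the other ranks is out of scope here.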
+ +### Eviction + +The master can evict a run with `evict_run(idx, reason)`: + +```python +multi_run_manager.evict_run(idx=0, reason="exceeded memory limits") +``` + +The eviction writes `/control/evicted.txt`. Effect: + +- **Trainer side**: the next `discover_runs()` call treats the run as deleted, the hooks fire, and the index returns to the unused pool. +- **Orchestrator side**: checks for `evicted.txt` at the top of each iteration. If found, it raises a `RuntimeError` with the reason. The orchestrator also self-evicts after `MAX_EMPTY_BATCH_ATTEMPTS` (3) consecutive empty-batch failures, so a run with degenerate rewards doesn't occupy a slot forever. + +### Hooks + +Five hook types fire at well-defined points: + +| Hook | Where | When | +|---|---|---| +| `discovered_hook` | master | new run detected and config validated | +| `forgotten_hook` | master | run deleted from the output dir | +| `config_validation_hook` | master | validate the orchestrator config when a new run is discovered | +| `creation_hook` | all ranks | after `synchronize_state` for a newly created run (use for optimizer/scheduler init, LoRA param reset) | +| `deletion_hook` | all ranks | after `synchronize_state` for a deleted run (use for releasing per-run resources) | + +Deletion hooks always run before creation hooks. The creation/deletion hooks run on **all** ranks, so they're the right place for DTensor allocation and other collective work; `torch.distributed.barrier()` is safe inside. + +For the full API surface, see [`src/prime_rl/trainer/runs/`](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/src/prime_rl/trainer/runs). The primary use case today is the LoRA-per-run training topology: many lightweight RL runs (e.g. one per environment) sharing a single trainer process group. + +## Environments + +`prime-rl` trains in any [`verifiers`](https://github.com/PrimeIntellect-ai/verifiers) environment. 
The orchestrator hosts each declared environment as either a local subprocess (`vf.EnvServer` sidecar — default) or a standalone process you launched elsewhere. + +### Installing from the Hub + +Explore what's available: + +```bash +prime env info / +``` + +Install: + +```bash +prime env install / +# or pin a version +prime env install /@1.2.3 +``` + +Verify the import works: + +```bash +uv run python -c "import " +``` + +Then reference it in your config by ID: + +```toml +[[orchestrator.train.env]] +id = "primeintellect/math-env" +name = "gsm8k" +args = { dataset_name = "openai/gsm8k", dataset_subset = "main" } +``` + +### Authoring locally + +For local dev or pre-Hub work, install an environment in editable mode: + +```bash +uv pip install -e path/to/my-env +``` + +The env exposes a `load_environment(**kwargs)` returning a `vf.Environment` (or v1 `vf.Env`). The `args` field in the orchestrator config is forwarded verbatim as `**kwargs`. See the [verifiers docs](https://docs.primeintellect.ai/verifiers) for environment authoring. + +To run an env in an isolated process (e.g. inside a container, with its own conda environment), launch the env server separately and pass its address: + +```toml +[[orchestrator.train.env]] +id = "my-env" +address = "tcp://10.0.0.5:5000" +``` + +When `address` is set, the orchestrator connects to that ZMQ server rather than spawning a subprocess. + +### Multi-env training + +You can train on a mixture of environments by listing several `[[orchestrator.train.env]]` tables. Set `ratio` on each to weight sampling; omit `ratio` on all of them to sample uniformly across problems (not envs). + +```toml +[[orchestrator.train.env]] +id = "math-env" +name = "gsm8k" +args = { dataset_name = "openai/gsm8k", dataset_subset = "main" } +ratio = 3 + +[[orchestrator.train.env]] +id = "reverse-text" +ratio = 1 +``` + +This batches roughly 75% from `gsm8k` and 25% from `reverse-text`. 
The same applies to `[[orchestrator.eval.env]]` for evaluation mixtures. + +## Testing MoE at small scale + +When working on MoE architectures (GLM-4, Kimi, etc.), you can't iterate on a 100B+ model locally. The workflow below builds a ~0.5B model with the same architecture, warms it up with SFT, and runs RL — all on 1–2 GPUs. The goal is catching bugs in modeling code, state-dict conversions, and pipeline integration before scaling. + +### Step 1: build and verify a mini model + +```bash +uv run python scripts/mini_moe.py --arch glm4_moe --output-dir ./mini-glm-moe +``` + +This creates a ~543M parameter GLM-4 MoE (1024 hidden, 24 layers, 8 experts) with random weights, copies the tokenizer from the original GLM-4 model, and verifies the HF↔PrimeRL roundtrip is lossless. To re-verify after a modeling-code change without re-creating the model: + +```bash +uv run python scripts/mini_moe.py --arch glm4_moe --output-dir ./mini-glm-moe --verify-only +``` + +### Step 2: SFT warmup + +Use the shipped debug MoE SFT config with reverse-text data: + +```bash +uv run sft @ configs/debug/moe/sft/train.toml \ + --model.name ./mini-glm-moe \ + --data.name PrimeIntellect/Reverse-Text-SFT \ + --data.type null \ + --max_steps 200 \ + --optim.lr 1e-4 \ + --ckpt.weights +``` + +Loss drops from ~12 to ~2.5. The output won't be coherent, but the model now has a non-trivial distribution so KL divergence becomes meaningful in RL. A pre-built SFT'd checkpoint lives at [samsja/mini-glm-moe](https://huggingface.co/samsja/mini-glm-moe). + +### Step 3: RL on reverse-text + +```bash +uv run rl @ configs/ci/integration/rl/start.toml \ + --model.name samsja/mini-glm-moe \ + --trainer.model.impl custom \ + --inference.gpu-memory-utilization 0.7 \ + --inference.model.max-model-len 2048 +``` + +What to look for: + +- **No crashes.** Validates the full inference + orchestrator + trainer pipeline end-to-end. +- **Finite, non-zero KL.** Confirms the reference distribution is meaningful. 
+- **Loss reasonable.** Not NaN, not stuck. + +Don't expect reward to climb meaningfully in 20 steps on a random model. + +### Adding a new architecture + +To add (e.g.) Kimi 2.5: + +1. Add the modeling code under `src/prime_rl/trainer/models//`. +2. Add a preset to `scripts/mini_moe.py` with the config class, small dimensions, HF + PrimeRL model classes, and tokenizer source: + +```python +ARCH_PRESETS = { + "glm4_moe": { + "config_class": Glm4MoeConfig, + "config_kwargs": dict(hidden_size=1024, num_hidden_layers=24, n_routed_experts=8, ...), + "hf_model_class": HFGlm4MoeForCausalLM, + "prime_model_class": PrimeRLGlm4MoeForCausalLM, + "tokenizer_source": "THUDM/GLM-4-9B-0414", + }, + # add your arch here +} +``` + +3. Run the three steps above with `--arch `. diff --git a/docs/algorithms.md b/docs/algorithms.md new file mode 100644 index 0000000000..4011871229 --- /dev/null +++ b/docs/algorithms.md @@ -0,0 +1,256 @@ +# Algorithms + +This page covers the math and the configurable algorithmic components: how off-policy training works, the default loss and advantage functions, how to plug in your own, the filters applied between rollout and training, and how multi-turn rollouts get merged into training samples. 
+ +## Table of Contents + +- [Async / off-policy training](#async--off-policy-training) + - [Step semantics](#step-semantics) + - [The AIPO loss](#the-aipo-loss) + - [Tuning `max_async_level`](#tuning-max_async_level) +- [Loss](#loss) + - [Default loss](#default-loss) + - [Custom loss](#custom-loss) +- [Advantage](#advantage) + - [Default advantage](#default-advantage) + - [Custom advantage](#custom-advantage) + - [Length penalties](#length-penalties) +- [Filters](#filters) +- [Multi-turn trajectories](#multi-turn-trajectories) + - [Extension property](#extension-property) + - [Best-effort interleaving](#best-effort-interleaving) + - [Discontinuous trajectories](#discontinuous-trajectories) + +## Async / off-policy training + +`prime-rl` is asynchronous by default. Inference is allowed to generate rollouts using a stale policy that is up to `k` steps behind the trainer, where `k = max_async_level`. Setting `k = 1` with matched trainer and inference step times produces fully overlapped pipeline parallelism, so neither side ever idles. The default is `k = 2`, chosen to absorb the latency of a cross-WAN weight broadcast for decentralized runs. + +![Two-Step Off-Policy Training](assets/two-step-off-policy.png) + +### Step semantics + +At step $n = 1, 2, 3, \dots$: + +- **Trainer** produces policy $\pi_n$ with weights $\theta_n$ from rollouts $(x_n, y_n)$. +- **Inference** produces rollouts $(x_n, y_n)$ from policy $\pi_{\max(0,\,n - k)}$. + +So at step $n$ the gap between the policy being trained and the policy that generated the data is at most $k$ steps. The policy index is clamped at 0, so during the first $k$ steps inference uses the initial policy $\pi_0$ and the bound also holds at startup. + +### The AIPO loss + +The default loss is a token-level variant of [AIPO](https://arxiv.org/abs/2505.24034), without the entropy and KL terms used in the original paper. 
For each of the $N$ prompts $x_j$ in a step we sample a group of $G$ rollouts $\{y_i^{(j)}\}_{i=1}^G$, score them with the rubric to get $s_i$, then optimize: + +$$ +\mathcal{J}_{\text{AIPO}}(\theta) += \frac{1}{\sum_{j=1}^N \sum_{i=1}^G |y_i^{(j)}|} +\sum_{j=1}^N \sum_{i=1}^G \sum_{t=1}^{|y_i^{(j)}|} +\min\!\left(\frac{\pi(y_{i,t}^{(j)}\mid x_j, y_{i,<t}^{(j)})}{\mu(y_{i,t}^{(j)}\mid x_j, y_{i,<t}^{(j)})},\ \delta\right)\hat{A}_{i,t}^{(j)} +$$ + +where $\pi$ is the current policy, $\mu$ is the policy that generated the rollout, $\hat{A}_{i,t}^{(j)}$ is the token-level advantage, and $\delta$ is the clipping threshold on the importance ratio. + +## Loss + +### Custom loss + +```python +import torch +from prime_rl.trainer.rl.loss import LossInputs, LossOutputs + +def ppo_clip_loss(inputs: LossInputs, clip_eps: float = 0.2) -> LossOutputs: + ratio = torch.exp(inputs.trainer_logprobs - inputs.inference_logprobs) + clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) + surr1 = ratio * inputs.advantages + surr2 = clipped * inputs.advantages + loss = -torch.min(surr1, surr2)[inputs.loss_mask].sum() + return LossOutputs( + loss=loss, + metrics={ + "clip_frac": (ratio != clipped)[inputs.loss_mask].float().mean(), + }, + ) +``` + +Wire it up: + +```toml +[trainer.loss] +type = "custom" +import_path = "my_module.ppo_clip_loss" +kwargs = { clip_eps = 0.2 } +``` + +The dataclasses: + +```python +@dataclass +class LossInputs: + trainer_logprobs: Float[Tensor, "seq"] # current policy + inference_logprobs: Float[Tensor, "seq"] # rollout-time policy + teacher_logprobs: Float[Tensor, "seq"] | None # only set in OPD mode + advantages: Float[Tensor, "seq"] + loss_mask: Bool[Tensor, "seq"] + +@dataclass +class LossOutputs: + loss: Float[Tensor, ""] + metrics: dict[str, Tensor] +``` + +Anything you put in `metrics` is averaged across sequences and logged with the other trainer metrics. + +## Advantage + +### Default advantage + +The default advantage is per-group reward minus per-group baseline (DR-GRPO without std normalization). For each prompt's group of `rollouts_per_example` rollouts, every token in rollout $i$ receives advantage $s_i - \bar{s}$ where $\bar{s}$ is the group mean. + +This is intentionally simple and does the right thing for most envs. Switch to a custom function when you need group-aware shaping (e.g. length penalties tied to turn count, sub-agent rollouts, or relative-rank shaping). + +### Custom advantage + +Advantages are computed **per group**. 
You write a function that takes one group of rollouts and returns one advantage scalar per rollout. The orchestrator handles groups of varying size automatically: partial-group training kicks in when some rollouts in a group errored. + +```python +# my_module.py +import statistics +from prime_rl.orchestrator.advantage import AdvantageInputs, AdvantageOutputs + +def normalized_advantage(inputs: AdvantageInputs, eps: float = 1e-8) -> AdvantageOutputs: + rewards = [r["reward"] for r in inputs.rollouts] + mean = statistics.fmean(rewards) + std = statistics.pstdev(rewards) if len(rewards) > 1 else 0.0 + return AdvantageOutputs(advantages=[(r - mean) / (std + eps) for r in rewards]) +``` + +```toml +[orchestrator.advantage] +type = "custom" +import_path = "my_module.normalized_advantage" +kwargs = { eps = 1e-8 } +``` + +`AdvantageInputs.rollouts` is a list of `verifiers.RolloutOutput`, so you have access to the full rollout (turns, tool calls, custom metadata), not just the reward. Use this for any reward shaping that needs trajectory context. + +### Length penalties + +Two built-in length penalties can be layered on top of any advantage: + +- `[orchestrator.length_penalty] type = "tokens"`: penalizes long completions in tokens, with configurable target and slope. +- `[orchestrator.length_penalty] type = "turns"`: penalizes long multi-turn rollouts by turn count. + +See [Reference § orchestrator length penalties](reference.md#orchestrator) for the fields. + +## Filters + +Filters drop rollouts between scoring and training. Built-ins (composable): + +| Filter | Effect | +|---|---| +| `gibberish` | Drops rollouts whose mean log-prob falls below a threshold, usually a sign of degenerate output. | +| `repetition` | Drops rollouts with high n-gram repetition. | +| `zero_advantage` | Drops rollouts whose advantage is zero, so the trainer doesn't waste tokens on them. 
| + +Enable via `[[orchestrator.filter]]`: + +```toml +[[orchestrator.filter]] +type = "zero_advantage" + +[[orchestrator.filter]] +type = "repetition" +threshold = 0.4 +``` + +Filtered rollouts still appear in W&B distributions, just not in the trainer batch — useful for spotting whether filtering is doing its job. + +## Multi-turn trajectories + +Multi-turn rollouts (tool use, browser environments, long conversations) used to be stitched into a single fake "single-turn" sample, which silently corrupted the importance ratio when chat templates didn't roundtrip. Since [verifiers v0.1.8](https://github.com/PrimeIntellect-ai/verifiers/releases/tag/v0.1.8), `prime-rl` records each LLM request/response as an independent **trajectory step** and merges them at training time using best-effort interleaving. + +### Extension property + +A sequence of trajectory steps has the **extension property** when each successive step's prompt contains all previous prompts and completions as an exact prefix. The trainer relies on this property — when it holds: + +- Multiple steps merge into one training sample. +- Compute scales as $O(T)$ in the trajectory length. + +When it breaks (chat template strips past thinking, environment compacts context, an agent hands off to a sub-agent, etc.), the trainer starts a new training sample from that step: + +- Graceful fallback to multiple samples — no corrupted data. +- Worst case (every step breaks extension) is $O(T^2)$. + +### Best-effort interleaving + +Concretely: + +``` +5-step trajectory where extension breaks at step 4: + +steps 1–3: extension holds → merged into Sample 1 +step 4: extension breaks (e.g. thinking stripped from history) +steps 4–5: extension holds → merged into Sample 2 + +result: 2 training samples instead of 5 +``` + +The orchestrator enforces an **exact prefix invariant**: the prompt at turn $t$ must be the exact concatenation of prior messages exactly as the LLM originally generated them. 
If turn 2's prompt is `U1, A1', U2` while `A1' ≠ A1`, the orchestrator can't safely merge: whichever version of the assistant message it keeps, the merged sample produces logprob drift between trainer and inference. Starting a fresh sample is the only correct behavior, so that's what happens. + +A common source of breakage is models like Qwen3 whose chat templates strip past `<think>` blocks across user turns: + +```python +from transformers import AutoTokenizer +tok = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B") +messages = [ + {"role": "user", "content": "U1"}, + {"role": "assistant", "content": "<think>R1</think>A1"}, + {"role": "user", "content": "U2"}, +] +tok.apply_chat_template(messages[:1], tokenize=False) +# <|im_start|>user +# U1<|im_end|> + +tok.apply_chat_template(messages, tokenize=False) +# <|im_start|>user\nU1<|im_end|>\n<|im_start|>assistant\nA1<|im_end|>\n<|im_start|>user\nU2<|im_end|> +# (the <think>R1</think> block from the first assistant turn is gone) +``` + +Workarounds: use a chat template that preserves thinking (we ship patched versions for many models, e.g. `PrimeIntellect/Qwen3-0.6B`), or enable `orchestrator.renderer.preserve_all_thinking = true` so the renderer re-emits past thinking blocks itself. + +### Discontinuous trajectories + +Some envs are discontinuous by design, e.g. a main agent delegating to a sub-agent and getting back only a summarized result, not the sub-agent's whole conversation. Best-effort interleaving handles this naturally: each agent's contiguous turns merge, and the handoff starts a new sample. The trainer never sees fabricated extension where there is none. + +For background on the design, see the verifiers [trajectories design note](https://github.com/PrimeIntellect-ai/verifiers/blob/main/notes/TRAJECTORIES.md). The `--trajectory-strategy branching` option is deprecated: best-effort interleaving covers all cases, falling back to separate samples (equivalent to old branching) when extension breaks. 
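The best-effort interleaving rule from this section (merge a step into the current sample while its prompt extends that sample as an exact prefix, otherwise start a new sample) can be sketched over plain token lists. This is a minimal illustration under stated assumptions, not the `prime-rl` implementation: `Step`, `Sample`, and `merge_steps` are hypothetical names, and the loss mask convention (train on completion tokens only) is assumed here.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One LLM request/response pair, as token ids (hypothetical container)."""
    prompt: list[int]
    completion: list[int]


@dataclass
class Sample:
    """A merged training sample (hypothetical container)."""
    tokens: list[int]
    loss_mask: list[bool]  # assumed convention: True on completion tokens only


def merge_steps(steps: list[Step]) -> list[Sample]:
    samples: list[Sample] = []
    for step in steps:
        current = samples[-1] if samples else None
        # Extension property: the new prompt must start with everything
        # already in the current sample (prior prompts and completions).
        if current is not None and step.prompt[: len(current.tokens)] == current.tokens:
            new_prompt = step.prompt[len(current.tokens):]
            current.tokens += new_prompt + step.completion
            current.loss_mask += [False] * len(new_prompt) + [True] * len(step.completion)
        else:
            # Extension broke (or this is the first step): start a fresh sample.
            samples.append(
                Sample(
                    tokens=step.prompt + step.completion,
                    loss_mask=[False] * len(step.prompt) + [True] * len(step.completion),
                )
            )
    return samples
```

With a 5-step trajectory whose history is rewritten at step 4, this yields two samples (steps 1 to 3 and steps 4 to 5), matching the worked example earlier in this section.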
diff --git a/docs/async.md b/docs/async.md deleted file mode 100644 index 3343c26dc6..0000000000 --- a/docs/async.md +++ /dev/null @@ -1,39 +0,0 @@ -# Asynchronous Training - -PRIME-RL implements asynchronous off-policy training, instead of the traditional synchronous on-policy training. This means that we allow inference to generate rollouts from a stale policy up to $k$ (in the code we call this `max_async_level`) steps ahead of the trainer. With `k=1` and trainer and inference step timings being equal, this allows to run without any idle time on either the trainer or inference. By default, we set `k=2` to allow overlap with a weight broadcast over the Internet, which is needed for decentralized training. - -![Two-Step Off-Policy Training](assets/two-step-off-policy.png) - -## Loss Objective - -We adopt a loss objective capable of handling the natural distribution shift caused by the off-policy nature of the training. By default, we use a token-level loss variant of the [AIPO](https://arxiv.org/abs/2505.24034) training objective introduced in Llama-RL, -but omit the entropy and KL loss terms. - -At each step, we sample $N$ prompts from our dataset. For -each prompt $x$, we sample a group of rollouts $\{y_i\}^G_{i=1}$ -and use a verifier to assign scores $s_i$ to each $y_i$. -Then, the optimization objective is given by - -$$ -\mathcal{J}_{\text{AIPO}}(\theta) -= \frac{1}{\sum_{j=1}^N \sum_{i=1}^G |y_i^{(j)}|} -\sum_{j=1}^N -\sum_{i=1}^G -\sum_{t=1}^{|y_i^{(j)}|} -\min\left( -\frac{\pi(y^{(j)}_{i,t}\mid x_j, y^{(j)}_{i, LossOutputs: - ... 
-``` - -#### LossInputs - -```python -@dataclass -class LossInputs: - trainer_logprobs: Float[Tensor, "seq"] # Log probs from current policy - inference_logprobs: Float[Tensor, "seq"] # Log probs from reference policy - teacher_logprobs: Float[Tensor, "seq"] | None # Optional teacher log probs - advantages: Float[Tensor, "seq"] # Per-token advantages - loss_mask: Bool[Tensor, "seq"] # Mask for valid tokens -``` - -#### LossOutputs - -```python -@dataclass -class LossOutputs: - loss: Float[Tensor, ""] # Scalar loss for this sequence - metrics: dict[str, Tensor] # Metrics to log -``` - -### Example: PPO Clipped Loss - -```python -import torch -from prime_rl.trainer.rl.loss import LossInputs, LossOutputs - -def ppo_clip_loss(inputs: LossInputs, clip_eps: float = 0.2) -> LossOutputs: - ratio = torch.exp(inputs.trainer_logprobs - inputs.inference_logprobs) - clipped_ratio = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) - - surr1 = ratio * inputs.advantages - surr2 = clipped_ratio * inputs.advantages - - loss = -torch.min(surr1, surr2)[inputs.loss_mask].sum() - - return LossOutputs( - loss=loss, - metrics={"clip_frac": (ratio != clipped_ratio)[inputs.loss_mask].float().mean()}, - ) -``` - -### Configuration - -```toml -[loss] -type = "custom" -import_path = "my_module.ppo_clip_loss" -kwargs = { clip_eps = 0.2 } -``` - ---- - -## 2. Custom Advantage Functions - -Advantages are computed **per-group** (one example × N rollouts). You provide a function that computes advantages for a single group; the framework calls it once per group and stitches the results back together. Groups may have fewer than `rollouts_per_example` rollouts when some rollouts in the group errored (partial-group training). - -### Interface - -```python -from prime_rl.orchestrator.advantage import AdvantageInputs, AdvantageOutputs - -def my_custom_advantage(inputs: AdvantageInputs, **kwargs) -> AdvantageOutputs: - ... 
-``` - -#### AdvantageInputs - -```python -@dataclass -class AdvantageInputs: - # All rollouts for a single example (one group). - rollouts: list[vf.RolloutOutput] -``` - -Each `vf.RolloutOutput` carries the full rollout (`reward`, `trajectory`, etc.), so custom advantages can read any metadata they need (e.g. completion-token counts, turn counts, tool calls). - -#### AdvantageOutputs - -```python -@dataclass -class AdvantageOutputs: - advantages: list[float] # one entry per rollout in the input group -``` - -### Example: Normalized Advantage - -```python -import statistics -from prime_rl.orchestrator.advantage import AdvantageInputs, AdvantageOutputs - -def normalized_advantage(inputs: AdvantageInputs, eps: float = 1e-8) -> AdvantageOutputs: - """Normalize advantages to zero mean and unit variance within the group.""" - rewards = [r["reward"] for r in inputs.rollouts] - mean = statistics.fmean(rewards) - std = statistics.pstdev(rewards) if len(rewards) > 1 else 0.0 - return AdvantageOutputs(advantages=[(r - mean) / (std + eps) for r in rewards]) -``` - -### Configuration - -```toml -[advantage] -type = "custom" -import_path = "my_module.normalized_advantage" -kwargs = { eps = 1e-8 } -``` - ---- - -## Default Implementations - -If no custom function is specified: - -- **Loss**: Uses `default_loss_fn` (masked importance sampling with KL against the inference policy, and optional masking strategies) -- **Advantage**: Uses `default_advantage_fn` (reward minus per-example baseline, a.k.a. DR-GRPO without std normalization) - -See `LossConfig` and `AdvantageConfig` for available parameters. 
- -## Tips - -- Your functions receive structured inputs via dataclasses with jaxtyping annotations -- Return metrics as scalars or 1D tensors - they'll be aggregated automatically -- Use the `loss_mask` / tensor shapes to handle variable-length sequences -- Test your custom functions with the provided test patterns before training diff --git a/docs/checkpointing.md b/docs/checkpointing.md deleted file mode 100644 index ce929a2f57..0000000000 --- a/docs/checkpointing.md +++ /dev/null @@ -1,57 +0,0 @@ -# Checkpointing - -Checkpointing is non-standard due to trainer/orchestrator separation and natural asynchrony. - -- SFT+RL Trainer: Checkpoints FSDP model shard (using DCP), optimizer and scheduler state, and progress (training step, total samples, total tokens) -- Orchestrator: Checkpoints orchestrator progress (training step, total tokens, total samples, total problems) -- Inference: Inference is stateless. Upon restart, the orchestrator will reload the correct weights into the inference engine. No checkpointing is required. - -The default checkpoint directory is `checkpoints` and each checkpoint step will live in a step subdirectory, i.e. `checkpoints/step_{step}`. - -Checkpointing is configured with the config key `--ckpt`. One can specify the interval (`--ckpt.interval`), whether to save checkpoints asynchronously (`--ckpt.save-async`), how many recent step checkpoints to keep on disk (`--ckpt.keep-last`), and keep checkpoints at every N steps permanently (`--ckpt.keep-interval`). By default, we do not checkpoint to save disk space. - -## SFT - -Let's split the reverse text training SFT example, which does 40 steps by default, into two runs of 20 steps each. - -First, run the first 20 steps and append `--ckpt` flag will enable the default checkpoint configuration which will only write the final checkpoint to disk, but no intermediate checkpoints. - -```bash -uv run sft ... 
--max-steps 20 --ckpt -``` - -Then, to resume the training from step 20, run the following command - -```bash -uv run sft ... --max-steps 40 --ckpt.resume-step 20 -``` - -## RL - -Similarly, let's split the reverse text training RL example, which does 20 steps by default, into two runs of 10 steps each. - -First, start the inference server. It can stay running across restarts as the orchestrator will automatically send the right checkpoint to the inference server when resuming. - -```bash -uv run inference ... -``` - -Then, run the first 20 steps and write the final checkpoint to disk - -```bash -uv run rl \ - --trainer @ path/to/train.toml \ - --orchestrator @ path/to/orch.toml \ - --max-steps 10 \ - --ckpt -``` - -And finally, resume the training to do the remaining 10 steps - -```bash -uv run rl \ - --trainer @ path/to/train.toml \ - --orchestrator @ path/to/orch.toml \ - --max-steps 20 \ - --ckpt.resume-step 10 -``` diff --git a/docs/configs.md b/docs/configs.md deleted file mode 100644 index ea384a7c5c..0000000000 --- a/docs/configs.md +++ /dev/null @@ -1,33 +0,0 @@ -# Configs - -We use `pydantic-settings` with some custom functionality for configuring runs. We support the following sources, in this order of precedence: - -1. **Command-line arguments**: Pass (nested) arguments as `--key.subkey value` to the script. For example, to set the model name, set `--model.name ` - -2. **Config files**: You can pass TOML config files using the `@` prefix. For example, to set a config, run `uv run inference @ path/to/config.toml`. (*You have to leave a space between the `@` and the config file*) - -3. **Environment variables**: You can set environment variables to override the config values. All environment variables must be prefixed with `PRIME_` and use the `__` delimiter to nest the keys. For example, to set the model name you can run `export PRIME_MODEL__NAME=Qwen/Qwen3-0.6B`. - -4. 
**Defaults**: For almost all config arguments, we have a default value which will be used if no other source is provided. - -In general we recommend setting configurations via config files to define reproducible experiments and use command-line arguments to override the config values to run variants of the same experiment. Environment variables are usually only used in production settings to communicate with the [Prime Protocol](https://github.com/PrimeIntellect-ai/protocol) worker. In most cases, you should not need to use environment variables. - -The precedence order will be important if multiple sources try to configure the same argument. For example, in the following command, all sources will define a model name - -```toml -# qwen8b.toml -[model] -name = "Qwen/Qwen3-8B" -``` - -```toml -# qwen14b.toml -[model] -name = "Qwen/Qwen-14B" -``` - -```bash -PRIME_MODEL__NAME=Qwen/Qwen3-4B uv run ... @ qwen8b.toml @ qwen14b.toml --model.name Qwen/Qwen3-32B -``` - -In this example, the CLI argument `--model.name Qwen/Qwen3-32B` will take precedence and the script will use `Qwen/Qwen3-32B` as the model name. If the CLI argument wasn't set, then the second config file would take precedence and the script would use `Qwen/Qwen-14B` as the model name. If the second config file wasn't set, then the first config file would take precedence and the script would use `Qwen/Qwen3-8B` as the model name. Finally, if the first config file wasn't set, then the environment variable would take precedence and the script would use `Qwen/Qwen3-4B` as the model name. If the environment variable wasn't set, then the default value would be used and the script would use `Qwen/Qwen3-0.6B` as the model name. 
diff --git a/docs/configuration.md b/docs/configuration.md new file mode 100644 index 0000000000..e7825fb404 --- /dev/null +++ b/docs/configuration.md @@ -0,0 +1,189 @@ +# Configuration + +Every `prime-rl` entrypoint — `rl`, `sft`, `trainer`, `orchestrator`, `inference`, `eval` — is configured by the same system: TOML files for reproducible base configs, CLI flags for one-off overrides, and a small set of environment variables for production deployments. Under the hood it is [`pydantic-config`](https://github.com/PrimeIntellect-ai/pydantic-config) wrapping our Pydantic config models. + +## Table of Contents + +- [Sources and precedence](#sources-and-precedence) +- [TOML files and composition](#toml-files-and-composition) +- [CLI overrides](#cli-overrides) +- [Environment variables](#environment-variables) +- [Inspecting and validating](#inspecting-and-validating) +- [Special syntax](#special-syntax) + - [Booleans, `None`, and lists](#booleans-none-and-lists) + - [Optional sub-configs](#optional-sub-configs) + - [Discriminated unions](#discriminated-unions) + - [Environments (`[[orchestrator.train.env]]`)](#environments-orchestratortrainenv) +- [Worked example](#worked-example) +- [Conventions](#conventions) + +## Sources and precedence + +For a single field, sources are applied in this order — later sources win: + +1. **Defaults** — declared on the Pydantic model. +2. **Environment variables** — prefixed with `PRIME_`, double underscore (`__`) for nesting (`PRIME_MODEL__NAME=...`). +3. **TOML files** — passed with `@`, left to right (later files override earlier ones). +4. **CLI flags** — dotted, kebab-case (`--model.name`). + +Recommendation: pin reproducible experiments in TOML, override one-off knobs (W&B name, output dir, max steps) on the CLI, and reserve env vars for things that vary by deployment (API keys, infra endpoints). + +## TOML files and composition + +`@` introduces a TOML file. 
Multiple `@` arguments compose left-to-right, deep-merged — unset fields in an overlay keep the base value: + +```bash +uv run rl @ configs/gsm8k/rl.toml # one file +uv run rl @ base.toml @ overlay.toml # left to right +uv run rl --trainer @ trainer.toml --orchestrator @ orch.toml # per-section +uv run rl @ base.toml --trainer @ trainer.toml # mixed +``` + +**Mind the space**: `@ path/to/x.toml`, not `@path/to/x.toml`. + +The composed `rl` entrypoint splits its config across three processes — `[trainer]`, `[orchestrator]`, and `[inference]` tables become the sub-configs for each. Shared knobs (`model.name`, `output_dir`, `wandb.*`, …) live at the top level and are fanned out automatically. Stand-alone entrypoints (`uv run trainer`, `uv run orchestrator`, …) skip this lifting — their TOMLs have no `[trainer]` table because the whole file _is_ the trainer. + +## CLI overrides + +CLI flags mirror the TOML tree using dots, with kebab-case for field names (the leading `--` is a kebab-case marker; TOML stays snake_case): + +```bash +--max-steps 50 # top-level +--model.name Qwen/Qwen3-4B # nested +--trainer.optim.lr 1e-5 # double-nested +--inference.parallel.tp 4 +``` + +Field names in TOML use snake_case (`max_model_len`); the same field on the CLI is kebab-case (`--max-model-len`). + +Three convenience flags every entrypoint accepts: + +- `--help` — prints the full schema (all fields, defaults, types, descriptions). +- `--dry-run` — resolves the full config, writes it to `/configs/`, and exits without launching anything. Use to debug composition. +- `--output-dir ` — top-level override for the run's working directory (logs, checkpoints, weight snapshots). + +## Environment variables + +Env vars are read on top of defaults but below TOML and CLI. 
The convention is `PRIME_` with `__` as the dot separator: + +```bash +export PRIME_MODEL__NAME=Qwen/Qwen3-0.6B +export PRIME_TRAINER__OPTIM__LR=1e-5 +``` + +In practice only a few env vars are used routinely: + +- `PRIME_LOG_LEVEL` / `PRIME_VF_LOG_LEVEL` — log levels for the prime-rl and verifiers loggers (the `[log]` defaults read these). +- `WANDB_API_KEY` / `HF_TOKEN` — third-party credentials. +- `PRIME_API_KEY` — for [Prime Intellect platform monitoring](training.md#platform-monitoring). + +## Inspecting and validating + +```bash +uv run rl --help # full schema +uv run rl @ rl.toml --dry-run --output-dir /tmp/check # write resolved configs +``` + +`--dry-run` is the single most useful debugging tool: it runs every Pydantic validator (catching incompatibilities like CP requiring flash-attention, or NCCL weight broadcast requiring `max_async_level=1`) and dumps the fully merged config to disk. If a run misbehaves in mysterious ways, dry-run it first and inspect `/configs/`. + +When a validator fails, the error names the conflicting fields — fix one and re-run dry-run until clean. + +## Special syntax + +### Booleans, `None`, and lists + +**Booleans** — CLI uses paired flags: `--ckpt` enables, `--no-ckpt` disables. TOML must be explicit: + +```toml +ckpt = true +``` + +**None** — TOML has no `null`. Use the string `"None"`, which the loader coerces: + +```toml +[inference.model] +max_model_len = "None" +``` + +On the CLI: `--inference.model.max-model-len None`. + +**Lists** — TOML uses arrays of tables (see the env example below). Overlays **replace** lists wholesale, so an overlay that only wants to add an env still has to include the full list. On the CLI, index by position: + +```bash +--orchestrator.train.env.0.id math-env --orchestrator.train.env.1.id reverse-text +``` + +### Optional sub-configs + +Many sub-configs are typed `SomeConfig | None`. 
Two patterns enable them: + +- **Bare flag with defaults**: `--model.compile` or, in TOML, an empty section `[model.compile]`. The sub-config materializes with all-default values. +- **Enable and set fields together**: `--model.compile.fullgraph` (CLI) or any populated `[model.compile]` table (TOML). + +This is how `[ckpt]`, `[model.lora]`, `[model.compile]`, `[trainer.wandb]`, etc. are turned on. + +### Discriminated unions + +Loss, advantage, optimizer, scheduler, weight broadcast transport, and several others are discriminated unions. Set the `type` field to pick a variant: + +```toml +[trainer.optim] +type = "muon" +lr = 1e-5 +mu = 0.95 +``` + +Omit `type` to keep the default variant. See [Reference](reference.md) for every variant's fields. + +### Environments (`[[orchestrator.train.env]]`) + +Training environments are an array of tables — set one per env, optionally with sampling weights: + +```toml +[[orchestrator.train.env]] +id = "math-env" +name = "gsm8k" +args = { dataset_name = "openai/gsm8k", dataset_subset = "main" } + +[[orchestrator.train.env]] +id = "reverse-text" +ratio = 0.25 # 25% of batches; remaining 75% goes to math-env + +[[orchestrator.eval.env]] +id = "math-env" +name = "gsm8k-eval" +args = { dataset_name = "openai/gsm8k", dataset_subset = "main" } +``` + +`args` is forwarded verbatim to the environment's `load_environment(**args)`. See each environment's README on the [Hub](https://app.primeintellect.ai/dashboard/environments) for accepted args. 
+ +## Worked example + +Start from a shipped base config, override two fields on the CLI, and dry-run: + +```bash +uv run rl @ configs/gsm8k/rl.toml \ + --wandb.name my-experiment \ + --trainer.optim.lr 5e-6 \ + --dry-run \ + --output-dir /tmp/gsm8k-dry +``` + +Then inspect the resolved config: + +```bash +ls /tmp/gsm8k-dry/configs/ +# rl.toml trainer.toml orchestrator.toml inference.toml +``` + +Each per-process TOML reflects the final, validated configuration that the actual run would consume — exactly what each process sees when started standalone (`uv run trainer @ /tmp/gsm8k-dry/configs/trainer.toml`, etc.). This is the easiest way to bisect a misbehaving config: dry-run a known-good base, dry-run your overlay, diff the two. + +## Conventions + +- **Reproducible base, mutable overlays.** Commit base TOMLs alongside example dirs (`configs//rl.toml`). Override on the CLI for one-shot experiments; promote overrides to a new TOML when they stabilize. +- **One W&B name per run.** Pass `--wandb.name ` on every launch. The orchestrator and trainer share the W&B run, so the same name surfaces all metrics together. +- **Always pin `output_dir`.** Per-run output directories prevent rollout files from one run leaking into another's training step. Use `--output-dir outputs/` or pin in TOML. +- **Prefer `--ckpt` for any run you might resume.** Without `ckpt`, only HF weight snapshots are written — you can serve them but cannot resume optimizer/scheduler state. See [Training § Checkpointing](training.md#checkpointing). +- **Dry-run before scaling.** A multi-node SLURM job that crashes on a config validator wastes a queue slot. Always `--dry-run` first. + +For the full set of fields, defaults, types, and constraints accepted by each entrypoint, jump to [Reference](reference.md). 
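
The precedence and deep-merge behaviour described under "Sources and precedence" and "TOML files and composition" can be modeled with a small sketch. This is a simplified illustration, not the actual `pydantic-config` loader: `deep_merge` and `resolve` are hypothetical helpers, and the field values are placeholders rather than real defaults.

```python
def deep_merge(base: dict, overlay: dict) -> dict:
    """Return base updated by overlay; nested tables merge, everything else is replaced."""
    merged = dict(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars and lists are replaced wholesale, as the docs note for lists.
            merged[key] = value
    return merged


def resolve(defaults: dict, env: dict, toml_files: list[dict], cli: dict) -> dict:
    """Apply sources in order: defaults, env vars, TOML files left to right, CLI."""
    config = defaults
    for source in [env, *toml_files, cli]:
        config = deep_merge(config, source)
    return config


resolved = resolve(
    defaults={"model": {"name": "default-model", "seq_len": 2048}, "max_steps": 100},
    env={"model": {"name": "env-model"}},
    toml_files=[
        {"model": {"name": "base-model"}, "env": [{"id": "a"}, {"id": "b"}]},
        {"model": {"seq_len": 4096}, "env": [{"id": "c"}]},
    ],
    cli={"max_steps": 50},
)
```

In this model the later TOML file overrides `seq_len` while keeping the base file's `name`, its `env` list replaces the earlier list rather than appending to it, and the CLI value for `max_steps` wins over everything else.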
diff --git a/docs/deployment.md b/docs/deployment.md deleted file mode 100644 index 38d911b122..0000000000 --- a/docs/deployment.md +++ /dev/null @@ -1,299 +0,0 @@ -# Deployment - -You can deploy PRIME-RL on a single GPU and larger multi-node clusters. - -## SFT - -### Single-GPU - -For training on a single GPU, no communication orchestration is required and you can choose whether to start your trainer using our trainer entrypoint or using `torchrun`. - -To start with our `sft` entrypoint - -```bash -uv run sft ... -``` - -To do the same thing, but using `torchrun` - -```bash -uv run torchrun src/prime_rl/trainer/sft/train.py ... -``` - -### Multi-GPU - -For training on multiple GPUs, use `torchrun` with the `--nproc-per-node` flag. - -```bash -uv run torchrun \ - --local-rank-filter 0 \ - --nproc-per-node 8 \ - src/prime_rl/trainer/sft/train.py ... -``` - -*The `--local-rank-filter` flag is used to only log the logs from the master rank, as detailed in [logging](logging.md).* - -### Multi-Node - -For training on multiple nodes, use `torchrun` with the `--nnodes`, `--node-rank`, and `--rdzv-endpoint` flags. - -First, decide which node will be your head node and find a reachable private IP address for it. If your nodes are not colocated, you will likely need to setup VPN (e.g. [Tailscale](https://tailscale.com)) for the nodes to reach each other. - -(*Skip this step if the default network interface is sufficient.*) Make sure to set the network interface for GLOO and NCCL to one that allows all nodes to reach each other. - -```bash -# On both nodes -export GLOO_SOCKET_IFNAME=... -export NCCL_SOCKET_IFNAME=... -``` - -Then, configure the rendezvous endpoint to allow the nodes to find each other. Here, `MASTER_ADDR` is the private IP address of the head node and `MASTER_PORT` is a free port on the head node, typically port 29500 for `torchrun`. - -```bash -# On both nodes -export MASTER_ADDR=... -export MASTER_PORT=... 
-``` - -Then, on the head node, run - -```bash -# On node 0 -uv run torchrun \ - --nnodes 2 \ - --node-rank 0 \ - --rdzv-endpoint=$MASTER_ADDR:$MASTER_PORT \ - --local-rank-filter 0 \ - --nproc-per-node 8 \ - src/prime_rl/trainer/sft/train.py ... -``` - -And on the second node, run - -```bash -# On node 1 -uv run torchrun \ - --nnodes 2 \ - --node-rank 1 \ - --rdzv-endpoint=$MASTER_ADDR:$MASTER_PORT \ - --local-rank-filter 0 \ - --nproc-per-node 8 \ - src/prime_rl/trainer/sft/train.py ... -``` - -### SLURM - -See the dedicated [SLURM guide](slurm.md). - -## Inference - -For SLURM-based inference deployment, see the [SLURM guide](slurm.md#inference-examples). Each node runs an independent vLLM replica — no manual coordination needed. - -For manual multi-node deployment without SLURM, we rely on vLLM's multi-node data parallel deployment primitives ([docs](https://docs.vllm.ai/en/v0.10.0/serving/data_parallel_deployment.html)). - -First, decide which node will be your head node and find a reachable private IP address for it. If your nodes are not colocated, you will likely need to setup VPN (e.g. [Tailscale](https://tailscale.com)) for the nodes to reach each other. - -(*Skip this step if the default network interface is sufficient.*) Make sure to set the network interface for GLOO and NCCL to one that allows all nodes to reach each other. - -```bash -# On both nodes -export GLOO_SOCKET_IFNAME=... -export NCCL_SOCKET_IFNAME=... -``` - -Then, configure the data parallel address as the private IP address of the head node. - -```bash -# On both nodes -export DATA_PARALLEL_ADDRESS=... -export DATA_PARALLEL_RPC_PORT=... 
-``` - -To run TP=4 and DP=4 with DP ranks 0 and 1 on the head node and DP ranks 2 and 3 on the second node, run - -```bash -# On node 0 -uv run inference \ - --data-parallel-size 4 \ - --tensor-parallel-size 4 \ - --data-parallel-size-local 2 \ - --data-parallel-address $DATA_PARALLEL_ADDRESS \ - --data-parallel-rpc-port $DATA_PARALLEL_RPC_PORT -``` - -```bash -# On node 1 -uv run inference \ - --data-parallel-size 4 \ - --tensor-parallel-size 4 \ - --data-parallel-size-local 2 \ - --data-parallel-address $DATA_PARALLEL_ADDRESS \ - --data-parallel-rpc-port $DATA_PARALLEL_RPC_PORT \ - --data-parallel-start-rank 2 \ - --headless -``` - -## RL - -### Single-GPU Training - -If you only have access to a single GPU, you may still be able to run small RL experiments. To do so, configure your inference server to use only a fraction of the available memory to leave some space for the trainer. - -For example, to run an RL training on a single GPU while using 50% of the available memory for the inference server, run - -```bash -bash scripts/tmux.sh -``` - -```bash -uv run rl \ - --trainer @ path/to/train.toml \ - --orchestrator @ path/to/orch.toml \ - --inference @ path/to/infer.toml \ - --trainer-gpu-ids 0 \ - --inference-gpu-ids 0 \ - --inference.gpu-memory-utilization 0.5 -``` - -*Make sure to tune the `--gpu-memory-utilization` value such that you have enough GPU memory for the RL trainer.* - -You can also set this up by starting each submodule manually. - -```bash -# Run this in the `Inference` pane -uv run inference @ path/to/infer.toml --gpu-memory-utilization 0.5 -``` - -```bash -# Run this in the `Orchestrator` pane -uv run orchestrator @ path/to/orch.toml -``` - -```bash -# Run this in the `Trainer` pane -uv run trainer @ path/to/train.toml -``` - -### Multi-GPU Training - -For single-node training, we recommend using the `rl` entrypoint to conveniently start all components, i.e. the inference server, the orchestrator, and the trainer. 
- -By default, the inference server starts on GPU ID 0 and the trainer on GPU ID 1. - -```bash -uv run rl \ - --trainer @ path/to/train.toml \ - --orchestrator @ path/to/orch.toml \ - --inference @ path/to/infer.toml \ -``` - -You can configure the GPU IDs to use for the inference server and the trainer. For example, to run the inference server on GPUs IDs 0-5 with data parallelism and the trainer on GPUs IDs 6-7 - -```bash -uv run rl \ - --trainer @ path/to/train.toml \ - --orchestrator @ path/to/orch.toml \ - --inference @ path/to/infer.toml \ - --inference-gpu-ids 0,1,2,3,4,5 \ - --trainer-gpu-ids 6,7 \ - --inference.parallel.dp 6 -``` - -### Parallel Experiments - -For quick ablations, it can be more efficient to parallelize experiments within a node (e.g. split your GPUs to run two experiments in parallel). For example, if you have access to 4 GPUs and your experiment fits on 2 GPUs, you can parallelize two experiments as follows: - -Start the first experiment in a tmux session `exp1` with outputs directory `outputs1`. Specify it both in the tmux script, as well as in the start command (*will use the first 2 GPUs*) - -```bash -bash scripts/tmux.sh -s exp1 -o outputs1 -``` - -```bash -# Run this in the `Trainer` pane -uv run rl \ - --trainer @ path/to/train.toml \ - --orchestrator @ path/to/orch.toml \ - --inference @ path/to/infer.toml \ - --output-dir outputs1 -``` - -For the second experiment, start a second tmux session named `exp2` with outputs directory `outputs2`. 
In addition, specify a new server port for the inference engine and orchestrator (*will use the last 2 GPUs*) - -```bash -bash scripts/tmux.sh -s exp-2 -o outputs2 -``` - -```bash -# Run this in the `Trainer` pane -uv run rl \ - --trainer @ path/to/train.toml \ - --orchestrator @ path/to/orch.toml \ - --inference @ path/to/infer.toml \ - --inference-gpu-ids 2 \ - --trainer-gpu-ids 3 \ - --inference.server.port 8001 \ - --orchestrator.client.base-url http://localhost:8001/v1 \ - --output-dir outputs2 -``` - -### Multi-Node Training - -> We currently require a shared file system for multi-node RL training. - -To facilitate multi-node RL training, ensure that all nodes have access to a shared file system and that the node that will run the inference server is reachable from the orchestrator via a private or public IP address. Then, set the following environment variables on all nodes: - -```bash -# On all nodes -export OUTPUT_DIR=... # Path to directory in shared file system -export INFERENCE_SERVER_IP=... # Reachable IP address of the inference node -export INFERENCE_SERVER_API_KEY=... # API key for the inference server -``` - -Then, start the inference server on one node. - -```bash -# On one node -uv run inference ... \ - --api-key $INFERENCE_SERVER_API_KEY --parallel ... -``` - -Then, start a single orchestrator - -```bash -# On either node -uv run orchestrator ... \ - --client.base-url http://$INFERENCE_SERVER_IP:8000/v1 \ - --client.api-key-var INFERENCE_SERVER_API_KEY \ - --output-dir $OUTPUT_DIR -``` - -Finally, start the trainer on one as described in the [Trainer](#trainer) section. - -```bash -# On other node -uv run torchrun \ - --nproc-per-node 8 \ - --local-rank-filter 0 \ - src/prime_rl/trainer/rl/train.py ... \ - --output-dir $OUTPUT_DIR -``` - -Of course, you can further scale up the number of nodes used by the trainer and inference server, as described in the sections above. However, make sure that there is only a single orchestrator instance. 
- -### SLURM - -See the dedicated [SLURM guide](slurm.md). - -## Kubernetes - -For deployments on Kubernetes clusters, PRIME-RL provides a Helm chart that manages the entire training infrastructure including orchestrator, trainer, and inference components with automatic pod scheduling, GPU allocation, and shared storage. - -See the dedicated [Kubernetes guide](kubernetes.md) for complete documentation including: - -- Prerequisites and setup -- Quick start examples -- Component architecture -- Scaling and distributed training -- Configuration options -- Troubleshooting diff --git a/docs/disaggregated-inference.md b/docs/disaggregated-inference.md deleted file mode 100644 index 65f5dacf84..0000000000 --- a/docs/disaggregated-inference.md +++ /dev/null @@ -1,91 +0,0 @@ -# Disaggregated Prefill/Decode Inference - -Run MoE models with separate prefill and decode node groups for higher throughput. - -## Quick Start - -See [`configs/glm5_disagg_inference/inference.toml`](../configs/glm5_disagg_inference/inference.toml) for an example config. - -```bash -uv run inference @ configs/glm5_disagg_inference/inference.toml --output-dir /data/$USER/outputs -``` - -## Prefill/Decode Ratio - -| Workload | Recommended ratio (P:D) | Why | -|---|---|---| -| Agentic (SWE, Lean) | **3:1** | Long growing contexts → prefill-heavy | -| Non-agentic (math, chat) | **1:2** | Short prompts, long generations → decode-heavy | - -Monitor live queue depths: -```bash -curl -s http://:8100/metrics | grep num_requests_waiting -curl -s http://:8200/metrics | grep num_requests_waiting -``` - -If prefill has queued requests and decode has zero, add more prefill nodes (and vice versa). 
- -For historical averages (cumulative over the entire run), query the histogram metrics: -```bash -# Average queue time per request (seconds) -curl -s http://:/metrics | awk ' - /request_queue_time_seconds_sum\{/ { sum += $2 } - /request_queue_time_seconds_count\{/ { count += $2 } - END { if (count > 0) printf "avg queue: %.2fs (%d requests)\n", sum/count, count } -' - -# Average prefill/decode compute time -curl -s http://:/metrics | awk ' - /request_prefill_time_seconds_sum\{/ { ps += $2 } - /request_prefill_time_seconds_count\{/ { pc += $2 } - /request_decode_time_seconds_sum\{/ { ds += $2 } - /request_decode_time_seconds_count\{/ { dc += $2 } - END { - if (pc > 0) printf "avg prefill: %.2fs\n", ps/pc - if (dc > 0) printf "avg decode: %.2fs\n", ds/dc - } -' -``` - -Other useful metrics on the `/metrics` endpoint: -- `vllm:e2e_request_latency_seconds` — end-to-end latency -- `vllm:kv_cache_usage_perc` — KV cache memory pressure -- `vllm:nixl_xfer_time_seconds` — NIXL KV transfer duration -- `vllm:nixl_bytes_transferred` — bytes per KV transfer - -## UCX 1.19 - -NVSHMEM requires UCX >= 1.19 for multi-GPU CUDA support. Most clusters ship UCX 1.17 (via HPC-X), which causes `cuStreamCreate: invalid device context` errors during DeepEP internode dispatch. - -**Check your version:** -```bash -/opt/hpcx/ucx/bin/ucx_info -v | head -1 -# If < 1.19, you need to build from source -``` - -**Build UCX 1.19 (run once on a GPU node):** -```bash -salloc -N 1 --gres=gpu:1 bash -c 'bash scripts/install_nixl_from_source.sh' -``` - -This installs UCX 1.19 to `prime-rl/third_party/ucx/`. The sbatch template automatically adds it to `LD_LIBRARY_PATH`, overriding the system version. - -## Troubleshooting - -### `DeepEP error: timeout (dispatch CPU)` -NVSHMEM internode communication failing. Check: -1. UCX version >= 1.19? (`third_party/ucx/bin/ucx_info -v`) -2. NVSHMEM libs reachable at `/tmp/deepep_build/nvshmem/lib/`? 
If not: - ```bash - ssh 'mkdir -p /tmp/deepep_build/nvshmem && \ - ln -sfn /lib/python3.12/site-packages/nvidia/nvshmem/lib \ - /tmp/deepep_build/nvshmem/lib' - ``` -3. IBGDA driver enabled? `ssh 'cat /proc/driver/nvidia/params | grep EnableStreamMemOPs'` should show `1`. - -### Router healthy but requests hang -NIXL side channel not running on prefill. Check: -```bash -ssh 'ss -tlnp sport ge :5600 sport le :5608 | grep -c LISTEN' -# Should show 8 (one per DP rank). If 0, check logs for UCX/NVSHMEM errors. -``` diff --git a/docs/entrypoints.md b/docs/entrypoints.md deleted file mode 100644 index 6a97a60b5b..0000000000 --- a/docs/entrypoints.md +++ /dev/null @@ -1,67 +0,0 @@ -# Entrypoints - -## RL - -The main usecase of PRIME-RL is RL training. Three main abstractions facilitate RL training: the **orchestrator**, the **trainer**, and the **inference** service. - -![Architecture](assets/architecture.png) - -### Orchestrator - -The orchestrator is a lightweight CPU process that handles the core data and scheduling logic, serving as an intermediary between the trainer and inference service with bidirectional relays. In one direction, it collects rollouts from the inference server, assembles them into packed batches, and dispatches them to the trainer; in the other direction, it relays updated model weights from the trainer to the inference service. The orchestrator utilizes `verifiers` environments to abstract multi-turn rollout generation and scoring. Each training and evaluation environment is exposed as a `vf.EnvServer` as a sidecar to the orchestrator process (default) or as a standalone process (e.g. used in hosted training to run environments in containers). - -### Trainer - -The trainer is responsible for producing an updated policy model given rollouts and advantages. We use FSDP2 as the backend with compatibility for any HuggingFace (HF) model. For some models we also provide custom implementations, mostly for performance reasons. 
FSDP shards model parameters, gradients, and optimizer states, allowing training large models with data parallelism and minimal GPU memory footprint. We support a variety of popular training objectives, such as GRPO, GSPO, OPO, RLOO and [CISPO](https://arxiv.org/abs/2506.13585). The trainer is inspired by [`torchtitan`](https://github.com/pytorch/torchtitan) and relies on native PyTorch features to implement advanced parallelism techniques, such as tensor, context or expert parallelism. - -### Inference - -The inference service in its simplest form is a standard OpenAI-compatible server with a vLLM backend. The API specification is extended with a custom `update_weights` endpoint to reload model weights from a HF-compatible checkpoint on disk. Otherwise, we rely on vLLM's optimized kernels, parallelism strategies, and scheduling for fast rollout generation. Given the disaggregated nature of the service architecture, it can be directly extended to include multiple engines with a shared request pool, allowing operation across multiple clusters and straightforward integration of alternative inference engines (e.g. SGLang, Tokasaurus). We also heavily rely on native data parallelism in vLLM (also available in SGLang) for orchestrating the fleet of nodes dedicated to inference. - -### RL - -For doing RL training all components need to be started. One can do this manually: - -```bash -uv run inference ... -``` - -```bash -uv run orchestrator ... -``` - -```bash -uv run trainer ... -``` - -Or, alternatively on a single node, use the `rl` entrypoint to start all components. - -```bash -uv run rl \ - --trainer @ path/to/train.toml \ - --orchestrator @ path/to/orch.toml \ - --inference @ path/to/infer.toml \ - ... -``` - -For more details on multi-node deployment options, see the [deployment](deployment.md) documentation and see the [examples](examples) for concrete training configurations. To see all available configuration options, run `uv run rl --help`. 
- -## SFT - -We provide a fairly straight-forward SFT trainer which is capable of fine-tuning any conversational model on multi-turn conversation with tool calling. It shares a lot of components with the RL trainer, such as the modeling code, parallelism techniques, checkpoint format, logger, etc. which ensures a seamless post-training workflow. - -To start an SFT training, you need to prepare a conversational dataset in either [prompt-completion format](https://huggingface.co/docs/trl/en/dataset_formats#prompt-completion) or raw `messages` format. If `messages` is provided, the trainer interprets the full conversation as a single sample with an empty prompt and applies role-based loss masking across the whole chat. If both `messages` and `prompt` / `completion` are present, `messages` takes precedence. Single-turn fine-tuning should be compatible with the chat templates of most models. However, to properly handle loss masking, we require that the tokenizer's chat template satisfies a prefix property: the tokenization of any conversation prefix must be a prefix of the tokenization of the full conversation. For instance, tokenizing message 1 should yield a token sequence that forms a prefix of tokenizing messages 1 and 2, which in turn should be a prefix of tokenizing messages 1, 2, 3, and so forth. An example of a chat template that *does not* satisfy this property is Qwen3's chat template, as it strips away past think sections. - -On a single GPU, start the training with the `sft` entrypoint - -```bash -uv run sft ... -``` - -If you have access to multiple GPUs, use [`torchrun`](https://docs.pytorch.org/docs/stable/elastic/run.html) with `--nproc-per-node` to start the training. - -```bash -uv run torchrun --nproc-per-node 8 src/prime_rl/trainer/sft/train.py ... -``` - -For more details on multi-node deployment options, see the [deployment](deployment.md) documentation and see the [examples](examples) for concrete training configurations. 
To see all available configuration options, run `uv run sft --help`. diff --git a/docs/environments.md b/docs/environments.md deleted file mode 100644 index 69fe15e625..0000000000 --- a/docs/environments.md +++ /dev/null @@ -1,32 +0,0 @@ -# Environments - -PRIME-RL can train and evaluate in any [`verifiers`](https://github.com/willccbb/verifiers) environments. To train in a new environment, simply install it from the [Environment Hub](https://app.primeintellect.ai/dashboard/environments) or install a local environment. - -## Installation - -You can explore the installation options using - -```bash -prime env info / -``` - -To install an environment temporarily - -```bash -prime env install / -# Or: uv pip install --extra-index-url https://hub.primeintellect.ai//simple/ -``` - -To install a local environment - -```bash -uv pip install -e path/to/env -``` - -To verify your installation - -```bash -uv run python -c "import " -``` - -For more details on environments, see our Environments Hub documentation [here](https://docs.primeintellect.ai/tutorials-environments/environments). \ No newline at end of file diff --git a/docs/faqs.md b/docs/faqs.md new file mode 100644 index 0000000000..35cd4ded5d --- /dev/null +++ b/docs/faqs.md @@ -0,0 +1,265 @@ +# FAQs + +Frequently-asked questions, grouped by topic. For full background see the linked pages. + +## Table of Contents + +- [Getting started](#getting-started) +- [Configs](#configs) +- [RL training](#rl-training) +- [SFT training](#sft-training) +- [Checkpoints and resume](#checkpoints-and-resume) +- [Scaling](#scaling) +- [Memory and OOM](#memory-and-oom) +- [Observability](#observability) +- [Models and environments](#models-and-environments) + +## Getting started + +### What's the fastest way to verify my install works? 
+ +The SFT debug config runs end-to-end on CPU or any single GPU with fake data: + +```bash +uv run sft @ configs/debug/sft/train.toml +``` + +For the full RL stack on 2 GPUs, the GSM8K example is the smallest realistic run: + +```bash +prime env install primeintellect/math-env +bash scripts/tmux.sh +uv run rl @ configs/gsm8k/rl.toml --wandb.name smoke-test --ckpt +``` + +See [Overview § Quick run](overview.md#quick-run). + +### What hardware do I need? + +Any NVIDIA GPU with compute capability ≥ 8.0 (RTX 3090/4090/5090, A100, H100, H200, B200). MoE training with FP8 needs SM ≥ 9.0 (Hopper or newer). RL needs at least 2 GPUs in practice (1 inference + 1 trainer), but you can co-locate both on a single GPU for the smallest debug runs. + +### Why `uv run` and not `python`? + +`uv` manages our virtualenv lock and pins dependencies precisely. Running `python` directly will pick up a different interpreter or miss extras (e.g. DeepGEMM). Always use `uv run `. + +## Configs + +### My TOML file isn't being loaded — what's wrong? + +Most common cause: missing space between `@` and the path. Use `@ rl.toml`, not `@rl.toml`. Otherwise the loader treats `@rl.toml` as a CLI flag. + +If the file is loading but Pydantic complains, run `--dry-run --output-dir /tmp/x` to see exactly which field doesn't match the schema. + +### How do I disable a feature that's enabled by default? + +For an optional sub-config typed `SomeConfig | None`, pass the `--no-` form on the CLI: + +```bash +uv run rl @ rl.toml --no-trainer.gc # disable garbage collection config +``` + +In TOML, comment out or remove the section. + +### How do I override an env var in TOML? + +You can't directly — env vars are a separate source. To force a fixed value, set it in TOML; the precedence order (CLI > TOML > env > defaults) means the TOML wins. + +### How do I add a new environment to my training mix? + +Add another `[[orchestrator.train.env]]` table. 
Lists are replaced wholesale on overlay, so include the full list every time: + +```toml +[[orchestrator.train.env]] +id = "math-env" +args = { dataset_name = "openai/gsm8k", dataset_subset = "main" } + +[[orchestrator.train.env]] +id = "reverse-text" +``` + +See [Configuration § Environments](configuration.md#environments-orchestratortrainenv). + +## RL training + +### What does `max_async_level` actually do? + +It caps how many steps inference can run ahead of training. `1` is pipelined (overlapped) but fully responsive; `2` (default) absorbs slower weight broadcasts. Higher values give more throughput at the cost of off-policy drift. Watch `mismatch_kl/all/mean` — if it grows, lower the value. See [Algorithms § Tuning `max_async_level`](algorithms.md#tuning-max_async_level). + +### Why are there two W&B runs per RL job? + +The trainer and orchestrator log as separate runs so their step indices and timings stay independent. The names are `-trainer` and `-orchestrator`. Group them in W&B if you want a unified view. + +### My reward isn't improving. What should I check first? + +In order: + +1. `reward/all/mean` and `reward/all/std`. If std is ~0, the env is too easy or rewards are degenerate — increase difficulty or check the rubric. +2. `is_truncated/all/mean`. If high, your model is hitting `max_completion_tokens` — either raise it or the model isn't learning to stop. +3. Eval scores vs train rewards. If train reward rises but eval is flat, you may be hitting a chat-template prefix violation; see [Algorithms § Multi-turn trajectories](algorithms.md#multi-turn-trajectories). +4. `mismatch_kl/all/mean`. If growing, drop `max_async_level` or LR. +5. `optim/grad_norm`. Sustained spikes mean you're about to diverge — drop LR. + +### How do I evaluate without training? 
+ +Use `vf-eval`: + +```bash +uv run vf-eval math-env \ + -a '{"dataset_name": "openai/gsm8k", "dataset_subset": "main"}' \ + -m PrimeIntellect/Qwen3-0.6B \ + -b http://localhost:8000/v1 -n 50 -t 2048 +``` + +This talks to any OpenAI-compatible endpoint, so it works against `uv run inference`, hosted endpoints, or a stale checkpoint mid-run. + +### What's the difference between `training_mode = "sft"` and the standalone `uv run sft`? + +`uv run sft` is the traditional path: load a HF dataset, train the model. No orchestrator, no teacher. + +`orchestrator.training_mode = "sft"` uses the RL pipeline to hard-distill from a teacher: the teacher (any OpenAI-compatible endpoint) generates the completions, and the student trains on them as they're produced. Use this when you want on-the-fly teacher supervision against a moving student. See [Training § Training modes](training.md#training-modes-rl--opd--sft-via-orchestrator). + +## SFT training + +### Why does Qwen3 fail multi-turn SFT silently? + +Qwen3's default chat template strips past `` blocks when re-tokenizing, which violates the prefix property the SFT trainer depends on. Use a model with a patched chat template (e.g. `PrimeIntellect/Qwen3-0.6B`) or set `orchestrator.renderer.preserve_all_thinking = true`. See [Algorithms § Multi-turn trajectories](algorithms.md#multi-turn-trajectories). + +### Can I train on `prompt`/`completion` and `messages` mixed in one dataset? + +Yes — if both columns are present in a row, `messages` takes precedence. The trainer will use `messages` for that row and ignore `prompt`/`completion`. + +### How do I do tool-calling SFT? + +Tool-calling SFT works out of the box if your dataset uses the `messages` format with tool messages embedded. The renderer handles tool turns the same as assistant turns. Make sure your model's chat template supports tool tokens. + +## Checkpoints and resume + +### How often should I checkpoint? 
+ +For long runs: every 50–100 steps with `--ckpt.keep-last 3` for rolling rollback and `--ckpt.keep-interval 500` for permanent milestones. For short experiments: end-of-training only (`--ckpt` with no interval). See [Training § Checkpointing](training.md#checkpointing). + +### How do I resume from the latest checkpoint? + +```bash +uv run rl @ rl.toml --max-steps 100 --ckpt.resume-step -1 # -1 means latest +``` + +`--max-steps` is the absolute target, not the remainder. + +### Can I serve a mid-training weight checkpoint? + +Yes. HF-compatible weights are written to `/weights/step_/` on every checkpoint. Use them directly: + +```bash +uv run inference --model.name outputs/weights/step_500 +``` + +or upload to HF: + +```bash +uv run hf upload /-RL-500 outputs/weights/step_500 +``` + +### Can the inference server stay running across trainer restarts? + +Yes. The orchestrator pushes the resumed checkpoint into inference automatically. No need to restart the inference server. + +## Scaling + +### Single GPU is too tight. What's the minimum useful layout? + +2 GPUs — one for inference, one for the trainer. The default placement in `rl` does exactly this. See [Scaling § Single-node multi-GPU](scaling.md#single-node-multi-gpu). + +### Multi-node without SLURM or K8s? + +Yes, see [Scaling § Multi-node (manual)](scaling.md#multi-node-manual). You need a shared filesystem and a reachable inference IP. Set the three `OUTPUT_DIR` / `INFERENCE_SERVER_IP` / `INFERENCE_SERVER_API_KEY` env vars on every node and launch each process by hand. + +### How big a difference does NCCL weight broadcast make? + +NCCL broadcast is much faster than filesystem for local-cluster setups, at the cost of being synchronous: it requires `max_async_level = 1` and doesn't support LoRA today. Use it for multi-node single-cluster RL where you want maximum throughput; stick with filesystem for cross-WAN, LoRA, or async-heavy setups. + +### Should I use TP, DP, EP, or CP? 
+ +- **TP (inference)**: scale within a node, up to `num_attention_heads / 2`. Past that, returns diminish. +- **DP (inference and trainer)**: scale throughput linearly across replicas. Default scaling lever. +- **EP (trainer, MoE only)**: shards expert weights; the right knob for MoE memory and throughput together. +- **CP (trainer)**: shards a sequence across GPUs along the token axis. Needed for sequences past ~32K tokens. Stick to CP ≤ 8. + +See [Scaling § Parallelism knobs](scaling.md#parallelism-knobs). + +## Memory and OOM + +### Trainer CUDA OOM — what should I try first? + +In order: + +1. `--model.ac` (full activation checkpointing). +2. Lower `seq_len` or `data.micro_batch_size`. +3. `--model.optim-cpu-offload` (offloads only optimizer state). +4. `--model.cp 2` (context parallelism; requires flash-attention and the custom impl). +5. `--model.fsdp-cpu-offload` as a last resort (significant throughput hit). + +The kitchen-sink config in [Scaling § Memory-tight recipe](scaling.md#memory-tight-recipe) combines all of the above. + +### Inference CUDA OOM? + +Tighten `inference.gpu_memory_utilization` (try 0.85), lower `inference.model.max_model_len`, or split inference across more GPUs with `inference.parallel.dp`. + +### Why is `optim_cpu_offload` not slowing me down much? + +In RL you typically take many gradient-accumulation micro-steps per optimizer step, so the H2D/D2H transfer is amortized. In pretraining the cost is more visible. + +## Observability + +### Where's the log file for a specific environment worker? + +`/logs/envs/{train,eval}//env_worker_.log`. Most silent training kills come from OOM in env workers — start there. + +### How do I get more verbose env logging? + +```bash +uv run rl @ rl.toml --orchestrator.log.vf-level debug +``` + +Or set `PRIME_VF_LOG_LEVEL=debug` in the environment. + +### vLLM is logging too much. Can I quiet it? + +Set `inference.log.level = "warning"` (or pass `--inference.log.level warning`). 
Note that `inference.log` only controls the prime-rl logger; vLLM's own logging is controlled by the `VLLM_LOGGING_LEVEL` env var. + +### What's the fastest way to see KV cache pressure? + +```bash +curl -s http://localhost:8000/metrics | grep gpu_cache_usage_perc +``` + +Approaching 1.0 means the KV cache is saturated and request latency will spike. Reduce `max_model_len` or split across more inference GPUs. + +## Models and environments + +### Which models have a custom optimized implementation? + +GLM-5, Qwen3 MoE, Qwen3.5 MoE, Qwen3 / Qwen3.5 VLMs, Poolside Laguna, MiniMax M2, Nemotron H, Trinity (AFMoE), GLM-4 / GLM-4.5 / INTELLECT-3, GPT-OSS (HF-MoE only). See the table in [Advanced § MoE models](advanced.md#moe-models). + +Other HF causal LMs work via the HF path (`impl = "hf"` or `"auto"`) but without EP, FP8, or the custom kernels. + +### Can I train a VLM? + +Yes — Qwen3-VL, Qwen3.5, Qwen3.5-MoE out of the box. Add `[model.vlm]` and use bfloat16 dtypes. See [Advanced § Vision-language models](advanced.md#vision-language-models). + +### How do I install an environment from the Environments Hub? + +```bash +prime env install primeintellect/math-env +uv run python -c "import math_env" # verify +``` + +Then reference by ID in your config. See [Advanced § Environments](advanced.md#environments). + +### Can I install an environment from outside the Hub? + +Yes — install with `uv pip install -e path/to/my-env` and reference it by its `id` (the env's package name). The orchestrator will discover it. + +### My environment hangs occasionally. What's happening? + +Most likely it's running user code that blocks on a network call or an external service (e.g. a math verifier, a sandbox). Check the env worker logs and the event-loop lag metrics on the env server. The orchestrator's `max_retries` and `errored_rollouts` metric should tell you how often rollouts fail vs hang. 
diff --git a/docs/index.md b/docs/index.md deleted file mode 100644 index aa76871f8c..0000000000 --- a/docs/index.md +++ /dev/null @@ -1,16 +0,0 @@ -# Docs - -This directory maintains the documentation for PRIME-RL. It is organized into the following sections: - -- [**Entrypoints**](entrypoints.md) - Overview of the main components (orchestrator, trainer, inference) and how to run SFT, RL, and evals -- [**Configs**](configs.md) - Configuration system using TOML files, CLI arguments, and environment variables -- [**Environments**](environments.md) - Installing and using verifiers environments from the Environments Hub -- [**Async Training**](async.md) - Understanding asynchronous off-policy training and step semantics -- [**Logging**](logging.md) - Logging with loguru, torchrun, and Weights & Biases -- [**Platform Monitoring**](platform-monitoring.md) - Register runs on the Prime Intellect platform and stream training metrics -- [**MultiRunManager**](multi_run_manager.md) - Multi-run training with the MultiRunManager object for concurrent LoRA adapters -- [**Checkpointing**](checkpointing.md) - Saving and resuming training from checkpoints -- [**Benchmarking**](benchmarking.md) - Performance benchmarking and throughput measurement -- [**Deployment**](deployment.md) - Training deployment on single-GPU, multi-GPU, and multi-node clusters -- [**Kubernetes**](kubernetes.md) - Deploying PRIME-RL on Kubernetes with Helm -- [**Troubleshooting**](troubleshooting.md) - Common issues and their solutions \ No newline at end of file diff --git a/docs/kubernetes.md b/docs/kubernetes.md deleted file mode 100644 index f718f1df01..0000000000 --- a/docs/kubernetes.md +++ /dev/null @@ -1,308 +0,0 @@ -# Kubernetes - -This guide covers deploying PRIME-RL training infrastructure on Kubernetes clusters using the provided Helm chart. 
- -## Prerequisites - -- Kubernetes cluster with GPU nodes -- [NVIDIA GPU Operator](https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/getting-started.html) installed -- [Helm 3.x](https://helm.sh/docs/intro/install/) installed -- Storage class that supports `ReadWriteMany` (e.g., NFS, CephFS, or cloud provider storage) - -### Verify Prerequisites - -```bash -# Check Helm installation -helm version - -# Check GPU operator -kubectl get pods -n gpu-operator - -# Check available storage classes -kubectl get storageclass -``` - -## Quick Start - -### 1. Deploy - -```bash -# Deploy with a release name -helm install my-exp ./k8s/prime-rl -f ./k8s/prime-rl/examples/reverse-text.yaml - -# Or with defaults (no example-specific config) -helm install my-exp ./k8s/prime-rl --set trainer.replicas=3 --set inference.replicas=2 -``` - -### 2. Verify deployment - -```bash -# Check pod status -kubectl get pods -l app.kubernetes.io/instance=my-exp - -# Should show 3 pods: -# my-exp-orchestrator-0 -# my-exp-inference-0 -# my-exp-trainer-0 -``` - -### 3. Run training - -```bash -# Exec into trainer -kubectl exec -it my-exp-trainer-0 -- bash - -# Inside the pod, run training -cd /data -uv run trainer @ /app/examples/reverse_text/configs/train.toml -``` - -### 4. Monitor progress - -```bash -# Get logs -kubectl logs my-exp-trainer-0 - -# Follow logs in real-time -kubectl logs -f my-exp-trainer-0 -``` - -## Available Examples - -The chart includes pre-configured values for each example: - -### reverse-text (Small - 1 GPU) - -```bash -helm install my-exp ./k8s/prime-rl -f ./k8s/prime-rl/examples/reverse-text.yaml -``` - -- Model: Qwen3-0.6B -- GPUs: 1 per component -- Runs on consumer GPUs (RTX 3090/4090) -- **Note:** You can use any release name - the chart automatically configures service URLs - -## Configuration - -### Storage Configuration - -By default, the chart creates a 1TB PVC with NFS storage. 
To customize: - -```yaml -# custom-values.yaml -storage: - storageClassName: my-storage-class - size: 500Gi -``` - -Deploy with custom storage: - -```bash -helm install my-release ./k8s/prime-rl -f custom-values.yaml -``` - -### GPU Configuration - -Adjust GPU count per component: - -```yaml -# custom-gpu.yaml -inference: - gpu: - count: 4 # Use 4 GPUs for inference - -trainer: - gpu: - count: 2 # Use 2 GPUs for training -``` - -### Resource Limits - -Customize memory and CPU: - -```yaml -# custom-resources.yaml -trainer: - resources: - requests: - memory: "64Gi" - cpu: "16" - limits: - memory: "128Gi" - cpu: "32" -``` - -### Secrets (Optional) - -For W&B and HuggingFace authentication: - -```bash -# Create secret -kubectl create secret generic prime-rl-secrets \ - --from-literal=wandb-api-key=YOUR_WANDB_KEY \ - --from-literal=hf-token=YOUR_HF_TOKEN - -# Enable in values -helm install my-release ./k8s/prime-rl \ - --set config.secrets.enabled=true \ - --set config.secrets.name=prime-rl-secrets -``` - -## Common Operations - -### Deploy a new experiment - -```bash -# With example config -helm install my-exp ./k8s/prime-rl -f ./k8s/prime-rl/examples/reverse-text.yaml - -# With custom settings -helm install my-exp ./k8s/prime-rl --set trainer.replicas=10 --set inference.replicas=5 -``` - -### Exec into pods - -```bash -# Exec into trainer-0 -kubectl exec -it my-exp-trainer-0 -- bash - -# Exec into specific trainer pod -kubectl exec -it my-exp-trainer-3 -- bash - -# Exec into inference -kubectl exec -it my-exp-inference-0 -- bash -``` - -### View logs - -```bash -# Get logs from trainer-0 -kubectl logs my-exp-trainer-0 - -# Follow logs in real-time -kubectl logs -f my-exp-trainer-2 - -# Get logs from all trainers -kubectl logs -l app.kubernetes.io/instance=my-exp,role=trainer -``` - -### List all pods - -```bash -# List pods for specific experiment -kubectl get pods -l app.kubernetes.io/instance=my-exp - -# List all prime-rl pods -kubectl get pods -l app=prime-rl -``` 
- -## Architecture - -### Components - -The chart deploys three main components (all using StatefulSets): - -1. **Orchestrator** (StatefulSet) - Coordinates training workflow - - Always 1 replica: `prime-rl-orchestrator-0` - - No GPU required - - Communicates with trainer and inference - -2. **Inference** (StatefulSet) - Runs vLLM inference server - - Scalable replicas with stable pod names: `prime-rl-inference-0`, `prime-rl-inference-1`, ... - - Each pod gets predictable DNS: `prime-rl-inference-0.prime-rl-inference-headless.default.svc.cluster.local` - - Requires GPU(s) - - Serves model predictions - -3. **Trainer** (StatefulSet) - Runs SFT or RL training - - Scalable replicas with stable pod names: `prime-rl-trainer-0`, `prime-rl-trainer-1`, ... - - Each pod gets predictable DNS: `prime-rl-trainer-0.prime-rl-trainer-headless.default.svc.cluster.local` - - Requires GPU(s) - - Updates model weights on shared storage - -**Why StatefulSets for all components?** - -- **Consistent naming**: All pods have predictable names (`orchestrator-0`, `trainer-0`, `trainer-1`, ...) -- **Stable networking**: Each pod gets its own DNS hostname via headless service -- **Required for distributed training**: PyTorch/vLLM need to discover peers by stable hostname -- **Clean naming**: No random pod suffixes, easier to identify and debug - -### Shared Storage - -All components mount the same PVC at `/data` for: - -- Model checkpoint sharing -- Training data -- Experiment outputs - -This is **required** for coordinating weight updates between trainer and inference. 
- -## Environment Variables - -Each pod has these K8s environment variables set: - -- `$POD_NAME` - Full pod name (e.g., `my-exp-trainer-3`) -- `$POD_IP` - Pod IP address -- `$STATEFUL_REPLICAS` - Total number of replicas for that component -- `$HEADLESS_SERVICE` - DNS name for peer discovery (e.g., `my-exp-trainer-headless.default.svc.cluster.local`) -- `$INFERENCE_URL` - Full URL to the first inference pod (available in orchestrator and trainer pods) - -For distributed training, extract the rank from the pod name: - -```bash -# Extract ordinal from pod name -RANK=$(echo $POD_NAME | grep -o '[0-9]*$') # e.g., "my-exp-trainer-3" -> "3" - -# Use in torchrun -torchrun \ - --nnodes=$STATEFUL_REPLICAS \ - --node-rank=$RANK \ - --nproc-per-node=8 \ - --rdzv-endpoint=my-exp-trainer-0.$HEADLESS_SERVICE:29501 \ - src/prime_rl/trainer/sft/train.py @ configs/train.toml -``` - -## Troubleshooting - -### Can't access shared storage - -Verify PVC is bound: - -```bash -kubectl get pvc prime-rl-shared-data -# STATUS should be "Bound" -``` - -Check mount inside pod: - -```bash -kubectl exec -it prime-rl-trainer-xxx -- df -h /data -``` - -### Pod stuck in Pending - -Check if GPU resources are available: - -```bash -kubectl describe pod my-exp-trainer-0 -``` - -Look for events like `Insufficient nvidia.com/gpu`. - -### Inference server not responding - -Check if the inference pod is ready: - -```bash -kubectl get pods -l role=inference -kubectl logs my-exp-inference-0 -``` - -## Uninstalling - -```bash -# Remove the Helm release -helm uninstall my-exp - -# Delete PVC (data will be lost!) -kubectl delete pvc prime-rl-shared-data -``` diff --git a/docs/logging.md b/docs/logging.md deleted file mode 100644 index cbd7e881f2..0000000000 --- a/docs/logging.md +++ /dev/null @@ -1,86 +0,0 @@ -# Logging - -prime-rl uses [loguru](https://loguru.readthedocs.io/en/stable/) for logging with a global logger pattern. 
All logs are captured at the deployment level (stdout/stderr redirection for local, `tee` for SLURM) under `{output_dir}/logs/`. For RL training, we recommend streaming logs into tmux panes (as set up by `tmux.sh`). - -## Logger Architecture - -### `setup_logger` and `get_logger` - -We use a **singleton pattern** with a module-level global logger instance (`_LOGGER`). - -```python -from prime_rl.utils.logger import setup_logger, get_logger - -# At entrypoint - call ONCE -logger = setup_logger("info") - -# Anywhere else in codebase -logger = get_logger() -logger.info("Hello world") -``` - -**How it works:** - -1. **`get_logger()`** - Returns the global logger instance. Always works — if `setup_logger` hasn't been called yet, it initializes a default logger automatically. Safe to call from any module at any time. - -2. **`setup_logger(log_level)`** - Configures (or reconfigures) the global logger: - - Creates an isolated loguru `Logger` instance (not the default `loguru.logger`) to prevent third-party code from hijacking our logs - - Adds a stdout handler with colorized output (or JSON output if `json_logging=True`) - - Can be called multiple times — cleans up the previous logger before creating a new one - -3. **`reset_logger()`** - Resets the global logger to `None`: - - Used in subprocesses that inherit parent state (e.g., env workers) - - Used in tests between test cases - -## Log File Structure - -Logs are captured at the deployment level — the entrypoint redirects subprocess stdout/stderr to files (local) or `tee` captures them (SLURM). The structure is consistent across deployment types: `logs/trainer.log` and `logs/inference.log` always exist, regardless of whether the run is local or multi-node SLURM. 
- -### Local (single node) - -``` -{output_dir}/logs/ -├── trainer.log # trainer stdout (rank 0 only) -├── orchestrator.log # orchestrator stdout -├── inference.log # vLLM inference server stdout -├── trainer/ -│ └── torchrun/ # per-rank stdout/stderr (all ranks) -└── envs/ - ├── train/{env_name}/ - │ ├── env_server.log - │ └── env_worker_{id}.log - └── eval/{env_name}/ - └── ... -``` - -### SLURM multi-node - -``` -{output_dir}/logs/ -├── trainer.log -> trainer/node_0.log (symlink) -├── inference.log -> inference/node_0.log (symlink) -├── orchestrator.log # orchestrator stdout -├── trainer/ -│ ├── node_0.log # per-node trainer output (rank 0 only) -│ ├── node_1.log -│ └── torchrun/ # per-rank stdout/stderr (all ranks) -├── inference/ -│ ├── node_0.log # per-node inference output -│ ├── node_1.log -│ └── router_0.log # vllm-router per replica -└── envs/ - └── ... -``` - -Environment logs live under `logs/envs/train/{env_name}/` and `logs/envs/eval/{env_name}/`. Env log verbosity is controlled by `orchestrator.log.vf_level`. - -Only rank 0 output is shown in `trainer.log`. Per-rank logs from all ranks are available under `logs/trainer/torchrun/{rdzv_id}/attempt_0/{rank}/{stdout,stderr}.log`, written by torchrun's `--log-dir`. 
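As a small sketch of how the per-rank layout above can be used, the following Python helper collects one log file per rank. The function name `rank_logs` is hypothetical and not part of PRIME-RL; it assumes only the `logs/trainer/torchrun/{rdzv_id}/attempt_0/{rank}/{stdout,stderr}.log` layout described above.

```python
from pathlib import Path


def rank_logs(output_dir: str, stream: str = "stderr") -> dict[int, Path]:
    """Map rank -> log file under {output_dir}/logs/trainer/torchrun."""
    root = Path(output_dir) / "logs" / "trainer" / "torchrun"
    found: dict[int, Path] = {}
    # Layout: {rdzv_id}/attempt_0/{rank}/{stdout,stderr}.log
    # If several rendezvous IDs exist, the lexicographically last one wins.
    for path in sorted(root.glob(f"*/attempt_0/*/{stream}.log")):
        rank_dir = path.parent.name
        if rank_dir.isdigit():
            found[int(rank_dir)] = path
    # Return ranks in ascending order for predictable iteration.
    return dict(sorted(found.items()))
```

Calling `rank_logs("outputs")` and iterating over the result is one way to scan every rank's stderr when `trainer.log`, which only shows rank 0, does not explain a failure.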
- -## tmux helper (`scripts/tmux.sh`) - -`scripts/tmux.sh` sets up a tmux session for RL runs with **four panes**: - -- **Trainer**: follows `{output_dir}/logs/trainer.log` -- **Orchestrator**: follows `{output_dir}/logs/orchestrator.log` -- **Envs**: follows `{output_dir}/logs/envs/*/*/*.log` -- **Inference**: follows `{output_dir}/logs/inference.log` diff --git a/docs/memory_usage.md b/docs/memory_usage.md deleted file mode 100644 index b36c117254..0000000000 --- a/docs/memory_usage.md +++ /dev/null @@ -1,132 +0,0 @@ -# Reducing memory usage - -While most of our parallelism techniques in prime-rl are designed to scale training up (FSDP, EP, CP, ...), we also provide many tools to scale training down that allow training large MoE models on a limited amount of GPUs. - -These techniques target the trainer part of prime-rl. - - -## TLDR: config to use for maximum memory usage reduction with correct throughput - -```toml -[trainer.model] -impl = "custom" -attn = "flash_attention_2" -fused_lm_head_token_chunk_size = 1024 -ep = 8 -cp = 2 -optim_cpu_offload = true - -[trainer.model.compile] - -[trainer.model.ac] -freq = 1 - -[trainer.model.ac_offloading] -max_inflight_activations = 1 -``` - -## Activation checkpointing - -Activation checkpointing discards intermediate activations during the forward pass and recomputes them during the backward pass, trading compute for memory. - -To enable it, use: - -```toml -[trainer.model.ac] -freq = 1 -``` - -`freq` controls how often layers are checkpointed: every `freq` layers. Lower values yield lower memory usage (e.g. `freq = 1` checkpoints every layer). - -## Activation offloading - -Activation offloading offloads the activations to CPU to reduce the memory usage of the trainer. It can be used in combination with activation checkpointing. 
- -To enable it, use: - -```toml -[trainer.model.ac] -freq = 1 - -[trainer.model.ac_offloading] -max_inflight_activations = 5 -``` - -## Chunk loss - -Chunk loss splits the loss computation into smaller chunks to reduce the memory usage of the trainer. - -To enable it, use: - -```toml -[trainer.model] -fused_lm_head_token_chunk_size = auto -``` - - -## Expert parallelism - -While expert parallelism splits the weights of the experts across all GPUs like FSDP, using EP still reduces memory usage by reducing the communication size and therefore the FSDP buffer. - -EP is only available for models with MoE layers using the custom model implementation. - -``` -[trainer.model] -impl = "custom" -ep = 8 -``` - -## Context parallelism - -Context parallelism splits the context into smaller chunks to reduce the memory usage of the activations. We don't advise using CP across multiple nodes (i.e., increasing CP beyond 8). - -CP is only available for certain models and only with the custom model implementation. - -``` -[trainer.model] -impl = "custom" -cp = 2 -``` - -We recommend CP 2 or CP 4 for most 128K sequence length training runs. Can be pushed to 8. - - -## torch compile - -Enabling torch.compile can reduce the memory usage for certain model architectures, especially MoE with the custom model implementation. - -``` -[trainer.model.compile] -``` - -## CPU Optimizer offloading - -Offloading the optimizer states to CPU can reduce the memory usage of the trainer significantly, especially at low GPU counts where the optimizer states take a lot of memory as they won't be sharded enough. - -In RL, in contrast with pretraining, we end up with many gradient accumulation steps, so the cost of offloading the optimizer states is not as high as in pretraining, and indeed barely noticeable. 
- -``` -[trainer.optim] -optim_cpu_offload = true -``` - -## :warning: FSDP CPU offloading - -FSDP CPU offloading offloads the parameters, gradients, and optimizer states to CPU to reduce the memory usage of the trainer. - -This will make training significantly slower and is not recommended most of the time. - -``` -[trainer.model] -fsdp_cpu_offload = true -``` - -## :warning: Lora training - -LoRA training significantly reduces the memory usage of the trainer at the cost of smaller gradient updates. - -``` -[trainer.model.lora] -rank = 8 -``` - diff --git a/docs/metrics.md b/docs/metrics.md deleted file mode 100644 index bf6a785a1a..0000000000 --- a/docs/metrics.md +++ /dev/null @@ -1,55 +0,0 @@ -# Metrics - -## W&B - -For most runs we recommend logging metrics to [W&B](https://wandb.ai). Before enabling W&B, make sure that you have an account and are logged in. - -```bash -uv run wandb login -# Or set `export WANDB_API_KEY=...` -``` - -### SFT - -Logging to W&B is disabled by default. Enable the default configuration with `--wandb` - -```bash -uv run sft ... --wandb -``` - -This will log to the `prime-rl` project with a random run name. You can specify which project and name to log to - -```bash -uv run sft ... --wandb.project my-project --wandb.name my-run -``` - -The same settings also work for multi-node training with `torchrun`. Note, that we only log global metrics from the master rank (e.g. the all-reduced loss) - -```bash -uv run torchrun --nproc-per-node 8 ... --wandb -``` - -### RL - -For RL training, both the trainer and orchestrator log to W&B as separate runs. Again, logging to W&B is disabled by default. Enable the default configuration with `--wandb` - -```bash -uv run rl ... --wandb -``` - -This will log to the `prime-rl` project with a random run name. The trainer run is suffixed with `-trainer` and the orchestrator run is suffixed with `-orchestrator`. You can specify which project and name to log to using the same flags as for SFT. 
- -```bash -uv run rl ... --wandb.project my-project --wandb.name my-run -``` - -For the RL trainer, we support logging samples (e.g. prompt, completion, reward, advantage for selected rollouts) and distributions (e.g. reward, advantage, entropy distributions) as W&B tables using the `wandb.log-extras` subconfig. If W&B is setup, this is enabled by default and will log for the RL trainer and orchestrator every 10 steps. - -You can configure this on the trainer and orchestrator separately. For example, to only log samples on the orchestrator every 50 steps, but not distribution on either - -```bash -uv run rl ... \ - --no-trainer.wandb.log-extras.distributions \ - --orchestrator.wandb.log-extras.interval 50 -``` - diff --git a/docs/mint.json b/docs/mint.json index 25437b7a31..3fed8a01df 100644 --- a/docs/mint.json +++ b/docs/mint.json @@ -4,20 +4,14 @@ { "group": "PRIME-RL", "pages": [ - "index", - "entrypoints", - "configs", - "training_modes", - "environments", - "async", - "logging", - "multi_run_manager", - "checkpointing", - "benchmarking", - "deployment", - "kubernetes", - "testing-moe-at-small-scale", - "troubleshooting" + "overview", + "configuration", + "training", + "scaling", + "algorithms", + "advanced", + "faqs", + "reference" ] } ] diff --git a/docs/multi_run_manager.md b/docs/multi_run_manager.md deleted file mode 100644 index bef6a6f566..0000000000 --- a/docs/multi_run_manager.md +++ /dev/null @@ -1,244 +0,0 @@ -# MultiRunManager - -The `MultiRunManager` object is a global singleton that manages the parameters and components for multiple concurrent training runs within a single trainer process. -This allows multiple orchestrator deployments to share the same trainer. - -When `max_concurrent_runs > 1`, the trainer can train multiple runs in parallel. 
Each run: - -- Has its own LoRA adapter parameters -- Has its own optimizer and scheduler -- Saves its own checkpoints -- Tracks its own training progress (step, tokens, samples) -- Loads its own orchestrator configuration - -The `MultiRunManager` object provides: - -- **Bidirectional mapping** between run IDs (e.g., `run_abc123`) and run indices (0, 1, 2, ...) -- **Progress tracking** per run (step count, total tokens, total samples) -- **Configuration management** for orchestrator configs -- **Distributed synchronization** across ranks via the PyTorch distributed store -- **LoRA module registration** for multi-adapter parameter management -- **Creation hooks** for initializing per-run resources (optimizers, schedulers) -- **Run eviction** for removing runs that are misbehaving - -## **Initialization and run discovery** - -The `MultiRunManager` singleton is set up at the start of training: - -```python -from prime_rl.trainer.runs import setup_multi_run_manager, get_multi_run_manager - -# Initialize with output directory and max concurrent runs -setup_multi_run_manager(output_dir=Path("outputs/my-experiment"), max_runs=4) - -# Get the singleton instance anywhere in the codebase -multi_run_manager = get_multi_run_manager() -``` - -Each run's directory follows this structure: - -``` -{output_dir}/ -├── run_abc123/ -│ ├── control/ -│ │ ├── orch.toml # Orchestrator configuration -│ │ ├── config_validation_error.txt # Config validation errors (if any) -│ │ └── evicted.txt # Eviction reason (if evicted) -│ ├── checkpoints/ -│ │ └── step_100/ # Orchestrator checkpoints -│ ├── rollouts/ -│ │ └── step_100/ # Rollouts -│ └── broadcast/ -│ └── step_100/ # Broadcasted weights for inference -├── run_def456/ -│ └── ... -└── ... - -``` - -Runs are discovered by scanning the output directory for the pattern `run_*`. Each run must contain a valid orchestrator config at `{run_dir}/control/orch.toml` before they are added to the active runs otherwise they are ignored. 
When the maximum number of runs is reached, new `run_*` directories will not be picked up until old ones are deleted. - -```python -# Master rank scans for new/deleted runs -multi_run_manager.discover_runs() - -# All ranks synchronize state (must be called after discover_runs) -multi_run_manager.synchronize_state() -``` - -The `discover_runs()` method (master only): - -1. Scans the output directory for `run_*` directories -2. Filters out evicted runs (those with `control/evicted.txt`) -3. Detects new runs and deleted runs -4. Calls `forgotten_hook` for deleted runs (master only) -5. Loads and validates the orchestrator config for each new run -6. Updates internal mappings and data structures -7. Calls `discovered_hook` for new runs (master only) - -The `synchronize_state()` method (all ranks): - -1. Master broadcasts run state to all ranks via the distributed store -2. Non-master ranks catch up by calling internal `_delete_run_data` / `_create_run_data` -3. All ranks execute `deletion_hook` for deleted runs -4. All ranks execute `creation_hook` for new runs (e.g., optimizer setup, LoRA parameter reset) - -## Run Eviction - -The master proc on the trainer can evict a run using the `evict_run(idx: int, reason: str)` method. -This is useful when the trainer detects an issue with a run that requires it to be stopped (e.g., invalid data, resource constraints, or policy violations). - -```python -# Evict a run by its index (master only) -multi_run_manager.evict_run(idx=0, reason="Run exceeded memory limits") -``` - -The `evict_run()` method (master only): - -1. Writes the eviction reason to `{run_dir}/control/evicted.txt` -2. Logs a warning with the eviction details -3. 
The run is **not** immediately removed from the manager's data structures - -The eviction takes effect through two mechanisms: - -**On the trainer side:** -- The next `discover_runs()` call will filter out the evicted run (it checks for `evicted.txt`) -- The run will then be treated as deleted, triggering forgotten/deletion hooks -- The run index is returned to the unused pool - -**On the orchestrator side:** -- The orchestrator checks for `evicted.txt` at the start of each iteration in its main loop -- If found, it raises a `RuntimeError` with the eviction reason, causing the orchestrator to exit -- This surfaces the eviction reason to the user -- The orchestrator also self-evicts by writing `evicted.txt` if a training batch has no learning signal (all rollouts filtered out) on `MAX_EMPTY_BATCH_ATTEMPTS` (3) consecutive attempts - -## LoRA Module Registration - -LoRA modules register themselves with `MultiRunManager` for parameter management: - -```python -# In apply_lora_to_model() -lora_module = MultiLoRALinear( - base_layer=base_module, - rank=config.rank, - n_adapters=get_multi_run_manager().max_runs, - ... -) -lora_module.register_with_runs(get_multi_run_manager(), module_name) - -``` - -The `MultiRunManager` object then exposes: - -```python -# Get parameters for a specific run (used by optimizer creation) -multi_run_manager.get_named_parameters_for_run(idx) - -# Get state dict for a specific run (used by weight broadcast) -multi_run_manager.get_state_dict_for_run(idx) - -# Reset parameters for a new run -multi_run_manager.reset_run_parameters(idx) - -``` - -## Hooks - -The `MultiRunManager` object supports several types of hooks for different lifecycle events. -Deletion hooks are always called before creation hooks. 
- -```mermaid -flowchart TD - subgraph master["Rank 0 (Master)"] - discover["discover_runs()"] - forgotten["forgotten_hooks"] - validation["config_validation_hooks"] - discovered["discovered_hooks"] - - discover --> forgotten - forgotten --> validation - validation --> discovered - discovered --> discover - end - - subgraph rank1["Rank 1"] - wait1["waiting..."] - end - - subgraph rankN["Rank N"] - waitN["waiting..."] - end - - discovered --> barrier - wait1 --> barrier - waitN --> barrier - - barrier[["synchronize_state()"]] - - barrier --> deletion["deletion_hooks"] - deletion --> creation["creation_hooks"] - - style barrier fill:#fff9c4 -``` - -### Hook Registration - -```python -# These hooks are only called on the master as only master uses `discover_runs()` -# These hooks are thus only relevant to master only components (packer) -multi_run_manager.register_discovered_hook(callback) -multi_run_manager.register_forgotten_hook(callback) - -# These hooks are executed by all ranks in the order they were added during `synchronize_state()` -# This ensures DTensor creations and other distributed operations happen together -# Calling torch.dist.barrier() in a hook here should work -multi_run_manager.register_creation_hook(callback) -multi_run_manager.register_deletion_hook(callback) - -# These hooks validate the orchestrator config when runs are discovered: -multi_run_manager.register_config_validation_hook(callback) -``` - -The callback signatures: - -```python -def discovered_callback(idx: int, run_id: str, config: OrchestratorConfig) -> None: - """Called when a new run is discovered (master only). 
- - Args: - idx: The run's index (0 to max_runs-1) - run_id: The run's ID (e.g., "run_abc123") - config: The orchestrator config for the run - """ - # Example: Set the scaling factor for the run - multi_run_manager.scaling_factors[idx] = config.model.lora.alpha / config.model.lora.rank - -def forgotten_callback(idx: int, run_id: str) -> None: - """Called when a run is forgotten/removed (master only). - - Args: - idx: The run's index (0 to max_runs-1) - run_id: The run's ID (e.g., "run_abc123") - """ - pass - -def callback(idx: int, run_id: str) -> None: - """Called when a run is created/deleted. - - Args: - idx: The run's index (0 to max_runs-1) - run_id: The run's ID (e.g., "run_abc123") - """ - pass - -def config_validation_callback(config: OrchestratorConfig) -> tuple[bool, str]: - """Validate an orchestrator config. - - Args: - config: The orchestrator config to validate - - Returns: - (is_valid, error_message): If invalid, error_message is written to config dir - """ - return True, "" -``` diff --git a/docs/multimodal.md b/docs/multimodal.md deleted file mode 100644 index 092869d922..0000000000 --- a/docs/multimodal.md +++ /dev/null @@ -1,60 +0,0 @@ -# Multimodal (VLM) Support - -Prime-RL supports training vision-language models (VLMs) like Qwen3-VL. - -## VLM Configuration - -### Supported Models - -The built-in registry supports these model families out of the box: - -| Model Family | model_type | Vision Encoder | Language Model | -|-------------|------------|---------------|----------------| -| Qwen3-VL | `qwen3_vl` | `model.visual` | `model.language_model` | -| Qwen3.5 | `qwen3_5` | `model.visual` | `model.language_model` | -| Qwen3.5-MoE | `qwen3_5_moe` | `model.visual` | `model.language_model` | - -Enable VLM mode by adding a `[model.vlm]` section. 
Both fields are required — they tell prime-rl where the vision encoder and language model live on the model object: - -```toml -[model] -name = "Qwen/Qwen3-VL-4B-Instruct" - -[model.vlm] -vision_encoder_attr = "model.visual" -language_model_attr = "model.language_model" -``` - -For the registered models in the table above, use the attrs shown there. For custom VLMs, check your model's structure with `model.named_children()`. - -Both fields are dotted attribute paths resolved on the loaded model. A bad path raises a `ValueError` immediately — there are no silent fallbacks. - -The weight key prefix for NCCL broadcasting is derived automatically as `{language_model_attr}.layers.`. - -To add permanent support for a new model family, add an entry to `VLM_REGISTRY` in `src/prime_rl/utils/vlm.py`. - -## Current Limitations - -- **Vision encoder is frozen by default**: The vision encoder is frozen during training by default. Set `freeze_vision_encoder = false` in `[model.vlm]` to make it trainable. When unfrozen, the vision encoder is FSDP-sharded per-block for proper gradient flow. Note: this has no effect when using LoRA. - -- **No multimodal-safe truncation**: Token sequences are truncated to `seq_len`, but `pixel_values` and `image_grid_thw` are passed through unchanged. If a multimodal sample exceeds `seq_len`, image tokens can be dropped while image tensors still describe the full set of images. Ensure `seq_len` covers your longest VLM samples. - -- **Optimization dtype must be bfloat16**: Set `optimization_dtype = "bfloat16"` and `reduce_dtype = "bfloat16"` in your trainer config. - -- **Higher KL mismatch with multi-image inputs**: VLM training exhibits higher KL mismatch compared to text-only, especially with multiple images. - -- **Images are not logged**: The images the VLM sees during training are not logged to monitors. 
- -## How Multi-Turn VLM RL Training Works - -VLM rollouts go through the renderer-backed TITO client (`orchestrator.use_renderer = true`, the default and required for VLMs). The renderer owns the HuggingFace processor per-slot and emits multimodal tensors alongside tokens. - -1. **Render**: For each trajectory step, the renderer tokenizes messages and emits per-image multimodal tensors (e.g. `pixel_values`, `image_grid_thw` for Qwen3-VL) as `multi_modal_data`. -2. **Pack**: `interleave_rollout` concatenates the per-image tensors emitted across a sample's merged step range into a single `mm_kwargs` dict on the `TrainingSample`. Per-token `mm_token_type_ids` (0=text, 1=image, 2=video) come from `renderer.mm_token_type_id_map`. -3. **Forward**: The trainer `**`-unpacks `mm_kwargs` into the model's `forward` signature, so any VLM whose HF processor and forward agree on kwarg names works without touching the transport. - -Each multimodal sample becomes its own micro-batch during training (no packing) since image tensor sizes vary. - -## vLLM Configuration - -`VLLM_WORKER_MULTIPROC_METHOD=spawn` is required for VLM inference. This is set automatically when using `uv run rl @ ...`, but if you start the vLLM server yourself, make sure this environment variable is set. diff --git a/docs/overview.md b/docs/overview.md new file mode 100644 index 0000000000..8ae0fd8531 --- /dev/null +++ b/docs/overview.md @@ -0,0 +1,74 @@ +# Overview + +`prime-rl` is a framework for large-scale, asynchronous reinforcement learning of large language models. It is designed to be easy to use and hackable, yet capable of scaling to 1000+ GPU clusters. Models are trained with PyTorch FSDP2 (with optional expert and context parallelism), rollouts are generated with vLLM, and the two halves talk to each other through a thin orchestrator process that owns dataset sampling, advantage computation, and weight broadcasting. 
+ +Use `prime-rl` when you want to: + +- Train an open-weights LLM with RL on one of the [Environments Hub](https://app.primeintellect.ai/dashboard/environments?ex_sort=most_stars) tasks, or your own [verifiers](https://github.com/PrimeIntellect-ai/verifiers) environment. +- Post-train with SFT, then continue with RL, using the same model loader, checkpoint format, and chat template plumbing for both phases. +- Scale across multiple nodes — SLURM, Kubernetes, or hand-launched — without rewriting your config. +- Run agentic multi-turn rollouts (tool use, browser environments, long horizons) without re-tokenizing across turns. + +## Architecture + +A `prime-rl` RL run is three cooperating processes: + +![Architecture](assets/architecture.png) + +- **Inference** — A vLLM server (or fleet) that holds the current policy weights and serves OpenAI-compatible completions. Updated in place via a custom `update_weights` endpoint after each trainer step. +- **Orchestrator** — A lightweight CPU process that samples prompts, drives `verifiers` environments to generate rollouts against the inference server, packs them into training batches, ships them to the trainer, and relays new weights back to inference. +- **Trainer** — A torchrun-launched FSDP2 process group that consumes packed rollouts, computes the loss, steps the optimizer, and writes the new policy to the weight broadcast transport. + +The three processes communicate through configurable transports — by default the trainer↔orchestrator rollout link uses the local filesystem, and weight broadcast uses the filesystem (or NCCL for synchronous setups). Swap to ZMQ for multi-host setups without shared storage. See [Scaling](scaling.md) for the deployment options. + +Training is **asynchronous by default**: inference is allowed to run ahead of training by up to `max_async_level` steps, which hides the weight-broadcast latency behind ongoing rollouts. 
The loss is an off-policy-aware variant of [AIPO](https://arxiv.org/abs/2505.24034); see [Algorithms](algorithms.md) for the details. + +## Installation + +```bash +curl -sSL https://raw.githubusercontent.com/PrimeIntellect-ai/prime-rl/main/scripts/install.sh | bash +``` + +The script clones the repo, initializes the `verifiers` / `renderers` / `research-environments` submodules, installs `uv`, and runs `uv sync --all-extras`. For manual setup, MoE-only installs (DeepGEMM / DeepEP / NIXL), or troubleshooting, see the [README](https://github.com/PrimeIntellect-ai/prime-rl#setup). + +You need at least one NVIDIA GPU (RTX 3090/4090/5090, A100, H100, H200, or B200). Single-GPU runs are supported for debugging; production RL is typically 1× inference node + 1+ trainer nodes. + +## Quick run + +Train `Qwen3-0.6B` on GSM8K with one trainer GPU and one inference GPU. This config ships in the repo: + +```bash +# 1. Install the verifiers environment from the Environments Hub. +prime env install primeintellect/math-env + +# 2. Set up a four-pane tmux session that tails each process's logs. +bash scripts/tmux.sh + +# 3. From the `Trainer` pane, launch all three processes co-located on this node. +uv run rl @ configs/gsm8k/rl.toml \ + --wandb.project your-project \ + --wandb.name gsm8k-smoke \ + --ckpt +``` + +The `rl` entrypoint reads `configs/gsm8k/rl.toml`, splits it into per-process sub-configs, picks GPU 0 for inference and GPU 1 for the trainer, launches all three processes, and tees their stdout into `outputs/logs/{trainer,orchestrator,inference}.log`. Watch the tmux panes — within a minute the trainer should log `step 1` and a reward sample. + +After 100 steps the run completes. Final HF-compatible weights land at `outputs/weights/step_100`. 
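+
+The flags passed on the command line above can presumably be kept in a TOML file instead. The following is a minimal sketch, not a file shipped in the repo: it assumes that a dotted flag such as `--wandb.project` maps onto the TOML table of the same name, and it takes the top-level field names (`output_dir`, `max_steps`, `max_async_level`) from the `rl` section of the [Reference](reference.md). The file name `overrides.toml` and every value are illustrative.
+
+```toml
+# overrides.toml (hypothetical file name, illustrative values only)
+
+# Top-level fields shared by trainer and orchestrator, per the `rl` reference table.
+output_dir = "outputs"   # documented default; should be unique per experiment
+max_steps = 100          # stop after 100 training steps
+max_async_level = 1      # inference may run at most one step ahead of training
+
+# Assumed TOML equivalent of `--wandb.project` / `--wandb.name`.
+[wandb]
+project = "your-project"
+name = "gsm8k-smoke"
+```
+
+See [Configuration](configuration.md) for how TOML files and CLI overrides combine and which one takes precedence.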
+ +For a CPU-only smoke check (no real training, no GPU), use the SFT fake-data config: + +```bash +uv run sft @ configs/debug/sft/train.toml +``` + +For multi-GPU, multi-node, SLURM, and Kubernetes layouts, see [Scaling](scaling.md). + +## Where to go next + +- **[Configuration](configuration.md)** — How TOML files, `@` composition, CLI overrides, and env vars combine; the precedence rules; worked examples. +- **[Training](training.md)** — End-to-end recipes for RL, SFT, and evals; checkpointing and resume; observability (logs, W&B, Prometheus, platform monitoring); rules of thumb and common issues. +- **[Scaling](scaling.md)** — Single-GPU through 1000+ GPU; FSDP / EP / CP knobs; SLURM and Kubernetes guides; disaggregated prefill/decode inference; benchmarking. +- **[Algorithms](algorithms.md)** — Async / off-policy semantics; the AIPO loss; built-in and custom losses, advantages, and filters; multi-turn trajectory merging. +- **[Advanced](advanced.md)** — MoE training (EP backends, custom impls); VLMs; LoRA and the multi-run manager; small-scale MoE testing; environments deep-dive. +- **[Reference](reference.md)** — Auto-generated field-by-field reference for every entrypoint config. +- **[FAQs](faqs.md)** — Quick answers to recurring questions. diff --git a/docs/platform-monitoring.md b/docs/platform-monitoring.md deleted file mode 100644 index 31bcfe312b..0000000000 --- a/docs/platform-monitoring.md +++ /dev/null @@ -1,48 +0,0 @@ -# Platform Monitoring - -Use `orchestrator.prime_monitor` to register a run on the Prime Intellect platform and stream training metrics, samples, and distributions. - -> **Internal-only for now:** external run registration is currently only enabled for internal / allowlisted teams. - -## Prerequisites - -You need a Prime API key with `rft:write` scope. - -Use the CLI: - -```bash -prime login -``` - -Or set an environment variable directly: - -```bash -export PRIME_API_KEY=pit_... 
-``` - -## Minimal config - -```toml -[orchestrator.prime_monitor] -run_name = "my-experiment" -``` - -You can also override from the CLI: - -```bash -uv run rl @ config.toml --orchestrator.prime_monitor.run_name "my-experiment" -``` - -## Troubleshooting - -### `API key not found` - -Set the env var from `api_key_var` or run: - -```bash -prime login -``` - -### `External training runs are not enabled for this team` - -Your team is not allowlisted yet. This feature is currently internal-only. diff --git a/docs/reference.md b/docs/reference.md new file mode 100644 index 0000000000..8d2d02b51e --- /dev/null +++ b/docs/reference.md @@ -0,0 +1,2703 @@ +# Reference + +This page documents every field accepted by every prime-rl entrypoint. It is +auto-generated from the Pydantic config models; do not edit by hand. + +To regenerate, run from the project root: + +```bash +uv run python scripts/generate_docs_reference.py +``` + +Each entrypoint section walks its config tree top-down. Nested sub-configs +appear under headings named after their dotted path (e.g. `trainer.model.ac`). +Discriminated unions (loss, advantage, scheduler, optimizer, …) document each +variant in turn — set the `type` field to pick one. + +For conceptual context behind these knobs, see +[Configuration](configuration.md), [Training](training.md), +[Scaling](scaling.md), [Algorithms](algorithms.md), and [Advanced](advanced.md). 
+ +## Table of Contents + +- [`rl` — Full RL training](#rl) + - [`trainer`](#rl-trainer) + - [`orchestrator`](#rl-orchestrator) + - [`inference`](#rl-inference) + - [`teacher_inference`](#rl-teacher-inference) + - [`log`](#rl-log) + - [`ckpt`](#rl-ckpt) + - [`wandb`](#rl-wandb) + - [`model`](#rl-model) + - [`tokenizer`](#rl-tokenizer) + - [`weight_broadcast`](#rl-weight-broadcast) + - [`slurm`](#rl-slurm) + - [`experimental`](#rl-experimental) + - [`deployment`](#rl-deployment) +- [`sft` — Supervised fine-tuning](#sft) + - [`model`](#sft-model) + - [`tokenizer`](#sft-tokenizer) + - [`renderer`](#sft-renderer) + - [`val`](#sft-val) + - [`ckpt`](#sft-ckpt) + - [`log`](#sft-log) + - [`wandb`](#sft-wandb) + - [`bench`](#sft-bench) + - [`gc`](#sft-gc) + - [`heartbeat`](#sft-heartbeat) + - [`slurm`](#sft-slurm) + - [`experimental`](#sft-experimental) + - [`data`](#sft-data) + - [`optim`](#sft-optim) + - [`scheduler`](#sft-scheduler) + - [`deployment`](#sft-deployment) +- [`trainer` — Standalone trainer](#trainer) + - [`model`](#trainer-model) + - [`tokenizer`](#trainer-tokenizer) + - [`data`](#trainer-data) + - [`ckpt`](#trainer-ckpt) + - [`log`](#trainer-log) + - [`wandb`](#trainer-wandb) + - [`bench`](#trainer-bench) + - [`gc`](#trainer-gc) + - [`heartbeat`](#trainer-heartbeat) + - [`metrics_server`](#trainer-metrics-server) + - [`experimental`](#trainer-experimental) + - [`loss`](#trainer-loss) + - [`optim`](#trainer-optim) + - [`scheduler`](#trainer-scheduler) + - [`weight_broadcast`](#trainer-weight-broadcast) + - [`rollout_transport`](#trainer-rollout-transport) +- [`orchestrator` — Standalone orchestrator](#orchestrator) + - [`student`](#orchestrator-student) + - [`teacher`](#orchestrator-teacher) + - [`train`](#orchestrator-train) + - [`tokenizer`](#orchestrator-tokenizer) + - [`renderer`](#orchestrator-renderer) + - [`optim`](#orchestrator-optim) + - [`eval`](#orchestrator-eval) + - [`buffer`](#orchestrator-buffer) + - [`log`](#orchestrator-log) + - 
[`wandb`](#orchestrator-wandb) + - [`prime_monitor`](#orchestrator-prime-monitor) + - [`ckpt`](#orchestrator-ckpt) + - [`heartbeat`](#orchestrator-heartbeat) + - [`experimental`](#orchestrator-experimental) + - [`weight_broadcast`](#orchestrator-weight-broadcast) + - [`rollout_transport`](#orchestrator-rollout-transport) +- [`inference` — Standalone vLLM server](#inference) + - [`server`](#inference-server) + - [`model`](#inference-model) + - [`parallel`](#inference-parallel) + - [`weight_broadcast`](#inference-weight-broadcast) + - [`kv_cache_offload`](#inference-kv-cache-offload) + - [`slurm`](#inference-slurm) + - [`experimental`](#inference-experimental) + - [`deployment`](#inference-deployment) + +--- + + +## `rl` — Full RL training + +The `rl` entrypoint composes a trainer, orchestrator, and (optionally) inference server into a single co-located deployment. Sub-configs under `[trainer]`, `[orchestrator]`, and `[inference]` mirror the standalone entrypoints below, with shared knobs (model name, output dir, W&B run name, …) lifted to the top level so they only need to be set once. + +_Defined in_ `prime_rl.configs.rl.RLConfig`. + +| Field | Type | Default | Description | +|---|---|---|---| +| `output_dir` | Path | `'outputs'` | Output directory. Should be unique per experiment. | +| `clean_output_dir` | bool | `False` | Delete the output directory before starting training. Required to overwrite an output directory that contains checkpoints from a previous run when not resuming. | +| `max_steps` | int \| None | `None` | Shared maximum training steps. If None, falls back to the sub-config ``max_steps``. | +| `seq_len` | int \| None | `None` | Shared sequence length. Propagates to ``trainer.model.seq_len`` and ``orchestrator.seq_len`` only when those values were not explicitly set; explicit per-component values always win. | +| `max_async_level` | int \| None | `None` | Shared async level. If None, falls back to the sub-config ``max_async_level``. 
| +| `bench` | bool | `False` | Benchmark mode. Sets trainer and orchestrator to benchmark mode and, when set, suffixes the W&B project with ``-bench``. | +| `dry_run` | bool | `False` | Only validate and dump resolved configs, then exit early. | + + +### `trainer` + +| Field | Type | Default | Description | +|---|---|---|---| +| `trainer.output_dir` | Path | `'outputs'` | Directory to write outputs to — checkpoints, weights, rollouts, and logs are written as subdirectories. Should be a persistent directory with enough disk space and unique per experiment running on a single node. | +| `trainer.matmul_precision` | 'highest' \| 'high' \| 'medium' | `'high'` | Precision for float32 matrix multiplications. ``highest`` is full FP32 (required on ROCm/AMD GPUs to avoid catastrophic precision loss in softmax over large vocabularies). ``high`` enables TF32 on NVIDIA GPUs for a speedup with minor precision tradeoff. See ``torch.set_float32_matmul_precision``. | +| `trainer.max_steps` | int \| None | `None` | Maximum number of training steps. If None, runs indefinitely. | +| `trainer.max_async_level` | int | `1` | _≥0._ Maximum steps inference can be ahead of training (how off-policy inference can be). Higher values yield better throughput via async execution at the cost of policy lag; ``0`` is fully synchronous. | +| `trainer.enable_router_replay` | bool | `False` | Return routed experts in the batch so the trainer can replay routing. Requires ``enable_return_routed_experts=true`` on the vLLM server (or ``--enable-return-routed-experts``) and is only supported for custom models. | +| `trainer.memory_profiler_path` | Path \| None | `None` | Path to write the memory profile to. | +| `trainer.trace_path` | Path \| None | `None` | Path to write the PyTorch profiler trace to. | +| `trainer.dist_timeout_seconds` | int | `600` | Timeout in seconds for torch distributed ops. | +| `trainer.max_concurrent_runs` | int | `1` | _≥1._ Maximum number of concurrent runs to allow. 
If 1, only one run may run at a time. | + + +#### `trainer.model` + +| Field | Type | Default | Description | +|---|---|---|---| +| `trainer.model.name` | str | `'Qwen/Qwen3-0.6B'` | HF model name or local path. | +| `trainer.model.trust_remote_code` | bool | `False` | Trust remote code when initializing the tokenizer. | +| `trainer.model.seq_len` | int | `2048` | Sequence length the model is trained on. | +| `trainer.model.attn` | 'eager' \| 'sdpa' \| 'flash_attention_2' \| 'flash_attention_3' \| 'fa4' | `'flash_attention_2'` | Attention implementation. With CP enabled, ring attention uses the matching kernel family (FA2/FA3/FA4). | +| `trainer.model.fsdp_cpu_offload` | bool | `False` | Enable FSDP CPU offloading for parameters, gradients, and optimizer states. Uses pinned memory for efficient CPU↔GPU transfers. | +| `trainer.model.optim_cpu_offload` | bool | `False` | Offload only optimizer states (momentum, variance) to CPU, keeping weights on GPU. Avoids the H2D all-gather overhead of FSDP CPU offload while still saving GPU memory. | +| `trainer.model.reshard_after_forward` | bool | `True` | Reshard the model after each forward pass. | +| `trainer.model.dp_replicate` | int | `1` | Data parallel dim where model weights are replicated. | +| `trainer.model.ep` | int | `1` | Expert parallelism degree for MoE layers. 1 disables EP. | +| `trainer.model.ep_comm_backend` | 'torch' \| 'deepep' | `'torch'` | Communication backend for expert parallelism. ``torch`` uses TorchTitan all-to-all collectives; ``deepep`` uses DeepEP custom kernels. | +| `trainer.model.deepep_num_sms` | int | `20` | _≥1._ SMs allocated for DeepEP intranode dispatch/combine kernels. Also determines internode RDMA channel count (``num_channels = num_sms / 2``). Lower values leave more SMs for compute; higher values speed up dispatch/combine. The optimal value depends on EP degree and hardware. Only used when ``ep_comm_backend='deepep'``. 
| +| `trainer.model.deepep_token_chunk_size` | int \| None | `None` | _≥1._ Token chunk size for DeepEP MoE pipelining. When set, DeepEP dispatch for chunk i+1 is launched while experts compute chunk i. Only used when ``ep_comm_backend='deepep'``. | +| `trainer.model.cp` | int | `1` | Context parallelism degree. 1 disables CP. | +| `trainer.model.cp_style` | 'ring' \| 'ulysses' | `'ring'` | CP communication style. ``ring`` uses ring-attention all-gather/reduce-scatter (requires custom kernels per attention type). ``ulysses`` uses all-to-all to redistribute Q/K/V from sequence-sharded to head-sharded, runs vanilla attention locally on the full sequence, then all-to-all back — works out-of-the-box with any attention kernel (softmax FA, linear attention, mamba, etc.). | +| `trainer.model.impl` | 'hf' \| 'custom' \| 'auto' | `'auto'` | Model implementation. ``auto`` selects ``custom`` if supported by the model, otherwise ``hf``. | +| `trainer.model.optimization_dtype` | 'bfloat16' \| 'float32' | `'float32'` | dtype for model optimization. | +| `trainer.model.reduce_dtype` | 'bfloat16' \| 'float32' | `'float32'` | dtype for gradient/parameter reductions. | +| `trainer.model.moe_use_grouped_mm` | bool | `True` | Use grouped mm for MoE layers. Requires compute capability ≥ 9.0. | +| `trainer.model.fp8` | bool | `False` | FP8 training via DeepGEMM. Replaces ``nn.Linear`` with FP8 blockwise linear and uses FP8 grouped GEMM for MoE experts. Requires SM90 (Hopper) GPUs and ``model.impl='custom'``. | +| `trainer.model.freeze_moe_router` | bool | `False` | Freeze MoE router parameters during training. | +| `trainer.model.fused_lm_head_token_chunk_size` | int \| 'auto' \| 'disabled' | `'disabled'` | Flattened token chunk size for the fused LM head. ``int >= 1`` sets the tokens per LM-head chunk explicitly; ``auto`` auto-enables (RL training picks 8192); ``disabled`` uses the vanilla LM head. Integer values aren't supported for SFT training. 
| + + +##### `trainer.model.vlm` + +VLM configuration. Setting this enables vision-language model support. + +| Field | Type | Default | Description | +|---|---|---|---| +| `trainer.model.vlm.vision_encoder_attr` | str | *required* | Dotted attribute path to the vision encoder module (e.g. ``model.visual``). | +| `trainer.model.vlm.language_model_attr` | str | *required* | Dotted attribute path to the language model module (e.g. ``model.language_model``). | +| `trainer.model.vlm.freeze_vision_encoder` | bool | `True` | Freeze the vision encoder. When False, it is trainable and FSDP-sharded per-block. No effect with LoRA (LoRA freezes all non-adapter parameters). | + + +##### `trainer.model.compile` + +Compile the model with ``torch.compile``. + +| Field | Type | Default | Description | +|---|---|---|---| +| `trainer.model.compile.fullgraph` | bool | `False` | Compile transformer blocks with ``fullgraph=True``. | + + +##### `trainer.model.ac` + +Activation checkpointing configuration. If None, activation checkpointing is disabled. + +| Field | Type | Default | Description | +|---|---|---|---| +| `trainer.model.ac.mode` | 'full' \| 'selective' | `'full'` | ``full`` checkpoints whole transformer blocks; ``selective`` checkpoints only the subcomponents listed in ``targets`` inside supported custom decoder layers. | +| `trainer.model.ac.freq` | int | `1` | _≥1._ Apply activation checkpointing to every N layers. | +| `trainer.model.ac.targets` | list[str] | `['norm']` | Selective checkpoint targets. ``norm`` checkpoints every norm module inside selected layers. ``attn_proj`` checkpoints projection-side attention work outside the kernel (input/output projections, attention-local norms, RoPE, gating, model-specific MLA projection helpers). ``mlp`` checkpoints the entire dense MLP forward (not for MoE). ``mla_up_proj`` checkpoints MLA Q/KV up-projection where supported. ``routed_experts`` checkpoints routed expert compute in MoE layers (including LatentMoE). 
``linear_attn`` checkpoints non-softmax token mixers (NemotronH Mamba, Qwen3.5-MoE GatedDeltaNet, AFMoE sliding-window attention). | + + +##### `trainer.model.ac_offloading` + +Activation offloading configuration. If None, activation offloading is disabled. + +| Field | Type | Default | Description | +|---|---|---|---| +| `trainer.model.ac_offloading.pin_memory` | bool | `True` | Pin offloaded activations to CPU memory. | +| `trainer.model.ac_offloading.max_inflight_activations` | int | `5` | _≥1._ Max activations kept in flight while offloading. More activations smooth overlap at the cost of GPU memory. | + + +##### `trainer.model.lora` + +LoRA configuration. If None, LoRA is disabled. + +| Field | Type | Default | Description | +|---|---|---|---| +| `trainer.model.lora.rank` | int | `16` | _≥1._ Rank of the low-rank decomposition matrices. | +| `trainer.model.lora.alpha` | float | `32.0` | _≥0._ LoRA scaling parameter. | +| `trainer.model.lora.dropout` | float | `0.0` | _≥0, ≤1._ LoRA dropout rate. | +| `trainer.model.lora.target_modules` | list[str] | `['q_proj', 'k_proj', 'v_proj', 'o_proj', 'gate_proj', 'up_proj', 'down_proj', 'experts', 'fc1_latent_proj', 'fc2_latent_proj']` | Module names or regex patterns to apply LoRA to. Simple names (e.g. ``q_proj``) match any component in the module path; regex patterns match anywhere in the name. Names unknown to the current model are silently ignored, so defaults cover multiple architectures. NemotronH note: ``experts`` matches NonGatedGroupedExperts inside LatentMoE; ``fc1_latent_proj``/``fc2_latent_proj`` adapt the latent up/down projections. Add ``in_proj``/``out_proj`` to also apply LoRA to the Mamba layers. | +| `trainer.model.lora.modules_to_save` | list[str] | `[]` | Module names or regex patterns to keep fully trainable (not frozen). Same matching rules as ``target_modules``. | + + +##### `trainer.model.debug` + +Debugging knobs for the model and distributed training.
+ +| Field | Type | Default | Description | +|---|---|---|---| +| `trainer.model.debug.num_layers` | int \| None | `None` | Override the number of transformer layers (truncates the model). | +| `trainer.model.debug.random_init` | bool | `False` | Randomly initialize the model instead of loading weights. | +| `trainer.model.debug.force_balanced_routing` | bool | `False` | Replace MoE token-choice routing with a round-robin assignment so every expert sees an equal share. Intended for fake-data smoke tests where untrained routing would otherwise OOM under severe imbalance. Gating scores are still gathered from the override indices so the forward pass stays consistent. | + + +#### `trainer.tokenizer` + +| Field | Type | Default | Description | +|---|---|---|---| +| `trainer.tokenizer.name` | str \| None | `None` | Tokenizer name or path. If None, the model's default tokenizer is used. | +| `trainer.tokenizer.trust_remote_code` | bool \| None | `None` | Trust remote code when initializing the tokenizer. If None, inherits the model's ``trust_remote_code`` setting. | +| `trainer.tokenizer.chat_template` | str \| None | `None` | Chat template for the tokenizer. Either a Jinja2 template string or a path to a template file. If None, the tokenizer's default chat template is used. | + + +#### `trainer.data` + + +##### `trainer.data.fake` + +Use a fake data loader sampling random micro-batches (for debugging). + +| Field | Type | Default | Description | +|---|---|---|---| +| `trainer.data.fake.batch_size` | int | `2` | _≥1._ Batch size of the fake data loader. | +| `trainer.data.fake.generate_samples` | bool | `False` | Generate separate samples and pack them into a single micro-batch instead of using random tensors. | + + +#### `trainer.ckpt` + +Full training-state checkpoint configuration (model + optimizer + scheduler). If None, no resume-capable checkpoints are written. 
+ +| Field | Type | Default | Description | +|---|---|---|---| +| `trainer.ckpt.output_dir` | Path \| None | `None` | Override directory for checkpoints and weights. If set, checkpoints and weight snapshots are written here instead of under the trainer ``output_dir`` — useful for writing large checkpoints to a separate storage volume. | +| `trainer.ckpt.interval` | int \| None | `None` | _≥1._ Interval, in steps, at which to save the training checkpoint. If None, a checkpoint is saved only at the end of training. | +| `trainer.ckpt.skip_gather_master_weights` | bool | `False` | Skip gathering and saving HF-compatible weight checkpoints. Useful for large models where the gather is expensive and only DCP checkpoints are needed. | +| `trainer.ckpt.weights_only` | bool | `False` | Save only weight checkpoints (no optimizer/scheduler state). Much faster and smaller than full checkpoints, but training cannot be resumed from them. | +| `trainer.ckpt.resume_step` | int \| None | `None` | _≥-1._ Step to resume training from. None starts from scratch; ``-1`` resumes from the latest available checkpoint. | +| `trainer.ckpt.keep_last` | int \| None | `None` | _≥1._ Keep at most this many recent step checkpoints on disk. If None, old checkpoints are never removed based on recency. | +| `trainer.ckpt.keep_interval` | int \| None | `None` | _≥1._ Permanently keep the checkpoint at every Nth step (e.g. ``keep_interval=100`` keeps step 100, 200, ...). If None, no checkpoints are kept on an interval basis. | +| `trainer.ckpt.skip_progress` | bool | `False` | Skip loading the training progress from the checkpoint. | +| `trainer.ckpt.skip_scheduler` | bool | `False` | Skip loading the scheduler from the checkpoint. | +| `trainer.ckpt.skip_dataloader` | bool | `False` | Skip loading the dataloader from the checkpoint. | +| `trainer.ckpt.skip_optimizer` | bool | `False` | Skip loading the optimizer state from the checkpoint. | + + +##### `trainer.ckpt.weights` + +Weight-checkpoint sub-configuration. If None, no HF-compatible weight checkpoints are written.
+ +| Field | Type | Default | Description | +|---|---|---|---| +| `trainer.ckpt.weights.save_sharded` | bool | `True` | Save the weight checkpoint in sharded format. | +| `trainer.ckpt.weights.save_format` | 'safetensors' \| 'torch' | `'safetensors'` | Weight checkpoint serialization format. | +| `trainer.ckpt.weights.save_adapter_separately` | bool | `False` | Save LoRA adapters separately before merging into full model weights. | + + +#### `trainer.log` + +| Field | Type | Default | Description | +|---|---|---|---| +| `trainer.log.level` | str | `'info'` | Log level for the process. Defaults to ``$PRIME_LOG_LEVEL`` if set, else ``info``. | +| `trainer.log.vf_level` | str | `'info'` | Log level for the verifiers package. Defaults to ``$PRIME_VF_LOG_LEVEL`` if set, else ``info``. | +| `trainer.log.json_logging` | bool | `False` | Emit newline-delimited JSON logs for aggregation (Loki, Grafana, etc.). | +| `trainer.log.log_data` | bool | `False` | Log the first data sample at startup. | +| `trainer.log.ranks_filter` | list[int] | `[0]` | Trainer ranks to show in console output. Passed to ``torchrun --local-ranks-filter``. | + + +#### `trainer.wandb` + +| Field | Type | Default | Description | +|---|---|---|---| +| `trainer.wandb.project` | str | `'prime-rl'` | W&B project to log to. | +| `trainer.wandb.entity` | str \| None | `None` | W&B entity to log to. | +| `trainer.wandb.name` | str \| None | `None` | W&B run name. | +| `trainer.wandb.group` | str \| None | `None` | W&B group. | +| `trainer.wandb.tags` | list[str] \| None | `None` | W&B tags attached to the run. | +| `trainer.wandb.offline` | bool | `False` | Run W&B in offline mode. | + + +#### `trainer.bench` + +Benchmark-mode configuration. When set, ``max_steps`` is forced to 4 and fake data is used. + +| Field | Type | Default | Description | +|---|---|---|---| +| `trainer.bench.output_json` | Path \| None | `None` | Path to write benchmark results as JSON. 
If unset, results are only printed to the console. | + + +#### `trainer.gc` + +Garbage collection config. Disables automatic GC and runs deterministic collections every N steps to avoid stragglers. Set to null to use Python's default GC behavior. + +| Field | Type | Default | Description | +|---|---|---|---| +| `trainer.gc.interval` | int | `50` | _≥1._ Run garbage collection every N training steps. Disables Python's automatic GC so every rank collects together and one slow rank can't stall the others. | + + +#### `trainer.heartbeat` + +BetterStack heartbeat configuration for monitoring training progress. + +| Field | Type | Default | Description | +|---|---|---|---| +| `trainer.heartbeat.url` | str | *required* | URL to send the heartbeat to. | + + +#### `trainer.metrics_server` + +Prometheus metrics server configuration. If set, exposes a ``/metrics`` endpoint for scraping. + +| Field | Type | Default | Description | +|---|---|---|---| +| `trainer.metrics_server.port` | int | `8000` | _≥1, ≤65535._ Port to expose metrics and health endpoints on. | +| `trainer.metrics_server.host` | str | `'0.0.0.0'` | Host to bind the server to. | + + +#### `trainer.experimental` + + +##### `trainer.experimental.token_export` + +Opt-in per-token JSONL export for rollout debugging. When enabled, writes token ids and aligned trainer metrics after each forward pass. + + +#### `trainer.loss` + +Loss config for rl-mode batches. opd and sft batches dispatch to their own loss fns unconditionally and do not read this. + +Discriminated union — set `trainer.loss.type` to one of `default`, `custom` and provide the matching sub-fields. + + +##### `trainer.loss.type = "default"` (DefaultLossConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `trainer.loss.type` | 'default' | `'default'` | | +| `trainer.loss.dppo_mask_low` | float | `0.2` | _≥0._ Lower DPPO masking threshold. | +| `trainer.loss.dppo_mask_high` | float | `0.2` | _≥0._ Upper DPPO masking threshold. 
| +| `trainer.loss.adv_tau` | float | `1.0` | _≥0._ Temperature for the advantage term. | +| `trainer.loss.kl_tau` | float | `0.001` | _≥0._ Temperature for the KL term. | + + +##### `trainer.loss.type = "custom"` (CustomLossConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `trainer.loss.type` | 'custom' | `'custom'` | | +| `trainer.loss.import_path` | str | *required* | Import path to the loss function (e.g. ``my_module.my_loss``). | +| `trainer.loss.kwargs` | dict[str, Any] | `{}` | Kwargs forwarded to the loss function. | + + +#### `trainer.optim` + +Discriminated union — set `trainer.optim.type` to one of `sgd`, `adamw`, `muon`, `sign_sgd` and provide the matching sub-fields. + + +##### `trainer.optim.type = "sgd"` (SGDConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `trainer.optim.lr` | float | `1e-06` | _≥0._ Peak learning rate. | +| `trainer.optim.weight_decay` | float | `0.01` | _≥0._ L2 weight-decay coefficient. | +| `trainer.optim.max_norm` | float \| None | `1.0` | _≥0._ Maximum gradient norm to clip to. If None, gradient clipping is disabled. | +| `trainer.optim.type` | 'sgd' | `'sgd'` | | +| `trainer.optim.nesterov` | bool | `True` | Use Nesterov momentum. | +| `trainer.optim.momentum` | float | `0.9` | SGD momentum factor. | + + +##### `trainer.optim.type = "adamw"` (AdamWConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `trainer.optim.lr` | float | `1e-06` | _≥0._ Peak learning rate. | +| `trainer.optim.weight_decay` | float | `0.01` | _≥0._ L2 weight-decay coefficient. | +| `trainer.optim.max_norm` | float \| None | `1.0` | _≥0._ Maximum gradient norm to clip to. If None, gradient clipping is disabled. | +| `trainer.optim.type` | 'adamw' | `'adamw'` | | +| `trainer.optim.betas1` | float | `0.9` | _≥0._ Adam first-moment (β1) decay. | +| `trainer.optim.betas2` | float | `0.999` | _≥0._ Adam second-moment (β2) decay. 
| + + +##### `trainer.optim.type = "muon"` (MuonConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `trainer.optim.lr` | float | `1e-06` | _≥0._ Peak learning rate. | +| `trainer.optim.weight_decay` | float | `0.01` | _≥0._ L2 weight-decay coefficient. | +| `trainer.optim.max_norm` | float \| None | `1.0` | _≥0._ Maximum gradient norm to clip to. If None, gradient clipping is disabled. | +| `trainer.optim.type` | 'muon' | `'muon'` | | +| `trainer.optim.mu` | float | `0.95` | _≥0._ Momentum factor for the Muon algorithm. | +| `trainer.optim.betas1` | float | `0.9` | _≥0._ β1 for the AdamW/Lion sub-optimizer used on non-Muon params. | +| `trainer.optim.betas2` | float | `0.95` | _≥0._ β2 for the AdamW/Lion sub-optimizer used on non-Muon params. | + + +##### `trainer.optim.type = "sign_sgd"` (SignSGDConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `trainer.optim.lr` | float | `1e-06` | _≥0._ Peak learning rate. | +| `trainer.optim.weight_decay` | float | `0.01` | _≥0._ L2 weight-decay coefficient. | +| `trainer.optim.max_norm` | float \| None | `1.0` | _≥0._ Maximum gradient norm to clip to. If None, gradient clipping is disabled. | +| `trainer.optim.type` | 'sign_sgd' | `'sign_sgd'` | | + + +#### `trainer.scheduler` + +Discriminated union — set `trainer.scheduler.type` to one of `constant`, `linear`, `cosine` and provide the matching sub-fields. + + +##### `trainer.scheduler.type = "constant"` (ConstantSchedulerConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `trainer.scheduler.type` | 'constant' | `'constant'` | | + + +##### `trainer.scheduler.type = "linear"` (LinearSchedulerConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `trainer.scheduler.type` | 'linear' | `'linear'` | | +| `trainer.scheduler.warmup_steps` | int | `10` | _≥0._ Warmup steps for the learning rate scheduler. 
| +| `trainer.scheduler.decay_steps` | int | `10` | _≥0._ Steps to decay the learning rate during the final portion of training. | +| `trainer.scheduler.min_lr` | float | `0.0` | _≥0._ Minimum learning rate to converge to. | + + +##### `trainer.scheduler.type = "cosine"` (CosineSchedulerConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `trainer.scheduler.type` | 'cosine' | `'cosine'` | | +| `trainer.scheduler.warmup_steps` | int | `10` | _≥0._ Warmup steps for the learning rate scheduler. | +| `trainer.scheduler.min_lr` | float | `0.0` | _≥0._ Minimum learning rate to converge to. | + + +#### `trainer.weight_broadcast` + +Transport used to broadcast updated weights from trainer to inference. + +Discriminated union — set `trainer.weight_broadcast.type` to one of `filesystem`, `nccl` and provide the matching sub-fields. + + +##### `trainer.weight_broadcast.type = "filesystem"` (FileSystemWeightBroadcastConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `trainer.weight_broadcast.type` | 'filesystem' | `'filesystem'` | | +| `trainer.weight_broadcast.save_sharded` | bool | `True` | Save the weight checkpoint in sharded format. | +| `trainer.weight_broadcast.save_format` | 'safetensors' \| 'torch' | `'safetensors'` | Weight checkpoint serialization format. | + + +##### `trainer.weight_broadcast.type = "nccl"` (NCCLWeightBroadcastConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `trainer.weight_broadcast.type` | 'nccl' | `'nccl'` | | +| `trainer.weight_broadcast.host` | str | `'localhost'` | Host for the NCCL broadcast rendezvous. | +| `trainer.weight_broadcast.port` | int | `29501` | Port for the NCCL broadcast rendezvous. | +| `trainer.weight_broadcast.timeout` | int | `1200` | Timeout in seconds for the NCCL broadcast. | +| `trainer.weight_broadcast.inference_world_size` | int | `1` | Number of GPUs used for inference. 
| +| `trainer.weight_broadcast.quantize_in_weight_transfer` | bool | `False` | Use kernel-format FP8 quantized NCCL transfer for weight updates. When disabled, uses default HF checkpoint-format transfer. | + + +#### `trainer.rollout_transport` + +Transport used to ship rollouts from orchestrator to trainer. + +Discriminated union — set `trainer.rollout_transport.type` to one of `filesystem`, `zmq` and provide the matching sub-fields. + + +##### `trainer.rollout_transport.type = "filesystem"` (FileSystemTransportConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `trainer.rollout_transport.type` | 'filesystem' | `'filesystem'` | | + + +##### `trainer.rollout_transport.type = "zmq"` (ZMQTransportConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `trainer.rollout_transport.type` | 'zmq' | `'zmq'` | | +| `trainer.rollout_transport.host` | str | `'localhost'` | Host address for ZMQ transport. | +| `trainer.rollout_transport.port` | int | `5555` | Base port for ZMQ transport. | +| `trainer.rollout_transport.hwm` | int | `10` | High-water mark (max in-flight messages per ZMQ socket). | + + +### `orchestrator` + +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.training_mode` | 'rl' \| 'opd' \| 'sft' | `'rl'` | Training mode. ``rl``: student generates rollouts, no teacher. ``opd``: student generates rollouts, teacher computes logprobs (teacher_tau > 0). ``sft``: teacher generates rollouts, student inference pool used for evals and weight sync. 
| +| `orchestrator.advantage` | DefaultAdvantageConfig \| CustomAdvantageConfig \| None | `DefaultAdvantageConfig()` | | +| `orchestrator.filters` | list[GibberishFilterConfig \| RepetitionFilterConfig \| ZeroAdvantageFilterConfig] | `[GibberishFilterConfig(type='gibberish', enforce=False, token_id_threshold=100000, logprob_offset=2.0), RepetitionFilterConfig(type='repetition', enforce=False, window=3000, prob_threshold=0.99), ZeroAdvantageFilterConfig(type='zero_advantage', enforce=True)]` | Rollout filters. Each filter can ``monitor`` (default) or ``enforce`` (skip rollouts). | +| `orchestrator.collect_inference_metrics` | bool | `True` | Collect inference-server metrics (requires wandb). | +| `orchestrator.output_dir` | Path | `'outputs/run_default'` | Directory to write outputs to — checkpoints, weights, rollouts, and logs are written as subdirectories. Should be a persistent directory with enough disk space and unique per experiment running on a single node. | +| `orchestrator.tasks_per_minute` | int \| None | `None` | _≥1._ Rate limit per environment worker, in tasks per minute. Recommended for sandbox-backed environments to prevent sandbox-not-ready errors during autoscaling. With multiple workers, the effective total rate is ``workers × this value``. None disables rate limiting. | +| `orchestrator.batch_size` | int \| None | `None` | _≥1._ Samples to train on per step (rollout-based batching). Set this OR ``token_batch_size``. | +| `orchestrator.token_batch_size` | int \| None | `None` | _≥1._ Tokens to train on per step (token-based batching). Set this OR ``batch_size``. | +| `orchestrator.oversampling_factor` | float \| None | `None` | _>0._ Rollout-mode batching only. Multiplier used to derive ``max_inflight_rollouts`` from ``batch_size`` when ``max_inflight_rollouts`` is unset. Values below 1.0 intentionally cap in-flight rollout capacity below ``batch_size``. 
| +| `orchestrator.max_inflight_rollouts` | int \| None | `None` | _≥1._ Maximum number of rollouts kept in-flight. Required for token-based batching. With ``batch_size`` set, defaults to ``batch_size * oversampling_factor`` (or ``batch_size`` when ``oversampling_factor`` is unset). | +| `orchestrator.rollouts_per_example` | int | `1` | _≥1._ Output sequences returned per example during training. | +| `orchestrator.seq_len` | int | `2048` | Training sequence length. Shorter samples are padded; longer samples are truncated. | +| `orchestrator.num_train_workers` | int | `1` | _≥1._ Training workers to use. | +| `orchestrator.max_steps` | int \| None | `None` | Maximum training steps. If None, runs indefinitely. | +| `orchestrator.max_off_policy_steps` | int | `8` | _≥0._ Maximum policies allowed to generate a single rollout. Rollouts generated more than ``max_off_policy_steps`` ahead of training are discarded. Higher values yield better throughput at the cost of off-policy noise. | +| `orchestrator.max_async_level` | int | `1` | _≥0._ Maximum steps inference can be ahead of training. ``0`` degenerates to synchronous on-policy RL; ``≥1`` overlaps training and inference. | +| `orchestrator.strict_async_level` | bool | `False` | Strictly enforce ``max_async_level``. When True, the rollout policy is always exactly ``max_async_level`` steps ahead of training. When False, any policy within ``max_async_level`` steps is allowed (always uses the latest available policy). | +| `orchestrator.bench` | bool | `False` | Benchmark mode. Sets ``max_steps`` to 5, ``max_async_level`` to ~∞, and disables W&B. | +| `orchestrator.seed` | int \| None | `42` | Random seed for the orchestrator. | +| `orchestrator.use_renderer` | bool | `True` | Use the renderer-backed TITO client (client-side tokenization via the ``renderers`` package, served by ``/v1/generate``). When True, the ``[orchestrator.renderer]`` block (name / tool_parser / reasoning_parser / pool_size) applies. 
Default for both text-only and VLM rollouts; VLMs require it. False falls back to MITO (``openai_chat_completions``). | +| `orchestrator.env_install_prerelease` | bool | `False` | Allow pre-release versions when installing environments (e.g. ``verifiers>=0.1.12.dev5``). Passes ``--prerelease`` to ``prime env install``. | + + +#### `orchestrator.student` + +Student rollout participant (model + client) — the model being trained. + + +##### `orchestrator.student.model` + +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.student.model.name` | str | `'Qwen/Qwen3-0.6B'` | HF model name or local path. | +| `orchestrator.student.model.trust_remote_code` | bool | `False` | Trust remote code when initializing the tokenizer. | + + +###### `orchestrator.student.model.vlm` + +VLM configuration. Setting this enables vision-language model support. + +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.student.model.vlm.vision_encoder_attr` | str | *required* | Dotted attribute path to the vision encoder module (e.g. ``model.visual``). | +| `orchestrator.student.model.vlm.language_model_attr` | str | *required* | Dotted attribute path to the language model module (e.g. ``model.language_model``). | +| `orchestrator.student.model.vlm.freeze_vision_encoder` | bool | `True` | Freeze the vision encoder. When False, it is trainable and FSDP-sharded per-block. No effect with LoRA (LoRA freezes all non-adapter parameters). | + + +###### `orchestrator.student.model.lora` + +Per-run LoRA configuration. If None, LoRA is disabled. + +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.student.model.lora.name` | str \| None | `None` | LoRA adapter name. If None, auto-generated from rank and alpha. | +| `orchestrator.student.model.lora.rank` | int \| None | `None` | _≥1._ LoRA rank for this run. Must be ≤ trainer's max rank. If None, uses the trainer's rank. 
| +| `orchestrator.student.model.lora.alpha` | float \| None | `None` | _≥0._ LoRA alpha for this run. If None, uses the trainer's alpha. | + + +##### `orchestrator.student.client` + +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.student.client.timeout` | int | `1200` | Request timeout in seconds. | +| `orchestrator.student.client.connect_timeout` | float | `30.0` | TCP connect timeout in seconds for inference API requests. | +| `orchestrator.student.client.wait_for_ready_timeout` | int | `1800` | Seconds to wait at startup for the inference pool to become ready. Applies to both the static health check and elastic DNS-based discovery. | +| `orchestrator.student.client.base_url` | list[str] | `['http://localhost:8000/v1']` | Base URLs for the OpenAI API. With more than one URL, the client round-robins (chat) completion requests across all servers. Ignored when ``elastic`` is set. | +| `orchestrator.student.client.api_key_var` | str | `'VLLM_API_KEY'` | Environment variable name containing the API key, resolved via ``os.getenv``. Can be any string when the server is not protected by an API key; the same key is used for every URL. | +| `orchestrator.student.client.headers` | dict[str, str] | `{}` | Static headers sent with every request. | +| `orchestrator.student.client.headers_from_env` | dict[str, str] | `{}` | Maps HTTP header names to environment variable names; each entry is resolved via ``os.getenv`` and merged into request headers. e.g. ``{"X-Prime-Team-ID": "PRIME_TEAM_ID"}``. | +| `orchestrator.student.client.extra_headers_from_state` | dict[str, str] | `{}` | Maps HTTP header names to rollout-state field names. The header value is read from the rollout state dict on every request. e.g. ``{"X-Session-ID": "example_id"}`` enables sticky routing at the inference router. | +| `orchestrator.student.client.skip_model_check` | bool | `False` | Skip checking that the model is available in the inference pool. 
Useful for external APIs or keys that do not expose ``/models``. | +| `orchestrator.student.client.dp_rank_count` | int | `1` | _≥1._ Number of data-parallel ranks behind each base URL. When > 1, each URL is expanded into ``dp_rank_count`` logical clients pinned via the ``X-data-parallel-rank`` header, so every request within a rollout hits the same DP engine and reuses KV cache. Auto-set from the inference config when using the RL entrypoint. | +| `orchestrator.student.client.admin_base_url` | list[str] \| None | `None` | Separate base URLs for admin operations (weight updates, health checks). When set, admin clients bypass routers and hit each server directly — used in disaggregated P/D deployments where the router must not handle admin traffic. | +| `orchestrator.student.client.router_url` | str \| None | `None` | vllm-router URL for load-aware inference routing. With elastic mode, inference requests go through the router while admin ops still hit discovered pods directly. | + + +###### `orchestrator.student.client.elastic` + +Elastic inference pool config for DNS-based service discovery. When set, ``base_url`` is ignored and inference servers are discovered dynamically via DNS. + +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.student.client.elastic.hostname` | str | *required* | DNS hostname that resolves to inference server IPs. | +| `orchestrator.student.client.elastic.port` | int | `8000` | Port that inference servers listen on. | +| `orchestrator.student.client.elastic.sync_interval` | float | `5.0` | Seconds between server discovery checks. | + + +#### `orchestrator.teacher` + +Teacher rollout participant (model + client). Role depends on ``training_mode``: ``opd`` — teacher computes logprobs; ``sft`` — teacher generates rollouts. + + +##### `orchestrator.teacher.model` + +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.teacher.model.name` | str | `'Qwen/Qwen3-0.6B'` | HF model name or local path. 
| +| `orchestrator.teacher.model.trust_remote_code` | bool | `False` | Trust remote code when initializing the tokenizer. | + + +###### `orchestrator.teacher.model.vlm` + +VLM configuration. Setting this enables vision-language model support. + +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.teacher.model.vlm.vision_encoder_attr` | str | *required* | Dotted attribute path to the vision encoder module (e.g. ``model.visual``). | +| `orchestrator.teacher.model.vlm.language_model_attr` | str | *required* | Dotted attribute path to the language model module (e.g. ``model.language_model``). | +| `orchestrator.teacher.model.vlm.freeze_vision_encoder` | bool | `True` | Freeze the vision encoder. When False, it is trainable and FSDP-sharded per-block. No effect with LoRA (LoRA freezes all non-adapter parameters). | + + +###### `orchestrator.teacher.model.lora` + +Per-run LoRA configuration. If None, LoRA is disabled. + +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.teacher.model.lora.name` | str \| None | `None` | LoRA adapter name. If None, auto-generated from rank and alpha. | +| `orchestrator.teacher.model.lora.rank` | int \| None | `None` | _≥1._ LoRA rank for this run. Must be ≤ trainer's max rank. If None, uses the trainer's rank. | +| `orchestrator.teacher.model.lora.alpha` | float \| None | `None` | _≥0._ LoRA alpha for this run. If None, uses the trainer's alpha. | + + +##### `orchestrator.teacher.client` + +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.teacher.client.timeout` | int | `1200` | Request timeout in seconds. | +| `orchestrator.teacher.client.connect_timeout` | float | `30.0` | TCP connect timeout in seconds for inference API requests. | +| `orchestrator.teacher.client.wait_for_ready_timeout` | int | `1800` | Seconds to wait at startup for the inference pool to become ready. Applies to both the static health check and elastic DNS-based discovery. 
| +| `orchestrator.teacher.client.base_url` | list[str] | `['http://localhost:8000/v1']` | Base URLs for the OpenAI API. With more than one URL, the client round-robins (chat) completion requests across all servers. Ignored when ``elastic`` is set. | +| `orchestrator.teacher.client.api_key_var` | str | `'VLLM_API_KEY'` | Environment variable name containing the API key, resolved via ``os.getenv``. Can be any string when the server is not protected by an API key; the same key is used for every URL. | +| `orchestrator.teacher.client.headers` | dict[str, str] | `{}` | Static headers sent with every request. | +| `orchestrator.teacher.client.headers_from_env` | dict[str, str] | `{}` | Maps HTTP header names to environment variable names; each entry is resolved via ``os.getenv`` and merged into request headers. e.g. ``{"X-Prime-Team-ID": "PRIME_TEAM_ID"}``. | +| `orchestrator.teacher.client.extra_headers_from_state` | dict[str, str] | `{}` | Maps HTTP header names to rollout-state field names. The header value is read from the rollout state dict on every request. e.g. ``{"X-Session-ID": "example_id"}`` enables sticky routing at the inference router. | +| `orchestrator.teacher.client.skip_model_check` | bool | `False` | Skip checking that the model is available in the inference pool. Useful for external APIs or keys that do not expose ``/models``. | +| `orchestrator.teacher.client.dp_rank_count` | int | `1` | _≥1._ Number of data-parallel ranks behind each base URL. When > 1, each URL is expanded into ``dp_rank_count`` logical clients pinned via the ``X-data-parallel-rank`` header, so every request within a rollout hits the same DP engine and reuses KV cache. Auto-set from the inference config when using the RL entrypoint. | +| `orchestrator.teacher.client.admin_base_url` | list[str] \| None | `None` | Separate base URLs for admin operations (weight updates, health checks). 
When set, admin clients bypass routers and hit each server directly — used in disaggregated P/D deployments where the router must not handle admin traffic. | +| `orchestrator.teacher.client.router_url` | str \| None | `None` | vllm-router URL for load-aware inference routing. With elastic mode, inference requests go through the router while admin ops still hit discovered pods directly. | + + +###### `orchestrator.teacher.client.elastic` + +Elastic inference pool config for DNS-based service discovery. When set, ``base_url`` is ignored and inference servers are discovered dynamically via DNS. + +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.teacher.client.elastic.hostname` | str | *required* | DNS hostname that resolves to inference server IPs. | +| `orchestrator.teacher.client.elastic.port` | int | `8000` | Port that inference servers listen on. | +| `orchestrator.teacher.client.elastic.sync_interval` | float | `5.0` | Seconds between server discovery checks. | + + +#### `orchestrator.train` + +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.train.env` | list[TrainEnvConfig] | `[TrainEnvConfig(id='reverse-text', name=None, args={}, extra_env_kwargs={'max_total_completion_tokens': -1}, address=None, num_workers='auto', ratio=None, max_retries=3, max_total_completion_tokens=-1, timeout=None, state_columns=[], sampling=TrainSamplingConfig(temperature=1.0, repetition_penalty=1.0, max_completion_tokens=None, min_tokens=0, seed=None, extra_body={}))]` | Training environments. | +| `orchestrator.train.num_workers` | int \| 'auto' | `'auto'` | Default worker processes for env servers. Can be overridden per env. | +| `orchestrator.train.max_retries` | int | `3` | _≥0._ Default retries for failed rollouts. Can be overridden per env. | + + +##### `orchestrator.train.sampling` + +Shared training sampling configuration. 
+ +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.train.sampling.temperature` | float | `1.0` | _≥0._ Sampling temperature. | +| `orchestrator.train.sampling.repetition_penalty` | float | `1.0` | _≥0._ Repetition penalty. Values > 1.0 discourage repetition, < 1.0 encourage it, 1.0 disables. | +| `orchestrator.train.sampling.max_completion_tokens` | int \| None | `None` | Maximum output tokens per turn. If None, generates until max context length or EOS. | +| `orchestrator.train.sampling.min_tokens` | int | `0` | _≥0._ Minimum output tokens per sequence. | +| `orchestrator.train.sampling.seed` | int \| None | `None` | Random seed for sampling. If None, no seeding is used. | +| `orchestrator.train.sampling.extra_body` | dict[str, Any] | `{}` | Extra body forwarded with each request to the inference server. | + + +#### `orchestrator.tokenizer` + +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.tokenizer.name` | str \| None | `None` | Tokenizer name or path. If None, the model's default tokenizer is used. | +| `orchestrator.tokenizer.trust_remote_code` | bool \| None | `None` | Trust remote code when initializing the tokenizer. If None, inherits the model's ``trust_remote_code`` setting. | +| `orchestrator.tokenizer.chat_template` | str \| None | `None` | Chat template for the tokenizer. Either a Jinja2 template string or a path to a template file. If None, the tokenizer's default chat template is used. | + + +#### `orchestrator.renderer` + +Client-side renderer configuration. Only consumed when ``use_renderer=true``. + +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.renderer.name` | str | `'auto'` | Renderer used for chat-template tokenization. One of: ``auto`` (detect from tokenizer), ``qwen3``, ``qwen3_vl``, ``qwen3.5``, ``glm5``, ``glm4.5``, ``minimax-m2``, ``deepseek_v3``, ``kimi_k2``, ``kimi_k25``, ``nemotron3``, ``gpt_oss``, ``default``. 
| +| `orchestrator.renderer.tool_parser` | str \| None | `None` | Tool parser from ``renderers.parsers``. Only consumed by DefaultRenderer; model-specific renderers bake their own parsing in. Options: ``qwen3``, ``qwen3.5``, ``glm``, ``deepseek_v3``. | +| `orchestrator.renderer.reasoning_parser` | str \| None | `None` | Reasoning parser from ``renderers.parsers``. Only consumed by DefaultRenderer. Options: ``think``. | +| `orchestrator.renderer.pool_size` | int \| None | `None` | _≥1._ Number of renderer slots shared across concurrent rollouts. Bump for long multi-turn prompts where client-side jinja tokenization serializes. | +| `orchestrator.renderer.preserve_all_thinking` | bool | `False` | Re-emit every past-assistant turn's ``reasoning_content`` between ``<think>``/``</think>`` (or the model's equivalent), even when the chat template would drop it. Strict superset of preserve_thinking_between_tool_calls. | +| `orchestrator.renderer.preserve_thinking_between_tool_calls` | bool | `False` | Preserve past-assistant ``reasoning_content`` only inside the current tool cycle — the contiguous assistant→tool→…→assistant block after the most recent user message, when that block contains at least one tool response. A new user turn closes the block. | + + +#### `orchestrator.optim` + +Per-run optimizer configuration for multi-run training. + +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.optim.lr` | float | `0.0001` | _≥0._ Learning rate for this run (per-run override for multi-run training). | + + +#### `orchestrator.eval` + +Evaluation configuration.
+ +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.eval.env` | list[EvalEnvConfig] | `[EvalEnvConfig(id='reverse-text', name=None, args={}, extra_env_kwargs={'max_total_completion_tokens': -1}, address=None, num_workers='auto', ratio=None, max_retries=3, max_total_completion_tokens=-1, timeout=None, state_columns=[], sampling=EvalSamplingConfig(temperature=None, repetition_penalty=None, top_p=None, top_k=None, min_p=None, max_completion_tokens=None, min_tokens=None, reasoning_effort=None, seed=None, extra_body={}), num_examples=-1, rollouts_per_example=1, interval=100)]` | Evaluation environments. | +| `orchestrator.eval.num_examples` | int | `-1` | Default eval examples per environment. ``-1`` uses all. Can be overridden per env. | +| `orchestrator.eval.rollouts_per_example` | int | `1` | _≥1._ Default rollouts per example. Can be overridden per env. | +| `orchestrator.eval.num_workers` | int \| 'auto' | `'auto'` | Default worker processes for env servers. Can be overridden per env. | +| `orchestrator.eval.max_retries` | int | `3` | _≥0._ Default retries for failed rollouts. Can be overridden per env. | +| `orchestrator.eval.interval` | int | `100` | _≥1._ Step interval at which to evaluate the model. | +| `orchestrator.eval.eval_base_model` | bool | `True` | Evaluate the base model we are training on. | +| `orchestrator.eval.skip_eval_on_resume` | bool | `True` | When resuming the orchestrator from a checkpoint, skip the (potentially redundant) online eval that would otherwise run immediately at the resumed step. | +| `orchestrator.eval.cancel_inflight_rollouts_on_eval` | bool | `False` | Cancel in-flight training rollouts before starting online evals. Avoids congestion (no training + eval rollouts at the same time) at the cost of slower training steps as the pipeline has to refill after each eval. | + + +##### `orchestrator.eval.sampling` + +Shared eval sampling configuration; can differ from training sampling. 
+ +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.eval.sampling.temperature` | float \| None | `None` | _≥0._ Sampling temperature. None defers to the inference server default. | +| `orchestrator.eval.sampling.repetition_penalty` | float \| None | `None` | _≥0._ Repetition penalty. None defers to the inference server default. | +| `orchestrator.eval.sampling.top_p` | float \| None | `None` | Nucleus sampling threshold. None defers to the inference server default. | +| `orchestrator.eval.sampling.top_k` | int \| None | `None` | Top-k sampling. None defers to the inference server default. | +| `orchestrator.eval.sampling.min_p` | float \| None | `None` | _≥0._ Min-p sampling threshold. None defers to the inference server default. | +| `orchestrator.eval.sampling.max_completion_tokens` | int \| None | `None` | Maximum output tokens per turn. None defers to the inference server default. | +| `orchestrator.eval.sampling.min_tokens` | int \| None | `None` | _≥0._ Minimum output tokens per sequence. None defers to the inference server default. | +| `orchestrator.eval.sampling.reasoning_effort` | 'minimal' \| 'low' \| 'medium' \| 'high' \| None | `None` | Reasoning effort constraint for reasoning models. | +| `orchestrator.eval.sampling.seed` | int \| None | `None` | Random seed for sampling. None means no seeding. | +| `orchestrator.eval.sampling.extra_body` | dict[str, Any] | `{}` | Extra body parameters forwarded to the inference server. | + + +#### `orchestrator.buffer` + +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.buffer.seed` | int \| None | `None` | Random seed for the buffer. When set, sampling from the buffer is deterministic. | +| `orchestrator.buffer.easy_threshold` | float \| None | `None` | Average-reward threshold above which a problem is classified ``easy``. | +| `orchestrator.buffer.hard_threshold` | float \| None | `None` | Average-reward threshold below which a problem is classified ``hard``. 
| +| `orchestrator.buffer.easy_fraction` | float | `0.0` | _≥0, ≤1._ Fraction of easy problems to convert to ``normal`` when resuming or starting training. Only problems with difficulty ``normal`` are sampled. | +| `orchestrator.buffer.hard_fraction` | float | `0.0` | _≥0, ≤1._ Fraction of hard problems to convert to ``normal`` when resuming or starting training. Only problems with difficulty ``normal`` are sampled. | +| `orchestrator.buffer.online_difficulty_filtering` | bool | `False` | Filter rollouts based on difficulty. When True, rollouts with average reward 0.0 or 1.0 are not added to the buffer. | +| `orchestrator.buffer.hash_keys` | list[str] | `['env_name', 'prompt']` | _len ≥ 1._ Keys used to compute example hashes. Used to match examples from buffer checkpoints and determine buffer resume behavior. | + + +#### `orchestrator.log` + +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.log.level` | str | `'info'` | Log level for the process. Defaults to ``$PRIME_LOG_LEVEL`` if set, else ``info``. | +| `orchestrator.log.vf_level` | str | `'info'` | Log level for the verifiers package. Defaults to ``$PRIME_VF_LOG_LEVEL`` if set, else ``info``. | +| `orchestrator.log.json_logging` | bool | `False` | Emit newline-delimited JSON logs for aggregation (Loki, Grafana, etc.). | +| `orchestrator.log.log_data` | bool | `False` | Log the first data sample at startup. | + + +#### `orchestrator.wandb` + +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.wandb.project` | str | `'prime-rl'` | W&B project to log to. | +| `orchestrator.wandb.entity` | str \| None | `None` | W&B entity to log to. | +| `orchestrator.wandb.name` | str \| None | `None` | W&B run name. | +| `orchestrator.wandb.group` | str \| None | `None` | W&B group. | +| `orchestrator.wandb.tags` | list[str] \| None | `None` | W&B tags attached to the run. | +| `orchestrator.wandb.offline` | bool | `False` | Run W&B in offline mode. 
| + + +##### `orchestrator.wandb.log_extras` + +Extras logging configuration. If None, no extras are logged. + +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.wandb.log_extras.samples` | bool | `True` | Log prompt/response samples. | +| `orchestrator.wandb.log_extras.distributions` | bool | `True` | Log distributions (rewards, advantages, etc.). | +| `orchestrator.wandb.log_extras.interval` | int | `10` | _≥1._ Step interval between extras logs. | +| `orchestrator.wandb.log_extras.sample_ratio` | float \| None | `None` | _≥0.0, ≤1.0._ Fraction of rollouts to log per step. The effective cap is ``len(rollouts) * sample_ratio``; 1.0 = all, 0.5 = half, 0.0 = none. | + + +#### `orchestrator.prime_monitor` + +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.prime_monitor.base_url` | str | `'https://api.primeintellect.ai/api/v1/rft'` | Base URL for the Prime Intellect monitoring API. | +| `orchestrator.prime_monitor.api_key_var` | str | `'PRIME_API_KEY'` | Environment variable name containing the Prime Intellect API key, resolved via ``os.getenv``. | +| `orchestrator.prime_monitor.run_name` | str \| None | `None` | Run name shown on the platform. Defaults to the W&B run name when set, otherwise the platform auto-generates one. | +| `orchestrator.prime_monitor.team_id` | str \| None | `None` | Team ID to associate the run with. | +| `orchestrator.prime_monitor.frontend_url` | str \| None | `None` | Frontend base URL used for the dashboard link printed after registration. Defaults to the Prime CLI frontend URL when unset. | + + +##### `orchestrator.prime_monitor.log_extras` + +Extras logging configuration. If None, no extras are logged. + +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.prime_monitor.log_extras.samples` | bool | `True` | Log prompt/response samples. | +| `orchestrator.prime_monitor.log_extras.distributions` | bool | `True` | Log distributions (rewards, advantages, etc.). 
| +| `orchestrator.prime_monitor.log_extras.interval` | int | `10` | _≥1._ Step interval between extras logs. | +| `orchestrator.prime_monitor.log_extras.sample_ratio` | float \| None | `None` | _≥0.0, ≤1.0._ Fraction of rollouts to log per step. The effective cap is ``len(rollouts) * sample_ratio``; 1.0 = all, 0.5 = half, 0.0 = none. | + + +#### `orchestrator.ckpt` + +Checkpoint configuration. + +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.ckpt.interval` | int \| None | `None` | _≥1._ Step interval at which to save the orchestrator checkpoint. | +| `orchestrator.ckpt.resume_step` | int \| None | `None` | _≥-1._ Step to resume the orchestrator from. None starts from scratch; ``-1`` resumes from the latest checkpoint available. | +| `orchestrator.ckpt.wait_for_weights_timeout` | int \| None | `None` | _≥1._ When resuming, wait up to this many seconds for the weight directory to appear. Useful when the orchestrator restarts while the trainer is still saving weights. If None, fail immediately when weights are not found. | +| `orchestrator.ckpt.keep_last` | int \| None | `None` | _≥1._ Keep at most this many recent step checkpoints on disk. If None, never clean old checkpoints based on recency. | +| `orchestrator.ckpt.keep_interval` | int \| None | `None` | _≥1._ Keep checkpoints at every N steps permanently (e.g. ``keep_interval=100`` keeps step 100, 200, ...). If None, no interval-based keeping. | +| `orchestrator.ckpt.skip_progress` | bool | `False` | Skip loading the progress from checkpoint. | +| `orchestrator.ckpt.skip_buffer` | bool | `False` | Skip loading the buffer from checkpoint. | + + +#### `orchestrator.heartbeat` + +BetterStack heartbeat configuration for monitoring training progress. + +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.heartbeat.url` | str | *required* | URL to send the heartbeat to. 
| + + +#### `orchestrator.experimental` + + +#### `orchestrator.weight_broadcast` + +Transport used to receive updated weights from the trainer. + +Discriminated union — set `orchestrator.weight_broadcast.type` to one of `filesystem`, `nccl` and provide the matching sub-fields. + + +##### `orchestrator.weight_broadcast.type = "filesystem"` (FileSystemWeightBroadcastConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.weight_broadcast.type` | 'filesystem' | `'filesystem'` | | + + +##### `orchestrator.weight_broadcast.type = "nccl"` (NCCLWeightBroadcastConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.weight_broadcast.type` | 'nccl' | `'nccl'` | | +| `orchestrator.weight_broadcast.host` | str | `'localhost'` | Host for the NCCL broadcast rendezvous. | +| `orchestrator.weight_broadcast.port` | int | `29501` | Port for the NCCL broadcast rendezvous. | +| `orchestrator.weight_broadcast.timeout` | int | `1200` | Timeout in seconds for the NCCL broadcast. | +| `orchestrator.weight_broadcast.quantize_in_weight_transfer` | bool | `False` | Use kernel-format FP8 quantized NCCL transfer for weight updates. | +| `orchestrator.weight_broadcast.inference_world_size` | int | `1` | _≥1._ Total inference GPUs across all servers. Used by ``init_nccl_broadcast`` to compute per-server rank offsets. | + + +#### `orchestrator.rollout_transport` + +Transport used to ship rollouts from orchestrator to trainer. + +Discriminated union — set `orchestrator.rollout_transport.type` to one of `filesystem`, `zmq` and provide the matching sub-fields. 
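+
+A minimal sketch of selecting the ZMQ variant, assuming the config is written as TOML and that the dotted field paths in this reference map directly to TOML table names. The values shown are the documented defaults, not tuned recommendations.
+
+```toml
+# Assumption: dotted field paths map one-to-one to TOML tables.
+[orchestrator.rollout_transport]
+type = "zmq"        # discriminator: "filesystem" or "zmq"
+host = "localhost"  # host address for ZMQ transport
+port = 5555         # base port for ZMQ transport
+hwm = 10            # high-water mark: max in-flight messages per ZMQ socket
+```
+
+With `type = "filesystem"`, this reference lists no further sub-fields, so the `type` key alone would be enough under the same assumption.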
+ + +##### `orchestrator.rollout_transport.type = "filesystem"` (FileSystemTransportConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.rollout_transport.type` | 'filesystem' | `'filesystem'` | | + + +##### `orchestrator.rollout_transport.type = "zmq"` (ZMQTransportConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.rollout_transport.type` | 'zmq' | `'zmq'` | | +| `orchestrator.rollout_transport.host` | str | `'localhost'` | Host address for ZMQ transport. | +| `orchestrator.rollout_transport.port` | int | `5555` | Base port for ZMQ transport. | +| `orchestrator.rollout_transport.hwm` | int | `10` | High-water mark (max in-flight messages per ZMQ socket). | + + +### `inference` + +Inference server configuration. If None, the rl entrypoint will not start an inference server (useful for elastic inference pools or manually started servers). + +| Field | Type | Default | Description | +|---|---|---|---| +| `inference.enable_lora` | bool | `False` | Enable LoRA. Forwarded as ``--enable-lora``. | +| `inference.max_loras` | int | `8` | Maximum number of LoRAs. Forwarded as ``--max-loras``. | +| `inference.max_cpu_loras` | int | `100` | Maximum number of LoRAs on CPU. Forwarded as ``--max-cpu-loras``. | +| `inference.max_lora_rank` | int \| None | `None` | Maximum LoRA rank. Forwarded as ``--max-lora-rank``. | +| `inference.lora_target_modules` | list[str] \| None | `None` | LoRA target modules. Forwarded as ``--lora-target-modules``. | +| `inference.enable_prefix_caching` | bool \| None | `None` | Enable prefix caching. Forwarded as ``--enable-prefix-caching``. | +| `inference.gpu_memory_utilization` | float | `0.9` | GPU memory utilization. Forwarded as ``--gpu-memory-utilization``. | +| `inference.api_server_count` | int | `1` | _≥0._ API servers to run. Forwarded as ``--api-server-count``. Set to 0 for headless mode. 
| +| `inference.data_parallel_size_local` | int \| None | `None` | _≥1._ Data parallel replicas to run on this node. Forwarded as ``--data-parallel-size-local``. | +| `inference.data_parallel_rpc_port` | int | `13345` | _≥1, ≤65535._ RPC port for data parallel communication. Forwarded as ``--data-parallel-rpc-port``. | +| `inference.seed` | int | `0` | Seed the inference components. Forwarded as ``--seed``. | +| `inference.enable_expert_parallel` | bool | `False` | Enable expert parallelism for MoE models. Forwarded as ``--enable-expert-parallel``. | +| `inference.all2all_backend` | 'allgather_reducescatter' \| 'deepep_high_throughput' \| 'deepep_low_latency' \| 'flashinfer_nvlink_one_sided' \| 'flashinfer_nvlink_two_sided' | `'allgather_reducescatter'` | All-to-all backend for expert-parallel communication. Forwarded as ``--all2all-backend``. | +| `inference.enable_eplb` | bool | `False` | Enable expert parallel load balancer (EPLB). Forwarded as ``--enable-eplb``. | +| `inference.enable_dbo` | bool | `False` | Enable dual batch overlap (DBO). Forwarded as ``--enable-dbo``. | +| `inference.use_deep_gemm` | bool | `False` | Force DeepGEMM FP8 kernels via ``VLLM_USE_DEEP_GEMM=1``. Only works with per-tensor FP8 quantization (e.g. GLM-5-FP8). | +| `inference.enable_return_routed_experts` | bool | `False` | Return routed experts in responses. Forwarded as ``--enable-return-routed-experts``. | +| `inference.enable_fp32_lm_head` | bool | `False` | Run the lm_head projection in fp32 via a native bf16×bf16 → fp32 GEMM (``torch.mm`` with ``out_dtype=torch.float32``). Stabilizes logprob precision under FP8/bf16 inference, matching SGLang's ``--enable-fp32-lm-head``. Implemented as a monkey-patch over vLLM's LogitsProcessor, activated by setting ``additional_config["fp32_lm_head"] = True`` on the vLLM config. | +| `inference.vllm_extra` | dict[str, Any] | `{}` | Extra arguments forwarded to vLLM. Applied as attributes on the vLLM namespace after config translation. 
| +| `inference.output_dir` | Path | `'outputs'` | Directory for SLURM logs and generated scripts. | +| `inference.dry_run` | bool | `False` | Only validate and dump resolved configs, then exit early. | + + +#### `inference.server` + +| Field | Type | Default | Description | +|---|---|---|---| +| `inference.server.host` | str \| None | `None` | Host to bind to. | +| `inference.server.port` | int | `8000` | Port to bind to. | +| `inference.server.liveness_timeout_seconds` | float | `30.0` | _>0._ Timeout in seconds for the ``/liveness`` endpoint's internal vLLM worker RPC. With Kubernetes liveness probes, keep the probe ``timeoutSeconds`` at least this high. | + + +#### `inference.model` + +| Field | Type | Default | Description | +|---|---|---|---| +| `inference.model.name` | str | `'Qwen/Qwen3-0.6B'` | HF model name or local path. | +| `inference.model.trust_remote_code` | bool | `False` | Trust remote code. Forwarded to vLLM engine init. | +| `inference.model.dtype` | 'auto' \| 'float16' \| 'bfloat16' \| 'float32' | `'auto'` | dtype for model weights and activations. ``auto`` uses FP16 for FP32/FP16 models and BF16 for BF16 models. Forwarded as ``--dtype``. | +| `inference.model.max_model_len` | int \| None | `None` | Maximum model context length. If None, uses the model config's value. Forwarded as ``--max-model-len``. | +| `inference.model.enforce_eager` | bool | `False` | Enforce eager mode. When False, PyTorch eager and cuda graphs run hybrid for maximum performance. Forwarded as ``--enforce-eager``. | +| `inference.model.chat_template` | str \| None | `None` | Chat template — a Jinja2 template string or path to a template file. Forwarded as ``--chat-template``. If None, uses the model's default. | +| `inference.model.tool_call_parser` | str \| None | `'auto'` | Tool-call parser. Forwarded as ``--tool-call-parser``. Set to ``"auto"`` (default) to detect from the model name, or ``None`` to disable. 
| +| `inference.model.reasoning_parser` | str \| None | `'auto'` | Parser for extracting reasoning content from model outputs. Forwarded as ``--reasoning-parser``. Set to ``"auto"`` (default) to detect from the model name, or ``None`` to disable. | +| `inference.model.rope_scaling` | dict[str, Any] \| str \| None | `None` | RoPE scaling configuration as a dict (e.g. ``{rope_type="yarn", factor=4.0, original_max_position_embeddings=32768}``). Forwarded as ``--rope-scaling``. | + + +##### `inference.model.vlm` + +VLM configuration. Setting this enables vision-language model support. + +| Field | Type | Default | Description | +|---|---|---|---| +| `inference.model.vlm.vision_encoder_attr` | str | *required* | Dotted attribute path to the vision encoder module (e.g. ``model.visual``). | +| `inference.model.vlm.language_model_attr` | str | *required* | Dotted attribute path to the language model module (e.g. ``model.language_model``). | +| `inference.model.vlm.freeze_vision_encoder` | bool | `True` | Freeze the vision encoder. When False, it is trainable and FSDP-sharded per-block. No effect with LoRA (LoRA freezes all non-adapter parameters). | + + +#### `inference.parallel` + +Multi-node and multi-GPU parallelism (TP, DP, PP). + +| Field | Type | Default | Description | +|---|---|---|---| +| `inference.parallel.tp` | int | `1` | Tensor parallel size. Forwarded to vLLM as ``--tensor-parallel-size``. | +| `inference.parallel.dp` | int | `1` | _≥1._ Data parallel size. Forwarded to vLLM as ``--data-parallel-size``. | + + +#### `inference.weight_broadcast` + +| Field | Type | Default | Description | +|---|---|---|---| +| `inference.weight_broadcast.type` | 'nccl' \| 'filesystem' | `'filesystem'` | Weight broadcast transport. | + + +#### `inference.kv_cache_offload` + +CPU KV cache offload for inference workers. Standard inference uses vLLM's ``OffloadingConnector``. Disaggregated P/D deployments combine it with NIXL through ``MultiConnector`` in the SLURM launcher. 
+ +| Field | Type | Default | Description | +|---|---|---|---| +| `inference.kv_cache_offload.cpu_bytes` | int | `1000000000` | _>0._ CPU bytes available for KV cache offloading per worker. | + + +#### `inference.slurm` + +SLURM configuration. When set, the run is submitted as a SLURM job instead of running locally. + +| Field | Type | Default | Description | +|---|---|---|---| +| `inference.slurm.job_name` | str | `'prime-rl'` | SLURM job name. | +| `inference.slurm.project_dir` | Path | `'.'` | Path to the project root, used to source .env, activate .venv, and run uv sync. | +| `inference.slurm.template_path` | Path \| None | `None` | SLURM template file. If None, uses the bundled single-node or multi-node template. | +| `inference.slurm.partition` | str | `'cluster'` | SLURM partition (#SBATCH --partition). | +| `inference.slurm.nodelist` | str \| None | `None` | Comma-separated list of specific nodes to run on (#SBATCH --nodelist). | +| `inference.slurm.exclude` | str \| None | `None` | Comma-separated list of nodes to exclude (#SBATCH --exclude). | +| `inference.slurm.account` | str \| None | `None` | SLURM account to charge (#SBATCH --account). | +| `inference.slurm.time` | str \| None | `None` | Maximum wall time, e.g. '24:00:00' or '7-00:00:00' (#SBATCH --time). | +| `inference.slurm.pre_run_command` | str \| None | `None` | Shell command to run on the head node after cd, .env sourcing, and venv activation. Useful for cleanup like ``sudo pkill -f vllm``; wrap with ``srun bash -c '...'`` to fan out to all nodes. | + + +#### `inference.experimental` + + +#### `inference.deployment` + +Discriminated union — set `inference.deployment.type` to one of `single_node`, `multi_node`, `disaggregated` and provide the matching sub-fields. + + +##### `inference.deployment.type = "single_node"` (SingleNodeInferenceDeploymentConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `inference.deployment.gpus_per_node` | int | `8` | GPUs per node. 
| +| `inference.deployment.type` | 'single_node' | `'single_node'` | | + + +##### `inference.deployment.type = "multi_node"` (MultiNodeInferenceDeploymentConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `inference.deployment.gpus_per_node` | int | `8` | GPUs per node. | +| `inference.deployment.type` | 'multi_node' | `'multi_node'` | | +| `inference.deployment.num_nodes` | int | `2` | _≥1._ Inference nodes. | +| `inference.deployment.router_port` | int | `8000` | Port for the vllm-router. | +| `inference.deployment.backend_port` | int | `8100` | Port for vLLM backend instances. | +| `inference.deployment.router_policy` | str | `'consistent_hash'` | vllm-router routing policy (e.g. ``consistent_hash``, ``round_robin``). | + + +##### `inference.deployment.type = "disaggregated"` (DisaggregatedInferenceDeploymentConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `inference.deployment.gpus_per_node` | int | `8` | GPUs per node. | +| `inference.deployment.type` | 'disaggregated' | `'disaggregated'` | | +| `inference.deployment.num_prefill_nodes` | int | `1` | _≥1._ Total prefill nodes. | +| `inference.deployment.num_decode_nodes` | int | `1` | _≥1._ Total decode nodes. | +| `inference.deployment.num_prefill_replicas` | int | `1` | _≥1._ Independent prefill vLLM instances. Must evenly divide ``num_prefill_nodes``. | +| `inference.deployment.num_decode_replicas` | int | `1` | _≥1._ Independent decode vLLM instances. Must evenly divide ``num_decode_nodes``. | +| `inference.deployment.router_port` | int | `8000` | Port for the vllm-router on each replica. | +| `inference.deployment.prefill_port` | int | `8100` | Port for prefill vLLM instances. | +| `inference.deployment.decode_port` | int | `8200` | Port for decode vLLM instances. | +| `inference.deployment.router_policy` | str | `'consistent_hash'` | vllm-router routing policy (e.g. ``consistent_hash``, ``round_robin``). 
| +| `inference.deployment.prefill_env_overrides` | dict[str, str] | `{}` | Extra environment variables exported only on prefill nodes. | +| `inference.deployment.decode_env_overrides` | dict[str, str] | `{}` | Extra environment variables exported only on decode nodes. | + + +### `teacher_inference` + +Teacher inference server configuration. If None, falls back to the same config as ``inference`` (or a default). Only used when teacher GPUs/nodes are set. + +| Field | Type | Default | Description | +|---|---|---|---| +| `teacher_inference.enable_lora` | bool | `False` | Enable LoRA. Forwarded as ``--enable-lora``. | +| `teacher_inference.max_loras` | int | `8` | Maximum number of LoRAs. Forwarded as ``--max-loras``. | +| `teacher_inference.max_cpu_loras` | int | `100` | Maximum number of LoRAs on CPU. Forwarded as ``--max-cpu-loras``. | +| `teacher_inference.max_lora_rank` | int \| None | `None` | Maximum LoRA rank. Forwarded as ``--max-lora-rank``. | +| `teacher_inference.lora_target_modules` | list[str] \| None | `None` | LoRA target modules. Forwarded as ``--lora-target-modules``. | +| `teacher_inference.enable_prefix_caching` | bool \| None | `None` | Enable prefix caching. Forwarded as ``--enable-prefix-caching``. | +| `teacher_inference.gpu_memory_utilization` | float | `0.9` | GPU memory utilization. Forwarded as ``--gpu-memory-utilization``. | +| `teacher_inference.api_server_count` | int | `1` | _≥0._ API servers to run. Forwarded as ``--api-server-count``. Set to 0 for headless mode. | +| `teacher_inference.data_parallel_size_local` | int \| None | `None` | _≥1._ Data parallel replicas to run on this node. Forwarded as ``--data-parallel-size-local``. | +| `teacher_inference.data_parallel_rpc_port` | int | `13345` | _≥1, ≤65535._ RPC port for data parallel communication. Forwarded as ``--data-parallel-rpc-port``. | +| `teacher_inference.seed` | int | `0` | Seed the inference components. Forwarded as ``--seed``. 
| +| `teacher_inference.enable_expert_parallel` | bool | `False` | Enable expert parallelism for MoE models. Forwarded as ``--enable-expert-parallel``. | +| `teacher_inference.all2all_backend` | 'allgather_reducescatter' \| 'deepep_high_throughput' \| 'deepep_low_latency' \| 'flashinfer_nvlink_one_sided' \| 'flashinfer_nvlink_two_sided' | `'allgather_reducescatter'` | All-to-all backend for expert-parallel communication. Forwarded as ``--all2all-backend``. | +| `teacher_inference.enable_eplb` | bool | `False` | Enable expert parallel load balancer (EPLB). Forwarded as ``--enable-eplb``. | +| `teacher_inference.enable_dbo` | bool | `False` | Enable dual batch overlap (DBO). Forwarded as ``--enable-dbo``. | +| `teacher_inference.use_deep_gemm` | bool | `False` | Force DeepGEMM FP8 kernels via ``VLLM_USE_DEEP_GEMM=1``. Only works with per-tensor FP8 quantization (e.g. GLM-5-FP8). | +| `teacher_inference.enable_return_routed_experts` | bool | `False` | Return routed experts in responses. Forwarded as ``--enable-return-routed-experts``. | +| `teacher_inference.enable_fp32_lm_head` | bool | `False` | Run the lm_head projection in fp32 via a native bf16×bf16 → fp32 GEMM (``torch.mm`` with ``out_dtype=torch.float32``). Stabilizes logprob precision under FP8/bf16 inference, matching SGLang's ``--enable-fp32-lm-head``. Implemented as a monkey-patch over vLLM's LogitsProcessor, activated by setting ``additional_config["fp32_lm_head"] = True`` on the vLLM config. | +| `teacher_inference.vllm_extra` | dict[str, Any] | `{}` | Extra arguments forwarded to vLLM. Applied as attributes on the vLLM namespace after config translation. | +| `teacher_inference.output_dir` | Path | `'outputs'` | Directory for SLURM logs and generated scripts. | +| `teacher_inference.dry_run` | bool | `False` | Only validate and dump resolved configs, then exit early. 
| + + +#### `teacher_inference.server` + +| Field | Type | Default | Description | +|---|---|---|---| +| `teacher_inference.server.host` | str \| None | `None` | Host to bind to. | +| `teacher_inference.server.port` | int | `8000` | Port to bind to. | +| `teacher_inference.server.liveness_timeout_seconds` | float | `30.0` | _>0._ Timeout in seconds for the ``/liveness`` endpoint's internal vLLM worker RPC. With Kubernetes liveness probes, keep the probe ``timeoutSeconds`` at least this high. | + + +#### `teacher_inference.model` + +| Field | Type | Default | Description | +|---|---|---|---| +| `teacher_inference.model.name` | str | `'Qwen/Qwen3-0.6B'` | HF model name or local path. | +| `teacher_inference.model.trust_remote_code` | bool | `False` | Trust remote code. Forwarded to vLLM engine init. | +| `teacher_inference.model.dtype` | 'auto' \| 'float16' \| 'bfloat16' \| 'float32' | `'auto'` | dtype for model weights and activations. ``auto`` uses FP16 for FP32/FP16 models and BF16 for BF16 models. Forwarded as ``--dtype``. | +| `teacher_inference.model.max_model_len` | int \| None | `None` | Maximum model context length. If None, uses the model config's value. Forwarded as ``--max-model-len``. | +| `teacher_inference.model.enforce_eager` | bool | `False` | Enforce eager mode. When False, PyTorch eager and cuda graphs run hybrid for maximum performance. Forwarded as ``--enforce-eager``. | +| `teacher_inference.model.chat_template` | str \| None | `None` | Chat template — a Jinja2 template string or path to a template file. Forwarded as ``--chat-template``. If None, uses the model's default. | +| `teacher_inference.model.tool_call_parser` | str \| None | `'auto'` | Tool-call parser. Forwarded as ``--tool-call-parser``. Set to ``"auto"`` (default) to detect from the model name, or ``None`` to disable. | +| `teacher_inference.model.reasoning_parser` | str \| None | `'auto'` | Parser for extracting reasoning content from model outputs. 
Forwarded as ``--reasoning-parser``. Set to ``"auto"`` (default) to detect from the model name, or ``None`` to disable. | +| `teacher_inference.model.rope_scaling` | dict[str, Any] \| str \| None | `None` | RoPE scaling configuration as a dict (e.g. ``{rope_type="yarn", factor=4.0, original_max_position_embeddings=32768}``). Forwarded as ``--rope-scaling``. | + + +##### `teacher_inference.model.vlm` + +VLM configuration. Setting this enables vision-language model support. + +| Field | Type | Default | Description | +|---|---|---|---| +| `teacher_inference.model.vlm.vision_encoder_attr` | str | *required* | Dotted attribute path to the vision encoder module (e.g. ``model.visual``). | +| `teacher_inference.model.vlm.language_model_attr` | str | *required* | Dotted attribute path to the language model module (e.g. ``model.language_model``). | +| `teacher_inference.model.vlm.freeze_vision_encoder` | bool | `True` | Freeze the vision encoder. When False, it is trainable and FSDP-sharded per-block. No effect with LoRA (LoRA freezes all non-adapter parameters). | + + +#### `teacher_inference.parallel` + +Multi-node and multi-GPU parallelism (TP, DP, PP). + +| Field | Type | Default | Description | +|---|---|---|---| +| `teacher_inference.parallel.tp` | int | `1` | Tensor parallel size. Forwarded to vLLM as ``--tensor-parallel-size``. | +| `teacher_inference.parallel.dp` | int | `1` | _≥1._ Data parallel size. Forwarded to vLLM as ``--data-parallel-size``. | + + +#### `teacher_inference.weight_broadcast` + +| Field | Type | Default | Description | +|---|---|---|---| +| `teacher_inference.weight_broadcast.type` | 'nccl' \| 'filesystem' | `'filesystem'` | Weight broadcast transport. | + + +#### `teacher_inference.kv_cache_offload` + +CPU KV cache offload for inference workers. Standard inference uses vLLM's ``OffloadingConnector``. Disaggregated P/D deployments combine it with NIXL through ``MultiConnector`` in the SLURM launcher. 
+ +| Field | Type | Default | Description | +|---|---|---|---| +| `teacher_inference.kv_cache_offload.cpu_bytes` | int | `1000000000` | _>0._ CPU bytes available for KV cache offloading per worker. | + + +#### `teacher_inference.slurm` + +SLURM configuration. When set, the run is submitted as a SLURM job instead of running locally. + +| Field | Type | Default | Description | +|---|---|---|---| +| `teacher_inference.slurm.job_name` | str | `'prime-rl'` | SLURM job name. | +| `teacher_inference.slurm.project_dir` | Path | `'.'` | Path to the project root, used to source .env, activate .venv, and run uv sync. | +| `teacher_inference.slurm.template_path` | Path \| None | `None` | SLURM template file. If None, uses the bundled single-node or multi-node template. | +| `teacher_inference.slurm.partition` | str | `'cluster'` | SLURM partition (#SBATCH --partition). | +| `teacher_inference.slurm.nodelist` | str \| None | `None` | Comma-separated list of specific nodes to run on (#SBATCH --nodelist). | +| `teacher_inference.slurm.exclude` | str \| None | `None` | Comma-separated list of nodes to exclude (#SBATCH --exclude). | +| `teacher_inference.slurm.account` | str \| None | `None` | SLURM account to charge (#SBATCH --account). | +| `teacher_inference.slurm.time` | str \| None | `None` | Maximum wall time, e.g. '24:00:00' or '7-00:00:00' (#SBATCH --time). | +| `teacher_inference.slurm.pre_run_command` | str \| None | `None` | Shell command to run on the head node after cd, .env sourcing, and venv activation. Useful for cleanup like ``sudo pkill -f vllm``; wrap with ``srun bash -c '...'`` to fan out to all nodes. | + + +#### `teacher_inference.experimental` + + +#### `teacher_inference.deployment` + +Discriminated union — set `teacher_inference.deployment.type` to one of `single_node`, `multi_node`, `disaggregated` and provide the matching sub-fields. 
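As a hedged illustration of the discriminated union described above, a TOML fragment selecting the `multi_node` variant might look like the sketch below. It assumes the dotted field paths in this reference map directly onto TOML tables, which this page does not confirm; the values are illustrative and taken from the field tables.

```toml
# Sketch only: assumes dotted field paths map onto TOML tables.
[teacher_inference.deployment]
type = "multi_node"            # selects MultiNodeInferenceDeploymentConfig
gpus_per_node = 8              # documented default
num_nodes = 2                  # documented constraint: >= 1
router_port = 8000             # port for the vllm-router
backend_port = 8100            # port for vLLM backend instances
router_policy = "round_robin"  # documented default is "consistent_hash"
```

Fields such as `num_prefill_nodes` belong to the `disaggregated` variant and are not part of `multi_node`.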
+ + +##### `teacher_inference.deployment.type = "single_node"` (SingleNodeInferenceDeploymentConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `teacher_inference.deployment.gpus_per_node` | int | `8` | GPUs per node. | +| `teacher_inference.deployment.type` | 'single_node' | `'single_node'` | | + + +##### `teacher_inference.deployment.type = "multi_node"` (MultiNodeInferenceDeploymentConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `teacher_inference.deployment.gpus_per_node` | int | `8` | GPUs per node. | +| `teacher_inference.deployment.type` | 'multi_node' | `'multi_node'` | | +| `teacher_inference.deployment.num_nodes` | int | `2` | _≥1._ Inference nodes. | +| `teacher_inference.deployment.router_port` | int | `8000` | Port for the vllm-router. | +| `teacher_inference.deployment.backend_port` | int | `8100` | Port for vLLM backend instances. | +| `teacher_inference.deployment.router_policy` | str | `'consistent_hash'` | vllm-router routing policy (e.g. ``consistent_hash``, ``round_robin``). | + + +##### `teacher_inference.deployment.type = "disaggregated"` (DisaggregatedInferenceDeploymentConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `teacher_inference.deployment.gpus_per_node` | int | `8` | GPUs per node. | +| `teacher_inference.deployment.type` | 'disaggregated' | `'disaggregated'` | | +| `teacher_inference.deployment.num_prefill_nodes` | int | `1` | _≥1._ Total prefill nodes. | +| `teacher_inference.deployment.num_decode_nodes` | int | `1` | _≥1._ Total decode nodes. | +| `teacher_inference.deployment.num_prefill_replicas` | int | `1` | _≥1._ Independent prefill vLLM instances. Must evenly divide ``num_prefill_nodes``. | +| `teacher_inference.deployment.num_decode_replicas` | int | `1` | _≥1._ Independent decode vLLM instances. Must evenly divide ``num_decode_nodes``. | +| `teacher_inference.deployment.router_port` | int | `8000` | Port for the vllm-router on each replica. 
| +| `teacher_inference.deployment.prefill_port` | int | `8100` | Port for prefill vLLM instances. | +| `teacher_inference.deployment.decode_port` | int | `8200` | Port for decode vLLM instances. | +| `teacher_inference.deployment.router_policy` | str | `'consistent_hash'` | vllm-router routing policy (e.g. ``consistent_hash``, ``round_robin``). | +| `teacher_inference.deployment.prefill_env_overrides` | dict[str, str] | `{}` | Extra environment variables exported only on prefill nodes. | +| `teacher_inference.deployment.decode_env_overrides` | dict[str, str] | `{}` | Extra environment variables exported only on decode nodes. | + + +### `log` + +Shared log config. Propagated to trainer and orchestrator. + +| Field | Type | Default | Description | +|---|---|---|---| +| `log.level` | str \| None | `None` | Log level for trainer and orchestrator. When unset, each sub-config's own log level applies (defaults to ``$PRIME_LOG_LEVEL`` if set, else ``info``). | +| `log.json_logging` | bool | `False` | Emit newline-delimited JSON logs for aggregation (Loki, Grafana, etc.). | + + +### `ckpt` + +Shared checkpoint config. If None, falls back to the sub-config checkpoint settings. + +| Field | Type | Default | Description | +|---|---|---|---| +| `ckpt.output_dir` | Path \| None | `None` | Override directory for checkpoints and weights. When set, checkpoints and weight snapshots are written here instead of under the trainer ``output_dir``. | +| `ckpt.interval` | int \| None | `None` | Interval at which to save checkpoints. | +| `ckpt.resume_step` | int \| None | `None` | Step to resume from. If None, does not resume from a checkpoint. | +| `ckpt.keep_last` | int \| None | `None` | _≥1._ Keep at most this many recent step checkpoints on disk. If None, never clean old checkpoints based on recency. | +| `ckpt.keep_interval` | int \| None | `None` | _≥1._ Keep checkpoints at every N steps permanently (e.g. ``keep_interval=100`` keeps step 100, 200, ...). 
If None, no interval-based keeping. | + + +### `wandb` + +Shared W&B config. If None, falls back to the sub-config W&B settings. + +| Field | Type | Default | Description | +|---|---|---|---| +| `wandb.project` | str \| None | `'prime-rl'` | W&B project. | +| `wandb.entity` | str \| None | `None` | W&B entity. | +| `wandb.name` | str \| None | `None` | W&B run name. | +| `wandb.group` | str \| None | `None` | W&B group. | +| `wandb.tags` | list[str] \| None | `None` | W&B tags attached to the run. | +| `wandb.offline` | bool \| None | `False` | Run W&B in offline mode. | +| `wandb.shared` | bool | `True` | Log trainer and orchestrator metrics to a single shared W&B run. Requires wandb SDK ≥ 0.19.9. Incompatible with offline mode. | + + +### `model` + +Shared model config. If None, falls back to the sub-config model settings. + +| Field | Type | Default | Description | +|---|---|---|---| +| `model.name` | str | `'Qwen/Qwen3-0.6B'` | HF model name or local path. | + + +#### `model.vlm` + +VLM configuration. Set this to enable vision-language model support. + +| Field | Type | Default | Description | +|---|---|---|---| +| `model.vlm.vision_encoder_attr` | str | *required* | Dotted attribute path to the vision encoder module (e.g. ``model.visual``). | +| `model.vlm.language_model_attr` | str | *required* | Dotted attribute path to the language model module (e.g. ``model.language_model``). | +| `model.vlm.freeze_vision_encoder` | bool | `True` | Freeze the vision encoder. When False, it is trainable and FSDP-sharded per-block. No effect with LoRA (LoRA freezes all non-adapter parameters). | + + +### `tokenizer` + +Shared tokenizer config. Propagated to trainer, orchestrator, and inference. If None, each component uses its own tokenizer config (defaulting to model name). + +| Field | Type | Default | Description | +|---|---|---|---| +| `tokenizer.name` | str \| None | `None` | Tokenizer name or path. If None, the model's default tokenizer is used. 
| +| `tokenizer.trust_remote_code` | bool \| None | `None` | Trust remote code when initializing the tokenizer. If None, inherits the model's ``trust_remote_code`` setting. | +| `tokenizer.chat_template` | str \| None | `None` | Chat template for the tokenizer. Either a Jinja2 template string or a path to a template file. If None, the tokenizer's default chat template is used. | + + +### `weight_broadcast` + +| Field | Type | Default | Description | +|---|---|---|---| +| `weight_broadcast.type` | 'nccl' \| 'filesystem' | `'filesystem'` | Weight broadcast transport. | +| `weight_broadcast.port` | int | `29501` | Port for NCCL weight broadcast. | +| `weight_broadcast.timeout` | int | `1200` | Timeout in seconds for NCCL weight broadcast. | +| `weight_broadcast.quantize_in_weight_transfer` | bool | `False` | Use kernel-format FP8 quantized NCCL transfer for weight updates. When disabled, uses default HF checkpoint-format transfer. | + + +### `slurm` + +SLURM configuration. If None, runs locally. + +| Field | Type | Default | Description | +|---|---|---|---| +| `slurm.job_name` | str | `'prime-rl'` | SLURM job name. | +| `slurm.project_dir` | Path | `'.'` | Path to the project root, used to source .env, activate .venv, and run uv sync. | +| `slurm.template_path` | Path \| None | `None` | SLURM template file. If None, uses the bundled single-node or multi-node template. | +| `slurm.partition` | str | `'cluster'` | SLURM partition (#SBATCH --partition). | +| `slurm.nodelist` | str \| None | `None` | Comma-separated list of specific nodes to run on (#SBATCH --nodelist). | +| `slurm.exclude` | str \| None | `None` | Comma-separated list of nodes to exclude (#SBATCH --exclude). | +| `slurm.account` | str \| None | `None` | SLURM account to charge (#SBATCH --account). | +| `slurm.time` | str \| None | `None` | Maximum wall time, e.g. '24:00:00' or '7-00:00:00' (#SBATCH --time). 
| +| `slurm.pre_run_command` | str \| None | `None` | Shell command to run on the head node after cd, .env sourcing, and venv activation. Useful for cleanup like ``sudo pkill -f vllm``; wrap with ``srun bash -c '...'`` to fan out to all nodes. | + + +### `experimental` + + +### `deployment` + +Discriminated union — set `deployment.type` to one of `single_node`, `multi_node` and provide the matching sub-fields. + + +#### `deployment.type = "single_node"` (SingleNodeDeploymentConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `deployment.gpus_per_node` | int | `8` | GPUs per node. | +| `deployment.type` | 'single_node' | `'single_node'` | | +| `deployment.num_train_gpus` | int | `1` | GPUs allocated to the trainer. | +| `deployment.num_infer_gpus` | int | `1` | GPUs allocated to inference. | +| `deployment.num_teacher_gpus` | int \| None | `None` | GPUs allocated to teacher inference (None disables the teacher server). | + + +#### `deployment.type = "multi_node"` (MultiNodeDeploymentConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `deployment.gpus_per_node` | int | `8` | GPUs per node. | +| `deployment.type` | 'multi_node' | `'multi_node'` | | +| `deployment.num_train_nodes` | int | *required* | Training nodes. | +| `deployment.num_infer_nodes` | int | *required* | _≥0._ Inference nodes per replica. Set to 0 to skip inference and orchestrator (requires fake data). | +| `deployment.num_infer_replicas` | int | `1` | _≥1._ Independent inference replicas. Total inference nodes = ``num_infer_nodes * num_infer_replicas``. | +| `deployment.num_teacher_nodes` | int \| None | `None` | Teacher inference nodes. | +| `deployment.nodes_per_fsdp_group` | int \| None | `None` | Training nodes per FSDP island. Auto-sets ``trainer.dp_replicate = num_train_nodes / nodes_per_fsdp_group``. | + + +## `sft` — Supervised fine-tuning + +The `sft` entrypoint runs supervised fine-tuning on a tokenized dataset. 
+ +_Defined in_ `prime_rl.configs.sft.SFTConfig`. + +| Field | Type | Default | Description | +|---|---|---|---| +| `use_renderer` | bool | `False` | Tokenize SFT samples through the ``renderers`` library (single ``render()`` + ``message_indices`` mask) instead of the default ``build_incremental_token_mask`` path. Required for chat templates that render position-dependently (e.g. Qwen3, Qwen3.5). | +| `output_dir` | Path | `'outputs'` | Directory to write outputs to — checkpoints and logs are written as subdirectories. Should be a persistent directory with enough disk space and unique per experiment running on a single node. | +| `clean_output_dir` | bool | `False` | Delete the output directory before starting training. Required to overwrite an output directory that contains checkpoints from a previous run when not resuming. | +| `matmul_precision` | 'highest' \| 'high' \| 'medium' | `'high'` | Precision for float32 matrix multiplications. ``highest`` is full FP32 (required on ROCm/AMD GPUs to avoid catastrophic precision loss in softmax over large vocabularies). ``high`` enables TF32 on NVIDIA GPUs for a speedup with minor precision tradeoff. See ``torch.set_float32_matmul_precision``. | +| `max_steps` | int \| None | `None` | Maximum training steps. If None, runs indefinitely. | +| `memory_profiler_path` | Path \| None | `None` | Path to write the memory profile to. | +| `trace_path` | Path \| None | `None` | Path to write the PyTorch profiler trace to. | +| `dist_timeout_seconds` | int | `600` | Timeout in seconds for torch distributed ops. | +| `loss_impl` | 'liger' \| 'torch' \| 'liger_fused' \| 'quack_fused' | `'torch'` | Cross-entropy loss implementation. ``liger_fused`` fuses the lm_head projection with the CE loss to avoid materializing full logits. ``quack_fused`` uses quack-kernels for chunked linear + CE with CuTe DSL CUDA kernels. | +| `dry_run` | bool | `False` | Only validate and dump resolved configs, then exit early. 
| + + +### `model` + +| Field | Type | Default | Description | +|---|---|---|---| +| `model.name` | str | `'Qwen/Qwen3-0.6B'` | HF model name or local path. | +| `model.trust_remote_code` | bool | `False` | Trust remote code when initializing the tokenizer. | +| `model.seq_len` | int | `2048` | Sequence length the model is trained on. | +| `model.attn` | 'eager' \| 'sdpa' \| 'flash_attention_2' \| 'flash_attention_3' \| 'fa4' | `'flash_attention_2'` | Attention implementation. With CP enabled, ring attention uses the matching kernel family (FA2/FA3/FA4). | +| `model.fsdp_cpu_offload` | bool | `False` | Enable FSDP CPU offloading for parameters, gradients, and optimizer states. Uses pinned memory for efficient CPU↔GPU transfers. | +| `model.optim_cpu_offload` | bool | `False` | Offload only optimizer states (momentum, variance) to CPU, keeping weights on GPU. Avoids the H2D all-gather overhead of FSDP CPU offload while still saving GPU memory. | +| `model.reshard_after_forward` | bool | `True` | Reshard the model after each forward pass. | +| `model.dp_replicate` | int | `1` | Data parallel dim where model weights are replicated. | +| `model.ep` | int | `1` | Expert parallelism degree for MoE layers. 1 disables EP. | +| `model.ep_comm_backend` | 'torch' \| 'deepep' | `'torch'` | Communication backend for expert parallelism. ``torch`` uses TorchTitan all-to-all collectives; ``deepep`` uses DeepEP custom kernels. | +| `model.deepep_num_sms` | int | `20` | _≥1._ SMs allocated for DeepEP intranode dispatch/combine kernels. Also determines internode RDMA channel count (``num_channels = num_sms / 2``). Lower values leave more SMs for compute; higher values speed up dispatch/combine. The optimal value depends on EP degree and hardware. Only used when ``ep_comm_backend='deepep'``. | +| `model.deepep_token_chunk_size` | int \| None | `None` | _≥1._ Token chunk size for DeepEP MoE pipelining. When set, DeepEP dispatch for chunk i+1 is launched while experts compute chunk i. 
Only used when ``ep_comm_backend='deepep'``. | +| `model.cp` | int | `1` | Context parallelism degree. 1 disables CP. | +| `model.cp_style` | 'ring' \| 'ulysses' | `'ring'` | CP communication style. ``ring`` uses ring-attention all-gather/reduce-scatter (requires custom kernels per attention type). ``ulysses`` uses all-to-all to redistribute Q/K/V from sequence-sharded to head-sharded, runs vanilla attention locally on the full sequence, then all-to-all back — works out-of-the-box with any attention kernel (softmax FA, linear attention, mamba, etc.). | +| `model.impl` | 'hf' \| 'custom' \| 'auto' | `'auto'` | Model implementation. ``auto`` selects ``custom`` if supported by the model, otherwise ``hf``. | +| `model.optimization_dtype` | 'bfloat16' \| 'float32' | `'float32'` | dtype for model optimization. | +| `model.reduce_dtype` | 'bfloat16' \| 'float32' | `'float32'` | dtype for gradient/parameter reductions. | +| `model.moe_use_grouped_mm` | bool | `True` | Use grouped mm for MoE layers. Requires compute capability ≥ 9.0. | +| `model.fp8` | bool | `False` | FP8 training via DeepGEMM. Replaces ``nn.Linear`` with FP8 blockwise linear and uses FP8 grouped GEMM for MoE experts. Requires SM90 (Hopper) GPUs and ``model.impl='custom'``. | +| `model.freeze_moe_router` | bool | `False` | Freeze MoE router parameters during training. | +| `model.fused_lm_head_token_chunk_size` | int \| 'auto' \| 'disabled' | `'disabled'` | Flattened token chunk size for the fused LM head. ``int >= 1`` sets the tokens per LM-head chunk explicitly; ``auto`` auto-enables (RL training picks 8192); ``disabled`` uses the vanilla LM head. Integer values aren't supported for SFT training. | + + +#### `model.vlm` + +VLM configuration. Setting this enables vision-language model support. + +| Field | Type | Default | Description | +|---|---|---|---| +| `model.vlm.vision_encoder_attr` | str | *required* | Dotted attribute path to the vision encoder module (e.g. ``model.visual``). 
| +| `model.vlm.language_model_attr` | str | *required* | Dotted attribute path to the language model module (e.g. ``model.language_model``). | +| `model.vlm.freeze_vision_encoder` | bool | `True` | Freeze the vision encoder. When False, it is trainable and FSDP-sharded per-block. No effect with LoRA (LoRA freezes all non-adapter parameters). | + + +#### `model.compile` + +Compile the model with ``torch.compile``. + +| Field | Type | Default | Description | +|---|---|---|---| +| `model.compile.fullgraph` | bool | `False` | Compile transformer blocks with ``fullgraph=True``. | + + +#### `model.ac` + +Activation checkpointing configuration. If None, activation checkpointing is disabled. + +| Field | Type | Default | Description | +|---|---|---|---| +| `model.ac.mode` | 'full' \| 'selective' | `'full'` | ``full`` checkpoints whole transformer blocks; ``selective`` checkpoints only the subcomponents listed in ``targets`` inside supported custom decoder layers. | +| `model.ac.freq` | int | `1` | _≥1._ Apply activation checkpointing to every N layers. | +| `model.ac.targets` | list[str] | `['norm']` | Selective checkpoint targets. ``norm`` checkpoints every norm module inside selected layers. ``attn_proj`` checkpoints projection-side attention work outside the kernel (input/output projections, attention-local norms, RoPE, gating, model-specific MLA projection helpers). ``mlp`` checkpoints the entire dense MLP forward (not for MoE). ``mla_up_proj`` checkpoints MLA Q/KV up-projection where supported. ``routed_experts`` checkpoints routed expert compute in MoE layers (including LatentMoE). ``linear_attn`` checkpoints non-softmax token mixers (NemotronH Mamba, Qwen3.5-MoE GatedDeltaNet, AFMoE sliding-window attention). | + + +#### `model.ac_offloading` + +Activation offloading configuration. If None, activation offloading is disabled. 
+ +| Field | Type | Default | Description | +|---|---|---|---| +| `model.ac_offloading.pin_memory` | bool | `True` | Pin offloaded activations to CPU memory. | +| `model.ac_offloading.max_inflight_activations` | int | `5` | _≥1._ Max activations kept in flight while offloading. More activations smooth overlap at the cost of GPU memory. | + + +#### `model.lora` + +LoRA configuration. If None, LoRA is disabled. + +| Field | Type | Default | Description | +|---|---|---|---| +| `model.lora.rank` | int | `16` | _≥1._ Rank of the low-rank decomposition matrices. | +| `model.lora.alpha` | float | `32.0` | _≥0._ LoRA scaling parameter. | +| `model.lora.dropout` | float | `0.0` | _≥0, ≤1._ LoRA dropout rate. | +| `model.lora.target_modules` | list[str] | `['q_proj', 'k_proj', 'v_proj', 'o_proj', 'gate_proj', 'up_proj', 'down_proj', 'experts', 'fc1_latent_proj', 'fc2_latent_proj']` | Module names or regex patterns to apply LoRA to. Simple names (e.g. ``q_proj``) match any component in the module path; regex patterns match anywhere in the name. Names unknown to the current model are silently ignored, so defaults cover multiple architectures. NemotronH note: ``experts`` matches NonGatedGroupedExperts inside LatentMoE; ``fc1_latent_proj``/``fc2_latent_proj`` adapt the latent up/down projections. Add ``in_proj``/``out_proj`` to also LoRA Mamba. | +| `model.lora.modules_to_save` | list[str] | `[]` | Module names or regex patterns to keep fully trainable (not freeze). Same matching rules as ``target_modules``. | + + +#### `model.debug` + +Debugging knobs for the model and distributed training. + +| Field | Type | Default | Description | +|---|---|---|---| +| `model.debug.num_layers` | int \| None | `None` | Override the number of transformer layers (truncates the model). | +| `model.debug.random_init` | bool | `False` | Randomly initialize the model instead of loading weights. 
| +| `model.debug.force_balanced_routing` | bool | `False` | Replace MoE token-choice routing with a round-robin assignment so every expert sees an equal share. Intended for fake-data smoke tests where untrained routing would otherwise OOM under severe imbalance. Gating scores are still gathered from the override indices so the forward pass stays consistent. | + + +### `tokenizer` + +| Field | Type | Default | Description | +|---|---|---|---| +| `tokenizer.name` | str \| None | `None` | Tokenizer name or path. If None, the model's default tokenizer is used. | +| `tokenizer.trust_remote_code` | bool \| None | `None` | Trust remote code when initializing the tokenizer. If None, inherits the model's ``trust_remote_code`` setting. | +| `tokenizer.chat_template` | str \| None | `None` | Chat template for the tokenizer. Either a Jinja2 template string or a path to a template file. If None, the tokenizer's default chat template is used. | + + +### `renderer` + +Client-side renderer configuration. Only consumed when ``use_renderer=true``. + +| Field | Type | Default | Description | +|---|---|---|---| +| `renderer.name` | str | `'auto'` | Renderer used for chat-template tokenization. One of: ``auto`` (detect from tokenizer), ``qwen3``, ``qwen3_vl``, ``qwen3.5``, ``glm5``, ``glm4.5``, ``minimax-m2``, ``deepseek_v3``, ``kimi_k2``, ``kimi_k25``, ``nemotron3``, ``gpt_oss``, ``default``. | +| `renderer.tool_parser` | str \| None | `None` | Tool parser from ``renderers.parsers``. Only consumed by DefaultRenderer; model-specific renderers bake their own parsing in. Options: ``qwen3``, ``qwen3.5``, ``glm``, ``deepseek_v3``. | +| `renderer.reasoning_parser` | str \| None | `None` | Reasoning parser from ``renderers.parsers``. Only consumed by DefaultRenderer. Options: ``think``. | +| `renderer.pool_size` | int \| None | `None` | _≥1._ Number of renderer slots shared across concurrent rollouts. Bump for long multi-turn prompts where client-side jinja tokenization serializes. 
| +| `renderer.preserve_all_thinking` | bool | `False` | Re-emit every past-assistant turn's ``reasoning_content`` between ``<think>``/``</think>`` (or the model's equivalent), even when the chat template would drop it. Strict superset of preserve_thinking_between_tool_calls. | +| `renderer.preserve_thinking_between_tool_calls` | bool | `False` | Preserve past-assistant ``reasoning_content`` only inside the current tool cycle — the contiguous assistant→tool→…→assistant block after the most recent user message, when that block contains at least one tool response. A new user turn closes the block. | + + +### `val` + +Validation configuration. If None, no validation runs. + +| Field | Type | Default | Description | +|---|---|---|---| +| `val.interval` | int | `50` | _≥1._ Run validation every N training steps. | +| `val.eval_on_start` | bool | `False` | Run validation before the first training step. | + + +#### `val.data` + +| Field | Type | Default | Description | +|---|---|---|---| +| `val.data.batch_size` | int | `128` | _≥1._ Global batch size. | +| `val.data.seq_len` | int | `128` | _≥1._ Sequence length. | +| `val.data.pack_function` | 'cat' \| 'stack' | `'cat'` | Sample packing strategy. ``cat`` concatenates; ``stack`` requires ``seq_len`` divisible by 256. | +| `val.data.micro_batch_size` | int | `1` | _≥1._ Per-step micro batch size. ``batch_size`` must be divisible by this. | +| `val.data.type` | 'sft' | `'sft'` | | +| `val.data.name` | str | `'PrimeIntellect/Reverse-Text-SFT'` | HF dataset name or path. | +| `val.data.subsets` | list[str] \| None | `None` | Subsets to load from the HF dataset. | +| `val.data.splits` | list[str] \| None | `None` | Splits to load from the HF dataset. | +| `val.data.probabilities` | list[float] \| None | `None` | Sampling probabilities for each subset/split. | +| `val.data.stopping_strategy` | 'first_exhausted' \| 'all_exhausted' | `'all_exhausted'` | Stopping strategy when interleaving multiple subsets/splits. 
| +| `val.data.shuffle` | bool | `True` | Shuffle the dataset at the start of each epoch. | +| `val.data.seed` | int | `0` | Random seed for shuffling. Re-shuffled per epoch by adding the epoch count to the seed. | + + +##### `val.data.loss_mask` + +Which message types contribute to the loss. + +| Field | Type | Default | Description | +|---|---|---|---| +| `val.data.loss_mask.system` | bool | `False` | System messages contribute to the loss. | +| `val.data.loss_mask.user` | bool | `False` | User messages contribute to the loss. | +| `val.data.loss_mask.assistant` | bool | `True` | Assistant messages contribute to the loss. | +| `val.data.loss_mask.tool` | bool | `False` | Tool messages contribute to the loss. | + + +### `ckpt` + +| Field | Type | Default | Description | +|---|---|---|---| +| `ckpt.output_dir` | Path \| None | `None` | Override directory for checkpoints and weights. If set, checkpoints and weight snapshots are written here instead of under the trainer ``output_dir`` — useful for writing large checkpoints to a separate storage volume. | +| `ckpt.interval` | int \| None | `None` | _≥1._ Interval at which to save the training checkpoint. If None, only checkpoints at the end of training. | +| `ckpt.skip_gather_master_weights` | bool | `False` | Skip gathering and saving HF-compatible weight checkpoints. Useful for large models where the gather is expensive and only DCP checkpoints are needed. | +| `ckpt.weights_only` | bool | `False` | Save only weight checkpoints (no optimizer/scheduler state). Much faster and smaller than full checkpoints, but cannot resume training. | +| `ckpt.resume_step` | int \| None | `None` | _≥-1._ Step to resume training from. None starts from scratch; ``-1`` restarts from the latest checkpoint available. | +| `ckpt.keep_last` | int \| None | `None` | _≥1._ Keep at most this many recent step checkpoints on disk. If None, never clean old checkpoints based on recency. 
| +| `ckpt.keep_interval` | int \| None | `None` | _≥1._ Keep checkpoints at every N steps permanently (e.g. ``keep_interval=100`` keeps step 100, 200, ...). If None, no interval-based keeping. | +| `ckpt.skip_progress` | bool | `False` | Skip loading the progress from checkpoint. | +| `ckpt.skip_scheduler` | bool | `False` | Skip loading the scheduler from checkpoint. | +| `ckpt.skip_dataloader` | bool | `False` | Skip loading the dataloader from checkpoint. | +| `ckpt.skip_optimizer` | bool | `False` | Skip loading the optimizer state from checkpoint. | + + +#### `ckpt.weights` + +Weight-checkpoint sub-configuration. If None, no HF-compatible weight checkpoints are written. + +| Field | Type | Default | Description | +|---|---|---|---| +| `ckpt.weights.save_sharded` | bool | `True` | Save the weight checkpoint in sharded format. | +| `ckpt.weights.save_format` | 'safetensors' \| 'torch' | `'safetensors'` | Weight checkpoint serialization format. | +| `ckpt.weights.save_adapter_separately` | bool | `False` | Save LoRA adapters separately before merging into full model weights. | + + +### `log` + +| Field | Type | Default | Description | +|---|---|---|---| +| `log.level` | str | `'info'` | Log level for the process. Defaults to ``$PRIME_LOG_LEVEL`` if set, else ``info``. | +| `log.vf_level` | str | `'info'` | Log level for the verifiers package. Defaults to ``$PRIME_VF_LOG_LEVEL`` if set, else ``info``. | +| `log.json_logging` | bool | `False` | Emit newline-delimited JSON logs for aggregation (Loki, Grafana, etc.). | +| `log.log_data` | bool | `False` | Log the first data sample at startup. | +| `log.ranks_filter` | list[int] | `[0]` | Trainer ranks to show in console output. Passed to ``torchrun --local-ranks-filter``. | + + +### `wandb` + +| Field | Type | Default | Description | +|---|---|---|---| +| `wandb.project` | str | `'prime-rl'` | W&B project to log to. | +| `wandb.entity` | str \| None | `None` | W&B entity to log to. 
| +| `wandb.name` | str \| None | `None` | W&B run name. | +| `wandb.group` | str \| None | `None` | W&B group. | +| `wandb.tags` | list[str] \| None | `None` | W&B tags attached to the run. | +| `wandb.offline` | bool | `False` | Run W&B in offline mode. | + + +### `bench` + +Benchmark-mode configuration. When set, ``max_steps`` is forced to 4 and fake data is used. + +| Field | Type | Default | Description | +|---|---|---|---| +| `bench.output_json` | Path \| None | `None` | Path to write benchmark results as JSON. If unset, results are only printed to the console. | + + +### `gc` + +Garbage collection config. Disables automatic GC and runs deterministic collections every N steps to avoid stragglers. Set to null to use Python's default GC behavior. + +| Field | Type | Default | Description | +|---|---|---|---| +| `gc.interval` | int | `50` | _≥1._ Run garbage collection every N training steps. Disables Python's automatic GC so every rank collects together and one slow rank can't stall the others. | + + +### `heartbeat` + +BetterStack heartbeat configuration for monitoring training progress. + +| Field | Type | Default | Description | +|---|---|---|---| +| `heartbeat.url` | str | *required* | URL to send the heartbeat to. | + + +### `slurm` + +SLURM configuration. When set, the run is submitted as a SLURM job instead of running locally. + +| Field | Type | Default | Description | +|---|---|---|---| +| `slurm.job_name` | str | `'prime-rl'` | SLURM job name. | +| `slurm.project_dir` | Path | `'.'` | Path to the project root, used to source .env, activate .venv, and run uv sync. | +| `slurm.template_path` | Path \| None | `None` | SLURM template file. If None, uses the bundled single-node or multi-node template. | +| `slurm.partition` | str | `'cluster'` | SLURM partition (#SBATCH --partition). | +| `slurm.nodelist` | str \| None | `None` | Comma-separated list of specific nodes to run on (#SBATCH --nodelist). 
| +| `slurm.exclude` | str \| None | `None` | Comma-separated list of nodes to exclude (#SBATCH --exclude). | +| `slurm.account` | str \| None | `None` | SLURM account to charge (#SBATCH --account). | +| `slurm.time` | str \| None | `None` | Maximum wall time, e.g. '24:00:00' or '7-00:00:00' (#SBATCH --time). | +| `slurm.pre_run_command` | str \| None | `None` | Shell command to run on the head node after cd, .env sourcing, and venv activation. Useful for cleanup like ``sudo pkill -f vllm``; wrap with ``srun bash -c '...'`` to fan out to all nodes. | + + +### `experimental` + + +### `data` + +Discriminated union — set `data.type` to one of `fake`, `sft` and provide the matching sub-fields. + + +#### `data.type = "fake"` (FakeDataConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `data.batch_size` | int | `128` | _≥1._ Global batch size. | +| `data.seq_len` | int | `128` | _≥1._ Sequence length. | +| `data.pack_function` | 'cat' \| 'stack' | `'cat'` | Sample packing strategy. ``cat`` concatenates; ``stack`` requires ``seq_len`` divisible by 256. | +| `data.micro_batch_size` | int | `1` | _≥1._ Per-step micro batch size. ``batch_size`` must be divisible by this. | +| `data.type` | 'fake' | `'fake'` | | +| `data.length` | 'fixed' \| 'variable' | `'fixed'` | Use fixed-length samples or variable-length samples. | +| `data.input_ids` | 'increasing' \| 'random' | `'increasing'` | Token id generator: ``increasing`` for deterministic sequences, ``random`` for random ids. | + + +#### `data.type = "sft"` (SFTDataConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `data.batch_size` | int | `128` | _≥1._ Global batch size. | +| `data.seq_len` | int | `128` | _≥1._ Sequence length. | +| `data.pack_function` | 'cat' \| 'stack' | `'cat'` | Sample packing strategy. ``cat`` concatenates; ``stack`` requires ``seq_len`` divisible by 256. | +| `data.micro_batch_size` | int | `1` | _≥1._ Per-step micro batch size. 
``batch_size`` must be divisible by this. | +| `data.type` | 'sft' | `'sft'` | | +| `data.name` | str | `'PrimeIntellect/Reverse-Text-SFT'` | HF dataset name or path. | +| `data.subsets` | list[str] \| None | `None` | Subsets to load from the HF dataset. | +| `data.splits` | list[str] \| None | `None` | Splits to load from the HF dataset. | +| `data.probabilities` | list[float] \| None | `None` | Sampling probabilities for each subset/split. | +| `data.stopping_strategy` | 'first_exhausted' \| 'all_exhausted' | `'all_exhausted'` | Stopping strategy when interleaving multiple subsets/splits. | +| `data.shuffle` | bool | `True` | Shuffle the dataset at the start of each epoch. | +| `data.seed` | int | `0` | Random seed for shuffling. Re-shuffled per epoch by adding the epoch count to the seed. | + + +##### `data.loss_mask` + +Which message types contribute to the loss. + +| Field | Type | Default | Description | +|---|---|---|---| +| `data.loss_mask.system` | bool | `False` | System messages contribute to the loss. | +| `data.loss_mask.user` | bool | `False` | User messages contribute to the loss. | +| `data.loss_mask.assistant` | bool | `True` | Assistant messages contribute to the loss. | +| `data.loss_mask.tool` | bool | `False` | Tool messages contribute to the loss. | + + +### `optim` + +Discriminated union — set `optim.type` to one of `sgd`, `adamw`, `muon`, `sign_sgd` and provide the matching sub-fields. + + +#### `optim.type = "sgd"` (SGDConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `optim.lr` | float | `1e-06` | _≥0._ Peak learning rate. | +| `optim.weight_decay` | float | `0.01` | _≥0._ L2 weight-decay coefficient. | +| `optim.max_norm` | float \| None | `1.0` | _≥0._ Maximum gradient norm to clip to. If None, gradient clipping is disabled. | +| `optim.type` | 'sgd' | `'sgd'` | | +| `optim.nesterov` | bool | `True` | Use Nesterov momentum. | +| `optim.momentum` | float | `0.9` | SGD momentum factor. 
| + + +#### `optim.type = "adamw"` (AdamWConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `optim.lr` | float | `1e-06` | _≥0._ Peak learning rate. | +| `optim.weight_decay` | float | `0.01` | _≥0._ L2 weight-decay coefficient. | +| `optim.max_norm` | float \| None | `1.0` | _≥0._ Maximum gradient norm to clip to. If None, gradient clipping is disabled. | +| `optim.type` | 'adamw' | `'adamw'` | | +| `optim.betas1` | float | `0.9` | _≥0._ Adam first-moment (β1) decay. | +| `optim.betas2` | float | `0.999` | _≥0._ Adam second-moment (β2) decay. | + + +#### `optim.type = "muon"` (MuonConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `optim.lr` | float | `1e-06` | _≥0._ Peak learning rate. | +| `optim.weight_decay` | float | `0.01` | _≥0._ L2 weight-decay coefficient. | +| `optim.max_norm` | float \| None | `1.0` | _≥0._ Maximum gradient norm to clip to. If None, gradient clipping is disabled. | +| `optim.type` | 'muon' | `'muon'` | | +| `optim.mu` | float | `0.95` | _≥0._ Momentum factor for the Muon algorithm. | +| `optim.betas1` | float | `0.9` | _≥0._ β1 for the AdamW/Lion sub-optimizer used on non-Muon params. | +| `optim.betas2` | float | `0.95` | _≥0._ β2 for the AdamW/Lion sub-optimizer used on non-Muon params. | + + +#### `optim.type = "sign_sgd"` (SignSGDConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `optim.lr` | float | `1e-06` | _≥0._ Peak learning rate. | +| `optim.weight_decay` | float | `0.01` | _≥0._ L2 weight-decay coefficient. | +| `optim.max_norm` | float \| None | `1.0` | _≥0._ Maximum gradient norm to clip to. If None, gradient clipping is disabled. | +| `optim.type` | 'sign_sgd' | `'sign_sgd'` | | + + +### `scheduler` + +Discriminated union — set `scheduler.type` to one of `constant`, `linear`, `cosine` and provide the matching sub-fields. 
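The `type` discriminator used by these union sections can be pictured with a small stdlib-only sketch. This is not the project's implementation (the reference is generated from Pydantic models); the dataclass names and the `parse_scheduler` helper are hypothetical stand-ins, and only the field names and defaults are taken from the scheduler tables in this section.

```python
from dataclasses import dataclass
from typing import Any, Union


# Hypothetical stand-ins for the three scheduler variants; the field
# names and defaults mirror the reference tables, nothing else is implied.
@dataclass
class ConstantScheduler:
    type: str = "constant"


@dataclass
class LinearScheduler:
    type: str = "linear"
    warmup_steps: int = 10
    decay_steps: int = 10
    min_lr: float = 0.0


@dataclass
class CosineScheduler:
    type: str = "cosine"
    warmup_steps: int = 10
    min_lr: float = 0.0


SchedulerConfig = Union[ConstantScheduler, LinearScheduler, CosineScheduler]

# Map each discriminator value to the variant that owns its sub-fields.
_VARIANTS = {
    "constant": ConstantScheduler,
    "linear": LinearScheduler,
    "cosine": CosineScheduler,
}


def parse_scheduler(raw: dict[str, Any]) -> SchedulerConfig:
    """Pick the variant named by raw["type"] and reject foreign keys."""
    kind = raw.get("type")
    if kind not in _VARIANTS:
        raise ValueError(f"unknown scheduler.type: {kind!r}")
    try:
        return _VARIANTS[kind](**raw)
    except TypeError as exc:  # a key that does not belong to this variant
        raise ValueError(f"invalid fields for scheduler.type={kind!r}: {exc}") from exc
```

In this sketch, `parse_scheduler({"type": "linear", "warmup_steps": 100})` yields a `LinearScheduler` with the remaining fields at their defaults, while passing `decay_steps` together with `type = "cosine"` is rejected because that field belongs to a different variant.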
+ + +#### `scheduler.type = "constant"` (ConstantSchedulerConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `scheduler.type` | 'constant' | `'constant'` | | + + +#### `scheduler.type = "linear"` (LinearSchedulerConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `scheduler.type` | 'linear' | `'linear'` | | +| `scheduler.warmup_steps` | int | `10` | _≥0._ Warmup steps for the learning rate scheduler. | +| `scheduler.decay_steps` | int | `10` | _≥0._ Steps to decay the learning rate during the final portion of training. | +| `scheduler.min_lr` | float | `0.0` | _≥0._ Minimum learning rate to converge to. | + + +#### `scheduler.type = "cosine"` (CosineSchedulerConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `scheduler.type` | 'cosine' | `'cosine'` | | +| `scheduler.warmup_steps` | int | `10` | _≥0._ Warmup steps for the learning rate scheduler. | +| `scheduler.min_lr` | float | `0.0` | _≥0._ Minimum learning rate to converge to. | + + +### `deployment` + +Discriminated union — set `deployment.type` to one of `single_node`, `multi_node` and provide the matching sub-fields. + + +#### `deployment.type = "single_node"` (SingleNodeDeploymentConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `deployment.gpus_per_node` | int | `8` | GPUs per node. | +| `deployment.type` | 'single_node' | `'single_node'` | | +| `deployment.num_gpus` | int | `1` | GPUs to use. | + + +#### `deployment.type = "multi_node"` (MultiNodeDeploymentConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `deployment.gpus_per_node` | int | `8` | GPUs per node. | +| `deployment.type` | 'multi_node' | `'multi_node'` | | +| `deployment.num_nodes` | int | `2` | Training nodes. | +| `deployment.nodes_per_fsdp_group` | int \| None | `None` | Nodes per FSDP island. Auto-sets ``model.dp_replicate = num_nodes / nodes_per_fsdp_group``. 
| + + +## `trainer` — Standalone trainer + +The `trainer` entrypoint runs only the trainer process. It expects rollouts to be shipped in via the configured transport (filesystem or ZMQ) by an external orchestrator. + +_Defined in_ `prime_rl.configs.trainer.TrainerConfig`. + +| Field | Type | Default | Description | +|---|---|---|---| +| `output_dir` | Path | `'outputs'` | Directory to write outputs to — checkpoints, weights, rollouts, and logs are written as subdirectories. Should be a persistent directory with enough disk space and unique per experiment running on a single node. | +| `matmul_precision` | 'highest' \| 'high' \| 'medium' | `'high'` | Precision for float32 matrix multiplications. ``highest`` is full FP32 (required on ROCm/AMD GPUs to avoid catastrophic precision loss in softmax over large vocabularies). ``high`` enables TF32 on NVIDIA GPUs for a speedup with minor precision tradeoff. See ``torch.set_float32_matmul_precision``. | +| `max_steps` | int \| None | `None` | Maximum number of training steps. If None, runs indefinitely. | +| `max_async_level` | int | `1` | _≥0._ Maximum steps inference can be ahead of training (how off-policy inference can be). Higher values yield better throughput via async execution at the cost of policy lag; ``0`` is fully synchronous. | +| `enable_router_replay` | bool | `False` | Return routed experts in the batch so the trainer can replay routing. Requires ``enable_return_routed_experts=true`` on the vLLM server (or ``--enable-return-routed-experts``) and is only supported for custom models. | +| `memory_profiler_path` | Path \| None | `None` | Path to write the memory profile to. | +| `trace_path` | Path \| None | `None` | Path to write the PyTorch profiler trace to. | +| `dist_timeout_seconds` | int | `600` | Timeout in seconds for torch distributed ops. | +| `max_concurrent_runs` | int | `1` | _≥1._ Maximum number of concurrent runs to allow. If 1, only one run may run at a time. 
| + + +### `model` + +| Field | Type | Default | Description | +|---|---|---|---| +| `model.name` | str | `'Qwen/Qwen3-0.6B'` | HF model name or local path. | +| `model.trust_remote_code` | bool | `False` | Trust remote code when initializing the tokenizer. | +| `model.seq_len` | int | `2048` | Sequence length the model is trained on. | +| `model.attn` | 'eager' \| 'sdpa' \| 'flash_attention_2' \| 'flash_attention_3' \| 'fa4' | `'flash_attention_2'` | Attention implementation. With CP enabled, ring attention uses the matching kernel family (FA2/FA3/FA4). | +| `model.fsdp_cpu_offload` | bool | `False` | Enable FSDP CPU offloading for parameters, gradients, and optimizer states. Uses pinned memory for efficient CPU↔GPU transfers. | +| `model.optim_cpu_offload` | bool | `False` | Offload only optimizer states (momentum, variance) to CPU, keeping weights on GPU. Avoids the H2D all-gather overhead of FSDP CPU offload while still saving GPU memory. | +| `model.reshard_after_forward` | bool | `True` | Reshard the model after each forward pass. | +| `model.dp_replicate` | int | `1` | Data parallel dim where model weights are replicated. | +| `model.ep` | int | `1` | Expert parallelism degree for MoE layers. 1 disables EP. | +| `model.ep_comm_backend` | 'torch' \| 'deepep' | `'torch'` | Communication backend for expert parallelism. ``torch`` uses TorchTitan all-to-all collectives; ``deepep`` uses DeepEP custom kernels. | +| `model.deepep_num_sms` | int | `20` | _≥1._ SMs allocated for DeepEP intranode dispatch/combine kernels. Also determines internode RDMA channel count (``num_channels = num_sms / 2``). Lower values leave more SMs for compute; higher values speed up dispatch/combine. The optimal value depends on EP degree and hardware. Only used when ``ep_comm_backend='deepep'``. | +| `model.deepep_token_chunk_size` | int \| None | `None` | _≥1._ Token chunk size for DeepEP MoE pipelining. When set, DeepEP dispatch for chunk i+1 is launched while experts compute chunk i. 
Only used when ``ep_comm_backend='deepep'``. | +| `model.cp` | int | `1` | Context parallelism degree. 1 disables CP. | +| `model.cp_style` | 'ring' \| 'ulysses' | `'ring'` | CP communication style. ``ring`` uses ring-attention all-gather/reduce-scatter (requires custom kernels per attention type). ``ulysses`` uses all-to-all to redistribute Q/K/V from sequence-sharded to head-sharded, runs vanilla attention locally on the full sequence, then all-to-all back — works out-of-the-box with any attention kernel (softmax FA, linear attention, mamba, etc.). | +| `model.impl` | 'hf' \| 'custom' \| 'auto' | `'auto'` | Model implementation. ``auto`` selects ``custom`` if supported by the model, otherwise ``hf``. | +| `model.optimization_dtype` | 'bfloat16' \| 'float32' | `'float32'` | dtype for model optimization. | +| `model.reduce_dtype` | 'bfloat16' \| 'float32' | `'float32'` | dtype for gradient/parameter reductions. | +| `model.moe_use_grouped_mm` | bool | `True` | Use grouped mm for MoE layers. Requires compute capability ≥ 9.0. | +| `model.fp8` | bool | `False` | FP8 training via DeepGEMM. Replaces ``nn.Linear`` with FP8 blockwise linear and uses FP8 grouped GEMM for MoE experts. Requires SM90 (Hopper) GPUs and ``model.impl='custom'``. | +| `model.freeze_moe_router` | bool | `False` | Freeze MoE router parameters during training. | +| `model.fused_lm_head_token_chunk_size` | int \| 'auto' \| 'disabled' | `'disabled'` | Flattened token chunk size for the fused LM head. ``int >= 1`` sets the tokens per LM-head chunk explicitly; ``auto`` auto-enables (RL training picks 8192); ``disabled`` uses the vanilla LM head. Integer values aren't supported for SFT training. | + + +#### `model.vlm` + +VLM configuration. Setting this enables vision-language model support. + +| Field | Type | Default | Description | +|---|---|---|---| +| `model.vlm.vision_encoder_attr` | str | *required* | Dotted attribute path to the vision encoder module (e.g. ``model.visual``). 
| +| `model.vlm.language_model_attr` | str | *required* | Dotted attribute path to the language model module (e.g. ``model.language_model``). | +| `model.vlm.freeze_vision_encoder` | bool | `True` | Freeze the vision encoder. When False, it is trainable and FSDP-sharded per-block. No effect with LoRA (LoRA freezes all non-adapter parameters). | + + +#### `model.compile` + +Compile the model with ``torch.compile``. + +| Field | Type | Default | Description | +|---|---|---|---| +| `model.compile.fullgraph` | bool | `False` | Compile transformer blocks with ``fullgraph=True``. | + + +#### `model.ac` + +Activation checkpointing configuration. If None, activation checkpointing is disabled. + +| Field | Type | Default | Description | +|---|---|---|---| +| `model.ac.mode` | 'full' \| 'selective' | `'full'` | ``full`` checkpoints whole transformer blocks; ``selective`` checkpoints only the subcomponents listed in ``targets`` inside supported custom decoder layers. | +| `model.ac.freq` | int | `1` | _≥1._ Apply activation checkpointing to every N layers. | +| `model.ac.targets` | list[str] | `['norm']` | Selective checkpoint targets. ``norm`` checkpoints every norm module inside selected layers. ``attn_proj`` checkpoints projection-side attention work outside the kernel (input/output projections, attention-local norms, RoPE, gating, model-specific MLA projection helpers). ``mlp`` checkpoints the entire dense MLP forward (not for MoE). ``mla_up_proj`` checkpoints MLA Q/KV up-projection where supported. ``routed_experts`` checkpoints routed expert compute in MoE layers (including LatentMoE). ``linear_attn`` checkpoints non-softmax token mixers (NemotronH Mamba, Qwen3.5-MoE GatedDeltaNet, AFMoE sliding-window attention). | + + +#### `model.ac_offloading` + +Activation offloading configuration. If None, activation offloading is disabled. 
+ +| Field | Type | Default | Description | +|---|---|---|---| +| `model.ac_offloading.pin_memory` | bool | `True` | Pin offloaded activations to CPU memory. | +| `model.ac_offloading.max_inflight_activations` | int | `5` | _≥1._ Max activations kept in flight while offloading. More activations smooth overlap at the cost of GPU memory. | + + +#### `model.lora` + +LoRA configuration. If None, LoRA is disabled. + +| Field | Type | Default | Description | +|---|---|---|---| +| `model.lora.rank` | int | `16` | _≥1._ Rank of the low-rank decomposition matrices. | +| `model.lora.alpha` | float | `32.0` | _≥0._ LoRA scaling parameter. | +| `model.lora.dropout` | float | `0.0` | _≥0, ≤1._ LoRA dropout rate. | +| `model.lora.target_modules` | list[str] | `['q_proj', 'k_proj', 'v_proj', 'o_proj', 'gate_proj', 'up_proj', 'down_proj', 'experts', 'fc1_latent_proj', 'fc2_latent_proj']` | Module names or regex patterns to apply LoRA to. Simple names (e.g. ``q_proj``) match any component in the module path; regex patterns match anywhere in the name. Names unknown to the current model are silently ignored, so defaults cover multiple architectures. NemotronH note: ``experts`` matches NonGatedGroupedExperts inside LatentMoE; ``fc1_latent_proj``/``fc2_latent_proj`` adapt the latent up/down projections. Add ``in_proj``/``out_proj`` to also LoRA Mamba. | +| `model.lora.modules_to_save` | list[str] | `[]` | Module names or regex patterns to keep fully trainable (not freeze). Same matching rules as ``target_modules``. | + + +#### `model.debug` + +Debugging knobs for the model and distributed training. + +| Field | Type | Default | Description | +|---|---|---|---| +| `model.debug.num_layers` | int \| None | `None` | Override the number of transformer layers (truncates the model). | +| `model.debug.random_init` | bool | `False` | Randomly initialize the model instead of loading weights. 
| +| `model.debug.force_balanced_routing` | bool | `False` | Replace MoE token-choice routing with a round-robin assignment so every expert sees an equal share. Intended for fake-data smoke tests where untrained routing would otherwise OOM under severe imbalance. Gating scores are still gathered from the override indices so the forward pass stays consistent. | + + +### `tokenizer` + +| Field | Type | Default | Description | +|---|---|---|---| +| `tokenizer.name` | str \| None | `None` | Tokenizer name or path. If None, the model's default tokenizer is used. | +| `tokenizer.trust_remote_code` | bool \| None | `None` | Trust remote code when initializing the tokenizer. If None, inherits the model's ``trust_remote_code`` setting. | +| `tokenizer.chat_template` | str \| None | `None` | Chat template for the tokenizer. Either a Jinja2 template string or a path to a template file. If None, the tokenizer's default chat template is used. | + + +### `data` + + +#### `data.fake` + +Use a fake data loader that samples random micro-batches (for debugging). + +| Field | Type | Default | Description | +|---|---|---|---| +| `data.fake.batch_size` | int | `2` | _≥1._ Batch size of the fake data loader. | +| `data.fake.generate_samples` | bool | `False` | Generate separate samples and pack them into a single micro-batch instead of using random tensors. | + + +### `ckpt` + +Full training-state checkpoint configuration (model + optimizer + scheduler). If None, no resume-capable checkpoints are written. + +| Field | Type | Default | Description | +|---|---|---|---| +| `ckpt.output_dir` | Path \| None | `None` | Override directory for checkpoints and weights. If set, checkpoints and weight snapshots are written here instead of under the trainer ``output_dir``, which is useful for writing large checkpoints to a separate storage volume. | +| `ckpt.interval` | int \| None | `None` | _≥1._ Interval at which to save the training checkpoint. If None, a checkpoint is saved only at the end of training.
| +| `ckpt.skip_gather_master_weights` | bool | `False` | Skip gathering and saving HF-compatible weight checkpoints. Useful for large models where the gather is expensive and only DCP checkpoints are needed. | +| `ckpt.weights_only` | bool | `False` | Save only weight checkpoints (no optimizer/scheduler state). Much faster and smaller than full checkpoints, but training cannot be resumed from them. | +| `ckpt.resume_step` | int \| None | `None` | _≥-1._ Step to resume training from. None starts from scratch; ``-1`` resumes from the latest available checkpoint. | +| `ckpt.keep_last` | int \| None | `None` | _≥1._ Keep at most this many recent step checkpoints on disk. If None, old checkpoints are never cleaned up based on recency. | +| `ckpt.keep_interval` | int \| None | `None` | _≥1._ Keep checkpoints at every N steps permanently (e.g. ``keep_interval=100`` keeps step 100, 200, ...). If None, no interval-based keeping. | +| `ckpt.skip_progress` | bool | `False` | Skip loading the progress from checkpoint. | +| `ckpt.skip_scheduler` | bool | `False` | Skip loading the scheduler from checkpoint. | +| `ckpt.skip_dataloader` | bool | `False` | Skip loading the dataloader from checkpoint. | +| `ckpt.skip_optimizer` | bool | `False` | Skip loading the optimizer state from checkpoint. | + + +#### `ckpt.weights` + +Weight-checkpoint sub-configuration. If None, no HF-compatible weight checkpoints are written. + +| Field | Type | Default | Description | +|---|---|---|---| +| `ckpt.weights.save_sharded` | bool | `True` | Save the weight checkpoint in sharded format. | +| `ckpt.weights.save_format` | 'safetensors' \| 'torch' | `'safetensors'` | Weight checkpoint serialization format. | +| `ckpt.weights.save_adapter_separately` | bool | `False` | Save LoRA adapters separately before merging into full model weights. | + + +### `log` + +| Field | Type | Default | Description | +|---|---|---|---| +| `log.level` | str | `'info'` | Log level for the process.
Defaults to ``$PRIME_LOG_LEVEL`` if set, else ``info``. | +| `log.vf_level` | str | `'info'` | Log level for the verifiers package. Defaults to ``$PRIME_VF_LOG_LEVEL`` if set, else ``info``. | +| `log.json_logging` | bool | `False` | Emit newline-delimited JSON logs for aggregation (Loki, Grafana, etc.). | +| `log.log_data` | bool | `False` | Log the first data sample at startup. | +| `log.ranks_filter` | list[int] | `[0]` | Trainer ranks to show in console output. Passed to ``torchrun --local-ranks-filter``. | + + +### `wandb` + +| Field | Type | Default | Description | +|---|---|---|---| +| `wandb.project` | str | `'prime-rl'` | W&B project to log to. | +| `wandb.entity` | str \| None | `None` | W&B entity to log to. | +| `wandb.name` | str \| None | `None` | W&B run name. | +| `wandb.group` | str \| None | `None` | W&B group. | +| `wandb.tags` | list[str] \| None | `None` | W&B tags attached to the run. | +| `wandb.offline` | bool | `False` | Run W&B in offline mode. | + + +### `bench` + +Benchmark-mode configuration. When set, ``max_steps`` is forced to 4 and fake data is used. + +| Field | Type | Default | Description | +|---|---|---|---| +| `bench.output_json` | Path \| None | `None` | Path to write benchmark results as JSON. If unset, results are only printed to the console. | + + +### `gc` + +Garbage collection config. Disables automatic GC and runs deterministic collections every N steps to avoid stragglers. Set to null to use Python's default GC behavior. + +| Field | Type | Default | Description | +|---|---|---|---| +| `gc.interval` | int | `50` | _≥1._ Run garbage collection every N training steps. Disables Python's automatic GC so every rank collects together and one slow rank can't stall the others. | + + +### `heartbeat` + +BetterStack heartbeat configuration for monitoring training progress. + +| Field | Type | Default | Description | +|---|---|---|---| +| `heartbeat.url` | str | *required* | URL to send the heartbeat to. 
| + + +### `metrics_server` + +Prometheus metrics server configuration. If set, exposes a ``/metrics`` endpoint for scraping. + +| Field | Type | Default | Description | +|---|---|---|---| +| `metrics_server.port` | int | `8000` | _≥1, ≤65535._ Port to expose metrics and health endpoints on. | +| `metrics_server.host` | str | `'0.0.0.0'` | Host to bind the server to. | + + +### `experimental` + + +#### `experimental.token_export` + +Opt-in per-token JSONL export for rollout debugging. When enabled, writes token ids and aligned trainer metrics after each forward pass. + + +### `loss` + +Loss config for rl-mode batches. opd and sft batches dispatch to their own loss fns unconditionally and do not read this. + +Discriminated union — set `loss.type` to one of `default`, `custom` and provide the matching sub-fields. + + +#### `loss.type = "default"` (DefaultLossConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `loss.type` | 'default' | `'default'` | | +| `loss.dppo_mask_low` | float | `0.2` | _≥0._ Lower DPPO masking threshold. | +| `loss.dppo_mask_high` | float | `0.2` | _≥0._ Upper DPPO masking threshold. | +| `loss.adv_tau` | float | `1.0` | _≥0._ Temperature for the advantage term. | +| `loss.kl_tau` | float | `0.001` | _≥0._ Temperature for the KL term. | + + +#### `loss.type = "custom"` (CustomLossConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `loss.type` | 'custom' | `'custom'` | | +| `loss.import_path` | str | *required* | Import path to the loss function (e.g. ``my_module.my_loss``). | +| `loss.kwargs` | dict[str, Any] | `{}` | Kwargs forwarded to the loss function. | + + +### `optim` + +Discriminated union — set `optim.type` to one of `sgd`, `adamw`, `muon`, `sign_sgd` and provide the matching sub-fields. + + +#### `optim.type = "sgd"` (SGDConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `optim.lr` | float | `1e-06` | _≥0._ Peak learning rate. 
| +| `optim.weight_decay` | float | `0.01` | _≥0._ L2 weight-decay coefficient. | +| `optim.max_norm` | float \| None | `1.0` | _≥0._ Maximum gradient norm to clip to. If None, gradient clipping is disabled. | +| `optim.type` | 'sgd' | `'sgd'` | | +| `optim.nesterov` | bool | `True` | Use Nesterov momentum. | +| `optim.momentum` | float | `0.9` | SGD momentum factor. | + + +#### `optim.type = "adamw"` (AdamWConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `optim.lr` | float | `1e-06` | _≥0._ Peak learning rate. | +| `optim.weight_decay` | float | `0.01` | _≥0._ L2 weight-decay coefficient. | +| `optim.max_norm` | float \| None | `1.0` | _≥0._ Maximum gradient norm to clip to. If None, gradient clipping is disabled. | +| `optim.type` | 'adamw' | `'adamw'` | | +| `optim.betas1` | float | `0.9` | _≥0._ Adam first-moment (β1) decay. | +| `optim.betas2` | float | `0.999` | _≥0._ Adam second-moment (β2) decay. | + + +#### `optim.type = "muon"` (MuonConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `optim.lr` | float | `1e-06` | _≥0._ Peak learning rate. | +| `optim.weight_decay` | float | `0.01` | _≥0._ L2 weight-decay coefficient. | +| `optim.max_norm` | float \| None | `1.0` | _≥0._ Maximum gradient norm to clip to. If None, gradient clipping is disabled. | +| `optim.type` | 'muon' | `'muon'` | | +| `optim.mu` | float | `0.95` | _≥0._ Momentum factor for the Muon algorithm. | +| `optim.betas1` | float | `0.9` | _≥0._ β1 for the AdamW/Lion sub-optimizer used on non-Muon params. | +| `optim.betas2` | float | `0.95` | _≥0._ β2 for the AdamW/Lion sub-optimizer used on non-Muon params. | + + +#### `optim.type = "sign_sgd"` (SignSGDConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `optim.lr` | float | `1e-06` | _≥0._ Peak learning rate. | +| `optim.weight_decay` | float | `0.01` | _≥0._ L2 weight-decay coefficient. | +| `optim.max_norm` | float \| None | `1.0` | _≥0._ Maximum gradient norm to clip to. 
If None, gradient clipping is disabled. | +| `optim.type` | 'sign_sgd' | `'sign_sgd'` | | + + +### `scheduler` + +Discriminated union — set `scheduler.type` to one of `constant`, `linear`, `cosine` and provide the matching sub-fields. + + +#### `scheduler.type = "constant"` (ConstantSchedulerConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `scheduler.type` | 'constant' | `'constant'` | | + + +#### `scheduler.type = "linear"` (LinearSchedulerConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `scheduler.type` | 'linear' | `'linear'` | | +| `scheduler.warmup_steps` | int | `10` | _≥0._ Warmup steps for the learning rate scheduler. | +| `scheduler.decay_steps` | int | `10` | _≥0._ Steps to decay the learning rate during the final portion of training. | +| `scheduler.min_lr` | float | `0.0` | _≥0._ Minimum learning rate to converge to. | + + +#### `scheduler.type = "cosine"` (CosineSchedulerConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `scheduler.type` | 'cosine' | `'cosine'` | | +| `scheduler.warmup_steps` | int | `10` | _≥0._ Warmup steps for the learning rate scheduler. | +| `scheduler.min_lr` | float | `0.0` | _≥0._ Minimum learning rate to converge to. | + + +### `weight_broadcast` + +Transport used to broadcast updated weights from trainer to inference. + +Discriminated union — set `weight_broadcast.type` to one of `filesystem`, `nccl` and provide the matching sub-fields. + + +#### `weight_broadcast.type = "filesystem"` (FileSystemWeightBroadcastConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `weight_broadcast.type` | 'filesystem' | `'filesystem'` | | +| `weight_broadcast.save_sharded` | bool | `True` | Save the weight checkpoint in sharded format. | +| `weight_broadcast.save_format` | 'safetensors' \| 'torch' | `'safetensors'` | Weight checkpoint serialization format. 
| + + +#### `weight_broadcast.type = "nccl"` (NCCLWeightBroadcastConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `weight_broadcast.type` | 'nccl' | `'nccl'` | | +| `weight_broadcast.host` | str | `'localhost'` | Host for the NCCL broadcast rendezvous. | +| `weight_broadcast.port` | int | `29501` | Port for the NCCL broadcast rendezvous. | +| `weight_broadcast.timeout` | int | `1200` | Timeout in seconds for the NCCL broadcast. | +| `weight_broadcast.inference_world_size` | int | `1` | Number of GPUs used for inference. | +| `weight_broadcast.quantize_in_weight_transfer` | bool | `False` | Use kernel-format FP8 quantized NCCL transfer for weight updates. When disabled, uses default HF checkpoint-format transfer. | + + +### `rollout_transport` + +Transport used to ship rollouts from orchestrator to trainer. + +Discriminated union — set `rollout_transport.type` to one of `filesystem`, `zmq` and provide the matching sub-fields. + + +#### `rollout_transport.type = "filesystem"` (FileSystemTransportConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `rollout_transport.type` | 'filesystem' | `'filesystem'` | | + + +#### `rollout_transport.type = "zmq"` (ZMQTransportConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `rollout_transport.type` | 'zmq' | `'zmq'` | | +| `rollout_transport.host` | str | `'localhost'` | Host address for ZMQ transport. | +| `rollout_transport.port` | int | `5555` | Base port for ZMQ transport. | +| `rollout_transport.hwm` | int | `10` | High-water mark (max in-flight messages per ZMQ socket). | + + +## `orchestrator` — Standalone orchestrator + +The `orchestrator` entrypoint runs only the orchestrator process. It expects a separately-launched inference server to serve rollouts, and ships completed rollouts to a separately-launched trainer over the configured transport. + +_Defined in_ `prime_rl.configs.orchestrator.OrchestratorConfig`. 
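As a rough illustration of the standalone setup described above, the sketch below assumes the orchestrator reads a TOML file whose tables mirror the field paths listed in this section. The filename is hypothetical, and the batch and step values are placeholders chosen for illustration; only the model name, base URL, sequence length, and ZMQ host/port are the documented defaults.

```toml
# orchestrator.toml (hypothetical filename): minimal standalone orchestrator sketch
max_steps = 100            # illustrative; leaving it unset runs indefinitely
batch_size = 128           # illustrative; set this OR token_batch_size, not both
rollouts_per_example = 8   # illustrative
seq_len = 2048             # documented default

[student.model]
name = "Qwen/Qwen3-0.6B"   # documented default

[student.client]
# URL of the separately launched inference server (documented default shown)
base_url = ["http://localhost:8000/v1"]

[rollout_transport]
# Ship completed rollouts to the separately launched trainer over ZMQ
type = "zmq"
host = "localhost"
port = 5555
```

Choosing `zmq` here is only one option; omitting the `[rollout_transport]` table would presumably fall back to whatever default transport the config model defines.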
+ +| Field | Type | Default | Description | +|---|---|---|---| +| `training_mode` | 'rl' \| 'opd' \| 'sft' | `'rl'` | Training mode. ``rl``: student generates rollouts, no teacher. ``opd``: student generates rollouts, teacher computes logprobs (teacher_tau > 0). ``sft``: teacher generates rollouts, student inference pool used for evals and weight sync. | +| `advantage` | DefaultAdvantageConfig \| CustomAdvantageConfig \| None | `DefaultAdvantageConfig()` | | +| `filters` | list[GibberishFilterConfig \| RepetitionFilterConfig \| ZeroAdvantageFilterConfig] | `[GibberishFilterConfig(type='gibberish', enforce=False, token_id_threshold=100000, logprob_offset=2.0), RepetitionFilterConfig(type='repetition', enforce=False, window=3000, prob_threshold=0.99), ZeroAdvantageFilterConfig(type='zero_advantage', enforce=True)]` | Rollout filters. Each filter can ``monitor`` (default) or ``enforce`` (skip rollouts). | +| `collect_inference_metrics` | bool | `True` | Collect inference-server metrics (requires wandb). | +| `output_dir` | Path | `'outputs/run_default'` | Directory to write outputs to — checkpoints, weights, rollouts, and logs are written as subdirectories. Should be a persistent directory with enough disk space and unique per experiment running on a single node. | +| `tasks_per_minute` | int \| None | `None` | _≥1._ Rate limit per environment worker, in tasks per minute. Recommended for sandbox-backed environments to prevent sandbox-not-ready errors during autoscaling. With multiple workers, the effective total rate is ``workers × this value``. None disables rate limiting. | +| `batch_size` | int \| None | `None` | _≥1._ Samples to train on per step (rollout-based batching). Set this OR ``token_batch_size``. | +| `token_batch_size` | int \| None | `None` | _≥1._ Tokens to train on per step (token-based batching). Set this OR ``batch_size``. | +| `oversampling_factor` | float \| None | `None` | _>0._ Rollout-mode batching only. 
Multiplier used to derive ``max_inflight_rollouts`` from ``batch_size`` when ``max_inflight_rollouts`` is unset. Values below 1.0 intentionally cap in-flight rollout capacity below ``batch_size``. | +| `max_inflight_rollouts` | int \| None | `None` | _≥1._ Maximum number of rollouts kept in-flight. Required for token-based batching. With ``batch_size`` set, defaults to ``batch_size * oversampling_factor`` (or ``batch_size`` when ``oversampling_factor`` is unset). | +| `rollouts_per_example` | int | `1` | _≥1._ Output sequences returned per example during training. | +| `seq_len` | int | `2048` | Training sequence length. Shorter samples are padded; longer samples are truncated. | +| `num_train_workers` | int | `1` | _≥1._ Training workers to use. | +| `max_steps` | int \| None | `None` | Maximum training steps. If None, runs indefinitely. | +| `max_off_policy_steps` | int | `8` | _≥0._ Maximum policies allowed to generate a single rollout. Rollouts generated more than ``max_off_policy_steps`` ahead of training are discarded. Higher values yield better throughput at the cost of off-policy noise. | +| `max_async_level` | int | `1` | _≥0._ Maximum steps inference can be ahead of training. ``0`` degenerates to synchronous on-policy RL; ``≥1`` overlaps training and inference. | +| `strict_async_level` | bool | `False` | Strictly enforce ``max_async_level``. When True, the rollout policy is always exactly ``max_async_level`` steps ahead of training. When False, any policy within ``max_async_level`` steps is allowed (always uses the latest available policy). | +| `bench` | bool | `False` | Benchmark mode. Sets ``max_steps`` to 5, ``max_async_level`` to ~∞, and disables W&B. | +| `seed` | int \| None | `42` | Random seed for the orchestrator. | +| `use_renderer` | bool | `True` | Use the renderer-backed TITO client (client-side tokenization via the ``renderers`` package, served by ``/v1/generate``). 
When True, the ``[orchestrator.renderer]`` block (name / tool_parser / reasoning_parser / pool_size) applies. Default for both text-only and VLM rollouts; VLMs require it. False falls back to MITO (``openai_chat_completions``). | +| `env_install_prerelease` | bool | `False` | Allow pre-release versions when installing environments (e.g. ``verifiers>=0.1.12.dev5``). Passes ``--prerelease`` to ``prime env install``. | + + +### `student` + +Student rollout participant (model + client) — the model being trained. + + +#### `student.model` + +| Field | Type | Default | Description | +|---|---|---|---| +| `student.model.name` | str | `'Qwen/Qwen3-0.6B'` | HF model name or local path. | +| `student.model.trust_remote_code` | bool | `False` | Trust remote code when initializing the tokenizer. | + + +##### `student.model.vlm` + +VLM configuration. Setting this enables vision-language model support. + +| Field | Type | Default | Description | +|---|---|---|---| +| `student.model.vlm.vision_encoder_attr` | str | *required* | Dotted attribute path to the vision encoder module (e.g. ``model.visual``). | +| `student.model.vlm.language_model_attr` | str | *required* | Dotted attribute path to the language model module (e.g. ``model.language_model``). | +| `student.model.vlm.freeze_vision_encoder` | bool | `True` | Freeze the vision encoder. When False, it is trainable and FSDP-sharded per-block. No effect with LoRA (LoRA freezes all non-adapter parameters). | + + +##### `student.model.lora` + +Per-run LoRA configuration. If None, LoRA is disabled. + +| Field | Type | Default | Description | +|---|---|---|---| +| `student.model.lora.name` | str \| None | `None` | LoRA adapter name. If None, auto-generated from rank and alpha. | +| `student.model.lora.rank` | int \| None | `None` | _≥1._ LoRA rank for this run. Must be ≤ trainer's max rank. If None, uses the trainer's rank. | +| `student.model.lora.alpha` | float \| None | `None` | _≥0._ LoRA alpha for this run. 
If None, uses the trainer's alpha. | + + +#### `student.client` + +| Field | Type | Default | Description | +|---|---|---|---| +| `student.client.timeout` | int | `1200` | Request timeout in seconds. | +| `student.client.connect_timeout` | float | `30.0` | TCP connect timeout in seconds for inference API requests. | +| `student.client.wait_for_ready_timeout` | int | `1800` | Seconds to wait at startup for the inference pool to become ready. Applies to both the static health check and elastic DNS-based discovery. | +| `student.client.base_url` | list[str] | `['http://localhost:8000/v1']` | Base URLs for the OpenAI API. With more than one URL, the client round-robins (chat) completion requests across all servers. Ignored when ``elastic`` is set. | +| `student.client.api_key_var` | str | `'VLLM_API_KEY'` | Environment variable name containing the API key, resolved via ``os.getenv``. Can be any string when the server is not protected by an API key; the same key is used for every URL. | +| `student.client.headers` | dict[str, str] | `{}` | Static headers sent with every request. | +| `student.client.headers_from_env` | dict[str, str] | `{}` | Maps HTTP header names to environment variable names; each entry is resolved via ``os.getenv`` and merged into request headers. e.g. ``{"X-Prime-Team-ID": "PRIME_TEAM_ID"}``. | +| `student.client.extra_headers_from_state` | dict[str, str] | `{}` | Maps HTTP header names to rollout-state field names. The header value is read from the rollout state dict on every request. e.g. ``{"X-Session-ID": "example_id"}`` enables sticky routing at the inference router. | +| `student.client.skip_model_check` | bool | `False` | Skip checking that the model is available in the inference pool. Useful for external APIs or keys that do not expose ``/models``. | +| `student.client.dp_rank_count` | int | `1` | _≥1._ Number of data-parallel ranks behind each base URL. 
When > 1, each URL is expanded into ``dp_rank_count`` logical clients pinned via the ``X-data-parallel-rank`` header, so every request within a rollout hits the same DP engine and reuses KV cache. Auto-set from the inference config when using the RL entrypoint. | +| `student.client.admin_base_url` | list[str] \| None | `None` | Separate base URLs for admin operations (weight updates, health checks). When set, admin clients bypass routers and hit each server directly — used in disaggregated P/D deployments where the router must not handle admin traffic. | +| `student.client.router_url` | str \| None | `None` | vllm-router URL for load-aware inference routing. With elastic mode, inference requests go through the router while admin ops still hit discovered pods directly. | + + +##### `student.client.elastic` + +Elastic inference pool config for DNS-based service discovery. When set, ``base_url`` is ignored and inference servers are discovered dynamically via DNS. + +| Field | Type | Default | Description | +|---|---|---|---| +| `student.client.elastic.hostname` | str | *required* | DNS hostname that resolves to inference server IPs. | +| `student.client.elastic.port` | int | `8000` | Port that inference servers listen on. | +| `student.client.elastic.sync_interval` | float | `5.0` | Seconds between server discovery checks. | + + +### `teacher` + +Teacher rollout participant (model + client). Role depends on ``training_mode``: ``opd`` — teacher computes logprobs; ``sft`` — teacher generates rollouts. + + +#### `teacher.model` + +| Field | Type | Default | Description | +|---|---|---|---| +| `teacher.model.name` | str | `'Qwen/Qwen3-0.6B'` | HF model name or local path. | +| `teacher.model.trust_remote_code` | bool | `False` | Trust remote code when initializing the tokenizer. | + + +##### `teacher.model.vlm` + +VLM configuration. Setting this enables vision-language model support. 
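A minimal sketch of enabling this block for the teacher, assuming a TOML config and a model whose modules sit at the example attribute paths quoted in the field descriptions for this section. The right paths depend on the model implementation and are not verified here.

```toml
[teacher.model.vlm]
# Both attribute paths are required; these are the examples from the field descriptions
vision_encoder_attr = "model.visual"
language_model_attr = "model.language_model"
# Documented default: keep the vision encoder frozen
freeze_vision_encoder = true
```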
+ +| Field | Type | Default | Description | +|---|---|---|---| +| `teacher.model.vlm.vision_encoder_attr` | str | *required* | Dotted attribute path to the vision encoder module (e.g. ``model.visual``). | +| `teacher.model.vlm.language_model_attr` | str | *required* | Dotted attribute path to the language model module (e.g. ``model.language_model``). | +| `teacher.model.vlm.freeze_vision_encoder` | bool | `True` | Freeze the vision encoder. When False, it is trainable and FSDP-sharded per-block. No effect with LoRA (LoRA freezes all non-adapter parameters). | + + +##### `teacher.model.lora` + +Per-run LoRA configuration. If None, LoRA is disabled. + +| Field | Type | Default | Description | +|---|---|---|---| +| `teacher.model.lora.name` | str \| None | `None` | LoRA adapter name. If None, auto-generated from rank and alpha. | +| `teacher.model.lora.rank` | int \| None | `None` | _≥1._ LoRA rank for this run. Must be ≤ trainer's max rank. If None, uses the trainer's rank. | +| `teacher.model.lora.alpha` | float \| None | `None` | _≥0._ LoRA alpha for this run. If None, uses the trainer's alpha. | + + +#### `teacher.client` + +| Field | Type | Default | Description | +|---|---|---|---| +| `teacher.client.timeout` | int | `1200` | Request timeout in seconds. | +| `teacher.client.connect_timeout` | float | `30.0` | TCP connect timeout in seconds for inference API requests. | +| `teacher.client.wait_for_ready_timeout` | int | `1800` | Seconds to wait at startup for the inference pool to become ready. Applies to both the static health check and elastic DNS-based discovery. | +| `teacher.client.base_url` | list[str] | `['http://localhost:8000/v1']` | Base URLs for the OpenAI API. With more than one URL, the client round-robins (chat) completion requests across all servers. Ignored when ``elastic`` is set. | +| `teacher.client.api_key_var` | str | `'VLLM_API_KEY'` | Environment variable name containing the API key, resolved via ``os.getenv``. 
Can be any string when the server is not protected by an API key; the same key is used for every URL. | +| `teacher.client.headers` | dict[str, str] | `{}` | Static headers sent with every request. | +| `teacher.client.headers_from_env` | dict[str, str] | `{}` | Maps HTTP header names to environment variable names; each entry is resolved via ``os.getenv`` and merged into request headers. e.g. ``{"X-Prime-Team-ID": "PRIME_TEAM_ID"}``. | +| `teacher.client.extra_headers_from_state` | dict[str, str] | `{}` | Maps HTTP header names to rollout-state field names. The header value is read from the rollout state dict on every request. e.g. ``{"X-Session-ID": "example_id"}`` enables sticky routing at the inference router. | +| `teacher.client.skip_model_check` | bool | `False` | Skip checking that the model is available in the inference pool. Useful for external APIs or keys that do not expose ``/models``. | +| `teacher.client.dp_rank_count` | int | `1` | _≥1._ Number of data-parallel ranks behind each base URL. When > 1, each URL is expanded into ``dp_rank_count`` logical clients pinned via the ``X-data-parallel-rank`` header, so every request within a rollout hits the same DP engine and reuses KV cache. Auto-set from the inference config when using the RL entrypoint. | +| `teacher.client.admin_base_url` | list[str] \| None | `None` | Separate base URLs for admin operations (weight updates, health checks). When set, admin clients bypass routers and hit each server directly — used in disaggregated P/D deployments where the router must not handle admin traffic. | +| `teacher.client.router_url` | str \| None | `None` | vllm-router URL for load-aware inference routing. With elastic mode, inference requests go through the router while admin ops still hit discovered pods directly. | + + +##### `teacher.client.elastic` + +Elastic inference pool config for DNS-based service discovery. When set, ``base_url`` is ignored and inference servers are discovered dynamically via DNS. 
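A sketch of what an elastic pool entry might look like, assuming a TOML config. The hostname is a made-up placeholder (for example, a headless service name in a cluster), while the port and sync interval are the documented defaults.

```toml
[teacher.client.elastic]
# Hypothetical DNS name; it must resolve to the inference server IPs
hostname = "inference.example.internal"
port = 8000           # documented default
sync_interval = 5.0   # documented default, seconds between discovery checks
```

With this table present, any `base_url` set on the same client is ignored, as stated above.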
+ +| Field | Type | Default | Description | +|---|---|---|---| +| `teacher.client.elastic.hostname` | str | *required* | DNS hostname that resolves to inference server IPs. | +| `teacher.client.elastic.port` | int | `8000` | Port that inference servers listen on. | +| `teacher.client.elastic.sync_interval` | float | `5.0` | Seconds between server discovery checks. | + + +### `train` + +| Field | Type | Default | Description | +|---|---|---|---| +| `train.env` | list[TrainEnvConfig] | `[TrainEnvConfig(id='reverse-text', name=None, args={}, extra_env_kwargs={'max_total_completion_tokens': -1}, address=None, num_workers='auto', ratio=None, max_retries=3, max_total_completion_tokens=-1, timeout=None, state_columns=[], sampling=TrainSamplingConfig(temperature=1.0, repetition_penalty=1.0, max_completion_tokens=None, min_tokens=0, seed=None, extra_body={}))]` | Training environments. | +| `train.num_workers` | int \| 'auto' | `'auto'` | Default worker processes for env servers. Can be overridden per env. | +| `train.max_retries` | int | `3` | _≥0._ Default retries for failed rollouts. Can be overridden per env. | + + +#### `train.sampling` + +Shared training sampling configuration. + +| Field | Type | Default | Description | +|---|---|---|---| +| `train.sampling.temperature` | float | `1.0` | _≥0._ Sampling temperature. | +| `train.sampling.repetition_penalty` | float | `1.0` | _≥0._ Repetition penalty. Values > 1.0 discourage repetition, < 1.0 encourage it, 1.0 disables. | +| `train.sampling.max_completion_tokens` | int \| None | `None` | Maximum output tokens per turn. If None, generates until max context length or EOS. | +| `train.sampling.min_tokens` | int | `0` | _≥0._ Minimum output tokens per sequence. | +| `train.sampling.seed` | int \| None | `None` | Random seed for sampling. If None, no seeding is used. | +| `train.sampling.extra_body` | dict[str, Any] | `{}` | Extra body forwarded with each request to the inference server. 
| + + +### `tokenizer` + +| Field | Type | Default | Description | +|---|---|---|---| +| `tokenizer.name` | str \| None | `None` | Tokenizer name or path. If None, the model's default tokenizer is used. | +| `tokenizer.trust_remote_code` | bool \| None | `None` | Trust remote code when initializing the tokenizer. If None, inherits the model's ``trust_remote_code`` setting. | +| `tokenizer.chat_template` | str \| None | `None` | Chat template for the tokenizer. Either a Jinja2 template string or a path to a template file. If None, the tokenizer's default chat template is used. | + + +### `renderer` + +Client-side renderer configuration. Only consumed when ``use_renderer=true``. + +| Field | Type | Default | Description | +|---|---|---|---| +| `renderer.name` | str | `'auto'` | Renderer used for chat-template tokenization. One of: ``auto`` (detect from tokenizer), ``qwen3``, ``qwen3_vl``, ``qwen3.5``, ``glm5``, ``glm4.5``, ``minimax-m2``, ``deepseek_v3``, ``kimi_k2``, ``kimi_k25``, ``nemotron3``, ``gpt_oss``, ``default``. | +| `renderer.tool_parser` | str \| None | `None` | Tool parser from ``renderers.parsers``. Only consumed by DefaultRenderer; model-specific renderers bake their own parsing in. Options: ``qwen3``, ``qwen3.5``, ``glm``, ``deepseek_v3``. | +| `renderer.reasoning_parser` | str \| None | `None` | Reasoning parser from ``renderers.parsers``. Only consumed by DefaultRenderer. Options: ``think``. | +| `renderer.pool_size` | int \| None | `None` | _≥1._ Number of renderer slots shared across concurrent rollouts. Bump for long multi-turn prompts where client-side jinja tokenization serializes. | +| `renderer.preserve_all_thinking` | bool | `False` | Re-emit every past-assistant turn's ``reasoning_content`` between ``<think>``/``</think>`` (or the model's equivalent), even when the chat template would drop it. Strict superset of preserve_thinking_between_tool_calls.
| +| `renderer.preserve_thinking_between_tool_calls` | bool | `False` | Preserve past-assistant ``reasoning_content`` only inside the current tool cycle — the contiguous assistant→tool→…→assistant block after the most recent user message, when that block contains at least one tool response. A new user turn closes the block. | + + +### `optim` + +Per-run optimizer configuration for multi-run training. + +| Field | Type | Default | Description | +|---|---|---|---| +| `optim.lr` | float | `0.0001` | _≥0._ Learning rate for this run (per-run override for multi-run training). | + + +### `eval` + +Evaluation configuration. + +| Field | Type | Default | Description | +|---|---|---|---| +| `eval.env` | list[EvalEnvConfig] | `[EvalEnvConfig(id='reverse-text', name=None, args={}, extra_env_kwargs={'max_total_completion_tokens': -1}, address=None, num_workers='auto', ratio=None, max_retries=3, max_total_completion_tokens=-1, timeout=None, state_columns=[], sampling=EvalSamplingConfig(temperature=None, repetition_penalty=None, top_p=None, top_k=None, min_p=None, max_completion_tokens=None, min_tokens=None, reasoning_effort=None, seed=None, extra_body={}), num_examples=-1, rollouts_per_example=1, interval=100)]` | Evaluation environments. | +| `eval.num_examples` | int | `-1` | Default eval examples per environment. ``-1`` uses all. Can be overridden per env. | +| `eval.rollouts_per_example` | int | `1` | _≥1._ Default rollouts per example. Can be overridden per env. | +| `eval.num_workers` | int \| 'auto' | `'auto'` | Default worker processes for env servers. Can be overridden per env. | +| `eval.max_retries` | int | `3` | _≥0._ Default retries for failed rollouts. Can be overridden per env. | +| `eval.interval` | int | `100` | _≥1._ Step interval at which to evaluate the model. | +| `eval.eval_base_model` | bool | `True` | Evaluate the base model we are training on. 
| +| `eval.skip_eval_on_resume` | bool | `True` | When resuming the orchestrator from a checkpoint, skip the (potentially redundant) online eval that would otherwise run immediately at the resumed step. | +| `eval.cancel_inflight_rollouts_on_eval` | bool | `False` | Cancel in-flight training rollouts before starting online evals. Avoids congestion (no training + eval rollouts at the same time) at the cost of slower training steps as the pipeline has to refill after each eval. | + + +#### `eval.sampling` + +Shared eval sampling configuration; can differ from training sampling. + +| Field | Type | Default | Description | +|---|---|---|---| +| `eval.sampling.temperature` | float \| None | `None` | _≥0._ Sampling temperature. None defers to the inference server default. | +| `eval.sampling.repetition_penalty` | float \| None | `None` | _≥0._ Repetition penalty. None defers to the inference server default. | +| `eval.sampling.top_p` | float \| None | `None` | Nucleus sampling threshold. None defers to the inference server default. | +| `eval.sampling.top_k` | int \| None | `None` | Top-k sampling. None defers to the inference server default. | +| `eval.sampling.min_p` | float \| None | `None` | _≥0._ Min-p sampling threshold. None defers to the inference server default. | +| `eval.sampling.max_completion_tokens` | int \| None | `None` | Maximum output tokens per turn. None defers to the inference server default. | +| `eval.sampling.min_tokens` | int \| None | `None` | _≥0._ Minimum output tokens per sequence. None defers to the inference server default. | +| `eval.sampling.reasoning_effort` | 'minimal' \| 'low' \| 'medium' \| 'high' \| None | `None` | Reasoning effort constraint for reasoning models. | +| `eval.sampling.seed` | int \| None | `None` | Random seed for sampling. None means no seeding. | +| `eval.sampling.extra_body` | dict[str, Any] | `{}` | Extra body parameters forwarded to the inference server. 
| + + +### `buffer` + +| Field | Type | Default | Description | +|---|---|---|---| +| `buffer.seed` | int \| None | `None` | Random seed for the buffer. When set, sampling from the buffer is deterministic. | +| `buffer.easy_threshold` | float \| None | `None` | Average-reward threshold above which a problem is classified ``easy``. | +| `buffer.hard_threshold` | float \| None | `None` | Average-reward threshold below which a problem is classified ``hard``. | +| `buffer.easy_fraction` | float | `0.0` | _≥0, ≤1._ Fraction of easy problems to convert to ``normal`` when resuming or starting training. Only problems with difficulty ``normal`` are sampled. | +| `buffer.hard_fraction` | float | `0.0` | _≥0, ≤1._ Fraction of hard problems to convert to ``normal`` when resuming or starting training. Only problems with difficulty ``normal`` are sampled. | +| `buffer.online_difficulty_filtering` | bool | `False` | Filter rollouts based on difficulty. When True, rollouts with average reward 0.0 or 1.0 are not added to the buffer. | +| `buffer.hash_keys` | list[str] | `['env_name', 'prompt']` | _len ≥ 1._ Keys used to compute example hashes. Used to match examples from buffer checkpoints and determine buffer resume behavior. | + + +### `log` + +| Field | Type | Default | Description | +|---|---|---|---| +| `log.level` | str | `'info'` | Log level for the process. Defaults to ``$PRIME_LOG_LEVEL`` if set, else ``info``. | +| `log.vf_level` | str | `'info'` | Log level for the verifiers package. Defaults to ``$PRIME_VF_LOG_LEVEL`` if set, else ``info``. | +| `log.json_logging` | bool | `False` | Emit newline-delimited JSON logs for aggregation (Loki, Grafana, etc.). | +| `log.log_data` | bool | `False` | Log the first data sample at startup. | + + +### `wandb` + +| Field | Type | Default | Description | +|---|---|---|---| +| `wandb.project` | str | `'prime-rl'` | W&B project to log to. | +| `wandb.entity` | str \| None | `None` | W&B entity to log to. 
| +| `wandb.name` | str \| None | `None` | W&B run name. | +| `wandb.group` | str \| None | `None` | W&B group. | +| `wandb.tags` | list[str] \| None | `None` | W&B tags attached to the run. | +| `wandb.offline` | bool | `False` | Run W&B in offline mode. | + + +#### `wandb.log_extras` + +Extras logging configuration. If None, no extras are logged. + +| Field | Type | Default | Description | +|---|---|---|---| +| `wandb.log_extras.samples` | bool | `True` | Log prompt/response samples. | +| `wandb.log_extras.distributions` | bool | `True` | Log distributions (rewards, advantages, etc.). | +| `wandb.log_extras.interval` | int | `10` | _≥1._ Step interval between extras logs. | +| `wandb.log_extras.sample_ratio` | float \| None | `None` | _≥0.0, ≤1.0._ Fraction of rollouts to log per step. The effective cap is ``len(rollouts) * sample_ratio``; 1.0 = all, 0.5 = half, 0.0 = none. | + + +### `prime_monitor` + +| Field | Type | Default | Description | +|---|---|---|---| +| `prime_monitor.base_url` | str | `'https://api.primeintellect.ai/api/v1/rft'` | Base URL for the Prime Intellect monitoring API. | +| `prime_monitor.api_key_var` | str | `'PRIME_API_KEY'` | Environment variable name containing the Prime Intellect API key, resolved via ``os.getenv``. | +| `prime_monitor.run_name` | str \| None | `None` | Run name shown on the platform. Defaults to the W&B run name when set, otherwise the platform auto-generates one. | +| `prime_monitor.team_id` | str \| None | `None` | Team ID to associate the run with. | +| `prime_monitor.frontend_url` | str \| None | `None` | Frontend base URL used for the dashboard link printed after registration. Defaults to the Prime CLI frontend URL when unset. | + + +#### `prime_monitor.log_extras` + +Extras logging configuration. If None, no extras are logged. + +| Field | Type | Default | Description | +|---|---|---|---| +| `prime_monitor.log_extras.samples` | bool | `True` | Log prompt/response samples. 
| +| `prime_monitor.log_extras.distributions` | bool | `True` | Log distributions (rewards, advantages, etc.). | +| `prime_monitor.log_extras.interval` | int | `10` | _≥1._ Step interval between extras logs. | +| `prime_monitor.log_extras.sample_ratio` | float \| None | `None` | _≥0.0, ≤1.0._ Fraction of rollouts to log per step. The effective cap is ``len(rollouts) * sample_ratio``; 1.0 = all, 0.5 = half, 0.0 = none. | + + +### `ckpt` + +Checkpoint configuration. + +| Field | Type | Default | Description | +|---|---|---|---| +| `ckpt.interval` | int \| None | `None` | _≥1._ Step interval at which to save the orchestrator checkpoint. | +| `ckpt.resume_step` | int \| None | `None` | _≥-1._ Step to resume the orchestrator from. None starts from scratch; ``-1`` resumes from the latest checkpoint available. | +| `ckpt.wait_for_weights_timeout` | int \| None | `None` | _≥1._ When resuming, wait up to this many seconds for the weight directory to appear. Useful when the orchestrator restarts while the trainer is still saving weights. If None, fail immediately when weights are not found. | +| `ckpt.keep_last` | int \| None | `None` | _≥1._ Keep at most this many recent step checkpoints on disk. If None, never clean old checkpoints based on recency. | +| `ckpt.keep_interval` | int \| None | `None` | _≥1._ Keep checkpoints at every N steps permanently (e.g. ``keep_interval=100`` keeps step 100, 200, ...). If None, no interval-based keeping. | +| `ckpt.skip_progress` | bool | `False` | Skip loading the progress from checkpoint. | +| `ckpt.skip_buffer` | bool | `False` | Skip loading the buffer from checkpoint. | + + +### `heartbeat` + +BetterStack heartbeat configuration for monitoring training progress. + +| Field | Type | Default | Description | +|---|---|---|---| +| `heartbeat.url` | str | *required* | URL to send the heartbeat to. | + + +### `experimental` + + +### `weight_broadcast` + +Transport used to receive updated weights from the trainer. 
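A sketch of the NCCL variant on the orchestrator side, assuming a TOML config in which `type` selects the variant. The host, port, and timeout are the documented defaults; the world size is illustrative.

```toml
[weight_broadcast]
type = "nccl"
host = "localhost"         # documented default rendezvous host
port = 29501               # documented default rendezvous port
timeout = 1200             # documented default, in seconds
inference_world_size = 2   # illustrative; total inference GPUs across all servers
```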
+ +Discriminated union — set `weight_broadcast.type` to one of `filesystem`, `nccl` and provide the matching sub-fields. + + +#### `weight_broadcast.type = "filesystem"` (FileSystemWeightBroadcastConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `weight_broadcast.type` | 'filesystem' | `'filesystem'` | | + + +#### `weight_broadcast.type = "nccl"` (NCCLWeightBroadcastConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `weight_broadcast.type` | 'nccl' | `'nccl'` | | +| `weight_broadcast.host` | str | `'localhost'` | Host for the NCCL broadcast rendezvous. | +| `weight_broadcast.port` | int | `29501` | Port for the NCCL broadcast rendezvous. | +| `weight_broadcast.timeout` | int | `1200` | Timeout in seconds for the NCCL broadcast. | +| `weight_broadcast.quantize_in_weight_transfer` | bool | `False` | Use kernel-format FP8 quantized NCCL transfer for weight updates. | +| `weight_broadcast.inference_world_size` | int | `1` | _≥1._ Total inference GPUs across all servers. Used by ``init_nccl_broadcast`` to compute per-server rank offsets. | + + +### `rollout_transport` + +Transport used to ship rollouts from orchestrator to trainer. + +Discriminated union — set `rollout_transport.type` to one of `filesystem`, `zmq` and provide the matching sub-fields. + + +#### `rollout_transport.type = "filesystem"` (FileSystemTransportConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `rollout_transport.type` | 'filesystem' | `'filesystem'` | | + + +#### `rollout_transport.type = "zmq"` (ZMQTransportConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `rollout_transport.type` | 'zmq' | `'zmq'` | | +| `rollout_transport.host` | str | `'localhost'` | Host address for ZMQ transport. | +| `rollout_transport.port` | int | `5555` | Base port for ZMQ transport. | +| `rollout_transport.hwm` | int | `10` | High-water mark (max in-flight messages per ZMQ socket). 
| + + +## `inference` — Standalone vLLM server + +The `inference` entrypoint launches a vLLM server (or a disaggregated prefill/decode pair) that serves OpenAI-compatible completions to the orchestrator. + +_Defined in_ `prime_rl.configs.inference.InferenceConfig`. + +| Field | Type | Default | Description | +|---|---|---|---| +| `enable_lora` | bool | `False` | Enable LoRA. Forwarded as ``--enable-lora``. | +| `max_loras` | int | `8` | Maximum number of LoRAs. Forwarded as ``--max-loras``. | +| `max_cpu_loras` | int | `100` | Maximum number of LoRAs on CPU. Forwarded as ``--max-cpu-loras``. | +| `max_lora_rank` | int \| None | `None` | Maximum LoRA rank. Forwarded as ``--max-lora-rank``. | +| `lora_target_modules` | list[str] \| None | `None` | LoRA target modules. Forwarded as ``--lora-target-modules``. | +| `enable_prefix_caching` | bool \| None | `None` | Enable prefix caching. Forwarded as ``--enable-prefix-caching``. | +| `gpu_memory_utilization` | float | `0.9` | GPU memory utilization. Forwarded as ``--gpu-memory-utilization``. | +| `api_server_count` | int | `1` | _≥0._ API servers to run. Forwarded as ``--api-server-count``. Set to 0 for headless mode. | +| `data_parallel_size_local` | int \| None | `None` | _≥1._ Data parallel replicas to run on this node. Forwarded as ``--data-parallel-size-local``. | +| `data_parallel_rpc_port` | int | `13345` | _≥1, ≤65535._ RPC port for data parallel communication. Forwarded as ``--data-parallel-rpc-port``. | +| `seed` | int | `0` | Seed the inference components. Forwarded as ``--seed``. | +| `enable_expert_parallel` | bool | `False` | Enable expert parallelism for MoE models. Forwarded as ``--enable-expert-parallel``. | +| `all2all_backend` | 'allgather_reducescatter' \| 'deepep_high_throughput' \| 'deepep_low_latency' \| 'flashinfer_nvlink_one_sided' \| 'flashinfer_nvlink_two_sided' | `'allgather_reducescatter'` | All-to-all backend for expert-parallel communication. Forwarded as ``--all2all-backend``. 
| +| `enable_eplb` | bool | `False` | Enable expert parallel load balancer (EPLB). Forwarded as ``--enable-eplb``. | +| `enable_dbo` | bool | `False` | Enable dual batch overlap (DBO). Forwarded as ``--enable-dbo``. | +| `use_deep_gemm` | bool | `False` | Force DeepGEMM FP8 kernels via ``VLLM_USE_DEEP_GEMM=1``. Only works with per-tensor FP8 quantization (e.g. GLM-5-FP8). | +| `enable_return_routed_experts` | bool | `False` | Return routed experts in responses. Forwarded as ``--enable-return-routed-experts``. | +| `enable_fp32_lm_head` | bool | `False` | Run the lm_head projection in fp32 via a native bf16×bf16 → fp32 GEMM (``torch.mm`` with ``out_dtype=torch.float32``). Stabilizes logprob precision under FP8/bf16 inference, matching SGLang's ``--enable-fp32-lm-head``. Implemented as a monkey-patch over vLLM's LogitsProcessor, activated by setting ``additional_config["fp32_lm_head"] = True`` on the vLLM config. | +| `vllm_extra` | dict[str, Any] | `{}` | Extra arguments forwarded to vLLM. Applied as attributes on the vLLM namespace after config translation. | +| `output_dir` | Path | `'outputs'` | Directory for SLURM logs and generated scripts. | +| `dry_run` | bool | `False` | Only validate and dump resolved configs, then exit early. | + + +### `server` + +| Field | Type | Default | Description | +|---|---|---|---| +| `server.host` | str \| None | `None` | Host to bind to. | +| `server.port` | int | `8000` | Port to bind to. | +| `server.liveness_timeout_seconds` | float | `30.0` | _>0._ Timeout in seconds for the ``/liveness`` endpoint's internal vLLM worker RPC. With Kubernetes liveness probes, keep the probe ``timeoutSeconds`` at least this high. | + + +### `model` + +| Field | Type | Default | Description | +|---|---|---|---| +| `model.name` | str | `'Qwen/Qwen3-0.6B'` | HF model name or local path. | +| `model.trust_remote_code` | bool | `False` | Trust remote code. Forwarded to vLLM engine init. 
| +| `model.dtype` | 'auto' \| 'float16' \| 'bfloat16' \| 'float32' | `'auto'` | dtype for model weights and activations. ``auto`` uses FP16 for FP32/FP16 models and BF16 for BF16 models. Forwarded as ``--dtype``. | +| `model.max_model_len` | int \| None | `None` | Maximum model context length. If None, uses the model config's value. Forwarded as ``--max-model-len``. | +| `model.enforce_eager` | bool | `False` | Enforce eager mode. When False, PyTorch eager and cuda graphs run hybrid for maximum performance. Forwarded as ``--enforce-eager``. | +| `model.chat_template` | str \| None | `None` | Chat template — a Jinja2 template string or path to a template file. Forwarded as ``--chat-template``. If None, uses the model's default. | +| `model.tool_call_parser` | str \| None | `'auto'` | Tool-call parser. Forwarded as ``--tool-call-parser``. Set to ``"auto"`` (default) to detect from the model name, or ``None`` to disable. | +| `model.reasoning_parser` | str \| None | `'auto'` | Parser for extracting reasoning content from model outputs. Forwarded as ``--reasoning-parser``. Set to ``"auto"`` (default) to detect from the model name, or ``None`` to disable. | +| `model.rope_scaling` | dict[str, Any] \| str \| None | `None` | RoPE scaling configuration as a dict (e.g. ``{rope_type="yarn", factor=4.0, original_max_position_embeddings=32768}``). Forwarded as ``--rope-scaling``. | + + +#### `model.vlm` + +VLM configuration. Setting this enables vision-language model support. + +| Field | Type | Default | Description | +|---|---|---|---| +| `model.vlm.vision_encoder_attr` | str | *required* | Dotted attribute path to the vision encoder module (e.g. ``model.visual``). | +| `model.vlm.language_model_attr` | str | *required* | Dotted attribute path to the language model module (e.g. ``model.language_model``). | +| `model.vlm.freeze_vision_encoder` | bool | `True` | Freeze the vision encoder. When False, it is trainable and FSDP-sharded per-block. 
No effect with LoRA (LoRA freezes all non-adapter parameters). | + + +### `parallel` + +Multi-node and multi-GPU parallelism (TP, DP, PP). + +| Field | Type | Default | Description | +|---|---|---|---| +| `parallel.tp` | int | `1` | Tensor parallel size. Forwarded to vLLM as ``--tensor-parallel-size``. | +| `parallel.dp` | int | `1` | _≥1._ Data parallel size. Forwarded to vLLM as ``--data-parallel-size``. | + + +### `weight_broadcast` + +| Field | Type | Default | Description | +|---|---|---|---| +| `weight_broadcast.type` | 'nccl' \| 'filesystem' | `'filesystem'` | Weight broadcast transport. | + + +### `kv_cache_offload` + +CPU KV cache offload for inference workers. Standard inference uses vLLM's ``OffloadingConnector``. Disaggregated P/D deployments combine it with NIXL through ``MultiConnector`` in the SLURM launcher. + +| Field | Type | Default | Description | +|---|---|---|---| +| `kv_cache_offload.cpu_bytes` | int | `1000000000` | _>0._ CPU bytes available for KV cache offloading per worker. | + + +### `slurm` + +SLURM configuration. When set, the run is submitted as a SLURM job instead of running locally. + +| Field | Type | Default | Description | +|---|---|---|---| +| `slurm.job_name` | str | `'prime-rl'` | SLURM job name. | +| `slurm.project_dir` | Path | `'.'` | Path to the project root, used to source .env, activate .venv, and run uv sync. | +| `slurm.template_path` | Path \| None | `None` | SLURM template file. If None, uses the bundled single-node or multi-node template. | +| `slurm.partition` | str | `'cluster'` | SLURM partition (#SBATCH --partition). | +| `slurm.nodelist` | str \| None | `None` | Comma-separated list of specific nodes to run on (#SBATCH --nodelist). | +| `slurm.exclude` | str \| None | `None` | Comma-separated list of nodes to exclude (#SBATCH --exclude). | +| `slurm.account` | str \| None | `None` | SLURM account to charge (#SBATCH --account). | +| `slurm.time` | str \| None | `None` | Maximum wall time, e.g. 
'24:00:00' or '7-00:00:00' (#SBATCH --time). | +| `slurm.pre_run_command` | str \| None | `None` | Shell command to run on the head node after cd, .env sourcing, and venv activation. Useful for cleanup like ``sudo pkill -f vllm``; wrap with ``srun bash -c '...'`` to fan out to all nodes. | + + +### `experimental` + + +### `deployment` + +Discriminated union — set `deployment.type` to one of `single_node`, `multi_node`, `disaggregated` and provide the matching sub-fields. + + +#### `deployment.type = "single_node"` (SingleNodeInferenceDeploymentConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `deployment.gpus_per_node` | int | `8` | GPUs per node. | +| `deployment.type` | 'single_node' | `'single_node'` | | + + +#### `deployment.type = "multi_node"` (MultiNodeInferenceDeploymentConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `deployment.gpus_per_node` | int | `8` | GPUs per node. | +| `deployment.type` | 'multi_node' | `'multi_node'` | | +| `deployment.num_nodes` | int | `2` | _≥1._ Inference nodes. | +| `deployment.router_port` | int | `8000` | Port for the vllm-router. | +| `deployment.backend_port` | int | `8100` | Port for vLLM backend instances. | +| `deployment.router_policy` | str | `'consistent_hash'` | vllm-router routing policy (e.g. ``consistent_hash``, ``round_robin``). | + + +#### `deployment.type = "disaggregated"` (DisaggregatedInferenceDeploymentConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `deployment.gpus_per_node` | int | `8` | GPUs per node. | +| `deployment.type` | 'disaggregated' | `'disaggregated'` | | +| `deployment.num_prefill_nodes` | int | `1` | _≥1._ Total prefill nodes. | +| `deployment.num_decode_nodes` | int | `1` | _≥1._ Total decode nodes. | +| `deployment.num_prefill_replicas` | int | `1` | _≥1._ Independent prefill vLLM instances. Must evenly divide ``num_prefill_nodes``. 
| +| `deployment.num_decode_replicas` | int | `1` | _≥1._ Independent decode vLLM instances. Must evenly divide ``num_decode_nodes``. | +| `deployment.router_port` | int | `8000` | Port for the vllm-router on each replica. | +| `deployment.prefill_port` | int | `8100` | Port for prefill vLLM instances. | +| `deployment.decode_port` | int | `8200` | Port for decode vLLM instances. | +| `deployment.router_policy` | str | `'consistent_hash'` | vllm-router routing policy (e.g. ``consistent_hash``, ``round_robin``). | +| `deployment.prefill_env_overrides` | dict[str, str] | `{}` | Extra environment variables exported only on prefill nodes. | +| `deployment.decode_env_overrides` | dict[str, str] | `{}` | Extra environment variables exported only on decode nodes. | + diff --git a/docs/scaling.md b/docs/scaling.md new file mode 100644 index 0000000000..1f2bf3bba0 --- /dev/null +++ b/docs/scaling.md @@ -0,0 +1,581 @@ +# Scaling + +This page covers how to scale `prime-rl` from a single GPU to a 1000-GPU cluster: single-node multi-GPU layouts, multi-node SLURM and Kubernetes deployments, FSDP / expert parallelism / context parallelism, disaggregated prefill/decode inference, and throughput benchmarking. For knobs that fit on one box, see [Training](training.md) first. 
+ +## Table of Contents + +- [Choosing a layout](#choosing-a-layout) +- [Single GPU](#single-gpu) +- [Single-node multi-GPU](#single-node-multi-gpu) + - [RL placement](#rl-placement) + - [SFT and torchrun](#sft-and-torchrun) +- [Parallelism knobs](#parallelism-knobs) + - [FSDP](#fsdp) + - [Expert parallelism](#expert-parallelism) + - [Context parallelism](#context-parallelism) + - [Activation checkpointing and offloading](#activation-checkpointing-and-offloading) + - [CPU optimizer offload](#cpu-optimizer-offload) +- [Memory-tight recipe](#memory-tight-recipe) +- [Multi-node (manual)](#multi-node-manual) + - [RL training](#rl-training) + - [SFT training](#sft-training) + - [Multi-node inference](#multi-node-inference) +- [SLURM](#slurm) + - [Activation](#activation) + - [`[slurm]` and `[deployment]` reference](#slurm-and-deployment-reference) + - [RL example](#rl-example) + - [SFT and inference examples](#sft-and-inference-examples) + - [Custom templates](#custom-templates) +- [Kubernetes](#kubernetes) +- [Disaggregated prefill/decode inference](#disaggregated-prefilldecode-inference) +- [Benchmarking](#benchmarking) +- [Multi-node logs](#multi-node-logs) + +## Choosing a layout + +| You have… | Use this layout | +|---|---| +| 1 GPU | Single-GPU co-located RL (small model) or SFT-only | +| 1 node, 2–8 GPUs | `uv run rl` with `--inference-gpu-ids` / `--trainer-gpu-ids` | +| 1 node, 8 GPUs, large MoE | Custom impl + EP + activation checkpointing | +| 2+ nodes, SLURM | `[slurm]` + `[deployment]` overlay (recommended) | +| 2+ nodes, no SLURM | Manual `uv run inference` + `uv run orchestrator` + `uv run torchrun src/.../train.py` | +| Kubernetes | The bundled Helm chart at `k8s/prime-rl` | +| Production MoE with long contexts | Disaggregated prefill/decode inference | + +## Single GPU + +The trainer and inference server can share a GPU for small models or smoke tests. 
Pin both to GPU 0 and tighten the inference memory budget so the trainer has room: + +```bash +bash scripts/tmux.sh + +uv run rl @ configs//rl.toml \ + --trainer-gpu-ids 0 \ + --inference-gpu-ids 0 \ + --inference.gpu-memory-utilization 0.5 +``` + +Or launch the three processes manually if you want full control over each pane: + +```bash +# inference pane +uv run inference @ infer.toml --gpu-memory-utilization 0.5 +# orchestrator pane +uv run orchestrator @ orch.toml +# trainer pane +uv run trainer @ train.toml +``` + +For SFT, single-GPU is the default — `uv run sft` runs without torchrun unless you ask for multiple processes. + +## Single-node multi-GPU + +### RL placement + +`rl` defaults to GPU 0 for inference and GPU 1 for the trainer. Override the placement for a typical 8-GPU node by giving inference 6 GPUs with data parallelism and the trainer the remaining 2: + +```bash +uv run rl @ rl.toml \ + --inference-gpu-ids 0,1,2,3,4,5 \ + --trainer-gpu-ids 6,7 \ + --inference.parallel.dp 6 +``` + +For quick A/B ablations on the same node, run two RL instances side-by-side in separate tmux sessions, each pinned to half the GPUs and a separate inference port: + +```bash +# session 1, GPUs 0–1, default port 8000 +bash scripts/tmux.sh -s exp1 -o outputs/exp1 +uv run rl @ rl.toml --output-dir outputs/exp1 + +# session 2, GPUs 2–3, port 8001 +bash scripts/tmux.sh -s exp2 -o outputs/exp2 +uv run rl @ rl.toml \ + --inference-gpu-ids 2 --trainer-gpu-ids 3 \ + --inference.server.port 8001 \ + --orchestrator.client.base-url http://localhost:8001/v1 \ + --output-dir outputs/exp2 +``` + +### SFT and torchrun + +`uv run sft` manages torchrun internally — you don't need to call torchrun yourself. To scale from 1 to N GPUs, set the deployment GPU count (or just let it pick up `WORLD_SIZE`). 
For non-default layouts, the manual equivalent is: + +```bash +uv run torchrun \ + --nproc-per-node 8 \ + --local-ranks-filter 0 \ + src/prime_rl/trainer/sft/train.py @ sft.toml +``` + +`--local-ranks-filter 0` keeps console output to rank 0 only; per-rank stdout/stderr is still captured in `/logs/trainer/torchrun/`. + +## Parallelism knobs + +### FSDP + +FSDP2 is the default model sharding strategy. By default the trainer fully shards parameters, gradients, and optimizer state across the data-parallel mesh. Tweakable knobs: + +| Knob | Effect | +|---|---| +| `trainer.model.dp_replicate` | Number of dimensions to **replicate** instead of shard. Set to 2 to run 2-way DP replication × FSDP sharding within each replica — useful for very large clusters where pure FSDP communication dominates. | +| `trainer.model.reshard_after_forward` | If `true` (default), parameters are resharded after the forward pass to free memory; the backward pass re-gathers. Set `false` to keep params resident — faster but more memory. | +| `trainer.model.fsdp_cpu_offload` | Offload params + grads + optimizer state to CPU. Big memory win, large throughput hit. | +| `trainer.model.optim_cpu_offload` | Offload only optimizer state. Mid-ground — small throughput cost, decent memory savings, especially at low GPU count. | + +### Expert parallelism + +EP shards MoE expert weights across the EP mesh, dramatically reducing the FSDP communication volume per layer. EP is only available with the custom model implementation (`model.impl = "custom"` or `"auto"` for supported families). + +```toml +[trainer.model] +impl = "custom" +ep = 8 # EP degree; must divide num_experts +ep_comm_backend = "torch" # or "deepep" +``` + +`ep_comm_backend = "deepep"` uses DeepEP's custom dispatch/combine kernels for speed, with two extra knobs (`deepep_num_sms`, `deepep_token_chunk_size`) — tune on your hardware. See [Reference § `trainer.model`](reference.md#trainer-model) for the full set. 
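The `ep` constraint above (the EP degree must divide the expert count) is easy to check before you launch. The following is a minimal sketch, not part of `prime-rl`: the helper names `experts_per_rank` and `valid_ep_degrees` are hypothetical, and the expert count of 128 is an illustrative number rather than something read from a model config.

```python
def experts_per_rank(num_experts: int, ep: int) -> int:
    """Return how many experts each EP rank holds, or raise if ep is invalid."""
    if ep < 1:
        raise ValueError(f"ep must be >= 1, got {ep}")
    if num_experts % ep != 0:
        raise ValueError(f"ep={ep} does not divide num_experts={num_experts}")
    return num_experts // ep


def valid_ep_degrees(num_experts: int, max_gpus: int) -> list[int]:
    """List every EP degree up to max_gpus that evenly divides the expert count."""
    return [ep for ep in range(1, max_gpus + 1) if num_experts % ep == 0]


if __name__ == "__main__":
    # 128 experts is an illustrative value (hypothetical), not taken from a real config.
    print(experts_per_rank(128, 8))
    print(valid_ep_degrees(128, 8))
```

Running a check like this against the `num_experts` value in your model's config tells you which `ep` values are worth trying on a given node size.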
+ +### Context parallelism + +CP shards a single sequence across multiple GPUs along the token dimension — necessary for sequences past ~32K tokens. Only available with the custom impl and flash-attention. + +```toml +[trainer.model] +impl = "custom" +attn = "flash_attention_2" # or fa3 / fa4 +cp = 2 # CP degree (typically 2, 4, or 8) +cp_style = "ring" # "ulysses" for non-FA kernels +``` + +`cp = 2` or `cp = 4` works for most 128K-token training. Pushing past CP 8 typically isn't worth it — cross-node CP collectives become the bottleneck. + +### Activation checkpointing and offloading + +| Knob | Memory ↓ | Throughput ↓ | +|---|---|---| +| `trainer.model.ac` | large | ~25% | +| `trainer.model.ac.mode = "selective"` | medium | small | +| `trainer.model.ac_offloading` | extra (offloads checkpoints to CPU) | a bit more | + +Enable selective AC (custom impl only) for the best memory/throughput tradeoff: + +```toml +[trainer.model.ac] +mode = "selective" +targets = ["norm", "attn_proj"] # see Reference for the full list per architecture +``` + +### CPU optimizer offload + +In RL, the trainer typically does many gradient-accumulation steps per optimizer step, so the offload cost is amortized. Offloading optimizer states to CPU is a near-free memory win at low GPU counts: + +```toml +[trainer.optim] +# any optimizer type +type = "adamw" + +[trainer.model] +optim_cpu_offload = true +``` + +Mutually exclusive with `fsdp_cpu_offload`. Not supported with the Muon optimizer. 
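To judge whether `optim_cpu_offload` is worth it, a back-of-envelope estimate of the optimizer state per GPU helps. The sketch below assumes AdamW keeps two moment tensors per parameter, stored in fp32 and sharded evenly across the FSDP ranks; it ignores any master weights and allocator overhead. The dtype and sharding that `prime-rl` actually uses are not stated here, so treat the numbers as rough, and note that `adamw_state_gib_per_gpu` is a hypothetical helper.

```python
def adamw_state_gib_per_gpu(
    num_params: float,
    num_shards: int,
    bytes_per_moment: int = 4,  # assumption: fp32 moments
) -> float:
    """Estimate AdamW moment memory per GPU in GiB.

    AdamW stores two moments (first and second) per parameter, so the total
    is 2 * bytes_per_moment * num_params, divided evenly across the shards.
    """
    if num_shards < 1:
        raise ValueError("num_shards must be >= 1")
    total_bytes = 2 * bytes_per_moment * num_params
    return total_bytes / num_shards / 2**30


if __name__ == "__main__":
    # Illustrative sizes only: a 30B-parameter model on 8 vs. 64 GPUs.
    for shards in (8, 64):
        gib = adamw_state_gib_per_gpu(30e9, shards)
        print(f"{shards} GPUs: {gib:.1f} GiB of AdamW state per GPU")
```

The estimate shrinks linearly with the number of shards, which matches the note above that the offload pays off mostly at low GPU counts.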
+ +## Memory-tight recipe + +The kitchen-sink config for fitting large MoE on limited GPUs at acceptable throughput: + +```toml +[trainer.model] +impl = "custom" +attn = "flash_attention_2" +fused_lm_head_token_chunk_size = 1024 +ep = 8 +cp = 2 +optim_cpu_offload = true + +[trainer.model.compile] + +[trainer.model.ac] +freq = 1 + +[trainer.model.ac_offloading] +max_inflight_activations = 1 +``` + +Walks through every memory lever in order: FSDP+EP shard the weights, CP shards the activations along the token dim, AC + AC offloading shrink the activation footprint, fused LM head chunks the loss, `torch.compile` reduces fragmentation, optim offload moves Adam state off GPU. Apply selectively — each knob has a throughput cost. + +## Multi-node (manual) + +When you don't have SLURM (or want fine-grained control), launch each process by hand. Multi-node RL currently requires a **shared filesystem** for the rollout transport and the weight broadcast. + +### RL training + +```bash +# On all nodes +export OUTPUT_DIR=/shared/outputs/my-run +export INFERENCE_SERVER_IP=10.0.0.1 +export INFERENCE_SERVER_API_KEY=... +``` + +```bash +# Inference node +uv run inference @ infer.toml \ + --api-key $INFERENCE_SERVER_API_KEY \ + --parallel.tp 4 --parallel.dp 2 + +# Orchestrator (either node) +uv run orchestrator @ orch.toml \ + --client.base-url http://$INFERENCE_SERVER_IP:8000/v1 \ + --client.api-key-var INFERENCE_SERVER_API_KEY \ + --output-dir $OUTPUT_DIR + +# Trainer node +uv run torchrun \ + --nproc-per-node 8 \ + --local-ranks-filter 0 \ + src/prime_rl/trainer/rl/train.py @ train.toml \ + --output-dir $OUTPUT_DIR +``` + +You can scale inference and trainer independently — multiple inference nodes (each running its own vLLM replica), one orchestrator, one or more trainer nodes. The orchestrator must be a single instance. 
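Before starting the orchestrator in a manual launch, it can save time to confirm that the inference server is reachable at the URL you are about to pass as `--client.base-url`. The sketch below is an assumption-laden helper, not a `prime-rl` utility: it relies on the server exposing the OpenAI-compatible `/v1/models` route and accepting the key as a `Bearer` token, and the function names are hypothetical.

```python
import os
import urllib.request
from typing import Optional


def build_models_request(
    host: str, port: int = 8000, api_key: Optional[str] = None
) -> urllib.request.Request:
    """Build a GET request for the OpenAI-compatible /v1/models route."""
    req = urllib.request.Request(f"http://{host}:{port}/v1/models")
    if api_key:
        # Assumption: the server checks a standard Bearer token.
        req.add_header("Authorization", f"Bearer {api_key}")
    return req


def server_is_up(req: urllib.request.Request, timeout: float = 5.0) -> bool:
    """Return True if the server answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError, and socket timeouts all derive from OSError.
        return False


if __name__ == "__main__":
    request = build_models_request(
        os.environ.get("INFERENCE_SERVER_IP", "localhost"),
        api_key=os.environ.get("INFERENCE_SERVER_API_KEY"),
    )
    # Only print the target here; call server_is_up(request) to actually probe it.
    print(request.full_url)
```

Calling `server_is_up(request)` in a retry loop from the orchestrator node is one way to wait for vLLM to finish loading weights before you launch the remaining processes.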
+ +### SFT training + +For multi-node SFT, point torchrun at a rendezvous endpoint: + +```bash +# On all nodes +export MASTER_ADDR=10.0.0.1 +export MASTER_PORT=29500 +export GLOO_SOCKET_IFNAME=... # only if default isn't routable +export NCCL_SOCKET_IFNAME=... + +# Node 0 +uv run torchrun \ + --nnodes 2 --node-rank 0 \ + --rdzv-endpoint=$MASTER_ADDR:$MASTER_PORT \ + --local-ranks-filter 0 \ + --nproc-per-node 8 \ + src/prime_rl/trainer/sft/train.py @ sft.toml + +# Node 1 — same but --node-rank 1 +``` + +If your nodes aren't colocated, set up a VPN (e.g. Tailscale) and use the VPN-resolvable IP for `MASTER_ADDR`. + +### Multi-node inference + +Multi-node vLLM uses native data parallelism — see the [vLLM docs](https://docs.vllm.ai/en/v0.10.0/serving/data_parallel_deployment.html). For TP=4, DP=4, two nodes: + +```bash +# Node 0 — DP ranks 0,1 +uv run inference \ + --parallel.tp 4 --parallel.dp 4 \ + --data-parallel-size-local 2 \ + --data-parallel-address $DATA_PARALLEL_ADDRESS \ + --data-parallel-rpc-port $DATA_PARALLEL_RPC_PORT + +# Node 1 — DP ranks 2,3 (headless) +uv run inference \ + --parallel.tp 4 --parallel.dp 4 \ + --data-parallel-size-local 2 \ + --data-parallel-address $DATA_PARALLEL_ADDRESS \ + --data-parallel-rpc-port $DATA_PARALLEL_RPC_PORT \ + --data-parallel-start-rank 2 \ + --headless +``` + +## SLURM + +The `rl`, `sft`, and `inference` entrypoints all submit to SLURM when a `[slurm]` table is present — there's no separate entrypoint. 
+ +### Activation + +A SLURM config is usually a thin overlay that inherits a base config and adds `[slurm]` (and `[deployment]` for multi-node): + +```toml +# my_slurm.toml +toml_files = ["base_rl.toml"] +output_dir = "/shared/outputs/my-rl" + +[slurm] +job_name = "my-rl-run" +``` + +Launch: + +```bash +uv run rl @ my_slurm.toml # submits via sbatch +uv run rl @ my_slurm.toml --dry-run # writes the sbatch script + resolved config, exits +``` + +The dry-run mode is invaluable — inspect `/job.sbatch` and the per-process TOMLs before burning a queue slot. + +### `[slurm]` and `[deployment]` reference + +| `[slurm]` field | Default | Description | +|---|---|---| +| `job_name` | `"prime-rl"` | `#SBATCH --job-name` | +| `project_dir` | `"."` | Project root on the cluster (used to source `.env`, activate `.venv`, run `uv sync`) | +| `partition` | `"cluster"` | `#SBATCH --partition` | +| `nodelist` / `exclude` | `None` | `--nodelist` / `--exclude` | +| `account` | `None` | `--account` | +| `time` | `None` | Wall-time limit | +| `pre_run_command` | `None` | Shell command on head node before launch (cleanup, `pkill`, etc.) | +| `template_path` | auto-selected | Override the Jinja2 template | + +`[deployment]` is a discriminated union picked by `type` — `single_node` or `multi_node` for RL/SFT, with an extra disaggregated variant for inference. 
RL multi-node: + +```toml +[deployment] +type = "multi_node" +num_train_nodes = 2 +num_infer_nodes = 1 +gpus_per_node = 8 # default +nodes_per_fsdp_group = 1 # optional — controls FSDP island size +``` + +SFT multi-node: + +```toml +[deployment] +type = "multi_node" +num_nodes = 2 +gpus_per_node = 8 +``` + +### RL example + +A two-node RL run with NCCL weight broadcast and a 30B MoE student: + +```toml +toml_files = ["base.toml"] +output_dir = "/shared/outputs/rl-math-moe" +max_steps = 500 +seq_len = 2048 + +[slurm] +job_name = "hendrycks-math-rl-moe" + +[deployment] +type = "multi_node" +num_train_nodes = 1 +num_infer_nodes = 1 + +[weight_broadcast] +type = "nccl" # synchronous; max_async_level forced to 1 + +[model] +name = "Qwen/Qwen3-30B-A3B-Thinking-2507" + +[trainer.model] +impl = "custom" +attn = "flash_attention_3" +optim_cpu_offload = true + +[trainer.model.ac] +freq = 1 + +[trainer.model.ac_offloading] +max_inflight_activations = 5 + +[orchestrator] +batch_size = 512 +rollouts_per_example = 16 + +[[orchestrator.train.env]] +id = "math-env" +name = "hendrycks-math" +args = { dataset_name = "PrimeIntellect/Hendrycks-Math", dataset_subset = "default" } + +[inference.parallel] +tp = 4 +dp = 2 +``` + +See [`examples/hendrycks_math/rl.toml`](https://github.com/PrimeIntellect-ai/prime-rl/blob/main/examples/hendrycks_math/rl.toml) for a complete worked example. 
+ +### SFT and inference examples + +SFT multi-node MoE: + +```toml +toml_files = ["base_sft.toml"] +output_dir = "/shared/outputs/sft-moe-math" +max_steps = 500 + +[slurm] +job_name = "sft-moe-math" + +[deployment] +type = "multi_node" +num_nodes = 2 + +[model] +name = "Qwen/Qwen3-30B-A3B-Thinking-2507" +impl = "custom" + +[data] +type = "sft" +name = "PrimeIntellect/INTELLECT-3-SFT-10K" +batch_size = 128 +seq_len = 8192 +``` + +Multi-node inference (each node runs an independent vLLM replica — TP and DP must fit within one node): + +```toml +output_dir = "/shared/outputs/my-inference" + +[model] +name = "PrimeIntellect/INTELLECT-3-RL-600" + +[parallel] +tp = 4 +dp = 2 + +[deployment] +type = "multi_node" +num_nodes = 4 + +[slurm] +job_name = "my-inference" +``` + +Submission prints one URL per node — point clients at any of them, or front them with a router. + +### Custom templates + +For unusual partitions, module loads, or environment setup, supply your own Jinja2 template: + +```bash +uv run rl @ my_config.toml --slurm.template-path path/to/my_template.sbatch.j2 +``` + +The default templates live under [`src/prime_rl/templates/`](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/src/prime_rl/templates) — copy one as a starting point. + +## Kubernetes + +For Kubernetes-managed clusters, `prime-rl` ships a Helm chart at [`k8s/prime-rl`](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/k8s/prime-rl). It deploys three StatefulSets (orchestrator, trainer, inference) sharing a single `ReadWriteMany` PVC mounted at `/data`. + +```bash +# Deploy with an example values file +helm install my-exp ./k8s/prime-rl -f ./k8s/prime-rl/examples/reverse-text.yaml + +# Or with custom overrides +helm install my-exp ./k8s/prime-rl --set trainer.replicas=3 --set inference.replicas=2 +``` + +After deployment, `kubectl exec` into `-trainer-0` and launch with `uv run trainer @ ` (or `uv run rl @ `). 
All three pod groups discover each other via stable DNS hostnames (`-{trainer,orchestrator,inference}-.-{...}-headless..svc.cluster.local`). + +Environment variables provided to every pod: + +- `$POD_NAME`, `$POD_IP` — standard K8s +- `$STATEFUL_REPLICAS` — total replicas for this component +- `$HEADLESS_SERVICE` — DNS suffix for peer discovery +- `$INFERENCE_URL` — first inference pod's URL (set in orchestrator and trainer pods) + +For distributed trainer launches inside K8s, extract the rank from the pod name and feed it to torchrun: + +```bash +RANK=$(echo $POD_NAME | grep -o '[0-9]*$') +torchrun \ + --nnodes=$STATEFUL_REPLICAS --node-rank=$RANK \ + --nproc-per-node=8 \ + --rdzv-endpoint=my-exp-trainer-0.$HEADLESS_SERVICE:29501 \ + src/prime_rl/trainer/rl/train.py @ /data/configs/train.toml +``` + +Common operations (logs, exec, scale, uninstall) are standard `kubectl`/`helm`. Auth (W&B, HF) is via K8s secrets — set `config.secrets.enabled=true` and `config.secrets.name=`. + +## Disaggregated prefill/decode inference + +For large MoE serving, splitting prefill and decode onto separate vLLM groups can substantially improve throughput. Pick the prefill:decode ratio based on workload shape: + +| Workload | P:D ratio | Why | +|---|---|---| +| Agentic (SWE, Lean) | 3:1 | Long growing contexts → prefill-heavy | +| Non-agentic (math, chat) | 1:2 | Short prompts, long generations → decode-heavy | + +Example config: [`configs/glm5_disagg_inference/inference.toml`](https://github.com/PrimeIntellect-ai/prime-rl/blob/main/configs/glm5_disagg_inference/inference.toml). 
Launch with the standard `inference` entrypoint: + +```bash +uv run inference @ configs/glm5_disagg_inference/inference.toml --output-dir /data/$USER/outputs +``` + +Monitor live queue depths to detect imbalance: + +```bash +curl -s http://:8100/metrics | grep num_requests_waiting +curl -s http://:8200/metrics | grep num_requests_waiting +``` + +If prefill queues and decode is idle, add prefill nodes (and vice versa). + +**UCX 1.19 requirement.** NVSHMEM needs UCX ≥ 1.19 for multi-GPU CUDA. Most clusters ship UCX 1.17 via HPC-X, which manifests as `cuStreamCreate: invalid device context` errors during DeepEP internode dispatch. Check with `/opt/hpcx/ucx/bin/ucx_info -v` and, if needed, build from source: + +```bash +salloc -N 1 --gres=gpu:1 bash -c 'bash scripts/install_nixl_from_source.sh' +``` + +The script writes UCX 1.19 to `third_party/ucx/`; the bundled sbatch templates prepend it to `LD_LIBRARY_PATH` so it overrides the system version. + +## Benchmarking + +Every entrypoint supports a `--bench` flag that runs a few warm-up + measurement steps with fake data and prints a rich-formatted throughput / MFU table: + +```bash +# SFT trainer alone +uv run sft @ sft.toml --bench +uv run sft ... --data.type fake --data.length variable --bench # variable-length fake data + +# RL trainer alone (no inference involved) +uv run trainer @ train.toml --data.fake --bench + +# Inference alone — start the server normally, then bench the orchestrator +uv run inference @ infer.toml +uv run orchestrator @ orch.toml --bench + +# Full RL stack (trainer with fake data, inference with real data from orchestrator) +uv run rl @ rl.toml --bench +``` + +Persist results with `--bench.output-json`. Use this to compare parallelism configs before committing a multi-day run. 
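Once you have two files written by `--bench.output-json`, a small script can show how the numbers moved between parallelism configs. The schema of that JSON is not documented on this page, so the sketch below makes no assumption about field names: it treats each file as a flat JSON object and reports the relative change of every numeric field the two files share. The file paths and helper names are hypothetical.

```python
import json
from pathlib import Path


def numeric_fields(record: dict) -> dict[str, float]:
    """Keep only top-level int/float values (booleans are excluded)."""
    return {
        key: float(value)
        for key, value in record.items()
        if isinstance(value, (int, float)) and not isinstance(value, bool)
    }


def compare(baseline: dict, candidate: dict) -> dict[str, float]:
    """Relative change (candidate vs. baseline) for each shared numeric field."""
    base, cand = numeric_fields(baseline), numeric_fields(candidate)
    return {
        key: (cand[key] - base[key]) / base[key]
        for key in sorted(base.keys() & cand.keys())
        if base[key] != 0  # skip fields where a ratio is undefined
    }


def compare_files(baseline_path: str, candidate_path: str) -> dict[str, float]:
    """Load two benchmark JSON files and compare them."""
    baseline = json.loads(Path(baseline_path).read_text())
    candidate = json.loads(Path(candidate_path).read_text())
    return compare(baseline, candidate)


if __name__ == "__main__":
    # Inline stand-in data; real runs would call compare_files(...) instead.
    changes = compare({"metric_a": 100, "label": "ep4"}, {"metric_a": 125, "label": "ep8"})
    for name, change in changes.items():
        print(f"{name}: {change:+.1%}")
```

If the real output nests its metrics, flatten it first or point `compare` at the nested object that holds the numbers you care about.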
+ +## Multi-node logs + +Log layout under `/logs/`: + +``` +trainer.log # symlink → trainer/node_0.log +inference.log # symlink → inference/node_0.log +orchestrator.log # single instance, single file +trainer/ + node_*.log # per-node trainer stdout (rank 0 only) + torchrun/ # per-rank stdout/stderr +inference/ + node_*.log # per-node inference stdout + router_*.log # vllm-router per replica +envs/ + {train,eval}// + env_server.log + env_worker_*.log +``` + +Live tailing from the head node: + +```bash +tail -F /logs/{trainer,orchestrator,inference}.log +tail -F /logs/trainer/node_*.log +tail -F /logs/inference/router_*.log +``` + +The tmux helper also works on the head node: + +```bash +bash scripts/tmux.sh my-rl-job /shared/outputs/my-rl-job +``` + +For multi-rank trainer debugging, drop into `logs/trainer/torchrun//attempt_0//{stdout,stderr}.log` — verbose and per-rank. diff --git a/docs/slurm.md b/docs/slurm.md deleted file mode 100644 index 7b171d4894..0000000000 --- a/docs/slurm.md +++ /dev/null @@ -1,298 +0,0 @@ -# SLURM - -The `rl`, `sft`, and `inference` entrypoints all have built-in SLURM support. Adding a `[slurm]` section to your config switches from local execution to SLURM job submission — no separate entrypoint needed. - -## Quick Start - -```bash -# Local run -uv run rl @ examples/reverse_text/rl.toml - -# SLURM run (same entrypoint, just add [slurm] to the config) -uv run rl @ examples/reverse_text/slurm_rl.toml -``` - -The SLURM config is a thin overlay that inherits from a base config and adds `[slurm]` + `[deployment]` sections: - -```toml -# examples/reverse_text/slurm_rl.toml -toml_files = ["rl.toml"] - -output_dir = "outputs/reverse-text-rl" - -[slurm] -job_name = "reverse-text-rl" -``` - -## How it works - -When `[slurm]` is present, the entrypoint: - -1. Resolves the full config -2. Renders a SLURM batch script from a Jinja2 template -3. Writes the script and resolved config to `{output_dir}/` -4. 
Submits via `sbatch` (or prints the script with `--slurm.dry-run`) - -For **single-node** jobs, the entire config is dumped to a TOML file and the template simply runs `uv run rl @` or `uv run sft @` on the allocated node. - -For **multi-node** jobs, sub-configs are written separately and `srun` dispatches processes across nodes. - -## Configuration - -### `[slurm]` — Job submission (shared between RL and SFT) - -| Field | Description | Default | -|---|---|---| -| `job_name` | SLURM job name | `"prime-rl"` | -| `project_dir` | Path to the project root on the cluster | `"."` | -| `template_path` | Path to a custom Jinja2 template | auto-selected | -| `partition` | SLURM partition | `"cluster"` | -| `nodelist` | Comma-separated list of specific nodes to run on (`--nodelist`) | `None` | -| `exclude` | Comma-separated list of nodes to exclude (`--exclude`) | `None` | -| `account` | SLURM account to charge (`--account`) | `None` | -| `time` | Maximum wall time, e.g. `"24:00:00"` (`--time`) | `None` | -| `pre_run_command` | Shell command to run on head node after env setup, before starting the job (e.g. 
cleanup) | `None` | - -### `[deployment]` — Node and GPU allocation - -**RL** uses a discriminated union with `type = "single_node"` (default) or `type = "multi_node"`: - -| Field | single_node | multi_node | -|---|---|---| -| `gpus_per_node` | Number of GPUs per node (default: 8) | Same | -| `num_train_gpus` | Training GPUs | — | -| `num_infer_gpus` | Inference GPUs | — | -| `num_train_nodes` | — | Training nodes | -| `num_infer_nodes` | — | Inference nodes | -| `nodes_per_fsdp_group` | — | Nodes per FSDP island (optional) | - -**SFT** follows the same pattern but only has training nodes: - -| Field | single_node | multi_node | -|---|---|---| -| `gpus_per_node` | Number of GPUs per node (default: 8) | Same | -| `num_gpus` | Number of GPUs (default: 1) | — | -| `num_nodes` | — | Training nodes (default: 2) | -| `nodes_per_fsdp_group` | — | Nodes per FSDP island (optional) | - -**Inference** runs independent vLLM replicas per node: - -| Field | single_node | multi_node | -|---|---|---| -| `gpus_per_node` | Number of GPUs per node (default: 8) | Same | -| `num_nodes` | — | Number of inference nodes (default: 1) | - -The SLURM template is auto-selected based on `deployment.type`. You can override it with `slurm.template_path`. - -### Constraints - -- `output_dir` should be explicitly set when using SLURM (defaults to `"outputs"`) -- Multi-node deployment requires `[slurm]` to be set - ---- - -## RL Examples - -### Single-node SLURM - -The simplest case: run on a single allocated node. No `[deployment]` needed — defaults to `single_node`. 
- -```toml -output_dir = "/shared/outputs/my-rl-run" - -[slurm] -job_name = "my-rl-run" -``` - -### Multi-node SLURM (Hendrycks Math) - -```toml -output_dir = "outputs/rl-math-moe" -max_steps = 500 -seq_len = 2048 - -[slurm] -job_name = "hendrycks-math-rl-moe" - -[deployment] -type = "multi_node" -num_train_nodes = 1 -num_infer_nodes = 1 - -[weight_broadcast] -type = "nccl" - -[model] -name = "Qwen/Qwen3-30B-A3B-Thinking-2507" - -[trainer.model] -impl = "custom" -attn = "flash_attention_3" -optim_cpu_offload = true - -[trainer.model.ac_offloading] -max_inflight_activations = 5 - -[trainer.model.ac] -freq = 1 - -[orchestrator] -batch_size = 512 -rollouts_per_example = 16 - -[orchestrator.sampling] -max_tokens = 2048 - -[[orchestrator.env]] -id = "math-env" -name = "hendrycks-math" -args = { dataset_name = "PrimeIntellect/Hendrycks-Math", dataset_subset = "default" } - -[inference.parallel] -tp = 4 -dp = 2 -``` - -See [`examples/hendrycks_math/rl.toml`](../examples/hendrycks_math/rl.toml) for the full example. - ---- - -## SFT Examples - -### Single-node SLURM - -```toml -output_dir = "/shared/outputs/my-sft-run" - -[slurm] -job_name = "my-sft-run" -``` - -### Multi-node SLURM (MoE SFT) - -```toml -output_dir = "outputs/sft-moe-math" -max_steps = 500 - -[slurm] -job_name = "sft-moe-math" - -[deployment] -type = "multi_node" -num_nodes = 2 - -[model] -name = "Qwen/Qwen3-30B-A3B-Thinking-2507" -impl = "custom" -attn = "flash_attention_3" -optim_cpu_offload = true - -[model.ac_offloading] -max_inflight_activations = 5 - -[model.ac] -freq = 1 - -[data] -type = "sft" -name = "PrimeIntellect/INTELLECT-3-SFT-10K" -subsets = ["default"] -splits = ["math"] -batch_size = 128 -seq_len = 8192 -``` - -See [`examples/hendrycks_math/sft.toml`](../examples/hendrycks_math/sft.toml) for the full example. 
- ---- - -## Inference Examples - -### Single-node SLURM - -Run a vLLM server on a single allocated node: - -```toml -output_dir = "/shared/outputs/my-inference" - -[model] -name = "Qwen/Qwen3-8B" - -[parallel] -tp = 8 - -[slurm] -job_name = "my-inference" -``` - -```bash -uv run inference @ inference_slurm.toml -``` - -### Multi-node SLURM - -Each node runs an independent vLLM replica. TP and DP must fit within a single node — there is no cross-node parallelism. - -```toml -output_dir = "/shared/outputs/my-inference" - -[model] -name = "PrimeIntellect/INTELLECT-3-RL-600" - -[parallel] -tp = 4 -dp = 2 - -[deployment] -type = "multi_node" -num_nodes = 4 - -[slurm] -job_name = "my-inference" -``` - -After submission, the SLURM template prints the inference URLs for all nodes (one per node). - -### Dry run - -Use `dry_run = true` to generate the sbatch script without submitting: - -```bash -uv run inference @ config.toml --dry-run true -``` - ---- - -## Custom SLURM Templates - -The default templates handle standard setups with InfiniBand detection, environment setup, and `srun`-based process dispatch. For advanced use cases (custom partitions, account settings, module loads, etc.), provide your own Jinja2 template: - -```bash -uv run rl @ my_config.toml --slurm.template-path path/to/my_template.sbatch.j2 -``` - -See [`src/prime_rl/templates/`](../src/prime_rl/templates/) for the default templates as a starting point. 
- -## Monitoring - -After submission, logs are available at: - -```bash -# All deployment types (trainer.log and inference.log are symlinks for multi-node) -tail -F {output_dir}/logs/trainer.log -tail -F {output_dir}/logs/orchestrator.log -tail -F {output_dir}/logs/inference.log - -# Multi-node: per-node logs -tail -F {output_dir}/logs/trainer/node_*.log -tail -F {output_dir}/logs/inference/node_*.log - -# Multi-node inference: per-replica router logs -tail -F {output_dir}/logs/inference/router_*.log -``` - -For convenience, a tmux launcher sets up a session with all log streams: - -```bash -bash scripts/tmux.sh my-rl-job /shared/outputs/my-rl-job -``` diff --git a/docs/testing-moe-at-small-scale.md b/docs/testing-moe-at-small-scale.md deleted file mode 100644 index ba2ca048f9..0000000000 --- a/docs/testing-moe-at-small-scale.md +++ /dev/null @@ -1,113 +0,0 @@ -# Testing MoE at Small Scale - -When working on MoE architectures (GLM-4, Kimi, etc.), you can't iterate on a 100B+ parameter model locally. This guide shows how to create a small (~0.5B) MoE model with the same architecture, run SFT to warm it up, and run RL on it — all on 1-2 GPUs. - -The goal isn't performance. It's catching bugs in modeling code, state dict conversions, and training pipeline integration before running at scale. - -## Overview - -1. **Create + verify** a mini model with random weights and check HF <-> PrimeRL roundtrip -2. **SFT** to give it a non-trivial distribution -3. 
**RL** on reverse-text to validate the full pipeline - -## Prerequisites - -- At least 1 GPU for steps 1-2, 2 GPUs for step 3 (RL) -- Architecture presets are defined in `scripts/mini_moe.py` - -## Step 1: Create and verify the mini model - -```bash -uv run python scripts/mini_moe.py --arch glm4_moe --output-dir ./mini-glm-moe -``` - -This creates a ~543M parameter GLM-4 MoE (1024 hidden, 24 layers, 8 experts) with random weights, copies the tokenizer from the original GLM-4 model, then verifies that: -- Logits match between HF and PrimeRL implementations (`convert_to_prime`) -- The HF -> PrimeRL -> HF roundtrip is lossless (`convert_to_hf`) - -To re-run verification only (e.g. after a modeling code change): - -```bash -uv run python scripts/mini_moe.py --arch glm4_moe --output-dir ./mini-glm-moe --verify-only -``` - -## Step 2: SFT warmup - -Using the existing debug MoE SFT config with overrides for real data: - -```bash -uv run sft @ configs/debug/moe/sft/train.toml \ - --model.name ./mini-glm-moe \ - --data.name PrimeIntellect/Reverse-Text-SFT \ - --data.type null \ - --max_steps 200 \ - --optim.lr 1e-4 \ - --ckpt.weights -``` - -This fine-tunes on [PrimeIntellect/Reverse-Text-SFT](https://huggingface.co/datasets/PrimeIntellect/Reverse-Text-SFT) for 200 steps. Loss should drop from ~12 to ~2.5. The model won't be coherent, but it will have a non-trivial distribution so KL divergence is meaningful during RL. - -The latest weight checkpoint is saved under `outputs/weights/step_`. You can verify the roundtrip on it: - -```bash -uv run python scripts/mini_moe.py --arch glm4_moe --output-dir outputs/weights/step_200 --verify-only -``` - -A pre-built SFT'd model is available at [samsja/mini-glm-moe](https://huggingface.co/samsja/mini-glm-moe). - -## Step 3: RL (reverse-text) - -Requires 2 GPUs (one for inference, one for training). 
- -```bash -uv run rl @ configs/ci/integration/rl/start.toml \ - --model.name samsja/mini-glm-moe \ - --trainer.model.impl custom \ - --inference.gpu-memory-utilization 0.7 \ - --inference.model.max-model-len 2048 -``` - -Or to use the checkpoint from step 2: - -```bash -uv run rl @ configs/ci/integration/rl/start.toml \ - --model.name outputs/weights/step_200 \ - --trainer.model.impl custom \ - --inference.gpu-memory-utilization 0.7 \ - --inference.model.max-model-len 2048 -``` - -What to look for: -- **Training runs without crashing** — validates the full pipeline (inference server, orchestrator, trainer) -- **KL divergence is non-zero and finite** — confirms the reference model distribution is working -- **Loss is reasonable** — not NaN, not stuck at a constant value - -Don't expect the reward to go up meaningfully in 20 steps on a random model. - -## Adding a new architecture - -To test a new MoE architecture (e.g., Kimi2.5): - -1. Add modeling code under `src/prime_rl/trainer/models//` -2. Add a preset to `scripts/mini_moe.py` with the config class, small dimensions, HF model class, PrimeRL model class, and tokenizer source -3. Run steps 1-3 above with `--arch ` - -The preset defines the small config: - -```python -ARCH_PRESETS = { - "glm4_moe": { - "config_class": Glm4MoeConfig, - "config_kwargs": dict( - hidden_size=1024, - num_hidden_layers=24, - n_routed_experts=8, - # ... - ), - "hf_model_class": HFGlm4MoeForCausalLM, - "prime_model_class": PrimeRLGlm4MoeForCausalLM, - "tokenizer_source": "THUDM/GLM-4-9B-0414", - }, - # Add your new arch here -} -``` diff --git a/docs/training.md b/docs/training.md new file mode 100644 index 0000000000..bfc712a105 --- /dev/null +++ b/docs/training.md @@ -0,0 +1,363 @@ +# Training + +This page covers everything you need to launch, observe, checkpoint, and recover a `prime-rl` training run — RL, SFT, and the related on-policy distillation mode. For multi-node and cluster layouts, see [Scaling](scaling.md). 
For the loss math and algorithm knobs, see [Algorithms](algorithms.md). + +## Table of Contents + +- [Entrypoints](#entrypoints) +- [RL training](#rl-training) + - [Launch](#launch) + - [What each process does at runtime](#what-each-process-does-at-runtime) + - [Key knobs](#key-knobs) +- [SFT training](#sft-training) + - [Dataset format](#dataset-format) + - [Launch](#launch-1) + - [SFT-specific knobs](#sft-specific-knobs) +- [Training modes (RL / OPD / SFT-via-orchestrator)](#training-modes-rl--opd--sft-via-orchestrator) +- [Evaluations](#evaluations) +- [Checkpointing](#checkpointing) + - [Enabling checkpoints](#enabling-checkpoints) + - [Resuming a run](#resuming-a-run) + - [Saving HF weights for serving](#saving-hf-weights-for-serving) +- [Observability](#observability) + - [Log files](#log-files) + - [Console output and the tmux helper](#console-output-and-the-tmux-helper) + - [Weights & Biases](#weights--biases) + - [Prometheus and BetterStack](#prometheus-and-betterstack) + - [Platform monitoring](#platform-monitoring) +- [Metrics that matter](#metrics-that-matter) +- [Rules of thumb](#rules-of-thumb) +- [Common issues](#common-issues) + +## Entrypoints + +| Command | Purpose | Notes | +|---|---|---| +| `uv run rl` | Co-launches inference + orchestrator + trainer on one node | The default for any single-node RL run. Mirrors a `[trainer]` + `[orchestrator]` + `[inference]` TOML. | +| `uv run sft` | Supervised fine-tuning on a HF dataset | Launches torchrun internally; never call torchrun directly. | +| `uv run inference` | OpenAI-compatible vLLM server | Always use this entrypoint over `vllm serve` — it adds `/update_weights`, `/load_lora_adapter`, and `/init_broadcaster`. | +| `uv run trainer` | Standalone trainer process group | Use only when launching the trainer separately from the orchestrator (e.g. multi-node RL without the `rl` wrapper). | +| `uv run orchestrator` | Standalone orchestrator process | Pair with a separately-launched trainer + inference. 
| + +`rl` is a convenience wrapper — it parses one merged TOML, splits it across `[trainer]` / `[orchestrator]` / `[inference]` tables, picks GPUs, sets up logging, and spawns the three children. Standalone entrypoints exist for the multi-node case where each process lives on a different host. + +## RL training + +### Launch + +The minimal single-node RL run uses a shipped example config. From the project root: + +```bash +prime env install primeintellect/math-env # install the env once +bash scripts/tmux.sh # 4-pane tmux that tails the logs + +uv run rl @ configs/gsm8k/rl.toml \ + --wandb.project my-project \ + --wandb.name gsm8k-smoke \ + --ckpt +``` + +GPU placement: by default `rl` puts inference on GPU 0 and the trainer on GPU 1. Override with `--inference-gpu-ids` / `--trainer-gpu-ids`: + +```bash +uv run rl @ rl.toml \ + --inference-gpu-ids 0,1,2,3 \ + --trainer-gpu-ids 4,5,6,7 \ + --inference.parallel.dp 4 +``` + +For multi-node and SLURM, see [Scaling § RL training](scaling.md#rl-training). + +### What each process does at runtime + +- **Inference** (vLLM) holds the current policy and serves OpenAI-compatible completions. Receives a new HF checkpoint via `POST /update_weights` after each trainer step (or batched into one update per `max_async_level` steps). +- **Orchestrator** samples a prompt batch from the configured `[[orchestrator.train.env]]` envs, drives them against the inference server (multi-turn, tool calls, etc.), packs the completed rollouts into a binary batch, writes it under `outputs/rollouts/step_N/`, and notifies the trainer. +- **Trainer** waits for the binary batch, runs forward/backward/optimizer step under FSDP2, writes new weights to the broadcast transport, and signals the orchestrator that step `N+1` is in flight. + +The orchestrator is the only stateful CPU process; the trainer is GPU-bound; the inference server is stateless apart from KV cache. 
On restart the orchestrator pushes the latest checkpoint into inference automatically — you don't need to checkpoint inference state. + +### Key knobs + +These are the knobs you'll touch most often. The full field reference for each lives in [Reference](reference.md). + +| Knob | Where | What it controls | +|---|---|---| +| `model.name` | top-level | HF model ID or local path. Auto-fans-out to trainer/orchestrator/inference. | +| `max_steps` | top-level | Number of trainer steps before exit. | +| `seq_len` | top-level | Max sequence length per training sample; also enforced by the orchestrator when packing. | +| `max_async_level` | top-level | How many steps inference can run ahead of the trainer. 1 = fully overlapped; >1 = more off-policy, higher throughput. See [Algorithms § Async](algorithms.md#async--off-policy-training). | +| `orchestrator.batch_size` | orchestrator | Prompts per trainer step. | +| `orchestrator.rollouts_per_example` | orchestrator | Rollouts per prompt (the group size used for advantage normalization). | +| `orchestrator.train.sampling.max_completion_tokens` | orchestrator | Max tokens per turn at sampling time. | +| `inference.parallel.tp` / `inference.parallel.dp` | inference | Tensor and data parallelism for the inference server. | +| `inference.gpu_memory_utilization` | inference | Fraction of GPU memory vLLM may use. Tighten on co-located single-GPU runs. | +| `trainer.optim.lr` | trainer | Learning rate. Default optimizer is AdamW. | +| `trainer.loss.type` | trainer | Pick the loss variant (default AIPO vs custom). See [Algorithms § Loss](algorithms.md#loss). | + +## SFT training + +`uv run sft` runs supervised fine-tuning from a HF dataset. It shares model loaders, FSDP setup, checkpointing, and the chat-template plumbing with the RL trainer, so a typical workflow is _SFT → RL → SFT → …_ without any reformatting. 
+ +### Dataset format + +Two accepted layouts: + +- **Prompt-completion**: a HF dataset with `prompt` and `completion` columns ([TRL format](https://huggingface.co/docs/trl/en/dataset_formats#prompt-completion)). The trainer masks out the prompt and computes loss only over the completion. +- **Messages**: a HF dataset with a single `messages` column containing a list of chat turns. The trainer interprets the whole conversation as one sample, applies role-based loss masking, and trains over all assistant turns. + +If both columns are present, `messages` takes precedence. + +**Chat-template prefix property.** Multi-turn SFT requires that tokenizing the first _k_ turns of a conversation be a strict prefix of tokenizing all _n ≥ k_ turns. Qwen3's default template _violates_ this (it strips past `` blocks), so use either the prime-rl–patched checkpoints (e.g. `PrimeIntellect/Qwen3-0.6B`) or a custom chat template that preserves thinking. See [Algorithms § Multi-turn trajectories](algorithms.md#multi-turn-trajectories). + +### Launch + +Single GPU: + +```bash +uv run sft @ configs/.toml --wandb +``` + +Multi-GPU and multi-node use torchrun under the hood (the `sft` entrypoint manages this for you — see [Scaling § SFT training](scaling.md#sft-training) for non-default layouts). 
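The chat-template prefix property from the Dataset format subsection can be checked mechanically. The sketch below is a minimal illustration, not prime-rl code: `render_preserving`, `render_stripping`, the `|answer|` marker, and `first_prefix_violation` are all hypothetical names, and it compares rendered strings rather than token IDs. A real check would presumably render with the model's tokenizer and compare token sequences.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]
Renderer = Callable[[List[Message]], str]


def render_preserving(messages: List[Message]) -> str:
    """Toy template (hypothetical) that keeps every turn verbatim."""
    return "".join(f"[{m['role']}]{m['content']}\n" for m in messages)


def render_stripping(messages: List[Message]) -> str:
    """Toy template (hypothetical) that drops reasoning from all but the last assistant turn."""
    assistant_idx = [i for i, m in enumerate(messages) if m["role"] == "assistant"]
    last_assistant = assistant_idx[-1] if assistant_idx else -1
    parts = []
    for i, m in enumerate(messages):
        content = m["content"]
        if m["role"] == "assistant" and i != last_assistant:
            # Keep only the text after the (made-up) reasoning marker.
            content = content.split("|answer|")[-1]
        parts.append(f"[{m['role']}]{content}\n")
    return "".join(parts)


def first_prefix_violation(render: Renderer, messages: List[Message]) -> int:
    """Return the smallest k whose rendering is not a prefix of the full rendering, else -1."""
    full = render(messages)
    for k in range(1, len(messages) + 1):
        if not full.startswith(render(messages[:k])):
            return k
    return -1


conversation = [
    {"role": "user", "content": "U1"},
    {"role": "assistant", "content": "R1|answer|A1"},
    {"role": "user", "content": "U2"},
    {"role": "assistant", "content": "R2|answer|A2"},
]

print(first_prefix_violation(render_preserving, conversation))
print(first_prefix_violation(render_stripping, conversation))
```

The preserving template never breaks the property. The stripping template breaks it at the first assistant turn, because that turn is rendered with its reasoning when it is the last turn and without it once a later assistant turn exists.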
+ +A CPU-friendly smoke run with fake data: + +```bash +uv run sft @ configs/debug/sft/train.toml +``` + +### SFT-specific knobs + +| Knob | What it controls | +|---|---| +| `data.type = "sft"` and `data.path` | HF dataset name or local path | +| `data.batch_size` | Tokens per trainer step (packed) | +| `data.seq_len` | Per-sample sequence length | +| `loss_mask.*` | Which roles contribute to loss; see [Reference § `sft.data.loss_mask`](reference.md#sft-data) | +| `val.interval` | Run validation every N steps; `val.data` mirrors `data` | + +## Training modes (RL / OPD / SFT-via-orchestrator) + +The RL entrypoint also supports two distillation modes, switched via `orchestrator.training_mode`: + +| Mode | Student | Teacher | Use case | +|---|---|---|---| +| `rl` | Required | Forbidden | Standard RL | +| `opd` | Required | Required, must be vLLM (needs `prompt_logprobs`) | [On-policy distillation](https://thinkingmachines.ai/blog/on-policy-distillation/): student generates rollouts, trainer minimizes KL to teacher logprobs | +| `sft` | Required | Required, any OpenAI-compatible endpoint | Hard-distill: teacher generates rollouts, student trains on them | + +For OPD and SFT-via-orchestrator, set `deployment.num_teacher_gpus` to auto-launch a teacher vLLM server, or hand-launch one and pass its URL via `orchestrator.client.base_url`. Debug configs for all variants ship under [`configs/debug/training_modes/`](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/configs/debug/training_modes). + +The standalone `uv run sft` entrypoint is the more traditional SFT path — pure dataset-based, no teacher, no orchestrator. Use `orchestrator.training_mode = "sft"` only when you want a teacher to generate the supervision on the fly. 
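The teacher requirements in the mode table above can be restated as a small validation function. This is only a sketch of the rules as the table states them. `check_teacher` and the backend labels `"vllm"` and `"openai_compatible"` are hypothetical and are not part of the prime-rl config schema, which performs its own validation.

```python
from typing import Optional

# Teacher requirement per mode, transcribed from the table above.
TEACHER_RULES = {
    "rl": "forbidden",
    "opd": "vllm_only",
    "sft": "any_openai_compatible",
}


def check_teacher(training_mode: str, teacher_backend: Optional[str]) -> Optional[str]:
    """Return an error message if the combination contradicts the table, else None.

    teacher_backend is None (no teacher), "vllm", or "openai_compatible".
    These labels are illustrative only.
    """
    if training_mode not in TEACHER_RULES:
        return f"unknown training mode: {training_mode!r}"
    rule = TEACHER_RULES[training_mode]
    if rule == "forbidden":
        return None if teacher_backend is None else "rl mode must not configure a teacher"
    if teacher_backend is None:
        return f"{training_mode} mode requires a teacher"
    if rule == "vllm_only" and teacher_backend != "vllm":
        # OPD needs prompt_logprobs, which the table ties to a vLLM teacher.
        return "opd mode requires a vLLM teacher"
    return None


for mode, teacher in [("rl", None), ("opd", "openai_compatible"), ("sft", "vllm")]:
    print(mode, teacher, "->", check_teacher(mode, teacher) or "ok")
```

A vLLM teacher is accepted in `sft` mode here because vLLM serves an OpenAI-compatible API.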
+ +## Evaluations + +Evals run inside the orchestrator on a separate set of envs declared under `[[orchestrator.eval.env]]`: + +```toml +[orchestrator.eval] +interval = 25 # evaluate every 25 trainer steps +rollouts_per_example = 4 + +[[orchestrator.eval.env]] +id = "math-env" +name = "gsm8k-eval" +args = { dataset_name = "openai/gsm8k", dataset_subset = "main", split = "test" } +``` + +Eval scores land in the trainer logs as `eval/{env}/{avg@k,pass@k}` and in W&B under the same keys. For one-off evaluations outside of training, use `vf-eval`: + +```bash +uv run vf-eval math-env \ + -a '{"dataset_name": "openai/gsm8k", "dataset_subset": "main"}' \ + -m PrimeIntellect/Qwen3-0.6B \ + -b http://localhost:8000/v1 \ + -n 50 -t 2048 +``` + +`vf-eval` talks to any OpenAI-compatible endpoint, so it works against `uv run inference`, hosted endpoints, or a stale checkpoint mid-run. + +## Checkpointing + +Checkpointing is split across processes because the orchestrator and trainer can be on different machines and on different steps at any given time. Inference is stateless. + +| Process | What's saved | Where | +|---|---|---| +| Trainer | FSDP-sharded model (DCP), optimizer, scheduler, progress | `/checkpoints/step_N/` | +| Orchestrator | Step counter, total tokens / samples / problems | `/checkpoints/orchestrator/step_N/` | +| Inference | _nothing_ — re-pushed from the latest checkpoint on restart | n/a | +| Trainer (HF weights) | HF-compatible weight snapshot for serving | `/weights/step_N/` | + +### Enabling checkpoints + +Checkpointing is **off by default** to save disk. 
Enable it with `--ckpt`: + +```bash +uv run rl @ rl.toml --ckpt # default: end-of-training only +uv run rl @ rl.toml --ckpt.interval 25 # every 25 steps +uv run rl @ rl.toml --ckpt.interval 25 --ckpt.keep-last 3 # rolling window of 3 +uv run rl @ rl.toml --ckpt.interval 25 --ckpt.keep-interval 100 # …plus permanent every 100 +``` + +Common combo for long runs: `--ckpt.interval 50 --ckpt.keep-last 3 --ckpt.keep-interval 500` — rolling 3-checkpoint window for fast recovery, plus a permanent snapshot every 500 steps. + +### Resuming a run + +Re-run the same launch command and pass `--ckpt.resume-step ` (or `-1` for "latest"). Make sure `--max-steps` is at least the target final step, not the remaining delta: + +```bash +# First run: steps 0–10 +uv run rl @ rl.toml --max-steps 10 --ckpt + +# Resume: continue to step 20 +uv run rl @ rl.toml --max-steps 20 --ckpt.resume-step 10 +``` + +Trainer + orchestrator step counters are kept in lockstep — both rewind to the same resume step. The inference server can stay running across restarts; the orchestrator pushes the resumed weights on reconnect. + +### Saving HF weights for serving + +HF-compatible weight snapshots are written under `/weights/step_N/` whenever a full checkpoint runs (or you can write weights-only via `--ckpt.weights-only` for cheaper snapshots). Upload directly: + +```bash +uv run hf upload /-RL outputs/weights/step_100 +``` + +For LoRA runs, set `ckpt.weights.save_adapter_separately = true` to also write the raw adapter alongside the merged weights — useful when serving the adapter through a separate `/load_lora_adapter` call. 
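To make the retention flags from Enabling checkpoints concrete, the sketch below computes which checkpoint steps would remain on disk under one plausible reading of `--ckpt.interval`, `--ckpt.keep-last`, and `--ckpt.keep-interval`. The semantics are an assumption inferred from the prose (a rolling window of recent checkpoints plus permanent snapshots at multiples of the keep interval). `retained_steps` is a hypothetical helper, and the actual pruning logic in prime-rl may differ, for example in how the end-of-training checkpoint is handled.

```python
from typing import List, Optional


def retained_steps(
    current_step: int,
    interval: int,
    keep_last: Optional[int] = None,
    keep_interval: Optional[int] = None,
) -> List[int]:
    """Steps assumed to remain on disk once training has reached current_step."""
    # Assumption: a checkpoint is written at every multiple of `interval`.
    saved = list(range(interval, current_step + 1, interval))
    if keep_last is None:
        # Assumption: without a rolling window, nothing is pruned.
        return saved
    recent = set(saved[-keep_last:]) if keep_last > 0 else set()
    permanent = set()
    if keep_interval:
        # Assumption: multiples of `keep_interval` are exempt from pruning.
        permanent = {step for step in saved if step % keep_interval == 0}
    return sorted(recent | permanent)


# The long-run combo from the text: interval 50, keep last 3, permanent every 500.
print(retained_steps(1200, interval=50, keep_last=3, keep_interval=500))
```

Under these assumptions, a run at step 1200 would hold five checkpoints: the permanent snapshots at 500 and 1000, plus the rolling window 1100, 1150, 1200.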
+ +## Observability + +### Log files + +The launcher tees every process's stdout/stderr into `/logs/`: + +``` +/logs/ +├── trainer.log # rank 0 only +├── orchestrator.log +├── inference.log +├── trainer/torchrun//attempt_0//{stdout,stderr}.log +└── envs/{train,eval}// + ├── env_server.log + └── env_worker_.log +``` + +Multi-node runs add `trainer/node_*.log` and `inference/node_*.log` — `trainer.log` and `inference.log` at the top level symlink to node 0 for convenience. See [Scaling § Multi-node logs](scaling.md#multi-node-logs). + +Env worker logs are the first place to look for env-side errors (most user code lives there). Verbosity is controlled by `orchestrator.log.vf_level`. + +### Console output and the tmux helper + +`scripts/tmux.sh` opens a 4-pane tmux session that follows `trainer.log`, `orchestrator.log`, `inference.log`, and the union of env worker logs. Start it before launching: + +```bash +bash scripts/tmux.sh +# then in the Launcher window: +uv run rl @ ... --output-dir outputs/my-run +``` + +Pass `-s ` and `-o ` to run multiple parallel experiments side-by-side in different sessions. + +For multi-node SLURM runs, follow the head-node logs via `tail -f` on the shared filesystem — see [Scaling § SLURM](scaling.md#slurm). + +### Weights & Biases + +W&B is off by default. Enable with `--wandb`: + +```bash +uv run rl @ rl.toml --wandb # default project, random name +uv run rl @ rl.toml --wandb.project my-proj --wandb.name run-42 +``` + +For RL runs the trainer and orchestrator log as **two separate runs** with the same name: `-trainer` and `-orchestrator`. You'll usually want both grouped in a W&B group. + +By default, every 10 steps each process also logs a sample of prompts/completions (with rewards and advantages) and reward/advantage/entropy distributions as W&B tables. 
Tune via `--wandb.log-extras.interval` and `--wandb.log-extras.sample-ratio`, or disable subsets: + +```bash +uv run rl @ rl.toml --wandb \ + --orchestrator.wandb.log-extras.interval 50 \ + --no-trainer.wandb.log-extras.distributions +``` + +### Prometheus and BetterStack + +For long-running production training: + +- **Prometheus**: set `trainer.metrics_server.port` to expose `/metrics` on each trainer process. vLLM also exposes `/metrics` natively — useful for KV-cache saturation and pending-request counts. +- **BetterStack heartbeats**: set `trainer.heartbeat.url` (and the matching orchestrator field) to ping a heartbeat URL each step. Pair with a BetterStack monitor to page on stalls. + +### Platform monitoring + +Internal teams can register runs on the Prime Intellect platform: + +```toml +[orchestrator.prime_monitor] +run_name = "my-experiment" +``` + +This streams training metrics, samples, and distributions to the platform dashboard. Requires `PRIME_API_KEY` (set via `prime login` or env var) and an allowlisted team. Currently internal-only. + +## Metrics that matter + +Pulled from the three console logs (and mirrored to W&B): + +**Progress** (orchestrator): + +- `reward/{all,env}/mean` — main signal. Should trend upward over hundreds of steps. +- `seq_len/{all,env}/mean` and `is_truncated/{all,env}/mean` — rollout length and truncation rate. +- `num_turns/{all,env}/mean` — for multi-turn envs. +- `empty_rollouts/{all,env}`, `errored_rollouts/{all,env}` — non-zero is fine in small numbers; sustained > 5% is a smell. +- `eval/{env}/{avg@k,pass@k}` — eval scores when `[orchestrator.eval]` is set. + +**Stability** (trainer): + +- `mismatch_kl/{all,env}/{mean,std,max}` — KL between trainer's current policy and the (older) inference policy that generated the rollouts. A sustained, growing mean is the early-warning sign for off-policy collapse. +- `entropy/{all,env}/mean` — too low means mode-collapse; too high means the model isn't committing. 
+- `masked_advantage_{positive,negative}/mean` — fraction of DPPO-masked tokens, split by sign. +- `optim/grad_norm` — spikes precede divergence; check the loss config or lower the LR. + +**Performance** (trainer + orchestrator step independently): + +| Source | Metric | Reading | +|---|---|---| +| trainer | `time/wait_for_batch` | **high → orchestrator bottleneck** | +| orchestrator | `time/wait_for_ckpt` | **high → trainer bottleneck** | +| trainer | `perf/throughput`, `perf/mfu` | tokens/s and MFU | +| orchestrator | `scheduler/async_level`, `scheduler/inflight_rollouts` | current async lag | +| vLLM | `vllm:gpu_cache_usage_perc` | → 1.0 means KV cache saturated, slow generation | + +Live vLLM stats (Prometheus): + +```bash +curl -s http://localhost:8000/metrics | grep -E "num_requests|gpu_cache_usage" +``` + +## Rules of thumb + +- **Start small.** Run `configs/gsm8k/rl.toml` end-to-end on 2 GPUs before scaling. If GSM8K runs cleanly, your install is good. +- **Eyeball the reward distribution.** If `reward/all/std` collapses to ~0 within a few steps, the env is too easy or rewards are degenerate — increase difficulty or check the rubric. +- **Match `inference.parallel.tp` to model layout.** TP > num attention heads / 2 starts losing efficiency. For dense models keep TP small and use DP for throughput. For MoE-heavy models prefer EP. +- **Set `max_async_level` deliberately.** `1` = fully synced overlap (lowest off-policy drift). `2` = default, suited for cross-WAN weight broadcast. Higher values trade more drift for throughput; watch `mismatch_kl/all/mean`. +- **Pin `output_dir` per run.** Sharing a directory across runs will mix rollouts and break resumes. `--output-dir outputs/` is the simplest discipline. +- **Use `--dry-run` before SLURM.** Validators (CP needs flash-attention, NCCL broadcast needs `max_async_level=1`, etc.) fail fast in dry-run and slow in queue. 
+- **Don't change `optimization_dtype` / `reduce_dtype`.** These are load-bearing — flipping bfloat16/float32 silently changes training dynamics. Stick with defaults unless you know what you're doing. + +## Common issues + +**`@ path/to/x.toml` fails to load.** Leave a space between `@` and the path — `@ rl.toml`, not `@rl.toml`. If the error mentions Pydantic, your TOML doesn't match the schema; `--dry-run` will pinpoint the offending field. + +**API timeouts under load.** Bump file descriptors: `ulimit -n 32000`. Our defaults are already generous, so a real timeout usually means inference is saturated — check `time/generate_completions` and vLLM's `gpu_cache_usage_perc`. + +**CUDA OOM in the trainer.** In order, try: + +1. Full activation checkpointing: `--model.ac` (the bare flag enables defaults). +2. Lower `seq_len` or `data.micro_batch_size`. +3. FSDP CPU offload: `--model.fsdp-cpu-offload` (or `--model.optim-cpu-offload` for optimizer states only). +4. Context parallelism: `--model.cp 2` (requires flash-attention; see [Scaling § CP](scaling.md#context-parallelism)). + +**CUDA OOM in inference.** Tighten `inference.gpu_memory_utilization` (start around 0.85), reduce `inference.model.max_model_len`, or split inference across more GPUs via `inference.parallel.dp`. + +**Eval scores frozen but training reward rising.** Likely a chat-template prefix violation eating the model's outputs. Check `orchestrator.renderer` settings (`preserve_all_thinking`, etc.) and use the prime-rl–patched model checkpoint if available. + +**Trainer hangs on weight broadcast.** NCCL transport requires `max_async_level=1` and is incompatible with LoRA — the run will fail at config-validate time if either is set. Otherwise check that all trainer ranks survived the previous step (`grep ERROR logs/trainer/torchrun/`). + +**Run dies mid-step with no traceback.** Look in `/logs/envs/train//env_worker_*.log` first — most silent kills come from OOM-killed env workers running user code. 
Set `orchestrator.log.vf_level = "debug"` for more verbose env logging. diff --git a/docs/training_modes.md b/docs/training_modes.md deleted file mode 100644 index e1787711b1..0000000000 --- a/docs/training_modes.md +++ /dev/null @@ -1,39 +0,0 @@ -# Training Modes - -PRIME-RL supports three training modes through our RL trainer, selected via `training_mode`: - -- **`rl`** — reinforcement learning: student generates rollouts, no teacher -- **`opd`** — [on-policy distillation](https://thinkingmachines.ai/blog/on-policy-distillation/): students generates rollouts, train to minimize the KL divergence between the student and teacher's logprobs for each token in the rollout -- **`sft`** — supervised fine-tuning on teacher-generated rollouts - -> Note: PRIME-RL also has a dedicated `sft` entrypoint for more traditional supervised fine-tuning from a HF dataset. When using the `sft` training mode on the orchestrator, teacher rollouts are generated on-the-fly and used for training. - -The mode determines who generates rollouts, what role the teacher plays, and what must be configured. - -| Mode | Student | Teacher | -|---|---|---| -| `rl` | required | forbidden | -| `opd` | required | required (local vLLM) | -| `sft` | required | required (any OAI-compatible endpoint) | - -**SFT vs OPD teachers** differ in what the orchestrator asks of them. SFT only calls `/v1/chat/completions` to generate rollouts — any OpenAI-compatible endpoint works (PI inference, OpenAI, Anthropic, a local vLLM). OPD additionally needs token-level logprobs scored over the student's tokens, which today only vLLM's `/inference/v1/generate` with `prompt_logprobs` exposes — so the OPD teacher must be a vLLM server. 
- -### Reference configs - -Debug-scale configs for all three modes (and LoRA variants) live in [`configs/debug/training_modes/`](../configs/debug/training_modes/): - -- `rl.toml` / `opd.toml` / `opd_lora.toml` -- `sft.toml` / `sft_lora.toml` (local vLLM teacher) -- `sft_external.toml` (PI inference teacher) - -See [`configs/debug/training_modes/README.md`](../configs/debug/training_modes/README.md) for run commands. - -## Parameter reference - -| Parameter | Default | Description | -|-----------|---------|-------------| -| `training_mode` | `"rl"` | One of `rl`, `opd`, `sft`. Propagates to `orchestrator.training_mode` and (for sft) `trainer.loss.type`. | -| `deployment.num_teacher_gpus` | `None` | Number of GPUs for the teacher vLLM server. Auto-starts when set. OPD only. | -| `trainer.loss.teacher_tau` | `0.0` | Distillation strength. Must be `> 0` in OPD. | -| `trainer.loss.adv_tau` | `1.0` | Weight for the RL advantage signal. Set `0` for pure distillation. | -| `orchestrator.verification.enabled` | `true` | Enable/disable verification. Set to `false` for pure distillation with `adv_tau = 0`. | diff --git a/docs/trajectories.md b/docs/trajectories.md deleted file mode 100644 index d5d7eee5e4..0000000000 --- a/docs/trajectories.md +++ /dev/null @@ -1,96 +0,0 @@ -# Trajectories - -Verifiers [v0.1.8](https://github.com/PrimeIntellect-ai/verifiers/releases/tag/v0.1.8) introduced trajectory-based rollouts, where each LLM request/response pair in a multi-turn interaction is recorded as an independent step. For details on the design decision, check the detailed [design document](https://github.com/PrimeIntellect-ai/verifiers/blob/main/notes/TRAJECTORIES.md) in the verifiers repository. - -## Best-Effort Interleaved Rollouts - -PRIME-RL uses a best-effort interleaving strategy that automatically merges consecutive trajectory steps when possible, and starts a new training sample when the extension property breaks. 
- -### The Extension Property - -A sequence of trajectory steps has the **extension property** when each successive step's prompt contains all previous prompts and completions as a prefix. When this holds: -- Multiple steps can be merged into a single training sample -- Compute scales as O(T) for a trajectory of length T - -When extension breaks (e.g., due to context compaction or thinking being stripped): -- A new training sample is started from that step -- Compute scales as O(T²) in the worst case (every step breaks extension) - -### How It Works - -``` -5-step trajectory where extension breaks at step 4: - -Steps 1-3: extension holds → merged into Sample 1 -Step 4: extension breaks (e.g., thinking stripped from history) -Steps 4-5: extension holds → merged into Sample 2 - -Result: 2 training samples instead of 5 -``` - -This approach gives you the best of both worlds: -- When extension holds: O(T) compute, single merged sample -- When extension breaks: graceful fallback, no corrupted data -- Mixed scenarios: optimal merging where possible - -### The Exact Prefix Invariant - -Interleaving enforces a strict invariant: - -> The prompt at turn $t$ must be the exact concatenation of prior messages exactly as the LLM originally generated them - -We call this the "exact prefix" invariant. For example, at turn 2, the LLM should see U1,A1,U2 as the prompt, where U1 exactly matches the user message in turn 1 and A1 exactly matches the produced assistant message in turn 1. Any violation of this invariant will result in downstream problems when computing the importance sampling ratio during training. - -For example, assume that at turn 2 the prompt is U1,A1',U2 where A1' varies from A1. 
In this scenario it is not clear whether to add A1 or A1' to the interleaved rollout: -- If we add A1', the logprobs from turn 1 might be off because the inference LLM produced A1 but the trainer LLM is computing logprobs for A1' -- If we add A1, the logprobs from turn 2 might be off because the inference LLM is attending to A1' but the trainer LLM is attending to A1 - -When the invariant is violated (extension breaks), PRIME-RL automatically starts a new training sample rather than producing corrupted data. - -### Arbitrary Chat Templates - -There exist chat templates which add, modify, or remove tokens across turns. One good example is the chat template of the Qwen3-series of models, which strips thinking across user turns. - -```python -from transformers import AutoTokenizer - -tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B") - -messages = [ - {"role": "user", "content": "U1"}, - {"role": "assistant", "content": "<think>R1</think>A1"}, - {"role": "user", "content": "U2"}, -] - -print(tokenizer.apply_chat_template(messages[:1], tokenize=False)) -# <|im_start|>user -# U1<|im_end|> - -print(tokenizer.apply_chat_template(messages, tokenize=False)) -# <|im_start|>user -# U1<|im_end|> -# <|im_start|>assistant -# A1<|im_end|> -# <|im_start|>user -# U2<|im_end|> -``` - -The chat template automatically strips away past thinking sections across user turns, which is often referred to as "interleaved thinking". Many chat templates, such as GLM or MiniMax, implement this approach. - -With best-effort interleaving, PRIME-RL handles this gracefully: when the thinking is stripped and the prefix no longer matches, a new training sample is started automatically. - -### Discontinuous Trajectories by Design - -Some multi-turn environments are intentionally discontinuous. For example, in a sub-agent calling scenario: - -1. Main agent receives a task and decides to delegate to a sub-agent -2. Sub-agent runs independently (possibly multiple turns with its own context) -3.
Control returns to main agent with only the sub-agent's final result - -The main agent's trajectory is discontinuous because the sub-agent's internal conversation isn't part of its context. When the main agent resumes, its prompt doesn't extend the previous turn - it contains a summarized result instead. - -Best-effort interleaving handles this naturally: each agent's contiguous turns get merged, but the handoff between agents starts a new sample. - -## Deprecated: Branching Mode - -The `--trajectory-strategy branching` option is deprecated. The best-effort interleaving strategy now handles all cases automatically, falling back to separate samples (equivalent to branching) when the extension property breaks. diff --git a/docs/troubleshooting.md b/docs/troubleshooting.md deleted file mode 100644 index e2c1d68d2f..0000000000 --- a/docs/troubleshooting.md +++ /dev/null @@ -1,21 +0,0 @@ -# Troubleshooting - -> My API keeps timing out. - -We already set much larger timeout limits for the API clients that we use for training and evals. If you still encounter API timeout or connection errors, then this may be caused by your OS limiting the number of open file descriptors. Try increasing the maximum number of open files with - -```bash -ulimit -n 32000 -``` - -> I'm getting CUDA out of memory errors. - -Assuming this is happening on the RL or SFT trainer, you can try the following: -- Use full activation checkpointing (`--model.ac`) -- Reduce the micro batch size (`--data.micro-batch-size`) and sequence length (`--data.seq-len`) -- (*Experimental*) Use context parallelism with `--model.cp` - -> I cannot pass my TOML config file - -Check that you *did* leave a whitespace between the `@` and the config file (e.g. `uv run ... @ path/to/config.toml` instead of `uv run ... @path/to/config.toml`). Also, make sure that your TOML config matches the configuration schema. 
If not, the Pydantic error message (which arguably is quite ugly) will hopefully point you in the right direction. - diff --git a/scripts/generate_docs_reference.py b/scripts/generate_docs_reference.py new file mode 100644 index 0000000000..0a77193aeb --- /dev/null +++ b/scripts/generate_docs_reference.py @@ -0,0 +1,364 @@ +"""Generate docs/reference.md from the Pydantic config models. + +Walks every top-level user-facing config (RLConfig, SFTConfig, TrainerConfig, +OrchestratorConfig, InferenceConfig), recursively renders its nested sub-configs +and discriminated unions, and writes a single Markdown reference page. + +Run from the project root: + uv run python scripts/generate_docs_reference.py +""" + +from __future__ import annotations + +import io +import sys +import types +import typing +from dataclasses import dataclass +from pathlib import Path + +from pydantic import BaseModel +from pydantic.fields import FieldInfo +from pydantic_config.cli import _extract_field_docstrings + +from prime_rl.configs.inference import InferenceConfig +from prime_rl.configs.orchestrator import OrchestratorConfig +from prime_rl.configs.rl import RLConfig +from prime_rl.configs.sft import SFTConfig +from prime_rl.configs.trainer import TrainerConfig + +OUT_PATH = Path(__file__).resolve().parents[1] / "docs" / "reference.md" + + +@dataclass +class Entrypoint: + slug: str + title: str + cls: type[BaseModel] + blurb: str + + +ENTRYPOINTS = [ + Entrypoint( + slug="rl", + title="`rl` — Full RL training", + cls=RLConfig, + blurb=( + "The `rl` entrypoint composes a trainer, orchestrator, and (optionally) inference server into a single " + "co-located deployment. Sub-configs under `[trainer]`, `[orchestrator]`, and `[inference]` mirror the " + "standalone entrypoints below, with shared knobs (model name, output dir, W&B run name, …) lifted to " + "the top level so they only need to be set once." 
+ ), + ), + Entrypoint( + slug="sft", + title="`sft` — Supervised fine-tuning", + cls=SFTConfig, + blurb="The `sft` entrypoint runs supervised fine-tuning on a tokenized dataset.", + ), + Entrypoint( + slug="trainer", + title="`trainer` — Standalone trainer", + cls=TrainerConfig, + blurb=( + "The `trainer` entrypoint runs only the trainer process. It expects rollouts to be shipped in via the " + "configured transport (filesystem or ZMQ) by an external orchestrator." + ), + ), + Entrypoint( + slug="orchestrator", + title="`orchestrator` — Standalone orchestrator", + cls=OrchestratorConfig, + blurb=( + "The `orchestrator` entrypoint runs only the orchestrator process. It expects a separately-launched " + "inference server to serve rollouts, and ships completed rollouts to a separately-launched trainer " + "over the configured transport." + ), + ), + Entrypoint( + slug="inference", + title="`inference` — Standalone vLLM server", + cls=InferenceConfig, + blurb=( + "The `inference` entrypoint launches a vLLM server (or a disaggregated prefill/decode pair) that " + "serves OpenAI-compatible completions to the orchestrator." 
+ ), + ), +] + + +def is_pydantic_model(t: object) -> bool: + return isinstance(t, type) and issubclass(t, BaseModel) + + +def unwrap_annotated(t: object) -> object: + while typing.get_origin(t) is typing.Annotated: + t = typing.get_args(t)[0] + return t + + +def discriminated_variants(field: FieldInfo) -> list[type[BaseModel]] | None: + """Return the variant model classes if `field` is a discriminated union over BaseModels.""" + if field.discriminator is None: + return None + return _union_models(unwrap_annotated(field.annotation)) + + +def _union_models(t: object) -> list[type[BaseModel]] | None: + origin = typing.get_origin(t) + if origin in (typing.Union, types.UnionType): + args = [a for a in typing.get_args(t) if a is not type(None)] + if all(is_pydantic_model(a) for a in args): + return args + return None + + +def fmt_type(annotation: object) -> str: + """Render a type annotation as compact Markdown-safe text.""" + annotation = unwrap_annotated(annotation) + origin = typing.get_origin(annotation) + if origin in (typing.Union, types.UnionType): + args = typing.get_args(annotation) + return " \\| ".join(fmt_type(a) for a in args) + if origin is typing.Literal: + return " \\| ".join(repr(a) for a in typing.get_args(annotation)) + if origin is list: + return f"list[{fmt_type(typing.get_args(annotation)[0])}]" + if origin is dict: + k, v = typing.get_args(annotation) + return f"dict[{fmt_type(k)}, {fmt_type(v)}]" + if origin is tuple: + return f"tuple[{', '.join(fmt_type(a) for a in typing.get_args(annotation))}]" + if annotation is type(None): + return "None" + if isinstance(annotation, type): + return annotation.__name__ + return str(annotation).replace("typing.", "") + + +def fmt_default(field: FieldInfo) -> str: + if field.is_required(): + return "*required*" + default = field.default + factory = field.default_factory + if factory is not None: + try: + default = factory() + except Exception: + return "*factory*" + if isinstance(default, BaseModel): + return 
f"`{default.__class__.__name__}()`" + if default is None: + return "`None`" + if isinstance(default, str): + return f"`{default!r}`" + if isinstance(default, Path): + return f"`{str(default)!r}`" + if isinstance(default, (list, dict)) and not default: + return f"`{default!r}`" + return f"`{default!r}`" + + +def fmt_constraints(field: FieldInfo) -> str: + parts: list[str] = [] + for m in field.metadata: + for attr, sym in (("ge", "≥"), ("gt", ">"), ("le", "≤"), ("lt", "<")): + if hasattr(m, attr): + v = getattr(m, attr) + if v is not None: + parts.append(f"{sym}{v}") + if hasattr(m, "min_length") and m.min_length is not None: + parts.append(f"len ≥ {m.min_length}") + if hasattr(m, "max_length") and m.max_length is not None: + parts.append(f"len ≤ {m.max_length}") + return ", ".join(parts) + + +def slug(parts: list[str]) -> str: + return "-".join(parts).replace("_", "-").replace(".", "-").lower() + + +class Writer: + def __init__(self) -> None: + self.buf = io.StringIO() + self.toc: list[tuple[int, str, str]] = [] # (level, label, anchor) + + def h(self, level: int, text: str, anchor: str | None = None) -> None: + if anchor: + self.buf.write(f'<a id="{anchor}"></a>\n') + self.toc.append((level, text, anchor)) + self.buf.write(f"{'#' * level} {text}\n\n") + + def p(self, text: str) -> None: + self.buf.write(f"{text}\n\n") + + def raw(self, text: str) -> None: + self.buf.write(text) + + +def render_field_row( + writer: Writer, + path: str, + field: FieldInfo, + docstring: str, +) -> None: + name = f"`{path}`" + type_str = fmt_type(field.annotation) + default = fmt_default(field) + constraints = fmt_constraints(field) + desc = (field.description or docstring or "").strip().replace("\n", " ") + if constraints: + desc = f"_{constraints}._ {desc}" if desc else f"_{constraints}._" + writer.raw(f"| {name} | {type_str} | {default} | {desc} |\n") + + +def render_model( + writer: Writer, + model_cls: type[BaseModel], + path_prefix: str, + anchor_prefix: list[str], + depth: int, + seen:
set[type[BaseModel]], +) -> None: + """Render the fields of `model_cls` and recurse into nested BaseConfig sub-fields.""" + docstrings = _extract_field_docstrings(model_cls) + nested: list[tuple[str, type[BaseModel], FieldInfo, str]] = [] + union_fields: list[tuple[str, list[type[BaseModel]], FieldInfo, str]] = [] + flat_fields: list[tuple[str, FieldInfo, str]] = [] + + for name, field in model_cls.model_fields.items(): + full = f"{path_prefix}.{name}" if path_prefix else name + ds = docstrings.get(name, "") + ann = field.annotation + variants = discriminated_variants(field) + if variants is not None: + union_fields.append((full, variants, field, ds)) + continue + unwrapped = unwrap_annotated(ann) + if is_pydantic_model(unwrapped): + nested.append((full, unwrapped, field, ds)) + continue + # Optional[BaseConfig] case (e.g. `LoRAConfig | None`) + origin = typing.get_origin(unwrapped) + if origin in (typing.Union, types.UnionType): + args = [a for a in typing.get_args(unwrapped) if a is not type(None)] + if len(args) == 1 and is_pydantic_model(args[0]): + nested.append((full, args[0], field, ds)) + continue + flat_fields.append((full, field, ds)) + + if flat_fields: + writer.raw("| Field | Type | Default | Description |\n") + writer.raw("|---|---|---|---|\n") + for full, field, ds in flat_fields: + render_field_row(writer, full, field, ds) + writer.raw("\n") + + for full, child_cls, field, ds in nested: + sub_anchor = anchor_prefix + [full.split(".")[-1]] + heading = f"`{full}`" + writer.h(min(depth + 1, 6), heading, anchor=slug(sub_anchor)) + blurb = (field.description or ds or "").strip() + if blurb: + writer.p(blurb) + if child_cls in seen: + writer.p(f"_Recursive reference to_ `{child_cls.__name__}` _omitted._") + continue + render_model(writer, child_cls, full, sub_anchor, depth + 1, seen | {child_cls}) + + for full, variants, field, ds in union_fields: + sub_anchor = anchor_prefix + [full.split(".")[-1]] + heading = f"`{full}`" + writer.h(min(depth + 1, 6), 
heading, anchor=slug(sub_anchor)) + blurb = (field.description or ds or "").strip() + if blurb: + writer.p(blurb) + type_field = field.discriminator or "type" + writer.p( + f"Discriminated union — set `{full}.{type_field}` to one of " + + ", ".join(f"`{_type_literal(v, type_field)}`" for v in variants) + + " and provide the matching sub-fields." + ) + for variant in variants: + type_literal = _type_literal(variant, type_field) + var_anchor = sub_anchor + [type_literal or variant.__name__.lower()] + writer.h( + min(depth + 2, 6), + f'`{full}.{type_field} = "{type_literal}"` ({variant.__name__})', + anchor=slug(var_anchor), + ) + render_model(writer, variant, full, var_anchor, depth + 2, seen | {variant}) + + +def _type_literal(model_cls: type[BaseModel], type_field: str) -> str: + field = model_cls.model_fields.get(type_field) + if field is None: + return "" + ann = unwrap_annotated(field.annotation) + if typing.get_origin(ann) is typing.Literal: + return str(typing.get_args(ann)[0]) + return "" + + +def render_entrypoint(writer: Writer, ep: Entrypoint) -> None: + writer.h(2, ep.title, anchor=ep.slug) + writer.p(ep.blurb) + writer.p(f"_Defined in_ `{ep.cls.__module__}.{ep.cls.__qualname__}`.") + render_model(writer, ep.cls, "", [ep.slug], depth=2, seen={ep.cls}) + + +def render_toc(writer: Writer) -> str: + out = io.StringIO() + out.write("## Table of Contents\n\n") + for level, label, anchor in writer.toc: + if level == 2: + out.write(f"- [{label}](#{anchor})\n") + elif level == 3: + out.write(f" - [{label}](#{anchor})\n") + out.write("\n") + return out.getvalue() + + +HEADER = """# Reference + +This page documents every field accepted by every prime-rl entrypoint. It is +auto-generated from the Pydantic config models; do not edit by hand. + +To regenerate, run from the project root: + +```bash +uv run python scripts/generate_docs_reference.py +``` + +Each entrypoint section walks its config tree top-down. 
Nested sub-configs +appear under headings named after their dotted path (e.g. `trainer.model.ac`). +Discriminated unions (loss, advantage, scheduler, optimizer, …) document each +variant in turn — set the `type` field to pick one. + +For conceptual context behind these knobs, see +[Configuration](configuration.md), [Training](training.md), +[Scaling](scaling.md), [Algorithms](algorithms.md), and [Advanced](advanced.md). + +""" + + +def main() -> int: + writer = Writer() + writer.raw(HEADER) + # We render entrypoints into a separate buffer first so the TOC can be + # assembled from collected headings before being prepended. + body = Writer() + for ep in ENTRYPOINTS: + render_entrypoint(body, ep) + # Stitch: header + TOC built from body's headings + body content. + writer.raw(render_toc(body)) + writer.raw("---\n\n") + writer.raw(body.buf.getvalue()) + + OUT_PATH.write_text(writer.buf.getvalue()) + print(f"Wrote {OUT_PATH} ({writer.buf.tell()} chars)") + return 0 + + +if __name__ == "__main__": + sys.exit(main()) From 1220bc56aff5cbef727b1059c3b588fa5c9dda29 Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Fri, 22 May 2026 23:52:46 +0000 Subject: [PATCH 02/66] docs: fix stale claims found in source verification pass MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - algorithms.md: max_async_level default is 1 (not 2); default loss includes the Kimi-K2.5 KL regularizer (was wrongly claimed to drop it); update the formula to show the full L = -PG + tau_KL * KL form; filter table is `[[orchestrator.filters]]` (plural) - training.md: checkpoint paths nest under `checkpoints/step_N/{trainer, orchestrator}/` rather than separate hierarchies; --inference-gpu-ids / --trainer-gpu-ids don't exist — use --deployment.num-{infer,train}- gpus and pin physical GPUs via CUDA_VISIBLE_DEVICES; update max_async_level prose to match the new default - scaling.md: same GPU-flag fix throughout the single/multi-GPU examples; correct the claim that 
Muon + optim_cpu_offload is unsupported (only fsdp_cpu_offload is blocked) - configuration.md: there is no generic PRIME_* env var override mechanism in pydantic-config — rewrite the env vars section to list the specific named vars that individual fields read as defaults - advanced.md: add the qwen3_vl_moe entry to the VLM registry table; the small-scale MoE RL config lives at configs/ci/integration/reverse_text_moe/start.toml, not .../rl/ - faqs.md: update the max_async_level Q&A to match the new default Co-Authored-By: Claude Opus 4.7 (1M context) --- docs/advanced.md | 5 +++-- docs/algorithms.md | 49 +++++++++++++++++++++++++++---------------- docs/configuration.md | 25 ++++++++++------------ docs/faqs.md | 2 +- docs/scaling.md | 25 +++++++++++----------- docs/training.md | 16 +++++++------- 6 files changed, 68 insertions(+), 54 deletions(-) diff --git a/docs/advanced.md b/docs/advanced.md index a6ef4e83db..aeffd93ed6 100644 --- a/docs/advanced.md +++ b/docs/advanced.md @@ -69,10 +69,11 @@ The built-in VLM registry covers: | Family | `model_type` | Vision attr | LM attr | |---|---|---|---| | Qwen3-VL | `qwen3_vl` | `model.visual` | `model.language_model` | +| Qwen3-VL MoE | `qwen3_vl_moe` | `model.visual` | `model.language_model` | | Qwen3.5 | `qwen3_5` | `model.visual` | `model.language_model` | | Qwen3.5-MoE | `qwen3_5_moe` | `model.visual` | `model.language_model` | -For a model not in the table, look up the attribute paths on the loaded HF model with `model.named_children()`. +For a model not in the table, look up the attribute paths on the loaded HF model with `model.named_children()` and set them under `[model.vlm]` directly. ### Enabling VLM mode @@ -307,7 +308,7 @@ Loss drops from ~12 to ~2.5. 
The output won't be coherent, but the model now has ### Step 3: RL on reverse-text ```bash -uv run rl @ configs/ci/integration/rl/start.toml \ +uv run rl @ configs/ci/integration/reverse_text_moe/start.toml \ --model.name samsja/mini-glm-moe \ --trainer.model.impl custom \ --inference.gpu-memory-utilization 0.7 \ diff --git a/docs/algorithms.md b/docs/algorithms.md index 4011871229..decbaa5e0c 100644 --- a/docs/algorithms.md +++ b/docs/algorithms.md @@ -6,7 +6,7 @@ This page covers the math and the configurable algorithmic components: how off-p - [Async / off-policy training](#async--off-policy-training) - [Step semantics](#step-semantics) - - [The AIPO loss](#the-aipo-loss) + - [The default loss](#the-default-loss) - [Tuning `max_async_level`](#tuning-max_async_level) - [Loss](#loss) - [Default loss](#default-loss) @@ -23,7 +23,7 @@ This page covers the math and the configurable algorithmic components: how off-p ## Async / off-policy training -`prime-rl` is asynchronous by default. Inference is allowed to generate rollouts using a stale policy that is up to `k` steps behind the trainer, where `k = max_async_level`. Setting `k = 1` with matched trainer and inference step times produces fully-overlapped pipeline parallelism — neither side ever idles. The default is `k = 2`, chosen to absorb the latency of a cross-WAN weight broadcast for decentralized runs. +`prime-rl` is asynchronous by default. Inference is allowed to generate rollouts using a stale policy that is up to `k` steps behind the trainer, where `k = max_async_level`. Setting `k = 1` (the default) with matched trainer and inference step times produces fully-overlapped pipeline parallelism — neither side ever idles. Bump `k` higher when the weight-broadcast latency exceeds a single trainer step (e.g. cross-WAN decentralized runs) and the extra off-policy drift is acceptable. 
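As a rough illustration of the staleness bound, the check below treats a rollout as usable when the policy that generated it is at most `max_async_level` steps behind the trainer. The function name and signature are hypothetical, and how the orchestrator actually enforces the bound is not shown in this patch:

```python
def is_rollout_usable(rollout_policy_step: int, trainer_step: int, max_async_level: int = 1) -> bool:
    """Return True if the generating policy is at most `max_async_level` steps stale."""
    staleness = trainer_step - rollout_policy_step
    return 0 <= staleness <= max_async_level


# With the default k = 1, the trainer at step 5 accepts rollouts from policies 5 and 4.
usable = [step for step in range(6) if is_rollout_usable(step, trainer_step=5)]
print(usable)
```

Raising `max_async_level` widens the accepted window, which is where the extra off-policy drift comes from.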
![Two-Step Off-Policy Training](assets/two-step-off-policy.png) @@ -36,34 +36,47 @@ At step $n = 1, 2, 3, \dots$: So at step $n$ the gap between the policy being trained and the policy that generated the data is at most $k$ steps. Step indices are 0-indexed so the bound holds at startup. -### The AIPO loss +### The default loss -The default loss is a token-level variant of [AIPO](https://arxiv.org/abs/2505.24034), without the entropy and KL terms used in the original paper. For each prompt $x_j$ we sample a group of $G$ rollouts $\{y_i\}_{i=1}^G$, score them with the rubric to get $s_i$, then optimize: +The default RL loss combines a token-level [AIPO](https://arxiv.org/abs/2505.24034)-style policy-gradient term (importance-ratio clipped from above, plus DPPO token-level masking) with the Kimi-K2.5 KL regularizer. For each prompt $x_j$ we sample a group of $G$ rollouts $\{y_i\}_{i=1}^G$, score them to get $s_i$, then optimize: $$ -\mathcal{J}_{\text{AIPO}}(\theta) -= \frac{1}{\sum_{j=1}^N \sum_{i=1}^G |y_i^{(j)}|} -\sum_{j=1}^N \sum_{i=1}^G \sum_{t=1}^{|y_i^{(j)}|} +\mathcal{L}(\theta) = -\,\mathcal{J}_{\text{PG}}(\theta) \;+\; \tau_{KL}\,\mathcal{L}_{KL}(\theta) +$$ + +where the policy-gradient term is + +$$ +\mathcal{J}_{\text{PG}}(\theta) += \frac{1}{\sum_{j,i} |y_i^{(j)}|} +\sum_{j,i,t} \min\!\left(\frac{\pi(y_{i,t}^{(j)}\mid x_j, y_{i,` with `__` as the dot separator: - -```bash -export PRIME_MODEL__NAME=Qwen/Qwen3-0.6B -export PRIME_TRAINER__OPTIM__LR=1e-5 -``` +Only a fixed set of env vars are wired into individual config fields as their default. They're a convenience for things that legitimately vary per deployment (credentials, log levels) — they don't generalize to "set any field via env var." -In practice only a few env vars are used routinely: +- `PRIME_LOG_LEVEL` / `PRIME_VF_LOG_LEVEL` — defaults for `[log] level` and `[log] vf_level` (the prime-rl and verifiers loggers). 
+- `WANDB_API_KEY` / `HF_TOKEN` — read directly by W&B and `huggingface_hub`, never by prime-rl itself. +- `PRIME_API_KEY` — read by `[orchestrator.prime_monitor]` for [platform monitoring](training.md#platform-monitoring). The env var name is itself configurable via `prime_monitor.api_key_var`. +- `INFERENCE_SERVER_API_KEY` (or whatever you set in `client.api_key_var`) — used by the orchestrator to authenticate to the inference server. -- `PRIME_LOG_LEVEL` / `PRIME_VF_LOG_LEVEL` — log levels for the prime-rl and verifiers loggers (the `[log]` defaults read these). -- `WANDB_API_KEY` / `HF_TOKEN` — third-party credentials. -- `PRIME_API_KEY` — for [Prime Intellect platform monitoring](training.md#platform-monitoring). +Any other field needs to be set in TOML or on the CLI. ## Inspecting and validating diff --git a/docs/faqs.md b/docs/faqs.md index 35cd4ded5d..f3e362dde9 100644 --- a/docs/faqs.md +++ b/docs/faqs.md @@ -83,7 +83,7 @@ See [Configuration § Environments](configuration.md#environments-orchestratortr ### What does `max_async_level` actually do? -It caps how many steps inference can run ahead of training. `1` is pipelined (overlapped) but fully responsive; `2` (default) absorbs slower weight broadcasts. Higher values give more throughput at the cost of off-policy drift. Watch `mismatch_kl/all/mean` — if it grows, lower the value. See [Algorithms § Tuning `max_async_level`](algorithms.md#tuning-max_async_level). +It caps how many steps inference can run ahead of training. `1` (default) is pipelined — inference for step n+1 runs concurrently with trainer step n; off-policy drift is minimal. `2` absorbs slower weight broadcasts (e.g. cross-WAN). Higher values give more throughput at the cost of more drift; watch `mismatch_kl/all/mean`. See [Algorithms § Tuning `max_async_level`](algorithms.md#tuning-max_async_level). ### Why are there two W&B runs per RL job? 
diff --git a/docs/scaling.md b/docs/scaling.md index 1f2bf3bba0..16a8ec50d7 100644 --- a/docs/scaling.md +++ b/docs/scaling.md @@ -36,7 +36,7 @@ This page covers how to scale `prime-rl` from a single GPU to a 1000-GPU cluster | You have… | Use this layout | |---|---| | 1 GPU | Single-GPU co-located RL (small model) or SFT-only | -| 1 node, 2–8 GPUs | `uv run rl` with `--inference-gpu-ids` / `--trainer-gpu-ids` | +| 1 node, 2–8 GPUs | `uv run rl` with `--deployment.num-infer-gpus N --deployment.num-train-gpus M` | | 1 node, 8 GPUs, large MoE | Custom impl + EP + activation checkpointing | | 2+ nodes, SLURM | `[slurm]` + `[deployment]` overlay (recommended) | | 2+ nodes, no SLURM | Manual `uv run inference` + `uv run orchestrator` + `uv run torchrun src/.../train.py` | @@ -45,14 +45,14 @@ This page covers how to scale `prime-rl` from a single GPU to a 1000-GPU cluster ## Single GPU -The trainer and inference server can share a GPU for small models or smoke tests. Pin both to GPU 0 and tighten the inference memory budget so the trainer has room: +The trainer and inference server can share a GPU for small models or smoke tests. Pin everything to one physical GPU via `CUDA_VISIBLE_DEVICES`, set both deployment counts to 1, and tighten the inference memory budget so the trainer has room: ```bash bash scripts/tmux.sh -uv run rl @ configs//rl.toml \ - --trainer-gpu-ids 0 \ - --inference-gpu-ids 0 \ +CUDA_VISIBLE_DEVICES=0 uv run rl @ configs//rl.toml \ + --deployment.num-infer-gpus 1 \ + --deployment.num-train-gpus 1 \ --inference.gpu-memory-utilization 0.5 ``` @@ -73,26 +73,27 @@ For SFT, single-GPU is the default — `uv run sft` runs without torchrun unless ### RL placement -`rl` defaults to GPU 0 for inference and GPU 1 for the trainer. Override the placement for a typical 8-GPU node by giving inference 6 GPUs with data parallelism and the trainer the remaining 2: +`rl` defaults to 1 trainer GPU and 1 inference GPU. 
To give inference 6 GPUs with data parallelism and the trainer the remaining 2 on an 8-GPU node: ```bash uv run rl @ rl.toml \ - --inference-gpu-ids 0,1,2,3,4,5 \ - --trainer-gpu-ids 6,7 \ + --deployment.num-infer-gpus 6 \ + --deployment.num-train-gpus 2 \ --inference.parallel.dp 6 ``` +The launcher allocates GPUs in order from `CUDA_VISIBLE_DEVICES` (or all visible GPUs): inference first, trainer next, teacher last. To target a specific physical subset, pin `CUDA_VISIBLE_DEVICES` before launching. + For quick A/B ablations on the same node, run two RL instances side-by-side in separate tmux sessions, each pinned to half the GPUs and a separate inference port: ```bash # session 1, GPUs 0–1, default port 8000 bash scripts/tmux.sh -s exp1 -o outputs/exp1 -uv run rl @ rl.toml --output-dir outputs/exp1 +CUDA_VISIBLE_DEVICES=0,1 uv run rl @ rl.toml --output-dir outputs/exp1 # session 2, GPUs 2–3, port 8001 bash scripts/tmux.sh -s exp2 -o outputs/exp2 -uv run rl @ rl.toml \ - --inference-gpu-ids 2 --trainer-gpu-ids 3 \ +CUDA_VISIBLE_DEVICES=2,3 uv run rl @ rl.toml \ --inference.server.port 8001 \ --orchestrator.client.base-url http://localhost:8001/v1 \ --output-dir outputs/exp2 @@ -180,7 +181,7 @@ type = "adamw" optim_cpu_offload = true ``` -Mutually exclusive with `fsdp_cpu_offload`. Not supported with the Muon optimizer. +Mutually exclusive with `fsdp_cpu_offload`. Also incompatible with `trainer.max_concurrent_runs > 1` (the multi-run manager). Muon doesn't support `fsdp_cpu_offload` but does support `optim_cpu_offload`. ## Memory-tight recipe diff --git a/docs/training.md b/docs/training.md index bfc712a105..8a9386b893 100644 --- a/docs/training.md +++ b/docs/training.md @@ -57,15 +57,17 @@ uv run rl @ configs/gsm8k/rl.toml \ --ckpt ``` -GPU placement: by default `rl` puts inference on GPU 0 and the trainer on GPU 1. Override with `--inference-gpu-ids` / `--trainer-gpu-ids`: +GPU placement: by default `rl` uses 1 trainer GPU and 1 inference GPU on the local node. 
To run on (say) 8 GPUs with 4 inference + 4 trainer, set the deployment counts: ```bash uv run rl @ rl.toml \ - --inference-gpu-ids 0,1,2,3 \ - --trainer-gpu-ids 4,5,6,7 \ + --deployment.num-infer-gpus 4 \ + --deployment.num-train-gpus 4 \ --inference.parallel.dp 4 ``` +The launcher assigns physical GPUs from `CUDA_VISIBLE_DEVICES` (or all visible GPUs if unset) — inference takes the first `num_infer_gpus`, the trainer takes the next `num_train_gpus`, and any teacher gets the remainder. To run on a specific subset of physical GPUs, pin `CUDA_VISIBLE_DEVICES` before launching. + For multi-node and SLURM, see [Scaling § RL training](scaling.md#rl-training). ### What each process does at runtime @@ -85,7 +87,7 @@ These are the knobs you'll touch most often. The full field reference for each l | `model.name` | top-level | HF model ID or local path. Auto-fans-out to trainer/orchestrator/inference. | | `max_steps` | top-level | Number of trainer steps before exit. | | `seq_len` | top-level | Max sequence length per training sample; also enforced by the orchestrator when packing. | -| `max_async_level` | top-level | How many steps inference can run ahead of the trainer. 1 = fully overlapped; >1 = more off-policy, higher throughput. See [Algorithms § Async](algorithms.md#async--off-policy-training). | +| `max_async_level` | top-level | How many steps inference can run ahead of the trainer. `1` (default) is fully overlapped; `>1` is more off-policy with potentially higher throughput. See [Algorithms § Async](algorithms.md#async--off-policy-training). | | `orchestrator.batch_size` | orchestrator | Prompts per trainer step. | | `orchestrator.rollouts_per_example` | orchestrator | Rollouts per prompt (the group size used for advantage normalization). | | `orchestrator.train.sampling.max_completion_tokens` | orchestrator | Max tokens per turn at sampling time. 
| @@ -182,8 +184,8 @@ Checkpointing is split across processes because the orchestrator and trainer can | Process | What's saved | Where | |---|---|---| -| Trainer | FSDP-sharded model (DCP), optimizer, scheduler, progress | `/checkpoints/step_N/` | -| Orchestrator | Step counter, total tokens / samples / problems | `/checkpoints/orchestrator/step_N/` | +| Trainer | FSDP-sharded model (DCP), optimizer, scheduler, progress | `/checkpoints/step_N/trainer/` | +| Orchestrator | Step counter, total tokens / samples / problems | `/checkpoints/step_N/orchestrator/` | | Inference | _nothing_ — re-pushed from the latest checkpoint on restart | n/a | | Trainer (HF weights) | HF-compatible weight snapshot for serving | `/weights/step_N/` | @@ -336,7 +338,7 @@ curl -s http://localhost:8000/metrics | grep -E "num_requests|gpu_cache_usage" - **Start small.** Run `configs/gsm8k/rl.toml` end-to-end on 2 GPUs before scaling. If GSM8K runs cleanly, your install is good. - **Eyeball the reward distribution.** If `reward/all/std` collapses to ~0 within a few steps, the env is too easy or rewards are degenerate — increase difficulty or check the rubric. - **Match `inference.parallel.tp` to model layout.** TP > num attention heads / 2 starts losing efficiency. For dense models keep TP small and use DP for throughput. For MoE-heavy models prefer EP. -- **Set `max_async_level` deliberately.** `1` = fully synced overlap (lowest off-policy drift). `2` = default, suited for cross-WAN weight broadcast. Higher values trade more drift for throughput; watch `mismatch_kl/all/mean`. +- **Set `max_async_level` deliberately.** `1` (default) = pipelined overlap, lowest off-policy drift. `2` absorbs longer weight-broadcast latency (e.g. cross-WAN). Higher values trade more drift for throughput; watch `mismatch_kl/all/mean`. - **Pin `output_dir` per run.** Sharing a directory across runs will mix rollouts and break resumes. `--output-dir outputs/` is the simplest discipline. 
- **Use `--dry-run` before SLURM.** Validators (CP needs flash-attention, NCCL broadcast needs `max_async_level=1`, etc.) fail fast in dry-run and slow in queue. - **Don't change `optimization_dtype` / `reduce_dtype`.** These are load-bearing — flipping bfloat16/float32 silently changes training dynamics. Stick with defaults unless you know what you're doing. From 154de2e48a52f7e6754948ca6b33f4966cc960f1 Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Fri, 22 May 2026 23:55:18 +0000 Subject: [PATCH 03/66] ci: enforce docs/reference.md stays in sync with configs Adds two safety nets so the auto-generated reference can't silently drift from the Pydantic config models: - Pre-commit hook (local): re-runs scripts/generate_docs_reference.py whenever a config class or the generator itself is staged. If the generated file changes, pre-commit fails the commit so the contributor re-stages the regenerated reference. - GitHub Actions (CI): a small workflow runs the generator and `git diff --exit-code docs/reference.md`. Catches anyone who bypassed the pre-commit hook. 
Co-Authored-By: Claude Opus 4.7 (1M context) --- .github/workflows/docs-reference.yaml | 42 +++++++++++++++++++++++++++ .pre-commit-config.yaml | 10 +++++++ 2 files changed, 52 insertions(+) create mode 100644 .github/workflows/docs-reference.yaml diff --git a/.github/workflows/docs-reference.yaml b/.github/workflows/docs-reference.yaml new file mode 100644 index 0000000000..baf200fc6b --- /dev/null +++ b/.github/workflows/docs-reference.yaml @@ -0,0 +1,42 @@ +name: Docs reference + +on: + push: + branches: [main] + pull_request: + types: [opened, synchronize, reopened, ready_for_review] + paths: + - "packages/prime-rl-configs/src/prime_rl/configs/**" + - "scripts/generate_docs_reference.py" + - "docs/reference.md" + - ".github/workflows/docs-reference.yaml" + +jobs: + reference-in-sync: + name: docs/reference.md in sync with configs + runs-on: ubuntu-latest + steps: + - name: Checkout repository + uses: actions/checkout@v4 + with: + submodules: false + - name: Init public submodules + run: | + git config --global url."https://github.com/".insteadOf "git@github.com:" + git submodule update --init --recursive -- deps/verifiers deps/renderers deps/research-environments deps/pydantic-config + - name: Install uv + uses: astral-sh/setup-uv@v5 + with: + enable-cache: true + cache-dependency-glob: "uv.lock" + - name: Install dependencies + run: uv sync --all-extras --locked + - name: Regenerate reference.md + run: uv run python scripts/generate_docs_reference.py + - name: Fail if reference.md drifted + run: | + if ! git diff --exit-code docs/reference.md; then + echo "::error::docs/reference.md is out of sync with the configs." + echo "Run: uv run python scripts/generate_docs_reference.py" + exit 1 + fi diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml index de1acda1be..cb6a728465 100644 --- a/.pre-commit-config.yaml +++ b/.pre-commit-config.yaml @@ -9,3 +9,13 @@ repos: # Run the formatter. 
- id: ruff-format args: [ --config=pyproject.toml ] + +- repo: local + hooks: + - id: regen-docs-reference + name: Regenerate docs/reference.md + language: system + entry: uv run python scripts/generate_docs_reference.py + pass_filenames: false + # Regenerate when any config class or the generator itself changes. + files: ^(packages/prime-rl-configs/src/prime_rl/configs/.*\.py|scripts/generate_docs_reference\.py)$ From b234055c7be1cc99576ddac1cda9e30ef5973107 Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Fri, 22 May 2026 23:55:31 +0000 Subject: [PATCH 04/66] chore: rename pre-commit hook id to docs-reference Co-Authored-By: Claude Opus 4.7 (1M context) --- .pre-commit-config.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml index cb6a728465..ed7e62bd68 100644 --- a/.pre-commit-config.yaml +++ b/.pre-commit-config.yaml @@ -12,7 +12,7 @@ repos: - repo: local hooks: - - id: regen-docs-reference + - id: docs-reference name: Regenerate docs/reference.md language: system entry: uv run python scripts/generate_docs_reference.py From 5aa8765bb334e92f93c89baa4d15efea62994f21 Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Sat, 23 May 2026 00:17:33 +0000 Subject: [PATCH 05/66] docs(overview): tighten landing page MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Quick run now uses examples/reverse_text/rl.toml; the env is bundled with the verifiers submodule so no prime env install is needed, and the tmux helper is documented elsewhere instead of duplicated here - Architecture bullets advertise the SOTA features per process: vLLM multi-node + FP8 + P/D disaggregation for inference; FSDP2 + EP (incl. 
DeepEP) + CP + selective AC + FP8 + LoRA + multi-run for the trainer - Drop the "Use prime-rl when you want to" bullets and the CPU-only SFT smoke check — the landing page reads cleaner without them Co-Authored-By: Claude Opus 4.7 (1M context) --- docs/overview.md | 38 ++++++++------------------------------ 1 file changed, 8 insertions(+), 30 deletions(-) diff --git a/docs/overview.md b/docs/overview.md index 8ae0fd8531..4bf20735ed 100644 --- a/docs/overview.md +++ b/docs/overview.md @@ -1,13 +1,6 @@ # Overview -`prime-rl` is a framework for large-scale, asynchronous reinforcement learning of large language models. It is designed to be easy to use and hackable, yet capable of scaling to 1000+ GPU clusters. Models are trained with PyTorch FSDP2 (with optional expert and context parallelism), rollouts are generated with vLLM, and the two halves talk to each other through a thin orchestrator process that owns dataset sampling, advantage computation, and weight broadcasting. - -Use `prime-rl` when you want to: - -- Train an open-weights LLM with RL on one of the [Environments Hub](https://app.primeintellect.ai/dashboard/environments?ex_sort=most_stars) tasks, or your own [verifiers](https://github.com/PrimeIntellect-ai/verifiers) environment. -- Post-train with SFT, then continue with RL, using the same model loader, checkpoint format, and chat template plumbing for both phases. -- Scale across multiple nodes — SLURM, Kubernetes, or hand-launched — without rewriting your config. -- Run agentic multi-turn rollouts (tool use, browser environments, long horizons) without re-tokenizing across turns. +`prime-rl` is a framework for large-scale, asynchronous reinforcement learning of large language models. It is designed to be easy to use and hackable, yet capable of training 1T+-parameter MoE models on 1000+ GPU clusters. 
## Architecture @@ -15,9 +8,9 @@ A `prime-rl` RL run is three cooperating processes: ![Architecture](assets/architecture.png) -- **Inference** — A vLLM server (or fleet) that holds the current policy weights and serves OpenAI-compatible completions. Updated in place via a custom `update_weights` endpoint after each trainer step. -- **Orchestrator** — A lightweight CPU process that samples prompts, drives `verifiers` environments to generate rollouts against the inference server, packs them into training batches, ships them to the trainer, and relays new weights back to inference. -- **Trainer** — A torchrun-launched FSDP2 process group that consumes packed rollouts, computes the loss, steps the optimizer, and writes the new policy to the weight broadcast transport. +- **Inference** — A vLLM-backed server (or fleet) that holds the current policy and serves OpenAI-compatible completions. Scales from a single co-located GPU to multi-node fleets with tensor + data parallelism, FP8 inference, and prefill/decode disaggregation for high-throughput long-context serving. Updated in place via a custom `update_weights` endpoint, with NCCL or filesystem transports. +- **Orchestrator** — A lightweight CPU process that samples prompts from one or more [verifiers](https://github.com/PrimeIntellect-ai/verifiers) environments, drives multi-turn rollouts against the inference fleet (tool use, browsers, sandboxes, long horizons) without re-tokenizing across turns, computes advantages, packs the rollouts into training batches, and relays new weights back to inference. +- **Trainer** — A torchrun-launched FSDP2 process group that consumes packed rollouts and steps the optimizer. For MoE families we ship optimized custom modeling code with expert parallelism (EP) — including DeepEP kernels — and context parallelism (CP) for long-sequence training. 
Plus selective activation checkpointing, FP8 training on Hopper+, LoRA, and a multi-run manager that hosts many concurrent adapters in one trainer process. The three processes communicate through configurable transports — by default the trainer↔orchestrator rollout link uses the local filesystem, and weight broadcast uses the filesystem (or NCCL for synchronous setups). Swap to ZMQ for multi-host setups without shared storage. See [Scaling](scaling.md) for the deployment options. @@ -35,31 +28,16 @@ You need at least one NVIDIA GPU (RTX 3090/4090/5090, A100, H100, H200, or B200) ## Quick run -Train `Qwen3-0.6B` on GSM8K with one trainer GPU and one inference GPU. This config ships in the repo: +Train an SFT-warmed `Qwen3-0.6B` on the `reverse-text` task — the env is bundled with the `verifiers` submodule so no separate install is needed. This config ships in the repo and runs on two GPUs (one for inference, one for the trainer): ```bash -# 1. Install the verifiers environment from the Environments Hub. -prime env install primeintellect/math-env - -# 2. Set up a four-pane tmux session that tails each process's logs. -bash scripts/tmux.sh - -# 3. From the `Trainer` pane, launch all three processes co-located on this node. -uv run rl @ configs/gsm8k/rl.toml \ +uv run rl @ examples/reverse_text/rl.toml \ --wandb.project your-project \ - --wandb.name gsm8k-smoke \ + --wandb.name reverse-text-smoke \ --ckpt ``` -The `rl` entrypoint reads `configs/gsm8k/rl.toml`, splits it into per-process sub-configs, picks GPU 0 for inference and GPU 1 for the trainer, launches all three processes, and tees their stdout into `outputs/logs/{trainer,orchestrator,inference}.log`. Watch the tmux panes — within a minute the trainer should log `step 1` and a reward sample. - -After 100 steps the run completes. Final HF-compatible weights land at `outputs/weights/step_100`. 
- -For a CPU-only smoke check (no real training, no GPU), use the SFT fake-data config: - -```bash -uv run sft @ configs/debug/sft/train.toml -``` +The `rl` entrypoint reads `examples/reverse_text/rl.toml`, splits it into per-process sub-configs, picks GPU 0 for inference and GPU 1 for the trainer, launches all three processes, and tees their stdout into `outputs/logs/{trainer,orchestrator,inference}.log`. Within a minute the trainer should log `step 1` and a reward sample; after 20 steps the run completes and final HF-compatible weights land at `outputs/weights/step_20`. For multi-GPU, multi-node, SLURM, and Kubernetes layouts, see [Scaling](scaling.md). From 6c6c70e0ff9079b841f01bfb1f7488f903131854 Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Sat, 23 May 2026 00:22:31 +0000 Subject: [PATCH 06/66] docs(configuration): trim and reorganize MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Drop the env-vars section entirely (and the precedence callout that referenced it) — the page is now strictly TOML + CLI. The few named env vars that individual fields read as defaults are out of scope for the config docs and stay in the per-feature pages (training.md, etc.) 
- Drop the entrypoint enumeration and the W&B/output-dir recommendation blurb in the intro - Reword "@ introduces a TOML file" so the sentence doesn't lead with an inline code token; convert the "Mind the space" hint to a blockquote - Drop --output-dir from the convenience-flag list (it's just another override, not a special flag) - Note that --dry-run is available on rl, sft, and inference only — the standalone trainer and orchestrator configs don't have a dry_run field - Split "Booleans, None, and lists" into one section each, matching the pydantic-config README style; add a Dicts section - Drop the "Prefer --ckpt" convention bullet — checkpointing is covered in training.md and didn't belong in the config conventions Co-Authored-By: Claude Opus 4.7 (1M context) --- docs/configuration.md | 53 ++++++++++++++++++++----------------------- 1 file changed, 25 insertions(+), 28 deletions(-) diff --git a/docs/configuration.md b/docs/configuration.md index 6fa92cae50..f0374f870e 100644 --- a/docs/configuration.md +++ b/docs/configuration.md @@ -1,16 +1,18 @@ # Configuration -Every `prime-rl` entrypoint — `rl`, `sft`, `trainer`, `orchestrator`, `inference`, `eval` — is configured by the same system: TOML files for reproducible base configs, CLI flags for one-off overrides, and a small set of environment variables for production deployments. Under the hood it is [`pydantic-config`](https://github.com/PrimeIntellect-ai/pydantic-config) wrapping our Pydantic config models. +Every `prime-rl` entrypoint is configured by the same system: TOML files for reproducible base configs and CLI flags for one-off overrides. Under the hood it is [`pydantic-config`](https://github.com/PrimeIntellect-ai/pydantic-config) wrapping our Pydantic config models. 
## Table of Contents - [Sources and precedence](#sources-and-precedence) - [TOML files and composition](#toml-files-and-composition) - [CLI overrides](#cli-overrides) -- [Environment variables](#environment-variables) - [Inspecting and validating](#inspecting-and-validating) - [Special syntax](#special-syntax) - - [Booleans, `None`, and lists](#booleans-none-and-lists) + - [Booleans](#booleans) + - [None](#none) + - [Lists](#lists) + - [Dicts](#dicts) - [Optional sub-configs](#optional-sub-configs) - [Discriminated unions](#discriminated-unions) - [Environments (`[[orchestrator.train.env]]`)](#environments-orchestratortrainenv) @@ -25,13 +27,9 @@ For a single field, sources are applied in this order — later sources win: 2. **TOML files** — passed with `@`, left to right (later files override earlier ones). 3. **CLI flags** — dotted, kebab-case (`--model.name`). -There is no generic `PRIME_*` env-var override for arbitrary fields. A small number of specific fields (log levels, API keys, monitor URLs) read named env vars as their **default** — see [Environment variables](#environment-variables) below — but a field that has been set in a TOML or on the CLI is not overridden by an env var. - -Recommendation: pin reproducible experiments in TOML and override one-off knobs (W&B name, output dir, max steps) on the CLI. - ## TOML files and composition -`@` introduces a TOML file. Multiple `@` arguments compose left-to-right, deep-merged — unset fields in an overlay keep the base value: +The `@` token introduces a TOML file. Multiple `@` arguments compose left-to-right, deep-merged — unset fields in an overlay keep the base value: ```bash uv run rl @ configs/gsm8k/rl.toml # one file @@ -40,7 +38,7 @@ uv run rl --trainer @ trainer.toml --orchestrator @ orch.toml # per-section uv run rl @ base.toml --trainer @ trainer.toml # mixed ``` -**Mind the space**: `@ path/to/x.toml`, not `@path/to/x.toml`. +> Mind the space: `@ path/to/x.toml`, not `@path/to/x.toml`. 
The composed `rl` entrypoint splits its config across three processes — `[trainer]`, `[orchestrator]`, and `[inference]` tables become the sub-configs for each. Shared knobs (`model.name`, `output_dir`, `wandb.*`, …) live at the top level and are fanned out automatically. Stand-alone entrypoints (`uv run trainer`, `uv run orchestrator`, …) skip this lifting — their TOMLs have no `[trainer]` table because the whole file _is_ the trainer. @@ -57,22 +55,10 @@ CLI flags mirror the TOML tree using dots, with kebab-case for field names (the Field names in TOML use snake_case (`max_model_len`); the same field on the CLI is kebab-case (`--max-model-len`). -Three convenience flags every entrypoint accepts: +Two convenience flags every entrypoint accepts: - `--help` — prints the full schema (all fields, defaults, types, descriptions). -- `--dry-run` — resolves the full config, writes it to `/configs/`, and exits without launching anything. Use to debug composition. -- `--output-dir ` — top-level override for the run's working directory (logs, checkpoints, weight snapshots). - -## Environment variables - -Only a fixed set of env vars are wired into individual config fields as their default. They're a convenience for things that legitimately vary per deployment (credentials, log levels) — they don't generalize to "set any field via env var." - -- `PRIME_LOG_LEVEL` / `PRIME_VF_LOG_LEVEL` — defaults for `[log] level` and `[log] vf_level` (the prime-rl and verifiers loggers). -- `WANDB_API_KEY` / `HF_TOKEN` — read directly by W&B and `huggingface_hub`, never by prime-rl itself. -- `PRIME_API_KEY` — read by `[orchestrator.prime_monitor]` for [platform monitoring](training.md#platform-monitoring). The env var name is itself configurable via `prime_monitor.api_key_var`. -- `INFERENCE_SERVER_API_KEY` (or whatever you set in `client.api_key_var`) — used by the orchestrator to authenticate to the inference server. - -Any other field needs to be set in TOML or on the CLI. 
+- `--dry-run` — resolves the full config, writes it to `/configs/`, and exits without launching anything. Use to debug composition. _Available on `rl`, `sft`, and `inference`; not on the standalone `trainer` or `orchestrator` entrypoints._ ## Inspecting and validating @@ -87,15 +73,17 @@ When a validator fails, the error names the conflicting fields — fix one and r ## Special syntax -### Booleans, `None`, and lists +### Booleans -**Booleans** — CLI uses paired flags: `--ckpt` enables, `--no-ckpt` disables. TOML must be explicit: +CLI uses paired flags: `--ckpt` enables, `--no-ckpt` disables. TOML must be explicit: ```toml ckpt = true ``` -**None** — TOML has no `null`. Use the string `"None"`, which the loader coerces: +### None + +TOML has no `null`. Use the string `"None"`, which the loader coerces: ```toml [inference.model] @@ -104,12 +92,22 @@ max_model_len = "None" On the CLI: `--inference.model.max-model-len None`. -**Lists** — TOML uses arrays of tables (see the env example below). Overlays **replace** lists wholesale, so an overlay that only wants to add an env still has to include the full list. On the CLI, index by position: +### Lists + +TOML uses arrays of tables (see [Environments](#environments-orchestratortrainenv) below for the canonical example). Overlays **replace** lists wholesale, so an overlay that only wants to add an env still has to include the full list. On the CLI, index by position: ```bash --orchestrator.train.env.0.id math-env --orchestrator.train.env.1.id reverse-text ``` +### Dicts + +Use a TOML table or inline-table syntax. On the CLI, pass a JSON literal: + +```bash +--orchestrator.train.env.0.args '{"dataset_name": "openai/gsm8k", "dataset_subset": "main"}' +``` + ### Optional sub-configs Many sub-configs are typed `SomeConfig | None`. 
Two patterns enable them: @@ -180,7 +178,6 @@ Each per-process TOML reflects the final, validated configuration that the actua - **Reproducible base, mutable overlays.** Commit base TOMLs alongside example dirs (`configs//rl.toml`). Override on the CLI for one-shot experiments; promote overrides to a new TOML when they stabilize. - **One W&B name per run.** Pass `--wandb.name ` on every launch. The orchestrator and trainer share the W&B run, so the same name surfaces all metrics together. - **Always pin `output_dir`.** Per-run output directories prevent rollout files from one run leaking into another's training step. Use `--output-dir outputs/` or pin in TOML. -- **Prefer `--ckpt` for any run you might resume.** Without `ckpt`, only HF weight snapshots are written — you can serve them but cannot resume optimizer/scheduler state. See [Training § Checkpointing](training.md#checkpointing). - **Dry-run before scaling.** A multi-node SLURM job that crashes on a config validator wastes a queue slot. Always `--dry-run` first. For the full set of fields, defaults, types, and constraints accepted by each entrypoint, jump to [Reference](reference.md). From e031e19032ffc308c3a0a9ba81a3b6690e0d1221 Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Sat, 23 May 2026 00:23:05 +0000 Subject: [PATCH 07/66] docs(training): drop redundant OpenAI-compatible prefix on vLLM server vLLM is an OpenAI-compatible server by default; the prefix in the entrypoints table was just noise. Other mentions of "OpenAI-compatible" describe the API surface or third-party endpoints and stay. 
Co-Authored-By: Claude Opus 4.7 (1M context) --- docs/training.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/training.md b/docs/training.md index 8a9386b893..37cb3cecb8 100644 --- a/docs/training.md +++ b/docs/training.md @@ -35,7 +35,7 @@ This page covers everything you need to launch, observe, checkpoint, and recover |---|---|---| | `uv run rl` | Co-launches inference + orchestrator + trainer on one node | The default for any single-node RL run. Mirrors a `[trainer]` + `[orchestrator]` + `[inference]` TOML. | | `uv run sft` | Supervised fine-tuning on a HF dataset | Launches torchrun internally; never call torchrun directly. | -| `uv run inference` | OpenAI-compatible vLLM server | Always use this entrypoint over `vllm serve` — it adds `/update_weights`, `/load_lora_adapter`, and `/init_broadcaster`. | +| `uv run inference` | vLLM server | Always use this entrypoint over `vllm serve` — it adds `/update_weights`, `/load_lora_adapter`, and `/init_broadcaster`. | | `uv run trainer` | Standalone trainer process group | Use only when launching the trainer separately from the orchestrator (e.g. multi-node RL without the `rl` wrapper). | | `uv run orchestrator` | Standalone orchestrator process | Pair with a separately-launched trainer + inference. | From 1d50afe694f6d02360f10c9b3b35bfcbc1cc599c Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Sat, 23 May 2026 00:32:40 +0000 Subject: [PATCH 08/66] docs(training): trim and restructure per review - Quick-start now uses examples/reverse_text/rl.toml; drop the prime env install and tmux preamble (covered elsewhere) - Add a "Useful CLI flags" subsection: --ckpt, --wandb, --orchestrator. 
prime-monitor (Prime Lab), --clean-output-dir, --output-dir, --max-steps, --dry-run - Mention the env-server / env-worker fan-out in the orchestrator bullet under "What each process does at runtime" - Restrict the Key knobs table to orchestrator-only args; drop max_async_level, max_completion_tokens, inference, trainer rows; rename rollouts_per_example row to lead with "Group size" - SFT Launch now uses examples/reverse_text/sft.toml; drop the CPU fake-data smoke alternative - "two distillation modes" -> "three training modes" (rl/opd/sft) - Drop the long-run checkpoint-combo recommendation - Drop the trainer+orchestrator lockstep note from Resuming a run - Swap order: Platform monitoring now appears before Prometheus + BetterStack under Observability; show --orchestrator.prime-monitor CLI invocation - Rename "Metrics that matter" -> "Important metrics"; drop the live vLLM curl snippet - Drop "Eyeball the reward distribution", "Match inference.parallel.tp", and "Set max_async_level deliberately" rules of thumb - Add new rules of thumb: batch size >= 64; group size >= 8 with the reasoning that all-succeed / all-fail groups give the trainer no signal because the within-group advantage collapses - Drop the Common Issues section entirely Co-Authored-By: Claude Opus 4.7 (1M context) --- docs/training.md | 125 ++++++++++++++++++----------------------------- 1 file changed, 48 insertions(+), 77 deletions(-) diff --git a/docs/training.md b/docs/training.md index 37cb3cecb8..14d5e2bacf 100644 --- a/docs/training.md +++ b/docs/training.md @@ -7,6 +7,7 @@ This page covers everything you need to launch, observe, checkpoint, and recover - [Entrypoints](#entrypoints) - [RL training](#rl-training) - [Launch](#launch) + - [Useful CLI flags](#useful-cli-flags) - [What each process does at runtime](#what-each-process-does-at-runtime) - [Key knobs](#key-knobs) - [SFT training](#sft-training) @@ -23,11 +24,10 @@ This page covers everything you need to launch, observe, checkpoint, and 
recover - [Log files](#log-files) - [Console output and the tmux helper](#console-output-and-the-tmux-helper) - [Weights & Biases](#weights--biases) - - [Prometheus and BetterStack](#prometheus-and-betterstack) - [Platform monitoring](#platform-monitoring) -- [Metrics that matter](#metrics-that-matter) + - [Prometheus and BetterStack](#prometheus-and-betterstack) +- [Important metrics](#important-metrics) - [Rules of thumb](#rules-of-thumb) -- [Common issues](#common-issues) ## Entrypoints @@ -45,15 +45,12 @@ This page covers everything you need to launch, observe, checkpoint, and recover ### Launch -The minimal single-node RL run uses a shipped example config. From the project root: +The minimal RL run trains an SFT-warmed `Qwen3-0.6B` on the `reverse-text` task — the env is bundled with the `verifiers` submodule, so nothing else needs to be installed. From the project root, on two GPUs (one for inference, one for the trainer): ```bash -prime env install primeintellect/math-env # install the env once -bash scripts/tmux.sh # 4-pane tmux that tails the logs - -uv run rl @ configs/gsm8k/rl.toml \ +uv run rl @ examples/reverse_text/rl.toml \ --wandb.project my-project \ - --wandb.name gsm8k-smoke \ + --wandb.name reverse-text-smoke \ --ckpt ``` @@ -70,31 +67,37 @@ The launcher assigns physical GPUs from `CUDA_VISIBLE_DEVICES` (or all visible G For multi-node and SLURM, see [Scaling § RL training](scaling.md#rl-training). +### Useful CLI flags + +Commonly-used flags every RL launch should know about: + +| Flag | What it does | +|---|---| +| `--ckpt` | Enable end-of-training checkpoint. See [Checkpointing](#checkpointing) for interval / keep-last / resume variants. | +| `--wandb` | Enable Weights & Biases logging with defaults. Pair with `--wandb.project` / `--wandb.name`. | +| `--orchestrator.prime-monitor` | Register the run on the Prime Intellect platform (Lab) and stream metrics there. See [Platform monitoring](#platform-monitoring). 
| +| `--clean-output-dir` | Wipe `` before starting. Useful when re-running an experiment with the same name during iteration. | +| `--output-dir outputs/` | Per-run output directory. Always set this when running more than one experiment in parallel. | +| `--max-steps N` | Stop after `N` trainer steps. Overrides whatever the config sets. | +| `--dry-run` | Resolve + validate the full config, write per-process TOMLs to `/configs/`, and exit without launching. The fastest way to debug a misbehaving config. | + ### What each process does at runtime - **Inference** (vLLM) holds the current policy and serves OpenAI-compatible completions. Receives a new HF checkpoint via `POST /update_weights` after each trainer step (or batched into one update per `max_async_level` steps). -- **Orchestrator** samples a prompt batch from the configured `[[orchestrator.train.env]]` envs, drives them against the inference server (multi-turn, tool calls, etc.), packs the completed rollouts into a binary batch, writes it under `outputs/rollouts/step_N/`, and notifies the trainer. +- **Orchestrator** samples a prompt batch from the configured `[[orchestrator.train.env]]` envs, drives them against the inference server (multi-turn, tool calls, etc.), packs the completed rollouts into a binary batch, writes it under `outputs/rollouts/step_N/`, and notifies the trainer. The orchestrator talks to one **env server** per train/eval env (sidecar `vf.EnvServer` subprocess by default), and each env server holds a pool of **env workers** that run user code concurrently — that's where most rollout-time CPU work lives. - **Trainer** waits for the binary batch, runs forward/backward/optimizer step under FSDP2, writes new weights to the broadcast transport, and signals the orchestrator that step `N+1` is in flight. The orchestrator is the only stateful CPU process; the trainer is GPU-bound; the inference server is stateless apart from KV cache. 
 On restart the orchestrator pushes the latest checkpoint into inference automatically — you don't need to checkpoint inference state.

 ### Key knobs

-These are the knobs you'll touch most often. The full field reference for each lives in [Reference](reference.md).
+The orchestrator owns the data-side knobs that most directly shape what the trainer sees. For trainer-side parallelism, sampling, optimizer, and loss knobs see [Scaling](scaling.md) and [Algorithms](algorithms.md); for the full field reference see [Reference](reference.md).

-| Knob | Where | What it controls |
-|---|---|---|
-| `model.name` | top-level | HF model ID or local path. Auto-fans-out to trainer/orchestrator/inference. |
-| `max_steps` | top-level | Number of trainer steps before exit. |
-| `seq_len` | top-level | Max sequence length per training sample; also enforced by the orchestrator when packing. |
-| `max_async_level` | top-level | How many steps inference can run ahead of the trainer. `1` (default) is fully overlapped; `>1` is more off-policy with potentially higher throughput. See [Algorithms § Async](algorithms.md#async--off-policy-training). |
-| `orchestrator.batch_size` | orchestrator | Prompts per trainer step. |
-| `orchestrator.rollouts_per_example` | orchestrator | Rollouts per prompt (the group size used for advantage normalization). |
-| `orchestrator.train.sampling.max_completion_tokens` | orchestrator | Max tokens per turn at sampling time. |
-| `inference.parallel.tp` / `inference.parallel.dp` | inference | Tensor and data parallelism for the inference server. |
-| `inference.gpu_memory_utilization` | inference | Fraction of GPU memory vLLM may use. Tighten on co-located single-GPU runs. |
-| `trainer.optim.lr` | trainer | Learning rate. Default optimizer is AdamW. |
-| `trainer.loss.type` | trainer | Pick the loss variant (default AIPO vs custom). See [Algorithms § Loss](algorithms.md#loss). |
+| Knob | What it controls |
+|---|---|
+| `orchestrator.batch_size` | Prompts per trainer step. |
+| `orchestrator.rollouts_per_example` | Group size — rollouts generated per prompt. Used for advantage normalization and pass@k estimation. |
+| `orchestrator.training_mode` | Picks the training-mode dispatch: `rl` (default), `opd`, or `sft`. See [Training modes](#training-modes-rl--opd--sft-via-orchestrator). |

 ## SFT training

@@ -113,20 +116,14 @@ If both columns are present, `messages` takes precedence.

 ### Launch

-Single GPU:
+The minimal SFT run trains `Qwen3-0.6B` on the `reverse-text` SFT dataset:

 ```bash
-uv run sft @ configs/.toml --wandb
+uv run sft @ examples/reverse_text/sft.toml --wandb
 ```

 Multi-GPU and multi-node use torchrun under the hood (the `sft` entrypoint manages this for you — see [Scaling § SFT training](scaling.md#sft-training) for non-default layouts).

-A CPU-friendly smoke run with fake data:
-
-```bash
-uv run sft @ configs/debug/sft/train.toml
-```
-
 ### SFT-specific knobs

 | Knob | What it controls |
@@ -139,7 +136,7 @@ uv run sft @ configs/debug/sft/train.toml

 ## Training modes (RL / OPD / SFT-via-orchestrator)

-The RL entrypoint also supports two distillation modes, switched via `orchestrator.training_mode`:
+The RL entrypoint supports three training modes, switched via `orchestrator.training_mode`:

 | Mode | Student | Teacher | Use case |
 |---|---|---|---|
@@ -200,8 +197,6 @@ uv run rl @ rl.toml --ckpt.interval 25 --ckpt.keep-last 3 # rolling window of 3
 uv run rl @ rl.toml --ckpt.interval 25 --ckpt.keep-interval 100 # …plus permanent every 100
 ```

-Common combo for long runs: `--ckpt.interval 50 --ckpt.keep-last 3 --ckpt.keep-interval 500` — rolling 3-checkpoint window for fast recovery, plus a permanent snapshot every 500 steps.
-
 ### Resuming a run

 Re-run the same launch command and pass `--ckpt.resume-step ` (or `-1` for "latest"). Make sure `--max-steps` is at least the target final step, not the remaining delta:
@@ -214,8 +209,6 @@ uv run rl @ rl.toml --max-steps 10 --ckpt
 uv run rl @ rl.toml --max-steps 20 --ckpt.resume-step 10
 ```

-Trainer + orchestrator step counters are kept in lockstep — both rewind to the same resume step. The inference server can stay running across restarts; the orchestrator pushes the resumed weights on reconnect.
-
 ### Saving HF weights for serving

 HF-compatible weight snapshots are written under `/weights/step_N/` whenever a full checkpoint runs (or you can write weights-only via `--ckpt.weights-only` for cheaper snapshots). Upload directly:
@@ -280,25 +273,31 @@ uv run rl @ rl.toml --wandb \
   --no-trainer.wandb.log-extras.distributions
 ```

-### Prometheus and BetterStack
-
-For long-running production training:
+### Platform monitoring

-- **Prometheus**: set `trainer.metrics_server.port` to expose `/metrics` on each trainer process. vLLM also exposes `/metrics` natively — useful for KV-cache saturation and pending-request counts.
-- **BetterStack heartbeats**: set `trainer.heartbeat.url` (and the matching orchestrator field) to ping a heartbeat URL each step. Pair with a BetterStack monitor to page on stalls.
+Register a run on the Prime Intellect platform (Prime Lab) and stream training metrics, samples, and distributions to the platform dashboard. Bare flag uses defaults:

-### Platform monitoring
+```bash
+uv run rl @ rl.toml --orchestrator.prime-monitor
+```

-Internal teams can register runs on the Prime Intellect platform:
+Or set it in TOML:

 ```toml
 [orchestrator.prime_monitor]
 run_name = "my-experiment"
 ```

-This streams training metrics, samples, and distributions to the platform dashboard. Requires `PRIME_API_KEY` (set via `prime login` or env var) and an allowlisted team. Currently internal-only.
+Requires `PRIME_API_KEY` (set via `prime login` or env var) and an allowlisted team. Currently internal-only.
+
+### Prometheus and BetterStack

-## Metrics that matter
+For long-running production training:
+
+- **Prometheus**: set `trainer.metrics_server.port` to expose `/metrics` on each trainer process. vLLM also exposes `/metrics` natively — useful for KV-cache saturation and pending-request counts.
+- **BetterStack heartbeats**: set `trainer.heartbeat.url` (and the matching orchestrator field) to ping a heartbeat URL each step. Pair with a BetterStack monitor to page on stalls.
+
+## Important metrics

 Pulled from the three console logs (and mirrored to W&B):

@@ -327,39 +326,11 @@ Pulled from the three console logs (and mirrored to W&B):
 | orchestrator | `scheduler/async_level`, `scheduler/inflight_rollouts` | current async lag |
 | vLLM | `vllm:gpu_cache_usage_perc` | → 1.0 means KV cache saturated, slow generation |

-Live vLLM stats (Prometheus):
-
-```bash
-curl -s http://localhost:8000/metrics | grep -E "num_requests|gpu_cache_usage"
-```
-
 ## Rules of thumb

-- **Start small.** Run `configs/gsm8k/rl.toml` end-to-end on 2 GPUs before scaling. If GSM8K runs cleanly, your install is good.
-- **Eyeball the reward distribution.** If `reward/all/std` collapses to ~0 within a few steps, the env is too easy or rewards are degenerate — increase difficulty or check the rubric.
-- **Match `inference.parallel.tp` to model layout.** TP > num attention heads / 2 starts losing efficiency. For dense models keep TP small and use DP for throughput. For MoE-heavy models prefer EP.
-- **Set `max_async_level` deliberately.** `1` (default) = pipelined overlap, lowest off-policy drift. `2` absorbs longer weight-broadcast latency (e.g. cross-WAN). Higher values trade more drift for throughput; watch `mismatch_kl/all/mean`.
+- **Start small.** Run `examples/reverse_text/rl.toml` end-to-end on 2 GPUs before scaling. If the smoke run finishes cleanly, your install is good.
+- **Batch size ≥ 64.** Smaller batches give noisy gradient estimates and the trainer's overhead-per-step dominates throughput. 64 is the practical floor; 128–512 is typical for production RL.
+- **Group size ≥ 8.** Bigger groups (`orchestrator.rollouts_per_example`) make it more likely that a prompt produces a mix of high- and low-reward rollouts, which is what gives the trainer a usable signal — if all rollouts in a group succeed or all fail, the within-group advantage collapses to zero and the trainer learns nothing from that prompt. Bigger groups also tighten advantage normalization. 8 is the floor; 16–32 is common.
 - **Pin `output_dir` per run.** Sharing a directory across runs will mix rollouts and break resumes. `--output-dir outputs/` is the simplest discipline.
 - **Use `--dry-run` before SLURM.** Validators (CP needs flash-attention, NCCL broadcast needs `max_async_level=1`, etc.) fail fast in dry-run and slow in queue.
 - **Don't change `optimization_dtype` / `reduce_dtype`.** These are load-bearing — flipping bfloat16/float32 silently changes training dynamics. Stick with defaults unless you know what you're doing.
-
-## Common issues
-
-**`@ path/to/x.toml` fails to load.** Leave a space between `@` and the path — `@ rl.toml`, not `@rl.toml`. If the error mentions Pydantic, your TOML doesn't match the schema; `--dry-run` will pinpoint the offending field.
-
-**API timeouts under load.** Bump file descriptors: `ulimit -n 32000`. Our defaults are already generous, so a real timeout usually means inference is saturated — check `time/generate_completions` and vLLM's `gpu_cache_usage_perc`.
-
-**CUDA OOM in the trainer.** In order, try:
-
-1. Full activation checkpointing: `--model.ac` (the bare flag enables defaults).
-2. Lower `seq_len` or `data.micro_batch_size`.
-3. FSDP CPU offload: `--model.fsdp-cpu-offload` (or `--model.optim-cpu-offload` for optimizer states only).
-4. Context parallelism: `--model.cp 2` (requires flash-attention; see [Scaling § CP](scaling.md#context-parallelism)).
-
-**CUDA OOM in inference.** Tighten `inference.gpu_memory_utilization` (start around 0.85), reduce `inference.model.max_model_len`, or split inference across more GPUs via `inference.parallel.dp`.
-
-**Eval scores frozen but training reward rising.** Likely a chat-template prefix violation eating the model's outputs. Check `orchestrator.renderer` settings (`preserve_all_thinking`, etc.) and use the prime-rl–patched model checkpoint if available.
-
-**Trainer hangs on weight broadcast.** NCCL transport requires `max_async_level=1` and is incompatible with LoRA — the run will fail at config-validate time if either is set. Otherwise check that all trainer ranks survived the previous step (`grep ERROR logs/trainer/torchrun/`).
-
-**Run dies mid-step with no traceback.** Look in `/logs/envs/train//env_worker_*.log` first — most silent kills come from OOM-killed env workers running user code. Set `orchestrator.log.vf_level = "debug"` for more verbose env logging.
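
Note on the **Group size ≥ 8** rule of thumb added in the hunk above: the zero-signal case it describes is easy to check by hand. The sketch below assumes a mean-centered, standard-deviation-scaled group advantage; `group_advantages` and `eps` are illustrative names, and prime-rl's actual advantage computation may differ.

```python
from statistics import mean, pstdev


def group_advantages(rewards, eps=1e-6):
    """Center and scale rewards within one prompt's rollout group.

    Assumed form (not taken from prime-rl): (r - mean) / (std + eps).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# A mixed group: some rollouts pass, some fail, so advantages are non-zero.
mixed = group_advantages([1.0, 0.0, 1.0, 0.0])

# A degenerate group: every rollout passes, so every advantage is zero.
all_pass = group_advantages([1.0, 1.0, 1.0, 1.0])
```

Under this assumed normalization, a prompt whose rollouts all succeed (or all fail) contributes no gradient, which is why larger groups make a mixed outcome, and therefore a usable signal, more likely.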
From 59a0d699fa0215b10dc61003e4437e199ceebf45 Mon Sep 17 00:00:00 2001
From: Mika Senghaas
Date: Sat, 23 May 2026 00:36:07 +0000
Subject: [PATCH 09/66] docs: trim scaling layout table and move multi-node logs to training

- scaling.md: drop the 1-GPU row and the "Production MoE with long
  contexts" row from the "Choosing a layout" table; the disaggregated
  prefill/decode page section is still findable via its own H2
- scaling.md: drop the trailing "Multi-node logs" section (heading + TOC
  entry); the content now lives next to single-node log layout
- training.md: fold the multi-node tree into "Log files" with the
  single-node skip note inlined; add live-tail recipes and the per-rank
  torchrun debug note; mention the tmux helper works on a SLURM head node

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 docs/scaling.md  | 39 ---------------------------------------
 docs/training.md | 29 +++++++++++++++++++----------
 2 files changed, 19 insertions(+), 49 deletions(-)

diff --git a/docs/scaling.md b/docs/scaling.md
index 16a8ec50d7..7374264a56 100644
--- a/docs/scaling.md
+++ b/docs/scaling.md
@@ -29,19 +29,16 @@ This page covers how to scale `prime-rl` from a single GPU to a 1000-GPU cluster
 - [Kubernetes](#kubernetes)
 - [Disaggregated prefill/decode inference](#disaggregated-prefilldecode-inference)
 - [Benchmarking](#benchmarking)
-- [Multi-node logs](#multi-node-logs)

 ## Choosing a layout

 | You have… | Use this layout |
 |---|---|
-| 1 GPU | Single-GPU co-located RL (small model) or SFT-only |
 | 1 node, 2–8 GPUs | `uv run rl` with `--deployment.num-infer-gpus N --deployment.num-train-gpus M` |
 | 1 node, 8 GPUs, large MoE | Custom impl + EP + activation checkpointing |
 | 2+ nodes, SLURM | `[slurm]` + `[deployment]` overlay (recommended) |
 | 2+ nodes, no SLURM | Manual `uv run inference` + `uv run orchestrator` + `uv run torchrun src/.../train.py` |
 | Kubernetes | The bundled Helm chart at `k8s/prime-rl` |
-| Production MoE with long contexts | Disaggregated prefill/decode inference |

 ## Single GPU

@@ -544,39 +541,3 @@ uv run rl @ rl.toml --bench
 ```

 Persist results with `--bench.output-json`. Use this to compare parallelism configs before committing a multi-day run.
-
-## Multi-node logs
-
-Log layout under `/logs/`:
-
-```
-trainer.log  # symlink → trainer/node_0.log
-inference.log  # symlink → inference/node_0.log
-orchestrator.log  # single instance, single file
-trainer/
-  node_*.log  # per-node trainer stdout (rank 0 only)
-  torchrun/  # per-rank stdout/stderr
-inference/
-  node_*.log  # per-node inference stdout
-  router_*.log  # vllm-router per replica
-envs/
-  {train,eval}//
-    env_server.log
-    env_worker_*.log
-```
-
-Live tailing from the head node:
-
-```bash
-tail -F /logs/{trainer,orchestrator,inference}.log
-tail -F /logs/trainer/node_*.log
-tail -F /logs/inference/router_*.log
-```
-
-The tmux helper also works on the head node:
-
-```bash
-bash scripts/tmux.sh my-rl-job /shared/outputs/my-rl-job
-```
-
-For multi-rank trainer debugging, drop into `logs/trainer/torchrun//attempt_0//{stdout,stderr}.log` — verbose and per-rank.
diff --git a/docs/training.md b/docs/training.md
index 14d5e2bacf..0065fc3559 100644
--- a/docs/training.md
+++ b/docs/training.md
@@ -223,22 +223,33 @@ For LoRA runs, set `ckpt.weights.save_adapter_separately = true` to also write t

 ### Log files

-The launcher tees every process's stdout/stderr into `/logs/`:
+The launcher tees every process's stdout/stderr into `/logs/`. The full layout (single-node runs skip the `node_*.log` and `router_*.log` files):

 ```
 /logs/
-├── trainer.log  # rank 0 only
-├── orchestrator.log
-├── inference.log
-├── trainer/torchrun//attempt_0//{stdout,stderr}.log
+├── trainer.log  # rank 0 only; symlink → trainer/node_0.log on multi-node
+├── orchestrator.log  # single instance, single file
+├── inference.log  # symlink → inference/node_0.log on multi-node
+├── trainer/
+│   ├── node_*.log  # per-node trainer stdout (multi-node only)
+│   └── torchrun//attempt_0//{stdout,stderr}.log  # per-rank
+├── inference/
+│   ├── node_*.log  # per-node inference stdout (multi-node only)
+│   └── router_*.log  # vllm-router per replica (multi-node only)
 └── envs/{train,eval}//
     ├── env_server.log
     └── env_worker_.log
 ```

-Multi-node runs add `trainer/node_*.log` and `inference/node_*.log` — `trainer.log` and `inference.log` at the top level symlink to node 0 for convenience. See [Scaling § Multi-node logs](scaling.md#multi-node-logs).
+Env worker logs are the first place to look for env-side errors (most user code lives there). Verbosity is controlled by `orchestrator.log.vf_level`. For multi-rank trainer debugging, drop into `logs/trainer/torchrun//attempt_0//{stdout,stderr}.log` — verbose and per-rank.

-Env worker logs are the first place to look for env-side errors (most user code lives there). Verbosity is controlled by `orchestrator.log.vf_level`.
+Live tailing from a single point (works on the head node for multi-node runs over a shared filesystem):
+
+```bash
+tail -F /logs/{trainer,orchestrator,inference}.log
+tail -F /logs/trainer/node_*.log  # multi-node only
+tail -F /logs/inference/router_*.log  # multi-node only
+```

 ### Console output and the tmux helper

@@ -250,9 +261,7 @@ bash scripts/tmux.sh
 uv run rl @ ... --output-dir outputs/my-run
 ```

-Pass `-s ` and `-o ` to run multiple parallel experiments side-by-side in different sessions.
-
-For multi-node SLURM runs, follow the head-node logs via `tail -f` on the shared filesystem — see [Scaling § SLURM](scaling.md#slurm).
+Pass `-s ` and `-o ` to run multiple parallel experiments side-by-side in different sessions. The helper also works on a SLURM head node — `bash scripts/tmux.sh my-rl-job /shared/outputs/my-rl-job`.

 ### Weights & Biases

From 0afd6b8c71633d7cf0b685df695448d765376737 Mon Sep 17 00:00:00 2001
From: Mika Senghaas
Date: Sat, 23 May 2026 01:06:55 +0000
Subject: [PATCH 10/66] docs(algorithms): add Renderers section, trim renderer prose

- New Renderers section explains why best-effort interleaving works:
  the renderer guarantees the exact-prefix invariant by construction
  via bridge_to_next_turn. Lists the renderer API surface and the
  hand-coded model coverage
- Drop the verifiers trajectories-design-note link from Discontinuous
  trajectories and the --trajectory-strategy branching deprecation
- Drop preserve_all_thinking workaround mentions from algorithms.md and
  faqs.md (reference.md still documents the fields)
- Leave a TODO(blog-post-url) for the PI site writeup

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 docs/algorithms.md | 22 ++++++++++++++++++++--
 docs/faqs.md       |  2 +-
 2 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/docs/algorithms.md b/docs/algorithms.md
index decbaa5e0c..0f1792b212 100644
--- a/docs/algorithms.md
+++ b/docs/algorithms.md
@@ -20,6 +20,7 @@ This page covers the math and the configurable algorithmic components: how off-p
 - [Extension property](#extension-property)
 - [Best-effort interleaving](#best-effort-interleaving)
 - [Discontinuous trajectories](#discontinuous-trajectories)
+- [Renderers](#renderers)

 ## Async / off-policy training

@@ -260,10 +261,27 @@ tok.apply_chat_template(messages, tokenize=False)
 # (the R1 from turn 2 is gone)
 ```

-Workarounds: use a chat template that preserves thinking (we ship patched versions for many models, e.g. `PrimeIntellect/Qwen3-0.6B`), or enable `orchestrator.renderer.preserve_all_thinking = true` so the renderer re-emits past thinking blocks itself.
+Workaround: use a chat template that preserves thinking — we ship patched versions for many models, e.g. `PrimeIntellect/Qwen3-0.6B`.

 ### Discontinuous trajectories

 Some envs are discontinuous by design — e.g. a main agent delegating to a sub-agent and getting back only a summarized result, not the sub-agent's whole conversation. Best-effort interleaving handles this naturally: each agent's contiguous turns merge, the handoff starts a new sample. The trainer never sees fabricated extension where there is none.

-For background on the design, see the verifiers [trajectories design note](https://github.com/PrimeIntellect-ai/verifiers/blob/main/notes/TRAJECTORIES.md). The `--trajectory-strategy branching` option is deprecated — best-effort interleaving covers all cases, falling back to separate samples (equivalent to old branching) when extension breaks.
+## Renderers
+
+Best-effort interleaving only works because the renderer guarantees the exact-prefix invariant *by construction* — it never re-renders prior turns, so it can't lose tokens to chat-template normalization, BPE retokenization drift, or thinking stripping. A renderer turns a model's chat template into a Python object that can:
+
+- `render_ids(messages)` — tokenize messages to ids the inference engine accepts.
+- `parse_response(completion_ids)` — recover structured `(content, reasoning_content, tool_calls)` from sampled ids.
+- `bridge_to_next_turn(prev_prompt_ids, prev_completion_ids, new_messages)` — extend the previous turn's tokens verbatim with the new environment turn, instead of re-rendering history.
+
+When `bridge_to_next_turn` succeeds, the trainer sees the exact token stream the sampler produced; when it can't be proven safe (e.g. the model's renderer is `DefaultRenderer` and the template's stop sequence is unknown), it returns `None` and the orchestrator falls back to a full re-render — which is what triggers the new-sample fallback documented above.
+
+Hand-coded renderers ship for `qwen3`, `qwen3-vl`, `qwen3.5`, `glm5`, `glm4.5`, `minimax-m2`, `deepseek-v3`, `kimi-k2`, `kimi-k2.5`, `nemotron-3`, `gpt-oss`; anything else falls back to `DefaultRenderer` (a generic `apply_chat_template` wrapper). Pick one via:
+
+```toml
+[orchestrator.renderer]
+name = "auto"  # detect from tokenizer; pass an explicit name for fine-tunes
+```
+
+For the full design rationale (failure modes ruled out, empirical token-identity comparison against `apply_chat_template`, when to write a hand-coded renderer), see **TODO(blog-post-url)** — our writeup on the PI site is the canonical reference.
diff --git a/docs/faqs.md b/docs/faqs.md
index f3e362dde9..76f1b2049e 100644
--- a/docs/faqs.md
+++ b/docs/faqs.md
@@ -122,7 +122,7 @@ This talks to any OpenAI-compatible endpoint, so it works against `uv run infere

 ### Why does Qwen3 fail multi-turn SFT silently?

-Qwen3's default chat template strips past `` blocks when re-tokenizing, which violates the prefix property the SFT trainer depends on. Use a model with a patched chat template (e.g. `PrimeIntellect/Qwen3-0.6B`) or set `orchestrator.renderer.preserve_all_thinking = true`. See [Algorithms § Multi-turn trajectories](algorithms.md#multi-turn-trajectories).
+Qwen3's default chat template strips past `` blocks when re-tokenizing, which violates the prefix property the SFT trainer depends on. Use a model with a patched chat template — we ship one at `PrimeIntellect/Qwen3-0.6B`. See [Algorithms § Multi-turn trajectories](algorithms.md#multi-turn-trajectories).

 ### Can I train on `prompt`/`completion` and `messages` mixed in one dataset?
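
The exact-prefix check that best-effort interleaving relies on can be sketched in a few lines. This is a hypothetical illustration, not the `renderers` package API: `merge_turns` is an invented name, and turns are assumed to arrive as `(prompt_ids, completion_ids)` pairs of token-id lists.

```python
def merge_turns(turns):
    """Greedily merge consecutive turns into training samples.

    A turn extends the current sample only if its prompt starts with every
    token the sample already holds (the exact-prefix check). Otherwise the
    current sample is closed and a new one starts.
    """
    samples = []
    current = None
    for prompt_ids, completion_ids in turns:
        extends = current is not None and prompt_ids[:len(current)] == current
        if not extends and current is not None:
            # Prefix broken (e.g. history was re-rendered): close the sample.
            samples.append(current)
        # Either way, the running sample becomes this turn's full token stream.
        current = prompt_ids + completion_ids
    if current is not None:
        samples.append(current)
    return samples


turns = [
    ([1, 2], [3]),        # turn 1
    ([1, 2, 3, 4], [5]),  # turn 2: prompt extends turn 1 verbatim
    ([9], [10]),          # turn 3: unrelated prefix, starts a new sample
]
samples = merge_turns(turns)
```

Here the first two turns merge into one sample and the third falls back to its own, mirroring the new-sample fallback described in the Renderers section of this patch.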
From 9cff8f41cfa755e24d692909545ea57916a6826e Mon Sep 17 00:00:00 2001
From: Mika Senghaas
Date: Sat, 23 May 2026 01:08:50 +0000
Subject: [PATCH 11/66] docs: refresh FAQ + training W&B claim + recommend prime eval
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

faqs.md:
- Drop the "override an env var in TOML" Q&A (matches the configuration
  page where env vars are no longer documented as a generic override)
- Drop the "max_async_level" Q&A; replace with a max_off_policy_steps
  Q&A — the more impactful knob to tune on long agentic rollouts
- Drop the outdated "two W&B runs per RL job" Q&A; default is shared
  now (wandb.shared = true)
- Drop the SFT-section Q&As that referenced preserve_all_thinking or
  were too thin to keep
- Switch the "evaluate without training" recipe from vf-eval to prime
  eval run (the Prime CLI is the recommended entrypoint)

training.md:
- Fix the W&B section to describe the new default: shared single run
  (wandb.shared = true), with the legacy split as opt-out
- Add max_off_policy_steps to the Key knobs table
- Switch the eval example from vf-eval to prime eval run

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 docs/faqs.md     | 40 ++++++++--------------------------------
 docs/training.md | 17 ++++++++---------
 2 files changed, 16 insertions(+), 41 deletions(-)

diff --git a/docs/faqs.md b/docs/faqs.md
index 76f1b2049e..00d847f0ad 100644
--- a/docs/faqs.md
+++ b/docs/faqs.md
@@ -7,7 +7,6 @@ Frequently-asked questions, grouped by topic. For full background see the linked
 - [Getting started](#getting-started)
 - [Configs](#configs)
 - [RL training](#rl-training)
-- [SFT training](#sft-training)
 - [Checkpoints and resume](#checkpoints-and-resume)
 - [Scaling](#scaling)
 - [Memory and OOM](#memory-and-oom)
@@ -60,10 +59,6 @@ uv run rl @ rl.toml --no-trainer.gc # disable garbage collection config

 In TOML, comment out or remove the section.

-### How do I override an env var in TOML?
-
-You can't directly — env vars are a separate source. To force a fixed value, set it in TOML; the precedence order (CLI > TOML > env > defaults) means the TOML wins.
-
 ### How do I add a new environment to my training mix?

 Add another `[[orchestrator.train.env]]` table. Lists are replaced wholesale on overlay, so include the full list every time:
@@ -81,13 +76,9 @@ See [Configuration § Environments](configuration.md#environments-orchestratortr

 ## RL training

-### What does `max_async_level` actually do?
-
-It caps how many steps inference can run ahead of training. `1` (default) is pipelined — inference for step n+1 runs concurrently with trainer step n; off-policy drift is minimal. `2` absorbs slower weight broadcasts (e.g. cross-WAN). Higher values give more throughput at the cost of more drift; watch `mismatch_kl/all/mean`. See [Algorithms § Tuning `max_async_level`](algorithms.md#tuning-max_async_level).
+### What should I tune for off-policy noise on long agentic rollouts?

-### Why are there two W&B runs per RL job?
-
-The trainer and orchestrator log as separate runs so their step indices and timings stay independent. The names are `-trainer` and `-orchestrator`. Group them in W&B if you want a unified view.
+`orchestrator.max_off_policy_steps` (default 8). It caps how many distinct policies are allowed to have contributed to a single rollout — rollouts whose source policy fell more than that many steps behind the trainer get discarded. On long multi-turn rollouts (SWE, browsing, anything where one rollout spans many trainer steps), this is often the most important throughput-vs-noise knob: bump it for higher throughput and accept more off-policy noise; lower it to keep training tighter. Watch the `errored_rollouts` and `mismatch_kl/all/mean` metrics when changing it.

 ### My reward isn't improving. What should I check first?

@@ -101,37 +92,22 @@ In order:

 ### How do I evaluate without training?
-Use `vf-eval`:
+Use `prime eval` (from the [`prime` CLI](https://docs.primeintellect.ai/cli-reference/introduction)) — it defaults to Prime Inference but accepts any OpenAI-compatible endpoint via `--provider vllm --api-base-url ...`. Works against `uv run inference`, hosted endpoints, or a stale checkpoint mid-run.

 ```bash
-uv run vf-eval math-env \
-  -a '{"dataset_name": "openai/gsm8k", "dataset_subset": "main"}' \
-  -m PrimeIntellect/Qwen3-0.6B \
-  -b http://localhost:8000/v1 -n 50 -t 2048
+prime eval run math-env \
+  --env-args '{"dataset_name": "openai/gsm8k", "dataset_subset": "main"}' \
+  --model PrimeIntellect/Qwen3-0.6B \
+  --provider vllm --api-base-url http://localhost:8000/v1 \
+  --num-examples 50 --max-tokens 2048
 ```

-This talks to any OpenAI-compatible endpoint, so it works against `uv run inference`, hosted endpoints, or a stale checkpoint mid-run.
-
 ### What's the difference between `training_mode = "sft"` and the standalone `uv run sft`?

 `uv run sft` is the traditional path: load a HF dataset, train the model. No orchestrator, no teacher.

 `orchestrator.training_mode = "sft"` uses the RL pipeline to hard-distill from a teacher: the teacher (any OpenAI-compatible endpoint) generates the completions, and the student trains on them as they're produced. Use this when you want on-the-fly teacher supervision against a moving student. See [Training § Training modes](training.md#training-modes-rl--opd--sft-via-orchestrator).

-## SFT training
-
-### Why does Qwen3 fail multi-turn SFT silently?
-
-Qwen3's default chat template strips past `` blocks when re-tokenizing, which violates the prefix property the SFT trainer depends on. Use a model with a patched chat template — we ship one at `PrimeIntellect/Qwen3-0.6B`. See [Algorithms § Multi-turn trajectories](algorithms.md#multi-turn-trajectories).
-
-### Can I train on `prompt`/`completion` and `messages` mixed in one dataset?
-
-Yes — if both columns are present in a row, `messages` takes precedence. The trainer will use `messages` for that row and ignore `prompt`/`completion`.
-
-### How do I do tool-calling SFT?
-
-Tool-calling SFT works out of the box if your dataset uses the `messages` format with tool messages embedded. The renderer handles tool turns the same as assistant turns. Make sure your model's chat template supports tool tokens.
-
 ## Checkpoints and resume

 ### How often should I checkpoint?
diff --git a/docs/training.md b/docs/training.md
index 0065fc3559..dbd86fedce 100644
--- a/docs/training.md
+++ b/docs/training.md
@@ -97,6 +97,7 @@ The orchestrator owns the data-side knobs that most directly shape what the trai
 |---|---|
 | `orchestrator.batch_size` | Prompts per trainer step. |
 | `orchestrator.rollouts_per_example` | Group size — rollouts generated per prompt. Used for advantage normalization and pass@k estimation. |
+| `orchestrator.max_off_policy_steps` | How many distinct policies may have contributed to one rollout before it gets discarded (default 8). The main throughput-vs-noise dial on long agentic rollouts — bump for throughput, lower for tighter on-policyness. Watch `errored_rollouts` and `mismatch_kl/all/mean` when tuning. |
 | `orchestrator.training_mode` | Picks the training-mode dispatch: `rl` (default), `opd`, or `sft`. See [Training modes](#training-modes-rl--opd--sft-via-orchestrator). |

 ## SFT training
@@ -163,18 +164,16 @@ name = "gsm8k-eval"
 args = { dataset_name = "openai/gsm8k", dataset_subset = "main", split = "test" }
 ```

-Eval scores land in the trainer logs as `eval/{env}/{avg@k,pass@k}` and in W&B under the same keys. For one-off evaluations outside of training, use `vf-eval`:
+Eval scores land in the trainer logs as `eval/{env}/{avg@k,pass@k}` and in W&B under the same keys. For one-off evaluations outside of training, use `prime eval` (from the [`prime` CLI](https://docs.primeintellect.ai/cli-reference/introduction)) — it defaults to Prime Inference but talks to any OpenAI-compatible endpoint via `--provider vllm --api-base-url ...`:

 ```bash
-uv run vf-eval math-env \
-  -a '{"dataset_name": "openai/gsm8k", "dataset_subset": "main"}' \
-  -m PrimeIntellect/Qwen3-0.6B \
-  -b http://localhost:8000/v1 \
-  -n 50 -t 2048
+prime eval run math-env \
+  --env-args '{"dataset_name": "openai/gsm8k", "dataset_subset": "main"}' \
+  --model PrimeIntellect/Qwen3-0.6B \
+  --provider vllm --api-base-url http://localhost:8000/v1 \
+  --num-examples 50 --max-tokens 2048
 ```

-`vf-eval` talks to any OpenAI-compatible endpoint, so it works against `uv run inference`, hosted endpoints, or a stale checkpoint mid-run.
-
 ## Checkpointing

 Checkpointing is split across processes because the orchestrator and trainer can be on different machines and on different steps at any given time. Inference is stateless.
@@ -272,7 +271,7 @@ uv run rl @ rl.toml --wandb # default project, ran
 uv run rl @ rl.toml --wandb.project my-proj --wandb.name run-42
 ```

-For RL runs the trainer and orchestrator log as **two separate runs** with the same name: `-trainer` and `-orchestrator`. You'll usually want both grouped in a W&B group.
+By default (`wandb.shared = true`) the trainer and orchestrator log into a **single shared W&B run**, so all metrics from both processes land in one place. Set `wandb.shared = false` (or pass `--no-wandb.shared`) to fall back to the legacy split — two runs suffixed `-trainer` and `-orchestrator`. Shared mode requires the W&B SDK ≥ 0.19.9 and is incompatible with `wandb.offline = true`.

 By default, every 10 steps each process also logs a sample of prompts/completions (with rewards and advantages) and reward/advantage/entropy distributions as W&B tables. Tune via `--wandb.log-extras.interval` and `--wandb.log-extras.sample-ratio`, or disable subsets:

From b305b861eed69d7f3bcb517d0832f1d29efb4f40 Mon Sep 17 00:00:00 2001
From: Mika Senghaas
Date: Sat, 23 May 2026 01:09:09 +0000
Subject: [PATCH 12/66] docs(faqs): drop the CP <= 8 recommendation

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 docs/faqs.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/faqs.md b/docs/faqs.md
index 00d847f0ad..39b60d256f 100644
--- a/docs/faqs.md
+++ b/docs/faqs.md
@@ -159,7 +159,7 @@ NCCL broadcast is much faster than filesystem for local-cluster setups, at the c
 - **TP (inference)**: scale within a node, up to `num_attention_heads / 2`. Past that, returns diminish.
 - **DP (inference and trainer)**: scale throughput linearly across replicas. Default scaling lever.
 - **EP (trainer, MoE only)**: shards expert weights; the right knob for MoE memory and throughput together.
-- **CP (trainer)**: shards a sequence across GPUs along the token axis. Needed for sequences past ~32K tokens. Stick to CP ≤ 8.
+- **CP (trainer)**: shards a sequence across GPUs along the token axis. Needed for sequences past ~32K tokens.

 See [Scaling § Parallelism knobs](scaling.md#parallelism-knobs).

From 5b9898d2addc8fb767a46b5674e11298439ced40 Mon Sep 17 00:00:00 2001
From: Mika Senghaas
Date: Sat, 23 May 2026 01:09:41 +0000
Subject: [PATCH 13/66] docs(faqs): drop the vLLM log-quieting and KV-cache pressure Q&As

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 docs/faqs.md | 12 ------------
 1 file changed, 12 deletions(-)

diff --git a/docs/faqs.md b/docs/faqs.md
index 39b60d256f..1718527f07 100644
--- a/docs/faqs.md
+++ b/docs/faqs.md
@@ -199,18 +199,6 @@ uv run rl @ rl.toml --orchestrator.log.vf-level debug

 Or set `PRIME_VF_LOG_LEVEL=debug` in the environment.

-### vLLM is logging too much. Can I quiet it?
-
-Set `inference.log.level = "warning"` (or pass `--inference.log.level warning`). Note that `inference.log` only controls the prime-rl logger; vLLM's own logging is controlled by `VLLM_LOGGING_LEVEL` env var.
-
-### What's the fastest way to see KV cache pressure?
-
-```bash
-curl -s http://localhost:8000/metrics | grep gpu_cache_usage_perc
-```
-
-Approaching 1.0 means KV cache is saturated and request latency will spike. Reduce `max_model_len` or split across more inference GPUs.
-
 ## Models and environments

 ### Which models have a custom optimized implementation?
From dc40f710853995743dac969c53d75a7175bcb613 Mon Sep 17 00:00:00 2001
From: Mika Senghaas
Date: Sat, 23 May 2026 01:09:58 +0000
Subject: [PATCH 14/66] docs(faqs): drop the Environments Hub install Q&A

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 docs/faqs.md | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/docs/faqs.md b/docs/faqs.md
index 1718527f07..6aa2ffcbf6 100644
--- a/docs/faqs.md
+++ b/docs/faqs.md
@@ -211,15 +211,6 @@ Other HF causal LMs work via the HF path (`impl = "hf"` or `"auto"`) but without

 Yes — Qwen3-VL, Qwen3.5, Qwen3.5-MoE out of the box. Add `[model.vlm]` and use bfloat16 dtypes. See [Advanced § Vision-language models](advanced.md#vision-language-models).

-### How do I install an environment from the Environments Hub?
-
-```bash
-prime env install primeintellect/math-env
-uv run python -c "import math_env" # verify
-```
-
-Then reference by ID in your config. See [Advanced § Environments](advanced.md#environments).
-
 ### Can I install an environment from outside the Hub?

 Yes — install with `uv pip install -e path/to/my-env` and reference it by its `id` (the env's package name). The orchestrator will discover it.
From 47671be64246a85fc8fbae56a5bf5ccd2a18bad9 Mon Sep 17 00:00:00 2001
From: Mika Senghaas
Date: Sat, 23 May 2026 01:10:22 +0000
Subject: [PATCH 15/66] docs(faqs): drop the Models and environments section

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 docs/faqs.md | 21 ---------------------
 1 file changed, 21 deletions(-)

diff --git a/docs/faqs.md b/docs/faqs.md
index 6aa2ffcbf6..730c772d27 100644
--- a/docs/faqs.md
+++ b/docs/faqs.md
@@ -11,7 +11,6 @@ Frequently-asked questions, grouped by topic. For full background see the linked
 - [Scaling](#scaling)
 - [Memory and OOM](#memory-and-oom)
 - [Observability](#observability)
-- [Models and environments](#models-and-environments)

 ## Getting started

@@ -198,23 +197,3 @@ uv run rl @ rl.toml --orchestrator.log.vf-level debug
 ```

 Or set `PRIME_VF_LOG_LEVEL=debug` in the environment.
-
-## Models and environments
-
-### Which models have a custom optimized implementation?
-
-GLM-5, Qwen3 MoE, Qwen3.5 MoE, Qwen3 / Qwen3.5 VLMs, Poolside Laguna, MiniMax M2, Nemotron H, Trinity (AFMoE), GLM-4 / GLM-4.5 / INTELLECT-3, GPT-OSS (HF-MoE only). See the table in [Advanced § MoE models](advanced.md#moe-models).
-
-Other HF causal LMs work via the HF path (`impl = "hf"` or `"auto"`) but without EP, FP8, or the custom kernels.
-
-### Can I train a VLM?
-
-Yes — Qwen3-VL, Qwen3.5, Qwen3.5-MoE out of the box. Add `[model.vlm]` and use bfloat16 dtypes. See [Advanced § Vision-language models](advanced.md#vision-language-models).
-
-### Can I install an environment from outside the Hub?
-
-Yes — install with `uv pip install -e path/to/my-env` and reference it by its `id` (the env's package name). The orchestrator will discover it.
-
-### My environment hangs occasionally. What's happening?
-
-Most likely it's running user code that blocks on a network call or an external service (e.g. a math verifier, a sandbox). Check the env worker logs and the event-loop lag metrics on the env server. The orchestrator's `max_retries` and `errored_rollouts` metric should tell you how often rollouts fail vs hang.
From 1bb335042827478d2ee5bb85db3645802e815d71 Mon Sep 17 00:00:00 2001
From: Mika Senghaas
Date: Mon, 25 May 2026 20:53:00 +0000
Subject: [PATCH 16/66] docs(overview): second tightening pass

- Inference bullet now leads with the local default (token-in
  /v1/generate via renderers; OpenAI-compatible routes called out as
  the external-client path) and adds DP/TP/EP with deepep + flashinfer
  all-to-all backends + EPLB, P/D disaggregation behind vllm-router,
  CPU KV-cache offload, and router replay (FP8 MoE numerical-parity
  feature). Weight broadcast is filesystem or NCCL.
- Orchestrator bullet now leads with "owns the data plane across many
  verifiers training and eval environments" plus the per-env isolated
  subprocess + variable-size env-worker pool.
- Trainer bullet drops "torchrun-launched" and surfaces the custom
  modeling code as the enabler for advanced trainer parallelism (EP
  with DeepEP, CP for long sequences).
- Drop the [AIPO] link in the async paragraph (off-policy-aware PG + KL
  regularizer, no paper handle); also drop the "AIPO loss" mention from
  the Algorithms blurb in "Where to go next" so the page is internally
  consistent.
- Quick-run command is now bare: uv run rl @
  examples/reverse_text/rl.toml (no --wandb.* / --ckpt).
- Drop the trailing scaling pointer (Scaling is already linked in
  "Where to go next").

Co-authored-by: Cursor
---
 docs/overview.md | 17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/docs/overview.md b/docs/overview.md
index 4bf20735ed..2781fcc3c7 100644
--- a/docs/overview.md
+++ b/docs/overview.md
@@ -8,13 +8,13 @@ A `prime-rl` RL run is three cooperating processes:

 ![Architecture](assets/architecture.png)

-- **Inference** — A vLLM-backed server (or fleet) that holds the current policy and serves OpenAI-compatible completions. Scales from a single co-located GPU to multi-node fleets with tensor + data parallelism, FP8 inference, and prefill/decode disaggregation for high-throughput long-context serving. Updated in place via a custom `update_weights` endpoint, with NCCL or filesystem transports.
-- **Orchestrator** — A lightweight CPU process that samples prompts from one or more [verifiers](https://github.com/PrimeIntellect-ai/verifiers) environments, drives multi-turn rollouts against the inference fleet (tool use, browsers, sandboxes, long horizons) without re-tokenizing across turns, computes advantages, packs the rollouts into training batches, and relays new weights back to inference.
-- **Trainer** — A torchrun-launched FSDP2 process group that consumes packed rollouts and steps the optimizer. For MoE families we ship optimized custom modeling code with expert parallelism (EP) — including DeepEP kernels — and context parallelism (CP) for long-sequence training. Plus selective activation checkpointing, FP8 training on Hopper+, LoRA, and a multi-run manager that hosts many concurrent adapters in one trainer process.
+- **Inference** — vLLM-backed server (or fleet) holding the current policy. The orchestrator drives rollouts through the token-in `/v1/generate` route via the [`renderers`](https://github.com/PrimeIntellect-ai/renderers) package (OpenAI-compatible chat/completions routes are also exposed for external clients). Supports data + tensor + expert parallelism (with `deepep` and `flashinfer` all-to-all backends and EPLB), FP8 inference, prefill/decode disaggregation behind a `vllm-router`, CPU KV-cache offload, and *router replay* (the routed-expert mask is returned to the trainer for FP8 MoE numerical parity). Weights are pushed in place through a custom `update_weights` endpoint over filesystem or NCCL transports.
+- **Orchestrator** — Lightweight CPU process that owns the data plane across many [verifiers](https://github.com/PrimeIntellect-ai/verifiers) training and eval environments. Each env runs in an isolated subprocess with a variable-size pool of env workers for scalability. The orchestrator drives multi-turn rollouts against the inference fleet (tool use, browsers, sandboxes, long horizons) without re-tokenizing across turns, computes advantages, packs the rollouts into training batches, and relays new weights from trainer to inference. +- **Trainer** — FSDP2 process group that consumes packed rollouts and steps the optimizer. We ship optimized custom modeling code for many MoE / dense / VLM families that unlocks advanced trainer parallelism — expert parallelism (EP, with DeepEP kernels) and context parallelism (CP) for long-sequence training — plus selective activation checkpointing, FP8 training on Hopper+, LoRA, and a multi-run manager that hosts many concurrent adapters in one trainer process. The three processes communicate through configurable transports — by default the trainer↔orchestrator rollout link uses the local filesystem, and weight broadcast uses the filesystem (or NCCL for synchronous setups). Swap to ZMQ for multi-host setups without shared storage. See [Scaling](scaling.md) for the deployment options. -Training is **asynchronous by default**: inference is allowed to run ahead of training by up to `max_async_level` steps, which hides the weight-broadcast latency behind ongoing rollouts. The loss is an off-policy-aware variant of [AIPO](https://arxiv.org/abs/2505.24034); see [Algorithms](algorithms.md) for the details. +Training is **asynchronous by default**: inference is allowed to run ahead of training by up to `max_async_level` steps, which hides the weight-broadcast latency behind ongoing rollouts. The loss combines an off-policy-aware policy gradient with a KL regularizer; see [Algorithms](algorithms.md) for the details. 
## Installation @@ -31,22 +31,17 @@ You need at least one NVIDIA GPU (RTX 3090/4090/5090, A100, H100, H200, or B200) Train an SFT-warmed `Qwen3-0.6B` on the `reverse-text` task — the env is bundled with the `verifiers` submodule so no separate install is needed. This config ships in the repo and runs on two GPUs (one for inference, one for the trainer): ```bash -uv run rl @ examples/reverse_text/rl.toml \ - --wandb.project your-project \ - --wandb.name reverse-text-smoke \ - --ckpt +uv run rl @ examples/reverse_text/rl.toml ``` The `rl` entrypoint reads `examples/reverse_text/rl.toml`, splits it into per-process sub-configs, picks GPU 0 for inference and GPU 1 for the trainer, launches all three processes, and tees their stdout into `outputs/logs/{trainer,orchestrator,inference}.log`. Within a minute the trainer should log `step 1` and a reward sample; after 20 steps the run completes and final HF-compatible weights land at `outputs/weights/step_20`. -For multi-GPU, multi-node, SLURM, and Kubernetes layouts, see [Scaling](scaling.md). - ## Where to go next - **[Configuration](configuration.md)** — How TOML files, `@` composition, CLI overrides, and env vars combine; the precedence rules; worked examples. - **[Training](training.md)** — End-to-end recipes for RL, SFT, and evals; checkpointing and resume; observability (logs, W&B, Prometheus, platform monitoring); rules of thumb and common issues. - **[Scaling](scaling.md)** — Single-GPU through 1000+ GPU; FSDP / EP / CP knobs; SLURM and Kubernetes guides; disaggregated prefill/decode inference; benchmarking. -- **[Algorithms](algorithms.md)** — Async / off-policy semantics; the AIPO loss; built-in and custom losses, advantages, and filters; multi-turn trajectory merging. +- **[Algorithms](algorithms.md)** — Async / off-policy semantics; the default loss; built-in and custom losses, advantages, and filters; multi-turn trajectory merging. 
- **[Advanced](advanced.md)** — MoE training (EP backends, custom impls); VLMs; LoRA and the multi-run manager; small-scale MoE testing; environments deep-dive. - **[Reference](reference.md)** — Auto-generated field-by-field reference for every entrypoint config. - **[FAQs](faqs.md)** — Quick answers to recurring questions. From 9ec03bb392f9269b60001ef021b93d7bd6b32e62 Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Mon, 25 May 2026 20:53:13 +0000 Subject: [PATCH 17/66] docs(configuration): second tightening pass MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Drop the entrypoint-splitting paragraph ([trainer] / [orchestrator] / [inference] table lifting); covered elsewhere. - Rename "TOML files and composition" -> "TOML composition", and "Special syntax" -> "Syntax". - Open "Sources and precedence" by naming the three sources (Pydantic defaults, TOML files, CLI flags) up front, then layering them. - Drop the "(-- is a kebab-case marker)" parenthetical from CLI overrides; turn the snake/kebab note into a callout. - Drop the --help / --dry-run convenience-flag block and the "--dry-run is the single most useful debugging tool" prose; the bash example is enough. - Reorder Syntax subsections to mirror the pydantic-config README: Booleans -> Lists -> Dicts -> Optional sub-configs -> None -> Discriminated unions -> Environments. None moves down and is cross-linked from "disabling an optional sub-config". - Booleans example swapped from --ckpt (which is itself an optional sub-config) to --clean-output-dir (a real bool = False field), showing both --flag and --no-flag forms. - Lists / Dicts now show TOML and CLI on the *same* field name so the mapping is obvious (target_modules for lists, env.0.args for dicts), and add the "lists are replaced wholesale" overlay note + "dicts deep-merge across sources" detail. 
- Add a callout on validation aliases (rollouts_per_example still works after the rename to group_size) — only material gap vs the pydantic-config README that's relevant to end users. - Worked example: --dry-run is now the final flag. - Drop the Conventions section. Co-authored-by: Cursor --- docs/configuration.md | 98 ++++++++++++++++++++++--------------------- 1 file changed, 50 insertions(+), 48 deletions(-) diff --git a/docs/configuration.md b/docs/configuration.md index f0374f870e..70e0ab9e09 100644 --- a/docs/configuration.md +++ b/docs/configuration.md @@ -1,33 +1,32 @@ # Configuration -Every `prime-rl` entrypoint is configured by the same system: TOML files for reproducible base configs and CLI flags for one-off overrides. Under the hood it is [`pydantic-config`](https://github.com/PrimeIntellect-ai/pydantic-config) wrapping our Pydantic config models. +Every `prime-rl` entrypoint uses [`pydantic-config`](https://github.com/PrimeIntellect-ai/pydantic-config): TOML files for reproducible base configs, CLI flags for one-off overrides. ## Table of Contents - [Sources and precedence](#sources-and-precedence) -- [TOML files and composition](#toml-files-and-composition) +- [TOML composition](#toml-composition) - [CLI overrides](#cli-overrides) - [Inspecting and validating](#inspecting-and-validating) -- [Special syntax](#special-syntax) +- [Syntax](#syntax) - [Booleans](#booleans) - - [None](#none) - [Lists](#lists) - [Dicts](#dicts) - [Optional sub-configs](#optional-sub-configs) + - [None](#none) - [Discriminated unions](#discriminated-unions) - [Environments (`[[orchestrator.train.env]]`)](#environments-orchestratortrainenv) - [Worked example](#worked-example) -- [Conventions](#conventions) ## Sources and precedence -For a single field, sources are applied in this order — later sources win: +Field values come from three sources — Pydantic defaults, TOML files (passed with `@`), and CLI flags. They're layered in this order, with later sources winning: -1. 
**Defaults** — declared on the Pydantic model. -2. **TOML files** — passed with `@`, left to right (later files override earlier ones). -3. **CLI flags** — dotted, kebab-case (`--model.name`). +1. **Defaults** declared on the Pydantic model. +2. **TOML files** passed with `@`, left to right — later files override earlier ones. +3. **CLI flags** in dotted, kebab-case form (`--model.name`). -## TOML files and composition +## TOML composition The `@` token introduces a TOML file. Multiple `@` arguments compose left-to-right, deep-merged — unset fields in an overlay keep the base value: @@ -40,11 +39,9 @@ uv run rl @ base.toml --trainer @ trainer.toml # mixed > Mind the space: `@ path/to/x.toml`, not `@path/to/x.toml`. -The composed `rl` entrypoint splits its config across three processes — `[trainer]`, `[orchestrator]`, and `[inference]` tables become the sub-configs for each. Shared knobs (`model.name`, `output_dir`, `wandb.*`, …) live at the top level and are fanned out automatically. Stand-alone entrypoints (`uv run trainer`, `uv run orchestrator`, …) skip this lifting — their TOMLs have no `[trainer]` table because the whole file _is_ the trainer. - ## CLI overrides -CLI flags mirror the TOML tree using dots, with kebab-case for field names (the leading `--` is a kebab-case marker; TOML stays snake_case): +CLI flags mirror the TOML tree using dots: ```bash --max-steps 50 # top-level @@ -53,12 +50,9 @@ CLI flags mirror the TOML tree using dots, with kebab-case for field names (the --inference.parallel.tp 4 ``` -Field names in TOML use snake_case (`max_model_len`); the same field on the CLI is kebab-case (`--max-model-len`). +> Field names are snake_case in TOML (`max_model_len`) and kebab-case on the CLI (`--max-model-len`). -Two convenience flags every entrypoint accepts: - -- `--help` — prints the full schema (all fields, defaults, types, descriptions). -- `--dry-run` — resolves the full config, writes it to `/configs/`, and exits without launching anything. 
Use to debug composition. _Available on `rl`, `sft`, and `inference`; not on the standalone `trainer` or `orchestrator` entrypoints._ +> Renamed fields keep their old name as a validation alias — e.g. `rollouts_per_example` is still accepted in TOML and CLI after being renamed to `group_size`. Mixing the two names across sources is safe. ## Inspecting and validating @@ -67,45 +61,49 @@ uv run rl --help # full schema uv run rl @ rl.toml --dry-run --output-dir /tmp/check # write resolved configs ``` -`--dry-run` is the single most useful debugging tool: it runs every Pydantic validator (catching incompatibilities like CP requiring flash-attention, or NCCL weight broadcast requiring `max_async_level=1`) and dumps the fully merged config to disk. If a run misbehaves in mysterious ways, dry-run it first and inspect `/configs/`. - -When a validator fails, the error names the conflicting fields — fix one and re-run dry-run until clean. - -## Special syntax +## Syntax ### Booleans -CLI uses paired flags: `--ckpt` enables, `--no-ckpt` disables. TOML must be explicit: +CLI uses paired flags: bare `--flag` sets `True`, `--no-flag` sets `False`. TOML must be explicit: -```toml -ckpt = true +```bash +uv run rl @ rl.toml --clean-output-dir # True +uv run rl @ rl.toml --no-clean-output-dir # False ``` -### None - -TOML has no `null`. Use the string `"None"`, which the loader coerces: - ```toml -[inference.model] -max_model_len = "None" +clean_output_dir = true ``` -On the CLI: `--inference.model.max-model-len None`. - ### Lists -TOML uses arrays of tables (see [Environments](#environments-orchestratortrainenv) below for the canonical example). Overlays **replace** lists wholesale, so an overlay that only wants to add an env still has to include the full list. On the CLI, index by position: +CLI accepts space-separated values or a JSON literal. TOML uses an array literal. 
Both forms target the same field: ```bash ---orchestrator.train.env.0.id math-env --orchestrator.train.env.1.id reverse-text +uv run rl @ rl.toml --trainer.model.lora.target-modules q_proj k_proj v_proj +uv run rl @ rl.toml --trainer.model.lora.target-modules '["q_proj", "k_proj", "v_proj"]' +``` + +```toml +[trainer.model.lora] +target_modules = ["q_proj", "k_proj", "v_proj"] ``` +Overlay TOMLs **replace** lists wholesale — an overlay that wants to add one item must still spell out the full list. For arrays of tables (e.g. environments), see [Environments](#environments-orchestratortrainenv). + ### Dicts -Use a TOML table or inline-table syntax. On the CLI, pass a JSON literal: +CLI takes a JSON literal. TOML uses a table or inline-table. CLI dicts deep-merge with TOML dicts — CLI keys win on conflict but don't wipe the file's keys: ```bash ---orchestrator.train.env.0.args '{"dataset_name": "openai/gsm8k", "dataset_subset": "main"}' +uv run rl @ rl.toml --orchestrator.train.env.0.args \ + '{"dataset_name": "openai/gsm8k", "dataset_subset": "main"}' +``` + +```toml +[[orchestrator.train.env]] +args = { dataset_name = "openai/gsm8k", dataset_subset = "main" } ``` ### Optional sub-configs @@ -115,7 +113,18 @@ Many sub-configs are typed `SomeConfig | None`. Two patterns enable them: - **Bare flag with defaults**: `--model.compile` or, in TOML, an empty section `[model.compile]`. The sub-config materializes with all-default values. - **Enable and set fields together**: `--model.compile.fullgraph` (CLI) or any populated `[model.compile]` table (TOML). -This is how `[ckpt]`, `[model.lora]`, `[model.compile]`, `[trainer.wandb]`, etc. are turned on. +To **disable** a sub-config that's on by default, use `--no-` on the CLI or assign the string `"None"` in TOML (see [None](#none)). This is how `[ckpt]`, `[model.lora]`, `[model.compile]`, `[trainer.wandb]`, etc. are turned on and off. + +### None + +TOML has no `null`. 
Use the string `"None"`, which the loader coerces: + +```toml +[inference.model] +max_model_len = "None" +``` + +On the CLI: `--inference.model.max-model-len None`. ### Discriminated unions @@ -160,8 +169,8 @@ Start from a shipped base config, override two fields on the CLI, and dry-run: uv run rl @ configs/gsm8k/rl.toml \ --wandb.name my-experiment \ --trainer.optim.lr 5e-6 \ - --dry-run \ - --output-dir /tmp/gsm8k-dry + --output-dir /tmp/gsm8k-dry \ + --dry-run ``` Then inspect the resolved config: @@ -173,11 +182,4 @@ ls /tmp/gsm8k-dry/configs/ Each per-process TOML reflects the final, validated configuration that the actual run would consume — exactly what each process sees when started standalone (`uv run trainer @ /tmp/gsm8k-dry/configs/trainer.toml`, etc.). This is the easiest way to bisect a misbehaving config: dry-run a known-good base, dry-run your overlay, diff the two. -## Conventions - -- **Reproducible base, mutable overlays.** Commit base TOMLs alongside example dirs (`configs//rl.toml`). Override on the CLI for one-shot experiments; promote overrides to a new TOML when they stabilize. -- **One W&B name per run.** Pass `--wandb.name ` on every launch. The orchestrator and trainer share the W&B run, so the same name surfaces all metrics together. -- **Always pin `output_dir`.** Per-run output directories prevent rollout files from one run leaking into another's training step. Use `--output-dir outputs/` or pin in TOML. -- **Dry-run before scaling.** A multi-node SLURM job that crashes on a config validator wastes a queue slot. Always `--dry-run` first. - For the full set of fields, defaults, types, and constraints accepted by each entrypoint, jump to [Reference](reference.md). 
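The layering rules described in the `docs/configuration.md` changes above (defaults, then TOML files left to right, then CLI flags, with dicts deep-merged and lists replaced wholesale) can be sketched in a few lines of Python. This is only an illustration of the documented behaviour under stated assumptions, not the actual `pydantic-config` implementation: `deep_merge`, `resolve`, and the sample values are hypothetical names introduced here.

```python
from copy import deepcopy


def deep_merge(base: dict, overlay: dict) -> dict:
    """Return a new dict in which overlay values win over base values.

    Nested dicts are merged key by key; any other value (including a list)
    replaces the base value wholesale.
    """
    merged = deepcopy(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def resolve(defaults: dict, toml_layers: list, cli_overrides: dict) -> dict:
    """Apply sources in precedence order: defaults < TOML layers < CLI."""
    config = deepcopy(defaults)
    for layer in toml_layers:  # left to right, later files win
        config = deep_merge(config, layer)
    return deep_merge(config, cli_overrides)


# Hypothetical values, chosen only to exercise each rule.
defaults = {
    "max_steps": 100,
    "model": {"name": "base-model", "lora": {"target_modules": ["q_proj"]}},
}
base_toml = {"max_steps": 20, "model": {"name": "Qwen3-0.6B"}}
overlay_toml = {"model": {"lora": {"target_modules": ["k_proj", "v_proj"]}}}
cli = {"max_steps": 50}

resolved = resolve(defaults, [base_toml, overlay_toml], cli)
print(resolved)
```

In this sketch the CLI value for `max_steps` wins over both TOML and the default, `model.name` keeps the base TOML value because the overlay leaves it unset, and the overlay's `target_modules` list replaces the default list rather than extending it.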
From 6f73062ed5435ccae90f2975b123beb8a1bca4a2 Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Mon, 25 May 2026 21:25:18 +0000 Subject: [PATCH 18/66] docs(training): second tightening pass MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Rename "RL training" -> "RL trainer" and "SFT training" -> "SFT trainer" (and update the page intro accordingly). - Entrypoints table: clarify that `uv run rl` wraps the trainer, orchestrator, and inference server in one launch — runs locally for single-node experiments and submits to SLURM for single- or multi-node when [slurm] is set. Drop the trailing "rl is a convenience wrapper" paragraph. - RL trainer Launch: minimal command is now bare `uv run rl @ examples/reverse_text/rl.toml` (no flags). Drop the GPU-placement paragraph + multi-GPU example (covered in scaling.md). - Replace "Useful CLI flags" + "Key knobs" + "What each process does at runtime" with one consolidated "Useful knobs" section split into three sub-tables: data-and-algorithm, monitoring, run management. - Add training environments ([[orchestrator.train.env]] for multi-env training) - Add eval environments ([[orchestrator.eval.env]] + orchestrator.eval.interval) - Add monitoring entries: orchestrator.log.vf_level, --wandb, --orchestrator.prime-monitor - Move Training modes from a top-level section into RL trainer as a subsection (it's RL-entrypoint-specific). - Drop the standalone Evaluations section — eval syntax is covered in configuration.md and the eval-knobs row in Useful knobs links to `prime eval` for one-off evals. - Drop the optimization_dtype / reduce_dtype rule of thumb. 
Co-authored-by: Cursor --- docs/training.md | 147 ++++++++++++++++------------------------------- 1 file changed, 50 insertions(+), 97 deletions(-) diff --git a/docs/training.md b/docs/training.md index 42972a391c..a7b46ee54e 100644 --- a/docs/training.md +++ b/docs/training.md @@ -1,21 +1,18 @@ # Training -This page covers everything you need to launch, observe, checkpoint, and recover a `prime-rl` training run — RL, SFT, and the related on-policy distillation mode. For multi-node and cluster layouts, see [Scaling](scaling.md). For the loss math and algorithm knobs, see [Algorithms](algorithms.md). +This page covers everything you need to launch, observe, checkpoint, and recover a `prime-rl` training run — the RL trainer, the SFT trainer, and the related on-policy distillation mode. For multi-node and cluster layouts, see [Scaling](scaling.md). For the loss math and algorithm knobs, see [Algorithms](algorithms.md). ## Table of Contents - [Entrypoints](#entrypoints) -- [RL training](#rl-training) +- [RL trainer](#rl-trainer) - [Launch](#launch) - - [Useful CLI flags](#useful-cli-flags) - - [What each process does at runtime](#what-each-process-does-at-runtime) - - [Key knobs](#key-knobs) -- [SFT training](#sft-training) + - [Useful knobs](#useful-knobs) + - [Training modes (RL / OPD / SFT-via-orchestrator)](#training-modes-rl--opd--sft-via-orchestrator) +- [SFT trainer](#sft-trainer) - [Dataset format](#dataset-format) - [Launch](#launch-1) - [SFT-specific knobs](#sft-specific-knobs) -- [Training modes (RL / OPD / SFT-via-orchestrator)](#training-modes-rl--opd--sft-via-orchestrator) -- [Evaluations](#evaluations) - [Checkpointing](#checkpointing) - [Enabling checkpoints](#enabling-checkpoints) - [Resuming a run](#resuming-a-run) @@ -33,74 +30,77 @@ This page covers everything you need to launch, observe, checkpoint, and recover | Command | Purpose | Notes | |---|---|---| -| `uv run rl` | Co-launches inference + orchestrator + trainer on one node | The default for 
any single-node RL run. Mirrors a `[trainer]` + `[orchestrator]` + `[inference]` TOML. | -| `uv run sft` | Supervised fine-tuning on a HF dataset | Launches torchrun internally; never call torchrun directly. | -| `uv run inference` | vLLM server | Always use this entrypoint over `vllm serve` — it adds `/update_weights`, `/load_lora_adapter`, and `/init_broadcaster`. | -| `uv run trainer` | Standalone trainer process group | Use only when launching the trainer separately from the orchestrator (e.g. multi-node RL without the `rl` wrapper). | -| `uv run orchestrator` | Standalone orchestrator process | Pair with a separately-launched trainer + inference. | +| `uv run rl` | Wraps the trainer, orchestrator, and inference server in one launch from a merged TOML. | The default for any RL run. Runs locally for single-node experiments; submits to SLURM for single- or multi-node when `[slurm]` is set (see [Scaling § SLURM](scaling.md#slurm)). | +| `uv run sft` | Supervised fine-tuning on a HF dataset. | Launches torchrun internally; never call torchrun directly. | +| `uv run inference` | vLLM server. | Always use this entrypoint over `vllm serve` — it adds `/update_weights`, `/load_lora_adapter`, and `/init_broadcaster`. | +| `uv run trainer` | Standalone trainer process group. | Use only when launching the trainer separately from the orchestrator (e.g. multi-node RL without the `rl` wrapper). | +| `uv run orchestrator` | Standalone orchestrator process. | Pair with a separately-launched trainer + inference. | -`rl` is a convenience wrapper — it parses one merged TOML, splits it across `[trainer]` / `[orchestrator]` / `[inference]` tables, picks GPUs, sets up logging, and spawns the three children. Standalone entrypoints exist for the multi-node case where each process lives on a different host. 
- -## RL training +## RL trainer ### Launch -The minimal RL run trains an SFT-warmed `Qwen3-0.6B` on the `reverse-text` task — the env is bundled with the `verifiers` submodule, so nothing else needs to be installed. From the project root, on two GPUs (one for inference, one for the trainer): +The minimal RL run trains an SFT-warmed `Qwen3-0.6B` on the `reverse-text` task — the env is bundled with the `verifiers` submodule, so nothing else needs to be installed: ```bash -uv run rl @ examples/reverse_text/rl.toml \ - --wandb.project my-project \ - --wandb.name reverse-text-smoke \ - --ckpt +uv run rl @ examples/reverse_text/rl.toml ``` -GPU placement: by default `rl` uses 1 trainer GPU and 1 inference GPU on the local node. To run on (say) 8 GPUs with 4 inference + 4 trainer, set the deployment counts: +### Useful knobs -```bash -uv run rl @ rl.toml \ - --deployment.num-infer-gpus 4 \ - --deployment.num-train-gpus 4 \ - --inference.parallel.dp 4 -``` +A condensed view of the knobs you'll most often tune. For trainer-side parallelism, sampling, optimizer, and loss knobs see [Scaling](scaling.md) and [Algorithms](algorithms.md); for the full field reference see [Reference](reference.md). + +**Data and algorithm:** -The launcher assigns physical GPUs from `CUDA_VISIBLE_DEVICES` (or all visible GPUs if unset) — inference takes the first `num_infer_gpus`, the trainer takes the next `num_train_gpus`, and any teacher gets the remainder. To run on a specific subset of physical GPUs, pin `CUDA_VISIBLE_DEVICES` before launching. +| Knob | What it does | +|---|---| +| `orchestrator.batch_size` | Prompts per trainer step. | +| `orchestrator.group_size` | Rollouts generated per prompt. Used for advantage normalization and pass@k estimation. | +| `orchestrator.max_off_policy_steps` | How many distinct policies may have contributed to one rollout before it's discarded (default 8). 
The main throughput-vs-noise dial on long agentic rollouts — bump for throughput, lower for tighter on-policyness. Watch `errored_rollouts` and `mismatch_kl/all/mean` when tuning. | +| `orchestrator.training_mode` | `rl` (default), `opd`, or `sft`. See [Training modes](#training-modes-rl--opd--sft-via-orchestrator). | +| `[[orchestrator.train.env]]` | Training environments. List multiple tables for multi-env training; weight them via `ratio`. See [Configuration § Environments](configuration.md#environments-orchestratortrainenv). | +| `[[orchestrator.eval.env]]` + `orchestrator.eval.interval` | Eval environments and cadence (default every 100 steps). Scores land in trainer logs and W&B as `eval/{env}/{avg@k,pass@k}`. For one-off evaluations outside training, use [`prime eval`](https://docs.primeintellect.ai/cli-reference/introduction). | -For multi-node and SLURM, see [Scaling § RL training](scaling.md#rl-training). +**Monitoring:** -### Useful CLI flags +| Knob | What it does | +|---|---| +| `orchestrator.log.vf_level` | Env-worker / verifiers log level (`info` default; `debug` is noisy but useful for env debugging). | +| `--wandb` (+ `--wandb.project`, `--wandb.name`) | Enable Weights & Biases logging. See [Weights & Biases](#weights--biases). | +| `--orchestrator.prime-monitor` | Stream metrics to the Prime Intellect platform (Prime Lab). See [Platform monitoring](#platform-monitoring). | -Commonly-used flags every RL launch should know about: +**Run management:** -| Flag | What it does | +| Knob | What it does | |---|---| | `--ckpt` | Enable end-of-training checkpoint. See [Checkpointing](#checkpointing) for interval / keep-last / resume variants. | -| `--wandb` | Enable Weights & Biases logging with defaults. Pair with `--wandb.project` / `--wandb.name`. | -| `--orchestrator.prime-monitor` | Register the run on the Prime Intellect platform (Lab) and stream metrics there. See [Platform monitoring](#platform-monitoring). 
| | `--clean-output-dir` | Wipe `` before starting. Useful when re-running an experiment with the same name during iteration. | | `--output-dir outputs/` | Per-run output directory. Always set this when running more than one experiment in parallel. | -| `--max-steps N` | Stop after `N` trainer steps. Overrides whatever the config sets. | +| `--max-steps N` | Stop after `N` trainer steps. Overrides the config value. | | `--dry-run` | Resolve + validate the full config, write per-process TOMLs to `/configs/`, and exit without launching. The fastest way to debug a misbehaving config. | -### What each process does at runtime +### Training modes (RL / OPD / SFT-via-orchestrator) + +The RL entrypoint supports three training modes, switched via `orchestrator.training_mode`: -- **Inference** (vLLM) holds the current policy and serves OpenAI-compatible completions. Receives a new HF checkpoint via `POST /update_weights` after each trainer step (or batched into one update per `max_async_level` steps). -- **Orchestrator** samples a prompt batch from the configured `[[orchestrator.train.env]]` envs, drives them against the inference server (multi-turn, tool calls, etc.), packs the completed rollouts into a binary batch, writes it under `outputs/rollouts/step_N/`, and notifies the trainer. The orchestrator talks to one **env server** per train/eval env (sidecar `vf.EnvServer` subprocess by default), and each env server holds a pool of **env workers** that run user code concurrently — that's where most rollout-time CPU work lives. -- **Trainer** waits for the binary batch, runs forward/backward/optimizer step under FSDP2, writes new weights to the broadcast transport, and signals the orchestrator that step `N+1` is in flight. 
+| Mode | Student | Teacher | Use case | +|---|---|---|---| +| `rl` | Required | Forbidden | Standard RL | +| `opd` | Required | Required, must be vLLM (needs `prompt_logprobs`) | [On-policy distillation](https://thinkingmachines.ai/blog/on-policy-distillation/): student generates rollouts, trainer minimizes KL to teacher logprobs | +| `sft` | Required | Required, any OpenAI-compatible endpoint | Hard-distill: teacher generates rollouts, student trains on them | -The orchestrator is the only stateful CPU process; the trainer is GPU-bound; the inference server is stateless apart from KV cache. On restart the orchestrator pushes the latest checkpoint into inference automatically — you don't need to checkpoint inference state. +The `rl` entrypoint only manages student-policy inference. For OPD and (local-vLLM) SFT, start the teacher inference server manually and point `[orchestrator.teacher.client]` at it: -### Key knobs +```bash +CUDA_VISIBLE_DEVICES=1 uv run inference \ + --model.name --server.port 8001 +``` -The orchestrator owns the data-side knobs that most directly shape what the trainer sees. For trainer-side parallelism, sampling, optimizer, and loss knobs see [Scaling](scaling.md) and [Algorithms](algorithms.md); for the full field reference see [Reference](reference.md). +Debug configs for all variants ship under [`configs/debug/training_modes/`](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/configs/debug/training_modes). -| Knob | What it controls | -|---|---| -| `orchestrator.batch_size` | Prompts per trainer step. | -| `orchestrator.group_size` | Rollouts generated per prompt. Used for advantage normalization and pass@k estimation. | -| `orchestrator.max_off_policy_steps` | How many distinct policies may have contributed to one rollout before it gets discarded (default 8). The main throughput-vs-noise dial on long agentic rollouts — bump for throughput, lower for tighter on-policyness. 
Watch `errored_rollouts` and `mismatch_kl/all/mean` when tuning. |
-| `orchestrator.training_mode` | Picks the training-mode dispatch: `rl` (default), `opd`, or `sft`. See [Training modes](#training-modes-rl--opd--sft-via-orchestrator). |
+The standalone `uv run sft` entrypoint is the more traditional SFT path — pure dataset-based, no teacher, no orchestrator. Use `orchestrator.training_mode = "sft"` only when you want a teacher to generate the supervision on the fly.

-## SFT training
+## SFT trainer

 `uv run sft` runs supervised fine-tuning from a HF dataset. It shares model loaders, FSDP setup, checkpointing, and the chat-template plumbing with the RL trainer, so a typical workflow is _SFT → RL → SFT → …_ without any reformatting.

@@ -135,52 +135,6 @@ Multi-GPU and multi-node use torchrun under the hood (the `sft` entrypoint manag
 | `loss_mask.*` | Which roles contribute to loss; see [Reference § `sft.data.loss_mask`](reference.md#sft-data) |
 | `val.interval` | Run validation every N steps; `val.data` mirrors `data` |

-## Training modes (RL / OPD / SFT-via-orchestrator)
-
-The RL entrypoint supports three training modes, switched via `orchestrator.training_mode`:
-
-| Mode | Student | Teacher | Use case |
-|---|---|---|---|
-| `rl` | Required | Forbidden | Standard RL |
-| `opd` | Required | Required, must be vLLM (needs `prompt_logprobs`) | [On-policy distillation](https://thinkingmachines.ai/blog/on-policy-distillation/): student generates rollouts, trainer minimizes KL to teacher logprobs |
-| `sft` | Required | Required, any OpenAI-compatible endpoint | Hard-distill: teacher generates rollouts, student trains on them |
-
-The `rl` entrypoint only manages student-policy inference.
For OPD and (local-vLLM) SFT, start the teacher inference server manually and point `[orchestrator.teacher.client]` at it:
-
-```bash
-CUDA_VISIBLE_DEVICES=1 uv run inference \
-  --model.name --server.port 8001
-```
-
-Debug configs for all variants ship under [`configs/debug/training_modes/`](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/configs/debug/training_modes).
-
-The standalone `uv run sft` entrypoint is the more traditional SFT path — pure dataset-based, no teacher, no orchestrator. Use `orchestrator.training_mode = "sft"` only when you want a teacher to generate the supervision on the fly.
-
-## Evaluations
-
-Evals run inside the orchestrator on a separate set of envs declared under `[[orchestrator.eval.env]]`:
-
-```toml
-[orchestrator.eval]
-interval = 25 # evaluate every 25 trainer steps
-group_size = 4
-
-[[orchestrator.eval.env]]
-id = "math-env"
-name = "gsm8k-eval"
-args = { dataset_name = "openai/gsm8k", dataset_subset = "main", split = "test" }
-```
-
-Eval scores land in the trainer logs as `eval/{env}/{avg@k,pass@k}` and in W&B under the same keys. For one-off evaluations outside of training, use `prime eval` (from the [`prime` CLI](https://docs.primeintellect.ai/cli-reference/introduction)) — it defaults to Prime Inference but talks to any OpenAI-compatible endpoint via `--provider vllm --api-base-url ...`:
-
-```bash
-prime eval run math-env \
-  --env-args '{"dataset_name": "openai/gsm8k", "dataset_subset": "main"}' \
-  --model PrimeIntellect/Qwen3-0.6B \
-  --provider vllm --api-base-url http://localhost:8000/v1 \
-  --num-examples 50 --max-tokens 2048
-```
-
 ## Checkpointing

 Checkpointing is split across processes because the orchestrator and trainer can be on different machines and on different steps at any given time. Inference is stateless.
@@ -348,4 +302,3 @@ Pulled from the three console logs (and mirrored to W&B):
 - **Group size ≥ 8.** Bigger groups (`orchestrator.group_size`) make it more likely that a prompt produces a mix of high- and low-reward rollouts, which is what gives the trainer a usable signal — if all rollouts in a group succeed or all fail, the within-group advantage collapses to zero and the trainer learns nothing from that prompt. Bigger groups also tighten advantage normalization. 8 is the floor; 16–32 is common.
 - **Pin `output_dir` per run.** Sharing a directory across runs will mix rollouts and break resumes. `--output-dir outputs/` is the simplest discipline.
 - **Use `--dry-run` before SLURM.** Validators (CP needs flash-attention, NCCL broadcast needs `max_async_level=1`, etc.) fail fast in dry-run and slow in queue.
-- **Don't change `optimization_dtype` / `reduce_dtype`.** These are load-bearing — flipping bfloat16/float32 silently changes training dynamics. Stick with defaults unless you know what you're doing.

From aa912b5a5fb53a6311f59e8f95f9c0f49cb4f335 Mon Sep 17 00:00:00 2001
From: Mika Senghaas
Date: Mon, 25 May 2026 21:35:15 +0000
Subject: [PATCH 19/66] docs(training): SFT trainer metrics + tools column + minor renames

- Drop "-via-orchestrator" from the Training modes heading and the internal/cross-doc anchors. The mode value is just `sft` and the short title reads cleaner.
- Drop "and the tmux helper" from the Console output subsection title; the tmux helper is still documented in the section body.
- Important metrics is now split into RL trainer and SFT trainer subsections so the SFT-only metrics (loss/mean, val/loss, progress/{epoch,num_samples,num_tokens}, optim/zero_grad_ratio, per-subset mixing ratios, MoE max_vio + routing_confidence, perf/peak_memory + the time/* breakdown) are documented.
- SFT Dataset format gains a Tool definitions paragraph: rows can carry a `tools` column (OAI function-calling format) or `tool_defs` (verifiers rollout format), as either a list of dicts or a JSON-encoded string. `tool_defs` is auto-converted to OAI shape before being passed into the chat template's `tools=...` argument. `chat_template_kwargs` rows pass through verbatim.

Co-authored-by: Cursor
---
 docs/faqs.md | 2 +-
 docs/training.md | 41 +++++++++++++++++++++++++++++++++++------
 2 files changed, 36 insertions(+), 7 deletions(-)

diff --git a/docs/faqs.md b/docs/faqs.md
index 730c772d27..25d4a36b8c 100644
--- a/docs/faqs.md
+++ b/docs/faqs.md
@@ -105,7 +105,7 @@ prime eval run math-env \

 `uv run sft` is the traditional path: load a HF dataset, train the model. No orchestrator, no teacher.

-`orchestrator.training_mode = "sft"` uses the RL pipeline to hard-distill from a teacher: the teacher (any OpenAI-compatible endpoint) generates the completions, and the student trains on them as they're produced. Use this when you want on-the-fly teacher supervision against a moving student. See [Training § Training modes](training.md#training-modes-rl--opd--sft-via-orchestrator).
+`orchestrator.training_mode = "sft"` uses the RL pipeline to hard-distill from a teacher: the teacher (any OpenAI-compatible endpoint) generates the completions, and the student trains on them as they're produced. Use this when you want on-the-fly teacher supervision against a moving student. See [Training § Training modes](training.md#training-modes-rl--opd--sft).
 ## Checkpoints and resume

diff --git a/docs/training.md b/docs/training.md
index a7b46ee54e..51fca88374 100644
--- a/docs/training.md
+++ b/docs/training.md
@@ -8,7 +8,7 @@ This page covers everything you need to launch, observe, checkpoint, and recover
 - [RL trainer](#rl-trainer)
   - [Launch](#launch)
   - [Useful knobs](#useful-knobs)
-  - [Training modes (RL / OPD / SFT-via-orchestrator)](#training-modes-rl--opd--sft-via-orchestrator)
+  - [Training modes (RL / OPD / SFT)](#training-modes-rl--opd--sft)
 - [SFT trainer](#sft-trainer)
   - [Dataset format](#dataset-format)
   - [Launch](#launch-1)
@@ -19,7 +19,7 @@ This page covers everything you need to launch, observe, checkpoint, and recover
   - [Saving HF weights for serving](#saving-hf-weights-for-serving)
 - [Observability](#observability)
   - [Log files](#log-files)
-  - [Console output and the tmux helper](#console-output-and-the-tmux-helper)
+  - [Console output](#console-output)
   - [Weights & Biases](#weights--biases)
   - [Platform monitoring](#platform-monitoring)
   - [Prometheus and BetterStack](#prometheus-and-betterstack)
@@ -57,7 +57,7 @@ A condensed view of the knobs you'll most often tune. For trainer-side paralleli
 | `orchestrator.batch_size` | Prompts per trainer step. |
 | `orchestrator.group_size` | Rollouts generated per prompt. Used for advantage normalization and pass@k estimation. |
 | `orchestrator.max_off_policy_steps` | How many distinct policies may have contributed to one rollout before it's discarded (default 8). The main throughput-vs-noise dial on long agentic rollouts — bump for throughput, lower for tighter on-policyness. Watch `errored_rollouts` and `mismatch_kl/all/mean` when tuning. |
-| `orchestrator.training_mode` | `rl` (default), `opd`, or `sft`. See [Training modes](#training-modes-rl--opd--sft-via-orchestrator). |
+| `orchestrator.training_mode` | `rl` (default), `opd`, or `sft`. See [Training modes](#training-modes-rl--opd--sft). |
 | `[[orchestrator.train.env]]` | Training environments.
List multiple tables for multi-env training; weight them via `ratio`. See [Configuration § Environments](configuration.md#environments-orchestratortrainenv). |
 | `[[orchestrator.eval.env]]` + `orchestrator.eval.interval` | Eval environments and cadence (default every 100 steps). Scores land in trainer logs and W&B as `eval/{env}/{avg@k,pass@k}`. For one-off evaluations outside training, use [`prime eval`](https://docs.primeintellect.ai/cli-reference/introduction). |

@@ -79,7 +79,7 @@ A condensed view of the knobs you'll most often tune. For trainer-side paralleli
 | `--max-steps N` | Stop after `N` trainer steps. Overrides the config value. |
 | `--dry-run` | Resolve + validate the full config, write per-process TOMLs to `/configs/`, and exit without launching. The fastest way to debug a misbehaving config. |

-### Training modes (RL / OPD / SFT-via-orchestrator)
+### Training modes (RL / OPD / SFT)

 The RL entrypoint supports three training modes, switched via `orchestrator.training_mode`:

@@ -113,6 +113,8 @@ Two accepted layouts:

 If both columns are present, `messages` takes precedence.

+**Tool definitions.** For tool-use SFT, add a `tools` column (OpenAI function-calling format) or `tool_defs` (verifiers rollout format). Each row's value can be either a list of dicts or a JSON-encoded string of a list — both are accepted, and `tool_defs` rows are auto-converted to OAI shape before being passed into the chat template's `tools=...` argument. The `chat_template_kwargs` column, if present, is forwarded verbatim into `apply_chat_template`.
+
 **Chat-template prefix property.** Multi-turn SFT requires that tokenizing the first _k_ turns of a conversation be a strict prefix of tokenizing all _n ≥ k_ turns. Qwen3's default template _violates_ this (it strips past `` blocks), so use either the prime-rl–patched checkpoints (e.g. `PrimeIntellect/Qwen3-0.6B`) or a custom chat template that preserves thinking.
See [Algorithms § Multi-turn trajectories](algorithms.md#multi-turn-trajectories).

 ### Launch
@@ -211,7 +213,7 @@ tail -F /logs/trainer/node_*.log # multi-node only
 tail -F /logs/inference/router_*.log # multi-node only
 ```

-### Console output and the tmux helper
+### Console output

 `scripts/tmux.sh` opens a 4-pane tmux session that follows `trainer.log`, `orchestrator.log`, `inference.log`, and the union of env worker logs. Start it before launching:

@@ -268,7 +270,9 @@ For long-running production training:

 ## Important metrics

-Pulled from the three console logs (and mirrored to W&B):
+Pulled from the console logs and mirrored to W&B.
+
+### RL trainer

 **Progress** (orchestrator):

@@ -295,6 +299,31 @@ Pulled from the three console logs (and mirrored to W&B):
 | orchestrator | `scheduler/async_level`, `scheduler/inflight_rollouts` | current async lag |
 | vLLM | `vllm:gpu_cache_usage_perc` | → 1.0 means KV cache saturated, slow generation |

+### SFT trainer
+
+**Progress and loss:**
+
+- `loss/mean` — main signal. Should decrease through the run.
+- `loss/nan_count` — non-zero is a red flag; check LR and dtype.
+- `val/loss` — validation loss when `[val]` is set, logged every `val.interval` steps.
+- `progress/epoch`, `progress/num_samples`, `progress/num_tokens` — dataset progress.
+- `progress//ratio_{samples,tokens}` — when training on multiple HF subsets/splits, the realized mixing ratio.
+
+**Stability and optimization:**
+
+- `optim/grad_norm` — spikes precede divergence.
+- `optim/lr`, `optim/zero_grad_ratio` — LR schedule and the fraction of params that received zero gradients (high → dead path or wrong loss masking).
+- For MoE: `max_vio/mean` (load-balancing violation), `routing_confidence/mean` — both are logged when non-zero.
+
+**Performance:**
+
+| Metric | Reading |
+|---|---|
+| `perf/throughput`, `perf/throughput_per_gpu` | tokens/s overall and per GPU |
+| `perf/mfu` | MFU |
+| `perf/peak_memory` | peak GPU memory (GiB) |
+| `time/step`, `time/forward_backward`, `time/save_ckpt` | step breakdown |
+
 ## Rules of thumb

 - **Start small.** Run `examples/reverse_text/rl.toml` end-to-end on 2 GPUs before scaling. If the smoke run finishes cleanly, your install is good.

From 232791ee5e381acbfcc73fb543205d152a3a136d Mon Sep 17 00:00:00 2001
From: Mika Senghaas
Date: Mon, 25 May 2026 21:42:29 +0000
Subject: [PATCH 20/66] docs(training): rename 'Saving HF weights for serving' -> 'Serving checkpoints'

Co-authored-by: Cursor
---
 docs/training.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/training.md b/docs/training.md
index 51fca88374..a8698a9a44 100644
--- a/docs/training.md
+++ b/docs/training.md
@@ -16,7 +16,7 @@ This page covers everything you need to launch, observe, checkpoint, and recover
 - [Checkpointing](#checkpointing)
   - [Enabling checkpoints](#enabling-checkpoints)
   - [Resuming a run](#resuming-a-run)
-  - [Saving HF weights for serving](#saving-hf-weights-for-serving)
+  - [Serving checkpoints](#serving-checkpoints)
 - [Observability](#observability)
   - [Log files](#log-files)
   - [Console output](#console-output)
@@ -171,7 +171,7 @@ uv run rl @ rl.toml --max-steps 10 --ckpt
 uv run rl @ rl.toml --max-steps 20 --ckpt.resume-step 10
 ```

-### Saving HF weights for serving
+### Serving checkpoints

 HF-compatible weight snapshots are written under `/weights/step_N/` whenever a full checkpoint runs (or you can write weights-only via `--ckpt.weights-only` for cheaper snapshots).
Upload directly:

From 059a2c9988a6efd1b9521c154a5b2c32d29fe6e4 Mon Sep 17 00:00:00 2001
From: Mika Senghaas
Date: Mon, 25 May 2026 21:43:46 +0000
Subject: [PATCH 21/66] docs: cross-link the agent skills from training + configuration

Adds a callout under the intro of training.md and configuration.md pointing at the equivalent skill files for AI agents working in this repo:

- training.md -> skills/training/SKILL.md (top-level routing)
  + skills/training/start-run/SKILL.md (launch details)
  + skills/training/monitor-run/SKILL.md (check-in / restart).
- configuration.md -> skills/configs/SKILL.md.

The skills aren't part of the published Mintlify nav, so the links go to GitHub blob URLs.

Co-authored-by: Cursor
---
 docs/configuration.md | 2 ++
 docs/training.md | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/docs/configuration.md b/docs/configuration.md
index 70e0ab9e09..2eb4f05240 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -2,6 +2,8 @@

 Every `prime-rl` entrypoint uses [`pydantic-config`](https://github.com/PrimeIntellect-ai/pydantic-config): TOML files for reproducible base configs, CLI flags for one-off overrides.

+> **AI agents working in this repo:** the equivalent runbook is at [`skills/configs/SKILL.md`](https://github.com/PrimeIntellect-ai/prime-rl/blob/main/skills/configs/SKILL.md), with extra runtime hints (where config classes live, validator conventions, the trainer-side `token_export` flag) that aren't surfaced here.
+
 ## Table of Contents

 - [Sources and precedence](#sources-and-precedence)
diff --git a/docs/training.md b/docs/training.md
index a8698a9a44..0bebb382e9 100644
--- a/docs/training.md
+++ b/docs/training.md
@@ -2,6 +2,8 @@

 This page covers everything you need to launch, observe, checkpoint, and recover a `prime-rl` training run — the RL trainer, the SFT trainer, and the related on-policy distillation mode. For multi-node and cluster layouts, see [Scaling](scaling.md).
For the loss math and algorithm knobs, see [Algorithms](algorithms.md).

+> **AI agents working in this repo:** the equivalent runbooks are at [`skills/training/`](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/skills/training) — top-level routing in [`skills/training/SKILL.md`](https://github.com/PrimeIntellect-ai/prime-rl/blob/main/skills/training/SKILL.md), launch details in [`skills/training/start-run/SKILL.md`](https://github.com/PrimeIntellect-ai/prime-rl/blob/main/skills/training/start-run/SKILL.md), and check-in / restart procedures in [`skills/training/monitor-run/SKILL.md`](https://github.com/PrimeIntellect-ai/prime-rl/blob/main/skills/training/monitor-run/SKILL.md).
+
 ## Table of Contents

 - [Entrypoints](#entrypoints)

From 86bd5cdfaa1be562b534d4846d33d02314dabcab Mon Sep 17 00:00:00 2001
From: Mika Senghaas
Date: Mon, 25 May 2026 21:45:50 +0000
Subject: [PATCH 22/66] docs(training): fold Important metrics into RL + SFT trainer sections
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The standalone "## Important metrics" section is gone. Each trainer subsection now ends with its own "### Important metrics" covering only the metrics relevant to that flow:

- RL trainer / Important metrics: reward + rollout signals from the orchestrator, mismatch_kl + entropy + grad_norm from the trainer, and the trainer/orchestrator/vLLM performance grid.
- SFT trainer / Important metrics: loss/mean, val/loss, progress counters, optim signals, MoE max_vio + routing_confidence, and the perf/{throughput,mfu,peak_memory} + time/* breakdown.

TOC updated to point at #important-metrics (RL) and #important-metrics-1 (SFT) — Mintlify de-duplicates with the same -N suffix scheme it already uses for the two Launch subsections.

Co-authored-by: Cursor
---
 docs/training.md | 115 ++++++++++++++++++++++++-----------------------
 1 file changed, 58 insertions(+), 57 deletions(-)

diff --git a/docs/training.md b/docs/training.md
index 0bebb382e9..3c4483531b 100644
--- a/docs/training.md
+++ b/docs/training.md
@@ -11,10 +11,12 @@ This page covers everything you need to launch, observe, checkpoint, and recover
   - [Launch](#launch)
   - [Useful knobs](#useful-knobs)
   - [Training modes (RL / OPD / SFT)](#training-modes-rl--opd--sft)
+  - [Important metrics](#important-metrics)
 - [SFT trainer](#sft-trainer)
   - [Dataset format](#dataset-format)
   - [Launch](#launch-1)
   - [SFT-specific knobs](#sft-specific-knobs)
+  - [Important metrics](#important-metrics-1)
 - [Checkpointing](#checkpointing)
   - [Enabling checkpoints](#enabling-checkpoints)
   - [Resuming a run](#resuming-a-run)
@@ -25,7 +27,6 @@ This page covers everything you need to launch, observe, checkpoint, and recover
   - [Weights & Biases](#weights--biases)
   - [Platform monitoring](#platform-monitoring)
   - [Prometheus and BetterStack](#prometheus-and-betterstack)
-- [Important metrics](#important-metrics)
 - [Rules of thumb](#rules-of-thumb)

 ## Entrypoints
@@ -102,6 +103,35 @@ Debug configs for all variants ship under [`configs/debug/training_modes/`](http

 The standalone `uv run sft` entrypoint is the more traditional SFT path — pure dataset-based, no teacher, no orchestrator. Use `orchestrator.training_mode = "sft"` only when you want a teacher to generate the supervision on the fly.

+### Important metrics
+
+Pulled from the console logs and mirrored to W&B.
+
+**Progress** (orchestrator):
+
+- `reward/{all,env}/mean` — main signal. Should trend upward over hundreds of steps.
+- `seq_len/{all,env}/mean` and `is_truncated/{all,env}/mean` — rollout length and truncation rate.
+- `num_turns/{all,env}/mean` — for multi-turn envs.
+- `empty_rollouts/{all,env}`, `errored_rollouts/{all,env}` — non-zero is fine in small numbers; sustained > 5% is a smell.
+- `eval/{env}/{avg@k,pass@k}` — eval scores when `[orchestrator.eval]` is set.
+
+**Stability** (trainer):
+
+- `mismatch_kl/{all,env}/{mean,std,max}` — KL between trainer's current policy and the (older) inference policy that generated the rollouts. A sustained, growing mean is the early-warning sign for off-policy collapse.
+- `entropy/{all,env}/mean` — too low means mode-collapse; too high means the model isn't committing.
+- `masked_advantage_{positive,negative}/mean` — fraction of DPPO-masked tokens, split by sign.
+- `optim/grad_norm` — spikes precede divergence; check the loss config or lower the LR.
+
+**Performance** (trainer + orchestrator step independently):
+
+| Source | Metric | Reading |
+|---|---|---|
+| trainer | `time/wait_for_batch` | **high → orchestrator bottleneck** |
+| orchestrator | `time/wait_for_ckpt` | **high → trainer bottleneck** |
+| trainer | `perf/throughput`, `perf/mfu` | tokens/s and MFU |
+| orchestrator | `scheduler/async_level`, `scheduler/inflight_rollouts` | current async lag |
+| vLLM | `vllm:gpu_cache_usage_perc` | → 1.0 means KV cache saturated, slow generation |
+
 ## SFT trainer

 `uv run sft` runs supervised fine-tuning from a HF dataset. It shares model loaders, FSDP setup, checkpointing, and the chat-template plumbing with the RL trainer, so a typical workflow is _SFT → RL → SFT → …_ without any reformatting.
@@ -139,6 +169,33 @@ Multi-GPU and multi-node use torchrun under the hood (the `sft` entrypoint manag
 | `loss_mask.*` | Which roles contribute to loss; see [Reference § `sft.data.loss_mask`](reference.md#sft-data) |
 | `val.interval` | Run validation every N steps; `val.data` mirrors `data` |

+### Important metrics
+
+Pulled from the console log and mirrored to W&B.
+
+**Progress and loss:**
+
+- `loss/mean` — main signal. Should decrease through the run.
+- `loss/nan_count` — non-zero is a red flag; check LR and dtype.
+- `val/loss` — validation loss when `[val]` is set, logged every `val.interval` steps.
+- `progress/epoch`, `progress/num_samples`, `progress/num_tokens` — dataset progress.
+- `progress//ratio_{samples,tokens}` — when training on multiple HF subsets/splits, the realized mixing ratio.
+
+**Stability and optimization:**
+
+- `optim/grad_norm` — spikes precede divergence.
+- `optim/lr`, `optim/zero_grad_ratio` — LR schedule and the fraction of params that received zero gradients (high → dead path or wrong loss masking).
+- For MoE: `max_vio/mean` (load-balancing violation), `routing_confidence/mean` — both are logged when non-zero.
+
+**Performance:**
+
+| Metric | Reading |
+|---|---|
+| `perf/throughput`, `perf/throughput_per_gpu` | tokens/s overall and per GPU |
+| `perf/mfu` | MFU |
+| `perf/peak_memory` | peak GPU memory (GiB) |
+| `time/step`, `time/forward_backward`, `time/save_ckpt` | step breakdown |
+
 ## Checkpointing

 Checkpointing is split across processes because the orchestrator and trainer can be on different machines and on different steps at any given time. Inference is stateless.
@@ -270,62 +327,6 @@ For long-running production training:
 - **Prometheus**: set `trainer.metrics_server.port` to expose `/metrics` on each trainer process. vLLM also exposes `/metrics` natively — useful for KV-cache saturation and pending-request counts.
 - **BetterStack heartbeats**: set `trainer.heartbeat.url` (and the matching orchestrator field) to ping a heartbeat URL each step. Pair with a BetterStack monitor to page on stalls.

-## Important metrics
-
-Pulled from the console logs and mirrored to W&B.
-
-### RL trainer
-
-**Progress** (orchestrator):
-
-- `reward/{all,env}/mean` — main signal. Should trend upward over hundreds of steps.
-- `seq_len/{all,env}/mean` and `is_truncated/{all,env}/mean` — rollout length and truncation rate.
-- `num_turns/{all,env}/mean` — for multi-turn envs.
-- `empty_rollouts/{all,env}`, `errored_rollouts/{all,env}` — non-zero is fine in small numbers; sustained > 5% is a smell.
-- `eval/{env}/{avg@k,pass@k}` — eval scores when `[orchestrator.eval]` is set.
-
-**Stability** (trainer):
-
-- `mismatch_kl/{all,env}/{mean,std,max}` — KL between trainer's current policy and the (older) inference policy that generated the rollouts. A sustained, growing mean is the early-warning sign for off-policy collapse.
-- `entropy/{all,env}/mean` — too low means mode-collapse; too high means the model isn't committing.
-- `masked_advantage_{positive,negative}/mean` — fraction of DPPO-masked tokens, split by sign.
-- `optim/grad_norm` — spikes precede divergence; check the loss config or lower the LR.
-
-**Performance** (trainer + orchestrator step independently):
-
-| Source | Metric | Reading |
-|---|---|---|
-| trainer | `time/wait_for_batch` | **high → orchestrator bottleneck** |
-| orchestrator | `time/wait_for_ckpt` | **high → trainer bottleneck** |
-| trainer | `perf/throughput`, `perf/mfu` | tokens/s and MFU |
-| orchestrator | `scheduler/async_level`, `scheduler/inflight_rollouts` | current async lag |
-| vLLM | `vllm:gpu_cache_usage_perc` | → 1.0 means KV cache saturated, slow generation |
-
-### SFT trainer
-
-**Progress and loss:**
-
-- `loss/mean` — main signal. Should decrease through the run.
-- `loss/nan_count` — non-zero is a red flag; check LR and dtype.
-- `val/loss` — validation loss when `[val]` is set, logged every `val.interval` steps.
-- `progress/epoch`, `progress/num_samples`, `progress/num_tokens` — dataset progress.
-- `progress//ratio_{samples,tokens}` — when training on multiple HF subsets/splits, the realized mixing ratio.
-
-**Stability and optimization:**
-
-- `optim/grad_norm` — spikes precede divergence.
-- `optim/lr`, `optim/zero_grad_ratio` — LR schedule and the fraction of params that received zero gradients (high → dead path or wrong loss masking).
-- For MoE: `max_vio/mean` (load-balancing violation), `routing_confidence/mean` — both are logged when non-zero.
-
-**Performance:**
-
-| Metric | Reading |
-|---|---|
-| `perf/throughput`, `perf/throughput_per_gpu` | tokens/s overall and per GPU |
-| `perf/mfu` | MFU |
-| `perf/peak_memory` | peak GPU memory (GiB) |
-| `time/step`, `time/forward_backward`, `time/save_ckpt` | step breakdown |
-
 ## Rules of thumb

 - **Start small.** Run `examples/reverse_text/rl.toml` end-to-end on 2 GPUs before scaling. If the smoke run finishes cleanly, your install is good.

From 9d5844ac489293d6515a5408d581b1ff6a6a69fb Mon Sep 17 00:00:00 2001
From: Mika Senghaas
Date: Mon, 25 May 2026 21:55:09 +0000
Subject: [PATCH 23/66] docs(training): drop Prometheus and BetterStack subsection

Not ready for end-user docs yet. Both knobs (trainer.metrics_server.port, trainer.heartbeat.url) are still in reference.md for anyone who needs them, just no narrative coverage.

Co-authored-by: Cursor
---
 docs/training.md | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/docs/training.md b/docs/training.md
index 3c4483531b..98124d02dd 100644
--- a/docs/training.md
+++ b/docs/training.md
@@ -26,7 +26,6 @@ This page covers everything you need to launch, observe, checkpoint, and recover
   - [Console output](#console-output)
   - [Weights & Biases](#weights--biases)
   - [Platform monitoring](#platform-monitoring)
-  - [Prometheus and BetterStack](#prometheus-and-betterstack)
 - [Rules of thumb](#rules-of-thumb)

 ## Entrypoints
@@ -320,13 +319,6 @@ run_name = "my-experiment"

 Requires `PRIME_API_KEY` (set via `prime login` or env var) and an allowlisted team. Currently internal-only.

-### Prometheus and BetterStack
-
-For long-running production training:
-
-- **Prometheus**: set `trainer.metrics_server.port` to expose `/metrics` on each trainer process. vLLM also exposes `/metrics` natively — useful for KV-cache saturation and pending-request counts.
-- **BetterStack heartbeats**: set `trainer.heartbeat.url` (and the matching orchestrator field) to ping a heartbeat URL each step.
Pair with a BetterStack monitor to page on stalls.
-
 ## Rules of thumb

 - **Start small.** Run `examples/reverse_text/rl.toml` end-to-end on 2 GPUs before scaling. If the smoke run finishes cleanly, your install is good.

From ebcc9b81455e59d2c03871bb0d5906f460e3defa Mon Sep 17 00:00:00 2001
From: Mika Senghaas
Date: Mon, 25 May 2026 21:59:38 +0000
Subject: [PATCH 24/66] docs(training): tighten Useful knobs language

- Use "task" not "prompt" for the conceptual unit ingested by the orchestrator (rows for batch_size and group_size, plus the matching Rules of thumb wording).
- group_size: drop the "Used for advantage normalization and pass@k estimation" trailer; the row name is enough.
- max_off_policy_steps: rename "throughput-vs-noise dial" to just "off-policy dial".
- eval row: drop the "Scores land in trainer logs and W&B as eval/{env}/{avg@k,pass@k}" + `prime eval` trailer; keep the row scoped to what the knob does.
- Add log.level to the Monitoring table (trainer/orchestrator process log level, $PRIME_LOG_LEVEL fallback, per-process or global on rl).
- Drop --ckpt from Run management; checkpointing has its own section.

Co-authored-by: Cursor
---
 docs/training.md | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/docs/training.md b/docs/training.md
index 98124d02dd..a383c34157 100644
--- a/docs/training.md
+++ b/docs/training.md
@@ -56,17 +56,18 @@ A condensed view of the knobs you'll most often tune. For trainer-side paralleli

 | Knob | What it does |
 |---|---|
-| `orchestrator.batch_size` | Prompts per trainer step. |
-| `orchestrator.group_size` | Rollouts generated per prompt. Used for advantage normalization and pass@k estimation. |
-| `orchestrator.max_off_policy_steps` | How many distinct policies may have contributed to one rollout before it's discarded (default 8). The main throughput-vs-noise dial on long agentic rollouts — bump for throughput, lower for tighter on-policyness.
Watch `errored_rollouts` and `mismatch_kl/all/mean` when tuning. |
+| `orchestrator.batch_size` | Tasks per trainer step. |
+| `orchestrator.group_size` | Rollouts generated per task. |
+| `orchestrator.max_off_policy_steps` | How many distinct policies may have contributed to one rollout before it's discarded (default 8). The main off-policy dial on long agentic rollouts — bump for throughput, lower for tighter on-policyness. Watch `errored_rollouts` and `mismatch_kl/all/mean` when tuning. |
 | `orchestrator.training_mode` | `rl` (default), `opd`, or `sft`. See [Training modes](#training-modes-rl--opd--sft). |
 | `[[orchestrator.train.env]]` | Training environments. List multiple tables for multi-env training; weight them via `ratio`. See [Configuration § Environments](configuration.md#environments-orchestratortrainenv). |
-| `[[orchestrator.eval.env]]` + `orchestrator.eval.interval` | Eval environments and cadence (default every 100 steps). Scores land in trainer logs and W&B as `eval/{env}/{avg@k,pass@k}`. For one-off evaluations outside training, use [`prime eval`](https://docs.primeintellect.ai/cli-reference/introduction). |
+| `[[orchestrator.eval.env]]` + `orchestrator.eval.interval` | Eval environments and cadence (default every 100 steps). |

 **Monitoring:**

 | Knob | What it does |
 |---|---|
+| `log.level` | Process log level for trainer + orchestrator (`info` default; falls back to `$PRIME_LOG_LEVEL`). Set per-process via `trainer.log.level` / `orchestrator.log.level`, or globally on the `rl` entrypoint to propagate to both. |
 | `orchestrator.log.vf_level` | Env-worker / verifiers log level (`info` default; `debug` is noisy but useful for env debugging). |
 | `--wandb` (+ `--wandb.project`, `--wandb.name`) | Enable Weights & Biases logging. See [Weights & Biases](#weights--biases). |
 | `--orchestrator.prime-monitor` | Stream metrics to the Prime Intellect platform (Prime Lab). See [Platform monitoring](#platform-monitoring).
|
@@ -75,7 +76,6 @@ A condensed view of the knobs you'll most often tune. For trainer-side paralleli

 | Knob | What it does |
 |---|---|
-| `--ckpt` | Enable end-of-training checkpoint. See [Checkpointing](#checkpointing) for interval / keep-last / resume variants. |
 | `--clean-output-dir` | Wipe `` before starting. Useful when re-running an experiment with the same name during iteration. |
 | `--output-dir outputs/` | Per-run output directory. Always set this when running more than one experiment in parallel. |
 | `--max-steps N` | Stop after `N` trainer steps. Overrides the config value. |
@@ -323,6 +323,6 @@ Requires `PRIME_API_KEY` (set via `prime login` or env var) and an allowlisted t

 - **Start small.** Run `examples/reverse_text/rl.toml` end-to-end on 2 GPUs before scaling. If the smoke run finishes cleanly, your install is good.
 - **Batch size ≥ 64.** Smaller batches give noisy gradient estimates and the trainer's overhead-per-step dominates throughput. 64 is the practical floor; 128–512 is typical for production RL.
-- **Group size ≥ 8.** Bigger groups (`orchestrator.group_size`) make it more likely that a prompt produces a mix of high- and low-reward rollouts, which is what gives the trainer a usable signal — if all rollouts in a group succeed or all fail, the within-group advantage collapses to zero and the trainer learns nothing from that prompt. Bigger groups also tighten advantage normalization. 8 is the floor; 16–32 is common.
+- **Group size ≥ 8.** Bigger groups (`orchestrator.group_size`) make it more likely that a task produces a mix of high- and low-reward rollouts, which is what gives the trainer a usable signal — if all rollouts in a group succeed or all fail, the within-group advantage collapses to zero and the trainer learns nothing from that task. Bigger groups also tighten advantage normalization. 8 is the floor; 16–32 is common.
 - **Pin `output_dir` per run.** Sharing a directory across runs will mix rollouts and break resumes.
`--output-dir outputs/` is the simplest discipline. - **Use `--dry-run` before SLURM.** Validators (CP needs flash-attention, NCCL broadcast needs `max_async_level=1`, etc.) fail fast in dry-run and slow in queue. From d5b7ac875d42086937fe265b86de2fdb5dd5b46f Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Mon, 25 May 2026 22:00:02 +0000 Subject: [PATCH 25/66] docs(training): trim RL Performance table to the bottleneck signals MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Drop the throughput/MFU/async-lag/KV-cache rows from the RL trainer Performance table — they're either generic perf metrics already covered by the SFT trainer's perf table or vLLM-internal. The two remaining rows (time/wait_for_batch, time/wait_for_ckpt) are the useful diagnostic — they tell you which side is the bottleneck. Co-authored-by: Cursor --- docs/training.md | 3 --- 1 file changed, 3 deletions(-) diff --git a/docs/training.md b/docs/training.md index a383c34157..69e69cf097 100644 --- a/docs/training.md +++ b/docs/training.md @@ -127,9 +127,6 @@ Pulled from the console logs and mirrored to W&B. |---|---|---| | trainer | `time/wait_for_batch` | **high → orchestrator bottleneck** | | orchestrator | `time/wait_for_ckpt` | **high → trainer bottleneck** | -| trainer | `perf/throughput`, `perf/mfu` | tokens/s and MFU | -| orchestrator | `scheduler/async_level`, `scheduler/inflight_rollouts` | current async lag | -| vLLM | `vllm:gpu_cache_usage_perc` | → 1.0 means KV cache saturated, slow generation | ## SFT trainer From b7b5ab0d843bc4c2499274b891b3254fb5a4c171 Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Mon, 25 May 2026 22:01:03 +0000 Subject: [PATCH 26/66] docs(training): refresh chat-template prefix-property paragraph MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The old wording recommended only patched checkpoints / a custom chat template as the fix for position-dependent templates. 
The renderer-based path landed in SFT (use_renderer flag) and is now the primary recommended fix, but it's still off by default (use_renderer: bool = False on SFTConfig), so the patched- checkpoint path also still works — and is what examples/reverse_text/sft.toml uses today via PrimeIntellect/ Qwen3-0.6B. Rewrite the paragraph to cover both fixes: - Renderer path: use_renderer = true, lists the hand-coded renderers, calls out the VLM unsupported case. - Patched template path: the prime-rl-patched checkpoint or a user-supplied template that preserves thinking. Cross-link both to Algorithms § Renderers and § Multi-turn trajectories. Co-authored-by: Cursor --- docs/training.md | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/docs/training.md b/docs/training.md index 69e69cf097..96d2215ee7 100644 --- a/docs/training.md +++ b/docs/training.md @@ -143,7 +143,12 @@ If both columns are present, `messages` takes precedence. **Tool definitions.** For tool-use SFT, add a `tools` column (OpenAI function-calling format) or `tool_defs` (verifiers rollout format). Each row's value can be either a list of dicts or a JSON-encoded string of a list — both are accepted, and `tool_defs` rows are auto-converted to OAI shape before being passed into the chat template's `tools=...` argument. The `chat_template_kwargs` column, if present, is forwarded verbatim into `apply_chat_template`. -**Chat-template prefix property.** Multi-turn SFT requires that tokenizing the first _k_ turns of a conversation be a strict prefix of tokenizing all _n ≥ k_ turns. Qwen3's default template _violates_ this (it strips past `<think>` blocks), so use either the prime-rl–patched checkpoints (e.g. `PrimeIntellect/Qwen3-0.6B`) or a custom chat template that preserves thinking. See [Algorithms § Multi-turn trajectories](algorithms.md#multi-turn-trajectories).
+**Position-dependent chat templates.** Multi-turn SFT under the default tokenization path (`build_incremental_token_mask`) requires that tokenizing the first _k_ turns of a conversation be a strict prefix of tokenizing all _n ≥ k_ turns. Qwen3's upstream template _violates_ this — it strips past `<think>` blocks across user turns, silently corrupting the loss mask. Two fixes: + +- **Enable the renderer** (`use_renderer = true`, recommended). The [`renderers`](algorithms.md#renderers) package owns tokenization end-to-end and is robust to position-dependent templates. Hand-coded renderers ship for Qwen3, Qwen3.5, GLM-5, GLM-4.5, Kimi K2/K2.5, MiniMax M2, DeepSeek V3, Nemotron 3, GPT-OSS. Not supported for VLMs. +- **Patched chat template** — the prime-rl–patched checkpoints (e.g. `PrimeIntellect/Qwen3-0.6B`, used in `examples/reverse_text/sft.toml`) ship a chat template that preserves thinking. Or supply your own. + +See [Algorithms § Multi-turn trajectories](algorithms.md#multi-turn-trajectories) for the full picture. ### Launch From b40e36f3f6603c488a98e688ec76f57437d4b47a Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Mon, 25 May 2026 22:01:39 +0000 Subject: [PATCH 27/66] docs(training): fix SFT data row (data.name, drop default data.type) - data.type = "sft" is the discriminated-union default for SFTConfig.data, so users don't need to spell it out. - The dataset path field is data.name, not data.path. Confirmed against SFTDataConfig.name in packages/prime-rl-configs/src/ prime_rl/configs/sft.py.
Co-authored-by: Cursor --- docs/training.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/training.md b/docs/training.md index 96d2215ee7..0ab9c73393 100644 --- a/docs/training.md +++ b/docs/training.md @@ -164,7 +164,7 @@ Multi-GPU and multi-node use torchrun under the hood (the `sft` entrypoint manag | Knob | What it controls | |---|---| -| `data.type = "sft"` and `data.path` | HF dataset name or local path | +| `data.name` | HF dataset name or local path | | `data.batch_size` | Tokens per trainer step (packed) | | `data.seq_len` | Per-sample sequence length | | `loss_mask.*` | Which roles contribute to loss; see [Reference § `sft.data.loss_mask`](reference.md#sft-data) | From 27ce09926b2e0e118d7bf37b0c9cd9e855f530e8 Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Mon, 25 May 2026 22:01:55 +0000 Subject: [PATCH 28/66] docs(training): drop loss/nan_count from SFT trainer metrics Co-authored-by: Cursor --- docs/training.md | 1 - 1 file changed, 1 deletion(-) diff --git a/docs/training.md b/docs/training.md index 0ab9c73393..5429cd3ada 100644 --- a/docs/training.md +++ b/docs/training.md @@ -177,7 +177,6 @@ Pulled from the console log and mirrored to W&B. **Progress and loss:** - `loss/mean` — main signal. Should decrease through the run. -- `loss/nan_count` — non-zero is a red flag; check LR and dtype. - `val/loss` — validation loss when `[val]` is set, logged every `val.interval` steps. - `progress/epoch`, `progress/num_samples`, `progress/num_tokens` — dataset progress. - `progress//ratio_{samples,tokens}` — when training on multiple HF subsets/splits, the realized mixing ratio. 
From 22c17a85d16fa400d79c437dc468ca0891dae985 Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Mon, 25 May 2026 22:53:53 +0000 Subject: [PATCH 29/66] docs(advanced): drop the Environments section Environment installation, authoring, and multi-env mixing are already covered concisely in configuration.md (Environments syntax) and training.md (Useful knobs row pointing at the same). The advanced.md duplicate just spread the same material across three subsections without adding anything material. - Drop the section + its TOC entries. - Update the page intro and overview.md "Where to go next" to stop advertising "environments deep-dive" / "environments installation and authoring". Co-authored-by: Cursor --- docs/advanced.md | 79 +----------------------------------------------- docs/overview.md | 2 +- 2 files changed, 2 insertions(+), 79 deletions(-) diff --git a/docs/advanced.md b/docs/advanced.md index aeffd93ed6..2b000e7ee8 100644 --- a/docs/advanced.md +++ b/docs/advanced.md @@ -1,6 +1,6 @@ # Advanced -This page covers the specialized features layered on top of the core training stack: MoE training and our custom model implementations, vision-language models, LoRA and the multi-run manager, environments installation and authoring, and the small-scale MoE testing workflow used during architecture work. +This page covers the specialized features layered on top of the core training stack: MoE training and our custom model implementations, vision-language models, LoRA and the multi-run manager, and the small-scale MoE testing workflow used during architecture work. 
## Table of Contents @@ -17,10 +17,6 @@ This page covers the specialized features layered on top of the core training st - [Run discovery](#run-discovery) - [Eviction](#eviction) - [Hooks](#hooks) -- [Environments](#environments) - - [Installing from the Hub](#installing-from-the-hub) - - [Authoring locally](#authoring-locally) - - [Multi-env training](#multi-env-training) - [Testing MoE at small scale](#testing-moe-at-small-scale) ## MoE models @@ -200,79 +196,6 @@ Deletion hooks always run before creation hooks. The creation/deletion hooks run For the full API surface, see [`src/prime_rl/trainer/runs/`](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/src/prime_rl/trainer/runs). The primary use case today is the LoRA-per-run training topology — many lightweight RL runs (e.g. one per environment) sharing a single trainer process group. -## Environments - -`prime-rl` trains in any [`verifiers`](https://github.com/PrimeIntellect-ai/verifiers) environment. The orchestrator hosts each declared environment as either a local subprocess (`vf.EnvServer` sidecar — default) or a standalone process you launched elsewhere. - -### Installing from the Hub - -Explore what's available: - -```bash -prime env info / -``` - -Install: - -```bash -prime env install / -# or pin a version -prime env install /@1.2.3 -``` - -Verify the import works: - -```bash -uv run python -c "import " -``` - -Then reference it in your config by ID: - -```toml -[[orchestrator.train.env]] -id = "primeintellect/math-env" -name = "gsm8k" -args = { dataset_name = "openai/gsm8k", dataset_subset = "main" } -``` - -### Authoring locally - -For local dev or pre-Hub work, install an environment in editable mode: - -```bash -uv pip install -e path/to/my-env -``` - -The env exposes a `load_environment(**kwargs)` returning a `vf.Environment` (or v1 `vf.Env`). The `args` field in the orchestrator config is forwarded verbatim as `**kwargs`. 
See the [verifiers docs](https://docs.primeintellect.ai/verifiers) for environment authoring. - -To run an env in an isolated process (e.g. inside a container, with its own conda environment), launch the env server separately and pass its address: - -```toml -[[orchestrator.train.env]] -id = "my-env" -address = "tcp://10.0.0.5:5000" -``` - -When `address` is set, the orchestrator connects to that ZMQ server rather than spawning a subprocess. - -### Multi-env training - -You can train on a mixture of environments by listing several `[[orchestrator.train.env]]` tables. Set `ratio` on each to weight sampling; omit `ratio` on all of them to sample uniformly across problems (not envs). - -```toml -[[orchestrator.train.env]] -id = "math-env" -name = "gsm8k" -args = { dataset_name = "openai/gsm8k", dataset_subset = "main" } -ratio = 3 - -[[orchestrator.train.env]] -id = "reverse-text" -ratio = 1 -``` - -This batches roughly 75% from `gsm8k` and 25% from `reverse-text`. The same applies to `[[orchestrator.eval.env]]` for evaluation mixtures. - ## Testing MoE at small scale When working on MoE architectures (GLM-4, Kimi, etc.), you can't iterate on a 100B+ model locally. The workflow below builds a ~0.5B model with the same architecture, warms it up with SFT, and runs RL — all on 1–2 GPUs. The goal is catching bugs in modeling code, state-dict conversions, and pipeline integration before scaling. diff --git a/docs/overview.md b/docs/overview.md index 2781fcc3c7..65811d28fe 100644 --- a/docs/overview.md +++ b/docs/overview.md @@ -42,6 +42,6 @@ The `rl` entrypoint reads `examples/reverse_text/rl.toml`, splits it into per-pr - **[Training](training.md)** — End-to-end recipes for RL, SFT, and evals; checkpointing and resume; observability (logs, W&B, Prometheus, platform monitoring); rules of thumb and common issues. 
- **[Scaling](scaling.md)** — Single-GPU through 1000+ GPU; FSDP / EP / CP knobs; SLURM and Kubernetes guides; disaggregated prefill/decode inference; benchmarking. - **[Algorithms](algorithms.md)** — Async / off-policy semantics; the default loss; built-in and custom losses, advantages, and filters; multi-turn trajectory merging. -- **[Advanced](advanced.md)** — MoE training (EP backends, custom impls); VLMs; LoRA and the multi-run manager; small-scale MoE testing; environments deep-dive. +- **[Advanced](advanced.md)** — MoE training (EP backends, custom impls); VLMs; LoRA and the multi-run manager; small-scale MoE testing. - **[Reference](reference.md)** — Auto-generated field-by-field reference for every entrypoint config. - **[FAQs](faqs.md)** — Quick answers to recurring questions. From bd3b34512e7ad644b78658d3e3f71db7b9ada301 Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Mon, 25 May 2026 22:55:04 +0000 Subject: [PATCH 30/66] docs(advanced): rename "MoE models" -> "Custom modeling" + "LoRA" -> "LoRA training" MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The first section was conceptually about our custom modeling stack (model.impl = "custom"/"auto", the family registry, EP backends), not about MoE specifically — even though the supported families happen to be MoE-heavy. Rename so the section title matches the defining feature. Bump LoRA to "LoRA training" for parity with the trainer-oriented section names elsewhere on the page. Update the page intro and TOC entries to match. No external doc links to the renamed anchors (#moe-models, #lora) so nothing else to fix. 
Co-authored-by: Cursor --- docs/advanced.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/docs/advanced.md b/docs/advanced.md index 2b000e7ee8..23f0af8331 100644 --- a/docs/advanced.md +++ b/docs/advanced.md @@ -1,10 +1,10 @@ # Advanced -This page covers the specialized features layered on top of the core training stack: MoE training and our custom model implementations, vision-language models, LoRA and the multi-run manager, and the small-scale MoE testing workflow used during architecture work. +This page covers the specialized features layered on top of the core training stack: our custom model implementations (with EP for MoE families and CP for long-context training), vision-language models, LoRA training and the multi-run manager, and the small-scale MoE testing workflow used during architecture work. ## Table of Contents -- [MoE models](#moe-models) +- [Custom modeling](#custom-modeling) - [Custom vs HF implementations](#custom-vs-hf-implementations) - [Expert parallelism backends](#expert-parallelism-backends) - [Vision-language models](#vision-language-models) @@ -12,14 +12,14 @@ This page covers the specialized features layered on top of the core training st - [Enabling VLM mode](#enabling-vlm-mode) - [Limitations](#limitations) - [Multi-turn VLM training](#multi-turn-vlm-training) -- [LoRA](#lora) +- [LoRA training](#lora-training) - [Multi-run manager](#multi-run-manager) - [Run discovery](#run-discovery) - [Eviction](#eviction) - [Hooks](#hooks) - [Testing MoE at small scale](#testing-moe-at-small-scale) -## MoE models +## Custom modeling ### Custom vs HF implementations @@ -111,7 +111,7 @@ Each multimodal sample becomes its own micro-batch (no packing) because image te `VLLM_WORKER_MULTIPROC_METHOD=spawn` is required for VLM inference — set automatically by `uv run rl`, but if you launch `uv run inference` separately for a VLM, export it yourself. 
-## LoRA +## LoRA training LoRA is enabled by adding `[model.lora]`: From 714af81542f5649d5d24caa70ffaa452859b2482 Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Mon, 25 May 2026 22:55:27 +0000 Subject: [PATCH 31/66] docs(advanced): drop 'Poolside' prefix from Laguna family row Co-authored-by: Cursor --- docs/advanced.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/advanced.md b/docs/advanced.md index 23f0af8331..28ba10f531 100644 --- a/docs/advanced.md +++ b/docs/advanced.md @@ -36,7 +36,7 @@ impl = "custom" # or "hf" to force the HF path | Qwen3 MoE | `Qwen/Qwen3-30B-A3B`, … | ✅ | ✅ | | Qwen3.5 MoE | `Qwen/Qwen3.5-35B-A3B`, … | ✅ | ✅ | | Qwen3 / Qwen3.5 VLMs | see [Multimodal](#vision-language-models) | MoE only | ✅ | -| Poolside Laguna | `poolside/Laguna-XS.2` | ✅ | ✅ | +| Laguna | `poolside/Laguna-XS.2` | ✅ | ✅ | | MiniMax M2 | `MiniMax/MiniMax-M2` | ✅ | ✅ | | Nemotron H | `nvidia/Nemotron-3-Nano-30B-A3B`, … | ✅ | ❌ | | Trinity (AFMoE) | `arcee-ai/Trinity-Mini`, … | ✅ | ✅ | From ae7d0363b3c664074193750ddc431683031090aa Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Mon, 25 May 2026 22:56:53 +0000 Subject: [PATCH 32/66] docs: rename Vision-language models -> Multimodal training Standardize on "multimodal training" as the section title: - advanced.md: "## Vision-language models" -> "## Multimodal training" + "### Multi-turn VLM training" -> "### Multi-turn training" (the parent context is already multimodal). "### Enabling VLM mode" stays because it's literally enabling the [model.vlm] config block. - advanced.md page intro: "vision-language models" -> "multimodal training". - advanced.md custom-modeling cross-link to (#vision-language-models) -> (#multimodal-training). - overview.md "Where to go next" blurb: align the Advanced bullet's language ("MoE training" + "VLMs") with the destination's section titles ("Custom modeling" + "multimodal training"). 
reference.md still mentions "vision-language model" in auto-generated descriptions sourced from Pydantic field docstrings. Those are out-of-scope for this rename; would need a config docstring change. Co-authored-by: Cursor --- docs/advanced.md | 12 ++++++------ docs/overview.md | 2 +- 2 files changed, 7 insertions(+), 7 deletions(-) diff --git a/docs/advanced.md b/docs/advanced.md index 28ba10f531..a2703454d6 100644 --- a/docs/advanced.md +++ b/docs/advanced.md @@ -1,17 +1,17 @@ # Advanced -This page covers the specialized features layered on top of the core training stack: our custom model implementations (with EP for MoE families and CP for long-context training), vision-language models, LoRA training and the multi-run manager, and the small-scale MoE testing workflow used during architecture work. +This page covers the specialized features layered on top of the core training stack: our custom model implementations (with EP for MoE families and CP for long-context training), multimodal training, LoRA training and the multi-run manager, and the small-scale MoE testing workflow used during architecture work. 
## Table of Contents - [Custom modeling](#custom-modeling) - [Custom vs HF implementations](#custom-vs-hf-implementations) - [Expert parallelism backends](#expert-parallelism-backends) -- [Vision-language models](#vision-language-models) +- [Multimodal training](#multimodal-training) - [Supported families](#supported-families) - [Enabling VLM mode](#enabling-vlm-mode) - [Limitations](#limitations) - - [Multi-turn VLM training](#multi-turn-vlm-training) + - [Multi-turn training](#multi-turn-training) - [LoRA training](#lora-training) - [Multi-run manager](#multi-run-manager) - [Run discovery](#run-discovery) @@ -35,7 +35,7 @@ impl = "custom" # or "hf" to force the HF path | GLM-5 (`glm_moe_dsa`) | `zai-org/GLM-5`, `zai-org/GLM-5-FP8` | ✅ | ✅ | | Qwen3 MoE | `Qwen/Qwen3-30B-A3B`, … | ✅ | ✅ | | Qwen3.5 MoE | `Qwen/Qwen3.5-35B-A3B`, … | ✅ | ✅ | -| Qwen3 / Qwen3.5 VLMs | see [Multimodal](#vision-language-models) | MoE only | ✅ | +| Qwen3 / Qwen3.5 VLMs | see [Multimodal training](#multimodal-training) | MoE only | ✅ | | Laguna | `poolside/Laguna-XS.2` | ✅ | ✅ | | MiniMax M2 | `MiniMax/MiniMax-M2` | ✅ | ✅ | | Nemotron H | `nvidia/Nemotron-3-Nano-30B-A3B`, … | ✅ | ❌ | @@ -56,7 +56,7 @@ DeepEP intranode dispatch derives the RDMA channel count as `deepep_num_sms / 2` When you enable DeepEP, gradient clipping is auto-disabled (`optim.max_norm` set to `None`) because the kernels don't currently support it. This is a tradeoff — watch `grad_norm` in the trainer logs to make sure nothing diverges. -## Vision-language models +## Multimodal training ### Supported families @@ -99,7 +99,7 @@ To add a new model family permanently, append an entry to `VLM_REGISTRY` in `src - **Higher KL mismatch with multi-image inputs.** Expect noisier `mismatch_kl` than text-only; this is from minor numerical differences between the trainer's and vLLM's image processing. - **Images aren't logged to monitors.** Sample logging captures the prompt text but not the actual images. 
-### Multi-turn VLM training +### Multi-turn training VLM rollouts go through the renderer-backed TITO client (`orchestrator.use_renderer = true`, required for VLMs). Per trajectory step: diff --git a/docs/overview.md b/docs/overview.md index 65811d28fe..7cca5ad820 100644 --- a/docs/overview.md +++ b/docs/overview.md @@ -42,6 +42,6 @@ The `rl` entrypoint reads `examples/reverse_text/rl.toml`, splits it into per-pr - **[Training](training.md)** — End-to-end recipes for RL, SFT, and evals; checkpointing and resume; observability (logs, W&B, Prometheus, platform monitoring); rules of thumb and common issues. - **[Scaling](scaling.md)** — Single-GPU through 1000+ GPU; FSDP / EP / CP knobs; SLURM and Kubernetes guides; disaggregated prefill/decode inference; benchmarking. - **[Algorithms](algorithms.md)** — Async / off-policy semantics; the default loss; built-in and custom losses, advantages, and filters; multi-turn trajectory merging. -- **[Advanced](advanced.md)** — MoE training (EP backends, custom impls); VLMs; LoRA and the multi-run manager; small-scale MoE testing. +- **[Advanced](advanced.md)** — Custom modeling (EP backends, custom impls); multimodal training; LoRA and the multi-run manager; small-scale MoE testing. - **[Reference](reference.md)** — Auto-generated field-by-field reference for every entrypoint config. - **[FAQs](faqs.md)** — Quick answers to recurring questions. 
From 3eca0888f039df720f677b9b015de769329b4209 Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Mon, 25 May 2026 22:57:13 +0000 Subject: [PATCH 33/66] docs(advanced): drop the DeepEP grad-clip-disable tradeoff aside Co-authored-by: Cursor --- docs/advanced.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/advanced.md b/docs/advanced.md index a2703454d6..03bcaa8e7b 100644 --- a/docs/advanced.md +++ b/docs/advanced.md @@ -54,7 +54,7 @@ The custom path enables EP, selective activation checkpointing, FP8 training (`m DeepEP intranode dispatch derives the RDMA channel count as `deepep_num_sms / 2`. Lower SM count leaves more for compute; higher speeds up dispatch. Useful starting points: 16–24 SMs on H100, 20–40 on B200. -When you enable DeepEP, gradient clipping is auto-disabled (`optim.max_norm` set to `None`) because the kernels don't currently support it. This is a tradeoff — watch `grad_norm` in the trainer logs to make sure nothing diverges. +When you enable DeepEP, gradient clipping is auto-disabled (`optim.max_norm` set to `None`) because the kernels don't currently support it. ## Multimodal training From ae12a2ee0b3492951d43517536574996cda64e6a Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Mon, 25 May 2026 22:59:31 +0000 Subject: [PATCH 34/66] docs(advanced): fix the freeze_vision_encoder + LoRA claim MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The bullet said the combo "has no effect under LoRA (LoRA freezes everything non-adapter regardless)". Source check: packages/prime-rl-configs/.../trainer.py:591-597 now contains a vlm_freeze_incompatible_with_lora validator that explicitly *rejects* the combo with an error. So the user-facing behavior is not silent-no-op, it's a config validation failure. 
Other bullets in the Limitations list verified against source and are correct as written: - "No multimodal-safe truncation": confirmed in trainer/batch.py (input_ids / mm_token_type_ids truncated to seq_len; mm_kwargs passes through unchanged). - "bfloat16 mandatory": confirmed by the vlms_require_bfloat16 validator in trainer.py. - "Images aren't logged to monitors": confirmed — no image / pixel handling in src/prime_rl/utils/monitor/. The "Higher KL mismatch with multi-image inputs" bullet is left unchanged for now — flagged for review separately because the cited rationale ("minor numerical differences between trainer and vLLM image processing") no longer matches reality after the renderer- only multimodal path landed (#2473): the renderer now owns image processing end-to-end and both sides consume the same pre-processed pixel_values / image_grid_thw. The empirical observation may still hold but the reason given would be stale. Co-authored-by: Cursor --- docs/advanced.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/advanced.md b/docs/advanced.md index 03bcaa8e7b..2ad188d928 100644 --- a/docs/advanced.md +++ b/docs/advanced.md @@ -93,7 +93,7 @@ To add a new model family permanently, append an entry to `VLM_REGISTRY` in `src ### Limitations -- **Vision encoder frozen by default.** Set `freeze_vision_encoder = false` to fine-tune it; in that case it's FSDP-sharded per block. Has no effect under LoRA (LoRA freezes everything non-adapter regardless). +- **Vision encoder frozen by default.** Set `freeze_vision_encoder = false` to fine-tune it; in that case it's FSDP-sharded per block. The combination `freeze_vision_encoder = false` + LoRA is rejected by a config validator — LoRA freezes everything non-adapter, so unfreezing the encoder under LoRA would be a silent no-op. - **No multimodal-safe truncation.** Token sequences are truncated to `seq_len`, but `pixel_values` and `image_grid_thw` pass through unchanged. 
If a sample's tokens overflow, image tokens may get dropped while image tensors still describe the full image set. Set `seq_len` to cover your longest sample. - **bfloat16 mandatory.** The trainer config validator refuses any other `optimization_dtype` / `reduce_dtype` for VLMs — vLLM serves VLMs in bfloat16 and a mismatch breaks the importance ratio. - **Higher KL mismatch with multi-image inputs.** Expect noisier `mismatch_kl` than text-only; this is from minor numerical differences between the trainer's and vLLM's image processing. From d9866b719ed3a287f2709cfdb81d4ece0453defb Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Mon, 25 May 2026 23:00:12 +0000 Subject: [PATCH 35/66] docs(advanced): drop the Multi-turn training subsection The render/pack/forward rendition lived close to internal code and felt out of scope for the user-facing Advanced page; the spawn-method hint that ended it can move elsewhere if needed. Co-authored-by: Cursor --- docs/advanced.md | 13 ------------- 1 file changed, 13 deletions(-) diff --git a/docs/advanced.md b/docs/advanced.md index 2ad188d928..5dd758402e 100644 --- a/docs/advanced.md +++ b/docs/advanced.md @@ -11,7 +11,6 @@ This page covers the specialized features layered on top of the core training st - [Supported families](#supported-families) - [Enabling VLM mode](#enabling-vlm-mode) - [Limitations](#limitations) - - [Multi-turn training](#multi-turn-training) - [LoRA training](#lora-training) - [Multi-run manager](#multi-run-manager) - [Run discovery](#run-discovery) @@ -99,18 +98,6 @@ To add a new model family permanently, append an entry to `VLM_REGISTRY` in `src - **Higher KL mismatch with multi-image inputs.** Expect noisier `mismatch_kl` than text-only; this is from minor numerical differences between the trainer's and vLLM's image processing. - **Images aren't logged to monitors.** Sample logging captures the prompt text but not the actual images. 
-### Multi-turn training - -VLM rollouts go through the renderer-backed TITO client (`orchestrator.use_renderer = true`, required for VLMs). Per trajectory step: - -1. **Render** — the renderer tokenizes messages and emits per-image multimodal tensors (`pixel_values`, `image_grid_thw` for Qwen3-VL) as `multi_modal_data`. -2. **Pack** — `interleave_rollout` concatenates per-image tensors across a sample's merged step range into a single `mm_kwargs` dict on the `TrainingSample`. Per-token `mm_token_type_ids` (0=text, 1=image, 2=video) come from `renderer.mm_token_type_id_map`. -3. **Forward** — the trainer `**`-unpacks `mm_kwargs` into the model's `forward`. Any VLM whose HF processor and forward signature agree on kwarg names works without modifying the transport. - -Each multimodal sample becomes its own micro-batch (no packing) because image tensor sizes vary. - -`VLLM_WORKER_MULTIPROC_METHOD=spawn` is required for VLM inference — set automatically by `uv run rl`, but if you launch `uv run inference` separately for a VLM, export it yourself. - ## LoRA training LoRA is enabled by adding `[model.lora]`: From b72f12a5f65ef3480e5979352dcb7c0eb7f4b977 Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Mon, 25 May 2026 23:01:36 +0000 Subject: [PATCH 36/66] docs(advanced): rename Multi-run manager -> Multi-tenant training MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit User-facing framing now leads with the actual use case: many concurrent LoRA tenants sharing a single trainer + inference deployment, which is the topology behind hosted training on the Prime Intellect platform (Lab). The MultiRunManager class still exists in code and is name-checked in the section body — only the section title and surrounding prose change. - advanced.md: rename TOC entry + H2 + open paragraph; drop the now- redundant "primary use case today" trailer (Lab is in the intro). - advanced.md LoRA section: reword the cross-link. 
- overview.md: trainer description bullet now says "multi-tenant training" instead of "multi-run manager"; "Where to go next" Advanced bullet renamed similarly. - scaling.md: parenthetical aside on optim_cpu_offload incompatibility updated for consistency. Co-authored-by: Cursor --- docs/advanced.md | 12 ++++++------ docs/overview.md | 4 ++-- docs/scaling.md | 2 +- 3 files changed, 9 insertions(+), 9 deletions(-) diff --git a/docs/advanced.md b/docs/advanced.md index 5dd758402e..a06cc85f42 100644 --- a/docs/advanced.md +++ b/docs/advanced.md @@ -1,6 +1,6 @@ # Advanced -This page covers the specialized features layered on top of the core training stack: our custom model implementations (with EP for MoE families and CP for long-context training), multimodal training, LoRA training and the multi-run manager, and the small-scale MoE testing workflow used during architecture work. +This page covers the specialized features layered on top of the core training stack: our custom model implementations (with EP for MoE families and CP for long-context training), multimodal training, LoRA training, multi-tenant training, and the small-scale MoE testing workflow used during architecture work. ## Table of Contents @@ -12,7 +12,7 @@ This page covers the specialized features layered on top of the core training st - [Enabling VLM mode](#enabling-vlm-mode) - [Limitations](#limitations) - [LoRA training](#lora-training) -- [Multi-run manager](#multi-run-manager) +- [Multi-tenant training](#multi-tenant-training) - [Run discovery](#run-discovery) - [Eviction](#eviction) - [Hooks](#hooks) @@ -118,11 +118,11 @@ LoRA is supported across SFT and RL. For RL, `weight_broadcast.type = "nccl"` is save_adapter_separately = true ``` -LoRA pairs naturally with the multi-run manager — each run gets its own adapter, and many runs share the same backbone in trainer memory. 
+LoRA pairs naturally with [multi-tenant training](#multi-tenant-training) — each tenant gets its own adapter and the backbone is shared across all of them in trainer memory. -## Multi-run manager +## Multi-tenant training -`MultiRunManager` is a trainer-side singleton that lets one trainer process serve multiple concurrent orchestrator deployments, each with its own LoRA adapter, optimizer, scheduler, checkpoints, and progress tracking. Enable by setting `trainer.max_concurrent_runs > 1`. +Multi-tenant training lets a single trainer + inference deployment serve many concurrent LoRA "tenants" — each a fully isolated run with its own orchestrator, LoRA adapter, optimizer, scheduler, checkpoints, and progress tracking — sharing the same backbone weights and the same vLLM server. This is the topology behind hosted training on the [Prime Intellect platform (Lab)](https://app.primeintellect.ai). The trainer-side implementation is the `MultiRunManager` singleton, enabled by setting `trainer.max_concurrent_runs > 1`. Per-run layout under `/`: @@ -181,7 +181,7 @@ Five hook types fire at well-defined points: Deletion hooks always run before creation hooks. The creation/deletion hooks run on **all** ranks, so they're the right place for DTensor allocation and other collective work; `torch.dist.barrier()` is safe inside. -For the full API surface, see [`src/prime_rl/trainer/runs/`](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/src/prime_rl/trainer/runs). The primary use case today is the LoRA-per-run training topology — many lightweight RL runs (e.g. one per environment) sharing a single trainer process group. +For the full API surface, see [`src/prime_rl/trainer/runs/`](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/src/prime_rl/trainer/runs). 
## Testing MoE at small scale diff --git a/docs/overview.md b/docs/overview.md index 7cca5ad820..26af7bb104 100644 --- a/docs/overview.md +++ b/docs/overview.md @@ -10,7 +10,7 @@ A `prime-rl` RL run is three cooperating processes: - **Inference** — vLLM-backed server (or fleet) holding the current policy. The orchestrator drives rollouts through the token-in `/v1/generate` route via the [`renderers`](https://github.com/PrimeIntellect-ai/renderers) package (OpenAI-compatible chat/completions routes are also exposed for external clients). Supports data + tensor + expert parallelism (with `deepep` and `flashinfer` all-to-all backends and EPLB), FP8 inference, prefill/decode disaggregation behind a `vllm-router`, CPU KV-cache offload, and *router replay* (the routed-expert mask is returned to the trainer for FP8 MoE numerical parity). Weights are pushed in place through a custom `update_weights` endpoint over filesystem or NCCL transports. - **Orchestrator** — Lightweight CPU process that owns the data plane across many [verifiers](https://github.com/PrimeIntellect-ai/verifiers) training and eval environments. Each env runs in an isolated subprocess with a variable-size pool of env workers for scalability. The orchestrator drives multi-turn rollouts against the inference fleet (tool use, browsers, sandboxes, long horizons) without re-tokenizing across turns, computes advantages, packs the rollouts into training batches, and relays new weights from trainer to inference. -- **Trainer** — FSDP2 process group that consumes packed rollouts and steps the optimizer. We ship optimized custom modeling code for many MoE / dense / VLM families that unlocks advanced trainer parallelism — expert parallelism (EP, with DeepEP kernels) and context parallelism (CP) for long-sequence training — plus selective activation checkpointing, FP8 training on Hopper+, LoRA, and a multi-run manager that hosts many concurrent adapters in one trainer process. 
+- **Trainer** — FSDP2 process group that consumes packed rollouts and steps the optimizer. We ship optimized custom modeling code for many MoE / dense / VLM families that unlocks advanced trainer parallelism — expert parallelism (EP, with DeepEP kernels) and context parallelism (CP) for long-sequence training — plus selective activation checkpointing, FP8 training on Hopper+, LoRA, and multi-tenant training (many concurrent LoRA tenants sharing one trainer + inference deployment). The three processes communicate through configurable transports — by default the trainer↔orchestrator rollout link uses the local filesystem, and weight broadcast uses the filesystem (or NCCL for synchronous setups). Swap to ZMQ for multi-host setups without shared storage. See [Scaling](scaling.md) for the deployment options. @@ -42,6 +42,6 @@ The `rl` entrypoint reads `examples/reverse_text/rl.toml`, splits it into per-pr - **[Training](training.md)** — End-to-end recipes for RL, SFT, and evals; checkpointing and resume; observability (logs, W&B, Prometheus, platform monitoring); rules of thumb and common issues. - **[Scaling](scaling.md)** — Single-GPU through 1000+ GPU; FSDP / EP / CP knobs; SLURM and Kubernetes guides; disaggregated prefill/decode inference; benchmarking. - **[Algorithms](algorithms.md)** — Async / off-policy semantics; the default loss; built-in and custom losses, advantages, and filters; multi-turn trajectory merging. -- **[Advanced](advanced.md)** — Custom modeling (EP backends, custom impls); multimodal training; LoRA and the multi-run manager; small-scale MoE testing. +- **[Advanced](advanced.md)** — Custom modeling (EP backends, custom impls); multimodal training; LoRA + multi-tenant training; small-scale MoE testing. - **[Reference](reference.md)** — Auto-generated field-by-field reference for every entrypoint config. - **[FAQs](faqs.md)** — Quick answers to recurring questions. 
diff --git a/docs/scaling.md b/docs/scaling.md index 4c919ded9a..a557c87c6b 100644 --- a/docs/scaling.md +++ b/docs/scaling.md @@ -178,7 +178,7 @@ type = "adamw" optim_cpu_offload = true ``` -Mutually exclusive with `fsdp_cpu_offload`. Also incompatible with `trainer.max_concurrent_runs > 1` (the multi-run manager). Muon doesn't support `fsdp_cpu_offload` but does support `optim_cpu_offload`. +Mutually exclusive with `fsdp_cpu_offload`. Also incompatible with `trainer.max_concurrent_runs > 1` (multi-tenant training). Muon doesn't support `fsdp_cpu_offload` but does support `optim_cpu_offload`. ## Memory-tight recipe From 6ee5c5c9287f1cf5abedd7c42bf33b997fafbd41 Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Mon, 25 May 2026 23:02:28 +0000 Subject: [PATCH 37/66] docs(advanced): trim multi-tenant training to user-facing surface only MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Drop the per-run filesystem layout, Run discovery / Eviction / Hooks subsections — these are MultiRunManager internals (run_* dir discovery, evicted.txt protocol, hook lifecycle) that don't belong in a user-facing Advanced page. The pointer to src/prime_rl/trainer/runs/ still tells anyone who needs the API where to look. The remaining section is one paragraph: what multi-tenant training buys you, that Lab is the production user, and the trainer.max_concurrent_runs entry point. 
Co-authored-by: Cursor --- docs/advanced.md | 64 +----------------------------------------------- 1 file changed, 1 insertion(+), 63 deletions(-) diff --git a/docs/advanced.md b/docs/advanced.md index a06cc85f42..2710d6c1cb 100644 --- a/docs/advanced.md +++ b/docs/advanced.md @@ -13,9 +13,6 @@ This page covers the specialized features layered on top of the core training st - [Limitations](#limitations) - [LoRA training](#lora-training) - [Multi-tenant training](#multi-tenant-training) - - [Run discovery](#run-discovery) - - [Eviction](#eviction) - - [Hooks](#hooks) - [Testing MoE at small scale](#testing-moe-at-small-scale) ## Custom modeling @@ -122,66 +119,7 @@ LoRA pairs naturally with [multi-tenant training](#multi-tenant-training) — ea ## Multi-tenant training -Multi-tenant training lets a single trainer + inference deployment serve many concurrent LoRA "tenants" — each a fully isolated run with its own orchestrator, LoRA adapter, optimizer, scheduler, checkpoints, and progress tracking — sharing the same backbone weights and the same vLLM server. This is the topology behind hosted training on the [Prime Intellect platform (Lab)](https://app.primeintellect.ai). The trainer-side implementation is the `MultiRunManager` singleton, enabled by setting `trainer.max_concurrent_runs > 1`. - -Per-run layout under `/`: - -``` -run_abc123/ -├── control/ -│ ├── orch.toml # orchestrator config for this run -│ ├── config_validation_error.txt # populated if validation failed -│ └── evicted.txt # populated if the run was evicted -├── checkpoints/ -│ └── step_/ # orchestrator checkpoints -├── rollouts/ -│ └── step_/ # rollouts -└── broadcast/ - └── step_/ # weight snapshots for inference -``` - -### Run discovery - -Runs are added by dropping a `run_*` directory into `` with a valid `control/orch.toml`. 
The trainer scans periodically: - -```python -multi_run_manager.discover_runs() # master rank only -multi_run_manager.synchronize_state() # all ranks -``` - -- `discover_runs()` (master): scans, filters evicted runs, detects new/deleted, validates configs, fires `discovered_hook` / `forgotten_hook`. -- `synchronize_state()` (all ranks): master broadcasts run state over the distributed store; all ranks run `deletion_hook` then `creation_hook` so DTensor allocations and other collective ops happen in lock-step. - -Once `max_concurrent_runs` is reached, new `run_*` directories are ignored until existing runs are evicted or deleted. - -### Eviction - -The master can evict a run with `evict_run(idx, reason)`: - -```python -multi_run_manager.evict_run(idx=0, reason="exceeded memory limits") -``` - -The eviction writes `/control/evicted.txt`. Effect: - -- **Trainer side**: next `discover_runs()` treats the run as deleted, hooks fire, the index returns to the unused pool. -- **Orchestrator side**: checks for `evicted.txt` at the top of each iteration. If found, it raises a `RuntimeError` with the reason. The orchestrator also self-evicts after `MAX_EMPTY_BATCH_ATTEMPTS` (3) consecutive empty-batch failures, so a run with degenerate rewards doesn't sit consuming a slot forever. - -### Hooks - -Five hook types fire at well-defined points: - -| Hook | Where | When | -|---|---|---| -| `discovered_hook` | master | new run detected and config validated | -| `forgotten_hook` | master | run deleted from the output dir | -| `config_validation_hook` | master | validate the orchestrator config when a new run is discovered | -| `creation_hook` | all ranks | after `synchronize_state` for a newly created run (use for optimizer/scheduler init, LoRA param reset) | -| `deletion_hook` | all ranks | after `synchronize_state` for a deleted run (use for releasing per-run resources) | - -Deletion hooks always run before creation hooks. 
The creation/deletion hooks run on **all** ranks, so they're the right place for DTensor allocation and other collective work; `torch.dist.barrier()` is safe inside. - -For the full API surface, see [`src/prime_rl/trainer/runs/`](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/src/prime_rl/trainer/runs). +Multi-tenant training lets a single trainer + inference deployment serve many concurrent LoRA "tenants" — each a fully isolated run with its own orchestrator, LoRA adapter, optimizer, scheduler, checkpoints, and progress tracking — sharing the same backbone weights and the same vLLM server. This is the topology behind hosted training on the [Prime Intellect platform (Lab)](https://app.primeintellect.ai). The trainer-side implementation is the `MultiRunManager` singleton, enabled by setting `trainer.max_concurrent_runs > 1`. For the full API surface, see [`src/prime_rl/trainer/runs/`](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/src/prime_rl/trainer/runs). ## Testing MoE at small scale From fa06df7fc60998f82ced98dc5ba1cff5907d7b84 Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Mon, 25 May 2026 23:04:13 +0000 Subject: [PATCH 38/66] docs: add a Development page; move "Testing MoE at small scale" off Advanced Advanced was mixing user-facing features (custom modeling, multimodal, LoRA, multi-tenant training) with developer-facing workflows (building a tiny MoE for modeling-code iteration, adding a new architecture). Split the latter into a new Development page so each audience reads only what's relevant. - New page: docs/development.md, scoped to "developing on prime-rl itself". Contains the unchanged Testing MoE at small scale recipe (build/verify mini model, SFT warmup, RL on reverse-text) and the Adding a new architecture how-to. - advanced.md: drop the section + TOC entry; rewrite the page intro to point at Development for developer-side workflows. - mint.json: insert "development" in the nav between "advanced" and "faqs". 
- overview.md "Where to go next": add the Development bullet, drop "small-scale MoE testing" from the Advanced bullet. - README.md docs index: same split (Advanced + Development as two separate bullets). No external links to the moved anchors needed updating; the only existing cross-link was the relocated TOC entry in development.md itself. Co-authored-by: Cursor --- README.md | 3 +- docs/advanced.md | 75 +--------------------------------------- docs/development.md | 83 +++++++++++++++++++++++++++++++++++++++++++++ docs/mint.json | 1 + docs/overview.md | 3 +- 5 files changed, 89 insertions(+), 76 deletions(-) create mode 100644 docs/development.md diff --git a/README.md b/README.md index 6a092f7dc4..a8006998ae 100644 --- a/README.md +++ b/README.md @@ -222,7 +222,8 @@ Check out the [docs](docs) directory for in-depth guides on how to use PRIME-RL. - [**Training**](docs/training.md) - RL, SFT, evals, checkpointing, observability, rules of thumb - [**Scaling**](docs/scaling.md) - Single-GPU through multi-node, FSDP/EP/CP, SLURM, Kubernetes, disaggregated inference, benchmarking - [**Algorithms**](docs/algorithms.md) - Async/off-policy training, the AIPO loss, advantage and filter plugins, trajectory merging -- [**Advanced**](docs/advanced.md) - MoE, VLMs, LoRA, multi-run manager, environments, small-scale MoE testing +- [**Advanced**](docs/advanced.md) - Custom modeling, multimodal training, LoRA, multi-tenant training +- [**Development**](docs/development.md) - Adding a new model architecture, small-scale MoE testing - [**FAQs**](docs/faqs.md) - Frequently-asked questions - [**Reference**](docs/reference.md) - Auto-generated field-by-field reference for every entrypoint config diff --git a/docs/advanced.md b/docs/advanced.md index 2710d6c1cb..76fc39c5a0 100644 --- a/docs/advanced.md +++ b/docs/advanced.md @@ -1,6 +1,6 @@ # Advanced -This page covers the specialized features layered on top of the core training stack: our custom model implementations (with EP for MoE 
families and CP for long-context training), multimodal training, LoRA training, multi-tenant training, and the small-scale MoE testing workflow used during architecture work. +This page covers the specialized features layered on top of the core training stack: our custom model implementations (with EP for MoE families and CP for long-context training), multimodal training, LoRA training, and multi-tenant training. For developer-side workflows (adding new model architectures, debugging modeling code at small scale), see [Development](development.md). ## Table of Contents @@ -13,7 +13,6 @@ This page covers the specialized features layered on top of the core training st - [Limitations](#limitations) - [LoRA training](#lora-training) - [Multi-tenant training](#multi-tenant-training) -- [Testing MoE at small scale](#testing-moe-at-small-scale) ## Custom modeling @@ -120,75 +119,3 @@ LoRA pairs naturally with [multi-tenant training](#multi-tenant-training) — ea ## Multi-tenant training Multi-tenant training lets a single trainer + inference deployment serve many concurrent LoRA "tenants" — each a fully isolated run with its own orchestrator, LoRA adapter, optimizer, scheduler, checkpoints, and progress tracking — sharing the same backbone weights and the same vLLM server. This is the topology behind hosted training on the [Prime Intellect platform (Lab)](https://app.primeintellect.ai). The trainer-side implementation is the `MultiRunManager` singleton, enabled by setting `trainer.max_concurrent_runs > 1`. For the full API surface, see [`src/prime_rl/trainer/runs/`](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/src/prime_rl/trainer/runs). - -## Testing MoE at small scale - -When working on MoE architectures (GLM-4, Kimi, etc.), you can't iterate on a 100B+ model locally. The workflow below builds a ~0.5B model with the same architecture, warms it up with SFT, and runs RL — all on 1–2 GPUs. 
The goal is catching bugs in modeling code, state-dict conversions, and pipeline integration before scaling. - -### Step 1: build and verify a mini model - -```bash -uv run python scripts/mini_moe.py --arch glm4_moe --output-dir ./mini-glm-moe -``` - -This creates a ~543M parameter GLM-4 MoE (1024 hidden, 24 layers, 8 experts) with random weights, copies the tokenizer from the original GLM-4 model, and verifies the HF↔PrimeRL roundtrip is lossless. To re-verify after a modeling-code change without re-creating the model: - -```bash -uv run python scripts/mini_moe.py --arch glm4_moe --output-dir ./mini-glm-moe --verify-only -``` - -### Step 2: SFT warmup - -Use the shipped debug MoE SFT config with reverse-text data: - -```bash -uv run sft @ configs/debug/moe/sft/train.toml \ - --model.name ./mini-glm-moe \ - --data.name PrimeIntellect/Reverse-Text-SFT \ - --data.type null \ - --max_steps 200 \ - --optim.lr 1e-4 \ - --ckpt.weights -``` - -Loss drops from ~12 to ~2.5. The output won't be coherent, but the model now has a non-trivial distribution so KL divergence becomes meaningful in RL. A pre-built SFT'd checkpoint lives at [samsja/mini-glm-moe](https://huggingface.co/samsja/mini-glm-moe). - -### Step 3: RL on reverse-text - -```bash -uv run rl @ configs/ci/integration/reverse_text_moe/start.toml \ - --model.name samsja/mini-glm-moe \ - --trainer.model.impl custom \ - --inference.gpu-memory-utilization 0.7 \ - --inference.model.max-model-len 2048 -``` - -What to look for: - -- **No crashes.** Validates the full inference + orchestrator + trainer pipeline end-to-end. -- **Finite, non-zero KL.** Confirms the reference distribution is meaningful. -- **Loss reasonable.** Not NaN, not stuck. - -Don't expect reward to climb meaningfully in 20 steps on a random model. - -### Adding a new architecture - -To add (e.g.) Kimi 2.5: - -1. Add the modeling code under `src/prime_rl/trainer/models//`. -2. 
Add a preset to `scripts/mini_moe.py` with the config class, small dimensions, HF + PrimeRL model classes, and tokenizer source: - -```python -ARCH_PRESETS = { - "glm4_moe": { - "config_class": Glm4MoeConfig, - "config_kwargs": dict(hidden_size=1024, num_hidden_layers=24, n_routed_experts=8, ...), - "hf_model_class": HFGlm4MoeForCausalLM, - "prime_model_class": PrimeRLGlm4MoeForCausalLM, - "tokenizer_source": "THUDM/GLM-4-9B-0414", - }, - # add your arch here -} -``` - -3. Run the three steps above with `--arch `. diff --git a/docs/development.md b/docs/development.md new file mode 100644 index 0000000000..9099d8b1e7 --- /dev/null +++ b/docs/development.md @@ -0,0 +1,83 @@ +# Development + +This page covers workflows for developing on `prime-rl` itself — adding new model architectures, debugging modeling code, and the small-scale tooling we use to iterate on MoE families without booting up a 100B+ run. + +## Table of Contents + +- [Testing MoE at small scale](#testing-moe-at-small-scale) + - [Step 1: build and verify a mini model](#step-1-build-and-verify-a-mini-model) + - [Step 2: SFT warmup](#step-2-sft-warmup) + - [Step 3: RL on reverse-text](#step-3-rl-on-reverse-text) + - [Adding a new architecture](#adding-a-new-architecture) + +## Testing MoE at small scale + +When working on MoE architectures (GLM-4, Kimi, etc.), you can't iterate on a 100B+ model locally. The workflow below builds a ~0.5B model with the same architecture, warms it up with SFT, and runs RL — all on 1–2 GPUs. The goal is catching bugs in modeling code, state-dict conversions, and pipeline integration before scaling. + +### Step 1: build and verify a mini model + +```bash +uv run python scripts/mini_moe.py --arch glm4_moe --output-dir ./mini-glm-moe +``` + +This creates a ~543M parameter GLM-4 MoE (1024 hidden, 24 layers, 8 experts) with random weights, copies the tokenizer from the original GLM-4 model, and verifies the HF↔PrimeRL roundtrip is lossless. 
To re-verify after a modeling-code change without re-creating the model: + +```bash +uv run python scripts/mini_moe.py --arch glm4_moe --output-dir ./mini-glm-moe --verify-only +``` + +### Step 2: SFT warmup + +Use the shipped debug MoE SFT config with reverse-text data: + +```bash +uv run sft @ configs/debug/moe/sft/train.toml \ + --model.name ./mini-glm-moe \ + --data.name PrimeIntellect/Reverse-Text-SFT \ + --data.type null \ + --max_steps 200 \ + --optim.lr 1e-4 \ + --ckpt.weights +``` + +Loss drops from ~12 to ~2.5. The output won't be coherent, but the model now has a non-trivial distribution so KL divergence becomes meaningful in RL. A pre-built SFT'd checkpoint lives at [samsja/mini-glm-moe](https://huggingface.co/samsja/mini-glm-moe). + +### Step 3: RL on reverse-text + +```bash +uv run rl @ configs/ci/integration/reverse_text_moe/start.toml \ + --model.name samsja/mini-glm-moe \ + --trainer.model.impl custom \ + --inference.gpu-memory-utilization 0.7 \ + --inference.model.max-model-len 2048 +``` + +What to look for: + +- **No crashes.** Validates the full inference + orchestrator + trainer pipeline end-to-end. +- **Finite, non-zero KL.** Confirms the reference distribution is meaningful. +- **Loss reasonable.** Not NaN, not stuck. + +Don't expect reward to climb meaningfully in 20 steps on a random model. + +### Adding a new architecture + +To add (e.g.) Kimi 2.5: + +1. Add the modeling code under `src/prime_rl/trainer/models//`. +2. Add a preset to `scripts/mini_moe.py` with the config class, small dimensions, HF + PrimeRL model classes, and tokenizer source: + +```python +ARCH_PRESETS = { + "glm4_moe": { + "config_class": Glm4MoeConfig, + "config_kwargs": dict(hidden_size=1024, num_hidden_layers=24, n_routed_experts=8, ...), + "hf_model_class": HFGlm4MoeForCausalLM, + "prime_model_class": PrimeRLGlm4MoeForCausalLM, + "tokenizer_source": "THUDM/GLM-4-9B-0414", + }, + # add your arch here +} +``` + +3. Run the three steps above with `--arch `. 
diff --git a/docs/mint.json b/docs/mint.json index 3fed8a01df..3d8213c2db 100644 --- a/docs/mint.json +++ b/docs/mint.json @@ -10,6 +10,7 @@ "scaling", "algorithms", "advanced", + "development", "faqs", "reference" ] diff --git a/docs/overview.md b/docs/overview.md index 26af7bb104..3f0f18f0e0 100644 --- a/docs/overview.md +++ b/docs/overview.md @@ -42,6 +42,7 @@ The `rl` entrypoint reads `examples/reverse_text/rl.toml`, splits it into per-pr - **[Training](training.md)** — End-to-end recipes for RL, SFT, and evals; checkpointing and resume; observability (logs, W&B, Prometheus, platform monitoring); rules of thumb and common issues. - **[Scaling](scaling.md)** — Single-GPU through 1000+ GPU; FSDP / EP / CP knobs; SLURM and Kubernetes guides; disaggregated prefill/decode inference; benchmarking. - **[Algorithms](algorithms.md)** — Async / off-policy semantics; the default loss; built-in and custom losses, advantages, and filters; multi-turn trajectory merging. -- **[Advanced](advanced.md)** — Custom modeling (EP backends, custom impls); multimodal training; LoRA + multi-tenant training; small-scale MoE testing. +- **[Advanced](advanced.md)** — Custom modeling (EP backends, custom impls); multimodal training; LoRA + multi-tenant training. +- **[Development](development.md)** — Adding a new model architecture, debugging modeling code at small scale. - **[Reference](reference.md)** — Auto-generated field-by-field reference for every entrypoint config. - **[FAQs](faqs.md)** — Quick answers to recurring questions. 
From 6a98d2e5a136a2fd6391da792b7f88659a1374a3 Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Mon, 25 May 2026 23:06:05 +0000 Subject: [PATCH 39/66] docs(algorithms): fold Renderers into Multi-turn trajectories MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Renderers existed as a top-level section because they were added in a follow-up, but conceptually they're the mechanism that makes best-effort interleaving safe — splitting them apart forced a forward-reference and duplicated the "exact-prefix invariant" framing. - Demote "## Renderers" to "### Renderers" inside "## Multi-turn trajectories", placed between Best-effort interleaving and Discontinuous trajectories. - Move the Qwen3 thinking-stripping example from Best-effort interleaving into Renderers — it's the failure case the renderer fixes, so it reads better adjacent to the family list / config block. - Drop the "Workaround: use a chat template that preserves thinking" trailer; the patched-checkpoint workaround for SFT is already documented in training.md and isn't relevant in the orchestrator context (use_renderer defaults to true). - Open Multi-turn trajectories with a forward-link to Renderers so the reader knows the safety mechanism is coming. - TOC updated; both #multi-turn-trajectories and #renderers anchors preserved (only their nesting changes), so the existing cross-links from training.md and faqs.md keep working.
Co-authored-by: Cursor --- docs/algorithms.md | 36 +++++++++++++++++------------------- 1 file changed, 17 insertions(+), 19 deletions(-) diff --git a/docs/algorithms.md b/docs/algorithms.md index 52e37fa779..adcd3ada36 100644 --- a/docs/algorithms.md +++ b/docs/algorithms.md @@ -19,8 +19,8 @@ This page covers the math and the configurable algorithmic components: how off-p - [Multi-turn trajectories](#multi-turn-trajectories) - [Extension property](#extension-property) - [Best-effort interleaving](#best-effort-interleaving) + - [Renderers](#renderers) - [Discontinuous trajectories](#discontinuous-trajectories) -- [Renderers](#renderers) ## Async / off-policy training @@ -212,7 +212,7 @@ Filtered rollouts still appear in W&B distributions, just not in the trainer bat ## Multi-turn trajectories -Multi-turn rollouts (tool use, browser environments, long conversations) used to be stitched into a single fake "single-turn" sample, which silently corrupted the importance ratio when chat templates didn't roundtrip. Since [verifiers v0.1.8](https://github.com/PrimeIntellect-ai/verifiers/releases/tag/v0.1.8), `prime-rl` records each LLM request/response as an independent **trajectory step** and merges them at training time using best-effort interleaving. +Multi-turn rollouts (tool use, browser environments, long conversations) used to be stitched into a single fake "single-turn" sample, which silently corrupted the importance ratio when chat templates didn't roundtrip. Since [verifiers v0.1.8](https://github.com/PrimeIntellect-ai/verifiers/releases/tag/v0.1.8), `prime-rl` records each LLM request/response as an independent **trajectory step** and merges them at training time using best-effort interleaving — with [renderers](#renderers) as the mechanism that keeps the merge safe by construction. 
### Extension property @@ -242,7 +242,17 @@ result: 2 training samples instead of 5 The orchestrator enforces an **exact prefix invariant**: the prompt at turn $t$ must be the exact concatenation of prior messages exactly as the LLM originally generated them. If turn 2's prompt is `U1, A1', U2` while `A1' ≠ A1`, the orchestrator can't safely merge — either choice produces logprob drift between trainer and inference. Starting a fresh sample is the only correct behavior, so that's what happens. -A common source of breakage is models like Qwen3 whose chat templates strip past `` blocks across user turns: +### Renderers + +Best-effort interleaving works because the renderer guarantees the exact-prefix invariant *by construction* — it never re-renders prior turns, so it can't lose tokens to chat-template normalization, BPE retokenization drift, or thinking stripping. A renderer turns a model's chat template into a Python object that can: + +- `render_ids(messages)` — tokenize messages to ids the inference engine accepts. +- `parse_response(completion_ids)` — recover structured `(content, reasoning_content, tool_calls)` from sampled ids. +- `bridge_to_next_turn(prev_prompt_ids, prev_completion_ids, new_messages)` — extend the previous turn's tokens verbatim with the new environment turn, instead of re-rendering history. + +When `bridge_to_next_turn` succeeds, the trainer sees the exact token stream the sampler produced; when it can't be proven safe (e.g. the renderer is `DefaultRenderer` and the template's stop sequence is unknown), it returns `None` and the orchestrator falls back to a full re-render — which triggers the new-sample fallback above. 
+ +A common source of breakage in the absence of a hand-coded renderer is models like Qwen3 whose chat templates strip past `` blocks across user turns: ```python from transformers import AutoTokenizer @@ -261,22 +271,6 @@ tok.apply_chat_template(messages, tokenize=False) # (the R1 from turn 2 is gone) ``` -Workaround: use a chat template that preserves thinking — we ship patched versions for many models, e.g. `PrimeIntellect/Qwen3-0.6B`. - -### Discontinuous trajectories - -Some envs are discontinuous by design — e.g. a main agent delegating to a sub-agent and getting back only a summarized result, not the sub-agent's whole conversation. Best-effort interleaving handles this naturally: each agent's contiguous turns merge, the handoff starts a new sample. The trainer never sees fabricated extension where there is none. - -## Renderers - -Best-effort interleaving only works because the renderer guarantees the exact-prefix invariant *by construction* — it never re-renders prior turns, so it can't lose tokens to chat-template normalization, BPE retokenization drift, or thinking stripping. A renderer turns a model's chat template into a Python object that can: - -- `render_ids(messages)` — tokenize messages to ids the inference engine accepts. -- `parse_response(completion_ids)` — recover structured `(content, reasoning_content, tool_calls)` from sampled ids. -- `bridge_to_next_turn(prev_prompt_ids, prev_completion_ids, new_messages)` — extend the previous turn's tokens verbatim with the new environment turn, instead of re-rendering history. - -When `bridge_to_next_turn` succeeds, the trainer sees the exact token stream the sampler produced; when it can't be proven safe (e.g. the model's renderer is `DefaultRenderer` and the template's stop sequence is unknown), it returns `None` and the orchestrator falls back to a full re-render — which is what triggers the new-sample fallback documented above. 
- Hand-coded renderers ship for `qwen3`, `qwen3-vl`, `qwen3.5`, `glm5`, `glm4.5`, `minimax-m2`, `deepseek-v3`, `kimi-k2`, `kimi-k2.5`, `nemotron-3`, `gpt-oss`; anything else falls back to `DefaultRenderer` (a generic `apply_chat_template` wrapper). Pick one via: ```toml @@ -285,3 +279,7 @@ name = "auto" # detect from tokenizer; pass an explicit name for fine-tunes ``` For the full design rationale (failure modes ruled out, empirical token-identity comparison against `apply_chat_template`, when to write a hand-coded renderer), see [the renderers writeup on the Prime Intellect blog](https://www.primeintellect.ai/blog/renderers) — the canonical reference. + +### Discontinuous trajectories + +Some envs are discontinuous by design — e.g. a main agent delegating to a sub-agent and getting back only a summarized result, not the sub-agent's whole conversation. Best-effort interleaving handles this naturally: each agent's contiguous turns merge, the handoff starts a new sample. The trainer never sees fabricated extension where there is none. From 858d4287a31b1adc7f1a46d6866dc1d0a3f53ef7 Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Mon, 25 May 2026 23:07:52 +0000 Subject: [PATCH 40/66] docs(algorithms): drop max_async_level tuning + reframe as fixed one-step MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit max_async_level is being deprecated as a user-facing knob (hardcoded to 1; matching reframe in docs/async.md on feat/deprecate-max-async- level). Update algorithms.md so the long-form treatment matches. - Drop the "### Tuning max_async_level" subsection (the k=0/1/2/>=3 table) and the NCCL-needs-max_async_level=1 line that follows it — both become vacuous when k is fixed at 1. - Reword the Async / off-policy training intro to describe the one-step overlap directly instead of "up to k steps where k = max_async_level". 
- Step semantics: rho_inference is now pi_{max(0, n-1)}, with prose "inference is exactly one step behind the trainer" replacing the generic "gap is at most k steps". - Drop the Tuning entry from the page TOC. Other docs (overview.md intro, training.md rule of thumb, faqs.md two entries, scaling.md NCCL-example comment) still mention max_async_level and are now stale; will clean those up in the next turn unless flagged otherwise. Co-authored-by: Cursor --- docs/algorithms.md | 18 +++--------------- 1 file changed, 3 insertions(+), 15 deletions(-) diff --git a/docs/algorithms.md b/docs/algorithms.md index adcd3ada36..88f532791a 100644 --- a/docs/algorithms.md +++ b/docs/algorithms.md @@ -7,7 +7,6 @@ This page covers the math and the configurable algorithmic components: how off-p - [Async / off-policy training](#async--off-policy-training) - [Step semantics](#step-semantics) - [The default loss](#the-default-loss) - - [Tuning `max_async_level`](#tuning-max_async_level) - [Loss](#loss) - [Default loss](#default-loss) - [Custom loss](#custom-loss) @@ -24,7 +23,7 @@ This page covers the math and the configurable algorithmic components: how off-p ## Async / off-policy training -`prime-rl` is asynchronous by default. Inference is allowed to generate rollouts using a stale policy that is up to `k` steps behind the trainer, where `k = max_async_level`. Setting `k = 1` (the default) with matched trainer and inference step times produces fully-overlapped pipeline parallelism — neither side ever idles. Bump `k` higher when the weight-broadcast latency exceeds a single trainer step (e.g. cross-WAN decentralized runs) and the extra off-policy drift is acceptable. +`prime-rl` is asynchronous by default. The trainer and inference always run one step overlapped: while the trainer is producing $\pi_n$ from rollouts at step $n$, inference is already generating the rollouts for step $n+1$ using $\pi_{n-1}$. 
With matched trainer and inference step times this produces fully-overlapped pipeline parallelism — neither side ever idles. ![Two-Step Off-Policy Training](assets/two-step-off-policy.png) @@ -33,9 +32,9 @@ This page covers the math and the configurable algorithmic components: how off-p At step $n = 1, 2, 3, \dots$: - **Trainer** produces policy $\pi_n$ with weights $\theta_n$ from rollouts $(x_n, y_n)$. -- **Inference** produces rollouts $(x_n, y_n)$ from policy $\pi_{\max(0,\,n - k)}$. +- **Inference** produces rollouts $(x_n, y_n)$ from policy $\pi_{\max(0,\,n-1)}$. -So at step $n$ the gap between the policy being trained and the policy that generated the data is at most $k$ steps. Step indices are 0-indexed so the bound holds at startup. +Step indices are 0-indexed so the gap holds at startup — inference is exactly one step behind the trainer. ### The default loss @@ -71,17 +70,6 @@ The knobs (under `[trainer.loss]` with `type = "default"`): | `adv_tau` | 1.0 | Temperature on the advantage term. Set to 0 for pure distillation (no RL signal). | | `kl_tau` | 1e-3 | Temperature on the KL regularizer. Set to 0 to disable. | -### Tuning `max_async_level` - -| `k` | Behavior | -|---|---| -| `0` | Fully synchronous — trainer and inference alternate. Lowest off-policy drift, lowest throughput. | -| `1` (default) | Pipelined — inference for step $n+1$ runs concurrently with trainer step $n$. Throughput-optimal when step times match. | -| `2` | Two-step async. Absorbs longer weight-broadcast latency, e.g. cross-WAN decentralized runs. | -| `≥ 3` | Increasing off-policy drift. Use only with confirmed throughput gain; watch `mismatch_kl/all/mean`. | - -NCCL weight broadcast (`weight_broadcast.type = "nccl"`) requires `max_async_level = 1` — the validator will refuse otherwise. 
- ## Loss ### Default loss From 0b7daac2f337424df9b35b2efba0059e078d61ab Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Mon, 25 May 2026 23:10:17 +0000 Subject: [PATCH 41/66] docs(development): add Test suite + Pre-commit hooks sections MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Pull in the testing- and contributor-workflow content from README and the GitHub Actions configs so contributors don't have to dig through .github/workflows/ to figure out what runs where. New sections in development.md: - Test suite - Layout: tests/unit/, tests/integration/, tests/nightly/ — what each tier is for, with the actual file names contributors will encounter. - Running tests locally: pytest one-liners (everything, unit-only, integration-only, -m "not gpu", -m gpu, single file). - CI workflows: a 3-row table covering cpu_tests.yaml, gpu_tests.yaml (matrix list pulled verbatim from the workflow), and nightly_tests.yaml — including the trigger conditions (cpu = always, gpu = non-draft, nightly = scheduled + workflow_dispatch) and the runners (ubuntu-latest, vm/4xa6000, research-cluster). - Markers: gpu + slow, both declared with --strict-markers in pyproject.toml. - Pre-commit hooks - uv run pre-commit install - Currently configured hooks: ruff check/format and docs-reference (regenerator that fails the commit if docs/reference.md would drift). Plus update the Development pitch in overview.md "Where to go next" and README.md docs index to mention the new scope. Co-authored-by: Cursor --- README.md | 2 +- docs/development.md | 59 ++++++++++++++++++++++++++++++++++++++++++++- docs/overview.md | 2 +- 3 files changed, 60 insertions(+), 3 deletions(-) diff --git a/README.md b/README.md index a8006998ae..803c942064 100644 --- a/README.md +++ b/README.md @@ -223,7 +223,7 @@ Check out the [docs](docs) directory for in-depth guides on how to use PRIME-RL. 
- [**Scaling**](docs/scaling.md) - Single-GPU through multi-node, FSDP/EP/CP, SLURM, Kubernetes, disaggregated inference, benchmarking - [**Algorithms**](docs/algorithms.md) - Async/off-policy training, the AIPO loss, advantage and filter plugins, trajectory merging - [**Advanced**](docs/advanced.md) - Custom modeling, multimodal training, LoRA, multi-tenant training -- [**Development**](docs/development.md) - Adding a new model architecture, small-scale MoE testing +- [**Development**](docs/development.md) - Test suite, pre-commit hooks, adding a new model architecture, small-scale MoE testing - [**FAQs**](docs/faqs.md) - Frequently-asked questions - [**Reference**](docs/reference.md) - Auto-generated field-by-field reference for every entrypoint config diff --git a/docs/development.md b/docs/development.md index 9099d8b1e7..a0b50eb41b 100644 --- a/docs/development.md +++ b/docs/development.md @@ -1,15 +1,72 @@ # Development -This page covers workflows for developing on `prime-rl` itself — adding new model architectures, debugging modeling code, and the small-scale tooling we use to iterate on MoE families without booting up a 100B+ run. +This page covers workflows for developing on `prime-rl` itself — running the test suite, contributing changes, adding new model architectures, and the small-scale tooling we use to iterate on MoE families without booting up a 100B+ run. 
## Table of Contents +- [Test suite](#test-suite) + - [Layout](#layout) + - [Running tests locally](#running-tests-locally) + - [CI workflows](#ci-workflows) + - [Markers](#markers) +- [Pre-commit hooks](#pre-commit-hooks) - [Testing MoE at small scale](#testing-moe-at-small-scale) - [Step 1: build and verify a mini model](#step-1-build-and-verify-a-mini-model) - [Step 2: SFT warmup](#step-2-sft-warmup) - [Step 3: RL on reverse-text](#step-3-rl-on-reverse-text) - [Adding a new architecture](#adding-a-new-architecture) +## Test suite + +The test suite is split into three tiers, each with its own CI workflow. + +### Layout + +- **`tests/unit/`** — fast-running, hermetic tests for isolated logic: config parsing and validation, advantage / loss / scheduler / packer math, individual dataset paths, model-conversion roundtrips, etc. Tests that need a GPU are tagged with the `gpu` marker. +- **`tests/integration/`** — full-stack RL/SFT runs on a tiny model end-to-end through inference + orchestrator + trainer (e.g. `test_reverse_text.py`, `test_reverse_text_lora.py`, `test_reverse_text_moe.py`, `test_reverse_text_multi_run.py`, `test_alphabet_sort.py`). +- **`tests/nightly/`** — long-running training runs against shipped configs and real environments (`hendrycks_sanity`, `acereason_math`, `multimodal_color_codeword`, `wiki_search`, `wordle`, …). Each runs to completion on the research cluster with a 24h timeout. 
+ +### Running tests locally + +```bash +uv run pytest -v # everything +uv run pytest tests/unit -v # unit only +uv run pytest tests/integration -v # integration only +uv run pytest -v -m "not gpu" # CPU-only subset (mirrors CPU CI) +uv run pytest -v -m gpu # GPU-only subset +uv run pytest tests/integration/test_reverse_text.py -vvs # one specific scenario +``` + +### CI workflows + +| Workflow | Trigger | What runs | Where | +|---|---|---|---| +| [`cpu_tests.yaml`](https://github.com/PrimeIntellect-ai/prime-rl/blob/main/.github/workflows/cpu_tests.yaml) | every PR + push to `main` | `pytest tests/unit -m "not gpu"`, plus a slim-wheel install check that `prime-rl-configs` imports cleanly without heavy deps (no torch / vllm / transformers / wandb / verifiers / datasets / liger / loguru in `sys.modules`) | `ubuntu-latest` | +| [`gpu_tests.yaml`](https://github.com/PrimeIntellect-ai/prime-rl/blob/main/.github/workflows/gpu_tests.yaml) | every non-draft PR + push to `main` | `pytest tests/unit -m gpu`, plus a matrix of named integration scenarios (`reverse_text`, `reverse_text_sft`, `reverse_text_lora`, `reverse_text_moe`, `reverse_text_multi_run`, `reverse_text_rl_opd`, `reverse_text_rl_sft`, `reverse_text_sft_lora`, `alphabet_sort`, `benchmark_regression`) | self-hosted GPU runners (`vm`, `4xa6000`) | +| [`nightly_tests.yaml`](https://github.com/PrimeIntellect-ai/prime-rl/blob/main/.github/workflows/nightly_tests.yaml) | 03:00 PST daily + manual `workflow_dispatch` (single-file filter optional) | every file in `tests/nightly/`, one matrix job per file | `research-cluster` | + +The GPU + Nightly workflows skip drafts — open the PR as **Draft** until you're ready to consume CI compute, then mark it ready for review to trigger the GPU matrix. + +### Markers + +Two pytest markers are declared in `pyproject.toml` (`addopts = "--strict-markers"`): + +- `gpu` — gate a test that needs CUDA. CPU CI uses `-m "not gpu"`; the GPU unit job uses `-m gpu`. 
+- `slow` — gate a test that's expensive enough you'd usually skip it locally. Deselect with `-m "not slow"`. + +## Pre-commit hooks + +Install the [pre-commit](https://pre-commit.com) hooks before your first commit so ruff and the docs-reference regenerator run automatically: + +```bash +uv run pre-commit install +``` + +The configured hooks: + +- **`ruff` check + format** on staged Python files. +- **`docs-reference`** — re-runs [`scripts/generate_docs_reference.py`](https://github.com/PrimeIntellect-ai/prime-rl/blob/main/scripts/generate_docs_reference.py) whenever a config class or the generator itself is staged. If `docs/reference.md` would change, the commit fails so you can re-stage the regenerated file. + ## Testing MoE at small scale When working on MoE architectures (GLM-4, Kimi, etc.), you can't iterate on a 100B+ model locally. The workflow below builds a ~0.5B model with the same architecture, warms it up with SFT, and runs RL — all on 1–2 GPUs. The goal is catching bugs in modeling code, state-dict conversions, and pipeline integration before scaling. diff --git a/docs/overview.md b/docs/overview.md index 3f0f18f0e0..bbbbcf6719 100644 --- a/docs/overview.md +++ b/docs/overview.md @@ -43,6 +43,6 @@ The `rl` entrypoint reads `examples/reverse_text/rl.toml`, splits it into per-pr - **[Scaling](scaling.md)** — Single-GPU through 1000+ GPU; FSDP / EP / CP knobs; SLURM and Kubernetes guides; disaggregated prefill/decode inference; benchmarking. - **[Algorithms](algorithms.md)** — Async / off-policy semantics; the default loss; built-in and custom losses, advantages, and filters; multi-turn trajectory merging. - **[Advanced](advanced.md)** — Custom modeling (EP backends, custom impls); multimodal training; LoRA + multi-tenant training. -- **[Development](development.md)** — Adding a new model architecture, debugging modeling code at small scale. 
+- **[Development](development.md)** — Test suite (unit / integration / nightly), pre-commit hooks, adding a new model architecture. - **[Reference](reference.md)** — Auto-generated field-by-field reference for every entrypoint config. - **[FAQs](faqs.md)** — Quick answers to recurring questions. From 414e3e95b620bf9023c8dc351c3e8927aee127f0 Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Mon, 25 May 2026 23:11:31 +0000 Subject: [PATCH 42/66] docs(algorithms): fold Length penalties into Default advantage MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Length penalties are configured under [orchestrator.length_penalty] and layer on top of *any* advantage function — they're conceptually not a standalone advantage variant. Move them into Default advantage where readers will see the option while reading about advantages. Drive-by: clean up the parenthetical hint in Default advantage. The old version listed "length penalties tied to turn count" as a reason to write a custom advantage, which conflicts with the actual turn-count length penalty being a built-in. New wording points length-penalty users at the now-adjacent built-in section, and points custom-advantage users at trajectory-metadata-driven shaping (sub-agents, relative-rank, …). TOC entry for #length-penalties dropped; no external doc links to that anchor. 
Co-authored-by: Cursor --- docs/algorithms.md | 19 ++++++++----------- 1 file changed, 8 insertions(+), 11 deletions(-) diff --git a/docs/algorithms.md b/docs/algorithms.md index 88f532791a..ef9cc23a2a 100644 --- a/docs/algorithms.md +++ b/docs/algorithms.md @@ -13,7 +13,6 @@ This page covers the math and the configurable algorithmic components: how off-p - [Advantage](#advantage) - [Default advantage](#default-advantage) - [Custom advantage](#custom-advantage) - - [Length penalties](#length-penalties) - [Filters](#filters) - [Multi-turn trajectories](#multi-turn-trajectories) - [Extension property](#extension-property) @@ -139,7 +138,14 @@ Anything you put in `metrics` is averaged across sequences and logged with the o The default advantage is per-group reward minus per-group baseline (DR-GRPO without std normalization). For each prompt's group of `group_size` rollouts, every token in rollout $i$ receives advantage $s_i - \bar{s}$ where $\bar{s}$ is the group mean. -This is intentionally simple — it does the right thing for most envs. Switch to a custom function when you need group-aware shaping (e.g. length penalties tied to turn count, sub-agent rollouts, or relative-rank shaping). +This is intentionally simple — it does the right thing for most envs. Switch to a [custom advantage](#custom-advantage) when you need group-aware shaping that depends on trajectory metadata (sub-agent rollouts, relative-rank shaping, …). + +Two built-in **length penalties** can be layered on top of any advantage to discourage rambling: + +- `[orchestrator.length_penalty] type = "tokens"` — penalizes long completions in tokens, with configurable target and slope. +- `[orchestrator.length_penalty] type = "turns"` — penalizes long multi-turn rollouts by turn count. + +See [Reference § orchestrator length penalties](reference.md#orchestrator) for the fields. 
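As a rough illustration of the default advantage described in this hunk (per-group reward minus the group mean, with no std normalization), the computation can be sketched as follows. The function names are hypothetical and chosen for illustration only; they are not the `prime-rl` API, and length penalties are deliberately left out.

```python
from statistics import fmean


def group_mean_advantages(rewards: list[float]) -> list[float]:
    """Return s_i - mean(s) for each rollout in one prompt's group.

    Hypothetical helper: mirrors the prose description only, not the
    actual prime-rl implementation.
    """
    baseline = fmean(rewards)  # group mean, written s-bar in the docs
    return [reward - baseline for reward in rewards]


def per_token_advantages(
    rewards: list[float], completion_lengths: list[int]
) -> list[list[float]]:
    """Broadcast each rollout's scalar advantage to all of its tokens."""
    advantages = group_mean_advantages(rewards)
    return [[adv] * length for adv, length in zip(advantages, completion_lengths)]


if __name__ == "__main__":
    # A group of four rollouts for one prompt, two of which succeeded.
    print(group_mean_advantages([1.0, 0.0, 0.0, 1.0]))
    print(per_token_advantages([1.0, 0.0], [2, 3]))
```

Because the baseline is the group mean, the advantages of a group always sum to zero, and a group whose rollouts all received the same reward gets zero advantage everywhere.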
### Custom advantage @@ -166,15 +172,6 @@ kwargs = { eps = 1e-8 } `AdvantageInputs.rollouts` is a list of `verifiers.RolloutOutput`, so you have access to the full rollout (turns, tool calls, custom metadata) — not just the reward. Use this for anything reward-shaping-like that needs trajectory context. -### Length penalties - -Two built-in length penalties can be layered on top of any advantage: - -- `[orchestrator.length_penalty] type = "tokens"` — penalizes long completions in tokens, with configurable target and slope. -- `[orchestrator.length_penalty] type = "turns"` — penalizes long multi-turn rollouts by turn count. - -See [Reference § orchestrator length penalties](reference.md#orchestrator) for the fields. - ## Filters Filters drop rollouts between scoring and training. Built-ins (composable): From 00f2c8f8d986a9a00c6fafebb24b04f1a4928bee Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Mon, 25 May 2026 23:14:36 +0000 Subject: [PATCH 43/66] docs(advanced): add Difficulty pools + Online difficulty filtering MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Covers the two complementary buffer-side mechanisms for keeping the trainer batch high-signal. Verified each claim against src/prime_rl/orchestrator/buffer.py and packages/prime-rl-configs/ src/prime_rl/configs/orchestrator.py. - Difficulty pools (buffer.easy_threshold / hard_threshold + easy_fraction / hard_fraction): per-problem running-average reward is compared to the thresholds; problems hitting either bound move to easy/hard pool and stop being sampled. Pool assignments persist across checkpoints (easy_examples.jsonl / hard_examples.jsonl); *_fraction lifts a fraction of pooled problems back into normal on resume / start. - Online difficulty filtering (buffer.online_difficulty_filtering bool): groups whose avg reward is exactly 0.0 or 1.0 are dropped from the buffer because their within-group advantage is zero (DR-GRPO produces no signal). 
Counted under filtered_rollouts/ {env}/{easy,hard} for visibility. - The tradeoff bit the user asked for explicitly: with ODF on each trainer step's effective batch is dense + predictable, but the orchestrator pays for the throw-away rollouts and may need a higher oversampling_factor; if time/wait_for_batch is already high on the trainer, ODF can starve the loop. - ODF is orthogonal to the pools — ODF reacts to the current group's reward distribution, pools track the running per-problem average. Configs often use both. Plus a one-line page-intro update + TOC entries. Co-authored-by: Cursor --- docs/advanced.md | 48 +++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 47 insertions(+), 1 deletion(-) diff --git a/docs/advanced.md b/docs/advanced.md index 76fc39c5a0..396c5ceaf6 100644 --- a/docs/advanced.md +++ b/docs/advanced.md @@ -1,6 +1,6 @@ # Advanced -This page covers the specialized features layered on top of the core training stack: our custom model implementations (with EP for MoE families and CP for long-context training), multimodal training, LoRA training, and multi-tenant training. For developer-side workflows (adding new model architectures, debugging modeling code at small scale), see [Development](development.md). +This page covers the specialized features layered on top of the core training stack: our custom model implementations (with EP for MoE families and CP for long-context training), multimodal training, LoRA training, multi-tenant training, and the buffer-side difficulty controls. For developer-side workflows (adding new model architectures, debugging modeling code at small scale), see [Development](development.md). 
## Table of Contents @@ -13,6 +13,9 @@ This page covers the specialized features layered on top of the core training st - [Limitations](#limitations) - [LoRA training](#lora-training) - [Multi-tenant training](#multi-tenant-training) +- [Difficulty pools and online filtering](#difficulty-pools-and-online-filtering) + - [Difficulty pools](#difficulty-pools) + - [Online difficulty filtering (ODF)](#online-difficulty-filtering-odf) ## Custom modeling @@ -119,3 +122,46 @@ LoRA pairs naturally with [multi-tenant training](#multi-tenant-training) — ea ## Multi-tenant training Multi-tenant training lets a single trainer + inference deployment serve many concurrent LoRA "tenants" — each a fully isolated run with its own orchestrator, LoRA adapter, optimizer, scheduler, checkpoints, and progress tracking — sharing the same backbone weights and the same vLLM server. This is the topology behind hosted training on the [Prime Intellect platform (Lab)](https://app.primeintellect.ai). The trainer-side implementation is the `MultiRunManager` singleton, enabled by setting `trainer.max_concurrent_runs > 1`. For the full API surface, see [`src/prime_rl/trainer/runs/`](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/src/prime_rl/trainer/runs). + +## Difficulty pools and online filtering + +Two complementary mechanisms keep the trainer batch high-signal: **difficulty pools** that gradually retire problems the model has solved or never solves, and **online difficulty filtering (ODF)** that drops collapsed-advantage groups from the current batch. + +### Difficulty pools + +After each rollout, the average reward across a problem's group is compared to two thresholds: + +- `buffer.easy_threshold` — at or above this, the problem moves into the `easy` pool and is no longer sampled. +- `buffer.hard_threshold` — at or below this, the problem moves into the `hard` pool and is no longer sampled. +- Otherwise the problem stays in `normal` and remains in the sampling rotation. 
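The three threshold rules above can be sketched as a small classifier. This is a minimal illustration under stated assumptions: `assign_pool` is a hypothetical name, the default thresholds are example values rather than the library defaults, and the real logic in `src/prime_rl/orchestrator/buffer.py` may differ in detail.

```python
def assign_pool(
    avg_reward: float,
    easy_threshold: float = 0.95,  # illustrative value, not a confirmed default
    hard_threshold: float = 0.05,  # illustrative value, not a confirmed default
) -> str:
    """Classify a problem by its group's average reward.

    Hypothetical helper that follows the prose rules: at or above the
    easy threshold -> "easy", at or below the hard threshold -> "hard",
    anything in between stays "normal".
    """
    if avg_reward >= easy_threshold:
        return "easy"
    if avg_reward <= hard_threshold:
        return "hard"
    return "normal"


if __name__ == "__main__":
    for reward in (1.0, 0.5, 0.0):
        print(reward, assign_pool(reward))
```

Only problems classified as `normal` would remain in the sampling rotation under this reading.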
+ +Pool assignments persist across checkpoints (`easy_examples.jsonl` / `hard_examples.jsonl` under each step's orchestrator checkpoint). When you resume — or want to broaden the curriculum mid-run — `buffer.easy_fraction` / `buffer.hard_fraction` randomly lift that fraction of pooled problems back into `normal` so they re-enter sampling. + +```toml +[orchestrator.buffer] +easy_threshold = 0.95 +hard_threshold = 0.05 +easy_fraction = 0.0 # default; bump on resume to bring some easy problems back +hard_fraction = 0.0 # default; bump on resume to bring some hard problems back +``` + +Watch `pool/{env}/{easy,normal,hard}` (current pool ratios) and `evicted_examples/{env}/{easy,hard}` (per-step eviction rate). + +### Online difficulty filtering (ODF) + +`buffer.online_difficulty_filtering = true` is a per-rollout filter on the way *into* the buffer: + +- Average reward across the group is **0.0** (every rollout failed) → drop the group, count under `filtered_rollouts/{env}/hard`. +- Average reward **1.0** (every rollout succeeded) → drop, count under `filtered_rollouts/{env}/easy`. +- Otherwise → into the buffer. + +These are exactly the groups whose within-group advantage collapses to zero — DR-GRPO produces no gradient signal for them, so the trainer would burn step time on tokens it can't learn from. + +```toml +[orchestrator.buffer] +online_difficulty_filtering = true +``` + +**Tradeoff: trainer stability vs. inference speed.** With ODF on, every rollout that reaches the trainer carries non-zero advantage — each trainer step's effective batch is predictable and the gradient signal is denser. The cost is paid on the inference side: rollouts get produced and then thrown away, so the orchestrator has to oversample to keep the trainer fed. If the orchestrator is your bottleneck (`time/wait_for_batch` high on the trainer), ODF can starve the loop. Bump `orchestrator.oversampling_factor` so inference produces enough groups per step to absorb the drops. 
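To make the ODF rule concrete, here is a minimal sketch of the drop decision, assuming rewards lie in the range 0 to 1. The names `odf_decision` and `filter_groups` are hypothetical and not part of `prime-rl`; the counters only mimic the `filtered_rollouts/{env}/{easy,hard}` metrics named above.

```python
from statistics import fmean


def odf_decision(group_rewards: list[float]) -> str:
    """Decide what happens to one group of rollouts for a single problem.

    Hypothetical helper: a group whose average reward is exactly 0.0 or
    1.0 has zero within-group advantage and is dropped.
    """
    avg = fmean(group_rewards)
    if avg == 0.0:
        return "drop_hard"  # every rollout failed
    if avg == 1.0:
        return "drop_easy"  # every rollout succeeded
    return "keep"


def filter_groups(groups: list[list[float]]) -> tuple[list[list[float]], dict[str, int]]:
    """Return the groups that reach the buffer plus per-reason drop counts."""
    kept: list[list[float]] = []
    dropped = {"easy": 0, "hard": 0}
    for rewards in groups:
        decision = odf_decision(rewards)
        if decision == "keep":
            kept.append(rewards)
        else:
            dropped[decision.removeprefix("drop_")] += 1
    return kept, dropped


if __name__ == "__main__":
    groups = [[1.0, 1.0, 1.0], [0.0, 0.0, 0.0], [1.0, 0.0, 1.0]]
    print(filter_groups(groups))
```

In this toy batch two of three groups are discarded, which is the situation where inference would need to produce more groups per step to keep the trainer fed.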
+ +ODF is orthogonal to the pools: ODF reacts to the *current* group's reward distribution, the pools track the *running* per-problem average. Many configs use both — ODF for per-step density, pools for long-horizon curriculum cleanup. From 8d61759bda98a7daffdebe9af72531170ead3f18 Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Mon, 25 May 2026 23:20:18 +0000 Subject: [PATCH 44/66] docs: relocate Difficulty pools + ODF from Advanced to Algorithms MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Both are buffer-side controls over what reaches the trainer — same conceptual category as Filters and the loss/advantage knobs. Algorithms is the right home; Advanced is for orthogonal feature sets (LoRA, multi-tenant, custom modeling, multimodal). Promote each to its own top-level section. - algorithms.md: new "## Difficulty pools" and "## Online difficulty filtering" sections, inserted between Filters and Multi-turn trajectories. TOC updated. - advanced.md: section + TOC entries + page-intro mention removed. Content unchanged from the previous commit on Advanced — same threshold / fraction / persistence claims, same ODF tradeoff explanation, same cross-link to oversampling_factor. ODF section back-links to Difficulty pools for the orthogonality note. Co-authored-by: Cursor --- docs/advanced.md | 48 +--------------------------------------------- docs/algorithms.md | 41 +++++++++++++++++++++++++++++++++++++++ 2 files changed, 42 insertions(+), 47 deletions(-) diff --git a/docs/advanced.md b/docs/advanced.md index 396c5ceaf6..76fc39c5a0 100644 --- a/docs/advanced.md +++ b/docs/advanced.md @@ -1,6 +1,6 @@ # Advanced -This page covers the specialized features layered on top of the core training stack: our custom model implementations (with EP for MoE families and CP for long-context training), multimodal training, LoRA training, multi-tenant training, and the buffer-side difficulty controls. 
For developer-side workflows (adding new model architectures, debugging modeling code at small scale), see [Development](development.md). +This page covers the specialized features layered on top of the core training stack: our custom model implementations (with EP for MoE families and CP for long-context training), multimodal training, LoRA training, and multi-tenant training. For developer-side workflows (adding new model architectures, debugging modeling code at small scale), see [Development](development.md). ## Table of Contents @@ -13,9 +13,6 @@ This page covers the specialized features layered on top of the core training st - [Limitations](#limitations) - [LoRA training](#lora-training) - [Multi-tenant training](#multi-tenant-training) -- [Difficulty pools and online filtering](#difficulty-pools-and-online-filtering) - - [Difficulty pools](#difficulty-pools) - - [Online difficulty filtering (ODF)](#online-difficulty-filtering-odf) ## Custom modeling @@ -122,46 +119,3 @@ LoRA pairs naturally with [multi-tenant training](#multi-tenant-training) — ea ## Multi-tenant training Multi-tenant training lets a single trainer + inference deployment serve many concurrent LoRA "tenants" — each a fully isolated run with its own orchestrator, LoRA adapter, optimizer, scheduler, checkpoints, and progress tracking — sharing the same backbone weights and the same vLLM server. This is the topology behind hosted training on the [Prime Intellect platform (Lab)](https://app.primeintellect.ai). The trainer-side implementation is the `MultiRunManager` singleton, enabled by setting `trainer.max_concurrent_runs > 1`. For the full API surface, see [`src/prime_rl/trainer/runs/`](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/src/prime_rl/trainer/runs). 
- -## Difficulty pools and online filtering - -Two complementary mechanisms keep the trainer batch high-signal: **difficulty pools** that gradually retire problems the model has solved or never solves, and **online difficulty filtering (ODF)** that drops collapsed-advantage groups from the current batch. - -### Difficulty pools - -After each rollout, the average reward across a problem's group is compared to two thresholds: - -- `buffer.easy_threshold` — at or above this, the problem moves into the `easy` pool and is no longer sampled. -- `buffer.hard_threshold` — at or below this, the problem moves into the `hard` pool and is no longer sampled. -- Otherwise the problem stays in `normal` and remains in the sampling rotation. - -Pool assignments persist across checkpoints (`easy_examples.jsonl` / `hard_examples.jsonl` under each step's orchestrator checkpoint). When you resume — or want to broaden the curriculum mid-run — `buffer.easy_fraction` / `buffer.hard_fraction` randomly lift that fraction of pooled problems back into `normal` so they re-enter sampling. - -```toml -[orchestrator.buffer] -easy_threshold = 0.95 -hard_threshold = 0.05 -easy_fraction = 0.0 # default; bump on resume to bring some easy problems back -hard_fraction = 0.0 # default; bump on resume to bring some hard problems back -``` - -Watch `pool/{env}/{easy,normal,hard}` (current pool ratios) and `evicted_examples/{env}/{easy,hard}` (per-step eviction rate). - -### Online difficulty filtering (ODF) - -`buffer.online_difficulty_filtering = true` is a per-rollout filter on the way *into* the buffer: - -- Average reward across the group is **0.0** (every rollout failed) → drop the group, count under `filtered_rollouts/{env}/hard`. -- Average reward **1.0** (every rollout succeeded) → drop, count under `filtered_rollouts/{env}/easy`. -- Otherwise → into the buffer. 
- -These are exactly the groups whose within-group advantage collapses to zero — DR-GRPO produces no gradient signal for them, so the trainer would burn step time on tokens it can't learn from. - -```toml -[orchestrator.buffer] -online_difficulty_filtering = true -``` - -**Tradeoff: trainer stability vs. inference speed.** With ODF on, every rollout that reaches the trainer carries non-zero advantage — each trainer step's effective batch is predictable and the gradient signal is denser. The cost is paid on the inference side: rollouts get produced and then thrown away, so the orchestrator has to oversample to keep the trainer fed. If the orchestrator is your bottleneck (`time/wait_for_batch` high on the trainer), ODF can starve the loop. Bump `orchestrator.oversampling_factor` so inference produces enough groups per step to absorb the drops. - -ODF is orthogonal to the pools: ODF reacts to the *current* group's reward distribution, the pools track the *running* per-problem average. Many configs use both — ODF for per-step density, pools for long-horizon curriculum cleanup. diff --git a/docs/algorithms.md b/docs/algorithms.md index ef9cc23a2a..71d1f11d73 100644 --- a/docs/algorithms.md +++ b/docs/algorithms.md @@ -14,6 +14,8 @@ This page covers the math and the configurable algorithmic components: how off-p - [Default advantage](#default-advantage) - [Custom advantage](#custom-advantage) - [Filters](#filters) +- [Difficulty pools](#difficulty-pools) +- [Online difficulty filtering](#online-difficulty-filtering) - [Multi-turn trajectories](#multi-turn-trajectories) - [Extension property](#extension-property) - [Best-effort interleaving](#best-effort-interleaving) @@ -195,6 +197,45 @@ threshold = 0.4 Filtered rollouts still appear in W&B distributions, just not in the trainer batch — useful for spotting whether filtering is doing its job. +## Difficulty pools + +Difficulty pools gradually retire problems the model has solved or never solves. 
After each rollout, the average reward across a problem's group is compared to two thresholds: + +- `buffer.easy_threshold` — at or above this, the problem moves into the `easy` pool and is no longer sampled. +- `buffer.hard_threshold` — at or below this, the problem moves into the `hard` pool and is no longer sampled. +- Otherwise the problem stays in `normal` and remains in the sampling rotation. + +Pool assignments persist across checkpoints (`easy_examples.jsonl` / `hard_examples.jsonl` under each step's orchestrator checkpoint). When you resume — or want to broaden the curriculum mid-run — `buffer.easy_fraction` / `buffer.hard_fraction` randomly lift that fraction of pooled problems back into `normal` so they re-enter sampling. + +```toml +[orchestrator.buffer] +easy_threshold = 0.95 +hard_threshold = 0.05 +easy_fraction = 0.0 # default; bump on resume to bring some easy problems back +hard_fraction = 0.0 # default; bump on resume to bring some hard problems back +``` + +Watch `pool/{env}/{easy,normal,hard}` (current pool ratios) and `evicted_examples/{env}/{easy,hard}` (per-step eviction rate). + +## Online difficulty filtering + +Online difficulty filtering (ODF) drops collapsed-advantage groups on the way *into* the buffer. Set `buffer.online_difficulty_filtering = true` (default `false`) to enable: + +- Average reward across the group is **0.0** (every rollout failed) → drop the group, count under `filtered_rollouts/{env}/hard`. +- Average reward **1.0** (every rollout succeeded) → drop, count under `filtered_rollouts/{env}/easy`. +- Otherwise → into the buffer. + +These are exactly the groups whose within-group advantage collapses to zero — DR-GRPO produces no gradient signal for them, so the trainer would burn step time on tokens it can't learn from. + +```toml +[orchestrator.buffer] +online_difficulty_filtering = true +``` + +**Tradeoff: trainer stability vs. 
inference speed.** With ODF on, every rollout that reaches the trainer carries non-zero advantage — each trainer step's effective batch is predictable and the gradient signal is denser. The cost is paid on the inference side: rollouts get produced and then thrown away, so the orchestrator has to oversample to keep the trainer fed. If the orchestrator is your bottleneck (`time/wait_for_batch` high on the trainer), ODF can starve the loop. Bump `orchestrator.oversampling_factor` so inference produces enough groups per step to absorb the drops. + +ODF is orthogonal to the [pools](#difficulty-pools): ODF reacts to the *current* group's reward distribution, the pools track the *running* per-problem average. Many configs use both — ODF for per-step density, pools for long-horizon curriculum cleanup. + ## Multi-turn trajectories Multi-turn rollouts (tool use, browser environments, long conversations) used to be stitched into a single fake "single-turn" sample, which silently corrupted the importance ratio when chat templates didn't roundtrip. Since [verifiers v0.1.8](https://github.com/PrimeIntellect-ai/verifiers/releases/tag/v0.1.8), `prime-rl` records each LLM request/response as an independent **trajectory step** and merges them at training time using best-effort interleaving — with [renderers](#renderers) as the mechanism that keeps the merge safe by construction. From c8a64b3ed5863969c64fcb0c56885a6d8b9e288a Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Mon, 25 May 2026 23:22:03 +0000 Subject: [PATCH 45/66] docs(development): split MoE-debug recipe; promote "Adding a new architecture" - "Adding a new architecture" promoted to its own top-level section, placed *before* the renamed Debugging MoE recipe so the natural reading order is "here's how to wire up a new arch -> here's how to smoke-test it". Step 3 of the recipe now forward-links to Debugging MoE instead of "the three steps above". - "Testing MoE at small scale" -> "Debugging MoE". 
  The page is already titled Development, so the qualifier was redundant.
- Subsection titles drop the "Step N:" prefixes (the page-level TOC
  already implies sequence) and switch to sentence case for consistency
  with the rest of the docs:
  - Step 1: build and verify a mini model -> Build and verify a mini model
  - Step 2: SFT warmup -> SFT warmup (already correct)
  - Step 3: RL on reverse-text -> RL on reverse-text (already correct)
- TOC updated; README docs index pitch swaps "small-scale MoE testing"
  -> "debugging MoE" to match the new section title.

Co-authored-by: Cursor
---
 README.md           |  2 +-
 docs/development.md | 62 ++++++++++++++++++++++-----------------------
 2 files changed, 32 insertions(+), 32 deletions(-)

diff --git a/README.md b/README.md
index 803c942064..ff7a16086c 100644
--- a/README.md
+++ b/README.md
@@ -223,7 +223,7 @@ Check out the [docs](docs) directory for in-depth guides on how to use PRIME-RL.
 - [**Scaling**](docs/scaling.md) - Single-GPU through multi-node, FSDP/EP/CP, SLURM, Kubernetes, disaggregated inference, benchmarking
 - [**Algorithms**](docs/algorithms.md) - Async/off-policy training, the AIPO loss, advantage and filter plugins, trajectory merging
 - [**Advanced**](docs/advanced.md) - Custom modeling, multimodal training, LoRA, multi-tenant training
-- [**Development**](docs/development.md) - Test suite, pre-commit hooks, adding a new model architecture, small-scale MoE testing
+- [**Development**](docs/development.md) - Test suite, pre-commit hooks, adding a new model architecture, debugging MoE
 - [**FAQs**](docs/faqs.md) - Frequently-asked questions
 - [**Reference**](docs/reference.md) - Auto-generated field-by-field reference for every entrypoint config
diff --git a/docs/development.md b/docs/development.md
index a0b50eb41b..d885da5195 100644
--- a/docs/development.md
+++ b/docs/development.md
@@ -10,11 +10,11 @@ This page covers workflows for developing on `prime-rl` itself — running the t
 - [CI workflows](#ci-workflows)
 -
[Markers](#markers)
 - [Pre-commit hooks](#pre-commit-hooks)
-- [Testing MoE at small scale](#testing-moe-at-small-scale)
-  - [Step 1: build and verify a mini model](#step-1-build-and-verify-a-mini-model)
-  - [Step 2: SFT warmup](#step-2-sft-warmup)
-  - [Step 3: RL on reverse-text](#step-3-rl-on-reverse-text)
-  - [Adding a new architecture](#adding-a-new-architecture)
+- [Adding a new architecture](#adding-a-new-architecture)
+- [Debugging MoE](#debugging-moe)
+  - [Build and verify a mini model](#build-and-verify-a-mini-model)
+  - [SFT warmup](#sft-warmup)
+  - [RL on reverse-text](#rl-on-reverse-text)
 ## Test suite
@@ -67,11 +67,33 @@ The configured hooks:
 - **`ruff` check + format** on staged Python files.
 - **`docs-reference`** — re-runs [`scripts/generate_docs_reference.py`](https://github.com/PrimeIntellect-ai/prime-rl/blob/main/scripts/generate_docs_reference.py) whenever a config class or the generator itself is staged. If `docs/reference.md` would change, the commit fails so you can re-stage the regenerated file.
-## Testing MoE at small scale
+## Adding a new architecture
+
+To add (e.g.) Kimi 2.5:
+
+1. Add the modeling code under `src/prime_rl/trainer/models//`.
+2. Add a preset to `scripts/mini_moe.py` with the config class, small dimensions, HF + PrimeRL model classes, and tokenizer source:
+
+```python
+ARCH_PRESETS = {
+    "glm4_moe": {
+        "config_class": Glm4MoeConfig,
+        "config_kwargs": dict(hidden_size=1024, num_hidden_layers=24, n_routed_experts=8, ...),
+        "hf_model_class": HFGlm4MoeForCausalLM,
+        "prime_model_class": PrimeRLGlm4MoeForCausalLM,
+        "tokenizer_source": "THUDM/GLM-4-9B-0414",
+    },
+    # add your arch here
+}
+```
+
+3. Run the [Debugging MoE](#debugging-moe) workflow with `--arch ` to smoke-test the new modeling code end-to-end.
+
+## Debugging MoE
 When working on MoE architectures (GLM-4, Kimi, etc.), you can't iterate on a 100B+ model locally.
The workflow below builds a ~0.5B model with the same architecture, warms it up with SFT, and runs RL — all on 1–2 GPUs. The goal is catching bugs in modeling code, state-dict conversions, and pipeline integration before scaling.
-### Step 1: build and verify a mini model
+### Build and verify a mini model
 ```bash
 uv run python scripts/mini_moe.py --arch glm4_moe --output-dir ./mini-glm-moe
@@ -83,7 +105,7 @@ This creates a ~543M parameter GLM-4 MoE (1024 hidden, 24 layers, 8 experts) wit
 uv run python scripts/mini_moe.py --arch glm4_moe --output-dir ./mini-glm-moe --verify-only
 ```
-### Step 2: SFT warmup
+### SFT warmup
 Use the shipped debug MoE SFT config with reverse-text data:
@@ -99,7 +121,7 @@ uv run sft @ configs/debug/moe/sft/train.toml \
   --ckpt.weights
 ```
 Loss drops from ~12 to ~2.5. The output won't be coherent, but the model now has a non-trivial distribution so KL divergence becomes meaningful in RL. A pre-built SFT'd checkpoint lives at [samsja/mini-glm-moe](https://huggingface.co/samsja/mini-glm-moe).
-### Step 3: RL on reverse-text
+### RL on reverse-text
 ```bash
 uv run rl @ configs/ci/integration/reverse_text_moe/start.toml \
@@ -116,25 +138,3 @@ What to look for:
 - **Loss reasonable.** Not NaN, not stuck. Don't expect reward to climb meaningfully in 20 steps on a random model.
-
-### Adding a new architecture
-
-To add (e.g.) Kimi 2.5:
-
-1. Add the modeling code under `src/prime_rl/trainer/models//`.
-2. Add a preset to `scripts/mini_moe.py` with the config class, small dimensions, HF + PrimeRL model classes, and tokenizer source:
-
-```python
-ARCH_PRESETS = {
-    "glm4_moe": {
-        "config_class": Glm4MoeConfig,
-        "config_kwargs": dict(hidden_size=1024, num_hidden_layers=24, n_routed_experts=8, ...),
-        "hf_model_class": HFGlm4MoeForCausalLM,
-        "prime_model_class": PrimeRLGlm4MoeForCausalLM,
-        "tokenizer_source": "THUDM/GLM-4-9B-0414",
-    },
-    # add your arch here
-}
-```
-
-3. Run the three steps above with `--arch `.
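
The preset-registration step described in the "Adding a new architecture" section of this patch can be sketched as a small registry keyed by the `--arch` value. This is a minimal, hypothetical sketch and not the contents of `scripts/mini_moe.py`: `ArchPreset` and `resolve_preset` are illustrative names, and the real presets also carry the config class and the HF / PrimeRL model classes, which are left out here so the block runs without the model code installed.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ArchPreset:
    """Hypothetical container for a subset of one ARCH_PRESETS entry."""

    config_kwargs: dict[str, Any]
    tokenizer_source: str


# Hypothetical registry; the dimensions and tokenizer source for glm4_moe
# are the ones quoted in the patch, everything else is illustrative.
ARCH_PRESETS: dict[str, ArchPreset] = {
    "glm4_moe": ArchPreset(
        config_kwargs={
            "hidden_size": 1024,
            "num_hidden_layers": 24,
            "n_routed_experts": 8,
        },
        tokenizer_source="THUDM/GLM-4-9B-0414",
    ),
    # add your arch here
}


def resolve_preset(arch: str) -> ArchPreset:
    """Look up the preset for an --arch value, listing known names on a miss."""
    try:
        return ARCH_PRESETS[arch]
    except KeyError:
        known = ", ".join(sorted(ARCH_PRESETS))
        raise ValueError(f"unknown arch {arch!r}; known presets: {known}") from None
```

Failing with the list of known presets is a design choice of this sketch: a typo in `--arch` then surfaces before any model is built.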
From b39704b69a61060a2835eac751b2617804d4d8a3 Mon Sep 17 00:00:00 2001
From: Mika Senghaas
Date: Mon, 25 May 2026 23:27:27 +0000
Subject: [PATCH 46/66] docs(development): tighten Debugging MoE subsections

- "### Build and verify a mini model" -> "### Create mini model". The
  roundtrip-verify body covers the "and verify" half; the shorter title
  is enough.
- Merge "### SFT warmup" + "### RL on reverse-text" into one
  "### Smoketest training" subsection. The two were always run together
  as a single end-to-end smoke test (warmup so KL is meaningful, then RL
  stack), so a single subsection with two code blocks reads better than
  two artificially-split ones.
- TOC updated. No external doc links to the old sub-anchors.

Co-authored-by: Cursor
---
 docs/development.md | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/docs/development.md b/docs/development.md
index d885da5195..2a27c3d809 100644
--- a/docs/development.md
+++ b/docs/development.md
@@ -12,9 +12,8 @@ This page covers workflows for developing on `prime-rl` itself — running the t
 - [Pre-commit hooks](#pre-commit-hooks)
 - [Adding a new architecture](#adding-a-new-architecture)
 - [Debugging MoE](#debugging-moe)
-  - [Build and verify a mini model](#build-and-verify-a-mini-model)
-  - [SFT warmup](#sft-warmup)
-  - [RL on reverse-text](#rl-on-reverse-text)
+  - [Create mini model](#create-mini-model)
+  - [Smoketest training](#smoketest-training)
 ## Test suite
@@ -93,7 +92,7 @@ ARCH_PRESETS = {
 When working on MoE architectures (GLM-4, Kimi, etc.), you can't iterate on a 100B+ model locally. The workflow below builds a ~0.5B model with the same architecture, warms it up with SFT, and runs RL — all on 1–2 GPUs. The goal is catching bugs in modeling code, state-dict conversions, and pipeline integration before scaling.
-### Build and verify a mini model
+### Create mini model
 ```bash
 uv run python scripts/mini_moe.py --arch glm4_moe --output-dir ./mini-glm-moe
@@ -105,9 +104,9 @@ This creates a ~543M parameter GLM-4 MoE (1024 hidden, 24 layers, 8 experts) wit
 uv run python scripts/mini_moe.py --arch glm4_moe --output-dir ./mini-glm-moe --verify-only
 ```
-### SFT warmup
+### Smoketest training
-Use the shipped debug MoE SFT config with reverse-text data:
+First warm up the random-weight mini model with SFT on reverse-text so KL divergence becomes meaningful in the RL phase:
 ```bash
 uv run sft @ configs/debug/moe/sft/train.toml \
@@ -119,9 +118,9 @@ uv run sft @ configs/debug/moe/sft/train.toml \
   --ckpt.weights
 ```
-Loss drops from ~12 to ~2.5. The output won't be coherent, but the model now has a non-trivial distribution so KL divergence becomes meaningful in RL. A pre-built SFT'd checkpoint lives at [samsja/mini-glm-moe](https://huggingface.co/samsja/mini-glm-moe).
+Loss drops from ~12 to ~2.5. The output won't be coherent, but the model now has a non-trivial distribution. A pre-built SFT'd checkpoint lives at [samsja/mini-glm-moe](https://huggingface.co/samsja/mini-glm-moe) if you want to skip this step.
-### RL on reverse-text
+Then run the full RL stack on reverse-text:
 ```bash
 uv run rl @ configs/ci/integration/reverse_text_moe/start.toml \

From 939089310600caccb1682e26c00cb84d33337e48 Mon Sep 17 00:00:00 2001
From: Mika Senghaas
Date: Mon, 25 May 2026 23:39:55 +0000
Subject: [PATCH 47/66] docs: rename Worked example -> Examples; update reference generator
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

User-facing docs no longer reference configs/ directly — examples/ is
the only "we keep this up to date" surface, the rest is CI- and
debug-internal:

- configuration.md: launch line + worked-example switched to
  examples/reverse_text/rl.toml.
  Section renamed "Worked example" -> "Examples" with a curated tour of
  the 10 README examples (basic 1-8 GPU + advanced SLURM tiers); the
  compose / override / dry-run walkthrough lives as a
  "### Worked example" subsection.
- training.md: drop the "Debug configs for all variants ship under
  configs/debug/training_modes/" pointer in the Training modes section.
  The prose already explains how to set the mode.
- scaling.md: P/D inference now points at examples/glm5_pd_disag/
  rl.toml (with a link to its README) instead of the configs/-side
  inference-only TOML.
- faqs.md: install-verify and smoke-test recipes both switch from
  configs/debug/sft and configs/gsm8k to examples/reverse_text.

Reference generator (scripts/generate_docs_reference.py):

- Drop "from the Pydantic config models" from the page header.
- Move the regenerate command + structural notes from the header to a
  new "## About this page" footer.
- Wrap the Type column in code spans so list[int], int | None, etc.
  render as code instead of plain text. fmt_type now emits literal `|`
  (GFM accepts pipes inside code spans inside table cells; no escaping
  needed).
- Walk list-of-BaseModel fields. Previously orchestrator.train.env /
  orchestrator.eval.env / orchestrator.filters were rendered as one row
  showing the default repr; their leaf fields never showed up. New
  _list_inner_models() detects both list[X] (single model) and
  list[Annotated[Union[A | B], discriminator]] (discriminated union of
  list items, e.g. filters: list[FilterConfig]). Index placeholder
  rendered as to match the CLI form (--orchestrator.train.env.0.id ...).

Regenerate reference.md: +5k chars, mostly the new env/filter list-item
subsections that were missing before.

development.md still references two CI-tested configs/ paths
(configs/debug/moe/sft/train.toml, configs/ci/integration/
reverse_text_moe/start.toml) — those are validated by the
reverse_text_moe GPU integration test on every PR, so they don't risk
drifting.
Flagging in case the user wants those swapped too.

Co-authored-by: Cursor
---
 docs/configuration.md              |   34 +-
 docs/faqs.md                       |    9 +-
 docs/reference.md                  | 2143 +++++++++++++++-------
 docs/scaling.md                    |    6 +-
 docs/training.md                   |    2 -
 scripts/generate_docs_reference.py |   93 +-
 6 files changed, 1292 insertions(+), 995 deletions(-)

diff --git a/docs/configuration.md b/docs/configuration.md
index 2eb4f05240..2db0afef0e 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -18,7 +18,7 @@ Every `prime-rl` entrypoint uses [`pydantic-config`](https://github.com/PrimeInt
 - [None](#none)
 - [Discriminated unions](#discriminated-unions)
 - [Environments (`[[orchestrator.train.env]]`)](#environments-orchestratortrainenv)
-- [Worked example](#worked-example)
+- [Examples](#examples)
 ## Sources and precedence
@@ -33,7 +33,7 @@ Field values come from three sources — Pydantic defaults, TOML files (passed w
 The `@` token introduces a TOML file. Multiple `@` arguments compose left-to-right, deep-merged — unset fields in an overlay keep the base value:
 ```bash
-uv run rl @ configs/gsm8k/rl.toml # one file
+uv run rl @ examples/reverse_text/rl.toml # one file
 uv run rl @ base.toml @ overlay.toml # left to right
 uv run rl --trainer @ trainer.toml --orchestrator @ orch.toml # per-section
 uv run rl @ base.toml --trainer @ trainer.toml # mixed
@@ -163,25 +163,45 @@ args = { dataset_name = "openai/gsm8k", dataset_subset = "main" }
 `args` is forwarded verbatim to the environment's `load_environment(**args)`. See each environment's README on the [Hub](https://app.primeintellect.ai/dashboard/environments) for accepted args.
-## Worked example
+## Examples
+
+The shipped end-to-end examples in [`examples/`](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/examples) are the canonical, kept-up-to-date references — the rest of the repo's TOMLs (under `configs/`) are CI- and debug-internal and may drift. Each example directory has its own README with the full launch story.
+
+**Basic** (1–8 GPUs):
+
+- [**Reverse Text**](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/examples/reverse_text) — `Qwen3-0.6B` reversing a chunk of text. Tiny single-turn SFT + RL; runs on a single consumer GPU in minutes.
+- [**Wordle**](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/examples/wordle) — `Qwen3-1.7B` playing Wordle. Multi-turn SFT + RL; 2–4 H100s.
+- [**Alphabet Sort**](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/examples/alphabet_sort) — `Qwen3-4B-Instruct-2507` sorting names alphabetically. Multi-turn LoRA RL without SFT warmup; one H100.
+- [**Wiki Search**](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/examples/wiki_search) — `Qwen3-4B-Instruct-2507` answering trivia by web-searching Wikipedia. Multi-turn with tool use.
+- [**Hendrycks Sanity**](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/examples/hendrycks_sanity) — `DeepSeek-R1-Distill-Qwen-1.5B` on a filtered MATH subset. Useful for algorithm ablations.
+
+**Advanced** (32–2048 GPUs, SLURM):
+
+- [**Qwen 3 30B – A3B Math**](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/examples/qwen30b_math) — `Qwen3-30B-A3B` on hard math.
+- [**Qwen 3 30B – A3B SWE**](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/examples/qwen30b_swe) — `Qwen3-30B-A3B` on hard SWE.
+- [**INTELLECT-3.1**](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/examples/Intellect-3.1) — reproduces our INTELLECT-3.1 training run.
+- [**MiniMax-M2.5 SWE**](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/examples/minimax_m2.5_swe) — `MiniMax-M2.5` on agentic SWE.
+- [**High-throughput GLM-5**](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/examples/glm5_pd_disag) — `GLM-5` with P/D disaggregation and FP8 inference.
+
+### Worked example: compose, override, dry-run
 Start from a shipped base config, override two fields on the CLI, and dry-run:
 ```bash
-uv run rl @ configs/gsm8k/rl.toml \
+uv run rl @ examples/reverse_text/rl.toml \
   --wandb.name my-experiment \
   --trainer.optim.lr 5e-6 \
-  --output-dir /tmp/gsm8k-dry \
+  --output-dir /tmp/reverse-dry \
   --dry-run
 ```
 Then inspect the resolved config:
 ```bash
-ls /tmp/gsm8k-dry/configs/
+ls /tmp/reverse-dry/configs/
 # rl.toml trainer.toml orchestrator.toml inference.toml
 ```
-Each per-process TOML reflects the final, validated configuration that the actual run would consume — exactly what each process sees when started standalone (`uv run trainer @ /tmp/gsm8k-dry/configs/trainer.toml`, etc.). This is the easiest way to bisect a misbehaving config: dry-run a known-good base, dry-run your overlay, diff the two.
+Each per-process TOML reflects the final, validated configuration that the actual run would consume — exactly what each process sees when started standalone (`uv run trainer @ /tmp/reverse-dry/configs/trainer.toml`, etc.). This is the easiest way to bisect a misbehaving config: dry-run a known-good base, dry-run your overlay, diff the two.
 For the full set of fields, defaults, types, and constraints accepted by each entrypoint, jump to [Reference](reference.md).
diff --git a/docs/faqs.md b/docs/faqs.md
index 25d4a36b8c..6d0c49279e 100644
--- a/docs/faqs.md
+++ b/docs/faqs.md
@@ -16,18 +16,17 @@ Frequently-asked questions, grouped by topic. For full background see the linked
 ### What's the fastest way to verify my install works?
-The SFT debug config runs end-to-end on CPU or any single GPU with fake data:
+The minimal SFT run uses the shipped reverse-text example:
 ```bash
-uv run sft @ configs/debug/sft/train.toml
+uv run sft @ examples/reverse_text/sft.toml
 ```
-For the full RL stack on 2 GPUs, the GSM8K example is the smallest realistic run:
+For the full RL stack on 2 GPUs, the same example covers it:
 ```bash
-prime env install primeintellect/math-env
 bash scripts/tmux.sh
-uv run rl @ configs/gsm8k/rl.toml --wandb.name smoke-test --ckpt
+uv run rl @ examples/reverse_text/rl.toml --wandb.name smoke-test --ckpt
 ```
 See [Overview § Quick run](overview.md#quick-run).
diff --git a/docs/reference.md b/docs/reference.md
index 6a7763db08..932e4ebe57 100644
--- a/docs/reference.md
+++ b/docs/reference.md
@@ -1,22 +1,7 @@
 # Reference
 This page documents every field accepted by every prime-rl entrypoint. It is
-auto-generated from the Pydantic config models; do not edit by hand.
-
-To regenerate, run from the project root:
-
-```bash
-uv run python scripts/generate_docs_reference.py
-```
-
-Each entrypoint section walks its config tree top-down. Nested sub-configs
-appear under headings named after their dotted path (e.g. `trainer.model.ac`).
-Discriminated unions (loss, advantage, scheduler, optimizer, …) document each
-variant in turn — set the `type` field to pick one.
-
-For conceptual context behind these knobs, see
-[Configuration](configuration.md), [Training](training.md),
-[Scaling](scaling.md), [Algorithms](algorithms.md), and [Advanced](advanced.md).
+auto-generated; do not edit by hand.
 ## Table of Contents
@@ -82,6 +67,7 @@ For conceptual context behind these knobs, see
   - [`ckpt`](#orchestrator-ckpt)
   - [`heartbeat`](#orchestrator-heartbeat)
   - [`experimental`](#orchestrator-experimental)
+  - [`filters.` (list item)](#orchestrator-filters)
   - [`weight_broadcast`](#orchestrator-weight-broadcast)
   - [`rollout_transport`](#orchestrator-rollout-transport)
 - [`inference` — Standalone vLLM server](#inference)
@@ -105,55 +91,55 @@ _Defined in_ `prime_rl.configs.rl.RLConfig`.
 | Field | Type | Default | Description |
 |---|---|---|---|
-| `output_dir` | Path | `'outputs'` | Output directory. Should be unique per experiment. |
-| `clean_output_dir` | bool | `False` | Delete the output directory before starting training. Required to overwrite an output directory that contains checkpoints from a previous run when not resuming. |
-| `max_steps` | int \| None | `None` | Shared maximum training steps. If None, falls back to the sub-config ``max_steps``. |
-| `seq_len` | int \| None | `None` | Shared sequence length. Propagates to ``trainer.model.seq_len`` and ``orchestrator.seq_len`` only when those values were not explicitly set; explicit per-component values always win. |
-| `max_async_level` | int \| None | `None` | Shared async level. If None, falls back to the sub-config ``max_async_level``. |
-| `bench` | bool | `False` | Benchmark mode. Sets trainer and orchestrator to benchmark mode and, when set, suffixes the W&B project with ``-bench``. |
-| `dry_run` | bool | `False` | Only validate and dump resolved configs, then exit early. |
+| `output_dir` | `Path` | `'outputs'` | Output directory. Should be unique per experiment. |
+| `clean_output_dir` | `bool` | `False` | Delete the output directory before starting training. Required to overwrite an output directory that contains checkpoints from a previous run when not resuming. |
+| `max_steps` | `int | None` | `None` | Shared maximum training steps. If None, falls back to the sub-config ``max_steps``.
|
+| `seq_len` | `int | None` | `None` | Shared sequence length. Propagates to ``trainer.model.seq_len`` and ``orchestrator.seq_len`` only when those values were not explicitly set; explicit per-component values always win. |
+| `max_async_level` | `int | None` | `None` | Shared async level. If None, falls back to the sub-config ``max_async_level``. |
+| `bench` | `bool` | `False` | Benchmark mode. Sets trainer and orchestrator to benchmark mode and, when set, suffixes the W&B project with ``-bench``. |
+| `dry_run` | `bool` | `False` | Only validate and dump resolved configs, then exit early. |
 ### `trainer`
 | Field | Type | Default | Description |
 |---|---|---|---|
-| `trainer.output_dir` | Path | `'outputs'` | Directory to write outputs to — checkpoints, weights, rollouts, and logs are written as subdirectories. Should be a persistent directory with enough disk space and unique per experiment running on a single node. |
-| `trainer.matmul_precision` | 'highest' \| 'high' \| 'medium' | `'high'` | Precision for float32 matrix multiplications. ``highest`` is full FP32 (required on ROCm/AMD GPUs to avoid catastrophic precision loss in softmax over large vocabularies). ``high`` enables TF32 on NVIDIA GPUs for a speedup with minor precision tradeoff. See ``torch.set_float32_matmul_precision``. |
-| `trainer.max_steps` | int \| None | `None` | Maximum number of training steps. If None, runs indefinitely. |
-| `trainer.max_async_level` | int | `1` | _≥0._ Maximum steps inference can be ahead of training (how off-policy inference can be). Higher values yield better throughput via async execution at the cost of policy lag; ``0`` is fully synchronous. |
-| `trainer.enable_router_replay` | bool | `False` | Return routed experts in the batch so the trainer can replay routing. Requires ``enable_return_routed_experts=true`` on the vLLM server (or ``--enable-return-routed-experts``) and is only supported for custom models.
|
-| `trainer.memory_profiler_path` | Path \| None | `None` | Path to write the memory profile to. |
-| `trainer.trace_path` | Path \| None | `None` | Path to write the PyTorch profiler trace to. |
-| `trainer.dist_timeout_seconds` | int | `600` | Timeout in seconds for torch distributed ops. |
-| `trainer.max_concurrent_runs` | int | `1` | _≥1._ Maximum number of concurrent runs to allow. If 1, only one run may run at a time. |
+| `trainer.output_dir` | `Path` | `'outputs'` | Directory to write outputs to — checkpoints, weights, rollouts, and logs are written as subdirectories. Should be a persistent directory with enough disk space and unique per experiment running on a single node. |
+| `trainer.matmul_precision` | `'highest' | 'high' | 'medium'` | `'high'` | Precision for float32 matrix multiplications. ``highest`` is full FP32 (required on ROCm/AMD GPUs to avoid catastrophic precision loss in softmax over large vocabularies). ``high`` enables TF32 on NVIDIA GPUs for a speedup with minor precision tradeoff. See ``torch.set_float32_matmul_precision``. |
+| `trainer.max_steps` | `int | None` | `None` | Maximum number of training steps. If None, runs indefinitely. |
+| `trainer.max_async_level` | `int` | `1` | _≥0._ Maximum steps inference can be ahead of training (how off-policy inference can be). Higher values yield better throughput via async execution at the cost of policy lag; ``0`` is fully synchronous. |
+| `trainer.enable_router_replay` | `bool` | `False` | Return routed experts in the batch so the trainer can replay routing. Requires ``enable_return_routed_experts=true`` on the vLLM server (or ``--enable-return-routed-experts``) and is only supported for custom models. |
+| `trainer.memory_profiler_path` | `Path | None` | `None` | Path to write the memory profile to. |
+| `trainer.trace_path` | `Path | None` | `None` | Path to write the PyTorch profiler trace to.
|
+| `trainer.dist_timeout_seconds` | `int` | `600` | Timeout in seconds for torch distributed ops. |
+| `trainer.max_concurrent_runs` | `int` | `1` | _≥1._ Maximum number of concurrent runs to allow. If 1, only one run may run at a time. |
 #### `trainer.model`
 | Field | Type | Default | Description |
 |---|---|---|---|
-| `trainer.model.name` | str | `'Qwen/Qwen3-0.6B'` | HF model name or local path. |
-| `trainer.model.trust_remote_code` | bool | `False` | Trust remote code when initializing the tokenizer. |
-| `trainer.model.seq_len` | int | `2048` | Sequence length the model is trained on. |
-| `trainer.model.attn` | 'eager' \| 'sdpa' \| 'flash_attention_2' \| 'flash_attention_3' \| 'fa4' | `'flash_attention_2'` | Attention implementation. With CP enabled, ring attention uses the matching kernel family (FA2/FA3/FA4). |
-| `trainer.model.fsdp_cpu_offload` | bool | `False` | Enable FSDP CPU offloading for parameters, gradients, and optimizer states. Uses pinned memory for efficient CPU↔GPU transfers. |
-| `trainer.model.optim_cpu_offload` | bool | `False` | Offload only optimizer states (momentum, variance) to CPU, keeping weights on GPU. Avoids the H2D all-gather overhead of FSDP CPU offload while still saving GPU memory. |
-| `trainer.model.reshard_after_forward` | bool | `True` | Reshard the model after each forward pass. |
-| `trainer.model.dp_replicate` | int | `1` | Data parallel dim where model weights are replicated. |
-| `trainer.model.ep` | int | `1` | Expert parallelism degree for MoE layers. 1 disables EP. |
-| `trainer.model.ep_comm_backend` | 'torch' \| 'deepep' | `'torch'` | Communication backend for expert parallelism. ``torch`` uses TorchTitan all-to-all collectives; ``deepep`` uses DeepEP custom kernels. |
-| `trainer.model.deepep_num_sms` | int | `20` | _≥1._ SMs allocated for DeepEP intranode dispatch/combine kernels. Also determines internode RDMA channel count (``num_channels = num_sms / 2``).
Lower values leave more SMs for compute; higher values speed up dispatch/combine. The optimal value depends on EP degree and hardware. Only used when ``ep_comm_backend='deepep'``. |
-| `trainer.model.deepep_token_chunk_size` | int \| None | `None` | _≥1._ Token chunk size for DeepEP MoE pipelining. When set, DeepEP dispatch for chunk i+1 is launched while experts compute chunk i. Only used when ``ep_comm_backend='deepep'``. |
-| `trainer.model.cp` | int | `1` | Context parallelism degree. 1 disables CP. |
-| `trainer.model.cp_style` | 'ring' \| 'ulysses' | `'ring'` | CP communication style. ``ring`` uses ring-attention all-gather/reduce-scatter (requires custom kernels per attention type). ``ulysses`` uses all-to-all to redistribute Q/K/V from sequence-sharded to head-sharded, runs vanilla attention locally on the full sequence, then all-to-all back — works out-of-the-box with any attention kernel (softmax FA, linear attention, mamba, etc.). |
-| `trainer.model.impl` | 'hf' \| 'custom' \| 'auto' | `'auto'` | Model implementation. ``auto`` selects ``custom`` if supported by the model, otherwise ``hf``. |
-| `trainer.model.optimization_dtype` | 'bfloat16' \| 'float32' | `'float32'` | dtype for model optimization. |
-| `trainer.model.reduce_dtype` | 'bfloat16' \| 'float32' | `'float32'` | dtype for gradient/parameter reductions. |
-| `trainer.model.moe_use_grouped_mm` | bool | `True` | Use grouped mm for MoE layers. Requires compute capability ≥ 9.0. |
-| `trainer.model.fp8` | bool | `False` | FP8 training via DeepGEMM. Replaces ``nn.Linear`` with FP8 blockwise linear and uses FP8 grouped GEMM for MoE experts. Requires SM90 (Hopper) GPUs and ``model.impl='custom'``. |
-| `trainer.model.freeze_moe_router` | bool | `False` | Freeze MoE router parameters during training. |
-| `trainer.model.fused_lm_head_token_chunk_size` | int \| 'auto' \| 'disabled' | `'disabled'` | Flattened token chunk size for the fused LM head.
``int >= 1`` sets the tokens per LM-head chunk explicitly; ``auto`` auto-enables (RL training picks 8192); ``disabled`` uses the vanilla LM head. Integer values aren't supported for SFT training. |
+| `trainer.model.name` | `str` | `'Qwen/Qwen3-0.6B'` | HF model name or local path. |
+| `trainer.model.trust_remote_code` | `bool` | `False` | Trust remote code when initializing the tokenizer. |
+| `trainer.model.seq_len` | `int` | `2048` | Sequence length the model is trained on. |
+| `trainer.model.attn` | `'eager' | 'sdpa' | 'flash_attention_2' | 'flash_attention_3' | 'fa4'` | `'flash_attention_2'` | Attention implementation. With CP enabled, ring attention uses the matching kernel family (FA2/FA3/FA4). |
+| `trainer.model.fsdp_cpu_offload` | `bool` | `False` | Enable FSDP CPU offloading for parameters, gradients, and optimizer states. Uses pinned memory for efficient CPU↔GPU transfers. |
+| `trainer.model.optim_cpu_offload` | `bool` | `False` | Offload only optimizer states (momentum, variance) to CPU, keeping weights on GPU. Avoids the H2D all-gather overhead of FSDP CPU offload while still saving GPU memory. |
+| `trainer.model.reshard_after_forward` | `bool` | `True` | Reshard the model after each forward pass. |
+| `trainer.model.dp_replicate` | `int` | `1` | Data parallel dim where model weights are replicated. |
+| `trainer.model.ep` | `int` | `1` | Expert parallelism degree for MoE layers. 1 disables EP. |
+| `trainer.model.ep_comm_backend` | `'torch' | 'deepep'` | `'torch'` | Communication backend for expert parallelism. ``torch`` uses TorchTitan all-to-all collectives; ``deepep`` uses DeepEP custom kernels. |
+| `trainer.model.deepep_num_sms` | `int` | `20` | _≥1._ SMs allocated for DeepEP intranode dispatch/combine kernels. Also determines internode RDMA channel count (``num_channels = num_sms / 2``). Lower values leave more SMs for compute; higher values speed up dispatch/combine. The optimal value depends on EP degree and hardware.
Only used when ``ep_comm_backend='deepep'``. |
+| `trainer.model.deepep_token_chunk_size` | `int | None` | `None` | _≥1._ Token chunk size for DeepEP MoE pipelining. When set, DeepEP dispatch for chunk i+1 is launched while experts compute chunk i. Only used when ``ep_comm_backend='deepep'``. |
+| `trainer.model.cp` | `int` | `1` | Context parallelism degree. 1 disables CP. |
+| `trainer.model.cp_style` | `'ring' | 'ulysses'` | `'ring'` | CP communication style. ``ring`` uses ring-attention all-gather/reduce-scatter (requires custom kernels per attention type). ``ulysses`` uses all-to-all to redistribute Q/K/V from sequence-sharded to head-sharded, runs vanilla attention locally on the full sequence, then all-to-all back — works out-of-the-box with any attention kernel (softmax FA, linear attention, mamba, etc.). |
+| `trainer.model.impl` | `'hf' | 'custom' | 'auto'` | `'auto'` | Model implementation. ``auto`` selects ``custom`` if supported by the model, otherwise ``hf``. |
+| `trainer.model.optimization_dtype` | `'bfloat16' | 'float32'` | `'float32'` | dtype for model optimization. |
+| `trainer.model.reduce_dtype` | `'bfloat16' | 'float32'` | `'float32'` | dtype for gradient/parameter reductions. |
+| `trainer.model.moe_use_grouped_mm` | `bool` | `True` | Use grouped mm for MoE layers. Requires compute capability ≥ 9.0. |
+| `trainer.model.fp8` | `bool` | `False` | FP8 training via DeepGEMM. Replaces ``nn.Linear`` with FP8 blockwise linear and uses FP8 grouped GEMM for MoE experts. Requires SM90 (Hopper) GPUs and ``model.impl='custom'``. |
+| `trainer.model.freeze_moe_router` | `bool` | `False` | Freeze MoE router parameters during training. |
+| `trainer.model.fused_lm_head_token_chunk_size` | `int | 'auto' | 'disabled'` | `'disabled'` | Flattened token chunk size for the fused LM head. ``int >= 1`` sets the tokens per LM-head chunk explicitly; ``auto`` auto-enables (RL training picks 8192); ``disabled`` uses the vanilla LM head.
Integer values aren't supported for SFT training. | ##### `trainer.model.vlm` @@ -162,9 +148,9 @@ VLM configuration. Setting this enables vision-language model support. | Field | Type | Default | Description | |---|---|---|---| -| `trainer.model.vlm.vision_encoder_attr` | str | *required* | Dotted attribute path to the vision encoder module (e.g. ``model.visual``). | -| `trainer.model.vlm.language_model_attr` | str | *required* | Dotted attribute path to the language model module (e.g. ``model.language_model``). | -| `trainer.model.vlm.freeze_vision_encoder` | bool | `True` | Freeze the vision encoder. When False, it is trainable and FSDP-sharded per-block. No effect with LoRA (LoRA freezes all non-adapter parameters). | +| `trainer.model.vlm.vision_encoder_attr` | `str` | *required* | Dotted attribute path to the vision encoder module (e.g. ``model.visual``). | +| `trainer.model.vlm.language_model_attr` | `str` | *required* | Dotted attribute path to the language model module (e.g. ``model.language_model``). | +| `trainer.model.vlm.freeze_vision_encoder` | `bool` | `True` | Freeze the vision encoder. When False, it is trainable and FSDP-sharded per-block. No effect with LoRA (LoRA freezes all non-adapter parameters). | ##### `trainer.model.compile` @@ -173,7 +159,7 @@ Compile the model with ``torch.compile``. | Field | Type | Default | Description | |---|---|---|---| -| `trainer.model.compile.fullgraph` | bool | `False` | Compile transformer blocks with ``fullgraph=True``. | +| `trainer.model.compile.fullgraph` | `bool` | `False` | Compile transformer blocks with ``fullgraph=True``. | ##### `trainer.model.ac` @@ -182,9 +168,9 @@ Activation checkpointing configuration. 
If None, activation checkpointing is dis | Field | Type | Default | Description | |---|---|---|---| -| `trainer.model.ac.mode` | 'full' \| 'selective' | `'full'` | ``full`` checkpoints whole transformer blocks; ``selective`` checkpoints only the subcomponents listed in ``targets`` inside supported custom decoder layers. | -| `trainer.model.ac.freq` | int | `1` | _≥1._ Apply activation checkpointing to every N layers. | -| `trainer.model.ac.targets` | list[str] | `['norm']` | Selective checkpoint targets. ``norm`` checkpoints every norm module inside selected layers. ``attn_proj`` checkpoints projection-side attention work outside the kernel (input/output projections, attention-local norms, RoPE, gating, model-specific MLA projection helpers). ``mlp`` checkpoints the entire dense MLP forward (not for MoE). ``mla_up_proj`` checkpoints MLA Q/KV up-projection where supported. ``routed_experts`` checkpoints routed expert compute in MoE layers (including LatentMoE). ``linear_attn`` checkpoints non-softmax token mixers (NemotronH Mamba, Qwen3.5-MoE GatedDeltaNet, AFMoE sliding-window attention). | +| `trainer.model.ac.mode` | `'full' | 'selective'` | `'full'` | ``full`` checkpoints whole transformer blocks; ``selective`` checkpoints only the subcomponents listed in ``targets`` inside supported custom decoder layers. | +| `trainer.model.ac.freq` | `int` | `1` | _≥1._ Apply activation checkpointing to every N layers. | +| `trainer.model.ac.targets` | `list[str]` | `['norm']` | Selective checkpoint targets. ``norm`` checkpoints every norm module inside selected layers. ``attn_proj`` checkpoints projection-side attention work outside the kernel (input/output projections, attention-local norms, RoPE, gating, model-specific MLA projection helpers). ``mlp`` checkpoints the entire dense MLP forward (not for MoE). ``mla_up_proj`` checkpoints MLA Q/KV up-projection where supported. ``routed_experts`` checkpoints routed expert compute in MoE layers (including LatentMoE). 
``linear_attn`` checkpoints non-softmax token mixers (NemotronH Mamba, Qwen3.5-MoE GatedDeltaNet, AFMoE sliding-window attention). | ##### `trainer.model.ac_offloading` @@ -193,8 +179,8 @@ Activation offloading configuration. If None, activation offloading is disabled. | Field | Type | Default | Description | |---|---|---|---| -| `trainer.model.ac_offloading.pin_memory` | bool | `True` | Pin offloaded activations to CPU memory. | -| `trainer.model.ac_offloading.max_inflight_activations` | int | `5` | _≥1._ Max activations kept in flight while offloading. More activations smooth overlap at the cost of GPU memory. | +| `trainer.model.ac_offloading.pin_memory` | `bool` | `True` | Pin offloaded activations to CPU memory. | +| `trainer.model.ac_offloading.max_inflight_activations` | `int` | `5` | _≥1._ Max activations kept in flight while offloading. More activations smooth overlap at the cost of GPU memory. | ##### `trainer.model.index_cache` @@ -203,8 +189,8 @@ DSA IndexCache sub-configuration. If set, sparse-attention top-k indices are reu | Field | Type | Default | Description | |---|---|---|---| -| `trainer.model.index_cache.topk_freq` | int | `1` | _≥1._ Recompute DSA top-k indices every N layers; intervening layers reuse the cached indices. ``1`` recomputes every layer (effectively no reuse). Mirrors vLLM's ``index_topk_freq`` HF override. | -| `trainer.model.index_cache.topk_pattern` | str \| None | `None` | Optional per-layer schedule that overrides ``topk_freq``. ``'F'`` computes fresh indices for that layer; ``'S'`` reuses the previously cached indices. Length should match the number of decoder layers. | +| `trainer.model.index_cache.topk_freq` | `int` | `1` | _≥1._ Recompute DSA top-k indices every N layers; intervening layers reuse the cached indices. ``1`` recomputes every layer (effectively no reuse). Mirrors vLLM's ``index_topk_freq`` HF override. 
| +| `trainer.model.index_cache.topk_pattern` | `str | None` | `None` | Optional per-layer schedule that overrides ``topk_freq``. ``'F'`` computes fresh indices for that layer; ``'S'`` reuses the previously cached indices. Length should match the number of decoder layers. | ##### `trainer.model.lora` @@ -213,11 +199,11 @@ LoRA configuration. If None, LoRA is disabled. | Field | Type | Default | Description | |---|---|---|---| -| `trainer.model.lora.rank` | int | `16` | _≥1._ Rank of the low-rank decomposition matrices. | -| `trainer.model.lora.alpha` | float | `32.0` | _≥0._ LoRA scaling parameter. | -| `trainer.model.lora.dropout` | float | `0.0` | _≥0, ≤1._ LoRA dropout rate. | -| `trainer.model.lora.target_modules` | list[str] | `['q_proj', 'k_proj', 'v_proj', 'o_proj', 'gate_proj', 'up_proj', 'down_proj', 'experts', 'fc1_latent_proj', 'fc2_latent_proj']` | Module names or regex patterns to apply LoRA to. Simple names (e.g. ``q_proj``) match any component in the module path; regex patterns match anywhere in the name. Names unknown to the current model are silently ignored, so defaults cover multiple architectures. NemotronH note: ``experts`` matches NonGatedGroupedExperts inside LatentMoE; ``fc1_latent_proj``/``fc2_latent_proj`` adapt the latent up/down projections. Add ``in_proj``/``out_proj`` to also LoRA Mamba. | -| `trainer.model.lora.modules_to_save` | list[str] | `[]` | Module names or regex patterns to keep fully trainable (not freeze). Same matching rules as ``target_modules``. | +| `trainer.model.lora.rank` | `int` | `16` | _≥1._ Rank of the low-rank decomposition matrices. | +| `trainer.model.lora.alpha` | `float` | `32.0` | _≥0._ LoRA scaling parameter. | +| `trainer.model.lora.dropout` | `float` | `0.0` | _≥0, ≤1._ LoRA dropout rate. 
| +| `trainer.model.lora.target_modules` | `list[str]` | `['q_proj', 'k_proj', 'v_proj', 'o_proj', 'gate_proj', 'up_proj', 'down_proj', 'experts', 'fc1_latent_proj', 'fc2_latent_proj']` | Module names or regex patterns to apply LoRA to. Simple names (e.g. ``q_proj``) match any component in the module path; regex patterns match anywhere in the name. Names unknown to the current model are silently ignored, so defaults cover multiple architectures. NemotronH note: ``experts`` matches NonGatedGroupedExperts inside LatentMoE; ``fc1_latent_proj``/``fc2_latent_proj`` adapt the latent up/down projections. Add ``in_proj``/``out_proj`` to also LoRA Mamba. | +| `trainer.model.lora.modules_to_save` | `list[str]` | `[]` | Module names or regex patterns to keep fully trainable (not freeze). Same matching rules as ``target_modules``. | ##### `trainer.model.debug` @@ -226,18 +212,18 @@ Debugging knobs for the model and distributed training. | Field | Type | Default | Description | |---|---|---|---| -| `trainer.model.debug.num_layers` | int \| None | `None` | Override the number of transformer layers (truncates the model). | -| `trainer.model.debug.random_init` | bool | `False` | Randomly initialize the model instead of loading weights. | -| `trainer.model.debug.force_balanced_routing` | bool | `False` | Replace MoE token-choice routing with a round-robin assignment so every expert sees an equal share. Intended for fake-data smoke tests where untrained routing would otherwise OOM under severe imbalance. Gating scores are still gathered from the override indices so the forward pass stays consistent. | +| `trainer.model.debug.num_layers` | `int | None` | `None` | Override the number of transformer layers (truncates the model). | +| `trainer.model.debug.random_init` | `bool` | `False` | Randomly initialize the model instead of loading weights. 
| +| `trainer.model.debug.force_balanced_routing` | `bool` | `False` | Replace MoE token-choice routing with a round-robin assignment so every expert sees an equal share. Intended for fake-data smoke tests where untrained routing would otherwise OOM under severe imbalance. Gating scores are still gathered from the override indices so the forward pass stays consistent. | #### `trainer.tokenizer` | Field | Type | Default | Description | |---|---|---|---| -| `trainer.tokenizer.name` | str \| None | `None` | Tokenizer name or path. If None, the model's default tokenizer is used. | -| `trainer.tokenizer.trust_remote_code` | bool \| None | `None` | Trust remote code when initializing the tokenizer. If None, inherits the model's ``trust_remote_code`` setting. | -| `trainer.tokenizer.chat_template` | str \| None | `None` | Chat template for the tokenizer. Either a Jinja2 template string or a path to a template file. If None, the tokenizer's default chat template is used. | +| `trainer.tokenizer.name` | `str | None` | `None` | Tokenizer name or path. If None, the model's default tokenizer is used. | +| `trainer.tokenizer.trust_remote_code` | `bool | None` | `None` | Trust remote code when initializing the tokenizer. If None, inherits the model's ``trust_remote_code`` setting. | +| `trainer.tokenizer.chat_template` | `str | None` | `None` | Chat template for the tokenizer. Either a Jinja2 template string or a path to a template file. If None, the tokenizer's default chat template is used. | #### `trainer.data` @@ -249,8 +235,8 @@ Use a fake data loader sampling random micro-batches (for debugging). | Field | Type | Default | Description | |---|---|---|---| -| `trainer.data.fake.batch_size` | int | `2` | _≥1._ Batch size of the fake data loader. | -| `trainer.data.fake.generate_samples` | bool | `False` | Generate separate samples and pack them into a single micro-batch instead of using random tensors. 
| +| `trainer.data.fake.batch_size` | `int` | `2` | _≥1._ Batch size of the fake data loader. | +| `trainer.data.fake.generate_samples` | `bool` | `False` | Generate separate samples and pack them into a single micro-batch instead of using random tensors. | #### `trainer.ckpt` @@ -259,17 +245,17 @@ Full training-state checkpoint configuration (model + optimizer + scheduler). If | Field | Type | Default | Description | |---|---|---|---| -| `trainer.ckpt.output_dir` | Path \| None | `None` | Override directory for checkpoints and weights. If set, checkpoints and weight snapshots are written here instead of under the trainer ``output_dir`` — useful for writing large checkpoints to a separate storage volume. | -| `trainer.ckpt.interval` | int \| None | `None` | _≥1._ Interval at which to save the training checkpoint. If None, only checkpoints at the end of training. | -| `trainer.ckpt.skip_gather_master_weights` | bool | `False` | Skip gathering and saving HF-compatible weight checkpoints. Useful for large models where the gather is expensive and only DCP checkpoints are needed. | -| `trainer.ckpt.weights_only` | bool | `False` | Save only weight checkpoints (no optimizer/scheduler state). Much faster and smaller than full checkpoints, but cannot resume training. | -| `trainer.ckpt.resume_step` | int \| None | `None` | _≥-1._ Step to resume training from. None starts from scratch; ``-1`` restarts from the latest checkpoint available. | -| `trainer.ckpt.keep_last` | int \| None | `None` | _≥1._ Keep at most this many recent step checkpoints on disk. If None, never clean old checkpoints based on recency. | -| `trainer.ckpt.keep_interval` | int \| None | `None` | _≥1._ Keep checkpoints at every N steps permanently (e.g. ``keep_interval=100`` keeps step 100, 200, ...). If None, no interval-based keeping. | -| `trainer.ckpt.skip_progress` | bool | `False` | Skip loading the progress from checkpoint. 
| -| `trainer.ckpt.skip_scheduler` | bool | `False` | Skip loading the scheduler from checkpoint. | -| `trainer.ckpt.skip_dataloader` | bool | `False` | Skip loading the dataloader from checkpoint. | -| `trainer.ckpt.skip_optimizer` | bool | `False` | Skip loading the optimizer state from checkpoint. | +| `trainer.ckpt.output_dir` | `Path | None` | `None` | Override directory for checkpoints and weights. If set, checkpoints and weight snapshots are written here instead of under the trainer ``output_dir`` — useful for writing large checkpoints to a separate storage volume. | +| `trainer.ckpt.interval` | `int | None` | `None` | _≥1._ Interval at which to save the training checkpoint. If None, only checkpoints at the end of training. | +| `trainer.ckpt.skip_gather_master_weights` | `bool` | `False` | Skip gathering and saving HF-compatible weight checkpoints. Useful for large models where the gather is expensive and only DCP checkpoints are needed. | +| `trainer.ckpt.weights_only` | `bool` | `False` | Save only weight checkpoints (no optimizer/scheduler state). Much faster and smaller than full checkpoints, but cannot resume training. | +| `trainer.ckpt.resume_step` | `int | None` | `None` | _≥-1._ Step to resume training from. None starts from scratch; ``-1`` restarts from the latest checkpoint available. | +| `trainer.ckpt.keep_last` | `int | None` | `None` | _≥1._ Keep at most this many recent step checkpoints on disk. If None, never clean old checkpoints based on recency. | +| `trainer.ckpt.keep_interval` | `int | None` | `None` | _≥1._ Keep checkpoints at every N steps permanently (e.g. ``keep_interval=100`` keeps step 100, 200, ...). If None, no interval-based keeping. | +| `trainer.ckpt.skip_progress` | `bool` | `False` | Skip loading the progress from checkpoint. | +| `trainer.ckpt.skip_scheduler` | `bool` | `False` | Skip loading the scheduler from checkpoint. 
| +| `trainer.ckpt.skip_dataloader` | `bool` | `False` | Skip loading the dataloader from checkpoint. | +| `trainer.ckpt.skip_optimizer` | `bool` | `False` | Skip loading the optimizer state from checkpoint. | ##### `trainer.ckpt.weights` @@ -278,32 +264,32 @@ Weight-checkpoint sub-configuration. If None, no HF-compatible weight checkpoint | Field | Type | Default | Description | |---|---|---|---| -| `trainer.ckpt.weights.save_sharded` | bool | `True` | Save the weight checkpoint in sharded format. | -| `trainer.ckpt.weights.save_format` | 'safetensors' \| 'torch' | `'safetensors'` | Weight checkpoint serialization format. | -| `trainer.ckpt.weights.save_adapter_separately` | bool | `False` | Save LoRA adapters separately before merging into full model weights. | +| `trainer.ckpt.weights.save_sharded` | `bool` | `True` | Save the weight checkpoint in sharded format. | +| `trainer.ckpt.weights.save_format` | `'safetensors' | 'torch'` | `'safetensors'` | Weight checkpoint serialization format. | +| `trainer.ckpt.weights.save_adapter_separately` | `bool` | `False` | Save LoRA adapters separately before merging into full model weights. | #### `trainer.log` | Field | Type | Default | Description | |---|---|---|---| -| `trainer.log.level` | str | `'info'` | Log level for the process. Defaults to ``$PRIME_LOG_LEVEL`` if set, else ``info``. | -| `trainer.log.vf_level` | str | `'info'` | Log level for the verifiers package. Defaults to ``$PRIME_VF_LOG_LEVEL`` if set, else ``info``. | -| `trainer.log.json_logging` | bool | `False` | Emit newline-delimited JSON logs for aggregation (Loki, Grafana, etc.). | -| `trainer.log.log_data` | bool | `False` | Log the first data sample at startup. | -| `trainer.log.ranks_filter` | list[int] | `[0]` | Trainer ranks to show in console output. Passed to ``torchrun --local-ranks-filter``. | +| `trainer.log.level` | `str` | `'info'` | Log level for the process. Defaults to ``$PRIME_LOG_LEVEL`` if set, else ``info``. 
| +| `trainer.log.vf_level` | `str` | `'info'` | Log level for the verifiers package. Defaults to ``$PRIME_VF_LOG_LEVEL`` if set, else ``info``. | +| `trainer.log.json_logging` | `bool` | `False` | Emit newline-delimited JSON logs for aggregation (Loki, Grafana, etc.). | +| `trainer.log.log_data` | `bool` | `False` | Log the first data sample at startup. | +| `trainer.log.ranks_filter` | `list[int]` | `[0]` | Trainer ranks to show in console output. Passed to ``torchrun --local-ranks-filter``. | #### `trainer.wandb` | Field | Type | Default | Description | |---|---|---|---| -| `trainer.wandb.project` | str | `'prime-rl'` | W&B project to log to. | -| `trainer.wandb.entity` | str \| None | `None` | W&B entity to log to. | -| `trainer.wandb.name` | str \| None | `None` | W&B run name. | -| `trainer.wandb.group` | str \| None | `None` | W&B group. | -| `trainer.wandb.tags` | list[str] \| None | `None` | W&B tags attached to the run. | -| `trainer.wandb.offline` | bool | `False` | Run W&B in offline mode. | +| `trainer.wandb.project` | `str` | `'prime-rl'` | W&B project to log to. | +| `trainer.wandb.entity` | `str | None` | `None` | W&B entity to log to. | +| `trainer.wandb.name` | `str | None` | `None` | W&B run name. | +| `trainer.wandb.group` | `str | None` | `None` | W&B group. | +| `trainer.wandb.tags` | `list[str] | None` | `None` | W&B tags attached to the run. | +| `trainer.wandb.offline` | `bool` | `False` | Run W&B in offline mode. | #### `trainer.bench` @@ -312,7 +298,7 @@ Benchmark-mode configuration. When set, ``max_steps`` is forced to 4 and fake da | Field | Type | Default | Description | |---|---|---|---| -| `trainer.bench.output_json` | Path \| None | `None` | Path to write benchmark results as JSON. If unset, results are only printed to the console. | +| `trainer.bench.output_json` | `Path | None` | `None` | Path to write benchmark results as JSON. If unset, results are only printed to the console. 
| #### `trainer.gc` @@ -321,7 +307,7 @@ Garbage collection config. Disables automatic GC and runs deterministic collecti | Field | Type | Default | Description | |---|---|---|---| -| `trainer.gc.interval` | int | `50` | _≥1._ Run garbage collection every N training steps. Disables Python's automatic GC so every rank collects together and one slow rank can't stall the others. | +| `trainer.gc.interval` | `int` | `50` | _≥1._ Run garbage collection every N training steps. Disables Python's automatic GC so every rank collects together and one slow rank can't stall the others. | #### `trainer.heartbeat` @@ -330,7 +316,7 @@ BetterStack heartbeat configuration for monitoring training progress. | Field | Type | Default | Description | |---|---|---|---| -| `trainer.heartbeat.url` | str | *required* | URL to send the heartbeat to. | +| `trainer.heartbeat.url` | `str` | *required* | URL to send the heartbeat to. | #### `trainer.metrics_server` @@ -339,8 +325,8 @@ Prometheus metrics server configuration. If set, exposes a ``/metrics`` endpoint | Field | Type | Default | Description | |---|---|---|---| -| `trainer.metrics_server.port` | int | `8000` | _≥1, ≤65535._ Port to expose metrics and health endpoints on. | -| `trainer.metrics_server.host` | str | `'0.0.0.0'` | Host to bind the server to. | +| `trainer.metrics_server.port` | `int` | `8000` | _≥1, ≤65535._ Port to expose metrics and health endpoints on. | +| `trainer.metrics_server.host` | `str` | `'0.0.0.0'` | Host to bind the server to. | #### `trainer.experimental` @@ -362,20 +348,20 @@ Discriminated union — set `trainer.loss.type` to one of `default`, `custom` an | Field | Type | Default | Description | |---|---|---|---| -| `trainer.loss.type` | 'default' | `'default'` | | -| `trainer.loss.dppo_mask_low` | float | `0.2` | _≥0._ Lower DPPO masking threshold. | -| `trainer.loss.dppo_mask_high` | float | `0.2` | _≥0._ Upper DPPO masking threshold. 
| -| `trainer.loss.adv_tau` | float | `1.0` | _≥0._ Temperature for the advantage term. | -| `trainer.loss.kl_tau` | float | `0.001` | _≥0._ Temperature for the KL term. | +| `trainer.loss.type` | `'default'` | `'default'` | | +| `trainer.loss.dppo_mask_low` | `float` | `0.2` | _≥0._ Lower DPPO masking threshold. | +| `trainer.loss.dppo_mask_high` | `float` | `0.2` | _≥0._ Upper DPPO masking threshold. | +| `trainer.loss.adv_tau` | `float` | `1.0` | _≥0._ Temperature for the advantage term. | +| `trainer.loss.kl_tau` | `float` | `0.001` | _≥0._ Temperature for the KL term. | ##### `trainer.loss.type = "custom"` (CustomLossConfig) | Field | Type | Default | Description | |---|---|---|---| -| `trainer.loss.type` | 'custom' | `'custom'` | | -| `trainer.loss.import_path` | str | *required* | Import path to the loss function (e.g. ``my_module.my_loss``). | -| `trainer.loss.kwargs` | dict[str, Any] | `{}` | Kwargs forwarded to the loss function. | +| `trainer.loss.type` | `'custom'` | `'custom'` | | +| `trainer.loss.import_path` | `str` | *required* | Import path to the loss function (e.g. ``my_module.my_loss``). | +| `trainer.loss.kwargs` | `dict[str, Any]` | `{}` | Kwargs forwarded to the loss function. | #### `trainer.optim` @@ -387,47 +373,47 @@ Discriminated union — set `trainer.optim.type` to one of `sgd`, `adamw`, `muon | Field | Type | Default | Description | |---|---|---|---| -| `trainer.optim.lr` | float | `1e-06` | _≥0._ Peak learning rate. | -| `trainer.optim.weight_decay` | float | `0.01` | _≥0._ L2 weight-decay coefficient. | -| `trainer.optim.max_norm` | float \| None | `1.0` | _≥0._ Maximum gradient norm to clip to. If None, gradient clipping is disabled. | -| `trainer.optim.type` | 'sgd' | `'sgd'` | | -| `trainer.optim.nesterov` | bool | `True` | Use Nesterov momentum. | -| `trainer.optim.momentum` | float | `0.9` | SGD momentum factor. | +| `trainer.optim.lr` | `float` | `1e-06` | _≥0._ Peak learning rate. 
| +| `trainer.optim.weight_decay` | `float` | `0.01` | _≥0._ L2 weight-decay coefficient. | +| `trainer.optim.max_norm` | `float | None` | `1.0` | _≥0._ Maximum gradient norm to clip to. If None, gradient clipping is disabled. | +| `trainer.optim.type` | `'sgd'` | `'sgd'` | | +| `trainer.optim.nesterov` | `bool` | `True` | Use Nesterov momentum. | +| `trainer.optim.momentum` | `float` | `0.9` | SGD momentum factor. | ##### `trainer.optim.type = "adamw"` (AdamWConfig) | Field | Type | Default | Description | |---|---|---|---| -| `trainer.optim.lr` | float | `1e-06` | _≥0._ Peak learning rate. | -| `trainer.optim.weight_decay` | float | `0.01` | _≥0._ L2 weight-decay coefficient. | -| `trainer.optim.max_norm` | float \| None | `1.0` | _≥0._ Maximum gradient norm to clip to. If None, gradient clipping is disabled. | -| `trainer.optim.type` | 'adamw' | `'adamw'` | | -| `trainer.optim.betas1` | float | `0.9` | _≥0._ Adam first-moment (β1) decay. | -| `trainer.optim.betas2` | float | `0.999` | _≥0._ Adam second-moment (β2) decay. | +| `trainer.optim.lr` | `float` | `1e-06` | _≥0._ Peak learning rate. | +| `trainer.optim.weight_decay` | `float` | `0.01` | _≥0._ L2 weight-decay coefficient. | +| `trainer.optim.max_norm` | `float | None` | `1.0` | _≥0._ Maximum gradient norm to clip to. If None, gradient clipping is disabled. | +| `trainer.optim.type` | `'adamw'` | `'adamw'` | | +| `trainer.optim.betas1` | `float` | `0.9` | _≥0._ Adam first-moment (β1) decay. | +| `trainer.optim.betas2` | `float` | `0.999` | _≥0._ Adam second-moment (β2) decay. | ##### `trainer.optim.type = "muon"` (MuonConfig) | Field | Type | Default | Description | |---|---|---|---| -| `trainer.optim.lr` | float | `1e-06` | _≥0._ Peak learning rate. | -| `trainer.optim.weight_decay` | float | `0.01` | _≥0._ L2 weight-decay coefficient. | -| `trainer.optim.max_norm` | float \| None | `1.0` | _≥0._ Maximum gradient norm to clip to. If None, gradient clipping is disabled. 
| -| `trainer.optim.type` | 'muon' | `'muon'` | | -| `trainer.optim.mu` | float | `0.95` | _≥0._ Momentum factor for the Muon algorithm. | -| `trainer.optim.betas1` | float | `0.9` | _≥0._ β1 for the AdamW/Lion sub-optimizer used on non-Muon params. | -| `trainer.optim.betas2` | float | `0.95` | _≥0._ β2 for the AdamW/Lion sub-optimizer used on non-Muon params. | +| `trainer.optim.lr` | `float` | `1e-06` | _≥0._ Peak learning rate. | +| `trainer.optim.weight_decay` | `float` | `0.01` | _≥0._ L2 weight-decay coefficient. | +| `trainer.optim.max_norm` | `float | None` | `1.0` | _≥0._ Maximum gradient norm to clip to. If None, gradient clipping is disabled. | +| `trainer.optim.type` | `'muon'` | `'muon'` | | +| `trainer.optim.mu` | `float` | `0.95` | _≥0._ Momentum factor for the Muon algorithm. | +| `trainer.optim.betas1` | `float` | `0.9` | _≥0._ β1 for the AdamW/Lion sub-optimizer used on non-Muon params. | +| `trainer.optim.betas2` | `float` | `0.95` | _≥0._ β2 for the AdamW/Lion sub-optimizer used on non-Muon params. | ##### `trainer.optim.type = "sign_sgd"` (SignSGDConfig) | Field | Type | Default | Description | |---|---|---|---| -| `trainer.optim.lr` | float | `1e-06` | _≥0._ Peak learning rate. | -| `trainer.optim.weight_decay` | float | `0.01` | _≥0._ L2 weight-decay coefficient. | -| `trainer.optim.max_norm` | float \| None | `1.0` | _≥0._ Maximum gradient norm to clip to. If None, gradient clipping is disabled. | -| `trainer.optim.type` | 'sign_sgd' | `'sign_sgd'` | | +| `trainer.optim.lr` | `float` | `1e-06` | _≥0._ Peak learning rate. | +| `trainer.optim.weight_decay` | `float` | `0.01` | _≥0._ L2 weight-decay coefficient. | +| `trainer.optim.max_norm` | `float | None` | `1.0` | _≥0._ Maximum gradient norm to clip to. If None, gradient clipping is disabled. 
| +| `trainer.optim.type` | `'sign_sgd'` | `'sign_sgd'` | | #### `trainer.scheduler` @@ -439,26 +425,26 @@ Discriminated union — set `trainer.scheduler.type` to one of `constant`, `line | Field | Type | Default | Description | |---|---|---|---| -| `trainer.scheduler.type` | 'constant' | `'constant'` | | +| `trainer.scheduler.type` | `'constant'` | `'constant'` | | ##### `trainer.scheduler.type = "linear"` (LinearSchedulerConfig) | Field | Type | Default | Description | |---|---|---|---| -| `trainer.scheduler.type` | 'linear' | `'linear'` | | -| `trainer.scheduler.warmup_steps` | int | `10` | _≥0._ Warmup steps for the learning rate scheduler. | -| `trainer.scheduler.decay_steps` | int | `10` | _≥0._ Steps to decay the learning rate during the final portion of training. | -| `trainer.scheduler.min_lr` | float | `0.0` | _≥0._ Minimum learning rate to converge to. | +| `trainer.scheduler.type` | `'linear'` | `'linear'` | | +| `trainer.scheduler.warmup_steps` | `int` | `10` | _≥0._ Warmup steps for the learning rate scheduler. | +| `trainer.scheduler.decay_steps` | `int` | `10` | _≥0._ Steps to decay the learning rate during the final portion of training. | +| `trainer.scheduler.min_lr` | `float` | `0.0` | _≥0._ Minimum learning rate to converge to. | ##### `trainer.scheduler.type = "cosine"` (CosineSchedulerConfig) | Field | Type | Default | Description | |---|---|---|---| -| `trainer.scheduler.type` | 'cosine' | `'cosine'` | | -| `trainer.scheduler.warmup_steps` | int | `10` | _≥0._ Warmup steps for the learning rate scheduler. | -| `trainer.scheduler.min_lr` | float | `0.0` | _≥0._ Minimum learning rate to converge to. | +| `trainer.scheduler.type` | `'cosine'` | `'cosine'` | | +| `trainer.scheduler.warmup_steps` | `int` | `10` | _≥0._ Warmup steps for the learning rate scheduler. | +| `trainer.scheduler.min_lr` | `float` | `0.0` | _≥0._ Minimum learning rate to converge to. 
| #### `trainer.weight_broadcast` @@ -472,21 +458,21 @@ Discriminated union — set `trainer.weight_broadcast.type` to one of `filesyste | Field | Type | Default | Description | |---|---|---|---| -| `trainer.weight_broadcast.type` | 'filesystem' | `'filesystem'` | | -| `trainer.weight_broadcast.save_sharded` | bool | `True` | Save the weight checkpoint in sharded format. | -| `trainer.weight_broadcast.save_format` | 'safetensors' \| 'torch' | `'safetensors'` | Weight checkpoint serialization format. | +| `trainer.weight_broadcast.type` | `'filesystem'` | `'filesystem'` | | +| `trainer.weight_broadcast.save_sharded` | `bool` | `True` | Save the weight checkpoint in sharded format. | +| `trainer.weight_broadcast.save_format` | `'safetensors' | 'torch'` | `'safetensors'` | Weight checkpoint serialization format. | ##### `trainer.weight_broadcast.type = "nccl"` (NCCLWeightBroadcastConfig) | Field | Type | Default | Description | |---|---|---|---| -| `trainer.weight_broadcast.type` | 'nccl' | `'nccl'` | | -| `trainer.weight_broadcast.host` | str | `'localhost'` | Host for the NCCL broadcast rendezvous. | -| `trainer.weight_broadcast.port` | int | `29501` | Port for the NCCL broadcast rendezvous. | -| `trainer.weight_broadcast.timeout` | int | `1200` | Timeout in seconds for the NCCL broadcast. | -| `trainer.weight_broadcast.inference_world_size` | int | `1` | Number of GPUs used for inference. | -| `trainer.weight_broadcast.quantize_in_weight_transfer` | bool | `False` | Use kernel-format FP8 quantized NCCL transfer for weight updates. When disabled, uses default HF checkpoint-format transfer. | +| `trainer.weight_broadcast.type` | `'nccl'` | `'nccl'` | | +| `trainer.weight_broadcast.host` | `str` | `'localhost'` | Host for the NCCL broadcast rendezvous. | +| `trainer.weight_broadcast.port` | `int` | `29501` | Port for the NCCL broadcast rendezvous. | +| `trainer.weight_broadcast.timeout` | `int` | `1200` | Timeout in seconds for the NCCL broadcast. 
| +| `trainer.weight_broadcast.inference_world_size` | `int` | `1` | Number of GPUs used for inference. | +| `trainer.weight_broadcast.quantize_in_weight_transfer` | `bool` | `False` | Use kernel-format FP8 quantized NCCL transfer for weight updates. When disabled, uses default HF checkpoint-format transfer. | #### `trainer.rollout_transport` @@ -500,44 +486,43 @@ Discriminated union — set `trainer.rollout_transport.type` to one of `filesyst | Field | Type | Default | Description | |---|---|---|---| -| `trainer.rollout_transport.type` | 'filesystem' | `'filesystem'` | | +| `trainer.rollout_transport.type` | `'filesystem'` | `'filesystem'` | | ##### `trainer.rollout_transport.type = "zmq"` (ZMQTransportConfig) | Field | Type | Default | Description | |---|---|---|---| -| `trainer.rollout_transport.type` | 'zmq' | `'zmq'` | | -| `trainer.rollout_transport.host` | str | `'localhost'` | Host address for ZMQ transport. | -| `trainer.rollout_transport.port` | int | `5555` | Base port for ZMQ transport. | -| `trainer.rollout_transport.hwm` | int | `10` | High-water mark (max in-flight messages per ZMQ socket). | +| `trainer.rollout_transport.type` | `'zmq'` | `'zmq'` | | +| `trainer.rollout_transport.host` | `str` | `'localhost'` | Host address for ZMQ transport. | +| `trainer.rollout_transport.port` | `int` | `5555` | Base port for ZMQ transport. | +| `trainer.rollout_transport.hwm` | `int` | `10` | High-water mark (max in-flight messages per ZMQ socket). | ### `orchestrator` | Field | Type | Default | Description | |---|---|---|---| -| `orchestrator.training_mode` | 'rl' \| 'opd' \| 'sft' | `'rl'` | Training mode. ``rl``: student generates rollouts, no teacher. ``opd``: student generates rollouts, teacher computes logprobs (teacher_tau > 0). ``sft``: teacher generates rollouts, student inference pool used for evals and weight sync. 
| -| `orchestrator.advantage` | DefaultAdvantageConfig \| CustomAdvantageConfig \| None | `DefaultAdvantageConfig()` | | -| `orchestrator.filters` | list[GibberishFilterConfig \| RepetitionFilterConfig \| ZeroAdvantageFilterConfig] | `[GibberishFilterConfig(type='gibberish', enforce=False, token_id_threshold=100000, logprob_offset=2.0), RepetitionFilterConfig(type='repetition', enforce=False, window=3000, prob_threshold=0.99), ZeroAdvantageFilterConfig(type='zero_advantage', enforce=True)]` | Rollout filters. Each filter can ``monitor`` (default) or ``enforce`` (skip rollouts). | -| `orchestrator.collect_inference_metrics` | bool | `True` | Collect inference-server metrics (requires wandb). | -| `orchestrator.output_dir` | Path | `'outputs/run_default'` | Directory to write outputs to — checkpoints, weights, rollouts, and logs are written as subdirectories. Should be a persistent directory with enough disk space and unique per experiment running on a single node. | -| `orchestrator.tasks_per_minute` | int \| None | `None` | _≥1._ Rate limit per environment worker, in tasks per minute. Recommended for sandbox-backed environments to prevent sandbox-not-ready errors during autoscaling. With multiple workers, the effective total rate is ``workers × this value``. None disables rate limiting. | -| `orchestrator.batch_size` | int \| None | `None` | _≥1._ Samples to train on per step (rollout-based batching). Set this OR ``token_batch_size``. | -| `orchestrator.token_batch_size` | int \| None | `None` | _≥1._ Tokens to train on per step (token-based batching). Set this OR ``batch_size``. | -| `orchestrator.oversampling_factor` | float \| None | `None` | _>0._ Rollout-mode batching only. Multiplier used to derive ``max_inflight_rollouts`` from ``batch_size`` when ``max_inflight_rollouts`` is unset. Values below 1.0 intentionally cap in-flight rollout capacity below ``batch_size``. 
| -| `orchestrator.max_inflight_rollouts` | int \| None | `None` | _≥1._ Maximum number of rollouts kept in-flight. Required for token-based batching. With ``batch_size`` set, defaults to ``batch_size * oversampling_factor`` (or ``batch_size`` when ``oversampling_factor`` is unset). | -| `orchestrator.group_size` | int | `1` | _≥1._ Output sequences returned per example during training. | -| `orchestrator.seq_len` | int | `2048` | Training sequence length. Shorter samples are padded; longer samples are truncated. | -| `orchestrator.num_train_workers` | int | `1` | _≥1._ Training workers to use. | -| `orchestrator.max_steps` | int \| None | `None` | Maximum training steps. If None, runs indefinitely. | -| `orchestrator.max_off_policy_steps` | int | `8` | _≥0._ Maximum policies allowed to generate a single rollout. Rollouts generated more than ``max_off_policy_steps`` ahead of training are discarded. Higher values yield better throughput at the cost of off-policy noise. | -| `orchestrator.max_async_level` | int | `1` | _≥0._ Maximum steps inference can be ahead of training. ``0`` degenerates to synchronous on-policy RL; ``≥1`` overlaps training and inference. | -| `orchestrator.strict_async_level` | bool | `False` | Strictly enforce ``max_async_level``. When True, the rollout policy is always exactly ``max_async_level`` steps ahead of training. When False, any policy within ``max_async_level`` steps is allowed (always uses the latest available policy). | -| `orchestrator.bench` | bool | `False` | Benchmark mode. Sets ``max_steps`` to 5, ``max_async_level`` to ~∞, and disables W&B. | -| `orchestrator.seed` | int \| None | `42` | Random seed for the orchestrator. | -| `orchestrator.use_renderer` | bool | `True` | Use the renderer-backed TITO client (client-side tokenization via the ``renderers`` package, served by ``/v1/generate``). When True, the ``[orchestrator.renderer]`` block (name / tool_parser / reasoning_parser / pool_size) applies. 
Default for both text-only and VLM rollouts; VLMs require it. False falls back to MITO (``openai_chat_completions``). | -| `orchestrator.env_install_prerelease` | bool | `False` | Allow pre-release versions when installing environments (e.g. ``verifiers>=0.1.12.dev5``). Passes ``--prerelease`` to ``prime env install``. | +| `orchestrator.training_mode` | `'rl' | 'opd' | 'sft'` | `'rl'` | Training mode. ``rl``: student generates rollouts, no teacher. ``opd``: student generates rollouts, teacher computes logprobs (teacher_tau > 0). ``sft``: teacher generates rollouts, student inference pool used for evals and weight sync. | +| `orchestrator.advantage` | `DefaultAdvantageConfig | CustomAdvantageConfig | None` | `DefaultAdvantageConfig()` | | +| `orchestrator.collect_inference_metrics` | `bool` | `True` | Collect inference-server metrics (requires wandb). | +| `orchestrator.output_dir` | `Path` | `'outputs/run_default'` | Directory to write outputs to — checkpoints, weights, rollouts, and logs are written as subdirectories. Should be a persistent directory with enough disk space and unique per experiment running on a single node. | +| `orchestrator.tasks_per_minute` | `int | None` | `None` | _≥1._ Rate limit per environment worker, in tasks per minute. Recommended for sandbox-backed environments to prevent sandbox-not-ready errors during autoscaling. With multiple workers, the effective total rate is ``workers × this value``. None disables rate limiting. | +| `orchestrator.batch_size` | `int | None` | `None` | _≥1._ Samples to train on per step (rollout-based batching). Set this OR ``token_batch_size``. | +| `orchestrator.token_batch_size` | `int | None` | `None` | _≥1._ Tokens to train on per step (token-based batching). Set this OR ``batch_size``. | +| `orchestrator.oversampling_factor` | `float | None` | `None` | _>0._ Rollout-mode batching only. Multiplier used to derive ``max_inflight_rollouts`` from ``batch_size`` when ``max_inflight_rollouts`` is unset. 
Values below 1.0 intentionally cap in-flight rollout capacity below ``batch_size``. | +| `orchestrator.max_inflight_rollouts` | `int | None` | `None` | _≥1._ Maximum number of rollouts kept in-flight. Required for token-based batching. With ``batch_size`` set, defaults to ``batch_size * oversampling_factor`` (or ``batch_size`` when ``oversampling_factor`` is unset). | +| `orchestrator.group_size` | `int` | `1` | _≥1._ Output sequences returned per example during training. | +| `orchestrator.seq_len` | `int` | `2048` | Training sequence length. Shorter samples are padded; longer samples are truncated. | +| `orchestrator.num_train_workers` | `int` | `1` | _≥1._ Training workers to use. | +| `orchestrator.max_steps` | `int | None` | `None` | Maximum training steps. If None, runs indefinitely. | +| `orchestrator.max_off_policy_steps` | `int` | `8` | _≥0._ Maximum policies allowed to generate a single rollout. Rollouts generated more than ``max_off_policy_steps`` ahead of training are discarded. Higher values yield better throughput at the cost of off-policy noise. | +| `orchestrator.max_async_level` | `int` | `1` | _≥0._ Maximum steps inference can be ahead of training. ``0`` degenerates to synchronous on-policy RL; ``≥1`` overlaps training and inference. | +| `orchestrator.strict_async_level` | `bool` | `False` | Strictly enforce ``max_async_level``. When True, the rollout policy is always exactly ``max_async_level`` steps ahead of training. When False, any policy within ``max_async_level`` steps is allowed (always uses the latest available policy). | +| `orchestrator.bench` | `bool` | `False` | Benchmark mode. Sets ``max_steps`` to 5, ``max_async_level`` to ~∞, and disables W&B. | +| `orchestrator.seed` | `int | None` | `42` | Random seed for the orchestrator. | +| `orchestrator.use_renderer` | `bool` | `True` | Use the renderer-backed TITO client (client-side tokenization via the ``renderers`` package, served by ``/v1/generate``). 
When True, the ``[orchestrator.renderer]`` block (name / tool_parser / reasoning_parser / pool_size) applies. Default for both text-only and VLM rollouts; VLMs require it. False falls back to MITO (``openai_chat_completions``). | +| `orchestrator.env_install_prerelease` | `bool` | `False` | Allow pre-release versions when installing environments (e.g. ``verifiers>=0.1.12.dev5``). Passes ``--prerelease`` to ``prime env install``. | #### `orchestrator.student` @@ -549,8 +534,8 @@ Student rollout participant (model + client) — the model being trained. | Field | Type | Default | Description | |---|---|---|---| -| `orchestrator.student.model.name` | str | `'Qwen/Qwen3-0.6B'` | HF model name or local path. | -| `orchestrator.student.model.trust_remote_code` | bool | `False` | Trust remote code when initializing the tokenizer. | +| `orchestrator.student.model.name` | `str` | `'Qwen/Qwen3-0.6B'` | HF model name or local path. | +| `orchestrator.student.model.trust_remote_code` | `bool` | `False` | Trust remote code when initializing the tokenizer. | ###### `orchestrator.student.model.vlm` @@ -559,9 +544,9 @@ VLM configuration. Setting this enables vision-language model support. | Field | Type | Default | Description | |---|---|---|---| -| `orchestrator.student.model.vlm.vision_encoder_attr` | str | *required* | Dotted attribute path to the vision encoder module (e.g. ``model.visual``). | -| `orchestrator.student.model.vlm.language_model_attr` | str | *required* | Dotted attribute path to the language model module (e.g. ``model.language_model``). | -| `orchestrator.student.model.vlm.freeze_vision_encoder` | bool | `True` | Freeze the vision encoder. When False, it is trainable and FSDP-sharded per-block. No effect with LoRA (LoRA freezes all non-adapter parameters). | +| `orchestrator.student.model.vlm.vision_encoder_attr` | `str` | *required* | Dotted attribute path to the vision encoder module (e.g. ``model.visual``). 
| +| `orchestrator.student.model.vlm.language_model_attr` | `str` | *required* | Dotted attribute path to the language model module (e.g. ``model.language_model``). | +| `orchestrator.student.model.vlm.freeze_vision_encoder` | `bool` | `True` | Freeze the vision encoder. When False, it is trainable and FSDP-sharded per-block. No effect with LoRA (LoRA freezes all non-adapter parameters). | ###### `orchestrator.student.model.lora` @@ -570,27 +555,27 @@ Per-run LoRA configuration. If None, LoRA is disabled. | Field | Type | Default | Description | |---|---|---|---| -| `orchestrator.student.model.lora.name` | str \| None | `None` | LoRA adapter name. If None, auto-generated from rank and alpha. | -| `orchestrator.student.model.lora.rank` | int \| None | `None` | _≥1._ LoRA rank for this run. Must be ≤ trainer's max rank. If None, uses the trainer's rank. | -| `orchestrator.student.model.lora.alpha` | float \| None | `None` | _≥0._ LoRA alpha for this run. If None, uses the trainer's alpha. | +| `orchestrator.student.model.lora.name` | `str | None` | `None` | LoRA adapter name. If None, auto-generated from rank and alpha. | +| `orchestrator.student.model.lora.rank` | `int | None` | `None` | _≥1._ LoRA rank for this run. Must be ≤ trainer's max rank. If None, uses the trainer's rank. | +| `orchestrator.student.model.lora.alpha` | `float | None` | `None` | _≥0._ LoRA alpha for this run. If None, uses the trainer's alpha. | ##### `orchestrator.student.client` | Field | Type | Default | Description | |---|---|---|---| -| `orchestrator.student.client.timeout` | int | `1200` | Request timeout in seconds. | -| `orchestrator.student.client.connect_timeout` | float | `30.0` | TCP connect timeout in seconds for inference API requests. | -| `orchestrator.student.client.wait_for_ready_timeout` | int | `1800` | Seconds to wait at startup for the inference pool to become ready. Applies to both the static health check and elastic DNS-based discovery. 
| -| `orchestrator.student.client.base_url` | list[str] | `['http://localhost:8000/v1']` | Base URLs for the OpenAI API. With more than one URL, the client round-robins (chat) completion requests across all servers. Ignored when ``elastic`` is set. | -| `orchestrator.student.client.api_key_var` | str | `'VLLM_API_KEY'` | Environment variable name containing the API key, resolved via ``os.getenv``. Can be any string when the server is not protected by an API key; the same key is used for every URL. | -| `orchestrator.student.client.headers` | dict[str, str] | `{}` | Static headers sent with every request. | -| `orchestrator.student.client.headers_from_env` | dict[str, str] | `{}` | Maps HTTP header names to environment variable names; each entry is resolved via ``os.getenv`` and merged into request headers. e.g. ``{"X-Prime-Team-ID": "PRIME_TEAM_ID"}``. | -| `orchestrator.student.client.extra_headers_from_state` | dict[str, str] | `{}` | Maps HTTP header names to rollout-state field names. The header value is read from the rollout state dict on every request. e.g. ``{"X-Session-ID": "trajectory_id"}`` enables sticky routing at the inference router. | -| `orchestrator.student.client.skip_model_check` | bool | `False` | Skip checking that the model is available in the inference pool. Useful for external APIs or keys that do not expose ``/models``. | -| `orchestrator.student.client.dp_rank_count` | int | `1` | _≥1._ Number of data-parallel ranks behind each base URL. When > 1, each URL is expanded into ``dp_rank_count`` logical clients pinned via the ``X-data-parallel-rank`` header, so every request within a rollout hits the same DP engine and reuses KV cache. Auto-set from the inference config when using the RL entrypoint. | -| `orchestrator.student.client.admin_base_url` | list[str] \| None | `None` | Separate base URLs for admin operations (weight updates, health checks). 
When set, admin clients bypass routers and hit each server directly — used in disaggregated P/D deployments where the router must not handle admin traffic. | -| `orchestrator.student.client.router_url` | str \| None | `None` | vllm-router URL for load-aware inference routing. With elastic mode, inference requests go through the router while admin ops still hit discovered pods directly. | +| `orchestrator.student.client.timeout` | `int` | `1200` | Request timeout in seconds. | +| `orchestrator.student.client.connect_timeout` | `float` | `30.0` | TCP connect timeout in seconds for inference API requests. | +| `orchestrator.student.client.wait_for_ready_timeout` | `int` | `1800` | Seconds to wait at startup for the inference pool to become ready. Applies to both the static health check and elastic DNS-based discovery. | +| `orchestrator.student.client.base_url` | `list[str]` | `['http://localhost:8000/v1']` | Base URLs for the OpenAI API. With more than one URL, the client round-robins (chat) completion requests across all servers. Ignored when ``elastic`` is set. | +| `orchestrator.student.client.api_key_var` | `str` | `'VLLM_API_KEY'` | Environment variable name containing the API key, resolved via ``os.getenv``. Can be any string when the server is not protected by an API key; the same key is used for every URL. | +| `orchestrator.student.client.headers` | `dict[str, str]` | `{}` | Static headers sent with every request. | +| `orchestrator.student.client.headers_from_env` | `dict[str, str]` | `{}` | Maps HTTP header names to environment variable names; each entry is resolved via ``os.getenv`` and merged into request headers. e.g. ``{"X-Prime-Team-ID": "PRIME_TEAM_ID"}``. | +| `orchestrator.student.client.extra_headers_from_state` | `dict[str, str]` | `{}` | Maps HTTP header names to rollout-state field names. The header value is read from the rollout state dict on every request. e.g. 
``{"X-Session-ID": "trajectory_id"}`` enables sticky routing at the inference router. | +| `orchestrator.student.client.skip_model_check` | `bool` | `False` | Skip checking that the model is available in the inference pool. Useful for external APIs or keys that do not expose ``/models``. | +| `orchestrator.student.client.dp_rank_count` | `int` | `1` | _≥1._ Number of data-parallel ranks behind each base URL. When > 1, each URL is expanded into ``dp_rank_count`` logical clients pinned via the ``X-data-parallel-rank`` header, so every request within a rollout hits the same DP engine and reuses KV cache. Auto-set from the inference config when using the RL entrypoint. | +| `orchestrator.student.client.admin_base_url` | `list[str] | None` | `None` | Separate base URLs for admin operations (weight updates, health checks). When set, admin clients bypass routers and hit each server directly — used in disaggregated P/D deployments where the router must not handle admin traffic. | +| `orchestrator.student.client.router_url` | `str | None` | `None` | vllm-router URL for load-aware inference routing. With elastic mode, inference requests go through the router while admin ops still hit discovered pods directly. | ###### `orchestrator.student.client.elastic` @@ -599,9 +584,9 @@ Elastic inference pool config for DNS-based service discovery. When set, ``base_ | Field | Type | Default | Description | |---|---|---|---| -| `orchestrator.student.client.elastic.hostname` | str | *required* | DNS hostname that resolves to inference server IPs. | -| `orchestrator.student.client.elastic.port` | int | `8000` | Port that inference servers listen on. | -| `orchestrator.student.client.elastic.sync_interval` | float | `5.0` | Seconds between server discovery checks. | +| `orchestrator.student.client.elastic.hostname` | `str` | *required* | DNS hostname that resolves to inference server IPs. 
| +| `orchestrator.student.client.elastic.port` | `int` | `8000` | Port that inference servers listen on. | +| `orchestrator.student.client.elastic.sync_interval` | `float` | `5.0` | Seconds between server discovery checks. | #### `orchestrator.teacher` @@ -613,8 +598,8 @@ Teacher rollout participant (model + client). Role depends on ``training_mode``: | Field | Type | Default | Description | |---|---|---|---| -| `orchestrator.teacher.model.name` | str | `'Qwen/Qwen3-0.6B'` | HF model name or local path. | -| `orchestrator.teacher.model.trust_remote_code` | bool | `False` | Trust remote code when initializing the tokenizer. | +| `orchestrator.teacher.model.name` | `str` | `'Qwen/Qwen3-0.6B'` | HF model name or local path. | +| `orchestrator.teacher.model.trust_remote_code` | `bool` | `False` | Trust remote code when initializing the tokenizer. | ###### `orchestrator.teacher.model.vlm` @@ -623,9 +608,9 @@ VLM configuration. Setting this enables vision-language model support. | Field | Type | Default | Description | |---|---|---|---| -| `orchestrator.teacher.model.vlm.vision_encoder_attr` | str | *required* | Dotted attribute path to the vision encoder module (e.g. ``model.visual``). | -| `orchestrator.teacher.model.vlm.language_model_attr` | str | *required* | Dotted attribute path to the language model module (e.g. ``model.language_model``). | -| `orchestrator.teacher.model.vlm.freeze_vision_encoder` | bool | `True` | Freeze the vision encoder. When False, it is trainable and FSDP-sharded per-block. No effect with LoRA (LoRA freezes all non-adapter parameters). | +| `orchestrator.teacher.model.vlm.vision_encoder_attr` | `str` | *required* | Dotted attribute path to the vision encoder module (e.g. ``model.visual``). | +| `orchestrator.teacher.model.vlm.language_model_attr` | `str` | *required* | Dotted attribute path to the language model module (e.g. ``model.language_model``). 
| +| `orchestrator.teacher.model.vlm.freeze_vision_encoder` | `bool` | `True` | Freeze the vision encoder. When False, it is trainable and FSDP-sharded per-block. No effect with LoRA (LoRA freezes all non-adapter parameters). | ###### `orchestrator.teacher.model.lora` @@ -634,27 +619,27 @@ Per-run LoRA configuration. If None, LoRA is disabled. | Field | Type | Default | Description | |---|---|---|---| -| `orchestrator.teacher.model.lora.name` | str \| None | `None` | LoRA adapter name. If None, auto-generated from rank and alpha. | -| `orchestrator.teacher.model.lora.rank` | int \| None | `None` | _≥1._ LoRA rank for this run. Must be ≤ trainer's max rank. If None, uses the trainer's rank. | -| `orchestrator.teacher.model.lora.alpha` | float \| None | `None` | _≥0._ LoRA alpha for this run. If None, uses the trainer's alpha. | +| `orchestrator.teacher.model.lora.name` | `str | None` | `None` | LoRA adapter name. If None, auto-generated from rank and alpha. | +| `orchestrator.teacher.model.lora.rank` | `int | None` | `None` | _≥1._ LoRA rank for this run. Must be ≤ trainer's max rank. If None, uses the trainer's rank. | +| `orchestrator.teacher.model.lora.alpha` | `float | None` | `None` | _≥0._ LoRA alpha for this run. If None, uses the trainer's alpha. | ##### `orchestrator.teacher.client` | Field | Type | Default | Description | |---|---|---|---| -| `orchestrator.teacher.client.timeout` | int | `1200` | Request timeout in seconds. | -| `orchestrator.teacher.client.connect_timeout` | float | `30.0` | TCP connect timeout in seconds for inference API requests. | -| `orchestrator.teacher.client.wait_for_ready_timeout` | int | `1800` | Seconds to wait at startup for the inference pool to become ready. Applies to both the static health check and elastic DNS-based discovery. | -| `orchestrator.teacher.client.base_url` | list[str] | `['http://localhost:8000/v1']` | Base URLs for the OpenAI API. 
With more than one URL, the client round-robins (chat) completion requests across all servers. Ignored when ``elastic`` is set. | -| `orchestrator.teacher.client.api_key_var` | str | `'VLLM_API_KEY'` | Environment variable name containing the API key, resolved via ``os.getenv``. Can be any string when the server is not protected by an API key; the same key is used for every URL. | -| `orchestrator.teacher.client.headers` | dict[str, str] | `{}` | Static headers sent with every request. | -| `orchestrator.teacher.client.headers_from_env` | dict[str, str] | `{}` | Maps HTTP header names to environment variable names; each entry is resolved via ``os.getenv`` and merged into request headers. e.g. ``{"X-Prime-Team-ID": "PRIME_TEAM_ID"}``. | -| `orchestrator.teacher.client.extra_headers_from_state` | dict[str, str] | `{}` | Maps HTTP header names to rollout-state field names. The header value is read from the rollout state dict on every request. e.g. ``{"X-Session-ID": "trajectory_id"}`` enables sticky routing at the inference router. | -| `orchestrator.teacher.client.skip_model_check` | bool | `False` | Skip checking that the model is available in the inference pool. Useful for external APIs or keys that do not expose ``/models``. | -| `orchestrator.teacher.client.dp_rank_count` | int | `1` | _≥1._ Number of data-parallel ranks behind each base URL. When > 1, each URL is expanded into ``dp_rank_count`` logical clients pinned via the ``X-data-parallel-rank`` header, so every request within a rollout hits the same DP engine and reuses KV cache. Auto-set from the inference config when using the RL entrypoint. | -| `orchestrator.teacher.client.admin_base_url` | list[str] \| None | `None` | Separate base URLs for admin operations (weight updates, health checks). When set, admin clients bypass routers and hit each server directly — used in disaggregated P/D deployments where the router must not handle admin traffic. 
| -| `orchestrator.teacher.client.router_url` | str \| None | `None` | vllm-router URL for load-aware inference routing. With elastic mode, inference requests go through the router while admin ops still hit discovered pods directly. | +| `orchestrator.teacher.client.timeout` | `int` | `1200` | Request timeout in seconds. | +| `orchestrator.teacher.client.connect_timeout` | `float` | `30.0` | TCP connect timeout in seconds for inference API requests. | +| `orchestrator.teacher.client.wait_for_ready_timeout` | `int` | `1800` | Seconds to wait at startup for the inference pool to become ready. Applies to both the static health check and elastic DNS-based discovery. | +| `orchestrator.teacher.client.base_url` | `list[str]` | `['http://localhost:8000/v1']` | Base URLs for the OpenAI API. With more than one URL, the client round-robins (chat) completion requests across all servers. Ignored when ``elastic`` is set. | +| `orchestrator.teacher.client.api_key_var` | `str` | `'VLLM_API_KEY'` | Environment variable name containing the API key, resolved via ``os.getenv``. Can be any string when the server is not protected by an API key; the same key is used for every URL. | +| `orchestrator.teacher.client.headers` | `dict[str, str]` | `{}` | Static headers sent with every request. | +| `orchestrator.teacher.client.headers_from_env` | `dict[str, str]` | `{}` | Maps HTTP header names to environment variable names; each entry is resolved via ``os.getenv`` and merged into request headers. e.g. ``{"X-Prime-Team-ID": "PRIME_TEAM_ID"}``. | +| `orchestrator.teacher.client.extra_headers_from_state` | `dict[str, str]` | `{}` | Maps HTTP header names to rollout-state field names. The header value is read from the rollout state dict on every request. e.g. ``{"X-Session-ID": "trajectory_id"}`` enables sticky routing at the inference router. | +| `orchestrator.teacher.client.skip_model_check` | `bool` | `False` | Skip checking that the model is available in the inference pool. 
Useful for external APIs or keys that do not expose ``/models``. | +| `orchestrator.teacher.client.dp_rank_count` | `int` | `1` | _≥1._ Number of data-parallel ranks behind each base URL. When > 1, each URL is expanded into ``dp_rank_count`` logical clients pinned via the ``X-data-parallel-rank`` header, so every request within a rollout hits the same DP engine and reuses KV cache. Auto-set from the inference config when using the RL entrypoint. | +| `orchestrator.teacher.client.admin_base_url` | `list[str] | None` | `None` | Separate base URLs for admin operations (weight updates, health checks). When set, admin clients bypass routers and hit each server directly — used in disaggregated P/D deployments where the router must not handle admin traffic. | +| `orchestrator.teacher.client.router_url` | `str | None` | `None` | vllm-router URL for load-aware inference routing. With elastic mode, inference requests go through the router while admin ops still hit discovered pods directly. | ###### `orchestrator.teacher.client.elastic` @@ -663,18 +648,17 @@ Elastic inference pool config for DNS-based service discovery. When set, ``base_ | Field | Type | Default | Description | |---|---|---|---| -| `orchestrator.teacher.client.elastic.hostname` | str | *required* | DNS hostname that resolves to inference server IPs. | -| `orchestrator.teacher.client.elastic.port` | int | `8000` | Port that inference servers listen on. | -| `orchestrator.teacher.client.elastic.sync_interval` | float | `5.0` | Seconds between server discovery checks. | +| `orchestrator.teacher.client.elastic.hostname` | `str` | *required* | DNS hostname that resolves to inference server IPs. | +| `orchestrator.teacher.client.elastic.port` | `int` | `8000` | Port that inference servers listen on. | +| `orchestrator.teacher.client.elastic.sync_interval` | `float` | `5.0` | Seconds between server discovery checks. 
| #### `orchestrator.train` | Field | Type | Default | Description | |---|---|---|---| -| `orchestrator.train.env` | list[TrainEnvConfig] | `[TrainEnvConfig(id='reverse-text', name=None, args={}, extra_env_kwargs={'max_total_completion_tokens': -1}, address=None, num_workers='auto', ratio=None, max_retries=3, max_total_completion_tokens=-1, timeout=None, state_columns=[], sampling=TrainSamplingConfig(temperature=1.0, repetition_penalty=1.0, max_completion_tokens=None, min_tokens=0, seed=None, extra_body={}))]` | Training environments. | -| `orchestrator.train.num_workers` | int \| 'auto' | `'auto'` | Default worker processes for env servers. Can be overridden per env. | -| `orchestrator.train.max_retries` | int | `3` | _≥0._ Default retries for failed rollouts. Can be overridden per env. | +| `orchestrator.train.num_workers` | `int | 'auto'` | `'auto'` | Default worker processes for env servers. Can be overridden per env. | +| `orchestrator.train.max_retries` | `int` | `3` | _≥0._ Default retries for failed rollouts. Can be overridden per env. | ##### `orchestrator.train.sampling` @@ -683,21 +667,54 @@ Shared training sampling configuration. | Field | Type | Default | Description | |---|---|---|---| -| `orchestrator.train.sampling.temperature` | float | `1.0` | _≥0._ Sampling temperature. | -| `orchestrator.train.sampling.repetition_penalty` | float | `1.0` | _≥0._ Repetition penalty. Values > 1.0 discourage repetition, < 1.0 encourage it, 1.0 disables. | -| `orchestrator.train.sampling.max_completion_tokens` | int \| None | `None` | Maximum output tokens per turn. If None, generates until max context length or EOS. | -| `orchestrator.train.sampling.min_tokens` | int | `0` | _≥0._ Minimum output tokens per sequence. | -| `orchestrator.train.sampling.seed` | int \| None | `None` | Random seed for sampling. If None, no seeding is used. 
| -| `orchestrator.train.sampling.extra_body` | dict[str, Any] | `{}` | Extra body forwarded with each request to the inference server. | +| `orchestrator.train.sampling.temperature` | `float` | `1.0` | _≥0._ Sampling temperature. | +| `orchestrator.train.sampling.repetition_penalty` | `float` | `1.0` | _≥0._ Repetition penalty. Values > 1.0 discourage repetition, < 1.0 encourage it, 1.0 disables. | +| `orchestrator.train.sampling.max_completion_tokens` | `int | None` | `None` | Maximum output tokens per turn. If None, generates until max context length or EOS. | +| `orchestrator.train.sampling.min_tokens` | `int` | `0` | _≥0._ Minimum output tokens per sequence. | +| `orchestrator.train.sampling.seed` | `int | None` | `None` | Random seed for sampling. If None, no seeding is used. | +| `orchestrator.train.sampling.extra_body` | `dict[str, Any]` | `{}` | Extra body forwarded with each request to the inference server. | + + +##### `orchestrator.train.env.` (list item) + +Training environments. + +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.train.env..id` | `str` | `'reverse-text'` | Registered verifiers environment ID (e.g. ``math-env``, ``primeintellect/math-env``). May include an ``@version`` suffix for installation. | +| `orchestrator.train.env..name` | `str | None` | `None` | Display name for this environment in logs, metrics, and buffer keys. Defaults to the ``id`` without ``@version``. Must be unique across all envs in the same group. | +| `orchestrator.train.env..args` | `dict` | `{}` | Keyword arguments forwarded to ``vf.load_environment``. See the environment's docstring for accepted args. | +| `orchestrator.train.env..extra_env_kwargs` | `dict[str, Any]` | `{}` | Extra kwargs passed to the env (e.g. ``seq_len``, ``max_total_completion_tokens``). Auto-populated by the orchestrator; user overrides are generally discouraged. 
The main use case is matching ``extra_env_kwargs`` when running an env in an isolated environment server. | +| `orchestrator.train.env..address` | `str | None` | `None` | ZMQ address of an external env server (e.g. ``tcp://host:5000``). When set, the orchestrator connects to this server instead of spawning one; when None, a subprocess env server is spawned automatically. | +| `orchestrator.train.env..num_workers` | `int | 'auto'` | `'auto'` | Worker processes for the spawned env server. ``auto`` scales to 1 worker per 256 concurrent rollouts. Ignored when ``address`` is set. | +| `orchestrator.train.env..ratio` | `float | None` | `None` | _>0._ Sampling weight for this environment in the buffer. When None for all envs, samples uniformly across all available problems. When set, must be set on all envs — values are relative weights normalized to probabilities (e.g. [1, 1] and [0.5, 0.5] are equivalent). | +| `orchestrator.train.env..max_retries` | `int` | `3` | _≥0._ Times the env server retries a failed rollout before returning an error. | +| `orchestrator.train.env..max_total_completion_tokens` | `int` | `-1` | Maximum total completion tokens across all turns in a multi-turn rollout. ``-1`` disables. Auto-populated into ``extra_env_kwargs``. | +| `orchestrator.train.env..timeout` | `float | None` | `None` | Per-rollout wall-clock timeout in seconds. None disables. | +| `orchestrator.train.env..state_columns` | `list[str]` | `[]` | Extra ``State`` fields to persist into the saved rollout records (in addition to the always-saved ``trajectory`` and ``sampling_args``). Values must be JSON-serializable. | + + +###### `orchestrator.train.env..sampling` + +Per-env sampling overrides. Unset fields inherit from the group-level train sampling config. + +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.train.env..sampling.temperature` | `float` | `1.0` | _≥0._ Sampling temperature. 
| +| `orchestrator.train.env..sampling.repetition_penalty` | `float` | `1.0` | _≥0._ Repetition penalty. Values > 1.0 discourage repetition, < 1.0 encourage it, 1.0 disables. | +| `orchestrator.train.env..sampling.max_completion_tokens` | `int | None` | `None` | Maximum output tokens per turn. If None, generates until max context length or EOS. | +| `orchestrator.train.env..sampling.min_tokens` | `int` | `0` | _≥0._ Minimum output tokens per sequence. | +| `orchestrator.train.env..sampling.seed` | `int | None` | `None` | Random seed for sampling. If None, no seeding is used. | +| `orchestrator.train.env..sampling.extra_body` | `dict[str, Any]` | `{}` | Extra body forwarded with each request to the inference server. | #### `orchestrator.tokenizer` | Field | Type | Default | Description | |---|---|---|---| -| `orchestrator.tokenizer.name` | str \| None | `None` | Tokenizer name or path. If None, the model's default tokenizer is used. | -| `orchestrator.tokenizer.trust_remote_code` | bool \| None | `None` | Trust remote code when initializing the tokenizer. If None, inherits the model's ``trust_remote_code`` setting. | -| `orchestrator.tokenizer.chat_template` | str \| None | `None` | Chat template for the tokenizer. Either a Jinja2 template string or a path to a template file. If None, the tokenizer's default chat template is used. | +| `orchestrator.tokenizer.name` | `str | None` | `None` | Tokenizer name or path. If None, the model's default tokenizer is used. | +| `orchestrator.tokenizer.trust_remote_code` | `bool | None` | `None` | Trust remote code when initializing the tokenizer. If None, inherits the model's ``trust_remote_code`` setting. | +| `orchestrator.tokenizer.chat_template` | `str | None` | `None` | Chat template for the tokenizer. Either a Jinja2 template string or a path to a template file. If None, the tokenizer's default chat template is used. | #### `orchestrator.renderer` @@ -706,12 +723,12 @@ Client-side renderer configuration. 
Only consumed when ``use_renderer=true``. | Field | Type | Default | Description | |---|---|---|---| -| `orchestrator.renderer.name` | str | `'auto'` | Renderer used for chat-template tokenization. One of: ``auto`` (detect from tokenizer), ``qwen3``, ``qwen3_vl``, ``qwen3.5``, ``glm5``, ``glm4.5``, ``minimax-m2``, ``deepseek_v3``, ``kimi_k2``, ``kimi_k25``, ``nemotron3``, ``gpt_oss``, ``default``. | -| `orchestrator.renderer.tool_parser` | str \| None | `None` | Tool parser from ``renderers.parsers``. Only consumed by DefaultRenderer; model-specific renderers bake their own parsing in. Options: ``qwen3``, ``qwen3.5``, ``glm``, ``deepseek_v3``. | -| `orchestrator.renderer.reasoning_parser` | str \| None | `None` | Reasoning parser from ``renderers.parsers``. Only consumed by DefaultRenderer. Options: ``think``. | -| `orchestrator.renderer.pool_size` | int \| None | `None` | _≥1._ Number of renderer slots shared across concurrent rollouts. Bump for long multi-turn prompts where client-side jinja tokenization serializes. | -| `orchestrator.renderer.preserve_all_thinking` | bool | `False` | Re-emit every past-assistant turn's ``reasoning_content`` between ``<think>``/``</think>`` (or the model's equivalent), even when the chat template would drop it. Strict superset of preserve_thinking_between_tool_calls. | -| `orchestrator.renderer.preserve_thinking_between_tool_calls` | bool | `False` | Preserve past-assistant ``reasoning_content`` only inside the current tool cycle — the contiguous assistant→tool→…→assistant block after the most recent user message, when that block contains at least one tool response. A new user turn closes the block. | +| `orchestrator.renderer.name` | `str` | `'auto'` | Renderer used for chat-template tokenization. One of: ``auto`` (detect from tokenizer), ``qwen3``, ``qwen3_vl``, ``qwen3.5``, ``glm5``, ``glm4.5``, ``minimax-m2``, ``deepseek_v3``, ``kimi_k2``, ``kimi_k25``, ``nemotron3``, ``gpt_oss``, ``default``.
| +| `orchestrator.renderer.tool_parser` | `str | None` | `None` | Tool parser from ``renderers.parsers``. Only consumed by DefaultRenderer; model-specific renderers bake their own parsing in. Options: ``qwen3``, ``qwen3.5``, ``glm``, ``deepseek_v3``. | +| `orchestrator.renderer.reasoning_parser` | `str | None` | `None` | Reasoning parser from ``renderers.parsers``. Only consumed by DefaultRenderer. Options: ``think``. | +| `orchestrator.renderer.pool_size` | `int | None` | `None` | _≥1._ Number of renderer slots shared across concurrent rollouts. Bump for long multi-turn prompts where client-side jinja tokenization serializes. | +| `orchestrator.renderer.preserve_all_thinking` | `bool` | `False` | Re-emit every past-assistant turn's ``reasoning_content`` between ``<think>``/``</think>`` (or the model's equivalent), even when the chat template would drop it. Strict superset of preserve_thinking_between_tool_calls. | +| `orchestrator.renderer.preserve_thinking_between_tool_calls` | `bool` | `False` | Preserve past-assistant ``reasoning_content`` only inside the current tool cycle — the contiguous assistant→tool→…→assistant block after the most recent user message, when that block contains at least one tool response. A new user turn closes the block. | #### `orchestrator.optim` @@ -720,7 +737,7 @@ Per-run optimizer configuration for multi-run training. | Field | Type | Default | Description | |---|---|---|---| -| `orchestrator.optim.lr` | float | `0.0001` | _≥0._ Learning rate for this run (per-run override for multi-run training). | +| `orchestrator.optim.lr` | `float` | `0.0001` | _≥0._ Learning rate for this run (per-run override for multi-run training). | #### `orchestrator.eval` @@ -729,15 +746,14 @@ Evaluation configuration.
| Field | Type | Default | Description | |---|---|---|---| -| `orchestrator.eval.env` | list[EvalEnvConfig] | `[EvalEnvConfig(id='reverse-text', name=None, args={}, extra_env_kwargs={'max_total_completion_tokens': -1}, address=None, num_workers='auto', ratio=None, max_retries=3, max_total_completion_tokens=-1, timeout=None, state_columns=[], sampling=EvalSamplingConfig(temperature=None, repetition_penalty=None, top_p=None, top_k=None, min_p=None, max_completion_tokens=None, min_tokens=None, reasoning_effort=None, seed=None, extra_body={}), num_examples=-1, group_size=1, interval=100)]` | Evaluation environments. | -| `orchestrator.eval.num_examples` | int | `-1` | Default eval examples per environment. ``-1`` uses all. Can be overridden per env. | -| `orchestrator.eval.group_size` | int | `1` | _≥1._ Default rollouts per example. Can be overridden per env. | -| `orchestrator.eval.num_workers` | int \| 'auto' | `'auto'` | Default worker processes for env servers. Can be overridden per env. | -| `orchestrator.eval.max_retries` | int | `3` | _≥0._ Default retries for failed rollouts. Can be overridden per env. | -| `orchestrator.eval.interval` | int | `100` | _≥1._ Step interval at which to evaluate the model. | -| `orchestrator.eval.eval_base_model` | bool | `True` | Evaluate the base model we are training on. | -| `orchestrator.eval.skip_eval_on_resume` | bool | `True` | When resuming the orchestrator from a checkpoint, skip the (potentially redundant) online eval that would otherwise run immediately at the resumed step. | -| `orchestrator.eval.cancel_inflight_rollouts_on_eval` | bool | `False` | Cancel in-flight training rollouts before starting online evals. Avoids congestion (no training + eval rollouts at the same time) at the cost of slower training steps as the pipeline has to refill after each eval. | +| `orchestrator.eval.num_examples` | `int` | `-1` | Default eval examples per environment. ``-1`` uses all. Can be overridden per env. 
| +| `orchestrator.eval.group_size` | `int` | `1` | _≥1._ Default rollouts per example. Can be overridden per env. | +| `orchestrator.eval.num_workers` | `int | 'auto'` | `'auto'` | Default worker processes for env servers. Can be overridden per env. | +| `orchestrator.eval.max_retries` | `int` | `3` | _≥0._ Default retries for failed rollouts. Can be overridden per env. | +| `orchestrator.eval.interval` | `int` | `100` | _≥1._ Step interval at which to evaluate the model. | +| `orchestrator.eval.eval_base_model` | `bool` | `True` | Evaluate the base model we are training on. | +| `orchestrator.eval.skip_eval_on_resume` | `bool` | `True` | When resuming the orchestrator from a checkpoint, skip the (potentially redundant) online eval that would otherwise run immediately at the resumed step. | +| `orchestrator.eval.cancel_inflight_rollouts_on_eval` | `bool` | `False` | Cancel in-flight training rollouts before starting online evals. Avoids congestion (no training + eval rollouts at the same time) at the cost of slower training steps as the pipeline has to refill after each eval. | ##### `orchestrator.eval.sampling` @@ -746,51 +762,91 @@ Shared eval sampling configuration; can differ from training sampling. | Field | Type | Default | Description | |---|---|---|---| -| `orchestrator.eval.sampling.temperature` | float \| None | `None` | _≥0._ Sampling temperature. None defers to the inference server default. | -| `orchestrator.eval.sampling.repetition_penalty` | float \| None | `None` | _≥0._ Repetition penalty. None defers to the inference server default. | -| `orchestrator.eval.sampling.top_p` | float \| None | `None` | Nucleus sampling threshold. None defers to the inference server default. | -| `orchestrator.eval.sampling.top_k` | int \| None | `None` | Top-k sampling. None defers to the inference server default. | -| `orchestrator.eval.sampling.min_p` | float \| None | `None` | _≥0._ Min-p sampling threshold. None defers to the inference server default. 
| -| `orchestrator.eval.sampling.max_completion_tokens` | int \| None | `None` | Maximum output tokens per turn. None defers to the inference server default. | -| `orchestrator.eval.sampling.min_tokens` | int \| None | `None` | _≥0._ Minimum output tokens per sequence. None defers to the inference server default. | -| `orchestrator.eval.sampling.reasoning_effort` | 'minimal' \| 'low' \| 'medium' \| 'high' \| None | `None` | Reasoning effort constraint for reasoning models. | -| `orchestrator.eval.sampling.seed` | int \| None | `None` | Random seed for sampling. None means no seeding. | -| `orchestrator.eval.sampling.extra_body` | dict[str, Any] | `{}` | Extra body parameters forwarded to the inference server. | +| `orchestrator.eval.sampling.temperature` | `float | None` | `None` | _≥0._ Sampling temperature. None defers to the inference server default. | +| `orchestrator.eval.sampling.repetition_penalty` | `float | None` | `None` | _≥0._ Repetition penalty. None defers to the inference server default. | +| `orchestrator.eval.sampling.top_p` | `float | None` | `None` | Nucleus sampling threshold. None defers to the inference server default. | +| `orchestrator.eval.sampling.top_k` | `int | None` | `None` | Top-k sampling. None defers to the inference server default. | +| `orchestrator.eval.sampling.min_p` | `float | None` | `None` | _≥0._ Min-p sampling threshold. None defers to the inference server default. | +| `orchestrator.eval.sampling.max_completion_tokens` | `int | None` | `None` | Maximum output tokens per turn. None defers to the inference server default. | +| `orchestrator.eval.sampling.min_tokens` | `int | None` | `None` | _≥0._ Minimum output tokens per sequence. None defers to the inference server default. | +| `orchestrator.eval.sampling.reasoning_effort` | `'minimal' | 'low' | 'medium' | 'high' | None` | `None` | Reasoning effort constraint for reasoning models. | +| `orchestrator.eval.sampling.seed` | `int | None` | `None` | Random seed for sampling. 
None means no seeding. | +| `orchestrator.eval.sampling.extra_body` | `dict[str, Any]` | `{}` | Extra body parameters forwarded to the inference server. | + + +##### `orchestrator.eval.env.` (list item) + +Evaluation environments. + +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.eval.env..id` | `str` | `'reverse-text'` | Registered verifiers environment ID (e.g. ``math-env``, ``primeintellect/math-env``). May include an ``@version`` suffix for installation. | +| `orchestrator.eval.env..name` | `str | None` | `None` | Display name for this environment in logs, metrics, and buffer keys. Defaults to the ``id`` without ``@version``. Must be unique across all envs in the same group. | +| `orchestrator.eval.env..args` | `dict` | `{}` | Keyword arguments forwarded to ``vf.load_environment``. See the environment's docstring for accepted args. | +| `orchestrator.eval.env..extra_env_kwargs` | `dict[str, Any]` | `{}` | Extra kwargs passed to the env (e.g. ``seq_len``, ``max_total_completion_tokens``). Auto-populated by the orchestrator; user overrides are generally discouraged. The main use case is matching ``extra_env_kwargs`` when running an env in an isolated environment server. | +| `orchestrator.eval.env..address` | `str | None` | `None` | ZMQ address of an external env server (e.g. ``tcp://host:5000``). When set, the orchestrator connects to this server instead of spawning one; when None, a subprocess env server is spawned automatically. | +| `orchestrator.eval.env..num_workers` | `int | 'auto'` | `'auto'` | Worker processes for the spawned env server. ``auto`` scales to 1 worker per 256 concurrent rollouts. Ignored when ``address`` is set. | +| `orchestrator.eval.env..ratio` | `float | None` | `None` | _>0._ Sampling weight for this environment in the buffer. When None for all envs, samples uniformly across all available problems. When set, must be set on all envs — values are relative weights normalized to probabilities (e.g. 
[1, 1] and [0.5, 0.5] are equivalent). | +| `orchestrator.eval.env..max_retries` | `int` | `3` | _≥0._ Times the env server retries a failed rollout before returning an error. | +| `orchestrator.eval.env..max_total_completion_tokens` | `int` | `-1` | Maximum total completion tokens across all turns in a multi-turn rollout. ``-1`` disables. Auto-populated into ``extra_env_kwargs``. | +| `orchestrator.eval.env..timeout` | `float | None` | `None` | Per-rollout wall-clock timeout in seconds. None disables. | +| `orchestrator.eval.env..state_columns` | `list[str]` | `[]` | Extra ``State`` fields to persist into the saved rollout records (in addition to the always-saved ``trajectory`` and ``sampling_args``). Values must be JSON-serializable. | +| `orchestrator.eval.env..num_examples` | `int` | `-1` | Eval examples to sample from the dataset. ``-1`` uses all available examples. | +| `orchestrator.eval.env..group_size` | `int` | `1` | _≥1._ Rollouts generated per example. Used for pass@k estimation (e.g. ``group_size=8`` enables pass@1 through pass@8). | +| `orchestrator.eval.env..interval` | `int` | `100` | _≥1._ Per-env eval interval. If unset, inherits from the group-level eval interval. | + + +###### `orchestrator.eval.env..sampling` + +Per-env sampling overrides. Unset fields inherit from the group-level eval sampling config. + +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.eval.env..sampling.temperature` | `float | None` | `None` | _≥0._ Sampling temperature. None defers to the inference server default. | +| `orchestrator.eval.env..sampling.repetition_penalty` | `float | None` | `None` | _≥0._ Repetition penalty. None defers to the inference server default. | +| `orchestrator.eval.env..sampling.top_p` | `float | None` | `None` | Nucleus sampling threshold. None defers to the inference server default. | +| `orchestrator.eval.env..sampling.top_k` | `int | None` | `None` | Top-k sampling. None defers to the inference server default. 
| +| `orchestrator.eval.env..sampling.min_p` | `float | None` | `None` | _≥0._ Min-p sampling threshold. None defers to the inference server default. | +| `orchestrator.eval.env..sampling.max_completion_tokens` | `int | None` | `None` | Maximum output tokens per turn. None defers to the inference server default. | +| `orchestrator.eval.env..sampling.min_tokens` | `int | None` | `None` | _≥0._ Minimum output tokens per sequence. None defers to the inference server default. | +| `orchestrator.eval.env..sampling.reasoning_effort` | `'minimal' | 'low' | 'medium' | 'high' | None` | `None` | Reasoning effort constraint for reasoning models. | +| `orchestrator.eval.env..sampling.seed` | `int | None` | `None` | Random seed for sampling. None means no seeding. | +| `orchestrator.eval.env..sampling.extra_body` | `dict[str, Any]` | `{}` | Extra body parameters forwarded to the inference server. | #### `orchestrator.buffer` | Field | Type | Default | Description | |---|---|---|---| -| `orchestrator.buffer.seed` | int \| None | `None` | Random seed for the buffer. When set, sampling from the buffer is deterministic. | -| `orchestrator.buffer.easy_threshold` | float \| None | `None` | Average-reward threshold above which a problem is classified ``easy``. | -| `orchestrator.buffer.hard_threshold` | float \| None | `None` | Average-reward threshold below which a problem is classified ``hard``. | -| `orchestrator.buffer.easy_fraction` | float | `0.0` | _≥0, ≤1._ Fraction of easy problems to convert to ``normal`` when resuming or starting training. Only problems with difficulty ``normal`` are sampled. | -| `orchestrator.buffer.hard_fraction` | float | `0.0` | _≥0, ≤1._ Fraction of hard problems to convert to ``normal`` when resuming or starting training. Only problems with difficulty ``normal`` are sampled. | -| `orchestrator.buffer.online_difficulty_filtering` | bool | `False` | Filter rollouts based on difficulty. 
When True, rollouts with average reward 0.0 or 1.0 are not added to the buffer. | -| `orchestrator.buffer.hash_keys` | list[str] | `['env_name', 'prompt']` | _len ≥ 1._ Keys used to compute example hashes. Used to match examples from buffer checkpoints and determine buffer resume behavior. | +| `orchestrator.buffer.seed` | `int | None` | `None` | Random seed for the buffer. When set, sampling from the buffer is deterministic. | +| `orchestrator.buffer.easy_threshold` | `float | None` | `None` | Average-reward threshold above which a problem is classified ``easy``. | +| `orchestrator.buffer.hard_threshold` | `float | None` | `None` | Average-reward threshold below which a problem is classified ``hard``. | +| `orchestrator.buffer.easy_fraction` | `float` | `0.0` | _≥0, ≤1._ Fraction of easy problems to convert to ``normal`` when resuming or starting training. Only problems with difficulty ``normal`` are sampled. | +| `orchestrator.buffer.hard_fraction` | `float` | `0.0` | _≥0, ≤1._ Fraction of hard problems to convert to ``normal`` when resuming or starting training. Only problems with difficulty ``normal`` are sampled. | +| `orchestrator.buffer.online_difficulty_filtering` | `bool` | `False` | Filter rollouts based on difficulty. When True, rollouts with average reward 0.0 or 1.0 are not added to the buffer. | +| `orchestrator.buffer.hash_keys` | `list[str]` | `['env_name', 'prompt']` | _len ≥ 1._ Keys used to compute example hashes. Used to match examples from buffer checkpoints and determine buffer resume behavior. | #### `orchestrator.log` | Field | Type | Default | Description | |---|---|---|---| -| `orchestrator.log.level` | str | `'info'` | Log level for the process. Defaults to ``$PRIME_LOG_LEVEL`` if set, else ``info``. | -| `orchestrator.log.vf_level` | str | `'info'` | Log level for the verifiers package. Defaults to ``$PRIME_VF_LOG_LEVEL`` if set, else ``info``. 
| -| `orchestrator.log.json_logging` | bool | `False` | Emit newline-delimited JSON logs for aggregation (Loki, Grafana, etc.). | -| `orchestrator.log.log_data` | bool | `False` | Log the first data sample at startup. | +| `orchestrator.log.level` | `str` | `'info'` | Log level for the process. Defaults to ``$PRIME_LOG_LEVEL`` if set, else ``info``. | +| `orchestrator.log.vf_level` | `str` | `'info'` | Log level for the verifiers package. Defaults to ``$PRIME_VF_LOG_LEVEL`` if set, else ``info``. | +| `orchestrator.log.json_logging` | `bool` | `False` | Emit newline-delimited JSON logs for aggregation (Loki, Grafana, etc.). | +| `orchestrator.log.log_data` | `bool` | `False` | Log the first data sample at startup. | #### `orchestrator.wandb` | Field | Type | Default | Description | |---|---|---|---| -| `orchestrator.wandb.project` | str | `'prime-rl'` | W&B project to log to. | -| `orchestrator.wandb.entity` | str \| None | `None` | W&B entity to log to. | -| `orchestrator.wandb.name` | str \| None | `None` | W&B run name. | -| `orchestrator.wandb.group` | str \| None | `None` | W&B group. | -| `orchestrator.wandb.tags` | list[str] \| None | `None` | W&B tags attached to the run. | -| `orchestrator.wandb.offline` | bool | `False` | Run W&B in offline mode. | +| `orchestrator.wandb.project` | `str` | `'prime-rl'` | W&B project to log to. | +| `orchestrator.wandb.entity` | `str | None` | `None` | W&B entity to log to. | +| `orchestrator.wandb.name` | `str | None` | `None` | W&B run name. | +| `orchestrator.wandb.group` | `str | None` | `None` | W&B group. | +| `orchestrator.wandb.tags` | `list[str] | None` | `None` | W&B tags attached to the run. | +| `orchestrator.wandb.offline` | `bool` | `False` | Run W&B in offline mode. | ##### `orchestrator.wandb.log_extras` @@ -799,21 +855,21 @@ Extras logging configuration. If None, no extras are logged. 
| Field | Type | Default | Description | |---|---|---|---| -| `orchestrator.wandb.log_extras.samples` | bool | `True` | Log prompt/response samples. | -| `orchestrator.wandb.log_extras.distributions` | bool | `True` | Log distributions (rewards, advantages, etc.). | -| `orchestrator.wandb.log_extras.interval` | int | `10` | _≥1._ Step interval between extras logs. | -| `orchestrator.wandb.log_extras.sample_ratio` | float \| None | `None` | _≥0.0, ≤1.0._ Fraction of rollouts to log per step. The effective cap is ``len(rollouts) * sample_ratio``; 1.0 = all, 0.5 = half, 0.0 = none. | +| `orchestrator.wandb.log_extras.samples` | `bool` | `True` | Log prompt/response samples. | +| `orchestrator.wandb.log_extras.distributions` | `bool` | `True` | Log distributions (rewards, advantages, etc.). | +| `orchestrator.wandb.log_extras.interval` | `int` | `10` | _≥1._ Step interval between extras logs. | +| `orchestrator.wandb.log_extras.sample_ratio` | `float | None` | `None` | _≥0.0, ≤1.0._ Fraction of rollouts to log per step. The effective cap is ``len(rollouts) * sample_ratio``; 1.0 = all, 0.5 = half, 0.0 = none. | #### `orchestrator.prime_monitor` | Field | Type | Default | Description | |---|---|---|---| -| `orchestrator.prime_monitor.base_url` | str | `'https://api.primeintellect.ai/api/v1/rft'` | Base URL for the Prime Intellect monitoring API. | -| `orchestrator.prime_monitor.api_key_var` | str | `'PRIME_API_KEY'` | Environment variable name containing the Prime Intellect API key, resolved via ``os.getenv``. | -| `orchestrator.prime_monitor.run_name` | str \| None | `None` | Run name shown on the platform. Defaults to the W&B run name when set, otherwise the platform auto-generates one. | -| `orchestrator.prime_monitor.team_id` | str \| None | `None` | Team ID to associate the run with. | -| `orchestrator.prime_monitor.frontend_url` | str \| None | `None` | Frontend base URL used for the dashboard link printed after registration. 
Defaults to the Prime CLI frontend URL when unset. | +| `orchestrator.prime_monitor.base_url` | `str` | `'https://api.primeintellect.ai/api/v1/rft'` | Base URL for the Prime Intellect monitoring API. | +| `orchestrator.prime_monitor.api_key_var` | `str` | `'PRIME_API_KEY'` | Environment variable name containing the Prime Intellect API key, resolved via ``os.getenv``. | +| `orchestrator.prime_monitor.run_name` | `str | None` | `None` | Run name shown on the platform. Defaults to the W&B run name when set, otherwise the platform auto-generates one. | +| `orchestrator.prime_monitor.team_id` | `str | None` | `None` | Team ID to associate the run with. | +| `orchestrator.prime_monitor.frontend_url` | `str | None` | `None` | Frontend base URL used for the dashboard link printed after registration. Defaults to the Prime CLI frontend URL when unset. | ##### `orchestrator.prime_monitor.log_extras` @@ -822,10 +878,10 @@ Extras logging configuration. If None, no extras are logged. | Field | Type | Default | Description | |---|---|---|---| -| `orchestrator.prime_monitor.log_extras.samples` | bool | `True` | Log prompt/response samples. | -| `orchestrator.prime_monitor.log_extras.distributions` | bool | `True` | Log distributions (rewards, advantages, etc.). | -| `orchestrator.prime_monitor.log_extras.interval` | int | `10` | _≥1._ Step interval between extras logs. | -| `orchestrator.prime_monitor.log_extras.sample_ratio` | float \| None | `None` | _≥0.0, ≤1.0._ Fraction of rollouts to log per step. The effective cap is ``len(rollouts) * sample_ratio``; 1.0 = all, 0.5 = half, 0.0 = none. | +| `orchestrator.prime_monitor.log_extras.samples` | `bool` | `True` | Log prompt/response samples. | +| `orchestrator.prime_monitor.log_extras.distributions` | `bool` | `True` | Log distributions (rewards, advantages, etc.). | +| `orchestrator.prime_monitor.log_extras.interval` | `int` | `10` | _≥1._ Step interval between extras logs. 
| +| `orchestrator.prime_monitor.log_extras.sample_ratio` | `float | None` | `None` | _≥0.0, ≤1.0._ Fraction of rollouts to log per step. The effective cap is ``len(rollouts) * sample_ratio``; 1.0 = all, 0.5 = half, 0.0 = none. | #### `orchestrator.ckpt` @@ -834,13 +890,13 @@ Checkpoint configuration. | Field | Type | Default | Description | |---|---|---|---| -| `orchestrator.ckpt.interval` | int \| None | `None` | _≥1._ Step interval at which to save the orchestrator checkpoint. | -| `orchestrator.ckpt.resume_step` | int \| None | `None` | _≥-1._ Step to resume the orchestrator from. None starts from scratch; ``-1`` resumes from the latest checkpoint available. | -| `orchestrator.ckpt.wait_for_weights_timeout` | int \| None | `None` | _≥1._ When resuming, wait up to this many seconds for the weight directory to appear. Useful when the orchestrator restarts while the trainer is still saving weights. If None, fail immediately when weights are not found. | -| `orchestrator.ckpt.keep_last` | int \| None | `None` | _≥1._ Keep at most this many recent step checkpoints on disk. If None, never clean old checkpoints based on recency. | -| `orchestrator.ckpt.keep_interval` | int \| None | `None` | _≥1._ Keep checkpoints at every N steps permanently (e.g. ``keep_interval=100`` keeps step 100, 200, ...). If None, no interval-based keeping. | -| `orchestrator.ckpt.skip_progress` | bool | `False` | Skip loading the progress from checkpoint. | -| `orchestrator.ckpt.skip_buffer` | bool | `False` | Skip loading the buffer from checkpoint. | +| `orchestrator.ckpt.interval` | `int | None` | `None` | _≥1._ Step interval at which to save the orchestrator checkpoint. | +| `orchestrator.ckpt.resume_step` | `int | None` | `None` | _≥-1._ Step to resume the orchestrator from. None starts from scratch; ``-1`` resumes from the latest checkpoint available. 
| +| `orchestrator.ckpt.wait_for_weights_timeout` | `int | None` | `None` | _≥1._ When resuming, wait up to this many seconds for the weight directory to appear. Useful when the orchestrator restarts while the trainer is still saving weights. If None, fail immediately when weights are not found. | +| `orchestrator.ckpt.keep_last` | `int | None` | `None` | _≥1._ Keep at most this many recent step checkpoints on disk. If None, never clean old checkpoints based on recency. | +| `orchestrator.ckpt.keep_interval` | `int | None` | `None` | _≥1._ Keep checkpoints at every N steps permanently (e.g. ``keep_interval=100`` keeps step 100, 200, ...). If None, no interval-based keeping. | +| `orchestrator.ckpt.skip_progress` | `bool` | `False` | Skip loading the progress from checkpoint. | +| `orchestrator.ckpt.skip_buffer` | `bool` | `False` | Skip loading the buffer from checkpoint. | #### `orchestrator.heartbeat` @@ -849,11 +905,46 @@ BetterStack heartbeat configuration for monitoring training progress. | Field | Type | Default | Description | |---|---|---|---| -| `orchestrator.heartbeat.url` | str | *required* | URL to send the heartbeat to. | +| `orchestrator.heartbeat.url` | `str` | *required* | URL to send the heartbeat to. | #### `orchestrator.experimental` + +#### `orchestrator.filters.` (list item) + +Rollout filters. Each filter can ``monitor`` (default) or ``enforce`` (skip rollouts). + +Discriminated list-item union — set `orchestrator.filters..type` to one of `gibberish`, `repetition`, `zero_advantage` and provide the matching sub-fields. + + +##### `orchestrator.filters..type = "gibberish"` (GibberishFilterConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.filters..type` | `'gibberish'` | `'gibberish'` | | +| `orchestrator.filters..enforce` | `bool` | `False` | When True, skip detected rollouts entirely so they are not sent to the trainer. When False, only track detection metrics. 
| +| `orchestrator.filters..token_id_threshold` | `int` | `100000` | Token IDs above this are candidates for gibberish. BPE tokens are sorted by merge order. | +| `orchestrator.filters..logprob_offset` | `float` | `2.0` | Offset from uniform-distribution logprob. Threshold = ``-log(vocab_size) - logprob_offset``. | + + +##### `orchestrator.filters..type = "repetition"` (RepetitionFilterConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.filters..type` | `'repetition'` | `'repetition'` | | +| `orchestrator.filters..enforce` | `bool` | `False` | When True, skip detected rollouts entirely so they are not sent to the trainer. When False, only track detection metrics. | +| `orchestrator.filters..window` | `int` | `3000` | _≥1._ Consecutive high-probability steps required to flag the rollout. | +| `orchestrator.filters..prob_threshold` | `float` | `0.99` | _>0, ≤1._ Tokens sampled with probability above this are considered repetitive. Consecutive such tokens count toward the window. | + + +##### `orchestrator.filters..type = "zero_advantage"` (ZeroAdvantageFilterConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `orchestrator.filters..type` | `'zero_advantage'` | `'zero_advantage'` | | +| `orchestrator.filters..enforce` | `bool` | `True` | When True, skip detected rollouts entirely so they are not sent to the trainer. When False, only track detection metrics. 
| + #### `orchestrator.weight_broadcast` @@ -866,19 +957,19 @@ Discriminated union — set `orchestrator.weight_broadcast.type` to one of `file | Field | Type | Default | Description | |---|---|---|---| -| `orchestrator.weight_broadcast.type` | 'filesystem' | `'filesystem'` | | +| `orchestrator.weight_broadcast.type` | `'filesystem'` | `'filesystem'` | | ##### `orchestrator.weight_broadcast.type = "nccl"` (NCCLWeightBroadcastConfig) | Field | Type | Default | Description | |---|---|---|---| -| `orchestrator.weight_broadcast.type` | 'nccl' | `'nccl'` | | -| `orchestrator.weight_broadcast.host` | str | `'localhost'` | Host for the NCCL broadcast rendezvous. | -| `orchestrator.weight_broadcast.port` | int | `29501` | Port for the NCCL broadcast rendezvous. | -| `orchestrator.weight_broadcast.timeout` | int | `1200` | Timeout in seconds for the NCCL broadcast. | -| `orchestrator.weight_broadcast.quantize_in_weight_transfer` | bool | `False` | Use kernel-format FP8 quantized NCCL transfer for weight updates. | -| `orchestrator.weight_broadcast.inference_world_size` | int | `1` | _≥1._ Total inference GPUs across all servers. Used by ``init_nccl_broadcast`` to compute per-server rank offsets. | +| `orchestrator.weight_broadcast.type` | `'nccl'` | `'nccl'` | | +| `orchestrator.weight_broadcast.host` | `str` | `'localhost'` | Host for the NCCL broadcast rendezvous. | +| `orchestrator.weight_broadcast.port` | `int` | `29501` | Port for the NCCL broadcast rendezvous. | +| `orchestrator.weight_broadcast.timeout` | `int` | `1200` | Timeout in seconds for the NCCL broadcast. | +| `orchestrator.weight_broadcast.quantize_in_weight_transfer` | `bool` | `False` | Use kernel-format FP8 quantized NCCL transfer for weight updates. | +| `orchestrator.weight_broadcast.inference_world_size` | `int` | `1` | _≥1._ Total inference GPUs across all servers. Used by ``init_nccl_broadcast`` to compute per-server rank offsets. 
| #### `orchestrator.rollout_transport` @@ -892,17 +983,17 @@ Discriminated union — set `orchestrator.rollout_transport.type` to one of `fil | Field | Type | Default | Description | |---|---|---|---| -| `orchestrator.rollout_transport.type` | 'filesystem' | `'filesystem'` | | +| `orchestrator.rollout_transport.type` | `'filesystem'` | `'filesystem'` | | ##### `orchestrator.rollout_transport.type = "zmq"` (ZMQTransportConfig) | Field | Type | Default | Description | |---|---|---|---| -| `orchestrator.rollout_transport.type` | 'zmq' | `'zmq'` | | -| `orchestrator.rollout_transport.host` | str | `'localhost'` | Host address for ZMQ transport. | -| `orchestrator.rollout_transport.port` | int | `5555` | Base port for ZMQ transport. | -| `orchestrator.rollout_transport.hwm` | int | `10` | High-water mark (max in-flight messages per ZMQ socket). | +| `orchestrator.rollout_transport.type` | `'zmq'` | `'zmq'` | | +| `orchestrator.rollout_transport.host` | `str` | `'localhost'` | Host address for ZMQ transport. | +| `orchestrator.rollout_transport.port` | `int` | `5555` | Base port for ZMQ transport. | +| `orchestrator.rollout_transport.hwm` | `int` | `10` | High-water mark (max in-flight messages per ZMQ socket). | ### `inference` @@ -911,51 +1002,51 @@ Inference server configuration. If None, the rl entrypoint will not start an inf | Field | Type | Default | Description | |---|---|---|---| -| `inference.enable_lora` | bool | `False` | Enable LoRA. Forwarded as ``--enable-lora``. | -| `inference.max_loras` | int | `8` | Maximum number of LoRAs. Forwarded as ``--max-loras``. | -| `inference.max_cpu_loras` | int | `100` | Maximum number of LoRAs on CPU. Forwarded as ``--max-cpu-loras``. | -| `inference.max_lora_rank` | int \| None | `None` | Maximum LoRA rank. Forwarded as ``--max-lora-rank``. | -| `inference.lora_target_modules` | list[str] \| None | `None` | LoRA target modules. Forwarded as ``--lora-target-modules``. 
| -| `inference.enable_prefix_caching` | bool \| None | `None` | Enable prefix caching. Forwarded as ``--enable-prefix-caching``. | -| `inference.gpu_memory_utilization` | float | `0.9` | GPU memory utilization. Forwarded as ``--gpu-memory-utilization``. | -| `inference.api_server_count` | int | `1` | _≥0._ API servers to run. Forwarded as ``--api-server-count``. Set to 0 for headless mode. | -| `inference.data_parallel_size_local` | int \| None | `None` | _≥1._ Data parallel replicas to run on this node. Forwarded as ``--data-parallel-size-local``. | -| `inference.data_parallel_rpc_port` | int | `13345` | _≥1, ≤65535._ RPC port for data parallel communication. Forwarded as ``--data-parallel-rpc-port``. | -| `inference.seed` | int | `0` | Seed the inference components. Forwarded as ``--seed``. | -| `inference.enable_expert_parallel` | bool | `False` | Enable expert parallelism for MoE models. Forwarded as ``--enable-expert-parallel``. | -| `inference.all2all_backend` | 'allgather_reducescatter' \| 'deepep_high_throughput' \| 'deepep_low_latency' \| 'flashinfer_nvlink_one_sided' \| 'flashinfer_nvlink_two_sided' | `'allgather_reducescatter'` | All-to-all backend for expert-parallel communication. Forwarded as ``--all2all-backend``. | -| `inference.enable_eplb` | bool | `False` | Enable expert parallel load balancer (EPLB). Forwarded as ``--enable-eplb``. | -| `inference.enable_dbo` | bool | `False` | Enable dual batch overlap (DBO). Forwarded as ``--enable-dbo``. | -| `inference.use_deep_gemm` | bool | `False` | Force DeepGEMM FP8 kernels via ``VLLM_USE_DEEP_GEMM=1``. Only works with per-tensor FP8 quantization (e.g. GLM-5-FP8). | -| `inference.enable_return_routed_experts` | bool | `False` | Return routed experts in responses. Forwarded as ``--enable-return-routed-experts``. | -| `inference.enable_fp32_lm_head` | bool | `False` | Run the lm_head projection in fp32 via a native bf16×bf16 → fp32 GEMM (``torch.mm`` with ``out_dtype=torch.float32``). 
Stabilizes logprob precision under FP8/bf16 inference, matching SGLang's ``--enable-fp32-lm-head``. Implemented as a monkey-patch over vLLM's LogitsProcessor, activated by setting ``additional_config["fp32_lm_head"] = True`` on the vLLM config. | -| `inference.vllm_extra` | dict[str, Any] | `{}` | Extra arguments forwarded to vLLM. Applied as attributes on the vLLM namespace after config translation. | -| `inference.output_dir` | Path | `'outputs'` | Directory for SLURM logs and generated scripts. | -| `inference.dry_run` | bool | `False` | Only validate and dump resolved configs, then exit early. | +| `inference.enable_lora` | `bool` | `False` | Enable LoRA. Forwarded as ``--enable-lora``. | +| `inference.max_loras` | `int` | `8` | Maximum number of LoRAs. Forwarded as ``--max-loras``. | +| `inference.max_cpu_loras` | `int` | `100` | Maximum number of LoRAs on CPU. Forwarded as ``--max-cpu-loras``. | +| `inference.max_lora_rank` | `int | None` | `None` | Maximum LoRA rank. Forwarded as ``--max-lora-rank``. | +| `inference.lora_target_modules` | `list[str] | None` | `None` | LoRA target modules. Forwarded as ``--lora-target-modules``. | +| `inference.enable_prefix_caching` | `bool | None` | `None` | Enable prefix caching. Forwarded as ``--enable-prefix-caching``. | +| `inference.gpu_memory_utilization` | `float` | `0.9` | GPU memory utilization. Forwarded as ``--gpu-memory-utilization``. | +| `inference.api_server_count` | `int` | `1` | _≥0._ API servers to run. Forwarded as ``--api-server-count``. Set to 0 for headless mode. | +| `inference.data_parallel_size_local` | `int | None` | `None` | _≥1._ Data parallel replicas to run on this node. Forwarded as ``--data-parallel-size-local``. | +| `inference.data_parallel_rpc_port` | `int` | `13345` | _≥1, ≤65535._ RPC port for data parallel communication. Forwarded as ``--data-parallel-rpc-port``. | +| `inference.seed` | `int` | `0` | Seed the inference components. Forwarded as ``--seed``. 
| +| `inference.enable_expert_parallel` | `bool` | `False` | Enable expert parallelism for MoE models. Forwarded as ``--enable-expert-parallel``. | +| `inference.all2all_backend` | `'allgather_reducescatter' | 'deepep_high_throughput' | 'deepep_low_latency' | 'flashinfer_nvlink_one_sided' | 'flashinfer_nvlink_two_sided'` | `'allgather_reducescatter'` | All-to-all backend for expert-parallel communication. Forwarded as ``--all2all-backend``. | +| `inference.enable_eplb` | `bool` | `False` | Enable expert parallel load balancer (EPLB). Forwarded as ``--enable-eplb``. | +| `inference.enable_dbo` | `bool` | `False` | Enable dual batch overlap (DBO). Forwarded as ``--enable-dbo``. | +| `inference.use_deep_gemm` | `bool` | `False` | Force DeepGEMM FP8 kernels via ``VLLM_USE_DEEP_GEMM=1``. Only works with per-tensor FP8 quantization (e.g. GLM-5-FP8). | +| `inference.enable_return_routed_experts` | `bool` | `False` | Return routed experts in responses. Forwarded as ``--enable-return-routed-experts``. | +| `inference.enable_fp32_lm_head` | `bool` | `False` | Run the lm_head projection in fp32 via a native bf16×bf16 → fp32 GEMM (``torch.mm`` with ``out_dtype=torch.float32``). Stabilizes logprob precision under FP8/bf16 inference, matching SGLang's ``--enable-fp32-lm-head``. Implemented as a monkey-patch over vLLM's LogitsProcessor, activated by setting ``additional_config["fp32_lm_head"] = True`` on the vLLM config. | +| `inference.vllm_extra` | `dict[str, Any]` | `{}` | Extra arguments forwarded to vLLM. Applied as attributes on the vLLM namespace after config translation. | +| `inference.output_dir` | `Path` | `'outputs'` | Directory for SLURM logs and generated scripts. | +| `inference.dry_run` | `bool` | `False` | Only validate and dump resolved configs, then exit early. | #### `inference.server` | Field | Type | Default | Description | |---|---|---|---| -| `inference.server.host` | str \| None | `None` | Host to bind to. 
| -| `inference.server.port` | int | `8000` | Port to bind to. | -| `inference.server.liveness_timeout_seconds` | float | `30.0` | _>0._ Timeout in seconds for the ``/liveness`` endpoint's internal vLLM worker RPC. With Kubernetes liveness probes, keep the probe ``timeoutSeconds`` at least this high. | +| `inference.server.host` | `str | None` | `None` | Host to bind to. | +| `inference.server.port` | `int` | `8000` | Port to bind to. | +| `inference.server.liveness_timeout_seconds` | `float` | `30.0` | _>0._ Timeout in seconds for the ``/liveness`` endpoint's internal vLLM worker RPC. With Kubernetes liveness probes, keep the probe ``timeoutSeconds`` at least this high. | #### `inference.model` | Field | Type | Default | Description | |---|---|---|---| -| `inference.model.name` | str | `'Qwen/Qwen3-0.6B'` | HF model name or local path. | -| `inference.model.trust_remote_code` | bool | `False` | Trust remote code. Forwarded to vLLM engine init. | -| `inference.model.dtype` | 'auto' \| 'float16' \| 'bfloat16' \| 'float32' | `'auto'` | dtype for model weights and activations. ``auto`` uses FP16 for FP32/FP16 models and BF16 for BF16 models. Forwarded as ``--dtype``. | -| `inference.model.max_model_len` | int \| None | `None` | Maximum model context length. If None, uses the model config's value. Forwarded as ``--max-model-len``. | -| `inference.model.enforce_eager` | bool | `False` | Enforce eager mode. When False, PyTorch eager and cuda graphs run hybrid for maximum performance. Forwarded as ``--enforce-eager``. | -| `inference.model.chat_template` | str \| None | `None` | Chat template — a Jinja2 template string or path to a template file. Forwarded as ``--chat-template``. If None, uses the model's default. | -| `inference.model.tool_call_parser` | str \| None | `'auto'` | Tool-call parser. Forwarded as ``--tool-call-parser``. Set to ``"auto"`` (default) to detect from the model name, or ``None`` to disable. 
| -| `inference.model.reasoning_parser` | str \| None | `'auto'` | Parser for extracting reasoning content from model outputs. Forwarded as ``--reasoning-parser``. Set to ``"auto"`` (default) to detect from the model name, or ``None`` to disable. | -| `inference.model.rope_scaling` | dict[str, Any] \| str \| None | `None` | RoPE scaling configuration as a dict (e.g. ``{rope_type="yarn", factor=4.0, original_max_position_embeddings=32768}``). Forwarded as ``--rope-scaling``. | +| `inference.model.name` | `str` | `'Qwen/Qwen3-0.6B'` | HF model name or local path. | +| `inference.model.trust_remote_code` | `bool` | `False` | Trust remote code. Forwarded to vLLM engine init. | +| `inference.model.dtype` | `'auto' | 'float16' | 'bfloat16' | 'float32'` | `'auto'` | dtype for model weights and activations. ``auto`` uses FP16 for FP32/FP16 models and BF16 for BF16 models. Forwarded as ``--dtype``. | +| `inference.model.max_model_len` | `int | None` | `None` | Maximum model context length. If None, uses the model config's value. Forwarded as ``--max-model-len``. | +| `inference.model.enforce_eager` | `bool` | `False` | Enforce eager mode. When False, PyTorch eager and cuda graphs run hybrid for maximum performance. Forwarded as ``--enforce-eager``. | +| `inference.model.chat_template` | `str | None` | `None` | Chat template — a Jinja2 template string or path to a template file. Forwarded as ``--chat-template``. If None, uses the model's default. | +| `inference.model.tool_call_parser` | `str | None` | `'auto'` | Tool-call parser. Forwarded as ``--tool-call-parser``. Set to ``"auto"`` (default) to detect from the model name, or ``None`` to disable. | +| `inference.model.reasoning_parser` | `str | None` | `'auto'` | Parser for extracting reasoning content from model outputs. Forwarded as ``--reasoning-parser``. Set to ``"auto"`` (default) to detect from the model name, or ``None`` to disable. 
| +| `inference.model.rope_scaling` | `dict[str, Any] | str | None` | `None` | RoPE scaling configuration as a dict (e.g. ``{rope_type="yarn", factor=4.0, original_max_position_embeddings=32768}``). Forwarded as ``--rope-scaling``. | ##### `inference.model.vlm` @@ -964,9 +1055,9 @@ VLM configuration. Setting this enables vision-language model support. | Field | Type | Default | Description | |---|---|---|---| -| `inference.model.vlm.vision_encoder_attr` | str | *required* | Dotted attribute path to the vision encoder module (e.g. ``model.visual``). | -| `inference.model.vlm.language_model_attr` | str | *required* | Dotted attribute path to the language model module (e.g. ``model.language_model``). | -| `inference.model.vlm.freeze_vision_encoder` | bool | `True` | Freeze the vision encoder. When False, it is trainable and FSDP-sharded per-block. No effect with LoRA (LoRA freezes all non-adapter parameters). | +| `inference.model.vlm.vision_encoder_attr` | `str` | *required* | Dotted attribute path to the vision encoder module (e.g. ``model.visual``). | +| `inference.model.vlm.language_model_attr` | `str` | *required* | Dotted attribute path to the language model module (e.g. ``model.language_model``). | +| `inference.model.vlm.freeze_vision_encoder` | `bool` | `True` | Freeze the vision encoder. When False, it is trainable and FSDP-sharded per-block. No effect with LoRA (LoRA freezes all non-adapter parameters). | #### `inference.parallel` @@ -975,15 +1066,15 @@ Multi-node and multi-GPU parallelism (TP, DP, PP). | Field | Type | Default | Description | |---|---|---|---| -| `inference.parallel.tp` | int | `1` | Tensor parallel size. Forwarded to vLLM as ``--tensor-parallel-size``. | -| `inference.parallel.dp` | int | `1` | _≥1._ Data parallel size. Forwarded to vLLM as ``--data-parallel-size``. | +| `inference.parallel.tp` | `int` | `1` | Tensor parallel size. Forwarded to vLLM as ``--tensor-parallel-size``. 
| +| `inference.parallel.dp` | `int` | `1` | _≥1._ Data parallel size. Forwarded to vLLM as ``--data-parallel-size``. | #### `inference.weight_broadcast` | Field | Type | Default | Description | |---|---|---|---| -| `inference.weight_broadcast.type` | 'nccl' \| 'filesystem' | `'filesystem'` | Weight broadcast transport. | +| `inference.weight_broadcast.type` | `'nccl' | 'filesystem'` | `'filesystem'` | Weight broadcast transport. | #### `inference.kv_cache_offload` @@ -992,7 +1083,7 @@ CPU KV cache offload for inference workers. Standard inference uses vLLM's ``Off | Field | Type | Default | Description | |---|---|---|---| -| `inference.kv_cache_offload.cpu_bytes` | int | `1000000000` | _>0._ CPU bytes available for KV cache offloading per worker. | +| `inference.kv_cache_offload.cpu_bytes` | `int` | `1000000000` | _>0._ CPU bytes available for KV cache offloading per worker. | #### `inference.slurm` @@ -1001,15 +1092,15 @@ SLURM configuration. When set, the run is submitted as a SLURM job instead of ru | Field | Type | Default | Description | |---|---|---|---| -| `inference.slurm.job_name` | str | `'prime-rl'` | SLURM job name. | -| `inference.slurm.project_dir` | Path | `'.'` | Path to the project root, used to source .env, activate .venv, and run uv sync. | -| `inference.slurm.template_path` | Path \| None | `None` | SLURM template file. If None, uses the bundled single-node or multi-node template. | -| `inference.slurm.partition` | str | `'cluster'` | SLURM partition (#SBATCH --partition). | -| `inference.slurm.nodelist` | str \| None | `None` | Comma-separated list of specific nodes to run on (#SBATCH --nodelist). | -| `inference.slurm.exclude` | str \| None | `None` | Comma-separated list of nodes to exclude (#SBATCH --exclude). | -| `inference.slurm.account` | str \| None | `None` | SLURM account to charge (#SBATCH --account). | -| `inference.slurm.time` | str \| None | `None` | Maximum wall time, e.g. '24:00:00' or '7-00:00:00' (#SBATCH --time). 
| -| `inference.slurm.pre_run_command` | str \| None | `None` | Shell command to run on the head node after cd, .env sourcing, and venv activation. Useful for cleanup like ``sudo pkill -f vllm``; wrap with ``srun bash -c '...'`` to fan out to all nodes. | +| `inference.slurm.job_name` | `str` | `'prime-rl'` | SLURM job name. | +| `inference.slurm.project_dir` | `Path` | `'.'` | Path to the project root, used to source .env, activate .venv, and run uv sync. | +| `inference.slurm.template_path` | `Path | None` | `None` | SLURM template file. If None, uses the bundled single-node or multi-node template. | +| `inference.slurm.partition` | `str` | `'cluster'` | SLURM partition (#SBATCH --partition). | +| `inference.slurm.nodelist` | `str | None` | `None` | Comma-separated list of specific nodes to run on (#SBATCH --nodelist). | +| `inference.slurm.exclude` | `str | None` | `None` | Comma-separated list of nodes to exclude (#SBATCH --exclude). | +| `inference.slurm.account` | `str | None` | `None` | SLURM account to charge (#SBATCH --account). | +| `inference.slurm.time` | `str | None` | `None` | Maximum wall time, e.g. '24:00:00' or '7-00:00:00' (#SBATCH --time). | +| `inference.slurm.pre_run_command` | `str | None` | `None` | Shell command to run on the head node after cd, .env sourcing, and venv activation. Useful for cleanup like ``sudo pkill -f vllm``; wrap with ``srun bash -c '...'`` to fan out to all nodes. | #### `inference.experimental` @@ -1024,38 +1115,38 @@ Discriminated union — set `inference.deployment.type` to one of `single_node`, | Field | Type | Default | Description | |---|---|---|---| -| `inference.deployment.gpus_per_node` | int | `8` | GPUs per node. | -| `inference.deployment.type` | 'single_node' | `'single_node'` | | +| `inference.deployment.gpus_per_node` | `int` | `8` | GPUs per node. 
| +| `inference.deployment.type` | `'single_node'` | `'single_node'` | | ##### `inference.deployment.type = "multi_node"` (MultiNodeInferenceDeploymentConfig) | Field | Type | Default | Description | |---|---|---|---| -| `inference.deployment.gpus_per_node` | int | `8` | GPUs per node. | -| `inference.deployment.type` | 'multi_node' | `'multi_node'` | | -| `inference.deployment.num_nodes` | int | `2` | _≥1._ Inference nodes. | -| `inference.deployment.router_port` | int | `8000` | Port for the vllm-router. | -| `inference.deployment.backend_port` | int | `8100` | Port for vLLM backend instances. | -| `inference.deployment.router_policy` | str | `'consistent_hash'` | vllm-router routing policy (e.g. ``consistent_hash``, ``round_robin``). | +| `inference.deployment.gpus_per_node` | `int` | `8` | GPUs per node. | +| `inference.deployment.type` | `'multi_node'` | `'multi_node'` | | +| `inference.deployment.num_nodes` | `int` | `2` | _≥1._ Inference nodes. | +| `inference.deployment.router_port` | `int` | `8000` | Port for the vllm-router. | +| `inference.deployment.backend_port` | `int` | `8100` | Port for vLLM backend instances. | +| `inference.deployment.router_policy` | `str` | `'consistent_hash'` | vllm-router routing policy (e.g. ``consistent_hash``, ``round_robin``). | ##### `inference.deployment.type = "disaggregated"` (DisaggregatedInferenceDeploymentConfig) | Field | Type | Default | Description | |---|---|---|---| -| `inference.deployment.gpus_per_node` | int | `8` | GPUs per node. | -| `inference.deployment.type` | 'disaggregated' | `'disaggregated'` | | -| `inference.deployment.num_prefill_nodes` | int | `1` | _≥1._ Total prefill nodes. | -| `inference.deployment.num_decode_nodes` | int | `1` | _≥1._ Total decode nodes. | -| `inference.deployment.num_prefill_replicas` | int | `1` | _≥1._ Independent prefill vLLM instances. Must evenly divide ``num_prefill_nodes``. 
| -| `inference.deployment.num_decode_replicas` | int | `1` | _≥1._ Independent decode vLLM instances. Must evenly divide ``num_decode_nodes``. | -| `inference.deployment.router_port` | int | `8000` | Port for the vllm-router on each replica. | -| `inference.deployment.prefill_port` | int | `8100` | Port for prefill vLLM instances. | -| `inference.deployment.decode_port` | int | `8200` | Port for decode vLLM instances. | -| `inference.deployment.router_policy` | str | `'consistent_hash'` | vllm-router routing policy (e.g. ``consistent_hash``, ``round_robin``). | -| `inference.deployment.prefill_env_overrides` | dict[str, str] | `{}` | Extra environment variables exported only on prefill nodes. | -| `inference.deployment.decode_env_overrides` | dict[str, str] | `{}` | Extra environment variables exported only on decode nodes. | +| `inference.deployment.gpus_per_node` | `int` | `8` | GPUs per node. | +| `inference.deployment.type` | `'disaggregated'` | `'disaggregated'` | | +| `inference.deployment.num_prefill_nodes` | `int` | `1` | _≥1._ Total prefill nodes. | +| `inference.deployment.num_decode_nodes` | `int` | `1` | _≥1._ Total decode nodes. | +| `inference.deployment.num_prefill_replicas` | `int` | `1` | _≥1._ Independent prefill vLLM instances. Must evenly divide ``num_prefill_nodes``. | +| `inference.deployment.num_decode_replicas` | `int` | `1` | _≥1._ Independent decode vLLM instances. Must evenly divide ``num_decode_nodes``. | +| `inference.deployment.router_port` | `int` | `8000` | Port for the vllm-router on each replica. | +| `inference.deployment.prefill_port` | `int` | `8100` | Port for prefill vLLM instances. | +| `inference.deployment.decode_port` | `int` | `8200` | Port for decode vLLM instances. | +| `inference.deployment.router_policy` | `str` | `'consistent_hash'` | vllm-router routing policy (e.g. ``consistent_hash``, ``round_robin``). 
| +| `inference.deployment.prefill_env_overrides` | `dict[str, str]` | `{}` | Extra environment variables exported only on prefill nodes. | +| `inference.deployment.decode_env_overrides` | `dict[str, str]` | `{}` | Extra environment variables exported only on decode nodes. | ### `log` @@ -1064,8 +1155,8 @@ Shared log config. Propagated to trainer and orchestrator. | Field | Type | Default | Description | |---|---|---|---| -| `log.level` | str \| None | `None` | Log level for trainer and orchestrator. When unset, each sub-config's own log level applies (defaults to ``$PRIME_LOG_LEVEL`` if set, else ``info``). | -| `log.json_logging` | bool | `False` | Emit newline-delimited JSON logs for aggregation (Loki, Grafana, etc.). | +| `log.level` | `str | None` | `None` | Log level for trainer and orchestrator. When unset, each sub-config's own log level applies (defaults to ``$PRIME_LOG_LEVEL`` if set, else ``info``). | +| `log.json_logging` | `bool` | `False` | Emit newline-delimited JSON logs for aggregation (Loki, Grafana, etc.). | ### `ckpt` @@ -1074,11 +1165,11 @@ Shared checkpoint config. If None, falls back to the sub-config checkpoint setti | Field | Type | Default | Description | |---|---|---|---| -| `ckpt.output_dir` | Path \| None | `None` | Override directory for checkpoints and weights. When set, checkpoints and weight snapshots are written here instead of under the trainer ``output_dir``. | -| `ckpt.interval` | int \| None | `None` | Interval at which to save checkpoints. | -| `ckpt.resume_step` | int \| None | `None` | Step to resume from. If None, does not resume from a checkpoint. | -| `ckpt.keep_last` | int \| None | `None` | _≥1._ Keep at most this many recent step checkpoints on disk. If None, never clean old checkpoints based on recency. | -| `ckpt.keep_interval` | int \| None | `None` | _≥1._ Keep checkpoints at every N steps permanently (e.g. ``keep_interval=100`` keeps step 100, 200, ...). If None, no interval-based keeping. 
| +| `ckpt.output_dir` | `Path | None` | `None` | Override directory for checkpoints and weights. When set, checkpoints and weight snapshots are written here instead of under the trainer ``output_dir``. | +| `ckpt.interval` | `int | None` | `None` | Interval at which to save checkpoints. | +| `ckpt.resume_step` | `int | None` | `None` | Step to resume from. If None, does not resume from a checkpoint. | +| `ckpt.keep_last` | `int | None` | `None` | _≥1._ Keep at most this many recent step checkpoints on disk. If None, never clean old checkpoints based on recency. | +| `ckpt.keep_interval` | `int | None` | `None` | _≥1._ Keep checkpoints at every N steps permanently (e.g. ``keep_interval=100`` keeps step 100, 200, ...). If None, no interval-based keeping. | ### `wandb` @@ -1087,13 +1178,13 @@ Shared W&B config. If None, falls back to the sub-config W&B settings. | Field | Type | Default | Description | |---|---|---|---| -| `wandb.project` | str \| None | `'prime-rl'` | W&B project. | -| `wandb.entity` | str \| None | `None` | W&B entity. | -| `wandb.name` | str \| None | `None` | W&B run name. | -| `wandb.group` | str \| None | `None` | W&B group. | -| `wandb.tags` | list[str] \| None | `None` | W&B tags attached to the run. | -| `wandb.offline` | bool \| None | `False` | Run W&B in offline mode. | -| `wandb.shared` | bool | `True` | Log trainer and orchestrator metrics to a single shared W&B run. Requires wandb SDK ≥ 0.19.9. Incompatible with offline mode. | +| `wandb.project` | `str | None` | `'prime-rl'` | W&B project. | +| `wandb.entity` | `str | None` | `None` | W&B entity. | +| `wandb.name` | `str | None` | `None` | W&B run name. | +| `wandb.group` | `str | None` | `None` | W&B group. | +| `wandb.tags` | `list[str] | None` | `None` | W&B tags attached to the run. | +| `wandb.offline` | `bool | None` | `False` | Run W&B in offline mode. | +| `wandb.shared` | `bool` | `True` | Log trainer and orchestrator metrics to a single shared W&B run. 
Requires wandb SDK ≥ 0.19.9. Incompatible with offline mode. | ### `model` @@ -1102,7 +1193,7 @@ Shared model config. If None, falls back to the sub-config model settings. | Field | Type | Default | Description | |---|---|---|---| -| `model.name` | str | `'Qwen/Qwen3-0.6B'` | HF model name or local path. | +| `model.name` | `str` | `'Qwen/Qwen3-0.6B'` | HF model name or local path. | #### `model.vlm` @@ -1111,9 +1202,9 @@ VLM configuration. Set this to enable vision-language model support. | Field | Type | Default | Description | |---|---|---|---| -| `model.vlm.vision_encoder_attr` | str | *required* | Dotted attribute path to the vision encoder module (e.g. ``model.visual``). | -| `model.vlm.language_model_attr` | str | *required* | Dotted attribute path to the language model module (e.g. ``model.language_model``). | -| `model.vlm.freeze_vision_encoder` | bool | `True` | Freeze the vision encoder. When False, it is trainable and FSDP-sharded per-block. No effect with LoRA (LoRA freezes all non-adapter parameters). | +| `model.vlm.vision_encoder_attr` | `str` | *required* | Dotted attribute path to the vision encoder module (e.g. ``model.visual``). | +| `model.vlm.language_model_attr` | `str` | *required* | Dotted attribute path to the language model module (e.g. ``model.language_model``). | +| `model.vlm.freeze_vision_encoder` | `bool` | `True` | Freeze the vision encoder. When False, it is trainable and FSDP-sharded per-block. No effect with LoRA (LoRA freezes all non-adapter parameters). | ### `tokenizer` @@ -1122,19 +1213,19 @@ Shared tokenizer config. Propagated to trainer, orchestrator, and inference. If | Field | Type | Default | Description | |---|---|---|---| -| `tokenizer.name` | str \| None | `None` | Tokenizer name or path. If None, the model's default tokenizer is used. | -| `tokenizer.trust_remote_code` | bool \| None | `None` | Trust remote code when initializing the tokenizer. If None, inherits the model's ``trust_remote_code`` setting. 
| -| `tokenizer.chat_template` | str \| None | `None` | Chat template for the tokenizer. Either a Jinja2 template string or a path to a template file. If None, the tokenizer's default chat template is used. | +| `tokenizer.name` | `str | None` | `None` | Tokenizer name or path. If None, the model's default tokenizer is used. | +| `tokenizer.trust_remote_code` | `bool | None` | `None` | Trust remote code when initializing the tokenizer. If None, inherits the model's ``trust_remote_code`` setting. | +| `tokenizer.chat_template` | `str | None` | `None` | Chat template for the tokenizer. Either a Jinja2 template string or a path to a template file. If None, the tokenizer's default chat template is used. | ### `weight_broadcast` | Field | Type | Default | Description | |---|---|---|---| -| `weight_broadcast.type` | 'nccl' \| 'filesystem' | `'filesystem'` | Weight broadcast transport. | -| `weight_broadcast.port` | int | `29501` | Port for NCCL weight broadcast. | -| `weight_broadcast.timeout` | int | `1200` | Timeout in seconds for NCCL weight broadcast. | -| `weight_broadcast.quantize_in_weight_transfer` | bool | `False` | Use kernel-format FP8 quantized NCCL transfer for weight updates. When disabled, uses default HF checkpoint-format transfer. | +| `weight_broadcast.type` | `'nccl' | 'filesystem'` | `'filesystem'` | Weight broadcast transport. | +| `weight_broadcast.port` | `int` | `29501` | Port for NCCL weight broadcast. | +| `weight_broadcast.timeout` | `int` | `1200` | Timeout in seconds for NCCL weight broadcast. | +| `weight_broadcast.quantize_in_weight_transfer` | `bool` | `False` | Use kernel-format FP8 quantized NCCL transfer for weight updates. When disabled, uses default HF checkpoint-format transfer. | ### `slurm` @@ -1143,15 +1234,15 @@ SLURM configuration. If None, runs locally. | Field | Type | Default | Description | |---|---|---|---| -| `slurm.job_name` | str | `'prime-rl'` | SLURM job name. 
| -| `slurm.project_dir` | Path | `'.'` | Path to the project root, used to source .env, activate .venv, and run uv sync. | -| `slurm.template_path` | Path \| None | `None` | SLURM template file. If None, uses the bundled single-node or multi-node template. | -| `slurm.partition` | str | `'cluster'` | SLURM partition (#SBATCH --partition). | -| `slurm.nodelist` | str \| None | `None` | Comma-separated list of specific nodes to run on (#SBATCH --nodelist). | -| `slurm.exclude` | str \| None | `None` | Comma-separated list of nodes to exclude (#SBATCH --exclude). | -| `slurm.account` | str \| None | `None` | SLURM account to charge (#SBATCH --account). | -| `slurm.time` | str \| None | `None` | Maximum wall time, e.g. '24:00:00' or '7-00:00:00' (#SBATCH --time). | -| `slurm.pre_run_command` | str \| None | `None` | Shell command to run on the head node after cd, .env sourcing, and venv activation. Useful for cleanup like ``sudo pkill -f vllm``; wrap with ``srun bash -c '...'`` to fan out to all nodes. | +| `slurm.job_name` | `str` | `'prime-rl'` | SLURM job name. | +| `slurm.project_dir` | `Path` | `'.'` | Path to the project root, used to source .env, activate .venv, and run uv sync. | +| `slurm.template_path` | `Path | None` | `None` | SLURM template file. If None, uses the bundled single-node or multi-node template. | +| `slurm.partition` | `str` | `'cluster'` | SLURM partition (#SBATCH --partition). | +| `slurm.nodelist` | `str | None` | `None` | Comma-separated list of specific nodes to run on (#SBATCH --nodelist). | +| `slurm.exclude` | `str | None` | `None` | Comma-separated list of nodes to exclude (#SBATCH --exclude). | +| `slurm.account` | `str | None` | `None` | SLURM account to charge (#SBATCH --account). | +| `slurm.time` | `str | None` | `None` | Maximum wall time, e.g. '24:00:00' or '7-00:00:00' (#SBATCH --time). | +| `slurm.pre_run_command` | `str | None` | `None` | Shell command to run on the head node after cd, .env sourcing, and venv activation. 
Useful for cleanup like ``sudo pkill -f vllm``; wrap with ``srun bash -c '...'`` to fan out to all nodes. | ### `experimental` @@ -1166,22 +1257,22 @@ Discriminated union — set `deployment.type` to one of `single_node`, `multi_no | Field | Type | Default | Description | |---|---|---|---| -| `deployment.gpus_per_node` | int | `8` | GPUs per node. | -| `deployment.type` | 'single_node' | `'single_node'` | | -| `deployment.num_train_gpus` | int | `1` | GPUs allocated to the trainer. | -| `deployment.num_infer_gpus` | int | `1` | GPUs allocated to inference. | +| `deployment.gpus_per_node` | `int` | `8` | GPUs per node. | +| `deployment.type` | `'single_node'` | `'single_node'` | | +| `deployment.num_train_gpus` | `int` | `1` | GPUs allocated to the trainer. | +| `deployment.num_infer_gpus` | `int` | `1` | GPUs allocated to inference. | #### `deployment.type = "multi_node"` (MultiNodeDeploymentConfig) | Field | Type | Default | Description | |---|---|---|---| -| `deployment.gpus_per_node` | int | `8` | GPUs per node. | -| `deployment.type` | 'multi_node' | `'multi_node'` | | -| `deployment.num_train_nodes` | int | *required* | Training nodes. | -| `deployment.num_infer_nodes` | int | *required* | _≥0._ Inference nodes per replica. Set to 0 to skip inference and orchestrator (requires fake data). | -| `deployment.num_infer_replicas` | int | `1` | _≥1._ Independent inference replicas. Total inference nodes = ``num_infer_nodes * num_infer_replicas``. | -| `deployment.nodes_per_fsdp_group` | int \| None | `None` | Training nodes per FSDP island. Auto-sets ``trainer.dp_replicate = num_train_nodes / nodes_per_fsdp_group``. | +| `deployment.gpus_per_node` | `int` | `8` | GPUs per node. | +| `deployment.type` | `'multi_node'` | `'multi_node'` | | +| `deployment.num_train_nodes` | `int` | *required* | Training nodes. | +| `deployment.num_infer_nodes` | `int` | *required* | _≥0._ Inference nodes per replica. Set to 0 to skip inference and orchestrator (requires fake data). 
| +| `deployment.num_infer_replicas` | `int` | `1` | _≥1._ Independent inference replicas. Total inference nodes = ``num_infer_nodes * num_infer_replicas``. | +| `deployment.nodes_per_fsdp_group` | `int | None` | `None` | Training nodes per FSDP island. Auto-sets ``trainer.dp_replicate = num_train_nodes / nodes_per_fsdp_group``. | ## `sft` — Supervised fine-tuning @@ -1192,43 +1283,43 @@ _Defined in_ `prime_rl.configs.sft.SFTConfig`. | Field | Type | Default | Description | |---|---|---|---| -| `use_renderer` | bool | `False` | Tokenize SFT samples through the ``renderers`` library (single ``render()`` + ``message_indices`` mask) instead of the default ``build_incremental_token_mask`` path. Required for chat templates that render position-dependently (e.g. Qwen3, Qwen3.5). | -| `output_dir` | Path | `'outputs'` | Directory to write outputs to — checkpoints and logs are written as subdirectories. Should be a persistent directory with enough disk space and unique per experiment running on a single node. | -| `clean_output_dir` | bool | `False` | Delete the output directory before starting training. Required to overwrite an output directory that contains checkpoints from a previous run when not resuming. | -| `matmul_precision` | 'highest' \| 'high' \| 'medium' | `'high'` | Precision for float32 matrix multiplications. ``highest`` is full FP32 (required on ROCm/AMD GPUs to avoid catastrophic precision loss in softmax over large vocabularies). ``high`` enables TF32 on NVIDIA GPUs for a speedup with minor precision tradeoff. See ``torch.set_float32_matmul_precision``. | -| `max_steps` | int \| None | `None` | Maximum training steps. If None, runs indefinitely. | -| `memory_profiler_path` | Path \| None | `None` | Path to write the memory profile to. | -| `trace_path` | Path \| None | `None` | Path to write the PyTorch profiler trace to. | -| `dist_timeout_seconds` | int | `600` | Timeout in seconds for torch distributed ops. 
| -| `loss_impl` | 'liger' \| 'torch' \| 'liger_fused' \| 'quack_fused' | `'torch'` | Cross-entropy loss implementation. ``liger_fused`` fuses the lm_head projection with the CE loss to avoid materializing full logits. ``quack_fused`` uses quack-kernels for chunked linear + CE with CuTe DSL CUDA kernels. | -| `dry_run` | bool | `False` | Only validate and dump resolved configs, then exit early. | +| `use_renderer` | `bool` | `False` | Tokenize SFT samples through the ``renderers`` library (single ``render()`` + ``message_indices`` mask) instead of the default ``build_incremental_token_mask`` path. Required for chat templates that render position-dependently (e.g. Qwen3, Qwen3.5). | +| `output_dir` | `Path` | `'outputs'` | Directory to write outputs to — checkpoints and logs are written as subdirectories. Should be a persistent directory with enough disk space and unique per experiment running on a single node. | +| `clean_output_dir` | `bool` | `False` | Delete the output directory before starting training. Required to overwrite an output directory that contains checkpoints from a previous run when not resuming. | +| `matmul_precision` | `'highest' | 'high' | 'medium'` | `'high'` | Precision for float32 matrix multiplications. ``highest`` is full FP32 (required on ROCm/AMD GPUs to avoid catastrophic precision loss in softmax over large vocabularies). ``high`` enables TF32 on NVIDIA GPUs for a speedup with minor precision tradeoff. See ``torch.set_float32_matmul_precision``. | +| `max_steps` | `int | None` | `None` | Maximum training steps. If None, runs indefinitely. | +| `memory_profiler_path` | `Path | None` | `None` | Path to write the memory profile to. | +| `trace_path` | `Path | None` | `None` | Path to write the PyTorch profiler trace to. | +| `dist_timeout_seconds` | `int` | `600` | Timeout in seconds for torch distributed ops. | +| `loss_impl` | `'liger' | 'torch' | 'liger_fused' | 'quack_fused'` | `'torch'` | Cross-entropy loss implementation. 
``liger_fused`` fuses the lm_head projection with the CE loss to avoid materializing full logits. ``quack_fused`` uses quack-kernels for chunked linear + CE with CuTe DSL CUDA kernels. | +| `dry_run` | `bool` | `False` | Only validate and dump resolved configs, then exit early. | ### `model` | Field | Type | Default | Description | |---|---|---|---| -| `model.name` | str | `'Qwen/Qwen3-0.6B'` | HF model name or local path. | -| `model.trust_remote_code` | bool | `False` | Trust remote code when initializing the tokenizer. | -| `model.seq_len` | int | `2048` | Sequence length the model is trained on. | -| `model.attn` | 'eager' \| 'sdpa' \| 'flash_attention_2' \| 'flash_attention_3' \| 'fa4' | `'flash_attention_2'` | Attention implementation. With CP enabled, ring attention uses the matching kernel family (FA2/FA3/FA4). | -| `model.fsdp_cpu_offload` | bool | `False` | Enable FSDP CPU offloading for parameters, gradients, and optimizer states. Uses pinned memory for efficient CPU↔GPU transfers. | -| `model.optim_cpu_offload` | bool | `False` | Offload only optimizer states (momentum, variance) to CPU, keeping weights on GPU. Avoids the H2D all-gather overhead of FSDP CPU offload while still saving GPU memory. | -| `model.reshard_after_forward` | bool | `True` | Reshard the model after each forward pass. | -| `model.dp_replicate` | int | `1` | Data parallel dim where model weights are replicated. | -| `model.ep` | int | `1` | Expert parallelism degree for MoE layers. 1 disables EP. | -| `model.ep_comm_backend` | 'torch' \| 'deepep' | `'torch'` | Communication backend for expert parallelism. ``torch`` uses TorchTitan all-to-all collectives; ``deepep`` uses DeepEP custom kernels. | -| `model.deepep_num_sms` | int | `20` | _≥1._ SMs allocated for DeepEP intranode dispatch/combine kernels. Also determines internode RDMA channel count (``num_channels = num_sms / 2``). Lower values leave more SMs for compute; higher values speed up dispatch/combine. 
The optimal value depends on EP degree and hardware. Only used when ``ep_comm_backend='deepep'``. | -| `model.deepep_token_chunk_size` | int \| None | `None` | _≥1._ Token chunk size for DeepEP MoE pipelining. When set, DeepEP dispatch for chunk i+1 is launched while experts compute chunk i. Only used when ``ep_comm_backend='deepep'``. | -| `model.cp` | int | `1` | Context parallelism degree. 1 disables CP. | -| `model.cp_style` | 'ring' \| 'ulysses' | `'ring'` | CP communication style. ``ring`` uses ring-attention all-gather/reduce-scatter (requires custom kernels per attention type). ``ulysses`` uses all-to-all to redistribute Q/K/V from sequence-sharded to head-sharded, runs vanilla attention locally on the full sequence, then all-to-all back — works out-of-the-box with any attention kernel (softmax FA, linear attention, mamba, etc.). | -| `model.impl` | 'hf' \| 'custom' \| 'auto' | `'auto'` | Model implementation. ``auto`` selects ``custom`` if supported by the model, otherwise ``hf``. | -| `model.optimization_dtype` | 'bfloat16' \| 'float32' | `'float32'` | dtype for model optimization. | -| `model.reduce_dtype` | 'bfloat16' \| 'float32' | `'float32'` | dtype for gradient/parameter reductions. | -| `model.moe_use_grouped_mm` | bool | `True` | Use grouped mm for MoE layers. Requires compute capability ≥ 9.0. | -| `model.fp8` | bool | `False` | FP8 training via DeepGEMM. Replaces ``nn.Linear`` with FP8 blockwise linear and uses FP8 grouped GEMM for MoE experts. Requires SM90 (Hopper) GPUs and ``model.impl='custom'``. | -| `model.freeze_moe_router` | bool | `False` | Freeze MoE router parameters during training. | -| `model.fused_lm_head_token_chunk_size` | int \| 'auto' \| 'disabled' | `'disabled'` | Flattened token chunk size for the fused LM head. ``int >= 1`` sets the tokens per LM-head chunk explicitly; ``auto`` auto-enables (RL training picks 8192); ``disabled`` uses the vanilla LM head. Integer values aren't supported for SFT training. 
| +| `model.name` | `str` | `'Qwen/Qwen3-0.6B'` | HF model name or local path. | +| `model.trust_remote_code` | `bool` | `False` | Trust remote code when initializing the tokenizer. | +| `model.seq_len` | `int` | `2048` | Sequence length the model is trained on. | +| `model.attn` | `'eager' | 'sdpa' | 'flash_attention_2' | 'flash_attention_3' | 'fa4'` | `'flash_attention_2'` | Attention implementation. With CP enabled, ring attention uses the matching kernel family (FA2/FA3/FA4). | +| `model.fsdp_cpu_offload` | `bool` | `False` | Enable FSDP CPU offloading for parameters, gradients, and optimizer states. Uses pinned memory for efficient CPU↔GPU transfers. | +| `model.optim_cpu_offload` | `bool` | `False` | Offload only optimizer states (momentum, variance) to CPU, keeping weights on GPU. Avoids the H2D all-gather overhead of FSDP CPU offload while still saving GPU memory. | +| `model.reshard_after_forward` | `bool` | `True` | Reshard the model after each forward pass. | +| `model.dp_replicate` | `int` | `1` | Data parallel dim where model weights are replicated. | +| `model.ep` | `int` | `1` | Expert parallelism degree for MoE layers. 1 disables EP. | +| `model.ep_comm_backend` | `'torch' | 'deepep'` | `'torch'` | Communication backend for expert parallelism. ``torch`` uses TorchTitan all-to-all collectives; ``deepep`` uses DeepEP custom kernels. | +| `model.deepep_num_sms` | `int` | `20` | _≥1._ SMs allocated for DeepEP intranode dispatch/combine kernels. Also determines internode RDMA channel count (``num_channels = num_sms / 2``). Lower values leave more SMs for compute; higher values speed up dispatch/combine. The optimal value depends on EP degree and hardware. Only used when ``ep_comm_backend='deepep'``. | +| `model.deepep_token_chunk_size` | `int | None` | `None` | _≥1._ Token chunk size for DeepEP MoE pipelining. When set, DeepEP dispatch for chunk i+1 is launched while experts compute chunk i. Only used when ``ep_comm_backend='deepep'``. 
| +| `model.cp` | `int` | `1` | Context parallelism degree. 1 disables CP. | +| `model.cp_style` | `'ring' | 'ulysses'` | `'ring'` | CP communication style. ``ring`` uses ring-attention all-gather/reduce-scatter (requires custom kernels per attention type). ``ulysses`` uses all-to-all to redistribute Q/K/V from sequence-sharded to head-sharded, runs vanilla attention locally on the full sequence, then all-to-all back — works out-of-the-box with any attention kernel (softmax FA, linear attention, mamba, etc.). | +| `model.impl` | `'hf' | 'custom' | 'auto'` | `'auto'` | Model implementation. ``auto`` selects ``custom`` if supported by the model, otherwise ``hf``. | +| `model.optimization_dtype` | `'bfloat16' | 'float32'` | `'float32'` | dtype for model optimization. | +| `model.reduce_dtype` | `'bfloat16' | 'float32'` | `'float32'` | dtype for gradient/parameter reductions. | +| `model.moe_use_grouped_mm` | `bool` | `True` | Use grouped mm for MoE layers. Requires compute capability ≥ 9.0. | +| `model.fp8` | `bool` | `False` | FP8 training via DeepGEMM. Replaces ``nn.Linear`` with FP8 blockwise linear and uses FP8 grouped GEMM for MoE experts. Requires SM90 (Hopper) GPUs and ``model.impl='custom'``. | +| `model.freeze_moe_router` | `bool` | `False` | Freeze MoE router parameters during training. | +| `model.fused_lm_head_token_chunk_size` | `int | 'auto' | 'disabled'` | `'disabled'` | Flattened token chunk size for the fused LM head. ``int >= 1`` sets the tokens per LM-head chunk explicitly; ``auto`` auto-enables (RL training picks 8192); ``disabled`` uses the vanilla LM head. Integer values aren't supported for SFT training. | #### `model.vlm` @@ -1237,9 +1328,9 @@ VLM configuration. Setting this enables vision-language model support. | Field | Type | Default | Description | |---|---|---|---| -| `model.vlm.vision_encoder_attr` | str | *required* | Dotted attribute path to the vision encoder module (e.g. ``model.visual``). 
| -| `model.vlm.language_model_attr` | str | *required* | Dotted attribute path to the language model module (e.g. ``model.language_model``). | -| `model.vlm.freeze_vision_encoder` | bool | `True` | Freeze the vision encoder. When False, it is trainable and FSDP-sharded per-block. No effect with LoRA (LoRA freezes all non-adapter parameters). | +| `model.vlm.vision_encoder_attr` | `str` | *required* | Dotted attribute path to the vision encoder module (e.g. ``model.visual``). | +| `model.vlm.language_model_attr` | `str` | *required* | Dotted attribute path to the language model module (e.g. ``model.language_model``). | +| `model.vlm.freeze_vision_encoder` | `bool` | `True` | Freeze the vision encoder. When False, it is trainable and FSDP-sharded per-block. No effect with LoRA (LoRA freezes all non-adapter parameters). | #### `model.compile` @@ -1248,7 +1339,7 @@ Compile the model with ``torch.compile``. | Field | Type | Default | Description | |---|---|---|---| -| `model.compile.fullgraph` | bool | `False` | Compile transformer blocks with ``fullgraph=True``. | +| `model.compile.fullgraph` | `bool` | `False` | Compile transformer blocks with ``fullgraph=True``. | #### `model.ac` @@ -1257,9 +1348,9 @@ Activation checkpointing configuration. If None, activation checkpointing is dis | Field | Type | Default | Description | |---|---|---|---| -| `model.ac.mode` | 'full' \| 'selective' | `'full'` | ``full`` checkpoints whole transformer blocks; ``selective`` checkpoints only the subcomponents listed in ``targets`` inside supported custom decoder layers. | -| `model.ac.freq` | int | `1` | _≥1._ Apply activation checkpointing to every N layers. | -| `model.ac.targets` | list[str] | `['norm']` | Selective checkpoint targets. ``norm`` checkpoints every norm module inside selected layers. ``attn_proj`` checkpoints projection-side attention work outside the kernel (input/output projections, attention-local norms, RoPE, gating, model-specific MLA projection helpers). 
``mlp`` checkpoints the entire dense MLP forward (not for MoE). ``mla_up_proj`` checkpoints MLA Q/KV up-projection where supported. ``routed_experts`` checkpoints routed expert compute in MoE layers (including LatentMoE). ``linear_attn`` checkpoints non-softmax token mixers (NemotronH Mamba, Qwen3.5-MoE GatedDeltaNet, AFMoE sliding-window attention). | +| `model.ac.mode` | `'full' | 'selective'` | `'full'` | ``full`` checkpoints whole transformer blocks; ``selective`` checkpoints only the subcomponents listed in ``targets`` inside supported custom decoder layers. | +| `model.ac.freq` | `int` | `1` | _≥1._ Apply activation checkpointing to every N layers. | +| `model.ac.targets` | `list[str]` | `['norm']` | Selective checkpoint targets. ``norm`` checkpoints every norm module inside selected layers. ``attn_proj`` checkpoints projection-side attention work outside the kernel (input/output projections, attention-local norms, RoPE, gating, model-specific MLA projection helpers). ``mlp`` checkpoints the entire dense MLP forward (not for MoE). ``mla_up_proj`` checkpoints MLA Q/KV up-projection where supported. ``routed_experts`` checkpoints routed expert compute in MoE layers (including LatentMoE). ``linear_attn`` checkpoints non-softmax token mixers (NemotronH Mamba, Qwen3.5-MoE GatedDeltaNet, AFMoE sliding-window attention). | #### `model.ac_offloading` @@ -1268,8 +1359,8 @@ Activation offloading configuration. If None, activation offloading is disabled. | Field | Type | Default | Description | |---|---|---|---| -| `model.ac_offloading.pin_memory` | bool | `True` | Pin offloaded activations to CPU memory. | -| `model.ac_offloading.max_inflight_activations` | int | `5` | _≥1._ Max activations kept in flight while offloading. More activations smooth overlap at the cost of GPU memory. | +| `model.ac_offloading.pin_memory` | `bool` | `True` | Pin offloaded activations to CPU memory. 
| +| `model.ac_offloading.max_inflight_activations` | `int` | `5` | _≥1._ Max activations kept in flight while offloading. More activations smooth overlap at the cost of GPU memory. | #### `model.index_cache` @@ -1278,8 +1369,8 @@ DSA IndexCache sub-configuration. If set, sparse-attention top-k indices are reu | Field | Type | Default | Description | |---|---|---|---| -| `model.index_cache.topk_freq` | int | `1` | _≥1._ Recompute DSA top-k indices every N layers; intervening layers reuse the cached indices. ``1`` recomputes every layer (effectively no reuse). Mirrors vLLM's ``index_topk_freq`` HF override. | -| `model.index_cache.topk_pattern` | str \| None | `None` | Optional per-layer schedule that overrides ``topk_freq``. ``'F'`` computes fresh indices for that layer; ``'S'`` reuses the previously cached indices. Length should match the number of decoder layers. | +| `model.index_cache.topk_freq` | `int` | `1` | _≥1._ Recompute DSA top-k indices every N layers; intervening layers reuse the cached indices. ``1`` recomputes every layer (effectively no reuse). Mirrors vLLM's ``index_topk_freq`` HF override. | +| `model.index_cache.topk_pattern` | `str | None` | `None` | Optional per-layer schedule that overrides ``topk_freq``. ``'F'`` computes fresh indices for that layer; ``'S'`` reuses the previously cached indices. Length should match the number of decoder layers. | #### `model.lora` @@ -1288,11 +1379,11 @@ LoRA configuration. If None, LoRA is disabled. | Field | Type | Default | Description | |---|---|---|---| -| `model.lora.rank` | int | `16` | _≥1._ Rank of the low-rank decomposition matrices. | -| `model.lora.alpha` | float | `32.0` | _≥0._ LoRA scaling parameter. | -| `model.lora.dropout` | float | `0.0` | _≥0, ≤1._ LoRA dropout rate. | -| `model.lora.target_modules` | list[str] | `['q_proj', 'k_proj', 'v_proj', 'o_proj', 'gate_proj', 'up_proj', 'down_proj', 'experts', 'fc1_latent_proj', 'fc2_latent_proj']` | Module names or regex patterns to apply LoRA to. 
Simple names (e.g. ``q_proj``) match any component in the module path; regex patterns match anywhere in the name. Names unknown to the current model are silently ignored, so defaults cover multiple architectures. NemotronH note: ``experts`` matches NonGatedGroupedExperts inside LatentMoE; ``fc1_latent_proj``/``fc2_latent_proj`` adapt the latent up/down projections. Add ``in_proj``/``out_proj`` to also LoRA Mamba. | -| `model.lora.modules_to_save` | list[str] | `[]` | Module names or regex patterns to keep fully trainable (not freeze). Same matching rules as ``target_modules``. | +| `model.lora.rank` | `int` | `16` | _≥1._ Rank of the low-rank decomposition matrices. | +| `model.lora.alpha` | `float` | `32.0` | _≥0._ LoRA scaling parameter. | +| `model.lora.dropout` | `float` | `0.0` | _≥0, ≤1._ LoRA dropout rate. | +| `model.lora.target_modules` | `list[str]` | `['q_proj', 'k_proj', 'v_proj', 'o_proj', 'gate_proj', 'up_proj', 'down_proj', 'experts', 'fc1_latent_proj', 'fc2_latent_proj']` | Module names or regex patterns to apply LoRA to. Simple names (e.g. ``q_proj``) match any component in the module path; regex patterns match anywhere in the name. Names unknown to the current model are silently ignored, so defaults cover multiple architectures. NemotronH note: ``experts`` matches NonGatedGroupedExperts inside LatentMoE; ``fc1_latent_proj``/``fc2_latent_proj`` adapt the latent up/down projections. Add ``in_proj``/``out_proj`` to also LoRA Mamba. | +| `model.lora.modules_to_save` | `list[str]` | `[]` | Module names or regex patterns to keep fully trainable (not freeze). Same matching rules as ``target_modules``. | #### `model.debug` @@ -1301,18 +1392,18 @@ Debugging knobs for the model and distributed training. | Field | Type | Default | Description | |---|---|---|---| -| `model.debug.num_layers` | int \| None | `None` | Override the number of transformer layers (truncates the model). 
| -| `model.debug.random_init` | bool | `False` | Randomly initialize the model instead of loading weights. | -| `model.debug.force_balanced_routing` | bool | `False` | Replace MoE token-choice routing with a round-robin assignment so every expert sees an equal share. Intended for fake-data smoke tests where untrained routing would otherwise OOM under severe imbalance. Gating scores are still gathered from the override indices so the forward pass stays consistent. | +| `model.debug.num_layers` | `int | None` | `None` | Override the number of transformer layers (truncates the model). | +| `model.debug.random_init` | `bool` | `False` | Randomly initialize the model instead of loading weights. | +| `model.debug.force_balanced_routing` | `bool` | `False` | Replace MoE token-choice routing with a round-robin assignment so every expert sees an equal share. Intended for fake-data smoke tests where untrained routing would otherwise OOM under severe imbalance. Gating scores are still gathered from the override indices so the forward pass stays consistent. | ### `tokenizer` | Field | Type | Default | Description | |---|---|---|---| -| `tokenizer.name` | str \| None | `None` | Tokenizer name or path. If None, the model's default tokenizer is used. | -| `tokenizer.trust_remote_code` | bool \| None | `None` | Trust remote code when initializing the tokenizer. If None, inherits the model's ``trust_remote_code`` setting. | -| `tokenizer.chat_template` | str \| None | `None` | Chat template for the tokenizer. Either a Jinja2 template string or a path to a template file. If None, the tokenizer's default chat template is used. | +| `tokenizer.name` | `str | None` | `None` | Tokenizer name or path. If None, the model's default tokenizer is used. | +| `tokenizer.trust_remote_code` | `bool | None` | `None` | Trust remote code when initializing the tokenizer. If None, inherits the model's ``trust_remote_code`` setting. 
| +| `tokenizer.chat_template` | `str | None` | `None` | Chat template for the tokenizer. Either a Jinja2 template string or a path to a template file. If None, the tokenizer's default chat template is used. | ### `renderer` @@ -1321,12 +1412,12 @@ Client-side renderer configuration. Only consumed when ``use_renderer=true``. | Field | Type | Default | Description | |---|---|---|---| -| `renderer.name` | str | `'auto'` | Renderer used for chat-template tokenization. One of: ``auto`` (detect from tokenizer), ``qwen3``, ``qwen3_vl``, ``qwen3.5``, ``glm5``, ``glm4.5``, ``minimax-m2``, ``deepseek_v3``, ``kimi_k2``, ``kimi_k25``, ``nemotron3``, ``gpt_oss``, ``default``. | -| `renderer.tool_parser` | str \| None | `None` | Tool parser from ``renderers.parsers``. Only consumed by DefaultRenderer; model-specific renderers bake their own parsing in. Options: ``qwen3``, ``qwen3.5``, ``glm``, ``deepseek_v3``. | -| `renderer.reasoning_parser` | str \| None | `None` | Reasoning parser from ``renderers.parsers``. Only consumed by DefaultRenderer. Options: ``think``. | -| `renderer.pool_size` | int \| None | `None` | _≥1._ Number of renderer slots shared across concurrent rollouts. Bump for long multi-turn prompts where client-side jinja tokenization serializes. | -| `renderer.preserve_all_thinking` | bool | `False` | Re-emit every past-assistant turn's ``reasoning_content`` between ``<think>``/``</think>`` (or the model's equivalent), even when the chat template would drop it. Strict superset of preserve_thinking_between_tool_calls. | -| `renderer.preserve_thinking_between_tool_calls` | bool | `False` | Preserve past-assistant ``reasoning_content`` only inside the current tool cycle — the contiguous assistant→tool→…→assistant block after the most recent user message, when that block contains at least one tool response. A new user turn closes the block. | +| `renderer.name` | `str` | `'auto'` | Renderer used for chat-template tokenization. 
One of: ``auto`` (detect from tokenizer), ``qwen3``, ``qwen3_vl``, ``qwen3.5``, ``glm5``, ``glm4.5``, ``minimax-m2``, ``deepseek_v3``, ``kimi_k2``, ``kimi_k25``, ``nemotron3``, ``gpt_oss``, ``default``. | +| `renderer.tool_parser` | `str | None` | `None` | Tool parser from ``renderers.parsers``. Only consumed by DefaultRenderer; model-specific renderers bake their own parsing in. Options: ``qwen3``, ``qwen3.5``, ``glm``, ``deepseek_v3``. | +| `renderer.reasoning_parser` | `str | None` | `None` | Reasoning parser from ``renderers.parsers``. Only consumed by DefaultRenderer. Options: ``think``. | +| `renderer.pool_size` | `int | None` | `None` | _≥1._ Number of renderer slots shared across concurrent rollouts. Bump for long multi-turn prompts where client-side jinja tokenization serializes. | +| `renderer.preserve_all_thinking` | `bool` | `False` | Re-emit every past-assistant turn's ``reasoning_content`` between ``<think>``/``</think>`` (or the model's equivalent), even when the chat template would drop it. Strict superset of preserve_thinking_between_tool_calls. | +| `renderer.preserve_thinking_between_tool_calls` | `bool` | `False` | Preserve past-assistant ``reasoning_content`` only inside the current tool cycle — the contiguous assistant→tool→…→assistant block after the most recent user message, when that block contains at least one tool response. A new user turn closes the block. | ### `val` @@ -1335,26 +1426,26 @@ Validation configuration. If None, no validation runs. | Field | Type | Default | Description | |---|---|---|---| -| `val.interval` | int | `50` | _≥1._ Run validation every N training steps. | -| `val.eval_on_start` | bool | `False` | Run validation before the first training step. | +| `val.interval` | `int` | `50` | _≥1._ Run validation every N training steps. | +| `val.eval_on_start` | `bool` | `False` | Run validation before the first training step. 
| #### `val.data` | Field | Type | Default | Description | |---|---|---|---| -| `val.data.batch_size` | int | `128` | _≥1._ Global batch size. | -| `val.data.seq_len` | int | `128` | _≥1._ Sequence length. | -| `val.data.pack_function` | 'cat' \| 'stack' | `'cat'` | Sample packing strategy. ``cat`` concatenates; ``stack`` requires ``seq_len`` divisible by 256. | -| `val.data.micro_batch_size` | int | `1` | _≥1._ Per-step micro batch size. ``batch_size`` must be divisible by this. | -| `val.data.type` | 'sft' | `'sft'` | | -| `val.data.name` | str | `'PrimeIntellect/Reverse-Text-SFT'` | HF dataset name or path. | -| `val.data.subsets` | list[str] \| None | `None` | Subsets to load from the HF dataset. | -| `val.data.splits` | list[str] \| None | `None` | Splits to load from the HF dataset. | -| `val.data.probabilities` | list[float] \| None | `None` | Sampling probabilities for each subset/split. | -| `val.data.stopping_strategy` | 'first_exhausted' \| 'all_exhausted' | `'all_exhausted'` | Stopping strategy when interleaving multiple subsets/splits. | -| `val.data.shuffle` | bool | `True` | Shuffle the dataset at the start of each epoch. | -| `val.data.seed` | int | `0` | Random seed for shuffling. Re-shuffled per epoch by adding the epoch count to the seed. | +| `val.data.batch_size` | `int` | `128` | _≥1._ Global batch size. | +| `val.data.seq_len` | `int` | `128` | _≥1._ Sequence length. | +| `val.data.pack_function` | `'cat' | 'stack'` | `'cat'` | Sample packing strategy. ``cat`` concatenates; ``stack`` requires ``seq_len`` divisible by 256. | +| `val.data.micro_batch_size` | `int` | `1` | _≥1._ Per-step micro batch size. ``batch_size`` must be divisible by this. | +| `val.data.type` | `'sft'` | `'sft'` | | +| `val.data.name` | `str` | `'PrimeIntellect/Reverse-Text-SFT'` | HF dataset name or path. | +| `val.data.subsets` | `list[str] | None` | `None` | Subsets to load from the HF dataset. 
| +| `val.data.splits` | `list[str] | None` | `None` | Splits to load from the HF dataset. | +| `val.data.probabilities` | `list[float] | None` | `None` | Sampling probabilities for each subset/split. | +| `val.data.stopping_strategy` | `'first_exhausted' | 'all_exhausted'` | `'all_exhausted'` | Stopping strategy when interleaving multiple subsets/splits. | +| `val.data.shuffle` | `bool` | `True` | Shuffle the dataset at the start of each epoch. | +| `val.data.seed` | `int` | `0` | Random seed for shuffling. Re-shuffled per epoch by adding the epoch count to the seed. | ##### `val.data.loss_mask` @@ -1363,27 +1454,27 @@ Which message types contribute to the loss. | Field | Type | Default | Description | |---|---|---|---| -| `val.data.loss_mask.system` | bool | `False` | System messages contribute to the loss. | -| `val.data.loss_mask.user` | bool | `False` | User messages contribute to the loss. | -| `val.data.loss_mask.assistant` | bool | `True` | Assistant messages contribute to the loss. | -| `val.data.loss_mask.tool` | bool | `False` | Tool messages contribute to the loss. | +| `val.data.loss_mask.system` | `bool` | `False` | System messages contribute to the loss. | +| `val.data.loss_mask.user` | `bool` | `False` | User messages contribute to the loss. | +| `val.data.loss_mask.assistant` | `bool` | `True` | Assistant messages contribute to the loss. | +| `val.data.loss_mask.tool` | `bool` | `False` | Tool messages contribute to the loss. | ### `ckpt` | Field | Type | Default | Description | |---|---|---|---| -| `ckpt.output_dir` | Path \| None | `None` | Override directory for checkpoints and weights. If set, checkpoints and weight snapshots are written here instead of under the trainer ``output_dir`` — useful for writing large checkpoints to a separate storage volume. | -| `ckpt.interval` | int \| None | `None` | _≥1._ Interval at which to save the training checkpoint. If None, only checkpoints at the end of training. 
| -| `ckpt.skip_gather_master_weights` | bool | `False` | Skip gathering and saving HF-compatible weight checkpoints. Useful for large models where the gather is expensive and only DCP checkpoints are needed. | -| `ckpt.weights_only` | bool | `False` | Save only weight checkpoints (no optimizer/scheduler state). Much faster and smaller than full checkpoints, but cannot resume training. | -| `ckpt.resume_step` | int \| None | `None` | _≥-1._ Step to resume training from. None starts from scratch; ``-1`` restarts from the latest checkpoint available. | -| `ckpt.keep_last` | int \| None | `None` | _≥1._ Keep at most this many recent step checkpoints on disk. If None, never clean old checkpoints based on recency. | -| `ckpt.keep_interval` | int \| None | `None` | _≥1._ Keep checkpoints at every N steps permanently (e.g. ``keep_interval=100`` keeps step 100, 200, ...). If None, no interval-based keeping. | -| `ckpt.skip_progress` | bool | `False` | Skip loading the progress from checkpoint. | -| `ckpt.skip_scheduler` | bool | `False` | Skip loading the scheduler from checkpoint. | -| `ckpt.skip_dataloader` | bool | `False` | Skip loading the dataloader from checkpoint. | -| `ckpt.skip_optimizer` | bool | `False` | Skip loading the optimizer state from checkpoint. | +| `ckpt.output_dir` | `Path | None` | `None` | Override directory for checkpoints and weights. If set, checkpoints and weight snapshots are written here instead of under the trainer ``output_dir`` — useful for writing large checkpoints to a separate storage volume. | +| `ckpt.interval` | `int | None` | `None` | _≥1._ Interval at which to save the training checkpoint. If None, only checkpoints at the end of training. | +| `ckpt.skip_gather_master_weights` | `bool` | `False` | Skip gathering and saving HF-compatible weight checkpoints. Useful for large models where the gather is expensive and only DCP checkpoints are needed. 
| +| `ckpt.weights_only` | `bool` | `False` | Save only weight checkpoints (no optimizer/scheduler state). Much faster and smaller than full checkpoints, but cannot resume training. | +| `ckpt.resume_step` | `int | None` | `None` | _≥-1._ Step to resume training from. None starts from scratch; ``-1`` restarts from the latest checkpoint available. | +| `ckpt.keep_last` | `int | None` | `None` | _≥1._ Keep at most this many recent step checkpoints on disk. If None, never clean old checkpoints based on recency. | +| `ckpt.keep_interval` | `int | None` | `None` | _≥1._ Keep checkpoints at every N steps permanently (e.g. ``keep_interval=100`` keeps step 100, 200, ...). If None, no interval-based keeping. | +| `ckpt.skip_progress` | `bool` | `False` | Skip loading the progress from checkpoint. | +| `ckpt.skip_scheduler` | `bool` | `False` | Skip loading the scheduler from checkpoint. | +| `ckpt.skip_dataloader` | `bool` | `False` | Skip loading the dataloader from checkpoint. | +| `ckpt.skip_optimizer` | `bool` | `False` | Skip loading the optimizer state from checkpoint. | #### `ckpt.weights` @@ -1392,32 +1483,32 @@ Weight-checkpoint sub-configuration. If None, no HF-compatible weight checkpoint | Field | Type | Default | Description | |---|---|---|---| -| `ckpt.weights.save_sharded` | bool | `True` | Save the weight checkpoint in sharded format. | -| `ckpt.weights.save_format` | 'safetensors' \| 'torch' | `'safetensors'` | Weight checkpoint serialization format. | -| `ckpt.weights.save_adapter_separately` | bool | `False` | Save LoRA adapters separately before merging into full model weights. | +| `ckpt.weights.save_sharded` | `bool` | `True` | Save the weight checkpoint in sharded format. | +| `ckpt.weights.save_format` | `'safetensors' | 'torch'` | `'safetensors'` | Weight checkpoint serialization format. | +| `ckpt.weights.save_adapter_separately` | `bool` | `False` | Save LoRA adapters separately before merging into full model weights. 
| ### `log` | Field | Type | Default | Description | |---|---|---|---| -| `log.level` | str | `'info'` | Log level for the process. Defaults to ``$PRIME_LOG_LEVEL`` if set, else ``info``. | -| `log.vf_level` | str | `'info'` | Log level for the verifiers package. Defaults to ``$PRIME_VF_LOG_LEVEL`` if set, else ``info``. | -| `log.json_logging` | bool | `False` | Emit newline-delimited JSON logs for aggregation (Loki, Grafana, etc.). | -| `log.log_data` | bool | `False` | Log the first data sample at startup. | -| `log.ranks_filter` | list[int] | `[0]` | Trainer ranks to show in console output. Passed to ``torchrun --local-ranks-filter``. | +| `log.level` | `str` | `'info'` | Log level for the process. Defaults to ``$PRIME_LOG_LEVEL`` if set, else ``info``. | +| `log.vf_level` | `str` | `'info'` | Log level for the verifiers package. Defaults to ``$PRIME_VF_LOG_LEVEL`` if set, else ``info``. | +| `log.json_logging` | `bool` | `False` | Emit newline-delimited JSON logs for aggregation (Loki, Grafana, etc.). | +| `log.log_data` | `bool` | `False` | Log the first data sample at startup. | +| `log.ranks_filter` | `list[int]` | `[0]` | Trainer ranks to show in console output. Passed to ``torchrun --local-ranks-filter``. | ### `wandb` | Field | Type | Default | Description | |---|---|---|---| -| `wandb.project` | str | `'prime-rl'` | W&B project to log to. | -| `wandb.entity` | str \| None | `None` | W&B entity to log to. | -| `wandb.name` | str \| None | `None` | W&B run name. | -| `wandb.group` | str \| None | `None` | W&B group. | -| `wandb.tags` | list[str] \| None | `None` | W&B tags attached to the run. | -| `wandb.offline` | bool | `False` | Run W&B in offline mode. | +| `wandb.project` | `str` | `'prime-rl'` | W&B project to log to. | +| `wandb.entity` | `str | None` | `None` | W&B entity to log to. | +| `wandb.name` | `str | None` | `None` | W&B run name. | +| `wandb.group` | `str | None` | `None` | W&B group. 
| +| `wandb.tags` | `list[str] | None` | `None` | W&B tags attached to the run. | +| `wandb.offline` | `bool` | `False` | Run W&B in offline mode. | ### `bench` @@ -1426,7 +1517,7 @@ Benchmark-mode configuration. When set, ``max_steps`` is forced to 4 and fake da | Field | Type | Default | Description | |---|---|---|---| -| `bench.output_json` | Path \| None | `None` | Path to write benchmark results as JSON. If unset, results are only printed to the console. | +| `bench.output_json` | `Path | None` | `None` | Path to write benchmark results as JSON. If unset, results are only printed to the console. | ### `gc` @@ -1435,7 +1526,7 @@ Garbage collection config. Disables automatic GC and runs deterministic collecti | Field | Type | Default | Description | |---|---|---|---| -| `gc.interval` | int | `50` | _≥1._ Run garbage collection every N training steps. Disables Python's automatic GC so every rank collects together and one slow rank can't stall the others. | +| `gc.interval` | `int` | `50` | _≥1._ Run garbage collection every N training steps. Disables Python's automatic GC so every rank collects together and one slow rank can't stall the others. | ### `heartbeat` @@ -1444,7 +1535,7 @@ BetterStack heartbeat configuration for monitoring training progress. | Field | Type | Default | Description | |---|---|---|---| -| `heartbeat.url` | str | *required* | URL to send the heartbeat to. | +| `heartbeat.url` | `str` | *required* | URL to send the heartbeat to. | ### `slurm` @@ -1453,15 +1544,15 @@ SLURM configuration. When set, the run is submitted as a SLURM job instead of ru | Field | Type | Default | Description | |---|---|---|---| -| `slurm.job_name` | str | `'prime-rl'` | SLURM job name. | -| `slurm.project_dir` | Path | `'.'` | Path to the project root, used to source .env, activate .venv, and run uv sync. | -| `slurm.template_path` | Path \| None | `None` | SLURM template file. If None, uses the bundled single-node or multi-node template. 
| -| `slurm.partition` | str | `'cluster'` | SLURM partition (#SBATCH --partition). | -| `slurm.nodelist` | str \| None | `None` | Comma-separated list of specific nodes to run on (#SBATCH --nodelist). | -| `slurm.exclude` | str \| None | `None` | Comma-separated list of nodes to exclude (#SBATCH --exclude). | -| `slurm.account` | str \| None | `None` | SLURM account to charge (#SBATCH --account). | -| `slurm.time` | str \| None | `None` | Maximum wall time, e.g. '24:00:00' or '7-00:00:00' (#SBATCH --time). | -| `slurm.pre_run_command` | str \| None | `None` | Shell command to run on the head node after cd, .env sourcing, and venv activation. Useful for cleanup like ``sudo pkill -f vllm``; wrap with ``srun bash -c '...'`` to fan out to all nodes. | +| `slurm.job_name` | `str` | `'prime-rl'` | SLURM job name. | +| `slurm.project_dir` | `Path` | `'.'` | Path to the project root, used to source .env, activate .venv, and run uv sync. | +| `slurm.template_path` | `Path | None` | `None` | SLURM template file. If None, uses the bundled single-node or multi-node template. | +| `slurm.partition` | `str` | `'cluster'` | SLURM partition (#SBATCH --partition). | +| `slurm.nodelist` | `str | None` | `None` | Comma-separated list of specific nodes to run on (#SBATCH --nodelist). | +| `slurm.exclude` | `str | None` | `None` | Comma-separated list of nodes to exclude (#SBATCH --exclude). | +| `slurm.account` | `str | None` | `None` | SLURM account to charge (#SBATCH --account). | +| `slurm.time` | `str | None` | `None` | Maximum wall time, e.g. '24:00:00' or '7-00:00:00' (#SBATCH --time). | +| `slurm.pre_run_command` | `str | None` | `None` | Shell command to run on the head node after cd, .env sourcing, and venv activation. Useful for cleanup like ``sudo pkill -f vllm``; wrap with ``srun bash -c '...'`` to fan out to all nodes. 
| ### `experimental` @@ -1476,31 +1567,31 @@ Discriminated union — set `data.type` to one of `fake`, `sft` and provide the | Field | Type | Default | Description | |---|---|---|---| -| `data.batch_size` | int | `128` | _≥1._ Global batch size. | -| `data.seq_len` | int | `128` | _≥1._ Sequence length. | -| `data.pack_function` | 'cat' \| 'stack' | `'cat'` | Sample packing strategy. ``cat`` concatenates; ``stack`` requires ``seq_len`` divisible by 256. | -| `data.micro_batch_size` | int | `1` | _≥1._ Per-step micro batch size. ``batch_size`` must be divisible by this. | -| `data.type` | 'fake' | `'fake'` | | -| `data.length` | 'fixed' \| 'variable' | `'fixed'` | Use fixed-length samples or variable-length samples. | -| `data.input_ids` | 'increasing' \| 'random' | `'increasing'` | Token id generator: ``increasing`` for deterministic sequences, ``random`` for random ids. | +| `data.batch_size` | `int` | `128` | _≥1._ Global batch size. | +| `data.seq_len` | `int` | `128` | _≥1._ Sequence length. | +| `data.pack_function` | `'cat' | 'stack'` | `'cat'` | Sample packing strategy. ``cat`` concatenates; ``stack`` requires ``seq_len`` divisible by 256. | +| `data.micro_batch_size` | `int` | `1` | _≥1._ Per-step micro batch size. ``batch_size`` must be divisible by this. | +| `data.type` | `'fake'` | `'fake'` | | +| `data.length` | `'fixed' | 'variable'` | `'fixed'` | Use fixed-length samples or variable-length samples. | +| `data.input_ids` | `'increasing' | 'random'` | `'increasing'` | Token id generator: ``increasing`` for deterministic sequences, ``random`` for random ids. | #### `data.type = "sft"` (SFTDataConfig) | Field | Type | Default | Description | |---|---|---|---| -| `data.batch_size` | int | `128` | _≥1._ Global batch size. | -| `data.seq_len` | int | `128` | _≥1._ Sequence length. | -| `data.pack_function` | 'cat' \| 'stack' | `'cat'` | Sample packing strategy. ``cat`` concatenates; ``stack`` requires ``seq_len`` divisible by 256. 
| -| `data.micro_batch_size` | int | `1` | _≥1._ Per-step micro batch size. ``batch_size`` must be divisible by this. | -| `data.type` | 'sft' | `'sft'` | | -| `data.name` | str | `'PrimeIntellect/Reverse-Text-SFT'` | HF dataset name or path. | -| `data.subsets` | list[str] \| None | `None` | Subsets to load from the HF dataset. | -| `data.splits` | list[str] \| None | `None` | Splits to load from the HF dataset. | -| `data.probabilities` | list[float] \| None | `None` | Sampling probabilities for each subset/split. | -| `data.stopping_strategy` | 'first_exhausted' \| 'all_exhausted' | `'all_exhausted'` | Stopping strategy when interleaving multiple subsets/splits. | -| `data.shuffle` | bool | `True` | Shuffle the dataset at the start of each epoch. | -| `data.seed` | int | `0` | Random seed for shuffling. Re-shuffled per epoch by adding the epoch count to the seed. | +| `data.batch_size` | `int` | `128` | _≥1._ Global batch size. | +| `data.seq_len` | `int` | `128` | _≥1._ Sequence length. | +| `data.pack_function` | `'cat' | 'stack'` | `'cat'` | Sample packing strategy. ``cat`` concatenates; ``stack`` requires ``seq_len`` divisible by 256. | +| `data.micro_batch_size` | `int` | `1` | _≥1._ Per-step micro batch size. ``batch_size`` must be divisible by this. | +| `data.type` | `'sft'` | `'sft'` | | +| `data.name` | `str` | `'PrimeIntellect/Reverse-Text-SFT'` | HF dataset name or path. | +| `data.subsets` | `list[str] | None` | `None` | Subsets to load from the HF dataset. | +| `data.splits` | `list[str] | None` | `None` | Splits to load from the HF dataset. | +| `data.probabilities` | `list[float] | None` | `None` | Sampling probabilities for each subset/split. | +| `data.stopping_strategy` | `'first_exhausted' | 'all_exhausted'` | `'all_exhausted'` | Stopping strategy when interleaving multiple subsets/splits. | +| `data.shuffle` | `bool` | `True` | Shuffle the dataset at the start of each epoch. | +| `data.seed` | `int` | `0` | Random seed for shuffling. 
Re-shuffled per epoch by adding the epoch count to the seed. | ##### `data.loss_mask` @@ -1509,10 +1600,10 @@ Which message types contribute to the loss. | Field | Type | Default | Description | |---|---|---|---| -| `data.loss_mask.system` | bool | `False` | System messages contribute to the loss. | -| `data.loss_mask.user` | bool | `False` | User messages contribute to the loss. | -| `data.loss_mask.assistant` | bool | `True` | Assistant messages contribute to the loss. | -| `data.loss_mask.tool` | bool | `False` | Tool messages contribute to the loss. | +| `data.loss_mask.system` | `bool` | `False` | System messages contribute to the loss. | +| `data.loss_mask.user` | `bool` | `False` | User messages contribute to the loss. | +| `data.loss_mask.assistant` | `bool` | `True` | Assistant messages contribute to the loss. | +| `data.loss_mask.tool` | `bool` | `False` | Tool messages contribute to the loss. | ### `optim` @@ -1524,47 +1615,47 @@ Discriminated union — set `optim.type` to one of `sgd`, `adamw`, `muon`, `sign | Field | Type | Default | Description | |---|---|---|---| -| `optim.lr` | float | `1e-06` | _≥0._ Peak learning rate. | -| `optim.weight_decay` | float | `0.01` | _≥0._ L2 weight-decay coefficient. | -| `optim.max_norm` | float \| None | `1.0` | _≥0._ Maximum gradient norm to clip to. If None, gradient clipping is disabled. | -| `optim.type` | 'sgd' | `'sgd'` | | -| `optim.nesterov` | bool | `True` | Use Nesterov momentum. | -| `optim.momentum` | float | `0.9` | SGD momentum factor. | +| `optim.lr` | `float` | `1e-06` | _≥0._ Peak learning rate. | +| `optim.weight_decay` | `float` | `0.01` | _≥0._ L2 weight-decay coefficient. | +| `optim.max_norm` | `float | None` | `1.0` | _≥0._ Maximum gradient norm to clip to. If None, gradient clipping is disabled. | +| `optim.type` | `'sgd'` | `'sgd'` | | +| `optim.nesterov` | `bool` | `True` | Use Nesterov momentum. | +| `optim.momentum` | `float` | `0.9` | SGD momentum factor. 
| #### `optim.type = "adamw"` (AdamWConfig) | Field | Type | Default | Description | |---|---|---|---| -| `optim.lr` | float | `1e-06` | _≥0._ Peak learning rate. | -| `optim.weight_decay` | float | `0.01` | _≥0._ L2 weight-decay coefficient. | -| `optim.max_norm` | float \| None | `1.0` | _≥0._ Maximum gradient norm to clip to. If None, gradient clipping is disabled. | -| `optim.type` | 'adamw' | `'adamw'` | | -| `optim.betas1` | float | `0.9` | _≥0._ Adam first-moment (β1) decay. | -| `optim.betas2` | float | `0.999` | _≥0._ Adam second-moment (β2) decay. | +| `optim.lr` | `float` | `1e-06` | _≥0._ Peak learning rate. | +| `optim.weight_decay` | `float` | `0.01` | _≥0._ L2 weight-decay coefficient. | +| `optim.max_norm` | `float | None` | `1.0` | _≥0._ Maximum gradient norm to clip to. If None, gradient clipping is disabled. | +| `optim.type` | `'adamw'` | `'adamw'` | | +| `optim.betas1` | `float` | `0.9` | _≥0._ Adam first-moment (β1) decay. | +| `optim.betas2` | `float` | `0.999` | _≥0._ Adam second-moment (β2) decay. | #### `optim.type = "muon"` (MuonConfig) | Field | Type | Default | Description | |---|---|---|---| -| `optim.lr` | float | `1e-06` | _≥0._ Peak learning rate. | -| `optim.weight_decay` | float | `0.01` | _≥0._ L2 weight-decay coefficient. | -| `optim.max_norm` | float \| None | `1.0` | _≥0._ Maximum gradient norm to clip to. If None, gradient clipping is disabled. | -| `optim.type` | 'muon' | `'muon'` | | -| `optim.mu` | float | `0.95` | _≥0._ Momentum factor for the Muon algorithm. | -| `optim.betas1` | float | `0.9` | _≥0._ β1 for the AdamW/Lion sub-optimizer used on non-Muon params. | -| `optim.betas2` | float | `0.95` | _≥0._ β2 for the AdamW/Lion sub-optimizer used on non-Muon params. | +| `optim.lr` | `float` | `1e-06` | _≥0._ Peak learning rate. | +| `optim.weight_decay` | `float` | `0.01` | _≥0._ L2 weight-decay coefficient. | +| `optim.max_norm` | `float | None` | `1.0` | _≥0._ Maximum gradient norm to clip to. 
If None, gradient clipping is disabled. | +| `optim.type` | `'muon'` | `'muon'` | | +| `optim.mu` | `float` | `0.95` | _≥0._ Momentum factor for the Muon algorithm. | +| `optim.betas1` | `float` | `0.9` | _≥0._ β1 for the AdamW/Lion sub-optimizer used on non-Muon params. | +| `optim.betas2` | `float` | `0.95` | _≥0._ β2 for the AdamW/Lion sub-optimizer used on non-Muon params. | #### `optim.type = "sign_sgd"` (SignSGDConfig) | Field | Type | Default | Description | |---|---|---|---| -| `optim.lr` | float | `1e-06` | _≥0._ Peak learning rate. | -| `optim.weight_decay` | float | `0.01` | _≥0._ L2 weight-decay coefficient. | -| `optim.max_norm` | float \| None | `1.0` | _≥0._ Maximum gradient norm to clip to. If None, gradient clipping is disabled. | -| `optim.type` | 'sign_sgd' | `'sign_sgd'` | | +| `optim.lr` | `float` | `1e-06` | _≥0._ Peak learning rate. | +| `optim.weight_decay` | `float` | `0.01` | _≥0._ L2 weight-decay coefficient. | +| `optim.max_norm` | `float | None` | `1.0` | _≥0._ Maximum gradient norm to clip to. If None, gradient clipping is disabled. | +| `optim.type` | `'sign_sgd'` | `'sign_sgd'` | | ### `scheduler` @@ -1576,26 +1667,26 @@ Discriminated union — set `scheduler.type` to one of `constant`, `linear`, `co | Field | Type | Default | Description | |---|---|---|---| -| `scheduler.type` | 'constant' | `'constant'` | | +| `scheduler.type` | `'constant'` | `'constant'` | | #### `scheduler.type = "linear"` (LinearSchedulerConfig) | Field | Type | Default | Description | |---|---|---|---| -| `scheduler.type` | 'linear' | `'linear'` | | -| `scheduler.warmup_steps` | int | `10` | _≥0._ Warmup steps for the learning rate scheduler. | -| `scheduler.decay_steps` | int | `10` | _≥0._ Steps to decay the learning rate during the final portion of training. | -| `scheduler.min_lr` | float | `0.0` | _≥0._ Minimum learning rate to converge to. 
| +| `scheduler.type` | `'linear'` | `'linear'` | | +| `scheduler.warmup_steps` | `int` | `10` | _≥0._ Warmup steps for the learning rate scheduler. | +| `scheduler.decay_steps` | `int` | `10` | _≥0._ Steps to decay the learning rate during the final portion of training. | +| `scheduler.min_lr` | `float` | `0.0` | _≥0._ Minimum learning rate to converge to. | #### `scheduler.type = "cosine"` (CosineSchedulerConfig) | Field | Type | Default | Description | |---|---|---|---| -| `scheduler.type` | 'cosine' | `'cosine'` | | -| `scheduler.warmup_steps` | int | `10` | _≥0._ Warmup steps for the learning rate scheduler. | -| `scheduler.min_lr` | float | `0.0` | _≥0._ Minimum learning rate to converge to. | +| `scheduler.type` | `'cosine'` | `'cosine'` | | +| `scheduler.warmup_steps` | `int` | `10` | _≥0._ Warmup steps for the learning rate scheduler. | +| `scheduler.min_lr` | `float` | `0.0` | _≥0._ Minimum learning rate to converge to. | ### `deployment` @@ -1607,19 +1698,19 @@ Discriminated union — set `deployment.type` to one of `single_node`, `multi_no | Field | Type | Default | Description | |---|---|---|---| -| `deployment.gpus_per_node` | int | `8` | GPUs per node. | -| `deployment.type` | 'single_node' | `'single_node'` | | -| `deployment.num_gpus` | int | `1` | GPUs to use. | +| `deployment.gpus_per_node` | `int` | `8` | GPUs per node. | +| `deployment.type` | `'single_node'` | `'single_node'` | | +| `deployment.num_gpus` | `int` | `1` | GPUs to use. | #### `deployment.type = "multi_node"` (MultiNodeDeploymentConfig) | Field | Type | Default | Description | |---|---|---|---| -| `deployment.gpus_per_node` | int | `8` | GPUs per node. | -| `deployment.type` | 'multi_node' | `'multi_node'` | | -| `deployment.num_nodes` | int | `2` | Training nodes. | -| `deployment.nodes_per_fsdp_group` | int \| None | `None` | Nodes per FSDP island. Auto-sets ``model.dp_replicate = num_nodes / nodes_per_fsdp_group``. | +| `deployment.gpus_per_node` | `int` | `8` | GPUs per node. 
| +| `deployment.type` | `'multi_node'` | `'multi_node'` | | +| `deployment.num_nodes` | `int` | `2` | Training nodes. | +| `deployment.nodes_per_fsdp_group` | `int | None` | `None` | Nodes per FSDP island. Auto-sets ``model.dp_replicate = num_nodes / nodes_per_fsdp_group``. | ## `trainer` — Standalone trainer @@ -1630,42 +1721,42 @@ _Defined in_ `prime_rl.configs.trainer.TrainerConfig`. | Field | Type | Default | Description | |---|---|---|---| -| `output_dir` | Path | `'outputs'` | Directory to write outputs to — checkpoints, weights, rollouts, and logs are written as subdirectories. Should be a persistent directory with enough disk space and unique per experiment running on a single node. | -| `matmul_precision` | 'highest' \| 'high' \| 'medium' | `'high'` | Precision for float32 matrix multiplications. ``highest`` is full FP32 (required on ROCm/AMD GPUs to avoid catastrophic precision loss in softmax over large vocabularies). ``high`` enables TF32 on NVIDIA GPUs for a speedup with minor precision tradeoff. See ``torch.set_float32_matmul_precision``. | -| `max_steps` | int \| None | `None` | Maximum number of training steps. If None, runs indefinitely. | -| `max_async_level` | int | `1` | _≥0._ Maximum steps inference can be ahead of training (how off-policy inference can be). Higher values yield better throughput via async execution at the cost of policy lag; ``0`` is fully synchronous. | -| `enable_router_replay` | bool | `False` | Return routed experts in the batch so the trainer can replay routing. Requires ``enable_return_routed_experts=true`` on the vLLM server (or ``--enable-return-routed-experts``) and is only supported for custom models. | -| `memory_profiler_path` | Path \| None | `None` | Path to write the memory profile to. | -| `trace_path` | Path \| None | `None` | Path to write the PyTorch profiler trace to. | -| `dist_timeout_seconds` | int | `600` | Timeout in seconds for torch distributed ops. 
| -| `max_concurrent_runs` | int | `1` | _≥1._ Maximum number of concurrent runs to allow. If 1, only one run may run at a time. | +| `output_dir` | `Path` | `'outputs'` | Directory to write outputs to — checkpoints, weights, rollouts, and logs are written as subdirectories. Should be a persistent directory with enough disk space and unique per experiment running on a single node. | +| `matmul_precision` | `'highest' | 'high' | 'medium'` | `'high'` | Precision for float32 matrix multiplications. ``highest`` is full FP32 (required on ROCm/AMD GPUs to avoid catastrophic precision loss in softmax over large vocabularies). ``high`` enables TF32 on NVIDIA GPUs for a speedup with minor precision tradeoff. See ``torch.set_float32_matmul_precision``. | +| `max_steps` | `int | None` | `None` | Maximum number of training steps. If None, runs indefinitely. | +| `max_async_level` | `int` | `1` | _≥0._ Maximum steps inference can be ahead of training (how off-policy inference can be). Higher values yield better throughput via async execution at the cost of policy lag; ``0`` is fully synchronous. | +| `enable_router_replay` | `bool` | `False` | Return routed experts in the batch so the trainer can replay routing. Requires ``enable_return_routed_experts=true`` on the vLLM server (or ``--enable-return-routed-experts``) and is only supported for custom models. | +| `memory_profiler_path` | `Path | None` | `None` | Path to write the memory profile to. | +| `trace_path` | `Path | None` | `None` | Path to write the PyTorch profiler trace to. | +| `dist_timeout_seconds` | `int` | `600` | Timeout in seconds for torch distributed ops. | +| `max_concurrent_runs` | `int` | `1` | _≥1._ Maximum number of concurrent runs to allow. If 1, only one run may run at a time. | ### `model` | Field | Type | Default | Description | |---|---|---|---| -| `model.name` | str | `'Qwen/Qwen3-0.6B'` | HF model name or local path. 
| -| `model.trust_remote_code` | bool | `False` | Trust remote code when initializing the tokenizer. | -| `model.seq_len` | int | `2048` | Sequence length the model is trained on. | -| `model.attn` | 'eager' \| 'sdpa' \| 'flash_attention_2' \| 'flash_attention_3' \| 'fa4' | `'flash_attention_2'` | Attention implementation. With CP enabled, ring attention uses the matching kernel family (FA2/FA3/FA4). | -| `model.fsdp_cpu_offload` | bool | `False` | Enable FSDP CPU offloading for parameters, gradients, and optimizer states. Uses pinned memory for efficient CPU↔GPU transfers. | -| `model.optim_cpu_offload` | bool | `False` | Offload only optimizer states (momentum, variance) to CPU, keeping weights on GPU. Avoids the H2D all-gather overhead of FSDP CPU offload while still saving GPU memory. | -| `model.reshard_after_forward` | bool | `True` | Reshard the model after each forward pass. | -| `model.dp_replicate` | int | `1` | Data parallel dim where model weights are replicated. | -| `model.ep` | int | `1` | Expert parallelism degree for MoE layers. 1 disables EP. | -| `model.ep_comm_backend` | 'torch' \| 'deepep' | `'torch'` | Communication backend for expert parallelism. ``torch`` uses TorchTitan all-to-all collectives; ``deepep`` uses DeepEP custom kernels. | -| `model.deepep_num_sms` | int | `20` | _≥1._ SMs allocated for DeepEP intranode dispatch/combine kernels. Also determines internode RDMA channel count (``num_channels = num_sms / 2``). Lower values leave more SMs for compute; higher values speed up dispatch/combine. The optimal value depends on EP degree and hardware. Only used when ``ep_comm_backend='deepep'``. | -| `model.deepep_token_chunk_size` | int \| None | `None` | _≥1._ Token chunk size for DeepEP MoE pipelining. When set, DeepEP dispatch for chunk i+1 is launched while experts compute chunk i. Only used when ``ep_comm_backend='deepep'``. | -| `model.cp` | int | `1` | Context parallelism degree. 1 disables CP. 
| -| `model.cp_style` | 'ring' \| 'ulysses' | `'ring'` | CP communication style. ``ring`` uses ring-attention all-gather/reduce-scatter (requires custom kernels per attention type). ``ulysses`` uses all-to-all to redistribute Q/K/V from sequence-sharded to head-sharded, runs vanilla attention locally on the full sequence, then all-to-all back — works out-of-the-box with any attention kernel (softmax FA, linear attention, mamba, etc.). | -| `model.impl` | 'hf' \| 'custom' \| 'auto' | `'auto'` | Model implementation. ``auto`` selects ``custom`` if supported by the model, otherwise ``hf``. | -| `model.optimization_dtype` | 'bfloat16' \| 'float32' | `'float32'` | dtype for model optimization. | -| `model.reduce_dtype` | 'bfloat16' \| 'float32' | `'float32'` | dtype for gradient/parameter reductions. | -| `model.moe_use_grouped_mm` | bool | `True` | Use grouped mm for MoE layers. Requires compute capability ≥ 9.0. | -| `model.fp8` | bool | `False` | FP8 training via DeepGEMM. Replaces ``nn.Linear`` with FP8 blockwise linear and uses FP8 grouped GEMM for MoE experts. Requires SM90 (Hopper) GPUs and ``model.impl='custom'``. | -| `model.freeze_moe_router` | bool | `False` | Freeze MoE router parameters during training. | -| `model.fused_lm_head_token_chunk_size` | int \| 'auto' \| 'disabled' | `'disabled'` | Flattened token chunk size for the fused LM head. ``int >= 1`` sets the tokens per LM-head chunk explicitly; ``auto`` auto-enables (RL training picks 8192); ``disabled`` uses the vanilla LM head. Integer values aren't supported for SFT training. | +| `model.name` | `str` | `'Qwen/Qwen3-0.6B'` | HF model name or local path. | +| `model.trust_remote_code` | `bool` | `False` | Trust remote code when initializing the tokenizer. | +| `model.seq_len` | `int` | `2048` | Sequence length the model is trained on. | +| `model.attn` | `'eager' | 'sdpa' | 'flash_attention_2' | 'flash_attention_3' | 'fa4'` | `'flash_attention_2'` | Attention implementation. 
With CP enabled, ring attention uses the matching kernel family (FA2/FA3/FA4). | +| `model.fsdp_cpu_offload` | `bool` | `False` | Enable FSDP CPU offloading for parameters, gradients, and optimizer states. Uses pinned memory for efficient CPU↔GPU transfers. | +| `model.optim_cpu_offload` | `bool` | `False` | Offload only optimizer states (momentum, variance) to CPU, keeping weights on GPU. Avoids the H2D all-gather overhead of FSDP CPU offload while still saving GPU memory. | +| `model.reshard_after_forward` | `bool` | `True` | Reshard the model after each forward pass. | +| `model.dp_replicate` | `int` | `1` | Data parallel dim where model weights are replicated. | +| `model.ep` | `int` | `1` | Expert parallelism degree for MoE layers. 1 disables EP. | +| `model.ep_comm_backend` | `'torch' | 'deepep'` | `'torch'` | Communication backend for expert parallelism. ``torch`` uses TorchTitan all-to-all collectives; ``deepep`` uses DeepEP custom kernels. | +| `model.deepep_num_sms` | `int` | `20` | _≥1._ SMs allocated for DeepEP intranode dispatch/combine kernels. Also determines internode RDMA channel count (``num_channels = num_sms / 2``). Lower values leave more SMs for compute; higher values speed up dispatch/combine. The optimal value depends on EP degree and hardware. Only used when ``ep_comm_backend='deepep'``. | +| `model.deepep_token_chunk_size` | `int | None` | `None` | _≥1._ Token chunk size for DeepEP MoE pipelining. When set, DeepEP dispatch for chunk i+1 is launched while experts compute chunk i. Only used when ``ep_comm_backend='deepep'``. | +| `model.cp` | `int` | `1` | Context parallelism degree. 1 disables CP. | +| `model.cp_style` | `'ring' | 'ulysses'` | `'ring'` | CP communication style. ``ring`` uses ring-attention all-gather/reduce-scatter (requires custom kernels per attention type). 
``ulysses`` uses all-to-all to redistribute Q/K/V from sequence-sharded to head-sharded, runs vanilla attention locally on the full sequence, then all-to-all back — works out-of-the-box with any attention kernel (softmax FA, linear attention, mamba, etc.). | +| `model.impl` | `'hf' | 'custom' | 'auto'` | `'auto'` | Model implementation. ``auto`` selects ``custom`` if supported by the model, otherwise ``hf``. | +| `model.optimization_dtype` | `'bfloat16' | 'float32'` | `'float32'` | dtype for model optimization. | +| `model.reduce_dtype` | `'bfloat16' | 'float32'` | `'float32'` | dtype for gradient/parameter reductions. | +| `model.moe_use_grouped_mm` | `bool` | `True` | Use grouped mm for MoE layers. Requires compute capability ≥ 9.0. | +| `model.fp8` | `bool` | `False` | FP8 training via DeepGEMM. Replaces ``nn.Linear`` with FP8 blockwise linear and uses FP8 grouped GEMM for MoE experts. Requires SM90 (Hopper) GPUs and ``model.impl='custom'``. | +| `model.freeze_moe_router` | `bool` | `False` | Freeze MoE router parameters during training. | +| `model.fused_lm_head_token_chunk_size` | `int | 'auto' | 'disabled'` | `'disabled'` | Flattened token chunk size for the fused LM head. ``int >= 1`` sets the tokens per LM-head chunk explicitly; ``auto`` auto-enables (RL training picks 8192); ``disabled`` uses the vanilla LM head. Integer values aren't supported for SFT training. | #### `model.vlm` @@ -1674,9 +1765,9 @@ VLM configuration. Setting this enables vision-language model support. | Field | Type | Default | Description | |---|---|---|---| -| `model.vlm.vision_encoder_attr` | str | *required* | Dotted attribute path to the vision encoder module (e.g. ``model.visual``). | -| `model.vlm.language_model_attr` | str | *required* | Dotted attribute path to the language model module (e.g. ``model.language_model``). | -| `model.vlm.freeze_vision_encoder` | bool | `True` | Freeze the vision encoder. When False, it is trainable and FSDP-sharded per-block. 
No effect with LoRA (LoRA freezes all non-adapter parameters). | +| `model.vlm.vision_encoder_attr` | `str` | *required* | Dotted attribute path to the vision encoder module (e.g. ``model.visual``). | +| `model.vlm.language_model_attr` | `str` | *required* | Dotted attribute path to the language model module (e.g. ``model.language_model``). | +| `model.vlm.freeze_vision_encoder` | `bool` | `True` | Freeze the vision encoder. When False, it is trainable and FSDP-sharded per-block. No effect with LoRA (LoRA freezes all non-adapter parameters). | #### `model.compile` @@ -1685,7 +1776,7 @@ Compile the model with ``torch.compile``. | Field | Type | Default | Description | |---|---|---|---| -| `model.compile.fullgraph` | bool | `False` | Compile transformer blocks with ``fullgraph=True``. | +| `model.compile.fullgraph` | `bool` | `False` | Compile transformer blocks with ``fullgraph=True``. | #### `model.ac` @@ -1694,9 +1785,9 @@ Activation checkpointing configuration. If None, activation checkpointing is dis | Field | Type | Default | Description | |---|---|---|---| -| `model.ac.mode` | 'full' \| 'selective' | `'full'` | ``full`` checkpoints whole transformer blocks; ``selective`` checkpoints only the subcomponents listed in ``targets`` inside supported custom decoder layers. | -| `model.ac.freq` | int | `1` | _≥1._ Apply activation checkpointing to every N layers. | -| `model.ac.targets` | list[str] | `['norm']` | Selective checkpoint targets. ``norm`` checkpoints every norm module inside selected layers. ``attn_proj`` checkpoints projection-side attention work outside the kernel (input/output projections, attention-local norms, RoPE, gating, model-specific MLA projection helpers). ``mlp`` checkpoints the entire dense MLP forward (not for MoE). ``mla_up_proj`` checkpoints MLA Q/KV up-projection where supported. ``routed_experts`` checkpoints routed expert compute in MoE layers (including LatentMoE). 
``linear_attn`` checkpoints non-softmax token mixers (NemotronH Mamba, Qwen3.5-MoE GatedDeltaNet, AFMoE sliding-window attention). | +| `model.ac.mode` | `'full' | 'selective'` | `'full'` | ``full`` checkpoints whole transformer blocks; ``selective`` checkpoints only the subcomponents listed in ``targets`` inside supported custom decoder layers. | +| `model.ac.freq` | `int` | `1` | _≥1._ Apply activation checkpointing to every N layers. | +| `model.ac.targets` | `list[str]` | `['norm']` | Selective checkpoint targets. ``norm`` checkpoints every norm module inside selected layers. ``attn_proj`` checkpoints projection-side attention work outside the kernel (input/output projections, attention-local norms, RoPE, gating, model-specific MLA projection helpers). ``mlp`` checkpoints the entire dense MLP forward (not for MoE). ``mla_up_proj`` checkpoints MLA Q/KV up-projection where supported. ``routed_experts`` checkpoints routed expert compute in MoE layers (including LatentMoE). ``linear_attn`` checkpoints non-softmax token mixers (NemotronH Mamba, Qwen3.5-MoE GatedDeltaNet, AFMoE sliding-window attention). | #### `model.ac_offloading` @@ -1705,8 +1796,8 @@ Activation offloading configuration. If None, activation offloading is disabled. | Field | Type | Default | Description | |---|---|---|---| -| `model.ac_offloading.pin_memory` | bool | `True` | Pin offloaded activations to CPU memory. | -| `model.ac_offloading.max_inflight_activations` | int | `5` | _≥1._ Max activations kept in flight while offloading. More activations smooth overlap at the cost of GPU memory. | +| `model.ac_offloading.pin_memory` | `bool` | `True` | Pin offloaded activations to CPU memory. | +| `model.ac_offloading.max_inflight_activations` | `int` | `5` | _≥1._ Max activations kept in flight while offloading. More activations smooth overlap at the cost of GPU memory. | #### `model.index_cache` @@ -1715,8 +1806,8 @@ DSA IndexCache sub-configuration. 
If set, sparse-attention top-k indices are reu
 | Field | Type | Default | Description |
 |---|---|---|---|
-| `model.index_cache.topk_freq` | int | `1` | _≥1._ Recompute DSA top-k indices every N layers; intervening layers reuse the cached indices. ``1`` recomputes every layer (effectively no reuse). Mirrors vLLM's ``index_topk_freq`` HF override. |
-| `model.index_cache.topk_pattern` | str \| None | `None` | Optional per-layer schedule that overrides ``topk_freq``. ``'F'`` computes fresh indices for that layer; ``'S'`` reuses the previously cached indices. Length should match the number of decoder layers. |
+| `model.index_cache.topk_freq` | `int` | `1` | _≥1._ Recompute DSA top-k indices every N layers; intervening layers reuse the cached indices. ``1`` recomputes every layer (effectively no reuse). Mirrors vLLM's ``index_topk_freq`` HF override. |
+| `model.index_cache.topk_pattern` | `str | None` | `None` | Optional per-layer schedule that overrides ``topk_freq``. ``'F'`` computes fresh indices for that layer; ``'S'`` reuses the previously cached indices. Length should match the number of decoder layers. |
 #### `model.lora`
@@ -1725,11 +1816,11 @@ LoRA configuration. If None, LoRA is disabled.
 | Field | Type | Default | Description |
 |---|---|---|---|
-| `model.lora.rank` | int | `16` | _≥1._ Rank of the low-rank decomposition matrices. |
-| `model.lora.alpha` | float | `32.0` | _≥0._ LoRA scaling parameter. |
-| `model.lora.dropout` | float | `0.0` | _≥0, ≤1._ LoRA dropout rate. |
-| `model.lora.target_modules` | list[str] | `['q_proj', 'k_proj', 'v_proj', 'o_proj', 'gate_proj', 'up_proj', 'down_proj', 'experts', 'fc1_latent_proj', 'fc2_latent_proj']` | Module names or regex patterns to apply LoRA to. Simple names (e.g. ``q_proj``) match any component in the module path; regex patterns match anywhere in the name. Names unknown to the current model are silently ignored, so defaults cover multiple architectures.
NemotronH note: ``experts`` matches NonGatedGroupedExperts inside LatentMoE; ``fc1_latent_proj``/``fc2_latent_proj`` adapt the latent up/down projections. Add ``in_proj``/``out_proj`` to also LoRA Mamba. |
-| `model.lora.modules_to_save` | list[str] | `[]` | Module names or regex patterns to keep fully trainable (not freeze). Same matching rules as ``target_modules``. |
+| `model.lora.rank` | `int` | `16` | _≥1._ Rank of the low-rank decomposition matrices. |
+| `model.lora.alpha` | `float` | `32.0` | _≥0._ LoRA scaling parameter. |
+| `model.lora.dropout` | `float` | `0.0` | _≥0, ≤1._ LoRA dropout rate. |
+| `model.lora.target_modules` | `list[str]` | `['q_proj', 'k_proj', 'v_proj', 'o_proj', 'gate_proj', 'up_proj', 'down_proj', 'experts', 'fc1_latent_proj', 'fc2_latent_proj']` | Module names or regex patterns to apply LoRA to. Simple names (e.g. ``q_proj``) match any component in the module path; regex patterns match anywhere in the name. Names unknown to the current model are silently ignored, so defaults cover multiple architectures. NemotronH note: ``experts`` matches NonGatedGroupedExperts inside LatentMoE; ``fc1_latent_proj``/``fc2_latent_proj`` adapt the latent up/down projections. Add ``in_proj``/``out_proj`` to also LoRA Mamba. |
+| `model.lora.modules_to_save` | `list[str]` | `[]` | Module names or regex patterns to keep fully trainable (not freeze). Same matching rules as ``target_modules``. |
 #### `model.debug`
@@ -1738,18 +1829,18 @@ Debugging knobs for the model and distributed training.
 | Field | Type | Default | Description |
 |---|---|---|---|
-| `model.debug.num_layers` | int \| None | `None` | Override the number of transformer layers (truncates the model). |
-| `model.debug.random_init` | bool | `False` | Randomly initialize the model instead of loading weights. |
-| `model.debug.force_balanced_routing` | bool | `False` | Replace MoE token-choice routing with a round-robin assignment so every expert sees an equal share.
Intended for fake-data smoke tests where untrained routing would otherwise OOM under severe imbalance. Gating scores are still gathered from the override indices so the forward pass stays consistent. |
+| `model.debug.num_layers` | `int | None` | `None` | Override the number of transformer layers (truncates the model). |
+| `model.debug.random_init` | `bool` | `False` | Randomly initialize the model instead of loading weights. |
+| `model.debug.force_balanced_routing` | `bool` | `False` | Replace MoE token-choice routing with a round-robin assignment so every expert sees an equal share. Intended for fake-data smoke tests where untrained routing would otherwise OOM under severe imbalance. Gating scores are still gathered from the override indices so the forward pass stays consistent. |
 ### `tokenizer`
 | Field | Type | Default | Description |
 |---|---|---|---|
-| `tokenizer.name` | str \| None | `None` | Tokenizer name or path. If None, the model's default tokenizer is used. |
-| `tokenizer.trust_remote_code` | bool \| None | `None` | Trust remote code when initializing the tokenizer. If None, inherits the model's ``trust_remote_code`` setting. |
-| `tokenizer.chat_template` | str \| None | `None` | Chat template for the tokenizer. Either a Jinja2 template string or a path to a template file. If None, the tokenizer's default chat template is used. |
+| `tokenizer.name` | `str | None` | `None` | Tokenizer name or path. If None, the model's default tokenizer is used. |
+| `tokenizer.trust_remote_code` | `bool | None` | `None` | Trust remote code when initializing the tokenizer. If None, inherits the model's ``trust_remote_code`` setting. |
+| `tokenizer.chat_template` | `str | None` | `None` | Chat template for the tokenizer. Either a Jinja2 template string or a path to a template file. If None, the tokenizer's default chat template is used. |
 ### `data`
@@ -1761,8 +1852,8 @@ Use a fake data loader sampling random micro-batches (for debugging).
 | Field | Type | Default | Description |
 |---|---|---|---|
-| `data.fake.batch_size` | int | `2` | _≥1._ Batch size of the fake data loader. |
-| `data.fake.generate_samples` | bool | `False` | Generate separate samples and pack them into a single micro-batch instead of using random tensors. |
+| `data.fake.batch_size` | `int` | `2` | _≥1._ Batch size of the fake data loader. |
+| `data.fake.generate_samples` | `bool` | `False` | Generate separate samples and pack them into a single micro-batch instead of using random tensors. |
 ### `ckpt`
@@ -1771,17 +1862,17 @@ Full training-state checkpoint configuration (model + optimizer + scheduler). If
 | Field | Type | Default | Description |
 |---|---|---|---|
-| `ckpt.output_dir` | Path \| None | `None` | Override directory for checkpoints and weights. If set, checkpoints and weight snapshots are written here instead of under the trainer ``output_dir`` — useful for writing large checkpoints to a separate storage volume. |
-| `ckpt.interval` | int \| None | `None` | _≥1._ Interval at which to save the training checkpoint. If None, only checkpoints at the end of training. |
-| `ckpt.skip_gather_master_weights` | bool | `False` | Skip gathering and saving HF-compatible weight checkpoints. Useful for large models where the gather is expensive and only DCP checkpoints are needed. |
-| `ckpt.weights_only` | bool | `False` | Save only weight checkpoints (no optimizer/scheduler state). Much faster and smaller than full checkpoints, but cannot resume training. |
-| `ckpt.resume_step` | int \| None | `None` | _≥-1._ Step to resume training from. None starts from scratch; ``-1`` restarts from the latest checkpoint available. |
-| `ckpt.keep_last` | int \| None | `None` | _≥1._ Keep at most this many recent step checkpoints on disk. If None, never clean old checkpoints based on recency. |
-| `ckpt.keep_interval` | int \| None | `None` | _≥1._ Keep checkpoints at every N steps permanently (e.g.
``keep_interval=100`` keeps step 100, 200, ...). If None, no interval-based keeping. |
-| `ckpt.skip_progress` | bool | `False` | Skip loading the progress from checkpoint. |
-| `ckpt.skip_scheduler` | bool | `False` | Skip loading the scheduler from checkpoint. |
-| `ckpt.skip_dataloader` | bool | `False` | Skip loading the dataloader from checkpoint. |
-| `ckpt.skip_optimizer` | bool | `False` | Skip loading the optimizer state from checkpoint. |
+| `ckpt.output_dir` | `Path | None` | `None` | Override directory for checkpoints and weights. If set, checkpoints and weight snapshots are written here instead of under the trainer ``output_dir`` — useful for writing large checkpoints to a separate storage volume. |
+| `ckpt.interval` | `int | None` | `None` | _≥1._ Interval at which to save the training checkpoint. If None, only checkpoints at the end of training. |
+| `ckpt.skip_gather_master_weights` | `bool` | `False` | Skip gathering and saving HF-compatible weight checkpoints. Useful for large models where the gather is expensive and only DCP checkpoints are needed. |
+| `ckpt.weights_only` | `bool` | `False` | Save only weight checkpoints (no optimizer/scheduler state). Much faster and smaller than full checkpoints, but cannot resume training. |
+| `ckpt.resume_step` | `int | None` | `None` | _≥-1._ Step to resume training from. None starts from scratch; ``-1`` restarts from the latest checkpoint available. |
+| `ckpt.keep_last` | `int | None` | `None` | _≥1._ Keep at most this many recent step checkpoints on disk. If None, never clean old checkpoints based on recency. |
+| `ckpt.keep_interval` | `int | None` | `None` | _≥1._ Keep checkpoints at every N steps permanently (e.g. ``keep_interval=100`` keeps step 100, 200, ...). If None, no interval-based keeping. |
+| `ckpt.skip_progress` | `bool` | `False` | Skip loading the progress from checkpoint. |
+| `ckpt.skip_scheduler` | `bool` | `False` | Skip loading the scheduler from checkpoint.
|
+| `ckpt.skip_dataloader` | `bool` | `False` | Skip loading the dataloader from checkpoint. |
+| `ckpt.skip_optimizer` | `bool` | `False` | Skip loading the optimizer state from checkpoint. |
 #### `ckpt.weights`
@@ -1790,32 +1881,32 @@ Weight-checkpoint sub-configuration. If None, no HF-compatible weight checkpoint
 | Field | Type | Default | Description |
 |---|---|---|---|
-| `ckpt.weights.save_sharded` | bool | `True` | Save the weight checkpoint in sharded format. |
-| `ckpt.weights.save_format` | 'safetensors' \| 'torch' | `'safetensors'` | Weight checkpoint serialization format. |
-| `ckpt.weights.save_adapter_separately` | bool | `False` | Save LoRA adapters separately before merging into full model weights. |
+| `ckpt.weights.save_sharded` | `bool` | `True` | Save the weight checkpoint in sharded format. |
+| `ckpt.weights.save_format` | `'safetensors' | 'torch'` | `'safetensors'` | Weight checkpoint serialization format. |
+| `ckpt.weights.save_adapter_separately` | `bool` | `False` | Save LoRA adapters separately before merging into full model weights. |
 ### `log`
 | Field | Type | Default | Description |
 |---|---|---|---|
-| `log.level` | str | `'info'` | Log level for the process. Defaults to ``$PRIME_LOG_LEVEL`` if set, else ``info``. |
-| `log.vf_level` | str | `'info'` | Log level for the verifiers package. Defaults to ``$PRIME_VF_LOG_LEVEL`` if set, else ``info``. |
-| `log.json_logging` | bool | `False` | Emit newline-delimited JSON logs for aggregation (Loki, Grafana, etc.). |
-| `log.log_data` | bool | `False` | Log the first data sample at startup. |
-| `log.ranks_filter` | list[int] | `[0]` | Trainer ranks to show in console output. Passed to ``torchrun --local-ranks-filter``. |
+| `log.level` | `str` | `'info'` | Log level for the process. Defaults to ``$PRIME_LOG_LEVEL`` if set, else ``info``. |
+| `log.vf_level` | `str` | `'info'` | Log level for the verifiers package. Defaults to ``$PRIME_VF_LOG_LEVEL`` if set, else ``info``.
|
+| `log.json_logging` | `bool` | `False` | Emit newline-delimited JSON logs for aggregation (Loki, Grafana, etc.). |
+| `log.log_data` | `bool` | `False` | Log the first data sample at startup. |
+| `log.ranks_filter` | `list[int]` | `[0]` | Trainer ranks to show in console output. Passed to ``torchrun --local-ranks-filter``. |
 ### `wandb`
 | Field | Type | Default | Description |
 |---|---|---|---|
-| `wandb.project` | str | `'prime-rl'` | W&B project to log to. |
-| `wandb.entity` | str \| None | `None` | W&B entity to log to. |
-| `wandb.name` | str \| None | `None` | W&B run name. |
-| `wandb.group` | str \| None | `None` | W&B group. |
-| `wandb.tags` | list[str] \| None | `None` | W&B tags attached to the run. |
-| `wandb.offline` | bool | `False` | Run W&B in offline mode. |
+| `wandb.project` | `str` | `'prime-rl'` | W&B project to log to. |
+| `wandb.entity` | `str | None` | `None` | W&B entity to log to. |
+| `wandb.name` | `str | None` | `None` | W&B run name. |
+| `wandb.group` | `str | None` | `None` | W&B group. |
+| `wandb.tags` | `list[str] | None` | `None` | W&B tags attached to the run. |
+| `wandb.offline` | `bool` | `False` | Run W&B in offline mode. |
 ### `bench`
@@ -1824,7 +1915,7 @@ Benchmark-mode configuration. When set, ``max_steps`` is forced to 4 and fake da
 | Field | Type | Default | Description |
 |---|---|---|---|
-| `bench.output_json` | Path \| None | `None` | Path to write benchmark results as JSON. If unset, results are only printed to the console. |
+| `bench.output_json` | `Path | None` | `None` | Path to write benchmark results as JSON. If unset, results are only printed to the console. |
 ### `gc`
@@ -1833,7 +1924,7 @@ Garbage collection config. Disables automatic GC and runs deterministic collecti
 | Field | Type | Default | Description |
 |---|---|---|---|
-| `gc.interval` | int | `50` | _≥1._ Run garbage collection every N training steps.
Disables Python's automatic GC so every rank collects together and one slow rank can't stall the others. |
+| `gc.interval` | `int` | `50` | _≥1._ Run garbage collection every N training steps. Disables Python's automatic GC so every rank collects together and one slow rank can't stall the others. |
 ### `heartbeat`
@@ -1842,7 +1933,7 @@ BetterStack heartbeat configuration for monitoring training progress.
 | Field | Type | Default | Description |
 |---|---|---|---|
-| `heartbeat.url` | str | *required* | URL to send the heartbeat to. |
+| `heartbeat.url` | `str` | *required* | URL to send the heartbeat to. |
 ### `metrics_server`
@@ -1851,8 +1942,8 @@ Prometheus metrics server configuration. If set, exposes a ``/metrics`` endpoint
 | Field | Type | Default | Description |
 |---|---|---|---|
-| `metrics_server.port` | int | `8000` | _≥1, ≤65535._ Port to expose metrics and health endpoints on. |
-| `metrics_server.host` | str | `'0.0.0.0'` | Host to bind the server to. |
+| `metrics_server.port` | `int` | `8000` | _≥1, ≤65535._ Port to expose metrics and health endpoints on. |
+| `metrics_server.host` | `str` | `'0.0.0.0'` | Host to bind the server to. |
 ### `experimental`
@@ -1874,20 +1965,20 @@ Discriminated union — set `loss.type` to one of `default`, `custom` and provid
 | Field | Type | Default | Description |
 |---|---|---|---|
-| `loss.type` | 'default' | `'default'` | |
-| `loss.dppo_mask_low` | float | `0.2` | _≥0._ Lower DPPO masking threshold. |
-| `loss.dppo_mask_high` | float | `0.2` | _≥0._ Upper DPPO masking threshold. |
-| `loss.adv_tau` | float | `1.0` | _≥0._ Temperature for the advantage term. |
-| `loss.kl_tau` | float | `0.001` | _≥0._ Temperature for the KL term. |
+| `loss.type` | `'default'` | `'default'` | |
+| `loss.dppo_mask_low` | `float` | `0.2` | _≥0._ Lower DPPO masking threshold. |
+| `loss.dppo_mask_high` | `float` | `0.2` | _≥0._ Upper DPPO masking threshold. |
+| `loss.adv_tau` | `float` | `1.0` | _≥0._ Temperature for the advantage term.
|
+| `loss.kl_tau` | `float` | `0.001` | _≥0._ Temperature for the KL term. |
 #### `loss.type = "custom"` (CustomLossConfig)
 | Field | Type | Default | Description |
 |---|---|---|---|
-| `loss.type` | 'custom' | `'custom'` | |
-| `loss.import_path` | str | *required* | Import path to the loss function (e.g. ``my_module.my_loss``). |
-| `loss.kwargs` | dict[str, Any] | `{}` | Kwargs forwarded to the loss function. |
+| `loss.type` | `'custom'` | `'custom'` | |
+| `loss.import_path` | `str` | *required* | Import path to the loss function (e.g. ``my_module.my_loss``). |
+| `loss.kwargs` | `dict[str, Any]` | `{}` | Kwargs forwarded to the loss function. |
 ### `optim`
@@ -1899,47 +1990,47 @@ Discriminated union — set `optim.type` to one of `sgd`, `adamw`, `muon`, `sign
 | Field | Type | Default | Description |
 |---|---|---|---|
-| `optim.lr` | float | `1e-06` | _≥0._ Peak learning rate. |
-| `optim.weight_decay` | float | `0.01` | _≥0._ L2 weight-decay coefficient. |
-| `optim.max_norm` | float \| None | `1.0` | _≥0._ Maximum gradient norm to clip to. If None, gradient clipping is disabled. |
-| `optim.type` | 'sgd' | `'sgd'` | |
-| `optim.nesterov` | bool | `True` | Use Nesterov momentum. |
-| `optim.momentum` | float | `0.9` | SGD momentum factor. |
+| `optim.lr` | `float` | `1e-06` | _≥0._ Peak learning rate. |
+| `optim.weight_decay` | `float` | `0.01` | _≥0._ L2 weight-decay coefficient. |
+| `optim.max_norm` | `float | None` | `1.0` | _≥0._ Maximum gradient norm to clip to. If None, gradient clipping is disabled. |
+| `optim.type` | `'sgd'` | `'sgd'` | |
+| `optim.nesterov` | `bool` | `True` | Use Nesterov momentum. |
+| `optim.momentum` | `float` | `0.9` | SGD momentum factor. |
 #### `optim.type = "adamw"` (AdamWConfig)
 | Field | Type | Default | Description |
 |---|---|---|---|
-| `optim.lr` | float | `1e-06` | _≥0._ Peak learning rate. |
-| `optim.weight_decay` | float | `0.01` | _≥0._ L2 weight-decay coefficient.
|
-| `optim.max_norm` | float \| None | `1.0` | _≥0._ Maximum gradient norm to clip to. If None, gradient clipping is disabled. |
-| `optim.type` | 'adamw' | `'adamw'` | |
-| `optim.betas1` | float | `0.9` | _≥0._ Adam first-moment (β1) decay. |
-| `optim.betas2` | float | `0.999` | _≥0._ Adam second-moment (β2) decay. |
+| `optim.lr` | `float` | `1e-06` | _≥0._ Peak learning rate. |
+| `optim.weight_decay` | `float` | `0.01` | _≥0._ L2 weight-decay coefficient. |
+| `optim.max_norm` | `float | None` | `1.0` | _≥0._ Maximum gradient norm to clip to. If None, gradient clipping is disabled. |
+| `optim.type` | `'adamw'` | `'adamw'` | |
+| `optim.betas1` | `float` | `0.9` | _≥0._ Adam first-moment (β1) decay. |
+| `optim.betas2` | `float` | `0.999` | _≥0._ Adam second-moment (β2) decay. |
 #### `optim.type = "muon"` (MuonConfig)
 | Field | Type | Default | Description |
 |---|---|---|---|
-| `optim.lr` | float | `1e-06` | _≥0._ Peak learning rate. |
-| `optim.weight_decay` | float | `0.01` | _≥0._ L2 weight-decay coefficient. |
-| `optim.max_norm` | float \| None | `1.0` | _≥0._ Maximum gradient norm to clip to. If None, gradient clipping is disabled. |
-| `optim.type` | 'muon' | `'muon'` | |
-| `optim.mu` | float | `0.95` | _≥0._ Momentum factor for the Muon algorithm. |
-| `optim.betas1` | float | `0.9` | _≥0._ β1 for the AdamW/Lion sub-optimizer used on non-Muon params. |
-| `optim.betas2` | float | `0.95` | _≥0._ β2 for the AdamW/Lion sub-optimizer used on non-Muon params. |
+| `optim.lr` | `float` | `1e-06` | _≥0._ Peak learning rate. |
+| `optim.weight_decay` | `float` | `0.01` | _≥0._ L2 weight-decay coefficient. |
+| `optim.max_norm` | `float | None` | `1.0` | _≥0._ Maximum gradient norm to clip to. If None, gradient clipping is disabled. |
+| `optim.type` | `'muon'` | `'muon'` | |
+| `optim.mu` | `float` | `0.95` | _≥0._ Momentum factor for the Muon algorithm.
|
+| `optim.betas1` | `float` | `0.9` | _≥0._ β1 for the AdamW/Lion sub-optimizer used on non-Muon params. |
+| `optim.betas2` | `float` | `0.95` | _≥0._ β2 for the AdamW/Lion sub-optimizer used on non-Muon params. |
 #### `optim.type = "sign_sgd"` (SignSGDConfig)
 | Field | Type | Default | Description |
 |---|---|---|---|
-| `optim.lr` | float | `1e-06` | _≥0._ Peak learning rate. |
-| `optim.weight_decay` | float | `0.01` | _≥0._ L2 weight-decay coefficient. |
-| `optim.max_norm` | float \| None | `1.0` | _≥0._ Maximum gradient norm to clip to. If None, gradient clipping is disabled. |
-| `optim.type` | 'sign_sgd' | `'sign_sgd'` | |
+| `optim.lr` | `float` | `1e-06` | _≥0._ Peak learning rate. |
+| `optim.weight_decay` | `float` | `0.01` | _≥0._ L2 weight-decay coefficient. |
+| `optim.max_norm` | `float | None` | `1.0` | _≥0._ Maximum gradient norm to clip to. If None, gradient clipping is disabled. |
+| `optim.type` | `'sign_sgd'` | `'sign_sgd'` | |
 ### `scheduler`
@@ -1951,26 +2042,26 @@ Discriminated union — set `scheduler.type` to one of `constant`, `linear`, `co
 | Field | Type | Default | Description |
 |---|---|---|---|
-| `scheduler.type` | 'constant' | `'constant'` | |
+| `scheduler.type` | `'constant'` | `'constant'` | |
 #### `scheduler.type = "linear"` (LinearSchedulerConfig)
 | Field | Type | Default | Description |
 |---|---|---|---|
-| `scheduler.type` | 'linear' | `'linear'` | |
-| `scheduler.warmup_steps` | int | `10` | _≥0._ Warmup steps for the learning rate scheduler. |
-| `scheduler.decay_steps` | int | `10` | _≥0._ Steps to decay the learning rate during the final portion of training. |
-| `scheduler.min_lr` | float | `0.0` | _≥0._ Minimum learning rate to converge to. |
+| `scheduler.type` | `'linear'` | `'linear'` | |
+| `scheduler.warmup_steps` | `int` | `10` | _≥0._ Warmup steps for the learning rate scheduler. |
+| `scheduler.decay_steps` | `int` | `10` | _≥0._ Steps to decay the learning rate during the final portion of training.
|
+| `scheduler.min_lr` | `float` | `0.0` | _≥0._ Minimum learning rate to converge to. |
 #### `scheduler.type = "cosine"` (CosineSchedulerConfig)
 | Field | Type | Default | Description |
 |---|---|---|---|
-| `scheduler.type` | 'cosine' | `'cosine'` | |
-| `scheduler.warmup_steps` | int | `10` | _≥0._ Warmup steps for the learning rate scheduler. |
-| `scheduler.min_lr` | float | `0.0` | _≥0._ Minimum learning rate to converge to. |
+| `scheduler.type` | `'cosine'` | `'cosine'` | |
+| `scheduler.warmup_steps` | `int` | `10` | _≥0._ Warmup steps for the learning rate scheduler. |
+| `scheduler.min_lr` | `float` | `0.0` | _≥0._ Minimum learning rate to converge to. |
 ### `weight_broadcast`
@@ -1984,21 +2075,21 @@ Discriminated union — set `weight_broadcast.type` to one of `filesystem`, `ncc
 | Field | Type | Default | Description |
 |---|---|---|---|
-| `weight_broadcast.type` | 'filesystem' | `'filesystem'` | |
-| `weight_broadcast.save_sharded` | bool | `True` | Save the weight checkpoint in sharded format. |
-| `weight_broadcast.save_format` | 'safetensors' \| 'torch' | `'safetensors'` | Weight checkpoint serialization format. |
+| `weight_broadcast.type` | `'filesystem'` | `'filesystem'` | |
+| `weight_broadcast.save_sharded` | `bool` | `True` | Save the weight checkpoint in sharded format. |
+| `weight_broadcast.save_format` | `'safetensors' | 'torch'` | `'safetensors'` | Weight checkpoint serialization format. |
 #### `weight_broadcast.type = "nccl"` (NCCLWeightBroadcastConfig)
 | Field | Type | Default | Description |
 |---|---|---|---|
-| `weight_broadcast.type` | 'nccl' | `'nccl'` | |
-| `weight_broadcast.host` | str | `'localhost'` | Host for the NCCL broadcast rendezvous. |
-| `weight_broadcast.port` | int | `29501` | Port for the NCCL broadcast rendezvous. |
-| `weight_broadcast.timeout` | int | `1200` | Timeout in seconds for the NCCL broadcast. |
-| `weight_broadcast.inference_world_size` | int | `1` | Number of GPUs used for inference.
|
-| `weight_broadcast.quantize_in_weight_transfer` | bool | `False` | Use kernel-format FP8 quantized NCCL transfer for weight updates. When disabled, uses default HF checkpoint-format transfer. |
+| `weight_broadcast.type` | `'nccl'` | `'nccl'` | |
+| `weight_broadcast.host` | `str` | `'localhost'` | Host for the NCCL broadcast rendezvous. |
+| `weight_broadcast.port` | `int` | `29501` | Port for the NCCL broadcast rendezvous. |
+| `weight_broadcast.timeout` | `int` | `1200` | Timeout in seconds for the NCCL broadcast. |
+| `weight_broadcast.inference_world_size` | `int` | `1` | Number of GPUs used for inference. |
+| `weight_broadcast.quantize_in_weight_transfer` | `bool` | `False` | Use kernel-format FP8 quantized NCCL transfer for weight updates. When disabled, uses default HF checkpoint-format transfer. |
 ### `rollout_transport`
@@ -2012,17 +2103,17 @@ Discriminated union — set `rollout_transport.type` to one of `filesystem`, `zm
 | Field | Type | Default | Description |
 |---|---|---|---|
-| `rollout_transport.type` | 'filesystem' | `'filesystem'` | |
+| `rollout_transport.type` | `'filesystem'` | `'filesystem'` | |
 #### `rollout_transport.type = "zmq"` (ZMQTransportConfig)
 | Field | Type | Default | Description |
 |---|---|---|---|
-| `rollout_transport.type` | 'zmq' | `'zmq'` | |
-| `rollout_transport.host` | str | `'localhost'` | Host address for ZMQ transport. |
-| `rollout_transport.port` | int | `5555` | Base port for ZMQ transport. |
-| `rollout_transport.hwm` | int | `10` | High-water mark (max in-flight messages per ZMQ socket). |
+| `rollout_transport.type` | `'zmq'` | `'zmq'` | |
+| `rollout_transport.host` | `str` | `'localhost'` | Host address for ZMQ transport. |
+| `rollout_transport.port` | `int` | `5555` | Base port for ZMQ transport. |
+| `rollout_transport.hwm` | `int` | `10` | High-water mark (max in-flight messages per ZMQ socket).
|
 ## `orchestrator` — Standalone orchestrator
@@ -2033,27 +2124,26 @@ _Defined in_ `prime_rl.configs.orchestrator.OrchestratorConfig`.
 | Field | Type | Default | Description |
 |---|---|---|---|
-| `training_mode` | 'rl' \| 'opd' \| 'sft' | `'rl'` | Training mode. ``rl``: student generates rollouts, no teacher. ``opd``: student generates rollouts, teacher computes logprobs (teacher_tau > 0). ``sft``: teacher generates rollouts, student inference pool used for evals and weight sync. |
-| `advantage` | DefaultAdvantageConfig \| CustomAdvantageConfig \| None | `DefaultAdvantageConfig()` | |
-| `filters` | list[GibberishFilterConfig \| RepetitionFilterConfig \| ZeroAdvantageFilterConfig] | `[GibberishFilterConfig(type='gibberish', enforce=False, token_id_threshold=100000, logprob_offset=2.0), RepetitionFilterConfig(type='repetition', enforce=False, window=3000, prob_threshold=0.99), ZeroAdvantageFilterConfig(type='zero_advantage', enforce=True)]` | Rollout filters. Each filter can ``monitor`` (default) or ``enforce`` (skip rollouts). |
-| `collect_inference_metrics` | bool | `True` | Collect inference-server metrics (requires wandb). |
-| `output_dir` | Path | `'outputs/run_default'` | Directory to write outputs to — checkpoints, weights, rollouts, and logs are written as subdirectories. Should be a persistent directory with enough disk space and unique per experiment running on a single node. |
-| `tasks_per_minute` | int \| None | `None` | _≥1._ Rate limit per environment worker, in tasks per minute. Recommended for sandbox-backed environments to prevent sandbox-not-ready errors during autoscaling. With multiple workers, the effective total rate is ``workers × this value``. None disables rate limiting. |
-| `batch_size` | int \| None | `None` | _≥1._ Samples to train on per step (rollout-based batching). Set this OR ``token_batch_size``. |
-| `token_batch_size` | int \| None | `None` | _≥1._ Tokens to train on per step (token-based batching).
Set this OR ``batch_size``. |
-| `oversampling_factor` | float \| None | `None` | _>0._ Rollout-mode batching only. Multiplier used to derive ``max_inflight_rollouts`` from ``batch_size`` when ``max_inflight_rollouts`` is unset. Values below 1.0 intentionally cap in-flight rollout capacity below ``batch_size``. |
-| `max_inflight_rollouts` | int \| None | `None` | _≥1._ Maximum number of rollouts kept in-flight. Required for token-based batching. With ``batch_size`` set, defaults to ``batch_size * oversampling_factor`` (or ``batch_size`` when ``oversampling_factor`` is unset). |
-| `group_size` | int | `1` | _≥1._ Output sequences returned per example during training. |
-| `seq_len` | int | `2048` | Training sequence length. Shorter samples are padded; longer samples are truncated. |
-| `num_train_workers` | int | `1` | _≥1._ Training workers to use. |
-| `max_steps` | int \| None | `None` | Maximum training steps. If None, runs indefinitely. |
-| `max_off_policy_steps` | int | `8` | _≥0._ Maximum policies allowed to generate a single rollout. Rollouts generated more than ``max_off_policy_steps`` ahead of training are discarded. Higher values yield better throughput at the cost of off-policy noise. |
-| `max_async_level` | int | `1` | _≥0._ Maximum steps inference can be ahead of training. ``0`` degenerates to synchronous on-policy RL; ``≥1`` overlaps training and inference. |
-| `strict_async_level` | bool | `False` | Strictly enforce ``max_async_level``. When True, the rollout policy is always exactly ``max_async_level`` steps ahead of training. When False, any policy within ``max_async_level`` steps is allowed (always uses the latest available policy). |
-| `bench` | bool | `False` | Benchmark mode. Sets ``max_steps`` to 5, ``max_async_level`` to ~∞, and disables W&B. |
-| `seed` | int \| None | `42` | Random seed for the orchestrator.
|
-| `use_renderer` | bool | `True` | Use the renderer-backed TITO client (client-side tokenization via the ``renderers`` package, served by ``/v1/generate``). When True, the ``[orchestrator.renderer]`` block (name / tool_parser / reasoning_parser / pool_size) applies. Default for both text-only and VLM rollouts; VLMs require it. False falls back to MITO (``openai_chat_completions``). |
-| `env_install_prerelease` | bool | `False` | Allow pre-release versions when installing environments (e.g. ``verifiers>=0.1.12.dev5``). Passes ``--prerelease`` to ``prime env install``. |
+| `training_mode` | `'rl' | 'opd' | 'sft'` | `'rl'` | Training mode. ``rl``: student generates rollouts, no teacher. ``opd``: student generates rollouts, teacher computes logprobs (teacher_tau > 0). ``sft``: teacher generates rollouts, student inference pool used for evals and weight sync. |
+| `advantage` | `DefaultAdvantageConfig | CustomAdvantageConfig | None` | `DefaultAdvantageConfig()` | |
+| `collect_inference_metrics` | `bool` | `True` | Collect inference-server metrics (requires wandb). |
+| `output_dir` | `Path` | `'outputs/run_default'` | Directory to write outputs to — checkpoints, weights, rollouts, and logs are written as subdirectories. Should be a persistent directory with enough disk space and unique per experiment running on a single node. |
+| `tasks_per_minute` | `int | None` | `None` | _≥1._ Rate limit per environment worker, in tasks per minute. Recommended for sandbox-backed environments to prevent sandbox-not-ready errors during autoscaling. With multiple workers, the effective total rate is ``workers × this value``. None disables rate limiting. |
+| `batch_size` | `int | None` | `None` | _≥1._ Samples to train on per step (rollout-based batching). Set this OR ``token_batch_size``. |
+| `token_batch_size` | `int | None` | `None` | _≥1._ Tokens to train on per step (token-based batching). Set this OR ``batch_size``.
|
+| `oversampling_factor` | `float | None` | `None` | _>0._ Rollout-mode batching only. Multiplier used to derive ``max_inflight_rollouts`` from ``batch_size`` when ``max_inflight_rollouts`` is unset. Values below 1.0 intentionally cap in-flight rollout capacity below ``batch_size``. |
+| `max_inflight_rollouts` | `int | None` | `None` | _≥1._ Maximum number of rollouts kept in-flight. Required for token-based batching. With ``batch_size`` set, defaults to ``batch_size * oversampling_factor`` (or ``batch_size`` when ``oversampling_factor`` is unset). |
+| `group_size` | `int` | `1` | _≥1._ Output sequences returned per example during training. |
+| `seq_len` | `int` | `2048` | Training sequence length. Shorter samples are padded; longer samples are truncated. |
+| `num_train_workers` | `int` | `1` | _≥1._ Training workers to use. |
+| `max_steps` | `int | None` | `None` | Maximum training steps. If None, runs indefinitely. |
+| `max_off_policy_steps` | `int` | `8` | _≥0._ Maximum policies allowed to generate a single rollout. Rollouts generated more than ``max_off_policy_steps`` ahead of training are discarded. Higher values yield better throughput at the cost of off-policy noise. |
+| `max_async_level` | `int` | `1` | _≥0._ Maximum steps inference can be ahead of training. ``0`` degenerates to synchronous on-policy RL; ``≥1`` overlaps training and inference. |
+| `strict_async_level` | `bool` | `False` | Strictly enforce ``max_async_level``. When True, the rollout policy is always exactly ``max_async_level`` steps ahead of training. When False, any policy within ``max_async_level`` steps is allowed (always uses the latest available policy). |
+| `bench` | `bool` | `False` | Benchmark mode. Sets ``max_steps`` to 5, ``max_async_level`` to ~∞, and disables W&B. |
+| `seed` | `int | None` | `42` | Random seed for the orchestrator.
| +| `use_renderer` | `bool` | `True` | Use the renderer-backed TITO client (client-side tokenization via the ``renderers`` package, served by ``/v1/generate``). When True, the ``[orchestrator.renderer]`` block (name / tool_parser / reasoning_parser / pool_size) applies. Default for both text-only and VLM rollouts; VLMs require it. False falls back to MITO (``openai_chat_completions``). | +| `env_install_prerelease` | `bool` | `False` | Allow pre-release versions when installing environments (e.g. ``verifiers>=0.1.12.dev5``). Passes ``--prerelease`` to ``prime env install``. | ### `student` @@ -2065,8 +2155,8 @@ Student rollout participant (model + client) — the model being trained. | Field | Type | Default | Description | |---|---|---|---| -| `student.model.name` | str | `'Qwen/Qwen3-0.6B'` | HF model name or local path. | -| `student.model.trust_remote_code` | bool | `False` | Trust remote code when initializing the tokenizer. | +| `student.model.name` | `str` | `'Qwen/Qwen3-0.6B'` | HF model name or local path. | +| `student.model.trust_remote_code` | `bool` | `False` | Trust remote code when initializing the tokenizer. | ##### `student.model.vlm` @@ -2075,9 +2165,9 @@ VLM configuration. Setting this enables vision-language model support. | Field | Type | Default | Description | |---|---|---|---| -| `student.model.vlm.vision_encoder_attr` | str | *required* | Dotted attribute path to the vision encoder module (e.g. ``model.visual``). | -| `student.model.vlm.language_model_attr` | str | *required* | Dotted attribute path to the language model module (e.g. ``model.language_model``). | -| `student.model.vlm.freeze_vision_encoder` | bool | `True` | Freeze the vision encoder. When False, it is trainable and FSDP-sharded per-block. No effect with LoRA (LoRA freezes all non-adapter parameters). | +| `student.model.vlm.vision_encoder_attr` | `str` | *required* | Dotted attribute path to the vision encoder module (e.g. ``model.visual``). 
| +| `student.model.vlm.language_model_attr` | `str` | *required* | Dotted attribute path to the language model module (e.g. ``model.language_model``). | +| `student.model.vlm.freeze_vision_encoder` | `bool` | `True` | Freeze the vision encoder. When False, it is trainable and FSDP-sharded per-block. No effect with LoRA (LoRA freezes all non-adapter parameters). | ##### `student.model.lora` @@ -2086,27 +2176,27 @@ Per-run LoRA configuration. If None, LoRA is disabled. | Field | Type | Default | Description | |---|---|---|---| -| `student.model.lora.name` | str \| None | `None` | LoRA adapter name. If None, auto-generated from rank and alpha. | -| `student.model.lora.rank` | int \| None | `None` | _≥1._ LoRA rank for this run. Must be ≤ trainer's max rank. If None, uses the trainer's rank. | -| `student.model.lora.alpha` | float \| None | `None` | _≥0._ LoRA alpha for this run. If None, uses the trainer's alpha. | +| `student.model.lora.name` | `str | None` | `None` | LoRA adapter name. If None, auto-generated from rank and alpha. | +| `student.model.lora.rank` | `int | None` | `None` | _≥1._ LoRA rank for this run. Must be ≤ trainer's max rank. If None, uses the trainer's rank. | +| `student.model.lora.alpha` | `float | None` | `None` | _≥0._ LoRA alpha for this run. If None, uses the trainer's alpha. | #### `student.client` | Field | Type | Default | Description | |---|---|---|---| -| `student.client.timeout` | int | `1200` | Request timeout in seconds. | -| `student.client.connect_timeout` | float | `30.0` | TCP connect timeout in seconds for inference API requests. | -| `student.client.wait_for_ready_timeout` | int | `1800` | Seconds to wait at startup for the inference pool to become ready. Applies to both the static health check and elastic DNS-based discovery. | -| `student.client.base_url` | list[str] | `['http://localhost:8000/v1']` | Base URLs for the OpenAI API. With more than one URL, the client round-robins (chat) completion requests across all servers. 
Ignored when ``elastic`` is set. | -| `student.client.api_key_var` | str | `'VLLM_API_KEY'` | Environment variable name containing the API key, resolved via ``os.getenv``. Can be any string when the server is not protected by an API key; the same key is used for every URL. | -| `student.client.headers` | dict[str, str] | `{}` | Static headers sent with every request. | -| `student.client.headers_from_env` | dict[str, str] | `{}` | Maps HTTP header names to environment variable names; each entry is resolved via ``os.getenv`` and merged into request headers. e.g. ``{"X-Prime-Team-ID": "PRIME_TEAM_ID"}``. | -| `student.client.extra_headers_from_state` | dict[str, str] | `{}` | Maps HTTP header names to rollout-state field names. The header value is read from the rollout state dict on every request. e.g. ``{"X-Session-ID": "trajectory_id"}`` enables sticky routing at the inference router. | -| `student.client.skip_model_check` | bool | `False` | Skip checking that the model is available in the inference pool. Useful for external APIs or keys that do not expose ``/models``. | -| `student.client.dp_rank_count` | int | `1` | _≥1._ Number of data-parallel ranks behind each base URL. When > 1, each URL is expanded into ``dp_rank_count`` logical clients pinned via the ``X-data-parallel-rank`` header, so every request within a rollout hits the same DP engine and reuses KV cache. Auto-set from the inference config when using the RL entrypoint. | -| `student.client.admin_base_url` | list[str] \| None | `None` | Separate base URLs for admin operations (weight updates, health checks). When set, admin clients bypass routers and hit each server directly — used in disaggregated P/D deployments where the router must not handle admin traffic. | -| `student.client.router_url` | str \| None | `None` | vllm-router URL for load-aware inference routing. With elastic mode, inference requests go through the router while admin ops still hit discovered pods directly. 
| +| `student.client.timeout` | `int` | `1200` | Request timeout in seconds. | +| `student.client.connect_timeout` | `float` | `30.0` | TCP connect timeout in seconds for inference API requests. | +| `student.client.wait_for_ready_timeout` | `int` | `1800` | Seconds to wait at startup for the inference pool to become ready. Applies to both the static health check and elastic DNS-based discovery. | +| `student.client.base_url` | `list[str]` | `['http://localhost:8000/v1']` | Base URLs for the OpenAI API. With more than one URL, the client round-robins (chat) completion requests across all servers. Ignored when ``elastic`` is set. | +| `student.client.api_key_var` | `str` | `'VLLM_API_KEY'` | Environment variable name containing the API key, resolved via ``os.getenv``. Can be any string when the server is not protected by an API key; the same key is used for every URL. | +| `student.client.headers` | `dict[str, str]` | `{}` | Static headers sent with every request. | +| `student.client.headers_from_env` | `dict[str, str]` | `{}` | Maps HTTP header names to environment variable names; each entry is resolved via ``os.getenv`` and merged into request headers. e.g. ``{"X-Prime-Team-ID": "PRIME_TEAM_ID"}``. | +| `student.client.extra_headers_from_state` | `dict[str, str]` | `{}` | Maps HTTP header names to rollout-state field names. The header value is read from the rollout state dict on every request. e.g. ``{"X-Session-ID": "trajectory_id"}`` enables sticky routing at the inference router. | +| `student.client.skip_model_check` | `bool` | `False` | Skip checking that the model is available in the inference pool. Useful for external APIs or keys that do not expose ``/models``. | +| `student.client.dp_rank_count` | `int` | `1` | _≥1._ Number of data-parallel ranks behind each base URL. 
When > 1, each URL is expanded into ``dp_rank_count`` logical clients pinned via the ``X-data-parallel-rank`` header, so every request within a rollout hits the same DP engine and reuses KV cache. Auto-set from the inference config when using the RL entrypoint. | +| `student.client.admin_base_url` | `list[str] | None` | `None` | Separate base URLs for admin operations (weight updates, health checks). When set, admin clients bypass routers and hit each server directly — used in disaggregated P/D deployments where the router must not handle admin traffic. | +| `student.client.router_url` | `str | None` | `None` | vllm-router URL for load-aware inference routing. With elastic mode, inference requests go through the router while admin ops still hit discovered pods directly. | ##### `student.client.elastic` @@ -2115,9 +2205,9 @@ Elastic inference pool config for DNS-based service discovery. When set, ``base_ | Field | Type | Default | Description | |---|---|---|---| -| `student.client.elastic.hostname` | str | *required* | DNS hostname that resolves to inference server IPs. | -| `student.client.elastic.port` | int | `8000` | Port that inference servers listen on. | -| `student.client.elastic.sync_interval` | float | `5.0` | Seconds between server discovery checks. | +| `student.client.elastic.hostname` | `str` | *required* | DNS hostname that resolves to inference server IPs. | +| `student.client.elastic.port` | `int` | `8000` | Port that inference servers listen on. | +| `student.client.elastic.sync_interval` | `float` | `5.0` | Seconds between server discovery checks. | ### `teacher` @@ -2129,8 +2219,8 @@ Teacher rollout participant (model + client). Role depends on ``training_mode``: | Field | Type | Default | Description | |---|---|---|---| -| `teacher.model.name` | str | `'Qwen/Qwen3-0.6B'` | HF model name or local path. | -| `teacher.model.trust_remote_code` | bool | `False` | Trust remote code when initializing the tokenizer. 
| +| `teacher.model.name` | `str` | `'Qwen/Qwen3-0.6B'` | HF model name or local path. | +| `teacher.model.trust_remote_code` | `bool` | `False` | Trust remote code when initializing the tokenizer. | ##### `teacher.model.vlm` @@ -2139,9 +2229,9 @@ VLM configuration. Setting this enables vision-language model support. | Field | Type | Default | Description | |---|---|---|---| -| `teacher.model.vlm.vision_encoder_attr` | str | *required* | Dotted attribute path to the vision encoder module (e.g. ``model.visual``). | -| `teacher.model.vlm.language_model_attr` | str | *required* | Dotted attribute path to the language model module (e.g. ``model.language_model``). | -| `teacher.model.vlm.freeze_vision_encoder` | bool | `True` | Freeze the vision encoder. When False, it is trainable and FSDP-sharded per-block. No effect with LoRA (LoRA freezes all non-adapter parameters). | +| `teacher.model.vlm.vision_encoder_attr` | `str` | *required* | Dotted attribute path to the vision encoder module (e.g. ``model.visual``). | +| `teacher.model.vlm.language_model_attr` | `str` | *required* | Dotted attribute path to the language model module (e.g. ``model.language_model``). | +| `teacher.model.vlm.freeze_vision_encoder` | `bool` | `True` | Freeze the vision encoder. When False, it is trainable and FSDP-sharded per-block. No effect with LoRA (LoRA freezes all non-adapter parameters). | ##### `teacher.model.lora` @@ -2150,27 +2240,27 @@ Per-run LoRA configuration. If None, LoRA is disabled. | Field | Type | Default | Description | |---|---|---|---| -| `teacher.model.lora.name` | str \| None | `None` | LoRA adapter name. If None, auto-generated from rank and alpha. | -| `teacher.model.lora.rank` | int \| None | `None` | _≥1._ LoRA rank for this run. Must be ≤ trainer's max rank. If None, uses the trainer's rank. | -| `teacher.model.lora.alpha` | float \| None | `None` | _≥0._ LoRA alpha for this run. If None, uses the trainer's alpha. 
| +| `teacher.model.lora.name` | `str | None` | `None` | LoRA adapter name. If None, auto-generated from rank and alpha. | +| `teacher.model.lora.rank` | `int | None` | `None` | _≥1._ LoRA rank for this run. Must be ≤ trainer's max rank. If None, uses the trainer's rank. | +| `teacher.model.lora.alpha` | `float | None` | `None` | _≥0._ LoRA alpha for this run. If None, uses the trainer's alpha. | #### `teacher.client` | Field | Type | Default | Description | |---|---|---|---| -| `teacher.client.timeout` | int | `1200` | Request timeout in seconds. | -| `teacher.client.connect_timeout` | float | `30.0` | TCP connect timeout in seconds for inference API requests. | -| `teacher.client.wait_for_ready_timeout` | int | `1800` | Seconds to wait at startup for the inference pool to become ready. Applies to both the static health check and elastic DNS-based discovery. | -| `teacher.client.base_url` | list[str] | `['http://localhost:8000/v1']` | Base URLs for the OpenAI API. With more than one URL, the client round-robins (chat) completion requests across all servers. Ignored when ``elastic`` is set. | -| `teacher.client.api_key_var` | str | `'VLLM_API_KEY'` | Environment variable name containing the API key, resolved via ``os.getenv``. Can be any string when the server is not protected by an API key; the same key is used for every URL. | -| `teacher.client.headers` | dict[str, str] | `{}` | Static headers sent with every request. | -| `teacher.client.headers_from_env` | dict[str, str] | `{}` | Maps HTTP header names to environment variable names; each entry is resolved via ``os.getenv`` and merged into request headers. e.g. ``{"X-Prime-Team-ID": "PRIME_TEAM_ID"}``. | -| `teacher.client.extra_headers_from_state` | dict[str, str] | `{}` | Maps HTTP header names to rollout-state field names. The header value is read from the rollout state dict on every request. e.g. ``{"X-Session-ID": "trajectory_id"}`` enables sticky routing at the inference router. 
| -| `teacher.client.skip_model_check` | bool | `False` | Skip checking that the model is available in the inference pool. Useful for external APIs or keys that do not expose ``/models``. | -| `teacher.client.dp_rank_count` | int | `1` | _≥1._ Number of data-parallel ranks behind each base URL. When > 1, each URL is expanded into ``dp_rank_count`` logical clients pinned via the ``X-data-parallel-rank`` header, so every request within a rollout hits the same DP engine and reuses KV cache. Auto-set from the inference config when using the RL entrypoint. | -| `teacher.client.admin_base_url` | list[str] \| None | `None` | Separate base URLs for admin operations (weight updates, health checks). When set, admin clients bypass routers and hit each server directly — used in disaggregated P/D deployments where the router must not handle admin traffic. | -| `teacher.client.router_url` | str \| None | `None` | vllm-router URL for load-aware inference routing. With elastic mode, inference requests go through the router while admin ops still hit discovered pods directly. | +| `teacher.client.timeout` | `int` | `1200` | Request timeout in seconds. | +| `teacher.client.connect_timeout` | `float` | `30.0` | TCP connect timeout in seconds for inference API requests. | +| `teacher.client.wait_for_ready_timeout` | `int` | `1800` | Seconds to wait at startup for the inference pool to become ready. Applies to both the static health check and elastic DNS-based discovery. | +| `teacher.client.base_url` | `list[str]` | `['http://localhost:8000/v1']` | Base URLs for the OpenAI API. With more than one URL, the client round-robins (chat) completion requests across all servers. Ignored when ``elastic`` is set. | +| `teacher.client.api_key_var` | `str` | `'VLLM_API_KEY'` | Environment variable name containing the API key, resolved via ``os.getenv``. Can be any string when the server is not protected by an API key; the same key is used for every URL. 
| +| `teacher.client.headers` | `dict[str, str]` | `{}` | Static headers sent with every request. | +| `teacher.client.headers_from_env` | `dict[str, str]` | `{}` | Maps HTTP header names to environment variable names; each entry is resolved via ``os.getenv`` and merged into request headers. e.g. ``{"X-Prime-Team-ID": "PRIME_TEAM_ID"}``. | +| `teacher.client.extra_headers_from_state` | `dict[str, str]` | `{}` | Maps HTTP header names to rollout-state field names. The header value is read from the rollout state dict on every request. e.g. ``{"X-Session-ID": "trajectory_id"}`` enables sticky routing at the inference router. | +| `teacher.client.skip_model_check` | `bool` | `False` | Skip checking that the model is available in the inference pool. Useful for external APIs or keys that do not expose ``/models``. | +| `teacher.client.dp_rank_count` | `int` | `1` | _≥1._ Number of data-parallel ranks behind each base URL. When > 1, each URL is expanded into ``dp_rank_count`` logical clients pinned via the ``X-data-parallel-rank`` header, so every request within a rollout hits the same DP engine and reuses KV cache. Auto-set from the inference config when using the RL entrypoint. | +| `teacher.client.admin_base_url` | `list[str] | None` | `None` | Separate base URLs for admin operations (weight updates, health checks). When set, admin clients bypass routers and hit each server directly — used in disaggregated P/D deployments where the router must not handle admin traffic. | +| `teacher.client.router_url` | `str | None` | `None` | vllm-router URL for load-aware inference routing. With elastic mode, inference requests go through the router while admin ops still hit discovered pods directly. | ##### `teacher.client.elastic` @@ -2179,18 +2269,17 @@ Elastic inference pool config for DNS-based service discovery. 
When set, ``base_ | Field | Type | Default | Description | |---|---|---|---| -| `teacher.client.elastic.hostname` | str | *required* | DNS hostname that resolves to inference server IPs. | -| `teacher.client.elastic.port` | int | `8000` | Port that inference servers listen on. | -| `teacher.client.elastic.sync_interval` | float | `5.0` | Seconds between server discovery checks. | +| `teacher.client.elastic.hostname` | `str` | *required* | DNS hostname that resolves to inference server IPs. | +| `teacher.client.elastic.port` | `int` | `8000` | Port that inference servers listen on. | +| `teacher.client.elastic.sync_interval` | `float` | `5.0` | Seconds between server discovery checks. | ### `train` | Field | Type | Default | Description | |---|---|---|---| -| `train.env` | list[TrainEnvConfig] | `[TrainEnvConfig(id='reverse-text', name=None, args={}, extra_env_kwargs={'max_total_completion_tokens': -1}, address=None, num_workers='auto', ratio=None, max_retries=3, max_total_completion_tokens=-1, timeout=None, state_columns=[], sampling=TrainSamplingConfig(temperature=1.0, repetition_penalty=1.0, max_completion_tokens=None, min_tokens=0, seed=None, extra_body={}))]` | Training environments. | -| `train.num_workers` | int \| 'auto' | `'auto'` | Default worker processes for env servers. Can be overridden per env. | -| `train.max_retries` | int | `3` | _≥0._ Default retries for failed rollouts. Can be overridden per env. | +| `train.num_workers` | `int | 'auto'` | `'auto'` | Default worker processes for env servers. Can be overridden per env. | +| `train.max_retries` | `int` | `3` | _≥0._ Default retries for failed rollouts. Can be overridden per env. | #### `train.sampling` @@ -2199,21 +2288,54 @@ Shared training sampling configuration. | Field | Type | Default | Description | |---|---|---|---| -| `train.sampling.temperature` | float | `1.0` | _≥0._ Sampling temperature. | -| `train.sampling.repetition_penalty` | float | `1.0` | _≥0._ Repetition penalty. 
Values > 1.0 discourage repetition, < 1.0 encourage it, 1.0 disables. | -| `train.sampling.max_completion_tokens` | int \| None | `None` | Maximum output tokens per turn. If None, generates until max context length or EOS. | -| `train.sampling.min_tokens` | int | `0` | _≥0._ Minimum output tokens per sequence. | -| `train.sampling.seed` | int \| None | `None` | Random seed for sampling. If None, no seeding is used. | -| `train.sampling.extra_body` | dict[str, Any] | `{}` | Extra body forwarded with each request to the inference server. | +| `train.sampling.temperature` | `float` | `1.0` | _≥0._ Sampling temperature. | +| `train.sampling.repetition_penalty` | `float` | `1.0` | _≥0._ Repetition penalty. Values > 1.0 discourage repetition, < 1.0 encourage it, 1.0 disables. | +| `train.sampling.max_completion_tokens` | `int | None` | `None` | Maximum output tokens per turn. If None, generates until max context length or EOS. | +| `train.sampling.min_tokens` | `int` | `0` | _≥0._ Minimum output tokens per sequence. | +| `train.sampling.seed` | `int | None` | `None` | Random seed for sampling. If None, no seeding is used. | +| `train.sampling.extra_body` | `dict[str, Any]` | `{}` | Extra body forwarded with each request to the inference server. | + + +#### `train.env.` (list item) + +Training environments. + +| Field | Type | Default | Description | +|---|---|---|---| +| `train.env..id` | `str` | `'reverse-text'` | Registered verifiers environment ID (e.g. ``math-env``, ``primeintellect/math-env``). May include an ``@version`` suffix for installation. | +| `train.env..name` | `str | None` | `None` | Display name for this environment in logs, metrics, and buffer keys. Defaults to the ``id`` without ``@version``. Must be unique across all envs in the same group. | +| `train.env..args` | `dict` | `{}` | Keyword arguments forwarded to ``vf.load_environment``. See the environment's docstring for accepted args. 
| +| `train.env..extra_env_kwargs` | `dict[str, Any]` | `{}` | Extra kwargs passed to the env (e.g. ``seq_len``, ``max_total_completion_tokens``). Auto-populated by the orchestrator; user overrides are generally discouraged. The main use case is matching ``extra_env_kwargs`` when running an env in an isolated environment server. | +| `train.env..address` | `str | None` | `None` | ZMQ address of an external env server (e.g. ``tcp://host:5000``). When set, the orchestrator connects to this server instead of spawning one; when None, a subprocess env server is spawned automatically. | +| `train.env..num_workers` | `int | 'auto'` | `'auto'` | Worker processes for the spawned env server. ``auto`` scales to 1 worker per 256 concurrent rollouts. Ignored when ``address`` is set. | +| `train.env..ratio` | `float | None` | `None` | _>0._ Sampling weight for this environment in the buffer. When None for all envs, samples uniformly across all available problems. When set, must be set on all envs — values are relative weights normalized to probabilities (e.g. [1, 1] and [0.5, 0.5] are equivalent). | +| `train.env..max_retries` | `int` | `3` | _≥0._ Times the env server retries a failed rollout before returning an error. | +| `train.env..max_total_completion_tokens` | `int` | `-1` | Maximum total completion tokens across all turns in a multi-turn rollout. ``-1`` disables. Auto-populated into ``extra_env_kwargs``. | +| `train.env..timeout` | `float | None` | `None` | Per-rollout wall-clock timeout in seconds. None disables. | +| `train.env..state_columns` | `list[str]` | `[]` | Extra ``State`` fields to persist into the saved rollout records (in addition to the always-saved ``trajectory`` and ``sampling_args``). Values must be JSON-serializable. | + + +##### `train.env..sampling` + +Per-env sampling overrides. Unset fields inherit from the group-level train sampling config. 
+ +| Field | Type | Default | Description | +|---|---|---|---| +| `train.env..sampling.temperature` | `float` | `1.0` | _≥0._ Sampling temperature. | +| `train.env..sampling.repetition_penalty` | `float` | `1.0` | _≥0._ Repetition penalty. Values > 1.0 discourage repetition, < 1.0 encourage it, 1.0 disables. | +| `train.env..sampling.max_completion_tokens` | `int | None` | `None` | Maximum output tokens per turn. If None, generates until max context length or EOS. | +| `train.env..sampling.min_tokens` | `int` | `0` | _≥0._ Minimum output tokens per sequence. | +| `train.env..sampling.seed` | `int | None` | `None` | Random seed for sampling. If None, no seeding is used. | +| `train.env..sampling.extra_body` | `dict[str, Any]` | `{}` | Extra body forwarded with each request to the inference server. | ### `tokenizer` | Field | Type | Default | Description | |---|---|---|---| -| `tokenizer.name` | str \| None | `None` | Tokenizer name or path. If None, the model's default tokenizer is used. | -| `tokenizer.trust_remote_code` | bool \| None | `None` | Trust remote code when initializing the tokenizer. If None, inherits the model's ``trust_remote_code`` setting. | -| `tokenizer.chat_template` | str \| None | `None` | Chat template for the tokenizer. Either a Jinja2 template string or a path to a template file. If None, the tokenizer's default chat template is used. | +| `tokenizer.name` | `str | None` | `None` | Tokenizer name or path. If None, the model's default tokenizer is used. | +| `tokenizer.trust_remote_code` | `bool | None` | `None` | Trust remote code when initializing the tokenizer. If None, inherits the model's ``trust_remote_code`` setting. | +| `tokenizer.chat_template` | `str | None` | `None` | Chat template for the tokenizer. Either a Jinja2 template string or a path to a template file. If None, the tokenizer's default chat template is used. | ### `renderer` @@ -2222,12 +2344,12 @@ Client-side renderer configuration. 
Only consumed when ``use_renderer=true``. | Field | Type | Default | Description | |---|---|---|---| -| `renderer.name` | str | `'auto'` | Renderer used for chat-template tokenization. One of: ``auto`` (detect from tokenizer), ``qwen3``, ``qwen3_vl``, ``qwen3.5``, ``glm5``, ``glm4.5``, ``minimax-m2``, ``deepseek_v3``, ``kimi_k2``, ``kimi_k25``, ``nemotron3``, ``gpt_oss``, ``default``. | -| `renderer.tool_parser` | str \| None | `None` | Tool parser from ``renderers.parsers``. Only consumed by DefaultRenderer; model-specific renderers bake their own parsing in. Options: ``qwen3``, ``qwen3.5``, ``glm``, ``deepseek_v3``. | -| `renderer.reasoning_parser` | str \| None | `None` | Reasoning parser from ``renderers.parsers``. Only consumed by DefaultRenderer. Options: ``think``. | -| `renderer.pool_size` | int \| None | `None` | _≥1._ Number of renderer slots shared across concurrent rollouts. Bump for long multi-turn prompts where client-side jinja tokenization serializes. | -| `renderer.preserve_all_thinking` | bool | `False` | Re-emit every past-assistant turn's ``reasoning_content`` between ``<think>``/``</think>`` (or the model's equivalent), even when the chat template would drop it. Strict superset of preserve_thinking_between_tool_calls. | -| `renderer.preserve_thinking_between_tool_calls` | bool | `False` | Preserve past-assistant ``reasoning_content`` only inside the current tool cycle — the contiguous assistant→tool→…→assistant block after the most recent user message, when that block contains at least one tool response. A new user turn closes the block. | +| `renderer.name` | `str` | `'auto'` | Renderer used for chat-template tokenization. One of: ``auto`` (detect from tokenizer), ``qwen3``, ``qwen3_vl``, ``qwen3.5``, ``glm5``, ``glm4.5``, ``minimax-m2``, ``deepseek_v3``, ``kimi_k2``, ``kimi_k25``, ``nemotron3``, ``gpt_oss``, ``default``. | +| `renderer.tool_parser` | `str | None` | `None` | Tool parser from ``renderers.parsers``.
Only consumed by DefaultRenderer; model-specific renderers bake their own parsing in. Options: ``qwen3``, ``qwen3.5``, ``glm``, ``deepseek_v3``. | +| `renderer.reasoning_parser` | `str | None` | `None` | Reasoning parser from ``renderers.parsers``. Only consumed by DefaultRenderer. Options: ``think``. | +| `renderer.pool_size` | `int | None` | `None` | _≥1._ Number of renderer slots shared across concurrent rollouts. Bump for long multi-turn prompts where client-side jinja tokenization serializes. | +| `renderer.preserve_all_thinking` | `bool` | `False` | Re-emit every past-assistant turn's ``reasoning_content`` between ``<think>``/``</think>`` (or the model's equivalent), even when the chat template would drop it. Strict superset of preserve_thinking_between_tool_calls. | +| `renderer.preserve_thinking_between_tool_calls` | `bool` | `False` | Preserve past-assistant ``reasoning_content`` only inside the current tool cycle — the contiguous assistant→tool→…→assistant block after the most recent user message, when that block contains at least one tool response. A new user turn closes the block. | ### `optim` @@ -2236,7 +2358,7 @@ Per-run optimizer configuration for multi-run training.
| Field | Type | Default | Description | |---|---|---|---| -| `eval.env` | list[EvalEnvConfig] | `[EvalEnvConfig(id='reverse-text', name=None, args={}, extra_env_kwargs={'max_total_completion_tokens': -1}, address=None, num_workers='auto', ratio=None, max_retries=3, max_total_completion_tokens=-1, timeout=None, state_columns=[], sampling=EvalSamplingConfig(temperature=None, repetition_penalty=None, top_p=None, top_k=None, min_p=None, max_completion_tokens=None, min_tokens=None, reasoning_effort=None, seed=None, extra_body={}), num_examples=-1, group_size=1, interval=100)]` | Evaluation environments. | -| `eval.num_examples` | int | `-1` | Default eval examples per environment. ``-1`` uses all. Can be overridden per env. | -| `eval.group_size` | int | `1` | _≥1._ Default rollouts per example. Can be overridden per env. | -| `eval.num_workers` | int \| 'auto' | `'auto'` | Default worker processes for env servers. Can be overridden per env. | -| `eval.max_retries` | int | `3` | _≥0._ Default retries for failed rollouts. Can be overridden per env. | -| `eval.interval` | int | `100` | _≥1._ Step interval at which to evaluate the model. | -| `eval.eval_base_model` | bool | `True` | Evaluate the base model we are training on. | -| `eval.skip_eval_on_resume` | bool | `True` | When resuming the orchestrator from a checkpoint, skip the (potentially redundant) online eval that would otherwise run immediately at the resumed step. | -| `eval.cancel_inflight_rollouts_on_eval` | bool | `False` | Cancel in-flight training rollouts before starting online evals. Avoids congestion (no training + eval rollouts at the same time) at the cost of slower training steps as the pipeline has to refill after each eval. | +| `eval.num_examples` | `int` | `-1` | Default eval examples per environment. ``-1`` uses all. Can be overridden per env. | +| `eval.group_size` | `int` | `1` | _≥1._ Default rollouts per example. Can be overridden per env. 
| +| `eval.num_workers` | `int | 'auto'` | `'auto'` | Default worker processes for env servers. Can be overridden per env. | +| `eval.max_retries` | `int` | `3` | _≥0._ Default retries for failed rollouts. Can be overridden per env. | +| `eval.interval` | `int` | `100` | _≥1._ Step interval at which to evaluate the model. | +| `eval.eval_base_model` | `bool` | `True` | Evaluate the base model we are training on. | +| `eval.skip_eval_on_resume` | `bool` | `True` | When resuming the orchestrator from a checkpoint, skip the (potentially redundant) online eval that would otherwise run immediately at the resumed step. | +| `eval.cancel_inflight_rollouts_on_eval` | `bool` | `False` | Cancel in-flight training rollouts before starting online evals. Avoids congestion (no training + eval rollouts at the same time) at the cost of slower training steps as the pipeline has to refill after each eval. | #### `eval.sampling` @@ -2262,51 +2383,91 @@ Shared eval sampling configuration; can differ from training sampling. | Field | Type | Default | Description | |---|---|---|---| -| `eval.sampling.temperature` | float \| None | `None` | _≥0._ Sampling temperature. None defers to the inference server default. | -| `eval.sampling.repetition_penalty` | float \| None | `None` | _≥0._ Repetition penalty. None defers to the inference server default. | -| `eval.sampling.top_p` | float \| None | `None` | Nucleus sampling threshold. None defers to the inference server default. | -| `eval.sampling.top_k` | int \| None | `None` | Top-k sampling. None defers to the inference server default. | -| `eval.sampling.min_p` | float \| None | `None` | _≥0._ Min-p sampling threshold. None defers to the inference server default. | -| `eval.sampling.max_completion_tokens` | int \| None | `None` | Maximum output tokens per turn. None defers to the inference server default. | -| `eval.sampling.min_tokens` | int \| None | `None` | _≥0._ Minimum output tokens per sequence. 
None defers to the inference server default. | -| `eval.sampling.reasoning_effort` | 'minimal' \| 'low' \| 'medium' \| 'high' \| None | `None` | Reasoning effort constraint for reasoning models. | -| `eval.sampling.seed` | int \| None | `None` | Random seed for sampling. None means no seeding. | -| `eval.sampling.extra_body` | dict[str, Any] | `{}` | Extra body parameters forwarded to the inference server. | +| `eval.sampling.temperature` | `float | None` | `None` | _≥0._ Sampling temperature. None defers to the inference server default. | +| `eval.sampling.repetition_penalty` | `float | None` | `None` | _≥0._ Repetition penalty. None defers to the inference server default. | +| `eval.sampling.top_p` | `float | None` | `None` | Nucleus sampling threshold. None defers to the inference server default. | +| `eval.sampling.top_k` | `int | None` | `None` | Top-k sampling. None defers to the inference server default. | +| `eval.sampling.min_p` | `float | None` | `None` | _≥0._ Min-p sampling threshold. None defers to the inference server default. | +| `eval.sampling.max_completion_tokens` | `int | None` | `None` | Maximum output tokens per turn. None defers to the inference server default. | +| `eval.sampling.min_tokens` | `int | None` | `None` | _≥0._ Minimum output tokens per sequence. None defers to the inference server default. | +| `eval.sampling.reasoning_effort` | `'minimal' | 'low' | 'medium' | 'high' | None` | `None` | Reasoning effort constraint for reasoning models. | +| `eval.sampling.seed` | `int | None` | `None` | Random seed for sampling. None means no seeding. | +| `eval.sampling.extra_body` | `dict[str, Any]` | `{}` | Extra body parameters forwarded to the inference server. | + + +#### `eval.env.` (list item) + +Evaluation environments. + +| Field | Type | Default | Description | +|---|---|---|---| +| `eval.env..id` | `str` | `'reverse-text'` | Registered verifiers environment ID (e.g. ``math-env``, ``primeintellect/math-env``). 
May include an ``@version`` suffix for installation. | +| `eval.env..name` | `str | None` | `None` | Display name for this environment in logs, metrics, and buffer keys. Defaults to the ``id`` without ``@version``. Must be unique across all envs in the same group. | +| `eval.env..args` | `dict` | `{}` | Keyword arguments forwarded to ``vf.load_environment``. See the environment's docstring for accepted args. | +| `eval.env..extra_env_kwargs` | `dict[str, Any]` | `{}` | Extra kwargs passed to the env (e.g. ``seq_len``, ``max_total_completion_tokens``). Auto-populated by the orchestrator; user overrides are generally discouraged. The main use case is matching ``extra_env_kwargs`` when running an env in an isolated environment server. | +| `eval.env..address` | `str | None` | `None` | ZMQ address of an external env server (e.g. ``tcp://host:5000``). When set, the orchestrator connects to this server instead of spawning one; when None, a subprocess env server is spawned automatically. | +| `eval.env..num_workers` | `int | 'auto'` | `'auto'` | Worker processes for the spawned env server. ``auto`` scales to 1 worker per 256 concurrent rollouts. Ignored when ``address`` is set. | +| `eval.env..ratio` | `float | None` | `None` | _>0._ Sampling weight for this environment in the buffer. When None for all envs, samples uniformly across all available problems. When set, must be set on all envs — values are relative weights normalized to probabilities (e.g. [1, 1] and [0.5, 0.5] are equivalent). | +| `eval.env..max_retries` | `int` | `3` | _≥0._ Times the env server retries a failed rollout before returning an error. | +| `eval.env..max_total_completion_tokens` | `int` | `-1` | Maximum total completion tokens across all turns in a multi-turn rollout. ``-1`` disables. Auto-populated into ``extra_env_kwargs``. | +| `eval.env..timeout` | `float | None` | `None` | Per-rollout wall-clock timeout in seconds. None disables. 
| +| `eval.env..state_columns` | `list[str]` | `[]` | Extra ``State`` fields to persist into the saved rollout records (in addition to the always-saved ``trajectory`` and ``sampling_args``). Values must be JSON-serializable. | +| `eval.env..num_examples` | `int` | `-1` | Eval examples to sample from the dataset. ``-1`` uses all available examples. | +| `eval.env..group_size` | `int` | `1` | _≥1._ Rollouts generated per example. Used for pass@k estimation (e.g. ``group_size=8`` enables pass@1 through pass@8). | +| `eval.env..interval` | `int` | `100` | _≥1._ Per-env eval interval. If unset, inherits from the group-level eval interval. | + + +##### `eval.env..sampling` + +Per-env sampling overrides. Unset fields inherit from the group-level eval sampling config. + +| Field | Type | Default | Description | +|---|---|---|---| +| `eval.env..sampling.temperature` | `float | None` | `None` | _≥0._ Sampling temperature. None defers to the inference server default. | +| `eval.env..sampling.repetition_penalty` | `float | None` | `None` | _≥0._ Repetition penalty. None defers to the inference server default. | +| `eval.env..sampling.top_p` | `float | None` | `None` | Nucleus sampling threshold. None defers to the inference server default. | +| `eval.env..sampling.top_k` | `int | None` | `None` | Top-k sampling. None defers to the inference server default. | +| `eval.env..sampling.min_p` | `float | None` | `None` | _≥0._ Min-p sampling threshold. None defers to the inference server default. | +| `eval.env..sampling.max_completion_tokens` | `int | None` | `None` | Maximum output tokens per turn. None defers to the inference server default. | +| `eval.env..sampling.min_tokens` | `int | None` | `None` | _≥0._ Minimum output tokens per sequence. None defers to the inference server default. | +| `eval.env..sampling.reasoning_effort` | `'minimal' | 'low' | 'medium' | 'high' | None` | `None` | Reasoning effort constraint for reasoning models. 
| +| `eval.env..sampling.seed` | `int | None` | `None` | Random seed for sampling. None means no seeding. | +| `eval.env..sampling.extra_body` | `dict[str, Any]` | `{}` | Extra body parameters forwarded to the inference server. | ### `buffer` | Field | Type | Default | Description | |---|---|---|---| -| `buffer.seed` | int \| None | `None` | Random seed for the buffer. When set, sampling from the buffer is deterministic. | -| `buffer.easy_threshold` | float \| None | `None` | Average-reward threshold above which a problem is classified ``easy``. | -| `buffer.hard_threshold` | float \| None | `None` | Average-reward threshold below which a problem is classified ``hard``. | -| `buffer.easy_fraction` | float | `0.0` | _≥0, ≤1._ Fraction of easy problems to convert to ``normal`` when resuming or starting training. Only problems with difficulty ``normal`` are sampled. | -| `buffer.hard_fraction` | float | `0.0` | _≥0, ≤1._ Fraction of hard problems to convert to ``normal`` when resuming or starting training. Only problems with difficulty ``normal`` are sampled. | -| `buffer.online_difficulty_filtering` | bool | `False` | Filter rollouts based on difficulty. When True, rollouts with average reward 0.0 or 1.0 are not added to the buffer. | -| `buffer.hash_keys` | list[str] | `['env_name', 'prompt']` | _len ≥ 1._ Keys used to compute example hashes. Used to match examples from buffer checkpoints and determine buffer resume behavior. | +| `buffer.seed` | `int | None` | `None` | Random seed for the buffer. When set, sampling from the buffer is deterministic. | +| `buffer.easy_threshold` | `float | None` | `None` | Average-reward threshold above which a problem is classified ``easy``. | +| `buffer.hard_threshold` | `float | None` | `None` | Average-reward threshold below which a problem is classified ``hard``. | +| `buffer.easy_fraction` | `float` | `0.0` | _≥0, ≤1._ Fraction of easy problems to convert to ``normal`` when resuming or starting training. 
Only problems with difficulty ``normal`` are sampled. | +| `buffer.hard_fraction` | `float` | `0.0` | _≥0, ≤1._ Fraction of hard problems to convert to ``normal`` when resuming or starting training. Only problems with difficulty ``normal`` are sampled. | +| `buffer.online_difficulty_filtering` | `bool` | `False` | Filter rollouts based on difficulty. When True, rollouts with average reward 0.0 or 1.0 are not added to the buffer. | +| `buffer.hash_keys` | `list[str]` | `['env_name', 'prompt']` | _len ≥ 1._ Keys used to compute example hashes. Used to match examples from buffer checkpoints and determine buffer resume behavior. | ### `log` | Field | Type | Default | Description | |---|---|---|---| -| `log.level` | str | `'info'` | Log level for the process. Defaults to ``$PRIME_LOG_LEVEL`` if set, else ``info``. | -| `log.vf_level` | str | `'info'` | Log level for the verifiers package. Defaults to ``$PRIME_VF_LOG_LEVEL`` if set, else ``info``. | -| `log.json_logging` | bool | `False` | Emit newline-delimited JSON logs for aggregation (Loki, Grafana, etc.). | -| `log.log_data` | bool | `False` | Log the first data sample at startup. | +| `log.level` | `str` | `'info'` | Log level for the process. Defaults to ``$PRIME_LOG_LEVEL`` if set, else ``info``. | +| `log.vf_level` | `str` | `'info'` | Log level for the verifiers package. Defaults to ``$PRIME_VF_LOG_LEVEL`` if set, else ``info``. | +| `log.json_logging` | `bool` | `False` | Emit newline-delimited JSON logs for aggregation (Loki, Grafana, etc.). | +| `log.log_data` | `bool` | `False` | Log the first data sample at startup. | ### `wandb` | Field | Type | Default | Description | |---|---|---|---| -| `wandb.project` | str | `'prime-rl'` | W&B project to log to. | -| `wandb.entity` | str \| None | `None` | W&B entity to log to. | -| `wandb.name` | str \| None | `None` | W&B run name. | -| `wandb.group` | str \| None | `None` | W&B group. | -| `wandb.tags` | list[str] \| None | `None` | W&B tags attached to the run. 
| -| `wandb.offline` | bool | `False` | Run W&B in offline mode. | +| `wandb.project` | `str` | `'prime-rl'` | W&B project to log to. | +| `wandb.entity` | `str | None` | `None` | W&B entity to log to. | +| `wandb.name` | `str | None` | `None` | W&B run name. | +| `wandb.group` | `str | None` | `None` | W&B group. | +| `wandb.tags` | `list[str] | None` | `None` | W&B tags attached to the run. | +| `wandb.offline` | `bool` | `False` | Run W&B in offline mode. | #### `wandb.log_extras` @@ -2315,21 +2476,21 @@ Extras logging configuration. If None, no extras are logged. | Field | Type | Default | Description | |---|---|---|---| -| `wandb.log_extras.samples` | bool | `True` | Log prompt/response samples. | -| `wandb.log_extras.distributions` | bool | `True` | Log distributions (rewards, advantages, etc.). | -| `wandb.log_extras.interval` | int | `10` | _≥1._ Step interval between extras logs. | -| `wandb.log_extras.sample_ratio` | float \| None | `None` | _≥0.0, ≤1.0._ Fraction of rollouts to log per step. The effective cap is ``len(rollouts) * sample_ratio``; 1.0 = all, 0.5 = half, 0.0 = none. | +| `wandb.log_extras.samples` | `bool` | `True` | Log prompt/response samples. | +| `wandb.log_extras.distributions` | `bool` | `True` | Log distributions (rewards, advantages, etc.). | +| `wandb.log_extras.interval` | `int` | `10` | _≥1._ Step interval between extras logs. | +| `wandb.log_extras.sample_ratio` | `float | None` | `None` | _≥0.0, ≤1.0._ Fraction of rollouts to log per step. The effective cap is ``len(rollouts) * sample_ratio``; 1.0 = all, 0.5 = half, 0.0 = none. | ### `prime_monitor` | Field | Type | Default | Description | |---|---|---|---| -| `prime_monitor.base_url` | str | `'https://api.primeintellect.ai/api/v1/rft'` | Base URL for the Prime Intellect monitoring API. | -| `prime_monitor.api_key_var` | str | `'PRIME_API_KEY'` | Environment variable name containing the Prime Intellect API key, resolved via ``os.getenv``. 
| -| `prime_monitor.run_name` | str \| None | `None` | Run name shown on the platform. Defaults to the W&B run name when set, otherwise the platform auto-generates one. | -| `prime_monitor.team_id` | str \| None | `None` | Team ID to associate the run with. | -| `prime_monitor.frontend_url` | str \| None | `None` | Frontend base URL used for the dashboard link printed after registration. Defaults to the Prime CLI frontend URL when unset. | +| `prime_monitor.base_url` | `str` | `'https://api.primeintellect.ai/api/v1/rft'` | Base URL for the Prime Intellect monitoring API. | +| `prime_monitor.api_key_var` | `str` | `'PRIME_API_KEY'` | Environment variable name containing the Prime Intellect API key, resolved via ``os.getenv``. | +| `prime_monitor.run_name` | `str | None` | `None` | Run name shown on the platform. Defaults to the W&B run name when set, otherwise the platform auto-generates one. | +| `prime_monitor.team_id` | `str | None` | `None` | Team ID to associate the run with. | +| `prime_monitor.frontend_url` | `str | None` | `None` | Frontend base URL used for the dashboard link printed after registration. Defaults to the Prime CLI frontend URL when unset. | #### `prime_monitor.log_extras` @@ -2338,10 +2499,10 @@ Extras logging configuration. If None, no extras are logged. | Field | Type | Default | Description | |---|---|---|---| -| `prime_monitor.log_extras.samples` | bool | `True` | Log prompt/response samples. | -| `prime_monitor.log_extras.distributions` | bool | `True` | Log distributions (rewards, advantages, etc.). | -| `prime_monitor.log_extras.interval` | int | `10` | _≥1._ Step interval between extras logs. | -| `prime_monitor.log_extras.sample_ratio` | float \| None | `None` | _≥0.0, ≤1.0._ Fraction of rollouts to log per step. The effective cap is ``len(rollouts) * sample_ratio``; 1.0 = all, 0.5 = half, 0.0 = none. | +| `prime_monitor.log_extras.samples` | `bool` | `True` | Log prompt/response samples. 
| +| `prime_monitor.log_extras.distributions` | `bool` | `True` | Log distributions (rewards, advantages, etc.). | +| `prime_monitor.log_extras.interval` | `int` | `10` | _≥1._ Step interval between extras logs. | +| `prime_monitor.log_extras.sample_ratio` | `float | None` | `None` | _≥0.0, ≤1.0._ Fraction of rollouts to log per step. The effective cap is ``len(rollouts) * sample_ratio``; 1.0 = all, 0.5 = half, 0.0 = none. | ### `ckpt` @@ -2350,13 +2511,13 @@ Checkpoint configuration. | Field | Type | Default | Description | |---|---|---|---| -| `ckpt.interval` | int \| None | `None` | _≥1._ Step interval at which to save the orchestrator checkpoint. | -| `ckpt.resume_step` | int \| None | `None` | _≥-1._ Step to resume the orchestrator from. None starts from scratch; ``-1`` resumes from the latest checkpoint available. | -| `ckpt.wait_for_weights_timeout` | int \| None | `None` | _≥1._ When resuming, wait up to this many seconds for the weight directory to appear. Useful when the orchestrator restarts while the trainer is still saving weights. If None, fail immediately when weights are not found. | -| `ckpt.keep_last` | int \| None | `None` | _≥1._ Keep at most this many recent step checkpoints on disk. If None, never clean old checkpoints based on recency. | -| `ckpt.keep_interval` | int \| None | `None` | _≥1._ Keep checkpoints at every N steps permanently (e.g. ``keep_interval=100`` keeps step 100, 200, ...). If None, no interval-based keeping. | -| `ckpt.skip_progress` | bool | `False` | Skip loading the progress from checkpoint. | -| `ckpt.skip_buffer` | bool | `False` | Skip loading the buffer from checkpoint. | +| `ckpt.interval` | `int | None` | `None` | _≥1._ Step interval at which to save the orchestrator checkpoint. | +| `ckpt.resume_step` | `int | None` | `None` | _≥-1._ Step to resume the orchestrator from. None starts from scratch; ``-1`` resumes from the latest checkpoint available. 
| +| `ckpt.wait_for_weights_timeout` | `int | None` | `None` | _≥1._ When resuming, wait up to this many seconds for the weight directory to appear. Useful when the orchestrator restarts while the trainer is still saving weights. If None, fail immediately when weights are not found. | +| `ckpt.keep_last` | `int | None` | `None` | _≥1._ Keep at most this many recent step checkpoints on disk. If None, never clean old checkpoints based on recency. | +| `ckpt.keep_interval` | `int | None` | `None` | _≥1._ Keep checkpoints at every N steps permanently (e.g. ``keep_interval=100`` keeps step 100, 200, ...). If None, no interval-based keeping. | +| `ckpt.skip_progress` | `bool` | `False` | Skip loading the progress from checkpoint. | +| `ckpt.skip_buffer` | `bool` | `False` | Skip loading the buffer from checkpoint. | ### `heartbeat` @@ -2365,11 +2526,46 @@ BetterStack heartbeat configuration for monitoring training progress. | Field | Type | Default | Description | |---|---|---|---| -| `heartbeat.url` | str | *required* | URL to send the heartbeat to. | +| `heartbeat.url` | `str` | *required* | URL to send the heartbeat to. | ### `experimental` + +### `filters.` (list item) + +Rollout filters. Each filter can ``monitor`` (default) or ``enforce`` (skip rollouts). + +Discriminated list-item union — set `filters..type` to one of `gibberish`, `repetition`, `zero_advantage` and provide the matching sub-fields. + + +#### `filters..type = "gibberish"` (GibberishFilterConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `filters..type` | `'gibberish'` | `'gibberish'` | | +| `filters..enforce` | `bool` | `False` | When True, skip detected rollouts entirely so they are not sent to the trainer. When False, only track detection metrics. | +| `filters..token_id_threshold` | `int` | `100000` | Token IDs above this are candidates for gibberish. BPE tokens are sorted by merge order. 
| +| `filters..logprob_offset` | `float` | `2.0` | Offset from uniform-distribution logprob. Threshold = ``-log(vocab_size) - logprob_offset``. | + + +#### `filters..type = "repetition"` (RepetitionFilterConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `filters..type` | `'repetition'` | `'repetition'` | | +| `filters..enforce` | `bool` | `False` | When True, skip detected rollouts entirely so they are not sent to the trainer. When False, only track detection metrics. | +| `filters..window` | `int` | `3000` | _≥1._ Consecutive high-probability steps required to flag the rollout. | +| `filters..prob_threshold` | `float` | `0.99` | _>0, ≤1._ Tokens sampled with probability above this are considered repetitive. Consecutive such tokens count toward the window. | + + +#### `filters..type = "zero_advantage"` (ZeroAdvantageFilterConfig) + +| Field | Type | Default | Description | +|---|---|---|---| +| `filters..type` | `'zero_advantage'` | `'zero_advantage'` | | +| `filters..enforce` | `bool` | `True` | When True, skip detected rollouts entirely so they are not sent to the trainer. When False, only track detection metrics. | + ### `weight_broadcast` @@ -2382,19 +2578,19 @@ Discriminated union — set `weight_broadcast.type` to one of `filesystem`, `ncc | Field | Type | Default | Description | |---|---|---|---| -| `weight_broadcast.type` | 'filesystem' | `'filesystem'` | | +| `weight_broadcast.type` | `'filesystem'` | `'filesystem'` | | #### `weight_broadcast.type = "nccl"` (NCCLWeightBroadcastConfig) | Field | Type | Default | Description | |---|---|---|---| -| `weight_broadcast.type` | 'nccl' | `'nccl'` | | -| `weight_broadcast.host` | str | `'localhost'` | Host for the NCCL broadcast rendezvous. | -| `weight_broadcast.port` | int | `29501` | Port for the NCCL broadcast rendezvous. | -| `weight_broadcast.timeout` | int | `1200` | Timeout in seconds for the NCCL broadcast. 
| -| `weight_broadcast.quantize_in_weight_transfer` | bool | `False` | Use kernel-format FP8 quantized NCCL transfer for weight updates. | -| `weight_broadcast.inference_world_size` | int | `1` | _≥1._ Total inference GPUs across all servers. Used by ``init_nccl_broadcast`` to compute per-server rank offsets. | +| `weight_broadcast.type` | `'nccl'` | `'nccl'` | | +| `weight_broadcast.host` | `str` | `'localhost'` | Host for the NCCL broadcast rendezvous. | +| `weight_broadcast.port` | `int` | `29501` | Port for the NCCL broadcast rendezvous. | +| `weight_broadcast.timeout` | `int` | `1200` | Timeout in seconds for the NCCL broadcast. | +| `weight_broadcast.quantize_in_weight_transfer` | `bool` | `False` | Use kernel-format FP8 quantized NCCL transfer for weight updates. | +| `weight_broadcast.inference_world_size` | `int` | `1` | _≥1._ Total inference GPUs across all servers. Used by ``init_nccl_broadcast`` to compute per-server rank offsets. | ### `rollout_transport` @@ -2408,17 +2604,17 @@ Discriminated union — set `rollout_transport.type` to one of `filesystem`, `zm | Field | Type | Default | Description | |---|---|---|---| -| `rollout_transport.type` | 'filesystem' | `'filesystem'` | | +| `rollout_transport.type` | `'filesystem'` | `'filesystem'` | | #### `rollout_transport.type = "zmq"` (ZMQTransportConfig) | Field | Type | Default | Description | |---|---|---|---| -| `rollout_transport.type` | 'zmq' | `'zmq'` | | -| `rollout_transport.host` | str | `'localhost'` | Host address for ZMQ transport. | -| `rollout_transport.port` | int | `5555` | Base port for ZMQ transport. | -| `rollout_transport.hwm` | int | `10` | High-water mark (max in-flight messages per ZMQ socket). | +| `rollout_transport.type` | `'zmq'` | `'zmq'` | | +| `rollout_transport.host` | `str` | `'localhost'` | Host address for ZMQ transport. | +| `rollout_transport.port` | `int` | `5555` | Base port for ZMQ transport. 
| +| `rollout_transport.hwm` | `int` | `10` | High-water mark (max in-flight messages per ZMQ socket). | ## `inference` — Standalone vLLM server @@ -2429,51 +2625,51 @@ _Defined in_ `prime_rl.configs.inference.InferenceConfig`. | Field | Type | Default | Description | |---|---|---|---| -| `enable_lora` | bool | `False` | Enable LoRA. Forwarded as ``--enable-lora``. | -| `max_loras` | int | `8` | Maximum number of LoRAs. Forwarded as ``--max-loras``. | -| `max_cpu_loras` | int | `100` | Maximum number of LoRAs on CPU. Forwarded as ``--max-cpu-loras``. | -| `max_lora_rank` | int \| None | `None` | Maximum LoRA rank. Forwarded as ``--max-lora-rank``. | -| `lora_target_modules` | list[str] \| None | `None` | LoRA target modules. Forwarded as ``--lora-target-modules``. | -| `enable_prefix_caching` | bool \| None | `None` | Enable prefix caching. Forwarded as ``--enable-prefix-caching``. | -| `gpu_memory_utilization` | float | `0.9` | GPU memory utilization. Forwarded as ``--gpu-memory-utilization``. | -| `api_server_count` | int | `1` | _≥0._ API servers to run. Forwarded as ``--api-server-count``. Set to 0 for headless mode. | -| `data_parallel_size_local` | int \| None | `None` | _≥1._ Data parallel replicas to run on this node. Forwarded as ``--data-parallel-size-local``. | -| `data_parallel_rpc_port` | int | `13345` | _≥1, ≤65535._ RPC port for data parallel communication. Forwarded as ``--data-parallel-rpc-port``. | -| `seed` | int | `0` | Seed the inference components. Forwarded as ``--seed``. | -| `enable_expert_parallel` | bool | `False` | Enable expert parallelism for MoE models. Forwarded as ``--enable-expert-parallel``. | -| `all2all_backend` | 'allgather_reducescatter' \| 'deepep_high_throughput' \| 'deepep_low_latency' \| 'flashinfer_nvlink_one_sided' \| 'flashinfer_nvlink_two_sided' | `'allgather_reducescatter'` | All-to-all backend for expert-parallel communication. Forwarded as ``--all2all-backend``. 
| -| `enable_eplb` | bool | `False` | Enable expert parallel load balancer (EPLB). Forwarded as ``--enable-eplb``. | -| `enable_dbo` | bool | `False` | Enable dual batch overlap (DBO). Forwarded as ``--enable-dbo``. | -| `use_deep_gemm` | bool | `False` | Force DeepGEMM FP8 kernels via ``VLLM_USE_DEEP_GEMM=1``. Only works with per-tensor FP8 quantization (e.g. GLM-5-FP8). | -| `enable_return_routed_experts` | bool | `False` | Return routed experts in responses. Forwarded as ``--enable-return-routed-experts``. | -| `enable_fp32_lm_head` | bool | `False` | Run the lm_head projection in fp32 via a native bf16×bf16 → fp32 GEMM (``torch.mm`` with ``out_dtype=torch.float32``). Stabilizes logprob precision under FP8/bf16 inference, matching SGLang's ``--enable-fp32-lm-head``. Implemented as a monkey-patch over vLLM's LogitsProcessor, activated by setting ``additional_config["fp32_lm_head"] = True`` on the vLLM config. | -| `vllm_extra` | dict[str, Any] | `{}` | Extra arguments forwarded to vLLM. Applied as attributes on the vLLM namespace after config translation. | -| `output_dir` | Path | `'outputs'` | Directory for SLURM logs and generated scripts. | -| `dry_run` | bool | `False` | Only validate and dump resolved configs, then exit early. | +| `enable_lora` | `bool` | `False` | Enable LoRA. Forwarded as ``--enable-lora``. | +| `max_loras` | `int` | `8` | Maximum number of LoRAs. Forwarded as ``--max-loras``. | +| `max_cpu_loras` | `int` | `100` | Maximum number of LoRAs on CPU. Forwarded as ``--max-cpu-loras``. | +| `max_lora_rank` | `int | None` | `None` | Maximum LoRA rank. Forwarded as ``--max-lora-rank``. | +| `lora_target_modules` | `list[str] | None` | `None` | LoRA target modules. Forwarded as ``--lora-target-modules``. | +| `enable_prefix_caching` | `bool | None` | `None` | Enable prefix caching. Forwarded as ``--enable-prefix-caching``. | +| `gpu_memory_utilization` | `float` | `0.9` | GPU memory utilization. Forwarded as ``--gpu-memory-utilization``. 
| +| `api_server_count` | `int` | `1` | _≥0._ API servers to run. Forwarded as ``--api-server-count``. Set to 0 for headless mode. | +| `data_parallel_size_local` | `int | None` | `None` | _≥1._ Data parallel replicas to run on this node. Forwarded as ``--data-parallel-size-local``. | +| `data_parallel_rpc_port` | `int` | `13345` | _≥1, ≤65535._ RPC port for data parallel communication. Forwarded as ``--data-parallel-rpc-port``. | +| `seed` | `int` | `0` | Seed the inference components. Forwarded as ``--seed``. | +| `enable_expert_parallel` | `bool` | `False` | Enable expert parallelism for MoE models. Forwarded as ``--enable-expert-parallel``. | +| `all2all_backend` | `'allgather_reducescatter' | 'deepep_high_throughput' | 'deepep_low_latency' | 'flashinfer_nvlink_one_sided' | 'flashinfer_nvlink_two_sided'` | `'allgather_reducescatter'` | All-to-all backend for expert-parallel communication. Forwarded as ``--all2all-backend``. | +| `enable_eplb` | `bool` | `False` | Enable expert parallel load balancer (EPLB). Forwarded as ``--enable-eplb``. | +| `enable_dbo` | `bool` | `False` | Enable dual batch overlap (DBO). Forwarded as ``--enable-dbo``. | +| `use_deep_gemm` | `bool` | `False` | Force DeepGEMM FP8 kernels via ``VLLM_USE_DEEP_GEMM=1``. Only works with per-tensor FP8 quantization (e.g. GLM-5-FP8). | +| `enable_return_routed_experts` | `bool` | `False` | Return routed experts in responses. Forwarded as ``--enable-return-routed-experts``. | +| `enable_fp32_lm_head` | `bool` | `False` | Run the lm_head projection in fp32 via a native bf16×bf16 → fp32 GEMM (``torch.mm`` with ``out_dtype=torch.float32``). Stabilizes logprob precision under FP8/bf16 inference, matching SGLang's ``--enable-fp32-lm-head``. Implemented as a monkey-patch over vLLM's LogitsProcessor, activated by setting ``additional_config["fp32_lm_head"] = True`` on the vLLM config. | +| `vllm_extra` | `dict[str, Any]` | `{}` | Extra arguments forwarded to vLLM. 
Applied as attributes on the vLLM namespace after config translation. | +| `output_dir` | `Path` | `'outputs'` | Directory for SLURM logs and generated scripts. | +| `dry_run` | `bool` | `False` | Only validate and dump resolved configs, then exit early. | ### `server` | Field | Type | Default | Description | |---|---|---|---| -| `server.host` | str \| None | `None` | Host to bind to. | -| `server.port` | int | `8000` | Port to bind to. | -| `server.liveness_timeout_seconds` | float | `30.0` | _>0._ Timeout in seconds for the ``/liveness`` endpoint's internal vLLM worker RPC. With Kubernetes liveness probes, keep the probe ``timeoutSeconds`` at least this high. | +| `server.host` | `str | None` | `None` | Host to bind to. | +| `server.port` | `int` | `8000` | Port to bind to. | +| `server.liveness_timeout_seconds` | `float` | `30.0` | _>0._ Timeout in seconds for the ``/liveness`` endpoint's internal vLLM worker RPC. With Kubernetes liveness probes, keep the probe ``timeoutSeconds`` at least this high. | ### `model` | Field | Type | Default | Description | |---|---|---|---| -| `model.name` | str | `'Qwen/Qwen3-0.6B'` | HF model name or local path. | -| `model.trust_remote_code` | bool | `False` | Trust remote code. Forwarded to vLLM engine init. | -| `model.dtype` | 'auto' \| 'float16' \| 'bfloat16' \| 'float32' | `'auto'` | dtype for model weights and activations. ``auto`` uses FP16 for FP32/FP16 models and BF16 for BF16 models. Forwarded as ``--dtype``. | -| `model.max_model_len` | int \| None | `None` | Maximum model context length. If None, uses the model config's value. Forwarded as ``--max-model-len``. | -| `model.enforce_eager` | bool | `False` | Enforce eager mode. When False, PyTorch eager and cuda graphs run hybrid for maximum performance. Forwarded as ``--enforce-eager``. | -| `model.chat_template` | str \| None | `None` | Chat template — a Jinja2 template string or path to a template file. Forwarded as ``--chat-template``. 
If None, uses the model's default. | -| `model.tool_call_parser` | str \| None | `'auto'` | Tool-call parser. Forwarded as ``--tool-call-parser``. Set to ``"auto"`` (default) to detect from the model name, or ``None`` to disable. | -| `model.reasoning_parser` | str \| None | `'auto'` | Parser for extracting reasoning content from model outputs. Forwarded as ``--reasoning-parser``. Set to ``"auto"`` (default) to detect from the model name, or ``None`` to disable. | -| `model.rope_scaling` | dict[str, Any] \| str \| None | `None` | RoPE scaling configuration as a dict (e.g. ``{rope_type="yarn", factor=4.0, original_max_position_embeddings=32768}``). Forwarded as ``--rope-scaling``. | +| `model.name` | `str` | `'Qwen/Qwen3-0.6B'` | HF model name or local path. | +| `model.trust_remote_code` | `bool` | `False` | Trust remote code. Forwarded to vLLM engine init. | +| `model.dtype` | `'auto' | 'float16' | 'bfloat16' | 'float32'` | `'auto'` | dtype for model weights and activations. ``auto`` uses FP16 for FP32/FP16 models and BF16 for BF16 models. Forwarded as ``--dtype``. | +| `model.max_model_len` | `int | None` | `None` | Maximum model context length. If None, uses the model config's value. Forwarded as ``--max-model-len``. | +| `model.enforce_eager` | `bool` | `False` | Enforce eager mode. When False, PyTorch eager and cuda graphs run hybrid for maximum performance. Forwarded as ``--enforce-eager``. | +| `model.chat_template` | `str | None` | `None` | Chat template — a Jinja2 template string or path to a template file. Forwarded as ``--chat-template``. If None, uses the model's default. | +| `model.tool_call_parser` | `str | None` | `'auto'` | Tool-call parser. Forwarded as ``--tool-call-parser``. Set to ``"auto"`` (default) to detect from the model name, or ``None`` to disable. | +| `model.reasoning_parser` | `str | None` | `'auto'` | Parser for extracting reasoning content from model outputs. Forwarded as ``--reasoning-parser``. 
Set to ``"auto"`` (default) to detect from the model name, or ``None`` to disable. | +| `model.rope_scaling` | `dict[str, Any] | str | None` | `None` | RoPE scaling configuration as a dict (e.g. ``{rope_type="yarn", factor=4.0, original_max_position_embeddings=32768}``). Forwarded as ``--rope-scaling``. | #### `model.vlm` @@ -2482,9 +2678,9 @@ VLM configuration. Setting this enables vision-language model support. | Field | Type | Default | Description | |---|---|---|---| -| `model.vlm.vision_encoder_attr` | str | *required* | Dotted attribute path to the vision encoder module (e.g. ``model.visual``). | -| `model.vlm.language_model_attr` | str | *required* | Dotted attribute path to the language model module (e.g. ``model.language_model``). | -| `model.vlm.freeze_vision_encoder` | bool | `True` | Freeze the vision encoder. When False, it is trainable and FSDP-sharded per-block. No effect with LoRA (LoRA freezes all non-adapter parameters). | +| `model.vlm.vision_encoder_attr` | `str` | *required* | Dotted attribute path to the vision encoder module (e.g. ``model.visual``). | +| `model.vlm.language_model_attr` | `str` | *required* | Dotted attribute path to the language model module (e.g. ``model.language_model``). | +| `model.vlm.freeze_vision_encoder` | `bool` | `True` | Freeze the vision encoder. When False, it is trainable and FSDP-sharded per-block. No effect with LoRA (LoRA freezes all non-adapter parameters). | ### `parallel` @@ -2493,15 +2689,15 @@ Multi-node and multi-GPU parallelism (TP, DP, PP). | Field | Type | Default | Description | |---|---|---|---| -| `parallel.tp` | int | `1` | Tensor parallel size. Forwarded to vLLM as ``--tensor-parallel-size``. | -| `parallel.dp` | int | `1` | _≥1._ Data parallel size. Forwarded to vLLM as ``--data-parallel-size``. | +| `parallel.tp` | `int` | `1` | Tensor parallel size. Forwarded to vLLM as ``--tensor-parallel-size``. | +| `parallel.dp` | `int` | `1` | _≥1._ Data parallel size. 
Forwarded to vLLM as ``--data-parallel-size``. |

 ### `weight_broadcast`

 | Field | Type | Default | Description |
 |---|---|---|---|
-| `weight_broadcast.type` | 'nccl' \| 'filesystem' | `'filesystem'` | Weight broadcast transport. |
+| `weight_broadcast.type` | `'nccl' | 'filesystem'` | `'filesystem'` | Weight broadcast transport. |

 ### `kv_cache_offload`

@@ -2510,7 +2706,7 @@ CPU KV cache offload for inference workers. Standard inference uses vLLM's ``Off

 | Field | Type | Default | Description |
 |---|---|---|---|
-| `kv_cache_offload.cpu_bytes` | int | `1000000000` | _>0._ CPU bytes available for KV cache offloading per worker. |
+| `kv_cache_offload.cpu_bytes` | `int` | `1000000000` | _>0._ CPU bytes available for KV cache offloading per worker. |

 ### `slurm`

@@ -2519,15 +2715,15 @@ SLURM configuration. When set, the run is submitted as a SLURM job instead of ru

 | Field | Type | Default | Description |
 |---|---|---|---|
-| `slurm.job_name` | str | `'prime-rl'` | SLURM job name. |
-| `slurm.project_dir` | Path | `'.'` | Path to the project root, used to source .env, activate .venv, and run uv sync. |
-| `slurm.template_path` | Path \| None | `None` | SLURM template file. If None, uses the bundled single-node or multi-node template. |
-| `slurm.partition` | str | `'cluster'` | SLURM partition (#SBATCH --partition). |
-| `slurm.nodelist` | str \| None | `None` | Comma-separated list of specific nodes to run on (#SBATCH --nodelist). |
-| `slurm.exclude` | str \| None | `None` | Comma-separated list of nodes to exclude (#SBATCH --exclude). |
-| `slurm.account` | str \| None | `None` | SLURM account to charge (#SBATCH --account). |
-| `slurm.time` | str \| None | `None` | Maximum wall time, e.g. '24:00:00' or '7-00:00:00' (#SBATCH --time). |
-| `slurm.pre_run_command` | str \| None | `None` | Shell command to run on the head node after cd, .env sourcing, and venv activation.
Useful for cleanup like ``sudo pkill -f vllm``; wrap with ``srun bash -c '...'`` to fan out to all nodes. |
+| `slurm.job_name` | `str` | `'prime-rl'` | SLURM job name. |
+| `slurm.project_dir` | `Path` | `'.'` | Path to the project root, used to source .env, activate .venv, and run uv sync. |
+| `slurm.template_path` | `Path | None` | `None` | SLURM template file. If None, uses the bundled single-node or multi-node template. |
+| `slurm.partition` | `str` | `'cluster'` | SLURM partition (#SBATCH --partition). |
+| `slurm.nodelist` | `str | None` | `None` | Comma-separated list of specific nodes to run on (#SBATCH --nodelist). |
+| `slurm.exclude` | `str | None` | `None` | Comma-separated list of nodes to exclude (#SBATCH --exclude). |
+| `slurm.account` | `str | None` | `None` | SLURM account to charge (#SBATCH --account). |
+| `slurm.time` | `str | None` | `None` | Maximum wall time, e.g. '24:00:00' or '7-00:00:00' (#SBATCH --time). |
+| `slurm.pre_run_command` | `str | None` | `None` | Shell command to run on the head node after cd, .env sourcing, and venv activation. Useful for cleanup like ``sudo pkill -f vllm``; wrap with ``srun bash -c '...'`` to fan out to all nodes. |

 ### `experimental`

@@ -2542,36 +2738,55 @@ Discriminated union — set `deployment.type` to one of `single_node`, `multi_no

 | Field | Type | Default | Description |
 |---|---|---|---|
-| `deployment.gpus_per_node` | int | `8` | GPUs per node. |
-| `deployment.type` | 'single_node' | `'single_node'` | |
+| `deployment.gpus_per_node` | `int` | `8` | GPUs per node. |
+| `deployment.type` | `'single_node'` | `'single_node'` | |

 #### `deployment.type = "multi_node"` (MultiNodeInferenceDeploymentConfig)

 | Field | Type | Default | Description |
 |---|---|---|---|
-| `deployment.gpus_per_node` | int | `8` | GPUs per node. |
-| `deployment.type` | 'multi_node' | `'multi_node'` | |
-| `deployment.num_nodes` | int | `2` | _≥1._ Inference nodes.
|
-| `deployment.router_port` | int | `8000` | Port for the vllm-router. |
-| `deployment.backend_port` | int | `8100` | Port for vLLM backend instances. |
-| `deployment.router_policy` | str | `'consistent_hash'` | vllm-router routing policy (e.g. ``consistent_hash``, ``round_robin``). |
+| `deployment.gpus_per_node` | `int` | `8` | GPUs per node. |
+| `deployment.type` | `'multi_node'` | `'multi_node'` | |
+| `deployment.num_nodes` | `int` | `2` | _≥1._ Inference nodes. |
+| `deployment.router_port` | `int` | `8000` | Port for the vllm-router. |
+| `deployment.backend_port` | `int` | `8100` | Port for vLLM backend instances. |
+| `deployment.router_policy` | `str` | `'consistent_hash'` | vllm-router routing policy (e.g. ``consistent_hash``, ``round_robin``). |

 #### `deployment.type = "disaggregated"` (DisaggregatedInferenceDeploymentConfig)

 | Field | Type | Default | Description |
 |---|---|---|---|
-| `deployment.gpus_per_node` | int | `8` | GPUs per node. |
-| `deployment.type` | 'disaggregated' | `'disaggregated'` | |
-| `deployment.num_prefill_nodes` | int | `1` | _≥1._ Total prefill nodes. |
-| `deployment.num_decode_nodes` | int | `1` | _≥1._ Total decode nodes. |
-| `deployment.num_prefill_replicas` | int | `1` | _≥1._ Independent prefill vLLM instances. Must evenly divide ``num_prefill_nodes``. |
-| `deployment.num_decode_replicas` | int | `1` | _≥1._ Independent decode vLLM instances. Must evenly divide ``num_decode_nodes``. |
-| `deployment.router_port` | int | `8000` | Port for the vllm-router on each replica. |
-| `deployment.prefill_port` | int | `8100` | Port for prefill vLLM instances. |
-| `deployment.decode_port` | int | `8200` | Port for decode vLLM instances. |
-| `deployment.router_policy` | str | `'consistent_hash'` | vllm-router routing policy (e.g. ``consistent_hash``, ``round_robin``). |
-| `deployment.prefill_env_overrides` | dict[str, str] | `{}` | Extra environment variables exported only on prefill nodes.
|
-| `deployment.decode_env_overrides` | dict[str, str] | `{}` | Extra environment variables exported only on decode nodes. |
+| `deployment.gpus_per_node` | `int` | `8` | GPUs per node. |
+| `deployment.type` | `'disaggregated'` | `'disaggregated'` | |
+| `deployment.num_prefill_nodes` | `int` | `1` | _≥1._ Total prefill nodes. |
+| `deployment.num_decode_nodes` | `int` | `1` | _≥1._ Total decode nodes. |
+| `deployment.num_prefill_replicas` | `int` | `1` | _≥1._ Independent prefill vLLM instances. Must evenly divide ``num_prefill_nodes``. |
+| `deployment.num_decode_replicas` | `int` | `1` | _≥1._ Independent decode vLLM instances. Must evenly divide ``num_decode_nodes``. |
+| `deployment.router_port` | `int` | `8000` | Port for the vllm-router on each replica. |
+| `deployment.prefill_port` | `int` | `8100` | Port for prefill vLLM instances. |
+| `deployment.decode_port` | `int` | `8200` | Port for decode vLLM instances. |
+| `deployment.router_policy` | `str` | `'consistent_hash'` | vllm-router routing policy (e.g. ``consistent_hash``, ``round_robin``). |
+| `deployment.prefill_env_overrides` | `dict[str, str]` | `{}` | Extra environment variables exported only on prefill nodes. |
+| `deployment.decode_env_overrides` | `dict[str, str]` | `{}` | Extra environment variables exported only on decode nodes. |
+
+## About this page
+Each entrypoint section walks its config tree top-down. Nested sub-configs
+appear under headings named after their dotted path (e.g. `trainer.model.ac`).
+List-typed sub-configs (e.g. `[[orchestrator.train.env]]`) appear under
+headings with a `` index placeholder — that's the CLI form too
+(`--orchestrator.train.env.0.id ...`). Discriminated unions (loss, advantage,
+scheduler, optimizer, …) document each variant in turn — set the `type` field
+to pick one.
+
+To regenerate, run from the project root:
+
+```bash
+uv run python scripts/generate_docs_reference.py
+```
+
+For conceptual context behind these knobs, see
+[Configuration](configuration.md), [Training](training.md),
+[Scaling](scaling.md), [Algorithms](algorithms.md), and [Advanced](advanced.md).
diff --git a/docs/scaling.md b/docs/scaling.md
index a557c87c6b..b21764b89d 100644
--- a/docs/scaling.md
+++ b/docs/scaling.md
@@ -496,11 +496,7 @@ For large MoE serving, splitting prefill and decode onto separate vLLM groups ca
 | Agentic (SWE, Lean) | 3:1 | Long growing contexts → prefill-heavy |
 | Non-agentic (math, chat) | 1:2 | Short prompts, long generations → decode-heavy |

-Example config: [`configs/glm5_disagg_inference/inference.toml`](https://github.com/PrimeIntellect-ai/prime-rl/blob/main/configs/glm5_disagg_inference/inference.toml). Launch with the standard `inference` entrypoint:
-
-```bash
-uv run inference @ configs/glm5_disagg_inference/inference.toml --output-dir /data/$USER/outputs
-```
+Example config: [`examples/glm5_pd_disag/rl.toml`](https://github.com/PrimeIntellect-ai/prime-rl/blob/main/examples/glm5_pd_disag/rl.toml) — full RL run on `GLM-5` with P/D disaggregation behind a `vllm-router`, FP8 inference, and NCCL weight broadcast (see the [README](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/examples/glm5_pd_disag) for the launch story).

 Monitor live queue depths to detect imbalance:

diff --git a/docs/training.md b/docs/training.md
index 5429cd3ada..2c3531d863 100644
--- a/docs/training.md
+++ b/docs/training.md
@@ -98,8 +98,6 @@ CUDA_VISIBLE_DEVICES=1 uv run inference \
 --model.name --server.port 8001
 ```

-Debug configs for all variants ship under [`configs/debug/training_modes/`](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/configs/debug/training_modes).
-
 The standalone `uv run sft` entrypoint is the more traditional SFT path — pure dataset-based, no teacher, no orchestrator.
Use `orchestrator.training_mode = "sft"` only when you want a teacher to generate the supervision on the fly.

 ### Important metrics

diff --git a/scripts/generate_docs_reference.py b/scripts/generate_docs_reference.py
index 0a77193aeb..4626c19a0e 100644
--- a/scripts/generate_docs_reference.py
+++ b/scripts/generate_docs_reference.py
@@ -114,14 +114,15 @@ def _union_models(t: object) -> list[type[BaseModel]] | None:


 def fmt_type(annotation: object) -> str:
-    """Render a type annotation as compact Markdown-safe text."""
+    """Render a type annotation as compact text. Caller is responsible for wrapping
+    in a code span; in GFM, code spans inside table cells can contain literal `|`."""
     annotation = unwrap_annotated(annotation)
     origin = typing.get_origin(annotation)
     if origin in (typing.Union, types.UnionType):
         args = typing.get_args(annotation)
-        return " \\| ".join(fmt_type(a) for a in args)
+        return " | ".join(fmt_type(a) for a in args)
     if origin is typing.Literal:
-        return " \\| ".join(repr(a) for a in typing.get_args(annotation))
+        return " | ".join(repr(a) for a in typing.get_args(annotation))
     if origin is list:
         return f"list[{fmt_type(typing.get_args(annotation)[0])}]"
     if origin is dict:
@@ -203,7 +204,7 @@ def render_field_row(
     docstring: str,
 ) -> None:
     name = f"`{path}`"
-    type_str = fmt_type(field.annotation)
+    type_str = f"`{fmt_type(field.annotation)}`"
     default = fmt_default(field)
     constraints = fmt_constraints(field)
     desc = (field.description or docstring or "").strip().replace("\n", " ")
@@ -212,6 +213,26 @@ def render_field_row(
     writer.raw(f"| {name} | {type_str} | {default} | {desc} |\n")


+def _list_inner_models(annotation: object) -> list[type[BaseModel]] | None:
+    """If `annotation` is `list[X]` where X is a `BaseModel` (possibly through
+    `Annotated[...]`) or a discriminated `Union[A | B | ...]` of `BaseModel`s,
+    return the list of model classes; otherwise return None.
+    """
+    annotation = unwrap_annotated(annotation)
+    if typing.get_origin(annotation) is not list:
+        return None
+    args = typing.get_args(annotation)
+    if not args:
+        return None
+    inner = unwrap_annotated(args[0])
+    if is_pydantic_model(inner):
+        return [inner]
+    union = _union_models(inner)
+    if union:
+        return union
+    return None
+
+
 def render_model(
     writer: Writer,
     model_cls: type[BaseModel],
@@ -223,6 +244,7 @@ def render_model(
     """Render the fields of `model_cls` and recurse into nested BaseConfig sub-fields."""
     docstrings = _extract_field_docstrings(model_cls)
     nested: list[tuple[str, type[BaseModel], FieldInfo, str]] = []
+    list_nested: list[tuple[str, list[type[BaseModel]], FieldInfo, str]] = []
     union_fields: list[tuple[str, list[type[BaseModel]], FieldInfo, str]] = []
     flat_fields: list[tuple[str, FieldInfo, str]] = []

@@ -245,6 +267,11 @@ def render_model(
             if len(args) == 1 and is_pydantic_model(args[0]):
                 nested.append((full, args[0], field, ds))
                 continue
+        # list[BaseConfig] or list[Annotated[Union[...], discriminator]] case
+        list_models = _list_inner_models(unwrapped)
+        if list_models is not None:
+            list_nested.append((full, list_models, field, ds))
+            continue
         flat_fields.append((full, field, ds))

     if flat_fields:
@@ -266,6 +293,39 @@ def render_model(
             continue
         render_model(writer, child_cls, full, sub_anchor, depth + 1, seen | {child_cls})

+    for full, item_models, field, ds in list_nested:
+        sub_anchor = anchor_prefix + [full.split(".")[-1]]
+        # Index placeholder matches the CLI / TOML form: --orchestrator.train.env.0.id
+        item_path = f"{full}."
+        heading = f"`{item_path}` (list item)"
+        writer.h(min(depth + 1, 6), heading, anchor=slug(sub_anchor))
+        blurb = (field.description or ds or "").strip()
+        if blurb:
+            writer.p(blurb)
+        if len(item_models) == 1:
+            child_cls = item_models[0]
+            if child_cls in seen:
+                writer.p(f"_Recursive reference to_ `{child_cls.__name__}` _omitted._")
+                continue
+            render_model(writer, child_cls, item_path, sub_anchor, depth + 1, seen | {child_cls})
+        else:
+            # Discriminated union of list items (e.g. `filters: list[FilterConfig]`)
+            type_field = field.discriminator or "type"
+            writer.p(
+                f"Discriminated list-item union — set `{item_path}.{type_field}` to one of "
+                + ", ".join(f"`{_type_literal(v, type_field)}`" for v in item_models)
+                + " and provide the matching sub-fields."
+            )
+            for variant in item_models:
+                type_literal = _type_literal(variant, type_field)
+                var_anchor = sub_anchor + [type_literal or variant.__name__.lower()]
+                writer.h(
+                    min(depth + 2, 6),
+                    f'`{item_path}.{type_field} = "{type_literal}"` ({variant.__name__})',
+                    anchor=slug(var_anchor),
+                )
+                render_model(writer, variant, item_path, var_anchor, depth + 2, seen | {variant})
+
     for full, variants, field, ds in union_fields:
         sub_anchor = anchor_prefix + [full.split(".")[-1]]
         heading = f"`{full}`"
@@ -322,7 +382,21 @@ def render_toc(writer: Writer) -> str:
 HEADER = """# Reference

 This page documents every field accepted by every prime-rl entrypoint. It is
-auto-generated from the Pydantic config models; do not edit by hand.
+auto-generated; do not edit by hand.
+
+"""
+
+
+FOOTER = """\
+## About this page
+
+Each entrypoint section walks its config tree top-down. Nested sub-configs
+appear under headings named after their dotted path (e.g. `trainer.model.ac`).
+List-typed sub-configs (e.g. `[[orchestrator.train.env]]`) appear under
+headings with a `` index placeholder — that's the CLI form too
+(`--orchestrator.train.env.0.id ...`).
Discriminated unions (loss, advantage,
+scheduler, optimizer, …) document each variant in turn — set the `type` field
+to pick one.

 To regenerate, run from the project root:

@@ -330,15 +404,9 @@ def render_toc(writer: Writer) -> str:
 uv run python scripts/generate_docs_reference.py
 ```

-Each entrypoint section walks its config tree top-down. Nested sub-configs
-appear under headings named after their dotted path (e.g. `trainer.model.ac`).
-Discriminated unions (loss, advantage, scheduler, optimizer, …) document each
-variant in turn — set the `type` field to pick one.
-
 For conceptual context behind these knobs, see
 [Configuration](configuration.md), [Training](training.md),
 [Scaling](scaling.md), [Algorithms](algorithms.md), and [Advanced](advanced.md).
-
 """


@@ -350,10 +418,11 @@ def main() -> int:
     body = Writer()
     for ep in ENTRYPOINTS:
         render_entrypoint(body, ep)
-    # Stitch: header + TOC built from body's headings + body content.
+    # Stitch: header + TOC built from body's headings + body content + footer.
     writer.raw(render_toc(body))
     writer.raw("---\n\n")
     writer.raw(body.buf.getvalue())
+    writer.raw(FOOTER)

     OUT_PATH.write_text(writer.buf.getvalue())
     print(f"Wrote {OUT_PATH} ({writer.buf.tell()} chars)")

From 21d4296673fa8fab411f600fefbb82efedc5b0cf Mon Sep 17 00:00:00 2001
From: Mika Senghaas
Date: Mon, 25 May 2026 23:42:26 +0000
Subject: [PATCH 48/66] docs(training): correct batch-size rule of thumb

128-512 is the range for quick ablations, not production. Production RL often runs at 1024+.

Co-authored-by: Cursor
---
 docs/training.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/training.md b/docs/training.md
index 2c3531d863..a5f12745a0 100644
--- a/docs/training.md
+++ b/docs/training.md
@@ -321,7 +321,7 @@ Requires `PRIME_API_KEY` (set via `prime login` or env var) and an allowlisted t

 ## Rules of thumb

 - **Start small.** Run `examples/reverse_text/rl.toml` end-to-end on 2 GPUs before scaling.
If the smoke run finishes cleanly, your install is good.
-- **Batch size ≥ 64.** Smaller batches give noisy gradient estimates and the trainer's overhead-per-step dominates throughput. 64 is the practical floor; 128–512 is typical for production RL.
+- **Batch size ≥ 64.** Smaller batches give noisy gradient estimates and the trainer's overhead-per-step dominates throughput. 64 is the practical floor; 128–512 is the range for quick ablations; production RL often runs at 1024+.
 - **Group size ≥ 8.** Bigger groups (`orchestrator.group_size`) make it more likely that a task produces a mix of high- and low-reward rollouts, which is what gives the trainer a usable signal — if all rollouts in a group succeed or all fail, the within-group advantage collapses to zero and the trainer learns nothing from that task. Bigger groups also tighten advantage normalization. 8 is the floor; 16–32 is common.
 - **Pin `output_dir` per run.** Sharing a directory across runs will mix rollouts and break resumes. `--output-dir outputs/` is the simplest discipline.
 - **Use `--dry-run` before SLURM.** Validators (CP needs flash-attention, NCCL broadcast needs `max_async_level=1`, etc.) fail fast in dry-run and slow in queue.

From 7e7ad004d4333b8fb4a43efb7a94682174efd587 Mon Sep 17 00:00:00 2001
From: Mika Senghaas
Date: Mon, 25 May 2026 23:44:32 +0000
Subject: [PATCH 49/66] docs: relocate "Disaggregated prefill/decode inference" to Advanced
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

P/D disaggregation is a feature you opt into for large-MoE serving, not a step on the single-GPU -> 1000-GPU scaling ladder. It pairs naturally with Custom modeling / multi-tenant / multimodal as a specialized inference topology, so Advanced is the right home.

- scaling.md: drop the section + TOC entry + "disaggregated prefill/decode inference" from the page intro. Page intro now forward-links to Advanced for users who came in for P/D.
- advanced.md: append the section after Multi-tenant training, unchanged content (P:D ratio table, glm5_pd_disag example link, queue-depth monitoring snippet, UCX 1.19 build-from-source note). TOC + page-intro list updated.
- overview.md "Where to go next": drop disagg from the Scaling bullet, add to the Advanced bullet.

Anchor preserved (#disaggregated-prefilldecode-inference) — no external doc links to it survived the move check.

Co-authored-by: Cursor
---
 docs/advanced.md | 31 ++++++++++++++++++++++++++++++-
 docs/overview.md | 4 ++--
 docs/scaling.md | 31 +------------------------------
 3 files changed, 33 insertions(+), 33 deletions(-)

diff --git a/docs/advanced.md b/docs/advanced.md
index 76fc39c5a0..ba09980a99 100644
--- a/docs/advanced.md
+++ b/docs/advanced.md
@@ -1,6 +1,6 @@
 # Advanced

-This page covers the specialized features layered on top of the core training stack: our custom model implementations (with EP for MoE families and CP for long-context training), multimodal training, LoRA training, and multi-tenant training. For developer-side workflows (adding new model architectures, debugging modeling code at small scale), see [Development](development.md).
+This page covers the specialized features layered on top of the core training stack: our custom model implementations (with EP for MoE families and CP for long-context training), multimodal training, LoRA training, multi-tenant training, and disaggregated prefill/decode inference. For developer-side workflows (adding new model architectures, debugging modeling code at small scale), see [Development](development.md).
## Table of Contents

@@ -13,6 +13,7 @@ This page covers the specialized features layered on top of the core training st
 - [Limitations](#limitations)
 - [LoRA training](#lora-training)
 - [Multi-tenant training](#multi-tenant-training)
+- [Disaggregated prefill/decode inference](#disaggregated-prefilldecode-inference)

 ## Custom modeling

@@ -119,3 +120,31 @@ LoRA pairs naturally with [multi-tenant training](#multi-tenant-training) — ea
 ## Multi-tenant training

 Multi-tenant training lets a single trainer + inference deployment serve many concurrent LoRA "tenants" — each a fully isolated run with its own orchestrator, LoRA adapter, optimizer, scheduler, checkpoints, and progress tracking — sharing the same backbone weights and the same vLLM server. This is the topology behind hosted training on the [Prime Intellect platform (Lab)](https://app.primeintellect.ai). The trainer-side implementation is the `MultiRunManager` singleton, enabled by setting `trainer.max_concurrent_runs > 1`. For the full API surface, see [`src/prime_rl/trainer/runs/`](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/src/prime_rl/trainer/runs).
+
+## Disaggregated prefill/decode inference
+
+For large MoE serving, splitting prefill and decode onto separate vLLM groups can substantially improve throughput. Pick the prefill:decode ratio based on workload shape:
+
+| Workload | P:D ratio | Why |
+|---|---|---|
+| Agentic (SWE, Lean) | 3:1 | Long growing contexts → prefill-heavy |
+| Non-agentic (math, chat) | 1:2 | Short prompts, long generations → decode-heavy |
+
+Example config: [`examples/glm5_pd_disag/rl.toml`](https://github.com/PrimeIntellect-ai/prime-rl/blob/main/examples/glm5_pd_disag/rl.toml) — full RL run on `GLM-5` with P/D disaggregation behind a `vllm-router`, FP8 inference, and NCCL weight broadcast (see the [README](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/examples/glm5_pd_disag) for the launch story).
+
+Monitor live queue depths to detect imbalance:
+
+```bash
+curl -s http://:8100/metrics | grep num_requests_waiting
+curl -s http://:8200/metrics | grep num_requests_waiting
+```
+
+If prefill queues and decode is idle, add prefill nodes (and vice versa).
+
+**UCX 1.19 requirement.** NVSHMEM needs UCX ≥ 1.19 for multi-GPU CUDA. Most clusters ship UCX 1.17 via HPC-X, which manifests as `cuStreamCreate: invalid device context` errors during DeepEP internode dispatch. Check with `/opt/hpcx/ucx/bin/ucx_info -v` and, if needed, build from source:
+
+```bash
+salloc -N 1 --gres=gpu:1 bash -c 'bash scripts/install_nixl_from_source.sh'
+```
+
+The script writes UCX 1.19 to `third_party/ucx/`; the bundled sbatch templates prepend it to `LD_LIBRARY_PATH` so it overrides the system version.
diff --git a/docs/overview.md b/docs/overview.md
index bbbbcf6719..cd51166006 100644
--- a/docs/overview.md
+++ b/docs/overview.md
@@ -40,9 +40,9 @@ The `rl` entrypoint reads `examples/reverse_text/rl.toml`, splits it into per-pr

 - **[Configuration](configuration.md)** — How TOML files, `@` composition, CLI overrides, and env vars combine; the precedence rules; worked examples.
 - **[Training](training.md)** — End-to-end recipes for RL, SFT, and evals; checkpointing and resume; observability (logs, W&B, Prometheus, platform monitoring); rules of thumb and common issues.
-- **[Scaling](scaling.md)** — Single-GPU through 1000+ GPU; FSDP / EP / CP knobs; SLURM and Kubernetes guides; disaggregated prefill/decode inference; benchmarking.
+- **[Scaling](scaling.md)** — Single-GPU through 1000+ GPU; FSDP / EP / CP knobs; SLURM and Kubernetes guides; benchmarking.
 - **[Algorithms](algorithms.md)** — Async / off-policy semantics; the default loss; built-in and custom losses, advantages, and filters; multi-turn trajectory merging.
-- **[Advanced](advanced.md)** — Custom modeling (EP backends, custom impls); multimodal training; LoRA + multi-tenant training.
+- **[Advanced](advanced.md)** — Custom modeling (EP backends, custom impls); multimodal training; LoRA + multi-tenant training; disaggregated prefill/decode inference.
 - **[Development](development.md)** — Test suite (unit / integration / nightly), pre-commit hooks, adding a new model architecture.
 - **[Reference](reference.md)** — Auto-generated field-by-field reference for every entrypoint config.
 - **[FAQs](faqs.md)** — Quick answers to recurring questions.
diff --git a/docs/scaling.md b/docs/scaling.md
index b21764b89d..2349983dca 100644
--- a/docs/scaling.md
+++ b/docs/scaling.md
@@ -1,6 +1,6 @@
 # Scaling

-This page covers how to scale `prime-rl` from a single GPU to a 1000-GPU cluster: single-node multi-GPU layouts, multi-node SLURM and Kubernetes deployments, FSDP / expert parallelism / context parallelism, disaggregated prefill/decode inference, and throughput benchmarking. For knobs that fit on one box, see [Training](training.md) first.
+This page covers how to scale `prime-rl` from a single GPU to a 1000-GPU cluster: single-node multi-GPU layouts, multi-node SLURM and Kubernetes deployments, FSDP / expert parallelism / context parallelism, and throughput benchmarking. For knobs that fit on one box, see [Training](training.md) first. For prefill/decode disaggregated inference, see [Advanced](advanced.md#disaggregated-prefilldecode-inference).

 ## Table of Contents

@@ -27,7 +27,6 @@ This page covers how to scale `prime-rl` from a single GPU to a 1000-GPU cluster
   - [SFT and inference examples](#sft-and-inference-examples)
   - [Custom templates](#custom-templates)
 - [Kubernetes](#kubernetes)
-- [Disaggregated prefill/decode inference](#disaggregated-prefilldecode-inference)
 - [Benchmarking](#benchmarking)

 ## Choosing a layout
@@ -487,34 +486,6 @@ torchrun \

 Common operations (logs, exec, scale, uninstall) are standard `kubectl`/`helm`. Auth (W&B, HF) is via K8s secrets — set `config.secrets.enabled=true` and `config.secrets.name=`.
-## Disaggregated prefill/decode inference
-
-For large MoE serving, splitting prefill and decode onto separate vLLM groups can substantially improve throughput. Pick the prefill:decode ratio based on workload shape:
-
-| Workload | P:D ratio | Why |
-|---|---|---|
-| Agentic (SWE, Lean) | 3:1 | Long growing contexts → prefill-heavy |
-| Non-agentic (math, chat) | 1:2 | Short prompts, long generations → decode-heavy |
-
-Example config: [`examples/glm5_pd_disag/rl.toml`](https://github.com/PrimeIntellect-ai/prime-rl/blob/main/examples/glm5_pd_disag/rl.toml) — full RL run on `GLM-5` with P/D disaggregation behind a `vllm-router`, FP8 inference, and NCCL weight broadcast (see the [README](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/examples/glm5_pd_disag) for the launch story).
-
-Monitor live queue depths to detect imbalance:
-
-```bash
-curl -s http://:8100/metrics | grep num_requests_waiting
-curl -s http://:8200/metrics | grep num_requests_waiting
-```
-
-If prefill queues and decode is idle, add prefill nodes (and vice versa).
-
-**UCX 1.19 requirement.** NVSHMEM needs UCX ≥ 1.19 for multi-GPU CUDA. Most clusters ship UCX 1.17 via HPC-X, which manifests as `cuStreamCreate: invalid device context` errors during DeepEP internode dispatch. Check with `/opt/hpcx/ucx/bin/ucx_info -v` and, if needed, build from source:
-
-```bash
-salloc -N 1 --gres=gpu:1 bash -c 'bash scripts/install_nixl_from_source.sh'
-```
-
-The script writes UCX 1.19 to `third_party/ucx/`; the bundled sbatch templates prepend it to `LD_LIBRARY_PATH` so it overrides the system version.
-
 ## Benchmarking

 Every entrypoint supports a `--bench` flag that runs a few warm-up + measurement steps with fake data and prints a rich-formatted throughput / MFU table:

From ce4637edbb122214914f803a6f5e46eddd1bb1f4 Mon Sep 17 00:00:00 2001
From: Mika Senghaas
Date: Mon, 25 May 2026 23:46:03 +0000
Subject: [PATCH 50/66] docs(scaling): correct the Single GPU section

The previous claim "trainer and inference server can share a GPU" via the rl launcher was wrong. Verified against src/prime_rl/entrypoints/rl.py:86-99: the launcher partitions visible GPUs strictly (inference 0..N-1, trainer N..N+M-1) and raises ValueError when total_requested_gpus > len(physical_gpu_ids). Setting CUDA_VISIBLE_DEVICES=0 + --num-infer-gpus 1 + --num-train-gpus 1 makes total=2, visible=1, validation fails before anything launches.

What actually works for single-GPU RL is the manual three-pane launch: each of uv run {inference,orchestrator,trainer} is an independent process with no cross-process GPU validation, so pinning CUDA_VISIBLE_DEVICES=0 on inference *and* trainer lets them share the same physical GPU.

- Drop the misleading `uv run rl` recipe.
- Promote the manual three-pane recipe to the canonical single-GPU path, with CUDA_VISIBLE_DEVICES=0 spelled out on both the inference and trainer panes.
- Lead with SFT (where single-GPU is the default and just works).
- Add an explicit "single-GPU RL is for debugging only" caveat.

Co-authored-by: Cursor
---
 docs/scaling.md | 21 ++++++++-------------
 1 file changed, 8 insertions(+), 13 deletions(-)

diff --git a/docs/scaling.md b/docs/scaling.md
index 2349983dca..b95004a13c 100644
--- a/docs/scaling.md
+++ b/docs/scaling.md
@@ -41,29 +41,24 @@ This page covers how to scale `prime-rl` from a single GPU to a 1000-GPU cluster

 ## Single GPU

-The trainer and inference server can share a GPU for small models or smoke tests.
Pin everything to one physical GPU via `CUDA_VISIBLE_DEVICES`, set both deployment counts to 1, and tighten the inference memory budget so the trainer has room:
+For SFT, single-GPU is the default — `uv run sft` runs without torchrun unless you ask for multiple processes.
+
+For RL, the `uv run rl` launcher partitions visible GPUs strictly between inference and trainer (inference takes the first `num_infer_gpus`, trainer takes the next `num_train_gpus`), so it needs at least 2 visible GPUs. To smoke-test the full RL stack on a **single physical GPU**, launch the three processes manually in separate panes so they can each pin to the same GPU. Tighten the inference memory budget so the trainer has room:

 ```bash
 bash scripts/tmux.sh

-CUDA_VISIBLE_DEVICES=0 uv run rl @ configs//rl.toml \
-  --deployment.num-infer-gpus 1 \
-  --deployment.num-train-gpus 1 \
-  --inference.gpu-memory-utilization 0.5
-```
-
-Or launch the three processes manually if you want full control over each pane:
-
-```bash
 # inference pane
-uv run inference @ infer.toml --gpu-memory-utilization 0.5
+CUDA_VISIBLE_DEVICES=0 uv run inference @ infer.toml --gpu-memory-utilization 0.5
+
 # orchestrator pane
 uv run orchestrator @ orch.toml
+
 # trainer pane
-uv run trainer @ train.toml
+CUDA_VISIBLE_DEVICES=0 uv run trainer @ train.toml
 ```

-For SFT, single-GPU is the default — `uv run sft` runs without torchrun unless you ask for multiple processes.
+Single-GPU RL is for debugging only — production RL needs 2+ GPUs.

 ## Single-node multi-GPU

From b43311b9f150d07c8e7461108ba72f6d9d20c8e6 Mon Sep 17 00:00:00 2001
From: Mika Senghaas
Date: Mon, 25 May 2026 23:48:51 +0000
Subject: [PATCH 51/66] docs(scaling): collapse Single GPU / multi-GPU / Multi-node manual into one umbrella
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- New umbrella section "## Single-node vs.
multi-node deployment" framing the [deployment] discriminated union as the user-facing knob: single_node runs locally; multi_node currently goes through SLURM. Subsections nest beneath:
  - ### Single GPU (unchanged content from "## Single GPU")
  - ### Single-node multi-GPU (unchanged from "## Single-node multi-GPU")
    - #### RL placement
    - #### SFT and torchrun
  - ### Multi-node (new short pointer to ## SLURM with two cross-links to the existing RL / SFT-and-inference examples)
- The umbrella section opens with a callout that manual multi-node launches are technically possible but reimplement what the SLURM launcher does — the user's preferred framing.
- Drop "## Choosing a layout" entirely (the new umbrella section conveys the same routing more naturally + the layout table was going stale).
- Drop "## Multi-node (manual)" entirely (RL training, SFT training, Multi-node inference subsections all gone). Anyone who needs the manual recipe can replicate what the SLURM templates do.

Cross-link fixes:
- training.md SFT § Launch line previously pointed at scaling.md#sft-training (under "## Multi-node (manual)"). Now points at scaling.md#sft-and-torchrun for non-default single-node layouts and scaling.md#slurm for multi-node.
- faqs.md "Multi-node without SLURM or K8s?" answer updated: from "yes, see [Scaling § Multi-node (manual)]" to "not currently documented; technically possible but reimplements the SLURM launcher".

Page intro adjusted to match the new structure ("multi-node SLURM and Kubernetes deployments" -> "single-node and multi-node deployments").

Co-authored-by: Cursor
---
 docs/faqs.md | 2 +-
 docs/scaling.md | 122 ++++++++---------------------------------------
 docs/training.md | 2 +-
 3 files changed, 21 insertions(+), 105 deletions(-)

diff --git a/docs/faqs.md b/docs/faqs.md
index 6d0c49279e..cd588f0d6a 100644
--- a/docs/faqs.md
+++ b/docs/faqs.md
@@ -146,7 +146,7 @@ Yes.
The orchestrator pushes the resumed checkpoint into inference automatically ### Multi-node without SLURM or K8s? -Yes, see [Scaling § Multi-node (manual)](scaling.md#multi-node-manual). You need a shared filesystem and a reachable inference IP. Set the three `OUTPUT_DIR` / `INFERENCE_SERVER_IP` / `INFERENCE_SERVER_API_KEY` env vars on every node and launch each process by hand. +Not currently documented. Multi-node deployments go through [SLURM](scaling.md#slurm) — the launcher writes the sbatch script that wires inference, the orchestrator, and the trainer across nodes. Manual launches are technically possible (run `uv run inference` / `uv run orchestrator` / `uv run torchrun src/prime_rl/trainer/rl/train.py` on different nodes with shared filesystem + reachable inference IP), but you'd be re-implementing what the SLURM launcher already does. ### How big a difference does NCCL weight broadcast make? diff --git a/docs/scaling.md b/docs/scaling.md index b95004a13c..323c046c96 100644 --- a/docs/scaling.md +++ b/docs/scaling.md @@ -1,14 +1,15 @@ # Scaling -This page covers how to scale `prime-rl` from a single GPU to a 1000-GPU cluster: single-node multi-GPU layouts, multi-node SLURM and Kubernetes deployments, FSDP / expert parallelism / context parallelism, and throughput benchmarking. For knobs that fit on one box, see [Training](training.md) first. For prefill/decode disaggregated inference, see [Advanced](advanced.md#disaggregated-prefilldecode-inference). +This page covers how to scale `prime-rl` from a single GPU to a 1000-GPU cluster: single-node and multi-node deployments, FSDP / expert parallelism / context parallelism, and throughput benchmarking. For knobs that fit on one box, see [Training](training.md) first. For prefill/decode disaggregated inference, see [Advanced](advanced.md#disaggregated-prefilldecode-inference). 
## Table of Contents -- [Choosing a layout](#choosing-a-layout) -- [Single GPU](#single-gpu) -- [Single-node multi-GPU](#single-node-multi-gpu) - - [RL placement](#rl-placement) - - [SFT and torchrun](#sft-and-torchrun) +- [Single-node vs. multi-node deployment](#single-node-vs-multi-node-deployment) + - [Single GPU](#single-gpu) + - [Single-node multi-GPU](#single-node-multi-gpu) + - [RL placement](#rl-placement) + - [SFT and torchrun](#sft-and-torchrun) + - [Multi-node](#multi-node) - [Parallelism knobs](#parallelism-knobs) - [FSDP](#fsdp) - [Expert parallelism](#expert-parallelism) @@ -16,10 +17,6 @@ This page covers how to scale `prime-rl` from a single GPU to a 1000-GPU cluster - [Activation checkpointing and offloading](#activation-checkpointing-and-offloading) - [CPU optimizer offload](#cpu-optimizer-offload) - [Memory-tight recipe](#memory-tight-recipe) -- [Multi-node (manual)](#multi-node-manual) - - [RL training](#rl-training) - - [SFT training](#sft-training) - - [Multi-node inference](#multi-node-inference) - [SLURM](#slurm) - [Activation](#activation) - [`[slurm]` and `[deployment]` reference](#slurm-and-deployment-reference) @@ -29,17 +26,13 @@ This page covers how to scale `prime-rl` from a single GPU to a 1000-GPU cluster - [Kubernetes](#kubernetes) - [Benchmarking](#benchmarking) -## Choosing a layout +## Single-node vs. 
multi-node deployment -| You have… | Use this layout | -|---|---| -| 1 node, 2–8 GPUs | `uv run rl` with `--deployment.num-infer-gpus N --deployment.num-train-gpus M` | -| 1 node, 8 GPUs, large MoE | Custom impl + EP + activation checkpointing | -| 2+ nodes, SLURM | `[slurm]` + `[deployment]` overlay (recommended) | -| 2+ nodes, no SLURM | Manual `uv run inference` + `uv run orchestrator` + `uv run torchrun src/.../train.py` | -| Kubernetes | The bundled Helm chart at `k8s/prime-rl` | +The `rl`, `sft`, and `inference` entrypoints all accept a `[deployment]` block (`type = "single_node"` or `"multi_node"`) that picks how the trainer / orchestrator / inference processes are placed across hardware. **Single-node** runs locally; **multi-node** currently goes through [SLURM](#slurm) — the launcher writes an sbatch script that places inference replicas, the orchestrator, and the trainer with the right rendezvous endpoints, IPs, ports, and shared-filesystem paths wired in. + +> Manual multi-node launches (`uv run inference` on one set of nodes, `uv run orchestrator` on another, `uv run torchrun src/prime_rl/trainer/rl/train.py` on the trainer nodes) are technically possible — that's what the SLURM launcher does for you under the hood — but you'd be wiring rendezvous endpoints, inference IPs and API keys, the rollout/weight-broadcast paths, and the shared filesystem mounts by hand. We don't currently document that path. -## Single GPU +### Single GPU For SFT, single-GPU is the default — `uv run sft` runs without torchrun unless you ask for multiple processes. @@ -60,9 +53,9 @@ CUDA_VISIBLE_DEVICES=0 uv run trainer @ train.toml Single-GPU RL is for debugging only — production RL needs 2+ GPUs. -## Single-node multi-GPU +### Single-node multi-GPU -### RL placement +#### RL placement `rl` defaults to 1 trainer GPU and 1 inference GPU. 
To give inference 6 GPUs with data parallelism and the trainer the remaining 2 on an 8-GPU node: @@ -90,7 +83,7 @@ CUDA_VISIBLE_DEVICES=2,3 uv run rl @ rl.toml \ --output-dir outputs/exp2 ``` -### SFT and torchrun +#### SFT and torchrun `uv run sft` manages torchrun internally — you don't need to call torchrun yourself. To scale from 1 to N GPUs, set the deployment GPU count (or just let it pick up `WORLD_SIZE`). For non-default layouts, the manual equivalent is: @@ -103,6 +96,10 @@ uv run torchrun \ `--local-ranks-filter 0` keeps console output to rank 0 only; per-rank stdout/stderr is still captured in `/logs/trainer/torchrun/`. +### Multi-node + +Multi-node deployments (RL or SFT) are launched via [SLURM](#slurm) — set `[deployment] type = "multi_node"` plus the matching `[slurm]` block, and the launcher writes the sbatch script that places inference, orchestrator, and trainer across the requested nodes with the inter-process wiring set up correctly. See [SLURM § RL example](#rl-example) and [SLURM § SFT and inference examples](#sft-and-inference-examples) for full configs. + ## Parallelism knobs ### FSDP @@ -198,87 +195,6 @@ max_inflight_activations = 1 Walks through every memory lever in order: FSDP+EP shard the weights, CP shards the activations along the token dim, AC + AC offloading shrink the activation footprint, fused LM head chunks the loss, `torch.compile` reduces fragmentation, optim offload moves Adam state off GPU. Apply selectively — each knob has a throughput cost. -## Multi-node (manual) - -When you don't have SLURM (or want fine-grained control), launch each process by hand. Multi-node RL currently requires a **shared filesystem** for the rollout transport and the weight broadcast. - -### RL training - -```bash -# On all nodes -export OUTPUT_DIR=/shared/outputs/my-run -export INFERENCE_SERVER_IP=10.0.0.1 -export INFERENCE_SERVER_API_KEY=... 
-``` - -```bash -# Inference node -uv run inference @ infer.toml \ - --api-key $INFERENCE_SERVER_API_KEY \ - --parallel.tp 4 --parallel.dp 2 - -# Orchestrator (either node) -uv run orchestrator @ orch.toml \ - --client.base-url http://$INFERENCE_SERVER_IP:8000/v1 \ - --client.api-key-var INFERENCE_SERVER_API_KEY \ - --output-dir $OUTPUT_DIR - -# Trainer node -uv run torchrun \ - --nproc-per-node 8 \ - --local-ranks-filter 0 \ - src/prime_rl/trainer/rl/train.py @ train.toml \ - --output-dir $OUTPUT_DIR -``` - -You can scale inference and trainer independently — multiple inference nodes (each running its own vLLM replica), one orchestrator, one or more trainer nodes. The orchestrator must be a single instance. - -### SFT training - -For multi-node SFT, point torchrun at a rendezvous endpoint: - -```bash -# On all nodes -export MASTER_ADDR=10.0.0.1 -export MASTER_PORT=29500 -export GLOO_SOCKET_IFNAME=... # only if default isn't routable -export NCCL_SOCKET_IFNAME=... - -# Node 0 -uv run torchrun \ - --nnodes 2 --node-rank 0 \ - --rdzv-endpoint=$MASTER_ADDR:$MASTER_PORT \ - --local-ranks-filter 0 \ - --nproc-per-node 8 \ - src/prime_rl/trainer/sft/train.py @ sft.toml - -# Node 1 — same but --node-rank 1 -``` - -If your nodes aren't colocated, set up a VPN (e.g. Tailscale) and use the VPN-resolvable IP for `MASTER_ADDR`. - -### Multi-node inference - -Multi-node vLLM uses native data parallelism — see the [vLLM docs](https://docs.vllm.ai/en/v0.10.0/serving/data_parallel_deployment.html). 
For TP=4, DP=4, two nodes: - -```bash -# Node 0 — DP ranks 0,1 -uv run inference \ - --parallel.tp 4 --parallel.dp 4 \ - --data-parallel-size-local 2 \ - --data-parallel-address $DATA_PARALLEL_ADDRESS \ - --data-parallel-rpc-port $DATA_PARALLEL_RPC_PORT - -# Node 1 — DP ranks 2,3 (headless) -uv run inference \ - --parallel.tp 4 --parallel.dp 4 \ - --data-parallel-size-local 2 \ - --data-parallel-address $DATA_PARALLEL_ADDRESS \ - --data-parallel-rpc-port $DATA_PARALLEL_RPC_PORT \ - --data-parallel-start-rank 2 \ - --headless -``` - ## SLURM The `rl`, `sft`, and `inference` entrypoints all submit to SLURM when a `[slurm]` table is present — there's no separate entrypoint. diff --git a/docs/training.md b/docs/training.md index a5f12745a0..f92bc871f2 100644 --- a/docs/training.md +++ b/docs/training.md @@ -156,7 +156,7 @@ The minimal SFT run trains `Qwen3-0.6B` on the `reverse-text` SFT dataset: uv run sft @ examples/reverse_text/sft.toml --wandb ``` -Multi-GPU and multi-node use torchrun under the hood (the `sft` entrypoint manages this for you — see [Scaling § SFT training](scaling.md#sft-training) for non-default layouts). +Multi-GPU and multi-node use torchrun under the hood (the `sft` entrypoint manages this for you — see [Scaling § SFT and torchrun](scaling.md#sft-and-torchrun) for non-default layouts; multi-node SFT goes through [SLURM](scaling.md#slurm)). ### SFT-specific knobs From 0cbaff9fbbbf068c3a5060785ef57b5b9dea2c7f Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Mon, 25 May 2026 23:50:23 +0000 Subject: [PATCH 52/66] docs: drop Kubernetes coverage for now MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The K8s Helm chart at k8s/prime-rl/ still ships, but the user-facing docs are dropping coverage until the chart and the matching guide are re-validated together. - scaling.md: drop the "## Kubernetes" section + TOC entry. Page intro already covered (was reworded earlier in the restructure). 
- overview.md "Where to go next": drop "Kubernetes guides" from the Scaling bullet. - README.md docs index: drop "Kubernetes" from the Scaling bullet. The two passing-mention "k8s" / "Kubernetes" lines in README (Overview features list, Advanced Training Examples adaptability note) are left as-is — they describe codebase capability, not docs coverage. Reference.md still mentions Kubernetes liveness probes in an auto-generated field docstring; that's source-side, out of scope for this pass. Co-authored-by: Cursor --- README.md | 2 +- docs/overview.md | 2 +- docs/scaling.md | 35 ----------------------------------- 3 files changed, 2 insertions(+), 37 deletions(-) diff --git a/README.md b/README.md index ff7a16086c..350897f643 100644 --- a/README.md +++ b/README.md @@ -220,7 +220,7 @@ Check out the [docs](docs) directory for in-depth guides on how to use PRIME-RL. - [**Overview**](docs/overview.md) - Architecture, install, and a copy-pasteable end-to-end RL run - [**Configuration**](docs/configuration.md) - TOML composition, CLI overrides, env vars, validation - [**Training**](docs/training.md) - RL, SFT, evals, checkpointing, observability, rules of thumb -- [**Scaling**](docs/scaling.md) - Single-GPU through multi-node, FSDP/EP/CP, SLURM, Kubernetes, disaggregated inference, benchmarking +- [**Scaling**](docs/scaling.md) - Single-GPU through multi-node, FSDP/EP/CP, SLURM, benchmarking - [**Algorithms**](docs/algorithms.md) - Async/off-policy training, the AIPO loss, advantage and filter plugins, trajectory merging - [**Advanced**](docs/advanced.md) - Custom modeling, multimodal training, LoRA, multi-tenant training - [**Development**](docs/development.md) - Test suite, pre-commit hooks, adding a new model architecture, debugging MoE diff --git a/docs/overview.md b/docs/overview.md index cd51166006..f21e0536ef 100644 --- a/docs/overview.md +++ b/docs/overview.md @@ -40,7 +40,7 @@ The `rl` entrypoint reads `examples/reverse_text/rl.toml`, splits it into per-pr - 
**[Configuration](configuration.md)** — How TOML files, `@` composition, CLI overrides, and env vars combine; the precedence rules; worked examples. - **[Training](training.md)** — End-to-end recipes for RL, SFT, and evals; checkpointing and resume; observability (logs, W&B, Prometheus, platform monitoring); rules of thumb and common issues. -- **[Scaling](scaling.md)** — Single-GPU through 1000+ GPU; FSDP / EP / CP knobs; SLURM and Kubernetes guides; benchmarking. +- **[Scaling](scaling.md)** — Single-GPU through 1000+ GPU; FSDP / EP / CP knobs; SLURM; benchmarking. - **[Algorithms](algorithms.md)** — Async / off-policy semantics; the default loss; built-in and custom losses, advantages, and filters; multi-turn trajectory merging. - **[Advanced](advanced.md)** — Custom modeling (EP backends, custom impls); multimodal training; LoRA + multi-tenant training; disaggregated prefill/decode inference. - **[Development](development.md)** — Test suite (unit / integration / nightly), pre-commit hooks, adding a new model architecture. diff --git a/docs/scaling.md b/docs/scaling.md index 323c046c96..9357a242a0 100644 --- a/docs/scaling.md +++ b/docs/scaling.md @@ -23,7 +23,6 @@ This page covers how to scale `prime-rl` from a single GPU to a 1000-GPU cluster - [RL example](#rl-example) - [SFT and inference examples](#sft-and-inference-examples) - [Custom templates](#custom-templates) -- [Kubernetes](#kubernetes) - [Benchmarking](#benchmarking) ## Single-node vs. multi-node deployment @@ -363,40 +362,6 @@ uv run rl @ my_config.toml --slurm.template-path path/to/my_template.sbatch.j2 The default templates live under [`src/prime_rl/templates/`](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/src/prime_rl/templates) — copy one as a starting point. -## Kubernetes - -For Kubernetes-managed clusters, `prime-rl` ships a Helm chart at [`k8s/prime-rl`](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/k8s/prime-rl). 
It deploys three StatefulSets (orchestrator, trainer, inference) sharing a single `ReadWriteMany` PVC mounted at `/data`. - -```bash -# Deploy with an example values file -helm install my-exp ./k8s/prime-rl -f ./k8s/prime-rl/examples/reverse-text.yaml - -# Or with custom overrides -helm install my-exp ./k8s/prime-rl --set trainer.replicas=3 --set inference.replicas=2 -``` - -After deployment, `kubectl exec` into `-trainer-0` and launch with `uv run trainer @ ` (or `uv run rl @ `). All three pod groups discover each other via stable DNS hostnames (`-{trainer,orchestrator,inference}-.-{...}-headless..svc.cluster.local`). - -Environment variables provided to every pod: - -- `$POD_NAME`, `$POD_IP` — standard K8s -- `$STATEFUL_REPLICAS` — total replicas for this component -- `$HEADLESS_SERVICE` — DNS suffix for peer discovery -- `$INFERENCE_URL` — first inference pod's URL (set in orchestrator and trainer pods) - -For distributed trainer launches inside K8s, extract the rank from the pod name and feed it to torchrun: - -```bash -RANK=$(echo $POD_NAME | grep -o '[0-9]*$') -torchrun \ - --nnodes=$STATEFUL_REPLICAS --node-rank=$RANK \ - --nproc-per-node=8 \ - --rdzv-endpoint=my-exp-trainer-0.$HEADLESS_SERVICE:29501 \ - src/prime_rl/trainer/rl/train.py @ /data/configs/train.toml -``` - -Common operations (logs, exec, scale, uninstall) are standard `kubectl`/`helm`. Auth (W&B, HF) is via K8s secrets — set `config.secrets.enabled=true` and `config.secrets.name=`. 
- ## Benchmarking Every entrypoint supports a `--bench` flag that runs a few warm-up + measurement steps with fake data and prints a rich-formatted throughput / MFU table: From 3c9509b8af7d27aade7c8aa9c6c7012ac1a39ca5 Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Mon, 25 May 2026 23:50:56 +0000 Subject: [PATCH 53/66] docs(development): tighten test-suite layout bullets MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Drop the example test filenames from the tests/integration/ bullet — they're a moving list and not the point. - Reframe the tests/nightly/ bullet around what it does (runs the examples/ configs to catch regressions) instead of listing the individual nightly tests by name. Co-authored-by: Cursor --- docs/development.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/development.md b/docs/development.md index 2a27c3d809..d4886807f7 100644 --- a/docs/development.md +++ b/docs/development.md @@ -22,8 +22,8 @@ The test suite is split into three tiers, each with its own CI workflow. ### Layout - **`tests/unit/`** — fast-running, hermetic tests for isolated logic: config parsing and validation, advantage / loss / scheduler / packer math, individual dataset paths, model-conversion roundtrips, etc. Tests that need a GPU are tagged with the `gpu` marker. -- **`tests/integration/`** — full-stack RL/SFT runs on a tiny model end-to-end through inference + orchestrator + trainer (e.g. `test_reverse_text.py`, `test_reverse_text_lora.py`, `test_reverse_text_moe.py`, `test_reverse_text_multi_run.py`, `test_alphabet_sort.py`). -- **`tests/nightly/`** — long-running training runs against shipped configs and real environments (`hendrycks_sanity`, `acereason_math`, `multimodal_color_codeword`, `wiki_search`, `wordle`, …). Each runs to completion on the research cluster with a 24h timeout. 
+- **`tests/integration/`** — full-stack RL/SFT runs on a tiny model end-to-end through inference + orchestrator + trainer. +- **`tests/nightly/`** — runs the configs in [`examples/`](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/examples) on the research cluster every night to catch regressions in the shipped examples. 24h timeout per run. ### Running tests locally From 094ad233fcd401af3ea86dfd38bec2ce67185951 Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Mon, 25 May 2026 23:51:18 +0000 Subject: [PATCH 54/66] docs(development): drop the nightly 24h-timeout / research-cluster aside Co-authored-by: Cursor --- docs/development.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/development.md b/docs/development.md index d4886807f7..3d4e7aac83 100644 --- a/docs/development.md +++ b/docs/development.md @@ -23,7 +23,7 @@ The test suite is split into three tiers, each with its own CI workflow. - **`tests/unit/`** — fast-running, hermetic tests for isolated logic: config parsing and validation, advantage / loss / scheduler / packer math, individual dataset paths, model-conversion roundtrips, etc. Tests that need a GPU are tagged with the `gpu` marker. - **`tests/integration/`** — full-stack RL/SFT runs on a tiny model end-to-end through inference + orchestrator + trainer. -- **`tests/nightly/`** — runs the configs in [`examples/`](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/examples) on the research cluster every night to catch regressions in the shipped examples. 24h timeout per run. +- **`tests/nightly/`** — runs the configs in [`examples/`](https://github.com/PrimeIntellect-ai/prime-rl/tree/main/examples) every night to catch regressions in the shipped examples. 
### Running tests locally From 09bf4df7e8521c75c9fd8ace2700211bf689e415 Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Mon, 25 May 2026 23:52:16 +0000 Subject: [PATCH 55/66] docs(development): polish 'Adding a new architecture' prose - Replace the bare 'To add (e.g.) Kimi 2.5:' opener with a one-line framing of the two-step contract: implement modeling code, register a mini preset for smoke-testing. - Bold the leading verb on each numbered step so the structure reads as a checklist. - Step 1 now nudges readers at glm4_moe/ and qwen3_moe/ as templates for the modeling code. - Step 2 explains *what* the preset is for ('build a ~0.5B test model in your architecture') rather than just listing fields. Path now links to scripts/mini_moe.py. - Step 3 says what the smoke-test actually exercises (roundtrip + SFT + RL stack) so users know what 'smoke-test' means here. Co-authored-by: Cursor --- docs/development.md | 39 ++++++++++++++++++++------------------- 1 file changed, 20 insertions(+), 19 deletions(-) diff --git a/docs/development.md b/docs/development.md index 3d4e7aac83..603f679be6 100644 --- a/docs/development.md +++ b/docs/development.md @@ -68,25 +68,26 @@ The configured hooks: ## Adding a new architecture -To add (e.g.) Kimi 2.5: - -1. Add the modeling code under `src/prime_rl/trainer/models//`. -2. Add a preset to `scripts/mini_moe.py` with the config class, small dimensions, HF + PrimeRL model classes, and tokenizer source: - -```python -ARCH_PRESETS = { - "glm4_moe": { - "config_class": Glm4MoeConfig, - "config_kwargs": dict(hidden_size=1024, num_hidden_layers=24, n_routed_experts=8, ...), - "hf_model_class": HFGlm4MoeForCausalLM, - "prime_model_class": PrimeRLGlm4MoeForCausalLM, - "tokenizer_source": "THUDM/GLM-4-9B-0414", - }, - # add your arch here -} -``` - -3. Run the [Debugging MoE](#debugging-moe) workflow with `--arch ` to smoke-test the new modeling code end-to-end. 
+Bringing up a new model family is two steps: implement the modeling code, then register a small-scale preset so you can smoke-test the new architecture end-to-end without paying the cost of the full-size model. + +1. **Implement the modeling code** under `src/prime_rl/trainer/models//` (HF-compatible config, modeling, and weight conversion). Mirror the layout of an existing family — `glm4_moe/` or `qwen3_moe/` are good starting points. + +2. **Register a mini preset** in [`scripts/mini_moe.py`](https://github.com/PrimeIntellect-ai/prime-rl/blob/main/scripts/mini_moe.py) so the [Debugging MoE](#debugging-moe) workflow can build a ~0.5B test model in your architecture. The preset names the config class, picks small dimensions, and wires up the HF + PrimeRL model classes plus a tokenizer source: + + ```python + ARCH_PRESETS = { + "glm4_moe": { + "config_class": Glm4MoeConfig, + "config_kwargs": dict(hidden_size=1024, num_hidden_layers=24, n_routed_experts=8, ...), + "hf_model_class": HFGlm4MoeForCausalLM, + "prime_model_class": PrimeRLGlm4MoeForCausalLM, + "tokenizer_source": "THUDM/GLM-4-9B-0414", + }, + # add your arch here + } + ``` + +3. **Smoke-test** with the [Debugging MoE](#debugging-moe) workflow using `--arch `. That runs the HF↔PrimeRL roundtrip, the SFT warmup, and the full RL stack end-to-end on the mini model. ## Debugging MoE From 7e62dc36db3970dd86ff6f9622b4daa655ff9cd4 Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Mon, 25 May 2026 23:54:10 +0000 Subject: [PATCH 56/66] ci(docs-reference): add minimal GITHUB_TOKEN permissions MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit CodeQL alert actions/missing-workflow-permissions (security/code-scanning/19) flagged the new workflow for relying on the repo's default GITHUB_TOKEN permissions. The workflow only checks out code (contents: read), syncs deps via uv, runs the doc generator, and runs git diff. None of that needs write scope on any resource type. 
Pin to contents: read at the workflow level — explicit minimum that satisfies the rule. Co-authored-by: Cursor --- .github/workflows/docs-reference.yaml | 3 +++ 1 file changed, 3 insertions(+) diff --git a/.github/workflows/docs-reference.yaml b/.github/workflows/docs-reference.yaml index baf200fc6b..44e8006f65 100644 --- a/.github/workflows/docs-reference.yaml +++ b/.github/workflows/docs-reference.yaml @@ -11,6 +11,9 @@ on: - "docs/reference.md" - ".github/workflows/docs-reference.yaml" +permissions: + contents: read + jobs: reference-in-sync: name: docs/reference.md in sync with configs From 3e2216e5df5c8931b989f9238875204c4c2c5550 Mon Sep 17 00:00:00 2001 From: Mika Senghaas Date: Mon, 25 May 2026 23:58:25 +0000 Subject: [PATCH 57/66] docs: refresh architecture + async diagrams MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - assets/architecture.png: replace with the new diagram (trainer + orchestrator + inference deployment, GPU layout per process, data + scheduling + weight-broadcast arrows). Was 511k 96dpi, now 111k @ 200dpi from architecture.pdf. - assets/two-step-off-policy.png removed; assets/async-pipeline.png replaces it with the cleaner one-step-overlap diagram (trainer steps g_0..g_n above, inference samples with theta_{n-1} below). algorithms.md image reference + alt text updated to match the post-deprecation "one-step overlap" framing. - assets/rollout-timeline.png added but not yet referenced. It's a continuous-time view showing rollouts spanning policy boundaries (policies pi_{i-2}, pi_{i-1}, pi_i on the x-axis with rollout bars crossing the boundaries) — that's the picture behind max_off_policy_steps, not max_async_level. Want me to drop it into algorithms.md (e.g. above the off-policy / max_off_policy discussion) or save it for later? 
Co-authored-by: Cursor --- docs/algorithms.md | 2 +- docs/assets/architecture.png | Bin 511492 -> 110809 bytes docs/assets/async-pipeline.png | Bin 0 -> 54011 bytes docs/assets/rollout-timeline.png | Bin 0 -> 22439 bytes docs/assets/two-step-off-policy.png | Bin 450852 -> 0 bytes 5 files changed, 1 insertion(+), 1 deletion(-) create mode 100644 docs/assets/async-pipeline.png create mode 100644 docs/assets/rollout-timeline.png delete mode 100644 docs/assets/two-step-off-policy.png diff --git a/docs/algorithms.md b/docs/algorithms.md index 71d1f11d73..47cdf43e9b 100644 --- a/docs/algorithms.md +++ b/docs/algorithms.md @@ -26,7 +26,7 @@ This page covers the math and the configurable algorithmic components: how off-p `prime-rl` is asynchronous by default. The trainer and inference always run one step overlapped: while the trainer is producing $\pi_n$ from rollouts at step $n$, inference is already generating the rollouts for step $n+1$ using $\pi_{n-1}$. With matched trainer and inference step times this produces fully-overlapped pipeline parallelism — neither side ever idles. 
-![Two-Step Off-Policy Training](assets/two-step-off-policy.png) +![Async pipeline: trainer step n produces $\theta_n$, inference at step n samples with $\theta_{n-1}$](assets/async-pipeline.png) ### Step semantics diff --git a/docs/assets/architecture.png b/docs/assets/architecture.png index da160eb8b05e1550c0d5e79e4d92e1d11d41b257..080a7e2710e9b4afa794de6e3304e96f600ba891 100644 GIT binary patch literal 110809
z9cAM_m&#wBM*%)oDe>?77^H&#ArEr$*Pr~M6~=gdt$z*K|Br5Bz*7JBtX?th8DJ-! z{6eKJ;^KFt1()(PH$CJX&1)oypc2J0i!T&wZkYSEne4rXKhKxW?ow5he-t56R6MQ2 zW4I6shhs5s`gxuZ9Mu*HZI8#<)rpENuIUmAF=vYeFWkIeXIHwoWW0^jW)gH=wEqsV zGN(<{gmryUO-+3ME0v^^H(i%zKL%1a&R2rjUFK|>E+PaknXH6h-Qi|kJ(0$zOR1Nh z664+57%3+nZ;!q^hyMAk$hiSkM^lA-ue%wg%RynVEZ?KG$SoyB_rBxTp~Z@br1a;t zIJBOT>t&F)ko;~?^{tKuOe_JjbhYV3!}20t)=~mCinni#Cg1*RQUUXw5mYoZG|!$r z17Kqa6Ek?q|CgJVzHq3E$q|Lixu&WfPnAg`LQR*`HCGLSUWRbDqRV6pGntE`NNn+tXHpr|;rOe$-`LPv~ghbJZ2aY3Otw0yx`mS$Bd_c%JM+H@<5a?=$CQ3xh04l{rZv!K%s4T>Um(BfK?tjE8FWQl(diso+y}-z3%hOI#M*@i{GhPj!I5=FL3acM6XKB^ zzeb94(@UGf*t{_RCDI-%fj!eP0djZ8m!a-OAsVDpH4gh$q22q((&iiInPq!XvkYUW zMRbGj`#`vZBp0~t+K7y0aa&F>)Au|vsh*Jcecn&vgH5aRtbe!ALpr+Y+ObT9qLID0H?I!sVtuZ6E@ZTeT2ta`L+kgJ=pXHRP$fVCWT@JbB&urO;4X zLtS5?y_s$F!QP(2NzaX9*B@jKb2q8olr&r}-?W5H^TU3-eHYSV?Md8h`1SVTmvr3< zIy@iIA-_hKnsqUw6}nT{wJ-L6RAg?1;<~%k93>JfUYcHH#*{g9>CQJ~)gG%~2#`B+ z&7AyXJRqo+^AK#Za6dn`H^~S*_2k#}*;@{af41-oLz{IE1G}%#v(@O)j{bWCPiL#K z*}i{#kUbwnP+s0Rs8&9>!hb&go)4_t9}w7mySn5&>%vEeK3W3_yht!GU!n1SlK%)1;U*P<`BjS+{rwJIK?>WaO$xeVN9ku7D|q)9@5 zx;B`odZpGHD0xDSuq}XhH7zL3__L*myo(KCP;60GuOA(Ok+>drHs`}N?X#WAD#qLc zwf|YvlQ6;EPE|#rgJ1XAqGMvfj@`GX?XoSoEtwo5Fw6syBQJ;<{^@x3la#ZP@7+SD zeKp+Q!Sv#_bhJHe{bfj#scOoL)cUxNJMR~ zFeT}(qN2N&$5qc^GxbFpnecAOs>)ScE$4&jeOa)YgJk@x8DibozjI1sYtTn0!>0L+ z+Jj?b@hooT*@#u0SGPg!wz6QXgRTK?*5F|IoVykCOpoJ@0#Tl;OOGN!#4``m$x>=p zH?vPAkl(OHZEs^nV)aV9$)lmTv)+qg_mk}2g*D#x8B!sy>45d&Fz!@WE^WyelvR18Mo%^l1||4aoWf22}ctFegVxW#WsHBB9c&0NbwoM$X?O zWbZ9;kJ;Lz4R;N&H$)!uF5?h*KGjvPqbtCx&0@=I@}|(Hv;8wGyi}X$fwC5Hm6Xey zHxQ2B**}W$()JnLH6A}Js@z$*8OPF&gBkgucYD=tm*=OQsN~>1Jrulp#-mD+WrovL z|1gb7cF%Ma-b_b1eScFvxnqj&$=TJz5k>vljTSneCI7JV_3zDYe0q#Ce9m7km-1}f zZE+I;MFk_R+IFEmgqX*As?KqBAWIH~5;-Kl`(dmoB$MCl-7L8bhjo(zQl#e!pT=N! z)!06{{Jb;i$3Qbu>l6m<=n#_cQj2>gzB{I8E;v$bq|{lN9Jj(; zRAdINBbG^tqHV zd<}u3I?NRe)%>HY9D!lQ3@3m2Ns8zdHtg8J0@;FqTsP321mkVJ2{IPUzPPC=C10wt zgzdJ*Ifrpw*R2kdHnv+w7jAu*R5z~Le)Qr-7b2)N_if)DfFlSe4XWIu7|U0(vn!6? 
zRfj)D@ozRe>zp`5&~4JaZ0xSDJY~a7ag`TXME7luK3OA(eq<0gQp%dBjDe`Y1g{&>L&!Z9LFmaSq+ZoMt% zwpwVjL03O!*ms*4UfMeuuuK&Y5J>R7W1d-B8ZHCmy9b`lS3`NKL$?Mr$GqV08d4T6 zalehOvoJ6)u&_*(o7|^(Ki_Z>q0waIH!_Tl*97LFraG8POgEN4$Gp^q)P1xM*no9&$NfU5*=a_(o^;%gFbt6jU~g*gxK3DGxkP+q(}y zPVM%@b2}HYIh#`LX;ypU&(z_8rW=he(9Vk8%jp=MVWKSl>5^mPVgwmoyOthU*qrbZ znNQX^QJZgiD)_O&wio5zZ=8#X5ebOw=<$vU436O5D`}_$&|`ap$H_CY@KJElTAj0g zC z$(5`YsSh&a)rnztKqVY2Umu5rm#oQHTv?8xY^b`ECG1Lu!RAc0Fi+UW)ca`Zn{xl2 zz2=I9qS=-(bv>@Qfx5ek25{5_w#DB$AWX16uRz7N;wcnaLYlZIVBPHW3*H#JgQQ*j zM=zzv)tlxu^n68(k5XrN=8Q zlz?pp0aDUX%zoX|rhR zowZz!=QiRuhs?-8YOrIXqhIpy3}%6dl^rS8lZ|{9pkrdPwA_<8T4K_9_Y9ng>btw4W|Z47Y7Sp>Ov-wj|zZC4m&8APjBD~ zh}=DF-vd=2*z}YJWL!zolzp$%bF0!>qNJmuq`QurXbgjqq&)S(H?NO^Q+SjLD%a;v z29C&;4K(!5M5A2y^mubhye_0Fz^gg5meps6bWD9~y$QL8g++X-_15#EJC3+QlaA@A zGygw7{r6&pLqF?mUbM7FH7l32sAD|4P*vvEKR@(jx-q-^61Fwj-Y$_dC=D3W!=TgC z)2=R=wHtr#myFf}FUL1gp(6n zfJ(v6M9vQ3lSK%KlXsklwr5bTPf2^knye+2{bEqbQasPY?z)5K?>?Idd)&9+tF6v! zA)OVJ8W!aB@~W5SZ8hi3*^vVX!nX&8~9V>b)xLs`2pDKh9?>SuGUhwxUz)vXGNx_jr%YFahQb2+% zKk9Fb4Z)j79e3QzIV}40j;>UBytlcK7Q3Ri0 z`@O=O!cYe|=KUban1BGPLFKtRkJUdD58PdxClYvkE^Z?C@Gsv_!T<$|FX@84UR^Xu zX?`)rRHlQaB;PIZ_k<<6P3Ay!HMw9Y3`yPBVJe|&h|3eKW z^*(GobR#=26x?xN`nsa$vloir9C?p39_73^L4})L1Ub7`{&D_#jc$#u;*tQ_%78VO}}L+ zm?{3o6!KgNi0*jtqqT4SW{)sU#`zeG19MTX<&_9k-#zLP07;bQ>VyVz^zi#u8sN{h ztT)bEVRO3Z2LYF4&HS{b0fML1sf5LAA4l)P8#cQWrR50y9L=X*EYV4CsGCPF6*-mY zxy}1_J?YDNlXU}fnlQLJSz#8|v2fSTYiqI=FkHDf49KVarlzLya*hr8u#O7LQC3E* z7n)FLoe2_;Y(W&I4Ip#AP0Q8~ui6+j-77+V-i&m5rK#tXyDmBBaGJN}+ufgmuoI2p zdA9P(MR}j|S$7(ix@X)e=7(_UXAtRVfH4Oe#@CRjLy;r>xXw{PJ4;R-QVNHz)+-^`iwZ${-Otm3jGODAjJa=@K2aalqWoBh%WoA-u z$j2eoH8nf7b3WaTLmq6^qen(YB8|oQyO4;Q1r@U|M*1gh6F1f}MK03BWaESnzG*Mb z6ua3S$7_bVWF+MsDp+|(@Nqc2AawYF2?!Roy?%sc(~l}noAxpkU{}(Dx#BK7wE)&5 z0jNONISX?!UFE;VM%{wM!OP3b!M>LAtgo#Bzg1X*KzKb^)95 zfQ{)cQ|`7C+tf^1uVKAV)7PIKAJ+!K|c#U(4csq(E!XKmnoq8s1o0ogU7 z?MAw?Ri3dX**_qQs3eb*>o`R)#Q}L_JInJ_3&_t*&z*Qd_7o&j)QU@vjEr1fHq8eA 
zb3)fI#p7@sur0sqP^>fUA!5Y3vtU=N?>vV=_WG3w02Z^GVcmFRC$F8IHb=~N`hN1! zvMq!XdQbUBIyE#q3zVW4v~v6|E+>Q6?{i`fgcdaj86AC(uL1%TN3SEs1Kn?8TR&_Q z>Sh#gq288*D%+Em>aJ(z(T$F^JeN@fryo9r^Y zn2Qv7c{mM15)0WZtXlnU@+h)qPgZiK%3@mrF2V|dFd$bQw#F-RU?n59_AA`n+#DFB zz|1NJ23MU_VP6_o?P@_H6<>lFa$BMHXMJ!vWL}(h)G7ce#`J6Tbm|t+e80Qw*svd} z8Bmh!LF0*>rqyaa-)?GcxNJ2f1+Q$y(N*~}yLU5nG_ZnHO9GXT@>!8^6Fi(yt-C`+!{_c@#2Fhr_q$2 z5vRzyDBjCYdRf|ZFKksm*!^}13bzbXSaU925&=nEB zHGVuI^#wgeI_at48)OtROBjQM^kSCL?Pu!YWtF>lHEu`^?Ui2U)`PrR-CHvUdjp|? z)>Ao$6|*k6ayzvS>2t1n0|IJU7}u$9{zMa;IVc^K&h^?{=33O*54i^j@Bw1IdJsd6l*7=~$=E2&l&SP843db^)7A8YDl!0`4wMWMmhl06jnM&>S6032V z2tg(~1_momQV10-Vo|NdhCMWp4xb_rmRc0JT}pPiLe_S63d3l@(NgJd$A8?QEKXtw z#vxOQ?x4$aFkO-EA-rAC$Zxkpp4U^eb8kIy^A zBEm#FF!h+GJ%1OBe&bE2A~8Hx2=7iCmy+S?(BY|Evi9lq8G_aPbyc?XAb zqhd=+5oP75{K^mi5sKAz=aFR4 zX1{%V!L2v7g25<}dM&W}xQ*J^-&TON+&9(FAMUV8KrO@NXjIwNO%;REH175yakY=* zJGA`|6?K9iHcod}tD{aC%1mPRqduIv==uxnwRgHnkGPuJ+YOLEJr?2zkTRYK~ zUe?O0=;ve-w90Vh*3@X(q=n$tkDKGtp%pFmrJ6);ci@qy?!KEl{bLiTe)^6U9})b(nmfLAnw2;p z+t#m%_0b~{)Ps~47Z+DokIn8jDuy%riwjCfLC&6hjoStJrlo^+%LDh^giS%u^wIhu zi^;?<(9m-f`ror``2OA9Rw<|Z&6Z;|<8yQbl7%ejr_R?mrwj3XPD&ZNb#zl;Dypig zbZ!naZPE4hLKQ$Tiln#8N1rhwnNYGW4Oc4n&TZ~5*;Z0ZY~dg%4EBysf+=(UEZ&3O zJp=+oL#Q7@_s?uF`yT`3Mn?pbTzKbR;_88$Tv-`=08BM-lv5yvMhL!UCUuFlD6efzja;6FZ~;^N{m;KL6FP8Q^idUx@`fg{+C zku@i&ZDl;XfA-{^o@qVv8cb2$Z(bCuGe#yQQ69ib(zv;~2L=XS1SsVcfBhPA z{(h0O_lxK}ymfGRSiH=joWFH=jJm-^d)|bB0X4;x?l((|i#_w!@FLWfcP&xmG=(pO zf@-JTY+_jJuxS*v&k3yOM(wECSCtn5VkzB=+cyhQoMf2L`Sq2RlxEC1Uj#5k83Q+J zjg6pCe;S&qX{UkFJv?mk@YWpU9*CFnNYscl_fL=5*;VtD6esgFD-)bIBv9M9Pc2c{ z|CT>dws)krSO0PB{MiQz&p)67fzmJojh%zTiI)uZ?s2H;;Qq6}=I3{qmiC-k3h6)Z zoww$WYSn*4thh|*r~l^7N7Tw0GkW@mhH{O*spY{xSRULd`PrSmyoq&1*v+lB#AUiA z!nHOL^=^uwp9-p~Sw@eD1Niv(xZ?xiMFe6(`0Q4tWF#b-N2;9^1Naj=O-m(1!E~?U zV_cfM^wRgp)lF2?t7Ld_dtuuV9toH#2^WfBd}yezuaE65+&gDk4k1Tm7U8oo6(&na zK)|9v`Uz7aREj37b5^0aa@f?PP7uxl0-^B40>;uHD9FH!ys__CmTR>9fRc9D^>7(p z*}{1!D$BpGXZ}DUta)nN@v5gg=hINVqW*7`#!gO^XD+3`Bc8|4*LM!qf!lB9x=Era 
z`8Uo6wIiq}M>V&&Sh<*AS*2A___e%z-&p&Qc(&kORb>SCBo$ zkW*GxR!|V0MPyHMaKlzKQoikQY<*DL#RJz7O!Y8~@0hkUTx3yWjAS$DfJ)_;s32)L zr7^A?11@?-PgX!+u9C5_F+)7?_jjq7iLo9qMIOz@%`0q zD+&5TGNP}amfij4(azS^oOLcG>gZ8lIp0YLjrW{(R}|R~ocSwQa7;fFX=hs1o@4PjJU1m;41Qu zR|bj~cLvSOjEs`q+>!NOxpglT`-OgEicm;|w!iNAkh{FR{JJRm1WnJeI$?eouCS1< zS#Ig21?xei%Y>#@u2e`yT)j#;WXfSs`VIrGU|M>ZqkWv+{#rq^j>7XHaezpkpnpnM ze!l79&hnYz2AiMzmwJ?k>Lu=s<5Eyjd3uUayCKv!b{Vs7-vK`ILMEfAuu#7DO|jbH z!GU!JsU}B@ZsOmQ2|OJoiMin)M-zJulY@39L~K~E>?+rc*9Tp6rrgV>2#TbV$4u9~ ziYI0YJdZJAvJ%bY%7xbbN>-=ZZE} z*oBE9w~w0zJ_1K#8XK6Djf`P?LqttWiaO(?UM`T8A%4A!BY+=IQ{TNq3u{g_qQ<=- zi`mji1iB5@m;*uM$-H}%cDN0v8&?YAS>mk&EWMtf22tUEQjO8zA| zs8Emhzv8eo(V!VtGni8$lEk-k(T8#BnB3C0AtJ2o;x&$JT^W5MQT^JRS>BT;_cnAL zarcismvy=YeCN5v_z(!>rJi||T~NFOa0rs`7yogd3n89Sddg%hE-n_S$XT_;z9XUcs8v#8Lldb%>A>WuvPf=V&X#gL$pf3QmpN zbb16rqy+?gn@t&{3NU!Z|9g`*e4XI?{^gyX4B4Cu)KGzGYjN8t+H%G22ol-QcB51>4tCS0}gAJuv8BYdltCO?+J6E`uAsOB_o zE{rA7K1Yjj@)HE0RPwXK7;F#qFE}Up%R(k5CVqqufEG=13!bGRDGYRTzGrk<=xhI5 ztpbRw=A2ai>2U;czl=V$T|9kp)2vt4IV%^3x9EA>1H5asTr9oYd5&MlV*k9zW&b2R z8W&eY_@iVkI~#w^z^~qwAzR3aSCd0Cqb8~I7R_TC8}gYEl1`Ufn|_nUc-q-5cRLyJ zuN0N(hs|ko@egjlh`oQ{{e@uz9&ak94NRItxHn$AA|#d~*Q=$jzPN5bM?O`wC!{ou zOm=_xne_QUnjeSh(BI#`!!!6;*F2h+-Npla=!+?dUa@}?ogZ0WW=SktyHZe*y0)D7 zm&w9Y#@2w)mtCBuhWD7+8l@BGIS-n2kktHH>=nx+^z(Xz;b2TpnZsN7QSII5krAzB z_d>Ni9=nyc${Llj^#!MQk{V<&jW{O@E|q(dA6kydVU%A0&w@D~*wx!r&C~;l1zPuXzhTDnw^-eAPG)v!%J{{uA2-SR__5gTMTC2E z42$lMy;%ipW-#gtK@Sj_pW0{g^Ya0i)=%l21r_J&;-chimN`Z7O#(27@Owt0u25?O zdevP*&Sgp6&U~WK8D#J>r$Roa2$OQ<&F$KG96KK#HJu6@{k#D$HzLDMFE@^_j+tg| zamJsi2>GLeFN7rbIgH}&#wR7hw|@AH8s3Kp*08M{9yiY<)?`w4ap8;i;0^g3%>MC0 zJyJcC8uE<9&WtDZO${((tNwDAiy2zngOEINk{L_Sc z%Vq8T=Di=@l{#u^imEw}>A4z@*;XG&;N~8YKI&q0T*aMrYRb5I2SO4nX%tLig83%` zOMUYI<|D^9{Q=K zuisPF)lO5ySdYFna-V;<$)`!kQm2sW8JmvPG+`11#SptgLT)5eOr`31=WdKq8Hpy$Eid;5 z(&hA4Sk20C8w@GCQthaFel&^b8$D)Ci2b;lA^(HxK?`|zF*1g!nuR%{tv`0l9e7&IWEvyeKE2buu&3c%ZE^OggN*M|D zQD9=+CC9?T%2h9w>3#EvI9?`0Mtq6-G>=^te~)ipBRu|RzSjO$bhAbP%S=jam`;TZ 
z;~-N@)vH#{Lk+mmZZ?7R({R6)#O& zmM<(WmYR*qoKm2t^1~^0mi+BY3fV@lkzzzVPGDAjef`!@MyoJfhicipOrA6!Yg>8k z*{{`D%ZNTZby11E2xEaTztyX|@jitO{?q~s<-0VWnZC*}&qVr^>nEp5IMe%FGyAB0 z+Qa^Caoo(OmP-ch72Lk<@9VJT2>HCPU**)tzCqfXlH$?vq(o8~Ni2f)YKl6xeP%nG zs0xY;x?2kU3k(9cM{m6$l*fwLpC%B@D3hi6q1xt|BmubT2a51_BkWA5ds6>Sd9qamaRYpQo*#zF6`%bI8SkYK4o`Ra& zv24Bu&9{ZvAsdKi)pJv$4s7 z5=$G<094soD4On-`$QxUY2;e#2KC6m&dG_Z$2)tvD=l<+QB|*Y_a{!v5BkP#>b@Ew z{BcxoS8Tl*rQT#Mv?9Wmn_y2!6@^W9QhsYQz^wCxo)1j%m)ma%pa)=*pI+(U9c=IH z?rF>3KYx0-keJc8&+bQ3A{2gwsz_Z{N2>4L!_NzdwtQyH>9)V@v#h1?G1I1v*-;u# zyW{Z6ag6t~#miu$&pz;c~JZs`=` zN^~ZO5v&L3AjYy?vdTSk$cJ(lZhGycj>cIH?1`1d9=^EAw@Le6p0d%!;P?JeMkwdY z*i02eKj&B)YeezgorHa?1)x4+q@ zTVC$NcorivKZ46OS|5;XU2xxUVmtb%)Uf&?k7zVUV-fD7R_dA+!q*v}p4#gty7<;BI;n6M?JP5-UJmirX`{?ogM(llWu?p0?RmLxZMqc-Lf z>u1ikEDl=SF7!S@ojw9szMTZu;`tpqFVhNv$00q>gY>8>hgZxqC5Zqc1&o9nJ(?m% zSg9^oNsnG46d;_(dc&1=%#_B&Z4WU#qJ!Dn(-#}tgjh&!Gg`6uNS#F|pJL8;^I5q11*a zNBaXl()|&yDSu9hA+t&K@o~jyaG1_4$(0-Z>0=pBn4iaqoE1VLfr3m>$LFrI^E6&ptk$8+_fD`7VC`dGc0mU+>NJO_EnzBmDJp;J( z?ickB_4cM2ZS6ZxPfv$ZcMCiXAjMX_Y+Ov02Qd)NP4P0+;+|)?mD1A*XQliNL?M2Z z1=>Du#pDlXB?AWG++6C&n2% z#8~gFgfP&{p87oX#)tFzkT%}eb`%dt? 
z9;@MCWo0Fu6bdBN`v+$;{=01J-H>EtU9JHbmo;{J;8||jOmT(~+omn&=QAk{paOXz z{&~qIjmk#XW&gFKSrmGTkgU~%SFb#GvR?o~Mgd&GX(+5gwD-~D$C1I>$IY~fUA2Ob zD^@BNZ;c6#1g?~|e`4_FP?Jt~jb}#!pCl49^$zOOKYgB4be>!4dA3p*bXMT>mBdP=3sMN4 zQrLRthCm{$0A#-Khg?`sFv?LP{|Pxttg-Eynwr^SiJi(?S|}{YLj||+$NNkIi))uI zKwdjIGCbU__Xa*ffgXGo*!x>dOL<7K6;oQLvkM9`tZ)e^jh8;E-9SG>W{WS6R8~~n z(t!^Mno^vbunL&)<3mGZg^}D5jGgm7iNZCHG+#aRK2h>t8DM&?ap%se&ByMBBsY_(%JnAN#@x_G>-R@I z5q6AmEVY~`YkMXAw<~CMZ)~jp7zvYGUHGJ}u~%HSm(E!!eug5|2TXhz^jbH5b*Q?J z8;IGi zKlbA4MN;G}8X3pJm5b+AIt^Cp=!NwQcu&xS$jq({Q0PfBCMz}Wg`QI|x9#k_<0igL z{E9n+o|1O0Wk^OqV1D~9v&prPgdzgB5Y70EVHgueQVR}xUO=b zDcKEc{r*G~>P*18k(=jBh6l%$iOS(S{*}Ww8+hh63!=Ts{PykJ8(}5yh7cdhoW8+; ztePegQ}WTh&88*$_h|z92YjUVtnx^DPtIN%$C+z%9z^)|$BPr@`l-Q+7#3AZTWv&% z&#J(7-Tn7oK0S+vZDi69u#tiwjFib~<=gspVQ_;Ty>;2I&0jG9y&;9Gd2> zjnHFyjZEIYZ3{$S8)vc*mps}OAu`3cS`Da)Eu2_1@cY3bhU1+_&Ooy1eR5CXoB8Bk z+<OBwc=sWET18InK6+o~~mXr2WCnILAUZtxk9FaGqeuP2m10@L8M*6nOGK zXcd255ac5iHL_mTsDe9K!Fvyd(6C2KWoP#4eD(MD4?dV85$ePD=S)_DP`teHS1jdg zMh0zcxSE3Nm%S=kZPtFOvWGCieD)h24Al=?LLB%cVjUkIbc! zYBTXqQvehas$6qwO!8g-1{hS?akpo$7B>G~O7gO}q8}R@w>GueZYwL2IlNAK_^IhZ ztjKALkGR|x&Q@CR6_qWKW#GJudK zuC_m`b#QbRCMc94z5S^v<}RLgf-}=3*?Y6oMq8h|)>ynnYcb*$2io>CkcRwI-EbjM zrALGX=0;5BWV`(V}ep2f@l53xV^hZ?ZTnd{cj(eAO}W967(-FqpH!Tt>H@3p>gtp zoEZV9PREOQ%PHPJIRNnx8QE$i{^}dN>b%E? 
zZ3$b6*tbrbcFQWY%^J!T9<~#p4=9GuS*bW6>>M2({l!wUs|yQ%O|EwdaQOtYFM9-L zvdePD%Pgw^&AMS)%;n1@8V7nNXjU;O`im|HZfT-#)sZ@vSB{G*cxyxVbt1w2S=^2~@kaqw^#i z$Gr{yH+Twk4{xYnY2M@)ew}Za5s+LIZrMBd?sd15H|MCN>PvXsm&p<5@OtTOtShC1 zCSO&;1L}~oOZ`Rt#8(A9g4sRz4HvB;;ysR)R($c8a|L*)x|g4$3Zq6>3oo&p$x0*i5 zVo-L}(qJwNVXXfQYj#;8U%B}@1NRRh6ZS#FXU7Et&xG~AcZXY+6emjb`oa?wL*2Dm z@m=-NxRzx;M0@F2#B=sOJ^@&tvrGnqlF$?%f=uiRJ_l3$knv4;nCyus=YYp-bo31p)T!GNX(PR1JyFPg&x%Y^ew0iGhD+7&UwhK7VpCvrX zhk@L)>{(nnD@6;DvT3+@T6dMP_%)(Sl*gjfy>Rj5a)#0@3yAY3O2nIp`kuqYB7`0` z#5{0(4oT_h>CjV;q91`dor`-bj*N-EPZ8&I#<~s!7gtyFiT@agDp0~Adn40CpO<(z z!LEI>GPd_)RkFCfxVJV0yA5far)jjE5s+gIXNlu1RPUcjPPN>iS z1lgY3H}nfmp(DK)E;tExc6P1}3?gTRDn;_n@IEh+0#$7VW3k34$1jgp_fKz~ZhW|> zvr;`-bnDdm83ct-e`Cz+sM%G@I6rOP&a0oj8C4~ODb-lLL0>|dx8S>_d0g*r9|@S9 zdD^GlfG&M|kE?c1-FAX@W#!xHZ|p7+$qOE>d_K=T_!J%6Xtkfv?WzsP2UsZb~)IJ zNJvS;Nv>gHf;zeylu;WS8yF#N-6`VQ&g)GWwMR3tW?Ch0`>Px_BOj;}j_-obw0V%n zcIl(Dx82fTi>1F`$98&T1^AqH-X=6kLG2WI*O@aa!`0i__ov|EdO}+#nt0s1KrB#}8a&QMcSFP)8XA5E-htXzd#zOzST46F{-uTd?WV6Ejo$1Cl@}I?OEwd+2-ep%quWA+TB5vX9c677k zzNOi9dwPc6MJ3{Z9|;Kw8nJh8?uPeqww+{i+SsV)Um|JaG*e+cNz*`bNW2eI@(B0x z^TOpA=ISPMC^01#O}G!t+1~<*S$yo*gQ~Coh0!`ZCCwprt6_3zGCLF&Z~D0N zYUku|RXJVf2~-SDJK%r~I)mJfbTfKxuUM^sJPuyM+o$2?;qk(~eEBl)c;iBMrkQPM3LY8?*v)47E?3^5r!NorX zK`})wJlAwS`BO^#g2&+DX7~b~ux|OohYzbRt4=Qrk>AipMn}mhsTBf?kr#RENt!_J zJvG|X-+!AuOxNAu&nsibPCA4hu}9sQZRG<(ze1KzyK#I@kqr@VpCDuO6RUmHilRqX zaUAr{)C4a1d}`C-eY@JY+Ha+Ly-OP=N$UKqDrDMNeI&KUt)z1UxEnt04IcaJ%+XyV zx<+ZL!EG@*)wca%kZq}5l==0opH}PL9KU#z9-D4Bel1cKjS1x`dLDlCl8UJI2yInH zVgDgX$Jp2ysI)T@f&={hdDp&Q1hH=IUK2BCRT~Rz4?qvGeL1Mt0k8|=+quOTNerjZ z9vsu%RjakJCw@&gxmG_mjg7rjRqa36UQE%VefKs4Q}xEdT68TYq2E1?D|oy#fWjKT zc@s=+==+nfx3?EmGWPbLqBK{oUb;Z?`0aDATzb@_{cxQG{9L`BfB>Ub73T&*yTW?@ zHx9;oGkg%4!(yy?5+%_n7=ZNK*|AkTkAkH6dY6Uz-hzxHPbtTCE?SBPM2$rHG!mh? 
zN;%m>j^<_4Z@Fv6VkdoOKFMO8@)DHVpRfaQwZ0r(8fez0El{(5a(ZWU??{_=o!{B= zB&NP;!pM`(6F^xEXU>{I13%XFeN66NnpDB?c#kkuw)XLsbvFZBvHndIgx}iFs{@=2 zcCU9o_!Y1OKmqM+ZS%Ejwf6JS$?Pd`KW;eE@?JgQT*-O3sdsjw;k5% z&{JIJ{Pf*HxJ+yLKyd2&qHi#JI*#N1(U@=o15Uo!=a}Sy{(grZX%6Y*h1j~fI?y1K z3b^v37udJ!QWlJ(l=&Oetxir(5tkoflkx7QF}pM`+bRfiUIxV$x&0X0GD5ph&nG*( zc```AbuxfM#${$MfzJuN>lRdJ<}i(#2oX$RWMnkrc6wY=@>n0R4I@KCLzF6o^){t3 zq1y!mC1XO6SD~o*zmQ5` zdA=nlPjriuD@NkGPKAh5X@7z^uoFBT|H0dP*|yTT*U7+OHeEVaC7aFQe(M(keEiG~ zD?%b7=(?kmO#tUKyLx)En!)LGGsMQuj-gY@a5e<>yUy@HfB&Pc41oaBcitpp;z4A` z(ACF`vj#PM?vlg$#gy+IS zY^H*otmm)({;VTDaEQ?Ij8RY(M|lRfvFZ(j=B>$(2!fiNqT+T7LjjbCOC#0l`xoxR z0FqI8bNeJLRfv<0PEMWh2sWmgm9-qECz||BOV!JnhQSY|Xk}#G8HWGm`Ccv))^H7h zq=K}5Aj<}1eKHwg9nC79x&48vd8!-}FOtg6h%r=}bAp`JY$yKdoK?o<5IXJLw0_>{f#HXm5v-ho z8iEQMrG@K;d$V{2&HdWbeDt<0Zy7(;y|D^`69Ht>ek%m&M&(Gey7!l1kz7x9%4ZWM z>5$g}{wP}Zi*9|{@E+`S-u_ff}Oow&D^4IUA>BXQmJFd zIMio+Esl z1r@D#C;UJ~AxBeLd(c~8zdn6MKVf&HmDxMzeIZNelt`)UU`2lLmeQ?9&`G9P)h+4!|p*lNOaaUu0a ze>>p}Qj%+wp7^-JGEf-)HESLsiml^a*9h7d>2zk9dQq!(LSkWD*5=CYM-z7=VxSyJ9Jc)mPS>Kb_)!xuPA71lP9_F zfwzg6KSm;-4)y*hecMk}D>&+Q>Z0Y$_0rt@dB-MXAyie9@F0LWIRd}^Br^SJXq1+t z#E9!ozrkHnEqo#($ZT80FG7AuNksP1!o4-hpa!*K?PY>!UGO@fG+6A*1O2aNg*6@D zMGAvk=e4kg)dBmh9aOSm^ONlor7^ZEu|ZxR7T~ao5x(yONx4GBJ0;DZEecvm@!&lA z`E53vN$ySU3{7~rU(uCzEq1vMWsD^4PkY{3F?e)DU#`dG$Et8Yx^KC0u<(FoIP=)I zqKWIlQb)GgupNy^4W}k;OI7lpNBpaOvlTP^49$MAuVV^UaBB~z_gqLe1DJ70ofZ89 zGFLaTQdA^fvt4Zo`qrWtDWq0rp0x2$m%g;}O*fUpVZGP27FJ!P806i-rL~*!tDc9S zC?tMCp6~)s1zOeVZbo9g6BN8{HSo3sc{mDb_r?x5|8&a>R@)1`8|7U9bVt5vm+;JG zY5BRIn{7wl+r+L`I^vj9#TRiCy~X2di~iJ==`bFv@7HDMbgIV|<+RnB`-!)TUy8H7!?fu=I9kA_gT7+d#|65ZtGi8)AG`5`v*nlvg z56XR2^P(kCV!DOR{cm^kzkGREd!Qq)pq3$1&S~5=TkwF|YIrDh6kQ)}kiujmCi?X) zk$u%y-#(1i-CyCk%YXVO`rtz*(QZ68i^qPb@|{Aaq5CaW(T5^>Wcn60;8ipKM4`_n zU3s8ds7KG?vOXbtAP{wbmc#qnl5oQ1-WvRn`{b^BNmam8+}Sqzz-2Xqyof5OWPUrf z+WgV{X%(XTUh(BsRo(C)uAIV+JyhCObI7P&qcJi9!jTuIG5da8Vh*a>o@%$Dc@)1=yc&5++yf-V^!z-c}ZBQK39UYerjA- 
zNsJB-OF+dbl-4qFoHvGQP7W6FHh$1XF^iiX-D6;4!j77SvOzkZNb~vghfryZ$=+zX z=d+>nF|*0s6L)vWN>=Df*#DE;AcmpA znz_%@_%|%%-oh)HKR`b;ny}X|;YWEl(#|#oNET%Mcxwx?x(;vL;)0tSpI=A#k2cHR zTL*{)q?+YANj+hCKcveq=`ylNC1c^>Fl;`KOh%%$;3ZMo{Vg$x)o5h-)I-W?F38W^ zLTuFhIZn?R>etjI+rz`dX%iXGnNK99Wp$7mVmMm<)OOhW6hS}{VWoK2c}~;37aG=D zawMUCZ}(#!ypDs@Iyjz>J_J|YX!#6rU%!cQYD_OV?6fnqmGMt64hTRDsc-h zAn~~4!AJgl^C)|Ofq{XK4Cgq^(KFY9{oqq?45hfsz!QUFwFt={2vuDaa@{LA@;YRm zo8?Tp$#(6kYLN^D?(=-hJ4qO%g>uOEzgmhu*9F@^pvsau+310iZOUYNAV-3omp3=p zv(cxEvFC_TS#hu+<2H`$XOe#tQK)LuSUy^hB z@^v|)I}&5F=+on(5)xNsMVK58<}`8v5Sy1^5$bam{vBg`@#01By9(@} zg3Rb?e#D6Am*g6L&|h>Fm`rbPZ>T&U10h%b^zD-;b%1EtQ(4AlV=T8a*OImw%10hi zUY{3N2|6O^2z{rjhz6T26Ort7@DXp!oQ7G)?uNh2k`8xCt^87y2 zHndP`ufsmbr64V{*q40vK1~vtdbj^#Rc&pe>_T(mmy{@>v~_PEt3yO}n-2{LW-ky(#w$+upq5a|>7Cs) zMAv-vit_j(MUH5^o+b3l1}j_uqyc1k5n=d~T~Lc}&=cxSQa(e5J8g>q!g{6!uHqz6BM`;qE?+!Iw1SIEzrkPn; z4yVVvg98KM(7ciZ1$S1K&xr(?kP>cjdIE^}Bc<#(NwK&QTrc|z_2c8?>y{z1OE2V# zLLmdgk_blGZJ~%(FM0cl#@JNB3*?KY_w?GS1P=Q2@!;7Wx^r1 zGBlhAgvn#@ogl?d?JiVJm3FJHOJ9FZR1QNdfiAyDfyZ2T_3|PhTIy#HbamnGUb~`@ z8Y52Dig25bP@dqg`sjEE1OciUAXX0xCVTOuOmC;q zN9d5Ul)mcD%L4ncz4(Vg9TXGORvF|jTXJPqcK}5|h2NdhcCfdOIIc?%xt6+;gzNq^ z!%(YCXTCR`*`oK+KYH>wY^2(LGeX}3_d$}oo0WUOm4()^FDrPw{s4!($g)Na*}rw8 zYU`I|m~T5gBovAWNF1E}5vxu<(pOX2qcDS|uHl+r6V*>AO5r|^QRlFg78S^lVqbKq zX=qgGHHS{OMn#2(n_q6rUS87HU;a3}YX}QTtq@vf08_BAu~F4= zNauS@N{B3!X3W|(ar{zc(3RIhuy?ZeQ{V^dcTi`5W!@eWq!Os?zN{uIaJ&WHA^ZV6 zBJ6xIT!#-sP#exyA6Xo4|7jhB)fWvew*&(g@Mvnh$!WEk)=XTJ@}dIQ%r)1CU1Yd# z(Dxc}1Q<^-E`osQUbWWhwj-2|FP4GDYvlD2Mf&2B|M5`^d;9X`OBl$M4P8}p*4*V! 
zi3L0E)6<8RYP)Rrs?Y>$)Hp}Gdh~%tXBzJfAV8)t92a}H0-(x|RqZ}qdiR;2$!-)T z*Z^4hZS>wW-Z3-AWpDsC+ZKmD~|C6ZR+NN==OI9``~+ zgE5i*_U+sKUtS4&D&PqmjpIyveGN4jsdCuZVe90@8`n4s)()|FNrtxzAPZcRCsXy;`XvCMdike!6L6hE<0>FunRB=to zqV?-Ly7DgpW$tje{*zo;>lVSr8r5Yx)NqoGoy|@Ga#Ie8&~Ub^A7A}4h6JdEVIFkn zcS#@d0{p$P1F3zMRVXiDP9Li@m6eGcIu5`=#J#Vi)?VRYWxY9G_`J8+wdMOQP(4FK zCOoUNmK9ai+mbHynYe{ifhMzrj$GH(9IrV{L?497}>}>kWJ+?oAZLe~`)BzaDLnP!5QOuo$DjL3dF2DnFbwA`3^8 z{z6gL30OIhH)}9$Ff;wzv(T+BVH`TzBGdjtuYAb_prByy(ZySk$d64r7L~)4ch?~E zPn%gw21S2O68sWexHG_51D!Z6Io!T}ei5S3@`}6!-SmswEs4?k;Agc8{=9tHCCWmj zCmKgSW>W$3+1U2Ch8z7re+6k=c}j8>K(R@Jg=w9F5=gw++-3;z1nKTUr4->!P?&O5$tFLZ^#pfh2* z$bu0vB2f^Kru2bJpv_7;ynZ}d znftD5gu9`tH+3(aW_E3>goc@@WVC z8=-+&IBS1re}X-TSI<60GtndDS$5cFBtBC~&X{%3d)^`?tvWq9)GZGb=Gtd>;4-Gb z{UqS#>dylt8Fc?%aU(pw9d`PicjvDzsaEQDNbRRv)04?wT^dCXjVY2@z)}2hqTj;N zf)gAT5gk&YW!(DgmVNbVEa{tJD|x<|he6w1?z9rQY!>c0jD zc?>#8_p!v}t(0yI&a0-!=*^J#ksr^A-;nVp#K-q!V2(1hq%k1q&ez1?r$b>q;ui2; zN)b0VH-Ioa(X^gF@9OI-ffDY=3mOW8c*cE=xh-ec;Y%scOorK($3?+VF;{}L3Ir@B zL>Srr_Q9ZDvr&FA*BN_Ab8S&^v~*--qzy;a>W41@gAVCfj&gxa-LdMImrodX#WCbG zfu_I4yoPlF!Kw;g9!8*`x)vbI>f*WqK2T02#|DNa4y0aJy@QQT$>8o>q|@XJNWYD( zExmF-R(5_qKYX*V0XfhLwSXhTDIYk58Uco>xGekzp+Orjz5^@;(!g#Q@uI2uAaNB) zAxOcD&CFI|0$>qfB`n}G>u5!JJ`)oY%O^S)s20uatxtlvwrqczj(TQevkh;&sv2@} z!;#;(3z+JnPJR+gY$qrvp^U6T!LY?@&9R-t)DpjB-41%pi4x3x#A$$McU~!<1?ncX z;%Ua`&o2QP1IN)E*^6O0&zk>I)xiYKd>Ro2dp=v#a?u$G=5gQG`1trohqb3X)9=~P zH&to9_HzYmu5xhY!UzFqeiu-K$KC%>Psx}urSo4x=wLHiAw_Tu0w;UAlp-JEibA7S zUx{|+;$d7Wv_>1&;J%T~%FD~>Sq4#Gm6po{#aKvvqUBcA}-iDG9fUR+0Rbk8@54;VaE%V@z2L)-srYG{) z0lDS?Crs5YI3&!Q=$u1$0>a({1>ma$vQ5k79LB;)OBW@C!v??Vix;VyOTlB${elT3gnkn%BJZV1)AYdZXIWtjPPueoQg~Y_A)p0? 
zvkDRH@{*uzJUtWn+SYzV3`lsX+0_EYSw^iT9> z1h8U29RbxniDb7+lnkLk#cB=fhsQ81;djaI?k?WhsDq~%XYGK9b+*`c=<7y_jtz?X zKdONh5R2rgogvaDuQq`}iVP25i#Zb<;-0m}AktI+3y#|;SI1Bd0ks`{>|>K(ee$-P zq;P1e)KKymYRogGmV1_wiAnDj-ZR#SO@PNUm>^61cQ&berR}mjyINjfdOD33E$Ra| z-~4~bjz^%k4l1Yt`gn&=j5iCZAQ1oxa3Cx0Y<$B^Lq+P@n2b!~^X;_ZzEgmUp#NXW z?(@bJ!3%TSj-Wd<0c94l)W;eOE`>L-cVYLim=fl0fO!3^bpUZQ@T|dGQ0E-%Fv3db z38kJE#(`stH)Zmh7~>MmIVh=y6oH%u)t!rY%6Gn}S9Kr94h{~U%~W_@H0}KR8Pc~c z#9JsB;57R#f!oC zS3Cd&%pvyZ4y2agT0?;SJ+Ndqy8b_c_jtB@At>w9V?E#8W}W}sTxoiv2x_{XWC|fA zy{I8SwR;jU74Q!s?drmU4)|PnCfCnDCh-D>WC1dYlF9>}?6b~sNjd7M)a|ge&EYvL zQH<++r-|avcX}3DW)Xr$5Rp(9$&cr;GfIgEi5EEgOo7GS!`aTnRm*a0a{f3Bt3xpB`LPL=%;@2Ymr3c)G|~nP==*m;Px38$UTDcv zA=-+nJ<|Wc8^^O+r7-vu>B~qQM48gZ>@qJ2L(g=TcrU>W50tDPW_W-$4`z5|ob7bW zKg;9xs575J-Y_`?uCiV_$%;E3N^33W?%7RgBZ)s#EV1ptj`j;7eV390K+IrV;u-E* zJF4?F6V+J#uc0u(I_K+@Px|uZIe{!Dr89pb^mSg&45jAVp9%xuB0Ih_o+pYjNc46w z7z7DjN*^?IU3}*b3jPv1J3A7rD+#D^D+oHVvyNN}(YDt`4dc(*9jmo~!EmEPjgh!k zj#z+-MU0OS2EoIE2>jLVV2hV{=7v&7L`Ah)4e7(-z!FAX#&EsXb|7FpZ&+uU( zg3p$R%6nFKp(RCM*UD-WpdiUWrw-{pNY=lH%elV%S?<-MeR#K`1o; ze>JS&^1v7x)@Uk`?D01OwFsocKW;2QY!b9hS(#`reNi5s3)Mxz|9z4s@tQt_68M+( z`r$3$)^7g9xpD!`5eLce?_nVTCBqy3QOUYLy3fdHT{#?fOb7CnKWO{Hp{c=Eqd@9| zKo!;T^FMFgQh^*=&uwaLykUzZnD*ad%He-T`yhV(;}MBLT^`o_s6P>egO4=Pr+i?V zIcl^I#PA%LnSkm(Lg-yMKX_R)*4&^+vW-!D0kaj7cinj%qTZOADxusy)WtCW1~&M^ zd3Q=M_y5@1`f-`9C5ijcBy=^c-aON>i)Zw73!pB|dMQyB^)vOUd2tin>ol-ssEeWn z^EdueRDL#d2fDP+ojCnj9}m8G-b;eSPVo-RDd?>es*!{Fcvtx0Qtndc^KdBUK|U-h z+VmSnW;DVP1$Rb}dIdgG0{+IuK*hwl^7E{R$sH@aaX(jM8g1&~6;7C@6EeCw3VGTExACztifYU$ zvj2Ook9*sr|5qa@;s%UMA%gLBUPM|?k-19C;15*WC)fWEU2hpy=R&$mV9|WzVeRid*SXGn){nin?zNt0 z=9puSao_jQ#c+AW{Ps(Y3$g;Rmfj>nJWfYW+cR;25ewBTPhT=VDR|uAB!C;i{amXHZSb9{+~hq zcnLGeh;*HV$mhl12fJLh-c-~|L@ysl@PsD}6e;Fp^1S<9G_F9Ko-0Fgvx)1*S4-`U zb48W(<=EGMzpI??D|h{6DEpP!iHB#wsP?a-N^K@HrRH)(I1MBF^FY4YbCo|#SD$o! 
zpNsi++h?5Zc6@t{ujGvYm^(|C+2oS0+75`z_@NJLnxsi_Gfw7Xpa$0gO(KH$qQ`sVii zXu&qGb)}Q=^#q{lPAC8>-@bioM}`bIPq06zkw{VecDxnXc8~N^Rh1V+k(Cp<{NO$^ zNH0Y_SmRZ4iV3p2fE6b%3qy>DGlk5Nb5%L=fIwZMFA=}HBu|o~@SD=x8e=IW+ai2lPawfeK@z zoaKwc3D6M)oRj7iOIkT0j7vI{L*@6##MCtCMF*Jt80LnF)~w9rUisPgSe? zjB}&>w0&)hPSjUC)|fvWa$(verA`T(Y*SRsKj8 zF+vdfRZ9-HY2i%t&)r$_rsOo)foR0(2`qZk#ELYIP?BqB_k!L7Hup%u#IPGmgU4(0 zf!4LkeEw(~f(CywTH>f7V++n919YU|Drp4gEV|=f^=`~iowT-I7i+guZRH@zr?f|| zQZ>f6W{+HCH-58jPp7i)3f-MN8DBoJrnV-!Jb>%!iMyK8n&B;f>K{w~Hy+LRszYP> zIUf8p;?%roL{4i>pAkcO;nMM*MdXGq#}21+bLet2`@hGxTNCr!#d`0?BGJDF2n4Xox<{{ zA+u=x6;Y9zHGHJkandJZ$iFAQ&&PV!HazYe?Nhs(Y^>3g)!J+z>HJY?Cwn=1;HbXJ z=cHDL)Ojxt*0S@YhuR7$u6~oGKU1-Qw||C3nCSz0em? z``%}}VKAe{uvBWdafGVgsrHd2Itpz<4T!v^rXh5cq~EknChNF~{;I@9^$YFv(PFFXk95spZ(amy|mK(H4&lu7B|u087&hH!!_kXx~Rqd z>1E=?eVMH8jhqz?+xBZGDMyxhPDWNfaT*dNR*SM-tU1$H!YY0={^sC8Ix9oHa_V?A zg?OYodB53y(Ka-Lrx!vP&u6axx-vtS`ICs{mC~>eJMaCKA2%ha#J%L+v~7x!aPjU+ zRW%-&J95kExqI*?v8ULOB=e`sMn;UG3B|7{QNP{FZjPj%x(pGh?OT-#%u#JO$yuzL zO6^*I_TWAn75)3E+jwliqlRnM?uo=AEkP+~=BU-?FK(es$x@ZUxkbro%HrV|xuFCV z=L55>^Q{?YCa3A1ar5>R%k;}Rqgb+N#D6m$5B(Ir1f7uj`JXvF;Mx}Ezy3@2Z<*|4 z%?S2Y_XydO9fj-@Kc!6F`%8RQ4O&z~AKbI9=&uyLt`P9m;%&A+pdsLPkL<9ra2Yr( zY3C8=G#K5NPpiMjKY3;DsWk2}u@M~`iwVVX&*M%jw!?a*RcfKKd%ixOdQwwjjh_^- zt6etpqmGK_`=YTV8WTGFw_I-NYsSpfS!8g{&tk5LgXp*?-X4)%0?!mVCNu4$L(TBN zYHp@H%QjOFa^5-ovG4&Et=Y5o@zGnb81P}iKaKq<2A*% zrk7#SXf2)`N8ie8+@j8+p)se@Lk1#D2xGH(RyEUlZpG`>Z=MiG_2=v@v<6&gwDI*T zw-8aB+s{)zsHdXEn-?#r9^s+Ey?lH;LiyW#fQ$6n4eEfCHR)bILa+K43%kj${oH<< z`V89${^rFd$`CVNPJV!0D`RZ;p5@Zg`sG^N9AxwZ-=v$7ATOs#Qudpil8jtHk=%CJ z_|5)L6V(@P`{z(_s8q{tayhN(B$@o0WZgUIDlyx&0iBHP^*pD9r6t+_iv>_#vm7P< zEt!J;$aLd^vNdys;L*Dg?^MazH>s~VGasJcN`0){44AMPljP3!p##w-w`T0x>-DUk zB?npq+*Q<5B%g;AaMy~!iT6^IW^aitAKU(|lk)=m!M+cUcVF^@bgOtRMS}HPJNgel zekq)3DUNe3l26Vd&pDwiFBzJT6{?a~=^{e$c`8K9hkmhKROFcc0sWGq#yH@Fg&*8E zDu4t1ETvliXTTq#3PX*ifKey6eR;aQ7&EYsJ_aPFD=E%TZr?->5dvHZs%*Hh=O^e_Vg8J%6%}%<1`DvG)%v2K%>llv$ipyvMmt4!j3t3;3=- 
z4d-e&C`+1Kql%8g!N&G9I&$umtUQ$MdX=PZ#Q$JMkx}^^mJm=MW$w>1X{G(U zy%A#~dgWC&cDHw;g3{FmbhzWy@N1};qS^erE0Za$Gz5jK&hPfL zr!g)=?J^zHER|rinc1Gi^S>EtE1j(}Md7msGz3}~nU01eN0+uxGrpy;V9{;TVP5L{ z0e6M(c?c&}rO}?b*NjjD;X&+$%9wI$X*XZ;lM z^0xU)Im1Pgwd3#n@r4(eD`|2c52flzIAz?OOkqpb9<-QiA+r8#o$pSG%Xf4144dPR z`t|)@=BUaHJw6e^cW`8~xr@HLZJ9^dUJ>p{`7GuuQ^V0j&E7elH6F!`x z$4#Ss(NRwO_oj>Gsv5`NA9dZkRiJ4%*;uQl=hkw;d5MmxRvDSaxsJUsvUi=lmn1s|rqtS4&TZiQwS9V7gijbWhgDGD>y3l51fB z0+DDBAkJF$jy6KwR#dU7Se7x^G#ob(?;~kK8 z=8k{5<0yr7=VGe)aaXJ@-3RF~(GfBEM~Pk02jf+PG&YuE{+W-$>HL`AIc`&%8BolT$JG$dOP5x(QO2VRJQEcLZf-sHTI{ItzC+9T88d69kFmV zrgQqE5_{a;#;f&hXJ0-DO&Q(c+wRP@D*5RW=byGTbkic_&4IO)J?P(dSmoDv3n;gf zH?Ku}aChF1wcu8VPWkn#oAkg4V434$4z`BF`HWnWfsewLYJb9(K=rLzC#gcMj_bXd z-b$So$E+kru+!zf%!d(qQH-`3m4i%(D?ddhX;dO7MbBXgPU~<8HEXJBJ1O`Ez$^*t z5~rl891iPWPn-NYIX^b=MEhK6f#hR1l`JxEp%*iP*J$u^{w=oQ`C9+&#y$zRv-!}r zyK&YxBA}|@Ss-3rVsS{NC|q5?wLfS3S9Oy|z?8Q)sr1L`$OK>K9swSg`@U{ArE4Zg z7c4Nc#Pql7&fNF;*UzJ32raw{L*PzPu<_fuAS3W?I6auKe9xLT@imPnpy*Pe(!8FIc(&o7)5WZ%ORZswnN3YSG>C^2Sy33ZebNN_1LsJ2C`K)c@V2rb`H( z`F;>+94OK23ZtT3-(Q}ii%c~U{!{#v%vpA$Z(w7NF6M*BjF6?Hwh)Wf!`G1pC+u?3 z5%uDey=~j`jp@@WAAO=|d+H>LMr=DpTzYv@60XCwZhgRi-@Jl?Cc<@T?!AGViAP0& z)k>6KEx4j|he;N_f8NaBzJ|196tknkoVDgeGbL9pZ#~H9-1sxSakl zW6LM3$TTalXr-(slqj@r9y8b6+w#t0^8*Vp-Crv20r9JN&9hs#_R8<+Oo5xFg6ki8 z>L$wk7w+a!>-1Tsi9Ijv;2^~5Gb);;TGz}Y{(F*A&#Om=$)?U_d#%i|dC*_DN*<0t zos)4d#@{-hrXxSzZ*7b@p^k5T7>N5Jl-wQH+b;O~r%wyug-!elx{))eJC zy^BUp#;?)1P}fR_IZ_F4Ak&9+bV@xB^Sg!?78l#YHMwY_(S)LP7E@8$rV9!Qp;k3o zCZiMF*YnRF;~0wgsq{Rq<4xJxG2?rkZ8{fkp(CFn4?b{I_|&dSs~An3u;iR9zJUv= zPJ**ZyCY>X?au0Y$)S%yit=cT@Y3lU|MT|~Uvg(Y5~Z#CS4m-mu9?0nTT|7XwYi8 zuPz4e52QFp_S{X}>7Od~nEm1|CRSvyWw-Oysmn6Uzx=j#<@0&u^|7(1QV&>7AtT1l z-n-sUL(gaEkZ$apiC)zVAymv^JgQSTI{Z^<`IS~j6OTt^n&DEvwS`lb5N@vH7&rJR zpCr02##{-npmVapr_Asg4<&2es))c57vR+vqD_?`4c|80rOFBwPm<}fBv5@kBN4UY zP~=pHocmbcb`k9^tjVU*n{nGOwCH=i2l^Y!d4qhqU-R+u;KXHkm%6cTvb?LQUZpu5${a3z8Z_sc4k*N_? 
ztdKj7t-X1&9|*21gfW(0M--@%4_`79kQZor@MmTF)9+ByH$XvAf8Jgh|whq(_Iy;^R&O)h}Atu45ur zLUCmphaIjPom0`maQ|#_$l_9epqP_h>4Wo5+0vfWByELMJc7GN-qFE_CEcuW(Y!+I zAq8Gy@&mj$JubBsH`Pj9eZ#p*gTtLB-$la9HLGvbIH~V4FT8B?_ZA~&+RU7;mdme; zX%QWfQIfDil2=%Tm=qhK{H!=@W{Qg+JzG(maDts~wa63)`nH;z&9y;b z8N0?6m99pjqWe;Cu$({$3KC+Yy}iuALjH%lkZ=JMCnz9TcBQ5yCofNh04)zN#}Z_} zm$pSxI?+jZ3^um5eG5=(2zAlJ)fFg2{Svhg+lljEJ6uf<>$54-&xeSj4`k9q2mw6M z(cTWJOb{QBx~%~WOwhAlk#+O*6cQG$%FbqwE#^q%k8yeOq@Ze+66I(vSv!~oA}9d8 zVxQnq(ntWHGvqDHb8_1K=@E!t{Wff)RB#5+ndy9-PD_!ses5BI=gan{Ew)}5kwL((N1C*+2jU^xbAiW@j zC6kfht1-%XeEg(W5hpBkCshz=Bv4ei(K+yke025 zM^ACHeSr+S33cKyUQ&-nqW2s{#UIVq4)s;IRWX$gD{`b)jtq3aDUu$oxiS0CA1jbB zB0W#P4DcapsgKL_`-CA9j+8^twgNWjW8Swod-AODj~ky~9%m%kQ$3u`Gk#XH8rC)Wj}6%^4c%U5<=;Mfq~xMdsf6|0!b@ul6#7vDbwLwcJ}QrOF9-c zF59|2Aq`-UW}mAmE$wyeRKku?gJeEnU1+JPw|4WsnGTlOlAkl&eg&`j{-}R`-dcBj zA0UJ=AA&4l=D1K7r12mT0NDcoYtebV0J%!P9hl6Wr0>3?(cLIiJ>X=wmo18PeGAsN z7JB*?{>^ngp_cFZw79b}0{BHS+Vhqr!rBg7Z-4?jO6nx5u>*8{fa0KWz6>k@CM9iu zcVj}c- zaK4(fFuWvySURjrg7Ad*XwmxpZ3lclGa4G=h?R~W!-$hBrrg_#S#zzk44p8h{gJh^>yTC&aZ<}JaR3+TItc?`w|Bdl48Mc) zjNeLKIqy}1>$#34RmpN+c7A;wG7qj@riOq?h-{uLYJX?g{1>$QM!ve`w%Ww@eZUIpKY-5!lv$U=K_043i?9U;%9rvw*3;!_F9AFb zy)puVkAz!l8W%mGnkGynBy4|L#&n!%AK@g0dc??w)Cnt0j?!LX*>R#i~6BIG8R(em-FSc+h( zxX9>Kh>cY@%@fqU;0l5lBL0V!hM&d(WTvUP3EZ{kx}m~CLe{)*Gu2X*>=ly{B_({l zl@-F^UWY%Ob^<8pRvKz7BGWScwOX`s%(+9TV!wF^$xGnA??y?U;h2EVGG}a9^?iQ8 z#gJPoJP9bzjv|m1fiK*ww~rwH_EF||edkiknmFZ-mK;#G=1;mG@^TuKzi8y*;^qdl z=g~OVeZbjiYNoSaFu2KEPge(&#NArO9-A^?fC|Qnx*n#oLwPX=KFXoZjV!R3vDMsA zaqNX0(>X-Y6N}IuWDb?vYy^_j0SPuTVq2sO?2_fT+535bMXAH^j$wPb5s+Y-nmM$UjCoh#Qtoub$~(u(E?ELpF0Nn`ap8@tWLuWnn*e!F;VL(nL3#EnVO#uc|l0G4p6fI zLEEtafQvHM1Z#W_FnU&AR|@C;vV78R3N=e$zQGc~2*5wx+{!xmpQ5y$4@I>{g+!P{7gs`}cuguqWaMKwrDkjamPA%g2v%{OXg1oYRYnx<4f* zNr}`tb5Fn`JRfw&81#a;ZlWL3djq{>M4l`z=5={_IhdG{`kT`h??ExOxcFrhlDNPM zyABYxxEqQfObN0n(^uZ|QK|yj*pNnFPfs}0_kKqA$E>WmnVEcpiZ9Tc{L6rX8-;Sk z)%093pWv_q1=l^PKQ?86fpV;YFG zii7UK#mZ*$g~dh9Ib0X@^X6fOMNq5_Nc#YZU_;x-rlualI{m>d?72=q!s+}3S$S=V 
zP$7@d4q$)dGE-yNO$>wMOavvo4Wvt+JR6X=LArn93>Ww z8#u|70eb=!o{~$Q(=5fH76K%66T8M?MIm#0P`+$J24Z`g1R5nnKd;fAE=qAYre`8@ zpul5IjaZEGaIV}+4}2d0XWx!mx3;z*ovZ4naEU~Hx^{25q|4h7z)E&wUmy4l4h^-t zQ%Ta$(6|_YaG!vqg}y!f{g;d49;=4G&lyU5;^XHpBA=Y?kp7|(08rlAt4>v@XGm64 zHSxW$E!4mnJ*!Pb zgN5IwCivjsO`zL6IQ3l##i#p&0n}vBW`ln zI$ja@Ic~#ig?@M;$-baqwmZl{8D+ULT(;PFZ?RySL3pU%`EKV;L}kF@$}8$``!wa; zM9kKm`v<5K;}n6nDiU1W;M+_eXM!^6vx6AQsM&$~V>jJS^RVliQIg;ejMudrI!t+k z#>SxQl)QtXu94A7MqtMt8t^y2SMC9y&{tiKz5XuYgX!eH)|rqFdqeQHEP776k=_LS zlHLWNSsfUA0{DM%3-Ysf_v!&db1y5%K{zy#Y+TaoAGO6^ z32A8sYN5ITENdqFJCOu?dwWJE`+HphtJQ%WIoCY?Fq?NSy6kvCp;KS(dO+T~i6r>m zm)Ha_i*!{K!6pL;3;eZEgqe7M#%K3Y$%P;L$`=TgMp z(|%W0t6Q03*f%!*Qs)o;<=wr3&8lag3wPi5+4R&;7b#2j!BM;a%bfSEFLyE?@y5S2 zA!licgSL$<`Ar-U;H&)$d%zdtf366unt^&bUy#d*KgdAep-n&ov&yn3ieyj! z@83|mo$AAvEXut&w61N}lMhx&x~#&Gaa3=@NI42h+q^9 zz1-o$sLdY17_;bhuu@T?+M-DK;ju1n_us!}WenG@eh6=|ECIuiLvZ9CAB%^y$cL8T zqkv&<-+79Ud*TP#JzKA10~5(NO(}pu@F?j;JnUYBlvMDvn=Vw2CU5i`I6jhR9-ax= zU((3x2_YUtlDk-hxKKj%lbROMJL~H=ZuF@I0o0S@U^mVnNa%|F7-i$l%BxP`-o%4D0r!ZN;% zMvqh*i2(Wy_T|hIA2>h*r>bzIs1H^g0}h}lun-GNh$=SyaA`#ag4!b?4Gd;PQ!N*v zxVPoM0K>cN3DTvGcWpm-{@7Ca4vOdF#6v7+L}jwLY33s@8$L*y~jVQY83duaBr%L z8yYu#!Ma#EWH&3yx*Cx2PK7n`tw_o$3{8gXnFi385rl%3OC~2@q1qht8L9h^9UZyZ z*%ee(XU?9(@(-9E25~MYkWU*oX4fh5wzx%6?lAU4PK}U@!B$rGoM49hA7F=^;hylP zVmhze(8x_HmtVuN+)BPQ*dAY+LeZL0#vI;lh+$41o9c{hp+#wa&;+ z0@58QGTL;2w{`I%;u&u-MQb-eNhCly5u0PrZ8DlcF}+)>yjX+(!NJ_!1r*N*&YKbF z3E#n1kODxm4Zi~5?BD38<{Q_3t*8*JJh3gn`Sg<6xiT~lSlbpQGWo!&0bL&`jq?7Y zkY5XH80={|8ViFj_W^BnfL_abkZZ&^n(0drQpBsJ13sxq?#qXYm#>RX`Oz{j!JZMJ1%?&VkC8SvR z%9r(46nL>@2>2hufYN_Mmc50(zmroTV3yZWRU#=4U~;Oes(^9kG3t!_;C(nqSyCE) zb$zVT5tFleAMeA!68TNtd-w12xh>wm;5n(FrUvDDb{+%uHKt)~cvu7)2R@*T<$9g< z>#z_rc%03;KQNH#P;{8u+%G$w0UFx^&YK%3x*VKc#<|au2#_^T(`Ae+;+Og?j?r zrUW$kIckfSsi~l$9!)3dP^Q1~#2;Oa(wJ8C6Z72t8w!=?riFp8VZ;A*tt93E9AT@M zUs!`KW(O64R{>6Wi#Y*yZutu_d@)WeRmXa;WARS~JB?%rH`*t`5L7Rt{r(M@#(Ve7&Ku|(& zKJzTrExrA6zxb7tRpSPDs4q(T4tA&`@S#9;e^Fx#{1RLj3wo?3Tsj6@e}9P)Tq(Za 
z)_gMgSj7F%)xHH7ItXg)t3C*#jpujzc6o_4xN`QL?O;|nERfF^c9pqnNiUC>e7Tbk zi1q5^ZMoIDH^T$Ead7Td6lqDnO$baAW@5UV0_-!qX%8^_7>1Esj=N9C>)d{X#ad&my^;0~% zs61B9OLqmB3ys+M-VX1fWXCRIK z@EoB5vLy4M%3gb-lYb%#WKTQ`G#wnih>Q)1NFyua2f`cd#bI6E&7$|^ue2WWYlRRbvimrVM#*m&)> zCG&2ka#J$Du&~~-HX^8YSjQZz7IgQpmJQ`ODZ2k4$>Pab`K!|X>zfg+hKtu<;$jyso+4ThF>w1@R@Ntvy|%U%^K&MI`bYwXZM;9Ls!Fe}rtQZMb4RtC z!aT1E^o+pI-4lH7D}fuhd$;WZeLs{ORFDEji3{2vJT3rBS5*y3h`mi`hkz^G^C3JD zw~(ut(T#||8K(gO3hIafBEKNU@Gqd_ucHW}@Xd;Jr!No&eKTD%>E%91Z59FwauB6E z9s>j?VC)SqRCZefdEe_{KO2Nwzkb#CtAyZEM)x2JWT}RCqc&xQW%2MWB!d)xFW%ab3@$G{UiYV{ll9PbI5*D)rL2 z(MR=|_}HfXiVJa}Npbb4DL-5s93KU!*a?jM5=7;ts)&7o5)N_P+&?X!04G`2&b%tk z4G}0z+%SbW2O^<`!tUwoyINL%gu^j*M_EZ}@(uPREpYTIKd}D*XBny5I-8Z5>F(i? z_3>k2c;MMM`l}yy{*A+g>l+vxaXr38(c5h3|M!O*$A^<$PIh*@nwUvAE`RgS1jpNc zZ)JX$TQ&Kf@jf1+@i5^>3@(7tPx(JY=GZ!<%`8#>YL(!MPV0A9?%z}90PZ%3Nc`K} zOy47s&E(>9exm*Nb<^N0K;B=y3YvQ~lK!Jpf=5C#dEfKb^bYBL9H$?i-8Ek0eQPUO%{hAF9LD=+$;!0v-v_BD-@o5O zc0#A#m4I{>5+eJxvgPfks(Lv}F9TcHHjmuNhkVk8gSH-W8_;n(KFXuvhK#AO$k?*FrV13EIf6`r3Qv=1u zh4;%rE8TSF^Cs;`Hc%{ZXU~C4k#OJ!8Ni#UFT;Aas$#jEw^^*1CNg2nj(! 
zXdmnf+zVb_Ua%$N^wOrMr{P&K;?FLlb~pCL!{R9oU~BC4%dm@Lt?D;mN{_ zfukE{5bl8?AK1xT#BW7kz60k7IH!mHVb>$lp;+_)dQ8B(_IPhDEoL*C_bk9T&jLYF z%GYGM`}0+WhvC{p*VB#=hUxhDG29hrr>9j?M7E_P*n{97vSnGw zXAbz#Ag2VUF5H`o$aSkot4VphPM%i;0&fvWn5!^F;FN*e4UQ8Ceq>&@-J+8P;Y&>J zCdaQ;HATb?iWzkqCI{V^xRZf0?II_GVCT;4qn%?}*upSZVCRush@V*^MT>cntGm0m z@AZUjFE}~D>*=)w5&i$XVsO>EO=8mSJ#(Nuf#8(fiicvdk#ROTnGH5vDZr>MN+sEF zbpHC~HuPb$@fO)SXtV-zH|LyT!2q{z;rr}tg@S?V7f0dQBC(jf7z|3Dzt&CtQfs(p zE+(EEAfslJGDMab8VYO5ye@>CZ*}EQ7=Gi1ARti%c3cenBhEEIhBb)~cIWJ!-V4k( zuS-8Y3t;-#iF0DsHb(Qz0({7|*dsU(i#XITuOAA6C+_VO>4!dlJ}86ypHU1E!$}rS zif+O(gwIBC1B2F3gVT|j6;4O^SYWCJH9UohXM}pm@K=QYK`nKNfHJCa1O6ru3-d5S z*!fs>GQFovkN;?)@T>;>({Ewbx1H8Ll92~0$SQb4@XK8cCybd79E$bl*+|`+K!Q`8 zw$vmH02T`KHBSr;hnXF|hIbb8w3z>%CIe4lMi=vnNeqB< z5(vLoGH23}DW$)I>BI)oiF91dYw<9zeS=I*H)bBfd@%E1$Mo=A;5?=&O@f%FJcp^5 z;=>I3Bn4&!a$yAM_c6`eSiAwM>LDF@-QC@VHQR76>D)PI-VJPOjtQynuUFrLS5g%6 z6c2ecMf(4KY~lk8H7*6F+byd}s z)m5m_ZbpJ_M*jatm1*U%npr;MK1S&SGKKQ66R;rQ^WY8vYwP{L`UpXo5+^YQuDY9w>D1 z0F_(N{XkJnSAY1;fWVv6TYk{kd0lY~m&~C$shjV_ym-%1D-8ajDgF47x%?I(wY0Vh zsjNmct@u9T$X8UL8=juo)EXKZ0CK;)I%CXJ4gY`(Fp0^*qB3#)8buSty2X z13D}d47r5qz_IGdRE2=#5YVL3*}K>8fJ2F>b}TP17eH_WeFix&5 z0uKU1;0~^)M4vY^HGK~sC@(oIF$MY8g8Jiqkdtv2oV<$JLiU&~6b4!|(*#w7Nkw0) zf%FLa3Dov!uBN6F^3%4f@^C@^+FBLrinU~*-U5@W4*xm-aGAfi*I^V~D3nVFCuLM9-E%m;K~*09 zHwx5SK$3W3gXaAC^A|2&MERk|2z9nWd-cc9pIg;$ep!MW1@)}KvjAaXLYgIdmYL>u_{j7`}v=;8T8>T8uD{xhf_ zX^>ty-^6n|>89qGNuL6%6PLyQ19X{|SBBu$P9I$hOWQGDpRI7l8XEeyF|;;{N%Yn& zr>%L7Xxt`2%+GQD`#Bt?B$EUIyFnF14@`h;`aVV+KnE*1NPL`R+l-&`SK9Gb7P}!2 z{mV}<$w*Hy&Ok#Y{ln(qd!#dk*{5J6cld;>F2B@58k%!k<|Uq zFXJ%QDEJz~4OT?^My%7M$XwRa(lX5POSiTEZX=zy@8b4=Nm_$?ez$eaWjfoEA))KB z^cY))hrvmG4~O1@qAWx#J$bp!_OY_WP#0Fi5yGo;c3<95i|PY==*VdsH0y(xq;Nc(z6K~Zr-fOxTKAG}e>c)b$0 zF#;)lWs}LB?6T@=Q3&|wBk0Ir@4vrJ^AyrDZ4uU{B}1_6N9W+8`aHMC6e2J0*bjF> zjf?tBAZAliIAWlj!C(?KCL!KPJ9n}kDyb426L5U!74z>}#ntc^SAxNd3v`w}HT!2{ 
zVy|{L8U%HYI8<{INz^r)ZwRG18^ldu26O}C#GQo!)tG?`-MUZ!fUTTQhQKFJY3cxKLJ4nTgXS&2Y~D*)B~o4UoTW(Ix=r15->c%f!Hgm#45%+1i}Giocak#Nl9sGYd(BT zkg~L3V8M>hsz_(EFU3o!vjavTj7Ng4~{#Tt(=r%Q7;iz)dPumjsb6xGzI<#3Fi2 zTvgs!4XNW;W4PN_JB}W zZ3-`_!ug?Q5KtY7gabvD?_TjCtN#;iFEg8e87G_4b$6X0Ibs*D02l-aFQPRXKtaJT zL8u<>l!N4V>=ArHOekD+Kd))szfU@`ec_zgXJ3#9^(Q!=cqS(5JmP248zO5Nba+J@ zxMv`AefH1#DhLD63*y`l&ciyiymdXk9pn{B_X!Mq#yUXw&DtfF9^faf?tj8muuaPC z^yo(dAg)}Od-n{$LAL)ucpnfuGWi>GuzO9JUMQv~V#i#2aS=2o0rmvQtx5i;N5jv? zi7RWQns)0HdmI2-0oUw;noXrJw9dOAAesYcGro zPP$|K`5-(EibqqeF}bx+3#MP{3qC|BWgsy# zuA8mq;V$FpVdF7?QrGmQ%=C0~h}WrkfFLH+uAvHB0isAT=q({BDBFIY^bu4# zw@J~pwY9ak=lrEMIkLXK4%61={)Qd0+i0T(kP-ngNDDs7zUEW_xT5U)0W1SH2=dlt zp^_y-%RaX(o6;ewIwApkTiS6S@DAuiD=0-woeN>qsyhZC8mMeV36(z}c2Z~X*-K0i zG`W##2AhN%uu!-!FHV8Da5enr&=nF-FNV-Oh&m&piOSfCLAoVFhv2g8)~WXxU875O}flo9Vhn0~}N{EvASw9CVxoN0dfZ44q`BaTpsI_eEC?4I(OLLi~3n z5CF|iB!{>kWM#K^v<&N50jIKgM}QSN3XZTg9aI5%5yTzc-dGH884Wl&0z(;0UwxX)c{_1RaVe<%+?Gv}DST z=yM}Q#c-rf-NL;KmBTuP2W{bDk0I>0$R{q1%YNsHHsHrV-iZz^YRgBtw?73*3Dhg2KYt(-HZ3s-p?`+@tKG0B z{$8JO@%XlRbS=!yMmF2HU~WJ=keCPir4KC#sjd6$yp;^7^Yt^OYol3Lpnd(SBND&6 zrzb-V)CODDHeCou`ufVC7T)iHr$@9_6bCd7xPFRaS%QC5VktQZ|#6 z8RoiCMx-Z;dVB#D(o+}WbKMw=Id&|S1n9E*83h`m!1kYBzhMgZ&9YN${O!8 zmxlr6m%2Y-ncU3-ebxI}Gv%MB@;`jIlaeavNQ-Nd46W*0+6LkEM@gp*1GtE#G?%Ty z)MU8toFn9sAzQ6`Q!c|9;@pr52f)lN$|y)CK!VZk1{N1Z?$tPe)$qmq1DSkKZ#m6a z&3BKE+VQ^a4Fy#gyLiv%cgPdT|#?8J``YA={qKQSG!ENl~P> zit`~S=P^jBZt_?HgF`#lFU+K5i26(2E3Br=`elsYYm9iIXQtFb3b&L#hjE`(1U8NT5Q=!hfd7%UR^UhD88iSnwM@r$P2r8 z#R=)WhOUkjKZ+E$)Z*rD)0+wx*#Ta)TF=;9mCU09Gy?Vy1#@6li)3mky9G zv#3x5LrXiCn3&i)D)+A+PsmU$P2yKv_=X(B*GXX>T&4*3rkFR8i*~R zb>3U>3YB188L+k#qF zI%WWeS(YG|67g*C>H%Ee=|vlOF5;Pc3)T zglC6sYc&lgF%oG9`0Be)Y6kiJJbgX@Q$IhzA_pCTaEpp^t59QMo^vM1PKPBw$m#*O z%6vOp+DFOyT_&Ai%1h@RQ+8O`*g&1i77Q|jzgAPcjM!BGqJ6$P2|L;1YtCl~eahV!dUc7 zHnI;rm;P9wSI1y4UhF9Q0rePWo&A$pL_i34y8HGy^qH@R!8PvDEQB5q*A4RFT1Z~66qYsY9E_%Xh^dSMlQ!wFwrq!t( z8Y@-DK`fJ=7y&uHn8#ivLpNhe7t?_>bff|q+`Q4zQMf3^7PipA(_AcSU(G_QKqA$R 
zdjeK3pusuMiA6*0p^v$_H)56B0fY3JJO{3{5FQg_w2?gv=Jyi>g66oS!|sM&3HfAV z;#ey^1`)ra%4Wdc*NDTdkJMXprpT*^@L8^WFvP#40;ffNASR@(tvy>T1^6IkfgTiU zE#p{=AF8H^JW-+sGoV!+4S8eOXW-**S55<*takx4N%uI=jRXuYsq%cLjT=IG?+4;U zzt36G3rTz-`2kSO*JJQjoON*^B&I35YA@x9Cf95rmIs5`SS-g-_3dk`!Q;`3fSb59 zz4;6$HLz6QKG5ww7(|Jkd9E{of-?DGfq)VkieV=QN3|Bzx}@vRjE#h7!L>!@n7JPU zRW=6|^+)+oAud61ZDC3HPMhSeckkY%$Nt z++$hd_0?nZRV$=m`yey>dx-{7v)v}XfpcRJ9P-Brbv^= zF(_{e#{2`3^>0rj-h)p?K>`N__&f+b0B#uo#sbj!hVhN6u0jA5qb`y(!WqHx-KK1x zQ&l<)?0wYI_4W$4&*FFX>ACM_HIYExJk46}-}O8qJ-esrR^Y{|j%NVNlrnZX+}%J>U#qlUHR>_bz7C)ixqJ^ROj#zM zLI=doATdFkrVmQBkd0?|Jcg@`XR?W|E=6@8BT`*ZtP{Gp3e|W4gCH;hp>%NxH&97` z@}3b%PChGiU(t7iB5>cT5<+ zW`2lZ#5aWkARPdzpLNbpQ~+gC%Wt{e&&(PoUGz4J`l^655~30k(jXxrEr^m5(jB6plnByDBPk*vARvvDlz_wm z0qO3RZb|95b94NDDUOZ>7v(H*FC#<=!Izn>T z*Med9u5c;<-LviOd#hiW85xh~5l7LBU*CW;W&0e|bT=wpsBg}BSII2d0VZ~>m2eNZ;5jr4==?LDC|#wB z$^(W+(BhRw#B)C~hn*{Q_wHRve(UOK6-Z7%jeq~FXV?IOmmAB>$hfY?`-M>k?DYTC z9#s`q0AyvPt?gZss+!tY{7lS?qkz-MWrCs{9HDLel%QSCp?LUEnbzKs<}`lePG?k$e6n2` zXZ}p79h{uNjUYk_*rmXu4A5mdrJ58u8#}7%8**#06};a(IN zOD~^yIf&~KGX53he~CcEadT!c-}u4>%20r(@7Hh%hhJ93$qt3^Tql>zJ6k`{sZ|_@ z^ZFMxf7fC4XaWFDkPDw4ZEf<6CM6|7YF|{AHk@*RU`i~!5G4(J0=NT@gayB?slvyK zZkxoZOcPHFn~rxd7{OI~%nW4T^a-zz5PY%qk;wpT1_T3pKS!A!|ptO=m{mtSuYlAQmVg z1NL2{6)mFTS#8#;%RZ1K6&OMF@~-yt=S-LHB)YM2aB$#L6oU%@prMRV-u9vu;AsWd zJt0*|!jd&4aw_9om|+27Miy|NTo{1760lcpuKF|=a$yhvY7+|(&IohN_h15~ivcV3 z>(>Leo7zB}25Q>eW16hTxJ<8Ab-p-ZT%kc+(w3v8AYlA$?G*rFfmYxb6AAQlIO)d5 zwrm9WgK_4s3wslU_co1TX?Jwohlc@3X`}_P^Mhm2so+z}%WlZ*n;WFGxPKm`>+9eG z;A-Ii@Bo0kQr4CB{d;7S=G#QzE9zh5U}lEM8zeXKP_CJ8i;ap?dyEfV4Ag-6%->(W zOY2d2(X+6~iiCkth42OJR|X;VaUx8q!ZilDXle)IVHmOF38le-7P)$rOS{UT?5{I^ z00f%Mqb5iu1FkEVU5J4^ZI5$=>5fF5!8HIo%ym9X9HGNKTzY-TR&hh)P6(I2y87+i z+QOk1U}6Ad5gfBueh9;XAV?we3M8g}+#Lg!eu~?hyLJnYE7h&jdlnt;QEJr8F3mx5 znjI#?>2_O^HBY1@3f0$KAd>tG9A zwVyM~>JtR72|e{Ao%SXB@?wW^d>rqvf<{f1offZjPcsvfH?{@dbJ=eLZcb3yiBG>x ziQ?uKUZKR5M}2b_uBdj(8^USJ>e`bn^_M9>vxS?zs_*jmsjTIhU79Zy`dI{j5*9^Ps_)u 
z9Jik8#(iqi{LuU78)Pi`Nc+pY7^tX!z{}M#bo&?i0~gWq^3g>$MqQnxuu|v8mK;Uu zjDO9Tgi6SoA5ni1$cp5^kAKw;JHi`ZmQ+Bm!5a28G~T3GYambGDu@F=e*&;G6iO^a zcG5fmc3jm=w1-0scrE>(#bsRWfK~uJa+AY-zeTcQ3!t^I7zj{*xk!e~rGe7&FtpRsUta6mjGo+s)-6Xg*8 z++7)Z1*a*n5VHE5AlL$_9!$*4AQuR?&bO+@21Q7hWALZhdBR*e@2#SycK2y3`J9RT#)apQhy_XC(b4*2#AYVDsi(I$ zWmKav52S_JwZF3h*Py+LD5j8kmlSopYIUKOT(Lu|gF&g|H9Q}JvavsaDQ4m$-E;?d z^zo893c@WiYzlVmU*oKSQbZAb;&dtH?+c=0hKt1|C6N4vmGkA_xKHXpi4GhB#MIkf z>E3se3(L#Ek`1uxjfsjH7#+Rwy!*p}K-NRKUM-l0t=t{fv3~wv*^rK$0~8ZuP*mZ0 z{|ry*;5c6XsFuF2E*fugVW({%k6*6r8lAQMEb4v&D`h;;{T8~C_qFgWIgBP9wO+MG z-+`q^)(3mmP7pvH9dQEpFAj{o-3I;QrR`R^cI<`o@N_{m|FpsVNV;OG7DzOZ$6eZn zuve=6t1@nKt#yq*K#;=ZX0kn+Ad=6J_ge`<+|{cW#a0y>D=&Hsg5ymbl*ZctpNiXBID3bx2q^X^din%&|r@?yb%%? z7YBSOpnVAaD1b!;A{WO%YgK4`hio!%#{&s>HH?!tL033mt%gt>YrZW_R%annF6#5=p=e)2n!6 z12$0nfXYkO3Dw{ByPM@Job9HzNI60b^tz0Zd}tG{{xM;Jgd|FhC&w15rjGpqB>#-h|H$3BLV%rEt42t2Pizz}eF%-HO zEn5B54nYQPBbYTPMvgA#jug{J7n}Z9EC4BnU_qf69l98TkE8yJAVx8u+GlWq1#-X? zr67SFkX){AJgc@VU}^ zt6EnVp$`TToQY%rc!yZP+dNQB03W;W<~a;Vn&(kdmqi0`$>?|8q%tsxpvQF<`T*+* z+c~_UmNh^=G@#rzyJ+X9mxDn-D4|vz_6!p)DkX}Zb+h?43{eFC2<87nw_ffYG^+^R zUSgvZSO09FzxMx4_@lca! 
zw_j-X+``pRn{!Wd$1oDtC3D_ZY9Von zqK1?||7#wL-THSsA|)1&UIZo#-7Ai6m!u*+lGh}~1p`MAeFH2s6q%YqJh>nCq3&*F z?90`OBl*UD69{4%UjJ$=mVK0jlz^QI!lN>m#J+w`{8z6l=z8%#4o4urOwOMFs|^D> zl}>$Yso-TOt@sO`bS8bVP+z>8G8Zq)X_qI1uk-=d1p)~>J3BZ4!vq#Hk5A;{=7#JS zjWC|&LVO=r&yWb2%V;yUQJvL<0V1yw7}Z!vlRU0*T5=+vqF<0y3VDG4UboNyrho6= zqZbPFJC8rc{f_!yrWf=(f@^2fAN>a4%xA6q_wN78&v{&X?ckt-`P-U{ zu4H7@PLSy`brjG zj)hq8Nu;(V6C)!WM~a#c!N0`V0(Rrlbscqehu_ubIyOPL2~Zd{3UW*U#e`$mV4<1f z{rg_?B<5O1nAF#iT->6f$U)dx_sZAAMCyC@oTusn0EGFa z-$w8=_)k>(qgd{d2z_k2$F7V3GYt*De3D&@0VDXJvUg&FDypOk;4yy`#6VqPvv&#d zhk~;hGR9@aaYan8fh5rr&B+&)1+4fXb9hp2HWUDjy91Yx*4X~yVf-?lK z26P?9z{LM+`$qISB!4wA@7IbeOvkb28fLZ(a+CUP8&i*j2v*{W8aM)EJRBf5PmGN6k~Khw!Y(+A;`!13{lHDj z>;1i?1gz_!@Dv0;4*p=Ps{(1bfa&O^_6u!K&0=64QePwyl4^v0gYXh~FA_2qgfgE*H)D1#rZ_+8;qo8eCg=e#QWL$lRyqGU^>0 z15EYT@S!V6xNsa01W3jgCI)d8m7WU6t)FV^;AO~~zo`YMthxDl@PC2UP+a)sR=4+8 z4sq1M#{eb*fPa7-*`@IZ3w!|Z>P^7(I)``ruURSF6JB>;3=-H^|5t~kum*2hhH?#9 zBDdwD2@Z+*AhZQrB-99S7KY_n$e9GvduSS5f_N;P4TVjcBS4~R1-=%hT?7YF#RQR@ z@jAnjkz)#h3x`QOJUq{?@n8Kx96dC0X3vL!;$OWX4U+ouH*I?ZQ`JcUJ zC6BKK0cypMgG?e@BYz4~?gbq$;wX~8TIi$+C!fZO8FG%t!4`^xv!=X=$$sMKy3g0& z@gjlw_?7@1v_9IivAl7%4v5Jt{`nJm9~65i2J63}Ep(-r@A0lXnyA5F9pFGYeqg(a z(={+Yj_7fvu9|}H2qr|(z6l)&^V!D6qD|h9{oepJ0Ej6hbgG$NSh$rMrI$2qJOyGB zEycVGzpFPBC!qaE`ZlX8xLP%1mb;)@5yh&R6vqg^0&-AND6R{Qjg6T1TEF_W%mU-- zy-P(KST-}Me08k@7X(N^BMiC`vc!XMgam+>;9-N877|YN=VfNXim|ljNGE1&Yz#1@ zQCw6>i33>Pz1q_^6^I$+9^>6E9~&PWd;=B*#46#Ho0|*F-9|rgN5`_3*`!gT^vujh zRg>@|fPso)c>_Kxof>t3X4#h~XZ*f?_eIzQ(kw)Y;wab<)&uFH=PzFtkF0KP+_jx~ zh)sYDl(1W*tQyqr&v6AqEDQ||A>OqK)p-f2uCBUgef!qAX0eIHEZI46V8DR#8M7$3 z6OmL=c`bk`WnyKGa4xQbDDqX^sT~L#2qhgY1XNydc0|+%0bSS}3fyU>A_Z;MNQhlQ zNTUgNY(+-}+71~()qY41^!I~J_Ce2Qu1#}Ra~C`SF!z<>yS{0#0-g|DZ=1q3z%T5Z z;ARmsG@JyV3jjoCmt`O@1c9B`-5Dih@qwDK%-Xs&H8y5sWDJ&ACERH28yHZr3mBmV z`GJirL?6rcB4BNW>=wdo6+1^p4gviqrBnl)UnFSum}SVZ~%I+j-n!?Wk}KfK!ocSVh9-Qo{tS~1ymqcUUzo+5*$e9~!2r7&+ntGK^#Rtvt zWm>|pTs>H%{|VT!kFyZ#(yDwSL-}kA92-ifN{$Z=9nH-@1l%w;13u|tHp z10Ay~VHWwALYSHmRrad|uOowe 
zOdZ=(&liv+TN82J%OH<3f@=DmrO=JgiK96_n|;djH*LyvHdhgl%Bx_cO&H3-LSN^P zcPH|yY=L7@a0GGbx=ja&KU$jZgGXou^8$c<)nmjVD*iKM?jzta=bc--VlmmHWJ3gI z34eM)jN@@6ID3d&5+d&t|JvaeDjCrQH>&C>zy^p5np+3w8M+Gq1{j_&;mb zvCqG+v$D=qL|B#|>7s+p_6}DzWZ`b>t2h=echb4W2>ix7h2HeKt7Aua&Ml#m!;Gi; z*RHCMPK|I+WhvM!a!-zOR4o&W-k`Vq;r@Fec&f8%Ujdz@9*$tzHVN-aeX%{XoS8>=ob5tzReEDlwY#sFI5{aU7Q3uUyURo z`Ukp3N;>jC`zHn!$viAa9!2Ek-i>8@Q41QTKW-${O+k=10#yU3a#>Bcqm2fpFmZ>a zh+T$em^CkBDB)PBT+IE>SEJ$Gxf4TVA3mI&IhWAR(>`C1jaml!YZQJXbJt!L{6M%UWyNXh!7Q!s+$nhv5DW_zDOl{8=nF z7Tx{tqqZG4rj%^VN2+a~21IM+@eUu&n4Xp_Jtx)Gz7&~WbuG=N45hfDpRtH<+Cf$# zO2%+|;lN=n-Tz_ItNw2Zci2>d^jUv#^lPo0tf3z22rV7kNX9dsI!Ehw>22m;)w+5; z@Wfeo^4DTO@TW&}Fjf>*cE@%RWlKFc7*u_VZm)r)Q+g zZzh|$dGircd-djn;kv_DlJ1JtPZkfkS9SB~G9~su8&ALaxVT2_EPJ%9WT(Df8liDj z{lmAvd^D!3PXFk~Z55 zU2;RwH#Vay4KLRcN&xA;A=R^G+PH4HMlmHDa!OlrV7RM~8cpDI^<;lnP%M1bdS#-> zsh&;b4ZmFM;fc>wEb%P|`5WJldhgnp$3}mXoDnsv)?+;uPBPBfv&$+q^Gh4o5&FCkQO~Wrv z?ZE+taohou3imJkhasg|?H;dP2pmk84iC8vDK$G6OkAU@^H9=7qv?y*@|uRr@8_}g z+*DP1V+!b>ZuVT1le5J9%w;OWh2m!keyW!B@#&ol)@3>})8B7*TbP?`0kuY`OAn4E zuz=luo0Xld1HD-;^5v4nW&uPJO+WcQuGNO4cEJ&BC(4-@KH0b-bE7QJoRKt;e%U!+ zD27H<*B0F1*6Y`*0eBQKi*Wj15$^+E>?#RK@XdSQAtYB&U<(0Um@6Pkl9c3$rTIb3 z@tXqX9SEzWJh9&2f6R89bYU3I8Ck4wAh2~l=YJxyajJ+%RFGzRDSL=k>kqN(OPY!1 z4+G|lgT47`H(Zo;Y#Qci?OqbDwS|n7r;g$+5w4L2guH(5TCL)ay*Wk3#QD_g+TE44 z4mwk>k^o)f`qU4P*iw(Lc&e{g_xPk6QMzfZj)^!5yfG)*vrPXUke2ByjD_Ev+bfk{ zr^j$}*;Cs0ZG*-D=|)ZM4^~FD#pHSA?&x`Lo*z{*K|L44bYP!|&wL5^jfjAAjUQGZ z5jU_3Ab6*Ur+M7l-+uxei}x1<=$z8&=0m}x0!0wIeRnc1U_@#!makazM@RB9#`u1y zTeS!GKC`I#Vtr^YVwjo=@bNh`gxuWJ+v$OWZ)ayUT}S|fXlGcF_;dJC7iXG|KI~}_ z5sR|oNQy7I&e^FOBFi(kAA}n%tUtrLsgN=}aa-ZI$&BFl?!5m{x|f6D#`4uzVSTwb zIyzAttcyt~#kf$}QBj z&ab`gy4{-``-iO7_7Za!8Qb`~xd5#vmrO#818`diujC$}G~9z71lHaA6h)ZaeqazG zkUf=1;^T|`tM-UI1JInZ)N;b7OAt6IEzTU!cwOQAzMI=%iGtW$UiqIN7cS!=IG5&e z#5kIHFAd4VA=)Rh@H-1N;tI!II6#09H6sV`ED+e+&pY;}>X8XfOH9~wjNEhCuq&xY zg3<8F8#OpKb__(g=8X=p^#%5Doy+dy)gr2cHNbW991H7 zo!*j}j4$%QBwuSPgDPZG-%z7nEbC7je&7WXk($-gQ?>^)Un@sKUk2Rl`?G)EcCNcuX 
ziLdWT;lWJ3?3^4|r(S*{k|0W9moZ`}Fx7v%PE?pVCe0~y1fPnSTX8Sy%S3sq=|Si- z!Ma-yX5J-GZC>*`TJJGx&7Qe=+eSz?r8l?ty#}Rb>EI;QwS$|+k#={?W1<};D)>gU zk58A$rsnU{<8rM?Kz70mR4iTn`s&@*>suD=+~ z+u7Ms2gT^MdH{%u0L*5o?1h_KRV40UmFvA%d=#|}RPNcVEHqiQlY>RiKM*AYVE~h6 zriu}aM2Myb55y%tCbW|%N+fZC_#i^i(P7?8I6X4+b>!xR%(1HWv272mK0I#KxI9zA zy^zeKkiG>TA6y|tpUT@{WB46SgZIV})`0Y$%+?2qI)0^jq4_D_*7f$W|96Qp17Z(tJBtY0oQH<$;G6-u{u(v(6z&f6P-qr`f&DlvGoZkRMhesma?y3d_>4rYgo?%ynO%i(}mSt z7>?~n^^aY9FPW+@$@UIl3V5`#VhNCY(Cc394M2?xtxwk{PoZKjp`0rpoQ}k`H!@yM zRDJ&4HaI+7Q7FQu`y=8!mMNwb!L!-skNf8UpaIf>Gj@Vc-155z06zfUnj;A{zgltq z2epqWu~fsqgXSL?z9gii%y-Nok_}#bZ~}~3s=M9!WkEirR|Bj4As`$w6ceu(c^>*8 z#@VPI`AAk?)mK)qi9FArtDAS2038F^{!w;Jk1^5LhbfR54ICVZD1uddIY|F@DW;c0KRsY5#51U-k(DvY}V0=iMzh zMMEa}UP$V#ACNfE+Da{69;b4SsZ}f|PcB@06ID3x7XXhapb7Qi?%L_zQQpWbchj=) zyu@*L~ZgVw%1fFRIqI&YQk_%`GifIb2bvtj|33 zcTfximuzpl0+Y5;96JV0n%t)())r!e=Stez%iyaxEXVUp&dLfT<)j7IQ_xXz+tb6o zJPEhLuzRVPG=hSqqK^dJf!~~TG6r158qd#4)=8B<)5RkJ-43;RQ8m| zX%an6sUj@9#nzKfY|g4{ID>Un@nvAGK@QWipVQ;vYtKoe6uzeFsE>~S0piu2P4=}@ z`!JEBVeFYef%;o&Fw8>w;%3QOY1=Ik8nLMu*umDm7V96yHb#nm~N0Lz- zIQWN8jI7%4@KdvY-d=nw_*89AziiRXYIuIYT`J_ot5dck?J+np*je13@Mkjn_=>fx zA?JNi%p7rToU3$>(&3R`H+w9-H^F_@Th3L0q;qw#exQE0$+wC2h4g!TshReP!=Q^4 z{D$V9M;!x`17++J=~UlWX0DIVZUoj^6&|UIyN8xe^|x(*%0#iK^arPoH`pY-F_|pl zsfd@aC14PZ`K{Y88_97wOe?(Wx~%Kc`-v?<82tHh?Cg}hbBpKaXm`a`8LGo!O*jzd z9UmLZF*X0?xHZ>RWcu{W(-q~$oIh)*QxW<|OfY<8mbYO&D^j&~+|gm?SG9ZnnyCex zvT0(f)+q8i2xFjtL%RPF4i(N*cm7h^0_it#TWxHH5uj27R0b9G5b+M+ZSUapW`epOoCPtw%?PhmW5hvsflH z-hJUfOe492dh@EEJoXuC!_AR^@2JplH|j=h+;N~5V8ma=9&=o{A`n=SIl1`Ri`Xp2 zq_$v=Ax}0-A0weNQR`3p_=eTwJV9}a-%12i=y^f6{4`WEiWZ#7*Im|RftiFIf^{}N_7eTl>X_#K*r^z3r7cq z=8L0q*8DhVVTOx$Y>m~vtA63#BNW#-1!Xg<=6tQ z=i+(H^j<%XX?A&((IeyldD29j;>sfY=H4wul?lTAfkS7bn4~B12TGKjod&rd1nC=% zr4GubcJ3yJHf}|3;f{Zra}}hz9qf3w)ylJa`Z}b#HBOBk^lo1ka7pSQT;RK3Xn*WV zqj-@yg1h6}joKXr+2P#YD~`(RA;p9`3`gvbD5xO(`ujLNG_Z90TZE8Y3HH90`(z>> zVwC(AxG965=wJ`(UhUsdM2nDPHLR8PuH1Fz@%G&(vijM~ 
zERGC>TJ;=g(}Dqi+-2EXkJpvoe>g2FY(98w`e^qWiGXVJ-{S4M{2L=0-IW)|C%7@0 zLNf2NF1avb56I#(B*@QH_O#(29_gO<;m;VNbiVXvEk|>ofMcsrRo1CJmucz zV}kou-P#f!3WI-%Z`MKvLiv~!)&x;^Jy%ml0xqNI8eU)! ztln#fWot}ht!$oBhJz6Nz_Z`(|Dhp}Ard?tT(#fld@TCV^Mq9StxiKS-nZ7xwp#B5 zO}FOc)KsY!V;$QOr(2C`Gp;ZgiY0AzSzOnvTB`O2ek59tj3_kbLL2(RBz0omFq17-Jtp^JOzd<%$-%L7t|8jUs&&i{slwym=|Zs60X1Hzjll?BU;8{%-rQeU&Cn zDyLfv5z;v=Hs(2&$>6_hD$DvM{fe0SjN@ND*S~$=^8F%N?>?|C90UmGll~VVRXd(T zT#NG^j>bBgCkSn}YBN{2-1Co3-9QZQcoQMGh|{YIdKTsE?q{sy-kYfvJEk_Qt5Kz; zVqW>oIq_S5?-Ro0)iVOi3g?yCAA5^W<#!$oOY8E3>TQ7>4)+3VM>AJG9yHtPKX44C zts=8UeZu-Vu1B{uJ_YVw^-~dAD!L$zc;%Vyd-F zvQMM>Q_5C9Q7cKihw>N8ujn&UY}M^h`B#b(-(K!>%D7$BV%y?6mB*KLY*EJH>A-!o zC~+B=@{VW$!LF=wGpTB`UY&6b#&Gp6a@yhf2!st0!1*)F6NB+6^HViuQ&k6IY0NLb z0Fy~2(yH>W+RohX@R2@<5QYiZ%`t%VghFxg;RXWST|7 z(vhF=FJb=V8k}~>wBuTSju##QUBXUzQ^OFvk)lfpVb}CJiOL5jS0^KQ2Tt949yoqL z1xhDVWr5`7C@0Egfhv!Fdb)AipJ8SsT}m#COl3kiydx^x*htu&Mnza27N(RNa8Bhec8RW9$zE zsR2Mt_d*%6ql_-(KkG`W?!OM+`X^85Z#%+?<*5~(pd*LS-fu4er@vk1|J*JWNXqoK zvUV~tn8gNWI>1WvjxFW=`O8X0cI0p>;Z zxSaeZPc=prsu$PZtX$(#YsDMY6*j_zinv}E%g&2zcN|}gtdxnXk)sC7N4G3DzxT&o zzxy=OVLy3-j`m$ASApG%EvdibDXm*SGXtxV_G<6ljR70OAIYv}X@aqy<;qkqDM_t( zSJuFam)cpLh{HVj@yZ=~URS%WTnf&pX(GRrzt{2WR&Rr{)GW;q1gN1KAaN2z&j6G^ zGBV<@GI$4|k0dN=bT28T2=YRQP2pE3ACLoa<<~<9+;+jtDS=1@BHMQj;?^q3~oqT+vu4Y1~%cumZ`el-rkNJNFI?B>6DEPfoEWzCIH|Q zSi}r3K>TI6=(!9D-e^1`doU6haVvBP&p_;Hz{2y8(-AiZLjsW&Fl{?I>P43%_a|zz z1-4$@VQu(!s7g6vKcfrJ^w48L*pgR}5o}6LcSLpuBgP@~Cm`Ob&L|(evmhtTl=lXYJgW~voP=+j zZ1``$EJl8NuJs!BxplGznmsqppXDo~9G0#UJyx!N->3CGn8DIsxV$Cm+!jj;F38c` z8mbEdHobn1D{EUJsbCCNG&z?#e7(}u6PGf@wdGqMj{ki7$vibxB$&kCZoDrPlN(|^ zIF#%thbc8)U5t`;(V$7|lihtHD)al-=#h1_$#!=Wrfomn^=;ixKiVh>MGC0(!;`jF zuYHzUq2Sxw)cpKWLjrX6C|JL$3LQ;-B~&LF}1VE`ev2Z1$U%hLR~>6gc1N)Ahm9N0*nA)EJ2PRWhEtX zmS@;8+-8uf!vg#{oyM8pzuN<#%RBqgcX%R@EGqF{Dri3-H*q!rFBB|;;H!H3-mlL* z2Qb`AOSepX&P5*+%?8{Bn1Tc@)i6v14UJOpB+a}Ch0HIow|sq)rAo;>Nf_E|QaA*W zG(h{@{JLCtTPXRq1sB+sVD!zvIUGl;29{R9ZRu%f>@~CG(QUQV0{i=ZaGFuE7HEJ= 
z5aEl7@+mt6Y#(=w9!*PWfh-rd!8v|Q*G$3v`y zW7)d3{R!tf`B|(9+YeS622uTWuJ5j|U@F9<5)>xc8!g>!nwt;);AbB{COd`hJIMn?CDSffF!`V5P)t! zx-TdW#RIzTLr$6qxQPCn0rLc&0kEQO3@!ZzzW=rm+oXMY`Hvso1a2fUsic7EfmeOm z5H)Qws3E1L{X+Zv!7&0c#Wc_5DH*~g2@mRtyNsTYDH5iP6Z<7Jb}*D6=?oB9URpY+ zG6}8|iHivc0uXQ_LJk4-Akb9^Xe34H5-?EXfh_O-EwH#OKZxN>zy#sy3y;;dgzkygIch~I-2gmH%Q}?mlXHQ4ufAV<0&@d3WDjlV8;PTF} zgf1nU-Ial2>ry1QtmH;{i~`4|Kuy_%nVgcN?Sgwn+Vhl+1B;55R~*+kEtDocz84{- z4P^<7e;Iab)7j)pDbq~MeO4pMc!x%B4lN#GN}Aojy*|t7Z=z0>$f_uHZ31)Iy2W{) zP$AE|v-8Ez9csP9kg+IbR#I65zKxnON<1R+??06Ob0hOgRD6{YcbAbO%|Nq~Nc| zGnhqKvK8VF8&E!LY5nCLi~q6Y1?CP6+)SRW*O2oz&dSQBgKX=;p1Aiht#{ zY<3#RzumGnvHEO}?swgt73{&-!-MM`{DK39<0ATIpg9Co=bpY*z~49@q7F~O&TP42Q9)-YK}~MxAL6S z^`$vW!pm(e>EjVQ`iSu0u+F)#LNv|v-FpZ6Z*^R<6)k@5^tq+A*=y=DZppu$ zHfe&b-qf?5ER&5>1rHqSw&X^1lo(1UL@J^-pATWME^0M=qI$plDri-oU@q}^ThxW! z+KEOVPh@OKbfYXguWep1Tv(ltp0s6gD($e$3mPD%afUqi@6UKH^0I%8)F(*m@UYG= z)X}{)er99cUK?703XmT(hS)Tw>`0|hI8S1F5dEj}Zj=uIvq#2ov`=-u&rh}_a$ zVL4{AUh0^9*Pik0a@*B+@inwHi4&>L{uVk+yt|_uf@liEpit|f4`hs(O{{% z$MNfU?JdnoZfX5MLw&$$M~Q6PQC$e`^@XfvQ-I8zA|v>V`9H?6++O8;W65u zGdz_?RoM&Xol8(?q=j*EW6UxYcKpk2QC&G#+Q#cno@TSS!b43tiUIE(obHwG?grdf zJiyD){@kl%J^ki~zpMJ3=lpAJQ-%is@^Lw>VETi@sEJPLu{Dt20@)a1f zrnbKm-@hrqEQC*#No8yq#lI=n7~uWEWo}?^!}LDuRIYkOa*T6Qf4LqZ-{56?754=G zQ+<)pM_4r2OLl5TKlpW#VN-Gjq$lt)6i7PC7}Q>BlZtqH&`Q)baoxnAI@9Wsl^K;0 zVSSsCRlu(LkMB4|zn(_!HDJY)t(_Pbh~{-u=G6D{dD&6QhoQ!AJoN!|&sr`u{Td(P z-1?)KlTloI(0}sfEIg3%!@ZJiL9s<6)u5G76(Bxzsxqn?A9Q~^S1{o>3MCCT_YTJm zetviU=F@zP&?eNCYd6M^r8jDo#$CI5gsdYUwp`^qdKKm=({i;$IC-E)YNfn8oc+%r z|Ks1AbB^y=o=qyc(zP~_C>38;_{s|~Djla!zF69)S%J1+?$xNe+-AL(`NbE@x=sTlAc!vL5;-6mEs}q5{kd3hT5>+QC!bV zV*4!?Qz0-TbUxGl2lShVF3wD3z77Ldv76h&mDtJC^0!tW zh4n{|8|p&r%;`S3lW|4v9h5pa{v^tg5=5|Zcsw;{OYM}XEc*K|I%Ly4(o0_~HE9-$7I4Z0w<)gxxYKfK{Qa;feHcOvw z*$()(CL9FetT5T1Ua?DKw8Va;T9;opn#*~}xbAM{uHdf(!7%{jnzSN>9gk?o&8}9W zzSmsPW}9VV!(nzmjk@hI!*S*kExD|8T-6V);ct!bXSWsBHiJzGD++};7BfZb-CKsf ziEYHB^0KMSs0Aia$>=qHec`@80Yj%!MZz~+Sikj1M=lHhG`>WH#;u>?>1Qn2x(@tn 
zuP{6P?Pno$_!sNW)~mG_HD44c@OK#UI8G8fMP#!GZaD-u=HR`0BqKSnZ$iP#CMUe9 zQ^WgU4Z<1>qR8l`zLQ(#)aO1L{r^N|xsOQ$3wEX`*1Q`6ahy`Rb4&4P@2*YH@hxHP zpbC}(lipP#eM&;IlRW-IvaTN1g89IpOc8F>*V8|rP&lmVW`3V)|8_rG3;TFY#j7ZH zrx^O+3oLV11HqrR)=lDTOGf+LJ4^HbHE(VB#EpBbaH2POKI+r5en-@iNbl~Uz8 zc1@bsT?>7Cof@T`NENjHDct(oo3xiz){k9|^BwAjSN4teBsY?6j3h1VgqiY6>-X{N zGfV0mTrX7iL^!k@X`aCwzun_mDWeZb53F02QM4&j2q;{`iOb@at+S48ER|K6ls>*T z{W=8qN#3mPwY#??SqdVFear?}=;y9k+3!i(Y}H;!cqnrBmLRvJS)uH)5K(zs2nz@6}8XNRJ zh8V~H#s`|dbe2jIxrXuoCxz$#<hs$m&J&5UbXSY`|NMcbAWsmaA$AunqRFy zPfhTDqOQt6zmj^!8}xXCjpVq4j870WbX@gC@Nd@n98 z(_wE$8YlKIXNP}mBBF>71|J{afC1I_oE%8@`+CK0K^@fFJuhCt5y*A`F!2Nn$W}wx z8vJJn>jMbURhDBvppCvf$lVC8_^u+2C7@!aSuT3sgR^5eGCT~*6GAOObT(YO?*P35 z$=aNpoZufpeA&C79mX>^QPLok=d_(M@A5=hnF~rDuAxM4-}XyB>p02i1QHL?G}c%&QxE+k_tph} zLb6DLPvDvux!@ET+$JVA)~q$!@SC_DqAt9chxqP3wZ4Y7!#$AxxrwI6MMpE6;QC~c zI1!Q;INbpA7APV--M1mBr8~vnwVmzlYQ=|AOChc9BhCFdje5cO?ASG%;v zVr5cQ%4{g3k^tZOQg4ss#;+sxX3ok%G_Nvk|DXGrMc_OSiWV-0pB>yU{ zLGEQ_Ivk%3&~spES-@}*#w8n&vXee8N3#`s z(Nxn;@31G3FGd-cCaVFYf|3uV&@D7XU&ag#<7G`!OTm*+lTuJn00Mmq3eC_KyMiYD z84?tjOZL7~6>D5F15_JfPQs^8`-nyWH8OEA1OIVCCHnmdijAxZ3oe0e8~E%P$O!=R z00%f8k|z*j3Ed)Z$wrEcNoqKx68(vPDz@4mx~G> zQt|_!VM!fm1E~{U|B}xT{6}LE1mYNS-q}kx@ca(2f&bA2`461mpWVnq5MA*TOwDP} z43Z}CMv8t#_XCxWr;yeS0IFNyD+oASbtJvfmB*Lp=Z8K_A}lZm(J87W6if=pol5IX z4vWND5(y7~^zyWR%On`2?dkAnvL3*wfs`1?=oVBI<6#Tt-$XEOz!^Ct8US!xr08!BsGG9rGGh?&W2ibBc|qSzl^?COwGTrwjoXYN0|Qx7v5#` z#lZ!HoDUbc(cg%M3&9zEFW>?~$A=3!i0Y3F6Z{k7()~trod2EweFrGT7uM?jXH$EJ z`H%@zS4UCqqOBqw1!BY(8Fo z1nLzpM!pA2z(-24u&lVc3eRHTuuG_UZQxMfATM^>;Zn*PCoG@{FvFOvzPWTwEE-)+ z%DJ-yP(fZ{ePPjAeHrjuer0E~N#&mkfoE7>wZB%dUszuTdUM}0XrzD|R%1eiLeF^z zdx!N6!r18i45D>5)4!DPVDclv=%8rCj3U}>cLNMj1R1in?2o;Lb;q1YGUpfkxg?q6 zQ6&VBO%UXWfbsNr7xwXIZJMC_!o&faC8+=l=K`jtyidlHHB8Mm^o-*aLRZ)>D6G#r zI_n9jL$vv#D{PD=@o&Ei>nrlt`s*Lo7l*F!1@r~GfUv$9xZo4k*NVQ7jMfm?_6sBN z%y=S-sfkD(z~Cb~TZ5pTxywOie;kC5t6$6Cc5(Gf@Qc zv!Uss3Iz9He=U`BG%Q7k-~a{&j>p-(9zKWM8<!h&0*=$k2^Z^i_jVk6!g za{~(8JX!ro^z+h1 
zONCs5Y%1Zt8N8tT^6+2IlfURndjZ-0(Su%!EAUm~vr30ep}SN3e&aiF@m`9XaD@te zr5SytMjWn4_EIoOz&>{N2&hfrZq=v8YopV#U%M+$SMI6}KX`WGWe<~2cc6~5cuqa3 zREmy*_%*3(TD=3;He74SI^Vy#)U`FaBk+#@^35*pJv{ztaWoaU3EKsIOuG8duZs<2cCQTuA=+Kre=_|)~SEEsr_Vi-Vu7nLl5H? z%ff5wV&=>6SC{%M=^9LnXtA>y3lnfc%;e&^^C@cd@b@i7%Gn-(fpMN-;Y{8DGTV{( zXHzG7^fKd1suA#uojrG0d^&F@%nNQogxrFjT;3)5 zJ)gq{=f(9!3JV^i;By27(dF6cbX7;kp_dpi&*K$8v3x#WvNP2XzFLn(oL*1T8kCTT zr6SnXmo|JMr=g_y1}itE;Yd?xQ`}vH7cwSEtyQVMV}*+BWaDl;=8gew1#-vXOR2Fj zyUf`x89AFMb~oFvl~v8E4VE3bU9rbH;&Rl-|kI{ z_G*dU_h0T$>MYA#CfZB=HTElozE^CLEN^HweH^1Ld}`9F9mko^^D4FE!Plg-emMvI zV$G_&3M}hJ|nO6_J(_ zknToBLQ+DyyQI6jK|mU$yFt38ySuxQ?(V$n;QM^{z2Eu=@weIM?73IWImVcCLLmoW ziQWBGX0!OYBu8o1FS2>~?eEE`q0Q?}o&=;j!AtartBa)UYG+?cQ}Wvmv#O8^BMgPe zw(BLcmrbjiP4q^n_Z4TC?h0!Oe@>q_>69nNhQ){*9!IkJi3CaMJT_M;W`sQ+88;p< z$&HsKTt6uHie*@~shxFYbtE2KvC|24)#%dsbZXo?+8fqXsD?`f;=h(L@J*lU@Z5W} z7tdb*k}L=>Q)Xyu8LNU(9?PN6Fhw75fX^d^g6qRjP>4&16|<@KGr5uze=%O?ikm_4dW_1o;`c`>Y0g?tzv|B9YUy$XgXoE%e3M{ESDCzw3ry1@5 zS$rh8=F2HgQ!N|^0m^wf>r+uu>THi)8rJa^pChp{{Pp?eiEfsS)oTXxFnWh(r}qQ= zjo!HYT!e2c$jJ&&PPa<-V>-K2k*H0(fBtTR&aKT&Sz$v_utqCgE$UBx_~U7eCzGiE zvBJO#?x5krzN%r3tz=!k!SUCCcaJ;lrHN_e@;c|_KW1pnH_=BEkJlzy-;3%Xq4q^# zQV4C@!5hNu;8cvXbKSh$#;ovp2Qz_Xpg_t+n&7TWhFJ#R#zkOO62NGm@VMR%Ke&eL zNhPxYmPF(`yS~qI(hcLfGJQLr3kuvNG%w#Ib<)OMo?8U@&Qh&5QAw`^1CtipJ;F@< zR^}RN4KvnHbhSx6uYOeEUsmZS1&dPR!Kt=J#qdvl)19@iPG4oT7mtPf3mTr1POwKk z0z*Pq=vbQnaE~U`!Eqhft#;0r8ufePNY3qXo*unrDP)I!P;DXTJswX5*jJ?~n=pMk#!W?%#=J6rrBOAX|m;6BZ`}HA>fH2xkbStkH(=NTW8P~x(fr!k}No$)J zwdBdVTUmn^wga-n4XLT4R=rINxe^xNmHX#AnyQuZ)!{}cT+qTOq0!w4@ zpK!KZU_=lywetH=&IAN;89-?ynvdn@yrQE|!fYjQKgUDgtYeux7f(h=_@Wt!gBL-l z&#-y$PJ744tTiTwc*gq)PGEN-ol*Q8qDtkLaTqg_P+gIaj4yd-$hK5!g$fui3|OWBGeBvNs`2t0>l0 zGwLE{+a^(4rgTbNE(d^pV(&|7RKQusseB%eZ0NY(5xPmjvBuOw zzWcEtbQrD<-+<#l^bte!FHI!_HCnAMHhppwClTKy=&AVpEwYlQFq~K}IuW%b@h@fQ zSC&R^n2C(tX-yyT$iLC>v(8h0Fl}v6XB^pi{HuUj*O+1Y(f_rEd!a*&iv)Tx!$iJo z-XM&AJ2QU0*E7P}y<{q!U8SE&+5{Q?_S9H=<_GqP@#6+7Rk)RKRdpu$GP&#ww3V7o 
zV$73{KJyWmGu;mStcf*&OEO}QHENOCk3%oqhEQv;DhFvcun|A7&*J!1H5CJ20!oOF zNA|H|=()Fv2iYk9$#o)u3G~s0O;iKWHyt?>>R zqc=83?D#L-@RL?PqU^1RE%1Jh=15;T+GYZq8`aG9*BZV-rTCja4Rwkm9(UHTDFKgV zL4t@CLMc{;hdbZ36ESwIl)@%>Z5-AxSSozQf4i=dn?nsVK@as=eAwl&?TkgNj ze<-SP)dW69Ln1cb`*Z%9k?EUzOa*ukw|3e?j?F4dPPHR$U$r(Zx^f(PI#Nq$Ph;Kl+!>0c4N;if3p zDGwH{K6eyf>G}<~xq)GIignnp7A31~fyaF^PFh+IQqV6)t*of8EIf$Y1qO%5J|J0K z%sKms_YKglT|CxJpP-`^KGqq%F*qngIXN9QG$YhMq|Sxg46v$x{M@A%i?48IwF$d= zITe8Oc`xRjyH&KJJSuj=#D>U;vtv-ag_+Iks`G1x3*6$IEaaf~{Q);G2cKUbO&wHX zFW*|TIz!8$4HnWNYP*WYn67(m&#|X^4CW1rWCa!i3b^(O`4idKGhDK&dBlOSPslAJCmEN`gF0@@T?bRw}(}AjS~+oQ(8&ROeW8G z*@7zarD0w_#C$l|IKUxuYbRUYV`Zc5>Vz+>D|m;NWL)aVUM$J^mf>>ctCkPiCJgV$ zh#~6*k0-G$B2@`*LA%<-dwI~V2Av|=(YjOSg*CdQdz=Sqk$oAwBA%<+$JyuJ?wge2 zWSz|`FHMu(3>pJiFX4ujCIl;Tqiqvz_fxg5Z$y56p^QYDbsQ|LBdMLF6?fz%i^(Gs z{b0UKtW_SAr!y2V&dTd(+xGJPn}~^N`OHDfxcTEMFFm;ReOli;29zH}2F2y^bpsbd zUtPtPRYc+PM^n&?4OBvV6!bX!X}7&GnbJJ)Zm{!L3f?IG&<=)%@^A;fJ%CPsfivUB zGL$qm6k&H`gcd9P0{`TM(&cn@T@U+yDK-U=BcKh6;h2^?A7LLA$qJA9J0+kt2~@>K z3K@MQpkCA8>e1~snEY&`&^&)!Oa7HcgE4G`Bxxp2yWO(%An)SiJ5%Wi#C!hKAStxL zfoc_8{_7b@8x@^%Df*TS3{B*mU z7kG4$YCG2If8#Q@qkp80QTY4!Qn?gtrc1!N1X619oguGBDf!$g!1?ff{Gj=xO+wB_ z{sq%P5u+V5j6TFZ_z?R9Ek{G_^ZgmHPudHdVY$VTUw#7F(=7XV`7!zZZa3|UA$|vd zAvIyeB|!~g`1PP{94#hq9`QM>&Y;zFuQ-EwIsAf4oZY+4GhY%>nM4LDhtd^Lkp0*`Lo#M} zlwAyU@_f&>W)ojArcuA?wxO;~XT2v_j9u+9CCHCM;V5_gJaitxE%wy{t*ei?a5@G2 zYy4gHZR2N4TEgF*IXN%HuX~N@-ApKT498Z8wEQVHE9dsPyi{~q-;rC=Z?^Ww|t6SZrArzR@fR@m1}`YV7)LKG0Nd@@?7QWX1E*8#yIh-`#W z>Zn!coSE6GOyS#9{*dtM&p_jiAq^^~?#a!Xq^9!58-W*tHi7Gk?8RM#{ro9+Cwef# zk0*3=sbqaNUU*XUXN41dBFv}JQycDQ8w%d-x~kzta;OL@1ZgR~jn#fz4=6?Un)pmt zmC)BpuFq`b&s&_Df&=jibz(;Qqgwlj1eJ0c=+vd}Z!l$|K9aInqK;hSyoif8K@~<) zRbD01@;<2*Nulbgh(nE|sK5Jw8A>i*`gZcOnRXRC8#OT0Y15cdN$i&mvky4kPKsyW z<9Z8_j)Sg_qrsVyX(~Rf{Fo9IVPk)}R`5pW$M^rwRI$quPqPaavie%rjkO$wnGIXj z`lt&0D6Fd%<~w4dRr;`<7;%rUP*w;-Ci53P5X()WLeALMBNPDC0wkOq){r;wp71?SI z)mI~(a7nv3i{ptiN8TB=n8}{|u2;Ii`{|LPzW*IZDL7m+L(vOD0>XD6ATvl-lMyo} 
zVM@Hzvj7u;E-bhI=Z-6jTcV_%^^Y>!J3IMa5>|Qke-S}8aAJ8R7=#MJ7H$EPi|!$r znkz=L`9_`usRUY0SK3=%<87Zwry%qB2KiFXB4q&Hk7Wd!kvT<~GvrBrrmw|_{Jv&R zAK<@Kg>fx5=hN-&M7L?pD0%PS&9TPx-QK#RviqN4husH_w3mk>QdnBQ;Z(ddo@5HA zEpRq2ey(^i`PseLffmNs;QJ5Bp98TDDh_|b)#im$3@r^S)@=S%|G{Y2p`N!7KRBQ< zRgTIXJh78GT?FTU`5!~D-45)tf8oA+Om8Fh5kL6kCX=2bD*8r~G%Z$1yedV;qMdWF zv*>joYi3sYdGUZQ@c-eG$kVHQ8wt!gT2AHg)B45(hC)jn^{3QsY(ON-GFA!+laPQ> z2!HS!fuel{I(54D^lG)T=ow&Z;)M2zvC#^y zDHG1gkWSW^#||7|J1wrfmWQ}_NEAsuv_WL*M}`p&FcQ5h!hMb{Z!=`-83e2`0YYFf z@#pyjmT>mE@CpCp#yc1ekYB7JGW3P}5+Q>GZz=+|pq;&v+_%J{-c$br_7LuB^}7Vm z>Q3${c1LFko{Mn`+^igJ;ZCgleZyo0Ys|+Ly%GH?60Q=|u8Y^LZ)Ur`Ey)Sx2WH`9 z>%452tvw$=Dqe4fR+ymrIS{plU2yo_#sOc$<^;?vjHD>P07%@dGSdXeHbQV)vs}-Jn8ZxzQh|oBOqi+w|FV1@W>}!2t!7ckd zCWh`OET_E0C8D*H(J#}tX4nDh_O8iA@g3)DvTRp)O(yxl-L(c(^SHSad{S&gWQiw{ zCX5~>FGKyWG^y@S2k6^?gM5O%6JJ_}>m_0oCFNGptKi-|Kb5X%uc&|4p7lv4!L4X{ z%sHfA^ZYw>AUz(+HnGj=3r;5Bi!`i=e>QDa3>;L}`!c#XVcD?zb9w)>lv1)defsOb z9&qqj-usc#-Fs9$61*rVm-s>Lx5RfgwNZB66PkNgKr+`~?8*JlQDXzE{8hR-B#?Ra z3F4>`=^&lx6mR55W=5D5F#7!%=x3D1(`uza{Zl#=a>CDQ6lKkMr94ihC@4@iHunUf zoHNXP%iQanv@`VE2zQYl{l1Kh-U0$4j;1hsEaTerXey(^07D^8F3P~ufR`n8)4N~8 zBNH9B&t_@F)s=C$e&pv=3~FC1JuGT6N0ADt-*fr9KZM87l-KGd$Y$tFT=OdE3=FeR zQ{=3cAN&m&S0XbzTt7dovGt4W zZV%M1D*Tp|5jxysqpfuR^Qt&N7@_Xo+#7hsY3Af^U+Si=g0p~svy1LSc|=P#^f0MC zr`%|zBxdoBEWN&wm^4Kkhhb7v%mWf6JWWNV@z{Kbxx{|uqeCrY+=>URf2sQ}S5PS{ zB{Ib>7ngPJlud!Hw}oeb)Knj6ZGenzCpQ{CP`?KJTdFd>ExB!DEcBLpL-FZ6rNpmz zBB_58lbvD!zE|~#R(+5c*|@j)N5VT$V47lvH%G^_tzWiG$mL!jn1@ZkzdJrg`~>ao z8Q+W;V}1zIy-%#pEN5&<+k6r9;JR^06=uWWT+@M4Fa6@UX-Xl61MHWo944j;Lr$Zb zhWxfdo%Bo=J|h6wB6n2)G$Sp8l9XvA_|v=JATB!_9na1et9Yp#8dD_)@99U{&spQ* z+()QSDB;>w?-@#yzr@X_5`URwW?o{Fh{CnFU{Q4R9>M*3Le&=!gp7=`Ivnl z4(teuf`qCVyXSciUoo7%u3=qWfG@%U{o^e_fIQNUUzGI^61lN|9T`1#Kn22NOyZ>Q zeFfdVaQx5oNsnW=*&+HuY(-7-k?SNy{V90p62U@B!p@CG(Yx`<*BNs%QX-)bo`=Qj zQ(qR@)jOcC3LPHwm}UCV(Br)PdL4KKiEsMrRe2lasJ%~xk2Dwn^6u^Rm7Ihm~R<4 z<=Q17Yq9Mfp#j+3)&RN1FxJnF-5`3EdMaV%GYVdfU1v7T*?|&eOB{+~9RlmwD+r8n zh3{7Y^~)(!9{d3G1Hly*Y8Q%N 
zU4r@2-EjiN`*gPovE_>^n4YIC(l#q0J6gnEH!bW*>I~nQc`!YR+JpY7_?4 z?zoH$9&RV2RHZ~V=VDmkutze zdqdaJW8_~_vxt7+x@H8Sj(&QGx%TIog|1D?;0niS0W`MGBmv~}Kf*7;BU9e}7{RxIp zl&u~f7tu1AGTPt*d(d-o6C$I=cgUTGYDM+|FG{xx;z1M11*+O9tUJ7oDQc1cCq(Qy zyeZ2o>(sSdTn9JA*5_0SB@DsKSi*b&*oW6DQ#+6|nx>7+?ph#!CGL1HFgoxm1zkZR zYDP6e@sogM^Cnzoa1aA5~;p0L1Gm2 z4S8mlalvtR-!Kq-v_J!+;%e-_%zQGdGsIE&ShA=fxbt@ZWWrvHbzso3&j9rk-1=Vb z%0v{Ll|*4JRqATrKs4w-#M`(t`@6w``Ymi~1*A-0gHn?%;R>y)cTVPm7#8p@ga zWj{(C0=-iV_MlIAuQEx0z059oDQao8T3(^{HuKh)^7DXYMfWTGi^8VKoifMeAS{B~ zCiyd(CSsp*zk`>9M>Ae5k7vDtfl7{io*)$eJXW>5qvTpbHSVr<*PfNGeZUN_uFH3J z3iHy(>bLw|rHb?M>L+aYbr@%1RXFQgMEm?r)EIgF39wmXeC)d~?24pd8~GasAAyfn z#w+}+r_9O1QRUplK2b~wt%rfm&IYgqe@zEXt8T4@92T@dVEqV8*!ElP{hyjF$CN`G z`NQNWPobfy>1&*y&w9y#c!CvK3Hle|@Y7z)0tw1nc*4blemQpVb>XP)CKc2fZ4el) z%fx)y?mVDUzWWuw2U>2)59W%I8C1u~77o6HeraG>r;m1lR4{6KA3WkF*};@j+F5 z5%#Di7!-Am&FLh#35I3)^g@auAzTceywA4oK z`$0~FeQd=M9`y;p@ zxbR>Bo1NT*=%DwS(rWZfElvNCsrL@35b7?18|%lp@;ka1fKSk!1)Sd0XEU@#&WMi- zUW1R8(@k)?aiY;mL72IH(5O;h1<|NO%37&rey7oXcyG965N^05sgkwdSemDZzF47n zX!R@ZnLn3hM}^D^eQ&cMIHh#e<7Urz)r#6&T`lFy%$=>|0fP_4{rwr(?|c+9v z@dK5bQ3)h65HNAWVdTmcPl;5jWXhM%yg`FTz7p+4H7|~dj;Ik4|8VsyOzL#BC)`$g z*@>Kf-fw7lZ0jwZ5cy@iWmWZUV?AOCDUv)S zFuhk&dQ2|=jWJPxPLzQ|)@XK}|8Qz#Z%r%)fx{Qi7f>H1{%6ETfSfe_p0B&>oOJspR?{oR2SvetgiR##` zoj{Um_=wp*)EGV5&kN|6kyS-rFtd6UT6gIMM72N1K<#?n_lcHAOuK)0?#t_=c6`T- zp6wb(5RX=(v*izY4(17Oka6_C_SFs@w58bjV^g_-?_Hu}G6@P)u#L<2Tm<8VG*1>N z_?!Tmf5`RmXN}5D=*YY&U9NhsQO{hNdvP@BU)#WIeCw*3@e*f%0<7pZ&?&4)^xHWD zc({+*JxY=y5`ZVNwEDW;MWrHK6>osIdD~jW0Y!ie027awXVFiGh`s0dYmdAUZ|g@k zv2i3SBRrA9E}X0EMfA_g^KzuNy+gH|r7@XkuB}C}-!|ynmz(uw{#JoAFWVE9Sz>Vw@EE6XOwd}c}A#w1;6CDX4+ zm6kA$mMd>3RAn5=aJQhTV19Q^{@7hE!3CvpkK6Vo8Fx_KZ>#Lj~8fCfZlQ@R>vUu{55-oskwtvqTRr({5gT? 
z-S!DMEG7`}OI$&S#2jMt>mrwctd2!99l65>kdsX&_jZ=OXn=8!Fbg&ZS0CdKHCi3G zRw{sF>a;v&^Mk&8#{z0b$;qQwvJ`T}K6~JGNVw5UU*qeVU|a19Nj4+c+e)Cb7z_zd z{B4O+$}9Z+Yj)|ZY(lP>LdQS7(9=x+()m~eaZi`AyC=XKqU`<4$xik2t}rJ{DF3QQ z=`NE-e9kvya_nL3%Wm+LhOxER;*T1UGrRp_QlB1q{SXxE=LbdUD=vnxw+7LL`&u;* zOeiz5eiKxqoE!pvD8hUaL2@<(LV1JyGL9*|CZ2kXL{6p{pC7o|z3688A2;osHlTMf z`DeWR{EFyF32T52y|IwUe|y{A7=W}`HXJkyxayPM>azn`AhWMW-5=MF=%jcZTyb!dS4E&j7= zt@Pu^7X7X3R;hV!ski7nQJ7n;NX!y&P$@r*bte`p{dyXKrl_DT6n^yklYoU>%JO*t z8d+I5oQREs&O|Jdh&UaZH(lKC*S0kpYxN^&y4@?IY`1Gxm@)n^CrY~Gqtr3Z_r&&I z<&6#ToDz=4H?J1*w@>fBWMA-Ryct`0{hDGf8{-tk1WEVGVBE_8!|oCtJ~Ak7xIgHJ zPjBD`;_+K?{quu=FUS)LY(-8^wrhe<=OG6qg2>wlZQ7XJL-rY^CZnRw9<_Q|HFaxj zxO<9t@Sp&&+KWbw!B~a>5N7qIxU|jnedjOkTUbeXY&;v2E3Zz+#ecUGVct$$hGMDf z2cZP`(`s*sg2j(DkzfY2_%g?=^7Tf-2UgOGW49`X)cii@>sbz-5qd&2ERpu!m)V2D zm7yrFN^0wP?c{2u9RKRLq?||cvA>KMci4{hb{nPjdU2&q)%3;Gno1}q$My(8@;ZJ@ zm#ga6E~c(E(g8Vn!Cohu#MiOycKV!3mfBiPkHc4=WcC5+kA-Ur(mKlI?`~1VD|C4~N1lF&N~c z1+>#UO5fKOJL)iYY0Lf?%4u^_jEK00OgA^JY6()C1dFZKHd-bG328-4xbNt*d;Q>JMc{O4M$iNw;U6j5Pk3^0D^$QrJ3i*(HeS(L+b%nvWJm|Tq27c^$OQb5Ty-K}GI;y-fvi)~MT6!8l+$Co$tS6G*xk83o`>!w z+_Guo+-gq5_y6Dw((AX*)5_ck40J@ybVQRXFmL)9UsQK+gXko!ypx78J<40-g|LBv zfx~uI%yf!7l*bzorW~O(f}J0{&=l1*V$ip2M=*QcM)@7WS15b{?CNe&va}XGLl!01 ziwp_3Dd9Wxc*^#9<`X?Q2{DNl-=p0y4OXTvm?Kft3W(TD&%ye64S9NrV>2jC;6z;^ zK1T+a9~j((koXenj1ve*RcAPkHPDraUZ9PTEG#gHgo-LF$0sK4NisfzVu9Sltyu-4 zD1dT7$owBMZ6s6@P$_eNF9p*UC`yc4kwf~wf^?Gda!|}M+0^|K8)_L6LR>#fVOdC*31XN11VH# zA#@t_5!Zj8{tKE7a*I?myMy_-#Kd}#OaL|`J^x!5ZC0z#%|3DR0 z2G}RO9}3#1{>jG8ki+(AJcmp=C`njLOG{8)2~?s1^+-xyKrKE4a}?U|!_y>zcJO@t zKlk=#c^3=$J4txZPZ!iBo-5Ps10B1a!gFlFZ^QyYNAkr#-(V9(f-+bwU_aIR?Lp-i z67usq4EJ}fFlBHJwF%_WzN2DdDyXQ;jgCH(ftn~aovYkTaaVV62E8;uFWsAcs<#)A zm-*wx7DQm~k>^(PDIVmnY>#DwE{}RMWxC0qn6%pcA|u~w*hu70+~nwj%LrkqpXS6b zfAEP0&&U1N^mUh;J+Fu%8AMP>+)w2SxLrZpW59WyftoF#rW1HpD38?kr^y7l3Jm0u zf3o*+uxffiZywsqR80i%|1E+%JfLY88hA(lGd%w-JnjY1<5#ub;W#6d^T$&f&VLs) 
zoBz9&Y$&N=evgiFwFP0pzcu?mSGYkvEgobN@TDNV>Ol3GzTVy+pZ>c6-%@z72=bbxRIr0Wo?dEI;xi3aDM;qTR>?%Z7xu+#6P|p0NriG z$Yel~y8rj9EW-Z#D*fQAfTBg9>k;Aq`za16|32j?A`a5s=o&PQ{{LSAiy(@;0;In1 z>!|l@?@-+I~xhakObp<+)_54pZAEPu1tu73MHY(>E+tiqeylUp^*H1X@ z`p&9v0oT2LDb2)vU7F2D=9I+~6QO&+tf0zn`|wlKxi3v-{~R3MylQ5ERR#X$i_4MC zbG%%*)9dY}#fj3wb!vTW1%u7FW3=farztHHB=mHhq*oV@#epfvBOnOmlLd65;@NWD z4WfOG%}ja!`rm27HwPK?emw_0Tl30#o+X_gbJohzOrS=Pf`%*l|1~0Sv;eV&V~g_O zdp^Lhke?UZpJ*i8S+=)h1ssaN-G4(nSW#q`IATiC+cNx?xGSKO!+7-RMM6lhxaC5; zOR=hSBr%6KYKBm?`*VJN3nrNi9p;53rwsoktJuMcBcO%VPJP3XMLJylS9OT|nMW%0 z=a#QYfO>-0C1uNZxqqy^^5>Vf^5^8O`m`%2j}|WJRuRAvT`r*h{$zYU)0g9yHC0hk z_@5?XeX$V_m+OW|q+vqfY|@Rq<1mwSE3nuC#$P2>4DA#=Wfp8k?oFV1&FvPLVGpzpE?G*k)#MVUzu_=-O+M`% z*En2DS?X38pEW69lnnN?lRJslZa61#WQK=&j^>B+#_8qPY)c@SMr%@=9-Wzu5WsMG z>}*POXz>!L{CF5)GUK{#fs9p|RsQMrXjMjnCj8a+#*!uxMZx~V*&*NVS#^NQG;~vG zwp^ND&0Z0fb^hNaWqE=R?f-73T9E$rFw7dLpLL_$RTerI9>4n2fl178L<8PCHyOwZ{%S6m&p{=VR zVqH=kK#@6!b@IvyvVF2ofNxTh)Xj@9`Ilg_u;kRgcw4fzRtDT%gg>5;K>ch!`(*&) zAw-r{7H0dU;_)L}eQ3}<=0rh-hBAoCuBO>Z&8nk2Gyt6g5i$NZkfjw>srDPbESRp9 zXWZc@N1ld6SxkpnRM!OroZg#ycE5_(!hB(-kfm7p_>qZMs;}6kv#;(?ZDwclV+z6@ zdW4O$Ivcha2})j!$%gsv*Goh%1?pF7spZc!N{RIc{I&?ggM+!$I3HGL6b2jU{TRV& z<^N>7^zP<|s0_9ACItVg=lBOdyqu$=rAo%Lj#F)DapJsFlw%5Rrs%w zMxKK&A0;gwofKyux8&bP{Z7CqxWaX zZ~egn)Janyz{&2!cQ8&Y&#Epp+$_66KpdiJ{Qkp9QOQh7siH4(j1^1f5Fyt?ufbn%EfNn^K_kj>8RtnsI2p0 z(9-AeIvRwrbBPMw91gzHU$_6v?`Rn&)KS&>oa|^JX@@DRDo2qd*?!9ib3a33mUlWP zJZRL^GE)Pxn(SFJ#pANVyJS+}+wx*=6*MaZ`MoES((t1oWhx~gyPax^z3AVT`$1Fz z5}%szWKA@b2sn%gZVdrl-eG$#sj?0@Tc3USMtAL5k5Z=nS5L~wJAfF174tKze#k*q zT|YR`FIa}bj?`7sfN>p!60du)QP9anZE{dB5W1UK?B>AA;*q@7OHt*NvbR^|E1-|| zT<&trYgptmE~sJ|m#0Kg0>R$bmG?=gaaa6E*`yof1c3m=$EAtNolnW|@?v z@stMhpM?py{etZhOFjq*f;h9l83Ll6*GxRN%^3$*I?uSor)8((4uV91dj3cg5XE;4 z+@)+(V_XAj%F`lSM@FrD;!}Aiw*VS_@iP$UeglA(mEa>|?maaki*OvJa! 
zcniSjmc5i3*{x@QbVP0>B{8c<0`-ND2$JH+iYbL$HEUktK=IZs^KXNJoz zapN*{x@=<##NEIOgJ8C9b?;`3?|!=O_8&(3ebkH}DpM%Gh}qkP-yY9B=clFYL3IkUgo$I0r^Y3%wAhiBoT=&L4i=CY{r%p&q({S;zU(#0cD19= zieP~=L{(}dKg@2>`tdN>LCK-6Vf~l>emj4=Ss%tZEs)hoCIl{R8p#wQkJv0W3xWd| ztcTVEps$OHkF3xHY>*>T2e1NwRNof!7bU+=^i;jc<)NQ-Yu{m@3?$n#uuZ;D|hp5hk0$vq*-8K8av zTx#na(8BagaSc1EQgGpNh4{JM2j;e7FlUz3PA@r$QqW_cWcJ;}1>qmX6fWK5C;)~Q zu?i&AMa$BotJ5YG=BfAZmGzANu6fgDtmJ@!SSbhL8~z&}$b_^!G#ZT?j_mN{fmGjv zL2i&0TS|?{n@>PYD}|!^GkLeT!Ynr@Yd0w`2F~5ODM#}#&qk%Kaj_>RI?!CnUe1?~ z-@CCn2%=C<@XTLKez%ww6ZN881ct% z0IV>I5P~>dBtdURT{q8o+ay@~jHH5riA@e`Hhh)~Y!LRAC%%TlXA$=^Z5Ks2KmO49 z6RB99M-Inzrtm=@Rvv9NSJl_d^>b1H3i3~%8kmM$?<#O1toMJz``ybyzZ(rL(a`w# z56N?KhNgvkBwSN9@MJhB3cnP{Axe&VNghtoA}Xw;Rv_SDmA980ZS8Rh7vD}*9B|&i z5=T35>Uw#Y$^%m{F`d%^<^A?ZAGQIanc%&Cm}YmHo;|B}_JM%UZ21>c8%)PjM4CS& z-N+1U+t{eIlvO(30)?)6KDJdV=*xUXs z6`{^|55>(fh#|pKh}3p1wT9;Y2Aza{hPYvNI+m@0vbb@MGt>4$aLKMtlFmHePTo|d zk0&j$ZLjr)pv+db{fs4{AL7mkKRDg6V{%1@rtcX8l)q|edU1l#|E=lPN47n3996dO zns^pnU5nW5*{0u%{@06nfi{dv_Ys(iiOAl26W!+hy;5sF!egR2740D;`Hbour(z%)Os zbS+KTUja{oGR(i>(;?V|X#}|}$Nnmou(JAJ6u^yL4Qeg6W+PVPB-s5H?BU>&cfJo_ z(qlN90xF>Y#w*2;&qt?An7U1rGwKE(Xj^RcB$;h@y``A(IT)jXM$axcJz609xmS|L z1L@^>-6W6e)u*N_LVm4_BH;o_s4oE1I6?F^=dn5#X(w9k2y#oq=^L5ff$$GodSI3 z{gbA_9iLNH(Q=TG1Qc3gHy5mvYOx(4jZnTx69^AM{I#8f4rmw7{qVL6q)H|FT(V32 zF};V8^l~#}Rs$416}CbqM*wc8f{qh;G0HzFmF|F^84r_v_Hl5m<|bjw06iB3xUS<4$+(Z|%wg@4E60ZC}&ZdAcHA38dKv8gsMARF zT3HNZ+glgA4#G#xco*;JN|W7h+LgWP2kJBOL*gn`Wt`yplku9Yby?*>bJt{oiXX$z z*{HYhz+7<0y)eJHW@-4*mY9S-)o$=-h(yfZg2~$u#Cfm9ebJ}HuXCISS%4dkxk|48 zr1BGyNijRz$vHImGVD|jQm7WQtN?`y5dCfZ(6*Dx9U#z6s#zo$l;tydi>6o&z-n4} zhbgTF9~aJL0Lu&TA%)G{+uXAHG_I>QWCLgz-p-39~M(Fjc?aki0vc0G3|sD z`WB4nk!;C+GR4y8OkK_kjlZnaeP$s%U{5xz42yl+G!;o>u~#rqP}P3dBV*5DU=38i zRV`3W#DAleMwWOxWw;hC0x}`tLYm%}9P0e~HV#$oTHErxS+L++QpEdD!*0$1a;^5m zDt;*pNY64CAKHOi@00535yagu$t?Vm{vhr=Eg#$=24@7V)AZe17juZJvgQ8x$<3wW z=16!%pUaL!Pu^6!PEUn5YzSg#|LAWeY3FE-LScW=w^Ys?Q@NqthcToN#rEMiEL>zV 
zPA``?nKdz#+2+!T@lsHLH060?^!)CB=O*&b$Ae55nItzSJ6g4AUZg$3x+XpO{{6EF zdOYJl(F?^%zi*pTOTIS++hzFJ>5U+Gtx6#1o*C9}YL%cWBaXBPCfPMJVd_uaR3brn zB!@kv#&Tzi2Blp!X3q0Vv)T?=rMwfHjBVzNj1iBwc!~E<4`K569SpWu{S?f*%hw0B ze>Q%H{o&3%a+;q6`Gr4U|yOTeb{$w@jr2$NM5Ec3IBSw_2xw*}MXkH@as^Oz3iywY-v)w7v$1iV3ugN^jk{)TXtx6HjK| zMr_NL9{h!E%7lq_yvy}*Gb)v8ah-Yz#T_anlA4bD=x@3~CnZ3d4hYQGb@}Z;`=Z-4 zyqXtM!~le7N4j$aA|I+IqXr$UdBMX`1O>F;*O!D%dOrGJH_^EazGu(w`b!bP zejWzW7e%)6EJyr&(=G!unGQ!n$7F7@R}6ijYWq}dmlI{3+`ow1i6Z>1$i5p{s6LJ| zQ>D;WX^Ucz4yFDLT+=~c*PPkI|8q%@9)JI9!IZC4E0{fZ#jxu=%nfg?bA!ZNrkL_& zWFZ~aVo_3&s+rD`8(ydH#W;PVUP|A6hXoI>DvQ@BE~tA88C*&0=e6uG>Lje*#;uc} z6&?>(QEC?z%W{xX8(A<0tEQis?FD1tmu>0kovl;VT5;` zbW)=V=LOqE;{22e0m0)}0LVd9@MDU z6V=;T>alK(YF80oaeO%r&q;c8MdgqU1jOxCFg(|DgRViwP*ot!ug*W<8mODwco^eB zWsAkF`P0i_#!{6(YE-lXU(ef6UxByz@|A7t_h$sd!y_abX&}Qwa&YuFtn82yS1T@i zl5NUy&dY%ZkQ-UizW~%nw8fBozVI7xzVE$}PuOvp`rOh`4GNwVG4dq8Jqdls7Hk50 zU7rcfpGK!Z@wlEc7qzz{li)l14kqo#!-h1(m~H@eDATMDC#rQ&V0 z?^U=p9sbi% zEf=8CFG}stO83V8omTyfDlfAieZ@z9>^;5xvanCiYNyVO4R`E7-6HC(MH{!Rd&|ko zW@=oPRb~G#Q$&+OL2#*q=F{|c1s0LAnDvW23gfj4dJv7-nwPfKr@!+};%Eb9$J8~w zCUz5}R!sKCuLoec{b1ZDC9^>~CCz9qhY8|%*hsGDhXA0YFC{L!BvZBDmPxxC$ysL- z!**}11n#kSL-cP>ysHQ-OhWYqkaf_%pcnkiJq-JJHUEytb=wRDZ5W<2EPAZ~L(zRW zSS+vJmWAU${qGm$rD`!$qG>`PoXs`xfNt#M&av1Xz)=@RC3KF!c|nyTR@E*k9p)wR za=H^q3#J!=uJW3~$-?7Z`JAXb{o6V*KG(BIhGf4?#+kerDesVN zp8u!4?}}=w?beNz?-LOb0qH6r9i*24QKU$hUP2R)P($w}Hl%~nTR=cTx)3@6M4GhF zgP}_AAe}%+_Co(X&b~Vr=Ukm}a=~C^C0T2|>z#8x^O;RJLW!+ksTlcC)nwMF(I_)c zWBy%_lwGj5-t8B$qr;7@uWd>>UjU^r*xjU7f(WIg*!PE8hd$P z;&tZkx@MrB^{f8&<-y%U;U|jsa(-=Pw-Mp|SO29=Hb}mAz`h?I&+_z+*LUBVQ`cbX zDSXm|{j$Tf7emR868UH`HN?4S*Cq=7y!lQkQ}*`EZJHUBBX>|thMfbPu0KZ3zv!Fa zj*EMPq>M9VBG3YKBGc3f; zRMFK!EHsF@zRkJqkxg=>-1Xae*%NdoWweFKgFljA^S6sJqMt9^${)27=6=v~5-@_E z=?S`7@wnYaYP5&n2g!Qn0+v;k21?2YTv3c_cc|X9BIxM{b6XS5Vb+FO7yWgChyPC=7@wUczT=cZ} zgVXU0U^i=>JOGmJ?XE?7kDgF0&<5sqcYIS2W`pO!bTw`IT`D=GhIn7$*SBvMim^2k z+h!Hhqdcn1^mJ4`?`~x7Z+h%@MOgrJ`D2tG`71Oa_XA@{WtE9S7a~b_qkoJu#!9&0 
zqh6GvlEO?Q{o5qbnhCpz`0mdaZBpL7!{mosL_(YVlr%1RN+uQ;MaYg}9yY768j~+e z;K8=LHIe=hRgXUDqcxxO)e&9^{<8c~3{JsugIYON%^ur7M@<>bVx1+a%9BHOID~YG zWK4fS{p6pl4jb(b8Opx73x|UYS6aa8PgE53^i*mD(Rdoekd+o71wz5|O?c~vI(g00 z4QkGJH<}KS^Q<8)B8xHY!8|GwH!o+H^AibO5dAFgy^7!OT6$Dy->!&B2sC^t$a?f9 zvP4j%;^M_$_oPdrgCDSJdbe}t2#;B;-z9SpB$RE=&G4h#Y=hv}U>3^1wSYW7zYmFF z?}+kz+l6`zVumzd?Tz{COprUrMuVzA;~;oy22kj|OR8fg0(C_&3ctM0seGWLic1#2 zd}Id#6^lE)srqwb&TrvB5wH1nV8QdF^9Er4qt#AisceusKDifC=M0xEYYZ!A0zmxf zw#x_rH03jKPlodjNy2e{0fwFTmu@u3$-`LfagmbhsP2wy29kmoFekHWKe>NM6KX{Y zyp|~!i*xf${#g*aDjzZd_t%zyA8o!$8P!Z~zAmqpmhw#{l{cdU70r87k@-m~k8-yr z2X%bw1v-)~|76qqed<)mM`B|n!#~C*MagwO%@y{scZLpM$;KexwTgOnGPp&J2_Af} z9SY`8LU1J);m`W=dE;Quc3Z7#g3_X65Ax!|u_b zb(z1=Wlq<1{Op!CNoI+%AOBGpP`lSt-!HAL)@SdP*z(PqsraoYv z*F9=J=!Ws|h+j)IJZZJ;%n~7PRzNEzaJ&f+kL`C^Sk#AtK+iU0kh>Ig z6Nq;{9Rc5W04n~36{OXt94HxYR(0!fZl|m$0`=F_?^bs?KGX_Vt&*o^d<2-?@iR)G ztyNieZa57F$hQVcGpkNnE!$TS3n$P$>v)vOS0C?Qotw7RoV38O*AL;mI_(Wvv;_yq zNLIsi2Yy{kAkdxepjMcyDO>(DdWL4r#QN^rhc&uS3Jy4qze=Mma+#bX#H!doXj~os z_)2WV73ac^b!i>>*z^HFq%BY(w~FZI9zR4u=Iiy6UmA0;t8|gMw#>X{I(~aXxpMwe zN6D{-+bht$t8Ai!b5`%eJ?!9gSv*gJQ%-KsL8M8BtlrzV_AIjkE0tPAB)PIK0__(v zO-Jo?W$hmy>&K~xKY!IoHn-kMBwRoJXQZUwd|l~&{QT+<)QJ0+qxnG4%?+6(<+G}} z31!l^W@v{5gQ9lH;EysQ`n15u*=kR7*M+A}iA@mgo%w5qf1Y&oZfl2hPrC4yHiDl& zefE*Uq%Ozdq?{MfmCBE9+4bn|OE3FG*v=cKhKJb1ok$5URzeJ&PC0`Hp=JBnzHFlR zA}IU%F6tuBpqBa6?^kz~09%qDde>mm-CjZsd^lHGR<5P-)ESM)yu zt@&QZL#>#coaFB2wc5nrCHn4JAs%INcB-bqM@>wp-E_fA>+YJMaw*Z;W7|GfeW^zr zx;O)V|5o-Ww)?699ckY;zth&Q^CR{>j_((Nt)VnkOzyK<^K2D{gRcq{0FK1&_`c%w zy}Ya1-&pV!8;BS?p|Mp;nkJ3Gwd?o(nZ3LG{2o-xi2St^pUr#`)FnOz*T5BvO%c*E z^Jabi_V;90ZLTNUc)q?g#cVDR9 zQ6ASjF1h%NTy%Catus7Pdn|Ehy*unjK~t>*?eg-+7R9oLo8Ww#VO|mD{q9Mik3C1u zz3$8wmextLX*0!0*4^0whG;ZlrNUw($q9!CV_xiA!xR+5j7=5??dwYVhjS4(gY*Fg zzf`p+W@zfvQ6-;4JV+48d?FZMpwh-MyG%6Pkv?bkPN>u_7Vd@APmaGXxuIxKK|wez zdMh8|j8RXy`==8U^O(}}v_c_Po(yGj7s{X(bN_wmLx%KLu9o7y(H@{Sc!uTVb&mrO zkN)cRPT%8<`HSQAx7W^;jL0EEgJ9f`)D34r@y(=$ulNUKeSmhYh^Pw0s%TW+iRXWy 
z8^oqBnJ8~19+Z7RYd$2pp2T>yGk9p4^e&p#%p_K30=%S}!Cp9c&CJBOUr+pD*^{j) zRXnRUQCXm31|L@Kpq-Mi`tnx`AK#s9cW7dn@V7de5+RR*J0(x#$`p= z{@4&$%Z=V&;NuAiqdU1a_b_y5TWSgOSfyzwqe7ZL*=;WXfF&92ItJX`W+q-H@b$eM zZX1Jkpest{XeG_n&|{@UyH=aHh}d_e7YR5nC3O|jG?lo=(gsIDSyX%tF}&1=Lp|et*29Kj>1dERR z|3f7?saUJ$%8OI@6;M@wB^_SvGo@rLT<}Hp`a%DYfg&JsFk0uP5C+xqg$uv$17#c4 zB+LVjd|9An<>|KM>E$nthxrC-9QaDo%wm1z_9r7u^Hg+Ze{H(Q&!;y(%Bb;D0b%~7 zEAHg2_bb*anKbW%Zc5weHs|CBM(t1BuKWs0b0Md)bVKOJod&vhoZQMY2|*uEZ!B z50EUfC75D2+25IxF{h!`4)w+|N3{RZ+W_RpZDlnLbIW$P>jnp`?w+R-q`eI=8!6 zyKbM==VX~^Ez!uH6To-y5~^{!N?Npc)4w_RUJ~fyp_Felk?KV$0k8GRV*w?;=DlSZ z(8$a}&rA>W4q7q((*HVT9G0+`WILD ziSZg|4^TMJbqo#xL%uj&Dfq%Ops1{7DoM{Gz5ETXkWiCJL8^V zx>eixH0Yr-;wnP~x_ClwInPsb&2os}eE#t3)wxnCHV?U8o6OW5^zja~1GAkUSPNU) znNL0S>kvr=#*&_O2&(F=wJ_t2%B856;n6WAZc81C_(A>gCsKr5!p9PDTBMwQZ{R)1 zJ!&=~ArtpuLY*g!I3lZGRJo*T;^D}mg-FEOeTq#dD@14xW`y^j>6RGti&AW@@Jivw z5MYFK^hRXY#w0}0c4og~L3_j4LEiu*mf_|W{7`7vYZeEN*s+RR?r$iiyr6Ke5z)&$Nlm?_h8;9^3_ z0TgT#og=q^MCkYn->w=DKK(^lnqh@Zu=>8dZD4LhB)WD?%jzJ~Ki8x62A>hfzhaIdj8;~jRFK~BP+@RpLSyD$;x68} zh94rp_p2TxFqO(G`kkjFw7TD6bGqinf+H_E$-So#SqQ^U^UfRA7?}BZJ3TB2F+mk9 z{y-1P9XM)TACSj-A@7=;gA z0p?%upH49Nu&-@ zA}i~Rjbh4>^U0sTVGg;IA~nneN#0CEQoX+!w*I+rQvQu?P1!7 z7k=uR-ZJI^T+MPX6rwhe@R^ZzxSwtlClrRrxD#Q0v~0*34J?b0=)l0m%i{O??FVcXUR|5Z>}%F7!Td z2`J&SAN_HdMik+@MVH!P*Ss%bR@wy|xjhlkyu%IxX-WmXx$v@msX2x-F|4l1Ijfcr zQSK}IgJxAtJKZD)1xXXZM|QHxG3VD*iPXbm`4y-3Xy;`Tn7gydx+PM2AH16RIV_}+^sRR!M<4w@)PZPaq^r*!8i87-bR63%)t9WrQOYfC?t63S&#m;2Dg!>Vt2?~O#THVpTg`bk&qF(WY5D?bcq zt#8DBt#0+$l?lNGAs*wXI46*b>@#Iq(cj5r)%uCVJ4UpZdaMwQCnMFR?4Q&}X{xi& zw>+zQdDe#@{jq}#%Jrs1*6Z8c_=j#SqE!lG%=n$`P5nr|NW=VqvmNQvjG)g;d|qc> z!UDYCn2FjEXCuUYKU@J&2ndlTPNNXT~1vi5N1oh0_)Av*%H>xqVu?PKO|OhJ)2Xpx*!7 zQ#%MgA~z>AH{hP@hqe>}NN?Lc&M0 zNA=EK)F|(qtaYp;8b*aUyvbaHc}q+maDNQ7Wm?rFS4d0*=1>SqF= zuad^V*cKzFYkjZArr(>^9o7>%1wtH9G)3zAb)$^)D_w4zRS-^;O6$*Y-YEg4c-+Ds zHk$OzadXrsc41xo9c$F#bVx&}A>rYnIlYiF$G8aILSD^t>I2EBZaj#~ysK)<9iERq z^>kx)aKrbs+3BPW?>!prPU&(Y%y-6h>ID4ztx|yK?JnQsX;Ffts}~cPtq6EoDR80$ 
z>>d&aI{d?NvX+vq2cS$1iX$ke!;`l(h#2Lz?HTBEN?FRbTt)45@}ssosP#dI@46*= z*RqQa_=~w&Tzj&|=h}`VQm|~PKOgr$0(&R(UFy{cctKxlKtIR)9LlcHqbZG~u)R`x z+>6xDcf>5JnnXg>O?|V*%`2WdPdVS%HP#9n| zeVaK5ue>wV=#hb1qFwCUB&0&EM)P??Pxi)2s4TUi)3*&}X-UMYg(_Pci;8Oj<(X57 z%{Evp%@}KBCnO!QCn9u9*@weIqsJc8meg9+q?zk{!$f0F@a=F7Qf7Oy_#_XxAW^#16Rw zHuXAdrsw7BX3WolvVF6vL38_)IW#&|d37Fi=AfFf0iXCo8QZx64ATVT}AHU z=!uA;L*eb>y!pzvlmytwC^ROTx3_3^i`#1`GV8%Z&yK(e&3#t-1wQ9He zXEv9mHrwYSptBhz{j;4^YeYSX4j zH@9N6>-3D-CD12MU|0xk{#?7U$2g?sW;SN9xoEtlcmQazZQ@d2CyZ)x1+Jz~cgF>h zkZ`228D;n$$CNx-J%Y(7Z&@hUTBh#us99nwrP~TGop1z}Qo;~6a4dvftA_A{p zvJp3&`jwVHhb}|Tkqr&fNi-P~{dU$NviLFUy~=Bz*DSX0;0)`Q5BW-(rAqw@u~%gq zl9y0G=nF*4Nm3*O(S|Q-aA=H1(82ofZ7;m)OG~y=4WQNX^HmYdqj@t)&S}g;d46Tr zhP;!Cj=9oI&e8M{kq6K=Gx3{m2E)Q=Hp>s|kBy)iUYoCNrTR9z=F>29FqBk6s$g3; zc@d^p4+d8cHXQDzXNfGJbV#GjUVgZ>7TehA~)(`0=6Dz+nF^&=Ar;qI?agN zIqoNgeaLEGX>N}=G@7Dz6kmjj*>=Lb3}GI^#arP^huuPJYtiNYJ`u3>D+mT&gFf7( z!N@ctX}SDT9+qU_wqJW$Of)T2~;f_q7MeD1 z3Haz!<^K$)0lkm;Gav|f?=JJ-*?0gK_n%pCK(xd2@9at7@amsAKEOYqp}$jrfdBr- z59W8>K0gM^F9F6UlD?A*ic*3eIV;C7aW^RaJy%AP_r}lXw?HrHzoelApw+z_f4%?$ zslN#NPg&vrJe|?`lr<1YGJ4Xrl>#W96rlut%^X_4%^+U(G0p6IyN;fQU|f(Is9G(;TCrB7wmkx75Nmg8SYBNVque;H~beDFt%vVv&Uh##6} z+hH^Jna0#iqJg)}10|T3@W^zlo>#0P+!oMBtZ3>YIp(a|V$5N?&&Mi{ccI0uQF#Sh zy#U-NNwdti?zDi*t@Mkx3J$&vV`gQLK=$|QDkJb0l%WY26)j9r)nktbm?;K2M1!US z4OKi&vQjo?tI&10-gze#Tx6?#sb}|1=F}i=BIzEm6qMWNGXfy-%0j2Ix1+XV^GPxt zH|}A)+>lTv-M1Jw^zRGw^`Lo^BJvpIcZxu};ilj>iEWSs^tbw}3?deq+DvnyGme z^m2cn;(WVu37?x`;2t7dwmWR9*KBb+zH@4cPq1N<^}LzKIwhu%S=~)u z9C3nKIL%|&X8MoDqr)yqC_)V@y)=9%hecEQRT?BFndySqRwZS{??9Tx5fv2M;Z)Hklvh6bk!=o6t^D%K-d&NH`9M6=)c%vGwG#V++4TtsM4!Fsa56q zItnoy=HX?)iw86Fo{gf6mjm>J5`?nU-NJq_QjTb;k-w}7!&!zVfutpjkY^UAn zRV)n2Wx*i|c;ZCX>}@Tn5bjdkvcCG!!TtVk=9O^E8ptB&?zn(om{Sel?8rcK0)IdUr9 z2g@acS*NlpCpJ&k3)7KxLcn3?9VnR=MjsGe6UUxCl4lceqF)Jw=9WMDF`*~$9zC7J zFWq(Tpy${7)JkE_p0W*eOD^0EWfBv&z(G(p&X9>QW6dV#XNjXlK5uA#flQ&`Z zs=sGDZ%L_>xD6q?7PN+8Wk-mcD*GYmK0+X_`BcA?xPzo 
z`~ql6FP8KM{4q@#rPD#xgtPBrfoDfc^SIRVyq+b8|SM+wCTUy<_qhqkJ1X9z8Q~r^N}X6q4{PM~58zQnOGp5>`)zft3sUT_D`1#zqG0sT zw85tWL8fWl%yNF(e60(P(Fvj`sZxKZV3F_jL2npMJpuypqsK>O`dq&WnbgfxoxiF_=e#{RZn`3&t36JnpOx8_Z5M|UVgwS zOKffdGSdFD19<4cO#PF$isa0O_ZF zw?E)S?HslMv;2s2zuzR%aUe1BB`$=VL71h}!}0pO2Did?Q#vFumd7-eId+UC;*tbX zhCcZAx=SW%CZ#Y>WYnrO+R{%K&{R-E&qIKdmGnkb?5_XLG&(BzjK&IrT*r;zfAGRB z;MN-@Rxg}(zu~CK1bFUe<@^q*rlGTTUq>Z$5AuW79|@NZig*_|ID}(}fXF^%^Z?u- zRwjgGyH?+Z=ZWRMUZ%TQO=q5RoTF0kk{|`>>uut?<)0t#E@>2iaM^s8;uTcy;C$z> z`P}IteNE#+bN>t>!AUX2AqmQ-#9E|oIHLvg##|FwlzCce*nc57< z(T+#70;1defrgfv*ZF!#j08B6zrS9bzWIm|VGVl8@#xQ{e(0R)7`Ib?>gbFI=+T`t zjpzd|4BjF5VTIs>!el8;+j)Q*_noSaOE+-IzzXtA^B7dMMu*0l?*~_C13LsOTTT<@HAVTElWcMTvWv~H7A@wFW;SeKOj&__0l0bum5aj zVY-HGJRlbgrSn^gGwL0!kivT0dMxKWISJOK(Y`aiLM#m6vInDwR&`5j+xFU<%j$7I zEp-foR*%%+Jm#V`O3?Y5$hEj-@!tZXLM&j@{6HyirXB>b%R|NrGP?K|liHi^A_OQ-xthd=Q5m1EMk_MR!Xq^B(eUNeOs=+$lXddHTntSgDGE(ZIA82z+qv@o5)!0=h{D|v?r-glPyZC7Cn3(%-dt1 zTmp$6H%I|o0Dvgvp@tPrtG{Bd0rNq9tX(_1q8k`1E#Ph^r*=MmBho?)vEH28J2VhJ z($3r`?+sVN8g}=sAecofkQ?`;AI_=yf(`+u-)^jsaX|KVP+O=b8JN}ZCrKo3KX0Al zj2(YUI34SbR0TGD5!8CawIXz`x)tSvt?C+n4xpMfY`1<%?y1^ZbL6SZB@my9M&9{V z-9=`S%?;*!IS2`un=C2!w?=zCH1~V>FP#Sz`+H-iZ6`GUfKg&nUy3Dthh@AzCs%N$ z)60sTxETKd$K>b|MpHDmcp7IZcwabuzE^cLlyO^Q{2R*@jf))Ba-dlAAq*!|>D5^W zO(q(bHzQ2a4{juwI&DT}PPzTcBeXXfDnWfre;*i?^Ta?hdR@031ukK(fzd0BZVLf; z9LqE{pbZh?N^Pft9pp#sWI0mQ$(tE(sNJV(0C|I$Pv1uTenwb8FQ@|;QToJc(+lq? z=gS!b(Q{Wi*vOnOrUU)D{nzpMH1gkskJi^e5gtjCdJT^U)l8-FoYBmh*C-qtJYfo_ zOxxK$cQLh++>OjQ!&~nbizjHa>QM3&F97`BCZ^Rs&AUDszofn|^URM@P2KH*$rED? 
z0~PXYhZx!Q3P7d(lRD2C%KVFWMFm_UidbW_ETHR7z4%GUs4|X%PTm{H);F%0LZbtEk772JsA>UU~c}`_|B1>G?nE{8q>i*E- zbnQ;x52w=l$I2F_B&#>xn()d?gD9~5H}JtDCn?kQbNF;1_yC!a-|ZB`IK%g^@-lXu zzx+U<1bmaqd8&NT)igLcmjc8`-y?Uj!iGBE1sMVr^E=~D@H;pOJ^pR5&}|pyAF+A30O>%nn#Gk532Ay;F zt*6XsQxqr+jsB}VfrWe@(*)SNKd=9d>-&G*m;V1hc)slaXMDu}<73ojdjbOO4 QdA?LCikb?g@|Lgv3&AMmAOHXW literal 511492 zcmb?jcRbbm|3?%dN%o9FDLQ7zD3P)%Wv`Cy*vH-{WzUekA}V_wdsOxwIrhjF2U#cM z@cSG^-R^gE`~Lp%IB|N9*L=R#+xMRQZQ_$OC$X@wi0??>dVqyRat#aXcs~Ad;5W0+ z+BC4RPGjApD}_TmVE&co@_uluv{(O8!#tlpjHEGm9jd6wsfsV3PASxaLim=Io+ zb)jT8o8rQ2O>?tY1Om?a^d?d}T=ru|@fDlAESU(V)43*^Og`ns=27D(*_)q7uk9s% zcqwim$P*pi8t2l0(l=;#9G%{Z6J!gE5+FNmj?d(Qh4b@|aeM4l2`p^f)8aq>*e47* z_m{6>{?}J+4S3THd+#Q?y*3YXRZ1NjR-<9*gYky@| z8^r~j;T{oAEB3GF!`!Pl?LVRG0gt&!fy1(@)lvjw@!a_1RsMDtaU5)lYi#0=$uKNy#%ZPh<U)1k+_368A{~~?j+61{Zr8!g7l-Ri2DOiI4yCNL0 z9Cn>Bw??p|#wPf?pljj!{6Y@k62D;uemV7&6d(zuR;iRfsmdygc+Nkz0zdr+j78%) z1Bil@2JSD{KkJXR_UFXjPRAmQ+v&4nQ~&9--!8OLjgi1#jKFW%oUdvC1*$|V&U<6w zjN`-oh9Q5B9=-=U^69=HX zqbIu=uyI$pYVd!o`Tf&By!?^r3kHt6i7!d@jRtk>HF4iL`-5}8sQY(``_Fe-#CS+e z5eKg$`5Y&~&i`{g_~s`174THXhZRV!8~?0i;_l>37Duvi$hzNF4_G$^Ah(f|w~78D z2N=!zJ5T|(`8{&5698^3vi33fIsVxkES^6`M!#-&HVn`P|9PXU-^`a7K?vdh$FX6e z7)QpK#B=^|wa#k1MnKqpk%oT?m$t}&W%o?yssFN4qpXL~uPXyU<#+Vp;5^+Y!*_9J z3#rfiYW}crE^vt--r&a{e^WVfB>;;ce7@v%jXRFs(<8|}T<1Sa{Lgs!FIITY9#F9* zd+)fX2TFJA>i?<*2T)^N8IZ4?jmGDcht~}W#4`JZTfZx;2PZn9|Gf#|u$*6xHLf37 z2!K&ZS8)QK{15aHx5A%-!k|`qjhFv+LZ(EV^}s(}8xI4RZ@f3^t+3;u-- zM^yg}5pMJblx!{eZsLJ)9U$yH^B3{OEcF|!9Pao7qEup(jBCZn@9@57A9&>bjUvBi zV1Dq-L);0Fg>(UCoS#qWr|vQE=TH%!Y1|z~0>lk#C(d3!yvU)-)>z<7c^+QjNbY}A zJ@x>LP_Q3#k$OYPWAp#bjp5!^0(AF+^IX(5Kp*ZoU`ze~pc?T!0ECwI2cKU1Zoy#> zPU{5y@5fG(##I^DIDg`a@e%z1k}F{6-~8cszkm86K0j@HjxXSs*s{nidA}=H1pKK|0G9=fl}pq)!Z=rldv^$aEQ|S+WRFw$=SbR58H013P4Ai+v;q;@&+PqE zJ9fYuS5ANkqGGtR|I;de6TYN-7!MTAJNaEc#TBsOzliDqvHn?V1sm7;GH_fv&u;pk zWVZBl63MRw`6oC8;{*mUaw{yLb`D4eoWUQvzL)eC(fFvhYrFh?SEDZaC0uHW;P!&M zW-s@WkDJYFf1%#bMpJ8_jLyH^cxMSo<)O=Rmt0=%LMk(gpr$@aia4?YQ5irLadW&m 
z;y-5B+$;ukP;y~C2gKDB@2whJ9OsuB|N3TqrNuXc9k1Q)XUU$Ovm))y?3JcTQqAinVD-Y7+^8TfEv~|Hd#3N6=~>Q$M4G z1M~7*&THFSJjp%v&I`;n&%n;c-tG`|RnU!hq&d?)=OIJ=s_9{Klcsy!-feuF`Ue=lwxMSS?ZXlwc zQcKF&abApbPN3u<+;L)X6{~W5soU&IfSG~dfGmwmCQU9r>#&nG-&ym!NSNKC%*X7$#!V?uPc_T9 z72`c*(GU)IXA5td#1tz7n-~dv`^^^2)1S|qJo|436W_0bphK0G1{6Xg&RxmBcS$Z~kD@_Y?k;XizihA>}paju7d;SO$!PV}FD_ z^AEyRgL7A38uxm5rWN-+l7}R!fwIuee#SB#85Mf-ce?ddPdtBS_%3$NN@(sq9DT}s zclHJzV=VR&jn0sc2b%fPv8VZZ*kdyv{`$x^{FbxD0P4L<_qs|-43OK!+E$r`sw@Sy zji|bQK}4QROCOi(Owzo}2Xt>*)T(I7m&6kh3fkLk&GSg*NCn6~L~^n_{Sxg^-y>eQ zy>g~Djs0;V2OiG6Uh+WXvXU&Mn*Yo2z>zSlYXcq?HKwaD1Mow9pNvWOZv;G&h$F!P zzWOU~7zs-fw>i#~C@9jU72O6gf1rm@G9U_#?ghu2HkQ}i=ownLL=!ox)YZQiQ|>UJ z;#^Q<{RXruNa~zQPW>_r-*}{yh=QeuBkhWa#cbepz|HdFEFM7Ul}bQhT-|-{998C| zxx{gQR;VAs^|!e9r>fg4u5lu&m1AvS6kdLrab*73jQvf@#r$V%j%3%ypGbOgq;sLo z^>v-obzjNVy2+i8DZ$b#X1h|~pYeQ2@5HzLsPRwxd1wQH;9?}p9fqGJID`ay`=2Q_ z?97>~4Gz(nMVpFR6F&3t;#-BAT#>{?*-FdWYN39k>uJann`t6uM>14RF!dN#Fy_;E z;z*nh037hc>0~b6-+uXT6CRiOv03a&A_b{KSt@Et!{TeI6d8A=C3Sbex_VXQ)Xz1! zMAV)+nLa*tNaW67OcpnoMDz^`h9vNzm`3r~{jAWG_g6N^f=sC0-#WVxUcD@FB!YDuHLbq{ zM*li@CccNexsE6r_@GXAKl&uy@YfE{<6iLVoXoo1X0bw}*F&7doDWG1+q(cbyCk9?>+L>`MF)ok2W&zH{e^7?Nw#V ze72hR>9eXcgw#;+0E39ugr=+fXh9YeSn+Fyw(ZKF^Nf1v5XMDIka}E+oRXZBBr=Cl z*Cq45GWp4bE`E2V$ec&TK_ls%Q2YH5kM>EDM@yx&fi?@e_U-#`P2R*lpVxS+R!^U| z5z~2Q%YUe2ro@p)NBw;d^@aZb! zVKy@4dWGDq=-XF&QE!c=3_lr7U9=}n3vv!t*^#N-_uGV~B5tTDMr4(mR4r-iKu&}D zBJn0Hj1%`W{Sb+YKF{8!%*JVaSuBSn@`5csI>CMDIHcr3N}ORH*Of*dpr9j&@|Q}2 zg6&ykK@S&R+%O0(9?@k7bDiNx>}tADaB9BlF*B=aUBcU*uMxh2ImV30m_AkSq98eY zmsA~rc#VzWM#J}6p<*x0RCW@4%9+O8w1Xw%eTceZ-V$|{e2{FV?49O^NI>`+8s2AZ zjM-A6Wj?Ftpp-@)Y8#p@;vtQmSG^$Vb?$TZ8Fg=_;wcuQ*~$==R~2=Ij<|h#(Reu? 
zC8hgBK|P}nmqj~dsxEvVe2@zjtFK@{s#W;GKMt zn-)*iSrkVU=IsP#%E}V?uLPkb-nwNq&c0PA6D}vr%}Ci$HV=yB1M9uroOJCfEypT* z38JS-O=ZFu-HZ-DyTv?lu~^Pu*rgvRGsa zUmX&CUPMbBZ|Ot6rIC*oh6^M1TpA`#m&{%?Y~>>IJl_jQ?mdqQmrm56K<<{kp|KAL}xe(5?n2i9FMZdsfMZRq*!(Lo$dojf)kQG3XnSH+sd@D| z_#$Z@_c_UG!2$hD-e>A8sDKjl#ql_sx#Y07h%&_|^5e$D^x1s-b$6fTwU4dyt%qrx zGB|&lf<%DDF0wDAee${p{K{aVvO9at@#cWd8%LV#C*|S$Uo`j-XY6u?ED&Xd%8HrJ z5lrF+Eymp6+32m#){I`*>SR2mM<*iykcJt9xCJqCU3vN{v$LbgQa^g6ghfq$SUA^s z%@aJWyR@c-Q0^`wxy=$76on@>)oGK5*DoeryDM^Cfcvhs4dFFdu=D)gwK0XZjdmHE z4l1*#E-MDdp0B|cL9uL-ndx}_BDY(?{t*QF&bSs$FZJu?)@^M|^JdJ=sJhk0!EO)f z1BWlm$(1@K!h;)dH-|`-d(Uj8`9!k_meh@nBKucGREgjX?u|V4x$JV*Xh?s zl~?Bov%bsLi`~mpHvKTyuAR6pn~w!&>GZ}S`W9`7H{nm2e3ETC-+Jo+U#zm-vu$}= zc|2NViIRoUB?f9CRZzd%C9%UHQhS#FJiC2R0(3i}ltl;anATJv2n~o?xFbT%mas$(b~zq5={z#vKf5s)HlOq-dZ~Fo1BS@Z?RId z2(Se=>IEYmR^QjravR^c{PSHg?5<<03HUGf?M%db0-D{?S`%Pi@IEzwTI=W#H7n@j zuRRFWzV*q#z+~)=H(!HgVzv&1XzE6diY~aF8M(!tRmq8H4&#nE~*4$+pKs`?7oW6;$UqTGWWZ+H3G@Q`Kph%fk#|N1%GoFFbVm`)U zvsu!yq$skaHnMG(N`z^1iON7H^#E@xT(0Dm+Y_oS2{_W;JpH(ZuP*CyMPLJ3*2Cry z(;dO${}y_=aOnj4hzvnv5VG$1##0m>UvQ7?@++p!+fbk-kMSK3hK}O`*pRmQn4a{) z&9vIuX0az$)Opr-haBgH{7#rXDvLV1BQURmS|B-}O%58%txurIol!hfc>9{vTROI@ z^=g9Z0mh^(T+%beO@Lw32y0cRup)#YF9=mWR;vt;B6k!&w$B7M@C7NGmPZcNHWxgf zmUH(j)V^H!B~q1NVKCD)_@4i!t5LVdl<-ogi%FS4{pOO95l7FqZv?T~1_upA9HJ{% zCslN8cIBOf1U5*wuCcPI{7%X2kUAwj+i-+;VEvxZiOQ;MN3Fu;mb}#<=-`XlB(X9~ z_~gPSe)AvM&94$kV6~mlm^;!Lk@;oN#4z*KGtxkQ_Pa?DcY=v;pJqNi2gPOur0&(Z z&T+>WsWM*v&zw&R>k-FOO{2=O3O{Jl3B5voY-l%gzFRSj(|(ETF=KuIDVEK0>;^&E zhQSowyF+(?dz7u0vmAfImw>LGBpxTPDw^`0E==ewJ@5MTkMn# zGUhdz@A_07Pg4mkKHn1?`UZe-RfcJ8&+C+n5U!Fax$e(JwV4rP^%pE(&)ApE6f8Co zQd&qO9QIPTzDyGnzLlFtM@vE8!BkT8!*W>bbQ2hxz#2<=eP~J99v&kx|Mwyk} zR6I@S4_N$XsWK^~7GN>6i*<7Tx(rU|j`}_ORXrLGDfnN#SGbZF_L~i%$1JO6#@s=c zKN9VYz5t+tzL#bLmI8YI!atV84%dZX=Or#i4J!?Fb@EFt8|K7SSkA9{m5*6cdKkpW zH55~boExH$VKe*8nP5xF*DwjMNO}Bt!_VuP@)fb!s>tX3()qeu;m=5HcnjxAuC@k; zJw4xV`uMbL&qxFvlf%`!U* zK5pFuCK;y%aEJ#2rm16qyN+G&7<1STT%7MBeueeZ>{Q|^B*QsA-+cciBZ 
zL0k^Hu5d|SpGJ>X_F?2cn zl=};dvL3U=;PbI=TO7BENZ`&ddfJ8%id#^Btr8fN5wR`Jw7imhz&UFZ_C{`|CD zhxZT;ctmp0SF5@lwKwn?p?GS@RTn#vYs@ZSsNP=II2JXxfSNH*FZ5_C z%U7Y6lb6PmW9~m(pl9A3^dMHU=Pi6xns}K*PA`<7=W zgD!#4?d-VWO&tqGz>wPoZWh-<#AtRLbMl%d-inqjdY4&*BMK!O)bmGe7}ur*Q5Et0 z{kk=)UeC&{4iIz=aTH@A|J<6$g;h@c?>Dgh_gR1l|b6Y4I zTi!ysE+?nkAKuGEDwL+PB2WoJG^ zMZbosgd5rtZ)fDGqHpYSCNEimlwxX!C>S-&4s!u!uBs@8^vY|FUsx?ZjixBY_1L9L zadpS!Q?bkjVqjnR4Z%@kx~;`#uV>YbR>d}$3gp=WEa(s2KTL)Fr^~}+aj&Dz)A-t< z(KdH~0EENt6y{f0!WgFG;z(&ZG}OvFVxYvC#f>^v!||Hf3RZ5WIl5UscyIg3FC-wQ zIxEy5p&@@Wph$bH>Z=eNx{Khu;|MRqTKfl`?)IeDmP7M36Ve^d3ZikDa6P{1^)V2R8(l^S4uXn(J8|x4IO3BtQk}VqOi{IiURFJC+sz$w96#|Vm)wZ z#U>>4%?eF9blAs{?1WEIB(mblye&W; zQn$MBXT{TLJWfAJVa+T~>bDBo2 z8pqVwuL&N*KoU{P+|6_j*8NGzM8nc6A`pgwO{R@%rGR~kYbTH-iiYt{7H=z)m%hG# zxMM$RSjSMZ2F(eUpdH70#Ld zIX*boq{f(M8mCan?((Kw5r~1eYeiSWQ?{jjY}$DZ9G#@R(7n^t^8v5=G=f8m$NU@ ztktiHP|1jKe)ZHN_w~I?fh~rm2Gat|u3U-7I_>Xb+a00DLnrjttjLd;>TsSVJj8=< zI#7n|brfO`la4J{D=(^gINHA_Y)`7e>btlfukWlZYav$r(yA$kgLY{7wUNg5L*R?` zZ3~{X^pHefnU&Jcn(-izR*x$O?`C$hF>*|BrF+L)pZi6hiaj=Xdi_EQkcOu(a1I=* z$1|4XUIL`O<%76 z(5cKg_LmX-5R)?#0|uJD2wR6Fd=c6*1Yd*VVV zFO>4lf|Zbrwpew~B>X`zG=0<}Q>B{M{N`#I)mrL~6BH=1XsrKJlOoVU-(NQV#f@cC2*j;6%dY~FUPzXj;*e+mdb6%@ zGiAfE`GI58OO}ymuKFYGL*VJ-q|i|EtpFd92e*zJk5oRj@!SzkdQ1v^%eR^#etl03 zs1w=Q-~aSj&i_nNsI2*P&!Cv%L3swNOKmO6XU#I$q%|vHhj-?L$ROKi#bRnJrGd&( zcLgMa;L1x$7Lnjk{(HfF@QS?>JM_?UDWT&=iK}+-qj)EhJH0MDdzcJlR@4xG?6}&7 zW2W_X_i`Cw@P!Tmdu-={nsJl(WgGg>VQM3~iwFf8jICv_x(D(R5B9xNBa1+PoXP&AxJ(meWXL~sYO6iJ*{}XWWTXIGY?c!YXD%p8v}lC) zmF<)v(h2-y7v4F!x!heZnyO;$-5>FrM;4pNv$2|n_5^MZ8f$!Rfr=>0k_WZ~T}&Jm zl?iUC&LVS>2=2++-rn-tbPs1Rv={MkCfYE5&n>O%U7mBBWrh`1l#H%W1(`KwKt1Hjx%=a z`6`w-%$;TXhad#SOYr;Es@E-x<%|=7Tu`}~hs*}?%n4VblTFA6ov~<{vQO*Y4z}~) z{85YP0gRV2Lm-V0Czf1UYF;1Olq1{x?^0XqH@wJuhB|nd8`7zS&espWaH!^s2t+34 z7@4|v3cX7oe&$L;**)kFhS*f73*?YwUrDT*Dl`>-8&{ zxmsenzkGS9yf{7rs5Dvw5=2=%yKw=T6Nwn~Y-EQZxg?HxT08{mrkBWO@rupcK8V2_ z7M(brn~-kicGlE%kewRQI~EZr!={``lAR}IU@W!-?s=L_z`ulYvs?iYzzqwH_l=R{ 
zK&?`swk%#*vX$Ivlqhm^?6`kK0dKngAnHl8elTX5gOG8fR=>pijLk!CobLB$SSP~yCAE8 zNlwdAN5%YxmVMjJ_r=eVtqq$=#G*6F+Nd0AC;XO6Qwf=BUHC`22P_yfxsKZJ@NDjx z2cNdw*dT+dXpw8=*fWt#p-!()W>!$9RVi%(^t{ zJ+n9Ow;`jQ4^Anw5X*|DWJ%$+SbyyAH!tIU&mdK?I4*-v`vUk_dryqT`gpy&!R}6% zG21jibF84X8soN!Aacd6?&@^n*H#xnT>b1*B!OQ0w3;B>Y6G7R}(qFF+0OhKKLNi6dNZi zO2L3f7La+xCwX6cq|Q?&QpDMvR7!pChgR3u-BIX1Zg|?V^P}Vmh{NF6mTd!NUQD(W zW>PQQwVi6S^0`Q5;g03Jg^Z` z7$MfNkvg9?dVjQJ{fS3Z1PytYS}s^pl$MA{C2O|77z&!(gxqSBTm|?VeDhTh+#HS*OVg!DVawK@HXK4YFK87EtN#FF>H1Yd9o(}oMT<;Q>i}mN|P!RBzQ$>>Na`L3n078&jtwsOKp+{ z7a!wPsfeU?8kh8!@*}mkf^x39*CBbrK35@*eGyqNGM8^l)eHiA1MQQ|n&a^)d&B1N zvNDnfN{ZIAPR%w>i*eVuUe6-vST@TTc1{+bOP%ds1hlSr!$7+2gFv6p=(SD?*hiej zD|oQDiUIa#x1w1vNLkzsFSEZe@p|kSb?SWOn~SHbN|#nXBes1XWrZ?&r-T}nTx=6LApet8S1x@@`A|1SMTYEeU{Z>Wyt{Zmd zUdY^})~OhfbwL$p8KI^~>y`6Lyu>7-)LtrNTuW=$KDglD#)9FTy!y9j|36PbrJe@J zX(xM?Li^RZ4}}yZQCv#AoFa8QNi*7`x@(Nab=Svj!;f)Q8s;?DZwlXPCGUYWiBD{#nGNiE7<^ojp z7{d+dFZREbo|abg`bF7fMoMS~P>wQu@6G8iGkje)E8Z}Z*SAp*^dKGeq39Ii?ic29 z>HBC&VJ=1a(`%u)L(k2u_dlyAL-I>qa?!8%(E+p7dZ-<qG08MOyEVW>S}NZv8e! 
z?;PW?37}@07=ClgvvBi%Ua?XFsxd+9_!u2W`Ji;@ZhJMT+VTZYZ(rU3dvr!A*h}&5a)y5}K5X~&UG`^lQ`QWANMxS4L@|Z=Lxx9&A zNLT}8_}bM%{XD=9Lt}Ap)j~nhL?8H=$*zEb4#$U>@r6sb{T!J)-EkWWZ{Y>tPCURZ z{v7*Rf#~+a9-XMofb%vnLTp2B{}p&@g*l2wk(~i1y8B7Z{In^m-WA%@lU<1`*E(QJ z*l}c2fs|M*7X9z^@z${6)DOz=lIJ82FLLI()~IcH`o!x8XnY#=4;?BDHY_qw--I5o z-k0GD97}K=76uu%kS;|@rMAe#(lwME@9>XGMD|uDOD$(Pltd5{6&b12C(>jJjy}do zV}%P=*(*cM24ZFg73N%BmDi$r%3>h5?fCnadWxLnq}f19JJre=voBrU%egnw@tv1i zWOOv*mQ6G)#41d}__&30^<^9lp>i3Uvu_^Cs;?$0wQ4B5D?eU&p3e}S>0AtZF4Czx285kdA`zF%dB%Jz>oZ5ODcAsk8GijBo^!eDSmkK zt!TDyQTQ8Npo5{|ktEB0VUQ*(xQdw#-x-cHpK1~ zef7j)^6H!KgAEgQ9$2X(j0ofoezgtkf-v2l)@wjse8%I=yjrC}CqP9qL~ADcE_c{5KV7qdc*61+Z1nRQ$=|jJBBtu@ z)gbGnL#gnjY`#>^@po_w3^Y>X6A!Qwdv)FO(=>sIfuSAo8)^h!ihoRTp2%%xt`b$!3ufE6uY1^6f|2+S|GpTeGIxH*79r@G&Jz&-wD9g8F7jR&&w8 zUJd6)#KK_fV0}9^U-jo0Rbk~~z6A)f+{ry3Q8}4BZV5r=*t|y0z7)NCf1U62-8|<= z&6pE?KAN{2fH^!jAO}^T9dsUO90j`YABk7%o&tLFv&ODnPqJ7EUd|z!LduQ%3XJxD zP}jSEyN*HKb+_8z$;%ZaSpUzO$}r9~2^ViMv*yK>X-$Bs5Xg(B-8)|%-l=fxy|ZlB zp6P2x>>EkTvXuTd?c{Q)MGk7g+59r{N<>d29awU&0YU-j9MukEe48%kVjO6OJO^o4 zTWU582P6CJO=X~BDsf4(3OD6@lr8M5buu;1+7I_>m~F&R7g5^s_Z6DRw;ZT1bD1)C zb(XUslE`cWl%>@|f>Fq9V+J0D2bWD=+!6@X<|)_=K4$9jbZMzG20phidHaM`p^@HT zO~vu>(NZm|CN$%taxsH0^Z;7#!~%Wy!xRz|1ozXPI}VRQg-}n_6&%!+DUUN zD`2*dmlGkFvdIlW>PHv$e&ylz<=XCNrOid`KAN512-8T!mp0Oy4QQq0Z!s^y{%wfBER=0(TjF`N3(Ceb}qGwow3}o#+zhHCJ?BXrDEojsE0wg6*=hPKdkG zVlnWyoOPPS%S(09lL#dFkY(mlwK8N7jPtIg=lC>Rl1+y#8zS?(R>u`$a@Xi+M+xsm z^?`xwGwv6LhIa2*6}y^3D{hvT+=q04r#7=^!WiaDb&IcPu-F~@bP+HXnCDnTg4xv% zEdJeFIzCThHiUf?t*?zlxvimJYyzp%X|+hd2zR4O&4#To(ZuwfePzZmc#&V~mjB(s zCl4Ot0Dai$XzW&VveDEI|Kx1SykNg5i%cVxer2(^!&hm(0tMTjoXWR!$iFSveyAdK z5M)ng(T=t1_N=6LYikAd88ONfI$f%d>To`JIW%e2w#DtWBK@mg8C7me>D?&6xmF-A z^vlT3w#M*7rU@rLQCx)uP^37|bu0RimP)LNmO!MCSR&mk8-nNqe#yk2&p!s6gHNZktr42KP{8lN$Tt11ZTf zlt7l+mr*<|So|>_6)|jr^mUol`-a-){T6GfCFVX!FyK65!aU_y$dZ?ejtS8WHY@QL zB6CoQolb>_!Jvy~=_Y6}(D%-@u*PG%ue$wp!JMz&jiu#_F`^hJwt|R}WH6UuO~fyC zsi!M>yO;0gC#=iH1{3Z1`w|j@D@GiJjy1cbq7tmRmC|ye9lJ+p5{J6==tw8$QQdNl 
z8t0#tM{JgwKUvCa6K`q%Y{CGwI!=7uyUM!vPP}GYQ5z%%qR_cBS!RG3W>OC>Ppz;D zly3SU59l|hD(Y&@bmtggE$1w7Sf^Usc?}`c#n=$h-Y=_njtSEY0`)Ylm5iW^giEfN zi4^zs-P`3nNQ+2^gzx|la>tGqy_N~?Rv9|bFEKq@+T-E(l>LtY^lyTE5ig*>!@m8k zqoDphFJbL?5%Z!*Jh*Tf@$n~f02Ra=}pfk zvkpEjay~`HHf?lJM{}Wtffjh(;LMW*XD9PK<_`NMMW|+nNa;X#uS+)zcnZE=Ibhsf z$Q~}5jUs*JlFjHakNyzQI2C@fe90vl;sb5Uj@{m)?(?Y zF9b#X_lCej()`?^>_N(tQ{dQI=@@#ooh}Eh7J4!#ua4#*84a1{=347fr5~hz~#|(mx=842?N<-%@qcj zp1Sd7dx?fuzVjPP55)S8A{pwGmWk4&?gUlc(_`ip0imflMZHpMgY7C#4cJ8MXr z)gP3Q8UsBDTt=L4!&7vvhvtA9f8zRRy?Uw}%i4X5&|Bn@9`h^wj6C%~cQGa&MBwoL zGlvaxW6=rNy=u`!W0>4mWmdxgR7zAPN3~F!vo_|Qg=tq{CC1XU;hr8I#6@Yy`et_H zd`E5nv$^fXQ(n4dJ?-g~*ix;MS~0Ytu|W9YR4Zd@0T#-kN2QUgKF=muxTf1vw{GX& zl(KZ|v*4jc}b3&s9z{jt`y|>V)-e|V6YTei+s;8&aP#te8 z+G`s>JG7j;oqFjoiPNWh_$@N^R_EO|FB~?zYvhn{tFfjT09R5;r~~-0_nq zcFb}u8&IA%dc?+QLPve@58OLp9NLPm2?v{$==9ue9E4x7#kGK8gw^Kz3N({sCT7x4 z+dk+r>KgLhqbrjYTxqEOi;sW0;QC#U`k(&!&3yi=ODEyuWfJU)F86S+hDU-jLmz8R z_zBe0aBM=gW(;d*#$3~VRYHaBoMKjz$$VyQVOdsF%}_bilN#X5Hs-A$S65uB zgEw)Qbg|KCnceCvJ5jWektTI>?hR9~EQH!JYC4s{B%iTBS;qdLnhg{8~+3j~8#vBU}|J%Ug{}2W5t@vt3i>9Tihsqf6y81F4r;DA0!YV8w2d zJ>9t0J8jcAA~eL!lndm&9UV-n=CTxb*et9{eVuGKOL_;x)HgVdsckyfUwN~Q47JeL zS!TP3+*Fz^6P)(koM+TX6irof{ z^&HoyzGn+XJD3UG;&@eyOCfN@j#6O%!%!hIoU|{He(?(Yd`cAWueHu!W21ioL#EWn zHv|k3cGJ$MA$pY!pv?^`<+#VJOPrRQ+{M#5T9l4JXqgSZIgyI?xOjW2Q6n%iDbaAO z;&ZA!nQ6y7E6al`4CaWln`DI}7gafogJDIRVk7<6KRy@h3*_6-(u{L(iSFxvQg0Q= z&Tj8ZTyI0VB7B}d;4l{ukMUgg(swYK<(1+Zal@q9>Jo$6#b_?#$!nA|85Y}Z)gEq> zVq=r+$v{u?tW&@RBA}(~f;jhY$?IV&EMUljaJ3rmi(M=Sff28WzFMZIlgMmetterj z=6IxKpeJb_i8FRnX;fAqm6SS$GmUbG@w245f*dCGp;NT6dP6ynFQ;O3=}iyaacY*F zSIG7lba%ZT<+vwwgy99d?GRLVvBC~(%VITgo5|E1#pC_C$he7AnL$D7V z&_(6+vxrF@(`xjeCjDP5plE<;L+jc(#X2-UaXcl#OE zo!?IC0EJgQCY>yAyh!Tx$jcvv;KBovRB}3@6-k$dlIrzB<<4(kgWOryv zt$(D%QP3)J@C*`G&xky8(hoXY>qzFQ^~Co_U%^~8j^5t~{eCIUA1Lm2iU+73jUI6e zbE+caV`fBgd*yA;_yJ9WH6Wb2KAbdL4oZhdrpG3d zg`m6Z^a`p!c5Jw&i+&~TaHMtYhMK~TZA?wj+}DA6NnpXMFmDEp4e$WX&p`H0+U~fN z)|c*kWBrW66J>dxm9}DOi+afzlgEK@E&h$cXV^gDt 
zGH6h5)&^OIdE82#mc8_ezlf~(7{S$p<$r`^z^7lIz^LiKL1d$#POc%bD9I@1bMe~A z8KP$cMAy0mHri{^IM|rv(h!S-e?x!a;}o^e9Dz~$ zcW9->@8Dk-bo@d?$|F-T8X!x2o28mj#Ly)6*c~QlUCiOnu;Uqdzjz(1yBOxST{uA;r(86UuvqhqVsK6dzQT` z)pWgGzGgmT6m(qWdK7H7cB*z_l+sOjCd(|#Z&f>+RMJtUTh0#+j9ml*R;A#63h{x? zC7x8_a943PzL(YmOntf*cBj*Ii_D;(-~RL)#a^4*$R0bN6kUI zeXsWlnOCWTwI-0Iy^~A~5e{tZ9YmQYikeILla;6zEB@^U7zFrV<_ckZ>{qR{I!TO4 ze9SZV%XdgGn&BGUlo2wJ&X_&Jz3JHT?nit0 zfu%CW1LFAau=t2Jzv3mpvv%S02DZ$h5^YD9z{8?BQhJptjt>J|QZhXDoU1+p{{%RcH(NH=kf8MPgS#G|~Ag z1TK8~Si&-QnF{rhlFaZAJ*i>1^9wWgT+QS4%rGKu<@r~+Y=9Y%Q z^?O#m@H7Yb(|&%V<(u8sV-jcoZvbnK7Tf-v$zZ5w(9`l52#Vcna6qG6p|5pNx6I(< z=p}Gw_(~gWDSU2$3xvtROD!DdorPNprUp2d@w@r2s37dH5oj7OvI``GYau3*t=YbYAA&T>@?XuN`b zF#~9V-!yERpvX_ILa35%P<~W1RGV>@{k=n&_QjBDkO1<-HHYwh;!Brsrqze;g4DMJt*mnqz zpJWuZg(T@2CBE9;1H`qIY8DcS1wP5D zleOm8_YTbJ_N1@>CMNnx<<|w?XVS(rR^&?dDB| zM7p>#77OQNLlpLDgrh(g55Jh^y((0Wkn^YJ4<|@O9yDwvBj`A#i-PYmiCRxj!KDVu zUaLBf-4CcVY51UrLV)IjA94VsFhFV+7s=0O!qaDDWnj=ARgFPiqZF?J6Nye9^w4a3KkwJH|z zgY%y1Gwak2dEfJmi@Q@?yIJW$y5`fTiks;&05QEWxjn%5G6_xu7(L1Fk4MV_(c*T2 zII6xM#kc=3v3AGBNXGJQ{n^aRiTNuqB%yFvAT?Q@wky!xy`}Xv+H_YF-jT>FHrmy> zS%BgXC#K*q^b2*iCup4Vjd(V=Cj(5(1kZm(yktAWhjf@vUC5MHf9W~|A4uG^3;q1! z-43rRhk*G)PD6EUX1^~-`emoIh6E!1k_Oe_Nd-mTf%8GKYQu#p!S;TyHIODI*6Xn? 
zJcZ>OYGG%ck(0B-LkW6@@&e_AZOZcIJ9l0?Bdh8|J4|Npb39KG2EauHDEY!*gvC} z3!I&$_KP><-t{a`sp-=4y?746<|y{}jwRn(1lo4QS|!-eNivVuPNTHafnuhR>EKJ( zpVDctmy~mqWWc0Y^|jXXK;!f)=5fw5z-*SV`NDgRP}*CI+BQohrT&Z?U$(qS7@Ve3 zX2*J74V_dQein})pnC-^LHLxa&!R};Yv{L$hdL&$oGpRj0^xPX`w_ETWzG>OhUvpx z02K4?4nPR`Q9S{-TWozvMXCY2ca|9u&V5#4qIQ&>*?n;dy$$|_@Xu9TSps z7ig=#+Wm1V)6^(KpkB5+Nhym2qWN}10MYoxWKGv~IAaxqE0yGpg)MeK*R*%y-xM{) z&(dJ8SDS-|j4N)WU0odm}*<#8}0AT4nBgQ+G z&m1|Qv{qQQrg~n<0J4{jd{vf}G~Knvh}5!@ABv|ZHdDkDFt~0wKW(3tjj!z}ESr7T zlW%q|517OH;+Y;bGtTMdrge*<`|D5^NbIUqy|J68FbxG={>xy&HhjxM;(ef~>@T{AZ2WtuqA5Fc6G05VJW^kKy?@ym9h>EciPZtAox z4hq2c5(*mzOiw)>wmr^}LU^{G@1=Mr5fIhYF#(Bn@U@e#uAP_x-W+r!pNs0|&jIGo zc7wd083LnJ^s}69i$*FAA9nAoO*+zFlbk(xQmOq@H#sq(e9u<=U{|NzHgKI5<@oZI zqO)y&;J$<5s@b8pwoV6iA9VUg)q`bJXITdVn5Dz7WXvhLfkJQ9HT@rJZygt9x3&%6 zA_yuWAW{<2N=bJ}inOS7NOy-ch)Q>=l%xU!(#?oUN!JW9Lr60)bj<+sUZb~n?dN@; zz4!P1;}3u073*5-T*rB?<2;UNc*kJ&3fO%2`HA7Qa801WmP4(Eu~_5gy+OKX1h61N z=b<3l+VihjE8qc>Ri4a)p((?N<;pyucd0`>82a6|RYu1&SSENC;R9X(g~OCUr^~Hd zKnGoOXIA(VAL)gI1E0j@9Gz)L8#quzwegZho}lEjX4ecb)xb4w`E9k8f7_q2Ij+9 z(1;(*HUN+WUrzXurT#U9{!03lG@XH<5Ej3|11QL>AYBLOI;=Kd(A8exm6K?E7V9f8BpfWOZY z03`i2cKkWF{--L;F6*_~>*(r3y3zGQqsL-_J-1A7B6}1*!RD0w--q=c`31iU7nO@r z$n}?ma@9m@AfT+{w#hV*H*pDH-r0}%n@>5GrBLfyS}%WS4@53HY?-y@?-Ya(ox6J zbr98S6F?Hl)}{SrZcR80&_kH61Ft->YYbH7YGq zXCO&eQ-6!~z^c9&sf_#~k??bimPO*NlKz4_f1u(2_ZzI<+U^`*fJ98_OL? 
z=RnT=#em)dn=sac|#SBwN| zrW{^?{!==NRPAoC@IPp@Kq>D(^F!H~#W9x8R8{C31=<~BGW9E=Xc9W?iKC=&4@X&x zTTekPViqabX#^s|cR2eM6lv(88Wx&?=J6OCyn&pWWU#31@OH` zBfYJgHgi8j6PMdMf~O}w1)48$1%im63*s6qLnf0GL8}uRd*y1YPsI)jD0*D@z^I|}3 zQv=39S|YdLX9ZH$8>i-v2D(3D-oF+zuu<_J)hF?5nRA=opQ zTw^u8>FU)iPn7ckAOQO+E>u|*^!OPR3t^Yhw>4Fz#?5cE(>c+q+hPt&4>!G6G#!EN zkEqmrzRPb8pz6l`7D1p!nF68F-6eizUc|{kI>4?po%UG|V^A4yZ;!>m>H)I)p23zY z&=xdGHC?cCJN~gx{n_)*0`ABPQoFY=8P+dWu);x362*WC0g8{=J$kEYNp|OI<~!oan}$cHsOJ{ZQKCDU@W`NoNTP@^7NK`%9DU{`X!pndHH7yhK5BPd3g2~oYl1noD!cxe(}U_He_;fJ}6tQ{6M z2*#0D0}3?+2~7X>Qx4Hy;3V6F%y|1pBLY({M~tFPBoDOSyVYqFNl-<&1kTMfbWMHk z+XRRS0B=bv{eyDGb2XqZ z0+y)X+oH@HnTi*eR^&!sLHc^It6~L390{;)<5d`?jbbEQEDRaoT{kV-j(2$-Al8y? zBo(C8v_rs}3IMin8&f)sq#x&F+l9ZW{CAt#Gt!T{dCwZC6VoMM6cSc4mFisDn6UGg zfy9+Rgga8#iRe+~VP+ zkVjum_3Qfq7N=Gfn`)m9QC^!~WslrkOuR;#v-)d`4f!aGqBw`zpzw|*fUjf$_q7rz z+FV&d4OnME8X8zx;^ zJH%>T7Eq69l_G&yB#eLGr6WHK7&zWHnw{TnKLe<&LjAQmu|lS!PBSKNX(oEBdT2bn zp)pgM70&cW0les`DHty_0@Smu2_|LK#s%`=3hic>@4z1S*PUH9^6K5d|0wdH-pLMS znEKt9fN=9q%Nu>WWBtSUwPF9<6g}$#%s7J=V0)l29&zm?$0XcsAq03JR^= zt%sJ2ok$G^F7tov5d}W93Q13H8HUFx!*Uno3Uj4OG%M1p|spGfEn;0ht z8}tE-Fd1tMY0L!r^GWGqIO+1N!~4O%x=LAthU0TlXMQjzZF)A&43b;$%-*DR zHCw3_>h&~i|5hrXX+%omEZJ?~KMC=TXxP&)xKT|hqkk5kw5Ud63{*>0mYTpV1VpjHX)$cHyG1R z=ZC3@D1&q%nRe)|j>6FZ7T9CjO;}!TOA9gsQLo^N5JQ59*WQRNJNjIxh_L4NZ4io$ z2HBOz>Ou7v{jz4W#J`bcjcy&(ia6!mLen2k30as7IVqaLe9K>pEz@{O@Bn=t^~-mk z?7c1_2Wunb4d9Znj+M9jl@1j?RrfIZpj!C_$|$sfz;5g_4pAfYq`Z^H>KknV3vApG zWaBNY2InBw9vuh)Z2mnGDf-zlwA2BtMA)2W|Jj)m;^I`m%ECJRU;G)@BXrrU|`y_ zd-az|DYb%U#HzjPV6JLD!eQCy+IVcP8TSieK8W2M0Z!xMXVUG2 z4B+T9s30^*j)9@Lv9Y3%6W)8I&{L7&9&wSX=7bSCQjL)7Sww zwLgfRjhoID;~EYK8U-FX8?sPy9u7IHMZO!a^@(qvK^+r;f!F~8H8Q{mzMx;!1(zCxZFYG9>{o9f$N}ZyLt&+- zd4lma(3Byeyt{X>u16hMvk!A|n!Z0Yakt}td;Qn&|0BfxhuN({t|suoVeA`(e~2ez zFL|aL()YhJet&%U|EVf;Z8cWb4rqN-e^v7@vnO=qni zeN$fD!dd+PZpViK6#gf{U1@)l3TD)~07(A-Na6nct{yg^a4A3cJ=PA%{BN&0 zZ`1exJ~97(k>6+GpKB)c6v!&XWXkvdxSS=sEa$IR{$pVOm#h35Q|#yXxgic=Ek2H- z;{V35)%ve=o&IQdi_}eo3AD8OCzRTGIU}(?QU6VJ$ 
zaU}oCg2|7k{QDG{QwGKWz4JZ^YvuuDT~q(gLF*qQ__r6q+JgOZVak1hCANRYkN@y{ zo%Tno$zMOgkHPtOfeG-nV3KopPrL2xT&?ms#v|7Gd?%dG$FVEyBUmbZajhil#z zJDC7G5{v(TzS(a;*EyG~#^x83-&g>3as{DD|q6L*2vY&@Ii4e zGeARAPSgOCUy|1C1+WowO7jwf0b8M!&p^*LtZRI@+w=?Qz{6|y1S{tLHEQyW7zR5q zt=HZIn(y~2Bt%wgmj(ev0aI$Jm3=AyVLJ6_S+hm;1(|YFnR0@9CFBQt5V&DI?edo> zGXu5ua2(`8UX##nEoe(S?*Q5iC1)s7d7#Mp$@1alm>3Pw#rJ7D#p?%{n0#k+-_WDV zj+VjJv`4iH&s-k%;Ya?%4a}D!*W=#~Iw;*&to5IKzrvD_i0UEgfcEUBKM`_wRw0fuao&8<^{A5KBvVg9@vG5RgfA0}W-{*}PJ6hCAUf+Mu$x-m(t z=faWSQxMD=e#Buu=u^S|m^;yB+4Gt>prN3NgOXAdp& zfY~5MrBgx%)wgd|?ouoFJqOgSX&-|QC{=ZIdPFUgU3sR50K3Wv6S=RiT zCAk)9VxMU?_X;j@W6y)LMdbGK!WYdx_(MWxk;!&>=K1Te$8%eSIlc^Do=J@O{m zT;b3|rcfZDyQ&!Ad6s&N+|K$j=VM-wv)uN_Jt})pljX$9XlCX4S)XR*>x=2h2-!3o ztBV(z$%Mv_Si0Dz284mG_M8xAca@PDP-cRsDIapKRfUh9V)=)`=$z>V{{LL?B9{O+ z%0IN4f;>*iIwJK(Elq)t3?M}gnOXvt*UFy2Z><0&D)`z*HwTn_wu*;m#UyuPa!J0| z?Bw}}3>A4*>+s9_TNZDsFDfbrkst5(d!Zi#-U8!$XB!qSQU}b=jHbT7Y5_sL@d2pF zkjgL}%QxUHsfUDb5XONgGnU6?I{2>uzT~4?h^ZuMx%*w?K4fKz1kBxtmV8;$@>722u-MSNagxi*`Q!}*Z#JRmA? z;{mf49bNEeEH^bGRfgKZrt|x5Qk&*6W<=0Jj! 
zRPVI6$tePo)w>NWA~y+_IX(d2-uq-U#eCI)XT36(h~Ro-2p|APnDyNcBo!6p;-(TG z>$uHj=Hyy(DZC~~uPjr*Z(y~GM5G{C;lJ=V`4tsp`9t%J=8VQ4igXki!W;`!V~oZC zUX-Vz{PZLz8*j#}+dv@3zzg{q>e-wnV;2-KUXMhByNY2xM^U$(1OSS~DHxCuW~fz) z(~Xg+VdH9&Npn@+aj|>_+(9v={UDHt+5NXv{>Gm!l^*y(hVdJQEtktL1q$6A)>kbB<}0K1i$tj>Q~Sp90JnYZ>49??fLgeBjOQ?$coa z<)k=Th}|%;-W$$|u%K-Y22TrT1M9BmYJzMVUQEUdQ%#&Wx`eIDHK29s3WwQ(72eI* zch>;>8Z6VI-s0P5PS_R+z*-pxx%1I%D+pl}+Xyf)=BYI`-hizTdt8{cjH(jsb_5RoiWJ#~Kcoc5b2ff;GSW)>HQQR~3=h_E;3jI!L zhW)_5_LQGL{lbm`Y+8@5W|jhK6d$%YK$oK_(6e;Ol&2i?16;yQ(FM9{CU<9z@51>S zH^EN=5p9^>JlZ#ZUh7yZB;{{6!=fm_BGcvc!J&yiD8OM;h~EJ&w9DQKHr>h@Mb$z2 z4L}02Gc|E7wdYomYkLh(e9#hpnBAt*C1kCx>S$A`<@0IXE$ie4AwyNoPsZ24=k|^I zrp%fSUZ$Kb>u1a;zmxJ>^9Q`l*8ed0OmMk?`b8%B6E*-a_Nm>K>wui!LU+8*U1{*{ z(i_vliNW6+34T3$=;_tFKRRRo7^H+pb0F&79xrG&Z;iC!jTE1l)F;gRk7+aC9kH*T zxq{?tiHDl}-7w(P=0F#?TLhI@2h~X>?jn6X@8;g77xWvJUo*a&%hk^H4LOCo^OtXaTixnQ>?EmBY0Dt|TG8NRLPiKmW(pjJ4+Ogd< ze0=WT;#Qhx4c!J(ir8$9HVbz$_5ZSy{=DcfVE7|5Ud_Ql*mM^|jhTZ-P#Qytk4v_t zKsiHi5d4qI)JT=-A(aq2CtrlFyK6cAJMj1Gt}a8hIu)>abZx zBWdnW-}665`=9wK5LIsBMp=heU!HPdQJsit^tJjD7|%LA3!R2{Oadl0?p}hs7h7{c z2S)}gFASMxOYK4E2ag>V8-S*TSpmvD!a+LB8sIX?E)H2y9;7x-D;vbBBYRkU26>H^ z8ZbfE0)i^`UO%P-p*6f579E%ky3*Jkrr&Pn8)y(CP6w{mkDa-$xOttrgd48219lnb z`6FW~aD>LR9eAOpf)m!pHt?2equ&Vb3p5)j!p=*aODa5a{5LDGm=^G$_<)5>@hz3e zPnMfl)KQvuU<$I2J=I%q8Lu?>jA|O{vUvg69ZBjp!|US#{a!OCpQJhm0zdj?FB!I4`+F)5v`@wq~_K28woR?t;bkHnG{Q zwzZY!k#_pxQh)%SBgWdT4e zO58O*Yx(YnJL2cI!_h|4Y?46h)%*3%%M8WH z^vr~LIlE;m&c1`~RDqD)><~s;jXu+^59~ZkNmC1-q1kGau3Hfr4-8yASbhcaT|0T_ zo3;6d!5{H~VUi{@JovK*b3P2wY+?gP1V#HB9Y}&7Y!yGL)et=cOhGP7?cPnz^!@<- zO1l26(JyGU6}exxmv6wVyyk{_XCl+K+-BW!xI*7tVtXd$7Z>1y%`CFZvJe~eh&EBM z!l2=imlnKaCBY>63sUHt6c~WEG;xq%coy_2UI1OgO1eZ2;0+V=F=!J2{77`7%<7&Y z){f;#Gjtn}Kx1B3_msJCo7!%0^G)&eQJj5Pd%?!%&vCP6ADvBMdI5JD>(=Zw=q!q1t#-2%=r@a3@#3K$7oYDgfuJzuageO6Xonr7 zj%Qc7J*f1uGIMfi9%*1ah1={~g@Y3c@!36CV4m2<49~d7QPXCdxA&N3rJIFx6?0Kn zYul_kglF9QsP(g2)tZ->7J8QB@!RjR+H&Oa{y 
z*Zf(Q7!OBmqtwS(U*eWNK7?=B5{5Nex6J{&jU8Z_A-7@O9B?G4A^I(eF{JQprkIGw2ki$ay?KS# zTGvMm8QbRXJ)0)2{|u+K6UK{q)8uK`o~??m{dJ(l*P#~v z&;h6T$)n%e4sViU?3| zb{KNQ7;f1#cNuvY1Kuc&16dW$h{u5AnT)zAgA)oBG#gQ^CGi$yaV}eyl@~wgx$<$m zz5C&j=;{O<@MfBAxH{(;PZWJzP5-B506GJ4e_fgJP2ruwGsX_iHJfr@tJfgw0=JH$i&h?E~U+EmYb~e3r z6n-F%jPRA~YLSz2fglE&ffX4CUC~bkCI#;WPR;|-vqu#S^^FzOywEdORk-NU1a+lI zRH7;1<+2B8E;{&Rzwi$Nn#*LgS89=!8eTy_C!1evoZ#}}>4-|h*^B%{rFzq0XsF4` z3iVNdcZK$off0vsk^}Jr8Dz_=iJ_^F&9*gO8`4vPD;=%N&RMF0yD9!bjeEn;xf})X zA;JvlE2L7EY1ILkg0&Ont*A}Ds@;05j94i~P31e)--vDIhtc@XZu%>a2c)~eU(e;x z@kJsJ3>s8crlk2h81N;YUwLqTjaV7%WB@ zZ?t*CJ$j^q#zDn5)4bB3sFiJ;1=Ao-)zJ--W;i||DVBYoz>UTR==6FT>gs_wDv7VX z*hv;S1l{G5s^=NefH|CUpF{Z$=c|YuX0OyMw5^!v zcq63&yT0ZSuWWwEIV`HtZo0PNJR{4ZDzI!N(8^2Dy*93OMsKwIOd&RiXxz7HOPvj| z*Q+?&)r=MluXT28l2Iaaj#RsXfGYkBn~zAT^8kBG$jNkE2K|x$iy0!s!N&VWa(jP4 zm{S$%AdmRrV+$)VJPo3UJ(;Z>rV>b70#su76>|O`Z zBLl4Q;3ObJ)&%#to7jL(*OS8<50*8C%ZFq5_x=LP7xhArKfx%6Q0PyIMnDW#+ifRl z3ZTOclmRR`H{avek;cP4t71SsQldzc?Nctn`%mq$(XX1fdXeU?lc4v^HArS;%EsH7Q>}D9$`0J+_o()?e<-*}0{9uLW`rJc7hy>YI4Nq+5 zp73mPXw-Wq;kdHV@@Vr(Su@btfVIPWGdhaIp^l|47qS|*xZ7#G_zxb#~*4&z6_i!jGit zCRsP$LoO?qz`yyTsjGci!j)a7a%Lj%Q5I zy5IMpz@de)1(aiKW_9mxmvg9&91!LHF;beFsgDYp3eV+^PRaOW$W(%Z>|)Th7D|)! 
z?`j_dyWjd=Su}dB&RJ%voe<;<_DioK9H}^ysS)0PC9#!~Nbl1W{NhuVvM7xm)B#C5 z)tA+q=(kn-qHKb<+0?%#p0hw@HjLF4u&wLyL+Ty*WgpU{<@#2s3$>e}ppPb|q3_^b ztGuS+JhM_edRj0Uvr4qJ<{-R6mqst&E8W~_Bh6UYa=QRk)Py19dpdbiw1pnwPJxpm zO-c{^ng?jmv{GwQf$Fm~HXA~4EBR&s;!CxpPxHN*k$AUP(~7RWpX11VtLr|=3dld{ zbfGOCa-^Y)*Wo)E;RTNK7jwCU@Yz`-jqC}PI_wN&HIz>c zmZF3Ji<~v9AbFTxweOzC%dE9pTlwyH zE9FK1V<5O`3@vz;N@O=V^%Xh0prh)HKOAK5b{;aMY8r~yiF7r!xUJyM=}R*q(&J}eXj_pjA#^F4hVJuI z2QFY6fL0#?$oB6p-&5RedmlnbI1Zdwh~=-0c(`4bUTGttqnfH}3qU261%oByKU^D)>KZIcqY}7N zV=b50*~#~s+kA2+@-i;jp^f6hqk6Af8Jm^II-Bu6>cSoU7J%C8=9gz}qYQ)rGC(zb zVmKFY2(UQ&Jkqk@%XS3$Mg*J*oxHA@ByIL`tExnbEJ^V0PgYL%26jg@R*4(P%o=Yj z6o%3YV2x5kywVU0ns%G&uWv^8s$vf}4vUg!YKZE`OqtA=Bkl`BBN8Y)l2NS26ZvO{ z96;WTaZ=m1gAc#=ihz_w0}id3b9at1Sdc6UI|vCknPx|#K+AX}X2PmhLlbLbD)gOe zO#KU>7At4I$(sY;K9qmW;|_>vf_XY&#TRY%E)FCBc5q$}Tj$#$aam9@hL1a=m+|kq zKlr?9Crq3W%Fas2w>$Y_+Gobr5;mY&FGC+NrcNIBJ`NJtWxW4>TFyX#+b^&RLmQ{W zGg$MEwxZ%oj)8^qc$a?c=70o@>?FHxu;~$;|EV5;s45(DlzD<|{d^DJ9nB`%%E2$1 zH}hzzUr*@}W@cORXJ{iXRE*uEf2t72fHf}MYJ>NBoW0o#V{rrIN8>QINrAUT9CDC2 zQ}ldnGp&VPoms*S!M67X2f@uydw6EQFPnX34m$iHw!|M7dN&`XVl^7vlwZv&*oN|XVUZKwa{*E@3<6-dBWEZ>u^vqI#o{VU~kJGMaBCUYvNi# z>nYuQ&Qp>&YzPT=9$0aF9wvv}%H7yScc2!^Q~(x-Xiw6jyxK41W9C4L);d=CpNg(u>QyOK!1|2Nn>%fEO31)%I<55sFBuO49|xhZ775-qn>nlmsg5{5){Y z$bBQ&dSx{H%cbV4{-uV@Gt%>)E|D>CGAmzNSk~SQ=%Vxr%ksr zn|kGt8O=t&wbrPUj%tk?q?=c!sqd=ZXX`n<&b-EBk<23vKZ$**gfw7_dnQ+bM%|h4 z+Jo3sRt;Tj5-3$%Hk9}{!hgp=Bo7-BvP?%IPWl*&pamYB9asw>ckUc{+7Z!X`_!rG z95lgOab~-EXbE@@(E~E*u3Z!Op3{sVvD=;9{iraKlH*;dOoBHug?&Q3jtf^7r-vU!_pb+2VPo(3za zeXp=;_NIaztkv0EaO0^r531eNuQ+d&tMIAjU%NNhC593|eahxU)=6{n5*J7Rh zc0?#KW6OS%X*3qdABAp$BwUWXx4;r+iOz`1j3@O6RV{|p&;Ouu?{w;xk(FVPPg=}y^CcEc4G`e*ZRvzo5pmGzNxqLNNr5qp&I*2$=bt$Ar!Mzk_y{~5_^B$^}{ z-eA{;=`PnXplJMU^v`X-+rSJyzRPD5a+>3vFwXFG#2b$n)%sZk0_Nm($+(SwBf#|% zR?skspd~1aFlfa<>?yo}LTAV{ zUj8DP{3w>oMiSf<476*&wxW>w*K3FkZw0I_$9!Xt&3Fqb5B3klW)jw8{P^wo8m8Wv zRLR>H-ODC0Ijx+b@g)ry@a!~EZph`_B;1<1aeFBONc97p`$P7d@17tY3q|TWbg+aN zRA3-e_DlC)*tdpE8If!$KCSi?0k0e<_4e3(WbS> 
z{DzJ7Y@bWe-rL!L*_&Ja{$SsgSe^eJ!-WS#_5$-$ZBT{Y{p?G~ISG(CC2_$Comw?zx4w*rpT`wvea!l`HuM6Z0VTei;4 z*OSPH}QdZB<#kQir{c`e9`q$*XX zIcAN`C8gITVSC}JW(`eW?)+?fWVf7%DV)eUw&9ANDAm_oh(gekd@2{_{Hnh`E*2jkVe%^m66w zuFG^bg#B#dEU&Ipq#laZ(SLpUHGmm4n3{W*5*bYLW`q8f`0x@nL5uUY`#${o$=7zfP)+O6r*Ca^7IHUhI&K(53ca2+R11Z0*Eo!Fq12P(QQsROAz$4!ANehCp)O5z zbzU3wt6IgofgWj`(8#Hjn7%0{+BkcJ23P2$iA;J&l^W>XdkCYzy~T zt@&2pp~p$;frzDnE0ZsM<#@AU9M_FKr?^PC+BNe{)a~lRLZ9jeq|Gc^=O8})+)e+NQqooRW?6o5wZI4`s(%Ru0fthj!j581Mz5V2sb(E~HnXzpOWvSQbT0dB& zyPa>-a@Vs@==_mm;P>lS6VG0Xn&V#3qq{~b)E@Lg7IWA2#tif_`5_@{=r*W+6*Jki z#>2$JSj;f$!fF}#u5v8+VQ!xcsIq2xppzey8anc3CzK4s$fn{MhCGt>GaebhJN=Z(otQtP9$-Rt8ZjyKE~n`-+7RrL3e4$XVv#v*8SCw zoewc%og;-L`7l-@R#jRWRr~WPy{i3e*+>jJ)`5M4S1z4wM?C5Tug8!3m0&`?i@)RG ztwhXw~6@AgxQ2vrt?B|n1 ziOV==IL8I$bMrcJ>`4p~ck7b0p32r1KE&ibwo)}cSt%5vg`{w~*m3P(q_nE`g~#l0 z2iD|Kr>D_wVCiwUi(wU#k4~)?qbmd=)J`SWrD&{-;&vGTUQJR{=Ln^#{8^5(0>hV` z?{rmxETk9)(he%5R#1jPEYJ)hkW4L=-O5M!G_NhBgd+h`D7?AClS76atw)E(=@9npqLy zr29sAd~wQ11WhUBHk79_6@8-N8s|Gl>wh&G!dRcIoZ<3hfp6-Q2+eTU5PeY8_6abL zDDQ3&vlkVM`!@U6aG8`RlCsQqO@6-|TX-I*?M)h>pKG}`Nc-TFB-C3~er?2q@5Yi_ z^Sf##TwlfKD;;ctMs=R7>!D)BtvUF2)AZ_$Z}vi|b=Nf7D#v7Q1EV%qs@pmR>6!Hv z+|yzw>xn1Zg;gw(eUtEOAG4^n6VDqAf0vL%rPw?1JUlucVg(&}LFEb37QEsCn|liT z{GGb{`0?Vns4s;RlCkECz;ONf^#;dDS`3e2qt*BlZyO~?ss5#@!VG-(L~V(~Dl5zd zJE`w+!Is}R_FHo#E>jvMTM)NA`#Lp8(D#-ymjs5B75qHNRrqs`1+i&5uiLlTu>F?} zK~wnR8?+pIrT(v~wkyBpR3L9@Im22m8x%(iM7eOBAe!Hg#{7m9!l7aom*Tfw%H_`O zgIhdc4@OrmPsr-`Yq~m-dtq#lG9g87Kaj>J#=G0s?$Xd=qh+WR!38T7QaF!76O_5#+*wS|MIwA${3HdkrSME9{DA(f@FQIYM0Z2@-DQl#-dJ}8% z59fCwR$E34#2Mfe((cl-ioS~X4p-_X85U#v;ZfTaG4DLS-F-vY8u4kBx;AI9^u2%k zmCf35dcjqT7qokv_T$Ry4lv2c`QbDGjUq#-oAD-5&@;>}fF_=m0`>dgeN&LREm}gCRw?kn0 zPF10h_`RoJ6jmSfiL8%M#EN;=iHn~Q<n{}gd~0js?&n$9*ZZ-o4DUiSrG9i>OH9|pULta*QOQ&!uiZhNz4ll$G$6E zizL|znNi*^mkS$EU-4fq?Y|l+6lwJv-;MX8dEv&nqx)OE3!_x0ENzN+dX^9iqHFWs zN^tzMw#c!^@3#$vi1ofx$9sFULfak{R zd|x^G+QjG3d*;_xj&9cUtERNK#;x@1WfS5(aG>9>bDD~O8BCCQan3=p4Nd#tELcwT 
zF{Q#Qhg*9SP+tEXFhQ4?5D6|sgGy6wicimqN?1}lc&fC>=qoCg| zr*OCC&Q}$3-t+LZ=lY!C$wT#|y-O@ce>R%=5U0IQ{zEA2*+~*Kkr|1?5v=0YZZ0V? zdly;yZxX-t6K(d2Nt5+?{+0}E}v2(i%PXmA;2agbw*=v?CoMLu?t7(3Fyt1 z*OE{Q*X&J>ogB8lrKWwg(J|kT5$t#m&aBo^D(t3kc|z(GFUAprd7-#?G0*m%FG824 z9`GMANL{HQb^hx*&4rTQ#UtjzHLw|e`J}l$b)38Zr8HTQulaDn2B=1NfRN_3KyqA| z4Bh-y*A~C8giW~fAGQ~&Y=$y2*92?070_m&P4|>mLKD#7d_o;9n=okRks^NL3H6VN zWTN!jLj2UJ=zKKnS)0yr&&`K}$^QBhI!30l7{mv>OBdC`}hAe4|Ouws=0>y7=6oH=bsPq=0wN03Neh|yD&Uj8e)CGz0rSe(pwhS1Zxp7Kc`XIC6m6`xHLa=Z@>;W&M@(OM zKckI2rSZ~!>v@U%!)FDZT?1W~Kn9US!er~;2kN!F*)_bJaKTOGsk_#gmjTJ-afoh* z%uAt_6uhxtVWRjR7Aul6xbv!eU2)lYPq&172<{SC;#`P#QV-A6nU{cT%Skp1Bz>#2 z3K~{cFw#WLzG`*fNZ(g^H3cQ`+$u~Vv~x+`SYau_!h*~Zzp#P!p#zO2g1!FGhhRU( z&95VhrjMWOiCo5u-(dQRo6+FkNu)L?oOV*-8KyfulzHbmm_m9W@*{s;Oui492EHtzB`rrafm=FL!q#PhIbS>v8=HY(Zp8FFE zm3~ur3P~V_$uei7ySIl3@+7tO8UgT&&(HV8goxia0^1Jgp~m0pGY*2(T!wAJ}d zQ|KJp2I~X}x%s8=rlcw-q0wB#>S=Yd=koX3yTfr8?c^ znD(~mN}tx;d&Q1*RS8z&?*`L#`0-zaPQr=Tv==**3XejxuFU3Z#OiCjT290(C8E{q zRwf3wKI65zxs@;_VCjJw7HFXP@H-6nGheYJaD1bl^QHAGbI);7flO#^m7AD=9fovt zt%)@CQLnW7)CQl%SdhrjjYZ=GfiR7pY>lvwZF%%9yzv&EnpEa5ZCR0zs$n*3LK->= zKsxf#hS@dg?RunEMMZ1FaPRj1O&oiT&wjRT(PS;ONAKi~x2bOcd-PvZrt{d4;Q_{> z6@nO`$nrA7MWcX#T&Q|TaJpKm*w=ZuL;W%Nmi>n}4GoRredvq-xWK?bh$><3aiZ}a-B)sGmdmo@r?1}0Vk6si` zPi1(Egzg!Ai?_Bw>FSJsW#u`uunm{7E6;vrr}}tUg{&Ze%sW8yLn~t=A5ee^oc^*r z`+I_iA`A5?Wp`mqTzDtXeM-FKMl^DV>#C2HJ|&a05e>l$miK3Kh5ZGAUBaM`%XhdCf_I;jht8W)tc?{mknuN&iAiA?gfZSzN;6!-)tKW>Sx$G@oL8AZ^* z|NOqT^|uXMQGW@`_32mS!esLfo=x-j7N+FGmsv!lW4mF`I(EGpBB8W}*EUVJw{P1& zkL#Wnr*h0UlVb+mSE1`e#E~Q7A18X1)b=u}~;jMR63H_E?zLUQOgNJRH7SL|>}f%|aA- z%$Fs3OIlWWOWPo#!ZT&ieG8G^dTiRYpRMvt!1Z-=Ki8w3Qq|MaVfI?n(GJDo6^i(s zwOfn|=Y&TUXYnSk2B8|XiZ&}1BE0rMDrTHM7iQ8!8N^`)>^v}~cbuc?M-ijO3~!sXV9H{~riGrjf;oYCaf*XZ^> z@##|?)kHIJ?dI}e=A#GcSQMrdIeNce&5_*dN#MyL2kk_$ZZU3BZVzj$j(Je#AR^hk zd95#}CVv^cfyOyF@D8`+76U{$DEMxl)vm$WuJoINmf6yu*E#p1SoOlT-83?YW0(&G zZ_S@6Yf?Hd{s7-t%=>EzNWUeR6T6hww5B*VW4G{th1FGVaPl6?Xn;^W*pd~b@o7!` z@)eZpElF-GcqOgNYw 
z_Nhp7wb$8UfoNFU*_SBikNf>zDZ$=;=+@o3?}dps_HsP!j0N8{sy|jdQ6E>_gAPjf zkK>m}XL{`h8>hGAv3L!yjQx-|8rW(%h!aP1bQu+{*24B*46FGge+`R!PlmG!oG7P=hy z>>%R4M~Nl#FGr@o=u6+My>!u1-|D{VBwtUz!-KSXmVT;sv*1wnq%7cWGUu2I)< zyK#hfDendKd!6kspVQ$!Z%U2sQo`|y=7FK-LkG70>Ry# z;KAM9Ay|Om?(Q@McXtWy?(S~ErEzz6cl$c;x7OZkopXLNct-aa{mgk+)m>N3*@q=N z@+XB~J)Qv@pL<&lfscyHav`Y8kE+shMHo#e+C^skN|dCAcgN}0SMVlAD=8{dMY51* zuTH%lQcv#He2M_REO}^7n#26}QF5rfizIAgV0#l+{vCEo3WK0->`z$aaGm)pxTPph z#X(!MDd=0#4^+e>=U3u1>p?G$MN*bLHb*QG&F_{Tc9eYNl>lxVO{4m#btP67q=Mkn z)KT5V5GM0b`XN0TSBSn*U40n{_ zCB1R$$CzP&p)G&&<3Q8?v{9(;1}xJ z4=y8O*qWUrU4k1mLVJ&v!{Wit%$r9S;K%IW0AtnVGkyPlL@ zC=S-&WFMDkhx(*5{R^f6I{nbuj zpT3J=i)keDDYytjE5Tp4PLU2om+)y55<8#9b9;%bMEW4#Nq(Rj1izCbI%(WZN?WEo z|5~l6rU+=f#pkl3$kxALibL*@PR$y3uEM2Hb(M5P)rt!!*X69v^t^1Jf#S(;h)=1s zK@xqtxP;`8l~@5boYf&|uu@$~JF|0PtK4)b6o`t_*3gHBIDbhwa?y@O2QOP^c!jDx z^@(<9G{zD`X+<&b;)t!$Krwg|mv#}dqAhu%xpZmgcOe#@?7f{XQ{}rU;b#I;KoX6*PcgAJ3JdWsOw)=%B`o#9{J3R-8=LK}=8{MD!$&*SuKf&vund$PdgNJJS~=D& zOSwb-bK)AJn$dxEdnRHL)i^>%DDK*2K|g6NH~G7qN%-wevSd{h zT0mCI$0z84|CUVx2pK9!uw};Rd;LPwX&(b~P_E!C`o+3X=Y3C0{-eR)2_kImb5>Yg z$Wc>?5o7!Rnf3qkf{_HffA;(Ll4Jn92xyTOCsW=0iYV zAFh8oIlf?lK63)~zX5V?zr@_$X!KE9%|;>3)|ln4tM8meaTI#0Vtu73c|FwcCdmrVDsYEijwp=fe)?54zehFFGs3 z$8$y^ELB2?GQ-&SkQPPYfSP=$uNd8S?9Y37ocxhHa9`lUPmsO740?mDc-IZSR1KYG z=o2dp7-UQl3TP0z>kbMBY)8)Hr?4b^eHC|$YhKXZmyw_JjSmlX&-Z4Eb>70bwIblg z444QuEZX@^13*R zJ1cp22hMt7aG9u!qB`~;LR^a%fML{<+R?t3j2`KIfMpE(enh_oktAn`Jj9`!7?NKi znNV{~>!Zq7_0A(v+HDiINPpf0=6(&wS=-O91+1?1tmR}cWNHE;`)a_hX|VevFJod1#E{zqf* zpSXL2!pO!7ecA%OtE1UO1f#vHCIX{SBiNP5Z zf)Oy=Jj^uv5$Qf<^D#5?Py2^pN$}Ln*GqM=?wu>z9>;V`1@AljgCxxkk&JaEi12@# z^LogVU%>iSh)}AZ{S{?yed7s*?_uQ3`Q%KeaT!DGq+LdANZ+%R-nHADBLUxMrcy*d z0Bi(~L1v^IaPL1u<|)uAZ(R?asZazqVeB_6D$@MEz!i1sR9F9ce+(>27j@s9w6Sz& z5SfFeV1gd`!Q|lMpY0#I_gdefbI9+;9Q$V-E@Rqx<_X)NH8Iee64e9h%r12B=;yNqi3nZRJ*Y2mVQ*i;!(0@GG$@5f zH;Kr$kIupqJTRk*P*E_UpRn$B-YKIf&9fE`&`T;;Ow;}a_rl}@z!IhfqT8y_DW?hmru|u{=hRI`{Oxn-A 
zbAW^={pzji2QtpTuuTl$ED7)T_Yv`1=WkCv(y+1CTA%O}XissocTP?4nCp5w>9iGp zHt3LDj)XT@()#nqIkiYbdr!XG2V45 zyoD~V;vn7Qef+uVeb41X!9oa`A6G*8D_G@6ox~yI&^p+~I@8+#U)>ff=;>`_Mu`93 z1^>Azp7}njU?y=>Q#qMR-|hBa5c8PGrRxQso@8+L>&JUhdQrAQZ%kSC!~)!ugN$p|8p(>{OMTME>ms>?(v%AC0Ki zl9<17pBmylbG4t?M2j!`I6VDsfZA0q=VWhcKR;=~Xh_;mTJYp?aa-Hc@4F+clzP84 ze%V)Ts}>}s-$Zh;8#!syC!0kk~=KDU7pK4);mxgZCoB;*NM z+?YSc!y+$AH+ovasD7z4K6g?!%ay5y;;NOLC_A@-LpRz5T_aifA?K9fqzNvcM$)it z+V&buqbrN~EN&yPf}C}K;i3Ws=2jEJy{c{%993+Z&nh`d;tf@BQ(H!uLA5?pNfS>5 zxXr_qz&l~b`p=tzL=t~*_v3ehos6y14DNOA z^?Zer?{0b0wD>ZDviXq{(@?9Ye`R8v44Ad;;{ROF*K54wFM?Al$tdCw7gAHt0gz%Y zNIO+43X1FMly3C=3{hmj$_b&mtV45{IN$pXuBVTcX&6*4sD&caeL?sx~9XQRP`UYE}NMU+Cyp7sf()VDrXK~JXF?k(E zuyZd>aL}Ln*tUb_k{C+D^TmQb)}9r@7TRK4W}poLERkBvw8d%XLP);nzbf)D@`;(3 z)|N%3NaJW5tqDA{r^em}1$hj`H;kZ@o&HM?-cZ0M=@l8S3$Dk$nL(8#Pv*%~guCD+ zZm}GRU@GEuDZqBEj@L;x5v#$Hz?c0`DYA{ii2h%n^RHkFv_(ed9+$#uyt@eHyebmb z6|lWN5yfws;@=wV#fp6g%;6NVmykvkdKQ)tcr$xB+8rEvN zr#$RUK$7ii#|qeyb|DBOO;OTa^%R^6Qq3Z`%QqcFP|B(Rg$y;yizw7sq+CNYkXeXRYc&86|~87)l!c7 ze}l0beN5iFts{-v+*u7*6)O{^JJ8ADsDrJvZE@IItTnTUW`2C*^*yT!d7{$S=s9(j zC=j`j82CnNy)f;O2tf zhPMZ}G7{1e-qHlF-qW!Yp3up38m<5m&-2&zniig)-ak3Px*l2=E_yxRQW`UAV~0wW z-l#zme3fjV6qcJ28IUR4IMk(#Hzc7sl+}4!{WGALKSNjHL1gyA+qL43_^*tS-QwML z3?>`nFjWm(-dTaV-JXc;2WUBW`HCUiccH&E6*!0-8jYCPKtIwOLsRa(;_rkg&{H28 zUSe-(pifzueNx~Kzud7nRn)fy&ZMgxgL^ zdKp`b!)H8nc>TH5ib#6G0Lb$Aru8E2sQ;2_63rBHty0ecy8X`7RB%bdoiw-$kAgE? 
ztr8XRBCpClf0L6Z5s2EijLs=%;dZSb9BCEE=PqoJmV2#&PgBszvk-}BA**W8iM(4N zpV&(;ozNsa#v4Go$?l~=I#}b9a32xsG#0t0K2|B_LQty`Oe6yR(E&K$3Acps_xDR^!rY@n zYSJDiO~h0Fya1kl_gbxKO~oT!;?zzZ2pM#0LA|!r}y}5k@Z1 zT2LAC)E>uU4_0ZI+<2uSel|R4AYL9ac_bP%$cxDJ9TixTCd!R0@v8n(Ym@7BUH9xy zST>JYil-)X%WeILT~q};YG#q@gF&>|3!UpA2=?d%vo)(OG;)@7O3 z4L#DZJeX(z<$05GY;{|c5TCcr%(7sPn3xH}LnXw(le!)!8_#+eV6R9eL(W{jlG_Vs zK@F}pHQKn{fcjSw#H*Ab+UPtI!5`|(LD3_rN!H0cn1SB~*wtiHDv|02W-o}#Ch`Dn zY8@UT3Khtis%MZz4&hek-Jq^B=(>Erymoxh+`N%;?uz$eceOB_yAbnX_Q=f2 zPUfAx6BGY#Sa`=R^vw|c$z!b)%LbEx;|citNj)#+C3C`F6$^=O zz)5%-2qk4L`$IM$2MZ}DC2Q(% z5HL?hj19_cj8_#aS@Xx8az{Ade>HOU41nCSTDjM{5cgr8pP;GgBJy-&xgM7ww0`;2 zF#`t2Pte{q3erJao4xDd3>cfJmWsQ;!wmrl+amj&gOo@8*6ZK2K&>{W-*;bJ_W)IT zV;+PHdQFX_O;i-(ch2EHCU)C#lW3kc&Uqb^?^I+fsAXFEN^n}_d+H@%48XWM9f|i znG=3g-)4MEO}|Io^3ShTweKb*LTI2Vk`QmCb?KPWO6kGv#Uqs2);FQE-B3P4-&z}- z?X@THOVJj8kI|qQpy$_8Ue1_O2{9Y7wJYklA;(-UbR6IKlXv+0H-(L~Kcp9XD)VJ- z{t2y02kQExIu-_nE5u-c!7hIhyy2iOgaZfgfdQ^v%^-H^*+!*KIIUUNv$2wa5h5aw zGsicB8=^9$GJd}2UdXN}$e0zEFKzU^?n?5vr4-t(&1)aIQRj)yj~($p4}i7qtuJ5H zDrN~u3?OZ3??q=Xjg_)bT17{_A{JD6n8|1Q{n13d5&l~F^FDFa3_ zpl!q1xBIdR5ZAEnW|D@!`Nm)R>1O%;fiIe!r0(#ZUqR*BF|?AuXBfS16eTf78lA^w zq$a!#IlptPy;pKjygObNqJ8$f_Gp(|9-I{Ge?_}gt-VC@-PjX-<upa=IYEwHNq#yELhah|6qiV;L_$S9 zd-FwQv>B6LTb{!5M24gL9(NT5%{VOB@nKgUFZEx>&L%8apAqt*Is`wCoa&Yi_Ku0fy@EQ0V&^w-rwksm8T@4t_JkSbN#OtO68ZZ2`bOHvUm)1#?MPRk0{Kr6p`9YO+4@O+Y7yB7zh&E1=%%LVQxKWPMOs9M8U$S*l&Z z1z&Xo`I_d$wmR??_Joz~$J<+BW5%_<&En=_2KpQnD3ST;tqj@5EwXLo7sPnO07^YR zPUkr*D`Yz>(5O`ot&Fqj8V{87Y5ZC7(=&$7qm`Jhun~$lM6H|54=bfcE+eU|j0n;w zk6sM&Xig^KV-IwPlBX9kdp3^b?jS^LK^42~DH$0zwV?&E@g>$cc{p-!#lO7(OwV9R zPEfLSXWO9eYe6$5gTh2iNQlBF6bL%V4{EL1!MZ4xY)vVaYDGfF7sV6= z)E^^SuuUf82x6lzU5;{LT&50Zwvk`t-)3$?NY+%~>EKI!C3^e*5D2h6U(aei zXZUuZsG0AZoRDLCEu6;{Roh=VS)s-6dP_+YEiujb_-(NK>_ve$3dNXJL*a#$X#+kJ zS_Vb>y#3pY6dQw?IvuyH^aY{49PO3Y!k0jc@dspQnRmFfc6BpBgWGx*I2}w*?h2BU8 z=YvpK?T!+Fp|-}3d;s?B$7y>`mY1LNGy0IOzx+n%Dkul;s-4MZR|E<*dr7=XU*Z6I 
zJ)FzC3DWr7NCAc6hOVSL7Wn-wA?`^$yur>@0%+m9ajFVg*Nc^vp>~-da1~BliBKY> zgt=+%W+Cc?Wo;j4y&}_7&i9nx1&2fMdy{yhHQ#mu7TU2)QGI4U%M~}&Q!pn`JGR3g zk_(q(X}vL88|Fw}hChn{L?DAG30>50cLxE&aigNmN6;5~it?g7n`k^aP*JY=YG=JM z$qY&*GB`xoECm|2!#!Nn=k398IlAl#&!#55P~RGMZ~xd}n8j-Rmzujt*gH42Xs+&tSw>BP0T$UOYcqZ@F*h+CG!NKZOV^g4k?H(#uAXOiI%0 zJ8htJ6y$_+T9fWr zZ>RR{CobZq?P^~UVW?wqi*vlHZpUm@$ADmCwfkME>IyPYhlttzfNgWtxEE@~!fX0R z?N2)H?B+4fa6Y*QDA9Fb&Kk9tk#N(scOo-VKPb(deqsXrM-T3=XN_$K9Rjp2`4-7$ zbQYvqGBku;B@V21(1_+PxYZg)z5W+F`R`Qnzl0GefWoNp*M>AF)zjRy5~aJinh0~lgWCuyy|pI zczkEyxQxB08cc2A!1*!`yYNDLS|Mj{$K3q<-8+7K2RC;E%Rm8C1D*;1Rgqhx=SPc3 zP!|udZx@syebj-<%6_;Ff&=5+y_P*1IeCiw$nuVQ-rl|&>f_%X6E5{fHKMTs1939E zuP8YPxcKeNr!$=3JP5kzR1ZE=BZA7HJJ~vsY>2w@YRMu4zAfIZo1xcLGuK9wh>1JU zSWZGW;B6Q2akz{aZZClv6#Hl3nsty z-QMobf<$5-t;S!=;ZEtM_b_ZZ&me^^O(&wjn1ufux(6Sc1Gh1BtaBX%Pr^L~?D7>W z4XcNpX-X?8&ls203R=h$+pg3LFKLx(i0)8D@!W#|V?S4x+LMJ6a^XmG{VX))5Kvw< z1bm`Wlu6(-|Ct(u+elPsXX;1j!Ng+3#WO|O!@x#p!QdYy^2K^)b~w%QuApkFt;IMy z#Bo7yia5<{Ev+=bJk(8hXHUj8p#MT0lrl;y*b9LKm z(z@D>5dIv0&hBT4w+=yBk2WL*K|P_sB8eO%{2I&Z^-!qXh^-|7?mV!oqndlI>&2V{fqi7VgUNV^;ve>ISny*`qV?I6Ix=eY z;hfFT$C>tWK{+00*I4U_czeH}s$f9iCQ%K0GgfFii7*oRx8uzZtEJ)VY7D)SEN+L*|B z&-ZSsL_ijON~nu24@()r-S%P-{5?OW1cd$qFkN5^IUw5J!4<2=unbB}duk1=)5V2T zC`9T7!?Z?3_j^p)WIQ5aX5G3vj0#-T#4wB^Xo4o0mdwTWfCbv_R^ zn5jWGc06bdkadmTzJ$)gISc{dh~CF#(d&AWOLk@(R92G%ci&#vF*}isz0m3};6{b< z*)EcNshzam7gss^DDs;fWPX0W*ykHZ3VfPII;UF)QgQar-(+kh=eZo1$oE#U4_vy4 zAzjnqefF$8ClQ4d#Ydn4O^1voi!I5C?uWo%=NmuZe8hBKf8FYP4Yg`v@+o+gjHVa|5+&DSiMweGt zUGx;BnQ8o$5+Aa91PHFXtdn%{ILHdcf*Xd9P&x?Uqso)9Kej~{a=YF~zlQ&%64A$zjmo{VPLN&od7w0-E6`4H zu)oW}Ju=~8ILx75`tJ8ufJ09nJKXh7@pARF)Sx#jnx}G7Oa%Kr?#H*w;>IxlI{#M_ zl1j?L4U$*dTLm|+Mr|#QF_4JyWEyxi4(-KUIlyzR75DBuSc|$8(R6!uUgD?5zRMU5 z-kDPJiSo{MbDxJ#Gre*my;7KYpLMHj`I)6F>3+Lda86K3AMq6HHiKp)0?_Il>fALb z4&D+<3J$&yp&XGBd4MXnIA;TGaWTV0$61A)4KMfQo44~qsXWt>{5kUXxk&PN5Qc$t z9t;qBciitQ%a7qkEXqn4a2fpzkE;3egddN`G#m!=uD_rsq)va<7Uh5FS*_ z?oIe%&a+y-xI9K{SnmY2oobm_7H)NxZ&9dXydaGIch3FmWMljF|oI~}O}OpYGr 
zXT~4e)Rn(upqSbyhb>7PXvWn$odunxICvl`=(^$j-7z0-D^7I4VkYoAE`xflY}&qz zjO+O+y0P(5un}QxO?GsRD4NaRCk2jdz&h48R^=VUxZ6 z^a8W@cko8)#XvP(ygUE(@Qc!n+ao4E>oy+pVY6UJJ=mTPicK;bb$A8QC~Nr|IpF0z zuMIZ&6&wRBujzmK5q|-wNy5)Es)av~#spU7OMW$h+>=1Nto`S$ue%!RJVR@$aNX%+ z!v@z@CAkc?snTdoV|j0JSCrC8P~R+X)|`&_xmXQntWuj?+4-ip(=e?w#f(WT&O+Sw z9)>5)W(QiFex7JOFBhu+{OoBH^s!XP^%KUE-jBcN>-c!MX-BRtcc4{@*h`*G{Ew=O zz7KdnYUoZ&+QD>9>HE>pDT8ajgb062njGSSDWZ0u;p6KTXC44EqiULcT@1?4w!1<> z@PPP`R9E!ln<78~;P-*^JR-{(th$-*brR3zggQN=gLlSoP(yjUc7eN75Hfbu{ zdCY)~R{7<8|2|W6@d}jb~`9(%kfkr_X1l ze3-6z^4-yt_xccK zStIXAGP4sN>Hb{ftEk)wa)W0{n$l9^;B$#mYRK5GwR^Nh2|agLOY31_;c@O0h&ldP z0I?NoAP)EnHUCY@jQ$#G%MiZ{Pb4G?@$Hsh+gnj4dm(0MYg{={jBZX$z`T_HRa;4U zi)CC{?4fBaLk)8k2e+r+5#6?2Vm*bsuTZTMeOohh7fIM7@{fm^qjL_FTZB!D=HrPZ z&dGt?FU%{XtsR1!+g}C_BNNVjW^1?u^ZWQOFc}cWn~>TU3kANIWLst8M$uz7shW>{ ziN~Q2>|hIprMZJ?Wno^B-ucKSE}MXhL8L24zGxBg!gLFkQu)>LkpMY=1uqqry8YW5JQNCD9R&4sRw zR}Wf+KS&Q9{2|uUg(Xgp)Q!R~Z2IlyC8?Z6EpFR0`nt+L=d{?a;AybAeG;Zf|9tq& zDV1hkQKjUl5vvSzMD_Y?pb@6Q0-LTiOT*+WDyg{<-hFaRpB1psz(r?Yq&W{QdaO{sf0>Bz|>y-sq92?%LTDObnJOl#Om5d7GXcr z>l@N_H;1Mh5V7-nOcDt*193Merqo;YC2Gw7Qs?r(OG|q;;Xr4uguk)9pEb+vx~H%A4y+ryJWs^AXh<^?!Zj9Ll^|Z*PcpH zK2aReBcmBf3sCLWWI$qFWD{rB+e{$pdYpenRn!#?QG$;b@5MW5h6j(yj=&0rpWwzT zErT<_C&8b=@)m}nu-E!LS!dQXNB=`#BWe$tgBhcn1)m#SgMGeUeZ!_7@Ehd4JSqzRT4 z*Y_nL-oAl`3%iSdo*A~_ojEP0(K@%Yh^S+;{|fKZ53x^&8$fzZd%yoUqH;8 zWa82Fxaa+j5_BrZM>Z+i6%(kR%^f@a7>FPeTB|f3c0K!;J-&}%^Bm0H<68``6C2K=)O@ zv69nmYL=w>c==lCA0t6h`Zz$n4G8;uceW#`h~hcaZ2_fd{lp&YJf$Gh1Z<- z#`3Y0HZPP^%3aq};yw$ndHLFFV(eLG%8qI*C^DkwvRTbs1L>cU>(A0mlmaz9K^o6K zk!$;PL*(+qdQG@(C>Kl84D-CDGBEun;E>tCtr>vW5#9DmJ8~)Sv1^?nt-Hb-Oc;cC zviXNllt`1EElhS2d?OHefcQvM1H>nGLw}zAdM2=GE#kQ$xDI9i+DJ{EQ zo*@JAY{dI1_u#)hc6N-b_gX6XDFkBvGU~XPNe;_S7wx|)~Of6Cq zdEvxHKDDb2DKqxp#nZnaiRwY(?)of|3}@t;9}KT{^3}sR*v9BvF}pG*TPsc_#bdMt zd|~&yuEc3(Hs}?ky(CS}5D&>BKZGsl6jZawiN>-tX|6X(IiMSK`^dJUHObT)opPX3S{dd0C~*dVtL^oCVQDW8C>TaT zJ$wkvPbn4X^O-u|GWKpysXb2N1oew-yaJ%( 
zLSKp97ZyPgMCwvN$(UlAD!uPj!SOzT+()0rBR#%c2x|a^09xI@;)0t@EKXuCX`#g{ zYS2KJFr|ehg1tS2@dGeG8l5Jvl)o=}woDwa70{@##A;f*F8;}HR%^Pw(v{gyzGtvi zf1Q}HLarvFWMJ^;ysS{B!LJNB=BXyo(9(pii@bs%ZQshx%7SvZEfu6=$r`Y33p`Sx4n9rta6ywcA7F%q^!EE|Ypuv~V#&HB5F> ztf;=#2i7NpeVkijz&WD|<__;$NV>NKsNt`@l^%g!L)g=06ZI*Dz1o^!Ns;er;I2+S zs_x>)Ks%i>+5uy@8%4d4@nNDwN5B*HkXhDbfDT67D5Pm{5O}b(CaRW1K$SpgSSP8> zO{~B=Ooj;mkt%!H)%QZNBWG`l+R5)Kwa^R>`&-YO6F@A@QrHX2X)V&y8k~c3XX(>; zM#5uu7YEF>V5+y$ns(77S{7t%YR@uXo06J6KGdVe@fOabr>b{qYx z!5u90B#9Bvw_2=^@NjoPERhsqaNyJPUj)*KGp<*6eqOfJ>sZ>&zJ-<;z=S8UF#~<3 z3%BcSdA!H>+9@T%=F~er>_$LPe`-jWyxZJ|Kb^Pw&-fe(47VHl3n78RKu>YDa}gGb z+oslCfhFUui_0Sa5={fQM#B7dl8OqfXxcA1?!_-1#Dj3tF~%QqMW5l$D)Rgjr?Hoq z5;H=Ab=2_$BbX$vV3)MF7zvlt3mg6K4gnU6g}L>>+g-{RPP5|;I#cIbWDW*-8g zj$qrM)r4BGtmLrPB)@Mg@d3N6?#*737EtN5K@-zss)IlI;sF8$SC%&c__|~_@WV@# z7_o`gAKjr@bP?2Yp#)CG-$L||L9@}na$TUpcS0zVUBBT*+x#g9#nB~E^5mJS`JwgL zI`+r%pmi~eW~-y`w8i}!udv!ya7RXQ)|yd{>PeeXf_Cm{%y#n)(` zSnz#Sw(ihbRs*YKSi;>lRpg-_F^^8oQCU|)fx*ioCtQdK`({|BRWmDr8*qdFe7?Ul z3r*k*ieWok>NxokzVd%u6*l@}A7Jr6fUSxAVr$-Y`t$^`VKJH8%5K@2@}{n4ESUmp zumng&_mMDbmBvUSYZJY#qZ>YRgO_zpDl=w7hggUJ9xU#P>(gA7*$Uh;uj4PpC{)x% zozIFvuMIH`bpXUS1)NWX-$?ULi6#q}idSrNCdCn_c*vZfV1HO5lvGMIHOc4F5F0Wr zKR1ro9WKe%QIQ*-gZwG|p_?G5jvo^7GY1^?Cuj zE<`W#k##W{X=Oe;L^<)rqk=cl-Z{q?e;@rXj2>+Oc26I-wSp0;1S<9A{k9{TioQd` z2qy1~0W_`ln33Qr4B2belxaGc zxhG#Qzdrt4`)2e__}$fF)gnW-s*m!S8f=l};6CAr$1wz3>Tkp${Y>${_8T}l*eZkB z0<9gxo{>}fRVs7MGjqCJWh)VIiom?(Ot*+vs1F(b9G)*fXE2s_^h$d??^?@4mQGI3 zoBT1+HN`kcj!&PmZ_n%;sg&{!;lZ+%LCX!w-Q|rjwGGLDzvjAVjwo5P#lr!FAoXSH z9a_y_d(E+<4Kof4#aEsv59?{TQ(SN8}CTJ5EOkaJG*~I|osn;_d^)0oOVm;8( z>g*AJf>H~I-{So~t%X)qa<=eO>d$4AS{H&Tr%8y63T<@nnbpv;I|aVT4`m~im3F|wr`lkn)H}#Uk*Zd07lzxkGkuTk zhg`;Okl9LpiY3F>P;_!{Gwc9BDfQ4&ka>;g%moLtZuudp=!!Ea6qh+Bn#a7<1-wXA zO`=s`XdS?KJTCBRxXG}wSGgi&^O{1p&QcEL7Vz3aZ=|~@8Qw>O|Q{N&^UM( z`0*I{XG1t+4$e|T-5`(BQYkl77`+QRM~${6GC=K1{L_RJ3le~2s>5)z*7$gttFmL; zs2EDf-tqJP&<5=T-NKrTIF$j40JfM}G<@H3!}IZX_wpUx5ccv;kI)mSPXr4d@-9Zv 
z{LtF0Hd|B{E;VLuur*iVoz6fer1ZojVa$c{kLJfGdEm77NrJESmEr5VRP8(CFvl=l zC#7Wn9*Kln#@N5TfPAIUx46nKdcEiSSU7pUY+KgE$)k}ov+!0jk$(+O{%>FM^a*-# z_c1p~WRHL&VCL!p3imv+ObKZC>qP4<{JV`2eBIH(CLZL)Ve--b-x|pfr(eRKeKMEl zo=`kZ%?x+C0WY9c_q65+kZ`pje_j~R zLe7c7RZ(~3G<$N%=xWT6nLfzS7*@Mi9^em$j zJ+XTy{1McVf_f?*bY@ zYS=MEktI&i*m&_DI!wzw8zM3%u`Red6_G`(Dim}O@WKhvZfiac>k^uFHy9YZ)12Xn zKB)*Bw)l1owi#k26L0pg7&_Aghb<{ktCY?nYKP478*$$Fv_{-`) zHKZlI{w6$@PLOF=3pDHqIeYfpU2K1&%UrQ28P9V%@YWt(l^R_=wuIRz*)h&NIRe!N z+8WEOIXfUy<^62hJAFZe2sOHW(4a#y;r1=fMWgZT%*-yKaRO(E^A{G+qUcZ=4ssO+ z(|s%V9TS={HbxpD8BHsS{2(M0%^_{?2hiM-NC|JQAq`^jbx7FDSo%#?3~zvwOVBrPZIR=Fp^ z>}*BC$R0P3f$5dqVxj1%x+<1UwUUX@XoD9j9JCIjy?>8u{B}Cih*C2h%qp;!T%pgB z&x*W=yR2V^z^iR)T|%#j=l&{^C4`9Qc*bmvH*Rs0UN$Mx-!C9(yaU6w5i4m@X`RW- zAUdM7okb7Oo|D33qU6hyuCrd*!j%~0`(&1KGP(Sa!h_Nch={MSh*77P+|#P)D6>PG zn7ox+YR#id{_O z{{fEk!SlK9P+51a*|z%Kl0#Gq9gg6Y-m&eq^M&Z) zL@7AerYuxVd-p9M#LKVq@v~EvQkvt}b$_|^NDHPrSeQjl{t_mH)5Iy52&N47?olzvQ+V_ecJ)yH zD&KyKQJ>L!pT*sNBwo^b!fw?rx?wmT!E+e$0bUUY*tiDD%77dnpeaDgWDE;xHMLh! z2?uc`k=oi>6N+cc;`=xoU&0UUzXW!)huQa8o1xN4lCx#mteettxa!5#%$yzXXimL! zyH(ylBKMxh+(2tFIL3n=-uvgbV9hu9|LZ^!y|1J`fj$ZBsP+D{qz`I*p*d{dtjbSX zp;5lKA|?thCP0{HNF{QF=VpqORKYYM7Rff_s93J~bK@xSM#;u@{eNtog+o={x~~8{0{d}r@{&pqe<15;Tu#u)GW{+=h? zdzLP(!JWNhR74IZ<1atF~`rcD&mT;Uj%PbVyht5kCf zI*fBdTvit4;>42M(t1K={;}z~z;qnqJP<^6nMjbu2)pyNR{sb0@L$!be;2E;g>t$= zmkx4<)={n^;#1uU^5r*natt-3g-#lsTd*)1@71N>qpj>1A&InRhn9@L-;T2*ycn+? 
zIiq7RCi-}XnYkccP}vrxHtpab#`D8)bxpAtBXFUcefPkk7 zx$ZI3ax{-#qc9M4ZiYpmf&S-LMlqiX_P0~TTuS^Y{O?Ad`}~ zMakx9HZUHsTswHFTv+I^mDVWHmk1RI$$h{DQLK`#=Fkbm4JHCqxC|s}0t+*AXN(I?a_+1q?+IFm#QFd9sRMT4*_ilVf6n zEzwL;lVT{xMTpCQKGTCC&60I(rWNznUwg$^OE+|qFSC_s9tEE=bO&9~FTU-IU$_MK z=bbtSlSao}y8un8Jc+b)JD^5q_($oKWP8drDr{pR(WW-BJRemAXF{R+Y9HFRLJgPq zh=cV*+B~c4)??|`VuE(Ta*`o;4~vbgiizCW?LfevglnJIsTT_U;}bT8-ay&oJE7%= z??L^D00Xrd^OGG@WgZkwgs zgs=kkqyA2c_z%>--L~)uyR}o&s7sa@oj@^(7rLP@a(TagfHB1Z4e_Y!BB$%ksSs4} z%N#}+ikm3V{%p1@^Y}>&dRJ?Yn(blK<4=nl##xCAy&T{hnM}82vXXQuT0JEy>{fj{ z9o$tIp`V$<>2>}y;1lFEtsE_Rz;$-Keyx3Q84&EYJu$uq&5MN=;<~5EeTE-&!@_9E zcuJ9Q9lHGC4fDrV0FcXyv`f9c@ex1&+f1?d`{V09`|uxX1RZ)7ow8lu#bvRTYI58N z8gc95ew`jOBIB2WOJoSP3SqF?pWX~~)DLeKM;LD`w;nPXI9yJx3y#=0Y-32R3rc(H ze|JpB8oTI>w%=L}*cY{hw;zMv`29$Wn(bm|^=JW{i)4p1k*i%vnrXbcF67UXi3#n* z>q{NpV`8cy;V%^Fc%_BU1#I4mr>AQJl&(S2RYrU`Vg7->ymYzuD|KQ8CY)&o$f@_Y z6Y6*NY}qtghVBE;4|C6h^mLF)cz_p+y;ewg(SSn91N|Jn=)TO(MQ;!^irmg2h#k?1 z9d!T++{?}6b$lXKniEpR8~i7oEit$i_e80x(1 zV~DXFF8f^C#Y!AL|C}bQYA2;gwm9j;p8B^pF(ck-PH{kxdHMHpIxs*J0h3a3zTFh3 z@7?DN3gaQBHlV3TMlDi&DL&n~=YvhuY<6ac?Yz#;7rQdQ+*=H&BCY>7zUWUNkVEw@ z^##|Rl*>x!>+=nV7ml}xbatz%{DIM8J5xkUwDWacs_|^gL`9|REHOd4P7kkvt1gwy zQGffmqZuqqDHgdFHm(aeGnRCg>zmx&2awZX$!a1EaDY$)s!`XysBY`6?(Fm!gr%m0 zygGRn!k(z)#yM4;CQ=<&xPcA?W|_h&oHr>etcXW3(qK_R6`knBZK69wpcuuO96bu1 zP`HZ*w&|MxG5%)yzVIAnlca4h7bWh7NJN`a@<(*=U)~6u(1R~`He8b~%Ug6gyvCPw zG?w_X>0g>A-WmJe@!12l{a@A=KtH7vt95HtD3%JP0xncc-j|8hP+J$eF~ZIDRXyfe8ne!;K-Rq`MOKq6#rK>w}HTSI`@9Y~&7? 
zc-<-*{{nw#y1D-d*kg#Wu5m+rp_`A0 z+x4ELo&*SmMUCQ)TO3ZV@HwZ?>-eUuYM#2A4`vpg5vkh?Gu#hg`(r%&m#EXdL3=T@ zZz;PEREmTmQ@1b0w5nEvk8kijyNa$=eNj&AvNo2`XQB=z{V!|R2$v<9dGmFsf2yTrrKu^I`bYB|V9M$M%nspM89F$KpT)mi;$huLm zx}IbY6Mu%EC;UKIzi>$7vA{Dw!#Cw8{)0yHltypOq4&@G0L}wi!QGpW+i1zue~%N6 z%1f9)16sX}=t%dS86*6QPLM*B#*;E*2U0+UnW_6~SK!CYdMtyaZG%tf+{eEY(K=kb zag;AtQN;CAJ68&KkmMS6rn3*qNmFj^=8HKW3 z{AKCkhv8Ii;U}iEyKkG{`4+$U=%15vc5c;HHWOEW>RhRcTuGSj(m|R1l)juMzo)&8 znfZWheCK=hefIr(%um^H9ZB)VrI~1%Rl--y=<=?}xaT}^nJe#So0qn|39So}ce~VT zNvy0}g@w1>F59D#@vL$!bpe?rf%1{W05=ac7$Di;_Ix&ruSL=Y) z7@M6HR$*}%FE>y!=SV7)lJa{Ofg8RuRRjrefv_&mA;Op6JN>jU1RNvcpV4iC-#k+w zT9Ka|4W&eBYTs;KQY{`EJbW2@v`~<5BZeB;@V_k9HWFF(bKJaC5^*g9u3QFcES8*@sI&`vX=n|{}>7OR3C zA*FO>jJY!9GU%uZ=QbXm+36Cz8xEglK|tNR&F8bBvM&?KWK)nu|L(v$H6)a3ONaeR?lAjsO2$rr!n#Sir+i z!=Z3_j2+N6@vQ2CDur#LcAbBOAPc1wTh3Z5;1-z2e?&{gyI^|$@-m|KFv@G5N5fgH z?p4mN;MVi7RBSz4L{8-J8xFHm#~?h6Go~a0 zi4MyW{n-82_`yVhf@VmCep>H>H|LSjwIGs=B7mEC(*DI3;WetY<|)31pMkD{jx9$^ zccm2Kl|5NmB6X2V*7JcFqxETQi!3?iUP7B7qCfiRKtqFWdXRnDfhY;--FAeMp}WC0 zyOJSYCC6589=IB*ZobV?lJ{(6;h=C?VWxek*e!uqIEa$9ujdb*)|ykBP=N0?k-^{J zN8kajg6Wiz2dDybE2ddkelVDFQVRx8bd_rUxDKjY*<2GYHid$z^S+}=K zLzUwQ%aAaakq@?*--gJ>Bd2z53VIUv4nDfn#?x@{bZx_$J?(L4^XOjdn!^JX*+p)Z z(bWqY=s})8zBFN5UynQ0B6_rgxpZ5XJjr#t*z8mrQD5=ww5$aB)`oDr zhA8MS8nN=R4>;+nRB(*_{rao#gpp1^u;Q$i45NuSU@(<3=36jey6k_9LsU(slWdDU zPMW*S)f-ToY}>7@?2=oAC7OGG8~QR z%KEW+M&8kD>L~k5O!iib4s7>I>k2(`5gzZE6wvn_w`GTBD_Dn%jD!xAjuEl=q8@tx z`8_PbkIPwlF3>ovIESz;^6^^4#*)sPiAn6kw27w=%+feKh?5T%0nvIdSxEJOpQHil zuC%{Ar5_=z?OwDqVKwx2d5Z1X(_hS1Bg6g)LNk}Z?3yw z#OH6RP6i{mn4;1WUNimSU&?D2!?5;Y9V>tPqPUwyj_1M}czh3=%agYvI6m>Om6km} zE25LO(IL6WdxDmvvWUl~GxDKTZAuL>K$B#daIme*X!?d2e=btA5&-wY&Rm7K%`>?; ztvy{c1J13m&F$!n4YXnOawnvzXj6E1X z$KxgWD$sOphpiMX=TkqekXr(!r0`UT#Kh#+n7K=o>nDokcn_cC+FqhlQ3DXEAygaV z0b*snWOM8;_$2%U4FQ{%a0~IaunPe#%hy#`cgUR~^^>_TN@_%9^QK}5-o34je^H3ap$;mg-Oh5kd`XPxkN1@ zMMRY@WR`9WjN}Owdx%He%}?b<^QfSy>R!eYM`9|z71xCX)H`1pHK6XvsScEX+2ti6`}XN-C*K2xbJxy~LEmYu`z6EBnQ8gbXrn+!#K 
z=+=t?*pJo%AKx#nc>v0xkIP^C!CMpr%h-yOrK>eA=k2^N5t*Sdo7)*i>V7OyR`6>= z0ItOEQ9%Qq9q*eTq#v&W`S?=Ibjx3R*8{4Cn|Fk&Z{iS$wmXy!k3EU}o z=Y}$EK09Ih@Z+nLW=ycr(5D#|CG#)5hKlM+Q2_2}(lhsWObjb3^-lMdrV4u^ld|pO z4@r3gGifkbKdCBzF|Dj{e^-EGosr9j#}56wN^YST5&sGD5jUQhU`{s09(xtt204K~ zb352O#HuSQbUu=~@Fo7z{(RmC?K3G176pQBXco_{BsYbWnp0ayLyI$0AB>fpyr*nW z5_4zMu`+v`C86|3ApMyh*cE-ja_I8g!1oeh{@d#gnYFCG(8acZPN}KM94vkrpCDl= z#o_iu_!RRl64SEEi7-U#$*@qV0g=iKELxI+GN0icUIel=A>5&=)%1_=OJ&|}rPyQq zeRvlEFsKAqk-VXW6m5PD39H(9u8w+HL^SlV5Tvx}j49WByb}citvV5Mrq&x+)6pI0 z_xaT9v8n_t?Mh;UE`)~ zzu4;3q{rxl2GQV1JuMc;aRa8P{m&GMxjHmUPVI%nqglmLZ)CWu{ML7Iq>|Q2rBT}S z?LVEjGK?+F``Es9&!PIQiZTxZ16=a@*!&+DzFL;$2TQ`8e*3e_nayo2B z^MsR8-)GvVw$CvB$X$F~Wy}lD=$XD&XnC%mxGw6wmd2$jrilW52qMYILKLy;`)5$y z=b_o?#Vx0C?9?{iCvYL8QAA7V@Hx253Jiifu9iFHPaqihUl8*f0hk&X5jcQ3l|esO z31>q6DbZauSDAeqHxJ2=rMk~1?3+?;JFz0}6!M0Q47U;K{x2?;13^6PpFP|X9dphJ zw0r9YAAe}#p4d!VP1DENfAPxZ%E;iq71>PR2NHr#!fAB@L&Z8^PYm(vkEe7)ij30q zqFTI^Yu^paPVUC*4=0pa4(}8=t}`s+7{T)@w3{ zD0sP>-^w5ATzA=3Fm_pi$nk5OVa`elp32NHY;Te;pO8)kBsKagotmp?*dA(o0*Ld3^9>P?)GaR zN?K;Fb6~`_l46<&^=ne4g^y;;PPd4)io4|!rbA_q4-t_;5Gd|Rj; z&Qulgr5Wv9ee6nxFdPIE#62A zr#zUkj?9Yl>@%w(MC|lw1&~0vmFdem_v07JQn^S@|fC5ONQjIgp4WC*}Ac1zLaca2`-UEW8?24&ak&GJ@qE8TNQm3C~fO_ih zyfGul;D;lV@U@`PwEp`#&%wS)TJ_hY+=*l1;IN4AkMXe1`pxT6ohMsE^>HZT&JTx& zS}`__{d~|ZJbq7A+QP%S8`j%@h!e!RKIDe9Tc;1O>C|e`8(GvC7B>~n+d2=&ain&H z|Z*87aW%Mas!%(^%(dnXNjqCS!~I?Y1$2g=__JOcwp>rNErxE z$m{&@iIxEmH6hiV3P~i*_qA#SzrPa_wPbzoZ>aGdrdAldtmgR~*$K&4tM({bL4}!1 z?wJ-Jsd4jSFdB#^M7RPJr_jew4S*RD@mQ=0Cko}`*@L!t%~8(CqVnQbGb;7v zsc&F@NiwI;AF*h#IYC^FF=u+A70?R%r!F{|aJ)%A=eS z89=VW1~a#}C2$Vj7IBsbvlYde3vLiSQT&e1!uL?B<4eK?PTHNByoCkX9%kO?(3)K1 zv9I27wZ`G3^v17yuI#X)%{#{_9ske$0m36EF*5astqG3gLsOu1S<2ix7&=+W7eVjN zfSb5?{5X(rcc;C&*(~u@1Lw+jzy93z+}S||-fb$`b2LqmBr+!QD72H9sSCM^p=pcQ zRV{{2HLnOYpkCc}?{@>0Y7oYz8`Kl=*`tYVVNbWsdq7yiSExObyTmtyzFS9oXsySF z!#vNDsTNzJgW_}{;j-)EKPnlF{>-~`kGpI&U)5pOm5WiJ^BJb}5rtVi&&RZoTJkzw zbx5AEd@)vZFfv6JJ$RC=kzGud|2%g2uZlZmlE@XRe3plc5A?&8T0Kuf^kvm@y9E|O 
z1HvdND~`8dht=a380+tH!?~5b^SOq-UZf^dq)npRej7?c1)Vdw{Juh)SXk)qS29Z$ z5Uw3UGY-H(f3TQjb>T9j-Z03UHVYI_4YVBy=%qpiPLHn@aW&TU5WUg8cRx%LE$do= zKzODXXw&Iu=3L5KtqWXE1{$feV}Z<*Vk=9G2+N`#F{Nm^R~H|BWrL6$d%sJUV0^ zr#;El>gdmOEW&5phEw=}tc&^|5lTKehn z^MmR|FhO+g&nOLc6?6q$9#6 z)%8&Xlfm!&{8x$=#{M7HRi2FK1dlr-8(M^CiN^HcxHNtmlOTRr?XLIYcc?4Dk;+(U zXa#pf^<(eYa5(gX$cIX~^gFpJ#6Rr+NoHy&m{zOSsA4d|8U#c*c)XaxQaH$3O@AW= z4rfTMn>wX?9bb^uIB^voq#)O37;mn5hN?Kzzn6M-om$b4ub*B!^GJsT- z@(cWFLN#|4wfcJALr~4N%*V3nbCKzkG!;S%x=Y=qpFPmaP0Ur1Rx8{?NxGuDj%enu z-p}W8x_a*+FhTDYT-!zng({%#sWl*kOt(iKxU&9Zn8*>?9!SYgNyy3$F#);#c9Z`$@RS@X^w!{hKvO-i*>Y zav__m<`(v(2zYfMA{-?7G@w__Qy_I%@NSG}4nb5yJR@G*Zsz4fy1Il7*b2JVe0VF; zcoY)JbU}l>Q8)0>ck!uMQZ9Awo5rQ*&?@SpaD432Jb-uH)xcgHVV`XgD1GS@SehaK zkunxiVdv?^@*Nd1%F9rMr!c4Y{cbW_(!(usMdlv`!+dK^^q>mrvGqw>oEh<7y7T70 zSJnN+tvhAyvdecM<50`5!P3SOO>^1ytpD(`D-e3L@BVM^XhzH&MTL>3!q+>0^Ex-*#wJ zfxjVW{*)t(exb9jFW@m0dg!EmBb~-!0LvS>&7UBC7Y%6OFt0!ue1jy4w?Gf2Ru4z? zMl1=$`lI*nAuIetR;Xdn4B?=-W6>=75{nsM6vuZLyqS9C=iG!9XY-mSZoT9ikZ@bz z+DoOQdBF$Xj`S@WWpT_>kQc(W`-lw6RjO~^gi<0C3HX>MIxd_o0&6{6%CpoC1c~Q* z65bgNc0KBK7GB{t=#ze!_2GzNhB%Fl@+}8l-jje`Sf2LjQEr`${ZbnunPZtrqT-P& z*oF4yV_fN;{ACHS-LdHtvfof6YO5^xf*!0&J)oo#o$9qYtFVj=&Fn>s=j>F zL*VdI*DfCvp8<2lNMikd!2B9%Drn)GwvIHxs1%s|Yjru&v@~+}{ZYv+gsK?{auUmO zh<%mV93qz=Ev6-hym{yJz%ibHkPUnj=$$-DZ~Ar z-h^ypm|&xjIqsQdOB=1|?CIIx@YAjtBLW(Kx!W~V!|&Oxp+ZZ3^>AiPA^Q?tSY+0$ zmdP9p_7SgVg16zqPWIyrMH52BH9a#z%5NWg>hzaIgqQ;sT? 
z7NYat^mO#&UYQ>&M_6Xv$lE?SmRa_9hS~%hoB)f>%b>mfLl1>rA5&zUg*$&SqKzR3<o zyR=eFa`QKY$kb?C2XCm>JGy;A~;ea@OeJgoa` z7h>9MJ&rRlw(sAVz{Ls0yifXfq{Bya+@n9b6(4o-RYMNuc@y9W_>=<^n9nmg;ndX0 z0GzvbO|v!NtsvNNTbkK{P5BTr1ZZ}%0QF|L5U@VucUFzOeZlW8Mgpw{ZgY(cy)%wN zO1M8&tY`5v|936m81_)R3UF?#G$NkKDD9)xiIrb2_y74|2B^pLo2{#@oNdDH0N3Fo zAK>LgKN%8X>Y6Y!9G=Nu3VJ4q|*_6JrZU~@Vwwf#%n89hdUZZJ$J+4pL`fM*FJkmaW;me2uNh{su%JLf) z(E=+&k~JaBUlavc^H+RK8+iei1)cf@(z0hQh_b~R6t7;Tzr{VzCEV5`6B*5}sf&_u zcL{wU99zhmV_hEkCS`~DzysxBg6}vuCRXdDcf<_GxWwKgV^KKUP!D54bjLJCn?jn8 zxKpT+P%o_Z=N_XC8T4Zjut%deBPGSCDw(AoQu|8QO#5@{8CJy>L0$JvG^HO?UTxn} z4YvEQleM^A?i^dbs|T7L>3R&YLP&KW`s~xM6z_N(tofg+*W$b(a16E_Uu4MY01q;6 zw~sAJ)G~2Um?5%aDW&gwjz6wGk8Y+iZ<~^H+ch_(!tN;p&*d`t*$3**{m2E;oTUzM|@LX&)9o@Q*-5|E_+dD!-RBUHQ<|S-Lz> zR+rXQn797wO#>v6x~TX0srP0kt)YQFWUD^Ny+RmYn}EkH*Pr=C-6Gyk;@*EI?Of=F z>8oy`WzO6wIH*DbJ9AUTm=Ol&>vM9MN68U7$r$oiEFnTl1=z^MW9#k!d(*?%m-dDM zdw+itpYm~@PyS*SJ-|L`iP%}D+aak&D`zioAjO@AEldOu=L(IX{hBtJAl^JJx10g? z6QBCXRn#T#nf|Z>kFqf29;`7>!UbiVML`0_*%%aQri!kcR9Ia4gAKKg7tyv~#c*ES zKm>U(W`tn5cUc*(s?~j|GU7;LP|>xOEHhDr&7M>>+y;*(&#@&jxYxeQdnuSXF_V5d zG1KQ2=-Ml6mxBYk$-S%%HX-oO4iwJYSX9^9C7lw$M$e*M9CyeU>(UrRGS6lX)Pe>C zN(paevd+m02Jlt^zGub>Zr~1u7NNG)Hjf|N{zZSX({A8y!rlC}X!fe_ffF<9KfV1J zq_^?^(L?(~e)xSjK2_zQ053%C+l(IScWPJdvh2$1LIU}q_7?vxhX?PE<>c3!P91YC z&c?-ZD2@y?&YeFV>;pK7nDX&%P+bFI3ZF`=KnN4C_Yh=;p1F}eet8_r;mCanTGBKx z`J%s8m09&Dj_qqg_>*58M0{diZk-Etz;3^Tq~Y((<`lA%s`6ZqTA@-o7^X}J5(`Ssb53ewd5|#v)vCjL?WNoBwFv}<|XNhwjjw?_C_Mq#Fl|(0; zJS^srKV5qXCo>VtnA!I_<;jpaZDWQ9Z{iW8^9Jlir-_-5t4!S@SYqLeh@8faX7>l0 zCNZSK5v@>KN>)bGNGneJC=KP2+Ve$;amyPfmbY8z^xU^)(&NZT+vB=YpvA3$5(s2^ zXDx^4pf6odscpY9|KE`CKX#*dG`B?#cetwNmo520^b1-`R6B`Lq|`%hl<(iixq^^U z%(Zs_Lba}t5sB9+ckcFpjv7bREuw6F9{_F*o!OYnZoR$qfYoe`2DqtUQ&!i@)!v_^ zb>b-Gv5qMc*Bg32!E}GqcJP*tI8y0J34G^ z1k35;&T%m$B=ByU6_>;23<*vtLJm8+35sM;D*( zyRGgak%*8uZNHMB7VLx0=nplEa*x8b#S2V|=)=(Hc~ zDmL|$?@PjaU+c2fCBL4;{)B+mR^_*ROKP2l!^?qxq=O2ABs#}s4)hMKgeAwlkq<&5 zIZc5gbT+GjmzH*&$QqTUlp^k1)`SI~kKHY)C8uUA+#dj5Ygbe5+)FWRp_KagcuYr4 
zZQskbm>!nEqrx{aBWJ9?-5AaqnB8|60xR+kxL++#Yy>)R0Wgm&2`O{2P|o58u_*)$ z^O6X~ zcPb|7+R{x+s6XqWM@npzeMQCSYK}0=W3#UVr5u0KSn;t-_Y@Y(TB5W9n3GuzP%QhN zrscser59hZoHHxk6cceryKuHU;}T0Y2KEm9?+`$#?9&$byh%k5M0N-x1k z>}p$XhD&w697%K?lz#Uhrs7BCMrx>m@_zrbjyPWz=)DuMh-~DR3FZBs4l)w9q?`U{ zaT*Yg&VL&&!$bRm08HKq5R&?ZWaO`w5uJ1u9I6`Wo>{wUm&i9+u8%RFbi1b}@dk+f z+Gu`ZYhY8_f>Yli&F!FW-0)rgo>1;2Nw3Mu@o2UPoPzNQBz)9Ew{%PNALfp7%kv$t z^>0|b9@U%(z9IOM^)Q5^1uT%mApyHWgkd@b2%2x_glMOV>c^0$i~kj^jCxg9II_e@XYiv+PyIhzF75xe8K9 zyb^0w8Lujc)DI>*MOdz0=OhB4g4DXrgT>R%zsHT}u8!g{GQqOu;g@TebZo?VXdylHLujF`tF$#$^ zcmA|k;}x}8NL(cG%`l9^WeKvBQc)n6R8Obkan^IOlBcLq<4t+5^Yf-?)>*+1=0^qBVM*? z)@(L^&0ZOH?k9b43&4CH0@!3I*JbDINxH;yz;_H}c#d>v-Dd>l{{(}3ImFOt>ZnnO zm!Vn}CA}_h=Oq5+=7X}MJK1y-k4)@&XpQSmqwfYKNtVNh?O%{m6Gda@b9{sgdbC`Y z0B+%5ObmEaJk1Apdu3s!94a#HML%23KuD0wzd zH(e9e%itQD(1Y`eJ+Zt&V)l3cN&gbD!3br^J4;9XX{roJ8kmy4VhQmptpQk!HN_-f zUWcA@YzBJgqi(D%QuB3vnoPPG7~D3QH0_4_m{)21Snz~t)n$3c*JpsUx);n zp%nR5*lSu*-B*{tlJ(wrjq`IbSo$xmr2B6M&F4qHf7@19b0su4M{o*zzShr z-cpKfPkoP?Ek>BzZqBf)Ua5I#mD~;;9%=4E^M3oOLL{5V>Q%go^fU!GuSTedQ_htA zBV~GOO z-XtyF}QD@TR0bhq!?jOZ*m{`6^Wq4#7fuN=Nv34nQAJuPPUE=>`^uvTSskO zTcO0-kt=^d}D>diIOnkur%*4Fi8U1AnBKE1ORGGPk_RH{fKvj><-R`le>R zCiZ}6S=hJtf<=2ig%eOuX zCt`Ofi#}&CqAM;c;FyX`^P^fI?2U|`$3squ!T-Dd688@w6ybj0y`q+F#)6{u)nD;j z8QE59j_^!EN-}7em5E-`J|uV?0g^s_==aJ!>1sNU#(1+1No;o+4a_$NMLuj2*MQDZ z;Xf1G2g&ew9d;v|WLMWzyMaEbVT=i4!>nlk+_)M_;wd<1uFdJ$TWTN&`|CXX2P58V zy0@sjmu!QKab>4P)1e&W0XOx3g<4-X=0iMCrJv8MCcVx(7^WMUXb413XVaINl)P_L zK}L7d4>2Bp7C{*%LVty@|{!}5x{-S(r|Xq{PET`YM8Xa@fsL#mfUqIWgH1BFVk63oU}jGP}FNOh-I zgg9Paq86;B>k!kH%T;_3{d!)}=H3Yj<+L_)Vu^pEQqqB;<)f0`r<=_Air>LM{f`&% zffM_gCxJQxw3|cMVc02?tyh9A-p#@FTT#2E-PmTD71do_*oWRL13eerAneFA`$~bX zlJw-&3CZXZcv}K(=RtrmQ`HmhHt#0->i&8sx7yM#Qc2GhBu7eLYPjd>EIsBF7iDAS z2@CZ)!ku4^kVWdHjZJXE;i%34v?Y_^jtziovaEYe_JIO9y2ZJOe@df?$!WTfMZxtB za=Ne2W5owG4&1qYu&2vxYw^YX^u9Dw(*RtV=jGvJT)8r?tP;$StybIqAy)qkdILes zM}|JoEFySyn+(%p9<%p}i=Owb!jMbrf@BP>`OMq0Y>Hh6y$=S(GEv)|z3D6tBH;m_ 
zKjv*`&csc31XL)+pPS7tfuzMvw~Xl>5q&GA919R`hbXAorAC5RqChAyficOAi%ws? z^;L0KO7upu)O#<*N3cCX!hv36f@ne*gGx{}2^ z^z~-G2B2TY>!=PUGnowTN+!bGYds=rZ9|-8&QX?`zqV>neB3x=c2Kut`4L7<*?(fu z5HzI7;%EqZ_GaV+h0lMZyNY0E$|}(ziYP|W`&ts|^ZuINO>Ls?keioH*GP`CQ9)=G zQBU^A5+-#%T=7oC?9Y(>(+FBC7@U9qh_7FP*ZJH}P|}X`Jw4g+fbXeESYu>|X^{ap zTw3MKW6%li{&$fG{{}PTiE#1Jw;Db>VMmqe$yYp3BK3x%_MIWcf=yH4MG-zAFA8vO zy<#6P<%#~?d@xh9rwCn9HTFq1)1k)8Ev<@Y>ixyY(ToP1u+H|3h~6tAVgBt}qz9duxy1KnXdO!qKOp2C-H(IZ{4*Q#7kzI|(gd}B8FVuUTO^LXo>vcyJyco(INTT%6+z&R+$R?gM&t;y){P^(lsACU&MDe^kKved!eB)tacqKeOLR*3Yk*|sW87}%|jT07;=DsuVD9q z{y(?x;nQ2K(8x$-zaU1h>oQZm$b!_@z0>Qxy`qP*R_A62gpJkZ#uXxIKKVW0l&hz3 ze!4ly2jD592B6}-;@g4Pmu@Xf-;o{ZgUrwjx0KBy<>#`0UBT$hThiYDycF5iFo7AJ zULQlkzwKNR@kBKfqPw)tBco=_LT0K~m_c{nn?htw2unkj@C(KW_}rr>faI$FwPmD6 z^U*kD=(}|~LudpgqVR1%+VyfBi&1~Ty((#^Nkcoy~v8;dn zq5t``2oM8%i7Pq%wCY}s$tPtW(Y59VbLtO{w5oW-I+I63%rH9pP0I^6Ax;Rf4efF~ z`-YV?7ab83JAb04avr%!Q+YIY?3qUh>gX*Gj8o}oUmlLQ)$#J=K;mg|SFsfh zqitqKo-b*^U9@EThk-`Od%EBQ&@!AOB6Lm{G~s>)*+7LoHj=&ov0kkia=3LIZQ$L= z-N;$G86(1?X6pUsotKEgXri=NDTTGAlx%L>K4#~z;71=3QfQ-AO>7(R%mXxX%exc_ zemDgDSM{3$jmMr%RTuUJiY&RD?_Og@4KWUz#{i1Me3#N7|9gmKv*aS=i{J z6;6O03)ky9Sh;9Cu=Dy^xg1OEo9H{mwh*bFZ z^y&i>lvv}s08yV4BEn`-;^LA)HX4;?YXQdyoAJNzr6){^RWx)nmf;=OnVj+Q?wTSX z9rx_X+j_yEqq@POOL=Gm0+_S{nq7wiw(0P^@Yu4bS22B;P{|8=d;(LP|LvrS1l94w zk9rJv6Yivmx(htR#C4sC7g2|VJ%?hUgzw4$F=v_XI~=UE}L)D+qezdpPtMaA%}~H(g*BiN;CPgZJ5wA z&UsEbkc(JqrXK0!$^aUNh8ah8zuP)4v78Gg7U-9Uou_Di^MLc3VhO!Z`&cK#s0A?p z;)p=Lwt7uk6#2u}v1nkdf9Dr{`!@Gm=ZDzc8!ZXRjr!rnZ_{Q;`J;XM=5mRDIdg~v zi`Gz^KKQLT`@JEws}@|v>2kN42NU&4%RQupv8Cf+C}BUT$ELlxqQ~8z>kmJC5mbCT za(l-OVClQQzgVsix(zR7v)r6$Q!vnRpTLmJ=Xpr5+brGAeV`cDR�UlyI`;jB{0% z5JDbZa$X4<+B#GxuxjB94z%IXbI`vWFV;OD<~zv|i@Nk-wi)z#E7oR`pm{#f{7nGh z&=^B!o-z|ho?WhY4jAMO(46sx)%A?yIqn#RxSaLe4}Xfefo;_NR@|S5C-I4EOn@f8 zv=;-cg;F|~vAsWz5Bv)JWUTXn{#6^d!UP+3z{}=+78?H#KLX$q88I35<10i7agLFf z8}}*uVsLEw@%EI%*p1nkQ_7?FfCN>9KP9`Y<)r{=%A=!#)@1jiDcz*XPgjF&dS2Pf 
zJ_Yly4=*Hq`scgSD#zj@jc3)?DL`rJx3guR9ovpi0oSo{&CPbs76%4`Po$gIql^3y zI0xXx6#42~cA=S~ z&857Xyx$9C8(~;|>P^T9K99BJTUd)%MD(fF+!NkY!o%O@6WHMC8vQV#U^t>8WiayT z+N>wenXP@I9w6DZ{u&4xe#LlQN}57U+r3XrG4F%2KNjN+o%3c#`(1vacV)wv$zHgO zk{f8``0sodfG~j{)mQyG#G_a>3#9RVr>64R0#hi4A2jRh;xEqu>!?(xw*wi%8Do;O zX=&u0bIbbT5R0!G*Caw^NnZU`%M^J=tHtF*p{;IchY!P4HXWjVJ+~ahlL#Le9(t5R zK-Kyx<#~n;XY1K=UdL7{lVLdL3s(Hv5GJE~951lhYN4;0st()M7hl_ttY)GUA3K$H z0|H3ApY?Q90@oi2^4b?73G*u!4?TiwJb_u93^w(DAtBLFyqD&GP9UI+?7E%I8{xbE zbr$*gbWLljW67*>`N9333?02w44MiY67t!*Zx(N&044%Ck&^_G4u902e)ok(>iR2r z&86@Ecs=%t`<7fjnkX5?Y%nl7q_c*gM^a>VoKwmkMyMVt4fV9H5BG61V%2b8XL}2@ zyFpS+)MjsaPW=I%U!luU!}A_Z_n``a`jY(0&q25`Slj3Dlj@5szYoEmx!7Wtl+V<6 z=N5SSZqG@CHg`oJ^(TPY;s;1s*>9pk7|sfwwYL4rYDW50@am;sh;CWJFkM;D^@;lR*YT z!TXEdv%);L-(Y=7||?3ST~}e+I=0dG5skLYVa0-2x7v9b&@aF0N(LdL&FadmJpmn zhBK7qN|d}Jwf|vp1We@tYK|Qh-yO(Vd8Twzs?5@`R({Xg}4c)qBNs?Z4aJPN79qLDo`cL180C+vBgpX<}GwghKj#*dyog@e@Sc z(N=I+F5f%RO_1@hk9Si?p#t=~A_^X?M@-t}836rxG-i!dC)FdhwN@%i*L~~oQ?ll2 zyF-{{x$=abM|(x)!*wchyld>S`<xUa z6C$5%M)2D8)!_veL$4Q(H%;bC=*qlz3!Q0xfN9Ke27ZK$_dDNY>93pwh&o610Q!j9 zX8@x;9y-rYohip&uagIEer(PTyEyp%62To3w*R($k ze#Mw~@oD#db^4W6>L*8|g;f2hcOMWA9kgCi1WGqj?JS+fb%){uo(II>UqR z(jH8)4h8{+4cerk32I^W(Rm6qRNKQoeR+eeT;hR0iTQ>BfwAUr7r3HH${zN;JIXd?+53X{0xRO1eIjP{ zCZqOMxOt8@^fmd_)Kn`J^Ld`*UQ@61+(dNr)b~4v?T$TyiUt2=qp_H^f@N~{K+~%c zeG7Md>-Z7thGZ zy!dFV*|(>0E5yR)Lzvk4oEEle(VS{=qwut;+5X>P=bwowFZGzAmKQUYlB@h);mVCj zTzJX-11yJ0GU^9Ta|78*09!o%k;do3PCUQIw3qB1Ti$&?HAe=#NA86P)#w=B=e;Y*)0-e>`U_)bRJf{7gyGSlk5vJ07Fm78v}-GC z_n0|8W{%8as1g0k$q(KZt~@}5zZPz;-(B-gcGrn6tIS9QBD%Pf)Q@d0Z(Yv9QR7KIQ=mG{WNUP8Yo!9F&AO5dUWmIOczxt9mJLBH8u zIE1zzYPGY}_|gJ~Lf~$6hh(7UepeD>$-Y&}(PIDI(D|uOxvrq^voWb+bgh`SzV9WB zDCJf0*#KhL+0E`Vjb>ro73`}E!ORuYjBAu%SxDOc3@3FOQ0(U1+{!h;whj!ru*(^# zD!s%-U_Ll-R?FS>3tF()W%x#5pBI&WYTWhd^au!=zIXwhjwOEmk|N5Q^(=nNr0|6H zM+fg=tZMl+k9;G}$3I107y%2%rf}V51KKzI!_F=x4+Vb)-0w8l4L<)c%YI;vLA`!i z`0E_FE4bCk*_MA&I?ib)a$C533CZ7eG9+^l?<@>m`|h%;UrrCflI}aukfrm?B%XFc 
zW>r9Cr4~=RmyD`5DQSskrTQU#t=~t!YduGlXG8+7`#UpUz61G6Cg$>a=<}C$aYGYZ zXce;xTRidz+Nu4A-lf?TZC|=uIekSOst!;0Yna!KjS!%3sCJz=vqRg+l}?s(}WBg(w+s zRNNE}v^pp3zM8^=xxyZ3Z=tbMe)iQe^)C{q)q;`TPTgf~Q<3ikd61Jt()%Xr>k~n? z@ao#jK%YDwKAsOyFBU$Ndwd^l-8o!rg2U)mHWZs9GL3WBUq=WMIcF=?I93&pCrA?P zlAz7yLlDtF{a+*xoEI%&HU%j;NxF62Gcd5_NNWDR*pfQGhC{{kIo$9KI}3fPNYub$ zi`>SwkHw9$3g?HRf2u`)Z6cl@6;Syj7i^>G%E!h_1QVGin=_*`m9qH(`CuXaLlS(4 zy~G=~G})gQFt(dR>_tiUd~F%iu#Eg;Irc*A9juA{8c(ux@Ix8oBt4Fs-DEhoR&%L~ z?nj`n^$B=vw&#!(q z29-Mt8s;cbBg#U7s@+RbJl)$NB-pgC zPR+jYV588{-rKtQ!1*(l=hPYyrKAN1bs&N$pUcMkP)D9?gHk3xt)r0 zy?bNw(n|86F2NPO5#-vkFa?yi9x zmFg&Rfy|rz+}Aj3iKW|IolW#uI#0leD8Dm)Gcnm_#i;PYXg2sp} zoaLv>=wci&p3b9!0xH3jf!{)y5g#XkFc%N)m!v**r7F7GsyY|2>`@FAdW@pM@-Ljw z5!i*{8f7PLJZi=uPZIS#UadL!BU5_$1Be`O#Pv&?<3ncY3Oa}zU;BCKT7dJYh(y)t zmAb*BquEBpOEqv9u1kP2g10vbll$lZhs17nH2W-8LSH35;MY?G!(qgdve-`zM}S5^ z--ivK_VL7))Kmc(3Z|Z2!(bIj^)Ki2hN!7_dAT0&x2&_wp2~*{95AQ%yAIW?ZddA6 zb0^9jZ-0o+x|S$gc9ZrkMjV*kRJUi*^SPwcMkM!a&FVI zBJ4mx3dZFP`VjowkK!$JaDSD!=E|{t+|#AK#cI=;0=cBC#q|D~b~-cdfoG>A2lV6w z(GrJO?b9DLymv8N#;aS)_qFlWdymt|!#s04{C%%K4b;e_f(7Wj$m<2b0e@zv$n!!ln0H;Lydk--q78P;xcfzR$olojNRa@fZjTNaH9#fO6 zyHdwhWvdmXFrs+gvLZ1#Oc>Q`0g6wSc^a>I6yiRj~MOeP{eD3x?{&2IhDQs1-460sQl4FepKF% z(`ACqI0QqOM_%+SGkdRzq%b&Q>_;i`2El`Ktz%M<_FDx!2L;U-(pX%V_uBSLg-*w& zPY~=rk;I?8FQAaF3LO@tm)87*4!|i~LN<6c9i$$hb$k67JM}V0YlZyWZ!@&6W{Pw5 zf1NR%FrX<~s!NKln=t}}iv%z!9sOi#3arIQ`_6x-(YB0krON+MMEUr@z z-Or)U?3=syxTLP=UBtP#?<;sFHha=H4A}qWRnjE7WdFjwLEmWhmn%OYVVOo}d_??M ziJT1s$*S%KDNO;uE3%WHXGn9m*C(@qjd@Q+LAoEcQS{h8RPH%vPW}vKy90ebDyrQ} z@qvh0+dM~Kw2!eecdl2`29K4QZcph8TIXLINmck-wD#^Pr$dD8#okkoF z=8l$5V`PK&)vRrd_zji@(9WQP_0fn!OPVV0oH4fWGo^{E_v<^EnF&uIToH^Pbki&7 zmdqkRkLs&K_F@fszQP^KnaB%zo#@3j4QZ4t?$9cTQN66bW#s~Rw?NBb&6A0p=VRQq9rJiKm)+0|#z0u8e<;LPx0=RQ^x>sn}Ckx=(Tqo{wQ!M*P z)EI$-K^V|@k~QNt`aXbJdZ?Z}JFNbfkLaq+X)GxA{*i@%N;z^&E!A`9VxC@Lh8Y~~ zSwZi75`U_YTjD1yV;T3r;@u|_SD=4|rQvJA!=zU$A38r`ZkAVe3wGL${EQ1>sCEoI zIl1li@XDrn-GiR7m!=!$(iT4HOIa@=+X*w&`q{XwVCbFo{#EBQ&t;$U7mf1*0=A5l 
zNc=;u-T*I^1D{7TPN54agnq1PitZHRDQl+tMgDAEZXZYsmXY_bepKh_o_Y7X@93A) zji`CiNXXAhD1IXKZd=IhPPx&4w|Z)%m@XQyGb*puSa+GxfOSfr;EJcSW`_}*#rlYi z&>w3ySclsY<}32EV)ON9HRMAnL4)|2H5^y*oGsJGw>6i)u;<@r!ah_e8o-hGf3e){ zp`xEQzNy{QQwa5->^C7_J+?uweEZROr?#jjN+*z z;p=fMU)%!64C_hhGveff&0;++IX^B2!|-zjk|4DlRcA+{_MZ-NVHaA!e>iBJ6mh&Q52vqNs)d$g;D17nGYrOZTR&ge1Wq5DeD5e2NcpRETX z;_dIQ#gAct-p9aN^^k9st&(=LfNN`Qq`zNV^W(;U*~E=m7#A3~1F*x2a0A+8!%ijt z8CP^p+pC#1{IUXqIbb7T|6ue(Q!V*V4y?3$PJRT=93$*Qxqxr?k+55@t+Lfn!DZR0_;bk|v->bFn zv?%aZaPh=pwEM}+>jJ$G28osq={f*P&VQJr8a9)=t)KbAxp@)@!Q}aqPw71>Var8h z8xODwEx&|Lsec`J)KK=%5wn>UwtIvaaWAeG>^~I%M$X2eSHm zN7OWKWbO6_$Xco5TpSZUU!7$-TKKaQ30`TlVC(M1B}|!^NhPfii`-fCcqEP=M>Elv zgK3Fi16RPJJ?xk40CyWvHRdnrhk@TSqVQBoAajWHj3C`*XUP`PX^&U2HTKb-{X971 zTvyIH+x5^_KD%7xHN7MAckc`q)~~PnvX}zGbaueco(TH^w+-#51gVPY&aqZ(y>WU! zy`sAKN}^<650P+?Nv?uoah3^BPHTVk(i@t|ljFHG9VUlTUAVR^34;N4M&U(6EjOBx zS>w(0VVscI&q53MDRcB?9%w#9S9!cVxZ}Zn;Erv_zkcK|w~`z1%(U@ui2n~t-CoAjMd&lVdKkK%q{c}QU4Q-Dy~82!O`9@E$cKYq z@m}kYe%Sa3z|8h#*mXSw7{OG`qJ>!X=>8=_g1TLxm>KuFr$wHa3LDCmG1AND_uz)0 z&NdiNh+bTn9D+hE)4hi)g1N+din+a&GQE=5Xx!z-EP|5L&TeF*iAu7lbkW$3s_6gE z7XEUaU816*pr%S7%5C2A6W?A>IZ<85Td2>G{;U(xKN=!O;uPgYmX7BcE2SRzOR_>c;fGzJ0an z!|=vs0ii^q{uJak*CJ|@zLUV#s7aWWWV|upT2}Lf$wGaBIA`E%5&4hNS31OtNrKkD z_>vA30SZa~E<4nOB$xxJiLZShAxVo;F-a)hbHU~H z#qihp&OCsX>>ayeAt%TzD>!cZQd0hy)kLYW+@MLdKc|`Nrlz~>2e{K6aX)PQoDs15 z6X*usKIJ^%f9MChR6yt&Bt4=)S+y8la>j+>^2M30V&OyU3eEG+CybPbv5wVDja`=^ zzgk0FXGet{4Vaj22SiCU8AY>+JYF6E?B&^KlYB)f+4N6?U8HS$1<)M3=FPcGBVIS7 z>7UJ$kJGrwF4pabwFqkLMsTooEFwRv%CbtFH$Ujr!EaOgIEs~XNGY_=aVg1seJ3Juih-g1t8Qk z&+-M0!&vko`B5!-u}Xu;%%rDA#@O2D71|MOjmw2`po_0_8sur4?E#2*@!vI@qFjFTTT?2_Z84u3xhcVJ+1W+mkXBjpt_k z@s@U1Hiby+vFAef>JpF`O9|RzjBdSb&3ar3kbF7Pmn#{r~>an#v@If zxBRMeAi7z!WEk4U7tK-HLZ@ zhA!q)9<#Cv-gScEP(4w2aH7w4Q0HLS{39OwF0*rnAW=YvnQ0F+0Yh5gts3Ip_(E(C z*$l;X<;Osinl=Nd>RQg)K=83ESU|*Ud+*Ac7wK^3&(-SJJX_P@AKMXkO(Ab{-_u|| znhlM2^o-8*4(8LqJE(vD(9{eZ!4tiMg7({we&+bad|=wx%c7T({2ub!E1zQA%<+R+ 
znD*Pl{podvmHah&x5Sku@0~5`ls+4@mdqtV%{*n&uC{1$(jGfdCwyo3Y{wo%WetLO zx@iXN2d;Gs8-4N?jcXE2;U{{sJbVdH;#`#_-m2rdV#IfFD+05*1Z>Sa1pRtr&TSoS*q)C zr7fN8t)P2kvsGU3&|{f^x079i&Wr4C5ZD)oas|ocg%~_efb!S_I^egP9hKr-_8Q)E zvMNSGHdAv}6LXiDt*P~fR_U_t`@3Ui*0u)cRVCZaRE@r1Ii4z$gJJBg*qbB_`;p#2+tGIlJJ&CU+g)*~<10dyJlZun+Q_yhx^kHr}l7RebBzL_B-YiF9ndp2*( zKmEHFaQbuQO-A7fQWs*rLMI}__^d17w6mbrb}=$%UbnN+c5%=jhO-u1R5#1qndtmt zX;k;le%Cm1Ma1OnvT-1bXXM)&-ws|Ba0yoUluf~g>#enclYXF=_+q&g!FNL~V}6-` z+i58kiuAj`4f)ojS@Di-!e|^%XKmFnee~0Pj?fjXhUQRm*p>@ErG{i2n4)= z`LM{3Co;WyH1{?rn|+}KE|pMSrzJd~4qZvtvSbv0x^rM9gAdYf+f*IISqo*JN~~v{ z{>`a~<}Rau{AaPMon1E-kbuU=FO>F}Bc(mtgZB}nKX3Pd{SmxZ>?ylvyR;tWuC3>RJ zP(!Q%aoQ4=277OLxIKIk66n~KzvrKs4})sozc46cHf)iG!0S?RUWSdPKxNc+JkSMl z_&D-mVHd}t`B3#%Ce$z2{A{V?aSl}9abU?r*%LXSzR|QqLL~Qu;!V(y%~fvkn$uPA z$;WI3od2$|K{0TH2H4jy>zJ$3){QlCzv;53^<+C(8~L*?l2JsX@hO53#8FTCAW z$Z?1-dMC*40eS5&3>j8q(2f96&3QLd3`K>!0qn=L2(O%&N`zw;mQK&FQPEP{PBcEZmu7|hJD<;Ieu)}xF5D?4_V9NLJ&;;q>-)#->Z@$r#yl$%^ z&3cZDrr=vLeC{rgFfUju3nK+xs7aS3bB0}yjDcgFvN+PmR~`=pv}QAxK$08{c8IT1 zA0BV$WQgPjtAq3OL(sn=5Amj+QNF^*ocN{5{2)D}+}NE-=!Ii52ExV{+BPL&T7G8_ zSjxZRjA-sT`bRTY%x#f{!)H><>X#zd?ggZ+po2|&NDCZFMWz338)v=kmswPp|5?sV z@#I7!t>p3t!0#JE;mu0+D)#_ktCn~Dcm)^Xe!rfYfEZzrgbSLUg*8*XIl)C5v9;zG zeH1-5*J;&cK0kwWha#W!VM|R_AHAN9tEkl8zje^lC__ ziHf(AM7S02OT7XDc&;o;`Lq9MfhrGbls5AF0r$reeV;dV$SsiGQ2#po#Kd)H6o>Z` z%1N998lET=$lmq6c;$Fm(|K}CJzLKn>GnA|OVJyj!)G+9(kfu+3V6CQT7EpA=b`Gh zXW?ZSgP%Dvmfnw=U;prh%jakG@?rc>IGng7j#+UjJ7&N^yh2s>UhgeP^XBY=$`Z_<(fTDcsU4 z+t~E3(zWl)OY|kA{`CVE_c5jCxL!(}c&M3^4AU*j{p;CD6`!g)3~(z7+;qVP^j4#4 zJ;CwBnH9@`9$UK#`PobRGEWTAY)mE!zH$GW&D`-?yl@ov#8#?Epcvv6RJ-;0G3cNO z-4#T2)->&t^DcW4THXV$X>9bgrSe^ZPZY2SO;0p9PziNso6OW{)5ErO461PrG(|f47QhR1T-t}708*?07~ltTn!&r{_0#DAbLxju6v0aY*6_iMN*pXe|DTrq z!j$W4-oH5i&i-g)}2K?VVx3udb2#B6Hz7`tthVm=O{o zfwDJrhZ;z;KE;Qz?zax=Hxu~w2&FjU_rGvb?=RA5r$^{FwuN7C8XY%DVLvTEvX=W_ z>nQs+t) zua)PLjS<|b?E9nV5U(SS3a?Ds0H=mE@gZF9YLN@a5GJ));Om+SPu1O|20IO^j7uKJ 
z;ymbsZUw&}G|X2-d}{ZNThy#I$7aU+ZYZ%WUlV!rdwl!Dx7~U)>c2&)`YI?ovQ5M4 zd6scgoo4My9+q3_yDMD6i4z|knNDZd&U*V#|>c!%hAiCtUZ*$R5UMiz}1>gtn{znyRg zZ%tL_vK9q6U?(Uy?pz_xjV|OK>un(%k7gf2IRipAlSd>+)y9V9fyvOr5Vp^Rb_qWN zaxGNQQyYoH80PQ^|GS&glsngdXhstK8`sgg)40Z1;U1s!$f7~k1YyW%KX`@qGDqXU zh9-!`#Wh0QiI+7;5Y0;8^h;ZT2;pO6#}!$3c8rQtk9oRPZHIM{3mIFosdD3DZDwEZ&gb;q1aAE(kQuK_pg8HiDrvLRGV+|h%aU) z`Lu&2eLS9hv*nI)S);s=f}V%wt`%1az&3{sszj{X-PL@Xv+Zr!E-OV?5?MJhF2m;_ zwyCluA(X>P1|88p+@Cn~dEQJi@PGlX8}s2+X+%pbE2v{agA!uk*>Qs{<=&e-4xi+Q z-iZa067^&7f-u;*aL3648aZXpi*hgecZK`Ig8qoPvMg^r_v$hLEP0T~=PQXNj%mLm zGCOn)>24Q)++nxU6{&NxldpwaNQp^ECh_z=TQ4>uXk4fhKUIYi_4p|O83%{9c|G4G zMZ~6j4$wALG@-Di5OP>)v+&mq>SS69xyC8~0h;(*)26EK+?DuGY`ybmTq`85H90eh z?*w1@tft~rYqW17>TWZ!JVJtfZ{$@~+Z2>*R(*kjFK6EHgY(t=wr41qapEM2lhGWZtR6$4yhgAr#)6Mv1PQe0(F=0m;yvk`>8Az-_uej9vkC{e8b1z-UQ z0$fuq$5#ayK*}}WY_=GAcN>WjR|l{1TjKJ!v~4K*6m=czHn)k?3D5t)T`k~qP#jbs z@}apqnyCKpEgH8Iyzg!r(o>!46owRntlVaKX-RzwXq)P$63B58D3G&Z8BDdkRQHOa z*s2BxUF>=6HUmBYJ7Fq=gha>-NCDVRq9h*Mz0_C*vgIVfEvbJ5Tgrwa9sHjc-u2?~ zM-dgO2HU;bB#$(*n9az>WP3d}!@QJ~%<6CLBa9(xW3*~<(#5ZNt_(6}o1_$f^xt|* zsi_h>7PNOxicRUcYlY=9MHuKroqEm9o?E&@XN5(Y3eU5W%FRiV#Z|3IhRND&A8*t# z($^M%s+Mx(}m(S+D` z2DjVyAQLl1lz&;;HvlMx|GtL?dDF=@LXmxem(D=JAtHs+igB>1@x_A)4T=8B2Wnpa zg>OE+%jTNkZ$pNgcITXW3!2SErX>Khd;^)XM4lZ^TLxGUEW^*-Lyp+3iNw(`EHFVjw(oHQK@5IwF|!2UZN z0V-wHdS~*VO;|;B{gM3Mz6n_4O&WZOtM<@1Bt$1wkuZeUmn^^DGX9kU%lBZBLyE=XAq65&`5x_paC8 zxxT8j0-=HQH>jNDO_I}J;oE3k?4(X!el4>7MpMuS{lx5^GW`s;Y^P+tbc!Y>B!b zr$z%S9SMrZk$*KL8=#<7GbNv3oG{kNG=DU+2}CWo;u?L!VFpf*pm@y~@xtjcChg*h zva|N*57yXE$|M?2xs636aL3V)Nho3gROrOj&zeil0TlomDq(M z1f`VPWVQPy=G*)W(dRp&GmTn35o=wm!u$>)V!Nd>6{YXC`*q98FMF~amyo^Y`kt@5 zIsS2(zGYDCbeo=-C-dS;D{RJIt*v}2{RQ4CqF`X}=u>vivN0U@?^-}frK%cqr!?;- zO{AD8T`_s^RBqZ?wY?nigmO$oZ`Luzu47xa_4Bp*409Xw>eU7KmAr4>hoazX{)tT; z_qwB~N}4LA_C&n0^`pQ>kVZ}Xoz>p}+W+JzczbVHH#pE?#vkjwqMfd2T# zAHg+O8NJ@==j>4!q0_Ygm2+o0Me3wxj5EoB%LK=j&+jZXBc zVz^r&`tnVZB%tMLTwA(#m+CRj+S-wby2c|)l5u2ll|n}tT0P9YPh#(|)90zP?ak-L 
zEmdiL>Hr-jeNwKEZ>{2#^=j7bs7qWUoirov0clgaE!pPu7mXQ~a{b4Qy1_LDz~jY` zB=5Gjc(jfandZ-X=&P0u6>L)s=gL$H%Va5lP30D&RzCcDxBvUaCb5`8p0Tv!^8fQM z;49^*Znv23cf5TEb(*;UC*ttY%mmRY;&K^9DE~dU{T1WMuxgF6mh!A#yS74E>eK5y zz}05AiCO4ed2KFZ> zw1;E)%I;=X?Y37?@bWtb9>?Qv%p`sJ=y{64#b&uUH641{G{{u4NL*=$o+^9#cpTOv zHN{eHM>i27#ieU*wa1w6ucj5|d#q`{FRoylIs_yxjX^(t1pT!OPPQ)0|H8CC(r9W_ zTXf7f%FI$izxfzI0yk(+@c$(MRra9swe`N4`SX-WgLfx5eAnKkI(zqZN{d>0duDM` z?M75{IxA(VnrC`$s_kDpMP*y;SoJ*8M~vswks{cFJNOL*H^ogMoQ*7r!t{*OUiq?X z^J}%mm!9k6+2GNuj4IC^p|uLmY0<4xb=E`-FcGEN*iHyD?G)yPmbjduhliJFO$@Wm z<|o+zWdR5u3)8AOdBh@awT5pzsO0%f>07ZvkwW9(of__?qFXrO;-BhoJ{@)ck+5fr z$xxM&RO&F*)D_+aMDweKo95bMwXE{rUhq`I9gF`$A=8=rk1u7Yl4n&zbFLQEPP$0xLXk!g-wO$=HqdWMXgo-f`w%h zg_CT~lBv(K)%{2j7P2eVfm1q7c7R9yCWMj1ul3!d9AWSU*Uoy|b_$4T&>k_aE^K^R z_5JaW;tReC^Gru#u8`I@;FGyM}0YRNTPpXo} zc{OZ3WXojL4_q%}&GsWA^M6k1=;W3B>()}&g ze?DvixF={ouQenQ&uWsxpgvx{4rDAL`?- zgW1+HCE6%^;p0)&sXF+Cv{lI?Vx+Ovnc*$Y9>wOYjN@f8#cMlIBee6Ssl{1)%(7*Q z;G3%9r5r(yeC*Te$(xkE#HTO!UMx~bU8KJMakibSu+4eKj!PL^qw2G}A4sGNPhBe% zVVV%pQ_%>3U!9fq2$yKbTahf44^?LG5qH*DzC-9rwVAdk88`~ELK~7IA0^U@uNzuP*J$F zM#a5Lf70#AxE);tqRg_%PXwZXwBLf6F;1JWT(2z)cL-^%(lA@LyGOh_7Xh3=v?DsC z^mEh%w`_ScfZZLjFQSf{oz=5it_-E^e;f{w5r4n`e>=uo=ZDQ2Z@YiM>G!8^u%qdc z|Gx}j`_bI#FqA)ioO9b~GmyDLz};ItUYKX0*f;C~QITg{wFowjD%OtlrK(mP8Vk%8 zoT_oMKRrsJAdP)@L0DFrTfzuJf)u1FCHZLE&*9Sawr@&R2jXl4%S4H`a zOegQ-r)4a+k2_ezVVS#)9`t(hhWWBJ4}}f`SHJQbzZpFrgdQod^sp@5EO_%dlI%|U zKpN}~i_{PO93%{mL2_Yw@(m8@&H4d~8tiq-g(`F@J;dak3&kSv$r9^cVjdl*OTM;o zE4Qkm#v!vWpo_O&>#{kVzG zWtSdd_@9E1{1*il zN(8o#u~YnQHw04rIp#{>u%oOm&32M}*M?#p;2ZkXx?Lls;ObGQ$pT#Cf=ds}Z=C@k zU+IJ;tC3ggr90#$pa+hZpoN)vGe(#D6RreUj9#j_ z9SYXa3;-fP(7bEE`}W<vx`Y`-E$JcS?O3& z^m)h|s#mH$Rc=wG+^gPM^h(y*Zg|6kY-kwF*148_q@}L9_6v9YX~s}Vs3W|wwGhXl z={SH3erGaiXHFwWMZ;M>OXA^L8226o0Q{eU+Dc}T1c=OOXWIWn&%kLT*@Np;RJ&S`9hHFQ9UoDs$8@!IPK|PDaCbh)3()(vz#2`E?B1281U?mL_UGUEW+Z+cnU1 z)XyE9Sr7KEY@LNxtsz4wSB13CqzD}ROzOB#=i1s{tX4U%YNVO~DISXkr}y7((-Lb8 zNNF*8saHN{4)Pa*RTbn@?v(15U_;fw3VV%t$N}|(q3U*PNUwk0 
zdUuLIn9nN8XaM={Y?B1&0(fcuYu>)i8%uZyC4=rEmU9ZPowBY+2+<%?cVokhWvPVE znM75MO@t6O4DZ|{1wyx;h0Dfi@VvBwe-H|;&sBcM&Br1(y|%$b6E{udvOw*T4A%L2fF)(pg-SKO`MNt%}&5*o3&^h#53yzekjb+ z_~}{Iwp2d-?^j5)@G&LzRlf(t?pCTIFw<-^nyq|Bq9eo(vBS4Pm;bu#d0*o)+Pa!VL8 zi+dJ%1oAUhhFx=-U-nF=2-{gY*DL2(+LnQebF#56mVr<+cloubL&2n}%|xX#-Y~w- z1L?3K5(ab2aZH6ogH#Zk%qJJ5Fd~s=YmuAn#!;k+gCARc^>#(i4U1JoqtM{`d=FRN z@2Sr!*4W+ur9YYyxo0LsS{9E9K!^WFiRvE>B9+d)CST;`FPDJ@^61%>(n_}R!m0=+ z(e$~ez79TdaFXmZIC>j=s^7%?{alTD-Db;KDxOjx+uAwG82otAJcWDC8esVmV{6=4 zdy(Am0^Iw@i}y?G@Q*1<)%U5Ug{+ItWDn`x#235t2{W;L_t#l-JaH?0r_A|5W&Xy~ zovsAFO+FpEV0iuI(BIb5q;85#&VOyApwj<9L3%XKhfCe>%LWT{Z9|i6ba7IFDdz*@ zGR~)Ds*pk&0c;lj@$?1fvN_8Lim)ZyzDs0eVMf>mm)%C^vj3_B{H)#E&qxeuo@Mod zmUU((*a!JhH%}g&d)M@?z)0JuUinEPd-58g{y+!pCviE$T03NV#$|5qRI-qGe#_$V z#=yyT9~E0eDlJQE{Nz!Cy_qDOI2?apTABa{|du{c(w2k$>Zt8!1)v> ziuQj#4}9sblR4nyr7}Sw&NDmnj{Lk2E)<1g*~}E3ch@JU>>-PbdLqBS627CN}kln2aa%%&T<~46v z?EZ{fsu0xZ|0{LtdWa<{L^@DCU6C(4+MJj4)>Fw|`Gr*CyiaKek z{9O!DW`)@ipZYPr?nTszBfpF-rC5P&xp&rr%|)9rBsxi9p4+iQbJ5*xL&t!Ve~c*r zrY^+tr8PRXzj_=#fm05uQf>wv7}##jmS=-hM@Kxigp!R-F8MBsj=ejJ%30GG&N<_2 z90qfl+Th9?0tZzX$!qU!a_laj9CX=<2t(~;SO^Ux-@KTcl|;W3^y@O#L83n9i<$Fz zz^WFv>%2EpN4>s?&a4CZY*etIXhHIKEuet^TP&MM#Z9^StZ7)toF50iXzn2@;$LOB zKLq+eTH&P{cxMfR(d+NL50fk7&XL$ZmB_!07bx+T?mTQTy3n>6ze#XQtMZn92Y2Y0 zOwDbVC?sfD^vT2sP_9uu&KxeA!P}V8JhoFU1B4Lw&e*IXJz|H~d<&GZqoNYwlf-H& zTrefpfJIpyGZQeKDho||OWRynnc4>Z;|Ql&n{i^RYFrTjQaiyz;V~=M^WIR;ZqnVK&R{dU3EVk9^GuQ2$5Wzfe3iPjiAz+EC!JHL!)^)2Q--V27 z{Z)Gw-3n)h3Aq{9wv}M57Z&qobSj5kn+L+`AY(_e8_*|zshoNkqLCEs8u{s?m}lj* zfJ0|R<#2dWnyx<>ltJ0KKit;wfZ4faZ;$E8811bC6#atvQuSZS@2$9O{86mJyS53Q zc6pyTx?25jXZV{YC%qV})0Y5>-EFis0b4(#Qr@Om*|mzu`j<>gQ5aoZC}Mw}xg=IV zPi0s?MQ5I_6$kQ>m+9jK4#~E6DXZ+}hnigVr{4OFZAA0wU zGm1S(+#UAIz}A>d{OdeBwa|7qTYw|G%3tdpOBGyvXqH2uajy4ZX5AftMQ_zx_oC)m zS-2{uBO8UvjFXQb_LslJgj1MmWIP^%;wW%aU_M~|^0UN&jkd4K&#v_uDZv5olzjc^ z@;TvSOGs|$4z7-tBkhC{3k$z_xBj)&1V12KswZV5Yy6wz*OdQ{vG)L{^8f$G5s`{Y zCCMmdWR{gp(vp$vU9wm99w(K^in3SPdmN7KkRtQQI*x-Qdz@oC#&MkAjl}!?`FubB 
z|8=gb>xBEhU$5uuxyR#PwFO?j_T!{8Ixk5f);2o|0`o>fnx3nRto>lkwW(-jz=~{| z;%Z%Ii6zv;cm&4RT`CGo{^l!V6@5=!-P3(X{*i8=Or?5d(@%H%)qfnrLvCrgBR}k- zY}3i98mL+;U_5h4PP<^$1zehK%+08bKPjgjH z4!~0AYB+z0-q&`X{d#`wasF+nyw5GTtnB@!hd;=9wBuexDfEqa;?PWnb6&r;%We)a zDCa$`{ei9TEK;8X>-3(z<|LPmAS2#MO*Td9ChD}*#@H%uwiH`B4Rt(aFmtN4%nEOX zkYoGc5u$??8Gy-O*J-Kh$36u6Okq+6??RLBdqjm$%Mg@QZur$dK#PrIfOAa*qL7kW zCf&Oqc#4EsBG1uEC1P-sfgz>?l~%)1kdkRT^+gx?3p$=?T*4RYG)FV!m488t#ooj8#wCR*IxsV- z2gXAIf+f>$!;f?S;Q6?ULkraiLJBEk7_Zn2)wZj2g+J?@+5$Ccl@~cvz`dRRi_{%BA(R3I45h%ETs#W=$LR zV-)hviko+PNKuqdLZzl?-z>Dtk*(>Eh5h`x-okLg#f(TAGzj%R#&{3XSP1RkhxUrKsSUwMhvQbCEkEp7B zyBenL8+iCbr)$zcwZw|z-LBCP$RGmW{V(=nuhpeamIUFj3b!ij0FGD&p;-DZJ|rj= zoZCptjqmH?juhhTa`wO)a(bxPY~xvTa2~B2iR2<$!{lhZ7B{zh(3&u+OJ4jZw|j!K zc0#3&mFLv>>qxR0uu!>y=GNylgZ?W#AMD*#o%1Bk7UsNXrn?G&;{94A_9VBv>-$yP zfeVYCHK%uYq^{)hQ$KW?l0x7L$Y**HHWECuA48i0Ey4X)P*XUw>u!9Bp;JO+6}@cj3(9c> zHVgi6|GutYc{;^*E%413mssN4)R_f*qahVYulNnM^6KVxY9_SQi1Ti5;p z0libdwH=7aFZR}dSukgFD`f#uiziqm4I-zkj%}BxE%aM5c~pNKr%j%Q-E&WEpL-5r zf%H?UE;@O5jtbW#&+z%OIsDAMv886N)I>d_qqkq3^br zKL%MY(A<)e_bD64`T0a*(wy0U@+$vM8e15cTYnc6a!Z(S1?u=)Yhd*A&5v!rqvp-W z+@r~#BP;cdaCuBC-qrTbO*MS5ETD?pqHFPJqX)YqaPOLY_i-U!TKN$)Bti z(1l^Lc`7ABV8AZKI|0)thZ)+^kXI@!>FH`1s45ASfEAtBvY_*ULVLa9YkVNPyj_9a z`RbOl)^z`iZlOAKX1b3<$e&He{c9Q<@`^qNgbKdx-#mua%!?DNT%Ey|8WF>kiZ;25 z$Ll9O`W-%`)o^V(uU7k^*CMQUOk)~^v0}1}{X!L0fz=`I0x!}Rik&2o;bKw)!%1dy zYeZv#xhegMo=l~;s@*-s6DB9rGb#;=ywoapcHJ!?4b>v4v`}hS)8DvZCQl6>C!Vgw zu-Z1c356cZYn!`bT7ZhcbSdG%x-)j42T-paoO0SHT7@8jLLHwYsY(UhpsPlK86xVi zP<}~x8pKDOrAl27BIPp#X;sdeE07bTpVRcM3?^I{WoB-G}s?tsI z?yi`=Nz2;24|GZ7o7zZWEvfnGq#>XWCMp_J_F=}9b#_Z~#sRXFXfkuh17j2_Jln^g ztGK?+gHMYt8km~Ak2D1^0wv08yCqL@qSti3G6G6Fa3=e!ll2q0)@*T6_4lrB?oHFW zb#ek1Zl}oKt8X3bNw^{f*N4^^1$!%FlAlw5gE)D8!y@7Rm7yhY$#>07F%n#;2-}PX zt2l{&D{FX-9N%bde`@=qf+WcEDJeqhr!iR@{33!+wdy|L04kGR)hphtp4tlaPN$&qg zX7)gR|F?SsPAlS!=3cw|5jSgZ?f>o1|2Vrx_qKQU!>D3+RVEQYoe*qp$*-|Xtzx2r z|0X?$JxqTe{I@aZZ{h%;!4a=P;0thHh4QB{=l+Q1CpKxz|LY(QcKv6q|6Ji8+Wu*! 
z637Sc8(f9CByu3~z>-QY8hZWf9u&-n#f78C`G1{c5B}7NSNS-F??nG1z^sAq*?*Ut zkE)=596b!3+=fCKk{EY-C;9vSQe_7se zrYo(9tF#A!K=6xjYZZ^j-ni8~ORB2WW9Fj^-mFqh_?pgpiM?aCp_@n}EHM_8jv~L) zx8wpvA44DeF@KC^>TyI;UT6`tAjAW^QfMMl@RE_@$weUQDn4Eir&32flDIu0Y1df> zH)8u)p*w$BNoZ+>HRH^;TXxN}g?n3X%O$dWKmt0gPvB`Yr4*><4Q{`?hYo zzSM?aMN)&7#Kof=741dRcIoI@@9po%flD(`ZlLB|m!ZFxpc63QNdHY^U{t z7&UrnuV()0+#)mJh$wMHL4 zBb$?N>Db1-x|Q$L!`>a5LxSC6fGX$|Cut-7Ux$Anmu-^71RR~x_Rty{fo(f~a0xpz z-^JBI1cI&+cJ_oT57B))Yf_3=`N8~_95Qu2nOFhu&MQ?a(c}d|z9&lf zLl0>@We|P@GZnPcPI6558ZAJf+3&f@n#59;v31^<6kC8`M^2xc8y=>6J0Ke`4(#9fS3Z zJ5J9esi|o95llbw88RJwGMl`7Z?%tm)49wGY;IdD&jmOXS|p@|G|eYW*?A@;B-E?G zy3*uB&r*|GFauI`I!sIMIkD8Vn=+Vz$dBn%6LrvA`t;1-ApS)0FcbU2U~*<|raOv> zRAR<4tQl`xFLLmtX)>rC@Cayz1kA;DxW^TT+e)oXynJI*R*Bo&dEcz6$iy_tZ7cs( zy_S~#_U{Pk(DFMmE4M?dY~s#+7eDJdRxd?UfOq?4XWKI_P%+d4cam;A^Y`hMG|}!I z;|pv_1`6JoDTBmatpQD>@WnbO-W0DdGMrk{geFq14_z9%)e?IGXddRbP(RNGFtziI z*Rc{5jotTN1xYPXM!K=Tce5{47&Cd0}7jm#;>2}iX>JDJVn{G+u5*~$M zHeOC6X6oajE`SF8^ov^IY`+4KRvo(I`b+^7z*N|J!XoiXXEkTx#2vT74;dCA3AOZc zzIK1~R-M(^#T96|uc?#)*DqhDO8GY_bo+EFr8Nvbc;;Wy{mc6RA-I(T{;Zd|^`7Ae zDUZYqL|!Wz$o+bgUOKURCuwd{@qR=^615_gE05jKpkemCJF!1bo*B8aaMy;6tYwoJ zc}#UX-xH8+TH^K#gY1rPiAcaozI3}0k^&Fk*~Mzlk@BLI`%?a^s^3c3yr(ByUfdY@ zb*L5%-;rrb931Web&t9hp!AFEQw#YLuiPx8!Q{oR-p^8QiR^P$>{DNntIBf3QV&+g z>pK#+uXH^-k>j~|{Mn6?v=byofIAX;Yf-6`R8&mN!@gor@ah|*KrMfAAzr3C`oM}K zP%*G$>Vf7;dIyZkoJX`4`>YOjRbo@lUc3|W>?8jK`TVu|%zw6M{kIODzmBFf%gIml z{b-l5m2uEbQUT%R#Jc_iM+P)-azi9FBfxO#+0qf+K!Co#ly>6&MY$5+>Xt#5f$mIcvB_nk?f!} z`}yfv;yp+IW?5$hK)qdmPM60;K|460&g6A$ci_|gF{RygB&&j&K^=0(#a52FWKEaI zCUcNQpZ{OS`*-oll_WGiF4mThykt43Ut)bt0MQ;O{&Yjy=wN$5>%rH{)ZNDtBsR&M z0#7+&p%ql#XOk{d$Mgzj0xtOJ%HOkg{oW%>b(4VC_O@|Krkzo_*mJ~nS1}LfRfK8M zeBvkjd$$l$;DozZ3MA#c^<(YLw(Ank1zPM04Or7R>wks#`}AU$H@sFTg4wq<;+fz5 zVkI_l#mD#7%N$JPwEtI^Xrr%8qo>lw5N{JVs^;$A{S*r5SMkO=sR@b=YQ(Og*?Q=- zRyZ-XyV|>k;H)B`!#tOkXix7OnviW*f0CYx89JWJa6qc)(c8aiCnYuV?*%wQr*8AAx=hDcV1G2Vx*k7;gLc$0TEBD#)$kjf-cs~O)TNcSCSOZuU 
zq6*U*ySZzTmzauWDVj3tjD8pY(3X5vEpC`9trmD7VgOyD?TDpp*FLPIUU^TblQ<^R z!SUe1?(o?^EvcCy@w|%+qj=XAc_e5-%4ozIsGT_UK&_nD6q~izafCHBVLGz zfbyvNsDILU75_m=|yHiLlS*j~uD7MRj%sp68{ zX)78rA&FbL{S*j6BS>BM2ezaoC|#4mNyzj|$o}>;m7bIqt}{RscF}75vBIr(YQ)JX z?!INv*aV3AvoUoMF;vG)*mNMa1C48aC1o1K7Yt`_`prd0~^lW(4kQkLvOP0C{#};OaA~g3LNG`BQ~ri z9k#+lzVja`ArZ3Z|6}S27~h2A;t4~gs9+w0XVj~UzM-B@?1!$Pbn-8EHda7ab-4wLs z+E076Ez^K&u>xqzy+SzYSdMdqOEZyF3rp8!si{7qJ-Kdfq^?YqX}3rLhRnFte`lu`$H zbY)vyOi4uU$RYqXwetYDKC(kv!sKIB_et*KfKNC-F{ck;`k?1$|<|RpiYxgH{v=s$f?}KV1uB{Tagqk^h*caGF$;?8rj* z$@Gt7IgX$q~ElUXTX(KBYbR zV@?;gt|AxiP1Rp^5`Rl~>2l7I5~^XPZ;~(&jJvwmyHaVKr}y9M$2Ge|eEc7o=DijD zTd5O)R!QFO%*T#Qu15CUONNl-9*hdlDj5D+GG(RWc=M_){Dyq~qg-j3XB1~f2G7S8 zRX++s$?z-%9w!x!nv%C9HAy3i*k=R7$dh|YbA?m{Nj(@ag#yQi%>41Uq*+ezssG*V z!M6YE#9e|i7se6Qw^!LM?olOKi!_=X_#vMxiM2mtnyeDRY7{_ZYYIy8W9$2pTm+09 z*q|{QAAcdqC2o~}^SixG<#<8s>_MKyml@zi1k0=boX8uBeI4}`Up2o{ksaY*fm$N| zLmv3^?J%hfR@xP-i%dT07a7(Y5#$X)2aX7c^`fCgFYO|;sZ>-4IDLh}z9Xf3M2_^1 z@Wyd2mks$yu8KxK&hV{f+D{*Q-u>x&-aG!sXUr;D0Zr5eSV5YpiK*Y)g$X}p|J8yi z?Lf(QXygAuDCus>H)qb--UMJ{NQH$l>839Z_oH@k(ZT$6d`^(=<~(N?3vtJV=T>cs z;?du{J0)4oR~dtT%g{ZV7(eR$N_!JaB}t0I65{jL+{QH4OQX0buAc}y=ng%+_kZxs z?y8a#$tdwdYhTA;Pi}c;|8*RItE9uFW0|K?A>+b-^5-{iEq~8>?{?-LxsmFN8~k15 zoJUL(Fx$VITG`$fHNSHe*6S59u)hL6o;sTUA0lOfIj6Df9-wN3e>HNM+#HiKYZ6)-IZ6HPJ@<7QK$YtIo;c9= z7_#l{0T_6!@5`z=xHiHB^Ld=C{w_-N%5_?3tFk@vHlas_CoE{A-gkw7puKHHV2+ zpg6)(@Tu;3fp%p71t}}GE+EAC$9XB!5UU?X00+O^=cQy)(D#uOujAx9Q}@ylf3D;p z9iWnU>@D-9JR1n!{; zFi@|bFgf$&Dn<>h1N9}gBe_SJD+pJDRZY0K{&oV z&U8llVvASaM4hHGGd|I7TKx!^K?)Y%p(D@jCSHsw=sw+0c=+@)>}t&E34l}o1goj8 zQussjD&ZdzgCGGm4H%)vvusOu?>##Gvp{K=_zqR74ypl6!`eeN6|oU1wny~mte$Vx z9XRgd3r`@E#X-Ea?|d*T8p=ZxCHo6LyxVeI8Mdx012A+rQaCTnjLEpWyf0}?%EP2R zO(iVb=TteAD?!1|v{ZV@vUx#dAajz2V_O-&y{(y$Nv~^fSF5cfYaKdD0$(!owUVJ7 z>@h#z3jsvn-}=B{umE5OX?3W4oI&22vasU3E0P!SMZd-rXg@}a|Nih)5ydnMuhHS@ zb#(e`7>Y%sEZR?U;ZZ z{oLgGuRq5=pYzd^MzfHBRSYu?Y%k4rN9uNlrX&kr{&o(SjOk;R=o?5#3)(TkZv)dp 
z1sADFSx*3HEmpScB?3ho3w0d{t_}Po6b~DYx!?$@QcFr1;4+*eG?JtEN+x4SCq%B;I!tZq~-G> ztD$Th(<}{M?qxNmSzm~fHgCR|TGgAwFGWsjFB?C)Uc+*kp>=&d_|v7saa(D>lY82T z5_cUX$CGtplj8i1LQG>42hWgO?~iNG>m-#ma70OuLgdlM+_0eXh2pQZXR?BPPm~}k z%Gf+ate2ow`jw6|w?=<;lI+MAyh|kVQ)1&|yl$tjYfYVh2mQ&|PVRJ=DU`~__v!6> zyJQ8~lbP)r5`ie+Bp<*%xnr&}wbbBz?)2uN)(K|i>nWf;y*A53ruV`q$&>>YUOD~r zX=et&yWJ8s?ssR^Rjr&s8BI?LjiyN$E=MR`BmqBT&hSx)-e{#+ZdT;ryLp3-bs_^D zW3hf}ddZowXk9gcy~~Dp;#uq51l|xDlKXulIJ+_un4SAc@xZ|#Ce7xpyL?%Fq)_9A!OQnd5PE0u zk-kRIBNAXb;?3*YuMcsGE&1cEX)zClPhqnkh@T02;WPEwvbQ9;+`VxC7?u+9QC`Ui z{&Awk z`?~MY$tO^_E`DaCZj7bqST>~|STN|o$Rf1 za-z2Ta(h`55OCfiSbH8Y!_-qO+koBD(AiqeJf3(?_Cc}*^Uta5$q(Ttwi{ z2gzSZui*tzuHY3XkH56o-CC?pawK>KTBWFMprS0N;ys*VDS6qQqchi!( zad|nQC!VBjA5ah!b9Qk-gOH!_aA#-^)sVPd>LcY58}qBE&*J39+jj$03e}1xqhx{E z5dTZRXp_Rz0SbI2U%f%UNF{IA3IiqpOm&Hm_!AVW zGR#7EzWK#TWPYWbOKBCQT{OI@#CS*Wtn>s(Epf{Hoj+{$54+cqZi^wHNfHX42D@bw zP80KUK>VB(bBD%;>+{2B>tukW;8}0l`tJN*t%Cko`!*CerutKs`H}Bm*Ba<(-&5dj zZe4Dk6AxrU|LB;-Fros$ZPEpO@zy*4D3qg9i6?Lsbv1(pS=EaGaIug;XCQVk3ez=( zUc+J5ICIg>k&L@NrWa-$=T;UX!y~=IDS_i2CWEZby+SsvCx|NI;;RiU829nSOfsZ* z`=x4W6(IXmnJJrR+JGxO7YRxUY#1#_hBi$MKN5-1CNAKqwSPw(WuR zOOQ~Ya>W98DP7YJwqOT)CA$@htKaoA+4Fha)@}f!j%<~K``J31F zBYPm+>)8t(+2y02+o&Go0+>;56$XWwxgF+*nM_~lG*XpSen(Kqhg5z_Rs0UtIu;-l zVeZZYsCuA3Dl4x>o(bsS{1WIGo(ABE<$q{qISG1qhC;;Y`YFePH=f0N$7}M%H$lp) zqr?BAmPL_3QKFQcC)%!Yse?CBW7AR!2-rb(e|F`XaKaILHQGT0y!|C zav*W_&3AfvL1-O1?hCo70+s#pdSdp~TW3cmh|M$B^zxf$@ESv3vt~;4b7&$Gy_ONX zHhlln$&QmATu1aQR0pg_Ke)RH ztLL8Y=Og~zevkl!Qf+~BJfic#;1`Kt<8a2#aZr2Ww$qJCOuYt8^RZ&q*CmXn1};;p zV|`0x6AN`Cp55qEV%isafy%n^h5Lls_=k+w@gwWzsEOqTM8G(z=5urzxu{1hg>(MB zq}8W-AWs*&{Q20Kn#+JG){rf|=@r>yj#eY5Qldy?H$N?KG_wRMCcDU^bA5hPSBQmy zx$`>7lihJzfK3}|C}f_7bu{xYTY`rwuOCT})p5}nRvx-Kd*7Vc&){6(Ktu?p@^1+X zys#*30_9I37xY+O|AD<#QA@6>K=lGUk`pjatOZOaM=IHTJLqC#e^v)IzZqRhIXlg%}JiV&`WUcd8{n}mDC&^kbW-Ppqr4V>cWc2r- zVx>n4-<&-A_Dslce++AX-4o)yc;1uP9oSOkg-f$=4xw_l$kE7;ed-D4HKY&ld-e~C zPXT!S*8%$jJ$=&5D3qSV`XpUH^99{B^4lKLU^Z&a<@Gej1)vmpX9B2xJ)9s`V|+Q0 
z^=Ze<@1O#fSyPWk(UZ$rK=0I^3`%SKw;dD*ZZfGpx ze!9PJfpW^l-?Hut|I0nnpCdKf3obkFkZg8zJ5*fJps|csXny#|st|!_zC<-{9h!riPEQ6eP?Xs4;Zk7mox6uMYa0MG^~uzL6>rvbY>6Th4Q>w>OsvNjSgFLh|D=`} zSV`2EsBgq|)qCgN3XK}#Pj2#;7c^yU+y&sUohFg^)%4tG)>$5z$h&H7WRxIAV|3Fc zVGK8ZE)H%7DM*-KDfXJ^sVB}G1U$DE40HfvS)|Y?DFuY90=NXPbkQbmSP8734nnA{ zTSuIh7^sT9Xk=6ZE4P(3D6noY75s*cvlwl5zry@-x`?0pcL%ZIActk z@+G|HB@LuH1}-T&`Ez@@tKNXqz@xnMiy=^+G7&8+SVkOoM}ZY{I? zTEd1)wDsJB^VvwV2}nWjxTj#!hXvc+C=#tZZGew{KGW-6{n#O%#{?QDeq8E4^45#y z@oIYmk0ft282|Eq2yP-I2}@B-}XhEVcz?;foHXl87tavhF z*Su24UrVmgd^Ee4%sI~~jc=jYSckAd`2yIeow2Hq7f3~;zN+JeKAh|5JtiEPE$RIh z9+vgW$5BIyq)5t)SN!Wdfi=yWleAe z%o3u{-&UY&w4;=Tj1Y{~f;}0br`Kmf%vAlZb}3z0z2H&%ii^yPe|#1GTL@F)F5BRx zy#S(1qjD|RFg?9)>2V!jhbxt|O}d=>n>augzg7Ji&mycrD75n&)YSuK}*JwhFn@ul!BSC zi+xQr+^1+~_+zJYnHBXk9a7kR%>g;TJ|OuSEHY@dQhEC!)3O^!S5r!St<>$W3Xptq zs~yxTQSy*z>=)tQLDpv(B>-9na+@H4-E5wIj9h>qaVOLJ_>-nIca*rva8HJA+ic3< zg7mC#3vjQbp(=%S?ahMR`K&K@X%!ZwE#_84<*B_J{jB$dC#R_lm1(`er!DUk^tTnc5f9JWkK zKp`MCsIrJW0@7&i)NCp2+l8ws&<)e?jn1KImgu!?d$`yyZ%>2T!L6a0CS1^}EYgOl zbS;!2#cTLi=JC936Qu31ZnIIz+WIS;VrAP_t!;lO*jaaNbIY!;ana7@=Odq0doe}b z8Qj44@>Va#r1hx0g#@e{U9r~)!Kx!^6@y8l)ghqTMNB7lhE}`m;ljU$CzsC6Ipnls zD%n#E5OY`gU3c!e6~UZyec+3Em8eFKQtOPkT&ND_juzNg8}$y-kyL{2T%4xbbm=kZ zvdXBG%3qsH6DnXT&+{o8*jZ9Q3iB3uy-a{Yye2PlQb+DwD#O3Stx_R<=GPbakbOR- z?qzWMs`FJ$X;}tyE4}5nHa80xJa})T#5{dkjaOO3OgF2U__@4VLN&&kW^h)h?e{Uf z(SxB$vidYJ zzJpX0wt*e8CbD+UpEoaSTN1uXx-;)S`1Nce+9f@h?@6O(kaR6m3ALE9)-AoKw1c&VL*&6z@-lBYPMpsuQ%UNP19^QG!@GwIMb*vq<>dFf`mLoJN$zZ|Eqy#|t=tLK zu6Y}o&V7}AlLs+Y3PASO{zjGau#*t2-#uXWqAZ9k5|fFqS>DfK3Ja<#(H6YS#kkGOsp4CXDqkija&Sw(h46@V(B01wE!JX^mR2Vk1A3)$jQx*eG}rA=!pnm1 zIafx^__u#<6iw#Ixaan^nwtHXE%XT~8XFUCUrH>md(5d++Q0swZ>!v9d&53l(4zE{ ztrmbRg}mi1TLGHg%`8FY4FL4yvW3)VR%sE?2sEa|SV>IrN=8urs&gE9V{W?IZF|pQUWm}!Nx*;_6k9vNfjkzdm)G`$dz2I3=A}5nV+4H&dh{Za`YU@^Twd?E z$dxPi)$B*o*3<^We<5@0ySWs&{5evn6{{b)=flSaBE2Psu%U5QKDhAUA!sWrQoFBY z;wc2b(=;$>gZt@F-2+Q|-Ke{*`CBWIDtE?tNl=}=y=QZF{I@@V#E5dD9oSsJIHSFc 
z!YC8?7tEWY;0Te1A?Ki-Y6I0DxK>Q(9D4JNJdJ$Ln$x89u>s3#Ec0%Q7dPc@url#p z9Ju2y8_H@sR-Rrp(Y|WuQ&zuUo7)XpqMs70>xa|cP!HWvC=0ynZ?}kv_`6M*q6P$!x}xVYA(s zmg&csfrXP~!8fQHwtUCrM;A5525~b$Y+Vf}G@6DlEWrS%u>SQQLJ3_5xqv2x zzVFAnF0*3PGQJ7`&Dr9%XfVkev^sePJ@tu!n%6LM zF3ybe-OX|(19`I>E(0mft%GZm6x4x3Yqx)+qS0TRmWc){x&yD_TT7}fj7e+w(&ruw zEV;=%+40kYEc6Z8wM)|GILJsrWOzzd(e}yL^hEk|X#8z~JeaFV*@x4~=RRN-`0Z;& z-O%2|0f2_!rW9>m{ve^hb!*GOzigIvXS4MYZ<%B|Pt0bDsGY^+_&WDT#kg(nZKuXs ze}mgQl_T9xnAT5Hg;5@cWNYkUTLijUyTzbKl`GjFoMhoFYO01S#S^g<`|SJep#Rb& z$roT+cLMO{+)vV`Qq(bC7Hj-t5ZMSHJkTsqUZje*%C2VNz|x*`6H9qDNfybY7rXI$ zY`%T;)-jSji2VJuXYw6T!#SQO=kGoghx6K;{?j}02aW-($s9!X`8!FQdGFMVlNRwJ zg9ZGrSGjepn~4+qoL|J`1m$0u?bTtuy)01{WU=sOxyBlJ!B27uO^XYRs?u{kNN=Ih zPuQl^<$>l+7ZHnc1)ozA*>9fJ7r7GI2xQ0Ikocz3EdSoIMcJTQDORz`uah95Fx2Dp zS`k1oD`AfB3lf(WbP_h_ye|WPl85a2%E`4LE&o{?drHKs>{D{x3v%UF>u^4Z#OSi$ zss*miySt@j6F-~y^ghAbD^=ghz&+`jmw=PsMxX~|d{sWKVQln6PSTvqu~L|2+Pq_2 zk@YYB{+!nYjoUy<30&dfYdiY(n(E-#TZn?b#=VFiA6MCO7F_%w8R3ynSW|e1)~+hl z(%999psQxjIc(*sahWESPPPP*(XY4g-v7ODGX1xt1gw_VCqFrLq5id(D;d&MisrTv zEy+@uT$&F$tsl0m$>yWn+V60SKx#~rJyk<*eK$@8Y_vZ1G^=w-vhSOx>126d#`?h2 zi)7U|OU^OuT|QdwX`@V~gFH(t?APEOCr#0wPA{T&LEKWrfH75Am6Vp-mx(1=>#ffW zt^;E?2TVn0zyj(+*w)k!?WvX04;ikkH%-<{9r9Q(OD8ZX1#;_K+tj_8J&{P^P)Q^D4Zr|kuoRdNQi1XuD{Wy9o`M=&n|KF*&Z}8mXS{tG6;PIteJvt(>&8@|#f)7gk0B zwm&aNZapG6tn~G(MT0YxOcr{lFm7c(jtyzsZ%MVb(OzH9I5RWK-bK$cb6%eCh3_I7 zgV*jhZ@sF6{l%6(p?)FY#u>yzBI z!|m$Le%GN!FlecO6^Gl2twD(?%d*W)({I=&7wr@o`3c^S0F-_dY@u%gEAHgh))QpD z;8MEIyt06lT$bg8wyV_QeVQY6dM%x@D|_tY&WoHy!NOyM%;0w8JE@d3n=P#?R6oS zfQfl?4x1&Xt)d@g2D+QFZZaOQ8YLzlb{0a;7s-T_hqgbTwXP!y227z%lV}vRY5&UG zNY9<3b~J7r@b~9}xZ^)nZ2HC8;x^%2oJJg|;s#3mrST?Lm%_4J^2&olx<)s{)m}@t zhg#~uXMPEFlxg2A*_iqUYu&CAUJ`?2OEoUcW)1@tFi)qcOi3g46|M=L?efy*n;TQu z1?&}%>Jp=7BP-+;LjXov|@ zH1+to_vMw+$E>;E9dsluk&IshxLL=&1Co^~@}?$Lt6!|jBrRtogy2)59cm9Kw&F{Z zAf!+h`+`KlyyVw(vNepk=`5eN`!7Bs6bYsX(XW7nLfN!Me@_0gc-Wd7Lr#Z$w)p}7 z75%PX|C-z4A{+llM)NtFqE$Zm2il??T{K*!Itc#UJo3%{?h<30p07S}vB)BIi9Tmg 
zn%OlunbwRxdLO9KS4Ff#6t-wP3gK;2)#3WVKhef(dcvl_+Y-5h0^d1>JMb(orB zF_a$jZ8qJK5Q3L$*~N(A{=jTTGtD{c+ve^+Y6mSC3mC=TIa-EzLALJ{!#kd~3pcD! z&b3LJS}uR~(3*Z~$@;CZb$bvFBC%03F`7h$YRWi#387HiOd2_QL6PvO4Y8Q4TGfhsa zB)$TulUVny&!xm~_b()^_!SQU5^)9j&R-MZ(!lFzQ7-_eLwq(>T|9pDxY zt46J!geL=A`X&rd#D;Z-zb}7%^;Vm8__~BGW2aFb8Xq}&(Zy58Ic_p$<%9L9g|_)E z@AB`7FowkgqH||SIBCwro`MXJwV8v)&fW*Wb+KLTJz_1_f zC4Jn~#rWmc0Dr?-=Jbq++xhK5w78hQskH~%wppsoe_-Xx#OixyFLotl?0H0yV6l;< z1mn8pZJ)h%m!2}Bl-)+0S$$^g0v}dnHM5dw!-^pp#qet(Jesu-5iHIZkZcmg{HMoM zx`@R4>3M2L0Ak`i4bUUc`6EAPX%04cY5+Yr)ZG^BD;UKY#!+@`03Ia`^e2>`WpEB8 z)F|nmaHBZSTe$g7>_XJeRt){UZ~K$CzFx^%>p%&=b`&aS4w4oHHRykSwUd}HQhJ!P zJ(HPMCwYQ1r;WjaMpP++AvoJW#-ph(xI|)nHPxzY2lL!gfIBQZMW?L9Wo9tm(eZ9r zRP{n+!dCIkm5ui4%*cLsCQBpHDge`j_6Mn;QmzG1M}^qoBSslN?(ieij4@NwQFcEP z%Yan>jAl**%Z#{>6t~=LT%^XApj`xdA*`oM?~bL-J;s+Ci2!hEZFA0=^e~y_S{&Af zfi@L<1HoGC)2cb?GQ%d6*enwSN8o5p;hF7y4%wdxH?ll9aRFLPh7)mI3OZKR1R-wl zeY)cgK$-ZOv|T@B?d=$-T&d8SkOklw(FuP7&{?bqVndDI_PP}lr9cL)LPEpZPJ`Z~ z#x{c2y4~!fmkDp2`zZu~J!?M+h1+l6ZafSFd~FLHnWbcxi;xK!)zH#wTu9i<}iU;qg)QXmX4vMCIq24{^+ zY~PPJg;cfuSjQ;EuB?X-0Qo*tUfYzdQc-JZii1<~nwL2?e&&APx`I0PTjy7Q&S{;# zhJHwib=hm^%lR&C5rf}VrP|n8wUqAk8SrdVX}x4kXGx+V zhfOoWbQ@(wSo~nK?Px(J#a1{rL@-%U%fJ{B_)DvwqTsEY*$jN?UaV{}JI$d{0J0V5 zysPz8yzs#KZ~D*OOMR?M{V$Bw^k$dJ4SG*>i~R9Y1pJA>(RLu9meIJ=UW%6SeRH+v z_h|jyxcqAN-k;(W&3+%n`8d>$-Td33?Y2AZ1LLLAQs2->^R6AGeYxZiC9bEUdJJ^t zo138K=^UVV;;r=5gxYD?Q|NR77z{QQYLBW=S2wipj3iITJGQGWueO`|+<(@s1lKh; z?(R0=sgk*Vs-JxS6YIqE9=8*>#UQdY43)*>9ItJzTNy(uX7UibWufz8ayS`Wv?tW? 
zgY`XD67W4=^MfhApcB`J=vJ8cg|6u=X`ab*46F&#xqtt`>dz2|Rm%&mT+@A_ckh#H zmu`$yC67V8m2ui`&GKQpz&ORbxP0bf0e4>zEtg{S+<%WJS9p54CkMXH56yBIUnCRd zVuhk}EAotxH_V|GB4SFRdav@|sg3;lm3OlJ_Y7Hf+D&JEorOWGsM`?kNF=TxUGPc{ z?M(0qPWQXzst)H`vSTqg8|UDGVq}SbdlmrA3C=h^I!W)V<=N4aK}OI?ysO-MtjwCe zx-g`ydDCReo*Lp1nmS&;f!$QBrpHz%`zTt7B@zafrp%T47CrzVJ2c&t=`USh6*?giEEO?dRnLq6>UFUpz{bJ zOnLQrrx~EH)^2ZW#q|XkYf3$G@mCkf zcF22-KKCffDOauQ2MwbY&6?SF2u$8u$_IEET7>?mwJuTa9Iq9j{4G1~)l9)~$(Va) zs0`jh*hWS~3)^xE~2xuO$N1TqCOd_K23PfkDYnR9CH6#2W;(XRCr{l)6VkDyfK-u)3 zTior#0lH8m7O^WY4{`x>sItWEDwk`Ez?2K>m1*fa8R%&qAiNkooY9L%fIh zq^baPO$H=9xZHJ0Qk#FaxVMR{r};iU^2O@_iGD6h_qcP66%6jOcb~?hIYld-p2r5# zI00z%gZjKB0#42C=D%9gO)h#ejUwL!>1z~PpSW;Wu|WMBOVqCu-n89sM$Hk3(#nq8 zP@^pL4>-LXWtwBpa9V+ORli~47tAt|8)rU#bS1#OBqd1DAiNWM_Qd9e_WV}sbB->M zXSPaWzxciq3M)nAyCWaEx0SRl1=V{qm;)mOfjYIeO~C~}E_|QZ^bQM9<;xVvx4y9q zz_k_G;rsnDp>~^}f6}8r6YT7}d~&TioYPS6F8ct8kxR*_rLrHA6X@wtmwbiZe83J` zULe?*51f1CuhbeJe|9Ff*_KUTr>eC}x8LY%tyCzJ5foF9k!m#BLEN}}O~Uf=P(Yc^ zlTuf)mIgRhZY3IymftUSFRU*z$P`_a+X%)7DFp3+t_+^S?C^Oxs^II8TmBuhn znjpv9GG354?xK&XD%C-K@E)$$6sWwhzNSiXC*1RgkDDbY(lQpYJC|I2bJvIYA~NL3 z-&`Lki=>)YgL&Y>smn~sXU@~tsmzOZL&}Qq@>WC1RmiFyTAfHX2k$bT;!zpkP?_<# zk~T-W^@V=<-9W`*e@>BHN|$GnV&|gb|I$LmWAb?MWr&AX053}cQMpCx@c%F zds+i1l0sj&M8q=eN!lWC+_m42xu=&>rihwpoA|v1zzj3WzuZ^Zrp~qdY${YRioAZ$ z-Oe%mez&4!eZ}QSAeaTTj5)zPW{mD>=UkJ4qKAdG+ArGM(Wo>iduHZHy*Ve(EhE#M z7rd^|aleSO`x+q6Ucf(g6fRI8YiKN0Qw;TR+C!>OnkHBNA6H);*W~`buLvTdf`q6D zSb(%h9U4Jex*O?6X*M=WMWh>PN$K9G5lVMA1L>X&*nlzkK6sAjbAG>dTlHXb(7($RB%L|Ggz)Z-KbQeI7rErG zkZdKkS!R{DtySF?np(|)Nh5>;I3p_2&QHy&1x~kqM}uw}`u%*Q3I=W)%zmU)JjX!2 zOF+s#TC?c6Nu+C=wxq1q?Qf?*aeDDuNa;!BK_KRWmo$Y$=4t)B5p^03=7 zD5Dbu4h24!Pi}2hy;S$2h+SnB>e!Icdkc|NqN18@SC4;Q>AzMg7PhbdoEbCvl-EZ-J=|L&uVDFZCL2#t&SJz~_Fty6;C77%TpGU4g ztzuiId@r}k@MC#>#+A4`R zm(Pu{zdN=xz{5b}XZ&a)iN8Xr>(2`5U?WbWI$jLx_kUAM08m3>hbVftpWpm)q!a>O z2!*w)0eLd|jAD{$52xQY-Qud7$=C{UbR9TG3!8V$(Ybp4ypM03JY(9F1+dVDv;z?V z!vet=WCDJ8+|@>s`N>qZ`AavFw&Th`1?S|ZwF7`qUvFk_eV^&B*Yq!*C^M9Z12wi> 
z6RX$;mme$^+n`YQsn2MY(2@?u^(83kp6kYhJI$GKDb#)o;A4Z_BXuLK6NUw4aNPvF znb7 zVY#j~aU?WXeJ-U`2BaR)G9J!}XZ>1aim}O7=e(v~HZMZGxlxmDo%lzEdm-Q&pJFQm z_!QQt|4K-u?F;`rmTr{aNWCP2dyG2!d}Gh_NET?-`=!mc#+?*b=HE^Qq>i*HF+ zhSik9#XvlHFNpjoc}w@pNmokOrL_|^o5d)0k;j71GK2HxMm1)9exxQn7(5-})3I|6 zyOKl(?tVg@pq&cMl)GC-_V+*DEirQJoX%FvEr6*3ctk+O7Fzm~cV;ujmrNnmlnVjU z)s^0p~_gfZjUXB)T5iDp&{ z$w5_X3q(R@H72DGHb=lgAos4frJR)$Ca{V5sx^CdeCx1^)L`)ONkLu(c*6D^2n?7t zP1h#F+TwvVk(29DK;M4T_A%>B1k_?ceC!+4P-O{I-n-6E!picgAA`$YvS?qUN)Et; z_rUSJs9ai|Sbk@jl1SQYEaT1R*M-;sD#r|iuk(i441-C<3Pe9qvq=>Sgj|dTS692^ zhq~?R2^~2tKfGjx8hm(Nv~UrTrn3=h2z^cr9D`qXq>&QIqNc>+QAi=I)bOj>$}M0|q`lNdz4L zfY|>>@1sicyW7Z2*PE0b%kx=cWM)$WwAr_{AkT~bW;_Br#AlMtTn*^6LZ_ieJ!liZ zwgJ!608VK)r7*t?e*RD#?Y@CM5rzr3+<^0!79}l+oZ3ftrlx z(_XV(*rCGpL^&JvFWb*--Z`j5?MC#ee7U!Cy?O4Afv64WR{~mmwNz`*^T0t(-LKLC z>CDbIC8wjC1gi?y^q`d)_8-x%U|T9NQpv$5_iRMuRL>&D#>wpFTf~Yn3&x7roA3>> znGRp-05N;M`2soTYf;MO6I`%~!tJTjgHv8Gv>>O>2*`OW`Y8tKUIn6lelvVzbfz(P zuMH)%pOyB72qxOx#n>2 z@J(JT@vG4vlXs=w`1c9S3VOh;$PA}_r1og+A`E`Kii{&R^8OW<*VUqX1|8dA7V!#N ztr6>#0)!2^7f8n7C4w`^)_fi?K8=kmqh_?If!0^TS_R5YMUYBsN}S(J9s?cH_Al8W z_`@i)WkcUW7_C z@Usq&!6F)`nU-2K?}1BjIrMOOreQgy1sS)u!=T-{lI#$Lr+ibQzE@H2h4ZALI@#hV zu?!JZ00uUk#a?_92vNbV{4_pvUcc7^-#)VJP-ZkX%&L~AuDvUHnndYpUpYUyx#LPt zYw939rn{=V#jEb2pJqAv&OJ*suE9e8m7i-D=$LkVv&IT%1s;~e(2tv{>lK`4+^c;X z*zGg9_Pr@44;!sfJfMU9BKD~~1>JGG%DX6!^;{#U1J628LsD<8@Moa~eE1gzb-m4Y zdX|^{d0YAbZZu~=GO2|61(AI;>yo*o=^elwfb?a5_581paB&`w95t$A6svhpXNtG~ z$D=17zgQQbZa>9mAl~YoqFOh>{Gt6sBG0=}lZJo=4p|vG1CTA~t_-AU-)`L=eR|Of z#dO(91AwT2GTQ|dQJ7puDIC_6|(h2UHT+^3j}X?xKnK)<32wt?p$9|_={D4 z@r0)dSh|j^N=Jpt)K(obPtY3X<$Teev+(AE$Ci&$^{A#JojK1fV{B_o?(l8#mpew; z1Jp?+#s%bxyn8~s6(H^$J zn|br42K}=Y+sC@y`ywlja(@zRd%yYl>XZuM36*?;fG@AS(WmNHSJx=hPGO7Z^ClS-#A(CHGR^3JrfAQYirjSMm1!r z;o;RBR1e}Rg`hN&O(E;p?L+EE>#ms7cQY6o*A?-!-r9Kd#cuS&43!(A z?4ii;!#Q4_!~xgW}cVyrgRfUdT1EaYl|2c(e2 z>|W2H8rv*U0GdrEz^x$v(saZ11GTacJ?Jdgy3{|;O-Y)5wvSX{60LJH-TXxs#23yH zqVf>X+kIO^0yA~Yzq<6%>I;22Etg 
z%5@xHz4cn;XHbJs68vf({q6*7VI)`CCR_K$(YDAA5b6-52dc`kHUQan=%+di45!lZJw1dp8&8g zybfz3g7nnY_5g9R>x(U7N~F!g8M5>>&Oyl=M(% zpf~_P5!|sZa!dtgl9zilS3g&kJxK0jxJxRMZlKIyJ3~L1r3ao>v48cyph>`fQQp(p zz-^W`Psiz7N;|E?*M-%!lm>mkAOY!N+4|2>QC9CqESNr?a2~)xD2Q^>?yvWG*}rn2 zIJ0@-i6}agbeM7w=+Sre`HZ&a*JCHnCuLKC3Csbhr{2Kn-nq}p%Uk!y-9?D%7YeLG zzrAe>{MQ{(1%lKNroxb`hBekFsj>mUI6i4nPuYI)rqSbjf?(EKt5lS(hp( zS{aLZwC>O*?$W)^0t7_L1Bx@&N0H9(_^8b%mz+g1)+6Vx3hXfrYfUNd06XR_Ahq+z zFE_XjBO;V(+maP;G?sG@E-!ByC8BGuuQ4+~&&N^u>4vbj=+{r1eEFL+-9SRk007@# ziBEG*ua}tNHCGvU|vbM#_Uijy`aneZG@#N?5~xVo)Iju zrUm%jQLY=-@3bcUTlKA|OqH*_rUtIsxjquuQ^(i7(M%6x`=)IS&NI8x-~;(0$A+(s z9Yy;YM~{|RhBqc?YImJ}D2$_$OPvzQ&?hR36U&hkVC3brKUEuKqdeh=h1eJpZ8<}5 z_l+2I;+av+B9_M?`cxFGy*G8fF@9y|udmip`NR<^WIn^CH_192EQuYlHfi6}RNAF**pjb)rEZY)8Cabs^{o3+ zbR-YuX)XQ`!l-8x={2H z3QYLKlt}_C(;R}=v;fzd8SvVZ8DD4HU3FJ zws1pT{rPO|SJ@mQDrq6cH{SyH`-kVEDL%6$&`C3K;Q32B;tkUr$tO&&qJfXknYnnp z-B(^+uknVw=smkzUvo!-(900b#S20W%c|;rap0Uo6^_y25uUpe_&9Y_QGh4BTZj^w<{;^0e-?J<;Rw(v-Ctb^|vA6FI^Gw+=1 zk+wTy>`EVUn2YSyZl?ju*SDv_t|!o?!;l+q8*x{Xto!KA^Vi$jxl;z8@zf~qBr!(3 z&JKyrO;X!idww=WCK&g?uh50zrNLrdnd>LbwJX5QFb(X=75;)(opwYJN6qvc^q7{O z_1Q6)38ailP!#;-UvkD&WRdIb> zcARHlG78N4Irj%&0SgZS+?Hn8w?ElvmHMVG8T$pXp6-~3YtCXgo}^ldH9bU^x{BDw zX&Nbg4pa8yKZv-WL>y7YzI$#!bf8aQbU#urltM#nampcr5>OsE5o}> zcCcN=xm!r;nPK8DBYt2I2uikPu1_~T$iBFFnxw3QEMCn~mMVHX%^P?=h1o};z`<=A zqyY!;eD5U`_JObRAr+{~I`XtX5h&~>v2`-`hJE_ZF*V$yFSxp&ZQRX`PZ3kw*5yC2 zah4V;o%A?V_Z7LsM43FgKI&DNVO7E&79GIc;S;;s{vvL4#_Nss8lp`+ffv-fKG~TWq(z*M;8a z+=nsJC&YDasCm;0sQo-TtL1y65Of2(S@gbQa{KE{6;iAVXZMl*(T16j`ljD0NYErK zJA#oLM1Cx~#+Ixp#y8IeDD_!4GWj%ctgCq7exWgc^^dQs9dq+VUH1P@Kwnk~_*pP& zixi{KtuOvJGFXv&VY;jfRU=&+_j9bCCk6|?HP2Tl(1CFgq_A!22_QH`%+%&WEn#G> zfggsi@`E1&_JKIMyZEX30T1!jKP_h)X%Xws5V##g*>fs{3izg`9YY6SnbzJy*b6F&HLmU{#n6zep)n zVw3CqB}~YgSNfESEkPr~?x?drVg4%*zhCbvW@9ibnxM?*`cL&f?j@<7DH)_nBUN^5 zJ%`N8#<^I`Z^eN*M0WaQWRvKoa!=(A+{#orCf4rW=2)dYtJeB`U9N3)-wR47&>%7qLjx%)5j$n4ferkat^eshT 
z#S@u7FLTIkJ{FSC(!S$66p?}0sZk%@a{xRRD$c?yH2Yn249#(k-ohx8g`F>n#D!$>tUCh6aBbU3@NA6sE6g(}Tx^8}SyK(n9Li+MjY} zRr^8AK(J6_$WogD%f>2i{Rc__0y?NO_PH+uJJEWIa76HN#}&wF$wzK!i+{KFN{oQ< zV-UMR+FyCV?4d^}$c$d*uo14BHQ`#>N&C)v&7xqC+34|ebArRFMF2F)PcNxepl6S@ zG6e7eHGtw=)swQ#1IlF@6+O$f`?k}^wIL=Mj7mnEy$t{_K-g+kw&fye9WWIx zles8*{wGuyv5HSNzWZc_JVv;9Nz4Fc;wS7-Kelvtx#B0Z_#)CdZitKh+*DD6hD;3f z+!+W-S9*DlGi~4H!lZ!Hoem#{&Q<0$)dS1VoAU)*#TLOiH7gUOy)oS46Kjw-1N9qup< z>!2aKcHg&lX51*tCd3ETul)1 zEa@Y{55h*eY-Ox*XyTp3r~;>?v(LrmIF#a!E23>UGKb3%;17oE{0O&&UJm=fy|Wtu zq5_Kq2O#M*eA#=%mPGmEx$;P|4?+wReW&<3KAQQ={z$uRuab=E^pMm#s8n1o6Sq0lwu0S>9U_{vB9rBP) zl!zbd!!E<|#dTA#b43`Yo690A`14oU;`sBc!kD=?!h@l7BTVf)=MU$2_duJK5ap}G zKK+|3K4Tqk(&KcLQaiG-j#g(^gzrG#+DNEimcD1daNL;q=@EKV0Xg&insKi_@nhf5 zxtI>V<7zt3ButICciR6*CFF9<(Hg#ZCM+s;6{Al$)q zFr;w>GSYo+w$Vs_&?g51x#T6?S>Ue=GA?;9%%m?)NB^FCwy?6ud28Awm!1D(LVr)! zn#+3%`tgn}iq8c}qP}U$mDCOuypZ6k(!P!J-W^I>P^>_w#ajusVx;?{ye55FF4(A@ zJbg-Z&-(ZB3FP^i&^GRniF&iv@zI!ne% zi~7rQY(x66apoQ^u;!{^qkEWRTvc4j^kE6+eERN|b@C|mq`sINTsp4XuC4e^kqJ=i z?Hsq6rU%ztr+;s?B3=zS?yW)JUxf^hui$LP(@dtLM^FxizSZWZw)~iYbOzx&r{co7 zZN&L9b>kCVko>+iKJTm8>7T{Z>mBDJ?#mt2!n+O5NBoXzgR(^>1M_JP!Buwa1UnvY zZuig`RM;OhVIH4x_G_C0dbFF^t%&oxgZ^h8c4+e?x*##jt<%qOad`NRtYOP>m2Hab zF&LLI1VFWwi{DNJjasc;O8_j*EY~DlJf3BAva`OoWr|rSX-=GE0t<8`^q?Ar` z?7 z0jyRZy`S{^n0@Kf)UH<7@jQ5GfTDh1Sb%airL)qYW!`sRLrF=;*8?GUm#UuZ^<0F< z%#fD`7eOfCC1K^?T&ZH5`j=vpL}*^?b*c-G37!+uk4)StL$&}qEt(aWw8Dm60PxH= zk3RlLLwA-7^W%(?Nyl;?Tr3bWk_ZmMnzwIX%rn#OZ{jF5l%}~R7uWdnSvB$t2n?IQ z6*Rd0&l6kq_A#eW|Ict_+E_W83diU=j4=5ZK3q;S8#h}bU@Ad%+Jaj z2YO1>&?Xi@@1Us-z(BpP^lkAg?2H`ejh@N7)q-=v7n$%WobU;TYPTH3SxgRxnsQ@m zxccmqXRVexk4q7Xknw?e;0Me-uUB!4nQlbHm%!fnBGY_d%;px4uOk4rGgkXAzGXm* zGKG1seCwdu*i=KT3GCOKj47t*<#xrXIH!=2os-k{W`8s zKwraltQ0>a$l>&J0y&QPEAni6+%bA*T~)KJ*1h9P&^WyfdjG3brk(#ZajFpnp2qIw z{EV~ykqfIqu*p}=`NRcKhaJV04UP_uq2+^er=IlaM%Co1)w2 zy@?XDT>e%L&x>kYE*ydi46P4h^%sCiQys9>1t6rq?f?{t{(RxQf2T*)NB&xX2fHH# z>Le9e`SS1dsXy{8JO6L8Sh(kwuKh1DMJ*G_>=3}9>dHH|ZjTLvs4C792lce`l4VmO 
zI_nJ+_W6#q>a}IYb>GUlZko~PS(qf_e*FOu4{Wi;X$|P8re|)@n^P-S9H(AH<>BQN z-@OFKi#ERMrg

oPWS9g}szcM+8c=Ux}zVU?;*Cii-EC7v|)L%vb-_^kei626+Yj zgHsBeP}9(hDIbu%hi8o$advoMQjs*W%jwvDOt4-ge`bl)tD8CTvOd!{&v6ZO^FyE^ zqjz`WVoB~DHwt5zz@N4}VlfD`JCO~b)-z|es~`h-k=pIyoLWquuMg;CROoEpvH&ijj8siaYnQR!;@K+e7E(s$PrX3{Co* zIAhBtsL4|Z;A5uvlA(NUEPm~>n|vRa$zU-9BQ>az_}0giH0M&;3-ME>TZYzO!^5}Z zZ#h| zqmP+HW6OONkF0;*-@ksM?Vl@lYn}|hUq2`&ZcjSuJh_PBgiK;;*QN=tL->mo7GD5` ztbEnY(kzUXrmA#E$UwT$JcYXFiXT~dIsHOcKQnUsON<6g(fwUx-Gbpw>$Q!y@+*+l zY?P(&Vd(2@wXMS39tW}A396*H-xJG)JuATF1njkW`4hha^v%|+Is*BYzQ%FubRU>_ zU)}QMtTd5Nqa(T!8(s?)#06*gkLlebMS*X=yFXdLrK3X9MD9*w)l?drK!cUc^Fdoy z%{Q`{2paq3J$%k?vg|-rW9TYLv=?9YLx&P-@{j0J@MheAUn?W&cFfyw2B524)>cv< z_8a+S63w?zcCUgIILLM^v!|~+H2zffe!CB#F)K#p#Js3F`$ZD(t&1puzaQjL$>uAm zI;qA-PW^;}6K3|yCbGjsp_13in|ylaX>;PpGsylxK^L<>0E!!McAoQ8r7Xnxu-(5k z+#SxD`lVDIW?)~GHh*(C5Qo-M(z*kP!8g}(@#qAezMJfZ1S*u*P4R}_EHw!cqc%6O;0K*Q)SxY=#^nx z98qwY-H#Bg0G5pbSk}`+z{K-aAk|0J* zy}&%mv2?53y$p56FateEDL+)kGd~+U6CKW+&>t}q$t9kW-P@XeGAv$%@=@Hr+Xtu* z`X&d zudMGpFm3PGVU7?7l6SLvAxM0=>`Yz!g{`0!Y#MvV*;0G0`E;XR!g#;z!s6e+}P#ZL=vY0 zTH|)s36E zyHaORG<3BTa=yQeGb8duecbBiI&tZ&SSeq^gtU2OxH>%ga*fwtBEL!1CLmwnjOsV; z=vIIl?v(J#w9DD>}i+xM-Bjo0F;uv`9lmZ$lgb^8B9ji2*y(2-+(L4G!?e;-RDRN$$hXX2MDaO^~pX`SOH2)7a*m=&;r^7V~>m107SC zFT(wSYr~kOj-+jVj&j&}3{2@Y0)Q_m9q0CR+*Qh|cQEPP5bEBfCYcGn9NU-q{6P%5 z(1JFYeHp5$@-^Pe{<%$l%E}$#w3O%_b&qYQRu)hND$n`Ivs8=2a;~wosjp$)4BzUT zuWqxKdbHYlS7QXzqLx>QjZ1)n4|5!Sm+gHJ7|5@6WxN=Yy(6SG8c^wwq6^ z)0gtShjRg8*wVqayx>>Xj%sy)-hdZ zc>00h86Bf=yN5n>mV)6uola$F!?r%8MU1wziZd5t%)#YUE5|7q#6U`z$Q&&FjD`p$h* znV803rW|C*`=sleP}{7zxLSGm^^th{!Q)&peuW-k8$A zt`RTd1#w|@4y%%RsAu}`qqZ~&t;OV^A!9Kpc$X-jN!Rb;C||g==z!0WQfF+GXjZpC zOu#u;EU4{U_ibITSKFxI?u}>|34t%Z_a;!x*eDe|EmZ)T`hLlSGilR;OMe@7k|%lt z40U7olVERR;NTKBa@9f#QE97~KB9S1K+h@qw}$?=ae4DL^2i z?FcQ^eOuyGjpZ|Nfc)UcFn42a8fxG))<^SwpY|fVYN|jd1y&NbYw@k7$#AN`QxsJ%IjFZ{9qBA$}EGt zSNBA@J|9T2%|MjnZ&#i^uGbm~c!q^k!jgu9uY)|g8+YL#QJ84ju4BhIrERLH3cZxF zppk>OmWh>J@_TJaFaNJmQp(fmaC3pU7q-bA&JHF=KVo4sFz1oA+~pMru#Hadb{Y{O zO0}mZB@R7-KG-;%;|ax~B<8_dvD^^QTc3n$c(mxS5Ub)D$R 
z9E_)ja!OS!lp*2OVO2)OCshhwlh#Au4$4`SWVdzK(z{ruj^|G>p29qa_m1~8>cf+d z4Z`A1JP!vXWjb_mIa5xM@eV%iXf(t5ZZ(XQYU8PW#SHf_vV%j|A#`h7*w()qJpbxL zB5L?H+ZH?)RKDZ;mAGU2!P(4xZ!oyK-q4)+a!=75X;Ijo-ke;*WK#@HVdiMZD{*=Y z_=d#RLQMR1$kp5uMZ2M6j~j`nbSrC;YkDT$2_VrDr5n@XWCByz4K|15fl71j_EoP> zU&QCJwDVS0u%Ngd%MIQ6Jf`U|QCWR{j0isc>xSM~SC_}uesS?;71H+&%Y1Yt%Cl)w z%8V{O{ZMBg@ZP$7EEw#VtdfX(6TceTuQlF&t~s+2p{d+400_F{=gn-R#=yR#PsFW6 zMzyl9(1Z2|*>KP24hQQuN~hr)A==hxtQBW8=az19Cd_seCrIM z)2Y@T+kqgFpSC0DB*q_sn^usH9DRqF6TE|H?Lb&`k~SrOF?M-cm#9Wsn=ru7sp%P-(8NSO&3?!X~OrYbYzfj z7a=W4Y4XZ**8$1~Jo%&l7(nU}07t4cz5_*)dphNCIjO+8{p1G(FbKq(;kY0Y6-+*BkTE2{YPk^N`8il;1!Xtvk50K3(5A1;j%xBk@n z`87f85?>w|CAcQI<}*LApVE$61E=4|=adQzh|5DB)f%oHZl(o$XFR)gjhPgiAS_Z^ zN6psFnH0O)<`E44Wgyx1Vw!eQj|-XHP{)(4>%te@xmBN~Q!*Djxar>A8evR}6jC02 z>SzS_yv2zf2c^T~*BV^E26r2|O1iu^CPNgO}u=J%-?T4_5v0;KY{>Lhk zkLRsM&@<#VzufrvIy0nU$%C^yN(TAJ2;q{qhV_HfBREm$c9fyvOFIz$*1?Hm)UVBA zUvZR>VDCPn`kO8<%kDMQQ4Q4YAPU~F?^!)xH8y)pJY1m%R zUnS{j#+DQJ)5^ow5Gp+?O#Q)3s^0H(_WD=yk+ltaSeAj+$X5j8kXNNFqRwf@BuK9@ zUYKsjjhoTpqUZGuOZFRePoG+YEi(jmHIVx0XMp^|c_1HkI5QQp^oHwJsrUYCWVzCJ zdzN?uM0FJ!a1bP1?h{qVDm&U^62J&ST5wJPynh;ba%=yNS2_I500G9nbXFn@dgX*R z6I7w3MQnNC?&B>y2cEOYPoCslLC)%cEs34aPLQX0pnWX5`-<`MfVV5zKGLK}hEKbP z-?#}BMUjj!=f`_N0FTjyW1aka=90+VmIeXi-Z?g^5NPG$rGeZScp59x8!}=4d8>mK z6?h+?trTWgf!I|`JHBUJ1ZkiMpG%gabgu_71>q5?n){1W<7`3W+fOtWlgdV4rnT)O zb#acJQKBMCKFIchw?CQaq zcq%B}9-DcKwlNWuiomvaVY{`EXSpUqFZvBS%+p?)utdk4%^5_Dyk8!^0)A(h=ZF+4 z!JKmDm+Qm!OE$2ISQZq-Ceq;-JLjRo2i2%k=BL9$`-85(0^nC6_kWe8JQf_bZ>rzc z{n^BZ>*=4~P0iClh=ReT^PK={04>=t3PMgpYD?pc3Tbr8J)L}inI$h2AA5^he|?Ge zoPsi@OFbrzZ{Wgje&OW63yz@*G0}|3TANiPgSe;6rys<0K(2O=>v!3*8n#xnot0{l z^C--D#k_FFGV@$hX?!p3Mm0vOLVcnX`vD7{oY4aHoIBt{h~*hkVpGUbiP}--jSSgkzfCypQNq#VVRgx z@w>Ys5~_N~AdT0{CkQ!QAXWN)?bpnuB)1$A-K5Z*F1>tRlbAc+>M*wrq`|Yht)QE+ zh;a3H2s@X|I|#0WGr6`0p|yGvlu6Y#>Xo}v`s&p*j%jh8*RoY4u#_B(V1-;voTlt- zH%_#Rzk{L&r42iqGImyUDKUU^0-10}Tj>vcIfBL>lHc&RiAm;@e|GnF`M(Uv1qT0< zKHAPe6H%s<6P2*zfqV#4?u)J|5!?8Qaj_cZcvABJWqA476C^dL7Q=goTDePa%zbTg 
zD;!b*?F{M&fmiC1@XKPQXJZxe%3Us#N-=u4PG0AJZa!SkL+8eD;e(ic`-3I&r z$vN!AqV;`V(U5FB$aQur3|H2l{L4@6Mn*6{P}=%%?`dX~_)dDHjeTgm_K=HXVb})W zYag=z`uufxS~{l6ea>#BxHVd3X-#~;HA$7!p-9*=q(TkwgEpg30Nl{47&FKEK z`XGT8QQex#Ztz0;aVb(L0o1R7IE+A#`(}BfFu%s#YCRp1i)}BIDQOSulq|yKIzaor zs6O$Q^Ft#38aWfeHH&;dj#@oI zefPI)wR)6Ayi*D>CA^KV6y%3DKB+Ce>ali(Mh{DGb;tMXU?{F%EjPYmS(-IG@uhJB z6P;~~;H0cAIn+gnABYX%q{TI&pJb5q>C4$Ehcjmp%1U6VI2duC6~Q55ZF)^N+THjM zNcvA0U*z&{lK=9Tm{ok(BIs1t%W${CCq}opz1{-)?pCSavQ>}@=huIz3C&EGA<=*y znm<`A85z0kp#Jrizb24C`7dSZ53p9lCGj3;kkXnzgP+cCjSR9G;NQC}9~QIb%Oy)Z zdm5p~gj0)Y=5W;fPr6m4i~r){qWSBX$+T^&&E#1jPuY-IFLT97lFW}x)OO%I1R{U1 z0JAtI-L^`o()}?)b)~0c>d~47sA?rektLTK1YP8VTcuBJ^-H^=w=_!iIHu_bL_kgJ zkR|mOW+JzO>9cSuOwmb)4H{^|!1bFHi192zV=OgLn?gu&A<%oTot8w*k3yKm9wx0j zIAl}!`(2M*crZxWQDIIlXEtFIoqR5T)sF74=<7G|@V^nda-|Yb_;|F5a`b+K+C6V_ z8PKD|-wAgQ@BY`9?~feKheP$W%%Rffli^z2OgvdU=pDzn@HZe%SJjYR1?(IqKBm)E z-+q;~m_RY8GpgrfXCT|v9->YA9bM%gXmEDn1mgr-Zn~#Z+~~iY5dhHi0}~{9nSWNG zbmx83F?uJvZaDZud5J;Hq5^Y>9YOxMZLm9g9*#1Fi|9aT%wAiu%e|LNP{wW(6vB%;A`5*uyo9HPyFys0|WEjSZ7tt{o#R}xr6DE%!xPO zX{rSLb~*G7Ck^!7*LS2X&h%dFU3R@veT`7lk8sT|?XUP$sQ>o>Mzqm;!Ze_Ae_|5Y zvS%-33(w(!+?K1j^yBY0ubL(S6_oG)KonwWmk#ft{pj&O%=O0g=Y0Rq&9y-6PhGf7 z>r4mtUJ)Yc3Rsa^p8NjQ#SbgEna=maVuD({rvrJJ4^eTbQvUiH&_ltUGTNES9LlP? 
zZXZ8?>SFm$uHCD#2e{^EZGqXDJ9jk@p1nEh>|{yPRV6h6>$MZ`SSpg|?EiUDDEX&@ zgVkA?pyM~e?@AvbTP2mI?{nLw#6ET3SgWCF@xONTS|TR3di}6i{xBStB3MST|9owNT)lzwm6Jsm| z$27cQ@AXa{+z`Boq*%r?!qw8&6MiHv8i@KP}0e9!%iz1YKbnmIbnTC?23-)tCh1C?fcORQIJ3_%kjn2?Zm49T{_lbZGzTy}vs(5-^EygQrBi#`~zcH_DHs!0} z<2FB}4&oZd>gVNTGn8ICQj#cDnNzG<=PLw%Zj>p{bD&n}I?MEpa2~ts_gNc>Qh`}U zFFV9c8Jf78?4QS}YHGUmt{OM&2VNrShyF!c)Pg10mzkpUP>4>3Fc6adJEl^QPo^w8 z&@T(4u#C|uef*>U{5gMW;zM$g%i4eEo=CV5eP-)yzxHJHqoj*eRA&ls+kRY;vPji| zV!B_Ku5X!8pT$$RJ%)qKU!9-~=}X5AR=prc!Ai|peDta;PhFC9jQjJS^&ni!hu9Pm zFRa@l-G-fGry9qeLsx6p)8VIoXZry{K?+kopiBTS9^Zh^aBl{M z?H4&hod>g@DMfy^N==H*k=NBffbez<5?`hv;^kHgVR~v8%p{|b0$lmQBvBEj*Tz9RJf zJVegLW5a<`(U9Q59}4ZCZ@NUVOG*wXt6CY$JaT-N=xV;R4V`LYgmyOK0uh;|aSaon z#M>UvkFK+&KJmyLVkeKb?!9R+w&`oGIa27j?#-}(82_bEEB+{o$Ij@9c*0E7t^fV% zYe({a)lan+j&32ODcz!lY?7{mvwek!vcI8oM_SpUNqRIgMQMq6AE_x;$@hbrh&V2D zPp!GPZ0A$#AhTSz&a!eW$cAAoh)t~I5OPALO(OGgXob0-bhO>mf zFGghICE5R+pMP6#`r!zo9?v>P7I92J43m6x{9L{1uNkCX>XW{P@{lDFu|l$$ucbHk z7EZjb4KZbajj~RTXM~yafXUfeW-M9xKbIg|HN$DHu02RWYcB7@FBLT59pQtgEzyKO z&}k_J^4dwSze&$CtBf=~DStt@c>9^WOeoWk9wm7tUzcD5_Q0NbWk-~8O+wWDeCRv9 zX6AQ$ev)-}hMu!bevq#bP~I}J$xsY z9yK|a{_muLk^iv%AJShA|1Y9^nXp1eB7M;ytMq}<*XZqSBGE~<1)L+A*OGpX^*?nr@@^Ad&jK)O{?9)*7f3DAvH z@w9_R9+N8pHwz})l3Of2`>CzN_g^~FJ&@Gh_=+5ye}ck>IxKa-Alb4~e#1Uz8KOsq z7+0V{+=4(0%Nm|uTlXG+@y6=_7iKJFS;`<*xfW}91q6xZV06qenR4zGcssyc5}x>f zq=8K@F$(}?oK5XkZzEYE8)>@_Q%^{h2BlbmMKupOx3$qI^NPW~P8>@Qm!OM@1sPj21XZR}A^ zRh6d9tYjVi-Xw7)pm4;pCR7Wam5AHGwPeanGm_KHL&^14AY6TA`Y+M6VCLG3V> zl~C3D>rLmfgSPT%r77Pg32tT%$oX`?*j&Yky9M5d`ThhmDPzeS+qsQ>4|#F3x*^j; zJYwalfm1}gsHPS1ok7fdYEW#f@~13hi^1HA%L4nIuug00s=i8rz_Mmfu`L3zG3n~n z-A*6ZYUIvhd;~)F&;`~gzFfVYNnI6gd1Ff9GI8U8FpfJ!#u@=-bnVMeJe{A^na&utGSXs*K>-T(b5wTJ1+qoF%CuVizW_ zFcXMQ_WdJ~lWy2|C-mtHbxh>7G`)PbQDM-r_HIVjGe*g&N7C|Jp~S$~hkJE?3b4Gs zAZO~*>2;ax2bH%AXPFcw3)FPw4we@UoEx{CY4XGro}(W{55G~=<-sL~5G)jfbty;{-pzo1n5Q7J|4vgGOlwf3RSvWId~6 zvZnN$^oe1DRHC(k<(%h{_H;PYvP%PY2q1@5SnnIP=+<)2d;NsAnQ>>keg8XgzHjAD<1R`vlu0W 
zkXjhbsgn9D3>5^$PY^+>Fx!#q0wi~Lse~S`*WXpB!-15=&X#t5aLz}$MBqsMIInNk zXl{Wea`VRiuwd+@F^7zQ6=Q?VMc|c%T`hi7e}oOxI3e8yGG{@yStRnbnWb6oou$#L zT>XsUG_?b$h3ATWpV~g~;9>UzH=Nm{rt2^>Q6I6dKn)lRs>poOjl z2;}3XCTCz7*o0$as5eLBZOw5OG7Bn*o2Z8!&>LUgzRH72tN>l6sI0zwDP5=r2&q3= zEiKN1>RuSW#g5_IM^hwj0ykegLe7(iVn=Uqzu}7mF^T8VeBfM>Y{+OjNb?lwE4ily z2(e&Md>GUHpV7KAmHM`WiUz%H{fYMj3u}S&Jc4Q_Ao4;18e1;d5gYk9xSB+-7IE4a z~s7&21*oEWm^K zCfH|G+F$2_o?xKkyzu0N-lJf14weSLZTK#U90y74ox);G;Mryo!}-q2VtqGhN@l#- z<{(sLS?DrNWt%aQ{*{%b>UGMPo5$OQ)qSt?LaioPt#v8Y4bTP{Pb7s06#E-+Ae+hrCYU z#r-nb!wk=0WECj(<-(^tE0T8x*qrnN_eq_4KTi@M4I34dRE4OQ6d#%Q*&S3q*f!fcZPoDUHe=j+J>5!(S%-|AX{YLhdy~5C5cbrgQ zM7OpQHrgv@ym>6aET|pY>)jVJm1+}_J!%35H~lxGNvT%(wr)FZ^ff75g1o=h&XrsN z7hV>rCo1Llcp+o&iDhrfXo`l`f*V#+ITKmKj?kA^@pU^zImhk#q^Y z61!|ddi5fRp7^FPs^s=MUMiACcO}NvJ2o(S0d4#0j;a1EruhN)^<6=|j><=#>2R^C zEqFkS)yO-D686R^lt75dhtzxOG`}A+eT>bHWq|pzMzb0FtZy;#7$2RsV!8g6KvJMEpiKjI~nVL~1 zfv%N!92zbLOD72sl5us>qC*aFUI&?LiFALrJ9}xYK1srPnoHxJ3|}8cBMjPAjqC4p5jNV;t9(C`Qx$bA^eB}Y0l)*d2g(t zD`sBN4DzXsB1LoMPnW48_Oy*sZX+Q}Qp9b=avSt8H&3R#8i;UI0*ufiM->+rB_%q_ z)7XycWs`XLM)OrbUYIDa&hK=2ZAjn6L`J$2_1rw9NRjATBSK(rJ=hy#Xrd=b0`%XZ zDMi#Ic_=HcSK?&$1+=n;p~Li%*UP$*`KatO=dIKEfve)yXWP zk@aF@2}45dk>p}WGCW>wS(kB4lf5(iAS;v*QZ+x*9s82FwdOJEb!0}QCn6P%?$ji= z@tGk%vw+^&-ngBnu{0atS&* zj)=zlD^l3jCtn3MhOGbGjP$3`Y@xjV*#eKSD@UrFXJ5}YZ1flzOH|O^_JcJEN<8rn zk`9L@(UXAOknc5Z=l=Ek8mu4u@C@uh^0rVBp`0od(u)CD5~wiB-6e)yoeTcoMuVs+ zKa7J5>4`|juaq}TGi=Gi8`PdZ5!Hb|d`lcc`a>B{Ui3F4rB}&Oj7~O4DeURjI9c&s;uXOZwYT z5(sLdG&Imd=WcT)${`fjHS#k!K- zHtc{ym)-Ssdx`Zb+Y;s$EMo)qA8{pkNL8~w07|^e9iau}fsZ1|9gO&1ooax{V3SG# zQO9!}5ERLP^ygRVAaDO{$Nc&IC(#2Mf!zB!RyV3~S>kFpU#)EnzOHgPK#=rFx&DB} z`V{8MJ=||&Hu!DEi>jB_B9SR?`FhQ}hlFP#Wglnv^&{=`rfknE9ZL80<#WDg>%?tO zv=_s>9((`Rl}{G;a2`G&)P*fmN?;a{OxAHez`uM?h`GCOHMwWKkK6?%;hwBL~S3ViDC+QPXW(Y{}p=rov$^%2}1Z{=L0P4rUsojVoxH_Q=9$v-|J1bW+pA%=S00&U|@f7uhSjc}b{MhkyGf!kq;= zg*W{Ja5u++R2|m5@}iy#Yc_ zc7-!_p-~0CtsULm?Y9ozI8JJ+E`wN~9>L@{BN5roh|2@3rkVS#4>)u(z=36&SaCFe zPVa9@{L{CqUzO+lj-*w& 
z2U71(SeG$(-36l77M8kUjcR=7aYa2DccMIWS!76+VJB7NehgR9xYdowz^J~@u{cA} z^+(_#K&Quub6$K#Em>XH6|?C~Oc8t&q5Mwh6=1m%t5S`%jFH{$#tFGACJ0&f!X=h# z2{`VRvO{m%jPULz52TbfsF z&E_n^+5K`plzf%~1@;2@A=<5$_ZF!A;GAa^sd|Ym>vW81E{lz=*m^7`cS1h}FvScD zZHstU%%|pBcCX5a*}SQHhbBdg|L97>C8Tm!(p%1SF0!Q(R=sfA_ldmlyN)39nFz#w zPak!yeIJ})+gn{sgpvyP9`W_x-Oi)}oXbSO1$3Mn3#q3v%Nh-&hltY#emdl%tJ7ZD zT(xFXyJmvQgW35^#|vUHQTj~&*hIs$6bvNfJ(4&HnI28W;E_>hhc^roX3Wv|Q|3_{ zo*)hD6@-1rAMQ)83QzM~=VuI8#9q7mI0gXqDYT9l8AuWGO`f~@mR*4VA)*!wOk~s` z*CfLS{I)VurSH1)n{K+-Lhk-w?++(84lS}I@jr`Ig)K%7V2yI@c z3yGvcDD01Md|iE!*PY2}8WjRDHV)Q1@~~CkG`wvb@;|3p?UBfP1ConA>-=_0CXwiQ z!Kl7lK8_UU8svZ@{C;vlFCC)!=$}KyPb)jhJuqpJlsZK#8#ehOMcwrszG{Ykt7NET zoEA6(yN&8~x$2B?Yr?TNK<7eiT@2`R zMpCHjGA#5ti{t?qAQp8h+=vIz>??LVyeVlPJmCfrU;LJX!X4l#2Ok5X+5^p;@!x%& zpEHt@WquZMXCQJRS4@DH7AcQ)wsjaNjOHB$*MqD2onQ8JE}&mXenSv|F<`Z4H=IL` zta)wLGHGArF)0rdW$(CTXSK}vHGDajLj~b!t;ficBv3c2G;O@TVhG5Z0J7Y>DrndE zYWJyv7_*mF%S~_)FdgcwQ}R~i5~kRN7#622M6Oci>|I+V@2z%=5v{A&-5wcJP#L-m z(gtIqIj6L(^vq$JjbpI0@%e^~dRj8Dha)SJB&vD$=S&j*o-rck`95|I7gqZw^7|kv zVzjg8$M5F|p$6Xha_!ULr-*`2*i@uvc!+ORX$d)Mx6wt@fnsj9>o*3)z4C{$uUt8R z6D+r)_8`6hvwORQ#<&A}X215#C@5ktQ6jX^CJr3P{Pij=LGqST;^YXNeb|JXmax#j zx6K*bFYpt6G#2m=%N$Bj`|5Ld8WCiy7AmOIc>|#QOaJ^-k#zsFV}KIMKuWzaM0(oM ze2qky3%F32$I{_c5y-2qsBIb40-U1sbt=3*7H4+Bt!8L+Nq5aMK*vmCuD-`YRgm} z8ow1NGc)Ai& z5L+Nm%_n6GeL8jxi2x9}i*EzpOG;8;PS%B^)v5&xO6YbIH+Rp|u|rV%k?(TJ$Sf;W zx|n?2%~kD7Unt7--ftk~<$|Q*nrl?&lCZVqJEDor%Sf@DCHlK6z(3&&RO9GRPPW4Gm=HQN{SWatT@S)V*&Sz6HIhl5fJ48KaB{fn2Wak0 zom;e}tuD#Ma6#K9c|V{=p3xu#O9J{}VUQpOKym{!kF;T)Mq&zmfbpz*ejR?E0yOp= z{v3BVoJI0OPV8dSkl9@lx!wC1y>71Q5Re1x3=dwHkcgk!V${+j2eUQ{$vE1s@3%Pf z1(y))&+{vnKDsiOj_d9^a~3Q9x!;}q6E1*iAV+9TokaUQB$EHGpDt20$t`i`dDV32 z426Tj5Qx(-^$V2u!pPBeB`!^WRbYZ}V-vm~&?9>>l_(x(-g{+Y4A?1SFFeWe!^-#G zB>lX2QVs= z!Zy`hj>y(X_@ufia4e6ZE_&^n__~1p674a_Y&LlsW>LkomVyUjb&W3q@qEw7Ne)82 ziqjuE78QUNNc)aktcUqxBZ5}S7p@U_Z@8L05}HiX63H!B5H`3cnF9AV@x~8rmDr*b{uILCd zukR-{K(Jm5L$NUm!`KfW`yHgD4aQtIQ&f}f!{US>T?M|_rTSf*s@7WQ=Zu4c{4>H} 
z;m9>Ww=ZY+il9gUN3{pH9lMqv4$aJQg$htW&Cl+^}-~j!Gij&MNzJ zrDp-RH-#02*7P$J)>adwN}A_o2(E5ov+)l5Z@NS14H5lM)8P!@HyUkUfPCPbHBH!l zss)-@L?kv0ktLl$QbX?#z%bczNplO|>#ppL=TIAH9(;5bf{Ls}I3VG5x#(QF0HHI$ zV-v*_g6EOmkwmSX4qkLm2h&KD@h6`eW-hY6)>%HHSW3{W|Y^X({W$dCME&F4+oz#~JLc;X@@R z$K-djN{}+=goNYXi@H{!4V=@*8{>@OLa~GTfL%mFrl*dRkOpAx^->k{hL~Qoi3kxO0fSvf?Z1LyqGs-wCPNjrZ;|#RD2j(V`|lHUp0(){9lJE}eWz~gTfuQjEs$mW!>9oC4$E9;<1B(1lLp`zJzsqc}P-I;W!*eTvJ_nxl zj_#5+aEM5t!lM^!*2L#Q%{e!NY`kU7UOihA@_0D0^l(#PhWZTT;TfdHjqzp8C>)SM z0oBK$+C3+!J~BH-($pVQ_D)EYQ;f+kR6~}UJ@UCJ09qZI^f|+$$yIaj@-yA{NpJ0b zhC#m`K{xNjtzuA}Po?Uz27Zi|`u%1SV?*nySiSwwmH!Xq2prV!xv1-fl9j79$y69! zkM9BZ1yw{gZtOD%=kyU`MMgqigcN~g@F?du{q-N5f#m^*6MAjjsuvYuYt8aRTqgdL zx@s}>I4@nFK(SwVUs6p{yMjTq#4F z=r2Bk5D`y&x|Mr0QxrV8&^1aW$TRniT5! z_Fq-e6CLqE#XR>9Xm%{3X5N+LQ-4fyKhT7LT$OU!*XLnR(fhwGJay`TQTY#sJc~9Z zm)lJ&{H~#FinWn9lx7M)ypi&_Wcr+vhf0nm`0|X_DKvLH z^gndsr~6^W(a0i+Z^+gdjH#II+sLCv#(b?-v&nz{yYDA|;AfT~MKshoP8|=59I$vG z9f;Ps%&YqlvN-e@mNKYX5joJy{!7$A95n{zcWiz zK5(YTrnHt^K%_TiRhl*(Y8{c^oP)#rMG5a_(@xGdc?s|`WVsp}wV_Se*rTQU%6E{Kj@^M9n%z46Ss0Cx07<58`6bG!U@~Hj!BTY|ct47XIMZt}@VN4XR`eAv?ng zgE8cVc@{JMn;NN8wyZ}W@i%M;ke40A>#vq~94WO^p%x(xV*Lm|HjnIlFU;^}6tN%? 
zdN3;M9rg0+oeyWp#Iqe9AQRuc5_Q%xK|UD{DXM#mnB;PoO%GZn6yq(YG_Gl}!(LVt zxHrh=ks&1QfFXk^W-p28UC^UKlfFrb^B3^|2J5&&{2TH>81fe%+MvhJjHV?RZpQ+~ zUW+RjF*J6Rh$7l${vzZ_>Fk!G80og=ZD32$AZJ?jf@K%&9q2BKl0fjTg5dCcTY=MB z@gWT*Em=eV6@$*ad(Fga$!&!Nua#|Na;^ktz}j7*IOrf|ZjAH=nBosbEwapUV}Evs z?t(H>UqBUJ`0+{Ru$(h4|1A_8nLSW1oX-c{B{8HeEsZ=NoP=B(WBB3Jo&j^G{nU+1 z{!HZSc~}iPF!yaK!86Q9NrBiAme$Mb3^DIse;hLs)wmWAUK-xRk5n02xP7MxVd~Vi zGz$#WK7h-8tq5J#p-Mh$Tx}M%o1EUdSya<5hwvMt;Y5-C97zHY*y7)`Gmu|13mTM2ye#O5#I2LE8nRhlJcnL;IXu5C&*VZ>PutS~4g; z#}7yaR0|wssiv+D|8TYz^L}cRiSYrK-o~E6{czt0Sc1vGZ!~s4)ugm_y9XKsFE*H= zwH9d;DOm>4-9zMw?GO;r2OcCD#b{?oeg{s#25D#+%lXTBgs-@04txdMiLdyIIfnVK zhAZe(Gr!EkfFa!$AgZ+}s`Cc<`YF?fjUj^B7psp=x3F?Z@R^#8rUhSMynJvEmH3{z z2&<@h|GUe@ODy54a@|Pnpf5b?YL3&6=y(Q+c!S8}A9{m%1}WqpSpPpY>s0M$%mKg@ zCzm(!>YM=EPQSRW1QFx3GDk!{%yic!Pjq9CkGVbsW-dczns~n8aPKi1q#-9RpxG`| zK|QC<1saDe4{-45Ux9fz^{SH~+JOeLWW&Y&ME;l~&V{%}5izCJz>*e28Q!x8kmnSr z=UP;Nrf&QN1_d^=-QtEhyM(kO4@sq7o^1k$<~#4)dZMEgIF(gCL z%AZ^SYS6VX7AMnzw3Qh`64IBrp9z3u9FR6|jK?q zv}Is!u5HK>Q%q|EZ{#rmc#sg3tUaHY>=J&pt9iT4%cbJhAk23 z@1_5YX~(V-HRy~d{~y5SU+TxF3ZM$%Jm)hAJy8%y<+@WBZzSHrS;fY@AHxRa(4tW? 
z<7yXebq~>#W^hw9%uD~q6#%0--NIx52s1X5ZqehJbo~^9<{rr}8EvkQh%_w}6Sw~I zOk&cSj$bD915UNU9z=oPE#fYL`QqiCOjFY^2V= zJr)c!?}6*}KSun|DDn43hde&eg-XfRF>qBE^hs!yAX58Yy*$g2_0op9$lHUTj zt}mirkV(crj^I;%f98+X1^>%yFa^ORoV6R2|5@_!V?)yY@2ww4Bl=4Yf>4JeG_;Zr zEW(+aBjacV*0GkrC^6pwf}ai${Phj?1af(df6maay5DJy_*)P5#R`5j)L9^J8C z{rE-*_mh zl9WC4`Ndd%(gc1je|r)e6!Nqq&wT&)806%ns9(zY%h17NsN5VWB!<7n_2cSy%lSu3McnxPOJLmy^scD(2r+>P*m1ajs_WlnNdH1X zgb-x@eXolr`k#759h5G9O78gTe;@U0egFsL3uaGs%DTEB%cK3TqsA;o^rG*5hnvy6 zj3|fGIi*!>ZAQ``yv$wQZEkB0YQ)xQC`hGW9%5UaiCy>_L-hJQwc>WZD14nrq1Z-D zF+@#SnR3*5Kgm&sR_YD2sc3Ulhj@aKlH#_zBbP(6Xv9id#p+)ExDjt(n)nSC)7Lil zc7Ie6p_-7JCYYe|>iY24`HII8$qz-I(b9SX)oNLdNf*uV^>#gR=H=cuyFHIyuLfjdHHua(?pPQ9FAg$hbwYM=7 zn;fr|24AVP)m9(9ZMmCkXSwPx_LW=@Y@VoWKZ4gevnp=c*V5;Xo$nIHBf)3^cCy|Z zvfIf^>YV%1n=r9il301O>#0lWR;$vXiC9Z>22HZi!URU3XDc_=O%dnj>DBS_KuHFL zptotG=L1JWng?S^?BIRDQa;Ayp#>a7j#b}EzDKMS#($fCYS@gUK_vF9S9|te%CG3} zcNG?u==FgR5IK~< z{xZ;?>AUTL?dT&X$2lK%*=p0!rbM*tCNs}NvgQlu6Xc#a2b2-=@b5vkQ%SiJKrU?@ zOV12!4JquVO7*md2WKa`HY$1MsCK5CAx`C~oE-BVD}&53#5*l(T8~~{5`HD)saL@! z!kw}vgB>~QqL=WLl|gOScKI#}12>JWy-GH|!3V4CX6gVEPZz4K>+ISouBvJ-dk2x9 zV_D>MMPtdAEBBMVTr`ZgJ<*ayW9~5|iv`<_w)*oNDu>EI`+H=^PWl4Me8u4FZ^O3I zpZTwQqE_wiHm{5*6j*3g)Fw``s*L)sRfo25A3$ZZ1pI1dp1=*AKAYGNo>~F_Sg=9gf^fm&<%( z&|j?TK$3irYjd#k5ykSOSfB6^wr@aN3Ro97PEg4+(7%;0!Ohr9rkLvSYPn)(SGHC^ zVJ1z6qBu=Lr@q1a2H0?qO9$%P%!l1R5pFauAeFe0>;4kEZ-Q-v=3|b}B-+)3@8uQy z4bMua2CF*L>S31i9W7ICy<|<-ZVfduanXbhnO(;`&~!^LS0j~E?iQyucE~H>OpGt! znb5{_Y`#RllgTof&2uRF>;Xa5me0k-AzpDFW|}>eOJoEC#M6BYmHT|yJ8Jh7 zyXKmkm)B-INKHTH*Gx5s2&Y&ndHM`Z1Z-N0cJ1?5v)+x#P2ke)#qCPmuyAkW*NY{^ zKQWI4E>4sEg9Esko!yt~9UFxlnZe3ErO%`M`s7C(H>L_QVIF&T9xwXpl zr#g%ExeszS>}oNN+wWiR3*+}?tEsKpc=kziJ(0zZDh_+s?XFF19fD8&RY|P9`-Qnv zf=!*z6K2D^soZGnP2ZkcJy!n~mCcV0Tk|y}Wm&0vHi3sRi#_3;OPs&DFp&R2?UTh! 
z+ydX$N~adj);e|)&nh=M@T&_IR1ow{$-y|-K>RWS2S-DXt%KxUMR{teBk{S-{`|O?%R32 zh3@%DDV@nM+-9oFJo^=2xQ$Dcdk*c{Qug@+Y_&y({37Q!4B~dR+r_R`prIq8A19Bl z^mgT(Z{_^7@aTeBh}_7IJokZRl_T>EnVsHXXu{0cdVVX@t-RE4^S4#_0&!SmRvHXo zwq)XdR%^c3q^_9cLHdj~7Nr!0DI2@0sTIRA=7@aWs;!&rq^wOfx)XTyXD3?h?{1Ko z6+aW};d+bKsw65sU#sSFpfE4BJKDOp$50x#;j!ASbr^ck7!F8j2!N1QXT-racEya(>!pF-5{rp*c9ZWNGBt{qyZK|B zOah`D#_RJIzBvfizqwZHlrED_ZI5y8z&3}I!8u{EYDRT@H_t`?!wkCCqtMLhd@V=> zD~I7Eucb*gMduYj**j4$6OTG7_=W6fMr+qqKA)Ph(9-0hq=08DR4SeuwS2BTGOnyp zuriauNC+5Su(lN$9M_HqT9N3?1FHvR&;_dX+?rz-gWI&P}_v%7sR% z=gXlV8>%#;tq@pk!PB63)7U7Caol_rD}W$TR^F`gym_fOQSMVT*A**4|43%tyZ#(Mj)QOp+O zl1s;j`SJth z!%}KL`gvAhwqD!ddN_*`y5s75xMgpnKr?#VDhaC6!sF*>Ai0R8ik>)5$9V#(*BSKx z3l9o2#%fRbuG+P>`g(l6E3AA2kMJ7#-Wu`9>u4fIW&?jRtpDb%j@75V_^>6#FNJX3EJMA5m7bRz;^)2X3p53~x#hox^VL48s;%$=LWY}Z$ zNo9-2*f5~bsP~f*4VRhmTl#GXC*6oMM%Q88Gq~bjcY;IPaPh^U5fz#MotH1274eNg z1Io|sgMKAKU$2h+PCFjwOusJ^I0{y$2o^fn7MWNO!E{^xqa=y@Jt=P$IqLc!r%I9- z^nJgAx7MVzIx;gVi10H}7p`gEBw-qEW3PVB*v#>Tb=!TpQY)jas411ei4cG``M7WKe^6Q{{vWP?YlSw)W+W(23 z16`e7_Sz-ZYt}hWG5x7Tj5`}& z&}l-j*Cc3`+k=V;`nNIXSXOWrI0>o`vkIw{ZaDAj#4|nRDrOmGlfLaVO=3iVNPK^^ zOnEFZRQ^&3kOAYt{22T(Q6Z?rDy~SU@BE<$Kg9uJ;1-JGjqw>N9{{l1vir%$gvfpAIgbtZb|Huj(76WV=RQ$0<&Z!Md=A z(Mh>%KGBWi~-|W2REr@95c;b4i_R~YqUqeh? 
zgL>jBq3y_To)@Y29w)J`myzZF^YXVJt~C|s+0>3+rqp`%xNNGj9fwhQ`Xz^+LY|k; z>Q{BLUfq&WUg*uja!SifU3)JU8JYI>9Xv&31WVv>E%|e-+O^M=Qneom(T{Xx=ajD8 zAtMtI|4mtxDB7w^@{QG!nvz$Hdb6l_U-6I@Uvd&OTrTNKwjV)Y{*&gWW7QQ5IWuu8 z6F5d}f8+AOjX&5gAvPRL zX??Us{7QD_`MlmLGcD0t87cJ7fd6q`u#)|TJyFa}Y-F_AZ)9e-RM|E;UA`O$CvYO) zJ9ww%+opIK0#%_BpV~r$k3k&W&kgTuezqhh~nSP^ILT76A+O&B0U9bvE_ZInZgMyD&MZq3`!pAFf(rl-%7B4MTfU`StR6o?HB7}F0YMT z>E5);w}tS4`osL94$O;4`^y_VG4ul^{Sk#mA|w7(-X>XcvLjh}Om!OzeM*QvcYxmn zm21WUO0l>OP|6kDdwctr?G_4Irjid>1e>4x1_WYosJ3!V8J8})iA>E?z#B9BhVMSl zs(P}kIx>{LKsgg@lXcTVJtoR_zSMF?thkv2o3s%?%ixuV++m#jSqpaFlLQt4QSDz# zN`x1*!~;FszTM;E*z)5rEuo4Es~m%)BzOdUr!?Qmo9fIpap_c$=%J$%6TWj{69PAH|B33dEErZ`|(CV6_^qVct`0O{Bu5FFFG<&vHVIIKe76S*H zf^pUEd}KBe-LRQH;R5bstJ>8%V^SSyJ~y^H+rKt}fg9tC;jv23c@lTxdzt%g=(f?8 z7qIT#10f<|VO+q=tcQHOsfALpGEBoJDhr~Unkf?^3{ zR-XPiO@c=uLFHC_#meWk@w!DTadxRz?%d1@#!Y^OavPIUqx-|-!ybWC3j=DQSpjsu zJhxPbaDs*QXuqZAYCI3P)AMkKawn=)PR!VOdC7Llt*b4tO2t9wJ#Kzebd`2x#LAb6 z01BPPL!3~Di?_bus#o7Xyr~Cw*23N`8>28-UV2b`d2d^CIg*0mM_`fmkRf!3Eg=oSy*NfWo^_ValmAg+!+d?=BDGu zYtPqnG`*QzKQ18j^1?OX(T^AjVgZm9O|sHofS^8uELVUn9XZD@(vfu7w+QXvvDjx6 znDi)W3dN|dHzy>1Rv97OeLCJuQz3er;6UF_rFLndFZ*(J$UZOja#JtKL_R6&P{Ynb zTeY#6P<9^~i_yap?k600&(-?Y@H@yJkZDmi0M&uFNH*3uV{f2*4c%`j7{lwT^0Sg) zQ+nj>*x4=m7t(Hk8oKwNj2$$z$Wp55DnDsaEcD)>mU9Ngc9|`^T2(B^$LqQv(M{^O z)8b_^AZ<{tOD;3WeE(u_ak}wMyQqY!Z%RCdp zSBr$M(FF6OyWG#_SOU$}P}|cwa-=Q)vFt!kl%qf1*5F~(<3sb_uBYi5{j`U`WRv(z zbJ9#SmQ3kPQRL)ztjnL4MR0EP1*A1yy21P@yP7fx5SDlQHooDg8o{Q7(#l>bxpNZA zq{;RaE2xA}Q86}DFBJ{pFKA^Jkdz&inyO^*c8uylij4F=M%!!-2zTX2RUaNGom1%= zVjlomX3>d1yb1eE%u1%WT+*sEgn^jihwrD&%C}(D6siyMMsY7AK!}qp$4&peu z4)E}HK5JR7`6(qCwKbC*dPz;WK043G5LEGf)#%AazBN1dA+p_GDOod>uaA@RH+%J* z^;KsJlfLhqY>TtCCcLI;)lOqBQwwg{%udrSJP>R(7m)_<+wivYwXqI;LrS}&j8N4` z`Ul*R@KQ%QmyimcHQo-Z_UR~wivq@3^R{+oJ66N7krdBn5`lL1rUdWn8U^Mt5lHL^ zwz7VEYIiU<)5QE858?hmSE$#?vCqHM@(*-)d;`HY-^Xr0U)_H$Tek95=6%gdQu0xk zEtguMS^#T<^qZ_Cc9-HZ?9LE+iX+;st`#FFK2?3sEu4gqMVImZd$Z+cW%BQ 
zs%`8CXo=|S$}JAHJ1uKIulvcA@cXh9v@RAttbfz6D^Eco`=#O8Cp8e60LEm6P=wGh zP#5WDZi96XtnyHDYqQ0XImaF6Z#O62Ah5AFhJmi}+oNt((2EVGsx81DQ}7M6{apA; z(+wZL4rbToP+n|{d`zNgn&#D%SKAK8?A9Z=3 zC0iLW%U1He^#Q%J2+Yjs59H1|u^D;S6)EZa<2Je;urmOL_4;o%{Wlxl1@ z6J#w6Q&q?gdm66|vri(WN*1PNh970zFk(c@9kJl62ZT3JoVO3om&Pw|VDVOOuZm1& zN{}@s$=QUkmv?*g7eCcXGAX`9cI|Mqi%4gZ!S@aKK=IS^DxQ*oJG%k?d2+**lSLY~ zp8{#Tidqlr^=!y6d~z z3Q~kF2RkXL?9*kStfa@8Y*=Urn$Mo!@q>{yG|Qnd6qC1c?I&_N9%9<*&qkxYA9k3@ z?A-WZ)kGCq$Rh%(BoaQX`UD0JAN-twYA8D+kTl>NlLb7VS`??-(2gnz-@=1r;C>Uxh>KPm5;3PI5pp; z3iW0w5M@`HdXWg7a{)nyKQ? z#CFUUsrPts>o2|kEw9ynTSHyGWO~t|rzser?QK=V2nPnGCIF;^vgHLX1 z)Fvrz;8&@cvE#~TMhr{BB>)eA4J=tT3g)d)s$Z#eFb!;{Ao z(j3-$LiuCUZp!G1q$sF~2Cly0PTDC3sgYNXwe|D;%R$s!yT-%NOAw)J>VRgu2deO; zd~R}Jg@MpmrN?rU+cK`aFCqjWqC@6VF7qr4*igs90T13_wK~4&n{+64mN>SLb z{~)HTCSSsWESfMRrKP2d$f;t}Wz%(|^fuCD5DwpD!cYHpdg9^ka*5Qfi3EU7H8NlrNQgy(SvnfEH1aFF~uI<#*nZ>fp)Pr{wOxlp&AE2!KuxC%6|q%iSz91 z;qBr_qdQI1oh5Vcdu{sg8QF?A228v(46Zz@7}eFjNgEf;kkUq*wFiQ?&hnSChN4yS zA#2vQenjTc98fRf1qcEu*L~eJJFV&gW2}~Pbeg4{dX|=+G!>Co-W3wxDh^FkE4zF; zEhNqi7;%W)~9 zM<<=xX{5+32>?UEom8$1$IkieZsw-{0G;jP|l_7a-u!y5e`j(8!{;Na( zSLYY?7U3NZ1o#W77h_yiITpdW%f(@vkyMvk$|OLhlv1(too{EkeP}A4OyCppa;0d2uk|4Lt1ylZ?yko8o?-0ZA2NNA_Ru*CyveYr{J6Ffd&y38uL*`J%kKF;FtYmn+!r;|JPF?#XN#1dNXP zwq&6q(@`RQWC-;KyPi8yH#kDmG9t@;y{X?4arOBtN>GU{XYIrzZvmjBQ}Sc@oF2~9d( zRV$r6Gi;-t(MiZY9Nr?=KxZrVfL?FTV>3(7*==nxTrKDuPM?F4Xauriu4a8n!zAG$ z>S9S(Yi~Ve7)#&a^v73P;;cd#JiB4oFQg=b9^czE%c8@;$!X$8$5oJ1W^1m#q_^n>c`G)N?ClNM zK<56oJ#TB^2L#?QKzun3B8uNB8i z0#`Gg1iWuj0|f=!dN8BkY@hXOA!w?AtFI2}7+!&wm<}u2nRP}sJlM<%HOi2X)rnTI z2?jLm4fjZ(Ia*Ey!LiQURk=nJyb)kaj*B7C3DTB~zE<=$FWqd9@u#xssC5qWS^vgT$^t$rNzLWbR7Xtvl~ah1Qbi6kjb*-{1bQ zx1{8LJDltqx%A4geOGCm422wC%8>QeFoGkVo}Byyd=vBSJg_D5TM<974dZ7$&;5qQ zhg3g&xVGmRi+|NS0^LNr3F=u!o#WlU$)B1h8dzTiM3~!*TG6CWwV`tJ8;Mt(8LR5!q37Gxrg>w> z1Er96Rs(C2B6zzuO1KT~>huzA>D3n3MU)DFcSG|Id#j!_3W65_ti7VC6Rq&QuEKP5 z?EI^*R_Md&-&R@Jd&kTgmI*{GmFYB&0=n4tMi;9A)-Mg3ef~r9UfzTS6F5%Y3HBLnFD_j42j;&az 
zvn@|b1d&_K4)ZcONgMCuhW+cgdGPwLLBO9Q-Fnh5SFxo-&wU_19@A$p+4JVZHkblZ zhj1rrEo_K<07m6u&w!A|RF3(V<$(`CmV6HTdSG0jwNC@{PGGoIS26$kJhg!gulMtP zD^{99DD(ypzNnTknZK3F8QxCtyZg;?th`-o+8w7H@HZA zYI=o54;*zc5KKAmrDI~i(mnu`(#4}$0DQ=E)D=R#TIrpzKMPFiYO-l%XHt;E=i(V3 zPn1ijwJ@2oqNfUcbLgB9u~e)j`LYUwS{Qg+aUku1vppidAT!}J}c)%Q<03a0V5MvKfKt+gQaa*Ah z_4A7Z4{faF@X1{+a@V^Hp6%fHVAPu?-9EL3G36d`?&E=$-~_=J07-TEj5Q}5ASM7h zptGjAl!Zg8*=tj#=NS$D)Z4rH!t(Nw>0=9+TZ?=DgPckL*;=Cv;pexhKk$?nL&+`e)t4QIXhk_ghZ4BQfRhJdbn^*F~c`ZDhRJOQ}>72skaH0n^p> zO`)L6m-zCd1>>nfqSO1S-Dc4ul@bH1fy5wVnIt^X5*2eTL$Bh5QMk1lN^a1Ljk>Ea zXt;mj{Irr7D%A{aK+WcNyZ!5^{yT2XMg&Xz>_#$U8>|y}LhXFc*5VVAb*sK_y2~Uo z+eY8mqO{|vn7N!UKx?ZEIf84b7+w)5?io8pythu(+^9<#b`5@4qql(|_mG6^+>U1} zH)rLjFe~1Aa`7@S$vMG`V7pFShfXNnZ8lJ;%Gqk^K9n=VFSrOpOGn36o1}6DBt;u? zXYdZE7OxXjtPYB#Mu2>|!b8fU_ics-QYXdz+?SJ=0@7i6)&FtZLZ@@-MyIpH611E zhUMuns_T%r z9?++P&LCy22E<{vFDTl)KYA}^y^Bfsu`9wOu#SI|N~Xm0TI!u-H9ylLztDBPFv7_O z6NqH;34lb&=@N5Ev}6j8o@oROovT&9% zE7^W7Y~qG`n24N$pC(H_rWCkB_IgH7p4>|%oVkoeF8N^&MZDfMbjyQBbtC~-k_FSd|GF6Qv6tIKZ5yt%oofprn)AIP;CR6G(W(o3YtS`AlsA%(ggajBNp=+5|rw=MGP*gRYC7n{d|} zrf%cNTaUqkn=ZGy-3o}96@*U8kGcB@>bLxYsR~{ajQMkX!6ne%)OE@7$^$qC5_dOF z2oPMyx_>DGfz6NL*%%lBrZ8@WXaYW+*^lo%BYeL2>JBw-iC3Qbv}!^UtjFu^{KE+@ zU0sE)0S0ScTCJJ+pN0udi?S6Rskj760=rmT{HOt;#}fR<=*)-)c10-u1CB-d1jX44d=G#|*F-cQ5vy^VFN4UF)<^ zYrO~sra=S5LwoS0!XupxtmiOZLuUY0Aq>|~aKYnw*6zac^_II!ToNcE1_%l;z1qtK z541@V>wwr3NDid$#|;4BHWl7+r|rf>o*{Q(i(Z`!MI_0$4vMPf(1W~z;mUTAv-Qia z!w<_q#!7CFg34qna`0Tw`>NvGBcNHB9|N?2Mn?XTK6x~QpN{j~SMHG_)py~?E2^0u z`h4c@W>tsp&sqdr=Cp5ci&8`cnPdE~Uahl^Oy!!#FFH1^&U-GFjSbh35`3=ANZQ5=rvFtVP(-Eq*I3~{an7K7^V-wTP=Q-R*LCZ%S|IP}B8)Go z^QFA55GWdyqq<64$Wj zZJP`52;Q2Zg^t^NdTR&e9aEgV8N3H07u^c^Aerk`z_fyp+jRisZA5no9W%6tO65+0 zrfh)zjI^vSOSd;tHuvf1vR7S0Wiu1D!4v$(lyW|JpLQ=RJ^jv#6SA1co`rQhkWEnE z)tvir`0&Q-f*pT5eY*`i)NTOteHqYu-9&4NAHbor;1y@t$H7oqHbCY}nkph`0iZ?X^Nkq5^@@b{u0^t>DtE}ps4E(N1?;A{o&ghp zPtiaJFU&#=UNs$eQCe)ch+ZdyV>474tV?s$SQPHN-y 
z${S>+M7n3K=xY)F8jy6a0S=7x6!+Jn6s&iKL|IoUSODB=IpAG2@zMqG6@02;QDKx3VTdJ7jf>voT!)$OBf{6l~> zfDhs5;cHjV>gu+tosc-c*CZ|x>Gm{|_vAE}LA(#}Io?gPFCrHLXpmBetYb7j-$*=! z7>`kH?sTp`m^be@Wg*}k7~-)=?0?&zMzO*`ykT?G&wlL^t-$Hv_@t0IflgI=E4?sP zHZCGq;3D^tb?jvD^@ewj4-WK{7Kwy%?;aYDu+=4FP}ya0L|m~v9KQV1aAH{?>Z%!rWUO1)!<0_LXGR)m*Sy%Zz_D#H6)Q zE2|)Gv0AYzu?pn`C~~jtLa45_EQ_OR4%bM$0hq3Y7r9>|`Ic(sT=^W_y1jAXcPH*t z*BAb8g`Yqw;m)$kTXb*f{d;w2F)jr@j8GEgqMXPKV1c0?*x{T9akSbTL`ByfBHECx z<}=Z#Ydzu{OyZSpW^b@@JCf%rH@V}XSMigBL=xrwLc2HRkeW z3~u25aLQcDiE;h)kxdJZ^;qsGZ{s$0Qbg%vECb2bM`Y>uA(I};E2W4M9UP-Vh@Y>@ z3wh839_i^|rqG75Sxo)Hs3q`40?;yOI^JaV*0bE3YjyJXybzoqLof4~am zry8{fXY~Z22`14)bSZ#1`{9Yyu;(XvCud+}&ijC|tHn>Whx1qi0ZW7R-_E~?odsuk z3#QLsZK$ENkI79096w^_!5e0kz=V7B;j$2CE_y*BGGTZ!GcEw5j?e%s0FDi7bVl2g zvg#MK>5=FupwJ|?uryC}38)Z^O8^kuh;Ib0GOj*60IxW%@;=A#RH9z?#F#|t0!CS;>ZdIkUeJr%T4jZ%8Z8* zqZB~e=IV0jPrRgdGx%#iG}4#sB2P->{w zQfSy8D%I7t%KNM@`AV?wSype6f@J@=2v)KyP!1z~)I`yKV5aTvWwyBaK~C!lfT6_! ztaVJ>3{jo?orQ{I7xB;K0$h$l4+@1Wt~?Ap?F~=^McDRRU4p^A%i;Aa4H!TwrClqKBKHgltj(NWNEk(;k zUZf9TqjE%6_Usm3lHA`MeP@wg_gumcN>;ULO*PyuRhTMOugI>5021uwbCi49#y#F~ z{1tE#i;Dl$NBCiK2ALJVEMG(ObEGNecvC{&GZxz5e zth7J7v!`S4(~^FiXV;doxordmOlQ0{@q$yqJ1&*vAGl3rA?@7GRZS+))EHHpcU;p^ z0wYB3@Dnc3ghz?2E%|T9|G(N@Ptd>a>hKurNucgK5{pY@uCEV+nhbkD$*%z7EFPy< z)2zvH^l=_D#-e-<1}OAmHwL0(h;NcI0jCQ2o|c4lcEkPvERLYydr8f+#-MeQwrs`< z5bL0Ew>i%9|K1?#M(qCT`1s)h9O_7n@LeQQ((F+C+JLVLLZ@OtSw`S}_6F7qpnyPM z?!s#|lvV1U{ZK%i@lGm72dh@5Z80>91Ta_*2e`j-Pxmb8vW-eApB-7mD9FH3Sxbi=6+%XdKLXADs9W@Z{Pfc~`>*w8Z45xWgNZ=EDGwoy2Dr3qdY*nY zAR?}5>bZdoQ>y~~sy2N|l9n^kLYJeKOL^*+t7nC&r*oy64boROqgX%&CLQGbPoZ+`K^W(p@ln?9}J+*+VOFB1-C!jbxTdpoouqrXAj#7mHUcA;vu zzN%jcAj3Hcl*QzY|b<-e(~SKn>gaMq}F4XuC_nr1S+QM`eSCFJn6;v)B&9g^31 z4O8|2()vn7k{%cF&>IKIAA|#`{)Qev^T%Ra{qytk$2WjS(szb7OC(xLd5H1k{SCx% zP2}F_Gz~3!%G=@XqF>UD1z=0f8`v@*vFCB{qe`}PpGh~ig#@nRofsh9fBp(pXQf1K zuG*RJU<9kgs>|}N;+vN@eHA;=s1;)D6e2ikQ|Q!6)oE|P2Xk>q?Y1=7ZzF^70Nk?l z7DUdzU#Au@WrGIy;VR{QegBa+|EFA!pD^+=+>4qP4F(iI{2simvjN7gtZRB%FxKh1 
zgrz$Sr5v6RU@iT?1$SJ;Nwx?d+tgQUnOY#>U*Gd8OM1t&!4&4BW4WXD{xMc}mY02a zHP4A7l-)Zdr`5DKzCvC&w=bkMB7XdayGU2di5K$HA@|virq)-B&+}<}v63Y2Vy|Ak z`j%63Q9Uo&a9WHCWn>#>l?s$ar{B5wYVrx$l?r%b`RDV1+y(UgXwyVV+V-7W@yFKu zjdl|^ANIaaiqacp8puJ0PU=@Rl|=y%fJRmy5X@eF`>0^Qqk4NeSu`>cM&KMuk|))} z@uX_KXi@~~-f02)ZBm}k{cZppk1Hhcrn70fc+Y?46D0$dpILq@A^Bvl#~ok7sbKT% z5&->PMcf|*DGSCbNAYTuHf+9dTlNb!Ufi-T$+&y~iXlm$GKWdEuX&g+U97%YqRw;fXfa9(r=n1~r54C^jS%nz3k4b@= zq204qeS7`2$d#ZR1qyG|cFG2TI|>x~;{;`~cv{BgP?hOB@GN|JWkL>d1N3^VV+*fdbA2waJBKKS znJ6^3leoTEeY8#jKrt=dy4<>~$&*Js4ac;jb^y(Qk8mmmkqG++7BhhmV|w2?U6+*5 zh?;7VegGwX?Pbkc-t!H53oKi<0bQLkaKi4`UF(3mkr^Z7bahH&0WLpY*lxz;n2IIH z>=F-@!IY%)Wx^5n*HAX70tDI^2K)oY>WxGe;s9akaYwH&FeCW^{j>Plv6Gk&ANQmz ziW?xS(s|MeL%gO%)Wb$g5E2hzd@ccQRpkIsiE3jaKmRJm#F`K!NT>jUCa2iE(f`=~ z$r1I6d39L~uU`V5K2USQ_F_5C_(dG#AZ$P| z$)@Fky1}0f|4(`yFr~Y9(Er9P9bkgPXX_ww6{JtJZTd)*asbUe_bD4XEE1uM61|(- z;X9?)1)rgpgIBRAIjN>oR&^XFQ_fmGRFY$}<{;ZcyT-^hys?K@1R z;Z+d zBLbJjqf%(@K+F>@^WG^jiMpBGw3?5-=pc%F^@(BYYH-a>^~=x9-WO>61*CF zW*eMxP}e#QBV(t%9O~A6wnIUmA9Y|O7A%M^0p_ke2HaJTJRq~Sn-uCDX*>zIZC#&r z__~IdvXIjH^aO86iKVN7p1ekiAA9k7&L*w_B(lTw2t>bJ3 z+?K5m9}dUsY1$P`y?s`L7p|C_*#)OdrAnNfXT%rJB*$JOR}3_9a;D$sTUZKZGy3$! 
zHAou53gLXqtoCge+dKORq$ig^qlZf*@7dzkSu*7E+^3;w?7`4c>a6{iKwa1VAX?8j zS7uyYK??;|^ANxTuAK9O6nOPj$kzk7;Yxcy(MJoPm6<^)4E0~Sa+$HhpM$Yaso763FugMJ_d43&a z|6*!5x1Y%=R?wLjp}0xMm}fm&Vxb`B1?C!j#-vOVu}+?0QWzz)KH&=m>OCR&h7)=75u7&s`ST^UH&NtY z6L4+fFb2?maes1#{?ha6$8T?p+e>JBjq-k+cYmtGYDFslQzN$%ccJXw1OUs zzo9pxZ{{TWcA|f`S@)k?{$=@o8w>aQt^b$1_FIPrT(hq-f9f(|{Ji@A(suRN_rK5T z&;nTcO%ISC{Cpewl<1ByqyN-6|Ni~?Qeml=RgD;!|7sEAk@o)>VsN91p|%qR{Xcc9 z`(cKJ=YbVKy`ZJ^9YJBa{r}s3q2C8O?D^a0t3yR#UJ_>XpLG`gLs|Y=KIYkh%X&GB zKsy?L59t1PJ9Kk_?~7yv^w?YLt!Mvz9{L2G@bvG?{|_wOk1OXs62VMY?(UiY>t)2~ zg>Oc%|49Y>-J=@J<;>-XZ1=owhC|I-n;>2&P-5%BW)JnfNxZy!4jw;u852mISd;Jb(k??d}o zo~}hB{jNWN)lgz=JN@ep{vDO|U0?{H0_#9Gt%Zc~qq*K6cljqr;FnwYU(T%**7 zz4Dh#KOhyqhn>G{`QJIiZBoFkX7HE50sT@BP;4pxe8+%r{38=d4)oOH2uhja{yrk- zW`jhCn%x`wyz&w1OHNEVzTgvJ{(kFlP~9z@!G!&zTZ@M8FQ!BrwIcr$x)NOS z9ksiON8TP!iYCvg8oke+^LS67^9-l88dQQznhxP#WyReI1j`%)C2P0+O5mFywe|}- zm7rH|lq|NMy#Vqzaj>NW&F8v>OP8p28?PGkTDB@MN&dQYoG_dJ{#l`7x#mN7iuKFv zwaLL`LmlBt&F8?)7>cWEC^H;7+C9lEsg_SK7s!QY5-nGQx4mNi!5)OD5#qVE$TF;Nw@ve>iPM=ow5_aiIA&#*gg9u-KFq3 zJ=!xLxq|9Jmp9`mPna=8Y3bRwO@#RKbSImndvWi9qK$Z*(_P)cqc_(cvdXlH*IKM% z*2kNR=LeBH6`mozTCGpL6QZ^E?hTF5b2Mi!c`z?zNzYD41de3D(I>%e;Ty^k z!^@n!y*E?8;+e}=e9G0mueWy}|I3h9IGNWC`V9$VHu1EdKV!00Q|YQQXi?H!^3B6s zw;(K0b-lZ570*{98kCz;(>J8(+Nz=_D3*^MHofOGF-E^~sWOCmO^08cfys|ky2ZWGIF&~ zmoJn^WLj&!ChdOo_!*M`gL)$WHNvBTuYJNo_uvg9S)mHf?C(qq;Z$hv9Cq$uviYK0 zcE9@xzRd8J*Ry&)pV9GRvLC3igj;9@zj@bM)UMD$NO?Z8B8ELW?r!_!h1S&Fd*=Z? 
zcYJ!t%Qh6t#2`mi&eMvN=K;E277P#Ub2BUaq)p*oktA9@yI`1egV9+{SoapGtld00 z^pY~w1sJR@uu>2{>bRIdF*=RaxqhXr#7%=e7vB~CGYBi=xGZ2xH0aAM%L$d=8!^K9 zL52I@^k(su!SH6?Q8wg0#$Uq&W*l|l&{4|7+TEzeN)1w=yg0?)cdEg8W*(a zv86q`F{4-7+N)|co*~l<2?=i^4nG1X(faJUn{r(GqNYa0#lu>kS9xDXjf&3}O^S z#O<};zAFX`9swl=G4O1|sh?U6*|2m%jelCGm=h?jlabFHdxZbPs={+TiVo z^k$nmzU6VV5(om0lTn*yJRW{TH7tg0o6Vu&Az#^H2b)0p!_Ks+!PW*yXu;-|VGkJG zy!Wep_Yg4N+yTEL1$y7#Q+ltfZ|&Xa7dA$8y2Q0>T)IY6cOvgJpRdm+CbWh*bPHD> zToyhg<8c$`Fn>p4!1S4|*ID)JjjcjRY)U|G7P>S%Y=dEYfe+p-E9@P8ZP9bRgGf2-I*h}`4-A= z2ob3REYziw&zySr)1|vXl!XtWDx!s!;&;9AV;uVi)?1Z|CxQ>X<>_*5q^}36pXSEd zC5YQsd^#9FRqL606YaQ`cR5X(CDPe3>vx$y>Vrw&aPvtFTNt2p4A+~z!V`6YS`?ZbVXSVDP-O6BH zp&G`^I$*}?KwB6+%Pi(1=NVeXUV#A2z`QGkW{CV#bH@t}W&_pA)^W@@w^zv3x zo7K~`VYKUk*_5^}Rn@a=a1kF0<;JEwTztmEuyY(0F+{PiY=LLAvT{ha<0|ixbBeRrYf9L!nhr&(@=_f# z8gOF5KVFMi`#g`%2}zfd800X}{Y)sobT;U05G9R* zVb^=RWBdbmaK*buv^4~ncJ4FK=n~_|mOcX#!*i;h#`mNj$XM3aG6kEG&rSS4yG{E{ zWLNFK71hb+B#YMc|2%#|t)v##G9}=<@_~SLMLJ#UGbpsBV?%wBLfROpSW=h9W#M{oEr?G&UL23Jh>QxDRjfWJf(0Hn6^q4mc?5|w z0EWrf7zL019^xV2^szSmXp~O)nwqA1#39MX!@A%T!5OloxA$SmboZ3zQYB>g>qBr} zTzxhcG8W{CyD#?oq2G%Vp09KTZ$F^IhyGevF;Z;5yx{t|$;l^@o$5g{Qo!9B!M2_@ ztnKt=I=x=L6aN@^2Lmp(R^BVgY_c>)pHWsAwV!d?jZNo|?Ex-FNAY?Zy3j0gq@M~6 zDO_j1#c=jX%_~M}6z&RYuS&(K!?D2)DRXmIEuCUH&pQUIlmFNYbiZ2i*Bv3kJ4oT2 zZ52GN9INY=c9xS9JVH7k;EH_;q<1lFQGEu|W6VpAU~J;mFi%8aT_RAI4yErDn$9}X zIG37wS!s^(t2J`>YreF))$25&Fd-j)hg?zhhcKxUjAaK}wZJu6)!)@&e9_WtjSHiy zjp}EMo?$3DUY6d}P4QDuzD&a<5?`QkaM^Bra8DNLBYOvnVTJ+Qa{OAnHz|ct{Pksc zg1-agxKNYw=U0A87`Y;?-$b&|rw9V(PqS4O8zo;>Z@K1!`Yrr#y`hma!01GuhZoNB zKeBOuHZQpT>o4AXoeI+o0c4WYb>UU*UVLBDRE-QO8Y2n5JX{45`(C#s*+(D2wWRmo=0bm50oE5g8J7RPSO%--IR;^j9IVZwY^^;?ly9Mx_zXfbgPqI|*$ zVZubnr?a!$T8NuwsYRH6iPX_%=3B;|${5kC-|yfv@^sCve8JFFFwuIvn#?-=oT)<1 zrYg^m`y+(~7L0u`lq85NCnv8sQ=o&%iG$#yY2)tt0<#wL)aS+ypyYQ3BFP&u9h4U0Ci{taP$02YmV=F1ezFvB_=9 z9a!Ud=7jjS3P{_;r>|(7US0R!J9xH98v>j#;3e#U`DE!1el3n6C%+5bC&Ws`$ydHe zDp76Y-13XyP9KBNjBsph@=p*h?I0#T#rrTAW?rxmH$qf`9i$0MBjYHf>f10`AK%yz 
znFVvcy9eBA1lhcY;;X$6bP*LmR!VpaBii3#>o|qMngZ6I+cKv=-1?Q0_;&^NdKbV3Y3C#rKX?2!enk13C48HFf|yFM@zO{21TJ? zClwTY>_&Q6ck>804^qwHH%N9j@*af|#kKK3 z$+cKpDGj(y`RA5MkeJQ1nx9wzd>bMKojm(a`!YGh_Abay;C8+D!7c9@Ct}X+8cYSp zvUcX!%lbEf?z8vccRI;RZP(TJ?MWF(&u&P;Q=e;3!6XV5?t+7A1!1@sBF~4BP=$SU z4pG)zCe>DY92sEA6?y!`e3fDYGxADayJ9l)sc(tMi;Y=jXqgTVNu$%>_W zjMesX9^$pX^4J>EhDsu^EH{f~!V1H~J+RoXHvDt(4&aRZelCD81rjCbfO-_7*6|Yk z7EM&d{HK=z)aU_ZF9EqX+cqHN;(b{+*nc&`A}@b$VSfal!5S7TEcLm97fEaJK0;nL zp6K@W(8l&B_oSw$Y!+F^pmmnnZb41+OLpD>>(?(TR(|aE1b$sVUYtXH6 z1D0WeI~*tK-hyK#PukJdz0o@AvYLzq(p4L6|26dt&P?--= z0tdXZQo44bZ@rw&s`Qysqw0 zrbK?<_9AiN6%`Ru*&I+@Yhjs#hh?HVTx#a6vJz%{JiLIEy_ix;sBCG5w+->~aMt2t zr_jS%Nw$H%yvz$80X0ncp3mO*0-S;>BXJBfQP`?FsTQ{#EkqM$+8S**X^QzP$*1PV zPqhAI!^B=s!l`6JMQtQnm?b1s%r~k0jt2@VmmArqwT_^2z07K|;MH+<&X?>GlKV>| zNYrNgGf(EV9*1mU7(9d%BtWPakTv``&XSKi7J<4Dz}*o<2D|B#elQSjIMLZekEgRx zUL0u(QJy?UP?K2d;uqRU7Jl&pHp=#z*ukiKEMv4WRBO1ahE|Wmp~>kExT1K+ooZJ| z;u%-6vTA-g+jThV(!)^?f~9AG;7Z=|O!az%S?TlgR-IG+0qHQ8rdMHDg%I6^`kb9? zQ-ra1)WP2NwUD|%@t#Am$33u6;3MyrhY^g4{jMa(%p7a2cixq9Hy-nW4jHS>l_e30 zUpEVpRt-2b1_kPgPgOOEY)(s>m8+2iAPs1H(BRGk z@maWXs{&sC>7*^A)m46pXk!1_730=`C&UW`92&Mc zuTc4Eo0aV*^r*B;!AoxY$gL%fj<^kUAyy`aYvkI6b#6B=Ud!vwX&RPDxZRZWPkI_x zjDj=vE7eC~DaWv<2xAq&ZWsRcJ(Zq#rjz=X)h+r$Ud$YkG(0W{j3v{Z&fxq}8@@f; z(%CHh4$;*RO=-97u+)eGZN6e+HU{^|uD+!yu0%q$tn!n|GrEknD0T~#5R88REvs}p zR2zLS&e2lYk`Ig1mA$m7=04Tl;;?fDdY1YcNz0oml_aKKNck+%1iKVT4#dq!{b~2B zIMqFPR;~K4Biw+IJpNU@4O8f+!lk;iSB92bV-n0A4o{a2ou7jJd#`2K+$$*}=J1&# z=PbJ1K8I{d6a1rZ?fnqu3V8$ZolGfTE^=J(zJa#d<7=?9%_Rq)=n(h5VND#(nb(Lf_}%J|`q6Fvz8Q0E6*1 z#CMOeOGdy(we)QzCc8xmEylRB)0LVtqy=#98<0*{kCg<}^BQ=IVn*NYRw7Y$cjw9e zvirsu^%gT3{hkmK=V)c7H@}oV8H`ks=<9cA-~clDqj;dyROk-k;D%ILn5!jHLH?-W zZ4qs}09OopMw2L=HCKJ~wsc7lNaD)?|7^KA91ZJ%c83>U9FH9nSm#6cz}KF+YsATO zczgQZtTsgTc_9{)k5^CTE4M$KSotd4cBMF*(h9QgDHz8qF4b;UI!TKDs>LfZy}-~A z678u#NYJ4N+eV1hH}OXO`kT6!J}j7JPL6;%U_vuDv3+?^zloD0ywy%tdqBlu`4PhO zF@nro@VTjz?DKeMJquJ<-5q=?%lEj<290|Xuyc#k>@*io3L4GXPsplj3IlOTh}!Qh 
zkQOrzO9iHlz&0_%&4S{aaU*Ov55Evp0zy)t1R;u3K1HfiMai+yci0j0y?+W4xkYGh zX*t&chgTvKd@4lT8dEWzJn7GX=udm`d3I6yvhrgGhb!vV6&36S?OoeDD0HhTf+S>- zhB7V2XPm_!s#0j!%1MM|1HEwT5GSwSfJUE3?YzWv?yO2%I;S@2BEmVE2x1O2E}b*P zSvs@4mQFPPFv|a>1B-jlWZWurMWVD>l5KAb&*9_K^fXRRd*cdlo&189%C3HI5Zqe- z3y<5@&lI~){5Zx#cQ|YD5PH*){ARc!mK=9+?j7SQu!o0uM5e@;uRi7uIm@A_k7poR zu6I}HJn0{K$X!|CERo!iv)wpgn$bFp#qFQkfTp@`q_M}5AD=|jtxj^Xpj~e(k4Q|c z<&SepX96I$$#p!f^iK9;A1vpT`6&CxvD+^bCi~mJz$MAniFyjA zkYzJ9)6umv;!Qz%#@Z->vueIu@@24L`sYS^_nlSFA7(m=#;XA&P5d* zh{XDc^QDc3Tj?~JZIg`dn_`Dy5kh4}iI6euPT$){bDmwC9vS6?XGh{IM8V@JO;Hc2 zj>geTDdz}MK!o}Yv^-6$gP}rI>6g-7DfsP3*AWNC_H2mzN|h!~Sk;h`Lgb4CE8UC- zqwYP<-j4uPsxk05d?lctv`ve<_KI-B48IM$`p$k+41F=d*!Y`~wCsm;Bow8t)%zAn z5y;dn(G?>sSvWjCpU!j>{TV>*TA1b+cYJy9v%?1*_=Zl8JII>o*lc{MBEN~5A~|?E zQ;LVZq`j+XC}`-tj;9sE1q@LkqM7%M(};GSB!1WFNmbfBb5_}Fw_eH@hkX#d^)@19 ztzKn_OmH(6C(cRk17S*Ep<@tTkH_X+zopmKOSE;n^=rw~{5+78FNm2(dn@TeVFvOv zR}mzZIs&vDk&Pmmt;21t+?CXw-{6Az6W_KU>yr(suVrMB>EicL87$yYL>?AFgy^leYyfH=6@4Y*JFb=#B8<1Fk*;0YQfqgDsw+hQ9ZqaI$jMXH9O- zH3HSr9QG+Xnl;a4UTb(*D}6)UwzK9gd@wlUY2DoRv#ji&y7PPV1J&6cnB1$&5Bh^@ zLi}Tgosj`I%LH@3Z|xG5c7roHB{0#hdPF&EpoOhS5gU*`1gz7~12!(2^J&-e#?TL^ z!$~)5bZLY@1GN>5eIDF1!r1P2;tkoz4}nV~KtDjf?) zRj~69V!0piyJKKS+*?XQVzXHfkE4<<2wu6^8~^O;9Igz!!Yfl?o=-ZZ> z*F3UTQ;Kf0(LO9L?q8zSDzU8X7S<0lS1rJ$Ct7X4rCwhXk?qV6qm);6xT-dmic z6u?O4fWsUg7#qe*y})KO3iWS6sDg+zM>2)-7oL^ z`7?B+{tHvb8LA{dU-SxZHJWb2#S?iBE>W)M z&?|)4O{3qse%z1%lKZ@Xk!UMF=JxOf)^R%7lX)*$7Mit3b#geXj`OW%DkECfk||fT z0qWwlHjjN_tc5=AV0Y_$Agpdt>xCIif9;QcV)9fCkGkpr;=J&A=C19>4wQ;IcB`rLRcXyf?-449*`oWJ?2r&0gf^|(nRAJ# zc;fT1=Zz9K-s@gGVF{s0d6p1O*p{Up?Jy&kt8YOXw*()V;hXx@1|c)e{Z^R-GKs@! 
z-us$omrs6T0Y_F_L=Mll?rLYbN6{qRTj{K~_QY<>CU>(UP8}{SQ6BY)WAbwDsM=27 zl71&_GKWm3MXo#Qimnx2!F5$k&TuZE5OR6-#mq?K1>DFISVmC+G zN{>5W+T>^MLi#CZ`dnmfRc`%&O8RF1SI}{Z+DTg>@yDYo4knx+ld#$-1n=H1o`$h^ ziZ64H*lMTJ(-8^GYKyS_G$eiJqB0lFzs(68=3`9l1+7wr=gsoU%>tmUk z;ZmXr4(TgmZ`J%+vUKT_;Hi(N(v0+_Ej6KDkUMK1V;8Ywb3j*lo#H;!`Tq+hP3XTfLEXEMbt^-FiH!QF} zSN1=tvxhqLJhggV_Wkq88XtDHxOdUKpWky%eWTs~7-f|;l9M=`76h8=%*i@BG-BgT zw`gB{!D(LGccwHxaY_MJ`=l?QVQMRw)n9Y&X^qIK_i{$Mkp7*8#VyF?24@{N&Masr zh43~=`NUI;uhP9{+@7g_rXFEn!Idw@sdU-ZBd!F|Zj(b`(p{|3xk5*QrYxsqCU4bg zqpE2Qz&ud2%~!g6ITN+Of~>n82~J!q4dhG%&gPQSC(;UVsS&<=C9mvxEo(Dl1;`b7 z{Tg>qQ`mSlH}4yWJJ z4dv75IKt(_Wj`4oO=_hG_}ZW|;Fv=WY1pnS&zG>J-r;^V_&^D1Ut+X5`Cw!h=5EZ+ zS|fy|cV{Zr<4u@TR$sVz$3^)-{xNPh4$$|!jmDI4fFCz?99f_koEKhja(7;|_*!S$ z2YG4Mfw!E`3Ibboe>9_mXwRV(@K@KpRwFA0mb4WOzG|wXFE}Qnp1+c|Vc$ zf{7&blKD9NuDcbcR|vza(wA`8Sb(^Cq%o_41&&32>e&8p{FHBKG^kzHDJJ(LI?3*O zTU9vXj{6-$HG7QJ*o`zZzp;tLrR}Vbj~$h0*v9WYRXOF(p&wU26Sil^wqt0{T=~=A};5kX~ULW?SG9m=h!(|b*I%Ji!UR`ADxQl#QBzZ#sC(PD~^3lRh#7yy4fhk2a0Ce^|+)?wc~e8AL`cZo3f;I8xXgO+fP_@rjRFZp8= z5y{imXgXf?!8<(Z75;V*_b2ck<49cWx7YSJwFN*|>~Zu5a}5pLrMEkxR=Vv^vX;ZS zBeud>cXu{G9=K}kneCNmYxvdlnB3v#rH$;=AYPXKM}hd5ii6}VT1fi?%jJOw4N;dYQx9$g59S~n_DnI=bb(fsmLd~AoMgd*Pv^ik>>kZc*3Tuz%0bn^vb@H@shBj!6_j93_tyQ~jlGy(lCKXMyenR?9`PL(acZ%CX< zJM#jYhRTPt3JmYQ;2~0|gUm3vc)bvyw(5oY9r?PoY1=kyPFZrY*B&pBvRDyysv7WP zjAyN_?)cgCdXW#8&C%CsSB0(2fUe$0ZMVc}hS0cXSXni!sM71@oR4-$0$~nioL+v% zfsrJBD}YFlr&+jp8GY)AkFeWv*bHaC$~@oW@Obz2#9*b|Cvv`KQ?i$nV6Iv}ee7H} zEsdG*lXkaY@1wLN1;xEB+18MgEW;grJe>V{GnN9)(c`Cy$Bh7TLrl3Z6S;G3z@uj0 zTfA^0LkPaE0$`@%tvzj&mUf=Arfm4S@|VXPQ*H=TXNLCaZVosw?zbOZFPu1`FXFHtk370^c${hBYx&({Aeu{&?ZK2tKkgphoupEcQ5B{1O5zmG;%+Wt}PEsFm|HhHGL1G z;ECWsaG8;tA;TV%ZYzWb%vGT$AFd;unQdn-mZjcYxUiW)zpX9HnMD@W%4)KjW;HSG zkB1_HbPaKJiw!r?(G{Ss~ARd2At>7R2{KOSZtYRmcZ8{wUzHMPcJX*3tJSd^$n1z*;QXS zyx(=_oInlowQjVc_=j~(M~XJc!dU>B;Mh#mW$bMh$**{tA4sX+BidTnv!B4vTi#wl zf-TN;7O4{bjEOX^g)@$rRfZFmOW>I`~( zf$g{OrR~Qd4m?cHcHF5sO~ucq;OX~qCQnQz>t%4~?em3R-qNbSjE^9&w!nm)ufl-8 zCKCu1h`T&p*QGxYD 
zoZhzXptF{V`?l6Z5?3IRE50!r%X<+6M;P4&>CW!<$BxdqcJxtL;b;0U4Od7yuU|e# zS6HPniL%=6m72%nCpJ#UfSuP!XJ<|)>m`P(=WXOOBv?#vK3cu2@#%4L+ZIJ^9TogT z_uDg^;=U)71UBhhAm`qXc|iLgtB;R7(frlfoLMtT>WC{NmvFBPJbGlO*$Tm^9WUD| zT&4V?Uj{|ejOt0v!#(6KSd9lm3LD1#w?@uNIchx~Ty;8K3Ord_>0Qd8mwl)C+(+%b zpyU7J>8*pJYXARXVv(gw>5%Tur3C?%4y8+)rI&7yloDxn2|)oBknRRSS_J8iT{@-f zcep>#_xX<*m}O?qb-m+N$HiX&M*l5a-5vK;S)@!0>(*~|m`Q8uPQTZDRN}vQ`>Pe; z^>D0*_rI3qI7Et(?zy^4X(5D`+2IdcVyc>jCpfE~0@EPsA}`(+)xs*!XA5H)0cZb9 znh}@Vy2~?Z>o-Yx*cR#n8z12LZ<yY$NdXoewMCbhIFs=bw?aamHXF$@tMf zX2Fs1I!PKdG@V#qu`4*i?lPk1cI%_yyVGd3W|XIel_Tgo_t!kHbD&J7l)CWumI!1_ z1!@SQphzHHCFW9l;OC!TH}>1>iO-f%dcV(IzB?5wxK7um!2a^5BQqqv`RWaLG z9^V3%X1$Uo1Y~=Tjtszo5oLaSg_RqL_#9A{Odjq#tmQ|;wef%V%G1c1ZxL)6yacrx?!9I#8bW2`S^Bw6z|GO|lSg zI8lloKxbyZ(D!Xa%MFYN9beZ0RSgq|cE|6-D@=8rzf8YBf&%4!8&?UIIn_U+kc=NpBI!Nr=!#*YG0C81<%rOB{3C|G6e!$9sv{?x=qRIWDn2S-&wMyo>*BNIHRHR051Y;@V=h2gmo zb(v6o=3aKuaI>U{CMfirr$qYmJ2cV8155J|$Ltv+C0y zjOaMg*JmSKmeg^AqAX(VQn3eCVDc7KnyQtit*9VDBB0V z-Bwixd4TQ3`NEbD3{c|)m^UgC;Hg?P^YrXrm)Cc%SPpr~+fpm%53e9}zMpAP)0HG9 zp=eW8s~_pSFN033gAeXsQ&xe z-s*IYGCUyOV6BH=ngO%alG?qkHOUw3T$-FI#>=_JdFadX=v3CGA~%Jt$=B(H+Ck8@ zr?vDu3d`M2&t6)EU(Hkyqw}dHSf$ALKIJYQJ7Z1s$g?7Z(&jP9$7l*Ek0zik9!W{Y z2a&6<1ebep8)~v!K`a@LFEkt}o!widfN86oO3oVYG%Re_<&nu_-HuOw zn{10ZM}Jp%AZ6c@m9&jp?tc5i@gK|Nvv4GA*Pq}iCas*9I3k;@1)2@(5Y71E9Bo@d z!1TKY02T$tYY37pnSQ?q1rjQIgd+CUou6BflN^>||5aG#Hs4XN1%Ia{v6sbYyWBG- z%Km{>_2NHS((Yc#i*A>$OwYdzpayS1jNta=H4~m9=oxqkuH4rw7BG(Z>wS=Lk|i** ziQ8u!R1WH&2kn+|Sf9PpcLXDJw&$3OuZuex#nBLiF?7a9PMQlo&Fkunc!z>{bX%ar z(IL@1gG9J=CnwjnG!I+GdFm0(wr!zQ0mst+005AaWWtC zD3N{kwfC~bRRudt^2Ksq>z3J5+0i&$falP198XTrQ`brX`@-xqSz0p;@# zj%Q9Z!H{U@CT?>P`D));l&b#QGMk$=T7Z#eziA6mPTS6AC_38H8F=;C9ny~7-kvH= zxbUt|*}~Q1`FxVhML+Fu$-PPMt8ToRB%kPa)5s3P&mu5-^@L^by5>LFWHMI!WEZ^O zjXMlHEbjS_t&b%){r(BRy(4s=C1KMlC$lS>*SQz@=GrezX&htzXHk*rI|-6G?8ltw zpyOjCXSZJ~qnF$p#)zGj1gxJGOF2ZzQ~)0&ut#6c-~8B`OOY6Zfu5$fJ^@BG17{ip z=2uu*L$vcn-#5y~s}1USN2mu`h5LY2t5@Dh4o)C{qVt#S!+V|)cYQ~QVNl|aoT_WT 
zxz?h)#3k*?ZUYg|p-y$aFB8NXtCu6R@!aM_iimDlF63A9H?9v^@6P6VD|QC4DOl3k ze6e?KrK=^T2+B{#P>IC*(X0_18L{JumpB(!ABzsa3EN~N-Ltveq!&ktZ{9c`{J3QO z+d_BwWZM8Z-z*ri#fq(+jV$pxzi=Q%tarM6q0`%S}d;vyj|fU40R z3uDS`rQHjenPRonHIA#EmcaB|0AmD?nxIMNph+M1 zHJ6cX9{8&02VC0hccDlx*uQtB#9?n(f4Aa>03s_o^_imQl08XX&xSUF|8^CLN+2R3 zgZP+;C-`r}w<;A6mDLzKAi4M_fJo+bu1G8$s8_6e@oKq=hW}I{+Vwx2qtndyHc_QJeSmZn?oo?SLP>B6zCho$Ldttwn@^tQlMiVwDnyTU9{B z;e~E8yQQz`kO*hzSlu>vAv1T+FMtqGOA>vgQd4!_yUT6wA zIgNu)9yL7Gcr%im_nD3;M_9$IZU>vEaM?)0rm-{gCKsbQ(ZI?16SY~L1BmyU`jZ9j z*;7S$^(47D!lElE9?9cM^lQ0X&W#pGT*Ixu5~YS$Szz6D?fhaWdQupl7L8_X@JrqF z)=iRB7?>EJfGbqv6~wD^xXE)9+BbHN??w2RO^zfVY3|&AI5mkNF1C3anxq>{Q79E5 z;pGH!bz>h%)o}T#Z|Geq_PH@wW(Xm+Pro`DRE}p!gxs|DSB+Lads#O61ZUaAdxS22 zU)EY7cIEu0UB2^P2fC0*>YUmHOrG@UdE=kH_HX^>;CL?&C5iD6i_bndwcpT5dUqf+D7rVi*440Pm|y-i)#y%;o5FA2!oF+U z7FbDlASH%TFI@0NjKgIXS#i*o#hLJZKyZ*=I+WPEx`O)}YSQ&41q;W-FltzqUG0FN zTvyjO<2!x1V*LRi32gZDzQ5_L$EQTs%}VcsGXDk^V+#eeKxUpg9X zQv=k9K@Qr-L*3jMflpr*&)4{u5pB%s^O$Ew1t2gFrhOUD_|na{%rIU7Ff&I=$FzV8 z9!Bc=JV4(gr=1czZQjgz&Xn>_vO$dn*WxqWxr>TD|JUL6B7>3s5n~jRn9GW~WWEt6 zhiwz!$wWd+Ly4HTBVldF?*~+(kBqo-dN+yC+ujrRsqIE6E!<{Zc+RnjThX`c!mHy3 zFI~FKElN|Dm=iKIhm}5h?M=vpnl|PL7h^V$r{5=XFpiDIoq+z@8fvQFWK>i1(5L;bts zDLA|RSEYN=ztb@(Q4t+&U^kL~N2K<8OCD;Vc!m?oGnT%62U=CJqJV&_Bl_OI@x>M9 zMD@YS7}0`1APX>IX0=4G4n(LOn8x!EZSDe+*eM=GxA|!g9rY_N;7td@J^8v{;yB8! 
zz03=~>U={GC^5MCHjr(tGd`}=WQ5t+n62*Xle-OR-=!iFTEomg^Q7ejXBO`inM+%e zriz!;2u~=_tpa7&bU&K7LV|Z0Ly?}_gTch3RPWk)lm`PI3H++#vBj@ScJ^xDC&>J@ zEm0q2Ab$zJXiZc6WXTJ+*pMR#{kugR&2p~H?rhZKZT8Vs8`$z+tj%2>Bi2IdPiRis zKR>8cWg7U%uHEQu_$|z&>>BP}J78l`gTK>s-*mbF#4Xbf#S@lYVkcXvJ1^<5$o}Bn zPRYE{2KK4%8c>?QyZ!>=dt}rz>?VrP;PTDwZkb(0V{@qnf~sVh3vK|7xiioCd*qbY zzcVM?7V@`{4lC-_Lsb_7D#A+h(yZsb2VbM}z*CIsgok<7ORSL!L6DT(U0ChP@;K}b z+p;mts8LcaK2qYe;*$u&1p`qtK&$v+9v@Z}(2>RsN@Sqvl`?g(>HkwflWbym&rTU}Uc{>( zRqAx5oSSEt1gozJX`i>+qhLMs%;-fBnjC!WOZF=3(lqgpCB|}v^-Byp2MB9grW1=N zCNBR;pkOl0oThv!*~2X@s>z0}b(Ww>Ti@2AKJKmAycRRwpLPi0zCMY*ND&t7(9nQH zuKR+;vBIFrGlbV!g1Dul{Hb1asr=Pz4!8Vs!O!q3Tja%bmzy|)sse;%8Ccx5)c$T2 z5_1=wBj8Va`bQ!_QHWzlT+C8;a*E{!KzlzTf-5!dCMloN>4q|qp!39hdb~nTi#Z}h zvP%7C3!m}#;nBKCX{JEvLGQ{@uIx08!~3*lwLIKIMLPwXuBI}gw5+XBx{*%0+==G} zWFY`WyJgwqd%m6q3{UbPCU8TFa&$#}`Ef2Y%H`~K^Gz7jqn%CmslBO#J+0Qg$`cu+ z-rVmKZq{F!Qyu9lnrGVO&9B*b0(Lhl59|$&nHrYxrainGm@^oagM5^~)SoF*G`Qzq zzRWb0!Ia2yHrvy~5!g+&i*%UB=m;Xx4du0@j%K?)Ix8-H9wuz1DYtKzhk!r2ORjSX zlNAPwRJrOxE!*Wxr^^2z2KhIpTNCHpqBce{+&_`85 z)LeRDU1)My8qwCv-K^a{m6Iu|#Nywx%GZfj@KM`F z>C;13nh2n~HzT;I#}maU10BP`6It@9<12t%LB|fNmjkroMoVaLT%+v^ zc5|Nbzc?wE=%3pY1vHryiiizt)F{QlP0bF*gjfs{9Th|P`znOCvV*d(V& z3e7q;(rCBVnQtXwv=kw{-%@`NzdZ=lMoRST4AKpzC)rF7r_QfO6fE{l{2vz3AxGFP z!Z$KDv>+o<`zQ{EHFJB!IDO-*aVF8ck$G*~A=(V371w!FVGEo0bG5cJ<+rAv8jeVs zR_{^!?(kwilD{$>r}yS}qXYV4CjWde?W!}!O+4GR+cn^n;z%)ZU7WT8R&@)bXJG6x zDvVYvoX4I3I6mK?m=e!k?;h-cwu^junjFmU2`yy?wn=Zlw!OJ38rr+e#iZz-o~ga-*SD1v|M}4--HkJ$9Va z0?)+kpiumP3@Vjl+W1f0X|^m+;n^SMJ7qhurjB+&uqZrSt8yMsW$)?{`4aeh-gQ#cfGsCYtEjp(5#BXY1mBb$~ zHCkhK&+|fVvigMcf=kCQ9hZl->$bbI)ueJU#wyR{o`TH8WulqmgrhY*s^Rgti2Pz< zHgRX=ot9vMgrwgtsMfd^?;|Q0t)0I$_>o}0pgR*)?t$*^i#5eIk$!HrLzbxp8<|^) z$e@*`UVW|3HRzED=+N|sH2X}F=3{FQu))$g*_uS+OHE){tM2+xyRPcjV$TID&l@$> zZvdvh4zmZR z{5@U*7Lwm;prNtf;^s**kmJTBFXt$qvPK1{x~Y^xTCDsNa7|aXZlvuuFp0ecI!dW4 z%9s@sc<9-=q4cenccKXNW${+6#MJ{;ar{vkfxrQ!`x9zE>8 
zWkTh!88yz56G}~TkzrJKv4Wo?Fr~Yo4$$81EeoRcAQH!x2OqfiKVV5ML?jPYYfyjRlD%rmUv^}%8are3W?`0kyr1f3u`=6aV|~7=mA4z7 zEUzgi3yi=4<8|hOZ%EHp&%X&LceLK{c`>xB23d65xS9EQye55gp`Rhmj91x(z!Sb^ zy?I&Q(|aQ>(>FrD{2f_Cqo22vaw49zk++2PW8B9Pq@vY#N+UKLTzOEX#>WUaLfWyY zxH*Ckh2Kam0ItRTx|^*f z{ekmHkYs9cj6^!ANdr4%pnX*IkbAxt7lE>53O&e{@ipC&sl;`xAl<{4&|lgTv}%D0 z2Re-KECh7ZBKmS-JC5Guv8%7*g59KZ&E6NbzpfQcIG>p$IXKvlKKS&3(97cdnG;T^ ztjcN^P)5Dd!&%DZ4mgN4@O*PlFP8D0a%6d$UJ_c+rL}cW|M@f)6z&btuThmU@)sHO zC5cg4&IDt>fFa`b-`70P_rj^E-snRFdXI}7P>vd%{MW3C1ofhTL;alex6iv5(ytze zeIo^|Xs_MB2>`btbad<&DpSv&V1N;rTe;oDX{=!ObCYdj{J4K8Slf* z$pC7hcod^7YKXxiV?v>)?Q+1dS8OHR8J-n_na`y1Jw|*ndwhRrz;m5NpbdOokeK%r zJ>wnYF7RmiS0>NdnAkQViO=3x1xW45XLiU~bzi7Df|-ekY}fq;{z|PkEd10pQX(8# z2s)HR_tE%NejBkJQh>PoW@HSW7=J0?UxwpWTkX!M0qZAo8|Z_%GO1xw*`aQ3Rvs_v zKgH`@d9UAn6dMSwRBR{J&Wt+gRigIkVb-^>d^STj0w+%e0_Q(aV@AYz9 zzbxHy>jXdeS?Hck!Qs1#tlpIFHgNJui|TD36*nHEnBE)tXZu~jJX#LH&!+sQJB~c? zd3iU~J&+b2JlM0usQ#!i%NrLb&>?XORE-*I#h@6x34g}nJN87=; z5(4RSmTDa6Z**1(r=H}%KuQDhC6s$`YqZ#WpMk@4^wJmES;4UOBma(qBz7==?1(kF_91eJ3)0GM z7|&C{hmz&G2>Ql9n?Z?3WI@z9Is4EjvOy_6rz`a3u8~BI)z6nJRQS6JoSTQQea5M2 zHFgkFBVA~ASgaBss%fq13n-Rp=X`oDt)Pco(Iy20;Q6%OZqzG2CW88R?wiRJU-O~D zL7ut;&}*I7ORhLzB8Jq^u|`=xxaiJ5c8AGrSG* z07#{ojN^>l4nkjd4c59g$^wMkbpvoc<|*pdpN)pmpLR*ZFfg>CBZBJxv?4LOS&75Q zSw7f5gWFVrYv|DPs$2jO_iCDvDa0X$F_w+^#g+v|xhChvLh1=}Cr1};49DGD?^%e29s z339##6%|=%bOlFp6@zpe7WF?bXw5Aoy~_+{wCW^KO?`r=Nq|{8?`MhZMe395mU~NL z&$ql9K(x94nQ_5wO#qy8H4X3Vk;UnlaE;2o+pYULt@VI+=OQ&|T)W zU~B#Rw-&AUQg8$xJu#=vFr*ofHQoSPQFEcb&Tq zm^s@+eYKcVuUK;5WECe|``PFLDvZA zrc>uyr@UR6OoruR#ti5oszmbELGe&+3^f9O*y#4+hu4bWm_viww3Pja^6v&&x zlpWpCms@u&b1{GR^|9H22+Ak$Zr@w>aqLwsemy?~ZxL!-{{G#wY0Pjo69?;xXjNI> z>ZR!MW+|~PQ2DI8-tpoTEhIOG9}nBIT{QzaxR)umzSF_3iAogtKRent%O|+Z96v~vgF?SbO@Cg01 zNk)Ub7q4>^$LK}G6{YLY_=`LI%2lXsa!!PZPQfby%V^)B`@c731ts@9{_kjaI03>P z)bO{-Gw8anJ>_Gid`_HsufEY1Z@Te_#Hc>sN$P+P;Sgv5O(6(aABvvl`(@7&%uOU>w{M8RkLb4da z*eIldYj%=BiUEu8^Zb4s(7Q~0sS5mG^DO!Z7MTORqC@e+IS 
zmqm@dtUUObH~Iui;g=r|IK;tBF{|(r;(Z?#M95HGdhk|Sd5anbE{da z)3p%X!@EnQ_)Vh$Srg77*|yTo!tBT;v>lo3Yv1)RG4MNH?JpulvrWy2cka1;i96s* zW343-ex}$RnU952ml!wJ zP4GYdQOZ}t&@_yV_J4H97#uPyE*MsPC^-fX9dgAEXfA*V9|;66pMDR)FLHy6t6(Rd zeerTFZ(k$*VR|!Qr%BRyJISMkLu#P&^s~iUOK5y_Z~Ewjm*fB|30!ktSCHR|)Kq9eOoY7RjS%KvzQ##4-2rl3YlQOy+qw z1*Nl(Q^*__tmXz!cQc4dJ;9V7=X7PWq{bR49!K74R_t|pjYl7e-T%!PW7tZw) zFl$Ln2N~Sm?kg6k=UEOb7IrLeqALj z&ByQMfk9?5R76E_X2Q1vggB8}X&8kw~C zG0JJ`w_p1xopMxzYskVDH`r@*VH1E^1syxg*8SgvlfN zfs{*YhQx2mfI}YM1b{YJX}+M&r@IM+Q-Vt?{jGxHFP7aRBr?Wa%z}DSoQouSHVV5- zqJ)>?JuutX^3S6dsIYtq>!c%0WnR8uKh!tRUKhqWG_TdLa~l{f>ouqolHdzy#&$CW ze9$AM3+|8eJD$zdbcxtXFzX`yPey|J(;6}ygcUyY&QgVMVGfKU7L^+I2n@^N=j=^q zy$Ob`z;y`hdaaKp4GpyKnHn|* z6s&dHbr87v)cn|PjO}E%5yp9I(kXS(&sC_G2IvOobZY?qN?-@);i4oCBY#WntZbkQ zu({_T`l41m)(5$@qqRY)Y#7?0rjhQbecAb*15G;t`XPy=rM~k>nH!#-j4Yp2K@DFQ zK&&bWJo7>jGKB4d@N+Xc;1-2Js*i6^Eb$V)X6M>qLo+!sBf~uG$Ip zlv`>1XJw@*Q8y1AN<4-#a`~BgI9E1KZ3#s=q)94WmE-JheEad$8r|D{M$tgdv)^hE zfM*Ju;VOLY5DmC+p zv4l5mEeQ4G@1KI+f@U{N#`yLsWVIiM8IjqiA1B4S#RQ%kYetMFS2z!=gE&aO!F#7Mi*)Cm*^ zz%lh?=UOAEFm{C`!%1m4wpYmv4&u(Wv@%w`dUN3awOOuv@$4T$~K%>a9jZEvksV1Il}J zRO?G!k1YtWO{?=SmYmi&+esGJ&_a$svq~)H9=zn{+TZPvde2qcpdc9mUDTW4YG%wl zIh&IXX>Bhi;z!nT@5&S5$-G;eqs?vH!4HTp>kB||K77eTMi?8C;yxqBLeQ)yP+t>v zyYy-&CrS?wHZT@R2?csU0Ot=(lfDc8;AkU6ZjUV3oqmVG?%yLb{;OZTjI@X!#1Srb zQri#)r{{j^iXVDii?~bniZErd?LPDe3?ztLarB1&hQuy-L|S45py*wfh_o(@{Q}+| z(cPI@=?8>xlk?}(iBQki8%D(9w8UPg#JBzK-x2brJC_Ds5dIK!U4fhe3w?Y-z8rw^ z1ySwr7U!^z4i4va=mvVTDM+uY#&Pnu+*R|gnazo*-_{S>(6N^vgkbTVVo;r2)2cV) zrLuD5$;VWct6<|CKWuT|0B>82jJ$29~XsXh(Td{!;LaQ z7fJiUh-s2lhuz=zUbEDz#KkQbjQn-CohdSvmIMRd1q2E$KXv!*1!CU2T%D%Q<*tg{ z?K)ES{faX=HX1C-iEZC5W$$?(FO803E*{stV7bW)fw8hmR_;0fL{D%2licw^*Nn(A zRJ7a?!+Jz&^VQm(8tU=d5~9PikB)8bafx)5Gyj2WOuX*qq{t#X<0AxL(Ijn#^Q^eP zS62{Ly|o3p<8C&uP+uPBIO+ELHz`zIr&jk=-qeM(UcBv))?0MchF_UFn)JhXY~I?^ zmWVO)W^&;RI?Z>m7^G+2@e zO=1_f5^S$0sPZ&xS~gdC!Q%ENZ2Cl=CwMd{U%s9&+>^a-@)Pybg!l9U2Hz5Ha_V$l za-W0is|~4>P}^yZ_ftYAwJLRw&Yj5%TElWqHYw?)LZ-C#aaB2_$r+=%M8p8Wkq==1 
z61VaD)E3=hc6hD>otKO)zB-m}!%BllL|+sZI|by!*}ppU;F+W!`s|yV95+-w#|}Oi z%h^d*HK7lS^{A>0+3_&7nH9_RC_49c3WOChVWakCOP92jpnIav@~lH^2OA4aXOp)c z-;#tGDm>jLh)2#_yL#lZa!ja^*c)MJ6VV9tt(5r9^CU>Jt@4g&??K70;oXFWj2NfP z#9nuDNNd!d)TywBXskR9*^3(Yxd{;xdrk}wX~(boA1j%(NW*Lcw8AFgADdjyIKF>z zewIHo39k$NrVNz$$HW45E&o_{39?hLVr_60mLF+OUVc5@Jo&8jUOZtLjq9HPd4xc` z+iro_IKxJf0BHJ)mIOD8ENYNQqg}i9D)mW!2VyYa+Gkyb^?)+q>4Ugl!LyvUmGdl6sBF@#;? zHI1^quZAP2wLF4jW>YK5#*~h^VE3y+cI)f;*J=xM)2+De%LX!?HuyGJuJoUIBp-ao z`Mfx8boA>!(SO}(>@Acs-XU_b4dCCFOD`R+Mueh1470EQ(<4q2iFdrTg{Ei)v)6_G2zLWm@uxT^m@OVb#5vf#fjwR z%_}v01%9%M-SiC|@)WkA9@$KNmFnRAlJDJ&nApln-bdGR877}iZN!^PajvYT3jT-t z)%nDG1KLMoiJ%KMSMK5d%UN!|qayBy7`8f))7^~pzrP+=A42Y~jztvcPs#0|=BdIq z%)761m;~TB&xiYpf&xyQ_62HN%fP=n8_Sw>6C6}9&e)XW_NK*ZUxI8g!@l!3|Ct+F z+~^A!cg$K2cvlU~1yIBK0YFJzLKjwYIahL8-~pojvSPmz3U$A6 zX+vt}TyJFa)a&c?=!);EX5~B*m}gO6=Af5TTPusi3*Vu#l(BF}D7q)k1* zA+xuyesAQ4rZ)(BH01PJstViwX@|ZERuqnjyz;2W=b$vG_vQyz3c;n?IU?^}7H_jq zOM2NfLAm+bO&;lNeos2_HD8aJt&horq5#__p)A{aq`=}Vrx6Gk)_a3FcF+mk2V5fv zQHm>w&y4FxPk&1tb;WZu7aOc{8XKQaCJ&G4709SI6Y!$1&r8esJzbxVuafH>p(N)H zMJdBP=z0DpIGH6`J}nxp_;(;g9)(tZ;1$BF2MQ_5&vk`A$CN$4-d;C4D|A%4bGz=* z!{7(VnYB-DH@7QogwYr6F9XkX7l$R_ii+1_3^#lzilb@B3z)LNHDbW>*G6$LQEY1Dy4`J98zV1+b3u!sXo^=^U}qi|*6{GBdeMGZ`pN%SEn5fc53dAt6!`+#{O%WC-i z#ib9j#Cd5Cfql@hP_DO@<4H~7_szFeH#Zc*DV0)(N7fK{9lWm*>Vu-UJ;$_g_LiHh zgY@ZYkKebEl^6d3Sy=*_xB{{V&0tAaPu9n=xTEz}gT>cy8b4jb4r0n(dFmn#la|lxZ zV9pjJKS;WK-eipEjh^=Fw%Hs|g-J3VOD*dj|C`;Zzzp#a!v)U``j74C5r*t7j%B(7M1P8{&r{2zDGf5}S*Og1@!v>)u zW*R^7y3yoTyb+ZKFFZdVl{rLry=@}UH1{AE8v~3 zmPVjeEquCT8GoOL7W-va{AB*o)}Re*yW;)oAG>RQp3Sg2_*Zht&Dqbvhd~zUAIiwS zikYjtx0T#Ss(;P+t&+=&_xsIgI4}&(G!&vC-N|R zjNzum=f-7vz|i%E{lh#Bk_FX{#!I>Z1iZ`Iw9c~n`Z>+7pc z4L9r!{f>XO6+1E4Bhl+?1#z{1nUJ4h1BX$H3wNb-KDBLu*eg3`VawcJ@Y^yzs=4?- zEI?hbsS7rfD5Wx0u&G`LWQ~Q`QTt#nKVDhta(k(Td5!nof&Zl)_GFck0}##^(U5x! 
zd+qxK`IyBY^G}v+YtjiZk_vZdq>aU%-^?1yi-AKrfCD zuO!|Gjp~|pdHTrzDt@BjV$^p1Q^fHRj+8`Sq6CD3sSrf-hm(B~|7Zgv;Ncx7)#t-1SOcWML)?=kvb^b>6k-gtw@J}jvv z?9WM5_%QUOKNxL%PrUP^{YJs{5NX&H)fuI6(z%3gw(h29`NqNiJ1)Tt zt78C+Z}(eQXSSVVIHRn7*x-=|>kljZDDAN6OEap=x(UR&*yk)#g@*gQ$N5ZsXASD> zkUk$eR;qgZjKxV2v)zySOlt;aT8_bHqmnff()Mn7J(cw&sP~@WrF-ku%-sox=TeBM z7l6NGnbksLM3o5A1$WWVMBc2w)>43zX`O5+KDq!rH%H1T=AMkq{gy=%3_10)Pn#i_;ldT7;6t+XHQXGq+rcLB$cRs< z2nqbIYj$?)*sFWDZPkRw25<3acYZRa%cS>d8O9U0}9Y1-^+ z@j*6&UaZu{u40ZF8l641LNXwbhCreG&BE})0sKpV=Gk2hkuKxyw3J2N zXRbWeF)6NcEH+k1VL3Jp=6fL%st&f4L-T1FmX?1=$hi|6UJWVEahrNF1tQg@j4iZ- zikuG3svkX5 z_`D3rFaxv?ObL0#lU5&h_L)uKlftseghwGm|NMJFR&$ZL=z)1CRusV|+L>u;?~+7c z(wdRN^Heo#nA!4d+vq<=F@m-Hr#*(?N*H)|q_V*Sh&=qTLiFoovozNgx+Ln9^NIJs zdyDvv{r?LL%|i=G!atTmZ8z|mR_r7TM9()|{fSfQ$Q}JAqp<(dmSBdN|k)8RNYhb+MTEE;GP^#mZ6n`%iw9to}2n z@$o&$W^Uee9)OYZzl&ZM9Y-9Xkw_Q|)HCsoFws0!TPDgiIw#sW3P}Ua6Hdx2p`^{O z{o;yEfBrSqH)o`M2q{`omAcir6d&qsdK`x69)BxIV_p!^jSY=Kz#lByz>bHM@Z3Efpgx z35W9X$lStxifj65QReL{;2zsMW2laFzGHNID{2J5fJ9ZaO9XssU12#6(h_3v?!8`C ziWh8a84dS4Km*G3ZW&IMc8B?wpCgOHd3binu)65NpA%a+JJOPs21tv>M+?oJX5tqe zq5!P6O0hQR>?8b$-$`9>j!@lmHMr_RYAtU@+}y53NHpBo_eL&0iSgIG{6FP%vYl)_ zN+Cd1W$cpMnfkr|>S&DQMnW`Rv%35|iEyoUKIG>#cx2g|z$OXQp&xYmMNwiY=H>1C z?$zKMO2#|hg8ZR5?f##>3&VTo_UEX^TL(wRULg231Ld)UAB)CcFJ2)@#>(?5-wtgG zhCefXxW$L_=R)?{>irRayqv|o$?#sgaJj1c;^AahqHRUo}5BYPyHE2B_f%f%z&nTc0ck$+KZj4P?4HKiMv4;?jhZ%&ikQ= z{m2NwnV#s%3WKI2F`64)W$Vi;_&~wiuk5vFcuBnZ6@Umq2;Z=%C39gJnI_LGE4)#u zbN7Uc?2Qe-q4WHAg$<8_k}m13ZU{&9^VT7|K|DWvJi+V&xjEEbb{D2r!cIArUASw= za@oL7T0F79Al^*2OieP^*a%xB&4g7u&*i8V&a$(kl#}&anTS*5(Uuf#a@T9=f`b;c zHL4iukL<%MpK1x79VCShsq`$Y`(Ddp*u>l%hV88%m-z?JnX~aR^4t9pFE%PyG zu3VE`PyI4^L-vJ*2dOJBhnvxF^c&Q_a;WeYDVst=-iLjiMYiuHw25mpbx(8;eS#fj zu5dpKf*!j6Uw=wGjF7y0{V)M;`wqnDo1Aua>_f8<&4uWovDLhCECE!&)O40y9U8+o zuai2BRhl+Q-x2N|Mh{x!XOznZ=8b~uGzdw?-2GG)u*2YYrjI!<6ZXl9wOMhwrS+Zu z>D5m#|Hx|S@}3ytd%}ZJo!lYF{v*bz4)UxYU63@HR0)VvM_Tw^2%m>J2W1U&UI4)i zXRP(9Ez{kP<0D(O>%LI4S}EmY;Y??fd1?IUZ5j_Zu&x4&#LJh$U((p-Z~+lWFFH^$ 
z)~K{9dn-m?Wwwm<{cR7nQ&Jskca-2W%XknIfzfv9=145sU zOwRV=GSACOcf!#wgwET;i1j_1!-ZhIgW7qO0lK|7eo-Y&*!@9=X6b;ozQm%1u%K(K z=9&oQ%qA$rP znQq$SKWcvU8mp!?cEVPb7UG(7l9dZbQUO-nqhUXLt8!?PHZHdpq@zt0ZhxH2`{1n^ zrMu&D*TF#IiNwny+wS&Ke^^Mv&Y1Jb)yY1NAs;D(<95y*ce>yDzoH16 zQLyj?xzf-?$?(|D)9(vCm>;HG7LA-c8@NE)`ar zwyZGiQehm-@o~6E``u5a{0x5Q`|-F3xOUR80my^UVJO3Ph)8?t_SlQYAF`4Yl(*nwu zYa9L_U+)1=<@^7Sb2^0<5|JaD?AhVi4U|pD%3g60$4piVWoC1Z5ru4ynOW+{I`%kb z8ApW_QL_HmsdwZ3{(gS{$KyOaZtC3keO=dUkLN4=I(_^Cxv*b}-9k<7i1*Xmw5cut z!T&Z?i?x}rl0c-fJ04LJt*;LN5tVcy#MjIVmZ?H-QaPwuUw^3XtTFOmwmOWzpk7yBWlPQwLaU9ia&6cXsNGr%^Fg6DcBm{cyLSn4%?~D?Z{N&w;1xqH-=efHs=<)4kH-ul|G-$oJAkdN+k8CUtYB`aXf54ezkmWu( z&C1CPoN^2@n*Jc;eQ`EQ-S5l>FW!l_(Ogp=HWn0eocBHa1~cd0E9T8<@0}pE6+D9J5ub=9m zIppi_!&JY7&o}pB>m}qIF;Fo=7H(` z*YvLg%Coi4aZh-!Qf84G=625ax+E2H%zjfo&*c>4{eu&Z=PHQuyp@Pb*# zOp&bY)dR&|oj#PSGZkp;so52Ts=K#<2rme8J-D!pF^Qm-9hAF}sknT;K*WARl6o#S6rwLeop<%vSSVgaB1!%-TGOC)q~4~{aIFjK^xHHnYXu8vvD zqPaUw9>7cu*vu{(p&1K6QZ`d8)K%&$8@kFC>eU~6dr`g777hxM%RYm707^kF6#n`pjCbD=S zjS&khjiYLEZ#xH@PWWgo-{po;dxvUq&ynv$g|!ljSEZS;a;1Khkh{gJo;4dT#miT4 z9&Kdm@CyMO-V6H|J^NsOEA;fuGcF>4Km)-3Dv#R%Ko1D!Spm$EVg zG4~|R@kg5}$VB-J>lDefu57Z`+j1WsonchI*(0X<`VL&vVRzumu?kQu1W!PHW_NP6 z_q;a&YGO{6-c{+$Yu{p(gYP6cBRh`cH#wl1Was)!frHlt$i@ zj=#Eta;_>{D*w)m>HMcfHPRNU(z{EPLKn>R!dCa{b+HJZ7xj ztslv6r?|-6*`AAfa(Bh8a(1i%%APE~?h=(0HiIor+4%Byuum0w$FA`9VwLZZy}{cIve)ci+A86@GC0j8SNY-jX}-@TQfZEZ5hN0H4|!`FnQIcW7?~ zehqYTDhqNurQ^%%8xP$Q6e;ZAd9B=6?4_B9-MvvPw)h$L5LMeBAS-i69fY?3#In5I zPoHMj=zXqg*1`TMPJOznlI+wr5(KVknKG8udy`Pdo<-So<;y;Vsm@=VGfM6a3A$El zoG&i#T2KemA`_uI?Jm4hAHO7w?#hj572A}#TSv%$;hY`NESVtHg^@DN%XO2Ni*pHR ztJ3On1%;p)IK|d?hszee*M9^h73l)~ymLl~^exv_0G>BPe&)?yXsr2NW#QeJ`nt?D z$y@=s6u;Ffii)}uC_C@WRgF2hx z3bQ34*RFra%qJqp8g{-794@M;0dcD9!{5f=4kVGrK#2jYwIQMxwnQqL{jmuUKakex z&NTmA9M`^a(3O(ca(3Kj)WW5YwjNzF zT0}SK28aj7nca_;6P6uAvw-N-GTX~-qljKY%oW)SjBhF}X(U*TWpA?E*E}DvL7X+} zAQye<;%adxO3nKd^J?p&kydG&RKroJ)IsqD?t^PhZ1qgC{WZ1aF5c^olRN0Az7p$# z+!Lp~Hs0pT#T*35gg}@cM-M_)d8&kc^KNu8SL6&wk 
zJlp_oO@@Yl;3;h3l@I$7#W9aodQBU}QMr2E#j-=Mdx+W5^(+_OgeT!+Ctu7i^Im&S z>=SW#cWCU6R^mc|Djasa?vR9H0;bH_;lTq|ud)}l9lm1i9qMC>0Cd$@?wU&5NQ3MP zXtLOTb4TE$rMI14=QkVY<10Z$oJ$uwdd@DD%z+k5o|>g3L5U#zr7J5O{k zUHs8eW=oBC**;$V+%6~+c?Y&^R()N=Fo*}6+wqC%rT@9w(zk*whp#kbzE(xo$JZ1;8gPpOQ);T|$Qm1i|dX0F|&r4qb6KxLLr z@Y!MiI*NE{tVhBr^*H*~u}W<%vqW3~*#nsixv$#;PwBq$DRBP1T*bBXJSb0MnI3W1iGnu#|AF$!0YW-{{W6}>BD+!oAUNmoUTeOULR zy{2BhJ4_}1U68QjFCD?OWM+uKjvSlw)(ZY=Z+CElY>|R?$y7x#*bNz4T=wMHsZ*$e zy8(qd$IFh)w0-_8CV*!;fO&ZO`4nkXP^l;Cj^?qsTX(;m7({S=lM(?r#rzDZbv9VNh*Y@@uHtx z?|$+YgP7q-hOcw&f|hss=`#~dVz;k5%dfnf{}v?TvdbwkzQN%WHmj|Y`~7kM z;sa8=cFy6*WLChbC!cF8BHdx_8gWI|F}=BMW4r>+p0btRjsn0WV}=mTZp3K4Qj>Dv zLybwc_gFt8Ijm}}m?JKkW4HNmt;8uY-mL2yB@2Tk(IX=TVYc;2*yc+~MUu1tkPUez z4MdP~#{<59(k%C5dXg&%f)nF3H_a!DG!6Jk#TzFfO_RKAtK9dT0dmoluFnTPEHX=O>j0MqYTh@C89IqZ_HbQK#TgC;c{m;!1_UW|A zsDmJ3WhGQ%CKqg>-SbNLe1sHsnmV7if_mx8bfZys@^$=k-%3_i(6C)H`+qK;Ct_8;#PGpE_+>z3u(3_3Av3zPvdXCWMx7gGq<3 zTxa4cziW3}nj=!81NO~mRU+CBAOXV_#uW3VlWZtM&ua|S^kQAYD~g3TduXHfUlg-i z&X~2W^h}zcBy+pMd-Ep1{jJKE8giL<4LC9izV|nwbc_FaZl@@y5A@sJkO}(`hO0o(s zYFdgLVwX;AS5I;f%2wY~Rxqz;we1D@$1j<2HyTY-C+tdx*bMu!T#&Ws2{i6-bjJ4- z&v}e^g+({q&#<)dJ2T@kPV7f<2T9e>s|J2?`88n$ zDsl(2M2dQwZ;WqPrJzLdUY-YCZa?U`QnOw~Bb{ZlKKRCm)>UL7)#SOKKXikLa@n@7 zxn6H7zQfsN^r}wEM)cghCZiEC!MW?_BhCpr2z@QHf?3M9Pxzqc?Ykd+e0x3M?02QO z$!Ap}pty}TlgGoCd zzqCBP=X+Eb0!1+1N6|?UU5#sT|Mc29d6H*-2v}&8oJi zH4+Jp9gMLM2V)QJW}&u?Kr5a;B3VpvIYq=V&&GdHBuJq7F8p&mbpKhKthf1VICd(x z?HkL4KOK`<&o`lu`n-cy6u zMy`_bVN2D1pWBB-&s5gAQZ!`t?a+h=n6olmXvw7auNp_u}#*s3$vk?3PV*(_;+o!3eY5sfil&nNFv zs8~psOGH!LXqTRoS^Jr>a5j&{%MA^rJY4H}U+;Uc_ElR9Vs#*+x56j|;g;mn-Iiac zDe>%LM1l-Y97kwNe%NQ4TAQe=I2$br-12$mk5N6G3p?i)H+@u$hC0s8uXP*2a4MTy zy%I|=L`xk^U3zvUMx%J^lg+xsH!TH^lKYqFq(A&56S?cQ@TfavyZ3^5ZpE!V{C8x= z{+V9cti?U;t4GA}g7DFt7L?A@yPp;)r{wcSZ!Ila+-%yR@{_wDTNBo5*PUWnEG=26 zJ=WWEv$T#I@`eLIB*xTAZU=GBSukc9CE^%j9&YQkfh>Vbymu?YlvRrDbQtyb9Ko{Bze7z4KFdEW@5GfE6j*m#nfIWTcxEbvz>02s2~G!aHY z-e=w2fb-r;d^->t`{p63tcR0Qe*vP(dGuPw&syI|hU+U|no9bg2&9334WRl)%tz3E 
zg%c*wSX*RCMXEf0%EbSpiH}3+I76lji1U}Qae3M&h$o3~&o!uPj_9kP^-$6fT%e?e5-9nV=1J6$sowG&IO2nA%B*6VH8f%^7O4-s-u@YzZnM5()}}qTiQHV}#?&q$n=<+f zB0{+r-<(hO66mUX;J0mrSNqtK;GVj%lOIs~_8FW>_ip^9Y%0pvcTH$X^B_hOha*0V zsg8pttCxL=YVuS@Sw=G_pP~q<7)&Iao`>v-H$E6%+F8+>l`=3cNXwpoB`}3%=FQiV zAp7Q=keUAOx250OaKJ2nk=a$J2x$zS;ErmJTjQ3D#I(?_QeD7k_zy(wwR%&uVhcX0 zY|g!pYImQ>{7@cY8btZYAc!Mw`GtR4prVa_d0;KYL6&fuvPZM(OxKUUA_i}g4R#*h z(F&xKn123_dcO5#C}XVkWMn^b&R#_{Ikr14@g^_>oS!OmGSU+z+;31WyNX?Y13S}8 z8U1cT^yM=qm6K=2ZX+1Tr+YqoQf;*0R~2eSOOGAET-d0O`bbI^RMdZ*w@(^BCs?x{ zUDb;_D3wWLc`6U432#cH9BVssb}dXbkNAT4=o&kk=BRbN-KIlq(}W9LF7$m`U}bHB zN05JkA;OSNcFk^b9P<;+_+Hg+7NbS`?xLv5y(^^-?c@9WIf<1wNCEjX4X!XUVz6_# zuDdW{Me81yjrZ=AyHsQ2mvvDTna*VwgPpC72p5}9jXi7KY>xh3L;ToP5)^G|%@IJ)F#VaG}H5Gd}by)-&;#--o^x%Nj!vz*n_cpHac7bH9K;W5bnfOwya`hEvB4 zihwZUA~lrx4$k-u+?bZ+21P%TVM;@d%Pt-cZPO7OxW!8Z#?C_0R9(tTB~riKq`W$0SqYn<=}w_9OHt#4i|K2$f-jbWlMeKoPKhDCBxj$dq?&b8U3=kn6vSYy9+ zkja~p$9wtXn9YPuQVOi~)Bt;CJBFFDh)F>?@6;uF#}?*zv{j6)V&p*NL%71Zdk(s* zBA~S|;Pa)BKK4|NH`OQGXI12{T}B;En>P z@br9RXDKMLtaN zE{6jWTYlRlXVHQix9~V>rS-`%%sUa;Ag-ShtcE@Z?d~AJXwH zP#jf`$L|pw+ueg-KflS$jjD{o;$HXrF^lZdlb;IF8(~Ge!B@=|F7p8Tw16~bUYj{= zfTm2mXjwxf?3W*~>L1=~7@4|HDz;AIB5At=lw-H6{N5MJboz}wnW0Q0#fH308;S!; zB}t&UAGBMExjpLFdTOS>z_$+2n|Z5)TLZW;Hi$8VG^-ep*icF<&Z^PZlQAVf?53Wt zgiL2i021mTx~ivgt!|h*@dx_NrB==ClZ(WI4QHrO*Nn5kr;YB=6rE+YD>k7F$*P=m@ci1=DO_(hzWJQ3RUoaxi~@=B{;nU_ zDohIQPG(K8%=Un}Euk09^)LCi%<6quZcQ=sNI`-L!WolM@@YiD$hA85Xflh-+SaU_ z&!4)B9|Rf3XR3N%uL6nf&r9b^1XDrSwxW&_aP|>iYx&3#D}R`*w+-cySs*sJq}l?)_z5T z3wGPjiSRx%%;5?*&PtOn`XqplY!Y`))P7ob`_5Lc_U79yBFxwoX<>Y;7;?A>2ytEp9%&%ic_kPvKPGoLfgON-*x%Vs;RIaBS3iEmB4NqfCY4!H*wPBlEFQ z@$E;}gXcE7mh*#-9uiAKl)j-J?JDCQ@D+n3uCv|SCd3>dr3f5f#4bpmm;w}1cWf_ z`)Xlwlx0n-*V(EPmUb6iT!r5^9zeeo70aiw7iVX>u3y_RQKJpn z9lkt#YY>`y1=I*jhIkvh<(NJN#jE7MKNRGk7xx2HCHy92|NN(1GHHpUpSFf>Ma3Mm zVuM|B;7bzgeM;$GFQPpb;?%T(zjkk)MC^TVXdRsT5V>LdjW`6UdG3U&fsCdSTV8_1 z*nX`$>~%*y)pre^p%f*6J?lknSdq#-))$ZV=gUBI+QeU-o++uU3{Qmgf-p@G^BKpc&< 
z_oNO>xRT}oO;`)gMupDw;cZggf>3%@9xt>!j_S1Lob)Hv?{R;{kqb^>=2x*EkuTjp zWTMUPmcH8kR`Fg-<2mf4)I=cMB+D~{VlI*1t`=W{K)Y+K_PlIh%)J2MoeR(2yY;oC z=lo_@|K&NRD(8sL-JXYI8$o)O}spHQRRA-;@x>0qH*+xCXsUJvfPLzevFR7V1jfc+ zD44rCQ%rm-!>Grm%c;Qf@hJr@i8d?8be)AaINDzDq^;XO&R{0A=ik(7xr6M!;f1Sz z;R*X(Ixmf(Hn{nYfnHM)$r&`|C!T>DR+dP~ZuOq-{kGgKL;C_VS)x6cXsNLW09(1Y z`(B9~t}1{)TwuiohKzfv2-nOTOGmweUK@cf#loKJF{7iViuAoS+PT0g9}uJ11}GXb zrm#a_J#=p!TbUUhfB2L;TAeI@J`0(RPa^ErnN}7%XT>dK5n{qJ1>z4FwQo33rw=MQDsp@Kg)iVHJA|Z zVou`YK0FjOc5O@rFn$zUoDb|ia$+oNM$zIlncMjY0}Ln8(}R;~k`Z!EG_$t~8`o+D z+TUv%ShvOZ>xqw5G79h6vNl`6T{13QqnV`~cNbszE;_g97>X#*43#vW7CY525Qd1# z$GKOkAAhsd(PPwF(UFG>6_;EYX-a^Uc&YYU@;anfaA%1q|g)ON3yzU$hywDH?}JU09CSu5IlUvogc&=Wbjr~@mewlxF9brN>s!q~(2 z6M>Dvazw3l=>+;Qbg5gOe3lin;)}HukI5-ukLav6ubAIcjBn$|H8F)g_Bz!o9AolUlzNI=?)m- zWg5hnTp(qZvMCbb^04Rl-K?Br0cNu1Bed31AG|~rG6I5FRyYqkB{Ly2tc<}#-=()^ zzE6jl$x6nT2w95>rm4;Hac^ru=$6hUP}a+kQ1_0hHM~Y3v94`x?{c%(Qi4Xn zuH#re2X^m^)j@*7y~Ui1uRlqy)vTsx@$4_X-q+iFbr@gIDSe8F61?4iud09LCgF!| zw?dhU8ftF|Q0#Q~gco&-b-^kH(?gRH@}^Y*D2{4JnY^O2kPR-&IebpEawNh1#x~qs zsK@##Tu6SN5~Hw-vq!*H2L&7gDikXAIyO^E*!!zNdt>^;;29p~IR+P@P8HeB#nd$G z#ihY;*E=IwX=a{WkRGMr{VvBt61JhhygUi~m-bw#*`C5xx%ZNpab!E0hAEjX67hcOag$31 zL{XTizqA+lvMomw-JIWs3!fi`*Yqm|DMrnN7F9l7+DF}}HP;=x{3bEqokxX-rZHxIAW*ZI>oyTn|1GEQ@J_Dao4 zdZt1#A8$*Mg2%Soc=fX)))t>9DV(Oh@VyZLf@p~@tF25p;43f@aAGSSrij`- z+{Gu$AB#9*x?h>e{0v|RFWsh_dtYT$A|}<4vxY5eKEHja0aR5bnggVarj&(%n3~&( zH5e6HE!*2I9b-W)QBExcHS{$RoIgwy(b3gfQD&U$hE+Kf+!?h}(Rgb~jFugU-0D#V_!OhA{oyUQh!gU5eH$0EB zsm8MszI)W-(riApnJw;dI%RbATzwxHwyU-yHL>Nk?2PiQM(sbXvN|>zvnmW)iVL2H z+qg!d-fkO;Y-RR5)3(i6o!fNER1GvK2|w~uEMxasj7w9PS5PoKLjygnfmY~L*gw1= z!SWq-czzsl9<8XE@yIDAQsv7P&R%-CQ@iO$%Q#nu=l8tV6r(4wgi|R-Gl}A-P7`7< zj~-nI`tdZ~*>+m7PA=4lIInUfvDFU$V5@(=z8}U!FA~CdN#Zq&1Yvil!mgbl$sEa$ zi*v?EBH-KRfl~+`O+>mUO69jKI(UG6o`Oc2Y5~Tt$j=oFK<+l(8)UA}-Ocbg{ zsHrP9d;~3c%tbjHL3t@-(WQC3(dD?h-95@G1@h{jT=e6o{jm@-iwJ_@w&De9FY3b~ z4k)y(;AzO4VN#qj=Ybx{S`IPRB6(}lF%?*t!!l7yuOBuv_r;C|7`hL4A3ysUMF8&M 
z?~h4Vun8MvgU2%FDpAQUMjy29wjg{KR8Y41W=F&#?BitZE4Qmp=M#sJpTE3CXG`T7 zY9V|A_^KtBmX;?)7j+QwDQY_3t9dMKhOA3DQQnyG)auX9@1KlYImczBxa=jSU;7St zdV||iif6c1Wq8CDI+Q7M!gpq)sIhvg`lD7V_K8cA8VO&$;4nGbAzWd%e!6n}IiYWa z**U!EAz^c3BuSX$@os>oP66-0+AWi&(+2S*2NryMsm=oL0xSps^>}`@l7<>q8uQ_t z!tMi)6Nm)+b4Lt}Nmfr^3@J|%({#E_)0o#`LS^9%Vgt_1t1F{M7s5}c^I!Yur=dg- z66pMqM18v$JcEE>8FoMpcTjpF>Mg8|w)M&CA$Y<)JPFFh~e@xVH>fc1= zA~4T#7Opk2Skv^d1uf6Vlq?5+tiygzwKM7XhQA)p&!_4K4&sBV(sBHTO-cHtY<_4C zCbd#a=bDALoxuk^_M{U!d){1@>s^>*x#J@ZJ9i7_ldlMj9G|n4Gs!zIt`9YdQrc{E zHO+djJ9Ff=9~ZDbf4fQraPlNxNQruOKC5%Xc0UVQ6Z@yD`roJy>~P87-%SYkrgBf2 zuYt66RlwylIR+^f9_icaI6%AT((4b{0!`w@p`p;?=Qb6InlGo2I z|ML@IwskZ0=U_6_9r9r%=UZV^LN&2hQO-AOy~o?3vSY_P91z}kYY}SU@<&c zdzdza35ql)?4+}I%a(85SeMkEj1M1)vBoFh+UgP~mF&RkBfx%jp5dT*GNiSnsV|%K zE86k%Bsp~aA)7S|nSqT3oG)F;%ahOTg}EP?1`f0n!4;kFh^y0Mo zCEbyy1fp_68XpR5xit*{-aL9)k2b`fv#(*t_X~sH#bV5Hbjhf#b`hPZ>P7T5-Ia1b zLf014lR>8&#Q8~6>;{kHIT$E5{%`@HNjl8kx_YXv@X{Z4Cvgf}d@Bg}$1k^cehh?l ze>;go09))lNFY;w`Y!sisEy2RQRKV%V|Pan3GW>IbO(83H5e2~89;g?y13b8Zni}t zVdAT2$}x68EZ>(|k;F)UQ6V@GM`1{%%0Exep-iPqM(~fl2_>?rz!DErk z!x&Q5CuLn+E#F|U_<9r4B5{`(bpw;*d3f-}g@;@mqLEZ%GB)e4iC0e^LdQ0_fcDvH z(;FOQN;IS``}H}kL_%LR1~mabU*C3tD3*yft3t+dyA%j=o&wn(WblM892GO2(u&H72q_N;JTZl}!38yQExbpc7rp zo9568CklGK8_YW5&urVUuCsk9I~|Jrn40@Yc_Nm7J7BcT$#jv3Z17Jx6;yD?ec)CC zcU}o-MsCFqnz*Ui@VNY7fe}u8C1xLwO42jUSysMo$BwYWh-zPWnIk2*{4aOG=%8jn zjDju=H#Z3U5K{0`wVoDU_Xi<}3Q!gF|MQ)`%Mf-~Fa7<|SBut77lj}GxiJZ1KC#RR z+O0c;Q4vZ%@Dz~%IBHwBQ@8Y208A3;a6@pkIz3?R)x0#GZ`z6%*m20q2U`o5 zLu&qPB1wVsL%S;^rP2>n38Zrt&eqf(kvbA?U&Z;4f-UMkM&SFA+wr(rDrAu)4DsT& z#3#FBj`y5J6^OU$Q-AZuP1=HV!TJ+Eo?mTx*`g1`;T}z{YJn=B zLWAr(za`G!7oFVMw}FH>ju+~?BCZQnWA8J)Gx#V8z2tk=!GSYJ+LY2Z<7iL+4D8s7 zwAO@wN>;Er9xMH9rPKy?tAo>3wtD&{H+Pdar{z*)iG+#Uc!|SsW*QWDe*fC&pW-M^ z9gK19x_&Gwc<4DUl1?+*n^@{I&6#_Ip>`hQ)MVkh0WFy`kznZyT2{hZ+W zd!tjmfZ->%^sF)O7#!%UyRn5_l;yXOb#)g5^qJI(;({wdtNkC7=GS$VB6&5a_WvdL zNf#jeJpabb>oLC$G(9{Rq2hviE%%5kx4WN^({y8C`UA2bR;|YW7-DBfAL2w??(Y0K 
zgE!+4b8k)coL*;2u9cjLiQBAuyy5nV^6FPPB@#)cQp8E5v-b9YUe})oSu&G@l+dru z3-RWrfxcqV?6ImwO#{J{?QM`298J$Gq5R-F5APz!-9!?=Jq@H#|KIQW^S!fosS57> zIb0oKu>C3OT3euJmojzEYITiH?N%7+GWCw0HWwR6qsrTzZSoHJPE*&W((mjvR3q31TSG##Ueg z1&0s4k5m$az>f=!*Hu*jSEVd~MJrx&pK|tt{p*{TQqA*TP3=bwYnSC?=^& zVibeLxrHvKb!p;Fy|ZT8PI@VaR3B z;V2Bf1vJ0}j9S1Y0l{XUi$E>Nc&D|c2&A{A%x2Jpt})KX@-h1mQf2p$X7)@eviP3* zKFm>}@xc0pZsP6wCCw*C!Mggt%hk55w<~W-%O0u{a%-*@uKiFYB^y&|`un@aNwPC8 zfnz2%{hM7vs#|{+_nVF^H)(X#m9y)qG-Rbj){WD=bgE0VRXu1UI4o^=>qm<$Cn%fj z25rPHC|~(=a3t~94Yfu84?2xd+NRSvq70}8A?-Go-NGU24375gLQ7NWvQ+VRCA*Bq z?zdg^SiIGE<{`1O8Rz#>J7c6KzT<44di*}l!1t$4lyBgCLF*YimUz-Bk+&kSHdRr$ zvOV4ANw1jL#B;$P%XB6fO-BfI+c`(|90O!fzO?XfT2yJ@iq}fr)W<|El786D#Lv<+({yJBLN0)6-&Faz%1e3($`2XK5^JiZ=c-hJHU7jN4>_X_RIa%3j)#mg(kZ2ILx<@oS%iE^Xr%HT&xQ58MZ;p_a z%A?bFo}zSYddlo%@LL4y1VH$8|6*k}o{$QwNfm>#12;&Mv{m&#oQ+T#rs`gjA$x_v zgp$>Ebku)%XOK-qB_0ewTHw;Y<{oV%302ff9D#}l&+g4FeMDfP8l zaiyq4qSjdVm8Ci)U@OQ zK+jSaMP47Gb<~424dTO%RfYWL3Ri$FLe9|hu;_|riZM1WI9fFlt|DaLUbOPnL(p_w z@a2HGW{pCp%93p4#CPBbc6nMSK5|L6U%Ic%oXdcn8{Tw*nVO)_8Tb5Sef;bBv(U%v z5Nt#+LW+XiHPWdPj98qzLGpmJO`ynPLTV(x8&djKx5|RSU*;YVHb`F5B+S2W_4E4Y zf|OXOfNvX9ZjeM-b$}c-F1{!fNqlXmb)CUw#(K4C9kn;I#&?EN>}>4$J>SZTo25t-~)3gE6i4v8(+zLnsN*#QiT`B+VYW zoS6CK5kj9dcQE0^YZ<={Em=;Y8#cyjyz~xZBAzWs&aNHpg7#EFV}=!G*$kobGx~+IsdzFAbjTG>iSIh;dwu(NnZz|?4Znl?53X^?RXL5 z^eYJTTYdY2ohC%8P;vA*K|y=p@4x@K_$e-bbB7Z2^m$;GMrW7Gbg{GXE_xL3#ZpRg zk`sP&>tM0dU4^HNonr=V5ogO-vO-b=Og@L#ui3$^{f&n79?A4(^C ziWIc@%as4c0R@N1lfZnCUl09r(YFWXFg?3S1an1@_aC;oS@z_B2tYJ*sNlNpj%=jBW(pJx2$aim`X)M3HO zfB)WR1OZ^kfsxG)e|GaX=Q|!D(Qc%xKBoup;8Ea+5%G7Xz@gTaDDj7Y zcS8q1{_kUctogq;DmVhpbo3yW|EHAtL>p&DO1n})R|taDYX4q60FCeb6-y|&z!rnT zR@;Fy9aqtXnuG;M8zT_48a^akX5ft*L&U+2{-XY0boQ5_sG|wiUCN`G#Qo<4hdKKO zxbF$MG)4W#zb7p{3lPz>8a7X3xRj|yX;Q7J_(kuT^`wtqFj1T_s|FZcuT&(!;Jz#{&(5Un<55#~A zurmJ(eck_TwgFo<$Ilhdp16rf2v`kMk7F)llBRuNP2+HMJ_bFxyW`H_b6c&=^mkYe62l>Wij>~a}a}zT+K&h`~rAHKXm7*$IVTw?t}lX z5GVB*9ZHjq>Lu&7RkZX7j<4dLk^>aKviZeF`ahb}@k#HWN@o$)Fx~f;qt^p4H-v_Pp 
z-10{mm?^ew@?W6^*3qd-dq&vL4*er+%qa{mQRC@0{WupoDb+mCvaEj@c$f4L{ml`e ziT|_zKY#tda?XrYlbFLnK(c{NC-ik;vC&S||7#;okfX-L+l5fGm8tIUJfxXx^w1xy ziar!5lA}~47cWNEACG_sVk`CxpW>{c$motzq57rBjr>>ETfm!-<^l=Wg%4Q=%0aTYbVe*mCR)svki>u%G4DP4`G$d?|+k+kd_khk58+# zLiPG>`DT>CMHG!_`^8A&ZAXyJK(|$H)?9Gb@c&w7vdE+J42x}mk&L+0+JFF}R}UEf zn_gEgDhd4$g9%8KAc)ZP1t4O5DLQw+ps0kOgQ9**M3T@EfjC9~pa&&o==@D1S_M1T z0}$YH>f-U$@4Hn>#Cn6?((0TH)x#;d*td4;EiB6=gvVS0NH@puY!r)!e1Z!}tO4%9A9k$JDJ{J6aHske(ALzVkHF3Sn_l{ zo-z6*ixO4$U_gkwXFzzMr7KBL>SIPD40ZxBU^y<{jD~W{z4`|BI z5z|BHbbHS6OssC%I804_G+{ zu+^HwEaE(|!N>pT+`okLf1C`_{Hr)oX4P;}5Q@+4gi1HN+C1!H#67;^mm@#n81OrUf54 zP41lkSl2Pt3$j*l*-fOJN!4D7?QK(Y5I?)OmV*etHamvg1#(5aC3aAOe7^ zigqz|Q)8Nd$1a~IH*vOpk(^FB^bnll5~ZmDP!NA~f)XcH0f2}4wc6hojs_49&zKq1 zz~p(5fkJq@POyJo?Wia8s_VRY8iv**;v|>vk)PrRv5LAJqvZumS`z~=2~R#d)(VkoO*$Wd}tsa|rDi*3>_mGyQgWR2{Z(+6y5 z!_z9x;Z1C2`{K14NO2ira8G!hOBDs>SSWP%gE|PkW_v?HC{nh@#k?||_s>@T-ct@> zX#ef0IBG#a6q-SG;oT&LWyrEfS{S_l8wH(>mGdNy>W}?w!Y=fG768sR#2VWDR(u`T z&gkX>SG7k~RXT3J>MYfM=0qEU`>9WS9V=4lgJxmql9=#?VhLOgqS{bk;OebpTefW^EC>5Gge#H0dXP{U1qX5*@t!Cmj11 zwf}rgNzgrpFogkOC3E-uI|%`*^3{b&IRSt3=u(hb@yN`QD>pjVRd;B{P~*xU4gE)* zItD_#OCE2PB;Mob+kt|5PE6JVV<#=N2U1uCe1Yus=kl8;Um80sKccF^fv%GmdR(-i z>Dh5XZ?06zJ2MDar*E`dnwyf)0u##FcBDXqU4vHG-)gRkmi*k$@YgS*{zowYWbyQ| z{(3bbeKBE@Uurwk?oleK+5%Zc9TN9Zu!!w@}{< zuJ!Hi(xfO&MVdHq`IF^u}Mt+!96VKi^|BJ$x9ZOR6G7K&oAM0j;nQvYm=tVj==BRIL=X)G-W@kTew zYcuHP71MV$p0fKEDD*wPq8Vp^lEG}eJZM=%S6*+6DrBVa|D+_AVR)t1i`XDl94$GT z%80(qO|rWaFSQbrx+KOFtAT+|hwZf2aqx2BZ%?qAS8+^Fxy559+)X|1tAP4u7kR8n5W@@~ZVQU=r6#q5N z{egt#@2kc)TGSnCYPeysX#c*DKd`E-i3uV7cbZBd20X4faCBNR+N)f-rB9!kJj`r< zIuZs|Y-L1`$Pf?CqdxmvWB^u`E-@S(yau7)zFZ;|_3f1FVt^xuEr@V=pRf8t*w;Nt zEq<{dOyLgQ1@B#5#2cJUg2{Ms(_tQm~6nV7Lzmhr7Wyk)eH2#|yDA$Is}2dj7sgIF?jV@WQ#ieQX>VMV zJSrlcmNrESf{4|Gn2u9LWbj2~Bz=fQ#-`=9iYVrH=qt6dK=Xoy$-M19y%_s=DeJH# zxXP|Yz0I0JP8qq9JD{LC>_e4Vs!5-BNXnLl^f2mDNQlm6rj=B^UAFYtCk--8iD^&; z^}zq+7s;HNpO)M|%96Vlbx6vA<`wzZ_|N8o9T@n-TbtxgD`n)%f~O)BOFHJ7HDa7l 
zRyVJpR~-MY6gn{$JZSk2%jk<&&?=mg&^|Xr;LIgoF|E~^JA^vW538-X#PFa8+JNb2?GFyD}tk*oQALVB}C9s$fM6isd* zb%CPKkV4?h!hgC6GM9Mh#Y?-97bu$vs4+tjGP(FB)jl=w*EuTqzNoS;0DB)3XzFY7 ze9r}UKFQaCcKawfchQDSS+v7*`z=XEjU5X*aW`*}tb;1y&Ei;VtFkO8uW}${C34Zq z8lj;N)ya6o@T$1jU=XJ?Okxf1RSz{f8NK{UV+4W7gu57NuhD@*LY70X6RBeqv_>5{ zL8W^5?6E^s;W~I^${!O9g!(B2qQ{*M8X63?v9<+M@J`dm?1ho)A=j_OL>>L6tRLoV zdm}?AzK=M5Yq7@sVIUw-GqE&TAd$93$HVBth>WYHrd^~`N z)o{s+7V~2$HXDOE4{5Xc z(pLtpyXaN)illurPSAF!OF-rl4b{sh1ibrXm|rq<*!?3G$qDY%BCUR){zfT&ok6`f zumrQk2EJXN29vrzq}};IY1}~4y}QE}ZwC0dy};(-nHJD}8NERs5`bu>yLo7-^IlA; zK04%~9jo)MG}|2s_8WTW?m9!p-xoQGRT)Q`*-B!VFXv0!s(vc6PC1ebWU##xbv~c| z7izp9HaGrhUoV+*?qm<2q))sUgFd+snE_pdX?Rc!-Xka81lxI0w4;Dw1t6fX_4l?4 zAuSi|f}1B&Lq%PpA4UPwW+H5V`hA!BJ%tzHI91ntsuVexE9I^tdxa#qp zWZC|iFOzuHAN#fkq_;h0&fN+O$DLh9ZXXbfM`vEUoz>$Y-x0>%I_4l#8K*LCQs!B5 z-yrqIh^1=k_1q=8^gMSE5YjFCmZ%Ec;iA{B?^U~B(%#lv8?Kkvj&O+5dnpT%yNW#7 z))Af4)~deQ8wlGJIG0(eLU9p}nSjt^#%Zz1k<77>)^OfhSWDOFzH;49Af-T3(Gt6_ zUl{u{GYwYj)fQ^phdT^r>K^!Lx$7at4hdU(Balr#;GVb{=ebr)T=n`{-{P$`Hs@Qx?c?E60DahuOD!q6C_=#fG0xP)2p$Z8@D z*IaBoUHv5Ix%@>gvitUjE}bkx1zpu2BhhVheIk-SNSNZPF~3gX&*mCln)G?JBD%u= z(N!m>7HiC1M+CvHXQfH%bo0u(rSkB*Hoebp7ngq6Z#j-&|uIcTiQh&Yk>tYuWkG0CfUvvaFB&E&SUILaN0`BSPGjK| zL%bdpV->e*goLJj6P_9k zl(_38TwI$SOH}r49AEX6$#rM{rXPBWOmKR1(Z8uofhzJ&lfjeq&a3tl9oT))dzH4X zwc|q-uLl|}w_xaGRHdu>9bM0`m1af?oE&;}pF4J`0C(nlm|-|Y@!w^7e?JhI-l44R zak^S`nl&FHRuKJ3knb1E&}<+{1-v$#z#N)jb=Wc zi7s5PYco`W^8pj;!&MlUh8)0$CYH1_wuy`!)A@>;NgNDaKQreTFSMTnL7<5UYs4@? 
zNYOC`$}0;ZA8wG90D=$-VdsBS5X@Hj^)a&a*^tIybMhlP7&Hhe3XvtdxLY(N;+&#; zzo(uV9|uoP4gR)XeuBEWA%ceSxjMuZ+IwmEbpsCr=UvRdup>x?#M~cD8xykiC2j)O zCT#?;xqvacw%g_M9sQ4-M14Jr%*bKIIr<&_>}~_s`Las&;8jOzuN&|uUuz@x}5nLvar z&Jm%k&W%W6yP-xT_y8jfS%x<3V9R>NSV?(;>O=L&qF;M`XDXgMc1Ykex@w31)RjzY zugL08^^Yb}0xG(!z9vE=(vWgwk6&N1Aak$TxIbjn@lNH34UcqZM2|e2qPtBr6((hl z*7A!j0Ji#Vju9!r$lBDfbH4I$=~|_D{?n;TCF|cy?(1W8q}r7z>{Sf$7n{+gXw^8` z83z!XvQXcl-`9R`jFsOl@yF|s)gAzB!7WwTS+hj?IoJisyYq(aGl!Yq-b+Xu1PC)v01=uKe@Ssadz{KLkiSqW zP;OfaERUatZ8{=W2Erb^#U2_Ffu}oL)+K_Qpc)tW1*;hC>nqA^xWPj17t#3qz<2`( zeuRlFziYWBl}d*SiX%0Sdq2CFH$j!XY}{8{h&&u{wn_wR#LR~gEL9&#qD=;QnbSnv zoC5wSzUe~m&d+`P-4Fq+*$0dy!tJDY05HeS^ z;73nI(C%7cX=aa?En1Kfby>4mXh_u-!fwU9y7%?nz&Q zEvbN9xi9P2F#lz(r=GP*h92`0Bn?EQZ@1I}^IkO(pr3=It3w;TzM!iz+e%(B(7jn; z9-->g6c?;z&JbiyI0S%`%}}yTpzx_3V-TD;r&R==-^r~6CxfsDXt}!Q)vy<9<>oSr ztGyreBW7iWuwVJc?xD9}aZ;`pX6oC~uK2y&0X#!s$4D-tH^D)bL935T5CY(JrOK4i z^sqwRofF(l9OuFzYM+txgywW8yU8Jh>1^82NIj!ef;&soZ{(Fy9$9Pm4|^O!xWaT@Q1Gd%&IlYpx_x3!?M0@%C6CS4W(gQ4Nf5k3z(=~nMhR{W zFu5o)8bhgd4yyC5V6jM6^Kq;)hAuieHS;HiZ6!D(RL8Hl&A8cyShsj0%k*ye5jMTq z_mahBS$|FEUUd=XEo%uFDIa>X{5Qj42k;cT>xK@*)#egs@dK$zXwCv00<~hm%~%}= z7>>bu#DDTt-m24_U59CpXX+rzt}p0N-{5rAd+El$gW?tJ3@dl^RHLqpi6nU$FpHy7 z%&wS6N;z0qRS*6wNbD_$XZ{dhH`^C9am*p(RgR`oaq@g>Sb=KmM#*DR+QO)ZWU+zQ zAe3LVtE-r`tt)4Z9C%mymOU6lHlBY;tfiPF1RH9DwJMs=##diW4or~*uyIu#U`Bx? 
z8UTW_W*c+c*vK?WVte~ze88D6!&U|IC!$SdF1s6MeH18OoJ$@z&bEFczLC8e;gco)};uOz6!vO%(!%&*p0Q2Y+GBS!NmW0L}L*U=wl>QbAjy3O!ZVS}iP7 zWrf)cn1wqB51GixW(O(alE-KN*SL0O*VmSOc3~RtGyFRpgH9I{m>;xO-4p!+Y?vcw z_w;P2gGXT;P%?y|vsq1DnEI<${hoRDCV5# z6B0tAGk7UQ6v)h`Vuu0{K5# zmk$s6A0H7YKBQniGmO79m-o@gh|5!g@`|5yO~Xx5al55PI;RCQp% z;?9RC(yfrJjdp`ZJv&+A%m9PZixIZ@La_bd zKx}+n#gC+YyJB9i!uTYDK4e1QoK!jZIQazFn*Q#OAy^5$GLR zxObT^N7_8*^QBoGyNv!K!Ad*UW#o*?vi-ocP(+3$e0lZFgKmS4FwgwhQUR)teXW)@ zpZIU$?6a>wn5KP@Kg(CnR8O>tTV32C!(N3N^Ik>s>K>t;cFZbkvs>3h#C{D!6E@jUll zv-Jan1`V_ci(rRJ_p3#0xZTk-L6L;_!n{th7ubuNU>S@hCo-FMsK&lvs2$WL{GL71 z3#&@+SFFWC1c6(Z^#rX;q1`QfDiTAX!Fi?y1m0z)FMa9*L=m^d*+O+Eyh6WzBJNje4M^k$rV7@|ogJn|jhaP9OYg&~*1+KZynuw&5JfaqnkkUb$_r7ut`LF$ z@auvYjDwM|X|Lj~uP~;#yWwZKfkN}cU7-h6sx5WEruP#9XZlHxYR2~yXn^d|4!lOK z@BT%^yaR;&x)x7@b}W1FVuDWQDiBIjnR41>$PPruZoe&Iyk>*3*?l>{#G&p4eVtr} zIjEIrXwb8HIRi5B1>?31q{wgjSM)gpUN^-+Rn!^d&_eossK_GHx1kdKHrhl&d#mrr z*quLD%S`YaslRv{ks&}{DKOR|jALn*wlS6&K#2g11=E7rK=&={>S)@_3#U+qq=M=v zo+0uo-q+xaV1DMf7ZbWVxm12}NOG94xbB!LMQVqOJ`mE7p)g^S*Uvy!`Fd~M=BaJj zz%?j^iY5u5Fxr|j<||6@t0>u*SYHp^5(=KP$VA2?QvYa?g-l8WC5#|qeIV>xmu2H> z)1MlAe_hYK?Ou}NU^PdA5)~y^GOz5UjeUXrv+sWlpau3m_qdyY>Gh@I(md?LC7sPA zF$k&9O0*}lk+R(E6~Jy)C?xR%2!#|3>^LO-cre(TqNitf+1R^fk$*QZP22o!cW0o2W( zFXV~`k41t9Jjeq}V?G04Jeeuz3W*DekhV+8%1euFIn`EGkK*80wq4?stP)g}og1We zMt3rMft?^?CD6%4)>fLeRh{QZGJ1zT`$BMIPq*sK7EW1-59BD%k3$ZJl;}n zO%PUND4@%kJEyQZYZH78KMTXHt*+Sr^(FBKOHC20*l)RFAySkpP3Ig34_*#R)h?o-E?6D9A0khTMFUdp$BnQUzNz6a zG~=l)$LFnUD97bawTOVHsnQpp4;CuH9iZ68^BDrpWhu}tHF(_oX2bc<-M*2yfzXH~ z0IP2+q{wI+VWG$EsuNUE6Ku?6LJoH_w@@J7&58a5B$Jf>@$TT-rVW%Y&tHH<-u-8- z42i3pB3aFtMdwMP(iZ^QaZJ0kVHLyQ6=&#V*oRZlD)wAp6=a zTMRP5u#&{1PaZlEB`vVui-k@`rucY#Q0)M>|CycUjCJreS4&wehNh>LveKBMm8lw; z&~t+jk&@bCf4-Xi8scAw#VP@4(uKK{qRFMc3+}{1jW7m2It&K{7`q_4>F5-5-yv}G z*>TFEFVqD9kYiJTa2`%Md>4qoNQ|q_!7@VCqY(p}C>B}@m~QE!S&{r~g+&qXLDln) zl4CK;R)Yhq2TmYb@H5WuKC7Ig%~AZp6N=JGoZx2O@3m;_^i?j~zqAmjF_c6@RQN7(hm5Q893h9~{bfKwdi%cfB<6 
znCUMW0giq`eE&y(v=GC$FBD{cR%+_HZ-H}Q%-!;;YQ?G&tAJ)y*cS=3*4L2o+N$|> z=|5h>7mxI!0s{(G0>v?!zKc9Jh`UjBv297r$Uta3^E|VT*2`S%O9>T;7{JQ)2)o|| zMcmm*;~y)U_*%P^#czGkxlEqDd-eNDx;Z5ljFboccy_{J1Qs+Cg$;ZKxX)jp-hmAC z2W!pB3OR}^7z|AS@*>{-E*C9gp%}U_czAj~E2~>{{N`R3P=jL#@)^P0f^rDWGkzb# zhq3j=)CW9qYLt2sw1}8TAiRh#+NO=5BB8Dpu6=vFE8`_ZkRM6+SU>63TlEHY$P^q% zYl6i!YQ)_P=wvjauq2&X*_RgN3$-_h6@vmz6WAsA$vhjARa*|KL^_I&hXXz^GB>2& zg5b6mLBFNF`GQ`!`7IPXO~nTlVg692pPvJRkg+OYHUtU}7H#P8Bf?_i$%F_#O`Ip` zjM>(xy<@`aw)IPYBQ<01B@3lbfO_=+w&;h1Ut5%? z^GKoxCwm{>+GlU$>YmkRQ45yA0CAf0Z%F)Tv50&qPrQjrG>-YHB151+l36U|FEw`s zwtgPcaQE$lDJ73$>zzW3#yumq2cwM3>?qnO2BvQ{c5L{zuzo1!x1rmx&BPgu> z8xYMY(OZg;@WVY(5GKY@^O84T)~VgBQhyO7@gQ*l45UCN$6@bc=c1ZJ;IN2X{%C?x z=5^(2N^;c3qc%7g5i1$wF12V?qzHcw)p1!xXtM~5Jm}LiE={a2Wxf@`_$tWBQr3rw12B5({r59ELfGPL&I5a_ z1>-+}Oqm;OQuzo+bCzbqH-@$`m|=w#@8mNZ%3!&9Z$bM0!ImIkx)$!}_&0LBdX(O* zBGlM*@Yk>H!m2_RnYrDk#X9f;n^}Sw80Wup57st+y+JikZ%XVQpOdT;6|U&{niez~t6y+?dxnEs<9M(@qnZYhb$17cH1!`(c3z z^BNclhjsENvb~Dn;2BuL;rL^7AE_pqjG3RLv$>S_d{8VHA%7hU*zMt4FLXwYlm$H<3pXF8vJcX>KNNZ4uVz(d*F~;Jq>JbGHd^v7E~8X3ppeyE zOrWDulz(4_%jPyPSgiRzgYL6q(wS?*A3`U|QC7Pjjdo(Q@|@lR`SVoFkN^@=G4T&$ zq+(9se_qnhd+=ctQI`tuMT&r*L0#4MQL8MdpUpE1uaD=a59Zr?ga6lrunT(-i+ul3 z@>osUq3CJ8|56sn<;|e86$UTLghZG+JAbe2?+*~E9DoQPlIBOvAl!y6q(bKd)6JxH zEGj1Sb!N-*%qo9`%oiOWFFasd9d34L`^6g#dy}i>g+6Gc=w7#eWPBA zAlFLQl5|u!!-_Pm!R~{zJ^Bfv@HtdmK_x&-x~Z zzZkTc-ysL=39A=m{e>i!_lH+mK4)40;U(?K^_bl2q<^0dooU{i5hINHBD5|YHyug? 
zvl6`vSy3Ue_D94@K?b9JN2(k6Ff%@+bb)Rc^b?ZJ^izX4$*{PGh{Z-PN@ue(Ar-lV z!A!gQXLAtuX3xk)V753?4kdqWU~yUcS}7c_9jKwiu=5f~^J;IR))s!~yhMXv}8d^iDbD z?>?G>Q3GqC&m+eNW}<8UE{|K?O&*Al8My;)GN8fbpW(uFL3OgY!6{Nd>vm7eH6Nzi z;r?!qJ!@!CN4E+?bNFr4KX-;y_`-Kw^Ams7z&1uR0`;n3%eofC{H;^tcy0eGM-jwp zwTZ27doR=)Im!Q?E8sDqXKPH!0G2isxm=z0?nxh#A?R{Pe+gdfwYb`cJc^eg$y?H( z_{){JDg!|f>O*id{tWpFVR5na&KAGHE~%iex%`D+>j(-pnBdE&Mkpr@>QfAqc3ynm z&{hE+(BfL6Tp}P z`Bs8FnWB1m-}ZDiR$!*AxDe;HsH+62Am9Gac)^bv`%uGgv5U*(!Vn)I7ffHUeb)Id z-}f3&QU=DtQo~{1L?Wd_O+aBJz9t`21_m$UU}wPdw;5GoaT}hQCW<_B0Mi8$eZR>q z?gw;H1N^^Y6R1-_Ysj%P&WSkU{%l3c(ZG6d?>&=deO|BgFEA#ibdK8CHMKlS3P_8lJ}l1w5hqApG`0D-HayT>qhKQ0pTMEqrm} zX)3ma2<>ywa0F%ricU@TQoXDCKeZUR0p#1?NQ8v|F#$~Sa_~3{#&%Et<=FS@?LdC* z3I*PT7hX*tkrAb@B~*j>YYl&=cgO`2bAo>Y8vHy8TG6cpr647QF2q7vJo9Bbk1Z+b zNX+dzROkD)`zF45ynCPXlm*jQ9U{WOEj}ou-K{A!h5P6D`@v{;&%Ejdan94JO2!0Z ze_D4lh-X+a98@l6!d;OvtrG!QmAL|9@9S<_cJ!;=wn$u$tD%DeRz}e z#8UK z&yIr{b1Mn@>o$aX^y|!N&nZgK6u~9A#0l!5Z!#2FjfNJN{RjfDekT=q-uhLOTy8*u z3O7{xzb>T)MO7*OvpuZdAg#D?M9A_P{;L5qA|WuLcv#8BBqMRsftZyBq9&xA`s2s7 zkWr=Gdq0Cr5*pt>6NW42w1h4$vp#@-Mp03AXRimDi#9@{Q-j)Z+c^RL#?&J`Z>l{! 
zuiNOs5cF=JQ^%0J#v-M^GVy=M#OeoG|1D{q`V1*H;GQ_vh^d%hhIs^k-YWOXt3FI6 zI14mGF3fDD;CNC#V=;VZNdzov71pM&Y*puj;ooHWeIgq5S<0WsP-Q&3u2gnyx{wdG zjE!9EiV_>^AX#N>JB3AgMc3L(SsB*&>l5e0Wd}r)fjH)<7>VaYDx&_F8bf`J8O514 z`OrK6r=Wte2mrK`GG{9s6i5UjxcRwGmj`!SzwPzs$zvx(SRPHiUSyp2a8e)K($*a< zeX-veF3ud3L*ioU&^IR*{LZZN%nN>*SS$=u21L%BArQ!EW{Gr(9KRUZAms*}n}Lr< zseRrCL{xlR#}J4HWH&f|#yZfSI!S#tj=z`pw^M^YPL28XL01c%v(pGc)EK&(Fxo?v z;C4_7u0O!?i?N~-js+4da{{2&oUcm9psa~VcQDIHs=7%k#3Gw`;e15KcCh$Ic{aCG z$NYnKu7u`X|5(eJS)<$|1(nsVkV#Vi*D`}n>CW$?oPBs7hSt%4u#gq5w^PzDMIvOI zHy5IJUT)weg=R#?T^_Z__&=23)NSaX6d3=wJ zKLgS{iA!Tk=J;Keoyn4bB5&R?&Na|H+D(UEv^HZe-fdRkn2I`0L^PV%U!ogWNdrbe zU&e=Q>KEDg$fY~L{chh7X3u8akCoigeSZa1H!EHirElmbfwy?~_$gNtRUx~wiG9r%X=|r2*M9br zDaIg=71uA#(M4C?l)JQ<#2XZ1Ssm(f<#{gV$`5|b(gI^5oZmiY6^>T=haLu@Af}5i zH{{pW1gi!P|4x+tSQkh#CD3QkReCqil6lR+A8< zNQfyn07K)MNecACyaS~sv9Lh2dB4cT?0QaaZZiyS5gA2dm372nUsAxVME549GqvGz zEwd`sgB#U)$E?OgCAFB_R&#GQ8L6%&=SW4Lj5p+~9`RtV-FJ#FkYt)7e#M}~RV>W$ z=~XiU7B>_)5@Qg?gr%p!m#nxQhaiNJR&$xft-i{a%bgGj0cE<~#K{v`EUxaourO-! zc`gyUPopmWwBt=Yla19xb2weDf-e4H+$|G>a-0A1Gbwm01EB zR`bt1=4bFi`3=^a3G;ZDRW`N-y`NYnl1oPm`pi4}fpWSvJ9O_ks2k(Hwv8qg`R<6M zY0K@NLdWf|fPVYzNf?-k!Ezw$2vZ?IWdBv0srjL5e4wD}(z~k8~<<1L084M^mT{(@yJ|G26o`m&e z8O1}TEiyRd(rVX7C_4MEFa0mFg04*5=1tftoecJihafmW?*lx2At*mnuJF@g|AM$~ zG;}?1ZN{0){Q95)hk`;n{(Hqj24}rE$N0=g&ttGw_nC-GO6N^hLSx!E8#P$nI2-BJ zU%a9>wl20Rc`2_QM_iTW5@k7YIpclGN-WRbZI8q+ceEYnZ@W(EYvq;Ki*nq_`|$a; zqhk7bWg5FxG!!WaQ8jS4g?Ea+M1WDncrwcdYY)3~HBcdY36e$@MzCi&f!0y)2i zD4BVWQ*=?MPyt5^}p%on?s@Sssuf)S^w64N5#5vQ0B zz;=$LT0e%esW}8o=GmO)z_B+Wj@kc!V6aS17U0sjdbNtWt&gS0ZI~xy$T}VikB^0? 
z2ElWd_FrAl9s=n=m$#=gxWXXRQi6|HuJzf)h>XlB?jok3(6~uij+v_wZ^=8Uyf3Oi zJnQx6D&+0chHtLvzmvA@x+2qgMCE8&1AU=59v$E0Ecw{dl49oLJGP{IRYBSpH&Al` zql#)Jui(+291@U3OI^?^OB>|gk-vb`bv??jGuCk?IYda+I+=gLFd-AAcU@;{4fx{= zr|7>Q1t2?sny6C-5&}v+zW{TPsxSOZ3;4?nz;FK{5`mQf3D-*(HbHOmHQm-v^&pIx zDq+07*|zUm!LWit>lNjy+q(M0w@F0^xXMgxg?fmqE*m?sTcXD>c5`*&A7pvf7aZu3 zv^dzsNUkX?JWC825Nrt>ecfqX#F5Yp^OjxTU}Bp7K^$ffZ+a7Z0X}t(yd~&$DR}6F zam~-X%!g^Hk<{!trnzOIg9INfjH9^Y#e2RlOL|NxbCdTC+>YFj?Pxyew<=IB5PlX4 zVVvuTt}K-kIg@G7iCtJ)jsJvopbmjr8Z@i_C+?E6pzr>5`NvPJK)vT|!m7efH^Bk2 zBanh{+D%M!x#ToT37VENhGOuiSaO1O$trF;fsXvjg5Zw|akVF1{dc>ct!(W~o@83w zqh#R*#(yv5B!!)mtz&~(%;_b(%AB-OvEUWLGq%8pjbnXc2)*;G-uVfu{BgFqnoYB! zCVgKp8H;CEUcSM67aAoIMSqmvL|m$9cy;6=cvy{ld8s3JOVTkxyvj^bX?6@oA?~yx z@jN9U839;z)c58JK2#z)ZjzIqJR{`lv>jRuL25H{UU+emK>Bt(ZjhvFu%bADZkOz? z20<%7mo%2bA=A|CpDBY`!+jteKY<|D1d_Y1H~4Rl1>n=7|73?}3W4DQk0(z(IBW`<*JC_hmv6F5!s`wikp$GIVR87Zgua zUc5YLI7ap*UvWlf8iHorz!i_wj%_RTV>PYg$Y7OQQN+G_Be4#RykbL`;$e?#TYuJl z;>qYXpYrX#2<3Gx(~y%2L5dfx3u6>*oDCsQ5Jr|mb+XZBoc2-8R)eQf>Ii-8PEC#8 z2B*Wq_DF1nNn@yhVe85Qx0m5_ss|kdm&qXH`23_swwjm6c_)wU8Yf5-t(VHoN~<6L zDpvhNH-ZvjRdATqy}*x>*LbyfE}dN{L%hiq^zq9bC)^59-uu}F0nF$C(c18T@VQnM zxPfae_I97ZaN1Rfjccttl->E_WMxe4Id|-cR#YQg{`1-te~|MHiRfr#;_KU@C^)Ie zdB>p|8fRI@8y0~jlsMcGjw&s62Z@0pZYieZqUn96j@B|RWASN9IWb@m@rcMApGn$*M4_utN|k)YV$enen} z?g!mTv+>I&&pq%QboIi^MRhIGDj|5L+>Z)lPa3MzdOiyo*|YsvBhBhulER z%AQqm<=grZfvxTNMSudd_s6qyNTOs8u^r`=Ayj+&o|*)P^`3<-sPZo4yyqzhrE|e{ z+$KT4deud2pCPD^Wb$2ZZ|D~N(|7bSHowjb^0|%k*BdBeu^IE_vrs;5P;1=HJ84UT*OOG9+L{B`a>R)0Iz zJdvz3-^(K8`$*S2Ky+91{Os-I^$fSt2R=O{AhGwc!7yr*z7y8c?J*WmFXf0e_kn)K zDOA7r?j+_To0(9iYG)=k=UR?g5?htg%`S?vQH{IR@e-w#FUz+pUf#CX>sfq>sLEw? 
zoN2RV_KlmKbUGbF5CaoDJM4*=hdBQW@o2a1{J#wdo&x=Kr6Lj?aQSCwm~ke51|GCd z{Ijp+_e{?|gs2CK<3dnn2yW%emO2Y1*cwKLZwGaT9=GY;;#wxh0kaTgnLR9ZS6N9* zC)lo00IKeKHJ)_;w!@IgB1aPGlcZ1=qYNHZ8GDk?RYBVbCpO(oBK8iGniV}qC`Xz7 zCpXqeE>*AT(|Eyzo)w3*ASo_xCm4BU5WgsRfbVIT5r_HTMP^lmtdqi^VIVHQq`V~m z`|C>Q6S6zalR~N1=eRL5iFHLz z*gdkY-8ETL$gT7g#=dg_f5)TL>?iOGu!3F2mg8`;qcCU|PeJXB>%jP3<1be2_Wf|l`L z=k#adcB3CQobUNsBFf@vYFlf<He{EC%anQM5A)#1xN z_Lpe09V0V4FIx_4Qx=dIZP(mB#u_}gY;w0|dDkO)C8|xZ&F_|pYt#vJ{~s8G~Vpl#LqQ$ds|a&CA9smE(IM!nG- zyjoZo1F%u|@oC9<^+v0NKx|cAmoyb>c5_{N2VUz{qbpDk=T#clwd#Rq)m;5v?mgZY zFBvNV-OXTPJBi6#2M*ok+kMsfP~Nvup1RyfhsgIG17Cf@A=Q;h`p6K$%`)1~*+!}+ zGNaL>;Ypn%W<^6P=O{_(R_ZF3bU(9_kf?C1CI9-XKL+vgd73K7KXkWV#ZcPGceQzI z*TlBN36l{H+YGi|;+&*l;+eBFO38D*I-?r@E@|zv2A67Sisn&e^0&-_L%aBzyG*GK zw0RQv-=8pz=c{0>DOj9i+xF@q=I6isl1Ai5VE|$`YRBZWkksy{(JO`e6Ws?R%zj*+ z<(pTx_6aOJZ#FJ!r*b{N`aA(~dF;nWZ~azp%KTwb30?)}q|mRm?*~VBb*diln~8ey zvZ%$0n#Fy45_#g`q%i$leflC}*R4+(i9XHx+f4pkKPF15IQt95FFZV$mg%ZSL=Ga} z-9m(7w0+uKCYu~%O%ic0eD6Oubn&WZz}2G|C93?13sw~~^nQ#!ca%*A6{l-`4m@I* zvdW@NWbg*;?P6Y7b1<`=`?%EPv5zBj^^^CtN;;7MW*zLzy@U~qQ@)qEEkj<2y***o zNB;WC6lbQWKloiiS)sC>F)?e*|>O$&Lqr%OYOM zmXBnqwYPe`qeA@}1OygvEShzuGZ>Gl3nm=5y*y|i)@(~#sEpRDM_=d`c>DaB`s1$Y zY3ba6-u@-!rikF&7llRy_9mZBb86mzY9nv z3*L!qbS4SSENEI|&Jc`uSi@vHX%0|RoTbBFV!8KKc){fPQQ8lwtsQq`OW$j~?f&wO zf()q(+{~|z*fUS~c6n@3*MrW>2m1U8Qa>>;_A+Ga4(`m|F5lVrd|NDj0uc6@m zEYyLpRe<$Ib2fvNOHvrzz93AKwT}Yaasfzf7(ihY`T=Cgag=&ULs75^n~>B>a-!1h z($bd;cmgTl9)g@~Wx;0F@UZCg2;~yR53p{p+s}Lznim0 z7wgW3!+XTklhEO(dc7VG-}tSuxi4fahgwBb%XsS?i77@nJ6UHLQ;PSrw;&?F7i2ZW z=NZ|vMxM`O>G}LYOiq0!_}#r+kr~dK;hhm3{vq=e$z6;snL zyyz2xHWyy1+8Q&hu#dX=h$)2e3@iX{L&)EBk#{ zuIMbo%~w+CRDCAvZ@B)bzy4803)a$gOC_Gi`v$_2_X%5)97_73 z6O%dgNJ-qTn`$jpi{>4FF7jdctfP{R(46S6XdlE6#256Dvfz$i#2-Eb)tCJL2?Ez) zz>H}gwC}Bu3Vjc_521t04g8>!jzHZOB{jk&;Idb2V`Q5&jl;}NxQG_-+eepFWGlE_*#=0;`|!lZQt2aci}B#Dmh5v z$tS%vddt{)UXic3)eT%OvAz40=7P*C*2rXs4I+I--5i?ur|LHigAGPj7C2R{85{^? 
zn2bvIuPuMj9XX#l*B?@edFyUZ8xh{d*GJ=)-KuuP%TsjE^9hrYDh}3PM1-U;65==f zYfE!I@pzM}oQhjXds^`%G)qX}pbW7hKFThmiK|S%9hGw{W2Mb3}-J>m&P8|ZL zzY(kgnxgcJmA}JkKMCqG61n|%QoSF9X8D^v{QUu?%~)uHDy4|`D>Ya5%}$PDBJKC< zCvo}RQzw>*;aE=8(dtAT_8~o3^RF@WctV^E_8(&SG5H)KD<1Y$>(l!Ip||}w|un`NtN2R9#f%uTK3APN&Huc z&L{^{Io}_^F`wj@L}!8}o{dlS_^K)DUwgfG(|@aO zPpij=>Q27UiNetZL6zVUs~dCPKL{vNRf3G8Gt1~2A~&wf_~q)|S@gL3c8vc8Z8Uk9 zmeNyY%v?bjZM%_WLwe2IGN} z)9cD0{OS;^<7^TKhk&ft2XC58Qm;`r2%GtA4JL&RD1gUdP~%J8R2-R}y5t&-mvK!J zd#QcYJM#(N%l#?#9~DNdk<@%u5{}xERAcvj%1!W^nO8bf_*|*>Qib*mk6vLfxtCmV zb`HOsMf!v^{m3$&b(>4U|lc%>EBy-nyQ}OV+bBZ4voB{W54i{Kz>?>6q805$OkWqz7W=$U4IzInk5W!*b{x!7V{{JNmGBPT&t>j`U>sHxeup! zxFHu}Syc4;(=2rULSdU~9epK3T}Ol=#Z!sFQ3&QMnrL z=X+7PZM#tY;`rmYHPp!p&O&pew6?93ZuQPK8&AK`jn-}7 zPgtiFT4+wSv`OZ*lE_-u+A$-3UcnT_DG|pp)7E;Zxp@R#&xpy48e|CSe340X)9s0( z+1t*GZGHP5woeSLY1DW>y!P`Suu>lbd+!5#<)HqYsvOTtA>l36F%2J+%=9P5=LHS1 z7>r`n-lOGt8G|fO%&#m@J;_Td6C!xvn>RQ3$+nGPaWA*PAigg>q_&^Z-r&|PwhbTu zSRcRYh)Bz|c-JwJJoU=xe&r8Mn#;4)S}RqPZA|pDoD&}&SMS=a^zLaIOmwxjjV*q^ z|Ni6X!wbDF9pvJ166GY~YE>~L#$0<}7_tep`nUK#|5xQW&DOS*a}Mb2wy3cFr$TpX z0o^ck!T;2u<}E>`|GT#fjr3Ky*;$|~7bC}VmD7}f;+83M_ppTXD0(TXe^ zeMVg}>borBG~6i}xiPgrZO6r7IoZIIR{%9i#FMi3VbV)@Q(bn>@Kqbn$k+=6r4Y4n zcZGZjPi<;-0BJ?+=mwLB)0~lD^h2*1#c8D0HKo#|%d*sylUK2i8G7tzj9<754Awm! 
z*4GP{d4yj+81Ih=9~1z(<8w{Auw$z0boT74V|o%4bc(byO6+l@;R<_dAFevurC&OF zGjVyU^@5rwQ;krhyyVv@lX!cFSRa?Vy0=OrA3x|)FgbqVq6?;)%GPS&6jh!%_sCpd zYkha{K=QgyS#^gG?3+@_WtDz4`O5|vWFyX8N=4Un7PRb7I7HsbjNC3)u4-uC4czZ~ zLfh$4z2f^F+v4bXi9j#noi1XRv*spQ_)ONQjw8Artlzf4=I>*QtHC+<{nQOe-=Vb(;N0_X2Cn` zO1?n_{T^3PtLV0ZrM9KT-Qi@9OVA^5XWTk)kL>{r)VW%W5%>hnKUP*Az$LMNLB*Z25_=Q&N;*o_Wve;I4mQOYL8Dqwfc;(n+4FhLs5p;ZU^ zws%j407`Z7^lGS(N4Bxzq@?AN`x!)i`{GVLSp^!4X)zbB9(-mfdM#Z)&_1$7pJM*d z{BrdXqNgt8)3n7MZTUBP*B-d(KSjgmn(6fakTjW7hRv?yv|z1kspLaudfDjy4!TOB z)hy!kf_sqFCV?E1Fd6-j;@XqTv#f<8RaN@Cw>~w+`L{GwwQDkc zJwb`i>sQdW)6Wx!Kga%#Q!CNG(H~>2g6|OEEIKeTnNC$nYf(35)#m;rQhdb@8-y*+m`xy#SY`0JC%NpRk6G0V0wFVpG zji0zs+%=(1vmUq^#`xil$9A~L?OKeU2Q&A!J9)osJt1$Gd*5FL%r@4cq_@YYbJ6)3 zxezOL@b~w%ZO)&I)*c$?(<%KJxc=h*G4|C#RkmB-f`BwiH`3kRD&1YuY`ROjQ@Tq^ z1VOrC)4k~srMtVkzZ;+TIp;a&egF7o9A`VDwiCu|I-C0RDTfOkViGQK9A>@NJGOlcq>S&d{E<*q zGT<4mJa0U6jT9NW%44 zi(l+v#pTg9!`e^0%hLOSJty6rTjPPx0SB*S9}laQ|0c(x2w#4;=Lt1CVKs!1T;dJ*VbK zanAlFdt@qPbEbkSf4=Y1biO_|16)ovI(CjZ*{{mvht5&_0Z0)G0kXr*?JKEAYHK*n zGjG*9P<&io(Rfs2!{QROC#glZ|8&bY~yOjnRUJJ3*OrC;3yF z8)uzL^B)^=3HT-wjw-9WaCctM@jZiwB8!Ml4>Zm{vXz)WMto6NK9eh^z$Xd0_U7l4FE(@?b9i<`iYrAZ*9!)Erd4AWGf+$aO$!0x5?Me~xt`_s%eQV_2)*vV zxNb!Tm}!Oo^A>IZmU2^fFUM*js2aK6b^&8^cQhm?2?r;`Xsa<#vAF6siHaIIvu(aZ z?C(Cr+gkwuSbYEfs|fj|OL#XNqS~^KwqEd2#;r}2b~^!8(=?%&*jcZEM86#)faxjy!a;bMxka-GtybC`{n2cGfHsA`_~drcVXmErI0!&e BfnmmG zDm9*kKnHK`p2&|kR#H4Ovl+X(<^wb~owt1*RyS@TQF2;8SZ1FsEB_qA{`zcB7R%YR zgQlWo6^Zpv#(h23FSDy|8uPDW8UkN||DXTZ--zrh85dTy`%ZQEsw&}DPm0hC<=53s zTK34cUtC;6)Z7G>m<_C+3k7|=p4RM2)n#jqSxpn)vypCZmN&ba?=j#*GsHfS8bqhS zZ$))XUU0gpjR)NR)Lf^OYN_lTS7zQI6XMi-YtvG6AT20mrw(6Fp$mUKnG>1&L-Nsc zk0q3CZ@Gq9i5SR%+fpJ5gmhJnBgv*9Je-&g>_kY%I9lCl&V+!g*n z(w9xk0^eo&!16DXDejBfXwb>|&cuE<{oTzv6=s;Xq3O#s218fVVz=fib$vpc!_e}d zS$i_2@2MepHutvLEa5UeZN3@4?18#bb&ib%0L`{j*y9jQi@(nz&WE|M9>c2HOUi7N z{D>M3T(erb4k)AWY#eOVI_httr-a-qzR9#ItqGBwGZR!^XN}nA$h6;IqdL>n=XtSY zIrcemrl9p8PrFS{(AEw`mZ7n^>TIc7l2Xx_bkQ`YXxb)Qv0U<_RGo*nG3AlY<+p3G 
z*bn}r2M?B69_@)Z9#jxsjZi39J(!|iYoI3STEie|ml+ibb6rf09L@KM)M|TF2ERZS z)MjaeI%>V(drW1Yi%lm=i0<$_Y7T3Ayk+M{$y2lffyi6aS&2w|kC~|2?xC~1jG?o9 zoXHkb&mE2R?(f&GG^@I=)dYhlH34BLklWe!RE+t7dph(RRr|=l&QdBu-;?MUcJqG- z!d{>o^!>YK1N8Q)&8Awohu^j7LR6G8aP)71aGz!>-B@3#vE{O|cE?1S{QpNLiv06njXo64tvj&*=?Zeq1ag zq^b=bgzJ(j+1H7SN`uHE72_UQB7>M0@Wxe(@nw)NgQyzvtsOk{mFKDO2)gks?8z(Q zXsgu_4%uc}ruk5Nw5i%+qZo}2fVM0kY6)}@=R|-u83p-tR`y2}?t7R50to{pMpeu<-c@}WKBo4-lTEDT`_Tfd90~^m5PfcaaQ~n%ee5`WP`wJ> zmNHH6UZ4y`q1g;+^OmB=LdfNHp)K;Iecx!_jMRowR=>FBpS+0cuy+P9KuB6XS!2v% zc0k&aPeQf&7WZ!n(P*cqzFm#Myj##5m(SzLE7`kik!Ic-EP;n<5x3QqSLCZ*3PN6J zOQYA=NH`3VMe=+&IR5$hkk339Bk6YSJH3Ptx*@FUZb2^$y21R18iMtfU5BLbRxfqs~SJUux(HV5aD+ zVx_&}bB(^fBA&LHle7Q1?BjNlM*X$(SO+pt2ecyh5^vYrVsYR6RVwr0D~SKIRQ^?V z|It~(!~mV;c@2%N_?RATSo#q$PK_vDC^bkmERt)$(8t@H%d>jc&}(y;9ie=_3{47VbD`D?ium;roK(Jz9DQW}1*fK> zFyg3WQ4vX?Eg(-#&G=QpTf4*>#i#Wr+QduDFF@pK#qHYnYI%wOiDX2mkULt?jfAk) zm)Yy?BIPi6e5h2&_wD6QrH$yiTf6Ooz1~v7*R@2`T3W2~^7PPNl3Nm{QhYx_*J*T# zdr#MM!9#mw-a8H*A4AHr#`XKp*NVZZEoieyo}ZQw>8>;EM>DT)dXH2r6N1O7mx0vN z$35ol48@bOa230jRw^>*OZ=@>v4~?xUW&unw9=Tn3*S zd2e$>jLu@DWQlA;*uxG{&;1c^-cc7tm7>}8s??#UNA3DHV!d_Mmm9>F<$3xOTk;$N zb<}bpj-*YcK-F?h*m~`389{dhwqSU?cXB||=2GZ9MZ4stZbR4FiWd5cV~g*NvIrP= zx2W`e4lQ()d0&L?H>enwB64%OGs#Fi4?JYA=}HbOxc^;{m&0CU{=38e$Kd^}JO!cw zDF79{K);MVpdUlV>To~#uHb?I4ARf3H2ZLnS*iA5!2iZ%?-l!fuc_?JHV2rz2e-s) zffnslx?g3Rib2X%r9{8+iF4ik^YrFb!B>> z=gF4?%bMiH6zT9e27-rQQVv+*q}J&@hgfQgP`l3HW(ET!sRsj)5t#Gw2T7C|&_!sB z_W_E%K!(xFxcukV}jHzPa_MAEtyt>a*;q{-Ve|C1zBM+ikFGp-BFp3wW z>$J@iHti8ujhzAXBgGw{Gizep0lz?l;8#1^uE2$QOl9!HX@5CA$7i4JGKsLSpB*eZ zF5BOu=&@~j{1~MMiUJ;m0`bdAJJQH-PO(v&>=M|YZt=6BVk|HS4^>$*#FdG;qquhMc zn+*o%-vN^t5Em`&7tq!HGWVMi)?Fc(ZTDgAiv8pqgeIZD-kQz&U?p(14r1KRJl(X4 zE1;@pF})G(Nqa70PmAvSI^H`h&Bx~$<|7@Y?pR*Y4IVJhHSI^ksSq<TzIYkrAvx#w^(+50d-A%py2 z%ijj~s(Y5er*p+9qDfu^8*qM96^_lowSLa1@tSj67~h<7*?G4Fg6SENx5PZ>Sqi#W*p^myrXE|jldKC4r10D z3n4LR>CIsX!>cP;R2PNKDrw{UlaMrRlGv$~p^L4JF9a0tgDBl+){7A*v3k2U8e7W^ 
zp&sKo?`m(Ard6y9tsS*+!?n!WG%cR}(EjgUe_ghxX+}gX@Vziu?=P!-c{hMG+Wm#i z<&eVB>yA_5i80oN5ar?%_`Y-CGtXJS1XZ5ylE~Fk6#3QCOF9qvCN}lgG|)(!+y={d z1HXiY)yi`cqIuxk9^$WkBxRCQAJdSG z&nyOHQocOp9Rqnq%Owiulhj)z`F{a7Y!_Dj4d+CiNv$^Bcdj5I>se*)iQrNuJVu!I;9D z`{GE1rdKOy${pG1PhSG-v0PAD9UT&mNWz+v%*x9=$uwn_F|Fa%dB!+TdxxbB54Px$ zst$}t@18&)@cz-Lqs{T$bc_8+LM0xJbVS-|jM1 z=6w6*W=BxpOQ7-bo+DdbeB;LaQ$hg)8A1BKp3MbFX>+0;_kpg6MZHYPo6Vi>gFS8d zg5gTi?N2x75jpd0imqtX#|u=KFA-rc?xdliDBg9D#v*jAe2Dolpqk$|F7TN!*P->{ z)|jyFiplM^GW1)1PCIaCZ z)H_>Jhu^~O!jiFJHfwpTe2hS~n*{>$@qSToAOxgzyaVDWIj%`(dlYSK>rjC&>2E5! zuFGaDB18H!Jfq0zA+-~ z@RS{h`XtiXV0sYXw#^rhroIKjo`25bo>S0sWqGuAiM}EI@m9$8-e$Rb)8`_$!at*& znbsv;1>h>W@vGhS>X?X=Z{=ccb`=5Pjilb$LF6X&U<^(&T2I7a40)4$*Ksakw=RFD z;UH}8*eM(#mq&!%y!trYR1aP{p7Fg`JvIjt^Us_UILhQGMhqJ0{rBSqC6d=7W7a zBJ?YuqsM0cm{|MuRhB#A=I{f#)kU?D@9S$HNIlPr@ZJ01mN2rS?RmTw1p%y=A$pPY z;!A#Whn~1=Bp!&I+zb94ILI;U2hE-eCDpd8<%0?OK2T0i*Mdq3WjFN;Pw>rodq-d# z96cXKdHy?{`@63qKTU>fb@tS-$wN?FE@<&EcZG4NtF-m(CRpk1J#tm0sSk~xnliuTB2LEWN!SvqXp(yOZu4Bb!E#C;((f|$l z1yAgWgZLSIXGoEQR!ru=69ywi{e1Xxc)uCfx2fiTcyYKg<)Q6~P5YRXx}GcxpQiVn4`nsA@v8A9-5Mo_fJ6=o3j5yrqI zEL&&W`M#x*#Y0HsHXVoSGmIdU04Xdy(rJlmpQ!`mWb7c$5ln3C!L!R?K zko~5gv<}yb?`yrQ8S6SZr3~*9@HL4M9QB%^MX?z95uEMKEp8`2eZ_K$8 zaDcM61>3YGesKl_D~y6q_izJF5MuEYu^_vRX3|l9CfI_OCIq|MMg4FtTuDino9X++ z*2=#FHC7rJ*gh3bC~sr%%4zY|78=x8vnYD|RfJ$Lj9Y|c zCiiS8OQ8t>?yVW7T~ zXk>BlD|c+R8>TF8TnY7j*NA~Yg%bPw+E$SAIWvezNztlnw;JGEc+b~Wr@@~!7MEir(PY&GHmkhX&+tw#DAF4vaHqahf~ZA9+hs#S7b=Zzn!3le;D zW?g4X&2*+hbU2`0<-DLFtr+$#S5%f~Jj@ic2 z@2hMy!u`P^?7SV5P}U=>?hKs4!jqi>!>=j&P?}~&7Co;8wDiIjDY+)&p&mMMi=%9! 
zE<0Iot@at_4MeSS=H~Z)IslEd!YOy$Z=-4(f$`3<{k#&y(e}<&)BL_1tj-mWF&4N5 z6Ma%z7n$qdtJBbFe6H?aKJA5M+vBdAKvDNGx!is4dh3N}_4|zJp+fP}PC9&HG7XT8 z1B&ihma;GzzOv9)ASB+5J1Zex7(A;wYXNMaVP9;_5iHa6$@ne5Rg5|PUinlzRyt9% ztiMD;2OuH_=p2H97qwqNu&3}ne~@6t={FZG=Jiwm6YgUyW=4#iVAr`Sv+vi6HR9RUv6pXzwv zY;+muy?HKIp6ZUGJ&E1n2y7Y~a${1Ey#vnta&XpKd$W2UL+`1{Y!?i2Oke`<~zfh3-5|FFF14~I9 zJUo2Yu=&wyttxJcle?DZ@nA~-W5!t!477}&KQ)EHeXnPyvh)8Y8s0*YyfL8p?>Ysz z1O|JC%U0R3Dk(jb3zlxx`c42WLA}zP)agWJ30Gzle!E%Cp;bHTITBD0=Q$S&)t`%q z8y3j;HB2x3*j0Mm1k%MmJp<$Q`UNBzzP#cSov&MwKz0DI^ih&1xt^R4fIN22jc;Un z=(7CF97M%xD)DIW!$i=sLN1$?1_6AIWZ$Yi~}5G(Z{v&kZx) z1aRZ=-+6NLFkc!1w2&5vrJO~}P@s90!ovVG0r(E=xIN>YIApG{moEx?cC(_Kiz}50 z9`-CNn%0w~EGxCJZS4#{^TyTwzD7D>VKMjNCuBUN+Gt+`z~p!V=JfK-gcQGtg0*jb z_n+GyIaQ`PBHtI7gf1?bv_|6`eYhn$)F00V%avg~Fho_G{Q~58O*TkV>9es`EusC3 z^L58?Jd4iayO|U7DZP*6?Vo|moBZbM^7GU-qa%5~2n0Lnux^Sw?^Uv+-i|7PAN(b9 zpu16t)>@^hiV#0;o_^UQdNB35ivkg|2S~joL|seX%XliRz0be&_bjaUi1N>+J{HX!oA@pN*fawox$4iikon z6odzrULPj)j(GAcvD=N~5&s`NiOg!Llr-q;0mHcY-dKW4cRvCLJ zhBqF19{t(WvbS3MPWj%{5Ud+Dd4hMCToL7&DIo1fopJixR_j{p%Ap@p)TSSS>Tn6o zT|?)98BljH1~`D}=?2S}?Bw1vKEmqCUVf;!aRdZ8TvfFWAt)Kfed!a0h`*38vGS=> z_Y!a6!hYq%8`V=1O?3rULNc;K?Q=yuo3y!@ST?kL`IE!IDQNzPv2q_KKwa3kVi`tD zA8@g(9Qm#`yc`kh(iV!6M{SYHx!5^{8ES3r72+r6w_Qp)-KVc((iL8u4aMYD|Kud` zuIyy-mDko}RuYhT3R!byYpA=X0gJv!1Lj^}=>Td#7nXyPJs296hDQftl>LAPk#l`O zOVLpaw5gl(Nn`D~dYWtHubXw@2kmIJoq~SDQ?)~$2_llVY>WMlqJvpO%?ma2iJ~Od zw%1WW9^nej?8D_=Ib!9Y`+*AZy46j4XoHZQOwU?H#aTGw-$4P1RShp*CFugVfn>^y z!GAG8&vg9Xzy9U%2hM4g2;cG_#G)B{!l?AwJ;YIT{L%SS!rC*L zZOjalPSRUutt&uhEk`yl*KO;;F`a4Tx{!?YRj$o%Q3A0p3A0JimMAJ;6T~z+2nWil zxhl(>W`sey>%rf+UyH^uDg$JLg8>W^ z*s?IQ&rH`Fa2n`Q_kZ6v-#AFo{ur`E4eABTX)7>&8x9)gELTv*=w#U3mNP~T-`@@@hAXWvw|oy<+N9@W`rev%BzyQTEf*T2{;Uo7-wkHT z8-u^o{Ql8h0{{2_I2R<-06R5p?x6O2EbgmT6LIB(Vq8_R94V}C>_7;g--ub}lr5?9 z`o1`Ile?bi$e^%q6#=|e47I_^nj5IQ%3(dzjurzRid+{Ft(YWOyIH`n+J__BFDgv_ zQ!oRv!$8C=ze~4-BVjTiVQHU1k5F>M+R}LkM_`Yh7Ft3ji$mp|uyY7J{YsOSaq*9u 
zvqAd&L6wTTKiDQnL9^QTn=(Zy=gae~EKjS8s%{OXh8(D|A_;Dn%*FVd)YUTnAEs9! zwk8EtLN4^_AKfwEU~4PJok`~oDp}B$x%TU&nXI+ZE1(kwJJ>+S%o}Ut9@LU*J=q?w z6G#`0XbVov4E5Wc6>6E6g+6OIcm!ww(V^(%x1&8vd%Z5)Ztwi5nC{l9$7U-nFb3>( zmERO$quOi{W2VF#9GjYsXAXE7tnacK3leC&4>t(Nc zAFoIdoFOiNS0L&5O+0poB;rj+Q9@{AVO9kb>Sylj$Utb^TNRk~Cm1$0tmlZ6f4|!Q z8E5#f>kE7UZg4&EBoIHbewDZmPO969bkUVX8+2sojATGhWdo4{f*Udebt0N)hqNbH z;B$s2(-YyTs{|q=AAt&Cqv}Ut^i6#ze$1~8l>X5o$0*i3!bCHjX(+iT8_(Xl!T7NS>GRtlnF zLc`NQDuI(HDWQhy2iU?dvJ14q!E?rPw+28%IJ*A2b|R(|ZZ*X);9s3>#!R3CdZS5`2%6Z4Y~ z*-uxuU;ny%8Fw63IxC4_-*PCJ$Gaht!_9iU{6!Zhr9i8fYhH8l%~cP++*TYl+nl`z zzFh>_cbjE7j)js$k}Z|$V;!_X>{V3<^Vs#8+B&~V7u;VaQCf66W>VFZjAHQ2k~xR* zh>i(9{1K@(gw)yP*=a#bMte1Y@m@5Cqo&K-uB2T6<~&3dYSAD)ed*2HuRrA#vJ0Y|?4@=@Dkr$)A~Wr;x) z(1QZ15`S5A1Kr2BiW%8lj|QXv4h;Sqd&7mVi23ifPw^N)@N@8ivH?FC0VE?ws^KI-JxlSe36<{RP`d4Tjwe2RdVRKncpuA?f$=oK(|Zmp0%j`8 zp&AzAFntmi++{g^vA=X^y`Gk7A z;k4UrzoL__2cQYdr3j`S8;+Z#CQoM?>jr!Zr#HE@QcJCD^E60PgZjT$IAjoYax0sL zlng6yMRuAO=Gr|c`f>xA7%JBgTs$^->EzJ7Z%{Bp0QGF_SoZ@e zk7*w!;g<&E!P~j=Hr(RFA?hOh2AGNKpLZ&RB1e^b#L@1-Stn~(x6~2WdUaM)OpB5- zA7S*oPk0&hYG}%$!$&eJSE%xR)w>(UfyYGg1rIv4IbWCMFz|{jFO0nuKQ(BVaE|A@+7} zZ&S=>d2Vwn*oEO?PU5!aUHsgn9$Qmar%!xz>{=;hgH2R&R0}^^yBBxCru&A-aco8# zJ?+m)JtIBs3=XOpu{uO%g`!9|xb5L0NPKMYLeR}BjA4E9{YB5w3n{Z%F4Uwm8;VcK z>tKZ5DaC;3o#EBI)3ON_OdP71&3=x4&b; z+y0k>*=PMK%s^uim``O*Hd0OTy>$k+?mnNeaB8JKo(uKrb^PNc7>Qwfwo&=oW(_Q< zMc}aK*4=t)Bchd#R>z2(%x4=Ygf|BXbo4^8-lwDqlI!i1K0f`Nbwnw!4?u-nk=vCCGVulYOHUo4abJ{|k!v`@>%x^k2sWigX#U7(5$`_o|%Ud8iT^7X4WZlf};u zEDHIt{mTi8FaBLS!nW7Z&z52bQXonIt?u8YAcx9MkA z{!c{1AGQ01;ZcqCHpnv71K2%`Nb(BV_f5gMy%=Fr3_i6#F$>8$w8bcW;JJ--9qv?> zsp?=!4fd+&n$xXKawvITV-Gz@sipBt1n;4^k!u!%9s`fll6LwDW~PW$2*Bn)EQvi`=zx@6xariB$vbKK?o2 zqSiXf71827pgjLHz%ipifb6}$r~!EIcEH5M&Li>}3FSA&+AO|sFVD~=T847ZZmzY2 zaJdqz!{X;RBaC}(oYtLIaPctC%1_qjiBKReXSG7+iE*P+NQIZhxsytIVRObp!ry!G zWM#g5*jDGlU7{-a6|gpJ%E#1A52W$~Pk@&!pv&0m$XOmslgoGVx!W-FI?cK9p`3F) zit}<}SdoFQ_IZG%W8s7a9DBE;b}I5KbGv(+E84I8&ywe?HLhPc{_YC7<;xIWMjlGL 
zNU!W!Q#i1y`2|Qx6;ol>k1m5}ST6yV_$?0%HnU!&zRxXQw!$5uUCSj^miIlL!0j@* z+se+?$>pBi%CDG$m3sXleVu?`gX?6OODrTRi*B#lMqO^Ty&NyxPbOL2u1Z}`rn2Rq z%nrUM58ZNbaT!6*@vgU9&`x%d%9l^VVpa+zvbqV%TZz+fgddJElJ7I;pMIKheej)t znE%g<_xDl#>yQ6)V3$dt^?;RQqGt-fBZF2=M}Q1BaRWf#c1-3dziA4!_%S~P2l{sA z8~2#~$$^%BpcK>1&;TYBl6)r-mW)YJ64pf`3gpZrNwc6Cz0`TBaG>%5%Osx9yz?cM z?;Hd~U9+u^3+wHlPlb^K*o&?Yk^C&wDTE;uS33Rld)=&DN4I=I58R~n4)K)9zw zYAl6IAi1y4!^dmvtP@noSk>uP)^k&ppmhM(v4=)3q^q9Sb4O@0(HXzojkso;AOl7~ zOW&tTucJLv@?G$WTj!RRcE_l*3uV|RyRfM~G`P1YlFSA-8e=&p-+WcLmopQ3r}t@m z^PPpA{+_uT`hrZFYCUa80McNHQ6s=pH?(MB6LW5)BwxGk%btkaRMib9RDx+eh^x`a zBdbBy2nqnLVFE>-L3qG&%d@dDVcT(nGvB#AuWFQV<5;A^JJ$)DmipA7hg2i{z0|(^ z5R_oD)~cC4*HFe?g?esl8HK2&U6z<9Jn8sRAOCwEAwYc2^!x7jD06syZ|9 zI{Z$C7rQ`oZ~AdUb2V0i{SONUqzQ&@8h%|K8=};nOv!S(b$a#8A@r+PmvHyO05n)v zWlt2>!aQsF7`wQxPZ$V?`(X+@+P%;eQ@i22Ov>s6PL|kF;AQ9U7$FuH-{lIKijOIr zk?zmAx|me{iiR!`Eq{V10Wn=ZAf}5>x9z2s$O>AhTf_@J7l5aJ*m6VWDHUeIb9__Q zyzQvtw#bsjJ#xjW}2j6w=V z&wJ9|)r^GE&hJS$piV)w)oKNyo_NZ_CnD9oorzDb=lmPi`}Jo++fG9kT3AF=!^II5 zQ>__J<7v|-#l+x?M(y#|0)U~%aaJTe-6@-}qp#qCgUQ(JcyigZ~h9X}UThG{+dkG!VKD7djnlUSt4D*MtrGEGzH}bCE zOg8IS-#A3PU}O-dg^&lQ99PF`$m+OW*eu=c4O5vwQbHk1LnZ&U^GZX6pmEA9;u+bBOi7BC>d39!08MiO1JuUq4loaVHK3Xm@FI%n3X<;)& zdTbQ}-Fc>FSm)1%nRVNcFcj0|Y)u_PaH3FX=6gf`9%rfP zWFO4R4H@-n;LF=i@c|_sdG!62ul?-Lg5{mwENQ9xVppFXr=cD2)5zVkWE-?N?@mj} zcsH@7zq-I(u$tXcv65WL4Z19u(QCpvNxwN5k`#^keLp!7~Rr*VR9-(6lFuk0@Shg#6dq?MhA z?dgV@K5+d2fCx>t*%i7*Xf$|7lLO|wG5cYTVS5;5(R)u95R9FA@@Lw*+Gt1nn-{bGVxKEhf?-zkAP((%9Z8Hb&J34yQ%|zu=R~Thwf;B z4+j<>PL{{Il)&Y3?%2gi?v+qb*)V`vNaw<0Q$h)27+C+uZT|x@^*`IgwTvE*H5{A{ zBB}*msc5d6jxOYrBXmE@!BNA?kpV2}wqc|AD-r`PRurkYG5A{q*NCPA;n%=ZX=jws zim3am{*8)Cg`3wk#kfHpdDlfhM4!+JtRzp)Y>*0)3$H&6|BVGOiZ-3nHPdZL**&Mx znIwTf47gUxRQSUZ8qDyF!r8_S5QhXm4V;)bdgWt5^2bx6 z*v{g13R8q&bo<0rG^$;?ieGXuVctQRAXQ5Dyw0G$#MZts@OsPoNvSf$WaV9&M6tAY zeXYGCFKuymO9alhKyU`_1Ce#`nAWa2DNRCi(A;T3@}rDah>dkIv?E@50y?&{eMQkv zn#u)c@45H8yELF*BIA{-K5>yz;qKG}j~|hE-fwAYy{bv}agF`0JwPES&c{p-Xm|$f 
z=nF-8;YTxr`f2%mO~&>4iwW%AK!ZT-j)*V=PQ-%B2_Cy-wQGeSIH)0SOdq6i1kl3cn`#@ zjlI8hug)lF?&ky-+b=09uUpixEzuhJl#k>0|t5~V_)%qH#U`%4UXiMD{3MYNTLv0xenlo z4~9#2W5HR`dvzytQhJdU_ixvUDU71@4;jh4KCvZJfyP-&6d)e;$=*GaCH>tw8DUUQ zy{PgLprTom&I^kHuwurOkIxnXR$*c@2AY@rK-RJ%JPZv_BuS43@Ox;y*dQaw>TBB! zH|I(AVsH9T4Qyh2C+ry!IWs+*{}~{XG`uh6UhRCp#lv=U^*)qulNE&L-(i*+E^;-G z&;^%)_l8xU<#@mE@f3-kwakSg%ftBF+U+;aXK4a*gBuKkOlw22Kfup^HHYt8;*Q8K zPurof0$@AIFMi>$B-zI_sI%U_zjcHLE@Y(Gk`#dU6uG<^M&ihqLbW7SRtt-Q|6z?j z%_GCB_HfP)jP+{&)J_(>_5O$3NPm5Uu9)xeXl}2ayoaDi>PbuZjck(7`C}-R6U2tD zM4VUV^*JX?{nQmk*K0%3QyMP%cp3_&1FN zdb`deJ@-uoM4#B4nN-nQFg0|X@j@<(MVCMdy-H6)giUr$NF#hJk$rNQJ~ullvY}nK zf3@@Xl76iJp;Gnp_Y3b{fBGFtP^F$an`{Gzl~Kxa{PuZ2#K_%i7C%1>AbX$C!}(}; zHlkgqf2LNWqwn`9;lsW|xVmut`Pm?C_9gOb_%&V-BUd!X>`Q}v4%}>Y)s=Ce9%Y2t z13ZSjJEgcfa)%pzM*_pji5e^XQi|GY15kAImBOHupOiL4#w4Oc2TyEFL|7LFmZ!+Lv8H!w-DDCG3?PhGJ4YG_O{8 z#Zu=>&onmY8u*CE7fr?wgTiyJMa33?0a!^R6AgXz0DTJA`_!6>E=5R~`GIJ$5?$Zq zn@}k^38>&eJYhsEjh?Um=x`hjDfiO(vh#soYHC0LthVXt&G%`e$okpqQkMP7*vsmI z2jjp6KLpO?Ekjvy;<+4bY$rX=nN)DEe2UW!GK8;o9^-978Cegy=dl#B%TMs74uYfe2tbM!qTfK?V zY1LN+tjNISb=HYZJK`#YKpcUm$(`c$*L+$Y9bQd9gJc|6ocnB|3Vm{oscd*a<0X8W zHxL04M|Dt)De52Lx``FYN4wkNk;u9eJZrBDNSsm&@>(ah&^q_Zo43_%)K9U8?AoXF zeWE2EU6~)Nk8Hv2`!u~S!1j3m+!)~=)@;0X09onME^qmQ9=_ml5EV8m-wAN|?R)G}&y*vgK0lGWr=3;ixL{A89o%#~YN zw5<}Hlp7NMGZWQ`3U|OogS_#ZAi3O@RY7Rj;{=`Wo;`H`>yJUjecExXO)chN;&XBQVvc zVT)bl!>I`0=LIv)QS1AjU@^5Bz+BGz#C&dl2rkIJy1zMp|d|U`Y}i1WSU#GwDa>9_<=Vt1Ttd|&*d)`fz9Xg)%y-L+Fc&c z_g?7LgS;TNf)E=NUGi1@?<>H@C#Lq)dcD$=Cb{(qa5Qzu@a=ql`RtUM_UyH>A76Q8 zux^5rYkR2Eo3CLrg*q`cNv5j2y4{6rusAQQ4EbY3+2~Po+qg+?V^?VSfDL>LU~Ma~ zt%chdITa>Ax|lMT;N<}HFE(d2U-O$ zA;1G63Z~s}$hvLxxapl7%z999@LJ93bpWPManYs-9Og^a9&UXf#;4fGww<8x0ztLd zfM5J}%^-%0$#w3iOId=85a3QZ^G5Su<8I#ua_CR?8<~;mAxIC^xRKuK{oo3vy5afbxR?yXkl;_A@2btU5+R zYl}3)Lq8(<5>VWYA+S8}RfK-4$c|>?G%jHDgn4ce z3Ky=;`xS9DWQe(+*|cLpC`43Ro}pff2S9j84y{5Rx7Cn@^KJ0UEo*wr_Oxjoe2vx8 zV&yKS%rKgb6T@2oyplP(swr>#vZ#_Fd~+=5CIy^tt%z5mQrq&{-99^Nxs{MFRua)S 
zo3hXQEl1Lb9rJz8p_mArQ7y|`k)xubh^5DXsQs_8&jMD--2PS|>;T^D>cEv}vwd-l zI)DnXSY=C|XLEgxWjmk)yxVAXyoBJove-+~o~qsZ?KHkfa*@e%Km60=gT%KYmYp6A z*5`(Po1*BM@M+J)==GGYt=A1bD`sv-@Rt8j?tK-snAuzUxRvdN`b6;LmoB5UA$U`9 zw{-kTkI;4Z1eft_xo|40hakH(%C+tBg@@PO(_@588F4fUH`k+}2W&4lb&LNMpv3Iq zMMF&S3y-$Lb=O@lgxta$SzoAT!+8(HQLoc z02R!~@h$)s2_soj2qKhj{`7Y@m$qBUSIC* z=*{o^y%*}AM@CHc?X<22YifAsyvI)fSpHG*oD17P2?%*@-K%R?Nm+jAX9r~-09zkr z)8bm%xT%r9ekS@W$au>Ky6*}``(uj)pf?Ks$^Yl<{1sxr;BVku0uXc+*7KAW@8I2X zTjAI@qdGhKMeUZt)jx59-(aKK(@ZI2NWi!31V5ThDDXK@}A0mrBX z1pzj6k;nZ6F|-RXTS$AI?5O5?))VuufmP6->q&`g+vH+^a5yCm}_}W#QxF%G<#Hthu#{+^0Zv$8mBj#cZc>t=8t`i+uD41>U)o$_-)EFqVEYr)fKlucxlb|xPR~H zbf+{}TTi{@M%HqxYK1#fy?;)0tStV?1)+ANyPRBg677P%w}pBj6aSSSOfPFXTP4$8 zwF==H{|bn+;sU)|EB~YtA4>ulysIKzeD3mCrczY{sJsE0S@^1^GWC?$;XpG;S~~KE z11|mD)`Yt2J}|N%5uRnRsq2H}GCt{1Ixq6v!1VY`EA<+!vX>iOwOpb|_qBemzLRfS z4gbl#e8Vkl8As)P46Ao{O)qfTX$8K?Lio|R?&z%L#Mqnq>}rJQIW+)V1)c%En0N0H zGLY=t0QoXJmA_JoZbaJBh%>U!b7=`OBkA1M?lnpgbP@fSXn|oLOicvKy}fUAx? z9Q}*m)qycQFb^T;y`R4fI-IrCi*L@!t=U0>HFS_$+&lH|4p6XWsEe!tM8@stz!jsR zVRMPsL7L|XLDQf@stai$h|5YdF# z-`IKGa#49-e_X$A!AEYyD!_lnrj$&mOjM{x*31>ZlU835P>3uHk%z2`l% z`2p6-Cf=(xeU1cIj#}uTD|0d zN$&b+{Sl{m1PMFSfdoS{Zi7LWCnB=5cND42>={MNC|TJlBYS5%A(a`0Iu1&q?7ep!!a>F{j=ddw9ozpo+MnO| ze_bxu`!e3={eGVNe%-HoJokOW)0ODrmq1>w9xtx(pY}7V4z8T48~z!r>xxPG*PEcgs`aUHEW?`|<7z zGN=hL%KU~-rzKz0WrDhuJw3M3r_GsA;zEt5?dj?o8gt}}R$8Ge**OlMI~KJ~2MW_h z*f1I4nNsW3uMqG6grNsDr;_0`kF(EWlk&=Te)xVrO9N1mtoq`4chQh7?D>=B7dO;Kw>njqxvaw` z^Ct`Su3b&b*__#S6E;&KFX0z}cqj(8rFbv)r?nRC(eYIzGuqwuB?e)Umtihdnr)>EdGlf=eqgC)*!S%~q_yoIs~B+s@@M=u zyCTl7rD&ACr|96snwZ!0feUcu!yx#XY3YzV^ly^2Y)0&(WJsBULF(d1+0aMj5tZ=z z4P^_1uWnX1VATy@a@F$p!3oN;`^uc2H=kXlJZ*XKT`3My*9=*leZn!tm}`MMdWdI4 zHFLFQ^rb_`@Jn+=`{EDKi_DkJv|3^ViL|vOM&qR%c0&s;kP0emcZb!v4!VZqtma1$ zInR}sn!15w(O*mrKZHa^et#}tSihob2>?d*p33*_f}$bPEX#QtxA9Fv*80PEkMn1~ z`fTjC0i#m%9(yC@{fDT$<$?35q8S79d9-{w#`)&8;#Kc!bW~u4HleOn44P|s!!bFYtcaQ@k04gByEGRhVX|oe*+f zYFDMV!Z&(U(y)j)$2ik%iM(m?DZCb9UZ^>nZ^Onj;{GfvJLbUMggrJgnYrx7P>Qap 
zvv0ItkN0YXA?o<0p9N=?cj>q`+Z3+f)ok#-dGqF(efOP>>Fi;SU8x>l;ATlq(P%V~ zNiET%206o}h{G%Q0`@q@=<}D_W%-65wu{^hZ1>POp>cpa&OS7-^e)%pH-h}A7+WU5 zJ8?8@V+eHGpNGF0mImX82A&JVD}Z@}d56leXQt6%cly+C7eR|iGgj{?pWwOtO5T+y zTMLJ4*Ka3Ub8O~VqdDV4BZW#@pN3xQCslYHUe^4Ot6nFDfaU{YQ`-8H;BdgXo>$X2 z!5jexSD4KJ<}_Rs?;75jwHwM4oC>CG5z{LB#%s73_2A+=n(;g}>8IelpnbY0E}w?- zA(WgFP7)xRQNfS%Sna!4B_|;!y9#$(I+4I?9n5o zrESs;+;L`A7tmJi{<$x0x-3jKxeQ7n<;dQ6WZjavk5x+&IGGKOivUW=*HWBm0OX_)j~a??@WDkP|+ zI>iUYrLRHwsxjv{T+*%fxS9-+_nEAusGtz_8$^c^Q_GE=o@#TRLivVpa};Q^&HH3S z{aTNHf4+@B311nJReH&@oKWnrJNX(@i|<8YS^)-<>Ovcojl(`%Y^?f_8J|TFPDiKj zLn-MzAK|>TZfkbmV4} z{(diEDJ?oPpJ&0r5935GHKa38TYH@Ee1X;I;oXV20TZG?fyVYReg9CLrNDhPUcxIZ zeCh`}K;-hq>G$BJ-IteZx{eLfH~DPt)u*tRS57>9!yCPNrTUpAGs3N$VvjF6e8m_B zcKtkc`>>uAjn?}Zjd-I4!}@1)F_twZWt+$<$~~u*v2=Dlf#E$^t7_V|vi%nBYzsKK z9=0#HKaVHAIUyUi-NJJbVo-(^aGmi#Z}&b&uoHV+NK})efsCaUGCl0GFL7bt>ha^R zak^8xyqfs8N7DVkL~4T`{afVY-_?fK5_BBYsTjbdQgGZA>U4}Slq1T|=p^V+!iQ)F z=WOud+SoQZB=hlBW8j}<&FbiR8*DB%jQ3hlKD5s5YiKa!Gc0bjND(Z?U#wb_MT}_r z?Nw_~XZO*DH%YsB@kU%v`vi@M73b?(nftr&c0<@1_6lAL#GEdDN~U!~C>ZSx1wRVd2-- zZ49N}IjAv6qx2n)cPCp5qu4j5SXU)1@wIP~ zw6ZOUI$2VmsI3d;JJMe<5X&JWp)BRgDuZ!+< zS7n*z@mkB8R#PCn%ua)|RcudEb2NhKF4Zzc8H!GI(5odXBEnXba=KdK!E#fvD1s(c z_gx>QwXBLuX`fSPy1r9Hfb(|WeIx4$ck8gpyY%(D9iU$+g*BvvN}Lczl>~76`gLOv z#aLO{WnsHeTVTpsRWrN4VkSn;UGH=Arns zxb`w=OK1s0TA&`8qg;6$UF)OnnYq(sUIN(~SNZ|a`Ls+`^9k=~;+fx#c_W-4@Dt;m zt2dcBEj4aTM&3-Pkkk|P(5b#r7+N>6^`y5q<$OTKw`Ym@AE@-A$!h}Pgw=bgO3QC< zK`>+u2|}Q&dr0V`dC^znp;~fmB#sQn2hKntT-AJ&FF@JDTfE^+Z%@R_%7ZZ{MEN~o z>+Ra*rl2LJU~Ol^R*SYzD)GaLV z^>g2U0?wXDjm7D4{*5P(QsI?mm$3+>4^dZzKip)l*pK*6tUl`xYZwLdyw1T0_r z%0oNIjYNT`2NV9`EK-huIw{mwhLI7(b-2FK45`p7b^hF;YU+XSvv;6uN-zoM1r=!i zH%3MQy*}jy__=cUpsQmGf7dDQ8k_y{9fbu)!`*_0`&RS=cAd#T%He;}n2Vg%Zojqb z*MwbhJ{mDp<;+5|6K^?lti0FT;%gpYYl|t{t!`(Dh_QjA3o{SJ_(A`Gwt1zyGX?w@ zBQTbz;sK9r&X*OsKa;(PxS+W;vD#XJZq2+7cGqa)BgP3q!$C)IRu-LWOqi>2jly|b zy(J>Y$>*a1rGPOVXtnoL4fFzO2{8djF-p@*qH}WjWnqu=J;fS44^}(Lb;>R7cso!D+U?e$*;n?;cMVCz$LSuv6d~G%+D$>snq0$%Uz!8 
z=#8v?XO$5ON;UNuV-hO%(4EcULPeglx@-HV#n4YWTSOwq-+|L(p_lmdt37EMYTBlJx2U5-p$OJ;|LRW$RSJcY6Yfigd~XmsBrxp6GNG}+oFCFykVWFt1V4fE88Qc zJJgl&KdlC-D2zpMbaUbrC@FRiNHFUJEgeoq)ZbvggUt!D2TlH;i7y+C6&R-LZ>m2J~H70uTok`j8&i!v*fD@x(GG{`24B>h^h4QMRn8EKpTLP21nq#>QMdtU`-3_Ra1 zpHa`RfeuHDE(l3~ zj!SyHjK<+M!%jvKW=f|+B`FK|eEXBJ3AY=P$cM;jRAArc3ygf?Q%j9~j_)LRFyBjj z{L%^>br)EDrq==BSVPq2+Erk9h4ei(=n<@PYg={ng4P1}0_JAEo|@lYI?fz87B2gQ z&Cq+aFaNxZ=%~+apKStD89nQYOFj3e*&51kGn)JClCskKN`aM5W^msMEclf3M~J-Z@9d1H8(!J(tB{cS3sC^uZ>jnDi+>K-nMPq;EWdE znagL|*=cUO<5d4CY-7IPNexkM{b;Y`3FV1+2ci7%WKq$92NyyrC;8HIdD%Xgj%kDx z8t%>q`5!pFHbo_PuHK3B>|fbhh)H1aekqe=u)B$?QDPRv6)mBqUGLo#)4~~phwzf* z9REv>1&XW(~LoqBr8O z896zW+WG>uy|yVxc=kYi8Pl2tIGjX2%b26qcD#Bi6HWjYl`yM?(_CIVzOUW^gE-w zS_9qP*##aoD9m&kR$R_d3)M9F1X@V~qwp}s&oJ0SrL^D!s&J^f8aL>TBQe2kr({~o z^Fy0TQEp|ksr>Kz{<)w4Lmruv#dQShRmB%V=$pTFzQBemg50G_;s-rU z4z<3U2v8$xzhM=>y{MO!guUW9`hIuG*4EYmS8_j=ofKwe(%jJW)pO}(n%1tfeHM$u zp+V7j`0(`C$ub69$G9E%6f-u58{4$aI7iFB2BZ^=i-Ul1agSQL)Gy=Hf_>~ot*#0o z$`x8-K_QsFXgVYLpjbzaXT@ggKD?OO{|09r|24K^j8Ds+YqDV6f#z}byLTTPmD*Vk zG|cPfg+lE>U|-2okUopS=M6AWyD_^_bj&0|rUZC!w*>Fi^QtndT+&b44z+q=Z1a^U z*7COsPVCZuPaWgofy|%6ne@|CxWRW)@4y++t6=o`lQ}=)^P1{EEsX@QG#?FNwxM@^ zpDqL=PG1l5=zt#5@|7pEz5keYpxErt(g zX(v&W5u)7jyuw{IwF#K!i#)YC7D^A%A~89`0t2a2^d1gQ8a?5#cZ{5B3h>qB?JInk z^fFf4%H1pXoaf>;bb7&9_2O&Ym5PM=cbxoTx46^-_k$J-gjFTSlTPp@n2mu0YU~gh8|Ngl3*sp&OuGd`k6qKUiz*YM`cFOBT{|Ae)uS^9})wC z4SK~qo_Q3zovV51NXRiVqJpL$!ve>B?g@BzH|Tn++0*^ zOMNiywJ{lcEXx_O(Bj=q2ap9eU+$hx01Y7aUnVm#F}=r3*O8yT0;E>6># zjo~ulb6ZiSD$=;q)~f0J#lFR<`N<(1YrjpvX%EtL?lRD`nxNYbPBr9$Qabqf0OCR8a8&#kko&e6!3}Sx zXm69-XLkNv&J9`}I>-K)>kTGT_8z5uaiWGB*K1s2o&vj7V^WHb*PLtb4(UjKxt3X6 zOw7&ZPVqRi_%v^&=Mr3uQ|+^+!w|0GV@*vaY`~V@AJ@|~iF}5DhNE`UWpontptYdK zb*0g($Jefu?+mZ=m+j}KYPEC;s(~*{HB`P`yOyqM)6QxQ8L+K7zUZ@MiQ<+}_T1W_ zO8`+)+b|-Ra9ktIVr94a%0aW4l6_k7vEj0$C5NE#J*T$kXd?N9==kd&^&*v_w)tvr zdzU~P9Pyn14JxgTO-vN$9$!@|Z|!3MBjEz~Mwc{W;@I5Jtb#pL)oH8O4yl#v(UW^` zbdGs*uV{@LleB^97qV?BcvWy0)#dGB9c(yQ`23?m;Sf)*Pgv<*WPp{$WPQvPyKfX3 
zVuYTeAZi$)3?yrsQWka&s=$$WnCdhad{EbazOCs15hrKgbJ84l?BZHL9WSfu>n<#= zC^kZva4hKUFlRRk6x2MEMZ`^=XQ{-#?=^|a?6sTrK67S%r3gLSKi@N8;{oU`XlLQbc`)F1Ojeu2~HF7_%ZNabs;Y6#Yck5g9iBcVubIOwgg9wPZ0A+EF$C) z{$iz#G^plfLSmfwkV{4dXAFcU&U6{doB$ymOB{mCVX>8 zA~hk1F9S@WfCL?RDL*K)x)*-0=@FPg^(lL!pVsByIdt+Tb!c@d;!0O0fRPXSirw>a zTeL(dCfrRME7u5}BUqN7k|>OG98%V|3xR917!O*rU|*;;Bvl@)$c3&@xo z7F!MIzK6w0kzM&tq?KL^2t>9Pp&7}gS0JvK0ykBuoE0;Y+heGUXAOQFXSTP#fw&JKf>--ygHXfpeB(gx3zATl|Q2lDPfP;7fs#!CTm(K z4V3)6TFoP~7xw)!w(Z}p*?kE$xJn7RBTTjg%eXX2!NklpxcSR^38 z*|A@=+#g-PhE0c>gT_k(=d{mPc&i!v)OjK3HE6`s)DLilg!xHoFVLwYpfu5OkX&+H zTHmLJ2zLf(3)5D`Jl>@AZ1WJqUs+p+Rr~s0hP+lK$SX zj^9#beDTYPSqL{U$Gs)356(Pl<{Ei*Y!JjntGJ;)taH;JZoQU?@y#qP7C1U0g(HZ+ zzFx*1D^r8$vV$}IJL;YbWpaF#5AN;IoJ}2#;Pf!C&JY&KXp@Q1qNL#n6N%Yxp!dtug|el%>cmu_Av3Bu4)& zT*Msif~u1VmyV*NTt1nSz+mZ_W>X{_cCEBQ_{6Ci-D%BaJyKi;2mBB)-4%~X;8|JZ zvG6^`#I-p%i@$EZ{d?T*(GFP8zMgEVeA4ki&8Nh%-+r7_aAjLL8(DhTs2N2th#adA z7EEj_*_upA2;E2924#vb#K|^DGy1C-uZ+pDNVUH=B}7OUg+pJ&*E$W+n`NtbPOhdcOjZ?(T_k&7CTx|>!AGQF*pVA_Z8GqF`)g(ySA;BH&kyZ7 z`aVmc!&m#ESc2ubJMS$Ar}*kTU8+VrJQ@k^$H|+Y`d;YACY*(AQ%RVAmP(KQvt_^c zj`*282NBtt3_T&6!{x=eWqe*qH{?y87%+GtIFtTGp^bLa&3)T9qI@a`?;qN3oc*>7 zSU**XfZPba!g~r_d$D-$@#Ci|jm7T;#Le>Il&bi_I%zRBXS%&oacuZ$)@~bV;+Wyr z^37h%upq5(9INwI$CLGXF`;u;dxoF0x!p8EV|&+n!AZ;2o*Op1Aj1_JVo=2TB5>9a zvQW8Z8@n~n73Z^gOecEs6Os4zk;A?+>f#I2dxzr*Njx{Xcuk*6U5-8fp>*L>D3eHo zBgu<9(Nduw=I?FZdHw0CPa*vSfkv%6n|c<>4O(c1yy&+N&%bm$T*P)X4GtE-of0NX zGuKMlD&bCvwoe8MFGkmBSKHbl<2r7BG3Oia;h{HFO3WNX-$@SHri=Ahy4rTdg|wZV<%#GA zA#=zT1!QwWFV8qR9zKQqv7g`0a8=)#M%7c^U~Qt)89fB==zt>pRiTMbCK#jF@p^ zfpvvVKSsUq!@8e8SW%MOqrKWw-Veo8O9wonH*!M7hQ#fC_tm7FDhlVYjl;}ELxlnr z>g=9dpR{}PvMNod_3b~O?}9Jxg^ADUF!>@XM3WqSa>+9!24!;x4HJ|lU8xs++7B+T ziAZ^FT`L^EV$k{R*!jzH_RH#JyQAC-(T8Qj{qy%C7c9TCL5Qwi_49KoBtN~+p&kIy zE9m{^(tV4T-yvh|i@43QJZyhDW}yUS{v=w!OCqfzY@zV5gvH&Nt;}PuV6#?_zrVW1 zKmE=)=otNQ-K+JT2*U#1!<6pfEVA$EBQ_81``lxCnfR*~#p&QA)Jw33dPUCUXsN}< 
zhDM_N-u`{MwP@dcmwrrddEb=CA^Cez%cYopjwm+!>kF`8ZSLJD)+IiT|KWmq|L3G@XC{~p#yr~4{1b84d#@`XH_Vcc|F!!-4xj== zSMABb?z^>%4wt`$JNGkeer-Y7bS)wJOuT)fTzhq|wBD&=vlE`)>&|2l>+q?p3)_I* zlfe$zHhcNOQaOc$}x*wn)Z$CNQQ?Bae4s!!2!o zHfE~VJ6sliiWjp7T+Hq!HuB>6cR8Yqo1RfnivstBEKO0&0(K#}3ESs)g{H?RtHLHq zMYQSpOzJ9il=pcZ&t?Pi-WIXxfgyv<*3>Mo#eB`+jD-M=6K69XVyhG4Z!v~e9-L!x z=zHV3vxLmcdlr^`D~mD8+Nq6GtuHEXC|i5l*LoF=tWhYM9gO(LV;JAwA9y=`O3g4SvucB^Y#gc_Wu;-^)81F-=rc%MPQiG_Pre3Ul()C> zRLC(%4y8p-foJvh{`OLao_VPn<7Eruf&Bi4y!@d;&C4x2#;^Smml?;3XSv)383U4H>`?rQNO8DMsdla#n#$V-l*%DoxGrp-{ot~Ll zXvJ6Y%}iFf&Sgt)>8*uS;b2oi{L2{<>khA5sA~^bS#NMrxgrDW=4c#U;p*`h!XyvG zHfKT*8CKz$?yM61bqo1xs-k17YYT5t@kITW6-*maK)sC(5O+fz_1I%H92P@EO9&A~ymRHPf+dwn~`W)n>wd8M^(hxrllEpu7_`NeIF>F(L3fSoDA zg!_dz2YpW8t(f?Nu_;?{Yh0X|z2pcXWxSl9p;wseT6Xw-Q*Mi*a~lgASG#%eq#8KX zR=X@+-Ok>3Vb+9BH2K3R&T^RYCA^+QGuAtGopElqMS9<7+Btn~5OMaoaMzxt`Pah8 zLe6kmzS)ui{sFzT9=SXmaNTXb#YNS`=GDG}5)mpW)M#UllvXBN92$^xMst@GUcVI5 z1O!BtfJl7$bqtb{hYZjAinHFV(V_Hk2JhH%?wDRgxDT%T%b%0&vG{v+u!1!aLu*!vshW@ubAYJS~^YV_36|n@V}- zBFrlbLxEDGhn@OZCYpFg3;3=@8RH%xpb?Y%wC9{W4YNFU|UNbDnDH^l@hkQET*>$5od$!rwx&NJ=(5XZiH6Z#(5u9eF}52zWT5BfHmTW&UaDTBEw8M zJnq;#pq&rd1eryjB$8&F4~*qB1>&}df7-EUcfbnYfYUi+`LUwGBL2I4bQ_VxG%_as zr`EFKX}$byH;%Rsd|RP24kY@>Enh7m!+)RC&o8+8y*HUt`W>CcNjgKvbkp%?5SZA2 zS6!%xo~+SM$-Y~YuT}gbY?Yd<;Y7d5x%W3vNTKwt@#EwJ0Zg_ogS#(tv~{tHpn