Status: Current. Last updated: 2026-04-08 07:40 EDT.
This page documents the current public batchalign3 CLI surface. For anything
you are scripting against, confirm with `batchalign3 <command> --help`.
For detailed input/output patterns and mutation behavior per command, see Command I/O Parity.

```bash
batchalign3 [GLOBAL OPTIONS] COMMAND [COMMAND OPTIONS] [PATHS...]
```

Global options go before the command name.

| Option | Meaning |
|---|---|
| `-v`, `-vv`, `-vvv` | Increase verbosity |
| `--workers N` | Maximum concurrent files per job (default: auto-tune based on RAM and CPU; capped at 8 for GPU commands) |
| `--force-cpu` | Disable MPS/CUDA and force CPU-only models |
| `--server URL` | Explicit server URL for server-backed dispatch |
| `--override-media-cache` | Bypass the media analysis cache (audio tasks only; text tasks skip cache by default) |
| `--text-cache` | Enable caching for text NLP tasks (off by default; useful for incremental editing) |
| `--batch-window N` | Files per batch window for text NLP commands (default: 25, 0 = all-in-one) |
| `--tui` / `--no-tui` | Toggle full-screen TUI for server-backed jobs (DirectHost local runs stay on terminal progress bars) |
| `--open-dashboard` / `--no-open-dashboard` | Toggle browser auto-open for submitted server job pages (macOS only, interactive TTY only) |
| `--engine-overrides JSON` | Select built-in alternative engines with a flat `{string:string}` JSON object; invalid JSON is rejected |
| `--sequential` | Process files one at a time with a single worker. No memory gate, no server. Ideal for small jobs on laptops |
| `--no-server` | Skip auto-detection of a local server; force direct in-process execution |

BA2 compatibility flags (--memlog, --mem-guard, --adaptive-workers,
--pool, --shared-models, etc.) have been removed. If your scripts use them,
remove them.
--sequential gives you the simplest possible execution path — similar to
batchalign2's direct mode. One worker per task type, files processed one at a
time, no concurrency infrastructure:

```bash
batchalign3 morphotag corpus/ -o output/ --sequential
```

What it does:
- Forces `--workers 1` and `--no-server`
- Disables the memory gate (no cross-process coordination)
- Keeps the worker alive for the entire run (no idle timeout kills)
- Preserves the utterance cache (repeated runs benefit from cached results)
When to use it:
- Processing a handful of files on a laptop
- Debugging pipeline issues (predictable, single-threaded execution)
- Environments where memory auto-tuning is unwanted
When NOT to use it:
- Large corpus runs (50+ files) — the default parallel mode is 3-5× faster
- Fleet machines with warm workers — use the server instead
`--sequential` cannot be combined with `--server`; the two flags are mutually exclusive.
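
The guidance above can be turned into a small wrapper that chooses the mode from the size of the input directory. This is only a sketch: `mode_flag_for` is a hypothetical helper, not part of batchalign3, and the cutoff of 50 files is an assumption taken from the "50+ files" guidance rather than a documented threshold. It counts every regular file under the directory, including files the command would not process.

```bash
# Hypothetical helper: print "--sequential" for small inputs, nothing otherwise.
# The cutoff of 50 is an assumption based on the "50+ files" guidance.
mode_flag_for() {
    dir=$1
    # Count regular files recursively; tr strips the padding some wc builds add.
    count=$(find "$dir" -type f | wc -l | tr -d ' ')
    if [ "$count" -lt 50 ]; then
        echo "--sequential"
    fi
}
```

A call such as `batchalign3 morphotag corpus/ -o output/ $(mode_flag_for corpus/)` leaves the substitution unquoted on purpose, so that an empty result adds no argument and the default parallel mode applies.
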
On macOS, when you run a processing command interactively (e.g.,
batchalign3 transcribe corpus/ output/), the CLI automatically opens the
job's dashboard page in your default browser. This lets you monitor progress
in real time.
Direct local execution does not submit an HTTP job, so there is no dashboard
page to open. In that mode, --open-dashboard is a no-op and the CLI shows
local terminal progress inline instead.
The dashboard auto-open is only triggered when:
- Running on macOS (no-op on Linux/Windows)
- stderr is connected to an interactive terminal (TTY)
- `--no-open-dashboard` was not passed
- The `BATCHALIGN_NO_BROWSER` environment variable is not set
It will not fire in non-interactive contexts: cron jobs, CI pipelines,
SSH sessions without a display, piped output, or scripts. To suppress it
explicitly in interactive sessions, pass --no-open-dashboard.
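
For scripts that need to predict this behavior, the four listed conditions can be approximated in shell. This is a sketch, not the CLI's own logic: `dashboard_would_open` is a hypothetical name, it treats any value of `BATCHALIGN_NO_BROWSER` (even an empty one) as "set", and it does not check whether the job is server-backed.

```bash
# Hypothetical check mirroring the four documented conditions.
# Pass the arguments you intend to give batchalign3.
dashboard_would_open() {
    [ "$(uname -s)" = "Darwin" ] || return 1           # macOS only
    [ -t 2 ] || return 1                                # stderr must be a TTY
    # Assumption: any value, including an empty one, counts as "set".
    [ -z "${BATCHALIGN_NO_BROWSER+x}" ] || return 1
    for arg in "$@"; do
        [ "$arg" = "--no-open-dashboard" ] && return 1  # explicit opt-out
    done
    return 0
}
```

In a cron job or CI pipeline the TTY test fails, which matches the non-interactive cases described above.
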
The core processing commands documented below all accept:

| Option | Meaning |
|---|---|
| `PATHS...` | Input files or directories |
| `-o`, `--output DIR` | Output directory |
| `--file-list FILE` | Read input paths from a text file (see below) |
| `--in-place` | Modify inputs in place |

When exactly two positional paths are provided, the CLI still accepts the
legacy input/output directory form. For new scripts, prefer -o/--output.
--file-list FILE reads input paths from a plain-text file, one path per
line. Blank lines and lines beginning with # are ignored. All paths must
exist at the time the command runs; a missing path is a hard error.

```text
# My align re-run list
/data/aphasia/Cantonese/Protocol/HKU/A023.cha
/data/aphasia/Cantonese/Protocol/HKU/A024.cha
# these two need re-running too
/data/ca/CallHome/English/4092.cha
/data/ca/CallHome/English/4093.cha
```

```bash
# Run align on every file in the list (in-place, using net's server)
batchalign3 --server http://net:8001 align --file-list my-list.txt
# Split a large list into batches of 10 for long re-runs
bash scripts/align_batch_run.sh -n 10 -s http://net:8001 my-list.txt
```

`--file-list` is mutually exclusive with positional `PATHS` arguments. It
does not accept a separate `-o`/`--output` directory: each path in the list
is processed in-place (output overwrites input).
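
Because a missing path is a hard error and the run overwrites its inputs, it can help to check a list before handing it to the CLI. The sketch below applies the stated rules (skip blank lines and lines beginning with `#`, require every other path to exist). `check_file_list` is a hypothetical helper, and it treats only completely empty lines as blank, which may differ from the CLI's own parsing.

```bash
# Hypothetical pre-flight check for a --file-list file.
# Prints each missing path to stderr; returns non-zero if any path is missing.
check_file_list() {
    missing=0
    # "|| [ -n "$line" ]" keeps a final line that lacks a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            ''|'#'*) continue ;;   # blank lines and comments are ignored
        esac
        if [ ! -e "$line" ]; then
            printf 'missing: %s\n' "$line" >&2
            missing=1
        fi
    done < "$1"
    return "$missing"
}
```

One possible use is `check_file_list my-list.txt && batchalign3 --server http://net:8001 align --file-list my-list.txt`, so the in-place run starts only when every listed path exists.
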
For batched text-NLP commands (morphotag, utseg, translate, coref),
large --file-list runs may not show file-by-file on-disk rewrites while the
invocation is still running. The command can batch/stage work internally and
then commit the in-place writes when the current invocation finishes. If you
need visible write-through during a long rerun, split the list into smaller
chunks and run those chunks sequentially.
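
The chunking advice above could be sketched as follows. `run_in_chunks` is a hypothetical helper (it is not the `scripts/align_batch_run.sh` script mentioned earlier, whose internals are not shown here). It strips comments and blank lines, splits the remaining paths into fixed-size chunks with `split -l`, and runs the given command once per chunk with `--file-list` appended.

```bash
# Hypothetical helper: run a command once per chunk of a file list.
# Usage: run_in_chunks LIST CHUNK_SIZE COMMAND [ARGS...]
run_in_chunks() {
    list=$1
    size=$2
    shift 2
    tmpdir=$(mktemp -d) || return 1
    # Drop blank lines and comments so every chunk holds only real paths.
    grep -v -e '^[[:space:]]*$' -e '^#' "$list" | split -l "$size" - "$tmpdir/chunk."
    for chunk in "$tmpdir"/chunk.*; do
        [ -e "$chunk" ] || continue   # empty list: the glob matched nothing
        # Stop at the first failing chunk so later files stay untouched.
        "$@" --file-list "$chunk" || { status=$?; rm -rf "$tmpdir"; return "$status"; }
    done
    rm -rf "$tmpdir"
}
```

For example, `run_in_chunks my-list.txt 10 batchalign3 --server http://net:8001 morphotag` would issue one invocation per ten paths, so each chunk's in-place writes should land before the next chunk starts.
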
Each processing command has a dedicated page with full options, a pipeline diagram, examples, and gotchas. Click the command name for complete documentation.
| Command | What it does |
|---|---|
| align | Add word-level and utterance-level timestamps via forced alignment |
| morphotag | Add %mor POS/lemma and %gra dependency tiers |
| utseg | Re-segment utterance boundaries using Stanza constituency parsing |
| translate | Add %xtra English translation tiers |
| coref | Add sparse %xcoref coreference annotation tiers (English only) |
| compare | Compare against gold .cha references; write %xsrep/%xsmor + .compare.csv |

| Command | What it does |
|---|---|
| transcribe | Create .cha transcripts from audio via ASR |
| benchmark | Transcribe and evaluate WER against gold .cha references |
| opensmile | Extract acoustic features → .opensmile.csv (positional I/O) |
| avqi | Calculate Acoustic Voice Quality Index from paired .cs/.sv audio (positional I/O) |
Initialize ~/.batchalign.ini:

```bash
batchalign3 setup
batchalign3 setup --non-interactive --engine whisper
batchalign3 setup --non-interactive --engine rev --rev-key <KEY>
```

Options:

| Option | Meaning |
|---|---|
| `--engine {rev,whisper}` | Persist default ASR engine |
| `--rev-key KEY` | Rev.AI key for non-interactive setup |
| `--non-interactive` | Disable prompts |

```bash
batchalign3 logs
batchalign3 logs --last
batchalign3 logs --export
batchalign3 logs --clear
```

Key options:

| Option | Meaning |
|---|---|
| `--last` | Show the most recent run log |
| `--raw` | Raw JSONL output with `--last` |
| `--export` | Zip recent logs |
| `--clear` | Delete log files |
| `--follow` | Tail the newest log file |
| `-n`, `--count N` | Number of recent runs to list |

```bash
batchalign3 serve start --foreground
batchalign3 serve status
batchalign3 serve stop
```

`serve start` key options:

| Option | Meaning |
|---|---|
| `--port PORT` | Listen port |
| `--host HOST` | Bind address |
| `--config PATH` | Alternate `server.yaml` path |
| `--python PATH` | Worker Python executable |
| `--foreground` | Do not daemonize |
| `--test-echo` | Start test-echo workers |
| `--warmup-policy {off,minimal,full}` | Warmup preset |
| `--worker-idle-timeout-s N` | Idle worker shutdown timeout |

```bash
batchalign3 jobs --server http://myserver:8000
batchalign3 jobs --server http://myserver:8000 <JOB_ID>
batchalign3 jobs <JOB_ID>
batchalign3 jobs --json <JOB_ID>
```

With `--server`, lists or inspects remote jobs. Without `--server`,
inspects the local job artifact directory for post-failure debugging.
Pass `--json` for machine-readable output.

```bash
batchalign3 cache stats
batchalign3 cache clear --yes
batchalign3 cache clear --all --yes
```

`BATCHALIGN_ANALYSIS_CACHE_DIR` and `BATCHALIGN_MEDIA_CACHE_DIR` relocate
the underlying caches for isolated runs. BA2-compatible flag forms
`cache --stats` and `cache --clear` are still accepted.
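
One way to use the two cache variables for an isolated run is to point both at a throwaway directory for the duration of a single command. The wrapper below is a sketch under that assumption. `with_isolated_caches` is a hypothetical name, and creating the directories in advance is a precaution rather than a documented requirement.

```bash
# Hypothetical wrapper: run any command with both caches redirected to a
# temporary directory, then delete that directory and keep the exit status.
with_isolated_caches() {
    cache_root=$(mktemp -d) || return 1
    mkdir -p "$cache_root/analysis" "$cache_root/media"
    BATCHALIGN_ANALYSIS_CACHE_DIR="$cache_root/analysis" \
    BATCHALIGN_MEDIA_CACHE_DIR="$cache_root/media" \
        "$@"
    status=$?
    rm -rf "$cache_root"
    return "$status"
}
```

For example, `with_isolated_caches batchalign3 cache stats` would report on the empty temporary caches instead of the default ones.
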

```bash
batchalign3 openapi -o openapi.json
batchalign3 openapi --check --output openapi.json
```

`--check` exits non-zero when the target file does not match the generated
schema.

Forwards arguments to the Python model training runtime
(python -m batchalign.models.training.run). See
Models Training Runtime ADR.

```bash
batchalign3 version
```

Prints version and build information.

batchalign3 uses stable non-zero exit code categories:

| Code | Meaning |
|---|---|
| `2` | Usage/input error |
| `3` | Configuration error |
| `4` | Network/connectivity error |
| `5` | Server/job lifecycle error |
| `6` | Local runtime error |

Exit code 1 is reserved for unexpected failures outside the typed categories.
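
A wrapper script can branch on these categories. The helpers below are a minimal sketch: `describe_exit` and `is_retryable` are hypothetical names. Treating `0` as success follows the usual shell convention rather than anything the table states, and treating only code `4` as transient is an assumption, not documented retry guidance.

```bash
# Map a batchalign3 exit status to the category documented above.
describe_exit() {
    case $1 in
        0) echo "success" ;;                      # shell convention, not in the table
        1) echo "unexpected failure" ;;
        2) echo "usage/input error" ;;
        3) echo "configuration error" ;;
        4) echo "network/connectivity error" ;;
        5) echo "server/job lifecycle error" ;;
        6) echo "local runtime error" ;;
        *) echo "undocumented exit code $1" ;;
    esac
}

# Hypothetical retry policy: only network errors (4) are treated as transient.
is_retryable() {
    [ "$1" -eq 4 ]
}
```

After a command finishes, a script could capture `$?` into a variable, log `describe_exit` for it, and re-run only when `is_retryable` succeeds.
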