Add cli by mhrice · Pull Request #31 · Stability-AI/stable-audio-3

mhrice · 2026-05-20T06:16:04Z

No description provided.

Copilot

Pull request overview

This PR adds a stable-audio command-line interface entrypoint for running Stable Audio 3 generation (text-to-audio, audio-to-audio, inpainting/continuation) without writing Python, along with unit tests and documentation.

Changes:

Introduces stable_audio_3/cli.py implementing argparse-based generation and WAV output saving.
Adds tests/test_cli.py covering CLI flag wiring via mocked model and torchaudio I/O.
Updates README.md with CLI usage examples and adds the stable-audio console script in pyproject.toml.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File	Description
`stable_audio_3/cli.py`	New CLI implementation: argument parsing, model loading, generation invocation, and output saving.
`tests/test_cli.py`	New unit tests validating CLI flag wiring with mocked dependencies.
`README.md`	Documents CLI usage and adds a note about `uv sync` behavior.
`pyproject.toml`	Adds a `stable-audio` console script entrypoint.

Comments suppressed due to low confidence (2)

stable_audio_3/cli.py:225

torchaudio.load() returns (waveform, sample_rate), but StableAudioModel.generate() expects audio inputs as (sample_rate, tensor) tuples (see StableAudioModel._encode_audio_input unpacking). Passing the raw torchaudio.load() result here will swap sample rate and waveform and break audio-to-audio/inpainting. Load as (waveform, sr) and then pass (sr, waveform) for both init_audio and inpaint_audio.

    init_audio = None
    if args.init_audio:
        init_audio = torchaudio.load(args.init_audio)

    inpaint_audio = None
    if args.inpaint_audio:
        inpaint_audio = torchaudio.load(args.inpaint_audio)
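
One way to apply the suggested reordering is a small helper that unpacks the loader result and returns it in the order the comment says generate() expects. This is a minimal sketch: the helper name `load_for_generate` is hypothetical, and the loader is passed in as a parameter so the example runs without torchaudio installed.

```python
from typing import Any, Callable, Optional, Tuple


def load_for_generate(
    path: Optional[str],
    load_fn: Callable[[str], Tuple[Any, int]],
) -> Optional[Tuple[int, Any]]:
    """Load audio and reorder it to (sample_rate, waveform).

    load_fn is expected to behave like torchaudio.load, which returns
    (waveform, sample_rate). The helper name is hypothetical.
    """
    if not path:
        return None
    waveform, sample_rate = load_fn(path)
    # Swap into the (sample_rate, tensor) order described in the review comment.
    return (sample_rate, waveform)
```

In the CLI this would presumably be called as `load_for_generate(args.init_audio, torchaudio.load)`, and likewise for `args.inpaint_audio`.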

stable_audio_3/cli.py:195

When --prompt/--duration/--negative-prompt are passed with multiple values, the CLI passes lists into model.generate(). The model validates list lengths with assert len(value) == batch_size (in StableAudioModel._build_conditioning_dicts), which raises an AssertionError on a mismatch. Under python -O, assert statements are stripped, so the mismatch would pass through unchecked and become a silent logic bug. Add explicit CLI-side validation (or auto-infer batch_size from the list lengths when the user didn’t set it) and fail with parser.error(...) on mismatches.

    # --- Build scalar / list args ---
    prompt = args.prompt[0] if len(args.prompt) == 1 else args.prompt
    negative_prompt = None
    if args.negative_prompt:
        negative_prompt = (
            args.negative_prompt[0]
            if len(args.negative_prompt) == 1
            else args.negative_prompt
        )
    duration = args.duration[0] if len(args.duration) == 1 else args.duration
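
The validation described above could look roughly like the sketch below. The function name `resolve_batch_size`, the fallback batch size of 1, and the use of ValueError are all assumptions for illustration; in the CLI the error message would be handed to parser.error(...) instead.

```python
from typing import Dict, Optional, Sequence


def resolve_batch_size(
    batch_size: Optional[int],
    list_flags: Dict[str, Optional[Sequence[object]]],
) -> int:
    """Check multi-value flags against batch_size, or infer it when unset."""
    # Only flags given more than one value constrain the batch size;
    # a single value is treated as a scalar, as in the CLI code above.
    lengths = {
        name: len(values)
        for name, values in list_flags.items()
        if values is not None and len(values) > 1
    }
    if batch_size is None:
        distinct = set(lengths.values())
        if len(distinct) > 1:
            raise ValueError(f"Conflicting value counts: {lengths}")
        # Assumption: fall back to a batch of 1 when no flag is a list.
        return distinct.pop() if distinct else 1
    for name, count in lengths.items():
        if count != batch_size:
            raise ValueError(
                f"Got {count} {name} values but batch size is {batch_size}"
            )
    return batch_size
```

A caller would wrap this in `try/except ValueError as exc: parser.error(str(exc))` so users see a normal CLI error rather than a traceback.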


+SAMPLE_RATE = 44100
+CHANNELS = 2
+FAKE_AUDIO_PATH = "some/audio.wav"
+_FAKE_LOAD_RESULT = (SAMPLE_RATE, torch.zeros(CHANNELS, SAMPLE_RATE * 5))
+


+    if args.inpaint_starts and len(args.inpaint_starts) != len(args.inpaint_ends):
+        parser.error(
+            "--inpaint-start and --inpaint-end must be specified the same number of times"
+        )


Copilot

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

stable_audio_3/cli.py:199

Same issue as prompts: when --duration is provided multiple times, the CLI passes a list through to StableAudioModel.generate, which asserts that list length == batch_size. Please validate the length against --batch-size (or require batch_size to match) and fail via parser.error so users get a clear CLI message instead of an AssertionError.

    duration = args.duration[0] if len(args.duration) == 1 else args.duration

+    prompt = args.prompt[0] if len(args.prompt) == 1 else args.prompt
+    negative_prompt = None
+    if args.negative_prompt:
+        negative_prompt = (
+            args.negative_prompt[0]


mhrice · 2026-05-20T06:51:14Z

+    if args.loras:
+        print(f"Loading LoRA(s): {args.loras}")
+        model.load_lora(args.loras)
+    if args.lora_strength is not None:
+        model.set_lora_strength(args.lora_strength, lora_index=args.lora_index)


This is fine

+    warnings.filterwarnings("ignore")
+


Copilot

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (2)

stable_audio_3/cli.py:188

Inpaint region times are not validated for non-negativity or ordering. Negative seconds can produce negative sample indices (masking the end of the clip due to Python negative indexing), and end <= start becomes a silent no-op region. Validate each start/end pair after parsing (>=0 and end > start) and parser.error on invalid regions.

    # --- Validate inpaint args ---
    if (args.inpaint_starts is None) != (args.inpaint_ends is None):
        parser.error("--inpaint-start and --inpaint-end must both be provided together")
    if args.inpaint_starts and len(args.inpaint_starts) != len(args.inpaint_ends):
        parser.error(
            "--inpaint-start and --inpaint-end must be specified the same number of times"
        )
    if args.inpaint_starts and not args.inpaint_audio:
        parser.error("--inpaint-start/--inpaint-end require --inpaint-audio")
    if args.inpaint_audio and not args.inpaint_starts:
        parser.error("--inpaint-audio requires --inpaint-start and --inpaint-end")
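
The missing range and ordering checks could be sketched as below. The function name `inpaint_region_errors` and the message wording are hypothetical; the rules (times >= 0, end > start) are the ones the comment asks for.

```python
from typing import List, Sequence


def inpaint_region_errors(
    starts: Sequence[float], ends: Sequence[float]
) -> List[str]:
    """Return one message per invalid (start, end) pair, in seconds."""
    errors = []
    for index, (start, end) in enumerate(zip(starts, ends)):
        if start < 0 or end < 0:
            # Negative seconds would turn into negative sample indices.
            errors.append(
                f"inpaint region {index}: times must be >= 0 (got {start}, {end})"
            )
        elif end <= start:
            # An empty or reversed region would silently mask nothing.
            errors.append(
                f"inpaint region {index}: end ({end}) must be greater than start ({start})"
            )
    return errors
```

After the existing pairing checks, the CLI could call this with `args.inpaint_starts` and `args.inpaint_ends` and pass the first message (or all of them joined) to parser.error.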

stable_audio_3/cli.py:214

--duration accepts zero/negative values. In StableAudioModel.generate, non-positive durations cause sample-size adaptation to fall back to the default sample_size, which can generate unexpectedly long audio instead of failing fast. Add a check that all provided durations are > 0 and raise parser.error otherwise.

    # --- Validate list-flag lengths against batch size ---
    if (
        args.negative_prompt
        and len(args.negative_prompt) > 1
        and len(args.negative_prompt) != batch_size
    ):
        parser.error(
            f"Got {len(args.negative_prompt)} --negative-prompt values but batch size is {batch_size}"
        )
    if len(args.duration) > 1 and len(args.duration) != batch_size:
        parser.error(
            f"Got {len(args.duration)} --duration values but batch size is {batch_size}"
        )
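
A fail-fast check for non-positive durations might look like this sketch. The function name `duration_error` and the message text are hypothetical; it returns a message rather than raising so the caller can route it through parser.error.

```python
from typing import Optional, Sequence


def duration_error(durations: Sequence[float]) -> Optional[str]:
    """Return an error message if any duration is zero or negative."""
    bad = [d for d in durations if d <= 0]
    if bad:
        return f"--duration values must be > 0 (got {bad})"
    return None
```

In the CLI this would sit next to the length checks above, for example `msg = duration_error(args.duration)` followed by `parser.error(msg)` when `msg` is not None.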

+    parser.add_argument(
+        "--init-noise-level",
+        type=float,
+        default=0.9,
+        help="Noise level for audio-to-audio (0.0–1.0, default: 0.9)",
+    )
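
The help text advertises a 0.0–1.0 range, but type=float alone accepts any number. If enforcing the range at parse time is wanted, an argparse type callable is one option. This is a sketch only: the name `unit_interval` is hypothetical, and treating both bounds as inclusive is an assumption.

```python
import argparse


def unit_interval(text: str) -> float:
    """argparse type: parse a float and require 0.0 <= value <= 1.0."""
    try:
        value = float(text)
    except ValueError:
        raise argparse.ArgumentTypeError(f"expected a number, got {text!r}")
    # NaN fails this comparison too, so it is rejected as well.
    if not 0.0 <= value <= 1.0:
        raise argparse.ArgumentTypeError(
            f"must be between 0.0 and 1.0, got {value}"
        )
    return value


parser = argparse.ArgumentParser(prog="stable-audio")
parser.add_argument(
    "--init-noise-level",
    type=unit_interval,
    default=0.9,
    help="Noise level for audio-to-audio (0.0–1.0, default: 0.9)",
)
```

With this in place, argparse itself reports an out-of-range value as a usage error and exits, so no separate post-parse check is needed for this flag.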


mhrice added 3 commits May 19, 2026 22:58

Add cli (e389bd0)
Set duration to 30 (c26a040)
Format (7583d67)

mhrice requested a review from Copilot May 20, 2026 06:17

Copilot started reviewing on behalf of mhrice May 20, 2026 06:18

Copilot AI reviewed May 20, 2026

Fix inpainting (fadd197)

mhrice requested a review from Copilot May 20, 2026 06:35

Copilot started reviewing on behalf of mhrice May 20, 2026 06:36

Copilot AI reviewed May 20, 2026

Add batch size check for batched inputs (21641db)

mhrice requested a review from Copilot May 20, 2026 06:51

Copilot started reviewing on behalf of mhrice May 20, 2026 06:51

Copilot AI reviewed May 20, 2026

Comment thread stable_audio_3/cli.py

Comment on lines +108 to +113

    parser.add_argument(
        "--init-noise-level",
        type=float,
        default=0.9,
        help="Noise level for audio-to-audio (0.0–1.0, default: 0.9)",
    )

mhrice merged commit bbcf738 into main May 20, 2026
5 checks passed

mhrice deleted the add-cli branch May 20, 2026 06:59

mhrice merged 5 commits into main from add-cli
