fix(transcribe): auto-fallback to CPU + int8 when CUDA is unavailable by fadenb · Pull Request #19 · pretyflaco/millet

fadenb · 2026-05-07T11:20:10Z

Summary

TranscriptionConfig.__post_init__ no longer raises ValueError when device='cuda' (or torch_device='cuda') is requested but CUDA is not available. Instead, it automatically falls back to 'cpu' and downgrades compute_type from 'float16' to 'int8' (float16 is unsupported on CPU).
The model-loading log now indicates whether CPU was explicitly requested (forced), automatically chosen because no GPU was found (fallback - no GPU), or chosen because torch is missing (no torch).
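
For readers skimming the thread, the fallback summarized above might look roughly like the sketch below. It is not the code from this PR: the field names and the helper name `_torch_device_available` appear in the discussion, but the helper's signature, its return values, and the warning text are assumptions made for illustration.

```python
import importlib.util
import warnings
from dataclasses import dataclass
from typing import Optional


def _torch_device_available(device: str) -> Optional[bool]:
    """Return True/False when torch can answer, or None when it cannot tell.

    The name comes from the review thread; this body and signature are assumed.
    """
    if device == "cpu":
        return True
    if importlib.util.find_spec("torch") is None:
        return None  # torch is not installed, so leave the config untouched
    import torch

    if device == "cuda":
        return torch.cuda.is_available()
    return None  # other backends are not probed in this sketch


@dataclass
class TranscriptionConfig:
    device: str = "cuda"
    compute_type: str = "float16"

    def __post_init__(self) -> None:
        # Only an explicit "not available" answer triggers the fallback.
        if _torch_device_available(self.device) is False:
            warnings.warn(
                f"device {self.device!r} is not available; falling back to 'cpu'"
            )
            self.device = "cpu"
            if self.compute_type == "float16":
                # float16 is unsupported on CPU, so downgrade with the device.
                self.compute_type = "int8"
```

With this shape, `TranscriptionConfig()` on a machine without CUDA ends up as device 'cpu' with compute_type 'int8', while a machine with CUDA keeps the defaults.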

Motivation
Running meet run on a machine without a GPU (laptop, container without passthrough, CI runner) currently crashes with an unhelpful ValueError. The user must know to pass --device cpu --compute-type int8 manually. This change makes it "just work" - the common case shouldn't require flags.

Test plan
I can currently only run one of these, as I lack a device with a CUDA-capable GPU.

Run meet run on a machine without a GPU - should see warning, then transcribe successfully with int8 on CPU
Run meet run --device cpu on a machine with a GPU - should see (forced) in the log and use CPU as requested
Run meet run on a machine with a GPU - should use CUDA with float16 as before (no behavioural change)

Instead of raising ValueError when the requested CUDA device is not present, automatically fall back to CPU and downgrade compute_type from float16 to int8 (float16 is unsupported on CPU). Also indicate whether CPU is forced or a fallback in the model-loading print message.

pretyflaco

Direction is right — the current ValueError is genuinely user-hostile on no-GPU machines. The PR is well-scoped and the motivation is clear. Approving.

A few small follow-ups I'd want before this hits a release:

Unit test for the __post_init__ fallback. Mocking _torch_device_available lets us cover all three of your test scenarios in CI without the hardware mix you flagged. ~15 lines in tests/test_transcribe.py.
Warning log should mention the compute_type change. Right now a user who passed --compute-type float16 explicitly sees only the device fallback warning, while compute_type silently flips to int8. One-line append.
(Out of scope, just noting:) meet check on a CUDA-less machine should keep working — your change preserves the if available is None: continue branch so this should be fine, I'll smoke-test it on my side.

You're blocked on hardware for two of your three test scenarios anyway, so happy to land this as-is and push a follow-up commit with the unit test + warning tweak — or if you'd rather do it yourself for the learning, take a few days and add them here. Either works for me, just let me know which you prefer.

Either way, thanks for the clean PR — the diagnosis in the description was excellent.

pretyflaco · 2026-05-14T14:10:07Z

Superseded by #21 — picked up the follow-ups from review (compute_type warning, accurate fallback log via internal flag, unit tests, CHANGELOG). Your commit is preserved with full attribution via cherry-pick. Thanks again @fadenb for the clean diagnosis and patch!

@fadenb

Follow-up to pretyflaco#19 (cherry-picked) addressing review feedback:

- __post_init__ now emits a second warning when compute_type is flipped from float16 to int8 because the device fell back to CPU. Previously the user only saw the device fallback message; the compute_type change was silent.
- TranscriptionConfig gains an internal _device_auto_fallback flag set when device is auto-flipped to cpu. _load_whisperx_asr_model reads the flag instead of re-sniffing torch at print time, so the "(forced)" vs "(fallback — no GPU)" annotation is accurate even when the user explicitly passes --device cpu on a no-GPU machine.
- Removed dead conditional `fallback = "cpu" if value == "cuda" else "cpu"`.
- tests/test_transcribe.py: rewrote the two raise-expecting tests (test_invalid_torch_device_{cuda,mps}_raises) to assert the new fallback behavior, and added three tests covering the compute_type warning, the no-spurious-warning case when compute_type is already int8, and that explicit --device cpu does not set _device_auto_fallback.
- CHANGELOG: v0.7.1 entry crediting @fadenb.
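
To make the flag-based annotation concrete, here is a minimal sketch of the idea. It is not the code from #21. The `_device_auto_fallback` name is taken from the commit message, while `device_annotation`, the warning texts, and the hard-coded probe that simulates a machine without a GPU are assumptions for illustration.

```python
import warnings
from dataclasses import dataclass, field


def _torch_device_available(device: str) -> bool:
    """Stand-in probe that simulates a machine with no GPU (demo assumption)."""
    return device == "cpu"


@dataclass
class TranscriptionConfig:
    device: str = "cuda"
    compute_type: str = "float16"
    # Internal flag: set only when __post_init__ changed the device itself.
    _device_auto_fallback: bool = field(default=False, init=False, repr=False)

    def __post_init__(self) -> None:
        if not _torch_device_available(self.device):
            warnings.warn(f"{self.device} is not available; falling back to cpu")
            self.device = "cpu"
            self._device_auto_fallback = True
            if self.compute_type == "float16":
                # Second warning, so the compute_type change is not silent.
                warnings.warn("float16 is unsupported on cpu; using int8")
                self.compute_type = "int8"


def device_annotation(config: TranscriptionConfig) -> str:
    """Hypothetical helper: label the device from the flag, without probing again."""
    if config.device != "cpu":
        return config.device
    suffix = "fallback - no GPU" if config._device_auto_fallback else "forced"
    return f"cpu ({suffix})"
```

Because the label is derived from the flag rather than from a fresh probe, a config created with an explicit device='cpu' is reported as forced even on a machine with no GPU, which is the case the commit message calls out.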

pretyflaco approved these changes May 7, 2026

pretyflaco mentioned this pull request May 14, 2026

fix(transcribe): auto-fallback to CPU + int8 when CUDA is unavailable #21

Merged

3 tasks

pretyflaco closed this May 14, 2026

fadenb wants to merge 1 commit into pretyflaco:main from fadenb:main

2 participants
