feat: add MiniMax Cloud TTS as alternative speech provider #61

Open
octo-patch wants to merge 1 commit into petermg:main from octo-patch:feature/add-minimax-cloud-tts

@octo-patch

Summary

Adds MiniMax Cloud TTS as a new tab in the Gradio UI, giving users a GPU-free alternative for text-to-speech synthesis via the MiniMax T2A V2 API.

What is included

  • minimax_tts.py: standalone MiniMax TTS client with three output modes:
    • synthesize(): raw MP3 bytes
    • synthesize_to_file(): WAV or MP3 file
    • synthesize_to_tensor(): PyTorch tensor (compatible with the Chatterbox pipeline)
  • Cloud TTS (MiniMax) Gradio tab: UI with API key input, text box, file upload, model/voice selection, speed control, and audio preview
  • 12 built-in voices: English_Graceful_Lady, Deep_Voice_Man, Friendly_Person, sweet_girl, cute_boy, and more
  • 2 model tiers: speech-2.8-hd (high quality) and speech-2.8-turbo (fast)
  • 25 unit tests + 5 integration tests covering API calls, error handling, file output, and tensor conversion
  • README: updated with Cloud TTS documentation, setup instructions, and programmatic usage examples
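The request/response handling behind synthesize() can be sketched roughly as follows. This is an illustration only, not the code in minimax_tts.py: the helper names build_payload and decode_audio are hypothetical, and the payload layout and the hex-encoded data.audio field are assumptions about the T2A V2 API that should be checked against the MiniMax documentation. Only the model and voice names come from this PR.

```python
from typing import Any, Dict

# Model names are taken from the PR description. Everything else here
# (field names, hex-encoded audio) is an assumption about the T2A V2 API.
MODELS = ("speech-2.8-hd", "speech-2.8-turbo")


def build_payload(text: str,
                  model: str = "speech-2.8-hd",
                  voice_id: str = "English_Graceful_Lady",
                  speed: float = 1.0) -> Dict[str, Any]:
    """Assemble a hypothetical non-streaming T2A V2 request body."""
    if not text.strip():
        raise ValueError("text must not be empty")
    if model not in MODELS:
        raise ValueError(f"unknown model: {model}")
    return {
        "model": model,
        "text": text,
        "stream": False,
        "voice_setting": {"voice_id": voice_id, "speed": speed},
        "audio_setting": {"format": "mp3"},
    }


def decode_audio(response: Dict[str, Any]) -> bytes:
    """Extract raw MP3 bytes from a parsed JSON response (assumed shape)."""
    audio_hex = (response.get("data") or {}).get("audio")
    if not audio_hex:
        raise RuntimeError("response contained no audio")
    return bytes.fromhex(audio_hex)
```

Keeping payload construction and response decoding as pure functions is one way to make them unit-testable without network access; the file and tensor output modes would then build on the decoded bytes.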

Why

Not everyone has a GPU powerful enough for local Chatterbox synthesis. MiniMax Cloud TTS provides a high-quality cloud alternative that requires only an API key; no VRAM is needed.

Files changed (6 files, ~784 additions)

| File | Change |
| --- | --- |
| minimax_tts.py | New: MiniMax TTS client module |
| Chatter.py | Add Cloud TTS tab + helper functions |
| requirements.txt | Add requests dependency |
| README.md | Add Cloud TTS docs + TOC entry |
| tests/test_minimax_tts.py | 25 unit tests |
| tests/test_minimax_tts_integration.py | 5 integration tests |

Test plan

  • 25 unit tests pass (pytest tests/test_minimax_tts.py)
  • 5 integration tests pass with real API key
  • Manual: launch python Chatter.py, open Cloud TTS tab, enter API key, generate speech
  • Manual: verify MP3 and WAV export both work
  • Manual: verify different voices and models produce different audio
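For readers unfamiliar with the unit/integration split above, here is a minimal sketch of how it might be structured. It does not reproduce the tests in this PR: the synthesize stand-in, its injected post argument, and the MINIMAX_API_KEY environment variable name are all hypothetical. The idea is that unit tests replace the HTTP call with a mock, while integration tests are skipped unless a real key is available.

```python
import os
import unittest
from unittest import mock


def synthesize(text, api_key, post):
    """Hypothetical stand-in for the client; `post` is an injected HTTP function."""
    if not api_key:
        raise ValueError("API key is required")
    body = post({"text": text}, api_key)
    # Assumes hex-encoded audio in data.audio, as in the sketch of the client.
    return bytes.fromhex(body["data"]["audio"])


class UnitTests(unittest.TestCase):
    def test_returns_decoded_bytes(self):
        # The mock replaces the network call, so no API key or quota is used.
        fake_post = mock.Mock(return_value={"data": {"audio": "494433"}})
        self.assertEqual(synthesize("hi", "key", fake_post), b"ID3")
        fake_post.assert_called_once_with({"text": "hi"}, "key")

    def test_missing_key_is_rejected(self):
        with self.assertRaises(ValueError):
            synthesize("hi", "", mock.Mock())


# The environment variable name is an assumption, not taken from the PR.
@unittest.skipUnless(os.environ.get("MINIMAX_API_KEY"), "needs a real API key")
class IntegrationTests(unittest.TestCase):
    def test_live_call(self):
        self.skipTest("wire in the real client here")
```

pytest collects unittest-style classes, so a layout like this would still run under the pytest command listed in the test plan.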

Add a Cloud TTS tab in the Gradio UI powered by MiniMax's T2A V2 API,
giving users a GPU-free alternative for speech synthesis with 12 voices
and two quality tiers (speech-2.8-hd / speech-2.8-turbo).

New files:
- minimax_tts.py: standalone MiniMax TTS client (synthesize, to_file, to_tensor)
- tests/test_minimax_tts.py: 25 unit tests
- tests/test_minimax_tts_integration.py: 5 integration tests