| title | Arivara TTS |
|---|---|
| emoji | 🍿 |
| colorFrom | indigo |
| colorTo | blue |
| sdk | gradio |
| sdk_version | 5.29.0 |
| app_file | app.py |
| pinned | false |
| short_description | Expressive Zeroshot TTS |
Arivara Voice-TTS is a Text-to-Speech (TTS) application that generates speech from text and styles it after a reference audio clip. It uses the ArivaraTTS model to provide expressive, zero-shot voice cloning and synthesis through a Gradio interface.
- Expressive Synthesis: Generate natural-sounding speech from text (up to 3000 characters).
- Voice Styling (Zero-Shot): Upload a reference audio file to instantly capture a speaker's voice characteristics, prosody, and tone without any fine-tuning.
- Advanced Generation Controls:
  - Exaggeration: Control the speech expressiveness (0.25 to 2.0).
  - CFG / Pace Weight: Adjust the generation guidance and pacing of the speech.
  - Temperature: Control randomness and variance in generation.
  - Seed: Ensure reproducibility for specific voice outputs.
- GPU Acceleration: Automatically detects and uses a CUDA device for faster generation if available.
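The limits in the feature list can be checked before any audio is generated. The following is a minimal sketch, not the application's actual code: `GenerationSettings`, `validate`, and `pick_device` are hypothetical names, and the default values are placeholders. Only the 3000-character limit and the 0.25 to 2.0 exaggeration range come from the list above; the valid ranges for CFG / pace weight and temperature are not stated here, so they are left unchecked.

```python
from dataclasses import dataclass

MAX_TEXT_CHARS = 3000              # limit stated in the feature list
EXAGGERATION_RANGE = (0.25, 2.0)   # range stated in the feature list


@dataclass
class GenerationSettings:
    """Hypothetical container for the generation controls."""
    text: str
    exaggeration: float = 0.5   # placeholder default, not taken from the app
    cfg_weight: float = 0.5     # placeholder default; valid range not documented here
    temperature: float = 0.8    # placeholder default; valid range not documented here
    seed: int = 0               # placeholder default

    def validate(self) -> None:
        """Raise ValueError if one of the documented limits is violated."""
        if len(self.text) > MAX_TEXT_CHARS:
            raise ValueError(f"text exceeds {MAX_TEXT_CHARS} characters")
        low, high = EXAGGERATION_RANGE
        if not low <= self.exaggeration <= high:
            raise ValueError(f"exaggeration must be between {low} and {high}")


def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the sketch also runs without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"
```

Calling `GenerationSettings(text="Hello").validate()` returns silently, while an exaggeration of `3.0` or a text longer than 3000 characters raises `ValueError`.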
git clone https://github.com/hariV0078/TTS_Chatterbox.git
cd TTS_Chatterbox

To avoid conflicts with other global packages, it's highly recommended to use a virtual environment:
On Windows:
python -m venv venv
venv\Scripts\activate

On Unix or macOS:
python -m venv venv
source venv/bin/activate

Install the necessary dependencies, including PyTorch, Gradio, and Transformers:
pip install -r requirements.txt

(Note: Depending on your system and GPU, you may need to install a specific version of PyTorch with CUDA support from the official PyTorch website.)
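After installation, a quick check can confirm that the virtual environment is active and that the main dependencies are importable. This is a small sketch using only the standard library; `in_virtualenv` and `missing_packages` are hypothetical helper names, and the import names `torch`, `gradio`, and `transformers` are assumed to correspond to the packages mentioned above.

```python
import importlib.util
import sys


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv (prefix differs from base)."""
    return sys.prefix != sys.base_prefix


def missing_packages(names):
    """Return the import names from `names` that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Import names assumed for PyTorch, Gradio, and Transformers.
    required = ["torch", "gradio", "transformers"]
    print("virtual environment active:", in_virtualenv())
    print("missing packages:", missing_packages(required) or "none")
```

If any package is reported as missing, re-running the install step inside the activated environment is the first thing to try.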
Start the application by running the main Python script:
python app.py

This will initialize the ArivaraTTS model and start the Gradio local web server. You will see a local URL in your console (usually http://127.0.0.1:7860). Open that link in your browser to interact with the UI.
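Model initialization can take a while, so the page may not load immediately. The sketch below checks whether anything is accepting connections on the usual local address. `server_is_up` is a hypothetical helper; the host and port defaults are taken from the URL above and will differ if the console shows another address.

```python
import socket


def server_is_up(host: str = "127.0.0.1", port: int = 7860, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    print("server reachable:", server_is_up())
```

This only confirms that the port is open; it does not verify that the process listening there is this application.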
- Text: Enter the sentence or paragraph you wish to synthesize.
- Reference Audio File: Provide a `.flac`, `.wav`, or `.mp3` file of the voice you want to mimic. You can use local paths or accessible web URLs.
- Advanced Parameters: Toggle "More options" to fine-tune Exaggeration, Pace, Seed, and Temperature to get the perfect speech output.
- Click Generate and listen to the Voice-TTS output.
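A reference can be screened before it is handed to the model. The following sketch is an illustration only and is not taken from `app.py`: `is_url` and `has_supported_extension` are hypothetical helpers, and the check looks only at the file extension, not at the audio content.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

SUPPORTED_EXTENSIONS = {".flac", ".wav", ".mp3"}  # formats named in the usage steps


def is_url(reference: str) -> bool:
    """True for http(s) references; anything else is treated as a local path."""
    return urlparse(reference).scheme in ("http", "https")


def has_supported_extension(reference: str) -> bool:
    """Check the extension of a local path or of a URL's path component."""
    path = urlparse(reference).path if is_url(reference) else reference
    return PurePosixPath(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

For a URL, the query string is ignored, so a link ending in `clip.MP3?x=1` still counts as an `.mp3` reference.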
This project is configured to be hosted as a Hugging Face Space using the Gradio SDK. See the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference.