Amhitox/TTS-STT-bot


Darija Speech Processing App

🇲🇦 Speech-to-Text and Text-to-Speech for Moroccan Arabic (Darija)

Features

  • Speech-to-Text (STT): Transcribe Darija audio using SpeechBrain's wav2vec2 model
  • Text-to-Speech (TTS): Generate Darija speech using Coqui XTTS-v2 with voice cloning

Requirements

  • Python 3.10+
  • CUDA-capable GPU recommended (CPU supported but slower)
  • ~8GB+ disk space for models
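
The requirements above can be checked before installing anything. The following is a minimal preflight sketch, not part of the project: the function names are hypothetical, the thresholds are copied from the list above, and the CUDA check assumes PyTorch is the backend in use (both SpeechBrain and Coqui TTS build on it). It reports `None` for CUDA when PyTorch is not installed yet.

```python
import importlib.util
import shutil
import sys

MIN_PYTHON = (3, 10)            # "Python 3.10+" from the Requirements list
MIN_FREE_BYTES = 8 * 1024**3    # "~8GB+ disk space for models"


def python_ok(version=None):
    """Return True if the interpreter meets the minimum Python version."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= MIN_PYTHON


def disk_ok(path=".", free_bytes=None):
    """Return True if roughly 8 GB is free on the volume holding `path`."""
    if free_bytes is None:
        free_bytes = shutil.disk_usage(path).free
    return free_bytes >= MIN_FREE_BYTES


def cuda_available():
    """Return True/False when PyTorch is installed, or None when it is not."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch  # imported lazily so the script also runs before installation
    return torch.cuda.is_available()
```

Calling `python_ok()`, `disk_ok()` and `cuda_available()` from the project directory gives a quick yes/no for each requirement; a `False` from `cuda_available()` only means inference will fall back to the slower CPU path.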

Installation

The project uses a dedicated Python 3.11 environment managed by uv.

# Install uv (if not already installed)
pip install uv

# Create environment and install dependencies
uv venv --python 3.11
uv pip install -r requirements.txt

Usage

1. Start the FastAPI Backend

# Windows path shown; on macOS/Linux use .venv/bin/python instead
.\.venv\Scripts\python -m uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

2. Start the Streamlit Frontend

# Windows path shown; on macOS/Linux use .venv/bin/python instead
.\.venv\Scripts\python -m streamlit run streamlit_app.py

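3. Open in Browser

Streamlit prints its URL on startup and serves on http://localhost:8501 by default. The FastAPI backend listens on http://localhost:8000.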
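
Before opening the browser, it can help to confirm that both processes are actually accepting connections. The sketch below polls a TCP port until it opens. `wait_for_port` is a hypothetical helper, port 8000 comes from the uvicorn command above, and 8501 is assumed because it is Streamlit's default when no port is configured.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Poll until a TCP connection to host:port succeeds or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Nothing is listening yet (or the attempt timed out); retry.
            time.sleep(interval)
    return False


def report(ports=(("FastAPI backend", 8000), ("Streamlit frontend", 8501))):
    """Print whether each service is reachable on localhost."""
    for name, port in ports:
        state = "up" if wait_for_port("localhost", port, timeout=5) else "not reachable"
        print(f"{name} (port {port}): {state}")
```

Running `report()` after starting both commands shows which side is still loading. The backend may take a while on first start if it downloads models.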

API Endpoints

POST /api/stt

Upload an audio file in the multipart form field audio → Get Darija transcription

curl -X POST -F "audio=@audio.wav" http://localhost:8000/api/stt
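
The same request can be made from Python. This is a rough standard-library equivalent of the curl call above, assuming only what that command shows: a multipart upload in the field `audio` to `/api/stt` on port 8000. The helper names are hypothetical, and because the response schema is not documented here, the function returns the raw response text rather than guessing at a JSON key.

```python
import mimetypes
import urllib.request
import uuid
from pathlib import Path


def build_multipart(field, filename, payload):
    """Encode one file as multipart/form-data; return (body, content_type)."""
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def transcribe(path, base_url="http://localhost:8000"):
    """POST an audio file to /api/stt and return the raw response text."""
    audio = Path(path)
    body, content_type = build_multipart("audio", audio.name, audio.read_bytes())
    request = urllib.request.Request(
        f"{base_url}/api/stt",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    # Generous timeout: transcription on CPU can be slow.
    with urllib.request.urlopen(request, timeout=120) as response:
        return response.read().decode("utf-8")
```

With the backend running, `print(transcribe("audio.wav"))` should print whatever the endpoint returns for that file.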

POST /api/tts

Send text in the form field text → Get synthesized Darija audio

curl -X POST -F "text=السلام عليكم" http://localhost:8000/api/tts --output speech.wav
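
A Python counterpart to the curl call above might look like the following sketch. It assumes only what that command shows: a multipart form with a `text` field posted to `/api/tts`, and audio bytes in the response body. `build_form` and `synthesize` are hypothetical names, and the output is written as-is without checking its format.

```python
import urllib.request
import uuid
from pathlib import Path


def build_form(fields):
    """Encode text fields as multipart/form-data; return (body, content_type)."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
    parts.append(f"--{boundary}--\r\n")
    # UTF-8 encoding keeps Arabic-script text intact on the wire.
    return "".join(parts).encode("utf-8"), f"multipart/form-data; boundary={boundary}"


def synthesize(text, out_path="speech.wav", base_url="http://localhost:8000"):
    """POST text to /api/tts and write the returned audio bytes to out_path."""
    body, content_type = build_form({"text": text})
    request = urllib.request.Request(
        f"{base_url}/api/tts",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    # Generous timeout: synthesis on CPU can take a while.
    with urllib.request.urlopen(request, timeout=300) as response:
        Path(out_path).write_bytes(response.read())
    return out_path
```

With the backend running, `synthesize("السلام عليكم")` should leave the generated audio in `speech.wav`, matching the `--output` flag in the curl example.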

Models Used

| Component | Model                  | Source      |
|-----------|------------------------|-------------|
| STT       | wav2vec2-dvoice-darija | SpeechBrain |
| TTS       | darija_xtt_2.0         | medmac01    |

Reference Audio

For TTS voice cloning, add a 3-10 second WAV file at:

reference_audio/darija_speaker.wav
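
A quick way to confirm that a clip fits the 3-10 second window is to read its header with Python's standard `wave` module. This is a minimal sketch with hypothetical function names; it assumes an uncompressed PCM WAV file, which is what the `wave` module reads.

```python
import wave

REFERENCE_PATH = "reference_audio/darija_speaker.wav"
MIN_SECONDS, MAX_SECONDS = 3.0, 10.0  # window stated in this README


def wav_duration(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_valid_reference(path=REFERENCE_PATH):
    """Return True if the clip length falls within the 3-10 second window."""
    return MIN_SECONDS <= wav_duration(path) <= MAX_SECONDS
```

Calling `is_valid_reference()` from the project root checks the default location. A `wave.Error` usually means the file is not plain PCM and needs to be re-exported.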

License

Open source for educational purposes.

About

Real-time voice assistant bot integrating TTS & STT models with AI service pipeline — Flutter + Python
