mexican75/echoveil-server

EchoVeil Server

Companion server for the EchoVeil Chrome extension. Runs locally and performs real-time AI voice conversion using Seed-VC and optional audio-driven face animation using MuseTalk.

The extension captures tab audio, streams it to this server over WebSocket, and plays back the converted voice in real time. All processing happens on your machine — nothing leaves localhost.

Requirements

  • GPU: NVIDIA with CUDA (Windows) or Apple Silicon (macOS). 4+ GB VRAM recommended (8+ GB if using face animation).
  • Disk: ~5 GB for voice conversion, or ~9 GB with the face animation models.

Install (Windows)

Download the latest installer from Releases and run it.

The installer will:

  1. Install an embedded Python 3.12
  2. Download Seed-VC and MuseTalk source code
  3. Download MuseTalk model weights (~4 GB)
  4. Install PyTorch with CUDA and all dependencies
  5. Register the echoveil:// URI protocol so the extension can auto-launch the server

After installation, the server can be launched from the Start Menu or automatically by the extension.

Install (macOS)

There is no packaged installer for macOS yet. The server runs fine on Apple Silicon (M1/M2/M3/M4) via MPS — you just need to set it up manually.

# Install Python 3.12 (if you don't have it)
brew install python@3.12

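# Clone and set up
git clone https://github.com/mexican75/echoveil-server.git
cd echoveil-server

# Create and activate a virtual environment (Homebrew's Python blocks system-wide pip installs)
python3.12 -m venv .venv
source .venv/bin/activate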

# Install PyTorch (MPS support is included by default)
pip3 install torch torchaudio torchvision

# Clone seed-vc
git clone --depth 1 https://github.com/Plachtaa/seed-vc.git seed-vc

# Clone MuseTalk (optional — for face animation)
git clone --depth 1 https://github.com/TMElyralab/MuseTalk.git musetalk

# Download MuseTalk model weights (optional — ~4 GB)
python3 download_musetalk_models.py

# Install remaining dependencies
pip3 install -r requirements.txt

# Patch BigVGAN
python3 patch_bigvgan.py

# Run the server
python3 main.py

The extension cannot auto-launch the server on macOS (no echoveil:// URI protocol registered). You need to start it manually before using the extension. The server will auto-shut down after 10 minutes of inactivity.

Intel Macs fall back to CPU inference, which will be too slow for real-time use.

How It Works

  1. On Windows, the extension detects that the server isn't running and opens echoveil://launch (on macOS, start the server manually)
  2. The system tray launcher starts the server and loads the ML models (about 30 s on first run; cached afterwards)
  3. Audio streams over WebSocket at ws://127.0.0.1:8765/ws/convert
  4. The server auto-shuts down after 10 minutes of inactivity (configurable via IDLE_TIMEOUT_MINUTES env var, 0 to disable)
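
From a client's point of view, the launch steps above amount to "probe the server, trigger the launcher if needed, then wait until the models are loaded". The sketch below shows only the probe-and-wait part using the Python standard library. It assumes the REST endpoints are served on the same port as the WebSocket URL (8765) and that GET /status returns HTTP 200 once the server is ready; neither is confirmed here, and the helper names are hypothetical.

```python
import time
import urllib.request

# Assumption: REST endpoints share the port shown in the WebSocket URL.
BASE_URL = "http://127.0.0.1:8765"


def server_is_up(base_url: str = BASE_URL, timeout: float = 1.0) -> bool:
    """Return True if GET /status answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/status", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers URLError, refused connections, and timeouts
        return False


def wait_for_server(base_url: str = BASE_URL, attempts: int = 60, delay: float = 1.0,
                    probe=server_is_up, sleep=time.sleep) -> bool:
    """Poll the server until it responds or the attempts run out."""
    for _ in range(attempts):
        if probe(base_url):
            return True
        sleep(delay)
    return False
```

Because model loading can take around 30 seconds on first run, a generous number of attempts is a reasonable default. The `probe` and `sleep` parameters exist only so the loop can be exercised without a running server.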

API

Endpoint      Method  Description
/status       GET     Server status, loaded reference, and avatar info
/reference    POST    Upload reference voice audio (WAV, MP3, FLAC, etc.)
/avatar       POST    Upload face image for lip-sync animation (JPG, PNG, WebP)
/settings     GET     Current face animation settings
/settings     PATCH   Update face animation settings (face_fps, use_tiny_vae)
/ws/convert   WS      Real-time PCM voice conversion stream
/ws/face      WS      Real-time face animation frames (JPEG binary)

The server binds to 127.0.0.1 only — it is not accessible from other machines.
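
As an illustration of the REST endpoints, here is a minimal standard-library sketch that reads /status and builds a PATCH request for /settings. It assumes the endpoints share port 8765 with the WebSocket, that /status returns JSON, and that /settings accepts a JSON body; the example values for face_fps and use_tiny_vae are guesses at plausible types, not documented defaults.

```python
import json
import urllib.request

# Assumption: REST endpoints share the port shown in the WebSocket URL.
BASE_URL = "http://127.0.0.1:8765"


def get_status(base_url: str = BASE_URL) -> dict:
    """Fetch /status and decode it, assuming a JSON response."""
    with urllib.request.urlopen(f"{base_url}/status", timeout=5) as resp:
        return json.load(resp)


def build_settings_patch(base_url: str = BASE_URL, **settings) -> urllib.request.Request:
    """Build (but do not send) a PATCH /settings request with a JSON body."""
    body = json.dumps(settings).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/settings",
        data=body,
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )


def demo() -> None:
    """Example usage; requires a running server."""
    print(get_status())
    request = build_settings_patch(face_fps=15, use_tiny_vae=True)  # illustrative values
    with urllib.request.urlopen(request, timeout=5) as resp:
        print(resp.status)
```

Keeping request construction separate from sending makes it easy to inspect what would go over the wire before pointing it at a live server.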

Developer Setup

On Windows, setup.bat handles everything (Python, PyTorch CUDA, seed-vc, MuseTalk, deps). On macOS, follow the install instructions above.

# Run the server directly:
python main.py

# Or with the system tray launcher (Windows):
python tray_launcher.py

Project Structure

main.py                    — FastAPI server (REST + WebSocket endpoints)
voice_converter.py         — Seed-VC wrapper with streaming buffers + SOLA crossfade
face_animator.py           — MuseTalk wrapper for audio-driven face animation
dashboard.py               — Rich console dashboard for server status monitoring
tray_launcher.py           — System tray icon, starts/stops the server
launch.vbs                 — Hidden-console VBS launcher (full GPU priority)
patch_bigvgan.py           — Patches BigVGAN for newer huggingface_hub versions
download_musetalk_models.py — Downloads MuseTalk model weights from HuggingFace
setup.bat                  — Windows one-click setup
start.bat                  — Windows server launcher (console)
installer/                 — Inno Setup installer script
.github/workflows/         — CI: builds the Windows installer on tag push
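
voice_converter.py is described above as using a SOLA crossfade. For readers unfamiliar with the term, the following is a generic pure-Python sketch of synchronized overlap-add: it searches for the offset at which a new chunk best lines up with the tail of the previous output, then crossfades across the overlap. It is not taken from this repository; the function names, the linear fade, and the normalization are illustrative assumptions, and the actual implementation may differ.

```python
import math


def best_offset(prev_tail, chunk, search):
    """Return the offset in [0, search] where chunk best matches prev_tail.

    Uses cross-correlation normalized by the energy of the candidate segment.
    Requires len(chunk) >= search + len(prev_tail).
    """
    n = len(prev_tail)
    best, best_score = 0, float("-inf")
    for off in range(search + 1):
        seg = chunk[off:off + n]
        num = sum(a * b for a, b in zip(prev_tail, seg))
        den = math.sqrt(sum(b * b for b in seg)) + 1e-8  # avoid division by zero
        score = num / den
        if score > best_score:
            best, best_score = off, score
    return best


def sola_join(prev_tail, chunk, search):
    """Align chunk to prev_tail, then linearly crossfade over the overlap."""
    n = len(prev_tail)
    aligned = chunk[best_offset(prev_tail, chunk, search):]
    faded = [
        prev_tail[i] * (1 - i / n) + aligned[i] * (i / n)  # fade old out, new in
        for i in range(n)
    ]
    return faded + aligned[n:]
```

Aligning before fading is what avoids the phase cancellation and clicks that a plain crossfade can produce at chunk boundaries in a streaming pipeline.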

Configuration

Env Variable          Default     Description
IDLE_TIMEOUT_MINUTES  10          Auto-shutdown after N minutes of inactivity. 0 to disable.
CORS_ORIGINS          (empty)     Comma-separated additional CORS origins. Chrome extension origins are allowed by default.
SEED_VC_ROOT          ./seed-vc   Path to the Seed-VC source directory.
MUSETALK_ROOT         ./musetalk  Path to the MuseTalk source directory.
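
To make the defaults concrete, here is a small sketch of how these variables could be read in Python. The variable names and defaults come from the table above; the `ServerConfig` class and `load_config` function are hypothetical and may not match how main.py actually parses its environment.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional, Tuple


@dataclass(frozen=True)
class ServerConfig:
    idle_timeout_minutes: float
    cors_origins: Tuple[str, ...]
    seed_vc_root: Path
    musetalk_root: Path

    @property
    def idle_timeout_seconds(self) -> Optional[float]:
        """None means auto-shutdown is disabled (IDLE_TIMEOUT_MINUTES=0)."""
        if self.idle_timeout_minutes <= 0:
            return None
        return self.idle_timeout_minutes * 60


def load_config(env: Mapping[str, str] = os.environ) -> ServerConfig:
    """Read settings from the environment, falling back to the documented defaults."""
    raw_origins = env.get("CORS_ORIGINS", "")
    origins = tuple(o.strip() for o in raw_origins.split(",") if o.strip())
    return ServerConfig(
        idle_timeout_minutes=float(env.get("IDLE_TIMEOUT_MINUTES", "10")),
        cors_origins=origins,
        seed_vc_root=Path(env.get("SEED_VC_ROOT", "./seed-vc")),
        musetalk_root=Path(env.get("MUSETALK_ROOT", "./musetalk")),
    )
```

For example, `load_config({"IDLE_TIMEOUT_MINUTES": "0"})` yields a config whose `idle_timeout_seconds` is `None`, mirroring the "0 to disable" behavior in the table.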

License

GPL-3.0-or-later
