
WaveKat TTS


Unified text-to-speech for voice pipelines, wrapping multiple TTS engines behind common Rust traits. Same pattern as wavekat-vad and wavekat-turn.

> [!WARNING]
> Early development. API may change between minor versions.

Backends

| Backend   | Feature flag | Status       | License    |
|-----------|--------------|--------------|------------|
| Qwen3-TTS | `qwen3-tts`  | ✅ Available | Apache 2.0 |
| CosyVoice | `cosyvoice`  | 🚧 Planned   | Apache 2.0 |

Quick start

```shell
cargo add wavekat-tts --features qwen3-tts
```

```rust
use wavekat_tts::{TtsBackend, SynthesizeRequest};
use wavekat_tts::backends::qwen3_tts::{Qwen3Tts, ModelConfig, ModelPrecision, ExecutionProvider};

// Auto-downloads INT4 model files on first run, runs on CPU (default):
let tts = Qwen3Tts::new()?;

// Or FP32 on CPU:
// let tts = Qwen3Tts::from_config(ModelConfig::default().with_precision(ModelPrecision::Fp32))?;

// Or INT4 from a local directory on CUDA:
// let tts = Qwen3Tts::from_config(
//     ModelConfig::default()
//         .with_dir("models/qwen3-tts-1.7b")
//         .with_execution_provider(ExecutionProvider::Cuda),
// )?;

let request = SynthesizeRequest::new("Hello, world")
    .with_instruction("Speak naturally and clearly.");
let audio = tts.synthesize(&request)?;

// Save to WAV (wavekat-core includes WAV I/O via the `wav` feature):
audio.write_wav("output.wav")?;

println!("{}s at {} Hz", audio.duration_secs(), audio.sample_rate());
```

Model files are cached by the HF Hub client at $HF_HOME/hub/ (default ~/.cache/huggingface/hub/). Set WAVEKAT_MODEL_DIR to load from a local directory and skip all downloads.
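For example, to load from a local directory and skip downloads entirely (the path below is illustrative):

```shell
# Point the crate at a local model directory instead of the HF Hub cache:
export WAVEKAT_MODEL_DIR=/path/to/models/qwen3-tts-1.7b
cargo run --example synthesize --features qwen3-tts -- "Hello"
```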

All backends produce AudioFrame<'static> from wavekat-core — the same type consumed by wavekat-vad and wavekat-turn.

Architecture

```
wavekat-vad   →  "is someone speaking?"
wavekat-turn  →  "are they done speaking?"
wavekat-tts   →  "synthesize the response"
     │                   │                     │
     └───────────────────┴─────────────────────┘
                         │
                   AudioFrame (wavekat-core)
```

Two trait families:

- `TtsBackend` — batch synthesis: text → `AudioFrame<'static>`
- `StreamingTtsBackend` — streaming: text → iterator of `AudioFrame<'static>` chunks

Examples

Generate a WAV file from text (model files are auto-downloaded on first run):

```shell
cargo run --example synthesize --features qwen3-tts -- "Hello, world!"
cargo run --example synthesize --features qwen3-tts -- --instruction "Speak in a warm, friendly tone." "Give every small business the voice of a big one."
cargo run --example synthesize --features qwen3-tts -- --precision fp32 "Hello"
cargo run --example synthesize --features qwen3-tts -- --model-dir /path/to/model --output hello.wav "Hello"
```

Performance

| Backend   | Precision | Provider | Hardware             | RTF short | RTF medium | RTF long |
|-----------|-----------|----------|----------------------|-----------|------------|----------|
| qwen3-tts | int4      | CPU      | Standard_NC4as_T4_v3 | 1.98      | 2.04       | 2.34     |
| qwen3-tts | int4      | CUDA     | Standard_NC4as_T4_v3 | 0.78      | 0.85       | 1.07     |

RTF < 1.0 = faster-than-real-time. Lower is better.
To update: run make bench-csv-cuda on target hardware, then commit bench/results/.

Feature flags

Backends

| Flag        | Default | Description                               |
|-------------|---------|-------------------------------------------|
| `qwen3-tts` | off     | Qwen3-TTS local ONNX inference            |
| `cosyvoice` | off     | CosyVoice local ONNX inference (planned)  |

Execution providers

Composable with any backend flag. Selects the inference hardware at build time.

| Flag       | Description          | Status            |
|------------|----------------------|-------------------|
| `cuda`     | NVIDIA CUDA GPU      | ✅ Working        |
| `tensorrt` | NVIDIA TensorRT      | 🚧 Not configured |
| `coreml`   | Apple CoreML (macOS) | 🚧 Not configured |
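For example, to pair the Qwen3-TTS backend with the CUDA provider, enable both flags together:

```shell
# Backend flag + execution-provider flag are composed in one feature list:
cargo add wavekat-tts --features qwen3-tts,cuda
```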

License

Licensed under Apache 2.0.

Copyright 2026 WaveKat.
