feat: implement PipecatSmartTurn backend #9

Merged: wavekat-eason merged 7 commits into main from feat/pipecat-backend on Mar 28, 2026
Conversation

@wavekat-eason (Contributor) commented Mar 28, 2026

Summary

  • Implements PipecatSmartTurn turn detector using the Pipecat Smart Turn v3.2 ONNX model (int8 quantized, ~8 MB, ~12 ms CPU inference)
  • Adds build.rs to download the model at build time with version-aware caching; supports PIPECAT_SMARTTURN_MODEL_PATH override and docs.rs no-network builds
  • Adds incremental STFT preprocessing and shared ONNX session helpers in onnx.rs
  • Extends TurnPrediction with stage_times: Vec<StageTiming> for per-stage latency breakdown
  • Adds integration tests covering push/predict/reset lifecycle and from_file error paths
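
The per-stage latency breakdown can be pictured roughly as follows. This is a minimal sketch, not the crate's actual code: only the `stage_times: Vec<StageTiming>` field and the stage names `audio_prep`, `mel`, and `onnx` come from this PR. The `name`, `elapsed`, and `probability` fields, the `timed` helper, and the placeholder computations inside `predict_sketch` are all assumptions made for illustration.

```rust
use std::time::{Duration, Instant};

/// Latency of one named pipeline stage.
/// Field names are assumptions; the PR only names the struct itself.
#[derive(Debug, Clone, PartialEq)]
pub struct StageTiming {
    pub name: &'static str,
    pub elapsed: Duration,
}

/// Reduced stand-in for the crate's `TurnPrediction`.
/// Only `stage_times` is taken from the PR; `probability` is hypothetical.
#[derive(Debug, Clone)]
pub struct TurnPrediction {
    pub probability: f32,
    pub stage_times: Vec<StageTiming>,
}

/// Runs `f`, appends its wall-clock duration to `stages` under `name`,
/// and returns whatever `f` returned.
pub fn timed<T>(
    name: &'static str,
    stages: &mut Vec<StageTiming>,
    f: impl FnOnce() -> T,
) -> T {
    let start = Instant::now();
    let out = f();
    stages.push(StageTiming { name, elapsed: start.elapsed() });
    out
}

/// Placeholder pipeline: each closure stands in for the real stage.
pub fn predict_sketch(samples: &[f32]) -> TurnPrediction {
    let mut stages = Vec::with_capacity(3);
    // Stand-in for copying audio out of the ring buffer.
    let prepared = timed("audio_prep", &mut stages, || samples.to_vec());
    // Stand-in for mel feature extraction: just the signal energy.
    let energy = timed("mel", &mut stages, || {
        prepared.iter().map(|s| s * s).sum::<f32>()
    });
    // Stand-in for ONNX inference: a logistic squash of the energy.
    let probability = timed("onnx", &mut stages, || 1.0 / (1.0 + (-energy).exp()));
    TurnPrediction { probability, stage_times: stages }
}
```

Recording stages in a `Vec` preserves execution order, so a caller can print the breakdown without needing to know the stage names in advance.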

Test plan

  • cargo test --features pipecat — unit and integration tests pass
  • Build with PIPECAT_SMARTTURN_MODEL_PATH pointing to a local .onnx file — skips download
  • Build with DOCS_RS=1 — writes zero-byte placeholder, compiles without network
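
The three build scenarios above suggest a decision procedure along these lines. This is a hedged sketch of what such logic in `build.rs` might look like, not the actual script: the precedence order (docs.rs first, then a local path, then a URL), the `ModelSource` enum, both function names, the placeholder `DEFAULT_MODEL_URL`, and the version-string comparison used for caching are all assumptions. Only the environment variable names `DOCS_RS`, `PIPECAT_SMARTTURN_MODEL_PATH`, and `PIPECAT_SMARTTURN_MODEL_URL` come from the PR.

```rust
use std::path::PathBuf;

/// Hypothetical default location; the real URL is not stated in the PR.
pub const DEFAULT_MODEL_URL: &str = "https://example.invalid/smart-turn-v3.2-cpu.onnx";

/// Where the build script should obtain the model from.
#[derive(Debug, PartialEq)]
pub enum ModelSource {
    /// docs.rs build: write a zero-byte placeholder, no network access.
    Placeholder,
    /// Use a local .onnx file and skip the download.
    LocalFile(PathBuf),
    /// Download from this URL.
    Download(String),
}

/// Picks a model source from the values of DOCS_RS,
/// PIPECAT_SMARTTURN_MODEL_PATH and PIPECAT_SMARTTURN_MODEL_URL.
/// The precedence order is an assumption of this sketch.
pub fn resolve_model_source(
    docs_rs: Option<&str>,
    model_path: Option<&str>,
    model_url: Option<&str>,
    default_url: &str,
) -> ModelSource {
    if docs_rs.is_some() {
        return ModelSource::Placeholder;
    }
    if let Some(path) = model_path {
        return ModelSource::LocalFile(PathBuf::from(path));
    }
    ModelSource::Download(model_url.unwrap_or(default_url).to_string())
}

/// Version-aware cache check: download again unless the version recorded
/// next to the cached file matches the wanted one.
pub fn needs_download(cached_version: Option<&str>, wanted_version: &str) -> bool {
    cached_version != Some(wanted_version)
}
```

Taking the variable values as parameters, rather than reading the environment inside the function, keeps the decision testable without mutating process state. A build script would pass in the results of `std::env::var(...).ok()`.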

🤖 Generated with Claude Code

wavekat-eason and others added 7 commits March 28, 2026 16:39
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- build.rs: download smart-turn-v3.2-cpu.onnx at build time with
  version caching and PIPECAT_SMARTTURN_MODEL_{PATH,URL} overrides
- src/onnx.rs: shared session_from_file / session_from_memory helpers
- src/audio/pipecat.rs: full implementation
  - MelExtractor: Slaney mel filterbank, periodic Hann window, realfft STFT
  - PipecatSmartTurn::new() (embedded model) and from_file(path)
  - push_audio: 16 kHz ring buffer, 8s capacity, wrong-rate frames dropped
  - predict: mel features → ONNX inference → TurnPrediction
  - reset: clears ring buffer
- tests/pipecat.rs: 9 integration tests; inference time < 50 ms enforced in release builds
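
The `push_audio` and `reset` behaviour described above (16 kHz only, 8 s capacity, wrong-rate frames dropped) could be sketched like this. It is an illustration under assumptions, not the code in `src/audio/pipecat.rs`: the `AudioRing` type, its method names, the `bool` return value, and the use of `VecDeque` are all hypothetical.

```rust
use std::collections::VecDeque;

/// Sample rate accepted by the buffer, as stated in the commit message.
pub const SAMPLE_RATE: u32 = 16_000;
/// 8 seconds of audio at 16 kHz = 128,000 samples.
pub const CAPACITY: usize = 8 * SAMPLE_RATE as usize;

/// Hypothetical fixed-capacity buffer that keeps the newest samples.
pub struct AudioRing {
    buf: VecDeque<f32>,
}

impl AudioRing {
    pub fn new() -> Self {
        Self { buf: VecDeque::with_capacity(CAPACITY) }
    }

    /// Appends a frame. Returns `false` (and stores nothing) when the
    /// frame's sample rate is not 16 kHz.
    pub fn push(&mut self, samples: &[f32], sample_rate: u32) -> bool {
        if sample_rate != SAMPLE_RATE {
            return false;
        }
        // Only the newest CAPACITY samples of an oversized frame can survive.
        let tail = &samples[samples.len().saturating_sub(CAPACITY)..];
        // Evict just enough of the oldest samples to make room.
        let overflow = (self.buf.len() + tail.len()).saturating_sub(CAPACITY);
        self.buf.drain(..overflow);
        self.buf.extend(tail.iter().copied());
        true
    }

    /// Number of samples currently held.
    pub fn len(&self) -> usize {
        self.buf.len()
    }

    /// Drops all buffered audio, as `reset` is described to do.
    pub fn clear(&mut self) {
        self.buf.clear();
    }

    /// Copies the buffered samples out, oldest first.
    pub fn to_vec(&self) -> Vec<f32> {
        self.buf.iter().copied().collect()
    }
}
```

Silently dropping wrong-rate frames avoids resampling inside the detector; returning a flag is one way to let callers notice the drop.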

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add StageTiming struct and stage_times field to TurnPrediction
- Instrument predict() with three stages: audio_prep, mel, onnx
- Incremental STFT: cache power spectrogram and shift on each call,
  recomputing only the ~50 new frames instead of all 801 (~16x faster)
- Incremental mel filterbank: cache mel_spec and update only new columns,
  reducing matmul from [80×201]×[201×800] to [80×201]×[201×50] (~16x)
- Invalidate caches on reset()
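
The caching idea in this commit, shifting the cached spectrogram and computing only the new columns, can be illustrated with a small sliding cache. This is a sketch under assumptions rather than the PR's implementation: `SpectrogramCache`, its methods, and the `compute` callback are hypothetical, and each column is a plain `Vec<f32>` instead of the real STFT or mel output.

```rust
use std::collections::VecDeque;

/// Hypothetical sliding cache of per-frame feature columns.
pub struct SpectrogramCache {
    max_frames: usize,
    columns: VecDeque<Vec<f32>>,
}

impl SpectrogramCache {
    pub fn new(max_frames: usize) -> Self {
        Self { max_frames, columns: VecDeque::with_capacity(max_frames) }
    }

    /// Appends `n_new` columns, calling `compute(i)` for each new frame
    /// index `i` in `0..n_new`, and evicts the oldest columns so that at
    /// most `max_frames` remain. Returns how many columns were computed.
    pub fn advance(
        &mut self,
        n_new: usize,
        mut compute: impl FnMut(usize) -> Vec<f32>,
    ) -> usize {
        // Frames that would be evicted immediately are never computed.
        let n = n_new.min(self.max_frames);
        for i in (n_new - n)..n_new {
            if self.columns.len() == self.max_frames {
                self.columns.pop_front();
            }
            self.columns.push_back(compute(i));
        }
        n
    }

    /// Drops every cached column, as `reset()` is described to do.
    pub fn invalidate(&mut self) {
        self.columns.clear();
    }

    /// Number of cached columns.
    pub fn len(&self) -> usize {
        self.columns.len()
    }

    /// Column at position `i`, oldest first.
    pub fn column(&self, i: usize) -> Option<&[f32]> {
        self.columns.get(i).map(|v| v.as_slice())
    }
}
```

With an 800-column window and about 50 new columns per call, the per-call work drops by a factor of 800 / 50 = 16, which matches the ~16x figure quoted in the commit message.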

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@wavekat-eason wavekat-eason merged commit 1577a63 into main Mar 28, 2026
5 checks passed
@wavekat-eason wavekat-eason deleted the feat/pipecat-backend branch March 28, 2026 09:41
@github-actions github-actions bot mentioned this pull request Mar 28, 2026
wavekat-eason pushed a commit that referenced this pull request Mar 28, 2026
## 🤖 New release

* `wavekat-turn`: 0.0.3 -> 0.0.4 (✓ API compatible changes)

<details><summary><i><b>Changelog</b></i></summary><p>

<blockquote>

## [0.0.4](v0.0.3...v0.0.4) - 2026-03-28

### Added

- implement PipecatSmartTurn backend ([#9](#9))
</blockquote>


</p></details>

---
This PR was generated with [release-plz](https://github.com/release-plz/release-plz/).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>