From 6d529734475298c53a4459e7d091408f6fe57c7f Mon Sep 17 00:00:00 2001
From: "github-actions[bot]"
Date: Tue, 24 Mar 2026 13:35:48 +0000
Subject: [PATCH] docs: add VAD stop_secs warnings for turn stop strategies

Updated documentation to reflect new warnings added in pipecat PR #4115:
- Added notes to SpeechTimeoutUserTurnStopStrategy and TurnAnalyzerUserTurnStopStrategy sections explaining that built-in STT P99 latency values assume VADParams.stop_secs=0.2
- Documented warnings that appear when stop_secs differs from default or when stop_secs >= STT p99 latency
- Added references to stt-benchmark tool for measuring latency with custom VAD settings
- Updated speech-input.mdx to mention the stop_secs dependency for STT latency values
---
 guides/learn/speech-input.mdx                 |  1 +
 .../turn-management/user-turn-strategies.mdx  | 20 +++++++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/guides/learn/speech-input.mdx b/guides/learn/speech-input.mdx
index 35d72d0f..79ef39c1 100644
--- a/guides/learn/speech-input.mdx
+++ b/guides/learn/speech-input.mdx
@@ -76,6 +76,7 @@ In the vast number of cases, the default values will work well. Only adjust thes
 - How much silence must be detected before confirming speech has stopped
 - Critical for turn-taking behavior
 - A short value (0.2s) allows STT services to finalize sooner, improving transcription speed
+- **Important**: Built-in STT P99 latency values are measured with `stop_secs=0.2`. If you change this value, re-run the [stt-benchmark](https://github.com/pipecat-ai/stt-benchmark) with your settings and pass the measured latency to your STT service via `ttfs_p99_latency`
 **`confidence` and `min_volume`**
diff --git a/server/utilities/turn-management/user-turn-strategies.mdx b/server/utilities/turn-management/user-turn-strategies.mdx
index 44a250e8..48b27a09 100644
--- a/server/utilities/turn-management/user-turn-strategies.mdx
+++ b/server/utilities/turn-management/user-turn-strategies.mdx
@@ -228,6 +228,16 @@ from pipecat.turns.user_stop import SpeechTimeoutUserTurnStopStrategy
 strategy = SpeechTimeoutUserTurnStopStrategy(user_speech_timeout=0.6)
 ```
+
+Built-in STT P99 latency values assume `VADParams.stop_secs=0.2` (the
+recommended default). If you change `stop_secs`, the strategy will log a
+warning suggesting you re-run the [stt-benchmark](https://github.com/pipecat-ai/stt-benchmark)
+with your VAD settings and pass the measured TTFS P99 latency to your STT
+service constructor via `ttfs_p99_latency`. The strategy will also warn if
+`stop_secs >= STT p99 latency`, which collapses the STT wait timeout to 0s
+and may cause delayed turn detection.
+
+
 ### TurnAnalyzerUserTurnStopStrategy
 Uses an AI-powered turn detection model to determine when the user has finished speaking. This provides more intelligent end-of-turn detection that can understand conversational context.
@@ -251,6 +261,16 @@ strategy = TurnAnalyzerUserTurnStopStrategy(
 more information on available turn analyzers.
+
+Built-in STT P99 latency values assume `VADParams.stop_secs=0.2` (the
+recommended default). If you change `stop_secs`, the strategy will log a
+warning suggesting you re-run the [stt-benchmark](https://github.com/pipecat-ai/stt-benchmark)
+with your VAD settings and pass the measured TTFS P99 latency to your STT
+service constructor via `ttfs_p99_latency`. The strategy will also warn if
+`stop_secs >= STT p99 latency`, which collapses the STT wait timeout to 0s
+and may cause delayed turn detection.
+
+
 ### ExternalUserTurnStopStrategy
 Delegates turn stop detection to an external processor. This strategy listens for `UserStoppedSpeakingFrame` frames emitted by other components in the pipeline.
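
Reviewer note: the two warning conditions described in the added notes can be sketched as below. This is an illustration only, not pipecat code. The helper name `stt_wait_timeout` and the constant `DEFAULT_STOP_SECS` are hypothetical, and the subtraction formula is an assumption inferred from the statement that `stop_secs >= STT p99 latency` collapses the STT wait timeout to 0s.

```python
# Hypothetical sketch; none of these names are part of the pipecat API.
DEFAULT_STOP_SECS = 0.2  # value the built-in P99 figures assume, per the notes


def stt_wait_timeout(ttfs_p99_latency: float, stop_secs: float) -> tuple[float, list[str]]:
    """Return (assumed STT wait timeout in seconds, warnings that would apply)."""
    warnings: list[str] = []

    # Condition 1: stop_secs differs from the default the P99 values assume.
    if abs(stop_secs - DEFAULT_STOP_SECS) > 1e-9:
        warnings.append(
            "stop_secs differs from 0.2; re-run stt-benchmark and pass ttfs_p99_latency"
        )

    # Condition 2: VAD silence window is at least as long as the STT P99 latency.
    if stop_secs >= ttfs_p99_latency:
        warnings.append("stop_secs >= STT p99 latency; STT wait timeout collapses to 0s")

    # Assumed formula: the wait is whatever latency remains after the VAD
    # silence window has elapsed, clamped at zero.
    timeout = max(0.0, ttfs_p99_latency - stop_secs)
    return timeout, warnings


if __name__ == "__main__":
    print(stt_wait_timeout(1.0, 0.2))
    print(stt_wait_timeout(0.3, 0.5))
```

Under these assumptions, a default `stop_secs` with a 1.0s P99 produces no warnings, while `stop_secs=0.5` against a 0.3s P99 triggers both.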