diff --git a/guides/learn/speech-input.mdx b/guides/learn/speech-input.mdx
index 35d72d0f..79ef39c1 100644
--- a/guides/learn/speech-input.mdx
+++ b/guides/learn/speech-input.mdx
@@ -76,6 +76,7 @@ In the vast number of cases, the default values will work well. Only adjust thes
 - How much silence must be detected before confirming speech has stopped
 - Critical for turn-taking behavior
 - A short value (0.2s) allows STT services to finalize sooner, improving transcription speed
+- **Important**: Built-in STT P99 latency values are measured with `stop_secs=0.2`. If you change this value, re-run the [stt-benchmark](https://github.com/pipecat-ai/stt-benchmark) with your settings and pass the measured latency to your STT service via `ttfs_p99_latency`
 
 **`confidence` and `min_volume`**
 
diff --git a/server/utilities/turn-management/user-turn-strategies.mdx b/server/utilities/turn-management/user-turn-strategies.mdx
index 44a250e8..48b27a09 100644
--- a/server/utilities/turn-management/user-turn-strategies.mdx
+++ b/server/utilities/turn-management/user-turn-strategies.mdx
@@ -228,6 +228,16 @@ from pipecat.turns.user_stop import SpeechTimeoutUserTurnStopStrategy
 strategy = SpeechTimeoutUserTurnStopStrategy(user_speech_timeout=0.6)
 ```
 
+
+ Built-in STT P99 latency values assume `VADParams.stop_secs=0.2` (the
+ recommended default). If you change `stop_secs`, the strategy will log a
+ warning suggesting you re-run the [stt-benchmark](https://github.com/pipecat-ai/stt-benchmark)
+ with your VAD settings and pass the measured TTFS P99 latency to your STT
+ service constructor via `ttfs_p99_latency`. The strategy will also warn if
+ `stop_secs >= STT p99 latency`, which collapses the STT wait timeout to 0s
+ and may cause delayed turn detection.
+
+
 ### TurnAnalyzerUserTurnStopStrategy
 
 Uses an AI-powered turn detection model to determine when the user has finished speaking. This provides more intelligent end-of-turn detection that can understand conversational context.
@@ -251,6 +261,16 @@ strategy = TurnAnalyzerUserTurnStopStrategy(
 more information on available turn analyzers.
 
+
+ Built-in STT P99 latency values assume `VADParams.stop_secs=0.2` (the
+ recommended default). If you change `stop_secs`, the strategy will log a
+ warning suggesting you re-run the [stt-benchmark](https://github.com/pipecat-ai/stt-benchmark)
+ with your VAD settings and pass the measured TTFS P99 latency to your STT
+ service constructor via `ttfs_p99_latency`. The strategy will also warn if
+ `stop_secs >= STT p99 latency`, which collapses the STT wait timeout to 0s
+ and may cause delayed turn detection.
+
+
 ### ExternalUserTurnStopStrategy
 
 Delegates turn stop detection to an external processor. This strategy listens for `UserStoppedSpeakingFrame` frames emitted by other components in the pipeline.
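
Reviewer note: the two warnings described in the added text can be illustrated with a small standalone sketch. It does not use Pipecat itself. The helper name `stt_wait_timeout`, the logger name, and the formula (STT P99 latency minus `stop_secs`, floored at zero) are assumptions inferred from the phrase "collapses the STT wait timeout to 0s"; the real strategy may compute the timeout differently.

```python
import logging

logger = logging.getLogger("turn_stop_sketch")  # hypothetical logger name

# Value that the built-in STT P99 latency figures are documented to assume.
DEFAULT_STOP_SECS = 0.2


def stt_wait_timeout(ttfs_p99_latency: float, stop_secs: float = DEFAULT_STOP_SECS) -> float:
    """Hypothetical helper: time left to wait for STT once the VAD reports silence.

    Assumption: the wait equals the STT P99 latency minus the silence the VAD
    has already consumed (stop_secs), and never goes below zero.
    """
    if stop_secs != DEFAULT_STOP_SECS:
        # First warning from the docs: latency values were measured at 0.2s.
        logger.warning(
            "stop_secs=%s differs from %s; re-run stt-benchmark and pass the "
            "measured value via ttfs_p99_latency",
            stop_secs,
            DEFAULT_STOP_SECS,
        )
    if stop_secs >= ttfs_p99_latency:
        # Second warning from the docs: the wait collapses to 0s.
        logger.warning(
            "stop_secs=%s >= STT P99 latency %s; STT wait timeout is 0s",
            stop_secs,
            ttfs_p99_latency,
        )
    return max(0.0, ttfs_p99_latency - stop_secs)
```

Under this assumed formula, a P99 latency of 1.0s with the default `stop_secs` leaves roughly 0.8s of STT wait, while `stop_secs=0.8` against a 0.5s P99 latency leaves none, which is the case the second warning is meant to flag.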