use audio context hook for InterruptibleTTSService by omChauhanDev · Pull Request #4099 · pipecat-ai/pipecat

omChauhanDev · 2026-03-21T14:39:08Z

Please describe the changes in your PR. If it is addressing an issue, please reference that as well.

Issue:

The _bot_speaking guard in InterruptibleTTSService._handle_interruption() skips websocket disconnect/reconnect when BotStartedSpeakingFrame hasn't reached the TTS processor yet. If a user interrupts while audio is still being synthesized or in-transit, the TTS server keeps streaming stale audio into the next response.

Note: #4090 partially addressed this by routing audio through append_to_audio_context(), so stale audio is discarded when no active context exists. However, the server still continues synthesizing unused audio (wasted cost/bandwidth), and old audio can leak into the next response once a new audio context becomes active.

Approach:

Replaced the _bot_speaking guard with an on_audio_context_interrupted() override - the same hook ElevenLabs, Rime, & Deepgram already use. Audio contexts exist from synthesis start to playback end, so this fires exactly when needed & stays silent when the bot is idle (preserving the original optimization against unnecessary reconnects from VAD noise).

Changes:

tts_service.py: removed _bot_speaking, _handle_interruption, process_frame override; added on_audio_context_interrupted with disconnect/reconnect
fish/tts.py: added super() call in existing on_audio_context_interrupted override
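
To illustrate the approach described above, here is a minimal toy model of a hook that fires only while an audio context exists. It is a sketch, not pipecat's actual code: `ToyContextTTS`, `ToyInterruptibleTTS`, `create_context`, `interrupt`, and the `events` list are hypothetical names, and the disconnect/reconnect is only recorded as strings. Only the hook name `on_audio_context_interrupted` comes from the description.

```python
import asyncio


class ToyContextTTS:
    """Hypothetical stand-in for a TTS base class that tracks audio contexts."""

    def __init__(self):
        self.contexts = {}  # context_id -> list of audio chunks
        self.events = []    # records what happened, for inspection

    def create_context(self, context_id):
        # In this model a context exists from synthesis start until playback ends.
        self.contexts[context_id] = []

    async def on_audio_context_interrupted(self, context_id):
        # Base hook: does nothing by default.
        pass

    async def interrupt(self):
        # Clear all contexts, then fire the hook once per context that was live.
        live = list(self.contexts)
        self.contexts = {}
        for context_id in live:
            await self.on_audio_context_interrupted(context_id)


class ToyInterruptibleTTS(ToyContextTTS):
    """Reconnects on interruption, but only if a context was live."""

    async def on_audio_context_interrupted(self, context_id):
        await super().on_audio_context_interrupted(context_id)
        # A real service would close and reopen its websocket here.
        self.events.append(f"disconnect:{context_id}")
        self.events.append(f"reconnect:{context_id}")


async def demo():
    tts = ToyInterruptibleTTS()
    await tts.interrupt()        # idle: no context, so no reconnect
    idle_events = list(tts.events)
    tts.create_context("ctx-1")  # synthesis started, playback not yet begun
    await tts.interrupt()        # hook fires even before playback starts
    return idle_events, tts.events


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this model an interruption while idle triggers nothing, while an interruption during synthesis triggers the reconnect without waiting for any "bot started speaking" signal.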

codecov · 2026-03-21T15:00:04Z

Codecov Report

❌ Patch coverage is 25.00000% with 3 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
src/pipecat/services/tts_service.py	33.33%	2 Missing ⚠️
src/pipecat/services/fish/tts.py	0.00%	1 Missing ⚠️

Files with missing lines	Coverage Δ
src/pipecat/services/fish/tts.py	`7.69% <0.00%> (-0.04%)`	⬇️
src/pipecat/services/tts_service.py	`66.43% <33.33%> (+0.56%)`	⬆️

Replace the _bot_speaking guard with on_audio_context_interrupted() override so the websocket is always reconnected when audio is in-transit, fixing the race condition where interruptions during the BotStartedSpeakingFrame round-trip window would leave stale audio streaming. Fish Audio TTS now calls super().on_audio_context_interrupted() to trigger the reconnect before stopping metrics. Fixes pipecat-ai#3986 (based on PR pipecat-ai#4099)

yuki901 · 2026-03-22T08:05:33Z

Thank you very much!!
Btw, since #4090, all InterruptibleTTSService subclasses route audio through append_to_audio_context(). After an interruption, _handle_interruption replaces _audio_contexts with a fresh empty dict, so stale audio from _receive_messages() is silently discarded — the user-facing bug in #3986 seems already fixed regardless of the _bot_speaking guard.

This PR's fix would still prevent the server from continuing to synthesize and stream unused audio (cost/bandwidth), but is that the intended scope? Does the PR description account for #4090?

omChauhanDev · 2026-03-22T13:14:57Z

Hey @yuki901, nice catch - you're right that #4090 handles the immediate window. After interruption, _create_audio_context_task() replaces _audio_contexts with a fresh dict and _playing_context_id is reset to None, so stale audio from _receive_messages() hits append_to_audio_context(None, ...) and gets silently dropped. The audio-plays-right-after-interruption symptom is largely gone.

That said, this PR still covers two things #4090 doesn't:

Server-side waste - without disconnecting, the TTS server keeps synthesizing and streaming audio nobody will use. That's wasted compute, bandwidth, and API cost.
Audio crossover into the next response - there's a subtler window where old audio can leak. Once the next LLM response starts and a new audio context is created, _playing_context_id gets set to the new context ID. If old audio from the still-connected server arrives at that point, get_active_audio_context_id() returns the new ID, and append_to_audio_context routes old audio into the new context. The disconnect/reconnect prevents this by clearing server-side state entirely.

Happy to update the PR description to reference #4090 & clarify the scope. Thanks for flagging it!
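
The two windows described in this comment can be sketched with a small toy router. This is an illustration under assumptions, not pipecat's implementation: `ToyAudioRouter` and its methods are hypothetical, and the model assumes incoming server audio carries no context ID of its own, so it is routed to whichever context is currently active.

```python
class ToyAudioRouter:
    """Hypothetical model of routing server audio into the active context."""

    def __init__(self):
        self.contexts = {}              # context_id -> list of chunks
        self.playing_context_id = None  # currently active context, if any
        self.dropped = 0                # chunks discarded for lack of a context

    def start_context(self, context_id):
        self.contexts[context_id] = []
        self.playing_context_id = context_id

    def interrupt(self):
        # Interruption replaces the contexts with a fresh dict and resets the ID.
        self.contexts = {}
        self.playing_context_id = None

    def on_server_audio(self, chunk):
        # Audio is appended to whichever context is active right now.
        context_id = self.playing_context_id
        if context_id is None or context_id not in self.contexts:
            self.dropped += 1
            return
        self.contexts[context_id].append(chunk)


def demo():
    router = ToyAudioRouter()
    router.start_context("response-1")
    router.on_server_audio("r1-a")
    router.interrupt()
    router.on_server_audio("r1-b")      # no active context: dropped
    router.start_context("response-2")
    router.on_server_audio("r1-c")      # stale, but lands in response-2
    router.on_server_audio("r2-a")
    return router.dropped, router.contexts["response-2"]


if __name__ == "__main__":
    print(demo())
```

In the toy model, the chunk that arrives right after the interruption is dropped, but the one that arrives after the next context starts is mixed into the new response. A disconnect on interruption would stop the old stream from producing either chunk.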

markbackman · 2026-03-26T01:57:31Z

Tagging @filipi87 to take a look.

filipi87

Hi @omChauhanDev,

As discussed above, all TTS services now route audio through the audio context. When an interruption occurs, all audio contexts are canceled. As a result, even if audio arrives afterward, append_to_audio_context may be called with an invalid or undefined context_id, and the audio is effectively discarded. In this case, we log a debug message indicating that the audio was dropped.

The behavior of reconnecting only when the bot is speaking was introduced as an optimization, since reestablishing the connection can sometimes take a couple of seconds. Keeping this behavior helps maintain a better user experience.

For that reason, I believe it still makes sense to preserve the current approach, only reconnecting when the bot is actively speaking.

That said, I do see a small window where stale audio could leak through. However, this would likely only occur in the case where run_tts has been invoked but the BotStartedSpeakingFrame has not yet been received. This seems to be the only scenario where the issue could arise.

If that’s the case, we could address it more directly within the InterruptibleTTSService, by treating the presence of any audio context as an indication that the bot has started speaking. Something along these lines should be sufficient to prevent the race condition:

async def push_frame(self, frame: Frame, direction: FrameDirection = FrameDirection.DOWNSTREAM):
    if isinstance(frame, TTSStartedFrame):
        self._bot_speaking = True
    await super().push_frame(frame, direction)
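
The effect of setting the guard earlier can be sketched with a toy model that compares the two timings. Everything here is hypothetical (`ToyGuardedTTS`, its method names, and the `mark_on_tts_started` switch); it only models the order of events discussed above, not the real frame pipeline.

```python
from dataclasses import dataclass


@dataclass
class ToyGuardedTTS:
    """Hypothetical model of a reconnect guard; not pipecat's real class."""

    mark_on_tts_started: bool
    bot_speaking: bool = False
    reconnects: int = 0

    def on_tts_started(self):
        # Synthesis has begun; audio may already be in transit.
        if self.mark_on_tts_started:
            self.bot_speaking = True

    def on_bot_started_speaking(self):
        # Arrives later, once audio has reached the output side.
        self.bot_speaking = True

    def on_interruption(self):
        # Reconnect only when the guard says the bot is speaking.
        if self.bot_speaking:
            self.reconnects += 1
        self.bot_speaking = False


def interrupt_during_synthesis(mark_on_tts_started):
    tts = ToyGuardedTTS(mark_on_tts_started=mark_on_tts_started)
    tts.on_tts_started()
    tts.on_interruption()  # user interrupts before playback starts
    return tts.reconnects


if __name__ == "__main__":
    print(interrupt_during_synthesis(False), interrupt_during_synthesis(True))
```

With the guard set only on the later signal, the interruption during synthesis skips the reconnect. With the guard set at synthesis start, the reconnect happens, and an interruption while idle still does nothing.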

filipi87 · 2026-04-13T17:59:25Z

If that’s the case, we could address it more directly within the InterruptibleTTSService, by treating the presence of any audio context as an indication that the bot has started speaking. Something along these lines should be sufficient to prevent the race condition:

This has been fixed in this PR:

TTS improvements. #4145

fix: use audio context hook for InterruptibleTTSService

7bc1d33

omChauhanDev changed the title ~~fix: use audio context hook for InterruptibleTTSService~~ use audio context hook for InterruptibleTTSService Mar 21, 2026

added changelog

ae96181

markbackman requested a review from filipi87 March 26, 2026 01:57

filipi87 requested changes Mar 26, 2026

filipi87 closed this Apr 13, 2026

omChauhanDev wants to merge 2 commits into pipecat-ai:main from omChauhanDev:fix/interruptible-tts-bot-speaking-race
