Describe the bug
When running the example basic_agent.ts with the STT model deepgram/nova-3 or deepgram/nova-2, the agent's own audio output interrupts the agent. It says the first few words of a reply and then cuts itself off.
Here is an example: https://www.loom.com/share/53f4d5584eb54d0cbf74ba920c4a020b
Relevant log output
[17:54:38.597] INFO (11136): STT metrics
audioDurationMs: 5150
[17:54:38.598] INFO (11136): speech interrupted by audio activity
speech id: "speech_b6334c2e-ecc"
[17:54:38.600] DEBUG (11136): Aborting all pipeline reply tasks due to interruption
speech_id: "speech_b6334c2e-ecc"
[17:54:38.601] DEBUG (11136): Task.runTask: task performToolExecutions done
[17:54:38.601] DEBUG (11136): Task.runTask: task performLLMInference done
[17:54:38.602] DEBUG (11136): Task.runTask: task performTextForwarding done
[17:54:38.603] DEBUG (11136): firstFrameFut cancelled before first frame
[17:54:38.603] DEBUG (11136): Task.runTask: task inference-tts-ws-listener done
[17:54:38.603] DEBUG (11136): Task.runTask: task performAudioForwarding done
[17:54:38.603] DEBUG (11136): Task.runTask: task inference-tts-recv done
[17:54:38.603] DEBUG (11136): Task.runTask: task inference-tts-input done
[17:54:38.604] DEBUG (11136): Task.runTask: task inference-tts-sentence done
[17:54:38.604] DEBUG (11136): Task.runTask: task performTTSInference done
[17:54:38.605] INFO (11136): playout completed with interrupt
speech_id: "speech_b6334c2e-ecc"
message: ""
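To confirm from a captured log how often the agent interrupts itself, something like the sketch below can pull out the affected speech ids. It assumes the log layout shown above, where the speech id is printed on the line after the "speech interrupted by audio activity" message. findInterruptedSpeechIds is a hypothetical helper written for this report, not part of @livekit/agents.

```typescript
// Hypothetical helper (not part of @livekit/agents): scans log text in the
// layout shown above and returns the speech ids reported as interrupted.
function findInterruptedSpeechIds(logText: string): string[] {
  const lines = logText.split(/\r?\n/);
  const ids: string[] = [];
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i] ?? '';
    if (!line.includes('speech interrupted by audio activity')) continue;
    // Assumption: the id is on the next line, as `speech id: "..."` or `speech_id: "..."`.
    const next = lines[i + 1] ?? '';
    const id = /speech[ _]id:\s*"([^"]+)"/.exec(next)?.[1];
    if (id !== undefined) ids.push(id);
  }
  return ids;
}

// Sample taken from the log output above.
const sample = [
  '[17:54:38.598] INFO (11136): speech interrupted by audio activity',
  'speech id: "speech_b6334c2e-ecc"',
  '[17:54:38.605] INFO (11136): playout completed with interrupt',
  'speech_id: "speech_b6334c2e-ecc"',
].join('\n');

console.log(findInterruptedSpeechIds(sample));
```

Only the "speech interrupted by audio activity" entries are counted, so the later "playout completed with interrupt" line for the same speech is not reported twice.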
Describe your environment
Running latest code on your main branch.
System:
OS: macOS 15.1
CPU: (12) arm64 Apple M4 Pro
Memory: 772.67 MB / 24.00 GB
Shell: 5.9 - /bin/zsh
Binaries:
Node: 22.21.1 - /Users/x/.nvm/versions/node/v22.21.1/bin/node
npm: 10.9.4 - /Users/x/.nvm/versions/node/v22.21.1/bin/npm
pnpm: 9.7.0 - /opt/homebrew/bin/pnpm
npmPackages:
@livekit/agents: workspace:* => 1.0.39
@livekit/agents-plugin-anam: workspace:* => 1.0.39
@livekit/agents-plugin-baseten: workspace:* => 1.0.39
@livekit/agents-plugin-bey: workspace:* => 1.0.39
@livekit/agents-plugin-cartesia: workspace:* => 1.0.39
@livekit/agents-plugin-deepgram: workspace:* => 1.0.39
@livekit/agents-plugin-elevenlabs: workspace:* => 1.0.39
@livekit/agents-plugin-google: workspace:* => 1.0.39
@livekit/agents-plugin-hedra: workspace:* => 1.0.39
@livekit/agents-plugin-inworld: workspace:* => 1.0.39
@livekit/agents-plugin-lemonslice: workspace:* => 1.0.39
@livekit/agents-plugin-livekit: workspace:* => 1.0.39
@livekit/agents-plugin-neuphonic: workspace:* => 1.0.39
@livekit/agents-plugin-openai: workspace:* => 1.0.39
@livekit/agents-plugin-resemble: workspace:* => 1.0.39
@livekit/agents-plugin-silero: workspace:* => 1.0.39
@livekit/agents-plugin-xai: workspace:* => 1.0.39
@livekit/noise-cancellation-node: ^0.1.9 => 0.1.9
@livekit/plugins-ai-coustics: 0.1.7 => 0.1.7
@livekit/rtc-node: catalog: => 0.13.24
Minimal reproducible example
Run basic_agent.ts but swap out stt: 'assemblyai/universal-streaming:en' with stt: 'deepgram/nova-3', so the session is created like this:
const session = new voice.AgentSession({
  stt: 'deepgram/nova-3',
  llm: 'openai/gpt-4.1-mini',
  tts: 'cartesia/sonic-2:9626c31c-bec5-4cca-baa8-f8ba9e84c8bc',
  vad: ctx.proc.userData.vad! as silero.VAD,
  turnDetection: new livekit.turnDetector.MultilingualModel(),
  voiceOptions: {
    preemptiveGeneration: true,
  },
  connOptions: {
    llmConnOptions: {
      maxRetry: 1,
      retryIntervalMs: 2000,
      timeoutMs: 60000,
    },
  },
});
Additional information
Other STT models work fine. This seems isolated to Deepgram (but I only tested a few).
When using the Python LiveKit agent code, there is no issue with Deepgram.