AgentSession using Deepgram STT continuously interrupts itself #1009

@jp-lemon

Describe the bug

When running the example basic_agent.ts with the STT model deepgram/nova-3 or deepgram/nova-2, the agent is interrupted by its own audio output: it says the first few words and then cuts itself off.
Here is an example: https://www.loom.com/share/53f4d5584eb54d0cbf74ba920c4a020b

Relevant log output

[17:54:38.597] INFO (11136): STT metrics
    audioDurationMs: 5150
[17:54:38.598] INFO (11136): speech interrupted by audio activity
    speech id: "speech_b6334c2e-ecc"
[17:54:38.600] DEBUG (11136): Aborting all pipeline reply tasks due to interruption
    speech_id: "speech_b6334c2e-ecc"
[17:54:38.601] DEBUG (11136): Task.runTask: task performToolExecutions done
[17:54:38.601] DEBUG (11136): Task.runTask: task performLLMInference done
[17:54:38.602] DEBUG (11136): Task.runTask: task performTextForwarding done
[17:54:38.603] DEBUG (11136): firstFrameFut cancelled before first frame
[17:54:38.603] DEBUG (11136): Task.runTask: task inference-tts-ws-listener done
[17:54:38.603] DEBUG (11136): Task.runTask: task performAudioForwarding done
[17:54:38.603] DEBUG (11136): Task.runTask: task inference-tts-recv done
[17:54:38.603] DEBUG (11136): Task.runTask: task inference-tts-input done
[17:54:38.604] DEBUG (11136): Task.runTask: task inference-tts-sentence done
[17:54:38.604] DEBUG (11136): Task.runTask: task performTTSInference done
[17:54:38.605] INFO (11136): playout completed with interrupt
    speech_id: "speech_b6334c2e-ecc"
    message: ""
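
To see how often the self-interruption occurs over a longer run, the log can be scanned for the "speech interrupted by audio activity" event shown above. The following is only a minimal sketch: `findInterruptions` and the `Interruption` type are hypothetical helpers written for this report, not part of @livekit/agents, and they assume the log keeps the line format shown in the excerpt above.

```typescript
// Hypothetical helper (not part of @livekit/agents): scans log text in the
// format shown in this issue and lists every interruption event.
interface Interruption {
  timestamp: string; // e.g. "17:54:38.598"
  speechId: string | null; // null if no id line follows the event
}

function findInterruptions(log: string): Interruption[] {
  const lines = log.split(/\r?\n/);
  const results: Interruption[] = [];
  const eventRe =
    /^\[(\d{2}:\d{2}:\d{2}\.\d{3})\]\s+INFO.*speech interrupted by audio activity/;
  // The id appears as either "speech id:" or "speech_id:" in the log.
  const idRe = /speech[ _]id:\s*"([^"]+)"/;

  for (let i = 0; i < lines.length; i++) {
    const event = eventRe.exec(lines[i] ?? '');
    if (!event) continue;
    // The speech id is logged on the indented line right after the event.
    const id = idRe.exec(lines[i + 1] ?? '');
    results.push({
      timestamp: event[1] ?? '',
      speechId: id ? (id[1] ?? null) : null,
    });
  }
  return results;
}

// Usage with a fragment of the log from this issue.
const sample = [
  '[17:54:38.597] INFO (11136): STT metrics',
  '    audioDurationMs: 5150',
  '[17:54:38.598] INFO (11136): speech interrupted by audio activity',
  '    speech id: "speech_b6334c2e-ecc"',
  '[17:54:38.605] INFO (11136): playout completed with interrupt',
].join('\n');

console.log(findInterruptions(sample));
```

Running this over a full session log gives one entry per interruption, which makes it easier to compare runs with different STT models.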

Describe your environment

Running the latest code on your main branch.

  System:
    OS: macOS 15.1
    CPU: (12) arm64 Apple M4 Pro
    Memory: 772.67 MB / 24.00 GB
    Shell: 5.9 - /bin/zsh
  Binaries:
    Node: 22.21.1 - /Users/x/.nvm/versions/node/v22.21.1/bin/node
    npm: 10.9.4 - /Users/x/.nvm/versions/node/v22.21.1/bin/npm
    pnpm: 9.7.0 - /opt/homebrew/bin/pnpm
  npmPackages:
    @livekit/agents: workspace:* => 1.0.39
    @livekit/agents-plugin-anam: workspace:* => 1.0.39
    @livekit/agents-plugin-baseten: workspace:* => 1.0.39
    @livekit/agents-plugin-bey: workspace:* => 1.0.39
    @livekit/agents-plugin-cartesia: workspace:* => 1.0.39
    @livekit/agents-plugin-deepgram: workspace:* => 1.0.39
    @livekit/agents-plugin-elevenlabs: workspace:* => 1.0.39
    @livekit/agents-plugin-google: workspace:* => 1.0.39
    @livekit/agents-plugin-hedra: workspace:* => 1.0.39
    @livekit/agents-plugin-inworld: workspace:* => 1.0.39
    @livekit/agents-plugin-lemonslice: workspace:* => 1.0.39
    @livekit/agents-plugin-livekit: workspace:* => 1.0.39
    @livekit/agents-plugin-neuphonic: workspace:* => 1.0.39
    @livekit/agents-plugin-openai: workspace:* => 1.0.39
    @livekit/agents-plugin-resemble: workspace:* => 1.0.39
    @livekit/agents-plugin-silero: workspace:* => 1.0.39
    @livekit/agents-plugin-xai: workspace:* => 1.0.39
    @livekit/noise-cancellation-node: ^0.1.9 => 0.1.9
    @livekit/plugins-ai-coustics: 0.1.7 => 0.1.7
    @livekit/rtc-node: catalog: => 0.13.24

Minimal reproducible example

Run basic_agent.ts, but replace stt: 'assemblyai/universal-streaming:en' with stt: 'deepgram/nova-3', so that the session is created like this:

const session = new voice.AgentSession({
  stt: 'deepgram/nova-3',
  llm: 'openai/gpt-4.1-mini',
  tts: 'cartesia/sonic-2:9626c31c-bec5-4cca-baa8-f8ba9e84c8bc',
  vad: ctx.proc.userData.vad! as silero.VAD,
  turnDetection: new livekit.turnDetector.MultilingualModel(),
  voiceOptions: {
    preemptiveGeneration: true,
  },
  connOptions: {
    llmConnOptions: {
      maxRetry: 1,
      retryIntervalMs: 2000,
      timeoutMs: 60000,
    },
  },
});

Additional information

Other STT models work fine. The problem seems isolated to Deepgram, although I only tested a few models.
With the Python LiveKit agent code, there is no issue with Deepgram.

Labels: bug