Add barge-in for realtime voice playback#568

Merged
rogerchappel merged 4 commits into main from codex/realtime-voice-barge-in
May 21, 2026

Conversation

@rogerchappel
Owner

Summary

  • Adds a realtime relay cancelOutput action and maps it to OpenClaw talk.session.cancelOutput
  • Adds client helper coverage for cancelling realtime output
  • Detects speech while assistant audio is playing, stops local playback immediately, and sends barge-in cancellation to the gateway
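The detect, stop, and cancel flow in the last bullet could be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: `BargeInDetector`, `stopPlayback`, `sendCancelOutput`, the RMS threshold, and the consecutive-frame rule are all hypothetical names and assumptions chosen for the example.

```typescript
// Hypothetical sketch: every name below is illustrative and not taken from the PR.
type BargeInDeps = {
  stopPlayback: () => void; // halt local assistant audio right away
  sendCancelOutput: () => void; // tell the relay to cancel realtime output
};

type BargeInOptions = {
  rmsThreshold: number; // assumed energy level that counts as speech
  minSpeechFrames: number; // consecutive loud frames required before interrupting
};

// Root-mean-square energy of one microphone frame.
function rms(frame: Float32Array): number {
  if (frame.length === 0) return 0;
  let sum = 0;
  for (let i = 0; i < frame.length; i++) {
    sum += frame[i] * frame[i];
  }
  return Math.sqrt(sum / frame.length);
}

class BargeInDetector {
  private readonly deps: BargeInDeps;
  private readonly opts: BargeInOptions;
  private playing = false;
  private loudFrames = 0;

  constructor(deps: BargeInDeps, opts: BargeInOptions) {
    this.deps = deps;
    this.opts = opts;
  }

  // Call when assistant audio starts or finishes playing.
  setAssistantPlaying(playing: boolean): void {
    this.playing = playing;
    this.loudFrames = 0;
  }

  // Feed one microphone frame; returns true if this frame triggered a barge-in.
  onMicFrame(frame: Float32Array): boolean {
    if (!this.playing) return false;
    const loud = rms(frame) >= this.opts.rmsThreshold;
    this.loudFrames = loud ? this.loudFrames + 1 : 0;
    if (this.loudFrames < this.opts.minSpeechFrames) return false;
    // Stop local audio first so the interruption feels immediate,
    // then notify the gateway.
    this.playing = false;
    this.loudFrames = 0;
    this.deps.stopPlayback();
    this.deps.sendCancelOutput();
    return true;
  }
}
```

Requiring several consecutive loud frames is one simple way to avoid interrupting on a single click or cough; the actual detection logic in this PR may differ.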

Why

Realtime audio exchange already worked, but speaking over the assistant did not interrupt live playback. With barge-in, realtime voice behaves more like a normal conversation.

Local verification

  • pnpm test -- src/lib/gateway-client-realtime.test.ts 'src/app/api/runtimes/[id]/talk/realtime/relay/route.test.ts' src/lib/realtime-voice-client.test.ts src/lib/realtime-voice-audio.test.ts
  • pnpm typecheck
  • git diff --check origin/main..HEAD
  • pre-push: pnpm typecheck && pnpm build

Manual test notes

With NEXT_PUBLIC_CREWCMD_REALTIME_VOICE=1, start realtime voice and speak while the assistant is talking. Expected behavior:

  • current assistant audio stops immediately in the browser
  • /talk/realtime/relay receives action: "cancelOutput"
  • the next user utterance continues through the realtime relay without falling back to /api/stt
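For checking the second expectation, it may help to picture how a relay body with `action: "cancelOutput"` could be translated into the gateway call named in the summary. The sketch below is an assumption-laden illustration: only the action name and the `talk.session.cancelOutput` method name come from this PR; `mapRelayAction`, `GatewayCall`, and the `sessionId` field in the usage note are hypothetical.

```typescript
// Hypothetical shapes: only "cancelOutput" and "talk.session.cancelOutput"
// appear in the PR text; everything else is illustrative.
type GatewayCall = { method: string; params: Record<string, unknown> };

const ACTION_TO_METHOD = new Map<string, string>([
  ["cancelOutput", "talk.session.cancelOutput"],
]);

// Translate a relay request body into a gateway call, or return null
// if the body is malformed or the action is unknown.
function mapRelayAction(body: unknown): GatewayCall | null {
  if (typeof body !== "object" || body === null) return null;
  const { action, ...params } = body as Record<string, unknown>;
  if (typeof action !== "string") return null;
  const method = ACTION_TO_METHOD.get(action);
  if (method === undefined) return null;
  return { method, params };
}
```

Under these assumptions, `mapRelayAction({ action: "cancelOutput", sessionId: "s1" })` would yield a call to `talk.session.cancelOutput` with the remaining fields passed through as params, while an unrecognized action yields `null`.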

@rogerchappel rogerchappel merged commit 338910c into main May 21, 2026
1 check passed