Agent Voice Response - Speechmatics STS Service

This service connects AVR to the Speechmatics Flow API over WebSocket. It receives audio chunks from avr-core, forwards them to Speechmatics, and streams the generated audio back to avr-core in near real time.

Prerequisites

  • Node.js 18+ and npm
  • A valid Speechmatics API key with Flow access

Setup

1) Clone the repository

git clone https://github.com/agentvoiceresponse/avr-sts-speechmatics.git
cd avr-sts-speechmatics

2) Install dependencies

npm install

3) Configure environment variables

Copy .env.example to .env and set your values:

PORT=6040
SPEECHMATICS_API_KEY=your_speechmatics_api_key_here
SPEECHMATICS_REGION=eu

Required:

  • SPEECHMATICS_API_KEY: Speechmatics API key

Optional:

  • PORT: Local WebSocket server port (default: 6040)
  • SPEECHMATICS_REGION: Region used when obtaining the Speechmatics JWT for authentication (default: eu)
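
The required and optional variables above could be validated at startup along the following lines. This is a minimal sketch, not the service's actual code: `loadConfig` is a hypothetical helper name, and the port range check is an added assumption.

```javascript
// Hypothetical helper (not taken from the repository): reads the three
// documented variables and applies the documented defaults.
function loadConfig(env = process.env) {
  const apiKey = env.SPEECHMATICS_API_KEY;
  if (!apiKey) {
    // The README lists this variable as required.
    throw new Error("SPEECHMATICS_API_KEY is required");
  }

  // PORT is optional; fall back to the documented default of 6040.
  const port = Number.parseInt(env.PORT || "6040", 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    // Assumption: reject values outside the valid TCP port range.
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }

  // SPEECHMATICS_REGION is optional; the documented default is "eu".
  const region = env.SPEECHMATICS_REGION || "eu";

  return { apiKey, port, region };
}
```

Failing fast on a missing API key gives a clearer error than a rejected connection to Speechmatics later on.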

4) Run the service

npm start

Runtime behavior

  • Transport: WebSocket
  • Input audio expected from AVR: base64-encoded PCM s16le, 8 kHz
  • Output audio sent to AVR: base64-encoded PCM s16le, 8 kHz
  • Frame pacing: 20 ms frames (160 samples at 8000 Hz)
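
The frame arithmetic above works out to 160 samples × 2 bytes = 320 bytes per 20 ms frame. The sketch below shows one way to cut decoded PCM into frames of that size and wrap each frame as an outgoing audio message. The names `splitIntoFrames` and `encodeAudioMessage` are hypothetical and not taken from the repository, which may pace audio differently.

```javascript
const SAMPLE_RATE_HZ = 8000;
const FRAME_MS = 20;
const BYTES_PER_SAMPLE = 2; // s16le: 16-bit signed little-endian samples
const SAMPLES_PER_FRAME = (SAMPLE_RATE_HZ * FRAME_MS) / 1000; // 160
const FRAME_BYTES = SAMPLES_PER_FRAME * BYTES_PER_SAMPLE; // 320

// Hypothetical helper: splits a PCM Buffer into whole 20 ms frames and
// returns any trailing bytes so they can be prepended to the next chunk.
function splitIntoFrames(pcm) {
  const frames = [];
  let offset = 0;
  while (offset + FRAME_BYTES <= pcm.length) {
    frames.push(pcm.subarray(offset, offset + FRAME_BYTES));
    offset += FRAME_BYTES;
  }
  return { frames, remainder: pcm.subarray(offset) };
}

// Hypothetical helper: wraps one frame in the documented audio message shape.
function encodeAudioMessage(frame) {
  return JSON.stringify({ type: "audio", audio: frame.toString("base64") });
}
```

A sender could then emit one encoded frame every 20 ms, for example from a timer, so that playback on the AVR side is neither starved nor flooded.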

Compatibility

  • Input codec: pcm_s16le
  • Output codec: pcm_s16le
  • Sample rate: 8000 Hz
  • Recommended for AVR call paths that use alaw, ulaw, or slin16, with transcoding configured in avr-core

Client message protocol

Incoming messages from AVR:

{ "type": "init", "uuid": "call-uuid" }
{ "type": "audio", "audio": "<base64_pcm_chunk>" }

Outgoing messages to AVR:

{ "type": "audio", "audio": "<base64_pcm_chunk>" }
{ "type": "error", "message": "..." }
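
To illustrate the message shapes above, here is a minimal sketch of parsing incoming messages and building an error reply. `parseIncoming` and `buildErrorMessage` are hypothetical names, and the validation rules (non-empty `uuid`, string `audio`) are assumptions rather than documented behavior of the service.

```javascript
// Hypothetical parser for the two documented incoming message types.
// Returns { ok: true, ... } on success or { ok: false, error } otherwise.
function parseIncoming(raw) {
  let msg;
  try {
    msg = JSON.parse(raw);
  } catch {
    return { ok: false, error: "invalid JSON" };
  }
  if (msg === null || typeof msg !== "object") {
    return { ok: false, error: "message must be a JSON object" };
  }

  switch (msg.type) {
    case "init":
      // Assumption: a call UUID must be a non-empty string.
      if (typeof msg.uuid !== "string" || msg.uuid === "") {
        return { ok: false, error: "init requires a uuid" };
      }
      return { ok: true, type: "init", uuid: msg.uuid };
    case "audio":
      if (typeof msg.audio !== "string") {
        return { ok: false, error: "audio requires a base64 string" };
      }
      // Decode the base64 payload into raw PCM s16le bytes.
      return { ok: true, type: "audio", pcm: Buffer.from(msg.audio, "base64") };
    default:
      return { ok: false, error: `unsupported message type: ${String(msg.type)}` };
  }
}

// Builds the documented outgoing error message.
function buildErrorMessage(message) {
  return JSON.stringify({ type: "error", message });
}
```

For example, `parseIncoming('{"type":"init","uuid":"call-uuid"}')` yields an object with `type` set to `"init"` and `uuid` set to `"call-uuid"`, while a malformed payload can be answered with `buildErrorMessage(result.error)`.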

Scripts

  • npm start: run service
  • npm run start:dev: run with nodemon and inspector
  • npm run dc:build: build Docker image
  • npm run dc:push: push Docker image

Support and community

Support AVR

AVR is free and open-source. Donations are optional and do not unlock extra features or services.

Support us on Ko-fi

License

MIT License - see LICENSE.md.
