The Gemini Live API provides a speech-to-speech engine that can be integrated into FreeSWITCH using mod_audio_stream v1.0.3.
From the Gemini Live API docs:
When integrating with Live API, you'll need to choose one of the following implementation approaches:
- Server-to-server: Your backend connects to the Live API using WebSockets. Typically, your client sends stream data (audio, video, text) to your server, which then forwards it to the Live API.
- Client-to-server: Your frontend code connects directly to the Live API using WebSockets to stream data, bypassing your backend.
We use the server-to-server approach, which fits the mod_audio_stream architecture: FreeSWITCH streams call audio over a WebSocket to your server, and your server forwards it to the Live API.
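To make the server-to-server flow concrete, here is a minimal sketch of the relay logic. It is not the code in ws_gemini.js. The name `bridge`, and the `sendAudio`, `onAudio` and `close` methods on `liveSession`, are hypothetical stand-ins for whatever wraps the Live API connection. `callerSocket` is assumed to behave like a `ws` connection, which passes `(data, isBinary)` to its `message` handler. How the returned audio must be packaged for mod_audio_stream is left out.

```javascript
// Hypothetical wiring for the server-to-server approach.
// `callerSocket` stands for the WebSocket that mod_audio_stream opens to your
// server; `liveSession` stands for an object that talks to the Live API.
// Both interfaces are assumptions made for this sketch.
function bridge(callerSocket, liveSession) {
  // Forward only binary frames (audio) upstream; text frames are skipped.
  callerSocket.on('message', (data, isBinary) => {
    if (isBinary) liveSession.sendAudio(data);
  });

  // Audio coming back from the model is relayed to the caller side.
  liveSession.onAudio((chunk) => callerSocket.send(chunk));

  // Tear down the upstream session when the call leg disconnects.
  callerSocket.on('close', () => liveSession.close());
}
```

Keeping the relay in one small function makes it easy to exercise with stub objects before connecting real sockets.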
Quick Start
You can try it out with the provided ws_gemini.js, a simple WebSocket server written in Node.js. It reads GEMINI_API_KEY from the environment:
# Set your Gemini API key
export GEMINI_API_KEY=YourGeminiApiKey
Alternatively, you can modify the script to include your API key directly in the source code.
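If you keep the key in the environment, a fail-fast check at startup avoids confusing errors later. The helper below is an illustrative sketch only: `requireApiKey` is a hypothetical name and is not taken from ws_gemini.js.

```javascript
// Return the API key from the given environment, or throw a clear error.
// `env` defaults to process.env; passing it in keeps the function testable.
function requireApiKey(env = process.env) {
  const key = env.GEMINI_API_KEY;
  if (!key || key.trim() === '') {
    throw new Error('GEMINI_API_KEY is not set; export it before starting the server');
  }
  return key.trim();
}
```

Calling `requireApiKey()` before opening any sockets stops the server immediately when the variable is missing.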
Install required packages:
npm install @google/genai ws
Your package.json will look like this:
{
  "type": "module",
  "dependencies": {
    "@google/genai": "^1.29.1",
    "ws": "^8.18.3"
  }
}
The script uses ES module import syntax, so package.json must include "type": "module".
# Run the server
node ws_gemini.js
Important: Gemini Live API has specific audio format requirements:
- Input: 16-bit linear PCM (L16) audio at a 16kHz sample rate
- Output: audio returned at a 24kHz sample rate
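The two sample rates matter when sizing buffers and labelling audio chunks. The sketch below shows the arithmetic for mono 16-bit PCM and one way to tag a chunk with its rate. The function names and the `{ data, mimeType }` object shape are assumptions for illustration. Check the @google/genai documentation for the exact payload your SDK version expects.

```javascript
const INPUT_RATE_HZ = 16000;  // rate the Live API expects for input audio
const OUTPUT_RATE_HZ = 24000; // rate of the audio the Live API returns
const BYTES_PER_SAMPLE = 2;   // L16 means 16-bit samples; mono assumed

// Duration of a mono L16 buffer in milliseconds.
function pcmDurationMs(byteLength, sampleRateHz) {
  return ((byteLength / BYTES_PER_SAMPLE) * 1000) / sampleRateHz;
}

// Wrap raw PCM bytes as a base64 chunk tagged with its sample rate.
// The { data, mimeType } shape is an assumption made for this sketch.
function toAudioChunk(pcm, sampleRateHz = INPUT_RATE_HZ) {
  return {
    data: Buffer.from(pcm).toString('base64'),
    mimeType: `audio/pcm;rate=${sampleRateHz}`,
  };
}

const inputFrame = Buffer.alloc(640); // 320 samples of silence
console.log(pcmDurationMs(inputFrame.length, INPUT_RATE_HZ)); // 20
console.log(pcmDurationMs(960, OUTPUT_RATE_HZ));              // 20
console.log(toAudioChunk(inputFrame).mimeType);               // audio/pcm;rate=16000
```

The same 20 ms of audio takes 640 bytes at 16kHz but 960 bytes at 24kHz, which is why input and output buffers cannot be treated interchangeably.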
FreeSWITCH Configuration
On the module side, set the STREAM_PLAYBACK channel variable to true or 1 to enable automatic playback, and start streaming at 16kHz with the API call:
uuid_audio_stream <uuid> start ws://host:3001 mono 16k
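If you issue these commands from a script, a small helper can assemble them for a given call. This sketch assumes the variable is set at runtime with the standard FreeSWITCH `uuid_setvar` command rather than in the dialplan. `buildStreamCommands` is a hypothetical name, and `uuid` and `wsUrl` are placeholders you supply.

```javascript
// Build the FreeSWITCH API commands for one call leg.
// Assumption: STREAM_PLAYBACK is set via uuid_setvar before streaming starts.
function buildStreamCommands(uuid, wsUrl) {
  return [
    `uuid_setvar ${uuid} STREAM_PLAYBACK true`,
    `uuid_audio_stream ${uuid} start ${wsUrl} mono 16k`,
  ];
}

// Example with placeholder values.
for (const cmd of buildStreamCommands('<uuid>', 'ws://host:3001')) {
  console.log(cmd);
}
```

The strings can then be sent through whichever control channel you already use, for example fs_cli or the event socket.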