jambonz Agent Toolkit

Build voice applications on jambonz with AI-assisted development.

This monorepo contains two packages:

Package	Description
`@jambonz/sdk`	TypeScript SDK for building jambonz webhook and WebSocket voice applications
`@jambonz/mcp-schema-server`	MCP server that gives AI coding assistants deep knowledge of jambonz APIs

Quick Start

npm install @jambonz/sdk

Webhook app (Express)

import express from 'express';
import { WebhookResponse } from '@jambonz/sdk/webhook';

const app = express();
app.use(express.json());

app.post('/incoming', (req, res) => {
  const jambonz = new WebhookResponse();
  jambonz
    .say({ text: 'Hello from jambonz!' })
    .gather({
      input: ['speech', 'digits'],
      actionHook: '/handle-input',
      say: { text: 'Press 1 for sales or 2 for support.' },
    })
    .hangup();

  res.json(jambonz);
});

app.listen(3000);

WebSocket app

import http from 'http';
import { createEndpoint } from '@jambonz/sdk/websocket';

const server = http.createServer();
const makeService = createEndpoint({ server, port: 3000 });

const svc = makeService({ path: '/' });

svc.on('session:new', (session) => {
  session
    .on('/gather-result', (evt) => {
      const transcript = evt.speech?.alternatives?.[0]?.transcript || '';
      session.say({ text: `You said: ${transcript}` }).hangup().reply();
    });

  session
    .say({ text: 'Hello! Say something.' })
    .gather({ input: ['speech'], actionHook: '/gather-result', timeout: 10 })
    .hangup()
    .send();
});

Handling audio streams (listen/stream verb)

When you use a listen or stream verb, jambonz opens a separate WebSocket connection to deliver real-time call audio. The audio WebSocket URL can point anywhere — a separate server, a different process, or a third-party service.

However, if you want to handle both the call control and the audio stream in a single application, makeService.audio() lets you register an audio WebSocket handler on the same server as the control pipe:

import http from 'http';
import { createEndpoint } from '@jambonz/sdk/websocket';

const server = http.createServer();
const makeService = createEndpoint({ server, port: 3000 });

// Control pipe — handles call sessions
const svc = makeService({ path: '/' });

// Audio pipe — receives audio from listen/stream verbs
const audioSvc = makeService.audio({ path: '/audio-stream' });

svc.on('session:new', (session) => {
  session
    .say({ text: 'Listening...' })
    .listen({
      url: '/audio-stream',  // relative path — jambonz connects back to same server
      sampleRate: 8000,
      bidirectionalAudio: {
        enabled: true,
        streaming: true,
        sampleRate: 8000,
      },
    })
    .send();
});

audioSvc.on('connection', (stream) => {
  console.log(`Audio connected: ${stream.callSid} @ ${stream.sampleRate}Hz`);

  // Receive audio (L16 PCM binary frames)
  stream.on('audio', (pcm: Buffer) => {
    // Process audio — e.g. feed to an STT engine
  });

  // Send audio back (streaming mode — raw binary PCM)
  stream.sendAudio(pcmBuffer);

  // Or send audio back (non-streaming mode — base64-encoded file)
  stream.playAudio(base64Content, {
    audioContentType: 'raw',  // or 'wav'
    sampleRate: 16000,
  });

  stream.on('close', () => console.log('Audio stream closed'));
});

The AudioStream object also provides killAudio(), disconnect(), sendMark(name), and clearMarks() methods. See the Audio WebSocket section in AGENTS.md for full API details.

AI-Assisted Development

The @jambonz/mcp-schema-server package is an MCP (Model Context Protocol) server that gives AI coding assistants — Claude, Cursor, GitHub Copilot, Windsurf, and others — deep knowledge of jambonz APIs, verb schemas, and SDK usage patterns. This means the AI can generate correct jambonz application code without you having to manually explain the API.

What it provides

The MCP server exposes two tools to the AI:

jambonz_developer_toolkit — A comprehensive developer guide covering the SDK API, verb model, webhook/WebSocket transports, actionHook lifecycle, mid-call control, recording, and working code examples.
get_jambonz_schema — Full JSON Schema for any jambonz verb (say, gather, dial, llm, etc.), component (recognizer, synthesizer, target, etc.), or actionHook callback payload.

When you ask the AI to build a jambonz application, it calls these tools automatically to get the context it needs.

Setup

Choose the setup that matches your development environment. You only need one.

Claude Code (CLI)

Add to your project's .mcp.json:

{
  "mcpServers": {
    "jambonz": {
      "command": "npx",
      "args": ["-y", "@jambonz/mcp-schema-server"]
    }
  }
}

Or run interactively:

claude mcp add jambonz -- npx -y @jambonz/mcp-schema-server

Claude Desktop

Open Settings > Developer > Edit Config and add to mcpServers:

{
  "mcpServers": {
    "jambonz": {
      "command": "npx",
      "args": ["-y", "@jambonz/mcp-schema-server"]
    }
  }
}

Restart Claude Desktop after saving.

Cursor

Open Cursor Settings > MCP and add a new server:

Name: jambonz
Type: command
Command: npx -y @jambonz/mcp-schema-server

Or add to your project's .cursor/mcp.json:

{
  "mcpServers": {
    "jambonz": {
      "command": "npx",
      "args": ["-y", "@jambonz/mcp-schema-server"]
    }
  }
}

VS Code (GitHub Copilot / Claude Extension)

Add to your workspace's .vscode/mcp.json:

{
  "servers": {
    "jambonz": {
      "command": "npx",
      "args": ["-y", "@jambonz/mcp-schema-server"]
    }
  }
}

Windsurf

Open Windsurf Settings > MCP and add:

{
  "mcpServers": {
    "jambonz": {
      "command": "npx",
      "args": ["-y", "@jambonz/mcp-schema-server"]
    }
  }
}

Verifying it works

After configuring the MCP server, start a new conversation with your AI assistant and ask it to build a jambonz application. For example:

"Create a jambonz WebSocket app that answers a call, asks for the caller's name using speech recognition, and greets them by name."

The AI should automatically call the jambonz_developer_toolkit tool, then generate correct code using @jambonz/sdk with proper session.on() actionHook handling, .send() for the initial verbs, and .reply() for subsequent responses.

If the AI generates code using the old @jambonz/node-client-ws package, raw JSON arrays without the SDK, or uses .send() where .reply() is needed, the MCP server is not connected. Check your configuration and restart the AI tool.

API Documentation

Full API reference documentation is available at jambonz.github.io/node-sdk.

The reference covers all exported classes, methods, properties, and types — including Session, WebhookResponse, AudioStream, JambonzClient, and VerbBuilder.

Docs are auto-generated from TSDoc comments in the source using TypeDoc and deployed to GitHub Pages on every push to main. To generate locally: cd typescript && npm run docs.

Testing

The SDK has a test suite built on Vitest.

cd typescript
npm test              # run all tests once
npm run test:watch    # re-run on file changes

To generate a coverage report:

npx vitest run --coverage

Tests live in typescript/test/ and cover the webhook response builder, signature verification, environment variable middleware, REST API client, JSON schema validation, and schema drift detection.

SDK Reference

Imports

// Webhook apps (Express/HTTP)
import { WebhookResponse } from '@jambonz/sdk/webhook';

// WebSocket apps
import { createEndpoint } from '@jambonz/sdk/websocket';

// REST API client (mid-call control, outbound calls)
import { JambonzClient } from '@jambonz/sdk/client';

Verb methods

Both WebhookResponse and WebSocket Session support the same chainable verb methods:

.say() .play() .gather() .dial() .llm() .conference() .enqueue() .dequeue() .hangup() .pause() .redirect() .config() .tag() .dtmf() .listen() .transcribe() .message() .stream() .pipeline() .dub() .alert() .answer() .leave() .sipDecline() .sipRefer() .sipRequest()

All methods accept the same options as the corresponding verb JSON schemas and are chainable.

Webhook: sending responses

const jambonz = new WebhookResponse();
jambonz.say({ text: 'Hello' }).hangup();
res.json(jambonz);  // Express response

WebSocket: .send() vs .reply()

.send() — Use once for the initial verb array in response to session:new.
.reply() — Use for all subsequent responses to actionHook events.

svc.on('session:new', (session) => {
  // Bind actionHook handlers first
  session.on('/my-hook', (evt) => {
    session.say({ text: 'Got it.' }).reply();  // .reply() for actionHooks
  });

  // Send initial verbs
  session.gather({ actionHook: '/my-hook', input: ['speech'] }).send();  // .send() once
});

REST API client

import { JambonzClient } from '@jambonz/sdk/client';

const client = new JambonzClient({ baseUrl: 'https://api.jambonz.us', accountSid, apiKey });

// Outbound call
await client.calls.create({
  from: '+15085551212',
  to: { type: 'phone', number: '+15085551213' },
  call_hook: '/incoming',
});

// Mid-call control
await client.calls.mute(callSid, 'mute');
await client.calls.redirect(callSid, 'https://example.com/new-flow');

Examples

See the examples/ directory:

Example	Transport	Description
hello-world	Webhook + WS	Minimal greeting
echo	Webhook + WS	Speech echo using gather with actionHook
ivr-menu	Webhook	Interactive menu with speech and DTMF
dial	Webhook	Outbound dial to a phone number
listen-record	Webhook	Record audio via WebSocket stream
voice-agent	Webhook + WS	LLM-powered conversational AI with tool calls
openai-realtime	WebSocket	OpenAI Realtime API voice agent
deepgram-voice-agent	WebSocket	Deepgram Voice Agent API
llm-streaming	WebSocket	Anthropic LLM with TTS streaming and barge-in
queue-with-hold	Webhook + WS	Call queue with hold music
call-recording	Webhook + WS	Mid-call recording control

Repository Structure

node-sdk/
├── typescript/          # @jambonz/sdk — the TypeScript SDK
├── mcp-server/          # @jambonz/mcp-schema-server — MCP server for AI assistants
├── schema/
│   ├── verbs/           # JSON Schema for each jambonz verb
│   ├── components/      # JSON Schema for shared types (recognizer, synthesizer, etc.)
│   └── callbacks/       # JSON Schema for actionHook callback payloads
├── examples/            # Working example applications
└── AGENTS.md            # Developer guide (served by the MCP server)

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
.github/workflows		.github/workflows
docs/verbs		docs/verbs
examples		examples
mcp-server		mcp-server
schema		schema
typescript		typescript
.gitignore		.gitignore
AGENTS.md		AGENTS.md
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

jambonz Agent Toolkit

Quick Start

Webhook app (Express)

WebSocket app

Handling audio streams (listen/stream verb)

AI-Assisted Development

What it provides

Setup

Claude Code (CLI)

Claude Desktop

Cursor

VS Code (GitHub Copilot / Claude Extension)

Windsurf

Verifying it works

API Documentation

Testing

SDK Reference

Imports

Verb methods

Webhook: sending responses

WebSocket: .send() vs .reply()

REST API client

Examples

Repository Structure

License

About

Uh oh!

Releases

Sponsor this project

Uh oh!

Packages

Uh oh!

Contributors 2

Languages

Uh oh!

Folders and files

Latest commit

History

Repository files navigation

jambonz Agent Toolkit

Quick Start

Webhook app (Express)

WebSocket app

Handling audio streams (listen/stream verb)

AI-Assisted Development

What it provides

Setup

Claude Code (CLI)

Claude Desktop

Cursor

VS Code (GitHub Copilot / Claude Extension)

Windsurf

Verifying it works

API Documentation

Testing

SDK Reference

Imports

Verb methods

Webhook: sending responses

WebSocket: .send() vs .reply()

REST API client

Examples

Repository Structure

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Sponsor this project

Uh oh!

Packages 0

Uh oh!

Contributors 2

Languages

Packages