architecture/elevenlabs-ruby

ElevenLabs Ruby SDK

This gem mirrors the public interface of elevenlabs-python and provides a Ruby-native client with the same resource tree, streaming helpers, and request semantics. The client is generated directly from the upstream SDK, so keeping pace with new endpoints only requires re-running the included extraction script.

Installation

This gem is published to GitHub Packages (not RubyGems.org). Add the GitHub Packages source to your Gemfile:

source "https://rubygems.pkg.github.com/architecture" do
  gem "elevenlabs", "0.3.4"
end

GitHub Packages requires authentication. Create a personal access token with the read:packages scope and add it to ~/.gem/credentials:

---
:github: Bearer <YOUR_TOKEN>

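Then run bundle install. If Bundler still reports an authentication error, GitHub's documentation describes configuring Bundler directly with bundle config https://rubygems.pkg.github.com/architecture USERNAME:TOKEN.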

Note: The gem install elevenlabs command shown on the GitHub Packages page points to RubyGems.org and will not work, because this gem is not published there.

Alternative: install directly from GitHub (no registry auth required)

Bundler can pull the gem straight from the git repository. This works for public repos without any token setup:

# Pin to a release tag (recommended for production)
gem "elevenlabs", git: "https://github.com/architecture/elevenlabs-ruby", tag: "v0.3.4"

# Or track the latest main branch
gem "elevenlabs", git: "https://github.com/architecture/elevenlabs-ruby", branch: "main"

Then run bundle install.

Quick start

require "elevenlabs"

client = ElevenLabs::Client.new(api_key: ENV.fetch("ELEVENLABS_API_KEY"))

# Mirrors client.history.list(...) from Python
history = client.history.list(page_size: 5)
history["history"].each do |item|
  puts "#{item["voice_name"]}: #{item["text"]}"
end

Every namespace from the Python SDK shows up at the same path:

client.voices.get_all
client.text_to_speech.convert("voice_id", text: "Hello!")
client.conversational_ai.agents.list
client.workspace.invites.create(email: "teammate@example.com")

Optional parameters default to the ElevenLabs::OMIT sentinel. Pass nil to send an explicit JSON null, or leave the argument out to omit the field from the payload entirely.
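
The difference between "not passed" and "passed as nil" can be sketched with a sentinel object. This is a minimal illustration, not the gem's implementation: OMIT and build_payload below are hypothetical stand-ins defined locally so the snippet runs without the gem.

```ruby
require "json"

# Hypothetical stand-in for ElevenLabs::OMIT; the gem's real sentinel may differ.
OMIT = Object.new.freeze

# Builds a request body, dropping any key whose value is still the sentinel.
# A value of nil is kept, so it serializes as JSON null.
def build_payload(text:, language_code: OMIT, seed: OMIT)
  params = { text: text, language_code: language_code, seed: seed }
  params.reject { |_key, value| value.equal?(OMIT) }
end

skipped  = build_payload(text: "Hello")            # optional fields left out
explicit = build_payload(text: "Hello", seed: nil) # seed sent as null

puts JSON.generate(skipped)   # {"text":"Hello"}
puts JSON.generate(explicit)  # {"text":"Hello","seed":null}
```

Comparing with equal? (object identity) rather than == means no caller-supplied value can be mistaken for the sentinel.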

Available resources

Every top-level namespace described in lib/elevenlabs/spec.json is available under client. That includes:

audio_isolation, audio_native, conversational_ai, dubbing, forced_alignment, history, models, music, pronunciation_dictionaries, samples, service_accounts, speech_to_speech, speech_to_text, studio, text_to_dialogue, text_to_sound_effects, text_to_speech, text_to_voice, tokens, usage, user, voices, webhooks, workspace.

Below are example snippets that demonstrate each namespace. Substitute IDs and payloads with real values from your account.

# audio_isolation
clean = client.audio_isolation.convert(audio: ElevenLabs::Upload.from_path("noisy.wav"))

# audio_native
client.audio_native.projects.list

# conversational_ai
client.conversational_ai.agents.list(page_size: 10)

# dubbing
client.dubbing.transcript.create(
  project_id: "proj_123",
  source_language: "en",
  target_languages: ["es"]
)

# forced_alignment
client.forced_alignment.jobs.create(audio: ElevenLabs::Upload.from_path("clip.wav"))

# history
client.history.list(page_size: 20)

# models
client.models.list

# music
client.music.composition_plan.create(prompt: "lofi chill beats")
client.music.upload(file: ElevenLabs::Upload.from_path("track.mp3"))

# pronunciation_dictionaries
client.pronunciation_dictionaries.list
client.pronunciation_dictionaries.rules.set(
  pronunciation_dictionary_id: "dict_123",
  rules: [{ "type" => "phoneme", "string_to_replace" => "ElevenLabs", "phoneme" => "ɛlɛvənlæbz", "alphabet" => "ipa" }]
)

# samples
client.samples.list

# service_accounts
client.service_accounts.api_keys.list

# speech_to_speech
client.speech_to_speech.convert(
  voice_id: "voice_123",
  audio: ElevenLabs::Upload.from_path("input.wav")
)

# speech_to_text
client.speech_to_text.convert(model_id: "scribe_v1", file: ElevenLabs::Upload.from_path("meeting.mp3"))

# studio
client.studio.projects.list

# text_to_dialogue
client.text_to_dialogue.convert(
  inputs: [
    { "voice_id" => "voice_a", "text" => "Hi!" },
    { "voice_id" => "voice_b", "text" => "Hey there!" }
  ]
)

# text_to_sound_effects
client.text_to_sound_effects.convert(text: "city street ambience")

# text_to_speech
client.text_to_speech.convert("voice_id", text: "Hello world")

# text_to_voice
client.text_to_voice.create(text: "Generate new voice preview")

# tokens
client.tokens.single_use.create(
  voice_id: "voice_123",
  usage_limit: 3
)

# usage
client.usage.get

# user
client.user.get

# voices
client.voices.get_all

# webhooks
client.webhooks.list

# workspace
client.workspace.members.list

Each namespace exposes the full set of nested resources (for example client.conversational_ai.knowledge_base.documents.create) exactly as defined in the Python SDK.

Common workflows

Generate speech

stream = client.text_to_speech.convert(
  "pNInz6obpgDQGcFmaJgB",
  text: "Welcome to ElevenLabs Ruby!",
  output_format: "mp3_44100_128",
  model_id: "eleven_monolingual_v1"
)

File.open("welcome.mp3", "wb") { |f| stream.each { |chunk| f.write(chunk) } }

Stream live dialogue

stream = client.text_to_dialogue.convert(
  inputs: [
    { "voice_id" => "voice_a", "text" => "Hello, how are you?" },
    { "voice_id" => "voice_b", "text" => "Doing great, thanks!" }
  ],
  model_id: "eleven_dialogue_v1"
)

File.open("dialogue.wav", "wb") { |f| stream.each { |chunk| f.write(chunk) } }

Transcribe audio

upload = ElevenLabs::Upload.from_path("meeting.m4a", content_type: "audio/mp4a-latm")

transcript = client.speech_to_text.convert(
  model_id: "scribe_v1",
  file: upload,
  language_code: "en",
  diarize: true
)

puts transcript["text"]

Manage conversational agents

agents = client.conversational_ai.agents.list(page_size: 10)
agents["agents"].each do |agent|
  puts "#{agent["agent_id"]} => #{agent["name"]}"
end

client.conversational_ai.agents.link.create(
  agent_id: agents["agents"].first["agent_id"],
  workspace_group_id: "group_123"
)

Invite teammates

client.workspace.invites.create(email: "teammate@example.com", role: "member")

Streaming audio

Streaming endpoints return Ruby Enumerators, so you can write the same chunk-by-chunk buffering logic you would use with the Python SDK's iterators:

stream = client.text_to_speech.convert("voice_id", text: "Streaming example")

File.open("hello.mp3", "wb") do |file|
  stream.each { |chunk| file.write(chunk) }
end

Use request_options: { chunk_size: 4096 } to tweak streaming chunk sizes.
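
To see what a chunked Enumerator looks like from the consumer's side, here is a self-contained sketch. chunk_stream is a hypothetical helper, not part of the gem, and a StringIO stands in for the HTTP response body.

```ruby
require "stringio"

# Hypothetical helper: wraps any IO in an Enumerator that yields fixed-size chunks.
# IO#read(n) returns nil at end of input, which ends the loop.
def chunk_stream(io, chunk_size: 4096)
  Enumerator.new do |yielder|
    while (chunk = io.read(chunk_size))
      yielder << chunk
    end
  end
end

source = StringIO.new("x" * 10_000) # pretend this is the audio response body
sink = StringIO.new                 # pretend this is the output file
sizes = []

chunk_stream(source, chunk_size: 4096).each do |chunk|
  sizes << chunk.bytesize
  sink.write(chunk)
end

p sizes # [4096, 4096, 1808]
```

Because chunks are produced only as the consumer asks for them, the full payload never has to sit in memory at once.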

Request options

Each call accepts a request_options: hash mirroring the Python SDK:

require "securerandom"

client.history.list(
  page_size: 25,
  request_options: {
    timeout_in_seconds: 30,
    additional_headers: { "x-trace-id" => SecureRandom.uuid },
    additional_query_parameters: { debug: true }
  }
)

Entries in additional_body_parameters are merged into JSON or form payloads, which gives you a consistent escape hatch when the API adds fields before the SDK is regenerated.
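
One way to picture the merge is the sketch below. apply_request_options is a hypothetical function written for this illustration; the gem's internal merge step is not shown in this README and may behave differently (for example in how key conflicts are resolved).

```ruby
# Hypothetical merge step: each additional_* hash is layered over the
# values the SDK computed for the call. On key conflicts the additional value wins.
def apply_request_options(body:, query:, headers:, request_options: {})
  {
    body: body.merge(request_options.fetch(:additional_body_parameters, {})),
    query: query.merge(request_options.fetch(:additional_query_parameters, {})),
    headers: headers.merge(request_options.fetch(:additional_headers, {}))
  }
end

request = apply_request_options(
  body: { "text" => "Hello" },
  query: { "page_size" => 25 },
  headers: { "accept" => "application/json" },
  request_options: {
    # "new_field" is a made-up name standing in for a field the SDK does not know yet.
    additional_body_parameters: { "new_field" => true },
    additional_headers: { "x-trace-id" => "abc123" }
  }
)

p request[:body]
p request[:headers]
```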

File uploads

Use ElevenLabs::Upload to wrap local paths, strings, or IO objects for multipart endpoints:

upload = ElevenLabs::Upload.from_path("sample.wav", content_type: "audio/wav")
client.voices.pvc.samples.create("voice_id", files: [upload], remove_background_noise: true)

Uploads created from paths auto-close their file handles after the request finishes. For custom IO objects, pass auto_close: true so the SDK can close them:

io = File.open("clip.mp3", "rb")
upload = ElevenLabs::Upload.from_io(io, auto_close: true)
client.voices.ivc.samples.create("voice_id", files: [upload])

You can stub ElevenLabs::Upload.file_opener in tests to avoid touching the filesystem.
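
The stubbing pattern might look like the following sketch. It uses a local FakeUpload class so it runs without the gem, and it assumes file_opener is a swappable callable that takes a path and returns an IO. That signature is a guess; check the gem's source before relying on it.

```ruby
require "stringio"

# Stand-in for ElevenLabs::Upload so the sketch runs without the gem.
# Assumption: file_opener is a callable taking a path and returning an IO.
class FakeUpload
  class << self
    attr_writer :file_opener

    def file_opener
      @file_opener ||= ->(path) { File.open(path, "rb") }
    end

    def from_path(path)
      new(file_opener.call(path), File.basename(path))
    end
  end

  attr_reader :io, :filename

  def initialize(io, filename)
    @io = io
    @filename = filename
  end
end

# In a test: swap the opener for one that returns in-memory bytes, then restore it.
original = FakeUpload.file_opener
FakeUpload.file_opener = ->(_path) { StringIO.new("fake audio bytes") }
begin
  upload = FakeUpload.from_path("does/not/exist.wav")
  contents = upload.io.read
ensure
  FakeUpload.file_opener = original
end
```

Restoring the original opener in an ensure block keeps one test's stub from leaking into the next.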

Error handling

Non-success responses raise ElevenLabs::HTTPError with useful context:

begin
  client.history.get("missing_id")
rescue ElevenLabs::HTTPError => e
  warn "HTTP #{e.status}"
  warn e.body.inspect
end
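
Because the error exposes its status, callers can decide which failures are worth retrying. The sketch below is one possible approach, not something the gem provides: with_retries and the RETRYABLE list are hypothetical, and FakeHTTPError stands in for ElevenLabs::HTTPError so the snippet runs on its own.

```ruby
# Stand-in for ElevenLabs::HTTPError; only the status and body readers are assumed.
class FakeHTTPError < StandardError
  attr_reader :status, :body

  def initialize(status, body = nil)
    super("HTTP #{status}")
    @status = status
    @body = body
  end
end

# Statuses treated as transient in this sketch (rate limiting and server errors).
RETRYABLE = [429, 500, 502, 503, 504].freeze

# Retries the block on retryable statuses with exponential backoff;
# any other status, or running out of attempts, re-raises the error.
def with_retries(max_attempts: 3, base_delay: 0.5, error_class: FakeHTTPError)
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue error_class => e
    raise unless RETRYABLE.include?(e.status) && attempt < max_attempts

    sleep(base_delay * (2**(attempt - 1)))
    retry
  end
end

calls = 0
result = with_retries(base_delay: 0) do
  calls += 1
  raise FakeHTTPError.new(503) if calls < 3 # fail twice, then succeed

  "ok"
end
```

With the real gem you would pass error_class: ElevenLabs::HTTPError and wrap the SDK call in the block.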

Regenerating the API surface

The gem includes scripts/extract_spec.py, which parses the upstream Python SDK and writes lib/elevenlabs/spec.json. Run the script after pulling the latest upstream changes:

python3 scripts/extract_spec.py

This keeps every endpoint, request shape, and child resource in sync without hand editing Ruby code.
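
As a rough picture of how a JSON spec can drive the resource tree without hand-written Ruby, consider the sketch below. Everything in it is an assumption made for illustration: the spec layout, the /example/... paths, and the SpecResource class are hypothetical and do not reflect the real lib/elevenlabs/spec.json or the gem's internals.

```ruby
require "json"

# Hypothetical spec shape with made-up paths.
SPEC = JSON.parse(<<~JSON)
  {
    "history": {
      "operations": { "list": { "method": "GET", "path": "/example/history" } },
      "children": {}
    },
    "workspace": {
      "operations": {},
      "children": {
        "invites": {
          "operations": { "create": { "method": "POST", "path": "/example/workspace/invites" } },
          "children": {}
        }
      }
    }
  }
JSON

# Turns one spec node into an object with a method per operation
# and a cached accessor per child resource.
class SpecResource
  def initialize(node, transport)
    node.fetch("operations").each do |name, op|
      define_singleton_method(name) do |**params|
        transport.call(op.fetch("method"), op.fetch("path"), params)
      end
    end

    children = {}
    node.fetch("children").each do |name, child|
      define_singleton_method(name) do
        children[name] ||= SpecResource.new(child, transport)
      end
    end
  end
end

# Records requests instead of sending them, so the sketch needs no network.
sent = []
transport = ->(method, path, params) { sent << [method, path, params]; :ok }

client = SpecResource.new({ "operations" => {}, "children" => SPEC }, transport)
client.history.list(page_size: 5)
client.workspace.invites.create(email: "teammate@example.com")
```

Under this design, adding an endpoint to the spec file is enough for a new method to appear on the client.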

Development & testing

Run the full test suite using Rake:

rake test

Or run individual test files:

ruby -Ilib:test test/operation_serialization_test.rb
ruby -Ilib:test test/http_client_test.rb
ruby -Ilib:test test/utils_test.rb
ruby -Ilib:test test/upload_test.rb
ruby -Ilib:test test/errors_test.rb
ruby -Ilib:test test/client_test.rb
ruby -Ilib:test test/environment_test.rb

Test Coverage (134 tests, 347 assertions):

  • operation_serialization_test.rb - Tests request serialization for various operations
  • operation_executor_test.rb - Tests path building, query/body/file resolution, streaming dispatch, request_options forwarding
  • http_client_test.rb - Tests file upload handling, redirect following, streaming cleanup
  • http_client_headers_test.rb - Tests default headers, API key injection, JSON/form body prep, timeouts, response parsing, error handling
  • resources_test.rb - Tests spec loading, class generation, operation methods, child resource accessors, caching, deep nesting
  • utils_test.rb - Tests utility functions (deep_dup, assign_path, deep_compact, etc.)
  • upload_test.rb - Tests Upload helper methods for files, bytes, strings, and IO
  • errors_test.rb - Tests error classes and their attributes
  • client_test.rb - Tests client initialization, resource caching, all namespace accessors, custom environments
  • environment_test.rb - Tests environment URL resolution

Launch IRB with the project on the load path:

irb -Ilib -Itest

From there you can require "elevenlabs" or load specific test files to iterate interactively.

Building and installing the gem locally

To generate and install a local build for testing:

gem build elevenlabs-ruby.gemspec
gem install ./elevenlabs-$(ruby -Ilib -e 'require "elevenlabs"; puts ElevenLabs::VERSION').gem

You can then require the gem from any project (or IRB) and point Gemfile entries to the local path if desired:

gem "elevenlabs", path: "/path/to/elevenlabs-ruby"

Recent Updates

2026-03-14: v0.3.4 — Updated API Spec from elevenlabs-python v2.39.1

Updated lib/elevenlabs/spec.json by running the extraction script against elevenlabs-python v2.39.1 (commit 8303d37, SDK regeneration #744 — March 2026).

New Operations:

  • workspace.groups.list — list all groups in the workspace

New/Updated Parameters:

  • audio_native.update_content_from_url — added author and title optional params
  • conversational_ai.batch_calls.create — added target_concurrency_limit for controlling simultaneous call dispatch
  • conversational_ai.users.list — added branch_id filter and sort_by ordering
  • conversational_ai.whatsapp_accounts.update — added enable_audio_message_response
  • music.compose — added respect_sections_durations for stricter section timing
  • speech_to_text.convert — added no_verbatim to strip filler words (scribe_v2)
  • workspace.invites.create — added seat_type param

Removed Parameters:

  • conversational_ai.agents.create / update — removed coaching_settings

Test suite now at 65 runs, 182 assertions, 0 failures.

2026-03-05: v0.3.3 — Tests and docs for SDK #740 new operations

Added serialization tests and README examples for the two new operations introduced in SDK #740:

  • music.upload — multipart file upload test covering path, form field, and file entry
  • pronunciation_dictionaries.rules.set — JSON body test covering path and rules payload

Test suite now at 57 runs, 144 assertions, 0 failures.

2026-03-05: v0.3.2 — Updated API Spec from elevenlabs-python (SDK #740)

Updated lib/elevenlabs/spec.json by running the extraction script against the latest elevenlabs-python SDK (commit 78ed67e, SDK regeneration #740 — March 2026).

New Operations:

  • music.upload — upload an audio file for use in music workflows
  • pronunciation_dictionaries.rules.set — replace the full rules set on a pronunciation dictionary

New Ruby access patterns:

client.music.upload(file: ElevenLabs::Upload.from_path("track.mp3"))
client.pronunciation_dictionaries.rules.set(
  pronunciation_dictionary_id: "dict_123",
  rules: [{ "type" => "phoneme", "string_to_replace" => "ElevenLabs", "phoneme" => "ɛlɛvənlæbz", "alphabet" => "ipa" }]
)

New Types: CheckServiceAvailabilityParams, CreateAssetParams, CreateClientAppointmentParams, CustomGuardrailsConfigInput, CustomGuardrailsConfigOutput, DeleteAssetParams, DeleteCalendarEventParams, GetClientAppointmentsParams, GuardrailExecutionMode, ListCalendarEventsParams, MusicUploadResponse, RequiredConstraint, RequiredConstraints, StudioAgentSettingsModel, StudioAgentToolSettingsModel, TelephonyCallConfig, UpdateAssetParams, UpdateCalendarEventParams, VoiceStatisticsResponseModel

2026-03-05: v0.3.1 — HTTP Redirect Following

The HTTP client now automatically follows 3xx redirects when the server returns a redirect response. This means requests that hit moved or redirected endpoints will transparently re-issue to the new location without any change to your calling code.
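
The general technique can be sketched as a bounded loop. This is not the gem's code: follow_redirects, the Response struct, and the injected fetch callable are hypothetical, chosen so the example runs without a network connection.

```ruby
require "uri"

Response = Struct.new(:status, :headers, :body)

# Issues at most `limit` requests, following 3xx responses that carry a Location header.
# `fetch` is any callable that takes a URI and returns a Response.
def follow_redirects(url, fetch:, limit: 5)
  uri = URI(url)
  limit.times do
    response = fetch.call(uri)
    location = response.headers["location"]
    return response unless (300..399).cover?(response.status) && location

    # Location may be relative, so resolve it against the current URI.
    uri = uri.merge(location)
  end
  raise "too many redirects (limit #{limit})"
end

# Fake server: /old redirects to /new, which answers 200.
pages = {
  "https://example.com/old" => Response.new(301, { "location" => "/new" }, ""),
  "https://example.com/new" => Response.new(200, {}, "hello")
}
visited = []
fetch = ->(uri) { visited << uri.to_s; pages.fetch(uri.to_s) }

final = follow_redirects("https://example.com/old", fetch: fetch)
puts final.body # hello
```

The hop limit matters: without it, two endpoints that redirect to each other would loop forever.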

2026-03-01: Updated API Spec from elevenlabs-python (v0.2.0)

Updated lib/elevenlabs/spec.json by running the extraction script against the latest elevenlabs-python SDK (commit f71bcd8, SDK regeneration #736 — March 2026). 6 new operations added.

New Operations:

  • audio_native.update_content_from_url — update audio native content from a URL
  • conversational_ai.conversations.files.create — upload files within a conversation context
  • conversational_ai.conversations.files.delete — delete files from a conversation
  • conversational_ai.conversations.messages.search — search conversation messages
  • conversational_ai.conversations.messages.text_search — text search across conversation messages
  • conversational_ai.llm.list — list available LLMs for conversational AI

New Ruby access patterns:

client.audio_native.update_content_from_url(project_id: "proj_123", url: "https://example.com/audio.mp3")
client.conversational_ai.conversations.files.create(conversation_id: "conv_123", file: upload)
client.conversational_ai.conversations.files.delete(conversation_id: "conv_123", file_id: "file_456")
client.conversational_ai.conversations.messages.search(conversation_id: "conv_123", query: "hello")
client.conversational_ai.conversations.messages.text_search(conversation_id: "conv_123", query: "hello")
client.conversational_ai.llm.list

Upstream changes also included:

  • Coaching settings for agent create/patch operations
  • New types: ClipAnimation, CoachingAgentSettings, FocusGuardrail, PromptInjectionGuardrail, ReferenceVideo, LlmInfoModel, ConstantSchemaOverride, DynamicVariableSchemaOverride, MessagesSearchResponse, ConversationHistoryTranscriptResponseModel, PrivacyConfigOutput, ProcedureRefResponseModel, WidgetConfig, GenerationSourceContext
  • Renamed: AlignmentGuardrail → FocusGuardrail, PrivacyConfig → PrivacyConfigInput

2026-02-17: Updated API Spec from elevenlabs-python (v0.2.0)

Updated lib/elevenlabs/spec.json by running the extraction script against the latest elevenlabs-python SDK (commit 0b87e77, SDK regeneration #730 — February 16, 2026). This is a major update with many new endpoints.

New Namespaces:

  • MCP server management (client.conversational_ai.mcp_servers + tool approvals, tool configs, tools listing)
  • Agent branches (client.conversational_ai.agents.branches) — list, create, get, update, merge
  • Agent deployments (client.conversational_ai.agents.deployments)
  • Agent drafts (client.conversational_ai.agents.drafts)
  • Conversational AI tests (client.conversational_ai.tests) with invocations sub-resource
  • Conversational AI tools (client.conversational_ai.tools)
  • Twilio integration (client.conversational_ai.twilio) — outbound calls and call registration
  • SIP trunk (client.conversational_ai.sip_trunk) — outbound calls
  • Analytics (client.conversational_ai.analytics.live_count)
  • Dashboard settings (client.conversational_ai.dashboard.settings)
  • LLM usage (client.conversational_ai.llm_usage, client.conversational_ai.agents.llm_usage)
  • Users listing (client.conversational_ai.users)
  • Agent simulation (client.conversational_ai.agents.simulate_conversation, simulate_conversation_stream, run_tests)
  • Professional voice cloning expanded (client.voices.pvc) — samples, speakers, verification, captcha, waveform

Enhanced Features:

  • Music: added compose_detailed, stream, separate_stems operations
  • Text-to-dialogue: added stream_with_timestamps and convert_with_timestamps
  • Dubbing: expanded resource operations (transcribe, translate, dub, render, segment/speaker management)
  • Studio: added get_muted_tracks for projects
  • Knowledge base: added rag_index_overview, per-document RAG index compute, chunk and summary retrieval

2026-01-24: Expanded Test Coverage

Added comprehensive test suite with 45 tests and 100 assertions covering:

  • Utils module (deep_dup, assign_path, deep_compact, symbolize_keys, encode_path_segment)
  • Upload helpers (from_bytes, from_string, from_io, from_path)
  • Error classes (HTTPError attributes and inheritance)
  • Client initialization and resource caching
  • Environment URL resolution for all regions
  • Created Rakefile for easy test execution (rake test)

All tests passing with 0 failures.

2026-01-24: GitHub Actions Workflow Updated to Ruby 3.3

Updated .github/workflows/gem-push.yml to use Ruby 3.3 (from 2.6.x) to match the project's Ruby version requirements. Also enabled bundler caching for faster CI builds.

2026-01-24: Updated API Spec from elevenlabs-python

Updated lib/elevenlabs/spec.json by running the extraction script against the latest elevenlabs-python SDK (commit 23cb5ff). This update includes:

New Features:

  • Agent summaries endpoint (client.conversational_ai.agents.summaries)
  • WhatsApp integration (client.conversational_ai.whatsapp and client.conversational_ai.whatsapp_accounts)
  • Batch calls functionality for conversational AI
  • Workspace resources management (client.workspace.resources)
  • Knowledge base improvements (dependent type filtering, source file URL retrieval)
  • Dubbing transcripts management (client.dubbing.transcripts)

Enhanced Parameters:

  • show_only_owned_agents filter for agent listing
  • branch_id support for conversation workflows
  • main_languages and conversation_initiation_source for conversations
  • entity_detection capability for speech-to-text
  • Custom SIP headers for phone number workflows
  • Widget configuration and language presets

API Changes:

  • Removed deprecated use_typesense parameter from knowledge base operations
  • Updated output format enums to use consolidated allowed_output_formats
  • Enhanced phone number transfer configuration with custom headers
  • Improved permission types for workspace API keys

To update your local spec.json in the future, run:

cd tmp-elevenlabs-python && git pull origin main && cd .. && python3 scripts/extract_spec.py
