Configuration: Add default parameters for STT/TTS #675

@mdshamoon

Summary

Currently, STT and TTS configs require users to provide several parameters explicitly: provider and model for both, plus voice and language for TTS. We should add sensible defaults for every parameter except the core input:

  • STT: audio input (the speech) should remain required, everything else should have defaults
  • TTS: text input should remain required, everything else should have defaults

Proposed Defaults

STT (STTLLMParams)

Param             Current    Proposed Default
provider          required   "google"
model             required   "gemini-2.5-pro"
instructions      None       None (already optional)
input_language    None       "auto"
output_language   None       None (already optional)
response_format   None       "text"
temperature       None       None (already optional)

TTS (TTSLLMParams)

Param             Current    Proposed Default
provider          required   "google"
model             required   "gemini-2.5-pro"
voice             required   "Puck"
language          required   "en"
response_format   "wav"      "wav" (already has default)
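
The proposed defaults above could translate into model definitions along these lines. This is only a minimal sketch: it uses standard-library dataclasses as a stand-in, the real classes in request.py may be built on a different base (for example Pydantic), and the field types shown here are assumptions rather than something the issue specifies.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class STTLLMParams:
    # Every field has a default, so STTLLMParams() is valid on its own.
    # Field types are assumed; only the default values come from the issue.
    provider: str = "google"
    model: str = "gemini-2.5-pro"
    instructions: Optional[str] = None
    input_language: Optional[str] = "auto"
    output_language: Optional[str] = None
    response_format: Optional[str] = "text"
    temperature: Optional[float] = None


@dataclass
class TTSLLMParams:
    # Same idea for TTS: the text input lives in the query, not in params.
    provider: str = "google"
    model: str = "gemini-2.5-pro"
    voice: str = "Puck"
    language: str = "en"
    response_format: str = "wav"


# An empty "params": {} object maps to a fully populated config.
stt_params = STTLLMParams(**{})
tts_params = TTSLLMParams(**{})
print(asdict(stt_params))
print(asdict(tts_params))
```

Callers can still override any single field, for example TTSLLMParams(language="hi"), and the remaining fields keep their defaults.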

Goal

Allow minimal API calls like:

STT:

{
  "query": { "input": { "type": "audio", "content": { "format": "base64", "value": "..." } } },
  "config": {
    "blob": {
      "completion": { "type": "stt", "params": {} }
    }
  }
}

TTS:

{
  "query": { "input": "Hello world" },
  "config": {
    "blob": {
      "completion": { "type": "tts", "params": {} }
    }
  }
}
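
To illustrate how the minimal payloads above might be resolved, here is a rough sketch that picks a default set based on the completion "type" and overlays whatever the caller supplied. The function name resolve_completion_params and the plain-dict approach are hypothetical; the actual CompletionConfig discriminator may work differently.

```python
from typing import Any, Dict

# Default values taken from the "Proposed Defaults" tables in this issue.
STT_DEFAULTS: Dict[str, Any] = {
    "provider": "google",
    "model": "gemini-2.5-pro",
    "instructions": None,
    "input_language": "auto",
    "output_language": None,
    "response_format": "text",
    "temperature": None,
}
TTS_DEFAULTS: Dict[str, Any] = {
    "provider": "google",
    "model": "gemini-2.5-pro",
    "voice": "Puck",
    "language": "en",
    "response_format": "wav",
}
DEFAULTS_BY_TYPE = {"stt": STT_DEFAULTS, "tts": TTS_DEFAULTS}


def resolve_completion_params(completion: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: fill in defaults for a completion block."""
    kind = completion.get("type")
    if kind not in DEFAULTS_BY_TYPE:
        raise ValueError(f"unsupported completion type: {kind!r}")
    defaults = DEFAULTS_BY_TYPE[kind]
    params = completion.get("params") or {}
    unknown = set(params) - set(defaults)
    if unknown:
        raise ValueError(f"unknown {kind} params: {sorted(unknown)}")
    # Caller-supplied values win, including an explicit None.
    return {**defaults, **params}


print(resolve_completion_params({"type": "stt", "params": {}}))
print(resolve_completion_params({"type": "tts", "params": {"voice": "custom"}}))
```

One design question this sketch leaves open is whether an explicit null in params should override a default (as it does here) or be treated as "not provided".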

Files to Update

  • backend/app/models/llm/request.py: STTLLMParams, TTSLLMParams, and the CompletionConfig discriminator logic
  • May need provider-specific default resolution in backend/app/services/llm/providers/gai.py and oai.py
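
One reason provider-specific resolution may be needed: a static default such as model "gemini-2.5-pro" or voice "Puck" would not make sense if the caller sets only a different provider. A possible shape for that lookup is sketched below. Everything here is hypothetical, including the helper name, the "openai" key, and the placeholder strings for its model and voice, which are not real identifiers and would have to be chosen by the maintainers.

```python
from typing import Any, Dict, Optional

DEFAULT_PROVIDER = "google"

# Defaults that do not depend on the provider (from the TTS table).
COMMON_TTS_DEFAULTS: Dict[str, Any] = {"language": "en", "response_format": "wav"}

# Per-provider defaults. The "google" values come from this issue; the
# "openai" values are PLACEHOLDERS, not real model or voice names.
PROVIDER_TTS_DEFAULTS: Dict[str, Dict[str, Any]] = {
    "google": {"model": "gemini-2.5-pro", "voice": "Puck"},
    "openai": {"model": "PLACEHOLDER-openai-tts-model", "voice": "PLACEHOLDER-openai-voice"},
}


def resolve_tts_params(params: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Hypothetical helper: choose model/voice defaults based on the provider."""
    # In this sketch, None values are treated as "not provided".
    supplied = {k: v for k, v in (params or {}).items() if v is not None}
    provider = supplied.get("provider", DEFAULT_PROVIDER)
    if provider not in PROVIDER_TTS_DEFAULTS:
        raise ValueError(f"no TTS defaults registered for provider {provider!r}")
    # Later entries win: common < provider-specific < caller-supplied.
    return {
        "provider": provider,
        **COMMON_TTS_DEFAULTS,
        **PROVIDER_TTS_DEFAULTS[provider],
        **supplied,
    }


print(resolve_tts_params({}))
print(resolve_tts_params({"provider": "openai", "language": "hi"}))
```

Whether this lookup belongs in the request model or in the provider modules (gai.py, oai.py) is left open here; the sketch only shows the layering order.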

Status

In Review