Open
Summary
Currently, STT and TTS configs require users to explicitly provide several parameters, such as provider and model. We should add sensible defaults for all parameters except the core input attributes:
- STT: audio input (the speech) should remain required; everything else should have defaults
- TTS: text input should remain required; everything else should have defaults
Proposed Defaults
STT (STTLLMParams)
| Param | Current | Proposed Default |
|---|---|---|
| `provider` | required | `"google"` |
| `model` | required | `"gemini-2.5-pro"` |
| `instructions` | `None` | `None` (already optional) |
| `input_language` | `None` | `"auto"` |
| `output_language` | `None` | `None` (already optional) |
| `response_format` | `None` | `"text"` |
| `temperature` | `None` | `None` (already optional) |
TTS (TTSLLMParams)
| Param | Current | Proposed Default |
|---|---|---|
| `provider` | required | `"google"` |
| `model` | required | `"gemini-2.5-pro"` |
| `voice` | required | `"Puck"` |
| `language` | required | `"en"` |
| `response_format` | `"wav"` | `"wav"` (already has default) |
Goal
Allow minimal API calls like:
STT:
```json
{
  "query": { "input": { "type": "audio", "content": { "format": "base64", "value": "..." } } },
  "config": {
    "blob": {
      "completion": { "type": "stt", "params": {} }
    }
  }
}
```
TTS:
```json
{
  "query": { "input": "Hello world" },
  "config": {
    "blob": {
      "completion": { "type": "tts", "params": {} }
    }
  }
}
```
Files to Update
- `backend/app/models/llm/request.py`: `STTLLMParams`, `TTSLLMParams`, `CompletionConfig` discriminator logic
- May need provider-specific default resolution in `backend/app/services/llm/providers/gai.py` and `oai.py`
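One possible shape for the default resolution mentioned above is sketched below. It dispatches on the completion `type`, merges user-supplied params over the proposed defaults, and refuses to reuse Google-specific defaults for another provider. Everything here is an assumption made for illustration: the function `resolve_params`, the `PROVIDER_BOUND` table, the provider identifier `"openai"`, and the idea that `model` and `voice` are the provider-specific fields. None of it is taken from the existing code in `request.py`, `gai.py`, or `oai.py`.

```python
from typing import Any, Dict

# Defaults copied from the tables in this issue; all of them target "google".
DEFAULT_PROVIDER = "google"
DEFAULTS: Dict[str, Dict[str, Any]] = {
    "stt": {
        "provider": DEFAULT_PROVIDER,
        "model": "gemini-2.5-pro",
        "instructions": None,
        "input_language": "auto",
        "output_language": None,
        "response_format": "text",
        "temperature": None,
    },
    "tts": {
        "provider": DEFAULT_PROVIDER,
        "model": "gemini-2.5-pro",
        "voice": "Puck",
        "language": "en",
        "response_format": "wav",
    },
}

# Assumption: these fields only make sense for the provider they were chosen for,
# so their defaults must not leak into requests for a different provider.
PROVIDER_BOUND = {"stt": ("model",), "tts": ("model", "voice")}


def resolve_params(completion: Dict[str, Any]) -> Dict[str, Any]:
    """Return fully populated params for a {"type": ..., "params": ...} object."""
    kind = completion.get("type")
    if kind not in DEFAULTS:
        raise ValueError(f"unsupported completion type: {kind!r}")

    supplied = dict(completion.get("params") or {})
    unknown = set(supplied) - set(DEFAULTS[kind])
    if unknown:
        raise ValueError(f"unknown {kind} params: {sorted(unknown)}")

    # User-supplied values win over defaults.
    resolved = {**DEFAULTS[kind], **supplied}

    # No defaults are known here for other providers, so require explicit values.
    if resolved["provider"] != DEFAULT_PROVIDER:
        missing = [k for k in PROVIDER_BOUND[kind] if k not in supplied]
        if missing:
            raise ValueError(
                f"provider {resolved['provider']!r} requires explicit {missing}"
            )
    return resolved
```

Under these assumptions, `resolve_params({"type": "tts", "params": {}})` yields the full Google TTS config, while a request that switches provider without naming a model fails early instead of silently sending a Gemini model name elsewhere. Per-provider default tables could replace the error once defaults for other providers are agreed.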
Status
In Review