Which SDK is this feature request for?
Feature Request: Add support for known speakers in SpeakerDiarizationConfig
Summary
Request to add support for known speaker identification/enrollment in the Python SDK's SpeakerDiarizationConfig to enable persistent speaker identification across sessions.
Context
Currently, SpeakerDiarizationConfig supports only:
- max_speakers
- speaker_sensitivity
- prefer_current_speaker
However, the API documentation mentions SpeakersResult as a preview feature, and the SDK code contains references to GET_SPEAKERS and SPEAKERS_RESULT message types (marked as "Internal, Speechmatics only").
Use Case
We're building voice AI applications where identifying specific speakers across sessions is critical, such as:
- Meeting transcription: identifying recurring participants without having to re-match their voices to speaker labels in every session
Currently, every time the same people join a meeting, we have to match their identities to the diarization speaker labels again. Repeating this for every session is tedious and makes for a bad user experience. Being able to identify known speakers beforehand would be incredibly useful.
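To make the pain point concrete, here is a minimal sketch of the matching step we currently have to repeat per session. The `SessionSpeakerMap` helper and the `S1` label are illustrative assumptions for this issue, not part of the SDK.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SessionSpeakerMap:
    """Hypothetical helper: maps per-session diarization labels to known people."""

    labels_to_names: Dict[str, str] = field(default_factory=dict)

    def assign(self, label: str, name: str) -> None:
        # Manual step: someone has to tell us which label belongs to whom.
        self.labels_to_names[label] = name

    def resolve(self, label: str) -> str:
        # Fall back to the raw label when nobody has been matched yet.
        return self.labels_to_names.get(label, label)


# Session 1: John is matched to the label "S1" by hand.
first = SessionSpeakerMap()
first.assign("S1", "John")

# Session 2: nothing carries over, so the mapping starts empty again.
second = SessionSpeakerMap()
```

In the second session `resolve("S1")` only returns the raw label until someone redoes the matching, which is the step known speaker support would remove.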
Proposed Solution
Add a speakers field to SpeakerDiarizationConfig to support known speaker enrollment:
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class SpeakerDiarizationConfig:
    max_speakers: Optional[int] = None
    speaker_sensitivity: Optional[float] = None
    prefer_current_speaker: Optional[bool] = None
    speakers: Optional[Dict[str, List[str]]] = None  # New field for known speakers

# Usage example:
config = SpeakerDiarizationConfig(
    max_speakers=2,
    speaker_sensitivity=0.5,
    speakers={
        "John": ["speaker_id_john_123"],  # Speaker name -> identifiers
        "Jane": ["speaker_id_jane_456"],
    },
)
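As a rough illustration of how the new field might flow into the request, the sketch below serializes the proposed dataclass and drops unset options. The `to_payload` helper is hypothetical, and the payload shape is an assumption, since the API does not accept `speakers` today.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, List, Optional


@dataclass
class SpeakerDiarizationConfig:
    # Mirrors the proposed shape; `speakers` is the hypothetical new field.
    max_speakers: Optional[int] = None
    speaker_sensitivity: Optional[float] = None
    prefer_current_speaker: Optional[bool] = None
    speakers: Optional[Dict[str, List[str]]] = None


def to_payload(config: SpeakerDiarizationConfig) -> Dict[str, Any]:
    """Hypothetical helper: keep only the options that were explicitly set."""
    return {key: value for key, value in asdict(config).items() if value is not None}


payload = to_payload(
    SpeakerDiarizationConfig(
        max_speakers=2,
        speakers={"John": ["speaker_id_john_123"]},
    )
)
```

Leaving `speakers` as `None` by default would keep existing configurations unchanged, so the field would only appear in the payload when a caller opts in.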
Current Workaround
We tested whether the API would accept a speakers field even though it's not in the SDK:
config.speaker_diarization_config = {
    "speakers": {
        "John": ["speaker_id_john_123"],
        "Jane": ["speaker_id_jane_456"],
    }
}
But the API rejects it with:
Error: Additional property speakers is not allowed
Questions
- Is the SpeakersResult preview feature available for early access?
- Is there a timeline for when known speaker support will be added to the public API?
- Would you accept a PR to add this functionality to the SDK once the API supports it?
Environment
- speechmatics-rt version: 0.4.0
- Python version: 3.11
- Use case: Real-time transcription with speaker identification
Would love to hear if this is on the roadmap or if there's an alternative approach we should consider!