feat: expose speaker embeddings and subsegments in DiarizeResult #4

Open
smm-h wants to merge 1 commit into FoxNoseTech:main from smm-h:main

smm-h commented Mar 25, 2026

Summary

The diarize() function already computes speaker embeddings and subsegments via extract_embeddings(), but these are discarded before building the DiarizeResult. This change simply preserves them on the result object by adding two new optional fields.

Changes

  • utils.py: Added embeddings: Any = None and subsegments: list[SubSegment] | None = None fields to DiarizeResult. Added model_config = ConfigDict(arbitrary_types_allowed=True) to support numpy arrays in the Pydantic model.
  • __init__.py: Pass embeddings and subsegments to the DiarizeResult constructor in diarize().

Motivation

Use case: cross-recording speaker clustering and identification. When processing multiple audio files, access to the raw speaker embeddings lets users cluster or match speakers across recordings. Segment labels alone cannot do this, because a label only identifies a speaker within a single recording.
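
As a rough illustration of this use case, the sketch below matches speakers between two recordings by cosine similarity. It assumes the caller has already reduced the exposed embeddings to one vector per speaker (for example, by averaging). The helper names, the toy vectors, and the 0.7 threshold are hypothetical and not part of this PR or the library.

```python
import numpy as np


def cosine_similarity_matrix(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_unit @ b_unit.T


def match_speakers(centroids_a, centroids_b, threshold=0.7):
    """Map each speaker index in recording A to its most similar speaker
    in recording B, or to None if the best similarity is below threshold."""
    sims = cosine_similarity_matrix(centroids_a, centroids_b)
    matches = {}
    for i, row in enumerate(sims):
        j = int(np.argmax(row))
        matches[i] = j if row[j] >= threshold else None
    return matches


# Toy per-speaker vectors standing in for averaged embeddings.
rec_a = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
rec_b = np.array([[0.0, 0.9, 0.1], [0.0, 0.0, 1.0]])

matches = match_speakers(rec_a, rec_b)
```

Here speaker 1 of the first recording is paired with speaker 0 of the second, while speaker 0 of the first recording has no counterpart above the threshold. A real pipeline would likely tune the threshold on labeled data or use a clustering method instead of greedy matching.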

Notes

  • Both fields default to None, so the change is fully backward-compatible.
  • No added computation -- the result just stores a reference to already-computed data instead of discarding it.
