Conversation
- `required_lead_time_ms`: integer - minimum startup lead time in milliseconds (e.g., codec init, decode warmup, audio backend buffering, DAC latency). Measure this from server transmit time of the start/restart trigger message ([`stream/start`](#server--client-streamstart) or [`stream/clear`](#server--client-streamclear)) to the timestamp of the first subsequent audio chunk.
- `min_buffer_ms`: integer - requested minimum ongoing buffer duration in milliseconds during playback (primarily for live streams), used to absorb network jitter and continuous-playback pipeline delays.
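To make the shape concrete, here is a hedged sketch of a client hello carrying these two fields. The field names come from the diff above; the surrounding message shape and the numeric values are illustrative assumptions, not part of the spec:

```python
# Hypothetical client hello payload. Only required_lead_time_ms and
# min_buffer_ms are taken from the diff above; the enclosing structure
# and values are illustrative assumptions.
hello = {
    "roles": ["player@v1"],
    "player@v1_support": {
        "required_lead_time_ms": 250,  # startup: codec init, warmup, DAC latency
        "min_buffer_ms": 500,          # ongoing buffer to absorb network jitter
    },
}
```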
I'm a bit conflicted about these being statically set in the hello message (I know the PR mentions we make this assumption). In ESPHome, the very first run has at least double the lead time, whereas subsequent lead times are much shorter (though this depends on the exact configuration, so it's tricky). For the min buffer, this could easily change on a cell connection at various points in time depending on signal strength. That one seems a lot harder for the server to update dynamically, even if we had some mechanism to update it.
Should we consider allowing the hello message to be sent again in the middle of a live connection to update values? (This has also come up for devices that want to switch between allowing volume control from the server or not; e.g., you plug a VPE into an amp and want the amp to handle the rest, so you disable volume control on the server.)
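To illustrate the suggestion above, a server could treat a repeated hello as a capability update rather than an error. This is a minimal sketch; the class and method names are hypothetical and not part of the current protocol:

```python
# Illustrative sketch only: a session that accepts repeated hellos
# mid-connection to update mutable capabilities (e.g. a revised
# min_buffer_ms, or toggling server-side volume control).
class Session:
    def __init__(self):
        self.support = {}

    def on_hello(self, msg):
        # First hello registers capabilities; later hellos merge updates
        # over the previously announced values.
        self.support.update(msg.get("player@v1_support", {}))
```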
I've been thinking more about this. I can't come up with a way, at least in the ESPHome implementation, to actually determine these numbers. At best, I can make an educated guess, but some of it will depend on the chip (ESP32 vs ESP32S3) and how many other things are running (microWakeWord/voice assistant stuff), which makes it even harder to get good values that aren't overly conservative.
However, it is easy to determine this stuff empirically if we play audio once or have the connection open. I can just measure directly and add a small margin. The client can also easily estimate the latency from the time messages by considering worst-case RTT (or something like the 95th percentile).
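The percentile-based estimate described above might look something like this. The function name, the nearest-rank percentile method, and the safety margin are all illustrative assumptions:

```python
import math


def estimate_latency_ms(rtt_samples_ms, percentile=95, margin_ms=20):
    """Estimate one-way latency from observed round-trip times.

    Uses the nearest-rank percentile of the RTT samples, halves it as a
    rough one-way figure, and adds a small safety margin. All choices
    here are illustrative, not part of the protocol.
    """
    if not rtt_samples_ms:
        raise ValueError("need at least one RTT sample")
    ordered = sorted(rtt_samples_ms)
    # nearest-rank percentile: ceil(p/100 * n), converted to a 0-based index
    rank = max(0, math.ceil(percentile / 100 * len(ordered)) - 1)
    return ordered[rank] / 2 + margin_ms
```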
So I don't know how best to do this in the hello message! I could save these details to flash, but I don't know if I trust network latency to stay consistent day to day or across reboots. I also don't know how a phone app could do this well based on historical values; what if you are connected to local Wi-Fi vs. on a cellular network?
Force-pushed e9f9964 to f5d7e32
There were 2 major gaps in the `player@v1` role this PR aims to solve.

Both issues are resolved by adding `required_lead_time_ms` and `min_buffer_ms` to the `player@v1_support` object. The server is then responsible for ensuring that both constraints are respected.

Network latency is factored into `required_lead_time_ms` and `min_buffer_ms` by the player, also allowing for high-latency clients (like a player on a mobile network).

Limitations
Network latency is not measured and remains static throughout the lifetime of the connection.
While this is sufficient for almost all real-world scenarios, including mobile-network players, it is still a minor limitation.