Skip to content

feat: livekit cast improvements#7347

Open
cyaiox wants to merge 53 commits intodevfrom
feat/enhance-livekit-track-selector
Open

feat: livekit cast improvements#7347
cyaiox wants to merge 53 commits intodevfrom
feat/enhance-livekit-track-selector

Conversation

@cyaiox
Copy link
Copy Markdown
Member

@cyaiox cyaiox commented Mar 5, 2026

Pull Request Description

What does this PR change?

Enhances the LiveKit media player to support full Cast scenarios with multiple participants, automatic speaker tracking, presentation bot priority, data messaging between scenes and participants, and self-healing stream recovery.

Video routing

  • Active speaker tracking: In CurrentStream mode, video automatically follows the dominant speaker using room.ActiveSpeakers with a 1.5s debounce to prevent flickering
  • Presentation bot priority: Participants with identity prefix presentation-bot: are always prioritized — once locked on, speaker tracking is suppressed until the bot leaves
  • Video muted state: Tracks TrackPublication.Muted per frame and renders a black texture when muted
  • Resolution capping: LiveKit video textures are capped at 1920x1080 to prevent GPU stalls
  • Auto-recovery: When a specific user stream dies, falls back to CurrentStream (first available track)

Audio routing

  • Multi-audio playback: Plays all audio tracks from the room simultaneously instead of just the paired one — all participants are heard regardless of which video is displayed
  • Additive discovery: New participants joining mid-session are picked up without disrupting existing audio sources
  • Rescan throttling: Audio rescans every 2s (AUDIO_RESCAN_INTERVAL_SECONDS), or immediately when a dead track is detected

Data messaging API

  • PublishData(topic, data) — Send messages to a topic (rate-limited to 10 msg/s, 16KB cap)
  • SubscribeToTopic(topic) — Subscribe to receive messages on a topic
  • ConsumeMessages(topic) — Consume buffered messages for a topic
  • UpdateMetadata(metadata) — Update local participant metadata (broadcast by LiveKit)
  • Handles non-protobuf data on the shared DataPipe gracefully (InvalidProtocolBufferException)

Stream metadata

  • Enriched getActiveVideoStreams response with displayName, speaking, trackName, width, height
  • Display name resolved from participant metadata JSON (displayName field), falling back to Name then Identity

Documentation

  • Updated docs/cast.md covering all new features

Test Instructions

Prerequisites

  • Deploy the template: Streaming Theatre to .zone/.org

Test Steps — Video routing

  1. Open the Admin tools smart item and start a DCL Cast
  2. Join the cast as a speaker (cast app) with camera on
  3. Verify the video screen streams your camera
  4. Have a second speaker join — verify video switches to whoever is talking (with ~1.5s hold)
  5. Stop talking — verify video stays on the last speaker (no flickering)

Test Steps — Presentation bot

  1. Start a cast session with a presentation bot connected
  2. Verify video locks onto the bot's track immediately
  3. Have other participants speak — verify video stays on the bot
  4. Disconnect the bot — verify video falls back to active speaker tracking

Test Steps — Audio

  1. Have multiple speakers join the cast
  2. Verify audio plays from all participants simultaneously
  3. Have a new participant join mid-session — verify their audio is picked up within ~2s
  4. Have a participant leave and rejoin — verify audio recovers

Test Steps — Data messaging

  1. From a scene, call SubscribeToTopic("test")
  2. From another participant, call PublishData("test", "hello")
  3. Call ConsumeMessages("test") — verify the message is returned with sender identity
  4. Verify rate limiting: publish >10 messages in 1s, confirm excess are dropped

Test Steps — Screen share recovery

  1. Join the cast and share your screen
  2. Select the screen option from the Showcase dropdown
  3. Stop sharing your screen
  4. Verify the video screen falls back to the first available stream (your camera)

Additional Testing Notes

  • Verify participant names display correctly for unauthenticated Cast users (castV2)
  • Verify no audio leaks (sources properly released) after stopping/restarting a stream
  • Verify video renders black when a track is muted

Quality Checklist

  • Changes have been tested locally
  • Documentation has been updated
  • Performance impact has been considered
  • For SDK features: Test scene is included

Code Review Reference

Please review our Code Review Standards before submitting.

Screenshots

image

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Mar 5, 2026

badge

Build failed! Check the logs to see what went wrong

@cyaiox cyaiox added the force-build Used to trigger a build on draft PR label Mar 5, 2026
@cyaiox cyaiox force-pushed the feat/enhance-livekit-track-selector branch from 4609f6b to 46fbe57 Compare March 6, 2026 19:40
@cyaiox cyaiox added force-build Used to trigger a build on draft PR and removed force-build Used to trigger a build on draft PR labels Mar 6, 2026
@cyaiox cyaiox force-pushed the feat/enhance-livekit-track-selector branch from 46fbe57 to dd905a9 Compare March 10, 2026 18:23
@cyaiox cyaiox added force-build Used to trigger a build on draft PR and removed force-build Used to trigger a build on draft PR labels Mar 10, 2026
@cyaiox
Copy link
Copy Markdown
Member Author

cyaiox commented Mar 10, 2026

@claude review

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Mar 10, 2026

Claude encountered an error —— View job


I'll analyze this and get back to you.

@Ludmilafantaniella
Copy link
Copy Markdown
Contributor

Tested! Functionality works as expected ✅

TC5-6 TC3-4-evi TC1-2-evi
7347-evi.mp4

⚠️ Side note — cosmetic issues found:

  • "Showcase List" button label wraps to two lines instead of rendering on a single line.
  • Showcase camera dropdown: neither option shows a checkmark by default, even though the camera is already active when Cast is activated. The checkmark only appears after manually selecting an option. Feels a bit counterintuitive visually.
  • Horizontal scrollbar visible in the camera dropdown (already flagged — dev will fix by adjusting dropdown width).

None of these block the functionality, just leaving them on record!

11.03.2026_11.08.39_REC.mp4
image

@cyaiox cyaiox force-pushed the feat/enhance-livekit-track-selector branch from 390bb2e to 66e96cc Compare March 11, 2026 23:24
@cyaiox cyaiox added force-build Used to trigger a build on draft PR and removed force-build Used to trigger a build on draft PR labels Mar 11, 2026
@cyaiox cyaiox force-pushed the feat/enhance-livekit-track-selector branch from 43e595a to ccdc25a Compare March 12, 2026 21:04
@cyaiox cyaiox added force-build Used to trigger a build on draft PR and removed force-build Used to trigger a build on draft PR labels Mar 12, 2026
@m3taphysics
Copy link
Copy Markdown
Collaborator

@claude review

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Mar 17, 2026

Claude finished @m3taphysics's task in 2m 53s —— View job


Code Review

  • Gather context
  • Read changed files
  • Analyze for bugs, security, and performance issues
  • Post review findings

Issues Found

🔴 Performance — EnsureAudioIsPlaying polls participants lock every frame when room has no audio (line 100)

When audioSources.Count == 0 (room has no audio tracks, or after initial open), anyDied is false and the guard condition !anyDied && audioSources.Count > 0 evaluates to false. The method falls through every frame to OpenAllAudioStreams(), which acquires lock(room.Participants) and iterates all participants. For any scene where audio is temporarily unavailable, this is a per-frame lock + allocation in a hot path. Fix this →

🔴 Audio disruption on every video recovery (line 71)

When a user stream's video dies, EnsureVideoIsPlaying() calls OpenMedia(LivekitAddress.CurrentStream()). OpenMedia always calls CloseCurrentStream() first, which calls ReleaseAllAudioSources() — pooling all audio MonoBehaviours — then immediately re-acquires them. This causes a brief audio dropout on every video fallback, even when audio streams are healthy. The video and audio reset paths need to be separated. Fix this →

🟡 Dead code — FindPairedAudio never called (line 255)

Private method with no callers. Violates the project's code standards: "Don't design for hypothetical future requirements." Remove until actually needed. Fix this →

🟡 Silent JsonException swallowing (line 39)

catch (JsonException) { } in ResolveDisplayName silently discards all metadata parse failures. Malformed participant metadata would silently fall back to Name/Identity with no trace in logs, making it impossible to diagnose in production. Log a warning via ReportHub. Fix this →

🟡 asCurrent values written but never read in CommsApiWrap.cs

asCurrent is used only as a null-check sentinel — the stored (identity, track) tuple is never read back. This is misleading; it should be a plain bool hasWrittenCurrent = false. Fix this →


REVIEW_RESULT: FAIL ❌

}
}

if (!anyDied && audioSources.Count > 0) return;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Performance bug — participants lock acquired every frame when room has no audio

When audioSources.Count == 0 (e.g. room has no audio tracks yet, or after Stop()), anyDied stays false, so !anyDied && audioSources.Count > 0 evaluates to false and the method always falls through to ReleaseAllAudioSources() + OpenAllAudioStreams(). OpenAllAudioStreams acquires lock(room.Participants) and iterates all participants — every frame, for the lifetime of the player when no audio is present.

Fix: track a separate flag for "attempted to open audio, found nothing".

Suggested change
if (!anyDied && audioSources.Count > 0) return;
if (!anyDied && audioSources.Count > 0) return;
if (audioSources.Count == 0 && !anyDied)
{
// No sources and nothing died — room has no audio tracks.
// Only re-check when the frame budget allows (or when first opened).
OpenAllAudioStreams();
return;
}

Simpler fix: add an audioInitialized bool that is set to true in OpenMedia/OpenAllAudioStreams, and only skip the redundant re-poll when audioInitialized && audioSources.Count == 0.

/// Finds the audio track paired to a specific video track from the same participant.
/// Available for future targeted audio scenarios.
/// </summary>
private Weak<AudioStream> FindPairedAudio(string identity, string videoSid)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dead code — FindPairedAudio is never called

This private method has no callers. Per the project's code standards: "Don't create helpers, utilities, or abstractions for one-time operations. Don't design for hypothetical future requirements."

The docs note it's reserved for "future targeted audio scenarios", but shipping dead code adds maintenance burden and cognitive noise. Remove it until it's actually needed.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept it in case it will be required in the future, as livekit sends Video and Audio in separate tracks, this utility helps to match the participant Video with their respective Audio.

// If a specific user stream died, fallback to current-stream (first available track)
if (playingAddress.Value.IsUserStream(out _))
{
OpenMedia(LivekitAddress.CurrentStream());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Audio disruption on every video fallback

OpenMedia(LivekitAddress.CurrentStream()) calls CloseCurrentStream() first, which calls ReleaseAllAudioSources() — releasing and re-pooling all audio MonoBehaviour objects — then immediately re-acquires them in OpenAllAudioStreams(). This causes a brief audio dropout on every video recovery cycle, even when the audio streams are perfectly healthy.

Consider splitting the video and audio reset paths so a dying video track doesn't interrupt audio that is still alive:

Suggested change
OpenMedia(LivekitAddress.CurrentStream());
// Video stream died — swap to first available track without resetting audio
if (playingAddress.Value.IsUserStream(out _))
{
currentVideoIdentity = null;
currentVideoStream = FirstVideoTrackingIdentity();
playingAddress = LivekitAddress.CurrentStream();
videoSwitchedAtTime = UnityEngine.Time.realtimeSinceStartup;
playerState = PlayerState.PLAYING;
return;
}

if (!string.IsNullOrEmpty(displayName))
return displayName;
}
catch (JsonException) { }
Copy link
Copy Markdown
Contributor

@claude claude Bot Mar 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Silent exception swallowing — invalid metadata is silently ignored

catch (JsonException) { } discards all parse failures without any trace. If a participant sends malformed metadata this will silently fall through to the Name/Identity fallback, making it very hard to diagnose in the field.

;

Suggested change
catch (JsonException) { }
catch (JsonException e) { ReportHub.LogWarning(ReportCategory.COMMS, $"Failed to parse participant metadata for display name: {e.Message}"); }

@cyaiox cyaiox marked this pull request as ready for review March 17, 2026 12:49
@cyaiox cyaiox requested review from a team as code owners March 17, 2026 12:49
@cyaiox cyaiox force-pushed the feat/enhance-livekit-track-selector branch from 1366c4f to 125a030 Compare April 23, 2026 14:07
@cyaiox cyaiox requested a review from NickKhalow April 23, 2026 14:41
@NickKhalow
Copy link
Copy Markdown
Contributor

@NickKhalow - replying to #7347 (comment)

With new routing CommsData = 3, // CommsApi publish/subscribe topic-based data, I believe the catch is excessive. Let's remove it

the catch handles unexpected participants on the shared IDataPipe that send arbitrary non-protobuf bytes into scene data, this is purely a defensive fallback for unexpected senders.

Hey @cyaiox, I see the intention. It can be useful for prod, but I would exclude it for development for sure, otherwise that catch silently swallows the data that could be important for the context and traceability.

Copy link
Copy Markdown
Contributor

@NickKhalow NickKhalow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your updates! Please address the comments

/// internal depth stack corrupted. The allocation is negligible — this
/// runs only on error paths that already allocated an exception object.
/// </summary>
private void ResetWriter()
Copy link
Copy Markdown
Contributor

@NickKhalow NickKhalow Apr 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got the idea, makes a sense. Would be great to enforce the invariant in the code, because there is always a chance that new method may forget to reset the JTW.

WDYT?

UPD: addressed

private PlayerState playerState;
private LivekitAddress? playingAddress;

private LivekitAddress? currentVideo;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

second LivekitAddress is redundant due the following:

FromIdentity is used only in LiveKitPlayer. LiveKitPlayer is NOT responsible for issuing the LiveKitAddress by design. It only executes the address like a command and never creates that internally

Please remove FromIdentity method. IsPresentationBotIdentity should be placed in LivekitAddress.New method, it will handle the union resolution and LivekitPlayer never needs to create a value.

Comment thread Explorer/Assets/DCL/SDKComponents/MediaStream/LivekitPlayer.cs Outdated
Comment thread Explorer/Assets/DCL/SDKComponents/MediaStream/LivekitPlayer.cs Outdated
}
catch (InvalidOperationException)
{
// Collection modified during snapshot; retry next frame.
Copy link
Copy Markdown
Contributor

@NickKhalow NickKhalow Apr 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be better to avoid such defensive programming in concurrent scenarios and solve the root cause in place. Would it be possible to address the issue here?

UPD: addressed

Comment thread Explorer/Assets/DCL/SDKComponents/MediaStream/LivekitPlayer.cs Outdated
…rack-selector

# Conflicts:
#	Explorer/Assets/DCL/Infrastructure/Utility/DecentralandUrls/DecentralandUrl.cs
#	Explorer/Assets/DCL/NetworkDefinitions/Browser/DecentralandUrlsSource.cs
Comment thread Explorer/Assets/DCL/SDKComponents/MediaStream/Systems/UpdateMediaPlayerSystem.cs Outdated
Comment thread Explorer/Packages/manifest.json Outdated
Comment thread Explorer/Assets/DCL/NetworkDefinitions/Browser/DecentralandUrlsSource.cs Outdated
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

force-build Used to trigger a build on draft PR new-dependency

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants