fix(cloud): surface worker param-update errors via WebRTC data channel by livepeer-tessa · Pull Request #725 · daydreamlive/scope

livepeer-tessa · 2026-03-20T18:24:08Z

Problem

Closes #724

When a cloud worker fails to fetch ip_adapter_style_image_url (or any other URL-type param) during a parameter update, it sends an error response over the WebRTC data channel:

{"last_error": "Error updating params: Request timeout while fetching image from URL"}

The on_dc_message handler in CloudWebRTCClient only logged this message at DEBUG level, so the error was effectively dropped. No Kafka event was published and the UI received no notification. The pipeline kept running with a stale or missing style image.

Fix

cloud_webrtc_client.py

Parse incoming data channel messages as JSON
If an error or last_error field is present, call cloud_manager._on_worker_error(error_text, raw_payload)
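A minimal sketch of the parsing step described above, not the code from this PR. The helper name extract_worker_error and the on_worker_error parameter are hypothetical; only the error / last_error field names and the example payload come from the description.

```python
import json
from typing import Callable, Optional


def extract_worker_error(message: str) -> Optional[tuple[str, dict]]:
    """Return (error_text, payload) if the message reports an error, else None."""
    try:
        payload = json.loads(message)
    except (TypeError, ValueError):
        # Not JSON (or not a string/bytes value): nothing to surface.
        return None
    if not isinstance(payload, dict):
        return None
    # Either field name may carry the error text.
    error_text = payload.get("error") or payload.get("last_error")
    if not error_text:
        return None
    return str(error_text), payload


def on_dc_message(message: str, on_worker_error: Callable[[str, dict], None]) -> None:
    """Hypothetical data channel handler: forward errors, ignore everything else."""
    found = extract_worker_error(message)
    if found is not None:
        error_text, raw_payload = found
        on_worker_error(error_text, raw_payload)
```

Keeping the parsing in a small pure function makes the non-JSON and non-error paths easy to test without a live data channel.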

cloud_connection.py

Add _worker_error_callbacks list + add_worker_error_callback / remove_worker_error_callback helpers
Add _on_worker_error(error_message, raw_payload) which:
- Logs at WARNING
- Publishes a Kafka error event (error_type=cloud_worker_param_update_error)
- Notifies all registered callbacks
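The callback registry above could look roughly like the sketch below. The class name WorkerErrorHub and the publish_event hook are assumptions standing in for the real connection manager and its Kafka publisher; the event fields other than error_type are also illustrative.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger(__name__)

WorkerErrorCallback = Callable[[str, dict], None]


class WorkerErrorHub:
    """Hypothetical stand-in for the cloud connection manager."""

    def __init__(self, publish_event: Callable[[dict], Any]) -> None:
        # Assumed hook that publishes an event dict (e.g. to Kafka).
        self._publish_event = publish_event
        self._worker_error_callbacks: list[WorkerErrorCallback] = []

    def add_worker_error_callback(self, callback: WorkerErrorCallback) -> None:
        if callback not in self._worker_error_callbacks:
            self._worker_error_callbacks.append(callback)

    def remove_worker_error_callback(self, callback: WorkerErrorCallback) -> None:
        if callback in self._worker_error_callbacks:
            self._worker_error_callbacks.remove(callback)

    def _on_worker_error(self, error_message: str, raw_payload: dict) -> None:
        logger.warning("Cloud worker error: %s", error_message)
        self._publish_event(
            {
                "error_type": "cloud_worker_param_update_error",
                "message": error_message,  # field name is an assumption
                "payload": raw_payload,  # field name is an assumption
            }
        )
        # Iterate over a copy so a callback may deregister itself safely.
        for callback in list(self._worker_error_callbacks):
            try:
                callback(error_message, raw_payload)
            except Exception:
                logger.exception("Worker error callback failed")
```

Catching exceptions per callback is a design choice in this sketch: one failing listener should not prevent the others from being notified.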

cloud_track.py

After WebRTC starts, register _on_worker_error callback (when a notification_callback is set)
Forwards errors to notification_callback as type=worker_param_update_error for frontend visibility
Deregisters on stop() to avoid memory leaks / stale references
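The register / forward / deregister lifecycle above can be sketched as follows. FakeManager and TrackSketch are hypothetical classes, and the "message" key in the notification is an assumption; only the type value worker_param_update_error and the helper method names come from the description.

```python
from typing import Callable, Optional


class FakeManager:
    """Hypothetical manager exposing only the two helpers named in the PR."""

    def __init__(self) -> None:
        self.callbacks: list = []

    def add_worker_error_callback(self, callback) -> None:
        self.callbacks.append(callback)

    def remove_worker_error_callback(self, callback) -> None:
        if callback in self.callbacks:
            self.callbacks.remove(callback)


class TrackSketch:
    """Illustrative track that forwards worker errors to a notification callback."""

    def __init__(
        self,
        manager: FakeManager,
        notification_callback: Optional[Callable[[dict], None]] = None,
    ) -> None:
        self._manager = manager
        self._notification_callback = notification_callback
        self._registered = False

    def start(self) -> None:
        # The real track would start WebRTC first; omitted in this sketch.
        if self._notification_callback is not None:
            self._manager.add_worker_error_callback(self._on_worker_error)
            self._registered = True

    def _on_worker_error(self, error_message: str, raw_payload: dict) -> None:
        if self._notification_callback is not None:
            self._notification_callback(
                {"type": "worker_param_update_error", "message": error_message}
            )

    def stop(self) -> None:
        # Deregister so the manager does not keep a reference to a stopped track.
        if self._registered:
            self._manager.remove_worker_error_callback(self._on_worker_error)
            self._registered = False
```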

Testing

All 335 existing tests pass (pytest tests/ -x -q).

Notes

This PR does not add retry logic for URL fetches. That would require changes on the fal.ai worker side (tracked in #724). It ensures that the error is, at minimum:

Visible in logs at WARNING level (not silently dropped)
Published as a Kafka event for monitoring
Surfaced to the frontend via the existing notification_callback chain

…ly outputs Signed-off-by: Rafal Leszko <rafal@livepeer.org>

…eprocessVideoBlock

On the first chunk (current_start_frame == 0), target_num_frames is num_frame_per_block * vae_temporal_downsample_factor + 1 (e.g. 13 for default config). PreprocessVideoBlock already resamples 'video' and 'vace_input_frames' to this count, but 'vace_input_masks' was never adjusted. When masks arrive from a queue or client parameter they have the base chunk size (e.g. 12 frames), causing VaceEncodingBlock to raise:

ValueError: vace_input_masks shape mismatch: expected [B, 1, 13, ...] got [B, 1, 12, ...]

Fix: add vace_input_masks to PreprocessVideoBlock inputs/outputs and resample its temporal dimension to target_num_frames whenever it does not already match, using the same linear-interpolation index strategy used for video/vace_input_frames.

Fixes #721

Signed-off-by: livepeer-robot <robot@livepeer.org>
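To illustrate the frame-count arithmetic and the index-based resampling described in this commit message, here is a small pure-Python sketch. It is not the PreprocessVideoBlock code: the function names are hypothetical, the values 3 and 4 are assumed only because they give the quoted count of 13, and rounding to the nearest source index is an assumption (the real block may truncate instead).

```python
def target_frame_count(num_frame_per_block: int, vae_temporal_downsample_factor: int) -> int:
    """First-chunk frame count: one extra frame on top of the base chunk size."""
    return num_frame_per_block * vae_temporal_downsample_factor + 1


def resample_indices(num_frames: int, target_num_frames: int) -> list[int]:
    """Evenly spaced source indices spanning 0 .. num_frames - 1 (round half up)."""
    if num_frames < 1 or target_num_frames < 1:
        raise ValueError("frame counts must be positive")
    if target_num_frames == 1:
        return [0]
    span = target_num_frames - 1
    # Integer arithmetic avoids floating-point surprises at exact halves.
    return [(2 * i * (num_frames - 1) + span) // (2 * span) for i in range(target_num_frames)]


def resample_temporal(frames: list, target_num_frames: int) -> list:
    """Resample a sequence of per-frame items only when the length does not match."""
    if len(frames) == target_num_frames:
        return list(frames)
    return [frames[i] for i in resample_indices(len(frames), target_num_frames)]


# Assumed configuration values; the message only states the result (13).
target = target_frame_count(num_frame_per_block=3, vae_temporal_downsample_factor=4)
masks = [f"mask_{i}" for i in range(12)]  # base chunk size of 12 frames
resampled = resample_temporal(masks, target)
```

With these inputs, 12 masks are stretched to 13 by repeating one frame near the middle, so the temporal dimension matches what the encoder expects.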

Signed-off-by: livepeer-robot <robot@livepeer.org>

Previously the WebRTC data channel on_message handler silently dropped all messages from the cloud worker at debug log level, including error responses like:

{"last_error": "Error updating params: Request timeout while fetching image from URL"}

This meant IP adapter URL fetch failures (and similar param-update errors) were completely invisible to the UI and not published as Kafka events.

Changes:
- cloud_webrtc_client: parse incoming data channel messages as JSON; if 'error' or 'last_error' is present, call cloud_manager._on_worker_error
- cloud_connection: add _worker_error_callbacks list, add/remove helpers, and _on_worker_error() which logs at WARNING, publishes a Kafka error event (type=cloud_worker_param_update_error), and notifies callbacks
- cloud_track: register _on_worker_error callback after WebRTC starts; forwards errors to notification_callback as type=worker_param_update_error so the frontend can surface them; deregisters on stop() to avoid leaks

Fixes #724

Signed-off-by: livepeer-robot <robot@livepeer.org>

coderabbitai · 2026-03-20T18:24:16Z

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.



github-actions · 2026-03-20T18:32:24Z

🚀 fal.ai Preview Deployment


App ID	`daydream/scope-pr-725--preview`
WebSocket	`wss://fal.run/daydream/scope-pr-725--preview/ws`
Commit	`e88440e`

Testing

Connect to this preview deployment by running this on your branch:

uv run build && SCOPE_CLOUD_APP_ID="daydream/scope-pr-725--preview/ws" uv run daydream-scope

🧪 E2E tests will run automatically against this deployment.

github-actions · 2026-03-20T18:41:41Z

✅ E2E Tests passed


Status	passed
fal App	`daydream/scope-pr-725--preview`
Run	View logs

Test Artifacts

Check the workflow run for screenshots.

leszko and others added 4 commits March 20, 2026 08:19

fix(pipeline_processor): idle backoff and prepared state for audio-on…

c8559d0

…ly outputs Signed-off-by: Rafal Leszko <rafal@livepeer.org>

style: ruff format preprocess_video.py

cd07b5c

Signed-off-by: livepeer-robot <robot@livepeer.org>

livepeer-tessa requested review from emranemran and mjh1 March 20, 2026 18:24

livepeer-tessa wants to merge 4 commits into main from fix/724-ip-adapter-url-timeout-handling
