Summary
The krea-realtime-video pipeline fails to load on fal.ai workers with a tensor dimension mismatch error, suggesting a stale or incompatible model cache on the worker. The error message itself advises clearing /data/models, but that guidance is not surfaced to the user in a useful way.
Error Details
From Grafana fal.ai logs (2026-03-14 19:57 – 2026-03-15 06:09 UTC, 4 occurrences across 2 jobs):
scope.server.pipeline_manager - ERROR - [1ee1c374] Failed to load pipeline krea-realtime-video: The size of tensor a (5120) must match the size of tensor b (1536) at non-singleton dimension 1. If this error persists, consider removing the models directory '/data/models' and re-downloading models.
scope.server.pipeline_manager - ERROR - [1ee1c374] Failed to load pipeline: krea-realtime-video
scope.server.pipeline_manager - ERROR - [1ee1c374] Some pipelines failed to load
Jobs affected:
f1d3920e-30ce-4da5-a596-82ea9dffbcc4 (1 occurrence, 2026-03-14 19:57 UTC)
4ed8373d-9714-4f3e-9db6-7c862a4be208 (3 occurrences, 2026-03-14 21:04–21:06 UTC)
App: github_f1lhgmk5v76a0ev1w0u378by-scope-app--prod
Root Cause
The tensor dimensions 5120 and 1536 indicate a mismatch between the model checkpoint's weight shapes and what the current code expects: most likely a stale cached model file on the fal.ai worker that was built against an older version of krea-realtime-video, or a partial/corrupt download.
5120 factors as 5 × 1024 and 1536 as 3 × 512; these factorizations suggest the mismatch is in a projection layer or embedding whose hidden size changed between model versions.
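One way to confirm the stale-cache theory would be to compare the cached checkpoint's weight shapes against what the current code expects before attempting a load. A minimal sketch in pure Python; the shape dicts and key names here are hypothetical stand-ins (a real check would iterate tensors from the actual checkpoint, e.g. via torch or safetensors):

```python
# Sketch: detect checkpoint/code shape mismatches before loading.
# Both dicts below are hypothetical stand-ins for real state-dict metadata.

def find_shape_mismatches(cached_shapes, expected_shapes):
    """Return (key, cached, expected) for every weight whose shape differs."""
    mismatches = []
    for key, expected in expected_shapes.items():
        cached = cached_shapes.get(key)
        if cached != expected:
            mismatches.append((key, cached, expected))
    return mismatches

# Example mirroring the error above: a weight cached with hidden size 1536
# while the current code expects 5120.
cached = {"proj.weight": (1536, 1536)}
expected = {"proj.weight": (5120, 1536)}

for key, got, want in find_shape_mismatches(cached, expected):
    print(f"{key}: cached {got}, expected {want} -> cache is likely stale")
```

A check like this, run against a shipped manifest of expected shapes, would turn the raw tensor error into an actionable "cache is stale" diagnosis.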
Expected Behaviour
- The pipeline manager should detect this class of checkpoint-incompatibility error and automatically clear the stale model cache + re-download, rather than failing repeatedly
- Alternatively, a version hash / manifest check on startup would catch the mismatch before attempting to load
- If manual intervention is required, the error should be surfaced to the user as "model cache incompatible, please re-download" rather than as a raw tensor error
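The first behaviour above could be sketched as a wrapper around the pipeline load: catch the shape-mismatch error, clear the cache once, re-download, and retry. This is a rough illustration only; load_pipeline and redownload are hypothetical callables the real pipeline manager would supply, not part of the scope codebase:

```python
import shutil
from pathlib import Path

MODELS_DIR = Path("/data/models")  # cache path named in the error message

def load_with_cache_recovery(load_pipeline, redownload, models_dir=MODELS_DIR):
    """Attempt a pipeline load; on a tensor-shape mismatch, clear the model
    cache, re-download, and retry exactly once.

    load_pipeline and redownload are hypothetical callables that the real
    pipeline manager would provide.
    """
    try:
        return load_pipeline()
    except RuntimeError as err:
        # Only treat checkpoint/shape incompatibilities as cache problems;
        # any other RuntimeError is re-raised untouched.
        if "must match the size of tensor" not in str(err):
            raise
        shutil.rmtree(models_dir, ignore_errors=True)
        redownload()
        return load_pipeline()
```

Matching on the error message string is brittle; a dedicated exception type (or the manifest check described above) would be the more robust trigger, but the retry-once shape keeps repeated failures like the 3 occurrences in job 4ed8373d from recurring.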
Notes