krea-realtime-video fails to load: tensor size mismatch (5120 vs 1536) at non-singleton dimension 1

## Summary

The `krea-realtime-video` pipeline fails to load on fal.ai workers with a tensor dimension mismatch error, which suggests a stale or incompatible model cache on the worker. The error message itself advises clearing `/data/models`, but that advice is buried in a raw tensor error and is not surfaced to the user in a useful way.

## Error Details

From Grafana fal.ai logs (2026-03-14 19:57 – 2026-03-15 06:09 UTC, 4 occurrences across 2 jobs):

```
scope.server.pipeline_manager - ERROR - [1ee1c374] Failed to load pipeline krea-realtime-video: The size of tensor a (5120) must match the size of tensor b (1536) at non-singleton dimension 1. If this error persists, consider removing the models directory '/data/models' and re-downloading models.
scope.server.pipeline_manager - ERROR - [1ee1c374] Failed to load pipeline: krea-realtime-video
scope.server.pipeline_manager - ERROR - [1ee1c374] Some pipelines failed to load
```

Jobs affected:
- `f1d3920e-30ce-4da5-a596-82ea9dffbcc4` (1 occurrence, 2026-03-14 19:57 UTC)
- `4ed8373d-9714-4f3e-9db6-7c862a4be208` (3 occurrences, 2026-03-14 21:04–21:06 UTC)

App: `github_f1lhgmk5v76a0ev1w0u378by-scope-app--prod`

## Root Cause

The tensor dimensions `5120` and `1536` suggest a mismatch between the model checkpoint's weight shapes and what the current code expects — likely a stale cached model file on the fal.ai worker that was built against an older version of `krea-realtime-video`, or a partial/corrupt download.

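For reference, 5120 = 5 × 1024 and 1536 = 3 × 512. Both are plausible hidden sizes, and 5120 / 1536 = 10/3 is not an integer, so the two are not simple multiples of each other. This is consistent with a projection layer or embedding whose width changed between model versions, rather than with a batch or sequence-length mismatch.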
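One way to confirm this hypothesis would be to compare the parameter shapes stored in the cached checkpoint with the shapes the current code builds. The sketch below works on plain name-to-shape dictionaries. The parameter names and shapes are hypothetical, chosen only to mirror the 5120 vs 1536 error; with PyTorch, such a mapping could be collected with `{k: tuple(v.shape) for k, v in model.state_dict().items()}`.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def diff_shapes(expected: Dict[str, Shape], checkpoint: Dict[str, Shape]) -> List[str]:
    """Return one human-readable line per parameter whose shape disagrees."""
    problems = []
    for name, want in sorted(expected.items()):
        got = checkpoint.get(name)
        if got is None:
            problems.append(f"{name}: missing from checkpoint (expected {want})")
        elif got != want:
            problems.append(f"{name}: checkpoint has {got}, code expects {want}")
    return problems


# Hypothetical names and shapes, not taken from the real model.
expected = {"proj.weight": (5120, 5120), "proj.bias": (5120,)}
cached = {"proj.weight": (1536, 1536), "proj.bias": (1536,)}

report = diff_shapes(expected, cached)
for line in report:
    print(line)
```

If every mismatched parameter shows the same pair of widths, that points to a checkpoint from a different model version or variant rather than a partial or corrupt download.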

## Expected Behaviour

- The pipeline manager should detect this class of checkpoint-incompatibility error and automatically clear the stale model cache + re-download, rather than failing repeatedly
- Alternatively, a version hash / manifest check on startup would catch the mismatch before attempting to load
- If manual intervention is required, the error message should surface to the user as "model cache incompatible, please re-download" rather than a raw tensor error
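A minimal sketch of the first and third points, assuming the pipeline manager can be handed a `load` callable and a `download` callable. The names `load_with_cache_recovery` and `ModelCacheIncompatibleError` are hypothetical and do not exist in the codebase; only the error text being matched comes from the logs above.

```python
import re
import shutil
from pathlib import Path
from typing import Callable, TypeVar

T = TypeVar("T")

# Matches the tensor size mismatch text quoted in the logs above.
_SHAPE_MISMATCH = re.compile(
    r"size of tensor a \(\d+\) must match the size of tensor b \(\d+\)"
)


class ModelCacheIncompatibleError(RuntimeError):
    """Hypothetical user-facing error for an unusable model cache."""


def load_with_cache_recovery(
    pipeline_id: str,
    load: Callable[[], T],
    download: Callable[[], None],
    models_dir: Path,
) -> T:
    """Load a pipeline; on a shape mismatch, clear the cache and retry once."""
    try:
        return load()
    except RuntimeError as exc:
        if not _SHAPE_MISMATCH.search(str(exc)):
            raise  # unrelated failure: do not touch the cache

    # The first attempt hit a shape mismatch: drop the cache and re-download.
    shutil.rmtree(models_dir, ignore_errors=True)
    download()

    try:
        return load()
    except RuntimeError as exc:
        if _SHAPE_MISMATCH.search(str(exc)):
            # Still incompatible after a fresh download: report it clearly.
            raise ModelCacheIncompatibleError(
                f"Model cache for {pipeline_id} is incompatible, please re-download"
            ) from exc
        raise
```

Retrying exactly once avoids a download loop when the freshly downloaded weights are themselves incompatible with the code. This sketch removes the whole `models_dir`, matching the advice in the log message; a real implementation would probably remove only the affected pipeline's files.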

## Notes

- Different from #669 (Float8Tensor dispatch error in krea)
- Different from #602 (optical flow tensor mismatch) and #601 (VaceEncoding resolution mismatch)

