Context
Precondition for #643 (and for fix surface 1 of #628). The bin-pick branch of addEntry denormalizes library.artwork_url onto the new flowsheet row at INSERT time, eliminating the metadata race for bin-picks across all client surfaces, but only for library rows that have artwork_url populated.
Per the precondition comment on #628 (issuecomment-4338014422), the production query quoted there reports a current population of 155/64163 rows (0.2%). So #643's denormalization SELECT is shipped (held in draft) but does almost nothing in practice today. This issue tracks the backfill that gives it teeth.
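For reference, the baseline figure quoted above is a plain ratio. The helper names below (populatedFraction, formatPercent) are hypothetical and only illustrate the arithmetic; the same calculation could produce the post-run number.

```typescript
// Hypothetical helpers, not taken from the repository.

// Fraction of library rows that already have artwork_url populated.
function populatedFraction(populated: number, total: number): number {
  if (total <= 0) {
    throw new RangeError("total must be positive");
  }
  return populated / total;
}

// Render a fraction as a percentage string with a fixed number of decimals.
function formatPercent(fraction: number, digits: number = 1): string {
  return `${(fraction * 100).toFixed(digits)}%`;
}

// Numbers from the production query quoted on #628.
const baseline = populatedFraction(155, 64163);
const rowsMissingArtwork = 64163 - 155;

console.log(formatPercent(baseline)); // 0.2%
console.log(formatPercent(baseline, 2)); // 0.24%
console.log(rowsMissingArtwork); // 64008
```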
Proposal
A one-shot backfill job under jobs/library-artwork-backfill/, modeled after jobs/flowsheet-dj-name-backfill/ (the canonical pattern; see Backend-Service CLAUDE.md for the rule that bulk DML is always a separate job, never inside a migration).
Approach:
For each library row with artwork_url IS NULL and a non-null artist_id + album_title, call LML's /lookup endpoint (same call path as metadata.service.ts:fetchMetadata).
On a single high-confidence match, set library.artwork_url = artwork.artwork_url.
Batched UPDATEs with BACKFILL_BATCH_SIZE (default 5000), synchronous_commit=off per the bulk-update playbook, idempotent (WHERE artwork_url IS NULL filter resumes naturally).
Phase A observability tags (repo, tool=library-artwork-backfill, step, run_id).
Cooperate with LML rate limits: the job's request volume dwarfs what LML sees from the live insert path, so it will need throttling.
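The approach above could be sketched roughly as follows. This is a minimal illustration, not the job's actual code: LibraryRow, LookupMatch, BackfillDeps, BackfillOptions, and runBackfill are hypothetical names, the confidence score and "exactly one confident match" rule are one reading of "single high-confidence match", and the real LML /lookup response shape may differ. The database and LML calls are injected so the loop can be exercised in dry-run mode without either. The id cursor is an added assumption: a row that gets no match still has artwork_url IS NULL, so without a cursor it would be re-fetched within the same run.

```typescript
// Sketch only: all names here are hypothetical, not from the repository.

interface LibraryRow {
  id: number;
  artistId: number;
  albumTitle: string;
}

// Assumed shape of one candidate returned by the lookup wrapper.
interface LookupMatch {
  artworkUrl: string | null;
  confidence: number; // assumed to be in the range 0..1
}

interface BackfillDeps {
  // Rows with artwork_url IS NULL, non-null artist/album, id > afterId, ordered by id.
  fetchBatch(afterId: number, limit: number): Promise<LibraryRow[]>;
  // Wraps the LML /lookup call for one library row.
  lookup(row: LibraryRow): Promise<LookupMatch[]>;
  // Expected to keep the "AND artwork_url IS NULL" guard so reruns stay idempotent.
  writeBatch(updates: Array<{ id: number; artworkUrl: string }>): Promise<void>;
  // Injected so tests can skip real waiting.
  sleep(ms: number): Promise<void>;
}

interface BackfillOptions {
  batchSize: number; // BACKFILL_BATCH_SIZE, default 5000 per the proposal
  minConfidence: number; // hypothetical threshold
  delayMsBetweenLookups: number; // crude throttle for LML
  dryRun: boolean;
}

interface BackfillStats {
  scanned: number;
  matched: number;
  written: number;
}

// Accept artwork only when exactly one candidate clears the threshold.
function pickArtwork(matches: LookupMatch[], minConfidence: number): string | null {
  const confident = matches.filter(
    (m) => m.confidence >= minConfidence && m.artworkUrl !== null,
  );
  return confident.length === 1 ? confident[0].artworkUrl : null;
}

async function runBackfill(deps: BackfillDeps, opts: BackfillOptions): Promise<BackfillStats> {
  const stats: BackfillStats = { scanned: 0, matched: 0, written: 0 };
  let afterId = 0;
  for (;;) {
    const rows = await deps.fetchBatch(afterId, opts.batchSize);
    if (rows.length === 0) break;

    const updates: Array<{ id: number; artworkUrl: string }> = [];
    for (const row of rows) {
      const artworkUrl = pickArtwork(await deps.lookup(row), opts.minConfidence);
      stats.scanned += 1;
      if (artworkUrl !== null) updates.push({ id: row.id, artworkUrl });
      await deps.sleep(opts.delayMsBetweenLookups); // throttle every lookup
    }

    stats.matched += updates.length;
    if (!opts.dryRun && updates.length > 0) {
      await deps.writeBatch(updates); // one batched UPDATE per fetched batch
      stats.written += updates.length;
    }
    afterId = rows[rows.length - 1].id; // advance past misses as well as hits
  }
  return stats;
}
```

With an assumed 200 ms pause per lookup, the at most 64,008 rows still missing artwork (64,163 minus 155) would need roughly 3.6 hours of pauses alone, before counting request latency. The actual delay should come from whatever limit LML enforces.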
Out of scope
Streaming-link backfill (spotify_url, apple_music_url, etc. on library). The current schema (shared/database/src/schema.ts:280-290) only has artwork_url on library; the other LML metadata fields land on flowsheet per the inline-metadata model. If we want to denormalize more fields onto library, that's a schema change worth a separate proposal.
Backfilling flowsheet.artwork_url from history (covered by #631, "[Epic] Historical metadata backfill for ~1.86M flowsheet rows").
Acceptance criteria
jobs/library-artwork-backfill/ exists, builds, runs in dry-run mode locally, and emits Phase A logs.
One-shot run on a representative slice (e.g. top-1k most-played albums) reaches a sane hit rate (≥50% on the heavy-rotation slice; the long tail will be lower).
The library.artwork_url non-null fraction is materially higher than the 0.2% baseline reported on #628 ("Re-emit liveFs SSE event after metadata enrichment UPDATE lands"). Quote the post-run number in the closing comment.
Cross-repo links
LML /lookup: the same endpoint the live insert path already uses.
jobs/flowsheet-dj-name-backfill/: reference implementation for the one-shot job pattern.