Skip to content

flowsheet-etl: post-import dj_name UPDATE crashes on large legacy_entry_id batches (400+ param ANY clause) #939

@jakebromberg

Description

@jakebromberg

Problem

After ingesting a batch of new flowsheet entries from tubafrenzy, jobs/flowsheet-etl/job.ts runs a post-import step that backfills flowsheet.dj_name for the just-inserted rows. The query shape is:

UPDATE wxyc_schema.flowsheet AS f
SET dj_name = COALESCE(u.dj_name, s.legacy_dj_name, u.name)
FROM wxyc_schema.shows AS s
LEFT JOIN auth_user AS u ON u.id = s.primary_dj_id
WHERE f.show_id = s.id
  AND f.dj_name IS NULL
  AND f.legacy_entry_id = ANY(($1, $2, $3, ..., $401))

When the batch is large (observed: 401 entries during the 2026-05-17 post-wedge backfill, ingesting tubafrenzy ids 2615446 → 2615846), the query fails. Inserts before this step committed fine; only the post-import dj_name backfill crashes, so data is intact but the new rows have dj_name = NULL until the next nightly run picks them up via flowsheet-metadata-backfill.

The crash shape — postgres-js gets a Failed query error with all 401 params dumped in the message — points at the ANY((...)) syntax. That parses as ANY(row_constructor), which expects exactly one row; the correct form for "match against any of these values" is f.legacy_entry_id = ANY($1::int[]) with a single array param, or f.legacy_entry_id IN ($1, $2, ...).

Where

  • jobs/flowsheet-etl/job.ts:76 — the resolveDjNames-style post-import UPDATE (search for the WHERE f.dj_name IS NULL AND f.legacy_entry_id = ANY(...) shape).
  • jobs/flowsheet-etl/job.ts:102 — uses ${legacyEntryIds} in a tagged template; that's the spot where drizzle constructs the ANY expression.

Reproducer

The 2026-05-17 manual ETL run (post-wedge) reproduces this. Any wedge >30 min where tubafrenzy receives >~50 entries will trigger it on recovery. Smaller batches (the normal 30-min cron) don't hit it because the batch size stays under whatever PG / postgres-js threshold is degrading.

Desired end state

Post-import dj_name UPDATE handles batches up to at least the size of a full wedge-recovery backfill (~1000 entries) without crashing.

Suggested approach

Switch from drizzle's auto-expanded ANY to an explicit array param:

// before
.where(and(
  isNull(flowsheet.dj_name),
  inArray(flowsheet.legacy_entry_id, legacyEntryIds),  // or sql`ANY(${legacyEntryIds})`
))

// after — pass as a single int[] param
.where(and(
  isNull(flowsheet.dj_name),
  sql`${flowsheet.legacy_entry_id} = ANY(${legacyEntryIds}::int[])`,
))

Or batch the UPDATE itself: process the 401 entries in chunks of, say, 50, so the query parameter count stays small. The flowsheet-dj-name-backfill job (jobs/flowsheet-dj-name-backfill/) already does this batching well and is the pattern reference.

Acceptance

  • Manual run of flowsheet-etl over a 500+ entry tubafrenzy backlog completes without crashing on the dj_name UPDATE.
  • Unit test that exercises the ANY clause with a large legacy_entry_id list (≥500 ids).
  • Inline comment cross-references feedback_check_upstream_direction (memory): we don't add a UNIQUE on legacy_entry_id, but we DO need to handle whatever batch sizes tubafrenzy's webhook backlog produces.

Related

  • project_bs_ec2_wedge_pattern (memory) — the 2026-05-17 wedge recovery is where this surfaced. The recovery recipe in that memory mentions the crash but doesn't fix it.
  • project_bulk_update_playbook (memory) / BS#934 — same WHERE shape, same per-row cost concerns; the playbook covers the bulk-UPDATE side but not the ANY-param-count side.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingstatus:readyActionable now — no upstream blockers

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions