fix(library-identity-consumer): stringify lastVerifiedAt to defeat Drizzle's date-serializer override #976

Merged
jakebromberg merged 1 commit into main from library-identity-consumer-date-bind on May 21, 2026

@jakebromberg
Member

Summary

First prod run of library-identity-consumer on 2026-05-20 had 14,405 / 14,405 UPSERTs fail with zero library_identity rows landing. LML resolution was healthy (matched dry-run exactly); the failure was 100% writer-side.

Root cause is a Drizzle / postgres-js interaction documented at porsager/postgres#761 and present in node_modules/drizzle-orm/postgres-js/driver.js:19:

const transparentParser = (val) => val;
for (const type of ['1184', '1082', '1083', '1114', '1182', '1185', '1115', '1231']) {
  client.options.parsers[type] = transparentParser;
  client.options.serializers[type] = transparentParser;  // ← also rebinds outbound serializer
}

The override is intended for inbound parsers (so reads return raw text for Drizzle to convert itself), but the same loop also rebinds the outbound serializer for OID 1184 (timestamptz). The native postgres-js serializer would call .toISOString() on Date objects; the transparent passthrough returns the Date unchanged. The failure then propagates through postgres-js as Bind() → b.str(date) → Buffer.byteLength(date) → ERR_INVALID_ARG_TYPE: The "string" argument must be of type string or an instance of Buffer or ArrayBuffer. Received an instance of Date.
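
The mechanism can be reproduced without a database. The sketch below is a simplified stand-in, not the real postgres-js or Drizzle source: `encodeParam` and `tryEncode` are hypothetical names, and the only real API exercised is Node's `Buffer.byteLength`. It assumes, as described above, that the native serializer stringifies Dates and that the parameter's byte length is computed after serialization.

```javascript
// Simplified stand-ins for illustration; not the real library internals.
const TIMESTAMPTZ_OID = '1184';

// Roughly what the native date serializer does: Date -> ISO 8601 string.
const nativeSerializer = (val) =>
  (val instanceof Date ? val : new Date(val)).toISOString();

// The passthrough that the Drizzle loop installs in its place.
const transparentSerializer = (val) => val;

// Hypothetical stand-in for writing one Bind parameter: serialize the value,
// then measure its byte length as the wire encoder would.
function encodeParam(serializers, oid, value) {
  const serialized = serializers[oid](value);
  return Buffer.byteLength(serialized); // throws unless given a string or buffer
}

// Capture success or the Node error code instead of crashing the script.
function tryEncode(serializers, value) {
  try {
    return { ok: true, bytes: encodeParam(serializers, TIMESTAMPTZ_OID, value) };
  } catch (err) {
    return { ok: false, code: err.code };
  }
}

const when = new Date('2026-05-20T00:00:00.000Z');
const withNative = tryEncode({ [TIMESTAMPTZ_OID]: nativeSerializer }, when);
const withPassthrough = tryEncode({ [TIMESTAMPTZ_OID]: transparentSerializer }, when);

console.log(withNative);      // { ok: true, bytes: 24 }
console.log(withPassthrough); // { ok: false, code: 'ERR_INVALID_ARG_TYPE' }
```

With the native serializer the Date becomes a 24-byte ISO string; with the passthrough the Date itself reaches `Buffer.byteLength`, which rejects it.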

Why no other job hit this

library-canonical-entity-backfill, flowsheet-metadata-backfill, library-artwork-url-backfill, and flowsheet-dj-name-backfill all sidestep the bug by either using now() in SQL or by going through db.insert(...).values({...}) (Drizzle's column-aware encoder path, which doesn't hit the transparent serializer). The consumer's writeSingleArtist is the only writer in the repo that passes a JS Date as ${value} inside a raw sql`...` template.

The fix

new Date() → new Date().toISOString(). A pre-stringified ISO 8601 string passes through the transparent serializer unchanged, and PG parses the literal as a timestamptz happily.

Diff is +11/-1 in a single file, with a long comment block explaining the citation so future readers don't need to re-walk this investigation.
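
As a rough illustration of why pre-stringifying works, the sketch below runs an ISO string through the same kind of passthrough. `toTimestamptzParam` is a hypothetical helper introduced here for illustration only; the actual change in this PR is the inline `.toISOString()` call, not a helper.

```javascript
// Hypothetical helper (not part of this PR): normalize a value before it is
// interpolated as a timestamptz parameter in a raw SQL template.
function toTimestamptzParam(value) {
  if (value instanceof Date) {
    // toISOString() throws a RangeError for an invalid Date, which is the
    // behavior we want: fail before the value reaches the driver.
    return value.toISOString();
  }
  if (typeof value === 'string') {
    return value; // assumed to already be a timestamp literal PG can parse
  }
  throw new TypeError('Expected a Date or an ISO 8601 string');
}

// Same passthrough shape as the serializer Drizzle installs.
const passthroughSerializer = (val) => val;

const lastVerifiedAt = toTimestamptzParam(new Date('2026-05-21T12:34:56.789Z'));
const serializedParam = passthroughSerializer(lastVerifiedAt);
const paramByteLength = Buffer.byteLength(serializedParam);

console.log(serializedParam);  // 2026-05-21T12:34:56.789Z
console.log(paramByteLength);  // 24
```

Because the value is already a string when it reaches the passthrough, `Buffer.byteLength` accepts it and the driver never sees a Date.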

Test plan

  • Direct repro of the failure via a Node script using the same drizzle() + tx.execute(sql`...`) code path as the writer: fails with the documented ERR_INVALID_ARG_TYPE.
  • End-to-end verification of the fix against prod: both the library_identity_source and library_identity UPSERTs land cleanly when lastVerifiedAt is pre-stringified.
  • npx tsc --noEmit -p jobs/library-identity-consumer/tsconfig.json clean.

Deploy + re-run

After this lands:

  1. Deploy via Manual Build & Deploy target=library-identity-consumer.
  2. Pull the new image on EC2: docker pull 203767826763.dkr.ecr.us-east-1.amazonaws.com/library-identity-consumer:<new tag>.
  3. Re-run the consumer (no DRY_RUN) — expected: 14,405 `rows_resolved`, ~30K source rows, 15,376 `rows_unresolved`, 0 `writer_error`.
  4. Confirm library_identity is populated for the 14,405 resolved rows.

Recommend also shipping the companion observability fix (PR #975) so future writer failures actually surface the PG code/constraint in logs instead of just "Failed query".

Related

…izzle's date-serializer override

The first prod run of library-identity-consumer on 2026-05-20 had every UPSERT (14,405/14,405) fail at the writer with no library_identity rows landing. Diagnosis traced the failure to Drizzle's drizzle() factory in node_modules/drizzle-orm/postgres-js/driver.js:19, which overrides postgres-js's default date serializer (OIDs 1184/1082/1083/1114/1182/1185/1115/1231) with a transparent passthrough.

The override cites porsager/postgres#761 and is intended to defeat postgres-js's *inbound* date parser (so reads return raw text for Drizzle to convert itself). But the same loop also rebinds the *outbound* serializer: a JS Date passed via `${...}` in a sql`...` template arrives at postgres-js's Bind() as a Date, the transparent serializer returns it unchanged, and Buffer.byteLength() inside b.str() throws ERR_INVALID_ARG_TYPE.

Other backfill jobs in this repo (library-canonical-entity-backfill, etc.) sidestep the bug by using NOW() in SQL rather than passing a JS Date as a parameter; this writer was the only one passing a Date through the template. Pre-stringifying with .toISOString() defeats the override (the resulting string passes through the transparent serializer unchanged) and gives PG a parseable timestamptz literal.

End-to-end verified against prod with the exact writer code path: both library_identity_source and library_identity INSERTs now succeed.
@jakebromberg
Member Author

Code review

No issues found. Checked for bugs and CLAUDE.md compliance.

  • Fix is minimal (+11/-1) and the root-cause analysis matches the postgres-js / Drizzle behavior cited in the body.
  • new Date().toISOString() is the same pattern already used elsewhere in jobs/ for ISO-8601-into-raw-SQL paths.
  • The two other new Date() callsites in jobs (flowsheet-etl/job.ts, library-etl/job.ts) go through db.insert(...).values({...}) and so don't hit the transparent serializer — consistent with the PR body's diagnosis.
  • No other ${...Date...} interpolations in sql`...` raw templates across jobs/, apps/, shared/; the consumer's writer was the only site.

@jakebromberg jakebromberg merged commit 8afd086 into main May 21, 2026
5 checks passed