Deploy wxyc_identity_match_* plpgsql functions (wiki §3.3.5) #806
jakebromberg wants to merge 3 commits into
#805)

Vendors the canonical artifacts from WXYC/wxyc-etl@v0.4.0 (`data/`) under `vendor/wxyc-etl/` and ships them as migration 0076. The migration sets up the `wxyc_unaccent` text-search dictionary, then inlines the canonical four-function SQL byte-for-byte (drizzle-kit applies plain SQL in a single transaction, so `\i` isn't an option). SHA-pinned in `wxyc-etl-pin.txt`.

Both `db` (dev profile) and `ci-db` (ci profile) in `dev_env/docker-compose.yml` now mount the rules + version files into `/usr/local/share/postgresql/tsearch_data/` so the dictionary creates cleanly on first migrate.

The integration spec at `tests/integration/wxyc-identity-match-functions.spec.js` exercises three layers: pin SHA freshness, migration-vs-canonical byte-equality, and a small canonical-artist smoke + idempotence on the live PG.

Column flip on `library_identity*` is deliberately out of scope here: it is gated on the E2-BS step-2 backfill PR (#663) per the ticket and the wiki §3.3.0 row 6. This migration ships the function definitions so the backfill window has them available.

The journal entry uses `when = previous + 1ms` per the hand-edit recipe in `docs/migrations.md`. Snapshot 0076 mirrors 0075's table/enum/etc state with new id/prevId UUIDs since no schema-level changes accompany the function deploy.
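As an illustration of the `when = previous + 1ms` recipe, here is a minimal sketch of deriving a hand-edited journal entry from the previous one. It assumes the journal follows drizzle-kit's usual `_journal.json` layout (`idx`, `version`, `when`, `tag`, `breakpoints`); the helper name `nextJournalEntry` and the sample values are hypothetical and not taken from this repository.

```javascript
// Hypothetical helper: build the next journal entry from the last one.
// Assumes drizzle-kit's journal shape: { entries: [{ idx, version, when, tag, breakpoints }] }.
function nextJournalEntry(journal, tag) {
  const prev = journal.entries[journal.entries.length - 1];
  return {
    idx: prev.idx + 1,            // next migration index
    version: prev.version,        // keep the same journal format version
    when: prev.when + 1,          // previous timestamp + 1 ms keeps ordering strict
    tag,                          // migration file name without the .sql suffix
    breakpoints: prev.breakpoints,
  };
}

// Sample data is made up for illustration only.
const sampleJournal = {
  entries: [
    { idx: 75, version: "7", when: 1700000000000, tag: "0075_example", breakpoints: true },
  ],
};
const sampleEntry = nextJournalEntry(sampleJournal, "0076_wxyc-identity-match-functions");
console.log(sampleEntry.idx, sampleEntry.when);
```

Deriving `when` from the previous entry rather than from the wall clock keeps the journal strictly ordered even when the entry is written by hand.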
Schema constraint shape report
No new constraints detected in this diff (uniqueIndex, .unique(), SET NOT NULL, CHECK, FK).
The CI 'Migration guards' check greps for any reference to library_identity* in migration SQL; my body comment mentioning the downstream #663 column flip tripped it. Add the documented opt-out comment as the first line of the migration. Refreeze applied-hashes.json with the new SHA since the migration body changed. Does NOT address the second dry-run failure (RDS managed PG can't load custom tsearch_data files) — that's tracked separately as a deployment-pattern concern.
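The guard behaviour described in this commit can be sketched roughly as follows. This is not the actual CI script: the opt-out marker text and the function name `guardMigration` are hypothetical, since the documented opt-out comment is not quoted in this PR. It only shows why a comment that merely mentions `library_identity*` trips a grep-style check, and how a first-line opt-out bypasses it.

```javascript
// Hypothetical opt-out marker; the real comment text is not given in this PR.
const OPT_OUT = "-- migration-guard: allow library_identity";
// A grep-style pattern: matches any identifier starting with library_identity.
const GUARDED = /library_identity\w*/;

function guardMigration(sql) {
  // Only the very first line may carry the opt-out.
  const firstLine = sql.split(/\r?\n/, 1)[0].trim();
  if (firstLine === OPT_OUT) return { ok: true, reason: "opted out" };
  // The check is textual, so SQL comments are matched like any other text.
  const hit = sql.match(GUARDED);
  return hit
    ? { ok: false, reason: `references ${hit[0]}` }
    : { ok: true, reason: "clean" };
}

const flagged = guardMigration("-- library_identity_key flips downstream\nSELECT 1;");
const allowed = guardMigration(OPT_OUT + "\n-- library_identity_key flips downstream\nSELECT 1;");
console.log(flagged, allowed);
```

Because the match is purely textual, the guard cannot tell a comment from a real column reference, which is why the opt-out has to be explicit.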
CI exposed an architectural blocker: converting to draft
The lint failure (cross-cache-identity precondition-guard regex matching my body-comment mention of […] The deeper failure is the […] Two problems exposed:
Proposed path forward
This PR is not feasible as designed on Backend's current managed-PG stack. Options:
Recommendation: Option B for the eventual BS#663 backfill, since Backend's role post-pivot is "thin writer" (LML composes identity, Backend stores verbatim). Server-side SQL functions are only needed for ad-hoc queries; the backfill itself is naturally a job-level computation. The other three sibling deploys (mb-cache#52, wikidata#38, discogs-etl#195) ship as-is on their self-managed PG instances and serve LML#280's needs without Backend's leg.

Converting this PR to draft pending a decision on the path forward.
Closing: work folds into BS#663 step 2
Per the review feedback and the post-pivot architecture review:
The sibling deploys WXYC/musicbrainz-cache#52, WXYC/wikidata-cache#38, WXYC/discogs-etl#195 ship on self-managed PG instances and serve LML#280's needs without Backend's leg. LML#280 is unblocked on WXYC/discogs-etl#195 merging. Closing.
Summary
Backend half of the four-cache function deploy. Mirrors WXYC/musicbrainz-cache#52 + WXYC/wikidata-cache#38; vendors canonical artifacts from WXYC/wxyc-etl@v0.4.0 (`data/`) under `vendor/wxyc-etl/` and ships them as migration 0076.

- `shared/database/src/migrations/0076_wxyc-identity-match-functions.sql`: extension + dictionary setup, then the canonical SQL inlined byte-for-byte.
- `vendor/wxyc-etl/{wxyc_unaccent.rules,.version,wxyc_identity_match_functions.sql}`: vendored verbatim from upstream. SHA-pinned in `wxyc-etl-pin.txt`.
- `dev_env/docker-compose.yml`: `db` (dev) and `ci-db` (ci) mount the rules + version files into `/usr/local/share/postgresql/tsearch_data/` so the dictionary creates cleanly on first migrate.
- `tests/integration/wxyc-identity-match-functions.spec.js`: three-layer check: SHA pin freshness, migration-vs-canonical byte-equality, and a small canonical-artist smoke + idempotence on the live PG.
- `docs/migrations.md`: adds a `@rule` block documenting the vendoring convention and the refresh procedure.
- Journal entry: `when = previous + 1ms` per the hand-edit recipe.

Scope
Function deploy only. Per the ticket and wiki §3.3.0 row 6, Backend's `library_identity*` column flip is downstream, gated on the E2-BS step-2 backfill PR (#663). This migration ships the function definitions so the backfill window has them available.

Closes #805.
Related: parent epic WXYC/wxyc-etl#73, prerequisite WXYC/wxyc-etl#113 (merged), sibling deploys WXYC/musicbrainz-cache#52, WXYC/wikidata-cache#38, WXYC/discogs-etl#194.
Test plan
- `node scripts/validate-migrations.mjs`: passes (75 entries, 2 historical warnings unchanged)
- `psql -f shared/database/src/migrations/0076_*.sql` → 8 functions created, smoke queries return expected normalization
- […] `ci-db` mounts pick up the rules; integration spec runs on the live container)
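The first two layers of the integration spec (pin freshness and byte-equality) need no database, so they can be sketched with in-memory buffers. This is a minimal sketch and not the spec itself: it assumes the pin records a hex SHA-256 of the vendored SQL file, which this PR does not confirm (the pin could equally be an upstream commit SHA), and the names `sha256Hex`, `inlinesCanonical`, and `checkVendoring` are hypothetical.

```javascript
const { createHash } = require("node:crypto");

// Assumption: the pin is a hex SHA-256 of the vendored file's bytes.
function sha256Hex(buf) {
  return createHash("sha256").update(buf).digest("hex");
}

// True when the migration contains the canonical SQL as one contiguous run of bytes.
function inlinesCanonical(migrationBuf, canonicalBuf) {
  return canonicalBuf.length > 0 && migrationBuf.includes(canonicalBuf);
}

// Returns a list of problems; an empty list means both layers pass.
function checkVendoring({ migration, canonical, pinnedSha }) {
  const problems = [];
  if (sha256Hex(canonical) !== pinnedSha.trim().toLowerCase()) {
    problems.push("pin is stale");
  }
  if (!inlinesCanonical(migration, canonical)) {
    problems.push("migration drifted from canonical SQL");
  }
  return problems;
}

// Made-up stand-ins for the vendored file and the migration body.
const canonical = Buffer.from("CREATE FUNCTION example_fn() RETURNS text AS $$ SELECT 'x' $$ LANGUAGE sql;\n");
const migration = Buffer.concat([Buffer.from("-- dictionary setup goes here\n"), canonical]);
console.log(checkVendoring({ migration, canonical, pinnedSha: sha256Hex(canonical) }));
```

In a real spec the buffers would come from `fs.readFileSync` on the vendored file, the migration, and the pin file; comparing raw bytes rather than decoded strings avoids false passes from newline or encoding normalization.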