Fix error handling gaps in recent releases #3639
Lets us ship a patch (e.g. 4.4.6) from a release/4.4.x branch without
including unreleased work merged into main, and without the patch
incorrectly clobbering floating tags.
The release-pipeline pieces this touches, and how each behaves now:

| Piece | Behavior now |
| --- | --- |
| npm dist-tag `latest` | applied if version > current latest, otherwise `release-<M.m>` |
| Docker `:v4-beta` | same gate (highest version only) |
| Docker `:release-X.Y` | new per-line floating tag, always set on a semver build |
| GitHub release `--latest=true\|false` | set explicitly (no auto-detect) |
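To make the table concrete, here is a minimal sketch of the tag selection. `floatingTags` is a hypothetical helper, not code from this PR; it assumes the single `is_latest` gate computed below:

```ts
// Hypothetical helper (not from this PR): which floating tags move for a
// given semver build, given the is_latest gate described below.
function floatingTags(version: string, isLatest: boolean): string[] {
  const [major, minor] = version.split(".");
  const tags = [`release-${major}.${minor}`]; // always set on a semver build
  if (isLatest) {
    tags.push("latest", "v4-beta"); // only the highest version takes these
  }
  return tags;
}

// floatingTags("4.4.6", false) => ["release-4.4"]
// floatingTags("4.5.0", true)  => ["release-4.5", "latest", "v4-beta"]
```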
How the gate is computed: release.yml's 'Compare new version to current latest' step queries npm view @trigger.dev/sdk dist-tags.latest, compares via sort -V, and sets is_latest=true|false, which drives every floating tag.
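The actual step is shell (`npm view` piped through `sort -V`); a minimal TypeScript sketch of the same comparison, assuming the `semver` package:

```ts
import semver from "semver";

// Mirrors the shell gate: is_latest is true only when the version being
// released is strictly greater than whatever `latest` currently points at.
export function isLatest(newVersion: string, currentLatest: string): boolean {
  return semver.gt(newVersion, currentLatest);
}
```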
Triggers / refs:
- pull_request: branches [main, release/**]
- if-conditions allow head.ref starting with 'changeset-release/'
- workflow_dispatch ref must be reachable from main OR a release/* branch
- changesets-pr.yml fires on push to release/** too; the PR-enhance step
discovers the source branch dynamically (no more hardcoded changeset-release/main), as sketched below
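A minimal sketch of that discovery, assuming the changesets convention of naming the PR branch `changeset-release/<base>` (variable names invented; the real step lives in changesets-pr.yml):

```ts
// Derive the changeset-release/* source branch from whatever branch the
// workflow ran on, instead of hardcoding changeset-release/main.
const baseBranch = process.env.GITHUB_REF_NAME ?? "main"; // e.g. "release/4.4.x"
const sourceBranch = `changeset-release/${baseBranch}`;   // e.g. "changeset-release/release/4.4.x"
```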
Other changes:
- gh release create: drop --target main (the tag already carries the right commit)
- the dispatch-changelog payload includes is_latest so the marketing site
can render lagged-line releases differently
- enhance-release-pr.mjs prepends a Release prep header on release/*
branches showing the version, the current latest, and whether the PR will
take the latest dist-tag (a sketch of the header follows below)
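A sketch of what that header could look like, reusing the hypothetical `isLatest` gate from above (the exact wording is assumed, not taken from enhance-release-pr.mjs):

```ts
function releasePrepHeader(version: string, currentLatest: string): string {
  const takesLatest = isLatest(version, currentLatest) ? "yes" : "no";
  return [
    "## Release prep",
    `- Version: ${version}`,
    `- Current latest: ${currentLatest}`,
    `- Will take the \`latest\` dist-tag: ${takesLatest}`,
  ].join("\n");
}
```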
release-helm.yml unchanged — already creates as draft+prerelease so it
can't claim Latest. publish-worker.yml (coordinator/provider) unchanged
since those don't have a :v4-beta-equivalent floating tag.
Validated end-to-end in ericallam/pkgring-sandbox across both scenarios:
Scenario A (lagged hotfix): latest stays put, only release-X.Y moves
Scenario B (main has unreleased work, hotfix is highest): latest moves
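In terms of the hypothetical `isLatest` sketch above, with invented version numbers:

```ts
isLatest("4.4.6", "4.5.0"); // Scenario A: false, latest stays put, only release-4.4 moves
isLatest("4.4.6", "4.4.5"); // Scenario B: true, latest moves to the hotfix
```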
- enhance-release-pr.mjs: fix dead try/catch, use proper semver comparison
- release.yml: add retry loop for npm view to prevent silent failures incorrectly promoting a lagged hotfix to :latest
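The retry itself lives in release.yml shell; a TypeScript sketch of the equivalent logic, with the attempt count assumed:

```ts
import { execSync } from "node:child_process";

// Retry `npm view` so a transient registry failure surfaces as an error
// instead of an empty "current latest" that would make a lagged hotfix
// look like the highest version.
function currentLatestWithRetry(attempts = 3): string {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return execSync("npm view @trigger.dev/sdk dist-tags.latest", {
        encoding: "utf8",
      }).trim();
    } catch (error) {
      lastError = error;
    }
  }
  throw lastError;
}
```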
Hi @deepshekhardas, thanks for your interest in contributing! This project requires that pull request authors are vouched, and you are not in the list of vouched users. This PR will be closed automatically. See https://github.com/triggerdotdev/trigger.dev/blob/main/CONTRIBUTING.md for more details.
const sortedAiKeys = dropPriority.filter((k) => aiKeys.has(k));
🔴 applyTotalSizeBackstop fails to drop sub-keyed attributes, rendering the backstop ineffective
The applyTotalSizeBackstop function collects actual attribute keys into aiKeys via matchPrefix (lines 87-94), so sub-keyed attributes like "ai.prompt.messages.0" are correctly identified. However, sortedAiKeys on line 96 filters dropPriority entries by checking exact membership in aiKeys (dropPriority.filter((k) => aiKeys.has(k))). Since aiKeys contains actual attribute keys (e.g., "ai.prompt.messages.0") and NOT the prefix strings (e.g., "ai.prompt.messages"), the filter yields an empty array whenever attribute keys are sub-keyed. The subsequent delete loop on lines 98-101 then deletes nothing, and the backstop silently returns an oversized attributes map.
Concrete example showing the failure
Given attributes {"ai.prompt.messages.0": "<large>", "ai.prompt.messages.1": "<large>"} exceeding the total size limit:
aiKeys={"ai.prompt.messages.0", "ai.prompt.messages.1"}(matched via prefix"ai.prompt.messages")sortedAiKeys=dropPriority.filter(k => aiKeys.has(k))=[]— because"ai.prompt.messages"is NOT inaiKeys- Nothing is deleted, backstop returns oversized data
The truncateAttributes function (otlpAttributeLimits.ts:55-74) handles sub-keys correctly via matchPrefix, but applyTotalSizeBackstop does not.
Prompt for agents
The bug is in applyTotalSizeBackstop in apps/webapp/app/v3/otlpAttributeLimits.ts at line 96. The line `const sortedAiKeys = dropPriority.filter((k) => aiKeys.has(k))` filters drop priority prefix strings by checking exact membership in aiKeys, but aiKeys contains actual attribute keys (e.g. 'ai.prompt.messages.0'), not prefix strings (e.g. 'ai.prompt.messages'). When attributes are sub-keyed, this produces an empty array and nothing gets deleted.
The fix should iterate over the actual attribute keys in aiKeys (not the prefix strings in dropPriority), but still respect the priority ordering from dropPriority. One approach: sort the aiKeys based on which dropPriority prefix they match (using matchPrefix), then delete them in that priority order. For example:
// Sort the actual attribute keys by the dropPriority prefix each one matches.
const sortedAiKeys = [...aiKeys].sort((a, b) => {
  const aIdx = dropPriority.findIndex(p => matchPrefix(a, p));
  const bIdx = dropPriority.findIndex(p => matchPrefix(b, p));
  return aIdx - bIdx;
});
Then the for-of loop on the next line would iterate over actual attribute keys and delete them from result.
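A sketch of that follow-on loop; `result`, `limit`, and `totalSize` are assumed names standing in for the function's existing locals:

```ts
// With sortedAiKeys now holding actual attribute keys (ordered by the
// dropPriority prefix each matches), delete in priority order until the
// total size is back under the limit.
for (const key of sortedAiKeys) {
  if (totalSize(result) <= limit) break;
  delete result[key];
}
```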
| case "log": | ||
| return; |
🟡 "log" error recovery strategy silently swallows errors without logging
The handle method in createReplicationErrorRecovery does nothing when the strategy is "log" — it just returns without logging the error (apps/webapp/app/services/replicationErrorRecovery.server.ts:100-101). Both the "reconnect" strategy (line 58) and the "exit" strategy (line 88) call logger.error(...) before taking action. The "log" strategy should at minimum log the error, otherwise replication stream errors are silently swallowed when this strategy is selected, making it impossible to diagnose issues.
| case "log": | |
| return; | |
| case "log": | |
| logger.error("Replication stream error (log-only strategy)", { error }); | |
| return; |
import { initMollifierDrainerWorker } from "~/v3/mollifierDrainerWorker.server";

initMollifierDrainerWorker().catch((error) => {
  logger.error("Mollifier drainer initialization failed", { error });
});
🚩 Missing mollifierDrainerWorker.server module will break the webapp build
The new import at apps/webapp/app/entry.server.tsx:250 references ~/v3/mollifierDrainerWorker.server, but no file matching this path exists anywhere in the repository (verified via find and grep). This will cause a build failure. Not reported as a bug per review guidelines since it would be caught by the type checker / bundler, but the reviewer should be aware that this PR cannot build the webapp as-is.