Skip to content

Fix error handling gaps in recent releases#3640

Closed
deepshekhardas wants to merge 7 commits into
triggerdotdev:mainfrom
deepshekhardas:fix/error-handling-gaps-v2
Closed

Fix error handling gaps in recent releases#3640
deepshekhardas wants to merge 7 commits into
triggerdotdev:mainfrom
deepshekhardas:fix/error-handling-gaps-v2

Conversation

@deepshekhardas
Copy link
Copy Markdown

@deepshekhardas deepshekhardas commented May 16, 2026

Picked up a few error handling gaps while reviewing recent releases:

Mollifier drainer init was missing a catch block - if Redis is down at startup the whole app crashes. Now logs the error and continues.

Replication error recovery was checking the shutdown flag before scheduling the reconnect timeout, but not inside the callback. Could reconnect during graceful shutdown. Added the check inside.

DynamicFlushScheduler had a closure bug - itemCount was captured from the outer map loop instead of the flush function, so metrics got wrong counts. Moved it inside. Also extracted the OTel attribute truncation logic into its own module so it's testable.

ericallam and others added 7 commits May 11, 2026 10:08
Lets us ship a patch (e.g. 4.4.6) from a release/4.4.x branch without
including unreleased work merged into main, and without the patch
clobbering floating tags incorrectly.

The release-pipeline pieces this touches and how each behaves now:

  npm dist-tag        latest if version > current latest, else release-<M.m>
  Docker :v4-beta     same gate (highest version only)
  Docker :release-X.Y new per-line floating tag, always set on a semver build
  GitHub release      --latest=true|false set explicitly (no auto-detect)

How the gate is computed:
  release.yml's 'Compare new version to current latest' step queries
  npm view @trigger.dev/sdk dist-tags.latest, compares via sort -V,
  sets is_latest=true|false. Drives every floating tag.

Triggers / refs:
  - pull_request:branches[main, release/**]
  - if-conditions allow head.ref starting with 'changeset-release/'
  - workflow_dispatch ref must be reachable from main OR a release/* branch
  - changesets-pr.yml fires on push to release/** too; PR-enhance step
    discovers source branch dynamically (no more hardcoded changeset-release/main)

Other changes:
  - gh release create: drop --target main (tag carries right commit)
  - dispatch-changelog payload includes is_latest so the marketing site
    can render lagged-line releases differently
  - enhance-release-pr.mjs prepends a Release prep header on release/*
    branches showing version, current latest, and whether the PR will
    take the latest dist-tag

release-helm.yml unchanged — already creates as draft+prerelease so it
can't claim Latest. publish-worker.yml (coordinator/provider) unchanged
since those don't have a :v4-beta-equivalent floating tag.

Validated end-to-end in ericallam/pkgring-sandbox across both scenarios:
  Scenario A (lagged hotfix): latest stays put, only release-X.Y moves
  Scenario B (main has unreleased work, hotfix is highest): latest moves
- enhance-release-pr.mjs: fix dead try/catch, use proper semver comparison
- release.yml: add retry loop for npm view to prevent silent failures
  incorrectly promoting lagged hotfix to :latest
@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented May 16, 2026

⚠️ No Changeset found

Latest commit: fccc2fc

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@github-actions
Copy link
Copy Markdown
Contributor

Hi @deepshekhardas, thanks for your interest in contributing!

This project requires that pull request authors are vouched, and you are not in the list of vouched users.

This PR will be closed automatically. See https://github.com/triggerdotdev/trigger.dev/blob/main/CONTRIBUTING.md for more details.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 16, 2026

Review Change Stack

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 7c6ed784-e647-4fe1-95fb-1bdc2238bd20

📥 Commits

Reviewing files that changed from the base of the PR and between 05d3ab1 and fccc2fc.

📒 Files selected for processing (10)
  • .github/workflows/changesets-pr.yml
  • .github/workflows/publish-webapp.yml
  • .github/workflows/publish-worker-v4.yml
  • .github/workflows/publish.yml
  • .github/workflows/release.yml
  • apps/webapp/app/entry.server.tsx
  • apps/webapp/app/services/replicationErrorRecovery.server.ts
  • apps/webapp/app/v3/dynamicFlushScheduler.server.ts
  • apps/webapp/app/v3/otlpAttributeLimits.ts
  • scripts/enhance-release-pr.mjs

Walkthrough

This PR refactors the release and publishing workflow to support multiple release branches and conditionally publish Docker images. The release workflow now dynamically processes changeset-release/* branches and validates refs against both main and release branches. A new step compares the proposed version against current npm latest, emitting an is_latest flag that gates the global :v4-beta Docker tag across publish pipelines. The changesets PR enhancement script now derives release context and renders version comparison guidance. Additionally, new OTLP span attribute truncation utilities enforce size limits with AI-content overrides, a new replication error recovery service handles stream failures with configurable strategies, and supporting fixes address mollifier drainer initialization and dynamic flush scheduler logic.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
⚔️ Resolve merge conflicts
  • Resolve merge conflict in branch fix/error-handling-gaps-v2

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions github-actions Bot closed this May 16, 2026
Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 3 potential issues.

View 4 additional findings in Devin Review.

Open in Devin Review

}
}

const sortedAiKeys = dropPriority.filter((k) => aiKeys.has(k));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 applyTotalSizeBackstop fails to drop prefix-matched sub-key attributes

The backstop function collects attribute keys that match AI prefixes into aiKeys (line 87-94) using matchPrefix, which handles both exact matches and key.startsWith(prefix + "."). However, sortedAiKeys on line 96 filters dropPriority entries by checking aiKeys.has(k) — this only finds entries from dropPriority that are exact attribute keys in the map.

For example, if attributes has a key "ai.prompt.messages.0.content", it is matched by prefix "ai.prompt.messages" and added to aiKeys. But dropPriority.filter((k) => aiKeys.has(k)) checks whether "ai.prompt.messages" is in aiKeys — it isn't (only "ai.prompt.messages.0.content" is). So the sub-key is identified as AI content but never deleted, defeating the backstop.

Concrete trace showing the bug

Given attributes { "ai.prompt.messages.0.content": "huge...", "other": "x" }:

  1. aiKeys = {"ai.prompt.messages.0.content"} (matched via prefix)
  2. sortedAiKeys = dropPriority.filter(k => aiKeys.has(k)) = [] (no dropPriority entry is in aiKeys)
  3. Nothing is deleted — the oversized result is returned as-is.
Prompt for agents
In applyTotalSizeBackstop (otlpAttributeLimits.ts), the sortedAiKeys line at line 96 uses dropPriority.filter((k) => aiKeys.has(k)), which only finds exact matches between dropPriority entries and actual attribute keys. But aiKeys was populated with actual attribute keys matched by prefix, not the prefix strings themselves.

The fix should iterate dropPriority prefixes in order, and for each prefix, find and delete all matching attribute keys from the result (using matchPrefix). Something like:

for (const prefix of dropPriority) {
  for (const key of [...Object.keys(result)]) {
    if (matchPrefix(key, prefix)) {
      delete result[key];
    }
  }
  if (JSON.stringify(result).length <= limits.totalAttributesLengthLimit) break;
}

This preserves the drop-priority ordering while correctly handling sub-keys (e.g. ai.prompt.messages.0.content matched by prefix ai.prompt.messages).
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Comment on lines +99 to +101
switch (strategy.type) {
case "log":
return;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚩 replicationErrorRecovery 'log' strategy silently swallows errors

In replicationErrorRecovery.server.ts:99-101, when strategy.type === 'log', the handle() method immediately returns without doing anything — including not logging the error. This is counterintuitive given the strategy name "log". By contrast, both scheduleReconnect (line 58) and scheduleExit (line 88) log the error before taking action. This may be intentional if callers are expected to log before calling handle(), but it's worth verifying the design intent since a strategy named "log" that doesn't log is surprising.

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Comment on lines +250 to +253
import { initMollifierDrainerWorker } from "~/v3/mollifierDrainerWorker.server";
initMollifierDrainerWorker().catch((error) => {
logger.error("Mollifier drainer initialization failed", { error });
});
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚩 entry.server.tsx imports non-existent mollifierDrainerWorker module

Lines 250-253 import initMollifierDrainerWorker from ~/v3/mollifierDrainerWorker.server, but this file does not exist in the repository at this commit. This will fail at build time / typecheck. Not reported as a bug per review guidelines (would be caught by the type checker / CI), but worth noting this PR cannot be merged as-is without the missing file being added in a follow-up commit or concurrent PR.

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants