Skip to content

feat(profiling): Add 30-second timeout for chunk downloads#664

Closed
sentry[bot] wants to merge 1 commit intomainfrom
seer/feat/profiling-chunk-download-timeout
Closed

feat(profiling): Add 30-second timeout for chunk downloads#664
sentry[bot] wants to merge 1 commit intomainfrom
seer/feat/profiling-chunk-download-timeout

Conversation

@sentry
Copy link
Copy Markdown

@sentry sentry bot commented Mar 20, 2026

Fixes VROOM-56. The issue was that: Handler blocks on channel receive without checking request context, leading to 'context canceled' errors when client disconnects.

  • Introduced a 30-second timeout for chunk download operations in the postProfileFromChunkIDs endpoint.
  • Updated chunk.ReadJob to use the new context with the timeout.
  • Added error handling for context.Canceled errors, returning HTTP 503 Service Unavailable.

This fix was generated by Seer in Sentry, triggered by txiao@sentry.io. 👁️ Run ID: 12113277

Not quite right? Click here to continue debugging with Seer.

Legal Boilerplate

Look, I get it. The entity doing business as "Sentry" was incorporated in the State of Delaware in 2015 as Functional Software, Inc. and is gonna need some rights from me in order to utilize my contributions in this here PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Sentry can use, modify, copy, and redistribute my contributions, under Sentry's choice of terms.

@sentry sentry bot requested a review from a team as a code owner March 20, 2026 16:57
@sentry sentry bot requested a review from Zylphrex March 20, 2026 16:57
Copy link
Copy Markdown

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

if errors.Is(err, context.Canceled) {
w.WriteHeader(http.StatusServiceUnavailable)
return
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Timeout produces DeadlineExceeded, not Canceled error

High Severity

The downloadContext is created with context.WithTimeout, which produces context.DeadlineExceeded when the 30-second timeout fires. However, the error check only matches context.Canceled. This means timeout errors will bypass the 503 handler and fall through to the generic 500 path, getting reported to Sentry — the exact problem this PR aims to fix. The existing flamegraph code correctly checks for context.DeadlineExceeded in the same scenario.

Fix in Cursor Fix in Web

Comment on lines +134 to +137
if errors.Is(err, context.Canceled) {
w.WriteHeader(http.StatusServiceUnavailable)
return
}
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: The new timeout error handling only checks for context.Canceled and misses context.DeadlineExceeded, causing timeouts to return an incorrect HTTP 500 error instead of 503.
Severity: MEDIUM

Suggested Fix

Add a check for context.DeadlineExceeded alongside the existing check for context.Canceled. If the error is context.DeadlineExceeded, write the http.StatusServiceUnavailable header and return, mirroring the behavior for a canceled context. This should be applied in both postProfileFromChunkIDs() and getRawChunk().

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location: cmd/vroom/chunk.go#L134-L137

Potential issue: The error handling for the newly introduced 30-second chunk download
timeout is incomplete. When the timeout is exceeded, `context.WithTimeout` generates a
`context.DeadlineExceeded` error. However, the code only checks for `context.Canceled`.
This causes the timeout error to be missed, leading the request to fall through to the
generic error handler which returns an HTTP 500 Internal Server Error instead of the
intended HTTP 503 Service Unavailable. This same logical error exists in both
`postProfileFromChunkIDs()` and `getRawChunk()`.

Did we get this right? 👍 / 👎 to inform future reviews.

@Zylphrex Zylphrex closed this Mar 20, 2026
@Zylphrex Zylphrex deleted the seer/feat/profiling-chunk-download-timeout branch March 20, 2026 17:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant