fix(compactor): stop compaction at TXID gap to prevent repeated failures #1155
Open
When an S3 upload fails after retries, a gap appears in the L0 LTX files. Compaction then collects all L0 files, including those after the gap, and passes them to ltx.Compactor, which rejects non-contiguous inputs. This error repeats indefinitely because the gap persists.

Add a contiguity check in Compact() that tracks expectedMinTXID as files are collected. When a gap is detected, compaction stops and only compacts the contiguous prefix before the gap. The monitor recovers the missing file on its next sync cycle, and remaining files compact in a future pass.

Fixes #1151
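The contiguity check described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `fileInfo` and `contiguousPrefix` are hypothetical names, files are assumed to be sorted by minimum TXID, and an `expected` value of 0 is assumed to mean "no prior position, let the first file seed it".

```go
package main

import "fmt"

// fileInfo is a hypothetical stand-in for the TXID range of one LTX file.
type fileInfo struct {
	MinTXID, MaxTXID uint64
}

// contiguousPrefix returns the leading run of files whose TXID ranges follow
// one another without a gap. Files are assumed to be sorted by MinTXID.
// If expected is 0, the first file seeds the expected position (assumption).
func contiguousPrefix(files []fileInfo, expected uint64) []fileInfo {
	for i, f := range files {
		if expected != 0 && f.MinTXID != expected {
			// Gap detected: keep only the files collected so far.
			return files[:i]
		}
		expected = f.MaxTXID + 1
	}
	return files
}

func main() {
	// L0 files 1,2,3,5,6 with a gap at 4.
	l0 := []fileInfo{{1, 1}, {2, 2}, {3, 3}, {5, 5}, {6, 6}}
	prefix := contiguousPrefix(l0, 0)
	fmt.Println(len(prefix), prefix[len(prefix)-1].MaxTXID) // prints "3 3"
}
```

Only the prefix is handed to the compactor, so the files after the gap stay in L0 until the missing file reappears.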
benbjohnson (Owner) requested changes on Mar 3, 2026 and left a comment:
I don't think this fixes #1151. The problem in that issue is that we have gaps in L0 as you can see from this compaction to L1:
time=2026-02-20T03:12:00.987+01:00 level=ERROR msg="compaction failed" level=1 error="write ltx file: extract timestamp from LTX header: non-contiguous transaction ids in input files: (000000000004b1c6,000000000004b1c6) -> (000000000004b1c8,000000000004b1c8)"
Before that, it looks like 000000000004b1c7 failed during upload:
time=2026-02-20T03:10:20.673+01:00 level=ERROR msg="monitor error" db=state.db replica=s3 error="calc pos: max ltx file: operation error S3: ListObjectsV2, https response error StatusCode: 504, RequestID: N/A, HostID: N/A, api error GatewayTimeout: The server did not respond in time.\nfailed to get rate limit token, retry quota exceeded, 0 available, 5 requested" consecutive_errors=2 backoff=2s
And then it failed again when trying to recalculate the remote replica L0 position:
time=2026-02-20T03:11:30.851+01:00 level=ERROR msg="compaction failed" level=1 error="write ltx file: extract timestamp from LTX header: non-contiguous transaction ids in input files: (000000000004b1c6,000000000004b1c6) -> (000000000004b1c8,000000000004b1c8)"
This shouldn't happen as Litestream should re-upload based on the latest version in S3 once it clears its position and recalculates. That code is in Replica.Sync().
I just pushed a PR to get some more info on replica sync and uploads: #1182
Summary
- Add a contiguity check in Compact() that stops file collection at the first TXID gap, compacting only the contiguous prefix
- ltx.Compactor never receives non-contiguous input
- The monitor (calcPos()) re-uploads the missing file on its next sync cycle, and remaining files compact in a future pass

Fixes #1151
Test Plan
- TestCompactor_Compact/L0GapStopsAtGap: creates L0 files 1,2,3,5,6 (gap at 4), verifies compaction produces L1 with range 1-3, then fills the gap and verifies 4-6 compacts
- TestCompactor_Compact/L0GapAtStart: creates L0 files 3,4,5 with no prior L1, verifies the compactor compacts the available contiguous set (3-5)
- Tests run with -race
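To make the two scenarios in the test plan concrete, here is a small self-contained simulation of successive compaction passes. It is only a sketch under assumptions: `txRange` and `compactPass` are hypothetical names, each pass is modeled as returning the single TXID range it would write to L1, and a prior L1 maximum of 0 is taken to mean "no prior L1 file".

```go
package main

import "fmt"

// txRange is a hypothetical TXID range covered by one LTX file.
type txRange struct{ Min, Max uint64 }

// compactPass models one L0 -> L1 pass. It skips files already covered by L1
// (Max <= l1Max), then merges files until the first gap. It reports false if
// nothing contiguous is available. l1Max == 0 means there is no prior L1 file.
func compactPass(l0 []txRange, l1Max uint64) (txRange, bool) {
	var out txRange
	found := false
	expected := uint64(0)
	if l1Max != 0 {
		expected = l1Max + 1
	}
	for _, r := range l0 {
		if r.Max <= l1Max {
			continue // already compacted in an earlier pass
		}
		if expected != 0 && r.Min != expected {
			break // gap: leave the remaining files for a later pass
		}
		if !found {
			out.Min = r.Min
			found = true
		}
		out.Max = r.Max
		expected = r.Max + 1
	}
	return out, found
}

func main() {
	// L0GapStopsAtGap: files 1,2,3,5,6 with a gap at 4.
	l0 := []txRange{{1, 1}, {2, 2}, {3, 3}, {5, 5}, {6, 6}}
	first, _ := compactPass(l0, 0)
	fmt.Println(first.Min, first.Max) // prints "1 3"

	// The missing file 4 is re-uploaded, then the next pass runs.
	l0 = []txRange{{1, 1}, {2, 2}, {3, 3}, {4, 4}, {5, 5}, {6, 6}}
	second, _ := compactPass(l0, first.Max)
	fmt.Println(second.Min, second.Max) // prints "4 6"

	// L0GapAtStart: files 3,4,5 with no prior L1.
	third, _ := compactPass([]txRange{{3, 3}, {4, 4}, {5, 5}}, 0)
	fmt.Println(third.Min, third.Max) // prints "3 5"
}
```

In this model a pass that finds the gap still open after a prior L1 file returns false rather than skipping ahead, which keeps L1 itself contiguous. Whether the PR behaves the same way in that case is not stated in the description.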