Conversation
roborivers
left a comment
There was a problem hiding this comment.
Cbuild submission: Success ✓.
Regression testing: Success ✓.
The first 10 failing tests are:
maxtable [core dumped]
sc_swapfields
consumer_non_atomic_default_consumer_generated
sc_truncate_lockorder_generated
reco-ddlk-sql
4cbd8b6 to
ae2d5ef
Compare
roborivers
left a comment
There was a problem hiding this comment.
Cbuild submission: Success ✓.
Regression testing: Success ✓.
The first 10 failing tests are:
sc_resume_logicalsc_generated
consumer_non_atomic_default_consumer_generated
remsql_locks_rte_connect_generated
remsql_locks
reco-ddlk-sql
SChakravorti21
left a comment
There was a problem hiding this comment.
Writing down my understanding so far...
We are introducing a tunable, gbl_emit_gen_commits, so that we can turn off the logic for emitting cluster generation in commit records. By disabling this logic, a replicant can't possibly pick up its previous cluster generation from the transaction log. In other words, it helps us test how logfill behaves when a node restarts and its cluster generation resets.
If we haven't enabled elect_highest_committed_gen or match_on_ckp, then our gen could foreseeably be much lower (defaults to 0) than that of the current master. We want logfill to be able to continue making progress in this case, hence the changes to gen_okay.
The reset_gen test makes sure the logfill makes progress even when our gen resets on restart, instead of refusing to apply log records.
The noresetgen test ensures that with elect_highest_committed_gen, match_on_ckp, etc. enabled, our gen is always fetched from logs on restart instead of resetting.
Questions:
- In
gen_okay, do we need to check forretrieve_gen_from_ckpin addition tomatch_on_ckp? Or doesmatch_on_ckpimplyretrieve_gen_from_ckp? How doesmatch_on_ckphelp guarantee that we have retrieved our most recent gen from the transaction logs?
ae2d5ef to
f77fed3
Compare
roborivers
left a comment
There was a problem hiding this comment.
Cbuild submission: Success ✓.
Regression testing: Success ✓.
The first 10 failing tests are:
scindex [core dumped]
consumer_non_atomic_default_consumer_generated
remsql_locks_rte_connect_generated
remsql_locks
reco-ddlk-sql
f77fed3 to
8e84307
Compare
roborivers
left a comment
There was a problem hiding this comment.
Cbuild submission: Success ✓.
Regression testing: Success ✓.
The first 10 failing tests are:
2026-03-17T19:01:21EDT [141605] sc_resume [db unavailable at finish]
2026-03-17T19:01:21EDT [141605] consumer_non_atomic_default_consumer_generated **quarantined**
2026-03-17T19:01:21EDT [141605] remsql_locks_rte_connect_generated **quarantined**
2026-03-17T19:01:21EDT [141605] remsql_locks **quarantined**
2026-03-17T19:01:21EDT [141605] reco-ddlk-sql [timeout] **quarantined**
66e3712 to
b7861f4
Compare
roborivers
left a comment
There was a problem hiding this comment.
Cbuild submission: Success ✓.
Regression testing: Success ✓.
The first 10 failing tests are:
2026-03-18T10:46:47EDT [275862] consumer_non_atomic_default_consumer_generated **quarantined**
2026-03-18T10:46:47EDT [275862] remsql_locks_rte_connect_generated **quarantined**
2026-03-18T10:46:47EDT [275862] remsql_locks **quarantined**
2026-03-18T10:46:47EDT [275862] truncatesc_offline_generated [timeout] **quarantined**
2026-03-18T10:46:47EDT [275862] reco-ddlk-sql [timeout] **quarantined**
roborivers
left a comment
There was a problem hiding this comment.
Cbuild submission: Success ✓.
Regression testing: Success ✓.
The first 10 failing tests are:
2026-03-18T10:51:38EDT [24607] sc_resume_logicalsc_generated **quarantined**
2026-03-18T10:51:38EDT [24607] consumer_non_atomic_default_consumer_generated **quarantined**
2026-03-18T10:51:38EDT [24607] sc_transactional_rowlocks_generated **quarantined**
2026-03-18T10:51:38EDT [24607] remsql_locks **quarantined**
2026-03-18T10:51:38EDT [24607] remsql_locks_rte_connect_generated **quarantined**
2026-03-18T10:51:38EDT [24607] truncatesc_offline_generated [timeout] **quarantined**
roborivers
left a comment
There was a problem hiding this comment.
Cbuild submission: Success ✓.
Regression testing: Success ✓.
The first 10 failing tests are:
2026-03-18T13:01:37EDT [253639] reco-ddlk-sql **quarantined**
2026-03-18T13:01:37EDT [253639] consumer_non_atomic_default_consumer_generated **quarantined**
2026-03-18T13:01:37EDT [253639] remsql_locks_rte_connect_generated **quarantined**
2026-03-18T13:01:37EDT [253639] remsql_locks **quarantined**
2026-03-18T13:01:37EDT [253639] noresetgen [timeout]
57d193a to
ad792d2
Compare
roborivers
left a comment
There was a problem hiding this comment.
Cbuild submission: Success ✓.
Regression testing: Success ✓.
The first 10 failing tests are:
2026-03-18T15:46:20EDT [9195] incremental_backup_usenames
2026-03-18T15:46:20EDT [9195] scindex_logicalsc_generated
2026-03-18T15:46:20EDT [9195] consumer_non_atomic_default_consumer_generated **quarantined**
2026-03-18T15:46:20EDT [9195] remsql_locks_rte_connect_generated **quarantined**
2026-03-18T15:46:20EDT [9195] remsql_locks **quarantined**
2026-03-18T15:46:20EDT [9195] sqllogfill_reset_gen [timeout]
2026-03-18T15:46:20EDT [9195] reco-ddlk-sql [timeout] **quarantined**
roborivers
left a comment
There was a problem hiding this comment.
Cbuild submission: Success ✓.
Regression testing: Success ✓.
The first 10 failing tests are:
analyze_partial_index_off_generated **quarantined**
consumer_non_atomic_default_consumer_generated **quarantined**
remsql_locks_rte_connect_generated **quarantined**
remsql_locks **quarantined**
reco-ddlk-sql [timeout] **quarantined**
ad792d2 to
e2eaf40
Compare
Signed-off-by: mhannum <mhannum@bloomberg.net>
Signed-off-by: mhannum <mhannum@bloomberg.net>
Signed-off-by: mhannum <mhannum@bloomberg.net>
e2eaf40 to
32d038a
Compare
roborivers
left a comment
There was a problem hiding this comment.
Cbuild submission: Success ✓.
Regression testing: Success ✓.
The first 10 failing tests are:
sc_resume_logicalsc_generated **quarantined**
consumer_non_atomic_default_consumer_generated **quarantined**
remsql_locks_rte_connect_generated **quarantined**
remsql_locks **quarantined**
truncatesc_offline_generated [timeout] **quarantined**
sqllogfill_reset_gen [timeout]
reco-ddlk-sql [timeout] **quarantined**
guid [timeout]
Fix bug in sql-logfill where the logfill thread won't apply a record which has a generation which is less than the node's current generation, as this assertion is only true if we explicitly set the generation during recovery by reading a commit or a checkpoint record which contains it. Additionally, this provides the sqllogfill_reset_gen test which reproduces the issue and exercises the fix.