Skip to content

treedb: improve live vlog maintenance observability#856

Merged
snissn merged 86 commits intomainfrom
pr/live-vacuum-observability
Mar 30, 2026
Merged

treedb: improve live vlog maintenance observability#856
snissn merged 86 commits intomainfrom
pr/live-vacuum-observability

Conversation

@snissn
Copy link
Copy Markdown
Owner

@snissn snissn commented Mar 28, 2026

Summary

  • export treedb.cache.vlog_generation.* metrics through expvar selection
  • add rewrite-plan byte counters (total/live/stale) to maintenance stats
  • add cumulative rewrite reclaimed-bytes counter to make progress visible under live rewrite+gc
  • cover new metric exposure with expvar and scheduler unit tests

Testing

  • go test ./TreeDB/caching -count=1
  • go test ./TreeDB/caching -run 'TestSelectTreeDBExpvarStatsFiltersAndCoerces|TestObserveVlogGenerationRewritePlanOutcome_SelectedTracksBytes|TestObserveVlogGenerationRewritePlanOutcome_SelectedTracksSegmentFallbackBytes' -count=1

Scope note (compression profiling changes)

This branch also carries earlier compression-profiling changes made during the same investigation cycle (bounded sampling, encoder/scratch reuse, and concurrency-setting adjustments in TreeDB/internal/compression/profile.go). Those changes are intentionally kept in this stacked branch for continuity with the live-maintenance experiments and benchmark tooling updates.

Copilot AI review requested due to automatic review settings March 28, 2026 03:22
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves TreeDB live value-log (vlog) maintenance observability by exposing vlog-generation metrics via expvar and adding additional byte-level counters to make rewrite planning and rewrite+GC progress visible.

Changes:

  • Export treedb.cache.vlog_generation.* metrics through the expvar stat-selection filter.
  • Add rewrite-plan selected-byte counters (total/live/stale) and a cumulative rewrite “reclaimed bytes” counter.
  • Add unit tests covering new expvar exposure and rewrite-plan byte accounting paths.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
TreeDB/caching/db.go Adds new vlog-generation stats counters (plan-selected bytes + reclaimed bytes) and emits them via Stats().
TreeDB/caching/expvar_stats.go Extends expvar filtering to include treedb.cache.vlog_generation.*.
TreeDB/caching/expvar_stats_test.go Verifies expvar selection/coercion includes a representative vlog_generation metric.
TreeDB/caching/vlog_generation_scheduler_test.go Adds unit tests for rewrite-plan outcome observation and byte counter tracking (including fallback path).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread TreeDB/caching/expvar_stats.go
@snissn
Copy link
Copy Markdown
Owner Author

snissn commented Mar 28, 2026

Follow-up slice pushed in 9ce5339a.

What changed

  • Added live maintenance gate/rejection counters:
    • treedb.cache.vlog_generation.maintenance.{attempts,acquired,collisions}
    • treedb.cache.vlog_generation.maintenance.skip.*
    • treedb.cache.vlog_generation.maintenance.passes.{noop,with_rewrite,with_gc}
  • Added vacuum skip counters:
    • treedb.cache.vlog_generation.vacuum.{skipped_disabled,skipped_rewrite_bytes,skipped_cooldown}
  • Added scheduler tests for the new counters.
  • Added run notes and reproduction in worklog/2026-03-27.md.

run_celestia validation (fast profile)

Command:
LOCAL_GOMAP_DIR=/home/mikers/dev/snissn/gomap-phasehook-active TREEDB_OPEN_PROFILE=fast STOP_AT_LOCAL_HEIGHT=500 FREEZE_REMOTE_HEIGHT_AT_START=1 NO_PROGRESS_WARN_SECONDS=120 NO_PROGRESS_FAIL_SECONDS=1800 HEAP_CAPTURE_RSS_DELTA_KB=1 CAPTURE_HEAP_ON_MAX_RSS=1 ~/run_celestia.sh

Run home:
/home/mikers/.celestia-app-mainnet-treedb-20260327173138

Latest diagnostics snapshot sampled:
sync/diagnostics/pprof-heap-max-rss-7070580k-20260327173541.treedb_vars.json

Key counters:

  • maintenance.attempts=637
  • maintenance.acquired=56
  • maintenance.collisions=581
  • maintenance.passes.noop=53
  • maintenance.passes.with_gc=2
  • maintenance.passes.with_rewrite=0
  • maintenance.skip.quiet_window=26
  • gc.runs=2
  • rewrite.runs=0
  • vacuum.runs=0

Takeaway: in this early state-sync window, rewrite/vacuum did not trigger, and the dominant behavior was high active-pass contention. We now have explicit counters to drive the next tuning step.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9ce5339a4e

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread TreeDB/caching/db.go Outdated
Comment on lines +13107 to +13110
db.scheduleDueVlogGenerationDeferredMaintenance()
db.schedulePendingVlogGenerationCheckpointKick()
rewriteRan := db.vlogGenerationRewriteRuns.Load() > rewriteRunsBefore
gcRan := db.vlogGenerationGCRuns.Load() > gcRunsBefore
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Count pass outcomes before queuing follow-up maintenance

The new pass-outcome counters are computed after scheduleDueVlogGenerationDeferredMaintenance()/schedulePendingVlogGenerationCheckpointKick() run in the defer path, so a just-scheduled retry goroutine can increment vlogGenerationRewriteRuns/vlogGenerationGCRuns first and make this pass look like it performed rewrite/GC when it did not. This appears under contention (fast retry wakeups on another core) and skews the new observability metrics (maintenance.passes.*) by overcounting rewrite/gc and undercounting noop for the current pass.

Useful? React with 👍 / 👎.

Copilot AI review requested due to automatic review settings March 28, 2026 03:49
@snissn
Copy link
Copy Markdown
Owner Author

snissn commented Mar 28, 2026

Pushed another follow-up in bf11ec9d to reduce live maintenance collision churn.

Code change

  • runVlogGenerationMaintenanceRetries now coalesces retries while maintenanceActive=true if retry intent is already pending:
    • checkpoint-kick path: when checkpoint_kick.pending=true, wait/backoff instead of re-entering maybeRun... and colliding.
    • deferred path: same behavior when deferred_pending=true.
  • maybeRunPeriodicVlogGenerationMaintenance now exits early while maintenanceActive=true.
  • Added test: TestRunVlogGenerationMaintenanceRetries_CoalescesPendingCollisionRetries.

Validation

  • go test ./TreeDB/caching -count=1

A/B run_celestia comparison

Both runs used:
LOCAL_GOMAP_DIR=/home/mikers/dev/snissn/gomap-phasehook-active TREEDB_OPEN_PROFILE=fast STOP_AT_LOCAL_HEIGHT=500 FREEZE_REMOTE_HEIGHT_AT_START=1 NO_PROGRESS_WARN_SECONDS=120 NO_PROGRESS_FAIL_SECONDS=1800 HEAP_CAPTURE_RSS_DELTA_KB=1 CAPTURE_HEAP_ON_MAX_RSS=1 ~/run_celestia.sh

Baseline run home:
/home/mikers/.celestia-app-mainnet-treedb-20260327173138
(snapshot pprof-heap-max-rss-7070580k-20260327173541.treedb_vars.json)

New run home:
/home/mikers/.celestia-app-mainnet-treedb-20260327174335
(snapshot pprof-heap-max-rss-6605092k-20260327174744.treedb_vars.json)

Key counter deltas:

  • maintenance.attempts: 637 -> 124
  • maintenance.acquired: 56 -> 119
  • maintenance.collisions: 581 -> 5
  • maintenance.passes.with_rewrite: 0 -> 1
  • maintenance.passes.with_gc: 2 -> 1
  • rewrite.runs: 0 -> 1
  • vacuum.runs: 0 -> 1

This looks like a material reduction in scheduler contention, and maintenance made actual rewrite/vacuum progress in the same early-state-sync lab window.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread TreeDB/caching/db.go Outdated
Comment on lines +13136 to +13152
}
if !rewriteRan && !gcRan {
db.vlogGenerationMaintenancePassNoop.Add(1)
}
Copy link

Copilot AI Mar 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The maintenance pass outcome counters (passes.noop/with_rewrite/with_gc) can be mis-attributed due to the defer order: maintenanceActive is cleared and new maintenance may be scheduled before computing rewriteRan/gcRan from the global counters. If a deferred/checkpoint-kick pass starts quickly after maintenanceActive.Store(false), rewriteRan/gcRan may reflect the next pass instead of the one being released. Consider computing/incrementing the pass outcome counters before clearing vlogGenerationMaintenanceActive and before scheduling any follow-up maintenance (or track rewrite/gc execution via local booleans set during this pass).

Copilot uses AI. Check for mistakes.
@snissn
Copy link
Copy Markdown
Owner Author

snissn commented Mar 28, 2026

Pushed c83e2d61 with one more low-risk scheduler optimization.

Code change

  • Added periodic preflight in maybeRunPeriodicVlogGenerationMaintenance:
    • when runGC=false and foreground is hot, periodic loop now skips entering maintenance unless a deferred/checkpoint wake is pending.
    • intent: reduce high-frequency no-op maintenance acquisitions.
  • Added test: TestVlogGenerationMaintenance_PeriodicPreflightSkipsHotNoPending.

Validation

  • go test ./TreeDB/caching -count=1

Third comparison run (same lab command)

Run home:
/home/mikers/.celestia-app-mainnet-treedb-20260327175340

Snapshot:
pprof-heap-max-rss-6560568k-20260327175747.treedb_vars.json

Counters:

  • maintenance.attempts=46
  • maintenance.acquired=41
  • maintenance.collisions=5
  • maintenance.passes.noop=38
  • maintenance.passes.with_rewrite=1
  • maintenance.passes.with_gc=1
  • maintenance.skip.quiet_window=0
  • rewrite.runs=1
  • vacuum.runs=1

Trend across runs:

  • baseline: attempts=637, collisions=581, rewrite=0, vacuum=0
  • retry-coalesced: attempts=124, collisions=5, rewrite=1, vacuum=1
  • preflight: attempts=46, collisions=5, rewrite=1, vacuum=1

So the latest change primarily removes maintenance churn/noops while preserving rewrite/vacuum progress.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c83e2d6164

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread TreeDB/caching/db.go Outdated
Comment on lines +12205 to +12208
!db.vlogGenerationCheckpointKickPending.Load() &&
!db.vlogGenerationDeferredMaintenancePending.Load() &&
!db.vlogGenerationDeferredMaintenanceDue(now) {
return false
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Surface deferred-due check errors before preflight skip

This new hot-foreground preflight can return early without entering maybeRunVlogGenerationMaintenance, but the !db.vlogGenerationDeferredMaintenanceDue(now) term internally treats rewrite-stage read errors as a plain false result. In sustained non-quiet workloads with no pending flags, backend errors from the deferred-due probe are now silently ignored, so scheduler error state and notifyError are never triggered until the node eventually becomes quiet. That regresses operational visibility compared with the previous path, which always entered maintenance and reported queue/stage load failures immediately.

Useful? React with 👍 / 👎.

Copilot AI review requested due to automatic review settings March 28, 2026 04:17
Copy link
Copy Markdown
Owner Author

snissn commented Mar 28, 2026

Instrumentation-first update pushed to pr/live-vacuum-observability (31fbb0a5).

What this adds

  • Maintenance/rewrite/GC/vacuum timing counters exposed in Stats():
    • treedb.cache.vlog_generation.maintenance.pass.{total,max,avg}_ms
    • treedb.cache.vlog_generation.rewrite.plan.{total,max,avg}_ms
    • treedb.cache.vlog_generation.rewrite.exec.{total,max,avg}_ms
    • treedb.cache.vlog_generation.gc.exec.{total,max,avg}_ms
    • treedb.cache.vlog_generation.vacuum.exec.{total,max,avg}_ms
  • Rewrite debt/backlog observability:
    • treedb.cache.vlog_generation.rewrite.ledger_segments
    • treedb.cache.vlog_generation.rewrite.ledger_bytes_{total,live,stale}
    • treedb.cache.vlog_generation.rewrite.ledger_stale_ratio_ppm
    • treedb.cache.vlog_generation.rewrite.stage_{pending,observed_unix_nano}
    • treedb.cache.vlog_generation.rewrite.penalties_active
    • treedb.cache.vlog_generation.rewrite.age_blocked_{until_unix_nano,remaining_ms}
  • Rewrite budget runtime visibility:
    • treedb.cache.vlog_generation.rewrite_budget.tokens_{bytes,cap_bytes}
    • treedb.cache.vlog_generation.rewrite_budget.tokens_utilization_pct
    • treedb.cache.vlog_generation.rewrite_budget.consumed_bytes_total
  • Token-consume path now tracks consumed bytes in vlogGenerationConsumeRewriteBudgetBytes.

Test coverage

  • Added TestVlogGenerationStats_ReportRewriteBacklogAndDurations in TreeDB/caching/vlog_generation_scheduler_test.go.
  • Validated with:
    • go test ./TreeDB/caching -run TestVlogGenerationStats_ReportRewriteBacklogAndDurations -count=1
    • go test ./TreeDB/caching -count=1

Next step: run run_celestia again and pull these new stats from .treedb_vars.json to quantify where we are bottlenecked (budget-limited, age-blocked, or collision/phase-gated) before changing rewrite behavior.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread TreeDB/caching/expvar_stats.go
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 31fbb0a53e

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread TreeDB/caching/db.go Outdated
Copy link
Copy Markdown
Owner Author

snissn commented Mar 28, 2026

Follow-up pushed: 7cc50d6d (treedb: stop maintenance retry collision amplification).

Why

In a longer run_celestia window (STOP_AT_LOCAL_HEIGHT=2000), application-db metrics showed retry amplification under load:

  • maintenance.attempts=333, acquired=29, collisions=304
  • rewrite.plan_runs=3 but plan_selected=0, rewrite.runs=0
  • checkpoint_kick.pending=true persisted while attempts/collisions climbed

This indicated retry loops were still attempting while maintenanceActive=true in some pending-state transitions.

Change

  • runVlogGenerationMaintenanceRetries now always backs off while maintenanceActive=true (until release/deadline), rather than conditionally retrying based on pending flags.
  • Added regression coverage in TestRunVlogGenerationMaintenanceRetries_CoalescesPendingCollisionRetries for checkpoint path with checkpoint_pending=false + maintenanceActive=true.

Validation

  • go test ./TreeDB/caching -run TestRunVlogGenerationMaintenanceRetries_CoalescesPendingCollisionRetries -count=1
  • go test ./TreeDB/caching -count=1

A/B run_celestia outcome (application.db instance snapshots)

Pre-fix bad snapshot (20260327183236) vs post-fix (20260327184427):

  • maintenance.attempts: 333 -> 38
  • maintenance.acquired: 29 -> 38
  • maintenance.collisions: 304 -> 0
  • rewrite.plan_selected: 0 -> 2
  • rewrite.runs: 0 -> 1
  • vacuum.runs: 0 -> 1
  • rewrite_budget.consumed_bytes_total: 0 -> 33073906

Also logged in worklog/2026-03-27.md with run homes and command context.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7cc50d6de3

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread TreeDB/caching/db.go
Comment thread TreeDB/caching/db.go
Comment thread TreeDB/caching/db.go Outdated
Copilot AI review requested due to automatic review settings March 28, 2026 05:30
Copy link
Copy Markdown
Owner Author

snissn commented Mar 28, 2026

Follow-up update pushed: 00355572 (treedb: add stage-gate and rewrite segment counters).

What was added:

  • New maintenance split counters:
    • treedb.cache.vlog_generation.maintenance.skip.stage_gate_not_due
    • treedb.cache.vlog_generation.maintenance.skip.stage_gate_due_reserved
  • New rewrite selection/execution counters:
    • treedb.cache.vlog_generation.rewrite.plan_selected_segments_total
    • treedb.cache.vlog_generation.rewrite.exec.source_segments_total
  • Test coverage extended in TestVlogGenerationStats_ReportRewriteBacklogAndDurations.

Validation:

  • go test ./TreeDB/caching -count=1

Live run (baseline fast profile, no trigger override):

  • Command:
    • LOCAL_GOMAP_DIR=/home/mikers/dev/snissn/gomap-phasehook-active TREEDB_OPEN_PROFILE=fast FREEZE_REMOTE_HEIGHT_AT_START=1 STOP_AT_LOCAL_HEIGHT=2000 ~/run_celestia.sh
  • Home:
    • /home/mikers/.celestia-app-mainnet-treedb-20260327192126
  • Final snapshot:
    • pprof-heap-max-rss-final-7983364k-20260327192636.treedb_vars.json
  • Key counters:
    • maintenance.attempts=38 acquired=38 collisions=0
    • rewrite.plan_runs=2 plan_selected=2
    • rewrite.plan_selected_segments_total=3
    • rewrite.runs=1 rewrite.exec.source_segments_total=1
    • rewrite.bytes_in=33073442 rewrite.reclaimed_bytes=0
    • maintenance.skip.stage_gate=7
    • maintenance.skip.stage_gate_not_due=7
    • maintenance.skip.stage_gate_due_reserved=0

Interpretation:

  • Planner selected more rewrite debt than was executed in-run (3 selected vs 1 executed).
  • Stage gating was entirely the "not due yet" confirmation wait path in this run.

Offline headroom on same home:

  • pre: du -sb application.db = 4707839386
  • treemap vlog-rewrite -rw:
    • segments_before=20 segments_after=16 bytes_before=4637168004 bytes_after=2039183405 records=964467
  • post: du -sb application.db = 2077350273

So live maintenance remains significantly below post-rewrite floor, and these counters now make that bottleneck explicit.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot AI review requested due to automatic review settings March 30, 2026 03:57
@snissn
Copy link
Copy Markdown
Owner Author

snissn commented Mar 30, 2026

Follow-up updates pushed:

  • 00410937 leafrefscan: reuse walk state to cut scan alloc churn
  • a0e3d3c8 db: refresh leaf-log segment fallback when force-refresh is enabled

Safety fix (AI thread)

  • finalizeCommitLocked now attempts ensureLeafPageLogSegmentRegistered() whenever forceValueLogRefresh=true (not only when touchedValueLogSegments is empty).
  • If registration cannot be confirmed, commit publish falls back to one manager refresh before publishing the new ValueLogSet.
  • Added regression test:
    • TestOuterLeafPointerCommitRefreshesWhenLeafSegmentUnreported

Alloc/path optimization

  • leafrefscan.Walk now reuses DFS stack + visited-map state via sync.Pool to reduce repeated allocation churn in leaf-ref scans.

Validation

  • GOWORK=off go test ./TreeDB/internal/leafrefscan -count=1
  • GOWORK=off go test ./TreeDB/db -run "TestOuterLeafCommitPublishesRegisteredSegmentWithoutExplicitRefresh|TestOuterLeafWriteLoopSkipsForcedValueLogRefreshScans|TestOuterLeafPointerCommitRefreshesWhenLeafSegmentUnreported" -count=1
  • GOWORK=off go test ./TreeDB/db -run "^$" -bench "^BenchmarkValueLogRewriteOnline_LeafRefs_ReserveRIDs$" -benchmem -count=3
  • GOWORK=off go test ./TreeDB/db -count=1

Recent microbench samples:

  • rewrite_allocs/op: 241-245
  • B/op: ~4.381-4.391 MB
  • prior immediate baseline on this branch before these two follow-ups: ~249-254 rewrite_allocs/op

Also resolved the two previously open non-outdated review threads (one code-safety, one PR scope note) and updated PR description with the compression-profile scope note.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 44 out of 45 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread TreeDB/db/leaf_page_log.go
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 96b8d28368

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread scripts/run_celestia_ab.sh Outdated
Comment thread scripts/celestia_fast_gate.sh Outdated
Copilot AI review requested due to automatic review settings March 30, 2026 04:30
@snissn
Copy link
Copy Markdown
Owner Author

snissn commented Mar 30, 2026

Pushed follow-up 25f59bbd (alloc-churn reduction in rewrite cleanup path).

Change

  • In ValueLogRewriteOnline, avoid taking a CurrentSetNoRefresh() snapshot during zombie-candidate cleanup when ProtectedPaths is not in use.
  • This keeps behavior identical for maintenance/protected-path runs while removing an unnecessary set snapshot+release cycle in the common non-protected rewrite path.

Validation

  • GOWORK=off go test ./TreeDB/db -run "TestValueLogRewrite|TestOuterLeafPointerCommitRefreshesWhenLeafSegmentUnreported" -count=1
  • GOWORK=off go test ./TreeDB/db -count=1

Microbench signal

Command:

  • GOWORK=off go test ./TreeDB/db -run "^$" -bench "^BenchmarkValueLogRewriteOnline_LeafRefs_ReserveRIDs$" -benchmem -count=5

Observed:

  • before: ~243-246 rewrite_allocs/op
  • after: ~237-241 rewrite_allocs/op
  • bytes/op stayed ~flat (~4.38MB/op)

So this is a modest but consistent alloc reduction without changing rewrite semantics.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 44 out of 45 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@snissn
Copy link
Copy Markdown
Owner Author

snissn commented Mar 30, 2026

Pushed follow-up 71c1c139 to fix the failing race-check (linux) job.

CI failure root cause

  • TestProfileFast_ModelCorrectness_WithFlushAndIterators reported data races in internal/valuelog.(*File).readGroupedCompressedFromFileTo under concurrent flush/iterator activity.
  • The race involved shared scratch/cache backing bytes being observed across concurrent grouped-compressed reads.

Fix

  • Added a cache-read helper that copies cached value bytes while holding cacheMu before returning to callers (groupedFrameCacheReadTo).
  • Kept a compatibility groupedFrameCacheLookup path that returns detached raw copies for callers that need offsets.
  • In readGroupedCompressedFromFileTo, replaced pooled payload/raw scratch usage with per-call buffers on this fallback path and stored non-pooled raw in grouped cache entries for this path.

Validation

  • GOWORK=off go test -race ./TreeDB -run "^TestProfileFast_ModelCorrectness_WithFlushAndIterators$" -count=2
  • GOWORK=off go test ./TreeDB/internal/valuelog -count=1
  • GOWORK=off go test ./TreeDB/db -count=1

This should directly address the race-stack seen in CI while preserving functional behavior.

Copilot AI review requested due to automatic review settings March 30, 2026 05:02
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 44 out of 45 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread TreeDB/internal/valuelog/manager.go Outdated
Comment on lines +360 to +361
rawCopy := make([]byte, len(e.raw))
copy(rawCopy, e.raw)
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

groupedFrameCacheLookup now allocates and copies the full decoded grouped frame (rawCopy := make(...); copy(...)) on every cache hit. This can be a large per-read allocation and may significantly reduce the benefit of the grouped-frame cache for compressed reads. Since you already added groupedFrameCacheReadTo to copy just the requested value (and handle templates safely), consider refactoring the callers that only need one sub-value to use groupedFrameCacheReadTo (or add a lookup helper that copies only the [valStart:valEnd] range under the lock) to avoid copying the entire frame.

Suggested change
rawCopy := make([]byte, len(e.raw))
copy(rawCopy, e.raw)
// Copy only the requested sub-value range so the returned slice is not
// backed by cache-owned storage, but avoid allocating the full frame.
valLen := valEnd - valStart
rawCopy := make([]byte, valLen)
copy(rawCopy, e.raw[int(valStart):int(valEnd)])

Copilot uses AI. Check for mistakes.
Comment thread TreeDB/internal/valuelog/manager.go Outdated
Comment on lines +703 to +706
raw := make([]byte, 0, int(rawLen))
pooledRaw := false
raw, err := decodeFramePayloadTo(frame, payload[prefixLen:], f.dictLookup, rawLen, raw)
if err != nil {
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In readGroupedCompressedFromFileTo, pooledRaw is always set to false and raw is allocated with make(...). This makes the pooledRaw/releaseDecodeScratch branches dead code and defeats the existing decode-scratch reuse mechanism, adding avoidable allocations on the grouped+compressed fallback path. Consider using f.takeDecodeScratch(int(rawLen)) (setting pooledRaw=true) and only allocating when the scratch pool can’t satisfy the request, then ensuring the buffer is released when it isn’t cached.

Copilot uses AI. Check for mistakes.
@snissn
Copy link
Copy Markdown
Owner Author

snissn commented Mar 30, 2026

Superseded by a corrected follow-up comment: #856 (comment)

@snissn
Copy link
Copy Markdown
Owner Author

snissn commented Mar 30, 2026

Follow-up after Windows flake on TreeDB vet+test / windows-latest (job 69118858707).

Pushed commit 5610d26:

  • Hardened stage-confirm rewrite tests to wait for async rewrite completion (bounded by 2*schedulerTestWait) before asserting call counts:
    • TestVlogGenerationRewritePlan_StageConfirmationDebtDrainProcessesMultipleSegments
    • TestVlogGenerationRewritePlan_StageConfirmationReplansEvenWhenOtherTriggersFire

Local validation:

  • GOMAXPROCS=2 GOWORK=off go test ./TreeDB/caching -run 'TestVlogGenerationRewritePlan_StageConfirmationDebtDrainProcessesMultipleSegments|TestVlogGenerationRewritePlan_StageConfirmationReplansEvenWhenOtherTriggersFire|TestVlogGenerationRewritePlan_AgeBlockedRetryRunsWhenDue' -count=120
  • GOMAXPROCS=2 GOWORK=off go test ./TreeDB/caching -count=1
  • GOWORK=off go test ./TreeDB/internal/valuelog -count=1
  • GOWORK=off go test -race ./TreeDB -run '^TestProfileFast_ModelCorrectness_WithFlushAndIterators$' -count=1

Also on this branch, previous commit 3d83c8b addressed the two new AI comments in TreeDB/internal/valuelog/manager.go (sub-value-only grouped cache copy + safe decode scratch reuse).

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 44 out of 45 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread TreeDB/internal/valuelog/manager.go Outdated
Comment on lines 333 to 366
@@ -290,16 +347,20 @@ func (f *File) groupedFrameCacheLookup(start int64, verifyCRC bool, subIndex int
if e.k <= 0 || e.start != start || e.verifyCRC != verifyCRC || subIndex < 0 || subIndex >= e.k {
continue
}
valStart = e.offsets[subIndex]
valEnd = e.offsets[subIndex+1]
rawLen = e.offsets[e.k]
if valEnd < valStart || valEnd > rawLen || uint32(len(e.raw)) != rawLen {
valueStart := e.offsets[subIndex]
valueEnd := e.offsets[subIndex+1]
frameRawLen := e.offsets[e.k]
if valueEnd < valueStart || valueEnd > frameRawLen || uint32(len(e.raw)) != frameRawLen {
continue
}
f.groupedFrameCacheClock++
e.used = f.groupedFrameCacheClock
f.groupedFrameCacheHits++
return e.raw, valStart, valEnd, rawLen, true

valueLen := int(valueEnd - valueStart)
rawCopy := make([]byte, valueLen)
copy(rawCopy, e.raw[int(valueStart):int(valueEnd)])
return rawCopy, 0, uint32(valueLen), uint32(valueLen), true
}
f.groupedFrameCacheMisses++
return nil, 0, 0, 0, false
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

groupedFrameCacheLookup now allocates a new rawCopy on every cache hit (even though the callers in reader_mmap.go still copy/allocate again before returning). This adds avoidable allocations and an extra copy on the mmap fast path. Consider returning a view into the cached decoded frame (as before) and having callers copy exactly once (or refactor mmap callers to use groupedFrameCacheReadTo-style copy-under-lock) so a cache hit doesn't necessarily allocate twice.

Copilot uses AI. Check for mistakes.
@snissn
Copy link
Copy Markdown
Owner Author

snissn commented Mar 30, 2026

Follow-up for the subsequent TreeDB windows-latest flake (job 69119674672).

Pushed commit f2ff99d:

  • Stage-confirm tests now wait for recorded plan calls (mutex-backed recorder state) instead of asserting immediately via local counters.
  • Updated tests:
    • TestVlogGenerationRewritePlan_StageConfirmationReplansEvenWhenOtherTriggersFire
    • TestVlogGenerationRewritePlan_StageConfirmationClearsStagedDebtWhenPlanEmpties

Local validation:

  • GOMAXPROCS=2 GOWORK=off go test ./TreeDB/caching -run 'TestVlogGenerationRewritePlan_StageConfirmationDebtDrainProcessesMultipleSegments|TestVlogGenerationRewritePlan_StageConfirmationReplansEvenWhenOtherTriggersFire|TestVlogGenerationRewritePlan_StageConfirmationClearsStagedDebtWhenPlanEmpties|TestVlogGenerationRewritePlan_AgeBlockedRetryRunsWhenDue' -count=160
  • GOMAXPROCS=2 GOWORK=off go test ./TreeDB/caching -count=1

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 44 out of 45 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread TreeDB/batch/batch.go
@snissn
Copy link
Copy Markdown
Owner Author

snissn commented Mar 30, 2026

Addressed the latest AI review follow-ups in commit 5e846f4.

What changed:

  • TreeDB/internal/valuelog:
    • Removed groupedFrameCacheLookup and switched mmap read paths to groupedFrameCacheReadTo (copy-under-lock path) to avoid extra cache-hit copy/alloc churn.
    • Updated readViaMmapView, readViaMmapViewTo, and readViaMmapAppend accordingly.
  • TreeDB/batch:
    • Reset now nils touchedValueLog (both resetLocked and resetForPool) so future small touched-segment sets can use the fixed-size small-path again.

Validation run locally:

  • GOWORK=off go test ./TreeDB/batch -count=1
  • GOWORK=off go test ./TreeDB/internal/valuelog -count=1
  • GOWORK=off go test ./TreeDB/caching -run '^TestVlogGenerationRewritePlan_AgeBlockedRetryRunsWhenDue$' -count=1
  • GOWORK=off go test -race ./TreeDB -run '^TestProfileFast_ModelCorrectness_WithFlushAndIterators$' -count=1

Current PR state:

  • All checks passing
  • mergeState=CLEAN
  • unresolved review threads=0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants