Skip to content

valuelog: cut sealed lazy-mmap stat churn on read path#832

Open
snissn wants to merge 14 commits intopr/append-only-retention-entry-hintfrom
pr/vlog-sealed-lazy-mmap-stat-churn
Open

valuelog: cut sealed lazy-mmap stat churn on read path#832
snissn wants to merge 14 commits intopr/append-only-retention-entry-hintfrom
pr/vlog-sealed-lazy-mmap-stat-churn

Conversation

@snissn
Copy link
Copy Markdown
Owner

@snissn snissn commented Mar 15, 2026

Summary

  • Memoize sealed lazy-mmap deny decisions per segment so repeated read misses do not re-run the same allow/deny + Stat path on every lookup.
  • Track last-known segment size on File and reuse it for sealed lazy-map budget checks.
  • Allow stale sealed mappings (segments that were current-writable before sealing) to remap once to full file size when reads miss out-of-range.

Why

random_read* profiles were dominated by os.(*File).Stat under (*File).tryEnableSealedLazyMmap, creating avoidable alloc/syscall churn and inflating read-path CPU/GC pressure.

Tests

  • go test ./TreeDB/internal/valuelog -count=1
  • go vet ./TreeDB/internal/valuelog
  • Added/updated tests:
    • TestReadUnsafe_SealedLazyMmapBudgetFallsBackToReadAt (deny memoization assertion)
    • TestReadUnsafe_SealedMappedOutOfRangeRemapsToKnownFileSize

Bench Evidence

Unified bench command:
./bin/unified-bench -dbs treedb -profile fast -keys 500000 -progress=false -treedb-index-outer-leaves-in-vlog=true -valsize=100 -treedb-force-value-pointers=false -checkpoint-between-tests

Before (baseline): /home/mikers/tmp/perf-1773572288
After (this branch): /home/mikers/tmp/perf-1773575465

Throughput deltas (TreeDB):

  • Random Read: 200,470 -> 234,557 (+16.99%)
  • Random Read (Parallel): 1,363,211 -> 1,667,356 (+22.31%)
  • Random Read (Parallel, Snapshot Per Key): 1,270,540 -> 1,436,009 (+13.02%)
  • Random Read (Batch): 994,194 -> 1,186,844 (+19.38%)

Alloc-space (allocs_random_read_parallel_treedb.pprof):

  • Total: 1406.80MB -> 285.11MB (-79.73%)
  • os.(*File).Stat is no longer the dominant allocator in random-read sections.

Celestia Validation (serial)

  • fast: /home/mikers/.celestia-app-mainnet-treedb-20260315015213
    • duration_seconds=280
    • max_rss_kb=9398584
    • end_app_bytes=4534118682
    • du_bytes=4534118682
  • wal_on_fast: /home/mikers/.celestia-app-mainnet-treedb-20260315015715
    • duration_seconds=335
    • max_rss_kb=10225652
    • end_app_bytes=4779157599
    • du_bytes=4779157599

Copilot AI review requested due to automatic review settings March 15, 2026 12:04
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: dd2cdff8bd

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread TreeDB/internal/valuelog/reader_mmap.go Outdated
Comment thread TreeDB/internal/valuelog/reader_mmap.go Outdated
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR reduces sealed-segment read-path churn in the value-log by caching sealed lazy-mmap deny outcomes per segment, reusing a last-known segment size to avoid repeated Stat() calls, and allowing a one-time “grow to full size” remap for sealed segments whose mmap was created while they were current-writable.

Changes:

  • Memoize sealed lazy-mmap deny decisions per File and add counters/tests to ensure repeat reads don’t re-run budget gating.
  • Track File.fileSize and reuse it in sealed lazy-mmap budget checks and remap fast-path decisions.
  • Allow stale sealed mappings to attempt a best-effort growth remap to full file size when reads miss out-of-range.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
TreeDB/internal/valuelog/reader_mmap.go Adds deny memoization, file-size reuse, and sealed stale-remap behavior in the mmap read path.
TreeDB/internal/valuelog/manager.go Adds File.fileSize and uses it to reduce Stat() churn for sealed mapping budget checks.
TreeDB/internal/valuelog/manager_test.go Extends tests to assert deny memoization and adds coverage for stale sealed mapping growth remap.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Comment thread TreeDB/internal/valuelog/reader_mmap.go Outdated
Comment thread TreeDB/internal/valuelog/reader_mmap.go Outdated
Comment thread TreeDB/internal/valuelog/reader_mmap.go Outdated
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR optimizes the TreeDB value-log (valuelog) read path by reducing repeated os.File.Stat and sealed-lazy-mmap budget evaluation churn during read misses, aiming to cut syscall/allocation/GC overhead in random-read workloads.

Changes:

  • Memoizes sealed lazy-mmap deny decisions per segment to avoid re-running budget checks and Stat on repeated read misses.
  • Introduces per-File cached fileSize (last known on-disk size) and reuses it for sealed lazy-mmap budget checks and remap decisions.
  • Adds test coverage for deny memoization and for remapping stale sealed mappings to a known file size.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File Description
TreeDB/internal/valuelog/reader_mmap.go Adds deny memoization and uses cached file size to reduce Stat churn; adjusts sealed remap behavior.
TreeDB/internal/valuelog/manager.go Extends File with fileSize + sealedLazyMmapDenied; seeds fileSize on open; reuses cached size in sealed mmap budgeting.
TreeDB/internal/valuelog/manager_test.go Adds/extends tests validating deny memoization and stale sealed remap behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Copilot AI review requested due to automatic review settings March 18, 2026 00:00
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Optimize TreeDB valuelog read-path performance by reducing Stat calls and memoizing sealed lazy-mmap allow/deny decisions, while handling stale sealed mappings.

Changes:

  • Memoize sealed lazy-mmap deny decisions per segment and cache last-known file size to avoid repeated Stat calls on read misses.
  • Allow stale sealed mappings to remap/grow to (known) full file size.
  • Add tests asserting deny memoization and stale-mapping remap behavior.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
TreeDB/internal/valuelog/reader_mmap.go Adds deny memoization, file size caching, and stale-mapping remap logic to reduce Stat churn.
TreeDB/internal/valuelog/manager.go Tracks File.fileSize and reuses it in sealed lazy-mmap byte-budget checks.
TreeDB/internal/valuelog/manager_test.go Adds coverage for deny memoization and sealed stale-map remap behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Comment thread TreeDB/internal/valuelog/reader_mmap.go Outdated
Comment thread TreeDB/internal/valuelog/reader_mmap.go Outdated
Comment thread TreeDB/internal/valuelog/reader_mmap.go
Copilot AI review requested due to automatic review settings March 18, 2026 20:33
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Reduces read-path syscall/alloc churn by caching sealed lazy-mmap decisions and reusing a last-known file size to avoid repeated Stat calls during random reads.

Changes:

  • Memoizes sealed lazy-mmap deny decisions per segment to avoid repeated allow/deny + Stat work on read misses.
  • Tracks and reuses a cached fileSize for sealed mmap budget checks and remap sizing.
  • Adds coverage to ensure deny memoization and stale sealed mapping remap behavior.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File Description
TreeDB/internal/valuelog/reader_mmap.go Adds deny memoization, cached-size usage, and a stale-mapping remap path to reduce Stat churn.
TreeDB/internal/valuelog/manager.go Introduces File.fileSize + sealedLazyMmapDenied and reuses cached size in budget checks.
TreeDB/internal/valuelog/manager_test.go Adds tests covering deny memoization and remapping stale sealed mappings to known size.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Comment thread TreeDB/internal/valuelog/reader_mmap.go Outdated
Comment thread TreeDB/internal/valuelog/reader_mmap.go Outdated
Comment thread TreeDB/internal/valuelog/reader_mmap.go Outdated
Comment thread TreeDB/internal/valuelog/reader_mmap.go
Copilot AI review requested due to automatic review settings March 18, 2026 20:55
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Optimizes TreeDB value-log read performance by reducing sealed lazy-mmap overhead (notably repeated Stat() calls) on read-miss paths, while adding tests to validate deny memoization and stale-mapping remap behavior.

Changes:

  • Cache last-known segment size on valuelog.File and reuse it for sealed mmap budget checks / remap decisions.
  • Add per-segment memoization for sealed lazy-mmap deny decisions and allow best-effort remap for stale sealed mappings.
  • Extend valuelog manager tests to assert deny memoization and out-of-range sealed remap behavior.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
TreeDB/internal/valuelog/reader_mmap.go Adds deny memoization flow, cached target sizing, and remap behavior tweaks for sealed lazy-mmap.
TreeDB/internal/valuelog/manager.go Introduces fileSize + sealedLazyMmapDenied fields; seeds fileSize at open; uses cached size for byte-budget gating.
TreeDB/internal/valuelog/manager_test.go Adds/extends tests covering deny memoization and stale sealed mapping remap behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Comment thread TreeDB/internal/valuelog/reader_mmap.go Outdated
Comment thread TreeDB/internal/valuelog/manager_test.go
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ba68694e20

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread TreeDB/internal/valuelog/reader_mmap.go Outdated
Copilot AI review requested due to automatic review settings March 18, 2026 21:23
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR optimizes TreeDB value-log read performance by reducing Stat() churn and repeated sealed lazy-mmap budget checks on hot read paths, primarily through memoization and caching of segment file size / deny decisions.

Changes:

  • Memoize sealed lazy-mmap deny decisions per segment, with a budget-snapshot retry path when budgets change.
  • Cache last-known segment file size on valuelog.File and reuse it for sealed lazy-mmap size/budget decisions; update it on remap.
  • Add/extend tests to assert deny memoization behavior and validate stale sealed mapping growth-remap behavior.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
TreeDB/internal/valuelog/reader_mmap.go Adds deny memoization fast-path, introduces sealedLazyMmapTargetSize(), and uses cached fileSize to reduce Stat() calls during sealed lazy mmap enable/remap.
TreeDB/internal/valuelog/manager.go Extends File with fileSize + deny memoization fields; updates sealed lazy-mmap allow/deny logic to account for already-mapped targets and cached size.
TreeDB/internal/valuelog/manager_test.go Updates cleanup for synthetic stale mappings and adds tests covering deny memoization and stale sealed mapping remap behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Comment thread TreeDB/internal/valuelog/manager.go
Comment thread TreeDB/internal/valuelog/manager_test.go
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d57e579548

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread TreeDB/internal/valuelog/reader_mmap.go Outdated
Copilot AI review requested due to automatic review settings March 18, 2026 21:40
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR optimizes TreeDB’s value-log read path by reducing os.File.Stat churn and lock/scanning work when sealed lazy-mmap is denied, while also enabling a one-time remap growth for stale sealed mappings using a cached last-known file size.

Changes:

  • Memoize sealed lazy-mmap deny outcomes per segment (with budget-snapshot retry when budgets change) to avoid repeated allow/deny checks on hot read misses.
  • Cache and reuse last-known segment file size (File.fileSize) for sealed mmap budget checks and remap decisions.
  • Add/extend tests to assert deny memoization and stale sealed mapping remap behavior; relax a batch-arena pool drain assertion to match class-granular draining.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
TreeDB/internal/valuelog/reader_mmap.go Adds deny memoization fast-path, cached target-size helper, and uses cached fileSize to avoid repeated stats where possible.
TreeDB/internal/valuelog/manager.go Extends File with cached fileSize + deny state; updates sealed mmap budget gating to account for already-mapped targets.
TreeDB/internal/valuelog/manager_test.go Adds tests for deny memoization and stale sealed mapping remap; adjusts test cleanup in one existing test.
TreeDB/caching/batch_arena_budget_test.go Updates drain test to allow one class-cap overshoot due to class-granular draining behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Comment thread TreeDB/internal/valuelog/manager_test.go
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: aa5f1fd09d

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread TreeDB/internal/valuelog/reader_mmap.go
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants