feat: Reduce the size of the assignment by dropping the last block hash#27

Open
define-null wants to merge 9 commits into master from defnull/NET-585/feat-reduce-the-size-of-the-assignment-by-dropping-the-last-block-hash
@define-null define-null commented May 20, 2026

Closes: NET-585

What is this PR about?

last_block_hash is stored per chunk in the assignment to verify chunk continuity. However, given the current hash format, it occupies a significant amount of space. This PR adds a clear_last_block_hash config option that skips generating the hash. This requires strict_continuity_check to be disabled, but reduces the assignment size substantially.
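
The dependency between the two options could be enforced when the config is loaded. The following is only a sketch: the struct, the `validate` method, and the assumption that both flags live in the same config struct are hypothetical; only the names `clear_last_block_hash` and `strict_continuity_check` come from this PR.

```rust
/// Hypothetical subset of the scheduler config.
/// Only the two flag names are taken from the PR description.
#[derive(Debug, Clone, Copy)]
struct SchedulerConfig {
    clear_last_block_hash: bool,
    strict_continuity_check: bool,
}

impl SchedulerConfig {
    /// Rejects the combination the PR describes as unsupported:
    /// hashes cleared while strict continuity checking is still enabled.
    fn validate(&self) -> Result<(), String> {
        if self.clear_last_block_hash && self.strict_continuity_check {
            return Err(
                "clear_last_block_hash requires strict_continuity_check to be disabled"
                    .to_string(),
            );
        }
        Ok(())
    }
}
```

Failing early at load time would surface the conflict before an assignment is generated, rather than at verification time.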

Example (mainnet, May 2026):

| | Flatbuffer | Gzipped |
|---|---|---|
| With hash | 1.09 GB | 545 MB |
| Without hash | 819 MB | 321 MB |
| Reduction | 27% | 41% |

Update: with last_block_hash set conditionally (hashes written only for the last chunks in the dataset), the sizes are:
722 MB Flatbuffer
319 MB Gzipped

How does it work?

  • clear_last_block_hash: bool added to the scheduler config. When true, encode_fb writes actual hashes only for the last chunks in the dataset.
  • src/lib.rs extracted so tools can depend on the main crate.
  • A clear-block-hash CLI utility is included for rewriting existing assignments with the hash cleared, useful for evaluating the impact before enabling in production.
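
The conditional behaviour in the first bullet can be sketched on plain structs. This is not the PR's `encode_fb`: the `Chunk` type, both function names, and the `keep_tail` parameter (how many trailing chunks keep their hash) are assumptions made for illustration.

```rust
/// Hypothetical in-memory chunk; only `last_block_hash` is named in the PR.
#[derive(Debug, Clone, PartialEq)]
struct Chunk {
    id: String,
    last_block_hash: Option<String>,
}

/// Picks the hash to encode for the chunk at `index` out of `total`.
/// When `clear` is set, only the last `keep_tail` chunks keep their hash.
fn hash_to_encode<'a>(
    chunk: &'a Chunk,
    index: usize,
    total: usize,
    clear: bool,
    keep_tail: usize,
) -> Option<&'a str> {
    let in_tail = index + keep_tail >= total;
    if clear && !in_tail {
        None
    } else {
        chunk.last_block_hash.as_deref()
    }
}

/// Applies `hash_to_encode` to every chunk of one dataset, in order.
fn encode_hashes(chunks: &[Chunk], clear: bool, keep_tail: usize) -> Vec<Option<&str>> {
    let total = chunks.len();
    chunks
        .iter()
        .enumerate()
        .map(|(i, c)| hash_to_encode(c, i, total, clear, keep_tail))
        .collect()
}
```

With `clear` set to false every chunk keeps its hash, so the output matches the previous behaviour.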

Running the tool

cargo run -p clear-block-hash -- -c path/to/config.yaml input.fb output.fb

define-null and others added 3 commits May 20, 2026 15:11
…k_hash

Reads a flatbuffer assignment, rebuilds it using the builder API with
last_block_hash set to empty string on every chunk, preserving all
other fields and worker data.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@define-null define-null requested a review from kalabukdima May 20, 2026 14:05
Comment thread src/types/assignment.rs Outdated
Comment thread src/types/assignment.rs Outdated
@define-null define-null requested a review from kalabukdima May 20, 2026 14:29
@define-null

As discussed, the portal only needs hashes for the head of the dataset, so I pushed a change that conditionally sets hashes to None while keeping the hash at the head of the dataset. This mitigates the size issue without breaking compatibility with existing clients.
