feat: Reduce the size of the assignment by dropping the last block hash #27
Open
define-null wants to merge 9 commits into
Conversation
…k_hash Reads a flatbuffer assignment, rebuilds it using the builder API with last_block_hash set to empty string on every chunk, preserving all other fields and worker data. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… in the assignment
kalabukdima
requested changes
May 20, 2026
Contributor
Author
As discussed - the portal only needs hashes for the head of the dataset, so pushed the change to conditionally set hashes to |
Closes: NET-585
What is this PR about?
`last_block_hash` is stored per chunk in the assignment to verify chunk continuity. However, given the current hash format, it occupies a significant amount of space. This PR adds a `clear_last_block_hash` config option that skips generating the hash. This requires `strict_continuity_check` to be disabled, but reduces the assignment size substantially.
Example (mainnet, May 2026):
Updated: conditional last_block_hash call:
722 MB Flatbuffer
319 MB Gzipped
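The conditional behaviour described in this PR (real hashes only for chunks at the head of the dataset, empty strings everywhere else) might look roughly like the sketch below. It is an illustration under stated assumptions, not the code from this PR: the `Chunk` and `Config` types, the `head_chunks` field, and both function names are hypothetical, and the actual `encode_fb` logic may decide which chunks count as "last" differently.

```rust
/// Hypothetical stand-in for a chunk record in the assignment.
#[derive(Debug, Clone, PartialEq)]
pub struct Chunk {
    pub id: String,
    pub last_block_hash: String,
}

/// Hypothetical subset of the scheduler config.
#[derive(Debug, Clone, Copy)]
pub struct Config {
    /// Mirrors the `clear_last_block_hash` option from the PR description.
    pub clear_last_block_hash: bool,
    /// Assumption: number of trailing chunks treated as the dataset head.
    pub head_chunks: usize,
}

/// Picks the hash to write for the chunk at `index` out of `total` chunks.
/// Head chunks always keep their hash; other chunks get an empty string
/// when `clear_last_block_hash` is enabled.
pub fn hash_to_write<'a>(
    cfg: &Config,
    chunk: &'a Chunk,
    index: usize,
    total: usize,
) -> &'a str {
    // Written as an addition to avoid underflow when head_chunks > total.
    let is_head = index + cfg.head_chunks >= total;
    if cfg.clear_last_block_hash && !is_head {
        ""
    } else {
        &chunk.last_block_hash
    }
}

/// Applies `hash_to_write` to every chunk of one dataset, in order.
pub fn hashes_for_dataset(cfg: &Config, chunks: &[Chunk]) -> Vec<String> {
    chunks
        .iter()
        .enumerate()
        .map(|(i, c)| hash_to_write(cfg, c, i, chunks.len()).to_string())
        .collect()
}
```

With the option disabled, every chunk keeps its hash, so the sketch reproduces the pre-change output and the two modes can be compared on the same input.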
How does it work?
- `clear_last_block_hash: bool` added to the scheduler config. When `true`, `encode_fb` writes actual hashes only for the last chunks in the dataset.
- `src/lib.rs` extracted so tools can depend on the main crate.
- `clear-block-hash` CLI utility is included for rewriting existing assignments with the hash cleared, useful for evaluating the impact before enabling in production.
Running the tool