fix: prevent piped stream from dropping overflow fragment#3455
Closed
stephensong wants to merge 1 commit intofreenet:mainfrom
Closed
fix: prevent piped stream from dropping overflow fragment#3455stephensong wants to merge 1 commit intofreenet:mainfrom
stephensong wants to merge 1 commit intofreenet:mainfrom
Conversation
Contributor
Claude Rule Review: Failed to runThe Claude rule review step failed (likely an authentication or API issue). Check the workflow logs for details. Advisory review against |
4 tasks
…ragment arrives When metadata is embedded in fragment freenet#1, the sender uses one extra "overflow" fragment beyond the base fragment count. The stream iterator's `poll_next` was returning `None` when `is_complete()` fired (at base count contiguous) and the overflow slot was empty — but the overflow fragment might still be in transit. This caused piped stream forwarding at intermediate nodes to skip the last fragment, leaving all downstream nodes stuck waiting for it with 163/164 fragments received, eventually timing out after 30s. The fix adds a `bytes_read >= total_bytes` check: the overflow slot is only considered genuinely unused when the iterator has already yielded enough bytes to cover the declared stream size. If not, the iterator waits for the missing fragment instead of terminating. Includes a regression test that reproduces the exact scenario. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
4dae773 to
93b7e03
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Bug: When metadata is embedded in fragment Overall architecture RFC #1 of a PUT stream, the sender uses one extra "overflow" fragment. At intermediate forwarding nodes,
StreamingInboundStream::poll_nextreturnedNonewhenis_complete()fired (at base fragment count) but the overflow fragment hadn't arrived yet. This caused piped stream forwarding to skip the last fragment, leaving all downstream nodes stuck at 163/164 fragments for 30s before timing out.Fix: Add
bytes_read >= total_bytescheck — only treat the overflow slot as unused when the iterator has already yielded enough bytes to cover the declared stream size. If not, wait for the missing fragment.Test:
test_streaming_inbound_stream_basicupdated to use correct total_bytes (was 15 but only pushed 5 bytes of data — unrealistic). New regression testtest_stream_iterator_waits_for_overflow_fragmentreproduces the exact race condition.Root Cause Analysis
The streaming buffer allocates
base + 1fragment slots, wherebase = ceil(total_bytes / FRAGMENT_PAYLOAD_SIZE). The extra slot handles metadata overhead in fragment #1. Whenis_complete()returns true (contiguous fragments >= base), the iterator checkedis_complete() && get(next_idx).is_none()to detect unused overflow. But when the overflow IS used (metadata embedded), there's a window whereis_complete()fires before the overflow fragment arrives via UDP retransmission.In a multi-hop PUT chain (e.g.,
n2 → g1 → g3 → n3 → gw → n1), node g1 would finish its piped forwarding with only 163 fragments. The 164th fragment would eventually arrive at g1 via packet retransmission, but the pipe task had already exited.Observed Impact
In a 7-node local network running CREAM (dairy marketplace), ~40% of PUT operations for user contracts (~230KB with WASM contract code) would fail with stream assembly timeouts. After the fix, 0 stream assembly failures across multiple test runs.
Test plan
🤖 Generated with Claude Code