
fix: prevent piped stream from dropping overflow fragment#3455

Closed
stephensong wants to merge 1 commit into freenet:main from stephensong:fix/piped-stream-premature-termination

Conversation

@stephensong
Contributor

Summary

  • Bug: When metadata is embedded in fragment #1 of a PUT stream, the sender uses one extra "overflow" fragment. At intermediate forwarding nodes, `StreamingInboundStream::poll_next` returned `None` when `is_complete()` fired (at the base fragment count) but the overflow fragment had not yet arrived. This caused piped stream forwarding to skip the last fragment, leaving all downstream nodes stuck at 163/164 fragments for 30s before timing out.

  • Fix: Add a `bytes_read >= total_bytes` check: only treat the overflow slot as unused once the iterator has already yielded enough bytes to cover the declared stream size. Otherwise, wait for the missing fragment.

  • Test: `test_streaming_inbound_stream_basic` updated to use a realistic `total_bytes` (it declared 15 but pushed only 5 bytes of data). A new regression test, `test_stream_iterator_waits_for_overflow_fragment`, reproduces the exact race condition.

Root Cause Analysis

The streaming buffer allocates `base + 1` fragment slots, where `base = ceil(total_bytes / FRAGMENT_PAYLOAD_SIZE)`. The extra slot handles metadata overhead in fragment #1. When `is_complete()` returns true (contiguous fragments >= base), the iterator checked `is_complete() && get(next_idx).is_none()` to detect an unused overflow slot. But when the overflow slot IS used (metadata embedded), there is a window in which `is_complete()` fires before the overflow fragment arrives via UDP retransmission.
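The slot arithmetic above can be sketched as follows. The `FRAGMENT_PAYLOAD_SIZE` value here is an assumed illustrative constant, not the real one from the freenet streaming module:

```rust
// Assumed payload size per fragment, for illustration only.
const FRAGMENT_PAYLOAD_SIZE: u64 = 1424;

fn base_fragments(total_bytes: u64) -> u64 {
    // ceil(total_bytes / FRAGMENT_PAYLOAD_SIZE) without floating point
    (total_bytes + FRAGMENT_PAYLOAD_SIZE - 1) / FRAGMENT_PAYLOAD_SIZE
}

fn main() {
    // A payload exactly one fragment long needs one base fragment...
    assert_eq!(base_fragments(FRAGMENT_PAYLOAD_SIZE), 1);
    // ...and one extra byte tips it into a second.
    assert_eq!(base_fragments(FRAGMENT_PAYLOAD_SIZE + 1), 2);

    // The buffer allocates one slot beyond the base count, to absorb
    // the metadata overhead that can push fragment #1 over its payload.
    let total_bytes = 230 * 1024; // ~230KB, as in the reported PUTs
    let base = base_fragments(total_bytes);
    let slots = base + 1;
    // When metadata is embedded, the sender uses all `slots` fragments,
    // so "contiguous >= base" can become true one fragment early.
    assert!(slots > base);
}
```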

In a multi-hop PUT chain (e.g., n2 → g1 → g3 → n3 → gw → n1), node g1 would finish its piped forwarding with only 163 fragments. The 164th fragment would eventually arrive at g1 via packet retransmission, but the pipe task had already exited.

Observed Impact

In a 7-node local network running CREAM (dairy marketplace), ~40% of PUT operations for user contracts (~230KB with WASM contract code) would fail with stream assembly timeouts. After the fix, 0 stream assembly failures across multiple test runs.

Test plan

  • All 40 existing streaming tests pass
  • All 37 streaming buffer tests pass
  • All 15 piped stream tests pass
  • New regression test passes
  • CREAM integration tests: 0 stream assembly failures (was ~40% failure rate)

🤖 Generated with Claude Code

@github-actions
Contributor

github-actions Bot commented Mar 7, 2026

Claude Rule Review: Failed to run

The Claude rule review step failed (likely an authentication or API issue). Check the workflow logs for details.


Advisory review against .claude/rules/. Critical patterns are enforced by the Rule Lint CI job.

…ragment arrives

When metadata is embedded in fragment #1, the sender uses one extra
"overflow" fragment beyond the base fragment count. The stream iterator's
`poll_next` was returning `None` when `is_complete()` fired (at base
count contiguous) and the overflow slot was empty — but the overflow
fragment might still be in transit.

This caused piped stream forwarding at intermediate nodes to skip the
last fragment, leaving all downstream nodes stuck waiting for it with
163/164 fragments received, eventually timing out after 30s.

The fix adds a `bytes_read >= total_bytes` check: the overflow slot is
only considered genuinely unused when the iterator has already yielded
enough bytes to cover the declared stream size. If not, the iterator
waits for the missing fragment instead of terminating.

Includes a regression test that reproduces the exact scenario.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@stephensong stephensong force-pushed the fix/piped-stream-premature-termination branch from 4dae773 to 93b7e03 Compare March 28, 2026 02:50