Skip to content

memory optimization of tracing #11558

Closed
DarkLord017 wants to merge 3 commits into
masterfrom
memory-fix-trace
Closed

memory optimization of tracing #11558
DarkLord017 wants to merge 3 commits into
masterfrom
memory-fix-trace

Conversation

@DarkLord017
Copy link
Copy Markdown
Contributor

Fixes Closes Resolves #11513

Please choose one of the keywords above to refer to the issue this PR solves followed by the issue number (e.g. Fixes #000). If no issue number, remove the line. Also, remove everything marked optional that is not applicable. Remove this note after reading.

Changes

  • List the changes

Types of changes

What types of changes does your code introduce?

  • Bugfix (a non-breaking change that fixes an issue)
  • New feature (a non-breaking change that adds functionality)
  • Breaking change (a change that causes existing functionality not to work as expected)
  • Optimization
  • Refactoring
  • Documentation update
  • Build-related changes
  • Other: Description

Testing

Requires testing

  • Yes
  • No

If yes, did you write tests?

  • Yes
  • No

Notes on testing

Optional. Remove if not applicable.

Documentation

Requires documentation update

  • Yes
  • No

If yes, link the PR to the docs update or the issue with the details labeled docs. Remove if not applicable.

Requires explanation in Release Notes

  • Yes
  • No

If yes, fill in the details here. Remove if not applicable.

Remarks

Optional. Remove if not applicable.

@DarkLord017
Copy link
Copy Markdown
Contributor Author

@claude review

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 11, 2026

Claude finished @DarkLord017's task in 7m 40s —— View job


PR Review: memory optimization of tracing

Tasks

  • Gather context and understand the request
  • Read changed files
  • Read relevant agent rules (coding-style, robustness, performance)
  • Analyze for correctness, security, and performance issues
  • Post inline comments for specific findings
  • Post top-level summary

Summary

The approach is sound: replacing list-accumulation with a bounded channel + IAsyncEnumerable pipeline is the right way to achieve O(1) peak trace memory. The hold-back mechanism for ReportReward is well-reasoned. However, there are two High issues blocking merge, and several Medium issues that should be addressed.


High

1. Silent data loss / behavioral regression on timeout in EndBlockTrace
→ inline comment on ChannelParityLikeBlockTracer.cs:109

When the RPC timeout fires while the consumer is slow (serialising the second-to-last trace), WriteAsync(_held, _cancellationToken) in EndBlockTrace throws OperationCanceledException. The finally { _writer.TryComplete(); } then completes the channel without error, so the consumer exits ReadAllAsync normally — missing the last trace and returning a truncated-but-apparently-valid response to the client. Before this PR, the same timeout would propagate as an error, returning a proper timeout response.

Note: CancellationBlockTracer.EndBlockTrace() does not call ThrowIfCancellationRequested() (line 47 of CancellationBlockTracer.cs), so the only cancellation surface here is the inner WriteAsync. Fix: capture the exception and pass it to TryComplete(ex) in the finally.

2. Dead code: TraceBlock(Block, ParityLikeBlockTracer) is unreachable

Before this PR, trace_callMany called TraceBlock (confirmed via git show HEAD^). This PR replaced that call with StreamBlockViaChannelAsync without removing the now-dead TraceBlock method (TraceRpcModule.cs:341). Fix this →

3. No tests

The PR introduces a new concurrent producer-consumer pattern with: a bounded channel, a hold-back slot, ReportReward mutation, cancellation propagation, and backpressure. There are zero new tests. Given the subtlety of the cancellation bug found above, regression tests (at minimum: cancellation mid-stream, block with rewards, empty block) are needed. Per repo guidelines: "When fixing a bug, always add a regression test" — and this is a new feature of equivalent complexity.


Medium

4. Unused constructorChannelParityLikeBlockTracer(Hash256 txHash, ...) is never called in the PR. → inline

5. DRY violationParityTxTraceFromReplayStreamingResult and ParityTxTraceFromStoreStreamingResult are byte-for-byte identical except for the generic type. Any future bug or serialisation change requires a dual fix. → inline

6. Thread-pool starvation in batch-request fallbackGetEnumerator() calls ToBlockingEnumerable() on a channel-backed IAsyncEnumerable, blocking one thread-pool thread while another does EVM execution. Under concurrent batch load this is a starvation vector (robustness rules: never .Wait() on a Task on the thread pool). → inline

7. trace_filter is still O(block_size) per blockStreamFilterAsync calls ExecuteBlock(...) (the old buffering path) for every block. The outer IAsyncEnumerable reduces cross-block memory but within a block it's unchanged. → inline


Low

8. Extra initial pipe flush — both new streaming results call await pipeWriter.FlushAsync() right after WriteStartArray(), sending [ to the client as a lone byte. The existing GethLikeTxTraceStreamingResult doesn't do this; remove it for consistency. → inline

9. Verbose extension callTaskAsyncEnumerableExtensions.WithCancellation(source, cancellationToken) should just be source.WithCancellation(cancellationToken). → inline

10. Trailing blank lineChannelParityLikeBlockTracer.cs:74, empty line inside AddTrace before the closing brace.

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 11, 2026

Claude finished @DarkLord017's task in 7m 26s —— View job


Code Review – memory optimization of tracing

  • Gather context and read rule files
  • Read all changed files
  • Analyze diff
  • Post review findings

Summary

The PR replaces the buffered List<ParityLikeTxTrace> accumulation in several trace_* RPC methods with a producer-consumer channel (ChannelParityLikeBlockTracer) that streams traces one at a time as EVM execution produces them. The approach is sound and the core design — the _held trace hold-back to accommodate ReportReward mutation — is clever and well-documented. However there are two issues that need to be addressed before merge.


Findings

High

trace_filter: silent truncation replaces RPC error responses (inline comment at line 438)

The original trace_filter returned ResultWrapper.Fail(blockSearch) when a mid-range block lookup failed or state was unavailable. The new StreamFilterAsync replaces all four error paths with yield break, which yields a successful JSON-RPC response with a partial or empty result array. API consumers that check for error responses will silently receive wrong data.

This is a breaking behavioral change relative to the existing Trace API contract.

Medium

StreamFilterAsync is async without await — EVM blocks the request thread (inline comment at line 422)

The method is declared async IAsyncEnumerable<T> but has no await expressions (compiler warning CS1998). ExecuteBlock runs synchronous EVM execution directly on whichever thread iterates the enumerator — i.e. the ASP.NET request thread. For a trace_filter spanning many blocks this can stall the thread pool. All other optimized paths in this PR correctly offload EVM work via StreamBlockViaChannelAsync / Task.Run. trace_filter should do the same, or at minimum the async keyword should be dropped and the synchronous semantics documented.

No tests

The PR introduces three new types and materially changes the streaming behaviour of trace_block, trace_replayBlockTransactions, trace_callMany, and trace_filter. There are no new tests, and the template marks "Requires testing: No". At a minimum, regression tests for the trace_filter error path (which now silently truncates instead of returning an error) would prevent future regressions.

Low

trace_filter still uses buffered ExecuteBlock per block — the channel-based optimization is not applied here, so the memory improvement claimed in the PR title is absent for trace_filter. This is inconsistent with the other methods and leaves the heaviest workload (potentially hundreds of blocks) unoptimized.

Nit: trailing blank line inside ChannelParityLikeBlockTracer.AddTrace (inline comment).


What works well

  • The ChannelParityLikeBlockTracer design is clean and the _held hold-back approach for ReportReward compatibility is correctly implemented and thoroughly documented.
  • StreamBlockViaChannelAsync handles cancellation and producer cleanup correctly: the finally block cancels the linked CTS, and the producer is awaited with SuppressThrowing.
  • Both IStreamableResult types follow the established GethLikeTxTraceStreamingResult pattern (including the final pipeWriter.FlushAsync being elided before framework completion, which is correct).
  • Constructors cover all three ParityLikeBlockTracer variants (types, hash+types, dict), so no call-site is left without streaming.

Verdict: not ready to merge — the trace_filter behavioral regression (High) must be resolved; the StreamFilterAsync CS1998 / thread-blocking issue (Medium) should also be addressed.

Comment thread src/Nethermind/Nethermind.JsonRpc/Modules/Trace/TraceRpcModule.cs Outdated
Comment thread src/Nethermind/Nethermind.JsonRpc/Modules/Trace/TraceRpcModule.cs Outdated
Comment thread src/Nethermind/Nethermind.JsonRpc/Modules/Trace/TraceRpcModule.cs Outdated
Comment thread src/Nethermind/Nethermind.JsonRpc/Modules/Trace/TraceRpcModule.cs
@DarkLord017
Copy link
Copy Markdown
Contributor Author

@claude re-review

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 21, 2026

Claude finished @DarkLord017's task in 6m 26s —— View job


Re-review: memory optimization of tracing

Tasks

  • Gather context
  • Read current state of changed files
  • Compare with previous review findings
  • Post updated review

Summary

Significant progress since the last review. The latest commit (2f70927) addresses all the previous High and most Medium/Low findings. The core implementation is now correct and the design is solid. One High issue (no tests) and two Medium issues remain.


What's Fixed

Previous Finding Status
High: Silent data loss on timeout in EndBlockTrace ✅ Fixed — exception is captured in fault and passed to TryComplete(fault)
High: Dead code TraceBlock method ✅ Fixed — method removed
High: trace_filter silent truncation instead of error ✅ Fixed — now throws InvalidOperationException
Medium: Unused Hash256 txHash constructor ✅ Fixed — removed
Medium: DRY violation (two identical streaming result classes) ✅ Fixed — merged into single generic ParityTxTraceStreamingResult<T>
Medium: trace_filter still buffers full blocks ✅ Fixed — now uses StreamBlockViaChannelAsync per block
Medium: CS1998 / StreamFilterAsync runs on request thread ✅ Fixed — now fully async via StreamBlockViaChannelAsync with await foreach
Low: Verbose TaskAsyncEnumerableExtensions.WithCancellation(...) ✅ Fixed — uses source.WithCancellation(...) directly
Low: Extra pipeWriter.FlushAsync after WriteStartArray ✅ Fixed — removed the network flush, jsonWriter.Flush() (to pipe buffer) is correct and consistent with GethLikeTxTraceStreamingResult
Low: Trailing blank line in AddTrace ✅ Fixed

Remaining Issues

High

No tests

All three PR commits introduce zero tests. The new ChannelParityLikeBlockTracer contains a non-trivial concurrent design: a _held slot that requires correct ordering across AddTrace / ReportReward / EndBlockTrace, synchronous blocking on WriteAsync from a producer thread, and cancellation-fault propagation through TryComplete. The just-fixed EndBlockTrace bug (TryComplete() without the exception) existed because this interaction was not covered by tests. At minimum, unit tests should cover:

  • Normal block (multiple traces, no reward) — channel yields all traces
  • Block with reward — _held is mutated by ReportReward, then emitted correctly in EndBlockTrace
  • Cancellation mid-stream — TryComplete(ex) propagates through ReadAllAsync
  • StreamFilterAsync with count limit — correct early yield break

Medium

GetEnumerator() thread-pool starvation risk

// ParityTxTraceStreamingResult.cs:53
source.ToListAsync(CancellationToken.None).AsTask().GetAwaiter().GetResult().GetEnumerator();

Under concurrent batch requests, this blocks one thread-pool thread per request while a second thread (the Task.Run producer) runs EVM execution. N concurrent batch requests consume 2N threads, which can trigger thread-pool starvation (per the robustness rules: no .GetAwaiter().GetResult() on the thread pool). The comment documents the design intent, but the risk is real. Consider materialising via a dedicated thread (Task.Factory.StartNew(..., TaskCreationOptions.LongRunning)) or limiting batch concurrency, rather than blocking a pool thread indefinitely.

Mid-stream trace_filter errors return HTTP 500, not a JSON-RPC error object (→ code)

WriteStreamableResponseAsync (Startup.cs:513) writes {"jsonrpc":"2.0","result": before calling WriteToAsync. If StreamFilterAsync then throws InvalidOperationException (e.g. state unavailable for a mid-range block), the catch at line 408 only handles OperationCanceledException, so the exception propagates as an unhandled error, resulting in an HTTP 500 or abrupt connection close rather than a well-formed {"jsonrpc":"2.0","error":{"code":...}} response.

This affects only edge-case paths (pruned state in the middle of a trace_filter range) — the common-path upfront validation (fromBlock/toBlock checks, lines 232–242) still returns proper JSON-RPC errors. The old behaviour (silent empty result) was arguably worse, but the change in error shape for this path is worth documenting explicitly.

Low

Nullable warning regressionParityTxTraceStreamingResult.cs:70 removes the null-forgiving ! from Activator.CreateInstance(converterType) without adding a null check. Activator.CreateInstance returns object?, so this now produces a CS8600 warning. The original ! was correct: a generic closed type with a parameterless constructor will never return null here.

        return (JsonConverter)Activator.CreateInstance(converterType)!;

Verdict: not ready to merge — the absence of tests for a non-trivial concurrent component (High) should be addressed before merge. The two Medium findings should either be fixed or explicitly acknowledged with rationale.
| Branch

public override JsonConverter? CreateConverter(Type typeToConvert, JsonSerializerOptions options)
{
Type itemType = typeToConvert.GetGenericArguments()[0];
Type converterType = typeof(ParityTxTraceStreamingResultConverter<>).MakeGenericType(itemType);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Low: Nullable warning regression — the ! was removed but Activator.CreateInstance returns object?. A concrete closed generic type with a default constructor will never return null here, so the null-forgiving operator was correct. Restoring it keeps the compiler happy without adding a dead null check.

Suggested change
Type converterType = typeof(ParityTxTraceStreamingResultConverter<>).MakeGenericType(itemType);
return (JsonConverter)Activator.CreateInstance(converterType)!;

@svlachakis
Copy link
Copy Markdown
Contributor

I think I'll take that over based on #11693, closing

@svlachakis svlachakis closed this May 22, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improve streaming in trace_* to use less memory

2 participants