refactor(node): introduce BatchVerifier, remove derivation service #904
Conversation
Extract batch fetching/parsing/verification logic into a stateless BatchVerifier. Remove the standalone derivation block-production service and its isValidator startup branch. BatchVerifier will be integrated into BlockTagService as the next step.
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
📝 Walkthrough

Replaces the monolithic Derivation service with a stateless BatchVerifier and new BatchProcessor, rewires node startup and executor initialization, integrates optional full L1→L2 batch verification into BlockTagService, makes geth config retries context-aware, and removes the prior Derivation lifecycle and related artifacts.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant App as BatchProcessor / BlockTagService
    participant BV as BatchVerifier
    participant L1 as L1 Client
    participant Beacon as L1 Beacon
    participant L2 as L2 Client
    participant Val as Validator
    App->>L1: Find CommitBatch tx (FilterLogs / txHash)
    App->>BV: FetchBatchRoots(ctx, txHash / batchIndex)
    BV->>L1: Get transaction & calldata
    alt blobs exist
        BV->>Beacon: Fetch blob sidecars
        Beacon-->>BV: Blob sidecars
    end
    BV->>BV: Parse calldata -> BatchRoots / BlockContexts
    App->>BV: FetchBatchData(ctx, txHash)
    BV->>BV: Decode blobs -> BatchInfo (optional)
    App->>BV: VerifyBatch(ctx, l2Client, roots, batchData)
    rect rgba(100,150,255,0.5)
        Note over BV,L2: Step 1: PostStateRoot vs L2 state
        BV->>L2: HeaderByNumber(LastBlockNum)
        L2-->>BV: Header (StateRoot)
        BV->>BV: Compare roots
    end
    rect rgba(100,150,255,0.5)
        Note over BV,L2: Steps 2-3: Withdrawal & PrevStateRoot checks
        BV->>L2: BlockByNumber(LastBlockNum)
        L2-->>BV: Block (WithdrawalRoot)
        BV->>BV: Compare withdrawal & prev roots
    end
    rect rgba(100,150,255,0.5)
        Note over BV,L2: Steps 4-5: Block contexts & tx content verification
        BV->>L2: HeaderByNumber(per-block) / BlockByNumber(per-block)
        L2-->>BV: Headers / Blocks / Tx bytes
        BV->>BV: Verify timestamps/gas/basefee and tx bytes vs blobs
    end
    alt Mismatch Detected
        BV->>Val: Trigger challenge
        Val-->>BV: Challenge response
    end
    BV-->>App: Verification result
```
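The root-comparison steps in the diagram reduce to byte-equality checks of committed roots against the L2 chain's own headers, with a mismatch escalating to the challenge path. A minimal, self-contained sketch of that idea (all type and field names below are hypothetical simplifications, not the repository's actual API):

```go
package main

import (
	"bytes"
	"fmt"
)

// BatchRoots is a hypothetical, simplified stand-in for the committed roots
// parsed from L1 calldata.
type BatchRoots struct {
	PostStateRoot  [32]byte
	WithdrawalRoot [32]byte
}

// l2Header is a hypothetical stand-in for the L2 node's view of the same block.
type l2Header struct {
	StateRoot      [32]byte
	WithdrawalRoot [32]byte
}

// verifyRoots mirrors steps 1-2 of the diagram: compare the committed roots
// against what the L2 chain reports for the batch's last block.
func verifyRoots(roots BatchRoots, h l2Header) error {
	if !bytes.Equal(roots.PostStateRoot[:], h.StateRoot[:]) {
		return fmt.Errorf("post state root mismatch")
	}
	if !bytes.Equal(roots.WithdrawalRoot[:], h.WithdrawalRoot[:]) {
		return fmt.Errorf("withdrawal root mismatch")
	}
	return nil // a mismatch would instead trigger the challenge path
}

func main() {
	r := BatchRoots{PostStateRoot: [32]byte{1}, WithdrawalRoot: [32]byte{2}}
	h := l2Header{StateRoot: [32]byte{1}, WithdrawalRoot: [32]byte{2}}
	fmt.Println(verifyRoots(r, h) == nil) // true
}
```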
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 3
🧹 Nitpick comments (1)
node/cmd/node/main.go (1)
89-93: Wrap startup errors with `%w` instead of `%v`. At Line 89 and Line 92, `%v` loses error-chain semantics and blocks `errors.Is`/`errors.As` handling upstream.
♻️ Proposed fix

```diff
-	if tmNode, err = sequencer.SetupNode(tmCfg, tmVal, executor, nodeConfig.Logger); err != nil {
-		return fmt.Errorf("failed to setup consensus node, error: %v", err)
+	if tmNode, err = sequencer.SetupNode(tmCfg, tmVal, executor, nodeConfig.Logger); err != nil {
+		return fmt.Errorf("failed to setup consensus node: %w", err)
 	}
-	if err = tmNode.Start(); err != nil {
-		return fmt.Errorf("failed to start consensus node, error: %v", err)
+	if err = tmNode.Start(); err != nil {
+		return fmt.Errorf("failed to start consensus node: %w", err)
 	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@node/cmd/node/main.go` around lines 89 - 93, The error returns around the consensus startup lose error-chain semantics because they use fmt.Errorf with %v; update both fmt.Errorf calls that return errors from the consensus setup and tmNode.Start() to use %w so callers can use errors.Is/errors.As to inspect wrapped errors (look for the fmt.Errorf usages that construct "failed to setup consensus node, error: %v" and "failed to start consensus node, error: %v" adjacent to tmNode.Start()).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@node/cmd/node/main.go`:
- Around line 75-77: The call to tmVal.GetPubKey() currently ignores its error
and can pass a nil/zero pubKey into node.NewExecutor (via tmVal/GetPubKey and
newSyncerFunc), which later causes silent behavioral changes in
updateSequencerSet; change the code that calls GetPubKey (the tmVal.GetPubKey()
invocation in main.go) to check and handle the returned error explicitly: if
GetPubKey returns an error, log the error (with context) and return/exit early
(or propagate the error) instead of continuing with a nil pubKey so NewExecutor
receives a valid key or the process fails fast; apply the same fix pattern where
GetPubKey is used (e.g., ops/tools/keygen/main.go) to avoid silent zero-valued
tmPubKey use that breaks updateSequencerSet.
In `@node/derivation/batch_verifier.go`:
- Around line 199-223: The fallback read is using the incoming parameter
batchIndex instead of the decoded index derived from calldata (parentBatchIndex
-> roots.BatchIndex), which can read the wrong BatchDataStore entry; update the
BatchDataStore call to use the decoded index (roots.BatchIndex or
parentBatchIndex+1) converted to *big.Int (same conversion already used) so the
fallback uses the same batch index as set on BatchRoots; ensure the call remains
bv.rollup.BatchDataStore(&bind.CallOpts{Context: ctx},
new(big.Int).SetUint64(roots.BatchIndex)).
- Around line 342-366: The loop currently treats any single match as sufficient
(only checking matchedCount == 0) so partial coverage of blobHashes can slip
through; change the matching logic to ensure every expected hash in blobHashes
is matched exactly once: build a map keyed by expectedHash (from blobHashes) and
mark entries when you find a matching versionedHash (inside the loop over
blobSidecars using KZGToVersionedHash), append to blobTxSidecar as before, and
after scanning all sidecars verify that every expectedHash map entry was matched
(or that matchedCount == len(blobHashes)); if any expectedHash is not matched,
return the error currently returned for no matches (use l1TxHash.Hex() for
context). Ensure you still copy/convert the blob, commitment and proof into
blobTxSidecar (Blob, KZGBlob, commitment, kzg4844.Proof) only when a match is
recorded.
---
Nitpick comments:
In `@node/cmd/node/main.go`:
- Around line 89-93: The error returns around the consensus startup lose
error-chain semantics because they use fmt.Errorf with %v; update both
fmt.Errorf calls that return errors from the consensus setup and tmNode.Start()
to use %w so callers can use errors.Is/errors.As to inspect wrapped errors (look
for the fmt.Errorf usages that construct "failed to setup consensus node, error:
%v" and "failed to start consensus node, error: %v" adjacent to tmNode.Start()).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9d7ac09e-edf3-48b2-94ab-d61270652ac7
📒 Files selected for processing (5)
- node/cmd/node/main.go
- node/derivation/batch_verifier.go
- node/derivation/derivation.go
- node/derivation/derivation_test.go
- node/types/retryable_client.go
💤 Files with no reviewable changes (1)
- node/derivation/derivation.go
```go
// Derive batchIndex from parentBatchHeader embedded in calldata
parentBatchHeader := types.BatchHeaderBytes(batch.ParentBatchHeader)
parentBatchIndex, err := parentBatchHeader.BatchIndex()
if err != nil {
	return nil, fmt.Errorf("decode parent batch index: %w", err)
}

roots := &BatchRoots{
	BatchIndex:     parentBatchIndex + 1,
	LastBlockNum:   batch.LastBlockNumber,
	PrevStateRoot:  batch.PrevStateRoot,
	PostStateRoot:  batch.PostStateRoot,
	WithdrawalRoot: batch.WithdrawRoot,
	NumL1Messages:  batch.NumL1Messages,
}

// v0/legacy batches do not encode LastBlockNumber in calldata (it's in the blob).
// Fall back to the on-chain BatchDataStore.
if roots.LastBlockNum == 0 {
	batchData, err := bv.rollup.BatchDataStore(&bind.CallOpts{Context: ctx}, new(big.Int).SetUint64(batchIndex))
	if err != nil {
		return nil, fmt.Errorf("query BatchDataStore for batchIndex %d: %w", batchIndex, err)
	}
	roots.LastBlockNum = batchData.BlockNumber.Uint64()
}
```
Validate and consistently use the decoded batch index in fallback reads.
At Line 207 you derive roots.BatchIndex from calldata, but Line 218 queries BatchDataStore using the method argument batchIndex. If these diverge, fallback can read the wrong batch metadata and lead to false verification results.
🛠️ Proposed fix

```diff
-	roots := &BatchRoots{
-		BatchIndex:     parentBatchIndex + 1,
+	decodedBatchIndex := parentBatchIndex + 1
+	if batchIndex != decodedBatchIndex {
+		return nil, fmt.Errorf("batch index mismatch: arg=%d decoded=%d", batchIndex, decodedBatchIndex)
+	}
+
+	roots := &BatchRoots{
+		BatchIndex:     decodedBatchIndex,
 		LastBlockNum:   batch.LastBlockNumber,
 		PrevStateRoot:  batch.PrevStateRoot,
 		PostStateRoot:  batch.PostStateRoot,
 		WithdrawalRoot: batch.WithdrawRoot,
 		NumL1Messages:  batch.NumL1Messages,
 	}
@@
-	batchData, err := bv.rollup.BatchDataStore(&bind.CallOpts{Context: ctx}, new(big.Int).SetUint64(batchIndex))
+	batchData, err := bv.rollup.BatchDataStore(&bind.CallOpts{Context: ctx}, new(big.Int).SetUint64(decodedBatchIndex))
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```go
// Derive batchIndex from parentBatchHeader embedded in calldata
parentBatchHeader := types.BatchHeaderBytes(batch.ParentBatchHeader)
parentBatchIndex, err := parentBatchHeader.BatchIndex()
if err != nil {
	return nil, fmt.Errorf("decode parent batch index: %w", err)
}
decodedBatchIndex := parentBatchIndex + 1
if batchIndex != decodedBatchIndex {
	return nil, fmt.Errorf("batch index mismatch: arg=%d decoded=%d", batchIndex, decodedBatchIndex)
}
roots := &BatchRoots{
	BatchIndex:     decodedBatchIndex,
	LastBlockNum:   batch.LastBlockNumber,
	PrevStateRoot:  batch.PrevStateRoot,
	PostStateRoot:  batch.PostStateRoot,
	WithdrawalRoot: batch.WithdrawRoot,
	NumL1Messages:  batch.NumL1Messages,
}
// v0/legacy batches do not encode LastBlockNumber in calldata (it's in the blob).
// Fall back to the on-chain BatchDataStore.
if roots.LastBlockNum == 0 {
	batchData, err := bv.rollup.BatchDataStore(&bind.CallOpts{Context: ctx}, new(big.Int).SetUint64(decodedBatchIndex))
	if err != nil {
		return nil, fmt.Errorf("query BatchDataStore for batchIndex %d: %w", batchIndex, err)
	}
	roots.LastBlockNum = batchData.BlockNumber.Uint64()
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@node/derivation/batch_verifier.go` around lines 199 - 223, The fallback read
is using the incoming parameter batchIndex instead of the decoded index derived
from calldata (parentBatchIndex -> roots.BatchIndex), which can read the wrong
BatchDataStore entry; update the BatchDataStore call to use the decoded index
(roots.BatchIndex or parentBatchIndex+1) converted to *big.Int (same conversion
already used) so the fallback uses the same batch index as set on BatchRoots;
ensure the call remains bv.rollup.BatchDataStore(&bind.CallOpts{Context: ctx},
new(big.Int).SetUint64(roots.BatchIndex)).
🧩 Analysis chain
🏁 Script executed:
```shell
#!/bin/bash
# Find all callers of FetchBatchRoots
rg -n "FetchBatchRoots" --type go -C 5
# Also look at the full FetchBatchRoots function in batch_verifier.go
rg -n "FetchBatchRoots|batchIndex" node/derivation/batch_verifier.go -A 3 -B 3
# Check how BlockTagService calls FetchBatchRoots
rg -n "FetchBatchRoots\|BatchIndex\|batchIndex" node/blocktag/service.go -C 5
```

Length of output: 9460
@curryxbo, here's the deeper analysis:
Confirmed — the original concern is valid.
From the call site in node/blocktag/service.go line 369:
```go
roots, err := s.batchVerifier.FetchBatchRoots(s.ctx, txHash, batchIndex)
```

The batchIndex parameter comes from L1 event log scanning (the CommitBatch(uint256, bytes32) event). Inside FetchBatchRoots, roots.BatchIndex is independently derived from the calldata's embedded ParentBatchHeader.
In a well-formed chain these two sources agree. However, the inconsistency is architectural: after line 230 sets roots.BatchIndex from calldata, the rest of the function uses roots.BatchIndex as the authoritative decoded value (logging, ChallengeState, etc.) — except the BatchDataStore fallback at line 241, which still reaches back to the raw method parameter.
The fix is simply to use roots.BatchIndex in the fallback so the function is internally consistent with its own decoded state:
```go
if roots.LastBlockNum == 0 {
	batchData, err := bv.rollup.BatchDataStore(
		&bind.CallOpts{Context: ctx},
		new(big.Int).SetUint64(roots.BatchIndex), // use decoded index, not method arg
	)
	if err != nil {
		return nil, fmt.Errorf("query BatchDataStore for batchIndex %d: %w", roots.BatchIndex, err)
	}
	roots.LastBlockNum = batchData.BlockNumber.Uint64()
}
```

This is a minimal, low-risk change that makes the fallback consistent with the rest of the function. No need for the divergence check I initially proposed — but the use of roots.BatchIndex in the fallback is the right fix regardless.
```go
for _, sidecar := range blobSidecars {
	var commitment kzg4844.Commitment
	copy(commitment[:], sidecar.KZGCommitment[:])
	versionedHash := KZGToVersionedHash(commitment)

	for _, expectedHash := range blobHashes {
		if !bytes.Equal(versionedHash[:], expectedHash[:]) {
			continue
		}
		matchedCount++
		b, err := hexutil.Decode(sidecar.Blob)
		if err != nil {
			return nil, fmt.Errorf("decode blob hex for tx %s: %w", l1TxHash.Hex(), err)
		}
		var blob Blob
		copy(blob[:], b)
		blobTxSidecar.Blobs = append(blobTxSidecar.Blobs, *blob.KZGBlob())
		blobTxSidecar.Commitments = append(blobTxSidecar.Commitments, commitment)
		blobTxSidecar.Proofs = append(blobTxSidecar.Proofs, kzg4844.Proof(sidecar.KZGProof))
		break
	}
}
if matchedCount == 0 {
	return nil, fmt.Errorf("FetchBatchData: no matching blob found for tx %s", l1TxHash.Hex())
}
```
Require full blob-hash coverage, not just “at least one match.”
At Line 364, only matchedCount == 0 is treated as failure. Partial matches currently pass, which can leave batch.Sidecar incomplete and cause incorrect batch parsing/verification.
🛠️ Proposed fix

```diff
-	if matchedCount == 0 {
-		return nil, fmt.Errorf("FetchBatchData: no matching blob found for tx %s", l1TxHash.Hex())
+	if matchedCount != len(blobHashes) {
+		return nil, fmt.Errorf(
+			"FetchBatchData: incomplete blob match for tx %s: matched=%d expected=%d",
+			l1TxHash.Hex(), matchedCount, len(blobHashes),
+		)
 	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@node/derivation/batch_verifier.go` around lines 342 - 366, The loop currently
treats any single match as sufficient (only checking matchedCount == 0) so
partial coverage of blobHashes can slip through; change the matching logic to
ensure every expected hash in blobHashes is matched exactly once: build a map
keyed by expectedHash (from blobHashes) and mark entries when you find a
matching versionedHash (inside the loop over blobSidecars using
KZGToVersionedHash), append to blobTxSidecar as before, and after scanning all
sidecars verify that every expectedHash map entry was matched (or that
matchedCount == len(blobHashes)); if any expectedHash is not matched, return the
error currently returned for no matches (use l1TxHash.Hex() for context). Ensure
you still copy/convert the blob, commitment and proof into blobTxSidecar (Blob,
KZGBlob, commitment, kzg4844.Proof) only when a match is recorded.
🧩 Analysis chain
🏁 Script executed:
```shell
#!/bin/bash
# Get the blob matching section of batch_verifier.go
rg -n "matchedCount|blobHashes|blobSidecars|blobTxSidecar" node/derivation/batch_verifier.go -A2 -B2
```

Length of output: 2194
@curryxbo, the issue is confirmed — Line 387 only guards against zero matches, so partial sidecar coverage (e.g., 2 out of 3 blobs matched) silently passes and leaves batch.Sidecar incomplete.
The minimal fix is the one originally proposed (matchedCount != len(blobHashes)). However, there is a subtle secondary risk: if two different sidecars happen to compute the same versionedHash and both match the same entry in blobHashes, matchedCount would reach len(blobHashes) without all distinct hashes being covered. A map-based approach eliminates this entirely:
```go
// Match each blob hash to its sidecar and assemble the BlobTxSidecar.
var blobTxSidecar eth.BlobTxSidecar
// Track which expected hashes have been matched to prevent double-counting.
matched := make(map[common.Hash]bool, len(blobHashes))
for _, sidecar := range blobSidecars {
	var commitment kzg4844.Commitment
	copy(commitment[:], sidecar.KZGCommitment[:])
	versionedHash := KZGToVersionedHash(commitment)
	for _, expectedHash := range blobHashes {
		if !bytes.Equal(versionedHash[:], expectedHash[:]) {
			continue
		}
		if matched[expectedHash] {
			// Already matched this hash via another sidecar; skip duplicate.
			break
		}
		matched[expectedHash] = true
		b, err := hexutil.Decode(sidecar.Blob)
		if err != nil {
			return nil, fmt.Errorf("decode blob hex for tx %s: %w", l1TxHash.Hex(), err)
		}
		var blob Blob
		copy(blob[:], b)
		blobTxSidecar.Blobs = append(blobTxSidecar.Blobs, *blob.KZGBlob())
		blobTxSidecar.Commitments = append(blobTxSidecar.Commitments, commitment)
		blobTxSidecar.Proofs = append(blobTxSidecar.Proofs, kzg4844.Proof(sidecar.KZGProof))
		break
	}
}
if len(matched) != len(blobHashes) {
	return nil, fmt.Errorf(
		"FetchBatchData: incomplete blob match for tx %s: matched=%d expected=%d",
		l1TxHash.Hex(), len(matched), len(blobHashes),
	)
}
bv.logger.Info("FetchBatchData: blobs matched", "matched", len(matched), "expected", len(blobHashes))
batch.Sidecar = blobTxSidecar
```

Key changes:
- The `matched` map tracks which `expectedHash` values have been covered, preventing double-counting.
- The guard on duplicate sidecar matches (`if matched[expectedHash]`) skips a sidecar that would re-match an already-covered hash.
- `len(matched) != len(blobHashes)` enforces full coverage.
- Wire BatchVerifier into BlockTagService; replace lightweight CommittedStateRoots-only check with 5-step VerifyBatch (PostStateRoot, WithdrawalRoot, PrevStateRoot, BlockContexts, txs)
- Add fetchCommitBatchTxHash to locate CommitBatch L1 tx via FilterLogs
- Use separate safeSearchTracker / finalizedSearchTracker to avoid safe-tag progress overwriting finalized-tag search start
- Add l1SearchTracker: fixed mode (L1StartBlock > 0) or auto mode (refined from last finalized batch at startup, then progressive)
- Read DerivationBaseHeight flag in main.go bvCfg to handle snapshot-synced nodes correctly
- Pass context.Context to FetchGethConfigWithRetry for cancellation
- Add derivation/README.md documenting BatchVerifier and integration
Actionable comments posted: 1
🧹 Nitpick comments (3)
node/cmd/node/main.go (1)
136-149: Consider stopping the Executor's Syncer during shutdown. The shutdown logic stops `mock.Sequencer`, `tmNode`, and `blockTagSvc`, but the `Executor` may hold a `Syncer` instance (created via `newSyncerFunc` in `devSequencer` mode per `node/core/executor.go:163-170`). While the Syncer's `Start()` is currently commented out, if it's ever enabled, the `Syncer.Stop()` method (see `node/sync/syncer.go:106-118`) should be called to properly release its resources.
♻️ Suggested improvement
Consider adding a `Stop()` or `Close()` method to `Executor` that manages the Syncer lifecycle, and call it in the shutdown path:

```go
// In shutdown section:
if executor != nil {
	executor.Stop() // Would internally call syncer.Stop() if syncer was started
}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@node/cmd/node/main.go` around lines 136 - 149, Shutdown currently omits stopping the Executor's Syncer, so add a lifecycle method (e.g., Executor.Stop or Executor.Close) on the Executor that checks for a non-nil Syncer (created via newSyncerFunc in devSequencer) and calls Syncer.Stop() when appropriate; then update the shutdown sequence to call executor.Stop() alongside ms.Stop(), tmNode.Stop(), and blockTagSvc.Stop() so the Syncer resources are properly released if Start() is ever enabled.

node/derivation/README.md (1)
22-32: Add language specifier to ASCII diagram code block. The fenced code block for the architecture diagram should have a language specifier to satisfy markdown linting. Since this is an ASCII diagram, use `text` or `plaintext`.
📝 Proposed fix: change the opening fence of the diagram block from ``` to ```text, leaving the diagram itself (BlockTagService (scheduler) │ polls L1 every N seconds …) unchanged.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@node/derivation/README.md` around lines 22 - 32, The fenced ASCII diagram block containing "BlockTagService (scheduler)" needs a language specifier to satisfy markdown linting; update the opening fence from ``` to ```text (or ```plaintext) so the diagram is treated as plain text—no code changes beyond replacing the opening triple backticks for the diagram block in the README.

node/blocktag/service.go (1)
9-19: Consider grouping imports consistently. The `github.com/morph-l2/go-ethereum` import (line 12) should be grouped with the other go-ethereum imports (lines 9-11) for consistency.
📝 Proposed fix

```diff
 import (
 	"context"
 	"fmt"
 	"math/big"
 	"time"

 	"github.com/morph-l2/go-ethereum/accounts/abi/bind"
 	"github.com/morph-l2/go-ethereum/common"
 	"github.com/morph-l2/go-ethereum/ethclient"
-	"github.com/morph-l2/go-ethereum"
+	ethereum "github.com/morph-l2/go-ethereum"
 	"github.com/morph-l2/go-ethereum/rpc"

 	tmlog "github.com/tendermint/tendermint/libs/log"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@node/blocktag/service.go` around lines 9 - 19, The imports in node/blocktag/service.go are inconsistently grouped: the standalone import "github.com/morph-l2/go-ethereum" should be moved into the go-ethereum group with the other related imports (e.g., "github.com/morph-l2/go-ethereum/accounts/abi/bind", "github.com/morph-l2/go-ethereum/common", "github.com/morph-l2/go-ethereum/ethclient", "github.com/morph-l2/go-ethereum/rpc") inside the same import block; update the import section so all go-ethereum packages are grouped together for consistency and readability.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@node/derivation/batch_verifier.go`:
- Around line 193-197: The Close method on BatchVerifier currently only closes
l1Client and l2EthClient but must also release the HTTP connection pool held by
l1BeaconClient; add a Close() method to the HTTP-based beacon client (e.g.,
BasicHTTPClient or the L1BeaconClient type) that calls the underlying
*http.Client.CloseIdleConnections (and any other cleanup), update the beacon
client interface if necessary to include Close(), and modify
BatchVerifier.Close() to check if bv.l1BeaconClient != nil and call its Close()
before returning.
---
Nitpick comments:
In `@node/blocktag/service.go`:
- Around line 9-19: The imports in node/blocktag/service.go are inconsistently
grouped: the standalone import "github.com/morph-l2/go-ethereum" should be moved
into the go-ethereum group with the other related imports (e.g.,
"github.com/morph-l2/go-ethereum/accounts/abi/bind",
"github.com/morph-l2/go-ethereum/common",
"github.com/morph-l2/go-ethereum/ethclient",
"github.com/morph-l2/go-ethereum/rpc") inside the same import block; update the
import section so all go-ethereum packages are grouped together for consistency
and readability.
In `@node/cmd/node/main.go`:
- Around line 136-149: Shutdown currently omits stopping the Executor's Syncer,
so add a lifecycle method (e.g., Executor.Stop or Executor.Close) on the
Executor that checks for a non-nil Syncer (created via newSyncerFunc in
devSequencer) and calls Syncer.Stop() when appropriate; then update the shutdown
sequence to call executor.Stop() alongside ms.Stop(), tmNode.Stop(), and
blockTagSvc.Stop() so the Syncer resources are properly released if Start() is
ever enabled.
In `@node/derivation/README.md`:
- Around line 22-32: The fenced ASCII diagram block containing "BlockTagService
(scheduler)" needs a language specifier to satisfy markdown linting; update the
opening fence from ``` to ```text (or ```plaintext) so the diagram is treated as
plain text—no code changes beyond replacing the opening triple backticks for the
diagram block in the README.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 818eeef9-e689-4d4e-b392-2b8908db36f3
📒 Files selected for processing (7)
- node/blocktag/config.go
- node/blocktag/service.go
- node/cmd/node/main.go
- node/core/executor.go
- node/derivation/README.md
- node/derivation/batch_verifier.go
- node/types/retryable_client.go
```go
// Close releases the L1 and L2 RPC connections held by the BatchVerifier.
func (bv *BatchVerifier) Close() {
	bv.l1Client.Close()
	bv.l2EthClient.Close()
}
```
🧩 Analysis chain
🏁 Script executed:
```shell
#!/bin/bash
# Check if L1BeaconClient or BasicHTTPClient has a Close method
rg -nP --type=go 'func.*\(.*L1BeaconClient\).*Close|func.*\(.*BasicHTTPClient\).*Close' node/derivation/
```

Repository: morph-l2/morph
Length of output: 40
🏁 Script executed:
```shell
# Check NewBatchVerifier to see if l1BeaconClient is initialized and if it's optional
sed -n '100,200p' node/derivation/batch_verifier.go | head -100
```

Repository: morph-l2/morph
Length of output: 3440
🏁 Script executed:
```shell
# Check the L1BeaconClient and BasicHTTPClient definitions
cat -n node/derivation/base_client.go
```

Repository: morph-l2/morph
Length of output: 1557
🏁 Script executed:
```shell
# Search for L1BeaconClient type definition in batch_verifier.go
rg -n "l1BeaconClient" node/derivation/batch_verifier.go | head -20
```

Repository: morph-l2/morph
Length of output: 574
Close() should also clean up the L1 beacon HTTP client when configured.
The Close() method releases l1Client and l2EthClient but omits l1BeaconClient. When a beacon RPC endpoint is configured, l1BeaconClient is initialized with a BasicHTTPClient that holds an *http.Client (line 35 in base_client.go) with open connection pools. These connections should be explicitly released to follow proper resource cleanup practices.
Add a Close() method to BasicHTTPClient (or L1BeaconClient) and invoke it in BatchVerifier.Close() when l1BeaconClient is not nil.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@node/derivation/batch_verifier.go` around lines 193 - 197, The Close method
on BatchVerifier currently only closes l1Client and l2EthClient but must also
release the HTTP connection pool held by l1BeaconClient; add a Close() method to
the HTTP-based beacon client (e.g., BasicHTTPClient or the L1BeaconClient type)
that calls the underlying *http.Client.CloseIdleConnections (and any other
cleanup), update the beacon client interface if necessary to include Close(),
and modify BatchVerifier.Close() to check if bv.l1BeaconClient != nil and call
its Close() before returning.
🧩 Analysis chain
🏁 Script executed:
```shell
#!/bin/bash
# Check base_client.go fully
cat -n node/derivation/base_client.go
echo "---"
# Check beacon.go for L1BeaconClient struct and any existing Close methods
cat -n node/derivation/beacon.go 2>/dev/null || echo "beacon.go not found"
echo "---"
# Search for all Close methods in the derivation package
rg -n "func.*Close" node/derivation/
echo "---"
# Find where L1BeaconClient is defined
rg -n "type L1BeaconClient" node/derivation/
echo "---"
# Find all files in derivation package
fd -e go . node/derivation/
```
Length of output: 10802
@curryxbo — The review comment is valid. Here's the full picture and a concrete fix:
Root cause:
- `BasicHTTPClient` (base_client.go) holds an `*http.Client` but has no `Close()` method.
- `L1BeaconClient` (beacon.go) holds the `HTTP` interface (which only defines `Get()`), so there's no cleanup path.
- `BatchVerifier.Close()` never calls any cleanup on `l1BeaconClient`, leaving the HTTP connection pool open.
Proposed fix — three small changes:
1. node/derivation/base_client.go — add Close() to BasicHTTPClient:
```go
// Close releases idle connections held by the underlying HTTP client.
func (cl *BasicHTTPClient) Close() {
	cl.client.CloseIdleConnections()
}
```

2. node/derivation/beacon.go — add Close() to L1BeaconClient (using a type assertion to avoid changing the HTTP interface):
```go
// Close releases any resources held by the underlying HTTP client, if it supports cleanup.
func (cl *L1BeaconClient) Close() {
	if c, ok := cl.cl.(interface{ Close() }); ok {
		c.Close()
	}
}
```

3. node/derivation/batch_verifier.go — update BatchVerifier.Close():
```go
// Close releases the L1 and L2 RPC connections held by the BatchVerifier.
func (bv *BatchVerifier) Close() {
	bv.l1Client.Close()
	bv.l2EthClient.Close()
	if bv.l1BeaconClient != nil {
		bv.l1BeaconClient.Close()
	}
}
```

The type-assertion approach in step 2 keeps the HTTP interface minimal (no forced Close() on all implementors, which would break beacon_test.go mocks) while still enabling proper cleanup for the concrete *BasicHTTPClient used in production.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@node/cmd/node/main.go`:
- Around line 115-122: derivation.NewBatchVerifier is being called with
context.Background(), which lets its internal retry loop block startup forever;
replace that with a shared, cancellable/root signal-aware context (the same
context you use for shutdown) and pass that context into
derivation.NewBatchVerifier so its retries can be cancelled; thread the same
context into blocktag.NewBlockTagService (and any related constructors) so
BatchVerifier creation can be interrupted and the fallback path (bv == nil)
executes and graceful shutdown works; refer to derivation.NewBatchVerifier,
blocktag.NewBlockTagService, BatchVerifier/bv, and nodeConfig.Logger when
locating the calls to update.
Replace context.Background() passed to NewBatchVerifier with a signal.NotifyContext so that startup retries can be interrupted by OS signals, preventing an indefinite hang when the L2 endpoint is down. Also removes the now-redundant manual interrupt channel.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@node/blocktag/service.go`:
- Around line 391-405: The tracker only advances via tracker.Advance, causing
fetchCommitBatchTxHash to miss events if batchIndex is rewound by
findCompletedBatchForL2BlockWithDepth; modify the logic that updates the tracker
so when the cached batch index moves backwards you call the tracker's reset
method (e.g., tracker.Reset or set its FromBlock to zero) or detect the rewind
and bypass the optimized FromBlock by passing a nil/zero FromBlock to
s.l1Client.FilterLogs; update both the CommitBatch lookup here (around
fetchCommitBatchTxHash usage where tracker.Advance is called) and the analogous
spot mentioned (lines ~618-622) to ensure the tracker is reset/bypassed on
batch-index rewind.
In `@node/derivation/batch_verifier.go`:
- Around line 108-118: NewBatchVerifier currently uses ethclient.Dial which
ignores the caller context; change both L1 and L2 dials to context-aware calls
(use ethclient.DialContext with the provided ctx or equivalent context-aware RPC
dial) so the function respects cancellation/timeouts, and keep the existing
cleanup logic (ensure l1Client.Close() is still called if the L2 dial fails).
Update the L1 dial for l1Client and the L2 dial for l2EthClient to use ctx and
propagate the wrapped error messages as before.
In `@node/types/retryable_client.go`:
- Around line 67-89: FetchGethConfigWithRetry currently calls
FetchGethConfig(rpcURL, logger) without a context, so long-running
rpc.Dial()/client.Call() calls can't be canceled; change FetchGethConfig
signature to accept ctx context.Context, update its implementation to use
rpc.DialContext(ctx, ...) and client.CallContext(ctx, ...) (or the context-aware
equivalents), and then pass the ctx from FetchGethConfigWithRetry into
FetchGethConfig(ctx, rpcURL, logger) so cancellation/timeout in the retry loop
actually aborts the underlying RPC operations.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 24adc770-ccd8-4fce-9073-53f97b320e53
📒 Files selected for processing (4)
- node/blocktag/service.go
- node/derivation/batch_info.go
- node/derivation/batch_verifier.go
- node/types/retryable_client.go
💤 Files with no reviewable changes (1)
- node/derivation/batch_info.go
```go
logs, err := s.l1Client.FilterLogs(s.ctx, ethereum.FilterQuery{
	FromBlock: new(big.Int).SetUint64(tracker.FromBlock()),
	Addresses: []common.Address{s.rollupAddress},
	Topics: [][]common.Hash{
		{derivation.RollupEventTopicHash},
		{batchIndexHash},
	},
})
if err != nil {
	return common.Hash{}, fmt.Errorf("filter CommitBatch logs for batch %d: %w", batchIndex, err)
}
if len(logs) == 0 {
	return common.Hash{}, fmt.Errorf("no CommitBatch event found for batch %d", batchIndex)
}
tracker.Advance(logs[0].BlockNumber)
```
Reset the log tracker when batch selection moves backward.
These trackers only advance forward. That breaks the rollback path in this same file: when findCompletedBatchForL2BlockWithDepth resets the cached batch index after an L2 reorg, an earlier batchIndex can become valid again, but FromBlock() is now past its CommitBatch event, so fetchCommitBatchTxHash will never find it. Please reset the corresponding tracker on batch-index rewind, or bypass the optimized FromBlock for that lookup.
Also applies to: 618-622
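One way to sketch the reset behavior this comment asks for, using a hypothetical tracker type rather than the repo's actual implementation:

```go
package main

import "fmt"

// logTracker remembers the lowest L1 block worth scanning from.
type logTracker struct{ fromBlock uint64 }

func (t *logTracker) FromBlock() uint64 { return t.fromBlock }

// Advance only moves forward, which is the behavior the review flags.
func (t *logTracker) Advance(block uint64) {
	if block > t.fromBlock {
		t.fromBlock = block
	}
}

// Reset rewinds the scan window so an earlier CommitBatch event becomes
// findable again after a batch-index rewind (e.g. following an L2 reorg).
func (t *logTracker) Reset() { t.fromBlock = 0 }

func main() {
	tr := &logTracker{}
	tr.Advance(100)
	tr.Advance(50) // ignored: forward-only
	fmt.Println(tr.FromBlock()) // 100

	// On batch-index rewind, reset before the next FilterLogs call.
	tr.Reset()
	fmt.Println(tr.FromBlock()) // 0
}
```

The caller would invoke `Reset()` whenever the cached batch index moves backwards, before building the `FromBlock` of the next filter query.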
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@node/blocktag/service.go` around lines 391 - 405, The tracker only advances
via tracker.Advance, causing fetchCommitBatchTxHash to miss events if batchIndex
is rewound by findCompletedBatchForL2BlockWithDepth; modify the logic that
updates the tracker so when the cached batch index moves backwards you call the
tracker's reset method (e.g., tracker.Reset or set its FromBlock to zero) or
detect the rewind and bypass the optimized FromBlock by passing a nil/zero
FromBlock to s.l1Client.FilterLogs; update both the CommitBatch lookup here
(around fetchCommitBatchTxHash usage where tracker.Advance is called) and the
analogous spot mentioned (lines ~618-622) to ensure the tracker is
reset/bypassed on batch-index rewind.
```go
func NewBatchVerifier(ctx context.Context, cfg *Config, vt *validator.Validator, logger tmlog.Logger) (*BatchVerifier, error) {
	l1Client, err := ethclient.Dial(cfg.L1.Addr)
	if err != nil {
		return nil, fmt.Errorf("dial l1 node error: %w", err)
	}

	l2EthClient, err := ethclient.Dial(cfg.L2.EthAddr)
	if err != nil {
		l1Client.Close()
		return nil, fmt.Errorf("dial l2 eth node error: %w", err)
	}
```
🧩 Analysis chain
🏁 Script executed:
```shell
# First, let's examine the file and the specific lines mentioned
head -n 150 node/derivation/batch_verifier.go | tail -n 50
```

Repository: morph-l2/morph
Length of output: 1698
🏁 Script executed:
```shell
# Check if ethclient.DialContext exists in the codebase or go-ethereum
rg "DialContext" --type go
```

Repository: morph-l2/morph
Length of output: 478
🏁 Script executed:
```shell
# Check the import statement and go.mod to understand which go-ethereum version is used
cat go.mod | grep ethereum
```

Repository: morph-l2/morph
Length of output: 96
🏁 Script executed:
```shell
# Check for any existing usage of context-aware dials in the codebase
rg "ethclient\.Dial" --type go -A 2 -B 2
```

Repository: morph-l2/morph
Length of output: 5606
🏁 Script executed:
```shell
# Check if there are any context-aware patterns used elsewhere in batch_verifier.go
grep -n "context\|ctx\|Context" node/derivation/batch_verifier.go | head -20
```

Repository: morph-l2/morph
Length of output: 1796
🏁 Script executed:
```shell
# Verify ethclient is imported from go-ethereum
grep -n "import" node/derivation/batch_verifier.go | head -5
```

Repository: morph-l2/morph
Length of output: 68
🏁 Script executed:
```shell
# Get the full imports section of batch_verifier.go
sed -n '3,30p' node/derivation/batch_verifier.go
```

Repository: morph-l2/morph
Length of output: 941
Use context-aware dials for L1 and L2 connections in NewBatchVerifier.
Lines 109 and 114 accept a caller context but ignore it when dialing L1 and L2 endpoints. A slow or blackholed RPC endpoint can block NewBatchVerifier indefinitely, preventing context cancellation from taking effect even though the later geth-config fetch at line 154 is context-aware.
Suggested fix
```diff
-	l1Client, err := ethclient.Dial(cfg.L1.Addr)
+	l1Client, err := ethclient.DialContext(ctx, cfg.L1.Addr)
 	if err != nil {
 		return nil, fmt.Errorf("dial l1 node error: %w", err)
 	}

-	l2EthClient, err := ethclient.Dial(cfg.L2.EthAddr)
+	l2EthClient, err := ethclient.DialContext(ctx, cfg.L2.EthAddr)
 	if err != nil {
 		l1Client.Close()
 		return nil, fmt.Errorf("dial l2 eth node error: %w", err)
 	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@node/derivation/batch_verifier.go` around lines 108 - 118, NewBatchVerifier
currently uses ethclient.Dial which ignores the caller context; change both L1
and L2 dials to context-aware calls (use ethclient.DialContext with the provided
ctx or equivalent context-aware RPC dial) so the function respects
cancellation/timeouts, and keep the existing cleanup logic (ensure
l1Client.Close() is still called if the L2 dial fails). Update the L1 dial for
l1Client and the L2 dial for l2EthClient to use ctx and propagate the wrapped
error messages as before.
```diff
 // FetchGethConfigWithRetry fetches geth config with retry, waiting for geth to be ready.
-func FetchGethConfigWithRetry(rpcURL string, logger tmlog.Logger) (*GethConfig, error) {
+// The call blocks until geth responds, ctx is canceled, or the retry limit is reached.
+func FetchGethConfigWithRetry(ctx context.Context, rpcURL string, logger tmlog.Logger) (*GethConfig, error) {
 	var lastErr error
 	for i := 0; i < GethRetryAttempts; i++ {
+		select {
+		case <-ctx.Done():
+			return nil, ctx.Err()
+		default:
+		}
 		config, err := FetchGethConfig(rpcURL, logger)
 		if err == nil {
 			return config, nil
 		}
 		lastErr = err
 		logger.Info("Waiting for geth to be ready...", "attempt", i+1, "error", err)
-		time.Sleep(GethRetryInterval)
+		select {
+		case <-ctx.Done():
+			return nil, ctx.Err()
+		case <-time.After(GethRetryInterval):
+		}
 	}
 	return nil, fmt.Errorf("geth not ready after %d attempts: %w", GethRetryAttempts, lastErr)
 }
```
🧩 Analysis chain
🏁 Script executed:
```shell
find . -name "retryable_client.go" -type f
```

Repository: morph-l2/morph
Length of output: 90
🏁 Script executed:
```shell
wc -l ./node/types/retryable_client.go
```

Repository: morph-l2/morph
Length of output: 94
🏁 Script executed:
```shell
sed -n '67,89p' ./node/types/retryable_client.go
```

Repository: morph-l2/morph
Length of output: 862
🏁 Script executed:
```shell
rg "func FetchGethConfig\(" ./node/types/retryable_client.go -A 20
```

Repository: morph-l2/morph
Length of output: 704
🏁 Script executed:
```shell
head -30 ./node/types/retryable_client.go
```

Repository: morph-l2/morph
Length of output: 870
🏁 Script executed:
```shell
rg "rpc\.(Dial|Call)" ./node/types/retryable_client.go -B 2 -A 2
```

Repository: morph-l2/morph
Length of output: 319
🏁 Script executed:
```shell
rg "DialContext|CallContext" ./node/types/retryable_client.go
```

Repository: morph-l2/morph
Length of output: 40
🏁 Script executed:
```shell
find . -path "*/go-ethereum/rpc*" -type f | head -20
```

Repository: morph-l2/morph
Length of output: 40
🏁 Script executed:
```shell
grep "go-ethereum" ./go.mod
```

Repository: morph-l2/morph
Length of output: 99
🏁 Script executed:
```shell
find . -name "go.mod" -type f | head -5
```

Repository: morph-l2/morph
Length of output: 151
🏁 Script executed:
```shell
grep -i "go-ethereum\|rpc\|backoff" ./node/go.mod | head -20
```

Repository: morph-l2/morph
Length of output: 356
🏁 Script executed:
```text
web_search
Does github.com/morph-l2/go-ethereum rpc.Client support DialContext and CallContext methods?
```

Repository: morph-l2/morph
Length of output: 151
🌐 Web query:
go-ethereum rpc.Client DialContext CallContext methods
💡 Result:
In go-ethereum’s rpc package:
- Connect (with context for initial dial only): `func DialContext(ctx context.Context, rawurl string) (*rpc.Client, error)` — like `Dial`, but `ctx` only cancels/times out connection establishment; it does not affect later RPC calls. The client will auto-reconnect if the connection drops. `rawurl` can be `http(s)://`, `ws(s)://`, or an IPC path. [1]
- Make a JSON-RPC request (with per-call cancel/deadline): `func (c *Client) CallContext(ctx context.Context, result interface{}, method string, args ...interface{}) error` — performs the JSON-RPC call; if `ctx` is canceled before it returns, it returns immediately. `result` must be a pointer (or `nil` to ignore the result). [1]
Minimal example:
```go
c, err := rpc.DialContext(ctx, "https://rpc.example.com")
if err != nil { /* handle */ }
defer c.Close()

var block hexutil.Big
err = c.CallContext(ctx, &block, "eth_blockNumber")
```

If you want a higher-level Ethereum API wrapper, `ethclient.Client` is built on top of the same RPC transport and provides typed methods (e.g., `BlockNumber` wrapping `eth_blockNumber`). [2]
Sources: go-ethereum rpc package docs [1], Geth “Go API” docs [2].
🏁 Script executed:
```shell
rg "FetchGethConfig" ./node/types/retryable_client.go
```

Repository: morph-l2/morph
Length of output: 541
🏁 Script executed:
```shell
rg "FetchGethConfig\(" ./node/types/retryable_client.go -B 2 -A 2
```

Repository: morph-l2/morph
Length of output: 364
Propagate ctx into the eth_config RPC call to respect cancellation.
Line 77 calls FetchGethConfig(rpcURL, logger) without passing the context, and FetchGethConfig uses rpc.Dial() and client.Call() instead of their context-aware variants. If the connection or RPC call blocks (e.g., network timeout, unresponsive geth), the context cancellation checks in lines 73-75 and 86-89 cannot interrupt the underlying blocking operations, defeating the startup/shutdown timeout behavior.
Update FetchGethConfig to accept context.Context and use rpc.DialContext() and client.CallContext(), then pass ctx from FetchGethConfigWithRetry (line 77).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@node/types/retryable_client.go` around lines 67 - 89,
FetchGethConfigWithRetry currently calls FetchGethConfig(rpcURL, logger) without
a context, so long-running rpc.Dial()/client.Call() calls can't be canceled;
change FetchGethConfig signature to accept ctx context.Context, update its
implementation to use rpc.DialContext(ctx, ...) and client.CallContext(ctx, ...)
(or the context-aware equivalents), and then pass the ctx from
FetchGethConfigWithRetry into FetchGethConfig(ctx, rpcURL, logger) so
cancellation/timeout in the retry loop actually aborts the underlying RPC
operations.
Introduce a new BatchProcessor that sequentially walks every committed batch, verifies via BatchVerifier, and updates safe/finalized block tags. This replaces BlockTagService which skipped intermediate batches.
Actionable comments posted: 1
🧹 Nitpick comments (3)
node/batchprocessor/config.go (1)
33-48: Consider validating `L1Addr` is not empty.

`SetCliContext` retrieves `L1Addr` from the CLI flag but doesn't validate it. If the flag is unset or empty, the config will pass through here successfully but fail later in `NewBatchProcessor` (lines 67-68 of processor.go). For consistency with the rollup address validation, consider adding an empty check here.

♻️ Optional: Add L1Addr validation

```diff
 func (c *Config) SetCliContext(ctx *cli.Context) error {
 	c.L1Addr = ctx.GlobalString(flags.L1NodeAddr.Name)
+	if c.L1Addr == "" {
+		return fmt.Errorf("L1 node address is required: specify --%s", flags.L1NodeAddr.Name)
+	}
 	if ctx.GlobalBool(flags.MainnetFlag.Name) {
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@node/batchprocessor/config.go` around lines 33 - 48, SetCliContext currently assigns c.L1Addr from the CLI but doesn't validate it; add a check in Config.SetCliContext after c.L1Addr = ctx.GlobalString(flags.L1NodeAddr.Name) to ensure c.L1Addr is not empty (zero-length) and return a descriptive error if it is (similar style to the RollupAddress error). Reference the Config.SetCliContext function and flags.L1NodeAddr.Name; this prevents later failures in NewBatchProcessor by failing fast with a clear message.

node/derivation/README.md (1)
22-32: Add language specifiers to fenced code blocks.

Static analysis flagged missing language specifiers on the architecture diagram code blocks. Adding `text` or `plaintext` improves accessibility and tooling support.

📝 Proposed fix

````diff
-```
+```text
 BatchProcessor (scheduler, node/batchprocessor)
````

Apply the same change to the code block starting at line 154.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@node/derivation/README.md` around lines 22 - 32, The fenced diagram code blocks in the README (the block that begins with "BatchProcessor (scheduler, node/batchprocessor)" and the later block that includes "BatchVerifier (stateless, no goroutines, node/derivation)") are missing language specifiers; update each opening fence from ``` to ```text (or ```plaintext) so the blocks read e.g. ```text BatchProcessor (scheduler, node/batchprocessor) and similarly for the BatchVerifier block to improve tooling and accessibility.

node/batchprocessor/processor.go (1)
285-290: Address TODO: define verification failure behavior.

The TODO comment indicates undefined behavior on verification failure. Currently, the error propagates and stops cursor advancement. Consider whether:
- The current behavior (stop advancing) is correct for safety
- A challenge should be triggered (as mentioned in README for steps 1-2)
- Metrics/alerts should be emitted
Would you like help defining the failure handling strategy or opening an issue to track this?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@node/batchprocessor/processor.go` around lines 285 - 290, The verification failure TODO indicates undefined failure handling for bp.verifyBatch when bp.batchVerifier != nil; implement a clear strategy rather than just returning an error: add a handler (e.g., method handleVerificationFailure on the processor) and call it from the branch that catches err from verifyBatch(batchIndex, currentL1); the handler should encapsulate chosen behavior — either stop advancing the cursor and return the error for safety, or trigger a challenge via bp.challenger.TriggerChallenge(batchIndex, err) and emit metrics/alerts via bp.metrics (e.g., IncrementVerificationFailures and emit an alert event) — ensure only successful verification advances the cursor, and update tests to cover both success and failure flows.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@node/batchprocessor/processor.go`:
- Around line 325-351: The fetchCommitBatchTxHash function currently may Scan
from genesis when bp.lastCommitL1Block == 0; update initCursors to seed
bp.lastCommitL1Block instead of leaving it zero (or set a bounded fallback
lookback) by locating the CommitBatch event for the finalized batch: after
resolving the finalized L2 block in initCursors, query FilterLogs (or use
indexed event lookup) for the CommitBatch with batch index == finalized index
and set bp.lastCommitL1Block to that event's BlockNumber; alternatively, if that
lookup fails, set bp.lastCommitL1Block = max(0, currentL1 - LOOKBACK_BLOCKS) so
fetchCommitBatchTxHash never searches from block 0.
---
Nitpick comments:
In `@node/batchprocessor/config.go`:
- Around line 33-48: SetCliContext currently assigns c.L1Addr from the CLI but
doesn't validate it; add a check in Config.SetCliContext after c.L1Addr =
ctx.GlobalString(flags.L1NodeAddr.Name) to ensure c.L1Addr is not empty
(zero-length) and return a descriptive error if it is (similar style to the
RollupAddress error). Reference the Config.SetCliContext function and
flags.L1NodeAddr.Name; this prevents later failures in NewBatchProcessor by
failing fast with a clear message.
In `@node/batchprocessor/processor.go`:
- Around line 285-290: The verification failure TODO indicates undefined failure
handling for bp.verifyBatch when bp.batchVerifier != nil; implement a clear
strategy rather than just returning an error: add a handler (e.g., method
handleVerificationFailure on the processor) and call it from the branch that
catches err from verifyBatch(batchIndex, currentL1); the handler should
encapsulate chosen behavior — either stop advancing the cursor and return the
error for safety, or trigger a challenge via
bp.challenger.TriggerChallenge(batchIndex, err) and emit metrics/alerts via
bp.metrics (e.g., IncrementVerificationFailures and emit an alert event) —
ensure only successful verification advances the cursor, and update tests to
cover both success and failure flows.
In `@node/derivation/README.md`:
- Around line 22-32: The fenced diagram code blocks in the README (the block
that begins with "BatchProcessor (scheduler, node/batchprocessor)" and the later
block that includes "BatchVerifier (stateless, no goroutines, node/derivation)")
are missing language specifiers; update each opening fence from ``` to ```text
(or ```plaintext) so the blocks read e.g. ```text BatchProcessor (scheduler,
node/batchprocessor) and similarly for the BatchVerifier block to improve
tooling and accessibility.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 5ce2c425-7499-4309-ae4d-0222afdd5b06
📒 Files selected for processing (4)
- node/batchprocessor/config.go
- node/batchprocessor/processor.go
- node/cmd/node/main.go
- node/derivation/README.md
```go
// fetchCommitBatchTxHash finds the CommitBatch event on L1 for a given batch
// index. It searches forward from lastCommitL1Block (the L1 block of the
// previous CommitBatch event) so the window naturally covers catch-up scenarios
// instead of being limited to a fixed range.
func (bp *BatchProcessor) fetchCommitBatchTxHash(batchIndex, currentL1 uint64) (common.Hash, error) {
	fromBlock := bp.lastCommitL1Block

	batchIndexBig := new(big.Int).SetUint64(batchIndex)
	logs, err := bp.l1Client.FilterLogs(bp.ctx, ethereum.FilterQuery{
		FromBlock: new(big.Int).SetUint64(fromBlock),
		ToBlock:   new(big.Int).SetUint64(currentL1),
		Addresses: []common.Address{bp.rollupAddress},
		Topics: [][]common.Hash{
			{derivation.RollupEventTopicHash},
			{common.BigToHash(batchIndexBig)},
		},
	})
	if err != nil {
		return common.Hash{}, fmt.Errorf("FilterLogs: %w", err)
	}
	if len(logs) == 0 {
		return common.Hash{}, fmt.Errorf("no CommitBatch event found for batch %d in L1 blocks [%d, %d]", batchIndex, fromBlock, currentL1)
	}

	bp.lastCommitL1Block = logs[0].BlockNumber
	return logs[0].TxHash, nil
}
```
Cold start: searching from block 0 may be slow.
When lastCommitL1Block is 0 (initial state), fetchCommitBatchTxHash searches from L1 genesis to current head. On mainnet with millions of blocks, this FilterLogs call can time out or be rate-limited.
Consider initializing lastCommitL1Block in initCursors by querying the L1 block of the finalized batch's CommitBatch event, or using a bounded lookback window as a fallback.
🔧 Sketch: Initialize lastCommitL1Block in initCursors
```go
// After resolving the L2 block hash in initCursors, also resolve the L1 block
// of the CommitBatch event for the finalized batch to seed lastCommitL1Block.
if idx > 0 {
	// Find the CommitBatch event for idx to seed lastCommitL1Block.
	// This avoids searching from genesis on first tick.
}
```

Alternatively, use a fixed lookback (e.g., 10000 blocks) when `lastCommitL1Block == 0`.
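The fixed-lookback fallback mentioned above can be sketched as a small pure function. The `lookbackBlocks` constant and `seedFromBlock` helper are hypothetical, chosen here only to illustrate the bound:

```go
package main

import "fmt"

const lookbackBlocks = 10000 // hypothetical bound; tune for the L1 provider's limits

// seedFromBlock returns where to start the CommitBatch scan: the cached
// cursor when available, otherwise a bounded window behind the L1 head.
// It only falls back to genesis when the chain is shorter than the window.
func seedFromBlock(lastCommitL1Block, currentL1 uint64) uint64 {
	if lastCommitL1Block != 0 {
		return lastCommitL1Block
	}
	if currentL1 > lookbackBlocks {
		return currentL1 - lookbackBlocks
	}
	return 0
}

func main() {
	fmt.Println(seedFromBlock(0, 5_000_000))      // 4990000: bounded cold start
	fmt.Println(seedFromBlock(123456, 5_000_000)) // 123456: cached cursor wins
	fmt.Println(seedFromBlock(0, 500))            // 0: chain shorter than the window
}
```

A bounded cold start trades completeness (a batch committed earlier than the window is missed once) for predictable `eth_getLogs` cost, which is why seeding the cursor from the finalized batch's event, as the sketch suggests, is the more robust option.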
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@node/batchprocessor/processor.go` around lines 325 - 351, The
fetchCommitBatchTxHash function currently may Scan from genesis when
bp.lastCommitL1Block == 0; update initCursors to seed bp.lastCommitL1Block
instead of leaving it zero (or set a bounded fallback lookback) by locating the
CommitBatch event for the finalized batch: after resolving the finalized L2
block in initCursors, query FilterLogs (or use indexed event lookup) for the
CommitBatch with batch index == finalized index and set bp.lastCommitL1Block to
that event's BlockNumber; alternatively, if that lookup fails, set
bp.lastCommitL1Block = max(0, currentL1 - LOOKBACK_BLOCKS) so
fetchCommitBatchTxHash never searches from block 0.
Force-pushed from 090a62b to 2ce7470 (Compare)
Extract batch fetching/parsing/verification logic into a stateless BatchVerifier. Remove the standalone derivation block-production service and its isValidator startup branch. BatchVerifier will be integrated into BlockTagService as the next step.
Summary by CodeRabbit
New Features
Refactor
Chores