
feat: add batch-submitter module #5

Merged
GCdePaula merged 5 commits into main from feature/batch-submitter
Mar 17, 2026

Conversation

@stephenctw
Collaborator

  • L1 batch posting wired: EthereumBatchPoster signs and calls InputBox.addInput(app_address, payload).
  • Batch payload format: 0x01 || batch_index (u64 BE) || ssz(Batch).
  • S from InputBox events: the input reader scans InputAdded over (prev_safe, Latest], using:
    • Safe to decide when direct inputs become persisted,
    • Latest to advance last_submitted_batch_index.
  • S persisted in DB: submitted_batches_state.last_submitted_batch_index updated inside append_safe_direct_inputs.
  • Worker uses DB S only: submits closed batches in (S ..= last_closed], at-least-once, no L1 reads.
  • Config/docs cleaned up: no confirmations_depth; batch submitter config only has idle_poll_interval_ms and max_batches_per_loop.
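
The payload framing above can be sketched like this. A minimal sketch only: the SSZ body is treated as opaque bytes, and `encode_batch_payload`/`decode_batch_payload` are hypothetical names, not the PR's actual identifiers.

```rust
// Framing: 0x01 || batch_index (u64 BE) || ssz(Batch), per the description above.
const TAG_BATCH: u8 = 0x01;

fn encode_batch_payload(batch_index: u64, ssz_batch: &[u8]) -> Vec<u8> {
    let mut payload = Vec::with_capacity(1 + 8 + ssz_batch.len());
    payload.push(TAG_BATCH);                               // tag byte
    payload.extend_from_slice(&batch_index.to_be_bytes()); // u64 big-endian index
    payload.extend_from_slice(ssz_batch);                  // ssz(Batch), opaque here
    payload
}

fn decode_batch_payload(payload: &[u8]) -> Option<(u64, &[u8])> {
    if payload.len() < 9 || payload[0] != TAG_BATCH {
        return None;
    }
    let nonce = u64::from_be_bytes(payload[1..9].try_into().ok()?);
    Some((nonce, &payload[9..]))
}
```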

@stephenctw stephenctw requested a review from GCdePaula March 13, 2026 14:15
@stephenctw stephenctw self-assigned this Mar 13, 2026
Collaborator

@GCdePaula GCdePaula left a comment

I've left some comments!

Collaborator

The new batch-submitter config is reparsed inside the library runtime, so normal sequencer startup will reject its own CLI flags.

main already parses RunConfig, but run() then calls BatchSubmitterConfig::parse() again. That second Clap parser only knows about the submitter-specific flags, so it will see the normal sequencer flags (--eth-rpc-url, --domain-chain-id, etc.) as unknown arguments and exit the process. This should be threaded through RunConfig, not reparsed inside the runtime.

Indeed, the benchmarks (just -f benchmarks/justfile all) are no longer running because of this.
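
One way to thread it through, assuming clap's derive API with `#[command(flatten)]` (struct and flag names here are illustrative, not the PR's actual types):

```rust
use clap::Parser; // assumes clap with the `derive` feature

#[derive(clap::Args, Debug)]
struct BatchSubmitterConfig {
    #[arg(long, default_value_t = 1000)]
    idle_poll_interval_ms: u64,
    #[arg(long, default_value_t = 10)]
    max_batches_per_loop: u64,
}

#[derive(Parser, Debug)]
struct RunConfig {
    #[arg(long)]
    eth_rpc_url: String,
    // Flattened: one top-level parse accepts both flag sets, so run() never
    // needs to call BatchSubmitterConfig::parse() a second time.
    #[command(flatten)]
    batch_submitter: BatchSubmitterConfig,
}
```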

Collaborator

The first real closed batch is treated as “already submitted” and will never be posted to L1.

initialize_open_state() creates batch 0, but the new submitted_batches_state singleton is bootstrapped to 0. The submitter then computes first_to_submit = latest_submitted + 1, so it starts at batch 1 and permanently skips closed batch 0. The tests even encode this behavior by asserting that only batches 1 and 2 are submitted.
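
A minimal sketch of one possible fix, assuming the cursor stays for now: represent "nothing submitted yet" explicitly instead of bootstrapping to 0 (illustrative name, not the PR's schema).

```rust
// Storing the cursor as Option<u64> distinguishes "no batch submitted yet"
// from "batch 0 submitted", so closed batch 0 is no longer skipped.
fn first_to_submit(latest_submitted: Option<u64>) -> u64 {
    match latest_submitted {
        None => 0,        // bootstrap case: start with closed batch 0
        Some(s) => s + 1, // normal case: resume after the last submitted batch
    }
}
```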

Collaborator

I think the current “last submitted batch” approach is unsound for the at-least-once model we want.

Right now the system persists a monotonic submitted_batches_state cursor in SQLite, and that cursor is advanced from Latest-head observations in the input reader. That means a batch can briefly appear onchain, get recorded durably as “submitted”, then get reorged out before it is safe/final, and we will never retry it again because this worker treats the stored value as authoritative and starts from S + 1.

In other words, we’re turning an optimistic, reorgable observation into permanent truth.

I think the cleaner model is:

  • keep the input reader as it was before, focused on what it already does well (safe direct-input ingestion);
  • and make this submit worker fetch the latest submitted batch from the chain fresh on each loop iteration, keeping that state only in memory.

Concretely in this submit worker:

  • wake up,
  • read the current chain view for batch-submission inputs (maybe at Latest - k rather than raw Latest),
  • derive the highest submitted batch nonce,
  • compare that with locally closed batches,
  • submit any missing ones,
  • then sleep and repeat.

That gives us the “at least once, maybe duplicates” behavior we want, without persisting reorg-prone state, and lets the next loop automatically recover from short reorgs or temporarily missing inclusions.

Collaborator Author

...keeping that state only in memory.

My only concern is that if the sequencer restarts, it would resend all batches from the very first one, right?

Collaborator Author

...keep the input reader as it was before

Just to make sure: we want only direct inputs in the reader, right? So msg.sender is used to filter out the batches?

Collaborator

Yes, exactly: the worker is meant to be stateless, but not blind.

On every loop iteration, it rebuilds its view from the two real sources of truth:

  • B: the blockchain view for our app’s InputAdded events, up to Latest - k
  • L: the local DB view of closed/ready batches produced by the inclusion lane

So on restart it does not resend everything from batch 0. It just fetches S from B again, compares with L, and resumes from there.

More concretely, the loop I have in mind is:

  • wake up
  • query B for all InputAdded events for our app up to Latest - k
  • from those, filter only the inputs whose msgSender is the batch submitter / sequencer sender
  • decode the batch nonce from those inputs and derive the highest contiguous submitted nonce S (or “none yet” if nothing has been submitted)
  • query L for the locally closed batches C (all batches except the current open one)
  • compare S with C
  • if C has a suffix that is still missing from B, submit that suffix (or better yet, just the first missing batch and leave the rest for the next iteration, which is probably cleaner)
  • sleep, then repeat

A few important properties fall out of this:

  • Restart-safe: no persistent “submitted” cursor is needed, because S is recomputed from chain truth every time.
  • Reorg-safe: if a batch briefly appears and then gets reorged out, the next iteration just stops seeing it and will retry it.
  • At-least-once: duplicates are fine, because the scheduler deduplicates by batch nonce.
  • No optimistic state gets fossilized: we never promote a Latest-head observation into durable DB truth.

And yes, on the second point: my suggestion is that the input reader should stay focused on direct-input ingestion only.

The reader’s job is to persist safe direct inputs.

The submitter’s job is different: it wants a fresh, possibly shallow, chain observation of which batch submissions have already been seen on L1.

Those are different concerns with different finality requirements, so I think it is cleaner to keep them separate rather than teaching the input reader about batch-submission tracking.

One small nuance: I would use the highest contiguous observed batch nonce, not just the maximum seen nonce. That avoids treating a later observed batch as proof that all earlier batches were also included.

There are ways to optimize the state fetching, but let's do the naive implementation first of just re-fetching everything every loop.
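
The "highest contiguous observed nonce" rule can be sketched like this (illustrative helper, not the PR's code):

```rust
use std::collections::BTreeSet;

// Returns the largest S such that nonces 0..=S were all observed on chain,
// or None if nonce 0 itself is missing. A later nonce (e.g. 5) is never
// taken as proof that earlier nonces were also included.
fn highest_contiguous_nonce(observed: &BTreeSet<u64>) -> Option<u64> {
    let mut expected = 0u64;
    for &n in observed {
        if n != expected {
            break; // gap found: stop at the last contiguous nonce
        }
        expected += 1;
    }
    expected.checked_sub(1) // None when not even nonce 0 was observed
}
```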

Collaborator Author

Got it, it all makes perfect sense now. But I'm worried that when the sequencer restarts, we may be querying a large block range, which would need the partition design we have in the input-reader. What do you think?

Collaborator

@GCdePaula GCdePaula Mar 15, 2026

Agreed, and even worse, it would need to do this large query on every iteration of the loop. It needs the partition logic, just like the PRT dispute fighter.

Let's do it naively first.

The obvious optimization is querying only in the range between the safe block stored in the db and the latest block of the blockchain. The full list of inputs then becomes the concatenation of the inputs in the db followed by the inputs we queried.
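
The optimization described above, sketched with a generic fetch closure (illustrative types, not the PR's code):

```rust
// Full input view = inputs already persisted up to the stored safe block,
// followed by inputs freshly queried over the range (safe_block, latest].
fn assemble_inputs<T, F>(db_inputs: Vec<T>, safe_block: u64, latest: u64, fetch_range: F) -> Vec<T>
where
    F: FnOnce(u64, u64) -> Vec<T>, // fetches inputs in the block range (from, to]
{
    let mut all = db_inputs;
    if latest > safe_block {
        all.extend(fetch_range(safe_block, latest));
    }
    all
}
```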

Collaborator

I think this wire format is carrying more structure than it needs.

Right now we have two different ways to classify inputs:

  • Scheduler: metadata.msg_sender at the rollups layer
  • Batch Submitter: an explicit tag byte in the payload

The scheduler already has the real protocol boundary available via msg_sender, and it already treats the sequencer address differently from all other senders. So the tag feels redundant.

I’d simplify this to:

  • classify by sender only
  • move the batch nonce into Batch itself, so that it becomes Batch { nonce, frames }

That would make the payload just:

  • ssz(Batch { nonce, frames })

instead of the current:

  • tag || nonce || ssz(Batch { frames })

I think that would give us a much cleaner boundary:

  • one source of truth for “is this a batch?”
  • one protocol object
  • one decode path

It would also make the nonce part of the actual wire type, which feels more honest than keeping it as an out-of-band prefix.
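
A minimal sketch of the sender-only classification (illustrative names; the SSZ decode is elided):

```rust
#[derive(Debug, PartialEq)]
enum InputKind {
    Batch,  // payload decodes as ssz(Batch { nonce, frames })
    Direct, // everything else
}

// "Is this a batch?" is answered by the sender alone; no tag byte needed.
fn classify(msg_sender: [u8; 20], batch_submitter: [u8; 20]) -> InputKind {
    if msg_sender == batch_submitter {
        InputKind::Batch
    } else {
        InputKind::Direct
    }
}
```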

@stephenctw stephenctw force-pushed the feature/batch-submitter branch from 18d9c6a to 37dfcf9 Compare March 15, 2026 15:14
@stephenctw stephenctw requested a review from GCdePaula March 15, 2026 15:16
@stephenctw stephenctw force-pushed the feature/batch-submitter branch from 2c3a53a to a014506 Compare March 17, 2026 01:50
@stephenctw stephenctw force-pushed the feature/batch-submitter branch from a014506 to 1238b1a Compare March 17, 2026 01:52
@stephenctw
Collaborator Author

Latest change:

  • L1: EthereumBatchPoster calls InputBox.addInput with ssz(Batch { nonce, frames }), waits for confirmation_depth.
  • S from chain: the worker stays stateless; each tick it gets the highest submitted nonce from L1 via observed_submitted_batch_nonces(from_block) (batch-submitter sender + SSZ). from_block comes from the persisted safe-input prefix to keep scans small.
  • Loop: Load latest closed batch from DB → safe observed nonces (DB) + chain nonces → advance_expected_batch_nonce → submit first missing batch → idle sleep. At-least-once; scheduler dedupes by nonce.
  • Partition: New module, static config (init() at startup). Shared block-range retry for input reader and batch poster; default codes -32005,-32600,-32602,-32616.
  • Wire: Batch { nonce, frames }, DirectInput { sender, block_number }. Scheduler enforces batch.nonce == next_expected_batch_nonce.
  • Storage: Safe-input naming/schema only; no submitted_batches_state.
  • Config: Batch submitter options on RunConfig; benchmarks pass --batch-submitter-private-key. just test-sequencer uses --test-threads=1.

Collaborator

@GCdePaula GCdePaula left a comment

I've left a couple of comments!


/// Lazy default, then overwritten by `init()`. No leak: default is an empty config;
/// when `init()` is called we replace it with the real config.
static CONFIG: OnceLock<RwLock<PartitionConfig>> = OnceLock::new();
Collaborator

Not a fan of this global-variable approach. I think it also allows one runtime instance to silently change another instance's RPC retry behavior.

We should either always pass the config down into the get_input_added_events function, or transform this whole module into an object/structure that has long_block_range_error_codes as one of its fields, along with the provider, app_address_filter, and input_box_address.
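
A sketch of the struct-based alternative (field names follow the comment above; the method and default codes are illustrative, with the codes taken from this PR's defaults):

```rust
// Each runtime instance owns its own retry configuration, so one instance
// can no longer change another instance's RPC retry behavior.
struct PartitionedEventReader {
    long_block_range_error_codes: Vec<i64>,
    // provider, app_address_filter, and input_box_address would live here too
}

impl PartitionedEventReader {
    fn is_long_range_error(&self, code: i64) -> bool {
        self.long_block_range_error_codes.contains(&code)
    }
}
```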

error_message_matches_retry_codes(&format!("{err:?}"), codes)
}

pub(crate) fn decode_evm_advance_input(input: &[u8]) -> Result<EvmAdvanceCall, String> {
Collaborator

I think we could also move this function to a shared module, possibly to partition.rs. This whole file might not make sense anymore, either.

@stephenctw stephenctw requested a review from GCdePaula March 17, 2026 14:38
@GCdePaula GCdePaula merged commit 8f9105b into main Mar 17, 2026
2 checks passed
@GCdePaula GCdePaula deleted the feature/batch-submitter branch March 17, 2026 17:20