feat: Support airbender delegations in precompiles #209

Draft
popzxc wants to merge 10 commits into main from popzxc-airbender-precompiles

Conversation

@popzxc
Member

@popzxc popzxc commented Feb 24, 2026

This PR adds support for airbender delegations in precompiles, which is a prerequisite for making EraVM provable in airbender.

It does this through the airbender-crypto crate from airbender-platform, which is adapted from the zksync-os dev branch with one exception: the keccak delegation was taken from the rr/ethrpoofs branch.

The dependency on the main branch of popzxc/airbender-platform is temporary -- before we merge this PR, I expect we'll move the repo to the matter-labs org, cut a release, and pin via tag.

Contributor

@0xVolosnikov 0xVolosnikov left a comment


Changes look OK, but I left some comments.

An important note: I don't think we actually run these tests in CI, and we don't run anything in a RISC-V context at all.

Additionally, imo keccak256 testing should be enhanced. It is true for all precompiles in general, but this is out of scope for this PR.

Comment thread crates/zk_evm_abstractions/src/precompiles/ecadd.rs Outdated
Comment thread crates/zk_evm_abstractions/src/precompiles/keccak256.rs Outdated
Comment thread Cargo.toml Outdated
Comment thread crates/zk_evm_abstractions/src/utils/airbender_bn254.rs Outdated
Comment thread crates/zk_evm_abstractions/src/precompiles/ecpairing.rs Outdated
Comment thread crates/zk_evm_abstractions/src/precompiles/ecrecover.rs Outdated
@0xVolosnikov
Contributor

0xVolosnikov commented Apr 17, 2026

Looks fine. I'll do one more, deeper manual review later, after we finalize the other parts. I'll leave some AI-generated comments below.

Contributor

@0xVolosnikov 0xVolosnikov left a comment


Deep review pass: three coverage/observability concerns plus two trait-shape suggestions, left inline.


pub(super) fn execution_output_bytes(execution: &KeccakExecution) -> [u8; 32] {
let (_, witness) = execution;
let (_, write_queries, _) = witness
Contributor


Differential tests compare output bytes only — the Vec<Keccak256RoundWitness> is discarded here (let (_, write_queries, _) = ...). Witness parity is the actual invariant between legacy and airbender (zk-proof validity), but the current tests can't catch a divergence in round count, read ordering, or new_request placement as long as the final hash matches. Consider adding a delegated_witness_matches_legacy assertion that compares reads/writes/per-round witness element-wise. The same pattern applies across ecadd/ecmul/ecpairing/ecrecover/secp256r1_verify.
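The element-wise comparison could look roughly like this sketch; `Backend`, `RoundWitness`, and `delegated_witness_matches_legacy` are illustrative stand-ins, not the crate's real types:

```rust
// Hypothetical shapes for the sketch; the real crate's witness types differ.
#[derive(Debug, PartialEq)]
struct RoundWitness {
    reads: Vec<u64>,
    writes: Vec<u64>,
}

trait Backend {
    fn run(&self, input: &[u8]) -> (Vec<u8>, Vec<RoundWitness>);
}

fn delegated_witness_matches_legacy<L: Backend, D: Backend>(
    legacy: &L,
    delegated: &D,
    input: &[u8],
) {
    let (legacy_out, legacy_wit) = legacy.run(input);
    let (delegated_out, delegated_wit) = delegated.run(input);
    // Output parity alone cannot catch a divergence in round count or
    // read ordering, so compare the witnesses element-wise as well.
    assert_eq!(legacy_out, delegated_out, "output mismatch");
    assert_eq!(legacy_wit.len(), delegated_wit.len(), "round count mismatch");
    for (i, (l, d)) in legacy_wit.iter().zip(&delegated_wit).enumerate() {
        assert_eq!(l, d, "witness mismatch at round {i}");
    }
}
```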

Member Author


Do we care? We have a test suite for the verifier that compares VM runs frame-to-frame on real batches, and it passes.
We can polish and make this implementation more and more tested, but I see diminishing returns in making tests more thorough.

I mean, I can do it, but -- we started at "effectively no tests" and now we've gotten to "we need to enforce that reads are the same even when the output matches"; at the same time, above you say "Additionally, imo keccak256 testing should be enhanced. It is true for all precompiles in general, but this is out of scope for this PR.".

If you see value in these tests beyond being pedantic about our test approach and worrying about the "Swiss cheese model" (i.e., that frame-to-frame VM tests elsewhere do not justify a lack of thorough testing here), I can add them; otherwise I'd say the scope of this PR is already big enough.


#[test]
fn delegated_backend_matches_legacy_quickcheck() {
fn property(left_scalar: u64, right_scalar: u64) -> bool {
Contributor


Differential fuzz space is narrow on several axes:

  • QUICKCHECK_NUM_CASES = 256 everywhere.
  • ecadd/ecmul/ecpairing drive scalars from u64 (≤2⁶⁴ of a ~2²⁵⁴ input space).
  • ecrecover/secp256r1 fuzz only well-formed sign-then-verify inputs — the rejection paths of Signature::from_scalars, secp256k1::recover, secp256r1::verify are never probed with random (r, s, v, x, y) bytes.

An off-by-one between legacy's u256_to_scalar subtraction loop and airbender's airbender_fr_from_u256 reduction, or a difference in how malformed signatures are rejected, would ship green. Consider raising the case count (or reading from env) and adding malformed-input differential properties.
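One possible shape for an env-configurable case count, as a std-only sketch; the env var name, the function names, and the toy PRNG are all illustrative (a real setup would feed quickcheck rather than hand-rolling generation):

```rust
// Hypothetical env var; defaults to the current 256 when unset.
fn num_cases() -> u64 {
    std::env::var("DIFF_TEST_CASES")
        .ok()
        .and_then(|v| v.parse().ok())
        .unwrap_or(256)
}

// Minimal xorshift64 so the sketch stays std-only and deterministic per seed.
fn xorshift64(state: &mut u64) -> u64 {
    *state ^= *state << 13;
    *state ^= *state >> 7;
    *state ^= *state << 17;
    *state
}

// Drives a differential property over random u64 pairs; panics on the
// first divergence so the failing inputs are reported.
fn run_differential<F: Fn(u64, u64) -> bool>(property: F) {
    let mut state = 0x9e37_79b9_7f4a_7c15;
    for _ in 0..num_cases() {
        let a = xorshift64(&mut state);
        let b = xorshift64(&mut state);
        assert!(property(a, b), "divergence for inputs {a}, {b}");
    }
}
```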

Member Author


  1. quickcheck is meant for quick checks; it's not a differential fuzzing tool per se. Its tests method defines the maximum number of successful (i.e., not discarded) test runs, and by default it's 100, so we're actually raising the quickcheck default this way.
  2. Driving from U256 would be significantly more verbose, as we'd need to provide an Arbitrary implementation, increasing the diff further -- do we want that? Would the theoretical possibility of covering the whole U256 space with 256 successful tests significantly increase the reliability of such tests? deterministic_ecmul_cases already has a bunch of points that are outside of u256.
  3. Do we really care within the scope of this PR?

let result = Backend::recover(&digest, &r, &s, rec_id);
let output_values = match result {
Ok(verifying_key) => [U256::one(), verifying_key_to_address(&verifying_key)],
Err(_) => [U256::zero(), U256::zero()],
Contributor


Error context is stripped twice: airbender_bn254.rs:37,73 collapse ark-ff errors to Error::msg("invalid x"), and here the err-arm matches Err(_) => [U256::zero(), ...] and discards even that. Same in ecadd/ecmul/ecpairing/secp256r1_verify mod.rs.

When a legacy/airbender divergence shows up in prod logs, there's nothing to inspect. A tracing::debug!(?e, "precompile rejected input") on the rejection arm would enable forensics without touching consensus behavior.

Member Author


How would a legacy/airbender divergence show up in prod logs if we're not going to run it in shadow mode? The "more logs, more better" argument can be made for the workspace as a whole, and tracing isn't even a dependency currently; adding a one-off tracing invocation is a bogus change IMHO.

padding_space: usize,
);

fn finalize(&mut self) -> [u8; 32];
Contributor


finalize(&mut self) forces both backends into a take-out-of-self hack: legacy uses std::mem::take (legacy_backend.rs:73), airbender wraps in Option<AirbenderKeccak256> purely so .take() is available and generates two .expect("... must exist ...") sites (airbender_backend.rs:14, 37-40, 56-60).

Change to fn finalize(self) -> [u8; 32]. The executor at mod.rs:228 already calls it exactly once per instance, so this compiles; the Option and both expects disappear, and double-finalization becomes a compile-time error.
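The contrast can be sketched with toy types (none of these are the crate's real structs; the one-shot hasher stands in for AirbenderKeccak256):

```rust
// Toy one-shot hasher: `digest` consumes it, like a real streaming hasher.
struct OneShotHasher(u8);

impl OneShotHasher {
    fn update(&mut self, byte: u8) {
        self.0 = self.0.wrapping_add(byte);
    }
    fn digest(self) -> [u8; 32] {
        [self.0; 32]
    }
}

// With finalize(&mut self), the hasher must be wrapped in Option purely
// so .take() can move it out, generating an .expect() site.
struct ByRefBackend {
    hasher: Option<OneShotHasher>,
}

impl ByRefBackend {
    fn finalize(&mut self) -> [u8; 32] {
        self.hasher
            .take()
            .expect("hasher must exist; finalize called once")
            .digest()
    }
}

// With finalize(self), the Option and the expect disappear, and calling
// finalize twice no longer compiles.
struct ByValueBackend {
    hasher: OneShotHasher,
}

impl ByValueBackend {
    fn finalize(self) -> [u8; 32] {
        self.hasher.digest()
    }
}
```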

Member Author


Agree in principle, but it would require refactoring the execute_keccak_precompile function -- it finalizes the hash computation based on is_last and runs extra shared logic as the last step of the loop iteration; the borrow checker cannot figure out that the value is guaranteed never to be accessed after that. It would increase the chance of messing something up, and I'd prefer to keep as much of the original code as possible.

Is it a code smell? Sure. Is fixing it justified as a part of this PR? Not sure.


fn absorb_query_bytes(&mut self, input: &[u8; 32], offset: usize, meaningful_bytes: usize);

fn finish_round(
Contributor


The four args (full_round_padding, is_last, paddings_round, padding_space) are shaped around legacy's pad10*1 stamping. LegacyKeccakBackend uses all four; DelegatedKeccakBackend discards all four (airbender_backend.rs:44-52) because MiniDigest::finalize handles padding internally.

The trait leaks one backend's private contract. A cleaner seam: absorb(&[u8]) + finalize(self) -> [u8;32] + can_absorb(n) -> bool, with padding stamping moved into legacy's backend where it belongs.
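A rough sketch of that seam, with illustrative toy types and a stand-in accumulator (the trait names and the padding detail are assumptions, not the crate's actual code):

```rust
// Keccak-256 rate in bytes.
const RATE: usize = 136;

// Proposed seam: no padding-related arguments leak through the trait.
trait KeccakBackend {
    fn absorb(&mut self, bytes: &[u8]);
    fn can_absorb(&self, n: usize) -> bool;
    fn finalize(self) -> [u8; 32];
}

// Toy legacy-like backend; the accumulator stands in for the sponge state.
struct LegacyLike {
    buffered: usize,
    acc: u8,
}

impl KeccakBackend for LegacyLike {
    fn absorb(&mut self, bytes: &[u8]) {
        for b in bytes {
            self.acc = self.acc.wrapping_add(*b);
        }
        self.buffered = (self.buffered + bytes.len()) % RATE;
    }
    fn can_absorb(&self, n: usize) -> bool {
        self.buffered + n <= RATE
    }
    fn finalize(self) -> [u8; 32] {
        // pad10*1 stamping would happen here, inside the legacy backend,
        // instead of leaking through four trait arguments.
        [self.acc; 32]
    }
}
```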

Member Author


Do we care?

@popzxc
Copy link
Copy Markdown
Member Author

popzxc commented Apr 20, 2026

@0xVolosnikov -- my pushback is not because I disagree with the feedback; it's that I believe an LLM-induced "improvement loop" can never end in practice. LLMs are good at finding things to criticize, but they're still bad at deciding whether the critique is justified given the context. I ran several rounds of LLM-based self-review with different prompting techniques before publishing.

I would classify the provided LLM comments as "technically correct, practically low-impact" (again, given the purpose and original scope of the PR), so to me it sounds like we've reached the point where LLM suggestions are not that useful and a human review would provide much more value; otherwise we might never merge this PR simply because the codebase isn't perfect in the first place.

