Skip to content

feat: provekit-ffi v2.0 #384

Merged
ashpect merged 12 commits intomainfrom
ash/sdk
Mar 31, 2026
Merged

feat: provekit-ffi v2.0 #384
ashpect merged 12 commits intomainfrom
ash/sdk

Conversation

@ashpect
Copy link
Copy Markdown
Collaborator

@ashpect ashpect commented Mar 25, 2026

Summery :
The v1 FFI required callers to manage file paths for schemes, inputs, and proofs. This made SDK integration (Swift, Kotlin ) awkward and limited as SDKs had to write temp files and coordinate paths. The v2 handle-based API lets SDKs hold compiled schemes in memory, pass inputs as strings, and receive proofs as byte buffers, which seems a natural interface for mobiles.

Changes made :

  • Rewrote provekit-ffi to a handle-based API replacing file-path-oriented functions (pk_prove_to_file, pk_prove_to_json) with opaque PKProver/PKVerifier handles that SDK consumers create, reuse, and free explicitly
  • Adds in-memory serialization to provekit-common (file::serialize / file::deserialize) so schemes and proofs can be loaded from and exported to byte buffers without touching the filesystem
  • Adds pk_prepare which takes the nargo compiled circuit artifact (.json) and prepares prover + verifier handles in memory (in mem equivalent of pkp and pkv)
  • Proofs use the standard .np binary format (header + compressed postcard), identical to CLI output : proofs from FFI and CLI are cross-compatible
  • pk_get_last_error() captures full diagnostic messages via thread-local storage

@Bisht13
Copy link
Copy Markdown
Collaborator

Bisht13 commented Mar 27, 2026

/claude-review

@github-actions
Copy link
Copy Markdown

Code Review

Summary

This PR rewrites provekit-ffi from a file-path-oriented API to a handle-based API (PKProver/PKVerifier). It also adds in-memory serialize/deserialize helpers to provekit-common so schemes and proofs can be exchanged as byte buffers without touching the filesystem. The motivation (easier Swift/Kotlin SDK integration) is sound and the overall design is clean.


Review

Correctness & Soundness

  1. Memory leak in pk_prepare under OOM (tooling/provekit-ffi/src/ffi.rs, the Box::into_raw lines for pk and vk)
    The code does:

    let pk = Box::into_raw(Box::new(PKProver { prover }));
    let vk = Box::into_raw(Box::new(PKVerifier { verifier }));
    Ok((pk, vk))

    If the second allocation panics (OOM), catch_panic catches it and returns the error code while *out_prover is still null_mut(). The already-allocated pk is irrecoverably leaked. Fix: keep both values as Box<_> until both are ready, then Box::into_raw together, or write to *out_prover immediately after each raw allocation so a caller can free on partial failure.

  2. Proof format inconsistency — silent footgun (ffi.rs, pk_prove_toml / pk_prove_json / pk_verify)
    Proofs are serialised with raw postcard::to_allocvec (no versioned header), whereas the CLI/existing code uses file::write which wraps in a magic-byte + version + compression header (the .np format). pk_verify mirrors this with raw postcard::from_bytes, so the FFI round-trip is internally consistent — but proofs from pk_prove_* cannot be verified with file::read, and .np files from the CLI cannot be passed to pk_verify. This will silently produce a deserialization error with no helpful message. Either align the proof format with the file format (preferred for interop), or add a prominent doc-comment: "These bytes are raw postcard, NOT the .np file format. Pass only to pk_verify."

  3. pk_verify swallows all errors as ProofError (ffi.rs)
    Err(_) => Ok(false) maps any internal verifier error (corrupted handle, internal bug) to the same return code as a legitimately invalid proof. Consider a distinct error code (e.g. InternalError) so callers can distinguish "proof is invalid" from "something went wrong internally."

Code Quality

  1. Missing // SAFETY: comments (multiple sites in ffi.rs) — CLAUDE.md violation
    Every unsafe block must have a // SAFETY: comment. The following raw-pointer operations are missing one:

    • std::slice::from_raw_parts(ptr, len) in pk_load_prover_bytes (~line 108) and pk_load_verifier_bytes
    • (*prover).prover.clone() / (*verifier).verifier.clone() inside the inner closures in pk_prove_toml, pk_prove_json, pk_verify, pk_save_prover, pk_save_verifier

    The enclosing unsafe fn signature does not substitute for per-site SAFETY explanations.

  2. parking_lot = "0.12" not using workspace dep (tooling/provekit-ffi/Cargo.toml, line ~29) — CLAUDE.md violation
    CLAUDE.md: "Declare in root Cargo.toml [workspace.dependencies], reference with { workspace = true } in crate Cargo.toml." All other deps in this PR use workspace = true. Fix: parking_lot.workspace = true (adding to root workspace deps if needed).

  3. #[allow(private_bounds)] on serialize (provekit/common/src/file/io/mod.rs, ~line 159)
    This suppresses a future-compatibility lint because MaybeHashAware is a private supertrait on a public function bound. Consider a sealed-trait pattern or making MaybeHashAware pub(crate) explicitly. Non-blocking, but worth addressing before this API stabilises.

  4. Error context is silently discarded
    Most map_err(|_| PKError::*) calls drop the underlying anyhow::Error. This makes debugging very hard for SDK consumers who only get an integer error code. Consider at minimum logging the error with tracing::error! before discarding it.

Testing

  • No tests are added for the new serialize_to_bytes / deserialize_from_bytes functions or the file::serialize / file::deserialize wrappers. A round-trip property test (proptest) would catch any header-encoding bugs early.
  • No integration tests for the new FFI entry points (pk_prepare, pk_load_prover_bytes, pk_prove_json, pk_verify).

Suggestions (non-blocking)

  • decompress_bytes in bin.rs duplicates the magic-byte detection logic that already exists in decompress_stream. Consider extracting detect_compression(data: &[u8]) -> Result<Compression> and sharing it.
  • The doc-comment on PKVerifier mentions whir_for_witness is consumed via .take() — confirm this is still accurate after the clone-then-verify pattern introduced here.

Verdict

🔄 Request changes

Items 1 (memory leak), 2 (proof format footgun), 4 (missing SAFETY comments), and 5 (non-workspace dep) need to be addressed before merge. Items 3, 6, and 7 are strongly encouraged. Testing gaps should be filled.

ram_limit_bytes: usize,
use_file_backed: bool,
swap_file_path: *const c_char,
pub unsafe extern "C" fn pk_prove_json(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tmp_dir.join("provekit_ffi_inputs.toml") is shared across all threads. Two concurrent pk_prove_json calls will overwrite each other's inputs, silently producing a proof for the wrong witness. Use tempfile::NamedTempFile or append a unique suffix (e.g. thread ID + atomic counter).

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No temp file is used. pk_prove_json parses the JSON string entirely in-memory, and pk_prove_toml reads from a caller-provided path. There is no shared tmp_dir or intermediate file that could be overwritten by concurrent calls.

@ashpect ashpect merged commit 6741ceb into main Mar 31, 2026
7 of 8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants