feat(core): ShaMap Merkle tree & ledger hashing #151
Conversation
Port the xrpl.js ShaMap full-tree approach to Rust. The ShaMap is the core data structure used in the XRP Ledger for transaction trees and account state trees, producing the root hashes committed in ledger headers.

Implements:
- ShaMap tree with Box-allocated inner/leaf nodes (16-branch radix trie)
- SHA-512Half streaming hasher (no_std compatible via the sha2 crate)
- HashPrefix constants for domain separation in all hash contexts
- Ledger header hash computation
- Transaction and account state tree hash helpers
- Inclusion proof extraction and verification (ShaMapProof)
- 42 unit tests covering correctness, order independence, proofs, and fuzz-style tampering detection
- Criterion benchmarks for 100/1000/10000-item trees

All code is no_std compatible (uses alloc for Vec/Box).
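A 16-branch radix trie walks one 4-bit nibble of the 256-bit key per level. A minimal sketch of that indexing (function name and shape are illustrative, not the crate's actual API):

```rust
/// Illustrative sketch: extract the 4-bit branch index at `depth`
/// from a 32-byte key (64 nibbles total, so depth must be < 64).
fn nibble(index: &[u8; 32], depth: usize) -> u8 {
    let byte = index[depth / 2];
    // Even depths take the high nibble, odd depths the low nibble.
    if depth % 2 == 0 { byte >> 4 } else { byte & 0x0F }
}

fn main() {
    let mut key = [0u8; 32];
    key[0] = 0xAB;
    key[1] = 0xC0;
    assert_eq!(nibble(&key, 0), 0xA);
    assert_eq!(nibble(&key, 1), 0xB);
    assert_eq!(nibble(&key, 2), 0xC);
    println!("ok");
}
```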
Fix trailing whitespace in nftoken_mint.rs and apply rustfmt across transaction models, CLI module, and test mock that had import ordering and line length issues.
…ctors

Optimize the ShaMap with academic state-of-the-art techniques:
- HAMT bitmap-indexed inner nodes (Bagwell 2001): u16 bitmap + compact Vec instead of fixed 16-slot arrays. ~70-80% memory reduction for sparse nodes. O(1) branch lookup via popcount.
- Hash caching with dirty-flag propagation: Cell<Option<Hash256>> on every node. Mutations invalidate only the root-to-leaf path — O(log16 n). Repeated hash() calls are O(1).
- Single-buffer inner hash: 516-byte stack buffer with zero-init so empty branches contribute ZERO_256 for free. One sha512half() call.
- One-shot sha512half: Sha512::digest() instead of streaming new/update/finalize.
- Merkle inclusion proofs with extract_proof() and verify_proof().
- add_item_no_index() for TRANSACTION_NO_METADATA leaf type.
- Ported xrpl.js test vectors: 8 incremental ShaMap hashes + ledger header hash from ripple-binary-codec.
- Extended benchmarks: 6 groups (build, cached hash, incremental add, proof extraction, proof verify, contains) at 100/1K/10K items.

Closes #150
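The popcount trick behind the bitmap-indexed lookup can be sketched as a standalone function (the crate's actual signature may differ):

```rust
/// Sketch of Bagwell-style bitmap indexing: bit `slot` of the u16 bitmap is
/// set iff branch `slot` is occupied, and occupied branches are stored densely
/// in a Vec. A slot's Vec position is the number of set bits below it.
fn compact_index(bitmap: u16, slot: u8) -> usize {
    debug_assert!(slot < 16);
    let below = bitmap & ((1u16 << slot) - 1); // bits strictly below `slot`
    below.count_ones() as usize // a single popcount instruction
}

fn main() {
    // Slots 0, 2 and 5 occupied -> bitmap 0b0010_0101.
    let bitmap: u16 = 0b0010_0101;
    assert_eq!(compact_index(bitmap, 0), 0);
    assert_eq!(compact_index(bitmap, 2), 1);
    assert_eq!(compact_index(bitmap, 5), 2);
    println!("ok");
}
```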
Four coverage-guided fuzz targets using libFuzzer + arbitrary:
- shamap_order_independence: root hash is insertion-order independent
- shamap_add_remove_roundtrip: add+remove extras restores original hash
- shamap_proof_integrity: valid proofs verify, tampered proofs fail, missing items have no proof
- shamap_cache_consistency: incremental tree hash equals fresh-build hash after random add/remove sequences

Run with: cargo +nightly fuzz run <target> -- -max_total_time=30
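The order-independence property the first target checks can be illustrated with a toy model, where BTreeMap and DefaultHasher stand in for the real tree and sha512half:

```rust
use std::collections::BTreeMap;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy model: a keyed tree hashed in canonical (sorted) order yields the same
/// digest regardless of insertion order. The real ShaMap gets this property
/// because each key determines a unique position in the radix trie.
fn model_hash(items: &[([u8; 4], u8)]) -> u64 {
    let map: BTreeMap<[u8; 4], u8> = items.iter().cloned().collect();
    let mut h = DefaultHasher::new();
    for kv in &map {
        kv.hash(&mut h);
    }
    h.finish()
}

fn main() {
    let a = [([0, 0, 0, 1], 10), ([0, 0, 0, 2], 20), ([0, 0, 0, 3], 30)];
    let b = [([0, 0, 0, 3], 30), ([0, 0, 0, 1], 10), ([0, 0, 0, 2], 20)];
    assert_eq!(model_hash(&a), model_hash(&b));
    println!("ok");
}
```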
b09266c to 7a73c6a
e-desouza
left a comment
Two correctness issues flagged inline: the fuzz cache-consistency check fires only after all ops (not per-mutation as documented), and add_item panics with an OOB index access on duplicate keys.
Review by Claude Opus 4.6 · Prompt: V12
…ncy check

add_item now detects duplicate keys and replaces the existing leaf instead of recursing infinitely until an OOB panic on nibble(). Move the fuzz cache-consistency assertion inside the per-op loop so transient hash divergences that self-correct mid-sequence are caught.
/ai-review
```rust
let existing = self.children.remove(idx);
self.bitmap &= !(1u16 << slot);
```

```rust
let mut new_inner = ShaMapInner::new(self.depth + 1);
```
u8 depth overflows silently in release builds — use checked_add or assert max depth (63 for 32-byte keys):
```diff
-let mut new_inner = ShaMapInner::new(self.depth + 1);
+let mut new_inner = ShaMapInner::new(self.depth.checked_add(1).expect("ShaMap depth overflow"));
```
```rust
}
new_inner.add_item(leaf);
```

```rust
let new_idx = compact_index(self.bitmap, slot);
```
add_item silently overwrites duplicate keys with no return value — callers building canonical trees can't detect collisions. Consider returning bool (inserted vs. replaced) or an error on collision.
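One possible shape for that contract, sketched against a flat model rather than the real recursive node (type and function names are illustrative):

```rust
/// Sketch: insert-or-replace with a bool result, as the review suggests.
/// Returns true for a new insert, false when an existing leaf with the
/// same key was replaced, so callers building canonical trees can detect
/// duplicate keys instead of silently losing data.
#[derive(Clone, PartialEq, Debug)]
struct Leaf {
    index: [u8; 32],
    data: Vec<u8>,
}

fn insert_or_replace(leaves: &mut Vec<Leaf>, new: Leaf) -> bool {
    if let Some(existing) = leaves.iter_mut().find(|l| l.index == new.index) {
        existing.data = new.data; // duplicate key: replace payload in place
        false
    } else {
        leaves.push(new);
        true
    }
}

fn main() {
    let mut leaves = Vec::new();
    let key = [7u8; 32];
    assert!(insert_or_replace(&mut leaves, Leaf { index: key, data: vec![1] }));
    assert!(!insert_or_replace(&mut leaves, Leaf { index: key, data: vec![2] }));
    assert_eq!(leaves.len(), 1);
    assert_eq!(leaves[0].data, vec![2]);
    println!("ok");
}
```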
```rust
/// A level in a ShaMap inclusion proof.
pub struct ProofLevel {
    /// Which branch (0..15) was taken at this level.
    pub nibble: u8,
```
nibble is a public u8 with no range constraint — if ProofLevel is deserialized from untrusted input with nibble > 15, verify_proof silently computes a wrong hash and always returns false. Add a validation guard in verify_proof: `if level.nibble > 15 { return false; }`
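A minimal sketch of that guard in isolation, assuming a ProofLevel like the one quoted above (field names are illustrative):

```rust
/// Sketch: reject structurally malformed proofs before hashing anything.
/// A radix-16 branch index must fit in a nibble, and a 32-byte key bounds
/// the proof path at 64 levels.
struct ProofLevel {
    nibble: u8,
}

fn proof_shape_ok(levels: &[ProofLevel]) -> bool {
    levels.len() <= 64 && levels.iter().all(|l| l.nibble <= 15)
}

fn main() {
    let good: Vec<ProofLevel> = (0..4).map(|i| ProofLevel { nibble: i }).collect();
    assert!(proof_shape_ok(&good));
    let bad = [ProofLevel { nibble: 16 }];
    assert!(!proof_shape_ok(&bad));
    println!("ok");
}
```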
```rust
map.add_item(*extra_idx, prefix, extra_data.clone());
let h = black_box(map.hash());
// Remove to reset for next iteration
map.remove_item(extra_idx);
```
Unchecked remove_item return — failed reset silently inflates the tree across iterations, skewing timing:
```diff
-map.remove_item(extra_idx);
+assert!(map.remove_item(extra_idx), "failed to remove extra item in benchmark");
```
…ation, and benchmark assertion

- Use checked_add(1) for inner node depth to panic on overflow instead of silent u8 wrap
- Return bool from add_item: true for new insert, false for duplicate key replacement
- Validate nibble <= 15 in verify_proof to reject malformed proofs early
- Assert remove_item success in incremental_add benchmark instead of ignoring the return value
The rippleci/rippled:develop image updated after 2026-04-01 and broke integration tests across all PRs: the container exits before becoming healthy, causing "Connection refused" on localhost:5005. Pin to the last known-good digest and replace the simple `until` loop with a bounded retry that checks container liveness, prints status per attempt, and dumps container logs on failure.
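The bounded retry could take roughly this shape (a generic sketch; the real workflow step would put the docker liveness check and log dump in the failing branches):

```shell
#!/bin/sh
# Illustrative bounded-retry helper: run a command up to N times,
# printing status per attempt, and fail once the budget is spent.
retry() {
  attempts=$1
  shift
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      echo "attempt $i/$attempts: ok"
      return 0
    fi
    echo "attempt $i/$attempts: failed, retrying" >&2
    i=$((i + 1))
  done
  echo "giving up after $attempts attempts" >&2
  return 1
}

retry 3 true && echo "healthy"
retry 2 false || echo "would dump container logs here"
```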
…ibility, bench, and cache

- CRITICAL: verify_proof now binds each level.nibble to nibble(&proof.index, depth). Previously the verifier walked level.nibble values blindly, so for TRANSACTION_NO_METADATA leaves (whose hash excludes the index) an attacker could pair any existing leaf_hash with an arbitrary index and have it verify. Also reject paths longer than 64 levels early. Added test_proof_wrong_index_fails.
- HIGH: ShaMapInner::new is now pub(crate) so external callers cannot construct an inner at depth >= 64, which would panic in nibble() on the first op. Documented the depth < 64 invariant.
- MEDIUM: bench_shamap_incremental_add switched to iter_batched so the timed closure only measures add_item + hash. Previously it also ran remove_item + hash per iteration, inflating the reported cost by roughly 2x.
- MEDIUM: ShaMapInner::hash now caches ZERO_256 when bitmap == 0 instead of re-running the empty-check on every call.
- Added doc warning on ShaMapProof fields pointing callers at extract_proof and explaining the index-binding invariant enforced by verify_proof.
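The index-binding fix can be sketched in isolation (names are illustrative; the real verifier also recomputes the sibling hashes per level):

```rust
/// Sketch: each proof level's branch choice must be derived from the claimed
/// leaf index, never trusted from the proof itself. Otherwise a valid path
/// for one index could be replayed against a different index.
fn nibble(index: &[u8; 32], depth: usize) -> u8 {
    let byte = index[depth / 2];
    if depth % 2 == 0 { byte >> 4 } else { byte & 0x0F }
}

fn path_bound_to_index(index: &[u8; 32], path_nibbles: &[u8]) -> bool {
    path_nibbles.len() <= 64
        && path_nibbles
            .iter()
            .enumerate()
            .all(|(depth, &n)| n == nibble(index, depth))
}

fn main() {
    let mut index = [0u8; 32];
    index[0] = 0xAB;
    assert!(path_bound_to_index(&index, &[0xA, 0xB]));
    // An attacker-supplied path that does not match the claimed index fails.
    assert!(!path_bound_to_index(&index, &[0xA, 0xC]));
    println!("ok");
}
```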
… #3270) (#291)

## Summary

The `rippled` binary was renamed to `xrpld` upstream, and the `rippleci/rippled` image stopped receiving updates. Our integration tests across every open PR started failing because the published `develop` image exited before becoming healthy (`Connection refused` on `localhost:5005`, **0 passed / 41 failed**).

This PR mirrors the upstream fix in xrpl.js: [XRPLF/xrpl.js#3270](XRPLF/xrpl.js#3270). Switching to `rippleci/xrpld:develop` is the **actual root-cause fix** rather than pinning an old digest of the deprecated image.

## Changes

`.github/workflows/integration_test.yml`:
- `RIPPLED_DOCKER_IMAGE` -> `XRPLD_DOCKER_IMAGE: rippleci/xrpld:develop`.
- `docker run` simplified to `${IMAGE} --standalone` (the `xrpld` image handles `mkdir` + launch internally; no more `bash -c "mkdir -p /var/lib/rippled/db/ && rippled -a"` wrapper).
- Volume mount changed from `/etc/opt/ripple/` to `/etc/opt/xrpld/`.
- Container name: `rippled-service` -> `xrpld-service`.
- Removed the docker `--health-cmd` (which shelled out to the renamed `rippled` CLI and always failed) in favour of a direct JSON-RPC poll against `http://localhost:5005/`.
- Always dump container logs on the stop step for post-mortem visibility.

`.ci-config/rippled.cfg` -> `.ci-config/xrpld.cfg`:
- `path=/var/lib/rippled/db/nudb` -> `path=/var/lib/xrpld/db/nudb`.
- `[database_path] /var/lib/rippled/db` -> `/var/lib/xrpld/db`.
- `[debug_logfile] /var/log/rippled/debug.log` -> `/var/log/xrpld/debug.log`.

## Verification

Validated on throwaway PR #292 (now closed): **Integration Test green in 2m53s** on this exact workflow. Unit tests, Build & Lint, Quality Check also pass.

## Related follow-up

The 7 in-flight PRs (#130, #131, #151, #153, #156, #157, #158) currently carry a stopgap commit pinning `rippleci/rippled:develop` to a specific digest. After this PR merges to `main`, those branches should:
1. Rebase on `main` to pick up the xrpld switch, or
2. Cherry-pick this commit and drop the stopgap digest pin.

## Test plan

- [x] Validated end-to-end on PR #292
- [x] Build & Lint, Unit Test, Integration Test, Quality Check all pass
- [ ] Merge and confirm subsequent PRs inherit the fix without manual cherry-pick

## Credit

Approach lifted from @ckeshava's [xrpl.js#3270](XRPLF/xrpl.js#3270).
Summary
ShaMap radix-16 Merkle trie and ledger hashing for the XRP Ledger, per #150.
The tree uses HAMT bitmap-indexed inner nodes (`u16` bitmap + compact `Vec`) so sparse nodes don't waste memory. Hash caching uses a dirty flag — mutations only invalidate the root-to-leaf path. Inner node hashing writes into a single 516-byte stack buffer and calls `sha512half()` once.

Also includes:
- Inclusion proofs (`extract_proof()` / `verify_proof()`) for SPV-style verification
- Ledger hashing helpers (`ledger_hash()`, `transaction_tree_hash()`, `account_state_hash()`) matching xrpl.js
- `no_std` compatible (alloc only)

Closes #150
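The single-buffer inner-node hashing can be sketched as follows. The 4-byte prefix value is an assumption based on rippled's `hashPrefix::innerNode` ("MIN\0"); treat it and all names as illustrative:

```rust
/// Sketch of the 516-byte inner-node hashing buffer: a 4-byte domain-separation
/// prefix followed by 16 fixed 32-byte child slots (4 + 16 * 32 = 516).
/// Zero-initialisation means empty branches already hold ZERO_256 for free.
const INNER_PREFIX: [u8; 4] = *b"MIN\0"; // assumed value, see lead-in

fn inner_hash_input(children: &[Option<[u8; 32]>; 16]) -> [u8; 516] {
    let mut buf = [0u8; 516];
    buf[..4].copy_from_slice(&INNER_PREFIX);
    for (i, child) in children.iter().enumerate() {
        if let Some(h) = child {
            let off = 4 + i * 32;
            buf[off..off + 32].copy_from_slice(h);
        }
    }
    buf // the real code would now take sha512half(&buf)
}

fn main() {
    let mut children: [Option<[u8; 32]>; 16] = [None; 16];
    children[3] = Some([0xFF; 32]);
    let buf = inner_hash_input(&children);
    assert_eq!(&buf[..4], b"MIN\0");
    assert_eq!(buf[4 + 3 * 32], 0xFF); // occupied branch copied in
    assert_eq!(buf[4], 0); // empty branch contributes zeros
    println!("ok");
}
```

One `sha512half()` call over this buffer replaces sixteen streaming updates, which is where the single-buffer win comes from.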
Test plan
- `cargo test --release -- shamap` — 68 tests
- `cargo test --release --no-default-features --features embassy-rt,core,utils,wallet,models,helpers,websocket,json-rpc -- shamap` — no_std variant
- `cargo clippy --features std,tokio-rt,embassy-rt,actix-rt,futures-rt,smol-rt,core,wallet,models,utils,helpers,json-rpc,websocket,cli -- -D warnings`
- `cargo +nightly fuzz run shamap_order_independence -- -max_total_time=30` — zero crashes
- `cargo bench -- shamap`