
feat(core): ShaMap Merkle tree & ledger hashing#151

Open
e-desouza wants to merge 11 commits into main from feat/shamap-tree

Conversation


@e-desouza e-desouza commented Mar 28, 2026

Summary

ShaMap radix-16 Merkle trie and ledger hashing for the XRP Ledger, per #150.

The tree uses HAMT bitmap-indexed inner nodes (u16 bitmap + compact Vec) so sparse nodes don't waste memory. Hash caching uses a dirty flag — mutations only invalidate the root-to-leaf path. Inner node hashing writes into a single 516-byte stack buffer and calls sha512half() once.
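The bitmap-plus-compact-Vec layout can be illustrated with a small sketch. The function name `compact_index` and the types here are illustrative, not necessarily the PR's actual API: a 16-branch inner node keeps a `u16` occupancy bitmap plus a `Vec` holding only the occupied children, and the `Vec` position of branch `slot` is the popcount of the bitmap bits below `slot`.

```rust
/// Sketch of HAMT bitmap-indexed branch lookup (Bagwell 2001 style).
/// `compact_index` is a hypothetical name for illustration.
fn compact_index(bitmap: u16, slot: u8) -> usize {
    debug_assert!(slot < 16);
    // Count occupied slots strictly below `slot`; that is the Vec index.
    let below = bitmap & ((1u16 << slot) - 1);
    below.count_ones() as usize
}

fn main() {
    // Branches 1 and 3 occupied: bitmap = 0b1010.
    let bitmap: u16 = 0b1010;
    assert_eq!(compact_index(bitmap, 1), 0); // branch 1 is the first child
    assert_eq!(compact_index(bitmap, 3), 1); // one occupied slot below 3
    println!("ok");
}
```

`count_ones()` compiles to a single popcount instruction on most targets, which is what makes the O(1) branch lookup claim hold.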

Also includes:

  • Merkle inclusion proofs (extract_proof() / verify_proof()) for SPV-style verification
  • Ledger hashing (ledger_hash(), transaction_tree_hash(), account_state_hash()) matching xrpl.js
  • no_std compatible (alloc only)
  • 68 unit tests including ported xrpl.js test vectors (8 incremental ShaMap hashes + ledger header hash)
  • 4 cargo-fuzz targets and 6 criterion benchmark groups

Closes #150

Test plan

  • cargo test --release -- shamap — 68 tests
  • cargo test --release --no-default-features --features embassy-rt,core,utils,wallet,models,helpers,websocket,json-rpc -- shamap — no_std variant
  • cargo clippy --features std,tokio-rt,embassy-rt,actix-rt,futures-rt,smol-rt,core,wallet,models,utils,helpers,json-rpc,websocket,cli -- -D warnings
  • cargo +nightly fuzz run shamap_order_independence -- -max_total_time=30 — zero crashes
  • cargo bench -- shamap

@e-desouza e-desouza requested a review from LimpidCrypto as a code owner March 28, 2026 13:41
@e-desouza e-desouza requested review from Patel-Raj11, kuan121 and pdp2121 and removed request for LimpidCrypto March 28, 2026 13:42
Port the xrpl.js ShaMap full-tree approach to Rust. The ShaMap is the
core data structure used in the XRP Ledger for transaction trees and
account state trees, producing the root hashes committed in ledger
headers.

Implements:
- ShaMap tree with Box-allocated inner/leaf nodes (16-branch radix trie)
- SHA-512Half streaming hasher (no_std compatible via sha2 crate)
- HashPrefix constants for domain separation in all hash contexts
- Ledger header hash computation
- Transaction and account state tree hash helpers
- Inclusion proof extraction and verification (ShaMapProof)
- 42 unit tests covering correctness, order independence, proofs, and
  fuzz-style tampering detection
- Criterion benchmarks for 100/1000/10000 item trees

All code is no_std compatible (uses alloc for Vec/Box).
Fix trailing whitespace in nftoken_mint.rs and apply rustfmt across
transaction models, CLI module, and test mock that had import ordering
and line length issues.
…ctors

Optimize the ShaMap with academic state-of-the-art techniques:

- HAMT bitmap-indexed inner nodes (Bagwell 2001): u16 bitmap + compact
  Vec instead of fixed 16-slot arrays. ~70-80% memory reduction for
  sparse nodes. O(1) branch lookup via popcount.
- Hash caching with dirty-flag propagation: Cell<Option<Hash256>> on
  every node. Mutations invalidate only the root-to-leaf path — O(log16 n).
  Repeated hash() calls are O(1).
- Single-buffer inner hash: 516-byte stack buffer with zero-init so
  empty branches contribute ZERO_256 for free. One sha512half() call.
- One-shot sha512half: Sha512::digest() instead of streaming new/update/finalize.
- Merkle inclusion proofs with extract_proof() and verify_proof().
- add_item_no_index() for TRANSACTION_NO_METADATA leaf type.
- Ported xrpl.js test vectors: 8 incremental ShaMap hashes + ledger
  header hash from ripple-binary-codec.
- Extended benchmarks: 6 groups (build, cached hash, incremental add,
  proof extraction, proof verify, contains) at 100/1K/10K items.

Closes #150
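The dirty-flag caching scheme can be sketched in miniature. Everything here is illustrative: `u64` stands in for the real 256-bit hash, the multiplier is a toy mixer, and the real code invalidates the whole root-to-leaf path rather than a single node.

```rust
use std::cell::Cell;

/// Toy node demonstrating Cell<Option<_>>-based hash caching.
struct Node {
    value: u64,
    cached: Cell<Option<u64>>, // None = dirty, recompute on next hash()
}

impl Node {
    fn new(value: u64) -> Self {
        Node { value, cached: Cell::new(None) }
    }

    /// Returns the cached hash, recomputing only when dirty.
    /// `computations` counts recomputes so the cache can be observed.
    fn hash(&self, computations: &Cell<u32>) -> u64 {
        if let Some(h) = self.cached.get() {
            return h; // O(1) fast path on repeated calls
        }
        computations.set(computations.get() + 1);
        let h = self.value.wrapping_mul(0x9E37_79B9_7F4A_7C15); // toy hash
        self.cached.set(Some(h));
        h
    }

    /// A mutation marks the node dirty; real code also dirties ancestors.
    fn set_value(&mut self, value: u64) {
        self.value = value;
        self.cached.set(None);
    }
}

fn main() {
    let counter = Cell::new(0u32);
    let mut node = Node::new(42);
    let h1 = node.hash(&counter);
    assert_eq!(node.hash(&counter), h1); // cache hit
    assert_eq!(counter.get(), 1);
    node.set_value(43);
    node.hash(&counter);
    assert_eq!(counter.get(), 2); // mutation forced exactly one recompute
    println!("ok");
}
```

`Cell` rather than `RefCell` works here because `Option<Hash256>` is `Copy`-sized state swapped whole, which keeps `hash()` callable through a shared reference.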
Four coverage-guided fuzz targets using libFuzzer + arbitrary:

- shamap_order_independence: root hash is insertion-order independent
- shamap_add_remove_roundtrip: add+remove extras restores original hash
- shamap_proof_integrity: valid proofs verify, tampered proofs fail,
  missing items have no proof
- shamap_cache_consistency: incremental tree hash equals fresh-build hash
  after random add/remove sequences

Run with: cargo +nightly fuzz run <target> -- -max_total_time=30
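The order-independence property those fuzz targets check rests on hashing children in canonical slot order rather than insertion order. A pure-std sketch of the principle (not the actual ShaMap, and with `DefaultHasher` standing in for sha512half):

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Illustrative stand-in for a root hash: entries are hashed in canonical
/// (sorted key) order, so insertion order cannot affect the result.
fn root_hash(items: &BTreeMap<u8, u64>) -> u64 {
    let mut h = DefaultHasher::new();
    for (k, v) in items { // BTreeMap iterates in key order
        (k, v).hash(&mut h);
    }
    h.finish()
}

fn main() {
    let mut a = BTreeMap::new();
    for (k, v) in [(3u8, 30u64), (1, 10), (2, 20)] {
        a.insert(k, v);
    }
    let mut b = BTreeMap::new();
    for (k, v) in [(1u8, 10u64), (2, 20), (3, 30)] {
        b.insert(k, v);
    }
    assert_eq!(root_hash(&a), root_hash(&b)); // order independent
    println!("ok");
}
```

The fuzz target generalizes this: `arbitrary` generates a random item set plus a random permutation, and the harness asserts both insertion orders yield the same root.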

@e-desouza e-desouza left a comment


Two correctness issues flagged inline: the fuzz cache-consistency check fires only after all ops (not per-mutation as documented), and add_item panics with an OOB index access on duplicate keys.

Review by Claude Opus 4.6 · Prompt: V12

Comment thread fuzz/fuzz_targets/shamap_cache_consistency.rs Outdated
Comment thread src/core/shamap/tree.rs
…ncy check

add_item now detects duplicate keys and replaces the existing leaf
instead of recursing infinitely until an OOB panic on nibble().

Move the fuzz cache-consistency assertion inside the per-op loop so
transient hash divergences that self-correct mid-sequence are caught.
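The nibble-indexing invariant behind that fix can be sketched as follows. A 32-byte key yields 64 nibbles (depths 0..=63), so without a duplicate-key check, splitting two identical keys recurses past the last valid depth. Function name and shape here are illustrative:

```rust
/// Sketch of 4-bit path extraction from a 32-byte key (64 nibbles).
fn nibble(key: &[u8; 32], depth: usize) -> u8 {
    assert!(depth < 64, "ShaMap depth invariant violated");
    let byte = key[depth / 2];
    if depth % 2 == 0 { byte >> 4 } else { byte & 0x0F }
}

fn main() {
    let mut key = [0u8; 32];
    key[0] = 0xAB;
    key[31] = 0xCD;
    assert_eq!(nibble(&key, 0), 0xA);  // high nibble of first byte
    assert_eq!(nibble(&key, 1), 0xB);  // low nibble of first byte
    assert_eq!(nibble(&key, 63), 0xD); // last valid depth
    // Without an equality check before splitting, two leaves with the
    // same key would recurse past depth 63 and hit this assert (or an
    // out-of-bounds index access, as the original bug did).
    println!("ok");
}
```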
@pdp2121
Collaborator

pdp2121 commented Apr 8, 2026

/ai-review


@xrplf-ai-reviewer xrplf-ai-reviewer Bot left a comment


Three correctness issues and one API design gap flagged — see inline comments.

Review by Claude Opus 4.6 · Prompt: V13

Comment thread src/core/shamap/tree.rs Outdated
let existing = self.children.remove(idx);
self.bitmap &= !(1u16 << slot);

let mut new_inner = ShaMapInner::new(self.depth + 1);

u8 depth overflows silently in release builds — use checked_add or assert max depth (63 for 32-byte keys):

Suggested change
- let mut new_inner = ShaMapInner::new(self.depth + 1);
+ let mut new_inner = ShaMapInner::new(self.depth.checked_add(1).expect("ShaMap depth overflow"));
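The silent wrap being flagged is standard `u8` behavior in release builds: plain `+` wraps (debug builds panic), while `checked_add` surfaces the overflow explicitly. A minimal demonstration:

```rust
fn main() {
    let depth: u8 = 255;
    // Release builds wrap silently: 255u8 + 1 == 0.
    assert_eq!(depth.wrapping_add(1), 0);
    // checked_add makes the overflow observable instead.
    assert!(depth.checked_add(1).is_none());
    assert_eq!(200u8.checked_add(1), Some(201));
    println!("ok");
}
```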

Comment thread src/core/shamap/tree.rs
}
new_inner.add_item(leaf);

let new_idx = compact_index(self.bitmap, slot);

add_item silently overwrites duplicate keys with no return value — callers building canonical trees can't detect collisions. Consider returning bool (inserted vs. replaced) or an error on collision.

Comment thread src/core/shamap/tree.rs
/// A level in a ShaMap inclusion proof.
pub struct ProofLevel {
/// Which branch (0..15) was taken at this level.
pub nibble: u8,

nibble is a public u8 with no range constraint — if ProofLevel is deserialized from untrusted input with nibble > 15, verify_proof silently computes a wrong hash and always returns false. Add a validation guard in verify_proof: if level.nibble > 15 { return false; }

Comment thread benches/benchmarks.rs Outdated
map.add_item(*extra_idx, prefix, extra_data.clone());
let h = black_box(map.hash());
// Remove to reset for next iteration
map.remove_item(extra_idx);

Unchecked remove_item return — failed reset silently inflates the tree across iterations, skewing timing:

Suggested change
- map.remove_item(extra_idx);
+ assert!(map.remove_item(extra_idx), "failed to remove extra item in benchmark");

…ation, and benchmark assertion

- Use checked_add(1) for inner node depth to panic on overflow instead of silent u8 wrap
- Return bool from add_item: true for new insert, false for duplicate key replacement
- Validate nibble <= 15 in verify_proof to reject malformed proofs early
- Assert remove_item success in incremental_add benchmark instead of ignoring the return value
The rippleci/rippled:develop image updated after 2026-04-01 and broke
integration tests across all PRs (container exits before becoming healthy,
causing Connection refused on localhost:5005).

Pin to the last known-good digest and replace the simple until loop with
a bounded retry that checks container liveness, prints status per attempt,
and dumps container logs on failure.
…ibility, bench, and cache

- CRITICAL: verify_proof now binds each level.nibble to nibble(&proof.index, depth).
  Previously the verifier walked level.nibble values blindly, so for
  TRANSACTION_NO_METADATA leaves (whose hash excludes the index) an attacker
  could pair any existing leaf_hash with an arbitrary index and have it verify.
  Also reject paths longer than 64 levels early. Added test_proof_wrong_index_fails.
- HIGH: ShaMapInner::new is now pub(crate) so external callers cannot construct
  an inner at depth >= 64, which would panic in nibble() on the first op.
  Documented the depth < 64 invariant.
- MEDIUM: bench_shamap_incremental_add switched to iter_batched so the timed
  closure only measures add_item + hash. Previously it also ran remove_item +
  hash per iteration, inflating the reported cost by roughly 2x.
- MEDIUM: ShaMapInner::hash now caches ZERO_256 when bitmap == 0 instead of
  re-running the empty-check on every call.
- Added doc warning on ShaMapProof fields pointing callers at extract_proof
  and explaining the index-binding invariant enforced by verify_proof.
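The index-binding check described in the CRITICAL fix can be sketched without the hashing machinery. The types and names below are illustrative (the real verifier also recomputes parent hashes via sha512half); the point is that each stored nibble must match the nibble derived from the proof's own index at that depth, so an attacker cannot pair a valid path with an arbitrary index:

```rust
/// Illustrative proof types; the real ShaMapProof carries sibling hashes too.
struct ProofLevel {
    nibble: u8,
}

struct Proof {
    index: [u8; 32],
    levels: Vec<ProofLevel>,
}

/// 4-bit path extraction from a 32-byte key.
fn nibble(key: &[u8; 32], depth: usize) -> u8 {
    let b = key[depth / 2];
    if depth % 2 == 0 { b >> 4 } else { b & 0x0F }
}

/// Sketch of the path-binding portion of verify_proof.
fn verify_path_binding(proof: &Proof) -> bool {
    if proof.levels.len() > 64 {
        return false; // reject over-long paths early
    }
    for (depth, level) in proof.levels.iter().enumerate() {
        if level.nibble > 15 {
            return false; // malformed nibble from untrusted input
        }
        if level.nibble != nibble(&proof.index, depth) {
            return false; // path not bound to the claimed index
        }
    }
    true
}

fn main() {
    let mut index = [0u8; 32];
    index[0] = 0xA7;
    let good = Proof {
        index,
        levels: vec![ProofLevel { nibble: 0xA }, ProofLevel { nibble: 0x7 }],
    };
    assert!(verify_path_binding(&good));
    let bad = Proof {
        index,
        levels: vec![ProofLevel { nibble: 0xA }, ProofLevel { nibble: 0x3 }],
    };
    assert!(!verify_path_binding(&bad)); // wrong nibble at depth 1 rejected
    println!("ok");
}
```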
pdp2121 pushed a commit that referenced this pull request Apr 21, 2026
… #3270) (#291)

## Summary

The `rippled` binary was renamed to `xrpld` upstream, and the
`rippleci/rippled` image stopped receiving updates. Our integration
tests across every open PR started failing because the published
`develop` image exited before becoming healthy (`Connection refused` on
`localhost:5005`, **0 passed / 41 failed**).

This PR mirrors the upstream fix in xrpl.js:
[XRPLF/xrpl.js#3270](XRPLF/xrpl.js#3270).
Switching to `rippleci/xrpld:develop` is the **actual root-cause fix**
rather than pinning an old digest of the deprecated image.

## Changes

`.github/workflows/integration_test.yml`:
- `RIPPLED_DOCKER_IMAGE` -> `XRPLD_DOCKER_IMAGE:
rippleci/xrpld:develop`.
- `docker run` simplified to `${IMAGE} --standalone` (the `xrpld` image
handles `mkdir` + launch internally; no more `bash -c "mkdir -p
/var/lib/rippled/db/ && rippled -a"` wrapper).
- Volume mount changed from `/etc/opt/ripple/` to `/etc/opt/xrpld/`.
- Container name: `rippled-service` -> `xrpld-service`.
- Removed the docker `--health-cmd` (which shelled out to the renamed
`rippled` CLI and always failed) in favour of a direct JSON-RPC poll
against `http://localhost:5005/`.
- Always dump container logs on the stop step for post-mortem
visibility.

`.ci-config/rippled.cfg` -> `.ci-config/xrpld.cfg`:
- `path=/var/lib/rippled/db/nudb` -> `path=/var/lib/xrpld/db/nudb`.
- `[database_path] /var/lib/rippled/db` -> `/var/lib/xrpld/db`.
- `[debug_logfile] /var/log/rippled/debug.log` ->
`/var/log/xrpld/debug.log`.

## Verification

Validated on throwaway PR #292 (now closed): **Integration Test green in
2m53s** on this exact workflow. Unit tests, Build & Lint, Quality Check
also pass.

## Related follow-up

The 7 in-flight PRs (#130, #131, #151, #153, #156, #157, #158) currently
carry a stopgap commit pinning `rippleci/rippled:develop` to a specific
digest. After this PR merges to `main`, those branches should:
1. Rebase on `main` to pick up the xrpld switch, or
2. Cherry-pick this commit and drop the stopgap digest pin.

## Test plan

- [x] Validated end-to-end on PR #292
- [x] Build & Lint, Unit Test, Integration Test, Quality Check all pass
- [ ] Merge and confirm subsequent PRs inherit the fix without manual
cherry-pick

## Credit

Approach lifted from @ckeshava's
[xrpl.js#3270](XRPLF/xrpl.js#3270).

Successfully merging this pull request may close these issues.

Implement ShaMap Merkle Tree & Ledger Hashing for ledger-level data verification