Add Verus/Kani proofs and ExecPlan for LSH invariants (7.2.4/7.2.5) by leynos · Pull Request #194 · leynos/whitaker

leynos · 2026-04-08T15:27:45Z

Summary

Adds a concrete Verus/Kani proof workflow for the clone-detector LSH invariants and accompanying tests; wires ExecPlan for 7.2.4/7.2.5 and updates documentation and tooling. This implements Stages B–I while preserving the runtime API.

What this PR adds

Verus proof: verus/clone_detector_lsh_config.rs (semantic model of LshConfig::new)
Kani harness: crates/whitaker_clones_core/src/index/kani.rs (crate-private, bounded smoke harness)
Proof tooling and wiring:
- Makefile targets: verus-clone-detector, kani-clone-detector (in addition to existing verus and kani scaffolding)
- Script scaffolding: scripts/install-kani.sh, scripts/run-kani.sh, scripts/run-verus.sh updated for grouped proof entry points
ExecPlan documentation and wiring:
- ExecPlan: docs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.md
Documentation and test surface:
- Documentation updates: docs/whitaker-clone-detector-design.md, docs/roadmap.md updates, and docs/developers-guide.md
- Tests: extended unit tests and BDD coverage in min_hash_lsh feature area (zero bands, zero rows, invalid product, overflow edge)
- Feature tests updated to cover new edge cases
Artefacts this plan creates:
- Verus proof: verus/clone_detector_lsh_config.rs
- Kani harness: crates/whitaker_clones_core/src/index/kani.rs
- Makefile and scripts: verus and Kani targets, install/run helpers
- ExecPlan: docs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.md
- Documentation updates: docs/whitaker-clone-detector-design.md, docs/roadmap.md, docs/developers-guide.md updates
- Tests: extended unit tests and BDD coverage in min_hash_lsh feature area

Scope and constraints

Strict focus on roadmap items 7.2.4 and 7.2.5. Do not mark 7.2.6–7.2.8 as done in this change.
Proof tooling remains sidecar-based (Verus for semantic invariants; Kani for bounded checks) per ADR 002.
Preserve the existing runtime API shape; only crate-private proof seams may be introduced if needed.
Build on the existing clone-detector crate (crates/whitaker_clones_core). Do not move LshConfig into a separate crate.
If Kani Harnesses use #[cfg(kani)], register that cfg in the crate's unexpected_cfgs allowlist.
Keep each Rust source file focused and importable; avoid large monolithic files.
Workspace-wide lint/version constraints maintained (e.g., rstest-bdd version).

Implementation plan (highlights)

Stage B: Add clone-detector proof sidecar tooling and Makefile targets
- Extend scripts/run-verus.sh to group proofs by domain (clone-detector group)
- Add install-kani.sh and run-kani.sh for Kani harnesses
- Update Makefile with verus-clone-detector and kani-clone-detector targets
Stage C: Extend unit tests (ZeroBands, ZeroRows, InvalidBandRowProduct, overflow edge)
Stage D: Extend rstest-bdd coverage for min_hash_lsh config outcomes
Stage E: Add Verus proof for LshConfig::new (verus/clone_detector_lsh_config.rs)
Stage F: Add minimal Kani smoke harness (crates/whitaker_clones_core/src/index/kani.rs)
Stage G: Update clone-detector design/roadmap docs
Stage H: Gatekeeping and quality checks (fmt, lint, tests, docs)
Stage I: Finalize ExecPlan after implementation

Validation plan (evidence to collect)

Run gates:
- make fmt, make markdownlint, make nixie
- make verus-clone-detector
- make kani-clone-detector
- make check-fmt, make lint, make test
Optional broader proof smoke: make verus, make kani

Acceptance criteria

The ExecPlan document is added and clearly references 7.2.4 and 7.2.5.
Stage B–I milestones are defined and traceable in the plan.
Provisions for new proof tooling, targets, and tests are described (without changing runtime API yet).
Design decisions are captured in the ExecPlan and related docs updated.

Rationale

This change provides a concrete path to Verus-based proofs for LshConfig::new with a minimal Kani smoke harness, preserving API stability and keeping proof tooling separate from normal cargo/test flows as guided by ADR 002 and the roadmap.

Related guidance

ADR 002 on proof-workflow separation (Verus for semantic invariants; Kani for bounded checks).
Existing LshConfig runtime contract constraints: bands > 0, rows > 0, bands * rows == MINHASH_SIZE (128).

Progress

Stage A–I completed for this turn; gates to be re-run in CI.

Surprises & Discoveries

The repository already includes a pinned Verus sidecar workflow; Kani harnesses are introduced here as a first-stage smoke gate.

Task

https://www.devboxer.com/task/8141455b-138d-4d5c-becb-8d6a210ecbad

coderabbitai · 2026-04-08T15:27:53Z

Summary

This PR introduces formalised proof tooling for LshConfig::new invariants as part of roadmap items 7.2.4 and 7.2.5. It adds a Verus semantic model (verus/clone_detector_lsh_config.rs) specifying acceptance/rejection logic and a Kani bounded-model-checking harness suite (crates/whitaker_clones_core/src/index/kani.rs), along with corresponding Make targets, shell scripts, and documentation updates.

Key changes

Proof artefacts:

verus/clone_detector_lsh_config.rs: Spec-only Verus model defining LshConfigOutcome enum and branch-structured proof lemmas for constructor invariants (zero bands/rows rejection, checked_mul-based overflow handling, and bands * rows == MINHASH_SIZE acceptance).
crates/whitaker_clones_core/src/index/kani.rs: Three Kani harnesses—smoke test, symbolic bounded harness (bands ≤ 128, rows ≤ 128), and overflow-product symbolic harness—executing the real LshConfig::new with targeted assertions for each code path.

Build and scripting:

Makefile targets verus-clone-detector and kani-clone-detector added; kani description updated.
Scripts scripts/run-kani.sh and scripts/run-verus.sh refactored to support dispatched execution modes (clone-detector, decomposition, all).
scripts/install-verus.sh enhanced to handle ANSI sequences when parsing toolchain requirements.
Cargo.toml workspace lint configuration updated to recognise cfg(kani).

Documentation and testing:

New ExecPlan (docs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.md) documents scope, deliverables, constraints (sidecar-based per ADR 002), and validation gates for items 7.2.4 and 7.2.5.
docs/whitaker-clone-detector-design.md extended with implementation-decision section describing Verus/Kani split and external build entry points.
docs/developers-guide.md adds "Proof workflows" section documenting Make targets, wrapper scripts, and cache/idempotence behaviour.
docs/roadmap.md marks items 7.2.4 and 7.2.5 complete.
Test coverage extended: unit tests and rstest-bdd scenarios in min_hash_lsh.feature and min_hash_lsh_behaviour.rs cover zero-bands, zero-rows, invalid product, and symbolic overflow cases.

Review findings

Automated review identified gaps:

Verus spec: Functions as a semantic model without linkage to the real implementation (no external_fn_specification), so proofs do not establish conformance; several lemmas are tautological.
Kani symbolic harness: Constraints (bands ≤ 128, rows ≤ 128) prevent exercising the checked_mul overflow branch for inputs exceeding MINHASH_SIZE.

The PR author has completed documentation updates; recommendations for Kani overflow harness extension and Verus implementation linkage remain as follow-up items.

Scope

Proof tooling remains opt-in via sidecar scripts and Make targets; runtime API shape, LshConfig location, and workspace lint constraints unchanged per ADR 002. Implementation stages B–I are documented in the ExecPlan; validation includes formatting, linting, unit/BDD tests, and proof runs.

Walkthrough

This pull request adds formal verification infrastructure for the LshConfig::new constructor using both Verus (theorem proving) and Kani (bounded model checking). It introduces three Kani runtime harnesses, a comprehensive Verus specification module, build tooling orchestration via scripts and Make targets, extended test coverage, and supporting documentation marking roadmap items 7.2.4 and 7.2.5 complete.

Changes

Cohort / File(s)	Summary
Build configuration and tooling `Cargo.toml`, `Makefile`, `scripts/install-verus.sh`, `scripts/run-kani.sh`, `scripts/run-verus.sh`	Added `kani` conditional compilation flag recognition to lint configuration; extended Makefile with `verus-clone-detector` and `kani-clone-detector` phony targets; refactored `run-kani.sh` to support `clone-detector` mode dispatching with dedicated clone-detector harness group and decomposition/common harness separation; added group-based proof-file selection to `run-verus.sh` for `all\|decomposition\|clone-detector` modes; fixed ANSI escape sequence parsing in `install-verus.sh`.
Kani proof harnesses `crates/whitaker_clones_core/src/index/kani.rs`, `crates/whitaker_clones_core/src/index/mod.rs`	Added three Kani proof functions (`verify_lsh_config_new_smoke`, `verify_lsh_config_new_symbolic`, `verify_lsh_config_new_overflow_product`) verifying `LshConfig::new` constructor acceptance/rejection contract under smoke test, symbolic input, and integer overflow scenarios; introduced conditional `kani` module registration.
Verus proof specification `verus/clone_detector_lsh_config.rs`	Introduced comprehensive formal specification module defining `LshConfigOutcome` enum and eight proof lemmas establishing that `LshConfig::new` accepts exactly when `bands > 0 && rows > 0 && bands * rows == 128`, and rejects zero-band, zero-row, invalid-product, or overflow cases accordingly.
Test coverage extensions `crates/whitaker_clones_core/src/index/tests.rs`, `crates/whitaker_clones_core/tests/features/min_hash_lsh.feature`, `crates/whitaker_clones_core/tests/min_hash_lsh_behaviour.rs`	Extended parameterised unit tests with valid `(32, 4)` case and invalid `(3, 42)` and `(usize::MAX, 2)` cases; added two BDD scenarios covering `ZeroRows` and `InvalidBandRowProduct` error paths; refactored `expected_error` function to parse parameterised `InvalidBandRowProduct(<bands>,<rows>)` error names and extract band/row values.
Documentation and roadmap `docs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.md`, `docs/roadmap.md`, `docs/whitaker-clone-detector-design.md`, `docs/developers-guide.md`	Added comprehensive ExecPlan documenting 7.2.4/7.2.5 deliverables, proof tooling constraints, and validation gates; marked roadmap items 7.2.4 and 7.2.5 as complete; introduced "Implementation decisions" section in design document describing Verus/Kani proof split for constructor invariants; added "Proof workflows" section in developers guide covering Make targets, wrapper scripts, and current proof strategy.

Possibly related PRs

Kani verification for build_adjacency with tests and tooling #186: Evolves repository-local Kani tooling and harness workflows via modifications to run-kani.sh and Makefile targets, with parallel registration of cfg(kani) conditional compilation flag configuration.

Poem

With Verus proofs and Kani's might, 🔬
The constructor's logic burns so bright,
Zero bands? Rejected outright,
Band-row product matching MINHASH's height—
Overflow caught—verified just right! ✓

🚥 Pre-merge checks | ✅ 6 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Testing	⚠️ Warning	Core invariants tested via unit and BDD tests in CI, but shell scripts and proof tooling infrastructure lack automated test coverage.	Implement BATS tests for shell scripts and integrate Kani/Verus proof execution into CI as required or documented optional verification steps.

✅ Passed checks (6 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title directly describes the main change: adding Verus/Kani proofs and an ExecPlan for LSH invariants (items 7.2.4/7.2.5), which aligns with the comprehensive changeset spanning proof files, harnesses, tooling, tests, and documentation.
Description check	✅ Passed	The description comprehensively outlines the scope, implementation stages, artefacts, validation plan, and acceptance criteria—all directly related to the changeset of proof workflows, tooling updates, test extensions, and documentation additions.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
User-Facing Documentation	✅ Passed	Proof workflows and Make targets are developer-focused infrastructure appropriately documented in developers-guide.md, not user-facing functionality requiring users-guide.md updates.
Developer Documentation	✅ Passed	PR documentation comprehensively covers all new tooling, build requirements, and architectural boundaries with dedicated sections.
Module-Level Documentation	✅ Passed	All new modules carry comprehensive module-level docstrings that clearly explain their purpose, utility, and function as required.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch add-verus-proofs-lsh-new-invariants-5sefjo

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

sourcery-ai · 2026-04-08T15:28:27Z

Reviewer's Guide

Adds Verus and Kani sidecar proof workflows for the clone-detector’s LshConfig::new invariants, including new proof and harness files, grouped proof targets/scripts, extended unit/BDD tests for edge cases, and documentation/roadmap updates to reflect roadmap items 7.2.4 and 7.2.5.

Sequence diagram for kani-clone-detector bounded LshConfig::new verification

sequenceDiagram
  actor Dev
  participant Make as Makefile
  participant RunKani as scripts_run_kani_sh
  participant InstallKani as scripts_install_kani_sh
  participant CargoKani as cargo_kani
  participant Harness as index_kani_rs
  participant Lsh as LshConfig_new

  Dev->>Make: make kani-clone-detector
  Make->>RunKani: ./scripts/run-kani.sh clone-detector
  RunKani->>InstallKani: resolve and install cargo-kani
  InstallKani-->>RunKani: kani_bin path
  RunKani->>CargoKani: cargo-kani --manifest-path ... 
  CargoKani->>Harness: run verify_lsh_config_new_smoke
  Harness->>Lsh: LshConfig::new(32, 4)
  Lsh-->>Harness: Ok(LshConfig)
  Harness-->>CargoKani: assertions over config
  CargoKani->>Harness: run verify_lsh_config_new_symbolic
  Harness->>Harness: choose bands, rows with assumptions
  Harness->>Lsh: LshConfig::new(bands, rows)
  Lsh-->>Harness: Ok or Err(IndexError)
  Harness-->>CargoKani: assertions over result and error shapes
  CargoKani-->>RunKani: verification success/failure
  RunKani-->>Dev: exit status

Class diagram for LshConfig, IndexError, and new Kani harness module

classDiagram
  class LshConfig {
    +bands() usize
    +rows() usize
    +new(bands usize, rows usize) Result~LshConfig, IndexError~
  }

  class IndexError {
    <<enum>>
    ZeroBands
    ZeroRows
    InvalidBandRowProduct
    EmptyFingerprintSet
  }

  class MINHASH_SIZE {
    <<const>>
    +VALUE usize
  }

  class index_kani {
    <<module>>
    +verify_lsh_config_new_smoke()
    +verify_lsh_config_new_symbolic()
  }

  LshConfig --> IndexError : returns
  LshConfig ..> MINHASH_SIZE : uses

  index_kani ..> LshConfig : calls_new_and_accessors
  index_kani ..> IndexError : pattern_matches
  index_kani ..> MINHASH_SIZE : uses

File-Level Changes

Change	Details	Files
Group Verus proof entry points and add a dedicated clone-detector proof file for LshConfig::new invariants.	Refactor run-verus.sh to support proof file groups (decomposition, clone-detector, all) and selectable groups via CLI argument. Default Verus runs now include the new clone-detector proof file alongside existing decomposition proofs. Introduce verus/clone_detector_lsh_config.rs that semantically models LshConfig::new and proves acceptance only for positive bands/rows whose product equals MINHASH_SIZE, plus rejection lemmas for zero and invalid non-zero products.	`scripts/run-verus.sh` `verus/clone_detector_lsh_config.rs`
Introduce Kani sidecar workflow and harnesses for LshConfig::new, wired through scripts and Makefile targets.	Add scripts/install-kani.sh to install a pinned kani-verifier into a cache dir and expose cargo-kani path, running kani setup idempotently. Add scripts/run-kani.sh wrapper that runs the clone-detector Kani harnesses in the whitaker_clones_core crate, with a default-unwind bound. Create crates/whitaker_clones_core/src/index/kani.rs with a smoke harness and a symbolic harness around LshConfig::new, asserting alignment with the documented contract and error variants. Expose the kani module behind cfg(kani) from the index mod and allow cfg(kani) via the workspace unexpected_cfgs lint configuration. Add Makefile targets verus-clone-detector, kani, and kani-clone-detector that call the new scripts with appropriate groups.	`scripts/install-kani.sh` `scripts/run-kani.sh` `crates/whitaker_clones_core/src/index/kani.rs` `crates/whitaker_clones_core/src/index/mod.rs` `Cargo.toml` `Makefile`
Tighten and extend tests and BDD scenarios to cover LshConfig::new edge cases and richer error expectations.	Extend unit tests for invalid LSH configs to cover non-zero invalid product (3,42) and an overflow edge (usize::MAX, 2), and add a new valid case (32,4). Allow parsing of parameterised InvalidBandRowProduct(bands,rows) errors in the MinHash LSH behaviour test helper instead of a single fixed instance. Add new rstest-bdd scenarios for zero rows and invalid non-zero product in the behaviour test file, and reference them from the feature file with appropriate Given/Then steps.	`crates/whitaker_clones_core/src/index/tests.rs` `crates/whitaker_clones_core/tests/min_hash_lsh_behaviour.rs` `crates/whitaker_clones_core/tests/features/min_hash_lsh.feature`
Document the proof-workflow split and mark roadmap items 7.2.4 and 7.2.5 as complete, plus harden Verus installer output parsing.	Add an ExecPlan document that explains scope, constraints, staged implementation, risks, decisions, and validation for adding clone-detector proof sidecars and the LshConfig::new proof. Update whitaker-clone-detector-design.md with a new section capturing implementation decisions for 7.2.4/7.2.5, including Verus vs Kani responsibilities, sidecar targets, LshConfig contract modeling, Kani harness placement, and overflow handling. Mark roadmap items 7.2.4 and 7.2.5 as done in docs/roadmap.md once implementation is in place. Strip ANSI escape codes from Verus installer output before extracting the suggested rustup toolchain version in scripts/install-verus.sh.	`docs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.md` `docs/whitaker-clone-detector-design.md` `docs/roadmap.md` `scripts/install-verus.sh`

Tips and commands

Interacting with Sourcery

Trigger a new review: Comment @sourcery-ai review on the pull request.
Continue discussions: Reply directly to Sourcery's review comments.
Generate a GitHub issue from a review comment: Ask Sourcery to create an
issue from a review comment by replying to it. You can also reply to a
review comment with @sourcery-ai issue to create an issue from it.
Generate a pull request title: Write @sourcery-ai anywhere in the pull
request title to generate a title at any time. You can also comment
@sourcery-ai title on the pull request to (re-)generate the title at any time.
Generate a pull request summary: Write @sourcery-ai summary anywhere in
the pull request body to generate a PR summary at any time exactly where you
want it. You can also comment @sourcery-ai summary on the pull request to
(re-)generate the summary at any time.
Generate reviewer's guide: Comment @sourcery-ai guide on the pull
request to (re-)generate the reviewer's guide at any time.
Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
pull request to resolve all Sourcery comments. Useful if you've already
addressed all the comments and don't want to see them anymore.
Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
request to dismiss all existing Sourcery reviews. Especially useful if you
want to start fresh with a new review - don't forget to comment
@sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

Enable or disable review features such as the Sourcery-generated pull request
summary, the reviewer's guide, and others.
Change the review language.
Add, remove or edit custom review instructions.
Adjust other review settings.

Getting Help

Contact our support team for questions or feedback.
Visit our documentation for detailed guides and information.
Keep in touch with the Sourcery team by following us on X/Twitter, LinkedIn or GitHub.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 862087e28e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

sourcery-ai

Hey - I've found 3 issues

Prompt for AI Agents

Please address the comments from this code review:

## Individual Comments

### Comment 1
<location path="scripts/run-kani.sh" line_range="9-15" />
<code_context>
+KANI_BIN=$("${SCRIPT_DIR}/install-kani.sh")
+MANIFEST_PATH="${REPO_ROOT}/crates/whitaker_clones_core/Cargo.toml"
+
+run_clone_detector_harnesses() {
+    "${KANI_BIN}" \
+        --manifest-path "${MANIFEST_PATH}" \
+        --default-unwind 4 \
+        --harness verify_lsh_config_new_smoke \
+        --harness verify_lsh_config_new_symbolic
+}
+
+if [ $# -eq 0 ]; then
+    run_clone_detector_harnesses
+    exit 0
+fi
+
+case "$1" in
+    clone-detector)
+        shift
+        run_clone_detector_harnesses "$@"
+        ;;
+    *)
</code_context>
<issue_to_address>
**suggestion:** The helper ignores forwarded arguments, which is slightly misleading given the `"$@"` in the call site.

`run_clone_detector_harnesses` takes no parameters, but in the `clone-detector` branch you call it as `run_clone_detector_harnesses "$@"`, which is a no-op and potentially confusing. Either remove `"$@"` at the call site, or update `run_clone_detector_harnesses` to accept and forward options (e.g., extra `kani` flags) if that’s the intended design.

```suggestion
run_clone_detector_harnesses() {
    # Forward any additional arguments as extra Kani flags.
    "${KANI_BIN}" \
        --manifest-path "${MANIFEST_PATH}" \
        --default-unwind 4 \
        --harness verify_lsh_config_new_smoke \
        --harness verify_lsh_config_new_symbolic \
        "$@"
}
```
</issue_to_address>

### Comment 2
<location path="docs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.md" line_range="22" />
<code_context>
+This work matters because roadmap item 7.2.2 already made `LshConfig` a hard
+runtime boundary for MinHash and locality-sensitive hashing (LSH). If that
+boundary drifts, every later proof and every later token-pass candidate check
+inherits the bug. ADR 002 explicitly assigns this constructor to Verus and
+reserves Kani for bounded behavioural checks over the same clone-detector
+crate, so 7.2.4 and 7.2.5 establish the proof split that later items 7.2.6,
</code_context>
<issue_to_address>
**issue (review_instructions):** ADR is introduced as an acronym without being expanded on first use.

Please expand ADR on its first appearance, for example, "Architecture Decision Record (ADR) 002 explicitly assigns this constructor to Verus…" to satisfy the acronym definition requirement.

<details>
<summary>Review instructions:</summary>

**Path patterns:** `**/*.md`

**Instructions:**
Define uncommon acronyms on first use.

</details>
</issue_to_address>

### Comment 3
<location path="docs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.md" line_range="182" />
<code_context>
+  `crates/whitaker_clones_core/tests/features/min_hash_lsh.feature` already
+  exercises zero bands. The implementation should extend these tests rather
+  than replace them.
+- `Makefile` currently exposes only a global `verus` target. If we want fast
+  clone-detector-only iteration without regressing the existing proof workflow,
+  the cleanest approach is to add clone-detector-specific targets alongside the
</code_context>
<issue_to_address>
**issue (review_instructions):** The sentence uses the first-person plural pronoun "we", which the style guide forbids.

To comply with the pronoun rule, consider rephrasing this line, for example: "If fast clone-detector-only iteration is desired, the cleanest approach is to add clone-detector-specific targets alongside the current umbrella target."

<details>
<summary>Review instructions:</summary>

**Path patterns:** `**/*.md`

**Instructions:**
Avoid 2nd person or 1st person pronouns ("I", "you", "we").

</details>
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

coderabbitai

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.md`:
- Line 321: Replace the two-word phrase "check list" with the single word
"checklist" in the sentence that references the crate's `unexpected_cfgs` check
list so it reads "crate's `unexpected_cfgs` checklist"; locate the occurrence
near the text mentioning `unexpected_cfgs` and update the spelling accordingly.
- Line 311: Replace the British spelling "specialised" with the Oxford English
Dictionary preferred "-ize" form "specialized" in the sentence referencing
`unexpected_cfgs` considerations (the phrase "`unexpected_cfgs` considerations
when introducing specialised build modes."). Update that exact token so the doc
uses "specialized" while keeping the surrounding wording intact.

In `@scripts/run-kani.sh`:
- Around line 22-26: The case branch calls run_clone_detector_harnesses with an
unused argument passthrough ("$@"); remove the "$@" so the call is simply
run_clone_detector_harnesses. Update the case block where
run_clone_detector_harnesses is invoked to stop forwarding parameters and ensure
no other callers rely on arguments being passed from this branch.

In `@scripts/run-verus.sh`:
- Line 33: The array assignment using command substitution for PROOF_FILES is
unsafe and triggers SC2207; replace both occurrences where PROOF_FILES is set
from proof_files_for_group (the global case and the case inside the
option-handling branch) to use mapfile -t with process substitution: run
proof_files_for_group through process substitution and load into PROOF_FILES
with mapfile -t to avoid word-splitting and globbing issues (use mapfile -t
PROOF_FILES < <(proof_files_for_group all) for the top-level assignment and
mapfile -t PROOF_FILES < <(proof_files_for_group "$1") for the branch).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: b90765d1-53b6-481a-a9cc-16fc2fb38383

📥 Commits

Reviewing files that changed from the base of the PR and between 221fdbd and 862087e.

📒 Files selected for processing (15)

Cargo.toml
Makefile
crates/whitaker_clones_core/src/index/kani.rs
crates/whitaker_clones_core/src/index/mod.rs
crates/whitaker_clones_core/src/index/tests.rs
crates/whitaker_clones_core/tests/features/min_hash_lsh.feature
crates/whitaker_clones_core/tests/min_hash_lsh_behaviour.rs
docs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.md
docs/roadmap.md
docs/whitaker-clone-detector-design.md
scripts/install-kani.sh
scripts/install-verus.sh
scripts/run-kani.sh
scripts/run-verus.sh
verus/clone_detector_lsh_config.rs

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.md`:
- Around line 213-216: Replace the phrase "sidecar proof tooling independent
from ordinary Cargo development" with "sidecar proof tooling independent of
ordinary Cargo development" in the sentence that mentions Makefile targets and
wrapper scripts and references ADR 002 (the sentence containing "sidecar proof
tooling independent from ordinary Cargo"); ensure the docs/markdown uses
en-GB-oxendict spelling and save the updated line.

In `@scripts/run-kani.sh`:
- Around line 27-31: The case branch for "clone-detector" drops all remaining
CLI flags by calling run_clone_detector_harnesses without arguments; change the
invocation in that branch to forward the shifted arguments to
run_clone_detector_harnesses (i.e., call run_clone_detector_harnesses "$@") so
the function (defined as run_clone_detector_harnesses) receives and can forward
the flags to the Kani harness execution.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 20c34ef5-409e-406d-aaa3-8c5cec9faef8

📥 Commits

Reviewing files that changed from the base of the PR and between 862087e and d372910.

📒 Files selected for processing (3)

docs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.md
scripts/run-kani.sh
scripts/run-verus.sh

leynos · 2026-04-11T13:27:53Z

@coderabbitai Have the following now been resolved?

Annul any requirements that violate the en-GB-oxendict spelling (-ize / -yse / -our) conventions (for example a request to replace "normalize" with "normalise" or "artefact" with "artifact"), or where the requirement unnecessarily increases cyclomatic complexity.

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Developer Documentation	⚠️ Warning	New tooling and build requirements introduced in this PR are not documented in `docs/developers-guide.md`, the primary reference for developers.	Update `docs/developers-guide.md` to add a new section documenting the proof workflow tooling, including Makefile targets, scripts, Kani installation requirements, and examples.

coderabbitai · 2026-04-11T13:28:21Z

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if docs/developers-guide.md mentions kani, verus, or proof workflow tooling
echo "=== Searching for kani/verus/proof references in docs/developers-guide.md ==="
fd -i 'developers-guide.md' --exec grep -in 'kani\|verus\|proof\|run-kani\|run-verus\|install-kani\|kani-clone\|verus-clone' {} \;

echo "=== Last 60 lines of docs/developers-guide.md ==="
fd -i 'developers-guide.md' --exec tail -n 60 {} \;

Length of output: 4266

@leynos: Yes — the warning is resolved. docs/developers-guide.md now contains a "Proof workflows" section (line 64 onwards) that covers all the required material:

Makefile targets — make verus, make verus-clone-detector, make kani, and make kani-clone-detector are all documented with a short prose explanation of what each group currently runs.
Scripts — scripts/install-verus.sh, scripts/run-verus.sh, scripts/install-kani.sh, and scripts/run-kani.sh are each described, including their cache locations and idempotency guarantee.
Kani requirements — a dedicated subsection explains the host-toolchain dependency, the automatic setup performed by scripts/install-kani.sh, and where the installed binary lives for direct inspection.
Examples — direct invocation examples for both the Make targets and the underlying wrapper scripts are provided.

Nothing further is needed for this check.

leynos · 2026-04-11T14:27:05Z

@coderabbitai are you satisfied that the introduced proof is substantive, rigorous, and well-founded?

coderabbitai · 2026-04-11T14:28:37Z

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
echo "=== verus/clone_detector_lsh_config.rs ==="
fd -g 'clone_detector_lsh_config.rs' --exec cat {} \;

echo ""
echo "=== crates/.../index/kani.rs ==="
fd -g 'kani.rs' --exec cat {} \;

echo ""
echo "=== crates/.../index/lsh.rs (LshConfig::new implementation) ==="
fd -g 'lsh.rs' --exec cat {} \;

echo ""
echo "=== crates/.../index/error.rs ==="
fd -g 'error.rs' -p 'whitaker_clones_core' --exec cat {} \;

Length of output: 7587

🏁 Script executed:

#!/bin/bash
echo "=== LshConfig struct and new() implementation ==="
rg -n 'struct LshConfig|impl LshConfig|fn new' --type rust -A 20 -B 2

echo ""
echo "=== IndexError enum ==="
rg -n 'enum IndexError' --type rust -A 30

echo ""
echo "=== MINHASH_SIZE constant ==="
rg -n 'MINHASH_SIZE' --type rust -C 2

Length of output: 146186

@leynos: Having read both the proof artefacts in full alongside the real LshConfig::new implementation and validate_product, here is my assessment.

Kani — substantive and well-founded ✅

verify_lsh_config_new_symbolic calls the real LshConfig::new with symbolic usize inputs and exhausts all match arms, including the EmptyFingerprintSet unreachability claim. Within its input bounds (bands ≤ 128, rows ≤ 128) the harness provides strong bounded-model-checking guarantees that the implementation satisfies its contract.

One genuine gap: inputs where bands > 128 or rows > 128 are excluded by the kani::assume guards. The overflow branch inside validate_product (where checked_mul returns None) is therefore never exercised. A harness without those upper-bound assumptions — relying on the model checker to verify the None branch via checked_mul — would close that gap.

Verus — self-referential and implementation-free ⚠️

The Verus proof is not well-founded as a proof of the implementation. Three specific concerns:

No linkage to the real code. There is no external_fn_specification, external_body, or equivalent mechanism connecting lsh_config_accepts to LshConfig::new. The proof verifies only the internal consistency of an abstract spec; it makes no claim about the actual Rust function.
Tautological lemmas. lemma_acceptance_matches_documented_contract asserts:
```
lsh_config_accepts(bands, rows) <==> (bands > 0 && rows > 0 && bands * rows == minhash_size())
```
Since lsh_config_accepts is defined as exactly that RHS, this is X <==> X. Verus discharges it with an empty proof body — no proof work is done. Similarly, lemma_zero_bands_rejected and lemma_zero_rows_rejected are immediately true from the definition; substituting 0 for bands or rows falsifies the > 0 guard instantly.
The substantive lemmas are finite arithmetic checks. lemma_exact_product_is_accepted and lemma_invalid_non_zero_product_is_rejected use by (compute) to verify that 32*4 == 128, 1*128 == 128, 16*16 ≠ 128, and 3*42 ≠ 128. These are correct but trivial ground-term computations.

The PR documentation does describe the Verus artefact as a "semantic model" rather than an implementation proof, so this is a known design choice rather than an oversight. However, it is worth being explicit: the Verus file documents the intended contract; it does not verify that LshConfig::new upholds it.

Summary

Artefact	Calls real code	Symbolic	Proof content	Gap
Verus	✗	✗	Tautological over spec definition	No implementation linkage
Kani smoke	✓	✗	Single concrete path	Narrow
Kani symbolic	✓	✓	Full match exhaustion over `[0,128]²`	Overflow inputs uncovered

The Kani harness is the only artefact that actually verifies the implementation, and it does so substantively within its declared bounds. Two actions would strengthen the overall proof story:

Add a Kani harness for unconstrained bands and rows (removing the <= MINHASH_SIZE assumptions) to cover the checked_mul overflow path in validate_product.
Either extend the Verus proof to link the spec to the implementation via external_fn_specification, or document clearly in the design notes that the Verus file is a contract record rather than an implementation proof (the latter already appears partially in the design doc but could be stated more plainly alongside the tautology concern).

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/developers-guide.md`:
- Around line 125-127: Replace the phrase "cached install" with "cached
installation" in the sentence "The first proof run may take longer while
toolchains and verifier binaries are downloaded; later runs reuse the cached
install." to correct the noun usage and conform to en-GB-oxendict
spelling/grammar in docs; update the sentence to read "...later runs reuse the
cached installation." ensuring consistency in docs/*.md.

In `@docs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.md`:
- Line 7: Update the top matter "Status" field from "IN PROGRESS" to "COMPLETED"
so the document header matches the body (all stages checked in the stages
section and the retrospective records finished outcomes); specifically change
the Status value on the header line that currently reads "Status: IN PROGRESS"
to "Status: COMPLETED".

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 7f051490-5299-4e5f-bac2-bb955a8cb246

📥 Commits

Reviewing files that changed from the base of the PR and between d372910 and 6ae12a1.

📒 Files selected for processing (6)

crates/whitaker_clones_core/src/index/kani.rs
docs/developers-guide.md
docs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.md
docs/whitaker-clone-detector-design.md
scripts/run-kani.sh
verus/clone_detector_lsh_config.rs

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (2)

docs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.md (1)
7-7: ⚠️ Potential issue | 🟡 Minor

Set the ExecPlan status to COMPLETED.

Update Line 7 to match the completed stages and retrospective.

Triage: [type:docstyle]
📝 Apply status fix
-Status: IN PROGRESS
+Status: COMPLETED
As per coding guidelines, documentation in docs/ must remain accurate and
current when implementation state changes.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.md` at line
7, Change the ExecPlan front-matter/status line from "Status: IN PROGRESS" to
"Status: COMPLETED" and update the stage and retrospective text on that document
to reflect the completed work; locate the "Status" header near the top of the
document (line containing "Status: IN PROGRESS") and replace it with "Status:
COMPLETED", then ensure any subsequent sections that list stages, completed
steps, or the retrospective accurately reflect completion.
docs/developers-guide.md (1)
125-127: ⚠️ Potential issue | 🟡 Minor

Replace “cached install” with “cached installation”.

Fix the noun usage at Line 127.

Triage: [type:grammar]
📝 Apply wording fix
-The installer scripts are idempotent. The first proof run may take longer while
-toolchains and verifier binaries are downloaded; later runs reuse the cached
-install.
+The installer scripts are idempotent. The first proof run may take longer while
+toolchains and verifier binaries are downloaded; later runs reuse the cached
+installation.
As per coding guidelines, docs/**/*.md content must use en-GB-oxendict
spelling and grammar.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/developers-guide.md` around lines 125 - 127, Replace the phrase "cached
install" with "cached installation" in the sentence starting "The first proof
run may take longer..." so the line reads "...later runs reuse the cached
installation."; ensure the docs/developers-guide.md change follows
en-GB-oxendict spelling/grammar conventions.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/whitaker_clones_core/tests/min_hash_lsh_behaviour.rs`:
- Around line 50-60: The parsed tokens from arguments.split_once(',') (the local
variables bands and rows) should be trimmed before calling parse::<usize>() to
avoid failures from surrounding whitespace; update the code that constructs
bands and rows (the let bands = ... and let rows = ... expressions) to call
.trim() on each token prior to parse and keep the existing map_err messages
unchanged.

In `@scripts/run-verus.sh`:
- Line 33: Replace use of the Bash-4-only mapfile plus process substitution with
a Bash-3.2-compatible loader: add a helper function named
set_proof_files_for_group() that initializes PROOF_FILES=() and populates it by
piping proof_files_for_group "$1" into a while IFS= read -r proof_file; do
PROOF_FILES+=("${proof_file}"); done loop, then replace the two occurrences of
"mapfile -t PROOF_FILES < <(proof_files_for_group ...)" with calls to
set_proof_files_for_group all and set_proof_files_for_group "$1" respectively so
PROOF_FILES is populated without using mapfile or process substitution.

---

Duplicate comments:
In `@docs/developers-guide.md`:
- Around line 125-127: Replace the phrase "cached install" with "cached
installation" in the sentence starting "The first proof run may take longer..."
so the line reads "...later runs reuse the cached installation."; ensure the
docs/developers-guide.md change follows en-GB-oxendict spelling/grammar
conventions.

In `@docs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.md`:
- Line 7: Change the ExecPlan front-matter/status line from "Status: IN
PROGRESS" to "Status: COMPLETED" and update the stage and retrospective text on
that document to reflect the completed work; locate the "Status" header near the
top of the document (line containing "Status: IN PROGRESS") and replace it with
"Status: COMPLETED", then ensure any subsequent sections that list stages,
completed steps, or the retrospective accurately reflect completion.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 615f7307-61b4-45e3-a228-6f499d89bcae

📥 Commits

Reviewing files that changed from the base of the PR and between 6ae12a1 and 5fb3af2.

📒 Files selected for processing (16)

Cargo.toml
Makefile
crates/whitaker_clones_core/src/index/kani.rs
crates/whitaker_clones_core/src/index/mod.rs
crates/whitaker_clones_core/src/index/tests.rs
crates/whitaker_clones_core/tests/features/min_hash_lsh.feature
crates/whitaker_clones_core/tests/min_hash_lsh_behaviour.rs
docs/developers-guide.md
docs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.md
docs/roadmap.md
docs/whitaker-clone-detector-design.md
scripts/install-kani.sh
scripts/install-verus.sh
scripts/run-kani.sh
scripts/run-verus.sh
verus/clone_detector_lsh_config.rs

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (1)

scripts/run-kani.sh (1)
57-58: ⚠️ Potential issue | 🔴 Critical

Remove the duplicate closing brace on line 58.

Line 58 contains an extra } that causes a shell parse error. The function run_clone_detector_harnesses is already closed by the brace on line 57.
🐛 Proposed fix
     done
 }
-}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/run-kani.sh` around lines 57 - 58, There is a duplicate closing brace
after the run_clone_detector_harnesses function causing a shell parse error;
remove the extra `}` that immediately follows the existing closing brace for
run_clone_detector_harnesses so the function is closed only once and the script
parses correctly.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/whitaker-clone-detector-design.md`:
- Line 550: Edit the sentence fragment that currently reads "Together these
harnesses assert the" to insert a comma after "Together" so it reads "Together,
these harnesses assert the" to improve readability; locate the string "Together
these harnesses assert the" in the document and update it accordingly.

In `@Makefile`:
- Line 1: Remove the duplicate 'kani' entry from the .PHONY declaration: edit
the .PHONY line that currently lists targets including 'kani' twice and delete
the redundant 'kani' token so each phony target appears only once (look for the
.PHONY line containing help all clean test build release lint fmt check-fmt ...
kani kani-clone-detector and remove the extra 'kani').

---

Duplicate comments:
In `@scripts/run-kani.sh`:
- Around line 57-58: There is a duplicate closing brace after the
run_clone_detector_harnesses function causing a shell parse error; remove the
extra `}` that immediately follows the existing closing brace for
run_clone_detector_harnesses so the function is closed only once and the script
parses correctly.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 398b1895-7c56-4ae4-abf1-4cc73752c81b

📥 Commits

Reviewing files that changed from the base of the PR and between 5fb3af2 and 4a2d47a.

📒 Files selected for processing (16)

Cargo.toml
Makefile
crates/whitaker_clones_core/src/index/kani.rs
crates/whitaker_clones_core/src/index/mod.rs
crates/whitaker_clones_core/src/index/tests.rs
crates/whitaker_clones_core/tests/features/min_hash_lsh.feature
crates/whitaker_clones_core/tests/min_hash_lsh_behaviour.rs
docs/developers-guide.md
docs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.md
docs/roadmap.md
docs/whitaker-clone-detector-design.md
installer/src/main.rs
scripts/install-verus.sh
scripts/run-kani.sh
scripts/run-verus.sh
verus/clone_detector_lsh_config.rs

… roadmap 7.2.4-7.2.5 Add a detailed, living ExecPlan document outlining the implementation, constraints, risks, progress stages, and acceptance criteria for roadmap items 7.2.4 and 7.2.5. This plan covers establishing reusable proof plumbing for the clone-detector token pipeline, proving constructor invariants for LshConfig::new, integrating Verus and Kani sidecar proof tooling, and augmenting test and documentation coverage. It serves as the authoritative blueprint governing the scoped proof split and workflow before implementation begins. Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>

…plicit bounds and cfg handling - Introduce a minimal Kani smoke harness for LshConfig with symbolic but bounded inputs. - Follow Chutoro patterns: keep harnesses close to code, guard with #[cfg(kani)], and document usage. - Update unexpected_cfgs allowlist for cfg(kani) to prevent lint warnings. - Extend Makefile and scripts to run practical Kani targets, leaving room for future exhaustive checks. - Add detailed docs on harness design, unwinding limits, and state-explosion mitigation strategies. This supports roadmap item 7.2.4 by providing a practical local smoke gate for Kani proofs without yet implementing heavier exhaustive runs planned for later roadmap items. Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>

…:new - Added a Verus proof module `verus/clone_detector_lsh_config.rs` that semantically verifies the contract of `LshConfig::new` ensuring positive bands and rows with product equal to MINHASH_SIZE. - Introduced a Kani harness in `crates/whitaker_clones_core/src/index/kani.rs` for bounded symbolic testing of `LshConfig::new` under `cfg(kani)`. - Extended rspec-based test scenarios and unit tests to cover zero rows, invalid non-zero products, and overflow cases. - Added supporting Makefile targets and scripts (`make kani-clone-detector`, `make verus-clone-detector`, install/run scripts) for sidecar proof workflows. - Updated documentation and roadmap to reflect implementation progress on clone-detector proof workflows and LshConfig invariants. - Improved shared tooling in `scripts/install-verus.sh` to strip ANSI escape codes for stable Verus installation. This change establishes a formal verification foundation for clone-detector LSH configuration, enabling future proofs and bounded runtime checks. Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>

…iteration - Added `make verus-clone-detector` and `make kani-clone-detector` targets for targeted proof runs without regressing existing workflow. - Refactored `scripts/run-kani.sh` to run clone-detector harnesses in a loop, improving maintainability. - Updated `scripts/run-verus.sh` to read proof files efficiently with `mapfile`. - Fixed shared Verus installer to strip ANSI escape codes from toolchain hints, ensuring consistent fallback handling across targets. - Adjusted documentation in execplans to align with new proof sidecar workflow and clarify architecture decisions. These changes enhance developer efficiency by supporting focused proof workflow iterations and improving tooling robustness. Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>

…ing scripts - Document new Verus and Kani proof make targets and usage - Explain repository-managed proof tooling and scripts - Provide instructions for Kani installation and execution - Include example commands for running proofs - Fix minor typo in execplans markdown regarding Cargo Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>

…s LshConfig proofs - Added `verify_lsh_config_new_overflow_product` Kani harness to drive overflow path in LshConfig::new. - Updated Kani harness set to include the overflow-specific proof. - Extended Verus proof to mirror runtime branch order including overflow rejection. - Documented the distinction between Verus model and runtime implementation in developer guide. - Clarified that Verus proves an implementation-shaped model, while Kani executes the real constructor code. - Improved proofs for zero bands, zero rows, exact product acceptance, invalid product rejection, and overflow in Verus. This enhances formal coverage of LshConfig constructor ensuring overflow paths are verified without trusted assumptions. Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>

Apply the remaining proof-workflow review fixes by trimming parsed LSH\ninputs, updating proof documentation wording and status, and\nreplacing Bash-4-only mapfile usage in the Verus wrapper.\n\nAlso make the installer bin tests deterministic again by routing\ntheir Dylint dependency checks through injected test-only install\noptions instead of the live repository-first installer path. This\nkeeps the validated worktree stable before rebasing onto main.

The script scripts/run-kani.sh had an unnecessary closing brace '}' removed to fix shell syntax correctness. Additionally, a minor punctuation fix was applied in docs/whitaker-clone-detector-design.md, and the Makefile's .PHONY declaration was cleaned up by removing duplicate 'kani' target. Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@verus/clone_detector_lsh_config.rs`:
- Around line 8-12: The sidecar currently proves lsh_config_new_result but is
not linked to the runtime constructor LshConfig::new (nor to validate_product),
so the proof is disconnected; either add Verus linkage by declaring an
external_fn_specification or external_body that connects lsh_config_new_result
to LshConfig::new (and any validate_product calls used by the constructor),
ensuring the sidecar’s spec references the same function signature and types, or
alternatively demote the module to a pure contract record by removing the lemma
and using a record/abstract spec that does not pretend to verify the concrete
constructor; update references to LshConfig::new, validate_product, and
lsh_config_new_result accordingly.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 03e899d7-ff8a-423f-aa73-5836d91e47af

📥 Commits

Reviewing files that changed from the base of the PR and between 4a2d47a and 6705a99.

📒 Files selected for processing (15)

Cargo.toml
Makefile
crates/whitaker_clones_core/src/index/kani.rs
crates/whitaker_clones_core/src/index/mod.rs
crates/whitaker_clones_core/src/index/tests.rs
crates/whitaker_clones_core/tests/features/min_hash_lsh.feature
crates/whitaker_clones_core/tests/min_hash_lsh_behaviour.rs
docs/developers-guide.md
docs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.md
docs/roadmap.md
docs/whitaker-clone-detector-design.md
scripts/install-verus.sh
scripts/run-kani.sh
scripts/run-verus.sh
verus/clone_detector_lsh_config.rs

leynos · 2026-04-17T10:39:40Z

@coderabbitai Please suggest a fix for the following concern and provide an AI coding agent prompt for the fix:

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Testing	⚠️ Warning	Core invariants tested via unit and BDD tests in CI, but shell scripts and proof tooling infrastructure lack automated test coverage.	Implement BATS tests for shell scripts and integrate Kani/Verus proof execution into CI as required or documented optional verification steps.

coderabbitai · 2026-04-17T10:41:04Z

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
echo "=== CI workflow files ==="
fd -e yml -e yaml -p '.github/workflows' --exec cat {} \;

echo ""
echo "=== Existing BATS usage ==="
fd -e bats -e bash -g '*.bats' --exec echo "Found: {}" \;
rg -l 'bats' --type yaml --type bash 2>/dev/null || true
fd -g 'bats' 2>/dev/null | head -20

echo ""
echo "=== scripts directory tree ==="
tree scripts/ 2>/dev/null || ls -la scripts/

echo ""
echo "=== run-kani.sh ==="
cat scripts/run-kani.sh 2>/dev/null || fd -g 'run-kani.sh' --exec cat {} \;

echo ""
echo "=== run-verus.sh ==="
cat scripts/run-verus.sh 2>/dev/null || fd -g 'run-verus.sh' --exec cat {} \;

echo ""
echo "=== install-kani.sh ==="
cat scripts/install-kani.sh 2>/dev/null || fd -g 'install-kani.sh' --exec cat {} \;

echo ""
echo "=== Makefile (relevant targets) ==="
grep -n 'kani\|verus\|bats\|test' Makefile 2>/dev/null | head -40

echo ""
echo "=== Any existing test/ or tests/ at repo root ==="
ls -la tests/ 2>/dev/null || echo "no root tests dir"
fd -g 'test*' --max-depth 2 -t d 2>/dev/null | head -10

Length of output: 12779

@leynos: Here is a concrete fix plan and a ready-to-use AI coding agent prompt.

Fix overview

The warning has two distinct sub-problems:

Sub-problem	Fix
Shell scripts lack automated tests	Add BATS unit tests under `tests/scripts/` that stub the heavy toolchain binaries and verify the scripts' argument-dispatch and error-handling logic
Proof tooling has no CI gate	Add an opt-in GitHub Actions workflow (triggered by `workflow_dispatch` and a weekly schedule) that installs the sidecar tools and runs `make kani-clone-detector` and `make verus-clone-detector`

What BATS tests should cover

tests/scripts/run-verus.bats

proof_files_for_group decomposition emits the two decomposition proof files
proof_files_for_group clone-detector emits verus/clone_detector_lsh_config.rs
proof_files_for_group all emits all three files
An unknown group name exits with status 1 and an error message on stderr
Passing clone-detector as the first positional argument restricts PROOF_FILES to the clone-detector set before invoking the stub verus binary
Passing an explicit file path bypasses group selection

tests/scripts/run-kani.bats

No arguments → both run_decomposition_harnesses and run_clone_detector_harnesses are called
clone-detector → only the three clone-detector harnesses are invoked, each with --harness <name>
decomposition / common → only the decomposition path is taken
Unknown first argument → treated as a decomposition harness filter (falls through to default case)

tests/scripts/install-kani.bats

When the kani-driver binary already exists at the expected cache path, curl and tar are not called (idempotency)
platform_asset_suffix emits the correct suffix for each uname -s/uname -m combination
An unsupported platform exits 1 with a diagnostic on stderr

AI coding agent prompt

## Task: Add BATS tests for proof-tooling shell scripts and a CI proof workflow

### Context
Repository: whitaker (Rust workspace)
Branch: PR `#194` branch

The following shell scripts under `scripts/` have no automated test coverage:
- `scripts/run-kani.sh`   — dispatches to clone-detector or decomposition Kani harness groups
- `scripts/run-verus.sh`  — dispatches to proof-file groups via `proof_files_for_group`
- `scripts/install-kani.sh` — idempotent download/cache of a pinned Kani release

There is no CI gate that runs `make kani-clone-detector` or `make verus-clone-detector`.

### Spelling/style constraints
- Use en-GB-oxendict spelling throughout all prose and comments (-ise/-yse/-our).
- Do NOT increase cyclomatic complexity unnecessarily.

### Requirements

#### 1. BATS test infrastructure

1.1  Add a `tests/scripts/` directory with a `README.md` explaining that BATS must be
     installed (`brew install bats-core` / `apt-get install bats`) to run these tests.

1.2  Add `tests/scripts/run-verus.bats` covering:
     - `proof_files_for_group decomposition` prints exactly the two decomposition `.rs` paths.
     - `proof_files_for_group clone-detector` prints exactly `verus/clone_detector_lsh_config.rs`.
     - `proof_files_for_group all` prints all three paths.
     - An unknown group name exits with status 1 and writes to stderr.
     - Invoking `scripts/run-verus.sh clone-detector` with a stub `verus` binary (one that
       records its arguments and exits 0) calls that stub with only the clone-detector proof
       file and no other files.
     - Invoking `scripts/run-verus.sh <explicit-path>` passes that exact path to the stub.

1.3  Add `tests/scripts/run-kani.bats` covering:
     - No arguments: the stub `cargo-kani` is called for each of the three clone-detector
       harnesses AND for the decomposition group.
     - `clone-detector`: only the three named `--harness` invocations occur.
     - `decomposition` or `common`: only the decomposition path is taken.
     - An unknown first argument is treated as a decomposition harness filter.

1.4  Add `tests/scripts/install-kani.bats` covering:
     - When `kani-driver` already exists at the expected cache path, neither `curl` nor `tar`
       is executed (idempotency).
     - `platform_asset_suffix` emits `x86_64-unknown-linux-gnu` on Linux x86_64.
     - An unsupported `uname` combination exits 1.

     Use BATS helper libraries `bats-support` and `bats-assert` (submodule or vendored under
     `tests/scripts/test_helper/`). Stub external tools by prepending a `$BATS_TEST_TMPDIR/bin`
     directory containing shell stubs onto `PATH`; do not mock via `eval` or global overrides.

#### 2. Makefile target

Add a `test-scripts` phony target to `Makefile` that runs:

```make
test-scripts: ## Run BATS tests for shell scripts
	`@command` -v bats >/dev/null || { echo "Install bats-core to run script tests"; exit 1; }
	bats tests/scripts/
```

Add `test-scripts` to the `.PHONY` list and call it from inside the `test` target when
`command -v bats` succeeds (non-fatal: print a notice and skip if BATS is absent).

#### 3. CI workflow — optional proof gate

Add `.github/workflows/proofs.yml`:

```yaml
name: Proofs
on:
  workflow_dispatch:
  schedule:
    - cron: '0 3 * * 1'   # Every Monday at 03:00 UTC
```

The workflow should have a single job `proof` that:
- Runs on `ubuntu-latest`.
- Checks out the repository.
- Installs a stable Rust toolchain with `rustup`.
- Runs `make kani-clone-detector` (which internally calls `scripts/install-kani.sh` then
  `scripts/run-kani.sh clone-detector`).
- Runs `make verus-clone-detector` (which internally calls `scripts/install-verus.sh` then
  `scripts/run-verus.sh clone-detector`).
- Caches both `~/.cache/whitaker/kani` and `~/.cache/whitaker/verus` using
  `actions/cache` keyed on the pinned version strings (`KANI_VERSION`, the Verus version)
  so repeated runs do not re-download.

The workflow must NOT be part of the required status checks — it is an optional verification
step. Document this in `docs/developers-guide.md` under the existing "Proof workflows" section
by adding a short paragraph noting the `Proofs` workflow and how to trigger it manually.

#### 4. Quality gates
- All new `.bats` files must pass `shellcheck` (warn level).
- All new `.sh` additions/edits must pass `shellcheck -S warning`.
- BATS tests must be self-contained: no network calls, no Kani/Verus binaries required.

…194) * docs(execplans): add comprehensive execplan for clone-detector proofs roadmap 7.2.4-7.2.5 Add a detailed, living ExecPlan document outlining the implementation, constraints, risks, progress stages, and acceptance criteria for roadmap items 7.2.4 and 7.2.5. This plan covers establishing reusable proof plumbing for the clone-detector token pipeline, proving constructor invariants for LshConfig::new, integrating Verus and Kani sidecar proof tooling, and augmenting test and documentation coverage. It serves as the authoritative blueprint governing the scoped proof split and workflow before implementation begins. Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com> * feat(proofs): add practical Kani harnesses for clone-detector with explicit bounds and cfg handling - Introduce a minimal Kani smoke harness for LshConfig with symbolic but bounded inputs. - Follow Chutoro patterns: keep harnesses close to code, guard with #[cfg(kani)], and document usage. - Update unexpected_cfgs allowlist for cfg(kani) to prevent lint warnings. - Extend Makefile and scripts to run practical Kani targets, leaving room for future exhaustive checks. - Add detailed docs on harness design, unwinding limits, and state-explosion mitigation strategies. This supports roadmap item 7.2.4 by providing a practical local smoke gate for Kani proofs without yet implementing heavier exhaustive runs planned for later roadmap items. Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com> * feat(clone-detector): add Verus proof and Kani harness for LshConfig::new - Added a Verus proof module `verus/clone_detector_lsh_config.rs` that semantically verifies the contract of `LshConfig::new` ensuring positive bands and rows with product equal to MINHASH_SIZE. - Introduced a Kani harness in `crates/whitaker_clones_core/src/index/kani.rs` for bounded symbolic testing of `LshConfig::new` under `cfg(kani)`. - Extended rspec-based test scenarios and unit tests to cover zero rows, invalid non-zero products, and overflow cases. - Added supporting Makefile targets and scripts (`make kani-clone-detector`, `make verus-clone-detector`, install/run scripts) for sidecar proof workflows. - Updated documentation and roadmap to reflect implementation progress on clone-detector proof workflows and LshConfig invariants. - Improved shared tooling in `scripts/install-verus.sh` to strip ANSI escape codes for stable Verus installation. This change establishes a formal verification foundation for clone-detector LSH configuration, enabling future proofs and bounded runtime checks. Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com> * feat(clone-detector): add specific Verus and Kani targets for faster iteration - Added `make verus-clone-detector` and `make kani-clone-detector` targets for targeted proof runs without regressing existing workflow. - Refactored `scripts/run-kani.sh` to run clone-detector harnesses in a loop, improving maintainability. - Updated `scripts/run-verus.sh` to read proof files efficiently with `mapfile`. - Fixed shared Verus installer to strip ANSI escape codes from toolchain hints, ensuring consistent fallback handling across targets. - Adjusted documentation in execplans to align with new proof sidecar workflow and clarify architecture decisions. These changes enhance developer efficiency by supporting focused proof workflow iterations and improving tooling robustness. Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com> * docs(proof-workflows): add documentation for proof workflows and tooling scripts - Document new Verus and Kani proof make targets and usage - Explain repository-managed proof tooling and scripts - Provide instructions for Kani installation and execution - Include example commands for running proofs - Fix minor typo in execplans markdown regarding Cargo Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com> * feat(kaniproofs): add overflow-specific Kani harness and tighten Verus LshConfig proofs - Added `verify_lsh_config_new_overflow_product` Kani harness to drive overflow path in LshConfig::new. - Updated Kani harness set to include the overflow-specific proof. - Extended Verus proof to mirror runtime branch order including overflow rejection. - Documented the distinction between Verus model and runtime implementation in developer guide. - Clarified that Verus proves an implementation-shaped model, while Kani executes the real constructor code. - Improved proofs for zero bands, zero rows, exact product acceptance, invalid product rejection, and overflow in Verus. This enhances formal coverage of LshConfig constructor ensuring overflow paths are verified without trusted assumptions. Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com> * Fix proof follow-ups and installer test hooks Apply the remaining proof-workflow review fixes by trimming parsed LSH\ninputs, updating proof documentation wording and status, and\nreplacing Bash-4-only mapfile usage in the Verus wrapper.\n\nAlso make the installer bin tests deterministic again by routing\ntheir Dylint dependency checks through injected test-only install\noptions instead of the live repository-first installer path. This\nkeeps the validated worktree stable before rebasing onto main. * style(scripts): remove extraneous closing brace in run-kani.sh The script scripts/run-kani.sh had an unnecessary closing brace '}' removed to fix shell syntax correctness. Additionally, a minor punctuation fix was applied in docs/whitaker-clone-detector-design.md, and the Makefile's .PHONY declaration was cleaned up by removing duplicate 'kani' target. Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com> --------- Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>

This comment was marked as outdated.

Sign in to view

leynos changed the title ~~Plan Verus proofs for LshConfig::new invariants (7.2.4/7.2.5)~~ Add Verus proof and Kani harness for LshConfig::new invariants Apr 9, 2026

leynos marked this pull request as ready for review April 10, 2026 01:04

coderabbitai bot added the Roadmap label Apr 10, 2026

chatgpt-codex-connector bot reviewed Apr 10, 2026

View reviewed changes

Comment thread scripts/run-kani.sh Outdated

sourcery-ai bot reviewed Apr 10, 2026

View reviewed changes

Comment thread scripts/run-kani.sh

Comment thread docs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.md Outdated

Comment thread docs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.md Outdated

coderabbitai bot requested changes Apr 10, 2026

View reviewed changes

Comment thread docs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.md Outdated

Comment thread docs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.md Outdated

Comment thread scripts/run-verus.sh Outdated

This comment was marked as outdated.

Sign in to view

leynos changed the title ~~Add Verus proof and Kani harness for LshConfig::new invariants~~ Plan Verus/Kani proofs and tooling for LshConfig::new Apr 10, 2026

coderabbitai bot requested changes Apr 10, 2026

View reviewed changes

Comment thread docs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.md

Comment thread scripts/run-kani.sh

leynos changed the title ~~Plan Verus/Kani proofs and tooling for LshConfig::new~~ Add Verus clone-detector proof and Kani harness for LshConfig::new Apr 10, 2026

This comment was marked as outdated.

Sign in to view

coderabbitai bot approved these changes Apr 10, 2026

View reviewed changes

leynos changed the title ~~Add Verus clone-detector proof and Kani harness for LshConfig::new~~ Plan Verus Proofs Workflow and Tests Apr 11, 2026

This comment was marked as outdated.

Sign in to view

coderabbitai bot requested changes Apr 11, 2026

View reviewed changes

Comment thread docs/developers-guide.md Outdated

Comment thread docs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.md Outdated

devboxerhub bot force-pushed the add-verus-proofs-lsh-new-invariants-5sefjo branch from 6ae12a1 to 5fb3af2 Compare April 13, 2026 00:17

This comment was marked as outdated.

Sign in to view

leynos changed the title ~~Plan Verus Proofs Workflow and Tests~~ Add Verus/Kani proof workflow, harnesses, and ExecPlan Apr 13, 2026

coderabbitai bot requested changes Apr 13, 2026

View reviewed changes

Comment thread crates/whitaker_clones_core/tests/min_hash_lsh_behaviour.rs

Comment thread scripts/run-verus.sh Outdated

devboxerhub bot force-pushed the add-verus-proofs-lsh-new-invariants-5sefjo branch from 5fb3af2 to 4a2d47a Compare April 14, 2026 20:32

This comment was marked as outdated.

Sign in to view

coderabbitai bot requested changes Apr 14, 2026

View reviewed changes

Comment thread docs/whitaker-clone-detector-design.md Outdated

Comment thread Makefile Outdated

This comment was marked as outdated.

Sign in to view

leynos changed the title ~~Add Verus/Kani proof workflow, harnesses, and ExecPlan~~ Add Verus/Kani proofs and ExecPlan for LSH invariants (7.2.4/7.2.5) Apr 15, 2026

coderabbitai bot approved these changes Apr 15, 2026

View reviewed changes

leynos and others added 8 commits April 15, 2026 16:23

devboxerhub bot force-pushed the add-verus-proofs-lsh-new-invariants-5sefjo branch from 512d978 to 6705a99 Compare April 15, 2026 16:28

codescene-delta-analysis bot approved these changes Apr 15, 2026

View reviewed changes

coderabbitai bot requested changes Apr 15, 2026

View reviewed changes

Comment thread verus/clone_detector_lsh_config.rs

coderabbitai bot approved these changes Apr 17, 2026

View reviewed changes

leynos merged commit aeec8b8 into main Apr 17, 2026
5 checks passed

leynos deleted the add-verus-proofs-lsh-new-invariants-5sefjo branch April 17, 2026 11:51

Conversation

leynos commented Apr 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What this PR adds

Scope and constraints

Implementation plan (highlights)

Validation plan (evidence to collect)

Acceptance criteria

Rationale

Related guidance

Progress

Surprises & Discoveries

Task

Uh oh!

coderabbitai bot commented Apr 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Key changes

Review findings

Scope

Walkthrough

Changes

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

This comment was marked as outdated.

Uh oh!

sourcery-ai bot commented Apr 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviewer's Guide

Sequence diagram for kani-clone-detector bounded LshConfig::new verification

Class diagram for LshConfig, IndexError, and new Kani harness module

File-Level Changes

Interacting with Sourcery

Customizing Your Experience

Getting Help

Uh oh!

This comment was marked as outdated.

Uh oh!

This comment was marked as outdated.

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

sourcery-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

This comment was marked as outdated.

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

This comment was marked as outdated.

Uh oh!

leynos commented Apr 11, 2026

❌ Failed checks (1 warning)

Uh oh!

coderabbitai bot commented Apr 11, 2026

Uh oh!

leynos commented Apr 11, 2026

Uh oh!

coderabbitai bot commented Apr 11, 2026

Kani — substantive and well-founded ✅

Verus — self-referential and implementation-free ⚠️

Summary

leynos commented Apr 8, 2026 •

edited

Loading

coderabbitai bot commented Apr 8, 2026 •

edited

Loading

sourcery-ai bot commented Apr 8, 2026 •

edited

Loading