Add Verus/Kani proofs and ExecPlan for LSH invariants (7.2.4/7.2.5)#194
Add Verus/Kani proofs and ExecPlan for LSH invariants (7.2.4/7.2.5)#194
Conversation
SummaryThis PR introduces formalised proof tooling for Key changesProof artefacts:
Build and scripting:
Documentation and testing:
Review findingsAutomated review identified gaps:
The PR author has completed documentation updates; recommendations for Kani overflow harness extension and Verus implementation linkage remain as follow-up items. ScopeProof tooling remains opt-in via sidecar scripts and Make targets; runtime API shape, WalkthroughThis pull request adds formal verification infrastructure for the Changes
Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 6 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (6 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
Reviewer's GuideAdds Verus and Kani sidecar proof workflows for the clone-detector’s LshConfig::new invariants, including new proof and harness files, grouped proof targets/scripts, extended unit/BDD tests for edge cases, and documentation/roadmap updates to reflect roadmap items 7.2.4 and 7.2.5. Sequence diagram for kani-clone-detector bounded LshConfig::new verificationsequenceDiagram
actor Dev
participant Make as Makefile
participant RunKani as scripts_run_kani_sh
participant InstallKani as scripts_install_kani_sh
participant CargoKani as cargo_kani
participant Harness as index_kani_rs
participant Lsh as LshConfig_new
Dev->>Make: make kani-clone-detector
Make->>RunKani: ./scripts/run-kani.sh clone-detector
RunKani->>InstallKani: resolve and install cargo-kani
InstallKani-->>RunKani: kani_bin path
RunKani->>CargoKani: cargo-kani --manifest-path ...
CargoKani->>Harness: run verify_lsh_config_new_smoke
Harness->>Lsh: LshConfig::new(32, 4)
Lsh-->>Harness: Ok(LshConfig)
Harness-->>CargoKani: assertions over config
CargoKani->>Harness: run verify_lsh_config_new_symbolic
Harness->>Harness: choose bands, rows with assumptions
Harness->>Lsh: LshConfig::new(bands, rows)
Lsh-->>Harness: Ok or Err(IndexError)
Harness-->>CargoKani: assertions over result and error shapes
CargoKani-->>RunKani: verification success/failure
RunKani-->>Dev: exit status
Class diagram for LshConfig, IndexError, and new Kani harness moduleclassDiagram
class LshConfig {
+bands() usize
+rows() usize
+new(bands usize, rows usize) Result~LshConfig, IndexError~
}
class IndexError {
<<enum>>
ZeroBands
ZeroRows
InvalidBandRowProduct
EmptyFingerprintSet
}
class MINHASH_SIZE {
<<const>>
+VALUE usize
}
class index_kani {
<<module>>
+verify_lsh_config_new_smoke()
+verify_lsh_config_new_symbolic()
}
LshConfig --> IndexError : returns
LshConfig ..> MINHASH_SIZE : uses
index_kani ..> LshConfig : calls_new_and_accessors
index_kani ..> IndexError : pattern_matches
index_kani ..> MINHASH_SIZE : uses
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 862087e28e
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
Hey - I've found 3 issues
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="scripts/run-kani.sh" line_range="9-15" />
<code_context>
+KANI_BIN=$("${SCRIPT_DIR}/install-kani.sh")
+MANIFEST_PATH="${REPO_ROOT}/crates/whitaker_clones_core/Cargo.toml"
+
+run_clone_detector_harnesses() {
+ "${KANI_BIN}" \
+ --manifest-path "${MANIFEST_PATH}" \
+ --default-unwind 4 \
+ --harness verify_lsh_config_new_smoke \
+ --harness verify_lsh_config_new_symbolic
+}
+
+if [ $# -eq 0 ]; then
+ run_clone_detector_harnesses
+ exit 0
+fi
+
+case "$1" in
+ clone-detector)
+ shift
+ run_clone_detector_harnesses "$@"
+ ;;
+ *)
</code_context>
<issue_to_address>
**suggestion:** The helper ignores forwarded arguments, which is slightly misleading given the `"$@"` in the call site.
`run_clone_detector_harnesses` takes no parameters, but in the `clone-detector` branch you call it as `run_clone_detector_harnesses "$@"`, which is a no-op and potentially confusing. Either remove `"$@"` at the call site, or update `run_clone_detector_harnesses` to accept and forward options (e.g., extra `kani` flags) if that’s the intended design.
```suggestion
run_clone_detector_harnesses() {
# Forward any additional arguments as extra Kani flags.
"${KANI_BIN}" \
--manifest-path "${MANIFEST_PATH}" \
--default-unwind 4 \
--harness verify_lsh_config_new_smoke \
--harness verify_lsh_config_new_symbolic \
"$@"
}
```
</issue_to_address>
### Comment 2
<location path="docs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.md" line_range="22" />
<code_context>
+This work matters because roadmap item 7.2.2 already made `LshConfig` a hard
+runtime boundary for MinHash and locality-sensitive hashing (LSH). If that
+boundary drifts, every later proof and every later token-pass candidate check
+inherits the bug. ADR 002 explicitly assigns this constructor to Verus and
+reserves Kani for bounded behavioural checks over the same clone-detector
+crate, so 7.2.4 and 7.2.5 establish the proof split that later items 7.2.6,
</code_context>
<issue_to_address>
**issue (review_instructions):** ADR is introduced as an acronym without being expanded on first use.
Please expand ADR on its first appearance, for example, "Architecture Decision Record (ADR) 002 explicitly assigns this constructor to Verus…" to satisfy the acronym definition requirement.
<details>
<summary>Review instructions:</summary>
**Path patterns:** `**/*.md`
**Instructions:**
Define uncommon acronyms on first use.
</details>
</issue_to_address>
### Comment 3
<location path="docs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.md" line_range="182" />
<code_context>
+ `crates/whitaker_clones_core/tests/features/min_hash_lsh.feature` already
+ exercises zero bands. The implementation should extend these tests rather
+ than replace them.
+- `Makefile` currently exposes only a global `verus` target. If we want fast
+ clone-detector-only iteration without regressing the existing proof workflow,
+ the cleanest approach is to add clone-detector-specific targets alongside the
</code_context>
<issue_to_address>
**issue (review_instructions):** The sentence uses the first-person plural pronoun "we", which the style guide forbids.
To comply with the pronoun rule, consider rephrasing this line, for example: "If fast clone-detector-only iteration is desired, the cleanest approach is to add clone-detector-specific targets alongside the current umbrella target."
<details>
<summary>Review instructions:</summary>
**Path patterns:** `**/*.md`
**Instructions:**
Avoid 2nd person or 1st person pronouns ("I", "you", "we").
</details>
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.md`:
- Line 321: Replace the two-word phrase "check list" with the single word
"checklist" in the sentence that references the crate's `unexpected_cfgs` check
list so it reads "crate's `unexpected_cfgs` checklist"; locate the occurrence
near the text mentioning `unexpected_cfgs` and update the spelling accordingly.
- Line 311: Replace the British spelling "specialised" with the Oxford English
Dictionary preferred "-ize" form "specialized" in the sentence referencing
`unexpected_cfgs` considerations (the phrase "`unexpected_cfgs` considerations
when introducing specialised build modes."). Update that exact token so the doc
uses "specialized" while keeping the surrounding wording intact.
In `@scripts/run-kani.sh`:
- Around line 22-26: The case branch calls run_clone_detector_harnesses with an
unused argument passthrough ("$@"); remove the "$@" so the call is simply
run_clone_detector_harnesses. Update the case block where
run_clone_detector_harnesses is invoked to stop forwarding parameters and ensure
no other callers rely on arguments being passed from this branch.
In `@scripts/run-verus.sh`:
- Line 33: The array assignment using command substitution for PROOF_FILES is
unsafe and triggers SC2207; replace both occurrences where PROOF_FILES is set
from proof_files_for_group (the global case and the case inside the
option-handling branch) to use mapfile -t with process substitution: run
proof_files_for_group through process substitution and load into PROOF_FILES
with mapfile -t to avoid word-splitting and globbing issues (use mapfile -t
PROOF_FILES < <(proof_files_for_group all) for the top-level assignment and
mapfile -t PROOF_FILES < <(proof_files_for_group "$1") for the branch).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: b90765d1-53b6-481a-a9cc-16fc2fb38383
📒 Files selected for processing (15)
Cargo.tomlMakefilecrates/whitaker_clones_core/src/index/kani.rscrates/whitaker_clones_core/src/index/mod.rscrates/whitaker_clones_core/src/index/tests.rscrates/whitaker_clones_core/tests/features/min_hash_lsh.featurecrates/whitaker_clones_core/tests/min_hash_lsh_behaviour.rsdocs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.mddocs/roadmap.mddocs/whitaker-clone-detector-design.mdscripts/install-kani.shscripts/install-verus.shscripts/run-kani.shscripts/run-verus.shverus/clone_detector_lsh_config.rs
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.md`:
- Around line 213-216: Replace the phrase "sidecar proof tooling independent
from ordinary Cargo development" with "sidecar proof tooling independent of
ordinary Cargo development" in the sentence that mentions Makefile targets and
wrapper scripts and references ADR 002 (the sentence containing "sidecar proof
tooling independent from ordinary Cargo"); ensure the docs/markdown uses
en-GB-oxendict spelling and save the updated line.
In `@scripts/run-kani.sh`:
- Around line 27-31: The case branch for "clone-detector" drops all remaining
CLI flags by calling run_clone_detector_harnesses without arguments; change the
invocation in that branch to forward the shifted arguments to
run_clone_detector_harnesses (i.e., call run_clone_detector_harnesses "$@") so
the function (defined as run_clone_detector_harnesses) receives and can forward
the flags to the Kani harness execution.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 20c34ef5-409e-406d-aaa3-8c5cec9faef8
📒 Files selected for processing (3)
docs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.mdscripts/run-kani.shscripts/run-verus.sh
|
@coderabbitai Have the following now been resolved? Annul any requirements that violate the en-GB-oxendict spelling (-ize / -yse / -our) conventions (for example a request to replace "normalize" with "normalise" or "artefact" with "artifact"), or where the requirement unnecessarily increases cyclomatic complexity. ❌ Failed checks (1 warning)
|
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 4266
Nothing further is needed for this check. |
|
@coderabbitai are you satisfied that the introduced proof is substantive, rigorous, and well-founded? |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 7587 🏁 Script executed: Length of output: 146186
Kani — substantive and well-founded ✅
One genuine gap: inputs where Verus — self-referential and implementation-free
|
| Artefact | Calls real code | Symbolic | Proof content | Gap |
|---|---|---|---|---|
| Verus | ✗ | ✗ | Tautological over spec definition | No implementation linkage |
| Kani smoke | ✓ | ✗ | Single concrete path | Narrow |
| Kani symbolic | ✓ | ✓ | Full match exhaustion over [0,128]² |
Overflow inputs uncovered |
The Kani harness is the only artefact that actually verifies the implementation, and it does so substantively within its declared bounds. Two actions would strengthen the overall proof story:
- Add a Kani harness for unconstrained
bandsandrows(removing the<= MINHASH_SIZEassumptions) to cover thechecked_muloverflow path invalidate_product. - Either extend the Verus proof to link the spec to the implementation via
external_fn_specification, or document clearly in the design notes that the Verus file is a contract record rather than an implementation proof (the latter already appears partially in the design doc but could be stated more plainly alongside the tautology concern).
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/developers-guide.md`:
- Around line 125-127: Replace the phrase "cached install" with "cached
installation" in the sentence "The first proof run may take longer while
toolchains and verifier binaries are downloaded; later runs reuse the cached
install." to correct the noun usage and conform to en-GB-oxendict
spelling/grammar in docs; update the sentence to read "...later runs reuse the
cached installation." ensuring consistency in docs/*.md.
In `@docs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.md`:
- Line 7: Update the top matter "Status" field from "IN PROGRESS" to "COMPLETED"
so the document header matches the body (all stages checked in the stages
section and the retrospective records finished outcomes); specifically change
the Status value on the header line that currently reads "Status: IN PROGRESS"
to "Status: COMPLETED".
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 7f051490-5299-4e5f-bac2-bb955a8cb246
📒 Files selected for processing (6)
crates/whitaker_clones_core/src/index/kani.rsdocs/developers-guide.mddocs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.mddocs/whitaker-clone-detector-design.mdscripts/run-kani.shverus/clone_detector_lsh_config.rs
6ae12a1 to
5fb3af2
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (2)
docs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.md (1)
7-7:⚠️ Potential issue | 🟡 MinorSet the ExecPlan status to
COMPLETED.Update Line 7 to match the completed stages and retrospective.
Triage:
[type:docstyle]📝 Apply status fix
-Status: IN PROGRESS +Status: COMPLETEDAs per coding guidelines, documentation in
docs/must remain accurate and
current when implementation state changes.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.md` at line 7, Change the ExecPlan front-matter/status line from "Status: IN PROGRESS" to "Status: COMPLETED" and update the stage and retrospective text on that document to reflect the completed work; locate the "Status" header near the top of the document (line containing "Status: IN PROGRESS") and replace it with "Status: COMPLETED", then ensure any subsequent sections that list stages, completed steps, or the retrospective accurately reflect completion.docs/developers-guide.md (1)
125-127:⚠️ Potential issue | 🟡 MinorReplace “cached install” with “cached installation”.
Fix the noun usage at Line 127.
Triage:
[type:grammar]📝 Apply wording fix
-The installer scripts are idempotent. The first proof run may take longer while -toolchains and verifier binaries are downloaded; later runs reuse the cached -install. +The installer scripts are idempotent. The first proof run may take longer while +toolchains and verifier binaries are downloaded; later runs reuse the cached +installation.As per coding guidelines,
docs/**/*.mdcontent must use en-GB-oxendict
spelling and grammar.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/developers-guide.md` around lines 125 - 127, Replace the phrase "cached install" with "cached installation" in the sentence starting "The first proof run may take longer..." so the line reads "...later runs reuse the cached installation."; ensure the docs/developers-guide.md change follows en-GB-oxendict spelling/grammar conventions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/whitaker_clones_core/tests/min_hash_lsh_behaviour.rs`:
- Around line 50-60: The parsed tokens from arguments.split_once(',') (the local
variables bands and rows) should be trimmed before calling parse::<usize>() to
avoid failures from surrounding whitespace; update the code that constructs
bands and rows (the let bands = ... and let rows = ... expressions) to call
.trim() on each token prior to parse and keep the existing map_err messages
unchanged.
In `@scripts/run-verus.sh`:
- Line 33: Replace use of the Bash-4-only mapfile plus process substitution with
a Bash-3.2-compatible loader: add a helper function named
set_proof_files_for_group() that initializes PROOF_FILES=() and populates it by
piping proof_files_for_group "$1" into a while IFS= read -r proof_file; do
PROOF_FILES+=("${proof_file}"); done loop, then replace the two occurrences of
"mapfile -t PROOF_FILES < <(proof_files_for_group ...)" with calls to
set_proof_files_for_group all and set_proof_files_for_group "$1" respectively so
PROOF_FILES is populated without using mapfile or process substitution.
---
Duplicate comments:
In `@docs/developers-guide.md`:
- Around line 125-127: Replace the phrase "cached install" with "cached
installation" in the sentence starting "The first proof run may take longer..."
so the line reads "...later runs reuse the cached installation."; ensure the
docs/developers-guide.md change follows en-GB-oxendict spelling/grammar
conventions.
In `@docs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.md`:
- Line 7: Change the ExecPlan front-matter/status line from "Status: IN
PROGRESS" to "Status: COMPLETED" and update the stage and retrospective text on
that document to reflect the completed work; locate the "Status" header near the
top of the document (line containing "Status: IN PROGRESS") and replace it with
"Status: COMPLETED", then ensure any subsequent sections that list stages,
completed steps, or the retrospective accurately reflect completion.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 615f7307-61b4-45e3-a228-6f499d89bcae
📒 Files selected for processing (16)
Cargo.tomlMakefilecrates/whitaker_clones_core/src/index/kani.rscrates/whitaker_clones_core/src/index/mod.rscrates/whitaker_clones_core/src/index/tests.rscrates/whitaker_clones_core/tests/features/min_hash_lsh.featurecrates/whitaker_clones_core/tests/min_hash_lsh_behaviour.rsdocs/developers-guide.mddocs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.mddocs/roadmap.mddocs/whitaker-clone-detector-design.mdscripts/install-kani.shscripts/install-verus.shscripts/run-kani.shscripts/run-verus.shverus/clone_detector_lsh_config.rs
5fb3af2 to
4a2d47a
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (1)
scripts/run-kani.sh (1)
57-58:⚠️ Potential issue | 🔴 CriticalRemove the duplicate closing brace on line 58.
Line 58 contains an extra
}that causes a shell parse error. The functionrun_clone_detector_harnessesis already closed by the brace on line 57.🐛 Proposed fix
done } -}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/run-kani.sh` around lines 57 - 58, There is a duplicate closing brace after the run_clone_detector_harnesses function causing a shell parse error; remove the extra `}` that immediately follows the existing closing brace for run_clone_detector_harnesses so the function is closed only once and the script parses correctly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/whitaker-clone-detector-design.md`:
- Line 550: Edit the sentence fragment that currently reads "Together these
harnesses assert the" to insert a comma after "Together" so it reads "Together,
these harnesses assert the" to improve readability; locate the string "Together
these harnesses assert the" in the document and update it accordingly.
In `@Makefile`:
- Line 1: Remove the duplicate 'kani' entry from the .PHONY declaration: edit
the .PHONY line that currently lists targets including 'kani' twice and delete
the redundant 'kani' token so each phony target appears only once (look for the
.PHONY line containing help all clean test build release lint fmt check-fmt ...
kani kani-clone-detector and remove the extra 'kani').
---
Duplicate comments:
In `@scripts/run-kani.sh`:
- Around line 57-58: There is a duplicate closing brace after the
run_clone_detector_harnesses function causing a shell parse error; remove the
extra `}` that immediately follows the existing closing brace for
run_clone_detector_harnesses so the function is closed only once and the script
parses correctly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 398b1895-7c56-4ae4-abf1-4cc73752c81b
📒 Files selected for processing (16)
Cargo.tomlMakefilecrates/whitaker_clones_core/src/index/kani.rscrates/whitaker_clones_core/src/index/mod.rscrates/whitaker_clones_core/src/index/tests.rscrates/whitaker_clones_core/tests/features/min_hash_lsh.featurecrates/whitaker_clones_core/tests/min_hash_lsh_behaviour.rsdocs/developers-guide.mddocs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.mddocs/roadmap.mddocs/whitaker-clone-detector-design.mdinstaller/src/main.rsscripts/install-verus.shscripts/run-kani.shscripts/run-verus.shverus/clone_detector_lsh_config.rs
… roadmap 7.2.4-7.2.5 Add a detailed, living ExecPlan document outlining the implementation, constraints, risks, progress stages, and acceptance criteria for roadmap items 7.2.4 and 7.2.5. This plan covers establishing reusable proof plumbing for the clone-detector token pipeline, proving constructor invariants for LshConfig::new, integrating Verus and Kani sidecar proof tooling, and augmenting test and documentation coverage. It serves as the authoritative blueprint governing the scoped proof split and workflow before implementation begins. Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>
…plicit bounds and cfg handling - Introduce a minimal Kani smoke harness for LshConfig with symbolic but bounded inputs. - Follow Chutoro patterns: keep harnesses close to code, guard with #[cfg(kani)], and document usage. - Update unexpected_cfgs allowlist for cfg(kani) to prevent lint warnings. - Extend Makefile and scripts to run practical Kani targets, leaving room for future exhaustive checks. - Add detailed docs on harness design, unwinding limits, and state-explosion mitigation strategies. This supports roadmap item 7.2.4 by providing a practical local smoke gate for Kani proofs without yet implementing heavier exhaustive runs planned for later roadmap items. Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>
…:new - Added a Verus proof module `verus/clone_detector_lsh_config.rs` that semantically verifies the contract of `LshConfig::new` ensuring positive bands and rows with product equal to MINHASH_SIZE. - Introduced a Kani harness in `crates/whitaker_clones_core/src/index/kani.rs` for bounded symbolic testing of `LshConfig::new` under `cfg(kani)`. - Extended rspec-based test scenarios and unit tests to cover zero rows, invalid non-zero products, and overflow cases. - Added supporting Makefile targets and scripts (`make kani-clone-detector`, `make verus-clone-detector`, install/run scripts) for sidecar proof workflows. - Updated documentation and roadmap to reflect implementation progress on clone-detector proof workflows and LshConfig invariants. - Improved shared tooling in `scripts/install-verus.sh` to strip ANSI escape codes for stable Verus installation. This change establishes a formal verification foundation for clone-detector LSH configuration, enabling future proofs and bounded runtime checks. Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>
…iteration - Added `make verus-clone-detector` and `make kani-clone-detector` targets for targeted proof runs without regressing existing workflow. - Refactored `scripts/run-kani.sh` to run clone-detector harnesses in a loop, improving maintainability. - Updated `scripts/run-verus.sh` to read proof files efficiently with `mapfile`. - Fixed shared Verus installer to strip ANSI escape codes from toolchain hints, ensuring consistent fallback handling across targets. - Adjusted documentation in execplans to align with new proof sidecar workflow and clarify architecture decisions. These changes enhance developer efficiency by supporting focused proof workflow iterations and improving tooling robustness. Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>
…ing scripts - Document new Verus and Kani proof make targets and usage - Explain repository-managed proof tooling and scripts - Provide instructions for Kani installation and execution - Include example commands for running proofs - Fix minor typo in execplans markdown regarding Cargo Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>
…s LshConfig proofs - Added `verify_lsh_config_new_overflow_product` Kani harness to drive overflow path in LshConfig::new. - Updated Kani harness set to include the overflow-specific proof. - Extended Verus proof to mirror runtime branch order including overflow rejection. - Documented the distinction between Verus model and runtime implementation in developer guide. - Clarified that Verus proves an implementation-shaped model, while Kani executes the real constructor code. - Improved proofs for zero bands, zero rows, exact product acceptance, invalid product rejection, and overflow in Verus. This enhances formal coverage of LshConfig constructor ensuring overflow paths are verified without trusted assumptions. Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>
Apply the remaining proof-workflow review fixes by trimming parsed LSH\ninputs, updating proof documentation wording and status, and\nreplacing Bash-4-only mapfile usage in the Verus wrapper.\n\nAlso make the installer bin tests deterministic again by routing\ntheir Dylint dependency checks through injected test-only install\noptions instead of the live repository-first installer path. This\nkeeps the validated worktree stable before rebasing onto main.
The script scripts/run-kani.sh had an unnecessary closing brace '}' removed to fix shell syntax correctness. Additionally, a minor punctuation fix was applied in docs/whitaker-clone-detector-design.md, and the Makefile's .PHONY declaration was cleaned up by removing duplicate 'kani' target. Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>
512d978 to
6705a99
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@verus/clone_detector_lsh_config.rs`:
- Around line 8-12: The sidecar currently proves lsh_config_new_result but is
not linked to the runtime constructor LshConfig::new (nor to validate_product),
so the proof is disconnected; either add Verus linkage by declaring an
external_fn_specification or external_body that connects lsh_config_new_result
to LshConfig::new (and any validate_product calls used by the constructor),
ensuring the sidecar’s spec references the same function signature and types, or
alternatively demote the module to a pure contract record by removing the lemma
and using a record/abstract spec that does not pretend to verify the concrete
constructor; update references to LshConfig::new, validate_product, and
lsh_config_new_result accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 03e899d7-ff8a-423f-aa73-5836d91e47af
📒 Files selected for processing (15)
Cargo.tomlMakefilecrates/whitaker_clones_core/src/index/kani.rscrates/whitaker_clones_core/src/index/mod.rscrates/whitaker_clones_core/src/index/tests.rscrates/whitaker_clones_core/tests/features/min_hash_lsh.featurecrates/whitaker_clones_core/tests/min_hash_lsh_behaviour.rsdocs/developers-guide.mddocs/execplans/7-2-4-verus-proofs-for-lsh-config-new-invariants.mddocs/roadmap.mddocs/whitaker-clone-detector-design.mdscripts/install-verus.shscripts/run-kani.shscripts/run-verus.shverus/clone_detector_lsh_config.rs
|
@coderabbitai Please suggest a fix for the following concern and provide an AI coding agent prompt for the fix: ❌ Failed checks (1 warning)
|
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 12779
Fix overviewThe warning has two distinct sub-problems:
What BATS tests should cover
AI coding agent prompt |
…194) * docs(execplans): add comprehensive execplan for clone-detector proofs roadmap 7.2.4-7.2.5 Add a detailed, living ExecPlan document outlining the implementation, constraints, risks, progress stages, and acceptance criteria for roadmap items 7.2.4 and 7.2.5. This plan covers establishing reusable proof plumbing for the clone-detector token pipeline, proving constructor invariants for LshConfig::new, integrating Verus and Kani sidecar proof tooling, and augmenting test and documentation coverage. It serves as the authoritative blueprint governing the scoped proof split and workflow before implementation begins. Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com> * feat(proofs): add practical Kani harnesses for clone-detector with explicit bounds and cfg handling - Introduce a minimal Kani smoke harness for LshConfig with symbolic but bounded inputs. - Follow Chutoro patterns: keep harnesses close to code, guard with #[cfg(kani)], and document usage. - Update unexpected_cfgs allowlist for cfg(kani) to prevent lint warnings. - Extend Makefile and scripts to run practical Kani targets, leaving room for future exhaustive checks. - Add detailed docs on harness design, unwinding limits, and state-explosion mitigation strategies. This supports roadmap item 7.2.4 by providing a practical local smoke gate for Kani proofs without yet implementing heavier exhaustive runs planned for later roadmap items. Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com> * feat(clone-detector): add Verus proof and Kani harness for LshConfig::new - Added a Verus proof module `verus/clone_detector_lsh_config.rs` that semantically verifies the contract of `LshConfig::new` ensuring positive bands and rows with product equal to MINHASH_SIZE. - Introduced a Kani harness in `crates/whitaker_clones_core/src/index/kani.rs` for bounded symbolic testing of `LshConfig::new` under `cfg(kani)`. - Extended rspec-based test scenarios and unit tests to cover zero rows, invalid non-zero products, and overflow cases. - Added supporting Makefile targets and scripts (`make kani-clone-detector`, `make verus-clone-detector`, install/run scripts) for sidecar proof workflows. - Updated documentation and roadmap to reflect implementation progress on clone-detector proof workflows and LshConfig invariants. - Improved shared tooling in `scripts/install-verus.sh` to strip ANSI escape codes for stable Verus installation. This change establishes a formal verification foundation for clone-detector LSH configuration, enabling future proofs and bounded runtime checks. Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com> * feat(clone-detector): add specific Verus and Kani targets for faster iteration - Added `make verus-clone-detector` and `make kani-clone-detector` targets for targeted proof runs without regressing existing workflow. - Refactored `scripts/run-kani.sh` to run clone-detector harnesses in a loop, improving maintainability. - Updated `scripts/run-verus.sh` to read proof files efficiently with `mapfile`. - Fixed shared Verus installer to strip ANSI escape codes from toolchain hints, ensuring consistent fallback handling across targets. - Adjusted documentation in execplans to align with new proof sidecar workflow and clarify architecture decisions. These changes enhance developer efficiency by supporting focused proof workflow iterations and improving tooling robustness. Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com> * docs(proof-workflows): add documentation for proof workflows and tooling scripts - Document new Verus and Kani proof make targets and usage - Explain repository-managed proof tooling and scripts - Provide instructions for Kani installation and execution - Include example commands for running proofs - Fix minor typo in execplans markdown regarding Cargo Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com> * feat(kaniproofs): add overflow-specific Kani harness and tighten Verus LshConfig proofs - Added `verify_lsh_config_new_overflow_product` Kani harness to drive overflow path in LshConfig::new. - Updated Kani harness set to include the overflow-specific proof. - Extended Verus proof to mirror runtime branch order including overflow rejection. - Documented the distinction between Verus model and runtime implementation in developer guide. - Clarified that Verus proves an implementation-shaped model, while Kani executes the real constructor code. - Improved proofs for zero bands, zero rows, exact product acceptance, invalid product rejection, and overflow in Verus. This enhances formal coverage of LshConfig constructor ensuring overflow paths are verified without trusted assumptions. Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com> * Fix proof follow-ups and installer test hooks Apply the remaining proof-workflow review fixes by trimming parsed LSH\ninputs, updating proof documentation wording and status, and\nreplacing Bash-4-only mapfile usage in the Verus wrapper.\n\nAlso make the installer bin tests deterministic again by routing\ntheir Dylint dependency checks through injected test-only install\noptions instead of the live repository-first installer path. This\nkeeps the validated worktree stable before rebasing onto main. * style(scripts): remove extraneous closing brace in run-kani.sh The script scripts/run-kani.sh had an unnecessary closing brace '}' removed to fix shell syntax correctness. Additionally, a minor punctuation fix was applied in docs/whitaker-clone-detector-design.md, and the Makefile's .PHONY declaration was cleaned up by removing duplicate 'kani' target. Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com> --------- Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>
Summary
What this PR adds
Scope and constraints
crates/whitaker_clones_core). Do not moveLshConfiginto a separate crate.#[cfg(kani)], register that cfg in the crate'sunexpected_cfgsallowlist.rstest-bddversion).Implementation plan (highlights)
Validation plan (evidence to collect)
Acceptance criteria
Rationale
This change provides a concrete path to Verus-based proofs for LshConfig::new with a minimal Kani smoke harness, preserving API stability and keeping proof tooling separate from normal cargo/test flows as guided by ADR 002 and the roadmap.
Related guidance
Progress
Surprises & Discoveries
Task