Skip to content

feat: hledger parsing#25

Open
dman-os wants to merge 8 commits into
mainfrom
feat/hledger-support
Open

feat: hledger parsing#25
dman-os wants to merge 8 commits into
mainfrom
feat/hledger-support

Conversation

@dman-os
Copy link
Copy Markdown
Owner

@dman-os dman-os commented Apr 27, 2026

Summary by CodeRabbit

Release Notes

  • New Features

    • Added dayledger plug with hledger journal parsing and financial claim matching
    • Enhanced access control system with granular document and facet-level permissions
    • Token-based resource access model for improved security
  • Improvements

    • Refactored workflow dispatch system with better staging and merge handling
    • Expanded predicate evaluation for more sophisticated access rules
    • Improved error handling and validation across document operations
  • Documentation

    • Updated contributing guide with new plug architecture details

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 27, 2026

📝 Walkthrough

Walkthrough

This PR introduces a major refactoring of the dispatch and access-control system from facet-keyed dispatch to token-based document/facet access, adds a new plug_dayledger plugin for hledger journal parsing and transaction claim matching, updates plug_plabels and plug_test to adopt the new ACL model, and extends the type system with facet-field-matching predicates and explicit routine-level document ACLs.

Changes

Core Dispatch & Access Control Refactoring

Layer / File(s) Summary
Type System Extensions
src/daybook_types/manifest.rs, src/daybook_types/doc.rs, src/daybook_types/reference.rs
New types: FacetRef (doc reference), RoutineDocAcl (document predicate ACLs), DocPredicateClause::FacetFieldMatch (facet-field evaluation), CompareOp (comparison operators). RoutineManifest gains doc_acls, query_acls, config_facet_acl fields. RoutineFacetAccess gains create/delete booleans. JSON path evaluation delegated to jsonpath-rust library.
Dispatch Model
src/daybook_core/rt.rs, src/daybook_core/rt/dispatch.rs
DispatchArgs::DocInvoke and DocFacet replaced by DispatchArgs::DocRoutine with doc_id, branch_path, heads, changed_facet_keys. New structures: ProcessorInvocation, DocFacetTokens, updated FacetRoutineArgs with invocation, primary_doc, config_docs. Dispatch identity now includes doc/branch/heads; facet_key removed. Staging branch concept introduced for doc modifications.
Triage & Processor Integration
src/daybook_core/rt/triage.rs
PreparedProcessor refactored to use ProcessorManifest and ProcessorEventPredicate instead of RoutineManifest. Dispatch path simplified: predicates matched directly, changed_facet_keys computed from processor read tags/keys.
Plugin Hosting Infrastructure
src/daybook_core/rt/wash_plugin.rs, src/daybook_core/rt/wash_plugin/caps.rs
New token types hosted: DocToken, FacetToken, FacetCreateToken, FacetTagToken with granular WASM resource bindings. New helper build_doc_facet_tokens() applies ACLs and fetches doc state. Expanded get_args flow collects per-owner config docs and sqlite connection tokens. Staging branch handling via ensure_staging_branch helper.
WIT Interface
src/daybook_core/wit/main.wit
Major expansion: added resources doc-token, facet-token, facet-create-token, facet-tag-token, command-invoke-token, plus metadata records (facet-meta, doc-meta) and rights flags (doc-rights, facet-rights). New interface mltools-ocr, mltools-embed, mltools-image-tools (using non-read-only tokens). Updated facet-routine-args with invocation, primary-doc, config-docs wiring.
Manifest & Plugs Definition
src/daybook_core/plugs.rs
System plugs (ocr-image, embed-image, embed-text, index-embedding, test-label) refactored from deets-based DocFacet to explicit doc_acls + query_acls + config_facet_acl. New validation ensures all ACL-referenced tags are in scope via routine.referenced_tags().
Workflow Layer
src/daybook_wflows/wflows/embed_image.rs, embed_text.rs, index_embedding.rs, ocr_image.rs, test_labeler.rs
Workflows refactored to locate facet tokens via args.primary_doc.facets with rights checks, replacing previous rw_facet_tokens/ro_facet_tokens tuple lists. Clone/update patterns applied per workflow logic. Error handling split into access vs. update failure paths.
Iroh & Sync Updates
src/am_utils_rs/repo/iroh.rs, src/am_utils_rs/sync.rs
Minor refactorings: ApplicationClosed handling simplified via pattern guard; test AckSlot processing refactored to while-let pattern for cleaner consumption of ready entries.
CLI & FFI
src/daybook_cli/main.rs, src/daybook_ffi/rt.rs, src/daybook_types/Cargo.toml
DispatchArgs calls updated to DocRoutine with changed_facet_keys. New dependency jsonpath-rust added for manifest feature. Flake.nix Rust version updated (2026-01-01 → 2026-04-16), XDG_DATA_DIRS set for fontconfig.

Existing Plugin Adaptations

Layer / File(s) Summary
plug_plabels Manifest Rework
src/plug_plabels/lib.rs
Four routines (label-note, label-image, learn-image-label-candidates, learn-note-label-candidates) switched from deets::DocFacet to explicit doc_acls with HasTag predicates and structured facet access. tuple_list_take removed from public exports.
plug_plabels Wflow Refactoring
src/plug_plabels/wflows/label_engine.rs, label_image.rs, label_note.rs, learn_image_label_candidates.rs, learn_note_label_candidates.rs
Unified to use FacetToken (non-read-only) for config and working/error facets. Config token lookup via config_docs facets. New helpers find_facet_token_with_rights/find_facet_token for rights-aware token discovery. Embedding/note retrieval mapped to errors instead of unwraps.
plug_test Manifest Expansion
src/plug_test/lib.rs
Public exports updated: added RoutineDocAcl, RoutineManifest, RoutineFacetAccess, RoutineImpl. New routines added with doc_acls. Config facet definitions introduced. New helper functions: find_facet_token_with_rights, write_report_v2 for capability report generation.
plug_test E2E Suite
src/plug_test/e2e/common.rs, src/plug_test/e2e/capability_regression_wflow.rs
New helper import_test_plug_oci() for plug import. Comprehensive regression tests: test_full_command_capability_report, test_full_processor_capability_report, test_minimal_command_capability_report, test_minimal_processor_capability_report, test_downscope_capability, test_denied_update_capability, test_acl_aggregate_capability, test_create_facet_capability, test_get_create_token_capability, test_delete_facet_capability.

New plug_dayledger Plugin

Layer / File(s) Summary
Type System
src/plug_dayledger/types.rs, src/plug_dayledger/hledger/types.rs
New domain types: PostingSign, ClaimPostingHint, HledgerTxnDeets. Hledger types: Date, Status, Amount/AmountStyle/AmountCost, Posting, Transaction, PriceDirective, AccountDeclaration, AccountType. Claim facet expanded with ts, posting_hints, src_ref, src_refs, deets_kind, deets. Removed ClaimRef; use FacetRef instead.
Hledger Parser
src/plug_dayledger/hledger/parse/common.rs, src/plug_dayledger/hledger/parse/journal.rs, src/plug_dayledger/hledger/parse/mod.rs
Parser combinators via chumsky: datep, statusp, codep, commoditysymbolp, accountnamep, rawnumberp, amountp, balance_assertionp. Journal parser: parse_journal entry point, transactionp for transactions with postings. Full support for hledger journal syntax.
Claim Matching Engine
src/plug_dayledger/hledger/claim_matcher.rs
Heuristic-based matcher: match_claims links transactions to existing claims via unique hash/code, then high-confidence heuristics (date, amount, postings, accounts, description). build_claim constructs new claims with transaction details. hash_txn produces stable transaction signatures. Extensive test coverage for insertions, changes, duplicates.
Wflow Implementation
src/plug_dayledger/wflows/parse_hledger.rs
Orchestrates parsing: reads note facet, parses hledger journal, constructs facet URL and source reference, locates/creates claim facets, matches claims, updates/creates claim facets via token-based access. Includes granular error handling and terminal error paths.
Manifest & WIT
src/plug_dayledger/lib.rs, src/plug_dayledger/wit/main.wit, src/plug_dayledger/wit/deps/*
Manifest declares parse-hledger routine with doc/query/config ACLs, wflow bundle, processor for journal parsing. WIT exports bundle. Extensive WIT dependency declarations: wasi-cli, wasi-clocks, wasi-config, wasi-filesystem, wasi-http, wasi-io, wasi-keyvalue, wasi-logging, wasi-random, wasmcloud-postgres.
E2E Tests
src/plug_dayledger/e2e/parse_hledger_wflow.rs, src/plug_dayledger/e2e/common.rs, src/plug_dayledger/hledger/tests/mod.rs, fixtures
Test helpers: import_dayledger_oci(), journal creation. E2E tests: test_parse_hledger_writes_one_claim_per_transaction (validates claim uniqueness, txn indexing), test_parse_hledger_reparse_preserves_claim_ids_when_transaction_inserted (claim ID preservation on reparse). Parser unit tests: date/status/amount/balance formats, journal parsing. Sample journal fixtures: quickstart.journal, sample.journal, sample2.journal, status.journal, unicode.journal.
Cargo & Dependencies
src/plug_dayledger/Cargo.toml, src/plug_dayledger/e2e.rs
New dependencies: chumsky (parsing), jiff (dates), rust_decimal (amounts). Dev-dependency: daybook_core (test-support). E2E module structure: common, parse_hledger_wflow.

Sequence Diagram(s)

sequenceDiagram
    participant Triage
    participant Dispatch
    participant WashPlugin as Wash Plugin Host
    participant WflowGuest
    participant FacetStorage

    Triage->>Dispatch: emit ProcessorEvent with changed_facet_keys
    Dispatch->>Dispatch: create DocRoutine with branch_path, staging_branch
    Dispatch->>WashPlugin: get_args() for facet routine
    WashPlugin->>WashPlugin: build_doc_facet_tokens(primary_doc)
    WashPlugin->>WashPlugin: resolve config_docs, build tokens per owner
    WashPlugin->>WflowGuest: pass FacetRoutineArgs (primary_doc, config_docs, invocation)
    WflowGuest->>WflowGuest: locate facets by key with rights checks
    WflowGuest->>FacetStorage: read facet data via token
    WflowGuest->>FacetStorage: clone token, perform operations
    WflowGuest->>FacetStorage: update facet via token.update()
    Dispatch->>Dispatch: merge staging_branch, resolve heads
    Dispatch-->>Triage: dispatch complete
Loading
sequenceDiagram
    participant WflowCtx
    participant JournalNote as Note Data
    participant Parser as Hledger Parser
    participant Matcher as Claim Matcher
    participant ClaimStorage as Claim Facet

    WflowCtx->>JournalNote: read note facet (journal text)
    WflowCtx->>Parser: parse_journal(text)
    Parser-->>WflowCtx: Vec<Transaction>
    WflowCtx->>WflowCtx: construct src_ref from note URL/heads
    WflowCtx->>ClaimStorage: locate claim tag with create rights
    WflowCtx->>ClaimStorage: read existing claims as HashMap
    WflowCtx->>Matcher: match_claims(transactions, existing, src_ref)
    Matcher->>Matcher: hash/code matching, heuristic scoring
    Matcher-->>WflowCtx: Vec<(id, Claim)> updated claims
    WflowCtx->>ClaimStorage: serialize claims, create/update facets
    WflowCtx-->>WflowCtx: return Ok(())
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • dman-os/townframe#21: Also refactors dispatch/wash_plugin and caps layers to introduce branch_path-aware document/facet tokens.
  • dman-os/townframe#18: Modifies plug_plabels crate manifest and wflows with updated ACL structures and facet/token handling.
  • dman-os/townframe#20: Modifies dispatch/runtime/plug wiring and introduces overlapping command-invoke and token concepts.

Poem

🐰 With tokens now guiding through facets so bright,
And workflows that parse ledgers deep into the night,
Claims matched by heuristics, with predicates wise,
Access control blooming before eager eyes!
The staging branch dances, then merges with grace—
A refactored dispatch system takes its place! 🌿

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/hledger-support

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 18

Note

Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
src/plug_plabels/lib.rs (1)

213-262: ⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

Doc-ACL predicate appears to contradict the processor predicate for label-note and label-image.

The label-note processor (lines 487-503) and label-image processor (lines 511-525) both trigger only when the doc does not have PseudoLabel. However, the corresponding routines' doc_acls here have doc_predicate: DocPredicateClause::HasTag(PseudoLabel) and grant PseudoLabel access only with create: false, write: true.

If RoutineDocAcl.doc_predicate gates whether the per-doc facet ACL applies (which is consistent with how build_doc_facet_tokens in src/daybook_core/rt/wash_plugin.rs lines 686-711 only mints tokens for facets matching the ACL on the doc):

  1. When the processor fires on a doc lacking PseudoLabel, the doc_acl predicate HasTag(PseudoLabel) won't match.
  2. No PseudoLabel/PseudoLabelErrorFacet token is granted to the routine.
  3. label_image.rs line 23-27 and label_note.rs would fail with "facet token with update rights not found", terminating every dispatch.

Even if the predicate were satisfied, create: false would still prevent the routines from adding PseudoLabel to a doc that doesn't yet have one — which is the entire purpose of label-note/label-image. The learn-*-label-candidates routines do correctly use HasTag(PseudoLabel) since they only operate on already-labeled docs.

Likely intent for label-note/label-image:

  • doc_predicate: a positive predicate matching the source content (e.g., HasTag(Note) for label-note, HasTag(Blob) for label-image), or no predicate gating at all.
  • create: true for PseudoLabel (and PseudoLabelErrorFacet).
#!/bin/bash
# Verify how doc_predicate / doc_acls are evaluated and whether they gate facet-token minting.
rg -nP --type=rust -C5 '\bdoc_predicate\b'
rg -nP --type=rust -C3 '\bdoc_acls\b'
ast-grep --pattern 'RoutineDocAcl { $$$ }'

# Check whether create=false actually blocks facet creation in the runtime.
rg -nP --type=rust -C5 '\bRoutineFacetAccess\b' 
rg -nP --type=rust -C3 '\bcreate:\s*(true|false)'

Also applies to: 272-332

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/plug_plabels/lib.rs` around lines 213 - 262, The RoutineDocAcl currently
uses DocPredicateClause::HasTag(PseudoLabel) and sets create: false for
PlabelFacetTag::PseudoLabel and PlabelFacetTag::PseudoLabelErrorFacet which
prevents label-note and label-image from receiving/create tokens; update the
doc_acls for the label-note and label-image routines so the doc_predicate
matches the source docs (e.g., DocPredicateClause::HasTag(Note) for label-note
and DocPredicateClause::HasTag(Blob) for label-image or remove the predicate
entirely) and set create: true on the RoutineFacetAccess entries for
PlabelFacetTag::PseudoLabel and PlabelFacetTag::PseudoLabelErrorFacet (ensure
read/write/create flags reflect that the routines must be able to create and
write pseudo-label facets), and mirror the same change in the other ACL block(s)
covering the same routines.
src/plug_plabels/wflows/label_engine.rs (1)

175-214: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't read the config facet through a potentially write-only token.

This branch prefers rw_config_token and immediately calls heads()/get() on it, even though apply_labeling only needs read access here. With the new split ACL model, an update-only config grant will fail this path even when a readable ro_config_token is available.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/plug_plabels/wflows/label_engine.rs` around lines 175 - 214, The code
currently prefers req.rw_config_token and calls heads()/get() on it even when
only read access is needed; change the logic to prefer the read-only token so
write-only grants won't be used for reads. Reorder the if branches to check
req.ro_config_token first (and perform the heads()/get()/serde parsing into
PseudoLabelCandidatesFacet there), then fall back to req.rw_config_token only if
ro is None, and finally to default_label_set(); keep the existing error handling
and use of heads_json, label_set (PseudoLabelCandidatesFacet), and
default_label_set() unchanged.
src/daybook_core/rt.rs (1)

973-979: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Track merge success separately from workflow success.

merged_successfully is now just JobRunResult::Success. Later, command_invoke_reply_from_result interprets false as a post-run merge failure, so cancelled/terminal/worker errors on child commands get reported back as "kind":"merge-failed" instead of their real cause.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/daybook_core/rt.rs` around lines 973 - 979, The code sets
merged_successfully by treating any JobRunResult::Success as a successful merge,
which misclassifies cancelled/worker/terminal failures as "merge-failed"; change
the merged_successfully binding to reflect the actual merge outcome (e.g.,
inspect a merge-specific field or enum variant on event.result such as a
dedicated MergeSuccess/MergeFailed variant or an explicit merge_success boolean)
instead of matching any JobRunResult::Success, and update the call site that
passes event.result into command_invoke_reply_from_result so it receives the
true merge status; if the JobRunResult enum lacks a merge-specific
variant/field, add one (or add an explicit merge_success flag on the result) and
use that when computing merged_successfully and when invoking
command_invoke_reply_from_result.
🟡 Minor comments (6)
src/plug_dayledger/wit/deps/wasi-keyvalue-0.2.0-draft/package.wit-227-227 (1)

227-227: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Docs/API drift: CAS is advertised but not exposed.

Line 227 states atomic increment and CAS are available, but interface atomics currently only defines increment. Please align docs or add CAS before this interface is relied on by consumers.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/plug_dayledger/wit/deps/wasi-keyvalue-0.2.0-draft/package.wit` at line
227, Docs claim both atomic `increment` and CAS exist, but the `interface
atomics` only exposes `increment`; add a CAS operation or update the docs. Fix
by adding a compare-and-swap function (e.g., `cas` or `compare_and_swap`) to the
`interface atomics` alongside `increment` with a signature that accepts a
key/slot, expected value and new value and returns a boolean or previous value
(match the project's wit convention), and update the documentation comment that
mentions CAS if you choose to remove it instead; reference `interface atomics`
and the existing `increment` symbol when making the change.
CONTRIBUTING.md-73-73 (1)

73-73: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix typo in plug description.
Line 73 has a spelling error: providignproviding.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@CONTRIBUTING.md` at line 73, Update the description for `./src/plug_plabels/`
by fixing the typo: change "providign" to "providing" so the line reads
"`./src/plug_plabels/`: plug providing pseudo-labelling and classification
utils."
src/plug_dayledger/hledger/claim_matcher.rs-597-607 (1)

597-607: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Append prior src_ref to src_refs when re-matching an existing claim.

When an existing claim is re-matched from a new source, build_claim preserves existing.src_refs but drops existing.src_ref. This breaks the source ancestry chain. Based on the test assertion "first parse should not carry prior src refs" (which confirms src_refs accumulates on re-parses), the intended semantics are that src_refs records the chain of all prior sources.

On re-match, append existing.src_ref to the front of src_refs before returning the new claim:

src_refs: {
    let mut refs = existing
        .map(|existing_claim| existing_claim.src_refs.clone())
        .unwrap_or_default();
    if let Some(existing_claim) = existing {
        refs.insert(0, existing_claim.src_ref.clone());
    }
    refs
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/plug_dayledger/hledger/claim_matcher.rs` around lines 597 - 607, The
Claim construction in build_claim (or the function creating the Claim) preserves
existing.src_refs but omits existing.src_ref, breaking source ancestry; change
the src_refs initialization so when existing.is_some() you clone
existing.src_refs (or default), then insert existing.src_ref at the front (index
0) before assigning to the Claim.src_refs field, ensuring src_ref is appended
into the accumulated src_refs on re-match.
src/plug_dayledger/hledger/parse/common.rs-109-116 (1)

109-116: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Support multi-word account names or document as v1 constraint.

The accountnamep parser stops at the first space due to none_of([' ', ...]), preventing accounts like liabilities:credit card from parsing correctly. The fixture files quickstart.journal and sample2.journal contain such accounts, but they are not currently tested. Either extend the parser to support single spaces in account names (stopping at double-space or tab before amounts per hledger spec), or document this single-word limitation in a module-level comment to prevent unintended data loss when processing real-world journals.

src/plug_test/e2e/capability_regression_wflow.rs-369-370 (1)

369-370: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Tautological assertion provides no coverage.

assert!(tag_keys.is_empty() || !tag_keys.is_empty()) is always true regardless of tag_keys content. Same pattern at line 439. Either remove these lines or replace with the actual expected condition (e.g., assert!(tag_keys.is_empty(), "minimal should have no primary tags") or document the indeterminate expectation).

🐛 Suggested fix
-    let tag_keys: Vec<String> = serde_json::from_value(report["primary_tag_keys"].clone())?;
-    assert!(tag_keys.is_empty() || !tag_keys.is_empty());
+    // primary_tag_keys is intentionally unconstrained for the minimal report.
+    let _tag_keys: Vec<String> = serde_json::from_value(report["primary_tag_keys"].clone())?;

(apply equivalent change at line 439)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/plug_test/e2e/capability_regression_wflow.rs` around lines 369 - 370, The
assertion assert!(tag_keys.is_empty() || !tag_keys.is_empty()) is tautological
and should be replaced with a meaningful check: update the assertion on the
variable tag_keys to reflect the expected behavior (for example use
assert!(tag_keys.is_empty(), "expected no primary tags for minimal case") if the
test expects no tags, or assert!(!tag_keys.is_empty(), "expected primary tags to
be present") if tags must exist), and apply the same replacement for the other
occurrence at the later check (the second tag_keys assertion around line 439) so
both assertions validate the actual expected condition rather than always-true
logic.
src/daybook_core/wit/main.wit-110-117 (1)

110-117: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

facet-tag-token::create returns update-doc-error while sibling lookups return access-error — authorization denials conflated with other.

list_facets and get on facet-tag-token surface access-error with an explicit denied variant, but create and get_create_token return update-doc-error, which lacks a denied case. When authorization fails on create (line 909-912), the implementation returns UpdateDocError::Other("create denied"), masking the denial as a generic error alongside invalid-patch and invalid-key. Consider adding a denied variant to update-doc-error or returning a union type like result<..., variant { access(access-error), update(update-doc-error) }> for consistency.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/daybook_core/wit/main.wit` around lines 110 - 117, The
create/get-create-token methods on facet-tag-token currently return
update-doc-error which lacks an access-denied case (the implementation even maps
authorization failures to UpdateDocError::Other), causing denied authorizations
to be masked; update the WIT and implementation to make authorization semantics
consistent by either adding a denied variant to update-doc-error (and return
UpdateDocError::Denied from facet-tag-token::create and
facet-tag-token::get-create-token) or change those method signatures to return a
union type like result<..., variant { access(access-error),
update(update-doc-error) }>, and update the corresponding implementation sites
(where UpdateDocError::Other("create denied") is produced) to return the new
denied/access variant instead.
🧹 Nitpick comments (26)
src/plug_dayledger/wit/deps/wasmcloud-postgres-0.1.1-draft/package.wit (1)

324-334: ⚡ Quick win

Align prepare docs with actual signature

The comment references a connection-token and connection.wit, but prepare only accepts statement: string. Please update the doc block to match this API contract to avoid integration mistakes.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/plug_dayledger/wit/deps/wasmcloud-postgres-0.1.1-draft/package.wit`
around lines 324 - 334, The docblock for the prepare function is out of sync
with its signature: prepare(statement: string) does not take a connection-token
and there is no direct reference to connection.wit in this API, so update the
comment for prepare (and any mention of prepared-statement-token /
statement-prepare-error) to remove references to a connection-token and
connection.wit and instead document that prepare only accepts a SQL statement
string, requires parameterized queries using $<integer>, and returns a
prepared-statement-token or statement-prepare-error; keep the existing example
and parameterization requirement but align wording to the single-argument
signature of prepare.
src/daybook_types/Cargo.toml (1)

76-76: ⚡ Quick win

Consider making jsonpath-rust an optional dependency tied to the manifest feature.

jsonpath-rust is added as an unconditional dependency on a foundational types crate consumed across the entire workspace. According to the AI summary, it's used in reference.rs for ACL/manifest evaluation, which maps naturally to the manifest feature. Leaving it mandatory forces every consumer of daybook_types—even those that only need basic doc types—to compile the JSONPath engine.

♻️ Proposed change
-jsonpath-rust = "1.0.4"
+jsonpath-rust = { version = "1.0.4", optional = true }

Then add dep:jsonpath-rust to the manifest feature:

-manifest = ["dep:garde", "dep:semver", "schemars"]
+manifest = ["dep:garde", "dep:semver", "schemars", "dep:jsonpath-rust"]
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/daybook_types/Cargo.toml` at line 76, Make jsonpath-rust optional and
tied to the manifest feature: in Cargo.toml mark the dependency jsonpath-rust =
{ version = "1.0.4", optional = true } and add the features entry manifest =
["dep:jsonpath-rust"] (or add "dep:jsonpath-rust" to the existing manifest
feature). Also guard runtime usage in reference.rs (the ACL/manifest evaluation
code) behind cfg(feature = "manifest") so that code only references
jsonpath-rust when the manifest feature is enabled.
src/daybook_types/manifest.rs (2)

865-883: 💤 Low value

Silently swallowing JsonPath::query errors loses authoring feedback.

The existing facet_has_reference_to_tag (L832-838) logs at debug level when select_json_path_values fails. The new evaluate_facet_field_match uses let Ok(found) = raw.query(json_path) else { continue; }; with no log, so a typo in a manifest's json_path becomes a silent "predicate never matches" with no diagnostic. Mirror the existing pattern for consistency and easier debugging of authored manifests.

♻️ Add a debug log on parse/query error
     for (_, raw) in facets {
-        let Ok(found) = raw.query(json_path) else {
-            continue;
-        };
+        let found = match raw.query(json_path) {
+            Ok(values) => values,
+            Err(err) => {
+                debug!(error = %err, %json_path, "invalid facet field match json_path");
+                continue;
+            }
+        };
         for found_val in found {
             if compare_json_values(found_val, operator, expected) {
                 return true;
             }
         }
     }

As per coding guidelines: "Don't swallow errors with empty catch statements or catch-and-log unless explicitly told to do so" — here a debug log of the parse failure is the established pattern in this file.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/daybook_types/manifest.rs` around lines 865 - 883, The function
evaluate_facet_field_match currently swallows JsonPath query errors silently;
update it to mirror the existing pattern used by
facet_has_reference_to_tag/select_json_path_values by logging parse/query
failures at debug level instead of continuing silently. When
raw.query(json_path) returns Err, call debug! including the json_path, the error
(from the Err), and a short context (e.g., the facet key or that the facet query
failed) so manifest authors get diagnostics; keep the rest of the loop behavior
the same.

885-911: ⚡ Quick win

Numeric comparisons via as_f64 lose precision for large integer values.

For hledger journals — the motivating use case for this PR — monetary amounts are typically encoded as integer cents/minor units, and queries like amount > 1_000_000_00 over those fields will silently misbehave once values exceed 2^53. serde_json::Value::as_f64() succeeds on Number::from(i64::MAX) but loses precision.

Consider preferring exact integer comparisons when both sides are integers, falling back to f64 only for mixed/float operands:

♻️ Suggested ordering
         CompareOp::Gt | CompareOp::Gte | CompareOp::Lt | CompareOp::Lte => {
+            if let (Some(lhs), Some(rhs)) = (lhs.as_i64(), rhs.as_i64()) {
+                return match op {
+                    CompareOp::Gt => lhs > rhs,
+                    CompareOp::Gte => lhs >= rhs,
+                    CompareOp::Lt => lhs < rhs,
+                    CompareOp::Lte => lhs <= rhs,
+                    _ => unreachable!(),
+                };
+            }
+            if let (Some(lhs), Some(rhs)) = (lhs.as_u64(), rhs.as_u64()) {
+                return match op {
+                    CompareOp::Gt => lhs > rhs,
+                    CompareOp::Gte => lhs >= rhs,
+                    CompareOp::Lt => lhs < rhs,
+                    CompareOp::Lte => lhs <= rhs,
+                    _ => unreachable!(),
+                };
+            }
             if let (Some(lhs), Some(rhs)) = (lhs.as_f64(), rhs.as_f64()) {

A related minor: Eq/Ne use Value's derived PartialEq, so json!(1) == json!(1.0) is false. If users write predicates with value: 1 against a facet that stores 1.0, the match fails silently. Worth either documenting this constraint on FacetFieldMatch::value or normalizing both sides to a common numeric repr before comparison.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/daybook_types/manifest.rs` around lines 885 - 911, compare_json_values
currently uses as_f64 for ordered comparisons which loses precision for large
integers; update compare_json_values to first detect integer numbers via
serde_json::Number methods (use as_i64/as_u64) and perform exact integer
comparisons when both lhs and rhs are integers (handle signed/unsigned mixing by
promoting safely to i128/unsigned logic or compare via i128/u128), fall back to
as_f64 only when one or both operands are non-integer floats or cannot be
represented as integers, and also consider normalizing numeric equality in
CompareOp::Eq/Ne to treat numeric equivalence (e.g., 1 and 1.0) consistently by
comparing numbers via the same numeric normalization logic rather than relying
solely on Value's PartialEq; focus your edits inside the compare_json_values
function and update the match arms for CompareOp::{Eq,Ne,Gt,Gte,Lt,Lte}
accordingly.
src/daybook_wflows/wflows/ocr_image.rs (1)

35-37: 💤 Low value

Downscope the cloned blob token to READ-only.

clone(None) preserves the parent token's full rights. Since mltools_ocr::ocr_image only requires read access, explicitly downscope to FacetRights::READ to follow least-privilege principles and make the intent clear at the call site.

♻️ Proposed downscope
-    let blob_ocr_token = blob_facet_token.clone(None).map_err(|err| {
-        JobErrorX::Terminal(ferr!("access error cloning blob facet token: {err:?}"))
-    })?;
+    let blob_ocr_token = blob_facet_token
+        .clone(Some(FacetRights::READ))
+        .map_err(|err| {
+            JobErrorX::Terminal(ferr!("access error cloning blob facet token: {err:?}"))
+        })?;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/daybook_wflows/wflows/ocr_image.rs` around lines 35 - 37, The cloned blob
token currently uses blob_facet_token.clone(None) which preserves full rights;
change the clone call to explicitly downscope to read-only by cloning with
FacetRights::READ so blob_ocr_token supplies only read access to
mltools_ocr::ocr_image and update the map_err message if needed; locate the
clone invocation that produces blob_ocr_token (and references blob_facet_token)
and replace the None scope with FacetRights::READ to enforce least-privilege for
the ocr_image call.
src/plug_dayledger/hledger/tests/mod.rs (3)

162-166: ⚡ Quick win

test_sample_journal_not_empty is redundant.

It only verifies the bundled fixture file is not empty, which is enforced at compile time by include_str! (any missing path errors out at build) and is then re-tested by test_parse_sample_journal_fixture. Drop this test to reduce noise.

♻️ Suggested removal
-#[test]
-fn test_sample_journal_not_empty() {
-    let input = include_str!("fixtures/sample.journal");
-    assert!(!input.is_empty());
-}
-
 #[test]
 fn test_parse_sample_journal_fixture() {

As per coding guidelines: "remove useless or shallow tests and consolidate repeated setup code".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/plug_dayledger/hledger/tests/mod.rs` around lines 162 - 166, Remove the
redundant unit test function test_sample_journal_not_empty from the test module:
it only asserts include_str!("fixtures/sample.journal") is non-empty (already
guaranteed at compile time) and is covered by test_parse_sample_journal_fixture;
delete the entire fn test_sample_journal_not_empty() { ... } block to reduce
test noise.

119-139: ⚡ Quick win

Tighten shallow string-contains assertions on parsed amounts.

assert!(amt.quantity.contains("47.18")) will pass for "47.18", "-47.18", "147.180", etc. Use assert_eq! to lock down the parser's exact serialization (sign handling and decimal preservation are specifically the things you'd want to regression-test).

♻️ Tightened assertions
-#[test]
-fn test_amount_left_symbol() {
-    let amt = parse_ok(amountp(), "$47.18");
-    assert_eq!(amt.commodity, "$");
-    assert!(amt.quantity.contains("47.18"));
-    assert_eq!(amt.style.commodity_side, CommoditySide::Left);
-}
+#[test]
+fn test_amount_left_symbol() {
+    let amt = parse_ok(amountp(), "$47.18");
+    assert_eq!(amt.commodity, "$");
+    assert_eq!(amt.quantity, "47.18");
+    assert_eq!(amt.style.commodity_side, CommoditySide::Left);
+}
@@
-#[test]
-fn test_amount_no_symbol() {
-    let amt = parse_ok(amountp(), "47.18");
-    assert!(amt.commodity.is_empty());
-    assert!(amt.quantity.contains("47.18"));
-}
+#[test]
+fn test_amount_no_symbol() {
+    let amt = parse_ok(amountp(), "47.18");
+    assert!(amt.commodity.is_empty());
+    assert_eq!(amt.quantity, "47.18");
+    assert_eq!(amt.style.commodity_side, CommoditySide::Right);
+}

As per coding guidelines: "When working on tests, proactively improve them - remove useless or shallow tests and consolidate repeated setup code".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/plug_dayledger/hledger/tests/mod.rs` around lines 119 - 139, The tests
use loose contains() checks on amt.quantity; tighten them to assert the exact
serialized quantity so sign and decimal precision are locked down: in
test_amount_left_symbol, replace assert!(amt.quantity.contains("47.18")) with
assert_eq!(amt.quantity, "47.18"); in test_amount_left_symbol_spaced keep
checking amt.style.commodity_spaced but also assert_eq!(amt.quantity, "47.18");
and in test_amount_no_symbol replace assert!(amt.quantity.contains("47.18"))
with assert_eq!(amt.quantity, "47.18"); ensure you reference the parsed amt from
amountp() and the fields amt.quantity, amt.commodity, and
amt.style.commodity_spaced when making these changes.

168-180: 💤 Low value

Consider a snapshot test for test_parse_sample_journal_fixture.

The fixture-based assertions check a hand-picked subset of fields (count, first/fourth txn). A snapshot of the full parsed Vec<Transaction> would catch ordering/posting/amount drift across the whole fixture in one place and would be cheaper to maintain.

As per coding guidelines: "Use snapshot tests, TDD, and TDD macros found in the repo to ensure longevity of test usefulness".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/plug_dayledger/hledger/tests/mod.rs` around lines 168 - 180, Replace the
handful of field assertions in the test_parse_sample_journal_fixture unit test
with (or add) a snapshot assertion that captures the full parsed
Vec<Transaction> returned by parse_journal_ok; serialize txns (e.g., via
Debug/Display, serde_json/serde_yaml, or the repo's TDD snapshot macro such as
insta::assert_debug_snapshot!/assert_yaml_snapshot!) so the snapshot stores all
Transaction fields (including postings, amounts, ordering and Status enum
values) and commit the generated snapshot fixture so regressions in Transaction
or parse_journal_ok are detected automatically.
src/plug_dayledger/hledger/parse/journal.rs (2)

117-131: ⚡ Quick win

transactionp accepts a header with zero postings.

posting.repeated().collect() allows an empty postings vec, so a stray 2024/01/02 desc\n (with no following indented lines) is silently parsed as a 0-posting Transaction. That likely produces a downstream Claim with zero posting_hints and an empty txn_amount_totals, which the matcher treats with the 0.5 fallback (line 352–354 in claim_matcher.rs). Either reject as a parse error or surface it; hledger itself rejects 0-posting transactions.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/plug_dayledger/hledger/parse/journal.rs` around lines 117 - 131, The
transaction parser currently allows zero-posting transactions because
posting.repeated().collect() can produce an empty Vec; update the parser that
builds Transaction (the
header.then(posting.repeated().collect::<Vec<_>>()).map(...) closure in
transactionp) to reject empty postings—either replace repeated() with a
combinator that requires at least one posting or, after collecting postings,
check postings.is_empty() and return a parse error (propagate Err) instead of
constructing a Transaction; reference the header, posting, and transactionp
symbols so the change targets that mapping/closure.

6-14: 💤 Low value

default_amount couples to is_missing via empty quantity.

This works because Amount::is_missing checks quantity.is_empty(), but the contract is implicit. Consider deriving Default for Amount (it already requires Default on AmountStyle, which is provided) and reusing Amount::default(), or expose this as Amount::missing() so the intent is self-documenting.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/plug_dayledger/hledger/parse/journal.rs` around lines 6 - 14,
default_amount currently relies on Amount::is_missing checking
quantity.is_empty(), making the “missing” contract implicit; derive or implement
Default for Amount (ensuring AmountStyle: Default is used) and replace usages of
default_amount with Amount::default(), or alternatively add an explicit
constructor Amount::missing() that returns the same sentinel (empty
commodity/quantity, default style, None cost/cost_basis); update callers to use
Amount::default() or Amount::missing() and remove the standalone fn
default_amount().
src/plug_dayledger/hledger/claim_matcher.rs (2)

318-347: 💤 Low value

Date heuristic ordering: tied semantics need a comment.

The cascading thresholds in date_score are clever but order-sensitive in non-obvious ways (e.g., the month/day-swap check at lines 331–336 only fires when years match; the same-month-and-day check at 337 captures off-by-year-only). A short comment summarizing the pecking order — and what each branch is meant to catch — would prevent future regressions when someone adjusts thresholds.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/plug_dayledger/hledger/claim_matcher.rs` around lines 318 - 347, The
date_score function's cascade of thresholds is order-sensitive and lacks
documentation; add a concise comment above fn date_score explaining the intended
pecking order and semantics of each branch: exact match (new_date == old_date)
-> 1.0, one-day difference -> 0.90, same-year but month/day swapped (the
month/day-swap check that compares new_date.month() to old_date.day() and
new_date.day() to old_date.month()) -> 0.82, exact month+day match (same month
and day regardless of year) -> 0.78, same year & same day (different month) ->
0.70, within 7 days -> 0.62, else 0.0; note why the swap check is constrained to
same year and that ordering prevents the same-month/day branch from preempting
swap detection so future edits to date_score (and variables new_date, old_ts)
won't inadvertently change behavior.

199-227: ⚖️ Poor tradeoff

Float equality on *best == candidate.score works here but is fragile.

The "is best" check relies on bitwise float equality between candidate.score and the value re-pushed into the scores_by_* lists. It works today because both originate from the same composite_score() invocation per (new, old) pair, but any future refactor that re-derives scores or applies any arithmetic adjustment will silently break tie-breaking.

Consider keying candidates by (usize, OrderedFloat) and selecting "best" by tracking the candidate index rather than re-comparing floats.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/plug_dayledger/hledger/claim_matcher.rs` around lines 199 - 227, The
current candidates.retain logic uses exact float equality (*best ==
candidate.score) to detect the top scorer, which is fragile; instead change how
you record the best per index: when building scores_by_new and scores_by_old (or
the data structure that aggregates composite_score values), key entries by
(index, OrderedFloat(score)) or store alongside each score the originating
candidate id/index so you can determine "best" by comparing keys/indices rather
than re-comparing f64s; update the candidate filtering to check that the
candidate's id/index matches the recorded best index for candidate.new_index and
candidate.old_index and keep the existing margin checks (HEURISTIC_MIN_MARGIN)
unchanged.
src/plug_dayledger/hledger/types.rs (1)

24-34: 💤 Low value

Date Display does not handle negative years correctly.

{:04} doesn't apply zero-padding to negative numbers in the way the format suggests (e.g., year -1 renders as "-1", not "-001" or "-0001"). hledger journals don't typically carry BC dates, so this is unlikely to bite, but worth a passing thought if you ever round-trip serialized dates through this Display.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/plug_dayledger/hledger/types.rs` around lines 24 - 34, The Display impl
for Date (impl std::fmt::Display for Date, fn fmt) needs to handle negative
years explicitly because "{:04}" won't produce a zero-padded negative year
(e.g., year -1 should render "-0001"); change the formatting logic to compute
the year value first (let year = self.0.year()), take the absolute value for
zero-padding, and then write a prefixed '-' when year is negative (use the same
write! in fmt but format abs(year) with {:04} and prepend '-' for negative
years) so both positive and negative years produce the expected zero-padded
output.
src/plug_dayledger/types.rs (2)

88-95: ⚡ Quick win

ClaimPostingHint.hint_type: Option<String> would benefit from a typed enum.

hint_type is populated with stringly-typed values ("virtual", "balancedVirtual") by claim_matcher::build_claim and compared by string in claim_posting_pairs. An enum (mirroring or reusing PostingType) eliminates the typo risk and lets the JSON shape be governed by serde rather than ad-hoc format!.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/plug_dayledger/types.rs` around lines 88 - 95, Change
ClaimPostingHint.hint_type from Option<String> to a typed enum (e.g.,
Option<PostingType> or a new enum like ClaimHintType) and update serialization
to derive Serialize/Deserialize so JSON ("virtual", "balancedVirtual") is
handled by serde; update claim_matcher::build_claim to construct the enum
variant instead of formatting strings and update claim_posting_pairs to match on
the enum variants rather than comparing strings (or implement
TryFrom<String>/FromStr for the enum and use that conversion where strings are
currently produced/consumed). Ensure the enum derives Debug, Clone, Serialize,
Deserialize, PartialEq (and schemars::JsonSchema if needed) and adjust any
places constructing or comparing hint_type accordingly.

99-113: ⚡ Quick win

HledgerTxnDeets.tags as Vec<(String, String)> serializes as nested arrays.

Anonymous tuples serialize to JSON as [["key","value"], ...], which is awkward for downstream consumers (camelCase rename has no effect, and ad-hoc tools can't address fields by name). Consider either a small named Tag { key, value } struct or BTreeMap<String, Vec<String>> if tags are key→multi-value.

Note this is the persistent JSON shape stored in claim deets, so changing it later requires migration.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/plug_dayledger/types.rs` around lines 99 - 113, HledgerTxnDeets.tags
currently uses Vec<(String,String)>, which serializes to nested arrays; change
it to a named serializable type (e.g., define a Tag struct with pub key: String
and pub value: String and make tags: Vec<Tag>) or to a map type (e.g.,
BTreeMap<String,Vec<String>>) depending on whether you need multi-values; update
the HledgerTxnDeets struct (tags field) and any code that constructs/parses tags
(search for HledgerTxnDeets and usages) so serialization produces objects/keys
instead of arrays, and ensure serde attributes (if any) reflect the chosen shape
and handle migration of the persisted JSON accordingly.
src/plug_dayledger/hledger/parse/common.rs (1)

142-207: 💤 Low value

amountp right-side branch sets commodity_spaced = true unconditionally.

In the right_or_no_symbol arm, when an opt_commodity is present it always builds AmountStyle { commodity_spaced: true, ... }. That's because ws1().ignore_then(commoditysymbolp()) requires whitespace, so today it's always preceded by a space — but the field is named to round-trip styling, and a future relaxation of the whitespace requirement (e.g., 47.18USD) would silently misreport spacing. Consider capturing whether the leading whitespace existed (ws1().or_not() mapped to a bool) so the style flag tracks the source faithfully.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/plug_dayledger/hledger/parse/common.rs` around lines 142 - 207, In
amountp, the right_or_no_symbol branch currently sets
AmountStyle.commodity_spaced = true whenever opt_commodity is Some because
ws1().ignore_then(commoditysymbolp()) forces whitespace; change the parser to
capture whether whitespace was present (e.g. replace
ws1().ignore_then(commoditysymbolp()).or_not() with ws1().map(|_|
true).or(empty().to(false)).then(commoditysymbolp()).or_not() or equivalent) so
the mapped closure receives a bool for spacing plus the commodity; then use that
bool to set commodity_spaced instead of hardcoding true (update the mapping in
right_or_no_symbol that constructs Amount and refers to opt_commodity and
decimal_mark).
src/daybook_ffi/rt.rs (1)

85-111: ⚖️ Poor tradeoff

Consider renaming dispatch_doc_facetdispatch_doc_routine for clarity.

The function name still references DocFacet while it now dispatches DocRoutine. Since this is on the FFI surface, the rename is a breaking change for callers (UniFFI bindings consumers); defer if the FFI consumers haven't been migrated yet.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/daybook_ffi/rt.rs` around lines 85 - 111, Rename the FFI method
dispatch_doc_facet to dispatch_doc_routine to reflect that it dispatches
DispatchArgs::DocRoutine; update the function signature name (dispatch_doc_facet
→ dispatch_doc_routine) and any public/exported references (including UniFFI
binding metadata) so callers see the clearer name, and if you need to preserve
backward compatibility keep a thin shim function named dispatch_doc_facet that
forwards to dispatch_doc_routine (preserving the same params and return type)
while marking it deprecated so external consumers can migrate.
src/plug_test/e2e/capability_regression_wflow.rs (2)

25-25: 💤 Low value

Take &str instead of &String for doc_id.

Idiomatic Rust prefers &str parameters; &String forces callers to hold a String. Same applies to other helpers in the file.

♻️ Suggested change
-async fn dispatch_and_wait(
-    test_cx: &daybook_core::test_support::DaybookTestContext,
-    routine_name: &str,
-    doc_id: &String,
-    changed_facet_keys: Vec<String>,
-) -> Res<String> {
+async fn dispatch_and_wait(
+    test_cx: &daybook_core::test_support::DaybookTestContext,
+    routine_name: &str,
+    doc_id: &str,
+    changed_facet_keys: Vec<String>,
+) -> Res<String> {

Update the doc_id.clone() in DispatchArgs::DocRoutine to doc_id.to_string() and call sites accordingly.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/plug_test/e2e/capability_regression_wflow.rs` at line 25, Change the
doc_id parameter types from &String to &str across this file (e.g., the function
that currently declares doc_id: &String), update any places constructing
DispatchArgs::DocRoutine to call doc_id.to_string() instead of doc_id.clone(),
and update all call sites and helper functions in this file to accept &str (or
call .to_string() when an owned String is needed); ensure signatures, parameter
usages, and any pattern matches are adjusted to use &str so callers are not
forced to hold a String.

22-70: ⚡ Quick win

Consolidate repeated test setup/assertion patterns.

The eight #[tokio::test]s repeat a near-identical preamble (test_cx, import_test_plug_oci, setup_doc, dispatch_and_wait, open_plug_test_local_state, fetch_capability_report*) and the two "config tag rights READ+UPDATE / READ-only" assertion blocks (lines 190-209 and 300-319 are duplicated). Extracting a small helper (assert_config_tag_rights(&config_tag_keys, &config_tag_rights)) and a setup helper would shrink each test to its unique assertions and reduce drift risk.

Per coding guidelines: "When working on tests, proactively improve them ... consolidate repeated setup code".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/plug_test/e2e/capability_regression_wflow.rs` around lines 22 - 70,
Extract the repeated test preamble and duplicated assertions into small helpers:
create a setup helper (e.g., setup_plug_test) that performs the sequence
currently repeated (import_test_plug_oci, setup_doc, calling dispatch_and_wait,
open_plug_test_local_state, and invoking the various fetch_capability_report*
functions) and return the values each test needs; and create an
assert_config_tag_rights(&config_tag_keys, &config_tag_rights) helper that
contains the two duplicated "config tag rights READ+UPDATE / READ-only"
assertion blocks so tests call this single helper instead of duplicating logic.
Update each #[tokio::test] to call setup_plug_test and assert_config_tag_rights
and keep only test-unique assertions; reference the existing dispatch_and_wait
function and open_plug_test_local_state and fetch_capability_report* calls when
moving logic into the new helpers.
src/daybook_core/wit/main.wit (2)

67-70: 💤 Low value

Consider clarifying the semantics of doc-meta::updated-at.

Modeling updated-at as list<datetime> is unusual; most consumers expect a single most-recent timestamp. If the intent is a history of updates (e.g., per-branch or per-CRDT change), an inline doc comment in the WIT would help guests interpret the field correctly.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/daybook_core/wit/main.wit` around lines 67 - 70, doc-meta::updated-at is
defined as list<datetime> which is ambiguous; either change its type to a single
datetime if you only need the latest timestamp, or keep list<datetime> but add
an inline WIT doc comment on the record doc-meta (and specifically updated-at)
explaining the semantics (e.g., chronological history of updates, per-branch
timestamps, CRDT change history) so guests know whether to read the last element
as the current value or treat the list as a full audit trail; update the record
doc-meta declaration and its updated-at field comment accordingly.

85-93: 💤 Low value

doc-token::clone with none downscope returns full rights — add WIT doc comment to clarify.

When downscope is none, the implementation returns a token with the same rights as the source. This appears intentional (supporting unrestricted copy operations), but the WIT interface lacks documentation explaining this semantics. Add a doc comment to clarify whether none means "preserve all rights" or "no restriction" so guests understand the behavior.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/daybook_core/wit/main.wit` around lines 85 - 93, Add a WIT doc comment to
the doc-token::clone method clarifying the semantics of the downscope parameter:
state that passing none returns a token that preserves the original token's full
rights (i.e., no restriction), whereas passing Some(downscope) returns a token
restricted to the supplied doc-rights; place this comment immediately above the
clone signature in the resource doc-token so callers understand that none means
"preserve all rights / no restriction."
src/plug_test/e2e/common.rs (1)

8-15: 💤 Low value

Hardcoded ../../target/oci ignores CARGO_TARGET_DIR override.

If a developer or CI sets CARGO_TARGET_DIR (or uses [build] target-dir in .cargo/config.toml), this path won't resolve and the test will fail with the misleading "build it first" hint. Consider honoring the env var or accepting a path override.

♻️ Optional fix
-    let artifact_path = PathBuf::from(env!("CARGO_MANIFEST_DIR"))
-        .join("../../target/oci")
-        .join(crate::plug_manifest().id());
+    let target_dir = std::env::var_os("CARGO_TARGET_DIR")
+        .map(PathBuf::from)
+        .unwrap_or_else(|| {
+            PathBuf::from(env!("CARGO_MANIFEST_DIR")).join("../../target")
+        });
+    let artifact_path = target_dir.join("oci").join(crate::plug_manifest().id());
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/plug_test/e2e/common.rs` around lines 8 - 15, The artifact_path logic in
common.rs currently hardcodes "../../target/oci" and ignores CARGO_TARGET_DIR;
change it to first check std::env::var("CARGO_TARGET_DIR") and if present use
PathBuf::from(that_value).join("oci").join(crate::plug_manifest().id()),
otherwise fall back to the original env!("CARGO_MANIFEST_DIR")..../../target/oci
behavior; update the artifact_path construction and the eyre::ensure error
message to reflect the actual path checked so failures show the correct
location.
src/plug_plabels/wflows/label_note.rs (1)

49-58: ⚡ Quick win

rw_config_token and ro_config_token may resolve to the same token, leaving the fallback pattern inefficient.

When a facet token holds both READ and UPDATE rights, both .find() calls will return it, making ro_config_token redundant. The fallback logic (if let Some(rw) ... else if let Some(ro) ...) means the ro_config_token branch becomes unreachable whenever both resolve to the same token. This is safe but wasteful.

To clarify intent—prefer a write-capable token when available, otherwise fall back to a strictly read-only token—consider filtering ro_config_token to exclude tokens with UPDATE rights:

Suggested approach
-    let rw_config_token = args
-        .config_docs
-        .iter()
-        .flat_map(|cd| cd.facets.iter())
-        .find(|t| t.key() == config_facet_key && t.rights().contains(FacetRights::UPDATE));
-    let ro_config_token = args
-        .config_docs
-        .iter()
-        .flat_map(|cd| cd.facets.iter())
-        .find(|t| t.key() == config_facet_key && t.rights().contains(FacetRights::READ));
+    let rw_config_token = args
+        .config_docs
+        .iter()
+        .flat_map(|cd| cd.facets.iter())
+        .find(|t| t.key() == config_facet_key && t.rights().contains(FacetRights::UPDATE));
+    // Prefer a strictly read-only token over one with UPDATE rights for the fallback.
+    let ro_config_token = args
+        .config_docs
+        .iter()
+        .flat_map(|cd| cd.facets.iter())
+        .filter(|t| t.key() == config_facet_key && t.rights().contains(FacetRights::READ))
+        .find(|t| !t.rights().contains(FacetRights::UPDATE));

This pattern repeats identically in label_image.rs, learn_note_label_candidates.rs, and learn_image_label_candidates.rs and could be refactored consistently across all four files.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/plug_plabels/wflows/label_note.rs` around lines 49 - 58, rw_config_token
and ro_config_token can end up identical when a facet has both READ and UPDATE,
making the read fallback redundant; change the selection so you prefer a token
with UPDATE rights and only fall back to a token that has READ but not UPDATE.
Concretely, keep the existing rw_config_token search (find facet where
key==config_facet_key and rights contains FacetRights::UPDATE), then change the
ro_config_token search to exclude UPDATE (find facet where key==config_facet_key
and rights contains FacetRights::READ and NOT contains FacetRights::UPDATE) so
the fallback branch is meaningful; apply the same change consistently in
label_image.rs, learn_note_label_candidates.rs, and
learn_image_label_candidates.rs.
src/daybook_core/rt/dispatch.rs (1)

102-120: 💤 Low value

Consider migration impact on persisted dispatch rows.

The new fields invocation, primary_doc, config_docs lack #[serde(default)]. If any sqlite store has existing payload_json rows persisted before this change (e.g., a long-running dev DB), load_state (line 861) will fail during startup with a deserialization error. If you do not intend to support that migration path, it's fine to leave as-is, but consider either adding #[serde(default)] or documenting that the dispatch sqlite must be recreated for this version.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/daybook_core/rt/dispatch.rs` around lines 102 - 120, The FacetRoutineArgs
struct now introduces non-optional fields (invocation, primary_doc, config_docs)
which will break deserialization of existing persisted payload_json rows; to
fix, add #[serde(default)] to the invocation, primary_doc and config_docs fields
in the FacetRoutineArgs definition so deserialization of older rows falls back
to defaults (or alternatively document/reject the migration path), ensuring you
reference the FacetRoutineArgs type and those field names when making the change
so load_state can read old rows without failing.
src/daybook_core/rt/wash_plugin.rs (1)

786-838: 💤 Low value

Owner-plug-id resolution depends on the first ACL entry being representative.

Line 789-793 derives owner_plug_id from config_doc_meta.facet_acl.first(). This is correct only if every entry in a single config_doc_meta.facet_acl shares the same owner_plug_id (which the dispatch grouping should guarantee, but isn't enforced here). If a future change accidentally mixes owners in one entry, the loop would silently group everything under the first entry's owner. Consider asserting consistency or grouping explicitly per owner during construction at the dispatch site rather than per-first-entry here.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/daybook_core/rt/wash_plugin.rs` around lines 786 - 838, The code
currently picks owner_plug_id from config_doc_meta.facet_acl.first(), which can
silently mis-group facets if ACL entries have differing owner_plug_id; update
the loop that processes config_docs (symbols: config_doc_meta, facet_acl,
owner_plug_id, owner_config_docs, get_or_init_plug_config_doc_id,
build_doc_facet_tokens) to validate/assert that every entry in
config_doc_meta.facet_acl yields the same owner_plug_id (or explicitly split the
config_doc_meta by owner_plug_id and process each group separately), and return
an error if inconsistent; this ensures owner resolution is deterministic and
prevents accidental cross-owner grouping.
src/plug_plabels/wflows/learn_note_label_candidates.rs (1)

52-54: ⚡ Quick win

Silent no-ops when only RO config token is available may hide regressions.

Two paths silently discard work:

  1. Line 52-54: if neither rw_config_token nor ro_config_token is present, the routine returns Ok(()) with no log. Whether this is reachable depends on the dispatch ACL, but a tracing::debug! would help diagnose missing config wiring.
  2. Lines 93-110: when merged != proposal_set but only ro_config_token exists, the freshly-computed merge is computed (including LLM call and embedding centroid lookups on the prior lines) and then dropped silently. At minimum log a debug!/warn! so operators can spot this state if it ever occurs.

Also applies to: 93-110

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/plug_plabels/wflows/learn_note_label_candidates.rs` around lines 52 - 54,
The function currently returns early with Ok(()) when both rw_config_token and
ro_config_token are None and silently drops a computed merge when merged !=
proposal_set but only ro_config_token is present; add tracing logs in both
places: (1) before the early return involving rw_config_token and
ro_config_token, emit tracing::debug! including the presence/absence of each
token and any request/dispatch identifiers to help diagnose missing config
wiring; (2) before discarding the freshly-computed merged (the branch where
merged != proposal_set and rw_config_token.is_none() but
ro_config_token.is_some()), emit a tracing::warn! or tracing::debug! with
context such as proposal_set id/length and merged id/length and note that the
merge is being skipped due to read-only config so operators can spot this
unexpected state.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: e6aaa902-8362-41cb-9222-175eb1a4aa8e

📥 Commits

Reviewing files that changed from the base of the PR and between 8a0e2de and feee14d.

⛔ Files ignored due to path filters (3)
  • Cargo.lock is excluded by !**/*.lock and included by **/*
  • docs/DEVDOC/log.md is excluded by !docs/DEVDOC/** and included by **/*
  • docs/DEVDOC/todo.md is excluded by !docs/DEVDOC/** and included by **/*
📒 Files selected for processing (80)
  • CONTRIBUTING.md
  • Cargo.toml
  • flake.nix
  • src/am_utils_rs/repo/iroh.rs
  • src/am_utils_rs/sync.rs
  • src/daybook_cli/main.rs
  • src/daybook_core/e2e/command_invoke_wflow.rs
  • src/daybook_core/e2e/embed_text_wflow.rs
  • src/daybook_core/e2e/plugin_local_index_wflow.rs
  • src/daybook_core/plugs.rs
  • src/daybook_core/rt.rs
  • src/daybook_core/rt/dispatch.rs
  • src/daybook_core/rt/triage.rs
  • src/daybook_core/rt/wash_plugin.rs
  • src/daybook_core/rt/wash_plugin/caps.rs
  • src/daybook_core/rt/wash_plugin/mltools.rs
  • src/daybook_core/wit/main.wit
  • src/daybook_ffi/rt.rs
  • src/daybook_types/Cargo.toml
  • src/daybook_types/doc.rs
  • src/daybook_types/manifest.rs
  • src/daybook_types/reference.rs
  • src/daybook_wflows/lib.rs
  • src/daybook_wflows/wflows/embed_image.rs
  • src/daybook_wflows/wflows/embed_text.rs
  • src/daybook_wflows/wflows/index_embedding.rs
  • src/daybook_wflows/wflows/ocr_image.rs
  • src/daybook_wflows/wflows/test_labeler.rs
  • src/mltools/lib.rs
  • src/plug_dayledger/Cargo.toml
  • src/plug_dayledger/e2e.rs
  • src/plug_dayledger/e2e/common.rs
  • src/plug_dayledger/e2e/parse_hledger_wflow.rs
  • src/plug_dayledger/hledger/claim_matcher.rs
  • src/plug_dayledger/hledger/mod.rs
  • src/plug_dayledger/hledger/parse/common.rs
  • src/plug_dayledger/hledger/parse/journal.rs
  • src/plug_dayledger/hledger/parse/mod.rs
  • src/plug_dayledger/hledger/tests/fixtures/mod.rs
  • src/plug_dayledger/hledger/tests/fixtures/quickstart.journal
  • src/plug_dayledger/hledger/tests/fixtures/sample.journal
  • src/plug_dayledger/hledger/tests/fixtures/sample2.journal
  • src/plug_dayledger/hledger/tests/fixtures/status.journal
  • src/plug_dayledger/hledger/tests/fixtures/unicode.journal
  • src/plug_dayledger/hledger/tests/mod.rs
  • src/plug_dayledger/hledger/types.rs
  • src/plug_dayledger/lib.rs
  • src/plug_dayledger/types.rs
  • src/plug_dayledger/wflows/mod.rs
  • src/plug_dayledger/wflows/parse_hledger.rs
  • src/plug_dayledger/wit/deps/api-utils
  • src/plug_dayledger/wit/deps/daybook
  • src/plug_dayledger/wit/deps/daybook-types
  • src/plug_dayledger/wit/deps/mltools
  • src/plug_dayledger/wit/deps/townframe-sql
  • src/plug_dayledger/wit/deps/utils
  • src/plug_dayledger/wit/deps/wasi-cli-0.2.6/package.wit
  • src/plug_dayledger/wit/deps/wasi-clocks-0.2.6/package.wit
  • src/plug_dayledger/wit/deps/wasi-config-0.2.0-draft/package.wit
  • src/plug_dayledger/wit/deps/wasi-filesystem-0.2.6/package.wit
  • src/plug_dayledger/wit/deps/wasi-http-0.2.6/package.wit
  • src/plug_dayledger/wit/deps/wasi-io-0.2.6/package.wit
  • src/plug_dayledger/wit/deps/wasi-keyvalue-0.2.0-draft/package.wit
  • src/plug_dayledger/wit/deps/wasi-logging-0.1.0-draft/package.wit
  • src/plug_dayledger/wit/deps/wasi-random-0.2.6/package.wit
  • src/plug_dayledger/wit/deps/wasmcloud-postgres-0.1.1-draft/package.wit
  • src/plug_dayledger/wit/deps/wflow
  • src/plug_dayledger/wit/main.wit
  • src/plug_plabels/lib.rs
  • src/plug_plabels/wflows/label_engine.rs
  • src/plug_plabels/wflows/label_image.rs
  • src/plug_plabels/wflows/label_note.rs
  • src/plug_plabels/wflows/learn_image_label_candidates.rs
  • src/plug_plabels/wflows/learn_note_label_candidates.rs
  • src/plug_test/Cargo.toml
  • src/plug_test/e2e.rs
  • src/plug_test/e2e/capability_regression_wflow.rs
  • src/plug_test/e2e/common.rs
  • src/plug_test/lib.rs
  • x/build-a-dayb.ts
💤 Files with no reviewable changes (2)
  • src/daybook_wflows/lib.rs
  • Cargo.toml

Comment thread src/daybook_core/plugs.rs
Comment on lines +171 to +197
doc_acls: vec![RoutineDocAcl {
doc_predicate: DocPredicateClause::HasTag(
WellKnownFacetTag::Note.into(),
),
facet_acl: vec![
RoutineFacetAccess {
owner_plug_id: None,
tag: WellKnownFacetTag::Blob.into(),
key_id: None,
read: true,
write: false,
create: false,
delete: false,
},
RoutineFacetAccess {
owner_plug_id: None,
tag: WellKnownFacetTag::Note.into(),
key_id: None,
read: true,
write: true,
create: false,
delete: false,
},
],
config_facet_acl: vec![],
},
}],
query_acls: vec![],
config_facet_acl: vec![],
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

These routine ACLs no longer match the docs their processors run on.

ocr-image, embed-image, embed-text, and test-label are triggered on docs that are missing the output facet, but these doc_acls require that output tag and only grant write instead of create. Under the new token model that means the workflow won't receive a tag/create token for the first write, so initial OCR/embedding/label creation will fail. Make the ACL predicate match the trigger doc and grant create on the produced facet tag.

Also applies to: 210-237, 249-276, 320-336

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/daybook_core/plugs.rs` around lines 171 - 197, Update each RoutineDocAcl
so the doc_predicate matches the trigger document tag (the tag the processor
watches) and ensure the RoutineFacetAccess entry for the facet the processor
produces sets create: true (not just write). Specifically, for the RoutineDocAcl
instances around the shown diff (and the other ranges mentioned) change the
DocPredicateClause to the trigger tag used by the processor and set the produced
facet's RoutineFacetAccess.create = true (leave read/write as appropriate);
verify the owner_plug_id/tag/key_id entries stay the same. This ensures the
workflow receives a create token for the initial OCR/embedding/label write.

Comment thread src/daybook_core/plugs.rs
Comment on lines +1082 to +1097
} if key == "vtag"
&& patch.path.len() == 3
&& patch.path[1].1 == automerge::Prop::Map("manifests".into()) =>
{
if key == "vtag" {
let Some((_obj, automerge::Prop::Map(plug_id))) = patch.path.get(2) else {
return Ok(());
};
let Some((_obj, automerge::Prop::Map(plug_id))) = patch.path.get(2) else {
return Ok(());
};

let vtag_bytes = match val {
automerge::Value::Scalar(scalar) => match &**scalar {
automerge::ScalarValue::Bytes(bytes) => bytes,
_ => return Ok(()),
},
let vtag_bytes = match val {
automerge::Value::Scalar(scalar) => match &**scalar {
automerge::ScalarValue::Bytes(bytes) => bytes,
_ => return Ok(()),
};
let vtag = VersionTag::hydrate_bytes(vtag_bytes)?;
let event_origin = crate::repos::resolve_origin_from_vtag_actor(
&self.local_actor_id,
&vtag.actor_id,
live_origin,
);
if vtag.version.is_nil() {
out.push(PlugsEvent::PlugAdded {
id: plug_id.clone(),
heads: heads.clone(),
origin: event_origin.clone(),
});
} else {
out.push(PlugsEvent::PlugChanged {
id: plug_id.clone(),
heads: heads.clone(),
origin: event_origin.clone(),
});
}
},
_ => return Ok(()),
};
let vtag = VersionTag::hydrate_bytes(vtag_bytes)?;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't treat malformed manifest vtag patches as a no-op.

Returning Ok(()) here silently drops unexpected Automerge shapes, which can leave the in-memory plug index stale relative to replicated state. Since this path consumes shared external data, surface drift/corruption as an error instead of ignoring it.

Based on learnings: when validating data that comes from external inputs (such as automerge documents and related metadata like dmeta), use eyre::bail!() to return errors rather than panicking or silently recovering so version drift is surfaced safely.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/daybook_core/plugs.rs` around lines 1082 - 1097, The current
patch-handling branch in plugs.rs silently returns Ok(()) for malformed `vtag`
shapes (the match arm that checks key == "vtag" and patch.path len==3 and
path[1] map "manifests"), which can hide replica drift; instead of returning
Ok(()), replace those silent early returns with eyre::bail! calls that surface
an error with context (include identifying info like the patch.key, patch.path
or plug_id when available). Specifically, for the `let Some((_obj,
automerge::Prop::Map(plug_id)))` failure, for non-Bytes scalar, and for
non-Scalar automerge::Value cases, use eyre::bail! with a clear message
indicating a malformed manifest vtag for plug_id (or full path if plug_id
missing); keep VersionTag::hydrate_bytes(...) usage but propagate its error via
eyre as well.

Comment thread src/daybook_core/rt.rs
Comment on lines +1601 to 1616
let DispatchArgs::DocRoutine {
doc_id,
branch_path,
heads,
changed_facet_keys,
wflow_args_json,
} = args;

let invocation = if changed_facet_keys.is_empty() {
dispatch::RoutineInvocation::Command
} else {
dispatch::RoutineInvocation::Processor(dispatch::ProcessorInvocation {
trigger_doc_id: doc_id.clone(),
changed_facet_keys,
})
};
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Don't infer processor-vs-command from changed_facet_keys.is_empty().

Processor-triggered runs can legitimately arrive with no changed keys, so this heuristic rewrites them into RoutineInvocation::Command and drops the processor context. The invocation kind needs to be carried explicitly in DispatchArgs::DocRoutine instead of inferred here.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/daybook_core/rt.rs` around lines 1601 - 1616, Dispatch currently infers
RoutineInvocation kind from changed_facet_keys.is_empty(), which wrongly
converts processor-triggered runs with no keys into Command; update
DispatchArgs::DocRoutine to include an explicit invocation kind (e.g. an enum
field like invocation_kind or routine_kind) and change this construction to use
that field to choose between dispatch::RoutineInvocation::Command and
dispatch::RoutineInvocation::Processor(dispatch::ProcessorInvocation {
trigger_doc_id: doc_id.clone(), changed_facet_keys }), rather than relying on
changed_facet_keys.is_empty(); ensure all callers/builders of
DispatchArgs::DocRoutine are updated to provide the new explicit kind.

Comment thread src/daybook_core/rt.rs
Comment on lines +1618 to +1635
let dispatch_id = {
let mut identity = String::new();
use std::fmt::Write as _;
write!(
&mut identity,
"{}|{}|{}|{}",
doc_id,
am_utils_rs::serialize_commit_heads(heads.as_ref()).join(","),
plug_id,
routine_name
)
.expect("writing to string should never fail");
utils_rs::hash::blake3_hash_bytes(identity.as_bytes())
};
let mut dispatch_id = fixed_dispatch_id
.clone()
.unwrap_or_else(|| format!("{plug_id}/{routine_name}-{dispatch_id}"));

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Include branch_path in the dispatch identity.

The new fingerprint no longer distinguishes branches. If the same processor is queued for two branches that share the same heads, one branch will reuse or suppress the other dispatch even though triage_doc treated them as separate work items.

Suggested fix
         let dispatch_id = {
             let mut identity = String::new();
             use std::fmt::Write as _;
             write!(
                 &mut identity,
-                "{}|{}|{}|{}",
+                "{}|{}|{}|{}|{}",
                 doc_id,
+                branch_path.as_str(),
                 am_utils_rs::serialize_commit_heads(heads.as_ref()).join(","),
                 plug_id,
                 routine_name
             )
             .expect("writing to string should never fail");
             utils_rs::hash::blake3_hash_bytes(identity.as_bytes())
         };
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/daybook_core/rt.rs` around lines 1618 - 1635, The dispatch identity
currently built in the block that creates dispatch_id (using identity string,
serialize_commit_heads, and blake3_hash_bytes) omits branch_path so different
branches with identical heads collide; update the identity construction inside
the let dispatch_id = { ... } block to include branch_path in the formatted
string (alongside doc_id, serialized heads, plug_id, routine_name) and ensure
the fallback fixed_dispatch_id unwrap_or_else string (where
format!("{plug_id}/{routine_name}-{dispatch_id}")) also incorporates branch_path
so the final dispatch key uniquely includes branch_path; adjust references to
dispatch_id, fixed_dispatch_id, serialize_commit_heads, and blake3_hash_bytes
accordingly.

Comment on lines +260 to 277
let changed_facet_keys: Vec<String> = changed_facet_keys
.map(|keys| {
keys.iter()
.filter(|key| {
processor.read_tags.contains(&key.tag.to_string())
|| processor.read_keys.contains(key)
})
.map(|key| key.to_string())
.collect()
})
.unwrap_or_default();
let args = DispatchArgs::DocRoutine {
doc_id: doc_id.clone(),
branch_path: branch_path.clone(),
heads: doc_heads.clone(),
changed_facet_keys,
wflow_args_json: None,
};
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Added/removed-only processor events lose their trigger facet keys.

This payload is built only from changed_facet_keys. Processors that matched via added_facet_keys or removed_facet_keys therefore dispatch with an empty list, so the routine loses the facet context that actually triggered it.

Suggested fix
-            let changed_facet_keys: Vec<String> = changed_facet_keys
-                .map(|keys| {
-                    keys.iter()
-                        .filter(|key| {
-                            processor.read_tags.contains(&key.tag.to_string())
-                                || processor.read_keys.contains(key)
-                        })
-                        .map(|key| key.to_string())
-                        .collect()
-                })
-                .unwrap_or_default();
+            let changed_facet_keys: Vec<String> = changed_facet_keys
+                .into_iter()
+                .flat_map(|keys| keys.iter())
+                .chain(added_facet_keys.into_iter().flat_map(|keys| keys.iter()))
+                .chain(removed_facet_keys.into_iter().flat_map(|keys| keys.iter()))
+                .filter(|key| {
+                    processor.read_tags.contains(&key.tag.to_string())
+                        || processor.read_keys.contains(key)
+                })
+                .map(|key| key.to_string())
+                .collect::<std::collections::BTreeSet<_>>()
+                .into_iter()
+                .collect();
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/daybook_core/rt/triage.rs` around lines 260 - 277, The dispatch payload
is built only from changed_facet_keys so processors that matched via
added_facet_keys or removed_facet_keys lose their trigger facet context; update
the creation of changed_facet_keys (used when constructing
DispatchArgs::DocRoutine) to also include any facet keys from
processor.added_facet_keys and processor.removed_facet_keys that pass the same
read filtering (processor.read_tags.contains or processor.read_keys.contains),
deduplicate and map them to strings, and then use that combined list when
building the DispatchArgs::DocRoutine so routines receive the actual triggering
facet keys.

Comment thread src/plug_dayledger/lib.rs
Comment on lines +192 to +207
processors: [(
"parse-hledger".into(),
Arc::new(ProcessorManifest {
desc: "Parse hledger journal notes into dayledger claims".into(),
deets: ProcessorDeets::DocProcessor {
event_predicate: Default::default(),
routine_name: "parse-hledger".into(),
predicate: DocPredicateClause::And(vec![
DocPredicateClause::HasTag(note_tag.clone()),
DocPredicateClause::FacetFieldMatch {
tag: note_tag,
json_path: "$.mime".into(),
operator: CompareOp::Eq,
value: serde_json::json!("text/x-hledger-journal"),
},
]),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Scope parse-hledger to note changes, not every matching doc update.

event_predicate: Default::default() plus a predicate that stays true after claims are written means every claim write from parse_hledger can enqueue the same processor again on the same doc. That creates self-induced churn and can loop if identical claim writes still advance heads. Restrict the event predicate to Note mutations, or explicitly ignore claim-only updates.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/plug_dayledger/lib.rs` around lines 192 - 207, The DocProcessor for
"parse-hledger" is currently using event_predicate: Default::default(), causing
it to re-fire on any doc change (including claim writes it produces); update the
ProcessorManifest/ProcessorDeets for "parse-hledger" to use an event_predicate
that only matches note mutations (e.g., NoteCreated/NoteUpdated or a
MutationKind filter that requires changes to the Note body) or explicitly ignore
claim-only updates so writes emitted by parse-hledger won't re-enqueue the same
processor; locate the ProcessorManifest with routine_name "parse-hledger" and
replace the default event_predicate with a predicate that scopes to note changes
(referencing event_predicate, ProcessorDeets::DocProcessor, and note_tag).

Comment on lines +34 to +36
world imports {
import logging;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

wasi logging 0.1.0-draft WIT package.wit world imports block official definition

💡 Result:

For wasi logging 0.1.0-draft, the authoritative “package.wit” content is the WIT interface wasi:logging/logging; it defines no “world imports { … }” block. Therefore there is no official world-imports block to extract for this package. 1) Official definition (the WIT interface) The wasi-logging repository’s official WIT file defines an interface logging with: - enum level { trace, debug, info, warn, error, critical } - log: func(level: level, context: string, message: string) [1] 2) Why there’s no “world imports { … }” inside this package A “world { … }” is what declares a component’s imports/exports; the standalone file wit/logging.wit is just an interface contract (it does not declare a world). This matches how other docs describe wasi:logging/logging@0.1.0-draft: it lists the interface types/functions (level, log) rather than a world. [1][2] 3) Where “world imports” blocks show up instead Other WIT packages define worlds that import wasi:logging/logging. For example, wasmCloud’s wasi-observe imports world includes an “interface wasi:logging/logging” import in its own world-imports definition; that block is defined in the importing package, not in wasi-logging itself. [3] If you tell me the exact tooling command/path you’re using to produce “package.wit” (some tools assemble/rename files), I can map that to the corresponding upstream interface/world files in the wasi-logging repo. [1]

Citations:


🏁 Script executed:

#!/bin/bash
# Check whether other vendored dep packages in this PR also contain 'world' declarations,
# to understand if this is intentional/consistent.
rg -rn '^world ' src/plug_dayledger/wit/deps/

Repository: dman-os/townframe

Length of output: 646


Remove world declarations from vendored WASI dep packages.

The official wasi:logging@0.1.0-draft specification does not include a world imports block; it defines only the interface contract (enum and log function). The world imports block at lines 34–36 does not match the upstream definition and does not belong in a vendored dependency package.

This issue is not isolated: all vendored WASI packages in this repository (wasi-keyvalue, wasi-http, wasi-filesystem, wasi-clocks, wasi-config) contain similar world declarations. Per WIT conventions, world blocks should be defined in consuming component packages, not in vendored deps. Some WIT toolchains may fail to resolve or compose correctly when dep packages declare their own worlds.

Either remove these world blocks to align with the official WASI specs and conventions, or document why your toolchain requires them.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/plug_dayledger/wit/deps/wasi-logging-0.1.0-draft/package.wit` around
lines 34 - 36, Remove the erroneous "world imports { import logging; }" block
from the vendored package.wit (wasi-logging-0.1.0-draft) so the file only
defines the logging interface (enums and log function) and no world; update any
other vendored WASI packages (wasi-keyvalue, wasi-http, wasi-filesystem,
wasi-clocks, wasi-config) the same way by deleting their world declarations, or
alternatively add a short comment documenting why a world is required if your
toolchain truly needs it — changes should target the package.wit files and the
"world imports" block referencing the logging symbol so the vendored specs match
the upstream WASI definitions.

Comment on lines +123 to +126
record time-tz {
timesonze: string,
time: time,
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fix public field typo in time-tz record

timesonze is likely a typo for timezone. In a WIT contract this becomes part of generated bindings, so this will propagate a broken field name across all consumers.

Proposed diff
   record time-tz {
-    timesonze: string,
+    timezone: string,
     time: time,
   }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
record time-tz {
timesonze: string,
time: time,
}
record time-tz {
timezone: string,
time: time,
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/plug_dayledger/wit/deps/wasmcloud-postgres-0.1.1-draft/package.wit`
around lines 123 - 126, The record time-tz declares a public field named
timesonze which is a typo; rename the field to timezone in the package.wit
record time-tz so generated bindings use the correct field name, then regenerate
any bindings/stubs (and update any consumers referencing time-tz.timesonze to
time-tz.timezone) to ensure the corrected field propagates across the codebase.

Comment on lines 198 to 230
fn load_or_init_proposal_set(
rw_config_token: Option<&crate::wit::townframe::daybook::capabilities::FacetTokenRw>,
ro_config_token: Option<&crate::wit::townframe::daybook::capabilities::FacetTokenRo>,
rw_config_token: Option<&crate::wit::townframe::daybook::capabilities::FacetToken>,
ro_config_token: Option<&crate::wit::townframe::daybook::capabilities::FacetToken>,
) -> Result<PseudoLabelCandidatesFacet, JobErrorX> {
if let Some(token) = rw_config_token {
if token.exists() {
let raw = token.get();
let facet_raw: daybook_types::doc::FacetRaw =
serde_json::from_str(&raw).map_err(|err| {
JobErrorX::Terminal(ferr!(
"error parsing config proposal set facet json: {err}"
))
})?;
return serde_json::from_value::<PseudoLabelCandidatesFacet>(facet_raw).map_err(
|err| {
JobErrorX::Terminal(ferr!("config facet is not pseudo label candidates: {err}"))
},
);
}

let value = PseudoLabelCandidatesFacet { labels: vec![] };
let raw = token
.get()
.map_err(|err| JobErrorX::Terminal(ferr!("error reading config facet: {err:?}")))?;
let facet_raw: daybook_types::doc::FacetRaw =
serde_json::to_value(value.clone()).map_err(|err| {
JobErrorX::Terminal(ferr!("error serializing default proposal set: {err}"))
serde_json::from_str(&raw).map_err(|err| {
JobErrorX::Terminal(ferr!("error parsing config proposal set facet json: {err}"))
})?;
let facet_raw = serde_json::to_string(&facet_raw).expect(ERROR_JSON);
token
.update(&facet_raw)
.wrap_err("error writing default learned proposal set")
.map_err(JobErrorX::Terminal)?;
return Ok(value);
}

if let Some(token) = ro_config_token {
if !token.exists() {
return Ok(PseudoLabelCandidatesFacet { labels: vec![] });
}
let raw = token.get();
serde_json::from_value::<PseudoLabelCandidatesFacet>(facet_raw).map_err(|err| {
JobErrorX::Terminal(ferr!("config facet is not pseudo label candidates: {err}"))
})
} else if let Some(token) = ro_config_token {
let raw = token
.get()
.map_err(|err| JobErrorX::Terminal(ferr!("error reading ro config facet: {err:?}")))?;
let facet_raw: daybook_types::doc::FacetRaw =
serde_json::from_str(&raw).map_err(|err| {
JobErrorX::Terminal(ferr!(
"error parsing ro config proposal set facet json: {err}"
))
})?;
return serde_json::from_value::<PseudoLabelCandidatesFacet>(facet_raw).map_err(|err| {
serde_json::from_value::<PseudoLabelCandidatesFacet>(facet_raw).map_err(|err| {
JobErrorX::Terminal(ferr!(
"ro config facet is not pseudo label candidates: {err}"
))
});
})
} else {
Ok(PseudoLabelCandidatesFacet { labels: vec![] })
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Load the proposal set from the readable token, not whichever token can update.

load_or_init_proposal_set tries rw_config_token first and calls get() on it. Under the new ACL split, a write-only config grant turns this into a hard failure instead of falling back to ro_config_token or the empty proposal set.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/plug_plabels/wflows/learn_image_label_candidates.rs` around lines 198 -
230, The function load_or_init_proposal_set currently tries rw_config_token
first and fails on get() for write-only grants; change the logic to prefer the
readable token: attempt to read from ro_config_token.get() if ro_config_token is
Some, parse that into PseudoLabelCandidatesFacet, and only if ro_config_token is
None then try rw_config_token.get() (so write-only tokens are only used as a
fallback). Update the branches that call get(), serde_json::from_str and
serde_json::from_value for the tokens (references: load_or_init_proposal_set,
rw_config_token, ro_config_token, get(), PseudoLabelCandidatesFacet) to reflect
this order and preserve the existing error mapping.

Comment thread src/plug_test/lib.rs
Comment on lines +1231 to +1238
"report-full-processor".into(),
Arc::new(CommandManifest {
desc: "report capabilities (full processor)".into(),
deets: CommandDeets::DocCommand {
routine_name: "report-full-processor".into(),
},
}),
),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

report-*-processor won't exercise the processor path as currently wired.

Both entries are still CommandDeets::DocCommand, so report_capabilities will always observe RoutineInvocation::Command for them. If these are meant to cover processor ACLs/invocation, they need real processors manifest entries rather than command aliases.

Also applies to: 1249-1256

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/plug_test/lib.rs` around lines 1231 - 1238, The two manifest entries like
the "report-full-processor" CommandManifest are using CommandDeets::DocCommand
so report_capabilities will only see RoutineInvocation::Command; replace those
DocCommand variants with actual processor manifest entries (e.g., use a
CommandDeets variant that lists processors or a Processor-specific manifest
type) so they register as processors rather than command aliases and thus
exercise the processor ACL/invocation path; make the same change for the sibling
report-* -processor entry as well and ensure the manifest field names match the
code that builds RoutineInvocation::Processor.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant