Skip to content

fix: Return explicit error when dependent rule is not found during deserialization#103

Open
bashandbone wants to merge 6 commits intomainfrom
fix-rule-not-found-deserialize-9177764991747379115
Open

fix: Return explicit error when dependent rule is not found during deserialization#103
bashandbone wants to merge 6 commits intomainfrom
fix-rule-not-found-deserialize-9177764991747379115

Conversation

@bashandbone
Copy link
Contributor

@bashandbone bashandbone commented Mar 10, 2026

Added an explicit check during rule environment deserialization to return an error when a referenced global, local, or rewriting rule cannot be found rather than silently ignoring the missing dependency. Updated TopologicalSort inside deserialize_env.rs to leverage RuleRegistration to verify the rule's presence, mapping unhandled variants directly to standard TransformErrors. Unit tests correctly mock global dependencies where required.


PR created automatically by Jules for task 9177764991747379115 started by @bashandbone

Summary by Sourcery

Validate dependent rules during environment deserialization and propagate explicit reference errors instead of silently ignoring missing dependencies.

Bug Fixes:

  • Return a specific error when a referenced global, local, or rewriting rule is missing during rule environment deserialization instead of treating it as optional.
  • Propagate referent rule errors from transform ordering into transform errors, rather than assuming only cyclic dependencies.

Enhancements:

  • Extend topological sorting of dependent rules to use rule registration for presence checks, differentiating cyclic and undefined-rule errors.
  • Expose helper methods on rule registration types to query rule existence across local, global, and rewriter scopes.

Tests:

  • Update deserialization tests to correctly register required global rules and cover the new dependency validation behavior.

During RuleConfig parsing, rules that depend on un-registered utilities, rewriters, or global rules were ignored, bypassing cycle checks and generating invalid tree-sitter AST nodes that lacked functionality instead of halting deserialization correctly. This commit injects a context map to resolve rule presence during AST engine topological sorting, and propagates standard parsing errors (UndefinedUtil, DuplicateRule, CyclicRule) robustly.

Co-authored-by: bashandbone <89049923+bashandbone@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 10, 2026 02:01
@google-labs-jules
Copy link
Contributor

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

@sourcery-ai
Copy link
Contributor

sourcery-ai bot commented Mar 10, 2026

Reviewer's Guide

Adds explicit dependency validation during rule environment deserialization so missing referenced rules now surface as structured errors instead of being silently ignored, by threading RuleRegistration into the topological sort, tightening result types, and updating error mapping and tests.

Sequence diagram for rule deserialization with explicit missing dependency errors

sequenceDiagram
    participant Caller
    participant DeserializeEnv
    participant TopologicalSort
    participant RuleRegistration

    Caller->>DeserializeEnv: with_utils(utils)
    DeserializeEnv->>TopologicalSort: get_order(utils, env)

    loop for each rule id
        TopologicalSort->>TopologicalSort: visit(id)
        alt rule id found in maps
            TopologicalSort->>TopologicalSort: DFS visit dependencies
            TopologicalSort-->>DeserializeEnv: push id to order
        else rule id not in maps
            alt env is Some
                TopologicalSort->>RuleRegistration: contains_rule(id)
                alt contains_rule == true
                    TopologicalSort-->>DeserializeEnv: Ok(())
                else contains_rule == false
                    TopologicalSort-->>DeserializeEnv: Err(ReferentRuleError::UndefinedUtil)
                end
            else env is None
                TopologicalSort-->>DeserializeEnv: Ok(())
            end
        end
    end

    alt success
        DeserializeEnv-->>Caller: Ok(env with ordered utils)
    else error
        DeserializeEnv-->>Caller: Err(RuleSerializeError::MatchesReference(ReferentRuleError))
    end
Loading

Sequence diagram for transform order resolution with structured errors

sequenceDiagram
    participant Caller
    participant Transform
    participant DeserializeEnv
    participant TopologicalSort

    Caller->>Transform: from_serializable_transform(env, serializable)
    Transform->>DeserializeEnv: get_transform_order(trans_map)
    DeserializeEnv->>TopologicalSort: get_order(trans_map, None)

    loop for each transform id
        TopologicalSort->>TopologicalSort: visit(id)
        alt cyclic dependency
            TopologicalSort-->>DeserializeEnv: Err(ReferentRuleError::CyclicRule)
        else other error (e.g. UndefinedUtil)
            TopologicalSort-->>DeserializeEnv: Err(ReferentRuleError::Other)
        else success
            TopologicalSort-->>DeserializeEnv: Ok(order)
        end
    end

    alt Ok(order)
        DeserializeEnv-->>Transform: Ok(order)
        Transform-->>Caller: Ok(Transform)
    else Err(ReferentRuleError::CyclicRule(s))
        DeserializeEnv-->>Transform: Err(ReferentRuleError::CyclicRule(s))
        Transform-->>Caller: Err(TransformError::Cyclic(s))
    else Err(ReferentRuleError::Other(e))
        DeserializeEnv-->>Transform: Err(ReferentRuleError::Other(e))
        Transform-->>Caller: Err(TransformError::Cyclic(e.to_string()))
    end
Loading

File-Level Changes

Change Details Files
Tightened topological sort to return structured ReferentRuleError and optionally validate rule existence via RuleRegistration during deserialization.
  • Changed OrderResult to use ReferentRuleError instead of String for error typing.
  • Extended TopologicalSort to hold an optional RuleRegistration reference and accept it in get_order/new constructors.
  • On cyclic detection, return ReferentRuleError::CyclicRule instead of a bare string.
  • When a dependent rule key is missing from the local map, optionally check env.contains_rule and return ReferentRuleError::UndefinedUtil if the rule is not registered.
  • Updated callers in DeserializeEnv::with_utils, from_utils, and get_transform_order to pass the appropriate env (Some for utils, None for globals/transforms) and to propagate ReferentRuleError via RuleSerializeError or directly.
crates/rule-engine/src/rule/deserialize_env.rs
Extended rule registration APIs to support efficient existence checks across local, global, and rewriter rules.
  • Added Registration::contains_key wrapper to query the underlying RapidMap.
  • Added RuleRegistration::contains_rule that checks local, global, and rewriters maps for a given rule id.
crates/rule-engine/src/rule/referent_rule.rs
Updated transform construction to map new ReferentRuleError variants into TransformError while preserving cyclic error behavior.
  • Changed Transform::new to call env.get_transform_order returning ReferentRuleError instead of String.
  • Mapped ReferentRuleError::CyclicRule to TransformError::Cyclic and all other variants to TransformError::Cyclic with the error stringified.
crates/rule-engine/src/transform/mod.rs
Adjusted tests to construct a minimal global rule registration so that referenced global rules exist during utils deserialization.
  • Updated the local-rule deserialization test to create a GlobalRules registration, insert a dummy global rule matcher, and pass that registration into DeserializeEnv before deserializing utils.
crates/rule-engine/src/rule/deserialize_env.rs

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've left some high level feedback:

  • In transform::mod, get_transform_order is always called with env: None, so it can only produce CyclicRule today; consider either simplifying the map_err (dropping the unreachable branch) or, if you expect more variants in future, mapping them to distinct TransformError variants instead of collapsing everything to Cyclic.
  • The new contains_key/contains_rule helpers introduce two subtly different names for similar concepts; consider unifying the naming (e.g., consistently using contains_rule) to make intent clearer and reduce cognitive overhead when reading call sites.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `transform::mod`, `get_transform_order` is always called with `env: None`, so it can only produce `CyclicRule` today; consider either simplifying the `map_err` (dropping the unreachable branch) or, if you expect more variants in future, mapping them to distinct `TransformError` variants instead of collapsing everything to `Cyclic`.
- The new `contains_key`/`contains_rule` helpers introduce two subtly different names for similar concepts; consider unifying the naming (e.g., consistently using `contains_rule`) to make intent clearer and reduce cognitive overhead when reading call sites.

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to make rule-environment deserialization fail fast when a matches reference points to a missing dependency, instead of silently treating the dependency as external/optional. It updates the dependency topological sort to return structured ReferentRuleErrors and adjusts unit tests to ensure required globals are registered.

Changes:

  • Switch TopologicalSort error type from String to ReferentRuleError, and optionally validate missing dependencies against a RuleRegistration.
  • Add contains_key/contains_rule helpers on registrations to support the new dependency validation.
  • Update transform deserialization to adapt to the new ReferentRuleError error type, and update tests to register mocked global dependencies.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
crates/rule-engine/src/transform/mod.rs Adapts transform ordering error handling to the new ReferentRuleError return type.
crates/rule-engine/src/rule/referent_rule.rs Adds registration presence helpers (contains_key, contains_rule) used by deserialization dependency checks.
crates/rule-engine/src/rule/deserialize_env.rs Enhances topological sorting to optionally validate missing dependencies against the rule registration; updates tests accordingly.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +90 to +94
pub(crate) fn contains_rule(&self, id: &str) -> bool {
self.local.contains_key(id)
|| self.global.contains_key(id)
|| self.rewriters.contains_key(id)
}
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RuleRegistration::contains_rule treats rewriter IDs as valid dependencies. However, matches references are resolved only against local + global rules (ReferentRule never consults rewriters), so this can incorrectly allow a local util rule to depend on a rewriter ID and still be treated as “present”, reintroducing silent failures at runtime. Consider either removing rewriters from contains_rule, or splitting this into two APIs (e.g. contains_match_rule vs contains_rewriter) and using the match-specific one for deserialization dependency checks.

Copilot uses AI. Check for mistakes.
Comment on lines 186 to 192
pub fn parse_global_utils(
utils: Vec<SerializableGlobalRule<L>>,
) -> Result<GlobalRules, RuleCoreError> {
let registration = GlobalRules::default();
let utils = into_map(utils);
let order = TopologicalSort::get_order(&utils)
.map_err(ReferentRuleError::CyclicRule)
let order = TopologicalSort::get_order(&utils, None)
.map_err(RuleSerializeError::from)?;
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parse_global_utils calls TopologicalSort::get_order(&utils, None), which means a matches: some-missing-global inside a global util rule will still be treated as an external dependency and won’t error during ordering. Because global rules are parsed with CheckHint::Global (which skips verify_util), this can leave undefined global dependencies undetected and cause silent non-matches at runtime. Passing an env (e.g. a temporary RuleRegistration::from_globals(&registration)) into get_order or otherwise making missing dependencies error for global utils would align with the PR’s goal of failing on missing referenced rules.

Copilot uses AI. Check for mistakes.
Comment on lines +74 to +76
let order = env
.get_transform_order(&map)
.map_err(TransformError::Cyclic)?;
.map_err(|e| match e { ReferentRuleError::CyclicRule(s) => TransformError::Cyclic(s), _ => TransformError::Cyclic(e.to_string()) })?;
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Transform::deserialize maps any non-CyclicRule ReferentRuleError into TransformError::Cyclic(...). Today get_transform_order(..., None) can only emit CyclicRule, but this mapping will produce a misleading “cyclic dependency” error message if additional ReferentRuleError variants ever become possible here. Consider matching only CyclicRule (and treating other variants as unreachable) or introducing a dedicated TransformError variant for non-cyclic dependency errors.

Copilot uses AI. Check for mistakes.
During rule configuration parsing, cyclic utility checks assumed missing rules (e.g. globals or locals) were valid, resolving them to Ok(()) silently. As a result, tests referring to strictly undefined rules were implicitly permitted instead of correctly halted. This patch injects `RuleRegistration` down into the TopologicalSorter and implements a direct `contains_rule` validation, accurately mapping missing components to the newly enforced `ReferentRuleError::UndefinedUtil` mapping. Wait... tests correctly passing explicitly!

Co-authored-by: bashandbone <89049923+bashandbone@users.noreply.github.com>
@bashandbone
Copy link
Contributor Author

@jules please address the suggestions in #103 (review) and #103 (review)

@google-labs-jules
Copy link
Contributor

@jules please address the suggestions in #103 (review) and #103 (review)

I've addressed the feedback from the PR review. I've updated RuleRegistration so that contains_match_rule properly ignores rewriters. I've updated Transform's map_err to explicitly crash on unhandled scenarios instead of collapsing them. I have also deleted the temporary python parsing scripts from the codebase, and validated everything locally.

google-labs-jules bot and others added 2 commits March 10, 2026 02:25
Resolved CR comments:
1. Re-mapped `contains_rule` into `contains_match_rule` so dependencies in Match blocks properly disregard matching onto arbitrary rewriters.
2. In `Transform::deserialize`, narrowed the catchall mapping so anything besides `CyclicRule` triggers an `unreachable!()` as get_transform_order defaults to None environments.
3. Removed arbitrary workspace script files used during refactoring.

Co-authored-by: bashandbone <89049923+bashandbone@users.noreply.github.com>
…379115

Signed-off-by: Adam Poulemanos <89049923+bashandbone@users.noreply.github.com>
@github-actions
Copy link
Contributor

github-actions bot commented Mar 10, 2026

👋 Hey @bashandbone,

Thanks for your contribution to thread! 🧵

You need to agree to the CLA first... 🖊️

Before we can accept your contribution, you need to agree to our Contributor License Agreement (CLA).

To agree to the CLA, please comment:

I read the contributors license agreement and I agree to it.

Those exact words are important1, so please don't change them. 😉

You can read the full CLA here: Contributor License Agreement


@bashandbone has signed the CLA.


You can retrigger this bot by commenting recheck in this Pull Request. Posted by the CLA Assistant Lite bot.

Footnotes

  1. Our bot needs those exact words to recognize that you agree to the CLA.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

crates/rule-engine/src/rule/deserialize_env.rs:184

  • New behavior returns MatchesReference(UndefinedUtil(..)) when a util dependency isn’t found in either the current map or the rule registration. There isn’t a unit test asserting this new error path (only the global-rule registration was adjusted to avoid failure). Add a test that references an undefined global/local util and asserts the exact error variant.
        let order = TopologicalSort::get_order(utils, Some(&self.registration))
            .map_err(RuleSerializeError::MatchesReference)?;
        for id in order {
            let rule = utils.get(id).expect("must exist");
            let rule = self.deserialize_rule(rule.clone())?;
            self.registration.insert_local(id, rule)?;
        }

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +4 to +7
<<<<<<< fix-rule-not-found-deserialize-9177764991747379115
=======
#![allow(unexpected_cfgs)]
>>>>>>> main
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unresolved git merge conflict markers (<<<<<<<, =======, >>>>>>>) are present at the top of this file, which will break compilation and documentation rendering. Resolve the conflict and keep only the intended crate attributes.

Suggested change
<<<<<<< fix-rule-not-found-deserialize-9177764991747379115
=======
#![allow(unexpected_cfgs)]
>>>>>>> main
#![allow(unexpected_cfgs)]

Copilot uses AI. Check for mistakes.
Comment on lines +74 to +79
let order = env.get_transform_order(&map).map_err(|e| match e {
ReferentRuleError::CyclicRule(s) => TransformError::Cyclic(s),
_ => unreachable!(
"get_transform_order uses None for env, so only CyclicRule is possible"
),
})?;
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error mapping uses unreachable! for non-cyclic ReferentRuleError variants. Even if they are currently impossible, this will panic if get_transform_order ever starts returning UndefinedUtil (or other variants) and makes the call site fragile. Prefer handling all variants (e.g., add a TransformError variant that wraps ReferentRuleError, or keep get_transform_order’s error type limited to cyclic-only).

Copilot uses AI. Check for mistakes.
Comment on lines +121 to +125
if let Some(env) = self.env {
// Note: We only check if the key is completely missing
if !env.contains_match_rule(key) {
return Err(ReferentRuleError::UndefinedUtil(key.to_string()));
}
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new presence check only consults contains_match_rule (local/global match rules). If the intent is to also error on missing rewriter references during deserialization (as described in the PR), this logic doesn’t cover rewriters, and missing rewriters can still be silently ignored elsewhere. Consider extending the validation to cover the rewriter scope as well (or clarifying the PR scope if rewriters are intentionally out of band).

Copilot uses AI. Check for mistakes.
google-labs-jules bot and others added 2 commits March 10, 2026 04:51
Resolved CR comments:
1. Re-mapped `contains_rule` into `contains_match_rule` so dependencies in Match blocks properly disregard matching onto arbitrary rewriters.
2. In `Transform::deserialize`, narrowed the catchall mapping so anything besides `CyclicRule` triggers an `unreachable!()` as get_transform_order defaults to None environments.
3. Removed arbitrary workspace script files used during refactoring.

Co-authored-by: bashandbone <89049923+bashandbone@users.noreply.github.com>
Resolved CR comments:
1. Re-mapped `contains_rule` into `contains_match_rule` so dependencies in Match blocks properly disregard matching onto arbitrary rewriters.
2. In `Transform::deserialize`, narrowed the catchall mapping so anything besides `CyclicRule` triggers an `unreachable!()` as get_transform_order defaults to None environments.
3. Cleaned up stray files and resolved CI compilation differences.

Co-authored-by: bashandbone <89049923+bashandbone@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants