Add file policy engine for file tool harness#1585
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces a file access policy engine that allows restricting tool access to specific file paths using glob-style patterns in execpolicy.toml. The implementation includes integration into the tool execution lifecycle, comprehensive tests, and a new feature flag. Feedback from the review highlights a potential security bypass where the apply_patch tool lacks path extraction, performance overhead from repeated regex compilation, and redundant logic when handling default rule sets. Additionally, it was noted that audit events are missing in some enforcement points, which reduces visibility into policy violations.
| pub fn is_file_tool(tool_name: &str) -> bool { | ||
| matches!( | ||
| tool_name, | ||
| "read_file" | "write_file" | "edit_file" | "apply_patch" | ||
| ) | ||
| } | ||
|
|
||
| pub fn extract_file_path<'a>( | ||
| tool_input: &'a serde_json::Value, | ||
| tool_name: &str, | ||
| ) -> Option<&'a str> { | ||
| match tool_name { | ||
| "read_file" | "write_file" | "edit_file" => tool_input.get("path")?.as_str(), | ||
| _ => None, | ||
| } | ||
| } |
There was a problem hiding this comment.
The apply_patch tool is included in is_file_tool but its path extraction is not implemented in extract_file_path. This creates a security bypass where a model can use apply_patch to modify files that are explicitly denied via write_file or edit_file rules. While the design doc notes this limitation, it significantly weakens the effectiveness of the file policy engine, especially since apply_patch is a common way for models to modify code.
| fn pattern_matches_path(pattern: &str, path: &str) -> bool { | ||
| if pattern == "*" { | ||
| return true; | ||
| } | ||
|
|
||
| let mut escaped = regex::escape(pattern); | ||
| escaped = escaped.replace("\\*\\*/", "(?:.*/)?"); | ||
| escaped = escaped.replace("\\*\\*", ".*"); | ||
| escaped = escaped.replace("\\*", "[^/\\\\]*"); | ||
|
|
||
| let re_str = format!("^{escaped}$"); | ||
| let Ok(re) = regex::Regex::new(&re_str) else { | ||
| return false; | ||
| }; | ||
| re.is_match(path) | ||
| } |
There was a problem hiding this comment.
The pattern_matches_path function compiles a new regex::Regex on every call. Since this function is invoked within nested loops in evaluate_file for every file tool execution, this can lead to significant performance overhead. Consider pre-compiling these patterns when the configuration is loaded or using a lazy caching mechanism to store compiled regexes.
| let rule_sets: Vec<&FileRuleSet> = self | ||
| .file_rules | ||
| .get(category) | ||
| .into_iter() | ||
| .chain(self.file_rules.get("default")) | ||
| .collect(); |
There was a problem hiding this comment.
When category is "default", this logic will include the default rule set twice in the rule_sets vector, leading to redundant pattern matching. While not a bug, it's slightly inefficient.
let mut rule_sets = Vec::new();
if let Some(rules) = self.file_rules.get(category) {
rule_sets.push(rules);
}
if category != "default" {
if let Some(rules) = self.file_rules.get("default") {
rule_sets.push(rules);
}
}| if let Some(policy) = file_policy.as_ref() | ||
| && let Some(err) = Engine::evaluate_file_policy(policy, &tool_name, &tool_input) | ||
| { | ||
| return Err(err); | ||
| } |
There was a problem hiding this comment.
This defensive check in execute_tool_with_lock does not emit a tool.file_policy_denied audit event when a call is blocked. This is inconsistent with the enforcement points in turn_loop.rs and reduces visibility into policy violations, especially for sub-calls within parallel tool batches (e.g., multi_tool_use.parallel) which rely on this guard for enforcement.
749117c to
7501651
Compare
|
Updated this PR on top of upstream main/v0.8.35 and addressed the bot review items:
Validation after update:
|
|
This PR was opened before the v0.8.41 rebrand and is now stale. Feel free to rebase onto current |
Adds the file policy engine on top of current main/v0.8.33. Ports the harness-layer file access policy and wires it into serial tool execution, parallel tool batches, and execute_tool_with_lock. Validation: cargo check --workspace; cargo test -p deepseek-tui --bin deepseek-tui file_policy; cargo test -p deepseek-tui --bin deepseek-tui engine::; cargo test -p deepseek-tui --bin deepseek-tui composer_arrows_scroll.