Fix red team status tracking, cache key mismatch, and evaluation error handling #18
Open
slister1001 wants to merge 2 commits into main
…r handling
Bug 1 - Status tracking: _determine_run_status now treats 'pending' and 'running' entries as 'failed' instead of 'in_progress'. By the time this method runs the scan is finished, so leftover 'pending' entries (from skipped risk categories or Foundry execution failures) indicate failure, not ongoing work.
Bug 2 - Cache key mismatch: _execute_attacks_with_foundry now uses get_attack_objective_from_risk_category() to build the cache lookup key, matching the caching logic in _get_attack_objectives. Previously, ungrounded_attributes objectives were cached under 'isa' but looked up under 'ungrounded_attributes', causing them to be silently skipped.
Bug 3 - Evaluation error handling: RAIServiceScorer now detects when the RAI evaluation service returns an error response (properties.outcome == 'error', e.g. ServiceInvocationException) and raises RuntimeError. This causes PyRIT to treat the score as UNDETERMINED instead of using the erroneous passed=False to incorrectly mark the attack as successful, which was inflating ASR.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Fixes three bugs discovered during the red team SDK bug bash:
Bug 1 - Run status stuck at in_progress: _determine_run_status() now treats leftover pending and running entries as failed instead of in_progress. By the time this method runs the scan is finished, so pending entries (from skipped risk categories or Foundry execution failures) indicate failure, not ongoing work. Affected ~10 scans in the bug bash.
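For reviewers, the intended behavior of Bug 1 can be sketched roughly as below. This is a minimal illustration, not the code in this PR: the function name `determine_run_status`, the flat list of status strings, and the `completed` return value are assumptions made for the example; only the rule that leftover `pending` and `running` entries count as `failed` comes from the description above.

```python
from typing import Iterable

# Statuses that mean "not finished successfully" once the scan is over.
# "pending" and "running" are the ones named in this PR; "failed" is included
# because an explicit failure should obviously fail the run as well.
FAILURE_STATUSES = {"failed", "pending", "running"}


def determine_run_status(entry_statuses: Iterable[str]) -> str:
    """Hypothetical stand-in for _determine_run_status().

    Assumes each scan entry exposes a plain status string. The scan is
    already finished when this runs, so nothing can still be in progress.
    """
    normalized = [status.lower() for status in entry_statuses]
    if any(status in FAILURE_STATUSES for status in normalized):
        return "failed"
    return "completed"
```

With this rule, a finished scan that still has a `pending` entry reports `failed` rather than staying at `in_progress`.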
Bug 2 - ungrounded_attributes silently skipped: _execute_attacks_with_foundry() now uses get_attack_objective_from_risk_category() to build the cache lookup key, matching the caching logic in _get_attack_objectives(). Previously, objectives were cached under 'isa' but looked up under 'ungrounded_attributes', so the lookup missed and the category appeared to have 0 objectives even though the API returned 100.
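The key mismatch in Bug 2 can be reproduced in a few lines. The sketch below is an assumption-laden illustration: `attack_objective_key` is a hypothetical stand-in for get_attack_objective_from_risk_category(), the cache is modeled as a plain dict keyed by a single string, and the only mapping taken from this PR is `ungrounded_attributes` to `isa`.

```python
from typing import Dict, List


def attack_objective_key(risk_category: str) -> str:
    """Hypothetical stand-in for get_attack_objective_from_risk_category().

    Only the ungrounded_attributes -> isa mapping is taken from the PR text;
    every other category is assumed to map to itself.
    """
    return "isa" if risk_category == "ungrounded_attributes" else risk_category


def store_objectives(cache: Dict[str, List[str]], risk_category: str, objectives: List[str]) -> None:
    # Write side: objectives are cached under the mapped key.
    cache[attack_objective_key(risk_category)] = objectives


def lookup_before_fix(cache: Dict[str, List[str]], risk_category: str) -> List[str]:
    # Read side before the fix: raw category name, which never matches "isa".
    return cache.get(risk_category, [])


def lookup_after_fix(cache: Dict[str, List[str]], risk_category: str) -> List[str]:
    # Read side after the fix: same helper as the write side.
    return cache.get(attack_objective_key(risk_category), [])


cache: Dict[str, List[str]] = {}
store_objectives(cache, "ungrounded_attributes", [f"objective-{i}" for i in range(100)])
missed = lookup_before_fix(cache, "ungrounded_attributes")
found = lookup_after_fix(cache, "ungrounded_attributes")
```

Routing both the write and the read through one helper means the two sides cannot drift apart again if the mapping changes.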
Bug 3 - ServiceInvocationException inflating ASR: RAIServiceScorer now detects when the RAI evaluation service returns an error response (properties.outcome == 'error') and raises RuntimeError, causing PyRIT to treat the score as UNDETERMINED. Previously, the erroneous passed=False from error responses was incorrectly treated as attack success, inflating the protected_material ASR from 0% to 50%.
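The error check in Bug 3 amounts to refusing to score an error response. The following is a sketch under stated assumptions: the helper name `raise_on_error_outcome` is hypothetical, and the response is assumed to be a dict with a nested `properties` mapping; the `properties.outcome == 'error'` condition and the use of RuntimeError are the parts taken from the description above.

```python
from typing import Any, Mapping


def raise_on_error_outcome(response: Mapping[str, Any]) -> Mapping[str, Any]:
    """Hypothetical helper: reject evaluation responses that report an error.

    Assumes the response is dict-like with an optional "properties" mapping.
    Raising here keeps a placeholder passed=False from being read as a
    genuine evaluation result.
    """
    properties = response.get("properties") or {}
    if properties.get("outcome") == "error":
        raise RuntimeError(f"Evaluation service returned an error response: {dict(properties)}")
    return response
```

Raising instead of returning a default value leaves the decision to the caller, which per this PR is PyRIT treating the score as UNDETERMINED.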