FEAT analytics: analyze attack success rate with generic dimensions #1362
romanlutz wants to merge 14 commits into Azure:main from …sr_by_generic_dimensions
Conversation
`pyrit/analytics/result_analysis.py` (Outdated)
    # Resolve deprecated aliases and validate dimension names
    resolved_group_by: list[Union[str, tuple[str, ...]]] = []
    for spec in group_by:
Should we check whether there are repeats in `group_by`? Fringe case, but counts could be off in that specific case.
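A duplicate check along these lines could catch that fringe case early. `check_no_duplicates` is a hypothetical helper, not code from this PR:

```python
from typing import Sequence, Tuple, Union

DimensionSpec = Union[str, Tuple[str, ...]]


def check_no_duplicates(group_by: Sequence[DimensionSpec]) -> None:
    """Reject repeated dimension specs, which would double-count results."""
    seen = set()
    for spec in group_by:
        if spec in seen:
            raise ValueError(f"Duplicate dimension spec in group_by: {spec!r}")
        seen.add(spec)


check_no_duplicates(["attack_type", ("converter_type", "attack_type")])  # fine
```

Both strings and tuples are hashable, so one `set` covers simple and composite specs.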
    "## Composite Dimensions\n",
    "\n",
    "Use a tuple of dimension names to create a cross-product grouping. For example,\n",
    "`(\"converter_type\", \"attack_type\")` produces keys like `(\"Base64Converter\", \"crescendo\")`."
Suggested change:
- "`(\"converter_type\", \"attack_type\")` produces keys like `(\"Base64Converter\", \"crescendo\")`."
+ "`(\"converter_type\", \"attack_type\")` produces keys like `(\"Base64Converter\", \"CrescendoAttack\")`."
    extractors.update(custom_dimensions)

    # Resolve group_by — default to every registered dimension independently
    if group_by is None:
should we warn if group_by = [] but custom_dimensions is non-empty?
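One option is a warning in a hypothetical resolution helper; the names and exact behavior below are illustrative, not the PR's actual code:

```python
import warnings
from typing import Callable, Dict, List, Optional


def resolve_group_by(
    group_by: Optional[List[str]],
    custom_dimensions: Dict[str, Callable],
) -> List[str]:
    """Illustrative: default to all dimensions, warn on unused custom ones."""
    if group_by is None:
        # The real function would also include built-in dimensions here.
        return list(custom_dimensions)
    if not group_by and custom_dimensions:
        warnings.warn(
            "custom_dimensions were provided but group_by is empty; "
            "they will be ignored.",
            UserWarning,
        )
    return group_by
```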
    def _compute_stats(successes: int, failures: int, undetermined: int) -> AttackStats:

    @dataclass
    class AnalysisResult:
maybe not this PR, but should we have a way to pretty print the result?
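A pretty printer could be as small as a line formatter per group key. The sketch below uses a stand-in dataclass; the real `AttackStats`/`AnalysisResult` fields may differ:

```python
from dataclasses import dataclass


@dataclass
class AttackStats:
    """Stand-in for the PR's stats type; field names are assumptions."""
    successes: int
    failures: int
    undetermined: int

    @property
    def success_rate(self) -> float:
        determined = self.successes + self.failures
        return self.successes / determined if determined else 0.0


def pretty_line(key, stats: AttackStats) -> str:
    """One human-readable line per group key."""
    total = stats.successes + stats.failures + stats.undetermined
    return (
        f"{key}: {stats.successes}/{total} succeeded "
        f"({stats.success_rate:.0%} of determined)"
    )


print(pretty_line("CrescendoAttack", AttackStats(2, 1, 1)))
# CrescendoAttack: 2/4 succeeded (67% of determined)
```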
    else:
        # Composite: cross-product of all sub-dimension values
        sub_values = [extractors[name](attack) for name in spec]
        for combo in product(*sub_values):
I might be going down a rabbit hole, but imagine an attack (attack_type_x) with multiple converters, say c1 and c2. This cross product will then include both (c1, attack_type_x) and (c2, attack_type_x), and it gets more complicated if we have, say, custom dimensions that are one-to-many. I wonder if that's a bit misleading to a user, because it wasn't really c1 with attack_type_x that led to a specific result; it was the combination of both converters with attack_type_x. Maybe combinations of converters (or other dimensions) used in one attack should be treated as one entity?
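The difference between the two groupings can be shown with plain tuples (toy data, not PyRIT objects):

```python
from itertools import product

# One attack of type attack_type_x used converters c1 and c2 together.
attack = {"attack_type": ["attack_type_x"], "converter_type": ["c1", "c2"]}

# Cross-product grouping credits each converter separately:
cross_keys = list(product(attack["converter_type"], attack["attack_type"]))
# cross_keys == [('c1', 'attack_type_x'), ('c2', 'attack_type_x')]

# Combined-entity grouping keeps the converter chain as one key:
combined_key = (tuple(attack["converter_type"]), attack["attack_type"][0])
# combined_key == (('c1', 'c2'), 'attack_type_x')
```

Under the cross-product view the single result is counted once per converter; under the combined view the chain `('c1', 'c2')` is counted once.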
Co-authored-by: romanlutz <10245648+romanlutz@users.noreply.github.com>
Force-pushed from 285149a to 866599e.
Pull request overview
This PR generalizes the analyze_results function in pyrit/analytics to support flexible, dimension-based grouping of attack results. Previously the function only returned overall stats and a breakdown by attack identifier. The new design supports arbitrary dimensions (built-in or custom), composite cross-product groupings, and a structured AnalysisResult return type. A DeprecationWarning bridges the old attack_identifier group key to the new attack_type canonical name.
Changes:
- Refactored `analyze_results` in `pyrit/analytics/result_analysis.py` to support pluggable dimension extractors, composite groupings, custom dimensions, and a new `AnalysisResult` dataclass return type.
- Expanded unit tests in `tests/unit/analytics/test_result_analysis.py` with grouped test classes covering validation, per-dimension, composite, custom, and deprecated alias scenarios.
- Added a new documentation notebook (`doc/code/analytics/1_result_analysis.py` + `.ipynb`) showing usage of all new features, and registered it in `doc/_toc.yml`.
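The described design (pluggable extractors, string or tuple specs, cross-product keys) can be sketched as a self-contained toy, with dict-based results standing in for `AttackResult`; this is an illustration of the idea, not the PR's implementation:

```python
from collections import defaultdict
from itertools import product


def analyze(results, extractors, group_by):
    """Toy grouping: [successes, failures] per dimension value."""
    out = {}
    for spec in group_by:
        counts = defaultdict(lambda: [0, 0])
        names = spec if isinstance(spec, tuple) else (spec,)
        for r in results:
            value_lists = [extractors[n](r) for n in names]
            for combo in product(*value_lists):
                key = combo if isinstance(spec, tuple) else combo[0]
                counts[key][0 if r["success"] else 1] += 1
        out[spec] = dict(counts)
    return out


extractors = {
    "attack_type": lambda r: [r["attack"]],
    "converter_type": lambda r: r["converters"],
}
results = [
    {"attack": "CrescendoAttack", "converters": ["Base64Converter"], "success": True},
    {"attack": "CrescendoAttack", "converters": ["Base64Converter"], "success": False},
]
stats = analyze(results, extractors, ["attack_type", ("converter_type", "attack_type")])
print(stats["attack_type"])  # {'CrescendoAttack': [1, 1]}
```

A tuple spec yields composite keys such as `("Base64Converter", "CrescendoAttack")`, matching the cross-product behavior the overview describes.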
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| `pyrit/analytics/result_analysis.py` | Core refactor: new `AnalysisResult`, `DimensionExtractor`, built-in extractors, dimension alias support, and generalized `analyze_results` function |
| `pyrit/analytics/__init__.py` | Exports new public types `AnalysisResult` and `DimensionExtractor` |
| `tests/unit/analytics/test_result_analysis.py` | Comprehensive test expansion with organized test classes and new scenarios |
| `doc/code/analytics/1_result_analysis.py` | New JupyText source notebook — contains critical bugs (non-existent import, wrong constructor arguments, raw dict as `attack_identifier`) |
| `doc/code/analytics/1_result_analysis.ipynb` | Rendered notebook — mirrors the same bugs from the `.py` source |
| `doc/_toc.yml` | Registers the new analytics notebook in the documentation table of contents |
    "# Realistic attack_identifier dicts mirror Strategy.get_identifier() output\n",
    "crescendo_id = {\n",
    "    \"__type__\": \"CrescendoAttack\",\n",
    "    \"__module__\": \"pyrit.executor.attack.multi_turn.crescendo\",\n",
    "    \"id\": \"a1b2c3d4-0001-4000-8000-000000000001\",\n",
    "}\n",
    "red_team_id = {\n",
    "    \"__type__\": \"RedTeamingAttack\",\n",
    "    \"__module__\": \"pyrit.executor.attack.multi_turn.red_teaming\",\n",
    "    \"id\": \"a1b2c3d4-0002-4000-8000-000000000002\",\n",
    "}\n",
The notebook constructs AttackResult objects using raw dicts for attack_identifier (e.g., crescendo_id = {"__type__": "CrescendoAttack", ...}). AttackResult is a @dataclass with no __post_init__ that would coerce a dict to ComponentIdentifier. When analyze_results processes these results, _extract_attack_type calls result.attack_identifier.class_name, which will raise an AttributeError since a dict has no .class_name attribute. The notebook should use ComponentIdentifier objects (or ComponentIdentifier.from_dict()) instead of raw dicts for attack_identifier.
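The fix can be illustrated with a stand-in dataclass; the real `ComponentIdentifier` may have more fields, and the exact `from_dict` signature is an assumption based on this review:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComponentIdentifier:
    """Stand-in for pyrit's ComponentIdentifier; only the fields the
    review mentions (class_name, class_module) are modeled here."""
    class_name: str
    class_module: str

    @classmethod
    def from_dict(cls, d: dict) -> "ComponentIdentifier":
        return cls(class_name=d["__type__"], class_module=d["__module__"])


crescendo_id = ComponentIdentifier.from_dict({
    "__type__": "CrescendoAttack",
    "__module__": "pyrit.executor.attack.multi_turn.crescendo",
})
print(crescendo_id.class_name)  # attribute access works, unlike on a raw dict
```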
    from pyrit.identifiers import ConverterIdentifier
    from pyrit.models import AttackOutcome, AttackResult, MessagePiece


    def make_converter(name: str) -> ConverterIdentifier:
        return ConverterIdentifier(
            class_name=name,
            class_module="pyrit.prompt_converter",
            class_description=f"{name} converter",
            identifier_type="instance",
            supported_input_types=("text",),
            supported_output_types=("text",),
The notebook and .py documentation file import ConverterIdentifier from pyrit.identifiers, but this class does not exist in the codebase. Only ComponentIdentifier is defined and exported from pyrit.identifiers. Additionally, the make_converter function passes fields that ComponentIdentifier does not support (class_description, identifier_type, supported_input_types, supported_output_types). This will cause an ImportError at runtime when the notebook is executed. The import and function body should use ComponentIdentifier(class_name=name, class_module="pyrit.prompt_converter") instead.
Suggested change:
- from pyrit.identifiers import ConverterIdentifier
- from pyrit.models import AttackOutcome, AttackResult, MessagePiece
- def make_converter(name: str) -> ConverterIdentifier:
-     return ConverterIdentifier(
-         class_name=name,
-         class_module="pyrit.prompt_converter",
-         class_description=f"{name} converter",
-         identifier_type="instance",
-         supported_input_types=("text",),
-         supported_output_types=("text",),
+ from pyrit.identifiers import ComponentIdentifier
+ from pyrit.models import AttackOutcome, AttackResult, MessagePiece
+ def make_converter(name: str) -> ComponentIdentifier:
+     return ComponentIdentifier(
+         class_name=name,
+         class_module="pyrit.prompt_converter",
    "from pyrit.identifiers import ConverterIdentifier\n",
    "from pyrit.models import AttackOutcome, AttackResult, MessagePiece\n",
    "\n",
    "\n",
    "def make_converter(name: str) -> ConverterIdentifier:\n",
    "    return ConverterIdentifier(\n",
    "        class_name=name,\n",
    "        class_module=\"pyrit.prompt_converter\",\n",
    "        class_description=f\"{name} converter\",\n",
    "        identifier_type=\"instance\",\n",
    "        supported_input_types=(\"text\",),\n",
    "        supported_output_types=(\"text\",),\n",
The notebook imports ConverterIdentifier from pyrit.identifiers and constructs it with parameters (class_description, identifier_type, supported_input_types, supported_output_types) that don't exist in ComponentIdentifier. ConverterIdentifier is not exported from pyrit.identifiers — only ComponentIdentifier is. This will cause an ImportError when the notebook cell is executed. The import and make_converter function body should use ComponentIdentifier(class_name=name, class_module="pyrit.prompt_converter") instead.
Suggested change:
- "from pyrit.identifiers import ConverterIdentifier\n",
- "from pyrit.models import AttackOutcome, AttackResult, MessagePiece\n",
- "\n",
- "\n",
- "def make_converter(name: str) -> ConverterIdentifier:\n",
- "    return ConverterIdentifier(\n",
- "        class_name=name,\n",
- "        class_module=\"pyrit.prompt_converter\",\n",
- "        class_description=f\"{name} converter\",\n",
- "        identifier_type=\"instance\",\n",
- "        supported_input_types=(\"text\",),\n",
- "        supported_output_types=(\"text\",),\n",
+ "from pyrit.identifiers import ComponentIdentifier\n",
+ "from pyrit.models import AttackOutcome, AttackResult, MessagePiece\n",
+ "\n",
+ "\n",
+ "def make_converter(name: str) -> ComponentIdentifier:\n",
+ "    return ComponentIdentifier(\n",
+ "        class_name=name,\n",
+ "        class_module=\"pyrit.prompt_converter\",\n",
    # ## Composite Dimensions
    #
    # Use a tuple of dimension names to create a cross-product grouping. For example,
    # `("converter_type", "attack_type")` produces keys like `("Base64Converter", "crescendo")`.
The markdown comment says ("converter_type", "attack_type") produces keys like ("Base64Converter", "crescendo"), but the actual output in the notebook shows ("Base64Converter", "CrescendoAttack"). The example key in the comment is incorrect and should be ("Base64Converter", "CrescendoAttack").
Suggested change:
- # `("converter_type", "attack_type")` produces keys like `("Base64Converter", "crescendo")`.
+ # `("converter_type", "attack_type")` produces keys like `("Base64Converter", "CrescendoAttack")`.
    crescendo_id = {
        "__type__": "CrescendoAttack",
        "__module__": "pyrit.executor.attack.multi_turn.crescendo",
        "id": "a1b2c3d4-0001-4000-8000-000000000001",
    }
    red_team_id = {
        "__type__": "RedTeamingAttack",
        "__module__": "pyrit.executor.attack.multi_turn.red_teaming",
        "id": "a1b2c3d4-0002-4000-8000-000000000002",
    }
The notebook constructs AttackResult objects using raw dicts for attack_identifier (e.g., crescendo_id = {"__type__": "CrescendoAttack", ...}). However, AttackResult is a plain @dataclass with no __post_init__ that would coerce a dict to a ComponentIdentifier. When analyze_results processes these results, _extract_attack_type calls result.attack_identifier.class_name, which will raise an AttributeError because a dict has no .class_name attribute. The notebook should use ComponentIdentifier objects (or the ComponentIdentifier.from_dict() factory) instead of raw dicts for attack_identifier.
    "## Composite Dimensions\n",
    "\n",
    "Use a tuple of dimension names to create a cross-product grouping. For example,\n",
    "`(\"converter_type\", \"attack_type\")` produces keys like `(\"Base64Converter\", \"crescendo\")`."
Same documentation inaccuracy as in the .py file: the markdown comment claims ("converter_type", "attack_type") produces keys like ("Base64Converter", "crescendo"), but the actual output in the same cell (line 340) shows ('Base64Converter', 'CrescendoAttack'). The example should say ("Base64Converter", "CrescendoAttack"), not ("Base64Converter", "crescendo").
Suggested change:
- "`(\"converter_type\", \"attack_type\")` produces keys like `(\"Base64Converter\", \"crescendo\")`."
+ "`(\"converter_type\", \"attack_type\")` produces keys like `(\"Base64Converter\", \"CrescendoAttack\")`."
    canonical = _DEPRECATED_DIMENSION_ALIASES.get(name)
    if canonical and canonical in extractors:
        warnings.warn(
            f"Dimension '{name}' is deprecated and will be removed in v0.13.0. Use '{canonical}' instead.",
The deprecation warning message uses the format "will be removed in v0.13.0" (with a v prefix), but other deprecation messages throughout the codebase consistently use "will be removed in 0.13.0" (without the v prefix). For example, see pyrit/models/message.py:190 and pyrit/models/scenario_result.py:71. This should be updated to match the established convention.
Suggested change:
- f"Dimension '{name}' is deprecated and will be removed in v0.13.0. Use '{canonical}' instead.",
+ f"Dimension '{name}' is deprecated and will be removed in 0.13.0. Use '{canonical}' instead.",
    "from pyrit.analytics import analyze_results\n",
    "from pyrit.identifiers import ConverterIdentifier\n",
    "from pyrit.models import AttackOutcome, AttackResult, MessagePiece\n",
    "\n",
    "\n",
    "def make_converter(name: str) -> ConverterIdentifier:\n",
    "    return ConverterIdentifier(\n",
    "        class_name=name,\n",
    "        class_module=\"pyrit.prompt_converter\",\n",
    "        class_description=f\"{name} converter\",\n",
    "        identifier_type=\"instance\",\n",
    "        supported_input_types=(\"text\",),\n",
    "        supported_output_types=(\"text\",),\n",
    "    )\n",
The same ConverterIdentifier import and usage issue exists in the notebook's .ipynb form. The import on line 41 and the make_converter function using ConverterIdentifier (lines 45-53) will raise ImportError: cannot import name 'ConverterIdentifier' from 'pyrit.identifiers' when the notebook is executed.
Suggested change:
- "from pyrit.analytics import analyze_results\n",
- "from pyrit.identifiers import ConverterIdentifier\n",
- "from pyrit.models import AttackOutcome, AttackResult, MessagePiece\n",
- "\n",
- "\n",
- "def make_converter(name: str) -> ConverterIdentifier:\n",
- "    return ConverterIdentifier(\n",
- "        class_name=name,\n",
- "        class_module=\"pyrit.prompt_converter\",\n",
- "        class_description=f\"{name} converter\",\n",
- "        identifier_type=\"instance\",\n",
- "        supported_input_types=(\"text\",),\n",
- "        supported_output_types=(\"text\",),\n",
- "    )\n",
+ "from typing import Dict\n",
+ "from pyrit.analytics import analyze_results\n",
+ "from pyrit.models import AttackOutcome, AttackResult, MessagePiece\n",
+ "\n",
+ "\n",
+ "def make_converter(name: str) -> Dict[str, str]:\n",
+ "    return {\n",
+ "        \"__type__\": name,\n",
+ "        \"__module__\": \"pyrit.prompt_converter\",\n",
+ "    }\n",
    # Realistic attack_identifier dicts mirror Strategy.get_identifier() output
    crescendo_id = {
        "__type__": "CrescendoAttack",
        "__module__": "pyrit.executor.attack.multi_turn.crescendo",
        "id": "a1b2c3d4-0001-4000-8000-000000000001",
    }
    red_team_id = {
        "__type__": "RedTeamingAttack",
        "__module__": "pyrit.executor.attack.multi_turn.red_teaming",
        "id": "a1b2c3d4-0002-4000-8000-000000000002",
    }

    # Build a small set of representative attack results
    results = [
        # Crescendo attacks with Base64Converter
        AttackResult(
            conversation_id="c1",
            objective="bypass safety filter",
            attack_identifier=crescendo_id,
            outcome=AttackOutcome.SUCCESS,
            last_response=MessagePiece(
                role="user",
                original_value="response 1",
                converter_identifiers=[make_converter("Base64Converter")],
                labels={"operation_name": "op_safety_bypass", "operator": "alice"},
            ),
        ),
        AttackResult(
            conversation_id="c2",
            objective="bypass safety filter",
            attack_identifier=crescendo_id,
            outcome=AttackOutcome.FAILURE,
            last_response=MessagePiece(
                role="user",
                original_value="response 2",
                converter_identifiers=[make_converter("Base64Converter")],
                labels={"operation_name": "op_safety_bypass", "operator": "alice"},
            ),
        ),
        # Red teaming attacks with ROT13Converter
        AttackResult(
            conversation_id="c3",
            objective="extract secrets",
            attack_identifier=red_team_id,
            outcome=AttackOutcome.SUCCESS,
            last_response=MessagePiece(
                role="user",
                original_value="response 3",
                converter_identifiers=[make_converter("ROT13Converter")],
                labels={"operation_name": "op_secret_extract", "operator": "bob"},
            ),
        ),
        AttackResult(
            conversation_id="c4",
            objective="extract secrets",
            attack_identifier=red_team_id,
            outcome=AttackOutcome.SUCCESS,
            last_response=MessagePiece(
                role="user",
                original_value="response 4",
                converter_identifiers=[make_converter("ROT13Converter")],
                labels={"operation_name": "op_secret_extract", "operator": "bob"},
            ),
        ),
        # An undetermined result (no converter, no labels)
        AttackResult(
            conversation_id="c5",
            objective="test prompt",
            attack_identifier=crescendo_id,
            outcome=AttackOutcome.UNDETERMINED,
        ),
    ]
The notebook passes plain Python dicts (e.g., crescendo_id, red_team_id) as the attack_identifier argument when constructing AttackResult objects. Unlike MessagePiece, the AttackResult dataclass has no __post_init__ and performs no normalization, so the dict is stored as-is on attack.attack_identifier. When _extract_attack_type accesses result.attack_identifier.class_name, it tries to access .class_name on a plain dict, which raises AttributeError: 'dict' object has no attribute 'class_name'. The notebook should pass a ComponentIdentifier instance directly instead of a raw dict.
    # ---------------------------------------------------------------------------
    # Single dimension: attack_identifier
The section comment header says "Single dimension: attack_identifier" but the class below it is TestGroupByAttackType, which groups by attack_type. The dimension was renamed to attack_type, so the comment should be updated to "Single dimension: attack_type" to stay consistent with the rest of the file and reflect the current dimension name.
Suggested change:
- # Single dimension: attack_identifier
+ # Single dimension: attack_type
    warnings.warn(
        f"Dimension '{name}' is deprecated and will be removed in v0.13.0. Use '{canonical}' instead.",
        DeprecationWarning,
        stacklevel=4,
The stacklevel=4 in _resolve_dimension_name is correct when the caller path is analyze_results → _resolve_dimension_spec (simple string, line 189) → _resolve_dimension_name (4 frames). However, when a composite tuple triggers this path via the generator expression tuple(_resolve_dimension_name(...) for n in spec) at line 190, an extra stack frame is introduced by the generator, making the effective call stack 5 frames deep. As a result, the deprecation warning's source location will incorrectly point to _resolve_dimension_spec rather than the user's analyze_results call site when the deprecated alias is used inside a composite tuple. Consider passing stacklevel as a parameter to _resolve_dimension_name, or passing a flag to use a different value when invoked from a generator context.
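One way to compensate, sketched with stand-in names (the alias table and exact frame depths are assumptions based on this review):

```python
import warnings


def resolve_name(name: str, *, stacklevel: int = 4) -> str:
    """Stand-in for _resolve_dimension_name with a caller-chosen stacklevel."""
    if name == "attack_identifier":
        warnings.warn(
            f"Dimension '{name}' is deprecated and will be removed in 0.13.0. "
            "Use 'attack_type' instead.",
            DeprecationWarning,
            stacklevel=stacklevel,
        )
        return "attack_type"
    return name


def resolve_spec(spec):
    if isinstance(spec, str):
        return resolve_name(spec)
    # The generator expression adds one stack frame, so compensate with +1
    # to keep the warning pointing at the user's call site.
    return tuple(resolve_name(n, stacklevel=5) for n in spec)
```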
    DEFAULT_DIMENSIONS: dict[str, DimensionExtractor] = {
        "attack_type": _extract_attack_type,
        "converter_type": _extract_converter_types,
        "label": _extract_labels,
    }
The DEFAULT_DIMENSIONS constant uses a public naming convention (uppercase, no underscore) suggesting it is part of the public API. However, it is not exported in pyrit/analytics/__init__.py's __all__ list, while DimensionExtractor (needed together with DEFAULT_DIMENSIONS for extensions) is exported. Users who want to extend or inspect the built-in dimension registry (a reasonable use case given the custom dimensions feature) would need to import directly from pyrit.analytics.result_analysis, which is not documented. Either export DEFAULT_DIMENSIONS in __all__, or prefix it with an underscore to signal it is internal.
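If `DEFAULT_DIMENSIONS` stays public, extension would look roughly like this; the lambdas and dict-shaped results are stand-ins, not PyRIT's actual extractors:

```python
# Stand-in for the module-level registry described in the review.
DEFAULT_DIMENSIONS = {
    "attack_type": lambda r: [r["attack"]],
    "converter_type": lambda r: r["converters"],
}


def extract_operator(result):
    """Custom dimension: the operator recorded in the result's labels."""
    return [result.get("labels", {}).get("operator", "unknown")]


# Copy the registry instead of mutating the shared module-level dict.
extractors = {**DEFAULT_DIMENSIONS, "operator": extract_operator}
print(sorted(extractors))  # ['attack_type', 'converter_type', 'operator']
```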
    - file: code/scoring/prompt_shield_scorer
    - file: code/scoring/generic_scorers
    - file: code/scoring/8_scorer_metrics
    - file: code/analytics/1_result_analysis
The 1_result_analysis notebook is added to _toc.yml at the top level with no parent page or section, positioned between the scoring section and the memory section. By contrast, other notebooks like scoring (code/scoring/0_scoring) have a dedicated parent section entry. The analytics notebook should either have a parent analytics section page (e.g., code/analytics/0_analytics) or be nested under an existing relevant section, rather than appearing as a standalone top-level entry without a parent section.
Suggested change:
- - file: code/analytics/1_result_analysis
+ - file: code/analytics/0_analytics
+   sections:
+   - file: code/analytics/1_result_analysis
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Description
Generalizing `analyze_results` to work with attack type, converter type, labels, or any dimension we can define from attack results.

Tests and Documentation

Unit tests and new notebook.