Skip to content

MAINT/FIX: Generalizing eval-hash#1434

Merged
rlundeen2 merged 6 commits intoAzure:mainfrom
rlundeen2:users/rlundeen/2026_02_26_identifier_refactor
Mar 3, 2026
Merged

MAINT/FIX: Generalizing eval-hash#1434
rlundeen2 merged 6 commits intoAzure:mainfrom
rlundeen2:users/rlundeen/2026_02_26_identifier_refactor

Conversation

@rlundeen2
Copy link
Contributor

@rlundeen2 rlundeen2 commented Mar 3, 2026

Problem

Prior to this PR, eval-hash computation (_build_eval_dict, compute_eval_hash) lived in scorer_metrics_io.py and was tightly coupled to Scorer. Every call site had to import scorer-io specific helpers and manually pass params. This made the code hard to reuse for non-scorer components. Pretty soon, we will want to do evals on attacks, and potentially other pieces.

Additionally, there were some instances in ScorerEvaluator that were using the hash instead of eval_hash. This PR fixes that.

Also fixing unrelated flaky test that didn't patch central memory and failed in build.

Solution

This PR moves eval-hash computation into the identity layer. ComponentIdentifier gains a compute_eval_hash() method that dynamically looks up target keys and child params. Identifiable adds those class vars and a get_eval_hash() convenience method, so any component type can opt into eval-hash filtering by overriding the ClassVars.

All the existing behavior (other than the bug fix) should stay the same.

Copilot AI review requested due to automatic review settings March 3, 2026 00:00
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR generalizes “eval-hash” computation by moving it from scorer-specific IO utilities into the identifiers layer, enabling reuse across other component types (e.g., attacks) and fixing evaluator lookups to consistently use eval_hash rather than the full component hash.

Changes:

  • Added ComponentIdentifier.compute_eval_hash() plus Identifiable.get_eval_hash() and ClassVar-based configuration for defining eval-hash behavior.
  • Updated scorer evaluation registry read/write and lookups to use eval_hash (and updated call sites accordingly).
  • Refactored and relocated eval-hash unit tests from scorer metrics tests into identifier tests; updated evaluator tests for the new lookup key and function names.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
pyrit/identifiers/component_identifier.py Adds eval-hash computation, dynamic eval-config resolution, and Identifiable.get_eval_hash() helper.
pyrit/score/scorer.py Defines scorer eval-hash ClassVars (EVAL_TARGET_CHILD_KEYS, EVAL_BEHAVIORAL_CHILD_PARAMS).
pyrit/score/scorer_evaluation/scorer_metrics_io.py Switches registry lookup and replace semantics to key on eval_hash; requires callers to provide eval_hash.
pyrit/score/scorer_evaluation/scorer_evaluator.py Uses get_eval_hash() for skip-logic and writes metrics keyed by eval hash.
pyrit/score/true_false/true_false_scorer.py Updates objective metrics lookup to use eval hash.
pyrit/score/float_scale/float_scale_scorer.py Updates harm metrics lookup to use eval hash.
pyrit/score/printer/console_scorer_printer.py Updates printer registry lookups to use eval hash.
pyrit/models/scenario_result.py Updates scenario result metrics lookup to use eval hash.
tests/unit/score/test_scorer_metrics.py Updates registry entry assertions to use eval_hash and removes eval-hash implementation tests from this file.
tests/unit/score/test_scorer_evaluator.py Updates mocks/patches to use get_eval_hash() and find_*_by_eval_hash functions.
tests/unit/identifiers/test_component_identifier.py Adds comprehensive tests for eval-hash dict construction, dynamic config resolution, and get_eval_hash().
Comments suppressed due to low confidence (2)

pyrit/identifiers/component_identifier.py:284

  • When only one of target_child_keys / behavioral_child_params is provided and _resolve_eval_config() fails, the other value becomes an empty frozenset(). If target_child_keys is non-empty but behavioral_child_params resolves to empty, target children will have all params stripped from the eval hash, which can incorrectly collapse distinct model configs into the same eval_hash. Consider validating that these two args are provided together (or that resolution succeeded) and falling back to self.hash / raising a ValueError when the eval config is incomplete.
        if target_child_keys is None or behavioral_child_params is None:
            resolved_keys, resolved_params = self._resolve_eval_config()
            if target_child_keys is None:
                target_child_keys = resolved_keys
            if behavioral_child_params is None:
                behavioral_child_params = resolved_params

        if not target_child_keys:
            return self.hash

pyrit/score/scorer_evaluation/scorer_evaluator.py:276

  • scorer_hash now holds the scorer’s eval hash (get_eval_hash()), not the full component hash. Renaming this local (e.g., eval_hash) would reduce confusion and make it clearer which key is used for registry lookups.
            scorer_hash = self.scorer.get_eval_hash()

Copy link
Contributor

@romanlutz romanlutz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@bashirpartovi bashirpartovi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great catch on fixing those call sites to use eval_hash instead of hash, sorry I missed that in my previous PR.

I do want to push back a bit on moving the eval-hash logic onto ComponentIdentifier, though. It kind of undoes the decoupling we just did in the recent refactor.

Just to recap why we changed this in the first place, the old design had a separate identifier class for every component (ScorerIdentifier, TargetIdentifier, etc.). Each one had its own custom fields and escape hatches for weird params. When a scorer captured a target, it reached in and cherry-picked specific fields like model_name and temperature. Converters did the same thing. Basically, components had to know way too much about each other. If someone added frequency_penalty to a target, it wouldn't even show up in the scorer's identity unless someone remembered to update Scorer._create_identifier().

The refactor fixed this by using a single ComponentIdentifier that just holds opaque snapshots. The parent doesn't peek inside anymore. The target's full identity flows in as a child, the hash includes it automatically, and the DB gets the whole unfiltered picture for debugging.

Because the old way was cherry picking fields, it generated lossy data and mashed object identity and eval grouping together. We explicitly broke those apart, ComponentIdentifier handles strict identity, and compute_eval_hash (in scorer_metrics_io.py) handles the behavioral grouping with an allowlist. The identity layer just hashes things, and the eval layer decides what to filter. This also set us up so when attacks need eval hashing later, they just write their own wrappers.

By moving compute_eval_hash to ComponentIdentifier and adding those EVAL_* vars to Identifiable, we're leaking eval logic back into the identity layer. Now every Identifiable component carries eval-specific ClassVars even though only Scorer uses them right now.

The part I'm most worried about is _resolve_eval_config:

def _resolve_eval_config(self) -> tuple[frozenset[str], frozenset[str]]:
    module = importlib.import_module(self.class_module)
    cls = getattr(module, self.class_name, None)

ComponentIdentifier is supposed to be a pure, dumb data snapshot. from_dict() should be able to rebuild it safely from the DB even if the original class doesn't exist anymore. This method breaks that safety. If a class gets renamed or moved, importlib fails silently, returns empty sets, and compute_eval_hash() falls back to self.hash with no filtering. We won't get any errors, just two scorers that should share results silently failing to do so, wasting compute on expensive evals.

It also gives us two different code paths to calculate the same hash, a live instance uses scorer.get_eval_hash() (reading ClassVars directly), while a deserialized one uses identifier.compute_eval_hash() (using importlib). Before this PR, we had one function, one path, explicit constants, and everyone used the same thing.

Recommendation

I'm 100% on board with making this code reusable. We'll probably need eval hashing for other things. We just don't need to complicate the identity type to get there.

What if we move _build_eval_dict and a generic compute_eval_hash into component_identifier.py as module-level functions with explicit parameters? Like this:

def compute_eval_hash(
    identifier: ComponentIdentifier,
    *,
    target_child_keys: frozenset[str],
    behavioral_child_params: frozenset[str],
) -> str:
    eval_dict = _build_eval_dict(
        identifier,
        target_child_keys=target_child_keys,
        behavioral_child_params=behavioral_child_params,
    )
    return config_hash(eval_dict)

Then each eval domain just wraps it with its own constants:

# scorer_metrics_io.py
_TARGET_CHILD_KEYS = frozenset({"prompt_target", "converter_target"})
_BEHAVIORAL_CHILD_PARAMS = frozenset({"model_name", "temperature", "top_p"})

def compute_scorer_eval_hash(identifier: ComponentIdentifier) -> str:
    return compute_eval_hash(
        identifier,
        target_child_keys=_TARGET_CHILD_KEYS,
        behavioral_child_params=_BEHAVIORAL_CHILD_PARAMS,
    )

# attack_metrics_io.py (for future usecase)
def compute_attack_eval_hash(identifier: ComponentIdentifier) -> str:
    return compute_eval_hash(
        identifier,
        target_child_keys=frozenset({"objective_target"}),
        behavioral_child_params=frozenset({"model_name", "temperature"}),
    )

This keeps the generic hashing in the identity module, but leaves the domain-specific config in the domain modules. ComponentIdentifier stays a pure data type (no importlib, no silent failures, no extra ClassVars). Every domain gets a clean 3-line wrapper, and we keep the decoupling intact while still getting all the reuse.

Copilot AI review requested due to automatic review settings March 3, 2026 21:03
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 17 out of 17 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (6)

pyrit/score/scorer.py:86

  • get_eval_hash() introduces a new local import of ScorerEvaluationIdentity. This module doesn’t appear to import Scorer back, so the deferred import may be unnecessary; keeping it at module scope (or under TYPE_CHECKING + a normal import) would reduce per-call overhead and make imports consistent/predictable.
        # Deferred import to avoid circular dependency (scorer_evaluation_identity → identifiers → …)
        from pyrit.score.scorer_evaluation.scorer_evaluation_identity import ScorerEvaluationIdentity

        return ScorerEvaluationIdentity(self.get_identifier()).eval_hash

pyrit/score/printer/console_scorer_printer.py:106

  • print_objective_scorer() adds a function-local import of ScorerEvaluationIdentity. This dependency doesn’t look circular from this module, so it can likely be imported at the top (or under TYPE_CHECKING) to follow the project’s usual import organization and avoid repeated imports on each call.
        from pyrit.score.scorer_evaluation.scorer_evaluation_identity import ScorerEvaluationIdentity
        from pyrit.score.scorer_evaluation.scorer_metrics_io import (
            find_objective_metrics_by_eval_hash,
        )

pyrit/score/printer/console_scorer_printer.py:134

  • print_harm_scorer() adds a function-local import of ScorerEvaluationIdentity. If there’s no circular import here, prefer moving it to the module import section (or TYPE_CHECKING + normal import) for consistency and to avoid repeated imports.
        from pyrit.score.scorer_evaluation.scorer_evaluation_identity import ScorerEvaluationIdentity
        from pyrit.score.scorer_evaluation.scorer_metrics_io import (
            find_harm_metrics_by_eval_hash,
        )

pyrit/score/scorer_evaluation/scorer_evaluator.py:279

  • scorer_hash now holds the scorer’s eval hash (via get_eval_hash()), but the variable name still suggests it’s the raw ComponentIdentifier.hash. Renaming it (e.g., scorer_eval_hash) would reduce ambiguity and make the subsequent lookup calls clearer.
            scorer_hash = self.scorer.get_eval_hash()

            # Determine if this is a harm or objective evaluation
            metrics_type = MetricsType.OBJECTIVE if isinstance(self.scorer, TrueFalseScorer) else MetricsType.HARM

pyrit/identifiers/evaluation_identity.py:163

  • EvaluationIdentity.init will raise a generic AttributeError if a subclass forgets to define TARGET_CHILD_KEYS/BEHAVIORAL_CHILD_PARAMS. Consider adding an explicit validation in init (e.g., via getattr checks) and raising a clearer exception that explains which ClassVars are missing and how to fix it.
    def __init__(self, identifier: ComponentIdentifier) -> None:
        self._identifier = identifier
        self._eval_hash = compute_eval_hash(
            identifier,
            target_child_keys=self.TARGET_CHILD_KEYS,
            behavioral_child_params=self.BEHAVIORAL_CHILD_PARAMS,
        )

pyrit/score/scorer.py:76

  • The PR description mentions adding a generic get_eval_hash() convenience method on Identifiable so non-scorer components can reuse eval-hash logic. In this diff, get_eval_hash() is only added to Scorer; consider either adding the Identifiable-level helper or updating the PR description to match the current scope.
    def get_eval_hash(self) -> str:
        """
        Compute a behavioral equivalence hash for evaluation grouping.

        Delegates to ``ScorerEvaluationIdentity`` which filters target children

Copilot AI review requested due to automatic review settings March 3, 2026 22:20
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 17 out of 17 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (5)

pyrit/score/printer/console_scorer_printer.py:106

  • These methods add additional local imports (ScorerEvaluationIdentity, find_*_by_eval_hash) even though the imported modules don’t appear to depend on console_scorer_printer (so a circular import is unlikely). To keep import structure consistent and easier to maintain, prefer moving these imports to the top of the file (or add a brief comment if a cycle truly requires local imports).
        from pyrit.score.scorer_evaluation.scorer_evaluation_identity import ScorerEvaluationIdentity
        from pyrit.score.scorer_evaluation.scorer_metrics_io import (
            find_objective_metrics_by_eval_hash,
        )

tests/unit/score/test_scorer_evaluation_identity.py:9

  • The module docstring says this file covers the Scorer.get_eval_hash() convenience method, but the tests below compute hashes via ScorerEvaluationIdentity(...) directly and never call Scorer.get_eval_hash(). Either update the docstring to match the actual coverage, or add a minimal concrete Scorer subclass test that exercises get_eval_hash() end-to-end.
"""
Tests for pyrit.score.scorer_evaluation.scorer_evaluation_identity.

Covers ``ScorerEvaluationIdentity`` ClassVar values, eval-hash delegation, and
the ``Scorer.get_eval_hash()`` convenience method.
"""

pyrit/score/scorer_evaluation/scorer_evaluator.py:289

  • scorer_hash now stores self.scorer.get_eval_hash() (an evaluation hash), so the variable name is misleading. Renaming it (e.g., scorer_eval_hash) would make it clearer what is being passed to the *_by_eval_hash lookup helpers.
            scorer_hash = self.scorer.get_eval_hash()

            # Determine if this is a harm or objective evaluation
            metrics_type = MetricsType.OBJECTIVE if isinstance(self.scorer, TrueFalseScorer) else MetricsType.HARM

            existing: Optional[ScorerMetrics] = None
            if metrics_type == MetricsType.HARM:
                if harm_category is None:
                    logger.warning("harm_category must be provided for harm scorer evaluations")
                    return (False, None)
                existing = find_harm_metrics_by_eval_hash(
                    eval_hash=scorer_hash,

pyrit/score/scorer_evaluation/scorer_metrics_io.py:176

  • The find_objective_metrics_by_hash / find_harm_metrics_by_hash APIs were renamed/removed in favor of *_by_eval_hash. If this module is part of the public API surface, this is a breaking change for downstream callers. Consider keeping the old function names as deprecated wrappers (calling the new functions and emitting a deprecation warning) for at least one release cycle.
def find_objective_metrics_by_eval_hash(
    *,
    eval_hash: str,
    file_path: Optional[Path] = None,
) -> Optional[ObjectiveScorerMetrics]:
    """
    Find objective scorer metrics by evaluation hash.

    Args:
        eval_hash (str): The scorer evaluation hash to search for.
        file_path (Optional[Path]): Path to the JSONL file to search.
            If not provided, uses the default path:
            SCORER_EVALS_PATH / "objective" / "objective_achieved_metrics.jsonl"

    Returns:
        ObjectiveScorerMetrics if found, else None.
    """
    if file_path is None:
        file_path = SCORER_EVALS_PATH / "objective" / "objective_achieved_metrics.jsonl"

    return _find_metrics_by_eval_hash(file_path=file_path, eval_hash=eval_hash, metrics_class=ObjectiveScorerMetrics)


def find_harm_metrics_by_eval_hash(
    *,
    eval_hash: str,

pyrit/score/scorer.py:87

  • get_eval_hash() uses a local (deferred) import, but ScorerEvaluationIdentity only depends on pyrit.identifiers and does not import pyrit.score.scorer (so a circular import doesn’t appear to exist). If there’s no real cycle, prefer moving this import to the module top to keep imports centralized; if there is a cycle, it should be documented with the specific modules involved.
        # Deferred import to avoid circular dependency (scorer_evaluation_identity → identifiers → …)
        from pyrit.score.scorer_evaluation.scorer_evaluation_identity import ScorerEvaluationIdentity

        return ScorerEvaluationIdentity(self.get_identifier()).eval_hash

@rlundeen2 rlundeen2 merged commit 17de38a into Azure:main Mar 3, 2026
38 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants