feat(autoresearch): scorer interface + 3 implementations (#3198) #3202

Merged
mrveiss merged 2 commits into Dev_new_gui from issue-3198-scorers on Apr 1, 2026

mrveiss (Owner) commented Apr 1, 2026

Summary

  • Pluggable PromptScorer ABC with a ScorerResult dataclass (score clamped to 0.0-1.0)
  • ValBpbScorer: experiment-based scoring via ExperimentRunner
  • LLMJudgeScorer: automated 0-10 rating via LLMService, parsed as JSON with a regex fallback
  • HumanReviewScorer: Redis-backed human review queue with polling and a timeout
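
For readers without the diff open, here is a minimal sketch of how the interface and the judge's parsing fallback might fit together. Only the names `PromptScorer` and `ScorerResult` come from this PR. The field names, the synchronous `score` signature, the `parse_rating` helper, the `"rating"` JSON key, and `StubJudgeScorer` are all assumptions made for illustration, and the stub takes a canned reply instead of calling LLMService.

```python
import json
import re
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class ScorerResult:
    """Hypothetical result shape; only the 0.0-1.0 clamp is stated in the PR."""
    score: float
    scorer_name: str = ""
    metadata: Dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Clamp the score into [0.0, 1.0] regardless of what the scorer produced.
        self.score = max(0.0, min(1.0, float(self.score)))


class PromptScorer(ABC):
    """Pluggable scorer interface (signature is an assumption)."""

    @abstractmethod
    def score(self, prompt: str, output: str) -> ScorerResult:
        ...


_NUMBER = re.compile(r"\d+(?:\.\d+)?")


def parse_rating(reply: str) -> Optional[float]:
    """Hypothetical helper: try JSON first, then fall back to the first number."""
    try:
        data = json.loads(reply)
        if isinstance(data, dict) and "rating" in data:
            return float(data["rating"])
    except (ValueError, TypeError):
        pass  # Not valid JSON, or the rating is not numeric: try the regex.
    match = _NUMBER.search(reply)
    return float(match.group()) if match else None


class StubJudgeScorer(PromptScorer):
    """Stand-in for an LLM judge: maps a canned 0-10 reply onto 0.0-1.0."""

    def __init__(self, reply: str) -> None:
        self.reply = reply

    def score(self, prompt: str, output: str) -> ScorerResult:
        rating = parse_rating(self.reply)
        if rating is None:
            return ScorerResult(0.0, "stub_judge", {"error": "unparseable"})
        return ScorerResult(rating / 10.0, "stub_judge", {"raw_rating": rating})
```

Clamping inside `__post_init__` keeps the range guarantee in one place, so an out-of-range judge reply such as 12 still yields a valid score without each scorer repeating the check.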

Closes #3198
Part of #2600 (AutoResearch M3)

Test plan

  • All 11 scorer unit tests pass
  • Existing autoresearch tests unaffected

🤖 Generated with Claude Code

…umanReview scorers (#3198)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ard, data leakage (#3198)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>