AutoResearch M3: missing test coverage for edge cases and new endpoints #3211

@mrveiss

Summary

Code review of M3 PRs identified several test coverage gaps:

Backend

  • No test for multi-scorer chains (final_score averaging across >1 scorer)
  • No route-level tests for the 10 new API endpoints (only unit/integration tests)
  • No test for `LLMJudgeScorer._parse_rating` with completely unparseable input
  • No test for `ValBpbScorer` when `ExperimentRunner.run_experiment` raises an exception
  • No test for `KnowledgeSynthesizer.synthesize_session` with LLM failure or empty session
  • No test for enriched `_build_document` when `val_bpb=None` but `baseline_val_bpb` is set
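
As a starting point for the first and fourth backend items, here is a minimal sketch of what such tests could look like. It does not use the real AutoResearch classes: `FixedScorer`, `FailingScorer`, and the free function `final_score` are hypothetical stand-ins, and the assumption that the chain takes a plain arithmetic mean and lets scorer exceptions propagate should be checked against the actual implementation before porting this into `scorers_test.py`.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class FixedScorer:
    """Hypothetical stub scorer that always returns the same value."""

    value: float

    def score(self, output: str) -> float:
        return self.value


class FailingScorer:
    """Hypothetical stub that mimics a scorer whose experiment run fails."""

    def score(self, output: str) -> float:
        raise RuntimeError("experiment run failed")


def final_score(scorers, output: str) -> float:
    """Stand-in for the scorer chain: arithmetic mean (assumed behaviour)."""
    if not scorers:
        raise ValueError("at least one scorer is required")
    return mean(s.score(output) for s in scorers)


def test_final_score_averages_multiple_scorers():
    # Three scorers, so the result must differ from any single scorer's value.
    scorers = [FixedScorer(0.2), FixedScorer(0.6), FixedScorer(1.0)]
    assert abs(final_score(scorers, "candidate") - 0.6) < 1e-9


def test_scorer_exception_propagates():
    # Assumption: a failing scorer surfaces its error instead of being skipped.
    scorers = [FixedScorer(0.5), FailingScorer()]
    try:
        final_score(scorers, "candidate")
    except RuntimeError:
        return
    raise AssertionError("expected RuntimeError from the failing scorer")
```

If the real chain is meant to swallow scorer failures (for example by skipping the scorer or recording a sentinel score), the second test should assert that behaviour instead; the point is to pin down whichever contract is intended.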

Frontend

  • Accessibility: buttons lack `aria-label`, status dots rely on color alone, search input has no associated label
  • No test for `ExperimentDashboard` component mount/rendering

Files

  • `autobot-backend/services/autoresearch/scorers_test.py`
  • `autobot-backend/services/autoresearch/prompt_optimizer_test.py`
  • `autobot-backend/services/autoresearch/knowledge_synthesizer_test.py`
  • `autobot-backend/services/autoresearch/routes_test.py`
  • `autobot-frontend/src/components/autoresearch/`

Origin

Discovered during code review of PRs #3202, #3203, #3206, #3207
