Agent performance may change due to model updates, infrastructure changes, or evolving issue types. This document outlines monitoring processes to ensure quality and enable continuous improvement.
The Skald is a weekly rotating role responsible for:
- Conducting weekly review sessions
- Monitoring agent activities and identifying patterns
- Triaging escalated issues
- Preparing weekly status reports
- Tool: Phoenix (link to the new cluster TBD)
- Retention: 2 weeks
- Filter by: Jira issue key in metadata or input.value
- Triage: Issue analysis, decisions, patch validation, Jira updates
- Rebase: Version mapping, specfile updates, builds
- Backport: Patch application, merge conflicts, builds, tests
- MR: Creation, iterations, CI monitoring
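The trace filter listed above (Jira issue key in metadata or `input.value`) can be sketched in Python. This is a minimal illustration that assumes spans have already been exported from Phoenix as plain dictionaries; it does not use the Phoenix API. The `jira_issue` metadata key, the dictionary layout, the `spans_for_issue` helper, and the `RHEL-123` style keys are all hypothetical.

```python
import re


def spans_for_issue(spans, issue_key):
    """Return spans that mention issue_key in metadata or in input.value."""
    # Match the key as a whole token so RHEL-123 does not match RHEL-1234.
    pattern = re.compile(rf"(?<![A-Za-z0-9-]){re.escape(issue_key)}(?![0-9])")
    matched = []
    for span in spans:
        metadata = span.get("metadata") or {}
        in_metadata = any(pattern.search(str(v)) for v in metadata.values())
        in_input = bool(pattern.search(str(span.get("input.value") or "")))
        if in_metadata or in_input:
            matched.append(span)
    return matched


# Hypothetical exported spans; real exports will carry many more fields.
example_spans = [
    {"name": "triage", "metadata": {"jira_issue": "RHEL-123"}, "input.value": ""},
    {"name": "backport", "metadata": {}, "input.value": "Backport fix for RHEL-123"},
    {"name": "rebase", "metadata": {}, "input.value": "Rebase for RHEL-1234"},
]

print([s["name"] for s in spans_for_issue(example_spans, "RHEL-123")])
# prints ['triage', 'backport']
```

Matching on a token boundary rather than a plain substring avoids pulling in traces for a different issue whose key shares a prefix.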
The team primarily tracks these items using Jira dashboards:
- Newly triaged issues and CVEs
- Issues with `ymir_needs_attention` or `ymir_*_errored` labels
- Z-stream issues in current batches
- MR quality, correctness, and completeness
- Agent decision accuracy (triage, patch selection, severity)
Dashboards:
- Labels: `ymir_needs_attention`, `ymir_*_errored`, `ymir_cant_do`
- Build/Test: CI failures, ROG gating failures, test regressions
- Quality: Incorrect patches, incomplete fixes, spec file errors, backwards compatibility issues
- Patterns: Repeated failures by package type or upstream source, declining success rates
- Edge Cases: RHIVOS/FuSa packages, modules, embargoed CVEs
| Level | Channel | Use For |
|---|---|---|
| 1. Team | #forum-ymir-package-automation | Questions, feedback, label reviews |
| 2. Jira | Packit project, jotnar component | Unresolved issues, bugs, feature requests |
| 3. Leadership | Contact team managers directly | Critical CVEs, VP escalations, systemic failures |
| 4. Anonymous | Feedback form | Sensitive concerns |
- RHIVOS/FuSa (24 packages): Maintainer approval required before merge
- Embargoed CVEs: NOT handled by agents; escalate if urgent
Ratio of merged MRs authored by automation to merged MRs authored by human maintainers. A long-term increasing trend indicates successful automation adoption. A stagnant or decreasing trend signals problems with workflows or agent capabilities.
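As a rough sketch of how this metric could be tracked, the snippet below computes the automation share of merged MRs per week and classifies the trend by its least-squares slope. It expresses the metric as a share (automation / total) rather than a raw ratio so that a week with no human-authored MRs does not divide by zero; that choice, the `tolerance` threshold, the function names, and the weekly counts are illustrative assumptions, not the team's actual definition or data.

```python
def automation_share(automation_merged, human_merged):
    """Fraction of merged MRs that were authored by automation."""
    total = automation_merged + human_merged
    return automation_merged / total if total else 0.0


def trend(values, tolerance=0.01):
    """Classify a series by the slope of its least-squares line."""
    n = len(values)
    if n < 2:
        return "stagnant"
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    slope = numerator / denominator
    if slope > tolerance:
        return "increasing"
    if slope < -tolerance:
        return "decreasing"
    return "stagnant"


# Hypothetical weekly counts: (automation-authored, human-authored) merged MRs.
weekly_counts = [(2, 18), (4, 16), (5, 15), (8, 12)]
shares = [automation_share(a, h) for a, h in weekly_counts]
print(trend(shares))  # prints increasing
```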
Number of maintainer tasks automated, broken down by workflow type (triage, rebase, backport, build, merge). A consistent increase demonstrates growing automation value. Declining usage in a specific workflow type indicates that the workflow needs evaluation and improvement.
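The per-workflow breakdown could be computed along these lines. The workflow names come from the list above; the task record layout (a `workflow` field per task), the helper names, and the sample counts are assumptions made for illustration.

```python
from collections import Counter

WORKFLOW_TYPES = ("triage", "rebase", "backport", "build", "merge")


def count_by_workflow(tasks):
    """Count automated tasks per workflow type, including zero counts."""
    counts = Counter(task["workflow"] for task in tasks)
    return {workflow: counts.get(workflow, 0) for workflow in WORKFLOW_TYPES}


def declining_workflows(previous, current):
    """List workflow types whose count dropped between two periods."""
    return [w for w in WORKFLOW_TYPES if current[w] < previous[w]]


# Hypothetical task records for two consecutive weeks.
last_week = [{"workflow": "triage"}] * 5 + [{"workflow": "rebase"}] * 3
this_week = [{"workflow": "triage"}] * 6 + [{"workflow": "rebase"}] * 1

previous = count_by_workflow(last_week)
current = count_by_workflow(this_week)
print(declining_workflows(previous, current))  # prints ['rebase']
```

Reporting zero counts explicitly keeps a workflow that has stopped being used visible in the comparison instead of silently dropping out.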
- Who: Full team, led by Skald
- When: Weekly, 1-2 hours
- Agenda: Metrics, labels, CVEs, error patterns, prioritization
- Output: Issue priorities, prompt recommendations, backlog items, status report input
The team reviews agent results during weekly sessions, identifies failing use cases from Phoenix traces and Jira issues, applies new prompts to address the issues, and reruns agents locally to verify the fixes before re-deployment.
| Label | Meaning | Action |
|---|---|---|
| `ymir_needs_attention` | Requires team review | Weekly priority |
| `ymir_*_errored` | Workflow failed | Review if >3 attempts |
| `ymir_cant_do` | Agent cannot handle | Human takeover |
| `ymir_fusa` | RHIVOS FuSa package | Maintainer approval |
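The label table can be read as a small lookup, sketched below. The labels, actions, and the ">3 attempts" rule are taken from the table; the `actions_for` helper, the `attempts` parameter, and the concrete `ymir_backport_errored` label (one possible match for the `ymir_*_errored` wildcard) are hypothetical.

```python
from fnmatch import fnmatchcase

# (label pattern, action) pairs copied from the table above.
LABEL_ACTIONS = [
    ("ymir_needs_attention", "Weekly priority"),
    ("ymir_*_errored", "Review if >3 attempts"),
    ("ymir_cant_do", "Human takeover"),
    ("ymir_fusa", "Maintainer approval"),
]
ERRORED_ATTEMPT_THRESHOLD = 3


def actions_for(labels, attempts=0):
    """Return (label, action) pairs for the labels that need follow-up."""
    actions = []
    for label in labels:
        for pattern, action in LABEL_ACTIONS:
            if not fnmatchcase(label, pattern):
                continue
            # Errored workflows are reviewed only after more than 3 attempts.
            if pattern == "ymir_*_errored" and attempts <= ERRORED_ATTEMPT_THRESHOLD:
                continue
            actions.append((label, action))
    return actions


issue_labels = ["ymir_backport_errored", "ymir_fusa", "unrelated"]
print(actions_for(issue_labels, attempts=4))
# prints [('ymir_backport_errored', 'Review if >3 attempts'), ('ymir_fusa', 'Maintainer approval')]
```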