
docs: add observability guide #367

Open: yocaba wants to merge 1 commit into main from docs/obersability_guide



@yocaba (Contributor) commented May 19, 2026

What

Closes #296

Why

Testing

I've created a Grafana dashboard, enabled monitoring and checked the top health signals.

Checklist

  • Tests added/updated: n/a
  • No breaking changes (or upgrade path documented above)
  • Readable commit history (squashed and cleaned up as desired)
  • AI code review considered and comments resolved

Summary by CodeRabbit

  • Documentation
    • Added comprehensive observability guide covering metrics exposure, health probe configuration, RBAC requirements, and monitoring setup instructions for platform reliability and operational visibility.



coderabbitai Bot commented May 19, 2026

Walkthrough

New operator manual documentation describing ARC's observability surface: Prometheus metrics via controller-runtime with no custom ARC metrics, health probes on port 8081, Helm configuration with TLS options, RBAC setup for metrics scraping, key health indicators, and a complete metric catalog for controller, workqueue, reconciliation, and REST client signals.

Changes

Observability Documentation

| Layer / File(s) | Summary |
|---|---|
| Observability Reference Guide (docs/operator-manual/observability.md) | Complete operator manual page documenting ARC's Prometheus metrics surface (controller-runtime metrics with no ARC-specific custom metrics), Kubernetes health probes (/healthz, /readyz on port 8081), Helm-based metrics enabling with port and TLS/cert-manager configuration, RBAC requirements and ClusterRole binding for metrics readers, key health metrics for order and artifactworkflow controllers, and a comprehensive metric catalog table covering controller, workqueue, reconciliation, and REST client metrics with their types and labels. Includes guidance that workflow execution failures are owned by Argo and should be monitored via Argo's Prometheus endpoint. |
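
To illustrate the probe endpoints summarized above, a controller Deployment would typically wire them up roughly as follows. This is a minimal sketch and is not copied from the ARC Helm chart: only the paths /healthz and /readyz and port 8081 come from the guide, while the container name and all timing values are assumptions.

```yaml
# Hypothetical excerpt of a controller pod spec; not taken from the ARC chart.
containers:
  - name: manager                # assumed container name
    livenessProbe:
      httpGet:
        path: /healthz           # path from the guide
        port: 8081               # port from the guide
      initialDelaySeconds: 15    # assumed timing
      periodSeconds: 20          # assumed timing
    readinessProbe:
      httpGet:
        path: /readyz            # path from the guide
        port: 8081               # port from the guide
      initialDelaySeconds: 5     # assumed timing
      periodSeconds: 10          # assumed timing
```

Keeping liveness and readiness on separate paths lets the kubelet restart a hung manager without conflating that with a manager that is merely not ready yet.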

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~5 minutes

Poem

A document blooms on the operator's screen,
Metrics and probes, a health-monitoring dream!
No custom bells, just controller-runtime's song,
With tables and Helm steps to guide you along. 🐰✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description check | ❓ Inconclusive | The pull request description is incomplete. The 'Why' section is empty, and the 'Notes for reviewers' section required for documentation changes is missing. | Fill in the 'Why' section with motivation/context for the observability guide, and add a 'Notes for reviewers' section noting any relevant documentation structure changes or related documentation updates. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately describes the main change: adding an observability guide to the documentation. |
| Linked Issues check | ✅ Passed | The PR fulfills all acceptance criteria from issue #296: adds observability section with top health metrics and comprehensive metric catalog table with types, descriptions, and labels. |
| Out of Scope Changes check | ✅ Passed | All changes are in-scope documentation additions directly addressing the observability guide requirements; no unrelated modifications detected. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |



@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
docs/operator-manual/observability.md (1)

36-38: ⚡ Quick win

Clarify that both ServiceAccount name and namespace need adjustment.

The comment on line 37 notes that the ServiceAccount name should be adjusted, but the namespace on line 38 also needs to be adjusted to match your Prometheus deployment. Consider updating the comment to make this explicit.

📝 Suggested clarification
 subjects:
   - kind: ServiceAccount
-    name: prometheus-kube-prometheus-prometheus   # adjust to your Prometheus SA
-    namespace: monitoring
+    name: prometheus-kube-prometheus-prometheus   # adjust to your Prometheus SA name
+    namespace: monitoring                          # adjust to your Prometheus namespace
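
For context, the suggested subject block would sit inside a complete binding along these lines. This is a sketch under stated assumptions: the binding name and the ClusterRole name in roleRef are hypothetical placeholders, since neither is quoted in this review; only the subjects entries come from the suggestion above.

```yaml
# Sketch of a complete ClusterRoleBinding; names marked hypothetical are not from the guide.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus-metrics-reader                  # hypothetical binding name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: arc-metrics-reader                         # hypothetical; use the ClusterRole named in the guide
subjects:
  - kind: ServiceAccount
    name: prometheus-kube-prometheus-prometheus    # adjust to your Prometheus SA name
    namespace: monitoring                          # adjust to your Prometheus namespace
```

Both subject fields must match the ServiceAccount that the Prometheus pods actually run as; a binding to a ServiceAccount that does not exist is accepted by the API server but grants nothing to the scraper.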
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/operator-manual/observability.md` around lines 36 - 38, Update the
inline comment near the ServiceAccount block so it explicitly states that both
the ServiceAccount name and its namespace must be adjusted to match your
Prometheus deployment; reference the YAML keys 'name:
prometheus-kube-prometheus-prometheus' and 'namespace: monitoring' and change
the comment to something like "adjust to your Prometheus ServiceAccount name and
namespace" so readers know to modify both fields.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5bd74544-e415-491c-8148-9ac4a912e145

📥 Commits

Reviewing files that changed from the base of the PR and between e3863b3 and 52fb202.

📒 Files selected for processing (1)
  • docs/operator-manual/observability.md



Development

Successfully merging this pull request may close these issues.

Add Observability Reference to Operator Manual

2 participants