feat(observability): split into composable per-component tasks #28
The monolithic install-observability task deploys the entire telemetry stack as a single blob, which doesn't fit on resource-constrained CI runners. Downstream repos like resource-metrics only need Victoria Metrics + OTel Collector for their e2e tests.

Split the observability stack into per-component kustomization directories and Taskfile tasks so consumers can install only what they need. The existing install-observability task is preserved as a thin composite that calls all sub-tasks, maintaining full backward compatibility.

New tasks:
- install-prometheus-crds
- install-victoria-metrics (depends on prometheus-crds)
- install-otel-collector (with webhook retry logic)
- install-grafana (depends on victoria-metrics)
- install-loki
- install-tempo

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
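The task layout described in this commit message could look roughly like the Taskfile fragment below. It is a sketch, not the file from this PR: the task names and the two dependencies come from the commit message, while the `kubectl apply -k` commands, the subdirectory names, and the `run: once` setting are assumptions made for illustration. The webhook retry logic for the OTel Collector is omitted.

```yaml
# Taskfile.yml (illustrative sketch; commands and paths are assumptions)
version: '3'

tasks:
  install-prometheus-crds:
    run: once   # assumption: avoids re-running when reached through several deps
    cmds:
      - kubectl apply --server-side -k components/observability/prometheus-crds

  install-victoria-metrics:
    run: once
    deps: [install-prometheus-crds]
    cmds:
      - kubectl apply -k components/observability/victoria-metrics

  install-otel-collector:
    cmds:
      - kubectl apply -k components/observability/otel-collector

  install-grafana:
    deps: [install-victoria-metrics]
    cmds:
      - kubectl apply -k components/observability/grafana

  install-loki:
    cmds:
      - kubectl apply -k components/observability/loki

  install-tempo:
    cmds:
      - kubectl apply -k components/observability/tempo

  # Thin composite: calls every sub-task in order.
  install-observability:
    cmds:
      - task: install-prometheus-crds
      - task: install-victoria-metrics
      - task: install-otel-collector
      - task: install-grafana
      - task: install-loki
      - task: install-tempo
```

With a layout like this, a constrained runner would call only the tasks it needs, for example `task install-victoria-metrics install-otel-collector`.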
…mponent its own namespace

Problem 1: the composable split had copy-pasted namespace.yaml, helm-repositories.yaml, and datasources/ into each component subdirectory while leaving the originals at the root. That duplicated ~200 lines and caused the root kustomize build to fail on conflicting resources.

- Deleted the root-level namespace.yaml, helm-repositories.yaml, and datasources/ directory.
- Kept a single copy of the datasources under grafana/datasources/ (where the Grafana instance lives).
- Pared each component's helm-repositories.yaml to just the repo that component actually consumes. Gave loki/tempo distinct HelmRepository names (loki-charts, tempo-charts) so the composed root kustomize build does not fail on duplicate source.toolkit.fluxcd.io resources.
- Root components/observability/kustomization.yaml now references the six component subdirectories only.
- Dropped prometheus-crds/namespace.yaml entirely: the kustomization only installs cluster-scoped CRDs, so no namespace is needed.

Problem 2: everything still deployed to telemetry-system, which defeats the point of per-component composition. Each component now has its own namespace so kubectl delete ns <x> cleanly uninstalls it:

- victoria-metrics-system (was telemetry-system)
- grafana-system
- loki-system
- tempo-system
- otel-collector-system

Cross-component references are now fully qualified service DNS names:

- Grafana datasources point at vmsingle/vmalertmanager in victoria-metrics-system, loki-system-loki in loki-system, and tempo-system-tempo in tempo-system.
- The OTel Collector's otlp, loki, and prometheusremotewrite exporters point at the new FQDNs.
- VMAlert's datasource, notifier, and remoteWrite URLs use the new victoria-metrics-system service names.
- VM defaultDashboards.grafanaOperator.allowCrossNamespaceImport is now true so dashboards created in victoria-metrics-system can target the Grafana CR in grafana-system.

Taskfile's per-component waits updated to reference the new namespaces (vmagent/vmsingle in victoria-metrics-system, otel-collector-collector DaemonSet in otel-collector-system). README refreshed to document the subcomponent layout, namespaces, and removal procedure.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
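To illustrate what a fully qualified cross-namespace reference looks like, here is a minimal sketch of a Loki datasource, assuming the datasources are grafana-operator `GrafanaDatasource` resources. Only the service name `loki-system-loki` and the namespaces come from the commit message; the resource name, the instance selector label, the `cluster.local` cluster domain, and port 3100 (Loki's default HTTP port) are assumptions.

```yaml
# Illustrative sketch only; not copied from grafana/datasources/ in this PR.
apiVersion: grafana.integreatly.org/v1beta1
kind: GrafanaDatasource
metadata:
  name: loki                    # hypothetical resource name
  namespace: grafana-system
spec:
  instanceSelector:
    matchLabels:
      dashboards: grafana       # hypothetical label on the Grafana CR
  datasource:
    name: Loki
    type: loki
    access: proxy
    # Pattern: <service>.<namespace>.svc.<cluster-domain>:<port>
    url: http://loki-system-loki.loki-system.svc.cluster.local:3100
```

A short name such as `loki-system-loki:3100` would resolve only from inside loki-system, which is why the namespace has to be part of the hostname once Grafana lives in grafana-system.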
Follow-up commit
Problem 1: removed duplicated shared files
Problem 2: per-component namespaces
Each component deploys to its own namespace so
Cross-component references are now FQDNs:
Taskfile's
Verification
Diff stats
The VM HelmRelease had defaultDashboards.grafanaOperator.enabled set to true, which generates GrafanaDashboard resources. When installed on its own (e.g. `task install-prometheus-crds install-victoria-metrics`) the grafana-operator CRDs are not present and Helm fails with "no matches for kind GrafanaDashboard in version grafana.integreatly.org/v1beta1".

Flip the default to false so per-component installs succeed, and patch it back to true in the root observability kustomization so the full stack continues to ship dashboards via the operator.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
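One way the "patch it back to true" step could be expressed is a JSON 6902 patch in the root kustomization, sketched below. This is an assumption about the mechanism, not the patch from this commit: the HelmRelease name `victoria-metrics` and the subdirectory names `victoria-metrics` and `otel-collector` are hypothetical, and the values path assumes a Flux HelmRelease with chart values under `spec.values`.

```yaml
# components/observability/kustomization.yaml (illustrative sketch)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
  - prometheus-crds
  - victoria-metrics    # assumed directory name
  - otel-collector      # assumed directory name
  - grafana
  - loki
  - tempo

patches:
  # Re-enable operator-managed dashboards only when the full stack,
  # including the grafana-operator CRDs, is installed together.
  - target:
      group: helm.toolkit.fluxcd.io
      kind: HelmRelease
      name: victoria-metrics   # hypothetical HelmRelease name
    patch: |-
      - op: replace
        path: /spec/values/defaultDashboards/grafanaOperator/enabled
        value: true
```

The `replace` operation relies on the component's HelmRelease setting the key explicitly to false; if the key were absent, `add` would be needed instead.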
Stacked fix in b3453bd: flipped
Summary

- ... resource-metrics e2e tests)
- install-observability is now a thin composite that calls all sub-tasks, producing the same result as before

New tasks

- install-prometheus-crds
- install-victoria-metrics (depends on install-prometheus-crds)
- install-otel-collector
- install-grafana (depends on install-victoria-metrics)
- install-loki
- install-tempo
- install-observability

Structure

Each component has its own kustomization.yaml under components/observability/<component>/ that is self-contained (includes namespace + helm-repositories). The root kustomization.yaml references individual files from subdirectories to avoid resource duplication.

Test plan

- kustomize build components/observability produces the same resource set as before
- kustomize build components/observability/<component> works for each component individually
- task --list shows all new tasks
- task install-victoria-metrics install-otel-collector deploys only VM + OTel on a constrained cluster
- task install-observability still deploys the full stack

🤖 Generated with Claude Code