Enable Alloy clustering to reduce DPM by ~50% #522
Open
yangw-dev wants to merge 1 commit into
tofu plan (arc-cbr-production): ✅ Plan succeeded
1. Add alloy.clustering.enabled=true: tells the helm chart to create a headless service and pass --cluster.join-addresses, enabling peer discovery between replicas. DPM drops from ~2x to ~1x per series.
2. Increase memory limit from 1Gi to 2Gi: prod has more scrape targets than staging, and clustering adds memberlist overhead. Without this, prod Alloy pods are OOMKilled.

No changes to controller type (stays as Deployment), deploy.sh, or smoke tests.

Authored with Claude.
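
For reference, the memory change in item 2 would look roughly like this in a values file. This is a sketch only: it assumes the limit is set through the upstream Grafana Alloy chart's alloy.resources key, and the actual file in this repo may nest these keys differently (for example under a subchart name).

```yaml
# Sketch, not the actual diff: key placement is an assumption.
alloy:
  resources:
    limits:
      memory: 2Gi   # raised from 1Gi per the commit message
```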
ff487e2 to 44f0e0c
Summary
Add alloy.clustering.enabled: true to helm values. This tells the Alloy helm chart to:

- Create a headless service (alloy-cluster) for peer discovery
- Pass --cluster.join-addresses=alloy-cluster to Alloy pods

With this, the two Alloy replicas discover each other and distribute scrape targets ~50/50, instead of both independently scraping all targets (doubling DPM).
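
In values-file form, the setting described above is roughly the following fragment. This is a minimal sketch, assuming the values file configures the upstream Grafana Alloy chart directly; if the chart is pulled in as a subchart, these keys would sit under that subchart's name instead.

```yaml
# Sketch of the helm values change; surrounding keys are omitted.
alloy:
  clustering:
    enabled: true   # chart creates the headless service and passes --cluster.join-addresses
```

Setting this at the chart level matters because the flag and the headless service are generated by the chart, not by the Alloy config itself.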
This is a one-line change: no changes to controller.type (stays as Deployment), deploy.sh, or smoke tests.

Root cause
The Alloy River config already had clustering { enabled = true } on servicemonitors and podmonitors, but the helm chart-level alloy.clustering.enabled was not set. Without it, the chart doesn't create the headless service or pass --cluster.join-addresses, so pods can't discover each other. Logs showed: "no peer discovery configured: both join and discover peers are empty".

Expected impact
Test plan