Feature Request
Summary:
As a user of Typesense Operator, I want to be able to force operator pods (when running multiple replicas for high availability) onto different Kubernetes nodes, so that the operator survives a node failure and HA is actually achieved.
Background:
Currently, Kubernetes may schedule multiple typesense-operator replica pods on the same node unless the user explicitly defines pod anti-affinity. If that node fails, all operator pods become unavailable at once, which undermines fault tolerance.
Requested Behavior:
- Provide an easy/standard mechanism to ensure replicas of the operator are always scheduled on separate nodes.
- This could be done via:
  - Helm chart support for configurable anti-affinity in values.yaml, e.g.:
      controllerManager:
        replicas: 2
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: control-plane
                    operator: In
                    values: [controller-manager]
              topologyKey: kubernetes.io/hostname
  - Or, sensible defaults out-of-the-box (in the chart) when replicas > 1.
- Documentation/example for YAML/kustomize installations on how to configure anti-affinity for operator pods.
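For YAML/kustomize installations, the documentation example could look roughly like the sketch below. It applies the same anti-affinity rule as the values.yaml snippet through a kustomize patch on the operator Deployment. The resource file name and the Deployment name are assumptions made for illustration, and the `control-plane: controller-manager` label is taken from the snippet above; all three would need to be checked against the actual install manifest.

```yaml
# kustomization.yaml
# Sketch only: "operator.yaml" and the Deployment name are hypothetical.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - operator.yaml  # hypothetical: the operator's install manifest
patches:
  - target:
      kind: Deployment
      name: typesense-operator-controller-manager  # hypothetical name
    patch: |-
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: typesense-operator-controller-manager
      spec:
        replicas: 2
        template:
          spec:
            affinity:
              podAntiAffinity:
                # Hard rule: no two matching pods on the same node.
                requiredDuringSchedulingIgnoredDuringExecution:
                  - labelSelector:
                      matchExpressions:
                        - key: control-plane
                          operator: In
                          values: [controller-manager]
                    topologyKey: kubernetes.io/hostname
```

With a `requiredDuringSchedulingIgnoredDuringExecution` rule, a replica stays Pending when no node without a matching pod is available. On small clusters, `preferredDuringSchedulingIgnoredDuringExecution` is the softer alternative: the scheduler tries to spread the pods but still places them if it cannot.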
Benefits:
- Ensures that operator HA deployments withstand individual node failures.
- Provides a safer default for production use cases.
- Reduces ops burden for users who may not be familiar with affinity/anti-affinity settings.
Acceptance Criteria:
References:
Requested by user to improve HA of operator deployments and resilience to node failure.