Merged
2 changes: 1 addition & 1 deletion charts/service/Chart.yaml
@@ -2,7 +2,7 @@ apiVersion: v1
appVersion: "1.0"
description: Helm chart creates a deployment, service, hpa for a service along with serviceMonitor etc
name: service
version: 0.0.28
version: 0.0.29
icon: "https://zop.dev/logo.png"
maintainers:
- name: ZopDev
44 changes: 43 additions & 1 deletion charts/service/README.md
@@ -94,16 +94,46 @@ _See [helm uninstall](https://helm.sh/docs/helm/helm_uninstall/) for command doc
| `volumeMounts.secrets` | list | List of Secrets to mount | `[]` |
| `volumeMounts.pvc` | list | List of PVCs to mount | `[]` |

### Security Context Configuration

Security hardening is opt-in. When `securityContext` is not set, the chart preserves legacy behavior (no pod-level security context unless PVC volumes are used). When enabled, it applies pod-level and container-level hardening following Kubernetes security best practices.

| Parameter | Type | Description | Default |
|-----------|------|-------------|---------|
| `securityContext.runAsNonRoot` | boolean | Enforce non-root execution | not set |
| `securityContext.runAsUser` | integer | UID for the pod | not set |
| `securityContext.runAsGroup` | integer | GID for the pod | not set |
| `securityContext.fsGroup` | integer | fsGroup for volume ownership | not set |
| `securityContext.fsGroupChangePolicy` | string | Policy for fsGroup ownership changes | not set |
| `securityContext.seccompProfile` | string | Seccomp profile type (e.g. `RuntimeDefault`) | not set |
| `securityContext.readOnlyRootFilesystem` | boolean | Make container filesystem read-only | not set |

When `securityContext` is enabled, the container also gets `allowPrivilegeEscalation: false` and `capabilities.drop: [ALL]` (unless `Containers.privileged` is true).
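
As a rough sketch of what this means in practice, the fragment below shows the pod spec one would expect the chart to render when every `securityContext` value from the table is set and `Containers.privileged` is false. The values (`1000`, `OnRootMismatch`, `RuntimeDefault`) are taken from the chart's commented-out example, and the container name `my-service` is hypothetical. The actual rendered manifest may differ in ordering or surrounding fields.

```yaml
# Illustrative rendered output, not the chart's guaranteed manifest.
spec:
  securityContext:                  # pod-level settings
    runAsNonRoot: true
    runAsUser: 1000
    runAsGroup: 1000
    fsGroup: 1000
    fsGroupChangePolicy: OnRootMismatch
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: my-service              # hypothetical container name
      securityContext:              # container-level settings
        allowPrivilegeEscalation: false
        capabilities:
          drop:
            - ALL
        readOnlyRootFilesystem: true
```

Note that `readOnlyRootFilesystem` is configured under the same `securityContext` values key but is applied at the container level, since Kubernetes does not define it on the pod-level security context.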

### Alerting Configuration

The chart includes built-in infrastructure alerts using Prometheus rules. Alerts use rollout-aware expressions (via `changes()` on `kube_deployment_status_replicas_updated`) to avoid false positives during node upgrades, rolling updates, and scaling events.
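
As a rough illustration of the rollout-aware pattern, here is how such a rule might look for a hypothetical service named `orders` in a hypothetical `prod` namespace with a threshold of `0`. The `changes(...[10m]) == 0` clause holds only when the updated-replica count has not moved in the last 10 minutes, that is, when no rollout is progressing. One assumption in this sketch differs from the chart's template: the two sides are joined with `and on()`. By default PromQL's `and` matches on identical label sets, and a bare `sum(...)` carries no labels while `changes(...)` keeps the series labels, so `on()` is used here to make the match explicit. Treat this as a sketch to validate against your own Prometheus, not as the chart's exact output.

```yaml
# Sketch of a rollout-aware rule; service name and namespace are hypothetical.
- alert: orders_unavailable_replicas
  expr: >-
    (sum(kube_deployment_status_replicas_unavailable{deployment="orders",namespace="prod"}) > 0)
    and on()
    (changes(kube_deployment_status_replicas_updated{deployment="orders",namespace="prod"}[10m]) == 0)
  for: 10m                # must hold for 10 minutes before firing
  labels:
    severity: warning
    servicealert: "true"
    service: orders
    namespace: prod
```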

| Parameter | Type | Description | Default |
|-----------|------|-------------|---------|
| `alerts.standard.infra.unavailableReplicasThreshold` | integer | Alert if available replicas is less than desired | `0` |
| `alerts.standard.infra.unavailableReplicasThreshold` | integer | Alert if available replicas is less than desired (severity: warning) | `0` |
| `alerts.standard.infra.podRestartThreshold` | integer | Alert if pod restarts exceed threshold | `0` |
| `alerts.standard.infra.hpaNearingMaxPodThreshold` | integer | Alert if replica count exceeds threshold percentage | `80` |
| `alerts.standard.infra.serviceMemoryUtilizationThreshold` | integer | Alert if memory utilization exceeds threshold | `90` |
| `alerts.standard.infra.serviceCpuUtilizationThreshold` | integer | Alert if CPU utilization exceeds threshold | `90` |

#### Alert Severity and Behavior

| Alert | Severity | `for` Duration | Behavior |
|-------|----------|---------------|----------|
| `_pod_below_minimum_replicas` | warning | 10m | Only fires when replicas are below minimum **and** no rollout is in progress |
| `_unavailable_replicas` | warning | 10m | Only fires when replicas are unavailable **and** no rollout is in progress |
| `_deployment_has_zero_replicas` | critical | 5m | Fires on complete outage (zero replicas) |
| `_pod_restarts` | critical | instant | Fires when pod restarts exceed threshold in the configured time window |
| `_hpa_nearing_max_pod_count` | warning | instant | Fires when replica count nears HPA max |

Set any threshold to `-1` to disable the corresponding alert.
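
For instance, a values override along the following lines would disable the unavailable-replicas alert while keeping the others active. The key paths come from the table above; the specific numbers are illustrative only.

```yaml
alerts:
  standard:
    infra:
      unavailableReplicasThreshold: -1   # -1 disables this alert
      podRestartThreshold: 5             # illustrative value
      hpaNearingMaxPodThreshold: 80      # chart default
```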

### Custom Alerts

| Parameter | Type | Description | Default |
@@ -135,6 +165,16 @@ extraAnnotations:
Containers:
privileged: false

# Opt-in: uncomment to enable pod and container security hardening
# securityContext:
# runAsNonRoot: true
# runAsUser: 1000
# runAsGroup: 1000
# fsGroup: 1000
# fsGroupChangePolicy: OnRootMismatch
# seccompProfile: RuntimeDefault
# readOnlyRootFilesystem: true

imagePullSecrets:
# - gcr-secrets
# - acr-secrets
@@ -285,7 +325,9 @@ datastores:
- Environment variable management
- Volume mounting support
- Prometheus metrics integration
- Rollout-aware alerting rules (suppresses false alerts during node upgrades and rolling updates)
- Custom alerting rules
- Opt-in pod and container security hardening (runAsNonRoot, seccomp, read-only filesystem, drop capabilities)
- Horizontal Pod Autoscaling
- Pod Disruption Budget
- Service monitoring
16 changes: 8 additions & 8 deletions charts/service/templates/alerts.yaml
@@ -14,11 +14,11 @@ spec:
# Alert if the number of pods goes down minimum over a period of time.
- alert: {{ snakecase .Values.name }}_pod_below_minimum_replicas
annotations:
description: "Replicas of {{ .Values.name }} are falling short than the minimum required count in {{ .Release.Namespace }} namespace for longer than 3 minutes."
expr: sum(kube_horizontalpodautoscaler_spec_min_replicas{namespace="{{ .Release.Namespace }}", horizontalpodautoscaler="{{ .Values.name }}"}) - sum(kube_deployment_status_replicas_available{namespace="{{ .Release.Namespace }}", deployment="{{ .Values.name }}"}) > 0
for: 3m
description: "Replicas of {{ .Values.name }} are falling short than the minimum required count in {{ .Release.Namespace }} namespace for longer than 10 minutes."
expr: (sum(kube_horizontalpodautoscaler_spec_min_replicas{namespace="{{ .Release.Namespace }}", horizontalpodautoscaler="{{ .Values.name }}"}) - sum(kube_deployment_status_replicas_available{namespace="{{ .Release.Namespace }}", deployment="{{ .Values.name }}"}) > 0) and (changes(kube_deployment_status_replicas_updated{namespace="{{ .Release.Namespace }}", deployment="{{ .Values.name }}"}[10m]) == 0)
for: 10m
labels:
severity: critical
severity: warning
servicealert: "true"
service: {{ .Values.name }}
namespace: {{ .Release.Namespace }}
@@ -38,11 +38,11 @@ spec:
{{- if ne (int .Values.alerts.standard.infra.unavailableReplicasThreshold) -1 }}
- alert: {{ snakecase .Values.name }}_unavailable_replicas
annotations:
description: "One or more replicas of {{ .Values.name }} are currently unavailable in the {{ .Release.Namespace }} namespace."
expr: sum(kube_deployment_status_replicas_unavailable{deployment="{{ .Values.name }}",namespace="{{ .Release.Namespace }}"}) > {{ .Values.alerts.standard.infra.unavailableReplicasThreshold }}
for: 3m
description: "One or more replicas of {{ .Values.name }} are currently unavailable in the {{ .Release.Namespace }} namespace for longer than 10 minutes with no rollout progress."
expr: (sum(kube_deployment_status_replicas_unavailable{deployment="{{ .Values.name }}",namespace="{{ .Release.Namespace }}"}) > {{ .Values.alerts.standard.infra.unavailableReplicasThreshold }}) and (changes(kube_deployment_status_replicas_updated{namespace="{{ .Release.Namespace }}", deployment="{{ .Values.name }}"}[10m]) == 0)
for: 10m
labels:
severity: critical
severity: warning
servicealert: "true"
service: {{ .Values.name }}
namespace: {{ .Release.Namespace }}
45 changes: 37 additions & 8 deletions charts/service/templates/deployment.yaml
@@ -25,13 +25,29 @@ spec:
app: {{ .Values.name }}
spec:
securityContext:
runAsNonRoot: true
runAsUser: {{ .Values.securityContext.runAsUser | default 1000 }}
runAsGroup: {{ .Values.securityContext.runAsGroup | default 1000 }}
fsGroup: {{ .Values.securityContext.fsGroup | default 1000 }}
fsGroupChangePolicy: OnRootMismatch
{{- if .Values.securityContext }}
runAsNonRoot: {{ .Values.securityContext.runAsNonRoot | default false }}
{{- if .Values.securityContext.runAsUser }}
runAsUser: {{ .Values.securityContext.runAsUser }}
{{- end }}
{{- if .Values.securityContext.runAsGroup }}
runAsGroup: {{ .Values.securityContext.runAsGroup }}
{{- end }}
{{- if .Values.securityContext.fsGroup }}
fsGroup: {{ .Values.securityContext.fsGroup }}
{{- end }}
{{- if .Values.securityContext.fsGroupChangePolicy }}
fsGroupChangePolicy: {{ .Values.securityContext.fsGroupChangePolicy }}
{{- end }}
{{- if .Values.securityContext.seccompProfile }}
seccompProfile:
type: RuntimeDefault
type: {{ .Values.securityContext.seccompProfile }}
{{- end }}
{{- else }}
{{- if and (hasKey .Values.volumeMounts "pvc") (not (empty .Values.volumeMounts.pvc)) }}
fsGroup: 1000
{{- end }}
{{- end }}
{{- if .Values.imagePullSecrets}}
imagePullSecrets:
{{- range $v := .Values.imagePullSecrets }}
@@ -109,16 +125,29 @@ spec:
command: {{ tpl (toJson .Values.command) . }}
{{- end}}
securityContext:
{{- if .Values.securityContext }}
{{- if .Values.Containers.privileged }}
allowPrivilegeEscalation: true
readOnlyRootFilesystem: true
{{- else }}
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
capabilities:
drop:
- ALL
{{- end }}
{{- if hasKey .Values.securityContext "readOnlyRootFilesystem" }}
readOnlyRootFilesystem: {{ .Values.securityContext.readOnlyRootFilesystem }}
{{- end }}
{{- else }}
{{- with .Values.Containers }}
{{- if .privileged | default false }}
privileged: true
{{- end }}
{{- end }}
{{- if and (hasKey .Values.volumeMounts "pvc") (not (empty .Values.volumeMounts.pvc)) }}
runAsUser: 1000
runAsGroup: 1000
{{- end }}
{{- end }}
ports:
- name: http-port
containerPort: {{ .Values.httpPort }}
12 changes: 8 additions & 4 deletions charts/service/values.yaml
@@ -12,10 +12,14 @@ extraAnnotations:
Containers:
privileged: false

securityContext:
runAsUser: 1000
runAsGroup: 1000
fsGroup: 1000
# securityContext: # Opt-in: uncomment to enable pod and container security hardening
# runAsNonRoot: true # Enforce non-root execution
# runAsUser: 1000 # UID for the pod
# runAsGroup: 1000 # GID for the pod
# fsGroup: 1000 # fsGroup for volume ownership
# fsGroupChangePolicy: OnRootMismatch # Only change ownership when needed
# seccompProfile: RuntimeDefault # Seccomp profile type
# readOnlyRootFilesystem: true # Make container filesystem read-only

imagePullSecrets:
# - gcr-secrets