5 changes: 5 additions & 0 deletions .github/ci/ct.yaml
@@ -0,0 +1,5 @@
chart-dirs:
- charts
helm-extra-args: "--timeout=5m"
check-version-increment: false
target-branch: main
81 changes: 81 additions & 0 deletions .github/workflows/helm.yaml
@@ -0,0 +1,81 @@
name: Helm

on:
  push:
    branches:
      - main
      - release-*
    paths:
      - 'api/**'
      - 'charts/**'
      - 'config/crd/**'
      - '.github/workflows/helm.yaml'
      - '.github/ci/ct.yaml'
      - 'hack/verify-chart-drift.sh'
      - 'Makefile'
  pull_request:
    paths:
      - 'api/**'
      - 'charts/**'
      - 'config/crd/**'
      - '.github/workflows/helm.yaml'
      - '.github/ci/ct.yaml'
      - 'hack/verify-chart-drift.sh'
      - 'Makefile'

jobs:
  lint-and-test:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Set up Helm
        uses: azure/setup-helm@v4.2.0
        with:
          version: v3.15.1

      - uses: actions/setup-python@v5.1.1
        with:
          python-version: 3.12

      - uses: actions/setup-go@v5
        with:
          go-version-file: 'go.mod'

      - name: Run chart drift check
        run: |
          hack/verify-chart-drift.sh

      - name: Set up chart-testing
        uses: helm/chart-testing-action@v2.8.0
        with:
          version: v3.11.0

      - name: Install Helm Unit Test Plugin
        run: |
          helm plugin install --version 1.0.3 https://github.com/helm-unittest/helm-unittest

      - name: Run Helm Unit Tests
        run: |
          helm unittest charts/nrr-controller --strict -d

      - name: Run chart-testing (list-changed)
        id: list-changed
        run: |
          changed=$(ct list-changed --config=.github/ci/ct.yaml)
          if [[ -n "$changed" ]]; then
            echo "changed=true" >> "$GITHUB_OUTPUT"
          fi

      - name: Run chart-testing (lint)
        run: ct lint --config=.github/ci/ct.yaml --validate-maintainers=false

      # Need a multi node cluster so controller can run with leadership
      - name: Create multi node Kind cluster
        run: make kind-multi-node

      - name: Run chart-testing (install)
        run: ct install --config=.github/ci/ct.yaml
21 changes: 20 additions & 1 deletion Makefile
@@ -489,4 +489,23 @@ crd-ref-docs:
--source-path=${PWD}/api/v1alpha1/ \
--config=crd-ref-docs.yaml \
--renderer=markdown \
--output-path=${PWD}/docs/book/src/reference/api-spec.md
--output-path=${PWD}/docs/book/src/reference/api-spec.md

# helm

ensure-helm-install:
ifndef HAS_HELM
	curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 && chmod 700 ./get_helm.sh && ./get_helm.sh
endif

lint-chart: ensure-helm-install
	helm lint ./charts/nrr-controller

build-helm:
	helm package ./charts/nrr-controller --dependency-update --destination ./bin/chart

kind-multi-node:
	kind create cluster --name $(KIND_CLUSTER) --config ./config/testing/kind/kind-3node-config.yaml --wait 2m

ct-helm:
	./hack/verify-chart.sh
22 changes: 22 additions & 0 deletions charts/nrr-controller/.helmignore
@@ -0,0 +1,22 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/
15 changes: 15 additions & 0 deletions charts/nrr-controller/Chart.yaml
@@ -0,0 +1,15 @@
apiVersion: v2
name: nrr-controller
description: A Helm chart for the Node Readiness Controller
type: application
version: 0.1.0
appVersion: "v0.3.0"
kubeVersion: ">=1.25.0-0"
home: https://github.com/kubernetes-sigs/node-readiness-controller
sources:
- https://github.com/kubernetes-sigs/node-readiness-controller
keywords:
- kubernetes
- controller
- readiness
- node
105 changes: 105 additions & 0 deletions charts/nrr-controller/README.md
@@ -0,0 +1,105 @@
# Node Readiness Controller for Kubernetes

[Node Readiness Controller](https://github.com/kubernetes-sigs/node-readiness-controller) is a Kubernetes controller that provides fine-grained, declarative readiness for nodes. It ensures nodes only accept workloads when all required components (e.g. network agents, GPU drivers, storage drivers, or custom health checks) are fully ready on the node.

## TL;DR:

```shell
helm repo add node-readiness-controller https://kubernetes-sigs.github.io/node-readiness-controller/
Contributor

Would this work for `helm upgrade`? How do future schema changes reach existing installs?

Author

Documented this in the chart README. Helm installs CRDs from the chart crds/ directory during initial install, but does not upgrade or delete those CRDs during helm upgrade or helm uninstall.

For future schema changes, users need to apply the updated CRD before upgrading to a chart version that depends on it. Moving CRDs into templates/ solves the problem, but the CRD lifecycle becomes more dangerous. Alternatively, we can add a pre-install/pre-upgrade hook Job that runs kubectl apply for CRDs. For this PR I kept Helm’s standard crds/ behavior and documented that schema-changing upgrades require applying the updated CRD first.

Contributor

Documenting the upgrade flow with this PR sounds good. We should also include it in the project (mdbook) documentation. If this will be a follow-up PR, please create an issue / action item for updating the installation instructions so it doesn't get missed.

> CRD lifecycle becomes more dangerous.

Could you clarify a bit more how this is dangerous?

helm install my-release --namespace kube-system node-readiness-controller/nrr-controller
```

## Introduction

This chart bootstraps a [node-readiness-controller](https://github.com/kubernetes-sigs/node-readiness-controller) deployment on a [Kubernetes](http://kubernetes.io) cluster using the [Helm](https://helm.sh) package manager.

## Prerequisites

- Kubernetes 1.25+

## Installing the Chart

To install the chart with the release name `my-release`:

```shell
helm install --namespace kube-system my-release node-readiness-controller/nrr-controller
```

The command deploys the _node-readiness-controller_ on the Kubernetes cluster in the default configuration. The [configuration](#configuration) section lists the parameters that can be configured during installation.

> **Tip**: List all releases across namespaces using `helm list --all-namespaces`

## CRD Upgrades

Helm installs CRDs from the chart `crds/` directory during initial install, but Helm does not upgrade or delete CRDs from that directory during `helm upgrade` or `helm uninstall`. Before upgrading to a chart version that changes the `NodeReadinessRule` schema, apply the updated CRD from the release artifacts or from `charts/nrr-controller/crds`.
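
The ordering described above can be sketched as a small shell helper. This is a minimal sketch, not part of the chart: the function names `run` and `upgrade_with_crds` and the `DRY_RUN` variable are hypothetical, and it assumes you are working from a checkout of the repository so that `charts/nrr-controller/crds` exists locally.

```shell
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical helper: apply the updated CRDs first, then upgrade the release.
# Arguments: <release> <namespace> <crd-dir>
upgrade_with_crds() {
  # Helm skips the crds/ directory on upgrade, so apply the CRDs explicitly.
  run kubectl apply -f "$3" &&
    run helm upgrade "$1" node-readiness-controller/nrr-controller --namespace "$2"
}

# Preview the two steps without touching the cluster:
#   DRY_RUN=1 upgrade_with_crds my-release kube-system charts/nrr-controller/crds
```

Chaining the two commands with `&&` means the release is only upgraded if the CRD apply succeeded, which matches the order this section requires.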

## Uninstalling the Chart

To uninstall/delete the `my-release` deployment:

```shell
helm uninstall my-release --namespace kube-system
```

The command removes the Kubernetes resources associated with the chart and deletes the release. CRDs installed from the chart `crds/` directory are not removed (see [CRD Upgrades](#crd-upgrades)).

## Configuration

The following table lists the configurable parameters of the _node-readiness-controller_ chart and their default values.

| Parameter | Description | Default |
| ---------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------- |
| `image.repository` | Docker repository to use | `registry.k8s.io/node-readiness-controller/node-readiness-controller` |
| `image.tag` | Docker tag to use | `v[chart appVersion]` |
| `image.pullPolicy` | Docker image pull policy | `IfNotPresent` |
| `imagePullSecrets` | Docker repository secrets | `[]` |
| `nameOverride` | String to partially override `nrr-controller.fullname` template (will prepend the release name) | `""` |
| `fullnameOverride` | String to fully override `nrr-controller.fullname` template | `""` |
| `namespaceOverride` | Override the deployment namespace; defaults to .Release.Namespace | `""` |
| `replicaCount` | The replica count for Deployment | `1` |
| `leaderElection.enabled` | Enable leader election to support multiple replicas | `true` |
| `priorityClassName` | The name of the priority class to add to pods | `system-cluster-critical` |
| `rbac.create` | If `true`, create & use RBAC resources | `true` |
| `resources` | Node Readiness Controller container CPU and memory requests/limits | _see values.yaml_ |
| `serviceAccount.create` | If `true`, create a service account | `true` |
| `serviceAccount.name` | The name of the service account to use, if not set and create is true a name is generated using the fullname template | `nil` |
| `serviceAccount.annotations` | Specifies custom annotations for the serviceAccount | `{}` |
| `podAnnotations` | Annotations to add to the node-readiness-controller Pods | `{"kubectl.kubernetes.io/default-container":"manager"}` |
| `podLabels` | Labels to add to the node-readiness-controller Pods | `{}` |
| `commonLabels` | Labels to apply to all resources | `{}` |
| `podSecurityContext` | Security context for pod | _see values.yaml_ |
| `securityContext` | Security context for container | _see values.yaml_ |
| `terminationGracePeriodSeconds` | Time to wait before forcefully terminating the pod | `10` |
| `healthProbeBindAddress` | The bind address for health probes | `:8081` |
| `livenessProbe` | Liveness probe configuration for the node-readiness-controller container | _see values.yaml_ |
| `readinessProbe` | Readiness probe configuration for the node-readiness-controller container | _see values.yaml_ |
| `metrics.secure` | Enable secure metrics endpoint | `false` |
| `metrics.bindAddress` | The bind address for metrics server | `:8443` |
| `metrics.service.port` | The port exposed by the metrics service | `8443` |
| `metrics.service.targetPort` | The target port for the metrics service | `8443` |
| `metrics.certDir` | Directory for metrics server certificates | `/tmp/k8s-metrics-server/metrics-certs` |
| `metrics.certSecretName` | Name of the secret containing metrics server certificates | `metrics-server-cert` |
| `webhook.enabled` | Enable the webhook server | `false` |
| `webhook.port` | The port for the webhook server | `9443` |
| `webhook.service.port` | The port exposed by the webhook service | `443` |
| `webhook.service.targetPort` | The target port for the webhook service | `9443` |
| `webhook.certDir` | Directory for webhook server certificates | `/tmp/k8s-webhook-server/serving-certs` |
| `webhook.certSecretName` | Name of the secret containing webhook server certificates | `webhook-server-certs` |
| `certManager.enabled` | Enable cert-manager integration for automatic TLS certificate generation | `false` |
| `certManager.issuer.create` | Create a cert-manager issuer | `true` |
| `certManager.issuer.name` | Name of the cert-manager issuer | `selfsigned-issuer` |
| `certManager.metricsCertificate.create` | Create a cert-manager certificate for metrics server | `true` |
| `certManager.metricsCertificate.name` | Name of the metrics certificate | `metrics-certs` |
| `certManager.webhookCertificate.create` | Create a cert-manager certificate for webhook server | `true` |
| `certManager.webhookCertificate.name` | Name of the webhook certificate | `serving-cert` |
| `validatingWebhook.enabled` | Enable the validating webhook | `false` |
| `validatingWebhook.name` | Name of the ValidatingWebhookConfiguration resource | `validating-webhook-configuration` |
| `validatingWebhook.webhookName` | Name of the webhook | `vnodereadinessrule.kb.io` |
| `validatingWebhook.failurePolicy` | Failure policy for the webhook | `Fail` |
| `validatingWebhook.sideEffects` | Side effects for the webhook | `None` |
| `validatingWebhook.path` | The path for the webhook | `/validate-readiness-node-x-k8s-io-v1alpha1-nodereadinessrule` |
| `validatingWebhook.admissionReviewVersions` | Admission review versions supported by the webhook | `["v1"]` |
| `nodeSelector` | Node selectors to run the controller on specific nodes | `nil` |
| `tolerations` | Tolerations to run the controller on specific nodes | `nil` |
| `affinity` | Node affinity to run the controller on specific nodes | `nil` |
| `nodeReadinessRules` | Custom NodeReadinessRule resources to create. When validating webhooks are enabled, apply rules after the webhook is ready. | `[]` |
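
As an illustration of overriding these defaults, the snippet below writes a small values file. It is only a sketch: the file name `my-values.yaml` is hypothetical, and the two keys are taken from the table above.

```shell
# Write a hypothetical override file; the keys come from the parameter table.
cat > my-values.yaml <<'EOF'
# Run two controller replicas; per the table, leader election
# is what supports running more than one replica.
replicaCount: 2
leaderElection:
  enabled: true
EOF
```

Pass the file at install time with `--values` (or `-f`), for example `helm install my-release node-readiness-controller/nrr-controller --namespace kube-system --values my-values.yaml`. A single parameter can also be set inline with `--set replicaCount=2`.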