- Red Hat OpenShift Day-2 Operations
- OpenShift Identity Providers
- Node Configurations
- Troubleshooting
- Nested Virtualization
- Replacing the default Ingress Certificate
- OpenShift Web Console Customizations
- Registry Authentication
- Activate Internal Registry
- Quick NFS Storage
- USB Client Passthrough
- Backup and restore OpenShift Cluster
- Pod with external NetworkAccess
- Egress IP
- VirtualMachinePool (VMPool)
- NFS Volume Mount
- OpenShift Cluster Monitoring - Also relevant for Workload Availability Operators
- Enable User Workload Monitoring - Also relevant for Workload Availability Operators
- Virtualization Workload High-Availability
- Load-aware rebalancing using the Kubernetes Descheduler
- User-Workload Monitoring with Grafana
- Expose MetalLB to other than default MachineNetwork
- Ingress Sharding
- OpenShift Virtualization (KubeVirt) Checkups
- Micro-Segmentation VMs using MultiNetworkPolicies
Docs: Configuring an htpasswd identity provider
Step 1: Create an htpasswd file to store the user and password information:
htpasswd -c -B -b users.htpasswd rguske <password>
Add a new user to the file:
htpasswd -bB users.htpasswd rbohne 'r3dh4t1!'
htpasswd -bB users.htpasswd devuser 'r3dh4t1!'
Remove an existing user:
htpasswd -D users.htpasswd <username>
Replacing an updated users.htpasswd file:
oc create secret generic htpass-secret --from-file=htpasswd=users.htpasswd --dry-run=client -o yaml -n openshift-config | oc replace -f -
Step 2: Create a Kubernetes secret:
oc create secret generic htpass-secret-rguske --from-file=htpasswd=<path_to_rguske.htpasswd> -n openshift-config
oc create secret generic htpass-secret-devuser --from-file=htpasswd=<path_to_devuser.htpasswd> -n openshift-config
This can also be done using the OpenShift User Interface:
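The new secret then needs to be referenced in the cluster OAuth configuration. A minimal sketch, assuming the htpass-secret name from the steps above:
oc apply -f - <<EOF
apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
  name: cluster
spec:
  identityProviders:
  - name: htpasswd_provider
    mappingMethod: claim
    type: HTPasswd
    htpasswd:
      fileData:
        name: htpass-secret
EOF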
oc get secret htpass-secret -ojsonpath={.data.htpasswd} -n openshift-config | base64 --decode > users.htpasswd
Docs: Using RBAC to define and apply permissions
Add cluster-wide admin privileges to a user, e.g. rguske:
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: rguske-cluster-admin
subjects:
- kind: User
apiGroup: rbac.authorization.k8s.io
name: rguske
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: cluster-admin
Alternatively via the WebUI:
Configuring an LDAP identity provider
To use the identity provider, you must define an OpenShift Container Platform Secret object that contains the bindPassword field.
oc create secret generic ldap-secret \
--from-literal=bindPassword='r3dh4t1!' \
-n openshift-config
Identity providers use OpenShift Container Platform ConfigMap objects in the openshift-config namespace to contain the certificate authority bundle. These are primarily used to contain certificate bundles needed by the identity provider.
oc create configmap ca-config-map \
--from-file=ca.crt=/path/to/ca \
-n openshift-config
There's also an option to skip the certificate verification:
insecure: true
Via MMC (Microsoft Management Console): Open MMC:
Press Win + R, type mmc, press Enter.
Add the Certificates Snap-in:
In MMC, go to File > Add/Remove Snap-in. Select Certificates, click Add. Choose Computer account, then Local computer, click Finish. Navigate to the Certificate:
Expand Certificates (Local Computer). Look under: Personal > Certificates for most service-related certs. Web Hosting > Certificates for IIS SSL certs. Export the Certificate:
Right-click the certificate > All Tasks > Export. Use the Certificate Export Wizard. Choose Yes, export the private key if needed (e.g., for backup or moving). Choose format: .PFX (with private key), or .CER (public cert only).
Validate the bind user and the appropriate configuration using ldapsearch:
ldapsearch -x -H ldap://jarvisnas.jarvis.lab \
-D "uid=root,cn=users,dc=ldap,dc=jarvis,dc=lab" \
-b "dc=ldap,dc=jarvis,dc=lab" \
-W "(objectClass=*)"Creating the LDAP CR:
The following custom resource (CR) shows the parameters and acceptable values for an LDAP identity provider.
apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
name: cluster
spec:
identityProviders:
- name: ldapidp
mappingMethod: claim
type: LDAP
ldap:
attributes:
id:
- dn
email:
- mail
name:
- cn
preferredUsername:
- uid
bindDN: "sa-ldap-bind"
bindPassword:
name: ldap-bind-password-qrzn9
# ca:
# name: ca-config-map
insecure: true
url: "ldap://w2k19-dc.rguske.coe.muc.redhat.com/DC=rguske,DC=coe,DC=muc,DC=redhat,DC=com?sAMAccountName"Docs: Configuring chrony time service
You can set the time server and related settings used by the chrony time service (chronyd) by modifying the contents of the chrony.conf file.
Create a Butane config including the contents of the chrony.conf file. For example, to configure chrony on worker nodes, create a 99-worker-chrony.bu file.
- Download Butane:
curl -LO https://mirror.openshift.com/pub/openshift-v4/amd64/clients/butane/v0.26.0-1/butane-amd64
tee 99-worker-chrony.bu > /dev/null <<'EOF'
variant: openshift
version: 4.20.0
metadata:
name: 99-worker-chrony-configuration
labels:
machineconfiguration.openshift.io/role: worker
storage:
files:
- path: /etc/chrony.conf
mode: 0644
overwrite: true
contents:
inline: |
pool NTPSERVER iburst
pool NTPSERVER iburst
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
logdir /var/log/chrony
EOF
- Create the YAML manifest:
butane 99-worker-chrony.bu -o 99-worker-chrony.yaml
Apply the config: oc apply -f 99-worker-chrony.yaml
Wait until all nodes have restarted; check the MachineConfigPools for the status.
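To follow the rollout, watching the worker MachineConfigPool is usually enough (a sketch):
oc get machineconfigpool worker -w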
Alternatively to butane:
chronybase64=$(cat << EOF | base64 -w 0
server NTPSERVER iburst
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
logdir /var/log/chrony
EOF
)
oc apply -f - << EOF
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
labels:
machineconfiguration.openshift.io/role: worker
name: 99-worker-chrony
spec:
config:
ignition:
version: 2.2.0
storage:
files:
- contents:
source: data:text/plain;charset=utf-8;base64,${chronybase64}
filesystem: root
mode: 0644
path: /etc/chrony.conf
EOF
Red Hat KB6148012 - How to schedule pod on master node where scheduling is disabled?
oc get scheduler cluster -oyaml
apiVersion: config.openshift.io/v1
kind: Scheduler
metadata:
creationTimestamp: "2025-01-28T15:20:20Z"
generation: 1
name: cluster
resourceVersion: "542"
uid: 59f6fef1-e88a-484a-8e3c-fa38e6e300b3
spec:
mastersSchedulable: false
policy:
name: ""
status: {}
Edit the Scheduler CR and set spec.mastersSchedulable: true.
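Alternatively, patch the CR directly (a sketch of the same change):
oc patch schedulers.config.openshift.io cluster --type merge -p '{"spec":{"mastersSchedulable":true}}'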
oc get nodes
NAME STATUS ROLES AGE VERSION
ocp1-h5ggj-master-0 Ready control-plane,master,worker 2d19h v1.30.6
ocp1-h5ggj-master-1 Ready control-plane,master,worker 2d19h v1.30.6
ocp1-h5ggj-master-2 Ready control-plane,master,worker 2d19h v1.30.6
ocp1-h5ggj-worker-0 Ready worker 2d18h v1.30.6
ocp1-h5ggj-worker-1 Ready worker 2d18h v1.30.6
Creating must-gather with more details for specific components in OCP 4
Data Collection Audit logs:
oc adm must-gather -- /usr/bin/gather_audit_logs
Default must-gather including the audit logs:
oc adm must-gather -- '/usr/bin/gather && /usr/bin/gather_audit_logs'
OCPV:
oc adm must-gather --image-stream=openshift/must-gather --image=registry.redhat.io/container-native-virtualization/cnv-must-gather-rhel[8,9]:[operator_version]
Replace [8,9] based on the OpenShift version: OCP 4.12 uses rhel8, OCP 4.13 and later use rhel9. The [operator_version] tag should be in the format v4.y.z.
Example - 4.17: oc adm must-gather --image-stream=openshift/must-gather --image=registry.redhat.io/container-native-virtualization/cnv-must-gather-rhel9:v4.17.4
oc adm must-gather \
--image-stream=openshift/must-gather \
--image=registry.redhat.io/container-native-virtualization/cnv-must-gather-rhel9:v4.17.4 \
--image=registry.redhat.io/workload-availability/node-healthcheck-must-gather-rhel9:v0.9.0
How to generate a sosreport within nodes without SSH in OCP 4
oc get nodes
NAME STATUS ROLES AGE VERSION
ocp1-h5ggj-master-0 Ready control-plane,master,worker 2d19h v1.30.6
ocp1-h5ggj-master-1 Ready control-plane,master,worker 2d19h v1.30.6
ocp1-h5ggj-master-2 Ready control-plane,master,worker 2d19h v1.30.6
ocp1-h5ggj-worker-0 Ready worker 2d18h v1.30.6
ocp1-h5ggj-worker-1 Ready worker 2d18h v1.30.6
Then, create a debug session with oc debug node/<node_name> (in this example oc debug node/ocp1-h5ggj-master-0). The debug session will spawn a pod using the tools image from the release (which doesn't contain sos):
oc debug node/ocp1-h5ggj-master-0
chroot /host bash
[root@ocp1-h5ggj-master-0 /]# cat /etc/redhat-release
Red Hat Enterprise Linux CoreOS release 4.17
$ toolbox
Trying to pull registry.redhat.io/rhel9/support-tools:latest...
Getting image source signatures
Checking if image destination supports signatures
Copying blob facf1e7dd3e0 done |
Copying blob a0e56de801f5 done |
Copying blob ec465ce79861 done |
Copying blob cbea42b25984 done |
Copying config a627accb68 done |
Writing manifest to image destination
Storing signatures
a627accb682adb407580be0d7d707afbcb90abf2f407a0b0519bacafa15dd409
Spawning a container 'toolbox-root' with image 'registry.redhat.io/rhel9/support-tools'
Detected RUN label in the container image. Using that as the default...
ebf4dd2b82bf8ebeab55291c8ca195b61e13c9fc5d8dfb095f5fdcbcdabae2df
toolbox-root
Container started successfully. To exit, type 'exit'.
sosreport -e openshift -k crio.all=on -k crio.logs=on -k podman.all=on -k podman.logs=on --all-logs
How to set the CPU model to Passthrough in OpenShift Virtualization?
oc create -f - <<EOF
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
annotations:
labels:
app: rhel9-pod-bridge
kubevirt.io/dynamic-credentials-support: "true"
name: rhel9-pod-bridge
spec:
dataVolumeTemplates:
- apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
name: rhel9-pod-bridge
spec:
sourceRef:
kind: DataSource
name: rhel9
namespace: openshift-virtualization-os-images
storage:
accessModes:
- ReadWriteMany
storageClassName: thin-csi
resources:
requests:
storage: 30Gi
running: false
template:
metadata:
annotations:
vm.kubevirt.io/flavor: tiny
vm.kubevirt.io/os: rhel9
vm.kubevirt.io/workload: server
kubevirt.io/allow-pod-bridge-network-live-migration: ""
labels:
kubevirt.io/domain: rhel9-pod-bridge
kubevirt.io/size: tiny
spec:
domain:
cpu:
model: host-passthrough
cores: 1
sockets: 1
threads: 1
devices:
disks:
- disk:
bus: virtio
name: rootdisk
- disk:
bus: virtio
name: cloudinitdisk
interfaces:
- bridge: {}
name: default
machine:
type: pc-q35-rhel9.2.0
memory:
guest: 1.5Gi
networks:
- name: default
pod: {}
terminationGracePeriodSeconds: 180
volumes:
- dataVolume:
name: rhel9-pod-bridge
name: rootdisk
- cloudInitNoCloud:
userData: |-
#cloud-config
user: cloud-user
password: redhat
chpasswd: { expire: False }
name: cloudinitdisk
EOF
Other sources:
OpenShift Virtualization reports no nodes are available, cannot start VMs
Nested virtualization in OpenShift Virtualization
Enable features on vSphere:
Prerequisites:
- You must have a wildcard certificate for the fully qualified .apps subdomain and its corresponding private key. Each should be in a separate PEM format file.
- The private key must be unencrypted. If your key is encrypted, decrypt it before importing it into OpenShift Container Platform.
- The certificate must include the subjectAltName extension showing *.apps...
- The certificate file can contain one or more certificates in a chain. The wildcard certificate must be the first certificate in the file. It can then be followed with any intermediate certificates, and the file should end with the root CA certificate.
- Copy the root CA certificate into an additional PEM format file.
- Verify that all certificates which include -----END CERTIFICATE----- also end with one carriage return after that line.
Create a config map that includes only the root CA certificate used to sign the wildcard certificate:
oc apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
name: user-ca-bundle
namespace: openshift-config
data:
ca-bundle.crt: |
# MyPrivateCA (root.crt)
-----BEGIN CERTIFICATE-----
zzzzz
-----END CERTIFICATE-----
EOF
Update the cluster-wide proxy configuration with the newly created config map:
oc patch proxy/cluster \
--type=merge \
--patch='{"spec":{"trustedCA":{"name":"user-ca-bundle"}}}'
Create a secret that contains the wildcard certificate chain and key:
oc create secret tls ocp1-wildcard-cert \
--cert='/Users/rguske/Downloads/ocp1.rguske/chain.crt' \
--key='/Users/rguske/Downloads/ocp1.rguske/key.key' \
-n openshift-ingress
Update the Ingress Controller configuration with the newly created secret:
// Replace the secret name
oc patch ingresscontroller.operator default \
--type=merge -p \
'{"spec":{"defaultCertificate": {"name": "ocp1-wildcard-cert"}}}' \
-n openshift-ingress-operator
Watch the ClusterOperator (co) for the status update.
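For example (the relevant ClusterOperator is named ingress):
watch oc get clusteroperators ingress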
Docs - Customizing the web console in OpenShift Container Platform
oc create configmap console-custom-logo --from-file /path/to/console-custom-logo.png -n openshift-config
oc create configmap console-custom-logo --from-file '/Users/rguske/Documents/ironman.jpg' -n openshift-config
Edit the web console’s Operator configuration to include customLogoFile and customProductName:
oc edit consoles.operator.openshift.io cluster
apiVersion: operator.openshift.io/v1
kind: Console
metadata:
name: cluster
spec:
customization:
customLogoFile:
key: ironman.jpg
name: console-custom-logo
customProductName: My Console
Once the Operator configuration is updated, it will sync the custom logo config map into the console namespace, mount it to the console pod, and redeploy.
Validate: oc get clusteroperator console
Docs - Customizing the login page
Run the following commands to create templates you can modify:
oc adm create-login-template > login.html
Alternatively, adjust the existing login.html and/or providers.html.
Export the existing login.html and provider.html:
POD=$(oc get pods -n openshift-authentication -o name | head -n 1)
oc exec -n openshift-authentication "$POD" -- cat /var/config/system/secrets/v4-0-config-system-ocp-branding-template/login.html > login.html
oc exec -n openshift-authentication "$POD" -- cat /var/config/system/secrets/v4-0-config-system-ocp-branding-template/providers.html > providers.html
Choose an image which you'd like to use for the replacement and encode the image into base64. Base64 Guru helps.
Replace the base64 value in the login.html. Search for background-image:url(data:image/, pay attention to the file format (png, svg, jpg), adjust it if necessary and replace the base64 value of the image.
Create the secrets:
oc -n openshift-config get secret
NAME TYPE DATA AGE
etcd-client kubernetes.io/tls 2 8d
htpasswd-dm9mt Opaque 1 6d1h
initial-service-account-private-key Opaque 1 8d
pull-secret kubernetes.io/dockerconfigjson 1 8d
webhook-authentication-integrated-oauth Opaque 1 8d
oc create secret generic login-template --from-file=login.html -n openshift-config
oc create secret generic providers-template --from-file=providers.html -n openshift-config
Edit the oauth CR:
oc edit oauths cluster
apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
name: cluster
# ...
spec:
templates:
error:
name: error-template
login:
name: login-template
providerSelection:
name: providers-template
After editing the CR, the pods within the openshift-authentication namespace will be redeployed.
oc -n openshift-authentication get pods -w
NAME READY STATUS RESTARTS AGE
oauth-openshift-8c7859b9f-fwsnl 1/1 Running 0 6m55s
oauth-openshift-8c7859b9f-kp8rw 1/1 Running 0 7m53s
oauth-openshift-8c7859b9f-qw7wl 1/1 Running 0 7m25s
oauth-openshift-8c7859b9f-kp8rw 1/1 Terminating 0 8m42s
oauth-openshift-664fbb9d49-r5bzk 0/1 Pending 0 0s
oauth-openshift-664fbb9d49-r5bzk 0/1 Pending 0 0s
oauth-openshift-8c7859b9f-kp8rw 0/1 Terminating 0 9m8s
oauth-openshift-664fbb9d49-r5bzk 0/1 Pending 0 26s
oauth-openshift-664fbb9d49-r5bzk 0/1 Pending 0 26s
oauth-openshift-664fbb9d49-r5bzk 0/1 ContainerCreating 0 26s
oauth-openshift-8c7859b9f-kp8rw 0/1 Terminating 0 9m8s
oauth-openshift-8c7859b9f-kp8rw 0/1 Terminating 0 9m8s
oauth-openshift-664fbb9d49-r5bzk 0/1 ContainerCreating 0 27s
oauth-openshift-664fbb9d49-r5bzk 0/1 Running 0 27s
oauth-openshift-664fbb9d49-r5bzk 1/1 Running 0 28s
oc create secret docker-registry docker-hub \
--docker-server=docker.io \
--docker-username= \
--docker-password='' \
--docker-email=''
oc secrets link default docker-hub --for=pull
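If builds in the project also need to pull from Docker Hub, the secret can additionally be linked to the builder service account (a sketch):
oc secrets link builder docker-hub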
Docs - Changing the image registry’s management state
To activate the internal image registry, change the Image Registry Operator configuration's managementState from Removed to Managed.
oc patch configs.imageregistry.operator.openshift.io cluster --type merge --patch '{"spec":{"managementState":"Managed"}}'
Default:
oc get configs.imageregistry.operator.openshift.io cluster
NAME AGE
cluster 36d
oc get configs.imageregistry.operator.openshift.io cluster -oyaml | grep managementState
managementState: Removed
oc patch configs.imageregistry.operator.openshift.io cluster --type merge --patch '{"spec":{"managementState":"Managed"}}'
config.imageregistry.operator.openshift.io/cluster patched
oc get configs.imageregistry.operator.openshift.io cluster -oyaml | grep managementState
managementState: Managed
Docs - Image registry storage configuration
Verify that you do not have a registry pod:
oc get pod -n openshift-image-registry -l docker-registry=default
Edit the cluster operator:
oc edit configs.imageregistry.operator.openshift.io
Adjust the storage section accordingly. Leave the claim field blank to allow the automatic creation of an image-registry-storage persistent volume claim (PVC).
[...]
storage:
pvc:
claim:
[...]
Docs - Enable the Image Registry default route with the Custom Resource Definition
In OpenShift Container Platform, the Registry Operator controls the OpenShift image registry feature. The Operator is defined by the configs.imageregistry.operator.openshift.io Custom Resource Definition (CRD).
If you need to automatically enable the Image Registry default route, patch the Image Registry Operator CRD.
oc patch configs.imageregistry.operator.openshift.io/cluster --type merge -p '{"spec":{"defaultRoute":true}}'
oc -n openshift-image-registry get route
NAME HOST/PORT PATH SERVICES PORT TERMINATION WILDCARD
default-route default-route-openshift-image-registry.apps.ocp-mk1.jarvis.lab image-registry <all> reencrypt None
Docs - Exposing a default registry manually
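The $HOST variable used below can be derived from the default route enabled earlier (a sketch):
HOST=$(oc get route default-route -n openshift-image-registry --template='{{ .spec.host }}')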
podman login -u rguske -p $(oc whoami -t) --tls-verify=false $HOST
Login Succeeded!
With Certificate:
oc extract secret/$(oc get ingresscontroller -n openshift-ingress-operator default -o json | jq '.spec.defaultCertificate.name // "router-certs-default"' -r) -n openshift-ingress --confirm
sudo mv tls.crt /etc/pki/ca-trust/source/anchors/
sudo update-ca-trust enable
Create a Secret using the extracted certificates:
oc create secret tls public-route-tls \
-n openshift-image-registry \
--cert=/Users/rguske/Downloads/tls.crt \
--key=/Users/rguske/Downloads/tls.key
Configure the Operator using oc edit configs.imageregistry.operator.openshift.io/cluster
routes:
- name: public-routes
hostname: default-route-openshift-image-registry.apps.ocp-mk1.jarvis.lab
secretName: public-route-tls
It can be handy to have NFS backend storage quickly available for an OpenShift cluster. The following instructions guide you through the installation of an NFS server on a RHEL bastion host.
Install the NFS package and activate the service:
dnf install nfs-utils -y
systemctl enable nfs-server.service
systemctl start nfs-server.service
systemctl status nfs-server.service
Create the directory in which the Persistent Volumes will be stored in:
mkdir /srv/nfs-storage-pv-user-pvs
chmod g+w /srv/nfs-storage-pv-user-pvs
Configure the folder as well as the network CIDR for the systems which are accessing the NFS server:
vi /etc/exports
/srv/nfs-storage-pv-user-pvs 10.198.15.0/24(rw,sync,no_root_squash)
systemctl restart nfs-server
exportfs -arv
exportfs -s
Configure the firewall on the RHEL accordingly:
firewall-cmd --permanent --add-service=nfs
firewall-cmd --permanent --add-service=rpc-bind
firewall-cmd --permanent --add-service=mountd
firewall-cmd --reload
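A quick sanity check of the export, run on the NFS host (a sketch):
showmount -e localhost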
# Add Helm repo
helm repo add csi-driver-nfs https://raw.githubusercontent.com/kubernetes-csi/csi-driver-nfs/master/charts
# List versions
helm search repo -l csi-driver-nfs
Install the NFS provisioner:
helm install csi-driver-nfs csi-driver-nfs/csi-driver-nfs --version 4.11.0 \
--create-namespace \
--namespace csi-driver-nfs \
--set controller.runOnControlPlane=true \
--set controller.replicas=2 \
--set controller.strategyType=RollingUpdate \
--set externalSnapshotter.enabled=true \
--set externalSnapshotter.customResourceDefinitions.enabled=false
For a SNO setup:
helm install csi-driver-nfs csi-driver-nfs/csi-driver-nfs --version 4.11.0 \
--create-namespace \
--namespace csi-driver-nfs \
--set controller.runOnControlPlane=true \
--set controller.strategyType=RollingUpdate \
--set externalSnapshotter.enabled=true \
--set externalSnapshotter.customResourceDefinitions.enabled=false
Grant additional permissions to the ServiceAccounts:
oc adm policy add-scc-to-user privileged -z csi-nfs-node-sa -n csi-driver-nfs
oc adm policy add-scc-to-user privileged -z csi-nfs-controller-sa -n csi-driver-nfs
Create a StorageClass:
oc apply -f https://raw.githubusercontent.com/rguske/openshift-day-two/refs/heads/main/manifests/nfs-storageclass.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: nfs-csi
annotations:
storageclass.kubernetes.io/is-default-class: "true"
provisioner: nfs.csi.k8s.io
parameters:
server: 10.10.42.20 ### NFS server's IP/FQDN
share: /volume1/nfs_ds/ocp ### NFS server's exported directory
subDir: ${pvc.metadata.namespace}-${pvc.metadata.name}-${pv.metadata.name} ### Folder/subdir name template
reclaimPolicy: Delete
volumeBindingMode: Immediate
allowVolumeExpansion: true
Create a SnapshotClass:
oc apply -f https://raw.githubusercontent.com/rguske/openshift-day-two/refs/heads/main/manifests/nfs-volumesnapshotclass.yaml
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
deletionPolicy: Delete
driver: nfs.csi.k8s.io
metadata:
name: csi-nfs-snapclass
Set the StorageClass to default:
oc annotate storageclass/nfs-csi storageclass.kubernetes.io/is-default-class=true
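A quick test PVC against the new nfs-csi StorageClass (a sketch; name, namespace and size are arbitrary):
oc create -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-csi-test
  namespace: default
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  storageClassName: nfs-csi
EOF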
We need an NFS provisioner in order to consume the NFS service. Create the following OpenShift template and make sure to adjust the IP address as well as the path to the NFS folder in the parameters at the end of the file:
Example:
- name: NFS_SERVER
required: true
value: xxx.xxx.xxx.xxx ## IP of the host which runs the NFS server
- name: NFS_PATH
required: true
value: /srv/nfs-storage-pv-user-pvs ## folder which was configured on the NFS server
Create the template:
tee nfs-provisioner-template.yaml > /dev/null <<'EOF'
apiVersion: template.openshift.io/v1
kind: Template
labels:
template: nfs-client-provisioner
message: 'NFS storage class ${STORAGE_CLASS} created.'
metadata:
annotations:
description: nfs-client-provisioner
openshift.io/display-name: nfs-client-provisioner
openshift.io/provider-display-name: Tiger Team
tags: infra,nfs
template.openshift.io/documentation-url: nfs-client-provisioner
template.openshift.io/long-description: nfs-client-provisioner
version: 0.0.1
name: nfs-client-provisioner
objects:
- kind: Namespace
apiVersion: v1
metadata:
name: ${TARGET_NAMESPACE}
- kind: ServiceAccount
apiVersion: v1
metadata:
name: nfs-client-provisioner
namespace: ${TARGET_NAMESPACE}
- kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: nfs-client-provisioner-runner
rules:
- apiGroups: [""]
resources: ["persistentvolumes"]
verbs: ["get", "list", "watch", "create", "delete"]
- apiGroups: [""]
resources: ["persistentvolumeclaims"]
verbs: ["get", "list", "watch", "update"]
- apiGroups: ["storage.k8s.io"]
resources: ["storageclasses"]
verbs: ["get", "list", "watch"]
- apiGroups: [""]
resources: ["events"]
verbs: ["create", "update", "patch"]
- kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: run-nfs-client-provisioner
subjects:
- kind: ServiceAccount
name: nfs-client-provisioner
namespace: ${TARGET_NAMESPACE}
roleRef:
kind: ClusterRole
name: nfs-client-provisioner-runner
apiGroup: rbac.authorization.k8s.io
- kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: nfs-client-provisioner
namespace: ${TARGET_NAMESPACE}
rules:
- apiGroups: [""]
resources: ["endpoints"]
verbs: ["get", "list", "watch", "create", "update", "patch"]
- apiGroups: ["security.openshift.io"]
resourceNames: ["hostmount-anyuid"]
resources: ["securitycontextconstraints"]
verbs: ["use"]
- kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: nfs-client-provisioner
namespace: ${TARGET_NAMESPACE}
subjects:
- kind: ServiceAccount
name: nfs-client-provisioner
roleRef:
kind: Role
name: nfs-client-provisioner
apiGroup: rbac.authorization.k8s.io
- kind: Deployment
apiVersion: apps/v1
metadata:
name: nfs-client-provisioner
namespace: ${TARGET_NAMESPACE}
spec:
replicas: 1
selector:
matchLabels:
app: nfs-client-provisioner
strategy:
type: Recreate
template:
metadata:
labels:
app: nfs-client-provisioner
spec:
serviceAccountName: nfs-client-provisioner
containers:
- name: nfs-client-provisioner
image: ${PROVISIONER_IMAGE}
volumeMounts:
- name: nfs-client-root
mountPath: /persistentvolumes
env:
- name: PROVISIONER_NAME
value: ${PROVISIONER_NAME}
- name: NFS_SERVER
value: ${NFS_SERVER}
- name: NFS_PATH
value: ${NFS_PATH}
volumes:
- name: nfs-client-root
nfs:
server: ${NFS_SERVER}
path: ${NFS_PATH}
- apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: managed-nfs-storage
annotations:
storageclass.kubernetes.io/is-default-class: "true"
provisioner: ${PROVISIONER_NAME}
parameters:
archiveOnDelete: "false"
parameters:
- description: Target namespace where nfs-client-provisioner will run.
displayName: Target namespace
name: TARGET_NAMESPACE
required: true
value: openshift-nfs-provisioner
- name: NFS_SERVER
required: true
value: xxx.xxx.xxx.xxx ## IP of the host which runs the NFS server
- name: NFS_PATH
required: true
value: /srv/nfs-storage-pv-user-pvs ## folder which was configured on the NFS server
- name: PROVISIONER_IMAGE
value: registry.k8s.io/sig-storage/nfs-subdir-external-provisioner:v4.0.2
- name: PROVISIONER_NAME
value: "nfs-client-provisioner"
EOF
Deploy the template: oc process -f nfs-provisioner-template.yaml | oc apply -f -
oc -n test1 create -f - <<EOF
kind: Deployment
apiVersion: apps/v1
metadata:
name: ubi9
spec:
replicas: 1
selector:
matchLabels:
app: ubi9
template:
metadata:
creationTimestamp: null
labels:
app: ubi9
spec:
volumes:
- name: pvc
persistentVolumeClaim:
claimName: pvc
containers:
- name: ubi
image: 'registry.access.redhat.com/ubi9/ubi-micro:latest'
volumeMounts:
- name: pvc
mountPath: /pvc
command:
- /bin/sh
- '-c'
- |
sleep infinity
EOF
Create the first PersistentVolumeClaim either via the OpenShift web console or via oc:
oc -n test1 create -f - <<EOF
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: pvc
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 5Gi
storageClassName: managed-nfs-storage
volumeMode: Filesystem
EOF
Option 1:
Installing the Local Storage Operator
Option 2:
Logical Volume Manager Storage installation
Installation via yaml:
oc apply -f https://raw.githubusercontent.com/rguske/openshift-day-two/refs/heads/main/manifests/lvm-storage-operator.yaml
Via Operator Web Console
Install the Logical Volume Cluster only including the SSD with the by-path identifier:
ls -li /dev/disk/by-path
oc apply -f https://raw.githubusercontent.com/rguske/openshift-day-two/refs/heads/main/manifests/lvmcluster.yaml
Create a test pvc:
oc create -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: lvm-block-1
namespace: default
spec:
accessModes:
- ReadWriteOnce
volumeMode: Block
resources:
requests:
storage: 10Gi
limits:
storage: 20Gi
storageClassName: lvms-vg1
EOF
Not supported in Red Hat OpenShift Virtualization!
From the official docs:
Support for redirection of client's USB device was introduced in release v0.44. This feature is not enabled by default. To enable it, add an empty clientPassthrough under devices, as such:
spec:
domain:
devices:
clientPassthrough: {}
There are two ways of redirecting USB devices: either using the device's vendor and product information or the actual bus and device address information. In Linux, you can gather this info with lsusb; a redacted example below:
Connect a USB device, e.g. an external CD-ROM drive. I've connected it to my MacBook, installed lsusb via brew, and checked for the Vendor ID and Product ID.
lsusb
[...]
Bus 002 Device 001: ID 0e8d:1806 MediaTek Inc. MT1806 Serial: R8RY6GAC60008Y
[...]
Connect to your VM running on OpenShift Virtualization.
virtctl console rguske-rhel9
Successfully connected to rguske-rhel9 console. The escape sequence is ^]
rguske-rhel9 login:
[cloud-user@rguske-rhel9 ~]$ lsusb
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
On your local machine, install virtctl and usbredir. I've installed both using brew.
sudo virtctl usbredir 0e8d:1806 rguske-rhel9
{"component":"portforward","level":"info","msg":"port_arg: '127.0.0.1:49275'","pos":"client.go:166","timestamp":"2025-03-26T10:19:43.292294Z"}
{"component":"portforward","level":"info","msg":"args: '[--device 0e8d:1806 --to 127.0.0.1:49275]'","pos":"client.go:167","timestamp":"2025-03-26T10:19:43.293541Z"}
{"component":"portforward","level":"info","msg":"Executing commandline: 'usbredirect [--device 0e8d:1806 --to 127.0.0.1:49275]'","pos":"client.go:168","timestamp":"2025-03-26T10:19:43.293591Z"}
{"component":"portforward","level":"info","msg":"Connected to usbredirect at 610.549083ms","pos":"client.go:132","timestamp":"2025-03-26T10:19:43.903058Z"}The output will show the redirection to your Virtual Machine.
On your target VM, you'll notice:
[151999.488527] usb 1-1: new high-speed USB device number 9 using xhci_hcd
[152000.279607] usb 1-1: New USB device found, idVendor=0e8d, idProduct=1806, bcdDevice= 0.00
[152000.280126] usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[152000.280490] usb 1-1: Product: MT1806
[152000.280786] usb 1-1: Manufacturer: MediaTek Inc
[152000.281075] usb 1-1: SerialNumber: R8RY6GAC60008Y
[152000.548218] usb-storage 1-1:1.0: USB Mass Storage device detected
[152000.551594] scsi host7: usb-storage 1-1:1.0
[152001.907628] scsi 7:0:0:0: CD-ROM ASUS SDRW-08D3S-U F201 PQ: 0 ANSI: 0
[152002.595801] sr 7:0:0:0: [sr0] scsi3-mmc drive: 24x/24x writer dvd-ram cd/rw xa/form2 cdda tray
[152003.026401] sr 7:0:0:0: Attached scsi generic sg0 type 5
Using lsusb will show the connected device:
[cloud-user@rguske-rhel9 ~]$ lsusb
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 001 Device 009: ID 0e8d:1806 MediaTek Inc. Samsung SE-208 Slim Portable DVD Writer
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
oc debug --as-root node/ocp-mk1.jarvis.lab
To use host binaries, run `chroot /host`. Instead, if you need to access host namespaces, run `nsenter -a -t 1`.
Pod IP: 192.168.42.2
If you don't see a command prompt, try pressing enter.
sh-5.1#
- Change your root directory to /host in the debug shell:
chroot /host
- If proxy is in use:
export HTTP_PROXY=http://<your_proxy.example.com>:8080
export HTTPS_PROXY=https://<your_proxy.example.com>:8080
export NO_PROXY=<example.com>
- Run the cluster-backup.sh script:
The cluster-backup.sh script is maintained as a component of the etcd Cluster Operator and is a wrapper around the etcdctl snapshot save command.
/usr/local/bin/cluster-backup.sh /home/core/assets/backup
Starting pod/ocp-mk1jarvislab-debug-t6x4m ...
To use host binaries, run `chroot /host`. Instead, if you need to access host namespaces, run `nsenter -a -t 1`.
Pod IP: 192.168.42.2
If you don't see a command prompt, try pressing enter.
sh-5.1# chroot /host
sh-5.1# /usr/local/bin/cluster-backup.sh /home/core/assets/backup
Certificate /etc/kubernetes/static-pod-certs/configmaps/etcd-all-bundles/server-ca-bundle.crt is missing. Checking in different directory
Certificate /etc/kubernetes/static-pod-resources/etcd-certs/configmaps/etcd-all-bundles/server-ca-bundle.crt found!
found latest kube-apiserver: /etc/kubernetes/static-pod-resources/kube-apiserver-pod-14
found latest kube-controller-manager: /etc/kubernetes/static-pod-resources/kube-controller-manager-pod-5
found latest kube-scheduler: /etc/kubernetes/static-pod-resources/kube-scheduler-pod-5
found latest etcd: /etc/kubernetes/static-pod-resources/etcd-pod-2
56518b777f31c161916f516b21725a562461218761fbf03224014afd83c3e589
etcdctl version: 3.5.21
API version: 3.5
{"level":"info","ts":"2025-09-15T08:35:18.813919Z","caller":"snapshot/v3_snapshot.go:65","msg":"created temporary db file","path":"/home/core/assets/backup/snapshot_2025-09-15_083517.db.part"}
{"level":"info","ts":"2025-09-15T08:35:18.823486Z","logger":"client","caller":"v3@v3.5.21/maintenance.go:212","msg":"opened snapshot stream; downloading"}
{"level":"info","ts":"2025-09-15T08:35:18.823575Z","caller":"snapshot/v3_snapshot.go:73","msg":"fetching snapshot","endpoint":"https://192.168.42.2:2379"}
{"level":"info","ts":"2025-09-15T08:35:21.367146Z","logger":"client","caller":"v3@v3.5.21/maintenance.go:220","msg":"completed snapshot read; closing"}
{"level":"info","ts":"2025-09-15T08:35:22.483373Z","caller":"snapshot/v3_snapshot.go:88","msg":"fetched snapshot","endpoint":"https://192.168.42.2:2379","size":"289 MB","took":"3 seconds ago"}
{"level":"info","ts":"2025-09-15T08:35:22.484415Z","caller":"snapshot/v3_snapshot.go:97","msg":"saved","path":"/home/core/assets/backup/snapshot_2025-09-15_083517.db"}
Snapshot saved at /home/core/assets/backup/snapshot_2025-09-15_083517.db
{"hash":2597648169,"revision":49148119,"totalKey":15553,"totalSize":288808960}
snapshot db and kube resources are successfully saved to /home/core/assets/backup
- Two files are saved into /home/core/assets/backup:
ls /home/core/assets/backup
snapshot_2025-09-15_083517.db static_kuberesources_2025-09-15_083517.tar.gz
- snapshot_.db: This file is the etcd snapshot. The cluster-backup.sh script confirms its validity.
- static_kuberesources_.tar.gz: This file contains the resources for the static pods. If etcd encryption is enabled, it also contains the encryption keys for the etcd snapshot.
- Example Executed Once:
apiVersion: config.openshift.io/v1
kind: FeatureGate
metadata:
name: cluster
spec:
featureSet: TechPreviewNoUpgrade
apiVersion: operator.openshift.io/v1alpha1
kind: EtcdBackup
metadata:
name: etcd-single-backup
namespace: openshift-etcd
spec:
pvcName: etcd-backup-pvc
- Example Scheduled Executions:
apiVersion: config.openshift.io/v1alpha1
kind: Backup
metadata:
name: etcd-recurring-backup
spec:
etcd:
schedule: "20 4 * * *"
timeZone: "UTC"
pvcName: etcd-backup-pvc
Working example:
apiVersion: v1
kind: Pod
metadata:
name: rhel-support-tools-localnet-50
namespace: default
annotations:
k8s.v1.cni.cncf.io/networks: |
[{
"name": "localnet-50",
"interface": "net1",
"ips": [ "192.168.xxx.xxx/24" ],
"gateway": [ "192.168.xxx.1" ],
"default-route": ["192.168.xxx.1"]
}]
spec:
containers:
- name: rhel-support-tools
image: registry.redhat.io/rhel9/support-tools:9.7
command: ["/bin/bash","-c","sleep infinity"]Node --> Pod (EgressIP) --curl--> external Webserver
- install Podman on your jumphost
sudo dnf install -y podman
- start a simple nginx pod:
podman run -ti --rm -p 8080:8080 quay.io/openshift-examples/simple-http-server:latest
- configure the RHEL firewall:
sudo firewall-cmd --permanent --zone=public --add-port=8080/tcp
- Egress for worker nodes:
oc get nodes -l node-role.kubernetes.io/worker
ocp-mk42-cp1.jarvislab.guske.io
ocp-mk42-cp2.jarvislab.guske.io
- Label the worker nodes:
for node in $(oc get nodes -o jsonpath='{.items[*].metadata.name}'); do echo ${node} ; oc label node/${node} k8s.ovn.org/egress-assignable="" ; done
- Create Egress object:
oc apply -f - <<EOF
apiVersion: k8s.ovn.org/v1
kind: EgressIP
metadata:
name: egress-poc
spec:
egressIPs:
- 192...
namespaceSelector:
matchLabels:
egress: poc
EOF
- rollout a test deployment:
oc new-project poc-egress
oc apply -k git@github.com:openshift-examples/kustomize/components/simple-http-server
oc rsh deployment/simple-http-server
curl -i http://192.168...:8080
- label the namespace:
oc label namespace/poc-egress egress=poc
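- Verify which node currently hosts the egress IP (a sketch):
oc get egressip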
apiVersion: pool.kubevirt.io/v1alpha1
kind: VirtualMachinePool
metadata:
name: vm-pool-cirros
namespace: eventing
spec:
replicas: 0
selector:
matchLabels:
kubevirt.io/vmpool: vm-pool-cirros
virtualMachineTemplate:
metadata:
labels:
kubevirt.io/vmpool: vm-pool-cirros
spec:
template:
metadata:
labels:
kubevirt.io/vmpool: vm-pool-cirros
spec:
domain:
devices:
disks:
- disk:
bus: virtio
name: containerdisk
resources:
requests:
memory: 128Mi
volumes:
- containerDisk:
image: 'docker.io/kubevirt/cirros-container-disk-demo:latest'
name: containerdisk
- create PV
apiVersion: v1
kind: PersistentVolume
metadata:
name: nfs-pv
spec:
storageClassName: "storageClass"
capacity:
storage: 10Gi
accessModes:
- ReadWriteOnce
persistentVolumeReclaimPolicy: Retain
nfs:
path: "/fs/ess/group/openshift_test"
server: "xxx.xxx.xxx.xxx"
readOnly: false
- create pvc
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: nfs-pvc
spec:
storageClassName: "storageClass"
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Gi
volumeName: nfs-pv
- create deployment
apiVersion: apps/v1
kind: Deployment
metadata:
name: nfs-mounter
labels:
app: nfs-mounter
spec:
selector:
matchLabels:
app: nfs-mounter
template:
metadata:
annotations:
k8s.v1.cni.cncf.io/networks: |
[{
"name": "localnet-550",
"namespace": "default",
"interface": "net1",
"ips": ["10.xxx.xxx.xxx/23"],
"gateway": ["10.xxx.xxx.1"],
"default-route": ["10.xxx.xxx.1"],
"dns": {"nameservers": ["xxx.xxx.xxx.xxx"]}
}]
labels:
app: nfs-mounter
spec:
volumes:
- name: nfs-vol
persistentVolumeClaim:
claimName: nfs-pvc
containers:
- name: app
image: registry.redhat.io/rhel9/support-tools:9.7
command: ["/bin/sh", "-c", "sleep infinity"]
volumeMounts:
- mountPath: /mnt/vol1
name: nfs-vol
Source:
Check if Cluster Monitoring exists:
oc -n openshift-monitoring get configmap cluster-monitoring-config
If not, create the ConfigMap
oc create -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
name: cluster-monitoring-config
namespace: openshift-monitoring
data:
config.yaml: |
EOF
The ConfigMap can be adjusted in order to meet various requirements. Example configuration to specify the resources for the components:
oc apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
name: cluster-monitoring-config
namespace: openshift-monitoring
data:
config.yaml: |
alertmanagerMain:
resources:
limits:
cpu: 500m
memory: 1Gi
requests:
cpu: 200m
memory: 500Mi
prometheusK8s:
resources:
limits:
cpu: 500m
memory: 3Gi
requests:
cpu: 200m
memory: 500Mi
thanosQuerier:
resources:
limits:
cpu: 500m
memory: 1Gi
requests:
cpu: 200m
memory: 500Mi
prometheusOperator:
resources:
limits:
cpu: 500m
memory: 1Gi
requests:
cpu: 200m
memory: 500Mi
metricsServer:
resources:
requests:
cpu: 10m
memory: 50Mi
limits:
cpu: 50m
memory: 500Mi
kubeStateMetrics:
resources:
limits:
cpu: 500m
memory: 1Gi
requests:
cpu: 200m
memory: 500Mi
telemeterClient:
resources:
limits:
cpu: 500m
memory: 1Gi
requests:
cpu: 200m
memory: 500Mi
openshiftStateMetrics:
resources:
limits:
cpu: 500m
memory: 1Gi
requests:
cpu: 200m
memory: 500Mi
nodeExporter:
resources:
limits:
cpu: 50m
memory: 150Mi
requests:
cpu: 20m
memory: 50Mi
monitoringPlugin:
resources:
limits:
cpu: 500m
memory: 1Gi
requests:
cpu: 200m
memory: 500Mi
prometheusOperatorAdmissionWebhook:
resources:
limits:
cpu: 50m
memory: 100Mi
requests:
cpu: 20m
memory: 50Mi
EOF
Storage for the ClusterMonitoring:
Important: Do not use a raw block volume, which is described with volumeMode: Block in the PersistentVolume resource. Prometheus cannot use raw block volumes. Prometheus does not support file systems that are not POSIX compliant. For example, some NFS file system implementations are not POSIX compliant. If you want to use an NFS file system for storage, verify with the vendor that their NFS implementation is fully POSIX compliant.
Update the ConfigMap:
apiVersion: v1
kind: ConfigMap
metadata:
name: cluster-monitoring-config
namespace: openshift-monitoring
data:
config.yaml: |
prometheusK8s:
retention: 96h
retentionSize: 180GB
volumeClaimTemplate:
spec:
storageClassName: kubevirt-odf-replica-two-file
resources:
requests:
storage: 40Gi
The StatefulSet is updated and the pods are restarted:
time=2026-02-23T12:22:06.277Z level=INFO source=main.go:1540 msg="Completed loading of configuration file" db_storage=21.822µs remote_storage=4.436µs web_handler=1.538µs query_engine=2.722µs scrape=1.012336ms scrape_sd=97.452214ms notify=687.797µs notify_sd=723.528µs rules=2.504338141s tracing=20.569µs filename=/etc/prometheus/config_out/prometheus.env.yaml totalDuration=2.778510441s
time=2026-02-23T12:22:06.277Z level=INFO source=main.go:1276 msg="Server is ready to receive web requests."
time=2026-02-23T12:22:06.278Z level=INFO source=manager.go:176 msg="Starting rule manager..." component="rule manager"
time=2026-02-23T12:22:07.787Z level=INFO source=main.go:1500 msg="Loading configuration file" filename=/etc/prometheus/config_out/prometheus.env.yaml
time=2026-02-23T12:22:08.591Z level=INFO source=main.go:1540 msg="Completed loading of configuration file" db_storage=4.525µs remote_storage=4.974µs web_handler=1.836µs query_engine=2.785µs scrape=181.497µs scrape_sd=19.437441ms notify=875.227µs notify_sd=22.636µs rules=734.864806ms tracing=16.288µs filename=/etc/prometheus/config_out/prometheus.env.yaml totalDuration=804.119561ms
Docs: Resizing a persistent volume
Source:
Edit the cluster-monitoring-config ConfigMap object:
oc -n openshift-monitoring edit configmap cluster-monitoring-config
Adjust the cm with the following:
apiVersion: v1
kind: ConfigMap
metadata:
name: cluster-monitoring-config
namespace: openshift-monitoring
data:
config.yaml: |
enableUserWorkload: true
The thanos-querier pods will be restarted.
Verification:
oc -n openshift-user-workload-monitoring get pod
NAME READY STATUS RESTARTS AGE
prometheus-operator-577c9d7bcb-hfpv5 2/2 Running 0 93s
prometheus-user-workload-0 6/6 Running 0 90s
prometheus-user-workload-1 6/6 Running 0 90s
thanos-ruler-user-workload-0 4/4 Running 0 90s
thanos-ruler-user-workload-1 4/4 Running 0 90s
Granting users permission to configure monitoring for user-defined projects
How to create a Token for the Metric ServiceMonitor:
TOKEN="$(oc -n openshift-user-workload-monitoring create token prometheus-user-workload)"
oc -n openshift-workload-availability create secret generic prometheus-user-workload-token \
--from-literal=token="$TOKEN"
Obtain the secret:
oc -n openshift-workload-availability get secret prometheus-user-workload-token -o jsonpath='{.data.token}' | base64 -d
Create the ServiceMonitor:
oc create -f - <<EOF
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: node-healthcheck-metrics-monitor
namespace: openshift-workload-availability
labels:
app.kubernetes.io/component: controller-manager
spec:
endpoints:
- interval: 30s
port: https
scheme: https
authorization:
type: Bearer
credentials:
name: prometheus-user-workload-token
key: token
tlsConfig:
ca:
configMap:
name: node-healthcheck-ca-bundle # ConfigMap name was wrong in our documentation
key: service-ca.crt
serverName: node-healthcheck-controller-manager-metrics-service.openshift-workload-availability.svc
selector:
matchLabels:
app.kubernetes.io/component: controller-manager
app.kubernetes.io/name: node-healthcheck-operator
app.kubernetes.io/instance: metrics
EOF
To confirm that the configuration is successful, the Observe > Targets tab in the OCP web UI shows the endpoint as Up.
The following are example metrics from the various workload availability operators.
The metrics include information on the following indicators:
- Operator availability: Showing if and when each Operator is up and running.
- Node remediation count: Showing the number of remediations across the same node, and across all nodes.
- Node remediation duration: Showing the remediation downtime or recovery time.
- Node remediation gauge: Showing the number of ongoing remediations.
Sources:
- OpenShift Virtualization - Fencing and VM High Availability Guide
- Node Health Check
- Docs: Chapter 2. Using Self Node Remediation
- OpenShift Examples
- mdeik8s on GitHub
Note:
- The Self Node Remediation Operator creates the CR by default in the deployment namespace.
- The name for the CR must be self-node-remediation-config.
- You can only have one SelfNodeRemediationConfig CR.
- Deleting the SelfNodeRemediationConfig CR disables Self Node Remediation.
- You can edit the self-node-remediation-config CR that is created by the Self Node Remediation Operator.
Create the namespace:
oc create -f - <<EOF
apiVersion: v1
kind: Namespace
metadata:
name: openshift-workload-availability
EOF
Define the OperatorGroup:
oc create -f - <<EOF
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
name: workload-availability-operator-group
namespace: openshift-workload-availability
EOF
Define the Subscription for Self Node Remediation:
oc create -f - <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
name: self-node-remediation-operator
namespace: openshift-workload-availability
spec:
channel: stable
installPlanApproval: Manual
name: self-node-remediation-operator
source: redhat-operators
sourceNamespace: openshift-marketplace
package: self-node-remediation
EOF
Validate the new resources:
oc get deployment -n openshift-workload-availability
NAME READY UP-TO-DATE AVAILABLE AGE
node-healthcheck-controller-manager 2/2 2 2 2d17h
node-healthcheck-node-remediation-console-plugin 1/1 1 1 2d17h
self-node-remediation-controller-manager 2/2 2 2 3m6s
oc get selfnoderemediationtemplate -n openshift-workload-availability
NAME AGE
self-node-remediation-automatic-strategy-template 65s
oc get csv -n openshift-workload-availability
NAME DISPLAY VERSION REPLACES PHASE
node-healthcheck-operator.v0.10.1 Node Health Check Operator 0.10.1 node-healthcheck-operator.v0.10.0 Succeeded
self-node-remediation.v0.11.0 Self Node Remediation Operator 0.11.0 self-node-remediation.v0.10.2 Succeeded
oc get selfnoderemediationconfig -n openshift-workload-availability
NAME AGE
self-node-remediation-config 2m55s
oc get SelfNodeRemediationConfig self-node-remediation-config -oyaml
apiVersion: self-node-remediation.medik8s.io/v1alpha1
kind: SelfNodeRemediationConfig
metadata:
name: self-node-remediation-config
namespace: openshift-workload-availability
spec:
apiCheckInterval: 15s
apiServerTimeout: 5s
hostPort: 30001
isSoftwareRebootEnabled: true
maxApiErrorThreshold: 3
minPeersForRemediation: 1
peerApiServerTimeout: 5s
peerDialTimeout: 5s
peerRequestTimeout: 7s
peerUpdateInterval: 15m
watchdogFilePath: /dev/watchdog
oc get daemonset -n openshift-workload-availability
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
self-node-remediation-ds 6 6 6 6 6 <none> 2m50s
Start adjusting the SelfNodeRemediationTemplate. Default is:
oc get selfnoderemediationtemplate -n openshift-workload-availability -oyaml
apiVersion: v1
items:
- apiVersion: self-node-remediation.medik8s.io/v1alpha1
kind: SelfNodeRemediationTemplate
metadata:
annotations:
remediation.medik8s.io/multiple-templates-support: "true"
creationTimestamp: "2026-02-23T13:58:44Z"
generation: 1
labels:
remediation.medik8s.io/default-template: "true"
name: self-node-remediation-automatic-strategy-template
namespace: openshift-workload-availability
resourceVersion: "5244533"
uid: a2672726-7692-454c-8ee9-43d33b1ac473
spec:
template:
spec:
remediationStrategy: Automatic
kind: List
metadata:
resourceVersion: ""I'll change it from Automatic to OutOfServiceTaint.
apiVersion: self-node-remediation.medik8s.io/v1alpha1
kind: SelfNodeRemediationTemplate
metadata:
annotations:
remediation.medik8s.io/multiple-templates-support: "true"
labels:
remediation.medik8s.io/default-template: "true"
name: self-node-remediation-automatic-strategy-template
namespace: openshift-workload-availability
spec:
template:
spec:
remediationStrategy: OutOfServiceTaint
| Strategy | Description |
|---|---|
| Automatic | This remediation strategy simplifies the remediation process by letting the Self Node Remediation Operator decide on the most suitable remediation strategy for the cluster. This strategy checks if the OutOfServiceTaint strategy is available on the cluster. If the OutOfServiceTaint strategy is available, the Operator selects the OutOfServiceTaint strategy. If the OutOfServiceTaint strategy is not available, the Operator selects the ResourceDeletion strategy. Automatic is the default remediation strategy. |
| OutOfServiceTaint | This remediation strategy implicitly causes the removal of the pods and associated volume attachments on the node, rather than the removal of the node object. It achieves this by placing the OutOfServiceTaint strategy on the node. This strategy has been supported on technology preview since OpenShift Container Platform version 4.13, and on general availability since OpenShift Container Platform version 4.15. |
The Node Health Check Operator detects the health of the nodes in a cluster. The NodeHealthCheck controller creates the NodeHealthCheck custom resource (CR), which defines a set of criteria and thresholds to determine the health of a node.
Important:
Note:
During the upgrade process, nodes in the cluster might become temporarily unavailable and get identified as unhealthy. In the case of worker nodes, when the Operator detects that the cluster is upgrading, it stops remediating new unhealthy nodes to prevent such nodes from rebooting.
Install the Operator via CLI:
oc create -f - <<EOF
apiVersion: v1
kind: Namespace
metadata:
name: openshift-workload-availability
EOF
oc create -f - <<EOF
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
name: workload-availability-operator-group
namespace: openshift-workload-availability
EOF
oc create -f - <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
name: node-health-check-operator
namespace: openshift-workload-availability
spec:
channel: stable
installPlanApproval: Manual
name: node-health-check-operator
source: redhat-operators
sourceNamespace: openshift-marketplace
package: node-health-check-operator
EOF
Create a NHC CR:
oc create -f - <<EOF
apiVersion: remediation.medik8s.io/v1alpha1
kind: NodeHealthCheck
metadata:
name: worker-availability
spec:
minHealthy: 51%
remediationTemplate:
apiVersion: self-node-remediation.medik8s.io/v1alpha1
kind: SelfNodeRemediationTemplate
name: self-node-remediation-automatic-strategy-template
namespace: openshift-workload-availability
selector:
matchExpressions:
- key: node-role.kubernetes.io/worker
operator: Exists
values: []
unhealthyConditions:
- duration: 2s
status: 'False'
type: Ready
- duration: 2s
status: Unknown
type: Ready
EOF
You can benefit from descheduling running pods in situations such as the following:
- Nodes are underutilized or overutilized.
- Pod and node affinity requirements, such as taints or labels, have changed and the original scheduling decisions are no longer appropriate for certain nodes.
- Node failure requires pods to be moved.
- New nodes are added to clusters.
- Pods have been restarted too many times.
The KubeDescheduler can be installed via OperatorHub or via the appropriate manifest files:
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
name: openshift-kube-descheduler-operator
namespace: openshift-kube-descheduler-operator
spec:
targetNamespaces:
- openshift-kube-descheduler-operator
upgradeStrategy: Default
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
labels:
operators.coreos.com/cluster-kube-descheduler-operator.openshift-kube-descheduler-op: ""
name: cluster-kube-descheduler-operator
namespace: openshift-kube-descheduler-operator
spec:
channel: stable
installPlanApproval: Automatic
name: cluster-kube-descheduler-operator
source: redhat-operators
sourceNamespace: openshift-marketplace
The following configuration will evict long-running pods and balance resource usage between nodes.
See further profile specific info here: LifecycleAndUtilization
apiVersion: operator.openshift.io/v1
kind: KubeDescheduler
metadata:
name: cluster
namespace: openshift-kube-descheduler-operator
spec:
logLevel: Normal
mode: Automatic
operatorLogLevel: Normal
deschedulingIntervalSeconds: 3600
profileCustomizations:
devActualUtilizationProfile: PrometheusCPUCombined
devDeviationThresholds: AsymmetricLow
devEnableSoftTainter: true
profiles:
- LifecycleAndUtilization
- EvictPodsWithPVC
- EvictPodsWithLocalStorage
managementState: Managed
An example for OpenShift Virtualization using the KubeVirtRelieveAndMigrate profile:
apiVersion: operator.openshift.io/v1
kind: KubeDescheduler
metadata:
name: cluster
namespace: openshift-kube-descheduler-operator
spec:
managementState: Managed
deschedulingIntervalSeconds: 30
mode: "Automatic"
profiles:
- KubeVirtRelieveAndMigrate
profileCustomizations:
devEnableSoftTainter: true
devDeviationThresholds: AsymmetricLow
devActualUtilizationProfile: PrometheusCPUCombined
The KubeVirtRelieveAndMigrate profile requires PSI metrics to be enabled on all worker nodes. You can enable this by applying the following MachineConfig custom resource (CR):
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
labels:
machineconfiguration.openshift.io/role: worker
name: 99-openshift-machineconfig-worker-psi-karg
spec:
kernelArguments:
- psi=1
- Create a VirtualMachinePool in order to schedule multiple VMs at once:
oc create -f - <<EOF
apiVersion: pool.kubevirt.io/v1alpha1
kind: VirtualMachinePool
metadata:
name: vm-pool-cirros
spec:
replicas: 3
selector:
matchLabels:
kubevirt.io/vmpool: vm-pool-cirros
virtualMachineTemplate:
metadata:
creationTimestamp: null
labels:
kubevirt.io/vmpool: vm-pool-cirros
spec:
runStrategy: Always
template:
metadata:
creationTimestamp: null
labels:
kubevirt.io/vmpool: vm-pool-cirros
spec:
domain:
devices:
disks:
- disk:
bus: virtio
name: containerdisk
resources:
requests:
memory: 128Mi
terminationGracePeriodSeconds: 0
volumes:
- containerDisk:
image: docker.io/kubevirt/cirros-container-disk-demo:latest
name: containerdisk
EOF
- Use the following metrics query in order to show VM distribution:
count by (node) (kubevirt_vmi_info{name=~".*cirros.*", phase="running"})
- This metric shows succeeded VM migration:
count(kubevirt_vmi_migration_succeeded)
- install the Grafana Community Operator in the openshift-user-workload-monitoring namespace
oc project openshift-user-workload-monitoring
oc create sa grafana-sa
oc adm policy add-cluster-role-to-user cluster-monitoring-view -z grafana-sa
CREATETOKEN="$(oc -n openshift-user-workload-monitoring create token grafana-sa --duration=8760h)"
oc -n openshift-workload-availability create secret generic prometheus-user-workload-token \
--from-literal=token="$CREATETOKEN"
GETTOKEN="$(oc -n openshift-workload-availability get secret grafana-sa-token -o jsonpath='{.data.token}' | base64 -d)"
TOKEN="$(oc create token grafana-sa --duration=8760h)"
echo "$TOKEN"
oc create -f - <<EOF
kind: Secret
apiVersion: v1
metadata:
name: credentials
namespace: openshift-user-workload-monitoring
stringData:
GF_SECURITY_ADMIN_PASSWORD: grafana
GF_SECURITY_ADMIN_USER: root
PROMETHEUS_TOKEN: '${TOKEN}'
type: Opaque
EOF
oc create -f - <<EOF
apiVersion: grafana.integreatly.org/v1beta1
kind: Grafana
metadata:
name: grafana
labels:
dashboards: "grafana"
folders: "grafana"
spec:
deployment:
spec:
template:
spec:
containers:
- name: grafana
env:
- name: GF_SECURITY_ADMIN_USER
valueFrom:
secretKeyRef:
key: GF_SECURITY_ADMIN_USER
name: credentials
- name: GF_SECURITY_ADMIN_PASSWORD
valueFrom:
secretKeyRef:
key: GF_SECURITY_ADMIN_PASSWORD
name: credentials
config:
auth:
disable_login_form: "false"
disable_signout_menu: "true"
auth.anonymous:
enabled: "false"
log:
level: warn
mode: console
EOF
oc -n openshift-user-workload-monitoring get pods -l app=grafana
- Expose the grafana-service via an OpenShift Route:
oc -n openshift-user-workload-monitoring create route edge grafana --service=grafana-service --insecure-policy=Redirect
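- Get the Grafana route hostname to open it in a browser (a sketch):
oc -n openshift-user-workload-monitoring get route grafana -o jsonpath='{.spec.host}'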
- create our Grafana Datasource, which will connect to thanos-querier in the openshift-monitoring project and will use the grafana-sa service account token that is stored in the credentials secret:
oc create -f - <<EOF
apiVersion: grafana.integreatly.org/v1beta1
kind: GrafanaDatasource
metadata:
name: grafana-ds
namespace: openshift-user-workload-monitoring
spec:
valuesFrom:
- targetPath: "secureJsonData.httpHeaderValue1"
valueFrom:
secretKeyRef:
name: "credentials"
key: "PROMETHEUS_TOKEN"
instanceSelector:
matchLabels:
dashboards: "grafana"
datasource:
name: Prometheus
type: prometheus
access: proxy
url: https://thanos-querier.openshift-monitoring.svc:9091
isDefault: true
jsonData:
"tlsSkipVerify": true
"timeInterval": "5s"
httpHeaderName1: 'Authorization'
secureJsonData:
"httpHeaderValue1": "Bearer \${PROMETHEUS_TOKEN}"
editable: true
EOF

oc -n openshift-user-workload-monitoring get GrafanaDatasource
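To rule out token problems independently of Grafana, you can query the Thanos Querier API directly with the same token; a sketch, assuming the default thanos-querier route in openshift-monitoring exists in your cluster:

THANOS="$(oc -n openshift-monitoring get route thanos-querier -o jsonpath='{.spec.host}')"
curl -sk -H "Authorization: Bearer $TOKEN" "https://$THANOS/api/v1/query?query=up"

A working token returns a response containing "status":"success".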
- Create a Grafana dashboard, which fetches its JSON definition externally from GitHub:
oc create -f - <<EOF
apiVersion: grafana.integreatly.org/v1beta1
kind: GrafanaDashboard
metadata:
name: grafana-dashboard-ocp-v
labels:
app: grafana
spec:
instanceSelector:
matchLabels:
dashboards: grafana
folder: "Openshift Virtualization"
url: https://raw.githubusercontent.com/leoaaraujo/articles/master/openshift-virtualization-monitoring/files/ocp-v-dashboard.json
EOF

- Create an additional Grafana dashboard object:
oc create -f - <<EOF
apiVersion: grafana.integreatly.org/v1beta1
kind: GrafanaDashboard
metadata:
name: grafana-dashboard-ocp-v-lab
labels:
app: grafana
spec:
instanceSelector:
matchLabels:
dashboards: grafana
folder: "Openshift Virtualization"
url: https://raw.githubusercontent.com/openshift-virtualization/descheduler-psi-evaluation/refs/heads/main/monitoring/json/load_aware_rebalancing.json
EOF

Client → LoadBalancer IP
- Client sends a TCP SYN to the LoadBalancer IP <-- this IP is owned by MetalLB
MetalLB has:
- assigned the IP from its pool
- advertised it on your L2 network (ARP)
L2 delivery to the node (via your VLAN setup):
- Client resolves LB IP → MAC via ARP
- Switch forwards frame → arrives on:
- eno2 (tagged VLAN, e.g. VLAN 50)
Frame enters OVS:
- eno2 → br-data
- OVS forwards it to the correct internal port
MetalLB node receives the packet
- L2 mode
- One node “owns” the IP
- Packet is delivered locally to that node’s network stack
Kubernetes Service handling (kube-proxy / OVN)
Now the packet hits:
- LoadBalancer IP → Service
- externalTrafficPolicy: Cluster (default)
- Node receives packet
- Service load-balancing kicks in:
- kube-proxy (iptables/IPVS) or
- OVN load balancer
Packet is forwarded to a backend pod:
- Node → Pod (possibly on another node)
- Source IP is SNATed
Pod receives packet
- TCP SYN arrives at the container
- Application responds with SYN-ACK
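Because the Service defaults to externalTrafficPolicy: Cluster, the client source IP is SNATed before it reaches the pod. If preserving the client IP matters, a sketch of the alternative (note that with Local only nodes actually hosting a backend pod answer for the IP):

oc patch svc <loadbalancer-service> -p '{"spec":{"externalTrafficPolicy":"Local"}}'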
NNCP with `ovs-interface`(s) configured for MetalLB to communicate with external networks:
(VLAN 50,51 tagged)
│
eno2
│
┌───────────┐
│ br-data │ (OVS)
└───────────┘
│ │
VLAN 50 │ │ VLAN 51
│ │
ovs-vlan50 ovs-vlan51
│ │
192.168.50.240 192.168.51.240
oc create -f - <<EOF
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
name: br-data-ocp-mk42-cp1
spec:
nodeSelector:
kubernetes.io/hostname: "ocp-mk42-cp1.jarvislab.guske.io"
desiredState:
interfaces:
- name: eno2
type: ethernet
state: up
ipv4:
enabled: false
ipv6:
enabled: false
- name: br-data
type: ovs-bridge
state: up
bridge:
allow-extra-patch-ports: true
options:
stp: false
port:
# trunk uplink
- name: eno2
# access port for VLAN 50
- name: ovs-vlan50
vlan:
mode: access
tag: 50
# access port for VLAN 51 (optional)
- name: ovs-vlan51
vlan:
mode: access
tag: 51
- name: ovs-vlan50
type: ovs-interface
state: up
ipv4:
enabled: true
dhcp: false
address:
- ip: 192.168.50.240
prefix-length: 24
- name: ovs-vlan51
type: ovs-interface
state: up
ipv4:
enabled: true
dhcp: false
address:
- ip: 192.168.51.240
prefix-length: 24
ovn:
bridge-mappings:
- bridge: br-data
localnet: physnet-data
state: present
---
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
name: br-data-ocp-mk42-cp2
spec:
nodeSelector:
kubernetes.io/hostname: "ocp-mk42-cp2.jarvislab.guske.io"
desiredState:
interfaces:
- name: eno2
type: ethernet
state: up
ipv4:
enabled: false
ipv6:
enabled: false
- name: br-data
type: ovs-bridge
state: up
bridge:
allow-extra-patch-ports: true
options:
stp: false
port:
# trunk uplink
- name: eno2
# access port for VLAN 50
- name: ovs-vlan50
vlan:
mode: access
tag: 50
# access port for VLAN 51 (optional)
- name: ovs-vlan51
vlan:
mode: access
tag: 51
- name: ovs-vlan50
type: ovs-interface
state: up
ipv4:
enabled: true
dhcp: false
address:
- ip: 192.168.50.241
prefix-length: 24
- name: ovs-vlan51
type: ovs-interface
state: up
ipv4:
enabled: true
dhcp: false
address:
- ip: 192.168.51.241
prefix-length: 24
ovn:
bridge-mappings:
- bridge: br-data
localnet: physnet-data
state: present
EOF

- You'll see the new `ovs-interface` ports on the node:
oc debug node/ocp-mk42-cp1.jarvislab.guske.io
ovs-vsctl show
sh-5.1# ovs-vsctl show
57e01f0c-0626-4fa4-9467-b6092aba4dd2
Bridge br-data
Port eno2
Interface eno2
type: system
Port ovs-vlan51
tag: 51
Interface ovs-vlan51
type: internal
Port ovs-vlan50
tag: 50
Interface ovs-vlan50
type: internal
[...]
- These interfaces also show up as regular links on the node
- Their status is reported as UNKNOWN because a virtual interface does not provide a carrier signal
oc debug node/ocp-mk42-cp1.jarvislab.guske.io
sh-5.1# ip -br a
lo UNKNOWN 127.0.0.1/8 ::1/128
eno1 UP
eno2 UP
eno3 DOWN
eno4 UP
ovs-system DOWN
ovn-k8s-mp0 UNKNOWN 10.129.0.2/23 fe80::858:aff:fe81:2/64
br-int DOWN
genev_sys_6081 UNKNOWN fe80::a451:c9ff:fe0d:ccf5/64
eth0.42@eno1 UP
br-ex UNKNOWN 192.168....
[...]
ovs-vlan50 UNKNOWN 192.168.50.240/24
ovs-vlan51 UNKNOWN 192.168.51.240/24
- Enable `routingViaHost: true` as well as `ipForwarding: Global`:
oc patch network.operator cluster -p '{"spec":{"defaultNetwork":{"ovnKubernetesConfig":{"gatewayConfig": {"routingViaHost": true} }}}}' --type=merge
oc patch network.operator cluster -p '{"spec":{"defaultNetwork":{"ovnKubernetesConfig":{"gatewayConfig":{"ipForwarding": "Global"}}}}}' --type=merge
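To confirm both settings landed, you can read the gateway config back (sketch):

oc get network.operator cluster -o jsonpath='{.spec.defaultNetwork.ovnKubernetesConfig.gatewayConfig}'

The output should contain "routingViaHost":true and "ipForwarding":"Global".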
- Create the IpAddressPool:
oc create -f - <<EOF
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
name: vlan51-ipaddresspool
namespace: metallb-system
spec:
addresses:
- 192.168.51.201-192.168.51.205
autoAssign: true
avoidBuggyIPs: true
serviceAllocation:
namespaces:
- test-a
priority: 50
serviceSelectors:
- matchExpressions:
- key: l2listener-vlan
operator: In
values:
- "51"
EOF

- Create the `L2Advertisement`:
oc create -f - <<EOF
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
name: l2-adv-vlan51
namespace: metallb-system
spec:
ipAddressPools:
- vlan51-ipaddresspool
EOF

- Example Deployment:
oc create -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
name: simple-web-app
spec:
replicas: 1
selector:
matchLabels:
app: simple-web-app
template:
metadata:
labels:
app: simple-web-app
spec:
containers:
- name: nginx
image: quay.io/rguske/simple-web-app:v1
ports:
- containerPort: 8080
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 200m
memory: 256Mi
EOF

- Expose the Deployment as a LoadBalancer Service and make sure to set the label that matches the IPAddressPool's serviceSelector:
oc expose deployment simple-web-app --type=LoadBalancer --name=simple-web-app-vlan51 --port=80 --target-port=8080 --labels=l2listener-vlan=51
- Validate the communication
oc get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
simple-web-app-vlan51 LoadBalancer 172.30.155.190 192.168.51.201 80:31145/TCP 2d20h
- Check HTTP connectivity from a client on the same network (VLAN 51 in my case):
curl -kLi 192.168.51.201
HTTP/1.1 200 OK
Server: Werkzeug/3.1.3 Python/3.12.9
Date: Mon, 13 Apr 2026 07:41:39 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 821
Connection: close
[...]
- ICMP echo requests (from a ping against the LoadBalancer IP) can also be seen on the node:
sh-5.1# tcpdump -i eno2 -nnn -v
dropped privs to tcpdump
tcpdump: listening on eno2, link-type EN10MB (Ethernet), snapshot length 262144 bytes
07:48:49.298849 STP 802.1w, Rapid STP, Flags [Learn, Forward, Agreement], bridge-id a000.f4:e2:c6:67:dd:c1.8003, length 36
message-age 1.00s, max-age 20.00s, hello-time 2.00s, forwarding-delay 15.00s
root-id 8000.d8:b3:70:76:86:19, root-pathcost 20000, port-role Designated
07:48:49.557845 IP (tos 0x0, ttl 64, id 40441, offset 0, flags [DF], proto ICMP (1), length 84)
192.168.51.192 > 192.168.51.201: ICMP echo request, id 23, seq 4, length 64
07:48:49.557989 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.51.201 tell 192.168.51.240, length 28
07:48:50.581908 IP (tos 0x0, ttl 64, id 41227, offset 0, flags [DF], proto ICMP (1), length 84)
192.168.51.192 > 192.168.51.201: ICMP echo request, id 23, seq 5, length 64
07:48:50.581998 IP (tos 0xc0, ttl 64, id 48200, offset 0, flags [none], proto ICMP (1), length 112)
192.168.51.240 > 192.168.51.192: ICMP redirect 192.168.51.201 to host 192.168.51.201, length 92
IP (tos 0x0, ttl 63, id 41227, offset 0, flags [DF], proto ICMP (1), length 84)
192.168.51.192 > 192.168.51.201: ICMP echo request, id 23, seq 5, length 64
07:48:50.616670 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.51.201 tell 192.168.51.240, length 28
07:48:51.298868 STP 802.1w, Rapid STP, Flags [Learn, Forward, Agreement], bridge-id a000.f4:e2:c6:67:dd:c1.8003, length 36
message-age 1.00s, max-age 20.00s, hello-time 2.00s, forwarding-delay 15.00s
root-id 8000.d8:b3:70:76:86:19, root-pathcost 20000, port-role Designated
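Since MetalLB runs in L2 mode, exactly one node answers ARP for the LoadBalancer IP at any time. From a client on the same VLAN you can check which MAC (and therefore which node) currently owns it; a sketch assuming a Linux client:

ping -c 1 192.168.51.201 >/dev/null
ip neigh | grep 192.168.51.201

Compare the reported MAC address with the nodes' eno2 / ovs-vlan51 interfaces to identify the announcing node.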
Common Use Cases
- Internal vs external traffic separation
- Blue/green or canary router setups
- Dedicated routers for high-security apps
- Different TLS / wildcard domains
Ingress Controller sharding is useful when balancing incoming traffic load among a set of Ingress Controllers and when isolating traffic to a specific Ingress Controller. For example, company A goes to one Ingress Controller and company B to another.
Important: You must keep all of OpenShift Container Platform’s administration routes on the same Ingress Controller. Therefore, avoid adding additional selectors to the default Ingress Controller that exclude these essential routes.
- create a new project:
oc new-project ingress-sharding-no-lb
endpointPublishingStrategy:
type: HostNetwork

This forces each router pod to bind host ports 80 and 443 directly on the node.
The Kubernetes scheduler enforces that only one pod per node can bind a given host port. With HostNetwork, you therefore cannot run multiple routers on the same node unless they use different ports (which OpenShift routers do not support).
I'd suggest going with either NodePortService or, preferably, LoadBalancerService!
The goal is for the new Ingress Controller (the sharded ingress) to use a specific interface. Therefore, I'm reusing the NNCP configuration from the MetalLB section above.
Keep the topic of asymmetric routing in mind when traffic enters through one interface and leaves through another.
- Create the `IPAddressPool`:
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
name: vlan51-ipaddresspool-ingress-sharding
namespace: metallb-system
spec:
addresses:
- 192.168.51.206-192.168.51.210
autoAssign: true
avoidBuggyIPs: true
serviceAllocation:
namespaces:
- openshift-ingress

- Create the `L2Advertisement` accordingly:
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
name: l2adv-vlan51-ipaddresspool-ingress-sharding
namespace: metallb-system
spec:
ipAddressPools:
- vlan51-ipaddresspool-ingress-sharding

- Create an IngressController object which will be used for the new route
- The Ingress Controller selects routes using a `routeSelector`
oc apply -f - <<EOF
apiVersion: operator.openshift.io/v1
kind: IngressController
metadata:
name: sharded-router-no-lb
namespace: openshift-ingress-operator
spec:
domain: my-sharded-domain.retroplay.guske.io
nodePlacement:
nodeSelector:
matchLabels:
node-role.kubernetes.io/worker: ""
routeSelector:
matchLabels:
type: sharded
endpointPublishingStrategy:
type: LoadBalancerService
EOF

oc get pods -n openshift-ingress -o wide | grep sharded-router-no-lb
router-sharded-router-no-lb-6d8c6bdf59-7ff2j 1/1 Running 0 5m38s 10.129.0.233 ocp-mk42-cp1.jarvislab.guske.io <none> <none>
router-sharded-router-no-lb-6d8c6bdf59-qzjqb 1/1 Running 0 5m38s 10.130.1.31 ocp-mk42-cp2.jarvislab.guske.io <none> <none>
oc -n openshift-ingress-operator get ingresscontrollers.operator.openshift.io
NAME AGE
default 98d
sharded-router-no-lb 34s
oc get svc -n openshift-ingress
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
router-internal-default ClusterIP 172.30.62.65 <none> 80/TCP,443/TCP,1936/TCP 98d
router-internal-sharded-router-no-lb ClusterIP 172.30.178.216 <none> 80/TCP,443/TCP,1936/TCP 8s
router-sharded-router-no-lb LoadBalancer 172.30.231.186 192.168.51.206 80:31519/TCP,443:32731/TCP 8s
- Configure your DNS properly with the assigned IP
dig +short test.my-sharded-domain.retroplay.guske.io
192.168.51.206
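Ideally use a wildcard record for the sharded domain so every route underneath it resolves to the Service's external IP. A sketch for dnsmasq (adjust to whatever DNS server you use; the IP is the one MetalLB assigned above):

# /etc/dnsmasq.d/sharded-ingress.conf
address=/my-sharded-domain.retroplay.guske.io/192.168.51.206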
- You can optionally force a specific pool via annotation:
apiVersion: v1
kind: Service
metadata:
name: router-sharded-router
namespace: openshift-ingress
annotations:
metallb.universe.tf/address-pool: vlan51-ipaddresspool-ingress-sharding

- Deploy an example application
- Important is the label on the `route` object: `type=sharded` - this is what we've specified in the IngressController's routeSelector
oc apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
name: simple-web-app
spec:
replicas: 1
selector:
matchLabels:
app: simple-web-app
template:
metadata:
labels:
app: simple-web-app
spec:
containers:
- name: nginx
image: quay.io/rguske/simple-web-app:v1
ports:
- containerPort: 8080
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 200m
memory: 256Mi
---
apiVersion: v1
kind: Service
metadata:
name: simple-web-app
spec:
type: ClusterIP
selector:
app: simple-web-app
ports:
- port: 8080
targetPort: 8080
---
apiVersion: route.openshift.io/v1
kind: Route
metadata:
name: simple-web-app-route
labels:
type: sharded
spec:
to:
name: simple-web-app
weight: 100
kind: Service
host: simple-web-app.my-sharded-domain.retroplay.guske.io
path: ''
tls:
insecureEdgeTerminationPolicy: Redirect
termination: edge
port:
targetPort: 8080
EOF

oc get deploy,svc,route
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/simple-web-app 1/1 1 1 77m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/simple-web-app ClusterIP 172.30.129.154 <none> 8080/TCP 77m
NAME HOST/PORT PATH SERVICES PORT TERMINATION WILDCARD
route.route.openshift.io/simple-web-app-route simple-web-app.my-sharded-domain.retroplay.guske.io ... 1 more simple-web-app 8080 edge/Redirect None
- Validate:
dig +short simple-web-app.my-sharded-domain.retroplay.guske.io
192.168.51.206
oc get route/simple-web-app-route -o json | jq '.status.ingress'
[
{
"conditions": [
{
"lastTransitionTime": "2026-04-14T08:34:39Z",
"status": "True",
"type": "Admitted"
}
],
"host": "simple-web-app.my-sharded-domain.retroplay.guske.io",
"routerCanonicalHostname": "router-sharded-router-no-lb.my-sharded-domain.retroplay.guske.io",
"routerName": "sharded-router-no-lb",
"wildcardPolicy": "None"
},
{
"conditions": [
{
"lastTransitionTime": "2026-04-14T08:34:39Z",
"status": "True",
"type": "Admitted"
}
],
"host": "simple-web-app.my-sharded-domain.retroplay.guske.io",
"routerCanonicalHostname": "router-default.apps.ocp-mk42.retroplay.guske.io",
"routerName": "default",
"wildcardPolicy": "None"
}
]

- Test from a system which is on the same subnet:
curl -kI https://simple-web-app.my-sharded-domain.retroplay.guske.io
HTTP/1.1 200 OK
server: Werkzeug/3.1.3 Python/3.12.9
date: Tue, 14 Apr 2026 16:40:49 GMT
content-type: text/html; charset=utf-8
content-length: 821
set-cookie: cbda86f129258df79f6b63217c5fda7e=abbd944bc53d81890ede6449349fcebe; path=/; HttpOnly; Secure; SameSite=None
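Note that the route was admitted by both the sharded and the default router (see the .status.ingress output above). If the sharded routes should be served exclusively by the new Ingress Controller, the default controller can be given a routeSelector that excludes the type=sharded label; a sketch (this does not affect the administration routes, but keep the earlier caveat in mind):

oc patch ingresscontroller default -n openshift-ingress-operator --type=merge \
  -p '{"spec":{"routeSelector":{"matchExpressions":[{"key":"type","operator":"NotIn","values":["sharded"]}]}}}'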
A checkup is an automated test workload that allows you to verify if a specific cluster functionality works as expected.
- Create a `NetworkAttachmentDefinition` for two projects (vms and vms-2 in my case) - here via a `ClusterUserDefinedNetwork`:
apiVersion: k8s.ovn.org/v1
kind: ClusterUserDefinedNetwork
metadata:
name: nad-50
spec:
namespaceSelector:
matchExpressions:
- key: kubernetes.io/metadata.name
operator: In
values:
- vms-2
- vms
network:
localnet:
ipam:
mode: Disabled
mtu: 1500
physicalNetworkName: physnet-data
role: Secondary
vlan:
access:
id: 50
mode: Access
topology: Localnet

Once this prerequisite exists, one can click on "Install Permissions" in the OpenShift WebConsole --> Virtualization --> Checkups. The button is greyed out as long as no NAD is configured.
Note: All checkups can also be configured via the CLI. Check the official docs.
It'll add a new ServiceAccount named vm-latency-checkup-sa...
oc get sa
NAME SECRETS AGE
builder 0 93d
default 0 93d
deployer 0 93d
vm-latency-checkup-sa 0 3m8s
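If you prefer the CLI over the console, the checkup input is just a ConfigMap whose keys mirror the result object shown further below; a minimal sketch (NAD and node names are the ones from this lab - a Job referencing this ConfigMap then executes the checkup, see the official docs for the exact Job spec):

oc create -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: vm-latency-checkup-config
  namespace: vms
data:
  spec.timeout: 5m
  spec.param.networkAttachmentDefinitionNamespace: "vms"
  spec.param.networkAttachmentDefinitionName: "nad-vlan-50"
  spec.param.sampleDurationSeconds: "5"
  spec.param.sourceNode: "ocp-mk42-cp1.jarvislab.guske.io"
  spec.param.targetNode: "ocp-mk42-cp2.jarvislab.guske.io"
EOF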
Running the checkup will instantiate new pods:
oc get pods
NAME READY STATUS RESTARTS AGE
virt-launcher-latency-check-source-gjk5h-j2cgl 3/3 Running 0 42s
virt-launcher-latency-check-target-g96l7-cllk5 3/3 Running 0 42s
vm-latency-checkup-1-7643-8g7gv 1/1 Running 0 43s
The result will be stored in a ConfigMap:
oc get cm vm-latency-checkup-1 -oyaml
apiVersion: v1
data:
spec.param.networkAttachmentDefinitionName: nad-vlan-50
spec.param.networkAttachmentDefinitionNamespace: vms
spec.param.sampleDurationSeconds: "5"
spec.param.sourceNode: ocp-mk42-cp1.jarvislab.guske.io
spec.param.targetNode: ocp-mk42-cp2.jarvislab.guske.io
spec.timeout: 5m
status.completionTimestamp: "2026-04-14T17:19:34Z"
status.failureReason: ""
status.result.avgLatencyNanoSec: "2234000"
status.result.maxLatencyNanoSec: "6209000"
status.result.measurementDurationSec: "5"
status.result.minLatencyNanoSec: "428000"
status.result.sourceNode: ocp-mk42-cp1.jarvislab.guske.io
status.result.targetNode: ocp-mk42-cp2.jarvislab.guske.io
status.startTimestamp: "2026-04-14T17:18:23Z"
status.succeeded: "true"
kind: ConfigMap
metadata:
creationTimestamp: "2026-04-14T17:18:22Z"
labels:
kiagnose/checkup-type: kubevirt-vm-latency
name: vm-latency-checkup-1
namespace: vms
resourceVersion: "188052432"
uid: c58d68e0-e792-475a-b395-8408238e5406

You can use a storage checkup to verify that the cluster storage is optimally configured for OpenShift Virtualization.
Note: All checkups can also be configured via the CLI. Check the official docs.
As with the Network Latency Checkup, permissions come first: click on Install Permissions in the WebConsole.
oc get sa
NAME SECRETS AGE
builder 0 93d
default 0 93d
deployer 0 93d
storage-checkup-sa 0 106s
vm-latency-checkup-sa 0 67m
Running the checkup will add resources accordingly:
oc get pod,pvc,dv,job
NAME READY STATUS RESTARTS AGE
pod/kubevirt-storage-checkup-1-9942-nt7gq 1/1 Running 0 102s
pod/virt-launcher-rhel-9-white-fowl-80-m7d47 0/2 Completed 0 30h
pod/virt-launcher-rhel-9-white-fowl-80-qxdvw 2/2 Running 0 6h10m
pod/vm-latency-checkup-1-7643-8g7gv 0/1 Completed 0 59m
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE
persistentvolumeclaim/persistent-state-for-vmi-under-test-df89f-mlrn8 Pending synology-iscsi-storage <unset> 12s
persistentvolumeclaim/rhel-9-white-fowl-80-volume Bound pvc-ff0b8ed4-e328-45e8-851b-ffb41537a714 30Gi RWX synology-iscsi-storage <unset> 30h
persistentvolumeclaim/vmi-under-test-df89f-dv Bound pvc-b0128a6d-add3-45d9-b287-99f6427697b0 30Gi RWX synology-iscsi-storage <unset> 17s
NAME PHASE PROGRESS RESTARTS AGE
datavolume.cdi.kubevirt.io/rhel-9-white-fowl-80-volume Succeeded 100.0% 30h
datavolume.cdi.kubevirt.io/vmi-under-test-df89f-dv Succeeded 100.0% 18s
NAME STATUS COMPLETIONS DURATION AGE
job.batch/kubevirt-storage-checkup-1-9942 Running 0/1 102s 102s
job.batch/vm-latency-checkup-1-7643 Complete 1/1 74s 59m
Failed Job:
kind: ConfigMap
apiVersion: v1
metadata:
name: kubevirt-storage-checkup-1
namespace: vms
labels:
kiagnose/checkup-type: kubevirt-vm-storage
data:
status.result.goldenImagesNotUpToDate: |-
openshift-virtualization-os-images/centos-stream9-image-cron
openshift-virtualization-os-images/centos-stream10-image-cron
status.result.cnvVersion: 4.21.3
status.succeeded: 'false'
status.result.defaultStorageClass: synology-iscsi-storage
status.result.vmHotplugVolume: ''
status.result.vmBootFromGoldenImage: 'failed waiting for VMI "vmi-under-test-df89f" successfully booted: timed out waiting for the condition'
status.result.storageProfilesWithSmartClone: synology-iscsi-storage
status.result.storageProfilesWithEmptyClaimPropertySets: ''
spec.timeout: 10m
status.startTimestamp: '2026-04-14T18:16:22Z'
status.result.storageProfilesWithSpecClaimPropertySets: ''
status.result.ocpVersion: 4.21.9
status.failureReason: VMI vms/vmi-under-test-df89f is owned by a VM
status.result.vmsWithUnsetEfsStorageClass: ''
status.result.concurrentVMBoot: ''
status.result.vmLiveMigration: 'failed waiting for VMI "vmi-under-test-df89f" migration completed: timed out waiting for the condition'
status.result.pvcBound: pvc failed to bound
status.result.vmVolumeClone: 'DV cloneType: "csi-clone"'
status.result.storageProfileMissingVolumeSnapshotClass: ''
status.result.vmsWithNonVirtRbdStorageClass: ''
status.result.goldenImagesNoDataSource: ''
status.result.storageProfilesWithRWX: synology-iscsi-storage
status.completionTimestamp: '2026-04-14T18:23:42Z'

In my case, there's indeed something wrong with the CentOS Stream 9 golden image:
oc -n openshift-virtualization-os-images get dv
NAME PHASE PROGRESS RESTARTS AGE
centos-stream10-40669a406f49 N/A 39d
centos-stream10-5db94eb365eb Succeeded 100.0% 70d
centos-stream10-8adef4f5457b Succeeded 100.0% 83d
centos-stream10-da2ffd43fa26 Succeeded 100.0% 76d
centos-stream9-0e16ba1cf6c9 Succeeded 100.0% 69d
centos-stream9-2e68de8fe816 N/A 39d
centos-stream9-4c67dd12e190 Succeeded 100.0% 82d
centos-stream9-86bfc3da3797 Succeeded 100.0% 75d
fedora-68ed96832eca Succeeded 100.0% 97d
rhel10-c03936a065f2 Succeeded 100.0% 1 97d
rhel8-004e24cfacec Succeeded 100.0% 1 89d
rhel8-4ccd8b6aee47 Succeeded 100.0% 1 97d
rhel9-ab4ec16077fe Succeeded 100.0% 1 97d
Check out this repository for an automated test that checks the readiness of a KubeVirt cluster to run virtualized real-time workloads.
In OpenShift Virtualization (via Kubernetes + Multus), a MultiNetworkPolicy is used to control traffic on secondary networks (not the default pod network).
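MultiNetworkPolicy support must be enabled on the Cluster Network Operator before the objects below have any effect; a sketch:

oc patch network.operator.openshift.io cluster --type=merge -p '{"spec":{"useMultiNetworkPolicy":true}}'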
- Create two projects:
oc new-project dev-a
oc new-project dev-b
- Important is that a bridge-mapping exists in the `nncp`:
[...]
ovn:
bridge-mappings:
- bridge: br-data
localnet: physnet-data
state: present

- Create two `network-attachment-definition` objects in the projects:
oc create -f - <<EOF
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
name: nad-br-data-vlan50
namespace: dev-a
spec:
config: |-
{
"cniVersion": "0.4.0",
"name": "physnet-data",
"type": "ovn-k8s-cni-overlay",
"mtu": 1500,
"netAttachDefName": "dev-a/nad-br-data-vlan50",
"topology": "localnet",
"vlanID": 50
}
EOF

oc create -f - <<EOF
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
name: nad-br-data-vlan50
namespace: dev-b
spec:
config: |-
{
"cniVersion": "0.4.0",
"name": "physnet-data",
"type": "ovn-k8s-cni-overlay",
"mtu": 1500,
"netAttachDefName": "dev-b/nad-br-data-vlan50",
"topology": "localnet",
"vlanID": 50
}
EOF

- Create VMs in both new projects:
export VM=rhel9-a && echo $VM
oc create -f - <<EOF
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
name: '${VM}'
namespace: dev-a
labels:
vm.openshift.io/folder: linux
spec:
dataVolumeTemplates:
- apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
name: '${VM}'
spec:
sourceRef:
kind: DataSource
name: rhel9
namespace: openshift-virtualization-os-images
storage:
resources:
requests:
storage: 30Gi
runStrategy: RerunOnFailure
template:
metadata:
annotations:
kubevirt.io/pci-topology-version: v3
vm.kubevirt.io/flavor: small
vm.kubevirt.io/os: rhel9
vm.kubevirt.io/workload: server
spec:
architecture: amd64
domain:
cpu:
cores: 1
sockets: 1
threads: 1
devices:
disks:
- bootOrder: 1
disk:
bus: virtio
name: rootdisk
- disk:
bus: virtio
name: cloudinitdisk
interfaces:
- bridge: {}
model: virtio
name: default
state: up
rng: {}
features:
acpi: {}
smm:
enabled: true
firmware:
bootloader:
efi: {}
machine:
type: pc-q35-rhel9.6.0
memory:
guest: 2Gi
resources: {}
networks:
- multus:
networkName: nad-br-data-vlan50
name: default
volumes:
- dataVolume:
name: '${VM}'
name: rootdisk
- cloudInitNoCloud:
userData: |-
#cloud-config
user: cloud-user
password: redhat
chpasswd: { expire: False }
name: cloudinitdisk
EOF

- Create a second VM (rhel9-b) in project dev-b:
export VM=rhel9-b && echo $VM
oc create -f - <<EOF
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
name: '${VM}'
namespace: dev-b
labels:
vm.openshift.io/folder: linux
spec:
dataVolumeTemplates:
- apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
name: '${VM}'
spec:
sourceRef:
kind: DataSource
name: rhel9
namespace: openshift-virtualization-os-images
storage:
resources:
requests:
storage: 30Gi
runStrategy: RerunOnFailure
template:
metadata:
annotations:
kubevirt.io/pci-topology-version: v3
vm.kubevirt.io/flavor: small
vm.kubevirt.io/os: rhel9
vm.kubevirt.io/workload: server
spec:
architecture: amd64
domain:
cpu:
cores: 1
sockets: 1
threads: 1
devices:
disks:
- bootOrder: 1
disk:
bus: virtio
name: rootdisk
- disk:
bus: virtio
name: cloudinitdisk
interfaces:
- bridge: {}
model: virtio
name: default
state: up
rng: {}
features:
acpi: {}
smm:
enabled: true
firmware:
bootloader:
efi: {}
machine:
type: pc-q35-rhel9.6.0
memory:
guest: 2Gi
resources: {}
networks:
- multus:
networkName: nad-br-data-vlan50
name: default
volumes:
- dataVolume:
name: '${VM}'
name: rootdisk
- cloudInitNoCloud:
userData: |-
#cloud-config
user: cloud-user
password: redhat
chpasswd: { expire: False }
name: cloudinitdisk
EOF

- Validate the IP assignment (DHCP):
oc get vmi -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{range .status.interfaces[*]}{.ipAddress}{" "}{end}{"\n"}{end}'
rhel9-a 192.168.50.179
- Connect to your VM in project `dev-a` and validate the communication to the VM in project `dev-b`:
virtctl ssh cloud-user@vm/rhel9-a -i ~/.ssh/id_jarvishomelab_ed25519
- Create a `MultiNetworkPolicy` which denies all incoming as well as outgoing traffic on the secondary network for all pods in project dev-a:
oc create -f - <<EOF
apiVersion: k8s.cni.cncf.io/v1beta1
kind: MultiNetworkPolicy
metadata:
name: default-deny-to-dev-a
namespace: dev-a
annotations:
k8s.v1.cni.cncf.io/policy-for: dev-a/nad-br-data-vlan50
spec:
podSelector: {}
policyTypes:
- Ingress
- Egress
ingress: []
egress: []
EOF

- If the use case is that only ingress traffic should be restricted, and only for VMs that carry the label `security-label=development` (set under the VM's spec.template.metadata.labels so it ends up on the virt-launcher pod), use:
oc create -f - <<EOF
apiVersion: k8s.cni.cncf.io/v1beta1
kind: MultiNetworkPolicy
metadata:
name: default-deny-to-label-development
namespace: dev-a
annotations:
k8s.v1.cni.cncf.io/policy-for: dev-a/nad-br-data-vlan50
spec:
podSelector:
matchLabels:
security-label: development
policyTypes:
- Ingress
ingress: []
EOF- the next
MultiNetworkPolicyonly allows communication on port 22 (ssh) - this time using the label
security-label=development
oc create -f - <<EOF
apiVersion: k8s.cni.cncf.io/v1beta1
kind: MultiNetworkPolicy
metadata:
name: allow-ssh-only-to-label-development
namespace: dev-a
annotations:
k8s.v1.cni.cncf.io/policy-for: dev-a/nad-br-data-vlan50
spec:
podSelector:
matchLabels:
security-label: development
policyTypes:
- Ingress
ingress:
- ports:
- protocol: TCP
port: 22
EOF

- Validate its functionality: `ssh` should work, but e.g. `icmp` should not
Another example explicitly allows communication from VM rhel9-b (192.168.50.180) to VM rhel9-a (192.168.50.179):
oc create -f - <<EOF
apiVersion: k8s.cni.cncf.io/v1beta1
kind: MultiNetworkPolicy
metadata:
name: allow-from-dev-b-vm-192-168-50-180
namespace: dev-a
annotations:
k8s.v1.cni.cncf.io/policy-for: dev-a/nad-br-data-vlan50
spec:
podSelector:
matchLabels:
security-label: development
policyTypes:
- Ingress
ingress:
- from:
- ipBlock:
cidr: 192.168.50.180/32
EOF

- The result should be that only rhel9-b can reach rhel9-a; any other VM on the network (e.g. rhel9-c) can't!
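A quick way to validate the policy from inside the guests; a sketch using the IPs from above:

# from rhel9-b (allowed by the ipBlock rule):
ping -c 3 192.168.50.179   # should succeed

# from any other VM on VLAN 50:
ping -c 3 192.168.50.179   # should time out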







