OCPBUGS-64841: Ensure 'containers' user & group are part of the image#1917
travier wants to merge 1 commit into openshift:master
@travier: This pull request references Jira Issue OCPBUGS-64841, which is invalid. The bug has been updated to refer to the pull request using the external bug tracker.
|
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: travier
Another option here is to instead go all in and drop those from the image. This means that we need to manually remove the As this user was mostly used to allocate a subuid/subgid range for UID/GID namespaced containers, it's not clear what impact a UID/GID change would have. |
This is a combination of multiple things:

- In [1], the cri-o package has been updated to use a systemd-sysusers config instead of calling the useradd/usermod commands directly.
- Starting with OCP 4.19, we've split the OCP packages (here cri-o) from the base RHEL image to the Node image layer. This means that the sysusers scriptlet in `%pre` is now called during the node layer build and does not add the user/group to the `/usr/lib/passwd|group` files but to the `/etc/passwd|group` ones. As it does not take into account the existing users & groups from `/usr/lib/passwd|group`, the new `containers` user/group get a UID/GID that collides with an existing user/group. Changes to the `/etc/passwd|group` files are also not propagated to the system ones on updates, as those files are changed on first boot when the `core` user is created on the system, and thus ostree does not update them anymore. See [2] & [3].
- Starting with OCP 4.19, new nodes start with no `containers` user/group defined (either in `/usr/` or `/etc`) and those are thus created in `/etc` after the switch to the node image, so everything appears to be OK when you create a fresh cluster. Clusters updating to OCP 4.19 with older nodes that used to have the `containers` user/group defined in `/usr/lib/passwd|group` will no longer have them there and thus systemd-sysusers will attempt to create them on the system. This will however fail as entries for those user/group are left in the `/etc/shadow` and `/etc/gshadow` files. This is [4] but "reversed".

The proposed solution here is to keep the `containers` user/group properly defined in the container image in the `/usr/lib/passwd|group` files. Older nodes will thus use those user/group like they used to. New nodes will stop trying to create them. They will have missing `shadow|gshadow` entries however until we fix [4], but that should not be an issue as those are not used for interactive/login session users.

The medium/longer term fix is to complete the transition away from nss-altfiles for all Bootable Container systems.

[1] https://pkgs.devel.redhat.com/cgit/rpms/cri-o/commit/?h=rhaos-4.18-rhel-9&id=240a1e3db29a1d1c1b58dfae1325a9f19c663b91
[2] https://bootc-dev.github.io/bootc/building/users-and-groups.html#system-users-and-groups-added-via-packages-etc
[3] https://bootc-dev.github.io/bootc/building/users-and-groups.html#nss-altfiles
[4] bootc-dev/bootc#1179

Fixes: https://redhat.atlassian.net/browse/OCPBUGS-64841
Related: https://gitlab.com/fedora/bootc/tracker/-/work_items/76
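To tell which of the states described above a given node or image is in, it can help to list every account database that still carries a `containers` entry. The following is only a minimal sketch: the `where_defined` helper and its `ROOT` argument are hypothetical names introduced here for illustration and are not part of this PR.

```bash
# Print the account databases under ROOT that contain an entry for NAME.
# ROOT is a hypothetical parameter so the check can run against an
# unpacked image or a scratch directory instead of the live system.
where_defined() {
    root=$1
    name=$2
    for db in usr/lib/passwd usr/lib/group etc/passwd etc/group etc/shadow etc/gshadow; do
        # Unreadable files (shadow/gshadow without root) are silently skipped.
        if [ -f "$root/$db" ] && grep -q "^${name}:" "$root/$db" 2>/dev/null; then
            echo "/$db"
        fi
    done
}
```

For example, `where_defined / containers` run as root on an upgraded node in the failing state described above would be expected to list only `/etc/shadow` and `/etc/gshadow`, since the passwd/group entries used to live in `/usr/lib`.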
a2e7ffd to 005e8a5
|
I think we should have added this user/group to our passwd/group files in #1661. |
|
@travier: all tests passed! |
|
LGTM, when this merges should I update the crio rpm to remove the sysusers addition or would it be a noop? |
Don't remove the sysusers config, we are using it here. The |
|
There has been movement related to this issue recently. See:
(from https://gitlab.com/fedora/bootc/tracker/-/work_items/76#note_3187445643) But I've just realized that this would be for RPM reading the "right" password & group files, not for sysusers to use them so this will not help here. |
|
(I still need to fully test this fix but we can still have the discussion about the approach now) |
From what I understand new and upgrading systems have been working fine even without this PR. If this hadn't failed would systems have stopped working? |
    # for the full details.

    # Only do that when doing a container build
    if [[ -f /run/.containerenv ]] && [[ -f /usr/lib/sysusers.d/crio.conf ]]; then
In what situation would we ever not be running in a container env?
Suggested change:

    -if [[ -f /run/.containerenv ]] && [[ -f /usr/lib/sysusers.d/crio.conf ]]; then
    +if [[ -f /usr/lib/sysusers.d/crio.conf ]]; then
I was conservative and copied the checks from above. I'll update both.
Yeah the check above dates back from when we supported both layered and base composes for the node image.
    # First, cleanup the broken entries from /etc/passwd|group|shadow|gshadow
    sed -i "/^containers:/d" /etc/{passwd,group,shadow,gshadow}
This contradicts your statement in the description of this PR:

> Starting with OCP 4.19, new nodes start with no `containers` user/group defined (either in `/usr/` or `/etc`)

If that is true then `containers:` should never be defined here.
New nodes start with a pure RHEL boot image that does not have CRI-O installed. This is the container image with CRI-O installed.
Just for my understanding, since there are a lot of pieces here. Today we have:
- RHEL CoreOS Base Image build (no crio)
- OpenShift Node Image build `FROM:rhel-coreos-base` where crio gets installed
  a. RPM transaction where crio gets installed
  b. postprocess scripts (not rpm scriptlets) where this `sed` statement runs

I think what you are saying is that in a. the `containers:` user gets added to `/etc/{passwd,group,shadow,gshadow}` and that's what we are cleaning up?
yes, it gets added to the node layer /etc/{passwd,group,shadow,gshadow} and not to the /usr as part of the node layer build.
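To make the cleanup step discussed in this thread concrete, here is a small sketch of the same `sed` expression applied under an arbitrary root, so it can be tried on a scratch copy rather than on a real `/etc`. The `drop_entries` name and the `ROOT` argument are assumptions made for this example; it also assumes GNU `sed` for `-i`.

```bash
# Delete every line belonging to NAME from the /etc account databases
# under ROOT. The trailing colon in the pattern keeps similarly named
# accounts (e.g. "containers2") untouched.
drop_entries() {
    root=$1
    name=$2
    for db in passwd group shadow gshadow; do
        if [ -f "$root/etc/$db" ]; then
            sed -i "/^${name}:/d" "$root/etc/$db"
        fi
    done
}
```

Running `drop_entries /tmp/scratch containers` on a copy of the node layer's `/etc` shows exactly which lines the RPM transaction added before the postprocess script removes them.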
    # Only do that when doing a container build
    if [[ -f /run/.containerenv ]] && [[ -f /usr/lib/sysusers.d/crio.conf ]]; then
We only ever run in a container environment.
Suggested change:

    -# Only do that when doing a container build
    -if [[ -f /run/.containerenv ]] && [[ -f /usr/lib/sysusers.d/crio.conf ]]; then
    +if [[ -f /usr/lib/sysusers.d/crio.conf ]]; then
Am I missing something?
Also, in what situations would crio.conf not exist? I'm thinking we should fail if crio.conf doesn't exist.
Probably should fail indeed.
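One way the "fail instead of skip" behavior agreed on here could look is sketched below. The `require_file` helper is a hypothetical name; the PR itself does not show the final form of this check.

```bash
# Return non-zero with a message on stderr when an expected file is
# missing, so the build stops instead of silently skipping the fix-up.
require_file() {
    if [ ! -f "$1" ]; then
        echo "error: expected $1 to exist" >&2
        return 1
    fi
}
```

In the build script this would be used as `require_file /usr/lib/sysusers.d/crio.conf || exit 1` in place of the `if [[ -f ... ]]` guard.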
    mv /etc/passwd /usr/lib/passwd
    mv /etc/group /usr/lib/group
    mv /etc/passwd.bak /etc/passwd
    mv /etc/group.bak /etc/group
I'd like to keep this "hack" targeted.
Can we do something here that will make sure the only entry that was created was the containers: user and no other changes were made to the /usr/lib/passwd* and /etc/passwd* from the RHEL CoreOS Base image we are deriving from and fail if some other changes were made?
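A targeted check along the lines requested above could compare each database against a copy saved from the base image and accept only the one expected entry. This is a sketch under assumptions: `only_expected_change` is a hypothetical helper, and it presumes the build keeps a pristine copy of each file to compare against.

```bash
# Succeed only if NEW equals BASE plus, at most, entries for NAME.
# Any removed, modified, or additional line makes the check fail.
only_expected_change() {
    base=$1
    new=$2
    name=$3
    # Lines present in the base file but missing from the new one.
    removed=$(grep -Fxv -f "$new" "$base" || true)
    # Lines present only in the new file, ignoring NAME's own entry.
    added=$(grep -Fxv -f "$base" "$new" | grep -v "^${name}:" || true)
    [ -z "$removed" ] && [ -z "$added" ]
}
```

A modified line shows up as both removed and added, so an unexpected edit to an existing account is caught as well as an unexpected new account.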
I don't think anyone is using the functionality that is broken by this PR (user namespace'd containers). I'm not fully sure this is needed for it to work (but maybe for it to be secure to make sure those IDs are not reused by something else). |
    mv /usr/lib/group /etc/group

    # Re-create the user/group/shadow/gshadow entries
    systemd-sysusers crio.conf
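Only fragments of the swap are visible in the hunks of this review, so the following is a reconstruction under assumptions rather than the script from the PR: it temporarily puts the `/usr/lib` passwd and group files in place of the `/etc` ones, runs a command, and swaps everything back. The `run_against_usr_lib` name and the `ROOT` argument are hypothetical.

```bash
# Temporarily expose ROOT/usr/lib/{passwd,group} as ROOT/etc/{passwd,group},
# run the given command, then move the (possibly updated) files back to
# /usr/lib and restore the original /etc files.
run_against_usr_lib() {
    root=$1
    shift
    for db in passwd group; do
        mv "$root/etc/$db" "$root/etc/$db.bak"
        mv "$root/usr/lib/$db" "$root/etc/$db"
    done
    status=0
    "$@" || status=$?
    # Always swap back, even if the command failed.
    for db in passwd group; do
        mv "$root/etc/$db" "$root/usr/lib/$db"
        mv "$root/etc/$db.bak" "$root/etc/$db"
    done
    return "$status"
}
```

In the node image build the wrapped command would be the `systemd-sysusers crio.conf` call from the hunk, with `/` as the root.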
I think we need to go further and fixate the UID/GID to whatever it was in the base composes on 4.18. Basically same rationale as f202927.
I guess we could do that here, or just add it to r-c-c which already carries a bunch of other fixated users/groups the node image needs.
I'm not sure why containers wasn't part of that commit I did back then. I'm pretty sure the way I came up with this is that I booted a live ~4.18 RHCOS and added the dynamic entries, so perhaps containers wasn't added back then for some reason.
|
RHCOS 4.18-9.4 (2026-03-23):

This UID is now used by dnsmasq: https://github.com/coreos/rhel-coreos-config/blob/main/passwd#L25

RHCOS 4.19-9.6 (2025-05-02) (FYI, I think we never shipped this one as we moved to the node image):

So the UID/GID will change from what we had in 4.18.
The openvswitch user and group have been part of the passwd & group files for, at least, as long as we've published RHCOS sources publicly:

- https://github.com/openshift/os/blame/bdb5b8153ed68c88e2485d9e7bd66ea6eb54d6c1/passwd#L27
- https://github.com/openshift/os/blame/release-4.19/group#L47

We did not remove them when we re-visited our fixed UIDs/GIDs in the split between the RHEL boot image and the new OCP node image ([1], [2] & [3]). Thus they are now part of the base RHEL boot image, even though the openvswitch package is not included there.

Although technically unnecessary, this is fine and simplifies things a bit as we do not have to update the user & group entries during the node image build, which is currently a problematic topic (see [4]). Thus instead of adding openvswitch to the hugetlbfs group in the node image build, we add it here directly to simplify the logic.

[1] openshift/os#1661
[2] coreos#29
[3] coreos#31
[4] openshift/os#1917
Adding users and groups during a container image layered build is currently non-ergonomic with bootable containers. Thus instead of doing that in openshift/os for the node layer, we directly include the user & group here, which also guarantees us that the UID/GID remain stable.

See openshift/os#1917 for the original version of this change and the full details about what makes adding user/group in the node layer non-ergonomic.

Fixes: https://redhat.atlassian.net/browse/OCPBUGS-64841
Adding users and groups during a container image layered build is currently non-ergonomic with bootable containers. Thus instead of doing that in openshift/os for the node layer, we directly include the user & group here, which also guarantees us that the UID/GID remain stable.

See openshift/os#1917 for the original version of this change and the full details about what makes adding user/group in the node layer non-ergonomic.

Unfortunately we cannot use the UID/GID that were used in the last "full" RHCOS image (4.18) as those are now used for dnsmasq (see [1]). Thus use the first UID & GID available for both user and group, going downward.

[1] openshift/os#1917 (comment)

Fixes: https://redhat.atlassian.net/browse/OCPBUGS-64841
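The "first UID & GID available for both user and group, going downward" selection can be sketched as a small search over the two files. This is illustrative only: the `first_free_id_down` helper and its starting value are assumptions, and the commit does not say which ID the search started from or how it was actually performed.

```bash
# Print the highest ID <= START that is used neither as a UID in
# PASSWD_FILE nor as a GID in GROUP_FILE (field 3 in both formats).
first_free_id_down() {
    passwd_file=$1
    group_file=$2
    id=$3
    while [ "$id" -gt 0 ]; do
        # awk exits 0 when some entry in either file already uses this ID.
        if ! awk -F: -v id="$id" '$3 == id { found = 1 } END { exit !found }' \
            "$passwd_file" "$group_file"; then
            echo "$id"
            return 0
        fi
        id=$((id - 1))
    done
    return 1
}
```

Picking an ID that is free in both files lets the user and its group share the same number, which matches the intent of keeping the UID and GID aligned and stable.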
|
OK I've made coreos/rhel-coreos-config#224 instead. I'll make another PR here to remove the script for openvswitch. |
We are moving the group inclusion directly to the RHEL base image instead of working around it here in the OCP node layer. See: openshift#1917 See: coreos/rhel-coreos-config#224 See: https://redhat.atlassian.net/browse/OCPBUGS-64841
|
Workaround removal for the node layer: #1918 |
|
Closing this one as we are doing coreos/rhel-coreos-config#224 & #1918 instead. |
|
@travier: This pull request references Jira Issue OCPBUGS-64841. The bug has been updated to no longer refer to the pull request using the external bug tracker.