Skip to content

grubconfigs: Sync grub.cfg and bootuuid.cfg to all ESPs on multi-device setups#1077

Draft
ckyrouac wants to merge 1 commit intocoreos:mainfrom
ckyrouac:grub-multi-fix
Draft

grubconfigs: Sync grub.cfg and bootuuid.cfg to all ESPs on multi-device setups#1077
ckyrouac wants to merge 1 commit intocoreos:mainfrom
ckyrouac:grub-multi-fix

Conversation

@ckyrouac
Copy link
Copy Markdown
Contributor

On multi-device setups (e.g. RAID 1 with an ESP per disk), grubconfigs::install() only writes grub.cfg and bootuuid.cfg to the single ESP currently mounted at /boot/efi. This means that if the system boots from a different disk, GRUB drops to a shell because there is no grub.cfg on that ESP.

Add sync_grub_configs_to_all_esps() which runs after the primary grubconfigs::install(). It discovers all ESP partitions across the provided devices, mounts each one to a temporary directory, and calls the new grubconfigs::install_to_esp() helper to write the static EFI grub.cfg and bootuuid.cfg.

Also refactor the ESP config writing in grubconfigs::install() into a shared write_esp_configs() helper to avoid code duplication.

Fixes #1076

Assisted-by: Claude Code (Opus 4.6)

…ce setups

On multi-device setups (e.g. RAID 1 with an ESP per disk),
grubconfigs::install() only writes grub.cfg and bootuuid.cfg to the
single ESP currently mounted at /boot/efi. This means that if the
system boots from a different disk, GRUB drops to a shell because
there is no grub.cfg on that ESP.

Add sync_grub_configs_to_all_esps() which runs after the primary
grubconfigs::install(). It discovers all ESP partitions across the
provided devices, mounts each one to a temporary directory, and
calls the new grubconfigs::install_to_esp() helper to write the
static EFI grub.cfg and bootuuid.cfg.

Also refactor the ESP config writing in grubconfigs::install() into
a shared write_esp_configs() helper to avoid code duplication.

Fixes coreos#1076

Assisted-by: Claude Code (Opus 4.6)
Signed-off-by: ckyrouac <ckyrouac@redhat.com>
@openshift-ci
Copy link
Copy Markdown

openshift-ci bot commented Mar 27, 2026

Hi @ckyrouac. Thanks for your PR.

I'm waiting for a coreos member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Tip

We noticed you've done this a few times! Consider joining the org to skip this step and gain /lgtm and other bot rights. We recommend asking approvers on your previous PRs to sponsor you.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ckyrouac ckyrouac marked this pull request as draft March 27, 2026 13:37
@ckyrouac
Copy link
Copy Markdown
Contributor Author

pushing up what claude generated in case anyone wants to review it while I test/validate locally and add a test.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request adds support for synchronizing GRUB configuration files across all detected EFI System Partitions (ESPs) on multi-device systems. It refactors the GRUB installation logic into reusable helpers and introduces a synchronization loop that mounts additional ESPs to apply updates. Feedback suggests improving error logging during ESP detection and implementing a panic-safe RAII pattern for the mount/unmount operations.

Comment on lines +254 to +257
let esp_parts: Vec<String> = devices
.iter()
.filter_map(|dev| dev.find_partition_of_esp().ok().map(|p| p.path()))
.collect();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The use of .ok() here silently ignores any errors from dev.find_partition_of_esp(). In other parts of the codebase (e.g., the install function in this file), similar errors are logged with a warning. For consistency and better diagnostics, it would be beneficial to log a warning when find_partition_of_esp fails.

Suggested change
let esp_parts: Vec<String> = devices
.iter()
.filter_map(|dev| dev.find_partition_of_esp().ok().map(|p| p.path()))
.collect();
let esp_parts: Vec<String> = devices
.iter()
.filter_map(|dev| match dev.find_partition_of_esp() {
Ok(p) => Some(p.path()),
Err(e) => {
log::warn!("Skipping device {} for ESP sync: {:#}", dev.path(), e);
None
}
})
.collect();

Comment on lines +271 to +306
for esp_path in &esp_parts {
log::info!("Syncing GRUB configs to ESP {esp_path}");

// Mount the ESP
std::process::Command::new("mount")
.arg(esp_path)
.arg(tmpmnt)
.run_inherited()
.with_context(|| format!("Mounting ESP {esp_path}"))?;

let result = (|| -> Result<()> {
// Ensure the vendor directory exists under EFI/
let efi_vendor = tmpmnt.join("EFI").join(vendor);
std::fs::create_dir_all(&efi_vendor)
.with_context(|| format!("Creating {:?}", efi_vendor))?;

let efidir = openat::Dir::open(tmpmnt.join("EFI"))
.context("Opening EFI directory on ESP")?;

crate::grubconfigs::install_to_esp(
source_root,
&boot_grub2,
&efidir,
vendor,
write_uuid,
)
})();

// Always unmount, even on error
std::process::Command::new("umount")
.arg(tmpmnt)
.run_inherited()
.with_context(|| format!("Unmounting ESP {esp_path}"))?;

result?;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The current implementation for mounting and unmounting the ESP is not panic-safe. If a panic occurs after mounting but before unmounting (for example, inside crate::grubconfigs::install_to_esp), the umount command will not be executed. This could leave a stale mount point on a temporary directory that is about to be removed.

A more robust approach is to use a RAII guard struct that mounts in its constructor and unmounts in its Drop implementation. This ensures that the ESP is always unmounted, even in the case of a panic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

grub.cfg and bootuuid.cfg not synced to secondary ESPs on multi-device setups

1 participant