shim: SandboxPlatform validation fails when containerd sets SandboxIsolation without SandboxPlatform #2619

@rzlink

Bug Report

Summary

The sandbox platform validation added in PR #2473 (createInternal() in service_internal.go) fails when containerd's default runtime configuration sets SandboxIsolation=1 (HYPERVISOR) without setting SandboxPlatform. This breaks all Hyper-V isolated Windows containers on containerd v2.2.1+.

Error

FailedCreatePodSandBox: failed to create shim task: invalid runtime sandbox platform: 
"" is an invalid OS component of "": OSAndVersion specifier component must match 
"^([A-Za-z0-9_-]+)(?:\\(([A-Za-z0-9_.-]*)\\))?$": invalid argument
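The rejection of an empty platform string can be reproduced with the standard library alone. The sketch below compiles the pattern quoted in the error message (with the log-escaped backslashes unescaped) and checks a few inputs. The helper name validOSComponent is hypothetical and is not part of containerd or hcsshim.

```go
package main

import (
	"fmt"
	"regexp"
)

// osAndVersion is the pattern quoted in the error message above, with the
// doubled backslashes from the log output reduced to single ones.
var osAndVersion = regexp.MustCompile(`^([A-Za-z0-9_-]+)(?:\(([A-Za-z0-9_.-]*)\))?$`)

// validOSComponent (hypothetical helper) reports whether s satisfies the
// pattern. The leading group uses "+", so at least one character is required.
func validOSComponent(s string) bool {
	return osAndVersion.MatchString(s)
}

func main() {
	for _, s := range []string{"", "windows", "windows(10.0.20348)"} {
		fmt.Printf("%q valid=%v\n", s, validOSComponent(s))
	}
}
```

Because the first group requires one or more characters, an unset SandboxPlatform can never pass this check, whatever the rest of the options contain.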

Root Cause

Two components interact to cause this bug:

1. containerd config_windows.go — incomplete runtime handler defaults

containerd's code defaults for runhcs-wcow-hypervisor set SandboxIsolation: 1 but omit SandboxPlatform:

"runhcs-wcow-hypervisor": {
    Type: "io.containerd.runhcs.v1",
    Options: map[string]interface{}{
        "SandboxIsolation": 1,
        // SandboxPlatform is NOT set
    },
},

This makes the runtime options non-empty (proto.Equal(shimOpts, &runhcsopts.Options{}) returns false), since SandboxIsolation is set.

2. hcsshim PR #2473 — strict validation assumes SandboxPlatform is always set when options are non-empty

The emptyShimOpts check:

emptyShimOpts := req.Options == nil || proto.Equal(shimOpts, &runhcsopts.Options{})

When emptyShimOpts == false, the code unconditionally validates SandboxPlatform:

if !emptyShimOpts {
    // platforms.Parse("") fails: "" is not a valid platform specifier
    plat, err := platforms.Parse(shimOpts.GetSandboxPlatform())
    // ...
}

However, when emptyShimOpts == true, the shim infers the platform from the OCI spec and never needs SandboxPlatform. The inference logic already exists; it is just not reachable when the options are non-empty.
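To make the interaction concrete, here is a minimal sketch of the emptiness check using a stand-in struct rather than the real protobuf message. The type shimOptions and the function optsEmpty are hypothetical; the real code compares a runhcsopts.Options value with proto.Equal, which this only approximates with a zero-value comparison.

```go
package main

import "fmt"

// shimOptions is a hypothetical stand-in for runhcsopts.Options that keeps
// only the two fields discussed in this issue.
type shimOptions struct {
	SandboxIsolation int32
	SandboxPlatform  string
}

// optsEmpty approximates the emptyShimOpts check: options count as empty
// when they are nil or equal to the zero value.
func optsEmpty(o *shimOptions) bool {
	return o == nil || *o == (shimOptions{})
}

func main() {
	// What containerd's default handler sends: isolation set, platform unset.
	defaults := &shimOptions{SandboxIsolation: 1}

	fmt.Println("nil options empty:", optsEmpty(nil))
	fmt.Println("zero options empty:", optsEmpty(&shimOptions{}))
	fmt.Println("default hypervisor options empty:", optsEmpty(defaults))
	fmt.Printf("platform passed to validation: %q\n", defaults.SandboxPlatform)
}
```

Setting SandboxIsolation alone is enough to leave the "empty" path, so validation runs against an empty platform string.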

Affected Versions

| Component | Version | Status |
|---|---|---|
| hcsshim | v0.14.0-rc.1+ (includes PR #2473) | Affected |
| containerd | v2.2.1+ (bundles hcsshim v0.14.0-rc.1) | Affected |
| containerd | v2.1.x (bundles hcsshim v0.13.0) | Not affected (no validation) |

Reproduction

  1. Use stock containerd v2.2.1 with default config (no custom runtime handler options)
  2. Create a pod with runtimeClassName: runhcs-wcow-hypervisor
  3. Pod fails with the error above

Suggested Fix

When runtime options are non-empty but SandboxPlatform is empty, infer the platform from the OCI spec rather than failing:

if !emptyShimOpts {
    sandboxPlatform := shimOpts.GetSandboxPlatform()
    if sandboxPlatform == "" {
        if oci.IsLCOW(&spec) {
            sandboxPlatform = "linux/" + runtime.GOARCH
        } else if oci.IsWCOW(&spec) {
            sandboxPlatform = "windows/" + runtime.GOARCH
        } else {
            return nil, fmt.Errorf("cannot infer sandbox platform from OCI spec")
        }
        shimOpts.SandboxPlatform = sandboxPlatform
    }
    plat, err := platforms.Parse(sandboxPlatform)
    // ... existing validation continues
}

This mirrors the existing behavior when options are entirely empty (the shim already infers platform from the spec in that case).
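The fallback can also be expressed as a small pure function, which makes it easy to unit test without an OCI spec. This is only a sketch under assumptions: specKind, its constants, and resolveSandboxPlatform are hypothetical names standing in for the oci.IsLCOW / oci.IsWCOW checks, and the architecture is passed in explicitly instead of being read from runtime.GOARCH inside the function.

```go
package main

import (
	"errors"
	"fmt"
	"runtime"
)

// specKind is a hypothetical stand-in for the result of oci.IsLCOW(&spec)
// and oci.IsWCOW(&spec).
type specKind int

const (
	specUnknown specKind = iota
	specLCOW
	specWCOW
)

// resolveSandboxPlatform returns the configured platform when one is set,
// and otherwise infers "<os>/<arch>" from the kind of OCI spec.
func resolveSandboxPlatform(configured string, kind specKind, arch string) (string, error) {
	if configured != "" {
		return configured, nil
	}
	switch kind {
	case specLCOW:
		return "linux/" + arch, nil
	case specWCOW:
		return "windows/" + arch, nil
	default:
		return "", errors.New("cannot infer sandbox platform from OCI spec")
	}
}

func main() {
	// SandboxIsolation is set but SandboxPlatform is empty, as in the
	// containerd defaults; the spec describes a Windows container.
	plat, err := resolveSandboxPlatform("", specWCOW, runtime.GOARCH)
	fmt.Println(plat, err)
}
```

An explicitly configured SandboxPlatform still wins, so the existing validation of user-supplied values is unchanged; only the unset case takes the inference path.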

Environment

  • Windows Server 2022 (build 10.0.20348)
  • Kubernetes v1.33+
  • containerd v2.2.1
  • CAPZ (Cluster API Provider Azure) clusters

/cc @helsaawy
