Add reuse-container mode #303

Open

rsennewald wants to merge 1 commit into buildkite-plugins:master from rsennewald:reuse-containers

Conversation

@rsennewald


Add a persistent container feature. When `reuse-container: true`
is set, the plugin keeps a container alive across job steps
instead of creating and destroying one each time.
Key behaviors:
- Derives a stable container name from the image name and agent
  spawn index (or uses an explicit `reuse-container-name` override).
- On each job, checks for an existing running container: if the
  image digest matches, attaches via `docker exec`; if the digest
  differs, prints a warning banner, removes the stale container,
  and creates a fresh one.
- New containers are started detached (`docker run -d ... sleep
  infinity`) with `--rm` suppressed, then the job command runs via
  `docker exec`.
- Environment variables are stripped from the creation `docker run`
  and injected per-exec only, preventing secrets from leaking
  between jobs that share the container.
- The `pre-exit` hook skips container cleanup when reuse-container
  is enabled so the container persists for subsequent jobs.
Files changed:
- plugin.yml: new `reuse-container` and `reuse-container-name` opts
- lib/shared.bash: `get_reuse_container_name`,
  `get_container_image_id`, `get_image_id` helpers
- commands/run.sh: reuse-container branch with exec-arg extraction,
  digest validation, and detached container creation
- hooks/pre-exit: skip cleanup when reuse-container is set
- tests/reuse-container.bats: 11 bats tests covering creation,
  reuse, digest mismatch, stopped container, custom name, spawn
  index, env passthrough, env stripping, pre-exit, --rm absence,
  and exit code propagation
@rsennewald rsennewald requested a review from a team as a code owner March 23, 2026 23:50
Contributor

@toote left a comment


Thanks for the great PR @rsennewald! I think it is a very interesting feature, but I have some questions about how it would interact with other parts of the plugin that assume the container is an ephemeral, isolated execution environment in a known state (especially labels and volume mounts), because today it is created from an image every time.

Also, I think the code flow could be structured better so that we avoid duplicating the code that runs the container or computes the right arguments for the commands.

More than happy to discuss these items and looking forward to your opinions!

Comment thread commands/run.sh
Comment on lines +604 to +619
# Build exec_args by extracting only the flags that docker exec supports
# from the run_flags snapshot (tty, interactive, env, workdir, user).
exec_args=()
i=0
while [[ $i -lt ${#run_flags[@]} ]]; do
  case "${run_flags[$i]}" in
    -t|-i)
      exec_args+=("${run_flags[$i]}")
      ;;
    --env|--env-file|--workdir|-u)
      exec_args+=("${run_flags[$i]}" "${run_flags[$((i+1))]}")
      i=$((i+1))
      ;;
  esac
  i=$((i+1))
done


I think it would be more readable to create a separate argument array for the exec codepath or, even better, one array for the options available on both exec and run and another for those exclusive to run.
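The reviewer's suggestion could look roughly like this. A minimal sketch, assuming hypothetical array names (`shared_flags`, `run_only_flags`); the flag values are placeholders:

```shell
#!/bin/bash
# Sketch: build flags up front into two arrays instead of filtering
# run_flags after the fact. shared_flags holds options valid for both
# `docker run` and `docker exec`; run_only_flags holds run-only options.
shared_flags=()
run_only_flags=()

# Placeholder values for illustration only.
shared_flags+=("--env" "FOO=bar" "--workdir" "/workdir" "-t")
run_only_flags+=("--rm" "--volume" "/src:/workdir")

# Each codepath then composes its argument list with no extraction loop.
run_args=("${run_only_flags[@]}" "${shared_flags[@]}")
exec_args=("${shared_flags[@]}")
```

This removes the positional `while`/`case` scan entirely and keeps the knowledge of which flags `docker exec` accepts in one place.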

Comment thread commands/run.sh
Comment on lines +622 to +639
exec_cmd=()
if [[ ${#shell[@]} -gt 0 ]]; then
  for shell_arg in "${shell[@]}"; do
    exec_cmd+=("$shell_arg")
  done
fi
if [[ -n "${BUILDKITE_COMMAND}" ]]; then
  if is_windows; then
    windows_multi_command=${BUILDKITE_COMMAND//$'\n'/ && }
    exec_cmd+=("${windows_multi_command}")
  else
    exec_cmd+=("${BUILDKITE_COMMAND}")
  fi
elif [[ ${#command[@]} -gt 0 ]]; then
  for command_arg in "${command[@]}"; do
    exec_cmd+=("$command_arg")
  done
fi


Same here: why duplicate this logic that is already done above for the run command? It would make more sense to save all of this to a separate array that is shared by both run and exec.
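The shared-array idea could be sketched like this, assuming a hypothetical `container_cmd` array built once and consumed by both codepaths (the shell and command values are placeholders):

```shell
#!/bin/bash
# Sketch: assemble the in-container command once into a single array
# that both `docker run` and `docker exec` can append to their args.
shell_cmd=("/bin/sh" "-e" "-c")      # placeholder shell wrapper
BUILDKITE_COMMAND="echo hello"       # placeholder step command

container_cmd=("${shell_cmd[@]}")
if [[ -n "${BUILDKITE_COMMAND}" ]]; then
  container_cmd+=("${BUILDKITE_COMMAND}")
fi

# Later, either of:
#   docker run  "${run_args[@]}"  "$image" "${container_cmd[@]}"
#   docker exec "${exec_args[@]}" "$container_name" "${container_cmd[@]}"
```

The Windows newline-joining special case would then also live in exactly one place.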

Comment thread commands/run.sh
Comment on lines +715 to +732
# Disable -e outside of the subshell; since the subshell returning a failure
# would exit the parent shell (here) early.
set +e

# Prevent SIGTERM from killing this script. SIGTERM will still be passed to the Docker container, which can exit
# gracefully (or, if necessary, non-gracefully per the `--stop-timeout` flag passed above).
trap '' SIGTERM

# Don't convert paths on gitbash on windows, as that can mangle user paths and cmd options.
# See https://github.com/buildkite-plugins/docker-buildkite-plugin/issues/81 for more information.
# `trap` is used in this subshell for the same reason it is used above.
( if is_windows ; then export MSYS_NO_PATHCONV=1; fi && trap '' SIGTERM && docker run "${args[@]}" )

exit_code=$?

set -e

exit $exit_code


This whole block is common to both branches; I think it could be simplified by crafting `${args[@]}` more carefully (i.e. building a version that serves both codepaths), so the trap/exit-code scaffolding exists only once.
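One way to read this suggestion: pick the docker subcommand and its arguments up front, so the shared SIGTERM/exit-code block invokes docker exactly once. A sketch with hypothetical variable names (`docker_cmd`, `reuse_container`) and placeholder values:

```shell
#!/bin/bash
# Sketch: choose the subcommand and args once, then run a single
# `docker "${docker_cmd[@]}"` inside the shared trap/exit-code block.
reuse_container="true"          # placeholder plugin option
exec_args=("-t")                # placeholder exec flags
container_name="reuse-demo"     # placeholder derived name
run_args=("--rm" "myimage")     # placeholder run args

if [[ "${reuse_container}" == "true" ]]; then
  docker_cmd=("exec" "${exec_args[@]}" "${container_name}")
else
  docker_cmd=("run" "${run_args[@]}")
fi

# The common block then contains the only docker invocation:
#   ( trap '' SIGTERM && docker "${docker_cmd[@]}" )
```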

Comment thread lib/shared.bash

# Returns a stable container name for the reuse-container feature.
# Uses the explicit override if set, otherwise derives from the image name
# and the agent's spawn index (to isolate containers per agent on a host).


This should also be isolated per pipeline; otherwise you could run into issues with completely different repositories being mounted (if they use the same image to run).
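Folding the pipeline into the derived name could look like this. A sketch only: the name prefix is invented, and the spawn-index variable name is assumed from the PR description rather than confirmed against the agent's actual environment.

```shell
#!/bin/bash
# Sketch: derive a container name that is unique per pipeline, image,
# and agent spawn index. Placeholder values stand in for the real env.
BUILDKITE_PIPELINE_SLUG="my-pipeline"
BUILDKITE_AGENT_SPAWN="2"            # assumed variable name for the spawn index
image="myorg/app:latest"

# Docker container names must match [a-zA-Z0-9][a-zA-Z0-9_.-]*, so
# replace any other character in the image reference with '-'.
sanitised_image="${image//[^a-zA-Z0-9_.-]/-}"
container_name="buildkite-reuse-${BUILDKITE_PIPELINE_SLUG}-${sanitised_image}-${BUILDKITE_AGENT_SPAWN}"
echo "$container_name"
```

With the slug in the name, two pipelines sharing an image on the same host get distinct persistent containers, so one pipeline's mounts and filesystem state can never leak into the other's.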

Comment thread commands/run.sh
Comment on lines 536 to 539
"--label" "com.buildkite.pipeline_name=${BUILDKITE_PIPELINE_NAME}"
"--label" "com.buildkite.pipeline_slug=${BUILDKITE_PIPELINE_SLUG}"
"--label" "com.buildkite.build_number=${BUILDKITE_BUILD_NUMBER}"
"--label" "com.buildkite.job_id=${BUILDKITE_JOB_ID}"


How would you suggest fixing the fact that the created container will forever have labels referencing jobs/builds/pipelines it may no longer be running for?

Comment thread README.md

### `reuse-container` (optional, boolean)

When set to `true`, the plugin keeps a named container running across pipeline steps instead of creating and destroying one each time. On each step, the plugin checks for an existing container: if the image digest matches, the command runs via `docker exec`; if the digest differs, the old container is removed and a fresh one is created. Environment variables are injected per-exec only (not baked into the container) to prevent secrets from leaking between jobs.


You should add a note with a huge warning about turning on this option when running commands that have side effects on the container or file system, as those would be preserved between runs or pipelines.
