Merged
2 changes: 1 addition & 1 deletion CLAUDE.md
@@ -70,7 +70,7 @@ When debugging container behavior, the order is: image `/etc/{rc,fstab,environme
## Active design proposals

- **`doc/zfs.md`** — optional ZFS storage backend (`ENROOT_STORAGE_BACKEND=zfs`). Replaces `unsquashfs`-per-create with extract-once-then-`zfs clone`. Adds a `.zfs` (zfs send stream) image format and a `zfs://host/NAME` transport scheme alongside today's `.sqsh`. Introduces a shared template cache with a live/warm/cold lifecycle (knobs: `ENROOT_TEMPLATE_WARM_SECONDS`, `ENROOT_TEMPLATE_PRESSURE_THRESHOLD`; eviction is implicit on `create`, no daemon, no `enroot gc` command). Default backend (`dir`) is unchanged.
- **`doc/plans/`** — six implementation plans (A–F) breaking the ZFS backend into independently-landable slices. Start with `doc/plans/README.md` for the index and recommended landing order (A → E → F → B → C → D). Plans add a new sourced module `src/storage_zfs.sh` (under a `zfs::` namespace) and branch in `src/runtime.sh`, `src/docker.sh` on `ENROOT_STORAGE_BACKEND`. **All six plans merged** on `zenroot/main` (PRs [#1](https://github.com/zeroae/enroot/pull/1), [#2](https://github.com/zeroae/enroot/pull/2), [#3](https://github.com/zeroae/enroot/pull/3), [#5](https://github.com/zeroae/enroot/pull/5), [#7](https://github.com/zeroae/enroot/pull/7), and Plan D in review).
- **`doc/plans/`** — implementation plans for the ZFS backend, broken into independently-landable slices. Start with `doc/plans/README.md` for the index and recommended landing order (A → E → F → B → C → D → G). Plans add a new sourced module `src/storage_zfs.sh` (under a `zfs::` namespace) and branch in `src/runtime.sh`, `src/docker.sh` on `ENROOT_STORAGE_BACKEND`. **Plans A–F merged** on `zenroot/main` (PRs [#1](https://github.com/zeroae/enroot/pull/1), [#2](https://github.com/zeroae/enroot/pull/2), [#3](https://github.com/zeroae/enroot/pull/3), [#5](https://github.com/zeroae/enroot/pull/5), [#7](https://github.com/zeroae/enroot/pull/7), [#8](https://github.com/zeroae/enroot/pull/8)). Plan G (per-layer clone chain, opt-in via `ENROOT_ZFS_LAYER_CHAIN=y`, [issue #4](https://github.com/zeroae/enroot/issues/4)) is layered on top of F.

## Conventions

421 changes: 421 additions & 0 deletions doc/plans/2026-05-01-zfs-g-layer-chain.md

Large diffs are not rendered by default.

5 changes: 3 additions & 2 deletions doc/plans/README.md
@@ -10,15 +10,16 @@ Plans for landing the optional ZFS storage backend designed in [`../zfs.md`](../
| D. `zfs://` URI transport — `enroot load zfs://host/NAME`, `enroot export NAME zfs://host` | [2026-04-29-zfs-d-zfs-uri.md](2026-04-29-zfs-d-zfs-uri.md) | A, C |
| E. Ephemeral start ZFS path — substitute `squashfuse + overlay` with throwaway clone | [2026-04-29-zfs-e-ephemeral-start.md](2026-04-29-zfs-e-ephemeral-start.md) | A |
| F. Docker layer-stack ZFS path — lift `ENROOT_NATIVE_OVERLAYFS=y` requirement on ZFS hosts | [2026-04-29-zfs-f-docker-load.md](2026-04-29-zfs-f-docker-load.md) | A |
| G. Per-layer ZFS clone chain (opt-in `ENROOT_ZFS_LAYER_CHAIN=y`) — cross-image layer dedup at the dataset level | [2026-05-01-zfs-g-layer-chain.md](2026-05-01-zfs-g-layer-chain.md) | F |

```
A ─┬─> B
   ├─> C ─> D
   ├─> E
   └─> F
   └─> F ─> G
```

Recommended landing order: **A → E → F → B → C → D**. A is the foundation; E/F give the most user-visible wins next; B improves cache economics; C/D add transport options.
Recommended landing order: **A → E → F → B → C → D → G**. A is the foundation; E/F give the most user-visible wins next; B improves cache economics; C/D add transport options; G is an opt-in optimization on top of F.
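
The recommended order can be sanity-checked against the dependency column with a few lines of bash. This is only a sketch: the `deps` map and `check_order` are hypothetical names, the entries for D, E, F and G come from the table rows shown here, and the entries for B and C are read off the diagram, which is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical dependency map. D, E, F, G follow the table; B and C are
# inferred from the diagram (assumption).
declare -A deps=(
    [A]=""
    [B]="A"
    [C]="A"
    [D]="A C"
    [E]="A"
    [F]="A"
    [G]="F"
)

# check_order PLAN...: succeed only if every plan is listed after all of
# the plans it depends on.
check_order() {
    local -A landed=()
    local plan dep
    for plan in "$@"; do
        for dep in ${deps[$plan]}; do
            if [[ -z "${landed[$dep]:-}" ]]; then
                echo "plan ${plan} is listed before its dependency ${dep}" >&2
                return 1
            fi
        done
        landed[$plan]=1
    done
}

check_order A E F B C D G && echo "order ok"
```

Any ordering that places G before F, or D before C, would be rejected by the same check.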

## Conventions used by these plans

14 changes: 13 additions & 1 deletion doc/zfs.md
@@ -1,6 +1,6 @@
# ZFS storage backend

This document describes an optional ZFS-aware mode for the enroot container store. **All six plans (A–F) are implemented.** When `ENROOT_STORAGE_BACKEND=zfs`: `enroot create`, `enroot remove`, ephemeral `enroot start <image>`, and `enroot load docker://...` all use ZFS datasets, with a shared template cache that survives `enroot remove` (warm) for `ENROOT_TEMPLATE_WARM_SECONDS` and gets pressure-evicted LRU once the templates dataset crosses `ENROOT_TEMPLATE_PRESSURE_THRESHOLD` of its quota. `enroot create` accepts both `.sqsh` and `.zfs` (zfs send stream) inputs; `enroot export --format=zfs` produces the latter. The `zfs://[USER@]HOST/NAME` URI scheme transports containers between enroot hosts over SSH (`enroot load zfs://...` to pull, `enroot export NAME zfs://...` to push). The default storage backend (plain directories under `ENROOT_DATA_PATH`) is unchanged and remains the only option on hosts without ZFS.
This document describes an optional ZFS-aware mode for the enroot container store. **All six plans (A–F) are implemented; Plan G adds an opt-in per-layer clone chain on top of F.** When `ENROOT_STORAGE_BACKEND=zfs`: `enroot create`, `enroot remove`, ephemeral `enroot start <image>`, and `enroot load docker://...` all use ZFS datasets, with a shared template cache that survives `enroot remove` (warm) for `ENROOT_TEMPLATE_WARM_SECONDS` and gets pressure-evicted LRU once the templates dataset crosses `ENROOT_TEMPLATE_PRESSURE_THRESHOLD` of its quota. `enroot create` accepts both `.sqsh` and `.zfs` (zfs send stream) inputs; `enroot export --format=zfs` produces the latter. The `zfs://[USER@]HOST/NAME` URI scheme transports containers between enroot hosts over SSH (`enroot load zfs://...` to pull, `enroot export NAME zfs://...` to push). The default storage backend (plain directories under `ENROOT_DATA_PATH`) is unchanged and remains the only option on hosts without ZFS.

## Motivation

@@ -15,6 +15,7 @@ The ZFS backend is an *alternative storage driver*, in the same spirit as Docker
| `ENROOT_STORAGE_BACKEND` | `dir` | `dir` = today's behavior. `zfs` = use ZFS datasets for the container store. |
| `ENROOT_TEMPLATE_WARM_SECONDS` | `604800` (7 days) | How long a template with no clones remains evictable only under pressure. `0` = evict immediately when refcount reaches zero (refcount-only). `inf` = never auto-evict. |
| `ENROOT_TEMPLATE_PRESSURE_THRESHOLD` | `0.80` | Templates dataset quota fraction above which routine `create`s start evicting warm templates. Soft signal; the ZFS quota is the hard wall. |
| `ENROOT_ZFS_LAYER_CHAIN` | unset | When `y` AND backend is `zfs`, populate the Docker template cache via a per-layer `zfs clone` chain under `<store>/.layers/<digest>` instead of a single merged extract. Cross-image base layers are physically shared on disk (a debian-bookworm base used by both `python:slim` and `node:slim` is stored once). Applies to `docker://` URIs only; `dockerd://` and `podman://` always go through the daemon-flat-export path and are unaffected. Default off — leaves Plan F's single-merge path unchanged. |

When `ENROOT_STORAGE_BACKEND=zfs`, `ENROOT_DATA_PATH` must be the mountpoint of a ZFS dataset that the unprivileged user has been granted permission on (see [Admin setup](#admin-setup)).
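
The gating rule for `ENROOT_ZFS_LAYER_CHAIN` in the table above (chain mode only when the flag is `y` and the backend is `zfs`) might look roughly like the following. This is a sketch under assumptions: `layer_chain_active` and `describe_mode` are illustrative names, and the real `zfs::layer_chain_active` in `src/storage_zfs.sh` is not part of this diff and may differ.

```bash
# Hypothetical predicate: true only when BOTH switches are set.
layer_chain_active() {
    [ "${ENROOT_STORAGE_BACKEND:-dir}" = "zfs" ] \
        && [ "${ENROOT_ZFS_LAYER_CHAIN:-}" = "y" ]
}

# Hypothetical helper that names the path a `docker://` load would take.
describe_mode() {
    if layer_chain_active; then
        echo "chain"          # Plan G: per-layer clone chain
    elif [ "${ENROOT_STORAGE_BACKEND:-dir}" = "zfs" ]; then
        echo "single-merge"   # Plan F: one merged extract
    else
        echo "dir"            # default backend, unchanged
    fi
}
```

With this shape, setting the flag alone on a `dir` host has no effect, which matches the "default off" wording in the table.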

@@ -26,6 +27,17 @@ ${pool}/${dataset}/templates/<sha256>@pristine # snapshot taken after extracti
${pool}/${dataset}/<user>/<container_name> # clones of @pristine, the user's containers
```

When `ENROOT_ZFS_LAYER_CHAIN=y`, an additional `.layers/` namespace appears under the same store; templates become clones of the chain leaf instead of being filled by a single merged extract:

```
${pool}/${dataset}/.layers/<layer-digest> # one per distinct registry layer
${pool}/${dataset}/.layers/<layer-digest>@done # snapshot taken after layer apply
${pool}/${dataset}/.templates/<image-config-sha> # zfs clone of the chain leaf @done
${pool}/${dataset}/.templates/<image-config-sha>@pristine
```

Each layer dataset is `zfs clone`d from the previous layer's `@done`, so two images sharing a base layer (e.g. `python:3.12-slim` and `node:20-slim`, both built on `debian:bookworm-slim`) physically share the base bytes. Layer datasets are immutable origins; ZFS refuses to destroy a layer while any descendant clone exists, so layer GC is automatic once all referencing templates are evicted.
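
The chain described above can be sketched as a dry run that prints the `zfs` commands rather than executing them. `plan_layer_chain` is a hypothetical helper, not the code from Plan G. Creating the base layer with `zfs create -p` is an assumption, and the layer-apply step is left as a placeholder comment.

```bash
# plan_layer_chain STORE CONFIG_SHA DIGEST...
# Digests are given base first, top last. Prints the planned commands only.
plan_layer_chain() (
    store=$1
    config=$2
    shift 2
    prev=""
    for digest in "$@"; do
        ds="${store}/.layers/${digest}"
        if [ -z "${prev}" ]; then
            # Base layer: a fresh dataset (assumption).
            echo "zfs create -p ${ds}"
        else
            # Every later layer is a clone of the previous layer's @done.
            echo "zfs clone ${prev}@done ${ds}"
        fi
        echo "# apply layer ${digest} into ${ds} (extraction step omitted)"
        echo "zfs snapshot ${ds}@done"
        prev=${ds}
    done
    # An image needs at least one layer.
    [ -n "${prev}" ] || return 1
    # The template is a clone of the chain leaf, frozen as @pristine.
    echo "zfs clone ${prev}@done ${store}/.templates/${config}"
    echo "zfs snapshot ${store}/.templates/${config}@pristine"
)

plan_layer_chain 'tank/enroot' 'cfgsha' 'sha256:base' 'sha256:top'
```

A real implementation would presumably skip any layer whose `@done` snapshot already exists, since that reuse is what lets two images share a base layer. The sketch omits that check to stay short.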

Mountpoints follow the dataset hierarchy under `ENROOT_DATA_PATH`. Templates are not user-visible — `enroot list` only enumerates `<user>/<container_name>` clones. Templates have `readonly=on`; clones inherit the property override on `start -w`.

The `templates` dataset is shared across all users on the host. Its quota and properties are admin-controlled (see below).
2 changes: 1 addition & 1 deletion pkg/deb/control
@@ -29,7 +29,7 @@ Depends: ${shlibs:Depends}, ${misc:Depends},
# tar,
# util-linux,
# ncurses-bin
Recommends: pigz
Recommends: pigz, attr
Suggests: libnvidia-container-tools, squashfuse, fuse-overlayfs
Description: Unprivileged container sandboxing utility
A simple yet powerful tool to turn traditional container/OS images into
15 changes: 14 additions & 1 deletion src/docker.sh
@@ -326,6 +326,13 @@ docker::_prepare_layers() (
    zstd -q -d -o config "${ENROOT_CACHE_PATH}/${config}"
    docker::configure "${PWD}/0" config "${arch}"

    # Side-emit the ordered layer-digest list to ./.layers (one per line, base
    # first, top last). The ZFS chain-mode path (Plan G) reads this back to
    # build the per-layer dataset chain. Plan F and dir-backend callers ignore
    # the file; it lives in the caller's per-call mktmpdir so it gets cleaned
    # up alongside the rest of the extraction temp dir.
    printf "%s\n" "${layers[@]}" > .layers

    printf "%s\n%s\n" "${config}" "${#layers[@]}"
)
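
The `.layers` hand-off between `docker::_prepare_layers` and `docker::load` amounts to a `printf` / `readarray -t` round trip. The standalone sketch below reproduces it with made-up digests and a throwaway directory, both of which are assumptions for illustration only.

```bash
#!/usr/bin/env bash
# Throwaway directory standing in for the per-call temp dir.
workdir=$(mktemp -d)
trap 'rm -rf "${workdir}"' EXIT

# Producer side: one digest per line, base first, top last.
layers=("sha256:aaa" "sha256:bbb" "sha256:ccc")   # hypothetical digests
printf "%s\n" "${layers[@]}" > "${workdir}/.layers"

# Consumer side: -t strips the trailing newline from each element.
layer_digests=()
readarray -t layer_digests < "${workdir}/.layers"

last=$(( ${#layer_digests[@]} - 1 ))
echo "count=${#layer_digests[@]} base=${layer_digests[0]} top=${layer_digests[last]}"
```

One edge worth knowing: with an empty `layers` array, `printf "%s\n"` still writes a single blank line, so the reader would see one empty element rather than zero. That should not arise for real images, which always have at least one layer.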

@@ -545,7 +552,13 @@ docker::load() (
    fi

    if zfs::enabled; then
        zfs::docker_install_from_layers "${config}" "${layer_count}" "${unpriv}" "${name}"
        if zfs::layer_chain_active; then
            local -a layer_digests=()
            readarray -t layer_digests < .layers
            zfs::docker_install_from_layers "${config}" "${layer_count}" "${unpriv}" "${name}" "${layer_digests[@]}"
        else
            zfs::docker_install_from_layers "${config}" "${layer_count}" "${unpriv}" "${name}"
        fi
    else
        # Create a mount namespace and overlay mount
        mkdir -p rootfs "${name}"