
forkd-jupyter-spawner: JupyterHub Spawner adapter #21

@WaylandYang

Goal

Ship a forkd-jupyter-spawner Python package that implements
JupyterHub's Spawner interface, so JupyterHub admins can use forkd
as the per-user kernel backend instead of Docker, Kubernetes, or
local processes.

Why

The spawners in JupyterHub's current ecosystem (DockerSpawner, KubeSpawner,
SystemdSpawner, …) all pay a container or process cold-start cost per user.
forkd's recipes/jupyter-kernel/ parent already proves the SciPy
stack can be warmed once and shared across N children. The missing
piece is a Spawner class that talks to the forkd-controller daemon
on the host.

This is the natural production path for the "Where forkd fits" /
Jupyter-kernel use case currently surfaced in the README.

Scope

  • In scope. A standalone Python package forkd-jupyter-spawner
    (separate repo or sdk/python-spawner/ here) implementing a
    jupyterhub.spawner.Spawner subclass. Methods to wire:
    • start() → POST /v1/sandboxes on the forkd controller with
      a configured snapshot tag (e.g. jupyter-kernel).
    • poll() → GET /v1/sandboxes/:id to check liveness.
    • stop() → DELETE /v1/sandboxes/:id.
    • State persistence: get_state / load_state for surviving
      hub restarts.
    • Network: forward the kernel's ZMQ ports through the child's
      netns to the host so JupyterHub clients can connect.
  • Out of scope. Multi-host scheduling (the controller is single-
    host today); a custom Jupyter UI; replacing JupyterHub itself.
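
The method-to-endpoint mapping above could be sketched as a thin controller client that a Spawner subclass would delegate to. This is a stdlib-only sketch under stated assumptions, not the controller's documented API: the class name ForkdControllerClient, the "snapshot" request field, the "id" response field, and treating a 404 as "sandbox gone" are all hypothetical. The two state helpers only show the shape of the dict that get_state / load_state would round-trip.

```python
import json
import urllib.error
import urllib.request


class ForkdControllerClient:
    """Hypothetical thin client for the forkd controller's sandbox endpoints."""

    def __init__(self, base_url, token, opener=urllib.request.urlopen):
        self.base_url = base_url.rstrip("/")
        self.token = token
        self._open = opener  # injectable so tests can avoid real network I/O

    def _request(self, method, path, body=None):
        data = json.dumps(body).encode() if body is not None else None
        req = urllib.request.Request(
            self.base_url + path,
            data=data,
            method=method,
            headers={
                "Authorization": f"Bearer {self.token}",
                "Content-Type": "application/json",
            },
        )
        with self._open(req) as resp:
            raw = resp.read()
        return json.loads(raw) if raw else {}

    def start(self, snapshot_tag):
        # start() -> POST /v1/sandboxes. Field names "snapshot" and "id" are assumed.
        reply = self._request("POST", "/v1/sandboxes", {"snapshot": snapshot_tag})
        return reply["id"]

    def poll(self, sandbox_id):
        # poll() -> GET /v1/sandboxes/:id. JupyterHub's Spawner.poll contract is
        # None while running and an integer exit status once stopped.
        try:
            self._request("GET", f"/v1/sandboxes/{sandbox_id}")
        except urllib.error.HTTPError as exc:
            if exc.code == 404:  # assumption: 404 means the sandbox no longer exists
                return 0
            raise
        return None

    def stop(self, sandbox_id):
        # stop() -> DELETE /v1/sandboxes/:id
        self._request("DELETE", f"/v1/sandboxes/{sandbox_id}")


def dump_state(sandbox_id):
    """State a spawner's get_state() could add so the sandbox survives a hub restart."""
    return {"sandbox_id": sandbox_id} if sandbox_id else {}


def restore_state(state):
    """Inverse of dump_state, for use from load_state(); returns None if absent."""
    return state.get("sandbox_id")
```

In a real ForkdSpawner these calls would sit behind the async start / poll / stop methods, and the state helpers would be merged into the dict returned by the parent class's get_state.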

Known unknowns

  • Kernel readiness signal. The forkd-controller doesn't currently
    expose "the in-VM kernel is listening on ZMQ ports X/Y/Z."
    Either (a) extend the controller to forward arbitrary TCP ports
    through netns + return their host-side addresses, or (b) use the
    existing forkd-agent on :8888 as a control plane and have it
    start ipykernel and report ports back.
  • Resource limits. JupyterHub admins expect per-user memory /
    cpu quotas. forkd has cgroup memory.max today; cpu/io/pids quotas
    are still TODO.
  • Authentication. The hub authenticates the user; forkd-controller
    expects a bearer token. The spawner needs to pass that token
    through (via config).
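
Until the controller or agent exposes a real readiness signal, a spawner could fall back to probing a forwarded port from the host side. The helper below is a minimal sketch of that stopgap, assuming the kernel's ports are already reachable as plain TCP on some host address; the function name wait_for_port and its defaults are hypothetical. A successful connect only shows that something is listening, not that the kernel has finished initialising.

```python
import socket
import time


def wait_for_port(host, port, timeout_s=5.0, interval_s=0.05, connect_timeout_s=0.5):
    """Return True once a TCP connect to (host, port) succeeds, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # Open and immediately close a connection; success means a listener exists.
            with socket.create_connection((host, port), timeout=connect_timeout_s):
                return True
        except OSError:
            # Refused, unreachable, or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

Option (b) in the list above, where forkd-agent reports the ports back, would make this probe unnecessary, since the agent could confirm readiness before replying.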

Acceptance criteria

  • pip install forkd-jupyter-spawner works end-to-end.
  • A minimal JupyterHub config,
    c.JupyterHub.spawner_class = 'forkd_jupyter_spawner.ForkdSpawner',
    spawns a user kernel against a forkd snapshot tag.
  • Per-user kernel cold-start (hub start → cell-1 ready) is < 1 s
    given a warmed parent on the same host.
  • Tests: one integration test that brings up a local JupyterHub,
    spawns 5 kernels in parallel, asserts each can execute a numpy
    expression.
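
The parallel-spawn and latency criteria could be checked with a small asyncio harness like the one below. It is only a sketch: spawn_one is a hypothetical coroutine standing in for "spawn a kernel and evaluate a numpy expression", and the 1 s budget is taken from the criterion above. The harness times each spawn independently and reports which ones exceeded the budget.

```python
import asyncio
import time


async def spawn_in_parallel(spawn_one, n=5, budget_s=1.0):
    """Run n spawns concurrently; return (results, indices_over_budget).

    spawn_one(i) is a caller-supplied coroutine function (hypothetical here)
    that spawns kernel i and returns the value of a test expression.
    """

    async def timed(i):
        started = time.monotonic()
        value = await spawn_one(i)
        return i, value, time.monotonic() - started

    # gather() preserves argument order, so results[i] belongs to spawn i.
    results = await asyncio.gather(*(timed(i) for i in range(n)))
    over_budget = [i for i, _, elapsed in results if elapsed >= budget_s]
    return results, over_budget
```

An integration test would pass a spawn_one that drives a local JupyterHub, then assert that over_budget is empty and that every returned value matches the expected numpy result.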

Related

  • Recipe: recipes/jupyter-kernel/ (already shipped — proves the
    parent-rootfs story; spawner is the JupyterHub-side glue.)
  • README "Where forkd fits" → first bullet now explicitly names
    Jupyter-kernel sandboxes as the design point.

Labels

enhancement, help wanted
