111 changes: 111 additions & 0 deletions docs/plans/sandbox-redesign-implementation-plan.md
@@ -0,0 +1,111 @@
# Sandbox Redesign Implementation Plan

Created: 2026-02-09T07:21:38Z (UTC)
Source: `docs/sandbox-redesign.md`

## How To Mark Progress
- Use `[ ]` for not started, `[~]` for in progress, `[x]` for done.
- When a task is done, fill in `Completed At (UTC)` with an ISO timestamp (example: `2026-02-09T08:05:12Z`).

## Implementation Plan

### Phase 0: Scaffolding
- [x] Create `engine/sandbox/types.go` with `SandboxPolicy`, `SandboxInput`, `SandboxOutput`, `SandboxRunner`
Completed At (UTC): `2026-02-09T07:24:21Z`
- [x] Create Linux package skeleton:
`engine/sandbox/linux/runner.go`, `engine/sandbox/linux/namespaces.go`, `engine/sandbox/linux/filesystem.go`, `engine/sandbox/linux/cgroups.go`
Completed At (UTC): `2026-02-09T07:24:21Z`
- [x] Add unit tests for types/mapping basics under `engine/sandbox`
Completed At (UTC): `2026-02-09T07:24:21Z`

### Phase 1: Namespace + Filesystem Isolation
- [x] Implement isolated process execution in new namespaces (PID, mount, UTS, IPC, and network, with networking disabled by default) in `engine/sandbox/linux/namespaces.go`
Completed At (UTC): `2026-02-09T07:57:51Z`
- [x] Implement minimal/scratch filesystem setup with writable `/work` bind mount in `engine/sandbox/linux/filesystem.go`
Completed At (UTC): `2026-02-09T07:57:51Z`
- [x] Implement stdout/stderr/exit-code capture in parent process path in `engine/sandbox/linux/runner.go`
Completed At (UTC): `2026-02-09T07:57:51Z`
- [x] Make compile and run steps share the same workspace so compiled artifacts persist between steps
Completed At (UTC): `2026-02-09T07:57:51Z`

### Phase 2: Cgroups v2 Limits
- [~] Create and clean up per-run cgroup in `engine/sandbox/linux/cgroups.go`
Completed At (UTC): ``
- [~] Apply `memory.max`, `cpu.max`, and `pids.max` before command execution
Completed At (UTC): ``
- [ ] Ensure subprocess trees are also constrained by cgroup limits
Completed At (UTC): ``

### Phase 3: Wire Into Existing Runtime/Controller Flow
- [x] Add runtime adapter: update `engine/runtime/runtime_agent.go` to execute via `SandboxRunner` (without breaking controller behavior)
Completed At (UTC): `2026-02-09T07:57:51Z`
- [x] Keep `engine/controller/controller.go` flow unchanged (write source -> optional compile -> run -> cleanup), but route execution through sandboxed runtime path
Completed At (UTC): `2026-02-09T07:57:51Z`
- [ ] Update `server/main.go` runner construction to initialize sandbox-capable runtime wiring
Completed At (UTC): ``
- [x] Preserve current API behavior in `server` and `engine/coderunner/v2` output semantics
Completed At (UTC): `2026-02-09T07:57:51Z`

### Phase 4: Optional Hardening
- [ ] Add `no_new_privileges` and capability dropping in Linux runner path
Completed At (UTC): ``
- [ ] Add seccomp allowlist and validation tests
Completed At (UTC): ``

## Testing Strategy

### 1) Fast Unit Tests (Default CI Path)
- [x] Add/update unit tests for sandbox types and policy conversion logic
Command: `go test ./engine/sandbox/...`
Completed At (UTC): `2026-02-09T07:24:21Z`
- [ ] Update runtime unit tests to mock sandbox execution and keep readiness/state behavior coverage
Command: `go test ./engine/runtime/...`
Completed At (UTC): ``
- [ ] Keep coderunner/controller tests green while swapping execution backend
Commands: `go test ./engine/controller/...` and `go test ./engine/coderunner/v2/...`
Completed At (UTC): ``

### 2) Linux Sandbox Integration Tests (Privileged/Tagged)
- [x] Namespace isolation test (`TestNamespaces`) verifies PID/mount/network isolation
Command: `go test ./engine/sandbox/linux -run TestNamespaces`
Completed At (UTC): `2026-02-09T07:57:51Z`
- [x] Filesystem visibility test verifies minimal root and controlled `/work` mount
Command: `go test ./engine/sandbox/linux -run TestFilesystem`
Completed At (UTC): `2026-02-09T07:57:51Z`
- [~] Cgroup enforcement test (`TestCgroupLimits`) verifies CPU/memory/pids limits
Command: `go test ./engine/sandbox/linux -run TestCgroupLimits`
Completed At (UTC): ``

### 3) End-to-End and API Validation
- [ ] End-to-end language flow test for interpreted and compiled paths (at least Python + C++)
Command: `go test ./engine/coderunner/v2 -run TestRunner`
Completed At (UTC): ``
- [ ] API smoke tests still pass with sandbox backend
Command: `go test ./server/...`
Completed At (UTC): ``

### 4) Regression and Operational Validation
- [x] Full repo test run remains green
Command: `go test ./...`
Completed At (UTC): `2026-02-09T07:57:51Z`
- [ ] Container image build/run still works with runtime dependencies available to sandbox
Commands: `docker build -f docker/server-debian/Dockerfile .` and runtime smoke check
Completed At (UTC): ``

## Risks To Track During Execution
- [ ] Runtime binaries/libs not mounted correctly into scratch FS (language commands fail at runtime)
Completed At (UTC): ``
- [ ] Privilege/capability requirements for namespace/cgroup setup differ between local/dev/CI environments
Completed At (UTC): ``
- [ ] Compile and run workspace continuity regressions for compiled languages
Completed At (UTC): ``

## Execution Log
- 2026-02-09T07:21:38Z: Plan file created in `docs/plans/`.
- 2026-02-09T07:24:21Z: Completed Phase 0 scaffolding files under `engine/sandbox` and `engine/sandbox/linux`.
- 2026-02-09T07:24:21Z: Ran `go test ./engine/sandbox/...` successfully.
- 2026-02-09T07:24:21Z: Attempted `go test ./...`; blocked in this environment by restricted network access to module download endpoints.
- 2026-02-09T07:57:51Z: Implemented Linux sandbox runner (namespaces + minimal scratch-style root + `/work` bind mount) and wired `RuntimeAgent` to use it.
- 2026-02-09T07:57:51Z: Fixed controller agent double-booking by adding non-blocking agent claim.
- 2026-02-09T07:57:51Z: Fixed coderunner compile step to be optional (no empty pre-run command).
- 2026-02-09T07:57:51Z: Ran `go test ./...` successfully (with escalated permissions).
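
The non-blocking agent claim mentioned in the log can be sketched in isolation as a one-token buffered channel. This standalone version mirrors the idea behind `tryClaim`/`releaseClaim` in `engine/controller/controller.go`; the `slot` type and `newSlot` are hypothetical names used only for this example.

```go
package main

import "fmt"

// slot is a hypothetical stand-in for an agent entry. The channel holds at
// most one token; holding the token means holding the claim.
type slot struct {
	claim chan struct{}
}

// newSlot returns a slot whose token is available.
func newSlot() *slot {
	ch := make(chan struct{}, 1)
	ch <- struct{}{}
	return &slot{claim: ch}
}

// tryClaim takes the token if it is available and never blocks.
func (s *slot) tryClaim() bool {
	select {
	case <-s.claim:
		return true
	default:
		return false
	}
}

// release puts the token back. If a token is already present (for example
// after a double release), the extra one is dropped instead of blocking.
func (s *slot) release() {
	select {
	case s.claim <- struct{}{}:
	default:
	}
}

func main() {
	s := newSlot()
	fmt.Println(s.tryClaim()) // first caller gets the claim
	fmt.Println(s.tryClaim()) // second caller is turned away immediately
	s.release()
	fmt.Println(s.tryClaim()) // claim is available again
}
```

Compared with checking a readiness flag alone, taking the token is a single atomic step, so two concurrent requests cannot both observe the same agent as free.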
41 changes: 22 additions & 19 deletions engine/coderunner/v2/runner.go
@@ -19,27 +19,28 @@ func (cr *CodeRunner) Run(props *RunnerProps) (*RunnerOutput, error) {

language := LangNameToLangMap[props.Lang]
filename := "run" + language.FileExtension
compileCmd := language.CompileCmd
if language.CompileCmd != nil {
var compileCommands *runtime.RunProps
if len(language.CompileCmd) > 0 {
compileCmd := append([]string{}, language.CompileCmd...)
compileCmd = append(compileCmd, filename)
}

compileCommands := &runtime.RunProps{
RunArgs: compileCmd,
Timeout: runtime.DefaultTimeout,
Nprocs: runtime.DefaultNproc,
Fsize: runtime.DefaultCompileFsize,
Stacksize: runtime.DefaultCompileStackSize,
Cputime: runtime.DefaultCputime,
}
compileCommands = &runtime.RunProps{
RunArgs: compileCmd,
Timeout: runtime.DefaultTimeout,
Nprocs: runtime.DefaultNproc,
Fsize: runtime.DefaultCompileFsize,
Stacksize: runtime.DefaultCompileStackSize,
Cputime: runtime.DefaultCputime,
}

// Language-specific modifications
// Rust has large binaries, even for simple applications
//
// ... is there a better way to do this without switching on names?
switch language.Name {
case "rust":
compileCommands.Fsize = 1 << 25 // 32 MiB
// Language-specific modifications
// Rust has large binaries, even for simple applications
//
// ... is there a better way to do this without switching on names?
switch language.Name {
case "rust":
compileCommands.Fsize = 1 << 25 // 32 MiB
}
}

runCommands := language.RunCmd
@@ -54,7 +55,9 @@ func (cr *CodeRunner) Run(props *RunnerProps) (*RunnerOutput, error) {
}

print2.DebugPrintf("writing file: %v", props.Source)
print2.DebugPrintf("compile commands: %v", compileCommands.RunArgs)
if compileCommands != nil {
print2.DebugPrintf("compile commands: %v", compileCommands.RunArgs)
}
print2.DebugPrintf("run commands: %v", runtimeProps.RunArgs)
runOut := cr.controller.SubmitRequest(&controller.Props{
Data: writerremover.NewBlob([]byte(props.Source), filename),
165 changes: 117 additions & 48 deletions engine/controller/controller.go
@@ -2,6 +2,7 @@ package controller

import (
"errors"
"os"
"path/filepath"
"strconv"
"sync"
@@ -38,9 +39,16 @@ type agentData struct {
rwmutex sync.RWMutex
agent runtime.Runtime
writerRemover writerremover.BlobWriterRemover
claim chan struct{}
}

func NewAsyncControllerWithMap(agents map[uint]*agentData) *AsyncController {
for _, a := range agents {
if a.claim == nil {
a.claim = make(chan struct{}, 1)
a.claim <- struct{}{}
}
}
return &AsyncController{agents}
}

@@ -50,15 +58,50 @@ func NewAsyncController(size uint, provider runtime.ArgProvider, parentWorkdir s
for i := uint(0); i < size; i++ {
key := uint(i + 1)
workdir := filepath.Join(parentWorkdir, pattern+strconv.FormatInt(int64(key), 10))
if parentWorkdir != "" {
if err := os.MkdirAll(workdir, 0o755); err != nil {
print2.DebugPrintf("failed to create runner workdir %q: %v", workdir, err)
}
}
agents[key] = &agentData{
rwmutex: sync.RWMutex{},
agent: runtime.NewRuntimeAgentWithIds("agent"+strconv.FormatInt(int64(key), 10), int(key), provider, workdir),
writerRemover: writerremover.NewWorkdirWriter(workdir, 0644),
claim: newClaim(),
}
}
return &AsyncController{agents}
}

func newClaim() chan struct{} {
ch := make(chan struct{}, 1)
ch <- struct{}{}
return ch
}

func (a *agentData) tryClaim() bool {
if a.claim == nil {
// legacy: behave like "no claim" semantics
return true
}
select {
case <-a.claim:
return true
default:
return false
}
}

func (a *agentData) releaseClaim() {
if a.claim == nil {
return
}
select {
case a.claim <- struct{}{}:
default:
}
}

var (
NoRunnerIsReady = CtrlErr(errors.New("no runner available"))
InvalidInput = CtrlErr(errors.New("invalid input"))
@@ -80,65 +123,91 @@ func (ac *AsyncController) SubmitRequest(runprops *Props) *CtrlRunOutput {
}

for _, agentData := range ac.agents {
if agentData.agent.IsReady() {

// unpack these, easier to reference below
agent := agentData.agent
writerRemover := agentData.writerRemover
preRunProps := runprops.PreRunProps
runProps := runprops.RunProps
data := runprops.Data

// before the pre-run step, write the source blob to the workdir
err := writerRemover.Write(data)
if err != nil {
print2.DebugPrintf("error writing file before running command: %v", err)
return &CtrlRunOutput{
ControllerErr: PreRunWriteError,
RunOutput: nil,
CommandErr: nil,
}
}

if runprops.PreRunProps != nil {
preRunOut, commandErr := agent.SafeRunCmd(preRunProps)
if commandErr != nil {
print2.DebugPrintf("error preparing command: output=%v\n \nerror=%v", preRunOut, commandErr)
return &CtrlRunOutput{
ControllerErr: nil,
RunOutput: preRunOut,
CommandErr: commandErr,
}
}
if !agentData.agent.IsReady() {
continue
}
if !agentData.tryClaim() {
continue
}
defer agentData.releaseClaim()

// unpack these, easier to reference below
agent := agentData.agent
writerRemover := agentData.writerRemover
preRunProps := runprops.PreRunProps
runProps := runprops.RunProps
data := runprops.Data

// before the pre-run step, write the source blob to the workdir
err := writerRemover.Write(data)
if err != nil {
print2.DebugPrintf("error writing file before running command: %v", err)
return &CtrlRunOutput{
ControllerErr: PreRunWriteError,
RunOutput: nil,
CommandErr: nil,
}
}

// the actual command must be run as non-root user
runOutput, commandErr := agent.SafeRunCmd(&runtime.RunProps{
RunArgs: runProps.RunArgs,
Timeout: runtime.DefaultTimeout,
Nprocs: runtime.DefaultNproc,
Fsize: runtime.DefaultFsize,
Cputime: runtime.DefaultCputime,
Stacksize: runtime.DefaultStackSize,
Uid: agentData.agent.RuntimeUid(),
Gid: agentData.agent.RuntimeGid(),
})

err = writerRemover.Remove()
if err != nil {
print2.DebugPrintf("error cleaning up: %v", err)
if preRunProps != nil && len(preRunProps.RunArgs) > 0 {
preRunOut, commandErr := agent.SafeRunCmd(preRunProps)
if commandErr != nil {
print2.DebugPrintf("error preparing command: output=%v\n \nerror=%v", preRunOut, commandErr)
return &CtrlRunOutput{
ControllerErr: PostRunPurgeError,
RunOutput: runOutput,
ControllerErr: nil,
RunOutput: preRunOut,
CommandErr: commandErr,
}
}
}

timeout := runProps.Timeout
if timeout <= 0 {
timeout = runtime.DefaultTimeout
}
nprocs := runProps.Nprocs
if nprocs <= 0 {
nprocs = runtime.DefaultNproc
}
fsize := runProps.Fsize
if fsize <= 0 {
fsize = runtime.DefaultFsize
}
cputime := runProps.Cputime
if cputime <= 0 {
cputime = runtime.DefaultCputime
}
stacksize := runProps.Stacksize
if stacksize <= 0 {
stacksize = runtime.DefaultStackSize
}

// the actual command must be run as non-root user
runOutput, commandErr := agent.SafeRunCmd(&runtime.RunProps{
RunArgs: runProps.RunArgs,
Timeout: timeout,
Nprocs: nprocs,
Fsize: fsize,
Cputime: cputime,
Stacksize: stacksize,
Uid: agentData.agent.RuntimeUid(),
Gid: agentData.agent.RuntimeGid(),
})

err = writerRemover.Remove()
if err != nil {
print2.DebugPrintf("error cleaning up: %v", err)
return &CtrlRunOutput{
ControllerErr: nil,
ControllerErr: PostRunPurgeError,
RunOutput: runOutput,
CommandErr: commandErr,
}
}
return &CtrlRunOutput{
ControllerErr: nil,
RunOutput: runOutput,
CommandErr: commandErr,
}
}

return &CtrlRunOutput{