Harden task-pod claim with atomic reservation and rollback#3595

Open
tomsmith8 wants to merge 1 commit into master from codex/task-pod-safety-phase1
@tomsmith8
Collaborator

Summary

  • Centralize task↔pod claim flow through shared helpers (claimTaskPodAndSetup, claimPodForTaskAtomically, attachPodToTaskAtomically) so reservation, setup, credential persistence, and rollback all go through one path
  • Fix a race condition in which concurrent claims for the same task could each reserve a different pod, leaking a pod on every retry
  • Add rollback: if any post-claim step fails (frontend discovery, control port, credentials), the pod is released back to the pool and task fields are cleared
  • Artifact sync path (/api/chat/response) now rejects nonexistent pod links and refuses password-only updates unless the pod row exists
  • releasePodById now clears agentUrl alongside podId and agentPassword for symmetrical cleanup
  • Capacity read prefers task.podId over usageStatusMarkedBy as the authoritative link
  • Remove fake persisted podId: "local-dev" from custom Goose dev paths
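
For reviewers, here is a minimal sketch of the reserve, setup, and rollback flow described above. It is an in-memory model, not the code in this PR: the real helpers presumably run inside database transactions, and the names `reserveFreePod`, `releasePod`, and `claimWithRollback`, as well as the `Pod`, `Task`, and `Credentials` shapes, are hypothetical stand-ins chosen for illustration.

```typescript
// Hypothetical in-memory model. The real helpers presumably run inside
// database transactions; every name here is illustrative only.
type Pod = { id: string; claimedBy: string | null };
type Task = {
  id: string;
  podId: string | null;
  agentUrl: string | null;
  agentPassword: string | null;
};
type Credentials = { agentUrl: string; agentPassword: string };

const pods: Pod[] = [];
const tasks: Task[] = [];

function getTask(taskId: string): Task {
  const task = tasks.filter((t) => t.id === taskId)[0];
  if (!task) throw new Error("unknown task " + taskId);
  return task;
}

// Reserve one pod for the task. If the task already holds a pod, return it
// instead of taking another, so retries cannot consume extra pods.
function reserveFreePod(taskId: string): Pod {
  const task = getTask(taskId);
  const held = pods.filter((p) => p.claimedBy === taskId)[0];
  if (held) {
    task.podId = held.id;
    return held;
  }
  const free = pods.filter((p) => p.claimedBy === null)[0];
  if (!free) throw new Error("no free pods");
  free.claimedBy = taskId; // pod side of the link
  task.podId = free.id; // task side of the link
  return free;
}

// Release the pod and clear every task field that points at it.
function releasePod(podId: string): void {
  for (let i = 0; i < pods.length; i++) {
    if (pods[i].id === podId) pods[i].claimedBy = null;
  }
  for (let i = 0; i < tasks.length; i++) {
    if (tasks[i].podId === podId) {
      tasks[i].podId = null;
      tasks[i].agentUrl = null;
      tasks[i].agentPassword = null;
    }
  }
}

// Claim, then run setup; any setup failure rolls the reservation back.
function claimWithRollback(
  taskId: string,
  setup: (pod: Pod) => Credentials
): Task {
  const pod = reserveFreePod(taskId);
  try {
    const creds = setup(pod);
    const task = getTask(taskId);
    task.agentUrl = creds.agentUrl;
    task.agentPassword = creds.agentPassword;
    return task;
  } catch (err) {
    releasePod(pod.id);
    throw err;
  }
}
```

In this model a failed `setup` leaves both the pool and the task exactly as they were before the claim, and a second claim for a task that already holds a pod returns the same pod rather than reserving a new one.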

Test plan

  • Concurrent same-task claim: 5 parallel claims result in exactly 1 pod claimed, 1 task.podId
  • Rollback on failure: pod released and task cleared when setup fails after reservation (both /claim-pod and /api/agent)
  • Retry safety: failed claim + retry does not consume additional pods
  • Attach validation: nonexistent pods rejected, already-assigned pods rejected
  • Release symmetry: agentUrl cleared alongside podId and agentPassword
  • Artifact attach: valid pod attached via transactional helper; stale pod links rejected
  • Phase 2 schema migration (pods.assignedTaskId FK + unique constraint) not yet landed
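
The attach-validation cases above can be pictured with a small sketch. This is an assumption-laden model rather than the helper in this PR: `attachPodToTask`, the `AttachResult` shape, and the reason strings are hypothetical, and treating a re-attach by the same task as a no-op success is an assumption the PR description does not state.

```typescript
// Hypothetical model of attach validation; names and result shape are
// illustrative and not taken from the PR's code.
type Pod = { id: string; claimedBy: string | null };
type AttachResult = {
  ok: boolean;
  reason: "pod_not_found" | "pod_already_assigned" | null;
};

function attachPodToTask(pool: Pod[], taskId: string, podId: string): AttachResult {
  const pod = pool.filter((p) => p.id === podId)[0];
  // Reject links to pods that do not exist.
  if (!pod) return { ok: false, reason: "pod_not_found" };
  // Reject pods held by another task. Assumption: re-attaching the pod a
  // task already holds is treated as a harmless no-op.
  if (pod.claimedBy !== null && pod.claimedBy !== taskId) {
    return { ok: false, reason: "pod_already_assigned" };
  }
  pod.claimedBy = taskId;
  return { ok: true, reason: null };
}
```

A caller in this sketch would check `ok` before persisting any pod-related fields on the task, which mirrors the "stale pod links rejected" item in the list above.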

@vercel

vercel bot commented Mar 25, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
hive Ignored Ignored Mar 25, 2026 3:19pm

@github-actions

Test environment is now live.

View it at: https://hive-preview-6.sphinx.chat

Database expires at: Mar 25, 2026, 8:21 PM UTC
