bug: codex-acp Discord session can hang after auto-approved shell escalation #286

@chihkang

Description

When running openab with the Codex agent (codex-acp) on Kubernetes, a Discord thread can become permanently stuck after a shell command first fails under the default Codex sandbox/network restrictions and is then retried through the auto-approved escalation path.

In my case, this happened while asking the bot to run:

HOME=/home/node gh issue view 269 -R openabdev/openab

The bot acknowledged the request, then appeared to switch to an elevated path, and never sent a final reply back to Discord.

This appears to be both:

  • a documentation gap in the current Codex deployment guide, and
  • a runtime/integration bug in the escalation path

Steps to Reproduce

  1. Deploy openab with codex-acp using the documented/basic setup from docs/codex.md.
  2. Authenticate gh successfully inside the agent container.
  3. In Discord, ask the bot to run a GitHub CLI command that needs network access, for example:
    HOME=/home/node gh issue view 269 -R openabdev/openab
  4. Observe that the first attempt fails because the command cannot reach api.github.com from inside the default Codex sandbox.
  5. Observe that openab auto-allows the follow-up permission/escalation request.
  6. The Discord thread never receives a final answer and appears stuck.
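To separate a sandbox restriction from a cluster-level network problem before reproducing step 4, a plain TCP probe can be run inside the agent container, outside the Codex sandbox. This is a minimal sketch, not part of openab or codex-acp; the helper name `can_connect` is made up for illustration, and it only checks that a TCP connection opens, not that `gh` authentication works.

```python
import socket


def can_connect(host: str, port: int = 443, timeout_s: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # OSError covers DNS failures, refused connections,
        # unreachable networks, and timeouts.
        return False
```

For example, `can_connect("api.github.com")` returning True in the container while the sandboxed `gh` call still fails would point at the sandbox/network restriction rather than at k3s networking.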

Expected Behavior

One of the following should happen:

  1. the escalated command completes and the agent replies normally, or
  2. the agent returns a clear failure message to Discord

It should not silently hang after an auto-approved escalation.
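To illustrate the expected behavior, here is a rough sketch of a deadline-bounded command runner that always produces some reply text, whether the command succeeds, fails, or never returns. It is an assumption about how such a guard could look, not how openab or codex-acp actually executes escalated commands; `run_with_deadline` and its message wording are hypothetical.

```python
import subprocess


def run_with_deadline(cmd: list[str], timeout_s: float) -> str:
    """Run cmd and always return a message that could be posted to the thread."""
    try:
        proc = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing is left hanging.
        return f"Command timed out after {timeout_s:g}s: {' '.join(cmd)}"
    except OSError as exc:
        # For example, the executable does not exist or is not permitted to run.
        return f"Command could not be started: {exc}"
    if proc.returncode != 0:
        return f"Command failed (exit {proc.returncode}): {proc.stderr.strip()}"
    return proc.stdout
```

With a guard like this around the escalated path, the thread would get either the command output or an explicit timeout/failure message instead of silence.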

Environment

  • openab chart/app version: 0.7.1
  • deployment: Helm on k3s / Kubernetes
  • agent: codex-acp
  • image: ghcr.io/openabdev/openab-codex:78f8d2c
  • working dir: /home/node
  • Discord integration enabled
  • GitHub CLI (gh) installed and authenticated inside the agent container
  • Initial generated Codex config was effectively:
    [agent]
    command = "codex-acp"
    args = []
    working_dir = "/home/node"

Screenshots / Logs

Observed signals:

  • shell error: error connecting to api.github.com
  • openab log showed an auto-allowed permission for the retried command
  • the Codex session transcript stopped after the escalated tool call
  • no task_complete / final reply was emitted
  • in this container/runtime, workspace-write sandboxing also produced:
    bwrap: Failed to make / slave: Permission denied
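The "transcript stopped after the escalated tool call" signal could be checked mechanically. The sketch below assumes the session transcript can be reduced to an ordered list of event type names; that reduction and the `tool_call` name are assumptions for illustration, and only `task_complete` comes from the observed logs.

```python
def looks_stalled(event_types: list[str]) -> bool:
    """Return True if a tool call was recorded with no task_complete after it."""
    last_tool_index = None
    for index, name in enumerate(event_types):
        if name == "tool_call":  # hypothetical event name
            last_tool_index = index
    if last_tool_index is None:
        # No tool call at all, so this is not the pattern described here.
        return False
    return "task_complete" not in event_types[last_tool_index + 1:]
```

A watchdog could combine a check like this with an idle timer to post a failure message to the thread once a session has matched the pattern for too long.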

Additional notes:
