Bug: with_connection holds write lock during entire prompt streaming, blocking concurrent threads #315

@JARVIS-coding-Agent

Description

SessionPool::with_connection() in src/acp/pool.rs acquires self.state.write().await and holds the write lock for the entire duration of the prompt streaming (f(conn).await). Since every Discord thread message goes through with_connection, all conversations are serialized — thread B must wait for thread A to finish its full response before it can begin processing.

Each thread has its own independent AcpConnection in the active HashMap, so there is no reason they cannot run in parallel.

// pool.rs — current code (v0.7.3-beta.2 / main @ 5f21b43)
pub async fn with_connection<F, R>(&self, thread_id: &str, f: F) -> Result<R>
{
    let mut state = self.state.write().await;   // ← write lock acquired
    let conn = state.active
        .get_mut(thread_id)
        .ok_or_else(|| anyhow!("no connection for thread {thread_id}"))?;
    f(conn).await   // ← lock held during entire prompt streaming (seconds to minutes)
}

The call chain is: discord.rs stream_prompt() → pool.with_connection() → conn.session_prompt() + full response streaming. The lock is not released until the agent finishes its entire reply.
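To make the effect concrete, here is a minimal std-only sketch of the same locking shape. It is not the openab code: it assumes a synchronous closure and std::sync::RwLock in place of the async lock in pool.rs, and Conn, State, Pool, and second_caller_is_locked_out are hypothetical stand-ins. While the closure for thread-a is running, any other attempt to take the pool lock fails.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

// Hypothetical stand-in for AcpConnection.
struct Conn;

// Hypothetical stand-in for the pool state holding the `active` map.
struct State {
    active: HashMap<String, Conn>,
}

struct Pool {
    state: RwLock<State>,
}

impl Pool {
    // Same shape as the code in this issue, but synchronous.
    fn with_connection<R>(
        &self,
        thread_id: &str,
        f: impl FnOnce(&mut Conn) -> R,
    ) -> Result<R, String> {
        let mut state = self.state.write().unwrap(); // pool-wide write lock acquired
        let conn = state
            .active
            .get_mut(thread_id)
            .ok_or_else(|| format!("no connection for thread {thread_id}"))?;
        Ok(f(conn)) // lock is still held while f runs
    }
}

// Returns true if the pool lock is unavailable while thread-a's closure runs.
fn second_caller_is_locked_out() -> bool {
    let mut active = HashMap::new();
    active.insert("thread-a".to_string(), Conn);
    active.insert("thread-b".to_string(), Conn);
    let pool = Pool { state: RwLock::new(State { active }) };

    // Inside thread-a's closure, probe the lock the way thread-b would need it.
    pool.with_connection("thread-a", |_conn| pool.state.try_write().is_err())
        .unwrap()
}
```

The probe uses try_write so the sketch reports the contention instead of blocking. In the real pool, the second caller simply waits on write().await until the first reply has finished.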

Steps to Reproduce

  1. Deploy openab with any agent backend (e.g. Kiro CLI) with pool.maxSessions >= 2
  2. In a Discord channel, @mention the bot to create Thread A — bot starts responding
  3. While Thread A is still streaming its response, @mention the bot again to create Thread B
  4. Observe that Thread B shows no reaction emoji (👀) and does not begin processing until Thread A completes
  5. Once Thread A finishes, Thread B immediately starts processing
  6. Repeat with Thread A doing a long task (e.g. complex coding with tool calls) to make the blocking more obvious

This is reproducible 100% of the time.

Expected Behavior

Thread A and Thread B should process concurrently. Each thread has its own AcpConnection (separate child process), so there is no data dependency between them. The write lock should only be held long enough to retrieve the connection reference, not during the entire prompt lifecycle.
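One possible shape for this, sketched with std-only types under stated assumptions: each connection sits behind its own Arc<Mutex<...>>, and the map lock is held only long enough to clone that handle. This is an illustration, not a proposed patch. Conn, Pool, and run_two_threads are hypothetical, the closure is synchronous, and an async version would need an async-aware per-connection mutex instead of std::sync::Mutex.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Barrier, Mutex, RwLock};
use std::thread;

// Hypothetical stand-in for AcpConnection.
struct Conn {
    replies: u32,
}

// Hypothetical pool: the map has its own lock, and so does each connection.
struct Pool {
    active: RwLock<HashMap<String, Arc<Mutex<Conn>>>>,
}

impl Pool {
    fn with_connection<R>(
        &self,
        thread_id: &str,
        f: impl FnOnce(&mut Conn) -> R,
    ) -> Result<R, String> {
        // The read guard is a temporary, dropped at the end of this statement,
        // so the map lock is held only while the Arc is cloned.
        let found = self.active.read().unwrap().get(thread_id).cloned();
        let conn = found.ok_or_else(|| format!("no connection for thread {thread_id}"))?;

        // Only this connection is locked while f runs.
        let mut guard = conn.lock().unwrap();
        Ok(f(&mut *guard))
    }
}

// Runs two closures that must both be inside with_connection at the same
// time to get past the barrier. Returns the total number of replies recorded.
fn run_two_threads() -> u32 {
    let mut map = HashMap::new();
    for id in ["thread-a", "thread-b"] {
        map.insert(id.to_string(), Arc::new(Mutex::new(Conn { replies: 0 })));
    }
    let pool = Pool { active: RwLock::new(map) };
    let barrier = Barrier::new(2);

    thread::scope(|s| {
        for id in ["thread-a", "thread-b"] {
            let (pool, barrier) = (&pool, &barrier);
            s.spawn(move || {
                pool.with_connection(id, |conn| {
                    barrier.wait(); // would never return if calls were serialized
                    conn.replies += 1;
                })
                .unwrap();
            });
        }
    });

    let map = pool.active.read().unwrap();
    let total: u32 = map.values().map(|c| c.lock().unwrap().replies).sum();
    total
}
```

The barrier makes the check independent of timing: run_two_threads only returns if both closures overlap. Two messages for the same thread still serialize on that connection's mutex, which matches the one-connection-per-thread model described above.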

Environment

  • OpenAB: v0.7.3-beta.2 (also confirmed on main @ 5f21b43)
  • Agent: Kiro CLI (kiro-cli acp --trust-all-tools)
  • K3s v1.33 on Ubuntu 24.04 (linux/amd64)
  • pool.maxSessions: 10

Screenshots / Logs

openab logs showing sequential session creation — Thread B only spawns after Thread A completes:

01:49:04 INFO spawning agent cmd="kiro-cli"    ← Thread A starts
01:49:06 INFO initialized agent="Kiro CLI Agent"
01:49:07 INFO session created session_id=48046eef-...
         ← Thread A streaming for ~20 min, Thread B blocked
02:08:52 INFO spawning agent cmd="kiro-cli"    ← Thread B starts only after A finishes
02:08:53 INFO initialized agent="Kiro CLI Agent"
02:08:54 INFO session created session_id=11828c0e-...

Metadata

Assignees: No one assigned

Labels: bug, duplicate, p1, session
