Skip to content

bug: claude provider times out after 2 mins #521

@christso

Description

@christso

Summary

Claude eval runs are failing after roughly 2 minutes even though the target timeout is configured to 3600 seconds.

What I observed

This does not currently look like a normal AgentV timeout path.

Across two runs, the outcome was:

  • execution_status: execution_error
  • failure_reason_code: provider_error
  • error message: Claude Code process aborted by user

So the failure appears to be a Claude provider / Claude Code subprocess abort rather than AgentV enforcing the configured timeout.

Environment

  • AgentV local checkout using the built-in claude provider
  • Claude provider implementation is SDK-backed via @anthropic-ai/claude-agent-sdk
  • That SDK then spawns a Claude Code subprocess underneath
  • Claude Code installed locally: claude 2.1.62
  • Claude runtime reported in logs: claude-opus-4-6
  • AgentV target timeout: 3600 seconds
  • Static workspace path used:
    • /tmp/agentv-cargowise-customs-static-claude

Target config

- name: claude
  provider: claude
  timeout_seconds: 3600
  judge_target: llm
  log_format: json

Reproduction

Command:

cd /home/christso/projects/WTG.AI.Prompts
export AGENTV_CARGOWISE_CUSTOMS_WORKSPACE_PATH=/tmp/agentv-cargowise-customs-static-claude
agentv eval evals/cargowise-customs/customs-validation-rules/customs-rule-codes-configuration.yaml \
  --targets .agentv/targets.yaml \
  --target claude \
  --test-id add-rule-code-to-country \
  --workers 1 \
  --output /tmp/customs-add-rule-code-claude-static.jsonl \
  --output-format jsonl \
  --verbose

Reran the same command with:

--output /tmp/customs-add-rule-code-claude-static-rerun.jsonl

Behavior

For both runs:

  • static workspace setup succeeded
  • before_all completed successfully
  • Claude progressed normally through discovery for about 2 minutes
  • AgentV then recorded:
    • Claude Code process aborted by user

This reproduced twice, so it does not currently look intermittent.

Artifacts

First run:

  • Result JSONL:
    • /tmp/customs-add-rule-code-claude-static.jsonl
  • AgentV result file:
    • /home/christso/projects/WTG.AI.Prompts/.agentv/results/eval_2026-03-10T11-00-07-414Z.jsonl
  • Claude log:
    • /home/christso/projects/WTG.AI.Prompts/.agentv/logs/claude/2026-03-10T11-00-10-160Z_claude_add-rule-code-to-country_attempt-1_63681e6a.log

Second run:

  • Result JSONL:
    • /tmp/customs-add-rule-code-claude-static-rerun.jsonl
  • AgentV result file:
    • /home/christso/projects/WTG.AI.Prompts/.agentv/results/eval_2026-03-10T11-06-58-388Z.jsonl
  • Claude log:
    • /home/christso/projects/WTG.AI.Prompts/.agentv/logs/claude/2026-03-10T11-07-00-987Z_claude_add-rule-code-to-country_attempt-1_cad57748.log

Additional clue from logs

Claude init event includes:

  • apiKeySource: "none"
  • model: "claude-opus-4-6"

Likely next investigation

  • investigate why the SDK-backed Claude provider is surfacing Claude Code process aborted by user
  • compare behavior with a direct Claude CLI execution path to see whether the issue is in:
    • AgentV Claude provider integration
    • the Claude Agent SDK layer
    • or Claude Code subprocess handling itself

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions