Skip to content

ci(claude): bump max-turns 24→48 + sharpen anti-pattern in Tools section#452

Merged
heskew merged 1 commit intomainfrom
workflow/bump-max-turns-and-search-discipline
May 1, 2026
Merged

ci(claude): bump max-turns 24→48 + sharpen anti-pattern in Tools section#452
heskew merged 1 commit intomainfrom
workflow/bump-max-turns-and-search-discipline

Conversation

@heskew
Copy link
Copy Markdown
Member

@heskew heskew commented May 1, 2026

Summary

Fixes the immediate symptom from harper PR #450's failed review (run 25234303343): agent burned ~20 turns on `Bash(grep -rn …)` repo-wide searches, hit `--max-turns 24`, posted no review comment, log step skipped. Diff was 5 files / +83/−8 — exploration pattern, not diff size, drove the cost.

What changes

  1. `--max-turns 24 → 48` — band-aid. Doubles worst-case budget; ends the immediate dropouts on PRs with non-trivial cross-symbol exploration.
  2. `## Tools` section rewrite:
    • Adds a turn budget hint (1–2 turns context, 1 turn diff, ~1 per file, rest for searches + posting).
    • Names repo-wide search as THE biggest turn-burner with a concrete failure-mode citation ("recent run that timed out before posting anything").
    • Points at `Grep` tool with `path` parameter scoping as the right tool for the job.

Open question (followup)

In the failed run, the agent's `Bash(grep -rn …)` calls all came back with `is_error: false` — they ran. But `Bash(grep …)` shouldn't match any of the existing allowlist patterns (`Bash(git diff:)`, `Bash(git log:)`, `Bash(git blame:)`, `Bash(git show:)`, `Bash(gh ...:*)`).

Either:

  • claude-code-action permits any `Bash(...)` once the tool is enabled at all, or
  • there's a matching gotcha in our patterns.

Worth digging into so the prompt instruction is backed by an actual deny at the action level. Not blocking this PR.

Test plan

  • Push to a PR after this lands → confirm review completes and posts a marker'd comment.
  • Confirm the agent's tool-call pattern: prefers `Grep` over `Bash(grep)`, doesn't do repo-wide recursive searches.
  • Re-trigger PR Read-only mode #450's review (or wait for next push) → confirm review now lands.

Followup

oauth mirror once this merges.

🤖 Generated with Claude Code

Diagnosis from harper PR #450 review failure (run 25234303343):
agent burned ~20 turns on `Bash(grep -rn …)` searches across the
whole repo before hitting `--max-turns 24`, posted no review
comment, and the log step skipped (no marker'd comment to log).
Diff was small (5 files, +83/−8) — exploration pattern, not diff
size, was the cost driver.

Two changes:

1. **--max-turns 24 → 48.** Cheap band-aid that ends the immediate
   dropouts on PRs with non-trivial cross-symbol exploration. Per
   the existing comment, this is the real cost ceiling — bumping
   doubles the worst-case budget and moves us out of the 'one
   exploratory grep too many = no review at all' regime.

2. **Tools section: explicit turn-budget hint + sharpened
   anti-pattern.** Names "repo-wide search" as THE biggest
   turn-burner, points at the `Grep` tool with `path` scoping, and
   cites the failure mode ("recent run that timed out before
   posting anything") so the agent has concrete signal that
   matches its observed behavior.

Out of scope: investigating whether `Bash(grep …)` is supposed to
be denied by the existing `--allowedTools` list (current entries
scope to `Bash(git diff:*)`, `Bash(git log:*)`, `Bash(git blame:*)`,
`Bash(git show:*)`, `Bash(gh ...:*)` — `Bash(grep …)` shouldn't
match any pattern but ran cleanly in #450's failed run, with
`is_error: false` on every grep tool call). Either claude-code-
action permits any `Bash(...)` once the tool is enabled, or there's
a matching gotcha. Tracked as a follow-up; this PR addresses the
immediate symptom.

oauth needs the same change — will mirror after this lands.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@heskew heskew requested review from a team as code owners May 1, 2026 22:57
@heskew heskew merged commit 0118c1c into main May 1, 2026
22 of 25 checks passed
@heskew heskew deleted the workflow/bump-max-turns-and-search-discipline branch May 1, 2026 23:01
heskew added a commit to HarperFast/oauth that referenced this pull request May 1, 2026
Mirror of HarperFast/harper#452. Same fix applied to the oauth
review workflow: max-turns doubled, turn-budget hint added,
repo-wide search called out as the single biggest turn-burner.

The triggering harper run is referenced as the concrete failure
mode rather than re-running the same diagnosis on oauth (oauth
hasn't seen the same failure yet, but the prompt and budget are
identical, so the same exposure exists).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cb1kenobi pushed a commit that referenced this pull request May 4, 2026
Symptom: every PR was failing `Auth gate invariants / validate`
with `authorize job has no step setting USERS_TO_CHECK env var`,
even on workflows that clearly set it. Reported on PR #452 and
others.

Cause: the check `USERS_TO_CHECK presence` (added in #417 review-
fixup) was using a jq-flavored expression — `// empty` in
particular — but yq on ubuntu-latest is `mikefarah/yq` (Go), not
jq. yq's lexer rejects `empty`. The script had `2>/dev/null` on
the yq invocation, so the lexer error was eaten and the variable
came back as the empty string, tripping the existence check on
every workflow.

Fix: rewrite the expression in idiomatic yq:
- `.jobs.authorize.steps[].env.USERS_TO_CHECK | select(. != null)`
  — emits a stream of one value per step that sets the env var,
  skipping steps that don't.
- `head -1` collapses the stream to a single value (or empty)
  for the `[ -n ]` check.

Verified locally with mikefarah/yq v4.53.2 against all three
harper claude-*.yml workflows — all pass.

Out of scope but worth following up:
- The `2>/dev/null` swallowed a real yq error. Worth either
  removing the suppression or capturing stderr for diagnostics
  when the expression returns empty.
- oauth's mirror (#68) carries the same bug; a copy of this
  fix needs to land there too.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
kriszyp pushed a commit that referenced this pull request May 4, 2026
Symptom: every PR was failing `Auth gate invariants / validate`
with `authorize job has no step setting USERS_TO_CHECK env var`,
even on workflows that clearly set it. Reported on PR #452 and
others.

Cause: the check `USERS_TO_CHECK presence` (added in #417 review-
fixup) was using a jq-flavored expression — `// empty` in
particular — but yq on ubuntu-latest is `mikefarah/yq` (Go), not
jq. yq's lexer rejects `empty`. The script had `2>/dev/null` on
the yq invocation, so the lexer error was eaten and the variable
came back as the empty string, tripping the existence check on
every workflow.

Fix: rewrite the expression in idiomatic yq:
- `.jobs.authorize.steps[].env.USERS_TO_CHECK | select(. != null)`
  — emits a stream of one value per step that sets the env var,
  skipping steps that don't.
- `head -1` collapses the stream to a single value (or empty)
  for the `[ -n ]` check.

Verified locally with mikefarah/yq v4.53.2 against all three
harper claude-*.yml workflows — all pass.

Out of scope but worth following up:
- The `2>/dev/null` swallowed a real yq error. Worth either
  removing the suppression or capturing stderr for diagnostics
  when the expression returns empty.
- oauth's mirror (#68) carries the same bug; a copy of this
  fix needs to land there too.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
kriszyp pushed a commit that referenced this pull request May 5, 2026
Symptom: every PR was failing `Auth gate invariants / validate`
with `authorize job has no step setting USERS_TO_CHECK env var`,
even on workflows that clearly set it. Reported on PR #452 and
others.

Cause: the check `USERS_TO_CHECK presence` (added in #417 review-
fixup) was using a jq-flavored expression — `// empty` in
particular — but yq on ubuntu-latest is `mikefarah/yq` (Go), not
jq. yq's lexer rejects `empty`. The script had `2>/dev/null` on
the yq invocation, so the lexer error was eaten and the variable
came back as the empty string, tripping the existence check on
every workflow.

Fix: rewrite the expression in idiomatic yq:
- `.jobs.authorize.steps[].env.USERS_TO_CHECK | select(. != null)`
  — emits a stream of one value per step that sets the env var,
  skipping steps that don't.
- `head -1` collapses the stream to a single value (or empty)
  for the `[ -n ]` check.

Verified locally with mikefarah/yq v4.53.2 against all three
harper claude-*.yml workflows — all pass.

Out of scope but worth following up:
- The `2>/dev/null` swallowed a real yq error. Worth either
  removing the suppression or capturing stderr for diagnostics
  when the expression returns empty.
- oauth's mirror (#68) carries the same bug; a copy of this
  fix needs to land there too.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant