ci(claude): bump max-turns 24→48 + sharpen anti-pattern in Tools section by heskew · Pull Request #452 · HarperFast/harper

heskew · 2026-05-01T22:57:30Z

Summary

Fixes the immediate symptom from harper PR #450's failed review (run 25234303343): agent burned ~20 turns on `Bash(grep -rn …)` repo-wide searches, hit `--max-turns 24`, posted no review comment, log step skipped. Diff was 5 files / +83/−8 — exploration pattern, not diff size, drove the cost.

What changes

`--max-turns 24 → 48` — band-aid. Doubles worst-case budget; ends the immediate dropouts on PRs with non-trivial cross-symbol exploration.
`## Tools` section rewrite:
- Adds a turn budget hint (1–2 turns context, 1 turn diff, ~1 per file, rest for searches + posting).
- Names repo-wide search as THE biggest turn-burner with a concrete failure-mode citation ("recent run that timed out before posting anything").
- Points at `Grep` tool with `path` parameter scoping as the right tool for the job.

Open question (followup)

In the failed run, the agent's `Bash(grep -rn …)` calls all came back with `is_error: false` — they ran. But `Bash(grep …)` shouldn't match any of the existing allowlist patterns (`Bash(git diff:)`, `Bash(git log:)`, `Bash(git blame:)`, `Bash(git show:)`, `Bash(gh ...:*)`).

Either:

claude-code-action permits any `Bash(...)` once the tool is enabled at all, or
there's a matching gotcha in our patterns.

Worth digging into so the prompt instruction is backed by an actual deny at the action level. Not blocking this PR.

Test plan

Push to a PR after this lands → confirm review completes and posts a marker'd comment.
Confirm the agent's tool-call pattern: prefers `Grep` over `Bash(grep)`, doesn't do repo-wide recursive searches.
Re-trigger PR Read-only mode #450's review (or wait for next push) → confirm review now lands.

Followup

oauth mirror once this merges.

🤖 Generated with Claude Code

Diagnosis from harper PR #450 review failure (run 25234303343): agent burned ~20 turns on `Bash(grep -rn …)` searches across the whole repo before hitting `--max-turns 24`, posted no review comment, and the log step skipped (no marker'd comment to log). Diff was small (5 files, +83/−8) — exploration pattern, not diff size, was the cost driver. Two changes: 1. **--max-turns 24 → 48.** Cheap band-aid that ends the immediate dropouts on PRs with non-trivial cross-symbol exploration. Per the existing comment, this is the real cost ceiling — bumping doubles the worst-case budget and moves us out of the 'one exploratory grep too many = no review at all' regime. 2. **Tools section: explicit turn-budget hint + sharpened anti-pattern.** Names "repo-wide search" as THE biggest turn-burner, points at the `Grep` tool with `path` scoping, and cites the failure mode ("recent run that timed out before posting anything") so the agent has concrete signal that matches its observed behavior. Out of scope: investigating whether `Bash(grep …)` is supposed to be denied by the existing `--allowedTools` list (current entries scope to `Bash(git diff:*)`, `Bash(git log:*)`, `Bash(git blame:*)`, `Bash(git show:*)`, `Bash(gh ...:*)` — `Bash(grep …)` shouldn't match any pattern but ran cleanly in #450's failed run, with `is_error: false` on every grep tool call). Either claude-code- action permits any `Bash(...)` once the tool is enabled, or there's a matching gotcha. Tracked as a follow-up; this PR addresses the immediate symptom. oauth needs the same change — will mirror after this lands. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Mirror of HarperFast/harper#452. Same fix applied to the oauth review workflow: max-turns doubled, turn-budget hint added, repo-wide search called out as the single biggest turn-burner. The triggering harper run is referenced as the concrete failure mode rather than re-running the same diagnosis on oauth (oauth hasn't seen the same failure yet, but the prompt and budget are identical, so the same exposure exists). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Symptom: every PR was failing `Auth gate invariants / validate` with `authorize job has no step setting USERS_TO_CHECK env var`, even on workflows that clearly set it. Reported on PR #452 and others. Cause: the check `USERS_TO_CHECK presence` (added in #417 review- fixup) was using a jq-flavored expression — `// empty` in particular — but yq on ubuntu-latest is `mikefarah/yq` (Go), not jq. yq's lexer rejects `empty`. The script had `2>/dev/null` on the yq invocation, so the lexer error was eaten and the variable came back as the empty string, tripping the existence check on every workflow. Fix: rewrite the expression in idiomatic yq: - `.jobs.authorize.steps[].env.USERS_TO_CHECK | select(. != null)` — emits a stream of one value per step that sets the env var, skipping steps that don't. - `head -1` collapses the stream to a single value (or empty) for the `[ -n ]` check. Verified locally with mikefarah/yq v4.53.2 against all three harper claude-*.yml workflows — all pass. Out of scope but worth following up: - The `2>/dev/null` swallowed a real yq error. Worth either removing the suppression or capturing stderr for diagnostics when the expression returns empty. - oauth's mirror (#68) carries the same bug; a copy of this fix needs to land there too. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

heskew requested review from a team as code owners May 1, 2026 22:57

heskew merged commit 0118c1c into main May 1, 2026
22 of 25 checks passed

heskew deleted the workflow/bump-max-turns-and-search-discipline branch May 1, 2026 23:01

heskew mentioned this pull request May 5, 2026

ci(claude): clarify --allowedTools is convenience, not enforcement #473

Merged

2 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ci(claude): bump max-turns 24→48 + sharpen anti-pattern in Tools section#452

ci(claude): bump max-turns 24→48 + sharpen anti-pattern in Tools section#452
heskew merged 1 commit intomainfrom
workflow/bump-max-turns-and-search-discipline

heskew commented May 1, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

heskew commented May 1, 2026

Summary

What changes

Open question (followup)

Test plan

Followup

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant