RFC: PR Review Scoring Reform — Moving Away From Overly Strict Agent Review

### Question

How should we reform the PR review process to prevent agents from being overly strict and blocking PRs with excessive NIT comments?

## Problem: PR #191 Real Case Study

**PR #191 feat: add interactive setup command**

One feature ended up with:
- `chaodu-agent` posted **4 rounds** of review
- **~20 issues** flagged (including many NITs / cosmetics)
- PR **AUTHOR** kept having to respond, refresh, and chase comment threads

### Core Symptoms

| Symptom | Description |
|---------|-------------|
| Comment fragmentation | 4 separate threads, author constantly refreshing |
| No separation: Blocking vs Non-Blocking | trailing comma, box drawing constants each get their own comment |
| Reviewers discussing internally | working_dir split into Option A/B/C — 3 layers of back-and-forth |
| Quantity as KPI | 20 issues flagged, but only 2-3 are truly blocking |
| NITs occupying comments | cosmetic issues blocked the PR merge |

---

## Proposed Reform

### 1. Prompt Reform — Based on Google Code Review Guide

Adopting Google's official 11 dimensions, focusing review on:

| Dimension | Priority | Scoring |
|-----------|----------|---------|
| Design | 🔴 High | 1-10 |
| Functionality | 🔴 High | 1-10 |
| Complexity | 🔴 High | 1-10 |
| Tests | 🔴 High | 1-10 |
| Consistency | 🟡 Medium | 1-10 |
| Documentation | 🟡 Medium | 1-10 |
| Style / Naming / Comments | 🟢 NIT only | No score, use \"Nit:\" prefix |

**Core Principles:**
- **Max 3 blocking issues** — beyond this, split into follow-up PR
- **One PR = one review comment** — read everything, reply once
- NITs grouped together, **do NOT block merge**
- Score ≥ 7 + blocking ≤ 3 → suggest merge

### 2. Threshold Consensus

```
✅ >= 7 points + blocking <= 3 → Suggest Merge
⚠️ 5-6 points OR blocking > 3 → Fix before merge
❌ < 5 points → Major revision needed
```

---

## Expected Result (After)

Same PR #191, new review would look like:

```
## PR Review Report
**Overall Score: 7.5/10 → ✅ Suggest Merge (after fixing blocking items)**

### 🔴 Blocking (2 items)
1. TOML injection risk — use toml crate serialize instead of string replacement
2. working_dir points to wrong directory for claude/codex agents

### 🟡 Non-Blocking (grouped, one time)
- Consider adding --output flag for custom config path (nice to have)

### 🟢 Nits (do NOT block merge)
- Nit: trailing comma
- Nit: box drawing constants instead of hardcoded strings
```

---

## Next Steps

These are areas I (Sammy) will drive:

- [ ] **Prompt rollout** — Integrate the new Review Prompt into each Agent's system prompt
- [ ] **Threshold consensus** — Document and align the team on the scoring threshold

---

*Reference: Google Code Review Guide — https://google.github.io/eng-practices/review/reviewer/looking-for.html*

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RFC: PR Review Scoring Reform — Moving Away From Overly Strict Agent Review #314

Question

Problem: PR #191 Real Case Study

Core Symptoms

Proposed Reform

1. Prompt Reform — Based on Google Code Review Guide

2. Threshold Consensus

Expected Result (After)

Next Steps

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Symptom	Description
Comment fragmentation	4 separate threads, author constantly refreshing
No separation: Blocking vs Non-Blocking	trailing comma, box drawing constants each get their own comment
Reviewers discussing internally	working_dir split into Option A/B/C — 3 layers of back-and-forth
Quantity as KPI	20 issues flagged, but only 2-3 are truly blocking
NITs occupying comments	cosmetic issues blocked the PR merge

Dimension	Priority	Scoring
Design	🔴 High	1-10
Functionality	🔴 High	1-10
Complexity	🔴 High	1-10
Tests	🔴 High	1-10
Consistency	🟡 Medium	1-10
Documentation	🟡 Medium	1-10
Style / Naming / Comments	🟢 NIT only	No score, use "Nit:" prefix

RFC: PR Review Scoring Reform — Moving Away From Overly Strict Agent Review #314

Description

Question

Problem: PR #191 Real Case Study

Core Symptoms

Proposed Reform

1. Prompt Reform — Based on Google Code Review Guide

2. Threshold Consensus

Expected Result (After)

Next Steps

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions