
Epistemic Question: How do we measure the true productivity impact of AI coding agents? #1

@weisberg

Epistemic Question

Core Question: How do we measure the true productivity impact of AI coding agents, distinguishing between perceived speed and actual delivered value?

Why This Matters

Agile Agentic Analytics documents the "verification bottleneck and illusion of speed" failure mode: controlled evidence shows AI tools can slow experienced developers in familiar codebases because time shifts into prompting, waiting, and verification—yet developers may still believe they are faster.

This creates a measurement paradox: traditional velocity metrics become less meaningful when agents can spike output, but what should replace them?

Related Framework

The Agile Agentic Analytics page recommends a balanced KPI set (a computation sketch for two of these follows the list):

  • DORA metrics (deployment frequency, lead time, change failure rate)
  • Agent productivity metrics (cycle time, PR merge rate, thrash rate)
  • Quality metrics (escaped defects, flaky tests, maintainability)
  • Hallucination/plausibility-failure rate
  • Cost and latency (tokens, CI minutes)
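
To ground the list, here is a minimal sketch of how two of these KPIs might be computed from PR-level event data. Everything here is an assumption for illustration: the `PullRequest` record, its field names, and the thrash threshold are hypothetical stand-ins for whatever your tracker actually exports; the framework itself does not prescribe a schema.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class PullRequest:
    """Hypothetical PR record; field names are illustrative, not a real schema."""
    opened_at: datetime
    merged_at: datetime | None     # None if never merged
    deployed_at: datetime | None   # None if never deployed
    caused_incident: bool          # linked to a production incident?
    review_rounds: int             # times the PR bounced back for rework

def change_failure_rate(prs: list[PullRequest]) -> float:
    """DORA change failure rate: share of deployed changes tied to an incident."""
    deployed = [pr for pr in prs if pr.deployed_at is not None]
    if not deployed:
        return 0.0
    return sum(pr.caused_incident for pr in deployed) / len(deployed)

def thrash_rate(prs: list[PullRequest], max_rounds: int = 3) -> float:
    """Share of merged PRs needing more than `max_rounds` review cycles
    (one assumed operationalization of "thrash rate")."""
    merged = [pr for pr in prs if pr.merged_at is not None]
    if not merged:
        return 0.0
    return sum(pr.review_rounds > max_rounds for pr in merged) / len(merged)
```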

But the question remains: How do we know if we're measuring the right things? How do we prevent gaming these metrics?

Open Questions to Explore

  1. Counterfactual measurement: How do we establish a reliable baseline for "what would have happened without agents"?
  2. Value vs. volume: Should we measure lines of code generated, or business value delivered? If the latter, how?
  3. Learning curve effects: How long until teams reach "steady state" productivity, and how do we account for the transition period?
  4. Maintenance burden: How do we capture the hidden cost of reviewing/debugging AI-generated code that may increase work for senior developers? (See the ratio sketch after this list.)
  5. Long-term code health: Do current velocity gains come at the expense of technical debt accumulation?
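
Question 4 hints at a directly computable indicator. Below is a rough sketch of one possible "maintenance burden ratio", assuming rework hours on agent output can be logged separately from overall agent-assisted development hours; the definition and the threshold mentioned in the comment are assumptions, not validated findings.

```python
def maintenance_burden_ratio(rework_hours: float, dev_hours: float) -> float:
    """Hypothetical ratio: hours spent reviewing/debugging/reworking
    agent-generated code per hour of agent-assisted development.
    A sustained upward trend (say, past ~0.5, an unvalidated placeholder)
    would suggest work is shifting onto senior reviewers, not shrinking."""
    if dev_hours <= 0:
        raise ValueError("dev_hours must be positive")
    return rework_hours / dev_hours

# Example: 12 rework hours logged against 30 agent-assisted dev hours.
print(maintenance_burden_ratio(12.0, 30.0))  # 0.4
```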

Potential Research Directions

  • Survey existing empirical studies on AI coding assistant productivity (academic + industry)
  • Design A/B test framework for team-level productivity measurement
  • Develop "productivity toxicity" indicators (high velocity but declining quality)
  • Create rubric for "maintenance burden ratio" (time spent fixing AI outputs vs. manual coding baseline)
  • Explore causal inference methods for attribution (what % of productivity change is genuinely due to AI?); see the difference-in-differences sketch after this list
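
For the causal-inference direction, a standard first tool is difference-in-differences: compare the metric change in a team that adopted agents against a comparable team that did not. The sketch below assumes parallel pre-adoption trends; the weekly lead-time numbers are invented purely for illustration.

```python
import numpy as np

def diff_in_diff(treated_pre, treated_post, control_pre, control_post) -> float:
    """Difference-in-differences estimate of the agent effect on a metric,
    valid only under the parallel-trends assumption."""
    treated_delta = np.mean(treated_post) - np.mean(treated_pre)
    control_delta = np.mean(control_post) - np.mean(control_pre)
    return float(treated_delta - control_delta)

# Hypothetical weekly lead times (days) before/after an agent rollout.
effect = diff_in_diff(
    treated_pre=[5.1, 4.8, 5.3], treated_post=[4.0, 3.9, 4.2],
    control_pre=[5.0, 5.2, 4.9], control_post=[4.9, 5.1, 5.0],
)
print(f"Estimated agent effect on lead time: {effect:+.2f} days")
```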

Success Criteria for Answering This Question

We will know we've made progress when we can:

  1. Define 3-5 leading indicators that predict sustainable productivity (not just short-term throughput)
  2. Establish measurement protocols that work across different team skill levels and codebases
  3. Distinguish between "tool-amplified productivity" and "tool-induced toil"
  4. Provide decision rules: "When should we increase agent autonomy?" vs. "When should we pull back?" (an illustrative rule sketch follows this list)
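
As a first cut, the decision rules in item 4 might be expressed as thresholds over leading indicators like those sketched above. All thresholds below are placeholders to be calibrated per team, not empirically derived values.

```python
def autonomy_recommendation(change_failure_rate: float,
                            thrash_rate: float,
                            maintenance_burden: float) -> str:
    """Illustrative decision rule; thresholds are uncalibrated placeholders."""
    if change_failure_rate > 0.15 or maintenance_burden > 0.5:
        return "pull back: route agent output through stricter review"
    if thrash_rate < 0.10 and change_failure_rate < 0.05:
        return "expand autonomy on low-risk, well-tested components"
    return "hold steady and keep collecting evidence"

print(autonomy_recommendation(0.03, 0.08, 0.25))
# -> "expand autonomy on low-risk, well-tested components"
```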

Cross-References

Metadata


Labels

  • agile-agentic: Related to Agile Agentic Analytics and Scrum for AI teams
  • epistemic-question: Deep questions that guide knowledge base exploration and research
  • research-needed: Requires further research or literature review
