feat: rename --budget-conscious to --lite with v2 5-agent pipeline design by lukeinglis · Pull Request #399 · akashgit/remote-factory

lukeinglis · 2026-05-28T13:30:04Z

Closes #398

Changes

Add --lite boolean flag to both ceo and run subcommand parsers in build_parser(), following the same pattern as --no-github
Thread the lite bool through cmd_ceo(), cmd_run(), _build_ceo_task(), and _run_single_cycle()
When lite=True, _build_ceo_task() appends a ## Lite Mode section to the CEO task string describing 7 optimizations:
1. Lite Researcher (archive-only, no web search)
2. Skip baseline eval (read last_eval.json)
3. Review pipeline reduction (skip Reviewer agent + headless review, use CLI guard)
4. Archivist consolidation (1 batch at cycle end)
5. Strategist context compression (factory brief)
6. Single hypothesis per cycle
7. Invocation budget (cap at 7)
Add ## Lite Mode section to ceo.md with full v2 protocol for each optimization, including concrete bash examples
Verify Builder Failure Recovery Protocol is present in ceo.md (confirmed present in Error Recovery section)
Add TestLiteMode class with 8 unit tests covering parser acceptance, defaults, task injection, and flag propagation
No handoff command code, no structlog imports in plugin/runner/mcp files

…sign

Add --lite flag to both ceo and run subcommand parsers, and thread the flag through cmd_ceo, cmd_run, _run_single_cycle, and _build_ceo_task. When active, the CEO task string includes a Lite Mode section describing 7 optimizations: archive-only researcher, skip baseline eval, review pipeline reduction, archivist consolidation, strategist context compression, single hypothesis per cycle, and invocation budget cap.

Add a Lite Mode section to ceo.md with the full v2 protocol for each optimization, including concrete bash examples and the invariant that Sacred Rules and precheck gates remain unchanged.

Add 8 unit tests in TestLiteMode covering parser acceptance, defaults, task section injection, and flag propagation through both ceo and run code paths.

Closes #398

Signed-off-by: Luke Inglis <lukeinglis21@yahoo.com>

codecov · 2026-05-28T13:31:20Z

Codecov Report

❌ Patch coverage is 91.66667% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 87.40%. Comparing base (8097461) to head (f1229cc).
⚠️ Report is 17 commits behind head on main.

Files with missing lines	Patch %	Lines
factory/cli.py	91.66%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #399      +/-   ##
==========================================
+ Coverage   87.31%   87.40%   +0.09%     
==========================================
  Files          61       61              
  Lines        9339     9409      +70     
==========================================
+ Hits         8154     8224      +70     
  Misses       1185     1185
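
The percentages in the diff above follow from hits divided by lines. A small sketch of that arithmetic, assuming the report truncates to two decimals rather than rounding (an assumption about the display, not something the report states):

```python
import math


def coverage_pct(hits: int, lines: int) -> float:
    """Coverage percentage, truncated (not rounded) to two decimals."""
    return math.floor(hits / lines * 10_000) / 100


base = coverage_pct(8154, 9339)   # main
head = coverage_pct(8224, 9409)   # this PR
delta = round(head - base, 2)
```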

lukeinglis · 2026-05-28T13:41:55Z

✅ Factory Review: KEEP

Verdict: KEEP
Reason: Clean --lite flag implementation: CLI plumbing + CEO prompt Lite Mode section + 8 passing tests. One non-blocking gap (_chain_modes lite propagation) documented for follow-up.

Experiment: #10
Hypothesis: Strip PR #346 to clean slate and rename --budget-conscious to --lite with v2 5-agent pipeline design

Score Comparison

Metric	Value
Before	0.6500
After	0.6489
Delta	-0.0011
Threshold	0.8000

Guard Checks

scope: PASS
eval_immutable: PASS

Known Gap

_chain_modes does not forward the lite parameter. This affects only Build→Discover→Improve chaining, not the primary Improve mode use case. A follow-up fix is needed.

Posted by Factory CEO

lukeinglis · 2026-05-28T14:35:28Z

Context: Lite Mode v2 Design

This is PR 1 of 8 in the lite mode series. It establishes the --lite flag and the CEO prompt protocol. The remaining 7 PRs add programmatic enforcement and supporting infrastructure.

Problem

Non-API users on monthly subscription plans (Claude Max/Pro) hit session and token limits during factory cycles. The v1 --budget-conscious flag (PR #346, now closed) offered marginal savings (~5-10%). This redesign targets 40-45% token reduction.

Token Consumption Analysis

A standard single-hypothesis Improve cycle consumes:

Component	Standard	Lite	Savings
CEO prompt (all turns)	~253K eff. tokens	~126K eff. tokens	50%
Researcher agent	20-40K	8-12K (archive-only)	60-70%
Strategist agent	12-22K	4-6K (compressed)	65%
Builder agent	50-200K	50-200K	0% (irreducible)
Baseline Evaluator	3-5K	0 (use last_eval.json)	100%
Post-change Evaluator	3-5K	3-5K	0%
Reviewer agent	10-30K	0 (CEO review + CLI guard)	100%
Headless final review	5-30K	0 (merged to CEO review)	100%
Archivist (x5)	14-20K	3-5K (1 batch)	75%
Total	385K-645K	209K-394K	40-45%

Agent count drops from 12 invocations (11 agents + 1 headless) to 5 invocations (Researcher-lite, Strategist, Builder, Evaluator x1, Archivist batch).
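
The invocation counts and the overall savings range can be checked with a few lines of arithmetic. In this sketch the per-agent counts are read off the table and the sentence above (one invocation each unless marked x5), and pairing the low totals together and the high totals together is an assumption about how the range was derived.

```python
# Invocations per cycle, read off the table and the sentence above.
STANDARD = {
    "Researcher": 1, "Strategist": 1, "Builder": 1,
    "Baseline Evaluator": 1, "Post-change Evaluator": 1,
    "Reviewer": 1, "Archivist": 5, "Headless final review": 1,
}
LITE = {
    "Researcher-lite": 1, "Strategist": 1, "Builder": 1,
    "Evaluator": 1, "Archivist batch": 1,
}

# Totals row of the table, in thousands of tokens: (low, high).
STANDARD_TOTAL_K = (385, 645)
LITE_TOTAL_K = (209, 394)


def savings_pct(standard_k: float, lite_k: float) -> float:
    """Percentage of tokens saved, rounded to one decimal."""
    return round((1 - lite_k / standard_k) * 100, 1)


standard_invocations = sum(STANDARD.values())
lite_invocations = sum(LITE.values())
low_end = savings_pct(STANDARD_TOTAL_K[0], LITE_TOTAL_K[0])
high_end = savings_pct(STANDARD_TOTAL_K[1], LITE_TOTAL_K[1])
```

Under that pairing the totals give savings of about 45.7% at the low end and 38.9% at the high end, roughly in line with the 40-45% quoted in the table.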

The 7 Optimizations (defined in this PR's CEO prompt section)

1. Lite Researcher: Archive-only, no web search (~8-12K vs 20-40K)
2. Skip baseline eval: Read .factory/last_eval.json instead of spawning Evaluator
3. Review pipeline reduction: CEO structured review + factory guard CLI replaces triple-redundant review (CEO + Reviewer agent + headless)
4. Archivist consolidation: 1 batch invocation at cycle end replaces 5 per-phase invocations
5. Strategist context compression: factory brief (compact JSON) replaces full file cat (~500 tokens vs 12-22K)
6. Single hypothesis per cycle: Invocation budget constrains to 1 hypothesis
7. Invocation budget: Hard cap at 7 agent invocations with ceiling warnings
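
In this PR the invocation budget exists only as prompt text; programmatic enforcement is slated for a later PR in the series. Purely as an illustration of what "hard cap at 7 with ceiling warnings" could look like, here is a hypothetical counter. The class name, the warning threshold of 6, and the use of RuntimeError are all assumptions, not part of the PR.

```python
import warnings


class InvocationBudget:
    """Hypothetical counter: hard cap on agent invocations with a ceiling warning."""

    def __init__(self, cap: int = 7, warn_at: int = 6) -> None:
        self.cap = cap          # hard cap from the PR description
        self.warn_at = warn_at  # assumed warning threshold, not from the PR
        self.used = 0

    def spend(self, agent: str) -> None:
        """Record one invocation, refusing any that would exceed the cap."""
        if self.used >= self.cap:
            raise RuntimeError(f"invocation budget exhausted; refusing to spawn {agent}")
        self.used += 1
        if self.used >= self.warn_at:
            warnings.warn(f"{self.used}/{self.cap} invocations used after {agent}")


# The five lite-mode invocations fit under the cap with two to spare.
budget = InvocationBudget()
for agent in ("Researcher-lite", "Strategist", "Builder", "Evaluator", "Archivist batch"):
    budget.spend(agent)
```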

Remaining PRs

PR	What	Impact
2	CEO prompt splitting (base + mode modules)	~126K eff. tokens saved
3	Review pipeline reduction (conditional in CEO prompt)	15-60K saved per hypothesis
4	Archivist consolidation (conditional in CEO prompt)	8-15K saved
5	`factory brief` CLI command	10-20K saved
6	Lite Researcher (archive-only constraint)	12-28K saved
7	Skip baseline eval (use last_eval.json)	3-5K saved
8	Invocation budget system (generalize Bob Shell ceiling)	Enforcement layer

Full spec: .factory/strategy/idea.md

Known Gap

_chain_modes does not forward the lite parameter, so Build→Discover→Improve chaining silently drops lite mode. A 3-line fix is needed: add lite: bool = False to the _chain_modes signature and forward it to _run_single_cycle at all 3 call sites. This does not affect the primary use case (Improve mode on existing projects).

lukeinglis marked this pull request as ready for review May 28, 2026 13:41

lukeinglis mentioned this pull request May 28, 2026

Add --budget-conscious CLI flag for session-aware experiment execution #346

Closed

lukeinglis mentioned this pull request May 28, 2026

feat: rename --budget-conscious to --lite with v2 5-agent pipeline design #398

Open

10 tasks

lukeinglis wants to merge 1 commit into main from factory/run-4f008854
