feat: add meta mode Phase 3 — classify and contribute upstream by Maxusmusti · Pull Request #376 · akashgit/remote-factory

Maxusmusti · 2026-05-25T22:33:42Z

Summary

Adds a contribution pipeline to meta mode that classifies evolved playbook items as general (upstream-worthy) vs project-specific (local only), then lets users contribute the general ones back as PRs
New factory contribute CLI command with --classify, --submit, and --status subcommands
CEO prompt updated with Phase 3 (M4/M5/M6) that runs after ACE evolution

Motivation

Currently, meta mode evolves playbooks locally via ACE — all learnings stay in ~/.factory/playbooks/ and never flow back upstream. This means every user independently re-discovers the same improvements. With this change, the factory can identify patterns that are universally useful across diverse projects and contribute them back to the default playbooks, closing the self-improvement loop: the more the factory is used, the better it becomes for everyone.

How it works

Classification engine (`factory/ace/contributor.py`)

Each evolved playbook item is scored on a general-vs-specific spectrum using four weighted signals:

| Signal | Weight | What it measures |
|---|---|---|
| Cross-project prevalence | 40% | Does this pattern appear across 3+ unrelated projects? |
| Domain independence | 25% | Does it reference factory internals or project-specific frameworks? |
| Evidence strength | 20% | How many observations (helpful/harmful) support it? |
| Category signal | 15% | Is the hypothesis category inherently general (e.g., prompt_engineering) or specific (e.g., feature)? |

Items scoring ≥ 0.65 are classified as general, ≤ 0.35 as specific, and between as uncertain.
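
The weighting and thresholds above can be sketched roughly as follows. This is an illustration, not the code in `factory/ace/contributor.py`: the signal keys and the `generality_score` and `classify` names are hypothetical, and it assumes each signal has already been normalised to the range [0, 1]. Only the weights and the 0.65 / 0.35 cut-offs come from this description.

```python
# Weights and thresholds as stated in the PR description.
# The dictionary keys are hypothetical names chosen for this sketch.
WEIGHTS = {
    "cross_project_prevalence": 0.40,
    "domain_independence": 0.25,
    "evidence_strength": 0.20,
    "category_signal": 0.15,
}
GENERAL_THRESHOLD = 0.65
SPECIFIC_THRESHOLD = 0.35


def generality_score(signals: dict[str, float]) -> float:
    """Weighted sum of the four signals (each assumed to lie in [0, 1])."""
    for name, value in signals.items():
        if name not in WEIGHTS:
            raise KeyError(f"unknown signal: {name}")
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"signal {name} out of range: {value}")
    # Missing signals raise KeyError here, so all four must be supplied.
    return sum(weight * signals[name] for name, weight in WEIGHTS.items())


def classify(score: float) -> str:
    """Map a generality score to one of the three labels."""
    if score >= GENERAL_THRESHOLD:
        return "general"
    if score <= SPECIFIC_THRESHOLD:
        return "specific"
    return "uncertain"


if __name__ == "__main__":
    example = {
        "cross_project_prevalence": 0.9,
        "domain_independence": 0.8,
        "evidence_strength": 0.7,
        "category_signal": 0.6,
    }
    score = generality_score(example)
    print(f"{score:.2f} -> {classify(score)}")
```

Because prevalence carries 40% of the weight, an item seen in only one project would need near-maximal values on the other three signals to get anywhere near the general threshold under this scheme.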

User experience

At the end of a meta mode run, users see a terminal summary:

════════════════════════════════════════════════════════════
                    META MODE SUMMARY
════════════════════════════════════════════════════════════

PLAYBOOK EVOLUTION COMPLETE
  9 items evolved across 5 roles
  3 general (upstream candidates)  |  3 specific (local only)  |  3 uncertain

────────────────────────────────────────────────────────────
GENERAL IMPROVEMENTS (upstream candidates)
────────────────────────────────────────────────────────────

  1. [strategist] "Always run type checkers after making changes"
     Generality: ████████░░ 0.81  |  5 projects  |  16 experiments
     Category: type_safety

────────────────────────────────────────────────────────────
PROJECT-SPECIFIC IMPROVEMENTS (staying local)
────────────────────────────────────────────────────────────

  1. [builder] "Use iframe wait patterns for Playwright tests"
     Generality: ██░░░░░░░░ 0.22  |  1 project  |  5 experiments
     Why local: single-project signal, domain-specific (Playwright)

════════════════════════════════════════════════════════════
Run `factory contribute` to select items for upstream PR.
════════════════════════════════════════════════════════════

Users can then run factory contribute --submit to create a PR, or skip — contribution is always opt-in.

CLI commands

# Classify evolved items and show summary
factory contribute --classify /path/to/project

# Create PR with all general items
factory contribute --submit /path/to/project --all

# Check pending candidates
factory contribute --status

CEO prompt changes

Phase 3 (steps M4/M5/M6) is added after Phase 2 (ACE). The CEO:

Runs factory contribute --classify to score evolved items
Presents the summary to the user
Waits for explicit approval before submitting — never auto-contributes

Files changed

| File | Change |
|---|---|
| `factory/ace/contributor.py` | New: classification engine, contribution pipeline, terminal summary, git/gh submit, JSON persistence (780 lines) |
| `factory/cli.py` | Modified: `factory contribute` command with `--classify`/`--submit`/`--status` subcommands |
| `factory/agents/prompts/ceo.md` | Modified: Phase 3 (M4/M5/M6) + task table entry |
| `tests/test_contributor.py` | New: 26 tests covering classification, diffing, summary, PR body, persistence |

Design decisions

Composition over inheritance for ClassifiedItem wrapping PlaybookItem (since PlaybookItem has extra="forbid")
Reuses existing factory infrastructure: Playbook.from_markdown(), classify_hypothesis(), discover_projects(), load_all_histories(), DEFAULTS_DIR, user_playbooks_dir()
prepare_contribution() returns specs without executing git — keeps the module testable; execute_contribution() handles the actual git/gh commands separately
Fuzzy matching (SequenceMatcher ≥ 0.75) for both cross-project evidence and playbook diffing, consistent with the existing reflector
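
To illustrate the fuzzy-matching decision, here is a minimal sketch built on the standard library's `difflib.SequenceMatcher` with the 0.75 threshold mentioned above. The helper names `is_fuzzy_match` and `new_items` are hypothetical, and lowercasing and stripping the text before comparison is an assumption of this sketch; the actual contributor and reflector code may normalise differently.

```python
from difflib import SequenceMatcher

SIMILARITY_THRESHOLD = 0.75  # value stated in the PR description


def is_fuzzy_match(a: str, b: str, threshold: float = SIMILARITY_THRESHOLD) -> bool:
    """Return True when two item texts are similar enough to be treated as the same."""
    # Assumption: compare case-insensitively and ignore surrounding whitespace.
    ratio = SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()
    return ratio >= threshold


def new_items(evolved: list[str], defaults: list[str]) -> list[str]:
    """Evolved items that have no fuzzy counterpart among the default items."""
    return [
        item
        for item in evolved
        if not any(is_fuzzy_match(item, default) for default in defaults)
    ]


if __name__ == "__main__":
    defaults = ["Always run type checkers after making changes"]
    evolved = [
        "Always run type checkers after changes",
        "Use iframe wait patterns for Playwright tests",
    ]
    # Only the Playwright item lacks a close match in the defaults.
    print(new_items(evolved, defaults))
```

The same predicate could serve both uses named above: counting how many projects contain a near-identical item, and deciding which evolved items differ from the default playbook.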

Test plan

26 new tests pass (pytest tests/test_contributor.py)
261 existing tests pass — zero regressions
factory contribute --help shows correct usage
All imports resolve correctly
Manual test: run factory contribute --classify on a project with evolved playbooks
Manual test: run factory contribute --submit --dry-run to verify PR spec generation

🤖 Generated with Claude Code

Add the ability for meta mode to distinguish general improvements from project-specific ones, and contribute the general items back upstream as PRs. This closes the self-improvement loop: the more the factory is used across diverse projects, the better its default playbooks become.

New CLI command `factory contribute` with three modes:
- `--classify`: scores evolved playbook items on a general-vs-specific spectrum using four weighted signals (cross-project prevalence 40%, domain independence 25%, evidence strength 20%, category signal 15%)
- `--submit`: creates a PR against the factory repo with approved items
- `--status`: shows pending contribution candidates

The classification engine uses cross-project experiment data to identify items that appear across 3+ unrelated projects as "general" (upstream candidates), single-project items as "specific" (local only), and everything in between as "uncertain" (needs more data).

The CEO prompt is updated with Phase 3 (M4/M5/M6) which runs after ACE evolution, presents the user with a terminal summary showing the distinction, and lets them opt in to contributing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

codecov · 2026-05-25T22:35:33Z

Codecov Report

❌ Patch coverage is 93.70277% with 25 lines in your changes missing coverage. Please review.
✅ Project coverage is 87.56%. Comparing base (190741e) to head (290672d).
⚠️ Report is 3 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| factory/cli.py | 65.51% | 20 Missing ⚠️ |
| factory/ace/contributor.py | 98.52% | 5 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #376      +/-   ##
==========================================
+ Coverage   87.54%   87.56%   +0.02%     
==========================================
  Files          60       62       +2     
  Lines        9170     9734     +564     
==========================================
+ Hits         8028     8524     +496     
- Misses       1142     1210      +68



Add 26 tests covering uncovered paths: classify_evolved_playbooks pipeline, package_evidence, prepare_contribution, execute_contribution (mocked subprocess), explain_specificity/uncertainty branches, load_candidates edge cases, and cmd_contribute CLI handler.

Fix lint: remove unused imports, rename ambiguous variable, drop unused locals.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Maxusmusti and others added 2 commits May 25, 2026 18:38

fix: remove unused imports (ruff F401)

dcc231d

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Maxusmusti wants to merge 3 commits into main from feat/meta-mode-upstream-contributions
