Add directive-driven improvement and prompt surface optimization#46

Open
Born14 wants to merge 2 commits into main from claude/evaluate-self-improvement-loop-3jCG4
Born14 commented Apr 5, 2026

Summary

Introduces three enhancements to the autonomous improvement engine:

  1. Directive-Driven Improvement: Operators can now guide the improvement loop through an improve-directive.md file instead of modifying TypeScript. This follows AutoAgent's "program the meta-agent" pattern.

  2. Prompt Surface Optimization: Extends the improvement loop to recognize and optimize LLM prompts and tunable thresholds within gate files (e.g., vision.ts, triangulation.ts), allowing the LLM to prefer prompt edits over logic changes when appropriate.

  3. Continuous Mode: Implements hill-climbing iteration support, allowing the improvement engine to re-baseline and iterate after each accepted improvement.
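
To make the directive idea concrete, here is a minimal sketch of lenient parsing for such a file. It is not the PR's `loadDirective()`: the function name `parseDirective`, the `Directive` shape, the key names (`priority gates`, `focus`, `edit style`) and the `key: value` layout are all assumptions, since the actual file format is not shown in this description.

```typescript
// Hypothetical shape; the real fields in improve-directive.ts may differ.
interface Directive {
  priorityGates: string[];
  focus?: string;
  editStyle?: string;
  instructions: string;
}

// Lenient parse: keys are case-insensitive, spaces/underscores/hyphens in keys
// are ignored, and either ":" or "=" is accepted as the delimiter.
function parseDirective(markdown: string): Directive {
  const directive: Directive = { priorityGates: [], instructions: "" };
  const freeText: string[] = [];

  for (const rawLine of markdown.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Skip headings and single-line HTML comments.
    if (line.startsWith("#") || line.startsWith("<!--")) continue;

    const match = /^[-*]?\s*([A-Za-z][A-Za-z _-]*?)\s*[:=]\s*(.+)$/.exec(line);
    const key = match ? match[1].toLowerCase().replace(/[ _-]/g, "") : "";
    const value = match ? match[2].trim() : "";

    if (key === "prioritygates") {
      directive.priorityGates = value.split(/[,\s]+/).filter(Boolean);
    } else if (key === "focus" || key === "focusmode") {
      directive.focus = value.toLowerCase();
    } else if (key === "editstyle") {
      directive.editStyle = value.toLowerCase();
    } else if (line.length > 0) {
      // Anything unrecognized is kept as custom instructions.
      freeText.push(line);
    }
  }

  directive.instructions = freeText.join("\n");
  return directive;
}
```

Unrecognized lines fall through to free-text instructions rather than raising an error, which is one way to keep the file forgiving for operators.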

Key Changes

  • improve-directive.ts (new)

    • loadDirective() — Loads and parses improve-directive.md with structured fields (priority gates, focus mode, edit style) and custom instructions
    • formatDirectiveForPrompt() — Injects directive context into LLM prompts
    • applyDirectiveToBundles() — Prioritizes evidence bundles based on directive's priority gates
  • improve-prompt-surface.ts (new)

    • Defines known prompt regions in gate files (vision.ts, triangulation.ts, hallucination.ts)
    • extractPromptRegion() — Extracts actual prompt text from source files
    • formatPromptSurfaceContext() — Provides LLM with prompt region metadata and tuning advice
    • isPromptRegion() — Checks if a file/function is a tunable prompt surface
  • improve.ts (modified)

    • Refactored runImproveLoop() into runSingleIteration() to support continuous mode
    • Loads and applies directive at start of each iteration
    • Injects directive and prompt surface context into bundle processing
    • Tracks cumulative LLM usage and accepted improvements across iterations
    • Early termination when no improvements found
  • self-test.ts (modified)

    • Added CLI flags: --continuous, --max-iterations=N, --directive=PATH, --prompt-surface
    • Updated help text with examples for all new modes
  • types.ts (modified)

    • Extended ImproveConfig with maxIterations, directivePath, promptSurface fields
  • improve-directive.md (new)

    • Template file with commented examples showing how to configure improvement priorities
  • improve-directive.test.ts (new)

    • Unit tests for directive parsing, prompt formatting, and bundle prioritization
  • .gitignore (modified)

    • Added .verify/ directory (created by test runs)
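
As an illustration of what prioritizing evidence bundles by directive gates could look like, the sketch below does a stable reorder. The `EvidenceBundle` shape and the name `prioritizeBundles` are hypothetical; the real `applyDirectiveToBundles()` signature and bundle type are not shown in this PR description.

```typescript
// Hypothetical evidence bundle; only the fields needed for ordering are modeled.
interface EvidenceBundle {
  id: string;
  gate: string;
}

// Stable reorder: bundles whose gate appears in priorityGates come first, in the
// order the gates are listed; all other bundles keep their original relative order.
function prioritizeBundles(
  bundles: EvidenceBundle[],
  priorityGates: string[],
): EvidenceBundle[] {
  const rank = new Map<string, number>();
  priorityGates.forEach((gate, i) => {
    const key = gate.toLowerCase();
    if (!rank.has(key)) rank.set(key, i);
  });
  const rankOf = (b: EvidenceBundle): number =>
    rank.get(b.gate.toLowerCase()) ?? priorityGates.length;

  return bundles
    .map((bundle, index) => ({ bundle, index }))
    .sort((a, b) => rankOf(a.bundle) - rankOf(b.bundle) || a.index - b.index)
    .map(({ bundle }) => bundle);
}
```

The input array is left unmodified, and an empty `priorityGates` list returns the bundles in their original order.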

Notable Implementation Details

  • Directive parsing is lenient (case-insensitive, flexible delimiters) to reduce friction
  • Prompt regions use start/end markers for robust extraction even if code changes
  • Directive context is injected into both diagnosis and fix generation prompts
  • Continuous mode re-baselines after each accepted improvement, enabling iterative refinement
  • Cumulative LLM usage is tracked and reported at the end of continuous runs
  • Early termination prevents wasted iterations when the improvement frontier is reached
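
The start/end marker approach for prompt regions can be sketched as follows. The `PromptRegion` shape, the function name `extractBetweenMarkers`, and any marker strings are illustrative assumptions, not the PR's `extractPromptRegion()`.

```typescript
// Hypothetical region descriptor; real marker strings live in improve-prompt-surface.ts.
interface PromptRegion {
  file: string;
  startMarker: string;
  endMarker: string;
}

// Returns the text strictly between the two markers, or null when either marker
// is missing, so a stale marker surfaces as a miss instead of a wrong slice.
function extractBetweenMarkers(source: string, region: PromptRegion): string | null {
  const start = source.indexOf(region.startMarker);
  if (start === -1) return null;

  const bodyStart = start + region.startMarker.length;
  const end = source.indexOf(region.endMarker, bodyStart);
  if (end === -1) return null;

  return source.slice(bodyStart, end);
}
```

Because extraction depends only on the marker strings, edits elsewhere in the file do not shift the region. The markers themselves still have to exist in the gate source, which is why a `null` result is worth checking for.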

https://claude.ai/code/session_01SJkfKmU2V83UrCvgyH2JAD

claude added 2 commits April 4, 2026 22:50
… surface optimization

Three AutoAgent-inspired concepts integrated into the evidence-centric improve loop:

1. Continuous mode (--continuous / --max-iterations=N): Re-baselines after each
   accepted improvement and iterates, compounding small wins. Stops when an
   iteration produces no accepted candidates.

2. Directive-driven improvement (improve-directive.md): Externalizes improvement
   strategy into a human-editable Markdown file. Operators can specify priority
   gates, focus mode (false positives vs negatives), edit style preferences, and
   custom instructions — all injected into LLM diagnosis/fix prompts.

3. Prompt surface optimization (--prompt-surface): Extends the bounded surface
   to include LLM prompts within gates (vision.ts prompt, triangulation weights).
   The fix generator gets context about which regions are prompts vs logic,
   preferring prompt edits for prompt-related failures.
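
The continuous-mode control flow described in item 1 can be sketched as a small driver loop. This is an assumption-laden outline: `IterationResult`, `ContinuousSummary`, and `runContinuous` are hypothetical names, the real `runSingleIteration()` signature is not shown, and the loop is synchronous only for brevity (real LLM calls would likely be async).

```typescript
// Hypothetical result of one iteration.
interface IterationResult {
  accepted: number;   // candidates accepted in this iteration
  tokensUsed: number; // LLM tokens spent in this iteration
}

interface ContinuousSummary {
  iterations: number;
  totalAccepted: number;
  totalTokens: number;
}

// Hill-climbing driver: run one iteration at a time and stop at maxIterations or
// as soon as an iteration accepts nothing. Each runIteration call is assumed to
// re-baseline internally before searching for improvements.
function runContinuous(
  runIteration: (iteration: number) => IterationResult,
  maxIterations: number,
): ContinuousSummary {
  const summary: ContinuousSummary = { iterations: 0, totalAccepted: 0, totalTokens: 0 };

  for (let i = 1; i <= maxIterations; i++) {
    const result = runIteration(i);
    summary.iterations = i;
    summary.totalAccepted += result.accepted;
    summary.totalTokens += result.tokensUsed;
    if (result.accepted === 0) break; // no accepted candidates: stop early
  }

  return summary;
}
```

The iteration that accepts nothing still counts toward the totals, since its LLM usage was spent before the loop could know to stop.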

https://claude.ai/code/session_01SJkfKmU2V83UrCvgyH2JAD

Born14 commented Apr 5, 2026

Deferring until dirty count reaches 0 and the basic improve loop is stable.

What's merge-ready:

  • --continuous (hill-climbing iteration) — will cherry-pick when dirty = 0

What needs more work:

  • Directive file — not needed until discovery mode is active and we need to steer priorities
  • Prompt surface — marker strings are unvalidated against current gate code, needs design pass

Good research. Just not the priority right now. The priority is clearing the last 7 dirty scenarios and publishing v0.8.0 with clean sensors.
