/hunt jurisdiction system: configurable LLM-as-judge with strict/evidentiary/permissive policies #2094

@Hmbown

Full jurisdiction system: configurable judge LLM, three policies, trajectory-aware verdicts.

After #2092 (vocabulary, v0.8.45) and #2093 (verifier-preview wiring, v0.8.46), this issue lands the full Codex-style LLM-as-judge with codewhale-native jurisdictions. Absorbs and supersedes the original "port Codex goal system" framing.

Concept

A jurisdiction is the policy a judge applies to decide whether a quarry is statutorily hunted. Three built-in jurisdictions:

| Jurisdiction | What counts as hunted |
|--------------|-----------------------|
| strict       | Diff exists, tests added or updated, CI green (or local verifier equivalent). |
| evidentiary  | Diff cites files; agent shows changes; no contradiction with quarry. |
| permissive   | Agent declares done; judge sanity-checks. |
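
To make the three bars concrete, here is a minimal sketch of how they might differ on the same turn. Everything in it is an assumption for illustration: the `TurnEvidence` struct, its field names, and `meets_minimum_bar` do not come from the codebase, and in practice fields such as `contradicts_quarry` would be filled in by the judge LLM rather than computed mechanically.

```rust
/// The three built-in jurisdictions named in this issue.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Jurisdiction {
    Strict,
    Evidentiary,
    Permissive,
}

/// Hypothetical per-turn facts (not an existing type in the codebase).
#[derive(Debug, Clone, Copy, Default)]
pub struct TurnEvidence {
    pub diff_exists: bool,
    pub tests_touched: bool,
    /// CI green, or the local verifier equivalent.
    pub verifier_green: bool,
    pub cites_files: bool,
    /// Assumed to be decided by the judge LLM, not by this code.
    pub contradicts_quarry: bool,
    pub agent_declared_done: bool,
}

/// Returns true when the evidence clears the minimum bar for "hunted"
/// under the given jurisdiction, following the table above.
pub fn meets_minimum_bar(jurisdiction: Jurisdiction, e: &TurnEvidence) -> bool {
    match jurisdiction {
        Jurisdiction::Strict => e.diff_exists && e.tests_touched && e.verifier_green,
        Jurisdiction::Evidentiary => e.diff_exists && e.cites_files && !e.contradicts_quarry,
        Jurisdiction::Permissive => e.agent_declared_done,
    }
}
```

With a turn that has a diff and cited files but no tests and no green CI, this sketch passes under evidentiary and permissive and fails under strict, which is the kind of observable difference the acceptance criteria ask for.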

Each turn ends with a judge call. The judge sees the quarry, the current trajectory summary, this turn's evidence, and the full diff against origin/main (or against session start). The judge returns:

struct JudgeRuling {
    verdict: HuntVerdict,
    reasoning: String,
    next_step: NextStep,  // continue | handoff | abandon | declare_hunted
}
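
The struct above references `HuntVerdict` and `NextStep` without defining them. A self-contained sketch might look like the following. The `HuntVerdict` variants are assumed from the `/why hunted`, `/why wounded`, and `/why escaped` commands, the `NextStep` variants from the comment in the struct, and `is_consistent` is a hypothetical invariant that the issue does not specify.

```rust
/// Assumed verdict set, inferred from the `/why <verdict>` commands.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HuntVerdict {
    Hunted,
    Wounded,
    Escaped,
}

/// What the runtime should do after a ruling.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NextStep {
    Continue,
    Handoff,
    Abandon,
    DeclareHunted,
}

impl NextStep {
    /// Parse the snake_case spelling used in the issue text.
    pub fn parse(s: &str) -> Option<Self> {
        match s.trim() {
            "continue" => Some(Self::Continue),
            "handoff" => Some(Self::Handoff),
            "abandon" => Some(Self::Abandon),
            "declare_hunted" => Some(Self::DeclareHunted),
            _ => None,
        }
    }
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct JudgeRuling {
    pub verdict: HuntVerdict,
    pub reasoning: String,
    pub next_step: NextStep,
}

impl JudgeRuling {
    /// Hypothetical sanity check (an assumption, not part of the spec):
    /// a ruling may only declare the quarry hunted if its verdict is `Hunted`.
    pub fn is_consistent(&self) -> bool {
        self.next_step != NextStep::DeclareHunted || self.verdict == HuntVerdict::Hunted
    }
}
```

A check like `is_consistent` could guard against a malformed judge response before the runtime writes a trophy, though whether the real implementation validates rulings this way is not stated here.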

In scope (v0.9.0)

  • config.toml [hunt] section: jurisdiction = "evidentiary" (default), judge_model = "auto" (defaults to the session model with a judge-only system prompt), judge_max_tokens = 4096, judge_temperature = 0.0.
  • Judge runs at each turn boundary in hunt mode. Verdict logged inline in transcript between turns.
  • next_step actions:
    • continue — model gets a system-message hint of judge reasoning, continues.
    • handoff — runtime suggests a sub-agent and lists the species + brief.
    • abandon — session marked escaped, no trophy.
    • declare_hunted — trophy written, session may auto-close per config.
  • /hunt jurisdiction <strict|evidentiary|permissive> switches mid-hunt.
  • /why hunted, /why wounded, and /why escaped show the judge's last reasoning.
  • Judge prompt template under crates/tui/src/prompts/judge.txt — small, focused, source-controlled.

Out of scope

  • Trained verifier models (out of scope for the whole of v0.9.0; the judge remains LLM-as-judge).
  • Per-statute custom jurisdictions in config (RFC; lands later if asked for).
  • Judge-as-sub-agent species (the judge isn't a whale — it's the court).

Acceptance

  • Three jurisdictions selectable and observably different in behavior on the same test quarry.
  • Judge prompt is auditable and source-controlled.
  • Verdict + reasoning render inline in transcript.
  • /why <verdict> returns the most recent judge reasoning.
  • Trophy card includes the jurisdiction the hunt was decided under.
  • Eval harness has at least one regression test per jurisdiction asserting the verdict shape.
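
One way the `/why <verdict>` criterion could be backed is a small append-only log of rulings. The sketch below assumes that `/why wounded` returns the most recent reasoning whose verdict was wounded. The issue leaves open whether the lookup filters by verdict or simply returns the last ruling, so treat that as one possible reading. `RulingLog` and its methods are hypothetical names.

```rust
/// Assumed verdict set, inferred from the `/why <verdict>` commands.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HuntVerdict {
    Hunted,
    Wounded,
    Escaped,
}

/// Hypothetical append-only record of judge rulings for one session.
#[derive(Debug, Default)]
pub struct RulingLog {
    entries: Vec<(HuntVerdict, String)>,
}

impl RulingLog {
    /// Record the verdict and reasoning from one turn boundary.
    pub fn record(&mut self, verdict: HuntVerdict, reasoning: impl Into<String>) {
        self.entries.push((verdict, reasoning.into()));
    }

    /// Most recent reasoning recorded with the given verdict, if any.
    pub fn why(&self, verdict: HuntVerdict) -> Option<&str> {
        self.entries
            .iter()
            .rev()
            .find(|(v, _)| *v == verdict)
            .map(|(_, reasoning)| reasoning.as_str())
    }
}
```

A regression test per jurisdiction could then record rulings from a fixed quarry and assert on what `why` returns, without depending on the exact wording the judge model produces.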

Closes / partially closes

Notes

  • The judge isn't a sub-agent species. Whales hunt; the judge is the court. Treat as a distinct primitive.
  • LLM-as-judge has known failure modes (sycophancy toward the agent, brittleness on edge cases). The three-jurisdiction split is a partial mitigation: strict is mechanical (CI gates aren't LLM-decided), permissive is honest about its laxness, and evidentiary is where the judging actually happens.

Replaces #2058. Final piece of the hunt trilogy: #2092 (vocabulary, v0.8.45) → #2093 (verifier wiring, v0.8.46) → this (v0.9.0).

Labels: bug, enhancement, rust
