diff --git a/AGENTS.md b/AGENTS.md
index cebd973..26e2578 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -14,20 +14,32 @@

 ## Terminology

-{/* Add product-specific terms and preferred usage */}
-{/* Example: Use "workspace" not "project", "member" not "user" */}
+- Use **control plane** for the FastAPI runtime authority on `enoch-core`.
+- Use **worker gate** for the worker-side reliability check (older code/config names may say `wake_gate`; treat that as compatibility naming).
+- Use **Dashboard V2** for the current React/TypeScript operator shell at `/control/dashboard-v2`.
+- Use **Research Facility** for the auditable candidate generation/admission/promotion lane.
+- Use **corpus** for the public generated-artifact repository (`enoch-ai-research-corpus`).
+- Use **promising signals** for the bounded no-paper export repository (`enoch-promising-signals`).
+- Use **paper-positive decision gate** (or **decision gate**) for the `finalize_positive` check that gates paper writing.
+- Use **operator lanes** (Write, Finalize, Publish/import, Published/imported, Done/no paper) for the public workflow vocabulary; raw states like `draft_review` and `publication_draft` are compatibility/detail only.
+- Use **strict claim/evidence audit** to refer to the claim-ledger-based audit; use **packaging/provenance lint** for the publication-hygiene gate.

 ## Style preferences

-{/* Add any project-specific style rules below */}
-
-- Use active voice and second person ("you")
-- Keep sentences concise — one idea per sentence
-- Use sentence case for headings
-- Bold for UI elements: Click **Settings**
-- Code formatting for file names, commands, paths, and code references
+- Use active voice and second person ("you").
+- Keep sentences concise — one idea per sentence.
+- Use sentence case for headings.
+- Bold for UI elements: Click **Settings**.
+- Code formatting for file names, commands, paths, and code references.
+- Every operational page should answer: What is this? When do I use it? What command/API/path is authoritative? What does healthy look like? What does blocked look like? How do I recover safely?
+- Prefer diagrams, tables, and step-by-step runbooks over prose dumps.
+- Avoid unsupported marketing claims; every claim must be source-grounded.

 ## Content boundaries

-{/* Define what should and shouldn't be documented */}
-{/* Example: Don't document internal admin features */}
+- Document the public API surface, operator workflow, deployment path, configuration, and corpus/provenance rules.
+- Do not document private LAN hostnames, internal IPs, live secrets, or machine-specific credentials.
+- Do not describe private-only repos as public unless verified.
+- Do not preserve stale roadmap/backlog pages if they are misleading.
+- When a capability cannot be verified from the source repos, either remove it or mark it clearly as planned/future.
+- Keep legacy compatibility paths (Notion sync, Supabase Cloud naming) clearly labeled as compatibility/historical only.
diff --git a/concepts/control-plane.mdx b/concepts/control-plane.mdx
index c8b2795..f2bbfb6 100644
--- a/concepts/control-plane.mdx
+++ b/concepts/control-plane.mdx
@@ -23,7 +23,7 @@ When the store backend is Postgres, the same operator questions are answered fro

 ## What the dashboard shows first

-The redesigned control dashboard is a professional operator shell. It leads with simple questions:
+Dashboard V2 (`/control/dashboard-v2`) is the canonical React/TypeScript operator shell, merged to main as of 2026-05-21. It leads with bounded operator cards:

 - What needs my attention?
 - What is running?
@@ -32,7 +32,7 @@ The redesigned control dashboard is a professional operator shell. It leads with
 - What is ready to publish?
 - What is already imported?

-The shell uses cards, tables, search, pagination, and collapsed debug panels. Raw JSON, raw lifecycle labels, and artifact previews stay in drill-down views.
+The shell uses cards, tables, search, pagination, and collapsed debug panels. Raw JSON, raw lifecycle labels, and artifact previews stay in drill-down views. All operator lanes consume bounded `/control/api/v1/*` read models.

 ## Read-model semantics

diff --git a/concepts/research-facility.mdx b/concepts/research-facility.mdx
new file mode 100644
index 0000000..44ced6b
--- /dev/null
+++ b/concepts/research-facility.mdx
@@ -0,0 +1,136 @@
+---
+title: "Research Facility"
+description: "The auditable lane for generating research candidates, scoring, admission, and queue promotion — intentionally separate from dispatch."
+---
+
+# Research Facility
+
+The Research Facility is the auditable lane for generating research ideas before they enter the worker queue. It is intentionally separate from dispatch: a generated candidate is not work until it is admitted and recorded with an admission reason.
+
+The facility answers two questions the old ad-hoc process could not answer reliably:
+
+1. Where did this idea come from?
+2. Why did it get queued?
+
+For current runtime topology, storage authority, and bounded-tick boundaries, see
+[current runtime snapshot](/current-runtime-snapshot).
+
+## Operator model
+
+```text
+sources (arXiv, GitHub, blogs, prior Enoch results, user batches, generated hypotheses)
+ -> research candidates (raw proposals before admission)
+ -> dedupe / history comparison
+ -> score (novelty + feasibility + accessibility + falsifiability)
+ -> admission decision (admitted / rejected / merged / needs-review)
+ -> optional enoch.ideas / projects / queue_items promotion
+ -> run / decision / paper/no-paper lineage
+```
+
+## Ledgers
+
+The Research Facility uses four dedicated ledgers in the control-plane database. None of them dispatch work by themselves.
+
+| Ledger | Table | Purpose | Dispatches? |
+| --- | --- | --- | --- |
+| Source | `enoch.research_sources` | External/internal source evidence: arXiv, GitHub, blogs, HN/X, prior Enoch results, user/ChatGPT batches, generated hypotheses. | No |
+| Candidate | `enoch.research_candidates` | Raw generated proposals before admission. Stores hypothesis, mechanism, baseline, success threshold, kill condition, artifacts, evidence, cost, failure modes, novelty comparison, dedupe key, and score. | No |
+| Admission | `enoch.research_admissions` | Immutable explanation for admitted/rejected/merged/needs-review decisions. This is the answer to "why did this get queued?" | No |
+| Lineage | `enoch.research_lineage` | Connects source -> candidate -> idea -> project -> run -> decision -> paper/no-paper -> follow-up candidate. | No |
+
+Promotion into runtime work still happens through the existing runtime ledgers (`enoch.ideas`, `enoch.projects`, `enoch.queue_items`).
+
+## Generation modes
+
+Candidates are generated through explicit, constrained modes. Each mode has required grounding and a specific scoring emphasis:
+
+| Mode | Required grounding | Scoring emphasis |
+| --- | --- | --- |
+| `fresh_grounded` | At least one source reference. | External grounding, novelty, falsifiability. |
+| `followup_from_negative` | Parent project/run lineage. | Explains what changed from a prior negative/mixed result. |
+| `moonshot` | Crisp falsifiable test despite low feasibility. | High novelty/accessibility, strong kill condition. |
+| `implementation_gap` | Practical gap in a paper/repo/system. | Feasible experiment and baseline clarity. |
+| `paper_replication_extension` | Paper/source lineage. | Bounded replication plus nontrivial extension. |
+| `home_hardware_accessibility` | Local/home AI impact. | Accessibility delta and hardware cost. |
+| `manual_import` | User/operator supplied. | Complete test contract, dedupe, and score. |
+
+Database checks enforce grounding for the two easiest-to-abuse modes: `fresh_grounded` must include source evidence; `followup_from_negative` must include parent lineage.
+
+## Candidate contract
+
+A candidate must be a testable research proposal, not a vague idea. Required fields include:
+
+- `hypothesis`
+- `mechanism`
+- `baseline_to_beat`
+- `success_threshold`
+- `kill_condition`
+- `expected_artifacts`
+- `required_evidence`
+- `estimated_runtime_class`
+- `expected_token_budget`
+- `machine_target`
+- `likely_failure_modes`
+- `novelty_comparison` (when similar prior projects exist)
+
+The deterministic planner rejects candidates that miss the core contract, lack required grounding, look like shallow incremental sludge, or attempt to re-run known negatives without explaining the new mechanism or evidence.
+
+## Dashboard workflow
+
+The dashboard Research Facility panel separates concerns:
+
+- **Generate smoke batch** — dry-runs first, then writes only Research Facility source/candidate/admission/lineage rows.
+- **Generate provider batch** — dry-runs a provider quota preflight first. The live action spends one provider request, then writes only Research Facility source/candidate/admission/lineage rows.
+- **Promote selected candidate** — dry-runs first, then promotes exactly one already-admitted candidate into `enoch.ideas`, `enoch.projects`, and `enoch.queue_items`.
+- **Run bounded cycle** — the first policy-gated automation layer. Requires an explicit live `enabled` flag, checks provider budget, can spend one provider request, can promote up to one admitted candidate, can optionally dispatch at most one selected queued item, and can optionally draft/finalize at most one paper.
+
+## Bounded cycle endpoint
+
+```text
+POST /control/api/research/run-cycle
+```
+
+Default policy:
+
+```json
+{
+ "enabled": false,
+ "max_provider_requests_per_run": 1,
+ "max_promotions_per_run": 1,
+ "max_dispatches_per_run": 0,
+ "wait_for_completion": false,
+ "max_paper_drafts_per_run": 0,
+ "max_publication_rewrites_per_run": 0,
+ "min_admission_score": 72,
+ "require_budget_ok": true,
+ "stop_if_queue_active": true,
+ "stop_if_dashboard_attention": true
+}
+```
+
+Live calls must set `enabled: true`. Dry-runs do not spend provider requests or write rows. Paper drafting/finalization is disabled by default and must be explicitly bounded.
+
+## Provider-backed generation
+
+The provider-backed generation endpoint (`POST /control/api/research/generate-provider-batch`) is fail-closed:
+
+1. Query provider quota first.
+2. Refuse generation if remaining credit/rolling request reserve is too low.
+3. Dry-run without spending a provider request.
+4. Live-run at most the bounded candidate count.
+5. Score candidates through the deterministic planner.
+6. Persist only Research Facility ledgers with `queue_admitted = false`.
+
+Do not store provider API keys in config files or command-line arguments. Use an HTTP proxy integration where available.
+
+## Guardrails
+
+- The Research Facility tables do not dispatch work by themselves.
+- Runtime queue mutation is idempotent and refuses to overwrite in-flight queue rows.
+- Dedupe uses a stable `dedupe_key`; duplicate keys in the same batch are rejected.
+- Similar prior projects require `novelty_comparison`.
+- Candidate lineage is recorded before queue promotion.
+
+## Relationship to intake
+
+The Research Facility is the upstream generation lane. [Idea intake](/guides/idea-intake) describes how scouted ideas become control-plane rows. A candidate must pass admission before it can become an idea row; an idea row must be queued before it can dispatch. Each transition has its own audit ledger.
diff --git a/docs-audit-notes.md b/docs-audit-notes.md
new file mode 100644
index 0000000..26b9958
--- /dev/null
+++ b/docs-audit-notes.md
@@ -0,0 +1,79 @@
+# Docs audit notes — 2026-05-21
+
+This file summarizes what was reconciled during the enoch-docs audit against the three source repositories and what remains uncertain.
+
+## Source repos used as authority
+
+| Source repo | Key files/paths used |
+|---|---|
+| `alias8818/enoch-agentic-research-system` | `README.md`, `CHANGELOG.md`, `VERSION` (0.3.0), `config.example.json`, `pyproject.toml`, `docs/research-facility.md`, `docs/dashboard-v2-todo-2026-05-21.md`, `docs/current-runtime-snapshot.md`, `docs/system-workflow.md`, `docs/state-model.md`, `docs/idea-intake-workflow.md`, `enoch_control_plane/` source tree |
+| `alias8818/enoch-ai-research-corpus` | `README.md`, `docs/provenance-policy.md`, `docs/quality-gates.md`, `docs/reproducibility.md`, `quality/claim_evidence_audit.json`, `quality/quality_report.json`, `papers/` directory listing (388 paper dirs counted) |
+| `alias8818/enoch-promising-signals` | `README.md`, `data/manifest.json` (519 records), `docs/export-policy.md`, `schemas/promising-signal.schema.json`, `scripts/validate.py`, `scripts/validate_public_trust_surfaces.py` |
+
+## Pages reconciled
+
+| Page | Action | Reason |
+|---|---|---|
+| `AGENTS.md` | **Updated** | Replaced empty Mintlify placeholder comments with actual project terminology, style preferences, and content boundaries derived from the docs |
+| `concepts/control-plane.mdx` | **Updated** | Referenced "Dashboard V2" instead of generic "redesigned control dashboard"; added read-model note |
+| `guides/idea-intake.mdx` | **Updated** | Added Research Facility to intake stages diagram; added cross-reference link |
+| `reference/promising-signals.mdx` | **Updated** | Added current export counts (519 signals, status breakdown, curation buckets) verified from promising-signals repo manifest.json; added regeneration commands; added source file links |
+| `docs.json` | **Updated** | Added `concepts/research-facility` to Core Concepts nav group |
+| `concepts/research-facility.mdx` | **Added** | New page covering the Research Facility: ledgers, generation modes, candidate contract, bounded cycle, provider-backed generation, guardrails |
+| `docs-audit-notes.md` | **Added** | This file |
+| All other existing pages | **Kept** | No changes needed; content verified against source repos |
+
+## Verification results
+
+### Stale-term grep
+
+- `wake_gate`: Appears as actual config field names (`worker_wake_gate_url`, `wake_gate_url`) and is correctly marked as compatibility naming where appropriate. No stale standalone usage.
+- `omx_wake_gate`: Appears only under historical/compatibility-only section of `current-runtime-snapshot.mdx`. No stale public references.
+- `TODO` / `FIXME`: None found in any .mdx file.
+- `coming soon`: None found.
+- Private LAN IPs (192.168.x, 10.x, 172.16-31.x): None found in docs text (only in SVG path data in logo files).
+- Internal hostnames (`enoch-core.exe.xyz`): Not present in any docs file.
+- Placeholder screenshots: All 5 images reference real PNG files in `images/`.
+
+### Link validation
+
+- 21 MDX files checked
+- All cross-page links resolve (21 nav entries validated against file paths)
+- All image references resolve to existing PNG files
+- No broken internal links detected
+- 0 errors, 2 minor warnings (Supabase Cloud regex false positives — all instances are correctly marked as "not current" or "compatibility")
+
+### Markdown lint
+
+- Basic frontmatter validation: all 21 .mdx files have valid `---` delimited YAML with `title` field
+- No code fence balance issues detected
+
+## Remaining uncertain or stale items
+
+1. **Corpus reproducibility.md stale count**: The corpus repo's `docs/reproducibility.md` still references "496/496" packaging/provenance and "3/496" strict audit. The actual quality reports (`quality/`) show 388/388. This is in the corpus repo, not the docs repo, so it was noted but not fixed here.
+
+2. **AGENTS.md first-time setup note**: The `> **First-time setup**: Customize this file...` line at the top of AGENTS.md remains — this is a standard Mintlify template instruction and is appropriate to keep.
+
+3. **Hugging Face export**: The `evidence-and-artifacts.mdx` page references a Hugging Face export (`data/artifacts.jsonl`). This surface was not verified during this audit (no access to the HF dataset). Claims about HF export format could not be source-verified.
+
+4. **Dashboard screenshots**: The 5 dashboard screenshots in `images/` reference specific UI states. These were not visually verified against the live V2 dashboard — they are assumed to be representative snapshots.
+
+5. **Paper/artifact counts**: The docs consistently reference 388/388 for both packaging/provenance and strict claim/evidence. This was verified against `quality/claim_evidence_audit.json` in the corpus repo. Counts can change on re-import.
+
+## Validation commands run
+
+```bash
+# Python-based equivalent of validate-docs.mjs:
+# - 21 MDX files found
+# - Frontmatter validation: all pass
+# - Link validation: 0 broken internal links
+# - Image reference validation: all resolve
+# - Nav coverage: all 20 nav entries resolve to existing files
+
+# Stale-term grep suite:
+# - wake_gate: present as field name, correctly marked compatibility
+# - omx_wake_gate: only in historical section
+# - TODO/FIXME/coming soon: none found
+# - Private IPs: none found in text
+# - Internal hostnames: none found
+```
diff --git a/docs.json b/docs.json
index a13b85c..38c3d8c 100644
--- a/docs.json
+++ b/docs.json
@@ -69,7 +69,8 @@
 "pages": [
 "concepts/control-plane",
 "concepts/worker-gate",
- "concepts/evidence-and-artifacts"
+ "concepts/evidence-and-artifacts",
+ "concepts/research-facility"
 ]
 },
 {
diff --git a/guides/idea-intake.mdx b/guides/idea-intake.mdx
index fc76cfd..b942b2c 100644
--- a/guides/idea-intake.mdx
+++ b/guides/idea-intake.mdx
@@ -14,14 +14,14 @@ Notion was an earlier intake and prioritization surface. Treat it as legacy comp
 ## Intake stages

 ```text
-External technical signals
- -> candidate idea card
- -> weighted scoring / prioritization
+Research Facility (source scan -> candidate -> dedupe -> score -> admission)
 -> control-plane idea row
 -> control-plane project + queue item
 -> Enoch control-plane execution
 ```

+See [Research Facility](/concepts/research-facility) for the full candidate generation, admission, and promotion flow.
+
 ## What an idea card should contain

 - working title
diff --git a/reference/promising-signals.mdx b/reference/promising-signals.mdx
index 999a95a..9fcbe64 100644
--- a/reference/promising-signals.mdx
+++ b/reference/promising-signals.mdx
@@ -13,10 +13,42 @@ For current runtime source-of-truth details, see the [Current Runtime Snapshot](

 Promising signals are generated records for Enoch runs with deterministic control-plane statuses such as `useful_signal`, `promising_if_scaled`, or `compute_scale_blocked`. They preserve small local signals, stopped follow-ups, and larger-compute next-step ideas.

+The current export (v1 schema) contains **519 deterministic, contract-clean signals** exported from the Enoch control plane.
+
+Status breakdown:
+- `useful_signal`: 452
+- `compute_scale_blocked`: 67
+
+Deterministic curation buckets:
+- Top external-researcher candidates: 129
+- Compute-scale blocked: 67
+- Follow-up recommended: 212
+- Weak/local-only preserved: 61
+- Likely stale/low-value archive: 50
+
+Start with the generated [ranked index](https://github.com/alias8818/enoch-promising-signals/blob/main/signals/ranked-index.md). The machine-readable source of truth is [`data/signals.jsonl`](https://github.com/alias8818/enoch-promising-signals/blob/main/data/signals.jsonl), ranking metadata is in [`data/ranking.json`](https://github.com/alias8818/enoch-promising-signals/blob/main/data/ranking.json), and count/status accounting is in [`data/manifest.json`](https://github.com/alias8818/enoch-promising-signals/blob/main/data/manifest.json).
+
+Ranking is deterministic. Bucket labels and scores are derived only from exported fields: evidence strength, hypothesis status, source lineage, compute-scale status, follow-up metadata/depth, and local evidence artifact references. No LLM review or manual judgment is accepted as ranking truth.
+
 ## What it is not

-These entries are not validated papers, not peer reviewed, not publication-positive, and not part of the public paper corpus. They are preservation records for future inspection or larger-compute follow-up.
+These entries are not validated papers, not peer reviewed, not publication-positive, and not part of the public paper corpus. They are preservation records for future inspection or larger-compute follow-up. Public evidence has not been copied for these records (`public_evidence_copied: false` in all records).

 ## Public release rule

 Public linking is allowed only when the generated JSONL, schema validator, public trust validator, and release-bundle validation pass. A promising signal can become a paper only through a separate future run that independently satisfies the paper-positive publication gate.
+
+## Regeneration
+
+The exporter lives in the system repo:
+
+```bash
+python3 scripts/export_promising_signals.py --output-repo ../enoch-promising-signals --clean-only
+```
+
+Validate the generated repository with:
+
+```bash
+python3 scripts/validate.py
+python3 scripts/validate_public_trust_surfaces.py
+```
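
The status breakdown and the curation buckets added to `reference/promising-signals.mdx` should each add up to the 519 exported signals. The following is a minimal Python sketch of that arithmetic check, assuming only the figures quoted in the patch. The dictionary keys and the `check_partition` helper are hypothetical names chosen for illustration, and the script does not read `data/manifest.json`, whose field layout is not shown in this diff.

```python
# Figures copied from the promising-signals page in this patch.
# Assumption: they are hard-coded here for illustration; the authoritative
# values live in data/manifest.json, whose schema is not shown in the diff.
TOTAL_SIGNALS = 519

STATUS_COUNTS = {
    "useful_signal": 452,
    "compute_scale_blocked": 67,
}

# Hypothetical key names; the page lists these buckets by display label only.
BUCKET_COUNTS = {
    "top_external_researcher_candidates": 129,
    "compute_scale_blocked": 67,
    "follow_up_recommended": 212,
    "weak_local_only_preserved": 61,
    "likely_stale_low_value_archive": 50,
}


def check_partition(name, counts, total):
    """Return True when the per-category counts add up to the total."""
    subtotal = sum(counts.values())
    if subtotal != total:
        print(f"{name}: counts sum to {subtotal}, expected {total}")
        return False
    return True


if __name__ == "__main__":
    status_ok = check_partition("status", STATUS_COUNTS, TOTAL_SIGNALS)
    bucket_ok = check_partition("bucket", BUCKET_COUNTS, TOTAL_SIGNALS)
    print("counts consistent" if status_ok and bucket_ok else "counts inconsistent")
```

A check like this only confirms that the numbers on the page agree with each other. It is not a substitute for `scripts/validate.py`, which validates the generated repository itself.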