fix(dream): audit backlinks without mutating pages #1027
Open
sliday wants to merge 1 commit into
garrytan added a commit that referenced this pull request May 18, 2026
The dream/autopilot maintenance cycle ran the backlinks phase in 'fix' mode, which writes "Referenced in" timeline bullets into entity pages every sync. The graph extractor + auto-link path is the canonical link store during sync/dream/autopilot — the legacy filesystem fixer wrote markdown that fought with both the user's manual edits and the graph layer's own timeline. Cycle now runs backlinks in 'check' mode (audit-only); the materializer remains available via `gbrain check-backlinks fix` for users who really want markdown backlinks committed to disk. Cherry-picked from PR #1027. Co-Authored-By: sliday <sliday@users.noreply.github.com>
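The difference between the two modes described above can be sketched roughly as follows. This is a minimal illustration, not gbrain's actual code: the `Page` and `Gap` shapes, the `findGaps` and `runBacklinks` names, and the `[[slug]]` wikilink test for an existing backlink are all assumptions made for the example. Only the 'check' / 'fix' mode names and the "Referenced in" wording come from the commit message.

```typescript
// Hypothetical shapes for illustration only; not gbrain's real types.
type BacklinkMode = 'check' | 'fix';

interface Page {
  slug: string;
  content: string;
  refs: string[]; // slugs of entity pages this page mentions
}

interface Gap {
  source: string; // page that mentions the entity
  target: string; // entity page missing a backlink to the source
}

// Audit step: list (source, target) pairs where the target page has no
// backlink to the source. Never writes to any page.
function findGaps(pages: Page[]): Gap[] {
  const bySlug = new Map<string, Page>();
  for (const page of pages) bySlug.set(page.slug, page);

  const gaps: Gap[] = [];
  for (const source of pages) {
    const seen = new Set<string>(); // at most one gap per target per source
    for (const targetSlug of source.refs) {
      const target = bySlug.get(targetSlug);
      if (!target || seen.has(targetSlug)) continue;
      seen.add(targetSlug);
      // Assumed backlink format: a wikilink to the source page.
      if (!target.content.includes(`[[${source.slug}]]`)) {
        gaps.push({ source: source.slug, target: targetSlug });
      }
    }
  }
  return gaps;
}

// 'check' only reports gaps; 'fix' also appends a "Referenced in" bullet
// to each target page, mutating its content.
function runBacklinks(pages: Page[], mode: BacklinkMode): Gap[] {
  const gaps = findGaps(pages);
  if (mode === 'fix') {
    for (const gap of gaps) {
      const target = pages.find((p) => p.slug === gap.target);
      if (target) target.content += `\n- Referenced in [[${gap.source}]]`;
    }
  }
  return gaps;
}
```

Under these assumptions, a maintenance cycle that calls `runBacklinks(pages, 'check')` gets the same gap report as before but leaves every page's content untouched, so a working tree that was clean before the cycle stays clean after it.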
7 tasks
garrytan added a commit that referenced this pull request May 19, 2026
* fix(sync): accept .tf / .tfvars / .hcl in CODE_EXTENSIONS Terraform repos were invisible to `gbrain sync --strategy code` because the three HCL-family extensions never reached the file walker. Silent data loss — the user thinks the sync covered the repo but the IaC layer was dropped on the floor. detectCodeLanguage() returns null for these extensions, so the chunker falls back to recursive (no tree-sitter grammar for HCL) — the same path toml/yaml take. Closes #878. Co-Authored-By: johnybradshaw <johnybradshaw@users.noreply.github.com> * fix(upgrade): run `bun update gbrain` from Bun's global install root `gbrain upgrade --strategy bun` was failing on canonical `bun install -g github:garrytan/gbrain` installs because `execSync('bun update gbrain')` ran in the user's shell cwd. Bun's update operates on whatever package.json it finds via cwd-walk, so a user not standing in the global root got "No package.json, so nothing to update". resolveBunGlobalRoot() returns the right directory: 1. `$BUN_INSTALL/install/global` when set (operator override). 2. `~/.bun/install/global` (Bun's documented default). 3. Walk up from realpath(argv[1]) looking for `node_modules/gbrain` — handles non-standard installs without trusting argv naming. execFileSync replaces execSync (no shell), with cwd pinned. Error path prints the exact `cd && bun update` recovery command instead of a vague hint. Closes #1029. Cherry-picked from PR #1032. Co-Authored-By: mvanhorn <mvanhorn@users.noreply.github.com> * fix(config): redact sensitive values in `config set` output (closes #892) `gbrain config set openai_api_key sk-...` was echoing the full key to stderr via `console.log('Set %s = %s', key, value)`. Shell scrollback and tmux scroll buffers commonly retain stderr for hours; a screen-share or shoulder-glance during set leaked the secret. The `show` path already redacted but used a naive `.includes('key')` substring check that would mask 'monkey' or 'parsekey' (no false-negative but ugly). 
Single source of truth: `isSensitiveConfigKey()` uses a word-boundary regex (`(^|[._-])(key|secret|token|password|pwd|passwd|auth)([._-]|$)/i`) so 'openai_api_key' matches but 'monkey' doesn't. `redactConfigValue()` composes the postgresql:// URL redactor + sensitive-key check, used by both `show` and `set`. Helpers exported for unit tests. Closes #892. Cherry-pick of @sharziki's PR #918 (config.ts hunk only — the extract.ts walker change in that PR is unrelated and tracked in #202). Co-Authored-By: sharziki <sharziki@users.noreply.github.com> * fix(oauth): throw InvalidTokenError so bearerAuth returns 401, not 500 `verifyAccessToken` was throwing bare `Error` on expired or invalid tokens. The MCP SDK's `requireBearerAuth` middleware catches `InvalidTokenError` and returns 401 with WWW-Authenticate; bare Error falls through to 500. Result: legitimate clients with stale tokens hit 500-not-401, so token-refresh logic (which keys off 401) never fires. Two call sites in verifyAccessToken: token-expired path and invalid-token path. Both now throw InvalidTokenError. Existing tests continue to pass because they assert on the throw, not the message class. Closes #935. Cherry-picked from PR #1012. Co-Authored-By: Aashiqe10 <Aashiqe10@users.noreply.github.com> * fix(serve): return 405 on GET /mcp instead of 404 MCP Streamable HTTP spec says GET /mcp opens an optional SSE backchannel for server-initiated messages. gbrain's transport is stateless and doesn't push server-initiated messages, so per spec we MUST return 405 with Allow: POST, DELETE — not 404. Probing clients (claude.ai, etc.) distinguish "endpoint exists, no SSE channel" from "endpoint missing" on this status code; 404 makes them give up. Cherry-picked from PR #1076. 
Co-Authored-By: lukejduncan <lukejduncan@users.noreply.github.com> * fix(doctor): resolve whoknows fixture from module location, not cwd `gbrain doctor` warned about a missing whoknows fixture for every install that wasn't standing in the gbrain source repo at run time — which is everyone. The check used `process.cwd()` to locate the fixture, so any real user (running doctor against `~/.gbrain`) saw a spurious warning. `resolveWhoknowsFixturePath()` walks up from `import.meta.url` looking for the source-repo signature (`src/cli.ts` + `skills/RESOLVER.md`), respects `GBRAIN_WHOKNOWS_FIXTURE_PATH` env override (absolute or cwd-relative), and returns null with an actionable warning when the fixture can't be located. Closes #969. Cherry-picked from PR #1034. Co-Authored-By: mvanhorn <mvanhorn@users.noreply.github.com> * fix(frontmatter): centralize --fix backups under ~/.gbrain/backups/ `gbrain frontmatter validate --fix` and `gbrain frontmatter generate --fix` wrote `<file>.bak` siblings into the source tree. Users running gbrain over a brain repo found .bak files scattered through people/, companies/, etc. that broke gitignore expectations and showed up in `git status` after every fix pass. Backups now land under `~/.gbrain/backups/frontmatter/<run-id>/<rel>.bak` with an iso-week-sorted run-id so a multi-fix session keeps the same parent directory. Backup directory + per-file structure mirrored from the original file's relative path. The .bak safety contract is intact for both git and non-git brain repos. Also adds `--include-catch-all` opt-in to `frontmatter generate` so the default catch-all rule (`type: note`) is no longer applied to arbitrary workspace documents that happen to live under a brain root. Closes #902. Cherry-picked from PR #903. 
Co-Authored-By: 100yenadmin <100yenadmin@users.noreply.github.com> * fix(config): use path.isAbsolute() for GBRAIN_HOME on Windows The GBRAIN_HOME validator rejected every valid Windows path (`C:\\Users\\...`, `D:\\gbrain`, etc.) because it used `trimmed.startsWith('/')` to check for absoluteness — only POSIX absolute paths pass that. `path.isAbsolute()` is the cross-platform check. Same fix for the `..` traversal check: split on both `/` and `\` so Windows path separators don't sneak `..` through. Closes #1019. Cherry-picked from PR #1083. Co-Authored-By: sharziki <sharziki@users.noreply.github.com> * fix(ai): warn only for the configured embedding provider, not all recipes Gateway construction was warning on stderr for every recipe with an embedding touchpoint missing max_batch_tokens — including providers the brain isn't using. Users on Voyage saw noise about OpenAI / Google / DashScope / etc. recipes that never get loaded. Filter the warning to recipes whose provider id is referenced by `embedding_model` or `embedding_multimodal_model` in the active config. The structural protection against forgetting max_batch_tokens stays in place for the recipes that actually run; the noise for unrelated recipes goes away. Cherry-picked from PR #1117. Co-Authored-By: hnshah <hnshah@users.noreply.github.com> * fix(sync): skip git pull when repo has no origin remote `gbrain sync` ran `git pull` unconditionally and printed scary stderr on every cycle for brains that have no `origin` remote (local-only workflows, single-machine setups, brains initialized via `gbrain init --pglite` against an arbitrary directory). The pull failed harmlessly but the noise was confusing and made operators think sync was broken. `hasOriginRemote()` probes `git remote get-url origin` with stdio ignored; on failure (`no such remote`), skip the pull, print a single informational line, and proceed with the local working tree. Cherry-picked from PR #1119. 
Co-Authored-By: hnshah <hnshah@users.noreply.github.com> * fix(query): drain cache writes before CLI exit The query cache write was fired with `void promise.catch(...)` — true fire-and-forget. On a fast CLI invocation (`gbrain query <q>` exits in ~50ms), the process terminates before the cache write commits. Result: the cache effectively never warms from CLI use; every query is a miss. `awaitPendingSearchCacheWrites()` tracks each in-flight cache write in a module-level Set. The CLI dispatcher awaits the set after `query` finishes formatting output but before the process exits. MCP server path unchanged (long-lived process, fire-and-forget remains correct). Cherry-picked from PR #1125. Co-Authored-By: hnshah <hnshah@users.noreply.github.com> * fix(backlinks): dedupe (source, target) pairs within a single source page A source page that mentions the same entity N times produced N duplicate "Referenced in" lines on the target. `extractEntityRefs` returns one EntityRef per occurrence, and the per-ref `hasBacklink` check reads a snapshot of `target.content` that's frozen at outer scope — so every iteration sees "no backlink yet" and appends another gap. The cumulative effect on a long meeting note with multiple mentions of the same person was visible in PRs landing 3-5 identical Timeline entries. Track seen target slugs per source page; cap gaps at one pair. Cherry-picked from PR #967 with a current-master regression test covering both markdown-link and Obsidian-wikilink formats in the same source page. Co-Authored-By: p3ob7o <p3ob7o@users.noreply.github.com> * fix(dream): audit backlinks without mutating pages during cycle The dream/autopilot maintenance cycle ran the backlinks phase in 'fix' mode, which writes "Referenced in" timeline bullets into entity pages every sync. 
The graph extractor + auto-link path is the canonical link store during sync/dream/autopilot — the legacy filesystem fixer wrote markdown that fought with both the user's manual edits and the graph layer's own timeline. Cycle now runs backlinks in 'check' mode (audit-only); the materializer remains available via `gbrain check-backlinks fix` for users who really want markdown backlinks committed to disk. Cherry-picked from PR #1027. Co-Authored-By: sliday <sliday@users.noreply.github.com> * fix(autopilot --install): source ~/.zshenv before zshrc/bashrc zshenv is the canonical place for env vars in zsh on macOS — zshrc is sourced only for interactive shells, so vars exported in zshrc don't reach a non-interactive subprocess like the autopilot wrapper. Users who exported GBRAIN_DATABASE_URL, OPENAI_API_KEY, or ANTHROPIC_API_KEY in zshrc and assumed autopilot would inherit them hit silent missing- secret failures on the LaunchAgent. Source ~/.zshenv first (always reaches non-interactive shells per zsh docs), then fall back to ~/.zshrc / ~/.bashrc for users on other profile conventions. Cherry-picked from PR #966. Co-Authored-By: p3ob7o <p3ob7o@users.noreply.github.com> * fix(apply-migrations): return exit 0 on list/dry-run/up-to-date `gbrain apply-migrations list`, `gbrain apply-migrations --dry-run`, and the "All migrations up to date" path were returning from the async function but never calling `process.exit(0)`. The CLI dispatcher in cli.ts treated the implicit fall-through as exit 1 when the parent process inspected status via shell scripts, breaking automation that gates on `apply-migrations list && do-something`. Three call sites: list, dry-run, and the no-op path. All three now exit(0) explicitly. Cherry-picked from PR #1062. 
Co-Authored-By: nezovskii <nezovskii@users.noreply.github.com> * fix(sync): scope auto-embed to source on incremental syncs `gbrain sync --source-id X` triggered auto-embed for the affected slugs but `runEmbed` ran with no `--source` flag, so it fell back to the default source. For non-default-source syncs the page row lives at (sourceId, slug) — the embed code saw "Page not found" for the right slug under the wrong source, swallowed the error as best-effort, and the sync result reported `embedded: 0` for the wrong reason. `buildAutoEmbedArgs(slugs, sourceId)` is the new helper: when sourceId is set, prepends `--source X`. Exported for the regression test. Pairs with the upcoming source-id write-path audit (P1 #8). Cherry-picked from PR #1120. Co-Authored-By: hnshah <hnshah@users.noreply.github.com> * fix(query): honor source_id with no-expand for cross-source search Two related corrections: 1. `gbrain query --no-expand` parsed `--no-expand` as the literal key `no_expand` instead of negating the boolean `expand` param. Result: the flag was silently ignored and expansion always ran. Now any `--no-<key>` where `<key>` is a boolean param flips it false. 2. The `query` op's source-id resolution treated `ctx.sourceId` as authoritative, so an explicit per-call `source_id` was overridden by the federated read scope. Now per-call `source_id` wins; `source_id=__all__` is an explicit opt-out for local cross-source search. Cherry-picked from PR #1124. Co-Authored-By: hnshah <hnshah@users.noreply.github.com> * fix(doctor): child-table orphan detection (closes #1063) The autopilot orphans phase detects orphan PAGES (no inbound links via page-graph) but never scans FK-child tables. After a bulk delete or a pre-FK-migration code path, orphan rows can persist indefinitely in content_chunks, page_versions, tags, takes, raw_data, timeline_entries, or links — all declared ON DELETE CASCADE, so any orphan row is unexpected. 
`childTableOrphansCheck` enumerates 10 FK columns across 8 tables: - 8 NOT NULL columns (cascade): any value not in pages.id is an orphan. - 2 nullable SET NULL columns (links.origin_page_id, files.page_id): NULL is valid; only NOT-NULL-but-missing-in-pages counts. Surfaces paste-ready cleanup SQL when orphans are found. Cherry-picked from PR #1064. Co-Authored-By: vincedk-alt <vincedk-alt@users.noreply.github.com> * fix(autopilot,cycle): stop respawn-storm from steady-state 'partial' cycles Two compounding bugs under KeepAlive=true: 1. Autopilot tripped its circuit breaker on cycle.status === 'partial', not just 'failed'. 'partial' means at least one phase warned/failed while others ran — a soft signal, not fatal. On every cycle that warned, autopilot logged a failure and the supervisor respawned the worker. 2. The orphans phase emitted 'warn' when `count > 20` orphan pages. That threshold was tuned for small dev brains; on any corpus past a few hundred pages it fires every cycle in steady state. Together with bug 1, this produced visible respawn storms. Fix: - Autopilot trips only on cycle.status === 'failed'. - Orphans phase warns by ratio: orphans / total_pages > 0.5 (the real "your graph fell apart" signal), not by absolute count. Cherry-picked from PR #1113. Co-Authored-By: sergeclaesen <sergeclaesen@users.noreply.github.com> * fix(ai): reject partial embedding responses before indexing `embedSubBatch` only validated the FIRST embedding's dimension and never asserted the response length matched the input length. If a provider returned fewer embeddings than requested (rate-limit truncation, malformed response, etc.), the gateway silently indexed an offset-shifted result — every page after the missing index got the embedding of a different page's chunk. Two new guards: 1. `result.embeddings.length === texts.length` — fail loud if any count mismatch, with a paste-ready retry hint. 2. Validate dim on EVERY embedding, not just the first. 
Cherry-picked from PR #926. Co-Authored-By: 100yenadmin <100yenadmin@users.noreply.github.com> * fix(serve): admin register-client supports auth_code + PKCE public clients The admin dashboard's /admin/api/register-client endpoint hardcoded client_credentials and ignored grantTypes, redirectUris, and tokenEndpointAuthMethod. Result: you couldn't register a browser-based PKCE client (claude.ai Custom Connector, Cursor, etc.) through the dashboard — only confidential machine-to-machine clients worked. Pass grantTypes / redirectUris through to registerClientManual. When tokenEndpointAuthMethod === 'none', NULL out client_secret_hash so the SDK's clientAuth middleware skips the hash-vs-plaintext compare that would otherwise reject the no-secret PKCE flow. Cherry-picked from PR #1077. Co-Authored-By: lukejduncan <lukejduncan@users.noreply.github.com> * fix(extract-facts): treat slugs:[] as no-op, not unscoped full-walk `runExtractFacts` checked `opts.slugs && opts.slugs.length > 0` to decide between scoped and full-brain walk. Both `undefined` (caller omits → full walk intended) AND `[]` (sync no-op → zero work intended) fall through to the same `else` branch and triggered `engine.getAllSlugs()`. On a multi-thousand-page brain, the unintended full walk exceeded the autopilot-cycle ~600s timeout and dead-lettered the job — visible in production as `[cycle.extract_facts] start` followed by silence until `Autopilot stopping (cycle-failure-cap)`. Use presence (`opts.slugs !== undefined`), not truthiness, to distinguish the two modes. Empty array is a real incremental no-op. Closes #1096. Three regression cases in test/extract-facts-phase.test.ts: slugs=[] no-op, slugs=undefined still walks, slugs=['a'] walks just one. 
Co-Authored-By: navin-moorthy <navin-moorthy@users.noreply.github.com> * fix(serve): embed admin/dist into binary; serve from manifest (closes #1090) Pre-fix, /admin returned 404 on every globally-installed binary because serve-http.ts:780 resolved admin/dist via process.cwd(). The admin SPA files are checked into git but `bun build --compile` does NOT embed arbitrary directories — only assets imported via `with { type: 'file' }` ESM imports land in the compiled binary. Wire: - scripts/build-admin-embedded.ts walks admin/dist/, emits src/admin-embedded.ts with one `with { type: 'file' }` import per file + a manifest map (request path → resolved path + mime). Auto-invoked by `bun run build:admin`. - src/admin-embedded.ts is the auto-generated module. Bun resolves every file: import to a path that works at runtime inside the compiled binary (same pattern as src/core/chunkers/code.ts WASM imports). - serve-http.ts switches to two-tier resolution: cwd-relative admin/dist for dev (Vite hot-rebuild), embedded manifest otherwise. Embedded path reads bytes lazily and caches per-asset for the lifetime of the process. - scripts/check-admin-embedded.sh CI gate re-runs the generator and fails on drift (mirrors check-wasm-embedded.sh). PRs that rebuild admin/dist but forget to regenerate the embedded module fail loud. - package.json wires build:admin-embedded + check:admin-embedded. Closes #1090. * test(source-id): lock in routing regression coverage (closes #891 #978 #1078) Audit of every page write path (sync, embed, extract, dream, autopilot, wikilinks, tags, chunks) confirmed that sourceId already threads correctly through importFromContent → engine.putPage → SQL INSERT since v0.18.0. The original bug reports from #891, #978, #1078 were real at the time and got swept by the multi-source refactor; today's master is correct. This commit locks in that correctness with six PGLite regression cases (no Postgres fixture needed; runs in CI everywhere): 1. 
importFromContent({sourceId:"work"}) lands at source_id=work, not the silent 'default' fallback. 2. Two sources hold the same slug independently. 3. Omitting sourceId falls through to 'default' (legacy contract). 4. Chunks land under the requested source. 5. Tags land under the requested source. 6. FK integrity smoke (originally #1078). The earlier issue reports stay closed by the existing threading; this suite ensures any future refactor of the write path can't silently re-introduce the wrong-source-default bug. The 90-minute write-path audit budget from the plan resolves here. * fix(apply-migrations): unblock PGLite chain (closes #1100) `gbrain apply-migrations --yes` was wedging on the v0.11.0 (Minions) schema phase for PGLite installs. Two compounding bugs: 1. `apply-migrations` pre-flight schema-version warning connects to PGLite to read config.version, then disconnects. The brief lock hold races with downstream subprocess spawns that try to re-acquire it; the 30s lock timeout fires before the parent fully releases. Pre-flight is a *warning*; on PGLite it adds no information the orchestrators don't already handle. Skip the probe for PGLite. 2. v0.11.0 phase A spawned `gbrain init --migrate-only` as an execSync subprocess to apply schema migrations. PGLite is single-writer; the subprocess inherits HOME and tries to lock the same DB. On Postgres this works (concurrent connections OK); on PGLite it deadlocks. Route in-process for PGLite — create + connect + initSchema + disconnect directly, skipping the subprocess hop. Postgres keeps the legacy execSync path. Verified: fresh PGLite install now walks the full migration chain through v0.32.2 (Facts SoR) and lands "All migrations up to date" on re-run. Closes #1100. * fix(serve): bootstrap token env override + suppress flag (closes #1024) `gbrain serve --http` regenerated the admin bootstrap token on every restart and printed it to stderr. 
In supervisor-managed production deployments (LaunchAgent, systemd, k8s) every restart leaks the value into log aggregators and rotates the access for any agent that paste- copied it. Two new knobs: - **GBRAIN_ADMIN_BOOTSTRAP_TOKEN** env var: when set, used as the bootstrap secret instead of a fresh per-process token. Validated: must match `^[A-Za-z0-9_-]{32,}$` (32-char minimum), else refuse to start with a paste-ready generator hint. Failing closed beats silently accepting a weak token. - **--suppress-bootstrap-token** CLI flag: suppresses the printed token line entirely. Operator takes responsibility for tracking the value out-of-band. Startup banner now reflects the chosen source: - `Admin Token: suppressed` when the flag is set. - `Admin Token: from $GBRAIN_ADMIN_BOOTSTRAP_TOKEN` when env-sourced. - Full token print only when both are absent (default behavior, dev installs). Closes #1024. Co-Authored-By: billy-armstrong <billy-armstrong@users.noreply.github.com> * fix(config): migrate legacy 'provider' + 'model' to 'embedding_model' Pre-v0.32 docs and some community templates used a config shape: { "provider": "voyage", "model": "voyage-4-large" } The canonical shape (since the v0.31.12 gateway seam) is: { "embedding_model": "voyage:voyage-4-large" } Users on the legacy shape hit silent fallthrough to the hardcoded OpenAI default; sync + embed errored out with "OpenAI embedding requires OPENAI_API_KEY" regardless of their actual provider config. loadConfig() now translates the legacy keys at parse time: - emits a one-line stderr nudge with the paste-ready canonical key - preserves the rest of the config unchanged - skipped when `embedding_model` is already set (forward-compat) Closes #1086. Co-Authored-By: jeunessima <jeunessima@users.noreply.github.com> * chore(test): quarantine upgrade tests (process.env mutation) PR #1032's cherry-picked tests use the static-snapshot + try/finally pattern for env vars instead of the project's withEnv() helper. 
The test-isolation lint catches process.env mutations outside withEnv to prevent cross-test leakage in parallel runs. Renaming to *.serial.test.ts (the quarantine convention) is the documented out: runs sequentially, no cross-file race. A future cleanup PR can migrate the tests to withEnv() and drop the quarantine. * fix(test): update brain-writer .bak assertion for centralized backup path The v0.36.x frontmatter backup change (bd60cdf — closes #902) moved .bak files from sibling-of-source to ~/.gbrain/backups/frontmatter/... The old test still asserted on the sibling path, so CI failed even though the production behavior was correct. Updated assertion contract: backup lands under the injected backupRoot (test-isolated), the returned backupPath ends in .bak and exists, and no sibling .bak is created next to the source file. The pre-fix sibling-path is now a negative assertion. * chore: bump version and changelog (v0.36.1.0) v0.36.1.0 — community fix wave (28 atomic fixes + 22 PRs closed as already-shipped + 14 issues triaged). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * test(fix-wave): close test gaps surfaced by post-ship audit After the fix-wave shipped, an audit found 11 commits with no new test file. Some were inherently structural (build pipelines, shell content) or had existing test coverage that worked either way; others had real regression risk with no guard. This commit closes the gaps that matter. New regression tests for: - OAuth `verifyAccessToken` throws `InvalidTokenError` (not bare Error) on both expired and unknown token paths. Pre-fix, the SDK's `requireBearerAuth` middleware fell through to 500 instead of 401 → client token-refresh logic never fired (#935). - `loadConfig` translates legacy `{provider, model}` config shape to the canonical `embedding_model: <provider>:<model>`. 3 cases: pure legacy → migrated; canonical wins over legacy when both present; canonical-only is untouched. 
Pre-fix, Voyage/Cohere/Mistral users silently fell through to OpenAI (#1086). - `configDir` rejects relative paths; rejects `..` segments via both separators (regression guard for the Windows path acceptance fix #1019 / cherry-pick #1083). - `resolveBootstrapToken` (new exported helper extracted from `runServeHttp`). 9 cases: unset env generates fresh, valid env accepted, hyphens/underscores accepted, < 32 chars rejected, special chars rejected, whitespace trimmed, empty string rejected, 32-char boundary accepted, 31-char one-short rejected. Security-critical validation surface (#1024). - GET /mcp returns 405 with `Allow: POST, DELETE` (E2E case in `serve-http-oauth.test.ts`). Pre-fix, claude.ai and other probing MCP clients saw 404 and gave up (#1076). - apply-migrations `process.exit(0)` on list / dry-run / up-to-date paths. Source-shape assertion locks the rule in; shell scripts gating on `$?` work (#1062). - Autopilot wrapper sources `~/.zshenv` BEFORE `~/.zshrc`. zshenv is the canonical place for env vars in non-interactive zsh; without this ordering, LaunchAgent subprocesses never inherit secrets exported in zshrc (#966). - `test/fix-wave-structural.test.ts` consolidates source-shape regression guards for fixes whose behavior is hard to runtime-test without heavy mocking: query cache drain (#1125), admin embed manifest + handler (#1090), admin register-client PKCE branch (#1077), PGLite v0.11.0 phase A in-process routing (#1100), query `--no-expand` negation (#1124). 9 source-grep assertions. Refactored `runServeHttp` to extract `resolveBootstrapToken` as a pure helper. The boot path now consumes the helper's tagged-union result ({kind:'ok'|'error'}); side effects (`process.exit`, `console.error`) moved to the caller. Unit-testable without spinning up Express. Test counts: oauth 71 (was 69), config 20 (was 14), apply-migrations 19 (was 18), autopilot-install 5 (was 4), serve-http-bootstrap-token 9 (new file), fix-wave-structural 9 (new file). 
Net: +28 cases across 6 files; +1 new exported function with full coverage. Remaining audit gaps (deferred): - e82dda0 admin embed E2E (post-deploy curl smoke covers this) - d93fa81 apply-migrations PGLite chain E2E (already smoke-tested manually in the original commit; subprocess test would be flaky in CI without DATABASE_URL gating) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * test: close the two deferred E2E gaps from the post-ship audit Both gaps now have real behavior coverage. No DATABASE_URL needed (PGLite engine), so they run in standard unit CI alongside the rest of the suite. Serial quarantine because both spawn subprocesses + bind ports / write tmpdirs. test/admin-embed-spawn.serial.test.ts (4 cases, ~6s wall-clock): - Spawns `gbrain serve --http` from a fresh tmpdir so `process.cwd()/ admin/dist` does not exist — this forces the embedded-manifest branch (the one under test). Pre-fix, this exact setup hit 404. - GET /admin/ → 200 + SPA shell HTML (title + #root div), content-type text/html. - GET /admin/index.html → same body via explicit path. - GET /admin/agents → SPA fallback returns index.html for deep links. - GET /admin/api/stats → NOT 200 (regression guard: SPA fallback must not swallow /admin/api/* routes and silently return HTML to a JSON client). Closes #1090. test/apply-migrations-pglite-spawn.serial.test.ts (3 cases, ~25s): - Seeds a fresh PGLite config in a tmpdir, runs `gbrain init --migrate-only` + `gbrain apply-migrations --yes --non-interactive`. Pre-fix this hit "GBrain: Timed out waiting for PGLite lock" because apply-migrations' pre-flight probe + v0.11.0's phase A subprocess both wanted the single-writer lock. - Asserts exit 0, no "Timed out" string, no "Phase A failed" string, brain.pglite file written. - Re-run case: idempotent — "All migrations up to date" exits 0 (also locks in the #1062 exit-code fix end-to-end). - --list path exits 0 (third leg of the #1062 contract). Closes #1100. 
Pinned bootstrap token via GBRAIN_ADMIN_BOOTSTRAP_TOKEN env so the admin test doesn't have to scrape stderr; the startup banner format is allowed to drift, the /health probe is the readiness contract. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(test): consolidate PGLite spawn test to one end-to-end pass CI failed on test/apply-migrations-pglite-spawn.serial.test.ts (Ubuntu, bun 1.3.14). The previous shape ran 3 tests × ~3 spawns each. Each `bun run /abs/src/cli.ts` from a tmpdir cwd pays a full parse/transpile cost (no near-cwd .bun cache); on Ubuntu CI that compounds past the runner's per-test budget. Consolidated to ONE test that exercises the full lifecycle in one brain: init --migrate-only → apply-migrations --yes → re-run → --list. Four spawns instead of eight. Local wall-clock: 32s → 11.5s. All four assertion buckets preserved: no PGLite lock timeout, no Phase A failure, brain.pglite written, idempotent re-run "All migrations up to date" exits 0 (#1062 end-to-end), --list exits 0. Per-test timeout 480_000ms as insurance against the runner's --timeout=60000 default (bun's API spec: per-test wins). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * test(diag): dump apply-migrations output when CI exit != 0 The PGLite spawn test passes locally on macOS/bun 1.3.13 in ~11s end-to-end but fails on Ubuntu/bun 1.3.14 in 4.92s with apply.exitCode = 1 — fast enough that something is failing early, not timing out. The runCli helper captured stdout+stderr but never printed them, so the CI log only showed the bare assertion failure. This commit prints the captured streams from BOTH init and apply when the exit code mismatches expectation. After the next CI run we can read the actual error message and diagnose the Ubuntu-specific failure mode (likely BUN_INSTALL / HOME / PGLite WASM env quirk). No behavior change; pure diagnostic output gate on failure. 
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(test): shim `gbrain` on PATH for PGLite spawn test Root cause of the Ubuntu CI failure: the v0.11.0 orchestrator's phase B runs `execSync('gbrain jobs smoke')`. PGLite phase A now routes in-process (the #1100 fix), but phase B and several follow-up phases still shell out to the `gbrain` binary on PATH. Locally the binary resolves via `bun link`; on CI Ubuntu it does not exist on PATH, so execSync exits 127 → orchestrator returns 'failed' → apply-migrations exits 1. Test failed at 4.92s with exitCode=1, well before any timeout. Verified locally by removing ~/.bun/bin/gbrain to simulate CI: pre-shim: apply.exitCode=1 (same as CI) post-shim: apply.exitCode=0 in 8.4s The shim writes a tiny `gbrain` executable to a tmpdir that just `exec`s `bun run <repo>/src/cli.ts "$@"`. Prepended to PATH for the spawned subprocesses. Mirrors the production contract (gbrain on PATH) without depending on `bun link` having run in the CI image. Diagnostic dump from the previous commit stays — useful insurance for the next time something silently fails inside a spawned binary. 
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: johnybradshaw <johnybradshaw@users.noreply.github.com> Co-authored-by: mvanhorn <mvanhorn@users.noreply.github.com> Co-authored-by: sharziki <sharziki@users.noreply.github.com> Co-authored-by: Aashiqe10 <Aashiqe10@users.noreply.github.com> Co-authored-by: lukejduncan <lukejduncan@users.noreply.github.com> Co-authored-by: 100yenadmin <100yenadmin@users.noreply.github.com> Co-authored-by: hnshah <hnshah@users.noreply.github.com> Co-authored-by: p3ob7o <p3ob7o@users.noreply.github.com> Co-authored-by: sliday <sliday@users.noreply.github.com> Co-authored-by: nezovskii <nezovskii@users.noreply.github.com> Co-authored-by: vincedk-alt <vincedk-alt@users.noreply.github.com> Co-authored-by: sergeclaesen <sergeclaesen@users.noreply.github.com> Co-authored-by: navin-moorthy <navin-moorthy@users.noreply.github.com> Co-authored-by: billy-armstrong <billy-armstrong@users.noreply.github.com> Co-authored-by: jeunessima <jeunessima@users.noreply.github.com> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
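As an illustration of the `config set` redaction fix in the commit list above, the word-boundary check can be sketched like this. The regex is copied from the commit message; the function bodies, the `***` mask, and the shape of the postgresql:// URL redactor are assumptions for the example, not the project's actual implementation.

```typescript
// Regex taken from the commit message: a sensitive word must sit at the
// start or end of the key, or be delimited by '.', '_' or '-'.
const SENSITIVE_KEY = /(^|[._-])(key|secret|token|password|pwd|passwd|auth)([._-]|$)/i;

function isSensitiveConfigKey(key: string): boolean {
  return SENSITIVE_KEY.test(key);
}

// Assumed behavior: mask the whole value for sensitive keys; otherwise
// mask only the password part of a postgres connection URL, if present.
function redactConfigValue(key: string, value: string): string {
  if (isSensitiveConfigKey(key)) return '***';
  return value.replace(/^(postgres(?:ql)?:\/\/[^:@\/]+:)[^@]+@/, '$1***@');
}
```

With this sketch, `isSensitiveConfigKey('openai_api_key')` is true while `isSensitiveConfigKey('monkey')` and `isSensitiveConfigKey('parsekey')` are false, which is the distinction the naive `.includes('key')` check could not make.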
Summary
`gbrain check-backlinks fix`

Test Plan
- `bun test test/core/cycle.serial.test.ts --timeout 60000`
- `bun test test/e2e/dream-cycle-phase-order-pglite.test.ts --timeout 60000`
- `bun run build`
- `bun run src/cli.ts dream --phase backlinks --dir /Users/stas/Dropbox/gbrain --json` left the gbrain markdown repo at dirty_before=0 dirty_after=0
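One way to obtain numbers like `dirty_before` and `dirty_after` is to count the entries reported by `git status --porcelain` before and after the run. The sketch below assumes that is what the counts mean; `countDirtyPaths` is a hypothetical helper and not part of gbrain.

```typescript
// Hypothetical helper: count changed or untracked paths from the text
// printed by `git status --porcelain` (one path per non-empty line).
function countDirtyPaths(porcelainOutput: string): number {
  return porcelainOutput
    .split('\n')
    .filter((line) => line.trim().length > 0).length;
}
```

Capturing the porcelain output once before and once after the `dream --phase backlinks` run and comparing the two counts would show whether the phase wrote anything into the repo; equal zero counts match the result reported in the test plan.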