V0.1.3/m2/smoke test#9

Open
Vedansi18 wants to merge 66 commits into hi0001234d:main from Vedansi18:v0.1.3/m2/smoke-test

@Vedansi18
Collaborator

The integration branch that makes the full chat-history → advisory →
webview round-trip actually run inside Cursor / Windsurf. Wires
together all the pieces built in B1-B4: extension activation now
starts the chat-history watcher (consent-gated, host-aware,
multi-workspace-enumerated), each captured user-prompt event drives
nexpath auto → nexpath stop → view-provider, and the manual
smoke-test procedure is documented for engineer verification against
a real host.

Stacked on B4 (v0.1.3/m2/cursor-windsurf-adapters, commit
f6a916b) which already has B1+B2+B3+M1 merged in — all prerequisite
contracts available in one tree.

Module covered (per dev plan §3 M2 §2.2)

| Module | Spec | Status |
| --- | --- | --- |
| M13: Local round-trip smoke test on dev machine | Code wiring + manual procedure document | ✅ Code-side complete + SMOKE-TEST.md shipped. Manual run is the acceptance gate (engineer runs it separately) |

What B5 actually delivers (code-side)

| File | Purpose | Tests |
| --- | --- | --- |
| src/ext-vscode/src/path-enumerator.ts | enumerateStateVscdbPaths(workspaceStorageDir): walks per-workspace subdirs, returns existing state.vscdb paths. Injectable fs for tests. | 8 |
| src/ext-vscode/src/chat-pipeline.ts | createChatEventHandler(deps) orchestrating spawnAuto → spawnStop → publishPayload with 3 independent try/catch blocks (never propagates exceptions to the watcher) | 7 |
| src/ext-vscode/src/onboarding.ts (mod) | Exported CONSENT_KEY so extension.ts reads the same globalState key the onboarding writes | |
| src/ext-vscode/src/extension.ts (rewritten) | Full activate() flow: detect host → register view provider → onboarding → consent-gated watcher start-up with per-host paths + workspace-prefixed session IDs. deactivate() stops watcher. | 18 |
| src/ext-vscode/src/extension.test.ts (rewritten) | 18 tests with vi.hoisted mocks for all new imports + callback wiring tests | |
| src/ext-vscode/src/ipc.ts (mod) | Added cwd? to IpcOptions. spawnAuto passes workspace cwd to the spawn process options. spawnStop now sends the FULL StopPayload shape Layer C expects: {session_id, cwd, hook_event_name: 'Stop', stop_hook_active: false}. | 16 (+5 vs B1) |
| src/ext-vscode/SMOKE-TEST.md | Step-by-step manual procedure for the engineer to run against live Cursor / Windsurf, with verification table to paste back as acceptance evidence | |
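
A minimal sketch of what a handler with three independent try/catch blocks could look like. The `ChatEvent`, `Payload`, and `PipelineDeps` shapes, the `logError` dependency, and the choice to stop after the first failing stage are assumptions for illustration. They are not taken from chat-pipeline.ts.

```typescript
// Hypothetical shapes; the real types live in the extension's own modules.
interface ChatEvent { prompt: string; sessionId: string; }
interface Payload { advisory: string; }
interface PipelineDeps {
  spawnAuto: (prompt: string, sessionId: string) => Promise<void>;
  spawnStop: (sessionId: string) => Promise<Payload | null>;
  publishPayload: (payload: Payload) => void;
  logError: (stage: string, err: unknown) => void;
}

function createChatEventHandler(deps: PipelineDeps) {
  return async (event: ChatEvent): Promise<void> => {
    // Stage 1: failures are logged, never rethrown to the watcher.
    try {
      await deps.spawnAuto(event.prompt, event.sessionId);
    } catch (err) {
      deps.logError("auto", err);
      return;
    }
    // Stage 2: isolated from stage 1 so the log names the failing stage.
    let payload: Payload | null = null;
    try {
      payload = await deps.spawnStop(event.sessionId);
    } catch (err) {
      deps.logError("stop", err);
      return;
    }
    if (payload === null) return;
    // Stage 3: a broken view must not take the watcher down either.
    try {
      deps.publishPayload(payload);
    } catch (err) {
      deps.logError("publish", err);
    }
  };
}
```

Keeping the stages separate means the returned promise always resolves, so the watcher callback can invoke it without its own error handling.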

The full activate() flow now

activate(context)
├─ detectHost() (B4)
├─ Construct view provider with injectFn-aware onSelect (B3 + B4)
├─ Register view provider with vscode.window (B3)
├─ await showOnboardingIfNeeded(context) (B1)
├─ Gate: consent === true, else skip-return (NEW B5)
├─ Gate: host !== 'vscode-generic', else skip-return (NEW B5)
├─ workspaceStorageDir + enumerateStateVscdbPaths (B4 + NEW B5)
├─ Gate: dbPaths.length > 0, else skip-return (NEW B5)
├─ Build chat-pipeline handler (NEW B5)
│ - spawnAuto/spawnStop curried with workspace cwd
│ - composeSessionId prefixes with workspace fsPath
├─ createChatHistoryWatcher with onEvent/onError/onSchemaUnknown
├─ watcher.start() + push stop-disposable on subscriptions
└─ Done
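
The three gates in the flow above can be read as one pure decision function, sketched here under assumptions: only `'vscode-generic'` comes from the description, while the other `Host` values, the `reason` strings, and the `::` separator in `composeSessionId` are hypothetical.

```typescript
// 'cursor' and 'windsurf' are assumed host ids; 'vscode-generic' is from the flow above.
type Host = "cursor" | "windsurf" | "vscode-generic";

interface ActivationInputs {
  consent: boolean | undefined; // value read from globalState under CONSENT_KEY
  host: Host;
  dbPaths: string[];            // result of enumerateStateVscdbPaths
}

type GateResult =
  | { start: true }
  | { start: false; reason: "no-consent" | "generic-host" | "no-databases" };

function evaluateWatcherGates(input: ActivationInputs): GateResult {
  if (input.consent !== true) return { start: false, reason: "no-consent" };
  if (input.host === "vscode-generic") return { start: false, reason: "generic-host" };
  if (input.dbPaths.length === 0) return { start: false, reason: "no-databases" };
  return { start: true };
}

// Prefix the raw session id with the workspace path so ids from
// different workspaces cannot collide. The separator is an assumption.
function composeSessionId(workspaceFsPath: string, rawSessionId: string): string {
  return `${workspaceFsPath}::${rawSessionId}`;
}
```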

hi0001234d and others added 30 commits May 5, 2026 20:43
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…nt sets

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ofile

Adds selectAbsenceMap helper (hardcore_pro → formal, else → casual) and
updates resolveDecisionContent to use it for non-vibe profiles. Updates
6 no-profile tests and 2 priority-override tests to assert casual variants,
consistent with selectNonBeginnerVariant's undefined → casual behaviour.
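
A minimal sketch of the routing rule this commit describes. The real helper presumably returns a content map; this version returns only the register name, and the `Register` type is an assumption.

```typescript
// Hypothetical return type: the register whose absence map should be used.
type Register = "formal" | "casual";

// hardcore_pro routes to formal; every other profile, including
// undefined (no profile), routes to casual.
function selectAbsenceMap(profile: string | undefined): Register {
  return profile === "hardcore_pro" ? "formal" : "casual";
}
```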

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add 3 hardcore_pro routing tests for remaining formal absence variants,
add all 8 non-beginner absence sets to allContent and per-set count tests,
add 4 beginner absence sets to C-02 structural validation block with
missing imports for ABSENCE_REGRESSION_CHECK_BEGINNER and
ABSENCE_SPEC_ACCEPTANCE_BEGINNER.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three changes to buildMjsScript() in TtySelectFn.ts:
- Fix _lineCount to split labels by \n and compute per-line visual rows
  instead of treating the whole label as one long string; this correctly
  estimates height for multi-line (numbered-steps) option labels
- Add wrapping guard: when all options fit within budget (no-overflow) but
  their total visual lines exceed the option count, recompute _maxItems
  using the option count as a tight budget ceiling so visually dense
  option sets always get a viewport rather than a flat list
- Pass maxItems to select() so @clack/prompts k() uses _maxItems as the
  viewport window instead of rows-4; without this, maxItems computation
  had no effect on the actual rendered UI

Applies to Mac, Windows, and Linux new-window paths (all share buildMjsScript).
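
The `_lineCount` fix can be illustrated with a small standalone function. This is a sketch, not the code in TtySelectFn.ts: the name `visualRowCount` is hypothetical, and it assumes plain text with no ANSI escapes or double-width characters.

```typescript
// Count the terminal rows a label occupies when each "\n"-separated
// line wraps independently at `columns` characters.
function visualRowCount(label: string, columns: number): number {
  const width = Math.max(1, columns); // guard against a zero or negative width
  return label
    .split("\n")
    .reduce((rows, line) => rows + Math.max(1, Math.ceil(line.length / width)), 0);
}
```

Treating the whole label as one long string would undercount here, because a short line still consumes a full row.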

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…word lists

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ded vibeKeyword sets

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds formal, casual, and beginner DecisionContent objects for
idea_scoping, idea_constraint_check, and idea_user_definition.
Includes content existence tests covering shape and non-empty strings.
Map wiring deferred to Phase 4.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds formal, casual, and beginner DecisionContent objects for
task_ordering, task_sizing, and task_definition_of_done.
Includes content existence tests covering shape and non-empty strings.
Map wiring deferred to Phase 4.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
hi0001234d and others added 30 commits May 7, 2026 22:07
…INNER

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…cture test

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… map

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…NG_CASUAL

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- 8 new absence signal definitions in signals.ts (scope_creep, context_loss, api_design_review, accessibility, environment_and_secrets, data_validation, ci_pipeline, rate_limiting)
- 24 new content set constants across 3 registers (formal, casual, beginner) in options.ts and options-beginner.ts
- All 8 keys added to ABSENCE_CONTENT, ABSENCE_CONTENT_CASUAL, ABSENCE_CONTENT_BEGINNER maps
- relevantProjectTypes filter on api_design_review, accessibility, rate_limiting signals
- AbsenceDetector.ts: Gate 3 now uses per-signal absenceThreshold with profile multiplier; project-type gate added
- types.ts: relevantProjectTypes field added to SignalDefinition
- auto.ts: projectType extracted from getProject() and passed to detectAbsenceFlags
- Tests: routing, content existence, structure, and buildOptionList coverage for all 8 new signals

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds the new src/agents/ module: four adapter interfaces (HookAdapter,
VSCodeExtensionAdapter, CLIWrapAdapter, BrowserExtensionAdapter), an
in-process registry (registerAdapter, detectAll, getAdapter), and an
empty index.ts placeholder for future adapter registrations. Unit tests
in registry.test.ts cover the registry behaviour.
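
A minimal sketch of an in-process registry with the three named functions. The `Adapter` shape, the duplicate-id check, and the choice to throw on an unknown id are assumptions, not the contents of src/agents/.

```typescript
// Hypothetical minimal adapter contract; the real interfaces are richer.
interface Adapter {
  id: string;
  detect(): Promise<boolean>;
}

const adapters = new Map<string, Adapter>();

function registerAdapter(adapter: Adapter): void {
  if (adapters.has(adapter.id)) {
    throw new Error(`Adapter already registered: ${adapter.id}`);
  }
  adapters.set(adapter.id, adapter);
}

function getAdapter(id: string): Adapter {
  const adapter = adapters.get(id);
  if (adapter === undefined) throw new Error(`Unknown adapter: ${id}`);
  return adapter;
}

// Run every adapter's detect() and return those that report present.
async function detectAll(): Promise<Adapter[]> {
  const found: Adapter[] = [];
  for (const adapter of Array.from(adapters.values())) {
    if (await adapter.detect()) found.push(adapter);
  }
  return found;
}
```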

Adds src/cli/commands/install.snapshot.test.ts plus its generated
baseline snapshot. The snapshot captures current installAction output
(settings.json bytes + stdout) with $HOME and platform-dependent
strings normalised so the snapshot is portable across machines. This
is the zero-diff safety net for M1 Branch 2 (claude-code refactor):
that branch must keep this snapshot byte-identical.
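
One way such normalisation might look, as a sketch only. The `<HOME>` placeholder and the CRLF handling are assumptions; the snapshot test may normalise other strings as well.

```typescript
// Replace machine-specific parts of captured output so the snapshot
// compares equal on any machine.
function normaliseForSnapshot(text: string, home: string): string {
  // An empty home would match between every character, so skip it.
  const withoutHome = home.length > 0 ? text.split(home).join("<HOME>") : text;
  // Line endings differ by platform; settle on LF.
  return withoutHome.replace(/\r\n/g, "\n");
}
```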

No existing source code is modified. Per dev plan §1.6 in
reviewduel-submodule.

Branch: v0.1.3/m1/foundation-scaffold

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Moves the six Claude Code hook helpers (getClaudeSettingsPath,
buildHookCommand, buildStopHookCommand, buildHookEntry, writeHookEntry,
removeHookEntry) from src/cli/commands/install.ts to
src/agents/adapters/claude-code.ts. Function bodies are byte-identical.
install.ts re-exports them so existing imports (and install.test.ts)
continue to work unchanged.

Adds claudeCodeAdapter (HookAdapter) that wraps the moved functions and
self-registers via src/agents/index.ts side-effect import.
installAction's Claude Code branch in the for-loop now delegates to the
adapter via getAdapter('claude-code').install(ctx).

Adds optional settingsPath override to InstallContext so callers can
decouple the target file path from ctx.home — preserves the pre-refactor
pattern where paths.claudeSettings was passed independently of homedir()
(used by install.test.ts to inject custom tmp paths without stubbing
HOME). Without this, tests would write hook entries to the real
~/.claude/settings.json instead of their tmp dir.
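
The override can be sketched as a single fallback. The field names follow the commit message; the path joining with `/` is a simplifying assumption.

```typescript
interface InstallContext {
  home: string;
  settingsPath?: string; // optional override, used by tests to target a tmp dir
}

// Prefer the explicit override; otherwise derive the path from home.
function resolveSettingsPath(ctx: InstallContext): string {
  return ctx.settingsPath ?? `${ctx.home}/.claude/settings.json`;
}
```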

Adds src/agents/adapters/claude-code.test.ts (18 unit tests) covering
the moved helpers + adapter contract (detect, settingsPath, buildHooks,
install, uninstall) + the settingsPath override behaviour.

Zero-diff invariant preserved: install snapshot from M1 Branch 1 remains
byte-identical. All 177 relevant tests pass. typecheck clean.

Branch: v0.1.3/m1/claude-code-refactor (off v0.1.3/m1/foundation-scaffold,
which sits on upstream/user-experience-improvements-sub-7).

Per dev plan §3.0 in reviewduel-submodule.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The original install.ts comment was a single line:
  // Register the advisory pipeline hook (separate from MCP — different file)

The previous M1/B2 commit (d93852e) expanded this into a four-line
comment explaining the adapter delegation. Per team feedback, comments
on existing pre-refactor code should be kept verbatim — the §1.5 strict
zero-diff guarantee includes comments on existing code.

No behavioural change. Tests + snapshot unchanged (177/177 pass, install
snapshot remains byte-identical with M1 Branch 1's baseline).

Branch: v0.1.3/m1/claude-code-refactor.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Branch 1 of Milestone M2 (v0.1.3/m2/extension-skeleton). Establishes the
src/ext-vscode/ sub-package with an esbuild-driven build pipeline
(ESM source -> CJS bundle for the VS Code host), activates on
onStartupFinished, and ships the four scoped modules:

  M1 - Skeleton: package.json (activationEvents, activity-bar container +
       placeholder view backed by viewsWelcome so the icon actually renders),
       tsconfig.json, esbuild.config.mjs, src/extension.ts entrypoint.
  M5 - IPC stub: src/ipc.ts. spawnAuto(prompt, sessionId) and
       spawnStop(sessionId) spawn the nexpath CLI as subprocesses and parse
       the decision-session JSON payload from stdout, with typed errors
       (NexpathBinaryNotFoundError, NexpathMalformedPayloadError) and
       configurable binary-path resolution
       (opts.binaryPath -> NEXPATH_BIN env -> 'nexpath' on PATH).
       The exact stdin envelope vs. Layer C input contract is intentionally
       a stub here; Branch 4 (cursor-windsurf-adapters) finalises it.
  M11 - Onboarding: src/onboarding.ts. First-launch consent toast persists
        the user's choice to globalState; on macOS, additionally shows a
        Full-Disk-Access guidance toast that deep-links to the System
        Settings privacy pane.
  M12 - Icon: media/icon.svg. Y-fork (branching path) representing
        "next path" decision points; monochrome currentColor, scalable.

25 unit tests co-located alongside source (8 onboarding, 11 ipc, 6 extension),
runnable via root vitest with vi.mock('vscode') stubs. Sub-package has its
own tsconfig + package-lock; root tsconfig now excludes src/ext-vscode/ so
each side owns its TS build. Both root and sub-package tsc --noEmit are
clean. Full root test suite: 1851 passing + 18 pre-existing unrelated
TtySelectFn Windows-sim failures (carried forward from dev plan §3.0).
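
The binary-path resolution order described for src/ipc.ts (opts.binaryPath, then NEXPATH_BIN, then 'nexpath' on PATH) could be sketched as below. Passing the environment in as a parameter and treating empty strings as unset are assumptions made for testability.

```typescript
interface IpcOptions {
  binaryPath?: string;
}

// Resolution order: explicit option, then NEXPATH_BIN, then the bare
// command name so the OS resolves it on PATH.
function resolveBinaryPath(
  opts: IpcOptions,
  env: Record<string, string | undefined>,
): string {
  return opts.binaryPath || env["NEXPATH_BIN"] || "nexpath";
}
```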

Deferred (flagged for follow-up, not blockers for this branch):
- 5 moderate npm-audit warnings in the esbuild -> vite -> vitest dev chain
  (dev-only; will be addressed during M5 hardening).
- IPC stdin envelope contract: real wiring lands in Branch 4.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Branch 2 of Milestone M2 (v0.1.3/m2/chat-history-capture). Stacked on
M2 Branch 1 (commit 879ed5e). Adds the three scoped modules:

  M2 - chat-history-watcher.ts: fs.watch on Cursor's state.vscdb and
       Windsurf's ~/.codeium/windsurf/ dir, debounced (default 250ms),
       reads ItemTable via injectable readItemTableFn (sql.js by default),
       diffs against seenSignatures, emits {prompt, rawSessionId,
       capturedAt, sourcePath, extractorId}. Dependency-injectable
       throughout (watchFn, readFileFn, readItemTableFn, nowFn) so the
       unit tests run without sql.js or real fs.watch.

  M3 - extractors/: four per-version row decoders implementing the
       ChatHistoryExtractor contract from chat-history-types.ts.
         - cursor-v2024-q4 (aiService.prompts global key, pre-Composer)
         - cursor-v2025-q1 (composerData.composerData, Composer era)
         - cursor-v2025-q2 (cursorAIChatService.chatHistory.<tabId>
                            per-tab keys, current)
         - windsurf (cascade.* placeholder; real Windsurf decoding lands
                     in Branch 4 alongside windsurfAdapter)
       Each Cursor extractor handles both `role`/`type` and
       `content`/`text` field variants seen across minor versions.
       All four are TODO-flagged for verification against real dumps
       before Branch 6 publishes — scripts/dump-cursor-state.ts (below)
       captures those dumps.

  M4 - pickExtractor in extractors/index.ts: prefix-match each
       extractor's fingerprintKeys against the observed ItemTable keys,
       pick the highest match count (ties broken by registry order =
       newest first). Returns FingerprintResult; unknown schemas surface
       observedSampleKeys for the "schema unknown" toast hook.
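
A sketch of the fingerprint selection described for M4. It counts how many of an extractor's fingerprintKeys prefix-match at least one observed key; whether the real code counts it that way, the `FingerprintResult` field names other than observedSampleKeys, and the sample size of 5 are assumptions.

```typescript
interface Extractor {
  id: string;
  fingerprintKeys: string[];
}

type FingerprintResult =
  | { kind: "matched"; extractorId: string; matchCount: number }
  | { kind: "unknown"; observedSampleKeys: string[] };

// `registry` is ordered newest first, so a strict ">" comparison keeps
// the newer extractor when two candidates tie.
function pickExtractor(registry: Extractor[], observedKeys: string[]): FingerprintResult {
  let best: Extractor | undefined;
  let bestCount = 0;
  for (const extractor of registry) {
    const count = extractor.fingerprintKeys.filter((prefix) =>
      observedKeys.some((key) => key.startsWith(prefix)),
    ).length;
    if (count > bestCount) {
      best = extractor;
      bestCount = count;
    }
  }
  if (best === undefined) {
    return { kind: "unknown", observedSampleKeys: observedKeys.slice(0, 5) };
  }
  return { kind: "matched", extractorId: best.id, matchCount: bestCount };
}
```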

scripts/dump-cursor-state.ts: dev-only helper (npx tsx) for capturing
state.vscdb fixtures from a machine with Cursor installed. Filters to
chat-related key prefixes, optional --redact for sensitive content.
Outputs to src/ext-vscode/test-fixtures/state-vscdb-samples/.

Sub-package additions:
  - dependencies: sql.js ^1 (runtime; loaded via dynamic import so wasm
    boot is lazy). Marked external in esbuild so the .vsix ships
    node_modules/sql.js rather than inlining it.
  - devDependencies: tsx ^4 (for running the dump script).

57 new unit tests (sub-package totals: 82 passing across 9 files):
  cursor-v2024-q4   9 tests
  cursor-v2025-q1  10 tests
  cursor-v2025-q2  11 tests
  windsurf          4 tests
  extractors/index 12 tests
  chat-history-watcher 11 tests

Verification: root tsc --noEmit clean; sub-package tsc --noEmit clean;
sub-package vitest 82/82 pass; full root test suite 1908 passing + 18
pre-existing TtySelectFn Windows-sim failures (carried forward from M1
§3.0, unrelated); esbuild bundle still builds out/extension.js.

Deferred to follow-up (flagged, not blockers):
- Real-dump verification of all 4 extractors (use dump-cursor-state.ts
  on machines with each Cursor version installed; replace TODO comments
  in extractors with fixture-driven regression tests).
- Windsurf JSON-file decoder (Branch 4).
- Wiring the watcher into extension.ts activate() (Branch 3 webview-ui
  or Branch 4 adapters — depends on UI surface integration).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Real-machine inspection on Cursor 3.4.20 (2026-05-15) surfaced three
issues with the Branch 2 extractor designs. This commit fixes the
verifiable ones, captures redacted fixtures, and documents the still-
unknown bits for the next round.

Issue 1 — SQLite WAL mode.
The dump script previously used sql.js, which only reads the buffer of
the main `.vscdb` file. Live Cursor writes go to the sibling
`.vscdb-wal` (185 KB while the main file was 4 KB), so sql.js saw
"no such table: ItemTable" even though the table exists.

Fix: switched the dump script to better-sqlite3 (native, WAL-aware).
Copies main + wal + shm siblings to a tmp staging dir before reading
so the live Cursor write path is never touched, then runs
`PRAGMA wal_checkpoint(TRUNCATE)` on the staged copy for consistency.
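
The staging step can be sketched as a pure planning function that lists which files to copy. The name `stagingPlan` and the `/` join are hypothetical; the actual copy and the checkpoint on the staged copy are left to the caller.

```typescript
interface CopyStep {
  from: string;
  to: string;
}

// SQLite in WAL mode keeps recent writes in "<db>-wal" with a shared
// index in "<db>-shm", so all three files are staged together.
function stagingPlan(dbPath: string, stagingDir: string): CopyStep[] {
  const base = dbPath.split(/[\\/]/).pop() ?? dbPath;
  return ["", "-wal", "-shm"].map((suffix) => ({
    from: dbPath + suffix,
    to: `${stagingDir}/${base}${suffix}`,
  }));
}
```

A caller would copy whichever source files exist, then open the staged main file and run the checkpoint there, leaving the live database untouched.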

The PRODUCTION watcher in `chat-history-watcher.ts` still uses sql.js
via dynamic import; the same WAL problem will surface when Branch 4
wires the watcher live. Flagged for Branch 4 design — options are:
(a) switch the watcher to better-sqlite3 (native binding in .vsix), or
(b) implement copy + checkpoint via sql.js. Out of scope for B2.

Issue 2 — `cursor-v2025-q1` extractor's fingerprint key was wrong.
Community docs said `composerData.composerData`; Cursor 3.4.20 actually
uses `composer.composerData`. Updated the key in both the extractor and
its tests + the fingerprint test.

Open finding: the `composer.composerData` value on a chat-less Cursor
3.4.20 workspace DB is metadata only (selectedComposerIds, migration
flags), not the conversation messages that this extractor's decodeRow
logic parses. Logic falls through cleanly (returns [] when the expected
`allComposers` field is absent) and the JSDoc now documents that the
real Composer message storage location is still TBD and needs a
post-chat snapshot to confirm.

Issue 3 — `cursor-v2025-q2` extractor's fingerprint prefix
(`cursorAIChatService.chatHistory.`) was NOT observed on Cursor 3.4.20.
The extractor still ships (in case older versions use it) but the JSDoc
now flags this as unverified and points to the dump script for capturing
a real fixture before Branch 6 ships.

Dump script additions:
- Discovers ALL state.vscdb under Cursor's config tree (global +
  per-workspace) — chat messages live in the workspace DB, not global.
- Dumps both `ItemTable` (filtered to chat-related key prefixes) AND
  `cursorDiskKV` (Cursor 3.x's parallel KV table; currently empty but
  may hold Composer messages once chats happen).
- One output JSON per discovered DB; suffixed with `global` or
  `workspace-<id>` for traceability.
- `--redact` replaces string values > 8 chars with same-length asterisks.
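
A sketch of the redaction rule over an already-parsed JSON value. The `Json` type is an assumption, and the script's handling of non-JSON input is not shown.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Strings longer than 8 characters become same-length runs of "*".
// Shorter strings and non-string values pass through unchanged.
function redactValue(value: Json): Json {
  if (typeof value === "string") {
    return value.length > 8 ? "*".repeat(value.length) : value;
  }
  if (Array.isArray(value)) {
    return value.map((item) => redactValue(item));
  }
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const [key, inner] of Object.entries(value)) {
      out[key] = redactValue(inner);
    }
    return out;
  }
  return value;
}
```

Preserving length keeps the fixture's shape recognisable while removing the content itself.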

Dependencies:
- Added better-sqlite3 ^11 + @types/better-sqlite3 ^7 as devDependencies
  in the sub-package. Dev-only — the production extension bundle is
  unaffected.

Captured fixtures (redacted) — all three DBs from a chat-less Cursor
3.4.20 session, committed for regression testing:
  - cursor-3-4-20-initial-global.json (9 rows)
  - cursor-3-4-20-initial-workspace-1778826246907.json (7 rows)
  - cursor-3-4-20-initial-workspace-empty-window.json (2 rows)

Verification:
- Sub-package tsc --noEmit clean.
- Sub-package vitest 82/82 pass.
- Root tsc --noEmit clean.
- Full root test suite 1908 passing + 18 pre-existing TtySelectFn
  carry-forward.

Next step (manual, user-driven): submit a real prompt in Cursor's Ask
mode AND in Composer mode, then re-run the dump script to capture a
chat-bearing snapshot. The new keys / tables that appear will pin down
the Composer-mode message storage location, and a follow-up commit will
finalise the extractor decode logic against that real data.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the unit-test audit gap surfaced for M2 Branch 2. The dump script
had real business logic (`redactValue`, `shouldKeepItemTable`,
`parseArgs`, `cursorConfigRoot`, `discoverAllStateVscdb`) with zero test
coverage — `redactValue` in particular has data-leak consequences if
buggy.

Extracted the pure / near-pure helpers into a new module:
  - `src/cursor-state-dump-helpers.ts` — lives under tsconfig rootDir
    so it's typechecked by the sub-package's main `tsc --noEmit` and
    auto-picked-up by vitest. Re-exports `KEEP_ITEMTABLE_PREFIXES`,
    `shouldKeepItemTable`, `redactValue`, `cursorConfigRoot`,
    `discoverAllStateVscdb` (with injectable fs helpers), and
    `parseArgs` (returns a tagged-union result instead of calling
    `process.exit`, so the error paths are testable).

Co-located tests: `src/cursor-state-dump-helpers.test.ts` — 28 tests
covering:
  - `shouldKeepItemTable`: each default prefix matched, unrelated keys
    dropped, custom prefix lists, prefix-not-exact match.
  - `redactValue`: short-string preservation, long-string redaction,
    nested object/array recursion, non-string value preservation, bulk
    redact for non-JSON input, JSON-string root, exact 9-char boundary.
  - `cursorConfigRoot`: linux / darwin / win32 / unknown-platform paths
    and APPDATA fallback.
  - `discoverAllStateVscdb`: empty tree, global-only, global + multiple
    workspaces, skip workspace dirs missing the DB, injectable fs.
  - `parseArgs`: required `--name`, optional `--src` / `--redact`,
    `--help` / `-h` signal, missing-value rejection, unknown-argument
    rejection.
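
The tagged-union approach to `parseArgs` might look like the following sketch. The flag names come from the test list above; the result field names, the error messages, and the rule that a value may not start with `-` are assumptions.

```typescript
type ParseResult =
  | { kind: "ok"; name: string; src: string | undefined; redact: boolean }
  | { kind: "help" }
  | { kind: "error"; message: string };

// Returns a result instead of calling process.exit, so every error
// path can be asserted on in a unit test.
function parseArgs(argv: string[]): ParseResult {
  let name: string | undefined;
  let src: string | undefined;
  let redact = false;
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === "--help" || arg === "-h") return { kind: "help" };
    if (arg === "--redact") {
      redact = true;
      continue;
    }
    if (arg === "--name" || arg === "--src") {
      const value = argv[i + 1];
      if (value === undefined || value.startsWith("-")) {
        return { kind: "error", message: `Missing value for ${arg}` };
      }
      if (arg === "--name") name = value;
      else src = value;
      i++; // skip the consumed value
      continue;
    }
    return { kind: "error", message: `Unknown argument: ${arg}` };
  }
  if (name === undefined) return { kind: "error", message: "--name is required" };
  return { kind: "ok", name, src, redact };
}
```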

Script entry-point `scripts/dump-cursor-state.ts` now imports from
`../src/cursor-state-dump-helpers.js` and retains only the I/O
orchestration (file copy to tmp staging dir, better-sqlite3 read, fixture
write). Behaviour is byte-for-byte unchanged — verified by re-running
against the live machine and producing identical row counts to the
previous commit (`3794bc3`).

Sub-package totals:
  - Test files: 10 (was 9)
  - Tests: 110 passing (was 82) — +28 helpers tests
  - Sub-package tsc --noEmit clean
  - Root tsc --noEmit clean
  - Full root suite: 1936 passing + 18 pre-existing TtySelectFn
    Windows-sim failures (M1 §3.0 carry-forward, unrelated)

Closes the only remaining audit gap for M2 Branch 2. No further unit-test
work pending; per the auto-commit rule the branch is now closed pending
push.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Branch 3 of Milestone M2 (v0.1.3/m2/webview-ui). Stacked on M2 Branch 1
(commit 879ed5e) — does NOT depend on M2 B2's watcher, only on B1's
skeleton + the DecisionSessionPayload type from ipc.ts. Delivers the
three scoped modules:

  M6 — WebviewViewProvider: src/webview/view-provider.ts.
       NexpathDecisionSessionViewProvider implements vscode.WebviewViewProvider
       for the nexpath.status activity-bar view. resolveWebviewView wires
       webview.options (enableScripts + localResourceRoots), sets initial
       HTML, registers onDidReceiveMessage + onDidDispose. publishPayload
       stores the payload, updates the HTML, and calls webviewView.show(true)
       for the auto-reveal UX matching architecture rev 2 §4. Payload
       survives view dispose/re-show. Injectable onSelect dependency for
       tests. Exposes getCurrentPayload() + handleMessage() for direct
       message-routing tests.

  M7 — HTML template: src/webview/html.ts.
       renderDecisionSessionHtml(payload, opts) — pure function, no I/O.
       Returns the full self-contained HTML for the webview. Two modes:
       empty/watching state (no scripts, just "Nexpath is active…") and
       populated state (advisory + numbered option buttons + dismiss).
       CSP: default-src 'none' with nonce-scoped scripts. All user-controlled
       strings HTML-escaped. Theming via --vscode-* CSS variables so the UI
       inherits Cursor/Windsurf's theme. Tests verify both states, nonce
       handling, HTML escaping (incl. <script> + onerror= injection attempts),
       and empty-options array.

  M8 — Prompt injection: src/webview/prompt-injection.ts.
       handleOptionSelection writes the selected option text to the system
       clipboard via vscode.env.clipboard.writeText + shows a non-modal info
       toast directing the user to paste. This is the ONLY reliable path —
       VS Code text-editing APIs target editor documents, not the host's
       (Cursor's) chat input panel (dev plan §2.4). Branch 4 may discover
       a Cursor-specific command that lets us write directly; until then
       clipboard + toast is the primary path. Injectable deps for tests.
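
M7's escaping requirement can be sketched with a conventional escape function. This is an illustration, not the code in html.ts; escaping the single quote is an addition made here for attribute safety.

```typescript
// Escape the characters that can break out of HTML text or attribute
// context. "&" is replaced first so later entities are not re-escaped.
function escapeHtml(input: string): string {
  return input
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}
```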

extension.ts updates:
  - Registers the view provider on activate via
    vscode.window.registerWebviewViewProvider(VIEW_ID, instance).
  - Pushes the registration disposable onto context.subscriptions for
    cleanup on deactivate.
  - Holds the provider at module scope; exposes via getViewProvider() so
    Branch 4's adapter wiring can publish payloads.
  - View provider registration runs BEFORE onboarding so the icon shows
    immediately on activation, even while consent toasts are open.
  - Onboarding errors still swallowed (per existing B1 behaviour).

package.json updates:
  - nexpath.status view now declares "type": "webview" (was implicit tree).
  - viewsWelcome entry removed — webview-type views render their own empty
    state from inside the webview HTML, not via viewsWelcome. The empty
    state in renderDecisionSessionHtml replaces it.

38 new unit tests:
  - html.test.ts: 13 (escapeHtml + empty state + populated state + nonce
    + HTML escaping in advisory/options + empty options array)
  - view-provider.test.ts: 14 (VIEW_ID + resolveWebviewView × 4 +
    publishPayload × 3 + clearPayload + handleMessage × 5)
  - prompt-injection.test.ts: 6 (clipboard write + toast + error paths +
    DI + empty string)
  - extension.test.ts: +5 (registration test + subscriptions push +
    getViewProvider + onboarding-rejects-but-view-still-registered +
    deactivate-clears-viewProvider)

Sub-package totals at branch HEAD: 63 tests across 6 files (was 25 in B1,
+38 here). Root tsc + sub-package tsc clean. Full root test suite 1889
passing + 18 pre-existing TtySelectFn carry-forward (unrelated).
Esbuild bundle grew from 3.4 KB → 11.0 KB (includes the new webview
modules + their CSS template strings).

Deferred (flagged, not blockers for this branch):
- Pre-prompt blocking on Cursor/Windsurf — current architecture only
  shows guidance after the host sends the prompt. Pre-send blocking
  would need a keybinding hijack (architecture doc §11 open question 7).
- Cursor-specific "write to chat input" command — discover in Branch 4
  if it exists, otherwise clipboard + toast remains the only path.
- E2E test against a real Cursor instance — Branch 5 (smoke-test) gate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two refinements after cross-confirmation review against the dev plan + a
read of the Layer C TTY UI in src/decision-session/TtySelectFn.ts.

## 1. injectFn contract — addresses Drift hi0001234d#3 (primary text-editing path)

prompt-injection.ts now defines:
  - `OptionInjector = (text: string) => Promise<boolean>` — the contract
    for a direct-injection function (agent-specific, lives in B4).
  - `PromptInjectionDeps.injectFn?` — optional adapter-supplied
    injector. B3 default is absent → clipboard fallback always wins.

handleOptionSelection now has two paths:
  1. If `deps.injectFn` is provided AND `injectFn(text)` resolves true:
     skip clipboard. Text is in the chat input. Done.
  2. Otherwise (injectFn absent, returned false, or threw): fall back
     to clipboard + info toast.
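
The two paths can be sketched as follows. `OptionInjector` and `injectFn` are from the commit message; `writeClipboard`, `showInfo`, the toast wording, and the string return value are hypothetical stand-ins for the vscode APIs so the sketch runs without the extension host.

```typescript
type OptionInjector = (text: string) => Promise<boolean>;

interface PromptInjectionDeps {
  injectFn?: OptionInjector;
  writeClipboard: (text: string) => Promise<void>; // stand-in for the clipboard API
  showInfo: (message: string) => void;             // stand-in for the info toast
}

async function handleOptionSelection(
  text: string,
  deps: PromptInjectionDeps,
): Promise<"injected" | "clipboard"> {
  if (deps.injectFn) {
    try {
      // Path 1: direct injection succeeded, so the clipboard is skipped.
      if (await deps.injectFn(text)) return "injected";
    } catch {
      // A throwing injector is treated like a `false` result.
    }
  }
  // Path 2: injector absent, returned false, or threw.
  await deps.writeClipboard(text);
  deps.showInfo("Option copied. Paste it into the chat input.");
  return "clipboard";
}
```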

B4 (cursor-windsurf-adapters / M9 + M10) will:
  - Discover Cursor / Windsurf command ids that write text to the
    AI chat input (via `vscode.commands.getCommands(true)`).
  - Implement `cursorChatInputInject` / `windsurfChatInputInject` of
    type OptionInjector.
  - Pass them through the view-provider constructor's onSelect arg as:
      const onSelect = (text) =>
        handleOptionSelection(text, { injectFn: cursorChatInputInject });

Decision saved to memory at
~/.claude/projects/-home-emptyops-Documents-Vedanshi-NexPathMain-reviewduel/memory/project_b4_prompt_injection_contract.md
— marked load-bearing (do not delete or rename the named symbols).
This guarantees the deferred work doesn't get forgotten in a future
session.

4 new unit tests in prompt-injection.test.ts:
  - injectFn returning true → clipboard NOT touched
  - injectFn returning false → falls back to clipboard
  - injectFn throwing → falls back to clipboard
  - injectFn absent → clipboard path (default B3 behaviour)

## 2. Keyboard shortcuts — addresses Drift hi0001234d#2 (Layer C UX consistency)

After reading TtySelectFn.ts, the relevant UX patterns to mirror:
  - Ctrl+X = opt-out / dismiss (matches Layer C's `\\x18` keypress
    handler at TtySelectFn.ts:128 + the install disclosure copy:
    "press Ctrl+X during an advisory")
  - Esc = standard web cancel (TTY doesn't have this but it's
    expected web UX)

Added to the webview HTML script:
  - keydown handler for Ctrl+X → dispatches `{type: 'dismiss'}`
  - keydown handler for Esc → dispatches `{type: 'dismiss'}`
  - keydown handler for digits 1-9 → dispatches `{type: 'select'}`
    against the Nth option (matches the visible numbering)
  - First option focused on render so keyboard users land on
    something actionable.
  - Visible kbd-hint text in the options header and on the dismiss
    button so the shortcuts are discoverable.
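
The key handling listed above can be expressed as a pure mapping from a key press to a message, which keeps it testable outside a browser. The `KeyInfo` shape and the `index` field on the select message are assumptions; the real message may carry an option id or label instead.

```typescript
type WebviewMessage = { type: "dismiss" } | { type: "select"; index: number };

// The subset of KeyboardEvent fields the mapping needs.
interface KeyInfo {
  key: string;
  ctrlKey: boolean;
}

function messageForKey(e: KeyInfo, optionCount: number): WebviewMessage | null {
  if (e.key === "Escape") return { type: "dismiss" };
  if (e.ctrlKey && e.key.toLowerCase() === "x") return { type: "dismiss" };
  if (!e.ctrlKey && /^[1-9]$/.test(e.key)) {
    const index = Number(e.key) - 1; // digit N selects the Nth visible option
    if (index < Math.min(optionCount, 9)) return { type: "select", index };
  }
  return null; // not a shortcut; let the browser handle the key
}
```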

Patterns NOT mirrored (intentional, rationale):
  - TTY's two-step "Send to Claude now" / "Copy to clipboard" sub-menu:
    redundant in the webview — until B4's injectFn lands, every path
    ends in clipboard anyway. The two-step adds friction without value.
  - 60s auto-dismiss timeout: the webview is non-modal; the user can
    let it sit indefinitely. Adds complexity without UX gain.
  - Arrow-key navigation (Tab already works natively in HTML; number
    keys are the faster path for our short option lists).

5 new unit tests in html.test.ts:
  - keyboard hint string visible in options header
  - hint range scoped to option count (capped at 9)
  - keydown handler dispatches select on digits 1-9
  - Esc + Ctrl+X handlers dispatch dismiss
  - first option button focused on render

## Verification

  - Sub-package tsc --noEmit clean
  - Sub-package vitest: 72/72 pass (was 63, +9 new)
  - Root tsc --noEmit clean
  - Full root test suite: 1898 passing + 18 pre-existing TtySelectFn
    carry-forward
  - Esbuild bundle: 11.0 KB → 12.3 KB (the new keyboard handler script
    + injectFn branch)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cross-confirmation audit caught one real resilience gap + two missing
unit-test coverage points. All scoped to M2/B3 work.

## Resilience fix in view-provider.ts

NexpathDecisionSessionViewProvider.handleMessage previously did:

    await this.onSelect(msg.optionLabel);

If onSelect rejected (which a real B4 injectFn can — e.g. when a Cursor
command is missing or throws), the rejection propagated up. The caller
chain is `webview.onDidReceiveMessage` → `void this.handleMessage(raw)`
in resolveWebviewView, which has no `await` to catch it — so it would
have surfaced as an unhandled promise rejection in the extension host.

Fixed by wrapping the onSelect call in try/catch + console.error. The
user-facing error path stays in handleOptionSelection (which already
shows a toast on clipboard failure); the catch here is a last-resort
guard so the extension host doesn't see unhandled rejections.
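
The guard described above can be sketched roughly as follows. The helper name `guardedSelect`, the injectable logger, and the log prefix are assumptions; the real fix lives inside `handleMessage`.

```typescript
type OnSelect = (optionLabel: string) => Promise<void>;

// Awaits onSelect and swallows any rejection after logging it, so a caller
// that invokes this with `void` never leaks an unhandled rejection.
async function guardedSelect(
  onSelect: OnSelect,
  optionLabel: string,
  logError: (...args: unknown[]) => void = console.error,
): Promise<void> {
  try {
    await onSelect(optionLabel);
  } catch (err) {
    // Hypothetical prefix; the source only says "the right prefix".
    logError("[nexpath] onSelect failed:", err);
  }
}
```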

## 3 new unit tests covering previously-untested behaviour

view-provider.test.ts (+2):
  - "a second publishPayload replaces the first (no stacking)" — verifies
    the latest payload wins, both currentPayload and webview.html
    reflect it, view.show is called per publish.
  - "catches errors from onSelect so they never become unhandled
    rejections" — proves the new try/catch works + the error is logged
    to console.error with the right prefix.

html.test.ts (+1):
  - "escapes attribute-breaking quote characters in option id and label"
    — the existing escape test covered `<` `>` `&`. Quotes (`"`) inside
    a data-option-id="..." attribute would close the attribute and
    allow injection. Verifies escapeHtml correctly converts `"` to
    `&quot;` in both data-option-id and data-option-label.
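
To illustrate why the quote case matters, here is a sketch of an escaper and an attribute render. The bodies are assumptions written for this example (including the hypothetical `renderOptionButton`), not the project's actual `escapeHtml`.

```typescript
// Escapes the characters that can break out of HTML text or a
// double-quoted attribute value. `&` must be replaced first.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Hypothetical renderer: every interpolated value goes through escapeHtml.
function renderOptionButton(id: string, label: string): string {
  return (
    `<button data-option-id="${escapeHtml(id)}" ` +
    `data-option-label="${escapeHtml(label)}">${escapeHtml(label)}</button>`
  );
}
```

Without the `"` replacement, an id such as `a" onclick="x` would close the attribute early and introduce a new one.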

## Verification

  - Sub-package tsc --noEmit clean
  - Sub-package vitest: 75/75 pass (was 72; +3)
  - Root tsc --noEmit clean
  - Full root test suite: 1901 passing + 18 pre-existing TtySelectFn
    carry-forward
  - Esbuild bundle: still builds clean (~12.3 KB)

Closes the M2/B3 unit-test audit gap. Per auto-commit rule.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…actors to wire alongside the view-provider from B3
…ursor/Windsurf CLI adapters live in src/agents/adapters/ which is M1's territory
Branch 4 of Milestone M2 (v0.1.3/m2/cursor-windsurf-adapters). This is
the integration branch — stacked on B3 (`3d0957e`), with B2
(`94d81dc`) merged in (`536bca8`) and M1 (`66dd54b`) merged in
(`21f3f48`) so all four prerequisite contracts are available in one
working tree: M1's adapter registry, B2's chat-history watcher +
extractors, B3's webview view-provider + injectFn contract.

This commit covers the narrow dev-plan scope for B4: the CLI-side
adapters (M9 + M10). The bigger wiring (extension host-detection,
chat-input injector, WAL fix for the production watcher, and
extension.ts activate wiring) lands as a follow-up commit on this
same branch — keeps the diff reviewable.

  M9 — src/agents/adapters/cursor.ts: VSCodeExtensionAdapter.
    - detect() checks for Cursor's OS-specific config dir
      (~/.config/Cursor on linux, Library/Application Support/Cursor on
      darwin, %APPDATA%/Cursor on win32).
    - install() prints deep-link install instructions when Cursor is
      present (Open VSX + VS Code Marketplace URLs + cursor
      --install-extension CLI fallback). Returns status: 'skipped' if
      Cursor isn't installed.
    - chatHistoryPaths() returns the User/workspaceStorage base dir; the
      extension enumerates per-workspace state.vscdb files at activation
      time, not at install time.
    - extractPrompt() returns null. The architecture interface declares
      it for symmetric API shape, but actual row decoding lives in the
      extension runtime via src/ext-vscode/src/extractors/ — the CLI
      never runs the watcher. JSDoc spells this out.
    - Self-registers via the agent registry side-effect on module load.

  M10 — src/agents/adapters/windsurf.ts: same shape as cursor.ts.
    - Windsurf is also a VS Code fork; ships the same extension.
    - Detection checks BOTH ~/.config/Windsurf/ AND the legacy
      ~/.codeium/windsurf/ Cascade directory (Windsurf may populate
      either or both depending on version). chatHistoryPaths returns
      both for the watcher to track. extractPrompt stubbed identically.

  src/agents/index.ts: side-effect imports both adapters so
  `nexpath install` picks them up via the registry's detectAll/getAdapter.

Tests (31 new, both adapters):
  - cursor.test.ts: 15 tests covering cursorConfigDir × 4 OS branches,
    static fields, detect (present/absent), chatHistoryPaths shape,
    extractPrompt-returns-null, install (skip + present + log content),
    uninstall (skip + present), registry self-registration.
  - windsurf.test.ts: 16 tests covering the same surface area + the
    "detect by EITHER config dir" branches (windsurf-only,
    codeium-only, both).

Verification:
  - Root tsc --noEmit clean.
  - Full root test suite: 2047 passing + 18 pre-existing TtySelectFn
    carry-forward (M1 §3.0 carry-forward, unrelated).
  - Snapshot invariant preserved — no install.ts modification.

Deferred to a follow-up commit on this same branch (within scope):
  - WAL fix: switch chat-history-watcher.ts's default reader from
    sql.js to better-sqlite3 (per dev plan §2.5). The dev-only dump
    script already uses better-sqlite3.
  - Extension host-detector (decide Cursor vs Windsurf vs plain VS Code
    at activation time via vscode.env.appName).
  - chat-input-injector implementing the OptionInjector contract (per
    memory project_b4_prompt_injection_contract.md) — try Cursor /
    Windsurf chat-input commands via vscode.commands.executeCommand
    then fall back to clipboard.
  - extension.ts wiring: construct WatchTargets from host-detector
    paths, hook watcher.onEvent to spawn nexpath auto/stop, publish
    payloads to the view provider with the injectFn-aware onSelect.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ovider wiring

Wires the architectural pieces that B4's narrow scope (cursor + windsurf
CLI adapters) leaves open. Four concerns, all closely related, all sized
to land together.

## 1. WAL-mode fix — chat-history-watcher's default reader

Dev plan §2.5 flagged this as a B4-decision: sql.js operates on a
buffer of the main .vscdb file and CANNOT see the WAL siblings
(.vscdb-wal / .vscdb-shm), which is where Cursor 3.4.20's live writes
actually land. The dev-only dump-cursor-state script already uses
better-sqlite3 + a copy-to-staging-dir strategy; this commit lifts the
same approach into the production watcher.

Changes:
  - Swapped sql.js → better-sqlite3 in the production watcher's default
    readItemTableFn. The new reader copies main + .vscdb-wal + .vscdb-shm
    to a tmp staging dir, opens read-only, runs
    PRAGMA wal_checkpoint(TRUNCATE) belt-and-braces, queries ItemTable,
    cleans up the staging dir.
  - **API change:** ReadItemTableFn signature changed from
    `(dbBytes: Buffer) => Promise<rows>` to
    `(dbPath: string) => Promise<rows>` so the reader can access the WAL
    siblings itself (the buffer-based form couldn't). Watcher no longer
    needs readFileFn — removed it from ChatHistoryWatcherOptions. Tests
    updated to match (one test scenario removed: the readFileFn error
    forwarding test; that failure mode is now subsumed by
    readItemTableFn errors, which have their own test).
  - Defensive: if ItemTable doesn't exist on the file (freshly-created
    state.vscdb), reader returns [] rather than throwing.
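
The copy-to-staging step can be sketched on its own, leaving out the better-sqlite3 open and query. The injected `CopyFs` interface and the name `stageDbWithWal` are assumptions so the example runs without touching disk; the real reader presumably uses `node:fs` and `node:path` directly.

```typescript
// Minimal file-system surface, injectable so the sketch is testable.
interface CopyFs {
  existsSync(path: string): boolean;
  copyFileSync(src: string, dest: string): void;
}

// Main database file first, then the WAL siblings.
const SIBLING_SUFFIXES = ["", "-wal", "-shm"];

// Copies the main file plus any existing WAL siblings into stagingDir and
// returns the staged main-file path, which a reader can then open safely.
function stageDbWithWal(dbPath: string, stagingDir: string, fs: CopyFs): string {
  const baseName = dbPath.split(/[\\/]/).pop() ?? dbPath;
  for (const suffix of SIBLING_SUFFIXES) {
    const src = dbPath + suffix;
    // Siblings only exist while WAL mode has pending state; skip if absent.
    if (suffix !== "" && !fs.existsSync(src)) continue;
    fs.copyFileSync(src, `${stagingDir}/${baseName}${suffix}`);
  }
  return `${stagingDir}/${baseName}`;
}
```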

Package + bundle changes:
  - better-sqlite3 moved from devDependencies to dependencies (now a
    runtime dep, not just dev-only).
  - sql.js removed from dependencies (no longer used by either the
    watcher or the dump script).
  - esbuild external list updated: 'vscode', 'better-sqlite3'. The
    .vsix needs to ship node_modules/better-sqlite3 with prebuilt
    binaries for each platform — Branch 6 (publish) responsibility.

## 2. host-detector — Cursor vs Windsurf vs plain VS Code

src/ext-vscode/src/host-detector.ts — small pure module:
  - classifyHost(appName): maps "Cursor*" → cursor, "Windsurf*" →
    windsurf, everything else → vscode-generic.
  - detectHost(deps?): reads vscode.env.appName (or injected override).
  - chatHistoryBaseDir(inputs?): per-host OS-specific config dir;
    returns null for vscode-generic (no AI chat to watch).
  - workspaceStorageDir(inputs?): appends User/workspaceStorage to the
    base — the directory the watcher will enumerate for per-workspace
    state.vscdb paths.
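
A rough sketch of the two simplest pieces above. The prefix matching follows the `"Cursor*"` / `"Windsurf*"` description; case sensitivity and the plain string join are assumptions (the real module presumably uses `path.join` and takes richer inputs).

```typescript
type Host = "cursor" | "windsurf" | "vscode-generic";

// Maps an application name (as reported by vscode.env.appName) to a host.
function classifyHost(appName: string): Host {
  if (appName.startsWith("Cursor")) return "cursor";
  if (appName.startsWith("Windsurf")) return "windsurf";
  return "vscode-generic";
}

// Appends User/workspaceStorage to the per-host base dir; null stays null
// (vscode-generic has no AI chat history to watch).
function workspaceStorageDir(baseDir: string | null): string | null {
  return baseDir === null ? null : `${baseDir}/User/workspaceStorage`;
}
```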

11 unit tests covering all host × platform × inputs combinations.

## 3. chat-input-injector — fills the B4 injectFn contract

src/ext-vscode/src/chat-input-injector.ts — implements OptionInjector
per memory `project_b4_prompt_injection_contract`:

  - For vscode-generic host → returns false immediately (no AI chat to
    inject into; clipboard fallback wins).
  - For cursor / windsurf:
    1. Get the live command list via vscode.commands.getCommands(true).
    2. Try each host-specific candidate id (in order). First one that
       executes without throwing returns true.
    3. If none available or all fail → returns false (clipboard fallback
       takes over in handleOptionSelection).
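
The three steps above could look roughly like this. The `CommandApi` interface mirrors the `getCommands` / `executeCommand` calls on `vscode.commands` but is injected here so the sketch has no `vscode` import; the `deps` shape and passing the text as the first command argument are assumptions, and no real Cursor or Windsurf command id is implied.

```typescript
type Host = "cursor" | "windsurf" | "vscode-generic";

interface CommandApi {
  getCommands(filterInternal?: boolean): PromiseLike<string[]>;
  executeCommand(id: string, ...args: unknown[]): PromiseLike<unknown>;
}

interface InjectDeps {
  host: Host;
  commands: CommandApi;
  candidates: readonly string[]; // host-specific ids, tried in order
}

// Returns true if some candidate command accepted the text, false otherwise
// (the caller then falls back to the clipboard).
async function chatInputInject(text: string, deps: InjectDeps): Promise<boolean> {
  if (deps.host === "vscode-generic") return false;

  let available: Set<string>;
  try {
    available = new Set(await deps.commands.getCommands(true));
  } catch {
    return false; // cannot list commands, so nothing to try
  }

  for (const id of deps.candidates) {
    if (!available.has(id)) continue;
    try {
      await deps.commands.executeCommand(id, text);
      return true;
    } catch {
      // This candidate failed; try the next one.
    }
  }
  return false;
}
```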

**Candidate command IDs are HEURISTIC GUESSES** based on community
documentation. They're explicitly marked unverified — Branch 5
(smoke-test) is where the engineer hand-verifies against a live
Cursor / Windsurf, prunes / re-orders the list. Until then the
practical net effect on Cursor 3.4.20 is "no match → fall through to
clipboard", which is the safe behaviour.

8 unit tests covering: vscode-generic short-circuit, cursor happy
path, candidate-try-order, command-list filtering, all-fail-fallback,
getCommands throwing, windsurf branch, exported candidate list shape.

## 4. extension.ts wiring

extension.ts now constructs the view provider with an injectFn-aware
onSelect:

  const host = detectHost();
  const onSelect = (text) =>
    handleOptionSelection(text, {
      injectFn: (t) => chatInputInject(t, { host }),
    });
  viewProvider = new NexpathDecisionSessionViewProvider(
    context.extensionUri,
    onSelect,
  );

The chat-history watcher is NOT yet started in activate() — that
wiring is deferred to a B5 follow-up where it can be smoke-tested
against a real Cursor instance (the "stop trigger" timing — when do
we call `nexpath stop`? — needs real-Cursor behaviour to settle).
For now the view-provider just sits with the empty-state HTML; B4 +
B5 close the loop.

extension.test.ts updated:
  - Added 4 new vi.mock blocks for the new modules (prompt-injection,
    host-detector, chat-input-injector) + extended the vscode mock
    with env.appName + commands.
  - Adjusted "constructs the view provider" test to expect the second
    onSelect argument.

## Verification

  - Root tsc --noEmit clean.
  - Sub-package tsc --noEmit clean.
  - Sub-package vitest: 181/181 pass (was 160 baseline; +21 from B4
    follow-up: 11 host-detector + 8 chat-input-injector + 2 net
    elsewhere).
  - Full root suite: 2068 passing + 18 pre-existing TtySelectFn
    carry-forward.
  - Esbuild bundle: 14.7 KB (was 12.3 KB in B3 — added host-detector +
    chat-input-injector + wiring).

## Memory update

The `project_b4_prompt_injection_contract` memory said B4 must
"Implement cursorChatInputInject / windsurfChatInputInject of type
OptionInjector. Wire them through the view-provider constructor's
onSelect arg." Done — both halves filled in. The memory remains
load-bearing because the symbols still exist; what's changed is
the candidate command list is now an EDUCATED-GUESS placeholder
awaiting real-Cursor verification in Branch 5.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…Prompt docs

Closes the two drifts surfaced in the M2/B4 cross-confirmation review.

## Drift #2: installAction + uninstallAction now invoke registry adapters

Before this commit: cursorAdapter and windsurfAdapter self-registered
in the agent registry but `nexpath install` never called them. The
legacy `detectAgents()` for-loop only routed `claude-cli` agents
through the registry (`getAdapter('claude-code').install`); everything
else was gated by `REGISTER_MCP_SERVER = false`. The B4 acceptance
line "`nexpath install` detects both, prints correct deep-link
instructions, registers with registry" was strictly not met — the
adapters existed but weren't reached.

Fix: add a small registry-iteration block AFTER the legacy for-loop
in both `installAction` and `uninstallAction`. Iterates
`await detectAll(adapterCtx)` and calls `adapter.install(ctx)` /
`adapter.uninstall(ctx)` for every detected adapter except
`claude-code` (already handled in the legacy loop above).

  - 17 new lines in installAction + 17 new lines in uninstallAction.
  - Built `adapterCtx: InstallContext` from `homedir()` + `process.cwd()`
    + `dbPath` so adapters read the same OS-level paths they do when
    called directly.
  - Errors from `adapter.install(ctx)` are caught + logged as a single
    `✗ <label> — failed: <message>` line; don't halt the loop.
  - Imports: added `detectAll` to the existing
    `import { getAdapter } from '../../agents/registry.js'` line +
    `InstallContext` type to the existing types import.
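
A sketch of the registry-iteration block described above. The `AgentAdapter` and `InstallContext` field names and the `onFailure` callback are assumptions for illustration; the actual code logs a single failure line per adapter.

```typescript
// Hypothetical shapes; the real types live in the agents registry.
interface InstallContext {
  home: string;
  cwd: string;
  dbPath: string;
}

interface AgentAdapter {
  id: string;
  label: string;
  install(ctx: InstallContext): Promise<void>;
}

// Installs every detected adapter except claude-code (already handled by the
// legacy loop). A failing adapter is reported and does not halt the others.
async function installDetectedAdapters(
  detected: readonly AgentAdapter[],
  ctx: InstallContext,
  onFailure: (label: string, message: string) => void,
): Promise<string[]> {
  const installed: string[] = [];
  for (const adapter of detected) {
    if (adapter.id === "claude-code") continue;
    try {
      await adapter.install(ctx);
      installed.push(adapter.id);
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      onFailure(adapter.label, message);
    }
  }
  return installed;
}
```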

## Snapshot invariant — preserved byte-identical

The B1 install-snapshot test (`install.snapshot.test.ts`) runs in a
tmp HOME that has neither `~/.config/Cursor` nor `~/.config/Windsurf`.
The cursor + windsurf adapters' `detect()` return false → registry
iteration block prints nothing → install-snapshot bytes are unchanged.
CI fails red on snapshot diff. Verified: snapshot test passes
post-change.

## 6 new install/uninstall tests

`install.test.ts` now has 6 additional `installAction` /
`uninstallAction` tests covering the new registry behaviour:

  - "calls cursor adapter and prints deep-link instructions when
    Cursor is detected" — sets up `~/.config/Cursor` inside the test
    tmpDir, stubs HOME to tmpDir, asserts the Cursor block appears.
  - "calls windsurf adapter and prints deep-link instructions when
    Windsurf is detected" — same.
  - "prints both cursor + windsurf deep-link blocks when both are
    detected".
  - "does NOT double-invoke the claude-code adapter from the registry
    loop" — counts `advisory hook written to` lines == 1 (would be 2
    if the registry loop double-called claude-code).
  - "calls cursor adapter uninstall and prints uninstall instructions
    when Cursor is detected" — mirror in uninstallAction.
  - "calls windsurf adapter uninstall and prints uninstall instructions
    when Windsurf is detected" — mirror.

All use `vi.stubEnv('HOME', tmpDir)` to keep the tests hermetic and
independent of whether the dev machine actually has Cursor / Windsurf
installed.

## Drift #5: extractPrompt JSDoc upgrade (no functional change)

`extractPrompt` on `cursorAdapter` and `windsurfAdapter` remains a
stub that returns null. The architectural decision was already made
in the original B4 commit; this commit upgrades the JSDoc to:

  - Explain WHY it's a stub (no CLI caller decodes rows — the
    extension's chat-history-watcher does, via the extractor modules
    in `src/ext-vscode/src/extractors/`).
  - Document the migration path if a CLI tool ever needs row decoding:
    promote extractors to `src/agents/chat-history-extractors/`,
    widen sub-package tsconfig.rootDir, leave re-export shims at the
    old paths, wire the adapter's `extractPrompt` to call
    `pickExtractor` + `extractor.decodeRow`.
  - Acknowledge this is a non-trivial cross-tree refactor (esbuild
    externals + tsconfig + vitest config) and intentionally deferred
    because there's currently no caller demanding it.

Per the user's no-code-removing constraint: nothing removed. The stub
behaviour is contract-compliant ("null = I don't know"). When a CLI
caller appears, the migration path is documented in the JSDoc.

## Verification

  - Root tsc --noEmit clean.
  - Sub-package tsc --noEmit clean.
  - Full root test suite: 2074 passing + 18 pre-existing TtySelectFn
    carry-forward (was 2068; +6 from the new registry install/uninstall
    tests).
  - Sub-package vitest: 181/181 pass unchanged.
  - install-snapshot test passes byte-identical (zero-diff invariant
    preserved).
  - Esbuild bundle: 14.7 KB unchanged.

The B4 acceptance line now strictly holds.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the two unit-test gaps surfaced in the M2/B4 audit. Both
identified previously-untested production paths.

## Gap 1 — defaultReadItemTable (WAL-aware better-sqlite3 reader)

The watcher's tests all inject a mock readItemTableFn, so the real
production reader added in M2/B4 follow-up (commit fa3c134) was never
exercised. That reader contains the entire WAL fix from dev plan §2.5:
copy main + .vscdb-wal + .vscdb-shm → tmp staging dir, open better-
sqlite3 read-only, run PRAGMA wal_checkpoint(TRUNCATE), check for
ItemTable existence, query, cleanup. A typo in any of those steps
would have shipped untested.

Fix: exported defaultReadItemTable from chat-history-watcher.ts and
added 4 integration-style tests that build real .vscdb files with
better-sqlite3 directly, then assert the production reader handles
them correctly:
  - happy-path: 3 ItemTable rows, all retrieved correctly
  - defensive: .vscdb with NO ItemTable returns [] (real production
    scenario — freshly-created VS Code state.vscdb)
  - WAL-mode: .vscdb opened with `PRAGMA journal_mode = WAL` (the
    Cursor scenario) → rows in WAL siblings are read correctly
  - source-untouched: 3 consecutive reads do not modify the source
    file's size — verifies the copy-to-staging strategy keeps the
    live file safe (important when Cursor is actively writing)

## Gap 2 — adapter-error catch blocks in install/uninstall

The audit follow-up (commit 55477c2) added registry-iteration blocks
in installAction + uninstallAction with `try/catch` around each
adapter call. The happy path got 6 tests. The catch blocks
(`✗ <adapter> — failed: <message>`) didn't.

Fix: 2 new tests using `vi.spyOn(cursorAdapter, 'install').mockRejectedValueOnce`
to simulate a real adapter failure. Each test asserts:
  - The synthetic error is surfaced in the console output
  - The loop continues (windsurf's install/uninstall also ran)

This proves the registry loop's resilience guarantee — one failing
adapter doesn't halt the others. Mirrors the proven legacy for-loop's
catch block contract.

## Verification

  - Root tsc --noEmit clean.
  - Full root suite: 2080 passing + 18 pre-existing TtySelectFn
    carry-forward (was 2074; +4 watcher reader + +2 install/uninstall
    catch = +6).
  - Watcher test count: 14 (was 10) — defaultReadItemTable suite added.
  - Snapshot invariant preserved (no install.ts source change).

## What's still NOT tested (acceptable gaps)

  - chatInputInject candidate command IDs against a live Cursor. These
    are heuristic guesses; B5 (smoke-test) is where they're verified
    against a real running instance. Documented in JSDoc.
  - Extension watcher start-up wiring (intentionally deferred to B5).
  - extractPrompt(rowKey, rowValue) on cursor/windsurf adapters — stub
    returns null; documented architectural choice. No caller exists.

Closes B4 unit-test audit. Per auto-commit rule.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Branch 5 of Milestone M2 (v0.1.3/m2/smoke-test). Stacked on B4 (f6a916b),
which already has B1+B2+B3+M1 merged in — so this branch has every
prerequisite in one working tree.

This commit closes the only remaining wiring gap from B4 (chat-history
watcher start-up was deferred from B4 to here so it could be smoke-
tested against a real Cursor). After this, the full chain works:

  User types prompt in Cursor
    → state.vscdb wal-write fires
    → fs.watch event in chat-history-watcher
    → debounce + read → extractor decodes user prompt
    → chat-pipeline.handleChatEvent
        → ipc.spawnAuto(prompt, workspace-prefixed session id)
        → ipc.spawnStop(session id) → DecisionSessionPayload | null
        → view-provider.publishPayload (if non-null)
            → webview auto-reveals with advisory + numbered options
              + keyboard shortcuts (1-9 / Esc / Ctrl+X)
            → click / number key → handleOptionSelection
                → injectFn primary path (per-host chat-input command)
                → clipboard + toast fallback

## Files

  - src/path-enumerator.ts — enumerateStateVscdbPaths(workspaceStorageDir)
    walks <base>/<workspace-id>/state.vscdb and returns the paths that
    exist. Returns [] for null base / missing dir / empty dir. Skips
    non-directory siblings + workspace dirs that lack state.vscdb.
    Injectable fs for tests; production uses node:fs defaults.
    +8 unit tests.
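
    A minimal sketch of the enumeration logic, assuming a small injectable
    file-system interface (`EnumFs` is a hypothetical name; the real module
    uses node:fs defaults and proper path joining):

```typescript
interface EnumFs {
  existsSync(path: string): boolean;
  readdirSync(path: string): string[];
  isDirectory(path: string): boolean;
}

// Walks <base>/<workspace-id>/state.vscdb and returns the paths that exist.
function enumerateStateVscdbPaths(
  workspaceStorageDir: string | null,
  fs: EnumFs,
): string[] {
  if (workspaceStorageDir === null || !fs.existsSync(workspaceStorageDir)) {
    return [];
  }
  const found: string[] = [];
  for (const entry of fs.readdirSync(workspaceStorageDir)) {
    const dir = `${workspaceStorageDir}/${entry}`;
    if (!fs.isDirectory(dir)) continue; // skip non-directory siblings
    const db = `${dir}/state.vscdb`;
    if (fs.existsSync(db)) found.push(db); // skip workspaces without a db
  }
  return found;
}
```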

  - src/chat-pipeline.ts — createChatEventHandler(deps) builds the
    (event) => Promise<void> handler the watcher calls. Orchestrates
    spawnAuto → spawnStop → publishPayload. Three independent
    try/catch blocks so a failure at any stage logs + returns
    without propagating to the watcher (the watcher's onEvent is
    fire-and-forget, so a rejection would otherwise go unhandled in
    the extension host). Optional composeSessionId lets the caller prefix the
    session with workspace id. +7 unit tests covering happy path,
    null-payload skip, custom composer, each error path, never-
    propagates guarantee.
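
    The three-stage orchestration might be sketched like this. The event and
    payload field names, the `logError` dependency, and the log strings are
    assumptions; only the stage order and the never-propagates behaviour come
    from the description above.

```typescript
// Hypothetical shapes for illustration.
interface ChatEvent {
  prompt: string;
  sessionId: string;
}
interface DecisionSessionPayload {
  advisory: string;
  options: string[];
}

interface ChatPipelineDeps {
  spawnAuto(prompt: string, sessionId: string): Promise<void>;
  spawnStop(sessionId: string): Promise<DecisionSessionPayload | null>;
  publishPayload(payload: DecisionSessionPayload): void;
  composeSessionId?: (rawSessionId: string) => string;
  logError?: (...args: unknown[]) => void;
}

function createChatEventHandler(
  deps: ChatPipelineDeps,
): (event: ChatEvent) => Promise<void> {
  const logError = deps.logError ?? console.error;
  return async (event) => {
    const sessionId = deps.composeSessionId
      ? deps.composeSessionId(event.sessionId)
      : event.sessionId;

    // Stage 1: record the prompt.
    try {
      await deps.spawnAuto(event.prompt, sessionId);
    } catch (err) {
      logError("spawnAuto failed:", err);
      return;
    }

    // Stage 2: ask for a decision payload.
    let payload: DecisionSessionPayload | null;
    try {
      payload = await deps.spawnStop(sessionId);
    } catch (err) {
      logError("spawnStop failed:", err);
      return;
    }
    if (payload === null) return; // nothing to show

    // Stage 3: hand the payload to the view provider.
    try {
      deps.publishPayload(payload);
    } catch (err) {
      logError("publishPayload failed:", err);
    }
  };
}
```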

  - src/onboarding.ts — exported CONSENT_KEY (was a module-private
    const) so extension.ts reads the same globalState key the
    onboarding writes to. No behaviour change.

  - src/extension.ts — substantially rewritten activate(). New flow:
      1. detectHost() — Cursor / Windsurf / vscode-generic
      2. Construct + register the view provider with injectFn-aware
         onSelect (B4 wiring unchanged)
      3. await showOnboardingIfNeeded(context) — consent prompt
      4. Watcher gating — all of these must be true to start:
           - context.globalState.get<boolean>(CONSENT_KEY) === true
           - host !== 'vscode-generic'  (no AI chat on plain VS Code)
           - enumerateStateVscdbPaths returns at least one path
      5. Build watcher targets from the discovered paths, kind
         'cursor-sqlite' (extractor selected by fingerprint at read
         time)
      6. Build the chat-event handler with workspace-prefixed
         session-id composer
      7. Create the watcher with onEvent calling the handler;
         onSchemaUnknown surfacing a friendly toast; onError
         logging
      8. watcher.start() + push a stop disposable onto
         context.subscriptions
    deactivate() now also stops the watcher and clears the
    module-level handles.
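
    The gating in step 4 and the composer in step 6 reduce to two small
    functions. This is a sketch only; the function names and the `:`
    separator in the session id are assumptions.

```typescript
type Host = "cursor" | "windsurf" | "vscode-generic";

// All three conditions must hold before the watcher is started.
function shouldStartWatcher(
  consent: boolean | undefined,
  host: Host,
  dbPaths: readonly string[],
): boolean {
  return consent === true && host !== "vscode-generic" && dbPaths.length > 0;
}

// Prefixes the raw session id with the workspace path so sessions from
// different workspaces do not collide.
function composeSessionId(workspacePath: string, rawSessionId: string): string {
  return `${workspacePath}:${rawSessionId}`;
}
```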

  - src/extension.test.ts — substantially rewritten with vi.hoisted
    mocks for the new imports (host-detector, path-enumerator,
    chat-history-watcher, chat-pipeline, ipc). +15 tests covering:
      - activation log, view-provider registration regardless of
        consent
      - watcher NOT started when consent undefined / false / plain
        VS Code host / no dbs
      - watcher started when consent=true + host=cursor + dbs present
      - chat-event handler built with workspace-prefixed composer
      - watcher.stop() called on deactivate
      - getViewProvider() lookup
      - onboarding-rejects-but-rest-continues resilience

  - SMOKE-TEST.md — manual smoke-test procedure that the engineer
    runs against a live Cursor / Windsurf install. Walks through:
      - Build extension + nexpath CLI install + extension install
        (Extension Development Host or .vsix path)
      - Activation verification (log line, consent toast, icon,
        watcher start log)
      - Trigger round-trip + observe webview auto-reveal
      - Verify (and update) the heuristic candidate chat-input
        command IDs in chat-input-injector.ts using
        vscode.commands.getCommands(true)
      - Verification table to paste back as B5 acceptance evidence
      - Troubleshooting section + explicit non-goals (cross-OS = B6,
        pre-prompt blocking = open question)

## Verification

  - Root tsc --noEmit clean.
  - Sub-package tsc --noEmit clean.
  - Sub-package vitest: 204 / 204 pass across 17 files (was 185 in B4;
    +19: 8 path-enumerator + 7 chat-pipeline + 4 net extension).
  - Full root test suite: 2099 passing + 18 pre-existing TtySelectFn
    carry-forward (was 2080; +19 from B5).
  - Esbuild bundle: 31.2 KB (was 14.7 KB — the new wiring pulls
    chat-history-watcher + extractors + chat-pipeline into the
    extension's bundle now that activate() actually uses them).
  - Install snapshot byte-identical (no install.ts source touched).

## B5 acceptance gate (manual)

The acceptance line for B5 is "End-to-end on dev machine: type real
prompt in Cursor → real round-trip → decision UI appears." That's a
MANUAL test the engineer runs per SMOKE-TEST.md. The code-side
deliverable (the wiring) is in this commit; the acceptance evidence
goes back as a verification-table entry in SMOKE-TEST.md.

## Deferred to B6 / M5 (explicitly out of scope)

  - macOS + Windows verification (B6 — needs VM / physical access)
  - Marketplace publish (B6)
  - Real-Cursor verification of candidate chat-input command IDs
    (a manual step in SMOKE-TEST.md step 6; the engineer updates
    chat-input-injector.ts based on what they find)
  - "Response done" detection for a smarter spawnStop trigger time
    (currently auto + stop fire back-to-back; M5 hardening)
  - Multi-workspace concurrency formal testing (M5)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… Layer C changes)

Closes Drift #6 from the M2/B5 cross-confirmation review. The B5 smoke
test would have failed at the "advisory appears" step because:

  (a) `nexpath stop` (Layer C) expects a full `StopPayload` shape on
      stdin — `{session_id?, cwd, hook_event_name, stop_hook_active, ...}`
      (see src/cli/commands/stop.ts:37-43). Our `ipc.spawnStop` was
      only sending `{session_id}`, so `runStop` saw `cwd === undefined`
      and failed project-root resolution.

  (b) `nexpath auto` defaults `--project` to `process.cwd()` of the
      spawned process. Our ipc was inheriting whatever cwd the extension
      host process had (typically the user's home, not the workspace).
      `.env` loading and hook-stats writes were therefore landing in
      the wrong directory.

Both fixed inside our layer — `src/ext-vscode/src/ipc.ts` only. Layer
C remains entirely untouched (per the boundary rule and your standing
instruction). The fix uses Layer C's existing public stdin contract;
we just send the right shape.

## Changes

`src/ext-vscode/src/ipc.ts`:
  - Added `cwd?: string` to `IpcOptions`. Documents WHY it's needed
    (project-root resolution on the Layer C side).
  - New helper `buildSpawnOptions(opts)` constructs `SpawnOptions` with
    `stdio: ['pipe', 'pipe', 'pipe']` AND `cwd: opts.cwd ?? process.cwd()`.
    Both `spawnAuto` and `spawnStop` now use it.
  - `spawnStop` stdin payload changed from `{session_id}` to the full
    `StopPayload` shape:
        {
          session_id,
          cwd:              opts.cwd ?? process.cwd(),
          hook_event_name:  'Stop',
          stop_hook_active: false,
        }
    `last_assistant_message` is omitted (we capture user prompts only;
    no assistant signal yet — M5 hardening concern).
  - JSDoc on both spawn functions now references the exact Layer C
    file:line where the contract is defined.
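
A sketch of the two builders, under the assumption that the fallback cwd is passed in explicitly (the real helper defaults to `process.cwd()`); `buildStopPayload` is a hypothetical name for the inline payload construction.

```typescript
interface IpcOptions {
  cwd?: string;
}

// The fields listed above; last_assistant_message is deliberately omitted.
interface StopPayload {
  session_id: string;
  cwd: string;
  hook_event_name: "Stop";
  stop_hook_active: boolean;
}

function buildStopPayload(
  sessionId: string,
  opts: IpcOptions,
  fallbackCwd: string,
): StopPayload {
  return {
    session_id: sessionId,
    cwd: opts.cwd ?? fallbackCwd,
    hook_event_name: "Stop",
    stop_hook_active: false,
  };
}

// Spawn options shared by spawnAuto and spawnStop.
function buildSpawnOptions(
  opts: IpcOptions,
  fallbackCwd: string,
): { stdio: ["pipe", "pipe", "pipe"]; cwd: string } {
  return { stdio: ["pipe", "pipe", "pipe"], cwd: opts.cwd ?? fallbackCwd };
}
```

The serialized payload would then be written to the child's stdin, for example via `JSON.stringify(buildStopPayload(id, opts, process.cwd()))`.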

`src/ext-vscode/src/extension.ts`:
  - The chat-pipeline now curries `spawnAuto` / `spawnStop` with the
    workspace folder's fsPath as `cwd`. When no workspace is open,
    falls back to `process.cwd()` of the extension host.
  - Same workspace path is used as the session-id prefix (was already
    the case; just consolidated to one variable).

`src/ext-vscode/src/ipc.test.ts`:
  - +5 new tests covering:
      - `spawnAuto` passes opts.cwd to the spawned process options
      - `spawnAuto` defaults to `process.cwd()` when omitted
      - `spawnStop` writes the FULL StopPayload shape to stdin
        (session_id + cwd + hook_event_name='Stop' + stop_hook_active=false)
      - `spawnStop` defaults stdin cwd to `process.cwd()` when omitted
      - `spawnStop` passes opts.cwd to spawn options
    The full-payload-shape test is the load-bearing one — it locks the
    Layer C stdin contract so a regression breaks loudly.

`src/ext-vscode/src/extension.test.ts`:
  - Adjusted the "composeSessionId" assertion to accept either
    `process.cwd()` or any path prefix (was pinned to literal
    'no-workspace' which no longer matches).

`src/ext-vscode/SMOKE-TEST.md`:
  - Troubleshooting table gained a row for "nexpath auto runs but no
    advisory appears later" — explains the OPENAI_API_KEY .env path +
    how to check the prompt-store.db for captured prompts.

## Verification

  - Root tsc --noEmit clean.
  - Sub-package tsc --noEmit clean.
  - Sub-package vitest: 209/209 pass across 17 files (was 204; +5 from
    the new ipc cwd / payload-shape tests).
  - Full root test suite: 2104 passing + 18 pre-existing TtySelectFn
    carry-forward (was 2099).
  - Esbuild bundle: 31.5 KB (was 31.2 KB — minor growth for the new
    payload + buildSpawnOptions helper).
  - Install snapshot byte-identical.

The B5 smoke test should now succeed at the "advisory appears" step
when the engineer runs it per SMOKE-TEST.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the one B5 unit-test gap surfaced in the audit. extension.ts
constructs the chat-history watcher with three callbacks (onEvent,
onError, onSchemaUnknown) but none of them were exercised by tests.
The most consequential is onEvent — it's the integration proof that
watcher events actually reach the chat-event handler.

The watcher itself is mocked in these tests, so the strategy is:
capture the opts object passed to createChatHistoryWatcher and
invoke each callback directly.

## Changes

`src/ext-vscode/src/extension.test.ts` — +3 tests via a shared
`activateWithWatcher()` helper:

  1. "routes watcher onEvent through the chat-event handler (the
     integration proof)" — captures the handler returned by
     createChatEventHandler, invokes opts.onEvent with a synthetic
     event, asserts the tracked handler was called with that event.
     This is the load-bearing test — a refactor that silently
     disconnects watcher → handler would break it loudly.
  2. "watcher onSchemaUnknown surfaces a visible info toast with
     path + observed keys" — invokes opts.onSchemaUnknown, asserts
     vscode.window.showInformationMessage was called once with a
     message containing the path and the first observed key.
  3. "watcher onError logs to console.error (does not crash the
     extension)" — invokes opts.onError with a fake Error,
     asserts no throw + console.error called with the right
     prefix.

## Verification

  - Root tsc --noEmit clean.
  - Sub-package tsc --noEmit clean.
  - Sub-package vitest: 212 / 212 pass (was 209; +3).
  - Full root suite: 2107 passing + 18 pre-existing TtySelectFn
    carry-forward (was 2104).
  - Esbuild bundle unchanged (no source-code changes — tests only).

Closes B5 audit. No remaining unit-test gaps in scope.

## What's still NOT tested (acceptable per B5 scope)

  - End-to-end against a real Cursor (manual smoke test per
    SMOKE-TEST.md).
  - Multi-workspace concurrency (deferred to M5).
  - "Response done" timing (auto+stop fire back-to-back — deferred
    to M5 hardening).
  - The watcher's actual fs.watch firing on real state.vscdb writes
    (covered by chat-history-watcher.test.ts at the unit level with
    a synthetic fs.watch stub).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>