Open
29 commits
0509579 feat: expose semantic MCP tools (thymikee, May 26, 2026)
83b8900 docs: remove semantic mcp prd (thymikee, May 26, 2026)
1b773fd refactor: deepen semantic command surface (thymikee, May 26, 2026)
7afc60f refactor: add mcp execution seam (thymikee, May 26, 2026)
6d97289 refactor: deepen command grammar (thymikee, May 26, 2026)
7be61dd refactor: remove legacy command definitions (thymikee, May 26, 2026)
5e8ffdb refactor: collapse semantic cli wrappers (thymikee, May 26, 2026)
402f642 refactor: remove local mcp placeholders (thymikee, May 26, 2026)
a81e446 refactor: derive semantic cli routing (thymikee, May 26, 2026)
f7475f8 refactor: trim mcp status metadata (thymikee, May 26, 2026)
ed11470 refactor: derive semantic input contracts (thymikee, May 26, 2026)
4bec604 refactor: split semantic grammar modules (thymikee, May 26, 2026)
7909732 refactor: derive batch input schema (thymikee, May 26, 2026)
359f05c refactor: centralize cli command schema catalog (thymikee, May 26, 2026)
d3f81bc refactor: share semantic cli output projections (thymikee, May 26, 2026)
c323285 refactor: remove legacy cli output paths (thymikee, May 26, 2026)
3e21980 refactor: consolidate command interface surface (thymikee, May 27, 2026)
82a7af3 docs: align command contract wording (thymikee, May 27, 2026)
54ad2cd refactor: split command projection from cli grammar (thymikee, May 27, 2026)
8250149 refactor: trim projection exports (thymikee, May 27, 2026)
996613c fix: satisfy fallow command contract audit (thymikee, May 27, 2026)
029a160 refactor: structure public batch steps (thymikee, May 27, 2026)
a88264d chore: clean batch architecture references (thymikee, May 27, 2026)
ef27c41 fix: keep legacy cli batch steps working (thymikee, May 27, 2026)
a54eb6d fix: serialize mcp batches (thymikee, May 27, 2026)
42aaf50 chore: tighten command surface cleanup (thymikee, May 27, 2026)
c030d01 fix: serialize mcp stdin requests (thymikee, May 27, 2026)
118999c chore: keep mcp config out of command contracts (thymikee, May 27, 2026)
ef20069 fix: project structured batch targets (thymikee, May 27, 2026)
53 changes: 38 additions & 15 deletions AGENTS.md
@@ -52,12 +52,29 @@ Single-context repo. Read `CONTEXT.md` for domain language and testing/architect
  - Keep modules small for agent context safety:
  - target <= 300 LOC per implementation file when practical.
  - if a file grows past 500 LOC, plan/extract focused submodules before adding new behavior.
- - exception: generated files, schema/fixture snapshots, and integration test aggregations.
+ - if a file grows past 1,000 LOC, treat it as architecture debt unless it is generated data, a fixture snapshot, or an integration test aggregation.
+ - long guidance/data tables should live behind focused modules instead of sharing a file with parser/runtime logic.
+ - prefer deep modules over mechanical splits: extract when it improves locality for a concept callers already need, not just to reduce line count.

+ ## Context Management
+ - Optimize for one-pass agent reads. A module that requires reading many siblings to understand one change is usually too shallow; a module that hides one concept behind a small interface is usually worth keeping.
+ - Start with the owning module, then one shared helper, then one downstream caller or adapter. Broaden only when the contract crosses that edge.
+ - Use targeted symbol searches before opening large files. For files over 500 LOC, search for the relevant type/function/section first, then read a bounded range.
+ - Do not add unrelated exports just to make tests easier. Test through the public interface when possible; if that is awkward, consider whether the module's interface is too shallow.
+ - When adding new guidance, examples, schemas, or command metadata, decide whether it belongs in the command surface, CLI grammar, CLI help, MCP projection, or daemon runtime before editing.
+ - Prefer updating existing domain vocabulary in `CONTEXT.md` when naming a new durable module concept. Do not coin parallel names in docs, tests, and code.
+
  ## Routing
  - Keep `src/daemon.ts` as a thin router.
  - Keep command names and daemon routing groups centralized in `src/command-catalog.ts`; do not re-create command string sets in handlers or request policy modules.
- - Keep CLI/client positional grammar in `src/command-codecs.ts` and its `src/command-codecs/*` command-family modules. CLI commands, typed client methods, and daemon interaction adapters should reuse these codecs instead of duplicating selector/ref/positionals parsing.
+ - Keep command input/output contracts in the command modules:
+ - command surface and shared schemas: `src/commands/command-surface.ts`, `src/commands/command-contract.ts`, `src/commands/command-input.ts`
+ - typed client command execution: `src/commands/client-command-contracts.ts`
+ - command families: `src/commands/interaction-command-contracts.ts`, `src/commands/batch-command.ts`, with other typed client contracts in `src/commands/client-command-contracts.ts`
+ - CLI positional/flag grammar: `src/commands/cli-grammar.ts` and `src/commands/cli-grammar/*`
+ - typed input to daemon request projection: `src/commands/command-projection.ts`
+ - CLI/client/runtime output projection: `src/commands/cli-output.ts`, `src/commands/client-output.ts`, `src/commands/runtime-output.ts`
+ - Do not reintroduce CLI-shaped command adapters or schemas as a second source of truth. CLI, Node.js, and MCP should project from command contracts.
  - Keep `src/daemon/request-router.ts` as request orchestration: auth, diagnostics scope, request admission, locking, handler chain, and fallback dispatch.
  - Put request policies in focused request modules:
  - tenant/lease/selector/lock admission: `src/daemon/request-admission.ts`
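
Reviewer note: the module-size thresholds at the top of this hunk (300, 500, and 1,000 LOC) could be expressed as a small classifier. This is only a sketch; `classifyModuleSize`, `ModuleSizeVerdict`, and the `exempt` flag are hypothetical names that do not appear in the PR, and it assumes the generated/fixture/integration exemption applies only to the 1,000 LOC rule, as the new wording reads.

```typescript
// Hypothetical helper: none of these names exist in the repository.
type ModuleSizeVerdict = 'ok' | 'over-target' | 'plan-extraction' | 'architecture-debt';

interface ModuleSizeInput {
  loc: number;
  // Assumption: generated data, fixture snapshots, and integration test
  // aggregations are exempt from the 1,000 LOC architecture-debt rule only.
  exempt?: boolean;
}

function classifyModuleSize({ loc, exempt = false }: ModuleSizeInput): ModuleSizeVerdict {
  if (loc > 1000 && !exempt) return 'architecture-debt';
  if (loc > 500) return 'plan-extraction'; // extract submodules before adding behavior
  if (loc > 300) return 'over-target'; // above the practical target, no action mandated
  return 'ok';
}
```

The checks run from the strictest threshold down, so an exempt file over 1,000 LOC still falls through to the 500 LOC rule.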
@@ -111,17 +128,18 @@ Single-context repo. Read `CONTEXT.md` for domain language and testing/architect

  ## Adding a New CLI Flag

- A new snapshot/command flag touches up to 7 files in a fixed order. Follow this checklist:
+ A new snapshot/command flag touches only the layers that need to understand it. Follow this checklist in order:

- 1. `src/utils/command-schema.ts`: add to `CliFlags` type, `FLAG_DEFINITIONS` array, and the relevant `*_FLAGS` constant (e.g. `SNAPSHOT_FLAGS`). Update the command's `usageOverride` string.
- 2. `src/utils/snapshot.ts` (or the relevant options type): add to `SnapshotOptions` or equivalent.
- 3. `src/client-types.ts`: add to `CaptureSnapshotOptions` (or equivalent public options type) **and** `InternalRequestOptions`.
- 4. `src/client-normalizers.ts`: map the public option name to the internal flag name in `buildFlags`.
- 5. `src/daemon/context.ts`: add to `DaemonCommandContext` type and `contextFromFlags` function.
- 6. `src/core/dispatch-context.ts`: add to `DispatchContext` when the flag flows into platform dispatch, then thread it through the relevant dispatcher module.
- 7. `src/cli/commands/<command>.ts`: pass the flag from `flags.*` to the client call.
+ 1. `src/utils/cli-flags.ts`: add to `CliFlags`, `FLAG_DEFINITIONS`, and the relevant exported flag group (e.g. `SNAPSHOT_FLAGS`). Add the flag to `CLI_COMMAND_OVERRIDES` in `src/utils/cli-command-overrides.ts` for each command that supports it; command names/descriptions come from command contracts unless CLI help needs a specific override.
+ 2. `src/commands/cli-grammar/*`: read the CLI flag into command input when the CLI accepts it.
+ 3. `src/commands/command-projection.ts` and command-family projection helpers: write the input into the daemon request only if the flag affects daemon execution.
+ 4. `src/commands/*-command-contracts.ts`: add or update the command input schema only if the option should be available through Node.js or MCP as structured input.
+ 5. `src/client-types.ts`: update the public typed client option only when the Node.js interface exposes the option.
+ 6. `src/client-normalizers.ts`: update daemon flag normalization only when the request still needs a public-to-internal option translation.
+ 7. `src/daemon/context.ts` and `src/core/dispatch-context.ts`: add the field only when it flows into platform dispatch.
+ 8. Handler/platform modules: thread the option only after the command surface, grammar, and projection prove it belongs there.

- Command-only flags (like `find --first`) that don't flow to the platform layer only need steps 1 and the handler file.
+ Command-only flags (like `find --first`) that do not flow to the platform layer usually stop at steps 1-3.

  ## Hard Rules
  - Use process helpers from `src/utils/exec.ts` for TypeScript process execution: `runCmd`, `runCmdStreaming`, `runCmdSync`, `runCmdBackground`, and `runCmdDetached`. Do not import raw `spawn`/`spawnSync` outside `src/utils/exec.ts`; add or extend an exec helper instead. Plain `.mjs` packaging fixtures that cannot import TypeScript helpers should keep child-process usage local and prefer `execFile`/`execFileSync` over spawn.
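
Reviewer note: the first three steps of the revised flag checklist in this hunk (flag definition, CLI grammar, daemon projection) might look roughly like the sketch below. Every name here is hypothetical: the `example` command, the `--pretty` and `--limit` flags, and the `ExampleInput` and `DaemonRequest` shapes are illustrations only, not the repository's actual types.

```typescript
// Hypothetical names throughout; this illustrates the layering, not the real code.

// Step 1: flag definitions (the checklist places these in src/utils/cli-flags.ts).
interface CliFlags {
  pretty?: boolean; // assumed CLI-only presentation flag
  limit?: number; // assumed flag that changes daemon execution
}

// Structured command input produced by the CLI grammar.
interface ExampleInput {
  target: string;
  pretty: boolean;
  limit?: number;
}

// Step 2: the CLI grammar reads positionals and flags into command input.
function readExampleInput(positionals: string[], flags: CliFlags): ExampleInput {
  const [target] = positionals;
  if (target === undefined) throw new Error('example requires a target');
  return { target, pretty: flags.pretty ?? false, limit: flags.limit };
}

// Assumed daemon request shape for this sketch.
interface DaemonRequest {
  command: string;
  positionals: string[];
  flags: Record<string, unknown>;
}

// Step 3: projection writes into the daemon request only what affects daemon execution.
function projectExampleRequest(input: ExampleInput): DaemonRequest {
  const flags: Record<string, unknown> = {};
  if (input.limit !== undefined) flags.limit = input.limit;
  // `pretty` only affects CLI output in this sketch, so it is not forwarded.
  return { command: 'example', positionals: [input.target], flags };
}
```

Under these assumptions, a presentation-only flag never reaches the daemon request, which matches the checklist's intent that each layer learns about a flag only when it needs to.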
@@ -190,7 +208,7 @@ Command-only flags (like `find --first`) that don't flow to the platform layer o

  ## Testing Matrix
  - Docs/skills only: no tests required unless a more specific rule below applies.
- - CLI help/guidance changes in `src/utils/command-schema.ts`: run `pnpm exec vitest run src/utils/__tests__/args.test.ts`.
+ - CLI help/guidance changes in `src/utils/cli-help.ts`, `src/utils/cli-command-overrides.ts`, or `src/utils/command-schema.ts`: run `pnpm exec vitest run src/utils/__tests__/args.test.ts`.
  - SkillGym prompt/assertion changes: run `pnpm test:skillgym:case <case-id>`; the script builds local CLI help first. For broad validation, use `pnpm test:skillgym`; append `-- --tag fixture-smoke` or `-- --tag skill-guidance` when validating one suite group.
  - Non-TS, no behavior impact: no tests unless requested.
  - Keep tests behavioral; do not assert shapes or cases TypeScript already proves.
@@ -208,6 +226,7 @@ Command-only flags (like `find --first`) that don't flow to the platform layer o
  - Do not run integration tests by default.
  - Do not inspect both iOS and Android codepaths unless task requires both.
  - Prefer targeted `git diff -- <paths>` over broad file reads during review.
+ - Keep long help prose in `src/utils/cli-help.ts`; keep flag definitions in `src/utils/cli-flags.ts`; keep CLI-specific command usage/flag metadata in `src/utils/cli-command-overrides.ts`.
  - Prefer `snapshot -i`, `find`, and scoped selectors over repeated full snapshot dumps when exploring Apple desktop UIs.
  - Keep PR summaries short and scoped.

@@ -222,9 +241,10 @@ Command-only flags (like `find --first`) that don't flow to the platform layer o
  - Changing `tsconfig.lib.json`/build tooling without running `pnpm check:tooling`; declaration generation is stricter than `tsc --noEmit`.

  ## Docs & Skills
- - Versioned CLI help is the agent-facing source of truth. Put workflow guidance in `src/utils/command-schema.ts` help topics and assert important copy in `src/utils/__tests__/args.test.ts`.
+ - Versioned CLI help is the agent-facing source of truth. Put workflow guidance and help-topic prose in `src/utils/cli-help.ts`, keep flag definitions in `src/utils/cli-flags.ts`, keep CLI command overrides in `src/utils/cli-command-overrides.ts`, and assert important copy in `src/utils/__tests__/args.test.ts`.
+ - Keep parser schema and help rendering separate: `src/utils/command-schema.ts` composes contract-derived command schemas with CLI overrides; `src/utils/cli-help.ts` owns help topics and usage rendering.
  - Skills are thin routers. Keep `skills/**/SKILL.md` focused on when to use the skill, version gating, which `agent-device help <topic>` page to read, and a short default loop. Do not duplicate full CLI manuals in skills.
- - For behavior/CLI surface changes, update the versioned help instructions in `src/utils/command-schema.ts` and assert important help copy in `src/utils/__tests__/args.test.ts`. Also update `README.md` and relevant `website/docs/**` when user-facing docs need it.
+ - For behavior/CLI surface changes, update the versioned help instructions in `src/utils/cli-help.ts` or the CLI command metadata in `src/utils/cli-command-overrides.ts`, then assert important help copy in `src/utils/__tests__/args.test.ts`. Also update `README.md` and relevant `website/docs/**` when user-facing docs need it.
  - For behavior/CLI surface changes and command-planning guidance changes, write or update a SkillGym case in `test/skillgym/suites/agent-device-smoke-suite.ts` that captures the expected agent command plan.
  - Do not update `skills/**/SKILL.md` for command behavior or workflow guidance unless the user explicitly asks; skills must route to versioned CLI help instead of carrying behavior details.
  - Keep SkillGym cases behavioral and command-planning oriented. Prefer prompts that assert the user-visible contract and expected command family over brittle exact output, but forbid known bad patterns.
@@ -245,6 +265,7 @@ Command-only flags (like `find --first`) that don't flow to the platform layer o

  ## Key Files
  - CLI parse + formatting: `src/bin.ts`, `src/cli.ts`, `src/utils/args.ts`
+ - CLI help + option metadata: `src/utils/cli-help.ts`, `src/utils/cli-flags.ts`, `src/utils/cli-command-overrides.ts`, `src/utils/command-schema.ts`, `src/utils/cli-option-schema.ts`
  - Daemon client transport: `src/daemon-client.ts`
  - Daemon state/store: `src/daemon/session-store.ts`
  - Selector DSL and matching: `src/daemon/selectors.ts`
@@ -254,7 +275,9 @@ Command-only flags (like `find --first`) that don't flow to the platform layer o
  - Handler context helpers: `src/daemon/context.ts`, `src/daemon/device-ready.ts`
  - Request routing/policy: `src/daemon/request-router.ts`, `src/daemon/request-admission.ts`, `src/daemon/request-generic-dispatch.ts`
  - Dispatcher + capability map: `src/core/dispatch.ts`, `src/core/dispatch-context.ts`, `src/core/dispatch-interactions.ts`, `src/core/capabilities.ts`
- - Command catalog + positional codecs: `src/command-catalog.ts`, `src/command-codecs.ts`, `src/command-codecs/*`
+ - Command catalog + command surface: `src/command-catalog.ts`, `src/commands/command-surface.ts`, `src/commands/command-contract.ts`, `src/commands/client-command-contracts.ts`
+ - CLI grammar: `src/commands/cli-grammar.ts`, `src/commands/cli-grammar/*`
+ - Daemon request projection: `src/commands/command-projection.ts`
  - Platform backends: `src/platforms/ios/*`, `ios-runner/*`, `src/platforms/android/*`

  ## Pull Requests
1 change: 1 addition & 0 deletions CONTEXT.md
@@ -13,6 +13,7 @@
  - Target: selected automation destination, such as mobile, tv, or desktop.
  - Modality: broad supported device family, such as mobile, tv, or desktop.
  - Session: daemon-owned state for a selected target and opened app or surface.
+ - Command surface: catalog of public command identity, interface exposure, adapter policy, and shared command metadata across CLI, Node.js, MCP, and batch entrypoints.

  ## Testing Principles

2 changes: 1 addition & 1 deletion README.md
@@ -83,7 +83,7 @@ Snapshots assign refs like `@e1`, `@e2`, and `@e3` to elements on the current sc

  ## Next Steps

- - **Set up your agent**: run the CLI from Cursor, Codex, Claude Code, Windsurf, or another agent terminal. For skills, rules, MCP discovery, and client-specific setup, see [AI Agent Setup](https://incubator.callstack.com/agent-device/docs/agent-setup).
+ - **Set up your agent**: run the CLI from Cursor, Codex, Claude Code, Windsurf, or another agent terminal. For skills, rules, direct MCP tools, and client-specific setup, see [AI Agent Setup](https://incubator.callstack.com/agent-device/docs/agent-setup).
  - **Try the sample app**: clone the repo and run the bundled Expo fixture when you want a guided first dogfood run with screenshots, replay, and performance evidence. See [Quick Start](https://incubator.callstack.com/agent-device/docs/quick-start).
  - **Go deeper**: use [Commands](https://incubator.callstack.com/agent-device/docs/commands), [Replay & E2E](https://incubator.callstack.com/agent-device/docs/replay-e2e), and [Debugging & Profiling](https://incubator.callstack.com/agent-device/docs/debugging-profiling) for production workflows.
