
Adopt spec v0.2: agentic generalist direction #8

Merged
rockfordlhotka merged 1 commit into main from spec/v0.2
Apr 22, 2026


@rockfordlhotka
Member

Summary

Pivots docs/foragent-specification.md to an agentic model: one generalist browser-task capability plus narrow fast-path specialists, with RockBot's ISkillStore + ILongTermMemory as the learning substrate. Steps 6–9 added to §9.1. Built directly on Microsoft.Playwright NuGet (no MCP sidecar, no Stagehand port — see Appendix A #16).

Context: the v0.1 spec envisioned five narrow site-specific verbs. Step 5 surfaced the scaling problem (hand-written ISitePoster per site) and the calling-agent shape mismatch (invoke_agent passes free-text; typed skills are hostile to natural-language callers). v0.2 addresses both.

What changed in the spec

  • §1 Summary — rephrased for the agentic framing.
  • §3.7 (new) — LLM tier routing via TieredChatClientRegistry.
  • §5 (wholesale rewrite) — capability model (§5.1), v0.2 initial set (§5.2), out-of-scope (§5.3), design principles (§5.4), multi-phase flows (§5.5), learning substrate (§5.6), human-in-the-loop (§5.7).
  • §7.1 — allowlists are mandatory; wildcards supported (*.example.com, *).
  • §9.1 — steps 6–9 added; 1–5 condensed as shipped.
  • §9.2 — Stagehand exclusion dropped.
  • §12 — Q1/Q2 closed; Q6/Q7/Q8 added.
  • Appendix A — decisions #16–#20 (direct-SDK, tier routing, wildcarded allowlists, framework persistence, multi-phase as separate A2A tasks).
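To illustrate the §7.1 wildcard forms listed above (`*.example.com`, `*`), here is a minimal Python sketch of host matching against a mandatory allowlist. It is not Foragent's code (Foragent is a .NET project): the function name `host_allowed` is hypothetical, and the rule that `*.example.com` covers subdomains but not the bare `example.com` is an assumption, since this summary does not state how the apex domain is treated.

```python
from typing import Iterable


def host_allowed(host: str, allowlist: Iterable[str]) -> bool:
    """Return True if `host` matches any allowlist pattern.

    Assumed pattern forms (hypothetical semantics, see lead-in):
      "*"             matches every host
      "*.example.com" matches any subdomain of example.com, not the apex
      "example.com"   matches that exact host only
    An empty allowlist denies everything, since allowlists are mandatory.
    """
    host = host.lower().rstrip(".")
    for pattern in allowlist:
        pattern = pattern.lower()
        if pattern == "*":
            return True
        if pattern.startswith("*."):
            suffix = pattern[1:]  # keep the leading dot: ".example.com"
            # The length check rejects a host that is only the suffix itself.
            if host.endswith(suffix) and len(host) > len(suffix):
                return True
        elif host == pattern:
            return True
    return False
```

Keeping the leading dot in the suffix comparison is what stops a look-alike host such as `evilexample.com` from matching `*.example.com`.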

What didn't change

  • Credentials-by-reference (§6) — the v0.1 design was right; kept as-is.
  • One shared browser / fresh BrowserContext per task (§3.5).
  • Prohibited-capability list (§7.3) — account creation, financial transactions, security-permission changes.
  • RockBot framework as a versioned NuGet dep, not a monkey-patched internal (§8.4).

Supporting artifacts

The direction-setting proposal that walked through the decision (including the Stagehand / playwright-mcp / direct-SDK survey) is archived at docs/archive/foragent-spec-v0.2-proposal.md for context.

Test plan

🤖 Generated with Claude Code

Pivots Foragent from five narrow verbs to one generalist browser-task
capability plus a small set of fast-path specialists. Step 5 showed
hand-written site-specific code doesn't scale and that structured
typed skills are hostile to the natural-language callers (mostly other
LLM agents) Foragent actually has.

§5 wholesale rewrite: two-tier capability model (§5.1), v0.2 initial
set with browser-task as the generalist (§5.2), multi-phase flows
with returned artifacts (§5.5), learning substrate on RockBot's
ISkillStore + ILongTermMemory (§5.6), human-in-the-loop explicitly
caller-side (§5.7).

§3.7 adds LLM tier routing via RockBot's TieredChatClientRegistry.
§7.1 makes allowlists mandatory with wildcard support.
§9.1 adds steps 6-9; §9.2 drops the Stagehand exclusion.
§12 closes Q1/Q2; adds Q6/Q7/Q8 for the step 6-8 work.
Appendix A gains decisions #16-#20: direct-SDK (no MCP/Stagehand),
tier routing, mandatory wildcarded allowlists, framework persistence
for learned knowledge, multi-phase as separate tasks.

Working doc from the direction-setting discussion archived to
docs/archive/foragent-spec-v0.2-proposal.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
rockfordlhotka merged commit e55c09d into main on Apr 22, 2026
1 check passed
rockfordlhotka deleted the spec/v0.2 branch on April 22, 2026 03:28
rockfordlhotka added a commit that referenced this pull request on Apr 23, 2026
Ships the phase-1 / phase-3 form capability pair from spec §5.5 and
bumps the RockBot framework to 0.9.* for the multi-file skill API.

- FormSchema / FormField wire types with stable JSON serializer options.
- IBrowserPage.ScanFormAsync: single-JS-pass deterministic form read
  (labels, validation attrs, select/radio options). Adds
  SelectOptionAsync + SetCheckedAsync for the batch submit path.
- LearnFormSchemaCapability: navigate → deterministic scan →
  FormSchemaEnricher (one LLM turn, can only add dependsOn + notes;
  structural fields are DOM-authoritative) → persist as Skill with a
  SkillResourceType.JsonSchema "schema.json" resource at
  sites/{host}/forms/{slug}.
- ExecuteFormBatchCapability: resolves schema by skillRef via
  ISkillStore.GetResourceAsync or inline; streams per-row progress via
  AgentTaskContext.PublishStatus; default mode "abort-on-first"
  (spec open-question #8 resolution), caller opts into "continue".
  Success signal: optional successIndicator selector, URL-change
  fallback.
- Package bump 0.8.* → 0.9.* (adds RockBot.Llm.Abstractions). Docker
  image rockylhotka/rockbot-agent 0.8.5 → 0.9.11 so the peer side
  gets PR #291 structured-data invoke_agent.
- Version scheme adopted: 0.2.0-alpha.8 in Directory.Build.props,
  documented in CLAUDE.md Conventions.
- 14 unit tests + 1 Kestrel+Chromium end-to-end integration test.
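The FormSchemaEnricher rule described in the list above (the LLM turn may only add dependsOn and notes, while structural fields stay DOM-authoritative) can be sketched as a merge step. This is a Python illustration, not the shipped .NET code: the dictionary field shape, the `name` key used for matching, and the function name `merge_enrichment` are all assumptions.

```python
# Keys the LLM is allowed to contribute; everything else comes from the DOM scan.
ENRICHABLE_KEYS = ("dependsOn", "notes")


def merge_enrichment(scanned_fields, llm_fields):
    """Overlay LLM suggestions onto deterministically scanned fields.

    Hypothetical shapes: each field is a dict with at least a "name" key.
    Structural values from the scan are never overwritten, and fields the
    LLM invents that the scan did not find are dropped.
    """
    suggestions = {f["name"]: f for f in llm_fields if "name" in f}
    merged = []
    for field in scanned_fields:
        result = dict(field)  # copy so the scan result is left untouched
        suggestion = suggestions.get(field["name"], {})
        for key in ENRICHABLE_KEYS:
            if key in suggestion:
                result[key] = suggestion[key]
        merged.append(result)
    return merged
```

Copying only an explicit set of keys, rather than removing the forbidden ones, means any new structural attribute added to the scan later is protected by default.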

Resolves spec open-questions #6 (typed artifacts — uses RockBot 0.9
resource files, no parallel Foragent store) and #8 (batch semantics —
abort-on-first default).
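The batch semantics resolved here ("abort-on-first" by default, "continue" as an opt-in) can be sketched as follows. This is a Python illustration under stated assumptions, not the ExecuteFormBatchCapability implementation: `run_batch`, the `submit_row` callback, and the per-row result shape are hypothetical, and a row is treated as failed when the callback raises.

```python
def run_batch(rows, submit_row, mode="abort-on-first"):
    """Submit rows in order and collect one result per attempted row.

    mode="abort-on-first": stop after the first failing row (the default).
    mode="continue":       record the failure and keep going.
    """
    if mode not in ("abort-on-first", "continue"):
        raise ValueError(f"unknown batch mode: {mode!r}")

    results = []
    for index, row in enumerate(rows):
        try:
            submit_row(row)
            results.append({"row": index, "ok": True})
        except Exception as exc:
            results.append({"row": index, "ok": False, "error": str(exc)})
            if mode == "abort-on-first":
                break  # rows after the failure are never attempted
    return results
```

With the default mode the result list ends at the failing row, so a caller can tell which rows were never attempted by comparing its length to the input.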

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>