This repository was archived by the owner on Mar 25, 2026. It is now read-only.
* Sandbox and runtime updates
* Update the "why" for features
* Tweaks
* Tweaks
* Tweaks
* Tweaks
* Updates
* Clean up
* Fixes
* Add Community section and move Cookbook to top level
- Create Community section with Snow Leopard example
- Add `ExternalCard` component for external link cards
- Move Cookbook from `Learn/` to top-level nav
- Update internal links to Cookbook pages
- Remove empty Learn section
* Tweaks
* Fixes
* Update examples
* Fix Bun links
* Tweaks
* Reorganize and add more info
* Update AI commands
* Tweaks
The [OpenCode plugin](/Reference/CLI/opencode-plugin) provides AI-assisted development for full-stack Agentuity projects, including agents, routes, frontend, and deployment.
</Callout>

- [Using the AI SDK](/Agents/ai-sdk-integration): Add LLM capabilities with `generateText` and `streamText`
- [Managing State](/Agents/state-management): Persist data across requests with thread and session state
- [Calling Other Agents](/Agents/calling-other-agents): Build multi-agent workflows
**content/Agents/evaluations.mdx** (46 additions)
Evaluations (evals) are automated tests that run after your agent completes. They validate output quality, check compliance, and monitor performance without blocking agent responses.
## Why Evals?
Most evaluation tools test the LLM: did the model respond appropriately? That's fine for chatbots, but agents aren't single LLM calls. They're entire runs with multiple model calls, tool executions, and orchestration working together.
Agent failures can happen anywhere in the run—a tool call that returned bad data, a state bug that corrupted context, and more. Testing just the LLM response misses most of this.
Agentuity evals test the whole run—every tool call, state change, and orchestration step. They run on every session in production, so you catch issues with real traffic.
**The result:**

- **Full-run evaluation**: Test the entire agent execution, not just LLM responses
- **Production monitoring**: Once configured, evals run automatically on every session
- **Async by default**: Evals don't block responses, so users aren't waiting
- **Preset library**: Common checks (PII, safety, hallucination) available out of the box
Evals come in two types: **binary** (pass/fail) for yes/no criteria, and **score** (0-1) for quality gradients.
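The two result shapes can be sketched in isolation. This is a hypothetical illustration, not the Agentuity eval API; `noPII` and `brevity` are made-up checks:

```typescript
// Hypothetical illustration of the two eval result shapes (not the real API).
type BinaryEval = (output: string) => { pass: boolean }; // yes/no criteria
type ScoreEval = (output: string) => { score: number };  // quality gradient, 0-1

// Binary: does the output leak a US-SSN-shaped string?
const noPII: BinaryEval = (output) => ({
  pass: !/\b\d{3}-\d{2}-\d{4}\b/.test(output),
});

// Score: reward shorter outputs, clamped to [0, 1].
const brevity: ScoreEval = (output) => ({
  score: Math.max(0, Math.min(1, 1 - output.length / 1000)),
});

console.log(noPII('Your order shipped.')); // { pass: true }
console.log(brevity('Short reply.'));      // { score: 0.988 }
```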
Use `{ strict: true }` when generating schemas for LLM structured output (e.g., OpenAI's `response_format`). Strict mode ensures the schema is compatible with model constraints and produces more reliable outputs.
</Callout>
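To make the effect of strict mode concrete, here is a dependency-free sketch of the JSON Schema shape strict structured output expects. The helper is hypothetical; `@agentuity/schema` handles this for you:

```typescript
// Hypothetical helper showing the JSON Schema shape strict mode produces.
function toStrictJsonSchema(props: Record<string, { type: string }>) {
  return {
    type: 'object',
    properties: props,
    required: Object.keys(props), // strict mode: every property is required
    additionalProperties: false,  // strict mode: no unknown keys allowed
  };
}

const schema = toStrictJsonSchema({
  name: { type: 'string' },
  age: { type: 'number' },
});
console.log(schema.required); // ['name', 'age']
```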
<Callout type="info" title="When to Use">
Use `@agentuity/schema` for simple validation needs. For advanced features like email validation, string length constraints, or complex transformations, consider Zod or Valibot.
</Callout>
**content/Agents/standalone-execution.mdx** (50 additions, 20 deletions)
Sometimes your agent logic needs to run without an incoming HTTP request. `createAgentContext()` gives standalone code the same infrastructure that HTTP handlers get automatically: tracing, sessions, and storage access.
`createAgentContext()` requires Agentuity's runtime initialization and only works within the Agentuity runtime (Discord bots, CLI tools, queue workers deployed alongside your agents). It will throw an error if called from external frameworks like Next.js or Express. To access storage from external backends, see [SDK Utilities for External Apps](/Cookbook/Patterns/server-utilities).
</Callout>
## Basic Usage
```ts
import { createAgentContext } from '@agentuity/runtime';
import chatAgent from '@agent/chat';

const ctx = createAgentContext();
const result = await ctx.run(chatAgent, { message: 'Hello' });
```
The `run()` method executes your agent with full infrastructure support: tracing, session management, and access to all storage services.
**content/Agents/streaming-responses.mdx** (4 additions)
Enable streaming by setting `stream: true` in your schema and returning a `textStream`:
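A `textStream` is consumed as an async iterable of string chunks. A dependency-free stand-in (purely illustrative; in practice the stream comes from the AI SDK's `streamText()`) shows the contract:

```typescript
// Illustrative stand-in: any async iterable of string chunks behaves like a textStream.
async function* fakeTextStream(): AsyncGenerator<string> {
  for (const chunk of ['Hello', ', ', 'world']) {
    yield chunk;
  }
}

// A consumer (as the streaming middleware does) simply iterates the chunks.
async function collect(stream: AsyncIterable<string>): Promise<string> {
  let out = '';
  for await (const chunk of stream) out += chunk;
  return out;
}

collect(fakeTextStream()).then((text) => console.log(text)); // logs "Hello, world"
```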
52
52
53
+
<Callout type="info" title="AI SDK Integration">
The `textStream` from AI SDK's `streamText()` works directly with Agentuity's streaming middleware. Return it from your handler without additional processing.
</Callout>
**content/Agents/workbench.mdx** (13 additions)
Workbench is a built-in UI for testing your agents during development. It automatically discovers your agents, displays their input/output schemas, and lets you execute them with real inputs.
## Why Workbench?
Testing agents isn't like testing traditional APIs. You need to validate input schemas, see how responses are formatted, test multi-turn conversations, and understand execution timing. Using `curl` or Postman means manually constructing JSON payloads and parsing responses.
Workbench understands your agents. It reads your schemas, generates test forms, maintains conversation threads, and shows execution metrics. When something goes wrong, you see exactly what the agent received and returned.
**Key capabilities:**

- **Schema-aware testing**: Input forms generated from your actual schemas
- **Thread persistence**: Test multi-turn conversations without manual state tracking
- **Execution metrics**: See token usage and response times for every request
- **Quick iteration**: Test prompts display in the UI for one-click execution
## Enabling Workbench
Add a `workbench` section to your `agentuity.config.ts`:
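As a sketch of what that section can look like (the option names here are assumptions; check the generated config in your project for the exact shape):

```typescript
// agentuity.config.ts — illustrative shape; option names are assumptions.
export default {
  // ...your existing config
  workbench: {
    enabled: true,
  },
};
```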
The second argument to `invoke()` accepts `params` for path parameter values (e.g., `{ params: { itemId: '123' } }`).
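Conceptually, `params` fills the path's placeholders. A dependency-free illustration of that substitution (not the actual `useAPI` implementation, and the `:itemId` template syntax is an assumption):

```typescript
// Illustrative only: how { params: { itemId: '123' } } might map into a route path.
function fillPath(template: string, params: Record<string, string>): string {
  return template.replace(/:([A-Za-z_]+)/g, (_match, name: string) => {
    if (!(name in params)) throw new Error(`Missing path param: ${name}`);
    return encodeURIComponent(params[name]);
  });
}

console.log(fillPath('/items/:itemId', { itemId: '123' })); // "/items/123"
```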
<Callout type="info" title="Query and Headers">
To set `query` or `headers`, pass them when calling `useAPI()`, not to `invoke()`. For dynamic query parameters, see the [example above](#request-options).
</Callout>
## Auth State with useAuth
Access authentication state for protected components or custom auth logic: