Lightweight sandboxes for executing LLM-generated code using QuickJS compiled to WebAssembly.
```
sandcastle/
├── packages/
│   └── sandbox/                  # @nigams/sandcastle — QuickJS wrapper library
│       └── src/index.ts          # runCode() and runLLMCode() APIs
└── apps/
    └── sample-runner/            # Example app demonstrating sandbox usage
        └── src/
            ├── main.ts           # Runner (executes all or selected examples)
            └── examples/
                ├── 01-arithmetic.ts      # Basic expression evaluation
                ├── 02-llm-code.ts        # Auto-wrapped LLM output
                ├── 03-console.ts         # Console output from sandbox
                ├── 04-env-vars.ts        # Passing data via environment
                ├── 05-error-handling.ts  # Safe error capture
                ├── 06-data-transform.ts  # Simple data transformation
                └── 07-large-dataset.ts   # Large dataset reduction pattern
```
```sh
pnpm install
pnpm run build
node apps/sample-runner/dist/main.js
```

Run a single example:

```sh
node apps/sample-runner/dist/main.js largeDataset
node apps/sample-runner/dist/main.js --list   # see all available examples
```

`runCode()` executes code in a QuickJS WASM sandbox. The code must use `export default <value>` to return a result.
```ts
import { runCode } from "@nigams/sandcastle";

const result = await runCode("export default 2 + 2");
// { ok: true, data: 4 }
```

`runLLMCode()` is a convenience wrapper for LLM-generated code. If the code doesn't contain `export default`, it's automatically wrapped in an async IIFE with `return` as the output mechanism.
```ts
import { runLLMCode } from "@nigams/sandcastle";

const result = await runLLMCode(`
  const x = [1, 2, 3];
  return x.reduce((a, b) => a + b, 0);
`);
// { ok: true, data: 6 }
```

| Option | Type | Default | Description |
|---|---|---|---|
| `env` | `Record<string, unknown>` | `{}` | Key-value pairs accessible as `env.KEY` in the sandbox |
| `allowFetch` | `boolean` | `false` | Enable `fetch()` inside the sandbox |
| `allowFs` | `boolean` | `false` | Enable `node:fs` inside the sandbox |
| `executionTimeout` | `number` | — | Max execution time in seconds |
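As an illustrative sketch of how a call site might assemble these options — note the `SandboxOptions` interface below is a local stand-in mirroring the table, not the package's exported type, and `withDefaults` is a hypothetical helper:

```typescript
// Local sketch of the options shape from the table above; the real type
// is defined by @nigams/sandcastle, so treat these names as assumptions.
interface SandboxOptions {
  env?: Record<string, unknown>; // exposed as env.KEY inside the sandbox
  allowFetch?: boolean;          // default: false
  allowFs?: boolean;             // default: false
  executionTimeout?: number;     // max execution time in seconds
}

// Apply the documented defaults so callers can rely on a filled-in shape.
function withDefaults(opts: SandboxOptions = {}): SandboxOptions {
  return { env: {}, allowFetch: false, allowFs: false, ...opts };
}

// Example: pass data in via env and cap execution at 5 seconds.
const opts = withDefaults({
  env: { ORDERS: JSON.stringify([{ id: 1, amount: 42 }]) },
  executionTimeout: 5,
});
```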
Both functions return a `SandboxResult`:

```ts
// Success
{ ok: true, data: <any> }

// Failure — errors are captured, never thrown
{ ok: false, error: { name: string, message: string, stack?: string } }
```

The most powerful use case for sandboxed execution is reducing large API responses before they enter an LLM's context window.
An API returns 1,000 records (136KB). Feeding this directly to an LLM:
- Wastes context window tokens
- Increases latency and cost
- Often exceeds context limits entirely
Let the LLM write the transformation, execute it in the sandbox, and return only the compact result.
```
┌─────────────┐     ┌────────────┐     ┌───────────────┐     ┌─────────────┐
│     LLM     │────▶│    Host    │────▶│    QuickJS    │────▶│     LLM     │
│  generates  │     │  fetches   │     │ sandbox runs  │     │  receives   │
│  transform  │     │ 1000 rows  │     │ the transform │     │   compact   │
│    code     │     │  (136KB)   │     │     code      │     │   summary   │
└─────────────┘     └────────────┘     └───────────────┘     │ (587 bytes) │
                                                             └─────────────┘
```
```ts
// 1. Host fetches data (credentials stay on host, never in sandbox)
const orders = await fetchFromCRM("/api/orders?limit=1000");

// 2. LLM generates transformation code
const llmCode = `
  const orders = JSON.parse(env.ORDERS);
  const completed = orders.filter(o => o.status === "completed");

  const summary = {};
  for (const order of completed) {
    if (!summary[order.region]) summary[order.region] = {};
    if (!summary[order.region][order.product]) {
      summary[order.region][order.product] = { count: 0, total: 0 };
    }
    summary[order.region][order.product].count++;
    summary[order.region][order.product].total += order.amount;
  }

  return { totalOrders: orders.length, completedOrders: completed.length, byRegionAndProduct: summary };
`;

// 3. Execute in sandbox — 136KB in, 587 bytes out
const result = await runLLMCode(llmCode, {
  env: { ORDERS: JSON.stringify(orders) },
});
```

| Concern | How it's handled |
|---|---|
| Security | API keys and credentials stay on the host — never passed into the sandbox |
| Isolation | LLM-generated code runs in WASM — cannot access host filesystem, network, or process |
| Efficiency | 136KB → 587 bytes (99.6% reduction) before hitting the LLM context window |
| Flexibility | The LLM generates different transforms per question — group by date, top-N, filter by status, etc. |
| Error safety | Broken or malicious code returns { ok: false, error } — never crashes the host |
- Summarizing large API responses (CRM, ERP, analytics)
- Aggregating time-series data before charting
- Filtering and reshaping database query results
- Extracting specific fields from verbose JSON payloads
- Any scenario where raw data is too large for the LLM context but the LLM needs to define how to reduce it
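These use cases share one shape: hand raw data to the sandbox via `env`, run an LLM-written transform, and keep only the compact result. A minimal sketch of that pattern, with the runner injected so the example stays self-contained — in real use you would pass `runLLMCode`; `reduceForLLM`, `fakeRun`, and their parameter names are assumptions, not library API:

```typescript
// Minimal local shapes mirroring the docs above (illustrative only).
type SandboxResult =
  | { ok: true; data: unknown }
  | { ok: false; error: { name: string; message: string } };
type Runner = (code: string, opts: { env: Record<string, unknown> }) => Promise<SandboxResult>;

// Hypothetical helper: serialize raw data into env, run the transform,
// and return only the compact result (or null on failure).
async function reduceForLLM(run: Runner, transformCode: string, rawData: unknown): Promise<unknown> {
  const result = await run(transformCode, { env: { DATA: JSON.stringify(rawData) } });
  return result.ok ? result.data : null;
}

// Fake runner standing in for runLLMCode so the sketch runs anywhere:
// it ignores the code and just sums the numbers a real transform would reduce.
const fakeRun: Runner = async (_code, opts) => {
  const data = JSON.parse(String(opts.env.DATA)) as number[];
  return { ok: true, data: data.reduce((a, b) => a + b, 0) };
};

const compact = await reduceForLLM(fakeRun, "return data.reduce((a, b) => a + b, 0);", [1, 2, 3]);
// compact === 6
```

Injecting the runner keeps the reduction logic testable without spinning up the WASM sandbox; swapping in the real `runLLMCode` changes nothing else about the call site.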