feat: expose buildPlanTree in order to be able to visualize smithers steps/graph in 3rd party engines #90

@Faolain

Expose buildPlanTree (and scheduler types) for external graph visualization

Problem

Smithers constructs an internal scheduling DAG via buildPlanTree() in src/engine/scheduler.ts, but neither the function nor its types are exported from the smithers-orchestrator package. External consumers who want to visualize or analyze workflow graphs (custom dashboards, CI integrations, debugging tools) are left with two incomplete options:

  1. The tasks array from renderFrame() / smithers graph — a flat list with ordinals. No edges, no dependency information. You know the tasks exist but not how they relate to each other.

  2. The xml tree — structural nesting that encodes ordering semantics (smithers:sequence, smithers:parallel, smithers:ralph), but requires the consumer to reimplement the XML-to-plan-tree conversion themselves, duplicating the logic already in buildPlanTree.

The actual graph that Smithers uses to schedule execution — the PlanNode tree — is the only representation that cleanly encodes task dependencies, parallel groups, and loop semantics. But it's internal.

What buildPlanTree does today

// src/engine/scheduler.ts (not exported from smithers-orchestrator)

export function buildPlanTree(xml: XmlNode | null): {
  plan: PlanNode | null;
  ralphs: RalphMeta[];
}

It walks the XML tree produced by the renderer and converts it into a PlanNode tree:

| XML Tag | PlanNode Kind | Scheduling Semantics |
| --- | --- | --- |
| smithers:workflow | sequence | Children execute sequentially |
| smithers:sequence | sequence | Children execute sequentially |
| smithers:parallel | parallel | Children execute concurrently |
| smithers:merge-queue | parallel | Concurrent, concurrency enforced via descriptors |
| smithers:ralph | ralph | Loop until condition or max iterations |
| smithers:worktree | group | No special scheduling, preserves boundaries |
| smithers:task | task | Leaf node referencing a nodeId |
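As a rough illustration of the mapping in the table above, the tag-to-kind step can be written as a plain lookup. This is only a sketch of that one step, not the actual buildPlanTree logic: kindForTag and TAG_TO_KIND are hypothetical names, and the real function also handles things this lookup ignores (ralph metadata, ID generation, nesting).

```typescript
// The five kinds of the PlanNode union described in this issue.
type PlanKind = "task" | "sequence" | "parallel" | "ralph" | "group";

// Transcribed from the tag table in this issue; not read from Smithers source.
const TAG_TO_KIND: Record<string, PlanKind> = {
  "smithers:workflow": "sequence",
  "smithers:sequence": "sequence",
  "smithers:parallel": "parallel",
  "smithers:merge-queue": "parallel",
  "smithers:ralph": "ralph",
  "smithers:worktree": "group",
  "smithers:task": "task",
};

// Hypothetical helper: returns undefined for any tag the table does not list.
function kindForTag(tag: string): PlanKind | undefined {
  // hasOwnProperty guards against inherited keys such as "toString".
  return Object.prototype.hasOwnProperty.call(TAG_TO_KIND, tag)
    ? TAG_TO_KIND[tag]
    : undefined;
}
```

Note that two pairs of tags collapse to the same kind (workflow/sequence and parallel/merge-queue), so the plan tree alone cannot tell them apart.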

The result is a tree like:

sequence
├── task("implement")
├── parallel
│   ├── task("update-linear")
│   └── task("update-docs")
└── task("summary")

This is what the engine feeds into scheduleTasks() every tick to determine which tasks are runnable.
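For debugging, a consumer could print a plan in the same shape as the diagram above. The following is a minimal sketch, assuming the PlanNode union listed under "Current types" in this issue; renderPlan and label are hypothetical helpers, not part of Smithers.

```typescript
// PlanNode is copied from this issue so the sketch is self-contained.
type PlanNode =
  | { kind: "task"; nodeId: string }
  | { kind: "sequence"; children: PlanNode[] }
  | { kind: "parallel"; children: PlanNode[] }
  | { kind: "ralph"; id: string; children: PlanNode[];
      until: boolean; maxIterations: number;
      onMaxReached: "fail" | "return-last" }
  | { kind: "group"; children: PlanNode[] };

// One-line label per node: tasks and ralph loops show their identifier.
function label(node: PlanNode): string {
  if (node.kind === "task") return 'task("' + node.nodeId + '")';
  if (node.kind === "ralph") return 'ralph("' + node.id + '")';
  return node.kind;
}

// Hypothetical helper: renders a plan as an indented box-drawing tree.
function renderPlan(root: PlanNode): string {
  const lines: string[] = [label(root)];
  const visit = (node: PlanNode, prefix: string): void => {
    if (node.kind === "task") return; // leaves have no children
    const children = node.children;
    children.forEach((child, i) => {
      const last = i === children.length - 1;
      lines.push(prefix + (last ? "└── " : "├── ") + label(child));
      visit(child, prefix + (last ? "    " : "│   "));
    });
  };
  visit(root, "");
  return lines.join("\n");
}
```

Fed the four-task example from this section, renderPlan reproduces the diagram above line for line.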

Current types (all unexported)

type PlanNode =
  | { kind: "task"; nodeId: string }
  | { kind: "sequence"; children: PlanNode[] }
  | { kind: "parallel"; children: PlanNode[] }
  | { kind: "ralph"; id: string; children: PlanNode[];
      until: boolean; maxIterations: number;
      onMaxReached: "fail" | "return-last" }
  | { kind: "group"; children: PlanNode[] };

type RalphMeta = {
  id: string;
  until: boolean;
  maxIterations: number;
  onMaxReached: "fail" | "return-last";
};

Proposal

Export buildPlanTree and its types so external consumers can go from XmlNode (already exported) to a proper plan tree without reimplementing the conversion.

Minimal change (just export what exists)

// src/index.ts — add to existing exports

// Scheduler / Plan Tree
export { buildPlanTree } from "./engine/scheduler";
export type { PlanNode, RalphMeta } from "./engine/scheduler";

This is zero new code — just making existing internals public. Consumers would use it like:

import { renderFrame, buildPlanTree } from "smithers-orchestrator";
import type { PlanNode } from "smithers-orchestrator";

const snap = await renderFrame(workflow, ctx);
const { plan, ralphs } = buildPlanTree(snap.xml);
// plan is now a PlanNode tree you can walk to build edges
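Walking the returned tree is a short recursion. The sketch below collects task IDs in depth-first order and handles the null plan that buildPlanTree's signature allows; it assumes the PlanNode shape shown in this issue, and collectTaskIds is a hypothetical helper name.

```typescript
// PlanNode is copied from this issue so the sketch is self-contained.
type PlanNode =
  | { kind: "task"; nodeId: string }
  | { kind: "sequence"; children: PlanNode[] }
  | { kind: "parallel"; children: PlanNode[] }
  | { kind: "ralph"; id: string; children: PlanNode[];
      until: boolean; maxIterations: number;
      onMaxReached: "fail" | "return-last" }
  | { kind: "group"; children: PlanNode[] };

// Hypothetical helper: depth-first list of every task nodeId in the plan.
function collectTaskIds(plan: PlanNode | null): string[] {
  if (plan === null) return []; // buildPlanTree may return plan: null
  if (plan.kind === "task") return [plan.nodeId];
  const ids: string[] = [];
  // Every non-task kind carries a children array.
  for (const child of plan.children) {
    for (const id of collectTaskIds(child)) ids.push(id);
  }
  return ids;
}
```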

Optional: also export scheduleTasks and state types

If we want consumers to be able to replay scheduling logic (e.g. to show "what would run next given current state"), we'd also export:

export { scheduleTasks, buildStateKey } from "./engine/scheduler";
export type {
  TaskState,
  TaskStateMap,
  ScheduleResult,
  RalphState,
  RalphStateMap,
} from "./engine/scheduler";

This is more surface area to maintain but enables richer visualization (showing in-progress vs. blocked vs. runnable states).

Optional: edge-list helper

A convenience function that walks a PlanNode tree and produces an explicit edge list, so consumers don't have to implement their own tree traversal:

type GraphEdge = {
  from: string;   // nodeId
  to: string;     // nodeId
  kind: "sequence" | "parallel-sync" | "ralph-loop";
};

function planToEdges(plan: PlanNode): GraphEdge[];

This is new code and could live in a smithers-orchestrator/graph subpath export.
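One possible shape for such a helper is sketched below. It is an illustration of the idea, not a proposed final implementation, and it bakes in several assumptions that would need confirming against the scheduler: group and ralph bodies are treated as sequential, an edge is labelled "parallel-sync" whenever it fans out of or into more than one task, and a ralph loop is drawn as a back-edge from its last tasks to its first tasks. The Ends type and connect function are hypothetical.

```typescript
// PlanNode and GraphEdge are copied from this issue so the sketch is self-contained.
type PlanNode =
  | { kind: "task"; nodeId: string }
  | { kind: "sequence"; children: PlanNode[] }
  | { kind: "parallel"; children: PlanNode[] }
  | { kind: "ralph"; id: string; children: PlanNode[];
      until: boolean; maxIterations: number;
      onMaxReached: "fail" | "return-last" }
  | { kind: "group"; children: PlanNode[] };

type GraphEdge = {
  from: string;
  to: string;
  kind: "sequence" | "parallel-sync" | "ralph-loop";
};

// Task ids where edges enter a subtree (entries) and leave it (exits).
type Ends = { entries: string[]; exits: string[] };

function connect(node: PlanNode, edges: GraphEdge[]): Ends {
  if (node.kind === "task") {
    return { entries: [node.nodeId], exits: [node.nodeId] };
  }
  if (node.kind === "parallel") {
    // Parallel children are not linked to each other; their ends are merged.
    const ends: Ends = { entries: [], exits: [] };
    for (const child of node.children) {
      const sub = connect(child, edges);
      ends.entries.push(...sub.entries);
      ends.exits.push(...sub.exits);
    }
    return ends;
  }
  // ASSUMPTION: sequence, group, and ralph bodies are chained in child order.
  let entries: string[] = [];
  let exits: string[] = [];
  for (const child of node.children) {
    const sub = connect(child, edges);
    if (sub.entries.length === 0) continue; // empty subtree adds nothing
    if (exits.length === 0) {
      entries = sub.entries; // first non-empty child
    } else {
      const fan = exits.length > 1 || sub.entries.length > 1;
      for (const from of exits) {
        for (const to of sub.entries) {
          edges.push({ from, to, kind: fan ? "parallel-sync" : "sequence" });
        }
      }
    }
    exits = sub.exits;
  }
  if (node.kind === "ralph") {
    // ASSUMPTION: the loop is drawn as a back-edge from the body's end to its start.
    for (const from of exits) {
      for (const to of entries) edges.push({ from, to, kind: "ralph-loop" });
    }
  }
  return { entries, exits };
}

function planToEdges(plan: PlanNode): GraphEdge[] {
  const edges: GraphEdge[] = [];
  connect(plan, edges);
  return edges;
}
```

For the implement / update-linear / update-docs / summary example, this yields four "parallel-sync" edges: two fanning out of implement and two joining into summary.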

Questions for discussion

  1. Export surface — Is exporting just buildPlanTree + PlanNode + RalphMeta enough, or should scheduleTasks and the state types come along too?

  2. Stability: PlanNode is a simple discriminated union. Is this shape stable enough to be public API, or is it likely to change (e.g. new node kinds for future components)?

  3. Edge-list helper — Should Smithers itself provide a planToEdges() convenience, or is that better left to consumers? The tree is simple enough to walk, but an edge list is the lingua franca for graph visualization libraries (D3, Dagre, Mermaid, etc.).

  4. CLI integration — Should smithers graph include the PlanNode tree (or edge list) in its JSON output alongside the existing xml and tasks? That would make it available without writing any code.

  5. Subpath export — If we add graph utilities, should they go in the main export or a dedicated subpath like smithers-orchestrator/graph?

Context

  • buildPlanTree is called once per engine tick in runWorkflow() (line ~1801 of src/engine/index.ts), after rendering the workflow JSX to XML
  • It's a pure function: XmlNode -> { plan, ralphs } — no side effects, no internal state
  • renderFrame() is already the public API for getting the XmlNode that feeds into buildPlanTree
  • The smithers graph CLI command already calls renderFrame() but does not call buildPlanTree() — it just serializes the raw snapshot
  • PR fix(cli): handle cyclic references in smithers graph output #89 fixes a crash in smithers graph where cyclic Drizzle references made the output un-serializable

Why querying SQLite directly doesn't replace buildPlanTree

Smithers persists execution state in SQLite across several internal tables. A natural question is whether consumers could skip buildPlanTree entirely and reconstruct the graph from the database. The short answer is no — the database stores the input to buildPlanTree, not its output.

What SQLite stores

The _smithers_frames table captures a snapshot per engine tick:

| Column | Contents |
| --- | --- |
| xml_json | The full rendered XML tree (canonicalized, SHA-256 hashed for dedup) |
| task_index_json | Flat array of { nodeId, ordinal, iteration } per task |
| mounted_task_ids_json | Which tasks were active at that point in time |

Other tables track per-task execution state:

| Table | Purpose |
| --- | --- |
| _smithers_runs | Run metadata (workflow name, status, timestamps, config) |
| _smithers_nodes | Node state per iteration (pending/in-progress/finished/failed/cancelled/skipped) |
| _smithers_attempts | Individual attempt details (agent response, error, cached flag) |
| _smithers_ralph | Ralph loop iteration counters and done flags |
| _smithers_approvals | Approval state for nodes requiring approval |
| _smithers_tool_calls | Individual tool invocations during attempts |
| _smithers_events | Event log (NodeStarted, NodeFinished, FrameCommitted, etc.) |

What's missing

None of these tables encode dependency semantics. The XML stored in xml_json contains the structural nesting (smithers:sequence, smithers:parallel, smithers:ralph, etc.), but a consumer reading it would need to:

  1. Parse the JSON back into an XmlNode tree
  2. Map each XML tag to its scheduling semantics (e.g. smithers:sequence = children are sequential, smithers:parallel = children are concurrent)
  3. Handle edge cases (nested Ralph detection, stable ID generation for unnamed nodes, merge-queue-as-parallel, worktree-as-group)

This is exactly what buildPlanTree already does. Without it, every external consumer reimplements the same tag-to-scheduling-kind mapping.

The ideal consumer flow

SQLite _smithers_frames.xml_json   (or)   renderFrame().xml
              │                                    │
              ▼                                    ▼
         JSON.parse()                         (already XmlNode)
              │                                    │
              └──────────────┬─────────────────────┘
                             ▼
                    buildPlanTree(xml)
                             │
                             ▼
                  PlanNode tree + RalphMeta[]
                             │
                             ▼
              Consumer's graph visualization

Whether the XML comes from a live renderFrame() call or from a stored frame in SQLite, buildPlanTree is the function that turns raw structure into scheduling semantics. Exposing it lets consumers use either path without duplicating Smithers internals.

Real-time execution state: staying in sync with a running workflow

buildPlanTree gives you the static control flow graph. But a live visualization also needs to know which step is currently executing. Smithers already has four mechanisms for this — the question is whether they're sufficient or need to be surfaced differently alongside buildPlanTree.

SmithersEvent types

Smithers emits a comprehensive event stream during execution. Every event carries { runId, timestampMs } and node-level events add { nodeId, iteration, attempt }:

| Event | Meaning |
| --- | --- |
| RunStarted | Workflow execution begins |
| NodePending | Node queued but not yet running |
| NodeStarted | Node execution begins |
| NodeFinished | Node execution succeeds |
| NodeFailed | Node execution fails |
| NodeSkipped | Node skipped due to skipIf condition |
| NodeRetrying | Node failed but will retry |
| NodeCancelled | Node cancelled (unmounted or stale) |
| NodeWaitingApproval | Node awaiting manual approval |
| NodeOutput | Node produces stdout/stderr (high-frequency, not persisted to event table) |
| FrameCommitted | XML snapshot saved (graph structure may have changed) |
| RunFinished / RunFailed / RunCancelled | Terminal states |

There are also approval events (ApprovalRequested, ApprovalGranted, ApprovalDenied), revert events, tool call events (ToolCallStarted, ToolCallFinished), and hot reload events.

Observation mechanisms

1. SSE event stream (remote / web frontends)

The built-in HTTP server exposes an SSE endpoint:

GET /v1/runs/{runId}/events?afterSeq=-1
  • Streams every SmithersEvent as event: smithers\ndata: {JSON}\n\n
  • Supports afterSeq for stateless resumption after disconnects
  • Polls the DB every 500ms, sends keep-alive heartbeats every 10s
  • Auto-closes when the run reaches a terminal state
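A client that reads this stream by hand (for example from a fetch body rather than a browser EventSource) has to split the text into frames itself. The sketch below parses the event: smithers\ndata: {JSON}\n\n framing described above. parseSse and SseMessage are hypothetical names, the payload is left as unknown because the exact event JSON shape is not specified here, and treating lines that start with ":" as ignorable keep-alives is an assumption about the heartbeat format.

```typescript
type SseMessage = { event: string; data: unknown };

// Hypothetical helper: splits buffered SSE text into complete messages and
// returns any trailing partial frame so the caller can prepend it to the
// next chunk it receives.
function parseSse(buffer: string): { messages: SseMessage[]; rest: string } {
  const frames = buffer.split("\n\n");
  const rest = frames.pop() || ""; // last piece is incomplete (or empty)
  const messages: SseMessage[] = [];
  for (const frame of frames) {
    let event = "message"; // SSE default when no event: field is present
    const dataLines: string[] = [];
    for (const line of frame.split("\n")) {
      if (line.indexOf("event:") === 0) {
        event = line.slice(6).trim();
      } else if (line.indexOf("data:") === 0) {
        dataLines.push(line.slice(5).replace(/^ /, ""));
      }
      // Anything else (e.g. ":" comment lines) is ignored.
    }
    if (dataLines.length === 0) continue; // keep-alive or empty frame
    messages.push({ event, data: JSON.parse(dataLines.join("\n")) });
  }
  return { messages, rest };
}
```

Returning rest keeps the parser stateless: the caller owns the buffer and can resume after a disconnect using afterSeq as described above.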

Other useful server endpoints:

| Endpoint | Method | Purpose |
| --- | --- | --- |
| /v1/runs/{runId} | GET | Current run status + countNodesByState summary |
| /v1/runs/{runId}/frames | GET | List graph snapshots (paginated) |
| /v1/runs/{runId}/cancel | POST | Abort a running workflow |
| /v1/runs/{runId}/nodes/{nodeId}/approve | POST | Approve a node |

2. In-process callback (same Node.js process)

runWorkflow(workflow, {
  input: { ... },
  onProgress: (event: SmithersEvent) => {
    // fires synchronously for every event including NodeOutput
  },
});

3. SQLite polling (local, no server needed)

Query _smithers_nodes for current state of all tasks:

SELECT node_id, state, iteration, updated_at_ms
FROM _smithers_nodes WHERE run_id = ?

The state column updates in real time as tasks transition (pending -> in-progress -> finished/failed/skipped). Alternatively, query _smithers_events ordered by seq for the full event log, paginating with seq > lastSeen.

4. NDJSON log file (tail -f or post-hoc analysis)

Written to .smithers/executions/{runId}/logs/stream.ndjson — one event per line.
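Reading that file back is a matter of parsing each non-empty line as JSON. A minimal sketch follows; parseNdjson is a hypothetical helper that takes the file contents as a string, and it assumes that a line which fails to parse is a partially written trailing line (as can happen while tailing) and can be skipped.

```typescript
// Hypothetical helper: parses NDJSON text into an array of values.
// Events are typed as unknown because their exact shape is not specified here.
function parseNdjson(text: string): unknown[] {
  const events: unknown[] = [];
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue; // blank lines carry no event
    try {
      events.push(JSON.parse(trimmed));
    } catch {
      // ASSUMPTION: an unparseable line is an incomplete write; skip it.
    }
  }
  return events;
}
```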

How a third-party visualization tool would work

1. renderFrame() or load xml_json from _smithers_frames
                    |
                    v
2. buildPlanTree(xml) --> PlanNode tree (the control flow graph)
                    |
                    v
3. Render graph with all nodes in "pending" state
                    |
                    v
4. Subscribe to events via SSE / onProgress / DB polling
                    |
                    v
5. On each NodeStarted/NodeFinished/NodeFailed event,
   update the corresponding node's visual state
                    |
                    v
6. On FrameCommitted, re-fetch XML and rebuild PlanNode tree
   (graph structure may change due to Ralph iterations or hot reload)

The PlanNode tree gives you the edges (which tasks block which). The event stream gives you the node states in real-time. Together they're everything needed to render a live DAG visualization.
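Steps 3 to 5 above amount to a small reducer from events to per-node state. The sketch below shows one way to write it. The state names come from the _smithers_nodes description in this issue; the type field used to discriminate events is an assumption (the issue only lists runId, timestampMs, and nodeId, iteration, attempt), and applyEvent is a hypothetical helper. NodeRetrying and NodeWaitingApproval are left unmapped because their corresponding node state is not stated here.

```typescript
type NodeState =
  | "pending" | "in-progress" | "finished"
  | "failed" | "cancelled" | "skipped";

// ASSUMPTION: events expose their name in a `type` field.
type NodeEvent = {
  type: string;
  runId: string;
  timestampMs: number;
  nodeId?: string; // present on node-level events only
};

const EVENT_TO_STATE: Record<string, NodeState> = {
  NodePending: "pending",
  NodeStarted: "in-progress",
  NodeFinished: "finished",
  NodeFailed: "failed",
  NodeSkipped: "skipped",
  NodeCancelled: "cancelled",
};

// Hypothetical reducer: returns a new state map, or the same map unchanged
// when the event does not affect any node's visual state.
function applyEvent(
  states: Record<string, NodeState>,
  event: NodeEvent,
): Record<string, NodeState> {
  const known = Object.prototype.hasOwnProperty.call(EVENT_TO_STATE, event.type);
  if (!known || event.nodeId === undefined) return states;
  return { ...states, [event.nodeId]: EVENT_TO_STATE[event.type] };
}
```

A consumer would seed the map with every task ID from the plan tree set to "pending", fold each incoming event through applyEvent, and rebuild the seed whenever FrameCommitted signals that the graph may have changed.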

Additional question for discussion

  1. Should buildPlanTree output be included in the SSE event stream? — When a FrameCommitted event fires (indicating the graph structure changed), consumers currently need to re-fetch the XML and call buildPlanTree themselves. Should FrameCommitted include the PlanNode tree directly, or should there be a dedicated endpoint like GET /v1/runs/{runId}/plan that returns the current plan tree?
