Problem
The CopilotBackend launches the Copilot CLI with -s (silent mode), which outputs plain text only. The parseTokenUsage() function then tries to extract token counts from that plain text using regex patterns, but the Copilot CLI's plain-text output doesn't include a recognizable token usage summary. Result: tokenUsage is always 0.
Evidence from a real 310-task migration (zstd C → Rust, via AAMF)
- 579 agent invocations, all returning tokenUsage: 0
- The Copilot CLI does report token data when using --output-format json:
  - Every assistant.message event includes data.outputTokens (integer)
  - The result event includes usage.premiumRequests, usage.totalApiDurationMs, usage.sessionDurationMs
- 10,284 assistant.message events with outputTokens summing to 12.4M output tokens, all ignored because -s mode doesn't emit JSONL
- The result event does not include input/output/inputTokens fields, only premiumRequests, totalApiDurationMs, sessionDurationMs, codeChanges
Required Changes
1. Switch CopilotBackend from -s to --output-format json
In CopilotBackend.invoke(), replace the -s argument with:
'--output-format', 'json',
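A minimal sketch of the argument swap, assuming the backend assembles its CLI arguments as a string array. The helper name withJsonOutput is hypothetical and not part of the framework; only the -s and --output-format json flags come from the description above.

```typescript
// Hypothetical helper: rewrites an argument list so the CLI emits JSONL
// instead of silent plain text. Other arguments pass through unchanged.
function withJsonOutput(args: readonly string[]): string[] {
  // Drop the silent-mode flag.
  const rewritten = args.filter((arg) => arg !== '-s');
  // Add the JSON output flag unless the caller already passed it.
  if (rewritten.indexOf('--output-format') === -1) {
    rewritten.push('--output-format', 'json');
  }
  return rewritten;
}
```

Doing the swap in one place keeps the rest of invoke() untouched and makes the change easy to revert if an older CLI version rejects the flag.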
2. Parse JSONL output for token usage and text content
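The JSONL stream contains these event types (only those relevant to this change are listed):
- assistant.message: carries data.content (text) and data.outputTokens (integer)
- result: carries usage.premiumRequests, usage.totalApiDurationMs, usage.sessionDurationMs
Add a JSONL parser that:
- Iterates lines of stdout, parses each as JSON
- For assistant.message events: accumulates data.outputTokens and concatenates data.content to reconstruct the text output
- For result events: extracts premiumRequests from usage
- Appends non-JSON lines to the text content as-is (for backwards compatibility)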
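The parser steps above could be sketched roughly as follows. This is an illustration under assumptions, not the framework's implementation: the function name parseCopilotJsonl, the ParsedCopilotOutput shape, and the use of a top-level type field to identify events are all hypothetical. The data.outputTokens, data.content, and usage.premiumRequests paths come from the evidence section.

```typescript
// Hypothetical result shape; field names are illustrative.
interface ParsedCopilotOutput {
  text: string;             // reconstructed plain-text output
  outputTokens: number;     // sum of data.outputTokens
  premiumRequests?: number; // from the result event, if present
}

type JsonObject = Record<string, unknown>;

// Returns the value as a plain object, or undefined for anything else.
function asObject(value: unknown): JsonObject | undefined {
  return typeof value === 'object' && value !== null && !Array.isArray(value)
    ? (value as JsonObject)
    : undefined;
}

function parseCopilotJsonl(stdout: string): ParsedCopilotOutput {
  let text = '';
  let outputTokens = 0;
  let premiumRequests: number | undefined;

  for (const line of stdout.split(/\r?\n/)) {
    if (line.trim() === '') continue;

    let parsed: unknown;
    try {
      parsed = JSON.parse(line);
    } catch {
      parsed = undefined;
    }

    const event = asObject(parsed);
    if (!event) {
      // Non-JSON line: keep it in the text output for backwards compatibility.
      text += line + '\n';
      continue;
    }

    // Assumption: events are identified by a top-level "type" field.
    if (event.type === 'assistant.message') {
      const data = asObject(event.data);
      const tokens = data ? data.outputTokens : undefined;
      const content = data ? data.content : undefined;
      if (typeof tokens === 'number') outputTokens += tokens;
      if (typeof content === 'string') text += content;
    } else if (event.type === 'result') {
      const usage = asObject(event.usage);
      const premium = usage ? usage.premiumRequests : undefined;
      if (typeof premium === 'number') premiumRequests = premium;
    }
  }

  return { text, outputTokens, premiumRequests };
}
```

Parsing line by line, rather than parsing the whole of stdout at once, means a single malformed line degrades to plain text instead of discarding the entire run.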
3. Update AgentResult.tokenUsage
Populate tokenUsage from the accumulated outputTokens:
tokenUsage: accumulatedOutputTokens > 0
? { input: 0, output: accumulatedOutputTokens, model: model ?? 'unknown' }
: 0
Note: The Copilot CLI does not report input tokens per-message. Setting input: 0 is accurate to the data available. Consumers that need total token estimates can use premiumRequests as a proxy.
4. Expose premiumRequests on AgentResult
Add an optional field to AgentResult:
/** Estimated premium requests consumed (Copilot CLI only). */
premiumRequests?: number;
This is the Copilot CLI's native cost metric and should be surfaced for budget tracking.
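To illustrate how a consumer might use the two fields for budget tracking, here is a small sketch. The AgentResult interface below is a trimmed, assumed shape containing only the fields discussed in this issue (the real type likely has more), and summarizeUsage is a hypothetical helper.

```typescript
interface TokenUsage {
  input: number;
  output: number;
  model: string;
}

// Assumed, trimmed shape: only the fields discussed in this issue.
interface AgentResult {
  stdout: string;
  tokenUsage: TokenUsage | 0;
  /** Estimated premium requests consumed (Copilot CLI only). */
  premiumRequests?: number;
}

// Hypothetical consumer-side helper: totals output tokens and premium
// requests across many invocations.
function summarizeUsage(results: readonly AgentResult[]): {
  outputTokens: number;
  premiumRequests: number;
} {
  let outputTokens = 0;
  let premiumRequests = 0;
  for (const result of results) {
    // tokenUsage is the literal 0 when no usage data was found.
    if (result.tokenUsage !== 0) outputTokens += result.tokenUsage.output;
    // Backends other than Copilot leave premiumRequests undefined.
    premiumRequests += result.premiumRequests ?? 0;
  }
  return { outputTokens, premiumRequests };
}
```

Because premiumRequests is optional, results from other backends flow through the same aggregation without special casing.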
5. Keep parseTokenUsage() as fallback
The existing parseTokenUsage() regex-based parser should remain as a fallback for when JSONL parsing finds nothing (e.g., older CLI versions that don't emit outputTokens).
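One way the fallback order might look, as a sketch. The real signature of parseTokenUsage() is not shown in this issue, so it is passed in as a parameter with an assumed signature; resolveTokenUsage is a hypothetical name.

```typescript
interface TokenUsage {
  input: number;
  output: number;
  model: string;
}

// Assumed signature for the existing regex-based parser.
type PlainTextParser = (text: string) => TokenUsage | 0;

// Hypothetical helper: prefers JSONL-derived counts and falls back to the
// regex parser only when JSONL parsing found no output tokens.
function resolveTokenUsage(
  jsonlOutputTokens: number,
  textOutput: string,
  model: string | undefined,
  parseTokenUsage: PlainTextParser,
): TokenUsage | 0 {
  if (jsonlOutputTokens > 0) {
    return { input: 0, output: jsonlOutputTokens, model: model ?? 'unknown' };
  }
  return parseTokenUsage(textOutput);
}
```

Keeping the JSONL path first means the regex parser only runs when the structured data is absent, so it cannot override counts the CLI reported directly.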
6. Reconstruct stdout for consumers
Since -s mode is gone, AgentResult.stdout now contains JSONL instead of plain text. The framework should reconstruct the text content from assistant.message events and set that as stdout, or add a separate field. Downstream consumers (like AAMF's parseAamfOutput) parse stdout for structured agent output blocks — they need the text content, not raw JSONL.
Option A (recommended): Set stdout to the reconstructed text content (concatenated data.content from assistant.message events). Store raw JSONL in a new optional field if needed.
Option B: Keep stdout as raw JSONL and let consumers deal with it. (This is what AAMF already does with its own AamfCopilotBackend that overrides the framework's backend, but it means every framework consumer has to implement JSONL parsing.)
Context
AAMF already works around this by registering a custom AamfCopilotBackend that uses --output-format json and parses the JSONL stream (see src/core/agent-launcher.ts). This fix would bring the framework's built-in CopilotBackend up to parity, eliminating the need for downstream workarounds.
Non-goals
- Do not attempt to estimate input tokens from output tokens or API duration — that's the consumer's responsibility
- The Claude backend is unaffected; it already parses JSON usage correctly