Drop -o json from cfl; design token-dense output formats for agent consumption #383

@rianjs

Motivation

cfl's primary reader/writer is an LLM agent (CLI agents, MCP integrations), not a shell script. With that lens, -o json is the worst output mode we ship:

  • It's lossier than the human-readable views. page view -o json returns the content field as escape-encoded XHTML (\u003ch2\u003e...\u003c/h2\u003e), even though the default table view already shows clean markdown. An agent has to mentally unescape the JSON string back into XHTML before it can be used, while --raw --content-only already gives us lossless XHTML in one step.
  • The "agent artifact" promise isn't kept. tools/cfl/CLAUDE.md says the JSON agent artifact returns "content as markdown"; in practice it returns XHTML (see view.go:130, where the source comment openly admits "until #196 validates markdown"; #196 is "test(cfl): establish fidelity fixtures for agent-facing content artifacts"). The format that's supposedly for agents is the one with the biggest doc/reality mismatch.
  • Double-escaped emoji. search -o json excerpts come through with literal \\uD83D\\uDDD3 (eight hex digits standing in for one codepoint), not a real surrogate pair. An agent has to know to decode twice.
  • Inconsistent shape. page list -o json returns a tidy artifact projection; page create -o json and attachment upload -o json return the raw Confluence API object (camelCase, _links, parentType, position, etc.). The contract in docs/ARTIFACT_CONTRACT.md is not uniformly applied.
  • Buggy in places. config show -o json emits broken pseudo-JSON (one {"key":"value (source: x)"} per line plus a non-JSON footer). --full is a no-op for attachment list -o json (#381).
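
The two escaping complaints above match how Go's standard encoding/json package behaves by default. Here is a minimal sketch using only the standard library; the helper names are made up for illustration and are not taken from cfl's source, and decodeTwice assumes the once-decoded text contains no quotes or control characters.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"strings"
)

// marshalDefault uses json.Marshal, which escapes <, >, and & as \u003c,
// \u003e, and \u0026 so the result is safe to embed in HTML.
func marshalDefault(s string) string {
	b, err := json.Marshal(s)
	if err != nil {
		return ""
	}
	return string(b)
}

// marshalNoHTMLEscape turns that escaping off via Encoder.SetEscapeHTML.
func marshalNoHTMLEscape(s string) string {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	enc.SetEscapeHTML(false)
	if err := enc.Encode(s); err != nil {
		return ""
	}
	// Encode appends a trailing newline; drop it so the two helpers compare cleanly.
	return strings.TrimSuffix(buf.String(), "\n")
}

// decodeTwice shows what a consumer has to do with a double-escaped excerpt:
// the first pass yields the literal text \uD83D\uDDD3, and the second pass
// turns that escape sequence into the actual character.
// Assumption: the once-decoded text has no quotes or control characters.
func decodeTwice(raw string) (string, error) {
	var once string
	if err := json.Unmarshal([]byte(raw), &once); err != nil {
		return "", err
	}
	var twice string
	if err := json.Unmarshal([]byte(`"`+once+`"`), &twice); err != nil {
		return "", err
	}
	return twice, nil
}

func main() {
	fmt.Println(marshalDefault("<h2>Title</h2>"))
	fmt.Println(marshalNoHTMLEscape("<h2>Title</h2>"))
	decoded, err := decodeTwice(`"\\uD83D\\uDDD3"`)
	fmt.Println(decoded, err)
}
```

Disabling HTML escaping would fix the first symptom but not the double-escaped excerpts, which presumably arrive already escaped from upstream; that is part of why patching the JSON path looks less attractive than replacing it.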

There is no script-based consumer of cfl JSON that we know of — by design, the primary integration point is conversational. If we accept that, we can stop carrying the JSON contract and design something that's genuinely token-dense and easy for an LLM to parse on sight.

Proposal

  1. Remove -o json entirely from cfl. The -o flag accepts table and plain only (or we collapse those too — see below).
  2. Re-design human/agent output to be token-dense. Brainstorm in the discussion below.
  3. Rethink --full. Currently --full only modifies JSON. If JSON is gone, --full becomes either (a) extra columns in the table view, (b) extra Key: value lines after the body for detail views, or (c) goes away entirely with the verbose set always present.
  4. Resolve plain vs table. For detail views (page view, space view), -o plain and -o table are currently byte-identical (it's "no color" only). For list commands, -o plain is TSV. Either we keep plain strictly as the TSV mode for list-only commands and drop the redundant detail-view branch, or we drop plain entirely and use --no-color for the color toggle.
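
To make item 1 concrete, here is a rough sketch of what validating the -o value could look like under the hard-removal option. Everything here is hypothetical: the OutputFormat type, the parseOutputFormat helper, the error wording, and the choice of table as the default for an empty value are assumptions for discussion, not existing cfl code.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// OutputFormat is a hypothetical enum for the values -o would still accept.
type OutputFormat string

const (
	FormatTable OutputFormat = "table"
	FormatPlain OutputFormat = "plain"
)

// parseOutputFormat validates the -o value. "json" gets its own error so that
// anyone still passing it sees a migration hint rather than a generic failure.
func parseOutputFormat(v string) (OutputFormat, error) {
	switch strings.ToLower(strings.TrimSpace(v)) {
	case "", "table": // assumption: table stays the default
		return FormatTable, nil
	case "plain":
		return FormatPlain, nil
	case "json":
		return "", errors.New("-o json has been removed; use table or plain")
	default:
		return "", fmt.Errorf("unknown output format %q (want table or plain)", v)
	}
}

func main() {
	for _, v := range []string{"table", "plain", "json", "yaml"} {
		f, err := parseOutputFormat(v)
		fmt.Printf("%q -> %q, err=%v\n", v, f, err)
	}
}
```

A deprecation window would change only the "json" branch (warn and fall through instead of returning an error), so the shape of the check is the same either way.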

What "token-dense" should mean

Goals, in priority order:

  1. One canonical shape per command that an LLM can parse without thinking. No "agent" vs "full" vs "raw" matrix.
  2. Lossless markdown for page content. Today only --raw is lossless, and only as XHTML. We want lossless markdown to be the default reading mode.
  3. No escape-encoded payloads. No \u003c, no double-encoded surrogate pairs.
  4. No structural redundancy. No {"_meta": {"count": N, "hasMore": false}} wrappers: count is wc -l and "hasMore" is one line at the bottom.
  5. Minimal column padding. Today's ASCII tables pad with whitespace to align columns; this burns tokens. TSV or compact pipe-separated formats are tighter.
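
As a rough way to put a number on goal 5, the sketch below renders the same two sample rows once with space-aligned columns (via the standard text/tabwriter package) and once as TSV, then compares byte counts. Byte length is only a proxy for token count, and the tabwriter settings are an assumption rather than cfl's actual table renderer.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
	"text/tabwriter"
)

// rows reuses the sample records quoted elsewhere in this issue.
var rows = [][]string{
	{"3367829907", "page", "confluence", "Template - Meeting notes"},
	{"3367829895", "page", "confluence", "Template - Decision documentation"},
}

// renderPadded aligns columns with spaces, the way an ASCII table does.
func renderPadded(rows [][]string) string {
	var buf bytes.Buffer
	// minwidth 0, tabwidth 8, padding 2, pad with spaces, no flags.
	w := tabwriter.NewWriter(&buf, 0, 8, 2, ' ', 0)
	for _, r := range rows {
		fmt.Fprintln(w, strings.Join(r, "\t"))
	}
	w.Flush()
	return buf.String()
}

// renderTSV separates fields with a single tab and nothing else.
func renderTSV(rows [][]string) string {
	var sb strings.Builder
	for _, r := range rows {
		sb.WriteString(strings.Join(r, "\t"))
		sb.WriteByte('\n')
	}
	return sb.String()
}

func main() {
	fmt.Println("padded bytes:", len(renderPadded(rows)))
	fmt.Println("tsv bytes:   ", len(renderTSV(rows)))
}
```

With columns this uniform the gap is small; it grows as soon as one row has a much longer value in a non-final column, because every other row is padded out to match it.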

Candidate formats to discuss

(Pick or combine — this is the ideation space.)

A. TSV everywhere

3367829907	page	confluence	Template - Meeting notes
  • Pros: cheapest possible, easy to grep, easy to read.
  • Cons: titles with embedded tabs/newlines need quoting (Confluence titles can have anything).
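
One possible answer to the quoting concern is backslash-escaping rather than CSV-style quoting, so each record still occupies exactly one line. This is a sketch under that assumption; tsvEscaper and tsvRow are illustrative names, not existing cfl helpers.

```go
package main

import (
	"fmt"
	"strings"
)

// tsvEscaper rewrites the characters that would break a one-record-per-line,
// tab-separated layout. Backslash is escaped too so the mapping is reversible.
var tsvEscaper = strings.NewReplacer(
	`\`, `\\`,
	"\t", `\t`,
	"\n", `\n`,
	"\r", `\r`,
)

// tsvRow escapes each field and joins the results with real tab characters.
func tsvRow(fields ...string) string {
	escaped := make([]string, len(fields))
	for i, f := range fields {
		escaped[i] = tsvEscaper.Replace(f)
	}
	return strings.Join(escaped, "\t")
}

func main() {
	// A hypothetical title containing a tab and a newline.
	fmt.Println(tsvRow("3367829907", "page", "confluence", "Meeting\tnotes\nQ3"))
}
```

The trade-off is that a reader has to know the escape convention, although `\t` and `\n` are about as familiar to an LLM as escapes get.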

B. Markdown-table-ish, no padding

3367829907 | page | confluence | Template - Meeting notes
  • Pros: readable, fewer pad-tokens than ASCII tables.
  • Cons: still needs pipe-escaping in fields.

C. Front-matter + body for detail views

id: 3367829866
title: confluence-cli
space: confluence (3367829530)
version: 1
---
## Description
...page body in markdown...
  • Pros: clean separation, easy for agents to slice with a regex, lossless markdown body, no JSON escape rules.
  • Cons: yet-another-format, but it's basically how Hugo/Jekyll/Obsidian already store pages so most LLMs are very fluent in it.
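
For a sense of how little code a consumer would need for format C, here is a sketch that splits the header from the body at the first line consisting of `---`. The function name and the exact delimiter handling are assumptions; the sample document is the one shown above.

```go
package main

import (
	"fmt"
	"strings"
)

// splitFrontMatter separates "key: value" header lines from the markdown body
// at the first line that is exactly "---". If no delimiter is found, the whole
// input is treated as body.
func splitFrontMatter(doc string) (map[string]string, string) {
	head, body, found := strings.Cut(doc, "\n---\n")
	if !found {
		return nil, doc
	}
	meta := make(map[string]string)
	for _, line := range strings.Split(head, "\n") {
		// Cut on the first ": " so values may themselves contain colons.
		if k, v, ok := strings.Cut(line, ": "); ok {
			meta[k] = v
		}
	}
	return meta, body
}

func main() {
	doc := "id: 3367829866\ntitle: confluence-cli\nspace: confluence (3367829530)\nversion: 1\n---\n## Description\n...page body in markdown...\n"
	meta, body := splitFrontMatter(doc)
	fmt.Println(meta["title"], meta["version"])
	fmt.Print(body)
}
```

Because only the first delimiter is consumed, a `---` horizontal rule later in the page body passes through untouched.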

D. Key-value lines + body (current page view table, but stable)

  • Already what we do for detail views. Keep it; just promise it's stable and add the --full fields inline.

E. NDJSON (one JSON object per line, no escape-of-escape)

  • Pros: still structured.
  • Cons: still requires JSON-decoding, still has the escape problem unless we mandate the body field is a separate document.
  • I'd vote against this — if we're keeping JSON-shaped output we haven't really removed it.

F. "Block" format — one field per line, blank-line-separated records

id: 3367829907
type: page
space: confluence
title: Template - Meeting notes

id: 3367829895
type: page
space: confluence
title: Template - Decision documentation
  • Pros: reads like email headers, trivially parseable, no escaping needed for values that don't have newlines.
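
The "trivially parseable" claim can be checked with a short sketch. parseBlocks is a hypothetical helper, and it assumes values never contain newlines, which is the same caveat noted in the bullet above.

```go
package main

import (
	"fmt"
	"strings"
)

// parseBlocks splits blank-line-separated records into key/value maps.
// Assumption: no value contains a newline.
func parseBlocks(out string) []map[string]string {
	var records []map[string]string
	for _, block := range strings.Split(strings.TrimSpace(out), "\n\n") {
		rec := make(map[string]string)
		for _, line := range strings.Split(block, "\n") {
			if k, v, ok := strings.Cut(line, ": "); ok {
				rec[k] = v
			}
		}
		if len(rec) > 0 {
			records = append(records, rec)
		}
	}
	return records
}

func main() {
	out := "id: 3367829907\ntype: page\nspace: confluence\ntitle: Template - Meeting notes\n\n" +
		"id: 3367829895\ntype: page\nspace: confluence\ntitle: Template - Decision documentation\n"
	for _, rec := range parseBlocks(out) {
		fmt.Println(rec["id"], rec["title"])
	}
}
```

The cost relative to TSV is that every key is repeated in every record, so this format is easy to read but not the most token-dense option for long lists.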

G. Markdown all the way down

  • List commands return a markdown table. Detail commands return a markdown front-matter document. Search returns a markdown list with each result as a bullet.
  • Lossless for content. Maximally fluent for agents. Could double as something pasteable into Confluence/Slack.

My instinct: G or C are the most agent-friendly, with detail views landing on something like:

# confluence-cli

`id: 3367829866`  ·  `space: confluence (3367829530)`  ·  `version: 1`

## Description
...

…and list views landing on a real markdown table. But this is a brainstorm — push back.
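
For the list-view half of that, a sketch of an unpadded markdown table renderer follows. The names are illustrative, and flattening newlines inside a cell is an assumption that only makes sense for short fields such as titles, not for page bodies.

```go
package main

import (
	"fmt"
	"strings"
)

// mdCell escapes pipes and flattens newlines so a value stays inside one cell.
func mdCell(s string) string {
	s = strings.ReplaceAll(s, "|", `\|`)
	return strings.ReplaceAll(s, "\n", " ")
}

// markdownTable renders a header row, a separator row, and one row per record,
// with single-space cell padding and no column alignment.
func markdownTable(header []string, rows [][]string) string {
	var sb strings.Builder
	writeRow := func(cells []string) {
		sb.WriteString("|")
		for _, c := range cells {
			sb.WriteString(" " + mdCell(c) + " |")
		}
		sb.WriteString("\n")
	}
	writeRow(header)
	sep := make([]string, len(header))
	for i := range sep {
		sep[i] = "---"
	}
	writeRow(sep)
	for _, r := range rows {
		writeRow(r)
	}
	return sb.String()
}

func main() {
	fmt.Print(markdownTable(
		[]string{"id", "type", "space", "title"},
		[][]string{
			{"3367829907", "page", "confluence", "Template - Meeting notes"},
			{"3367829895", "page", "confluence", "Template - Decision documentation"},
		},
	))
}
```

This also covers the pipe-escaping concern raised under option B, since the same mdCell step would apply there.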

Open questions

  • Do we have downstream JSON consumers we don't know about? Worth asking on Slack / in the issue thread before removing.
  • What about set-credential --from-env and config test automation? Those don't currently emit JSON, so they're unaffected.
  • Migration path. Do we ship a feat!: breaking-change release, or gate -o json behind a deprecation period?
  • Apply the same treatment to jtk? Same arguments apply — the existing artifact contract is shared via atlassian-go/artifact. Worth scoping in a follow-up.

Related findings / bugs

Surfaced during the surface-inventory work (raw evidence at /tmp/cfl-outputs-60314/, summary at tools/cfl/cfl-outputs.md):

This issue is the parent for the JSON-removal + format-redesign work. Sub-issues to follow once we converge on a format.

Next step

Leave a comment with your preferred format (A–G or a hybrid), any consumer we'd break, and how aggressive you want to be on the migration (breaking release vs. deprecation window).

Metadata


Labels: enhancement (New feature or request)
