Add experimental/ast/printer package #657

Draft
emcfarlane wants to merge 63 commits into main from ed/printer2

Conversation

Contributor

@emcfarlane emcfarlane commented Jan 22, 2026

This adds experimental/ast/printer, an AST printer for protobuf files. It does not yet implement formatting, but uses the dom library to produce correct indentation for synthetic edits. Formatting support will be added later.

The printer preserves comments and whitespace using a trivia index. Each comment is classified as either attached (bound to a specific token) or detached (bound to a positional slot between declarations). Attached comments travel with their token when declarations are moved; detached comments stay in place. At print time, the printer zips slot trivia with the children within each scope and looks up attached trivia when emitting individual tokens. This works identically for natural and synthetic tokens, with no special-casing needed. Synthetic tokens have no trivia and fall back to the declared gap. That gap will also inform how the trivia between tokens is laid out once formatting is added.
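A minimal sketch of the attached/detached split described above. Every name here (`triviaIndex`, `tokenID`, `scopeID`, `slotKey`, `leadingFor`) is hypothetical and not taken from this PR; the real index lives in trivia.go and may be organised quite differently.

```go
package main

import "fmt"

// tokenID and scopeID are hypothetical stand-ins for the AST's token and
// scope handles.
type tokenID int
type scopeID int

// slotKey identifies a positional slot between declarations within a scope.
type slotKey struct {
	scope scopeID
	slot  int
}

// triviaIndex keeps comments bound to a token apart from comments bound to
// a slot, so moving a declaration only carries the attached ones with it.
type triviaIndex struct {
	attached map[tokenID][]string
	detached map[slotKey][]string
}

// leadingFor returns the attached trivia for a token. Synthetic tokens have
// no entry in the index, so the declared gap is used instead.
func (ix *triviaIndex) leadingFor(tok tokenID, gap string) []string {
	if t, ok := ix.attached[tok]; ok {
		return t
	}
	return []string{gap}
}

func main() {
	ix := &triviaIndex{
		attached: map[tokenID][]string{1: {"// doc for Foo\n"}},
		detached: map[slotKey][]string{{scope: 0, slot: 0}: {"// section header\n"}},
	}
	fmt.Printf("%q\n", ix.leadingFor(1, " "))  // natural token: attached comment
	fmt.Printf("%q\n", ix.leadingFor(99, " ")) // synthetic token: falls back to the gap
	fmt.Printf("%q\n", ix.detached[slotKey{scope: 0, slot: 0}])
}
```

The point of the sketch is only that one lookup path serves both natural and synthetic tokens; the fallback is the single place where they differ.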

Member

@doriable doriable left a comment

The API looks very reasonable to me, I just had a few nitpicks/discussion points for clarifying a few things. Thank you!

Comment thread experimental/printer/printer.go Outdated
Comment thread experimental/printer/printer.go Outdated
Comment thread experimental/printer/printer.go Outdated
Comment thread experimental/ast/printer/printer_test.go
@emcfarlane emcfarlane changed the title from "Add experimental/printer package" to "Add experimental/ast/printer package" Feb 6, 2026
@emcfarlane emcfarlane marked this pull request as ready for review February 17, 2026 16:52
Comment thread experimental/ast/printer/printer.go Outdated
)

// PrintFile renders an AST file to protobuf source text.
func PrintFile(file *ast.File, opts Options) string {
Member

You want this to take function options right?

Contributor Author

Do you mean type Option func(...)? Was following dom and report options
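For reference, a hedged sketch of the two API shapes under discussion. `Format` and `TabstopWidth` are the only option fields visible in the snippets quoted in this review; `Option`, `WithFormat`, `WithTabstopWidth`, `resolve`, and the default width of 2 are assumptions made for illustration.

```go
package main

import "fmt"

// Options is the struct style currently passed to PrintFile. Only the two
// fields visible in this review are modelled here.
type Options struct {
	Format       bool
	TabstopWidth int
}

// Option is the functional alternative: each option mutates an Options value.
type Option func(*Options)

// WithFormat and WithTabstopWidth are hypothetical constructors.
func WithFormat(on bool) Option     { return func(o *Options) { o.Format = on } }
func WithTabstopWidth(n int) Option { return func(o *Options) { o.TabstopWidth = n } }

// resolve applies opts on top of defaults. The default width is an assumption.
func resolve(opts ...Option) Options {
	o := Options{TabstopWidth: 2}
	for _, opt := range opts {
		opt(&o)
	}
	return o
}

func main() {
	fmt.Printf("%+v\n", resolve())
	fmt.Printf("%+v\n", resolve(WithFormat(true), WithTabstopWidth(4)))
}
```

The struct form stays consistent with the dom and report packages mentioned above; the functional form lets defaults change without breaking callers. Either is workable.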

Comment thread experimental/ast/printer/printer.go Outdated
Comment on lines +135 to +136
// the pending buffer so that adjacent pure-newline runs are combined into a
// single kindBreak dom tag, preventing the dom from merging them and
Member

Could this be dealt with by making kindBreak a bit more sophisticated?

Comment thread experimental/ast/printer/trivia.go
@emcfarlane emcfarlane marked this pull request as draft March 11, 2026 19:23
Add TestBufFormat that reads buf's bufformat testdata (54 test cases)
and compares printer output against golden files. Currently 35/54 pass.

Fixes:
- Empty files produce empty output instead of trailing newline
- New gapInline style keeps punctuation on same line as preceding
  comments: `value /* comment */;` instead of breaking to new line
- New gapGlue style for path separators glues comments without
  spaces: `header/*comment*/.v1`
- Preserve source blank lines after detached comments using actual
  newline count from pending trivia tokens
- Body declarations at file level only get blank lines when source
  had them, instead of unconditionally
- Compound string first element on new indented line after `=`
- Strip trailing whitespace from comments in format mode

Add test cases documenting 8 remaining formatting issues:
- Issue 1: trailing // before ] should become /* */ on single-line
- Issue 2: trailing comment after , should stay inline
- Issue 3: comment after [ opener should expand to multi-line
- Issue 5: comment before } in enum (already passes)
- Issue 6: EOF comment after blank line should preserve blank line
- Issue 7: block comments in RPC parens should not add extra spaces
- Issue 8: extension path comments should preserve spaces
- Issue 9: message literal with block comments should expand

Two formatting fixes:
- Extract inline trailing comments after commas in walkDecl so they
  stay on the same line as the comma instead of becoming leading
  trivia on the next token
- Preserve blank lines before EOF comments by checking
  trivia.blankBeforeClose in printFile
…sion

- Negative prefix (-) with block comments uses gapSpace for proper
  spacing (e.g., "- /* comment */ 32")
- Revert compound string // to /* */ conversion attempt as it caused
  trailing comment conversion on the following semicolon

This ensures block comments in glued contexts (RPC parens, path
separators, generics) always get a space after them before the next word
token. Without comments, behavior is unchanged.
…rsion

Three fixes for comment preservation in the printer:

1. Generalize inline trailing comment extraction in walkDecl to all
   tokens, not just commas. A comment on the same line as any token
   (e.g., "bar: 2 // comment") is now correctly attached as trailing
   trivia on that token. Guarded by firstNewline < len(leading) to
   avoid reclassifying block comments between same-line tokens.

2. Add emitCommaTrivia to printDict so that comments attached to
   comma tokens (which are removed during message literal formatting)
   are never silently dropped.

3. Manage convertLineToBlock in printCompoundString: clear it for
   intermediate parts (// comments between string parts on their own
   lines are fine), restore the caller's value for the last part's
   trailing (a // there would eat the following ; or ]). Add
   withLineToBlock helper for scoped save/restore. Set it in
   printOption since ; follows the value inline.
…ia only

Set convertLineToBlock in printPath since path components are glued
inline (gapGlue) and a trailing // comment between components would
eat the next identifier.

Remove the convertLineToBlock check from emitTrivia (leading trivia).
After the generalized inline trailing extraction in walkDecl, all
comments remaining in leading trivia are on their own lines and never
eat following tokens. Only emitTrailing needs the conversion. This
prevents over-conversion of leading // comments that are safe as-is.

Categorize and explain the stylistic differences between our printer
output and the old buf format golden files. All remaining differences
are intentional formatting choices, not correctness issues.

// See the License for the specific language governing permissions and
// limitations under the License.

package printer_test
Contributor Author

This code exists only to validate against the current buf format output, in order to produce the doc above.

Member

@doriable doriable left a comment

Sorry for the slow turnaround time on the review :< Left some comments -- I think the primary feedback is around the handling of trivia and the edge-cases/invariants that can come up. I think once those are ironed out, we can finalise the behaviour through the test results.

Comment thread experimental/dom/tags.go
Comment on lines +108 to +109
// - Consecutive newline tags accumulate, but are capped at two newlines
// (one blank line) in the output.
Member

So this doc is a little confusing/misleading in the context of the behaviour implemented in print.go:

			if len(tag.text) == 1 {
				// Single-newline breaks increment by 1, capped at 2
				// (one blank line maximum).
				p.newlines = min(p.newlines+1, 2)
			} else {
				// Multi-newline breaks set the floor directly.
				p.newlines = max(p.newlines, len(tag.text))
			}

In the case of Text("\n\n\n"), this would jump straight to the else branch, so the cap only applies when newlines are accumulated one tag at a time, which I don't think we can assume. I believe the fix should actually be in print.go, checking p.newlines against the cap?
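A small sketch of the behaviour in question and one possible fix. `accumulate` mirrors the branch quoted above; `accumulateCapped` and `maxNewlines` are hypothetical names, and applying the cap on both branches is only one way to make the code match the doc comment. It uses the `min` and `max` builtins, so it needs Go 1.21 or later.

```go
package main

import "fmt"

// maxNewlines is the documented cap: two newlines, i.e. one blank line.
const maxNewlines = 2

// accumulate reproduces the quoted logic: single-newline breaks are capped,
// multi-newline breaks set the floor without any cap.
func accumulate(pending int, text string) int {
	if len(text) == 1 {
		return min(pending+1, maxNewlines)
	}
	return max(pending, len(text))
}

// accumulateCapped applies the cap on both paths.
func accumulateCapped(pending int, text string) int {
	if len(text) == 1 {
		return min(pending+1, maxNewlines)
	}
	return min(max(pending, len(text)), maxNewlines)
}

func main() {
	fmt.Println(accumulate(0, "\n\n\n"))       // 3: exceeds the documented cap
	fmt.Println(accumulateCapped(0, "\n\n\n")) // 2
}
```

If uncapped multi-newline breaks are intentional, the alternative is to leave the code alone and correct the doc comment in tags.go instead.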

Comment on lines +216 to +218
// Convert // comment to /* comment */ for inline contexts.
body := strings.TrimPrefix(strings.TrimRight(t.Text(), " \t"), "//")
p.push(dom.Text("/*" + body + " */"))
Member

This doesn't work if the comment is something like:

message Foo { // i am an agent of chaos */

Since you'll get /* i am an agent of chaos */ */
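One way to guard against this, sketched under assumptions: `lineToBlock` is a hypothetical helper, and refusing the conversion when the body contains `*/` is only one option (rewriting the body is another). The caller would then need to keep the `//` comment and break the line after it.

```go
package main

import (
	"fmt"
	"strings"
)

// lineToBlock converts a "//" comment to "/* ... */" form. It reports false,
// returning the text unchanged, when the body contains "*/", because that
// sequence would terminate the block comment early.
func lineToBlock(text string) (string, bool) {
	body := strings.TrimPrefix(strings.TrimRight(text, " \t"), "//")
	if strings.Contains(body, "*/") {
		return text, false
	}
	return "/*" + body + " */", true
}

func main() {
	fmt.Println(lineToBlock("// plain comment"))
	fmt.Println(lineToBlock("// i am an agent of chaos */"))
}
```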

}
}
} else {
p.pending = append(p.pending, trailing...)
Member

There is an edge case here. Only non-synthetic tokens can have trivia (e.g. trailing). If a non-synthetic token with trailing trivia is followed by a synthetic token, which always has hasTrivia == false, is pending then only emitted at the next token that does have trivia?
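A toy model of the ordering concern, not the PR's code: the `token` and `printer` types below are hypothetical and much simpler than the real ones. It shows the invariant that would rule the edge case out, namely that pending trivia is flushed before every token regardless of whether that token is synthetic.

```go
package main

import (
	"fmt"
	"strings"
)

// token is a hypothetical model: synthetic tokens carry no trivia.
type token struct {
	text      string
	trailing  string
	synthetic bool
}

// printer holds output and the trailing trivia waiting to be emitted.
type printer struct {
	out     strings.Builder
	pending []string
}

// printToken flushes pending trivia before every token, including synthetic
// ones, so a trailing comment cannot drift past the token that follows it.
func (p *printer) printToken(t token) {
	for _, s := range p.pending {
		p.out.WriteString(s)
	}
	p.pending = p.pending[:0]
	p.out.WriteString(t.text)
	if !t.synthetic && t.trailing != "" {
		p.pending = append(p.pending, t.trailing)
	}
}

func main() {
	p := &printer{}
	p.printToken(token{text: "a", trailing: " /* after a */ "})
	p.printToken(token{text: "=", synthetic: true})
	p.printToken(token{text: "1"})
	fmt.Println(p.out.String())
}
```

If the flush were skipped for synthetic tokens, the comment in this model would land after `=` rather than after `a`.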

p.printToken(expr.Colon(), gapInline, ctx)
} else if p.options.Format && !expr.Key().IsZero() && !expr.Value().IsZero() {
// Insert colon in format mode when missing (e.g. "e []" -> "e: []").
p.push(dom.Text(":"))
Member

Might be worth commenting that when p.pending is not empty, the pending trivia is flushed after the :, since we push the : without flushing pending first... it seems like a behaviour that might be okay (though for trailing trivia it's a little weird), so it's at least something we should comment (or maybe fix... o_o)

}

// emitGap pushes whitespace tags for the given gap style.
func (p *printer) emitGap(gap gapStyle) {
Member

Worth noting that this doesn't handle the case where p.pending is non-empty.

// withIndent runs fn with an indented printer, swapping the sink temporarily.
func (p *printer) withIndent(fn func(p *printer)) {
originalPush := p.push
p.push(dom.Indent(strings.Repeat(" ", p.options.TabstopWidth), func(indentSink dom.Sink) {
Member

There is a strings.Repeat allocation per indent... perhaps it's worth caching this? o_o
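A sketch of what caching could look like, assuming a printer that holds the tabstop width. The `printer` type here is a hypothetical stand-in with only the fields this example needs, and `indentUnit` is not a name from the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// printer is a hypothetical stand-in holding only what this sketch needs.
type printer struct {
	tabstopWidth int
	indent       string // cached per-level indent, built on first use
}

// indentUnit returns the per-level indent string. It rebuilds the string
// only when the cached one does not match the configured width.
func (p *printer) indentUnit() string {
	if len(p.indent) != p.tabstopWidth {
		p.indent = strings.Repeat(" ", p.tabstopWidth)
	}
	return p.indent
}

func main() {
	p := &printer{tabstopWidth: 4}
	fmt.Printf("%q\n", p.indentUnit()) // builds the string
	fmt.Printf("%q\n", p.indentUnit()) // reuses the cached value
}
```

Computing the string once when the printer is constructed would be simpler still, provided the options cannot change afterwards.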

3 participants