Add experimental/ast/printer package #657
doriable
left a comment
The API looks very reasonable to me; I just had a few nitpicks and discussion points to clarify a few things. Thank you!
Title changed: experimental/printer package → experimental/ast/printer package
)

// PrintFile renders an AST file to protobuf source text.
func PrintFile(file *ast.File, opts Options) string {
You want this to take functional options, right?
Do you mean `type Option func(...)`? I was following the dom and report options.
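For reference, the functional-options shape being asked about could look roughly like the sketch below. This is only an illustration: `Option`, `WithFormat`, `WithTabstopWidth`, `resolve`, and the default width of 2 are hypothetical names and values, not part of this PR. Only the `Format` and `TabstopWidth` fields are taken from the code quoted elsewhere in this review.

```go
package main

import "fmt"

// Options mirrors the struct-style configuration used by PrintFile.
// Only Format and TabstopWidth appear in this PR; nothing else is assumed.
type Options struct {
	Format       bool
	TabstopWidth int
}

// Option is a hypothetical functional option that mutates Options.
type Option func(*Options)

// WithFormat is a hypothetical option that toggles format mode.
func WithFormat(on bool) Option {
	return func(o *Options) { o.Format = on }
}

// WithTabstopWidth is a hypothetical option that sets the indent width.
func WithTabstopWidth(n int) Option {
	return func(o *Options) { o.TabstopWidth = n }
}

// resolve applies opts in order over an assumed default (TabstopWidth 2).
func resolve(opts ...Option) Options {
	o := Options{TabstopWidth: 2}
	for _, opt := range opts {
		opt(&o)
	}
	return o
}

func main() {
	fmt.Printf("%+v\n", resolve(WithFormat(true), WithTabstopWidth(4)))
}
```

The trade-off is the usual one: a plain struct matches the existing dom and report packages, while functional options let defaults change later without breaking callers.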
// the pending buffer so that adjacent pure-newline runs are combined into a
// single kindBreak dom tag, preventing the dom from merging them and
Could this be dealt with by making kindBreak a bit more sophisticated?
Add TestBufFormat that reads buf's bufformat testdata (54 test cases) and compares printer output against golden files. Currently 35/54 pass.

Fixes:
- Empty files produce empty output instead of trailing newline
- New gapInline style keeps punctuation on same line as preceding comments: `value /* comment */;` instead of breaking to new line
- New gapGlue style for path separators glues comments without spaces: `header/*comment*/.v1`
- Preserve source blank lines after detached comments using actual newline count from pending trivia tokens
- Body declarations at file level only get blank lines when source had them, instead of unconditionally
- Compound string first element on new indented line after `=`
- Strip trailing whitespace from comments in format mode
Add test cases documenting 8 remaining formatting issues:
- Issue 1: trailing // before ] should become /* */ on single-line
- Issue 2: trailing comment after , should stay inline
- Issue 3: comment after [ opener should expand to multi-line
- Issue 5: comment before } in enum (already passes)
- Issue 6: EOF comment after blank line should preserve blank line
- Issue 7: block comments in RPC parens should not add extra spaces
- Issue 8: extension path comments should preserve spaces
- Issue 9: message literal with block comments should expand
Two formatting fixes:
- Extract inline trailing comments after commas in walkDecl so they stay on the same line as the comma instead of becoming leading trivia on the next token
- Preserve blank lines before EOF comments by checking trivia.blankBeforeClose in printFile
…sion

- Negative prefix (-) with block comments uses gapSpace for proper spacing (e.g., "- /* comment */ 32")
- Revert compound string // to /* */ conversion attempt as it caused trailing comment conversion on the following semicolon
This ensures block comments in glued contexts (RPC parens, path separators, generics) always get a space after them before the next word token. Without comments, behavior is unchanged.
…rsion

Three fixes for comment preservation in the printer:
1. Generalize inline trailing comment extraction in walkDecl to all tokens, not just commas. A comment on the same line as any token (e.g., "bar: 2 // comment") is now correctly attached as trailing trivia on that token. Guarded by firstNewline < len(leading) to avoid reclassifying block comments between same-line tokens.
2. Add emitCommaTrivia to printDict so that comments attached to comma tokens (which are removed during message literal formatting) are never silently dropped.
3. Manage convertLineToBlock in printCompoundString: clear it for intermediate parts (// comments between string parts on their own lines are fine), restore the caller's value for the last part's trailing (a // there would eat the following ; or ]). Add withLineToBlock helper for scoped save/restore. Set it in printOption since ; follows the value inline.
…ia only

Set convertLineToBlock in printPath since path components are glued inline (gapGlue) and a trailing // comment between components would eat the next identifier.

Remove the convertLineToBlock check from emitTrivia (leading trivia). After the generalized inline trailing extraction in walkDecl, all comments remaining in leading trivia are on their own lines and never eat following tokens. Only emitTrailing needs the conversion. This prevents over-conversion of leading // comments that are safe as-is.
Categorize and explain the stylistic differences between our printer output and the old buf format golden files. All remaining differences are intentional formatting choices, not correctness issues.
// See the License for the specific language governing permissions and
// limitations under the License.

package printer_test
This code only exists to validate against the current to produce the above doc.
doriable
left a comment
Sorry for the slow turnaround time on the review :< Left some comments -- I think the primary feedback is around the handling of trivia and the edge-cases/invariants that can come up. I think once those are ironed out, we can finalise the behaviour through the test results.
// - Consecutive newline tags accumulate, but are capped at two newlines
// (one blank line) in the output.
So this doc is a little bit confusing/misleading in the context of the behaviour implemented in print.go:
if len(tag.text) == 1 {
	// Single-newline breaks increment by 1, capped at 2
	// (one blank line maximum).
	p.newlines = min(p.newlines+1, 2)
} else {
	// Multi-newline breaks set the floor directly.
	p.newlines = max(p.newlines, len(tag.text))
}
In the case of Text("\n\n\n"), this would jump straight to the else branch, so p.newlines ends up at 3 or more and the documented cap is bypassed. The cap only applies if newlines are accumulated one tag at a time, which I don't think we can assume to be the case. I believe the fix should actually be in print.go, to check p.newlines there?
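To make the discrepancy concrete, here is a small standalone sketch that replays the quoted branch over a sequence of newline-only tag texts. `accumulate` copies the logic quoted above; `accumulateCapped` is just one possible fix (capping the multi-newline branch as well) and is an assumption on my part, not what print.go does. Both helpers are hypothetical stand-ins for the printer state, and the `min`/`max` builtins need Go 1.21 or later.

```go
package main

import "fmt"

// accumulate replays the quoted branch logic over newline-only tag texts
// and returns the resulting newline count (p.newlines in print.go).
func accumulate(tags []string) int {
	newlines := 0
	for _, text := range tags {
		if len(text) == 1 {
			// Single-newline breaks increment by 1, capped at 2.
			newlines = min(newlines+1, 2)
		} else {
			// Multi-newline breaks set the floor directly (no cap).
			newlines = max(newlines, len(text))
		}
	}
	return newlines
}

// accumulateCapped is a hypothetical variant that applies the same cap
// of 2 to the multi-newline branch.
func accumulateCapped(tags []string) int {
	newlines := 0
	for _, text := range tags {
		if len(text) == 1 {
			newlines = min(newlines+1, 2)
		} else {
			newlines = min(max(newlines, len(text)), 2)
		}
	}
	return newlines
}

func main() {
	fmt.Println(accumulate([]string{"\n", "\n", "\n"})) // one tag at a time
	fmt.Println(accumulate([]string{"\n\n\n"}))         // a single three-newline tag
	fmt.Println(accumulateCapped([]string{"\n\n\n"}))   // same input with the cap applied
}
```

With three single-newline tags the count stops at 2, but a single `"\n\n\n"` tag yields 3 under the quoted logic, which is the case the doc comment does not cover.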
// Convert // comment to /* comment */ for inline contexts.
body := strings.TrimPrefix(strings.TrimRight(t.Text(), " \t"), "//")
p.push(dom.Text("/*" + body + " */"))
This doesn't work if the comment is something like:
message Foo { // i am an agent of chaos */
Since you'll get `/* i am an agent of chaos */ */`, where the first `*/` closes the block comment early and leaves a stray `*/` behind.
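A minimal sketch of the failure and one way to guard against it. `naiveLineToBlock` copies the conversion from the diff; `safeLineToBlock` is a hypothetical helper that refuses to convert when the comment body already contains `*/`. What the caller should do in that case (for example, keep the `//` comment and force a line break after it) is left open here.

```go
package main

import (
	"fmt"
	"strings"
)

// naiveLineToBlock reproduces the conversion from the diff above.
func naiveLineToBlock(text string) string {
	body := strings.TrimPrefix(strings.TrimRight(text, " \t"), "//")
	return "/*" + body + " */"
}

// safeLineToBlock is a hypothetical guarded version. It reports false when
// the body contains "*/", because the result would close the block comment
// early.
func safeLineToBlock(text string) (string, bool) {
	body := strings.TrimPrefix(strings.TrimRight(text, " \t"), "//")
	if strings.Contains(body, "*/") {
		return "", false
	}
	return "/*" + body + " */", true
}

func main() {
	chaos := "// i am an agent of chaos */"
	fmt.Println(naiveLineToBlock(chaos))
	converted, ok := safeLineToBlock(chaos)
	fmt.Printf("%q %v\n", converted, ok)
	converted, ok = safeLineToBlock("// plain comment")
	fmt.Printf("%q %v\n", converted, ok)
}
```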
		}
	}
} else {
	p.pending = append(p.pending, trailing...)
There is an edge case here. Only non-synthetic tokens can have trivia (e.g. trailing). If a non-synthetic token with trailing trivia is followed by a synthetic token, which will always have hasTrivia == false, is pending then only emitted at the next point that has trivia?
	p.printToken(expr.Colon(), gapInline, ctx)
} else if p.options.Format && !expr.Key().IsZero() && !expr.Value().IsZero() {
	// Insert colon in format mode when missing (e.g. "e []" -> "e: []").
	p.push(dom.Text(":"))
Might be worth commenting that when p.pending is not empty, the pending trivia is flushed after the `:`, since we push the `:` without flushing pending first. It seems like behaviour that might be okay (although for trailing trivia it's a little weird), so it's at least something we should comment on (or maybe fix... o_o)
}

// emitGap pushes whitespace tags for the given gap style.
func (p *printer) emitGap(gap gapStyle) {
Worth noting that this doesn't handle the case where p.pending is non-empty.
// withIndent runs fn with an indented printer, swapping the sink temporarily.
func (p *printer) withIndent(fn func(p *printer)) {
	originalPush := p.push
	p.push(dom.Indent(strings.Repeat(" ", p.options.TabstopWidth), func(indentSink dom.Sink) {
There is a strings.Repeat allocation per indent; perhaps it's worth caching this? o_o
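Something along these lines would be enough, as a rough sketch. The `printer` type here is a cut-down stand-in with only the fields the cache needs, and `indentUnit` is a hypothetical helper name. The real printer reads the width from `p.options.TabstopWidth`.

```go
package main

import (
	"fmt"
	"strings"
)

// printer is a stand-in for the PR's printer type; only the fields needed
// to illustrate the cache are modeled.
type printer struct {
	tabstopWidth int
	indent       string // cached indent unit, rebuilt only when the width changes
}

// indentUnit returns the indent string for one level, allocating it at most
// once per distinct tabstop width.
func (p *printer) indentUnit() string {
	if len(p.indent) != p.tabstopWidth {
		p.indent = strings.Repeat(" ", p.tabstopWidth)
	}
	return p.indent
}

func main() {
	p := &printer{tabstopWidth: 4}
	fmt.Printf("%q\n", p.indentUnit())
	fmt.Printf("%q\n", p.indentUnit()) // second call reuses the cached string
}
```

Since the width is fixed for the lifetime of a print call, the string could equally be computed once when the printer is constructed.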
This adds experimental/ast/printer, an AST printer for protobuf files. It does not yet implement formatting, but uses the dom library to produce correct indentation for synthetic edits. Formatting support will be added later.

The printer preserves comments and whitespace using a trivia index. Each comment is classified as either attached (bound to a specific token) or detached (bound to a positional slot between declarations). Attached comments travel with their token when declarations are moved; detached comments stay in place. At print time, the printer zips slot trivia with children within each scope, and looks up attached trivia when emitting individual tokens. This works identically for natural and synthetic tokens (no special-casing needed). Synthetic tokens have no trivia and fall back to the declared gap. The same gap will also inform how trivia between tokens is formatted once formatting is supported.
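The attached/detached split can be pictured with a toy model. Everything in this sketch is hypothetical and much simpler than the real trivia index: `decl`, `scope`, and `printScope` are illustrative names, comments are plain strings, and it assumes a scope with n children has n+1 slots (one before each child and one after the last).

```go
package main

import (
	"fmt"
	"strings"
)

// decl is a hypothetical declaration; its attached comments travel with it.
type decl struct {
	attached []string
	text     string
}

// scope is a hypothetical scope: slots[i] holds the detached comments that
// precede children[i], and the final slot follows the last child.
type scope struct {
	children []decl
	slots    [][]string
}

// printScope zips slots with children: slot 0, child 0, slot 1, child 1, and
// so on, ending with the trailing slot.
func printScope(s scope) string {
	var b strings.Builder
	for i := 0; i <= len(s.children); i++ {
		if i < len(s.slots) {
			for _, c := range s.slots[i] {
				b.WriteString(c + "\n")
			}
		}
		if i < len(s.children) {
			for _, c := range s.children[i].attached {
				b.WriteString(c + "\n")
			}
			b.WriteString(s.children[i].text + "\n")
		}
	}
	return b.String()
}

func main() {
	s := scope{
		children: []decl{
			{attached: []string{"// about A"}, text: "message A {}"},
			{text: "message B {}"},
		},
		slots: [][]string{{"// file header"}, nil, {"// footer"}},
	}
	fmt.Print(printScope(s))

	// Moving a declaration: the attached comment follows message A, while
	// the detached header and footer keep their positions.
	s.children[0], s.children[1] = s.children[1], s.children[0]
	fmt.Print(printScope(s))
}
```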