Skip to content

Commit 7334c0a

Browse files
committed
rn-134: add line log pipeline article
1 parent fff093f commit 7334c0a

1 file changed

Lines changed: 183 additions & 2 deletions

File tree

rev_news/drafts/edition-134.md

Lines changed: 183 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -21,9 +21,190 @@ This edition covers what happened during the months of March and April 2026.
2121
### General
2222
-->
2323

24-
<!---
2524
### Reviews
26-
-->
25+
26+
+ [[PATCH 0/4] line-log: route -L output through the standard diff pipeline](https://lore.kernel.org/git/pull.2065.git.1772845338.gitgitgadget@gmail.com)
27+
28+
`git log -L` lets users follow the history of a specified line range
29+
inside a file, for example by passing `-L:funcname:file.c` to track
30+
the evolution of a function. Since the feature was introduced,
31+
however, its diff output has been generated by a hand-rolled helper
32+
called `dump_diff_hacky()` rather than by Git's standard diff
33+
pipeline, with a `NEEDSWORK` comment in `line-log.c` openly
34+
admitting:
35+
36+
```
37+
/*
38+
* NEEDSWORK: manually building a diff here is not the Right
39+
* Thing(tm). log -L should be built into the diff pipeline.
40+
*/
41+
```
42+
43+
The practical consequence is that almost every diff formatting
44+
option that users have come to rely on (`--word-diff`,
45+
`--color-moved`, the `-w`/`-b` whitespace options, `--no-prefix`,
46+
`--src-prefix`/`--dst-prefix`, `--full-index`, `--abbrev`, `-R`,
47+
`--output-indicator-*`, the pickaxe options `-S`/`-G`, and so on) is
48+
silently ignored when combined with `-L`. The hand-rolled output
49+
also omits the `index` lines, `new file mode` headers, and funcname
50+
context in `@@` hunk headers that the standard pipeline produces.
51+
52+
Michael Montalbo opened the discussion by sending a four-patch
53+
series that finally addresses this long-standing limitation. The
54+
series explicitly replaced an earlier attempt of his,
55+
["line-log: fix `-L` with pickaxe options"](https://lore.kernel.org/git/pull.2061.git.1772651484.gitgitgadget@gmail.com/),
56+
which had taken the opposite approach of *rejecting* `-S`/`-G` when
57+
combined with `-L`; the new direction is to make those options
58+
*work* instead. Patch 1 carries over a crash fix from that previous
59+
attempt unchanged. Patch 2 is the core change. Patch 3 adds an
60+
extensive set of tests for the newly-working options. Patch 4
61+
updates the documentation.
62+
63+
Patch 1 fixes a real assertion failure that could be triggered by
64+
combining `-L` with pickaxe options across a merge that contains a
65+
rename, an issue originally reported by Matthew Hughes. Inside
66+
`queue_diffs()`, the caller's `diff_options` was being reused for
67+
rename detection, which meant that any user-specified pickaxe state
68+
(`-G`, `-S`, or `--find-object`) would run inside `diffcore_std()`
69+
and silently discard diff pairs that the rename machinery still
70+
needed. The fix builds a private `diff_options` for the
71+
rename-detection path, mirroring the pattern already used in `git
72+
blame`'s `find_rename()`, and isolates the rename machinery from
73+
unrelated user options.
74+
75+
Patch 2 is where the heavy lifting happens. Instead of formatting
76+
output by hand, `-L` now feeds its filepairs through
77+
`builtin_diff()` and `fn_out_consume()`, the same path used by
78+
`git diff` and `git log -p`. The mechanism is a pair of callback
79+
wrappers that sit between `xdi_diff_outf()` and `fn_out_consume()`,
80+
filtering xdiff's output down to only the tracked line ranges. To
81+
make sure xdiff actually emits every line within each tracked range
82+
as context, the context length is inflated to span the largest
83+
range. The tracked line ranges themselves are now carried on `struct
84+
diff_filepair` as a borrowed pointer, so that each file's ranges
85+
travel with its filepair through the rest of the pipeline. As a side
86+
effect, `line_log_print()` shrinks down to little more than a
87+
`diffcore_std()` call followed by `diff_flush()`, the
88+
`-L`-implies-`--patch` default is wired up in revision setup rather
89+
than forced at output time, and `diff_filepair_dup()` is switched
90+
from `xmalloc` to `xcalloc` so that newly added fields (including
91+
the `line_ranges`) are zero-initialized.
92+
93+
Because `diffcore_std()` now actually runs at output time, options
94+
such as `-S`, `-G`, `--orderfile`, and `--diff-filter` come along
95+
for the ride and start working with `-L` for the first time. Michael
96+
also notes in the commit message that the context-length inflation
97+
means xdiff might process more output than strictly needed for very
98+
wide ranges, but his benchmarks on files up to 7800 lines showed no
99+
measurable regression.
100+
101+
There is, of course, a user-visible output change: `-L` output now
102+
includes `index` lines, `new file mode` headers, and funcname
103+
context in `@@` hunk headers that were previously absent. Tools that
104+
parse `-L` output may need to handle these additional lines. The
105+
cover letter is upfront about this, and also lists two limitations
106+
that are deliberately left for follow-up work: `line_log_print()`
107+
still calls `show_log()` and `diff_flush()` directly rather than
108+
going through `log_tree_diff_flush()`, and the non-patch diff
109+
formats (`--raw`, `--numstat`, `--stat`, etc.) remain unimplemented
110+
for `-L`.
111+
112+
Junio Hamano, the Git maintainer, replied to the cover letter the
113+
same day with a single word: "Exciting." He approved the deliberate
114+
incremental scope, observing that since "previously all the output
115+
routines were hand-rolled, but this reduces the extent of deviation
116+
--- as long as we are moving in the right direction, it is a good
117+
idea to find a good place to stop and leave the rest for later." On
118+
the note about non-patch diff formats, Junio remarked that "it would
119+
not hurt if these are omitted", which led to a small back-and-forth
120+
where Michael initially thought he was being asked to do something
121+
extra in a follow-up; Junio clarified that the series was already
122+
omitting them ("You are already omitting, no? I took 'remain
123+
unimplemented' to mean exactly that"), and that simply *mentioning*
124+
the omission, as the cover letter already did, was the right thing
125+
to do.
126+
127+
Junio also pointed out that the "Michael Montalbo (4): ..." block in
128+
the cover letter looked like a reflowed duplicate of the proper
129+
commit list right below it. Michael acknowledged that as a mistake
130+
in crafting the cover letter and offered to add a few names from
131+
`git shortlog --no-merges -s -n line-log.[ch]` to the Cc list to
132+
attract more reviewers.
133+
134+
On the documentation patch (4/4), Kristoffer Haugsbakk caught a
135+
subtle AsciiDoc problem: by indenting the new paragraph with tabs,
136+
Michael had inadvertently turned the new prose into a code block.
137+
Kristoffer recommended dropping the indentation in favour of a plain
138+
list-continuation marker so the text would render as regular
139+
paragraph text, "flush to the left." Michael thanked him and folded
140+
the fix into his next iteration.
141+
142+
For readers less familiar with the relevant pieces of the diff
143+
stack, a few words of context may help.
144+
145+
`git log -L` is itself a relatively unusual citizen in Git's command
146+
zoo: most `git log` machinery walks commits and emits whatever its
147+
configured formatters dictate, but `-L` additionally carries a set
148+
of line ranges per file, narrowing the history to commits that touch
149+
those ranges. Mapping that range-aware view onto the standard diff
150+
output machinery is non-trivial because xdiff itself does not know
151+
anything about the user's tracked ranges; it just produces a unified
152+
diff for two blobs. The new callback wrappers introduced in patch 2
153+
bridge that gap by intercepting xdiff's output as it is generated
154+
and discarding hunks that fall outside the requested ranges.
155+
156+
The `diffcore_std()` function is the standard point at which Git
157+
applies a number of cross-cutting transformations to a queued set of
158+
diff pairs: rename detection, the pickaxe filters (`-S`, `-G`,
159+
`--find-object`), the orderfile sort, and the `--diff-filter`
160+
filter, among others. Once `-L` actually feeds its pairs through
161+
this function, all of those features become available essentially
162+
"for free." That is also why patch 1 has to be careful: rename
163+
detection performed during the line-history walk must *not* let a
164+
user's pickaxe filter inadvertently throw away the very pairs the
165+
rename machinery needs to do its job.
166+
167+
After the initial round of review, Michael sent
168+
[version 2](https://lore.kernel.org/git/pull.2065.v2.git.1773714095.gitgitgadget@gmail.com)
169+
of the series. The only structural change from v1 was that patch 4
170+
now uses a list-continuation marker instead of indentation in
171+
`Documentation/line-range-options.adoc`, addressing Kristoffer's
172+
review feedback so the new paragraph renders correctly. The
173+
crash-fix patch (1/4) also gained an explanatory comment in its test
174+
file about commit-level filtering with pickaxe still being a known
175+
limitation: `show_log()` prints the commit header before
176+
`diffcore_std()` runs, so commits cannot yet be suppressed even when
177+
no diff pairs survive filtering. Fixing that would require deferring
178+
`show_log()` until after `diffcore_std()`, which is again a larger
179+
log-tree restructuring that v2 explicitly leaves for later.
180+
181+
Junio reviewed v2 patch 2 again and was generally positive. He noted
182+
that "huge diff to the test material mostly comes from the addition
183+
of the diff headers like the index line, etc., which makes this
184+
patch scary but is very welcome addition", and on the new field
185+
`line_ranges` carried on `struct diff_filepair`, simply replied
186+
"OK." On the rewrite of `line_log_print()` itself, which now queues
187+
a duplicated filepair per range, attaches the borrowed
188+
`line_ranges`, and calls `diffcore_std()` followed by
189+
`diff_flush()`, he wrote: "Very welcome change."
190+
191+
Some weeks later, after no further substantive review arrived, Junio
192+
came back to the v2 cover letter and wrote, in a slightly resigned
193+
but encouraging tone: "The central part of the series (i.e., patch
194+
#2) looked quite sensible. I haven't read the tests very carefully,
195+
though. I was hoping that we will see another set of eyes or two to
196+
help review this series, but nothing has happened in the past few
197+
weeks, so let's mark the topic for 'next'." The series was later
198+
merged into 'master' and these improvements have been released as
199+
part of Git v2.54.0.
200+
201+
This is an example of long-standing technical debt finally being
202+
paid down. A `NEEDSWORK` comment that has lived in `line-log.c` for
203+
many years is finally retired; an entire family of diff options
204+
(formatting, whitespace, pickaxe, output-indicator, prefix,
205+
color-moved, and more) becomes available with `-L` for the first
206+
time; and a real assertion failure involving merges, renames, and
207+
pickaxe filters is fixed along the way.
27208

28209
<!---
29210
### Support

0 commit comments

Comments
 (0)