Add instruction based stack slot usage analysis #6
Merged
Pull request overview
This PR extends `stackwhere list` to detect additional stack slot usage by analyzing eBPF instructions (to fill gaps left by DWARF-only variable tracking), and enriches that analysis with DWARF range/callstack context.
Changes:
- Add instruction-pattern-based stack slot discovery (`Mov` from `R10` + stack-offset arithmetic) and merge it with DWARF-derived slots in `stackwhere list`.
- Introduce DWARF `.debug_rnglists` parsing and expose `DW_AT_low_pc`/`DW_AT_high_pc`/`DW_AT_ranges` to build instruction→DWARF-node mappings.
- Add new `spill` test program and tests covering the new instruction-based analysis path.
Reviewed changes
Copilot reviewed 9 out of 11 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| testdata/spill.c | New BPF test program intended to force stack spills/stack slot patterns not present in DWARF vars. |
| testdata/Makefile | Builds the new spill test object. |
| internal/dwarf/tree.go | Adds range-list support to DWARF tree nodes via Node.Ranges(). |
| internal/dwarf/reexport.go | Re-exports additional DWARF attrs required for range mapping. |
| internal/dwarf/rangelist.go | New .debug_rnglists parser to support DW_AT_ranges. |
| internal/dwarf/loclist.go | Minor comment tweak (no behavior change). |
| cmd/stackwhere/list.go | Loads eBPF collection, merges DWARF+instruction-derived stack slot usage, and builds instruction→callstack mapping. |
| cmd/stackwhere/list_test.go | Adds tests for DWARF-only and instruction-based slot discovery. |
| go.mod | Adds new module dependencies for eBPF parsing and testing. |
| go.sum | Updates dependency checksums accordingly. |
Comments suppressed due to low confidence (4)
cmd/stackwhere/list.go:475
- instructionToNodes does an unchecked type assertion on prog.Entry().Val(AttrLowpc). If AttrLowpc is missing or not a uint64, this will panic and break listing. Handle the nil / unexpected-type cases and return an empty mapping (or an error) instead of panicking.
func instructionToNodes(prog *dwarf.Node) map[uint64][]*dwarf.Node {
instRange := make(map[uint64][]*dwarf.Node)
progInsOffset := prog.Entry().Val(dwarf.AttrLowpc).(uint64)
cmd/stackwhere/list.go:449
- line can remain nil if there is no BTF line info after the Mov/Add sequence (e.g. objects built without line info, or patterns near end of program). Dereferencing line.FileName()/LineNumber() will panic. Guard against nil and use an empty/unknown FileCol fallback when no line is found.
Offset: -prog.Instructions[iter.Index+1].Constant,
Name: iter.Ins.Dst.String(),
ByteSize: -1,
FileCol: line.FileName() + ":" + fmt.Sprint(line.LineNumber()),
}
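One way to guard against the nil line is to search forward for line info and fall back to a placeholder. This is a sketch only: `lineInfo` is a hypothetical stand-in for the library's line type, and `firstLineFrom` and the `"unknown"` fallback string are assumptions, not code from the PR.

```go
package main

import "fmt"

// lineInfo is a hypothetical stand-in for the line metadata attached to an
// instruction; a nil pointer means the instruction carries no line info.
type lineInfo struct {
	File string
	Line uint32
}

// firstLineFrom returns "file:line" for the first instruction at or after
// start that has line info, or "unknown" when none is found before the
// end of the program.
func firstLineFrom(lines []*lineInfo, start int) string {
	if start < 0 {
		start = 0
	}
	for i := start; i < len(lines); i++ {
		if l := lines[i]; l != nil {
			return fmt.Sprintf("%s:%d", l.File, l.Line)
		}
	}
	return "unknown"
}

func main() {
	// Mov, Add and one more instruction without line info, then a Call with it.
	lines := []*lineInfo{nil, nil, nil, {File: "spill.c", Line: 12}}
	fmt.Println(firstLineFrom(lines, 0))

	// A pattern at the very end of a program built without line info.
	fmt.Println(firstLineFrom([]*lineInfo{nil, nil}, 0))
}
```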
cmd/stackwhere/list.go:482
- AttrHighpc in DWARF can be either an address (uint64) or an offset from lowpc (int64), but this code assumes int64 and will panic on uint64. Handle both representations when building the instruction ranges to avoid runtime panics on different toolchains.
highpc := n.Entry().Val(dwarf.AttrHighpc)
if lowpc != nil && highpc != nil {
for i := lowpc.(uint64); i < lowpc.(uint64)+uint64(highpc.(int64)); i += asm.InstructionSize {
instRange[i-progInsOffset] = append(instRange[i-progInsOffset], n)
}
internal/dwarf/rangelist.go:161
- DW_RLE_offset_pair operands are offsets relative to the current base address, but this code stores them directly as Range.Start/End. If base != 0 this will produce incorrect ranges and break instruction mapping. Convert offset pairs to absolute addresses (base+start/base+end) before appending to entry.Ranges.
case DW_RLE_offset_pair:
var rng Range
var l uint32
rng.Start, l, err = leb128.DecodeUnsigned(r)
if err != nil {
It turns out that clang does not always emit DWARF variable nodes for all stack slot usage. This caused gaps in the list output.

This commit adds a second source of stack slot usage information by analyzing the eBPF instructions directly to find patterns missed by the DWARF analysis. Specifically when clang emits instructions like:

```
Mov R2, R10
Add R2, -16
...
Call some_func
```

This is a common pattern when pointers to stack values are passed to helper functions or kfuncs.

The downside is that we do not get information about the size of the variable being referenced. Specifically for map lookups there is no size parameter passed; rather, the size is inferred from the map definition. So for now we just report the slot size as -1 / unknown.

During testing it was found that the Mov+Add instructions usually do not have line info themselves and that the closest line info is often on the Call instruction below them. So when we find the pattern we associate the first line info below the Mov+Add instructions with the stack slot usage.

To obtain the callstack we create an instruction -> callstack mapping for every instruction in the program based on DWARF lowpc, highpc and range lists.

Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
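The Mov+Add pattern described in the commit message can be sketched as a scan over adjacent instruction pairs. This is a simplified illustration, not the PR's implementation: `insn`, `slotRef`, and `findStackRefs` are hypothetical types and names standing in for the real eBPF instruction types, and the check that both instructions target the same register is an assumption of this sketch.

```go
package main

import "fmt"

type opcode int

const (
	opMov    opcode = iota // register-to-register move
	opAddImm               // add immediate to register
	opCall
	opOther
)

// insn is a simplified stand-in for a decoded eBPF instruction.
type insn struct {
	Op  opcode
	Dst int   // destination register
	Src int   // source register (opMov only)
	Imm int64 // immediate operand (opAddImm only)
}

// regFP is R10, the read-only frame pointer in eBPF.
const regFP = 10

// slotRef describes one stack slot reference found by pattern matching.
type slotRef struct {
	Index  int   // index of the Mov instruction
	Reg    int   // register that ends up holding the stack pointer
	Offset int64 // positive distance below the frame pointer
}

// findStackRefs looks for "Mov Rn, R10" immediately followed by
// "Add Rn, -off" and reports each match. The size of the slot is not
// recoverable from this pattern alone.
func findStackRefs(prog []insn) []slotRef {
	var refs []slotRef
	for i := 0; i+1 < len(prog); i++ {
		mov, add := prog[i], prog[i+1]
		if mov.Op != opMov || mov.Src != regFP {
			continue
		}
		if add.Op != opAddImm || add.Dst != mov.Dst || add.Imm >= 0 {
			continue
		}
		refs = append(refs, slotRef{Index: i, Reg: mov.Dst, Offset: -add.Imm})
	}
	return refs
}

func main() {
	prog := []insn{
		{Op: opMov, Dst: 2, Src: regFP}, // Mov R2, R10
		{Op: opAddImm, Dst: 2, Imm: -16}, // Add R2, -16
		{Op: opOther},
		{Op: opCall}, // Call some_func
	}
	for _, r := range findStackRefs(prog) {
		fmt.Printf("insn %d: R%d points at stack offset -%d\n", r.Index, r.Reg, r.Offset)
	}
}
```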
Fixes: #1