probe: futex tracing — 3 idle waiters, not contention #143
Merged
Third follow-up to PR #128. Adds bpftrace probe-futex-trace.sh that aggregates per-uaddr futex wait time (sys_enter/sys_exit_futex correlated by tid) during a BRANCH window, plus a cross-reference against /proc/$pid/maps so we can place each uaddr in FC's address space.

Findings (one 8 s window over a 153 ms BRANCH):

    @wait_ns[0x...7648, 137]: 152.35 ms (3 calls)
    @wait_ns[0x...8c88, 137]: 152.49 ms (2 calls)
    @wait_ns[0x...8d08, 137]: 152.56 ms (2 calls)
    @wait_ns[0x...52a0, 137]: 1.14 ms (6 calls)

Op 137 = FUTEX_WAIT_BITSET_PRIVATE. The 3 hot futexes each accumulated ~152 ms of wait, roughly the entire pause window. They live in anonymous heap mappings adjacent to FC binary's .bss (Rust heap-allocated synchronization primitives: parking_lot::Mutex / Condvar inner words).

Interpretation: this is NOT contention on a hot mutex. It's 3 idle threads parked on the snapshot worker's completion signal, each sleeping the entire pause window then waking when the worker finishes. The contention is the *symptom*; the real bottleneck is whatever the snapshot worker does single-threaded.

Revised #118 implication (third pass):
- Phase 2 (io_uring) addresses ~2 % of the window (narrows further)
- Phase 3 (1 s tick) compounds nothing about the futexes: the 3 waiters are passive
- Next operational step: build FC with DWARF symbols + perf-record flamegraph the snapshot worker thread → find the per-snapshot growing loop inside vmm::persist::create_snapshot

Refs #118.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Third follow-up to #128. bpftrace on the futex syscall pair, aggregating per-uaddr wait time during a BRANCH window. Adds `probe-futex-trace.sh` and a new "Follow-up: futex tracing" section to PROBE-multi-branch-anomaly.md.
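The aggregation the probe performs can be sketched offline as follows. This is not the contents of `probe-futex-trace.sh`; it is a minimal Python model of the same idea (pair each futex enter with the matching exit on the same tid, then sum the elapsed time per `(uaddr, op)` key). The event tuple layout and the function name `aggregate_futex_waits` are assumptions made for illustration.

```python
from collections import defaultdict


def aggregate_futex_waits(events):
    """Sum futex wait time per (uaddr, op).

    `events` is an iterable of (kind, tid, ts_ns, uaddr, op) tuples in
    timestamp order, where kind is "enter" or "exit". This layout is a
    hypothetical stand-in for what the tracepoints would deliver; uaddr
    and op are only read on "enter" events.
    """
    pending = {}                 # tid -> (enter_ts_ns, uaddr, op)
    wait_ns = defaultdict(int)   # (uaddr, op) -> total ns spent in the syscall
    calls = defaultdict(int)     # (uaddr, op) -> number of completed calls

    for kind, tid, ts_ns, uaddr, op in events:
        if kind == "enter":
            # A thread can only be inside one futex syscall at a time,
            # so the tid is a sufficient correlation key.
            pending[tid] = (ts_ns, uaddr, op)
        elif kind == "exit":
            start = pending.pop(tid, None)
            if start is None:
                continue  # exit whose enter predates the trace window
            enter_ts, enter_uaddr, enter_op = start
            wait_ns[(enter_uaddr, enter_op)] += ts_ns - enter_ts
            calls[(enter_uaddr, enter_op)] += 1

    return dict(wait_ns), dict(calls)
```

Keying on tid rather than uaddr matters here: several threads can wait on the same word at once, and only the tid identifies which exit belongs to which enter.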
Finding
8 s window over one 153 ms BRANCH:
```
@wait_ns[0x...7648, 137]: 152.35 ms (3 calls)
@wait_ns[0x...8c88, 137]: 152.49 ms (2 calls)
@wait_ns[0x...8d08, 137]: 152.56 ms (2 calls)
@wait_ns[0x...52a0, 137]: 1.14 ms (6 calls) ← FC .data/.bss
```
Op 137 = `FUTEX_WAIT_BITSET_PRIVATE`. Three distinct futexes each accumulated ~152 ms of wait, roughly the entire pause window. Cross-referenced against `/proc/$pid/maps`: those three hot uaddrs fall in anonymous heap mappings adjacent to the FC binary's `.bss` (consistent with Rust heap-allocated synchronization primitives, i.e. `parking_lot::Mutex`/`Condvar` inner atomics). The fourth (`0x...52a0`, 1.14 ms) falls in FC's `.data/.bss` (possibly a `static`-held mutex, lower amplitude).
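Both steps in that paragraph (decoding the op number and placing a uaddr in the address space) are mechanical, and a small Python sketch makes them reproducible. The futex constants are the values from `linux/futex.h`; the sample maps text, its addresses, and the `/usr/bin/firecracker` path are invented for illustration and are not taken from the traced process.

```python
FUTEX_PRIVATE_FLAG = 128
FUTEX_CLOCK_REALTIME = 256
FUTEX_CMDS = {0: "FUTEX_WAIT", 1: "FUTEX_WAKE",
              9: "FUTEX_WAIT_BITSET", 10: "FUTEX_WAKE_BITSET"}


def decode_futex_op(op):
    """Split a raw futex op into command name plus the _PRIVATE suffix."""
    cmd = op & ~(FUTEX_PRIVATE_FLAG | FUTEX_CLOCK_REALTIME)
    name = FUTEX_CMDS.get(cmd, f"FUTEX_CMD_{cmd}")
    if op & FUTEX_PRIVATE_FLAG:
        name += "_PRIVATE"
    return name


def parse_maps(text):
    """Parse /proc/<pid>/maps text into (start, end, perms, label) tuples."""
    regions = []
    for line in text.splitlines():
        fields = line.split(maxsplit=5)
        if len(fields) < 5:
            continue
        start, end = (int(x, 16) for x in fields[0].split("-"))
        # Anonymous mappings have no pathname column.
        label = fields[5].strip() if len(fields) > 5 else "[anon]"
        regions.append((start, end, fields[1], label))
    return regions


def find_mapping(regions, addr):
    """Return the label of the region containing addr, or None."""
    for start, end, _perms, label in regions:
        if start <= addr < end:
            return label
    return None


# Hypothetical maps excerpt: a writable file-backed segment (.data)
# followed directly by an anonymous rw region (.bss / heap).
SAMPLE_MAPS = """\
55d0c0000000-55d0c0200000 r-xp 00000000 fd:01 1234 /usr/bin/firecracker
55d0c0400000-55d0c0410000 rw-p 00400000 fd:01 1234 /usr/bin/firecracker
55d0c0410000-55d0c0420000 rw-p 00000000 00:00 0
7ffd00000000-7ffd00021000 rw-p 00000000 00:00 0 [stack]
"""
```

With this sample, `find_mapping(parse_maps(SAMPLE_MAPS), 0x55d0c0417648)` lands in the anonymous region, while an address inside the second line resolves to the binary's writable segment.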
Interpretation flip
The previous probe pass said "futex contention". This pass corrects that: it is not contention, it is 3 idle waiters parked on the snapshot worker's completion signal. Each thread sleeps for the entire pause, then wakes when the worker finishes. Eliminating the futex calls wouldn't speed anything up; the bottleneck is whatever the snapshot worker is doing single-threaded.
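The arithmetic behind this reading can be made explicit. The sketch below takes the four measurements from the finding and computes what fraction of the 153 ms window each futex spent waiting. The `classify` helper and its thresholds (at least 90 % of the window, at most 5 calls) are assumptions chosen for illustration, not values from the probe.

```python
WINDOW_MS = 153.0  # BRANCH pause window from the finding

# (uaddr suffix, total wait in ms, call count), copied from the probe output
MEASUREMENTS = [
    ("7648", 152.35, 3),
    ("8c88", 152.49, 2),
    ("8d08", 152.56, 2),
    ("52a0", 1.14, 6),
]


def classify(wait_ms, calls, window_ms=WINDOW_MS,
             min_fraction=0.9, max_calls=5):
    """Label one futex as a parked waiter or as minor.

    Thresholds are illustrative assumptions: a waiter that sleeps for
    nearly the whole window in a handful of calls is parked, not
    repeatedly fighting over a lock.
    """
    fraction = wait_ms / window_ms
    if fraction >= min_fraction and calls <= max_calls:
        return "parked"
    return "minor"


def summarize(measurements=MEASUREMENTS):
    """Return {uaddr: (fraction_of_window, label)} for each measurement."""
    return {
        uaddr: (wait_ms / WINDOW_MS, classify(wait_ms, calls))
        for uaddr, wait_ms, calls in measurements
    }
```

A contended hot mutex would more plausibly show many short waits spread across the window; here the three hot futexes each cover nearly all of it in two or three calls, which is the pattern of threads sleeping until a single completion event.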
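Revised #118 implication (third pass)
- Phase 2 (io_uring) addresses ~2 % of the window (narrows further).
- Phase 3 (1 s tick) compounds nothing about the futexes: the 3 waiters are passive.
- Next operational step: build FC with DWARF symbols, then perf-record a flamegraph of the snapshot worker thread to find the per-snapshot growing loop inside `vmm::persist::create_snapshot`.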
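Files
- `probe-futex-trace.sh` (new)
- PROBE-multi-branch-anomaly.md (new "Follow-up: futex tracing" section)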
Refs #118.
🤖 Generated with Claude Code