probe: futex tracing — 3 idle waiters, not contention #143

Merged
WaylandYang merged 1 commit into main from probe/118-futex-tracing on May 21, 2026
@WaylandYang
Contributor

Summary

Third follow-up to #128. bpftrace on the futex syscall pair, aggregating per-uaddr wait time during a BRANCH window. Adds `probe-futex-trace.sh` and a new "Follow-up: futex tracing" section to PROBE-multi-branch-anomaly.md.
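
The enter/exit correlation described above can be sketched in Python. This is a minimal illustration of the aggregation logic only, not the contents of `probe-futex-trace.sh`: the event tuple layout, the function name `aggregate_futex_waits`, and the sample events are all assumptions made for the example.

```python
from collections import defaultdict

def aggregate_futex_waits(events):
    """Sum futex wait time per (uaddr, op), pairing enter/exit events by tid.

    `events` is an iterable of (kind, tid, ts_ns, uaddr, op) tuples. This
    layout is hypothetical; uaddr and op are only read on "enter" events.
    """
    pending = {}                 # tid -> (uaddr, op, enter timestamp)
    wait_ns = defaultdict(int)   # (uaddr, op) -> accumulated wait in ns
    calls = defaultdict(int)     # (uaddr, op) -> completed call count
    for kind, tid, ts_ns, uaddr, op in events:
        if kind == "enter":
            pending[tid] = (uaddr, op, ts_ns)
        elif kind == "exit" and tid in pending:
            key_uaddr, key_op, start = pending.pop(tid)
            wait_ns[(key_uaddr, key_op)] += ts_ns - start
            calls[(key_uaddr, key_op)] += 1
        # An exit with no recorded enter (probe attached mid-call) is ignored.
    return dict(wait_ns), dict(calls)

# Synthetic events for illustration only; these are not measured data.
events = [
    ("enter", 101, 1_000, 0x7648, 137),
    ("enter", 102, 1_500, 0x8C88, 137),
    ("exit",  101, 4_000, None, None),
    ("enter", 101, 4_200, 0x7648, 137),
    ("exit",  101, 9_200, None, None),
    ("exit",  102, 9_500, None, None),
    ("exit",  103, 9_900, None, None),
]
wait_ns, calls = aggregate_futex_waits(events)
for (uaddr, op), total in sorted(wait_ns.items()):
    print(f"@wait_ns[{uaddr:#x}, {op}]: {total} ns ({calls[(uaddr, op)]} calls)")
```

Keying the pending table by tid works because a thread can be blocked in at most one futex syscall at a time.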

Finding

8 s window over one 153 ms BRANCH:

```
@wait_ns[0x...7648, 137]: 152.35 ms (3 calls)
@wait_ns[0x...8c88, 137]: 152.49 ms (2 calls)
@wait_ns[0x...8d08, 137]: 152.56 ms (2 calls)
@wait_ns[0x...52a0, 137]: 1.14 ms (6 calls) ← FC .data/.bss
```

Op 137 = `FUTEX_WAIT_BITSET_PRIVATE`. The three hot futexes each accumulated ~152 ms ≈ the entire pause window. Cross-referenced against `/proc/$pid/maps`, all three fall in anonymous heap mappings adjacent to the FC binary's `.bss` (consistent with Rust heap-allocated synchronization primitives such as `parking_lot::Mutex`/`Condvar` inner atomics). The fourth falls in FC's `.data/.bss` (possibly a `static`-held mutex) and accumulated far less wait time (1.14 ms).

Interpretation flip

The previous probe pass said "futex contention". This one corrects: it's not contention, it's 3 idle waiters parked on the snapshot worker's completion signal. Each thread sleeps the entire pause then wakes when the worker finishes. Eliminating the futex calls wouldn't speed anything up — the bottleneck is whatever the snapshot worker is doing single-threaded.
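
The distinction can be made concrete with a rough heuristic applied to the numbers from the Finding section. This is only a sketch: the function name and the 0.9 cutoff are assumptions chosen for illustration, not part of the probe.

```python
WINDOW_MS = 153.0  # the BRANCH pause window reported above

# (total wait in ms, call count) per futex, copied from the Finding section.
OBSERVED = {
    "0x...7648": (152.35, 3),
    "0x...8c88": (152.49, 2),
    "0x...8d08": (152.56, 2),
    "0x...52a0": (1.14, 6),
}

def classify_wait(total_ms, calls, window_ms, parked_fraction=0.9):
    """Label a futex by how much of the window its waits cover.

    A handful of calls covering nearly the whole window looks like a thread
    parked on a completion signal. The 0.9 cutoff is an arbitrary assumption.
    """
    fraction = total_ms / window_ms
    mean_ms = total_ms / calls
    label = "parked waiter" if fraction >= parked_fraction else "short waits"
    return label, fraction, mean_ms

for uaddr, (total_ms, calls) in OBSERVED.items():
    label, fraction, mean_ms = classify_wait(total_ms, calls, WINDOW_MS)
    print(f"{uaddr}: {label} ({fraction:.1%} of window, "
          f"{mean_ms:.2f} ms mean over {calls} calls)")
```

Contention on a hot mutex would more typically show many calls with short individual waits, whereas each of the three hot futexes here covers almost the full window in two or three calls.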

Revised #118 implication (third pass)

  • Phase 2 (io_uring) addresses ~2 % of the window. The expectation narrows further, from "Phase 2 still helps disk write" to "Phase 2 doesn't help the dominant cost"
  • Phase 3 (1 s tick) is orthogonal to the waiters
  • Next operational step: build FC with DWARF symbols and perf-record-flamegraph the snapshot worker thread. The current FC binary is static-pie-linked without frame pointers; bpftrace ustack can't symbolize. Until that's set up, we can't tell which Rust function in `vmm::persist::create_snapshot` (~21 KB of compiled code) holds the per-snapshot growing loop.

Files

  • `bench/pause-window/probe-futex-trace.sh` — the probe script
  • `bench/pause-window/PROBE-multi-branch-anomaly.md` — new section + revised next-steps

Refs #118.

🤖 Generated with Claude Code

Third follow-up to PR #128. Adds bpftrace probe-futex-trace.sh that
aggregates per-uaddr futex wait time (sys_enter/sys_exit_futex
correlated by tid) during a BRANCH window, plus a cross-reference
against /proc/$pid/maps so we can place each uaddr in FC's address
space.

Findings (one 8 s window over a 153 ms BRANCH):

  @wait_ns[0x...7648, 137]: 152.35 ms  (3 calls)
  @wait_ns[0x...8c88, 137]: 152.49 ms  (2 calls)
  @wait_ns[0x...8d08, 137]: 152.56 ms  (2 calls)
  @wait_ns[0x...52a0, 137]:   1.14 ms  (6 calls)

Op 137 = FUTEX_WAIT_BITSET_PRIVATE. The 3 hot futexes each
accumulated ~152 ms wait ≈ the entire pause window. They live in
anonymous heap mappings adjacent to FC binary's .bss (Rust heap-
allocated synchronization primitives — parking_lot::Mutex /
Condvar inner words).

Interpretation: this is NOT contention on a hot mutex. It's
3 idle threads parked on the snapshot worker's completion signal,
each sleeping the entire pause window then waking when the worker
finishes. The futex wait time is the *symptom*; the real bottleneck is
whatever the snapshot worker does single-threaded.

Revised #118 implication (third pass):
- Phase 2 (io_uring) addresses ~2 % of the window, so its expected
  benefit narrows further
- Phase 3 (1 s tick) is orthogonal to the futexes: the
  3 waiters are passive
- Next operational step: build FC with DWARF symbols + perf-record
  flamegraph the snapshot worker thread → find the per-snapshot
  growing loop inside vmm::persist::create_snapshot

Refs #118.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@WaylandYang WaylandYang merged commit 3c39eb5 into main May 21, 2026
2 checks passed
@WaylandYang WaylandYang deleted the probe/118-futex-tracing branch May 21, 2026 08:45