Description
This issue is specific to users that rely on dlopen + library mode.
When libdd_profiling.so is loaded via dlopen, threads that already exist do not get allocation profiling. This is because TrackerThreadLocalState is only initialized via:
notify_thread_start() — called from the pthread_create hook for new threads
AllocationTracker::allocation_tracking_init() — for the thread that starts tracking
Pre-existing threads hit the allocation hooks (GOT patching is process-wide), but get_tl_state() returns nullptr and the hooks silently skip tracking.
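The skip behavior described above can be modeled in a few lines. This is a minimal sketch, not the ddprof code: `ThreadState`, `g_state`, `model_thread_start()`, and `model_alloc_hook()` are hypothetical stand-ins for `TrackerThreadLocalState`, the TLS slot behind `get_tl_state()`, `notify_thread_start()`, and the allocation hooks.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>

// Hypothetical stand-in for TrackerThreadLocalState (the real type has more fields).
struct ThreadState {
  std::size_t tracked_allocs = 0;
};

// Per-thread storage and a per-thread pointer that stays null until the
// thread is explicitly initialized.
thread_local ThreadState g_storage;
thread_local ThreadState *g_state = nullptr;

// Models notify_thread_start(): in the real library this only runs for
// threads created after the pthread_create hook is installed.
void model_thread_start() { g_state = &g_storage; }

// Models an allocation hook. Returns true if the allocation was tracked.
bool model_alloc_hook() {
  ThreadState *state = g_state;
  if (state == nullptr) {
    return false;  // uninitialized (pre-existing) thread: silently skipped
  }
  ++state->tracked_allocs;
  return true;
}

// Runs the hook on a fresh thread, with or without the start notification,
// and reports whether the hook tracked the allocation.
bool hook_tracks_on_thread(bool notified) {
  bool tracked = false;
  std::thread t([&] {
    if (notified) {
      model_thread_start();
    }
    tracked = model_alloc_hook();
  });
  t.join();
  return tracked;
}
```

In this model, `hook_tracks_on_thread(false)` plays the role of a thread that existed before dlopen: the hook runs, finds no state, and returns without recording anything.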
Why lazy init was removed
Lazy init from allocation hooks is dangerous because the hooks run with the allocator's internal lock held:
We used to rely on pthread TLS APIs, which could allocate. This is no longer the case.
init_tl_state() currently calls retrieve_stack_bounds() → pthread_getattr_np(), which can call malloc internally. This would deadlock on the already-held allocator lock.
This is not a reentry problem solvable with our reentry_guard — the deadlock happens inside glibc's allocator, below our interposition layer.
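The hazard can be illustrated with a toy model. This sketch assumes nothing about glibc internals: `ModelAllocatorLock`, `nested_alloc()`, and `init_under_lock()` are hypothetical names, and an error-checking mutex is used only so that the same-thread relock is reported as `EDEADLK` instead of hanging the process, as a `PTHREAD_MUTEX_NORMAL` mutex would.

```cpp
#include <cassert>
#include <cerrno>
#include <pthread.h>

// Hypothetical model of an allocator-internal, non-recursive lock.
struct ModelAllocatorLock {
  pthread_mutex_t mutex;

  ModelAllocatorLock() {
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    // ERRORCHECK makes a same-thread relock fail with EDEADLK.
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_init(&mutex, &attr);
    pthread_mutexattr_destroy(&attr);
  }
  ~ModelAllocatorLock() { pthread_mutex_destroy(&mutex); }
};

// Models an allocation made by library code (for example inside
// pthread_getattr_np). It needs the allocator lock for itself.
int nested_alloc(ModelAllocatorLock &lock) {
  int rc = pthread_mutex_lock(&lock.mutex);
  if (rc == 0) {
    pthread_mutex_unlock(&lock.mutex);
  }
  return rc;  // 0 on success, EDEADLK if this thread already holds the lock
}

// Models lazy init running while the lock is already held by this thread.
int init_under_lock(ModelAllocatorLock &lock) {
  pthread_mutex_lock(&lock.mutex);
  int rc = nested_alloc(lock);  // the step that would hang with a normal mutex
  pthread_mutex_unlock(&lock.mutex);
  return rc;
}
```

Calling `nested_alloc()` on its own returns 0, while `init_under_lock()` returns `EDEADLK`. In the model, as in the scenario described above, the second acquisition happens inside library code, so a guard placed in the hook layer never gets a chance to intercept it.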
What init_tl_state() does today
auto *state = new (ddprof_lib_state) TrackerThreadLocalState{}; // placement new into TLS — safe
state->tid = ddprof::gettid(); // syscall — safe
state->stack_bounds = retrieve_stack_bounds(); // pthread_getattr_np — NOT safe (can malloc)
The TrackerThreadLocalState constructor also calls std::random_device{}() to seed the RNG. On modern glibc this uses the getrandom() syscall (no heap allocation), but this is implementation-dependent.
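One possible way to remove the dependency on how `std::random_device` is implemented is to call `getrandom()` directly. The following is a sketch under assumptions, not the project's code: `alloc_free_seed()` is a hypothetical helper, `getrandom()` from `<sys/random.h>` requires glibc 2.25 or later, and the fallback path is a low-quality mix chosen only because it makes no heap allocation.

```cpp
#include <cassert>
#include <cstdint>
#include <ctime>
#include <sys/random.h>  // getrandom(), available in glibc 2.25+
#include <sys/types.h>   // ssize_t

// Hypothetical helper (not part of ddprof): returns a 64-bit seed using only
// syscalls and stack memory.
uint64_t alloc_free_seed() {
  uint64_t seed = 0;
  // GRND_NONBLOCK: fail instead of blocking if the entropy pool is not ready.
  ssize_t n = getrandom(&seed, sizeof(seed), GRND_NONBLOCK);
  if (n == static_cast<ssize_t>(sizeof(seed))) {
    return seed;
  }
  // Fallback: mix a monotonic timestamp with a stack address.
  // Much weaker than getrandom(), but it still avoids the heap.
  timespec ts{};
  clock_gettime(CLOCK_MONOTONIC, &ts);
  uint64_t t = static_cast<uint64_t>(ts.tv_sec) * 1000000000ull +
               static_cast<uint64_t>(ts.tv_nsec);
  return t ^ static_cast<uint64_t>(reinterpret_cast<uintptr_t>(&ts));
}
```

The returned value could then seed the RNG in the constructor in place of `std::random_device{}()`. Whether the fallback is acceptable depends on how the RNG is used, which this sketch does not assume.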
Next Steps
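Ensure that everything init_tl_state() does, including retrieve_stack_bounds() and the TrackerThreadLocalState constructor, avoids allocation, so that initialization can safely run from the allocation hooks for pre-existing threads.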