android: Fix GC hook installation on Android 16 (CMC + stripped libart)#394
Open
Kolektori wants to merge 1 commit into
Android 16 ships libart.so without .symtab, so findExportByName and findSymbolByName miss internal ART symbols. Parse .gnu_debugdata via enumerateSymbols() and cache the result per module. Also attach the GC synchronize-on-leave hook to MarkCompact::RunPhases for Android 16's Concurrent Mark Compact collector, and fall back to ConcurrentCopying::RunPhases when CopyingPhase is inlined.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
oleavr
requested changes
May 11, 2026
    if (byName === undefined) {
      byName = new Map();
      try {
        for (const sym of module.enumerateSymbols()) {
Member
Thanks for the detailed write-up.
Before merging I want to understand the findSymbolByName vs enumerateSymbols split, because in Gum both go through the same gum_elf_module_enumerate_symbols, which already falls back to .gnu_debugdata when .symtab is missing. So in principle they shouldn't disagree.
A few questions:
- Which frida / frida-server version on the A16 device? The mini-debuginfo fallback landed in gum 8ed32c4d (Dec 2024), the dynsym fallback in 01eadbff (Mar 2026).
- Does `findSymbolByName('libart.so', '_ZN3art2gc4Heap22CollectGarbageInternalENS0_9collector6GcTypeENS0_7GcCauseEbj')` start working after an `enumerateSymbols()` pass on the same module? That would point at a state/ordering bug in Gum.
- `readelf -S` on the device's `libart.so`: which of `.symtab` / `.dynsym` / `.gnu_debugdata` are present?
If it's a Gum bug I'd rather fix it there.
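The second question can be answered with a short probe. What follows is a minimal sketch, not something from the PR: `probeSymbol` is a hypothetical helper name, and it assumes the instance methods `findSymbolByName` and `enumerateSymbols` on a Frida `Module` object, as the patch itself uses them. It records whether the direct lookup succeeds before and after one enumeration pass.

```javascript
// Hypothetical diagnostic helper (name and shape are assumptions, not part of the PR).
// Reports whether a direct lookup succeeds before and after one enumerateSymbols() pass.
function probeSymbol(module, name) {
  const before = module.findSymbolByName(name);
  const enumerated = module.enumerateSymbols().find((sym) => sym.name === name);
  const after = module.findSymbolByName(name);
  return {
    before: before !== null,               // direct lookup on a fresh module
    viaEnumerate: enumerated !== undefined, // symbol visible through enumeration
    after: after !== null,                 // direct lookup once enumeration has run
  };
}

// Only runs inside a Frida script, where the global `Process` exists.
if (typeof Process !== 'undefined') {
  const art = Process.getModuleByName('libart.so');
  const mangled =
    '_ZN3art2gc4Heap22CollectGarbageInternalENS0_9collector6GcTypeENS0_7GcCauseEbj';
  console.log(JSON.stringify(probeSymbol(art, mangled)));
}
```

If `before` is false while `after` is true, that would suggest the state/ordering issue in Gum. If `before` and `after` are both false while `viaEnumerate` is true, the two code paths genuinely disagree.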
Fixes #387.
On Android 16 (e.g. build `BP4A.251205.006`, `com.android.art@361302280`) the process crashes with a NULL dereference in `art::CodeInfo::DecodeGcMasksOnly` during a GC stack walk whenever a replacement `ArtMethod` is on some thread's stack when the collector runs. The reporter's backtrace in #387 shows the fault driven by `art::gc::collector::MarkCompact::RunPhases` → `VisitRoots` → `StackVisitor::WalkStack` → `DecodeGcMasksOnly`.

Two A16 libart.so changes together break the GC synchronization machinery in `ensureArtKnowsHowToHandleReplacementMethods` and `instrumentArtGarbageCollection`:

- libart.so is now stripped: `.symtab` is gone but the library retains a `.gnu_debugdata` section (LZMA-compressed mini-debuginfo). `Module.findSymbolByName` reads `.dynsym` + `.symtab` and so returns `null` for `Heap::CollectGarbageInternal` and `ConcurrentCopying::CopyingPhase` on A16, silently skipping the hook install. `Module.enumerateSymbols()` does parse the mini-debuginfo, so the symbols are still reachable that way.
- A16 defaults to Concurrent Mark Compact (CMC) instead of Concurrent Copying. Even if we resolve `ConcurrentCopying::CopyingPhase`, it never fires under CMC, so replacement `ArtMethod`s are never re-synchronized after compaction. `MarkCompact::RunPhases` is the CMC lifecycle-event equivalent.

Changes
- Add a `resolveDebugdataSymbol(module, name)` fallback that lazily caches `Module.enumerateSymbols()` per module, and plumb it into `temporaryApi.find` as a last resort after `findExportByName` / `findSymbolByName`. Same plumbing restores `Heap::CollectGarbageInternal` resolution transparently for all ART-symbol callers.
- Route the `art.findSymbolByName(...)` call sites in `instrumentArtGarbageCollection` and `instrumentArtFixupStaticTrampolines` through `api.find` so they pick up the mini-debuginfo fallback.
- When `CopyingPhase` is inlined (also seen on A16+), fall back to `ConcurrentCopying::RunPhases` as the hook point, one level up in the same phase-driver function.
- Hook `MarkCompact::RunPhases` with the existing `artController.hooks.Gc.copyingPhase` callback. The callback is collector-agnostic: it just synchronizes entrypoints at a "world is consistent again" lifecycle point, so reusing it for CMC is correct. Both hooks can coexist; only the active collector dispatches its phase.

Net diff: `lib/android.js` +38 / −4.

Testing
Reproduced #387 on a Cuttlefish x86_64 guest running `aosp-android-latest-release` (build `15150359`, API 36, same libart BuildId class as the Pixel 7 reporter's build). Unpatched 7.0.13: HeapTaskDaemon SIGSEGV within seconds of attaching a `.implementation` hook to any hot constructor (e.g. `java.net.URL.<init>(String)` or `java.io.File.<init>(String)`).

With this patch applied, the following hook set runs to completion simultaneously on the same target for a full 90s analysis window:

- `.implementation` swaps on `java.net.URL.<init>(String)` and `java.io.File.<init>(String)`
- `.implementation` swaps on all 17 `java.lang.StringFactory.newString*` overloads
- `Interceptor.attach` on `art::mirror::String::AllocFromModifiedUtf8` (3 overloads) and `AllocFromUtf16`
- `Interceptor.attach` on libc `__system_property_get`, `__system_property_find`, `open`, `fopen*`, `freopen*`, `stat`
- `Interceptor.attach` on libdl dynamic-loader exports
- `Interceptor.attach` on libart/libdexfile dex-retrieval paths

Observed: zero tombstones, no `DecodeGcMasksOnly` frames, full MITM / logcat / media / trace artifact set collected, hooks actively firing (600+ `__system_property_get` rewrites, 121 `File.<init>` callbacks, 60 `MessageDigest` callbacks in one run).

Notes
- […] `null` there via `api.find`, and the mini-debuginfo fallback is a no-op when `findSymbolByName` already succeeds.
- […] `WeakMap`-keyed symbol cache so entries die with the module.
- […] `Thread::RunFlipFunction` hook as-is: still exported, still correct on CC builds.
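For readers who want a concrete picture of the `WeakMap`-keyed cache and the last-resort lookup order described above, here is a minimal sketch. It is not the patch itself: only the `resolveDebugdataSymbol` name and the first few lines of its body appear in this PR, while the rest of the body, the `findArtSymbol` wrapper, and the choice to keep the first address per name are assumptions made for illustration.

```javascript
// Sketch only: the helper name mirrors the PR description, the body is assumed.
// Keyed by the module object, so a cache entry can be collected with its module.
const debugdataCache = new WeakMap(); // module -> Map(symbol name -> address)

function resolveDebugdataSymbol(module, name) {
  let byName = debugdataCache.get(module);
  if (byName === undefined) {
    byName = new Map();
    try {
      for (const sym of module.enumerateSymbols()) {
        // Assumption: keep the first address seen for a given name.
        if (!byName.has(sym.name)) {
          byName.set(sym.name, sym.address);
        }
      }
    } catch (e) {
      // Enumeration failed: cache what was collected so lookups do not retry each time.
    }
    debugdataCache.set(module, byName);
  }
  return byName.has(name) ? byName.get(name) : null;
}

// Hypothetical lookup chain: exports first, then symbols, then the cached enumeration.
function findArtSymbol(module, name) {
  return module.findExportByName(name)
    ?? module.findSymbolByName(name)
    ?? resolveDebugdataSymbol(module, name);
}
```

Because the enumeration only runs when both direct lookups return `null`, the fallback costs nothing on builds where `findSymbolByName` already succeeds, and at most one full symbol walk per module otherwise.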