Skip to content

Fix stack frame when using runtime async methods#127767

Draft
eduardo-vp wants to merge 35 commits intodotnet:mainfrom
eduardo-vp:naot-build-stack-frame
Draft

Fix stack frame when using runtime async methods#127767
eduardo-vp wants to merge 35 commits intodotnet:mainfrom
eduardo-vp:naot-build-stack-frame

Conversation

@eduardo-vp
Copy link
Copy Markdown
Member

@eduardo-vp eduardo-vp commented May 4, 2026

When building the stack frame in Native AOT, inject continuations if they exist and skip frames marked as hidden, matching CoreCLR behavior.

Works on top of #125396.

Fixes #122547.

tommcdon and others added 30 commits April 15, 2026 14:11
output native sequence points

wip

WIP Testing

More debug output

testing

Temp disable il offset lookup on exceptions

Walk continuation resume chain and append to stackwalk data structure

Use AsyncResumeILStubResolver to get the target method address

Rebase fixes and collect native offsets for Env.StackTrace

Inject ResumeData frames GetStackFramesData::Elements

Truncate stack when async v2 continuation data is present

Fix bad merge

Addional fixes from previous merge

Update to latest Continuation changes in main
Validates that Environment.StackTrace correctly includes async method
frames and filters out internal DispatchContinuations frames when a
runtime async method resumes via continuation dispatch.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When executing inside a runtime async (v2) continuation dispatch,
the stack trace is augmented by truncating internal dispatch frames
and appending continuation DiagnosticIP values from the async
continuation chain. This parallels the CoreCLR implementation in
debugdebugger.cpp but operates on the NativeAOT IP-address-based
stack trace model.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Fix ExtractContinuationData to load AsyncDispatcherInfo type by name
  instead of searching the wrong MethodTable (RuntimeAsyncTask<T>)
- Fix field name: t_dispatcherInfo -> t_current (matching managed code)
- Fix type name: use strstr for RuntimeAsyncTask substring matching
  instead of exact match against non-existent RuntimeAsyncTaskCore
- Add IsDynamicMethod() safety check before AsDynamicMethodDesc() to
  skip non-stub continuations (Task infrastructure callbacks)
- Walk full dispatcher chain via Next pointer (was TODO with break)
- Tighten NativeAOT type matching to AsyncHelpers+RuntimeAsyncTask
- Update env-stacktrace test to verify OuterMethod appears via
  continuation injection (not just physical stack presence)
- Update reflection tests to expect logical async caller at frame 1
  when an async caller exists in the continuation chain

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…action

- Register AsyncDispatcherInfo and its t_current field in corelib.h
  binder instead of using ClassLoader::LoadTypeByNameThrowing and
  manual field iteration
- Use ContinuationObject::GetNext() and GetResumeInfo() accessors
  instead of hand-parsing fields via ApproxFieldDescIterator with
  string name matching
- Add GetNext() accessor to ContinuationObject in object.h
- GC-protect both continuation and pNext as CONTINUATIONREF in the
  gc struct across calls that may trigger GC
- Define ResumeInfoLayout struct with documented field offsets
  matching the managed ResumeInfo struct layout

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ameter, tighten boundary check

- Use CoreLib binder DEFINE_FIELD entries for AsyncDispatcherInfo.Next and
  NextContinuation instead of hardcoded DispatcherInfoLayout struct (jkotas)
- Remove unused pContinuationMT parameter from ExtractContinuationData
  (copilot-reviewer)
- NativeAOT FindAsyncDispatchBoundary now also checks method name is
  DispatchContinuations, not just owning type (copilot-reviewer)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…dation

Replace runtime binder field offset lookups with direct struct member
access via AsyncDispatcherInfoLayout. Add DEFINE_CLASS_U/DEFINE_FIELD_U
entries so debug builds validate the native struct matches the managed
layout at startup. Keep DEFINE_CLASS/DEFINE_FIELD for t_current TLS
access which is still needed at runtime.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Remove (DWORD) cast on SIZE_T offset to avoid 64-bit truncation.
Use already-fetched pFunc instead of redundant pCf null check and
triple GetFunction() calls in GetStackFramesCallback.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Fix FindAsyncDispatchBoundary to use the fully qualified NativeAOT
metadata type name for the DispatchContinuations boundary frame.
NativeAOT metadata uses '.' as the nested type separator (e.g.
'System.Runtime.CompilerServices.AsyncHelpers.RuntimeAsyncTask`1')
rather than '+', so the previous substring match never matched.

Use exact string match on the fully qualified type name to avoid
false matches with user-defined methods.

Add AsyncStackTrace test to NativeAOT UnitTests smoke tests that
validates OuterMethod -> MiddleMethod -> InnerMethod continuation
stitching via Environment.StackTrace.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Merge async stitching into InitializeForIpAddressArray to eliminate
  separate scan pass that double-formatted stack metadata (perf fix)
- Detect dispatch boundary during frame construction using already-resolved
  StackFrame metadata (IsAsyncDispatchBoundary) at zero additional cost
- Remove NativeAOT smoke test per reviewer guidance (duplicates library test)
- Fix comment: use correct type name RuntimeAsyncTask, remove NativeAOT-specific framing

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Check DiagnosticIP is non-null before appending to continuation list
- Skip null pResumeIp entries before constructing EECodeInfo
- Add pFunc null guard in else-if branch before dereferencing
- Use exact strcmp with fully qualified type name instead of strstr substring

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…c stitching

- Replace expensive GetFullyQualifiedNameForClassNestedAware string formatting
  in debugdebugger.cpp with a MethodDesc pointer comparison via CoreLibBinder.
  Register RuntimeAsyncTask\1 and DispatchContinuations in corelib.h binder.
- Fix NativeAOT FindAsyncDispatchBoundary to use '.' separator instead of '+'
  for nested type names in stack trace formatting.
- Add shared EnvironmentStackTrace_AsyncContinuationStitching library test that
  validates async v2 continuation stitching on both CoreCLR and NativeAOT.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…o assertion messages

- Replace Task.Yield() with Task.Delay(1) for consistency with library test
- Move stack trace output from Console.WriteLine to assertion failure messages
  to reduce CI log noise while preserving diagnostics on failure

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Hide non-async frames from Environment.StackTrace once runtime async v2
frames are present, making pre-await and post-await traces consistent.

DOTNET_HideAsyncDispatchFrames controls behavior:
  0 = show all frames (no filtering)
  1 = hide all non-async frames below first async frame (default)
  2 = keep non-async frames between async frames, truncate trailing

Implemented for both CoreCLR (C++ stack walk callback + post-processing)
and NativeAOT (managed frame loop + IsAsyncMethod metadata flag).

NativeAOT plumbing: IsAsyncMethod flag flows through StackTraceData
command byte (0x20) -> ILC StackTraceRecordFlags -> metadata blob ->
StackTraceMetadata decoder -> StackFrame.IsAsyncV2Method property.

Tests:
- env-stacktrace: mode 1 (default) pre/post-await consistency
- env-stacktrace-nohide: mode 0 full stack visibility
- env-stacktrace-truncate: mode 2 interim kept, trailing truncated
- Library StackTraceTests: in-process DefaultOn test

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Update config and code comments to make clear that mode 0 shows all
frames including non-async ones, but async continuation stitching
(dispatch boundary truncation + continuation splicing) still occurs.
This distinguishes it from PhysicalOnly which skips stitching entirely.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Drop the 'V2' implementation detail from the property name to align
with reviewer feedback.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add mode 3 which disables async continuation stitching entirely,
returning only the physical call stack. This preserves the pre-async-
stitching behavior as an opt-out for the breaking change.

CoreCLR: skip async detection and continuation extraction in callback
when mode=3. NativeAOT: skip CollectAsyncContinuationIPs when mode=3.

Add env-stacktrace-physical test that verifies post-await traces do
not contain suspended callers when stitching is disabled.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Extract inline string literals into constants AsyncDispatchMethodName
and AsyncDispatchOwningType so there is only one place to update if
the method is renamed.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add comment explaining that MinimumFunctionAlignment = 4 guarantees
bits 0 and 1 are clear in method RVAs, making them safe for flag
packing. Extract FlagsMask constant for cleaner bit manipulation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
On ARM32, RhFindMethodStartAddress adds the THUMB bit (bit 0) to method
addresses. This bit flows into RVA computations used for BinarySearch
lookups in StackTraceMetadata. Packing IsAsyncMethodFlag into bit 0
would collide with the THUMB bit, causing every ARM32 method to appear
as an async method.

Move IsAsyncMethod to a separate auto-property. IsHidden remains packed
in bit 1, which is safe since THUMB only uses bit 0 and
MinimumFunctionAlignment = 4 guarantees bit 1 is always zero.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
ExtractContinuationData and CollectAsyncContinuationIPs previously walked
the entire AsyncDispatcherInfo.t_current linked list. Only the innermost
(first) dispatcher represents the currently executing async scope; outer
dispatchers represent already-completed scopes whose continuations should
not appear in the stack trace.

CoreCLR: add break after processing the first dispatcher with continuations.
NativeAOT: replace the for loop with direct access to pInfo->NextContinuation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Revert IsAsyncMethod from a separate bool field back to bit 0 of the
packed RVA, per review feedback that StackTraceData is allocated for all
method bodies and the extra field increases array size.

Add GetAlignedMethodRva helper that masks off bits 0-1 (& ~0x3) at all
three RVA computation sites. MinimumFunctionAlignment = 4 guarantees
these bits are always zero for real method RVAs. On ARM32, the THUMB bit
(bit 0) is present in addresses from RhFindMethodStartAddress and must
be stripped before storage or lookup.

Validated on ARM32 hardware (Raspberry Pi 3) — exit code 100.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Update config description to accurately reflect that mode 0 shows all
frames above the dispatch boundary with async continuation stitching,
not all frames unconditionally.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… isolate tests

- Cache CoreLibBinder::GetMethod(METHOD__RUNTIME_ASYNC_TASK__DISPATCH_CONTINUATIONS)
  in GetStackFramesData to avoid repeated binder lookups during stack walk.
- Add explicit #include <sarray.h> in debugdebugger.h.
- Pass computed hideMode into InitializeForIpAddressArray (NativeAOT) to
  eliminate redundant DOTNET_HideAsyncDispatchFrames environment read.
- Wrap async stitching and frame hiding library tests in RemoteExecutor
  with explicit DOTNET_HideAsyncDispatchFrames=1 for test isolation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
tommcdon and others added 5 commits April 16, 2026 21:24
- Rename DOTNET_HideAsyncDispatchFrames to DOTNET_StackTraceAsyncBehavior
- Update config description to cover both Environment.StackTrace and
  System.Diagnostics.StackTrace
- Revert pDispatchContinuationsMD caching (CoreLibBinder already caches)
- Remove unnecessary null checks: CoreLibBinder::GetField throws on
  failure, GetThread() is never null from managed code
- Remove left-over mode description comment in NativeAOT
- Pin DOTNET_StackTraceAsyncBehavior in all env-stacktrace test projects
  with RequiresProcessIsolation
- Cache env var read in NativeAOT GetHideAsyncDispatchMode

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… comparison

- Cache DispatchContinuations method start address in StackTraceMetadata
  during first metadata resolution instead of matching names at runtime
- Add IsAsyncDispatchBoundaryMethod virtual callback for address comparison
- Add fallback in NativeAOT: if boundary not found (e.g. inlined), append
  continuation frames at end of stack trace
- Use fully-qualified const strings for boundary method identification

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Exception stack traces should not be modified by DOTNET_StackTraceAsyncBehavior
to maintain consistency with CoreCLR, where exception traces go through a
separate path that does not apply async frame hiding.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Simplify StackTraceAsyncBehavior config to 3 modes:
  0 = show all frames with async continuation stitching
  1 = hide non-async frames below first async frame (default)
  2 = physical only (no async stitching)

The old mode 2 (keep interim non-async frames, truncate trailing) added
complexity without clear value. Old mode 3 is renumbered to mode 2.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 4, 2026 17:39
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error: Your billing is not configured or you have Copilot licenses from multiple standalone organizations or enterprises. To use premium requests, select a billing entity via the GitHub site, under Settings > Copilot > Features.

@eduardo-vp eduardo-vp marked this pull request as draft May 4, 2026 17:39
@dotnet-policy-service
Copy link
Copy Markdown
Contributor

Tagging subscribers to this area: @agocke, @dotnet/ilc-contrib
See info in area-owners.md if you want to be subscribed.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated 4 comments.

Comments suppressed due to low confidence (1)

src/coreclr/nativeaot/System.Private.StackTraceMetadata/src/Internal/StackTraceMetadata/StackTraceMetadata.cs:120

  • Methods that are present in the reflection invoke table are intentionally omitted from the stack-trace mapping blob, so this reflection fallback is the only metadata source for them. Leaving isAsyncMethod hard-coded to false here means reflectable runtime-async methods will never be recognized as async in NativeAOT, which breaks the new hiding/stitching logic for public or otherwise reflectable async methods.
            isStackTraceHidden = false;
            isAsyncMethod = false;

            // We haven't found information in the stack trace metadata tables, but maybe reflection will have this
            if (ReflectionExecution.TryGetMethodMetadataFromStartAddress(methodStartAddress,
                out MetadataReader reader,
                out TypeDefinitionHandle typeHandle,
                out MethodHandle methodHandle))

Comment on lines +231 to +232
pDispatcherInfo = (AsyncDispatcherInfoLayout*)pDispatcherInfo->Next;
continue;
Comment on lines +99 to +105
// Identify and cache the async dispatch boundary method on first encounter.
if (_asyncDispatchBoundaryMethodAddress == IntPtr.Zero &&
isStackTraceHidden &&
methodName is AsyncDispatchBoundaryMethodName &&
owningTypeName is AsyncDispatchBoundaryOwningType)
{
_asyncDispatchBoundaryMethodAddress = methodStartAddress;
Comment on lines +171 to +200
IntPtr[]? continuationIPs = StackTrace.CollectAsyncContinuationIPs();

int visibleIndex = 0;
for (int i = 0; i < totalFrameCount; i++)
{
IntPtr ip = frameArray[i];

if (stackTraceCallbacks != null)
{
IntPtr methodStart = RuntimeImports.RhFindMethodStartAddress(ip);
int offset = (int)((nint)ip - (nint)methodStart);
stackTraceCallbacks.TryGetMethodStackFrameInfo(methodStart, offset, false,
out _, out _, out _, out bool isHidden, out _, out _, out _);

bool isBoundary = stackTraceCallbacks.IsAsyncDispatchBoundaryMethod(methodStart);

// If this frame is the async dispatch boundary and we have continuations,
// inject continuations in place of remaining physical frames.
if (isBoundary && continuationIPs is not null)
{
for (int j = 0; j < continuationIPs.Length; j++)
{
if (visibleIndex == frameIndex)
{
ipAddress = continuationIPs[j];
break;
}
visibleIndex++;
}
InitializeForIpAddress(ipAddress, needFileInfo);
AppDomain *pDomain;
BOOL fDoWeHaveAnyFramesFromForeignStackTrace;
BOOL fAsyncFramesPresent; // True if async frames were present in the stack
DWORD hideAsyncDispatchMode; // 0 = show all above dispatch boundary (with stitching), 1 = hide non-async, 2 = truncate trailing, 3 = physical only (no stitching)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Stack traces with async variants

3 participants