feat(ingress): add opt-in agent audit header propagation #4554
Open
teochenglim wants to merge 2 commits into restatedev:main from
All contributors have signed the CLA ✍️ ✅
Author
I have read the CLA Document and I hereby sign the CLA
Contributor
Thanks a lot for creating this PR @teochenglim. We will probably need a little while to review your contribution properly, as the team is quite busy these days. @slinkydeveloper and @gvdongen, for your visibility, as you were looking into tracing and how to integrate Restate with AI observability tools before.
Agent Audit — Design Doc
Date: 2026-04-03
Status: Ready for implementation
Problem
When Restate is used to orchestrate multi-agent AI workflows, each agent invocation needs to carry enough identity context to answer the questions behind these fields:
- triggered_by
- conversation_id
- agent_id
- workflow_id
- workflow_step
Today, none of these are first-class concepts in Restate. Users must roll their own ad-hoc solutions.
Audit Trace Model
Field-to-Restate Mapping
- trace_id: ServiceInvocationSpanContext
- parent_trace_id
- agent_id: ctx.key() (object key)
- agent_type: invocation_target.service_name()
- agent_version
- workflow_id: ctx.invocation_id()
- workflow_def: invocation_target (name + handler)
- workflow_step: invocation_target.handler_name()
- triggered_by
- conversation_id
Only triggered_by and conversation_id need explicit header propagation; everything else is already derivable from Restate's existing invocation context.
Chosen Approach: Well-Known Headers (opt-in, disabled by default)
Design Principles
- When disabled (the default), x-restate-audit-* headers are stripped at ingress so no untrusted client can inject fake audit context. When enabled, they pass through to handlers.
- … traceparent.
Header Constants
Defined in restate_types::invocation::audit:
Config
In IngressOptions (crates/types/src/config/ingress.rs):
File Changes
- crates/types/src/invocation/audit.rs
- crates/types/src/invocation/mod.rs: pub mod audit;
- crates/types/src/config/ingress.rs: agent_audit: bool (default false)
- crates/ingress-http/src/handler/mod.rs: agent_audit: bool added to the Handler struct
- crates/ingress-http/src/server.rs: agent_audit passed from IngressOptions → HyperServerIngress → Handler
- crates/ingress-http/src/handler/service_handler.rs: x-restate-audit-* stripped in parse_headers() when disabled
Header Stripping in parse_headers()
Usage Pattern (Python SDK)
The SDK receives both constants as well-known strings to reference.
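The header constants and the ingress stripping rule described above can be sketched in Rust as follows. This is a minimal, self-contained illustration under stated assumptions, not the code from this PR: the two full header names, is_audit_header and filter_headers are hypothetical, and headers are modeled as plain (name, value) string pairs rather than the types parse_headers() actually works with. Only the x-restate-audit- prefix and the agent_audit flag come from the design doc.

```rust
/// Prefix shared by all audit headers (taken from the design doc).
pub const AUDIT_HEADER_PREFIX: &str = "x-restate-audit-";
/// Hypothetical full names for the two propagated fields.
pub const TRIGGERED_BY: &str = "x-restate-audit-triggered-by";
pub const CONVERSATION_ID: &str = "x-restate-audit-conversation-id";

/// Returns true if `name` starts with the audit prefix, ignoring ASCII case.
pub fn is_audit_header(name: &str) -> bool {
    name.as_bytes()
        .get(..AUDIT_HEADER_PREFIX.len())
        .map_or(false, |p| p.eq_ignore_ascii_case(AUDIT_HEADER_PREFIX.as_bytes()))
}

/// Drops audit headers unless `agent_audit` is enabled.
/// All other headers always pass through unchanged.
pub fn filter_headers(
    headers: Vec<(String, String)>,
    agent_audit: bool,
) -> Vec<(String, String)> {
    headers
        .into_iter()
        .filter(|(name, _)| agent_audit || !is_audit_header(name))
        .collect()
}
```

With agent_audit set to false, a client-supplied X-Restate-Audit-Triggered-By header is removed before the handler sees it; with agent_audit set to true, the same header is forwarded as-is.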
What Is Not In This PR
The following were considered and explicitly deferred:
- … ctx.run()
Alternative Approaches (PR Comments)
Alt 1: Server-side auto-propagation
What: When Agent A calls Agent B, the Restate server automatically copies x-restate-audit-* headers from the caller's ServiceInvocation.headers into the callee's ServiceInvocation.headers.
Where: crates/worker/src/partition/state_machine/entries/call_commands.rs, in _ApplyCallCommand::apply(), after the CallRequest is destructured: merge any x-restate-audit-* headers from caller_invocation_metadata into the outgoing ServiceInvocation.headers.
Trade-offs:
- … caller_invocation_status)
- … headers on InvocationMetadata (currently only on ServiceInvocation), or a separate lookup
Verdict: Correct long-term direction for a fully-managed audit trail, but too much scope for a minimum PR. Revisit after Option A is validated.
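The merge step that Alt 1 describes could look roughly like the sketch below. It is an illustration only: merge_audit_headers and is_audit_header are hypothetical names, headers are plain (name, value) string pairs instead of the real Header type, and the conflict rule (a header already set on the outgoing call is kept) is an assumption that the design doc does not specify.

```rust
const AUDIT_HEADER_PREFIX: &str = "x-restate-audit-";

/// Returns true if `name` starts with the audit prefix, ignoring ASCII case.
fn is_audit_header(name: &str) -> bool {
    name.as_bytes()
        .get(..AUDIT_HEADER_PREFIX.len())
        .map_or(false, |p| p.eq_ignore_ascii_case(AUDIT_HEADER_PREFIX.as_bytes()))
}

/// Copies the caller's audit headers into the callee's header list.
/// Non-audit headers from the caller are never copied.
/// Assumption: an audit header already present on the callee wins.
pub fn merge_audit_headers(
    caller: &[(String, String)],
    callee: &mut Vec<(String, String)>,
) {
    for (name, value) in caller.iter().filter(|(n, _)| is_audit_header(n)) {
        let already_set = callee.iter().any(|(n, _)| n.eq_ignore_ascii_case(name));
        if !already_set {
            callee.push((name.clone(), value.clone()));
        }
    }
}
```

Whether the caller or the callee should win on conflict is a policy decision; for a tamper-resistant audit trail the opposite rule (caller always overrides) may be preferable.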
Alt 2: Audit headers as first-class fields on ServiceInvocation
What: Instead of using Vec<Header> as the carrier, add audit_ctx: Option<AuditContext> directly to ServiceInvocation and CallRequest.
Where: crates/types/src/invocation/mod.rs (ServiceInvocation) and crates/types/src/journal_v2/command.rs (CallRequest).
Trade-offs:
- CallRequest is part of the service protocol v4 Bilrost encoding; adding a field requires a protocol version bump
Verdict: The right design if audit becomes a core Restate primitive (like idempotency key is today). Premature for the initial feature.
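To make the shape of Alt 2 concrete, here is a small sketch of a typed audit context carried as a field. Everything in it is illustrative: the AuditContext fields are inferred from the two propagated values (triggered_by, conversation_id), ServiceInvocation is a heavily simplified stand-in for the real type, and child_call is a hypothetical helper showing how a callee might inherit the context without any header parsing.

```rust
/// Hypothetical typed carrier for the two audit fields.
#[derive(Debug, Clone, PartialEq, Eq, Default)]
pub struct AuditContext {
    pub triggered_by: Option<String>,
    pub conversation_id: Option<String>,
}

/// Simplified stand-in for the real ServiceInvocation type.
#[derive(Debug, Clone, Default)]
pub struct ServiceInvocation {
    pub headers: Vec<(String, String)>,
    /// None when audit is disabled or no context was supplied.
    pub audit_ctx: Option<AuditContext>,
}

impl ServiceInvocation {
    /// Builds an outgoing invocation that inherits this invocation's
    /// audit context as a typed value (assumed propagation rule).
    pub fn child_call(&self, headers: Vec<(String, String)>) -> ServiceInvocation {
        ServiceInvocation {
            headers,
            audit_ctx: self.audit_ctx.clone(),
        }
    }
}
```

Compared with the header-based approach, the audit values here cannot collide with user headers, at the cost of changing the invocation types themselves.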
Alt 3: OTel span attributes instead of headers
What: Store triggered_by and conversation_id as OpenTelemetry span attributes on the ServiceInvocationSpanContext rather than as headers.
Where: Extend SpanContextDef or add a bag to ServiceInvocationSpanContext in crates/types/src/invocation/mod.rs; emit attributes via invocation_span! macro in crates/tracing-instrumentation.
Trade-offs:
Verdict: Useful as a complementary feature (emit audit fields as span attributes when audit is enabled), not a replacement. Could be layered on top of Option A later.