
perf: pre-compile APM tag regex filters at startup #1138

Open
Dogbu-cyber wants to merge 2 commits into main from
david.ogbureke/precompile-regex-filters

Dogbu-cyber commented Mar 29, 2026

Overview

Pre-compile all four APM tag filter configs (DD_APM_FILTER_TAGS_REQUIRE, DD_APM_FILTER_TAGS_REJECT, DD_APM_FILTER_TAGS_REGEX_REQUIRE, DD_APM_FILTER_TAGS_REGEX_REJECT) at startup into a single TagFilters struct, rather than parsing them on every trace chunk.

Previously, regex filters were compiled via Regex::new for every root span evaluated, and exact-match filters were split and trimmed on every call to filter_span_by_tags. Because these filters come from environment variables that do not change at runtime, repeating that work is unnecessary. This change compiles all four filter lists once at startup into ExactFilter and RegexFilter structs and shares them via Arc<TagFilters>.

As a consequence, ChunkProcessor no longer needs to hold Arc<config::Config>. Its only use of config was accessing those four filter fields, so it now holds Arc<TagFilters> directly.
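A minimal sketch of the parse-once idea described above, limited to the exact-match lists so that it stays within the standard library. The field layout of ExactFilter, the `key:value` entry format, and the names `from_lists`, `parse_exact_filter`, and `matches_exact` are assumptions made for illustration; they are not taken from the PR's diff, and the regex lists are left out.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// One exact-match filter: a tag key plus an optional expected value.
/// (Hypothetical layout; the PR's real ExactFilter may differ.)
#[derive(Debug, Clone, PartialEq)]
pub struct ExactFilter {
    pub key: String,
    pub value: Option<String>,
}

/// Filters parsed once at startup and then shared read-only.
/// The regex lists are omitted to keep the sketch std-only.
#[derive(Debug, Default)]
pub struct TagFilters {
    pub require: Vec<ExactFilter>,
    pub reject: Vec<ExactFilter>,
}

/// Parse a single "key:value" or bare "key" entry (assumed format).
/// Blank entries are dropped.
fn parse_exact_filter(raw: &str) -> Option<ExactFilter> {
    let raw = raw.trim();
    if raw.is_empty() {
        return None;
    }
    match raw.split_once(':') {
        Some((key, value)) => Some(ExactFilter {
            key: key.trim().to_string(),
            value: Some(value.trim().to_string()),
        }),
        None => Some(ExactFilter {
            key: raw.to_string(),
            value: None,
        }),
    }
}

impl TagFilters {
    /// Do the splitting and trimming once, then hand out an Arc
    /// that every processor can clone cheaply.
    pub fn from_lists(require: &[String], reject: &[String]) -> Arc<Self> {
        Arc::new(TagFilters {
            require: require.iter().filter_map(|s| parse_exact_filter(s)).collect(),
            reject: reject.iter().filter_map(|s| parse_exact_filter(s)).collect(),
        })
    }
}

/// Per-span check: no parsing happens here, only lookups.
pub fn matches_exact(meta: &HashMap<String, String>, filter: &ExactFilter) -> bool {
    match (meta.get(&filter.key), &filter.value) {
        (Some(actual), Some(expected)) => actual == expected,
        (Some(_), None) => true, // key-only filter: presence is enough
        (None, _) => false,
    }
}
```

With this shape, the hot path only reads already-parsed data, and holders of the filters need an `Arc<TagFilters>` rather than the whole configuration.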

Note: Invalid regex patterns in DD_APM_FILTER_TAGS_REGEX_* are now detected once at startup, logged at debug level, and skipped. Previously, each invalid pattern was re-parsed on every root span evaluation and effectively never matched.

Testing

Existing unit tests in trace_processor.rs cover all four filter paths (require/reject, exact/regex). Test sites that constructed ChunkProcessor directly were updated to remove the config field and pass Arc<TagFilters> instead. Test sites calling filter_span_by_tags were updated to the new single-argument signature.

Copilot AI left a comment

Pull request overview

This PR improves trace filtering performance by pre-compiling APM tag filters (exact and regex) once at startup into a shared TagFilters structure, avoiding repeated parsing/regex compilation on every trace chunk/root span evaluation.

Changes:

  • Introduces TagFilters plus ExactFilter/RegexFilter and compiles filter config once in ServerlessTraceProcessor::new.
  • Updates span filtering to use precompiled filters (filter_span_by_tags(span, &TagFilters)) and removes Arc<config::Config> from ChunkProcessor.
  • Updates call sites and unit tests to construct ServerlessTraceProcessor via new(...) and adapt to the new filtering signature.

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| bottlecap/src/traces/trace_processor.rs | Adds TagFilters compilation + refactors filtering to use precompiled exact/regex filters; updates tests accordingly. |
| bottlecap/src/lifecycle/invocation/processor.rs | Updates test setup to construct ServerlessTraceProcessor via the new constructor. |
| bottlecap/src/bin/bottlecap/main.rs | Updates agent startup wiring to use ServerlessTraceProcessor::new(...). |

Comment on lines 208 to +213
attach_span_pointers_to_meta(&mut span.meta, &self.span_pointers);
}
}
}

- fn filter_span_by_tags(span: &Span, config: &config::Config) -> bool {
+ fn filter_span_by_tags(span: &Span, tag_filters: &TagFilters) -> bool {

Copilot AI Mar 29, 2026

The require filter logic uses all(...) (AND semantics). In config/env.rs, the DD_APM_FILTER_TAGS_REQUIRE docs say spans matching at least one tag are sent (OR semantics). Please align implementation/tests with the documented behavior, or update the env var documentation to reflect that all required tags must match.

Dogbu-cyber (author) Mar 29, 2026

https://docs.datadoghq.com/tracing/guide/ignoring_apm_resources/?tab=kubernetes

Both the public docs and the previous logic indicate that the require filter should use AND semantics, which would mean the comment in env.rs is the part that was wrong.
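To make the two readings in this thread concrete, here is a small sketch contrasting AND semantics (`all`) with OR semantics (`any`) for a require list. The helper names `has_tag`, `passes_require_all`, and `passes_require_any`, and the use of plain `(key, value)` pairs, are hypothetical and are not the PR's code.

```rust
use std::collections::HashMap;

type Meta = HashMap<String, String>;

/// True when the span metadata carries exactly this key/value pair.
fn has_tag(meta: &Meta, key: &str, value: &str) -> bool {
    meta.get(key).map(String::as_str) == Some(value)
}

/// AND semantics: every required tag must be present.
/// An empty require list lets everything through.
pub fn passes_require_all(meta: &Meta, required: &[(&str, &str)]) -> bool {
    required.iter().all(|(k, v)| has_tag(meta, k, v))
}

/// OR semantics: one matching required tag is enough.
/// `any` on an empty list is false, so the empty case is handled explicitly.
pub fn passes_require_any(meta: &Meta, required: &[(&str, &str)]) -> bool {
    required.is_empty() || required.iter().any(|(k, v)| has_tag(meta, k, v))
}
```

A span tagged only `env:prod`, checked against the require list `env:prod, team:apm`, is kept under the OR reading and dropped under the AND reading, which is the difference the env.rs comment and the implementation disagree on.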

Comment on lines 233 to 236
.map(format_exact_filter)
.collect::<Vec<_>>()
.join(", ")
require_tags.join(", ")

Copilot AI Mar 29, 2026

Same as the exact-match require path: this uses all(...) (AND semantics) for DD_APM_FILTER_TAGS_REGEX_REQUIRE, but config/env.rs describes "at least one" match (OR semantics). Either the code/tests or the env var docs should be updated so they agree.

Suggested change
.map(format_exact_filter)
.collect::<Vec<_>>()
.join(", ")
require_tags.join(", ")
.any(|filter| span_matches_regex_filter(span, filter));
if !matches_require_regex {
debug!(
"TRACE_PROCESSOR | Filtering out span '{}' - doesn't match any required regex tags {}",

Comment on lines +123 to +128
})
} else {
debug!(
"TRACE_PROCESSOR | Invalid regex pattern '{}' for key '{}', skipping filter",
pattern.trim(),
key.trim()

Copilot AI Mar 29, 2026

This branch says it's logging a warning and skipping invalid patterns, but it currently uses debug!. Consider logging at warn! (and importing tracing::warn) so misconfiguration is visible in default logs. Also consider including which env var/list the filter came from (require vs reject) since this helper is used for both.

Dogbu-cyber force-pushed the david.ogbureke/precompile-regex-filters branch 3 times, most recently from 5abf32b to b046848, March 29, 2026 00:56
Compile DD_APM_FILTER_TAGS_REGEX_REQUIRE and DD_APM_FILTER_TAGS_REGEX_REJECT
patterns once at startup into a RegexFilter struct { key, regex: Option<Regex> }
rather than re-parsing and compiling on every span.
Dogbu-cyber force-pushed the david.ogbureke/precompile-regex-filters branch from b046848 to d0743b0, March 29, 2026 01:10