perf: pre-compile APM tag regex filters at startup#1138
perf: pre-compile APM tag regex filters at startup#1138Dogbu-cyber wants to merge 2 commits intomainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR improves trace filtering performance by pre-compiling APM tag filters (exact and regex) once at startup into a shared TagFilters structure, avoiding repeated parsing/regex compilation on every trace chunk/root span evaluation.
Changes:
- Introduces
TagFiltersplusExactFilter/RegexFilterand compiles filter config once inServerlessTraceProcessor::new. - Updates span filtering to use precompiled filters (
filter_span_by_tags(span, &TagFilters)) and removesArc<config::Config>fromChunkProcessor. - Updates call sites and unit tests to construct
ServerlessTraceProcessorvianew(...)and adapt to the new filtering signature.
Reviewed changes
Copilot reviewed 2 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| bottlecap/src/traces/trace_processor.rs | Adds TagFilters compilation + refactors filtering to use precompiled exact/regex filters; updates tests accordingly. |
| bottlecap/src/lifecycle/invocation/processor.rs | Updates test setup to construct ServerlessTraceProcessor via the new constructor. |
| bottlecap/src/bin/bottlecap/main.rs | Updates agent startup wiring to use ServerlessTraceProcessor::new(...). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| attach_span_pointers_to_meta(&mut span.meta, &self.span_pointers); | ||
| } | ||
| } | ||
| } | ||
|
|
||
| fn filter_span_by_tags(span: &Span, config: &config::Config) -> bool { | ||
| fn filter_span_by_tags(span: &Span, tag_filters: &TagFilters) -> bool { |
There was a problem hiding this comment.
The require filter logic uses all(...) (AND semantics). In config/env.rs, the DD_APM_FILTER_TAGS_REQUIRE docs say spans matching at least one tag are sent (OR semantics). Please align implementation/tests with the documented behavior, or update the env var documentation to reflect that all required tags must match.
There was a problem hiding this comment.
https://docs.datadoghq.com/tracing/guide/ignoring_apm_resources/?tab=kubernetes
Public docs and previous logic indicate that the require filter should be using AND semantics, which would mean the comment in env.rs was wrong.
| .map(format_exact_filter) | ||
| .collect::<Vec<_>>() | ||
| .join(", ") | ||
| require_tags.join(", ") |
There was a problem hiding this comment.
Same as the exact-match require path: this uses all(...) (AND semantics) for DD_APM_FILTER_TAGS_REGEX_REQUIRE, but config/env.rs describes "at least one" match (OR semantics). Either the code/tests or the env var docs should be updated so they agree.
| .map(format_exact_filter) | |
| .collect::<Vec<_>>() | |
| .join(", ") | |
| require_tags.join(", ") | |
| .any(|filter| span_matches_regex_filter(span, filter)); | |
| if !matches_require_regex { | |
| debug!( | |
| "TRACE_PROCESSOR | Filtering out span '{}' - doesn't match any required regex tags {}", |
| }) | ||
| } else { | ||
| debug!( | ||
| "TRACE_PROCESSOR | Invalid regex pattern '{}' for key '{}', skipping filter", | ||
| pattern.trim(), | ||
| key.trim() |
There was a problem hiding this comment.
This branch says it's logging a warning and skipping invalid patterns, but it currently uses debug!. Consider logging at warn! (and importing tracing::warn) so misconfiguration is visible in default logs. Also consider including which env var/list the filter came from (require vs reject) since this helper is used for both.
5abf32b to
b046848
Compare
Compile DD_APM_FILTER_TAGS_REGEX_REQUIRE and DD_APM_FILTER_TAGS_REGEX_REJECT
patterns once at startup into a RegexFilter struct { key, regex: Option<Regex> }
rather than re-parsing and compiling on every span.
b046848 to
d0743b0
Compare
Overview
Pre-compile all four APM tag filter configs (
DD_APM_FILTER_TAGS_REQUIRE,DD_APM_FILTER_TAGS_REJECT,DD_APM_FILTER_TAGS_REGEX_REQUIRE,DD_APM_FILTER_TAGS_REGEX_REJECT) at startup into a singleTagFiltersstruct, rather than parsing them on every trace chunk.Previously, regex filters were being compiled via
Regex::newon every root span evaluated, and exact-match filters were split and trimmed on every call tofilter_span_by_tags. Since these filters come from environment variables that don't change at runtime, there is no reason to repeatedly parse and compile them. This change compiles all four filter lists once at startup intoExactFilterandRegexFilterstructs and shares them viaArc<TagFilters>.As a consequence,
ChunkProcessorno longer needs to holdArc<config::Config>. Its only use of config was accessing those four filter fields, so it now holdsArc<TagFilters>directly.Note: Invalid regex patterns in
DD_APM_FILTER_TAGS_REGEX_*are now validated once at startup and skipped with a debug log if invalid. Previously, invalid regexes were re-parsed on every root span evaluation and effectively never matched.Testing
Existing unit tests in
trace_processor.rscover all four filter paths (require/reject, exact/regex). Test sites that constructedChunkProcessordirectly were updated to remove theconfigfield and passArc<TagFilters>instead. Test sites callingfilter_span_by_tagswere updated to the new single-argument signature.