fix(integration): compare query fuzz errors by class, not exact string#7550
Open
sandy2008 wants to merge 2 commits into
Open
fix(integration): compare query fuzz errors by class, not exact string#7550sandy2008 wants to merge 2 commits into
sandy2008 wants to merge 2 commits into
Conversation
Three call sites in integration/query_fuzz_test.go compared two
Prometheus HTTP API errors with cmp.Equal(tc.err1, tc.err2), which is
a strict byte-wise string compare on the underlying *promv1.Error.Msg.
When a fuzz query exercised a vector-matching path, the upstream
engine produced messages of shape
execution: found duplicate series for the match group ...
[{__name__="test_series_a", series="2"}, {series="2"}]; ...
The contents of the [...] group are built from Go-map iteration in
vendor/github.com/prometheus/prometheus/promql/engine.go, so the two
entries can appear in either order between Cortex and the reference
Prometheus, even though the error class and the underlying series set
are identical. The strict compare then reported a divergence that did
not exist. Issue cortexproject#7546 captures the full census.
Fix: add sameErrorClass(err1, err2) backed by two helpers in the same
file. canonicalizeErrMsg sorts the contents of every [...] group on
top-level commas; splitTopLevelCommas is brace-depth aware so commas
inside a Prometheus labelset literal ({a="1", b="2"}) are NOT split.
When both errors unwrap to *promv1.Error via errors.As, the
comparator requires equal Type AND equal canonicalized Msg; otherwise
it falls back to comparing canonicalized Error() strings. Replace
the three strict-string sites in TestDisableChunkTrimmingFuzz
(~:408), TestExpandedPostingsCacheFuzz (~:644), and the shared
runQueryFuzzTestCases helper (~:1959) used by ~ten other fuzz tests
in this file.
Add TestSameErrorClass with nine subtests, including the explicit
"same typed class, materially different messages (must not mask)"
guard — two errors with the same *promv1.Error.Type but truly
different messages must still diverge, so the comparator cannot
silently swallow a real Cortex-vs-Prometheus regression.
Fixes cortexproject#7546
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Sandy Chen <Yuxuan.Chen@morganstanley.com>
Contributor
Author
Failing TestVerticalShardingFuzz on the prior CI run was the same pre-existing flake tracked in cortexproject#7547 / fixed by cortexproject#7551 — not a regression from this PR's error-comparator change. Signed-off-by: Sandy Chen <Yuxuan.Chen@morganstanley.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What this PR does
Replaces three strict
cmp.Equal(tc.err1, tc.err2)call sites inintegration/query_fuzz_test.gowith a newsameErrorClass(err1, err2)helper that compares two Prometheus HTTP API errors by*promv1.Error.Typeand a canonicalized message. Canonicalization sorts the contents of any[...]group inside an error message on top-level commas (brace-depth aware so labelset commas are preserved), so two errors that differ only in the order of a non-deterministic series list compare equal — while two genuinely different errors of the sameTypestill diverge.Touches
TestDisableChunkTrimmingFuzz,TestExpandedPostingsCacheFuzz, and the sharedrunQueryFuzzTestCaseshelper (used by ~ten other fuzz tests in this file).Root cause
Full empirical census (occurrences, environments, exact failing log) is in issue #7546. In short: when a fuzz query exercises a vector-matching path such as
vector(0) or test_series_a + test_series_b, the upstream PromQL engine emits an error of shapeThe list inside
[...]is built from Go-map iteration invendor/github.com/prometheus/prometheus/promql/engine.go, so its order is non-deterministic. Cortex and the reference Prometheus produce the same error class with the same set of series, but the two entries can appear in either order across two processes.cmp.Equal(tc.err1, tc.err2)then fails on a pure string diff and the fuzz test reports a divergence that does not exist.Fix
Two new helpers and one comparator in
integration/query_fuzz_test.go:splitTopLevelCommas(s string) []string— splitsson commas at brace depth 0. Commas inside{...}(Prometheus labelset literals) are preserved, so{a="1", b="2"}is one part, not two.canonicalizeErrMsg(msg string) string— for every[...]group inmsg, splits its contents withsplitTopLevelCommas, trims, sorts, and rejoins with", ". Groups with fewer than two parts are returned untouched.sameErrorClass(err1, err2 error) bool— both-nil → true; one-nil → false; both*promv1.Error(viaerrors.As) → require equalTypeand equal canonicalizedMsg; otherwise fall back to comparing canonicalizedError()strings.Then replace the three strict-string comparators:
integration/query_fuzz_test.go:408(insideTestDisableChunkTrimmingFuzz)integration/query_fuzz_test.go:644(insideTestExpandedPostingsCacheFuzz)integration/query_fuzz_test.go:1959(insiderunQueryFuzzTestCases, used by ~ten sibling fuzz tests)A new
TestSameErrorClassexercises the helper with nine subtests: both-nil, each-only-nil, typed-same-class-with-reordered-bracket-list, typed-different-classes, typed-same-class-with-materially-different-messages (must NOT mask), labelset-with-internal-comma (verifies brace-depth awareness), untyped-same-after-canonicalization, untyped-different. Runs locally; all subtests pass.Why not other approaches
Typeand drop message comparison entirely. Too loose. A pure-Typecomparator would mask cases likeexecution: division by zerovsexecution: parse error: unexpected token— same class, materially different bug. The test plan deliberately includes that subtest as a guard.Trade-off worth flagging honestly: two errors with the same
*promv1.Error.Typebut materially different messages now have to both match before they are accepted. That is intentional — the comparator must not mask a real divergence between Cortex and Prometheus that happens to share an error category. The newTestSameErrorClasshas an explicit subtest pinning this behaviour (same typed class, materially different messages (must not mask)).Which issue(s) this PR fixes
Fixes #7546
Checklist
CHANGELOG.mdupdated — not applicable; test-only change with no user-facing behaviour change.TestSameErrorClass(9 subtests) added next to the new helper, covering all branches including the "must not mask materially different messages" guard.Test plan
Local (no Docker harness required):
gofmt -l integration/query_fuzz_test.go— cleangoimports -local github.com/cortexproject/cortex -l integration/query_fuzz_test.go— cleango vet -tags "netgo slicelabels integration integration_query_fuzz" ./integration/...— cleango build -tags "netgo slicelabels integration integration_query_fuzz" ./integration/...— cleango test -tags "netgo slicelabels integration integration_query_fuzz" -run "^TestSameErrorClass$" ./integration/...— PASS (9/9 subtests)Validation that the flake is gone (requires Docker):
make ./cmd/cortex/.uptodatego test -v -tags=integration,requires_docker,integration_query_fuzz -timeout 2400s -count=10 ./integration/... -run "^TestDisableChunkTrimmingFuzz$|^TestExpandedPostingsCacheFuzz$|^TestPrometheusCompatibilityQueryFuzz$"— expect 10/10 PASS on each. If any new failures surface they should be triaged as real divergences (not regressions from this PR).