Draft
18 commits
88f6d4c
feat(aggregate): cost-aware partial-aggregation skip (opt-in)
zhuqi-lucas May 26, 2026
9940d8a
[FOR BENCHMARK ONLY] flip default to true + update SLT
zhuqi-lucas May 26, 2026
2b49798
tune: drop cost_ns_per_row default from 1000 → 100
zhuqi-lucas May 26, 2026
5c10375
Merge branch 'main' into feat/adaptive-partial-agg-cost
zhuqi-lucas May 26, 2026
e6a98fe
pivot to lower-ratio Rule 2 + diagnostic gauges
zhuqi-lucas May 26, 2026
c087cb4
revert benchmark-only default flip; default back to false
zhuqi-lucas May 26, 2026
3cbcdfa
feat: A/B sampling for cost-aware partial-agg skip decision
zhuqi-lucas May 26, 2026
acb2ad9
[FOR BENCHMARK ONLY] flip default to true so bot exercises A/B sampling
zhuqi-lucas May 26, 2026
df6e264
ci: re-trigger after GitHub Actions infra outage
zhuqi-lucas May 26, 2026
44f815a
feat: segment-level re-probing for dynamic distribution shifts
zhuqi-lucas May 26, 2026
c716f33
Merge branch 'main' into feat/adaptive-partial-agg-cost
zhuqi-lucas May 26, 2026
a258afe
Revert "feat: segment-level re-probing for dynamic distribution shifts"
zhuqi-lucas May 26, 2026
a2baa6c
test: explicitly disable cost model in legacy not-locked-until-skip test
zhuqi-lucas May 26, 2026
08215ab
test: regen push_down_filter_parquet.slt for new probe gauges
zhuqi-lucas May 27, 2026
62c2580
lazy-register diagnostic gauges so small queries don't show '...=0' n…
zhuqi-lucas May 27, 2026
ffbf659
Merge branch 'main' into feat/adaptive-partial-agg-cost
zhuqi-lucas May 27, 2026
81493aa
test: disable cost model in push_down_filter_regression to stabilise …
zhuqi-lucas May 27, 2026
d1f8b7b
Merge branch 'main' into feat/adaptive-partial-agg-cost
zhuqi-lucas May 27, 2026
27 changes: 27 additions & 0 deletions datafusion/common/src/config.rs
@@ -648,6 +648,33 @@ config_namespace! {
/// aggregation ratio check and trying to switch to skipping aggregation mode
pub skip_partial_aggregation_probe_rows_threshold: usize, default = 100_000

/// (experimental) When true, apply a *secondary* skip rule on top
/// of `skip_partial_aggregation_probe_ratio_threshold`: skip
/// partial aggregation when the measured ratio is at least
/// `skip_partial_aggregation_cost_min_ratio` (default 0.5).
/// Targets ClickBench Q18-shape queries where the ratio (~0.56)
/// sits just below the fixed 0.8 threshold so partial agg keeps
/// running, but the absolute work (heavy variable-length keys,
/// complex aggregates) makes it net-negative.
///
/// Empirical motivation: lowering the global ratio threshold to
/// 0.6 fixes Q18 (1.73× faster) but risks regressing low-cost
/// queries at similar ratios. This flag exposes the lower
/// threshold as a separate, opt-in knob. Whether the cost-aware
/// signal (`partial_agg_probe_ns_per_row` metric) can replace
/// this static threshold is an open question — for now the
/// metric is reported alongside so callers can evaluate.
pub skip_partial_aggregation_use_cost_model: bool, default = true

/// Number of input rows used in the A/B sampling window after the
/// initial partial probe completes. During this window the operator
/// routes input through the passthrough (`transform_to_states`)
/// path so the probe can measure `passthrough_ns/row` and compare
/// it against the previously measured `partial_ns/row`. Default
/// 10000 — large enough to amortise per-row noise, small enough to
/// be cheap if the decision turns out to be "keep partial".
pub skip_partial_aggregation_ab_sampling_rows: usize, default = 10_000

/// Should DataFusion use row number estimates at the input to decide
/// whether increasing parallelism is beneficial or not. By default,
/// only exact row numbers (not estimates) are used for this decision.
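
The decision flow described in the two new doc comments can be sketched as below. This is a minimal illustration under stated assumptions, not the code in this PR: `SkipOptions`, `ProbeOutcome`, `probe_outcome`, and `ab_decision` are hypothetical names; the ratio is assumed to be distinct groups divided by input rows; and the sketch assumes the A/B window runs only when the ratio falls between the secondary and primary thresholds, with skipping chosen when passthrough is strictly cheaper per row.

```rust
/// Hypothetical mirror of the options discussed in this hunk. The field
/// names are shortened for illustration and are not the real config struct.
#[derive(Debug, Clone, Copy)]
pub struct SkipOptions {
    /// Primary threshold (the doc comment refers to a fixed 0.8).
    pub probe_ratio_threshold: f64,
    /// Mirrors `skip_partial_aggregation_use_cost_model`.
    pub use_cost_model: bool,
    /// Mirrors `skip_partial_aggregation_cost_min_ratio` (0.5 per the doc comment).
    pub cost_min_ratio: f64,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProbeOutcome {
    /// Keep running partial aggregation.
    KeepPartial,
    /// Switch to the passthrough path.
    Skip,
    /// Assumption: ratio is in the grey zone, so measure passthrough cost first.
    NeedsAbSampling,
}

/// Ratio check after the initial probe window.
/// Assumes ratio = distinct groups / input rows.
pub fn probe_outcome(opts: &SkipOptions, num_groups: usize, input_rows: usize) -> ProbeOutcome {
    if input_rows == 0 {
        return ProbeOutcome::KeepPartial;
    }
    let ratio = num_groups as f64 / input_rows as f64;
    if ratio >= opts.probe_ratio_threshold {
        ProbeOutcome::Skip
    } else if opts.use_cost_model && ratio >= opts.cost_min_ratio {
        ProbeOutcome::NeedsAbSampling
    } else {
        ProbeOutcome::KeepPartial
    }
}

/// Compare per-row cost of the two paths after the A/B sampling window.
/// Assumption: skip only when passthrough is strictly cheaper per row.
pub fn ab_decision(
    partial_ns: u64,
    partial_rows: u64,
    passthrough_ns: u64,
    passthrough_rows: u64,
) -> ProbeOutcome {
    if partial_rows == 0 || passthrough_rows == 0 {
        return ProbeOutcome::KeepPartial;
    }
    let partial_ns_per_row = partial_ns as f64 / partial_rows as f64;
    let passthrough_ns_per_row = passthrough_ns as f64 / passthrough_rows as f64;
    if passthrough_ns_per_row < partial_ns_per_row {
        ProbeOutcome::Skip
    } else {
        ProbeOutcome::KeepPartial
    }
}
```

With a Q18-like ratio of 0.56, this sketch keeps partial aggregation when the flag is off and defers to the A/B measurement when it is on. How the actual operator combines the ratio rule with the sampled costs should be read from the aggregation code in this PR rather than from this sketch.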