perf(physical-optimizer): skip ensure_distribution rebuild when children are unchanged by zhuqi-lucas · Pull Request #22521 · apache/datafusion

zhuqi-lucas · 2026-05-26T06:34:44Z

Which issue does this PR close?

Rationale for this change

ensure_distribution in datafusion/physical-optimizer/src/ensure_requirements/enforce_distribution.rs unconditionally calls plan.with_new_children(children_plans) after collecting the (possibly redistributed) children, even when none of those children were actually replaced. For nodes like ProjectionExec, that path runs through try_new and recomputes the schema, equivalence properties, output ordering, and output partitioning, then allocates a new Arc<dyn ExecutionPlan>. When every child Arc is pointer-identical to the input, that work produces a logically identical node — pure overhead.

The cost is amplified by two factors:

Plan depth. Workloads dominated by point queries (no join / aggregate / unmet ordering — i.e. nothing for ensure_distribution to inject a RepartitionExec or SortExec for) hit this wasted rebuild at every node in the plan. A 5–30 deep ProjectionExec stack pays the cost N times.
Schema width. Most steps inside ProjectionExec::try_new are O(num_columns): per-column data_type / nullable lookup to build the new schema, per-column remapping of equivalence classes through the projection mapping, and per-column lookup when rewriting PhysicalSortExprs into the output ordering. Wide schemas (tens of columns) make every wasted rebuild proportionally heavier.

Profiling a production point-query workload (wide schemas, deep ProjectionExec stacks) showed ProjectionExec::with_new_children as the single largest cost inside ensure_distribution:

ensure_distribution total: 2.87s of a 60s CPU sample
ProjectionExec::with_new_children: 1.94s (56% of the rule)
SortExec::with_new_children: 0.11s
Other ExecutionPlan nodes: 0.82s

What changes are included in this PR?

After collecting children_plans, compare each new child Arc with the original via Arc::ptr_eq. When every child is unchanged, reuse the existing plan Arc and skip with_new_children. The UnionExec to InterleaveExec special case still runs first because it intentionally produces a new node even when child Arcs are unchanged.

This relies on the fact that ensure_distribution already produces pointer-identical Arcs for children that need no redistribution (it threads the original Arc through unchanged), so Arc::ptr_eq precisely distinguishes "rewritten" from "untouched" children at O(1) per child.

Are these changes tested?

Yes. The existing enforce_distribution suite passes unchanged (66/66):

cargo test --release -p datafusion --test core_integration -- physical_optimizer::enforce_distribution

The behavior is observable only as a CPU reduction; correctness is preserved because ExecutionPlan nodes are immutable, so reusing the original Arc produces the same plan tree as with_new_children(unchanged_children) would have, just without the schema / ordering / equivalence / partitioning recomputation.

Are there any user-facing changes?

No. Same plans, lower planning time.

Micro-benchmark

Plan shape: 30-deep ProjectionExec stack over a sorted parquet scan, 5000 iterations.

Without fix: 852.74 ms total, 170.55 us/call
With fix: 296.81 ms total, 59.36 us/call
~2.87x speedup, -65% CPU per call

Wider schemas (more projection expressions per node) widen the gap further because each skipped with_new_children avoids more O(num_columns) work.

…ren are unchanged ensure_distribution was unconditionally calling plan.with_new_children after collecting the (possibly redistributed) children, even when none of the children were actually replaced. For nodes like ProjectionExec, that path runs through try_new and recomputes the schema, equivalence properties, and output ordering each time, which is pure overhead when the input Arcs are identical. Compare each new child Arc with the original via Arc::ptr_eq and reuse the existing plan Arc when nothing changed. The UnionExec to InterleaveExec special case still runs first because it intentionally produces a new node. On a representative deep ProjectionExec stack (30 layers over a sorted parquet scan, no join / aggregate / unmet ordering, 5000 iterations) this brings ensure_distribution from 170.55 us/call to 59.36 us/call, a ~2.87x speedup. Profiling on a real workload dominated by point queries showed ProjectionExec::with_new_children taking 1.94s out of a 2.87s ensure_distribution slice in a 60s sample, so this is the bulk of the rule's cost on that shape. Closes apache#22520

Copilot

Pull request overview

This PR optimizes the physical optimizer’s distribution enforcement by avoiding unnecessary plan-node rebuilds when the computed child plans are identical to the existing children, reducing planning CPU overhead (notably for deep ProjectionExec stacks) without changing correctness.

Changes:

Detect when all post-processing child plan Arcs are pointer-identical to the original children via Arc::ptr_eq.
Reuse the existing plan Arc and skip with_new_children when children are unchanged.
Preserve the existing UnionExec → InterleaveExec special-case behavior by applying it before the “children unchanged” fast-path.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot AI review requested due to automatic review settings May 26, 2026 06:34

Copilot started reviewing on behalf of zhuqi-lucas May 26, 2026 06:34 View session

github-actions Bot added the optimizer Optimizer rules label May 26, 2026

Copilot AI reviewed May 26, 2026

View reviewed changes

Merge branch 'main' into qizhu/enforce-distribution-skip-rebuild

565af48

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf(physical-optimizer): skip ensure_distribution rebuild when children are unchanged#22521

perf(physical-optimizer): skip ensure_distribution rebuild when children are unchanged#22521
zhuqi-lucas wants to merge 2 commits into
apache:mainfrom
zhuqi-lucas:qizhu/enforce-distribution-skip-rebuild

zhuqi-lucas commented May 26, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

zhuqi-lucas commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Which issue does this PR close?

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

Micro-benchmark

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

zhuqi-lucas commented May 26, 2026 •

edited

Loading